stringalign.error_classification.confusable_error


`stringalign.error_classification.confusable_error`

stringalign.error_classification.confusable_error.count_confusable_errors(reference: str, predicted: str, tokenizer: Tokenizer, consider_confusables: Literal['confusables', 'intentional'] | dict[str, str]) → int

Count the number of errors that are solely due to a character being replaced with a confusable one (e.g. I and 1).

This function counts the number of edits we can avoid if we resolve the confusable characters in the strings before aligning them.

Parameters:

reference: The reference text.
predicted: The predicted text.
tokenizer: The tokenizer to use.
consider_confusables: Which confusable list to use, see stringalign.normalize.StringNormalizer() or Confusables for more information.

Returns:

int: The number of confusable errors.
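
Example:

The following is a minimal, standard-library-only sketch of the idea described above: count the edits between the two strings as given, count them again after resolving confusables, and report the difference. It is not stringalign's implementation. The helper names (`levenshtein`, `resolve_confusables`, `sketch_count_confusable_errors`) are hypothetical, the confusable mapping is a hand-written dict rather than one of the built-in lists, and the sketch works on individual code points instead of using a Tokenizer.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) between two strings."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            curr.append(
                min(
                    prev[j] + 1,  # delete char_a
                    curr[j - 1] + 1,  # insert char_b
                    prev[j - 1] + (char_a != char_b),  # substitute or match
                )
            )
        prev = curr
    return prev[-1]


def resolve_confusables(text: str, confusables: dict[str, str]) -> str:
    """Replace each character with its canonical form, if it has one."""
    return "".join(confusables.get(char, char) for char in text)


def sketch_count_confusable_errors(
    reference: str, predicted: str, confusables: dict[str, str]
) -> int:
    """Edits avoided by resolving confusables before comparing (illustrative only)."""
    edits_raw = levenshtein(reference, predicted)
    edits_resolved = levenshtein(
        resolve_confusables(reference, confusables),
        resolve_confusables(predicted, confusables),
    )
    return edits_raw - edits_resolved


# Hypothetical mapping: treat "1" and "I" as the same character as "l".
mapping = {"1": "l", "I": "l"}

confusable_only = sketch_count_confusable_errors("Ill", "1l1", mapping)
unrelated_error = sketch_count_confusable_errors("cat", "cut", mapping)
```

In this sketch, both substitutions in "Ill" versus "1l1" disappear once the mapping is applied, so they count as confusable errors, while the "a" to "u" substitution in "cat" versus "cut" is unaffected and does not.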