stringalign.error_classification.confusable_error
- stringalign.error_classification.confusable_error.count_confusable_errors(reference: str, predicted: str, tokenizer: Tokenizer, consider_confusables: Literal['confusables', 'intentional'] | dict[str, str]) → int
Count the number of errors that are solely due to characters being replaced with a confusable (e.g. I and 1).
This function counts the number of edits we can avoid if we resolve the confusable characters in the strings before aligning them.
Parameters:
- reference
The reference text.
- predicted
The predicted text.
- tokenizer: Tokenizer
The tokenizer to use.
- consider_confusables
Which confusable list to use; see stringalign.normalize.StringNormalizer() or Confusables for more information.
Returns:
- int
The number of confusable errors.
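Example:
The counting idea described above (the number of edits avoided by resolving confusables before aligning) can be illustrated with a minimal standalone sketch. This is not stringalign's implementation and it does not call the library: it assumes character-level tokens instead of a Tokenizer, a plain Levenshtein distance as the alignment cost, and a small hand-written mapping in place of the built-in confusable lists. The names `edit_distance`, `resolve` and `sketch_count_confusable_errors` are hypothetical helpers introduced only for this illustration.

```python
from __future__ import annotations


def edit_distance(a: list[str], b: list[str]) -> int:
    """Levenshtein distance between two token sequences."""
    previous = list(range(len(b) + 1))
    for i, token_a in enumerate(a, start=1):
        current = [i]
        for j, token_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (token_a != token_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def resolve(text: str, confusables: dict[str, str]) -> str:
    """Replace every character that has an entry in the mapping (hypothetical helper)."""
    return "".join(confusables.get(char, char) for char in text)


def sketch_count_confusable_errors(
    reference: str, predicted: str, confusables: dict[str, str]
) -> int:
    """Edits saved by resolving confusables before comparing the strings.

    Assumption: tokens are single characters; the real function takes a Tokenizer.
    """
    raw_edits = edit_distance(list(reference), list(predicted))
    resolved_edits = edit_distance(
        list(resolve(reference, confusables)),
        list(resolve(predicted, confusables)),
    )
    return raw_edits - resolved_edits


# Hand-written mapping for illustration only: treat the digit 1 as the letter l.
mapping = {"1": "l"}
only_confusables = sketch_count_confusable_errors("Illinois", "I11inois", mapping)
mixed_errors = sketch_count_confusable_errors("Illinois", "I11inoiz", mapping)
```

In this sketch, the two `1` for `l` substitutions disappear once the mapping is applied, while the `z` for `s` substitution in the second call remains an ordinary error and is not counted.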