Visualizing alignments#
Visualizing the alignment between the predicted and reference strings is a good way to gain insight beyond summary metrics. To aid this type of analysis, Stringalign can quickly display a lightweight visualization of an alignment.
import stringalign
from stringalign.evaluate import AlignmentAnalyzer
reference = "Hello world!"
predicted = "Hello w0rld!!"
tokenizer = stringalign.tokenize.GraphemeClusterTokenizer()
analyzer = AlignmentAnalyzer.from_strings(reference=reference, predicted=predicted, tokenizer=tokenizer)
analyzer.visualize()
The visualization is based on HTML and CSS, so it can easily be displayed in a notebook, in dashboard frameworks that support HTML, or in a web application.
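Outside a notebook, one option is to write the markup to a file and open it in a browser. The following is a minimal sketch using only the standard library. `wrap_html_fragment` and `save_html_page` are hypothetical helpers, not part of Stringalign, and `fragment` is a stand-in string: this example does not cover how to obtain the visualization's markup as a plain string, so that step is assumed here.

```python
import html
import tempfile
from pathlib import Path


def wrap_html_fragment(fragment: str, title: str = "Alignment") -> str:
    """Wrap an HTML fragment in a minimal standalone HTML document."""
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        f'<head><meta charset="utf-8"><title>{html.escape(title)}</title></head>\n'
        f"<body>\n{fragment}\n</body>\n"
        "</html>\n"
    )


def save_html_page(fragment: str, path: Path) -> Path:
    """Write the wrapped fragment to ``path`` as UTF-8 and return the path."""
    path.write_text(wrap_html_fragment(fragment), encoding="utf-8")
    return path


# Stand-in markup (an assumption); replace it with the HTML of your visualization.
fragment = "<p>Hello w<b>0</b>rld!!</p>"
output_path = save_html_page(fragment, Path(tempfile.gettempdir()) / "alignment.html")
```

The resulting file can then be opened in any browser, for example with `webbrowser.open(output_path.as_uri())`.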
Sometimes it can be beneficial to add extra spacing between the alignment operations (for example, if your tokenizer removes spaces or your text contains non-spacing tokens).
To add spacing between each token, you can use the space_alignment_ops flag.
analyzer.visualize(space_alignment_ops=True)
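To make "non-spacing tokens" more concrete, the sketch below uses the standard library's `unicodedata` module to check whether a token consists only of non-spacing marks (Unicode general category `Mn`), such as a lone combining accent. Such a token takes up no horizontal space of its own, which is one case where extra spacing makes the alignment easier to read. `is_nonspacing` is a hypothetical helper for illustration, not part of Stringalign, and it reflects only one possible reading of "non-spacing".

```python
import unicodedata


def is_nonspacing(token: str) -> bool:
    """Return True if every code point in ``token`` is a non-spacing mark (category Mn)."""
    return bool(token) and all(unicodedata.category(ch) == "Mn" for ch in token)


acute = "\u0301"  # COMBINING ACUTE ACCENT

lone_mark = is_nonspacing(acute)          # a combining mark on its own
plain_letter = is_nonspacing("e")         # an ordinary letter
letter_with_mark = is_nonspacing("e" + acute)  # base letter followed by the mark
```

Here only the lone combining mark is classified as non-spacing; the letter, with or without the accent, is not.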
Customize the visualization#
The stringalign.evaluate.AlignmentAnalyzer.visualize() method is a convenience wrapper around stringalign.visualize.create_alignment_html().
If you want more customization, you can call stringalign.visualize.create_alignment_html() directly. You can then, for example, change the text labels:
stringalign.visualize.create_alignment_html(
    alignment=analyzer.raw_alignment,
    reference_label="Gold standard:",
    predicted_label="Model estimate:",
    space_alignment_ops=True,
)
Customize the styling (advanced)#
You can also supply your own style sheet, which we demonstrate in this short example.