The library provides efficient implementations of various string metric
algorithms. It works with strict Text values.
The current version of the package implements:
- Levenshtein distance
- Normalized Levenshtein distance
- Damerau-Levenshtein distance
- Normalized Damerau-Levenshtein distance
- Hamming distance
- Jaro distance
- Jaro-Winkler distance
- Overlap coefficient
- Jaccard similarity coefficient
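A minimal usage sketch of a few of the metrics listed above. It assumes the functions are imported from the Data.Text.Metrics module and have the signatures of version 0.3.0 (an Int for plain distances, a Ratio Int for normalized ones). The helper ratioToDouble is hypothetical and not part of the library.

```haskell
{-# LANGUAGE OverloadedStrings #-}

module Main (main) where

import Data.Ratio (Ratio, denominator, numerator)
import Data.Text (Text)
import Data.Text.Metrics (hamming, jaroWinkler, levenshtein, levenshteinNorm)

-- | Hypothetical helper (not part of the library): convert a ratio to a
-- Double, which is handy for printing or comparing against a threshold.
ratioToDouble :: Ratio Int -> Double
ratioToDouble r = fromIntegral (numerator r) / fromIntegral (denominator r)

main :: IO ()
main = do
  let a = "kitten" :: Text
      b = "sitting" :: Text
  -- Three single-character edits turn "kitten" into "sitting".
  print (levenshtein a b) -- 3
  -- Normalized variant: 1 means an exact match, 0 means no similarity.
  print (ratioToDouble (levenshteinNorm a b))
  -- Hamming distance is only defined for inputs of equal length,
  -- so the result is wrapped in Maybe.
  print (hamming "karolin" "kathrin") -- Just 3
  print (hamming a b) -- Nothing
  print (ratioToDouble (jaroWinkler a b))
```

Keeping the exact ratio and converting only at the point of display avoids floating-point rounding when scores are compared with each other.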
Comparison with the edit-distance package
The scope of the edit-distance package overlaps with the scope of
this package. The differences are:
- edit-distance lets you specify costs for every operation when calculating Levenshtein distance (insertion, deletion, substitution, and transposition). This is rarely needed in real-world applications, IMO.
- edit-distance only provides Levenshtein distance, whereas text-metrics aims to provide implementations of most string metric algorithms.
- text-metrics works on strict Text values, while edit-distance works on String values.
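For code that still holds String values, one option is to pack them into Text before calling the metric. The sketch below assumes levenshtein is exported from Data.Text.Metrics and returns an Int (as in version 0.3.0). The wrapper levenshteinString is a hypothetical name used for illustration only.

```haskell
module Main (main) where

import qualified Data.Text as T
import Data.Text.Metrics (levenshtein)

-- | Hypothetical wrapper (not part of either package): Levenshtein
-- distance for plain Strings, computed by packing both inputs into Text.
levenshteinString :: String -> String -> Int
levenshteinString a b = levenshtein (T.pack a) (T.pack b)

main :: IO ()
main =
  -- "flaw" becomes "lawn" by deleting 'f' and appending 'n'.
  print (levenshteinString "flaw" "lawn") -- 2
```

Packing costs a pass over each input, so for repeated comparisons it is usually better to convert once and keep the Text values around.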
Although we originally used C for speed, currently all functions are pure Haskell tuned for performance. See this blog post for more info.
Copyright © 2016–2017 Mark Karpov
Distributed under the BSD 3-clause license.
Text Metrics 0.3.0
- All functions are now implemented in pure Haskell.
- All functions return Ratio Int instead of
- Added overlap (returns overlap coefficient) and jaccard (returns Jaccard similarity coefficient).
Text Metrics 0.2.0
Text Metrics 0.1.0
- Initial release.