text-metrics

Calculate various string metrics efficiently

LTS Haskell 24.49:	0.3.3@rev:1
Stackage Nightly 2026-07-09:	0.3.3@rev:1
Latest on Hackage:	0.3.3@rev:1

BSD-3-Clause licensed and maintained by Mark Karpov

This version can be pinned in stack with: text-metrics-0.3.3@sha256:6bf74b58e165195bae692fc9cd28d54d4bff2c724b1806a561238e2e7ae823e6,2855

Module documentation for 0.3.3

Data
- Data.Text
  - Data.Text.Metrics

Depends on 5 packages:

base, containers, primitive, text, vector

Used by 3 packages in nightly-2026-07-09:

fuzzyset, infer-license, mmark

Text Metrics

The library provides efficient implementations of various string metric algorithms. It works with strict Text values.

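The current version of the package implements:

Levenshtein distance
Normalized Levenshtein distance
Damerau-Levenshtein distance
Normalized Damerau-Levenshtein distance
Hamming distance
Jaro distance
Jaro-Winkler distance
Overlap coefficient
Jaccard similarity coefficient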
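
As a rough illustration of how these metrics are called, here is a minimal sketch that uses a few functions from Data.Text.Metrics on strict Text literals. The names kittenDistance, kittenSimilarity, sameLength, toDouble, and similarEnough, as well as the 0.9 threshold, are hypothetical and exist only for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Ratio (Ratio)
import Data.Text (Text)
import Data.Text.Metrics (hamming, jaroWinkler, levenshtein, levenshteinNorm)

-- Edit distance between two strict Text values, returned as a plain Int.
kittenDistance :: Int
kittenDistance = levenshtein "kitten" "sitting" -- 3

-- Normalized variants return an exact Ratio Int (1 means the inputs are equal).
kittenSimilarity :: Ratio Int
kittenSimilarity = levenshteinNorm "kitten" "sitting" -- 4 % 7

-- Hamming distance is only defined for inputs of equal length, hence the Maybe.
sameLength :: Maybe Int
sameLength = hamming "karolin" "kathrin" -- Just 3

-- Hypothetical helper: turn an exact ratio into a Double for display or thresholds.
toDouble :: Ratio Int -> Double
toDouble = realToFrac

-- Hypothetical predicate with an arbitrary 0.9 cut-off, chosen only for illustration.
similarEnough :: Text -> Text -> Bool
similarEnough a b = toDouble (jaroWinkler a b) >= 0.9
```

Keeping the result as Ratio Int until the last moment avoids floating-point rounding when scores are compared for equality; converting with realToFrac is only one possible choice.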

Comparison with the `edit-distance` package

There is an edit-distance package whose scope overlaps with that of this package. The differences are:

edit-distance allows specifying a cost for every operation when calculating Levenshtein distance (insertion, deletion, substitution, and transposition), though in my opinion this is rarely needed in real-world applications.
edit-distance only provides Levenshtein distance, while text-metrics aims to provide implementations of most string metric algorithms.
edit-distance works on Strings, while text-metrics works on strict Text values.
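
To make the last difference concrete: code that still holds String values has to convert them to strict Text before calling this package. The wrapper below is a minimal sketch of that conversion; levenshteinString is a hypothetical name and is not part of either library.

```haskell
import qualified Data.Text as T
import Data.Text.Metrics (levenshtein)

-- Hypothetical wrapper: accept Strings, pack them into strict Text, then measure.
-- T.pack walks each input once, so the conversion cost is linear in its length.
levenshteinString :: String -> String -> Int
levenshteinString a b = levenshtein (T.pack a) (T.pack b)
```

If the data is already Text throughout the program, no such wrapper is needed and the functions can be called directly.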

Implementation

Although we originally used C for speed, currently all functions are pure Haskell tuned for performance. See this blog post for more info.
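
For orientation, the textbook row-by-row dynamic programming formulation of Levenshtein distance can be written in pure Haskell over strict Text as in the sketch below. This is not the library's code, which is tuned for performance and is likely structured differently; levenshteinRef and levenshteinNormRef are hypothetical names, and treating two empty inputs as a perfect match is an assumption of this sketch.

```haskell
import Data.Ratio (Ratio, (%))
import qualified Data.Text as T

-- Reference Levenshtein distance: fold over the characters of the first input,
-- keeping only the current row of the dynamic programming table.
levenshteinRef :: T.Text -> T.Text -> Int
levenshteinRef a b = last (T.foldl' step firstRow a)
  where
    bs = T.unpack b
    -- Row for the empty prefix of a: reaching the first j characters of b
    -- takes j insertions.
    firstRow = [0 .. T.length b]
    step prev@(p : ps) ca = scanl cell (p + 1) (zip3 bs prev ps)
      where
        -- left: cell to the left in the new row; diag and up: cells from the
        -- previous row. Substitution costs 1 only when the characters differ.
        cell left (cb, diag, up) =
          minimum [left + 1, up + 1, diag + fromEnum (ca /= cb)]
    step [] _ = [] -- unreachable: every row has at least one cell

-- Normalized score in [0, 1], where 1 means the inputs are equal.
-- Assumption of this sketch: two empty inputs count as a perfect match.
levenshteinNormRef :: T.Text -> T.Text -> Ratio Int
levenshteinNormRef a b
  | T.null a && T.null b = 1
  | otherwise = 1 - levenshteinRef a b % max (T.length a) (T.length b)
```

With these definitions, levenshteinRef (T.pack "kitten") (T.pack "sitting") evaluates to 3 and the normalized score for the same pair is 4 % 7. The list-based rows keep the sketch short at the expense of speed.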

Contribution

Issues, bugs, and questions may be reported in the GitHub issue tracker for this project.

Pull requests are also welcome.

License

Distributed under the BSD 3-Clause license.

Changes

Text Metrics 0.3.3

Slightly optimized the levenshtein and levenshteinNorm functions. PR 50.

Text Metrics 0.3.2

Works with text-2.0.

Text Metrics 0.3.1

Fixed a bug in the implementation of Jaro-Winkler distance when two strings share a long prefix. PR 21.
Dropped support for GHC 8.6 and older.

Text Metrics 0.3.0

All functions are now implemented in pure Haskell.
All functions return Int or Ratio Int instead of Natural and Ratio Natural.
Added overlap (returns overlap coefficient) and jaccard (returns Jaccard similarity coefficient).

Text Metrics 0.2.0

Made the levenshtein, levenshteinNorm, damerauLevenshtein, and damerauLevenshteinNorm functions more efficient.
Added jaro and jaroWinkler functions.

Text Metrics 0.1.0

Initial release.