BSD-3-Clause licensed by Harendra Kumar

Module documentation for 0.3.6

This version can be pinned in stack with: unicode-transforms-0.3.6@sha256:8bde4cf60f3a303a77de5f4463b78ab71b5bde183d5cac99df7cc065f23b0d69,5683

Unicode Transforms

Fast Unicode 12.1.0 normalization in Haskell (NFC, NFKC, NFD, NFKD).

What is normalization?

Unicode characters with diacritics (e.g. Á) can be represented in two different forms: as a single composed character (U+00C1 = Á) or as a sequence of decomposed characters (U+0041(A) U+0301( ́ ) = Á). The two forms are encoded as different byte sequences, but to a human reader they look exactly the same.

A plain byte comparison may report that two strings are different even though they are equivalent. Both strings must be converted to the same normalized form, using the Unicode Character Database, before they can be compared for equivalence. For example:

>> :set -XOverloadedStrings
>> import Data.Text.Normalize
>> normalize NFC "\193" == normalize NFC "\65\769"
True
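
The same idea can be sketched as a small standalone program. This is only an illustration, not part of the package: the helper names canonicallyEqual and compatiblyEqual are hypothetical, and it assumes the text and unicode-transforms packages are available. It also shows the difference between canonical forms (NFC, NFD) and compatibility forms (NFKC, NFKD), using the ligature ﬁ (U+FB01), which is only compatibility-equivalent to the two letters "fi".

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Data.Text as T
import Data.Text.Normalize (NormalizationMode (..), normalize)

-- | Hypothetical helper: compare two texts for canonical equivalence
-- by normalizing both to NFC first.
canonicallyEqual :: Text -> Text -> Bool
canonicallyEqual a b = normalize NFC a == normalize NFC b

-- | Hypothetical helper: compare two texts for compatibility
-- equivalence by normalizing both to NFKC first.
compatiblyEqual :: Text -> Text -> Bool
compatiblyEqual a b = normalize NFKC a == normalize NFKC b

main :: IO ()
main = do
  -- Composed Á (U+00C1) versus A (U+0041) followed by U+0301.
  print (canonicallyEqual "\193" "\65\769")   -- True
  -- NFD splits the composed character into two code points.
  print (T.length (normalize NFD "\193"))     -- 2
  -- The ligature U+FB01 is not canonically equivalent to "fi" ...
  print (canonicallyEqual "\64257" "fi")      -- False
  -- ... but it is compatibility-equivalent to it.
  print (compatiblyEqual "\64257" "fi")       -- True
```

Which form to compare in depends on the application: the canonical forms preserve distinctions such as ligatures, while the compatibility forms fold them away.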

Contributing

Please use https://github.com/harendra-kumar/unicode-transforms to raise issues or send pull requests.

Changes

0.3.6

  • Update to Unicode version 12.1.0
  • Update QuickCheck dependency version bounds
  • Test with GHC 8.6.5

0.3.5

  • Update dependency version bounds
  • Test with GHC 8.6.2

0.3.4

  • GHC 8.4.1 support

0.3.3

  • GHC 8.2.1 support

0.3.2

  • Work around a GHC/LLVM issue for ARM

0.3.1

  • Update dependency versions

0.3.0

  • Support Unicode version 9.0

0.2.1

  • Improve compilation speed and reduce resource usage during compilation

0.2.0

  • Support Unicode version 8.0
  • Switch to pure Haskell implementation

0.1.0.1

  • Initial release based on utf8proc C implementation