Text-ICU: Comprehensive support for string manipulation

This package provides the Data.Text.ICU library, for performing complex manipulation of Unicode text. It provides features such as the following:

  • Unicode normalization

  • Conversion to and from many common and obscure encodings

  • Date and number formatting

  • Comparison and collation

Prerequisites

This library is implemented as bindings to the well-respected ICU library (which is not bundled, and must be installed separately).

macOS

brew install icu4c
brew link icu4c --force

You might need:

export PKG_CONFIG_PATH="$(brew --prefix)/opt/icu4c/lib/pkgconfig"

Debian/Ubuntu

sudo apt-get update
sudo apt-get install libicu-dev

Fedora/CentOS

sudo dnf install unzip libicu-devel

Nix/NixOS

nix-shell --packages icu

Windows/MSYS2

Under MSYS2, ICU can be installed via pacman.

pacman --noconfirm -S mingw-w64-x86_64-icu

Depending on the age of the MSYS2 installation, the keyring might need to be updated to avoid certification issues, and pkg-config might need to be added. In this case, do this first:

pacman --noconfirm -Sy msys2-keyring
pacman --noconfirm -S mingw-w64-x86_64-pkgconf

Windows/stack

With stack on Windows, which comes with its own bundled MSYS2, the following commands give up-to-date system dependencies for text-icu-0.8.0 (tested 2023-09-30):

stack exec -- pacman --noconfirm -Sy msys2-keyring
stack exec -- pacman --noconfirm -S mingw-w64-x86_64-pkgconf
stack exec -- pacman --noconfirm -S mingw-w64-x86_64-icu

Compatibility

Upstream ICU occasionally introduces backwards-incompatible API breaks. This package tries to stay up to date with upstream, and is currently more or less in sync with ICU 72.

Minimum required version is ICU 62.

Get involved!

Please report bugs via the github issue tracker.

Authors

This library was written by Bryan O’Sullivan.

Changes

0.8.0.4

  • Fixed tests to work with ICU < 72 (#94)

0.8.0.3

  • Support for ICU 72 (#94)

0.8.0.2

  • Support for creating a collator from custom rules (#76)

0.8.0.1

  • Restore build with GHC 7.10 - 8.8 (#61)
  • New CI for Linux, macOS and Windows (#63, #64, #66, #69)

0.8.0

  • Support for text-2.0 (#57)
  • Support for ICU 69 and new features (#55)
  • Add lib/include dirs for newer homebrew (#54)
  • basic number formatting added (#46)
  • Declare pkg-config dependencies (#43)
  • Added support for arabic shaping and BiDi (#41)
  • Include icuio lib (#36)
  • Character Set Detection (#27)

0.7.1.0

  • Add fix for undefined TRUE value in cbits (#52)
  • Improve CI and documentation (#20)

Thanks to everyone who contributed!

0.7.0.0

  • Built and tested against ICU 53.

  • The isoComment function has been deprecated, and will be removed in the next major release.

  • The Collator type is no longer an instance of Eq, as this functionality has been removed from ICU 53.

  • Many NFData instances have been added.