An efficient packed Unicode text type.


Version on this page:
LTS Haskell 22.30: 2.0.2
Stackage Nightly 2024-07-21: 2.1.1
Latest on Hackage: 2.1.1@rev:1


BSD-3-Clause licensed and maintained by Bryan O'Sullivan

Module documentation for

  • Data
    • Data.Text
      • Data.Text.Array
      • Data.Text.Encoding
        • Data.Text.Encoding.Error
      • Data.Text.Foreign
      • Data.Text.IO
      • Data.Text.Internal
        • Data.Text.Internal.Builder
          • Data.Text.Internal.Builder.Functions
          • Data.Text.Internal.Builder.Int
            • Data.Text.Internal.Builder.Int.Digits
          • Data.Text.Internal.Builder.RealFloat
            • Data.Text.Internal.Builder.RealFloat.Functions
        • Data.Text.Internal.Encoding
          • Data.Text.Internal.Encoding.Fusion
            • Data.Text.Internal.Encoding.Fusion.Common
          • Data.Text.Internal.Encoding.Utf16
          • Data.Text.Internal.Encoding.Utf32
          • Data.Text.Internal.Encoding.Utf8
        • Data.Text.Internal.Functions
        • Data.Text.Internal.Fusion
          • Data.Text.Internal.Fusion.CaseMapping
          • Data.Text.Internal.Fusion.Common
          • Data.Text.Internal.Fusion.Size
          • Data.Text.Internal.Fusion.Types
        • Data.Text.Internal.IO
        • Data.Text.Internal.Lazy
          • Data.Text.Internal.Lazy.Encoding
            • Data.Text.Internal.Lazy.Encoding.Fusion
          • Data.Text.Internal.Lazy.Fusion
          • Data.Text.Internal.Lazy.Search
        • Data.Text.Internal.Private
        • Data.Text.Internal.Read
        • Data.Text.Internal.Search
        • Data.Text.Internal.Unsafe
          • Data.Text.Internal.Unsafe.Char
          • Data.Text.Internal.Unsafe.Shift
      • Data.Text.Lazy
        • Data.Text.Lazy.Builder
          • Data.Text.Lazy.Builder.Int
          • Data.Text.Lazy.Builder.RealFloat
        • Data.Text.Lazy.Encoding
        • Data.Text.Lazy.IO
        • Data.Text.Lazy.Internal
        • Data.Text.Lazy.Read
      • Data.Text.Read
      • Data.Text.Unsafe

Text: Fast, packed Unicode strings, using stream fusion

This package provides the Data.Text library, a library for the space- and time-efficient manipulation of Unicode text in Haskell.

Normalization, conversion, and collation, oh my!

This library intentionally provides a simple API based on the Haskell prelude’s list manipulation functions. For more complicated real-world tasks, such as Unicode normalization, conversion to and from a larger variety of encodings, and collation, use the text-icu package.

That library uses the well-respected and liberally licensed ICU library to provide these facilities.
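
As a rough illustration of the list-like API described above, the sketch below combines a few Data.Text functions that mirror their Prelude list counterparts. The helper names (normalizeSpaces, countLetters, reverseWords) are hypothetical and exist only for this example; they are not part of the library.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Data.Char (isAlpha)
import           Data.Text (Text)
import qualified Data.Text as T

-- Hypothetical helper: lowercase the input and collapse runs of
-- whitespace, using T.words / T.unwords as one would on a String.
normalizeSpaces :: Text -> Text
normalizeSpaces = T.unwords . T.words . T.toLower

-- Hypothetical helper: count alphabetic characters with
-- T.filter and T.length, the Text analogues of filter and length.
countLetters :: Text -> Int
countLetters = T.length . T.filter isAlpha

-- Hypothetical helper: reverse every word but keep the word order.
reverseWords :: Text -> Text
reverseWords = T.unwords . map T.reverse . T.words
```

The qualified import is the usual convention, since many Data.Text names clash with Prelude functions of the same name.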

Get involved!

Please report bugs via the GitHub issue tracker.

Master git repository:

  • git clone git://github.com/bos/text.git

There’s also a Mercurial mirror:

  • hg clone https://bitbucket.org/bos/text

(You can create and contribute changes using either Mercurial or git.)


The base code for this library was originally written by Tom Harper, based on the stream fusion framework developed by Roman Leshchinskiy, Duncan Coutts, and Don Stewart.

The core library was fleshed out, debugged, and tested by Bryan O’Sullivan, who is the current maintainer.


  • The Data.Data instance now allows gunfold to work, via a virtual pack constructor

  • dropEnd, takeEnd: new functions

  • Comparing the length of a Text against a number can now short-circuit in more cases

  • streamDecodeUtf8: fixed gh-70, where not all unconsumed bytes were returned when the input arrived in single-byte chunks

  • encodeUtf8: Performance is improved by up to 4x.

  • encodeUtf8Builder, encodeUtf8BuilderEscaped: new functions, available only if bytestring >= is installed, that allow very fast and flexible encoding of a Text value to a bytestring Builder.

    As an example of the performance gain to be had, the encodeUtf8BuilderEscaped function helps to double the speed of JSON encoding in the latest version of aeson! (Note: if all you need is a plain ByteString, encodeUtf8 is still the faster way to go.)

  • All of the internal module hierarchy is now publicly exposed. If a module is in the .Internal hierarchy, or is documented as internal, use at your own risk - there are no API stability guarantees for internal modules!

  • decodeUtf8: Fixed a regression that caused us to incorrectly identify truncated UTF-8 as valid (gh-61)

  • Added support for Unicode 6.3.0 to case conversion functions

  • New function toTitle converts words in a string to title case

  • New functions peekCStringLen and withCStringLen simplify interoperability with C functions

  • Added support for decoding UTF-8 in stream-friendly fashion

  • Fixed a bug in mapAccumL

  • Added trusted Haskell support

  • Removed support for GHC 6.10 (released in 2008) and older
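
Several of the functions named in the list above can be tried together. The following is a minimal sketch, assuming a recent text and bytestring; the wrapper names (fileExtension, stripExtension, fitsIn, headline, roundTrip, truncatedIsRejected, decodeChunks) are hypothetical and only serve the illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString    as B
import           Data.Text          (Text)
import qualified Data.Text          as T
import qualified Data.Text.Encoding as TE

-- takeEnd / dropEnd work from the end of the Text.
fileExtension, stripExtension :: Text -> Text
fileExtension  = T.takeEnd 3   -- last three characters
stripExtension = T.dropEnd 4   -- everything but the last four

-- compareLength can stop early instead of measuring the whole Text.
fitsIn :: Int -> Text -> Bool
fitsIn limit t = T.compareLength t limit /= GT

-- toTitle converts each word to title case.
headline :: Text -> Text
headline = T.toTitle

-- Encode to UTF-8 and decode again, reporting failures as a String.
roundTrip :: Text -> Either String Text
roundTrip t = either (Left . show) Right (TE.decodeUtf8' (TE.encodeUtf8 t))

-- A lone lead byte is truncated UTF-8, so decoding should fail.
truncatedIsRejected :: Bool
truncatedIsRejected = either (const True) (const False)
                             (TE.decodeUtf8' (B.pack [0xC3]))

-- Feed chunks to streamDecodeUtf8 one at a time; returns the decoded
-- text and whatever bytes were still unconsumed after the last chunk.
decodeChunks :: [B.ByteString] -> (Text, B.ByteString)
decodeChunks = go TE.streamDecodeUtf8 [] B.empty
  where
    go _    acc rest []       = (T.concat (reverse acc), rest)
    go feed acc _    (c : cs) =
      case feed c of
        TE.Some t rest next -> go next (t : acc) rest cs
```

If only a strict ByteString is needed, encodeUtf8 as used in roundTrip is the direct route; the Builder variants mentioned above are aimed at composing larger outputs.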