zenacy-unicode

Unicode utilities for Haskell https://github.com/mlcfp/zenacy-unicode

MIT licensed and maintained by Michael Williams

Zenacy Unicode

Zenacy Unicode includes tools for checking byte order marks (BOM) and cleaning data to remove invalid bytes. These tools can help ensure that data pulled from the web can be parsed and converted to text.
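To illustrate why cleaning matters, here is a minimal sketch that uses only the bytestring and text packages. The helper `isValidUtf8` and the two sample values are hypothetical names for this illustration and are not part of zenacy-unicode. It relies on `decodeUtf8'` from `Data.Text.Encoding`, which returns a `Left` for invalid input instead of throwing as `decodeUtf8` does.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text.Encoding as TE

-- | Hypothetical helper (not part of zenacy-unicode): report whether
-- a byte string is well-formed UTF-8, without throwing an exception.
isValidUtf8 :: BS.ByteString -> Bool
isValidUtf8 = either (const False) (const True) . TE.decodeUtf8'

-- | The ASCII bytes for "hi", which are valid UTF-8.
validSample :: BS.ByteString
validSample = BS.pack [0x68, 0x69]

-- | The same bytes with 0xFF inserted; 0xFF never occurs in UTF-8.
invalidSample :: BS.ByteString
invalidSample = BS.pack [0x68, 0xFF, 0x69]
```

Input like `invalidSample` is the kind of data that has to be cleaned, or decoded leniently, before a strict decoder can be applied safely.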

The following is an example of converting data of uncertain encoding to text.

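import Data.ByteString (ByteString)
import Data.Text (Text)
import qualified Data.Text.Encoding as T
import Zenacy.Unicode

-- Strip any BOM, then decode using the encoding it indicates.
textDecode :: ByteString -> Text
textDecode b =
  case bomStrip b of
    (Nothing, s)           -> T.decodeUtf8 $ unicodeCleanUTF8 s -- Assume UTF8
    (Just BOM_UTF8, s)     -> T.decodeUtf8 $ unicodeCleanUTF8 s
    (Just BOM_UTF16_BE, s) -> T.decodeUtf16BE s
    (Just BOM_UTF16_LE, s) -> T.decodeUtf16LE s
    (Just BOM_UTF32_BE, s) -> T.decodeUtf32BE s
    (Just BOM_UTF32_LE, s) -> T.decodeUtf32LE s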
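
For readers unfamiliar with byte order marks, the following is a rough sketch of the kind of prefix matching that BOM detection involves. It is written against the bytestring package only and is not the library's implementation: the `Encoding` type, the `signatures` table, and `splitBom` are hypothetical names chosen for this illustration. The byte sequences themselves are the standard Unicode BOM signatures.

```haskell
import qualified Data.ByteString as BS
import Data.List (find)
import Data.Word (Word8)

-- | Hypothetical encoding tag for this sketch only.
data Encoding = Utf8 | Utf16BE | Utf16LE | Utf32BE | Utf32LE
  deriving (Eq, Show)

-- | Standard BOM signatures. The UTF-32 LE entry comes before the
-- UTF-16 LE entry because FF FE is a prefix of FF FE 00 00.
signatures :: [(Encoding, [Word8])]
signatures =
  [ (Utf8,    [0xEF, 0xBB, 0xBF])
  , (Utf32BE, [0x00, 0x00, 0xFE, 0xFF])
  , (Utf32LE, [0xFF, 0xFE, 0x00, 0x00])
  , (Utf16BE, [0xFE, 0xFF])
  , (Utf16LE, [0xFF, 0xFE])
  ]

-- | Split a leading BOM, if present, from the rest of the input.
splitBom :: BS.ByteString -> (Maybe Encoding, BS.ByteString)
splitBom b =
  case find (\(_, sig) -> BS.pack sig `BS.isPrefixOf` b) signatures of
    Just (enc, sig) -> (Just enc, BS.drop (length sig) b)
    Nothing         -> (Nothing, b)
```

The ordering of the table is the main design point: testing the two-byte UTF-16 LE signature first would misclassify every UTF-32 LE input.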

Changes

Change Log

1.0.0

  • Initial FOSS release