pcre2
Regular expressions via the PCRE2 C library (included)
https://github.com/sjshuck/hs-pcre2#readme
| LTS Haskell 24.16: | 2.2.2 | 
| Stackage Nightly 2025-10-25: | 2.2.2 | 
| Latest on Hackage: | 2.2.2 | 
Apache-2.0 licensed by Steven Shuck and contributors
Maintained by [email protected]
This version can be pinned in stack with:
pcre2-2.2.2@sha256:e123f3f36e343bbab851d69b6de7608f8fbeeba781cf69f14d5f6d671b451142,7121
Module documentation for 2.2.2
- Text
Depends on 6 packages.
pcre2
Regular expressions for Haskell.
Teasers
licensePlate :: Text -> Maybe Text
licensePlate = match "[A-Z]{3}[0-9]{3,4}"
licensePlates :: Text -> [Text]
licensePlates = match "[A-Z]{3}[0-9]{3,4}"
case "The quick brown fox" of
    [regex|\bbrown\s+(?<animal>[A-z]+)\b|] -> Text.putStrLn animal
    _                                      -> die "nothing brown"
let kv'd = lined . packed . [_regex|(?x)  # Extended PCRE2 syntax
        ^\s*          # Ignore leading whitespace
        ([^=:\s].*?)  # Capture the non-empty key
        \s*           # Ignore trailing whitespace
        [=:]          # Separator
        \s*           # Ignore leading whitespace
        (.*?)         # Capture the possibly-empty value
        \s*$          # Ignore trailing whitespace
    |]
forMOf kv'd file $ execStateT $ do
    k <- gets $ capture @1
    v <- gets $ capture @2
    liftIO $ Text.putStrLn $ "found " <> k <> " set to " <> v
    case myMap ^. at k of
        Just v' | v /= v' -> do
            liftIO $ Text.putStrLn $ "setting " <> k <> " to " <> v'
            _capture @2 .= v'
        _ -> liftIO $ Text.putStrLn $ "no change for " <> k
Features
- No opaque “Regex” object. Instead, quiet functions with simple types: for the most part it’s Text (pattern) -> Text (subject) -> result.
- No custom typeclasses.
- A single datatype for both compile and match options, the Option monoid.
- UTF-8 Text everywhere.
- Match success expressed via Alternative.
- Opt-in Template Haskell facilities for compile-time verification of patterns, indexing captures, and memoizing inline regexes.
- Opt-in lens support.
- No failure monads to express compile errors, preferring pure functions and throwing imprecise exceptions with pretty Show instances. Write simple code and debug it. Or, don’t, and use the Template Haskell features instead. Both are first-class.
- Vast presentation of PCRE2 functionality. We can even register Haskell callbacks to run during matching!
- Zero-copying of substrings where beneficial.
- Few dependencies.
- Bundled, statically-linked build of up-to-date PCRE2 (version 10.44), with a complete, exposed Haskell binding.
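To make the Alternative and Option points concrete, here is a minimal sketch. It assumes that Text.Regex.Pcre2 exports match, matchOpt, and the Option constructors Caseless and Multiline; the function names and the sample subject are made up for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Text.Regex.Pcre2 (Option (..), match, matchOpt)

-- The pattern from the teasers above.
plate :: Text
plate = "[A-Z]{3}[0-9]{3,4}"

-- Alternative instantiated to Maybe: the first match, if any.
firstPlate :: Text -> Maybe Text
firstPlate = match plate

-- Alternative instantiated to []: all matches in the subject.
allPlates :: Text -> [Text]
allPlates = match plate

-- Options form a monoid, so they combine with (<>).
-- Caseless and Multiline are assumed constructor names.
anyCasePlates :: Text -> [Text]
anyCasePlates = matchOpt (Caseless <> Multiline) plate

main :: IO ()
main = do
    let subject = "Saw ABC123 and xyz9876 today"
    print (firstPlate subject)
    print (allPlates subject)
    print (anyCasePlates subject)
```

The same pattern serves both result shapes; only the type signature at the use site decides between a single optional match and a list of matches.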
Performance
Currently we are slower than other libraries. For example:
| Operation | pcre2 | pcre-light | regex-pcre-builtin | 
|---|---|---|---|
| Compile and match a regex | 3.9 μs | 1.2 μs | 2.9 μs | 
If it’s really regex processing that’s causing a bottleneck, pcre-light/-heavy/lens-regex-pcre are recommended instead of this library for the very best performance.
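Before switching libraries, it may be worth checking how the pattern is bound. The sketch below assumes, rather than confirms from this page, that a function partially applied to its pattern can reuse the compiled regex across calls, whereas supplying the pattern afresh on every call may recompile it. The names hasTwoDigits and hasTwoDigitsNaive are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Text.Regex.Pcre2 (matches)

-- Pattern bound first, subject left point-free. Under the stated
-- assumption, the compiled regex can be shared between calls.
hasTwoDigits :: Text -> Bool
hasTwoDigits = matches "\\d{2}"

-- Pattern passed in on every call. Nothing here lets the library
-- share work between calls, so each call may compile again.
hasTwoDigitsNaive :: Text -> Text -> Bool
hasTwoDigitsNaive subject pat = matches pat subject

main :: IO ()
main = print (filter hasTwoDigits ["a1", "b22", "c333"])
```

Measure both forms on real input before relying on either; the table above covers only a combined compile-and-match operation.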
Wishlist
- Many performance optimizations.
- Make use of DFA matching for lazy (infinite) inputs. This likely requires some upstream changes as well, but in theory it’s possible.
- Improve compile time. Maybe support an external libpcre2.
License
Apache 2.0. PCRE2 is distributed under the 3-clause BSD license.
Main Author
©2020–2025 Steven Shuck
Changes
Changelog and Acknowledgements
2.2.1
- Fixed #26 where wide UTF-8 characters were not handled correctly.
- Docs fully updated for UTF-8 instead of UTF-16. (Docs were deleted from the 2.2.0 release.)
2.2.0
- Switched to UTF-8 to support text 2.0, implementing #22. text < 2 is no longer supported.
  - Changed type synonym PCRE2_UCHAR from CUShort to CUChar in the low-level bindings.
  - No API changes in the high-level bindings.
  - There is a minor regression in the ability to match \R against line separators (U+2028) and paragraph separators (U+2029). See #26.
2.1.1.1
- Updated library, tests, and docs for mtl 2.3 and microlens-platform 0.4.3.0. The mtl part of this is pursuant to #30.
2.1.1
- Added pattern serialization API, which fixes #23.
- Updated PCRE2 to 10.40 (no API changes).
2.1.0.1
- Explicitly required text < 2.
- Minor docs adjustments.
2.1.0
- Replaced Proxy :: Proxy info with type applications in splices from regex/_regex. This significantly shortens the splices, producing nicer error messages. As a very minor consequence, we now require the user to turn on {-# LANGUAGE TypeApplications #-} when using regex/_regex with patterns with parenthesized captures, even when not using capture/_capture.
2.0.5
- Enabled PCRE2’s built-in Unicode support, which fixes #21.
2.0.4
- Added Show instance for Captures to ease debugging user code.
2.0.3
- Updated PCRE2 to 10.39 (no API changes). The C sources are now drawn from https://github.com/PhilipHazel/pcre2, which fixes #10.
2.0.2
- Fixed a minor issue where the caret indicating pattern location of a Pcre2CompileException was misplaced if the pattern contained a newline.
2.0.1
- Added microlens as a dependency to improve Haddock docs (Traversal' et al. are clickable) and relieve maintenance burden somewhat.
- Moderate refactoring of internals.
2.0.0
This release introduces significant breaking changes in order to make the API smaller, more consistent, and safer.
- Implemented #18:
  - Removed matchAll, matchAllOpt, capturesAll, and capturesAllOpt.
  - Upgraded match, matchOpt, captures, and capturesOpt to offer their functionality, respectively.
  - Renamed capturesA and capturesAOpt to captures and capturesOpt, replacing the latter two functions altogether. captures/-Opt were intended to be extreme convenience functions that required no special datatypes beyond the Prelude. However, this was of doubtful benefit, since that’s false anyway: they required Text, not to mention {-# LANGUAGE OverloadedStrings #-}. Their names are simple and valuable, and no other Alternative-producing function has the naming convention “-A”, so repurposing their names was in order.
- Moved the callout interface to a new module, Text.Regex.Pcre2.Unsafe. This includes the options UnsafeCompileRecGuard, UnsafeCallout, UnsafeSubCallout, and AutoCallout, and the types CalloutInfo, CalloutIndex, CalloutResult, SubCalloutInfo, and SubCalloutResult.
- Also moved option BadEscapeIsLiteral there.
- Removed the ineffectual options DupNames and Utf.
Other improvements with no API impact:
- Updated PCRE2 to 10.37.
- Replaced copied C files with symlinks, diminishing codebase by 1.5K lines and simplifying future PCRE2 updates.
- Reduced size of Template Haskell splices to make error messages less obnoxious.
- Moderate refactoring of internals and documentation.
1.1.5
- Fixed #17, where functions returning Alternative containers were not restricted to single results despite their documentation.
- Minor improvements to docs and examples.
1.1.4
- Fixed some incorrect foreign imports’ safety.
1.1.3.1
- Fixed a very minor issue where pcreVersion still reported “10.35” even though it in fact was using 10.36.
1.1.3
- Made in-house streaming abstraction based on streaming and removed the latter as a dependency.
- Updated PCRE2 to 10.36 (no API changes).
- Docs fixes.
1.1.2
- Refactored using the streaming library. Fixed #11, where large global matches were very slow.
1.1.1
- Fixed #12, where some functions returned too many match results.
1.1.0
- Added global matching.
  - New functions matchAll, matchAllOpt, capturesAll, capturesAllOpt.
  - Changed all traversals from affine to non-affine.
- Changed capturesOptA to capturesAOpt for naming consistency.
1.0.2
- Fixed #4, where multiple named captures were not type-indexed correctly.
- Established automated builds using GitHub Workflows. Thanks amesgen!
1.0.1.1
- Temporarily eliminate all dependency version bounds to get it building on Hackage.
1.0.1
- Fixed #1, where building on Windows would succeed but not run. Thanks Andrew!
- Try to adjust dependency version bounds to get it building on Hackage. Thanks snoyberg!
1.0.0
- Initial release.
