modern-uri

Modern library for working with URIs https://github.com/mrkkrp/modern-uri

Version on this page: 0.2.2.0
LTS Haskell 12.22: 0.2.2.0
Stackage Nightly 2018-12-14: 0.3.0.1
Latest on Hackage: 0.3.0.1


BSD3 licensed and maintained by Mark Karpov


Modern URI


This is a modern library for working with URIs in Haskell as per RFC 3986:

https://tools.ietf.org/html/rfc3986

Features

The modern-uri package features:

  • Correct-by-construction URI data type. Correctness is ensured by guaranteeing that no sub-component of the URI record can be invalid on its own. This boils down to careful use of types and a set of smart constructors for things like scheme, host, etc.
  • Textual components in the URI data type are represented as Text rather than ByteString, because they are percent-decoded and so can contain characters outside the ASCII range (i.e. Unicode). This allows for easier manipulation of URIs, while encoding and decoding headaches are handled by the parsers and renderers for you.
  • Absolute and relative URIs differ only in the scheme component: if it’s Nothing, then the URI is relative, otherwise it’s absolute.
  • Megaparsec parser that can be used as a standalone smart constructor for the URI data type (see mkURI) as well as be seamlessly integrated into a bigger Megaparsec parser that consumes strict Text (see parser) or strict ByteString (see parserBs).
  • The parser performs some normalization, for example it collapses consecutive slashes. Some smart constructors such as mkScheme and mkHost also perform normalization. So in a sense URIs are also “normalized by construction” to some extent.
  • Fast rendering to strict Text and ByteString as well as to their respective Builder types and to String/ShowS.
  • Extensive set of lensy helpers for easier manipulation of the nested data types (see Text.URI.Lens).
  • Quasi-quoters for compile-time construction of the URI data type and refined text types (see Text.URI.QQ).
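
As a rough illustration of two of the points above (the scheme deciding whether a URI is absolute, and smart constructors normalizing their input), here is a minimal sketch. The helper isAbsolute is a hypothetical name, not part of the package, and the lower-casing of the scheme is what I would expect from mkScheme’s documented normalization rather than something this page spells out.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Maybe (isJust)
import qualified Text.URI as URI

-- | Hypothetical helper (not part of modern-uri): a URI counts as absolute
-- exactly when its scheme component is present.
isAbsolute :: URI.URI -> Bool
isAbsolute = isJust . URI.uriScheme

main :: IO ()
main = do
  absolute <- URI.mkURI "https://markkarpov.com/posts.html"
  relative <- URI.mkURI "/posts.html"
  print (isAbsolute absolute)
  print (isAbsolute relative)
  -- mkScheme normalizes its argument; schemes are expected to be lower-cased.
  scheme <- URI.mkScheme "HTTPS"
  print (URI.unRText scheme)
```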

Quick start

The modern-uri package serves three main purposes:

  • Construction of the URI data type.
  • Inspection and manipulation of the URI data type (in the sense of changing its parts).
  • Rendering of URIs.

Let’s walk through every operation quickly.

Construction of URIs

There are four ways to create a URI value. First off, one could assemble it manually like so:

λ> :set -XOverloadedStrings
λ> import qualified Text.URI as URI
λ> scheme <- URI.mkScheme "https"
λ> scheme
"https"
λ> host <- URI.mkHost "markkarpov.com"
λ> host
"markkarpov.com"
λ> let uri = URI.URI (Just scheme) (Right (URI.Authority Nothing host Nothing)) Nothing [] Nothing
λ> uri
URI
  { uriScheme = Just "https"
  , uriAuthority = Right
      (Authority
        { authUserInfo = Nothing
        , authHost = "markkarpov.com"
        , authPort = Nothing })
  , uriPath = Nothing
  , uriQuery = []
  , uriFragment = Nothing }

In this library we use quite a few refined text values. They can only be constructed by using smart constructors like mkScheme :: MonadThrow m => Text -> m (RText 'Scheme). For example, if the argument to mkScheme is not a valid scheme, an exception will be thrown. Strictly speaking, what happens on failure depends on the monad: there are pure instances of the MonadThrow type class, so the smart constructors may also be used in e.g. the Maybe monad, where failure yields Nothing.
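
To make the point about pure MonadThrow instances concrete, here is a small sketch that runs mkScheme in the Maybe monad. The wrapper validScheme is a hypothetical name introduced only for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Text.URI as URI

-- | Hypothetical wrapper: validate a scheme without touching IO.
-- Choosing Maybe as the MonadThrow instance turns a failure into Nothing;
-- on success we unwrap the refined text back to plain Text.
validScheme :: Text -> Maybe Text
validScheme = fmap URI.unRText . URI.mkScheme

main :: IO ()
main = do
  print (validScheme "https")
  -- Spaces are not allowed in a scheme, so this input is rejected.
  print (validScheme "not a scheme")
```

The same trick works for the other smart constructors, since they share the MonadThrow constraint.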

There is a smart constructor that can make an entire URI too; it’s called (unsurprisingly) mkURI:

λ> uri <- URI.mkURI "https://markkarpov.com"
λ> uri
URI
  { uriScheme = Just "https"
  , uriAuthority = Right
      (Authority
        { authUserInfo = Nothing
        , authHost = "markkarpov.com"
        , authPort = Nothing })
  , uriPath = Nothing
  , uriQuery = []
  , uriFragment = Nothing }

If the argument of mkURI is not a valid URI, then an exception will be thrown. The exception will contain full context and the actual parse error.
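
When mkURI runs in IO, the failure can be caught like any other exception. The sketch below assumes the exception type is ParseException as exported by Text.URI; parseOrReport is a hypothetical helper name, and the message comes from displayException, so its exact text depends on the installed version.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Exception (displayException, try)
import Data.Text (Text)
import qualified Text.URI as URI

-- | Hypothetical helper: parse a URI in IO and turn a ParseException
-- into a human-readable message instead of letting it propagate.
parseOrReport :: Text -> IO (Either String URI.URI)
parseOrReport input = do
  result <- try (URI.mkURI input)
  pure $ case result of
    Left e -> Left (displayException (e :: URI.ParseException))
    Right uri -> Right uri

main :: IO ()
main = do
  -- A valid URI is rendered back to a String.
  parseOrReport "https://markkarpov.com" >>= either putStrLn (putStrLn . URI.renderStr)
  -- Unencoded spaces are not valid in a URI, so this prints the parse error.
  parseOrReport "not a uri" >>= either putStrLn (putStrLn . URI.renderStr)
```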

If some refined text value or URI is known statically at compile time, we can use Template Haskell, namely the “quasi quotes” feature. To do so import the Text.URI.QQ module and enable the QuasiQuotes language extension, like so:

λ> :set -XQuasiQuotes
λ> import qualified Text.URI.QQ as QQ
λ> let uri = [QQ.uri|https://markkarpov.com|]
λ> uri
URI
  { uriScheme = Just "https"
  , uriAuthority = Right
      (Authority
        { authUserInfo = Nothing
        , authHost = "markkarpov.com"
        , authPort = Nothing })
  , uriPath = Nothing
  , uriQuery = []
  , uriFragment = Nothing }

Note how the value returned by the uri quasi quote is pure: its construction cannot fail, because an invalid URI inside the quote is a compilation error.

The Text.URI.QQ module also has quasi quoters for scheme, host, and other components; check it out.

Finally, the package provides two Megaparsec parsers: parser and parserBs. The first works on strict Text, while the other works on strict ByteString. You can use the parsers in a bigger Megaparsec parser to parse URIs. To get started with Megaparsec, see its Hackage page.
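
As a sketch of what embedding might look like, the following parser accepts a made-up line format, the word link followed by whitespace and a URI. The format and the name linkLine are assumptions for the example; only parser comes from this package, and the rest is ordinary Megaparsec.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Data.Void (Void)
import Text.Megaparsec (Parsec, parseMaybe)
import Text.Megaparsec.Char (space1, string)
import qualified Text.URI as URI

-- | Parser over strict Text with no custom error component.
type Parser = Parsec Void Text

-- | Hypothetical line format: the keyword "link", some white space, then a URI.
-- URI.parser is used as an ordinary building block of the bigger parser.
linkLine :: Parser URI.URI
linkLine = string "link" *> space1 *> URI.parser

main :: IO ()
main =
  -- parseMaybe requires the whole input to be consumed.
  print (URI.render <$> parseMaybe linkLine "link https://markkarpov.com/posts.html")
```

For strict ByteString input the same shape should work with parserBs and a Parsec Void ByteString parser type.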

Inspection and manipulation

Although one could use record syntax directly, possibly with language extensions like RecordWildcards, the best way to inspect and edit parts of a URI is with lenses. The lenses can be found in the Text.URI.Lens module. If you have never used the lens library, you could probably start by reading/watching materials suggested in the library description on Hackage.

Here are some examples, just to show off what you can do:

λ> import Control.Lens
λ> import Text.URI.Lens
λ> uri <- URI.mkURI "https://example.com/some/path?foo=bar&baz=quux&foo=foo"
λ> uri ^. uriScheme
Just "https"
λ> uri ^? uriAuthority . _Right . authHost
Just "example.com"
λ> uri ^. isPathAbsolute
True
λ> uri ^. uriPath
["some","path"]
λ> k <- URI.mkQueryKey "foo"
λ> uri ^.. uriQuery . queryParam k
["bar","foo"]
-- etc.
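
The session above only reads from a URI. The same lenses can also be used for editing, as in the following sketch, which assumes the lens package is installed (for & and ?~). The function withSchemeAndFragment is a hypothetical name chosen for this example.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE OverloadedStrings #-}

import Control.Lens ((&), (?~))
import qualified Text.URI as URI
import Text.URI.Lens (uriFragment, uriScheme)

-- | Hypothetical helper: set the scheme and the fragment of a URI,
-- leaving every other component untouched. Both fields are Maybe values,
-- so (?~) wraps the new value in Just.
withSchemeAndFragment
  :: URI.RText 'URI.Scheme
  -> URI.RText 'URI.Fragment
  -> URI.URI
  -> URI.URI
withSchemeAndFragment scheme fragment uri =
  uri & uriScheme ?~ scheme & uriFragment ?~ fragment

main :: IO ()
main = do
  uri <- URI.mkURI "http://example.com/some/path"
  scheme <- URI.mkScheme "https"
  fragment <- URI.mkFragment "top"
  putStrLn (URI.renderStr (withSchemeAndFragment scheme fragment uri))
```

Because the new values are refined text built by smart constructors, the edited URI stays valid by construction.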

Rendering

Rendering turns a URI into a sequence of bytes or characters. Currently the following options are available:

  • render for rendering to strict Text.
  • render' for rendering to text Builder. It’s possible to turn that into lazy Text by using the toLazyText function from Data.Text.Lazy.Builder.
  • renderBs for rendering to strict ByteString.
  • renderBs' for rendering to byte string Builder. Similarly it’s possible to get a lazy ByteString from that by using the toLazyByteString function from Data.ByteString.Builder.
  • renderStr can be used to render to String. Sometimes it’s handy. The renderer uses difference lists internally so it’s not that slow, but in general I’d advise avoiding Strings.
  • renderStr' returns ShowS, which is just a synonym for String -> String, i.e. a function that prepends the result of rendering to a given String. This is useful when the URI you want to render is a part of a bigger output, just like with the builders mentioned above.

Examples:

λ> uri <- URI.mkURI "https://markkarpov.com/posts.html"
λ> URI.render uri
"https://markkarpov.com/posts.html"
λ> URI.renderBs uri
"https://markkarpov.com/posts.html"
λ> URI.renderStr uri
"https://markkarpov.com/posts.html"
-- etc.
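
The primed variants are meant for the case where the URI is one piece of a larger output. Here is a minimal sketch of that usage with render' and renderStr'. The helper names linkMessage and linkMessageStr and the "See " prefix are made up for the example.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Builder as TLB
import qualified Data.Text.Lazy.IO as TLIO
import qualified Text.URI as URI

-- | Hypothetical helper: splice a URI into a bigger text Builder and
-- only materialize lazy Text once, at the end.
linkMessage :: URI.URI -> TL.Text
linkMessage uri =
  TLB.toLazyText (mconcat [TLB.fromText "See ", URI.render' uri, TLB.singleton '\n'])

-- | The same idea with ShowS: renderStr' prepends the rendered URI
-- to whatever String it is applied to.
linkMessageStr :: URI.URI -> String
linkMessageStr uri = (showString "See " . URI.renderStr' uri) "\n"

main :: IO ()
main = do
  uri <- URI.mkURI "https://markkarpov.com/posts.html"
  TLIO.putStr (linkMessage uri)
  putStr (linkMessageStr uri)
```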

Contribution

Issues, bugs, and questions may be reported in the GitHub issue tracker for this project.

Pull requests are also welcome and will be reviewed quickly.

License

Copyright © 2017–2018 Mark Karpov

Distributed under the BSD 3-clause license.

Changes

Modern URI 0.3.0.1

  • Allow superfluous & right after the question mark in query parameters.

Modern URI 0.3.0.0

  • Uses Megaparsec 7. Visible API changes amount to an adjustment in definition of the ParseException type.

Modern URI 0.2.2.0

  • Removed a potentially overlapping instance Arbitrary (NonEmpty (RText 'PathPiece)).

  • Fixed a bug that made it impossible to have empty host names. This allows us to parse URIs like file:///etc/hosts.

Modern URI 0.2.1.0

  • Added emptyURI, a URI value representing the empty URI.

Modern URI 0.2.0.0

  • Changed the type of uriPath field of the URI record from [RText 'PathPiece] to Maybe (Bool, NonEmpty (RText 'PathPiece)). This allows us to store whether there is a trailing slash in the path or not. See the updated documentation for more information.

  • Added the relativeTo function.

  • Added the uriTrailingSlash 0-1 traversal in Text.URI.Lens.

Modern URI 0.1.2.1

  • Allow Megaparsec 6.4.0.

Modern URI 0.1.2.0

  • Fixed handling of + in query strings. Now + is parsed as space and serialized as %2b as per RFC 1866 (paragraph 8.2.1). White space in query parameters is serialized as +.

Modern URI 0.1.1.1

  • Fixed implementation of Text.URI.Lens.queryParam traversal.

Modern URI 0.1.1.0

  • Derived NFData for ParseException.

  • Adjusted percent-encoding in renders so it’s only used when absolutely necessary. Previously we percent-escaped a bit too much, which, strictly speaking, did not make the renders incorrect, but that didn’t look nice either.

Modern URI 0.1.0.1

  • Updated the readme to include “Quick start” instructions and some examples.

Modern URI 0.1.0.0

  • Changed the type of uriAuthority from Maybe Authority to Either Bool Authority. This makes it possible to know whether the URI path is absolute without duplicating information: when the Authority component is present the path is necessarily absolute, otherwise the Bool value tells if it’s absolute (True) or relative (False).

  • Added isPathAbsolute in Text.URI and the corresponding getter in Text.URI.Lens.

Modern URI 0.0.2.0

  • Added the renderStr and renderStr' functions for efficient rendering to String and ShowS.

  • Added the parserBs that can consume strict ByteString streams.

Modern URI 0.0.1.0

  • Initial release.