modern-uri
Modern library for working with URIs
https://github.com/mrkkrp/modern-uri
Version on this page: 0.3.0.1@rev:1
LTS Haskell 22.37: 0.3.6.1@rev:2
Stackage Nightly 2024-10-11: 0.3.6.1@rev:2
Latest on Hackage: 0.3.6.1@rev:2
Modern URI
This is a modern library for working with URIs in Haskell as per RFC 3986:
https://tools.ietf.org/html/rfc3986
Features
The modern-uri package features:
- Correct-by-construction URI data type. Correctness is ensured by guaranteeing that no sub-component of the URI record can be invalid on its own. This boils down to careful use of types and a set of smart constructors for things like scheme, host, etc.
- Textual components in the URI data type are represented as Text rather than ByteString, because they are percent-decoded and so they can contain characters outside of the ASCII range (i.e. Unicode). This allows for easier manipulation of URIs, while encoding and decoding headaches are handled by the parsers and renderers for you.
- Absolute and relative URIs differ only by the scheme component: if it’s Nothing, then the URI is relative, otherwise it’s absolute.
- Megaparsec parser that can be used as a standalone smart constructor for the URI data type (see mkURI) as well as be seamlessly integrated into a bigger Megaparsec parser that consumes strict Text (see parser) or strict ByteString (see parserBs).
- The parser performs some normalization, for example it collapses consecutive slashes. Some smart constructors such as mkScheme and mkHost also perform normalization. So in a sense URIs are also “normalized by construction” to some extent.
- Fast rendering to strict Text and ByteString as well as to their respective Builder types and to String/ShowS.
- Extensive set of lensy helpers for easier manipulation of the nested data types (see Text.URI.Lens).
- Quasi-quoters for compile-time construction of the URI data type and refined text types (see Text.URI.QQ).
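The rule about absolute and relative URIs can be illustrated with a small sketch. It assumes only mkURI and the uriScheme record field from Text.URI; the helper name isAbsolute is hypothetical and not part of the library.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Maybe (isJust)
import qualified Text.URI as URI

-- | Hypothetical helper (not part of modern-uri): a URI counts as absolute
-- exactly when its scheme component is present.
isAbsolute :: URI.URI -> Bool
isAbsolute = isJust . URI.uriScheme

main :: IO ()
main = do
  absolute <- URI.mkURI "https://example.com/some/path"
  relative <- URI.mkURI "some/path"
  print (isAbsolute absolute) -- scheme is Just "https", so this prints True
  print (isAbsolute relative) -- no scheme, so this prints False
```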
Quick start
The modern-uri package serves three main purposes:
- Construction of the URI data type.
- Inspection and manipulation of the URI data type (in the sense of changing its parts).
- Rendering of URIs.
Let’s walk through every operation quickly.
Construction of URIs
There are four ways to create a URI value. First off, one could assemble it manually like so:
λ> :set -XOverloadedStrings
λ> import qualified Text.URI as URI
λ> scheme <- URI.mkScheme "https"
λ> scheme
"https"
λ> host <- URI.mkHost "markkarpov.com"
λ> host
"markkarpov.com"
λ> let uri = URI.URI (Just scheme) (Right (URI.Authority Nothing host Nothing)) Nothing [] Nothing
λ> uri
URI
{ uriScheme = Just "https"
, uriAuthority = Right
(Authority
{ authUserInfo = Nothing
, authHost = "markkarpov.com"
, authPort = Nothing })
, uriPath = Nothing
, uriQuery = []
, uriFragment = Nothing }
In this library we use quite a few refined text values. They can only be constructed by using smart constructors like mkScheme :: MonadThrow m => Text -> m (RText 'Scheme). For example, if the argument to mkScheme is not a valid scheme, an exception will be thrown. This is not necessarily so, though, because there are pure monads that are instances of the MonadThrow type class, and so the smart constructors may also be used in e.g. the Maybe monad.
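As a minimal sketch of the pure usage just described, the smart constructor can be run in the Maybe monad so that invalid input yields Nothing instead of an exception. The helper name validScheme is hypothetical; unRText is used here only to get the plain Text back out of the refined value.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Text.URI as URI

-- | Hypothetical helper: validate a scheme purely. Choosing Maybe as the
-- MonadThrow instance turns a validation failure into Nothing.
validScheme :: Text -> Maybe Text
validScheme = fmap URI.unRText . URI.mkScheme

main :: IO ()
main = do
  print (validScheme "https")        -- a valid scheme: Just "https"
  print (validScheme "not a scheme") -- spaces are not allowed: Nothing
```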
There is also a smart constructor that can make an entire URI; it’s called (unsurprisingly) mkURI:
λ> uri <- URI.mkURI "https://markkarpov.com"
λ> uri
URI
{ uriScheme = Just "https"
, uriAuthority = Right
(Authority
{ authUserInfo = Nothing
, authHost = "markkarpov.com"
, authPort = Nothing })
, uriPath = Nothing
, uriQuery = []
, uriFragment = Nothing }
If the argument of mkURI is not a valid URI, then an exception will be thrown. The exception will contain full context and the actual parse error.
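One way to handle that failure without IO exceptions is to run mkURI in Either SomeException, which is also a MonadThrow instance. The sketch below assumes this approach; the helper name parseURI is hypothetical, and the exact wording of the error message depends on the Megaparsec version in use.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Exception (SomeException, displayException)
import Data.Text (Text)
import qualified Text.URI as URI

-- | Hypothetical helper: parse a URI, returning a human-readable error
-- message instead of throwing when the input is invalid.
parseURI :: Text -> Either String URI.URI
parseURI input =
  case URI.mkURI input :: Either SomeException URI.URI of
    Left e -> Left (displayException e)
    Right uri -> Right uri

main :: IO ()
main = do
  -- A valid URI is rendered back to text.
  either putStrLn (print . URI.render) (parseURI "https://markkarpov.com")
  -- An unescaped space is not valid in a URI, so the error message is printed.
  either putStrLn (print . URI.render) (parseURI "https://exa mple.com")
```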
If some refined text value or URI is known statically at compile time, we can use Template Haskell, namely the “quasi quotes” feature. To do so, import the Text.URI.QQ module and enable the QuasiQuotes language extension, like so:
λ> :set -XQuasiQuotes
λ> import qualified Text.URI.QQ as QQ
λ> let uri = [QQ.uri|https://markkarpov.com|]
λ> uri
URI
{ uriScheme = Just "https"
, uriAuthority = Right
(Authority
{ authUserInfo = Nothing
, authHost = "markkarpov.com"
, authPort = Nothing })
, uriPath = Nothing
, uriQuery = []
, uriFragment = Nothing }
Note how the value returned by the uri quasi quote is pure: its construction cannot fail, because an invalid URI inside the quote is a compilation error.
The Text.URI.QQ module has quasi quoters for scheme, host, and other things, check it out.
Finally, the package provides two Megaparsec parsers: parser and parserBs. The first works on strict Text, while the other one works on strict ByteStrings. You can use the parsers in a bigger Megaparsec parser to parse URIs. To get started with Megaparsec, see its Hackage page.
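As a rough sketch of embedding parser in a bigger Megaparsec parser, the example below parses a URI wrapped in angle brackets. The bracketed format and the names Parser and bracketedURI are made up for illustration; only parser and render come from the library.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Data.Void (Void)
import Text.Megaparsec (Parsec, between, parseMaybe)
import Text.Megaparsec.Char (char)
import qualified Text.URI as URI

-- | A plain Megaparsec parser over strict Text with no custom error component.
type Parser = Parsec Void Text

-- | Hypothetical format: a URI enclosed in angle brackets, e.g. <https://example.com>.
-- The closing bracket is not a valid URI character, so the URI parser stops before it.
bracketedURI :: Parser URI.URI
bracketedURI = between (char '<') (char '>') URI.parser

main :: IO ()
main = do
  -- Input in the expected format: the URI is parsed and rendered back.
  print (URI.render <$> parseMaybe bracketedURI "<https://example.com/some/path>")
  -- Missing brackets: the surrounding parser fails, so this prints Nothing.
  print (URI.render <$> parseMaybe bracketedURI "https://example.com")
```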
Inspection and manipulation
Although one could use record syntax directly, possibly with language extensions like RecordWildCards, the best way to inspect and edit parts of a URI is with lenses. The lenses can be found in the Text.URI.Lens module. If you have never used the lens library, you could probably start by reading/watching the materials suggested in the library description on Hackage.
Here are some examples, just to show off what you can do:
λ> import Control.Lens
λ> import Text.URI.Lens
λ> uri <- URI.mkURI "https://example.com/some/path?foo=bar&baz=quux&foo=foo"
λ> uri ^. uriScheme
Just "https"
λ> uri ^? uriAuthority . _Right . authHost
Just "example.com"
λ> uri ^. isPathAbsolute
True
λ> uri ^. uriPath
["some","path"]
λ> k <- URI.mkQueryKey "foo"
λ> uri ^.. uriQuery . queryParam k
["bar","foo"]
-- etc.
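The examples above only read parts of a URI. A minimal sketch of editing with the same lenses might look as follows, assuming the setter operators from the lens library; the helper name forceHttps is hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Lens ((&), (.~), (?~))
import qualified Text.URI as URI
import Text.URI.Lens (uriQuery, uriScheme)

-- | Hypothetical helper: switch the scheme to https and drop all query
-- parameters. It runs in Maybe because mkScheme is a smart constructor.
forceHttps :: URI.URI -> Maybe URI.URI
forceHttps uri = do
  https <- URI.mkScheme "https"
  pure (uri & uriScheme ?~ https & uriQuery .~ [])

main :: IO ()
main = do
  uri <- URI.mkURI "http://example.com/some/path?foo=bar"
  -- Print the rendered form of the edited URI (or Nothing on failure).
  print (URI.render <$> forceHttps uri)
```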
Rendering
Rendering turns a URI into a sequence of bytes or characters. Currently the following options are available:
- render for rendering to strict Text.
- render' for rendering to text Builder. It’s possible to turn that into lazy Text by using the toLazyText function from Data.Text.Lazy.Builder.
- renderBs for rendering to strict ByteString.
- renderBs' for rendering to byte string Builder. Similarly, it’s possible to get a lazy ByteString from that by using the toLazyByteString function from Data.ByteString.Builder.
- renderStr can be used to render to String. Sometimes it’s handy. The renderer uses difference lists internally so it’s not that slow, but in general I’d advise avoiding Strings.
- renderStr' returns ShowS, which is just a synonym for String -> String, a function that prepends the result of rendering to a given String. This is useful when the URI you want to render is a part of a bigger output, just like with the builders mentioned above.
Examples:
λ> uri <- URI.mkURI "https://markkarpov.com/posts.html"
λ> URI.render uri
"https://markkarpov.com/posts.html"
λ> URI.renderBs uri
"https://markkarpov.com/posts.html"
λ> URI.renderStr uri
"https://markkarpov.com/posts.html"
-- etc.
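The builder and ShowS variants are meant for cases where the URI is part of a bigger output. The following sketch shows one possible use of render' and renderStr'; the helper names uriLine and uriWithSuffix and the surrounding text are made up for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Builder as TB
import qualified Data.Text.Lazy.IO as TLIO
import qualified Text.URI as URI

-- | Hypothetical helper: embed a rendered URI in a larger text Builder and
-- only convert to lazy Text once, at the very end.
uriLine :: URI.URI -> TL.Text
uriLine uri =
  TB.toLazyText (TB.fromText "See " <> URI.render' uri <> TB.singleton '\n')

-- | Hypothetical helper: renderStr' prepends the rendered URI to a given String.
uriWithSuffix :: URI.URI -> String
uriWithSuffix uri = URI.renderStr' uri " (rendered with ShowS)"

main :: IO ()
main = do
  uri <- URI.mkURI "https://markkarpov.com/posts.html"
  TLIO.putStr (uriLine uri)
  putStrLn (uriWithSuffix uri)
```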
Contribution
Issues, bugs, and questions may be reported in the GitHub issue tracker for this project.
Pull requests are also welcome and will be reviewed quickly.
License
Copyright © 2017–2018 Mark Karpov
Distributed under BSD 3 clause license.
Changes
Modern URI 0.3.0.1
- Allow a superfluous & right after the question mark in query parameters.
Modern URI 0.3.0.0
- Uses Megaparsec 7. Visible API changes amount to an adjustment in the definition of the ParseException type.
Modern URI 0.2.2.0
- Removed a potentially overlapping instance Arbitrary (NonEmpty (RText 'PathPiece)).
- Fixed a bug that made it impossible to have empty host names. This allows us to parse URIs like file:///etc/hosts.
Modern URI 0.2.1.0
- Added emptyURI, a URI value representing the empty URI.
Modern URI 0.2.0.0
- Changed the type of the uriPath field of the URI record from [RText 'PathPiece] to Maybe (Bool, NonEmpty (RText 'PathPiece)). This allows us to store whether there is a trailing slash in the path or not. See the updated documentation for more information.
- Added the relativeTo function.
- Added the uriTrailingSlash 0-1 traversal in Text.URI.Lens.
Modern URI 0.1.2.1
- Allow Megaparsec 6.4.0.
Modern URI 0.1.2.0
- Fixed handling of + in query strings. Now + is parsed as space and serialized as %2b as per RFC 1866 (paragraph 8.2.1). White space in query parameters is serialized as +.
Modern URI 0.1.1.1
- Fixed implementation of the Text.URI.Lens.queryParam traversal.
Modern URI 0.1.1.0
- Derived NFData for ParseException.
- Adjusted percent-encoding in the renderers so it’s only used when absolutely necessary. Previously we percent-escaped a bit too much, which, strictly speaking, did not make the output incorrect, but it didn’t look nice either.
Modern URI 0.1.0.1
- Updated the readme to include “Quick start” instructions and some examples.
Modern URI 0.1.0.0
- Changed the type of uriAuthority from Maybe Authority to Either Bool Authority. This makes it possible to know whether the URI path is absolute without duplication of information, i.e. when the Authority component is present the path is necessarily absolute, otherwise the Bool value tells if it’s absolute (True) or relative (False).
- Added isPathAbsolute in Text.URI and the corresponding getter in Text.URI.Lens.
Modern URI 0.0.2.0
- Added the renderStr and renderStr' functions for efficient rendering to String and ShowS.
- Added the parserBs parser that can consume strict ByteString streams.
Modern URI 0.0.1.0
- Initial release.