Pretty : A Haskell Pretty-printer library
Pretty is a pretty-printing library: a set of APIs that provide a way to print text in a consistent format of your choosing. This is useful for compilers and related tools.
It is based on the pretty-printer outlined in the paper ‘The Design of a Pretty-printing Library’ in Advanced Functional Programming, Johan Jeuring and Erik Meijer (eds), LNCS 925 http://www.cs.chalmers.se/~rjmh/Papers/pretty.ps
This library is BSD-licensed.
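As a rough illustration of what the API looks like in use, here is a minimal sketch that lays out a small brace-delimited block. The helper name `block` and the sample field names are made up for this example; the combinators themselves (`text`, `<+>`, `<>`, `$$`, `nest`, `vcat`, `render`) come from `Text.PrettyPrint`. The `Prelude hiding ((<>))` import is there because recent versions of `base` also export a `<>` operator.

```haskell
import Prelude hiding ((<>))
import Text.PrettyPrint

-- Hypothetical helper (not part of the library): a named block of
-- "key = value;" lines, indented by two spaces inside braces.
block :: String -> [(String, String)] -> Doc
block name fields =
  text name <+> lbrace
    $$ nest 2 (vcat [ text k <+> equals <+> text v <> semi | (k, v) <- fields ])
    $$ rbrace

-- Render with the default style.
example :: String
example = render (block "point" [("x", "1"), ("y", "2")])
-- example == "point {\n  x = 1;\n  y = 2;\n}"
```

Loading this in GHCi and running `putStrLn example` should print the block with `x` and `y` indented under `point {`.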
The library uses the Cabal build system, so building is simply a matter of running
cabal configure --enable-tests
cabal build
Usually two branches are maintained for Pretty development:
master: This branch is generally kept in a stable state and is where releases are made from. GHC includes the pretty library and tracks the master branch by default, so we don’t want experimental code being pulled into GHC.
next: This branch is the general development branch.
We are happy to receive bug reports, fixes, documentation enhancements, and other improvements.
Please report bugs via the GitHub issue tracker.
Master git repository:
git clone git://github.com/haskell/pretty.git
This library is maintained by David Terei, [email protected]. It was originally designed by John Hughes and has since been heavily modified by Simon Peyton Jones.
Pretty library change log
- Update pretty cabal file and readme.
- Fix tests to work with latest quickcheck.
Version 4.0, 24 August 2011
Big change to the structure of the library. We no longer have a fixed TextDetails data type for storing the various String types that we support. Instead, it is now a type class that just provides a way to convert Strings and Chars to an arbitrary type. This arbitrary type is provided by the user of the library, so they can easily implement support for any String type they want.
This new code lives in Text.PrettyPrint.Core, and the Text.PrettyPrint module uses it to implement the old API. The Text.PrettyPrint.HughesPJ module has been left unchanged as a compatibility module, but is deprecated.
Version 3.0, 28 May 1997
Cured massive performance bug. If you write:
foldl (<>) empty (map (text . show) [1..10000])
You get quadratic behaviour with V2.0. Why? For just the same reason as you get quadratic behaviour with left-associated (++) chains.
This is really bad news. One thing a pretty-printer abstraction should certainly guarantee is insensitivity to associativity. It matters: suddenly GHC’s compilation times went up by a factor of 100 when I switched to the new pretty printer.
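The associativity point can be checked directly against the current library. The sketch below builds the same document with a left-nested and a right-nested fold and compares the rendered output; the names `leftNested` and `rightNested` are made up for this example, and it only demonstrates that the results agree, not how long each takes.

```haskell
import Prelude hiding ((<>))
import Text.PrettyPrint

-- Left-nested: (((empty <> 1) <> 2) <> 3) ...
leftNested :: Int -> Doc
leftNested n = foldl (<>) empty (map (text . show) [1 .. n])

-- Right-nested: 1 <> (2 <> (3 <> ... empty))
rightNested :: Int -> Doc
rightNested n = foldr (<>) empty (map (text . show) [1 .. n])

-- Both shapes should render to the same string.
sameOutput :: Int -> Bool
sameOutput n = render (leftNested n) == render (rightNested n)
```

In GHCi, `sameOutput 10000` should evaluate to `True`, and with the fix described here both variants should take comparable time.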
I fixed it with a bit of a hack (because I wanted to get GHC back on the road). I added two new constructors to the Doc type, Above and Beside:
<> = Beside
$$ = Above
Then, where I need to get to a “TextBeside” or “NilAbove” form I “force” the Doc to squeeze out these suspended calls to Beside and Above; but in so doing I re-associate. It’s quite simple, but I’m not satisfied that I’ve done the best possible job. I’ll send you the code if you are interested.
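The idea of suspending the composition and re-associating when forced can be sketched with a toy model. This is not the library's real Doc type or its actual algorithm: `ToyDoc`, `beside`, `force` and `renderToy` are hypothetical names, and only horizontal composition of plain strings is modelled.

```haskell
-- Toy model only: NOT the library's real Doc type.
data ToyDoc
  = Nil
  | Txt String
  | Beside ToyDoc ToyDoc   -- suspended horizontal composition

-- Composition just allocates a constructor, whatever the nesting.
beside :: ToyDoc -> ToyDoc -> ToyDoc
beside = Beside

-- "Forcing" walks the tree once with an accumulator. This re-associates
-- every Beside to the right, so the work is linear in the number of nodes
-- even when the tree was built left-nested.
force :: ToyDoc -> [String]
force d = go d []
  where
    go Nil          rest = rest
    go (Txt s)      rest = s : rest
    go (Beside l r) rest = go l (go r rest)

renderToy :: ToyDoc -> String
renderToy = concat . force
```

For instance, `renderToy (foldl beside Nil (map (Txt . show) [1 .. 5 :: Int]))` gives `"12345"` without ever appending to the right of a long left-nested result.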
Added new exports: punctuate, hang, int, integer, float, double, rational, lparen, rparen, lbrack, rbrack, lbrace, rbrace.
fullRender’s type signature has changed. Rather than producing a string it now takes an extra couple of arguments that tells it how to glue fragments of output together:
fullRender :: Mode
           -> Int                      -- Line length
           -> Float                    -- Ribbons per line
           -> (TextDetails -> a -> a)  -- What to do with text
           -> a                        -- What to do at the end
           -> Doc
           -> a                        -- Result
The “fragments” are encapsulated in the TextDetails data type:
data TextDetails = Chr Char | Str String | PStr FAST_STRING
The Chr and Str constructors are obvious enough. The PStr constructor has a packed string (FAST_STRING) inside it. It’s generated by using the new “ptext” export.
An advantage of this new setup is that you can get the renderer to do output directly (by passing in a function of type TextDetails -> IO () -> IO ()), rather than producing a string that you then print.
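A minimal sketch of both uses of fullRender follows, assuming the current `pretty` package, where the `PStr` constructor wraps an ordinary `String` rather than a FAST_STRING. The names `putFragment`, `printDoc`, `collect` and `toString`, and the choice of 80 columns with 1.5 ribbons per line, are assumptions made for this example.

```haskell
import Text.PrettyPrint.HughesPJ

-- Emit one fragment, then continue with the rest of the output.
putFragment :: TextDetails -> IO () -> IO ()
putFragment (Chr c)  rest = putChar c >> rest
putFragment (Str s)  rest = putStr s  >> rest
putFragment (PStr s) rest = putStr s  >> rest

-- Write a document straight to stdout, finishing with a newline.
printDoc :: Doc -> IO ()
printDoc = fullRender PageMode 80 1.5 putFragment (putChar '\n')

-- The same fold, but accumulating a String instead of doing output.
collect :: TextDetails -> String -> String
collect (Chr c)  rest = c : rest
collect (Str s)  rest = s ++ rest
collect (PStr s) rest = s ++ rest

toString :: Doc -> String
toString = fullRender PageMode 80 1.5 collect ""
```

Here `printDoc (text "hello" <+> text "world")` writes the line directly, while `toString` builds the equivalent string in memory.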
Version 2.0, 24 April 1997
Made empty into a left unit for <> as well as a right unit; it is also now true that nest k empty = empty which wasn’t true before.
Fixed an obscure bug in sep that occasionally gave very weird behaviour
Corrected and tidied up the laws and invariants
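The unit laws mentioned in this entry can be spot-checked against the current library. This is only a sketch on single sample values, not a proof; `unitChecks` is a hypothetical name.

```haskell
import Prelude hiding ((<>))
import Text.PrettyPrint

-- Each entry should be True if the corresponding law holds
-- for the sample document used here.
unitChecks :: [Bool]
unitChecks =
  [ render (empty <> text "x") == "x"   -- empty is a left unit for <>
  , render (text "x" <> empty) == "x"   -- empty is a right unit for <>
  , isEmpty (nest 4 empty)              -- nest k empty = empty
  ]
```

Evaluating `and unitChecks` in GHCi should give `True`.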
Relative to John’s original paper, there are the following new features:
There’s an empty document, “empty”. It’s a left and right unit for both <> and $$, and anywhere in the argument list for sep, hcat, hsep, vcat, fcat etc.
It is Really Useful in practice.
There is a paragraph-fill combinator, fsep, that’s much like sep, only it keeps fitting things on one line until it can’t fit any more.
Some random useful extra combinators are provided. <+> puts its arguments beside each other with a space between them, unless either argument is empty in which case it returns the other.
hcat is a list version of <>
hsep is a list version of <+>
vcat is a list version of $$
sep (separate) is either like hsep or like vcat, depending on what fits
cat behaves like sep, but it uses <> for horizontal composition
fcat behaves like fsep, but it uses <> for horizontal composition
These new ones do the obvious things: char, semi, comma, colon, space, parens, brackets, braces, quotes, doubleQuotes
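The list combinators above can be compared side by side. The sketch below renders the same three words at two line lengths; `items`, `wide` and `narrow` are hypothetical names, and the line lengths of 100 and 10 are arbitrary choices for this example.

```haskell
import Text.PrettyPrint

items :: [Doc]
items = map text ["alpha", "beta", "gamma"]

-- Render with a generous and with a very tight line length.
wide, narrow :: Doc -> String
wide   = renderStyle style { lineLength = 100 }
narrow = renderStyle style { lineLength = 10 }

-- wide (hcat items)   == "alphabetagamma"
-- wide (hsep items)   == "alpha beta gamma"
-- wide (vcat items)   == "alpha\nbeta\ngamma"
-- wide (sep items)    == "alpha beta gamma"     (fits, so like hsep)
-- narrow (sep items)  == "alpha\nbeta\ngamma"   (does not fit, so like vcat)
-- wide (fsep items)   == "alpha beta gamma"
```

With `narrow`, `fsep items` fills each line as far as it can before breaking, so its result depends on the exact width chosen.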
The “above” combinator, $$, now overlaps its two arguments if the last line of the top argument stops before the first line of the second begins.
For example:
        text "hi" $$ nest 5 (text "there")
lays out as
        hi   there
rather than
        hi
             there
There are two places this is really useful
a) When making labelled blocks, like this:
        Left ->   code for left
        Right ->  code for right
        LongLongLongLabel ->
                  code for longlonglonglabel
The block is on the same line as the label if the label is short, but on the next line otherwise.
b) When laying out lists like this:
        [ first
        , second
        , third
        ]
which some people like. But if the list fits on one line you want [first, second, third]. You can’t do this with John’s original combinators, but it’s quite easy with the new $$.
The combinator $+$ gives the original “never-overlap” behaviour.
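The difference between the two vertical combinators can be seen on the "hi"/"there" example. This sketch uses only functions exported by `Text.PrettyPrint`; the names `overlapping` and `stacked` are made up for illustration.

```haskell
import Text.PrettyPrint

-- $$ lets the second document slide up onto the first line when the
-- first line ends before the second one's indentation begins.
overlapping :: Doc
overlapping = text "hi" $$ nest 5 (text "there")

-- $+$ always starts the second document on a new line.
stacked :: Doc
stacked = text "hi" $+$ nest 5 (text "there")

-- render overlapping == "hi   there"
-- render stacked     == "hi\n     there"
```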
Several different renderers are provided:
- a standard one
- one that uses cut-marks to avoid deeply-nested documents simply piling up in the right-hand margin
- one that ignores indentation (fewer chars output; good for machines)
- one that ignores indentation and newlines (ditto, only more so)
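In the current `pretty` package these renderers appear to correspond to the four constructors of the `Mode` type: `PageMode`, `ZigZagMode`, `LeftMode` and `OneLineMode`. The sketch below renders one sample document in a chosen mode; `sample` and `renderWith` are hypothetical names for this example.

```haskell
import Text.PrettyPrint

sample :: Doc
sample = text "begin" $$ nest 4 (text "body") $$ text "end"

-- Render the sample with the default style, overriding only the mode.
renderWith :: Mode -> String
renderWith m = renderStyle style { mode = m } sample

-- renderWith PageMode == "begin\n    body\nend"
-- LeftMode keeps the line breaks but drops the indentation.
-- OneLineMode drops both and puts everything on a single line.
```

Trying `putStrLn (renderWith m)` for each mode in GHCi is a quick way to compare them.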
Numerous implementation tidy-ups
Use of unboxed data types to speed up the implementation