A collection of tools for processing PDF files. https://github.com/Yuras/pdf-toolbox

BSD-3-Clause licensed by Yuras Shumovich
Maintained by Yuras Shumovich

Mid-level tools for processing PDF files.

Level of abstraction: document, catalog, page


  • fix compilation on ghc 7.4, 7.6 and 7.8
  • fix xobject handling in text extraction

  • support xobjects in text extraction

  • switch to errors-2.0

  • support ghc-7.10.1

  • support crypto handler version 4 (V2 and AESV2)

  • extracting text: try to insert spaces and newlines
  • fix attoparsec module deprecation warnings
  • fix AMP (Applicative-Monad Proposal) warnings