Command line utilities for working with epub files https://github.com/dino-/epub-tools.git
Latest on Hackage: 2.11
Command line utilities for working with epub files (Haskell)
A suite of command-line utilities for creating and manipulating epub book files. Included are: epubmeta, epubname, epubzip. This software uses the epub-metadata library, also available on Hackage.
epubmeta is a command-line utility for examining and editing epub book metadata. With it you can export, import and edit the raw OPF Package XML document for a given book, or simply dump the metadata to stdout in a readable format.
Here’s an example of epubmeta output:
$ epubmeta Kelly_Kessel_Lethem-NinetyPercentOfEverything.epub
package
   version: 2.0
   unique-identifier: calibre_id
identifier
   id: calibre_id
   scheme: calibre
   text: b1026732-69a5-4a05-a8d9-a1701685f6fa
identifier
   scheme: ISBN
   text: 1-590620-00-3
title: Ninety Percent of Everything
language: en-us
contributor
   text: calibre (0.5.1) [http://calibre.kovidgoyal.net]
   file-as: calibre
   role: bkp
creator
   text: James Patrick Kelly
   file-as: Kelly, James Patrick
   role: aut
creator
   text: John Kessel
   role: aut
creator
   text: Jonathan Lethem
   role: aut
date: 2001-03-25T00:00:00
publisher: www.Fictionwise.com
subject: Science Fiction/Fantasy
epubname is a command-line utility for renaming epub ebook files based on their OPF Package metadata. It tries to use author names and title info to construct a sensible name.
Using it looks like this:
$ epubname poorly-named-book.epub
poorly-named-book.epub -> WattsPeter-Blindsight_2006.epub
$ epubname another-poorly-named-book.epub
another-poorly-named-book.epub -> Kelly_Kessel_Lethem-NinetyPercentOfEverything.epub
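To give a feel for the kind of transformation involved, here is a minimal Haskell sketch that reproduces the two names shown above from author, title and year information. It is only an illustration under stated assumptions: the `Book` type and the functions `camelCase`, `authorPart` and `suggestName` are hypothetical and are not part of epubname or the epub-metadata library, and the real tool drives its naming from a rules DSL that covers many more cases (magazines, anthologies, file-as names and so on).

```haskell
import Data.Char (isAlphaNum, toUpper)
import Data.List (intercalate)

-- Hypothetical, simplified book description for illustration only;
-- these are NOT the types used by epub-metadata.
data Book = Book
  { bookAuthors :: [String]   -- creator names in "First [Middle] Last" order
  , bookTitle   :: String
  , bookYear    :: Maybe Int  -- publication year, if one is known
  }

-- Capitalize the first letter of each word, drop non-alphanumeric
-- characters, and join the words together.
camelCase :: String -> String
camelCase = concatMap (capitalize . filter isAlphaNum) . words
  where
    capitalize []       = []
    capitalize (c : cs) = toUpper c : cs

-- One author: last name followed by the remaining names.
-- Several authors: last names only, joined with underscores.
authorPart :: [String] -> String
authorPart []     = ""
authorPart [name] =
  case reverse (words name) of
    []                    -> ""
    (lastName : restRev)  -> concat (lastName : reverse restRev)
authorPart names  =
  intercalate "_" [ last ws | n <- names, let ws = words n, not (null ws) ]

-- Assemble "<authors>-<Title>[_<year>].epub".
suggestName :: Book -> String
suggestName b =
  authorPart (bookAuthors b) ++ "-" ++ camelCase (bookTitle b)
    ++ yearPart ++ ".epub"
  where
    yearPart = maybe "" (\y -> '_' : show y) (bookYear b)
```

Loaded in GHCi, `suggestName (Book ["Peter Watts"] "Blindsight" (Just 2006))` evaluates to `"WattsPeter-Blindsight_2006.epub"`. Passing the three authors of the second book with `Nothing` for the year gives `"Kelly_Kessel_Lethem-NinetyPercentOfEverything.epub"`.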
epubzip is a handy utility for zipping up the files that comprise an epub into an .epub zip file. Using the same technology as epubname, it can try to make a meaningful filename for the book.
Getting this software
Getting source for development
- Download the cabalized source package from Hackage
- Get the source with git:
$ git clone https://github.com/dino-/epub-tools.git
- Get the source with stack:
$ stack unpack epub-tools
- If you’re just looking, browse the source
Once you have the source, build it the usual way:
$ stack build
$ stack test
$ stack haddock
Dino Morelli <firstname.lastname@example.org>
- Switched license from BSD3 to ISC
- Fixes and additions to some epubname rules
- Fixed a couple of cabal file issues
- Changed the Stackage resolver from a nightly to an lts
- Now using a less-fragile
- Removed an unused GHC options compile directive
- Added a missing module to all other-modules stanzas
- Added a unit test for authors with a middle name
- Switched build to stack
- Added hsinstall installation script and updated windows dist script
- Various cabal file updates
- Moved copyright date up to 2016
- Updated README with better instructions for getting this software
- Fixed some magazine and anthology naming rules
- Added back Control.Applicative import for GHC 7.8 compatibility
- Fixed an error in bad DSL command index reporting
- Replaced deprecated Control.Monad.Error with Control.Monad.Except
- Replaced deprecated System.Cmd with System.Process
- Removed useless import of Control.Applicative
- Updated cabal homepage, tested-with and source-repository
- Removed unused path to rules file in unit tests
- Reformatted TODO and development notes into Markdown documents
- Updated boringfile with cabal sandbox filespecs
- Switched over to using Data.Version instead of hard-coded version string
- Additions and modifications to the stock rules to both support more books and also use more generic rules than before
- Added code to set proper case for Roman numerals in titles
- Added code to handle file-as names with parenthesized name info
- Simplified and consolidated special-character filtering code
- Added support for multiline fields in the Metadata
- Fixed problem in Windows cmd shell with missing UNIX HOME env variable
- Now gracefully handling last-name-first creators. For books with no file-as and the Creator text arranged last-name-first with a comma, do the right thing.
- Added a new rule for generically-titled magazines with an issue
- Incorporated project website info into README.md and changelog.md files. This information is now in source control where it belongs.
- Added missing files to .cabal for sdist
- Changed copyright date range to 2014
- These tools now support both epub2 and epub3
- Documentation changes and additions
- Updated to build against recent changes and bug fixes in epub-metadata 3.0
- All support data files have been brought into the binaries now. This makes these tools more tolerant of being moved to a different location than the one they were configured for at build time.
- Some documentation additions and changes
- Fixed a stack overflow problem with some epub documents
- Added new subjectMatch command to the DSL, similar to authorMatch. This is being primarily used to detect anthology publications.
- Removed some rules that are now handled by anthology detection, and fixed relevant unit tests
- Clarified DSL documentation for authorMatch a little more
- Modified rules for some magazines to reflect changes to recent editions
- Fixed an error in the DSL documentation
- Fixed a bug in epubzip where no epub file would be created if none already existed
- Major redesign of the formatting rules system. Renaming machinery is now described in a domain-specific language, NOT in statically compiled code. Users are able to extend the functionality with custom naming rules in conf files.
- Added interactive mode to ask about each file rename as it happens. This is like darcs now!
- Added ability to specify target directory for books to be moved to as part of renaming. Includes code to check that target directory exists.
- Removed --overwrite option. Turns out, renameFile has always been smart enough to not overwrite existing.
- Added / character to filters, a big no-no character for file paths on most sane filesystems
- Publication year was looking for publication before original-publication, causing problems in books that have both tags
- Miscellaneous rules changes for various publications
- Changed how this code provides epub zip file contents as a ByteString to the epub-metadata library. Need to read this data strictly to avoid dangling open files.
- Corrected for breakage due to change in title format of Eclipse magazine
- Some work done on the utility script for deploying Windows binaries of these tools
- Added parsing support and test cases for more date formats
- Minor usage info changes
- Changed how publication date is found to more closely follow the OPF spec recommendations
- Changed the switches related to publication date
- Redesigned unit test code and added more tests for new date code
- Huge redesign of how formatting works, dramatically shortening the code needed to handle any given book type. Code is much more monadic now and consolidated into one module.
- Many changes/additions to magazine and compilation book name formatting
- Fixed a group of bugs that occur when a creator has only a single word for their name
- Extensive changes/additions to unit testing for above
- Extensive changes to the cabal build of this project to bring it up to Cabal 1.10
- Unit tests now use the test-suite cabal stanza
- Initial release