citeproc
Generates citations and bibliography from CSL styles.
| LTS Haskell 24.17: | 0.9.0.1 | 
| Stackage Nightly 2025-10-26: | 0.11 | 
| Latest on Hackage: | 0.11 | 
citeproc-0.11@sha256:9cdd3dce56d312f3e72de89eaec05e844af511c82d0a44eb374d5db8c4838001,5665
Module documentation for 0.11
citeproc
This library generates citations and bibliography formatted according to a CSL style. Currently version 1.0.2 of the CSL spec is targeted.
This library is a successor to pandoc-citeproc, which was a fork of Andrea Rossato’s citeproc-hs. I always found it difficult to fix bugs in pandoc-citeproc and decided that implementing citeproc from scratch would give me a better basis for understanding. This library has a number of other advantages over pandoc-citeproc:
- it is much faster (as a rough benchmark, running the CSL test suite takes less than 4 seconds with this library, compared to 12 seconds with pandoc-citeproc)
- it interprets CSL more faithfully, passing more of the CSL tests
- it has fewer dependencies (in particular, it does not depend on pandoc)
- it is more flexible, not being tied to pandoc’s types.
Unlike pandoc-citeproc, this library does not provide an executable. It will be used in pandoc itself to provide integrated citation support and bibliography format conversion (so the pandoc-citeproc filter will no longer be necessary).
How to use it
The main point of entry is the function citeproc from the
module Citeproc.  This takes as arguments:
- a CiteprocOptions structure, which includes the following options:
  - linkCitations controls whether citations are hyperlinked to the bibliography.
  - linkBibliography automatically linkifies any identifiers (DOI, PMCID, PMID, or URL) appearing in a bibliography entry. When an entry has a DOI, PMCID, PMID, or URL available but none of these are rendered by the style, it adds a link to the title (or, if no title is present, the whole entry), using the URL for the DOI, PMCID, PMID, or URL (in that order of priority). See Appendix VI of the CSL v1.0.2 spec.
- a Style, which you will want to produce by parsing a CSL style file using parseStyle from Citeproc.Style,
- optionally a Lang, which allows you to override a default locale,
- a list of References, which you can produce from a CSL JSON bibliography using aeson’s decode,
- a list of Citations (each of which may have multiple CitationItems).
It yields a Result, which includes a list of formatted
citations and a formatted bibliography, as well as any warnings
produced in evaluating the style.
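The priority rule that linkBibliography applies when a style renders none of the identifiers can be sketched as follows. This is an illustration only, not the library's code: the Ids record and fallbackLink are hypothetical names, and the URL prefixes are assumptions based on Appendix VI of the CSL spec that should be checked against it.

```haskell
import Control.Applicative ((<|>))

-- Hypothetical, simplified view of the identifiers on one entry.
data Ids = Ids
  { idDOI   :: Maybe String
  , idPMCID :: Maybe String
  , idPMID  :: Maybe String
  , idURL   :: Maybe String
  }

-- Pick the link target in priority order: DOI, PMCID, PMID, URL.
-- The URL prefixes are assumptions, not taken from the library.
fallbackLink :: Ids -> Maybe String
fallbackLink ids =
      (("https://doi.org/" ++) <$> idDOI ids)
  <|> (("https://www.ncbi.nlm.nih.gov/pmc/articles/" ++) <$> idPMCID ids)
  <|> (("https://www.ncbi.nlm.nih.gov/pubmed/" ++) <$> idPMID ids)
  <|> idURL ids
```

The first identifier that is present wins, so an entry with both a DOI and a URL would link to the DOI.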
The types are parameterized on a CiteprocOutput instance a,
which represents formatted content in your bibliographic
fields (e.g. the title).  If you want a classic CSL processor,
you can use CslJson Text.  But you can also use another type,
such as a pandoc Inlines.  If you want to work with a type
other than these, you need to define an instance of
CiteprocOutput for your type. This tells citeproc how to
apply various kinds of formatting transformations, such as
adding emphasis, making things uppercase, and so on.  Note that
the same type must be used for References and Citations; thus,
for example, you can’t process a list of Citation Inlines
against references of type Reference (CslJson Text).
The signature of parseStyle may not be self-evident:
the first argument is a function that takes a URL and
retrieves the text from that URL.  This is used to fetch
the “independent parent” of a dependent style.  You can supply
whatever function you like: it can search your local file
system or fetch the content via HTTP.  If you’re not using
dependent styles, you can get by with \_ -> return mempty.
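As a rough sketch of a resolver that searches the local file system, the functions below map a parent-style URL to a file in a directory of .csl files. The names localPathFor and localResolver are hypothetical, and the type Text -> IO Text is an assumption based on the description above (a function from a URL to the text at that URL, run in IO).

```haskell
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import System.Directory (doesFileExist)
import System.FilePath ((</>))

-- Map a style URL to a file in a local directory by keeping only the
-- last path segment and ensuring a ".csl" extension.
localPathFor :: FilePath -> Text -> FilePath
localPathFor dir url = dir </> T.unpack name
  where
    lastSegment = last (T.splitOn (T.pack "/") url)
    name
      | T.pack ".csl" `T.isSuffixOf` lastSegment = lastSegment
      | otherwise = lastSegment <> T.pack ".csl"

-- URL in, style text out. Returns empty text when no local copy exists.
localResolver :: FilePath -> Text -> IO Text
localResolver dir url = do
  let path = localPathFor dir url
  exists <- doesFileExist path
  if exists then TIO.readFile path else return mempty
```

Under those assumptions, localResolver "styles" could be supplied as the first argument of parseStyle in place of \_ -> return mempty.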
The citeproc executable
If the package is compiled with the executable flag, an
executable citeproc will be built.  citeproc reads
a JSON-encoded Inputs object from stdin (or from
a file if a filename is provided) and writes
a JSON-encoded Result object to stdout.  (It does so using
CslJson Text as the underlying type.) This executable
can be used to add citation processing to non-Haskell projects.
citeproc --help will summarize usage information.  See
the man page for more information.
Known bugs and limitations
Although this library is much more accurate in implementing the CSL spec than pandoc-citeproc was, it still fails some of the tests from the CSL test suite (62/818). However, most of the failures are on minor corner cases, and in many cases the expected behavior goes beyond what is required by the CSL spec. (For example, we intentionally refrain from capitalizing terms in initial position in note styles. It makes more sense for the calling program, e.g. pandoc, to do the capitalization when it puts the citations in notes, since some citations in note styles may already be in notes and in this case their rendering may not require capitalization. It is easy to capitalize reliably, hard to uncapitalize reliably.)
Changes
citeproc changelog
0.11
- Expand macros in evaluation rather than style parsing (#172). This fixes a serious performance issue in styles with heavy use of macros, such as the new chicago styles. With this change, memory use goes down by more than a factor of ten with these styles.
- All fields in NameFormat are now Maybe values, so we can tell what has been explicitly set [API change].
- A new function combineNameFormat allows filling Nothing values in the first argument with Just values in the second [API change]. The old defaults that were used for the non-Maybe values are now set at the appropriate place in Citeproc.Eval.
- Add styleNameFormat field to Style [API change].
- Add layoutNameFormat to Layout [API change].
- Add parameter for a NameFormat to SortKeyMacro constructor on SortKey [API change].
- CSL JSON: allow formatting in numeric fields (#170). There’s a catch, though. Currently the number splitting code (splitNums) has to convert everything to text, so the formatting will be lost. Still, this is better than treating the formatting code as plain text which will then be escaped in the output. So, for example, we get 1er instead of 1<sup>er</sup> for CSL JSON 1<sup>er</sup>.
- Improve test suite so that expected failures are tracked.
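The idea behind combineNameFormat (keep what the first argument sets explicitly, fill the rest from the second) can be illustrated with a small stand-in record. NameFormatSketch and combineSketch are hypothetical names; the library's NameFormat has more fields and different field types.

```haskell
import Control.Applicative ((<|>))

-- Hypothetical two-field stand-in for the library's NameFormat record.
data NameFormatSketch = NameFormatSketch
  { sketchAnd       :: Maybe String
  , sketchDelimiter :: Maybe String
  } deriving (Eq, Show)

-- Keep every Just in the first argument; fill each Nothing from the second.
combineSketch :: NameFormatSketch -> NameFormatSketch -> NameFormatSketch
combineSketch a b = NameFormatSketch
  { sketchAnd       = sketchAnd a <|> sketchAnd b
  , sketchDelimiter = sketchDelimiter a <|> sketchDelimiter b
  }
```

Because every field is a Maybe, a value set at a more specific level can be told apart from one that was simply left unset, which is what makes this kind of layered fallback possible.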
0.10
- Update locales from upstream (#161). A number of new locales, as well as new terms, e.g. number-of-volumes, have been added.
- Add PreserveCase constructor to TextCase [API change]. Ensure that PreserveCase is added to names that begin with a lowercase letter. See jgm/pandoc#10983. This will block transformations like the addition of a capital letter at the beginning of a note citation.
- Fix handling of <substitute> so it works with <choose> inside (#159).
- Put TagPrefix/TagSuffix tags only over prefix/suffix. Previously they also enclosed the material being prefixed/suffixed. This change has no effect on any citeproc or pandoc tests.
- Update CSL tests from upstream and adjust tests expected to pass. Note that the expected test outputs in these tests seem out of sync with the new locales…
- locator = “page” should not be true when there is no locator (#165).
- Add Citation prefix and suffix (#156). This adds citationPrefix, citationSuffix fields to Citation [API change].
- Fix sorting so that anything with a prefix or suffix is left in place (#155, cf. #89). Motivating example: blabla [e.g., @zeta2021;@alpha2020] should not render as (Alpha 2020, e.g., Zeta 2021).
- Fix entity-escaping of characters in the executable output (#153, Daphne Preston-Kendal).
- Fix is-numeric="locator" (#164). Previously we weren’t looking at the locator in this case, but only at the variables defined in the Reference.
- Improve is-numeric detection.
- Add Eq typeclass instance for Result a (Linus Arver) [API change].
- Improve README (#167, building on suggestions by @listx).
0.9.0.1
- Fix readAsInt so it handles negative numbers in strings. readAsInt attempts to read strings as integers, but previously it didn’t properly handle strings like "-387", which are sometimes used in bibliographies. See jgm/pandoc#10839.
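One way to read a whole string as a possibly negative integer is sketched below with the text package's Data.Text.Read. This is not the library's implementation; readAsIntSketch is a hypothetical name, and stripping surrounding whitespace is an added assumption.

```haskell
import qualified Data.Text as T
import qualified Data.Text.Read as TR

-- Read the whole string as an integer, accepting a leading sign.
-- Anything left over after the digits makes the read fail.
readAsIntSketch :: T.Text -> Maybe Int
readAsIntSketch t =
  case TR.signed TR.decimal (T.strip t) of
    Right (n, rest) | T.null rest -> Just n
    _                             -> Nothing
```

Using TR.decimal on its own would reject "-387", since it only accepts digits; wrapping it in TR.signed is what admits the sign.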
0.9
- Fix handling of type conditions in if (#151). In an if element with type="article-journal chapter", citeproc previously treated this as two separate conditions (type=article-journal, type=chapter). But it seems that the intended behavior is to treat it as a single condition that succeeds if any of the listed types match. The difference between current and intended behavior comes out when match="all" is used; this will always fail when type contains more than one type. To fix this, we change the HasType constructor on Condition so that it takes a list of Texts rather than a single one [API change], and we populate it with the result of splitting the argument of type. In Eval, we change the clause for the HasType condition so that it succeeds if any of the types in the list match.
- Add --link-citations and --link-bibliography options to binary (#142, Daphne Preston-Kendal).
- Bump containers upper bound.
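The difference that the HasType change in 0.9 makes under match="all" can be seen in a simplified model. The types and functions below are hypothetical stand-ins (using String rather than Text), not the library's definitions.

```haskell
-- Hypothetical, simplified model of the conditions on a CSL <if> element.
data Match = MatchAll | MatchAny | MatchNone

-- After the change, one type condition carries a list of types.
newtype Condition = HasType [String]

-- A type condition succeeds if the item's type is any of the listed types.
holds :: String -> Condition -> Bool
holds itemType (HasType ts) = itemType `elem` ts

-- Combine the conditions according to the match attribute.
evalIf :: Match -> String -> [Condition] -> Bool
evalIf MatchAll  ty = all (holds ty)
evalIf MatchAny  ty = any (holds ty)
evalIf MatchNone ty = not . any (holds ty)

-- The old reading: one condition per listed type.
oldParse :: String -> [Condition]
oldParse = map (\t -> HasType [t]) . words

-- The new reading: a single condition listing all the types.
newParse :: String -> [Condition]
newParse s = [HasType (words s)]
```

With the old reading, evalIf MatchAll "chapter" (oldParse "article-journal chapter") is False, because no item can have both types at once; with the new reading the same test is True.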
0.8.1.3
- Don’t add SubstitutedVal to variables that were empty (#148). This fixes a bug which caused variable= tests to succeed in some cases where they should have failed.
0.8.1.2
- Allow containers 0.7 (#143).
- Update tests to use Diff >= 1.0 (#146).
- Fix dropTextWhile and dropTextWhileEnd in Citeproc.Pandoc. Ensure that they treat SoftBreak like Space (jgm/pandoc#10451).
0.8.1.1
- Include 10/ prefix in short DOI links (#136).
- Properly implement demote-non-dropping-particle="sort-only" (#141). We had previously gotten sorting behavior right for this, but not display behavior.
0.8.1
- In Pandoc and CslJson CaseTransform, group punctuation in clusters (#127).
- Fix sorting on dates (#126). Previously this broke for some styles, e.g. apa.csl, which styles dates as MM/DD/YYYY, and would lead to incorrect sorting of dates with months and/or days.
- Add citation-key variable from citeId. This is a new addition in CSL 1.0.2.
- Update locales from upstream.
- Raise an error if multiple layout elements are present (#120).
- Fix two test cases. They had illegal bibliography elements with no layout children.
- If there are multiple layout elements, only use the last one. This can happen with CSL-M styles. The last layout should be locale-unspecific. This change will prevent us from emitting doubled citations or bibliographic references (see #120), allowing more graceful handling of CSL-M styles, even though we don’t support CSL-M.
0.8.0.2
- Fix missing locator after collapsing and grouping with year suffix (#96).
0.8.0.1
- Fix disambiguation edge case (#116). We weren’t properly disambiguating when only one of two ambiguous names had a subsequent citation.
- Chicago page numbering fixes.
- Update test suite from upstream.
- Handle whole-citation links differently in secondFieldAlign (#113, Benjamin Bray).
- Require data-default >= 0.5.2 (#114, Bodigrim).
0.8
- Add SubstitutedVal constructor for Val [API change] (#101, #108). This is used to track variables that are repressed due to substitution. (We can’t just delete them, because they still count when we have “if” elements that check for a variable.)
- Fix logic for including a group. A group with a text node and an empty variable should count as empty.
- CaseTransform: don’t change words that are a mix of uppercase and nonletters, like CRT1000.
- Fix label with “page” variable (#107).
- Fix error in test suite. We stripped indentation in the expected result in some cases.
- Update fr-FR locale from upstream.
0.7
- Handle old term form sub verbo as if it is sub-verbo (the new form).
- Update to latest locales in CSL repository.
- Makefile: Fix update-locales target.
- Keep explicit “et al.” (#102, Albert Krewinkel).
- Factor out deleteSubstitutedVariables.
- Add any references in citationItemData to references.
- Add citationItemData field to CitationItem [API change]. This corresponds to the itemData that can appear in the JSON representation of a citation item.
- Add Ord, Eq instances for Reference, DisambiguationData, Val [API change].
0.6.0.1
- Ensure that position evaluates false inside bibliography (#99).
0.6
- Add Term parameter to TagTerm [API change].
- Add TagPrefix, TagSuffix constructors to Tag [API change].
- Make sure that extracted AuthorOnly names have the correct formatting (#55).
- Do case-insensitive sorting, like Zotero (#91).
- Ignore “ibid” entries in computing ambiguities.
- Improved disambiguation for author-in-text citations.
- In disambiguating, convert author-in-text to normal citations. Otherwise we disambiguate incorrectly.
- Fix title disambiguation with note style (#90). Previously we’d been calculating ambiguities by generating renderings for citation items independently of context. This meant that we didn’t detect ambiguities in “subsequent” citations (which might e.g. just have an author).
- Ensure we don’t do collapsing of items across a prefix or suffix (#89). If we have [@doe99; for contrasting views see @smith33; @doe00], we don’t want to get collapsing to (Doe 1999, 2000; for contrasting views, see Smith 1933). This isn’t strictly by the spec, but it gives better results.
- Allow collapsing after an initial prefix.
0.5
- Add linkBibliography field to CiteprocOptions [API change]. When this is set to True, we hyperlink bibliography entries according to the draft of the CSL v1.0.2 spec (Appendix VI). When an entry has a DOI, PMCID, PMID, or URL available but none of these are rendered by the style, add a link to the title (or, if no title is present, the whole entry), using the URL for the DOI, PMCID, PMID, or URL (in that order of priority). (Benjamin Bray, #88.)
- In generating citation labels, only use issued date. Not, for example, accessed (#80).
- Citeproc.Locale: export lookupQuotes [API change].
- Citeproc.Types: Add localizeQuotes method to CiteprocOutput class [API change].
- Citeproc.CslJson, Citeproc.Pandoc: Implement localizeQuotes.
- Citeproc: apply localizeQuotes after rendering. This ensures that quotes are properly localized and flipflopped. Previously this was done in renderCslJson (for CSL JSON) and in pandoc (for Pandoc Inlines). It is more consistent to do this as part of the rendering pipeline, in citeproc itself.
- Citeproc.CslJson: Drop the Locale parameter from the signature of renderCslJson [breaking API change]. It was only needed for quote localization, which now occurs outside of this function.
- Citeproc.Pandoc: use a Span with class csl-quoted for quotes, rather than a Quoted inline. This way we can leave Quoted elements passed in by pandoc alone, and we won’t get strange effects like the one described in #87 (where " behaves differently when in a citation suffix).
- Default to Shifted with icu flag (#83). This makes the library behave similarly whether compiled with icu or with the default unicode-collation and prevents test failures with icu.
- Require recent text-icu with icu flag. Older versions don’t build with newer versions of icu4c.
- Support links in CslJson (Benjamin Bray). Currently they are only supported in rendering, not parsing (in support of #88).
- Allow test cases to specify CiteprocOptions (Benjamin Bray).
- Update locales from upstream.
- Add new CSL tests to repository.
0.4.1
- Change Pandoc inNote so it creates a Span with class csl-note rather than a Note. This should make it easier to integrate citations with ordinary notes in pandoc.
- Do not hyperlink author-only citations (#77). If we do this we get two consecutive hyperlinks for author-in-text forms.
- movePunctuationInsideQuotes: only move , and ., not ? and !, as per the CSL spec.
0.4.0.1
- Fix bug introduced by the fix to #61 (#74). In certain circumstances, we could get doubled “et al.”.
- Depend on unicode-collation unconditionally (#71). It is necessary even when text-icu is used, because of Text.Collate.Lang.
- Rename tests in extra/ so they fall into categories.
0.4
- We now use Lang from unicode-collation rather than defining our own. The type constructor has changed, as has the signature of parseLang.
- Use unicode-collation by default for more accurate sorting.
- text-icu will still be used if the icu flag is set. This may give better performance, at the cost of depending on a large C library.
- Change type of SortKeyValue so it doesn’t embed Lang. [API change] Instead, we now store a language-specific collator in the Eval Context.
- Move compSortKeyValues from Types to Eval.
 
- Add curly open quote to word splitters in normalizeSortKey.
- Improve date sorting: use the format YYYY0000 if no month, day, and YYYYMM00 if no day when generating sort keys.
- Special treatment of literal “others” as last name in a list (#61). When we convert bibtex/biblatex bibliographies, the form “and others” yields a last name with nameLiteral = “others”. We detect this and generate a localized “and others” (et al).
- Make abbreviations case-insensitive (#45).
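The date sort-key format mentioned in the 0.4 notes (YYYY0000 with no month or day, YYYYMM00 with no day) can be sketched like this. DateSketch and dateSortKey are hypothetical names, and the sketch assumes four-digit positive years only; it does not reflect how the library represents dates.

```haskell
import Data.Maybe (fromMaybe)
import Text.Printf (printf)

-- Hypothetical date type: a year with an optional month and day.
data DateSketch = DateSketch Int (Maybe Int) (Maybe Int)

-- Build an eight-digit key YYYYMMDD, writing 00 for a missing month or day.
-- A day given without a month is ignored.
dateSortKey :: DateSketch -> String
dateSortKey (DateSketch year month day) =
  printf "%04d%02d%02d" year (fromMaybe 0 month) (fromMaybe 0 dayIfMonth)
  where
    dayIfMonth = month >> day
```

Because every key has the same width, comparing the keys as plain strings orders the dates chronologically, with a bare year sorting before any month in that year.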
0.3.0.9
- Implement et-al-subsequent-min and et-al-subsequent-use-first (#60).
0.3.0.8
- In parsing abbreviations JSON, ignore top-level fields besides “default” (#57), e.g. “info” which is used in Zotero’s default abbreviations file.
0.3.0.7
- Remove check for ASCII in case transform code. Previously we weren’t doing case transform on words containing non-ASCII characters.
0.3.0.6
- Fix infinite loop in fixPunct (#49). In a few rare cases fixPunct would hang.
0.3.0.5
- Add a space between “no date” term and disambiguator if the long form is used (#47).
0.3.0.4
- Improve disambiguation code. Add type signatures, move some functions to the top-level, and make the logic clearer and more efficient.
- Re-render after each stage of ambiguity resolution instead of relying on analysis of names and dates. This is necessary especially for styles like chicago-note-bibliography which use titles in citations. Closes #44. No measurable performance impact.
- Update test suite from upstream.
- Update it-IT locale.
0.3.0.3
- Fix author-only citations (#43). We got bad results with some styles when a reference had both an author and a translator.
0.3.0.2
- Don’t use cite-group delimiter if ANY citation in group has locator (#38). This seems to be citeproc.js’s behavior and it gives better results for chicago-author-date: we want both [@foo20; @foo21, p. 3] and [@foo20, p. 3; @foo21] to produce a semicolon separator, rather than a comma.
0.3.0.1
- Better handle initialize-with that ends in a nonbreaking space. In this case, citeproc should not add an additional space or strip the nonbreaking space. Closes #37.
0.3
- Change makeReferenceMap to return a cleaned-up list of references as well as a reference map. The cleaned-up list removes references with duplicate ids. When there are multiple references with the same id, the last one is included and the others discarded. [API change]
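The "last one wins" cleanup can be sketched with a right fold. dedupeLastWins is a hypothetical helper, not the library's function, and where the surviving reference ends up in the list (here, at the position of its last occurrence) is an assumption.

```haskell
-- Drop earlier entries that share a key with a later one.
-- The fold builds the result from the right, so the last occurrence of
-- each key is kept and any earlier duplicate is skipped.
-- Quadratic in the list length; fine for a sketch.
dedupeLastWins :: Eq k => (a -> k) -> [a] -> [a]
dedupeLastWins key = foldr step []
  where
    step x kept
      | any ((== key x) . key) kept = kept
      | otherwise                   = x : kept
```

For example, applying dedupeLastWins fst to a list of (id, value) pairs keeps only the final value recorded for each id.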
0.2.0.1
- FromJSON for Name: make straight quotes curly. Otherwise nothing will do this, when we are decoding JSON to (Reference a), a /= CslJson Text.
- Remove redundant pragmas and imports (Albert Krewinkel).
- Use custom prelude with GHC 8.6.* and older (Albert Krewinkel). This adds support for GHC 8.0.x.
0.2
- Remove AfterOtherPunctuation constructor from CaseTransformState [API change]. This gave bad results with things like parentheses (#27).
- Change SortKeyValue to include Maybe Lang [API change]. This allows us to do locale-sensitive sorting (though this won’t matter much unless the icu flag is used).
- Add Maybe Lang parameter on initialize (since capitalization can be locale-dependent).
- Add cabal.project.icu for building with icu lib.
- Add (unexported) Citeproc.Unicode compatibility module. This allows us to use the same functions whether or not the icu flag is used.
0.1.1.1
- Pay attention to citationNoteNumber in computing position. In calculating whether an item is alone in its citation, we need to take into account citationNoteNumber, since two citations may occur in the same note and they should not be ranked “alone.” See jgm/pandoc#6813, citation-style-language/documentation#121
0.1.1
- Ensure that uncited references are sorted last when it comes to assigning citation numbers (#22).
- Remove “capitalize initial term” feature. This is required by the test suite but not the spec. It makes more sense for us to do this capitalization in the calling program, e.g. pandoc. For some citations in note styles may already be in notes and thus not trigger separate footnotes. If initial terms had been capitalized, we’d need to uncapitalize, and that is hard to do reliably.
- Treat empty FancyVal as an empty value.
- Derive Functor, Traversable, Foldable for Result [API change].
0.1.0.3
- Better handling of author-only/suppress-author. Previously all results of “names” elements were treated as authors. But only the first should be (generally this is the author, but it could be the editor of an edited volume with no author). See jgm/pandoc#6765.
0.1.0.2
- Don’t enclose contents of e:choose in a Formatted element (#19). The e:choose element is “transparent” and the delimiter controlling its formatting should be inserted between the items it returns.
0.1.0.1
- Fix sorting when no <sorting> element given. The spec says: “In the absence of cs:sort, cites and bibliographic entries appear in the order in which they are cited.” This affects IEEE in particular. See jgm/pandoc#6741.
- Improve sameNames and citation grouping. Previously if a citation item had a prefix, it would not be grouped with following citations. See jgm/pandoc#6722 for discussion.
- Remove unneeded hasNoSuffix check in sameNames.
- Remove unneeded import.
- citeproc executable: strip BOM before parsing style (#18).
0.1
- Initial release.
