texmath

texmath is a Haskell library for converting between formats used to represent mathematics. Currently it provides functions to read and write TeX math, presentation MathML, and OMML (Office Math Markup Language, used in Microsoft Office), and to write Gnu eqn and pandoc's native format (allowing conversion, using pandoc, to a variety of different markup formats). The TeX reader and writer supports basic LaTeX and AMS extensions, and it can parse and apply LaTeX macros. The package also includes several utility modules which may be useful for anyone looking to manipulate either TeX math or MathML. For example, a copy of the MathML operator dictionary is included.

You can try it out online here. (Note that the math it produces will be rendered correctly only if your browser supports MathML. Firefox does; Safari and Chrome do not.)

By default, only the Haskell library is installed. To install a test program, texmath, use the executable Cabal flag:

cabal install -fexecutable

To run the test suite, compile with --enable-tests and do cabal test.

Alternatively, texmath can be installed using stack. Install the stack binary somewhere in your path. Then, in the texmath repository,

stack setup
stack install --flag texmath:executable

The texmath binary will be put in ~/.local/bin.

Macro definitions may be included before a LaTeX formula.

texmath will behave as a CGI script when called under the name texmath-cgi (e.g. through a symbolic link).

The file cgi/texmath.html contains an example of how it can be used.

Generating lookup tables

There are three main lookup tables which are built form externally compiled lists. This section contains information about how to modify and regenerate these tables.

In the lib direction there are two sub-directories which contain the necessary files.

MMLDict.hs

The utility program xsltproc is required. You can find these files in lib/mmldict/

  1. If desired replace unicode.xml with and updated version (you can download a copy from here
  2. xsltproc -o dictionary.xml operatorDictionary.xsl unicode.xml
  3. runghc generateMMLDict.hs
  4. Replace the operator table at the bottom of src/Text/TeXMath/Readers/MathML/MMLDict.hs with the contents of mmldict.hs

ToTeXMath.hs

You can find these files in lib/totexmath/

  1. If desired, replace unimathsymbols.txt with an updated verson from here
  2. runghc unicodetotex.hs
  3. Replace the record table at the bottom of src/Text/TeXMath/Unicode/ToTeXMath.hs with the contents of UnicodeToLaTeX.hs

ToUnicode.hs

You can find these files in lib/tounicode/.

  1. If desired, replace UnicodeData.txt with an updated verson from here.
  2. runghc mkUnicodeTable.hs
  3. Replace the table at the bottom of src/Text/TeXMath/Unicode/ToUnicode.hs with the output.

Editing the tables

It is not necessary to edit the source files to add records to the tables. To add to or modify a table it is easier to add modify either unicodetotex.hs or generateMMLDict.hs. This is easily achieved by adding an item to the corresponding updates lists. After making the changes, follow the above steps to regenerate the table.

Authors

John MacFarlane wrote the original TeX reader, MathML writer, and OMML writer. Matthew Pickering contributed the MathML reader, the TeX writer, and many of the auxiliary modules. Jesse Rosenthal contributed the OMML reader. Thanks also to John Lenz for many contributions.

Changes

texmath (0.9.4.1)

* Improved handling of accents and upper/lower delimiters (#98).
* cgi: use cloudflare cdn for mathjax.

texmath (0.9.4)

* Added writer for GNU eqn format (used with *roff).

+ New module Text.TeXMath.Writers.Eqn, exporting writeEqn.
+ Moved handleDownup to Shared.
+ Add eqn output option in texmath executable.
+ Add eqn writer tests.

texmath (0.9.3)

* Expose pMacroDefinition in Text.TeXMath.Readers.TeX.Macros,
with generalized type.

texmath (0.9.2)

* Support `\newenvironment` and `\renewenvironment`.

texmath (0.9.1.1)

* Added program to generate cbits/{key,val}ToASCII.c from a data file.
* Migrate lookup structures for Unicode/ToTex.hs to use C
source files to accelerate builds (Carter Tazio Schonwald).
* Allow \boldsymbol + a token without braces.
And same for other styling commands. Closes #94.
* Better ghc-prof-options.

texmath (0.9.1)

* Pandoc writer: add thin space after math operators.
* TeX reader: improve parsing of `\mathop` to allow
multi-character operator names.
* Minor optimizations.
* Added Ord instances to Exp and associated types.

texmath (0.9)

* OMML writer: Properly handle nary inside delimiters (#92).
Previously under-overs inside delimiters didn't get
converted to nary the way they did outside of delimiters.
* TeX reader: Support bm, mathbold, pmd.
* OMML reader: Handle w:sym (#65).
* New module, Text.TeXMath.Unicode.Fonts, for MS font code point
lookup. Copied from pandoc Text.Pandoc.Readers.Docx.Fonts,
which will be changed to import this. [API change]
* Fixed typo in test program for round-trip tests.
* Parse math inside text constructions (#62). E.g.
`\text{if $y$ is greater than $0$}` Text and math can nest indefinitely.
* Modify test suite so tests can be marked as "ought to raise error."
So far this is only used in mml. If the test is called foo
and `readers/mml/foo.error` exists, then the test is expected
to raise a parse error.
* MathML reader err: Don't print colon after line number.
* Restore proper error handling to MathML reader. Note that the tests
need to be redone, since they assumed the old behavior of just
returning empty elements on parse errors.

texmath (0.8.6.7)

* TeX reader: treat backslash + newline as like backslash + space.
Previously this case gave an error. See jgm/pandoc#3189.

texmath (0.8.6.6)

* Allow pandoc-types 1.17.*.

texmath (0.8.6.5)

* Fixed transposition of sub/super on operators (jgm/pandoc#3040).

texmath (0.8.6.4)

* Support 'equation' environment, without the numbering of course.

texmath (0.8.6.3)

* Use POST instead of GET for texmath-cgi.

texmath (0.8.6.2)

* Fixed array alignment issues (jgm/pandoc#2864, jgm/pandoc#2310).
* Use 1 and 0 for _Hide attributes, rather than on and off.
Officially it seems that either 1/0 or true/false are wanted.
* Fixed EUnderOver for omml output. Previously both the under and
the over part were being placed under (jgm/pandoc#2775).

texmath (0.8.6.1)

* OMML writer: Fixed rendering of roots, so that the degree appears
in the right place.
* OMML writer: Don't include empty rPr elements.

texmath (0.8.6)

* TeX reader: Support hundreds more math symbols (all of those defined in
Text.TeXMath.Unicode.ToTeX), including `\nwarrow`, `\swarrow`,
`\nearrow`, `\searrow` (#90).

texmath (0.8.5.1)

* OMML writer: Fixed order of elements in nary formulas to conform
to OMML spec (#88, Niko Weh). `<e>` must follow the `<sup>` and `<sub>`
parts of `<nary>`. This fixes rendering issues in LibreOffice
(though Word copes with the incorrect order).
* Added Paths_texmath to Other-Modules for texmath executable.

texmath (0.8.5)

* TeX parser: Support limited styling inside \DeclareMathOperator (#85).
* TeX reader: Correctly parse \mbox. Its argument is text mode.
* Updated mathml tests to use mo for operators.
* TeX reader: support mathopen, mathclose, mathpunct.
* MathML writer: render EMathOperator as mo, not mi (#86).
* MathML: handle leading space in EText (#87).
* Take --version in executable from cabal metadata.
* Added Paths_texmath to other-modules.

texmath (0.8.4.2)

* Fixed overbrace, underbrace (#82). Previously we were using the wrong
character: U+FE37 instead of U+23DE. This didn't work in Word.
* Support \mathop, \mathrel, \mathbin, \mathord
* MathML - render Symbol Ord as mi, not mo (#83).
* Handle align environments with > 2 cells per row (#84).

texmath (0.8.4.1)

* Added stack install instructions.
* Fixed bold-italic in OMML (#76). Previously `\mathbfit` didn't work
properly in OMML output.
* Ignore `\nonumber` (#79).
* Allow styling in `\operatorname` e.g. `\operatorname{\mathcal{L}}` (#80).
* Fixed bug in `supHide` and `subHide` for OMML. This led to little
empty boxes being displayed in integrals with subscripts but no
superscripts. See jgm/pandoc#2571.
* Implemented `\mod` as a math operator (#81). This doesn't capture all the
spacing subtleties of the amsmath version, but should be good enough
for most purposes.
* Allow pandoc-types < 1.17.

texmath (0.8.4)

* Improved symbol spacing in Pandoc output (jgm/pandoc#2261).
This change avoids putting space around binary symbols that
come at the beginning or end of a group, or appear on their
own. It also avoids spacing on a binary symbol that follows
a Bin, Op, Rel, Open, or Punct atom, in accord with
TeXBook Appendix G. We could go farther towards exactly
matching the TeXBook rules, but this simple change goes some of
the way.
* Added stack.yaml.

texmath (0.8.3)

* Parse uppercase Greek letters as EIdentifier, not ESymbol Op.
This fixes handling of things like `$Lambda^1$`, particularly in omml.

texmath (0.8.2.2)

* Handle `.` after number with no following digits.

texmath (0.8.2.1)

* Handle bare hyphen in `\text{...}`. Closes jgm/pandoc#2274.
* Support `\ltimes` and `\rtimes` in the TeX reader (Arata Mizuki).
* Slightly more efficient number parser.

texmath (0.8.2)

* Better handling of decimal points. Decimal points are now parsed
as parts of numbers, not as separate symbols. E.g. in MathML they
now appear in `<mn>` elements. Closes #74.

texmath (0.8.1)

* OMML: Don't force everything into Roman font by default.
This change ensures that variables will be italic by
default in Word. See jgm/pandoc#2075.

texmath (0.8.0.2)

* Fixed typo in `defaultEnv` to include `amssymb` (#68).
* Moved some lookup tables to C, and disabled aggressive
profiling defaults, to avoid excessive memory usage in
compiling with clang (#70).
* Support `\newcommand*` in `parseMacroDefinition` (jgm/pandoc#2005).
* Fixed order bug for over/under in OMML reader (#66).
* Support `\boldsymbol` (#67).

texmath (0.8.0.1)

* Added network-uri flag. This addresses the split of network
and network-uri packages.
* OMML reader: change default accent (Jesse Rosenthal).
The default had previously been set as accute (possibly as a
placeholder). It appears to be circumflex/hat instead.

texmath (0.8)

* Added OMML reader (Jesse Rosenthal).
* Support latex \substack (#57).
* Added EBoxed and implemented in readers and writers (#58).
* Handle latex \genfrac. Use \genfrac for \brace, \brack,
etc. when amsmath is available.
* Improvements in handling of space characters.
* Use ESpace rather than EText when a mathml mtext just contains
a space.
* Use \mspace when needed to get latex spaces with odd sizes, rather
than finding the closest simple command.
* Use Rational instead of Double in ESpaced, EScaled.
* Shared: Export getSpaceWidth, getSpaceChars.
* Shared: Export fixTree, isEmpty, empty (formerly in MathML reader).

texmath (0.7.0.2)

* TeX reader: further improvements in error reporting.
Instead of reporting line and column, a snippet is printed
with a caret indicating the position of the error. Also
fixed bad position information when control sequences are
followed by a letter.

texmath (0.7.0.1)

* TeX reader:

+ Improved error reporting.
+ Optimized parser.
+ Treat `\ ` as ESpaced rather than ESymbol.
+ Internal improvements, including using the parsec3 interface
instead of the older parsec2 compatibility interface.

* Added tests for phantom.

texmath (0.7)

* Changes in Exp type:

+ Removed EUp, EDown, EDownup, EUnary, EBinary.
+ Added EFraction (and FractionType), ESqrt, Eroot, EPhantom.
+ Added boolean "convertible" parameter to EUnder, EOver, EUnderover.
+ Changed parameter of EScaled from String to Double.
+ Changed parameter of ESpace from String to Double.
+ Removed EStretchy.
+ Added EStyled, corresponding to mstyle in mathml, and \mathrm,
\mathcal, etc. in TeX (which can contain arbitrary math content,
not just text).
+ Changed the type of EDelimited. The contents of an EDelimited are
now either Right Exp or Left String (the latter case represents a
fence in middle position, e.g. \mid| in LaTeX).

* Module reorganisation: the exposed interface has been completely
changed, and modules for reading MathML and writing TeX math
have been added:

+ All writers now reside in Text.TeXMath.Writers.
- Text.TeXMath.MathML -> Text.TeXMath.Writers.MathML.
toMathML and showExp are removed, writeMathML added.
- Text.TeXMath.OMML -> Text.TeXMath.Writers.OMML.
toOMML and showExp removed, writeOMML added.
- Text.TeXMath.Pandoc -> Text.TeXMath.Writers.Pandoc.
toPandoc removed, writePandoc added.
- New module Text.TeXMath.Writers.TeX, exporting writeTeX,
writeTeXWith, addLaTeXEnvironment (the latter giving control
over which packages are assumed to be available).

+ All readers now reside in Text.TeXMath.Readers.
- Text.TeXMath.MathMLParser -> Text.TeXMath.Readers.MathML,
exporting readMathML.
- Text.TeXMath.Readers.TeX nows exports readTeX rather than
parseFormula.

+ New modules for unicode conversion: Text.TeXMath.Unicode.ToASCII,
Text.TeXMath.Unicode.ToTeX, Text.TeXMath.Unicode.ToUnicode.

+ Two MathML specific modules: Text.TeXMath.Readers.MathML.EntityMap,
Text.TeXMath.Readers.MathML.MMLDict.

+ In Text.TeXMath, all the XtoY functions have been removed
in favour of rexporting raw reader and writer functions. The
data type Exp is now also exported.

* Bug fixes and improvements.

+ TeX writer: Properly handle accents inside \text{}.
Use real text environments for EText (\textrm, not \mathrm).
Improved handling of scalers (\Big etc.). Use amsmath matrix
environments when appropriate. Fixed \varepsilon.
+ MathML writer: Omit superfluous outer mrows. Add position
information to fences.
+ OMML writer: Handle \phantom.
+ Pandoc writer: Use unicode characters to support Fraktur and
other text styles.
+ TeX reader: Use EUnder/Over for \stackrel, \overset, \underset.
Improved handling of primes. Fixed \notin. Avoid superfluous
grouping of single elements. Improved handling of scalers (\Big etc.).
Handle \choose, \brace, \brack, \bangle (#21).
+ Macros: Don't raise an error if applying a macro fails to
resolve to a fixed point; instead, just return the original string.

* Rewrote test suite as a proper cabal test suite. Added
--regenerate-tests and --round-trip options.

* Updated texmath online demo for bidirectional conversion.

* Removed cgi and test flags. Added executable flag to build texmath.

* Modified texmath so it works like a cgi script when run as
texmath-cgi (through symlink or renaming). Removed dependency on
the cgi package.

texmath (0.6.7)

* New Module: Text.TeXMath.Unidecode, a module for approximating
unicode characters in ASCII.
* New Module: Text.TeXMath.Shared, a module for shared lookup
tables between the various readers and writers
* New Module: Text.TeXMath.MathMLParser, exporting readMathML.
* New Module: Text.TeXMath.EntityMap, exporting getUnicode,
a conversion table from MathML entities to unicode characters.
* New Module: Text.TeXMath.UnicodeToLaTeX, exporting getLaTeX,
converting a string of unicode characters to a string of equivalent LaTeX
commands.
* New Module: Text.TeXMath.LaTeX, replacing Text.TeXMath.Parser,
exporting toTeXMath.
* New Module: Text.TeXMath.MMLDict, implements a lookup table from
operators to their default values as defined by the MML dictionary,
exporting getOperator.
* New Module: Text.TeXMath.Compat, maintaining compatibility with
mtl < 2.2.1.
* Modified Text.TeXMath to export the primitive readers, as well as
mathMLTo{Writer} for all writers
* Modified: Text.TeXMath.Types: added additional record types for
Text.TeXMath.MMLDict and Text.TeXMath.UnicodeToLaTeX.
New Exports: Operator(..), Record(..).
* Modified test suite: use cabal test, added significant number of tests.
* Added recognition of the LaTeX command \phantom

texmath (0.6.6.3)

* Use combining tilde accent for \tilde. Closes pandoc #1324.

texmath (0.6.6.2)

* Allow \left to be used with ), ], etc. Ditto with \right.
Previously only (, [, etc. were allowed with \left. Closes pandoc #1319.

texmath (0.6.6.1)

* Support \multline (previously it was mispelled "multiline")
* Changed data-files to extra-source-files.

texmath (0.6.6)

* Insert braces around macro expansions to prevent breakage (#7).
* Support \operatorname and \DeclareMathOperator (rekka) (#17).
* Support \providecommand (#15).
* Fixed spacing bugs in pandoc rendering (#24).
* Ignore \hline at end of array row instead of failing (#19).

comments powered byDisqus