calligraphy

HIE-based Haskell call graph and source code visualizer

https://github.com/jonascarpay/calligraphy#readme

Stackage Nightly 2022-08-19: 0.1.3
Latest on Hackage: 0.1.3

BSD-3-Clause licensed by Jonas Carpay
Maintained by Jonas Carpay
This version can be pinned in stack with: calligraphy-0.1.3@sha256:a086b7a8c4a2810afb803c688818b8ce729facf00f497f180d87ab1a816bfb89,2573

[Figure: example call graph of the Calligraphy module]

calligraphy is a Haskell call graph/source code visualizer.

It works directly on GHC-generated HIE files, giving us features that would otherwise be tricky, like type information and support for generated files. calligraphy has been tested with all versions of GHC that can produce HIE files (i.e. GHC 8.8, 8.10, 9.0, and 9.2).

See the accompanying blog post for more examples, and an extended tutorial.

Usage

  1. Install calligraphy through your Haskell package manager. Since it uses HIE files, it usually needs to be compiled with the same version of GHC as your project.

  2. Install GraphViz. By default, calligraphy needs dot to be available in the search path.

  3. Generate HIE files for your project by passing the -fwrite-ide-info flag to GHC. If you’re using Cabal, for example, you’d invoke cabal build --ghc-options=-fwrite-ide-info.

  4. Run calligraphy. You probably want to start by running calligraphy --help to see what options it supports, but as an example, the above graph was produced using the following command:

calligraphy Calligraphy --output-png out.png --collapse-data

Where Calligraphy in this case is the name of the module.

Philosophy

Writing and especially maintaining Haskell tooling is really hard. Haskell, let alone GHC, is underspecified, overcomplicated, and constantly changing. If you don’t have a strategy for dealing with this, reality eventually catches up with you; there is an abundance of abandoned projects (think formatters, linters, editor plugins, IDEs, etc.). So too it is with calligraphy. Working with HIE files instead of Haskell source files allows us to leverage GHC for parsing and type checking, which is nice, but HIE files themselves are nothing but untyped views into GHC’s eldritch heart, and come with their own threats to sanity. So, how do we deal with this?

Put simply, the goal of calligraphy is not to be accurate, but to be as simple as possible while still being useful. If we can get 80% accuracy for 20% of the effort, that’s great, and if we can get 64% accuracy for 4% effort, that’s even better. That necessarily means that calligraphy will sometimes be wrong. When this happens, please open a bug report (especially if it’s egregious), but know that there’s a chance it’s simply not worth fixing.

Here’s an example. The type-related logic is currently ~15 lines. It works by, for every identifier, walking the type HIE gives us for that identifier, and adding an edge to every identifier it references. This works perfectly in 95% of cases, but field accessors will, only on GHC 9.2 and only sometimes, not get the type of their parent data type. That’s annoying, but we have to draw a line somewhere, and calligraphy always errs towards simplicity and maintainability. We could try to figure out and fix this as a special case, or try to use information from type signatures to fix it in general, but for this project the 15 lines is more important than the 95%.

As another example, support for graphing Template Haskell-generated code would be a great feature, and it seems like it’d be easy to implement since HIE files are generated after TH expansion. Unfortunately, the way TH code appears in the HIE output breaks many heuristics that we currently use to structure the source graph, so for now I decided that unless there’s an elegant way to naturally incorporate it, it’s not worth it.
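To make the type-related logic described above more concrete, here is a minimal sketch of the idea: walk each identifier's type and add an edge to every name it references. This is not calligraphy's actual code. The Ty data type is a hypothetical, heavily simplified stand-in for the types GHC stores in HIE files, and the names referenced and typeEdges are made up for illustration.

```haskell
import Data.List (nub)

-- Hypothetical, simplified stand-in for the types stored in a HIE file.
-- GHC's real representation is richer; this models only what the sketch needs.
data Ty
  = TyCon String [Ty] -- a named type constructor applied to arguments
  | TyFun Ty Ty       -- a function type
  | TyVar String      -- a type variable; references no declaration
  deriving (Show, Eq)

-- An edge from an identifier to a type-level name it mentions.
type Edge = (String, String)

-- Collect every named constructor mentioned anywhere in a type.
referenced :: Ty -> [String]
referenced (TyCon name args) = name : concatMap referenced args
referenced (TyFun a b)       = referenced a ++ referenced b
referenced (TyVar _)         = []

-- For each (identifier, type) pair, add one edge per distinct referenced name.
typeEdges :: [(String, Ty)] -> [Edge]
typeEdges decls =
  [ (ident, target)
  | (ident, ty) <- decls
  , target <- nub (referenced ty)
  ]
```

For a hypothetical identifier lookupUser with type UserId -> Maybe User, typeEdges produces edges from lookupUser to UserId, Maybe, and User. The whole walk fits in a handful of lines, which is the property the project is trying to preserve.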

That doesn’t mean that we don’t care about accuracy at all. The test suite contains a baseline reference module, and makes sure that calligraphy generates the same correct graph for it across GHC versions. Finding a simple, maintainable, and robust set of heuristics that passes the test suite and never face-plants on edge cases took months, a lot of failed attempts, and a hefty dose of sunk cost fallacy. Furthermore, there are almost certainly still ways to make it simpler and more general. I’m very open to questions and suggestions on how to do this, especially if you have experience with GHC/HIE files.

Changes

Changelog

0.1.3

[Changed]

  • [#7] When encountering overlapping declarations, calligraphy now keeps the first one it finds instead of throwing an error. Overlapping declarations are the result of TH splices. Since we don’t have any guarantees for those anyway, producing garbage instead of an error seems like a net win.
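The keep-the-first behaviour can be pictured with a small sketch. This is an illustration under simplifying assumptions, not the actual implementation: the Decl record, the flat inclusive line spans, and the keepFirst function are all hypothetical, and real declarations can legitimately nest, which this sketch ignores.

```haskell
import Data.List (foldl')

-- Hypothetical declaration record: a name plus an inclusive (start, end) line span.
data Decl = Decl
  { declName :: String
  , declSpan :: (Int, Int)
  } deriving (Show, Eq)

-- Two spans overlap when neither ends before the other begins.
overlaps :: (Int, Int) -> (Int, Int) -> Bool
overlaps (s1, e1) (s2, e2) = s1 <= e2 && s2 <= e1

-- Keep declarations in the order found, silently dropping any declaration
-- that overlaps one already kept, instead of raising an error.
keepFirst :: [Decl] -> [Decl]
keepFirst = reverse . foldl' step []
  where
    step kept d
      | any (overlaps (declSpan d) . declSpan) kept = kept
      | otherwise = d : kept
```

The trade-off is the one the entry describes: a later overlapping declaration is discarded without complaint, so the output may be incomplete, but the run no longer fails.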

0.1.2

[Added]

  • --collapse-modules option to collapse entire modules into a single node

0.1.1

[Changed]

  • [#2] [#3] Ignore all identifiers that have a zero-width span. These are the result of generated code, and should be rejected elsewhere, but apparently can occasionally creep through.

0.1.0

[Added]

  • Initial release