Haxl is a Haskell library that simplifies access to remote data, such as databases or web-based services. Haxl can automatically:
- batch multiple requests to the same data source,
- request data from multiple data sources concurrently,
- cache previous requests.
Having all this handled for you behind the scenes means that your data-fetching code can be much cleaner and clearer than it would otherwise be if it had to worry about optimizing data-fetching. We’ll give some examples of how this works in the pages linked below.
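To make the batching idea concrete, here is a minimal sketch that uses only base. It is a toy model, not Haxl's real types or API: `Fetch`, `getName`, and `runFetch` are hypothetical names invented for this illustration, and the "backend" simply fabricates a string per key. The toy runs fetches in rounds, which is a simplification. The point it shows is that combining fetches applicatively lets the runner see all the pending keys at once, while a monadic bind forces a second round.

```haskell
import Data.List (nub)
import Data.Maybe (fromMaybe)

-- Toy model only: these are NOT Haxl's real types or API.
type Key = Int
type Results = [(Key, String)]

-- A computation is either finished, or blocked on some keys together with
-- a continuation to resume once those keys have been fetched.
data Fetch a
  = Done a
  | Blocked [Key] (Results -> Fetch a)

instance Functor Fetch where
  fmap f (Done a)       = Done (f a)
  fmap f (Blocked ks k) = Blocked ks (fmap f . k)

instance Applicative Fetch where
  pure = Done
  Done f       <*> x              = fmap f x
  Blocked ks k <*> Done a         = Blocked ks (\rs -> k rs <*> Done a)
  -- Both sides are blocked: merge their keys into a single batch.
  Blocked ks k <*> Blocked ks' k' = Blocked (ks ++ ks') (\rs -> k rs <*> k' rs)

instance Monad Fetch where
  Done a       >>= f = f a
  -- The right-hand side cannot be inspected until the left has a result.
  Blocked ks k >>= f = Blocked ks (\rs -> k rs >>= f)

-- Request the (made-up) name for one key.
getName :: Key -> Fetch String
getName uid = Blocked [uid] (\rs -> Done (fromMaybe "<missing>" (lookup uid rs)))

-- Hypothetical backend: answers one batch per round and logs each batch.
runFetch :: Fetch a -> (a, [[Key]])
runFetch (Done a) = (a, [])
runFetch (Blocked ks k) =
  let batch       = nub ks                                -- de-duplicate keys within a round
      results     = [(i, "user" ++ show i) | i <- batch]  -- stand-in for one remote call
      (a, rounds) = runFetch (k results)
  in (a, batch : rounds)

main :: IO ()
main = do
  -- Applicative style: both keys travel in one batch.
  print (runFetch ((,) <$> getName 1 <*> getName 2))
  -- prints (("user1","user2"),[[1,2]])
  -- Monadic style: the second fetch waits for the first, so two rounds.
  print (runFetch (getName 1 >>= \a -> getName 2 >>= \b -> pure (a, b)))
  -- prints (("user1","user2"),[[1],[2]])
  -- traverse is applicative, so the whole list is one de-duplicated batch.
  print (runFetch (traverse getName [1, 2, 1]))
  -- prints (["user1","user2","user1"],[[1,2]])
```

The calling code never mentions batches. Which requests get grouped falls out of whether they are combined with `<*>` (or `traverse`) or sequenced with `>>=`.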
There are two Haskell packages here:
- haxl: The core Haxl framework
- haxl-facebook (in example/facebook): An (incomplete) example data source for accessing the Facebook Graph API
To use Haxl in your own application, you will likely need to build one or more data sources: the thin layer between Haxl and the data that you want to fetch, be it a database, a web API, a cloud service, or whatever.
There is a generic data source in “Haxl.DataSource.ConcurrentIO” that can be used for performing arbitrary IO operations concurrently, given a bit of boilerplate to define the IO operations you want to perform.
The haxl-facebook package shows how we might build a Haxl data source based on the existing fb package for talking to the Facebook Graph API.
Where to go next?
- The Story of Haxl explains how Haxl came about at Facebook, and discusses our particular use case.
- An example Facebook data source walks through building an example data source that queries the Facebook Graph API concurrently.
- Fun with Haxl (part 1) walks through using Haxl from scratch for a simple SQLite-backed blog engine.
- The N+1 Selects Problem explains how Haxl can address a common performance problem with SQL queries by automatically batching multiple queries into a single query, without the programmer having to specify this behavior.
- Haxl Documentation on Hackage.
- There is no Fork: An Abstraction for Efficient, Concurrent, and Concise Data Access, our paper on Haxl, accepted for publication at ICFP’14.
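As a rough picture of what the N+1 Selects entry above means by batching multiple queries into a single query, the sketch below contrasts one query per key with one query for all keys. It uses only base and does not touch Haxl or any database library. The table and column names (`posts`, `id`, `title`) and both function names are assumptions made up for the illustration.

```haskell
import Data.List (intercalate)

-- The N+1 pattern: one SELECT per key.
perKeyQueries :: [Int] -> [String]
perKeyQueries ids =
  ["SELECT id, title FROM posts WHERE id = " ++ show i | i <- ids]

-- The batched form: a single SELECT covering every key.
-- Returns Nothing for an empty key list, since "IN ()" is not valid SQL.
batchedQuery :: [Int] -> Maybe String
batchedQuery []  = Nothing
batchedQuery ids =
  Just ("SELECT id, title FROM posts WHERE id IN ("
        ++ intercalate ", " (map show ids) ++ ")")

main :: IO ()
main = do
  mapM_ putStrLn (perKeyQueries [1, 2, 3])
  mapM_ putStrLn (batchedQuery [1, 2, 3])
```

Splicing values into the query text is only reasonable here because the keys are integers. A real data source would more likely use the database library's parameterized queries.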
Changes in version 2.0.0.0
- Completely rewritten internals to support arbitrarily overlapping I/O and computation. Haxl no longer runs batches of I/O in “rounds”, waiting for all the I/O to complete before resuming the computation. In Haxl 2, we can spawn I/O that returns results in the background, and computation fragments are resumed when the values they depend on are available. See tests/FullyAsyncTest.hs for an example.
- A new PerformFetch constructor supports the new concurrency features: BackgroundFetch. The data source is expected to call putResult in the background on each BlockedFetch when its result is ready.
- There is a generic data source in Haxl.DataSource.ConcurrentIO for performing each I/O operation in a separate thread.
- Lots of cleanup and refactoring of the APIs.
- License changed from BSD+PATENTS to plain BSD3.
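The background-fetch protocol described above (hand the data source a set of blocked fetches, let it fill in each result whenever that result is ready) can be sketched with plain threads and MVars from base. This is a toy model under stated assumptions. The `BlockedFetch` and `putResult` defined here only borrow the names from the changelog and are not Haxl's real definitions, and `backgroundFetch` and `slowSquare` are hypothetical helpers.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)

-- Toy stand-ins only: they borrow names from the changelog but are NOT
-- Haxl's real BlockedFetch / putResult definitions.
data BlockedFetch req res = BlockedFetch req (MVar res)

putResult :: MVar res -> res -> IO ()
putResult = putMVar

-- A "background" fetch: start one thread per request and return at once.
-- Each thread fills in its own result slot when its I/O completes.
-- (No error handling: if `perform` throws, the slot is never filled.)
backgroundFetch :: (req -> IO res) -> [BlockedFetch req res] -> IO ()
backgroundFetch perform =
  mapM_ (\(BlockedFetch req slot) -> forkIO (perform req >>= putResult slot))

-- Hypothetical slow I/O operation.
slowSquare :: Int -> IO Int
slowSquare n = threadDelay (n * 10000) >> pure (n * n)

main :: IO ()
main = do
  let requests = [3, 1, 2]
  slots <- mapM (const newEmptyMVar) requests
  backgroundFetch slowSquare (zipWith BlockedFetch requests slots)
  -- The caller could do other work here; it blocks only when it reads a slot.
  results <- mapM takeMVar slots
  print results  -- prints [9,1,4]
```

The three operations overlap, so the total wait is roughly that of the slowest request rather than the sum of all three. The results still come back in request order because each one is read from its own slot.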
Changes in version 0.5.1.0
- ‘pAnd’ and ‘pOr’ were added
- ‘asyncFetchAcquireRelease’ was added
- ‘cacheResultWithShow’ was exposed
- GHC 8.2.1 compatibility
Changes in version 0.5.0.0
- Rename ‘Show1’ to ‘ShowP’ (#62)
Changes in version 0.3.0.0
- Some performance improvements, including avoiding quadratic slowdown with left-associated binds.
- Documentation cleanup; Haxl.Core is the single entry point for the core and engine docs.
- (>>) is now defined to be (*>), and therefore no longer forces sequencing. This can have surprising consequences if you are using Haxl with side-effecting data sources, so watch out!
- New function withEnv, for running a sub-computation in a local Env
- Add a higher-level memoization API, see ‘memo’
- Show is no longer required for keys in cachedComputation
- Exceptions now have