Typical buffer–based incremental I/O is based around a single loop, which reads data from some source (such as a socket or file), transforms it, and generates one or more outputs (such as a line count, HTTP responses, or modified file). Although efficient and safe, these loops are all single–purpose; it is difficult or impossible to compose buffer–based processing loops.
Haskell’s concept of “lazy I/O” allows pure code to operate on data from an external source. However, lazy I/O has several shortcomings. Most notably, resources such as memory and file handles can be retained for arbitrarily long periods of time, causing unpredictable performance and error conditions.
Enumerators are an efficient, predictable, and safe alternative to lazy I/O. Discovered by Oleg Kiselyov, they allow large datasets to be processed in near–constant space by pure code. Although somewhat more complex to write, using enumerators instead of lazy I/O produces more correct programs.
This library contains an enumerator implementation for Haskell, designed to be both simple and efficient. Three core types are defined, along with numerous helper functions:
Iteratee: Data sinks, analogous to left folds. Iteratees consume a sequence of input values, and generate a single output value. Many iteratees are designed to perform side effects (such as printing to
stdout), so they can also be used as monad transformers.
Enumerator: Data sources, which generate input sequences. Typical enumerators read from a file handle, socket, random number generator, or other external stream. To operate, enumerators are passed an iteratee, and provide that iteratee with input until either the iteratee has completed its computation, or EOF.
Enumeratee: Data transformers, which operate as both enumerators and iteratees. Enumeratees read from an outer enumerator, and provide the transformed data to an inner iteratee.