Streamly, short for streaming concurrently, provides monadic streams with a simple API, almost identical to that of standard lists and the vector library, and built-in support for concurrency. Using stream-style combinators, streams can be generated, merged, chained, mapped, zipped, and consumed concurrently, providing a generalized high-level programming framework that unifies streaming and concurrency. Controlled concurrency allows even infinite streams to be evaluated concurrently. Concurrency is scaled automatically based on feedback from the stream consumer. The programmer does not have to be aware of threads, locking or synchronization to write scalable concurrent programs.
The basic streaming functionality of streamly is equivalent to that provided by other streaming libraries. In addition to providing streaming functionality, streamly subsumes the functionality of list transformer libraries like list-t, and also the logic programming library logict. On the concurrency side, it subsumes the functionality of the async package, and provides even higher level concurrent composition. Because it supports streaming with concurrency, it can also be used to write FRP applications.
Why use streamly?
- Simplicity: A simple, list-like streaming API; if you know how to use lists then you know how to use streamly. This library is built with simplicity and ease of use as a design goal.
- Concurrency: Simple, powerful, and scalable concurrency. Concurrency is built in and not intrusive; concurrent programs are written exactly the same way as non-concurrent ones.
- Generality: Unifies functionality provided by several disparate packages (streaming, concurrency, list transformer, logic programming, reactive programming) in a concise API.
- Performance: Streamly is designed for high performance. It employs stream fusion optimizations for best possible performance. Serial performance is equivalent to the venerable vector library in most cases and even better in some cases. Concurrent performance is unbeatable. See streaming-benchmarks for a comparison of popular streaming libraries on micro-benchmarks.
The following chart shows a summary of the cost of key streaming operations processing a million elements. The timings for streamly and vector are in the 600-700 microseconds range and therefore can barely be seen in the graph.
Unlike libraries such as conduit, streamly composes stream data instead of stream processors (functions). A stream is just like a list and is explicitly passed around to functions that process the stream. Therefore, no special operator is needed to join stages in a streaming pipeline; the standard function application ($) or reverse function application (&) operator is enough. Combinators are provided in Streamly.Prelude to transform or fold streams.
The following snippet provides a simple stream composition example that reads numbers from stdin, prints the squares of even numbers and exits if an even number more than 9 is entered.
import Streamly
import qualified Streamly.Prelude as S
import Data.Function ((&))

main = runStream $
       S.repeatM getLine
     & fmap read
     & S.filter even
     & S.takeWhile (<= 9)
     & fmap (\x -> x * x)
     & S.mapM print
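Because a stream is passed around like a list, the shape of the pipeline above can be tried on an ordinary list first. The following is a minimal sketch that uses only base, not streamly; the name squaresOfSmallEvens is hypothetical, and the input is a pure list rather than lines read from stdin.

```haskell
import Data.Function ((&))

-- Hypothetical helper: a pure-list analogue of the stream pipeline,
-- with each stage joined by the reverse application operator (&).
squaresOfSmallEvens :: [Int] -> [Int]
squaresOfSmallEvens xs =
      xs
    & filter even          -- keep even numbers only
    & takeWhile (<= 9)     -- stop at the first even number above 9
    & fmap (\x -> x * x)   -- square what remains

main :: IO ()
main = print (squaresOfSmallEvens [1 .. 20])
```

The stages are plain functions applied left to right, which is why no dedicated pipeline operator is required.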
Concurrent Stream Generation
Monadic construction and generation functions, e.g. |:, replicateM and fromFoldableM, work concurrently when used with an appropriate stream type combinator (e.g. asyncly, aheadly or parallely).

The following code finishes in 3 seconds (6 seconds when serial):

> let p n = threadDelay (n * 1000000) >> return n
> S.toList $ aheadly $ p 3 |: p 2 |: p 1 |: S.nil
[3,2,1]
> S.toList $ parallely $ p 3 |: p 2 |: p 1 |: S.nil
[1,2,3]

The following finishes in 10 seconds (100 seconds when serial):

runStream $ asyncly $ S.replicateM 10 $ p 10
Concurrent Streaming Pipelines
Use |& or |$ to apply stream processing functions concurrently. The following example prints a "hello" every second; if you use & instead of |& you will see that the delay doubles to 2 seconds because of serial execution.

main = runStream $
      S.repeatM (threadDelay 1000000 >> return "hello")
   |& S.mapM (\x -> threadDelay 1000000 >> putStrLn x)
We can use the mapM or sequence functions concurrently on a stream.

> let p n = threadDelay (n * 1000000) >> return n
> runStream $ aheadly $ S.mapM (\x -> p 1 >> print x) (serially $ S.repeatM (p 1))
Serial and Concurrent Merging
Semigroup and Monoid instances can be used to fold streams serially or concurrently. In the following example we compose ten actions in the stream, each with a delay of 1 to 10 seconds, respectively. Since all the actions are concurrent we see one output printed every second:
import Streamly
import qualified Streamly.Prelude as S
import Control.Concurrent (threadDelay)

main = S.toList $ parallely $ foldMap delay [1..10]
 where delay n = S.yieldM $ threadDelay (n * 1000000) >> print n
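The example relies on foldMap mapping each element to a monoid value and joining the results with <>. A minimal sketch of that expansion on plain lists (base only, not streams; the names single, viaFoldMap and byHand are hypothetical) may make the mechanism easier to see:

```haskell
-- Hypothetical helper standing in for a function that returns a monoid value.
single :: Int -> [Int]
single n = [n]

-- foldMap joins the per-element results with (<>).
viaFoldMap :: [Int]
viaFoldMap = foldMap single [1, 2, 3]

-- The same expression written out by hand.
byHand :: [Int]
byHand = single 1 <> single 2 <> single 3

main :: IO ()
main = print (viaFoldMap == byHand)
```

With streams in place of lists, the stream type chosen decides whether each <> in that chain runs serially or concurrently.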
Streams can be combined together in many ways. We provide some examples below; see the tutorial for more. We use the following delay function in the examples to demonstrate the concurrency aspects:

import Streamly
import qualified Streamly.Prelude as S
import Control.Concurrent

delay n = S.yieldM $ do
    threadDelay (n * 1000000)
    tid <- myThreadId
    putStrLn (show tid ++ ": Delay " ++ show n)
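Serial composition runs the actions one after the other in the same thread:

main = runStream $ delay 3 <> delay 2 <> delay 1

ThreadId 36: Delay 3
ThreadId 36: Delay 2
ThreadId 36: Delay 1

With parallely, each action runs in its own thread and the output appears as each delay expires:

main = runStream . parallely $ delay 3 <> delay 2 <> delay 1

ThreadId 42: Delay 1
ThreadId 41: Delay 2
ThreadId 40: Delay 3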
Nested Loops (aka List Transformer)
The monad instance composes like a list monad.
import Streamly
import qualified Streamly.Prelude as S

loops = do
    x <- S.fromFoldable [1,2]
    y <- S.fromFoldable [3,4]
    S.yieldM $ putStrLn $ show (x, y)

main = runStream loops

(1,3)
(1,4)
(2,3)
(2,4)
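For comparison, here is the same nested loop written in the ordinary list monad. This is a small sketch using only base, with the hypothetical name pairs; it visits the combinations in the same order as the serial stream version.

```haskell
-- The same nested loop in the ordinary list monad (base only).
pairs :: [(Int, Int)]
pairs = do
    x <- [1, 2]
    y <- [3, 4]
    return (x, y)

main :: IO ()
main = mapM_ print pairs
```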
Concurrent Nested Loops
To run the above code with lookahead style concurrency, i.e. each iteration in the loop can run concurrently but the results are presented in the same order as serial execution:
main = runStream $ aheadly $ loops
To run it with depth first concurrency yielding results asynchronously in the same order as they become available (deep async composition):
main = runStream $ asyncly $ loops
To run it with breadth first concurrency, yielding results asynchronously (wide async composition):
main = runStream $ wAsyncly $ loops
The above streams provide lazy/demand-driven concurrency which is automatically scaled as per demand and is controlled/bounded so that it can be used on infinite streams. The following combinator provides strict, unbounded concurrency irrespective of demand:
main = runStream $ parallely $ loops
To run it serially but interleaving the outer and inner loop iterations (breadth first serial):
main = runStream $ wSerially $ loops
Streams can perform semigroup (<>) and monadic bind (>>=) operations concurrently using combinators like asyncly, wAsyncly and parallely. For example, to concurrently generate squares of a stream of numbers and then concurrently sum the square roots of all combinations of two streams:

import Streamly
import qualified Streamly.Prelude as S

main = do
    s <- S.sum $ asyncly $ do
        -- Each square is performed concurrently, (<>) is concurrent
        x2 <- foldMap (\x -> return $ x * x) [1..100]
        y2 <- foldMap (\y -> return $ y * y) [1..100]
        -- Each addition is performed concurrently, monadic bind is concurrent
        return $ sqrt (x2 + y2)
    print s
Of course, the actions running in parallel could be arbitrary IO actions. For example, to concurrently list the contents of a directory tree recursively:
import Path.IO (listDir, getCurrentDir)
import Streamly
import qualified Streamly.Prelude as S

main = runStream $ aheadly $ getCurrentDir >>= readdir
   where readdir d = do
            (dirs, files) <- S.yieldM $ listDir d
            S.yieldM $ mapM_ putStrLn $ map show files
            -- read the subdirs concurrently, (<>) is concurrent
            foldMap readdir dirs
In the above examples we do not think in terms of threads, locking or synchronization; rather, we think in terms of what can run in parallel, and the rest is taken care of automatically. When using combinators such as aheadly, the programmer does not have to worry about how many threads are to be created; they are automatically adjusted based on the demand of the consumer.
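For contrast, the sketch below shows roughly what running a few actions in parallel looks like when the threads and synchronization are written by hand with base only. It is an illustration of the bookkeeping involved, not a description of how streamly is implemented; the helpers delay and runAll are hypothetical, and the delays are in tenths of a second to keep the run short.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_, replicateM)

-- Hypothetical helper: sleep for n tenths of a second, then return n.
delay :: Int -> IO Int
delay n = threadDelay (n * 100000) >> return n

-- Hypothetical helper: run every action in its own thread and collect the
-- results in the order they finish. Exceptions are not handled: if an
-- action throws, its result never arrives and the collector cannot finish.
runAll :: [IO a] -> IO [a]
runAll actions = do
    done <- newEmptyMVar
    forM_ actions $ \act -> forkIO (act >>= putMVar done)
    replicateM (length actions) (takeMVar done)

main :: IO ()
main = runAll (map delay [3, 2, 1]) >>= print
```

Even this small version has to pick a thread count, a hand-off mechanism and an error policy, none of which is bounded or driven by consumer demand.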
For bounded concurrent streams, stream yield rate can be specified. For example, to print hello once every second you can simply write this:
import Streamly
import qualified Streamly.Prelude as S

main = runStream $ asyncly $ avgRate 1 $ S.repeatM $ putStrLn "hello"
For some practical uses of rate control, see AcidRain.hs and CirclingSquare.hs. Concurrency of the stream is automatically controlled to match the specified rate. Rate control works precisely even at throughputs as high as millions of yields per second. For more sophisticated rate control see the haddock documentation.
Reactive Programming (FRP)
Streamly is a foundation for first class reactive programming as well by virtue of integrating concurrency and streaming. See AcidRain.hs for a console based FRP game example and CirclingSquare.hs for an SDL based animation example.
For more information, see:
- A comprehensive tutorial
- Some practical examples
- The "Comparison with existing packages" section at the end of the tutorial
- Streaming benchmarks comparing streamly with other streaming libraries
- Quick tutorial comparing streamly with the async package
- Concurrency benchmarks comparing streamly with async
The code is available under BSD-3 license on github. Join the gitter chat channel for discussions. You can find some of the todo items on the github wiki. Please ask on the gitter channel or contact the maintainer directly for more details on each item. All contributions are welcome!
This library was originally inspired by the transient package authored by Alberto G. Corona.
- Performance improvements, especially space consumption, for concurrent streams
- Leftover threads are now cleaned up as soon as the consumer is garbage collected.
- Fix a bug in concurrent function application that in certain cases would unnecessarily share the concurrency state resulting in incorrect output stream.
- Fix passing of state across wSerial combinators. Without this fix, combinators that rely on state passing, e.g. maxBuffer, won't work across these combinators.
- Added rate limiting combinators (e.g. constRate) to control the yield rate of a stream.
- The Streamly.Time module is now deprecated; its functionality is subsumed by the new rate limiting combinators.
- foldxM was not fully strict, fixed.
- Signatures of
- Some functions in prelude now require an additional Monad constraint on the underlying type of the stream.
- once has been deprecated and renamed to
- Add concurrency control primitives
- Concurrency of a stream with bounded concurrency when used with take is now limited by the number of elements demanded by
- Significant performance improvements utilizing stream fusion optimizations.
- yield to construct a singleton stream from a pure value
- repeat to generate an infinite stream by repeating a pure value
- fromListM to generate streams from lists, faster than
- map as a synonym of fmap
- scanlM', the monadic version of scanl'
- Some prelude functions, to which concurrency capability has been added, will now require a
- Fixed a race due to which, in a rare case, we might block indefinitely on an MVar due to a lost wakeup.
- Fixed an issue in adaptive concurrency. The issue caused us to stop creating more worker threads in some cases due to a race. This bug would not cause any functional issue but may reduce concurrency in some cases.
- Added a concurrent lookahead stream type
- fromFoldableM API that creates a stream from a container of monadic actions
- Monadic stream generation functions such as fromFoldableM can now generate streams concurrently when used with concurrent stream types.
- Monad transformation functions such as sequence can now map actions concurrently when used at appropriate stream types.
- Added concurrent function application operators to run stages of a stream processing function application pipeline concurrently.
- Fixed a bug that caused some transformation ops to return incorrect results
when used with concurrent streams. The affected ops are
Changed the semantics of the Semigroup instance for ParallelT. The new semantics are as follows:
- <> operation interleaves two streams
- <> now concurrently merges two streams in a left biased manner using demand based concurrency.
- <> operation now concurrently merges the two streams in a fairly parallel manner.
To adapt to the new changes, replace
serial wherever it is used for stream types other than
Alternative instance. To adapt to this change replace any usage of
Stream type now defaults to the SerialT type unless explicitly specified using a type combinator or a monomorphic type. This change reduces puzzling type errors for beginners. It includes the following two changes:
- Change the type of all stream elimination functions to use SerialT instead of a polymorphic type. This makes sure that the stream type is always fixed at all exits.
- Change the type combinators (e.g. parallely) to only fix the argument stream type; the output stream type remains polymorphic.
Stream types may have to be changed or type combinators may have to be added or removed to adapt to this change.
- Change the type of foldrM to make it consistent with
- async is renamed to
- async is now a new API with a different meaning.
- ZipAsync is renamed to
- ZipAsync is now ZipAsyncM specialized to the IO Monad.
- Remove the MonadError instance, as it was not working correctly for parallel compositions. Use MonadThrow instead for error propagation.
- Remove Num/Fractional/Floating instances as they are not very useful. Use
- Deprecate and rename the following symbols:
- Deprecate the following symbols for future removal:
- Add the following functions:
- |: operator to construct streams from monadic actions
- once to create a singleton stream from a monadic action
- repeatM to construct a stream by repeating a monadic action
- scanl': strict left scan
- foldl': strict left fold
- foldlM': strict left fold with a monadic fold function
- serial: run two streams serially one after the other
- async: run two streams asynchronously
- parallel: run two streams in parallel (replaces
- WAsyncT stream type for BFS version of
- Add simpler stream types that are specialized to the IO monad
- Put a bound (1500) on the output buffer used for asynchronous tasks
- Put a limit (1500) on the number of threads used for Async and WAsync types
- Fixed a bug that caused unexpected behavior when pure was used to inject values in Applicative composition of
- cons right associative and provide an operator form
- Improve performance of some stream operations (
- Fix the product operation. Earlier, it always returned 0 due to a bug
- Fix the last operation, which returned Nothing for singleton streams
- Initial release