The `store` package provides efficient binary serialization. There are a couple
of features that particularly distinguish it from most prior Haskell
serialization libraries:
Its primary goal is speed. By default, direct machine representations are used
for things like numeric values (Int, Double, Word32, etc) and buffers
(Text, ByteString, Vector, etc). This means that much of serialization
uses the equivalent of memcpy.
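A minimal round-trip sketch of the core API, assuming the store package is installed (encode produces a strict ByteString, and decode returns Either PeekException a):

```haskell
import Data.Store (decode, encode)
import Data.Word (Word32)

main :: IO ()
main = do
  let xs    = [1, 2, 3] :: [Word32]
      bytes = encode xs             -- serialize to a strict ByteString
  case decode bytes of              -- Either PeekException [Word32]
    Right ys -> print (ys == xs)    -- round trip recovers the value
    Left err -> print err
```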
We have plans for supporting architecture-independent serialization - see
#36 and
#31. The plan makes little-endian
the default, so that the most common endianness has no overhead.
Serialization behavior can also vary if
integer-simple is used instead of GHC's default GMP backend:
Integer values serialized with the integer-simple flag enabled
are not compatible with those serialized without the flag enabled.
Instead of implementing lazy serialization / deserialization involving
multiple input / output buffers, peek and poke always work with a single
buffer. This buffer is allocated by asking the value for its size before
encoding. This simplifies the encoding logic, and allows for highly optimized
tight loops.
store can optimize size computations by knowing when some types always
use the same number of bytes. This allows us to compute the byte size of a
`Vector Int32` by just doing `length v * 4`.
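This constant-size knowledge is visible in the Size type, which the Store class exposes through its size method (a sketch, assuming Size and its ConstSize/VarSize constructors are re-exported from Data.Store as in recent releases):

```haskell
import Data.Int (Int32)
import Data.Store (Size (..), Store (..))

-- Report whether a type's serialized size is statically known.
describe :: Size a -> String
describe (ConstSize n) = "ConstSize " ++ show n
describe (VarSize _)   = "VarSize"

main :: IO ()
main = do
  putStrLn (describe (size :: Size Int32))   -- fixed 4-byte representation
  putStrLn (describe (size :: Size [Int32])) -- lists have variable size
```

Because `size :: Size Int32` is a ConstSize, the byte size of a whole vector of Int32 is a single multiplication rather than a per-element traversal.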
It also features:
Optimized serialization instances for many types from base, vector,
bytestring, text, containers, time, template-haskell, and more.
TH- and GHC Generics-based generation of Store instances for datatypes.
TH generation of test cases.
Utilities for streaming encoding / decoding of Store encoded messages, via the
store-streaming package.
Gotchas
Store is best used for communication between trusted processes and for
local caches. It can certainly be used for other purposes, but the
built-in set of instances has some gotchas to be aware of:
Store’s built-in instances serialize in a format which depends on
machine endianness.
Store’s built-in instances trust the data when deserializing. For
example, the deserialization of Vector will read the vector’s length
from the first 8 bytes. It will then allocate enough memory to store
all the elements. Malicious or malformed input could cause
allocation of large amounts of memory. See issue #122.
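The failure mode on malformed input can be sketched by truncating an encoding so that only the length prefix survives (assuming the usual 8-byte length prefix on a 64-bit machine; decode reports a PeekException rather than reading past the buffer):

```haskell
import qualified Data.ByteString as BS
import Data.Store (PeekException, decode, encode)
import Data.Word (Word64)

main :: IO ()
main = do
  let bytes     = encode ([1 .. 10] :: [Word64])
      -- keep roughly the length prefix, drop the element bytes
      truncated = BS.take 8 bytes
  case decode truncated :: Either PeekException [Word64] of
    Left err -> putStrLn ("decode failed: " ++ show err)
    Right _  -> putStrLn "unexpectedly decoded"
```

The bounds check means the decode fails cleanly here, but a well-formed prefix claiming a huge length would still drive a large allocation before any element is read, which is the gotcha above.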
ChangeLog

0.7.2
Fixes compilation with vector >= 0.12.1.1 by making
deriveManyStoreUnboxVector capable of handling more complex
instance constraints. In particular, it now correctly generates
instances like `Store (Vector (f (g a))) => Store (Vector (Compose f g a))`
and `Store (Vector (f a)) => Store (Vector (Alt f a))`.
0.7.1
Fixes compilation with GHC-7.10 due to it not defining Generic
instances for Complex and Identity. See #142.
Documents some gotchas about using store vs. other libraries.
0.7.0
Fixes a bug where the Store instances for Identity, Const, and
Complex all had Storable superclasses instead of Store. See
#143.
0.6.1
Can now optionally be built with integer-simple instead of
integer-gmp, via the integer-simple cabal flag. Note that the
serialization of Integer with integer-simple differs from what
is used by the GMP default. See #147.
0.6.0.1
Now builds with GHC-7.10 - compatibility was broken in 0.6.0 due to
the fix for GHC-8.8. See #146.
0.6.0
Now builds with GHC-8.8. This is a major version bump because
MonadFail constraints were added to some functions, which is
potentially a breaking change.
Updates the generics-based instances to improve error messages for
sum types with more than 255 constructors. See
#141.
0.5.1.0
Updates TH to support sum types with more than 62 constructors.
Uses TH to derive the Either instance, so that it can sometimes have ConstSize. See #119.
0.5.0.1
Updates to test-suite enabling store to build with newer dependencies.
0.5.0
Data.Store.Streaming moved to a separate package, store-streaming.
0.4.3.2
Buildable with GHC 8.2
Fix to haddock formatting of Data.Store.TH code example
0.4.3.1
Fixed compilation on GHC 7.8
0.4.3
Less aggressive inlining, resulting in faster compilation / simplifier
not running out of ticks
0.4.2
Fixed testsuite
0.4.1
Breaking change in the encoding of Map / Set / IntMap / IntSet,
to use ascending key order. Attempting to decode data written by
prior versions of store (and vice versa) will almost always fail
with a decent error message. If you’re unlucky enough to have a
collision in the data with a random Word32 magic number, then the
error may not be so clear, or in extremely rare cases,
successfully decode, yielding incorrect results. See
#97 and
#101.
Performance improvement of the `Peek` monad, by introducing more
strictness. This required a change to the internal API.
API and behavior of `Data.Store.Version` changed. Previously, it
would check the version tag after decoding the contents. It now
also stores a magic Word32 tag at the beginning, so that it fails
more gracefully when decoding input that lacks encoded version
info.
0.4.0
Deprecated in favor of 0.4.1
0.3.1
Fix to derivation of primitive vectors, only relevant when built with
primitive-0.6.2.0 or later
Removes INLINE pragmas on the generic default methods. This
dramatically improves compilation time on recent GHC versions.
See #91.
Adds instance Contravariant Size
0.3
Uses store-core-0.3.*, which has support for alignment sensitive
architectures.
Adds support for streaming decode from file descriptors, not supported on
Windows. As part of this addition, the API for `Data.Store.Streaming` has
changed.
0.2.1.2
Fixes a bug that could result in attempting to malloc a negative
number of bytes when reading corrupted data.
0.2.1.1
Fixes a bug that could result in segfaults when reading corrupted data.
0.2.1.0
Release notes:
Adds experimental Data.Store.Version and deprecates Data.Store.TypeHash.
The new functionality is similar to TypeHash, but there are far fewer false
positives of hashes changing.
Other enhancements:
Now exports types related to generics
0.2.0.0
Release notes:
Core functionality split into store-core package
Breaking changes:
`combineSize'` renamed to `combineSizeWith`
Streaming support now prefixes each Message with a magic number, intended to
detect misalignment of data frames. This is worth the overhead, because
otherwise serialization errors could be more catastrophic - interpreting some
bytes as a length tag and attempting to consume many bytes from the source.