bitvec
Space-efficient bit vectors
https://github.com/Bodigrim/bitvec
| LTS Haskell 24.16: | 1.1.5.0@rev:3 |
| Stackage Nightly 2025-10-24: | 1.1.5.0@rev:3 |
| Latest on Hackage: | 1.1.5.0@rev:3 |
bitvec-1.1.5.0@sha256:434be6dc60e22858a52869c58038c35353f1a778b9679ebc06a2165bcc7f88b3,4921Module documentation for 1.1.5.0
bitvec

A newtype over Bool with a better Vector instance: 8x less memory, up to 3500x faster.
The vector
package represents unboxed arrays of Bools
spending 1 byte (8 bits) per boolean.
This library provides a newtype wrapper Bit and a custom instance
of an unboxed Vector, which packs bits densely,
achieving an 8x smaller memory footprint.
The performance stays mostly the same;
the most significant degradation happens for random writes
(up to 10% slower).
On the other hand, for certain bulk bit operations
Vector Bit is up to 3500x faster than Vector Bool.
Thread safety
Data.Bitis faster, but writes and flips are not thread-safe. This is because naive updates are not atomic: they read the whole word from memory, then modify a bit, then write the whole word back. Concurrently modifying non-intersecting slices of the same underlying array may also lead to unexpected results, since they can share a word in memory.Data.Bit.ThreadSafeis slower (usually 10-20%), but writes and flips are thread-safe. Additionally, concurrently modifying non-intersecting slices of the same underlying array works as expected. However, operations that affect multiple elements are not guaranteed to be atomic.
Quick start
Consider the following (very naive) implementation of
the sieve of Eratosthenes. It returns a vector with True
at prime indices and False at composite indices.
import Control.Monad
import Control.Monad.ST
import qualified Data.Vector.Unboxed as U
import qualified Data.Vector.Unboxed.Mutable as MU
eratosthenes :: U.Vector Bool
eratosthenes = runST $ do
let len = 100
sieve <- MU.replicate len True
MU.write sieve 0 False
MU.write sieve 1 False
forM_ [2 .. floor (sqrt (fromIntegral len))] $ \p -> do
isPrime <- MU.read sieve p
when isPrime $
forM_ [2 * p, 3 * p .. len - 1] $ \i ->
MU.write sieve i False
U.unsafeFreeze sieve
We can switch from Bool to Bit just by adding newtype constructors:
import Data.Bit
import Control.Monad
import Control.Monad.ST
import qualified Data.Vector.Unboxed as U
import qualified Data.Vector.Unboxed.Mutable as MU
eratosthenes :: U.Vector Bit
eratosthenes = runST $ do
let len = 100
sieve <- MU.replicate len (Bit True)
MU.write sieve 0 (Bit False)
MU.write sieve 1 (Bit False)
forM_ [2 .. floor (sqrt (fromIntegral len))] $ \p -> do
Bit isPrime <- MU.read sieve p
when isPrime $
forM_ [2 * p, 3 * p .. len - 1] $ \i ->
MU.write sieve i (Bit False)
U.unsafeFreeze sieve
The Bit-based implementation requires 8x less memory to store
the vector. For large sizes it allows to crunch more data in RAM
without swapping. For smaller arrays it helps to fit into
CPU caches.
> listBits eratosthenes
[2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97]
There are several high-level helpers, digesting bits in bulk,
which makes them up to 64x faster than the respective counterparts
for Vector Bool. One can query the population count (popcount)
of a vector (giving us the prime-counting function):
> countBits eratosthenes
25
And vice versa, query an address of the n-th set bit (which corresponds to the n-th prime number here):
> nthBitIndex (Bit True) 10 eratosthenes
Just 29
One may notice that the order of the inner traversal by i
does not matter and get tempted to run it in several parallel threads.
In this case it is vital to switch from Data.Bit to Data.Bit.ThreadSafe,
because the former is not thread-safe with regards to writes.
There is a moderate performance penalty (usually 10-20%)
for using the thread-safe interface.
Sets
Bit vectors can be used as a blazingly fast representation of sets,
as long as their elements are Enumeratable and sufficiently dense,
leaving IntSet far behind.
For example, consider three possible representations of a set of Word16:
- As an
IntSetwith a readily availableunionfunction. - As a 64k-long unboxed
Vector Bool, implementing union aszipWith (||). - As a 64k-long unboxed
Vector Bit, implementing union aszipBits (.|.).
When the simd flag is enabled,
according to our benchmarks (see bench folder),
the union of Vector Bit evaluates magnitudes faster
than the union of not-too-sparse IntSets
and stunningly outperforms Vector Bool.
Here are benchmarks on MacBook M2:
union
16384
Vector Bit:
61.2 ns ± 3.2 ns
Vector Bool:
96.1 μs ± 4.5 μs, 1570.84x
IntSet:
2.15 μs ± 211 ns, 35.06x
32768
Vector Bit:
143 ns ± 7.4 ns
Vector Bool:
225 μs ± 16 μs, 1578.60x
IntSet:
4.34 μs ± 429 ns, 30.39x
65536
Vector Bit:
249 ns ± 18 ns
Vector Bool:
483 μs ± 28 μs, 1936.42x
IntSet:
8.77 μs ± 835 ns, 35.18x
131072
Vector Bit:
322 ns ± 30 ns
Vector Bool:
988 μs ± 53 μs, 3071.83x
IntSet:
17.6 μs ± 1.6 μs, 54.79x
262144
Vector Bit:
563 ns ± 27 ns
Vector Bool:
2.00 ms ± 112 μs, 3555.36x
IntSet:
36.8 μs ± 3.3 μs, 65.40x
Binary polynomials
Binary polynomials are polynomials with coefficients modulo 2.
Their applications include coding theory and cryptography.
While one can successfully implement them with the poly package,
operating on UPoly Bit,
this package provides even faster arithmetic routines
exposed via the F2Poly data type and its instances.
> :set -XBinaryLiterals
> -- (1 + x) * (1 + x + x^2) = 1 + x^3 (mod 2)
> 0b11 * 0b111 :: F2Poly
F2Poly {unF2Poly = [1,0,0,1]}
Use fromInteger / toInteger to convert binary polynomials
from Integer to F2Poly and back.
Package flags
-
Flag
simd, enabled by default.Use a C SIMD implementation for the ultimate performance of
zipBits,invertBitsandcountBits.
Similar packages
-
arrayis memory-efficient forBool, but lacks a handyVectorinterface and is not thread-safe.
Additional resources
Changes
1.1.5.0
-
Make
zipBitsunconditionally strict in its second bit vector argument (thanks to @treeowl). -
Add
simdflag (enabled by default) to use a C SIMD implementation forzipBits,invertBits,countBits,bitIndex,nthBitIndex,selectBits,excludeBits,reverseBits(thanks to @konsumlamm). -
Decomission
libgmpflag.
1.1.4.0
- Include
Data.Bit.Gmponly iflibgmpflag is set. - Tweak inlining pragmas to inline less aggressively.
1.1.3.0
- Fix malformed
signumforF2Poly.
1.1.2.0
- Fix
setBit,clearBit,complementBitto preserve vector’s length. - Fix various issues on big-endian architectures.
- Fix Cabal 3.7+ incompatibility.
1.1.1.0
- Export
BitVecandBitMVecconstructors.
1.1.0.0
- Fix a grave bug in
bitIndex. - Remove
integer-gmpflag. - Make
libgmpflag disabled by default. Users are strongly encouraged to enable it whenever possible. - Add
mapBitsandmapInPlacefunctions. - Add
cloneToByteStringandcloneFromByteStringfunctions.
1.0.3.0
- Add
Bits (Vector Bit)instance. - Add
castFromWords8,castToWords8,cloneToWords8to facilitate interoperation withByteString.
1.0.2.0
- Fix out-of-bounds writes in mutable interface.
- Improve thread-safety of mutable interface.
- Add extended GCD for
F2Poly. - Change
Showinstance ofF2Poly.
1.0.1.2
- Fix more bugs in
F2Polymultiplication.
1.0.1.1
- Fix bugs in
F2Polymultiplication. - Performance improvements.
1.0.1.0
- Implement arithmetic of binary polynomials.
- Add
invertBitsandreverseBitsfunctions. - Add
Num,Real,Integral,FractionalandNFDatainstances. - Performance improvements.
1.0.0.1
- Performance improvements.
1.0.0.0
- Redesign API from the scratch.
- Add a thread-safe implementation.
- Add
nthBitIndexfunction.
0.2.0.1
- Fix
Readinstance.
0.2.0.0
- Remove hand-written
Num,Real,Integral,Bitsinstances. - Derive
BitsandFiniteBitsinstances. - Expose
Bitconstructor directly and removefromBoolfunction. - Rename
toBooltounBit.
0.1.1.0
- Fix bugs in
MVectorandVectorinstances ofBit. - Speed up
MVectorandVectorinstances ofBit.