This version can be pinned in stack with: tdigest-0.3@sha256:975df9741a336f2498a684460dd01700213ea7bb8e61f9d722057f854c70cae7,2895
tdigest
A new data structure for accurate on-line accumulation of rank-based statistics such as quantiles and trimmed means.
See the original paper: “Computing extremely accurate quantiles using t-digest” by Ted Dunning and Otmar Ertl.
Synopsis
λ *Data.TDigest > median (tdigest [1..1000] :: TDigest 3)
Just 499.0090729817737
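The same query can be written as a standalone program. This is a minimal sketch, assuming the Data.TDigest module exports tdigest, median and quantile as used here, and that quantile takes the requested fraction as its first argument. The compression value 25 and the name digest are illustrative choices, not recommendations from the package.

```haskell
{-# LANGUAGE DataKinds #-}
module Main (main) where

import Data.TDigest (TDigest, median, quantile, tdigest)

-- The compression parameter is a type-level natural, so DataKinds is needed.
-- 25 is an arbitrary value picked for this sketch.
digest :: TDigest 25
digest = tdigest [1 .. 1000]

main :: IO ()
main = do
  -- Both queries are approximate and return Nothing for an empty digest.
  print (median digest)
  print (quantile 0.99 digest)
```

The results are approximations, so the printed values will differ slightly from the exact median and 99th percentile of the input.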
Benchmarks
Using 50M exponentially distributed numbers:
- average: 16s; incorrect approximation of median, mostly to measure prng speed
- sorting using vector-algorithms: 33s; using 1000MB of memory
- sparking t-digest (using some par): 53s
- buffered t-digest: 68s
- sequential t-digest: 65s
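The sorting entry above is the exact baseline the t-digest variants are compared against. A minimal sketch of that idea follows, using Data.List.sort from base instead of vector-algorithms so that it needs no extra packages. exactMedian is a hypothetical helper and is not part of tdigest or of the benchmark code.

```haskell
module Main (main) where

import Data.List (sort)

-- | Exact median obtained by sorting the whole input.
-- Hypothetical helper for illustration; returns Nothing for empty input.
exactMedian :: [Double] -> Maybe Double
exactMedian [] = Nothing
exactMedian xs
  | even n    = Just ((sorted !! (half - 1) + sorted !! half) / 2)
  | otherwise = Just (sorted !! half)
  where
    sorted = sort xs
    n      = length xs
    half   = n `div` 2

main :: IO ()
main = print (exactMedian [1 .. 1000])  -- prints Just 500.5
```

Sorting has to hold every sample in memory, which matches the memory cost noted for the sorting benchmark, whereas a t-digest keeps only a bounded summary and answers approximately.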
Example histogram
tdigest-simple -m tdigest -d standard -s 100000 -c 10 -o output.svg -i 34
cp output.svg example.svg
inkscape --export-png=example.png --export-dpi=80 --export-background-opacity=0 --without-gui example.svg
0.3
- Depend on foldable1-classes-compat instead of semigroupoids.
0.2.1.1
0.2.1
- Add size, valid, validate, and debugPrint for NonEmpty (#26)
0.2
- Add Data.TDigest.Vector module.
0.1
- Add validateHistogram and debugPrint
- Fix a pointy centroid bug.
- Add Data.TDigest.NonEmpty module
- Add mean, variance, stddev