tasty-bench
Featherlight benchmark framework
https://github.com/Bodigrim/tasty-bench
| LTS Haskell 24.16: | 0.4.1@rev:1 | 
| Stackage Nightly 2025-10-24: | 0.4.1@rev:1 | 
| Latest on Hackage: | 0.4.1@rev:1 | 
tasty-bench-0.4.1@sha256:c38a77747f14e8d26ca99869b879ee56099b04e5381a74a62255dcbab85aab84,2066Module documentation for 0.4.1
- Test- Test.Tasty
 
tasty-bench  
  
 
Featherlight benchmark framework (only one file!) for performance measurement
with API mimicking criterion
and gauge.
A prominent feature is built-in comparison against previous runs
and between benchmarks.
- How lightweight is it?
- How is it possible?
- How to switch?
- How to write a benchmark?
- How to read results?
- Wall-clock time vs. CPU time
- Statistical model
- Memory usage
- Combining tests and benchmarks
- Troubleshooting
- Isolating interfering benchmarks
- Comparison against baseline
- Comparison between benchmarks
- Plotting results
- Build flags
- Command-line options
- Custom command-line options
How lightweight is it?
There is only one source file Test.Tasty.Bench and no non-boot dependencies
except tasty.
So if you already depend on tasty for a test suite, there
is nothing else to install.
Compare this to criterion (10+ modules, 50+ dependencies) and gauge (40+ modules, depends on basement and vector). A build on a clean machine is up to 16x
faster than criterion and up to 4x faster than gauge. A build without dependencies
is up to 6x faster than criterion and up to 8x faster than gauge.
tasty-bench is a native Haskell library and works everywhere, where GHC
does, including WASM. We support a full range of architectures (i386, amd64, armhf,
arm64, ppc64le, s390x) and operating systems (Linux, Windows, macOS,
FreeBSD, OpenBSD, NetBSD), plus any GHC from 8.0 to 9.10
(and earlier releases stretch back to GHC 7.0).
How is it possible?
Our benchmarks are literally regular tasty tests, so we can leverage all existing
machinery for command-line options, resource management, structuring,
listing and filtering benchmarks, running and reporting results. It also means
that tasty-bench can be used in conjunction with other tasty ingredients.
Unlike criterion and gauge we use a very simple statistical model described below.
This is arguably a questionable choice, but it works pretty well in practice.
A rare developer is sufficiently well-versed in probability theory
to make sense and use of all numbers generated by criterion.
How to switch?
Cabal mixins
allow to taste tasty-bench instead of criterion or gauge
without changing a single line of code:
cabal-version: 2.0
benchmark foo
  ...
  build-depends:
    tasty-bench
  mixins:
    tasty-bench (Test.Tasty.Bench as Criterion, Test.Tasty.Bench as Criterion.Main, Test.Tasty.Bench as Gauge, Test.Tasty.Bench as Gauge.Main)
This works vice versa as well: if you use tasty-bench, but at some point
need a more comprehensive statistical analysis,
it is easy to switch temporarily back to criterion.
How to write a benchmark?
Benchmarks are declared in a separate section of cabal file:
cabal-version:   2.0
name:            bench-fibo
version:         0.0
build-type:      Simple
synopsis:        Example of a benchmark
benchmark bench-fibo
  main-is:       BenchFibo.hs
  type:          exitcode-stdio-1.0
  build-depends: base, tasty-bench
  ghc-options:   "-with-rtsopts=-A32m"
  if impl(ghc >= 8.6)
    ghc-options: -fproc-alignment=64
And here is BenchFibo.hs:
import Test.Tasty.Bench
fibo :: Int -> Integer
fibo n = if n < 2 then toInteger n else fibo (n - 1) + fibo (n - 2)
main :: IO ()
main = defaultMain
  [ bgroup "Fibonacci numbers"
    [ bench "fifth"     $ nf fibo  5
    , bench "tenth"     $ nf fibo 10
    , bench "twentieth" $ nf fibo 20
    ]
  ]
Since tasty-bench provides an API compatible with criterion,
one can refer to its documentation for more examples.
How to read results?
Running the example above (cabal bench or stack bench)
results in the following output:
All
  Fibonacci numbers
    fifth:     OK (2.13s)
       63 ns ± 3.4 ns
    tenth:     OK (1.71s)
      809 ns ±  73 ns
    twentieth: OK (3.39s)
      104 μs ± 4.9 μs
All 3 tests passed (7.25s)
The output says that, for instance, the first benchmark was repeatedly executed for 2.13 seconds (wall-clock time), its predicted mean CPU time was 63 nanoseconds and means of individual samples do not often diverge from it further than ±3.4 nanoseconds (double standard deviation). Take standard deviation numbers with a grain of salt; there are lies, damned lies, and statistics.
Wall-clock time vs. CPU time
What time are we talking about?
Both criterion and gauge by default report wall-clock time, which is
affected by any other application which runs concurrently.
Ideally benchmarks are executed on a dedicated server without any other load,
but — let’s face the truth — most of developers run benchmarks
on a laptop with a hundred other services and a window manager, and
watch videos while waiting for benchmarks to finish. That’s the cause
of a notorious “variance introduced by outliers: 88% (severely inflated)” warning.
To alleviate this issue tasty-bench measures CPU time by getCPUTime
instead of wall-clock time by default.
It does not provide a perfect isolation from other processes (e. g.,
if CPU cache is spoiled by others, populating data back from RAM
is your burden), but is a bit more stable.
Caveat: this means that for multithreaded algorithms
tasty-bench reports total elapsed CPU time across all cores, while
criterion and gauge print maximum of core’s wall-clock time.
It also means that by default tasty-bench does not measure time spent out of process,
e. g., calls to other executables. To work around this limitation
use --time-mode command-line option or set it locally via TimeMode option.
Statistical model
Here is a procedure used by tasty-bench to measure execution time:
- Set $n \leftarrow 1$.
- Measure execution time $t_n$ of $n$ iterations and execution time $t_{2n}$ of $2n$ iterations.
- Find $t$ which minimizes deviation of $(nt,2nt)$ from $(t_n,t_{2n})$, namely $t \leftarrow (t_n + 2t_{2n}) / 5n$.
- If deviation is small enough (see --stdevbelow) or time is running out soon (see--timeoutbelow), return $t$ as a mean execution time.
- Otherwise set $n \leftarrow 2n$ and jump back to Step 2.
This is roughly similar to the linear regression approach which criterion takes,
but we fit only two last points. This allows us to simplify away all heavy-weight
statistical analysis. More importantly, earlier measurements,
which are presumably shorter and noisier, do not affect overall result.
This is in contrast to criterion, which fits all measurements and
is biased to use more data points corresponding to shorter runs
(it employs $n \leftarrow 1.05n$ progression).
Mean time and its deviation does not say much about the distribution of individual timings. E. g., imagine a computation which (according to a coarse system timer) takes either 0 ms or 1 ms with equal probability. While one would be able to establish that its mean time is 0.5 ms with a very small deviation, this does not imply that individual measurements are anywhere near 0.5 ms. Even assuming an infinite precision of a system timer, the distribution of individual times is not known to be normal.
Obligatory disclaimer: statistics is a tricky matter, there is no
one-size-fits-all approach.
In the absence of a good theory
simplistic approaches are as (un)sound as obscure ones.
Those who seek statistical soundness should rather collect raw data
and process it themselves using a proper statistical toolbox.
Data reported by tasty-bench
is only of indicative and comparative significance.
Memory usage
Configuring RTS to collect GC statistics
(e. g., via cabal bench --benchmark-options '+RTS -T'
or stack bench --ba '+RTS -T') enables tasty-bench to estimate and report
memory usage:
All
  Fibonacci numbers
    fifth:     OK (2.13s)
       63 ns ± 3.4 ns, 223 B  allocated,   0 B  copied, 2.0 MB peak memory
    tenth:     OK (1.71s)
      809 ns ±  73 ns, 2.3 KB allocated,   0 B  copied, 4.0 MB peak memory
    twentieth: OK (3.39s)
      104 μs ± 4.9 μs, 277 KB allocated,  59 B  copied, 5.0 MB peak memory
All 3 tests passed (7.25s)
This data is reported as per GHC.Stats.RTSStats fields:
- 
allocated_bytesTotal size of data ever allocated since the start of the benchmark iteration. Even if data was immediately garbage collected and freed, it still counts. 
- 
copied_bytesTotal size of data ever copied by GC (because it was alive and kicking) since the start of the benchmark iteration. Note that zero bytes often mean that the benchmark was too short to trigger GC at all. 
- 
max_mem_in_use_bytesPeak size of live data since the very start of the process. This is a global metric, it cumulatively grows and does not say much about individual benchmarks, but rather characterizes heap environment in which they are executed. 
Combining tests and benchmarks
When optimizing an existing function, it is important to check that its
observable behavior remains unchanged. One can rebuild
both tests and benchmarks after each change, but it would be more convenient
to run sanity checks within benchmark itself. Since our benchmarks
are compatible with tasty tests, we can easily do so.
Imagine you come up with a faster function myFibo to generate Fibonacci numbers:
import Test.Tasty.Bench
import Test.Tasty.QuickCheck -- from tasty-quickcheck package
fibo :: Int -> Integer
fibo n = if n < 2 then toInteger n else fibo (n - 1) + fibo (n - 2)
myFibo :: Int -> Integer
myFibo n = if n < 3 then toInteger n else myFibo (n - 1) + myFibo (n - 2)
main :: IO ()
main = Test.Tasty.Bench.defaultMain -- not Test.Tasty.defaultMain
  [ bench "fibo   20" $ nf fibo   20
  , bench "myFibo 20" $ nf myFibo 20
  , testProperty "myFibo = fibo" $ \n -> fibo n === myFibo n
  ]
This outputs:
All
  fibo   20:     OK (3.02s)
    104 μs ± 4.9 μs
  myFibo 20:     OK (1.99s)
     71 μs ± 5.3 μs
  myFibo = fibo: FAIL
    *** Failed! Falsified (after 5 tests and 1 shrink):
    2
    1 /= 2
    Use --quickcheck-replay=927711 to reproduce.
1 out of 3 tests failed (5.03s)
We see that myFibo is indeed significantly faster than fibo,
but unfortunately does not do the same thing. One should probably
look for another way to speed up generation of Fibonacci numbers.
Troubleshooting
- 
If benchmarks take too long, set --timeoutto limit execution time of individual benchmarks, andtasty-benchwill do its best to fit into a given time frame. Without--timeoutwe rerun benchmarks until achieving a target precision set by--stdev, which in a noisy environment of a modern laptop with GUI may take a lot of time.While criterionruns each benchmark at least for 5 seconds,tasty-benchis happy to conclude earlier, if it does not compromise the quality of results. In our experimentstasty-benchsuites tend to finish earlier, even if some individual benchmarks take longer than withcriterion.A common source of noisiness is garbage collection. Setting a larger allocation area (nursery) is often a good idea, either via cabal bench --benchmark-options '+RTS -A32m'orstack bench --ba '+RTS -A32m'. Alternatively bake it intocabalfile asghc-options: "-with-rtsopts=-A32m".
- 
Never compile benchmarks with -fstatic-argument-transformation, because it breaks a trick we use to force GHC into reevaluation of the same function application over and over again.
- 
If benchmark results look malformed like below, make sure that you are invoking Test.Tasty.Bench.defaultMainand notTest.Tasty.defaultMain(the difference isconsoleBenchReportervs.consoleTestReporter):All fibo 20: OK (1.46s) WithLoHi (Estimate {estMean = Measurement {measTime = 41529118775, measAllocs = 0, measCopied = 0, measMaxMem = 0}, estStdev = 1595055320}) (-Infinity) Infinity
- 
If benchmarks fail with an error message Unhandled resource. Probably a bug in the runner you're using.or Unexpected state of the resource (NotCreated) in getResource. Report as a tasty bug.this is likely caused by envorenvWithCleanupaffecting benchmarks structure. You can useenvto read test data fromIO, but not to read benchmark names or affect their hierarchy in other way. This is a fundamental restriction oftastyto list and filter benchmarks without launching missiles.Strict pattern-matching on resource is also prohibited. For instance, if it is a tuple, the second argument of envshould use a lazy pattern match\~(a, b) -> ...
- 
If benchmarks fail with Test dependencies form a looporTest dependencies have cycles, this is likely because ofbcompare, which compares a benchmark with itself. Locating a benchmark in a global environment may be tricky, please refer totastydocumentation for details and consider usinglocateBenchmark.
- 
When seeing This benchmark takes more than 100 seconds. Consider setting --timeout, if this is unexpected (or to silence this warning).do follow the advice: abort benchmarks and pass -t100or similar. Unless you are benchmarking a very computationally expensive function, a single benchmark should stabilize after a couple of seconds. This warning is a sign that your environment is too noisy, in which casetasty-benchwill continue trying with exponentially longer intervals, often unproductively.
- 
The following error can be thrown when benchmarks are built with ghc-options: -threaded:Benchmarks must not be run concurrently. Please pass -j1 and/or avoid +RTS -N.The underlying cause is that tastyruns tests concurrently, which is harmful for reliable performance measurements. Make sure to usetasty-bench >= 0.3.4and invokeTest.Tasty.Bench.defaultMainand notTest.Tasty.defaultMain. Note thatlocalOption (NumThreads 1)quashes the warning, but does not eliminate the cause.
- 
If benchmarks using GHC 9.4.4+ segfault on Windows, check that you are not using non-moving garbage collector --nonmoving-gc. This is likely caused by GHC issue. Previous releases oftasty-benchrecommended enabling--nonmoving-gcto stabilise benchmarks, but it’s discouraged now.
- 
If you see <stdout>: commitBuffer: invalid argument (cannot encode character '\177')or Uncaught exception ghc-internal:GHC.Internal.IO.Exception.IOException: <stdout>: commitBuffer: invalid argument (cannot encode character '\956')it means that your locale does not support UTF-8. tasty-benchmakes an effort to force locale to UTF-8, but sometimes, when benchmarks are a part of a larger application, it’s impossible to do so. In such case runlocale -ato list available locales and set a UTF-8-capable one (e. g.,export LANG=C.UTF-8) before starting benchmarks.
Isolating interfering benchmarks
One difficulty of benchmarking in Haskell is that it is
hard to isolate benchmarks so that they do not interfere.
Changing the order of benchmarks or skipping some of them
has an effect on heap’s layout and thus affects garbage collection.
This issue is well attested in
both
criterion
and
gauge.
Usually (but not always) skipping some benchmarks speeds up remaining ones. That’s because once a benchmark allocated heap which for some reason was not promptly released afterwards (e. g., it forced a top-level thunk in an underlying library), all further benchmarks are slowed down by garbage collector processing this additional amount of live data over and over again.
There are several mitigation strategies. First of all, giving garbage collector
more breathing space by +RTS -A32m (or more) is often good enough.
Further, avoid using top-level bindings to store large test data. Once such thunks
are forced, they remain allocated forever, which affects detrimentally subsequent
unrelated benchmarks. Treat them as external data, supplied via env: instead of
largeData :: String
largeData = replicate 1000000 'a'
main :: IO ()
main = defaultMain
  [ bench "large" $ nf length largeData, ... ]
use
import Control.DeepSeq (force)
import Control.Exception (evaluate)
main :: IO ()
main = defaultMain
  [ env (evaluate (force (replicate 1000000 'a'))) $ \largeData ->
    bench "large" $ nf length largeData, ... ]
Finally, as an ultimate measure to reduce interference between benchmarks, one can run each of them in a separate process. We do not quite recommend this approach, but if you are desperate, here is how:
cabal run -v0 all:benches -- -l | sed -e 's/[\"]/\\\\\\&/g' | while read -r name; do cabal run -v0 all:benches -- -p '$0 == "'"$name"'"'; done
This assumes that there is a single benchmark suite in the project and that benchmark names do not contain newlines.
Comparison against baseline
One can compare benchmark results against an earlier run in an automatic way.
When using this feature, it’s especially important to compile benchmarks with
ghc-options: -fproc-alignment=64, otherwise results could be skewed by
intermittent changes in cache-line alignment.
Firstly, run tasty-bench with --csv FILE key
to dump results to FILE in CSV format
(it could be a good idea to set smaller --stdev, if possible):
Name,Mean (ps),2*Stdev (ps)
All.Fibonacci numbers.fifth,48453,4060
All.Fibonacci numbers.tenth,637152,46744
All.Fibonacci numbers.twentieth,81369531,3342646
Now modify implementation and rerun benchmarks
with --baseline FILE key. This produces a report as follows:
All
  Fibonacci numbers
    fifth:     OK (0.44s)
       53 ns ± 2.7 ns,  8% more than baseline
    tenth:     OK (0.33s)
      641 ns ±  59 ns,       same as baseline
    twentieth: OK (0.36s)
       77 μs ± 6.4 μs,  5% less than baseline
All 3 tests passed (1.50s)
You can also fail benchmarks, which deviate too far from baseline, using
--fail-if-slower and --fail-if-faster options. For example, setting both of them
to 6 will fail the first benchmark above (because it is more than 6% slower),
but the last one still succeeds (even while it is measurably faster than baseline,
deviation is less than 6%). Consider also using --hide-successes to show
only problematic benchmarks, or even
tasty-rerun package
to focus on rerunning failing items only.
If you wish to compare two CSV reports non-interactively, here is a handy awk incantation:
awk 'BEGIN{FS=",";OFS=",";print "Name,Old,New,Ratio"}FNR==1{trueNF=NF;next}NF<trueNF{print "Benchmark names should not contain newlines";exit 1}FNR==NR{oldTime=$(NF-trueNF+2);NF-=trueNF-1;a[$0]=oldTime;next}{newTime=$(NF-trueNF+2);NF-=trueNF-1;if(a[$0]){print $0,a[$0],newTime,newTime/a[$0];gs+=log(newTime/a[$0]);gc++}}END{if(gc>0)print "Geometric mean,,",exp(gs/gc)}' old.csv new.csv
A larger shell snippet to compare two git commits can be found in compare_benches.sh.
Note that columns in CSV report are different from what criterion or gauge
would produce. If names do not contain commas, missing columns can be faked this way:
awk 'BEGIN{FS=",";OFS=",";print "Name,Mean,MeanLB,MeanUB,Stddev,StddevLB,StddevUB"}NR==1{trueNF=NF;next}NF<trueNF{print $0;next}{mean=$(NF-trueNF+2);stddev=$(NF-trueNF+3);NF-=trueNF-1;print $0,mean/1e12,mean/1e12,mean/1e12,stddev/2e12,stddev/2e12,stddev/2e12}'
To fake gauge in --csvraw mode use
awk 'BEGIN{FS=",";OFS=",";print "name,iters,time,cycles,cpuTime,utime,stime,maxrss,minflt,majflt,nvcsw,nivcsw,allocated,numGcs,bytesCopied,mutatorWallSeconds,mutatorCpuSeconds,gcWallSeconds,gcCpuSeconds"}NR==1{trueNF=NF;next}NF<trueNF{print $0;next}{mean=$(NF-trueNF+2);fourth=$(NF-trueNF+4);fifth=$(NF-trueNF+5);sixth=$(NF-trueNF+6);NF-=trueNF-1;print $0,1,mean/1e12,0,mean/1e12,mean/1e12,0,sixth+0,0,0,0,0,fourth+0,0,fifth+0,0,0,0,0}'
Comparison between benchmarks
You can also compare benchmarks to each other without any external tools, all in the comfort of your terminal.
import Test.Tasty.Bench
fibo :: Int -> Integer
fibo n = if n < 2 then toInteger n else fibo (n - 1) + fibo (n - 2)
main :: IO ()
main = defaultMain
  [ bgroup "Fibonacci numbers"
    [ bcompare "tenth"  $ bench "fifth"     $ nf fibo  5
    ,                     bench "tenth"     $ nf fibo 10
    , bcompare "tenth"  $ bench "twentieth" $ nf fibo 20
    ]
  ]
This produces a report, comparing mean times of fifth and twentieth to tenth:
All
  Fibonacci numbers
    fifth:     OK (16.56s)
      121 ns ± 2.6 ns, 0.08x
    tenth:     OK (6.84s)
      1.6 μs ±  31 ns
    twentieth: OK (6.96s)
      203 μs ± 4.1 μs, 128.36x
To locate a baseline benchmark in a larger suite use locateBenchmark.
One can leverage comparisons between benchmarks to implement portable performance
tests, expressing properties like “this algorithm must be at least twice faster
than that one” or “this operation should not be more than thrice slower than that”.
This can be achieved with bcompareWithin, which takes an acceptable interval
of performance as an argument.
Plotting results
Users can dump results into CSV with --csv FILE
and plot them using gnuplot or other software. But for convenience
there is also a built-in quick-and-dirty SVG plotting feature,
which can be invoked by passing --svg FILE. Here is a sample of its output:
Build flags
Build flags are a brittle subject and users do not normally need to touch them.
- If you find yourself in an environment, where tastyis not available and you have access to boot packages only, you can still usetasty-bench! Just copyTest/Tasty/Bench.hsto your project (imagine it like a header-only C library). It will provide you with functions to buildBenchmarkableand run them manually viameasureCpuTime. This mode of operation can be also configured by disabling Cabal flagtasty.
Command-line options
Use --help to list all command-line options.
- 
-p,--patternThis is a standard tastyoption, which allows filtering benchmarks by a pattern orawkexpression. Please refer totastydocumentation for details.
- 
-t,--timeoutThis is a standard tastyoption, setting timeout for individual benchmarks in seconds. Use it when benchmarks tend to take too long:tasty-benchwill make an effort to report results (even if of subpar quality) before timeout. Setting timeout too tight (insufficient for at least three iterations) will result in a benchmark failure. One can adjust it locally for a group of benchmarks, e. g.,localOption (mkTimeout 100000000)for 100 seconds.
- 
--stdevTarget relative standard deviation of measurements in percents (5% by default). Large values correspond to fast and loose benchmarks, and small ones to long and precise. It can also be adjusted locally for a group of benchmarks, e. g., localOption (RelStDev 0.02). If benchmarking takes far too long, consider setting--timeout, which will interrupt benchmarks, potentially before reaching the target deviation.
- 
--csvFile to write results in CSV format. 
- 
--baselineFile to read baseline results in CSV format (as produced by --csv).
- 
--fail-if-slower,--fail-if-fasterUpper bounds of acceptable slow down / speed up in percents. If a benchmark is unacceptably slower / faster than baseline (see --baseline), it will be reported as failed. Can be used in conjunction with a standardtastyoption--hide-successesto show only problematic benchmarks. Both options can be adjusted locally for a group of benchmarks, e. g.,localOption (FailIfSlower 0.10).
- 
--svgFile to plot results in SVG format. 
- 
--time-modeWhether to measure CPU time ( cpu, default) or wall-clock time (wall).
- 
+RTS -TEstimate and report memory usage. 
Custom command-line options
As usual with tasty, it is easy to extend benchmarks with custom command-line options.
Here is an example:
import Data.Proxy
import Test.Tasty.Bench
import Test.Tasty.Ingredients.Basic
import Test.Tasty.Options
import Test.Tasty.Runners
newtype RandomSeed = RandomSeed Int
instance IsOption RandomSeed where
  defaultValue = RandomSeed 42
  parseValue = fmap RandomSeed . safeRead
  optionName = pure "seed"
  optionHelp = pure "Random seed used in benchmarks"
main :: IO ()
main = do
  let customOpts  = [Option (Proxy :: Proxy RandomSeed)]
      ingredients = includingOptions customOpts : benchIngredients
  opts <- parseOptions ingredients benchmarks
  let RandomSeed seed = lookupOption opts
  defaultMainWithIngredients ingredients benchmarks
benchmarks :: Benchmark
benchmarks = bgroup "All" []
Changes
0.4.1
- Force GC before collecting RTSStats, otherwise measurements are inaccurate.
- Restore locale encoding after defaultMain.
0.4
- Switch nf,nfIOandnfAppIOto evaluate outputs to a normal form withrnfinstead offorce. It means that parts of the output, which have already been forced, can be garbage collected early, without waiting for the entire output to be allocated at the same time. This decreases benchmarking overhead in many scenarios and brings the behaviour in line withcriterion. See #39 for discussion.
- Drop support of tasty < 1.4.
- Make IObenchmarks immune to-fspec-constr-countlimit.
- Decomission debugbuild flag.
- Decomission warning when --timeoutis absent.
- Add instance {Eq,Ord,Num,Fractional} {RelStDev,FailIfSlower,FailIfFaster}.
- Add instance {Eq,Ord} {CsvPath,SvgPath,BaselinePath}.
0.3.5
- Support tasty-1.5.
- Report benchmarking progress.
0.3.4
- Force single-threaded execution in defaultMain.
- Expose measureCpuTimeAndStDevhelper to analyse benchmarks manually.
0.3.3
- Drop support of tasty < 1.2.3.
- Make benchmarks immune to -fspec-constr-countlimit.
0.3.2
- Add locateBenchmarkandmapLeafBenchmarks.
- Support measuring of wall-clock time.
- Make messages for baseline comparison less ambiguous.
- Graceful degradation on non-Unicode terminals.
0.3.1
- Add bcompareWithinfor portable performance tests.
- Add tastyanddebugbuild flags.
0.3
- Report mean time with 3 significant digits.
- Report peak memory usage, when run with +RTS -T.
- Run benchmarks only once, if RelStDevis infinite.
- Make Benchmarkableconstructor public.
- Expose measureCpuTimehelper to run benchmarks manually.
- Expose CsvPath,BaselinePath,SvgPath.
0.2.5
- Fix comparison against baseline.
0.2.4
- Add a simplistic SVG reporter.
- Add bcompareto compare between benchmarks.
- Throw a warning, if benchmarks take too long.
0.2.3
- Prohibit duplicated benchmark names in CSV reports.
0.2.2
- Remove NFDataconstraint fromwhnfIO.
0.2.1
- Fix integer overflow in stdev computations.
0.2
- Add envandenvWithCleanup.
- Run console and CSV reporters in parallel.
- Extend console reporter and export it as consoleBenchReporter.
- Add comparison against baseline and relevant options.
- Export RelStDevoption.
- Export benchIngredients.
0.1
- Initial release.
