bench-show
Show, plot and compare benchmark results
https://github.com/composewell/bench-show
Version on this page: 0.2.2
LTS Haskell 20.26: 0.3.2
Stackage Nightly 2024-05-04: 0.3.2@rev:1
Latest on Hackage: 0.3.2@rev:1
Generate text reports and graphical charts from the benchmark results
generated by gauge or criterion, showing or comparing benchmarks in
many useful ways. In a few lines of code, we can report time taken, peak
memory usage, allocations, among many other fields; we can group benchmarks
and compare the groups; we can compare benchmarks before and after a change;
we can show absolute or percentage difference from the baseline; we can sort
the results to get the worst affected benchmarks by percentage change.
It can help us in answering questions like the following, visually or textually:
- Across two benchmark runs, show all the operations that resulted in a regression of more than 10%, so that we can quickly identify and fix performance problems in our application.
- Across two (or more) packages providing similar functionality, show all the operations where the performance differs by more than 10%, so that we can critically analyze the packages and choose the right one.
Quick Start
Use gauge or criterion to generate a results.csv file, and then use the
following code to generate a textual report or a graph:
report "results.csv" Nothing defaultConfig
graph "results.csv" "output" defaultConfig
For advanced usage, control the generated report by modifying the
defaultConfig.
Reports and Charts
report with Fields presentation style generates a multi-column report. We
can select many fields from a gauge raw report. Units of the fields are
automatically determined based on the range of values:
report "results.csv" Nothing defaultConfig { presentation = Fields }
Benchmark     time(μs) maxrss(MiB)
------------- -------- -----------
vector/fold     641.62        2.75
streamly/fold   639.96        2.75
vector/map      638.89        2.72
streamly/map    653.36        2.66
vector/zip      651.42        2.58
streamly/zip    644.33        2.59
graph generates one bar chart per field:
graph "results.csv" "output" defaultConfig
When the input file contains results from a single benchmark run, by default all the benchmarks are placed in a single benchmark group named “default”.
Grouping
Let’s write a benchmark classifier to put the streamly and vector
benchmarks in their own groups:
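import Data.List.Split (splitOn)  -- from the split package

-- Split "group/benchmark" into a (group, benchmark) pair.
classifier :: String -> Maybe (String, String)
classifier name =
    case splitOn "/" name of
        grp : bench -> Just (grp, concat bench)
        _           -> Nothing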
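For readers who would rather not depend on the split package, here is a minimal base-only sketch of a classifier with the same result shape as the one above. The name `classifyBySlash` is hypothetical and is not part of bench-show. Its behavior also differs from the `splitOn` version in edge cases: it splits only at the first `/`, and it returns `Nothing` for names that have no slash or that have an empty group or benchmark part.

```haskell
-- Hypothetical base-only classifier (not part of bench-show).
-- Splits a name such as "vector/fold" at the first '/' into
-- a (group, benchmark) pair.
classifyBySlash :: String -> Maybe (String, String)
classifyBySlash name =
    case break (== '/') name of
        (grp, '/' : bench)
            | not (null grp) && not (null bench) -> Just (grp, bench)
        -- No slash at all, or an empty group or benchmark part.
        _ -> Nothing
```

With this definition, `classifyBySlash "vector/fold"` evaluates to `Just ("vector","fold")`, while `classifyBySlash "fold"` evaluates to `Nothing`.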
Now we can show the two benchmark groups as separate columns. We can
generate reports comparing different benchmark fields (e.g. time and
maxrss) for all the groups:
report "results.csv" Nothing
    defaultConfig { classifyBenchmark = classifier }
(time)(Median)
Benchmark streamly(μs) vector(μs)
--------- ------------ ----------
fold            639.96     641.62
map             653.36     638.89
zip             644.33     651.42
We can do the same graphically as well; just replace report with graph
in the code above. Each group is placed as a cluster on the graph. Multiple
clusters are placed side by side (i.e. on the same scale) for easy
comparison.
Regression, Percentage Difference and Sorting
We can append benchmark results from multiple runs to the same file. These runs can then be compared. We can run benchmarks before and after a change and then report the regressions by percentage change in sorted order.
Given a results file with two runs, this code generates the report that follows:
-- Needs: import Data.List (sortBy) and import Data.Ord (comparing)
report "results.csv" Nothing
    defaultConfig
        { classifyBenchmark = classifier
        , presentation = Groups PercentDiff
        , selectBenchmarks = \f ->
              reverse
              $ map fst
              $ sortBy (comparing snd)
              $ either error id $ f $ ColumnIndex 1
        }
(time)(Median)(Diff using min estimator)
Benchmark streamly(0)(μs)(base) streamly(1)(%)(-base)
--------- --------------------- ---------------------
zip                      644.33                +23.28
map                      653.36                 +7.65
fold                     639.96                -15.63
It tells us that in the second run the worst affected benchmark is zip, which takes 23.28 percent more time than the baseline.
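The ordering step inside the `selectBenchmarks` lambda can be tried on its own with plain lists. The sketch below uses only base. The names `percentChange`, `worstFirst`, `regressionsOver` and `reportedDiffs` are hypothetical helpers for illustration, not bench-show APIs. `percentChange` is the usual relative-change formula and is an assumption: it is not necessarily how the library computes its difference, which also depends on the chosen estimator.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing)

-- Usual relative change of a new value against a baseline, in percent.
-- Assumption: not necessarily the exact formula bench-show applies.
percentChange :: Double -> Double -> Double
percentChange base new = (new - base) / base * 100

-- Benchmark names ordered from largest to smallest value, using the same
-- steps as the lambda above: sort ascending, keep the names, reverse.
worstFirst :: [(String, Double)] -> [String]
worstFirst = reverse . map fst . sortBy (comparing snd)

-- Keep only the benchmarks whose change exceeds a limit (in percent).
regressionsOver :: Double -> [(String, Double)] -> [(String, Double)]
regressionsOver limit = filter ((> limit) . snd)

-- Percentage differences copied from the report above.
reportedDiffs :: [(String, Double)]
reportedDiffs = [("fold", -15.63), ("map", 7.65), ("zip", 23.28)]
```

Here `worstFirst reportedDiffs` evaluates to `["zip","map","fold"]`, matching the row order of the report, and `regressionsOver 10 reportedDiffs` keeps only the `zip` entry, which is the "regression of more than 10%" question from the introduction.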
Full Documentation and examples
- See the haddock documentation on Hackage
- See the comprehensive tutorial module in the haddock docs
- For examples see the test directory in the package
Contributions and Feedback
Contributions are welcome! Please see the TODO.md file or the existing issues if you want to pick up something to work on.
Any feedback on improvements or the direction of the package is welcome. You can always send an email to the maintainer or raise an issue for anything you want to suggest or discuss, or send a PR for any change that you would like to make.
Changes
0.2.2
- Allow additional annotations to title to be controlled via config
- Better error handling
0.2.1
- Use new version of statistics package.
0.2.0
Release Notes
- Due to a bug in the statistics package, reporting may crash on certain inputs with a “vector index out of bounds” message. The bug has been fixed and the fix will be available in an upcoming release.
Breaking Changes
- The package bench-graph has been renamed to bench-show to reflect the fact that it now includes text reports as well. This includes the change of module name BenchGraph to BenchShow.
- The bgraph API has been removed and replaced by graph.
- The way the output file name is generated has changed. Now the field name or group name being plotted, or both, may be suffixed to the output file name automatically. The estimator type (e.g. mean or median) is also suffixed to the filename.
- Changes to Config record:
  - chartTitle field has been renamed to title.
  - The type of outputDir is now a Maybe.
  - comparisonStyle has been replaced by presentation.
  - ComparisonStyle has been replaced by Presentation.
  - sortBenchmarks has been replaced by selectBenchmarks. The new function can be defined as follows in terms of an older definition: selectBenchmarks = \f -> sortBenchmarks $ either error (map fst) $ f (ColumnIndex 0)
  - sortBenchGroups has been replaced by selectGroups.
  - setYScale field has been broken down into two fields, fieldRanges and fieldTicks. Now you also need to specify which fields’ scale you want to set.
Enhancements
- A report API has been added to generate textual reports
- More ways to compare groups have been added, including percent and percent difference
- Now we can show multiple fields as columns in a single benchmark group report
- Field units are now automatically selected based on the range of values
- Additions to Config record type:
  - selectFields added to select the fields to be plotted and to change their presentation order.
  - selectBenchmarks can now sort the results based on values corresponding to any field or benchmark group.
  - new fields added: diffStrategy, verbose, estimator, threshold
0.1.4
- Fix a bug resulting in a bogus error, something like “Field [time] found at different indexes..” even though the field has exactly the same index at all places.
0.1.3
- Add maxrss plotting support
0.1.2
- Fixed a bug that caused missing graphs in some cases when multiple iterations of a benchmark are present in the benchmark results file.
- Better error reporting to pinpoint errors when a problem occurs.
0.1.1
- Support GHC 8.4
0.1.0
- Initial release