Criterion: robust, reliable performance measurement
This package provides the Criterion module, a Haskell library for measuring and analysing software performance.
To get started, read the online tutorial, and take a look at the programs in the examples directory.
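As a quick orientation before the tutorial, a minimal benchmark program might look like the sketch below. The `fib` function and the benchmark names are made up for illustration; `defaultMain`, `bgroup`, `bench`, and `whnf` are imported from `Criterion.Main`.

```haskell
import Criterion.Main (bench, bgroup, defaultMain, whnf)

-- A deliberately naive Fibonacci (hypothetical workload), used only as
-- something to measure.
fib :: Int -> Integer
fib n
  | n < 2     = toInteger n
  | otherwise = fib (n - 1) + fib (n - 2)

main :: IO ()
main = defaultMain
  [ bgroup "fib"
      [ bench "10" $ whnf fib 10  -- evaluate `fib 10` to weak head normal form
      , bench "20" $ whnf fib 20
      ]
  ]
```

Running the compiled program with `--output report.html` should also write an HTML report next to the console output.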
Building and installing
To build and install criterion, just run
cabal install criterion
Please report bugs via the GitHub issue tracker.
Master GitHub repository:
git clone https://github.com/bos/criterion.git
There’s also a Mercurial mirror:
hg clone https://bitbucket.org/bos/criterion
(You can create and contribute changes using either Mercurial or git.)
This library is written and maintained by Bryan O’Sullivan, firstname.lastname@example.org.
- Fix a bug where HTML reports failed to escape JSON properly.
The HTML reports have been reworked.
- The flot plotting library (js-flot on Hackage) has been replaced by
- Most practical changes focus on improving the functionality of the overview
- It now supports logarithmic scale (#213). The scale can be toggled by clicking the x-axis.
- Manual zooming has been replaced by clicking to focus a single bar.
- It now supports a variety of sort orders.
- The legend can now be toggled on/off and is hidden by default.
- Clicking the name of a group in the legend shows/hides all bars in that group.
- The regression line on the scatter plot shows the confidence interval.
- Better support for mobile and print.
- Warn if an HTML report name contains newlines, and replace newlines with whitespace to avoid syntax errors in the report itself.
- Use unescaped HTML in the
- Allow building with
- Fix the build on old GHCs with the
parserWith, which allows creating a criterion command-line interface using a custom Parser. This is useful for situations where one wants to add additional command-line arguments to the default ones that
For an example of how to use parserWith, refer to
Tweak the way the graph in the HTML overview zooms:
- Zooming all the way out resets to the default view (instead of continuing to zoom out towards empty space).
- Panning all the way to the right resets to the default view in which zero is left-aligned (instead of continuing to pan off the edge of the graph).
- Panning and zooming only affects the x-axis, so all results remain in-frame.
- Make more functions (e.g., runMode) able to print the µ character on non-UTF-8 encodings.
Fix a bug in which HTML reports would render incorrectly when including benchmark names containing apostrophes.
Only incur a dependency on fail on old GHCs.
Add some documentation in
Move the measurement functionality of criterion into a standalone package, criterion-measurement. In particular,
Criterion.Measurement are now in criterion-measurement, along with the relevant definitions of
Criterion.Types.Internal (both of which are now under the
criterion now depends on
This will let other libraries (e.g. alternative statistical analysis front-ends) import the measurement functionality alone as a lightweight dependency.
Fix a bug on macOS and Windows where using runAndAnalyse and other lower-level benchmarking functions would result in an infinite loop.
We now do three samples for statistics:
- performMinorGC before the first sample, to ensure it’s up to date.
- Take another sample after the action, without a garbage collection, so we can gather legitimate readings on GC-related statistics.
- performMinorGC and sample once more, so we can get up-to-date readings on other metrics.
The type of applyGCStatistics has changed accordingly. Before, it was:
   Maybe GCStatistics -- ^ Statistics gathered at the end of a run.
-> Maybe GCStatistics -- ^ Statistics gathered at the beginning of a run.
-> Measured -> Measured
Now, it is:
   Maybe GCStatistics -- ^ Statistics gathered at the end of a run, post-GC.
-> Maybe GCStatistics -- ^ Statistics gathered at the end of a run, pre-GC.
-> Maybe GCStatistics -- ^ Statistics gathered at the beginning of a run.
-> Measured -> Measured
applyGCStatistics, we carefully choose whether to diff against the end stats pre- or post-GC.
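The three-sample scheme and the pre-/post-GC choice can be illustrated with base alone. The sketch below is not criterion's implementation: `Samples`, `sampleAround`, `numGcs`, and `bytesAllocated` are hypothetical names, and it assumes GHC 8.2 or later (for `GHC.Stats.getRTSStats`) and a program run with `+RTS -T` so that RTS statistics are collected.

```haskell
import Data.Int (Int64)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)
import System.Mem (performMinorGC)

-- | Three snapshots of the RTS statistics (hypothetical helper type).
data Samples = Samples
  { atStart   :: RTSStats  -- taken after an initial minor GC
  , endPreGC  :: RTSStats  -- taken right after the action, no GC forced
  , endPostGC :: RTSStats  -- taken after a final minor GC
  }

sampleAround :: IO a -> IO Samples
sampleAround act = do
  performMinorGC            -- bring the starting statistics up to date
  s0 <- getRTSStats
  _  <- act
  s1 <- getRTSStats         -- GC counters here reflect only the action
  performMinorGC            -- refresh the remaining metrics
  s2 <- getRTSStats
  pure (Samples s0 s1 s2)

-- GC-related reading: diff against the end sample taken before the forced GC,
-- so the forced collection itself is not counted.
numGcs :: Samples -> Int64
numGcs s = fromIntegral (gcs (endPreGC s)) - fromIntegral (gcs (atStart s))

-- Other metrics: diff against the end sample taken after the forced GC.
bytesAllocated :: Samples -> Int64
bytesAllocated s =
  fromIntegral (allocated_bytes (endPostGC s))
    - fromIntegral (allocated_bytes (atStart s))

main :: IO ()
main = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "Run with +RTS -T to enable RTS statistics."
    else do
      s <- sampleAround (print (sum [1 .. 1000000 :: Int]))
      putStrLn ("GCs during action: " ++ show (numGcs s))
      putStrLn ("Bytes allocated:   " ++ show (bytesAllocated s))
```

Building with `-rtsopts` (or `-with-rtsopts=-T`) is one way to make the `-T` flag available.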
performGC to update garbage collection statistics. This improves the benchmark performance of fast functions on large objects.
Fix a bug in the ToJSON Measured instance which duplicated the mutator CPU seconds where GC CPU seconds should go.
Fix a bug in sample analysis which incorrectly accounted for overhead, causing runtime errors and invalid results. Accordingly, the buggy getOverhead function has been removed.
Fix a bug in Measurement.measure which inflated the reported time taken for
Reduce overhead of whnfIO by removing allocation from the central loops.
criterion was previously reporting the following statistics incorrectly on GHC 8.2 and later:
This has been fixed.
The type signature of runBenchmarkable has changed from:
Benchmarkable -> Int64 -> (a -> a -> a) -> (IO () -> IO a) -> IO a
to:
Benchmarkable -> Int64 -> (a -> a -> a) -> (Int64 -> IO () -> IO a) -> IO a
The added Int64 argument represents how many iterations are being timed.
Remove the deprecated
applyGCStats functions (which have been replaced by
Remove the deprecated
Config, as well as the corresponding
The header in generated JSON output mistakenly used the string "criterio". This has been corrected to "criterion".
Add error bars and zoomable navigation to generated HTML report graphs.
(Note that there have been reports that this feature can be somewhat unruly when using macOS and Firefox simultaneously. See https://github.com/flot/flot/issues/1554 for more details.)
Use a predetermined set of cycling colors for benchmark groups in HTML reports. This avoids a bug in earlier versions of criterion where benchmark group colors could be chosen that were almost completely white, which made them impossible to distinguish from the background.
- Add an -fembed-data-files flag. Enabling this option will embed the
criterion.cabal directly into the binary, producing a relocatable executable. (This has the downside of increasing the binary size significantly, so be warned.)
- Fix an issue where --help would display duplicate options.
Improve the error messages that are thrown when forcing nonexistent benchmark environments.
forceGC has not had any effect for several releases, and it will be removed in the next major
Important bugfix: versions 188.8.131.52 and 184.108.40.206 were incorrectly displaying the lower and upper bounds for measured values on HTML reports.
criterion now emits warnings if suspicious things happen during mustache template substitution when creating HTML reports. This can be useful when using custom templates with the
Criterion.Measurement. These are intended to replace GCStats (which has been deprecated in base and will be removed in GHC 8.4), as well as
applyGCStats, which have also been deprecated and will be removed in the next major
Add new matchers for the
- --match pattern, which matches by searching for a given substring in benchmark paths.
- --match ipattern, which is like --match pattern but case-insensitive.
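The behaviour of the two matchers can be pictured as plain substring tests. The following is a sketch of the idea only, not criterion's own code: the `Matcher` type, the `matches` function, and the benchmark paths are hypothetical.

```haskell
import Data.Char (toLower)
import Data.List (isInfixOf)

-- | Hypothetical stand-in for the two matchers described above.
data Matcher = Pattern | IPattern

-- | Does the given pattern select the given benchmark path?
matches :: Matcher -> String -> String -> Bool
matches Pattern  pat path = pat `isInfixOf` path
matches IPattern pat path = map toLower pat `isInfixOf` map toLower path

main :: IO ()
main = do
  let paths = ["Sorting/quicksort/1000", "sorting/mergesort/1000", "fib/20"]
  -- Case-sensitive: only the path containing "Sort" with a capital S.
  print (filter (matches Pattern "Sort") paths)
  -- Case-insensitive: both sorting paths are selected.
  print (filter (matches IPattern "Sort") paths)
```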
Criterion.toBenchmarkable, which behaves like the Benchmarkable constructor did prior to
Add support for per-run allocation/cleanup of the environment with
Add support for per-batch allocation/cleanup with
envWithCleanup, a variant of env with cleanup support.
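To show how env and its cleanup variant differ in use, here is a small sketch. The input list, the cleanup message, and the benchmark names are made up for illustration; it assumes `env` and `envWithCleanup` are imported from `Criterion.Main`.

```haskell
import Criterion.Main (bench, bgroup, defaultMain, env, envWithCleanup, whnf)

main :: IO ()
main = defaultMain
  [ -- env: build the input once, outside the timed code.
    env (pure [1 .. 100000 :: Int]) $ \xs ->
      bgroup "env" [ bench "sum" $ whnf sum xs ]

    -- envWithCleanup: the same, plus an action that runs once the
    -- benchmarks using the environment have finished.
  , envWithCleanup
      (pure [1 .. 100000 :: Int])
      (\_ -> putStrLn "environment released")
      (\xs -> bgroup "envWithCleanup" [ bench "maximum" $ whnf maximum xs ])
  ]
```

In a real benchmark the cleanup action would typically close a handle or delete a temporary file rather than print a message.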
criterion-report executable, which creates reports from previously created JSON files.
Unicode output is now correctly printed on Windows.
Add Safe Haskell annotations.
--json option for writing reports in JSON rather than binary format. Also: various bugfixes related to this.
code-page library to ensure that criterion prints out Unicode characters (like ², which criterion uses in reports) in a UTF-8-compatible code page on Windows.
Give an explicit implementation for
Binary Regression instance. This should fix sporadic criterion failures with older versions of
test-framework in the test suites.
Restore support for 32-bit Intel CPUs.
Restore build compatibility with GHC 7.4.
If a benchmark uses Criterion.env in a non-lazy way, and you try to use --list to list benchmark names, you’ll now get an understandable error message instead of something cryptic.
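A common way to keep the environment lazy is a lazy pattern match on the environment argument. The sketch below assumes `env` from `Criterion.Main`; `setupEnv` and the benchmark names are hypothetical.

```haskell
import Criterion.Main (bench, bgroup, defaultMain, env, whnf)

-- Hypothetical environment: two inputs of different sizes.
setupEnv :: IO ([Int], [Int])
setupEnv = pure ([1 .. 1000], [1 .. 100000])

main :: IO ()
main = defaultMain
  [ -- The ~ makes the tuple pattern lazy, so the benchmark structure can be
    -- inspected (for example, to list names) without forcing the environment.
    env setupEnv $ \ ~(small, big) ->
      bgroup "sum"
        [ bench "small" $ whnf sum small
        , bench "big"   $ whnf sum big
        ]
  ]
```

Dropping the `~` would make the lambda force the tuple as soon as the benchmark tree is examined, which is the non-lazy use the entry above refers to.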
We now flush stdout and stderr after printing messages, so that output is printed promptly even when piped (e.g. into a pager).
A new function runMode allows custom benchmarking applications to run benchmarks with control over the
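As an illustration, a custom driver could parse the standard command line itself and then hand the result to runMode. This sketch assumes `describe` and `defaultConfig` from `Criterion.Main.Options` and `execParser` from the optparse-applicative package; the benchmark and the log message are made up.

```haskell
import Criterion.Main (bench, runMode, whnf)
import Criterion.Main.Options (defaultConfig, describe)
import Options.Applicative (execParser)

main :: IO ()
main = do
  -- Parse criterion's usual command-line options.
  mode <- execParser (describe defaultConfig)
  -- Application-specific work can happen here, before benchmarking starts.
  putStrLn "Starting custom benchmark driver"
  runMode mode [ bench "length" $ whnf length [1 .. 10000 :: Int] ]
```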
Added support for Linux on non-Intel CPUs.
This version supports GHC 8.
The --only-run option for benchmarks is renamed to
The dependency on the either package has been dropped in favour of a dependency on transformers-compat. This greatly reduces the number of packages criterion depends on. This shouldn’t affect the user-visible API.
The documentation claimed that environments were created only when needed, but this wasn’t implemented. (gh-76)
The package now compiles with GHC 7.10.
On Windows with a non-Unicode code page, printing results used to cause a crash. (gh-55)
- Bump lower bound on optparse-applicative to 0.11 to handle yet more annoying API churn.
- Added a lower bound of 0.10 on the optparse-applicative dependency, as there were major API changes between 0.9 and 0.10.