perfect-hash-generator

Perfect minimal hashing implementation in native Haskell

https://github.com/kostmo/perfect-hash-generator#readme

Version on this page: 0.2.0.6
LTS Haskell 23.0: 1.0.0
Stackage Nightly 2024-12-09: 1.0.0
Latest on Hackage: 1.0.0


Apache-2.0 licensed and maintained by Karl Ostmo
This version can be pinned in stack with: perfect-hash-generator-0.2.0.6@sha256:eee069e99534d8f6e02fa6bdbd4069c4225ebb0112a0b114eb65fa57707c9f82,5229


A perfect hash function for a set S is a hash function that maps distinct elements in S to a set of integers, with no collisions. A minimal perfect hash function is a perfect hash function that maps n keys to n consecutive integers, e.g. the numbers from 0 to n-1.
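The definition above can be checked mechanically. The following is a minimal sketch, not part of this package: `isMinimalPerfect` is a hypothetical helper that tests whether a candidate function maps a list of distinct keys onto exactly 0 to n-1.

```haskell
import Data.List (sort)

-- | True when @h@ maps the given keys onto exactly 0 .. n-1, i.e. with no
-- collisions and no gaps. Assumes the keys themselves are distinct.
-- (Hypothetical helper for illustration; not exported by this package.)
isMinimalPerfect :: (k -> Int) -> [k] -> Bool
isMinimalPerfect h keys = sort (map h keys) == [0 .. length keys - 1]

main :: IO ()
main = do
  let keys = [10, 21, 32] :: [Int]
  print (isMinimalPerfect (`mod` 3) keys)  -- 10 -> 1, 21 -> 0, 32 -> 2
  print (isMinimalPerfect (`mod` 2) keys)  -- 10 and 32 collide at 0
```

For these three keys, `mod 3` happens to be a minimal perfect hash, while `mod 2` is not even a perfect one.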

In contrast with the PerfectHash package, which is a binding to a C-based library, this package is a fully-native Haskell implementation.

It is intended primarily for generating C code for embedded applications (compare to gperf). The output of this tool is a pair of arrays that can be included in generated C code for allocation-free hash tables.

Lookups also perform reasonably well in Haskell applications, though the package hasn't been benchmarked thoroughly against other data structures.

This implementation was adapted from Steve Hanov's Blog.

Usage

The library is written generically to hash both strings and raw integers according to the FNV-1a algorithm. Integers are split by octets before hashing.

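import Data.PerfectHash.Construction (createMinimalPerfectHash)
import qualified Data.HashMap.Strict as HashMap

-- Key/value pairs to store; the keys here are raw integers.
tuples = [
   (1000, 1)
 , (5555, 2)
 , (9876, 3)
 ]

-- Build the lookup table from a map of the pairs.
lookup_table = createMinimalPerfectHash $ HashMap.fromList tuples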
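The FNV-1a hashing and octet splitting mentioned at the start of this section can be sketched as follows. This is an illustration of the standard 32-bit FNV-1a algorithm, not the package's own code: the names `fnv1a`, `octets`, and `asciiOctets` are hypothetical, and the word size and least-significant-first octet order are assumptions that may not match what Data.PerfectHash.Hashing does.

```haskell
import Data.Bits (shiftR, xor)
import Data.Char (ord)
import Data.List (foldl')
import Data.Word (Word32, Word8)

-- Standard 32-bit FNV parameters.
fnvOffsetBasis, fnvPrime :: Word32
fnvOffsetBasis = 0x811c9dc5
fnvPrime       = 0x01000193

-- | One FNV-1a round: XOR the octet in, then multiply by the prime.
-- Word32 arithmetic wraps, which gives the required reduction mod 2^32.
fnv1aStep :: Word32 -> Word8 -> Word32
fnv1aStep h b = (h `xor` fromIntegral b) * fnvPrime

-- | FNV-1a over a sequence of octets.
fnv1a :: [Word8] -> Word32
fnv1a = foldl' fnv1aStep fnvOffsetBasis

-- | Split a 32-bit integer into four octets, least significant first.
-- (The octet order is an assumption made for this sketch.)
octets :: Word32 -> [Word8]
octets w = [fromIntegral (w `shiftR` s) | s <- [0, 8, 16, 24]]

-- | Octets of an ASCII string; other characters are truncated to 8 bits.
asciiOctets :: String -> [Word8]
asciiOctets = map (fromIntegral . ord)

main :: IO ()
main = do
  print (fnv1a (asciiOctets "a") == 0xe40c292c)  -- published FNV-1a test vector
  print (octets 1000)                            -- 1000 = 0x000003E8
  print (fnv1a (octets 1000))
```

Hashing strings and integers then differs only in how the key is turned into octets, which is what lets one routine serve both key types.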

Generation of C code based on the arrays in lookup_table is left as an exercise for the reader. The algorithm documentation in the Data.PerfectHash.Hashing and Data.PerfectHash.Lookup modules should help.
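As a starting point for such generated code, here is a rough sketch of a two-array lookup in the style of the Steve Hanov blog post this package was adapted from. Everything in it is an assumption for illustration: the `Table` type, the negative-number encoding for single-key buckets, and the toy hash are hypothetical stand-ins, and the package's real representation in Data.PerfectHash.Lookup may differ.

```haskell
-- Hypothetical two-array table; the package's real types may differ.
data Table v = Table
  { intermediate :: [Int]  -- per-bucket seed, or a negative direct-slot code
  , slots        :: [v]    -- the n stored values
  }

-- | Look up a key given a seeded hash function (seed 0 is the first level).
-- Lists keep the sketch dependency-free; real code would use arrays.
lookupWith :: (Int -> k -> Int) -> Table v -> k -> v
lookupWith hash (Table g vs) key
  | d < 0     = vs !! (negate d - 1)        -- single-key bucket: slot stored directly
  | otherwise = vs !! (hash d key `mod` n)  -- rehash with the bucket's seed
  where
    n = length vs
    d = g !! (hash 0 key `mod` n)

-- Toy seeded hash for non-negative keys, standing in for FNV-1a.
toyHash :: Int -> Int -> Int
toyHash seed k = k `div` (seed + 1)

-- Hand-built table for the keys 3, 6 and 7:
--   3 and 6 share first-level bucket 0, and seed 1 separates them;
--   7 is alone in bucket 1, so its slot (2) is stored as -(2) - 1 = -3.
demo :: Table String
demo = Table [1, -3, 0] ["six", "three", "seven"]

main :: IO ()
main = mapM_ (putStrLn . lookupWith toyHash demo) [3, 6, 7]
```

Both arrays are read-only and fixed in size, which is what makes the scheme suitable for allocation-free C tables. Note that in this sketch a key outside the original set still returns some stored value, so callers that need membership checks would have to store and compare the keys as well.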

See the hash-perfectly-strings-demo and hash-perfectly-ints-demo, as well as the test suite, for working examples.

$ stack build
$ stack exec hash-perfectly-strings-demo