full-text-search

In-memory full text search engine

Latest on Hackage:0.2.2.3

This package is not currently in any snapshots. If you're interested in using it, we recommend adding it to Stackage Nightly. Doing so will make builds more reliable, and allow stackage.org to host generated Haddocks.

BSD-3-Clause licensed by Duncan Coutts
Maintained by Duncan Coutts, Adam Gundry

An in-memory full text search engine library. It lets you run full-text queries on a collection of your documents.

Features:

  • Keyword queries and auto-complete/auto-suggest queries.

  • Can search over any type of "document". (You explain how to extract search terms from them.)

  • Supports documents with multiple fields (e.g. title, body)

  • Supports documents with non-term features (e.g. quality score, page rank)

  • Uses the state of the art BM25F ranking function

  • Adjustable ranking parameters (including field weights and non-term feature scores)

  • In-memory but quite compact. It does not keep a copy of your original documents.

  • Quick incremental index updates, making it possible to keep your text search in-sync with your data.

It is independent of the document type, so you have to write the document-specific parts: extracting search terms and any stop words, case-normalisation or stemming. This is quite easy using libraries such as tokenize and snowball.

The source package includes a demo to illustrate how to use the library. The demo is a simplified version of how the library is used in the hackage-server where it provides the backend for the package search feature.