# monad-memo

Memoization monad transformer https://github.com/EduardSergeev/monad-memo

LTS Haskell 11.10: | 0.4.1 |

Stackage Nightly 2018-05-23: | 0.4.1 |

Latest on Hackage: | 0.4.1 |

**Eduard Sergeev**

**eduard.sergeev@gmail.com**

#### Module documentation for 0.4.1

- Control
- Control.Monad
- Control.Monad.Memo
- Control.Monad.Trans

- Control.Monad
- Data

## Purpose

This package provides a convenient mechanism for adding memoization to Haskell monadic functions.

## Memoization

Memoization is a well known way to speed up function evaluation by caching previously calculated results and reusing them whenever a memoized function is needed to be evaluated with the same arguments again. It is usually associated with dynamic programming techiques.

## Overview

Even though it is possible to manually add memoization to the code which would benefit from it, this ad-hoc approach has usual ad-hoc drawbacks: code pollution, bugs, resistance to changes.
This package however encapsulates the underlying plumbing behind its simple monadic interface `MonadMemo`

with a single combinator `memo`

which, when applied to monadic function, turns it into "memoized" one.

The package offers various implementation of `MonadMemo`

(which differs in terms of performance and requirements) and it is possible to choose/change the implementation without affecting the main function code.
The range of supported implementations "out of box" is limited by the range of containers provided by the standard packages installed by Haskel Platform:
from default pure "fit them all" Data.Map to very fast but limiting vector.
It is also possible to plug-in a custom container (from a third-party library) and run existing monadic code with it.

The default implementation of `MonadMemo`

is also monad transformer so it can be "mixed" with other monads.
The package also provides the "memoized" versions of most standard monads found in mtl.

## Example of usage

A clasic example of function which greatelly benefits from memoization is a recursively defined Fibonacci number function. A plain version of this function can be written in the following way:

```
fib :: Integer -> Integer
fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)
```

which is very inefficient (impractical for `n>40`

).

We can rewrite this definition as a monad:

```
fibm :: Monad m => Integer -> m Integer
fibm 0 = return 0
fibm 1 = return 1
fibm n = do
f1 <- fibm (n-1)
f2 <- fibm (n-2)
return (f1+f2)
```

and even run it with `Identity`

monad with identical inefficiency:

```
evalFibmId :: Integer -> Integer
evalFibmId = runIdentity . fibm
```

But all we need to do to make this function "computable" for reasonable argument is to add memoization for both recursive branches with `memo`

combinator:

```
fibm :: (MonadMemo Integer Integer m) => Integer -> m Integer
fibm 0 = return 0
fibm 1 = return 1
fibm n = do
f1 <- memo fibm (n-1)
f2 <- memo fibm (n-2)
return (f1+f2)
```

then, to evaluate it with default `Data.Map`

based memoization cache we use the following "eval*" function:

```
evalFibm :: Integer -> Integer
evalFibm = startEvalMemo . fibm
```

Now the range of the arguments it can handle is limited only by `Integer`

computation complexity and stack memory limit.

## More Examples

### Slightly more complicated recursive function

Well known Ackerman function is a two arguments function.
To memoize two argument function `for2`

combinator can be used, giving the following generic code:

```
ackm :: (Num n, Ord n, MonadMemo (n, n) n m) => n -> n -> m n
ackm 0 n = return (n+1)
ackm m 0 = for2 memo ackm (m-1) 1
ackm m n = do
n1 <- for2 memo ackm m (n-1) -- 'for2' adapts 'memo' for 2-argument 'ackm'
for2 memo ackm (m-1) n1
evalAckm :: (Num n, Ord n) => n -> n -> n
evalAckm n m = startEvalMemo $ ackm n m
```

### Mutually recursive function memoization

This example is taken from paper "Monadic Memoization Mixins" by Daniel Brown and William R. Cook

Given the following mutually recursive function definitions:

```
-- 'f' depends on 'g'
f :: Int -> (Int,String)
f 0 = (1,"+")
f (n+1) =(g(n,fst(f n)),"-" ++ snd(f n))
-- 'g' depends on 'f'
g :: (Int, Int) -> Int
g (0, m) = m + 1
g (n+1,m) = fst(f n)-g(n,m)
```

How can we memoize both functions?

Lets try to just add memo for both functions:

```
-- WRONG: Will NOT compile!
fm 0 = return (1,"+")
fm (n+1) = do
fn <- memo fm n
gn <- memo gm (n , fst fn)
return (gn , "-" ++ snd fn)
gm (0,m) = return (m+1)
gm (n+1,m) = do
fn <- memo fm n
gn <- memo gm (n,m)
return $ fst fn - gn
```

GHC complains:

```
"Occurs check: cannot construct the infinite type: t = (t, v)
Expected type: t
Inferred type: (t, v)"
```

which is understandable since we are trying to use the same cache for storing "key-value" pairs of the functions of different types (`fm :: Int -> m (Int,String)`

and `gm :: (Int, Int) -> m Int`

).
Obviously, to cache both function we will need *two* caches (even if the types of the functions were identical, it's not very good idea to share the same cache).
And this is precisely what we have to do - use two memoization caches! The way to achieve it is to use *two* MemoT monad transformers one nested in another:

```
-- Memo-cache for 'fm'
type MemoF = MemoT Int (Int,String)
-- Memo-cache for 'gm'
type MemoG = MemoT (Int,Int) Int
-- | Combined stack of caches (transformers)
-- Stacks two 'MemoT' transformers in one monad to be used in both 'gm' and 'fm' monadic functions
type MemoFG = MemoF (MemoG Identity)
```

NB As usually with Haskell it isn't necessary to specify types here (or restrict them to MemoT combinations for the given example).

Then, a little bit of complication, since we use *two* caches now (one from the current monad transformer and another from the next, nested in the current) we need to use *memol X* set of functions: memol0, memol1 etc. Where

*X*specifies "sequential number" of the transformer in stack for a given cache (starting from the current). Here we use the current (0) and the next (1) for

`fm`

and `gm`

respectively:```
fm :: Int -> MemoFG (Int,String)
fm 0 = return (1,"+")
fm (n+1) = do
fn <- memol0 fm n
gn <- memol1 gm (n , fst fn)
return (gn , "-" ++ snd fn)
gm :: (Int,Int) -> MemoFG Int
gm (0,m) = return (m+1)
gm (n+1,m) = do
fn <- memol0 fm n
gn <- memol1 gm (n,m)
return $ fst fn - gn
evalAll = startEvalMemo . startEvalMemoT
-- | Function to run 'fm' computation
evalFm :: Int -> (Int, String)
evalFm = evalAll . fm
-- | Function to run 'gm' computation
evalGm :: (Int,Int) -> Int
evalGm = evalAll . gm
```

In fact we can also define 'gm' function in curried form:

```
fm2 :: Int -> MemoFG (Int,String)
fm2 0 = return (1,"+")
fm2 n = do
fn <- memol0 fm2 (n-1)
gn <- for2 memol1 gm2 (n-1) (fst fn)
return (gn , "-" ++ snd fn)
-- 2-argument function now
gm2 :: Int -> Int -> MemoFG Int
gm2 0 m = return (m+1)
gm2 n m = do
fn <- memol0 fm2 (n-1)
gn <- for2 memol1 gm2 (n-1) m -- 'for2' adapts 'memol1' for 2-argument gm2
return $ fst fn - gn
evalFm2 :: Int -> (Int, String)
evalFm2 = evalAll . fm2
evalGm2 :: Int -> Int -> Int
evalGm2 n m = evalAll $ gm2 n m
```

### Combining MemoT with other monads

Being monad transformer, memoization monad can be combined with most of existing monads. Here we mix it with MonadWriter:

```
fibmw :: (Num n, MonadWriter String m, MonadMemo n n m) => n -> m n
fibmw 0 = tell "0" >> return 0
fibmw 1 = tell "1" >> return 1
fibmw n = do
f1 <- memo fibmw (n-1)
f2 <- memo fibmw (n-2)
tell $ show n
return (f1+f2)
-- To run combined monad we need to sequence both 'run' functions:
evalFibmw :: Integer -> (Integer, String)
evalFibmw = startEvalMemo . runWriterT . fibmw
res = evalFibmw 6 -- > produces (8,"1021310241021351021310246")
```

## Custom pure cache container

From monad-memo version 0.3.0 it is possible to replace default Data.Map with another (more efficient?) implementation of internal cache-container as long as there is an instance of Data.MapLike defined for this container. The package currently defines these instances for Data.Map and Data.IntMap only.

For instance, should we decide to use unordered-containers all we need to do is to define the following instance for our container:

```
import Data.Hashable
import qualified Data.HashMap.Strict as H
instance (Eq k, Hashable k) => MapLike (H.HashMap k v) k v where
lookup = H.lookup
add = H.insert
```

then we just need to use `(evalMemoStateH.empty)`

instead of `startEvalMemo`

and our memoized function will be evaluated using `Hashmap`

as an internal container hosted in MemoState.
There is usually no need to do any modification to the memoized function itself.

## Mutable arrays and vectors as MonadCache

### Array-based memoization cache

version 0.4.0 adds ArrayCache: a new MonadCache implementation based on mutable arrays (inside IO or ST s monad). The main benefit of this `MonadCache`

is its performance: it can be an order of magnitude faser than standard `Data.Map`

-based cache. This is due to the fact that arrays have `O(1)`

lookup time and in-place mutable arrays also have `O(1)`

for updates (i.e. the cache `add`

operation).

Unfortunatelly you cannot always use this `MonadCache`

due to array's natural limitations:

- The key must be an instance of Ix typeclass
- The bounds of the array must be known (and specified) beforehand and array cannot be resized
- Array is a continious space of values, so if the key distribution is wide and sparse the memory will be wasted (or array may not even fit into memory)

But if the nature of your memoized function permits the usage of `ArrayCache`

you can make your code much more faster by simply switching from Map-based `MonadCache`

to `ArrayCache`

especially if the value type of your function can be "unboxed" (i.e. it is one of primitive types like `Int`

or `Double`

). "Unboxed" values are packed in unboxed arrays `UArray`

which offer even faster execution and are the most efficient in terms of memory usage.
Normally you don't have to modify your monadic function definition to run `ArrayCache`

-based memoization: just use appropriate `eval*`

or `run*`

function. For instance our canonical `fibm`

function:

```
fibm 0 = return 0
fibm 1 = return 1
fibm n = do
n1 <- memo fibm (n-1)
n2 <- memo fibm (n-2)
return (n1+n2)
```

can be run using `ST`

array of `Integers`

with the following function:

```
evalFibmSTA :: Integer -> Integer
evalFibmSTA n = runST $ evalArrayMemo (fibm n) (0,n)
```

here the `(0,n)`

argument defines the bounds of cache array.
Is it equally easy to use unboxed version of the array, but `Integer`

cannot be unboxed (it isn't primitive type), so lets just use `Double`

for our function result:

```
evalFibmSTUA :: Integer -> Double
evalFibmSTUA n = runST $ evalUArrayMemo (fibm n) (0,n)
```

Instead of `ST`

you can use `IO`

monad:

```
evalFibmIOA :: Integer -> IO Integer
evalFibmIOA n = evalArrayMemo (fibm n) (0,n)
evalFibmIOUA :: Integer -> IO Double
evalFibmIOUA n = evalUArrayMemo (fibm n) (0,n)
```

### Vector-based memoization cache

For even better performance use VectorCache and its flavours (unsafe version and dynamically expandable version) which are all based on very fast vector library.

Note however that this `MonadCache`

is even more limiting that `ArrayCache`

since `vector`

supports only `Int`

as an index.

The usage is very similar to `ArrayCache`

, but instead of range we need to specify the length of the vector:

```
evalFibmSTV :: Int -> Integer
evalFibmSTV n = runST $ evalVectorMemo (fibm n) n
evalFibmIOUV :: Int -> IO Double
evalFibmIOUV n = evalUVectorMemo (fibm n) n
```

Use "Expandable" version to avoid specifying length parameter:

```
import qualified Control.Monad.Memo.Vector.Expandable as VE
evalFibmSTVE :: Int -> Integer
evalFibmSTVE n = runST $ VE.startEvalVectorMemo (fibm n)
```

## Performance of different `MonadCache`

's:

The difference in performance for different `MonadCache`

's with Fibonacci function is demonstrated by this criterion test.
The test runs memoized Fibonacci function using the following caches:
* default Map-based
* State-based with Data.IntMap
* array and unboxed array based (Array and UArray)
* vector, unsafe vector and expandable vector (both boxed and unboxed vectors)

Full report can be found here.

## Custom mutable cache

It is also possible to use a mutable container as a `MonadCache`

not defined here.
For example if we wish to use mutable hash-table from hashtables package we can do so with the following code:

```
{-# LANGUAGE MultiParamTypeClasses, TypeSynonymInstances, FlexibleInstances #-}
import Data.Hashable
import Control.Monad.ST
import Control.Monad.Memo
import Control.Monad.Trans.Memo.ReaderCache
import qualified Data.HashTable.ST.Basic as H
newtype Container s k v = Container { toTable :: H.HashTable s k v }
type Cache s k v = ReaderCache (Container s k v)
instance (Eq k, Hashable k) => MonadMemo k v (Cache s k v (ST s)) where
{-# INLINE memo #-}
memo f k = do
c <- container
e <- lift $ H.lookup (toTable c) k
if isNothing e
then do
v <- f k
lift $ H.insert (toTable c) k v
return v
else return (fromJust e)
{-# INLINE fib1 #-}
fibm 0 = return 0
fibm 1 = return 1
fibm n = do
f1 <- memo fibm (n-1)
f2 <- memo fibm (n-2)
return (f1+f2)
evalFib :: Int -> Int
evalFib n = runST $ do
c <- H.new
evalReaderCache (fibm n) (Container c)
```

## References

- http://www.haskell.org/haskellwiki/Memoization
- "Monadic Memoization Mixins" by Daniel Brown and William R. Cook
- data-memocombinators
- "Fun with Type Functions" by Oleg Kiselyov, Ken Shan, and Simon Peyton Jones (see 3.1 - "Type-directed memoization")

## Changes

0.2.0

- Added a set of forX functions (for2, for3 and for4) to adapt curried function into uncuried MemoCache

0.3.0

- Added generalized MemoStateT transformer (to host any Data.MapLike cache-container)

- MemoT is now MemoStateT instantiated with Data.Map

0.4.0

- ArrayCache: mutable array-based MonadCache for top performance

- VectorCache (and flavours) vector-based MonadCache for even better performance

- Bug fixes in transformer implementations (Reader, State, RWS)

- Simple benchmark included

0.4.1

- Documentation fixes

- `Example` is renamed to `example` and is excluded from package's module hierarchy