ADPfusion

Efficient, high-level dynamic programming. https://github.com/choener/ADPfusion

Latest on Hackage:0.5.2.2

This package is not currently in any snapshots. If you're interested in using it, we recommend adding it to Stackage Nightly. Doing so will make builds more reliable, and allow stackage.org to host generated Haddocks.

BSD3 licensed by Christian Hoener zu Siederdissen, 2011-2016

Build Status

ADPfusion

generalized Algebraic Dynamic Programming Homepage

Ideas implemented here are described in a couple of papers:

  1. Christian Hoener zu Siederdissen
    Sneaking Around ConcatMap: Efficient Combinators for Dynamic Programming
    2012, Proceedings of the 17th ACM SIGPLAN international conference on Functional programming
    paper preprint
  2. Andrew Farmer, Christian Höner zu Siederdissen, and Andy Gill.
    The HERMIT in the stream: fusing stream fusion’s concatMap
    2014, Proceedings of the ACM SIGPLAN 2014 workshop on Partial evaluation and program manipulation.
    paper
  3. Christian Höner zu Siederdissen, Ivo L. Hofacker, and Peter F. Stadler.
    Product Grammars for Alignment and Folding
    2014, IEEE/ACM Transactions on Computational Biology and Bioinformatics. 99
    paper
  4. Christian Höner zu Siederdissen, Sonja J. Prohaska, and Peter F. Stadler
    Algebraic Dynamic Programming over General Data Structures
    2015, BMC Bioinformatics
    preprint
  5. Maik Riechert, Christian Höner zu Siederdissen, and Peter F. Stadler
    Algebraic dynamic programming for multiple context-free languages
    2015, submitted
    preprint

Introduction

ADPfusion combines stream-fusion (using the stream interface provided by the vector library) and type-level programming to provide highly efficient dynamic programming combinators.

From the programmers' viewpoint, ADPfusion behaves very much like the original ADP implementation http://bibiserv.techfak.uni-bielefeld.de/adp/ developed by Robert Giegerich and colleagues, though both combinator semantics and backtracking are different.

The library internals, however, are designed not only to speed up ADP by a large margin (which this library does), but also to provide further runtime improvements by allowing the programmer to switch over to other kinds of data structures with better time and space behaviour. Most importantly, dynamic programming tables can be strict, removing indirections present in lazy, boxed tables.

As an example, even rather complex ADP code tends to be completely optimized to loops that use only unboxed variables (Int# and others, indexIntArray# and others).

Completely novel (compared to ADP), is the idea of allowing efficient monadic combinators. This facilitates writing code that performs backtracking, or samples structures stochastically, among others things.

Installation

Follow the gADP examples.

Implementors Notes (if you want to extend ADPfusion)

  • The general inlining scheme is: (i) mkStream is {-# INLINE mkStream #-}, inner functions like mk, step, worker functions, and index-modifying functions get an {-# INLINE [0] funName #-}. Where there is no function to annotate, use delay_inline.

  • If you implement a new kind of memoizing table, like the dense Table.Array ones, you will have to implement mkStream code. When you hand to the left, the (i,j) indices and modify their extend (by, say, having NonEmpty table constaints), you have to delay_inline this (until inliner phase 0). Otherwise you will break fusion for mkStream.

  • Terminals that capture both, say indexing functions, and data should have no strictness annotations for the indexing function. This allows the code to be duplicated, then inlined. This improves performance a lot, because otherwise a function is created that performs these lookups, which has serious (50% slower or so) performance implications.

Contact

Christian Hoener zu Siederdissen
Leipzig University, Leipzig, Germany
choener@bioinf.uni-leipzig.de
http://www.bioinf.uni-leipzig.de/~choener/

Changes

0.5.2.2

  • Modified signature of Edge to make explicit the @From@ and @To@ nodes of the edge. Minor version bump, because @Edge@ is not official yet.
  • optimized table filling yields large improvements for linear languages

0.5.2.1

  • removed upper bounds

0.5.2.0

  • table filling fully inlined in the forward algorithm
  • experimental PrimBool operations
  • note: these optimizations are mostly of interest for linear languages, where is rule (or function call) is comparatively expensive
  • major re-arrangement of modules: import ADP.Fusion.Core for development of novel DP systems. Import ADP.Fusion.Point if you want to write a sequence alignment algorithm

0.5.1.0

  • improved table filling algorithm performance
  • some optimizations to terminal symbols

0.5.0.0

  • complete re-design of Inside / Outside / Complement handling based on phantom types
  • very liberal combination of multi-tape grammars
  • simplified index generation system (both faster, and easier to write new symbol now)

0.4.1.1

  • bugfix in Multitape Subword Index calculations (A.F.S.Indices.hs) [this one is quite spurious. I needed quickcheck to find a suitable minimal example where Pseudoknot.hs fails]

0.4.1.0

  • initial support for multi-context free grammars
  • mcfgs allow for interleaved syntactic variables
  • applications include: natural language modelling and pseudoknotted structures in RNA
  • the simplest formal language that requires this is: a^i b^j a^i b^j
  • the GenussFold library gives a simple example grammar

0.4.0.2

  • bugfixes

0.4.0.0

  • travic-ci integration
  • forward phase now operates on immutable tables that are internally thawed
  • resembles the behavior of Data.Vector.Generic.constructN
  • Empty needs to be bound to input. We require this as certain index structures have no natural notion of and empty index -- unless one provides additional information in the index

0.3.0.0

  • simplified boundary checking: sometimes gives performance gain (!) due to one loop variable less
  • optimized loop variable design following "The HERMIT in the Stream" (Farmer et al, 2014)
  • somewhat nicer programmer interfaces
  • automatic filling and freezing of tables
  • multiple example algorithms (build with -fexamples switch): - Needleman-Wunsch global alignment - RNA secondary structure prediction using simple base pair maximization
  • updated Table code to handle single-dim Subwords in a better way.
  • simplified backtracking

0.2.x.x

  • Streamlined interface: access everything via ADP.Fusion
  • /Multi-tape/ grammars can now be written and are fused
comments powered byDisqus