Simulate sequencing with different models for priming and errors

Latest on Hackage: 0.0


GPL licensed by Ketil Malde
Maintained by Ketil Malde

Simseq - SIMulate SEQuences. Yep, that's real creative.


Generates a bunch of sequences from a set of reference sequences.
For ESTs, NCBI's refseq transcripts are probably good choices.

The sequences are generated using a model that specifies
priming conditions and error generation.

Currently, this is not very refined. You can try:

simseq --model=sanger:n,d reference.fasta

Here n is the number of sequences to generate, with starting points
drawn from a uniform distribution, and d is the probability that a
sequence is in the forward direction. Or, even more experimentally:

simseq --model=454:n,d

This implements a completely unfounded and baseless model of 454/Roche
pyrosequencing. (Okay, actually based on a paper by Margulies et al., but
more data is definitely a requirement.)

A Solexa model will be added as soon as anybody says something definitive
about the error modes.

In any case, running out of sequence results in X's, indicating vector,
which I hope makes sense for Sanger, at least.
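
To make the sampling described above concrete, here is a minimal Haskell
sketch. It is not taken from the simseq source. It covers only the uniform
starting point, the forward direction with probability d, and the X padding
when the reference runs out. It does not model sequencing errors. The fixed
read length (len), the use of the reverse complement for the non-forward
direction, the small built-in random generator, and all function names are
assumptions made for illustration.

```haskell
import Data.List (unfoldr)

-- Hypothetical seed type for a tiny linear congruential generator,
-- used so that the sketch depends on nothing beyond 'base'.
type Seed = Integer

-- Return a pseudo-random value in [0,1) together with the next seed.
nextUniform :: Seed -> (Double, Seed)
nextUniform seed = (fromIntegral s' / 2147483648, s')
  where
    s' = (1103515245 * seed + 12345) `mod` 2147483648

-- Complement a nucleotide; anything unrecognised becomes 'N'.
complement :: Char -> Char
complement 'A' = 'T'
complement 'C' = 'G'
complement 'G' = 'C'
complement 'T' = 'A'
complement _   = 'N'

-- Take 'len' characters beginning at 'start', padding with 'X'
-- (indicating vector) once the reference runs out.
readAt :: Int -> Int -> String -> String
readAt len start ref = take len (drop start ref ++ repeat 'X')

-- One simulated read: uniform starting point, forward with probability d,
-- otherwise read from the reverse complement (an assumption of this sketch).
simulateRead :: Int -> Double -> String -> Seed -> (String, Seed)
simulateRead len d ref s0 = (readAt len start template, s2)
  where
    (u1, s1) = nextUniform s0
    (u2, s2) = nextUniform s1
    start    = floor (u1 * fromIntegral (length ref))
    template
      | u2 < d    = ref
      | otherwise = map complement (reverse ref)

-- Generate n reads by threading the seed through successive draws.
simulateReads :: Int -> Int -> Double -> String -> Seed -> [String]
simulateReads n len d ref = take n . unfoldr (Just . simulateRead len d ref)

main :: IO ()
main = mapM_ putStrLn (simulateReads 5 12 0.5 "ACGTACGTTAGCCGATAGGC" 42)
```

Reads that start close to the end of the reference come out with trailing
X's, matching the behaviour described above.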


The usual Cabal routine. Get a working GHC compiler, install
my 'bio' library, and do:

chmod +x Setup.hs
./Setup.hs configure
./Setup.hs build
sudo ./Setup.hs install

Mail me if it didn't work - <ketil at>.