Simulate sequencing with different models for priming and errors http://malde.org/~ketil/
Latest on Hackage: 0.0
Simseq - SIMulate SEQuences. Yep, that's real creative.
Generates a bunch of sequences from a set of reference sequences.
For ESTs, NCBI's RefSeq transcripts are probably good choices.
The sequences are generated using a model that specifies
priming conditions and error generation.
Currently, this is not very refined, but you can try:
simseq --model=sanger:n,d reference.fasta
Where n is the number of sequences to generate, with starting points
drawn from a uniform distribution, and d is the probability of a
sequence being in the forward direction. Or, even more experimentally:
Which implements a completely unfounded and baseless model of 454/Roche
pyrosequencing. (Okay, it is actually based on a paper by Margulies et
al., but more data is definitely required.)
Solexa support will be added as soon as anybody says something
definitive about the error modes.
In any case, running out of sequence results in X's, indicating vector,
which I hope makes sense for Sanger, at least.
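To make the sanger model's parameters concrete, here is a rough sketch in
Python. It is not simseq's actual code (simseq is written in Haskell) and
only illustrates the behaviour described above: n reads, uniformly drawn
starting points, forward direction with probability d, and X's once the
reference runs out. The function names and the read_length parameter are
hypothetical, treating the non-forward direction as the reverse
complement is an assumption, and no sequencing errors are modelled.

```python
import random

# Complement table for the reverse strand (assumption: a read that is not
# in the forward direction is taken from the reverse complement).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def parse_model(spec):
    """Split a model string of the form 'sanger:n,d' into (name, n, d)."""
    name, _, params = spec.partition(":")
    n_text, d_text = params.split(",")
    return name, int(n_text), float(d_text)


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def simulate_reads(reference, n, d, read_length, rng):
    """Draw n error-free reads of read_length from one reference sequence.

    read_length is a hypothetical parameter; the README does not say how
    simseq chooses read lengths.
    """
    reads = []
    for _ in range(n):
        # Forward direction with probability d, otherwise reverse strand.
        forward = rng.random() < d
        template = reference if forward else reverse_complement(reference)
        # Starting point drawn from a uniform distribution.
        start = rng.randrange(len(template))
        read = template[start:start + read_length]
        # Running out of sequence results in X's, indicating vector.
        reads.append(read.ljust(read_length, "X"))
    return reads


if __name__ == "__main__":
    name, n, d = parse_model("sanger:5,0.5")
    reference = "ACGTTGCAAGGCTTAC"
    for read in simulate_reads(reference, n, d, 10, random.Random(1)):
        print(read)
```

Passing a seeded random.Random makes a run repeatable, which helps when
comparing output across changes to a model.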
The usual Cabal routine. Get a working GHC compiler, install
my 'bio' library, and do:
chmod +x Setup.hs
sudo ./Setup.hs install
Mail me if it didn't work - <ketil at malde.org>.