ClustalParser

Library for parsing Clustal tools output

Version on this page: 1.1.4
Stackage Nightly 2017-08-19: 1.2.1
LTS Haskell 9.1: 1.2.1
Latest on Hackage: 1.2.1
GPL-3.0 licensed by Florian Eggenhofer
Maintained by egg@tbi.univie.ac.at

Module documentation for 1.1.4


Currently contains parsers and datatypes for: clustalw2, clustalo, mlocarna, cmalign

Clustal tools are multiple sequence alignment tools for biological sequences such as DNA, RNA, and protein. For more information on Clustal tools, refer to http://www.clustal.org/.

Mlocarna is a multiple sequence alignment tool for RNA sequences with secondary structure output. For more information on mlocarna refer to http://www.bioinf.uni-freiburg.de/Software/LocARNA/.

cmalign is a multiple sequence alignment program based on RNA family models that produces, among other formats, Clustal output. It is part of Infernal: http://infernal.janelia.org/.

Four types of output are parsed:

  • Alignment file (.aln):
      • Parsing with readClustalAlignment from filepath (Bio.ClustalParser)
      • Parsing with parseClustalAlignment from String (Bio.ClustalParser)
  • Alignment file with secondary structure (.aln):
      • Parsing with readStructuralClustalAlignment from filepath (Bio.ClustalParser)
      • Parsing with parseStructuralClustalAlignment from String (Bio.ClustalParser)
  • Summary (printed to STDOUT):
      • Parsing with readClustalSummary from filepath (Bio.ClustalParser)
      • Parsing with parseClustalSummary from String (Bio.ClustalParser)
  • Phylogenetic Tree (.dnd):
      • Parsing with readGraphNewick from filepath (Bio.Phylogeny)
      • Parsing with readGraphNewick from String (Bio.Phylogeny)
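
To illustrate the kind of input the alignment parsers above consume, here is a minimal sketch that reads the plain Clustal (.aln) layout using only base. It is independent of this library and is not how Bio.ClustalParser is implemented; the names Entry, fragments, parseAln and sample are hypothetical and exist only for this example. It assumes a header line starting with "CLUSTAL", blocks of "identifier sequence" lines, and conservation lines that start with whitespace.

```haskell
import Data.Char (isSpace)
import Data.List (isPrefixOf)

-- | One aligned entry: sequence identifier and its gapped sequence.
-- (Hypothetical type for this sketch, not the library's datatype.)
type Entry = (String, String)

-- | Collect (identifier, fragment) pairs from a Clustal file, skipping the
-- "CLUSTAL" header, blank lines, and conservation lines (which start with
-- whitespace). A trailing residue count after the fragment is ignored.
fragments :: String -> [Entry]
fragments input =
  [ (name, frag)
  | line <- lines input
  , not ("CLUSTAL" `isPrefixOf` line)
  , (c : _) <- [line]                  -- drops empty lines
  , not (isSpace c)                    -- drops conservation lines
  , (name : frag : _) <- [words line]
  ]

-- | Join the fragments of each identifier across alignment blocks,
-- keeping identifiers in order of first appearance.
parseAln :: String -> [Entry]
parseAln = foldl add [] . fragments
  where
    add acc (name, frag) =
      case lookup name acc of
        Nothing -> acc ++ [(name, frag)]
        Just _  -> [ (n, if n == name then s ++ frag else s) | (n, s) <- acc ]

-- | A small two-block alignment used as example input.
sample :: String
sample = unlines
  [ "CLUSTAL W (1.83) multiple sequence alignment"
  , ""
  , "seqA    ACGU-ACG"
  , "seqB    ACGUUACG"
  , "        **** ***"
  , ""
  , "seqA    GGCC"
  , "seqB    GG-C"
  , "        ** *"
  ]
```

Loaded in GHCi, parseAln sample yields one entry per identifier with the fragments of both blocks concatenated. For real data, the library functions listed above are the better choice, since they also cover structural alignments and summaries.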

Changes

1.2.1 Florian Eggenhofer <egg@cs.uni-freiburg.de> 06. February 2017
* Structural alignment parser now works with multiline consensus structures
1.2.0 Florian Eggenhofer <egg@cs.uni-freiburg.de> 07. January 2017
* Changed data structures for sequence identifiers and sequences to Data.Text
1.1.4 Florian Eggenhofer <egg@cs.uni-freiburg.de> 30. May 2016
* Fixed a bug in output of clustal alignments with sequence length of 60
1.1.3 Florian Eggenhofer <florian.eggenhofer@univie.ac.at> 4. July 2015
* Nucleotide sequences are now parsed by a unified function in line
with IUPAC nucleotide code
1.1.2 Florian Eggenhofer <florian.eggenhofer@univie.ac.at> 3. July 2015
* Included parsing of optional field in mlocarna clustal output
1.1.1 Florian Eggenhofer <florian.eggenhofer@univie.ac.at> 2. July 2015
* Added support for cmalign clustal output.
1.1.0 Florian Eggenhofer <florian.eggenhofer@univie.ac.at> 1. July 2015
* Added Hspec test-suite for parsing functions
* Added Show instances for ClustalAlignment and StructuralClustalAlignment
1.0.3 Florian Eggenhofer <florian.eggenhofer@univie.ac.at> 19. April 2015
* Added Y (pyrimidine) and R (purine) to sequence characters
1.0.2 Florian Eggenhofer <florian.eggenhofer@univie.ac.at> 19. March 2015
* Linebreaks are now filtered from structural alignment sequence identifiers
1.0.1 Florian Eggenhofer <florian.eggenhofer@univie.ac.at> 27. October 2014
* Fixed compiler warnings and updated documentation to mention structural clustal format
* Added -Wall and -O2 compiler options
* Added support for clustal alignments with secondary structure annotation