Library for parsing Clustal tools output

Version on this page: 1.2.3
LTS Haskell 12.26: 1.2.3
Stackage Nightly 2018-09-28: 1.2.3
Latest on Hackage: 1.2.3


GPL-3.0-only licensed by Florian Eggenhofer
Maintained by

Module documentation for 1.2.3

ClustalParser

Currently contains parsers and datatypes for: clustalw2, clustalo, mlocarna, cmalign

Clustal tools are multiple sequence alignment tools for biological sequences such as DNA, RNA, and protein. For more information on Clustal tools refer to

Mlocarna is a multiple sequence alignment tool for RNA sequences with secondary structure output. For more information on mlocarna refer to

cmalign is a multiple sequence alignment program based on RNA family models that produces, among other formats, clustal output. It is part of Infernal.

Four types of output are parsed:

  • Alignment file (.aln):
      • Parsing with readClustalAlignment from filepath (Bio.ClustalParser)
      • Parsing with parseClustalAlignment from String (Bio.ClustalParser)
  • Alignment file with secondary structure (.aln):
      • Parsing with readStructuralClustalAlignment from filepath (Bio.ClustalParser)
      • Parsing with parseStructuralClustalAlignment from String (Bio.ClustalParser)
  • Summary (printed to STDOUT):
      • Parsing with readClustalSummary from filepath (Bio.ClustalParser)
      • Parsing with parseClustalSummary from String (Bio.ClustalParser)
  • Phylogenetic Tree (.dnd):
      • Parsing with readGraphNewick from filepath (Bio.Phylogeny)
      • Parsing with parseGraphNewick from String (Bio.Phylogeny)
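
As a rough illustration of the filepath-based functions listed above, the following sketch reads an alignment file and prints the result. It assumes that readClustalAlignment returns its result wrapped in Either inside IO, with an error value that has a Show instance; that shape is an assumption and should be checked against the module documentation. The file name example.aln is a hypothetical placeholder. Printing the parsed value relies on the Show instance for ClustalAlignment mentioned in the 1.1.0 changelog entry.

```haskell
module Main where

import Bio.ClustalParser (readClustalAlignment)

main :: IO ()
main = do
  -- "example.aln" is a hypothetical path; substitute a real Clustal alignment file.
  result <- readClustalAlignment "example.aln"
  -- Assumption: the result is an Either whose Left side carries a showable parse error.
  case result of
    Left err        -> putStrLn ("Parsing failed: " ++ show err)
    Right alignment -> print alignment
```

The String-based variants (for example parseClustalAlignment) should work the same way without the IO step, which is convenient when the tool output is already in memory.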


1.2.3 Florian Eggenhofer <> 12. March 2018
* Fixed parsing of additional newline in Biopython's AlignIO output without conservation track
1.2.2 Florian Eggenhofer <> 07. March 2018
* Clustal parser can now parse alignments with missing consensus
1.2.1 Florian Eggenhofer <> 06. February 2017
* Structural alignment parser now works with multiline consensus structures
1.2.0 Florian Eggenhofer <> 07. January 2017
* Changed data structures for sequence identifiers and sequences to Data.Text
1.1.4 Florian Eggenhofer <> 30. May 2016
* Fixed a bug in output of clustal alignments with sequence length of 60
1.1.3 Florian Eggenhofer <> 4. July 2015
* Nucleotide sequences are now parsed by a unified function in line with the IUPAC nucleotide code
1.1.2 Florian Eggenhofer <> 3. July 2015
* Included parsing of optional field in mlocarna clustal output
1.1.1 Florian Eggenhofer <> 2. July 2015
* Added support for cmalign clustal output
1.1.0 Florian Eggenhofer <> 1. July 2015
* Added Hspec test-suite for parsing functions
* Added Show instances for ClustalAlignment and StructuralClustalAlignment
1.0.3 Florian Eggenhofer <> 19. April 2015
* Added Y (pyrimidine) and R (purine) to sequence characters
1.0.2 Florian Eggenhofer <> 19. March 2015
* Linebreaks are now filtered from structural alignment sequence identifiers
1.0.1 Florian Eggenhofer <> 27. October 2014
* Fixed compiler warnings and updated documentation to mention structural clustal format
* Added -Wall and -O2 compiler options
* Added support for clustal alignments with secondary structure annotation