hPDB

Protein Databank file format library

https://github.com/BioHaskell/hPDB

Version on this page: 1.2.0.1
LTS Haskell 9.21: 1.2.0.10
Stackage Nightly 2017-07-25: 1.2.0.9
Latest on Hackage: 1.5.0.0


BSD-3-Clause licensed by Michal J. Gajda
This version can be pinned in stack with: hPDB-1.2.0.1@sha256:f798658f944bb70b8e52e3e13fff984c5e6c5a696d850c3f695353e20fb81c01,5182

Module documentation for 1.2.0.1

  • Bio
    • Bio.PDB
      • Bio.PDB.EventParser
        • Bio.PDB.EventParser.ExperimentalMethods
        • Bio.PDB.EventParser.HelixTypes
        • Bio.PDB.EventParser.PDBEventParser
        • Bio.PDB.EventParser.PDBEventPrinter
        • Bio.PDB.EventParser.PDBEvents
        • Bio.PDB.EventParser.StrandSense
      • Bio.PDB.Fasta
      • Bio.PDB.IO
        • Bio.PDB.IO.OpenAnyFile
      • Bio.PDB.Iterable
      • Bio.PDB.Structure
        • Bio.PDB.Structure.Elements
        • Bio.PDB.Structure.List
        • Bio.PDB.Structure.Neighbours
        • Bio.PDB.Structure.Vector
      • Bio.PDB.StructureBuilder
      • Bio.PDB.StructurePrinter

hPDB

Haskell PDB file format parser.


The Protein Data Bank (PDB) file format is one of the most popular formats for storing biomolecular data.

This is a very fast parser:

  • under 7 s for 1HTQ, the largest entry in the PDB, which is over 70 MB,
  • compared with 11 s for RASMOL 2.7.5,
  • or 2 min 15 s for BioPython with the Python 2.6 interpreter.

It aims to deliver not only an event-based interface, but also a high-level data structure for manipulating data, in the spirit of BioPython’s PDB parser.
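The difference between the two interfaces can be illustrated with a minimal, self-contained sketch. It does not use hPDB itself: the types `AtomRec`, `Event`, `Residue` and `Chain` and the functions `parseLine`, `countAtoms` and `buildChains` are hypothetical names invented for this illustration, and the real types in `Bio.PDB.EventParser.PDBEvents` and `Bio.PDB.Structure` differ. Only the column positions of the ATOM record come from the PDB file format, and the sample records are hand-written rather than taken from a real entry.

```haskell
import Data.Char (isSpace)
import Data.Function (on)
import Data.List (foldl', groupBy)
import Text.Read (readMaybe)

-- Hypothetical, simplified atom record (NOT hPDB's own type).
data AtomRec = AtomRec
  { atomSerial :: Int
  , atomName   :: String
  , resName    :: String
  , chainId    :: Char
  , resSeq     :: Int
  , coords     :: (Double, Double, Double)
  } deriving (Show, Eq)

-- Event-based view: one event per input line.
data Event = AtomEvent AtomRec | Unparsed String
  deriving (Show, Eq)

-- High-level view: chains containing residues containing atoms.
data Residue = Residue { rName :: String, rSeq :: Int, rAtoms :: [AtomRec] }
  deriving Show
data Chain = Chain { cId :: Char, cResidues :: [Residue] }
  deriving Show

-- Extract 1-based, inclusive columns from a fixed-width line.
cols :: Int -> Int -> String -> String
cols from to = take (to - from + 1) . drop (from - 1)

trim :: String -> String
trim = dropWhile isSpace . reverse . dropWhile isSpace . reverse

-- Turn one line into an event, using the ATOM/HETATM column layout.
parseLine :: String -> Event
parseLine l
  | cols 1 6 l `elem` ["ATOM  ", "HETATM"] = maybe (Unparsed l) AtomEvent atom
  | otherwise = Unparsed l
  where
    field a b = trim (cols a b l)
    atom = do
      serial <- readMaybe (field 7 11)
      rseq   <- readMaybe (field 23 26)
      x      <- readMaybe (field 31 38)
      y      <- readMaybe (field 39 46)
      z      <- readMaybe (field 47 54)
      chain  <- case cols 22 22 l of
                  [c] -> Just c
                  _   -> Nothing
      pure (AtomRec serial (field 13 16) (field 18 20) chain rseq (x, y, z))

-- Event-style consumer: a strict fold that never builds a structure.
countAtoms :: [Event] -> Int
countAtoms = foldl' step 0
  where
    step n (AtomEvent _) = n + 1
    step n _             = n

-- Structure-style consumer: group consecutive atoms by chain, then residue.
buildChains :: [Event] -> [Chain]
buildChains events =
  [ Chain (chainId a) (toResidues grp)
  | grp@(a : _) <- groupBy ((==) `on` chainId) atoms ]
  where
    atoms = [a | AtomEvent a <- events]
    toResidues chainAtoms =
      [ Residue (resName r) (resSeq r) grp
      | grp@(r : _) <- groupBy ((==) `on` resSeq) chainAtoms ]

-- Hand-written sample records (illustrative only, not a real PDB entry).
sampleLines :: [String]
sampleLines =
  [ "REMARK   1 hand-written sample"
  , "ATOM      1  N   THR A   1      17.047  14.099   3.625  1.00 13.79"
  , "ATOM      2  CA  THR A   1      16.967  12.784   4.338  1.00 10.80"
  , "ATOM      3  N   THR A   2      15.685  12.755   5.133  1.00  8.31"
  , "ATOM      4  N   GLY B   1      -1.250   0.500  10.000  1.00 20.00"
  ]
```

In this sketch, `countAtoms (map parseLine sampleLines)` consumes the events one by one, while `buildChains (map parseLine sampleLines)` materialises the hierarchy first. The event-style fold needs no intermediate structure, which is one reason an event interface is attractive for very large files.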

Details on official releases are on Hackage.

This package is also a part of Stackage - a stable subset of Hackage.

Projects for the future:

Please let me know if you would be willing to push the project further.

In particular, one may consider these features:

  • Migrate away from text-format, since it causes portability trouble and slows down printing.
  • Migrate from AC-Vector to another vector library:
    • vector-space
    • or linear
  • Use lens to facilitate access to the data structures.
    • torsion angles within protein/RNA chain.
  • Add Octree to the default data structure (with automatic updates).
  • Write a combinator library for generic fast parsing.
  • Check whether GHC 7.8 improved the efficiency of fixed-point arithmetic, since PDB coordinates have a dynamic range of only ~2^20, with a smallest step of 0.001.
  • Implement basic spatial operations: RMS superposition (with SVD) and affine transforms on a substructure.
  • Add class-based wrappers exposing the Structure-Model-Chain-Residue-Atom interface, possibly wrapping Repa/Accelerate arrays for fast computation.
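The fixed-point idea in the list above can be sketched as follows. This is only an illustration under stated assumptions, not how hPDB stores coordinates: the type `Fixed3` and the functions `parseFixed`, `renderFixed`, `toDouble`, `addF` and `negateF` are hypothetical names. The sketch relies on the fact that PDB coordinate columns are eight characters wide with three decimals, so every value is a whole number of 0.001 steps that fits comfortably in a 32-bit integer.

```haskell
import Data.Char (isDigit)
import Data.Int (Int32)

-- Hypothetical fixed-point coordinate: an integer count of 0.001 steps.
newtype Fixed3 = Fixed3 Int32
  deriving (Eq, Ord, Show)

-- Addition and negation are plain integer operations, hence exact.
addF :: Fixed3 -> Fixed3 -> Fixed3
addF (Fixed3 a) (Fixed3 b) = Fixed3 (a + b)

negateF :: Fixed3 -> Fixed3
negateF (Fixed3 a) = Fixed3 (negate a)

-- Convert to floating point only when a computation really needs it.
toDouble :: Fixed3 -> Double
toDouble (Fixed3 n) = fromIntegral n / 1000

-- Parse a right-justified field such as "  -1.250" without using Double.
-- Accepts an optional minus sign, digits, a dot and exactly three decimals.
parseFixed :: String -> Maybe Fixed3
parseFixed raw =
  case dropWhile (== ' ') raw of
    '-' : rest -> negateF <$> unsigned rest
    rest       -> unsigned rest
  where
    unsigned s = case break (== '.') s of
      (whole, '.' : frac)
        | not (null whole), all isDigit whole
        , length frac == 3, all isDigit frac
        -> Just (Fixed3 (read whole * 1000 + read frac))
      _ -> Nothing

-- Render back into an eight-character, three-decimal field.
renderFixed :: Fixed3 -> String
renderFixed (Fixed3 n) = padLeft ' ' 8 (sign ++ show q ++ "." ++ padLeft '0' 3 (show r))
  where
    (q, r) = abs (fromIntegral n :: Integer) `quotRem` 1000
    sign   = if n < 0 then "-" else ""

-- Left-pad a string with the given character up to the given width.
padLeft :: Char -> Int -> String -> String
padLeft c width s = replicate (width - length s) c ++ s
```

With this representation, parsing and printing round-trip exactly, and sums such as 0.100 + 0.200 come out as exactly 0.300, which is not guaranteed with `Double`. Whether it is also faster depends on the compiler and would have to be measured.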