Welcome to the ppALIGN homepage!
What is it?
In bioinformatics, score-based alignments are widely used to compare biological sequences and to search a query sequence against large molecular databases. Such algorithms are usually fast thanks to sophisticated heuristics, but the underlying scoring model lacks a statistical description of the reliability of the reported alignments. In particular, close to gaps or in low-complexity regions of the alignment, a huge number of alternative alignments arise, which reduces the certainty about the optimal alignment.
ppALIGN is a set of tools to analyse the accuracy of biological sequence alignments by means of the posterior distribution. The software uses hidden Markov model techniques to compute the position-wise reliability of user-supplied alignments (see the example below).
[Image: example ppALIGN output]
In this output, the confidence (posterior probability) in the alignment is indicated by vertical bars. The colored region corresponds to a low-confidence region. Clicking on this colored region cycles through alternative alignments for the region, with the color indicating the kind of alignment: red for the optimal alignment, green for the maximal marginal posterior probability alignment, and purple for sampled alignments.
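To make the idea of a position-wise posterior concrete, here is a minimal sketch of a forward-backward computation. It is not ppALIGN's code or model. It assumes a toy global alignment with a linear gap penalty, in which every alignment gets a Boltzmann weight exp(score / temperature). The function names, the scoring values, and the `temperature` parameter are all illustrative assumptions.

```python
import math


def forward_backward(x, y, match=1.0, mismatch=-1.0, gap=-2.0, temperature=1.0):
    """Forward and backward sums over all global alignments of x and y.

    Toy model (an assumption, not ppALIGN's): linear gap penalty and
    Boltzmann weights exp(score / temperature) for every alignment step.
    """
    n, m = len(x), len(y)
    gap_w = math.exp(gap / temperature)

    def pair_w(a, b):
        return math.exp((match if a == b else mismatch) / temperature)

    # F[i][j]: total weight of all alignments of x[:i] with y[:j].
    F = [[0.0] * (m + 1) for _ in range(n + 1)]
    F[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                F[i][j] += F[i - 1][j - 1] * pair_w(x[i - 1], y[j - 1])
            if i > 0:
                F[i][j] += F[i - 1][j] * gap_w
            if j > 0:
                F[i][j] += F[i][j - 1] * gap_w

    # B[i][j]: total weight of all alignments of x[i:] with y[j:].
    B = [[0.0] * (m + 1) for _ in range(n + 1)]
    B[n][m] = 1.0
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i < n and j < m:
                B[i][j] += B[i + 1][j + 1] * pair_w(x[i], y[j])
            if i < n:
                B[i][j] += B[i + 1][j] * gap_w
            if j < m:
                B[i][j] += B[i][j + 1] * gap_w

    return F, B, pair_w, gap_w


def column_posteriors(x_aln, y_aln, **scoring):
    """Posterior probability of each column of a given gapped alignment.

    Assumes both rows have equal length and no column is gap-against-gap.
    """
    x = x_aln.replace("-", "")
    y = y_aln.replace("-", "")
    F, B, pair_w, gap_w = forward_backward(x, y, **scoring)
    z = F[len(x)][len(y)]  # partition function: weight of all alignments

    probs = []
    i = j = 0  # residues of x and y consumed so far
    for a, b in zip(x_aln, y_aln):
        if a != "-" and b != "-":
            step, ni, nj = pair_w(a, b), i + 1, j + 1
        elif b == "-":
            step, ni, nj = gap_w, i + 1, j
        else:
            step, ni, nj = gap_w, i, j + 1
        # Weight of all alignments containing this exact step, divided by z.
        probs.append(F[i][j] * step * B[ni][nj] / z)
        i, j = ni, nj
    return probs


if __name__ == "__main__":
    for column, p in enumerate(column_posteriors("CAAG", "CA-G")):
        print(column, f"{p:.3f}")
```

In this toy case the gap can be placed against either of the two A residues at the same score, so the two middle columns come out less certain than the flanking C/C and G/G columns. This is the kind of low-confidence region the vertical bars are meant to expose.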
The ppALIGN package contains two standalone programs and a C++ library:
- The program ppalign computes the posterior probabilities for a given alignment.
- The program ppblast takes the structured output of BLAST (XML format) and determines the posterior probabilities for each hit.
- The library contains the core algorithms and an interface to write new modules that can be used with the programs.
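As an illustration of the kind of input ppblast works from, the sketch below pulls the aligned query and subject strings for each hit out of a BLAST XML report. It is not ppblast's parser. It assumes the classic BLAST XML layout with `Hit`, `Hit_id`, `Hsp`, `Hsp_qseq` and `Hsp_hseq` elements. The helper name `iter_hsp_alignments` and the embedded report are made up for the example.

```python
import xml.etree.ElementTree as ET

# Made-up, heavily trimmed report; real BLAST XML carries many more fields.
SAMPLE_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_id>toy_hit_1</Hit_id>
          <Hit_hsps>
            <Hsp>
              <Hsp_qseq>CAAG</Hsp_qseq>
              <Hsp_hseq>CA-G</Hsp_hseq>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def iter_hsp_alignments(xml_text):
    """Yield (hit_id, gapped_query, gapped_subject) for every HSP."""
    root = ET.fromstring(xml_text)
    for hit in root.iter("Hit"):
        hit_id = hit.findtext("Hit_id")
        for hsp in hit.iter("Hsp"):
            yield hit_id, hsp.findtext("Hsp_qseq"), hsp.findtext("Hsp_hseq")


if __name__ == "__main__":
    for hit_id, qseq, hseq in iter_hsp_alignments(SAMPLE_XML):
        print(hit_id)
        print(qseq)
        print(hseq)
```

Each yielded pair of gapped strings is one user-supplied alignment in the sense used above, which is the natural unit for a per-hit posterior analysis.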
Credits
We borrowed source code from:
- NTL -- A Library for Doing Number Theory
- Memory-efficient dynamic programming backtrace and pairwise local sequence alignment, L.A. Newberg, Bioinformatics 2008
- Templatized C++ Command Line Parser Library
References
- R. Durbin, S. Eddy, A. Krogh, G. Mitchison, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids
- S. Miyazawa, A reliable sequence alignment method based on probabilities of residue correspondences, Protein Eng., 8 (1995)
- L.A. Newberg, Memory-efficient dynamic programming backtrace and pairwise local sequence alignment, Bioinformatics, 24 (2008)
- S. Wolfsheimer, A. K. Hartmann, and G. Nuel, ppALIGN: posterior probabilities for score based alignments, in preparation


