Lysholm F, Andersson B, Persson B
BMC Bioinformatics 12 (-) 293 [2011-07-19; online 2011-07-19]
High-throughput pyrosequencing (454 sequencing) is the major sequencing platform for producing long-read, high-throughput data. While most other sequencing techniques produce read errors that mainly resemble substitutions, pyrosequencing produces errors that mainly resemble gaps. These errors are detected less efficiently by most conventional alignment programs and may lead to inaccurate alignments. We suggest a novel algorithm for calculating the optimal local alignment, which uses flowpeak information to improve alignment accuracy. Flowpeak information can be recovered from a 454 sequencing run by interpreting the binary SFF file format. This novel algorithm has been implemented in a program named FAAST (Flow-space Assisted Alignment Search Tool). We present and discuss the results of simulations showing that FAAST, through the use of the novel algorithm, can gain several percentage points of accuracy compared to Smith-Waterman-Gotoh alignments, depending on the 454 data quality. Furthermore, through an efficient multi-thread-aware implementation, FAAST is able to perform these high-quality alignments at high speed. The tool is available at http://www.ifm.liu.se/bioinfo/
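To illustrate why pyrosequencing errors resemble gaps rather than substitutions, the following is a minimal sketch of how flow values might be turned into base calls. It is not taken from FAAST. The function name `call_bases`, the cyclic flow order `TACG`, and the example flow values are assumptions made for illustration only. Each flow value is rounded to the nearest homopolymer length, so a noisy value changes the length of a run instead of replacing one base with another.

```python
# Illustrative sketch only; not the FAAST algorithm.
# Assumption: nucleotides are flowed cyclically in the order T, A, C, G.
FLOW_ORDER = "TACG"


def call_bases(flow_values, flow_order=FLOW_ORDER):
    """Convert per-flow signal intensities into a base sequence.

    Each intensity is rounded to the nearest non-negative integer,
    which is taken as the homopolymer length for that flow.
    """
    bases = []
    for i, value in enumerate(flow_values):
        nucleotide = flow_order[i % len(flow_order)]
        # Round half up, and clamp at zero for noisy negative signals.
        run_length = max(0, int(value + 0.5))
        bases.append(nucleotide * run_length)
    return "".join(bases)


# Hypothetical flow values: the third flow (C) is close to 3 in the clean
# read but drifts to 2.45 in the noisy read.
clean = [1.02, 0.05, 2.96, 0.98]
noisy = [1.02, 0.05, 2.45, 0.98]

print(call_bases(clean))  # TCCCG
print(call_bases(noisy))  # TCCG
```

In this sketch the noisy read loses one C from the homopolymer run, which a sequence-space aligner sees as a gap. An aligner with access to the flow value 2.45 could instead treat the run length as uncertain between two and three.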
PubMed 21771335
DOI 10.1186/1471-2105-12-293
Crossref 10.1186/1471-2105-12-293
pii: 1471-2105-12-293
pmc: PMC3228549