Strobealign: flexible seed size enables ultra-fast and accurate read alignment.

Sahlin K

Genome Biol. 23 (1) 260 [2022-12-15; online 2022-12-15]

Read alignment is often the computational bottleneck in analyses. Recently, several advances have been made on seeding methods for fast sequence comparison. We combine two such methods, syncmers and strobemers, in a novel seeding approach for constructing dynamic-sized fuzzy seeds and implement the method in a short-read aligner, strobealign. The seeding is fast to construct and effectively reduces repetitiveness in the seeding step, as shown using a novel metric E-hits. strobealign is several times faster than traditional aligners at similar and sometimes higher accuracy while being both faster and more accurate than more recently proposed aligners for short reads of lengths 150nt and longer. Availability: https://github.com/ksahlin/strobealign.
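The seeding idea summarized in the abstract can be illustrated with a minimal sketch. It first samples k-mers as open syncmers, then links each sampled k-mer to one downstream sampled k-mer, so the resulting seed spans a variable distance. This is not strobealign's implementation. The function names (`open_syncmers`, `link_strobes`), the CRC32 hash, the parameters (`k`, `s`, `t`, `w_min`, `w_max`) and the linking rule are all assumptions chosen for illustration.

```python
import zlib


def h(s: str) -> int:
    """Deterministic 32-bit hash (CRC32); an assumed stand-in for a real hash."""
    return zlib.crc32(s.encode("ascii"))


def open_syncmers(seq: str, k: int, s: int, t: int):
    """Return (position, k-mer) pairs whose minimal-hash s-mer starts at offset t."""
    out = []
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        smer_hashes = [h(kmer[j:j + s]) for j in range(k - s + 1)]
        # index() returns the leftmost minimum, which breaks ties deterministically
        if smer_hashes.index(min(smer_hashes)) == t:
            out.append((i, kmer))
    return out


def link_strobes(syncmers, w_min: int, w_max: int):
    """Pair each syncmer with one of the syncmers w_min..w_max steps downstream.

    The partner is the candidate minimizing a hash of the concatenated pair
    (an assumed linking rule). Each seed is (start, end, key), where the span
    end - start varies with the distance between the two strobes.
    """
    seeds = []
    for i, (p1, k1) in enumerate(syncmers):
        window = syncmers[i + w_min:i + w_max + 1]
        if not window:
            continue  # too close to the end of the sequence to link
        p2, k2 = min(window, key=lambda c: h(k1 + c[1]))
        seeds.append((p1, p2 + len(k2), k1 + k2))
    return seeds


# Illustrative usage on a made-up sequence with made-up parameters
seq = "ACGTTGCATGCCGATAGCTAGGCTTAACGGATCCGATTACGCTAGGTTCAGCA"
syncs = open_syncmers(seq, k=8, s=4, t=2)
seeds = link_strobes(syncs, w_min=1, w_max=3)
```

Because the second strobe is chosen among several downstream candidates, two seeds can share the same first strobe position pattern yet cover different spans, which is what makes the seed size dynamic in this sketch.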

Kristoffer Sahlin

SciLifeLab Fellow

PubMed 36522758

DOI 10.1186/s13059-022-02831-7

Crossref 10.1186/s13059-022-02831-7

pmc: PMC9753264
pii: 10.1186/s13059-022-02831-7

