Metaxa2 Database Builder: enabling taxonomic identification from metagenomic or metabarcoding data using any genetic marker.

Bengtsson-Palme J, Richardson RT, Meola M, Wurzbacher C, Tremblay ÉD, Thorell K, Kanger K, Eriksson KM, Bilodeau GJ, Johnson RM, Hartmann M, Nilsson RH

Bioinformatics 34 (23) 4027-4033 [2018-12-01; online 2018-06-19]

Correct taxonomic identification of DNA sequences is central to studies of biodiversity using both shotgun metagenomic and metabarcoding approaches. However, no genetic marker gives sufficient performance across all the biological kingdoms, hampering studies of taxonomic diversity in many groups of organisms. This has led to the adoption of a range of genetic markers for DNA metabarcoding. While many taxonomic classification software tools can be re-trained on these genetic markers, they are often designed with assumptions that impair their utility on genes other than the SSU and LSU rRNA. Here, we present an update to Metaxa2 that enables the use of any genetic marker for taxonomic classification of metagenome and amplicon sequence data. We evaluated the Metaxa2 Database Builder on 11 commonly used barcoding regions and found that while there are wide differences in performance between different genetic markers, our software performs satisfactorily provided that the input taxonomy and sequence data are of high quality. The software is freely available on the web as part of the Metaxa2 package. Supplementary data are available at Bioinformatics online.

DDLS Fellow

Johan Bengtsson-Palme

PubMed 29912385

DOI 10.1093/bioinformatics/bty482

Crossref 10.1093/bioinformatics/bty482

pii: 5038464
pmc: PMC6247927
