Evaluating and optimizing the performance of software commonly used for the taxonomic classification of DNA metabarcoding sequence data.

Richardson RT, Bengtsson-Palme J, Johnson RM

Mol Ecol Resour 17 (4) 760-769 [2017-07; online 2016-11-21]

The taxonomic classification of DNA sequences has become a critical component of numerous ecological research applications; however, few studies have evaluated the strengths and weaknesses of commonly used sequence classification approaches. Further, the methods and software available for sequence classification are diverse, creating an environment in which it may be difficult to determine the best course of action and the trade-offs made using different classification approaches. Here, we provide an in silico evaluation of three DNA sequence classifiers, the RDP Naïve Bayesian Classifier, RTAX and UTAX. Further, we discuss the results, merits and limitations of both the classifiers and our method of classifier evaluation. Our methods of comparison are simple, yet robust, and will provide researchers a methodological and conceptual foundation for making such evaluations in a variety of research situations. Generally, we found a considerable trade-off between accuracy and sensitivity for the classifiers tested, indicating a need for further improvement of sequence classification tools.

DDLS Fellow

Johan Bengtsson-Palme

PubMed 27797448

DOI 10.1111/1755-0998.12628

Crossref 10.1111/1755-0998.12628

