How good are pathogenicity predictors in detecting benign variants?

Niroula A, Vihinen M

PLoS Comput Biol 15 (2) e1006481 [2019-02-00; online 2019-02-11]

Computational tools are widely used for interpreting variants detected in sequencing projects. The choice of these tools is critical for reliable variant impact interpretation for precision medicine and should be based on systematic performance assessment. The performance of the methods varies widely in different performance assessments, for example due to the contents and sizes of test datasets. To address this issue, we obtained 63,160 common amino acid substitutions (allele frequency ≥1% and <25%) from the Exome Aggregation Consortium (ExAC) database, which contains variants from 60,706 genomes or exomes. We evaluated the specificity, the capability to detect benign variants, for 10 variant interpretation tools. In addition to overall specificity of the tools, we tested their performance for variants in six geographical populations. PON-P2 had the best performance (95.5%) followed by FATHMM (86.4%) and VEST (83.5%). While these tools had excellent performance, the poorest method predicted more than one third of the benign variants to be disease-causing. The results allow choosing reliable methods for benign variant interpretation, for both research and clinical purposes, as well as provide a benchmark for method developers.
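
As an illustration of the metric described in the abstract: when every variant in the test set is assumed benign, specificity reduces to the fraction of variants a tool predicts as benign. The sketch below is not the authors' pipeline. The `Variant` record, the `predicted_benign` field, and the toy data are hypothetical. Only the allele frequency window (≥1% and <25%) comes from the abstract.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    allele_frequency: float  # population allele frequency as a fraction (0..1)
    predicted_benign: bool   # call made by a hypothetical prediction tool


def is_common(v: Variant, low: float = 0.01, high: float = 0.25) -> bool:
    """Keep variants with allele frequency >= 1% and < 25%."""
    return low <= v.allele_frequency < high


def specificity(variants: list[Variant]) -> float:
    """Fraction of common (assumed benign) variants predicted benign."""
    common = [v for v in variants if is_common(v)]
    if not common:
        raise ValueError("no variants in the common allele-frequency range")
    true_negatives = sum(v.predicted_benign for v in common)
    return true_negatives / len(common)


# Toy data, invented for illustration only.
variants = [
    Variant(0.05, True),
    Variant(0.10, True),
    Variant(0.02, False),   # common variant wrongly called disease-causing
    Variant(0.20, True),
    Variant(0.30, True),    # excluded: allele frequency >= 25%
    Variant(0.001, False),  # excluded: allele frequency < 1%
]

print(specificity(variants))  # 3 of the 4 common variants are called benign
```

Because the dataset contains no known pathogenic variants, this measures specificity only and says nothing about a tool's sensitivity.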

Abhishek Niroula

DDLS Fellow

PubMed 30742610

DOI 10.1371/journal.pcbi.1006481

Crossref 10.1371/journal.pcbi.1006481

pmc: PMC6386394
pii: PCOMPBIOL-D-18-01511
