Kang W, Eldfjell Y, Fromm B, Estivill X, Biryukova I, Friedländer MR
Genome Biol. 19 (1) 213 [2018-12-04; online 2018-12-04]
We present here miRTrace, the first algorithm to trace microRNA sequencing data back to their taxonomic origins. This is a challenge with profound implications for forensics, parasitology, food control, and research settings where cross-contamination can compromise results. miRTrace accurately (>99%) assigns real and simulated data to 14 important animal and plant groups, sensitively detects parasitic infection in mammals, and discovers the primate origin of single cells. Applying our algorithm to over 700 public datasets, we find evidence that over 7% are cross-contaminated and present a novel solution to clean these computationally, even after sequencing has occurred. miRTrace is freely available at https://github.com/friedlanderlab/mirtrace.