Description
We spotted an issue with misclassification, particularly within the Brucella genus. We generated FASTQ files containing simulated ONT reads for 4 Brucella species and analysed them with Metamaps using a genus-level database constructed from all of the available RefSeq genomes for all Brucella species.
The results looked great when Metamaps was run on FASTQ files each containing just one species: 100% of reads were correctly classified for 3 of the species and 99.95% for the fourth.
However, when we concatenated the 4 FASTQ files so that the input file contained all 4 of our Brucella species, the percentage of correctly classified reads dropped to as low as 1.18% for one of the species, with 39.9%, 46.93%, and 99.94% for the others.
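For clarity on what "percent of correctly classified reads" means here, the following is a minimal sketch of the per-species calculation, not necessarily the exact script used in the pipeline. It assumes two hypothetical mappings: `truth` (read ID to the species the read was simulated from) and `predicted` (read ID to the species assigned by the classifier). The function name and the toy labels are illustrative only.

```python
from collections import defaultdict


def percent_correct_by_species(truth, predicted):
    """Return {species: percent of its reads assigned to that same species}.

    truth:     dict of read ID -> species the read was simulated from
    predicted: dict of read ID -> species assigned by the classifier
    Reads missing from `predicted` (unclassified) count as incorrect.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for read_id, true_species in truth.items():
        totals[true_species] += 1
        if predicted.get(read_id) == true_species:
            correct[true_species] += 1
    return {sp: 100.0 * correct[sp] / totals[sp] for sp in totals}


# Toy example with made-up read IDs and labels (hypothetical data).
truth = {"r1": "species_A", "r2": "species_A", "r3": "species_B"}
predicted = {"r1": "species_A", "r2": "species_B", "r3": "species_B"}
result = percent_correct_by_species(truth, predicted)
```

Computing the same metric on the single-species runs and on the concatenated run keeps the two sets of numbers directly comparable.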
I was hoping you could please investigate this and let us know how we can improve the classification in our analysis pipeline, which incorporates Metamaps.