I was wondering if I could get opinions on the quality control and feature classification methods for my microbiome data. Our original pipeline in QIIME 2 used DADA2 for sequence quality control and sklearn to classify features. That pipeline gave us a lot of unclassified and unnamed bacteria, so we switched to the q-score quality filter followed by Deblur for quality control, and then vsearch to classify our bacteria. From the taxonomic bar plots, it seems we’ve successfully lowered the amount of unclassified and unnamed bacteria in our data. But we are slightly confused about the inner workings of the q-score filter and Deblur, mainly whether we should merge our paired-end sequences before running them through the q-score filter (that’s what I did, and it seemed to work). When I view the quality plots after q-score filtering, it looks like the reads were merged, but I’m not sure this plot is the correct tool for visualizing these results. We are happy with the final results after feature classification, but would like second opinions on whether our pipeline sounds valid. If you have any input, it’d be greatly appreciated.
Hi @mes1174
The graphic you show represents reads that are either single-end or merged paired-end; in either case, each is a single contiguous “read”. It wouldn’t distinguish between the two cases any further. So this looks good to me!
Then, for the classification: two different classification algorithms giving different results also seems fine. One matches against a known reference and the other is predictive, so I would interpret this as meaning you have somewhat novel data (or that what you have was difficult to map to the reference for some reason).
Others are welcome to add more!
Thank you very much, @jennaj!
I appreciate the clarification. Here are my taxonomic bar plots as well, so you can see the amount of unassigned and unnamed bacteria before vs. after.
The top photo shows my original results from sklearn. The bottom photo shows my newest results using vsearch.
Do you know what the “d__Bacteria;__” in our data could be?
Hi @mes1174
Maybe this discussion on the QIIME 2 forum helps? “taxonomy db problem” (post #6 by SoilRotifer, User Support, QIIME 2 Forum)
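In case it helps with reading labels like “d__Bacteria;__”, here is a minimal Python sketch of how such a taxonomy string can be split into levels. It assumes the labels are semicolon-separated with rank prefixes such as d__ and p__, and that a bare “__” marks a level with no assignment (so the feature was only resolved to the domain Bacteria). That reading is my assumption, not something confirmed in this thread, and the function name deepest_assigned_rank is made up for illustration.

```python
from typing import Optional, Tuple


def deepest_assigned_rank(taxon: str) -> Tuple[Optional[str], int]:
    """Return (deepest assigned label, number of assigned levels).

    Hypothetical helper: assumes semicolon-separated levels with
    rank prefixes like "d__", and that an empty name after "__"
    (or a label with no "__" at all) means "not assigned".
    """
    assigned = []
    for level in (part.strip() for part in taxon.split(";")):
        # Split "d__Bacteria" into the prefix "d" and the name "Bacteria".
        _prefix, sep, name = level.partition("__")
        if sep and name:
            assigned.append(level)
        else:
            # First empty or unprefixed level: stop counting here.
            break
    return (assigned[-1] if assigned else None, len(assigned))


if __name__ == "__main__":
    for label in ["d__Bacteria;__", "d__Bacteria;p__Firmicutes;c__Bacilli", "Unassigned"]:
        print(label, "->", deepest_assigned_rank(label))
```

Run on a column of labels from the bar plot data, something like this could show how many features stop at the domain level under each classifier, which may be easier to compare than the plots alone.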