Hi, everyone! I am a very new user of Galaxy. I used Kraken2 on my processed sequences three times with three different databases (Silva 138 16S, Greengenes, and RDP).
Each Kraken2 job showed 44,794 classified sequences.
For some reason, my pie charts are showing different numbers under “Total” in the top right corner. I am not sure what this means. Silva 138 16S showed 84,401, RDP gave 50,686, and Greengenes had 136,825.
Is this number the number of sequences? If so, why is it so different among the charts? My three Kraken2 outputs showed the same number of classified sequences.
Thank you in advance for your help; I have a lot to learn!
The short answer is that each database will include a different set of reference sequences. That means the hits will sort out differently and be summarized differently. This is totally expected and normal! I know this is obvious so I’m mostly saying this for clarity: if each database produced the same results, there wouldn’t be a reason to have different databases at all.
GTN tutorials that use the “Krona pie chart from taxonomic profile” tool are available on the Galaxy Training site.
A few other tools also produce Krona data; the help section at the bottom of each tool form lists them and describes the values they report.
With that context, I’m wondering which exact values you have and with which data. There will be a count for the total sequences, then a breakdown for each category. The Krona tool depends entirely on its input file – it doesn’t “do” any manipulations itself. That means you can compare the values in the chart to the values in your file. The chart is also interactive, so the “Total” in the upper corner can change based on the chart settings and on which section you clicked to highlight!
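If it helps with that comparison, here is a minimal Python sketch (not part of Galaxy, Kraken2, or Krona) that totals the count columns of a Kraken2-style report. It assumes the standard six-column, tab-separated report layout (percent, clade reads, direct reads, rank code, taxon ID, name). The function name `summarize_report` and the tiny `SAMPLE` report are made up for illustration and are not your data. One thing it shows: the clade column is cumulative, so adding it up across rows counts the same reads more than once. Whether that is what is behind your “Total” is only a guess until we see the inputs.

```python
# Illustrative only: SAMPLE is invented data in the standard Kraken2 report
# layout (percent, clade reads, direct reads, rank code, taxon ID, name).
SAMPLE = "\n".join([
    "  0.00\t0\t0\tU\t0\tunclassified",
    "100.00\t100\t0\tR\t1\troot",
    "100.00\t100\t10\tD\t2\t  Bacteria",
    " 60.00\t60\t60\tP\t3\t    Firmicutes",
    " 30.00\t30\t30\tP\t4\t    Proteobacteria",
])


def summarize_report(text):
    """Return three totals computed from a Kraken2-style report (hypothetical helper)."""
    clade_sum = 0        # sum of column 2 over every row (double-counts reads)
    direct_sum = 0       # sum of column 3 over every row (each read counted once)
    top_level_reads = 0  # clade counts of the 'U' and 'R' rows only
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        clade = int(fields[1])
        direct = int(fields[2])
        rank = fields[3].strip()
        clade_sum += clade
        direct_sum += direct
        if rank in ("U", "R"):
            top_level_reads += clade
    return {
        "clade_sum": clade_sum,
        "direct_sum": direct_sum,
        "top_level_reads": top_level_reads,
    }


if __name__ == "__main__":
    for key, value in summarize_report(SAMPLE).items():
        print(key, value)
```

To try it on your own data, download the Kraken2 report from your history, read it as text, and pass that text to `summarize_report` instead of `SAMPLE`. If the chart’s “Total” is closer to the summed clade column than to the number of classified sequences, that would suggest the chart was built from cumulative counts.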
If you want to share some screenshots and point out the values, we can probably help to understand it more. Try to capture the entire screen so we can see how things are set up. If that data is from a tutorial, please share that link, too.
Thank you so much for your response! I have attached some screenshots of my Silva 138 16S Krona pie chart and the report from Kraken2. I see 44,974 in the first row of the table which matches the number of classified sequences. I am wondering where the 84,401 in the Total section of the pie chart comes from.
Also, when I click on Firmicutes on the chart, I get 65,272 for the number of sequences, and about half are unassigned. For some reason, the Kraken2 report says 0% of sequences were unclassified. I am not sure how all of these values fit together.