Kraken2 databases mixed up

Hello, I have been analyzing my reads using Kraken2. I used the database "Prebuilt Refseq indexes: PlusPFP", which should also contain plant genomes, but it seems they are not included in this version in Galaxy (I tried it at both usegalaxy.eu and usegalaxy.org). I am sure some of my reads match plant genomes, but I don't see any plants in the report, even if I check the option "Report counts for ALL taxa, even if counts are zero". So it looks like the PlusPFP database is actually only PlusPF (without the plants).

Hi @vojtech

Do you have an example to share? Search this forum with “sharing your history” if you are not sure how.

Hi @vojtech

I also have a plant sample (RNA-seq). I ran Kraken2 with PlusPFP 2022 (Galaxy EU) and also got only bacterial and fungal classified reads, so I'm also confused about how this is possible.

Did you find out anything more on your side?

PS: I also tried with another sample, from pepper. Same thing: not a single read assigned to plants :flushed: only bacteria and fungi.

I still need an example to confirm the problem. Whoever has one can share it back.

Here @jennaj

I tried with 2-3 samples from different plants, with the same result. Here is one with pepper:

Not a single read classified in Viridiplantae?!
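For anyone who wants to confirm the same thing on their own output, here is a minimal sketch in Python. It assumes the standard six-column, tab-separated Kraken2 report (percentage, clade reads, direct reads, rank code, NCBI taxid, name) downloaded from the history. The function name `clade_reads` and the demo rows are made up for illustration; 33090 is the NCBI taxonomy ID for Viridiplantae.

```python
import csv
import io

VIRIDIPLANTAE_TAXID = 33090  # NCBI taxonomy ID for Viridiplantae


def clade_reads(report_lines, taxid):
    """Return the clade-level read count for `taxid`.

    Returns None when the taxon does not appear in the report at all.
    Assumes the standard six-column Kraken2 report layout.
    """
    for row in csv.reader(report_lines, delimiter="\t"):
        if len(row) < 6:
            continue  # skip blank or malformed lines
        if int(row[4]) == taxid:
            return int(row[1])  # column 2: reads in the clade rooted here
    return None


# Made-up demo rows, not real results.
demo_report = io.StringIO(
    "60.00\t600\t600\tU\t0\tunclassified\n"
    "40.00\t400\t0\tR\t1\troot\n"
    "30.00\t300\t5\tD\t2\t    Bacteria\n"
    "10.00\t100\t2\tK\t4751\t    Fungi\n"
)
print(clade_reads(demo_report, VIRIDIPLANTAE_TAXID))  # prints None: no plant row
```

With a real file, pass `open("report.tabular")` (a hypothetical file name) instead of the demo data. A result of `None` with the "report all taxa" option switched on would suggest the taxon is missing from the index, whereas `0` would only mean that no reads hit it.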

Thanks, I’m reviewing :slight_smile:

Ok, thanks for sharing that, and I was able to compare the indexes and see the missing data.

The issue ticket is here: Kraken2 reference database issue · Issue #672 · galaxyproject/usegalaxy-tools · GitHub. The technical details will be followed up here: Kraken2 reference index mixup · Issue #37 · galaxyproject/idc · GitHub (updated)

Thanks for following up!

Thank you vebaev for sharing the Galaxy history, I wasn't around to do it myself.

@jennaj any news on the issue?

Does Galaxy ORG have the same issue in the DB?

From your comparison, it seems that PlusPF is the same as PlusPFP, as if PlusPF was copied and saved as PlusPFP, since both have the same taxon nodes, which probably mirror PlusPF.
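One way to check the "copied" idea, sketched under assumptions: run Kraken2 on the same reads against each index with "Report counts for ALL taxa, even if counts are zero" enabled, so that each report should list every taxon the index knows about, then compare the taxid columns of the two reports. The helper names and the demo rows below are made up, and the six-column report layout is assumed.

```python
import io


def report_taxids(report_lines):
    """Collect the NCBI taxids (column 5) from a six-column Kraken2 report."""
    taxids = set()
    for line in report_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 6:
            taxids.add(int(fields[4]))
    return taxids


def compare_taxids(taxids_a, taxids_b):
    """Return the taxids unique to each side."""
    return {"only_a": taxids_a - taxids_b, "only_b": taxids_b - taxids_a}


# Made-up demo rows standing in for two reports, not real results.
report_a = io.StringIO(
    "0.00\t0\t0\tD\t2\t  Bacteria\n"
    "0.00\t0\t0\tK\t4751\t  Fungi\n"
)
report_b = io.StringIO(
    "0.00\t0\t0\tD\t2\t  Bacteria\n"
    "0.00\t0\t0\tK\t4751\t  Fungi\n"
)
diff = compare_taxids(report_taxids(report_a), report_taxids(report_b))
print(diff)  # both sets empty here, i.e. the two demo reports list the same taxa
```

If the reports from PlusPF and PlusPFP come back with both sets empty, the two indexes expose the same taxa, which is what one would expect if one was saved under the other's name. A correct PlusPFP should show plant taxids on its side only.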

I just asked for an update.

Thanks! That will be great!

And what are the chances of also including the nr database? (Index zone by BenLangmead)

Hi @vebaev

For new index requests, you can open an issue ticket at this repository → GitHub - galaxyproject/idc: Simon's Data Club - Reference data for Galaxy servers

Just made a new ticket :slight_smile:

Please let me know when the PlusPFP update is done.

Follow the second ticket for live updates on the technical progress. The first ticket is a tracking ticket and when it also closes out, that means the correction is live, has been tested (probably by me), and ready to use. I’ll also come back here and post an update but that might be a day or so after it is actually ready. :slight_smile:

@jennaj seems like the bug will live quite some time…

Hi @vebaev

Maybe you could comment in the GitHub ticket? I can't personally speed this up :slight_smile: