Kraken2 Database Fungi

Hi everyone,

In the Kraken2 tool, among the available databases, there is one called “Fungi Genomes (2019)”. However, on the Index Zone page, the listed databases only go up to 2020, and there doesn’t seem to be any database specifically for fungi — except as part of the PLUSPF and PLUSPFP databases.

Since I’m getting some interesting results with this “Fungi Genomes (2019)” database, I’d like to ask if anyone has more information about it — for example, whether it’s available online, or where it was originally taken from and how it was built.

Thanks a lot — any info would be really helpful!

Hello @AleMoxy

All of the databases are public. The ones you are asking about were sourced from here:

Hopefully this helps! :slight_smile:

I’m sorry, but in the Index zone I can find most of the databases available in the Galaxy Kraken tool, but not the Fungi Genomes one. Where am I going wrong?


Hi @AleMoxy

To locate the older indexes, start here: Index zone by BenLangmead.

All the releases are listed. As you reach the older ones near the end, please find:

Please see the Kraken, Kraken 2, KrakenUniq and Bracken websites for more information on the software, authors, and how to cite the work.

Then go to the Kraken 2 site to review the instructions for how these legacy indexes were created (creation commands/tools, search criteria against the public database source, plus how/where you can still get these files).

That would be from here https://ccb.jhu.edu/software/kraken2/ to https://ccb.jhu.edu/software/kraken2/index.shtml?t=downloads

Pre-built Kraken 2 Databases

These additional databases have been provided by non-CCB labs:

Finally, following the Maxikraken2 and Kraken2-microbial databases link leads to the instructions for how these were created and where to get a copy for local use.

The Fungi Genomes (2019) database is the fungi sub-database from the release maxikraken2_1903_140GB (March 2019, 140GB).


I’m not aware of any problems with that particular older Kraken2 index release, but RefSeq itself could have had some issue too, and my search could have missed it.

I do know there was a problem with the Fungi sub-domain of the PlusPF databases circa 2021. See the Kraken2 notations section, which states:

The PlusPF database (including PlusPF-8 and PlusPF-16), as well as the PlusPFP database (including PlusPFP-8 and PlusPFP-16) posted on 5/17/2021 mistakenly omitted genomes from Refseq “fungi”. We posted the fixed databases on 1/27 and 1/28/2021.

(I think I see another typo! The fixed databases were posted in 2022.)

Some details are at our forum here: Not identifying fungi with kraken2 PlusPF database (plus other topics that link from it).

This impacted everyone, of course (not just Galaxy!). We always keep the original database indefinitely for reproducibility reasons, but the newer releases are available too. This means that for new work involving Fungi, it is probably best to avoid the indexes with known issues entirely.

In short, complicated! But hopefully this explains where the data was sourced. If you are seeing expected results when screening against the different versions of indexes, consider using the most current known-problem-free versions.
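If it helps when comparing index versions, here is a minimal Python sketch that pulls the Fungi clade line out of a Kraken2 report so the numbers can be put side by side. It assumes the standard tab-separated report layout (percentage first, clade read count second, NCBI taxonomy ID second to last, name last) and uses taxid 4751 for Fungi. The function name and the two report excerpts are made up for illustration; they are not real results.

```python
FUNGI_TAXID = "4751"  # NCBI Taxonomy ID for the kingdom Fungi


def fungi_clade_summary(report_lines):
    """Return (percent, clade_reads) for the Fungi line of a Kraken2 report.

    Returns None when the report has no Fungi line, which is what you
    would expect to see if the index contained no fungal genomes.
    """
    for line in report_lines:
        # Skip blank lines and any comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # not a standard report row
        # Assumption: taxid is the second-to-last column, name is the last.
        if fields[-2].strip() == FUNGI_TAXID:
            return float(fields[0]), int(fields[1])
    return None


# Hypothetical report excerpts, invented for this example only.
report_with_fungi = (
    " 20.00\t200\t200\tU\t0\tunclassified\n"
    " 80.00\t800\t0\tR\t1\troot\n"
    "  5.00\t50\t10\tK\t4751\t      Fungi\n"
)
report_without_fungi = (
    " 25.00\t250\t250\tU\t0\tunclassified\n"
    " 75.00\t750\t0\tR\t1\troot\n"
)

for label, text in [("index A", report_with_fungi), ("index B", report_without_fungi)]:
    print(label, fungi_clade_summary(text.splitlines()))
```

In Galaxy, the report is one of the Kraken2 tool outputs, so you could download the report from each run and pass its lines to the function. A missing Fungi line from one index but not another would be a hint to look closer at that index.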

Then, if you suspect a new issue has been identified in a current database, reporting it to the people who create the index is best. That way more people from the wider Bioinformatics community can try to replicate, confirm the issue, then the team who works with the Langmead lab can correct/document it at the source (and it would flow down to everyone else, including Galaxy). If you are not sure how to do that, please ask and I can make some more suggestions about what to do next. :slight_smile:

BONUS: While I was looking for prior known issues, I stumbled across this, and I think the advice from the community is really good! You can do all of this in Galaxy and we happen to have tutorials to support the methods at all of the UseGalaxy servers. See the bottom of the individual tool forms for the links. → Filtering Alignment Results for Fungi in PlusPF Database · Issue #791 · DerrickWood/kraken2 · GitHub


Awesome! Thank you very much!!!
