Hi Igor,
If you look at the available Kraken2 databases that are hosted on the Amazon cloud service, the one I mean is referred to as the “core_nt Database”: a very large collection, inclusive of GenBank, RefSeq, TPA and PDB. Its current release is 9/4/2024, with a size of 233.3 GB. The most inclusive one currently on Galaxy.eu is PlusPFP from 9/2022 at 142 GB, which is KNOWN to be defective, as it was found to be unintentionally missing sequences (PlusPFP's current size is 188 GB).
I suspect the core_nt database may include host sequences, which would be EXTREMELY helpful when running Kraken2 on samples of unknown composition.
I would like to request that it be added to the Galaxy options, if possible.
Jon