Kaiju database request

Dear Galaxy Team,

I would like to run Kaiju using a reference database that includes bacteria, archaea, and eukaryotes, such as refseq_nr. Could you please help add this database? Thank you!

Welcome @chuanzhai

The Kaiju indexes do not appear to be undergoing updates anymore (as of 2024). This is likely why the tool is only hosted at UseGalaxy.eu and not the other UseGalaxy servers.

Instead, please have a look at Kraken2 and related tools. We have many tutorials that can guide you through using them, and the indexes are current: they come from the same public source that others use when working outside of Galaxy.

:graduation_cap: Galaxy Training Network

Relevant Tutorials

Where we source the indexes → Kraken2 databases question - #2 by jennaj

We hope this explains the current situation and provides an alternative! :slight_smile:

@chuanzhai while the lack of DB updates will become increasingly problematic, I have now triggered installation of all the existing genomic DBs on Galaxy Europe. They should be available as DB choices in the tool from tomorrow on.

Cheers,

Wolfgang

Hopefully you can get this running on Galaxy. I have wanted this for a long time; I think it gives a more accurate classification than Kraken2.

Hi @chuanzhai and @Jon_Colman

The indexes appear to be in place! Please give the tool a try!

I didn’t go through and test each, so if either of you run into problems, please share back the full log messages and inputs/parameters and we can help to investigate!

Glad this could be done! :slight_smile:

I tried running Kaiju twice, using different databases, and both runs failed??

Ah, ok, the quick addition was worth a try!

I was able to reproduce your use case with tool test data and found another small issue as well. I’ve ticketed these here → Corrections for kaiju_kaiju 1.10.1+galaxy1 · Issue #8045 · galaxyproject/tools-iuc · GitHub.

@wm75 is out right now but he’ll see this when he returns. Maybe some part of the nr, nr_euk, and refseq indexes didn’t get replicated into the correct location for the working job directory to see it. The others are OK.

A warning: the other issue I found will need to be corrected before you can use the same options that you applied, if you want to try a different index. In short, “Enable SEG low complexity filter” needs to be toggled to Yes or the job falls through to a different problem. Disabling the filter is technically supported by the underlying tool, and I didn’t find a known issue, so the failure may be spurious and something else may be happening here.

Hope this helps and more next week! :slight_smile:

Yeah, I suspected some small issues. I didn’t want to spend too much time on it, as processing was slow.