I’ve created a database using SnpEff-build, which generates a version 5.2 database. However, when I try to annotate variants using SnpEff-eff and select the newly created database, the tool is unable to run and gives me the message “Parameter snpeff_db: This version of SnpEff will only work with 5.2 genome databases”. I’ve tried downloading the snpeff_db file and uploading it again but am still unable to use the tool. Is there anything I can do on my end to fix the issue, or is it a matter of updating/fixing the tools to make them compatible?
Welcome, @AM2023
The message about database versions is likely spurious, so something else is going wrong, and it may be reported elsewhere in the job logs: the red error dataset, the standard error, or the standard output. (Find all of these using the i-icon.)
If you still need help with this after reviewing on your own, I’m willing to look closer to help determine what exactly the problem is, and to help to get it fixed: maybe it is something we can adjust server side or maybe there is something you can do as a user to avoid it.
How to share your work is described in the banner at this forum, also here directly → How to get faster help with your question. For this troubleshooting, seeing both the data you created the database from and the failed job using that database would provide what we’ll need. You can post the share link back here with your comments.
Thanks, and if you solve this yourself, please let us know about that too!
There isn’t an error message that allows me to investigate any further - though I may just be missing something - only a message that the tool cannot use the database generated in the previous SnpEff build step. I have a screenshot of the error I’m seeing as well as a link to my history here: Galaxy
Hi @AM2023 Thanks for posting back the screenshot and shared history link. Super helpful!
These are my observations from what you shared. If I missed anything, please let me know, and if it helps that would be great to learn too. We want this to work for you!
- Query for available pre-built indexes
It looks like you were able to list out the available databases, or maybe found the identifier another way (since you were able to load these later on)? So, good!
The query “falciparum” would probably work best for what you are doing. It is always a bit of a guess how the data author labeled these pre-built indexes, but I do think species is a good place to start, since genus and sub-species can both be abbreviated in non-obvious ways, plus the versioning or strain notation confuses the format more. So, I think this worked for you, but please ask if you have questions.
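To illustrate why a short species term tends to find more hits than a full name, here is a minimal Python sketch of a case-insensitive substring search. The function `find_databases` and the identifiers in `example_names` are hypothetical, made up for illustration; they are not real SnpEff database names and this is not how the Galaxy tool itself searches.

```python
def find_databases(names, term):
    """Return the names that contain `term`, ignoring case."""
    needle = term.lower()
    return [name for name in names if needle in name.lower()]


# Hypothetical identifiers (assumption): real labels vary by data author.
example_names = [
    "Plasmodium_falciparum_3D7",
    "Pfalciparum3D7_example",
    "Homo_sapiens_example",
]

# A short species term matches both labeling styles.
short_hits = find_databases(example_names, "falciparum")
print(short_hits)

# A full name with a space matches neither, because the labels use
# underscores or an abbreviated genus.
long_hits = find_databases(example_names, "Plasmodium falciparum")
print(long_hits)
```

The same idea applies when scanning the list of pre-built indexes by eye: start with the least ambiguous part of the name, then narrow down.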
- Download a pre-built index or Create a custom index
Downloading appears to have also worked for you, great! These indexes are labeled with a software version number that corresponds to the version of the SNPeff annotation tools you can use them with later.
Versioning was your original question, correct? I can let you know that this comes from the original tool author: everyone using these tools anywhere (not just Galaxy) needs to make sure that the index version and tool version are a “match” for things to work as expected.
Why? The SNPeff author made some changes that impacted how the index is created, and then how the downstream tools interpret that index. The older versions of the tools still work with the older index versions, and the same holds for the new versions. Try not to mix these up to avoid technical and scientific problems.
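As a rough sketch of the “match” idea, the Python below compares only the leading major.minor numbers of a tool version and an index version. The helper names and the example version strings are assumptions for illustration, and comparing major.minor is a simplification, not the check that SNPeff or Galaxy actually performs.

```python
import re


def major_minor(version):
    """Extract the leading 'major.minor' numbers from a version string."""
    match = re.match(r"\s*(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def versions_match(tool_version, index_version):
    """True when tool and index share the same major.minor version."""
    return major_minor(tool_version) == major_minor(index_version)


# Illustrative version strings (assumption): a 5.2 tool with a 5.2 index
# is a match, while a 4.3 tool with a 5.2 index is not.
print(versions_match("5.2+galaxy0", "5.2"))
print(versions_match("4.3+galaxy0", "5.2"))
```

In practice, the simplest way to stay consistent is to pick one version (4.3 or 5.2) and use it for both the index step and the annotation step.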
- SNPeff annotate a VCF file
It looks like you were able to navigate the Options → Versions menu in the upper right corner of the tool form. Details for anyone else reading → FAQ: Changing the tool version
There are two current primary versions of these tools: 4.3 and 5.2. I don’t know if all of the 4.3 genomes are available for 5.2, or the reverse, but I am guessing there is probably not perfect overlap. That said, you could probably reach out to the SNPeff authors and ask that a new index be created. Instructions for contact are linked from the tool form, but also here → Help and Bugs - SnpEff & SnpSift.
Changes that happen at the source will flow down to Galaxy, with new indexes being ready the quickest: Galaxy is making queries against the public resources hosted in the cloud, so everything is “live” and current. But don’t wait for this – it is quick to create a custom index and we can help here if you run into problems. See snpeff_build_gb and snpeff for prior troubleshooting around this.
Summary
Query the pre-built indexes. Simple search terms find more hits.
If your genome is not pre-built yet: you can create the index in Galaxy with the index creation tool, and also ask the authors for a new index for later on.
Use an index version that matches the version of the other tools you plan to use. This ensures everything works together to produce the highest quality scientific results!
The tool forms may attempt to filter potential input indexes for you. This works similarly to “datatype format” filters. Try not to override those filters, since they act as an important warning: you may need to update a datatype, create a new file, or, as in this case, adjust the tool version you are using.
Sorry for the length, but we haven’t had a question about this tool recently, so I took the opportunity to write up our current advice for you and anyone else reading later on. Please do let us know if things are working for you now. Follow-up questions about any of this, or anything I missed, are welcome.