How to add Triticum aestivum snpEff4.3 genome database or appropriate wheat genome database in Galaxy for VCF annotation?

I added the T. aestivum snpEff4.3 genome database in Galaxy, and it seems to have been added, since I can see the T. aestivum snpEff4.3 genome database in the history panel on the right side of the Galaxy page. But when I run snpEff with my VCF file and the wheat snpEff4.3 database, it shows an error. I would like to annotate natural variants against the wheat genome (particularly against wheat homeolog 3DL). I also tried Ensembl VEP, which works, but I would like to try Galaxy to see its annotations as well. Does anyone know how to do this?

What error exactly are you getting? Please post the error message here. It’s hard to help without this info.

I attached the file. It shows that the added wheat database is reported as unavailable when I run a VCF file with the added database.

Hi - We are in the process of updating the SnpEff tool suite at Galaxy Main https://usegalaxy.org. You might have run into one of the known bugs that the update will address. The ticket with details, if you are interested: https://github.com/galaxyproject/usegalaxy-playbook/issues/157

Choices:

  1. Wait for the tools to be updated at Galaxy Main (the ticket linked above will close when that is completed) and rerun using the new tool versions.

  2. Use the tools at Galaxy EU https://usegalaxy.eu. EU has the updated tool versions already installed and as far as I know, those are working as expected now. Should the tool still present problems at Galaxy EU, then more is going on.

@wm75 is an admin for the EU server and I am an admin for the Main server – either of us can help with more troubleshooting to determine if this is a usage problem versus some tool issue that remains to be addressed.

Update: Looking at your graphic more closely, I am wondering if the snpeff database dataset is in a hidden/unstable state or from an earlier SnpEff tool version (not always compatible). It appears to have been uploaded rather than created new in the same working history. If you want to direct message me here and share your registered account email address at Galaxy Main (I do NOT need your password, and you should never share that with anyone), I can take a look to see if that is a factor first, to save you some time before bothering with reruns or moving your data to another server. Your choice - thanks!

@Karthikeyan_Thiyagar If this really is, as noted by @jennaj and suggested by your screenshot, a snpeffdb dataset that was downloaded from Galaxy and then reuploaded, then this is not going to work on any server at the moment.
As things are implemented currently, there is extra data associated with a snpeffdb dataset, which will not be included in your download, but which is necessary for snpEff. In other words, SnpEff genomes obtained through Galaxy cannot be exported from that particular Galaxy instance to anywhere else in a usable form. This situation is different from what you may be used to from other Galaxy-generated data, but is technically not easy to avoid.

So the remaining question is why you tried this approach in the first place. Was that in reaction to other ways not working either? If so that would most likely indicate a bug with a snpEff tool.
Otherwise, I would suggest that you download the snpEff genome again (creating a new dataset in your history), then use snpEff with that genome directly.
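
As a rough illustration of the distinction described in this post, the sketch below separates a SnpEff database dataset created by a tool on the server from one that was downloaded and re-uploaded. It works on plain dictionaries standing in for dataset metadata. The function name, the keys `file_ext` and `creating_tool_id`, and the values `snpeffdb`, `upload1`, `snpeff_download`, and `snpeff_databases` are assumptions made for illustration, not a confirmed Galaxy API.

```python
# Hypothetical sketch: the keys, tool ids, and datatype label below are
# illustrative assumptions, not a confirmed Galaxy API.

UPLOAD_TOOL_ID = "upload1"   # assumed id of the upload tool
SNPEFF_DB_EXT = "snpeffdb"   # assumed datatype label for SnpEff databases


def usable_as_snpeff_db(dataset: dict) -> bool:
    """Return True if the dataset looks like a SnpEff database that was
    created by a tool on this server, not downloaded and re-uploaded."""
    has_db_type = dataset.get("file_ext") == SNPEFF_DB_EXT
    was_uploaded = dataset.get("creating_tool_id") == UPLOAD_TOOL_ID
    return has_db_type and not was_uploaded


# Made-up history entries, for illustration only.
history = [
    {"name": "SnpEff download: genome", "file_ext": "snpeffdb",
     "creating_tool_id": "snpeff_download"},
    {"name": "re-uploaded database", "file_ext": "snpeffdb",
     "creating_tool_id": "upload1"},
    {"name": "available databases", "file_ext": "tabular",
     "creating_tool_id": "snpeff_databases"},
]

usable = [d["name"] for d in history if usable_as_snpeff_db(d)]
print(usable)
```

Only the first entry passes the check. The re-uploaded dataset has the right datatype label but would lack the extra server-side data, and the database listing is not a database at all.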

Thanks @wm75 for the clarification!

@Karthikeyan_Thiyagar Please try this at the Galaxy EU server for now using the latest version of the SnpEff tools. This will avoid known prior issues with older versions (now fixed). It is fine to transfer other data between servers (your vcf data, etc). Please let us know how this goes!

Thanks @jennaj and @wm75. I just tried again to add the same database with the latest version of the snpEff tool, but it shows the same error as before.

Please see the attached file with the files history.

It looks like the file selected is the “available database” listing (tool SnpEff databases: list available databases).

You want to select the .db result in dataset 13. That looks like the result of the tool SnpEff build (and not SnpEff download), but both produce a “snpEff database” db output.

Note: Dataset 13 is a collection, so click on the collection “folder” icon instead of the single dataset icon to select the proper input. Hover over the three icons to see the pop-ups describing what each one is for.

If choosing the .db input doesn’t work for some reason or you run into more problems, please confirm that this was done at the EU server (it appears so, judging from the low quota usage).

Thanks a lot, Jennaj. I got the same error message even with the European Galaxy server. But I found an article, https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0137549, in which the authors used the snpEff toolbox to annotate SNPs using the wheat reference sequence.
Article title: "Mutation Scanning in Wheat by Exon Capture and Next-Generation Sequencing"

I contacted the corresponding author of this article via email, and he told me to contact a bioinformatician who worked on that article to get an idea of how snpEff was used. I will contact the bioinformatician, and if I find an answer or an idea for using the wheat snpEff database, I will post it here. Thanks

Hi, I was wondering how this was resolved.

I am facing a similar issue with trying to run SnpEff.

  • I tried downloading the SnpEff database directly from SourceForge and uploading it to Galaxy. However, SnpEff eff says this file is unavailable (similar to the problem above). I think the issue has to do with the metadata, but autodetect does not work.

  • The reason I tried the above method is because when I use the SnpEff download tool or SnpEff build tool, all I get is a file with 2 lines. Not sure why this happens.

I have tried both methods on both the Galaxy Main instance and the EU site. I am unsure how to proceed and would be grateful for any suggestions.

Right, this particular data does not work “from the history” when loaded this way. You have to create or download the database using the other tools.

The output with just a few lines is an internal link to the SnpEff database created on the server. The SnpEff tool form should recognize and accept it as a database input.

@wm75 explains more about this in one of the prior posts in this topic (scroll up) and in a new duplicate topic here: SnpEff errors while trying to run the tool.