Subject: Urgent Assistance Needed with SnpEff Custom Database Creation

Dear Galaxy Server Admin,

I’m writing to seek your assistance with creating a custom SnpEff database for my organism, [organism name]. I’ve been working on this for some time now, attempting several approaches as documented in [mention any resources or tutorials used]. Unfortunately, I haven’t been successful.

Interestingly, several of my colleagues, using the same files and processes, have managed to create custom databases on their individual accounts. I have also been able to do this on other Galaxy server accounts. This leads me to believe that an issue with my account or with specific server settings might be hindering my attempts.

I would greatly appreciate it if you could look into this matter as soon as possible. My research progress is significantly impacted by the lack of a custom database for my organism.

Please let me know if you require any further information or if there’s any specific troubleshooting I can perform on my end.

Thank you for your time and consideration.


Muhammad Rizwan

Hi @Muhammad_Rizwan

You are posting to a public forum… just wanted to make sure that you knew that :slight_smile:

Maybe ask one of your colleagues to share a workflow generated from their history with the successful database creation? And their starting data in a shared history, so you can compare your data with theirs? I would try running that workflow on their data to see what settings they used and what the result looks like, then try to do the same on your own data.

We don’t have a specific tutorial for this since how the tool works in Galaxy is the same as for the command-line version. That means the author resources are the help. Technically, Galaxy creates a sort of mini-database with the output but those are not exposed to users.

Some tips:

  1. Use properly formatted reference data as the source
  2. Review the help on the tool form, and the link outs to the author resources
  3. Be careful about how you are adding labels on the tool form. The database name should be all oneWord and not starting with a number. If you run the tool several times, use a distinct database name for each run to avoid clashes. I tend to just number them as I change my mind about the parameters: species1, species2, speciesN.
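As a rough illustration of the naming tip above, here is a small Python sketch that checks a candidate database name and generates numbered names for repeated runs. The exact character set allowed is my assumption (letters, digits, underscore); the only rules stated above are "one word" and "not starting with a number". The function names are hypothetical helpers, not part of SnpEff or Galaxy.

```python
import re

# Assumption: "one word" means letters, digits, and underscores only,
# and the first character must be a letter (so never a digit).
_DB_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_db_name(name: str) -> bool:
    """Return True if the name is a single word that does not start with a number."""
    return _DB_NAME.fullmatch(name) is not None


def numbered_names(base: str, runs: int) -> list[str]:
    """Return distinct names for repeated runs: base1, base2, ..., baseN."""
    if not is_valid_db_name(base):
        raise ValueError(f"not a usable database name: {base!r}")
    return [f"{base}{i}" for i in range(1, runs + 1)]
```

For example, `numbered_names("species", 3)` gives one distinct name per run, which avoids the clashes mentioned above when the tool is run several times.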

None of this is related to an account, and I created a database last week that worked fine. So, please give this another try. If you need more help, you’ll need to share more details. The banner on this forum explains how to share your work. If the server URL is not included in what you post, please specify that directly.

Let’s start there :slight_smile:

How to get faster help with your question

:mechanic: FAQ: What information should I include when reporting a problem?

Any persistent problems can be reported in a new question for community help. Be sure to provide enough context so others can review the situation exactly and quickly offer advice.

Consider FAQ: Sharing your History or posting content from the Job Information :information_source: view as described in FAQ: Troubleshooting errors.

XREF: Regarding SNP eff


Hi @Muhammad_Rizwan I have some time today. If you want to create a new history just with the reference genome, reference annotation, and genbank file plus some trial runs using SNPeff build that fail to create the database, I’ll play around and see if I can get it to work and share back the exact “how-to” here as a public example of troubleshooting.

Why three files? Because there are two ways to do this. Using the genbank file alone, or using the genome + annotation files together. One of these methods should work.
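Since the tool in Galaxy behaves like the command-line version, the two routes can be sketched as the equivalent command-line calls. This is only an illustration under assumptions: the jar name `snpEff.jar`, the helper name `build_commands`, and the name suffixes are placeholders of mine, and the command line also expects the input files and a config entry to be in place, which this sketch does not cover. The `build` options `-genbank`, `-gff3`, and `-gtf22` are the ones described in the SnpEff author documentation.

```python
def build_commands(db_name, has_genbank, has_fasta, annotation_format=None):
    """List the SnpEff build invocations worth trying for the files on hand.

    annotation_format is "gff3" or "gtf22" when an annotation file is available.
    Each route gets its own database name so repeated tries do not clash.
    """
    base = ["java", "-jar", "snpEff.jar", "build"]  # jar location is an assumption
    commands = []
    if has_genbank:
        # Route 1: the GenBank file alone carries both sequence and annotation.
        commands.append(base + ["-genbank", "-v", f"{db_name}_gbk"])
    if has_fasta and annotation_format in ("gff3", "gtf22"):
        # Route 2: genome FASTA plus a separate annotation file.
        commands.append(base + [f"-{annotation_format}", "-v", f"{db_name}_{annotation_format}"])
    return commands
```

With all three files on hand, this returns two commands, one per route, which matches the idea of trying both and keeping whichever works.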

Do this please so I can tell what is going on :slight_smile:

  1. create a history
  2. get the genome fasta and annotation gtf/gff file and genbank files all new from the public sources, directly in that history
  3. name your history and put the URLs to all files in the history comments (top of the panel using the pencil icon; the same place where you name a history has a comment/notes area). I need to know the exact sources in case I need a different version of any. Don’t worry if that seems like too much content for that box: just paste one URL at a time with a newline between URLs.
  4. do the format standardizations, then recreate the failures both ways: genbank only, then fasta + gtf/gff3. You can include as many tries as you want, but at least two for each method. Remember to name the “database” differently for each (see my prior comments for how I do it)
  5. generate a share link to the history (history menu > share or publish)
  6. post the share link back here