SnpEff build Java error

Dear Galaxy community,
I am trying to build a database for Mycobacterium tuberculosis H37Rv using a GFF annotation. I tried to use .fna and .gff files from NCBI but am getting the following error:
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/galaxy-repl/main/jobdir/024/539/24539611/_job_tmp -Xmx7g -Xms256m
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
at org.snpeff.SnpEff.parseArgs(SnpEff.java:947)
at org.snpeff.SnpEff.cm
Is this something related to my computer or am I doing something wrong?
I also tried .fasta and .gff3 files but got the same result.
Alternatively, I tried snpEff download and received the following error message:
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/galaxy-repl/main/jobdir/024/539/24539793/_job_tmp -Xmx7g -Xms256m
00:00:00 SnpEff version SnpEff 4.3t (build 2017-11-24 10:18), by Pablo Cingolani
00:00:00 Command: ‘download’
00:00:00 Reading configuration fil

I wonder if anyone can help me get through this.

Thank you


Hi Sanjay

Nice to meet you. It sounds like there might be a problem with the data that you are trying to analyse.

Firstly, when it comes to M. tuberculosis, the snpEff database to use is Mycobacterium_tuberculosis_h37rv. This corresponds to the NCBI NC_000962.3 genome (i.e. H37Rv). You can either download it with the snpEff download tool and use it from your history, or just enter that name as the text string in the snpEff tool.

I don’t know what you are using for variant calling, but if you use snippy with a GenBank reference as input (i.e. the one from https://www.ncbi.nlm.nih.gov/nuccore/NC_000962.3), it will run snpEff on your VCF for you automatically. This works for me with the 4.3.6+galaxy2 snippy tool.


Thanks Peter. It worked.


Hello,

I’m trying to build a SnpEff database for a new genome that is not publicly available. I’m using

toolshed.g2.bx.psu.edu/repos/iuc/snpeff/snpEff_build_gb/4.3+T.galaxy4

with a gff3 and a fasta file.

I get the error below. Previously the tool worked to build a SnpEff database for another genome, and I don’t understand why the gff3 file (which appears to be the same as the gff3 file used for the previous genome) now leads to this error message.

What do I need to change?

thanks

> Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
> 	at org.snpeff.SnpEff.parseArgs(SnpEff.java:947)
> 	at org.snpeff.SnpEff.cmd(SnpEff.java:235)
> 	at org.snpeff.SnpEff.run(SnpEff.java:1174)
> 	at org.snpeff.SnpEff.main(SnpEff.java:162)

Dear Paul,
please open a separate thread for this distinct issue.
Also, please tell us what the sources of the gff3 and fasta files are.
Cheers,
Wolfgang


Hi Wolfgang,

I no longer have the problem, so I will not open a separate thread now.

It seems the error occurred because I had spaces instead of underscores in the “Name for the database” field.

The example the instructions give for naming the database, “For E. coli K12 you may want to use ‘EcK12’ etc.”, hints that the name should contain no spaces, but I didn’t realise this.

regards,
Paul
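For anyone hitting the same error, here is a minimal Python sketch of the fix Paul describes: replacing whitespace in the database name with underscores before using it. The helper names `sanitize_db_name` and `count_args` are hypothetical, and the `snpEff build -gff3 ...` command strings are only illustrative; how the Galaxy wrapper actually assembles its command line is an assumption here. The sketch only shows that an unquoted name containing spaces splits into several arguments.

```python
import re
import shlex


def sanitize_db_name(name: str) -> str:
    """Hypothetical helper: trim the name and turn whitespace runs into underscores."""
    return re.sub(r"\s+", "_", name.strip())


def count_args(command: str) -> int:
    """Count the arguments a POSIX-style shell split would produce."""
    return len(shlex.split(command))


if __name__ == "__main__":
    raw = "My genome v1"          # illustrative database name containing spaces
    safe = sanitize_db_name(raw)
    print(safe)
    # Illustrative command strings only; not the wrapper's real command line.
    print(count_args(f"snpEff build -gff3 {raw}"))
    print(count_args(f"snpEff build -gff3 {safe}"))
```

With the unsanitized name the genome name is spread over three separate arguments, which is consistent with (though not proof of) the argument-parsing exception shown above.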


Respected All,

I am trying to create a custom SnpEff database using a gff3 file and a FASTA-formatted genome file. However, there seems to be a problem: the output database is only 17 bytes in size. I can only see the following details in the run window:

2 lines, format: snpeffdb, database: ?
Reading GFF3 data file : ‘/galaxy-repl/main/jobdir/029/903/29903497/working/dataset_43687863_files/TeaCSS/genes.gff’ Total: 289351 markers added.
Create exons from CDS (if needed): Exons created for 0 transcripts.
Deleting redundant exons (if ne

As a result, the database build appears to be faulty, and I cannot run variant effect prediction on my VCF file. Can anyone help me resolve this issue?

With sincere regards,
Anjan
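The log above reports many markers added but "Exons created for 0 transcripts". One way to see what the annotation actually contains is to tally the feature types in column 3 of the GFF3 file. The sketch below is only a sanity check under the assumption that the file is a standard tab-delimited, nine-column GFF3; the function name `count_feature_types` and the sample lines are hypothetical, and this does not by itself explain the 17-byte output.

```python
from collections import Counter


def count_feature_types(gff_lines):
    """Tally GFF3 feature types (column 3); hypothetical helper for inspection."""
    counts = Counter()
    for line in gff_lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # an embedded sequence section follows; stop reading features
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.split("\t")
        if len(cols) != 9:
            counts["<malformed>"] += 1  # not a nine-column, tab-delimited record
            continue
        counts[cols[2]] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [
        "##gff-version 3\n",
        "chr1\tsrc\tgene\t1\t100\t.\t+\t.\tID=g1\n",
        "chr1\tsrc\tmRNA\t1\t100\t.\t+\t.\tID=t1;Parent=g1\n",
        "chr1\tsrc\tCDS\t1\t100\t.\t+\t0\tID=c1;Parent=t1\n",
    ]
    for feature, n in sorted(count_feature_types(sample).items()):
        print(feature, n)
```

To run it on a real file, pass an open file handle, e.g. `count_feature_types(open("genes.gff"))`, and check whether gene, mRNA, CDS and exon records are all present.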