Snpeff errors=numbers of variants process

SANJAY · August 12, 2019, 4:30pm

Hi,
I wonder if anyone can help me understand this: [screenshot of the snpEff log, not preserved]

Thank you
Sanjay

Peter_van_Heusden · August 12, 2019, 6:46pm

This log doesn’t show much unfortunately. You need to show the specific errors. One common error I have encountered with snpEff in the past is the chromosome name in the VCF file not matching what snpEff expects. Is this a job you are running on one of the usegalaxy servers?

Peter_van_Heusden · August 13, 2019, 10:48am

Ok, I have a user (pvanheus) on usegalaxy.org - if you want you can share your history with me so I can see the error in more detail.

What specific error are you getting? Does it show in the VCF?

SANJAY · August 13, 2019, 2:10pm

Can you share your “Galaxy user email”? It’s not accepting the username.

Peter_van_Heusden · August 13, 2019, 2:16pm

I have done so using a private message.

Peter_van_Heusden · August 13, 2019, 9:08pm

Thank you. Your error is, indeed, the “Chromosome name not found” error, but it is masked by the size of your variant file. You have 1429 actual variants in the VCF. The other 4300000-odd positions are non-variant sites. If I filter out the non-variant sites by using snpSift with the (!ANN='.') clause included in its filters, I can run it through snpEff and clearly see the error messages. The problem here is that you called variants against the AL123456.3 reference, whereas the snpEff database you downloaded expects the reference to be called Chromosome. The database is modelled on the H37Rv sequence in Ensembl Bacteria, which is the same sequence as AL123456.3 (and NC_000962.3) but with a different name.

To remedy this I ran your VCF through Text transformation with sed with the SED Program /^[^#]/s/^AL123456.3/Chromosome/ and then snpEff and it worked.
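For anyone working outside Galaxy, the same rename can be sketched in Python. This is only an illustration of the idea behind the sed program above, not the tool Galaxy runs; the function name `rename_contig` and the example records are made up. Like the sed program it leaves header lines (those starting with `#`) untouched, but it compares the whole CHROM column rather than a prefix.

```python
def rename_contig(lines, old="AL123456.3", new="Chromosome"):
    """Yield VCF lines with the CHROM column renamed on data lines only."""
    for line in lines:
        if line.startswith("#"):
            # Header and meta-information lines pass through unchanged.
            yield line
            continue
        chrom, sep, rest = line.partition("\t")
        if chrom == old:
            yield new + sep + rest
        else:
            yield line


# Hypothetical three-line VCF used only to show the behaviour.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "AL123456.3\t1977\t.\tA\tG\t225\t.\tDP=50\n",
]
renamed = list(rename_contig(example))
```

If the VCF header carries a `##contig` line with the old name, this sketch (like the sed program) does not change it, so some downstream tools may still want the header edited separately.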

I took the workflow you used with modification described above and turned it into a Galaxy workflow that is accessible at https://usegalaxy.org/u/pvanheus/w/mtb-map-and-variant-call .

Honestly though, I’d just use snippy for variant calling. You can find it on usegalaxy.eu.

Peter

Peter_van_Heusden · August 13, 2019, 9:10pm

BTW we (at the South African National Bioinformatics Institute - SANBI) have been working extensively on M. tuberculosis bioinformatics using Galaxy - perhaps this is something we can discuss via email.

SANJAY · August 14, 2019, 12:55am

Hi Peter,
Thank you for the extensive help. I am new to Galaxy.
Could you please tell me which files I need to pick for the “select at runtime” inputs in Step 8: bcftools call (the sample file under “restrict to”, and the ploidy file and sample file under “Select Predefined Ploidy”)? Also for Step 11: SnpEff eff: “Use custom interval file for annotation” and “Only use the transcripts in this file”.
I would love to connect with you by email.

Thank you
Sanjay

Peter_van_Heusden · August 14, 2019, 11:28am

The default for bcftools is to treat everything as haploid, which is correct for a bacterium. And then you don’t need to restrict to particular regions in the bcftools or snpEff steps. If you want to filter out things like PE/PPE genes, consider this script: https://github.com/combat-tb/tb_variant_filter - which is not on any of the main usegalaxy servers but is available via bioconda.

lucy · September 25, 2020, 9:51am

Hi Peter,
I would like to annotate my called variants (from Snippy) using SnpEff to look at the potential effect on the protein. I tried to follow the workflow you listed. It runs, but I do not get the information on the gene; in the INFO field I get instead: QR=0;RO=0;DP=1775;AB=0;AO=402;QA=14790;TYPE=snp;EFF=(MODIFIER||||||||||T|ERROR_CHROMOSOME_NOT_FOUND)
Do you have any clue what I can do?

Peter_van_Heusden · April 9, 2021, 10:05am

Sorry, I didn’t look at this message board for a long time. If you get the “ERROR_CHROMOSOME_NOT_FOUND” error, it is because your reference chromosome name does not match the one SnpEff uses. Depending on the database you’re using, SnpEff uses different names, e.g. Chromosome or NC_000962. I often use a SED or AWK script (in Galaxy) to make my VCF match the expected reference name. Another option, instead of running SnpEff separately from Snippy, is to use the Genbank format of your reference genome (if you have one). That is the approach described in this tutorial: M. tuberculosis Variant Analysis
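Before renaming anything, it can help to see which chromosome names the VCF actually contains and compare them with the name the SnpEff database expects. A minimal Python sketch of that check follows; the helper names and the example records are hypothetical, and the expected name has to come from whichever database you selected.

```python
def contig_names(lines):
    """Return the set of CHROM values found on the data lines of a VCF."""
    names = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        names.add(line.split("\t", 1)[0])
    return names


def unexpected_contigs(lines, expected):
    """Return CHROM values that the annotation database would not recognise."""
    return contig_names(lines) - set(expected)


# Hypothetical records; "Chromosome" stands in for the database's name.
vcf = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "AL123456.3\t1977\t.\tA\tG\t225\t.\tDP=50\n",
    "AL123456.3\t4013\t.\tT\tC\t225\t.\tDP=47\n",
]
mismatched = unexpected_contigs(vcf, {"Chromosome"})
```

An empty result means every record already uses a name from the expected set; anything else is a name that would need to be rewritten before annotation.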
