SnpEff annotation errors

Hi,
I have tried to use SnpEff to annotate the variants identified in two different parasite strains. As suggested, I first ran SnpEff build to create the database and then SnpEff eff to annotate the variants. Surprisingly, for one parasite strain, the missense variant for a single-nucleotide change (SNP) is annotated correctly, and the reference amino acid and its position are also correct. However, the changed amino acid in the newly sequenced strain is not shown; a question mark appears in its place, such as L69? (see below):


For the second parasite strain's genome sequence, all the SNPs are wrongly annotated as frame-shift-variant & missense-variant. As a result, the amino acid changes caused by the SNPs are wrongly reported as fs, although the reference amino acid and its position are annotated correctly, such as p.Pro903fs (see below):

I am wondering what could have gone wrong and how it can be corrected. Thanks for your help.

Good news: these annotation problems are finally solved. The errors were caused by giving SnpEff a variants file of the wrong type (tabular) instead of the correct VCF format. The tabular file came from a single mistake in the variant-calling script (--outputs-vcf 1 instead of the correct option, --output-vcf 1). Although the tabular output still reported the correct positions for all the variants, SnpEff is not compatible with that tabular format and so gave wrong and incomplete annotations. This seems to be an uncommon problem in the Galaxy Community, but I hope it will help beginners who run into something similar.
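In case it helps anyone check their own input before running SnpEff, here is a minimal sketch of a format check in Python. It only tests the two things the VCF format requires at the top of a file: a first line starting with ##fileformat=VCF, and a column header line starting with #CHROM. The function name looks_like_vcf and the example file contents (including the tabular column names) are made up for illustration; they are not taken from my actual data or from any tool's real output.

```python
import io

# The eight fixed columns that a VCF header line must start with.
VCF_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def looks_like_vcf(handle):
    """Return True if the text stream begins like a VCF file.

    This is a quick sanity check, not a full validator: it looks at the
    first line and at the first line that is not a '##' meta line.
    """
    first = handle.readline().rstrip("\r\n")
    if not first.startswith("##fileformat=VCF"):
        return False
    for line in handle:
        if line.startswith("##"):
            continue  # skip the remaining meta-information lines
        # The first non-meta line must be the column header.
        fields = line.rstrip("\r\n").split("\t")
        return fields[:8] == VCF_COLUMNS
    return False  # the file ended before any column header was found


# Hypothetical example inputs (assumptions, for illustration only).
vcf_text = (
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t100\t.\tA\tG\t.\tPASS\t.\n"
)
tabular_text = "Chrom\tPosition\tRef\tVar\nchr1\t100\tA\tG\n"

print(looks_like_vcf(io.StringIO(vcf_text)))      # True
print(looks_like_vcf(io.StringIO(tabular_text)))  # False
```

For a real file, the same function can be called on an open text handle, for example with open("variants.vcf") as fh: looks_like_vcf(fh), where the file name is again only a placeholder.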
