Hello,
I am trying to annotate a .vcf file of nucleotide changes in a sample of the Influenza B virus genome, called against the reference strain (B/Victoria/02/1987), with the corresponding amino acid changes.
I have built a SnpEff database using SnpEff build and the reference genome assembly (ASM3108318v1) for this strain, which I obtained from NCBI in gbff.gz format.
I have run the SnpEff chromosome-info tool on the database, and the chromosome names and coordinates appear to be correct and match the chromosome names in my .vcf file for this sample. Here is the output of that tool:
CY018764 1 2351
CY018763 1 2334
CY018762 1 2271
CY018757 1 1843
CY018760 1 1803
CY018759 1 1521
CY018758 1 1151
CY018761 1 1061
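For anyone who wants to reproduce the name comparison rather than rely on checking by eye, here is a minimal Python sketch of an exact-match check. It assumes the VCF is tab-separated and uses a few in-memory lines in place of the real file; the function names `vcf_chroms` and `missing_from_db` are hypothetical, not part of SnpEff. It only reports CHROM values that are not an exact string match for a database name (for example a versioned accession or a name with trailing whitespace), so it is a diagnostic and not a fix.

```python
# Database chromosome names, copied from the chromosome-info output.
db_names = {
    "CY018764", "CY018763", "CY018762", "CY018757",
    "CY018760", "CY018759", "CY018758", "CY018761",
}


def vcf_chroms(lines):
    """Collect the distinct CHROM values from VCF lines, skipping headers."""
    names = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # header/meta line or blank line
        # Split on tabs only, so stray spaces in the name are preserved.
        names.add(line.rstrip("\n").split("\t")[0])
    return names


def missing_from_db(vcf_names, db_names):
    """Return the VCF chromosome names with no exact match in the database."""
    return sorted(name for name in vcf_names if name not in db_names)


# Illustrative lines only; replace with open("sample.vcf") for a real file.
vcf_lines = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "CY018764\t1822\t.\tG\tA\t.\tPASS\tDP=1340\n",
]

for name in missing_from_db(vcf_chroms(vcf_lines), db_names):
    # repr() makes hidden characters (spaces, \r) visible.
    print("not in database:", repr(name))
```

If this prints nothing for the real file, the names match exactly and the cause presumably lies elsewhere, for example in which database the annotation step actually loads.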
However, when I run SnpEff eff using the database and my .vcf, every row in the output VCF shows the error "ERROR_CHROMOSOME_NOT_FOUND" and the amino acid nomenclature is not present. For example:
CY018764 1822 . G A . PASS DP=1340;MQ=249.42;FractionInformativeReads=1;SoftClipRatio=0;STR;RU=G;RPA=2;ANN=A||MODIFIER|||||||||||||ERROR_CHROMOSOME_NOT_FOUND GT:SQ:AD:AF:F1R2:F2R1:DP:SB:MB 1:67.9:0,1333:1:0,695:0,638:1333:0,0,927,406:0,0,701,632
I will eventually need to repeat this process for a variety of Influenza strains (many of which do not have pre-existing databases).
I am unsure why this is occurring and wondered if anyone could help me troubleshoot this issue. Any help would be hugely appreciated!