Hi,
I am trying to determine whether certain SNPs obtained through GBS are synonymous or non-synonymous mutations, and what effect they have on the sequences. To do this, I am trying to build a SnpEff database for Vaccinium meridionale, using a .fa genome file and a .gff annotation file as input.
However, the process fails after approximately two hours of execution. The error message generated is as follows:
*** Total: 1079584 markers added.
00:00:10 Create exons from CDS (if needed):
00:00:10 Exons created for 0 transcripts.
00:00:10 Deleting redundant exons (if needed):
00:00:11 Total transcripts with deleted exons: 0
00:00:11 Collapsing zero length introns (if needed):
00:00:12 Total collapsed transcripts: 0
00:00:12 Reading sequences :
WARNING_CHROMOSOME_NOT_FOUND: Ignoring sequences for 'null'. Cannot find chromosome. File '/data/jwd05e/main/076/580/76580216/working/snpeff_output/Vaccinium_meridionale/genes.gff' line 1733010 '##FASTA'
00:00:12 Total: 0 sequences added, 0 sequences ignored.
It appears that the process is unable to locate chromosomes for the sequences specified in the GFF file.
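One way to test that reading is to compare the sequence names in the two input files. The sketch below is a minimal check, not part of SnpEff: the function names are hypothetical, and it assumes the genome file is plain FASTA and the annotation is tab-separated GFF3 with the sequence name in column 1. It also lists any sequence names found after a `##FASTA` line inside the GFF, since the warning points at that line.

```python
def _first_token(header):
    """Return the ID from a FASTA header line ('>id description'), or None."""
    tokens = header[1:].split()
    return tokens[0] if tokens else None


def fasta_ids(lines):
    """Collect sequence IDs from the header lines of a FASTA file."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            seq_id = _first_token(line)
            if seq_id:
                ids.add(seq_id)
    return ids


def gff_ids(lines):
    """Return (seqids from column 1, IDs from an embedded ##FASTA section)."""
    seqids, embedded = set(), set()
    in_fasta = False
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            in_fasta = True  # everything after this line is sequence data
            continue
        if in_fasta:
            if line.startswith(">"):
                seq_id = _first_token(line)
                if seq_id:
                    embedded.add(seq_id)
            continue
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        seqids.add(line.split("\t", 1)[0])
    return seqids, embedded


def compare(fasta_lines, gff_lines):
    """Report names that appear in only one of the two files."""
    genome = fasta_ids(fasta_lines)
    seqids, embedded = gff_ids(gff_lines)
    return {
        "gff_only": sorted(seqids - genome),
        "fasta_only": sorted(genome - seqids),
        "embedded_fasta_ids": sorted(embedded),
    }
```

Calling `compare(open("genome.fa"), open("genes.gff"))` (placeholder file names) returns the three lists. A non-empty `gff_only` list would mean the annotation refers to scaffold names that the genome file does not contain, which only shows a naming mismatch and does not fix anything by itself.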
Additional Context
- The genome file consists of scaffolds rather than assembled chromosomes.
- The GFF file is formatted according to the GFF3 standard, with features such as gene, mRNA, and CDS properly defined.
Could you help me understand the root cause of this issue? Specifically:
- Is the problem related to the use of scaffolds instead of chromosomes?
- Are there additional steps or adjustments needed to ensure compatibility with SnpEff?
I appreciate any guidance or recommendations you can provide to resolve this issue.
Thank you very much for your help!