No correspondence between coordinates -- Dante, Protein Domains, Repeat Explorer

Using DANTE and the Protein Domains Filter, I noticed that the coordinates of the same domain sometimes do not correspond exactly between the GFF and FASTA files. What is the reason for this?


Hi @m_ventimiglia

Correct, these are not always expected to be an exact match. For this specific example, my guess is that the stop codon is being omitted from the fasta, but you could investigate that by examining the region in a genomic browser.

Quote from the Protein Domains Filter tool form:

OUTPUTS PRODUCED:

  1. Filtered GFF3 file
  2. Translated protein sequences of the filtered domains regions of original DNA sequence in fasta format

Translated sequences are taken from the best alignment (Best_Hit attribute) within a domain region, however this alignment does not necessarily have to cover the whole region reported as a domain in gff file
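If you want to check how far apart the two are for a given domain, here is a rough Python sketch, not anything taken from the tool itself. It compares the length of the region reported in the GFF3 line with the DNA span implied by the translated sequence (three bases per residue). The function names, the example GFF3 line, and the example protein are all made up for illustration, and it assumes every character in the FASTA sequence stands for one codon; if the sequence contains gap or frameshift symbols the numbers will only be approximate.

```python
def parse_gff3_line(line):
    """Split one GFF3 feature line into (seqid, start, end, strand, attributes)."""
    fields = line.rstrip("\n").split("\t")
    seqid, start, end, strand = fields[0], int(fields[3]), int(fields[4]), fields[6]
    # Column 9 holds semicolon-separated key=value pairs.
    attrs = dict(item.split("=", 1) for item in fields[8].split(";") if "=" in item)
    return seqid, start, end, strand, attrs


def coverage_gap(start, end, protein_seq):
    """Return (region_bp, aligned_bp, difference) for one domain.

    Assumption: each character of protein_seq corresponds to exactly one codon.
    """
    region_bp = end - start + 1       # GFF3 coordinates are 1-based and inclusive
    aligned_bp = 3 * len(protein_seq)  # DNA span implied by the translated sequence
    return region_bp, aligned_bp, region_bp - aligned_bp


if __name__ == "__main__":
    # Hypothetical GFF3 line and protein, for illustration only.
    gff_line = "chr1\tdante\tprotein_domain\t1001\t1300\t.\t+\t.\tName=RT"
    seqid, start, end, strand, attrs = parse_gff3_line(gff_line)
    protein = "M" * 95
    print(coverage_gap(start, end, protein))  # (300, 285, 15)
```

A difference of exactly 3 bp would be consistent with an omitted stop codon. A larger difference suggests that the best alignment simply does not span the whole reported region.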

To learn more about how this tool suite functions, including Galaxy usage, please see:

Thanks!
