Hi @Liulu
This type of error means that in your “fasta” of species’ genomes, one or more genomes has a different number of values than the others.
This file contains genome entries where the gene count is “equal” across all of them. Each genome has four values, and each value represents a gene.
> genome1
A B C D
> genome2
A C B D
> genome3
A -C -B D
> genome4
A -C B D
And this file wouldn’t be equal – because genome3 is missing some values. The others have 4 values, and genome3 only has 2 values. For some reason C and B didn’t get added.
> genome1
A B C D
> genome2
A C B D
> genome3
A D
> genome4
A -C B D
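If you want to check this outside of Galaxy, here is a minimal Python sketch of the same comparison. It assumes the layout shown in the examples above (a `>` header line followed by whitespace-separated values), and the function names `count_values_per_genome` and `find_mismatches` are made up for illustration; they are not part of any Galaxy tool.

```python
from collections import Counter


def count_values_per_genome(text):
    """Return {genome_name: number_of_values} for fasta-like text.

    Assumes each record starts with a '>' header line and that the
    values on the following line(s) are separated by whitespace.
    """
    counts = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:].strip()
            counts[name] = 0
        elif name is not None:
            counts[name] += len(line.split())
    return counts


def find_mismatches(counts):
    """Return the genomes whose count differs from the most common count."""
    if not counts:
        return {}
    expected, _ = Counter(counts.values()).most_common(1)[0]
    return {name: n for name, n in counts.items() if n != expected}


# The second example from this post, where genome3 is short.
example = """> genome1
A B C D
> genome2
A C B D
> genome3
A D
> genome4
A -C B D
"""

counts = count_values_per_genome(example)
print(counts)                   # {'genome1': 4, 'genome2': 4, 'genome3': 2, 'genome4': 4}
print(find_mismatches(counts))  # {'genome3': 2}
```

Any genome reported by `find_mismatches` is a good place to start looking for where values were dropped.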
Other format problems can also trigger this error, including a fasta file that was truncated. You can sometimes spot truncation by looking at the end of the file with the tool Select last lines from a dataset. Other times, just uploading the file again is enough, since the original upload may have been interrupted.
Tools that work with fasta files might not work with this particular type of fasta file, but you can still use other Text Manipulation tools. Maybe you did some manipulations already? Can you see where the problem was introduced? We also have some tutorials that explain how these file parsing tools work, in case that helps.
Please give that a try! If you get stuck, you can generate a share link to your history and post it back for feedback. It is hard to guess more without seeing your data.
Let’s start there, thanks!