I uploaded my .fna file to predict optimum growth temperature with Growthpred, but I get an error saying my sequences are not multiples of three, and then the program doesn't run. I ran a Python script to check my sequences: their lengths are multiples of three and they have correct start and stop codons.
Does anyone know what to do in this situation? I see a similar query from 2 years ago, where it was suggested to check the sequences. I have done that and they are multiples of three. Thanks.
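For anyone reading along, a check like the one described above could look roughly like the sketch below. This is not the script used in this thread and not Growthpred's own validation; the function names `read_fasta` and `check_cds` are made up for illustration, and the start codon set (ATG, GTG, TTG) is an assumption based on common bacterial start codons.

```python
# Hypothetical helper names; a minimal sketch, not Growthpred's own checks.
STOP_CODONS = {"TAA", "TAG", "TGA"}
# Assumption: common bacterial start codons. Adjust for your genetic code.
START_CODONS = {"ATG", "GTG", "TTG"}


def read_fasta(lines):
    """Yield (identifier, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            parts = line[1:].split()
            header, chunks = (parts[0] if parts else ""), []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def check_cds(seq):
    """Return a list of problems found in one coding sequence."""
    seq = seq.upper()
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length not multiple of 3")
    # Split into complete codons only, ignoring any trailing 1-2 bases.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    if not codons or codons[0] not in START_CODONS:
        problems.append("no start codon")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("no stop codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("internal stop")
    return problems
```

To use it on a file, something like `for name, seq in read_fasta(open("genes.fna")): print(name, check_cds(seq))` would list the problems per record (the file name here is a placeholder). Note that a protein FASTA run through this kind of check fails on nearly every record, which is itself a useful hint.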
Are you running a tool at a public Galaxy server? If yes, or you can reproduce the problem at a public server, you can share back your history and we can try to help troubleshoot what is going wrong. See the forum banner for how to do that, or here directly → How to get faster help with your question
I have also been trying to run growthpred as a downloaded Python script in Jupyter, which is where the .faa file in the link above came from, and I get the same errors:
ERROR: sequence length not multiple of 3: TVZ.WA3_42_TVZ.WA3_42_NODE_35_length_246808_cov_3.342252_66
ERROR: internal stop in sequence: TVZ.WA3_42_TVZ.WA3_42_NODE_35_length_246808_cov_3.342252_99
The query FASTA appears to be protein sequence, not nucleotide sequence. This tool needs nucleotide input because translating the codons into protein amino acid residues is part of what it does.
To be clear: what you have loaded into this history is an .faa (FASTA amino acid) file, not an .fna (FASTA nucleotide) file.
The nucleotide sequences need to be “in a multiple of three” for that translation. Nucleotide input is likely what you supplied the last time it worked.
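If it helps, here is a rough way to tell the two file types apart before submitting. It is only a heuristic sketch: the function name `looks_like_nucleotide` and the 0.9 threshold are assumptions for illustration, not anything the tool itself uses.

```python
# Hypothetical helper; a heuristic sketch, not part of Growthpred or Galaxy.
NUCLEOTIDE_LETTERS = set("ACGTUN")


def looks_like_nucleotide(seq, threshold=0.9):
    """Guess whether a sequence is DNA/RNA from its letter composition.

    The threshold is an assumed cutoff: the fraction of letters that
    must be A, C, G, T, U or N for the sequence to count as nucleotide.
    """
    residues = [c for c in seq.upper() if c.isalpha()]
    if not residues:
        return False
    hits = sum(c in NUCLEOTIDE_LETTERS for c in residues)
    return hits / len(residues) >= threshold
```

Protein sequences use many letters outside that small set (M, K, L, V and so on), so they fall well below the cutoff, while real coding sequences are made up almost entirely of A, C, G and T.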
And, I agree that this is a bit confusing! But maybe my comments help? Let us know!
Quote from tool form
How does it work ?
The program takes a set of DNA protein coding sequences, searches for ribosomal protein coding genes and translates all genes into proteins. The codon usage in ribosomal protein coding genes and the other genes allows to build a set of codon usage indexes. The amino acid composition is used to predict optimal growth temperature. The codon usage biases and optimal growth temperature are then used to predict optimal growth rates.
I’m not sure why I uploaded a .faa file, no wonder it didn’t work. I’ve tried again with my own ribosomal.fna file and my sample .fna file and it has worked! I checked the boxes to ignore the start and stop codons, then ran it again without ignoring them and it gave the error I’ve been getting all along.
Interesting to know: it turns out I need to add the flags -s and -S, which ignore the start and stop codons, when running the Python script on my system. I can also use it here on the Galaxy website now as a backup option. Thanks for the help!