From Blast to Tree

David_Prikryl · July 7, 2025, 1:07pm

Hi guys, new user here. I managed to run blastp on my sequence, but the output contains duplicates that I can't remove or work around, so Clustal gives me an error message. I tried the “Unique” tool on both the FASTA and tabular files, but nothing helped. The tool to merge multiple FASTA files and filter unique sequences also returns an error. What should I do? Thanks!

jennaj · July 10, 2025, 10:51pm

Hello @David_Prikryl

For these technical processing issues:

The Filter FASTA tool had a technical problem that should be resolved now. Please give it a try.
For concatenating multiple fasta files, put them all into a collection list folder, then change the datatype to uncompressed (pencil icon), and then run the Collapse Collection tool.

Beyond that, could you explain what you are trying to do? There may be a better way to get this done. Below is my guess at your workflow. I may be wrong in parts, but it gives us a starting point that you can correct.

1. Start with a protein sequence
2. Run BLASTp against a protein database to find homology-based hits
3. (you are having trouble here)
4. Run ClustalW on the best hits (unique sequences)
5. Run FASTTREE to create a tree

If this is correct, you could generate the tabular output (Step 2 above), filter it for significant hits (don’t skip this!), retrieve the protein sequences for those hits, run ClustalW, and then run FASTTREE.
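For anyone who wants to see the filtering step outside Galaxy, here is a rough Python sketch. It assumes the standard 12-column BLAST tabular layout, where column 2 is the subject ID and column 11 is the e-value. The function name `significant_unique_hits` and the `1e-5` cutoff are only illustrative choices, not something from this thread, so pick a threshold that suits your data.

```python
def significant_unique_hits(tabular_text, max_evalue=1e-5):
    """Return subject IDs that pass the e-value cutoff, in first-seen order, without repeats.

    Assumes standard BLAST tabular columns: sseqid is column 2, evalue is column 11.
    The default cutoff is an illustrative assumption.
    """
    seen = set()
    hits = []
    for line in tabular_text.splitlines():
        # Skip blank lines and comment lines
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        sseqid = fields[1]
        evalue = float(fields[10])
        # A subject can appear several times (multiple HSPs); keep it once
        if evalue <= max_evalue and sseqid not in seen:
            seen.add(sseqid)
            hits.append(sseqid)
    return hits


# Made-up rows for illustration only
example = (
    "query1\tsp|P1\t98.0\t200\t4\t0\t1\t200\t1\t200\t1e-100\t400\n"
    "query1\tsp|P1\t90.0\t50\t5\t0\t1\t50\t300\t350\t1e-10\t80\n"
    "query1\tsp|P2\t35.0\t60\t39\t0\t1\t60\t1\t60\t0.5\t30\n"
    "query1\tsp|P3\t70.0\t180\t54\t0\t1\t180\t1\t180\t2e-40\t250\n"
)
print(significant_unique_hits(example))  # ['sp|P1', 'sp|P3']
```

The same subject often shows up on several rows because BLAST reports each alignment separately, which is one common source of duplicate identifiers further down the workflow.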

You want the entire protein sequence for the hits, yes? With the hit identifiers isolated into a unique list, you could get the entire protein for each with → NCBI BLAST+ blastdbcmd entry(s) Extract sequence(s) from BLAST database. This would give you a file of unique FASTA sequences to use with ClustalW, sourced from the same database that you searched against.
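If a FASTA file still ends up with repeated identifiers, the idea of keeping one record per ID can be sketched in Python as below. This is only an illustration and not what any Galaxy tool does internally. The function name `dedupe_fasta` is hypothetical, and it assumes the identifier is the first whitespace-delimited word of each header line.

```python
def dedupe_fasta(fasta_text):
    """Keep only the first record for each identifier.

    Assumption: the identifier is the first word after '>' on the header line.
    """
    seen = set()
    kept = []
    keep_current = False  # lines before the first header are dropped
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split()
            seq_id = parts[0] if parts else ""
            # Keep this record only if its ID has not been seen before
            keep_current = seq_id not in seen
            seen.add(seq_id)
        if keep_current:
            kept.append(line)
    return "".join(l + "\n" for l in kept)


# Made-up sequences for illustration only
fasta = ">P1 desc\nMKV\nLLA\n>P2\nMKT\n>P1 again\nMKV\nLLA\n"
print(dedupe_fasta(fasta), end="")
```

Note that this compares identifiers only, so two records with different IDs but identical sequences would both be kept.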

Please review and let me know which of my guesses are correct and which are not, and I’ll try to help more.
