Is there a tool on usegalaxy that can help me retrieve reads from a FASTA file (8 GB) based on a list of read identifiers (~150)? The list of identifiers is in a text file, but I can convert it to FASTA or another format. I’d then run the selected reads through BLAST for identification.
Hi @dacosta,
To extract a set of reads based on a list of identifiers, you can use the seq_filter_by_id tool. Before running it, you will need to set the datatype of the text file to tabular.
Thank you so much for your reply @gallardoalba. I might be mistaken, but it looks like I’d need to run this tool with Python. Thanks in part to your reply, I was able to find the filter fasta tool, which does the same thing but is available online in usegalaxy.
Yes, you can find that tool in Galaxy as Filter sequences by ID (if you look for seq_filter_by_id in the tool search bar you will get the same result).
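For anyone who would rather do the same filtering outside Galaxy, here is a minimal Python sketch. It is not the code behind the Galaxy tool. The function names `load_ids` and `filter_fasta` are made up for illustration, and it assumes the identifier list has one ID per line (first column) and that each read's ID is the first whitespace-delimited token of its FASTA header. It streams the file line by line, so a large FASTA file is never loaded into memory at once.

```python
import sys


def load_ids(lines):
    """Collect identifiers from an iterable of text lines.

    Assumption: the ID is the first whitespace-delimited column;
    a leading '>' is tolerated and blank lines are skipped.
    """
    ids = set()
    for line in lines:
        fields = line.split()
        if fields:
            ids.add(fields[0].lstrip(">"))
    return ids


def filter_fasta(lines, wanted):
    """Yield every line of each FASTA record whose ID is in `wanted`."""
    keep = False
    for line in lines:
        if line.startswith(">"):
            # The record ID is the first token after '>' in the header.
            header = line[1:].split()
            keep = bool(header) and header[0] in wanted
        if keep:
            yield line


# Only runs when called as a script with exactly two arguments.
if __name__ == "__main__" and len(sys.argv) == 3:
    ids_path, fasta_path = sys.argv[1], sys.argv[2]
    with open(ids_path) as ids_file:
        wanted_ids = load_ids(ids_file)
    with open(fasta_path) as fasta_file:
        sys.stdout.writelines(filter_fasta(fasta_file, wanted_ids))
```

Saved under a hypothetical name such as filter_reads.py, it could be run as `python filter_reads.py ids.txt reads.fasta > selected.fasta`, where the file names are placeholders. The resulting FASTA file can then be submitted to BLAST.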