Hi @Gabriella_Quinn
Yes, if the coverage value is somewhere in a line of text data, it can be parsed out and/or filtered on.
The tool you mention is a good choice, but it isn’t installed at Galaxy Main https://usegalaxy.org. It is installed at Galaxy EU https://usegalaxy.eu. Your post is tagged usegalaxy.org, which may be a simple mistake. Please confirm which server you are using (by URL) and that this is the tool you intend to use:
- Filter FASTA on the headers and/or the sequences (Galaxy Version 2.1)
If you share an example of a few of your fasta title lines (the full “>” line for at least three fasta records) we can help you to construct a regular expression. If your contig fasta dataset is large (too large to scroll through to copy/paste out three title lines) you can filter out just the title lines with the tool/expression:
- Select lines that match an expression (Galaxy Version 1.0.1)
- using the regular expression
^>.
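For anyone who wants to check the expression outside of Galaxy first, here is a minimal Python sketch of what that selection does. It is not the Select tool's own code; the function name and the example records are made up for illustration, and your real title lines will look different.

```python
import re

# Same pattern as the expression above: a ">" at the start of the line
# followed by at least one more character.
TITLE_LINE = re.compile(r"^>.")


def select_title_lines(fasta_text):
    """Return only the FASTA title lines found in a block of FASTA text."""
    return [line for line in fasta_text.splitlines() if TITLE_LINE.search(line)]


# Hypothetical records, used only to show the behavior.
example = ">contig_1 example\nACGT\n>contig_2 example\nGGCC\nTTAA\n"

for title in select_title_lines(example):
    print(title)
```

Only the lines starting with ">" are kept; the sequence lines are dropped.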
The Select tool could also be used for the full fasta filtering, but that would require converting the data to tabular format first, then back to fasta again afterward. If you want to do it that way, just say so. The regular expression to use will be a bit different between the two tools.
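To make the tabular route more concrete, here is a rough Python sketch of the same idea: fasta to rows, filter on the title column, then back to fasta. It assumes the tabular form holds the title in the first column without the leading ">" and the sequence in the second column. The function names, the example records, and the `^keep` pattern are all hypothetical placeholders, not the expression you will end up using.

```python
import re


def fasta_to_rows(fasta_text):
    """Convert FASTA text to (title, sequence) rows, one row per record."""
    rows = []
    title, chunks = None, []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            if title is not None:
                rows.append((title, "".join(chunks)))
            # Assumption: the ">" is dropped in the tabular form.
            title, chunks = line[1:], []
        elif title is not None:
            chunks.append(line.strip())
    if title is not None:
        rows.append((title, "".join(chunks)))
    return rows


def filter_rows(rows, pattern):
    """Keep rows whose title column matches the regular expression."""
    regex = re.compile(pattern)
    return [row for row in rows if regex.search(row[0])]


def rows_to_fasta(rows):
    """Convert (title, sequence) rows back to FASTA text."""
    return "".join(f">{title}\n{seq}\n" for title, seq in rows)


# Hypothetical records and pattern, for illustration only.
example = ">keep_me one\nACGT\nACGT\n>drop_me two\nGGCC\n"
kept = filter_rows(fasta_to_rows(example), r"^keep")
print(rows_to_fasta(kept), end="")
```

Note that the pattern here has no ">" in it, because under the assumption above the title column no longer carries that character. That is one reason the expression differs between the two tools.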
Should you decide to post back a few of your fasta title lines, be sure to preserve the formatting by using the “block quote” format option (“quote” icon at the top of where you write your reply back to this post). That should be enough information, but if not, I’ll ask you to share a history link with me (privately) to review your exact data. Whitespace (spaces, tabs, etc.) can look the same after copy/paste, and the difference sometimes matters.
Let’s start there. Thanks!