Hello, I am new to Galaxy.
I want to use the Collapse tool to get the non-redundant sequences in a FASTA file, along with the count for each unique sequence, but the numbers are confusing to me. My FASTA file has about 377K reads, and the Collapse tool outputs about 44K sequences. However, the total number of reads is over 37 million! How could that be?
I hope someone who is familiar with this tool can explain this to me. Many thanks!
Welcome @fshapt
That sounds really strange, I agree! Would you like to share the history or, maybe better for this, post some screenshots to show where the odd numbers are located? I’m wondering if that number is a base count rather than a read count, maybe from some other file parsing (?), but we can try to help figure out what the numbers you are seeing mean.
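One quick way to test the base-count idea: 37 million divided by 377K is roughly 100, so if the reads are around 100 bases long, the large number could simply be the total number of bases. The read length here is not known, so this is only a guess. Below is a minimal Python sketch, not part of Galaxy or the Collapse tool, that counts both reads and bases in FASTA-formatted text so the two numbers can be compared. The function name `fasta_stats` and the example records are made up for illustration.

```python
def fasta_stats(lines):
    """Return (read_count, base_count) for an iterable of FASTA lines.

    Hypothetical helper for illustration only: header lines start with '>',
    every other non-empty line is treated as sequence.
    """
    reads = 0
    bases = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            reads += 1  # one header line per read
        else:
            bases += len(line)  # sequence may wrap over several lines
    return reads, bases


# Made-up example: two reads, the second wrapped over two lines.
example = [">read1", "ACGT", ">read2", "ACGTAC", "GT"]
reads, bases = fasta_stats(example)
print(reads, bases)
```

To run it on a real file, download the dataset and pass an open file handle instead of the example list, for instance `fasta_stats(open("my_reads.fasta"))`, where the file name is a placeholder.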
The tool I’m thinking of is this one:
- Collapse sequences – link to a public server to confirm which exact tool we are discussing. This one should be available on most servers, so just confirm the top part of the form.
Let’s start there, thanks! And if you have already figured this out, please let us know.