Hi everyone,
I’ve been following the mothur tutorial for my metagenomics analysis, and I’m having trouble with the filter.seqs step in the data cleaning after sequence alignment.
When I use the same settings as the tutorial, all of the positions in my alignment are removed, so my fasta file is an empty list of sequences.

Removing the “trump: .” setting retains my sequences, but I am confused about why my filtered alignment is still quite long.

I think it might be an issue with the alignment, as I wasn’t sure about the database I used, but the outputs for all the steps up to this point look as I expect, so I’m not sure.
Does anyone have any guidance on the outputs I’ve been getting, and would it be okay to proceed with the long filtered alignment? Thanks in advance for your help!
Hi @ameliak
I have a completed history here, with the tutorial data run through the tutorial’s workflow. Maybe you can compare the usage and notice where things are different?
This is the filter step to help you navigate the history.
These are the statistics from the log.
Length of filtered alignment: 376
Number of columns removed: 13049
Length of the original alignment: 13425
Number of sequences used to construct filter: 16298
As a pure guess, the filter seems to be based on the number of 16S reads that are used with this pipeline, not the smaller list of assemblies from the earlier steps (mothur align.seqs output).
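On the trump question, here is a rough Python sketch of how I understand the two filter.seqs settings: `vertical` drops a column only when every sequence has a gap there, while `trump=.` drops a column when any single sequence has a `.` there. This is a toy model and not mothur's actual code; the function name `filter_columns` and the sample sequences are made up for illustration.

```python
def filter_columns(seqs, vertical=True, trump=None):
    """Toy model of column filtering on aligned sequences (not mothur's code)."""
    length = len(seqs[0])
    assert all(len(s) == length for s in seqs), "aligned sequences must share one length"
    keep = []
    for i in range(length):
        column = [s[i] for s in seqs]
        # trump: drop the column if ANY sequence has the trump character here
        if trump is not None and trump in column:
            continue
        # vertical: drop the column if EVERY sequence has a gap character here
        if vertical and all(c in "-." for c in column):
            continue
        keep.append(i)
    return ["".join(s[i] for i in keep) for s in seqs]


# Hypothetical toy alignment: '.' pads the ends where a read has no data,
# '-' marks an internal gap.
overlapping = [
    "..ACG-TA..",
    "...CGGTAC.",
    "..ACG-TA..",
]
# The same reads plus one that aligned to a different region of the reference.
off_target = overlapping + ["ACG......."]

for name, seqs in [("overlapping", overlapping), ("off_target", off_target)]:
    with_trump = filter_columns(seqs, trump=".")
    vertical_only = filter_columns(seqs)
    print(name, "trump:", len(with_trump[0]), "vertical only:", len(vertical_only[0]))
```

In this toy case a single read that does not overlap the others is enough for the trump setting to remove every column, and with vertical filtering alone the filtered alignment ends up longer than any individual read because it spans every position covered by at least one read. If your data behaves the same way, it may be worth checking the start and end positions in the summary.seqs output after alignment.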
Hope this helps! 
Thanks for your help!
I’m still a bit confused about why all the columns are removed when I use the trump character, as this seemed like a standard selection from the tutorial.
I also don’t really understand why my filtered alignment is much longer than the amplicon size, but if that could simply be because I have many more samples than the tutorial, I guess that should be okay.
Thanks for the history, it seems to match well with what I have done so I will try proceeding to the next steps!