I am using the LotuS2 tool with AGRF data that comes from leaf litter samples and was sequenced for fungi using PacBio. The .gz files I am uploading already have the primers trimmed, since the sequencing targeted the region between specific primers. I am running these files through LotuS2 because the sequencing company doesn’t use tools specific for fungi.
This is for a university project and all of this is very new to me. However, it has been made clear to me that OTUs (operational taxonomic units) are preferred over ASVs.
When I work with phyloseq in R, there is an object called OTU table, but when I export my work from R to CSV, the column is labeled ASVs.
Your outputs are mirroring this content – an example is in dataset 60 LotuS2 on data 58, data 18, and others: OTU abundance matrix
The job’s parameters use CD-HIT for the clustering, which produces OTUs. If you had selected DADA2 instead, you would have ASVs.
What you are observing here:
This is probably just a labeling artifact. The upstream data reduction steps could produce either OTUs or ASVs, but once that is done, the data is treated (about) the same way: as a unique feature unit. Phyloseq itself won’t be transforming between the two clustering systems.
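To illustrate the labeling point, here is a minimal Python sketch. The sample names, feature IDs, and counts are made up, and `relabel_feature_column` is a hypothetical helper; this is not what LotuS2 or phyloseq runs internally. It only shows that changing the header label of the feature column leaves the features and their counts untouched.

```python
import csv
import io

# Hypothetical exported table: the first column header says "ASV",
# but the rows are whatever features the clustering step produced.
exported = "ASV,sample_A,sample_B\nOTU_1,120,4\nOTU_2,0,37\n"

def relabel_feature_column(csv_text, new_label):
    """Return the CSV text with only the first header cell replaced."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    rows[0][0] = new_label  # change the label, leave every data row alone
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()

relabeled = relabel_feature_column(exported, "OTU")

# Everything after the header line is identical before and after.
assert exported.split("\n")[1:] == relabeled.split("\n")[1:]
print(relabeled)
```

Whether the features are really OTUs or ASVs is decided by the clustering option chosen upstream, not by the column name in the export.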
Yes, after posting this I tried CD-HIT, but it came back as ASVs, just like DADA2 and VSEARCH.
Just to clarify: if I run this with VSEARCH and the other output files have OTUs but the OTU.txt lists ASVs, would that also likely be a labelling issue in the same way?
Meaning they are effectively OTUs that the pipeline just calls ASVs?
For any pipeline’s data that cycles through Phyloseq, the feature originally input is what you will be getting out again. It doesn’t do any transformations.