In the tutorial "Building an amplicon sequence variant (ASV) table from 16S data using DADA2", could you please specify the steps for downloading the ASV table (similar to an OTU table) that will be used for further microbial community network analysis?
Thanks!
Welcome, @Ranjith_Kumar
You are working with this tutorial, correct?
Could you explain a bit more about what you would like to do? I’m not sure I understand. Thanks!
Thanks for your reply @jennaj
Yes, I am following the tutorial for 16S. At which step can I download the ASV table that will be used for microbial community network analysis?
Enclosed is a screenshot of the kind of table I am looking for from my analysis.
@jennaj I am curious to know if you have any update. Please provide your insights. Thanks!
The steps in that tutorial are what create the ASV table. There is an associated workflow that you could adapt for use with your own custom 16S read data.
These are the places to start:
- Run through the tutorial to see how it works with the example data.
- Then, attempt to run your own sample data through the same workflow.
One of the outputs will be a table like you posted. You can then use that for downstream analysis with other tools like Phyloseq.
I’m having trouble understanding what you have done already and which step you need help with. You can share back your work and the tutorial step you are at to help explain. How to do that is in the banner of this forum.
Hi @jennaj
That works for me. I will look into it. Thanks!
Hi @jennaj
I created a metadata file and uploaded it manually, but I am still wondering where I can find the ASV table that I mentioned before. Can you provide some information on that?
Hi @jennaj
Upon completion of the tutorial, I obtained two files: one resulting from the removal of chimeras and another from taxonomic assignment. I would like to ask whether it is appropriate to replace the gene sequence (column 1) with identifiers such as ASV1, ASV2, and so forth to generate an ASV table. Is this approach correct? Please see the screenshot of both tables. Thanks!
Hi @everyone, it would be helpful if anyone could provide insights on my query. Thanks!
These are the outputs from this step in the tutorial, correct?
The first table summarizes how the reads collapsed into unique sequences per sample. The second table summarizes the taxonomic classification of those collapsed sequences.
There is also metadata about each sample available. This is what is being constructed in this step of the tutorial, along with the merging of all three data files into a Phyloseq object.
- Hands-on: Building an amplicon sequence variant (ASV) table from 16S data using DADA2 / Microbiome (#exploration-of-asvs-with-phyloseq)
I guess you could think of that collapsed sequence (what you are calling a “gene sequence”) as a common key between these two tables.
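The "common key" idea can be sketched in base R. This is only a minimal illustration with made-up toy data, not the tutorial's actual output: the object names (`counts`, `taxonomy`, `asv_key`), the column names, and the sequences are all hypothetical stand-ins for the two tables in the screenshots.

```r
# Toy stand-ins for the two tutorial outputs (hypothetical values)
counts <- data.frame(
  Sequence = c("ACGTACGT", "TTGCAAGC", "GGCATTCA"),
  Sample1  = c(10, 0, 3),
  Sample2  = c(4, 7, 0),
  stringsAsFactors = FALSE
)
taxonomy <- data.frame(
  Sequence = c("GGCATTCA", "ACGTACGT", "TTGCAAGC"),  # deliberately in a different order
  Phylum   = c("Firmicutes", "Bacteroidota", "Proteobacteria"),
  stringsAsFactors = FALSE
)

# Build one lookup from sequence to ASV ID, following the count table order
asv_key <- data.frame(
  Sequence = counts$Sequence,
  ASV      = paste0("ASV", seq_along(counts$Sequence)),
  stringsAsFactors = FALSE
)

# Apply the same lookup to both tables by matching on the sequence
counts$ASV   <- asv_key$ASV[match(counts$Sequence, asv_key$Sequence)]
taxonomy$ASV <- asv_key$ASV[match(taxonomy$Sequence, asv_key$Sequence)]

# ASV table: ASV IDs plus per-sample counts, without the long sequence column
asv_table <- counts[, c("ASV", "Sample1", "Sample2")]

# Taxonomy table reordered to follow the ASV table
tax_by_asv <- taxonomy[match(asv_table$ASV, taxonomy$ASV), c("ASV", "Phylum")]
```

Matching on the sequence, rather than numbering each table separately, is what keeps ASV1 in one table pointing at the same sequence as ASV1 in the other. Keeping `asv_key` around also lets you map the short IDs back to the full sequences later.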
What is your final goal? Meaning, is there a tool you are preparing the data for? How do that tool's instructions describe the expected input data? You can share links to publications, etc., if that is what you are following.
And, let’s keep your question in here instead of all of the other resolved topics. I’m going to merge those back into here.
Thanks!
Hi @jennaj, thank you for the information. You are right. I also realized that I need to replace the gene sequence in both tables with ASV1, ASV2, … to get the sample data.
I am going to use the table to construct a microbial co-occurrence network. I am trying to make a figure like the one in this paper: https://pubs.acs.org/doi/10.1021/acs.est.4c00942
I would also like to know how I can download the raw data from Phyloseq. I am trying to download the raw data for alpha diversity, taxonomic classification, etc., but I am not sure where I can download it. Please let me know. Thanks again!
Ok, thanks for explaining the situation, I think I understand it now.
The Shiny-phyloseq application hosted at UseGalaxy.org allows downloads of the graphics it generates, but I don’t see a way of downloading the raw data tables behind the graphics (computed sub-slices of the original phyloseq object). Why? This isn’t really a single data table; it is an object with many layers, but maybe it can be flattened and output.
You might need to do this in R directly, or I might be missing some functionality in the Shiny implementation currently hosted, or maybe this is something that could be added (a sort of export back to Galaxy seems to be what you want?).
Let’s ask the scientists that work in this area for advice. I’ve cross-posted over to the forum linked at the top of the Microbiome tutorials. They will probably reply here but you could also join that chat for now or later. You're invited to talk on Matrix
Hi @jennaj, thank you so much for the detailed information and posting on microbiome. I am glad that we are finally on the same track. Sorry for the inconvenience. I will follow the post.
Thanks again!
Dear @Ranjith_Kumar,
A phyloseq object is just an R object that combines the OTU, taxonomy, and metadata tables (and optionally a tree and sequences). It exists because you always need those datasets together for ASV/OTU analysis, and it is difficult to work with them individually.
At the end of the tutorial, the step "Create phyloseq object from dada2" creates a phyloseq object. This object can be used in the shiny-phyloseq tool that I wrapped. That is just a simple app that exposes some phyloseq functions; it cannot do fancy operations! But you can also download the phyloseq object, import it into R, and use the phyloseq package to do customized operations. You can also load it into the interactive tool Jupyter Notebook.
Install phyloseq, along with any other packages you might need: mamba create --name myenvname bioconductor-phyloseq
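Once the object is loaded in R, the raw tables behind the plots can be written out as plain CSV files. The following is a minimal sketch, not a feature of the Galaxy tool: `export_phyloseq_tables` and the output file names are hypothetical, and it assumes the object downloaded from Galaxy can be read with `readRDS()`.

```r
library(phyloseq)

# Hypothetical helper: write the tables stored in a phyloseq object to CSV files
export_phyloseq_tables <- function(ps, out_dir = "phyloseq_export") {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

  # Count table as a plain matrix, oriented with taxa in rows
  counts <- as(otu_table(ps), "matrix")
  if (!taxa_are_rows(ps)) counts <- t(counts)
  write.csv(counts, file.path(out_dir, "asv_counts.csv"))

  # Taxonomy table, if the object has one
  tax <- tax_table(ps, errorIfNULL = FALSE)
  if (!is.null(tax)) {
    write.csv(as(tax, "matrix"), file.path(out_dir, "taxonomy.csv"))
  }

  # Alpha diversity per sample (observed richness and Shannon index)
  alpha <- estimate_richness(ps, measures = c("Observed", "Shannon"))
  write.csv(alpha, file.path(out_dir, "alpha_diversity.csv"))

  invisible(list(counts = counts, alpha = alpha))
}

# Usage, assuming the object was downloaded from Galaxy as an RDS file:
# ps <- readRDS("your_phyloseq_object.rds")
# export_phyloseq_tables(ps)
```

The sample metadata can be extracted in the same way with `sample_data(ps)` if the object contains it.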
A simple example code to create a co-occurrence network would be:
# Load the required libraries
library(phyloseq)
library(igraph)
library(vegan)
# Load your phyloseq object (replace the path with your own file)
ps <- readRDS("your_phyloseq_object.rds")
# Extract the ASV/OTU count table from the phyloseq object as a plain matrix
otu_mat <- as(otu_table(ps), "matrix")
# Make sure taxa are in rows, because vegdist() compares rows
if (!taxa_are_rows(ps)) otu_mat <- t(otu_mat)
# Convert counts to presence/absence (1 if present, 0 if absent)
otu_mat[otu_mat > 0] <- 1
# Calculate the Jaccard dissimilarity between taxa, based on the samples they occur in
jaccard_dist <- vegdist(otu_mat, method = "jaccard")
# Convert dissimilarity to similarity, so that high values mean frequent co-occurrence
co_occurrence_matrix <- 1 - as.matrix(jaccard_dist)
# Apply a threshold, e.g., only keep pairs with a similarity of at least 0.5
threshold <- 0.5
co_occurrence_matrix[co_occurrence_matrix < threshold] <- 0
# Create a weighted, undirected igraph object from the matrix
network_graph <- graph_from_adjacency_matrix(co_occurrence_matrix, mode = "undirected", weighted = TRUE, diag = FALSE)
# Visualize the network with igraph's plot method (you can customize this part)
plot(network_graph, vertex.size = 5, vertex.label.cex = 0.5, edge.width = 0.5, main = "Co-occurrence Network")
You can further adapt this to your needs.
If I find the time, I can also work on a Galaxy tool for this. But usually directly writing R code is simpler since many customizations might be needed.
Best, Paul