Build expression matrix for a de novo assembly of RNA-Seq data by Trinity

Hi everyone!

After running “Align reads and estimate abundance on a de novo assembly of RNA-Seq data”, I’m trying to run “Build expression matrix for a de novo assembly of RNA-Seq data by Trinity”. One thing to note: because there are so many samples, I first concatenated them before running the Trinity assembly and Align reads and estimate abundance, so I have 2 samples (forward and reverse). Running Align reads and estimate abundance gave me 2 files: gene counts and isoform counts. I’m not sure which inputs I should use for Build expression matrix: I would use “gene counts” in Abundance estimates, and the Gene to transcripts map (from Trinity) in Gene to transcript correspondence (‘gene(tab)transcript’ lines). Is that correct? Thanks in advance!
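To make the two inputs concrete, here is a minimal Python sketch of what they look like conceptually. It is not the Galaxy tool or Trinity's own script. It assumes the map really is two tab-separated columns (gene, then transcript), as the ‘gene(tab)transcript’ label suggests, and that each sample's gene counts have already been loaded into a plain dictionary. The function names, identifiers, sample names, and counts are all hypothetical and only for illustration.

```python
import csv
import io
from collections import defaultdict


def read_gene_trans_map(handle):
    """Parse 'gene<TAB>transcript' lines into {gene: [transcript, ...]}."""
    mapping = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError(f"expected 2 tab-separated fields, got {row!r}")
        gene, transcript = row
        mapping[gene].append(transcript)
    return dict(mapping)


def build_matrix(sample_counts):
    """Combine {sample: {feature: count}} into a header plus one row per feature."""
    samples = sorted(sample_counts)
    features = sorted({f for counts in sample_counts.values() for f in counts})
    header = ["feature"] + samples
    # Features missing from a sample are filled with 0.0 (an assumption of this sketch).
    rows = [[f] + [sample_counts[s].get(f, 0.0) for s in samples] for f in features]
    return header, rows


# Hypothetical identifiers and counts, for illustration only.
example_map = io.StringIO(
    "TRINITY_DN1_c0_g1\tTRINITY_DN1_c0_g1_i1\n"
    "TRINITY_DN1_c0_g1\tTRINITY_DN1_c0_g1_i2\n"
)
gene_to_transcripts = read_gene_trans_map(example_map)

header, rows = build_matrix({
    "sampleA": {"TRINITY_DN1_c0_g1": 12.0},
    "sampleB": {"TRINITY_DN1_c0_g1": 7.0},
})
```

In this sketch the matrix rows are keyed by gene identifiers, which is why gene-level counts pair with a map whose first column is the gene.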


Welcome, @bio_code

What you are trying to do is very complicated (the processing part, not the data part). The tools are also being deprecated for other reasons. The replacement for Trinity is rnaSPAdes.

Also, for differential expression (DE) analysis, a full assembly is not needed. You would probably only need an actual assembly for variant calling, and rnaSPAdes will meet those needs fine.

Consider trying these methods instead for RNA-seq based de novo DE. Related tutorials in that same category have more examples. Each has a workflow! → Hands-on: De novo transcriptome reconstruction with RNA-Seq / Transcriptomics


Thanks @jennaj! I’ll try those methods.