In StringTie merge, do we merge the replicates of the treatment samples?
In short, the answer is usually “yes”. The tool can always do this, but whether you want to depends on the details of your analysis.
How you use the tool, and which inputs you include (per-sample GTFs and/or the known reference GTF), depends on whether you plan to run the DE analysis on known transcripts/genes only, on discovered (sample) transcripts/genes only, or on both.
StringTie merge can be used to do the following:
- Pre-process an existing reference annotation GTF dataset, by itself, so it can be used with StringTie to guide assembly, and with other downstream tools. Not all public GTFs are formatted in a way other tools can interpret, and this tool can be used to “groom” a GTF.
- Combine the per-sample GTF results of StringTie, and optionally the reference GTF (knowns), into a unified GTF assembly that can be used with downstream differential expression tools.
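If it helps to see what the second use case looks like outside of the Galaxy tool form, here is a minimal Python sketch that only builds (and prints) a command-line `stringtie --merge` call. It assumes the standalone StringTie options `-G` (reference annotation) and `-o` (output GTF); the function name `build_merge_command` and all file names are hypothetical, made up for illustration. All replicates from every condition go in as per-sample GTFs, and the reference GTF is optional.

```python
import shlex


def build_merge_command(sample_gtfs, reference_gtf=None, output_gtf="merged.gtf"):
    """Build (but do not run) a `stringtie --merge` command line.

    Hypothetical helper for illustration only; it is not part of StringTie or Galaxy.
    """
    sample_gtfs = list(sample_gtfs)
    if not sample_gtfs:
        raise ValueError("at least one input GTF is required")
    cmd = ["stringtie", "--merge", "-o", output_gtf]
    if reference_gtf is not None:
        # -G supplies the known (reference) annotation alongside the sample GTFs
        cmd += ["-G", reference_gtf]
    # Every per-sample assembly is listed, replicates included
    cmd += sample_gtfs
    return cmd


# Hypothetical per-sample GTFs: two replicates each for control and treatment
samples = [
    "control_rep1.gtf",
    "control_rep2.gtf",
    "treated_rep1.gtf",
    "treated_rep2.gtf",
]

# Knowns plus discovered transcripts: include the reference GTF
print(shlex.join(build_merge_command(samples, reference_gtf="reference.gtf")))

# Discovered (sample) transcripts only: leave the reference GTF out
print(shlex.join(build_merge_command(samples)))
```

In Galaxy, the same choice is made by selecting which GTF datasets to give the tool and whether to supply a reference annotation, so there is no command to type.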
The different use cases are summarized in more detail in this prior Galaxy Biostars Q&A post: “StringTie and StringTie merge - when to apply the Guide gff (reference annotation file)?”
Please also see the StringTie tool manual.
It would probably also help to review the Galaxy tutorials for DE analysis. Several of the RNA-seq protocols include StringTie and, importantly, the HISAT2 settings needed to create StringTie-readable BAMs.
- Galaxy Tutorials for Galaxy Main: Start here: RNA-seq: Discovering and quantifying new transcripts - an in-depth transcriptome analysis example
- Galaxy Training Network Tutorials: Some GTN tutorials are appropriate for Galaxy Main and some are not. Where you can run each is noted per tutorial; click on the Galaxy instances gear icon to review the Public Galaxy server choices. If a tutorial is supported by a pre-configured Galaxy Docker training image, instructions for how to get it will be listed below the tutorial listings, per category.