Require Assistance with RNA-Seq Analysis Pipeline for Studying Differential Expression

Hello Everyone :hugs:,

I am currently working on an RNA-Seq project and am relatively new to the Galaxy platform. My aim is to identify gene expression differences between two sample groups (treatment and control), but I’m running into several problems along the way. I would be very grateful for any advice or recommendations from the community.

I’ve taken the following actions thus far:

  • Data Upload: My paired-end FASTQ files have been successfully uploaded to Galaxy.
  • Quality Control: I trimmed with Trimmomatic to remove low-quality reads and adapters, and I used FastQC for quality checks.
  • Mapping: I used HISAT2 to align the trimmed reads to the reference genome.
  • Quantification: I used featureCounts to produce the gene-level count matrix.

Although I was able to finish these steps, I am now stuck on the downstream differential expression analysis.

I’m having issues with the following in particular:

Normalisation: I’m not sure how to use Galaxy to apply the normalisation approach (such as TMM, RPKM, or TPM) that is best suited for my dataset.

Differential Expression Analysis: I’m having trouble figuring out how to use Galaxy’s DESeq2 or edgeR to find differentially expressed genes. Any advice on configuring the parameters properly would be greatly appreciated.

Visualisation: To visualise the results, I would like to generate plots (such as heatmaps, MA plots, and volcano plots), but I’m not certain which Galaxy tools are appropriate for this.

I also checked this :point_right:

Could someone please point me to a detailed tutorial or a step-by-step guide? :thinking:

Thank you :pray: in advance.

Hi @dosavator12
Check the Transcriptomics tutorial "Hands-on: 2: RNA-seq counts to genes".
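
On the normalisation question: as far as I know, DESeq2 and edgeR both expect the raw featureCounts counts and apply their own between-sample normalisation (median-of-ratios and TMM, respectively), so RPKM or TPM values are mainly useful for exploration and plotting rather than as input to those tools. For intuition about what the simpler within-sample measures compute, here is a minimal Python sketch of CPM and TPM for a single sample. The gene names, counts, and lengths are made up for illustration, and `cpm` and `tpm` are hypothetical helper functions, not part of any Galaxy tool.

```python
def cpm(counts):
    """Counts per million: scale each gene's count by the library size."""
    total = sum(counts.values())  # library size; assumed to be > 0
    return {gene: c / total * 1e6 for gene, c in counts.items()}


def tpm(counts, lengths_bp):
    """Transcripts per million: divide by gene length first, then rescale."""
    # Reads per kilobase of gene length.
    rate = {gene: counts[gene] / (lengths_bp[gene] / 1000) for gene in counts}
    scale = sum(rate.values())  # assumed to be > 0
    return {gene: r / scale * 1e6 for gene, r in rate.items()}


# Toy data for one sample (invented values, not from a real experiment).
counts = {"geneA": 100, "geneB": 300, "geneC": 600}
lengths_bp = {"geneA": 1000, "geneB": 3000, "geneC": 2000}

sample_cpm = cpm(counts)
sample_tpm = tpm(counts, lengths_bp)

for gene in counts:
    print(gene, round(sample_cpm[gene], 1), round(sample_tpm[gene], 1))
```

In this toy sample geneA and geneB have different raw counts but the same read density per kilobase, so they get different CPM values and the same TPM value. Neither measure corrects for composition differences between samples, which is what TMM and median-of-ratios are designed for.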
Hope that helps.
Kind regards,
