Differential Analysis with EdgeR

12 hours ago by mlai2567

Hi all,

I’m trying to analyze an RNA-seq dataset that was just run. It consists of 3 conditions (Control, Treatment 1, Treatment 2), each with 2 biological replicates (Cell line #1, Cell line #2). However, because the biological replicates come from cell lines of different passage number, the baseline level of gene expression varies, even within the controls, which makes differential expression analysis across the replicates difficult. For example, the expression value for the control of Cell line #1 may be 1, while the control of Cell line #2 may be 2. The expression values of the 2 treatments shift accordingly, so edgeR finds only a low number of differentially expressed genes with FDR < 0.05.

Because of this, I’m trying to measure differential expression within each cell line individually. I have three questions about edgeR on Galaxy:

  1. Is there a way to measure differential expression with data that has no replicates?
  2. Is there a way to measure only fold change (and not FDR/p-value)?
  3. I’ve tried inputting the data as replicates and received abnormal logFC values of -200 to -300. Does anyone know why this might be the case?

Thanks in advance.

Cheers, Michael

Can you add an additional factor for cell line passage number? That is, is your setup like the table below, with the controls and treatments coming from the same passage? (Does the MDS/PCA plot show the samples separating by passage?) That would be better than trying to analyse each replicate separately. If you were going to input a factor file to the edgeR tool, you could have something like:

SampleID Group Passage
sample1 control p1
sample2 treatment1 p1
sample3 treatment2 p1
sample4 control p2
sample5 treatment1 p2
sample6 treatment2 p2
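To illustrate what adding Passage as a second factor buys you, here is a minimal Python sketch of the additive design (intercept + passage + group) that such a factor table implies. It is only an illustration written with NumPy: the function name `design_matrix` and the choice of `control`/`p1` as reference levels are assumptions made here, and the Galaxy edgeR tool builds its own model internally.

```python
import numpy as np

# Sample annotations copied from the factor table above: (SampleID, Group, Passage).
samples = [
    ("sample1", "control", "p1"),
    ("sample2", "treatment1", "p1"),
    ("sample3", "treatment2", "p1"),
    ("sample4", "control", "p2"),
    ("sample5", "treatment1", "p2"),
    ("sample6", "treatment2", "p2"),
]


def design_matrix(samples, group_ref="control", passage_ref="p1"):
    """Build an additive design: intercept + passage + group.

    Hypothetical helper for illustration only; the reference levels
    (control, p1) are an assumption, not something the tool requires.
    """
    groups = sorted({g for _, g, _ in samples if g != group_ref})
    passages = sorted({p for _, _, p in samples if p != passage_ref})
    columns = ["intercept"] + passages + groups
    rows = []
    for _, group, passage in samples:
        row = [1.0]
        row += [1.0 if passage == p else 0.0 for p in passages]
        row += [1.0 if group == g else 0.0 for g in groups]
        rows.append(row)
    return columns, np.array(rows)


columns, X = design_matrix(samples)
rank = int(np.linalg.matrix_rank(X))
# Degrees of freedom left over for estimating variability.
residual_df = X.shape[0] - rank

print(columns)
print(X)
print("rank:", rank, "residual df:", residual_df)
```

With 6 samples and 4 coefficients the design is full rank and leaves 2 residual degrees of freedom, so the passage baseline is absorbed by its own column while the treatment effects are still estimated from both passages together.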

For your 2nd question: if you have to, you could calculate fold changes between samples yourself using the normalised counts that the edgeR tool can output, but you’d have no P values.
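As a rough sketch of that manual calculation, the Python below computes a log2 fold change for each treatment against the control from the same passage. The count values are invented purely for illustration (they are not from the poster's data), the function name `within_passage_log2fc` is hypothetical, and the pseudocount of 1 is an assumption added here to avoid dividing by zero; it is not necessarily what edgeR does internally.

```python
import math

# Hypothetical normalised counts for ONE gene; made-up numbers, not real data.
normalised = {
    "sample1": 99.0,   # control,    p1
    "sample2": 399.0,  # treatment1, p1
    "sample3": 24.0,   # treatment2, p1
    "sample4": 199.0,  # control,    p2
    "sample5": 799.0,  # treatment1, p2
    "sample6": 49.0,   # treatment2, p2
}

# SampleID -> (Group, Passage), following the factor table above.
annotation = {
    "sample1": ("control", "p1"),
    "sample2": ("treatment1", "p1"),
    "sample3": ("treatment2", "p1"),
    "sample4": ("control", "p2"),
    "sample5": ("treatment1", "p2"),
    "sample6": ("treatment2", "p2"),
}


def within_passage_log2fc(normalised, annotation, reference="control", pseudocount=1.0):
    """log2 fold change of each treatment vs the control of the same passage.

    The pseudocount is an assumption of this sketch; it keeps the ratio
    finite when a count is zero.
    """
    # Locate the control sample for each passage.
    controls = {
        passage: sample
        for sample, (group, passage) in annotation.items()
        if group == reference
    }
    result = {}
    for sample, (group, passage) in annotation.items():
        if group == reference:
            continue
        treated = normalised[sample] + pseudocount
        baseline = normalised[controls[passage]] + pseudocount
        result[(passage, group)] = math.log2(treated / baseline)
    return result


log2fc = within_passage_log2fc(normalised, annotation)
for key, value in sorted(log2fc.items()):
    print(key, round(value, 3))
```

In this made-up example the p2 baseline is about twice the p1 baseline, yet the within-passage fold changes agree, which is the situation described in the question. These are plain ratios with no dispersion estimate behind them, so they carry no P values or FDR.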

For your 3rd question: what do the counts for the individual samples look like? Large fold changes are possible; see here for some info that might be helpful: https://www.biostars.org/p/276527/
