DESEQ2 - How does it work?

Sammy · April 22, 2021, 8:58pm

Hello,

I have been trying to use DESeq2 for a couple of days now. I have the output of featureCounts, and I have also used limma-voom successfully. However, I need the tables with the p-values and the fold changes in expression, so I am trying to get them via DESeq2. I tried to compare the control with each of the 4 mutants, but I think I got it wrong. I’m not sure how to import the files correctly, and I couldn’t find a proper explanation of the factors. Could you help, or give me an idea of where to start?

Flow · April 23, 2021, 8:28am

Dear Sammy,
OK, where to start. Well … first, here are the DESeq papers: DESeq2 and DESeq. Then I would recommend looking into one of the Galaxy training tutorials that uses DESeq2: Reference-based RNA-Seq data analysis. There is also a good Bioconductor post and a thorough document from the creator and author here. Ah, and for a visual and audio tutorial you can watch this DESeq2 series, which explains things quite intuitively.

From your explanation, I am guessing that you did not receive any p-values with DESeq2? If that is the case, then the reason is probably that you only have one replicate for either mutant or control, or both. DESeq2 needs at least two replicates to calculate a p-value, because the variance (dispersion) of each gene has to be estimated from the replicates.
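To illustrate why a single replicate is not enough, here is a minimal Python sketch. It is not DESeq2's actual model (DESeq2 fits a negative binomial model with its own dispersion estimation); it only shows that a plain sample variance cannot be computed from one value. The function name and the toy counts are made up for this illustration.

```python
import statistics


def sample_variance(values):
    """Unbiased sample variance; returns None when fewer than two values exist."""
    if len(values) < 2:
        # The denominator n - 1 would be 0, so the estimate is undefined.
        return None
    return statistics.variance(values)


# Hypothetical counts for one gene: a single replicate vs. a triplicate.
single = sample_variance([120])
triplicate = sample_variance([120, 135, 110])

print("one replicate:", single)
print("three replicates:", triplicate)
```

With one replicate there is nothing to measure the spread against, so no test statistic (and hence no p-value) can be formed for that gene.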

I hope I could help you and have a nice day!
Florian

Sammy · April 23, 2021, 1:05pm

Hi, Florian!

Thank you for your reply.

I followed this tutorial, except that I only have 1 factor with 5 factor levels. The error I get is:

Error in data.frame(…, check.names = FALSE) :
arguments imply differing number of rows: 173, 170
Calls: get_deseq_dataset … eval → eval → eval → cbind → cbind → data.frame

I have the data.frame, but I don’t know where to upload it, if that’s what the error means. I’m reading the same tutorials over and over again, and I’m thinking of trying to process the data in R directly.

P.S. I have triplicates.

Flow · April 23, 2021, 1:38pm

Hey Sammy,
The tool complains that your input files have different numbers of rows. Check the following two things:

(a) Do your files contain one or multiple header lines? If so, you have to remove them.
(b) Do your input files have different region/gene/… sets? That is, have you generated your count files from different annotations? The input files need to be consistent in the region/gene/… set.
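Both checks can be done outside Galaxy before uploading. The following is a rough Python sketch, assuming tab-separated count files with the gene ID in the first column and an integer count in the last column. The function names and the toy data are hypothetical, and real files would be read with `open(path)` instead of the inline lists.

```python
def inspect_counts(lines):
    """Split a tab-separated count table into gene IDs and suspected header rows.

    Assumption: a data row has a whole number in its last column;
    anything else is treated as a header or comment line.
    """
    gene_ids, suspected_headers = [], []
    for raw in lines:
        row = raw.rstrip("\r\n")
        if not row:
            continue
        fields = row.split("\t")
        if fields[-1].isdigit():
            gene_ids.append(fields[0])
        else:
            suspected_headers.append(row)
    return gene_ids, suspected_headers


def compare_tables(tables):
    """Report data rows, header rows, and missing genes for each sample."""
    inspected = {name: inspect_counts(lines) for name, lines in tables.items()}
    all_genes = set()
    for gene_ids, _ in inspected.values():
        all_genes.update(gene_ids)
    report = {}
    for name, (gene_ids, headers) in inspected.items():
        report[name] = {
            "data_rows": len(gene_ids),
            "header_rows": len(headers),
            "missing_genes": sorted(all_genes - set(gene_ids)),
        }
    return report


# Hypothetical toy inputs: one file with two header lines, one with a gene missing.
control = ["# some comment line\n", "Geneid\tcontrol.bam\n",
           "geneA\t10\n", "geneB\t0\n", "geneC\t7\n"]
mutant = ["Geneid\tmutant.bam\n", "geneA\t12\n", "geneB\t3\n"]

report = compare_tables({"control": control, "mutant": mutant})
for name, info in report.items():
    print(name, info)
```

Any sample that reports more than one header row, or a non-empty list of missing genes, is a candidate for the "differing number of rows" error.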

Cheers,
Florian

Sammy · April 23, 2021, 3:17pm

Thank you, Florian. I did have 2 headers, and that was the problem. One more question: do you know if limma-voom also outputs tables like DESeq2 does? So far I could only get the plots and normalised counts.

When I was reading about running DESeq2 in R, it asked for a data.frame as input; however, Galaxy didn’t seem to mind. On the other hand, limma-voom asked for the data.frame in Galaxy. Do you know why?

Cheers. You’ve been really helpful already

Flow · April 26, 2021, 8:13am

Hey Sammy,
To give a short explanation: the DESeq2 wrapper in Galaxy that you use runs an R script with DESeq2. Thus, the wrapper (Rscript) actually does mind that you provide the input data as a data.frame, but you do not really see it unless there is an error.

With that as a preface: yes, limma-voom will output a table similar to the DESeq2 one. If you see an error or message asking for a data.frame, then your input is not correct. For an example of how to use limma, you can have a look at this Galaxy training material: 2: RNA-seq counts to genes.

Have a good day and best wishes,
Florian

brebelo · October 25, 2023, 11:39am

Hello everyone.

I ran featureCounts and then DESeq2 on two samples, WT and mutant. However, even when filtering for a p-value < 0.001 and a fold change of 2, I still have 19 000 DEGs. I am working with Nicotiana tabacum. Any idea why?

jennaj · October 26, 2023, 4:17pm

Hi @brebelo, I will post in your new question and close this older question out: Deseq2 analysis output too many DEGs