Having worked through the Tutorial “Reference-based RNA-seq data analysis”, I am doing this for the first time on real data.
I have been trying to run DESeq2 with just one factor (genotype) with two levels (wild type and transgenic). I only have one tabular file of gene counts for each factor level (both produced by featureCounts), which I extracted from the dataset in the featureCounts output using “Extract dataset”.
I did this extraction because the files were not listed separately for selection in DESeq2, and the tutorial appeared to have separated the files between the steps “Counting the number of reads per annotated file” and “Analysis of the differential gene expression”.
DESeq2 keeps failing with “an error occurred with this dataset”.
Yes, it is the lack of biological replicates. You need at least 3 replicates per condition to sensibly estimate the biological variation. Very old versions of DESeq2 or limma may support DE analysis without replicates, but those results might not make sense, because with a single replicate per condition the differential expression analysis cannot take the biological variation into account.
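To make the point about replicates concrete, here is a minimal Python sketch. It is not how DESeq2 models dispersion, just the ordinary sample variance of one gene's counts, and the helper name `count_variance` and the count values are made up for illustration. With one sample per condition there is nothing to estimate the spread from; with three there is.

```python
import statistics


def count_variance(counts):
    """Sample variance of one gene's counts across replicates.

    Returns None when there are fewer than 2 replicates, because the
    sample variance has no degrees of freedom left in that case.
    (Hypothetical helper, not part of DESeq2 or Galaxy.)
    """
    if len(counts) < 2:
        return None
    return statistics.variance(counts)


# Made-up counts for a single gene, for illustration only.
single_replicate = [120.0]
three_replicates = [120.0, 135.0, 110.0]

print(count_variance(single_replicate))   # None: spread cannot be estimated
print(count_variance(three_replicates))   # a positive number
```

With only the first kind of input, any difference between wild type and transgenic cannot be separated from ordinary sample-to-sample variation.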