I am working on RNA-seq data from Desulfovibrio alaskensis. I was able to run the workflow through the featureCounts stage, but DESeq2 then failed with an error. The error is below. FYI, I am using publicly available RNA-seq datasets, loaded into Galaxy by SRR accession number. It would be great if you could help me with this.
Fatal error: An undefined error occurred, please check your input carefully and contact your administrator.
Tool generated the following standard error:
it appears that the last variable in the design formula, ‘Dvh’,
has a factor level, ‘Control’, which is not the reference level. we recommend
to use factor(…,levels=…) or relevel() to set this as the reference level
before proceeding. for more information, please see the ‘Note on factor levels’
estimating size factors
Error in checkForExperimentalReplicates(object, modelMatrix) :
The design matrix has the same number of samples and coefficients to fit,
so estimation of dispersion is not possible. Treating samples
as replicates was deprecated in v1.20 and no longer supported since v1.22.
Calls: DESeq … estimateDispersions → .local → checkForExperimentalReplicates
Thank you and Regards
How many samples do you have? DESeq2 has required replicates since v1.22.
The data doesn’t have replicates. I am analyzing the publicly available data to train a machine learning tool. So, I believe I might need to use only data with replicates. Please confirm.
“The design matrix has the same number of samples and coefficients to fit, so estimation of dispersion is not possible.” Please share your thoughts on how to overcome this.
Thank you very much for your reply
How many samples do you use for DESeq2? Current versions of DESeq2 require replicates.
If you don’t have replicates, either switch to an old version of DESeq2 (before v1.22) or try tools from the Cufflinks package. The version of a tool in Galaxy can be selected in the top right corner (three boxes icon) during job setup. The best option is to have replicates.
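To see why the error appears, it can help to count what the model has to estimate. The sketch below is plain Python, not DESeq2 itself. It assumes a single-factor design (one condition column) and uses made-up sample names. In such a design there is one coefficient per factor level (an intercept plus one for each non-reference level), so when the number of samples equals the number of levels, nothing is left over to estimate dispersion from.

```python
from collections import Counter


def residual_degrees_of_freedom(sample_to_level):
    """Samples minus coefficients for a one-factor design.

    A one-factor design has one coefficient per level:
    an intercept plus one for each non-reference level.
    """
    n_samples = len(sample_to_level)
    n_coefficients = len(set(sample_to_level.values()))
    return n_samples - n_coefficients


def levels_without_replicates(sample_to_level):
    """Return the factor levels represented by only one sample."""
    counts = Counter(sample_to_level.values())
    return sorted(level for level, n in counts.items() if n < 2)


# Hypothetical sample sheets; the names are placeholders, not from this thread.
no_replicates = {"sample_A": "Control", "sample_B": "Treated"}
with_replicates = {
    "sample_A1": "Control", "sample_A2": "Control",
    "sample_B1": "Treated", "sample_B2": "Treated",
}

for label, sheet in [("no replicates", no_replicates),
                     ("with replicates", with_replicates)]:
    print(label,
          "residual df:", residual_degrees_of_freedom(sheet),
          "single-sample levels:", levels_without_replicates(sheet))
```

A residual of zero corresponds to the situation the error message describes: as many samples as coefficients. Adding at least one more sample to any level makes the residual positive, although replicates for every level remain the better choice.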
Hope this helps.
Thank you very much for the clarification on replicates. Also, could you please comment on the design matrix having the same number of samples and coefficients to fit? I see that I have an issue with the HISAT2 file. Among the columns in the HISAT2 result, both the MPOS and ISIZE columns contained only zeros for all the genes. When I used the same Galaxy pipeline last month, these columns contained a range of non-zero values and I ended up with good results. I suspect that my HISAT2 BAM file has some issue and that this is reflected in the DESeq2 error.
I can share the history if you want to take a look. Please send me your email ID.
It would be great if you could help me with this.
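On the MPOS and ISIZE columns: in the SAM format these correspond to the mate position (PNEXT) and template length (TLEN) fields, and both are 0 when that information is unavailable, which is the case for reads aligned as single-end. The DESeq2 message itself refers only to the number of samples and coefficients. One way to check the BAM is to look at the paired bit (0x1) of the FLAG column. The sketch below parses SAM text lines in plain Python; the two records are made up for illustration and are not data from this history.

```python
PAIRED_FLAG = 0x1  # SAM FLAG bit: template has multiple segments (paired read)


def parse_mate_fields(sam_line):
    """Extract the paired bit, PNEXT (MPOS) and TLEN (ISIZE) from one SAM line."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    return {
        "paired": bool(flag & PAIRED_FLAG),
        "mate_pos": int(fields[7]),     # column 8: PNEXT, displayed as MPOS
        "insert_size": int(fields[8]),  # column 9: TLEN, displayed as ISIZE
    }


def summarize(sam_lines):
    """Count paired reads and reads with non-zero mate fields, skipping headers."""
    records = [parse_mate_fields(line) for line in sam_lines
               if not line.startswith("@")]
    return {
        "reads": len(records),
        "paired": sum(r["paired"] for r in records),
        "nonzero_mate_fields": sum(
            1 for r in records if r["mate_pos"] != 0 or r["insert_size"] != 0
        ),
    }


# Made-up records for illustration only.
single_end = "read1\t0\tchr\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII"
paired_end = "read2\t99\tchr\t100\t60\t10M\t=\t250\t160\tACGTACGTAC\tIIIIIIIIII"

print(summarize([single_end]))
print(summarize([paired_end]))
```

If no read has the paired bit set, zeros in both columns are expected. One possible explanation, which would need to be verified in the history, is that the reads were supplied to HISAT2 as single-end this time and as paired-end last month.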