Trouble with running DESeq2 in a workflow

Hello,

I am creating a workflow in Galaxy on the instance configured for my institution, Cedars-Sinai Medical Center. I am the admin for this instance. The workflow is based on the Galaxy tutorial “Reference-based RNA-Seq data analysis”. This tutorial provides example datasets and detailed information for each step, along with the required tools and parameters. However, it is not designed as a workflow, so I built one with exactly the same tools. When I run the workflow, DESeq2 fails with this error:

Error in h(simpleError(msg, call)) : 
  error in evaluating the argument 'obj' in selecting a method for function 'unname': a character vector argument expected
Calls: unname ... basename -> basename -> <Anonymous> -> .handleSimpleError -> h

When it fails and I go into the history where the error shows up, there is an option to “Run this job again”. When I click that, the job completes successfully without any parameter changes. I couldn’t find a solution for this in the DESeq2 error forums, so I think this is a Galaxy issue. Has anyone experienced this? Any help, advice, or suggestion is highly appreciated. Thank you for reading!

Thank you,
Priyanka

Welcome, @Priyanka_Bhandary

I think this error indicates that whatever job/output is upstream wasn’t actually “ready” to use when this tool executed. Let’s see if anyone else has a comment.

Most tutorials include at least one workflow, and this particular tutorial has two. Maybe use those as a reference guide for how to model the data in your own workflow?

The workflows available for each tutorial are linked in the information box at the top of the tutorial page.
https://training.galaxyproject.org/training-material/topics/transcriptomics/tutorials/ref-based/workflows/

I also asked the workflow application developers for help in their chat. Replies may be posted here or there, and feel free to join the chat on Matrix.

Hi @Priyanka_Bhandary ,
I don’t know this error. Would you mind testing the workflows indicated by @jennaj on your instance and telling us whether you still have issues?
Thanks

Thank you @jennaj I will try and see whether that will work. Thank you for the quick and detailed response!

Thank you @lldelisle I will test these workflows and see if it works on my instance. Thanks again!

I checked out the workflows you shared. They look good! I was wondering whether there is a single workflow for the entire tutorial from start to finish. The tutorial puts the mapping and quantification part into one workflow and the differential expression analysis and functional enrichment into another. Is there a way to combine them? It would be easier to explain how to run a single workflow to users at our institution. Thank you so much!

Thanks,
Priyanka

Hi @Priyanka_Bhandary

Creating a master workflow with two subworkflow components could be a solution.

Example: Creating, Editing and Importing Galaxy Workflows

2 posts were split to a new topic: Troubleshooting Python configuration errors

Hi @jennaj,

Thank you so much for the detailed and thorough explanation; I really appreciate it. I ran this tool outside of the workflow and it still didn’t work. Before trying to figure out what happened on the HPC end, I tried a simpler join tool called “Join datasets by identifier column”. It worked both outside the workflow and as part of it. Now the workflow runs but hits another error with the “heatmap2” tool. The heatmap tool is called twice: once to visualize the expression of the differentially expressed genes, and once to visualize the distribution of the Z-scores of these genes, as shown in the tutorial (Reference-based RNA-Seq data analysis). Interestingly, the first heatmap works but the second fails with the following error:

An error occurred with this dataset:
options(show.error.messages=F, error=function(){cat(geterrmessage(), file=stderr()); q("no",1,F)})

loc <- Sys.setlocale("LC_MESSAGES", "en_US.UTF-8")

library("RColorBrewer")
library("gplots")

input <- read.delim('/data/tools/galaxy/database/objects/d/

I investigated this error further using the “View details” option, and there is one more error message:

Error in hclust(x, method = "complete") : 
  NA/NaN/Inf in foreign function call (arg 10)
Calls: heatmap.2 -> hclustfun -> hclust

I checked the input but couldn’t find any NA in the table. Is there something I’m missing? Thank you for all the help!

Best,
Priyanka

Hi @Priyanka_Bhandary

Last time I investigated an error like this, it meant that the data has values that are either blank or that are not a number (or rather, not a number the tool knows how to process). There could also be extra columns of data.
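
One way such values can appear without any literal NA in the input: if a row has identical values in every sample, its standard deviation is zero, so a row-wise Z-score becomes 0/0, which R reports as NaN. I can't tell from here whether that is what happened in your second heatmap, so treat this as a guess to rule out. A minimal Python sketch of the arithmetic (the function name `row_zscores` is made up for illustration and is not part of heatmap2 or Galaxy):

```python
import math
import statistics


def row_zscores(values):
    """Return z-scores for one row, using the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    if sd == 0:
        # 0 / 0 is NaN in R; Python would raise ZeroDivisionError instead,
        # so NaN is returned explicitly to mirror the R behaviour.
        return [float("nan")] * len(values)
    return [(v - mean) / sd for v in values]


varying = row_zscores([1.0, 2.0, 3.0])    # finite z-scores
constant = row_zscores([5.0, 5.0, 5.0])   # every entry is NaN

print(varying)
print(all(math.isnan(z) for z in constant))
```

If the table going into the second heatmap was Z-scored upstream, looking for rows like `constant` would be a quick first check.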

Check for that situation in your data first. I’ve also seen extra data columns trigger the problem (tool is expecting 10? Or nine?). An extra wrinkle is that this tool uses R functions, so the content needs to be understood by R, which can be pickier than other tools. (And those R packages need to not only be in the compute environment, but also be the package version the tool expects :face_with_peeking_eye:). Another potential problem is scientific notation, since that can be formatted/condensed in a few different ways.
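
If it helps, here is a rough Python sketch for scanning a tab-separated table for the issues above: blank cells, cells that don't parse as numbers, NaN/Inf values, and rows with an unexpected number of columns. Assumptions on my side: the first row is a header, the first column holds gene labels, and the function name `find_problem_cells` is just illustrative. Python's number parsing is not identical to R's, so this is a first-pass check rather than a guarantee.

```python
import csv
import io
import math


def find_problem_cells(text, n_label_cols=1, has_header=True):
    """Return (line_no, col_no, raw_value, reason) tuples for suspicious cells."""
    problems = []
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    if not rows:
        return problems
    expected_cols = len(rows[0])  # the first row defines the expected width
    start = 1 if has_header else 0
    for line_no, row in enumerate(rows[start:], start=start + 1):
        # Check the numeric cells first (label columns are skipped)
        for col_no, raw in enumerate(row[n_label_cols:], start=n_label_cols + 1):
            cell = raw.strip()
            if cell == "":
                problems.append((line_no, col_no, raw, "blank"))
                continue
            try:
                value = float(cell)  # accepts scientific notation such as 2e-3
            except ValueError:
                problems.append((line_no, col_no, raw, "not numeric"))
                continue
            if not math.isfinite(value):
                problems.append((line_no, col_no, raw, "NaN or infinite"))
        # Then flag rows whose width differs from the first row
        if len(row) != expected_cols:
            reason = f"{len(row)} columns, expected {expected_cols}"
            problems.append((line_no, None, None, reason))
    return problems


# Small made-up table: a blank cell, an "NA", an "Inf", and one extra column
example = "gene\ts1\ts2\nA\t1.5\t2e-3\nB\t\tNA\nC\tInf\t4\nD\t1\t2\t3\n"
problems = find_problem_cells(example)
for problem in problems:
    print(problem)
```

To use it on a real file, download the heatmap input from the history and pass the file contents as `text`. An empty result means none of these particular problems were found, not that the table is guaranteed to work in R.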

Another check you could try is to run that same data at a usegalaxy.* server to compare. That would help to isolate a data problem from a configuration problem.

I’m guessing that not using containers is the root problem. That is your choice but expect problems when running tools that way. Even if you solve the immediate errors, over time the local environment could change in ways that introduce new errors for tools that previously worked fine. That might involve a workflow re-write…