Job execution error (Tool executes too early)

Melclic1 · February 24, 2020, 11:55am

Hello,

I am developing Galaxy tools and have built a workflow to chain them together. The problem I am encountering is that one of the tools executes before a previous tool has finished running.

Attached are two images. The first shows the workflow, with the tool that executes too early (RP Selenzyme) circled in red; it runs before RP Thermodynamics finishes. The second shows the side panel, where RP Selenzyme returns an error caused by an empty input, because it executes earlier than it should.

Any help would be very welcome. Do not hesitate to ask me for any logs. I tried this on the Galaxy 19.05 and 20.01 releases, and the same error occurs on both.
Thank you,

mvdbeek · February 24, 2020, 12:39pm

Without the tools and workflow we can’t help you there, but keep in mind that parallel execution may be misleading here: the first job whose inputs are ready will run. In your history I see multiple parallel executions, so I can’t rule out that the input had already finished and the job failed for another reason.
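To illustrate the "first job whose inputs are ready will run" idea, here is a minimal toy model in Python. It is not Galaxy's actual scheduler; the step graph and the name "Upstream step" are assumptions made up for the sketch (only "RP Thermodynamics" and "RP Selenzyme" come from this thread).

```python
# Toy model of dependency-driven scheduling; NOT Galaxy's real scheduler.
# The graph below is hypothetical: each step maps to the steps it takes input from.
workflow = {
    "Upstream step": [],
    "RP Thermodynamics": ["Upstream step"],
    "Taxonomy input": [],
    "RP Selenzyme": ["RP Thermodynamics", "Taxonomy input"],
}


def run_order(steps):
    """Return one valid execution order: a step runs only once all its inputs are finished."""
    finished = []
    pending = dict(steps)
    while pending:
        # Every step whose inputs are all finished is ready and may run in parallel.
        ready = [name for name, deps in pending.items()
                 if all(dep in finished for dep in deps)]
        if not ready:
            raise ValueError("no step is ready: cycle or missing input")
        for name in ready:
            finished.append(name)
            del pending[name]
    return finished


order = run_order(workflow)
print(order)
```

In this model several steps can become ready in the same round, which is why a history can show many jobs running side by side, but a step with an unfinished input never appears in `ready`.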

Melclic1 · February 24, 2020, 2:11pm

Thanks a lot for your response.

Misinterpretation of parallel runs is exactly my problem!

The tool circled in red runs in parallel with the first tools in the workflow when it should not. The “RP Selenzyme” tool requires two inputs, “RP Thermodynamics” and the underlined input. However, you can see from the first image that “RP Thermodynamics” is in grey, indicating that it has not finished, and yet “RP Selenzyme” executes.

Is there a way for me to manually specify when to execute a tool in a workflow, do you think?

mvdbeek · February 24, 2020, 3:21pm

No, I think the input dataset to the failed job is hidden and completed correctly. You can see the hidden datasets by clicking on “59 hidden”. You can also see more information if you click on the failed dataset and then the i symbol. The input dataset numbers will be listed there, and they will probably not correspond to any grey dataset.

Melclic1 · February 24, 2020, 3:43pm

Thanks for suggesting the debug.

Please find attached the screenshot from the information tag of the failed tool. Unfortunately, the dataset is not hidden, corresponds to the right tool and, as I described before, fires before the previous dataset is generated (run number 352, which is still grayed out).

mvdbeek · February 24, 2020, 3:45pm

Can you share the tools and workflows somewhere?

Melclic1 · February 24, 2020, 3:49pm

As of now I cannot share the tools or the workflow. I’ll ask my collaborators if we can share our Galaxy test server temporarily with you.

Thanks a lot for helping, I’ll get back to you.

mvdbeek · February 24, 2020, 3:53pm

No, that will not help, I’ll try and see if I can reproduce this.

Melclic1 · March 2, 2020, 9:13am

Hello,

The problem persists. You are welcome to go to the project’s Galaxy server and create an account to inspect the problem:
https://synbiocad.micalis.inra.fr/test/
Here are two workflows, one that works fine (rpranker-2) and another that contains the bug that I mentioned (rprankerdebug):
https://synbiocad.micalis.inra.fr/test/u/mdulac/w/rpranker-2
https://synbiocad.micalis.inra.fr/test/u/mdulac/w/rprankerdebug

To run the workflow, import the E.Coli model from the “shared data” space and fill in the required inputs with:

GEM SBML: E.Coli
Target InChI: InChI=1S/C6H6O4/c7-5(8)3-1-2-4-6(9)10/h1-4H,(H,7,8)(H,9,10)/p-2/b3-1-,4-2-
Maximal Pathway Length: 3

Essentially the rpSelenzyme tool is the culprit and the only difference between the two workflows is an additional input to that tool. Please let me know if you want me to post logs or anything.

Cheers

mvdbeek · March 2, 2020, 1:15pm

I see, the text input is short-circuiting the inputs ready check.

mvdbeek · March 2, 2020, 2:06pm

So connecting a text parameter to a select is a new feature in 20.01; this won’t work on 19.09. Then there is the issue that I think there are 2 parameters called input in RpSelenzyme. Can you try changing the Taxonomy JSON input to a different name, then rebuild and re-run the workflow?
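As a rough way to spot this kind of name clash in a tool wrapper, here is a small Python sketch that counts repeated `name` attributes on `<param>` elements. The XML is a hypothetical wrapper invented for illustration, not the real RpSelenzyme tool, and the check is deliberately crude: it ignores the nesting of conditionals and sections and any parameter declared without a `name` attribute.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from typing import Dict

# Hypothetical tool wrapper, made up for this sketch; NOT the real RpSelenzyme XML.
TOOL_XML = """
<tool id="example_tool" name="Example" version="0.1">
  <inputs>
    <param name="input" type="data" format="tar" label="Pathways"/>
    <conditional name="taxonomy">
      <param name="source" type="select" label="Taxonomy source">
        <option value="json">JSON</option>
        <option value="text">Text</option>
      </param>
      <when value="json">
        <param name="input" type="data" format="json" label="Taxonomy JSON"/>
      </when>
      <when value="text">
        <param name="taxon_id" type="text" label="Taxon ID"/>
      </when>
    </conditional>
  </inputs>
</tool>
"""


def duplicate_param_names(tool_xml: str) -> Dict[str, int]:
    """Map each repeated <param> name under <inputs> to how often it occurs."""
    root = ET.fromstring(tool_xml)
    inputs = root.find("inputs")
    if inputs is None:
        return {}
    # iter() walks nested conditionals and sections as well as top-level params.
    names = [p.get("name") for p in inputs.iter("param") if p.get("name")]
    return {name: n for name, n in Counter(names).items() if n > 1}


duplicates = duplicate_param_names(TOOL_XML)
print(duplicates)
```

Any name reported here is a candidate for renaming; after the rename, the workflow needs to be rebuilt so that its connections point at the new parameter name.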

Melclic1 · March 2, 2020, 3:59pm

That was it! The naming convention of the Galaxy tool wrapper seems to have confused the execution of the workflow. Thanks a million!

bernt-matthias · March 19, 2020, 4:23pm

Hey @Melclic1: Can you tell me if the tools in question contain sections? @mvdbeek guessed that https://github.com/galaxyproject/galaxy/pull/9493 might solve this, but so far this PR should only affect sections.

mvdbeek · March 19, 2020, 4:34pm

It did at the time of writing.

mvdbeek · March 19, 2020, 4:36pm

Actually it was a conditional, but that should be the same problem.

bernt-matthias · March 19, 2020, 4:46pm

Ahh. Now I see. The API test (and the code) for the case of conditionals seems wrong … one “second”.
