Which tool is guilty? Custom local Galaxy install

Hello!

I edited some tool definition XML files (and also chowned the Galaxy installation to a non-root user), and now when I invoke my workflow I get:

Invocation scheduling failed - Galaxy administrator may have additional details in logs.

In galaxy.log I see:

.......
.......
[JobHandlerQueue.monitor_thread] (2138) Mapped job to destination id: local
    galaxy.jobs.handler DEBUG 2021-04-20 12:33:14,988 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2138) Dispatching to local runner
    galaxy.jobs DEBUG 2021-04-20 12:33:15,034 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2138) Persisting job destination (destination id: local)
    galaxy.jobs DEBUG 2021-04-20 12:33:15,056 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2138) Working directory for job is: /data/galaxy/database/jobs_directory/002/2138
    galaxy.jobs.runners DEBUG 2021-04-20 12:33:15,070 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] Job [2138] queued (81.814 ms)
    galaxy.jobs.handler INFO 2021-04-20 12:33:15,089 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2138) Job dispatched
    galaxy.jobs.mapper DEBUG 2021-04-20 12:33:15,114 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2139) Mapped job to destination id: local
    galaxy.jobs.handler DEBUG 2021-04-20 12:33:15,156 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2139) Dispatching to local runner
    galaxy.jobs DEBUG 2021-04-20 12:33:15,234 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2139) Persisting job destination (destination id: local)
    galaxy.jobs DEBUG 2021-04-20 12:33:15,267 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2139) Working directory for job is: /data/galaxy/database/jobs_directory/002/2139
    galaxy.jobs.runners DEBUG 2021-04-20 12:33:15,289 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] Job [2139] queued (133.633 ms)
    galaxy.jobs.handler INFO 2021-04-20 12:33:15,312 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2139) Job dispatched
    galaxy.jobs.mapper DEBUG 2021-04-20 12:33:15,348 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2140) Mapped job to destination id: local
    galaxy.workflow.run DEBUG 2021-04-20 12:33:15,416 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3956 outputs of invocation 41 delayed (dependent step [3951] delayed, so this step must be delayed)
    galaxy.jobs.handler DEBUG 2021-04-20 12:33:15,423 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2140) Dispatching to local runner
    galaxy.workflow.run DEBUG 2021-04-20 12:33:15,585 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3955 outputs of invocation 41 delayed (dependent step [3951] delayed, so this step must be delayed)
    galaxy.jobs DEBUG 2021-04-20 12:33:15,630 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2140) Persisting job destination (destination id: local)
    galaxy.jobs.command_factory INFO 2021-04-20 12:33:15,671 [p:423042,w:1,m:0] [LocalRunner.work_thread-2] Built script [/data/galaxy/database/jobs_directory/002/2138/tool_script.sh] for tool command [pwd; ln -fL '/data/galaxy/database/objects/0/4/f/dataset_04f4806e-fbc9-42c6-bcad-f42d2fe50ed5.dat' left.fastq.gz && ln -fL '/data/galaxy/database/objects/4/d/c/dataset_4dc809ed-dcb7-4dfb-8691-5efba0388564.dat' right.fastq.gz && docker run -v `pwd`/:/data/ -v `pwd`/:/input/ -t  --user "root:root" fred2/optitype -i /input/left.fastq.gz /input/right.fastq.gz -d -o /data/ && cp ./*/*result.tsv '/data/galaxy/database/objects/7/9/0/dataset_790ed4db-3616-464f-81ba-ff612782533d.dat' && cp ./*/*.pdf '/data/galaxy/database/objects/9/f/2/dataset_9f296f27-2f5b-4524-b972-8b399a27b516.dat']
    galaxy.jobs DEBUG 2021-04-20 12:33:15,703 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2140) Working directory for job is: /data/galaxy/database/jobs_directory/002/2140
    galaxy.jobs.runners DEBUG 2021-04-20 12:33:15,735 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] Job [2140] queued (312.311 ms)
    galaxy.jobs.handler INFO 2021-04-20 12:33:15,761 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2140) Job dispatched
    galaxy.workflow.run DEBUG 2021-04-20 12:33:15,787 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3954 outputs of invocation 41 delayed (dependent step [3950] delayed, so this step must be delayed)
    galaxy.jobs.mapper DEBUG 2021-04-20 12:33:15,823 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2141) Mapped job to destination id: local
    galaxy.jobs.handler DEBUG 2021-04-20 12:33:15,944 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2141) Dispatching to local runner
    galaxy.workflow.run DEBUG 2021-04-20 12:33:15,955 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3953 outputs of invocation 41 delayed (dependent step [3949] delayed, so this step must be delayed)
    galaxy.jobs DEBUG 2021-04-20 12:33:16,104 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2141) Persisting job destination (destination id: local)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:16,142 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3959 outputs of invocation 41 delayed (dependent step [3956] delayed, so this step must be delayed)
    galaxy.jobs.command_factory INFO 2021-04-20 12:33:16,187 [p:423042,w:1,m:0] [LocalRunner.work_thread-3] Built script [/data/galaxy/database/jobs_directory/002/2139/tool_script.sh] for tool command [bwa mem -t '12' -R  '@RG\tID:h1805_wn0552.A7CE75785_normal\tSM:h1805_wn0552.A7CE75785\tLB:LIB1\tPL:ILLUMINA\tPU:UNIT1' '/data/galaxy/tools/melanoma_tools/genome/hg38.analysisSet.fa' '/data/galaxy/database/objects/3/7/c/dataset_37cce193-8ff2-4b22-8bca-df25853e743a.dat' '/data/galaxy/database/objects/e/2/e/dataset_e2e072e3-24ab-43b7-b109-9deabf9aa78d.dat' | samtools sort -@ '12' -o '/data/galaxy/database/objects/0/1/9/dataset_019cdeb0-5777-4a06-9618-a2361e36e2e0.dat'; samtools index '/data/galaxy/database/objects/0/1/9/dataset_019cdeb0-5777-4a06-9618-a2361e36e2e0.dat']
    galaxy.jobs DEBUG 2021-04-20 12:33:16,204 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2141) Working directory for job is: /data/galaxy/database/jobs_directory/002/2141
    galaxy.jobs.runners DEBUG 2021-04-20 12:33:16,240 [p:423042,w:1,m:0] [LocalRunner.work_thread-2] (2138) command is: mkdir -p working outputs configs
    if [ -d _working ]; then
        rm -rf working/ outputs/ configs/; cp -R _working working; cp -R _outputs outputs; cp -R _configs configs
    else
        cp -R working _working; cp -R outputs _outputs; cp -R configs _configs
    fi
    cd working; /bin/bash /data/galaxy/database/jobs_directory/002/2138/tool_script.sh > ../outputs/tool_stdout 2> ../outputs/tool_stderr; return_code=$?; cd '/data/galaxy/database/jobs_directory/002/2138'; 
    [ "$GALAXY_VIRTUAL_ENV" = "None" ] && GALAXY_VIRTUAL_ENV="$_GALAXY_VIRTUAL_ENV"; _galaxy_setup_environment True
    python "metadata/set.py"; sh -c "exit $return_code"
    galaxy.jobs.runners DEBUG 2021-04-20 12:33:16,260 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] Job [2141] queued (315.588 ms)
    galaxy.jobs.runners.local DEBUG 2021-04-20 12:33:16,278 [p:423042,w:1,m:0] [LocalRunner.work_thread-2] (2138) executing job script: /data/galaxy/database/jobs_directory/002/2138/galaxy_2138.sh
    galaxy.workflow.run DEBUG 2021-04-20 12:33:16,304 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3958 outputs of invocation 41 delayed (dependent step [3954] delayed, so this step must be delayed)
    galaxy.jobs.handler INFO 2021-04-20 12:33:16,313 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2141) Job dispatched
    galaxy.jobs.mapper DEBUG 2021-04-20 12:33:16,373 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2142) Mapped job to destination id: local
    galaxy.workflow.run DEBUG 2021-04-20 12:33:16,398 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3957 outputs of invocation 41 delayed (dependent step [3953] delayed, so this step must be delayed)
    galaxy.jobs.handler DEBUG 2021-04-20 12:33:16,537 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2142) Dispatching to local runner
    galaxy.workflow.run DEBUG 2021-04-20 12:33:16,549 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3962 outputs of invocation 41 delayed (dependent step [3959] delayed, so this step must be delayed)
    galaxy.jobs.runners DEBUG 2021-04-20 12:33:16,666 [p:423042,w:1,m:0] [LocalRunner.work_thread-3] (2139) command is: mkdir -p working outputs configs
    if [ -d _working ]; then
        rm -rf working/ outputs/ configs/; cp -R _working working; cp -R _outputs outputs; cp -R _configs configs
    else
        cp -R working _working; cp -R outputs _outputs; cp -R configs _configs
    fi
    cd working; /bin/bash /data/galaxy/database/jobs_directory/002/2139/tool_script.sh > ../outputs/tool_stdout 2> ../outputs/tool_stderr; return_code=$?; cd '/data/galaxy/database/jobs_directory/002/2139'; 
    [ "$GALAXY_VIRTUAL_ENV" = "None" ] && GALAXY_VIRTUAL_ENV="$_GALAXY_VIRTUAL_ENV"; _galaxy_setup_environment True
    python "metadata/set.py"; sh -c "exit $return_code"
    galaxy.jobs.runners.local DEBUG 2021-04-20 12:33:16,687 [p:423042,w:1,m:0] [LocalRunner.work_thread-3] (2139) executing job script: /data/galaxy/database/jobs_directory/002/2139/galaxy_2139.sh
    galaxy.workflow.run DEBUG 2021-04-20 12:33:16,716 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3961 outputs of invocation 41 delayed (dependent step [3958] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:16,783 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3960 outputs of invocation 41 delayed (dependent step [3957] delayed, so this step must be delayed)
    galaxy.jobs.command_factory INFO 2021-04-20 12:33:16,816 [p:423042,w:1,m:0] [LocalRunner.work_thread-1] Built script [/data/galaxy/database/jobs_directory/002/2140/tool_script.sh] for tool command [bwa mem -t '12' -R  '@RG\tID:h1805_wn0552.A7CE75785_tumor\tSM:h1805_wn0552.A7CE75785\tLB:LIB1\tPL:ILLUMINA\tPU:UNIT1' '/data/galaxy/tools/melanoma_tools/genome/hg38.analysisSet.fa' '/data/galaxy/database/objects/f/d/b/dataset_fdb38d00-6c1b-4bd8-9f5b-6aeafb056031.dat' '/data/galaxy/database/objects/7/b/7/dataset_7b7c584f-b4e2-416f-ba81-fd007d87b948.dat' | samtools sort -@ '12' -o '/data/galaxy/database/objects/f/e/c/dataset_fecdc2ca-252c-4319-8afb-f6378c510457.dat'; samtools index '/data/galaxy/database/objects/f/e/c/dataset_fecdc2ca-252c-4319-8afb-f6378c510457.dat']
    galaxy.jobs DEBUG 2021-04-20 12:33:16,828 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2142) Persisting job destination (destination id: local)
    galaxy.jobs DEBUG 2021-04-20 12:33:16,890 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2142) Working directory for job is: /data/galaxy/database/jobs_directory/002/2142
    galaxy.jobs.runners DEBUG 2021-04-20 12:33:16,926 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] Job [2142] queued (387.861 ms)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:16,933 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3965 outputs of invocation 41 delayed (dependent step [3962] delayed, so this step must be delayed)
    galaxy.jobs.handler INFO 2021-04-20 12:33:16,947 [p:423042,w:1,m:0] [JobHandlerQueue.monitor_thread] (2142) Job dispatched
    galaxy.workflow.run DEBUG 2021-04-20 12:33:16,981 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3964 outputs of invocation 41 delayed (dependent step [3961] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,023 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3963 outputs of invocation 41 delayed (dependent step [3960] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,061 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3968 outputs of invocation 41 delayed (dependent step [3965] delayed, so this step must be delayed)
    galaxy.jobs.runners DEBUG 2021-04-20 12:33:17,071 [p:423042,w:1,m:0] [LocalRunner.work_thread-1] (2140) command is: mkdir -p working outputs configs
    if [ -d _working ]; then
        rm -rf working/ outputs/ configs/; cp -R _working working; cp -R _outputs outputs; cp -R _configs configs
    else
        cp -R working _working; cp -R outputs _outputs; cp -R configs _configs
    fi
    cd working; /bin/bash /data/galaxy/database/jobs_directory/002/2140/tool_script.sh > ../outputs/tool_stdout 2> ../outputs/tool_stderr; return_code=$?; cd '/data/galaxy/database/jobs_directory/002/2140'; 
    [ "$GALAXY_VIRTUAL_ENV" = "None" ] && GALAXY_VIRTUAL_ENV="$_GALAXY_VIRTUAL_ENV"; _galaxy_setup_environment True
    python "metadata/set.py"; sh -c "exit $return_code"
    galaxy.jobs.runners.local DEBUG 2021-04-20 12:33:17,091 [p:423042,w:1,m:0] [LocalRunner.work_thread-1] (2140) executing job script: /data/galaxy/database/jobs_directory/002/2140/galaxy_2140.sh
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,139 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3967 outputs of invocation 41 delayed (dependent step [3961] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,226 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3966 outputs of invocation 41 delayed (dependent step [3960] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,264 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3974 outputs of invocation 41 delayed (dependent step [3965] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,288 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3971 outputs of invocation 41 delayed (dependent step [3967] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,327 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3973 outputs of invocation 41 delayed (dependent step [3966] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,364 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3972 outputs of invocation 41 delayed (dependent step [3967] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,386 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3969 outputs of invocation 41 delayed (dependent step [3966] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,402 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3970 outputs of invocation 41 delayed (dependent step [3966] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,412 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3977 outputs of invocation 41 delayed (dependent step [3974] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,423 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3978 outputs of invocation 41 delayed (dependent step [3974] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,436 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3976 outputs of invocation 41 delayed (dependent step [3971] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,447 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3975 outputs of invocation 41 delayed (dependent step [3970] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,458 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3980 outputs of invocation 41 delayed (dependent step [3977] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,469 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3981 outputs of invocation 41 delayed (dependent step [3978] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,492 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3979 outputs of invocation 41 delayed (dependent step [3972] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,508 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3983 outputs of invocation 41 delayed (dependent step [3981] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,519 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3982 outputs of invocation 41 delayed (dependent step [3979] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,529 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3985 outputs of invocation 41 delayed (dependent step [3983] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,544 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3984 outputs of invocation 41 delayed (dependent step [3982] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,555 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3989 outputs of invocation 41 delayed (dependent step [3985] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,573 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3986 outputs of invocation 41 delayed (dependent step [3984] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,587 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3987 outputs of invocation 41 delayed (dependent step [3974] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,605 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3988 outputs of invocation 41 delayed (dependent step [3984] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,615 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3990 outputs of invocation 41 delayed (dependent step [3987] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,633 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3991 outputs of invocation 41 delayed (dependent step [3990] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,648 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3992 outputs of invocation 41 delayed (dependent step [3991] delayed, so this step must be delayed)
    galaxy.workflow.run DEBUG 2021-04-20 12:33:17,706 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Marking step 3993 outputs of invocation 41 delayed (dependent step [3955] delayed, so this step must be delayed)
    galaxy.workflow.scheduling_manager DEBUG 2021-04-20 12:33:17,737 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Workflow invocation [41] scheduled
    galaxy.workflow.scheduling_manager DEBUG 2021-04-20 12:33:18,747 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Attempting to schedule workflow invocation [(41,)]
    galaxy.workflow.modules INFO 2021-04-20 12:33:18,947 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] parameter_def [{'parameter_type': 'text', 'optional': False}], how [none]
    galaxy.workflow.run ERROR 2021-04-20 12:33:18,982 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Failed to execute scheduled workflow.
    Traceback (most recent call last):
      File "lib/galaxy/workflow/run.py", line 83, in __invoke
        outputs = invoker.invoke()
      File "lib/galaxy/workflow/run.py", line 174, in invoke
        remaining_steps = self.progress.remaining_steps()
      File "lib/galaxy/workflow/run.py", line 316, in remaining_steps
        self._recover_mapping(invocation_step)
      File "lib/galaxy/workflow/run.py", line 514, in _recover_mapping
        step_invocation.workflow_step.module.recover_mapping(step_invocation, self)
      File "lib/galaxy/workflow/modules.py", line 640, in recover_mapping
        progress.set_outputs_for_input(invocation_step, already_persisted=True)
      File "lib/galaxy/workflow/run.py", line 417, in set_outputs_for_input
        raise ValueError(message)
    ValueError: Step with id WorkflowStep[index=5,type=data_input] not found in inputs_step_id ({3940: <galaxy.model.HistoryDatasetAssociation(2153) at 0x7f61a4dd5e80>, 3939: <galaxy.model.HistoryDatasetAssociation(2163) at 0x7f62365e1220>, 3938: <galaxy.model.HistoryDatasetAssociation(2162) at 0x7f62363b7bb0>, 3935: <galaxy.model.HistoryDatasetAssociation(2155) at 0x7f61a4dd5970>, 3937: <galaxy.model.HistoryDatasetAssociation(2150) at 0x7f61a4dd5fa0>, 3934: <galaxy.model.HistoryDatasetAssociation(2154) at 0x7f6236482e50>, 3936: <galaxy.model.HistoryDatasetAssociation(2149) at 0x7f61a4b541c0>, 3941: '12', 3933: 'h1805_wn0552.A7CE75785'})
    galaxy.workflow.scheduling_manager DEBUG 2021-04-20 12:33:19,019 [p:423042,w:1,m:0] [WorkflowRequestMonitor.monitor_thread] Workflow invocation [41] scheduled
    galaxy.model WARNING 2021-04-20 12:33:38,023 [p:423042,w:1,m:0] [uWSGIWorker1Core3] Datatype class not found for extension 'tar.gz'
    galaxy.model WARNING 2021-04-20 12:33:38,023 [p:423042,w:1,m:0] [uWSGIWorker1Core3] Datatype class not found for extension 'tar.gz'
    galaxy.model WARNING 2021-04-20 12:33:38,023 [p:423042,w:1,m:0] [uWSGIWorker1Core3] Datatype class not found for extension 'tar.gz'
    galaxy.model WARNING 2021-04-20 12:33:38,034 [p:423042,w:1,m:0] [uWSGIWorker1Core3] Datatype class not found for extension 'tar.gz'
    galaxy.model WARNING 2021-04-20 12:33:38,034 [p:423042,w:1,m:0] [uWSGIWorker1Core3] Datatype class not found for extension 'tar.gz'
.....
.....

I see “Failed to execute scheduled workflow” messages there, but I cannot figure out which tool gives this error. Is there a way to find this out?
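
For reference, one way to try to narrow this down might be to match the index from the traceback (“WorkflowStep[index=5,type=data_input]”) against the steps of the exported workflow. A rough sketch, assuming that index corresponds to the step keys of the exported .ga file; the filename workflow.ga is only a placeholder:

#!/usr/bin/env python3
# Sketch only: list step index, type and label from an exported .ga file, so the
# "index=5" data_input step from the traceback can be matched by eye.
import json

with open("workflow.ga") as fh:
    workflow = json.load(fh)

for key, step in sorted(workflow["steps"].items(), key=lambda kv: int(kv[0])):
    print("index=" + key + "  type=" + str(step.get("type")) + "  label=" + str(step.get("label") or step.get("name")))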

Thanks in advance.

My guess is BWA-mem. Why are you changing the tool XMLs?

I am changing the tool XMLs because they were written for our lab by some people who are no longer responding; the XMLs contain a number of errors, and there is no one to repair them except me.

Unfortunately, I did not change the BWA-mem XML.

I reverted all the changed tools to their original form, but the error persists. So it seems to be due to chowning the Galaxy installation and running it as a non-root user, though I do not understand how that could lead to such errors.


@wormball

My guess is that the paths are off (working directory and/or config files). All are relative to the local Galaxy installation directory by default, not the full path from / root.

Please see these and related sections under the Administration topic group: https://docs.galaxyproject.org/en/master/admin/options.html and https://docs.galaxyproject.org/en/master/admin/config.html
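
As a rough illustration of what “relative to the local Galaxy installation directory” means in practice (the /data/galaxy root below is only taken from the log above; the actual settings live in galaxy.yml and vary per install):

# Sketch only: a relative setting such as a jobs directory resolves against the
# Galaxy root, not against the filesystem root "/".
import os

galaxy_root = "/data/galaxy"                  # install location seen in the log above
relative_setting = "database/jobs_directory"  # example of a relative config value
print(os.path.join(galaxy_root, relative_setting))
# -> /data/galaxy/database/jobs_directory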


I prepended $PATH with the path as it was under root. However, I do not think this has anything to do with the problem.

I tried to invoke the “old” workflow (the one I had not edited), and it was enqueued successfully (as non-root and with the old tool XMLs). Unfortunately, I had not tried this workflow between when the error first manifested and the path editing (because it requires manually entering millions of parameters).

Then I tried to invoke some “new” workflows. Sometimes I got “Invocation scheduling failed”, sometimes “Workflow submission failed”, and one time I even got a successful invocation (but that time I forgot to push the “Send results to a new history” button).

I tried to edit the old workflow again. This time I was unable to connect the genome input to MergeBamAlignment, even if I created a new instance from the left panel (see Workflow automation? - #4 by wormball). However, after some random mouse clicks the editor allowed me to do this, but only in two of the three MergeBamAlignment steps I have. So I had to copy one of these and replace the third such step. After I finished the editing and saved the workflow, it enqueued just fine.

Maybe this step is the root of all evil. On the other hand, when I tried to crop the workflow to find the bad step, it started to enqueue perfectly while all three MergeBamAlignment steps were still present. I also suspected that SnpEff eff might be guilty, but again, sometimes it worked with it and sometimes it did not work without it. So the cause of this error is still a mystery. And the fact that Galaxy says nothing about this (and not only this) error, while it writes millions of letters to its log, makes me a sad panda.


@wormball Thanks for following up.

In some cases, yes :panda_face:

Short answer: Workflows contain metadata. This is set from the inputs down through tools. When a change is made to an existing workflow, especially the input type, sometimes the metadata no longer matches up between tools. This can result in noodles between tools not connecting or workflows aborting at certain steps. Removing all noodles and reconnecting them from the input down through tools in the order of processing will resolve the problems, as can replacing an existing tool with a new copy (for some subset of situations, it depends on how many tools are involved). For either, when an upstream input/tool is connected, new metadata connections are created.

Not ideal, but useful to know about. It has been covered in a few posts at this forum (Search results for 'workflow metadata' - Galaxy Community Help), but the root issue is explained in this ticket if you are interested in the details: Workflow Editor - validation and propagation of singular/collection input swaps · Issue #7431 · galaxyproject/galaxy · GitHub. The workflow editor and invocation do give more hints about what might be going wrong in the most current release, but those are sometimes missed.

Apologies for not recognizing your problem as potentially being related to that in the prior replies.


Thanks Jennifer! However, I am not sure I fully understand the issue discussion. As I understand it, it is about dataset collections, but I have never used collections. And what is this metadata, and where is it stored? Can I look at it?

I wrote a script which compares two workflows. Maybe it will be useful to somebody, or maybe I just reinvented the wheel.

#!/usr/bin/env python3

import json
import sys


def keys_to_labels(j, name):
    s = j["steps"]
    jo = j.copy()
    o = s.copy()
    for k, v in s.items():
        # print(k)
        label = v["label"]
        if label is None:
            label = v["name"]
        if label in o:
            print("Warning! Workflow \"" + name + "\" has duplicate \"" + label + "\" steps! Please rename at least one of these.")
            n = 2
            while (label + "_" + str(n)) in o:
                n += 1
            label = label + "_" + str(n)
        v["workflowdiff_step_name"] = label
        del v["id"] # step id is not needed any more
        del o[k]
        o[label] = v
    jo["steps"] = o
    return jo

def input_numbers_to_names(j, j0, name):
    s = j["steps"]
    for k, v in s.items():
        if "input_connections" in v:
            for c in v["input_connections"].values():
                n = c["id"]
                # print(n)
                label = j0["steps"][str(n)]["workflowdiff_step_name"]
                c["id"] = label

def decode_strings(j): # some values are quoted strings that are themselves valid JSON
    for k, v in j.items():
        if type(v) == str:
            # print("string: " + v)
            d = None
            try:
                d = json.loads(v)
            except Exception:
                # print("Could not decode string: " + v) 
                pass
            if type(d) == dict:
                j[k] = d
                decode_strings(d)
        elif type(v) == list: # also turning lists into dictionaries for convenience
            d = {}
            for i in range(len(v)):
                ik = "list[" + str(i) + "]"
                d[ik] = v[i]
            j[k] = d
        elif type(v) == dict:
            decode_strings(v)


def prepare_workflow(filename):
    j0 = json.load(open(filename, "r"))
    j = keys_to_labels(j0, filename)
    input_numbers_to_names(j, j0, filename)
    decode_strings(j)
    return j


def print_diff(j1, j2, name1, name2, indent = ""):
    # print(indent + "Steps present only in the first workflow:")
    for k in j1.keys():
        if not k in j2:
            print (indent + "Only in " + name1 + ": " + k)
    # print(indent + "Steps present only in the second workflow:")
    for k in j2.keys():
        if not k in j1:
            print (indent + "Only in " + name2 + ": " + k)
            # print (k)
    # print(indent + "Different steps:")
    for k, v1 in j1.items():
        if k == "position": # ignore positions!
            continue
        if k == "uuid": # ignore uuids!
            continue
        if not k in j2:
            continue
        v2 = j2[k]
        n1 = name1 + "." + k
        n2 = name2 + "." + k
        if (type(v1) == dict) and (type(v2) == dict):
            print_diff(v1, v2, n1, n2, indent + "    ")
        else:
            if v1 != v2:
                print(indent + n1 + ": " + str(v1))
                print(indent + n2 + ": " + str(v2))


if __name__ == "__main__":
    if len(sys.argv) <= 2:
        print("""
Galaxyproject workflow comparison. Usage:

    workflowdiff.py file1.ga file2.ga

Show topological difference only:

    workflowdiff.py file1.ga file2.ga | grep input_connections
""")
        exit()
    file1 = sys.argv[1]
    file2 = sys.argv[2]
    print("Comparing galaxyproject workflows:")
    print("1: " + file1)
    print("2: " + file2)
    j1 = prepare_workflow(file1)
    j2 = prepare_workflow(file2)
    print_diff(j1, j2, "1", "2")

So it gave me the following (the first file is my “new” workflow that was successfully enqueued yesterday, the second is a slightly older workflow that used to work but then ceased to):

transgeneprep@transgeneprep-System-Product-Name:~/dima/galaxy_workflows$ ./workflowdiff.py Galaxy-Workflow-main_new.ga Galaxy-Workflow-main_simple_pe.ga | grep -v tool_version | grep -v changeset_revision 
Comparing galaxyproject workflows:
1: Galaxy-Workflow-main_new.ga
2: Galaxy-Workflow-main_simple_pe.ga
Warning! Workflow "Galaxy-Workflow-main_new.ga" has duplicate "gatk4_GetPileupSummaries" steps! Please rename at least one of these.
Warning! Workflow "Galaxy-Workflow-main_simple_pe.ga" has duplicate "gatk4_GetPileupSummaries" steps! Please rename at least one of these.
1.annotation: with special wrappers for the bwa and RNA-star. RNA star version changed to 2.7.6
2.annotation: 
1.name: main_new
2.name: main_simple_pe
    Only in 2.steps: empty file, do not select anything
            Only in 1.steps.scripts_optitype.input_connections: sample
            1.steps.scripts_optitype.tool_state.sample: {'__class__': 'ConnectedValue'}
            2.steps.scripts_optitype.tool_state.sample: test_sample
            Only in 2.steps.bwa tumor exome.inputs: list[0]
            Only in 2.steps.bwa tumor exome.inputs: list[1]
                1.steps.bwa tumor exome.tool_state.left_reads.__class__: ConnectedValue
                2.steps.bwa tumor exome.tool_state.left_reads.__class__: RuntimeValue
                1.steps.bwa tumor exome.tool_state.right_reads.__class__: ConnectedValue
                2.steps.bwa tumor exome.tool_state.right_reads.__class__: RuntimeValue
            Only in 2.steps.bwa normal exome.inputs: list[0]
            Only in 2.steps.bwa normal exome.inputs: list[1]
                1.steps.bwa normal exome.tool_state.left_reads.__class__: ConnectedValue
                2.steps.bwa normal exome.tool_state.left_reads.__class__: RuntimeValue
                1.steps.bwa normal exome.tool_state.right_reads.__class__: ConnectedValue
                2.steps.bwa normal exome.tool_state.right_reads.__class__: RuntimeValue
            Only in 1.steps.star_geno.inputs: list[0]
            Only in 1.steps.star_geno.inputs: list[1]
                1.steps.star_geno.tool_state.reads1.__class__: RuntimeValue
                2.steps.star_geno.tool_state.reads1.__class__: ConnectedValue
                1.steps.star_geno.tool_state.reads2.__class__: RuntimeValue
                2.steps.star_geno.tool_state.reads2.__class__: ConnectedValue
            Only in 1.steps.MergeBamAlignment tumor exome all reads and mapped reads.inputs: list[1]
                            1.steps.MergeBamAlignment tumor exome all reads and mapped reads.tool_state.aligned_or_read1_and_read2.aligned_bams.list[0].aligned_bam.__class__: RuntimeValue
                            2.steps.MergeBamAlignment tumor exome all reads and mapped reads.tool_state.aligned_or_read1_and_read2.aligned_bams.list[0].aligned_bam.__class__: ConnectedValue
                1.steps.MergeBamAlignment tumor exome all reads and mapped reads.tool_state.unmapped_bam.__class__: RuntimeValue
                2.steps.MergeBamAlignment tumor exome all reads and mapped reads.tool_state.unmapped_bam.__class__: ConnectedValue
            Only in 1.steps.MergeBamAlignment normal exome all reads and mapped reads.inputs: list[1]
                            1.steps.MergeBamAlignment normal exome all reads and mapped reads.tool_state.aligned_or_read1_and_read2.aligned_bams.list[0].aligned_bam.__class__: RuntimeValue
                            2.steps.MergeBamAlignment normal exome all reads and mapped reads.tool_state.aligned_or_read1_and_read2.aligned_bams.list[0].aligned_bam.__class__: ConnectedValue
                1.steps.MergeBamAlignment normal exome all reads and mapped reads.tool_state.unmapped_bam.__class__: RuntimeValue
                2.steps.MergeBamAlignment normal exome all reads and mapped reads.tool_state.unmapped_bam.__class__: ConnectedValue
            Only in 1.steps.MergeBamAlignment  tumor transcriptome.inputs: list[1]
                            1.steps.MergeBamAlignment  tumor transcriptome.tool_state.aligned_or_read1_and_read2.aligned_bams.list[0].aligned_bam.__class__: RuntimeValue
                            2.steps.MergeBamAlignment  tumor transcriptome.tool_state.aligned_or_read1_and_read2.aligned_bams.list[0].aligned_bam.__class__: ConnectedValue
                1.steps.MergeBamAlignment  tumor transcriptome.tool_state.unmapped_bam.__class__: RuntimeValue
                2.steps.MergeBamAlignment  tumor transcriptome.tool_state.unmapped_bam.__class__: ConnectedValue
                1.steps.MergeBamAlignment  tumor transcriptome.workflow_outputs.list[0].label: MergeBamAlignment on input dataset(s): BAM with merged alignments
                2.steps.MergeBamAlignment  tumor transcriptome.workflow_outputs.list[0].label: None
                1.steps.gatk4_GetPileupSummaries.input_connections.bam.id: gatk4_ApplyBQSR tumor exome
                2.steps.gatk4_GetPileupSummaries.input_connections.bam.id: gatk4_ApplyBQSR normal exome
                1.steps.gatk4_GetPileupSummaries_2.input_connections.bam.id: gatk4_ApplyBQSR normal exome
                2.steps.gatk4_GetPileupSummaries_2.input_connections.bam.id: gatk4_ApplyBQSR tumor exome
                1.steps.gatk4_CalculateContamination.input_connections.n_table.id: gatk4_GetPileupSummaries_2
                2.steps.gatk4_CalculateContamination.input_connections.n_table.id: gatk4_GetPileupSummaries
                1.steps.gatk4_CalculateContamination.input_connections.t_table.id: gatk4_GetPileupSummaries
                2.steps.gatk4_CalculateContamination.input_connections.t_table.id: gatk4_GetPileupSummaries_2
            Only in 2.steps.SnpEff eff:.input_connections: intervals
            Only in 2.steps.SnpEff eff:.input_connections: transcripts
            Only in 1.steps.SnpEff eff:.inputs: list[0]
            Only in 1.steps.SnpEff eff:.inputs: list[1]
                1.steps.SnpEff eff:.tool_state.intervals.__class__: RuntimeValue
                2.steps.SnpEff eff:.tool_state.intervals.__class__: ConnectedValue
                1.steps.SnpEff eff:.tool_state.transcripts.__class__: RuntimeValue
                2.steps.SnpEff eff:.tool_state.transcripts.__class__: ConnectedValue
            Only in 1.steps.scripts_deconvolution.input_connections: sample
            1.steps.scripts_deconvolution.tool_state.sample: {'__class__': 'ConnectedValue'}
            2.steps.scripts_deconvolution.tool_state.sample: test_sample
1.version: 17
2.version: 2

Also it says the old workflow has “None” in the tool versions:

transgeneprep@transgeneprep-System-Product-Name:~/dima/galaxy_workflows$ ./workflowdiff.py Galaxy-Workflow-main_new.ga Galaxy-Workflow-main_simple_pe.ga | grep tool_version
        1.steps.Sample_tumor name.tool_version: 0.1.1
        2.steps.Sample_tumor name.tool_version: None
        1.steps.Sample_normal name.tool_version: 0.1.1
        2.steps.Sample_normal name.tool_version: None
        1.steps.scripts_optitype.tool_version: 1.0
        2.steps.scripts_optitype.tool_version: None
        1.steps.FastqToSam tumor transcriptome.tool_version: 2.18.2.1
        2.steps.FastqToSam tumor transcriptome.tool_version: None
        1.steps.bwa tumor exome.tool_version: 0.1.0
        2.steps.bwa tumor exome.tool_version: None
        1.steps.bwa normal exome.tool_version: 0.1.0
        2.steps.bwa normal exome.tool_version: None
        1.steps.FastqToSam tumor exome.tool_version: 2.18.2.1
        2.steps.FastqToSam tumor exome.tool_version: None
        ........
        ........

I also see that some steps in the old workflow have “input_connections” entries that are not mentioned in “inputs”:

{
            "annotation": "",
            "content_id": "toolshed.g2.bx.psu.edu/repos/devteam/picard/picard_MergeBamAlignment/2.18.2.1",
            "errors": null,
            "id": 23,
            "input_connections": {
                "aligned_or_read1_and_read2|aligned_bams_0|aligned_bam": {
                    "id": 18,
                    "output_name": "mapping"
                },
                "reference_source|ref_file": {
                    "id": 7,
                    "output_name": "output"
                },
                "unmapped_bam": {
                    "id": 19,
                    "output_name": "outFile"
                }
            },
            "inputs": [
                {
                    "description": "runtime parameter for tool MergeBamAlignment",
                    "name": "reference_source"
                }
            ],
            "label": "MergeBamAlignment  tumor transcriptome",
            "name": "MergeBamAlignment",
            "outputs": [
                {
                    "name": "outFile",
                    "type": "bam"
                }
            ],
            "position": {
                "bottom": 1706.8125,
                "height": 83,
                "left": 2535.71875,
                "right": 2585.71875,
                "top": 1623.8125,
                "width": 50,
                "x": 2535.71875,
                "y": 1623.8125
            },
            "post_job_actions": {},
            "tool_id": "toolshed.g2.bx.psu.edu/repos/devteam/picard/picard_MergeBamAlignment/2.18.2.1",
            "tool_shed_repository": {
                "changeset_revision": "9ffcddf6f9c0",
                "name": "picard",
                "owner": "devteam",
                "tool_shed": "toolshed.g2.bx.psu.edu"
            },
            "tool_state": "{\"add_mate_cigar\": \"true\", \"aligned_or_read1_and_read2\": {\"aligned_or_read1_and_read2_selector\": \"paired_one_file\", \"__current_case__\": 0, \"aligned_bams\": [{\"__index__\": 0, \"aligned_bam\": {\"__class__\": \"ConnectedValue\"}}]}, \"aligned_reads_only\": \"false\", \"aligner_proper_pair_flags\": \"false\", \"attributes_to_remove\": [], \"attributes_to_retain\": [], \"clip_adapters\": \"true\", \"clip_overlapping_reads\": \"true\", \"include_secondary_alignments\": \"true\", \"is_bisulfite_sequence\": \"false\", \"max_insertions_or_deletions\": \"1\", \"orientations\": null, \"primary_alignment_strategy\": \"BestMapq\", \"read1_trim\": \"0\", \"read2_trim\": \"0\", \"reference_source\": {\"reference_source_selector\": \"history\", \"__current_case__\": 1, \"ref_file\": {\"__class__\": \"RuntimeValue\"}}, \"unmapped_bam\": {\"__class__\": \"ConnectedValue\"}, \"validation_stringency\": \"LENIENT\", \"__page__\": null, \"__rerun_remap_job_id__\": null}",
            "tool_version": null,
            "type": "tool",
            "uuid": "d693b14d-4916-4639-a81b-79893364a357",
            "workflow_outputs": [
                {
                    "label": null,
                    "output_name": "outFile",
                    "uuid": "516d98f0-dd12-435a-976a-0944ab714586"
                }
            ]
        }
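
In the same spirit as the diff script above, here is a small follow-up sketch to list both oddities directly from a single .ga file (steps whose tool_version is null, and input_connections whose top-level parameter name never appears in the step’s “inputs” list). Again, workflow.ga is just a placeholder filename:

#!/usr/bin/env python3
# Sketch: flag steps with a missing tool_version and input_connections whose
# top-level parameter name is not declared in the step's "inputs" list.
import json

with open("workflow.ga") as fh:
    workflow = json.load(fh)

for key, step in workflow["steps"].items():
    label = step.get("label") or step.get("name")
    if step.get("type") == "tool" and step.get("tool_version") is None:
        print("step " + key + " (" + str(label) + "): tool_version is None")
    declared = set(i["name"] for i in step.get("inputs", []))
    for conn_name in step.get("input_connections", {}):
        # nested parameters look like "aligned_or_read1_and_read2|aligned_bams_0|aligned_bam"
        if conn_name.split("|")[0] not in declared:
            print("step " + key + " (" + str(label) + "): connection '" + conn_name + "' not listed in inputs")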

Hi @wormball

Thanks for posting back; hopefully it helps others, and the advice below helps you. It sounds like you’ll need to update the older workflow. You’ll want to be using the most current version of any tool if running the most current version of Galaxy. Or, set up an environment that matches the version of Galaxy the workflow used to work in (Docker, etc.). Python, Conda, the underlying wrapped tool versions/tool forms, other dependencies – all change over time, as does Galaxy itself.

  1. Older workflows would still work if run in the same exact version of Galaxy as they originally ran, including the tool versions and local environment. It may be too late for that option for you for the prior work, but you might want to consider it for future work/reproducibility reasons.
  2. Setting up a Docker Galaxy is one way to maintain a stable computational environment. Some examples of those are included in the Galaxy Training network’s resources (not all tutorials are updated for the most current release immediately and/or require special configuration) – example: Galaxy Training! (scroll down to find it)
  3. The metadata in workflows is included directly in each step in the attributes. The input type is one example. Reference data being pre-set or selected at runtime is another.
  4. Tool versions didn’t always exist in the same format as they do now.
  5. Workflow versioning now exists and that plus related projects are pending even more updates.
  6. About the issue ticket: Using collections or not is not the primary issue – that was just how the conversation was started (an observation). The other issues linked to the issue I posted have all sorts of other breakouts, as do other pull requests (some closed/merged, some open).
  7. The long list of workflow updates (ongoing): Issues · galaxyproject/galaxy · GitHub

The important parts for you probably are:

  1. All workflows now require that at least one input is included for every tool in the workflow and that the input type is specified. That can be an initial input set at the start of a workflow, or the input(s) can be the output(s) from an upstream tool. Several years ago, an initial input at the very start of a workflow wasn’t required (it could be added at runtime), but that caused other problems/confusion for end-users, plus a few technical issues.
  2. If you change the input type (collection or not, initial input or the output from an upstream tool), you may need to reset the workflow connections so that the downstream tool understands what is being inputted and how to handle the data. That is the metadata that can cause conflicts – the settings the downstream tool is expecting and has configured are from the prior input, but once a change is made, those settings are no longer valid. Reconnecting the noodles updates those settings.
  3. Backwards compatibility is always a very important consideration for our developers but isn’t always possible. This is another area where a contained environment like a Docker image can be helpful. Tools can also be individually configured to execute in specific environments (most already do “under the hood” with conda and other dependencies containerized). Ask the admins at Gitter if you want to learn more about that, although I don’t think it will really help for your case (older workflow structure seems to be the primary issue).

More help:


I think I found not the root of the problem, but a way to reproduce it.

If I invoke a workflow on some history (and set “Send results to a new history”), and then try to invoke one more workflow on the same dataset (while the first workflow is still running, or was purged while running), the second workflow has quite a good chance of not scheduling correctly. Moreover, all future workflow invocations on this dataset are doomed to fail.

If I then “copy” such a history, the workflow invocation on the “copied” history works fine. But if I first “purge” the “original” history or some other “copies” of this history, some steps in the invocation dialog that are connected to “Input dataset” steps say, e.g.:

parameter 'fastq': the previously selected dataset has been deleted.
Input fastq file for the first read in paired end data
Output dataset 'output' from step 6

And the same messages appear for all my fastq.gz input files (but interestingly not for the human genome FASTA file, maybe because I uploaded it separately and also “copied” it to other histories). However, if I select some other file in one of the “Input dataset” dialogs, these messages disappear, but the overall invocation still ends up with the “Workflow submission failed” error.

So the workflows per se are fine, but the root of all evil is that Galaxy pretends to be “massively parallel” and “seamless” while it actually is not. So when I wanted to save some time by skipping the “upload” steps and invoking a second workflow while the first was not done, it resulted in at least two weeks of my time wasted. I think Galaxy should at least warn reckless users when they decide to play with such fancy stuff as copying, purging, or pressing too many buttons per minute, if this bug (or bugs) is impossible to fix.


Hi @wormball

Datasets are in a “locked” state to prevent changes while they are being used as an input by a tool. This preserves all of the original metadata associated with the inputs. If that tool’s output is deleted/purged, the job may have already been sent to a cluster for processing. The job isn’t killed immediately, but once it is returned from the cluster the output will fail to write to the placeholder outputs since they are deleted/purged. You can confirm this by going into any history, adding a dataset, and running a tool against it that will take longer to execute. Tags can be added, but the dataset’s metadata cannot be changed (the name, datatype, etc). However, that dataset (unchanged) can be repeatedly used as an input to other tools.

Datasets in your own history are copies of each other. The original dataset is the master, the others are clones. If a change is made to a clone, it then becomes its own “master”.

Scheduling a workflow (a group of 1 or more tools) works in a similar way – however, it is newer functionality. I’m wondering if the “locked” state for the input datasets is waiting for the scheduled workflow (and contained tools, or at least the first tool that is using the input) to quit out fully before the dataset can be copied/cloned into another history.

An immediate workaround would probably be to keep the “master” version of a dataset in a different history (or a shared data library) and work with copies/clones in other histories. Let’s ask our developer @mvdbeek what he thinks. Stress testing helps us to make Galaxy better, and he may have a more detailed explanation about what is going on: whether this is the intended behavior (don’t allow valid copies/clones to be created from a “locked” dataset), or whether it is some known issue or not.

To confirm:

  1. You are running the most current version of Galaxy?
  2. Using the most current version of the tools included in the workflow?
  3. And a newly created workflow?

All this will help with troubleshooting the issue. Meanwhile, I’ll ping Marius. He is in the EU, so the reply may not be immediate.

Thanks again for explaining odd behavior that you are running into!


If I then “copy” such a history, the workflow invocation on the “copied” history works fine. But if I first “purge” the “original” history or some other “copies” of this history, some steps in the invocation dialog that are connected to “Input dataset” steps say, e.g.:

Your workflow runs on the input dataset in the original history, even if the outputs are sent to another history. That is the expected behavior.

However, if I select some other file in one of the “Input dataset” dialogs, these messages disappear, but the overall invocation still ends up with the “Workflow submission failed” error.

What exactly is this error? You’ll want to look at your logs.
It is very difficult to guess what your workflow is doing without having the workflow at hand, but please avoid any runtime data inputs. That is a legacy feature that may not work correctly for more complex workflows.
If the tool versions are missing from the exported workflow, you may want to update Galaxy to a later release ideally, or at least to the newest commit on the 20.09 branch; that sounds like a bug we fixed some time ago.


You’ll want to look at your logs.
It is very difficult to guess what your workflow is doing without having the workflow at hand

Unfortunately, I am not allowed to publish the whole workflow/toolchain, but here is the log for this week.

please avoid any runtime data inputs

But how can I define any input file if I cannot change it at invocation time?

If the tool versions are missing from the exported workflow, you may want to update Galaxy to a later release ideally, or at least to the newest commit on the 20.09 branch; that sounds like a bug we fixed some time ago.

Are you talking about the “tool versions are missing” or the “Workflow submission failed” kind of bug? The tool versions come back into existence when I “edit” and “save” the workflow. Anyway, I will try to update Galaxy soon (I am using 20.09).

I’m not sure we’re talking about the same issue. Runtime data inputs are input datasets you select outside of the dataset input select boxes. The first one in the screenshot is fine, the second one is to be avoided. You can change the input datasets at any point, and you can include optional input datasets.
[Screenshot 2021-04-29 at 15.22.20]

Based on

                1.steps.bwa tumor exome.tool_state.right_reads.__class__: ConnectedValue
                2.steps.bwa tumor exome.tool_state.right_reads.__class__: RuntimeValue

one of these inputs is now a RuntimeValue instead of a ConnectedValue, and it should appear in an expanded box in the workflow run form.
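
If it helps to locate those, here is a small sketch (along the lines of the diff script earlier in the thread) that lists every parameter still left as a RuntimeValue in an exported .ga file; workflow.ga is a placeholder filename:

#!/usr/bin/env python3
# Sketch: walk each step's tool_state (a JSON string inside the .ga JSON) and
# report every parameter whose "__class__" is "RuntimeValue", i.e. left to be
# filled in at runtime.
import json

def find_runtime_values(node, path=""):
    if isinstance(node, dict):
        if node.get("__class__") == "RuntimeValue":
            yield path
        for k, v in node.items():
            yield from find_runtime_values(v, path + "." + k if path else k)
    elif isinstance(node, list):
        for i, v in enumerate(node):
            yield from find_runtime_values(v, path + "[" + str(i) + "]")

with open("workflow.ga") as fh:
    workflow = json.load(fh)

for key, step in workflow["steps"].items():
    state = step.get("tool_state")
    if not state:
        continue
    tool_state = json.loads(state) if isinstance(state, str) else state
    label = step.get("label") or step.get("name")
    for p in find_runtime_values(tool_state):
        print("step " + key + " (" + str(label) + "): RuntimeValue at " + p)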

I’m referring to

Also it says the old workflow has “None” in the tool versions:

Hard to predict what the outcome of this is; it may or may not be a problem. Better to avoid it altogether, but if you can’t, I’d suggest walking through the workflow in the editor and making sure that when you upgrade to the latest tool version (which is what happens if there is no tool version included), all the nodes are properly connected and the parameters still make sense.

If you can update to 21.01, a lot of the problems I’m mentioning here will be highlighted in the workflow editor, under workflow options → workflow best practices, and there were also a lot of workflow-related bugfixes and enhancements, not all of which we were able to backport to 20.09.


Oh, and another thing that is generally helpful, especially if you can’t share the workflow in its entirety, is to break it down to the failing steps only. Often that will be sufficient to figure out what is wrong, and in case there is a bug, it’s much easier for us to fix it.


Runtime data inputs are input datasets you select outside of the dataset input select boxes.

These were present in the “original” workflow, but I replaced them all with “input datasets”. However, I am not sure which workflow it was, as I tried many workflows. I will hopefully try this again.

is to break it down to the failing steps only

I tried. But sometimes it works with more steps and does not work with fewer steps. I also tried to figure something out from the logs, but failed; I hope you as the developer can do so. If bjoern.gruening’s guess is right, maybe the bwa-mem XML will be helpful:

<tool id="bwa_geno" name="bwa_geno" version="0.1.0">
  <description>BWA for exome</description>
  <command>bwa mem -t '$threads' -R  '@RG\tID:${sample}_${type}\tSM:${sample}\tLB:LIB1\tPL:ILLUMINA\tPU:UNIT1' '${__tool_directory__}/genome/hg38.analysisSet.fa' '$left_reads' '$right_reads' | samtools sort -@ '$threads' -o '${mapping}'; samtools index '${mapping}' </command>
  <stdio>
    <regex match="^$" source="stdout" level="warning" description="Empty exit code" />
  </stdio>
  <inputs>
    <param  name="threads" type="text" label="number of threads"/>
    <param  name="sample" type="text" label="sample name"/>
    <param  name="type" type="text" label="type: tumor/normal"/>
    <param format="fastq" name="left_reads" type="data" label="left reads"/>
    <param format="fastq" name="right_reads" type="data" label="right reads"/>
  </inputs>
  <outputs>
    <data format="bam" name="mapping" />
  </outputs>

  <tests>
    <test>
    <param name="threads" value="4" />
    <param  name="sample" value="h1803"/>
    <param  name="type" value="tumor"/>
    <param name="input_left_reads" value="input_left_reads.fastq"/>
    <param name="input_right_reads" value="input_right_reads.fastq"/>
	<output format="bam" name="output.bam" />
    </test>
  </tests>

  <help>
  This tool is a wrapper over BWA that inserts bam file headers from sample information
  </help>

</tool>

As far as I know, these XMLs were needed mainly to be able to set the number of threads (which is not obvious to set in the ToolShed tools), and also to configure a pre-computed database (which, as I have read, is supposed to be done via data managers, but I have not tried this yet).


Hi @wormball

This tool

<tool id="bwa_geno" name="bwa_geno" version="0.1.0">

does not exist in either the Main ToolShed https://toolshed.g2.bx.psu.edu/ or the Test ToolShed https://testtoolshed.g2.bx.psu.edu.

It isn’t found in any public Galaxy-related resource either (see Searching the Galaxy). The link out to regular Google also doesn’t hit anything meaningful.

And there isn’t anything meaningful in any public GitHub repo (Galaxy or not): Search · bwa_geno · GitHub

This looks like a custom tool wrapper. You could try to fix it (Tools in Galaxy) or replace it with a supported tool. Install from the Main Toolshed. The Test Toolshed is a “sandbox” with all sorts of old or non-functioning tools.