Debugging a stuck workflow

Hi there

Our Galaxy server at SANBI is running Galaxy 20.01 with 5 mules.

I have a workflow invocation that seems to be stuck, even though at some level it is reported as running. The workflow takes two inputs: a dataset and a list of pairs, and it submits jobs to our cluster here at SANBI. The workflow invocation screen and part of the history look like this:

This suggests that there are some running jobs waiting to complete. However, zooming in on one of the final collections shows jobs that are still waiting to run:


and digging through the details of the workflow execution shows eight BWA-MEM jobs in the “new” state. Here is a snapshot of one from the job view in the Admin panel:

Somehow the scheduler seems to have stopped scheduling these jobs. Any tips on how to find out why, and how to get the scheduler to pick them up again?
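One way I've been poking at this is via the jobs API: job summaries come back as dicts with fields like `id`, `state`, `tool_id`, and `update_time`, so you can filter for jobs that have sat in “new” longer than some threshold. A minimal sketch (the sample data and the one-hour threshold are illustrative, not from my actual server):

```python
# Sketch: spot jobs stuck in the "new" state, given job summaries
# shaped like Galaxy API responses (id, state, tool_id, update_time).
from datetime import datetime, timedelta

def stuck_jobs(jobs, max_age=timedelta(hours=1), now=None):
    """Return jobs still in 'new' whose last update is older than max_age."""
    now = now or datetime.utcnow()
    out = []
    for job in jobs:
        if job["state"] != "new":
            continue
        updated = datetime.fromisoformat(job["update_time"])
        if now - updated > max_age:
            out.append(job)
    return out

# Illustrative sample of what the API might return:
jobs = [
    {"id": "abc123", "tool_id": "bwa_mem", "state": "new",
     "update_time": "2020-06-01T10:00:00"},
    {"id": "def456", "tool_id": "bwa_mem", "state": "running",
     "update_time": "2020-06-01T12:00:00"},
]
print([j["id"] for j in stuck_jobs(jobs, now=datetime(2020, 6, 1, 12, 0))])
# → ['abc123']
```

In practice you would feed this from BioBlend or a direct `GET /api/jobs` call rather than a hand-built list.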

P.S. There are no jobs in error, and the upstream datasets for these jobs completed fine.
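For anyone else debugging this: jobs that never leave “new” are often ones that were never assigned to a handler, which you can check directly in the Galaxy database. A query fragment along these lines (column names from the `job` table; adjust to your schema version):

```sql
-- List jobs stuck in 'new' and whether a handler ever claimed them.
SELECT id, tool_id, state, handler, create_time, update_time
FROM job
WHERE state = 'new'
ORDER BY create_time;
```

A NULL or unexpected `handler` value here would point at the handler/mule assignment rather than the cluster itself.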

Update: restarting the Galaxy server got the jobs re-scheduled, and they completed.