Killing jobs or dead processes after a database crash on Galaxy

Hello. Recently the old Galaxy instance that we host went through a period of heavy use that crashed the server and database several times. I am now stuck with several upload jobs for one user that will not error out or go away, even after a reboot.

Under “Admin” > “Manage jobs” I see all of their jobs listed, but there is no checkbox that lets me select them and submit them for stopping. Is there a way to do this from the terminal on the server?

Thank you.


Hi @glam02,

There is a really neat Galaxy admin utility called gxadmin that can help you out. It contains a bunch of queries and scripts that can look at jobs, fail jobs manually, etc.

You can get it from: https://github.com/usegalaxy-eu/gxadmin

The particular command you want to run is: gxadmin mutate fail-job <job_id>, where <job_id> is the ID of the stalled job in your database.

There is a little bit of setup required, mostly so that gxadmin knows how to talk to your database.
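
For reference, gxadmin runs its queries through psql, so the setup is mostly a matter of the standard PostgreSQL environment variables. A minimal sketch follows; the host, user, and database names are placeholders I made up, not values from this thread, so swap in your own.

```shell
# Placeholder connection settings (assumptions); replace with your own values.
export PGHOST=localhost
export PGUSER=galaxy
export PGDATABASE=galaxy

# The password is usually kept in ~/.pgpass rather than in the environment,
# one line per connection in the form host:port:database:user:password.
# To check the connection before using gxadmin, something like
#   psql -c 'SELECT 1;'
# should return a single row.
```

Run this as the same user that will run gxadmin, so the variables and ~/.pgpass are picked up.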

I hope this helps.

Simon.


@glam02 just a quick note (one of the gxadmin authors here). You’ll want to run:

gxadmin mutate fail-job <job_id>
gxadmin mutate fail-terminal-datasets

The first command just marks the job as failed. The second command finds all jobs in a terminal state and marks their outputs as “failed”; the output state, not the job’s ok/error status, is what is shown to users. I’ve been meaning to merge the second command into the first, but have not done it yet, I’m afraid.
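
If there are several stuck jobs, the two commands can be wrapped in a small loop. The following is only a rough sketch: fail_jobs and RUN are hypothetical names, and the job IDs are made up. It also assumes that gxadmin's mutate commands only preview their changes unless --commit is passed, which is worth confirming against gxadmin's built-in help for your version. By default the script just prints the commands it would run.

```shell
#!/bin/sh
# RUN defaults to "echo", so the commands are only printed.
# Call the script with RUN= (empty) to actually execute them.
RUN="${RUN-echo}"

# Hypothetical helper: fail each given job, then fix up the output datasets once.
fail_jobs() {
    for job_id in "$@"; do
        # Mark the job itself as failed (assumes --commit is needed to apply).
        $RUN gxadmin mutate fail-job "$job_id" --commit
    done
    # Mark the outputs of all terminal jobs as failed, so users see the error.
    $RUN gxadmin mutate fail-terminal-datasets --commit
}

# Example job IDs (made up); use the IDs listed under Admin > Manage jobs.
fail_jobs 1001 1002
```

Running fail-terminal-datasets once at the end is enough, since it looks at all terminal jobs rather than a single job ID.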


Thank you all! I will give the tool a try and post back on my progress!

Hello all,
Thank you for the tip! This was a great solution… it successfully killed the runaway jobs. I’m on to asking my next question, since I’m a newb to all this.