SPADES - Remote job server indicated a problem running or monitoring this job.

Hi,
I launched a job on Friday for assembly with SPAdes on usegalaxy.org
I used FastQC to check the quality of my uploaded sequences and Trimmomatic to trim my data beforehand. I used the “paired” output from Trimmomatic to launch SPAdes.
I launched 24 assemblies at the same time (all from the same paired-end Illumina run; 24 samples) with the same settings (run only assembly; careful correction; k-mers to use 33,55,99; coverage cutoff auto).
Two jobs are still ongoing, 12 succeeded and 10 failed.
For the failed jobs, the error reported is:

tool error
An error occurred with this dataset:
Remote job server indicated a problem running or monitoring this job.

I’m not too sure what that means.
I don’t understand why 10 jobs failed: from the quality analysis (FastQC), I could not see any difference between the 12 that worked and the 10 that failed.
Can this be a server issue? Is it worth it to re-launch the jobs that failed?

Thanks for your help
Vanina

Hi @vanina.guernier

Yes, there were some server issues and SPAdes was one impacted tool. Not all runs, just some.

The error message you posted can be produced by a server/cluster issue or an input issue. Given what you describe and the timing, this is most likely server-related. Rerun the failed jobs and see if that works.

At least one rerun is a good idea anyway – unless the input problem is obvious. Due to the high volume of work done at this server, some small fraction of jobs will fail even under normal conditions. But about half of a batch failing usually means more is going on – so glad you asked about it!
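For anyone tracking a larger batch, the "rerun the failures, but ask if too many fail" advice can be sketched in a few lines of Python. This is only an illustration: the sample names are made up, the states mirror the counts reported above (12 ok, 10 error, 2 running), and the 25% alarm threshold is an assumption, not an official Galaxy rule.

```python
from collections import Counter


def triage_batch(job_states, alarm_fraction=0.25):
    """Summarize a batch of jobs and list the ones worth rerunning.

    job_states maps a sample name to its dataset state ("ok", "error",
    "running", ...). alarm_fraction is an assumed threshold above which
    the failure rate looks like more than random noise.
    """
    counts = Counter(job_states.values())
    finished = counts["ok"] + counts["error"]
    failed = sorted(name for name, state in job_states.items() if state == "error")
    failure_fraction = counts["error"] / finished if finished else 0.0
    return {
        "counts": dict(counts),
        "rerun": failed,
        "failure_fraction": failure_fraction,
        "ask_for_help": failure_fraction >= alarm_fraction,
    }


# Hypothetical sample names; the counts match the batch described in this topic.
states = {f"sample_{i:02d}": "ok" for i in range(1, 13)}
states.update({f"sample_{i:02d}": "error" for i in range(13, 23)})
states.update({f"sample_{i:02d}": "running" for i in range(23, 25)})

report = triage_batch(states)
print(report["counts"])
print("rerun:", report["rerun"])
print("ask for help:", report["ask_for_help"])
```

Jobs that are still running are left out of the failure fraction, so the result only reflects jobs that have actually finished.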

FAQ: https://galaxyproject.org/support/tool-error/#type-cancelled-by-admin-or-a-cluster-failure

If the reruns fail again, please let us know:

  1. Send in a bug report from any of the new errored results and include a link to this Galaxy Help topic in the comments (for context)
  2. Please do not delete the inputs or outputs
  3. Quickly post back here that you sent the bug report in, to make sure we find it

Thanks!

Thank you !!
Galaxy is down for scheduled maintenance at the moment so I will try to rerun the failed jobs when everything is back to normal and will keep you posted.
Thanks again
Vanina

Yes, please try again after the scheduled maintenance is completed.

For others curious about downtime for server upgrades (now or later), this is where to find out about usegalaxy.* server and related core component status:

https://galaxyproject.statuspage.io/
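If you would rather check the status from a script than from the browser, here is a rough sketch. It assumes the page exposes the standard Statuspage JSON endpoint at `/api/v2/status.json`; that path is an assumption about this particular page, so confirm it in a browser first. The sample payload below is illustrative, not a real response from the server.

```python
import json
from urllib.request import urlopen

# Assumed endpoint: hosted Statuspage sites usually serve this path.
STATUS_URL = "https://galaxyproject.statuspage.io/api/v2/status.json"


def summarize_status(payload):
    """Return (all_clear, message) from a Statuspage-style status payload."""
    status = payload.get("status", {})
    indicator = status.get("indicator", "unknown")
    description = status.get("description", "no description")
    # In the Statuspage format, "none" means no incident is being reported.
    return indicator == "none", f"{indicator}: {description}"


def fetch_status(url=STATUS_URL, timeout=10):
    """Download and decode the status JSON (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Offline example with an illustrative payload.
sample_payload = {
    "status": {"indicator": "major", "description": "Partial System Outage"}
}
all_clear, message = summarize_status(sample_payload)
print(all_clear, message)
```

To query the live page, call `summarize_status(fetch_status())`. Waiting to rerun failed jobs until the first value is `True` would be one reasonable way to use it.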

Hi!
So I tried to re-run the 10 failed jobs.
I now had 6 successful jobs and 4 that still failed.
Because it seems a bit inconsistent, I reported a bug as you suggested, flagging this galaxy help link.

Thanks for helping
Vanina

@vanina.guernier Thanks! reviewing now

Hi
I wanted to let you know that I just tried to run some of the same data with Unicycler as I had both short and long reads for 9 samples.
All of them failed in a few minutes.
Again it’s showing the same error:

tool error
An error occurred with this dataset:
Remote job server indicated a problem running or monitoring this job.
Of note, I used:

  • the Illumina data R1 and R2 as first and second set of reads
  • MinION files as long reads (but output from filtlong instead of raw data, as I had some issues before with these).
    All other settings were left at their defaults.

Should I open this as a separate request for help??
Thanks

Vanina

No need to open another topic here, let’s keep it all together for context.

If the Unicycler runs are in the same history, I’ll load it up again to capture the new runs and review it with the others. If it is in a different history, please send in another bug report – that makes it easy to make sure we are reviewing the same data :slight_smile:

Thanks for doing more testing and reporting problems!!

If I am not mistaken, I have only one history.
But just to be sure, I reported one of the bugs and linked to this chat again.
Of note, as I ran into trouble before when using trimmed sequences (with Trimmomatic) in Unicycler, I tested a few different options for the same sample:

  • trimmed Illumina reads + filtered Nanopore reads (using filtlong)
  • trimmed Illumina reads + raw Nanopore reads
  • raw Illumina reads + filtered Nanopore reads (using filtlong)
  • raw Illumina reads + raw Nanopore reads

None of them worked, and the jobs were not queued: they failed straight away, in minutes or sometimes in seconds (whereas when my jobs failed with SPAdes, they were queued for days before finally failing).

Thanks for your help
Vanina

Thanks, I saw your report plus can reproduce the “fails immediately” issue with test data across a set of specific tools.

There is a problem with one of our dedicated clusters. Unicycler, Spades, plus a few other tools are impacted. Our administrator is working on it. We’ll post back an update tomorrow, even if not fully resolved yet.

If your work is urgent, you might want to try the usegalaxy.eu server as a short-term alternative for now. Most public Galaxy servers are very very busy, so using more than one is common. Usage terms are one account per person per public Galaxy server – so it is fine to use both, plus any others that host the tools you are interested in: https://galaxyproject.org/use/

All the feedback is appreciated!

Thanks! It’s kind of reassuring to know that it’s a server issue rather than a problem with my data.
I’m guessing that if I want to use http://usegalaxy.eu I will need to upload all my raw data again to the European server? (the different servers are not “linked”, right?)
Vanina

Update: The usegalaxy.org server now has a status banner for the impacted tools. When that clears or updates, then tools not listed will run again.

Certain large memory tools are not functioning due to issues with the Bridges supercomputer at PSC. Affected tools include Trinity, SPAdes, Unicycler, and RNA STAR.


The problem is still being resolved, so using usegalaxy.eu is a good idea for now.

And yes, accounts are distinct between these two servers (and most public Galaxy servers). There are some domain-specific sub-servers where the same account is used across all, but you’ll be able to tell by the server’s URL.

Example: covid19.usegalaxy.eu is a subdomain server of usegalaxy.eu and your account at both would be the same, but different from your account at usegalaxy.org. Accounts at any usegalaxy.eu or usegalaxy.org servers (including subdomain servers) are distinct from any account at usegalaxy.org.au.

How to move data around between Galaxy servers by URL (avoiding local download/upload): https://galaxyproject.org/support/#data-options

  • Datasets can be copied directly between servers. Capture the link from the disc icon in the “from” dataset then paste that into the Upload tool at the “to” server. The data will load into the active history. If you have many datasets, copy the URLs into some text editor scratch space, then paste all in at the same time and the data will load in batch.

  • Histories can be exported as an archive at the “from” server (history menu option) then loaded by URL at the “to” server (User > Histories > Import). Be aware that very large histories will take some time to create an archive from, and then more time to load at the other server. To reduce an original history’s size, copy just the datasets you need into a new history, then archive/transfer that.

  • Workflows can also be transferred between servers by URL (Workflow view for both). Generate a share link for the workflow on the “from” server and paste that URL into the “to” server (Workflow > Import). Note that different public Galaxy servers host different tools/versions – so be sure to open, review, and update any workflows as needed before running them.
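For the "scratch space" step in the first bullet, a small script can stand in for the text editor when there are many datasets. This is a minimal sketch, assuming the Upload tool accepts one URL per line in its paste box; the dataset links below are placeholders, not real datasets.

```python
from urllib.parse import urlparse


def build_upload_batch(urls):
    """Clean a list of copied dataset links into one block of text.

    Blank entries and exact duplicates are dropped; anything that is not
    an http(s) link raises ValueError so a bad paste is caught early.
    """
    seen = set()
    cleaned = []
    for raw in urls:
        url = raw.strip()
        if not url:
            continue
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not a usable dataset link: {url!r}")
        if url not in seen:
            seen.add(url)
            cleaned.append(url)
    # One URL per line, ready to paste into the Upload tool in one go.
    return "\n".join(cleaned)


# Placeholder links standing in for those copied from the disc icon.
copied_links = [
    "https://usegalaxy.org/datasets/EXAMPLE_ID_1/display?to_ext=fastq",
    "  https://usegalaxy.org/datasets/EXAMPLE_ID_2/display?to_ext=fastq ",
    "",
    "https://usegalaxy.org/datasets/EXAMPLE_ID_1/display?to_ext=fastq",
]
batch = build_upload_batch(copied_links)
print(batch)
```

Paste the printed block into the Upload tool on the “to” server and the datasets should load into the active history together.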

Ok, thanks for the update.
I’m afraid I’m not quite clear on the transfer of datasets though:

Capture the link from the disc icon

What is the disc icon? Whatever I click on does not change the URL on the web browser.
The only “disc” I can see is this one, on the upper left when I click on a dataset in my history:
[Screenshot: Screen Shot 2020-06-23 at 3.46.59 PM]
But it only lets me download the data, as shown in the picture, so I can’t find a link.
I tried to find the information elsewhere but I’m stuck, sorry.
Vanina

Hold the control key down and click on the disc icon. A menu will appear with an option to copy the link.

See https://galaxyproject.org/support/#basics and then https://galaxyproject.org/support/download-data/. That page, plus the others in that group, explains the functions within datasets, the icons, and related features.

Hi
I managed to transfer the needed data to usegalaxy.eu
All the Unicycler runs that I launched worked!!
So I will probably try to do the same thing with SPAdes for my 4 remaining failed jobs.

Thanks a lot for your help, that was really useful.
Vanina
