Extremely Long Queue Times for RNA-seq Data Processing

Dear Galaxy support,

I was wondering if you might be able to assist me. I have been trying to process some raw FASTQ files with several of the RNA-seq tools on the Galaxy Main server (usegalaxy.org) for a batch of 22 sample files. I first concatenated the raw FASTQ files corresponding to the same samples from different lanes, and used the merged file as my input for read trimming, mapping, and counting (using Trimmomatic, HISAT2 and featureCounts). The merging of the files worked with minimal difficulty (a few of the datasets had errors the first time, but when I reran the jobs, the merged FASTQ files all looked fine). I then made a collection of all of the FASTQ files, and queued the read trimming, HISAT2 and featureCounts. The read trimming went well for all 22 files in the collection, and for 18/22 files in the collection, the HISAT2 and featureCounts jobs ran. This was last Thursday, 5/28/2020. The other 4 trimmed fastqsanger files have been sitting in the gray “queued” status (pre-mapping) since last Thursday. The files are formatted identically to the other 18, which ran just fine. Per your guidelines about not cancelling and restarting jobs, I have not cancelled the jobs. I know HISAT2 is a relatively computationally intensive tool, but is it typical to have a wait time of more than four days without any progress? I don’t want to cancel the jobs and lose my spot in the queue, but if anyone could chime in as to why this might be taking so long, I would greatly appreciate it.
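For readers who want to do the per-lane merge outside Galaxy, here is a minimal sketch of that step. It assumes Illumina-style file names of the form `<sample>_L<lane>_<read>_001.fastq.gz`, which the post does not confirm; the pattern and the function names `group_by_sample` and `merge_lanes` are hypothetical. It relies on the fact that gzip files concatenated byte-for-byte still form a valid gzip stream.

```python
import re
import shutil
from collections import defaultdict
from pathlib import Path

# Assumed naming scheme (not taken from the post):
# <sample>_L<3-digit lane>_<R1|R2>_001.fastq.gz
LANE_PATTERN = re.compile(
    r"^(?P<sample>.+)_L\d{3}_(?P<read>R[12])_001\.fastq\.gz$"
)


def group_by_sample(filenames):
    """Map (sample, read) to that sample's lane files, sorted by name."""
    groups = defaultdict(list)
    for name in filenames:
        match = LANE_PATTERN.match(Path(name).name)
        if match is None:
            continue  # skip files that do not follow the assumed pattern
        groups[(match["sample"], match["read"])].append(name)
    return {key: sorted(files) for key, files in groups.items()}


def merge_lanes(groups, out_dir):
    """Concatenate each group's files byte-for-byte into one merged file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    merged = {}
    for (sample, read), files in groups.items():
        target = out_dir / f"{sample}_{read}.fastq.gz"
        with open(target, "wb") as out:
            for name in files:
                with open(name, "rb") as src:
                    shutil.copyfileobj(src, out)
        merged[(sample, read)] = target
    return merged
```

Calling `group_by_sample` on a directory listing first makes it easy to confirm that every sample has the expected number of lane files before anything is written.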

The other thing that happened with a couple of the 18 jobs that did run is that the trimmed FASTQ files did not map properly in HISAT2. The trimmed FASTQ files themselves looked fine (identical in format to those of the 16/18 successful jobs), but the HISAT2 read mapping output for 2/18 jobs came out as “0 mapped reads,” which is very strange. I have attempted to rerun these jobs, but the reruns have also been sitting in the queue since last Thursday.
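One possible first check when a mapper reports 0 mapped reads is to confirm that the input FASTQ is structurally intact. The sketch below is only that: it counts records and flags malformed ones, and it cannot detect other causes such as a mismatched reference genome. The function names `open_text` and `check_fastq_lines` are hypothetical, and the code assumes standard four-line FASTQ records.

```python
import gzip


def open_text(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def check_fastq_lines(lines):
    """Return (record_count, problems) for an iterable of FASTQ lines.

    Assumes four lines per record: header, sequence, '+', qualities.
    """
    problems = []
    record = []
    count = 0
    for number, line in enumerate(lines, start=1):
        record.append(line.rstrip("\r\n"))
        if len(record) < 4:
            continue
        header, seq, plus, qual = record
        first = number - 3  # line number of this record's header
        count += 1
        if not header.startswith("@"):
            problems.append(f"line {first}: header does not start with '@'")
        if not plus.startswith("+"):
            problems.append(f"line {first + 2}: separator does not start with '+'")
        if len(seq) != len(qual):
            problems.append(f"line {first}: sequence and quality lengths differ")
        record = []
    if record:
        problems.append("file ends with an incomplete record")
    return count, problems
```

For a downloaded dataset this could be used as `with open_text(path) as handle: count, problems = check_fastq_lines(handle)`, where `path` is wherever the file was saved.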

I am not nearly at my storage limit either, so I do not think that is the issue here… I checked the Galaxy server status on https://status.galaxyproject.org/, and it looks like there has not been any reported downtime in the last 4 days, so I really do not understand this. Please let me know if there is something you would recommend I do to move these jobs forward.

Thank you for your consideration,
–Kenny

Update: All of the jobs are still sitting in the queue and had not started as of this morning. Most have been in the queue since last Thursday. To test whether it might be a formatting issue, I set up another job with one of the trimmed FASTQ files that successfully processed through the featureCounts stage last week. That job is also just sitting in the queue. Is there something that might have happened with my account that would prevent these from running? I am still significantly below my storage quota… I am trying to be patient here, but tomorrow will be a week without any progress on most of these jobs, and I do not understand why. I would be grateful for any feedback, explanation, or advice on how to proceed that anyone might be able to offer.

Thank you,
–Kenny

Hi, I have been experiencing days-long queued and running statuses for the past 10 days, mainly with BWA mapping. Some FASTQ processing jobs got paused and I cannot un-pause them.

You will notice that support hasn’t been answering queries in this forum since roughly the same period. Therefore, it seems the platform is undergoing some instability.

Hi @kfelsen1

We received your email message at this server’s support mailing list and are reviewing your specific account/jobs. Please do not delete any queued jobs, or they will re-enter the job queue at the back, extending the wait time. We will send you more feedback about your specific work via email (help about potential input issues, etc).

In general, the Galaxy Main https://usegalaxy.org server is under very heavy usage. Due to heavy load and related factors, some queued jobs are delayed longer than usual. Our administrator is working to move those forward. Delays are expected for all work at most public Galaxy servers, including this one. There are many existing topics at this forum that explain the job execution delays across tools, analysis goals, and public resources. The basic underlying reason for delays is that there are simply more people using public resources, including Galaxy servers, for learning and computational work.

Large, computationally intensive work with time-sensitive deadlines is usually not appropriate for public Galaxy servers. Public Galaxy servers are shared resources. The good news is that there are many ways to use Galaxy!

A cloud version of Galaxy is often a good solution – the GVL version of Galaxy using AWS is a popular choice – simplified web-based administration, on-demand resource allocation, and the like. AWS has always offered grants for research work and that program was expanded, last time I checked, in an effort to help the many more people that are learning, teaching, and focusing on computational projects online. Galaxy itself is always free – but commercial storage/computational resources are not.

Full details:

Regarding:

Please note that the “quota space” represents the amount of data storage available. It is unrelated to the resources required to execute jobs/analysis. All tools hosted at usegalaxy.org already have the maximum computational resources allocated. If jobs fail for exceeding resources (red dataset with a memory or walltime aka “execution time” error), that means there is an input problem or the work really is too large to run at the public resource.

How to check: Troubleshooting resources for errors or unexpected results

@eduardofox2 Longer queue times are expected, especially for compute-intensive tools like BWA. If you have paused jobs, check the upstream jobs that are inputs. Were these completed successfully? You may need to examine data nested inside of dataset collections (some elements may have run successfully, and some not). Those upstream jobs can be rerun. And if you are using dataset collections, those reruns can replace the original failed results – tool forms include an option for this, located right above the job submission button. That said, if you would also like a closer review, send me a direct message here and we can troubleshoot more from there.
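As a rough illustration of the upstream check described above, the sketch below walks a dataset collection and lists every element that is not in the `ok` state. It assumes the collection has already been fetched as a dictionary (for example through the Galaxy API) with `elements`, `element_identifier`, `element_type`, and a nested `object` carrying `state`. That shape is an assumption here, so verify it against the server's actual response; the function name `unfinished_elements` is hypothetical.

```python
def unfinished_elements(collection):
    """Return (identifier, state) for every element whose state is not 'ok'.

    The dictionary layout is an assumed shape, not taken from the thread.
    """
    found = []
    for element in collection.get("elements", []):
        obj = element.get("object", {})
        if element.get("element_type") == "dataset_collection":
            # Nested collection (e.g. a list of pairs): recurse into it.
            for name, state in unfinished_elements(obj):
                found.append((f"{element['element_identifier']}/{name}", state))
        else:
            state = obj.get("state", "unknown")
            if state != "ok":
                found.append((element["element_identifier"], state))
    return found


# Usage with made-up example data:
example = {
    "elements": [
        {"element_identifier": "sample1", "element_type": "hda",
         "object": {"state": "ok"}},
        {"element_identifier": "sample2", "element_type": "hda",
         "object": {"state": "error"}},
    ]
}
print(unfinished_elements(example))
```

Any element reported here is a candidate for the rerun described above, with the option to replace the failed element in the collection.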

Thanks!

Thank you so much for the reply, Jennifer @jennaj! I understand and am very grateful for all the work people have put in to make Galaxy a public resource with reproducible outputs, and recognize that at times the load may be heavy. That would certainly appear to be the case now, since the jobs have been sitting in the queue for the last 8 days without visible progress. I certainly appreciate that Galaxy Main is free and think this is an amazing resource.

I would certainly be willing to pay for the service in order to move things along, and tried to set up a CloudMan instance through AWS yesterday. Unfortunately, I was not able to get that running successfully. I have no experience with setting up cloud servers. I tried to follow the instructions on https://galaxyproject.org/cloudman/getting-started/, but for some reason it did not work. The video on that page looks somewhat dated, since, as you mentioned, the latest iteration of the cloud version is the GVL, and the old Galaxy CloudMan mentioned in the video has since been deprecated. Nevertheless, I tried to set up the cloud version with both of these, but for some reason it did not seem to like the credentials I made with the AWS account I set up for that… Not sure where things went wrong.

One other question I had is regarding the use of other Galaxy servers- I realize that my credentials for Galaxy Main do not work on any of the other servers, and I wanted to clarify regarding your user policy as to whether it would be permissible to generate credentials on any of the other servers listed here?: https://galaxyproject.org/use/

I think the 250 GB quota for Galaxy Main is extremely generous, and have not had any need for more storage, but was curious if you think running some of these jobs on other servers might move things forward a little faster, given the heavy load currently on Main? Just want to make sure it would not be a violation of the duplicate accounts policy. I know several of the servers on that website have different suites of tools available than Galaxy Main, and since the credentials for one are not useable for all, my interpretation of the policy would be that different servers would not count as duplicate accounts, but I just wanted some clarification on that point- I do not wish to be in violation.

Thank you so much for your consideration!

Sincerely,
–Kenny

Hi @kfelsen1

Your jobs are all queued normally. The outstanding jobs were queued 4 days ago, and some 2 days ago. Most are sets of jobs that were all queued at the same time: the initial jobs have completed, and the downstream jobs will execute as resources become available.

Regarding accounts, the terms are one account per person per public Galaxy server. This means it is fine to have a distinct account at all the other public Galaxy servers. Do note that some of the “domain-specific” Galaxy servers are associated with primary servers and your account/data is the same at all in the group – you’ll be able to tell by the URL. For example: rna.usegalaxy.eu is a domain-specific server associated with usegalaxy.eu, and your account at both would be the same – but distinct from an account at usegalaxy.org.
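The "tell by the URL" rule of thumb can be sketched as a hostname comparison: two servers likely share accounts when one hostname is a subdomain of the other. This is only a heuristic under that assumption, not an official policy check, and the function names `hostname` and `likely_same_account` are hypothetical.

```python
from urllib.parse import urlparse


def hostname(url):
    """Extract a lowercase hostname from a URL or a bare host string."""
    parsed = urlparse(url if "://" in url else f"https://{url}")
    return (parsed.hostname or "").lower()


def likely_same_account(url_a, url_b):
    """Heuristic: same host, or one host is a subdomain of the other."""
    a, b = hostname(url_a), hostname(url_b)
    if not a or not b:
        return False
    return a == b or a.endswith("." + b) or b.endswith("." + a)


print(likely_same_account("rna.usegalaxy.eu", "https://usegalaxy.eu"))
print(likely_same_account("https://usegalaxy.eu", "https://usegalaxy.org"))
```

With the example from the reply above, `rna.usegalaxy.eu` and `usegalaxy.eu` compare as the same account group, while `usegalaxy.eu` and `usegalaxy.org` do not.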

Regarding whether jobs run faster at other public Galaxy servers than at usegalaxy.org: some may run slower and some faster. All are very busy. Most use distinct storage/compute resources. The quota allocation can also vary, but there is often a way to request more space for academic projects. Contact information is usually on the home page of public Galaxy servers and/or in their Galaxy Directory listing https://galaxyproject.org/use/.

Regarding Cloud Galaxy options, once you have your AWS account set up, this is where to “start up” a Galaxy instance: https://launch.usegalaxy.org/. Use the GVL version. How to use it is covered in the GTN tutorials here: https://training.galaxyproject.org/training-material/topics/admin/. Start with the tutorial “Galaxy on the Cloud”.

Thanks!