RNASTAR tool - Stuck for days

I’m trying to run RNA-STAR on usegalaxy.org to map reads and get BAM files for DESeq. I imported the data into Galaxy directly from GEO and ran FastQC to make sure the files are OK. Then I ran RNA-STAR. All files were mapped except one. It’s been 10 days now. When I checked the log, it says the job finished:

    Started job on | Aug 28 12:42:03
Started mapping on | Aug 28 12:43:13
       Finished on | Aug 28 12:47:28

But on the main page, it still says running.

Not sure what is happening. Any help is appreciated.

Thanks.

SSL

Hi @SSL

What happens if you completely refresh the view? Do this by clicking on the “Galaxy” HOME icon in the very top left of the masthead bar.

If that is not enough, then you probably still have a job pending completion. Each of the input fastq files (or pairs) gets a separate RNA-STAR job, and each of those produces its own log of statistics. The log in your screenshot is from one of the completed jobs. Since you have 24 jobs, there will eventually be 24 of these logs, one per fastq input or pair of inputs.

To view the status of each job, you can click on the collection folder 532 in the history. This will drill down into the listing of the elements inside that collection (the RNA-STAR jobs in this case, each producing a BAM result).
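If you prefer to check the same thing from a script instead of clicking through, the dataset states are also available through the Galaxy API. Here is a minimal sketch using BioBlend, assuming you have created an API key (User → Preferences → Manage API Key); the URL, key, and history ID below are placeholders:

```python
from collections import Counter

from bioblend.galaxy import GalaxyInstance

# Placeholders: your server URL, your API key, and the history ID
# (visible in the history's URL when you have it open).
gi = GalaxyInstance(url="https://usegalaxy.org", key="YOUR_API_KEY")
history_id = "YOUR_HISTORY_ID"

# Every dataset in the history reports a state: new, queued, running, ok, error, ...
contents = gi.histories.show_history(history_id, contents=True)
datasets = [c for c in contents
            if c["history_content_type"] == "dataset" and not c.get("deleted")]

# Summary of how many datasets are in each state
print(Counter(d["state"] for d in datasets))

# List anything that has not finished yet
for d in datasets:
    if d["state"] not in ("ok", "error"):
        print(d["hid"], d["name"], d["state"])
```

The elements of a collection are ordinary datasets in the history, so they should show up in this count as well.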

Maybe there is something special about that sample? You may be able to see that in the FastQC results, but if the issue is about the scientific content (where the reads are mapping and how they are mapping, rather than the read quality itself), the basic QC stats may not tell you much. Letting the job complete is the only way to get the logs from RNA-STAR itself (multimapping/non-specific reads can lead to extended runtimes, as can other cases, some of which are called out in the statistics in your screenshot).

Since this has been 10 days (of queuing time plus execution time), it might mean there is a problem, but my immediate guess is that these last jobs were simply queued last and are only now running. How “fast” a batch of jobs runs depends on the tool/parameter choices, how many jobs were queued, and how many other jobs you have running at the same time (in any history) during the total time frame, and to a smaller degree, which server you are working at and how busy its clusters happen to be. The clusters in Galaxy work the same as any other cluster – resources are managed to be fair: some of your jobs run, some of other people’s run, then more of yours, and so on until everything is done. Later on, workflows can speed this up, since all jobs in a pipeline are queued at the same time and there are fewer delays waiting for the next tool to be launched.

From here, you are welcome to share the history and we can help to check the timestamps and confirm this is all running as expected. That would also allow us to follow up with the server administrators if needed, though I don’t think there is a problem so far. Meanwhile, I would strongly suggest leaving the work queued and avoiding reruns, since rerunning just puts the restarted jobs back at the end of the cluster queues again!

I’ll watch for your reply! 🙂

Thanks for the reply. So initially, I ran 3 SRA datasets at once. I ran into the 250 GB space limit, so I stopped the other two, moved to the 1 TB temporary storage, and allowed this one to run exclusively for maybe 6 days without any interruption. Out of 24, only 1 has not finished in 10 days. All the others completed; I checked them one by one. Here is a link to my history for you to import.

Just today, I started another study to process BAM files in a separate history. That one was working fine, but just for the sake of solving this issue, I have deleted those and am now only allowing this particular file to run.

Many thanks for your help

Great, thanks for sharing the history, @SSL!

The job is still running and I see some clues in the FastQC reports for that sample.

Notice the rate of unique reads if deduplicated? This sample is at less than 20%, while the other samples are hovering around 60%. This sample also has something unknown flagged as overrepresented, and that is likely behind the reported GC content issue. What that sequence is could be explored, although letting this job run will likely filter it out of the mapping result. Once you have the mapping results, you can explore those and decide if trimming is needed (these reads do have adapter sequence remaining, which mapping tools can sometimes handle, but mapping runs faster and more predictably when it is removed – maybe all samples would benefit, or maybe it doesn’t matter that much).
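If it helps to put those numbers side by side instead of opening each report, the duplication estimate is also in FastQC’s plain-text output (the “RawData” dataset, fastqc_data.txt). A small sketch for comparing samples after downloading those files locally; the folder name and file pattern are placeholders:

```python
from pathlib import Path

def deduplicated_percentage(fastqc_data: Path) -> float:
    """FastQC's estimate of how much of the library would remain after deduplication."""
    for line in fastqc_data.read_text().splitlines():
        if line.startswith("#Total Deduplicated Percentage"):
            return float(line.split("\t")[1])
    raise ValueError(f"No duplication estimate found in {fastqc_data}")

# Placeholder layout: one downloaded fastqc_data.txt per sample in ./fastqc_reports/
for report in sorted(Path("fastqc_reports").glob("*fastqc_data.txt")):
    print(f"{report.name}: {deduplicated_percentage(report):.1f}% unique if deduplicated")
```

Anything sitting far below the rest of the batch, like the under-20% here versus roughly 60% elsewhere, is the same duplication signal flagged in the HTML report.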

I would let this run. Meanwhile, you can explore the QC reports if curious – the Help section on the FastQC tool form has links to our tutorials that explain how to review them, plus links to the original author’s guidance. You may also be interested in this topic → Quality Control Start Here! multQC issue and guidance?. The shared workflow in that topic would be appropriate for your reads from what I can tell: it runs the FastQC steps, applies trimming, then runs FastQC again (to make sure the QC did what you expected!), and puts all of that into a nice visual summary. It may help your later samples process more smoothly.

The alternative is to kill the lingering jobs, then filter the collection to “remove failed jobs”, and drop the outlier sample from your analysis. This is a judgement call and you may be able to rescue the sample with some QC, but not while it is already running in this particular job, and I wouldn’t suggest applying QC “rules” to some samples selectively and not to others, or you may introduce bias. The criteria need to be set, applied to all samples, and then the downstream steps launched.
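If you go that route and want to script the “kill” step, recent BioBlend versions can also list and cancel jobs through the API. A rough sketch with the same placeholder URL, key, and history ID; the cancel call is commented out so nothing is stopped by accident:

```python
from bioblend.galaxy import GalaxyInstance

gi = GalaxyInstance(url="https://usegalaxy.org", key="YOUR_API_KEY")
history_id = "YOUR_HISTORY_ID"

# List jobs from this history that are still running
# (filtering by history_id needs a reasonably recent BioBlend/Galaxy).
for job in gi.jobs.get_jobs(history_id=history_id, state="running"):
    print(job["id"], job["tool_id"], job["update_time"])
    # Uncomment to actually cancel the lingering job:
    # gi.jobs.cancel_job(job["id"])
```

The filter and drop steps are still easiest in the interface, with the collection operations mentioned above.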

Hope this provides some insight into what is going on, and some options! 🙂

Thanks. This was really helpful. I will go through the tutorial to understand what the issue might be. However, there is not much I can do about it, as this is from a publicly available dataset. I think I will wait until the end of today and then go with the ‘kill, filter, remove failed jobs’ strategy.

Many thanks again for your detailed response!
