When trying to import the files I got an error, so I whittled the job down to the most basic files for my analysis. The error remains the same:
This job was terminated because it used more memory than it was allocated. Please click the bug icon to report this problem if you need help.
Is there a way to increase the allocation? I went back and double-checked that all the files were computed correctly according to the manual, but I am not sure where I could be going wrong.
Job details:
History Content API ID: f9cad7b01a472135c87438db2382b700
History API ID: bbd44e69cb8906b5219f1f0101eac96e
UUID: 3428afbe-b503-40ba-a28b-adcf46d5e3ca
Job API ID: bbd44e69cb8906b5e46592b9eaf3bb84
CPU usage time: 7 minutes
CPU user time: 7 minutes
CPU system time: 16.9519500 seconds
Number of processes belonging to this cgroup killed by any kind of OOM killer: 1
Max memory usage recorded: 5.7 GB
Container ID: /cvmfs/singularity.galaxyproject.org/all/mulled-v2-74fb92a516e1351a0d862b5a82851a388617fca6:ece820afc8d64dd226ddf93e2f56fe6b20c0cf0f-0
Container Type: singularity
Cores Allocated: 1
Memory Allocated (MB): 5837
Job Start Time: 2025-04-21 08:47:51
Job End Time: 2025-04-21 08:56:27
Job Runtime (Wall Clock): 8 minutes
Processor Count: 23
Total System Memory: 86.6 GB
Total System Swap: 0 bytes
Operating System (uname): Linux galaxy-main-set02-5 5.14.0-362.24.1.el9_3.0.1.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Apr 4 22:31:43 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Jobs that fail this way might actually be too large to process, but much more commonly the issue is with the input data or sometimes a parameter. The runtime of less than 8 minutes is a strong indicator that your error is about the input, possibly a format problem.
Reference data is the most difficult to get synced up. We have a guide here about some of the basics to watch out for. It is written with human examples, but the same advice applies to any assembly. → Reference genomes at public Galaxy servers: GRCh38/hg38 example
For your use case here:
I’m most curious about which parts of the data are not loading. This tutorial underwent some large changes recently. I am also going to try to go through it, but knowing which part you had trouble with would help.
You can share your history back here. That way we can get the tutorial authors involved in any corrections (and workarounds!) needed.
If you are using the tutorial workflow, that would also be interesting to know. If you made any changes, you can generate a share link for your modified version and post it back too, and then we can work with both together.
From a very quick look, you might need to remove the header line from the GTF in dataset 2. The other steps look good. I am testing with just that change, but it will take some time to run.
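If you want to check for or strip that header line locally before re-uploading, something like the minimal sketch below would work. The filenames and the assumption that the extra header is a comment line starting with "#" are mine; adjust them to match your actual GTF. The same cleanup can also be done inside Galaxy with the text manipulation tools if you prefer not to re-upload.

```python
# Minimal sketch: strip comment/header lines (starting with "#") from a GTF
# before uploading it to Galaxy. Filenames below are placeholders.
input_path = "annotation_with_header.gtf"   # hypothetical input file
output_path = "annotation_clean.gtf"        # hypothetical cleaned output

with open(input_path) as src, open(output_path, "w") as dst:
    for line in src:
        # GTF header/comment lines start with "#"; keep only the feature lines
        if not line.startswith("#"):
            dst.write(line)
```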
That guide above includes help for “formatting” base reference data.
Your job is also large, but let’s see what happens! I’m not going to share my test history back here since you haven’t shared yours first, so I will get back to you tomorrow. You could also test this change.
In both the old and the new history, I retried the import step with only 3 files for each group and it worked! What would you suggest I do so that I can scale this up to include at least 25 files?
Great, thanks for testing that out. I was able to get your job to run earlier today with just a few samples too, and started up the process to see if we can increase the runtime memory allocated at UseGalaxy.org. We might have feedback by tomorrow. I’ll update here.
The other option is to try at the UseGalaxy.eu server. They are currently allocating a bit more memory on a different cluster type.
The European server was able to run the import successfully. However, I realised I have an error in my pipeline prior to Galaxy, so I will address this first. Thanks for your help so far!
I am still planning on going through the updated tutorial. If I find anything odd with it, I’ll link this ticket back to the GTN to alert you and anyone else using it about anything that might impact general usage.