SLURM job resource requests (CPU/memory/time) ignored on Galaxy + Azure CycleCloud setup

Greetings all!

I would really appreciate some help if anyone is able :D.

The setup:
I have set up a Galaxy instance within our Microsoft Azure subscription. Galaxy runs in a container (I wrote the image myself, but based it on the Dockerfile from bgruening/docker-galaxy on GitHub). In addition to that, we set up a CycleCloud instance (see "Overview - Azure CycleCloud" on Microsoft Learn) and, within CycleCloud, a SLURM cluster (see "Overview of Azure CycleCloud Workspace for Slurm" on Microsoft Learn).

I was able to connect Galaxy to this SLURM cluster: jobs launched from Galaxy are automatically submitted to SLURM via sbatch. These jobs run in containers using Apptainer (with images usually pulled from quay.io).

Everything works except that resource requests (CPU, memory, time) defined in the Galaxy job configuration or selected in the tool form are not being honored by SLURM.

My job_conf.yml looks like this:

runners:
  local:
    load: galaxy.jobs.runners.local:LocalJobRunner
    workers: 4
  slurm:
    load: galaxy.jobs.runners.slurm:SlurmJobRunner

handling:
  processes:
    handler0:
    handler1:

execution:
  default: local
  environments:
    local:
      runner: local
      params: {}
    singularity_slurm_hpc:
      runner: slurm
      require_container: true
      params:
        submit_native_specification: >-
          --nodes=1
          --ntasks-per-node=1
          --partition=hpc
          --mem={{ memory | default(15) }}G
          --cpus-per-task={{ processors | default(4) }}
          --time={{ time | default(48) }}:00:00
      resources: all
      use_resource_params: true
      singularity_enabled: true
      singularity_volumes: $defaults,/galaxy
      singularity_run_extra_arguments: '--env APPTAINER_NO_SETGROUPS=1'
      singularity_cleanenv: true
      singularity_sudo: false
      singularity_default_container_id: docker://ubuntu:noble-20250404
      env:
        - name: LC_ALL
          value: C
        - name: APPTAINER_CACHEDIR
          value: /scratch/singularity/containercache
        - name: APPTAINER_TMPDIR
          value: /scratch/singularity/tmpdir
        - name: SINGULARITY_CACHEDIR
          value: /scratch/singularity/containercache
        - name: SINGULARITY_TMPDIR
          value: /scratch/singularity/tmpdir
        - file: /galaxy/.venv/bin/activate

tools:
  - id: minimap2
    destination: singularity_slurm_hpc
    resources: all
  - class: local
    environment: local

resources:
  default: default
  groups:
    default: []
    memoryonly: [memory]
    all: [processors, memory, time]

I also created a job_resource_params_conf.xml:

<parameters>
  <param label="CPUs" name="processors" type="integer" min="1" max="64" value="4" help="Number of CPU cores to allocate (SLURM: --cpus-per-task)" />
  <param label="Memory (GB)" name="memory" type="integer" min="1" max="256" value="15" help="Memory in GB (SLURM: --mem)" />
  <param label="Runtime (hours)" name="time" type="integer" min="1" max="4380" value="48" help="Job time limit in hours (SLURM: --time)" />
</parameters>

And a container_resolvers.yml (although I don’t think this is related to the issue):

- type: explicit_singularity
- type: explicit

The problem:
Despite configuring default and user-selectable resource parameters in job_conf.yml and job_resource_params_conf.xml, SLURM jobs always run with only 2 CPUs and 7.5GB RAM, instead of the requested 4 CPUs and 15GB RAM (or other manual settings).

The node used in the hpc partition has 8 vCPUs and 16GB RAM, so it's not oversubscribed, yet SLURM always seems to allocate half of the available resources (this comes from the configuration CycleCloud writes into slurm.conf).

But when I submit jobs manually using sbatch, from the scheduler node or from the Galaxy container (on a different VM), the job resource requests are honored correctly. So I don’t think the slurm.conf is blocking/overriding the requests.
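
(As a side note: to compare what was requested with what SLURM actually granted for a job submitted through Galaxy, a minimal check along these lines can be used. The job id is illustrative and would come from the Galaxy job's logs, and it assumes the SLURM client tools are available wherever the script runs.)

import subprocess

# Minimal sketch: print a few of the resources SLURM actually granted a job.
# scontrol only knows jobs still tracked by the controller; use sacct for finished jobs.
job_id = "12345"  # illustrative; the real id shows up in the Galaxy job logs

out = subprocess.run(
    ["scontrol", "show", "job", job_id],
    capture_output=True, text=True, check=True,
).stdout

# scontrol prints whitespace-separated key=value tokens; index them by key.
fields = dict(token.split("=", 1) for token in out.split() if "=" in token)
for key in ("Partition", "NumCPUs", "MinMemoryNode", "TimeLimit"):
    print(f"{key} = {fields.get(key)}")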

Question:
How can I get SLURM to actually use the resource requests from Galaxy? Are there Galaxy-side defaults I’m missing? Do I need to configure anything differently in SLURM or CycleCloud? Is there something I did wrong in the job configuration?

Any advice is appreciated!

Hi @JBoom

Let’s ask at the Admin Chat for help with your question. Feel free to join here too! :hammer_and_wrench: You're invited to talk on Matrix

XRef

Let’s start there! :slight_smile:


I’ve made a bit of progress with this issue. I replaced submit_native_specification in my job_conf.yml with the string nativeSpecification. Although the documentation I found mentioned this as a requirement for Sun Grid Engine, it appears to work for SLURM as well.

The next challenge is to actually pass resource values from job_resource_params_conf.xml into job_conf.yml. It seems that the {{ some_value }} or ${some_value} syntax is not being recognised.

Additionally, I had to make some adjustments to my files, as the memory value is interpreted in megabytes rather than gigabytes (a bare --mem value defaults to MB in SLURM).
To that end, here are my updated versions of:

job_conf.yml:

runners:
  local:
    load: galaxy.jobs.runners.local:LocalJobRunner
    workers: 4
  slurm:
    load: galaxy.jobs.runners.slurm:SlurmJobRunner

handling:
  processes:
    handler0:
    handler1:

execution:
  default: local
  environments:
    local:
      runner: local
      params: {}
    singularity_slurm_hpc:
      runner: slurm
      require_container: true
      params:
        nativeSpecification: >-
          --nodes=1
          --partition=hpc
          --mem=${memory_mb}
          --cpus-per-task=${processors}
          --time=${time}:00:00
      resources: all
      use_resource_params: true
      singularity_enabled: true
      singularity_volumes: $defaults,/galaxy
      singularity_run_extra_arguments: '--env APPTAINER_NO_SETGROUPS=1'
      singularity_cleanenv: false
      singularity_sudo: false
      singularity_default_container_id: docker://ubuntu:noble-20250404
      env:
        - name: LC_ALL
          value: C
        - name: APPTAINER_CACHEDIR
          value: /scratch/singularity/containercache
        - name: APPTAINER_TMPDIR
          value: /scratch/singularity/tmpdir
        - name: SINGULARITY_CACHEDIR
          value: /scratch/singularity/containercache
        - name: SINGULARITY_TMPDIR
          value: /scratch/singularity/tmpdir
        - file: /galaxy/.venv/bin/activate

tools:
  - id: minimap2
    destination: singularity_slurm_hpc
    resources: all

resources:
  default: default
  groups:
    default: []
    all: [processors, memory_mb, time]

job_resource_params_conf.xml:

<parameters>
  <param label="CPUs" name="processors" type="integer" min="1" max="96" value="4" help="Number of CPU cores to allocate (SLURM: --cpus-per-task)" />
  <param label="Memory (MB)" name="memory_mb" type="integer" min="1" max="660000" value="15000" help="Memory in MB (SLURM: --mem)" />
  <param label="Runtime (hours)" name="time" type="integer" min="1" max="4380" value="48" help="Job time limit in hours (SLURM: --time)" />
</parameters>

Any guidance on this would be greatly appreciated!

Hi @JBoom

Glad you are making progress!

I’m not sure whether that notation gets passed through or not. I pinged the Admin chat to see if they have some feedback. :hammer_and_wrench:


My first try would be to check whether it works with an XML job_conf.


Thank you for the suggestion, Bernt! I’ve switched my job_conf to XML format, which, interestingly, resolved a completely different problem I’d been having.

That earlier issue was that Galaxy wasn’t reading or honouring the <requirements> section in tool XML files:

<requirements>
  <container type="docker">some-container-address</container>
</requirements>

However, I still can’t seem to pass the resources that users specify in a tool’s run form, or the default values I’ve set in job_resource_params_conf.xml.
I’m beginning to wonder whether I should even be handling that through job_conf.xml at all, or if it ought to be done elsewhere.

<?xml version="1.0"?>

<job_conf>
  <plugins workers="4">
    <plugin id="local" type="runner" load="galaxy.jobs.runners.local:LocalJobRunner"/>
    <plugin id="slurm" type="runner" load="galaxy.jobs.runners.slurm:SlurmJobRunner"/>
  </plugins>

  <handlers>
    <handler id="handler0"/>
    <handler id="handler1"/>
  </handlers>

  <destinations default="local">
    <destination id="local" runner="local"/>
    <destination id="singularity_slurm_hpc" runner="slurm">
      <param id="singularity_enabled">true</param>
      <param id="singularity_volumes">$defaults,/galaxy</param>
      <param id="singularity_run_extra_arguments">--env APPTAINER_NO_SETGROUPS=1</param>
      <param id="singularity_cleanenv">false</param>
      <param id="singularity_sudo">false</param>
      <param id="singularity_default_container_id">docker://ubuntu:noble-20250404</param>
      <param id="require_container">true</param>
      <param id="nativeSpecification">--nodes=1 --partition=hpc --mem={memory_mb} --cpus-per-task={processors} --time={time}:00:00</param>
      <param id="resources">all</param>
      <param id="use_resource_params">true</param>
      <env name="LC_ALL">C</env>
      <env name="APPTAINER_CACHEDIR">/scratch/singularity/containercache</env>
      <env name="APPTAINER_TMPDIR">/scratch/singularity/tmpdir</env>
      <env name="SINGULARITY_CACHEDIR">/scratch/singularity/containercache</env>
      <env name="SINGULARITY_TMPDIR">/scratch/singularity/tmpdir</env>
      <env file="/galaxy/.venv/bin/activate"/>
    </destination>
  </destinations>

  <resources default="default">
    <group id="default"></group>
    <group id="all">processors,memory_mb,time</group>
  </resources>

  <tools>
    <tool id="minimap2" destination="singularity_slurm_hpc" resources="all"/>
    <tool id="flye" destination="singularity_slurm_hpc" resources="all"/>
  </tools>
</job_conf>

Neither ${memory_mb} nor {memory_mb} is accepted; the placeholder reaches the DRMAA runner as a literal string:

Traceback (most recent call last):
  File "/galaxy/lib/galaxy/jobs/runners/drmaa.py", line 188, in queue_job
    external_job_id = self.ds.run_job(**jt)
  File "/galaxy/.venv/lib/python3.10/site-packages/pulsar/managers/util/drmaa/__init>
    return DrmaaSession.session.runJob(template)
  File "/galaxy/.venv/lib/python3.10/site-packages/drmaa/session.py", line 314, in r>
    c(drmaa_run_job, jid, sizeof(jid), jobTemplate)
  File "/galaxy/.venv/lib/python3.10/site-packages/drmaa/helpers.py", line 302, in c
    return f(*(args + (error_buffer, sizeof(error_buffer))))
  File "/galaxy/.venv/lib/python3.10/site-packages/drmaa/errors.py", line 151, in er>
    raise _ERRORS[code - 1](error_string)
drmaa.errors.InvalidArgumentException: code 4: not an number: {memory_mb}

Is this injection of parameters from the job_resource_params_conf.xml into the job_conf.xml supported by default in Galaxy? Or do I need to write some custom code to support this?

Does anyone have experience with this? I would really appreciate the help!

Problem solved! I had simply not read a crucial piece of documentation.
The post that helped me figure it out: https://biostar.galaxyproject.org/p/15058/index.html

I had to configure a dynamic rule for the SLURM destination, which is what allows the parameters from job_resource_params_conf.xml to be picked up and turned into the native specification.

For reference, in case anyone runs into the same issue and comes across this post:

My job_conf.xml:

<?xml version="1.0"?>

<job_conf>
  <plugins workers="4">
    <plugin id="local" type="runner" load="galaxy.jobs.runners.local:LocalJobRunner"/>
    <plugin id="slurm" type="runner" load="galaxy.jobs.runners.slurm:SlurmJobRunner"/>
    <plugin id="dynamic" type="runner">
      <param id="rules_module">galaxy.jobs.rules</param>
    </plugin>
  </plugins>

  <handlers assign_with="db-skip-locked">
    <handler id="handler0"/>
    <handler id="handler1"/>
  </handlers>

  <destinations default="local">
    <destination id="local" runner="local"/>
    <destination id="cyclecloud_slurm" runner="dynamic">
      <param id="type">python</param>
      <param id="function">tool_wrapper</param>
      <env name="LC_ALL">C</env>
      <env name="APPTAINER_CACHEDIR">/scratch/singularity/containercache</env>
      <env name="APPTAINER_TMPDIR">/scratch/singularity/tmpdir</env>
      <env name="SINGULARITY_CACHEDIR">/scratch/singularity/containercache</env>
      <env name="SINGULARITY_TMPDIR">/scratch/singularity/tmpdir</env>
      <env file="/galaxy/.venv/bin/activate"/>
    </destination>
  </destinations>

  <resources default="default">
    <group id="default"></group>
    <group id="all">processors,memory_mb,time</group>
  </resources>

  <tools>
    <tool id="minimap2" destination="cyclecloud_slurm" resources="all"/>
    <tool id="kraken2_inspect" destination="cyclecloud_slurm" resources="all"/>
    <tool id="kraken2_classify" destination="cyclecloud_slurm" resources="all"/>
    <tool id="flye" destination="cyclecloud_slurm" resources="all"/>
    <tool id="extract_kraken_reads" destination="cyclecloud_slurm" resources="all"/>
  </tools>
</job_conf>

My job_resource_params_conf.xml:

<parameters>
  <param label="CPUs" name="processors" type="integer" min="1" max="96" value="4" help="Number of CPU cores to allocate (SLURM: --cpus-per-task)" />
  <param label="Memory (MB)" name="memory_mb" type="integer" min="1" max="660000" value="15000" help="Memory in MB (SLURM: --mem)" />
  <param label="Runtime (hours)" name="time" type="integer" min="1" max="4380" value="48" help="Job time limit in hours (SLURM: --time)" />
</parameters>

And then the Python script that I put in /galaxy/lib/galaxy/jobs/rules/ (I believe the script can have any name; it is the function name that matters):

#!/usr/bin/env python3

# Imports.
import logging
from galaxy.jobs import JobDestination

# Log to galaxy's logger.
log = logging.getLogger(__name__)

# Does a lot more logging when set to true.
verbose = True

def tool_wrapper(app, job, user_email, resource_params, tool_id):
    # Retrieve user specified resources or set default values if user doesn't
    # set anything.
    if tool_id == "kraken2_inspect" or tool_id == "kraken2_classify":
        processors = int(resource_params.get("processors", 16))
        memory_mb = int(resource_params.get("memory_mb", 340000))
    else:
        processors = int(resource_params.get("processors", 4))
        memory_mb = int(resource_params.get("memory_mb", 15000))

    # Set the time limit for the job, either based on user input or the
    # default of 72 hours.
    time_str = f"{int(resource_params.get('time', 72))}:00:00"

    # SLURM nativeSpecification with injection of dynamic resource parameters.
    native_spec = (
        f"--nodes=1 --partition=hpc "
        f"--mem={memory_mb} "
        f"--cpus-per-task={processors} "
        f"--time={time_str}"
    )

    # Standard apptainer slurm parameters.
    params = {
        "singularity_enabled": "true",
        "singularity_volumes": "$defaults,/galaxy",
        "singularity_run_extra_arguments": "--env APPTAINER_NO_SETGROUPS=1",
        "singularity_cleanenv": "false",
        "singularity_sudo": "false",
        "singularity_default_container_id": "docker://ubuntu:noble-20250404",
        "require_container": "true",
        "use_resource_params": "true",
        "nativeSpecification": native_spec,
        "resources": "all",
    }

    if verbose:
        log.info(
            f"Tool: {tool_id}, CPUs: {processors}, "
            f"Mem: {memory_mb}MB, Time: {time_str}"
        )

    # Return JobDestination with the params as dict.
    return JobDestination(id="cyclecloud_slurm", runner="slurm", params=params)
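
(If it helps anyone adapting this: the per-tool defaults could also be kept in a small lookup table instead of the if/else, so adding a tool only means adding an entry. This is just a sketch of a possible variation, reusing the tool ids and values from the rule above.)

# Possible variation: per-tool defaults as data instead of branches.
TOOL_DEFAULTS = {
    "kraken2_inspect": {"processors": 16, "memory_mb": 340000},
    "kraken2_classify": {"processors": 16, "memory_mb": 340000},
}
FALLBACK = {"processors": 4, "memory_mb": 15000}

def defaults_for(tool_id):
    # Tools without a specific entry fall back to the generic values.
    return TOOL_DEFAULTS.get(tool_id, FALLBACK)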

Thank you for your help!
