Cannot add SLURM worker nodes

yl3 · January 7, 2020, 5:42am

I managed to launch GVL with a SLURM cluster, and it seemed to be working, since I was able to run sinfo to list my master node.

However, when I add a worker node through CloudMan, I am receiving the following error messages in the CloudMan console.

05:27:45 - Adding 1 on-demand instance(s)

05:30:05 - Instance 'i-09db541d174b1bed6; 34.228.77.158; w2' reported alive

05:30:30 - ---> PROBLEM, running command '/usr/bin/scontrol reconfigure' returned code '1', the following stderr: 'scontrol: error: slurm_receive_msg: Zero Bytes were transmitted or received slurm_reconfigure error: Zero Bytes were transmitted or received ' and stdout: ''

05:30:30 - Could not get a handle on job manager service to add node 'i-09db541d174b1bed6; 34.228.77.158; w2'

05:30:30 - Waiting on worker instance 'i-09db541d174b1bed6; 34.228.77.158; w2' to configure itself.

05:30:35 - Slurm error: slurmctld not running; setting service state to Error

05:30:41 - Instance 'i-09db541d174b1bed6; 34.228.77.158; w2' ready

Back on the master node, sinfo now correctly shows two nodes.

PARTITION  AVAIL  TIMELIMIT  NODES  STATE  NODELIST
main*      up     infinite   1      drain  master
main*      up     infinite   1      idle   w2
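
For anyone comparing against their own cluster, here is a minimal sketch in Python that reads the sinfo listing above and reports which nodes are in a state that accepts new jobs. The function names `parse_sinfo` and `schedulable` are hypothetical helpers, and treating only `idle` and `mix` as schedulable is my assumption about sinfo's compact state names, not something CloudMan or GVL defines.

```python
# Sample text copied from the sinfo output in this post.
SINFO_OUTPUT = """\
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
main* up infinite 1 drain master
main* up infinite 1 idle w2
"""


def parse_sinfo(text):
    """Turn whitespace-separated sinfo output into one dict per data row.

    Hypothetical helper: assumes the default six-column layout with a
    header row, as shown in the post.
    """
    rows = [line.split() for line in text.strip().splitlines()]
    header = [name.lower() for name in rows[0]]
    return [dict(zip(header, row)) for row in rows[1:]]


def schedulable(rows):
    """Return node lists whose state is 'idle' or 'mix'.

    Assumption: these two compact state names are the ones that can
    still accept new jobs; anything else (e.g. 'drain') is skipped.
    """
    return [row["nodelist"] for row in rows if row["state"] in ("idle", "mix")]


nodes = parse_sinfo(SINFO_OUTPUT)
print(schedulable(nodes))
```

With the listing above this reports only w2, since master is in the drain state. On a live cluster the text could instead come from running sinfo through `subprocess.run(["sinfo"], capture_output=True, text=True)`.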

However, my srun and sbatch jobs are executed on the master node instead of the worker nodes. They run as regular bash tasks, and neither squeue nor smap shows any running tasks.

Does anybody know what is going on and how to fix this?
