When attempting to start Galaxy, the following error occurs:
libjemalloc.so.2: cannot allocate memory in static TLS block
This error causes Galaxy’s job handler processes to exit at startup. It appears to be related to thread-local storage (TLS) allocation when libjemalloc.so.2 is loaded at runtime via dlopen, which here happens while the drmaa Python module is being imported.
Steps to Reproduce:
Set up a Galaxy instance with the default configuration.
Ensure that jemalloc is either explicitly loaded via LD_PRELOAD or is part of the system’s default memory allocator.
Start the Galaxy server.
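To tell which of the two cases in step 2 applies on a given machine, a check along these lines may help. It is only a sketch: the helper names `preload_entries`, `jemalloc_preloaded` and `jemalloc_mapped` are made up for illustration, and it assumes a Linux system where `/proc/<pid>/maps` is readable.

```python
import os
import re


def preload_entries(env=None):
    """Return the libraries listed in LD_PRELOAD (split on spaces or colons)."""
    env = os.environ if env is None else env
    value = env.get("LD_PRELOAD", "")
    return [item for item in re.split(r"[ :]+", value) if item]


def jemalloc_preloaded(env=None):
    """True if any LD_PRELOAD entry looks like a jemalloc shared object."""
    return any("jemalloc" in os.path.basename(p) for p in preload_entries(env))


def jemalloc_mapped(pid="self"):
    """True if a jemalloc library is mapped into the given process (Linux only)."""
    try:
        with open(f"/proc/{pid}/maps") as maps:
            return any("jemalloc" in line for line in maps)
    except OSError:
        return False  # no /proc, or not permitted to read it


if __name__ == "__main__":
    print("LD_PRELOAD entries:", preload_entries())
    print("jemalloc in LD_PRELOAD:", jemalloc_preloaded())
    print("jemalloc mapped in this process:", jemalloc_mapped())
```

Running it in the same environment that starts the Galaxy handlers shows whether jemalloc is already present before the drmaa import; passing a handler's PID to `jemalloc_mapped` checks a running process instead.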
Expected Behavior:
The Galaxy server should start without errors, and the job handler should function correctly.
Actual Behavior:
The server itself starts, but the job handler processes exit with the error above. Any new job I create stays in the waiting state and never runs.
System Information:
Galaxy Version: 24.1
Operating System: Ubuntu 24.04.1 LTS
Python Version: 3.12
jemalloc Version: 5.3.0-2build1
Kernel Version: 6.8.0-48-generic
Output of ulimit -a:
real-time non-blocking time (microseconds, -R) unlimited
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 1030836
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 1030836
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Logs:
The relevant portion of the Galaxy log:
Traceback (most recent call last):
  File "/mnt/md0/apps/galaxy/./lib/galaxy/main.py", line 266, in <module>
    main()
  File "/mnt/md0/apps/galaxy/./lib/galaxy/main.py", line 262, in main
    func(args, log)
  File "/mnt/md0/apps/galaxy/./lib/galaxy/main.py", line 119, in app_loop
    galaxy_app = load_galaxy_app(
  File "/mnt/md0/apps/galaxy/./lib/galaxy/main.py", line 95, in load_galaxy_app
    app = UniverseApplication(global_conf=config_builder.global_conf(), attach_to_pools=attach_to_pools, **kwds)
  File "/mnt/md0/apps/galaxy/lib/galaxy/app.py", line 826, in __init__
    self.application_stack.register_postfork_function(self.job_manager.start)
  File "/mnt/md0/apps/galaxy/lib/galaxy/web_stack/__init__.py", line 48, in register_postfork_function
    f(*args, **kwargs)
  File "/mnt/md0/apps/galaxy/lib/galaxy/jobs/manager.py", line 41, in start
    self.job_handler = handler.JobHandler(self.app)
  File "/mnt/md0/apps/galaxy/lib/galaxy/jobs/handler.py", line 94, in __init__
    self.dispatcher = DefaultJobDispatcher(app)
  File "/mnt/md0/apps/galaxy/lib/galaxy/jobs/handler.py", line 1191, in __init__
    self.job_runners = self.app.job_config.get_job_runner_plugins(self.app.config.server_name)
  File "/mnt/md0/apps/galaxy/lib/galaxy/jobs/__init__.py", line 898, in get_job_runner_plugins
    rval[id] = runner_class(
  File "/mnt/md0/apps/galaxy/lib/galaxy/jobs/runners/drmaa.py", line 70, in __init__
    drmaa = __import__("drmaa")
  File "/mnt/md0/apps/galaxy/.venv/lib/python3.12/site-packages/drmaa/__init__.py", line 65, in <module>
    from .session import JobInfo, JobTemplate, Session
  File "/mnt/md0/apps/galaxy/.venv/lib/python3.12/site-packages/drmaa/session.py", line 39, in <module>
    from drmaa.helpers import (adapt_rusage, Attribute, attribute_names_iterator,
  File "/mnt/md0/apps/galaxy/.venv/lib/python3.12/site-packages/drmaa/helpers.py", line 36, in <module>
    from drmaa.wrappers import (drmaa_attr_names_t, drmaa_attr_values_t,
  File "/mnt/md0/apps/galaxy/.venv/lib/python3.12/site-packages/drmaa/wrappers.py", line 56, in <module>
    _lib = CDLL(libpath, mode=RTLD_GLOBAL)
  File "/mnt/md0/apps/galaxy/.venv/lib/python3.12/site-packages/pylibmagic/__init__.py", line 92, in __magic_init__
    self.__init_orig__(
  File "/usr/lib/python3.12/ctypes/__init__.py", line 379, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /lib/x86_64-linux-gnu/libjemalloc.so.2: cannot allocate memory in static TLS block
2024-11-12 14:16:40,990 WARN exited: sge_handler (exit status 1; not expected)
2024-11-12 14:16:41,442 INFO gave up: sge_handler entered FATAL state, too many start retries too quickly
2024-11-12 14:16:41,442 WARN exited: special_handler1 (exit status 1; not expected)
2024-11-12 14:16:41,554 INFO gave up: special_handler1 entered FATAL state, too many start retries too quickly
2024-11-12 14:16:41,555 WARN exited: handler1 (exit status 1; not expected)
2024-11-12 14:16:41,604 INFO gave up: handler1 entered FATAL state, too many start retries too quickly
2024-11-12 14:16:41,604 WARN exited: handler0 (exit status 1; not expected)
2024-11-12 14:16:41,741 INFO gave up: handler0 entered FATAL state, too many start retries too quickly
2024-11-12 14:16:41,741 WARN exited: special_handler0 (exit status 1; not expected)
2024-11-12 14:16:42,742 INFO gave up: special_handler0 entered FATAL state, too many start retries too quickly
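To check whether the failure depends on Galaxy at all, the failing call from the traceback (`CDLL(libpath, mode=RTLD_GLOBAL)` in `drmaa/wrappers.py`) can be tried in a bare interpreter. This is a sketch rather than a confirmed reproducer: `try_dlopen` and `is_static_tls_error` are hypothetical helpers, and the default path is simply the library named in the error message. The DRMAA library that `drmaa` actually loads on your system would be the more faithful input, and loading jemalloc on its own may well succeed.

```python
import ctypes
import sys


def try_dlopen(path, mode=ctypes.RTLD_GLOBAL):
    """Try to load a shared library; return (ok, error_message)."""
    try:
        ctypes.CDLL(path, mode=mode)
    except OSError as exc:
        return False, str(exc)
    return True, ""


def is_static_tls_error(message):
    """True if the message matches the error reported in this issue."""
    return "cannot allocate memory in static TLS block" in message


if __name__ == "__main__":
    # Default to the library named in the OSError; pass another path to override.
    default = "/lib/x86_64-linux-gnu/libjemalloc.so.2"
    lib = sys.argv[1] if len(sys.argv) > 1 else default
    ok, err = try_dlopen(lib)
    if ok:
        print(f"loaded {lib} without error")
    elif is_static_tls_error(err):
        print(f"reproduced static TLS failure: {err}")
    else:
        print(f"failed for another reason: {err}")
```

If the script reproduces the error with Galaxy's virtualenv Python but not with jemalloc listed in LD_PRELOAD, that would point at the late dlopen rather than at Galaxy's own code.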