🔧 gunicorn.sock not created despite Galaxy running — and dependency_resolvers_conf.xml missing

Hi all,

After a power outage, I had trouble restarting my Galaxy instance with Ansible and ran into two issues:


:puzzle_piece: 1. Gunicorn socket (gunicorn.sock) not created

:page_facing_up: Setup

  • Galaxy is run via galaxyctl
  • The systemd unit galaxy-gunicorn.service calls:
ExecStart=/home/galaxy/galaxy/.venv/bin/galaxyctl --config-file /home/galaxy/galaxy/config/galaxy.yml exec _default_ gunicorn
  • In my galaxy.yml, I’ve defined:
gunicorn:
  bind: unix:/home/galaxy/galaxy/config/gunicorn.sock

:cross_mark: The issue

Despite Galaxy reporting as running (galaxyctl status shows active (running)), the expected socket file is not created:

$ sudo ls -l /home/galaxy/galaxy/config/gunicorn.sock
ls: cannot access: No such file or directory

Running lsof -U shows Gunicorn has active UNIX streams, but no named socket bound on the filesystem:

gunicorn 18361 galaxy 1u unix 0xffff98cc56883300 0t0 type=STREAM
...

:white_check_mark: What I’ve checked

  • File permissions are correct (galaxy can write to config/).
  • Socket path is valid and matches what is defined in galaxy.yml.
  • No obvious errors in journalctl logs or stdout from galaxyctl.
  • I also tried to bring back the missing gunicorn.sock file with:
systemctl daemon-reload
systemctl restart gunicorn.service
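
One way to tell a service that is up but has not bound its socket from one that keeps crashing before it gets that far is to look at the socket path and the unit's restart counter together. A minimal sketch, assuming a POSIX shell; `socket_state` is a hypothetical helper name, and the `NRestarts` property is only exposed by reasonably recent systemd versions:

```shell
# Report what exists at a path: "socket", "not-a-socket" or "missing".
socket_state() {
    if [ -S "$1" ]; then
        echo "socket"
    elif [ -e "$1" ]; then
        echo "not-a-socket"
    else
        echo "missing"
    fi
}

# Example usage (path and unit name as given in this post):
#   socket_state /home/galaxy/galaxy/config/gunicorn.sock
#   systemctl show -p NRestarts galaxy-gunicorn.service
```

A restart counter that keeps climbing while the path stays "missing" would suggest the process exits before binding, in which case the full journal for the unit is the next place to look.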

:puzzle_piece: 2. Missing dependency_resolvers_conf.xml

I’m also seeing this in the logs at startup:

galaxyctl[259655]: galaxy.tool_util.deps DEBUG 025-07-09 12:35:41,943 [pN:main,p:259655,tN:MainThread] Unable to find config file '/home/galaxy/galaxy/config/dependency_resolvers_conf.xml'

That would be fine (the file is optional), but immediately after this, the following error occurs:

galaxy.util.filelock.FileLockException: Timeout occurred.
Exception: Failed to get file lock for /home/galaxy/tool_dependencies/conda

This seems to come from:

with FileLock(..., timeout=300):

So, even though dependency_resolvers_conf.xml is optional, its absence may be causing Galaxy to fall back to conda, which in turn tries to acquire a lock on /home/galaxy/tool_dependencies/conda.lock and fails, either because the file already exists (left behind?) or because another process is stuck.
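
Whether a lock file was left behind can be checked by looking at its age. A minimal sketch, assuming a find that supports -maxdepth and -mmin (GNU and BSD find do); `lock_state` is a hypothetical helper, and the 5-minute default simply mirrors the timeout=300 seen above:

```shell
# Print "absent", "recent" or "older-than-timeout" for a lock file.
# $1 = lock file path, $2 = age threshold in minutes (default 5).
lock_state() {
    lock=$1
    minutes=${2:-5}
    if [ ! -e "$lock" ]; then
        echo "absent"
    elif [ -n "$(find "$lock" -maxdepth 0 -mmin +"$minutes")" ]; then
        echo "older-than-timeout"
    else
        echo "recent"
    fi
}

# Example usage (path as quoted in this post):
#   lock_state /home/galaxy/tool_dependencies/conda.lock
```

An old lock file is only a hint that it is stale; it is worth confirming that no Galaxy process is still running before touching it.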

:light_bulb: Questions

  • Has anyone seen gunicorn.sock not created despite a valid bind path and no startup error?
  • Should we be explicitly creating dependency_resolvers_conf.xml even though it’s optional?
  • Can a failed Conda lock prevent Gunicorn from binding or finalizing startup?

Thanks in advance for any insights you can offer! I’m happy to provide logs, config excerpts, or run debugging commands if helpful.

Best,
— Naïra

Welcome @NairaNaouar

This will need some advanced help. I’ve cross-posted this to the Admin Chat. They usually reply back here, but feel free to join, too! :hammer_and_wrench: You're invited to talk on Matrix

Your Galaxy configs look correct to me; I’d explore the state of your system further.

I don’t think so, but make sure you are not specifying its path in galaxy.yml; if you are, Galaxy will try to find the file.

I doubt it, but being unable to create the socket and to acquire the lock could both point to a system-level problem.

Hi @marten

What should I look into on the system that might block the creation of gunicorn.sock?

I have checked galaxy.yml and, indeed, the dependency_resolvers_conf.xml file is specified there:

sudo grep -B3 -A1 dependency_resolver /home/galaxy/galaxy/config/galaxy.yml 
    data_manager_config_file: /home/galaxy/galaxy/config/data_manager_conf.xml.sample
    database_connection: postgresql:///galaxy?host=/var/run/postgresql
    datatypes_config_file: /home/galaxy/galaxy/config/datatypes_conf.xml.sample
    dependency_resolvers_config_file: /home/galaxy/galaxy/config/dependency_resolvers_conf.xml
    display_servers: hgw1.cse.ucsc.edu,hgw2.cse.ucsc.edu,hgw3.cse.ucsc.edu,hgw4.cse.ucsc.edu,hgw5.cse.ucsc.edu,hgw6.cse.ucsc.edu,hgw7.cse.ucsc.edu,hgw8.cse.ucsc.edu,lowepub.cse.ucsc.edu

How should I change galaxy.yml in this case?

Thanks a lot in advance,
Naïra

Could you post your galaxy.yml config? Namely the gravity section. But please mind any secrets.

As I understand it, the ansible-galaxy role should only set this up if you use this configuration: https://github.com/galaxyproject/ansible-galaxy/blob/main/README.md?plain=1#L225

Here is the gravity section:

gravity:
    galaxy_root: /home/galaxy/galaxy
    galaxy_user: galaxy
    gunicorn:
        bind: unix:/home/galaxy/galaxy/config/gunicorn.sock
        extra_args: --forwarded-allow-ips="*"
        preload: true
        workers: 8
    handlers:
        handler:
            pools:
            - job-handler
            - workflow-scheduler
            processes: 4
    process_manager: systemd
    virtualenv: /home/galaxy/galaxy/.venv

And here is the galaxy section:

galaxy:
    admin_users: ****
    allow_path_paste: true
    allow_user_dataset_purge: true
    allow_user_deletion: true
    allow_user_impersonation: true
    brand: 🧬 Mississippi[2]
    builds_file_path: shared/ucsc/builds.txt
    cleanup_job: onsuccess
    container_resolvers_config_file: ''
    data_dir: /home/galaxy/galaxy/database
    data_manager_config_file: /home/galaxy/galaxy/config/data_manager_conf.xml.sample
    database_connection: postgresql:///galaxy?host=/var/run/postgresql
    datatypes_config_file: /home/galaxy/galaxy/config/datatypes_conf.xml.sample
    dependency_resolvers_config_file: /home/galaxy/galaxy/config/dependency_resolvers_conf.xml
    display_servers: hgw1.cse.ucsc.edu,hgw2.cse.ucsc.edu,hgw3.cse.ucsc.edu,hgw4.cse.ucsc.edu,hgw5.cse.ucsc.edu,hgw6.cse.ucsc.edu,hgw7.cse.ucsc.edu,hgw8.cse.ucsc.edu,lowepub.cse.ucsc.edu
    email_from: ***
    enable_per_request_sql_debugging: true
    enable_quotas: true
    enable_tool_source_display: true
    error_email_to: ***
    expose_dataset_path: true
    expose_potentially_sensitive_job_metrics: true
    expose_user_name: true
    external_service_type_config_file: /home/galaxy/galaxy/config/external_service_types_conf.xml.sample
    file_path: datasets
    ftp_upload_dir: /home/galaxy/galaxy/database/ftp
    ftp_upload_site: ***
    id_secret: ***
    integrated_tool_panel_config: /home/galaxy/galaxy/config/integrated_tool_panel.xml
    interactivetools_enable: true
    job_config_file: /home/galaxy/galaxy/config/job_conf.yml
    job_metrics_config_file: /home/galaxy/galaxy/config/job_metrics_conf.xml
    job_working_directory: /home/galaxy/galaxy/database/jobs
    len_file_path: /home/galaxy/galaxy/config/len
    migrated_tools_config: /home/galaxy/galaxy/config/migrated_tools_conf.xml
    new_user_dataset_access_role_default_private: true
    nginx_x_accel_redirect_base: /_x_accel_redirect
    object_store_store_by: id
    openid_config_file: /home/galaxy/galaxy/config/openid_conf.xml.sample
    outputs_to_working_directory: true
    require_login: true
    shed_data_manager_config_file: /home/galaxy/galaxy/config/shed_data_manager_conf.xml
    shed_tool_config_file: /home/galaxy/galaxy/config/shed_tool_conf.xml
    shed_tool_data_table_config: /home/galaxy/galaxy/config/shed_tool_data_table_conf.xml
    show_welcome_with_login: true
    slow_query_log_threshold: 5
    smtp_password: ***
    smtp_server: smtp.***:587
    smtp_username: ***
    static_enabled: false
    themes_config_file: /home/galaxy/galaxy/config/themes_conf.yml
    tool_config_file: /home/galaxy/galaxy/config/tool_conf.xml.sample
    tool_data_path: /home/galaxy/galaxy/tool-data
    tool_data_table_config_path: /home/galaxy/galaxy/config/tool_data_table_conf.xml
    tool_dependency_dir: /home/galaxy/tool_dependencies
    tool_sheds_config_file: /home/galaxy/galaxy/config/tool_sheds_conf.xml.sample
    ucsc_build_sites: /home/galaxy/galaxy/tool-data/shared/ucsc/ucsc_build_sites.txt.sample
    visualization_plugins_directory: config/plugins/visualizations
    watch_job_rules: auto
    watch_tool_data_dir: true
    watch_tools: 'true'

I would suggest that you set conda_auto_init: false in the galaxy section of galaxy.yml. conda_auto_init would otherwise attempt to install conda to the default location. This only works if a single process is being started, otherwise multiple processes attempt to install conda at the same time, failing to acquire the lock. If you do need conda I would suggest that you install it manually in a location of your choice, then set conda_prefix to that location. If that doesn’t work it would help if you post the startup logs.
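
Before editing, it can help to see whether either option is already set. A minimal sketch, assuming the config file path used elsewhere in this thread; `show_conda_settings` is a hypothetical helper name, and the conda_prefix path in the comment is an illustrative placeholder, not a value from this thread:

```shell
# List any conda_auto_init / conda_prefix lines in a Galaxy config file,
# or report that neither is set (so Galaxy's defaults apply).
show_conda_settings() {
    grep -nE '^[[:space:]]*(conda_auto_init|conda_prefix):' "$1" \
        || echo "neither conda_auto_init nor conda_prefix is set in $1"
}

# Example usage:
#   show_conda_settings /home/galaxy/galaxy/config/galaxy.yml
#
# Lines that could then be added under the "galaxy:" section
# (the prefix below is a hypothetical location for a manual install):
#     conda_auto_init: false
#     conda_prefix: /home/galaxy/conda
```

If galaxy.yml is generated by Ansible, the change probably belongs in the playbook variables rather than in the file itself, otherwise the next run may overwrite it.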

Thanks @mvdbeek
I will check that and send the logs if needed.
I will let you know!

Hi @marten & @mvdbeek

I have hit a new error that might explain why gunicorn.sock is not created.
Checking the journal for galaxy-gunicorn.service shows the following errors:

$ sudo journalctl -u galaxy-gunicorn.service -n 50 --no-pager
[***some skipped lines***]
Jul 15 15:06:38 mississippi-2 galaxyctl[515907]: galaxy.tool_util.deps DEBUG 2025-07-15 15:06:38,034 [pN:main,p:515907,tN:MainThread] Unable to find config file '/home/galaxy/galaxy/config/dependency_resolvers_conf.xml'
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]: Traceback (most recent call last):
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:   File "/home/galaxy/galaxy/lib/galaxy/webapps/galaxy/buildapp.py", line 68, in app_pair
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:     app = galaxy.app.UniverseApplication(global_conf=global_conf, is_webapp=True, **kwargs)
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:   File "/home/galaxy/galaxy/lib/galaxy/app.py", line 750, in __init__
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:     self._configure_toolbox()
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:   File "/home/galaxy/galaxy/lib/galaxy/app.py", line 368, in _configure_toolbox
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:     ToolBoxSearch(self.toolbox, index_dir=self.config.tool_search_index_dir, index_help=index_help),
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:   File "/home/galaxy/galaxy/lib/galaxy/tools/search/__init__.py", line 98, in __init__
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:     panel_searches[panel_view_id] = ToolPanelViewSearch(
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:   File "/home/galaxy/galaxy/lib/galaxy/tools/search/__init__.py", line 199, in __init__
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:     self.index = self._index_setup()
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:   File "/home/galaxy/galaxy/lib/galaxy/tools/search/__init__.py", line 203, in _index_setup
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:     return get_or_create_index(self.index_dir, self.schema)
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:   File "/home/galaxy/galaxy/lib/galaxy/tools/search/__init__.py", line 75, in get_or_create_index
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:     if index.exists_in(index_dir):
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:   File "/home/galaxy/galaxy/.venv/lib/python3.8/site-packages/whoosh/index.py", line 136, in exists_in
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:     ix = open_dir(dirname, indexname=indexname)
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:   File "/home/galaxy/galaxy/.venv/lib/python3.8/site-packages/whoosh/index.py", line 123, in open_dir
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:     return FileIndex(storage, schema=schema, indexname=indexname)
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:   File "/home/galaxy/galaxy/.venv/lib/python3.8/site-packages/whoosh/index.py", line 421, in __init__
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:     TOC.read(self.storage, self.indexname, schema=self._schema)
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:   File "/home/galaxy/galaxy/.venv/lib/python3.8/site-packages/whoosh/index.py", line 632, in read
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:     check_size("int", _INT_SIZE)
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:   File "/home/galaxy/galaxy/.venv/lib/python3.8/site-packages/whoosh/index.py", line 626, in check_size
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:     sz = stream.read_varint()
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:   File "/home/galaxy/galaxy/.venv/lib/python3.8/site-packages/whoosh/filedb/structfile.py", line 191, in read_varint
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:     return read_varint(self.read)
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:   File "/home/galaxy/galaxy/.venv/lib/python3.8/site-packages/whoosh/util/varints.py", line 102, in read_varint
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]:     b = ord(readfn(1))
Jul 15 15:06:40 mississippi-2 galaxyctl[515907]: TypeError: ord() expected a character, but string of length 0 found
Jul 15 15:06:42 mississippi-2 systemd[1]: galaxy-gunicorn.service: Main process exited, code=exited, status=1/FAILURE
Jul 15 15:06:42 mississippi-2 systemd[1]: galaxy-gunicorn.service: Failed with result 'exit-code'.
Jul 15 15:06:42 mississippi-2 systemd[1]: galaxy-gunicorn.service: Consumed 17.837s CPU time.
Jul 15 15:06:42 mississippi-2 systemd[1]: galaxy-gunicorn.service: Scheduled restart job, restart counter is at 14.
Jul 15 15:06:42 mississippi-2 systemd[1]: Stopped Galaxygunicorn.
Jul 15 15:06:42 mississippi-2 systemd[1]: galaxy-gunicorn.service: Consumed 17.837s CPU time.
Jul 15 15:06:42 mississippi-2 systemd[1]: Started Galaxygunicorn.
Jul 15 15:06:42 mississippi-2 galaxyctl[516682]: Working directory: /home/galaxy/galaxy
Jul 15 15:06:42 mississippi-2 galaxyctl[516682]: Executing: PYTHONPATH=lib GALAXY_CONFIG_FILE=/home/galaxy/galaxy/config/galaxy.yml VIRTUAL_ENV=/home/galaxy/galaxy/.venv PATH=/home/galaxy/galaxy/.venv/bin/:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin /home/galaxy/galaxy/.venv/bin/gunicorn 'galaxy.webapps.galaxy.fast_factory:factory()' --timeout 300 --pythonpath lib -k galaxy.webapps.galaxy.workers.Worker -b unix:/home/galaxy/galaxy/config/gunicorn.sock --workers=8 --config python:galaxy.web_stack.gunicorn_config --preload --forwarded-allow-ips="*"

It seems that gunicorn fails and is restarted over and over.
Have you seen this Python error before?

Naïra
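
The final TypeError in the traceback above is raised when whoosh asks for one byte from the index table-of-contents file and gets nothing back, which is consistent with an empty or truncated file (something a power cut during a write could plausibly leave behind). A minimal sketch for spotting such files; `find_empty_files` is a hypothetical helper, and the directory to pass is whatever tool_search_index_dir resolves to on the instance:

```shell
# Print every zero-byte regular file below the given directory.
find_empty_files() {
    find "$1" -type f -size 0
}

# Example usage (replace the argument with the real index directory):
#   find_empty_files /path/to/tool/search/index
```

Any file listed under the index directory would be a strong hint that the index is damaged rather than the Galaxy or gunicorn configuration.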

Dear all,

Hello again.

I finally managed to get my socket created again.
The last error pointed to a problem with the tool search index directory, so I:

  1. Stopped the gunicorn service
sudo systemctl stop galaxy-gunicorn.service
  2. Removed the directory
sudo rm -rf /home/galaxy/galaxy/database/tools_search_index
  3. Restarted Galaxy
sudo systemctl daemon-reload
sudo systemctl start galaxy-gunicorn.service

The socket was created again and, with nginx working, everything is back up and running!
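
A more cautious variant of the removal step is to move the directory aside instead of deleting it, so that it can still be inspected afterwards. A minimal sketch, assuming Galaxy rebuilds the tool search index at startup when the directory is missing (which is what happened here); `set_aside_dir` is a hypothetical helper name:

```shell
# Rename a directory to "<dir>.broken-<timestamp>" and print the new name.
set_aside_dir() {
    dir=${1%/}    # drop a trailing slash, if any
    backup="$dir.broken-$(date +%Y%m%d%H%M%S)"
    mv -- "$dir" "$backup" && echo "$backup"
}

# Example usage, with the service stopped first:
#   sudo systemctl stop galaxy-gunicorn.service
#   set_aside_dir /home/galaxy/galaxy/database/tools_search_index
#   sudo systemctl start galaxy-gunicorn.service
```

Once Galaxy is confirmed to be working again, the renamed directory can be deleted.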

Thanks a lot for the support :slight_smile:

Naïra

Thanks for posting back what worked @NairaNaouar :rocket: