Galaxy Admin Tutorial: TUSd not installing/working?

Hello everyone,

I am setting up a local Galaxy instance and followed the Galaxy admin tutorial up to step 13 (Pulsar). Since I did not have an SSL certificate at hand, I initially skipped the NGINX setup. Later I realized I needed NGINX both for TUSd (I had followed the TUSd part of the tutorial at first, but removed it again once I realized it would not work without NGINX) and to set up Pulsar.

So I went back and set up NGINX with a certificate from Sectigo (Let’s Encrypt is blocked by IT), which seems to work.

Now I wanted to set up TUSd, following the tutorial, but somehow it does not seem to get installed:

systemctl status galaxy-tusd

Unit galaxy-tusd.service could not be found.

The Ansible playbook runs without errors, but I can no longer upload data. The playbook output does not show tusd being installed; the step appears to be skipped.
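
For what it is worth, this is roughly how I checked whether the binary itself is missing or only the systemd unit (the path is the tusd_path from my gravity config further down; galaxyctl comes with Gravity, so adjust if your layout differs):

which tusd                              # is the binary on the PATH at all?
ls -l /usr/local/sbin/tusd              # tusd_path configured for gravity below
systemctl list-units --all 'galaxy-*'   # galaxy-tusd.service should show up here if Gravity created it
sudo galaxyctl status                   # what Gravity itself thinks it is managing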

I would appreciate any information that could help.

Here is the galaxy.yml:

---
- hosts: dbservers
  become: true
  become_user: root
  pre_tasks:
    - name: Install Dependencies
      package:
        name: 'acl'
  roles:
    - galaxyproject.postgresql
    - role: galaxyproject.postgresql_objects
      become: true
      become_user: postgres

- hosts: galaxyservers
  become: true
  become_user: root
  vars_files:
    - group_vars/secret.yml
  vars:
    sslkeys:
      privatekey.pem: |
        -----BEGIN PRIVATE KEY-----
        -----END PRIVATE KEY-----
      nginx_conf_ssl_certificate: galaxy.pem
      nginx_conf_ssl_certificate_key: privatekey.pem
      nginx_servers:
      - redirect-ssl
      - galaxy
      nginx_conf_http:
        client_max_body_size: 1g
  pre_tasks:
    - name: Install Dependencies
      package:
        name: ['acl', 'bzip2', 'git', 'make', 'tar', 'python3-virtualenv', 'python3-venv', 'python3-setuptools']
    - name: Install RHEL/CentOS/Rocky specific dependencies
      package:
        name: ['tmpwatch']
      when: ansible_os_family == 'RedHat'
    - name: Install Debian/Ubuntu specific dependencies
      package:
        name: ['tmpreaper']
      when: ansible_os_family == 'Debian'
    - name: Clone the libraries training repo
      git:
        repo: 'https://github.com/usegalaxy-eu/libraries-training-repo'
        dest: /libraries/
  roles:
    - galaxyproject.repos
    - galaxyproject.slurm
    - usegalaxy_eu.apptainer
    - galaxyproject.tusd
    - galaxyproject.galaxy
    - role: galaxyproject.miniconda
      become: true
      become_user: "{{ galaxy_user_name }}"
    - galaxyproject.nginx
    - galaxyproject.gxadmin
    - galaxyproject.cvmfs
  post_tasks:
    - name: Setup gxadmin cleanup task
      ansible.builtin.cron:
        name: "Cleanup Old User Data"
        user: galaxy # Run as the Galaxy user
        minute: "0"
        hour: "0"
        job: "SHELL=/bin/bash source {{ galaxy_venv_dir }}/bin/activate &&  GALAXY_LOG_DIR=/tmp/gxadmin/ GALAXY_ROOT={{ galaxy_root }}/server GALAXY_CONFIG_FILE={>
    - name: Install slurm-drmaa
      package:
        name: slurm-drmaa1
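
In case it helps with debugging, this is how one could check that the tusd role's tasks are actually picked up from this playbook at all (plain Ansible commands; I am assuming the roles live in ./roles as in the training):

ansible-playbook galaxy.yml --list-tasks | grep -i -B1 tusd   # the galaxyproject.tusd tasks should be listed
ansible-galaxy role list -p roles/ | grep -i tusd             # and the role itself should be installed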

Here is the requirements.yml:

# Galaxy, Postgres, Nginx
- src: galaxyproject.galaxy
  version: 0.11.1
- src: galaxyproject.nginx
  version: 0.7.1
- src: galaxyproject.postgresql
  version: 1.1.2
- src: galaxyproject.postgresql_objects
  version: 1.2.0
- src: galaxyproject.miniconda
  version: 0.3.1
#- src: usegalaxy_eu.certbot
#  version: 0.1.11
# gxadmin (used in cleanup, and later monitoring.)
- src: galaxyproject.gxadmin
  version: 0.0.12
# TUS (uploads)
- name: galaxyproject.tusd
  version: 0.0.1
# CVMFS Support
- src: galaxyproject.cvmfs
  version: 0.2.21
# Singularity/Apptainer
- src: usegalaxy_eu.apptainer
  version: 0.0.1
# SLURM as our DRM
- src: galaxyproject.repos
  version: 0.0.3
- src: galaxyproject.slurm
  version: 1.0.2
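
After any change to requirements.yml I re-install the pinned roles with the command from the training, run from the playbook directory (add --force if a role version is already installed and should be overwritten):

ansible-galaxy install -p roles -r requirements.yml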

Here is the group_vars/galaxyservers.yml:

# Galaxy
galaxy_create_user: true # False by default, as e.g. you might have a 'galaxy' user provided by LDAP or AD.
galaxy_separate_privileges: true # Best practices for security, configuration is owned by 'root' (or a different user) than the processes
galaxy_manage_paths: true # False by default as your administrator might e.g. have root_squash enabled on NFS. Here we can create the directories so it's fine.
galaxy_manage_cleanup: true
galaxy_layout: root-dir
galaxy_root: /srv/galaxy
galaxy_user: {name: "{{ galaxy_user_name }}", shell: /bin/bash}
galaxy_commit_id: release_24.0
galaxy_force_checkout: true
miniconda_prefix: "{{ galaxy_tool_dependency_dir }}/_conda"
miniconda_version: 23.9
miniconda_channels: ['conda-forge', 'defaults']


# Galaxy Job Configuration
galaxy_job_config:
  runners:
    local_runner:
      load: galaxy.jobs.runners.local:LocalJobRunner
      workers: 4
    slurm:
      load: galaxy.jobs.runners.slurm:SlurmJobRunner
      drmaa_library_path: /usr/lib/slurm-drmaa/lib/libdrmaa.so.1
  handling:
    assign: ['db-skip-locked']
  execution:
    default: slurm
    environments:
      local_env:
        runner: local_runner
        tmp_dir: true
      slurm:
        runner: slurm
        singularity_enabled: true
        env:
        - name: LC_ALL
          value: C
        - name: APPTAINER_CACHEDIR
          value: /tmp/singularity
        - name: APPTAINER_TMPDIR
          value: /tmp
      singularity:
        runner: local_runner
        singularity_enabled: true
        env:
        # Ensuring a consistent collation environment is good for reproducibility.
        - name: LC_ALL
          value: C
        # The cache directory holds the docker containers that get converted
        - name: APPTAINER_CACHEDIR
          value: /tmp/singularity
        # Apptainer uses a temporary directory to build the squashfs filesystem
        - name: APPTAINER_TMPDIR
          value: /tmp

  tools:
    - class: local # these special tools that aren't parameterized for remote execution - expression tools, upload, etc
      environment: local_env


galaxy_config:
  galaxy:
    # Branding
    brand: Galaxy
    # Main Configuration
    admin_users:
    -
    database_connection: "postgresql:///{{ galaxy_db_name }}?host=/var/run/postgresql"
    file_path: /data/datasets
    job_working_directory: /data/jobs
    object_store_store_by: uuid
    id_secret: "{{ vault_id_secret }}"
    job_config: "{{ galaxy_job_config }}" # Use the variable we defined above
    # Authentication
    user_activation_on: true
    track_jobs_in_database: true
    require_login: true
    smtp_server: smtp.de
    #smtp_username: example_username
    #smtp_password: example_passsword
    activation_email: 
    error_email_to: 
    # SQL Performance
    slow_query_log_threshold: 5
    enable_per_request_sql_debugging: true
    # File serving Performance
    nginx_x_accel_redirect_base: /_x_accel_redirect
    # Automation / Ease of Use / User-facing features
    watch_job_rules: 'auto'
    allow_path_paste: true
    enable_quotas: true
    allow_user_deletion: true
    show_welcome_with_login: true
    expose_user_name: true
    expose_dataset_path: true
    expose_potentially_sensitive_job_metrics: true
    # NFS workarounds
    retry_job_output_collection: 3
    # Debugging
    cleanup_job: onsuccess
    allow_user_impersonation: true
    # Tool security
    outputs_to_working_directory: true
    new_user_dataset_access_role_default_private: true # Make datasets private by default
    # TUS
    galaxy_infrastructure_url: "https://{{ inventory_hostname }}"
    tus_upload_store: "{{ galaxy_tus_upload_store }}"
    # CVMFS
    tool_data_table_config_path: /cvmfs/data.galaxyproject.org/byhand/location/tool_data_table_conf.xml,/cvmfs/data.galaxyproject.org/managed/location/tool_data_t>
    # Tool Dependencies
    dependency_resolvers_config_file: "{{ galaxy_config_dir }}/dependency_resolvers_conf.xml"
    container_resolvers_config_file: "{{ galaxy_config_dir }}/container_resolvers_conf.yml"
    # Data Library Directories
    library_import_dir: /libraries/admin
    user_library_import_dir: /libraries/user

  gravity:
    process_manager: systemd
    galaxy_root: "{{ galaxy_root }}/server"
    galaxy_user: "{{ galaxy_user_name }}"
    virtualenv: "{{ galaxy_venv_dir }}"
    gunicorn:
      # listening options
      bind: "unix:{{ galaxy_mutable_config_dir }}/gunicorn.sock"
      # performance options
      workers: 2
      # Other options that will be passed to gunicorn
      # This permits setting of 'secure' headers like REMOTE_USER (and friends)
      # https://docs.gunicorn.org/en/stable/settings.html#forwarded-allow-ips
      extra_args: '--forwarded-allow-ips="*"'
      # This lets Gunicorn start Galaxy completely before forking which is faster.
      # https://docs.gunicorn.org/en/stable/settings.html#preload-app
      preload: true
    celery:
      concurrency: 2
      loglevel: DEBUG
    tusd:
      enable: true
      tusd_path: /usr/local/sbin/tusd
      upload_dir: "{{ galaxy_tus_upload_store }}"
    handlers:
      handler:
        processes: 2
        pools:
          - job-handlers
          - workflow-schedulers

galaxy_config_files_public:
  - src: files/galaxy/welcome.html
    dest: "{{ galaxy_mutable_config_dir }}/welcome.html"

galaxy_config_templates:
  - src: templates/galaxy/config/container_resolvers_conf.yml.j2
    dest: "{{ galaxy_config.galaxy.container_resolvers_config_file }}"
  - src: templates/galaxy/config/dependency_resolvers_conf.xml
    dest: "{{ galaxy_config.galaxy.dependency_resolvers_config_file }}"

galaxy_extra_dirs:
  - /data

galaxy_local_tools:
  - testing.xml

# Certbot
#certbot_auto_renew_hour: "{{ 23 |random(seed=inventory_hostname)  }}"
#certbot_auto_renew_minute: "{{ 59 |random(seed=inventory_hostname)  }}"
#certbot_auth_method: --webroot
#certbot_install_method: virtualenv
#certbot_auto_renew: yes
#certbot_auto_renew_user: root
#certbot_environment: staging
#certbot_well_known_root: /srv/nginx/_well-known_root
#certbot_share_key_users:
#  - www-data
#certbot_post_renewal: |
#    systemctl restart nginx || true
#certbot_domains:
# - "{{ inventory_hostname }}"
#certbot_agree_tos: --agree-tos

# NGINX
nginx_selinux_allow_local_connections: true
nginx_servers:
  - redirect-ssl
nginx_ssl_servers:
  - galaxy
nginx_enable_default_server: false
nginx_conf_http:
  client_max_body_size: 1g
# gzip: "on" # This is enabled by default in Ubuntu, and the duplicate directive will cause a crash.
gzip_proxied: "any"
gzip_static: "on"   # The ngx_http_gzip_static_module module allows sending precompressed files with the ".gz" filename extension instead of regular files.
gzip_vary: "on"
gzip_min_length: 128
gzip_comp_level: 6  # Tradeoff of better compression for slightly more CPU time.
gzip_types: |
    text/plain
    text/css
    text/xml
    text/javascript
    application/javascript
    application/x-javascript
    application/json
    application/xml
    application/xml+rss
    application/xhtml+xml
    application/x-font-ttf
    application/x-font-opentype
    image/png
    image/svg+xml
    image/x-icon
#nginx_ssl_role: usegalaxy_eu.certbot
#nginx_conf_ssl_certificate: /etc/ssl/certs/fullchain.pem
#nginx_conf_ssl_certificate_key: /etc/ssl/user/privkey-www-data.pem
nginx_conf_ssl_certificate: galaxy.pem
nginx_conf_ssl_certificate_key: privatekey.pem

# TUS
galaxy_tusd_port: 1080
galaxy_tus_upload_store: /data/tus

# Slurm
slurm_roles: ['controller', 'exec'] # Which roles should the machine play? exec are execution hosts.
slurm_nodes:
- name: localhost # Name of our host
  CPUs: 7         # Number of cores this host can offer to Slurm; check with `htop`, `nproc`, or similar.
slurm_config:
  SlurmdParameters: config_overrides   # Ignore errors if the host actually has cores != 2
  SelectType: select/cons_res
  SelectTypeParameters: CR_CPU_Memory  # Allocate individual cores/memory instead of entire node
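
For completeness, a quick sanity check of the two paths referenced in the gravity tusd block above (both values come from this file; the upload directory has to be writable by the galaxy user):

ls -l /usr/local/sbin/tusd        # tusd_path
sudo -u galaxy ls -ld /data/tus   # galaxy_tus_upload_store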

And the templates/nginx/galaxy.j2:

upstream galaxy {
        server {{ galaxy_config.gravity.gunicorn.bind }};

        # Or if you serve galaxy at a path like http(s)://fqdn/galaxy
        # Remember to set galaxy_url_prefix in the galaxy.yml file.
        # server {{ galaxy_config.gravity.gunicorn.bind }}:/galaxy;
}

server {
        # Listen on port 443 
        listen        *:443 ssl default_server;
        # The virtualhost is our domain name
        server_name   "{{ inventory_hostname }}";

        # Our log files will go to journalctl
        access_log  syslog:server=unix:/dev/log;
        error_log   syslog:server=unix:/dev/log;

        # The most important location block, by default all requests are sent to gunicorn
        # If you serve galaxy at a path like /galaxy, change that below (and all other locations!)
        location / {
                # This is the backend to send the requests to.
                proxy_pass http://galaxy;

                proxy_set_header Host $http_host;
                proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
                proxy_set_header X-Forwarded-Proto $scheme;
                proxy_set_header Upgrade $http_upgrade;
        }

        location /api/upload/resumable_upload {
                # Disable request and response buffering
                proxy_request_buffering     off;
                proxy_buffering             off;
                proxy_http_version          1.1;

                # Add X-Forwarded-* headers
                proxy_set_header X-Forwarded-Host   $host;
                proxy_set_header X-Forwarded-Proto  $scheme;

                proxy_set_header Upgrade            $http_upgrade;
                proxy_set_header Connection         "upgrade";
                client_max_body_size        0;
                proxy_pass http://localhost:{{ galaxy_tusd_port }}/files;
        }
                        
        # Static files can be more efficiently served by Nginx. Why send the
        # request to Gunicorn which should be spending its time doing more useful
        # things like serving Galaxy!
        location /static {
                alias {{ galaxy_server_dir }}/static;
                expires 24h;
        }

        # In Galaxy instances started with run.sh, many config files are
        # automatically copied around. The welcome page is one of them. In
        # production, this step is skipped, so we will manually alias that.
        location /static/welcome.html {
                alias {{ galaxy_mutable_config_dir }}/welcome.html;
                expires 24h;
        }

        # serve visualization and interactive environment plugin static content
        location ~ ^/plugins/(?<plug_type>[^/]+?)/((?<vis_d>[^/_]*)_?)?(?<vis_name>[^/]*?)/static/(?<static_file>.*?)$ {
                alias {{ galaxy_server_dir }}/config/plugins/$plug_type/;
                try_files $vis_d/${vis_d}_${vis_name}/static/$static_file
                          $vis_d/static/$static_file =404;
        }

        location /robots.txt {
                alias {{ galaxy_server_dir }}/static/robots.txt;
        }

        location /favicon.ico {
                alias {{ galaxy_server_dir }}/static/favicon.ico;
        }

        location /_x_accel_redirect {
                internal;
                alias /;
        }

        # Support click-to-run in the GTN-in-Galaxy Webhook
        location /training-material/ {
                proxy_pass https://training.galaxyproject.org/training-material/;
        }
}
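
To see whether the /api/upload/resumable_upload location actually reaches a running tusd, I poked at it with curl (YOUR_GALAXY_FQDN is a placeholder; I would not read too much into the exact status codes, which depend on the tusd version, I only checked that something answers):

curl -kI https://YOUR_GALAXY_FQDN/api/upload/resumable_upload   # through nginx (-k only if your CA chain is not trusted locally)
curl -I http://localhost:1080/files/                            # directly against tusd on galaxy_tusd_port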

And the templates/nginx/redirect-ssl.j2:

server {
        listen 80 default_server;
        listen [::]:80 default_server;

        server_name "{{ inventory_hostname }}";

        location / {
                return 302 https://$host$request_uri;
        }
}
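
The redirect itself is easy to verify with a HEAD request (again, YOUR_GALAXY_FQDN is a placeholder); based on the return 302 above it should answer with a 302 and an https:// Location header:

curl -sI http://YOUR_GALAXY_FQDN/ | grep -iE '^(HTTP|Location)'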

Any hint would be appreciated!

Best regards,
Patrick

As additional information: if I run tusd manually on the command line (as the galaxy user), uploads through the Galaxy interface work:

sudo su - galaxy
tusd -upload-dir=/data/tus -port 1080

However, there are no files in /data/tus, yet the dataset is accessible through the Galaxy interface.
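
If I understand the upload flow correctly, tusd only keeps the chunks in /data/tus while an upload is in progress; once Galaxy has ingested the file, the data ends up under file_path (here /data/datasets), which would explain why /data/tus looks empty afterwards. This is how I looked at where the data actually landed:

sudo -u galaxy ls -la /data/tus        # transient tus upload store, usually empty between uploads
sudo -u galaxy ls -la /data/datasets   # object store root (file_path), where finished datasets end up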

Somehow the systemd-managed service does not get set up:

systemctl status galaxy-tusd
Unit galaxy-tusd.service could not be found.

Do I have to set it up manually? I would have expected the Ansible role to take care of this, right?

Cheers!

Running

sudo galaxyctl restart

solved the issue … :melting_face:
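
For anyone who lands here later: as far as I understand it, the galaxyproject.tusd role only installs the tusd binary, while the galaxy-tusd.service unit is generated by Gravity from the tusd: block in the gravity section of my group_vars. Gravity (re)writes its systemd units when galaxyctl update runs, and restarting apparently triggers the same thing, which is presumably why the restart fixed it:

sudo galaxyctl update              # regenerate the systemd units Gravity manages
sudo systemctl status galaxy-tusd  # the unit should now exist; galaxyctl start/restart brings it up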
