Galaxy & Kubernetes

Hi,

I’ve deployed the Galaxy3 helm chart on a Kubernetes cluster.

I’m struggling to define local jobs rather than k8s jobs. With the default settings, jobs are submitted to the k8s cluster through TPV. How should I configure things so that certain jobs (e.g. upload) run locally? I’ve tried updating both job_conf.yml and tpv_rules_local.yml, without success…

Thanks in advance!

Welcome, @Koen_Nijbroek

For context, these are the resources we have for running a server.


Then, for your question, I think getting feedback from another administrator will lead to the best solution. I’ve cross-posted your question over to the Admin chat; they may reply here or there, and feel free to join the chat yourself: You're invited to talk on Matrix

Let’s start there, thanks! :hammer_and_wrench:


I might be able to help with the definition via job_conf, but I need more info. How are you trying to configure upload to run locally? Can you post the job_conf that you are trying?

Thanks for the quick response!

This is the current configuration (although I’ve tried a lot of different things), based on galaxy/lib/galaxy/config/sample/job_conf.sample.yml at dev · galaxyproject/galaxy · GitHub. It seems that only the default is being read even though I’m trying to give tool-specific instructions, because if I switch the default back to tpv_dispatcher, pods are submitted. In the current configuration everything is executed locally.

job_conf.yml:

  execution:
    default: local_env
    environments:
      local_env:
        runner: …

(full config: Pastebin.com)

Just to make sure, check that you have this in the Galaxy config:

  • job_config_file: job_conf.yml
  • job_config: null

If you have doubts that the correct config is being read, you could specify the YAML directly in the second variable.
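
In case it helps, here is a minimal sketch of the two alternatives as they would appear in the Galaxy config (how exactly this maps into your Helm values.yaml is an assumption on my side, and local_env / the local runner are placeholder names for illustration):

  # Option A: read the job configuration from a file
  job_config_file: job_conf.yml
  job_config: null

  # Option B: embed the job configuration inline ("the 2nd variable"),
  # which rules out any file-path problems
  job_config_file: null
  job_config:
    runners:
      local:
        load: galaxy.jobs.runners.local:LocalJobRunner
    execution:
      default: local_env
      environments:
        local_env:
          runner: local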

With this:

  - destination: tpv_dispatcher
    id: upload1

you explicitly assign TPV for upload1. Also, there is now a second upload tool/mechanism (__DATA_FETCH__), which you should also handle.
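
For example, if both upload mechanisms should be routed the same way, the tools section could list them both, pointing at whichever environment you actually want (a sketch using the same keys as your snippet; local_env is a placeholder name here):

  tools:
  - destination: local_env
    id: upload1
  - destination: local_env
    id: __DATA_FETCH__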

Hi Bernt,

Currently, in the values.yaml job_config_file is set to:
job_config_file: "/galaxy/server/config/job_conf.yml"

I’m fairly certain that the correct config is being read, as some changes in there do have an effect. The key-value pair job_config: null was not set, but adding it does not make a difference in the current situation. Putting the YAML directly there results in crashes.

I’m aware of the second upload mechanism, thanks for reminding me! Upload1 was simply selected as a test case.

To route jobs to the local dispatcher, you can either do so directly through job_conf by using

  - destination: local
    id: upload1
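
For that first route to work, the local environment referenced by destination: local also has to be defined in the same job_conf.yml. A sketch of the relevant pieces, assuming you keep tpv_dispatcher as the default (names here are placeholders, adjust to your setup):

  runners:
    local:
      load: galaxy.jobs.runners.local:LocalJobRunner
  execution:
    default: tpv_dispatcher   # your existing default and its environment stay as they are
    environments:
      local:
        runner: local
  tools:
  - destination: local
    id: upload1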

Or if you are using tpv_dispatcher as the destination (recommended), you will need to change the tpv_rules_local.yml and add:

  tools:
    upload1:
      scheduling:
        require:
          - local

in your tpv_rules_local.yml. Also make sure that the local destination is defined there (pasting your tpv_rules_local.yml here would also help).
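
As a sketch of what such a destination could look like in tpv_rules_local.yml (the runner name and core limit are assumptions; the runner has to match one defined in your job_conf):

  destinations:
    local:
      runner: local
      max_accepted_cores: 1
      scheduling:
        require:
          - local

With require on both the tool and the destination, only tools tagged local are scheduled there, and those tools cannot end up anywhere else.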

However, I’m curious to know why you want to use the local runner? In general, production setups benefit from avoiding it.


The solution within the tpv_dispatcher works like a charm, many thanks! So the trick is to set the tool-specific destination inside tpv_rules_local rather than in job_conf, which apparently causes some conflicts in my situation; fine with me :). The reason we want to use the local runner is to limit overhead in some specific workflow configurations where batches of primers are being designed: the way we’ve set it up now can cause thousands of pods to be initialized for jobs that only run a few seconds. In those cases we simply choose to run them locally.
