I manage a Galaxy instance in production, and this error confounds me:
Traceback (most recent call last):
  File "/galaxy/galaxy-base/database/jobs_directory/000/42/metadata/set.py", line 5, in <module>
    from galaxy_ext.metadata.set_metadata import set_metadata; set_metadata()
  File "/galaxy/galaxy-base/lib/galaxy_ext/metadata/set_metadata.py", line 20, in <module>
    from galaxy.metadata.set_metadata import set_metadata
  File "/galaxy/galaxy-base/lib/galaxy/metadata/__init__.py", line 9, in <module>
    import galaxy.model
  File "/galaxy/galaxy-base/lib/galaxy/model/__init__.py", line 51, in <module>
    import sqlalchemy
ModuleNotFoundError: No module named 'sqlalchemy'
But the module is installed and can be loaded via the venv:
$ source .venv/bin/activate
(.venv) $ python3
Python 3.11.7 (main, Oct 9 2024, 00:00:00) [GCC 11.4.1 20231218 (Red Hat 11.4.1-3)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlalchemy
>>> sqlalchemy.__version__
'2.0.30'
>>>
The error I encounter is for jobs submitted via batch. The job itself runs to completion (Slurm exit code 0), but Galaxy has the above error in stderr:
     State ExitCode
---------- --------
 COMPLETED      0:0
 COMPLETED      0:0
 COMPLETED      0:0
The short answer is that the environment the tool is running in cannot “see” sqlalchemy (even if the server environment can). This can be rooted in a Python version problem, or in a “lost environment” due to other reasons.
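One way to check this is to print what the job-side Python actually sees. The following is a minimal sketch, not part of Galaxy: the function name `describe_environment` is made up for illustration, and it assumes you can run the script with the same interpreter and environment that the failing job uses (for example, from a test batch script on a compute node).

```python
import importlib.util
import sys


def describe_environment(module_name="sqlalchemy"):
    """Report which interpreter is running and whether it can find module_name.

    Hypothetical helper for troubleshooting; it is not a Galaxy function.
    """
    try:
        # find_spec returns None when a top-level module cannot be located.
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        spec = None
    return {
        "executable": sys.executable,  # path of the running interpreter
        "prefix": sys.prefix,          # venv directory if a venv is active
        "in_venv": sys.prefix != sys.base_prefix,
        "module": module_name,
        "found": spec is not None,
        "origin": getattr(spec, "origin", None),  # file the module would load from
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running it once in the activated .venv on the server and once inside a batch job, then comparing the "executable" and "found" lines, should show whether the job is using a different Python than the one where sqlalchemy is installed.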
A Galaxy server can have multiple modes or destinations for running jobs. Some jobs might run locally, and some might be dispatched to a cluster in a container environment, or to just another server or cluster. Maybe one destination has a problem, and the others don’t.
Checking which destination the failing jobs use might be a good place to start, and let us know if you need more help. The version of Galaxy you are running and some details about your job configuration would probably help us to help you, too. Are you using dependency resolvers? Are you using containerized job environments?
Meanwhile, I’ll post some of our resources that add more context about how distributed job running works, plus help about containers.
With all top levels of those guides starting from here.
I’m going to cross-post your question over to our Admin chat to get some feedback, too, since this is about as far as I can take you! Maybe someone recognizes the problem and can offer a quicker fix suggestion! They will probably reply here but feel free to join the chat! → You're invited to talk on Matrix