Missing module error at Galaxy instance upon starting job

I have posted the original question here, if you’d like to answer from there too:

biostars

I have an (old) Galaxy installation on a cluster that uses the default Python 2.7. There are two symlinks, python2 and python2.7, that map to a python binary inside /galaxy/.venv/bin, and calling it with --version returns
2.7.14

I recently installed a tool using tool.xml that has the following directive:

<configfiles>
  <inputs name="inputJSON" filename="myConfig.json" data_style="paths"></inputs>
</configfiles>

The above directive tells Galaxy to take the tool form defined in the XML and create a JSON representation of the user's choices, which is then passed to a Perl script.
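To make the hand-off concrete, here is a minimal sketch of the consuming side. It is written in Python purely for illustration (the wrapper in this thread is Perl), and the names load_tool_inputs and describe_inputs are hypothetical helpers, not part of Galaxy. It only assumes that the file named in the directive contains a JSON object.

```python
import json


def load_tool_inputs(path):
    """Read the JSON file that Galaxy writes for an <inputs> configfile."""
    with open(path) as handle:
        return json.load(handle)


def describe_inputs(params):
    """Return one 'key = value' line per top-level parameter, sorted by key."""
    return ["%s = %r" % (key, params[key]) for key in sorted(params)]


# Hypothetical usage inside a wrapper, with the path given on the command line:
#     params = load_tool_inputs(sys.argv[1])
#     print("\n".join(describe_inputs(params)))
```

Taking the path as an argument, rather than assuming the file sits in the current directory, keeps the wrapper independent of where Galaxy happens to write the file.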

When I ran the tool I got the following error:

ImportError: No module named xml.dom.minidom

Since the tool and the wrappers are written in Node and Perl, I am guessing this has something to do with a Galaxy dependency that is required to process the XML directive above.

I checked the default python from $PATH and its version is 2.7.14 as well. When I go to /usr/lib64/python2.7 or /usr/lib64/python, I can see a folder structure xml -> dom -> minidom.py, so this module is already installed.

Next I went to galaxy/.venv, where there is a folder called lib and another called lib64 that points to lib itself. Inside lib there are many Python scripts that are symlinks to files inside /usr/lib64/python2.7/. For example, galaxy/.venv/lib/types.py maps to /usr/lib64/python2.7/types.py.

At this point I have some questions:

  1. How can I solve this dependency issue?
  2. Shall I create a folder symlink called xml inside galaxy/.venv/lib that maps to /usr/lib64/python2.7/xml? Would that be sufficient? (I tried this and it did not help.)
  3. Shouldn’t these dependency symlinks be refreshed/created when galaxy is restarted? Because when I restart galaxy, I still get the same error, meaning galaxy is still not aware that this module exists.

Additional information:

  • lsb_release -a yields SUSE 15.1
  • I also have a local clone of galaxy (on Ubuntu) from the repo that uses python 3.8 and CANNOT reproduce the error above
  • In both galaxy instances (the local one that uses 3.8 and one at SUSE that uses 2.7) there is a file at location galaxy/lib/galaxy/util/__init__.py and here I can see both galaxy instances are trying to import xml.dom.minidom. However, in the older galaxy’s __init__.py there is an additional line at the start of the script: from __future__ import absolute_import. Would this affect the search path or cause the problem in this question?
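On the last point: from __future__ import absolute_import only changes how a bare import inside a package is resolved (top-level modules are used instead of a sibling file of the same name). It does not modify sys.path, and it is the default behaviour in Python 3, so on its own it is unlikely to hide a standard-library module. A more direct check is to ask the interpreter what it actually sees. The sketch below is a hypothetical diagnostic (describe_interpreter is not a Galaxy function) that runs under both Python 2.7 and Python 3.

```python
import os
import sys


def describe_interpreter():
    """Collect the facts that decide where `import xml.dom.minidom` is looked up."""
    import xml.dom.minidom  # raises ImportError right here if the lookup fails

    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "pythonpath": os.environ.get("PYTHONPATH", ""),
        "minidom_file": xml.dom.minidom.__file__,
        "sys_path": list(sys.path),
    }


info = describe_interpreter()
for key in ("executable", "prefix", "pythonpath", "minidom_file"):
    print("%s = %s" % (key, info[key]))
for entry in info["sys_path"]:
    print("sys.path entry: %s" % entry)
```

Running this once from a terminal with galaxy/.venv/bin/python and once from inside a job (for example, temporarily as the tool's command) and comparing the two outputs would show whether the job uses a different interpreter or a different search path.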

Hi @ibowankenobi,

usually, Galaxy instances are configured to run tools in a conda environment or container. Those tools are independent of the Galaxy environment. How old is your Galaxy and do you see in your tool a “” section?

Ciao,
Bjoern

Hello Bjoern,

The galaxy instance I am testing the tool on is at least 5-6 years old, running on python 2.7. I am not sure if conda is used under the hood but it looks like it is either using what is available in $PYTHONPATH or the virtual environment under galaxy/.venv/bin/python.

The thing that confuses me is that when I run galaxy/.venv/bin/python -c "import xml.dom.minidom; print(\"hello\")" from the terminal, it works and exits with 0. So this python can import xml.dom.minidom, but when the tool is run from the Galaxy UI, I get the stack trace below:

Cannot open the input file!
Traceback (most recent call last):
  File "/some/path/to/database/job_working_directory/019/19094/set_metadata_cW4TpH.py", line 1, in <module>
    from galaxy_ext.metadata.set_metadata import set_metadata; set_metadata()
  File "/some/path/to/galaxy/galaxy/lib/galaxy_ext/metadata/set_metadata.py", line 24, in <module>
    import galaxy.model.mapping  # need to load this before we unpickle, in order to setup properties assigned by the mappers
  File "/some/path/to/galaxy/galaxy/lib/galaxy/model/__init__.py", line 44, in <module>
    import galaxy.model.metadata
  File "/some/path/to/galaxy/galaxy/lib/galaxy/model/metadata.py", line 21, in <module>
    from galaxy.util import (in_directory, listify, string_as_bool,
  File "/some/path/to/galaxy/galaxy/lib/galaxy/util/__init__.py", line 25, in <module>
    import xml.dom.minidom
ImportError: No module named xml.dom.minidom

The "Cannot open the input file!" line is from my script; the rest is from Galaxy's side.

Hi Bjoern,

Looking at the python that is used by default, it is at least 5-6 years old. I am not sure if it is using conda under the hood, but I can see that it is using either $PYTHONPATH, the system path, or what is available under /galaxy/.venv/bin/python.

Interestingly, when I run: /some/path/to/galaxy/.venv/bin/python -c "import xml.dom.minidom; print(\"hello\");", it prints and exits with 0. But when the tool is run from the UI I get the below stack trace:

Cannot open the input file!
Traceback (most recent call last):
  File "/some/path/to/data/derived/Galaxy/database/job_working_directory/019/19094/set_metadata_cW4TpH.py", line 1, in <module>
    from galaxy_ext.metadata.set_metadata import set_metadata; set_metadata()
  File "/some/path/to/galaxy/galaxy/lib/galaxy_ext/metadata/set_metadata.py", line 24, in <module>
    import galaxy.model.mapping  # need to load this before we unpickle, in order to setup properties assigned by the mappers
  File "/some/path/to/galaxy/galaxy/lib/galaxy/model/__init__.py", line 44, in <module>
    import galaxy.model.metadata
  File "/some/path/to/galaxy/galaxy/lib/galaxy/model/metadata.py", line 21, in <module>
    from galaxy.util import (in_directory, listify, string_as_bool,
  File "/some/path/to/galaxy/galaxy/lib/galaxy/util/__init__.py", line 25, in <module>
    import xml.dom.minidom

ImportError: No module named xml.dom.minidom

"Cannot open the input file!" is from my script; the rest is from Galaxy's side.

I don’t really know. xml.dom is a standard module, so it should be available. Do you have a manipulated PYTHONPATH somewhere? For example, an xml.py file somewhere that confuses Python?
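The shadowing hypothesis can be checked mechanically. Below is a minimal sketch, assuming nothing beyond the standard library; find_shadowing is a hypothetical helper name. It walks the search path in order and lists every location that could provide a top-level module of the given name, so a stray xml.py would show up ahead of the standard-library xml package.

```python
import os
import sys


def find_shadowing(module_name, search_path=None):
    """List files that could satisfy `import module_name`, in search-path order."""
    hits = []
    for entry in (sys.path if search_path is None else search_path):
        # An empty sys.path entry stands for the current working directory.
        base = os.path.join(entry or os.getcwd(), module_name)
        candidates = (
            base + ".py",
            base + ".pyc",
            os.path.join(base, "__init__.py"),
        )
        for candidate in candidates:
            if os.path.isfile(candidate):
                hits.append(candidate)
    return hits


for hit in find_shadowing("xml"):
    print(hit)
```

If the first line printed is not inside the interpreter's own standard-library directory, that file is the likely culprit. The check only covers plain files on disk; zipped packages and custom import hooks are not considered.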

Thank you so much for your swift response Bjoern, I apologize for not responding earlier.
The issue turned out to be that in Galaxy 18.09 the JSON representation of the UI's XML that Galaxy hands over to the wrapper script is NOT in the working directory but one level up, in the PARENT of the working directory. (I had previously experimented with Galaxy 22, which was more consistent.) I only realized this when I made Galaxy pass the output's $name variable as an argument to the wrapper script and inspected its value.

Because of this, the wrapper script failed to open the JSON, and Galaxy then reported the error from the Python script that called the wrapper. When I saw that error telling me a module was missing, I was wrong-footed. It is solved now. Thank you.
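For anyone hitting the same behaviour, a wrapper can guard against it by looking in both places and failing with an explicit message. This is only a sketch in Python (the wrapper in this thread is Perl), and locate_config is a hypothetical helper; the "working directory, then its parent" order simply mirrors what was observed above on Galaxy 18.09.

```python
import os


def locate_config(filename, start_dir=None):
    """Return the path of `filename` in start_dir or its parent, else raise IOError."""
    start_dir = os.path.abspath(start_dir or os.getcwd())
    candidates = [
        os.path.join(start_dir, filename),
        os.path.join(os.path.dirname(start_dir), filename),
    ]
    for path in candidates:
        if os.path.isfile(path):
            return path
    # Name every location tried, so a missing file is not mistaken for another error.
    raise IOError("config file not found; tried: " + ", ".join(candidates))
```

Passing the file's path to the wrapper as an argument, as was done here for debugging, avoids the guesswork entirely; the fallback above is mainly useful when the same wrapper has to run on several Galaxy versions.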