How to set a path directory to another disk volume?

Hi, everyone
I am familiar with Galaxy, but I have only now started setting it up on our local computer (no server, just a well-built computer running UNIX). I have managed to install it, set up admin accounts, download reference files (i.e. reference genomes), and install new packages.

My problem now is: how can I set the directories used to store data (i.e. reference files, output files, and temp job files, if possible)? Moreover, how can I point them to a location on another disk volume, different from the one Galaxy was installed on?

I have tried troubleshooting with the help of this official link and this post, but I cannot make it work.

For one thing, I believe the galaxy.ini file has been replaced by the galaxy.yml file. Is that correct?
If so, I found some lines similar to those referenced in the official link, and made the following changes.
On line 178, it now reads:

# on any cluster nodes that will run Galaxy jobs, unless using Pulsar.
file_path: /media/mol/hdmol/galaxy/database/files

# Where temporary files are stored. It must accessible at the same
# path on any cluster nodes that will run Galaxy jobs, unless using
# Pulsar.
new_file_path: /media/mol/hdmol/galaxy/database/tmp

Note: /media/mol/hdmol is the path to the other disk volume, and all of Galaxy's sub-directories were copied to that location. Hence, I am using the exact same sub-directories and changing only the volume (to /media/mol/hdmol/).
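
For a setup like this, one quick thing to rule out is that the target directories are missing or not writable by the account that runs Galaxy. The following is only a minimal sketch using the Python standard library, not a Galaxy tool; the helper name `check_dirs` is hypothetical, and the paths are the ones from the galaxy.yml excerpt above.

```python
import os


def check_dirs(paths):
    """Return {path: problem} for each directory that is missing or not writable."""
    problems = {}
    for path in paths:
        if not os.path.isdir(path):
            problems[path] = "missing or not a directory"
        elif not os.access(path, os.W_OK | os.X_OK):
            # W_OK: can create files here; X_OK: can enter the directory.
            problems[path] = "not writable by the current user"
    return problems


if __name__ == "__main__":
    # Paths taken from the galaxy.yml excerpt above.
    targets = [
        "/media/mol/hdmol/galaxy/database/files",
        "/media/mol/hdmol/galaxy/database/tmp",
    ]
    for path, problem in check_dirs(targets).items():
        print(f"{path}: {problem}")
```

Run it as the same user that starts Galaxy. If nothing is printed, both directories exist and are writable for that user, so the cause is probably elsewhere.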

Any help is very much appreciated.
Thank you!

Galaxy uses a filesystem-independent ‘Object Store’ to store files. Here is a very good tutorial on how to configure it so you can store files on different volumes.

As for the reference data, here is another tutorial on how to attach our shared reference data repository to your Galaxy instance, so users can use the hundreds of references we have prepared.

This is correct: galaxy.ini has been replaced by galaxy.yml.

Please ask if you have more questions.

Thanks a lot, @marten!
And thank you for the references! I have read them through, but one thing puzzled me. Right at the beginning, the tutorial says:

This tutorial assumes you have done the “Ansible for installing Galaxy”

I installed Galaxy on our machine by following the directions from the Use Galaxy webpage rather than Ansible (to be honest, I am not familiar with Ansible at all). I believe those directions are based on the git repository.

So, either I could not find object_store_config_file and galaxy_config_files, or they do not exist. Moreover, the lines described for the config/object_store_conf.xml file are also absent from the file I have here (even though it has the exact same name).

In any case, I changed all the path lines in object_store_conf.xml to the directory I was aiming for (that is, /media/mol/hdmol/galaxy/), and it still didn’t work. The sudo sh run.sh process is killed before the program initializes.

Any other suggestions? I am very lost.
And thanks again for the help!

The document you linked is the Galaxy startup 101, i.e. starting Galaxy for the first time for exploration/development.

For anything more serious, with actual users and resources, we strongly recommend (and build our tutorials on) Ansible, which is software for configuring and deploying applications. That is the best practice for running an actual production Galaxy.

There are trainings that teach you how to use Ansible in the admin section of the training materials: https://training.galaxyproject.org/training-material/topics/admin/


Now to get your thing working:

A sample object store configuration is distributed with Galaxy at https://github.com/galaxyproject/galaxy/blob/dev/lib/galaxy/config/sample/object_store_conf.xml.sample

Would you mind sharing your object_store_conf.xml so I can have a look?

Thanks again, @marten!
I read the documentation and those 10 tips before installing it, and thought that just installing a local instance, as explained in the tutorial, would be enough. OK, so if you allow me a small detour before moving on with the original topic: my need for now is to install Galaxy and some other tools on a local computer (good enough to run genomics data on a small scale), so that just a couple of users can use it.
That being said, would you still think it is a good idea to move on to installing a production Galaxy?

Now, back to the subject, here is what the object_store_conf.xml file looks like after my edits:

<object_store type="hierarchical">
    <backends>
        <object_store type="distributed" id="primary" order="0">
            <backends>
                <backend id="files1" type="disk" weight="1">
                    <files_dir path="/media/mol/hdmol/galaxy/database/files1"/>
                    <extra_dir type="temp" path="/media/mol/hdmol/galaxy/database/tmp1"/>
                    <extra_dir type="job_work" path="/media/mol/hdmol/galaxy/database/job_working_directory1"/>
                </backend>
                <backend id="files2" type="disk" weight="1">
                    <files_dir path="/media/mol/hdmol/galaxy/database/files2"/>
                    <extra_dir type="temp" path="/media/mol/hdmol/galaxy/database/tmp2"/>
                    <extra_dir type="job_work" path="/media/mol/hdmol/galaxy/database/job_working_directory2"/>
                </backend>
            </backends>
        </object_store>
        <object_store type="disk" id="secondary" order="1">
            <files_dir path="/media/mol/hdmol/galaxy/database/files3"/>
            <extra_dir type="temp" path="/media/mol/hdmol/galaxy/database/tmp3"/>
            <extra_dir type="job_work" path="/media/mol/hdmol/galaxy/database/job_working_directory3"/>
        </object_store>

Note: as mentioned in the original post, I copied all of Galaxy's directories and sub-directories to the volume where I would like to keep the data. Hence, I just added /media/mol/hdmol/galaxy/ to the paths.

@gbbio No worries. We made that mindset switch recently and not all of our website reflects it. That said, what you linked is code documentation covering the low-level details, and I believe it is correct.

Still, the best practice for a production instance is to use Ansible to configure these details and Galaxy at large. We provide so-called ‘Ansible playbooks’ that take care of many of these details for you.

First, that is not valid XML (it is missing two closing tags).
Second, I think you do not want to use a hierarchical or distributed object store, so something like this should suffice:

<?xml version="1.0"?>
<object_store type="disk">
    <files_dir path="/media/mol/hdmol/galaxy/database/files1"/>
    <extra_dir type="temp" path="/media/mol/hdmol/galaxy/database/tmp1"/>
    <extra_dir type="job_work" path="/media/mol/hdmol/galaxy/database/job_working_directory1"/>
</object_store>

Third, make sure you tell Galaxy about this store configuration by putting it in galaxy.yml:

object_store_config_file: config/object_store_conf.xml
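
To catch the first problem (XML that is not well-formed) before starting Galaxy, you can run the file through any XML parser. Below is a minimal sketch using Python's standard library; it is not part of Galaxy, and `object_store_paths` is a hypothetical helper name. It only checks that the XML parses and lists the configured paths; it does not validate the file against what Galaxy itself expects.

```python
import xml.etree.ElementTree as ET


def object_store_paths(xml_text):
    """Parse an object store config and return every configured directory path.

    Raises xml.etree.ElementTree.ParseError if the XML is not well-formed,
    for example when closing tags are missing.
    """
    root = ET.fromstring(xml_text)
    # files_dir and extra_dir are the element names used in the config above.
    return [
        element.get("path")
        for element in root.iter()
        if element.tag in ("files_dir", "extra_dir")
    ]


if __name__ == "__main__":
    config = """<?xml version="1.0"?>
<object_store type="disk">
    <files_dir path="/media/mol/hdmol/galaxy/database/files1"/>
    <extra_dir type="temp" path="/media/mol/hdmol/galaxy/database/tmp1"/>
    <extra_dir type="job_work" path="/media/mol/hdmol/galaxy/database/job_working_directory1"/>
</object_store>
"""
    for path in object_store_paths(config):
        print(path)
```

In practice you would read the text from your own config/object_store_conf.xml instead of the inline string. A file with unclosed tags fails with a ParseError rather than printing any paths.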

I already removed my reply =) I got confused by the name “ansible-galaxy”. As I understand it now, the code and functionality are the same, but Ansible is a way of installing (and maybe even backing up) Galaxy, like Puppet. Thanks for your reply!

Correct, exactly like Puppet/Chef!

Hi, @marten
Thank you again! Yes, you are correct - the closing tags (</backends> and </object_store>) were missing from the lines above. Those lines were extracted from the full object_store_conf.xml file (I didn’t include the rest because it was all comment lines).
The lines you provided work just fine! Thank you very much for all the troubleshooting! It was a great help.
