A few months ago I created a Galaxy instance for a colleague, using Docker on a virtual machine in our cloud.
After she was done with her analyses and I had explained to her how to download her results, I deleted the Docker container and the virtual machine. I saved the mapping directory “just in case”.
She recently asked me if I could restart the Galaxy instance since she had a problem when saving her datasets.
Now I have trouble starting a new container using the same mapping directory.
I had to go inside the Docker container and manually launch the startup script to get more information.
I had the following error:
Exception: Your database has version ‘134’ but this code expects version ‘141’. Please backup your database and then migrate the database schema by running ‘sh manage_db.sh upgrade’.
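Read literally, the message asks for two steps: a backup, then a migration of Galaxy's database schema from version 134 to 141 (which is separate from upgrading PostgreSQL itself). A minimal sketch of those two steps, assuming Galaxy is installed in /galaxy-central and that the database and its role are both named galaxy; those defaults and the function names are placeholders, not something stated in the error:

```shell
# Assumptions (not from the error message): Galaxy root is /galaxy-central,
# and the database and its owning role are both called "galaxy".
GALAXY_ROOT="${GALAXY_ROOT:-/galaxy-central}"
DB_NAME="${DB_NAME:-galaxy}"
DB_USER="${DB_USER:-galaxy}"

# Step 1: dump the database to a tar-format archive before touching it.
# $1 is the path of the archive to write.
backup_galaxy_db() {
    pg_dump -U "$DB_USER" -F t -f "$1" "$DB_NAME"
}

# Step 2: run the schema migration named in the error message,
# from the directory that contains manage_db.sh.
upgrade_galaxy_schema() {
    ( cd "$GALAXY_ROOT" && sh manage_db.sh upgrade )
}
```

Inside the container this could be used as `backup_galaxy_db /export/galaxy_before_upgrade.tar && upgrade_galaxy_schema`, so that the migration only runs if the dump succeeded.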
So I backed up my database, upgraded postgresql, re-imported the backup and launched the startup script again.
Now I can access the Galaxy instance but I cannot log in to the admin account (which my colleague used for her analyses; she didn’t bother creating a separate account). I get no error message, but the “admin” tag doesn’t appear, and when I click on “user” it says “connected as” with nothing after it.
I even tried to import the database backup into a “virgin” Docker container created from the same Galaxy Docker image but there were no datasets in this new instance.
I know the datasets are there. There are a lot of files in mapping/galaxy-central/database/files/000/, but they are all named dataset_XXX.dat. I don’t know where to find the real dataset names, the history names…
Is there a way to get back the datasets with their metadata?
All the information you are seeking should be in Galaxy’s database. What database did you use in your configuration? Do you have it backed up from the time after your colleague’s analysis?
I don’t know how to get the information I need out of the database. I would like to give my colleague either an archive with all her datasets (with the correct names and extensions) or, better, a new Galaxy instance with all her work.
I didn’t change anything in the database configuration.
My Galaxy instance is based on quay.io/shiltemann/galaxy-metagenomics, which is based on bgruening/galaxy-ngs-preprocessing:17.05, and this one is based on bgruening/galaxy-stable:17.05.
This is what I did to back up the database:
su - postgres
pg_dumpall > pg_backup.bak
Then I created a new Docker container based on the same image, copied the pg_backup.bak file into the appropriate directory and imported it (psql -f pg_backup.bak postgres), but I couldn’t see my datasets in this new instance.
I realized that I hadn’t used the right Docker image…
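Spelled out, the dump and re-import described above look roughly like the sketch below. The function names are made up for illustration, and the `-v ON_ERROR_STOP=1` option is an addition that was not part of the original commands: by default psql carries on after a failed statement, so a restore into a cluster that already contains a galaxy database can scroll past its errors and look as if it worked.

```shell
# Dump every database in the cluster (roles included) into one SQL file.
# Meant to be run as the postgres system user, e.g. after `su - postgres`.
# $1 is the path of the SQL file to write.
dump_cluster() {
    pg_dumpall > "$1"
}

# Replay that SQL file into another cluster, connecting to the "postgres"
# maintenance database as in the post.
# ON_ERROR_STOP=1 (an addition, see above) aborts at the first failing
# statement instead of continuing silently.
restore_cluster() {
    psql -v ON_ERROR_STOP=1 -f "$1" postgres
}
```

For example: `dump_cluster pg_backup.bak` in the old container, then `restore_cluster pg_backup.bak` in the new one.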
So I just created a new container based on the right Docker image and linked to my mapping directory.
I connected to the container interactively to see what happens.
First the postgresql service was down. Starting failed because of the following error:
Error: The cluster is owned by user id 1000 which does not exist any more
So I ran the following commands and then restarted the postgresql service (successfully):
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) FATAL: password authentication failed for user “galaxy”
FATAL: password authentication failed for user “galaxy”
I ran the following commands to recreate the galaxy user for postgresql:
postgresql: ERROR (abnormal termination)
[…]
==> /home/galaxy/logs/uwsgi.log <==
return self.dbapi.connect(*cargs, **cparams)
File “/galaxy_venv/local/lib/python2.7/site-packages/psycopg2/__init__.py”, line 164, in connect
conn = _connect(dsn, connection_factory=connection_factory, async=async)
OperationalError: (psycopg2.OperationalError) could not connect to server: Connection refused
Is the server running on host “localhost” (127.0.0.1) and accepting
TCP/IP connections on port 5432?
could not connect to server: Cannot assign requested address
Is the server running on host “localhost” (::1) and accepting
TCP/IP connections on port 5432?
When doing this
tail /var/log/postgresql/postgresql-9.3-main.log
I got the following line:
2019-07-16 08:56:15 UTC LOG: could not receive data from client: Connection reset by peer
I don’t know Paste or uwsgi. I didn’t change anything in Galaxy’s inner architecture or config.
My instance is based on Docker image bgruening/galaxy-stable:17.05.
But I just found a way to solve my problems. Here is what I did:
Create a Docker container based on the same Docker image I used previously
and mapping to the same mapping directory I used previously and which contains
my datasets.
Restore the backed-up PostgreSQL database (inside the “new” Docker container).
cp /export/galaxy-central/dump_galaxy.tar /home/galaxy
su - galaxy
pg_restore -d galaxy dump_galaxy.tar -c -U galaxy
exit # exit from galaxy user
exit # exit from Docker container
Copy datasets files from “old” container to “new” one
(which are stored in /export/galaxy-central/database/)
# -C stores the relative path database/ in the archive, so that it unpacks inside the new mapping directory
sudo tar czf /my/old/mapping/dir/galaxy-central/database.tar.gz -C /my/old/mapping/dir/galaxy-central database
sudo cp /my/old/mapping/dir/galaxy-central/database.tar.gz /my/new/mapping/dir/galaxy-central/database.tar.gz
sudo mv /my/new/mapping/dir/galaxy-central/database /my/new/mapping/dir/galaxy-central/database_bak
sudo tar xzf /my/new/mapping/dir/galaxy-central/database.tar.gz -C /my/new/mapping/dir/galaxy-central
Restart “new” Docker container
sudo docker restart galaxy_clean
I don’t know if what I did is really clean but at least it allowed me to get back my datasets simply from the mapping directory.
Maybe this can help other people as well.
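For anyone who only needs the files back under their real names rather than a running instance, here is a rough sketch of the other route mentioned earlier in the thread: ask the database which dataset_XXX.dat belongs to which history and dataset name, then copy the files out under those names. It assumes the database and role are both called galaxy and that the table and column names (history_dataset_association, history, dataset) match the Galaxy schema in use; the function names are invented for this example, and names containing tab characters are not handled.

```shell
# Assumptions (not from the thread): database and role are both "galaxy".
DB_NAME="${DB_NAME:-galaxy}"
DB_USER="${DB_USER:-galaxy}"

# Print one tab-separated line per non-deleted history item:
# dataset id, history name, dataset name, extension.
list_datasets() {
    psql -U "$DB_USER" -d "$DB_NAME" -At -F "$(printf '\t')" -c "
        SELECT d.id, h.name, hda.name, hda.extension
        FROM history_dataset_association hda
        JOIN history h ON h.id = hda.history_id
        JOIN dataset d ON d.id = hda.dataset_id
        WHERE NOT hda.deleted
        ORDER BY h.id, hda.hid;"
}

# Read those lines on stdin and copy each dataset_<id>.dat found under
# $1 (the files directory) to $2/<history>/<id>_<name>.<ext>.
export_named() {
    files_dir=$1
    out_dir=$2
    while IFS="$(printf '\t')" read -r id history name ext; do
        # Search by file name rather than guessing the 000/, 001/, ... layout.
        src=$(find "$files_dir" -type f -name "dataset_${id}.dat" | head -n 1)
        [ -n "$src" ] || continue    # no such file on disk, skip this row
        # Replace slashes so a name cannot point outside the target directory.
        safe_history=$(printf '%s' "$history" | tr '/' '_')
        safe_name=$(printf '%s' "$name" | tr '/' '_')
        mkdir -p "$out_dir/$safe_history"
        # The id prefix keeps two datasets with the same name from colliding.
        cp "$src" "$out_dir/$safe_history/${id}_${safe_name}.${ext}"
    done
}
```

Run inside the container, where psql can reach the database, this would be something like `list_datasets | export_named /export/galaxy-central/database/files /export/named_datasets`; the resulting directory can then be archived with tar.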