Why is the database so large?


In my Galaxy instance there are 3 users with 474, 312 and 0 gigabytes of used space, according to Galaxy. However, the database actually uses 1.2 TB of disk space. I ran the cleanup scripts (Cleaning up Dataset Objects), but they freed exactly 0 bytes.

What else does my database contain that is not shown and cannot be deleted by the scripts? Can I delete it in some other way?

Thanks in advance.

Hi @wormball, are you running PostgreSQL as the database? If so, you may need to run VACUUM FULL. You will need to stop Galaxy for this work, though. See PostgreSQL: Documentation: 12: VACUUM

I think I use the default option (SQLite), but I do not know how to check this.
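One way to check is to look at the database_connection setting in the Galaxy config file. The sketch below is only an illustration: the function name galaxy_db_backend is made up here, and it assumes the backend is set by an uncommented database_connection line in galaxy.yml, with Galaxy falling back to its built-in SQLite default when that line is absent.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the database backend named in a Galaxy config file.
# Assumption: an uncommented `database_connection:` line holds the connection URL.
# If the line (or the file) is missing, Galaxy is assumed to use its SQLite default.
galaxy_db_backend() {
    local config_file="$1"
    local url
    # Take the first uncommented database_connection line,
    # then strip the key and any surrounding quotes.
    url=$(grep -E '^[[:space:]]*database_connection[[:space:]]*:' "$config_file" 2>/dev/null \
          | head -n 1 \
          | sed -E "s/^[^:]*:[[:space:]]*//; s/^[\"']//; s/[\"'][[:space:]]*$//")
    if [ -z "$url" ]; then
        echo "sqlite (default, no database_connection set)"
    else
        # The backend is the URL scheme, e.g. postgresql://... or sqlite:///...
        echo "${url%%:*}"
    fi
}
```

For example, galaxy_db_backend /data/galaxy/config/galaxy.yml would print the URL scheme if one is configured; the config path is a guess based on the install location shown in this thread.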

I think he means the folder galaxy/database. You could check how big the folder galaxy/database/tmp is, and whether it accounts for the 1.2 TB.


root@transgen-1:/data/galaxy/scripts/cleanup_datasets# du -d 1 /data/galaxy/database/
1284 /data/galaxy/database/tool_search_index
27516 /data/galaxy/database/shed_tools
32 /data/galaxy/database/mulled
32 /data/galaxy/database/citations
849294740 /data/galaxy/database/objects
4 /data/galaxy/database/object_store_cache
3679008 /data/galaxy/database/dependencies
2192 /data/galaxy/database/tool_cache
5132 /data/galaxy/database/jobs_directory
398252968 /data/galaxy/database/tmp
516 /data/galaxy/database/compiled_templates
12 /data/galaxy/database/container_cache
1251305624 /data/galaxy/database/
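The numbers in this listing are easier to compare after a unit conversion. The sketch below assumes they are 1 KiB blocks (the usual default for GNU du when no block size is given); the function name kib_to_gib is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a size in 1 KiB blocks to GiB, one decimal place.
# Assumption: the input comes from du run with its default 1 KiB block size.
kib_to_gib() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale.
    LC_ALL=C awk -v kib="$1" 'BEGIN { printf "%.1f\n", kib / (1024 * 1024) }'
}
```

With that assumption, kib_to_gib 398252968 prints 379.8 for the tmp folder. On systems with GNU coreutils, du -h -d 1 /data/galaxy/database/ | sort -h gives a similar human-readable view, sorted by size.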

So the difference between the 1.2 TB on disk and what you see as Galaxy datasets is roughly the size of the galaxy/database/tmp folder. You can clean it up periodically with something like tmpreaper. Thanks @ggbio for identifying this.

@wormball - What Marten said. I have to regularly empty the tmp folder. I normally use something like tmpreaper too.
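For anyone without tmpreaper installed, a rough equivalent can be sketched with plain find. This is not how tmpreaper itself works internally, just a minimal illustration: the function names list_stale and reap_stale are made up, the age threshold is whatever you pass in, and it assumes GNU find. Files still needed by running jobs could be removed if the threshold is too short, so a dry run first seems wise.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for pruning old files from a tmp directory with find.

# Dry run: list entries under DIR not modified for more than DAYS days.
list_stale() {
    local dir="$1" days="$2"
    find "$dir" -mindepth 1 -mtime +"$days" -print
}

# Delete stale regular files, then remove old directories that are already empty.
# Directories emptied by this run get a fresh mtime, so a later run picks them up.
reap_stale() {
    local dir="$1" days="$2"
    find "$dir" -mindepth 1 -type f -mtime +"$days" -delete
    find "$dir" -mindepth 1 -type d -empty -mtime +"$days" -delete
}
```

A cautious sequence would be list_stale /data/galaxy/database/tmp 7 to review what would go, then reap_stale /data/galaxy/database/tmp 7, ideally while no jobs are running. The 7-day threshold is only an example value.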

Thanks all!

I reviewed our workflow, and it turned out that poorly written custom tool XMLs were leaving tons of garbage behind. I have now fixed them, and Galaxy is innocent.