Dear Galaxy support team,
I have questions regarding the cleanup of datasets, libraries and histories using the provided scripts.
I have a local Galaxy (19.05) instance running with PostgreSQL, and after creating some libraries I am now trying to clean up those libraries and all the datasets involved. I first deleted the libraries in the Galaxy browser, along with the histories and their associated datasets. After that, I ran the root/scripts/cleanup_datasets/cleanup_datasets.py script with -r for disk removal (later also with -f to force retrying) and -4 for library removal; its standard output reports that all libraries were purged and the associated datasets marked as deleted. However, in the browser I can still undo the deletion of all these libraries and subsequently use them to add files. I tried various combinations of the script's options and also ran the sh wrapper scripts that go with it (see the sketch below). Some of the datasets in database/files are removed, but many subdirectories remain filled with datasets.
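For reference, a sketch of the wrapper-script sequence I mean, assuming a stock checkout where the .sh wrappers live next to cleanup_datasets.py; the Galaxy root path and run order here are my own, for illustration:

```sh
# Run from the Galaxy root directory (path is hypothetical).
cd /srv/galaxy

# Stock wrapper scripts shipped in scripts/cleanup_datasets/;
# each one invokes cleanup_datasets.py with the matching numeric option.
sh ./scripts/cleanup_datasets/delete_userless_histories.sh   # -1
sh ./scripts/cleanup_datasets/purge_histories.sh             # -2
sh ./scripts/cleanup_datasets/purge_libraries.sh             # -4
sh ./scripts/cleanup_datasets/purge_folders.sh               # -5
sh ./scripts/cleanup_datasets/delete_datasets.sh             # -6
sh ./scripts/cleanup_datasets/purge_datasets.sh              # -3
```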
I was wondering whether you have seen this behaviour before, or if I am missing a crucial step in the process.
Thanks for your quick reply! To my knowledge, all histories were removed prior to cleaning up the libraries, but I will dive into this! In any case, I would expect the cleanup_datasets.py -6 argument to take care of these associative "links"; is that true, or did I misunderstand these links?
The script is designed to "clean up", i.e. it operates only on data that users have already marked as deleted; it won't delete data out from under them. So I think it is entirely possible that there are users still using the dataset in question (e.g. in histories), which is why the script refuses to remove the physical file.
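A quick way to see whether the deletion flags actually reached the database is to look at the library rows directly. A sketch, assuming a database named galaxy and a library id of 42 (both placeholders; adjust column names if your schema version differs):

```sh
# Inspect the deletion flags on a library row (database name and id are hypothetical).
psql -d galaxy -c "SELECT id, name, deleted, purged FROM library WHERE id = 42;"
```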
I recommend verifying this by finding the problematic dataset id and querying the database for any HDA or LDDA rows that still reference it.
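Something along these lines, a sketch assuming the dataset id is 12345 and the database is named galaxy (both placeholders):

```sh
# Find any HDA/LDDA rows still pointing at the dataset (id and db name are hypothetical).
psql -d galaxy -c "
SELECT 'hda' AS kind, id, deleted
  FROM history_dataset_association
 WHERE dataset_id = 12345
UNION ALL
SELECT 'ldda' AS kind, id, deleted
  FROM library_dataset_dataset_association
 WHERE dataset_id = 12345;"
```

Any row returned with deleted = false would explain why the physical file is being kept.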