Permanently deleting datasets in a history that is already purged

Hi there,

I am using usegalaxy.eu. In an attempt to free up space, I deleted an entire history and purged it using the account storage manager. However, I did not delete or purge the individual datasets within the history before doing so. These datasets (now inside a deleted and purged history) still count towards my total storage, and there is no obvious way to delete and purge them. I have read the relevant FAQs but have not found a solution. I have tried accessing the datasets via Data → Datasets or Data → Histories to delete them there, but the delete button is missing because the datasets are already associated with a deleted and purged history. I would be happy to learn how to solve this problem.

Hi there,

I am also having problems with deleting and purging at the moment. Could this still be a consequence of the servicing last week? I tried several times, both individually and by selecting multiple datasets and permanently deleting them, but they do not change to purged. Is there a solution for this?

I was able to delete them at Storage Dashboard → Free up disk space → Deleted datasets. Maybe this works for you as well?

Hi @f.gather and @strandkurt

Some information about how datasets are associated with histories:

  1. When originally loaded into your account, a dataset is included in one history. That single copy is the only copy.
  • If you purge that history, that single copy of the dataset is also purged.
  • Quota is completely freed up.
  2. If you make a copy of that dataset (Copy Datasets function), that copy is a clone of the original dataset. What counts against your quota is the data associated with that “copy group”, so copies of a dataset do not consume extra quota.
  • If you purge a history that includes a copy of a dataset within a copy group, one of the clones is removed, but a copy of that dataset is still active (or deleted) in another history.
  • Quota is not freed up (yet).
  3. It doesn’t matter which copy you purge: it could be the original loaded dataset, one of the copies, or all of the copies.
  • If any copy of a dataset exists as active (or deleted) in any history (active or deleted), you still have that data in your account.
  • Only purging all copies of a dataset frees up quota space.
  • Any dataset copies that have been marked as deleted will show up under User → Preferences → Storage Dashboard.
  • All of your data, including copies, will also show up in the Storage Dashboard. Entries are linked to individual histories, so the same data may be listed more than once.
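
The copy-group rules above can be illustrated with a small toy model. This is only a sketch of the accounting as described in this post, not Galaxy's actual implementation: the names `DatasetCopy`, `CopyGroup`, `quota_used`, and `purge_history` are hypothetical, and the sizes are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetCopy:
    """One appearance of a dataset in a history (hypothetical model)."""
    history: str
    deleted: bool = False  # "deleted" alone does not free quota
    purged: bool = False


@dataclass
class CopyGroup:
    """All copies that share the same stored data."""
    size_gb: float
    copies: List[DatasetCopy] = field(default_factory=list)

    def counts_toward_quota(self) -> bool:
        # The data is charged once, for as long as at least one copy
        # is not purged (whether that copy is active or only deleted).
        return any(not copy.purged for copy in self.copies)


def quota_used(groups: List[CopyGroup]) -> float:
    """Sum the size of every copy group that still has an unpurged copy."""
    return sum(g.size_gb for g in groups if g.counts_toward_quota())


def purge_history(groups: List[CopyGroup], history: str) -> None:
    """Mark every copy that lives in the given history as deleted and purged."""
    for group in groups:
        for copy in group.copies:
            if copy.history == history:
                copy.deleted = True
                copy.purged = True


# A 10 GB dataset copied into histories A and B, plus a 4 GB dataset only in A.
groups = [
    CopyGroup(10.0, [DatasetCopy("A"), DatasetCopy("B")]),
    CopyGroup(4.0, [DatasetCopy("A")]),
]
print(quota_used(groups))   # both groups are charged once each
purge_history(groups, "A")
print(quota_used(groups))   # the 10 GB group still has a copy in B
purge_history(groups, "B")
print(quota_used(groups))   # every copy is purged, nothing is charged
```

In this model, purging history A frees only the 4 GB dataset, because the 10 GB dataset still has an unpurged copy in history B.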

Given that context, a few things may have been going on.

  • A copy of a dataset was included in a purged history, but also existed in a different history. So, the copy was removed when the history was purged, but the data was still in your account, so quota usage did not free up.

  • If the dataset was in another history that was also deleted (but not yet purged), that copy of the dataset would be marked as deleted, and would show up in the Storage Dashboard.

  • The action was scheduled to process, but due to the recent maintenance, the process hasn’t completed yet. This sort of situation will clear up over the next few days as the job backlog and server actions catch up.

  • Or, finally, know that the purging action can take up to 24 hrs to fully complete even under normal circumstances. This is especially true for larger data purging actions. This can sometimes be moved up in priority by using the Storage Dashboard → Refresh function.

If this doesn’t address what is going on… very strange. I am most curious about your feedback @f.gather – you were able to purge in one view (Storage Dashboard) but not directly in a history? That is the application view of the dataset copies and should be able to be managed anywhere that particular copy of the data exists (unrelated to quota calculations for the reasons explained above). If this happens again, please capture and post back screenshots if you could. That would be a corner case bug as far as I know! We can ask the EU administrators to help with confirmation of an actual application issue as needed. I would be curious about doing the same should this be replicated at the UseGalaxy.org server.

Ok – let’s start there. Both servers had some upgrades recently (the new release, plus backend cluster changes) so a bit tricky … but we can still capture the use cases. If the data management functions are not completely resolved in the next few days, we can dig into why.

Thanks for the feedback!!! :slight_smile:

Hi @jennaj

Thanks for your comments. However, these datasets are not associated with or copied into other histories, so that is unlikely to be the cause. The datasets simply don’t show up in the storage dashboard, so they cannot be selectively deleted or purged. It is very strange. This problem has persisted since 13th April, so a few days after the latest Galaxy update.

Thanks for explaining @strandkurt

Your situation is at the EU server, correct? The clusters were undergoing changes there before, and now the changes are in the backend. Not sure if what is going on could be part of that.

How long has it been since you purged? 10 hours? And have you used the Refresh button? If that history was larger than 50 GB, the server might just be catching up. If the data is still around after 24 hours, I would be interested in following up more.

Hi @jennaj
Yes, this is the EU server. The history was purged 4 days ago and the quota still has not changed. I can see the datasets within the purged history by clicking Data → Datasets or Data → Histories → purged entries → view, but there is no way to delete the datasets within the purged history.

I would appreciate a solution to this.

Hi @strandkurt
I think this is related to the ongoing cluster maintenance in EU servers, some services are still not 100% functioning. I’m pretty sure your datasets are properly deleted and purged if they show like that in the UI.
The issue is probably recalculating your current disk quota. This hopefully should be fixed today.

Hi @davelopez

Thanks for your comments. I think the problem is that the datasets were not deleted and purged individually; only the history containing them was. Somehow they got stuck in limbo between purged and active, with no way to delete them (even via the data storage manager). Because of this, I have been unable to run any jobs since April 10.

Hi @strandkurt

It seems the background task services are now live again on usegalaxy.eu. The purging of datasets happens in one of those background tasks. Hopefully, the tasks are running now and catching up with the queue. Please check again in a while and see if those datasets are now purged. Then go to https://usegalaxy.eu/storage and click Refresh.

If you still have the same problem after waiting some time, let us know so one of the admins can check your issue.

Hi @davelopez

Thanks, I have just checked and the datasets have now been deleted and the quota is showing the right usage. I guess it was as you said - an issue with the background tasks lingering on since the planned update. I am glad that it is fixed… Thanks!

I’m also glad it is working now for you!

Thanks for reporting back!