The intermediate output file still exists

Hi, I saw the Output cleanup option: "Upon completion of this step, delete unchecked outputs from completed workflow steps if they are no longer required as inputs." I chose Yes. However, the output file generated in the trim step of my workflow still exists, and my storage is full. I chose Yes for Output cleanup from beginning to end, yet the intermediate output file still exists. I hope these intermediate files can be removed, otherwise my storage is far from enough.


Hi @chenqiang

As a guess, the files are used in later steps since those are the post-QA core fastqsanger data, correct?

Read files are usually the largest files in any analysis and can definitely explode the quota usage.

Try this if quota space is limited:

  1. Remove the raw read files after QA steps
  2. Keep processed read files for downstream steps like mapping or assembly
  3. After those steps are completed, you can then remove any non-result files.
  4. Dataset files organized into collections with different shapes do not consume extra quota space, since those are just references to the original files.
  5. But if you introduce any changes to metadata or the file contents, those will then become new independent files that do consume quota.

In short, if you want a tool to work on a dataset, that dataset needs to be retained until the tool completes. Once a file is not needed by any downstream tools, you can safely remove it.
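The retention rule above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not how Galaxy implements Output cleanup. The step names (`raw_reads`, `trim`, `map`) and the helper `last_consumers` are hypothetical, and the workflow is assumed to be a simple chain listed in execution order.

```python
from typing import Dict, List


def last_consumers(steps: Dict[str, List[str]], order: List[str]) -> Dict[str, str]:
    """Map each producing step to the last step that reads its output."""
    last: Dict[str, str] = {}
    for step in order:
        for upstream in steps[step]:
            # Later steps overwrite earlier ones, so the final value
            # is the last consumer in execution order.
            last[upstream] = step
    return last


# Hypothetical workflow: step name -> steps whose outputs it reads.
workflow = {
    "raw_reads": [],
    "trim": ["raw_reads"],
    "map": ["trim"],
}
order = ["raw_reads", "trim", "map"]

removable_after = last_consumers(workflow, order)
for producer, consumer in removable_after.items():
    print(f"output of {producer!r} can be removed once {consumer!r} completes")
```

The final step (`map` here) never appears as a key, since nothing downstream reads it: that is the result you keep.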

You’ll have a copy of the raw files at the original pre-Upload-to-Galaxy source, so those are not needed on the server, correct?

Xref the prior related Q&A: the "Running Galaxy workflows" page in the Planemo 0.75.20 documentation is what to review next. Or, maybe review some Published Galaxy workflows? Find the search under the Workflow link in the masthead of the web application.

For the prior Q&A, if you think the problem is with Galaxy, please create a very simple example that demonstrates the problem. This should be against a public server where you can share all artifacts.

Three steps would be ideal: input, action1, action2. The case to demonstrate is one where you want the outputs from the 2nd step deleted, but specifying that in the 3rd step does not actually clean up the intermediate file.

Oh, I see what you mean. The third step will not clean up the output of the second step, right? Can the file only be generated and then deleted manually? Can’t I specify that the output of the second step should not be kept in my history? The file generated in the second step is very large.

This is my workflow, please take a look at it for me: https://usegalaxy.eu/u/ccc123/w/copy-of-metag-wrokflow

The file must be in the active history for tools to use it.

If you are working at UseGalaxy.org, we are introducing a new feature: scratch space. There is no documentation yet, but you can find the toggle under User > Preferences > Preferred Object Store. That “database” icon is in the UI in a few places too. How it works is explained in the pop-ups. You could set all data to write to the scratch space. It is 1 TB for everyone. The regular storage is still the same, and you can move data between the two.

Oh, I tried it. So I can download it to scratch space by default by clicking on it?

If so, I highly recommend that this function should also be available in usegalaxy.eu, because it is very important for users who store large amounts of data.

All my work is done in usegalaxy.eu, and there are many tools in usegalaxy.eu. I hope you can also add temporary space in usegalaxy.eu.

@chenqiang, this is only a few days old. We are testing it out at UseGalaxy.org first. I don’t know if or when the EU server would introduce this.

You can work at any of the public servers :slight_smile:

Oh, thank you for your answer. It is mainly because many of the tools I use are not available in UseGalaxy.org, only on the EU server.