Hi @chenqiang
As a guess, the files are used in later steps since those are the post-QA core fastqsanger data, correct?
Read files are usually the largest files in any analysis and can definitely explode the quota usage.
Try this if quota space is limited:
- Remove the raw read files after QA steps.
- Keep processed read files for downstream steps like mapping or assembly.
- After those steps are completed, you can then remove any non-result files.
- Dataset files organized into collections with different shapes do not consume extra quota space, since those are just references to the original files.
- But if you introduce any changes to metadata or the file contents, those will then become new independent files that do consume quota.
In short, if you want a tool to work on a dataset, that dataset needs to be retained until the tool completes. Once a file is not needed by any downstream tool, you can safely remove it.
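The retention rule above can be sketched in a few lines of plain Python. This is only an illustration of the bookkeeping, not Galaxy's own logic or API: the function `removable_datasets`, the tool names, and the dataset names are all hypothetical.

```python
from typing import Dict, List, Set


def removable_datasets(
    tool_inputs: Dict[str, List[str]],
    completed: Set[str],
    keep: Set[str],
) -> Set[str]:
    """Return input datasets that no unfinished tool still needs.

    tool_inputs maps a (hypothetical) tool name to the datasets it reads.
    completed is the set of tools that have already finished.
    keep lists datasets to retain regardless, e.g. final results.
    """
    all_datasets = {d for inputs in tool_inputs.values() for d in inputs}
    still_needed = {
        d
        for tool, inputs in tool_inputs.items()
        if tool not in completed
        for d in inputs
    }
    return all_datasets - still_needed - keep


# Hypothetical analysis: QA trims the raw reads, mapping uses the trimmed reads.
tool_inputs = {
    "qa_trim": ["raw_reads.fastqsanger"],
    "mapping": ["trimmed_reads.fastqsanger"],
}

# After QA only: the raw reads can go, the trimmed reads must stay for mapping.
print(sorted(removable_datasets(tool_inputs, completed={"qa_trim"}, keep=set())))
```

Once mapping is also in `completed`, the trimmed reads show up as removable too, unless you list them in `keep`.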
You’ll have a copy of the raw files at the original pre-Upload-to-Galaxy source, so those are not needed on the server, correct?
Xref prior related Q&A: the "Running Galaxy workflows" help in the Planemo 0.75.20 documentation is what to review next. Or, maybe review some Published Galaxy workflows? Find the search under the Workflow link in the masthead of the web application.