We’re developing a transcriptomics processing pipeline that involves multiple steps (tools). We want to keep these together in a single Galaxy tool rather than splitting them out as individual Galaxy tools.
As part of this, we want to tightly control the reference indexes used at each step (e.g. HG38 indexed for STAR and HISAT2, plus the associated transcripts indexed for Salmon).
Ideally, we’d like a way to bundle all the required reference indexes into one set or collection that our pipeline would use when it runs as a Galaxy tool. To be clear, unlike other tools, we’d be hard-coding the reference(s) ahead of time, so the user wouldn’t have a choice.
The reason is to ensure that any run of the Galaxy version of our pipeline stays in lockstep with the offline version we’re running elsewhere.
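To make the idea concrete, here is a rough Python sketch of what we mean by a hard-coded bundle. Everything in it is a hypothetical placeholder (the bundle ID, the paths, and the `index_for` helper); none of it is a Galaxy API, and it only illustrates the "one fixed set, no user choice" behaviour we’re after.

```python
from dataclasses import dataclass
from pathlib import Path
from types import MappingProxyType


@dataclass(frozen=True)
class ReferenceBundle:
    """A fixed set of reference indexes, one per pipeline step."""
    bundle_id: str
    indexes: MappingProxyType  # read-only mapping: step name -> index path


# Hypothetical bundle: the ID and all paths are placeholders.
BUNDLE = ReferenceBundle(
    bundle_id="hg38_bundle_v1",
    indexes=MappingProxyType({
        "star": Path("/refs/hg38_bundle_v1/star"),
        "hisat2": Path("/refs/hg38_bundle_v1/hisat2"),
        "salmon": Path("/refs/hg38_bundle_v1/salmon"),
    }),
)


def index_for(step: str) -> Path:
    """Return the pinned index path for a step; there is no user override."""
    try:
        return BUNDLE.indexes[step]
    except KeyError:
        raise ValueError(
            f"No index for step {step!r} in bundle {BUNDLE.bundle_id}"
        ) from None
```

Each step of the pipeline would call something like `index_for("star")`, so the Galaxy version and the offline version resolve to the same bundle as long as they share the same bundle definition.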
I’ve read a little about Data Managers being used to get custom references into a Galaxy server. However, it wasn’t clear whether they would be appropriate for a set of indexes bundled together, rather than just a single index.
Any suggestions on how to do this (or a pointer to documentation that I missed) would be helpful.