Re-using NetMHC Runs?

Hi,

I’m building a Galaxy tool that uses NetMHC, which takes a file of peptides and an MHC allele and predicts peptide-MHC binding affinity. On something the size of the human proteome, a run takes quite a while, so I’d like to cache the results for when my tool needs NetMHC run on the same set of peptides and the same MHC allele.

I know about data managers, but those seem to be aimed at data that I, as the tool creator/administrator, upload ahead of time, rather than at caching the results of users’ runs. Does Galaxy have any kind of scaffolding for what I’m trying to do? I looked in the documentation, but I didn’t find anything.

Thanks,

Jordan


You are correct about the data manager approach. Data managers usually have an associated tool to ingest data into the manager and retrieve it again.
One approach would be to give this tool two outputs: one that emits the cached result, and another that forwards the input dataset to your analysis tool. Have the analysis tool link back to a second input of your data manager tool. If the data manager tool does not produce a given output, the tools linked to that output will not be run.
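Whichever way the tools end up wired together, the lookup itself needs a key that identifies "same peptides, same allele". Here is a minimal sketch of that part in Python, assuming the cache is just a directory the tool can write to. The function names (`cache_key`, `lookup`, `store`), the `.out` suffix, and the directory layout are all made up for illustration; none of this is Galaxy or NetMHC API.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Optional


def cache_key(peptide_file: Path, allele: str) -> str:
    """Hash the allele name together with the peptide file contents."""
    digest = hashlib.sha256()
    digest.update(allele.encode("utf-8"))
    digest.update(b"\0")  # separator so allele and file bytes cannot run together
    with open(peptide_file, "rb") as handle:
        # Read in 1 MiB chunks so proteome-sized inputs are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def lookup(cache_dir: Path, key: str) -> Optional[Path]:
    """Return the cached result file for this key, or None on a cache miss."""
    candidate = cache_dir / f"{key}.out"
    return candidate if candidate.is_file() else None


def store(cache_dir: Path, key: str, result_file: Path) -> Path:
    """Copy a finished result into the cache under its key."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / f"{key}.out"
    # Copy to a temporary name first, then rename, so a concurrent lookup
    # never sees a half-written file.
    tmp = target.with_name(target.name + ".tmp")
    shutil.copyfile(result_file, tmp)
    tmp.replace(target)
    return target
```

On a hit, the tool would emit the file returned by `lookup` as the cached-result output; on a miss, it would forward the input to the analysis tool and call `store` once the result comes back. Note the key depends on the exact bytes of the peptide file, so the same peptides in a different order would count as a different input.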


Ah, okay, thanks!
