Scheduling recurrent workflows

Hi,

I was wondering if it’s possible to schedule recurrent workflows in Galaxy?

Ex: run a certain workflow once per day at 7AM, etc.
I searched and haven’t been able to find out how to do it.

There may already be some ideas around for this, though.

Overall, I think this should be available in the web UI, for example as:

  • a separate “Scheduling” tab
  • a “Run (scheduled)” button (triangle + hourglass?) on the workflow-related pages, alongside the usual triangle “Run” button

I was not able to find any such pages or buttons.

Thank you,

Vlad

2 Likes

Hi @vlad.visan

Planemo is the recommended choice.

We have a tutorial here that covers this exact usage → Hands-on: Automating Galaxy workflows using the command line / Using Galaxy and Managing your Data

2 Likes

The second part of the answer is cron.

I can’t imagine how this is supposed to work in the web UI, because one typically needs to select input data, which seems hard to automate there. On the command line this seems much easier.
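To make the Planemo + cron combination a bit more concrete, here is a minimal sketch of a wrapper script that cron could call. The file names (workflow.ga, job.yml), the environment variable names and the script path are placeholders I made up. The flags are the ones I remember from the tutorial linked above, so please check them against `planemo run --help` for your Planemo version.

```python
import os
import subprocess
from datetime import date


def build_planemo_command(workflow, job_file, galaxy_url, api_key, run_date):
    """Return the argument list for one `planemo run` invocation."""
    return [
        "planemo", "run", workflow, job_file,
        "--galaxy_url", galaxy_url,
        "--galaxy_user_key", api_key,
        # A dated history name makes each scheduled run easy to find later.
        "--history_name", f"Scheduled run {run_date.isoformat()}",
    ]


if __name__ == "__main__":
    # Placeholder variable names; keep the API key out of the script itself.
    url = os.environ.get("GALAXY_URL")
    key = os.environ.get("GALAXY_API_KEY")
    if url and key:
        cmd = build_planemo_command("workflow.ga", "job.yml", url, key, date.today())
        subprocess.run(cmd, check=True)
    else:
        print("Set GALAXY_URL and GALAXY_API_KEY to run the workflow.")
```

A crontab entry such as `0 7 * * * /usr/bin/python3 /path/to/run_daily.py` (path hypothetical) would then start the script every day at 07:00. Cron does not load your shell profile, so the two environment variables have to be defined in the crontab itself or in a small wrapper script.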

3 Likes

Thank you for the super relevant link!
It seems I wasn’t searching for the right keywords; “Automating” was the one I needed.

1 Like

Thanks for the answer.

As to how it could be done in the UI:

  • Have a time/day workflow parameter. It is filled in manually if the invocation is manual, but if the invocation is scheduled, it is auto-filled by Galaxy just before the invocation.
  • At least one step would use this parameter, for example to download the data for that specific day
  • The downloaded data for that specific day is then passed to the actual data-processing part of the workflow
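Until something like this exists in the UI, the auto-filled date can be approximated on the command line by regenerating the job file just before each scheduled `planemo run`. This is only a rough sketch: the input label `run_date` and the file name are hypothetical and would have to match the parameter label in the actual workflow. It writes JSON on the assumption that the job file is parsed as YAML, of which JSON is a subset.

```python
import json
from datetime import date


def build_job(run_date):
    """Map the workflow's date input (hypothetical label 'run_date') to a value."""
    return {"run_date": run_date.isoformat()}


def write_job_file(path, run_date):
    """Write the job description to `path` for a later `planemo run`."""
    # JSON output avoids a dependency on a YAML library.
    with open(path, "w") as handle:
        json.dump(build_job(run_date), handle, indent=2)


if __name__ == "__main__":
    # Called by the scheduler right before the workflow is invoked.
    write_job_file("job.yml", date.today())
```

The download step in the workflow would then read this parameter to decide which day's data to fetch.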

@vlad.visan all your ideas will work :slight_smile:

During the pandemic, @wm75 has been running a bot on GitHub and Jenkins to process new COVID data from ENA: usegalaxy-eu/ena-cog-uk-wfs (Collection of scripts for automated SARS-CoV-2 genome surveillance). I hope the scripts can help you. We would also be interested in generalizing this, but have never found the time so far.

3 Likes
  • Have a time/day workflow parameter. It is filled in manually if the invocation is manual, but if the invocation is scheduled, it is auto-filled by Galaxy just before the invocation.
  • At least one step would use this parameter, for example to download the data for that specific day
  • The downloaded data for that specific day is then passed to the actual data-processing part of the workflow

Agreed, that could be a way, and I’m even using one workflow that has such a date input parameter, but it’s a niche use case. I guess the vast majority of users wouldn’t have workflows that behave differently depending on the date input.
Still, an interesting idea :slight_smile:

2 Likes

Thank you for the link, the method seems very useful!

I’m not sure how widespread this demand is, but at least locally we have several teams requesting it (one in biodiversity genomics, the other in radar geology; both get new data daily and have relatively stable workflows).

I see, and I agree in general.

However, outside of bioinformatics that need might be more common, so adding this functionality could help grow the Galaxy user base.

Specifically, we have two internal use cases that require it, one in biodiversity genomics and the other in radar geology, as both get new data daily. So if they are to use Galaxy, we need to provide this feature at least on our instance, if not in the main Galaxy codebase.

2 Likes