nf-core Pipeline Integration into a Local Galaxy

Hi everyone, I want to integrate some nf-core pipelines into my local Galaxy instance. I’m particularly interested in wrapping the nf-core/ampliseq pipeline using Docker.

Has anyone managed to write a proper tool XML file for this (or any other nf-core pipeline)? If so, I’d appreciate it if you could share your experience or any tips you might have. It would be really helpful as I work on this.

Thanks in advance!


Cross-reference posts for context

Resources

And this advice is from our senior developer:

That’s all just for context in case someone runs across this thread.

If anyone has done this and wants to share what they did, you are welcome to post here! 🙂


FWIW, some personal views:

For simple NF workflows that will be widely used, reimplementing them as native Galaxy workflows has many advantages: your work becomes available to the community in a computationally efficient form, with all the outputs available for downstream processing in Galaxy. That’s the recommended route where possible, and it’s usually easy if all the required tools are already available.

If new tools are needed, that requires specialised developer effort. There is substantial ROI if the new workflow is important to the community, but the tools have to be available first. Ampliseq looks very complex; implementing something equivalent or better would require a collaborative, community-led effort.

If computational efficiency is not an issue, the whole NF kit and kaboodle can be wrapped as a new Galaxy tool, because Nextflow is available from Conda. There’s a very crude example here where the input data are all URIs in the user-supplied YAML file, and the NF workflow has a special Python runner to set up the actual NF command line. That runner could provide a model for NF workflows that don’t have one.
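To make that concrete, here is a minimal, untested sketch of what such a wrapper could look like. Everything specific in it is an assumption: the tool id, the Nextflow version pin, the yaml datatype (fall back to txt if your server doesn’t register it), and the docker profile, which only works if Docker is already installed on the compute node. The pipeline itself is fetched from the network at run time.

```xml
<tool id="nf_core_ampliseq_wrapper" name="nf-core/ampliseq (whole-pipeline wrapper)" version="0.1.0">
    <description>runs the entire Nextflow pipeline as a single Galaxy job</description>
    <requirements>
        <!-- Nextflow is on Bioconda; the version pin here is illustrative -->
        <requirement type="package" version="23.10.1">nextflow</requirement>
    </requirements>
    <command detect_errors="exit_code"><![CDATA[
        ## Docker must already be present on the compute node;
        ## input data are referenced as URIs inside the params file
        nextflow run nf-core/ampliseq
            -profile docker
            -params-file '$params_yaml'
            --outdir results
    ]]></command>
    <inputs>
        <param name="params_yaml" type="data" format="yaml"
               label="nf-core/ampliseq parameters (YAML, input data as URIs)"/>
    </inputs>
    <outputs>
        <!-- sweep up whatever the pipeline wrote so it lands back in the history -->
        <collection name="results" type="list" label="ampliseq results">
            <discover_datasets pattern="__designation_and_ext__" directory="results" recurse="true"/>
        </collection>
    </outputs>
</tool>
```

A dedicated Python runner, as in the crude example mentioned above, would replace the bare nextflow invocation here and build the full command line from the YAML.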

Main benefits:

  • Compared to running Nextflow on a command line, the results are identical, because Galaxy really does just run it on a command line.
  • When the NF workflow is updated, there is minimal effort to update the tool.

This approach has many, many problems and is rarely the right solution:

  • The tool runs the entire workflow as a single job, which:
    • misses out on all of Galaxy’s workflow management benefits;
    • requires the maximum RAM/CPU needed by any subworkflow (!) for the entire duration (see the job_conf sketch after this list);
    • depending on the workflow, may be a terrible waste.
  • The most tangible benefit is having the outputs in Galaxy for downstream processing.
  • On a private Galaxy, it may be a solution that requires far less skilled effort than a full conversion.
  • The computational inefficiency would be an unacceptable burden for public Galaxy servers.
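For a sense of what that resource point means in practice, here is an illustrative job_conf.xml excerpt, assuming a Slurm-backed Galaxy; the destination id, tool id, and numbers are all placeholders. The whole peak allocation is held for the full runtime, even while lightweight steps are running.

```xml
<!-- job_conf.xml excerpt (illustrative values): the wrapper must reserve
     the pipeline's PEAK resources for its entire runtime -->
<destinations>
    <destination id="nf_whole_pipeline" runner="slurm">
        <!-- sized for the hungriest subworkflow, held even during idle steps -->
        <param id="nativeSpecification">--mem=64G --cpus-per-task=16</param>
    </destination>
</destinations>
<tools>
    <tool id="nf_core_ampliseq_wrapper" destination="nf_whole_pipeline"/>
</tools>
```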