Editing a workflow to choose a different reference genome

Ahmadr215 · April 20, 2025, 12:54pm

I chose the short version of a Galaxy tutorial titled "Pathogen detection from (direct Nanopore) sequencing data using Galaxy - Foodborne Edition" (Hands-on: Pathogen detection from (direct Nanopore) sequencing data using Galaxy - Foodborne Edition / Microbiome).

I also watched the recorded YouTube video on this tutorial, and tried to reproduce what is presented in the tutorial and the YouTube outputs at 25 minutes of the video (short version) (https://youtu.be/rGP-BKYwUbc). After I imported “nanopore_preprocesssing.au”, I got an error message (see screenshot below). I am now totally lost because I could not reproduce what is shown in the workflow process. I also attempted to find the GTN 2024 history in the Public Histories on https://usegalaxy.org.au, but I could not locate GTN 2024 there either. I noticed that usegalaxy.eu was used in the YouTube video. I am not sure if the histories on Galaxy Australia are different from those on Galaxy EU and USA.

I am not sure when this tutorial was recorded, because some of the Galaxy features in the video differ slightly from what I can see in the current 2025 version of Galaxy. So, I am stuck and couldn’t continue the tutorial any further. I decided to share this post to see if someone can help me with instructions and tips to find my way. I am interested in pathogen genomics, and Galaxy Tutorials have been excellent sources for my learning process so far!!

Your assistance would be greatly appreciated!

Error message:

jennaj · April 21, 2025, 5:20pm

Hi @Ahmadr215

Thanks for sharing all of these details.

I just tried to replicate your steps, and yes, the workflow seems to work best on the UseGalaxy.eu server for now.

Whenever you run into a message like yours when importing a workflow, it means that the exact tools and reference data (and versions of those!) are not hosted at the server where you are working. Sometimes the difference is really small, and you can make a change yourself! But if there is a missing tool, you can ask about it. There might be a reason why one server hosts it and others do not. And if you were running your own Galaxy, all of this could be customized using administrator functions.

Let’s start from the top!

When running through a tutorial, all the details are in the card at the top. For a direct run, working at one of the servers known to support the exact steps is one choice. But you can also try the other public servers, especially if you have already run through it exactly and are now working with your own data, and are maybe a bit more comfortable with exploring.

How to find the Available at these Galaxies pull-down menu on any tutorial

For this one, the EU server is recommended, and the CZ and AU servers might work but will need a bit of technical manipulation on your part.

Importing the workflow

Go to the Workflows view, and use the pull-down menu to select the server.

Then let it process. It might automatically launch the workflow, but it is also common to get a warning. This means you’ll need to go into the Workflow editor to review what is needed. Use the link in the warning to do this.

Example message

The Nanopore Preprocessing workflow may contain tools which have changed since it was last saved or some other problems have been detected. Please click here to edit and review the issues before running this workflow.

Then, clicking on the link for “click here to edit and review the issues” takes you to the Workflow editor, where the pop-up card lists out the problems detected.

It looks like the workflow is expecting a chicken reference genome galGal6 but this server doesn’t host it.

The list of available native indexes is given, then the “automatic” correction of choosing the first in the list is noted. Now, the default genome is honey bee apiMel4!

That’s Ok. You at least know what you will need to do. The workflow needs a chicken reference genome for Minimap2. You can check to see if the genome is named a bit differently, or if there is a different version, make the change in the editor, save, and try a run. If the server doesn’t host it at all, you can supply it from the history as a Custom Reference Genome, and still use this workflow.
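If you prefer to check a workflow before opening the editor, here is a minimal sketch that scans an exported workflow file for steps that mention a given genome key. It assumes the usual exported layout (a `steps` mapping where each step stores its settings as a JSON string in `tool_state`). The example workflow and the `ref_file` parameter name are hypothetical and heavily trimmed, so treat this as an illustration rather than the exact structure of the tutorial workflow.

```python
import json


def steps_referencing(workflow: dict, genome_key: str) -> list:
    """Return (step id, step name) pairs whose saved tool state mentions genome_key."""
    hits = []
    for step_id, step in workflow.get("steps", {}).items():
        state = step.get("tool_state") or ""
        # tool_state is usually a JSON-encoded string; accept a dict as well.
        if not isinstance(state, str):
            state = json.dumps(state)
        # Simple substring check: good enough to point you at the step to review.
        if genome_key in state:
            hits.append((step_id, step.get("name", "")))
    return hits


# Hypothetical, trimmed-down export used only for illustration.
example_workflow = {
    "name": "Nanopore Preprocessing",
    "steps": {
        "0": {"name": "Input dataset collection", "tool_state": "{}"},
        "1": {
            "name": "Map with minimap2",
            "tool_state": json.dumps({"reference_source": {"ref_file": "galGal6"}}),
        },
    },
}

print(steps_referencing(example_workflow, "galGal6"))
```

With a real export you would load the file with `json.load` and pass the resulting dictionary in place of `example_workflow`. The steps it reports are the ones to review in the editor.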

Editing the workflow

In the editor, review the steps that the warnings were about.

For this workflow, when I went to the Minimap2 step, clicked on the tool, expanded the form, and checked under the reference genome setting, it looks like there are other versions of the chicken genome that you can use.

More about reference genomes in Galaxy

This is a good guide that explains the details using human examples, but all reference data works this way. There are assembly versions, and all the data associated is specific to that reference genome.

Reference genomes at public Galaxy servers: GRCh38/hg38 example

The chicken genome galGalN assemblies were sourced from UCSC just like the human example. If you are choosing galGal4 (the most current genome hosted at the AU server), you’ll need to make sure all of the data in your pipeline are also based on that assembly.

This particular set of workflows is using metagenomic reference data, so this might be the only change you need to make. But if the data involved chicken reference data like an annotation file, you would need to supply one from this same assembly version, instead of any included with the tutorial (since it would be based on a different assembly).
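To make the "same species, different assembly version" idea concrete, here is a small sketch that picks out hosted builds sharing the species prefix of the build a workflow expects. It assumes UCSC-style names (letters followed by a version number, like galGal6 or hg38). The `hosted` list is a made-up example, not the actual list from any server.

```python
import re

# UCSC-style build names: a species prefix followed by an assembly number.
BUILD_PATTERN = re.compile(r"([A-Za-z]+)(\d+)")


def same_species_builds(wanted: str, available: list) -> list:
    """List hosted builds with the same species prefix as `wanted`, highest number first."""
    match = BUILD_PATTERN.fullmatch(wanted)
    if match is None:
        return []
    prefix = match.group(1)
    found = []
    for build in available:
        candidate = BUILD_PATTERN.fullmatch(build)
        if candidate and candidate.group(1) == prefix:
            found.append((int(candidate.group(2)), build))
    return [build for _, build in sorted(found, reverse=True)]


# Hypothetical list of builds a server might offer.
hosted = ["apiMel4", "galGal3", "galGal4", "hg38"]

print(same_species_builds("galGal6", hosted))
```

Whichever build you pick, the point above still applies: any other reference data in the pipeline needs to be based on that same assembly.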

If you are new to workflows in Galaxy, this is a great simple place to start.

Hands-on: Galaxy Basics for everyone / Galaxy Basics for everyone / Introduction to Galaxy Analyses

More is covered in these tutorials. You can also search the training site with keywords like “workflow” to find FAQs and more. This forum can also be searched.

Using Galaxy and Managing your Data / Tutorial List

That is a lot of information! Please give this a try. It looks like there is only one change needed so far. You could also decide to work at the EU server with the tutorial data the first time through, learn a bit about how this is expected to work, then try at the AU server with your own data.

This was a good question! Please let us know if you need more help as you work through this. There might be more changes needed with the reference data but I can’t remember the exact details. If you ask about those as you run into them, we can reach out to the AU administrators and see what can be done.

Ahmadr215 · April 22, 2025, 5:52am

Hi Jenna

Thanks again for your comprehensive feedback!
I had done all the steps that you suggested until I received the warning message about the reference. When I clicked “continue”, as you mentioned, a Workflow chart was created using the default reference apiMel4, and then I stopped.

When I reached the “Editing the workflow” heading in your post, I couldn’t find the editor option in the workflow and was not sure how to open the screen (i.e., the right panel of your screenshot) that you sent me in your post (see below). Can you please help me with this?

I think I need to follow your advice and read/watch the tutorials that you suggested. I have already watched and practised a couple of these, but I need more exploration to find my way. I hope you don’t mind if I get back to you if I get stuck again, which I am sure I will!
Thanks again!

jennaj · April 22, 2025, 7:42pm

Glad this is helping!

To edit a workflow, we have a walk-through here → Hands-on: Creating, Editing and Importing Galaxy Workflows / Creating, Editing and Importing Galaxy Workflows / Using Galaxy and Managing your Data

The UI has been updated since then, and that tutorial does not include many screenshots anyway, but maybe I can help more here.

First, know that workflows have versions that you can navigate. So, don’t be afraid to click on items and experiment. You can navigate everything you have done under the Changes icon in the far left menu.

Then, to edit a tool, start by clicking on the box for that tool in the editor. This will bring up the same tool form that you use when working directly in the history. This is loaded at the right side of the view. Using the icons at the top of that pop-out, you can expand to get the view in my screenshot. All of the tool form options can be modified here. Once you like the changes, save the workflow and run it. And, I didn’t mention this originally but you can modify reference genome choices at the workflow runtime too, on the workflow launch form.

That’s a lot! Here are some screenshots with those options. Maybe this helps? Every workflow will be different of course, and tools themselves are different, but the basic steps for modifying the reference data are similar, as is the UI navigation.

Selecting a different reference genome in a workflow

Option 1: Modifying the reference genome used within the workflow in the editor (persistent changes).

Doing this means you only have to make the change once, since you will be saving the selection in your copy of the workflow.

Click on the Workflow icon in the far left Activity Bar to view your listing. Search on this view as needed. You can explore and import public workflows here too!

On the card for the workflow, choose the Edit button.

You can now make changes in your copy. Navigate by scrolling around, or using the lower right map, or even a web browser search as I did here (that is just a Chrome find search, Command-F, at the top in my screenshot).

Click on the box for the tool you wish to modify.

In general, it is a good idea to start with the warning items reported by the content checker when you imported. Address those, then make any other changes that you want to.

Use the minimize and maximize buttons in the top corner if wanted. This was the part you were stuck at, yes? It is brand new so will not be in the tutorials/videos yet.

Scroll down the form to the option you want to change.

Once all the edits are complete, use the Save + Exit button.

Note: if you try to navigate away before saving, Galaxy will present a pop-up window to alert you about any unsaved changes! You can choose to just exit, and whatever the original state was will be unchanged. You can also use the Changes function to navigate what the changes are for review or to undo anything not wanted. You can also use this to double check that what you changed was only what you wanted to change!

I boxed in two other helpful functions you can explore later: Best Practices and Reports.

You will now be back to the Workflow listing and exited from the editor.

From here, you can now try your updated workflow!

When using your own data with a new workflow, consider running a smaller representative dataset.

The tutorial data is one way to sort of “kick the tires” to flush out any problems.
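To show what "persistent" means for Option 1, here is a rough sketch of the same kind of change made on an exported workflow dictionary instead of in the editor: the genome value lives inside the workflow definition, so changing it once changes every later run. It assumes each step's `tool_state` is a JSON string. The example workflow and the `ref_file` parameter name are hypothetical. The editor remains the recommended route since it validates the choices against what the server hosts.

```python
import copy
import json


def replace_value(node, old, new):
    """Recursively replace any value equal to `old` with `new` in nested dicts/lists."""
    if isinstance(node, dict):
        return {key: replace_value(value, old, new) for key, value in node.items()}
    if isinstance(node, list):
        return [replace_value(value, old, new) for value in node]
    return new if node == old else node


def with_genome(workflow: dict, old: str, new: str) -> dict:
    """Return an edited copy of the workflow; the original is left untouched."""
    updated = copy.deepcopy(workflow)
    for step in updated.get("steps", {}).values():
        raw = step.get("tool_state")
        if isinstance(raw, str) and raw:
            state = json.loads(raw)
            step["tool_state"] = json.dumps(replace_value(state, old, new))
    return updated


# Hypothetical, trimmed-down workflow used only for illustration.
example_workflow = {
    "steps": {
        "0": {
            "name": "Map with minimap2",
            "tool_state": json.dumps({"reference_source": {"ref_file": "galGal6"}}),
        }
    }
}

edited = with_genome(example_workflow, "galGal6", "galGal4")
print(edited["steps"]["0"]["tool_state"])
```

Working on a copy mirrors the editor behavior described above: nothing changes in the original until you decide to save.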

Option 2: Modifying the reference genome used on the Workflow Launch view (runtime changes).

Doing this will mean that the changes are only applied for this specific workflow run.

Starting at the Workflow Launch view.

This is the same view where the first method ends, and where the Run button for a workflow will always start.

You will be selecting your workflow inputs on this view, but you can also explore the workflow itself, and any items that can be modified at runtime will be presented.

The full view is collapsed. Use the link at the bottom to expand the details.

Use the scroll bar to navigate down.

From here, you can expand the individual tool steps.

The tool we are interested in has the reference genome option presented at the top with an Edit button.

Clicking on Edit brings up the same tool form section we were using in the editor, and you can search and select the same way to apply changes.

The change is applied directly to this run (no “save” needed). You can also undo this if you want to start over.

Once you are done making changes, scroll back up to the top.

Select your inputs as usual. Some servers will offer more complex ways to organize the run, too. Some even offer different storage locations.

You can also do things like toggle the option to send the workflow outputs to a new history as I did in the screenshot. This puts all of the outputs and a clone of the inputs into a new history for you. I really recommend using this! It helps to keep things organized, but also can make any troubleshooting or changes easier to track.

The history will be named the same as the workflow by default but you can customize this. Everything will be timestamped as usual, along with all of the tool details, parameters, everything! You will be able to navigate the run in a History view, or under Workflow Invocations.

All the data will be exactly like working in a history directly. From here you can apply single tool runs, or use different tool runs, copy the data other places, download, etc. Nothing is static.

When you do want a persistent exact copy of the output, consider setting the history to a History → Archive status. This is a good choice for when the data will be shared in a publication or similar.

I hope this helps! The views here are newer so may not be captured exactly in the tutorials yet. Please let us know if this works or not for you!

Ahmadr215 · April 23, 2025, 3:55am

Dear Jenna

Thank you for the detailed instructions; this is great, and much appreciated! I will go through these to see if I can find my way. As you said, that’s a lot to absorb, but your screenshots and descriptions are very beneficial.
One more question: due to my unfamiliarity with Genomics and limited experience in this field, how can I select a reference genome instead of galGal6 in the workflow? Is there a guide or something to help the new users with this, or is this something that the operators (e.g., biologists, etc.) should have known?
Thanks again for your time and assistance with this!

jennaj · April 23, 2025, 4:40pm

Yes, @Ahmadr215, using a different species here is a scientific consideration. This pipeline is using a chicken genome since that was the host species. All of the galGalN assemblies are from UCSC. See that earlier link I posted about reference genomes, or just go to https://genome.ucsc.edu directly to review those, then review the reference guide once oriented since it will make more sense.

But not knowing that automatically just means you don’t know that part yet! The tutorial will explain what is happening in the Minimap2 step, with more in the citations at the end. Plus you can run a literature search. Any bioinformatics pipeline for a similar analysis will probably follow a similar analysis path: clean up the reads, remove the host, investigate the bugs! Some pipelines just identify, some also quantify. The exact tools used might be different, since some of this is a matter of preference, with multiple options for each generalized “analysis block”.

In summary: the mapping step is negatively filtering out the reads of the host species, leaving the microbial reads behind, then those are investigated through a metagenomic analysis process. You could technically choose any reference genome you want to. But switching the host species probably means other steps and even these mapping parameters might need to be tuned.
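As a conceptual sketch of that negative filter (not the actual tools the tutorial workflow uses), here is how "keep what did not map to the host" looks on plain SAM text. It relies on the SAM flag bit 0x4, which marks a read as unmapped. The three example lines are invented for illustration.

```python
UNMAPPED_FLAG = 0x4  # SAM flag bit meaning "segment unmapped"


def unmapped_read_names(sam_lines):
    """Return the names of reads that did not map to the reference (the host)."""
    names = []
    for line in sam_lines:
        # Skip blank lines and header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # second SAM column is the bitwise flag
        if flag & UNMAPPED_FLAG:
            names.append(fields[0])
    return names


# Invented example: one read maps to the host, one does not.
example_sam = [
    "@SQ\tSN:chr1\tLN:1000",
    "read_host\t0\tchr1\t100\t60\t8M\t*\t0\t0\tACGTACGT\t*",
    "read_microbe\t4\t*\t0\t0\t*\t*\t0\t0\tTTGACCAA\t*",
]

print(unmapped_read_names(example_sam))
```

The reads that survive this kind of filter are the ones passed on to the metagenomic steps, which is why the choice of host reference genome matters so much.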

So, whether this exact pipeline is appropriate or not for other hosts is a tricky question to answer! Maybe, but you’ll need to experiment to get it to work as expected. We might be able to help here with choosing the assembly to use. What is the host species? Where was the sample obtained (what tissue or sampling method)? Are the reads whole genome (DNA) ONT?

Ahmadr215 · April 23, 2025, 11:39pm

Hi Jenna
Thanks for your comments here; much appreciated! I started watching, reading, and practising the other tutorials that you recommended. These are great, particularly the tutorial on Workflows. After completing these, I will dive more into the genome reference. I also enrolled in the Galaxy Training Academy in May 2025. I hope this can help with my skill development in pathogen genomics.
