Workflow ending in multiple VCFs instead of one single VCF

I created a workflow:

The end of the workflow does not behave as expected. Freebayes produces a separate VCF for each sample, even though I selected “merge output VCFs”. How can this be resolved?

Hi @bejo

Using the tool in the history is a bit different from using it in a workflow for this kind of processing. In short, you need to “tell” the tool how to process groupings of data.

The solution is to use dataset collections to bundle the samples together. This defines the data as a group. Otherwise, each sample is streaming through tools independently, so there is nothing to merge at the last step.

Tools by default run each element (dataset) in the collection (folder) independently. This is what will happen at the upstream steps in your workflow.

Some tools also have an option to merge/group/pool all of the elements in the same collection. This is what will happen with Freebayes (if you set the merge option). This is what differs from using the tool directly: when used directly, you can specify multiple files as a “group” using the multi-select option, but to do this in a workflow, you’ll need a collection.
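To make the difference concrete, here is a small conceptual sketch in Python. It is not Galaxy or Freebayes code: `map_over`, `reduce_collection`, and `call_variants` are hypothetical stand-ins, and the file names are made up. It only illustrates why independent samples leave nothing to merge, while a collection can be handed to one tool run as a group.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a variant caller; NOT the real Freebayes interface.
def call_variants(bams: List[str]) -> str:
    return "VCF(" + "+".join(bams) + ")"

def map_over(tool: Callable[[List[str]], str], collection: Dict[str, str]) -> Dict[str, str]:
    """Default behaviour: one tool run per element, so N inputs give N outputs."""
    return {name: tool([data]) for name, data in collection.items()}

def reduce_collection(tool: Callable[[List[str]], str], collection: Dict[str, str]) -> str:
    """Merge behaviour: one tool run sees every element, so N inputs give 1 output."""
    return tool(list(collection.values()))

# A collection modelled as element name -> dataset (names are illustrative only).
samples = {"sampleA": "A.bam", "sampleB": "B.bam"}

per_sample = map_over(call_variants, samples)       # one VCF per sample
merged = reduce_collection(call_variants, samples)  # one VCF for the whole group

print(per_sample)
print(merged)
```

The first call mirrors what the upstream mapping steps do with each element; the second mirrors what the merge option does once Freebayes receives the whole collection.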

Please give that a try and see if it works! I think it will, but we can troubleshoot more if needed.

I made a collection in fastqsanger format. I tried to run it through my workflow, but the workflow does not “recognize” the collection. If I run BWA-MEM2 separately, though, the collection is recognized. That seems odd to me.
The attached picture shows the workflow, where I cannot select the collection.

Hi @bejo

If you change the way the starting data is organized, you’ll also need to update the workflow.

These are the two important places you’ll need to adjust:

  • Inputs → Change the workflow “inputs” to be the collection type (the folder option)
  • Noodles → Disconnect the noodle connections between all tools, then reconnect them in the order that the data will flow through the tools.

This resets important metadata, and should solve the problem. It will also allow the option for “merged VCFs” to see the collection once it reaches that step.
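If you ever launch the workflow through the Galaxy API instead of the web interface, the same distinction shows up in the input mapping. The sketch below is a minimal illustration under some assumptions: `collection_input` is a hypothetical helper, the input key `"0"` and the ID are placeholders, and `"hdca"` is the source type the API uses for a history dataset collection (as opposed to `"hda"` for a single dataset).

```python
def collection_input(input_key: str, collection_id: str) -> dict:
    """Build a workflow input mapping that points at a dataset collection.

    Hypothetical helper: 'hdca' marks a history dataset collection,
    whereas a single history dataset would use 'hda'.
    """
    return {input_key: {"src": "hdca", "id": collection_id}}

# Placeholder values; a real key and ID come from your own workflow and history.
inputs = collection_input("0", "HYPOTHETICAL_COLLECTION_ID")
print(inputs)

# With a client such as BioBlend, a mapping like this would typically be passed
# as the `inputs` argument when invoking a workflow (assumption: check the
# client documentation for the exact call in your version).
```

A workflow whose input is still declared as a single dataset expects the `"hda"` form, which is the API-side view of the same mismatch described above.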

Please give that a try.

Yes indeed, the noodles needed to be disconnected. I finally noticed the fork-like connection between the tools. I will move on now!

Hi @bejo, great that it worked!! Thanks for letting us know.