Splitting Paired End Sequences and MultiQC of Paired Sequences

luckeythesis · January 20, 2026, 12:15pm

Good day, I am a complete newbie to bioinformatics and am confused by the recent update to the fastq splitter: it no longer places the forward and reverse sequences in separate R1 and R2 collections. Instead, the forward and reverse sequences are now located within a sub-collection under each original sample’s name (see attached photo). As a result, it has become quite difficult to proceed to MultiQC. Despite there being many samples (each with different forward and reverse sequences), the report only shows two samples, named forward and reverse. I am also unable to see which specific sequences failed the FastQC analysis. Please help me.

jennaj · January 20, 2026, 8:00pm

Welcome @luckeythesis

We can probably help!

A paired end collection folder is organized by sample – we call this a List of Pairs collection shape. This can be converted to a simple listing that MultiQC can understand – a List shape – by using the tool Unzip collection. The reverse can be done by using Zip collections.
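To make the two shapes concrete, here is a toy Python model of them. It is only a sketch: plain dictionaries stand in for Galaxy collections, the sample and file names are made up, and `unzip` and `zip_lists` are hypothetical helpers that mimic the idea of the Unzip and Zip tools rather than calling Galaxy itself.

```python
# Toy model only: dicts stand in for Galaxy collections.
# Sample names and file names are hypothetical.
list_of_pairs = {
    "sampleA": {"forward": "sampleA_R1.fastq", "reverse": "sampleA_R2.fastq"},
    "sampleB": {"forward": "sampleB_R1.fastq", "reverse": "sampleB_R2.fastq"},
}


def unzip(pairs):
    """Split a List of Pairs into two Lists, both keyed by sample name."""
    forward = {name: pair["forward"] for name, pair in pairs.items()}
    reverse = {name: pair["reverse"] for name, pair in pairs.items()}
    return forward, reverse


def zip_lists(forward, reverse):
    """Rebuild a List of Pairs from two Lists that share the same sample names."""
    if forward.keys() != reverse.keys():
        raise ValueError("forward and reverse lists must contain the same samples")
    return {
        name: {"forward": forward[name], "reverse": reverse[name]}
        for name in forward
    }


forward_list, reverse_list = unzip(list_of_pairs)
```

In this model both resulting lists use the same sample names as their labels, so a tool that identifies samples by name alone would see each name twice.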

However, MultiQC has an extra wrinkle. It assumes the file name is the sample name. We couldn’t reprogram around this requirement when wrapping the tool for Galaxy (we always use the original underlying tool then add nice stuff around it!), so what is usually done is to use the Flatten Collection tool instead of Unzip. This adds a _forward and _reverse label to the sample names (collection identifiers). The result is a listing of files, all with a unique sample name, that MultiQC knows how to parse and group by the primary (original) sample label, i.e. the paired information isn’t lost.
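The effect of flattening on the sample names can be sketched in a few lines of Python. This is an illustration under assumptions, not how Galaxy or MultiQC is implemented: the dictionary stands in for a List of Pairs collection, the sample and file names are invented, and `flatten` and `primary_sample` are hypothetical helpers.

```python
# Toy model only: a dict stands in for a List of Pairs collection.
# Sample names and file names are hypothetical.
list_of_pairs = {
    "sampleA": {"forward": "sampleA_R1.fastq", "reverse": "sampleA_R2.fastq"},
    "sampleB": {"forward": "sampleB_R1.fastq", "reverse": "sampleB_R2.fastq"},
}


def flatten(pairs, sep="_"):
    """Turn a List of Pairs into one flat List with unique identifiers."""
    flat = {}
    for sample, pair in pairs.items():
        for direction in ("forward", "reverse"):
            # Join the sample name and the read direction into one label.
            flat[f"{sample}{sep}{direction}"] = pair[direction]
    return flat


def primary_sample(identifier, sep="_"):
    """Recover the original sample label by dropping the trailing direction."""
    return identifier.rsplit(sep, 1)[0]


flat_list = flatten(list_of_pairs)
print(sorted(flat_list))
# ['sampleA_forward', 'sampleA_reverse', 'sampleB_forward', 'sampleB_reverse']
```

Every file ends up with its own label, and the original sample name can still be read back from the front of each label, which is why the pairing is not lost.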

Many more details are in this topic if you are curious about the why (sample label collisions happen outside of Galaxy too!). → Quality Control Start Here! multQC issue and guidance?

The most important part of that topic is an example of these manipulations in a workflow!

That example workflow is also in the Workflows → Public Workflows tab at the UseGalaxy servers. And I have a workflow invocation here that makes it easy to see what it does with some sample data. → https://usegalaxy.org/workflows/invocations/dd0fa32840a4c0d5

I included a CutAdapt run with FastQC runs (both before and after), then bundled all three reports together into MultiQC. You can grab a copy of that workflow, make any changes you want, or just give it a quick look to see how the data is reshaped to go through different tools. The flattening step can follow any of several tools that create a List of Pairs type of collection. I haven’t tested directly with the FastQ Splitter tool, so if you have trouble, maybe screenshot what is going on and I’ll try to reproduce and suggest more things to try!

Please give that a review and let us know if it helps or if you have any follow-up questions!

luckeythesis · January 26, 2026, 8:03am

Hello,

Thank you so much! I’ll try it later and see if it works. I’ll update on any changes.

luckeythesis · February 1, 2026, 7:44am

Hello, this worked! Thank you again!

jennaj · February 3, 2026, 1:34am

Great! Thanks for letting us know!