MultiQC not working correctly

stealsh · January 24, 2022, 4:12pm

I have 24 untrimmed fastq files for RNA-seq. They are paired files. I have them in a data collection and ran FastQC on them with no problem, but when I try to aggregate the FastQC results with MultiQC it only reports two files, forward and reverse, instead of each of the 24 individual files. What am I doing wrong? I’ve followed the tutorials and have set all parameters per those tutorials but keep getting the same result.

jennaj · January 25, 2022, 2:12am

@stealsh

I’ve seen this before, too, for about the last 4-5 months, since collections changed a bit in the 21.09 release. These two tools don’t work in series the way they used to.

Details: The problem comes from the way the data is organized and where the sample names are derived from in a nested collection. The datasets are named the same at the top level of the nested structure: one “forward” and one “reverse”. The actual sample names are one level deeper.
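To illustrate the name collision described above, here is a minimal sketch in plain Python. It is not Galaxy or MultiQC code: the sample names and the `group_by_label` helper are hypothetical, and the only assumption is that a report keyed on the shared "forward"/"reverse" label cannot tell the samples apart.

```python
from collections import defaultdict

# Hypothetical stand-in for a collection of paired reads: every dataset
# carries a sample identifier plus a read-direction label.
samples = [f"sample{i:02d}" for i in range(1, 13)]
datasets = [(s, d) for s in samples for d in ("forward", "reverse")]


def group_by_label(items):
    """Group datasets by the direction label only, ignoring the sample name."""
    rows = defaultdict(list)
    for sample, direction in items:
        rows[direction].append(sample)
    return dict(rows)


report = group_by_label(datasets)
print(len(datasets))   # 24 datasets go in
print(sorted(report))  # only two distinct names come out
```

Under that assumption, 24 inputs collapse into two report rows, which matches the symptom in the original question.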

I couldn’t figure out how to solve it before, gave up, and no one else reported the issue. Will create the test again and ask others to review it. There is probably a solution, and it would probably involve organizing the collection differently. “Flatten collection” was one option I reviewed but that didn’t produce the MultiQC output properly either (forward and reverse from the same sample had the same “identifier” that MultiQC was interpreting instead, so again there was data loss from common naming). “Rename collections” was problematic, too, but I forget why.

If there isn’t a good workaround, I will open up a ticket. In either case, expect another reply tomorrow with an update. The FastQC tool itself might need a change, or maybe MultiQC (although that tool is trickier to change).

Meanwhile, one of these might work, and probably only the latter:

Expand the collection and drag and drop the datasets from inside to the MultiQC tool input. This involves a LOT of clicking.
Or, unhide the datasets in your history, then multi-select those for the input. I think this worked only when all forward were combined, then all reverse, but not together. Be warned that this will create a lot of clutter in the history. Maybe copy just the FastQC output into a new, separate history and try it there, so any tests are easier to get rid of.

Thanks for reporting this! And @gbbio if you can think of a way to do this, feel free to add more to our replies. It is easily replicated: put any two pairs in a collection then run FastQC > MultiQC. MultiQC is only able to report back one pair, not both, no matter how the collection is arranged. I guess one option is to create some new collections just for input to MultiQC, but that doesn’t combine by sample ID. Maybe I missed something obvious that fresh eyes will find.

stealsh · January 25, 2022, 2:23pm

Hi,

Thanks for the detailed reply. Yes, I see now that all the files, when paired, were named either forward or reverse. I will try the workarounds later today, when I have some free time.

stealsh · January 25, 2022, 2:54pm

Update: I attempted both of the suggested workarounds but got the same result, with MultiQC only outputting forward and reverse.

jennaj · January 26, 2022, 12:01am

Hi @stealsh

Ok, it was worth trying. These two specific tools won’t work together for now when inputting the collection as a whole.

This will need a ticket. I’ll get to it this week and post the link back here for reference/tracking. I can’t estimate how long it will take for the review and the actual change to make it back to the server, so don’t wait for that. The individual FastQC reports can be reviewed as an alternative for now.

Thanks for reporting the problem and so sorry there isn’t some easier or immediate solution.

Update

Looks like this is a known issue still pending a correction: MultiQC - Use "element_identifier" as "sample name" for all tools (Issue #1595, galaxyproject/tools-iuc on GitHub)

igor · February 15, 2022, 7:05am

Hi @stealsh
Try “Flatten collection” from the Collection Operations section on the collection of paired reads before the FastQC step. Datasets in the flattened collection have unique names.
Hope this helps.
Kind regards,
Igor
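As a rough picture of why flattening before FastQC helps, the sketch below models a nested collection as a plain Python dict and joins the two identifier levels into one name. The `flatten` function, the underscore separator, and the file names are assumptions made for illustration, not Galaxy's actual implementation.

```python
def flatten(nested, sep="_"):
    """Turn {sample: {direction: dataset}} into a flat {name: dataset} mapping.

    Each flat name combines both identifier levels, so no two datasets
    end up sharing a name.
    """
    flat = {}
    for sample, pair in nested.items():
        for direction, dataset in pair.items():
            flat[f"{sample}{sep}{direction}"] = dataset
    return flat


# Hypothetical two-sample paired collection.
nested = {
    "sampleA": {"forward": "A_R1.fastq", "reverse": "A_R2.fastq"},
    "sampleB": {"forward": "B_R1.fastq", "reverse": "B_R2.fastq"},
}

flat = flatten(nested)
print(list(flat))
# ['sampleA_forward', 'sampleA_reverse', 'sampleB_forward', 'sampleB_reverse']
```

With one unique name per dataset, a downstream report has a distinct label for every file instead of only "forward" and "reverse".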

stealsh · March 11, 2022, 8:28pm

Hi Igor,

Thanks for pointing this out. It works perfectly!

Steve
