FastQC per tile sequence quality not being generated

jennaj · February 4, 2025, 7:41pm

Great, thanks for sharing your history and more details about your question.

If you scroll down on the Galaxy form you’ll find links to the original author’s documentation. This is an especially good resource for a tool like this one! Each module is covered in their docs. This is the one for the module you are asking about:

Per Tile Sequence Quality

Here is the quote from the top of that page:

Per Tile Sequence Quality

Summary

This graph will only appear in your analysis results if you’re using an Illumina library which retains its original sequence identifiers. Encoded in these is the flowcell tile from which each read came. The graph allows you to look at the quality scores from each tile across all of your bases to see if there was a loss in quality associated with only one part of the flowcell.

Compare that to the format of the reads you have: they carry the standardized identifiers assigned by the SRA, which do not include that flowcell tile information. That is why FastQC skips the per tile plot for your data.
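If you want to check a read header yourself, here is a minimal Python sketch. It assumes the common colon-separated Illumina identifier layouts (seven fields with the tile in the fifth position, or the older five-field style with the tile in the third position). The function name and the example headers are hypothetical placeholders, and this is only an approximation of the idea, not FastQC's own code.

```python
from typing import Optional


def tile_from_header(header: str) -> Optional[int]:
    """Return the flowcell tile number if the header looks like an
    original Illumina identifier, otherwise None."""
    # Keep only the read ID: drop the leading '@' and any trailing comment.
    parts = header.lstrip("@").split()
    if not parts:
        return None
    fields = parts[0].split(":")

    if len(fields) >= 7:
        # Assumed layout: instrument:run:flowcell:lane:tile:x:y
        candidate = fields[4]
    elif len(fields) >= 5:
        # Assumed older layout: instrument:lane:tile:x:y
        candidate = fields[2]
    else:
        # Too few fields, e.g. an SRA-standardized identifier.
        return None

    return int(candidate) if candidate.isdigit() else None


if __name__ == "__main__":
    # Placeholder headers for illustration only, not taken from your history.
    examples = [
        "@M00000:1:000000000-AAAAA:1:1101:15589:1332 1:N:0:1",
        "@SRR0000000.1 1 length=75",
    ]
    for line in examples:
        print(line, "->", tile_from_header(line))
```

A header that returns `None` here has no tile field to read, which matches the situation with SRA-standardized identifiers.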

It seems that this study’s original-format reads are not available for direct download but can be requested (as far as I know, this is the same for all GEO data!). See SRA detail pages like this one for more details: SRA Archive: NCBI. Galaxy offers pay-for-use deployment variations that can make use of S3 buckets, for example: AnVIL - Galaxy Community Hub

One final suggestion: consider organizing your data into collections and running your analysis in batches, instead of working with loose files. You can retrieve paired end read data with just a list of accessions and have the output placed into collection folders with a tool like:

Faster Download and Extract Reads in FASTQ format from NCBI SRA (link at ORG)
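If your accession list comes from a spreadsheet or a copy and paste, a quick cleanup before loading it into Galaxy can save a failed job. The sketch below is only an illustration: it assumes the tool wants one run accession per line, and that run accessions look like `SRR`, `ERR`, or `DRR` followed by digits. The function name and the example values are hypothetical placeholders.

```python
import re
from typing import Iterable, List, Tuple

# Assumed pattern for run accessions: SRR/ERR/DRR followed by digits.
RUN_ACCESSION = re.compile(r"[SED]RR\d+")


def clean_accessions(raw: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split raw entries into (valid, rejected), removing blanks and
    duplicates while keeping the original order."""
    valid: List[str] = []
    rejected: List[str] = []
    seen = set()
    for entry in raw:
        acc = entry.strip().upper()
        if not acc:
            continue  # skip blank lines
        if not RUN_ACCESSION.fullmatch(acc):
            rejected.append(entry.strip())
        elif acc not in seen:
            seen.add(acc)
            valid.append(acc)
    return valid, rejected


if __name__ == "__main__":
    # Placeholder values for illustration only.
    pasted = ["SRR0000001", " srr0000002 ", "SRR0000001", "", "GSM12345"]
    valid, rejected = clean_accessions(pasted)
    print("\n".join(valid))          # one accession per line, ready to paste
    print("Rejected:", rejected)     # entries that are not run accessions
```

Anything in the rejected list (a GEO sample ID, for example) would need to be mapped to its run accessions first.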

All SRA data will have extended metadata you can use for even more complex data loading options. See the bottom of the Faster form for links to tutorials with examples. You can also see:

Using Galaxy and Managing your Data / Tutorial List → Uploading Data

Hope this helps!