FastQC per tile sequence quality not being generated

I am running FastQC, but the per tile sequence quality module is not being generated. Could anyone tell me why that is?

Hi @Krutika_Sadadekar

This is the topic we can use to sort out these issues.

For each problem with FastQC and MultiQC you can post back a share link to the history with that error and we can help to troubleshoot it here. That link will include all the details and is very easy to generate. You can post the link back here in a reply, and include a few details like which dataset we should look at, any error messages you found and how you tried to address them, or if you are not sure where to look you can state that and we’ll explain. You can unshare after we are done. See details here → How to get faster help with your question

If you don’t want to post back the share link for some reason, you can capture all of the details of your run and we can try to troubleshoot that way instead. For each problem, we’ll need the following: server URL (this is at the top of your browser window), then everything on the job details view (i-icon). Fully expand and copy/paste the peek view of the input and output datasets, the job parameters, and the contents of all the log sections.

If you are already following a tutorial, you can also link that back here for more context.

Hopefully we can solve these, and if you are able to solve it first let us know, thanks! :slight_smile:

Dear Galaxy

This is the link to my history: https://usegalaxy.org/u/krutika_1808/h/gse137418. I am trying to do an RNA-seq analysis for GSE137418. Using the SRA Run Selector, I loaded all the runs in the GSE study by pasting the SRR accession of each run from the ENA database into Galaxy. When I run FastQC, I get all the statistics except per tile sequence quality.

Regards
Krutika

Hi @Krutika_Sadadekar

Great, thanks for sharing your history and more details about your question.

If you scroll down on the Galaxy form you’ll find links to the original author’s documentation. This is an especially good resource for a tool like this one! Each module is covered in their docs. This is the one for the module you are asking about:


Quote from the top

Per Tile Sequence Quality

Summary

This graph will only appear in your analysis results if you’re using an Illumina library which retains its original sequence identifiers. Encoded in these is the flowcell tile from which each read came. The graph allows you to look at the quality scores from each tile across all of your bases to see if there was a loss in quality associated with only one part of the flowcell.


Comparing that to the format of the reads you have: notice that you have standardized identifiers given by the SRA, without that flowcell information.
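To make the difference concrete, here is a minimal Python sketch of how a tile number could be pulled out of a read identifier. It is only a rough heuristic and not FastQC's actual code: the function name `extract_tile` and the example headers are made up for illustration, and it assumes the two common Illumina identifier layouts (seven colon-separated fields with the tile in the fifth, or five fields with the tile in the third).

```python
def extract_tile(header):
    """Return the flowcell tile number from a FASTQ header, or None.

    Assumption: only the first whitespace-delimited token is inspected,
    and it follows one of two common Illumina layouts:
      instrument:run:flowcell:lane:tile:x:y   (7 fields, tile is field 5)
      instrument:lane:tile:x:y                (5 fields, tile is field 3)
    """
    if header.startswith("@"):
        header = header[1:]
    tokens = header.split()
    if not tokens:
        return None
    fields = tokens[0].split(":")
    if len(fields) >= 7:
        candidate = fields[4]
    elif len(fields) >= 5:
        candidate = fields[2]
    else:
        # No colon-separated run information, e.g. an SRA-style identifier
        return None
    return int(candidate) if candidate.isdigit() else None


# Hypothetical headers, for illustration only
illumina_new = "@SIM:1:FCX:1:15:6329:1045 1:N:0:2"
illumina_old = "@HWUSI-EAS100R:6:73:941:1973#0/1"
sra_style = "@SRR0000001.1 1 length=76"

for h in (illumina_new, illumina_old, sra_style):
    print(h, "->", extract_tile(h))
```

With an SRA-style identifier there is nothing to parse a tile from, which is why a per tile plot cannot be drawn for those reads.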

It seems that this study’s original reads are not automatically available but can be requested (all of GEO is the same as far as I know!). See SRA detail pages like this one for more details: SRA Archive: NCBI. Galaxy offers pay-for-use deployment variations that make use of S3 buckets, example: AnVIL - Galaxy Community Hub

And, one final piece of help would be to consider running your analysis with batches of collections instead of loose files. You can retrieve paired end read data with just a list of accessions and have the output placed into collection folders with a tool like:

  • Faster Download and Extract Reads in FASTQ format from NCBI SRA (link at ORG)

All SRA data will have extended metadata you can use for even more complex data loading options. See the bottom of the Faster form for links to tutorials with examples.
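If you export the metadata table from the SRA Run Selector, a short script can turn it into the plain list of accessions (one per line) to use as input for the download tool. This is just a sketch under assumptions: it assumes the table is comma-separated with a column named `Run`, and the function name and example rows are hypothetical.

```python
import csv
import io


def accessions_from_run_table(csv_text, column="Run"):
    """Collect unique run accessions from a metadata table, keeping order.

    Assumption: the table is comma-separated and the accession column
    is named "Run"; adjust `column` if your export differs.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()
    accessions = []
    for row in reader:
        acc = (row.get(column) or "").strip()
        if acc and acc not in seen:
            seen.add(acc)
            accessions.append(acc)
    return accessions


# Made-up table content, for illustration only
example = (
    "Run,LibraryLayout\n"
    "SRR0000001,PAIRED\n"
    "SRR0000002,PAIRED\n"
    "SRR0000001,PAIRED\n"
)

# One accession per line, ready to paste or upload as a text dataset
accession_list = "\n".join(accessions_from_run_table(example))
print(accession_list)
```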

Hope this helps! :slight_smile:

Thank you so much!