Assistance with Filtlong Tool for Quality Filtering of FastQ Data

ge96dah · November 14, 2023, 3:42pm

Hello forum members,

I’m encountering an issue while attempting to filter my FastQ data based on a mean quality score greater than or equal to 20 using the Filtlong tool. Despite applying the tool, the FastQC report shows no apparent changes, and upon closer inspection, it seems that no reads were filtered at all.
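As an independent cross-check of whether any reads fall below the threshold, the per-read mean quality can be computed directly from the FASTQ file. The following is a minimal sketch, not Filtlong's method: it assumes four-line FASTQ records, Sanger (Phred+33) encoding, and a plain arithmetic mean of the Phred scores. The function names `mean_phred` and `count_passing` are made up for illustration. Filtlong may define mean quality differently, so its tool help is the place to confirm what its threshold value means.

```python
import io


def mean_phred(quality_line, offset=33):
    """Arithmetic mean of the Phred scores in one quality string.

    Assumes Sanger encoding (ASCII offset 33) unless told otherwise.
    """
    scores = [ord(ch) - offset for ch in quality_line]
    return sum(scores) / len(scores) if scores else 0.0


def count_passing(handle, threshold=20.0):
    """Return (passing, total) for four-line FASTQ records in a text handle."""
    passing = 0
    total = 0
    while True:
        header = handle.readline()
        if not header.strip():
            break  # end of file (or a blank line) ends the scan
        handle.readline()  # sequence line, not needed here
        handle.readline()  # '+' separator line
        quality = handle.readline().rstrip("\r\n")
        total += 1
        if mean_phred(quality) >= threshold:
            passing += 1
    return passing, total


# Tiny made-up example: 'I' is Phred 40 and '#' is Phred 2 under offset 33.
example = "@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\n####\n"
passing, total = count_passing(io.StringIO(example), threshold=20.0)
print(f"{passing} of {total} reads have mean quality >= 20")
```

For a real file, the same function accepts any text handle, for example one returned by `gzip.open(path, "rt")` for a .fastq.gz file. If every read passes, a filter at this threshold would be expected to leave the data unchanged.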

To provide a clearer picture, I’ve attached two screenshots—one with the original data and another after applying the Filtlong tool. I’ve chosen Filtlong for filtering as my output data files are in .fastq or .fastq.gz format, and unfortunately, they are not in fastqsanger or similar formats compatible with other filtering tools.

Could someone please help me understand what might be going wrong or if there’s a step I’m overlooking in the process? Any insights or guidance on using Filtlong effectively in this context would be greatly appreciated.

Thank you in advance for your assistance!

jennaj · November 14, 2023, 6:04pm

Hi @ge96dah

What format are your data in? What does FastQC report at the top of the report for the quality score scaling?

You could also try the auto-detect datatype function (see "Detecting the datatype (file format)"). Maybe the quality score scaling is already fastqsanger. The Upload tool also detects the datatype (try using all defaults during Upload, especially for read data). If Galaxy guesses wrong, that is usually an important clue about a format or content problem.

I’m also wondering how the data were input to the tool. “Dragging and dropping” is never recommended. The tool expects input with the fastqsanger datatype.

Most FASTQ data use that same quality score scaling now, even long reads.

Related Q&A: "fastq unavailable -- Tool does not recognize inputs? How to check why" (#2 by jennaj)

ge96dah · November 14, 2023, 9:34pm

Hello @jennaj,

The input is Nanopore sequencing data. This is usually a .fastq.gz file but I am able to unzip it with Python before uploading the data to Galaxy. The FastQC report says that it has Sanger / Illumina 1.9 scaling.

Update:
I tried auto-detect again, and Galaxy still detects the datatype as fastq.gz. Should I assign a new datatype?
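The Sanger / Illumina 1.9 scaling that FastQC reports can also be cross-checked outside Galaxy by looking at the range of characters in the quality lines. This is a rough sketch under stated assumptions: four-line FASTQ records, and the common heuristic that any quality character with an ASCII code below 59 can only occur in Phred+33 (Sanger) data. The helper names `open_fastq`, `quality_range`, and `looks_like_sanger` are hypothetical.

```python
import gzip
import io


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file as text."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def quality_range(handle, max_reads=10000):
    """Return (lowest, highest) ASCII code seen in the quality lines.

    Only the first max_reads records are scanned. Returns None when no
    quality characters were found.
    """
    lowest = None
    highest = None
    for index, line in enumerate(handle):
        if index // 4 >= max_reads:
            break
        if index % 4 != 3:
            continue  # only every fourth line holds quality characters
        codes = [ord(ch) for ch in line.rstrip("\r\n")]
        if not codes:
            continue
        lowest = min(codes) if lowest is None else min(lowest, min(codes))
        highest = max(codes) if highest is None else max(highest, max(codes))
    if lowest is None:
        return None
    return lowest, highest


def looks_like_sanger(lowest_code):
    """Codes below 59 do not occur in the older +64 encodings."""
    return lowest_code < 59


# Made-up record: '!' is code 33, 'I' is code 73.
demo = io.StringIO("@read1\nACGT\n+\n!I5#\n")
lowest, highest = quality_range(demo)
print(lowest, highest, looks_like_sanger(lowest))
```

With a real file, `quality_range(open_fastq("reads.fastq.gz"))` would scan the compressed data directly, so unzipping beforehand is not needed for this check; the file name here is only a placeholder.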

jennaj · November 14, 2023, 11:04pm

Hi @ge96dah

That scaling is fastqsanger. Redetecting the datatype should work fine.

Am I missing something? Did you try that already?

Update:

Maybe the "Quality Control" tutorial will help? It covers nanopore reads. If nothing else, you could try comparing the methods now that the datatype issue seems to be resolved.
