choosing the right QC option

Netanel_Cohen · November 1, 2022, 9:57am

Hello.
I’m attaching a snapshot of the FastQC output for one sample. I’m debating how to QC this and similar samples. On the one hand, it looks alright as is; on the other hand, I sent it to a bioinformatician, who did the following:
Trimming and filtering of raw reads

Raw reads (fastq files) were inspected for quality issues with FastQC. Following that, the reads were quality-trimmed with cutadapt, using a quality threshold of 32 for both ends. Poly-G sequences (NextSeq’s no signal) and adapter sequences were removed from the 3’ end, and poly-T stretches were removed from the 5’ end (being the reverse-complement of poly-A tails). The cutadapt parameters included using a minimal overlap of 1, allowing for read wildcards, and filtering out reads that became shorter than 15 nt. Finally, low-quality reads were filtered out using fastq_quality_filter, with a quality threshold of 20 at 90 percent or more of the read’s positions.
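
To make the last filtering criterion concrete, here is a minimal Python sketch of the rule "quality of at least 20 at 90 percent or more of the read's positions". It is not the fastq_quality_filter source code. The function name `passes_quality_filter` is made up for illustration, and it assumes Phred+33 encoded quality strings and that both thresholds are inclusive.

```python
def passes_quality_filter(quality_string, min_quality=20, min_percent=90,
                          phred_offset=33):
    """Return True if enough positions of a read reach the quality threshold.

    Assumptions (not taken from the tool's source code):
    - quality_string is the 4th line of a FASTQ record, Phred+33 encoded
    - a base counts as "good" when its score is >= min_quality
    - the read is kept when the share of good bases is >= min_percent
    """
    if not quality_string:
        # An empty read has no positions to evaluate, so it is discarded.
        return False
    good = sum(1 for ch in quality_string
               if ord(ch) - phred_offset >= min_quality)
    # Integer comparison avoids floating-point rounding at the boundary.
    return good * 100 >= min_percent * len(quality_string)


# 'I' encodes Phred 40 and '#' encodes Phred 2 in Phred+33.
print(passes_quality_filter("IIIIIIIII#"))  # 9 of 10 positions are good
print(passes_quality_filter("IIIIIIII##"))  # 8 of 10 positions are good
```

With the thresholds above, the first read is kept (exactly 90 percent of positions reach Q20) and the second is discarded (80 percent).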

His QC caused the loss of 16% of the reads.

I also tried fastp with default settings, which discarded only 1.2% of the reads…

I would be happy to hear your input.

Thank you
Netanel

igor · November 23, 2022, 12:02pm

Hi @Netanel_Cohen,
The stringency of trimming depends on the downstream analysis, so either take the settings from a recent paper with an analysis similar to yours, or test different trimming setups on one sample or a small number of samples. The number of reads is not really relevant without information on the project. 47M reads looks like overkill for differential gene expression in bacteria but might not be sufficient for whole-genome variant calling, and excessive data might result in poor quality in some projects.
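
A minimal Python sketch of how read loss could be tabulated when testing several trimming setups on one sample. The function names, setup labels, and read counts below are hypothetical and only chosen to reproduce the 16% and 1.2% figures from the question; real counts would come from the FASTQ files before and after each setup.

```python
def percent_lost(reads_before, reads_after):
    """Percentage of reads discarded by a trimming/filtering setup."""
    if reads_before <= 0:
        raise ValueError("reads_before must be positive")
    if not 0 <= reads_after <= reads_before:
        raise ValueError("reads_after must be between 0 and reads_before")
    return 100.0 * (reads_before - reads_after) / reads_before


def compare_setups(reads_before, reads_after_by_setup):
    """Map each setup name to the percentage of reads it discarded."""
    return {name: round(percent_lost(reads_before, after), 2)
            for name, after in reads_after_by_setup.items()}


# Hypothetical counts for a 1000-read test subsample.
losses = compare_setups(1000, {
    "cutadapt + fastq_quality_filter": 840,
    "fastp defaults": 988,
})
for name, lost in losses.items():
    print(f"{name}: {lost}% of reads lost")
```

The loss percentage alone does not say which setup is better; it only shows how much each one removes, which then has to be weighed against what the downstream analysis needs.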
Kind regards,
Igor
