I used NGS: RSeQC - Gene Body Coverage (BAM) Read coverage over gene body. (Galaxy Version 220.127.116.11) - Input: bam, Reference gene model: bed12.
The input is the BAM result of a WGS 2x75 run on a NextSeq; the reference was hg38.
Then for lib#261, I got this significant drop at the 50th percentile.
Can anyone explain it to me?
Or is there a better tool for checking coverage uniformity?
Thanks for clarifying the usage.
X-axis = Gene Body Percentile (5’ > 3’) – each transcript region normalized to a length of 100
Y-axis = Coverage – read coverage counts per transcript region
This is a graph of the RSeQC txt output, an alternative view of the
What this means: The WGS reads are not mapping well to the middle of transcripts. There is a 3’/5’ bias, with some type of problem at the very end of the 3’ UTR (sequencing artifact? library construction bias?).
Some things to check:
The sample size is pretty small and might be biased. Try checking a random sample (or all) of the properly paired reads with a MAPQ of at least 30.
Does the BED12 represent a UCSC “Genes and Gene Predictions” track that is complete (full gene)? RefSeq and Ensembl are good choices. Avoid GenBank’s “All mRNA” and other types of fragmented/high duplication tracks.
WGS data needs to be mapped with an unspliced mapping tool. Choices can include BWA/BWA-MEM and Bowtie2. Avoid spliced mapping tools – those are for spliced data (e.g. RNA-seq).
You might want to run FastQC on the original fastqsanger datasets to find out about artifact and other sequence problems that may be present.
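To make the first check above concrete, here is a minimal sketch of the filter in plain Python, working on SAM-formatted text lines. The function names `keep_read` and `filter_sam` are made up for illustration, and the sketch assumes "MAPQ of 30" means a minimum of 30. On the command line, `samtools view -f 2 -q 30` applies the same two criteria directly to a BAM.

```python
PROPER_PAIR = 0x2  # SAM flag bit: read is mapped in a proper pair


def keep_read(sam_line, min_mapq=30):
    """Return True if the alignment is properly paired and has MAPQ >= min_mapq."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # column 2 of a SAM record: bitwise FLAG
    mapq = int(fields[4])  # column 5 of a SAM record: mapping quality
    return bool(flag & PROPER_PAIR) and mapq >= min_mapq


def filter_sam(lines, min_mapq=30):
    """Yield header lines unchanged and only the alignments that pass keep_read."""
    for line in lines:
        if line.startswith("@") or keep_read(line, min_mapq):
            yield line
```

Usage would be along the lines of `list(filter_sam(open("sample.sam")))`, where `sample.sam` is a placeholder file name. Re-running the coverage tool on the filtered reads shows whether the dip survives once low-confidence alignments are removed.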
Hope that helps!
Thanks jennaj for the explanation.
For whole genome sequencing data, as is the case here, what is segmented into 100 sections by RSeQC?
These are the transcripts defined by the BED12 dataset.
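As a rough illustration of that segmentation, the sketch below turns one BED12 line into 100 sampling positions spread evenly along the transcript's exon blocks, ordered 5’ to 3’. This is a simplified approximation written for this thread, not RSeQC's actual code: the function names are hypothetical, and using only exon blocks and skipping transcripts shorter than 100 bases are assumptions.

```python
def exon_positions(bed12_line):
    """Return (chrom, positions) covering every exonic base, ordered 5' to 3'."""
    f = bed12_line.rstrip("\n").split("\t")
    chrom, tx_start, strand = f[0], int(f[1]), f[5]
    # BED12 columns 11 and 12: comma-separated block sizes and block starts
    sizes = [int(x) for x in f[10].rstrip(",").split(",")]
    offsets = [int(x) for x in f[11].rstrip(",").split(",")]
    positions = []
    for size, offset in zip(sizes, offsets):
        begin = tx_start + offset  # block starts are relative to chromStart
        positions.extend(range(begin, begin + size))
    if strand == "-":
        positions.reverse()  # minus-strand transcripts run the other way
    return chrom, positions


def percentile_positions(bed12_line, n_bins=100):
    """Pick n_bins evenly spaced exonic positions along one transcript."""
    chrom, positions = exon_positions(bed12_line)
    if len(positions) < n_bins:
        return chrom, []  # assumption: too-short transcripts are skipped
    step = len(positions) / n_bins
    return chrom, [positions[int(i * step)] for i in range(n_bins)]
```

Summing the read depth at each of the 100 positions over all transcripts gives one value per percentile, which is the kind of curve the plot shows. Under this scheme a dip at the 50th percentile reflects low depth at whatever bases fall mid-transcript in the BED12 annotation.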
For WGS data, comparing this result to the complete genome coverage (not just transcript regions) can be informative. See the tool group BEDTools. Example tools of interest: MakeWindowsBed. Tool manual: https://bedtools.readthedocs.io/en/latest/content/bedtools-suite.html (explains the command-line options that are mirrored on the Galaxy tool forms)
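For anyone curious what the windowing step amounts to, here is a small sketch of fixed-size genome windows, similar in spirit to what `bedtools makewindows -g <genome> -w <size>` produces. The function name `make_windows` and the example chromosome sizes are invented for illustration, not taken from hg38 or from the BEDTools source.

```python
def make_windows(chrom_sizes, window_size):
    """Yield (chrom, start, end) BED-style intervals tiling each chromosome.

    Coordinates are 0-based and half-open, as in BED; the last window on a
    chromosome is truncated at the chromosome end.
    """
    for chrom, size in chrom_sizes.items():
        for start in range(0, size, window_size):
            yield chrom, start, min(start + window_size, size)


# Hypothetical sizes, for illustration only
example_sizes = {"chrA": 250, "chrB": 100}
windows = list(make_windows(example_sizes, 100))
```

Computing read depth per window and looking at its spread across the genome gives a uniformity picture that does not depend on the transcript annotation.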