Input for Gene Body Coverage (BAM)

Hi,

I have been trying to generate a gene coverage plot from my BAM file using Gene Body Coverage (BAM) on usegalaxy.org, but I keep getting results that I do not expect (see the attached heatmap png). The png suggests that all samples have a sharp peak near the 40th percentile, which is very different from what I see when I visualise my data in IGV. So I am wondering if my input BED12 file is in the wrong format.

I have attached a screenshot of my bed12 file (see the screenshot png below). Would you mind checking if the bed12 file is in the right format?

output.geneBodyCoverage.heatMap

best wishes,

Stephen

Hi @stephen_li,
the format is correct, but that signal looks like an artifact: it is too narrow and too homogeneous across samples to come from an over-represented transcript. Could you analyze the read distribution across features? It could provide some additional information.
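If you want to check the file yourself rather than rely on a screenshot, here is a minimal sketch of a structural BED12 check. It only tests the rules of the BED12 layout itself (12 tab-separated columns, consistent block fields); the function name `check_bed12_line` and the example coordinates are made up for illustration and are not part of any Galaxy tool.

```python
def check_bed12_line(line):
    """Return a list of problems found in one BED12 line (empty if none)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 12:
        return [f"expected 12 tab-separated columns, found {len(fields)}"]
    try:
        start, end = int(fields[1]), int(fields[2])
        thick_start, thick_end = int(fields[6]), int(fields[7])
        block_count = int(fields[9])
        # blockSizes and blockStarts are comma-separated, often with a trailing comma
        sizes = [int(x) for x in fields[10].rstrip(",").split(",")]
        starts = [int(x) for x in fields[11].rstrip(",").split(",")]
    except ValueError:
        return ["non-integer value in a numeric column"]

    problems = []
    if start >= end:
        problems.append("chromStart must be smaller than chromEnd")
    if fields[5] not in ("+", "-", "."):
        problems.append("strand must be '+', '-' or '.'")
    if not (start <= thick_start <= thick_end <= end):
        problems.append("thickStart/thickEnd fall outside the feature")
    if len(sizes) != block_count or len(starts) != block_count:
        problems.append("blockCount does not match blockSizes/blockStarts")
    elif starts[0] != 0 or starts[-1] + sizes[-1] != end - start:
        problems.append("blocks do not span chromStart..chromEnd")
    return problems


# Hypothetical single-block entry; the coordinates are invented for the example.
example = "chr\t100\t477\trnpB\t0\t+\t100\t100\t0\t1\t377,\t0,"
print(check_bed12_line(example))
```

Running it over every line of the file and printing only the non-empty results would show quickly whether any entry is malformed.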

Regards

Hi,

Sure. Here's the link to the read distribution result: https://usegalaxy.org/datasets/bbd44e69cb8906b59133118c8ae76acd/display?to_ext=html

I am puzzled by the fact that none of the reads are classified as “CDS_Exons”. I wonder whether that might give a hint about the strange result from my previous post?

The full input bed12 file is attached in the screenshot below

best wishes,

Stephen

Hi,

I have just done a quick test using one RNA-seq sample and one gene. The IGV screenshot of the selected gene (rnpB) is shown below. It shows that the coverage should be skewed towards the 3′ end.

However, the output of Gene Body Coverage shows one sharp peak at around the 50th percentile.

Does that suggest an error in my input files?
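For reference, this is roughly how I understand a per-base coverage profile to be mapped onto percentiles. It is only a sketch under my own assumptions (the transcript is rescaled to 100 positions running 5′ to 3′, and minus-strand features are reversed); it is not the tool's actual code, and `percentile_profile` is a name I made up.

```python
def percentile_profile(depth, strand="+", bins=100):
    """Sample per-base depth at `bins` evenly spaced positions, 5' to 3'."""
    if len(depth) < bins:
        raise ValueError("transcript shorter than the number of bins")
    if strand == "-":
        # Genomic order runs 3' to 5' for minus-strand features, so flip it.
        depth = depth[::-1]
    n = len(depth)
    return [depth[(i * n) // bins] for i in range(bins)]


# Toy depth that rises along the genomic coordinate (invented numbers).
depth = list(range(200))
plus = percentile_profile(depth, "+")
minus = percentile_profile(depth, "-")
print(plus.index(max(plus)), minus.index(max(minus)))  # prints "99 0"
```

With this toy input the peak sits at the 3′ end when the feature is on the plus strand and at the 5′ end when the same feature is annotated on the minus strand, so a wrong strand could flip a profile, but it would not by itself move a peak to the middle.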

best wishes,

Stephen

Judging by the features in your BED12 file, the transcripts appear to be non-coding RNAs (tracrRNA, crRNA, 6S RNA). That would explain why none of the reads are classified as CDS exons. What kind of RNA-seq are you using?
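A quick way to confirm this on the BED12 file: columns 7 and 8 (thickStart, thickEnd) mark the coding region, and a feature without one has thickStart equal to thickEnd. The sketch below counts such entries; it assumes the read distribution tool derives CDS exons from those two columns, and the helper names and example lines are hypothetical.

```python
def has_cds(bed12_line):
    """True if the BED12 entry annotates a non-empty thick (coding) region."""
    fields = bed12_line.rstrip("\n").split("\t")
    thick_start, thick_end = int(fields[6]), int(fields[7])
    return thick_end > thick_start


def count_coding(lines):
    """Return (entries with a CDS, entries without) for an iterable of BED12 lines."""
    coding = noncoding = 0
    for line in lines:
        # Skip blank lines and the optional track/browser/comment header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        if has_cds(line):
            coding += 1
        else:
            noncoding += 1
    return coding, noncoding


# Two invented entries: one without a thick region, one with.
lines = [
    "chr\t100\t477\trnpB\t0\t+\t100\t100\t0\t1\t377,\t0,",
    "chr\t1000\t1900\tgeneX\t0\t+\t1050\t1850\t0\t1\t900,\t0,",
]
print(count_coding(lines))  # prints "(1, 1)"
```

If every entry in your file lands in the second count, an empty “CDS_Exons” row is expected and does not indicate a problem with the format.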