Hi
I need help calculating the median coverage (and range) per targeted base from a BAM file. I have tried several SAMtools tools in Galaxy but could not find one that reports it. Can someone please explain and help?
Thank you
Hello, I did try Samtools coverage, but the output file does not contain this information. Please see the attached file and let me know whether I can extract this information from the output.
Thank you very much
Hi @naila2025,
Sorry, it was my mistake.
mosdepth has an option to report the median instead of the mean, but for my test file the mean and median values are very similar, so I cannot check it. Maybe calculate per-base depth with any suitable tool, for example Samtools depth, and then calculate the median using Summary statistics or datamash.
Alternatively, you can get this value from QualiMap BamQC, genome_coverage_fraction table.
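For the Samtools depth route, here is a minimal Python sketch of what the median step does, assuming the usual three tab-separated columns (chromosome, position, depth). The function name `depth_summary` and the example rows are made up for illustration; in Galaxy the same calculation is done by Summary statistics or datamash on column 3.

```python
import statistics

def depth_summary(lines):
    """Summarise column 3 (depth) of samtools-depth-style text lines."""
    depths = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        # Column 3 is the depth; column 2 is the genomic position.
        depths.append(int(fields[2]))
    if not depths:
        raise ValueError("no depth values found")
    return {
        "median": statistics.median(depths),
        "mean": statistics.fmean(depths),
        "min": min(depths),
        "max": max(depths),
    }

# Made-up rows, not taken from any real BAM file.
example = [
    "chrX\t1\t0",
    "chrX\t2\t3",
    "chrX\t3\t7",
    "chrX\t4\t1",
    "chrX\t5\t40",
]
summary = depth_summary(example)
# Report as "median (min-max)"
print(f"{summary['median']} ({summary['min']}-{summary['max']})")
```

For a real file, the lines could come from `open("depth.tsv")`, where the file name is a placeholder for the saved Samtools depth output.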
Kind regards,
Igor
Samtools depth and then datamash gives this output. Please confirm whether this is the median coverage (range) per base. If yes, should it be presented as 6.19705e+07 (719 - 2.48572e+08)?
I am sorry but this all is very new for me.
Thank you
Hi @naila2025,
IMHO, both mean and median values look too high. By any chance, have you used column 2 (positions in genome) instead of column 3 (coverage) for calculation?
Check my history. I sorted the output from samtools depth on column 3. As you can see in dataset 18, some positions have zero coverage. These positions are included in the calculation of the mean and median values. This is not important in my case, as only a few positions have zero coverage, but if you deal with exome sequencing, the outcome might be different. In this BAM file the median and mean coverage have very similar values (dataset 19).
Also, check the output from QualiMap BamQC (dataset 16). You are after the Genome Fraction Coverage graph. The median is the middle point of the genome bases, so draw a line at 50% (Y axis) and check the depth (X axis). It is 37.
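The graph reading described above can be sketched in Python, assuming the curve shows the fraction of positions covered at a given depth or more. The function name `depth_at_fraction` and the numbers are hypothetical. With an even number of positions this returns the upper of the two middle values, so it can differ slightly from a conventional median.

```python
from collections import Counter

def depth_at_fraction(depths, fraction=0.5):
    """Largest depth d such that at least `fraction` of positions have depth >= d."""
    counts = Counter(depths)
    total = len(depths)
    covered = 0
    # Walk from the highest depth downwards, accumulating positions
    # that are covered at this depth or more.
    for d in sorted(counts, reverse=True):
        covered += counts[d]
        if covered / total >= fraction:
            return d
    raise ValueError("no depth values given")

# Made-up per-base depths, including one zero-coverage position.
print(depth_at_fraction([0, 1, 3, 7, 40]))
```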
Kind regards,
Igor
hello,
Yes, I made this mistake. Below is the outcome with column 3. I did targeted genome sequencing. A reviewer has requested this information and I am stuck trying to find it. I am practicing all sorts of tools on one dataset only. Is this fine now? 2 (0-402)
hi
Can you please comment on why the Genome Fraction Coverage graph in the QualiMap BamQC output is empty? Please have a look at the history.
Hi @naila2025,
The new mean and median numbers look OK for targeted regions. However, in this situation you probably want the median coverage for the target regions only, not the whole genome. A median for the whole genome does not say anything about the targeted regions. Samtools depth has an option, Filter by regions. To use it, you need the positions of the target regions in BED format. Note that BED uses a 0-based Start position, so the first ten positions on chrX are described by:
chrX 0 10
(should be tab separated).
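To make the offset concrete, here is a small Python sketch that converts a 1-based, inclusive range (as reported in the position column of Samtools depth) into a BED line, and checks whether a 1-based position falls inside a BED interval. The helper names `to_bed_line` and `in_bed_interval` are hypothetical.

```python
def to_bed_line(chrom, first_pos, last_pos):
    """Build a tab-separated BED line from a 1-based, inclusive range."""
    # BED Start is 0-based, BED End is exclusive, so only Start shifts by one.
    return f"{chrom}\t{first_pos - 1}\t{last_pos}"

def in_bed_interval(pos, bed_start, bed_end):
    """True if a 1-based position lies inside the BED interval [bed_start, bed_end)."""
    return bed_start < pos <= bed_end

# First ten positions on chrX
print(to_bed_line("chrX", 1, 10))
```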
I’ll look at the history later.
Kind regards,
Igor