How is genome coverage / depth determined?

dna · January 24, 2022, 5:42am

I have the raw R1 and R2 files of a bacterial genome sequence, which have been assembled into contigs using SPAdes. How can I calculate the genome coverage / depth? Thanks.

Flow · January 25, 2022, 2:27pm

Dear @dna,
For paired-end sequencing, the expected genome coverage is typically calculated with the equation:

C = (N · L · 2) / G

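N = number of read pairs (i.e. the number of reads in R1 alone; if N already counts R1 and R2 together, drop the factor 2)
L = read length (bp)
G = genome size (bp)
2 = factor for paired-end sequencing (two reads per pair)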
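
As a quick sanity check, the formula can be sketched in Python. This is only an illustration: the function name, the `paired_factor` parameter, and the example numbers are assumptions, not part of any tool. Set `paired_factor=1` if your read count already includes both R1 and R2.

```python
def estimated_coverage(n_reads, read_length, genome_size, paired_factor=2):
    """Expected coverage C = (N * L * paired_factor) / G.

    n_reads       -- N, number of read pairs (or total reads if paired_factor=1)
    read_length   -- L, read length in bp
    genome_size   -- G, genome (or assembly) size in bp
    paired_factor -- 2 when N counts pairs, 1 when N counts R1 + R2 together
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * read_length * paired_factor / genome_size


# Hypothetical example: 20,000,000 read pairs of 2 x 150 bp, 2,000,000 bp genome
print(estimated_coverage(20_000_000, 150, 2_000_000))  # 3000.0
```

Note that this is the theoretical coverage from raw reads; it does not account for reads lost to quality trimming or reads that fail to map.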

Kind regards,
Florian

dna · January 28, 2022, 3:44am

@Flow thanks, I tried this equation but I'm still unsure if it's accurate. There are ~40,000,000 raw reads (including both R1 and R2) from 150 bp paired-end sequencing, and the assembly is ~2,000,000 nucleotides. So it's 3000x coverage?

Flow · January 28, 2022, 2:03pm

Dear @dna,
Just to make sure, I asked my colleague @pavanvidem. He confirmed my assumptions.

The coverage after mapping may be more meaningful, since your raw data might contain some low-quality reads.
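
One way to get the coverage after mapping is to map the reads back to the contigs and average the per-base depth. Below is a minimal sketch, assuming you have already produced per-base depth with `samtools depth -a`, which writes three tab-separated columns (contig, position, depth). The function name and the example data are hypothetical.

```python
import io


def mean_depth(lines):
    """Average per-base depth from `samtools depth` style lines.

    Each line is expected to be: contig <TAB> position <TAB> depth
    """
    total = 0
    positions = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _contig, _pos, depth = line.split("\t")[:3]
        total += int(depth)
        positions += 1
    return total / positions if positions else 0.0


# Tiny made-up example standing in for a real depth file
example = io.StringIO("contig_1\t1\t10\ncontig_1\t2\t20\ncontig_2\t1\t30\n")
print(mean_depth(example))  # 20.0
```

For a real file you would pass an open file handle instead of the in-memory example, e.g. `with open("depth.txt") as fh: mean_depth(fh)`, where `depth.txt` is a placeholder name.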

A coverage of 3000x might be good for a variant analysis. So whether 3000x is “right” or “wrong” really depends on what you want to do with the data.

Kind regards,
Florian

dna · January 29, 2022, 4:50pm

Thanks again
