How is genome coverage / depth determined?

I have the raw R1 and R2 files of a bacterial genome sequence that have been assembled into contigs using SPAdes. How can I calculate the genome coverage / depth? Thanks.

Dear @dna,
For paired-end sequencing, you can estimate the expected genome coverage with the equation:

C = (N · L · 2) / G

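N = number of read pairs (i.e. the number of reads in R1 alone)
L = read length
G = genome size
2 = factor for paired-end sequencing (each pair contributes two reads)

If N already counts the R1 and R2 reads together, leave out the factor 2.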
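
As a quick check of the equation, here is a minimal Python sketch. The function name `expected_coverage` and the example numbers are only illustrative assumptions, not values from any real dataset; N is taken to be the number of read pairs.

```python
def expected_coverage(read_pairs: int, read_length: int, genome_size: int) -> float:
    """Expected depth C = (N * L * 2) / G for paired-end data.

    read_pairs  -- N, number of read pairs (reads in R1 only)
    read_length -- L, read length in bp
    genome_size -- G, genome (or assembly) size in bp
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    # The factor 2 accounts for the two reads of every pair.
    return (read_pairs * read_length * 2) / genome_size


# Illustrative numbers: 1,000,000 pairs of 150 bp reads on a 5 Mb genome.
print(expected_coverage(1_000_000, 150, 5_000_000))  # 60.0
```

This is only the theoretical depth from the raw read count, so it will overestimate the usable depth if many reads are later trimmed or filtered.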

Kind regards,
Florian

@Flow thanks, I tried this equation but I'm still unsure if it's accurate. There are ~40,000,000 raw reads (R1 and R2 combined) from 150 bp paired-end sequencing, and the assembly is ~2,000,000 nucleotides. So it's 3000x coverage?

Dear @dna,
Just to make sure, I asked a colleague of mine, @pavanvidem. He confirmed my assumptions.

The coverage after mapping the reads back to the assembly may be more informative, since your raw data might contain some low-quality reads that never map.
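
One possible way to get that number, sketched under assumptions: if the reads are mapped back to the contigs and the per-position depth is exported with `samtools depth -a` (three tab-separated columns: reference name, position, depth), the mean depth can be computed in a few lines of Python. The function name `mean_depth` and the three example records are made up for illustration.

```python
import io
from typing import Iterable


def mean_depth(lines: Iterable[str]) -> float:
    """Mean depth over all positions in `samtools depth`-style text.

    Each line is expected to hold: reference name, position, depth (tab-separated).
    """
    total = 0
    positions = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip empty lines
        _contig, _pos, depth = line.split("\t")[:3]
        total += int(depth)
        positions += 1
    if positions == 0:
        raise ValueError("no depth records found")
    return total / positions


# Made-up records standing in for a real depth file.
example = io.StringIO("contig_1\t1\t10\ncontig_1\t2\t12\ncontig_1\t3\t14\n")
print(mean_depth(example))  # 12.0
```

For a real file you would pass an open file handle instead of the `io.StringIO` stand-in.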

A coverage of 3000x might be good for a variant analysis. So whether 3000x is “right” or “wrong” really depends on what you want to do with the data.

Kind regards,
Florian

Thanks again