How is the contig depth calculated?
Issue #35
resolved
Hi, is there anywhere I can find out how the contig depth (coverage) is calculated?
Comments (3)
- changed status to resolved
Hi, thanks for your prompt reply. I am somewhat confused about which formula is used to calculate the contig depth. With samtools, I can use the command:
samtools depth k119_4_sorted.bam >d1.txt
to generate the depth at each site. With MetaBAT, I used the following command to generate the contig depth:
jgi_summarize_bam_contig_depths --outputDepth d2.txt k119_4_sorted.bam
Initially I thought the average of the per-site depths (from samtools) would equal the contig depth:
awk '{sum+=$3}END{print sum/329}' d1.txt
However, that is not the case. Can you help me figure out where this goes wrong?
Thanks, Yun
Hi,
By default, a couple of adjustments are applied to improve the depth (coverage) estimate. The per-contig depths are computed by the jgi_summarize_bam_contig_depths program in the MetaBAT distribution, which can also be invoked independently.
Here are the options to that:
But basically, a read contributes to the depth only if it maps with >= 97% identity to the assembly scaffold (i.e. matches / (matches + mismatches + insertions + deletions) >= 0.97).
And the edges of a contig are excluded (one average read length from either end).
-Rob
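The two adjustments described in the reply can be sketched in Python to show why a plain per-site average from samtools depth will generally differ from the reported contig depth. This is only an illustration of the stated rules, not the actual jgi_summarize_bam_contig_depths code: the function names, the per-base depth list, and the handling of contigs shorter than two read lengths are all assumptions made for the example.

```python
def read_identity(matches, mismatches, insertions, deletions):
    """Identity as given in the reply: matches / (matches + mismatches + insertions + deletions)."""
    aligned = matches + mismatches + insertions + deletions
    if aligned == 0:
        return 0.0
    return matches / aligned


def passes_identity_filter(matches, mismatches, insertions, deletions, min_identity=0.97):
    """True if the read would count toward the depth under the >= 97% rule."""
    return read_identity(matches, mismatches, insertions, deletions) >= min_identity


def edge_trimmed_mean_depth(per_base_depth, avg_read_length):
    """Mean depth after dropping one average read length from each contig end.

    per_base_depth: one value per contig position (zero-depth positions included),
    counted only from reads that passed the identity filter.
    """
    edge = int(round(avg_read_length))
    core = per_base_depth[edge:len(per_base_depth) - edge]
    if not core:
        # Assumption for this sketch only: a contig too short to trim yields 0.0.
        return 0.0
    return sum(core) / len(core)


# Toy contig of 8 positions with uncovered ends and a hypothetical read length of 2.
depths = [0, 0, 5, 5, 5, 5, 0, 0]
plain_mean = sum(depths) / len(depths)            # what the awk one-liner computes
trimmed_mean = edge_trimmed_mean_depth(depths, 2)  # edges excluded
print(plain_mean, trimmed_mean)
```

On this toy input the plain average is 2.5 while the edge-trimmed average is 5.0. On real data the gap also depends on how many reads fall below the identity threshold, since samtools depth does not apply that filter.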