duplicate reads in bam file

Issue #88 resolved
Former user created an issue

Hi, when I run "jgi_summarize_bam_contig_depths" to generate a depth file, I discovered that there are numerous duplicate reads in the BAM file. Do I need to remove the duplicate reads? Thanks, xudong

Comments (1)

  1. Rob Egan

    If there are duplicate reads, then yes, your data is corrupted by some upstream preprocessing, and you should fix it and re-assemble before running metabat.

    If the number of duplicates is <1%, then it probably won’t make a significant difference, but regardless you should verify that all the inputs are okay (for example, that you did not include the same fastq file multiple times and that the read names are unique across samples).
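
One way to run the input checks suggested above is sketched below. It is not part of MetaBAT; the function names (`open_text`, `file_digest`, `read_names`, `check_inputs`) are hypothetical, and it assumes plain 4-line FASTQ records, optionally gzip-compressed. It reports files whose content is identical and read names that occur more than once across the given files. Paired-end mates normally share a read name, so pass only one mate file per library (for example, only the forward reads). All names are held in memory, so this is only practical for modestly sized inputs.

```python
import gzip
import hashlib
from collections import Counter
from pathlib import Path


def open_text(path):
    """Open a FASTQ file as text, transparently handling .gz files."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def file_digest(path):
    """SHA-256 of the decompressed text, so a .gz copy matches its plain twin."""
    h = hashlib.sha256()
    with open_text(path) as fh:
        for line in fh:
            h.update(line.encode())
    return h.hexdigest()


def read_names(path):
    """Yield the read name of each record (header text up to the first whitespace).

    Assumes strict 4-line FASTQ records; wrapped sequences are not handled.
    """
    with open_text(path) as fh:
        for i, line in enumerate(fh):
            if i % 4 == 0:
                fields = line[1:].split()
                yield fields[0] if fields else ""


def check_inputs(paths):
    """Return (groups of files with identical content, read names seen more than once)."""
    by_digest = {}
    name_counts = Counter()
    for p in paths:
        by_digest.setdefault(file_digest(p), []).append(str(p))
        name_counts.update(read_names(p))
    identical = [group for group in by_digest.values() if len(group) > 1]
    repeated = {name: n for name, n in name_counts.items() if n > 1}
    return identical, repeated
```

If `identical` is non-empty, the same reads were probably supplied twice; if only `repeated` is non-empty, read names collide across samples and may need a per-sample prefix before mapping.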
