Duplicated scaffolds in metabat

Issue #67 resolved
Former user created an issue

Hi

I was wondering if it's normal for metabat1 to place identical scaffolds in two different bins?

Comments (5)

  1. Rob Egan

    Do you mean from run to run, or within a single metabat1 run?

    If it is run to run, then some randomness is involved and the differences are okay. You can set the starting pseudo-random number seed with the --seed parameter, which we use for testing; any given seed should result in an identical set of contigs.

    If it is within a single metabat1 run, then your duplicate scaffolds may indeed go into two different bins. That depends on your mapper, how it handles ambiguous placement of reads, and the total and differential coverage depths of the contigs. If your assembly has many duplicated scaffolds, you really should address that before attempting to bin them, because metabat expects the assembly to be a unique set of scaffolds.
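    As a rough way to check whether an assembly contains duplicated scaffolds before binning, the sketch below scans FASTA records for repeated header IDs and for byte-identical sequences. It is not part of metabat; the function names are hypothetical, the ID is assumed to be the first whitespace-delimited token of the header, and reverse-complement or near-identical duplicates are not detected.

```python
from collections import defaultdict
import hashlib


def read_fasta(lines):
    """Yield (header_id, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Assumption: the ID is the first whitespace-delimited token.
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def find_duplicates(records):
    """Return (duplicated header IDs, groups of IDs sharing one sequence)."""
    id_counts = defaultdict(int)
    by_digest = defaultdict(list)
    for name, seq in records:
        id_counts[name] += 1
        # Hash the upper-cased sequence so soft-masked copies still match.
        digest = hashlib.sha1(seq.upper().encode()).hexdigest()
        by_digest[digest].append(name)
    dup_ids = sorted(n for n, c in id_counts.items() if c > 1)
    dup_seqs = sorted(sorted(v) for v in by_digest.values() if len(v) > 1)
    return dup_ids, dup_seqs
```

    To use it on a real file, pass an open file handle to read_fasta (for example a hypothetical assembly.fa) and feed the result to find_duplicates.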

  2. Andy Leu

    Hello Rob

    It is within a run. So is it normal for the same contig (same header ID) to be binned into two different bins?

  3. Rob Egan

    I’d first say that having duplicate scaffolds in your assembly is not normal, so I do not “expect” any given answer; given an imperfect assembly, you will get an imperfect binning.

    That said, if both the coverage depths and the nucleotide sequences of two scaffolds are identical (the names should not matter), then I would expect them to be in the same bin. That condition is unlikely, though, because of how the reads get mapped and your choice of mapper (some will not map any reads to the duplicate). You can inspect the depths file to see whether the coverage of the two scaffolds is identical too.
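    One way to do that inspection is sketched below. It assumes the depths file is tab-delimited with a header row and the contig name in the first column, followed only by numeric columns (length, average depth, per-sample depth and variance); check this against your own file. The function names and the tolerance parameter are hypothetical.

```python
import csv


def load_depths(lines):
    """Return (numeric column names, {contig name: [numeric values]})."""
    reader = csv.reader(lines, delimiter="\t")
    header = next(reader)
    depths = {}
    for row in reader:
        if row:
            # Assumption: every column after the name is numeric.
            depths[row[0]] = [float(x) for x in row[1:]]
    return header[1:], depths


def same_coverage(depths, a, b, tol=0.0):
    """True if contigs a and b agree in every numeric column within tol."""
    va, vb = depths[a], depths[b]
    return len(va) == len(vb) and all(abs(x - y) <= tol for x, y in zip(va, vb))
```

    Calling same_coverage(depths, "scaffold_A", "scaffold_B") with the names of the two duplicated scaffolds reports whether their rows match; a small tol allows for rounding differences.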

  4. Andy Leu

    Hey Rob

    Sorry about reporting the issue. This is a problem on my side. It seems that running multiple metabat runs one after another on the HPC is causing this problem. If I run it on a local server or space the runs out on the HPC, the problem does not occur.

    Weird huh?

    Cheers, Andy
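    For anyone who wants to confirm whether a given run is affected, the sketch below reports header IDs that occur in more than one bin. It assumes the bins were written as FASTA files and that the contig ID is the first whitespace-delimited token of each header; the function names and bin file names are hypothetical.

```python
from collections import defaultdict


def contig_ids(fasta_lines):
    """Collect header IDs from an iterable of FASTA lines."""
    ids = []
    for line in fasta_lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                ids.append(parts[0])
    return ids


def contigs_in_multiple_bins(bins):
    """bins maps a bin name to its FASTA lines; return {id: [bin names]}."""
    seen = defaultdict(set)
    for bin_name, lines in bins.items():
        for cid in contig_ids(lines):
            seen[cid].add(bin_name)
    # Keep only IDs that were seen in two or more distinct bins.
    return {cid: sorted(names) for cid, names in seen.items() if len(names) > 1}
```

    An empty result means no header ID is shared between bins. With real output, each value in bins can be the list returned by readlines() on one bin file.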
