Manually set --minContig below 1500

Issue #85 resolved
Former user created an issue

Hello,

The documentation for metabat2 suggests that the --minContig parameter is tunable below 1500. It says the value should be >=1500, which implies that it can be less; however, whenever I set it below 1500 I get an error. Is this a hardcoded or tunable parameter?

Thanks! Jarrod

Comments (6)

  1. Jarod Scott

    Sorry, I wrote my original message before signing up, so I can’t edit my original post.

    I should say that I installed METABAT2 v2.14 with conda.

    And sorry for setting the priority to “major”; I would edit that too. I am new to Bitbucket :)

  2. Rob Egan

    The 1500 limit on --minContig is a hard limit, and the program is coded to exit with an error message if a smaller value is specified on the command line. We have found that smaller values are detrimental to the binning calculations and workflow.

    Contigs >= 1000 are recruited to sufficiently large bins after the contigs above the minContig threshold have been considered for binning. This option is on by default, but you can disable it with --noAdd. We have found that 1000 is the minimum contig length that can reliably be recruited to bins of 200kb or greater, so that parameter too is hard-coded.

  3. Jarod Scott

    Thank you @Rob Egan for the explanation. One of the reasons I ask is that I am comparing the results of several binning methods using [anvi’o](http://merenlab.org/software/anvio/), which, as you may know, recently added METABAT2, CONCOCT, MAXBIN2, BINSANITY, & DASTOOL options to its binning step.

    Based on what you are saying, then, if I leave the option to recruit contigs >= 1000 “on”, is a direct comparison of, say, METABAT2 minlength = 1500 to MAXBIN2 minlength = 1000 appropriate?

  4. Rob Egan

    With the caveat that all these options trade off sensitivity for specificity, that would be reasonable. Alternatively, the default options for metabat should still recruit the small contigs between 1000 and 2500 to the bins already created on the >=2500 base contigs.

    For comparison purposes, I think it is preferable to simply use the defaults on all packages, as that is what the authors presume should be the best fit for most datasets.

  5. Jarod Scott

    Ah, ok. Fair point and very helpful moving forward. I have a set of analyses where I used defaults for everything. Thank you for engaging me in this dialogue.

  6. Rob Egan

    Hi, I encourage you to try v2.15 as some bugs in the binning algorithm that were recently introduced have been resolved in this new version.
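
For readers skimming the thread, the length thresholds described in comment 2 can be sketched roughly as follows. This is an illustrative Python sketch, not MetaBAT2's actual code: the function name `classify_contigs`, its arguments, and the group labels are hypothetical, the 2500 default is inferred from comment 4, and the real recruitment step also depends on bin size (200kb or greater), which is not modelled here.

```python
# Hypothetical sketch of the thresholds discussed in this thread.
# It is NOT taken from the MetaBAT2 source code.
MIN_CONTIG_FLOOR = 1500    # hard lower bound on --minContig (comment 2)
RECRUIT_FLOOR = 1000       # minimum length eligible for recruitment (comment 2)
DEFAULT_MIN_CONTIG = 2500  # default threshold implied by comment 4


def classify_contigs(lengths, min_contig=DEFAULT_MIN_CONTIG, no_add=False):
    """Group contig names by how the length thresholds would treat them.

    lengths: dict mapping contig name -> length in bases.
    no_add:  mimics the effect of --noAdd (no recruitment of small contigs).
    """
    if min_contig < MIN_CONTIG_FLOOR:
        # Mirrors the behaviour reported above: values below 1500 are rejected.
        raise ValueError(
            f"min_contig must be >= {MIN_CONTIG_FLOOR}, got {min_contig}"
        )
    groups = {"binned": [], "recruitable": [], "ignored": []}
    for name, length in lengths.items():
        if length >= min_contig:
            groups["binned"].append(name)       # considered for binning
        elif length >= RECRUIT_FLOOR and not no_add:
            groups["recruitable"].append(name)  # eligible for later recruitment
        else:
            groups["ignored"].append(name)      # too short, or recruitment disabled
    return groups
```

Under these assumptions, calling `classify_contigs({"a": 3000, "b": 1200, "c": 800}, min_contig=1500)` places `a` in the binned group, `b` in the recruitable group, and `c` in the ignored group, while `min_contig=1000` raises a `ValueError`.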
