Too many bins with test data

Issue #86 resolved
Víctor Blancato created an issue

I ran metabat2 with the assembly and the 2 mock communities provided, and I got 121 bins instead of the 28 indicated on the web page. Any advice?

This is the output on the screen.

Executing: 'jgi_summarize_bam_contig_depths --outputDepth assembly.fa.depth.txt --percentIdentity 97 --minContigLength 1000 --minContigDepth 1.0 --referenceFasta assembly.fa library1.sorted.bam library2.sorted.bam' at vie dic 27 11:57:15 -03 2019
Output depth matrix to assembly.fa.depth.txt
Minimum percent identity for a mapped read: 0.97
minContigLength: 1000
minContigDepth: 1
Reference fasta file assembly.fa
jgi_summarize_bam_contig_depths v2.14-7-g2b4b398 2019-12-27T11:49:07
Output matrix to assembly.fa.depth.txt
Reading reference fasta file: assembly.fa
... 1893 sequences
0: Opening bam: library1.sorted.bam
1: Opening bam: library2.sorted.bam

Processing bam files
Thread 0 finished: library1.sorted.bam with 82747556 reads and 82230198 readsWellMapped
Thread 1 finished: library2.sorted.bam with 175361525 reads and 165063079 readsWellMapped
Creating depth matrix file: assembly.fa.depth.txt
Closing most bam files
Closing last bam file
Finished
Finished jgi_summarize_bam_contig_depths at vie dic 27 12:01:46 -03 2019
Creating depth file for metabat at vie dic 27 12:01:46 -03 2019
Executing: 'metabat2 --inFile assembly.fa --outFile assembly.fa.metabat-bins-20191227_120146/bin --abdFile assembly.fa.depth.txt' at vie dic 27 12:01:46 -03 2019
MetaBAT 2 (v2.14-7-g2b4b398) using minContig 2500, minCV 1.0, minCVSum 1.0, maxP 95%, minS 60, maxEdges 200 and minClsSize 200000.
121 bins (85393251 bases in total) formed.
Finished metabat2 at vie dic 27 12:01:47 -03 2019
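
When comparing runs across versions, it can help to pull the bin count straight out of the log rather than reading it by eye. The following is a minimal sketch, assuming the summary line always has the form shown in the log above ("N bins (M bases in total) formed."); the function name `parse_bin_summary` is hypothetical and not part of MetaBAT.

```python
import re

# Matches the summary line as it appears in the log above, e.g.
# "121 bins (85393251 bases in total) formed."
# Assumption: other metabat2 versions print the same wording.
SUMMARY_RE = re.compile(r"(\d+) bins \((\d+) bases in total\) formed\.")


def parse_bin_summary(log_text):
    """Return (bin_count, total_bases) from log text, or None if no summary line is found."""
    match = SUMMARY_RE.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# The summary line copied from the log in this report.
log = "121 bins (85393251 bases in total) formed.\n"
summary = parse_bin_summary(log)
print(summary)
```

Applied to saved logs from two versions, this gives a quick way to see whether the bin count changed between them.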

Comments (7)

  1. Rob Egan

    Hi Victor,

    Thanks for your post.

    Upon investigation, I see a large discrepancy between the bins generated from the master branch and the results documented in the wiki for version 2.10.2. I believe metabat2 started misbehaving only in the last commit on the master branch (v2.14-7-g2b4b398), and I am testing and validating a fix now.

    I’ll hopefully have a working version by tomorrow.

    Best,

    Rob

  2. Rob Egan

    Hello Victor,

    Thank you again for bringing this to my attention.

    There were two bugs in the code, which caused one significant and one minor degradation in the performance of the resulting bins, at least with respect to the “Best Binning Practices” wiki page for metabat2 (https://bitbucket.org/berkeleylab/metabat/wiki/Best Binning Practices). These are now fixed in v2.15, and I encourage you to try again with this new version.

    If you still see a problem, please indicate which “web page” you are referring to, and I can check for any further discrepancies.

    Best,

    Rob

  3. Víctor Blancato reporter

    Hello Rob,

    Thanks a lot for answering my question. I installed the new version, v2.15, and ran metabat2 again with the files downloaded from http://portal.nersc.gov/dna/RD/Metagenome_RD/MetaBAT/Software/Mockup/. I was following the “Example with real data” section of the web page https://bitbucket.org/berkeleylab/metabat/src/master/.

    This time I obtained 38 bins instead of the 28 indicated on the web page (https://bitbucket.org/berkeleylab/metabat/src/master/, line “MetaBAT forms about 28 bins. In this example ….”). So it is probably working fine now. Later I will try my own samples and see whether I also get fewer bins.

    Please let me know if you need further information.

    All the best,

    Victor
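
For checking a run's result independently of the log, the bin files in the output directory can be counted directly. This is a rough sketch, assuming the bins are written as `<prefix>.<number>.fa` (for example `bin.1.fa`) under the directory given to `--outFile`; that naming and the helper `count_bin_files` are assumptions for illustration, so adjust them to whatever the run actually produced.

```python
import tempfile
from pathlib import Path


def count_bin_files(out_dir, prefix="bin", suffix=".fa"):
    """Count files named <prefix>.<number><suffix> in out_dir.

    Assumption: bins are numbered FASTA files such as bin.1.fa.
    Files with a non-numeric middle part are ignored.
    """
    count = 0
    for path in Path(out_dir).iterdir():
        name = path.name
        if name.startswith(prefix + ".") and name.endswith(suffix):
            middle = name[len(prefix) + 1 : len(name) - len(suffix)]
            if middle.isdigit():
                count += 1
    return count


# Demonstration on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["bin.1.fa", "bin.2.fa", "bin.unbinned.fa", "notes.txt"]:
        (Path(tmp) / name).touch()
    demo_count = count_bin_files(tmp)

print(demo_count)
```

Pointing `count_bin_files` at the real output directory of each run gives the number to compare against the roughly 28 bins mentioned on the web page.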

  4. Rob Egan

    Hi Victor,

    Yes, the discrepancy in the Mockup dataset originates between v2.11.2 and v2.11.3, where a bug was fixed. For that dataset, the fix now results in more precise bins when species and strains are closely related and the number of samples used is <=2. Unfortunately, there is almost always a tradeoff between sensitivity and specificity, so fixing that bug resulted in more bins, some of which are less complete.

    Best,

    Rob
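
The tradeoff described in the last comment can be illustrated with a small worked example. This is only a sketch with made-up numbers: two hypothetical closely related strains, A and B, of 4 Mb each, and the common definitions of precision (share of a bin that belongs to the target genome) and completeness (share of the target genome recovered in the bin). It is not MetaBAT's own evaluation code.

```python
def precision(target_bases_in_bin, bin_bases):
    """Fraction of the bin's bases that come from the target genome."""
    return target_bases_in_bin / bin_bases


def completeness(target_bases_in_bin, genome_bases):
    """Fraction of the target genome's bases that ended up in the bin."""
    return target_bases_in_bin / genome_bases


GENOME_A = 4_000_000  # hypothetical size of strain A, in bases

# Case 1: strains A and B are merged into one 8 Mb bin.
merged_precision = precision(4_000_000, 8_000_000)
merged_completeness = completeness(4_000_000, GENOME_A)

# Case 2: the strains are separated, but the bin for A holds only 3 Mb of it.
split_precision = precision(3_000_000, 3_000_000)
split_completeness = completeness(3_000_000, GENOME_A)

print("merged:", merged_precision, merged_completeness)
print("split: ", split_precision, split_completeness)
```

In this toy case, the merged bin is fully complete for strain A but only half of it belongs to A, while the separated bin is pure but recovers three quarters of the genome. More, purer, less complete bins is the pattern described above for the Mockup dataset.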
