Segfault when using -m flag

Issue #82 resolved
Former user created an issue

When I set -m to 2000 or lower, v2.14 segfaults.

This doesn't seem to be an issue with earlier versions.

metabat2 -m 1500 -i /tmp/mega_assembly.fasta.gz -a /tmp/megahit.cov -o test
MetaBAT 2 (2.14 (Bioconda)) using minContig 1500, minCV 1.0, minCVSum 1.0, maxP 95%, minS 60, and maxEdges 200.
[1] 8910 segmentation fault (core dumped) metabat2 -m 1500 -i /tmp/mega_assembly.fasta.gz -a /tmp/megahit.cov -o test

Comments (10)

  1. Rob Egan

    I cannot replicate this, so it looks like it is something to do with your data.

    It could be any number of things. I don’t know what /tmp/megahit.cov is, but unless MetaBAT generated it from the BAM files, it is not going to be in the proper format. You would also need to ensure that all the contigs selected by the -m parameter have a value in the coverage file.

    It is also possible that your machine has insufficient memory to handle the greater number of contigs below that threshold. Can you share your assembly? And/or send me a stack trace of your core dump? And/or run with -v?
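The coverage-file requirement mentioned above (every contig selected by -m needs an entry) can be checked before running metabat2. The following is a minimal sketch, not part of MetaBAT. It assumes the coverage file is a tab-separated table with one header line and the contig name in the first column, and that contig names are the FASTA header text up to the first whitespace. The function names are hypothetical.

```python
import gzip


def read_fasta_lengths(path):
    """Return {contig_name: length} for a FASTA file (plain or gzipped)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    lengths = {}
    name = None
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # Assumption: the contig name is the header up to the first whitespace.
                parts = line[1:].split()
                name = parts[0] if parts else ""
                lengths[name] = 0
            elif name is not None:
                lengths[name] += len(line)
    return lengths


def read_depth_names(path):
    """Return the set of contig names found in the first column of the coverage table."""
    names = set()
    with open(path) as handle:
        next(handle, None)  # Assumption: the first line is a header and is skipped.
        for line in handle:
            if line.strip():
                names.add(line.split("\t", 1)[0].strip())
    return names


def contigs_missing_coverage(fasta_path, depth_path, min_contig=1500):
    """List contigs of at least min_contig bases that have no row in the coverage table."""
    lengths = read_fasta_lengths(fasta_path)
    covered = read_depth_names(depth_path)
    return sorted(
        name
        for name, length in lengths.items()
        if length >= min_contig and name not in covered
    )
```

An empty list from contigs_missing_coverage("/tmp/mega_assembly.fasta.gz", "/tmp/megahit.cov", 1500) would suggest that the coverage file at least names every contig the -m threshold selects. It says nothing about whether the remaining columns are in the expected format.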

  2. zhenjian lin

    Could you please list the standard commands to run bbmap for generating a BAM file? It is hard to know what is wrong with our own BAM file. Thanks.

  3. Rob Egan

    Your bamfiles have nothing to do with the command listed above. In MetaBAT, the bamfiles are used in the first stage of the runMetaBAT.sh script which in turn calls the jgi_summarize_bam_contig_depths program to generate the depths.txt file (i.e. coverage abundances) which metabat2 utilizes.

    Usage: jgi_summarize_bam_contig_depths <options> sortedBam1 [ sortedBam2 ...]

    The bam files need to be sorted but can be from any aligner. Feel free to use the default options for bbmap, and ask their support for help on the best options for your particular data set.

  4. Mitchell Sullivan

    Hi Rob,

    I created this issue. Sorry forgot to log in.

    megahit.cov was generated with jgi_summarize_bam_contig_depths, I can send you the BAM if you like.

    -Mitch

  5. Mitchell Sullivan

    Sorry, it seems I attached the coverage file twice instead of the assembly and the coverage file. Fixed now.

    Let me know if you have any trouble replicating.

  6. Rob Egan

    Okay, so the files are fine, but the assembly is extremely short: there are only 21 contigs of 1500 bases or more, and only another 201 that are >= 1000 and < 1500. Of the 21 that MetaBAT will consider for binning, none are similar enough to each other for MetaBAT to start evaluating whether they come from the same genome.

    Simply put, there is not enough there to bin. And what little is present in the assembly has very low coverage (3-9x). This indicates to me that you will need a lot more input data to get a decent assembly.

    metabat2 -i ~/Downloads/mega_assembly.fasta.gz -a ~/Downloads/megahit.cov -o xxx/ -m 1500 -v
    MetaBAT 2 (v2.14-5-g376f5b8) using minContig 1500, minCV 1.0, minCVSum 1.0, maxP 95%, minS 60, maxEdges 200 and minClsSize 200000.
    [00:00:00] Executing with 4 threads
    [00:00:00] Parsing abundance file
    [00:00:00] Parsing assembly file
    [00:00:00] Number of large contigs >= 1500 are 21.
    [00:00:00] Reading abundance file
    [00:00:00] Finished reading 16045 contigs and 1 coverages from /home/regan/Downloads/megahit.cov
    [00:00:00] Number of target contigs: 21 of large (>= 1500) and 201 of small ones (>=1000 & <1500).
    [00:00:00] Start TNF calculation. nobs = 21
    [00:00:00] Finished TNF calculation.
    [00:00:00] Finished Preparing TNF Graph Building [pTNF = 89.80]
    [00:00:00] Finished Building TNF Graph (210 edges) [1.1Gb / 15.6Gb]
    There were 21 nodes and 0 edges -- insufficient to compute bins

    Furthermore I tried a few tricks to see if any of the 201 short contigs could be recruited to any of the larger ones and did not have any luck on that front.

    And as for the segfault that started this ticket, I can confirm that v2.14 segfaults when there are lots of threads like you have on your machine, and that issue is now fixed in v2.14-5-g376f5b8.
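The size classes reported in the log above (large contigs at or above the -m threshold, small ones from 1000 bases up to that threshold) can be tallied ahead of time to see whether an assembly has enough material to bin. This is a rough sketch and not MetaBAT's own code. The function name is hypothetical, and the 1000-base floor is taken from the log line rather than from any documented option.

```python
def classify_contigs(lengths, min_contig=1500, small_floor=1000):
    """Count contigs per size class and sum the bases held in large contigs.

    lengths: iterable of contig lengths in bases.
    min_contig: the value that would be passed to metabat2 via -m.
    small_floor: lower bound of the small class (1000 in the log above).
    """
    large = 0
    small = 0
    large_bases = 0
    for length in lengths:
        if length >= min_contig:
            large += 1
            large_bases += length
        elif length >= small_floor:
            small += 1
    return {"large": large, "small": small, "large_bases": large_bases}


# Example with made-up lengths:
summary = classify_contigs([1500, 1499, 1000, 999, 3000])
print(summary)
```

Comparing large_bases with the minClsSize value printed in the log gives a rough feel for whether any bin could plausibly be produced. That comparison is only a heuristic and is not something the thread states.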

  7. Mitchell Sullivan

    Hi Rob,

    Yeah, it's part of a pipeline that we expect to return no bins some of the time, as reads can be removed in a previous step. The problem is that I don't want to add a catchall for metabat2 exiting with a segfault, in case the error is caused by something else.
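One way a pipeline can avoid a catchall is to tell a segfault apart from other failures by exit status. The sketch below assumes a POSIX system, where Python's subprocess module reports a process killed by a signal as a negative return code, and where a shell wrapper reports it as 128 plus the signal number. The function names are hypothetical, and a status above 128 could in principle also be an ordinary exit code.

```python
import signal
import subprocess


def classify_exit(returncode):
    """Map a process return code to 'ok', 'segfault', 'signal', or 'error'."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # subprocess convention on POSIX: -N means the child was killed by signal N.
        sig = -returncode
    elif returncode > 128:
        # Shell convention: 128 + N when the command was killed by signal N.
        sig = returncode - 128
    else:
        return "error"
    return "segfault" if sig == signal.SIGSEGV else "signal"


def run_and_classify(cmd):
    """Run cmd (a list of arguments) and return (status, CompletedProcess)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return classify_exit(proc.returncode), proc
```

A pipeline step could call run_and_classify(["metabat2", "-m", "1500", "-i", assembly, "-a", coverage, "-o", "test"]) and fail loudly on "segfault" or "signal", while handling the no-bins outcome separately, for instance by checking whether any bin files were written. How metabat2 reports the no-bins case through its exit status is not stated in this thread.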

  8. Rob Egan

    Hi Mitchell, I agree that a segfault is not okay. I believe that the segfault should be fixed in the latest version. Please confirm it works for you too when you have a chance.
