jgi_summarize_bam_contig_depths error using --noIntraDepthVariance option

Issue #164 resolved
Mariana Guerrero created an issue

Hi, I am trying to calculate the coverage of my contigs by using “ jgi_summarize_bam_contig_depths error“ option. I already have all my contig sorted (by ID name) and in bam format.

Everytime I am trying to run my script without any parameters ( just the jgi_summ command and the fwd, rev sorted bam files) I am getting the following error:

Please execute 'samtools sort' on unsorted input bam files and try again!

But I did check and my files are sorted by the ID name, for doublechek I sorted my alignments again using samtools sort option.

When I add the parameters that I want in the script:

jgi_summarize_bam_contig_depths --noIntraDepthVariance --minMapQual 20 --minContigLength 500 --outputDepth /mnt/Guanina/lmorales/Public/MetaHiC/data/Metator/alignment/pulque_cov.txt /mnt/Guanina/lmorales/Public/MetaHiC/data/Metator/alignment/Pulque_R1_sortbyname.bam /mnt/Guanina/lmorales/Public/MetaHiC/data/Metator/alignment/Pulque_R2_sortbyname.bam

I get the following error:

Calculating intra contig depth variance
minMapQual: 20
minContigLength: 500
Output depth matrix to /mnt/Guanina/lmorales/Public/MetaHiC/data/Metator/alignment/pulque_cov.txt
jgi_summarize_bam_contig_depths GIT-NOTFOUND 2023-08-15T14:33:10
Running with 88 threads to save memory you can reduce the number of threads with the OMP_NUM_THREADS variable
Output matrix to /mnt/Guanina/lmorales/Public/MetaHiC/data/Metator/alignment/pulque_cov.txt
Opening all bam files and validating headers
Processing bam files with largest_contig=0
Thread 1 opening and reading the header for file: /mnt/Guanina/lmorales/Public/MetaHiC/data/Metator/alignment/Pulque_R1_sortbyname.bam
Thread 1 opened the file: /mnt/Guanina/lmorales/Public/MetaHiC/data/Metator/alignment/Pulque_R1_sortbyname.bam
Thread 1 processing bam 0: Pulque_R1_sortbyname.bam
Thread 0 opening and reading the header for file: /mnt/Guanina/lmorales/Public/MetaHiC/data/Metator/alignment/Pulque_R2_sortbyname.bam
/cm/local/apps/sge/var/spool/compute-00-12/job_scripts/364029: line 26: 7063 Segmentation fault (core dumped) jgi_summarize_bam_contig_depths --noIntraDepthVariance --minMapQual 20 --minContigLength 500 --outputDepth /mnt/Guanina/lmorales/Public/MetaHiC/data/Metator/alignment/pulque_cov.txt /mnt/Guanina/lmorales/Public/MetaHiC/data/Metator/alignment/Pulque_R1_sortbyname.bam /mnt/Guanina/lmorales/Public/MetaHiC/data/Metator/alignment/Pulque_R2_sortbyname.bam

Please could you help me?
Also I am doing this in my university cluster, so the memory is not an issue :)

Comments (15)

  1. Rob Egan

    The bam needs to be sorted by contig ( i.e. the same order of the assembly fasta), not by name of each read. Sorting by name is not a useful ordering, imo, and is (approximately and unintentionally) what happens normally by your aligner as it aligns each read. This ordering by contig is necessary so that the read alignments are ordered by where they map within the assembly and is used to calculate the coverage more optimally and without blowing up memory, as each contig can be evaluated serially instead of holding all that information at the same time.

  2. Mariana Guerrero reporter

    Thank you, I run it with the sort bam files and not with the ID sort, it run well without any parameter.

    But when I add to the script all the paramets that I need: jgi_summarize_bam_contig_depths --noIntraDepthVariance --minMapQual 20 --minContigLength 500 --outputDepth… is still giving me this error:
    Segmentation fault (core dumped)

    I am using the metabat version 2

  3. Mariana Guerrero reporter

    Hi

    I double checked and the parameter that is giving me this issue is noIntraDepthVariance, when I add the other ones everything runs good.
    Is there any package or something I need in order to correct this?

  4. Rob Egan

    That looks like it could be a bug, but I cannot replicate it.

    Please provide the entire command line and output. Additionally, run with both the -v and -d options which should provide the exact version of metabat which you are using and more information for me. Hopefully you can attach the log to this ticket, since it might be long.

    Please also confirm that you have both contigs > 500 bases in your assembly and mappings better than 20 map qual (i.e. Sam’s MAPQ: -10 log10 Pr{Mapping position is wrong} ) in your bam file(s).

  5. Mariana Guerrero reporter

    Hi, sure. The metabat version I am using is 2.16. And yes all the contigs have those parameters.

    command line:

    jgi_summarize_bam_contig_depths --noIntraDepthVariance --minMapQual 20 --minContigLength 500 --outputDepth /mnt/Guanina/lmorales/Public/MetaHiC/data/Metator/alignment/cov.txt /mnt/Guanina/lmorales/Public/MetaHiC/data/Metator/alignment/Pulque_R1_sort.bam /mnt/Guanina/lmorales/Public/MetaHiC/data/Metator/alignment/Pulque_R2_sort.bam

    output:
    Calculating intra contig depth variance
    minMapQual: 20
    minContigLength: 500
    Output depth matrix to /mnt/Guanina/lmorales/Public/MetaHiC/data/pulque_cov.txt
    jgi_summarize_bam_contig_depths GIT-NOTFOUND 2023-08-15T14:33:10
    Running with 88 threads to save memory you can reduce the number of threads with the OMP_NUM_THREADS variable
    Output matrix to /mnt/Guanina/lmorales/Public/MetaHiC/data/pulque_cov.txt
    Opening all bam files and validating headers
    Processing bam files with largest_contig=0
    Thread 1 opening and reading the header for file: /mnt/Guanina/lmorales/Public/MetaHiC/data/Metator/alignment_Try1/Pulque_R1_sort.bam
    Thread 1 opened the file: /mnt/Guanina/lmorales/Public/MetaHiC/data/Metator/alignment_Try1/Pulque_R1_sort.bam
    Thread 0 opening and reading the header for file: /mnt/Guanina/lmorales/Public/MetaHiC/data/Metator/alignment_Try1/Pulque_R2_sort.bam
    Thread 1 processing bam 0: Pulque_R1_sort.bam
    Segmentation fault (core dumped)

    The only log I am getting after executing these lines is in binary, is a core file but I am not able to upload it

  6. Rob Egan

    Thanks Mariana.

    It looks like you are running the latest version.

    I still cannot reproduce your problem, so it may be something going on which is data dependent. You could help me to figure out what may be going on by doing any or all of the following:

    1. If you are building it directly, build the debug version and then hopefully it will drop a stack trace to direct me to the line of code that is breaking

      1. cmake -DCMAKE_BUILD_TYPE=Debug ……
    2. run with a single thread and optionally within a debugger.

      1. OMP_NUM_THREADS=1 jgi_summarize_bam_contig_depths --noIntraDepthVariance --minMapQual 20 --minContigLength 500 --outputDepth /mnt/Guanina/lmorales/Public/MetaHiC/data/Metator/alignment/cov.txt /mnt/Guanina/lmorales/Public/MetaHiC/data/Metator/alignment/Pulque_R1_sort.bam /mnt/Guanina/lmorales/Public/MetaHiC/data/Metator/alignment/Pulque_R2_sort.bam
      2. OMP_NUM_THREADS=1 gdb jgi_summarize_bam_contig_depths
        then run within the debugger
        r --noIntraDepthVariance --minMapQual 20 --minContigLength 500 --outputDepth /mnt/Guanina/lmorales/Public/MetaHiC/data/Metator/alignment/cov.txt /mnt/Guanina/lmorales/Public/MetaHiC/data/Metator/alignment/Pulque_R1_sort.bam /mnt/Guanina/lmorales/Public/MetaHiC/data/Metator/alignment/Pulque_R2_sort.bam
    3. run with just 1 bam file at a time (if it only crashes with both bams then they they may be inconsistent somehow)

    4. share your files with me (i.e. Pulque_R1_sort.bam and Pulque_R2_sort.bam), so that I can reproduce the problem and more efficiently track it down.

    I appreciate you patience and help.

  7. Rob Egan

    Hi, actually I was able to replicate this with a slightly more complex test case than what I was using. The quick fix is to leave out the “--noIntraDepthVariance” option. Knowing where it is breaking now, I should be able to track down the problem and issue a fix soon.

    -Rob

  8. Mariana Guerrero reporter

    Thank you very much for your effort and help Rob! I really apretiate it. I already ask to the IT team to update the metabat version and I will try this and let you know

  9. Mariana Guerrero reporter

    Hi Rob, it is succesfully working now !!!
    Thank you very much for all your help and work, I super apretiate it!

  10. Log in to comment