"Number of clusters formed: 0"

Issue #22 resolved
Former user created an issue

Heia, I am trying to run metabat on my metagenomes. The first one I ran worked ... however for my following ones I get "Number of clusters formed: 0".

Could you help me figure out what I am doing wrong?

I used bbmap/bbwrap.sh to make my bam file:

nohup /homes/nesbocam/bbmap/bbwrap.sh ref=HI.1247.001_spades_scaffolds.fasta in=/groups/edwards/camilla/elisse/HI.1247.001_R1_trim_paired.fastq in2=/groups/edwards/camilla/elisse/HI.1247.001_R2_trim_paired.fastq out=HI.1247aln.bam &

runMetaBat.sh HI.1247.001_spades_scaffolds.fasta HI.1247aln.bam

The content of my output:

Executing: 'jgi_summarize_bam_contig_depths --outputDepth HI.1247.001_spades_scaffolds.fasta.depth.txt --pairedContigs HI.1247.001_spades_scaffolds.fasta.paired.txt --minContigLength 1000 --minContigDepth 2 HI.1247aln.bam' at Sat May 20 11:16:26 MDT 2017
Output depth matrix to HI.1247.001_spades_scaffolds.fasta.depth.txt
Output pairedContigs lower triangle to HI.1247.001_spades_scaffolds.fasta.paired.txt
minContigLength: 1000
minContigDepth: 2
Output matrix to HI.1247.001_spades_scaffolds.fasta.depth.txt
Opening bam: HI.1247aln.bam
Consolidating headers
Allocating pairedContigs matrix: 10 MB over 1 threads
Processing bam files
Thread 0 processing: HI.1247aln.bam
Thread 0 finished: HI.1247aln.bam with 29762848 reads and 24098617 readsWellMapped
Creating depth matrix file: HI.1247.001_spades_scaffolds.fasta.depth.txt
Closing most bam files
Creating pairedContigs matrix file: HI.1247.001_spades_scaffolds.fasta.paired.txt
Closing last bam file
Finished
Finished jgi_summarize_bam_contig_depths at Sun May 21 13:50:37 MDT 2017
Creating depth file for metabat at Sun May 21 13:50:37 MDT 2017
Executing: 'metabat --saveTNF saved-HI.1247.001_spades_scaffolds.fasta.depth.txt.TNF --saveDistance saved-HI.1247.001_spades_scaffolds.fasta.depth.txt.distance --inFile HI.1247.001_spades_scaffolds.fasta --outFile HI.1247.001_spades_scaffolds.fasta.metabat-bins- --abdFile HI.1247.001_spades_scaffolds.fasta.depth.txt' at Sun May 21 13:50:37 MDT 2017
[Info] Correlation binning won't be applied since the number of samples (1) < minSamples (10)

Number of clusters formed: 0
Finished metabat at Sun May 21 13:50:40 MDT 2017

Thanks, Camilla
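
One quick thing to check when a run ends with "Number of clusters formed: 0" is whether the depth file contains any contigs that are long enough and have non-zero coverage. The following is only a minimal sketch: it assumes the usual tab-separated layout written by jgi_summarize_bam_contig_depths (one header row, then contigName, contigLen, totalAvgDepth, ...). The function name `count_binnable` and the 2500 bp default are illustrative assumptions, so compare the threshold with the --minContig value your metabat build reports.

```shell
# count_binnable DEPTH_FILE [MIN_LEN]
# Prints how many contigs in the depth table are at least MIN_LEN bp long
# and have a total average depth above zero.
# Assumed columns (tab-separated, one header row):
#   contigName, contigLen, totalAvgDepth, ...
count_binnable() {
    depth_file=$1
    min_len=${2:-2500}   # illustrative default; check your metabat --minContig setting
    awk -F '\t' -v minlen="$min_len" '
        NR > 1 && $2 + 0 >= minlen && $3 + 0 > 0 { n++ }
        END { print n + 0 }
    ' "$depth_file"
}
```

For example, `count_binnable HI.1247.001_spades_scaffolds.fasta.depth.txt` prints the number of candidate contigs. If it prints 0, the problem most likely lies upstream of metabat, in the mapping or in the BAM file.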

Comments (13)

  1. Don Kang

    You need to supply the assembly FASTA file as well, like this:

    runMetaBat.sh <options> assembly.fasta sample1.bam [sample2.bam ...]

  2. Camilla Nesbø

    Thanks for getting back to me. I am supplying the assembly file: HI.1247.001_spades_scaffolds.fasta - or am I misunderstanding you? (my command: runMetaBat.sh HI.1247.001_spades_scaffolds.fasta HI.1247aln.bam)

  3. Shuangfei Zhang

    @Camilla L. Nesbø, I have the same problem as you. You said that you sorted the BAM file; can you tell me the exact command you used? My command is the following: /home-fn/users/nscc1082/software/samtools-1.3.1/samtools sort -l 0 -o file.bam -O bam -n -T file -@ file.sam. However, my file.bam is empty.

  4. Camilla Nesbø

    Hi, I use BBMap. I first make the reference:

    /bbmap/bbmap.sh ref=../final.contigs.fa

    Then, in the same folder, I make the sorted bam file:

    /bbmap/bbmap.sh in=HI.1247.002_R1_trim_paired.fastq in2=HI.1247.002_R2_trim_paired.fastq out=HI.1247.002mapped.bam bs=bs.sh; sh bs.sh &

  5. Rob Egan

    Hi Shuangfei,

    Your samtools sort command is incorrect in several ways:

    1) You must not sort by name! Sort by reference position (i.e. no -n option).
    2) -@ takes a number of threads, not a file name.
    3) -O probably requires 'BAM' (not 'bam'). You don't need to specify it, as BAM is the default.
    4) I believe that samtools sort requires the file to be in BAM format already, not SAM.

    Try this:

    samtools view -Sbu file.sam | samtools sort -@ 8 -o file.bam -
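
To confirm that a BAM file is sorted by reference position rather than by name, one option is to look at the sort order recorded in its header. This is a minimal sketch, and the helper name `is_coordinate_sorted` is made up for illustration. It only reads the SO tag on the @HD header line, so it reports what the header claims; a file whose header lacks that tag fails the check even if the records happen to be in order.

```shell
# is_coordinate_sorted: reads SAM header text on stdin and succeeds (exit 0)
# only if the @HD line declares SO:coordinate.
is_coordinate_sorted() {
    awk -F '\t' '
        $1 == "@HD" {
            for (i = 2; i <= NF; i++) if ($i == "SO:coordinate") ok = 1
        }
        END { exit (ok ? 0 : 1) }
    '
}
```

Typical use would be `samtools view -H file.bam | is_coordinate_sorted && echo "coordinate-sorted"`. A file sorted with -n would normally declare SO:queryname and fail the check.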

  6. Shuangfei Zhang

    @Camilla L. Nesbø @Rob Egan Thank you very much! I am a newbie, and your advice was very helpful. I also have a question for you. In my metagenomic data, I found a distinctive archaeal marker gene, and the archaeon may belong to a new taxonomic group. So I want to ask how to bin a single genome from a metagenome precisely. Thank you again. Looking forward to your reply.

  7. Rob Egan

    Hi Shuangfei, MetaBAT can't bin just one genome; it takes the dataset as a whole and bins all of the contigs as best it can with the information available to it. I recommend running MetaBAT on the whole assembly and then finding the bin or bins that contain your marker genes for follow-up.

  8. Shuangfei Zhang

    @Rob Egan Thank you for your reply. MetaBAT is easy to use. However, I find that some bins are too large, around 20-40 Mb, and the largest bins seem out of control. What do you make of this? I have no idea. Judging by quality-assessment software such as CheckM, these big bins should generally be discarded.

  9. Rob Egan

    Hi Shuangfei,

    There are a few options that you can try in metabat to trade sensitivity against specificity, but I don't have any advice on where to set them relative to the defaults -- every data set behaves differently.

    --maxP arg (=95) Percentage of 'good' contigs considered for binning decided by connection among contigs. The greater, the more sensitive.

    --minS arg (=60) Minimum score of a edge for binning (should be between 1 and 99). The greater, the more specific.

    --maxEdges arg (=200) Maximum number of edges per node. The greater, the more sensitive.

    Additionally, I would advise you to try to annotate the large bins. You may find a partial eukaryote or a few very closely related strains in those large bins.
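
Since no single setting suits every data set, one way to explore the options above is to generate one command per candidate value and compare the resulting bins, for example with CheckM. The sketch below only prints the commands and does not run anything. The function name `print_minS_sweep`, the output prefix pattern and the values 60, 75 and 90 are illustrative assumptions; only the -i, -a, -o and --minS flags come from this thread.

```shell
# print_minS_sweep ASSEMBLY DEPTH_FILE [minS values...]
# Prints (does not run) one metabat command per --minS value.
# File names containing spaces are not quoted in the printed commands.
print_minS_sweep() {
    assembly=$1
    depth=$2
    shift 2
    # Illustrative values; 60 is the default quoted in the option help above.
    [ "$#" -gt 0 ] || set -- 60 75 90
    for s in "$@"; do
        printf 'metabat -i %s -a %s --minS %s -o bin_minS%s\n' \
            "$assembly" "$depth" "$s" "$s"
    done
}
```

Calling `print_minS_sweep assembly.fasta depth.txt` lists three commands to review before running them; the same pattern could be adapted for --maxP or --maxEdges.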

  10. Yue Lou

    Hi,

    I am trying to run metabat but I get the message “[Info] Correlation binning won't be applied since the number of samples (7) < minSamples (10)”.

    This is one of my commands:

    metabat -a L1_007.jgi.depth.txt -i L1_007_000G1_euk_contigs.fa.gz L1_007_000M1_euk_contigs.fa.gz L1_007_029G1_euk_contigs.fa.gz L1_007_061G1_euk_contigs.fa.gz L1_007_091G1_euk_contigs.fa.gz L1_007_122G1_euk_contigs.fa.gz L1_007_242G1_euk_contigs.fa.gz L1_007_365G1_euk_contigs.fa.gz -o bin -t 6

    I have used jgi_summarize_bam_contig_depths to calculate depths for all my samples. I tried both zipped and unzipped fasta files and they gave me the same error. I have also tried to adjust “--minSamples” to 3. When I tried that, it just returns “number of clusters formed: 0”. Could you please help me figure out the problems here?

    Thanks!

    Clare

  11. Rob Egan

    It looks like you've done several different assemblies. MetaBAT does not work that way, as it expects to cluster the sequences from a single assembly. It works best when you have multiple samples that contributed to that assembly, and each BAM file maps the reads from a single sample to that common assembly.
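
A simple sanity check for the single-assembly requirement is to confirm that every contig in the FASTA passed to metabat also appears in the depth file. This is a minimal sketch under stated assumptions: the function name `missing_from_depth` is hypothetical, the FASTA is uncompressed, contig names are taken as the header text up to the first whitespace, and the depth file is tab-separated with one header row and the contig name in the first column.

```shell
# missing_from_depth ASSEMBLY_FASTA DEPTH_FILE
# Prints contig names present in the FASTA but absent from the depth table.
missing_from_depth() {
    fasta=$1
    depth=$2
    names_fa=$(mktemp)
    names_depth=$(mktemp)
    # FASTA name = header text up to the first whitespace, without the leading ">"
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$fasta" | sort -u > "$names_fa"
    # Depth table: skip the header row, take the first tab-separated column
    awk -F '\t' 'NR > 1 { print $1 }' "$depth" | sort -u > "$names_depth"
    # Lines unique to the first (FASTA) list
    comm -23 "$names_fa" "$names_depth"
    rm -f "$names_fa" "$names_depth"
}
```

If `missing_from_depth assembly.fasta depth.txt` prints many names, the BAM files were probably mapped against a different assembly than the one being binned.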
