Running with 11 million contigs
Hello! I'm running MetaBAT2 version 2.12.1 on my data: about 1.2 terabases of sequencing reads, assembled by MEGAHIT into 11 million contigs (14 GB in base length). MetaBAT2 seems to run endlessly. Any advice about the run and its time consumption? The command is "metabat2 -i assembl.fasta -a /output/work_files/metabat_depth.txt -o /output/metabat2_bins/bin -m 1500 -t 16 --unbinned"
Comments (4)
-
reporter -
Hi, Fred.
We don't have a really effective way to help for now. However, I want to mention that "-m 1500" sets the minimum size of a contig for binning (default 2500). Lowering it to 1500 makes the computation much larger, so if possible I advise you to remove this option.
Besides, the -v option gives verbose output, which may help you monitor the progress of binning. I think it's better than nothing.
Good luck~
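A minimal sketch of the rerun suggested above, assuming the same input and output paths as the command in the question. The variable names are only for illustration; the only changes to the command are that -m is dropped (so the default of 2500 applies) and -v is added.

```shell
# Paths copied from the command in the question.
ASSEMBLY="assembl.fasta"
DEPTH="/output/work_files/metabat_depth.txt"
OUT_PREFIX="/output/metabat2_bins/bin"

# -m is omitted so the default minimum contig size (2500) applies;
# -v is added for verbose progress output.
if command -v metabat2 >/dev/null 2>&1; then
  metabat2 -i "$ASSEMBLY" -a "$DEPTH" -o "$OUT_PREFIX" -t 16 --unbinned -v
else
  echo "metabat2 not found on PATH" >&2
fi
```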
-
I concur with this. 11 million contigs will result in upwards of 250 trillion calculations before the clustering starts, which will take some time on even the most powerful computers. We are looking into approaches more efficient than N squared, and into performing the calculations as a cluster / MPI job. If time is a constraint for you, though, increasing the minimum contig size to 2500 would considerably reduce the total number of contigs that need to be compared, and should improve the accuracy of the resulting clusters (at the cost of completeness, obviously).
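To get a rough feel for how much a higher minimum contig size would help, one could count the contigs that pass each threshold and compare the number of pairs. The sketch below is only an illustration: `count_contigs` and `pair_count` are hypothetical helper names, and counting unordered pairs is a stand-in for the N squared growth mentioned above, not a description of how MetaBAT2 counts its calculations internally.

```shell
# count_contigs FASTA MIN_LEN
# Prints how many sequences in FASTA have length >= MIN_LEN.
# Handles sequences wrapped over several lines.
count_contigs() {
  awk -v min="$2" '
    /^>/ { if (seen && len >= min) n++; len = 0; seen = 1; next }
    { len += length($0) }
    END { if (seen && len >= min) n++; print n + 0 }
  ' "$1"
}

# pair_count N
# Prints the number of unordered pairs among N items: N * (N - 1) / 2.
pair_count() {
  echo $(( $1 * ($1 - 1) / 2 ))
}
```

For example, `pair_count "$(count_contigs assembl.fasta 1500)"` and `pair_count "$(count_contigs assembl.fasta 2500)"` can be compared directly. Because the pair count grows quadratically, halving the number of contigs cuts the number of pairs to roughly a quarter.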
-
- changed status to resolved
I'm going to close this issue, but I have opened two related enhancement issues to be scheduled for development work: