Add more progress / parallelism to long-running stages

Issue #108 resolved
Rob Egan created an issue

Running a huge data set (50 GB FASTA file and 7.6 GB depths file) shows very long file-reading times and long spans without any messages:

MetaBAT 2 (v2.15-4-ga101cde) using minContig 2500, minCV 1.0, minCVSum 1.0, maxP 95%, minS 60, maxEdges 200 and minClsSize 200000. with random seed=1603241415
[00:00:00] Executing with 64 threads
[00:00:00] Parsing abundance file
[00:12:09] Parsing assembly file
[00:29:49] Number of large contigs >= 2500 are 2366349.
[00:30:01] Reading abundance file
[00:40:50] Finished reading 50455338 contigs and 9 coverages from final_assembly_depths.txt
[00:40:55] Number of target contigs: 2332240 of large (>= 2500) and 10467802 of small ones (>=1000 & <2500).
[00:40:55] Start TNF calculation. nobs = 2332240
[00:40:58] Finished TNF calculation.
[00:40:58] Attempt 0 of 10 to gen_tnf_graph_sample
[00:41:51] Preparing TNF Graph Building [pTNF = 94.6; 2373 / 2500 (P = 94.92%) round 15]
[00:41:51] Attempt 1 of 10 to gen_tnf_graph_sample
[00:42:45] Preparing TNF Graph Building [pTNF = 94.9; 2363 / 2500 (P = 94.52%) round 17]
[00:42:46] Attempt 2 of 10 to gen_tnf_graph_sample
[00:43:41] Preparing TNF Graph Building [pTNF = 95.0; 2343 / 2500 (P = 93.72%) round 16]
[00:43:41] Attempt 3 of 10 to gen_tnf_graph_sample
[00:44:34] Preparing TNF Graph Building [pTNF = 94.6; 2374 / 2500 (P = 94.96%) round 17]
[00:44:34] Attempt 4 of 10 to gen_tnf_graph_sample
[00:45:27] Preparing TNF Graph Building [pTNF = 94.7; 2367 / 2500 (P = 94.68%) round 17]
[00:45:27] Attempt 5 of 10 to gen_tnf_graph_sample
[00:46:21] Preparing TNF Graph Building [pTNF = 95.3; 2348 / 2500 (P = 93.92%) round 15]
[00:46:21] Attempt 6 of 10 to gen_tnf_graph_sample
[00:47:15] Preparing TNF Graph Building [pTNF = 94.8; 2362 / 2500 (P = 94.48%) round 18]
[00:47:15] Attempt 7 of 10 to gen_tnf_graph_sample
[00:48:09] Preparing TNF Graph Building [pTNF = 94.4; 2359 / 2500 (P = 94.36%) round 19]
[00:48:09] Attempt 8 of 10 to gen_tnf_graph_sample
[00:49:03] Preparing TNF Graph Building [pTNF = 94.9; 2360 / 2500 (P = 94.40%) round 16]
[00:49:03] Attempt 9 of 10 to gen_tnf_graph_sample
[00:49:56] Preparing TNF Graph Building [pTNF = 95.0; 2351 / 2500 (P = 94.04%) round 15]
[00:49:56] Finished Preparing TNF Graph Building [pTNF = 94.61]
[09:26:53] Finished Building TNF Graph (187343054 edges) [44.5Gb / 125.8Gb]
[09:26:54] Applying coverage correlations to TNF graph with 187343054 edges
[09:32:14] Traversing graph with 2332240 nodes and 187343054 edges
[09:32:17] Building SCR Graph and Binning (221563 vertices and 246671 edges) [P = 9.50%; 41.9Gb / 125.8Gb]
[09:32:29] Building SCR Graph and Binning (443126 vertices and 624767 edges) [P = 19.00%; 41.9Gb / 125.8Gb]
[09:32:56] Building SCR Graph and Binning (664689 vertices and 1142910 edges) [P = 28.50%; 41.9Gb / 125.8Gb]
[09:33:45] Building SCR Graph and Binning (886252 vertices and 1817369 edges) [P = 38.00%; 41.9Gb / 125.8Gb]
[09:34:41] Building SCR Graph and Binning (1107814 vertices and 2632070 edges) [P = 47.50%; 41.9Gb / 125.8Gb]
[09:36:10] Building SCR Graph and Binning (1329377 vertices and 3643663 edges) [P = 57.00%; 41.9Gb / 125.8Gb]
[09:38:17] Building SCR Graph and Binning (1550941 vertices and 5059658 edges) [P = 66.50%; 41.9Gb / 125.8Gb]
[09:42:34] Building SCR Graph and Binning (1772503 vertices and 7454357 edges) [P = 76.00%; 41.9Gb / 125.8Gb]
[09:51:43] Building SCR Graph and Binning (1994066 vertices and 13349537 edges) [P = 85.50%; 41.9Gb / 125.8Gb]
[10:07:49] Building SCR Graph and Binning (2168151 vertices and 29645615 edges) [P = 95.00%; 42.3Gb / 125.8Gb]
[22:25:20] Binning lost contigs...
[22:26:00] Binning small contigs...
[23:02:54] 0.01% (1467167 bases) of large (>=2500) contigs were re-binned out of small bins (<200000).
[23:02:54] Rescuing singleton large contigs
[23:02:54] There are 322 bins already
[23:02:54] Outputting bins
[23:08:43] 93.54% (10443775446 bases) of large (>=2500) and 0.27% (41502746 bases) of small (<2500) contigs were binned.
322 bins (10485278192 bases in total) formed.
[23:08:43] Finished

(Still going; will update on completion.)
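
To make the silent spans easier to spot, the elapsed-time stamps in a log like the one above can be turned into per-stage gaps. This is only a rough sketch and not part of MetaBAT: the helper names `parse_log`, `longest_gaps` and `fmt` are made up for illustration, and it assumes every progress line starts with an `[HH:MM:SS]` elapsed-time prefix as shown in the log. The sample uses the first six timestamped lines of the run, copied verbatim.

```python
import re

# Matches the "[HH:MM:SS] message" prefix seen in the log above (assumed format).
LINE_RE = re.compile(r"^\[(\d+):(\d{2}):(\d{2})\]\s*(.*)$")


def parse_log(lines):
    """Return (elapsed_seconds, message) for each line that starts with a timestamp."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            h, mi, s, msg = m.groups()
            events.append((int(h) * 3600 + int(mi) * 60 + int(s), msg))
    return events


def longest_gaps(events, top=3):
    """Return the `top` largest (gap_seconds, message_printed_before_the_gap) pairs."""
    gaps = [(nxt[0] - cur[0], cur[1]) for cur, nxt in zip(events, events[1:])]
    return sorted(gaps, reverse=True)[:top]


def fmt(seconds):
    """Format a number of seconds as HH:MM:SS."""
    h, rem = divmod(seconds, 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"


# First six timestamped lines of the log above.
sample = [
    "[00:00:00] Executing with 64 threads",
    "[00:00:00] Parsing abundance file",
    "[00:12:09] Parsing assembly file",
    "[00:29:49] Number of large contigs >= 2500 are 2366349.",
    "[00:30:01] Reading abundance file",
    "[00:40:50] Finished reading 50455338 contigs and 9 coverages from final_assembly_depths.txt",
]

if __name__ == "__main__":
    for gap, msg in longest_gaps(parse_log(sample)):
        print(f"{fmt(gap)} with no output after: {msg}")
```

Feeding the whole log through `parse_log` instead of `sample` ranks every stage the same way; lines without a timestamp prefix are skipped.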

Comments (3)
