Reduce blocking overhead of --checkpoint option

Issue #129 invalid
Rob Egan created an issue

Currently, when checkpointing is active, Contigs.dump_contigs() is called. This is a blocking routine: it is timed with a BarrierTimer and calls the synchronous dist_ofstream close().

Modify this function to return the future from dist_ofstream::async_close(), and let the next stage wait on that future for completion.

This would allow better interleaving of I/O with computation and networking. The (blocking) serialization cost of writing the contigs should be significantly less than the cost of waiting for the whole file to finish writing to disk. The memory overhead should be limited to a few MB of private memory and Rendezvous buffers per rank while the data is on the wire or buffered in I/O operations.
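To make the proposed control flow concrete, here is a minimal sketch of the pattern using only the C++ standard library. It is not the MHM2 code: std::future and std::async stand in for the future that dist_ofstream::async_close() would return, and the Contig struct, dump_contigs_async() and run_next_stage() are hypothetical names invented for illustration. The output format is also just a placeholder.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <future>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a contig record; not the MHM2 type.
struct Contig {
  long id;
  std::string seq;
};

// Serialize on the caller's thread (the short, blocking part), then hand the
// buffer to a background task that writes and closes the file. The returned
// future plays the role of the one the issue proposes to obtain from
// dist_ofstream::async_close(); std::async is only an assumption made here
// to keep the sketch self-contained.
std::future<bool> dump_contigs_async(const std::vector<Contig> &ctgs, const std::string &fname) {
  std::ostringstream buf;
  for (const auto &c : ctgs) buf << ">Contig" << c.id << "\n" << c.seq << "\n";
  std::string data = buf.str();
  return std::async(std::launch::async, [data = std::move(data), fname]() {
    std::ofstream out(fname, std::ios::binary);
    out << data;
    out.close();
    return !out.fail();  // true if the open, write and close all succeeded
  });
}

// Hypothetical next stage: it does work that does not depend on the
// checkpoint file, and only then waits for the write to complete.
bool run_next_stage(std::future<bool> checkpoint_done) {
  assert(checkpoint_done.valid());
  // ... computation and communication would overlap with the file I/O here ...
  return checkpoint_done.get();
}
```

A caller would write something like `auto fut = dump_contigs_async(ctgs, "contigs.fasta");` and pass `std::move(fut)` to the next stage, so only the serialization is paid for up front. In this sketch the memory overhead is the serialized buffer held until the write finishes; the actual overhead in a distributed setting depends on how dist_ofstream buffers data.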
