Reduce multiprocessing overhead

Issue #16 new
Jason Vander Heiden created an issue
  1. Remove the alive flag as per https://bitbucket.org/javh/presto/issue/6.
  2. Chunking may help (eg 1,000 sequences at a time).

Comments (3)

  1. Former user Account Deleted

    I’m currently doing the chunking in a wrapper (i.e. using parallel or joblib), but it seems to need at least 3 processes, of which one is CPU bound. Do you have suggestions as to what would be a good ratio of nproc vs. number of parallel runs (e.g. of MaskPrimers and AssemblePairs)?

  2. Jason Vander Heiden reporter

    Yeah, the way it’s set up, the total number of processes equals nproc + 2, where nproc sets the number of worker (computation) processes. The other two processes manage file I/O: one owns reading from the input file and putting data onto the compute queue; the other owns writing to the output files, collecting results from the queue that holds the workers’ output. You should be able to basically ignore these two feeder/collector processes in your allocation, as they aren’t doing much - they just exist to keep the worker processes from mucking up file I/O.
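
    The layout described above can be sketched roughly like this (a minimal illustration of the nproc + 2 pattern, not pRESTO's actual code; all names are made up):

    ```python
    # Sketch of the feeder / workers / collector layout: nproc worker
    # processes plus one reader and one writer. Illustrative only.
    import multiprocessing as mp

    def feeder(in_path, task_queue, nproc):
        # Sole reader of the input file; feeds the compute queue.
        with open(in_path) as handle:
            for line in handle:
                task_queue.put(line.rstrip("\n"))
        for _ in range(nproc):          # one stop sentinel per worker
            task_queue.put(None)

    def worker(task_queue, result_queue):
        # CPU-bound computation; upper-casing stands in for the real work.
        while True:
            item = task_queue.get()
            if item is None:
                result_queue.put(None)  # forward sentinel to the collector
                break
            result_queue.put(item.upper())

    def collector(out_path, result_queue, nproc):
        # Sole owner of the output file; drains results until every
        # worker has signaled completion.
        done = 0
        with open(out_path, "w") as handle:
            while done < nproc:
                item = result_queue.get()
                if item is None:
                    done += 1
                else:
                    handle.write(item + "\n")

    def run(in_path, out_path, nproc=4):
        tasks, results = mp.Queue(maxsize=10000), mp.Queue()
        procs = [mp.Process(target=feeder, args=(in_path, tasks, nproc)),
                 mp.Process(target=collector, args=(out_path, results, nproc))]
        procs += [mp.Process(target=worker, args=(tasks, results))
                  for _ in range(nproc)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
    ```

    Because only the feeder reads and only the collector writes, the workers never touch the files at all, which is the point of the two extra processes.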

    Setting nproc higher than 15-20 (depending upon the task) doesn’t really reduce runtime due to the overhead. I’d have to dig up the old scalability curves to be sure, but I think distributing load across jobs with nproc set to ~8-12 would be quickest.

    Though, in this case, by “chunking” I meant changing how the data is fed into the compute queue within a single execution - i.e., loading chunks of sequences from disk into memory instead of feeding them one at a time, to reduce the inter-process communication overhead.
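
    That kind of chunked feeding could look something like this (a sketch under the assumptions that records arrive as an iterator and that the chunk size is the 1,000 suggested in the issue; `iter_chunks` is a made-up helper):

    ```python
    # Put lists of up to CHUNK_SIZE records on the queue per put() call,
    # instead of one record per call, cutting per-item queue/pickle overhead.
    from itertools import islice

    CHUNK_SIZE = 1000  # illustrative; matches the "eg 1,000" in the issue

    def iter_chunks(records, size=CHUNK_SIZE):
        """Yield successive lists of up to `size` records."""
        it = iter(records)
        while True:
            chunk = list(islice(it, size))
            if not chunk:
                return
            yield chunk

    # Feeder side: one queue.put() per chunk instead of per record.
    #   for chunk in iter_chunks(read_records(in_path)):
    #       task_queue.put(chunk)
    #
    # Worker side: unpack the chunk and process records in a tight loop,
    # then send the results back as one list.
    #   chunk = task_queue.get()
    #   result_queue.put([process(rec) for rec in chunk])
    ```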

    That said, since then I’ve realized that the biggest performance issue is the use of Biopython Seq objects to store everything…
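
    To illustrate why rich record objects hurt here (a self-contained micro-comparison using a stand-in class rather than Biopython itself): every record crossing a multiprocessing queue gets pickled, and an object carrying metadata attributes pickles to many more bytes than a bare tuple.

    ```python
    # Hypothetical stand-in for a rich sequence record (e.g. Biopython's
    # SeqRecord); not Biopython's actual class.
    import pickle

    class Record:
        def __init__(self, id, seq):
            self.id = id
            self.seq = seq
            self.description = ""
            self.annotations = {}

    rich = Record("read1", "ACGTACGTACGT")
    bare = ("read1", "ACGTACGTACGT")

    # The rich object serializes to more bytes than the bare tuple,
    # and that cost is paid once per record, per queue crossing.
    print(len(pickle.dumps(rich)) > len(pickle.dumps(bare)))  # True
    ```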

  3. Former user Account Deleted

    Ok, thanks for your reply! Indeed, on our systems we also see no real benefit beyond 8 cores. By the looks of it, changing how the data is fed in doesn’t seem too complicated (I might take a stab at that on my end to see whether it scales past 8 cores). Thanks!
