map function: different chunksize per node

Issue #165 resolved
cTatu created an issue

Hi. I’m currently using mpi4py to run simulations on a heterogeneous cluster whose nodes have 8, 12, or even 16 cores. I want to use the map function of MPIPoolExecutor to run all the simulations, but a single fixed chunksize won’t keep all the cores busy.

It would be great if the chunksize parameter could also be a dictionary with the rank as key and the chunksize as value. That way, MPIPoolExecutor would send each node exactly as many simulations as it has CPUs.

Example:

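from mpi4py.futures import MPIPoolExecutor

# Proposed mapping: rank -> chunksize (number of cores on that node)
nodes_cores = {
  0: 8,
  1: 12,
  2: 16
}

executor = MPIPoolExecutor(3)
# Proposed usage: chunksize currently accepts only a single integer
for result in executor.map(pow, [2]*32, range(32), chunksize=nodes_cores):
    print(result)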
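
For comparison, a fixed integer chunksize only groups consecutive tasks into batches of equal size. The sketch below illustrates that grouping for the 32 tasks above. It is an illustration of the idea, not mpi4py's actual code, and split_into_chunks is a hypothetical helper name.

```python
from itertools import islice


def split_into_chunks(tasks, chunksize):
    """Yield successive lists of at most `chunksize` items from `tasks`."""
    if chunksize < 1:
        raise ValueError("chunksize must be >= 1")
    iterator = iter(tasks)
    while True:
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            return
        yield chunk


# The 32 (base, exponent) pairs from the example: pow(2, 0) ... pow(2, 31)
tasks = list(zip([2] * 32, range(32)))

# One fixed chunksize for every worker, as map accepts today
chunks = list(split_into_chunks(tasks, chunksize=8))
print([len(chunk) for chunk in chunks])
```

With one batch size for everyone, a 16-core node and an 8-core node receive batches of the same length, which is the limitation described above.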

Comments (3)

  1. Lisandro Dalcin

    I think you do not really understand how MPI works. MPIPoolExecutor(3) will create an executor using 3 processes, not 3 nodes. If you want to use all the cores you have in the 3 compute nodes, you have to use MPIPoolExecutor(35): 8 + 12 + 16 = 36 cores, assuming one core is used for the main, master process. You should run the classical MPI helloworld example first, and learn how to map processes to nodes (that depends on the MPI implementation). After that, you will realize that what you are asking for is not that easy to implement. Moreover, if you do not have many tasks, as in your example, maybe you should not use chunksize at all; you may be optimizing prematurely.
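
    A minimal sketch of such a helloworld check, assuming mpi4py is installed and the script is started through the MPI launcher: every process reports the node it runs on, and rank 0 prints how many processes landed on each node. The helper names workers_for and ranks_per_host are hypothetical and only serve this illustration.

```python
from collections import Counter


def workers_for(cores_per_node, reserved=1):
    """Worker count that fills every core, keeping `reserved` cores free."""
    return sum(cores_per_node.values()) - reserved


def ranks_per_host(hostnames):
    """Count how many MPI processes reported each host name."""
    return dict(Counter(hostnames))


def main():
    try:
        from mpi4py import MPI
    except ImportError:
        print("mpi4py is not installed; nothing to report")
        return
    comm = MPI.COMM_WORLD
    # Every rank reports the node it runs on; rank 0 collects the names.
    hosts = comm.gather(MPI.Get_processor_name(), root=0)
    if comm.Get_rank() == 0:
        for host, count in sorted(ranks_per_host(hosts).items()):
            print(f"{host}: {count} process(es)")


if __name__ == "__main__":
    main()
```

    Started with something like mpiexec -n 36 python followed by the script name, the output shows whether the launcher actually placed 8, 12, and 16 processes on the three nodes. How processes are assigned to hosts is configured in the MPI implementation, not in this script.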
