map function different chunksize for nodes
Hi. I’m currently using mpi4py to run simulations on a heterogeneous cluster: some nodes have 8 cores, others 12 or even 16. I want to use the map function from MPIPoolExecutor to run all the simulations, but a fixed chunksize parameter won’t use all the cores.
It would be great if the chunksize parameter could be a dictionary with the rank as key and the chunk size as value. That way, MPIPoolExecutor would send each node exactly as many simulations as it has CPUs.
Example:
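from mpi4py.futures import MPIPoolExecutor

# Proposed usage (not currently supported): chunksize as a dict of rank -> chunk size
nodes_cores = {
    0: 8,
    1: 12,
    2: 16,
}
executor = MPIPoolExecutor(3)
for result in executor.map(pow, [2]*32, range(32), chunksize=nodes_cores):
    print(result)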
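For comparison, here is a rough sketch of what a single integer chunksize does to the 32 tasks in the example above. It assumes chunks are formed by taking consecutive groups of at most chunksize argument tuples; chunk_tasks is a hypothetical helper written only for illustration and is not part of mpi4py.

```python
from itertools import islice


def chunk_tasks(args, chunksize):
    """Yield consecutive groups of at most `chunksize` items (hypothetical helper)."""
    it = iter(args)
    while True:
        chunk = list(islice(it, chunksize))
        if not chunk:
            return
        yield chunk


# The 32 (base, exponent) pairs from the example above.
tasks = list(zip([2] * 32, range(32)))

# A single integer chunksize is applied uniformly, whatever the node size.
chunks = list(chunk_tasks(tasks, 8))
chunk_sizes = [len(c) for c in chunks]
print(len(chunks), chunk_sizes)
```

With chunksize=8 the 32 tasks collapse into 4 chunks, so at most 4 workers receive work at a time, no matter how many cores each node has.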
Comments (3)
-
reporter -
I think you do not really understand how MPI works. MPIPoolExecutor(3) will create an executor using 3 processes, not 3 nodes. If you want to use all the cores you have in the 3 compute nodes, you have to use MPIPoolExecutor(35) (8 + 12 + 16 = 36 cores), assuming the remaining core is used for the main, master process. You should run the classical MPI helloworld example first, and learn how to map processes to nodes (that depends on the MPI implementation). After that, you will realize that what you are asking for is not that easy to implement. Moreover, if you do not have that many tasks, as in your example, maybe you should not use chunksize at all; you may be prematurely optimizing things.
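A minimal sketch of the kind of hello-world check suggested above, assuming mpi4py and a working MPI runtime are installed. Each process reports its rank and host, and rank 0 summarizes how many processes landed on each node. The helper ranks_per_node and the main function are illustrative names, not part of mpi4py.

```python
from collections import Counter


def ranks_per_node(hostnames):
    """Count how many MPI processes reported each hostname (illustrative helper)."""
    return dict(Counter(hostnames))


def main():
    # Imported here so the helper above stays usable without an MPI installation.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()
    host = MPI.Get_processor_name()
    print(f"Hello from rank {rank} of {size} on {host}")

    # Collect every rank's hostname on rank 0 to see the process-to-node mapping.
    hostnames = comm.gather(host, root=0)
    if rank == 0:
        for node, count in sorted(ranks_per_node(hostnames).items()):
            print(f"{node}: {count} process(es)")


# Call main() at the end of the script when launching it under an MPI launcher.
```

Saved as, say, hello.py with a final main() call, this could be launched with something like mpiexec -n 4 python hello.py. How the ranks are placed on nodes is decided by the MPI implementation and its hostfile or scheduler settings, not by the Python code.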
reporter - changed status to resolved