Parallelize sampling with MPI

Some of the cosmosis samplers can run in parallel using MPI. Depending on your system and problem, this can greatly reduce the time a run takes to complete. For other samplers, parallelism does not shorten the run but instead serves as a check that convergence has taken place.

Parallelism types

The emcee, grid, snake, and multinest samplers use intrinsically parallel algorithms, so MPI speeds them up considerably.

The pymc and metropolis samplers can run several chains in parallel. This does not speed up convergence, but it does make convergence easier to diagnose.

The other samplers are serial.

Running a sampler in parallel

To run a sampler in parallel you must have mpi4py installed, which in turn requires an MPI environment that includes mpicc and mpif90 (the automatic installation includes all of these).
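Before launching a parallel run, it can help to confirm that the prerequisites listed above are visible from your environment. The following is a minimal sketch, not part of cosmosis: the function name `check_mpi_prerequisites` is hypothetical, and including `mpirun` in the check is an assumption based on the launch command used on this page. It only tests whether the package and tools can be found, not whether they work together.

```python
import importlib.util
import shutil


def check_mpi_prerequisites():
    """Return a dict mapping each prerequisite to True if it can be found.

    Hypothetical helper for illustration; it does not verify that the
    MPI installation and mpi4py were built against each other.
    """
    # find_spec returns None when the package is not importable.
    status = {"mpi4py": importlib.util.find_spec("mpi4py") is not None}

    # shutil.which returns None when the executable is not on the PATH.
    for tool in ("mpicc", "mpif90", "mpirun"):
        status[tool] = shutil.which(tool) is not None

    return status
```

Printing the result, for example with `print(check_mpi_prerequisites())`, shows at a glance which pieces are missing.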

For all of these samplers, you can run in parallel on, for example, four processes like this:

#!bash

mpirun -n 4 cosmosis --mpi params.ini

Maximum process numbers

For emcee, the largest number of MPI processes you can use without leaving some idle is nwalkers/2 + 1.

For grid samplers you can use any number up to the total number of points to be sampled, i.e. nsample_dimension^n_dimension.

For snake, the number varies throughout the run, so you may sometimes have idle cores. The most cores used at once is n_dimension * 2, but it will typically be about half that.
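The limits above can be written as a small calculation. This is a sketch only: `max_useful_processes` is a hypothetical helper, not a cosmosis function, and it assumes nwalkers is even so that integer division gives the stated nwalkers/2 + 1. For snake it returns the stated upper bound, which the text notes is usually not reached.

```python
def max_useful_processes(sampler, nwalkers=None,
                         nsample_dimension=None, n_dimension=None):
    """Largest process count that the limits above allow (illustrative)."""
    if sampler == "emcee":
        # nwalkers/2 + 1, assuming an even number of walkers.
        return nwalkers // 2 + 1
    if sampler == "grid":
        # Total number of grid points: nsample_dimension^n_dimension.
        return nsample_dimension ** n_dimension
    if sampler == "snake":
        # Upper bound only; typical usage is about half of this.
        return 2 * n_dimension
    raise ValueError(f"no limit is given for sampler {sampler!r}")
```

For example, `max_useful_processes("emcee", nwalkers=64)` gives 33, and `max_useful_processes("grid", nsample_dimension=10, n_dimension=3)` gives 1000.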

The emcee, grid, and snake samplers use a master-slave setup for sampling, so you can typically make effective use of one more process than you have cores. For multinest you should not use more processes than cores.
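One way to turn this rule into a value for `mpirun -n` is sketched below. The helper `suggested_process_count` and its parameters are hypothetical. It assumes that `os.cpu_count()` is an acceptable stand-in for the number of cores (it reports logical CPUs, which may differ from physical cores), and it covers only the four samplers named in the rule.

```python
import os

# Samplers described above as using a master-slave setup.
MASTER_SLAVE_SAMPLERS = {"emcee", "grid", "snake"}


def suggested_process_count(sampler, n_cores=None, max_useful=None):
    """Suggest a process count from the core-count rule (illustrative)."""
    if n_cores is None:
        # Assumption: logical CPU count approximates the core count.
        n_cores = os.cpu_count() or 1

    if sampler in MASTER_SLAVE_SAMPLERS:
        limit = n_cores + 1   # one more process than cores
    elif sampler == "multinest":
        limit = n_cores       # no more processes than cores
    else:
        raise ValueError(f"no rule is given for sampler {sampler!r}")

    # Optionally cap by a sampler-specific maximum, if one is known.
    if max_useful is not None:
        limit = min(limit, max_useful)
    return max(1, limit)
```

With 8 cores this suggests 9 processes for emcee and 8 for multinest; passing `max_useful=5` lowers the emcee suggestion to 5.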
