MPI convenience functions

Issue #27 wontfix
Jacob Hinkle created an issue

I am using the following code in one of my projects. It's pretty general and could probably be shipped with PyCA (or should it be?)

from mpi4py import MPI
import PyCA.Core as ca

def _reduceOverMPI(A, hA, op=MPI.SUM):
    """
    Reduce a PyCA Image3D or Field3D over MPI.

    A can live anywhere, but hA needs to be of mType MEM_HOST.
    """
    if A.memType() == ca.MEM_HOST:
        # host memory: reduce in place without using hA
        MPI.COMM_WORLD.Allreduce(MPI.IN_PLACE, A.asnp(), op=op)
    else:
        # device memory: download, reduce on the host, then upload
        assert hA.memType() == ca.MEM_HOST  # make sure we can actually use hA
        ca.Copy(hA, A)  # download
        MPI.COMM_WORLD.Allreduce(MPI.IN_PLACE, hA.asnp(), op=op)
        ca.Copy(A, hA)  # upload
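
For example (hypothetical variable names: sumIm is this node's partial sum, possibly living on the GPU, hScratch is a MEM_HOST scratch image of the same grid, and nLocal is the local subject count):

# every rank ends up holding the elementwise sum of sumIm over all ranks
_reduceOverMPI(sumIm, hScratch, op=MPI.SUM)

# plain scalars can go through mpi4py's object-level reduction directly
nTotal = MPI.COMM_WORLD.allreduce(nLocal, op=MPI.SUM)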

This is used, for example, to sum an image or vector field across nodes. The other thing I do is spawn mpiexec from my script when I detect that the user's configuration requests MPI support with a certain number of processes. See the following code:

if compute.useMPI:
    # detect whether this process was already started under MPI; if not, do that now
    if 'OMPI_COMM_WORLD_LOCAL_RANK' not in os.environ:
        # limit processes to the number of study files we have
        nproc = min(compute.numProcesses,
                    len(study.subjectFiles))
        print("Spawning " + str(nproc) + " MPI processes...")
        # TODO: use mpi4py for spawning instead of mpiexec
        # comm = MPI.COMM_SELF.Spawn(sys.executable,
        #                            args=sys.argv, maxprocs=nproc)
        CMD = "mpiexec -x PYTHONPATH -n " + str(nproc) + " --host \"" +\
              ','.join(compute.hostList) + "\" " + ' '.join(sys.argv)
        print(CMD)
        os.system(CMD)
        sys.exit(0)

This just detects whether we're already inside an mpiexec call and, if not, spawns mpiexec with the exact command line we already supplied. This could be wrapped up as

if compute.useMPI:
    ca.MPI.Spawn(min(compute.numProcesses,
                     len(study.subjectFiles)),
                 compute.hostList)
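
A rough sketch of what such a Spawn helper could look like, just wrapping the mpiexec re-exec idiom above (the OpenMPI-specific environment check and -x flag are carried over as-is; prepending sys.executable is a small portability tweak, and none of this is an existing PyCA API):

import os
import subprocess
import sys

def Spawn(nproc, hostList):
    """Re-exec this script under mpiexec unless we are already an MPI rank."""
    # OpenMPI sets this for every spawned rank, so its presence means we
    # are already running under mpiexec and can just continue
    if 'OMPI_COMM_WORLD_LOCAL_RANK' in os.environ:
        return
    cmd = ['mpiexec', '-x', 'PYTHONPATH', '-n', str(nproc),
           '--host', ','.join(hostList),
           sys.executable] + sys.argv  # relaunch this script with its args
    print("Spawning " + str(nproc) + " MPI processes...")
    subprocess.call(cmd)
    sys.exit(0)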

There are probably other, more complicated patterns we'll use as well. @nikhilsingh probably knows more about this than I do. For now, I plan to split my own MPI-specific stuff into an MPI.py and try it out. If it's useful we can then copy it into PyCA.contrib or something maybe?

Comments (3)

  1. Sam Preston

    Yeah, this is great -- we shouldn't make people manage the mpi and asnp calls if they don't need to. I think this could be included in PyCA.MPI or something similar. I also really like spawning the MPI jobs from python, it just means fewer shell scripts.

  2. Jacob Hinkle reporter

    I have an example in PDiff using something like this. It would be a little nicer and more cross-platform to spawn using mpi4py directly instead of using an os call to mpiexec. I looked into this before and couldn't figure out how to supply the host list and the number of processes that way. The whole idiom where we decide whether we're using MPI and then either split and run a worker, or just run a single worker (that is, the stuff surrounding the first code block above), is also general and could be invoked with something like

    Compute(_runWorker, compute, study)
    

    I'm just not exactly sure how to build in the logic to subdivide the workload (subjectFiles in the Atlas.py example) across the available nodes in a consistent way. Basically, when I do this I would also want to assign

    nodeFiles = study.subjectFiles[mpiRank::mpiSize]
    

    but actually, maybe this is just done inside the worker function. Then, if we are only given an mpiSize of 1, the slice will not change nodeFiles.

    What's really sweet is that this makes writing MPI code trivial. You write your worker function, which should need only minimal tweaks from single-threaded code, then run it using a unified Compute function. I'm going to put this Compute function, along with some computing configs, into my MPI.py (which I'll also rename to Compute.py). Stay tuned.
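
    Roughly, what I have in mind is the sketch below (just a sketch: Spawn is the hypothetical re-exec helper from above, the mpiRank/mpiSize keyword names are mine, and the real Compute.py may end up looking different):

    from mpi4py import MPI

    def Compute(workerFn, compute, study):
        """Run workerFn under MPI if requested, otherwise as a single process.

        Sketch only: compute and study are the config objects from the
        Atlas.py example.
        """
        if compute.useMPI:
            Spawn(min(compute.numProcesses, len(study.subjectFiles)),
                  compute.hostList)  # re-exec under mpiexec if not already
            rank = MPI.COMM_WORLD.Get_rank()
            size = MPI.COMM_WORLD.Get_size()
        else:
            rank, size = 0, 1
        workerFn(compute, study, mpiRank=rank, mpiSize=size)

    def _runWorker(compute, study, mpiRank=0, mpiSize=1):
        # each node takes every mpiSize-th subject starting at its own rank;
        # with mpiSize == 1 this slice is just the full list
        nodeFiles = study.subjectFiles[mpiRank::mpiSize]
        # ... per-node work on nodeFiles goes here ...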

  3. Jacob Hinkle reporter

    I ran up against this spawning issue again. I'm using Compute.py from my AppUtils, which spawns using the mpiexec ... stuff from above, which is not portable. On Windows it just craps out at that os.system call. I've created an issue over on the AppUtils tracker for this, so the discussion can continue over there. I'm marking this issue closed here as a result.
