Investigate deadlock with MPI sub comms

Issue #426 invalid
Prof Garth Wells created an issue

See thread at http://fenicsproject.org/pipermail/fenics/2014-November/002075.html. Test code (run on two procs):

import mpi4py.MPI as mpi
import petsc4py.PETSc as petsc

# Set communicator and get process information
comm = mpi.COMM_WORLD
group = comm.Get_group()
size = comm.Get_size()

# Split the two processes into two groups: rank 0 in the first group, rank 1 in the second
rank = comm.Get_rank()
group_comm_0 = petsc.Comm(comm.Create(group.Incl(range(1))))
group_comm_1 = petsc.Comm(comm.Create(group.Incl(range(1,2))))

from dolfin import Expression, UnitSquareMesh, Function, TestFunction, \
     Form, FunctionSpace, dx, CompiledSubDomain, SubSystemsManager

deadlock = True
if not deadlock:
    SubSystemsManager.init_petsc()

if rank == 0:
    e = Expression("4", mpi_comm=group_comm_0)

else:
    mesh = UnitSquareMesh(group_comm_1, 2, 2)
    V = FunctionSpace(mesh, "P", 1)

    # If SubSystemsManager has not initialized PETSc, SLEPc will be
    # initialized when creating the vector in Function. That is done
    # collectively using PETSC_COMM_WORLD == MPI_COMM_WORLD in this
    # example
    u = Function(V)
    v = TestFunction(V)
    Form(u*v*dx)
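
To reproduce, the snippet above can be launched on two processes with something like the following (the script name deadlock_test.py is only a placeholder):

mpirun -np 2 python deadlock_test.py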

Comments (1)

  1. Prof Garth Wells (reporter)

    The issue is that PETSc initialisation is collective, and the code example causes PETSc initialisation to be called on only a subset of processes (when a PETSc object is first created).

    The root problem is the 'auto-magic' initialisation of PETSc (and MPI) in DOLFIN, rather than users managing initialisation manually.

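As a minimal sketch of the manual-initialisation workaround (the same call guarded by the deadlock flag in the test code above), PETSc can be initialised collectively on all ranks before any rank-dependent branch touches a sub-communicator:

import mpi4py.MPI as mpi
from dolfin import SubSystemsManager

comm = mpi.COMM_WORLD

# Initialise PETSc collectively on MPI_COMM_WORLD *before* any
# rank-dependent branch creates a PETSc object on a sub-communicator.
SubSystemsManager.init_petsc()

# Rank-dependent work on sub-communicators can now create PETSc objects
# without triggering a collective initialisation on only a subset of
# processes.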