MPI_Comm argument in compiled extension module

Issue #930 resolved
Francesco Ballarin created an issue

Dear FEniCS developers,

I have recently upgraded to DOLFIN master (as of yesterday, commit a963f16) and I am currently unable to compile any extension module which uses MPI_Comm as an argument, due to a compilation error in SWIG.

The same code was working with master as of July 28, 2017 (commit f1759ee). I have the same environment (same build script, same compilers, same swig). I only upgraded petsc and petsc4py, slepc and slepc4py to more recent maint versions (e.g., upgraded petsc from 74c0405 of July 08, 2017 to d332b86 of September 01, 2017).

I attach a .py file that shows the issue, the relevant part of the compile.log and a diff for PYTHON_wrap.cxx files between the fomer version (which was working) and the current one. I am baffled by the fact that both versions generate the same .i file, but then the resulting PYTHON_wrap.cxx are different.

I tried to look for DOLFIN commits between f1759ee and current master that could have changed this behavior but sadly I was not able to pinpoint the source of the issue. Can you reproduce this? Do you have any suggestion?

Thanks,

Francesco

Comments (8)

  1. Prof Garth Wells

    Instant, which does the JIT, is in a bit of a mess. It relies on custom, non-standard and undocumented formatting of the CMake file of packages that use it. It will be deprecated soon.

    Try specifying the compiler to be mpicxx to make the MPI includes accessible.

  2. Jan Blechta

    I can't reproduce it. I suggest try building DOLFIN in a clean source dir to regenerate SWIG files. You can force that to by cmake/scripts/generate-swig-interface.py. Eventually disable PETSc try again with a clean source dir.

  3. Francesco Ballarin reporter

    Thanks Jan for the suggestion. I always clean up the source dir and always use a clean build directory to avoid issues with SWIG. BTW, am I right to assume that this will not be needed anymore when moving to pybind11?

    Thanks Garth for the suggestion. mpicxx was detected correctly.

    I managed to track down the issue to DOLFINConfig.cmake.

    It contains set(DOLFIN_CXX_DEFINITIONS "-DDOLFIN_VERSION=\"2017.2.0.dev0\";-DNDEBUG;-DDOLFIN_SIZE_T=8;-DDOLFIN_LA_INDEX_SIZE=4;-DHAS_HDF5;-DHAS_SLEPC;-DHAS_PETSC;-DHAS_UMFPACK;-DHAS_CHOLMOD;-DHAS_SCOTCH;-DHAS_PARMETIS;-DHAS_ZLIB;-DHAS_MPI") for the old version,

    while it sets set(DOLFIN_CXX_DEFINITIONS "-DDOLFIN_VERSION=\"2017.2.0.dev0\";") for the most recent one.

    This results in -DHAS_MPI not being passed to SWIG, which in turn results in an incorrect typedef when including dolfin/common/MPI.h

    Garth, is there any chance that this difference is related to at least one of the following commits?

    commit 7e9ceed81e3d59f0e9161ce23f63512ac270e635
    Author: Garth N. Wells <gnw20@cam.ac.uk>
    Date:   Sun Aug 27 09:41:10 2017 +0100
    
        Tidy up CMake build definitions.
    
    commit b7516e655bd66adf30cf31cb78e762805942d6a9
    Author: Garth N. Wells <gnw20@cam.ac.uk>
    Date:   Sat Aug 26 23:52:53 2017 +0100
    
        Fix SWIG problem with compiler defs.
    
    commit be8539e1158065cfeb6ed27c13bde3d7d41b7705
    Author: Garth N. Wells <gnw20@cam.ac.uk>
    Date:   Sat Aug 26 20:26:29 2017 +0100
    
        Compiler definition cleanup.
    
    commit c10d21cba46136636e2c53616c962e064720799f
    Author: Garth N. Wells <gnw20@cam.ac.uk>
    Date:   Sat Aug 26 19:15:20 2017 +0100
    
        CMake work-arounds for Instant.
    
    commit 03311d43e0fe026ba7ce820d3ba48a2e94f5a57f
    Author: Garth N. Wells <gnw20@cam.ac.uk>
    Date:   Sat Aug 26 18:04:27 2017 +0100
    
        Improve CMake handling of include files.
    
    commit 8386a1a4caf4c10be0087acc66bd5bb69a53ac14
    Author: Garth N. Wells <gnw20@cam.ac.uk>
    Date:   Sat Aug 26 17:34:13 2017 +0100
    
        Fix addition of Boost libs to pkg-config file.
    

    Thanks,

    Francesco

  4. Prof Garth Wells

    @francesco_ballarin Yes, there is a good chance one of my changes caused the problem. The issue is that Instant not in good shape and our tests didn't detect an issue. As I mentioned, Instant relies on other projects having a specific, old-fashioned and undocumented CMake file structure so if a test doesn't pick up an issue it's virtually impossible to know if a DOLFIN change will cause a problem.

    We're going to drop Instant after the next release, so I'm not inclined to dig but we'd accept patches.

    What I would recommend is trying out https://bitbucket.org/fenics-project/dolfin/pull-requests/394. I'm happy to help getting that working well for you.

  5. Francesco Ballarin reporter

    Thanks, I will definitely check the new pybind11 interface once its is merged to master.

    I manually fixed my DOLFINConfig.cmake as I mentioned and everything is working now. I agree that it is probably not going to be worth the effort of fixing that if Instant is going to be retired soon. Nevertheless, if you agree I will leave this ticket open for the time being so that my workaround may help other people, and I will close it once Instant is retired.

  6. Log in to comment