Segmentation fault in cell.facet_area under certain conditions

Issue #568 resolved
Tormod Landet created an issue

I get a segmentation fault when running cell.facet_area(0) in both FEniCS 1.5 and 1.6

from dolfin import *
#parameters['reorder_dofs_serial'] = False

mesh = UnitSquareMesh(4, 4, 'right')
#V = FunctionSpace(mesh, 'DG', 1)

cell = Cell(mesh, 0)
cell.facet_area(0)

If I uncomment the "V = ..." line the segfault goes away. If I further uncomment the reorder_dofs line it again segfaults.

Comments (23)

  1. Martin Sandve Alnæs

    With latest dev and mesh = UnitSquareMesh(1,1) it also fails.

    Adding mesh.init(1) (which initializes the edges of the mesh) makes it pass, which I believe is hinting at the problem. Can you try that on your installation?

    @logg

  2. Tormod Landet reporter

    Yep, that works.

    I assumed this was related to some data structure not being initialized before building a function space, it was the interplay with reorder_dofs_serial that took me a while to figure out ...

  3. Martin Sandve Alnæs

    Actually it segfaulted for me regardless of reorder_dofs_serial, I think that was just random.

  4. Tormod Landet reporter

    For both me and Miro the following works:

    from dolfin import *
    #parameters['reorder_dofs_serial'] = False
    
    mesh = UnitSquareMesh(4, 4, 'right')
    V = FunctionSpace(mesh, 'DG', 1)
    
    cell = Cell(mesh, 0)
    cell.facet_area(0)
    

    While this segfaults.

    from dolfin import *
    parameters['reorder_dofs_serial'] = False
    
    mesh = UnitSquareMesh(4, 4, 'right')
    V = FunctionSpace(mesh, 'DG', 1)
    
    cell = Cell(mesh, 0)
    cell.facet_area(0)
    

    The latter example is how I found this bug in the first place.

  5. Martin Sandve Alnæs

    I'm not sure how to classify this. The solution is to do mesh.init(1), maybe we just need to document that, and add an assertion to Cell::facet_area.

  6. Tormod Landet reporter

    I looked into this and I think the problem is that cell.entities(1) returns an empty list which the cell implementation (TriangleCell.cpp) tries to use to look up the facet

    // Create facet from the mesh and local facet number
    const Facet f(cell.mesh(), cell.entities(1)[facet]);
    

    Should cell.entities() in MeshEntity.h check to see it the connectivity has been initialized so that some errors can be avoided when it returns an empty set? The current implementation is just

    const unsigned int* entities(std::size_t dim) const
    { return _mesh->topology()(_dim, dim)(_local_index); }
    
  7. Tormod Landet reporter

    This change to MeshEntity.h gives an explanatory error message when asking for non-initialized entities, but other places may rely on the empty result somehow?

    const unsigned int* entities(std::size_t dim) const
        {
          auto& mesh_connectivity = _mesh->topology()(_dim, dim);
          if (mesh_connectivity.empty())
          {
            dolfin_error("MeshEntity.h", "get entities",
              "The mesh connectivity is empty (not initialized). "
              "Run mesh.init(%d) to initialize connectivity", dim);
          }
          return mesh_connectivity(_local_index);
        }
    
  8. Tormod Landet reporter

    Could I get added to the team so that I can make a branch directly in dolfin so that the unit tests are run? I will still work through pull requests. It is just a hassle to keep a fork just to make small fixes like this

  9. Tormod Landet reporter

    Back from vacation now.

    @blechta: I have merged master into the branch to see if this fixes the test failures. When I made the fix three weeks ago it seemed that every branch was failing with some kind of Instant error, so it might have been something unrelated. Lets see.

    I do not think I can change master or next currently so I will ask for someone else to merge it into next (or should I create pull requests towards next and then master?)

  10. Jan Blechta

    I didn't think that such a minimal change needs a PR. That's why I was asking whether you have the access.

    Anyway the failure seems to be really cause by this change. The assertion fails in unit tests and demos. (Instant stderr message on "Job summary" tab is misleading. This one is, I think, expected to happen and caught in a unit test.) Real problem is visible on "Logs" tab, see http://fenics-bamboo.simula.no:8085/browse/DOL-DODO75-PUT-33/log.

    You can follow example at https://bitbucket.org/fenics-project/testing/issues/1 to fetch the build and reproduce the problem with essentially no work.

  11. Jan Blechta

    To the actual problem, maybe you should rather test that you're not dereferencing nullptr or indexing out of bounds. Now you asking whether the connectivity is empty.

  12. Tormod Landet reporter

    The code now checks for null pointer and fixes the one place in the DOLFIN code that relied on getting a null pointer and not using it. At least, that was the only place covered by the tests. I also added a unit test for this issue

  13. Log in to comment