Ghosted meshes do not get refined/marked correctly

Issue #876 new
johwing created an issue

There seems to be an issue with the direction of the facet normals on interior facet integration in 3D meshes when dolfin is run in parallel.

(For 2D meshes, this issue can be alleviated by specifying parameters['ghost_mode']='shared_facet' (or 'shared_vertex') as answered in question: 'Interior facet integration not working properly in parallel'. )

However, when I run a 3D mesh in parallel it seems that the interior facet integration (i.e. dS) does not have the cell index information necessary to define the facet normal directions '+'/'-' across the interface in a global manner. The normal direction are defined using a cell marker for the mesh with a "dummy" volume integration part in the form, i.e. Constant(0)*dx(99, domain=mesh, subdomain_data=cell_f) as specified in the assembly script help. The random directions of the facet normals typically occurs if the mesh is partitioned (when in parallel) such that at least a part of the interior facet is on the edge of the mesh for a parallel processor and setting the 'ghost_mode' parameters to either facet or vertex does not seem to fix this issue for 3D meshes in a reliable manner.

I have created the following minimum working example that can mimic the issue using mpiexec -np 2 python code_below.py:

from dolfin import *
parameters['ghost_mode'] = 'shared_facet' 

mesh = BoxMesh(Point(0, 0, 0), Point(1, 1, 1), 12,4,3)
mesh = refine(mesh) 

facet_f = FacetFunction('size_t', mesh, 0)
CompiledSubDomain('near(x[0], 0.5)').mark(facet_f, 1)

cell_f = CellFunction('size_t', mesh, 0)
CompiledSubDomain('x[0] - 0.5 + 1e-6 >= 0').mark(cell_f, 1)

n = FacetNormal(mesh)

form = n('-')[0]*dS(domain=mesh, subdomain_data=facet_f, subdomain_id=1) + \
Constant(0)*dx(99, domain=mesh, subdomain_data=cell_f) 
value = assemble(form)
print('normal -n_x (this should be equal to 1.0):', value)

Where instead of the expected value 1.0, I find 0.66666. Interestingly, the above code produces the expected result if I use 'shared_vertex' instead. However, then the surface integral result is 0.833333 when using four cores which also is wrong.

Note, without the refine(mesh) the issue is only partly reproducible when the code is run in parallel. Then, depending on the number of processors used, the surface integral of the facet normal x-component alternates between -1 and +1. As I have specified the cell indexes to 0 and 1 to the left and right of the interface, I did expect the result to alternate. However, for more advanced 3D meshes with complex interior boundaries, this issue becomes apparent rather often.

I am currently using Fenics 2017.1 installed on a local Ubuntu machine. The issue was also noted in Fenics 2016.2 before I upgraded. I have recently posted a question on the Q&A board, but as this may be a bug I thought an issue might be a more suitable.

Comments (2)

  1. Jan Blechta

    The issue seems to be pretty random. I get

    jan@gott:/vivid/home/jan$ mpirun -n 1 python3 issue876.py 
    Number of cells increased from 864 to 6912 (700.0% increase).
    normal -n_x (this should be equal to 1.0): 0.9999999999999986
    jan@gott:/vivid/home/jan$ mpirun -n 2 python3 issue876.py 
    Process 0: Number of cells increased from 864 to 6912 (700.0% increase).
    Process 1: Number of cells increased from 864 to 6912 (700.0% increase).
    normal -n_x (this should be equal to 1.0): 0.9999999999999986
    normal -n_x (this should be equal to 1.0): 0.9999999999999986
    jan@gott:/vivid/home/jan$ mpirun -n 3 python3 issue876.py 
    Process 0: Number of cells increased from 864 to 6912 (700.0% increase).
    Process 1: Number of cells increased from 864 to 6912 (700.0% increase).
    Process 2: Number of cells increased from 864 to 6912 (700.0% increase).
    normal -n_x (this should be equal to 1.0): 0.9999999999999986
    normal -n_x (this should be equal to 1.0): 0.9999999999999986
    normal -n_x (this should be equal to 1.0): 0.9999999999999986
    jan@gott:/vivid/home/jan$ mpirun -n 4 python3 issue876.py 
    Process 2: Number of cells increased from 864 to 6912 (700.0% increase).
    Process 3: Number of cells increased from 864 to 6912 (700.0% increase).
    Process 1: Number of cells increased from 864 to 6912 (700.0% increase).
    Process 0: Number of cells increased from 864 to 6912 (700.0% increase).
    normal -n_x (this should be equal to 1.0): 0.6666666666666659
    normal -n_x (this should be equal to 1.0): 0.6666666666666659
    normal -n_x (this should be equal to 1.0): 0.6666666666666659
    normal -n_x (this should be equal to 1.0): 0.6666666666666659
    

    with shared facet and

    jan@gott:/vivid/home/jan$ mpirun -n 1 python3 issue876.py 
    Number of cells increased from 864 to 6912 (700.0% increase).
    normal -n_x (this should be equal to 1.0): 0.9999999999999986
    jan@gott:/vivid/home/jan$ mpirun -n 2 python3 issue876.py 
    Process 0: Number of cells increased from 864 to 6912 (700.0% increase).
    Process 1: Number of cells increased from 864 to 6912 (700.0% increase).
    normal -n_x (this should be equal to 1.0): 0.8333333333333321
    normal -n_x (this should be equal to 1.0): 0.8333333333333321
    jan@gott:/vivid/home/jan$ mpirun -n 3 python3 issue876.py 
    Process 0: Number of cells increased from 864 to 6912 (700.0% increase).
    Process 1: Number of cells increased from 864 to 6912 (700.0% increase).
    Process 2: Number of cells increased from 864 to 6912 (700.0% increase).
    normal -n_x (this should be equal to 1.0): 0.9999999999999986
    normal -n_x (this should be equal to 1.0): 0.9999999999999986
    normal -n_x (this should be equal to 1.0): 0.9999999999999986
    jan@gott:/vivid/home/jan$ mpirun -n 4 python3 issue876.py 
    Process 2: Number of cells increased from 864 to 6912 (700.0% increase).
    Process 1: Number of cells increased from 864 to 6912 (700.0% increase).
    Process 0: Number of cells increased from 864 to 6912 (700.0% increase).
    Process 3: Number of cells increased from 864 to 6912 (700.0% increase).
    normal -n_x (this should be equal to 1.0): 1.0000000000000002
    normal -n_x (this should be equal to 1.0): 1.0000000000000002
    normal -n_x (this should be equal to 1.0): 1.0000000000000002
    normal -n_x (this should be equal to 1.0): 1.0000000000000002
    

    with shared vertex.

  2. Jan Blechta

    The problem might be either in refinement or marking. Those facilities might not be ready for usage with ghosted meshes. Indeed, commenting out mesh = refine(mesh) line, the answer is correct for both ghost modes on one to eight processes.

    Preliminary assigning to @chris_richardson.

  3. Log in to comment