SCOTCH mesh partitioner has side effects
Subsequent calls to UnitSquareMesh(4, 4) produce different meshes (with the SCOTCH partitioner). On the other hand, dofmap reordering (using SCOTCH) seems to be fine. The following script demonstrates the problem:
from __future__ import print_function
from dolfin import *

parameters.mesh_partitioner = 'SCOTCH'
comm = mpi_comm_world()
rank = MPI.rank(comm)

def create_mesh():
    mesh = UnitSquareMesh(4, 4)
    return mesh.num_vertices(), mesh.num_cells()

def create_dofmap(mesh=None):
    if not mesh:
        mesh = UnitSquareMesh(4, 4)
    V = FunctionSpace(mesh, "P", 1)
    return (mesh.num_vertices(), mesh.num_cells(),
            mesh.topology().hash(), mesh.geometry().hash(),
            V.dofmap().ownership_range())

def test_determinism(func):
    result0 = func()
    while True:
        result1 = func()
        diff = int(result0 != result1)
        if MPI.max(comm, diff) > 0:
            print(rank, result0, result1)
            break

def main():
    print(rank, "Test mesh build")
    test_determinism(create_mesh)
    print()
    MPI.barrier(comm)

    print(rank, "Test mesh and dofmap build")
    test_determinism(create_dofmap)
    print()
    MPI.barrier(comm)

    print(rank, "Test dofmap build with static mesh")
    mesh = UnitSquareMesh(4, 4)
    test_determinism(lambda: create_dofmap(mesh=mesh))
    print()
    MPI.barrier(comm)

if __name__ == '__main__':
    main()
Output on my system:
1 Test mesh build
0 Test mesh build
2 Test mesh build
0 (11, 10) (11, 10)
2 (12, 11) (13, 11)
1 (12, 11) (11, 11)
2 Test mesh and dofmap build
1 Test mesh and dofmap build
0 Test mesh and dofmap build
2 (12, 11, 16345232928620356316L, 8317758475742265463, (16, 25)) (12, 11, 15156240723649878619L, 7318551913629007991, (16, 25))
1 (12, 11, 8462581566930303203L, 6060639746735424119, (8, 16)) (12, 11, 13550905085782729940L, 4625243689556973335, (7, 16))
0 (11, 10, 4607311713341590251L, 2348913127836622316, (0, 8)) (11, 10, 482017684130753681L, 4193610146276079084, (0, 7))
0 Test dofmap build with static mesh
2 Test dofmap build with static mesh
1 Test dofmap build with static mesh
and the program hangs (the last test never fails).
With parameters.mesh_partitioner = 'ParMETIS' the problem disappears (the program hangs in the first test, i.e. that test never fails). The dof reordering library does not play a role here.
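Because test_determinism in the script above loops until a mismatch appears, a deterministic build shows up only as a hang. A bounded variant could look like the sketch below. The names find_nondeterminism, max_calls and any_rank_differs are hypothetical and not part of DOLFIN; the default any_rank_differs=bool is a serial stand-in, and under MPI one would presumably pass something like lambda d: MPI.max(comm, int(d)) > 0, mirroring the reduction used in the script.

```python
def find_nondeterminism(func, max_calls=100, any_rank_differs=bool):
    """Call func up to max_calls times and compare each result with the first.

    Returns (call_index, first_result, differing_result) for the first
    mismatch, or None if every call reproduced the first result.
    any_rank_differs is a hypothetical hook that turns the local "differs"
    flag into a global one; the default is a serial no-op reduction.
    """
    reference = func()
    for call_index in range(1, max_calls):
        result = func()
        # All ranks must reach this reduction on every iteration.
        if any_rank_differs(reference != result):
            return call_index, reference, result
    return None


if __name__ == "__main__":
    import itertools

    counter = itertools.count()
    # Stand-in for a mesh build: the result changes on the fourth call.
    print(find_nondeterminism(lambda: next(counter) // 3))
    # A stable stand-in exhausts max_calls and reports None.
    print(find_nondeterminism(lambda: (11, 10), max_calls=5))
```

With this shape, a run that never diverges terminates after max_calls builds and reports None instead of hanging, which is easier to use in an automated test.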
Comments (14)
-
reporter -
reporter There is a bug in SCOTCH_randomReset(), which is supposed to reset the pseudorandom number generator. The bug appears in (at least) SCOTCH 6.0.0-6.0.3; SCOTCH 6.0.4 is fixed. Buggy versions can be fixed by compiling with -DCOMMON_RANDOM_SYSTEM. We can ask for this to be fixed in PETSc (either a version bump or the -DCOMMON_RANDOM_SYSTEM workaround) or switch the default mesh partitioner to ParMETIS. Opinions @garth-wells, @chris_richardson?
-
We should ask PETSc to bump the SCOTCH version.
-
reporter -
reporter PETSc fixed this in maint. Let's wait for the release of 3.7.4, bump to that version in the docker dev image, and consider this fixed.
Maybe we could add a SCOTCH version check and switch the default mesh partitioner to ParMETIS when SCOTCH is buggy.
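A minimal sketch of such a fallback, under stated assumptions: the SCOTCH version is available to the caller as a dotted string (how to obtain it is not covered here), only the 6.0.0-6.0.3 range named in this thread is treated as buggy (earlier releases may be affected too), and the library was not built with -DCOMMON_RANDOM_SYSTEM. The names parse_version and choose_mesh_partitioner are hypothetical, not DOLFIN API.

```python
# Version range reported in this thread as having the SCOTCH_randomReset() bug.
FIRST_BUGGY = (6, 0, 0)
LAST_BUGGY = (6, 0, 3)


def parse_version(text):
    """Turn a dotted version string such as '6.0.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def choose_mesh_partitioner(scotch_version):
    """Return 'ParMETIS' for a SCOTCH version in the buggy range, else 'SCOTCH'."""
    version = parse_version(scotch_version)
    if FIRST_BUGGY <= version <= LAST_BUGGY:
        return "ParMETIS"
    return "SCOTCH"


if __name__ == "__main__":
    for candidate in ("6.0.0", "6.0.3", "6.0.4"):
        print(candidate, choose_mesh_partitioner(candidate))
```

The returned string could then be assigned to parameters.mesh_partitioner, as in the script in the description.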
-
Is the SCOTCH bug causing a problem for us?
-
reporter I think it is. The problem is described in the description above.
-
@blechta What does it break?
-
reporter SCOTCH partitioning has side effects (it creates different topologies on subsequent mesh generations). It does not currently break anything, but it is undesired behaviour. It is dangerous, for example, in combination with #720. It was in fact a reason for random failures in test_p31_box_2, and we haven't been able to debug this for a long time.
-
- changed status to closed
Closing this since the updated SCOTCH in PETSc resolves the issue.
-
reporter - changed status to open
Let's keep this open until 3.7.4 is used in docker images (our default distribution method).
-
reporter @jackhale, could you close this once you bump to 3.7.4 in docker images?
-
- changed status to closed
PETSc version bumped in https://bitbucket.org/fenics-project/docker/commits/a334ff53ef30890a989992ccbddb1cea4e6b9876.
-
reporter Cool