PETScMatrix::init hangs in parallel for rectangular matrices.
This is a leftover of issue #86.
The attached test case hangs in parallel because the process that owns all of the rows assumes
the matrix is serial, while the other processes assume it is parallel.
I think that the (bogus) test at PETScMatrix.cpp:124
if (row_range.first == 0 && row_range.second == M)
can be fixed in two ways:
1) by also checking the columns, i.e.
if (row_range.first == 0 && row_range.second == M && col_range.first == 0 && col_range.second == N)
2) by following Garth's advice given at the end of https://bitbucket.org/fenics-project/dolfin/pull-request/49/fix-issue-86-dolfin-sparsitypattern-apply/diff, i.e. changing it into
int comm_size;
MPI_Comm_size(sparsity_pattern.mpi_comm(), &comm_size);
if (comm_size == 1)
I can make a pull request if one of the two solutions is acceptable.
Comments (8)
-
This may be the source of other deadlocks, where one process continues past a collective matrix operation while the others wait.
-
@MarcoMorandini Did you make a pull request with your fix for this issue?
-
It's here:
https://bitbucket.org/fenics-project/dolfin/pull-request/174/fix-issue-392/diff
However, I still get an error:

Traceback (most recent call last):
  File "pippo.py", line 12, in <module>
    lD = assemble(LD)
  File "/home/martinal/opt/fenics/dorsal-dev-1410/lib/python2.7/site-packages/dolfin/fem/assembling.py", line 203, in assemble
    assembler.assemble(tensor, dolfin_form)
RuntimeError:
*** -------------------------------------------------------------------------
*** DOLFIN encountered an error. If you are not able to resolve this issue
*** using the information listed below, you can ask for help at
***
***     fenics@fenicsproject.org
***
*** Remember to include the error message listed below and, if possible,
*** include a *minimal* running example to reproduce the error.
***
*** -------------------------------------------------------------------------
*** Error:   Unable to complete call to function init().
*** Reason:  Assertion _local_range[_primary_dim].second > _local_range[_primary_dim].first failed.
*** Where:   This error was encountered inside ../../dolfin/la/SparsityPattern.cpp (line 107).
*** Process: unknown
***
*** DOLFIN version: 1.4.0+
*** Git changeset:  0685c8f1f92e7d743347d7e012e895ec10d4119d
*** -------------------------------------------------------------------------
-
@garth-wells I guess some places assume that each process has a nonzero number of rows?
-
Found a fix.
-
- changed status to resolved
-
- removed milestone
Removing milestone: 1.5 (automated comment)
-
Now that objects store a communicator, use dolfin::MPI::size(...) to get the number of processes.