- edited description
Inefficient implementation of PETScSNESSOLVER::FormJacobian
Currently the function PETScSNESSolver::FormJacobian
uses an inefficient MatCopy
of matrices after assembling, instead of passing them as reference.
The problem is to let Mat* A
to point to the mat object of (PETScMatrix dA
), which is assembled by nonlinear_problem->form(dA, f, *dx)
.
I report below the current (inefficient) FormJacobian
function. Note that FormJacobian
is called at each iteration of the SNES and for large problem its efficient implementation is important.
PetscErrorCode PETScSNESSolver::FormJacobian(SNES snes, Vec x, Mat* A, Mat* B,
MatStructure* flag, void* ctx)
{
struct snes_ctx_t snes_ctx = *(struct snes_ctx_t*) ctx;
NonlinearProblem* nonlinear_problem = snes_ctx.nonlinear_problem;
PETScVector* dx = snes_ctx.dx;
PETScVector f;
PETScMatrix dA;
// Cast from the PETSc Vec type to dolfin's PETScVector
boost::shared_ptr<Vec> vptr(&x, NoDeleter());
PETScVector x_wrap(vptr);
*dx = x_wrap;
nonlinear_problem->form(dA, f, *dx);
nonlinear_problem->J(dA, *dx);
MatCopy(*dA.mat(), *A, SAME_NONZERO_PATTERN);
if (B != A)
{
MatCopy(*dA.mat(), *B, SAME_NONZERO_PATTERN);
}
*flag = SAME_NONZERO_PATTERN;
return 0;
}
Comments (16)
-
-
- edited description
-
-
reporter A similar, but less penalizing, issue is present in
PETScSNESSolver::FormFunction
where a vector is copied.Another point: if I understand correctly the logic of
nonlinear_problem
,nonlinear_problem-> form(dA, *dx)
duplicates the matrix assembly. May you confirm? -
The description sounds backward. You want
dA
to point to*A
. Best to makenonlinear_problem->form
andnonlinear_problem->J
stateless, just using the matrix that is passed in.Now I see the constructor
explicit PETScMatrix(boost::shared_ptr<Mat> A, bool use_gpu=false);
which makes no sense to me.
Mat
is already a reference counted pointer typetypedef struct _p_Mat* Mat;
(reference counting is internal). Similarly,
boost::shared_ptr<Mat> _Acopy(new Mat, PETScMatrixDeleter());
which is making a heap allocation to hold the pointer (
struct _p_Mat*
).I suggest making the
PETScMatrix
wrapper have a constructor that takes a normalMat
. CallPetscObjectReference
in that constructor and callMatDestroy
(orPetscObjectDereference
) in the destructor. Noboost::shared_ptr<Mat>
. -
reporter I agree. Indeed, I tried using the natural strategy suggested by Jed with
explicit PETScMatrix(boost::shared_ptr<Mat> A, bool use_gpu=false);
but i was unable to let it working. The constructor ofPETScMatrix
fromMat
was unclear to me. But changing this may have various implications in other places and I do not feel confident in proceeding. Note that the problem goes back also toPETScBaseMatrix
and there areboost::shared_ptr<Mat>
or similar everywhere indolfin
. For this reason I was just looking for a workaround forPETScSNESSolver
. I found a solution in theTAO
interface with a constant hessian (seeTAOLinearBoundSolver::__TAOFormHessianQuadraticProblem
).Garth or some other core developer, may you have a look to this?
Jed: A partially related question. Has
SNES
an option to do not call at each iteration the assembly routine when solving a minimization problem for a quadratic functional with bound constraints? It is useless to assemble the hessian of the functional at each iteration when it is constant. -
Yes, getting rid of boost::shared_ptr<Mat> is a bigger task, but I it should be done and it will make these interfaces simpler.
For your second question, see:
http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagJacobian.html
-
reporter Thanks. Regarding the second question, the documentation appears quite cryptic to me. In the case above (quadratic problem with bound constraints), should I set
-snes_lag_jacobian -2
or something like-snes_lag_jacobian 1000
? I will add this as an explicit parameter in the fenics interface. -
I've started looking into this issue, following Jed suggestion on getting rid of all
boost::shared_ptr
and on using instead internal petsc reference counter. Per se, the issue is not difficult to address, but it involves several files and lines of code, so it could potentially introduce new bugs.At the moment I've ported
PETSc{Vector,Matrix,SNESSolver}
to the new "reference counting" strategy, andForm{Function,Jacobian}
don't need a copy anymore, with an exception: sinceFormFunction
takes aVec
(and not aVec*
) for the residual, when a reset_sparsity occurs a copy cannot be avoided, since a VecCreate is called. ForFormJacobian
, the argument is aMat*
, so it doesn't matter.Next week I will provide a preliminary pull request on this issue.
Another question: when
Form{Function,Jacobian}
are called, the state solution is also provided. Generally this is not used byNonlinearProblem
, since theForm
s have a reference of it. In the former implementation,SNESSolver
was copying the state into the solution:struct snes_ctx_t snes_ctx = *(struct snes_ctx_t*) ctx; // Cast from the PETSc Vec type to dolfin's PETScVector boost::shared_ptr<Vec> vptr(&x, NoDeleter()); PETScVector x_wrap(vptr); *dx = x_wrap; // <== operator= calls a VecCopy
I think this is not efficient, and also misleading. Suggestions?
-
I don't see the reason to not use shared_ptr. One can supply a custom deleter, if required, and one can unwrap the shared_ptr. I would like to minimise PETSc syntax because it's pretty ugly and follows a design different from what we strive for.
-
Here is my rationale: since we want to wrap some data (in this case a PETSc object, which is just a pointer), the main structure should be:
template <typename T> struct PETScWrapper { public: // [...] private: T __obj; // or boost::scoped_ptr<T> __obj; // or C++11 std::unique_ptr<T> __obj; };
In my opinion, a
shared_ptr
doesn't make any sense here because the wrapped object is mine and I don't want to share it. If I really have to share it, I'll give you a copy of it and, for this particular case, it's not very expensive (as I said, a pointer).When I initialise the object,
<T>Create
is called, and when I have to destroy it,<T>Destroy
is called. The class is opaque in this respect, and the ugly PETSc syntax is hidden.Other behaviours:
- copy constructor: if a deep copy is requested, there is no problem, since the two PETSc objects are different. When the copy is shallow, the value of
__obj
is the same, soPetscObjectReference
must be called; - construction from raw T: as before, deep copy doesn't harm, while then shallow copy is the same as before;
- conversion from Wrapper to T:
__obj
is returned.
The only criticality is the last point but, since the user is requesting the raw object, she is also leaving DOLFIN wonderland, and she has to be careful. Indeed, when the wrapper is destroyed, the previously returned object becomes invalid, unless
PetscObjectReference
is called. On the other hand, I don't see how ashared_ptr
can heal these issues. - copy constructor: if a deep copy is requested, there is no problem, since the two PETSc objects are different. When the copy is shallow, the value of
-
Change of plans. I figured out that a switch from
shared_ptr
to raw data can be harmful to merge, since it would yield several changes in everyPETSc...
-related class (but I still consider a bad design choice the usage ofshared_ptr
in this context).So I propose a
shared_ptr
-based solution with the attached patch. This also fix the assertion raised when areset_sparsity
is not performed.It is still open how to deal with the preconditioner (this is not only limited to
SNES
): a possibility is to introduced a virtual method in theNonlinearProblem
class.diff --git a/dolfin/nls/PETScSNESSolver.cpp b/dolfin/nls/PETScSNESSolver.cpp index d29a262..34dbb98 100644 --- a/dolfin/nls/PETScSNESSolver.cpp +++ b/dolfin/nls/PETScSNESSolver.cpp @@ -39,7 +39,6 @@ #include <dolfin/common/timing.h> #include <dolfin/common/Timer.h> - using namespace dolfin; // Utility function @@ -397,18 +396,17 @@ PetscErrorCode PETScSNESSolver::FormFunction(SNES snes, Vec x, Vec f, void* ctx) NonlinearProblem* nonlinear_problem = snes_ctx.nonlinear_problem; PETScVector* dx = snes_ctx.dx; - PETScMatrix A; PETScVector df; - // Cast from the PETSc Vec type to dolfin's PETScVector - boost::shared_ptr<Vec> vptr(&x, NoDeleter()); - PETScVector x_wrap(vptr); + boost::shared_ptr<Vec> x_ptr(reference_to_no_delete_pointer(x)); + PETScVector x_wrap(x_ptr); *dx = x_wrap; // Compute F(u) - nonlinear_problem->form(A, df, *dx); - nonlinear_problem->F(df, *dx); + PETScMatrix A; + nonlinear_problem->form(A, df, x_wrap); + nonlinear_problem->F(df, x_wrap); VecCopy(*df.vec(), f); @@ -422,24 +420,60 @@ PetscErrorCode PETScSNESSolver::FormJacobian(SNES snes, Vec x, Mat* A, Mat* B, NonlinearProblem* nonlinear_problem = snes_ctx.nonlinear_problem; PETScVector* dx = snes_ctx.dx; - PETScVector f; - PETScMatrix dA; + boost::shared_ptr<Mat> A_ptr(reference_to_no_delete_pointer(*A)); + PETScMatrix A_wrap(A_ptr); + A_ptr.reset(); - // Cast from the PETSc Vec type to dolfin's PETScVector - boost::shared_ptr<Vec> vptr(&x, NoDeleter()); - PETScVector x_wrap(vptr); + boost::shared_ptr<Vec> x_ptr(reference_to_no_delete_pointer(x)); + PETScVector x_wrap(x_ptr); *dx = x_wrap; - nonlinear_problem->form(dA, f, *dx); - nonlinear_problem->J(dA, *dx); + PETScVector f; + nonlinear_problem->form(A_wrap, f, x_wrap); + nonlinear_problem->J(A_wrap, x_wrap); + + if (*A != *A_wrap.mat()) + { + // Matrix has been resetted during assembly + MatDestroy(A); + PetscObjectReference(PetscObject(*A_wrap.mat())); + *A = *A_wrap.mat(); + *flag = DIFFERENT_NONZERO_PATTERN; + } + else + { + *flag = SAME_NONZERO_PATTERN; + } + + if (*A != *B) + { + boost::shared_ptr<Mat> B_ptr(reference_to_no_delete_pointer(*B)); + PETScMatrix B_wrap(B_ptr); + B_ptr.reset(); + + // Uncomment these two lines when a proper + // preconditioner constructor is provided, + // and comment out the third one. + + // nonlinear_problem->Jprec(B_wrap, x_wrap); + // Mat B_out(*B_wrap.mat()) - MatCopy(*dA.mat(), *A, SAME_NONZERO_PATTERN); - if (B != A) + Mat B_out(*A_wrap.mat()); + + if (*B != B_out) + { + MatDestroy(B); + PetscObjectReference(PetscObject(B_out)); + *B = B_out; + } + } + else { - MatCopy(*dA.mat(), *B, SAME_NONZERO_PATTERN); + MatDestroy(B); + PetscObjectReference(PetscObject(*A_wrap.mat())); + *B = *A_wrap.mat(); } - *flag = SAME_NONZERO_PATTERN; return 0; }
-
- changed milestone to 1.4
-
- changed version to dev
-
- changed status to resolved
Fix Issue
#132.This commit removes the copying, but shows up the design flaw that is GenericTensor::resize. GenericTensor::resize should be remove.
→ <<cset c24aa5ceaafc>>
-
- removed milestone
Removing milestone: 1.4 (automated comment)
- Log in to comment