Bad convergence of optimization problem

Issue #7 resolved
Florian Bruckner created an issue

Hello,

I hope this is the correct place to ask some questions about PDE-constrained optimization using adjoint methods.

I am using an adjoint method for PDE-constrained optimization. A simple gradient descent implementation suggests that the calculated gradient is correct, since the correct minimum is reached after about 10000 iterations.
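For illustration, a minimal sketch of such a sanity-check loop (with a toy quadratic standing in for the actual PDE-constrained objective — the real `grad` would come from the adjoint solve): if the gradient is correct, even this crude loop eventually reaches the minimum, just very slowly on badly conditioned problems.

```python
import numpy as np

# Toy badly conditioned quadratic F(m) = 0.5 m^T A m in place of the PDE objective.
A = np.diag([1.0, 50.0])

def grad(m):
    # Stand-in for the adjoint-computed gradient.
    return A @ m

m = np.ones(2)
step = 1.0 / 50.0            # <= 1/L for stability (L = largest eigenvalue of A)
for _ in range(10000):
    m = m - step * grad(m)   # plain steepest descent

print(np.linalg.norm(m))     # close to 0: the minimizer m = 0 is reached
```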

Using the BFGS algorithm unfortunately leads to very poor convergence behavior. Additionally, the line search fails to find a proper step after about 100 iterations, and the method returns a relatively bad solution. Perhaps this is due to a relatively flat region in the energy landscape?

How can this situation be improved?

I found some papers about optimal preconditioners for PDE-constrained problems, but they are applied to the complete (dense) system, and I am not sure how they can be incorporated into an adjoint method. Is this possible somehow?

At the moment I am using SciPy's BFGS method. Might the Moola solvers provide better performance?

Thanks for any advice. Greetings, Florian

Comments (8)

  1. Simon Funke

    Just a quick answer for now (I'm on a flight).

    Do you use higher-order discretisations for your control, or an unstructured mesh? In that case you should give Moola a shot.

    Otherwise you cannot expect too much from switching to Moola. You could try some of the L-BFGS-B options in SciPy. In particular, the length of the memory can have an impact on the performance.
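    A sketch of how the memory length is set in SciPy (the `maxcor` option of L-BFGS-B controls the number of stored correction pairs; the quadratic below is just a toy stand-in for the real objective):

```python
import numpy as np
from scipy.optimize import minimize

# Toy ill-conditioned quadratic in place of the PDE-constrained objective.
A = np.diag([1.0, 10.0, 100.0, 1000.0])

def f(x):
    return 0.5 * x @ A @ x

def grad(x):
    return A @ x

x0 = np.ones(4)
# 'maxcor' is the L-BFGS memory: how many correction pairs are kept to
# approximate the inverse Hessian (default 10; larger can help on stiff problems).
res = minimize(f, x0, jac=grad, method="L-BFGS-B",
               options={"maxcor": 20, "maxiter": 500, "gtol": 1e-10})
print(res.success, res.fun)
```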

  2. Florian Bruckner reporter

    Thanks for your fast reply.

    We are simulating a cube meshed with Gmsh (so the mesh should not be too bad) and linear elements. I just tried some Moola optimizers and played with some of the options, but the results are qualitatively the same: most of the time the line search fails after a few hundred iterations, and without a line search convergence is even worse. I tried SciPy's BFGS (not the memory-limited L-BFGS-B version) and also most of the other SciPy optimizers (with various options), but all of them show bad convergence.

    It seems that either our gradient (calculated via an adjoint method) is inaccurate, or the optimization problem is badly conditioned. I tried to eliminate the first possibility by using a simple gradient descent method, which finally converges to acceptable accuracy, but needs about 10000 iterations.

    Concerning the conditioning of the problem, I am not sure how to apply a preconditioner with the mentioned optimization methods, and what a proper preconditioner should look like.

    Greetings, Florian

  3. Simon Funke

    Hi Florian,

    > I tried to eliminate the first possibility by using a simple gradient descent method, which finally converges to acceptable accuracy, but needs about 10000 iterations.

    It would be worth performing a Taylor test to check the correctness of the gradient/Hessian computation (see here).
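    The idea of the Taylor (remainder) test, sketched with a toy functional in place of the PDE-constrained one: if dF is the exact gradient, the first-order Taylor remainder |F(m + h d) − F(m) − h dF(m)·d| must shrink like O(h²), i.e. halving h should reduce it by a factor of about 4 (convergence rate 2). A wrong gradient only gives rate 1.

```python
import numpy as np

def F(m):
    # Toy functional standing in for the reduced PDE-constrained objective.
    return np.sum(np.sin(m) ** 2)

def dF(m):
    # Its exact gradient (in practice: the adjoint-computed gradient).
    return 2.0 * np.sin(m) * np.cos(m)

rng = np.random.default_rng(0)
m = rng.standard_normal(5)   # expansion point
d = rng.standard_normal(5)   # fixed perturbation direction

hs = [1e-2 / 2 ** k for k in range(4)]
remainders = [abs(F(m + h * d) - F(m) - h * (dF(m) @ d)) for h in hs]
rates = [np.log2(remainders[k] / remainders[k + 1]) for k in range(3)]
print(rates)  # all close to 2 for a correct gradient
```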

    > Concerning the conditioning of the problem, I am not sure how to apply a preconditioner with the mentioned optimization methods, and what a proper preconditioner should look like.

    This depends on your concrete optimisation problem, and it might be quite difficult to construct a good preconditioner. Can you explain in more detail which optimisation problem you are solving?

  4. Florian Bruckner reporter

    Hello Simon,

    Sorry for the long delay. The problem we are trying to solve is the inverse magnetostatic strayfield problem (PDE: Δu = -div(M); objective: F = ||grad(u) - H*|| → min). I found the paper "Optimal Solvers for PDE-Constrained Optimization", where a saddle-point problem is defined and solved by an iterative MINRES solver. Unfortunately, we solve the forward problem by means of an FEM-BEM coupling method. Thus, in contrast to the paper, we only have the operator u(M) and its adjoint available, instead of the individual (sparse) matrices.

    Nevertheless, I think it should be possible to derive a suitable preconditioner for our system, but I am a bit confused about how to apply the preconditioner together with the available optimizers (e.g. scipy.optimize). There seem to be no arguments for setting a preconditioner (as is standard for most linear solvers). How can I apply the preconditioner manually (do I have to provide the preconditioned gradient P^-1 grad(F))? Or is it advantageous to use a solver instead of an optimizer for this kind of problem?
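    For a first-order method, applying a preconditioner manually does indeed amount to using the direction -P^-1 grad(F) instead of -grad(F); a minimal sketch of this idea (toy quadratic in place of the reduced objective, a simple diagonal P as an assumed preconditioner):

```python
import numpy as np

# Badly conditioned toy "reduced Hessian" in place of the real operator.
A = np.array([[1.0, 0.5, 0.0],
              [0.5, 100.0, 2.0],
              [0.0, 2.0, 10000.0]])
P = np.diag(np.diag(A))      # assumed preconditioner: diagonal of A

def grad(m):
    return A @ m             # stand-in for the adjoint-computed gradient

m = np.ones(3)
for _ in range(50):
    # Preconditioned steepest descent: step along -P^{-1} grad(F).
    m = m - 0.9 * np.linalg.solve(P, grad(m))

print(np.linalg.norm(m))     # near 0 despite the bad conditioning of A
```

    Unpreconditioned steepest descent on the same problem would need a step size of order 1e-4 and many thousands of iterations; here P^-1 A is close to the identity, so the iteration contracts quickly.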

    Thanks again. Greetings, Florian

  5. Simon Funke

    Hi Florian,

    Did you get any further with this? As far as I know, the scipy.optimize algorithms do not allow setting a preconditioner in their solvers. You might want to look into the TAO solver, where you can define a custom PETSc matrix as a preconditioner for the linear solver (Table 4.4 in http://www.mcs.anl.gov/petsc/petsc-current/docs/tao_manual.pdf). Finally, the Moola optimisation solvers allow you to set a custom inner product for the control variables (instead of the default l2 inner product), which can be interpreted as choosing a particular preconditioner.
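    The inner-product remark can be made concrete with a small sketch (generic NumPy, not Moola's actual API): the "gradient" is the Riesz representative of the derivative dF, which in the l2 inner product is dF itself, but in the inner product <u, v>_M = u^T M v becomes M^-1 dF. Supplying a custom SPD matrix M is therefore the same as preconditioning the l2 gradient with M^-1.

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 3.0]])           # assumed SPD inner-product matrix
dF = np.array([1.0, -4.0])           # some l2 gradient (derivative)

# Riesz representative of dF in the M inner product: solve M g = dF.
g_M = np.linalg.solve(M, dF)

# Defining property of the gradient: <g_M, v>_M == dF . v for every v.
v = np.array([0.7, -0.2])
print(np.isclose(g_M @ M @ v, dF @ v))  # True
```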

    Best wishes,

    Simon

  6. Florian Bruckner reporter

    Hello again,

    I will have a look at the PETSc optimizers you suggested. I recently figured out that our gradient fails the Taylor remainder test. It seems there is a systematic error which does not affect the convergence of the gradient descent method, but makes the more sophisticated optimizers fail.

    Thanks again for your advice. I will close this issue. Greetings, Florian

  7. Simon Funke

    OK, fixing your gradient should have the highest priority. This might resolve the issues described before.
