Levenberg-Marquardt optimizer

Issue #134 new
Geng Sun created an issue

Hello, all,

Can AMP directly use the Levenberg-Marquardt algorithm as an optimizer?

According to this paper, http://dx.doi.org/10.1016/j.commatsci.2015.11.047, the LM algorithm seems to perform better than the L-BFGS algorithm and the gradient-descent method. So I plan to use AMP with the LM algorithm from this package: http://cars9.uchicago.edu/software/python/lmfit/intro.html (which seems to supply a more uniform interface to different optimizers).

But LM is a little different from other optimizers: it needs the loss function to return the residual array instead of the sum of squares. So I looked a little more at the AMP code, and it seems that I need to modify the Fortran code to get the residual array.
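
To make this concrete, here is a minimal sketch of the residual-array convention that lmfit's default Levenberg-Marquardt driver expects (a toy linear model; the parameter names and data are invented for illustration):

    import numpy as np
    from lmfit import Parameters, minimize

    def residual(params, x, data):
        """Return the residual array (data - model), not its sum of squares;
        lmfit's default 'leastsq' method (Levenberg-Marquardt) expects this."""
        model = params['a'].value * x + params['b'].value
        return data - model

    params = Parameters()
    params.add('a', value=1.0)
    params.add('b', value=0.0)

    x = np.linspace(0.0, 1.0, 20)
    data = 2.0 * x + 1.0  # synthetic "observations" for the toy model

    result = minimize(residual, params, args=(x, data), method='leastsq')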

Am I right? If the Fortran code needs to be changed, can you give me some guidance about this? I mean, what other parts of the code will be affected?

Thank you very much.

Geng

Comments (3)

  1. andrew_peterson repo owner

    That sounds interesting.

    @muammar and I were just discussing how it would be better if our code had a more unified interface to the optimizers. That is, if you can wrap any optimizer to look like scipy's fmin_bfgs, it will work. (Actually, this is basically the behavior now; it is just not so well documented.) Our current optimizers need the loss function and the partial derivatives of the loss function with respect to the free parameters. What exactly is the residual array?
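
    To illustrate the wrapping idea, here is a rough sketch (the name fmin_like is invented): anything callable with the loss function, an initial parameter vector, and the gradient via fprime, and returning the optimized parameters, could be dropped in.

        from scipy.optimize import minimize

        def fmin_like(f, x0, fprime=None, **kwargs):
            """Expose an arbitrary optimizer behind an fmin_bfgs-style call:
            scalar loss f, initial parameters x0, gradient fprime; return
            the optimized parameter vector."""
            result = minimize(f, x0, jac=fprime, method='L-BFGS-B')
            return result.x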

    Feel free to make a branch to try it out. Our philosophy is python first, then fortran. That is, we implement everything in python, where it's easy to debug and play with. After we're satisfied, we make a fortran version of the compute-heavy routines and try to make the variables look the same in both the python and fortran versions.

    Also, be sure to use pyflakes and pep8 to make sure your code fits the formatting. Let us know if you have questions.

  2. Geng Sun reporter

    Hello, sorry for the late response; I was on holiday recently.

    Firstly, if I understand it correctly, the residual array for the Levenberg-Marquardt method means the array of differences between predicted and observed values. This interface is the same as that of leastsq in scipy.optimize: https://docs.scipy.org/doc/scipy-0.18.1/reference/generated/scipy.optimize.leastsq.html#scipy.optimize.leastsq

     scipy.optimize.leastsq(func, x0, args=(), Dfun=None, full_output=0, col_deriv=0, ftol=1.49012e-08, xtol=1.49012e-08, gtol=0.0, maxfev=0, epsfcn=None, factor=100, diag=None)

    And func is defined as

    func(params) = ydata - f(xdata, params)
    

    so that the objective function is

      min_params  sum((ydata - f(xdata, params))**2, axis=0)

    So the difference between this method and other optimizers is that the residual energy or force of every image should be returned by the get_loss function, instead of only the total loss.
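
    A rough sketch of what that change could look like (the helper names below are hypothetical stand-ins for AMP internals, not the actual code):

        import numpy as np

        # Hypothetical stand-ins, just to show the shape of the interface; the
        # real code would evaluate the model energy of each training image.
        def predicted_energy(image, params):
            return float(np.dot(image, params))

        def reference_energy(image):
            return float(np.sum(image))

        def get_residuals(params, images):
            """Return one energy residual per image (what LM consumes),
            rather than the summed square loss (what get_loss returns now)."""
            return np.array([predicted_energy(img, params) - reference_energy(img)
                             for img in images])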

    For now I can only use this LM method for training on a single core and without forces, so the system has to be quite small. I still need some time to figure out how to use parallelized training with this method.

  3. Alireza Khorshidi

    If you want to feed the residuals of every single image to the optimizer, then I think you need to modify the calculate_loss method here in python and here in fortran.

    As @andrewpeterson suggested, first try to get a working version in pure python that does what you want, and then implement it in fortran.
