Incorrect measure of optimizer progress when using m-estimators

Issue #249 (new)
Michael Bosse created an issue

I believe there is a problem in the way the optimizers (LM, GN, and the like) evaluate the progress of the optimization: they look only at the total weighted error.

When m-estimators are employed (especially non-convex versions such as Cauchy), the total weighted error may increase even when the residuals on the factors decrease, since their effective influence is also being increased. As a result, the optimization algorithms mistakenly throw out iterations (e.g. by increasing lambda in LM) or prematurely declare convergence when the total weighted error will not decrease.
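In symbols (the notation here is mine, not from GTSAM, and it assumes the total weighted error has the usual iteratively-reweighted form), the quantity being compared across iterations is

$$
E(x) \;=\; \tfrac{1}{2}\sum_i w\!\left(\lVert r_i(x)\rVert\right)\,\lVert r_i(x)\rVert^2 ,
$$

so a step from $x_k$ to $x_{k+1}$ is judged by $E(x_{k+1}) - E(x_k)$, a difference taken with two different sets of weights. It is therefore not the decrease of any single fixed objective, and it can be positive even for a step that reduces the residuals.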

The correct way to evaluate the change in error over an iteration is to fix the m-estimator weights using the residuals at the beginning of the iteration, and to compute the nonlinear weighted error after the iteration using those same weights (i.e. not the weights recomputed from the post-iteration residuals). The error with the weights fixed for the iteration should always decrease under proper step control, whereas the reweighted error may not. Convergence can then be gauged by a small enough absolute change in this fixed-weight error.
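To make that step control concrete, here is a minimal sketch (plain C++, not GTSAM code; the Cauchy weight function and the helper names are just illustrative assumptions) of locking the weights at the start of an iteration and judging the step with those same weights:

```cpp
#include <iostream>
#include <vector>

// Hypothetical Cauchy weight function w(r) = k^2 / (k^2 + r^2); any other
// m-estimator weight function could be substituted here.
double cauchyWeight(double r, double k = 1.0) {
  return (k * k) / (k * k + r * r);
}

// Weighted sum of squares with an externally supplied, fixed weight set.
double fixedWeightError(const std::vector<double>& residuals,
                        const std::vector<double>& weights) {
  double e = 0.0;
  for (size_t i = 0; i < residuals.size(); ++i)
    e += 0.5 * weights[i] * residuals[i] * residuals[i];
  return e;
}

int main() {
  // Residuals before and after a candidate step (illustrative numbers only).
  std::vector<double> before = {3.0, 0.5, -2.0};
  std::vector<double> after  = {2.5, 0.4, -1.6};

  // 1. Lock the weights using the residuals at the *beginning* of the iteration.
  std::vector<double> weights;
  for (double r : before) weights.push_back(cauchyWeight(r));

  // 2. Evaluate both errors with the same locked weights.
  const double errorBefore = fixedWeightError(before, weights);
  const double errorAfter  = fixedWeightError(after, weights);

  // 3. Accept the step only if the fixed-weight error decreased; the change in
  //    this error (not the reweighted error) is what convergence tests should use.
  if (errorAfter < errorBefore)
    std::cout << "accept step, delta = " << errorBefore - errorAfter << "\n";
  else
    std::cout << "reject step (e.g. increase lambda in LM)\n";
  return 0;
}
```

With the weights held fixed, the objective is an ordinary weighted least-squares cost, so a properly controlled GN/LM step cannot increase it; that is exactly the property the current reweighted comparison lacks.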

Since the evaluation of the m-estimator weights is buried in the robust noise model class, and the optimization algorithms have no direct access to them, I am uncertain of an easy fix.

My intuition is that this issue is also the root cause of Issue #200.

Comments (2)

  1. Michael Bosse (reporter)

    We would need some way to expose the notion of the weights from the m-estimator through the noise model abstraction. Perhaps all non-m-estimator noise models could have a fixed weight of 1.0, or there could just be a method to lock the current weights given a residual vector. That may not work with the functional programming style, though; in that case one would need a function that transforms the m-estimator noise model into a fixed-weight noise model for the duration of the optimization iteration (a rough sketch of that idea follows below).

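As a rough illustration of the last option in the comment above (transforming the m-estimator noise model into a fixed-weight one for an iteration), here is a hypothetical sketch; none of these class names exist in GTSAM, and the weight() interface is an assumption made for the example:

```cpp
#include <memory>

// Minimal stand-in for a noise model interface: report a scalar weight for a
// given (whitened) residual norm. Plain Gaussian models return 1.0.
struct NoiseModel {
  virtual ~NoiseModel() = default;
  virtual double weight(double residualNorm) const { return 1.0; }
};

// Stand-in for a robust (m-estimator) noise model, here with a Cauchy weight.
struct RobustNoise : NoiseModel {
  double k;
  explicit RobustNoise(double k_) : k(k_) {}
  double weight(double residualNorm) const override {
    return (k * k) / (k * k + residualNorm * residualNorm);
  }
};

// Noise model whose weight was locked from a particular residual and is then
// constant, so it behaves like a plain (non-robust) model for one iteration.
struct FixedWeightNoise : NoiseModel {
  double w;
  explicit FixedWeightNoise(double w_) : w(w_) {}
  double weight(double /*residualNorm*/) const override { return w; }
};

// Functional-style transform: given any noise model and the residual at the
// start of the iteration, produce a fixed-weight model the optimizer can use
// when it measures the change in error over that iteration.
std::shared_ptr<NoiseModel> lockWeights(const NoiseModel& model,
                                        double residualAtIterationStart) {
  return std::make_shared<FixedWeightNoise>(
      model.weight(residualAtIterationStart));
}
```

Having non-m-estimator models default to a weight of 1.0 matches the first suggestion in the comment, and a lockWeights-style transform keeps things functional in style: the original robust noise model is never mutated, only wrapped for the duration of one iteration.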