Possibly incorrect computation of steepest descent direction

I suspect that the computation of the steepest descent direction used for Powell’s dog leg method in iSAM2 (in GTSAM 4.0.0) is incorrect. My problem is planar, with 2 vehicles using odometry measurements and a range measurement between them, and I often get the error “Warning: Dog leg stopping because cannot decrease error with minimum delta” when I take a range measurement. I get much more reasonable results when I just use the GaussNewtonOptimizer instead, but I would like to use the dog leg optimizer to potentially improve convergence speed, especially since the book “Factor Graphs for Robot Perception” (Dellaert and Kaess, 2017) states that “Powell’s dog leg can provide improved efficiency, and, as we will later see, is essential when incrementally updating matrix factorizations”.

When I enable verbose output of the dog leg optimizer, I can see that even very close to the linearization point (with delta approaching 1e-5 or less), the model function increases in the steepest descent direction (even though the nonlinear error function actually decreases). Normally, the model function should decrease in the steepest descent direction, but the nonlinear function may or may not decrease depending on the quality of the model. This implies that the model is a poor approximation and perhaps incorrect.
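
To make that expectation concrete, here is a short derivation under assumed notation that is not taken from GTSAM: a Gauss-Newton model m(δ) = ½‖Aδ + b‖², where A is the whitened measurement Jacobian and b is the whitened error vector at the linearization point, so that the model gradient at δ = 0 is g = A'b. Under those assumptions, a step of length α along -g gives:

```latex
\begin{aligned}
m(\delta) &= \tfrac{1}{2}\,\lVert A\delta + b \rVert^2, \qquad g = \nabla m(0) = A^\top b, \\
m(-\alpha g) &= \tfrac{1}{2}\lVert b \rVert^2 - \alpha\,\lVert g \rVert^2 + \tfrac{1}{2}\alpha^2\,\lVert A g \rVert^2 .
\end{aligned}
```

So m(-αg) < m(0) whenever 0 < α < 2‖g‖²/‖Ag‖², which covers every sufficiently small step as long as g is nonzero. A model value that keeps increasing for arbitrarily small steps would therefore suggest that the direction being followed is not -g for the model being evaluated.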

The document trustregion.pdf, included in the GTSAM doc directory, states that the gradient, g, is computed as R'*(R*x-d), where R is the upper triangular factor from a QR factorization of the measurement Jacobian A, x is a column vector of the states, and d=Q'*b, where b is the column vector of measurement errors (normalized by standard deviations). I can see how this is derived by essentially taking the derivative of the model function with respect to x, but it ignores the fact that R is also a function of x (this may not be the best explanation, but something doesn’t seem right here). The correct gradient, as given in Section 2.5.1 of the 2017 book or in Dennis and Schnabel (1996), is g=A'*b. If GTSAM actually implements what is stated in trustregion.pdf, it may be incorrect.
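
For comparing the two expressions, the following identity may help. It assumes only a thin QR factorization A = QR with Q'Q = I and d = Q'b, and it makes no claim about what x denotes in trustregion.pdf or about what GTSAM actually implements.

```latex
\begin{aligned}
R^\top R &= R^\top Q^\top Q R = A^\top A, \qquad R^\top d = R^\top Q^\top b = A^\top b, \\
R^\top (R x - d) &= A^\top A\,x - A^\top b .
\end{aligned}
```

Written this way, the trustregion.pdf expression differs from A'b by the term A'Ax and by the sign in front of A'b (at x = 0 it reduces to -A'b), so the two can only be compared once the meaning of x and the sign convention for b are pinned down.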

I have constructed a small Matlab example that provides further evidence. I did some of the math by hand, so it may contain errors.

% The purpose of this script is to numerically investigate the validity of the 
% steepest descent direction used for gtsam. The scenario is a planar problem
% with 2 vehicles using odometry and a range measurement. The simulation 
% propagates one step in time, and heading is assumed to be known exactly.

% Measurements and standard deviations
sigma_dx = 0.01;
sigma_dy = 0.01;
sigma_rho = 0.01;
dx1 = 1;
dy1 = 0;
dx2 = 1;
dy2 = 0;
rho = 0.99;  % range measurement between vehicles at time 2

% Linearization point
x1 = [0 1];  % x position of vehicle 1 at times 1 and 2
y1 = [0 0];
x2 = [0 1];
y2 = [1 1];

% Range model
rho_hat = sqrt((x2(2)-x1(2))^2 + (y2(2)-y1(2))^2);

% Measurement error vector (predicted minus measured, normalized by standard deviation)
b = [(x1(2) - x1(1) - dx1)/sigma_dx;
     (y1(2) - y1(1) - dy1)/sigma_dy;
     (x2(2) - x2(1) - dx2)/sigma_dx;
     (y2(2) - y2(1) - dy2)/sigma_dy;
     (rho_hat - rho)/sigma_rho];

% Measurement Jacobian
A = [1/sigma_dx, 0, 0, 0;
     0, 1/sigma_dy, 0, 0;
     0, 0, 1/sigma_dx, 0;
     0, 0, 0, 1/sigma_dy;
     -(x2(2)-x1(2))/rho_hat/sigma_rho, -(y2(2)-y1(2))/rho_hat/sigma_rho, (x2(2)-x1(2))/rho_hat/sigma_rho, (y2(2)-y1(2))/rho_hat/sigma_rho];

% Gradient according to Dennis and Schnabel (1996), or Dellaert and Kaess (2017)
A'*b

% QR factorization of the measurement Jacobian
[Q,R]=qr(A,0);
d=Q'*b;

% Gradient according to trustregion.pdf
R'*(R*[x1(2);y1(2);x2(2);y2(2)]-d)

The two gradients do not agree. The first one gives a gradient of [0;-100;0;100], and the second gives [10000;-9900;10000;19900]. The first one makes sense to me. Since the range measurement would have to be 1 to agree with the odometry, it makes sense to move the vehicles closer to each other in the y direction (along the negative gradient) to reduce the error in the range estimate. If I set rho=1, the gradient given by the first method is indeed zero (as it should be), but the second one is not.
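
As an independent cross-check of the first gradient, the sketch below differentiates the nonlinear cost numerically with central differences. It is written in plain Python rather than Matlab so it can be run standalone, and it rests on assumptions that are not from GTSAM: the cost is taken to be half the sum of squared whitened residuals, the time-1 positions are held fixed, and the names `residuals`, `cost`, and `numeric_gradient` are hypothetical helpers.

```python
import math

SIGMA = 0.01  # same standard deviation for odometry and range, as in the script


def residuals(p, rho):
    """Whitened residuals for the time-2 positions p = [x1, y1, x2, y2].

    Time-1 positions are fixed at (0, 0) for vehicle 1 and (0, 1) for vehicle 2,
    and both odometry measurements are (dx, dy) = (1, 0).
    """
    x1, y1, x2, y2 = p
    rho_hat = math.hypot(x2 - x1, y2 - y1)  # predicted range at time 2
    return [
        (x1 - 0.0 - 1.0) / SIGMA,  # vehicle 1 odometry, x
        (y1 - 0.0 - 0.0) / SIGMA,  # vehicle 1 odometry, y
        (x2 - 0.0 - 1.0) / SIGMA,  # vehicle 2 odometry, x
        (y2 - 1.0 - 0.0) / SIGMA,  # vehicle 2 odometry, y
        (rho_hat - rho) / SIGMA,   # range between the vehicles
    ]


def cost(p, rho):
    """Assumed nonlinear cost: half the sum of squared whitened residuals."""
    return 0.5 * sum(r * r for r in residuals(p, rho))


def numeric_gradient(p, rho, h=1e-6):
    """Central-difference gradient of cost with respect to p."""
    grad = []
    for i in range(len(p)):
        hi = list(p)
        lo = list(p)
        hi[i] += h
        lo[i] -= h
        grad.append((cost(hi, rho) - cost(lo, rho)) / (2.0 * h))
    return grad


p0 = [1.0, 0.0, 1.0, 1.0]  # linearization point [x1(2), y1(2), x2(2), y2(2)]
grad_099 = numeric_gradient(p0, rho=0.99)
grad_100 = numeric_gradient(p0, rho=1.0)
print("rho = 0.99:", grad_099)
print("rho = 1.00:", grad_100)
```

Under these assumptions, the result for rho = 0.99 should agree with A'*b up to finite-difference error, and the result for rho = 1 should be close to zero.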
