Could anyone help fix the problem for poseSLAM using isam1 and isam2

Issue #334 resolved
Xipeng Wang created an issue

ISAM1 I got error: Screenshot from 2017-08-04 14-31-49.png

ISAM2 I got error: Screenshot from 2017-08-04 14-34-06.png Example dataset

VERTEX2 0 0.000000 0.000000 0.000000
EDGE2 1 0 1.030390 0.011350 -0.012958 44.721360 44.721360 44.721360
VERTEX2 1 1.030390 0.011350 -0.012958
EDGE2 2 1 1.013900 -0.058639 -0.013225 44.721360 44.721360 44.721360
VERTEX2 2 2.043445 -0.060422 -0.026183
EDGE2 3 2 1.027650 -0.007456 0.004833 44.721360 44.721360 44.721360
VERTEX2 3 3.070548 -0.094779 -0.021350
EDGE2 4 3 -0.012016 1.004360 1.566790 44.721360 44.721360 44.721360
VERTEX2 4 3.079976 0.909609 1.545440
EDGE2 5 4 1.016030 0.014565 -0.016304 44.721360 44.721360 44.721360
VERTEX2 5 3.091176 1.925681 1.529136
EDGE2 6 5 1.023890 0.006808 0.010981 44.721360 44.721360 44.721360
VERTEX2 6 3.127018 2.948966 1.540117
EDGE2 7 6 0.957734 0.003159 0.010901 44.721360 44.721360 44.721360
VERTEX2 7 3.153237 3.906347 1.551018
EDGE2 8 7 -1.023820 -0.013668 -3.093240 44.721360 44.721360 44.721360
VERTEX2 8 3.146655 2.882457 -1.542222
EDGE2 9 8 1.023440 0.013984 -0.007802 44.721360 44.721360 44.721360
EDGE2 9 5 0.033943 0.032439 -3.127400 44.721360 44.721360 44.721360
VERTEX2 9 3.189873 1.859834 -1.550024
EDGE2 10 9 1.003350 0.022250 0.023491 44.721360 44.721360 44.721360
EDGE2 10 3 0.044020 0.988477 -1.563530 44.721360 44.721360 44.721360
VERTEX2 10 3.232959 0.857162 -1.526533
EDGE2 11 10 0.977245 0.019042 -0.028623 44.721360 44.721360 44.721360
VERTEX2 11 3.295225 -0.118283 -1.555156
EDGE2 12 11 -0.996880 -0.025512 -3.126915 44.721360 44.721360 44.721360
VERTEX2 12 3.254125 0.878076 1.601114
EDGE2 13 12 0.990646 0.018396 -0.016519 44.721360 44.721360 44.721360
VERTEX2 13 3.205708 1.867709 1.584594
EDGE2 14 13 0.945873 0.008893 -0.002602 44.721360 44.721360 44.721360
EDGE2 14 8 0.015808 0.021059 3.128310 44.721360 44.721360 44.721360
VERTEX2 14 3.183765 2.813370 1.581993
EDGE2 15 14 1.000010 0.006428 0.028234 44.721360 44.721360 44.721360
EDGE2 15 7 -0.014728 -0.001595 -0.019579 44.721360 44.721360 44.721360
VERTEX2 15 3.166141 3.813245 1.610227
-------------------------------

Source code:

/* ----------------------------------------------------------------------------

 * GTSAM Copyright 2010, Georgia Tech Research Corporation,
 * Atlanta, Georgia 30332-0415
 * All Rights Reserved
 * /

/**
 * A structure-from-motion example with landmarks
 *  - The landmarks form a 10 meter cube
 *  - The robot rotates around the landmarks, always facing towards the cube
 */

// For loading the data
#include "SFMdata.h"

// Camera observations of landmarks (i.e. pixel coordinates) will be stored as Point2 (x, y).
#include <gtsam/geometry/Pose2.h>
#include <gtsam/geometry/Point2.h>

// Each variable in the system (poses and landmarks) must be identified with a unique key.
// We can either use simple integer keys (1, 2, 3, ...) or symbols (X1, X2, L1).
// Here we will use Symbols
#include <gtsam/inference/Symbol.h>
#include <gtsam/inference/Key.h>

#include <gtsam/slam/PriorFactor.h>
#include <gtsam/slam/BetweenFactor.h>

// We want to use iSAM to solve the structure-from-motion problem incrementally, so
// include iSAM here
#define XP_ISAM1

#ifdef XP_ISAM1
#include <gtsam/nonlinear/NonlinearISAM.h>
#else
#include <gtsam/nonlinear/ISAM2.h>
#endif

// iSAM requires as input a set set of new factors to be added stored in a factor graph,
// and initial guesses for any new variables used in the added factors
#include <gtsam/nonlinear/NonlinearFactorGraph.h>
#include <gtsam/nonlinear/Values.h>

#include <vector>

#include <sys/ioctl.h>
#include <sys/stat.h>
#include <sys/mman.h>

using namespace std;
using namespace gtsam;

static int64_t utime_now()
{
    struct timeval tv;
    gettimeofday (&tv, NULL);
    return (int64_t) tv.tv_sec * 1000000 + tv.tv_usec;
}

/* ************************************************************************* */
int main(int argc, char* argv[]) {

    // Create a NonlinearISAM object which will relinearize and reorder the variables
    // every "relinearizeInterval" updates
    FILE *f;
    if(argc == 2) {
        f = fopen(argv[1], "r");
    } else {
        f = fopen("/var/tmp/test1500.graph", "r");
    }
    FILE *f_out;
    f_out = fopen("/tmp/isam_error.txt", "a");

#ifdef XP_ISAM1
    int relinearizeInterval = 100;
    NonlinearISAM isam(relinearizeInterval);
#else
    ISAM2Params parameters;
    parameters.relinearizeThreshold = 0.1;
    parameters.relinearizeSkip = 10;
    parameters.evaluateNonlinearError = true;
    ISAM2 isam(parameters);
#endif

    // Create a Factor Graph and Values to hold the new data
    NonlinearFactorGraph graph;
    Values initialEstimate;
    noiseModel::Diagonal::shared_ptr model = noiseModel::Diagonal::Sigmas(Vector3(1, 1, 1));
    // Loop over the different poses, adding the observations to iSAM incrementally
    int64_t total_time = 0;
    char str[20];
    int ID;
    double init[3];
    int aidx;
    int bidx;
    double T[3];
    double W[3];
    //Read Data
    while(1 && fscanf(f, "%s", str) != -1) {
        if(!strcmp(str, "VERTEX2")) {
            fscanf(f, "%d %lf %lf %lf", &ID, &init[0], &init[1], &init[2]);
            if (ID == 0) {
                noiseModel::Diagonal::shared_ptr priorNoise = noiseModel::Diagonal::Sigmas(Vector3(0.001, 0.001, 0.001));
                graph.emplace_shared<PriorFactor<Pose2> >(Symbol('x', ID), Pose2(0, 0, 0), priorNoise);
                initialEstimate.insert(Symbol('x', ID), Pose2(init[0], init[1], init[2]));
            } else {
                cout << "****************************************************" << endl;
                cout << "Frame " << ID << ": " << endl;
                initialEstimate.insert(Symbol('x', ID), Pose2(init[0], init[1], init[2]));
                int64_t utime0 = utime_now();
#ifdef XP_ISAM1
                isam.update(graph, initialEstimate);
#else
                ISAM2Result result = isam.update(graph, initialEstimate);
#endif
                int64_t utime1 = utime_now();
                total_time += (utime1 - utime0);
                printf("optimized using (%.3f ms) / (%.3f ms)\n", (utime1 - utime0) / 1.0E3, total_time / 1.0E3);

#ifdef XP_ISAM2
                printf("Error: %f\n", *(result.errorAfter));
                fprintf(f_out, "%d,%f\n", ID, *(result.errorAfter));
#endif

#ifdef XP_ISAM1
                Values currentEstimate = isam.estimate();
#else
                Values currentEstimate = isam.calculateEstimate();
#endif

                // Clear the factor graph and values for the next iteration
                graph.resize(0);
                initialEstimate.clear();
            }
        } else if(!strcmp(str, "EDGE2")) {
            fscanf(f, "%d %d %lf %lf %lf %lf %lf %lf",
                   &aidx, &bidx, &T[0], &T[1], &T[2],
                   &W[0], &W[1], &W[2]);
            noiseModel::Diagonal::shared_ptr model = noiseModel::Diagonal::Sigmas(Vector3(sqrt(1/W[0]), sqrt(1/W[1]),sqrt(1/W[2])));
            graph.emplace_shared<BetweenFactor<Pose2> >(Symbol('x', bidx), Symbol('x', aidx), Pose2(T[0], T[1], T[2]), model);
        } else {
            printf("-------------------END------------String: %s \n", str);
            fclose(f);
            fclose(f_out);
            exit(-1);
        }
    }
    fclose(f);
    fclose(f_out);
    return 0;
}

Comments (27)

  1. Xipeng Wang reporter

    The code is written based on the file VisualISAMExample.cpp and Pose2SLAMExample.cpp. In the code I posted, I only add pose node and rigid transformation factor between poses. The example data is first 15 nodes and factors from Manhattan3500 data. I am not very familiar with gtsam, so I just read line by line from the data file, and add node step by step. For isam1, I got the error if I add loop closure factor, and call the function isam.estimate(). For isam2, I got the error when I tested it on a larger graph which has 3500 nodes. It seems I got an ill-conditioned or or under constrained system.

    Both isam1 and isam2 work well if I only add Odometry factors. I thought I was using wrong way to add loop closure factors? However, It works if I add all nodes once and optimize the whole graph as the way in Pose2SLAMExample.cpp

  2. Frank Dellaert

    isam.update(graph, initialEstimate) should only get new variables in the initial estimate, so that's probably what's happening. No need to give an initial estimate if you already optimized for a variable.

  3. Xipeng Wang reporter

    Duy-Nguyen Ta helped me fix the isam1 error.

    diff --git a/gtsam/inference/ISAM-inst.h b/gtsam/inference/ISAM-inst.h
    index bf2cacc84..186349a95 100644
    --- a/gtsam/inference/ISAM-inst.h
    +++ b/gtsam/inference/ISAM-inst.h
    @@ -47,9 +47,9 @@ namespace gtsam {
         const KeySet newFactorKeys = newFactors.keys();
         const Ordering constrainedOrdering =
           Ordering::ColamdConstrainedLast(varIndex, std::vector<Key>(newFactorKeys.begin(), newFactorKeys.end()));
    -    Base bayesTree = *factors.eliminateMultifrontal(constrainedOrdering, function, varIndex);
    -    this->roots_.insert(this->roots_.end(), bayesTree.roots().begin(), bayesTree.roots().end());
    -    this->nodes_.insert(bayesTree.nodes().begin(), bayesTree.nodes().end());
    +    auto bayesTree = factors.eliminateMultifrontal(constrainedOrdering, function, varIndex);
    +    this->roots_.insert(this->roots_.end(), bayesTree->roots().begin(), bayesTree->roots().end());
    +    this->nodes_.insert(bayesTree->nodes().begin(), bayesTree->nodes().end());
       }
    

    Now, both isam1 and isam2 give me the same second error - ill-conditioned matrix. However, if I directly optimize the whole graph instead of using isam.update(), it works fine. And this makes me think that something goes wrong when I use isam.update().

    And I am using the following code to calculate chi^2 error. The results seem to be bigger than I expected.

    Values currentEstimate = isam.estimate();
    double err = isam.getFactorsUnsafe().error(currentEstimate);
    printf("Error: %f\n", err);
    
  4. Duy-Nguyen Ta

    Regarding the error in NonlinearISAM: it's considered a temporary fix as I didn't manage to track down the bug till the end. In Xi Peng's test, the BayesTree's copy constructor and assignment operator in BayesTree-inst.h somehow duplicates the tree content two times at the first loop closure. The fix simply avoids calling these operations. From the local context in ISAM-inst, I think the temporary local variable is redundant. However, I don't have a bigger picture to justify why it was implemented like that in the first place.

  5. Xipeng Wang reporter

    Yes. I am pretty sure that no variable in initialEstimate was already in the graph.

    Also I realize that if the whole graph is optimized when estimate() is called. In the code I post, I call estimate() every iteration which make it not incremental smoothing any more. However, if I remove estimate() and use getLinearizationPoint() to calculate graph error, ISAM1 doesn't work properly.

    Values NonlinearISAM::estimate() const {
        if(isam_.size() > 0)
            return linPoint_.retract(isam_.optimize());
        else
            return linPoint_;
    }
    
  6. Frank Dellaert

    Guys, that fix does not resolve the issue for me (still "Requested to insert variable 'x6' already in this VectorValues"). Are you sure that's the entire diff?

  7. Duy-Nguyen Ta

    I found the problem. Very subtle.

    The first problem that I found is the BayesTree's clone assignment/copy constructor does not exactly preserve the tree's content. The following assertion fails in some cases:

    auto bayesTree = factors.eliminateMultifrontal(constrainedOrdering, function, varIndex);
    Base bayesTreeCopy = *bayesTree;
    assert(bayesTreeCopy.equals(*bayesTree));
    

    The reason is that the result of eliminateMultifrontal does not populate all clique assignment for all variables, whereas the copy assignment does. EliminateMultifrontal does not use insertRoot as said in ClusterTree-inst.h:195-197, whereas the copy assignment does. The function insertRoot calls fillNodesIndex to populate the node-clique correspondences for all nodes. As EliminateMultifrontal works, I'm not sure if it's a bug or it was intended to be this way.

    But that's not the reason for the crash.

    The true reason is challenging. It's in these lines in ISAM-inst.h:

    this->roots_.insert(...);
    this->nodes_.insert(...);
    

    Except for the mismatch in the node-clique map, the copy assignment cloned the BayesTree properly as intended. However, the ISAM object expands because the newly cloned cliques are added while the old ones are still in there. The actual root cause is that the behavior of this_nodes_.insert is undefined, because the nodes still exist in the map, and are still associated with the old cliques. So some were updated to associate with the newly cloned cliques, some still stick to the old ones.

    Calling this->clear() before insert fixes the problem.

    However, I still think the clone copy here is redundant. The diff fix above works for me.

  8. Frank Dellaert

    Hmmm. clear() is no good because a BayesTree can be a forest, I remember.

    Also, I don't understand what would be "undefined". Footnote (5) is not our case, I believe...

  9. Duy-Nguyen Ta

    Actually you're right. It might not be "undefined", just that it didn't replace the old ones. This is an example from one of my runs, showing the nodes and their associated clique addresses. Except for the new root nodes and the x5 (the old root), the other nodes' cliques were not updated.

    ISAM before update:: cliques: 6, variables: 7
    Node: x0
    Clique: 0x7f94a1f015a0
    Node: x1
    Clique: 0x7f94a1f00aa0
    Node: x2
    Clique: 0x7f94a1d01830
    Node: x3
    Clique: 0x7f94a1d00c30
    Node: x4
    Clique: 0x7f94a1e00590
    Node: x5
    Clique: 0x7f94a1e00500
    Node: x6
    Clique: 0x7f94a1e00500
    =================
    Clone:
    Node: x6 --> 0x7f94a1c04df0
    Node: x7 --> 0x7f94a1c04df0
    Node: x5 --> 0x7f94a1c04f50
    Node: x4 --> 0x7f94a1c04fe0
    Node: x3 --> 0x7f94a1c03fd0
    Node: x2 --> 0x7f94a1c04060
    Node: x1 --> 0x7f94a1c04470
    Node: x0 --> 0x7f94a1c04500
    ===========
    ISAM after update:: cliques: 7, variables: 8
    Node: x0
    Clique: 0x7f94a1f015a0
    Node: x1
    Clique: 0x7f94a1f00aa0
    Node: x2
    Clique: 0x7f94a1d01830
    Node: x3
    Clique: 0x7f94a1d00c30
    Node: x4
    Clique: 0x7f94a1e00590
    Node: x5
    Clique: 0x7f94a1c04f50
    Node: x6
    Clique: 0x7f94a1c04df0
    Node: x7
    Clique: 0x7f94a1c04df0
    
  10. Frank Dellaert

    Aha. Interesting. BTW, my "make check" is not working, does that work for you? I want to add a unit test to debug this. (Note I pushed a new branch in which to do so).

  11. Frank Dellaert

    So, just checked in a unit test with what I believe is equivalent to your code, minus the file IO. And, it succeeds! Can you stare a bit at what might be the difference? Using GTSAM_PRINT(isam) at each iteration in both unit test and your code might yield the answer...

  12. Xipeng Wang reporter

    Awesome! Thank you very much! I will post the differences later after I testing it.

    A suggestion: I know w10000 dataset is in this gtsam repository. I think it will be handy and useful if there are isam1 and isam2 examples tested on this large pose-graph dataset. Several problems I encountered happen when I test systems on large dataset. If there is an example, I can easily compare the differences as you suggested. Then I will know whether I am using isam correctly or not.

  13. Xipeng Wang reporter

    For testing on the small dataset with 15 nodes, I got same results as yours. (Sorry for the large figures).

    However, when I test on larger dataset (Manhattan3500), it doesn't work as I expect. Chi_error is quite large.

    ISAM1 result (Your code):

    isam1.png

    ISAM1 result (My code):

    isam1_5.png

  14. Xipeng Wang reporter

    I tested isam1 on a graph with 100 nodes, and I got: (I think sum_chi^2 error is too large)

    error.png

    When I tested it on a graph with 3500 nodes, I got: (linear system is ill-posed)

    err2.png

  15. Xipeng Wang reporter

    Sorry, I didn't make it clear. With Duy's fix, file-based i/o version also works. But it only works for a small dataset with 15 nodes. When I test on larger graph with 3500 nodes, I get error as shown in my previous message.

  16. Frank Dellaert

    Thanks. Duy's fix did not fix it for me, though. But moving to the string-based test did. Hence, I'm thinking that your script might have some memory corruption - but I can't see where...

  17. Log in to comment