Unit3 deadlock

Issue #261 resolved
Chris Beall created an issue

Unit3 sometimes deadlocks. On Ubuntu 14.04, in Debug mode with TBB enabled, this happens in roughly 1 out of 10 runs.

I noticed because Jenkins was unable to get through its latest build. I just configured a unit test timeout on Jenkins so it will count this as a failure and move on.

Comments (9)

  1. Hayk Martirosyan

    Oh man, that's a bummer. I moved the mutex from a global to a member variable. I recall running into this issue at some point, but before I checked in I tried to reproduce it many times and couldn't, so I thought it had been a red herring. I didn't realize it only happened in debug mode. Frank and I looked at it together back then and determined that the mutex was used properly. I am used to the C++11 variants, though, so maybe there's a difference with the TBB one? I recall trying to make it recursive to see if that would fix the issue, but as far as I can tell nothing in basis should be invoking basis, because that would be bad for several reasons.

  2. Chris Beall reporter

    Here's a hint. Compiling this on Mac gives this error:

    /Users/cbeall3/git/gtsam/gtsam/geometry/Unit3.h:156:12: error: call to implicitly-deleted copy constructor of 'gtsam::Unit3'
        return Unit3(p_.cross(q.p_));
               ^~~~~~~~~~~~~~~~~~~~~
    /Users/cbeall3/git/gtsam/gtsam/geometry/Unit3.h:51:22: note: copy constructor of 'Unit3' is implicitly deleted because field
          'B_mutex_' has a deleted copy constructor
      mutable tbb::mutex B_mutex_; ///< Mutex to protect the cached basis.
                         ^
    /opt/local/include/tbb/mutex.h:40:15: note: copy constructor of 'mutex' is implicitly deleted because base class
          'internal::mutex_copy_deprecated_and_disabled' has a deleted copy constructor
    class mutex : internal::mutex_copy_deprecated_and_disabled {
                  ^
    /opt/local/include/tbb/tbb_stddef.h:334:44: note: copy constructor of 'mutex_copy_deprecated_and_disabled' is implicitly deleted
          because base class 'tbb::internal::no_copy' has an inaccessible copy constructor
    class mutex_copy_deprecated_and_disabled : no_copy {};
                                               ^
    
  3. Frank Dellaert

    A possible fix is to add an explicit copy constructor that default-constructs the mutex and copies only p_.
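
A minimal sketch of that idea, not the actual GTSAM change. `Direction` and `Vec3` are hypothetical stand-ins for `gtsam::Unit3` and its point type, `std::mutex` stands in for `tbb::mutex` (both are non-copyable), and unlike Unit3 this toy class does not normalize its vector. The user-provided copy constructor copies only `p_` and leaves the new object's mutex default-constructed, so returning by value from `cross` compiles again.

```cpp
#include <cassert>
#include <mutex>
#include <type_traits>

// Hypothetical stand-in for the 3-vector stored in Unit3.
struct Vec3 {
  double x, y, z;
};

// Hypothetical stand-in for gtsam::Unit3; only the copy semantics are modeled.
class Direction {
 public:
  explicit Direction(const Vec3& p) : p_(p) {}

  // Copy only the data. mutex_ is not in the initializer list, so the new
  // object gets its own default-constructed (unlocked) mutex.
  Direction(const Direction& other) : p_(other.p_) {}

  // Same idea for assignment: the mutex of *this is left untouched.
  Direction& operator=(const Direction& other) {
    p_ = other.p_;
    return *this;
  }

  // Returning by value is what needed the copy constructor in the error above.
  Direction cross(const Direction& q) const {
    return Direction(Vec3{p_.y * q.p_.z - p_.z * q.p_.y,
                          p_.z * q.p_.x - p_.x * q.p_.z,
                          p_.x * q.p_.y - p_.y * q.p_.x});
  }

  const Vec3& point() const { return p_; }

 private:
  Vec3 p_;
  mutable std::mutex mutex_;  // non-copyable member, like tbb::mutex
};

// With the user-provided copy constructor the class is copyable again.
static_assert(std::is_copy_constructible<Direction>::value,
              "Direction should be copy-constructible");

// Small usage check: a copy carries the same data as the original.
inline void exampleCopy() {
  Direction a(Vec3{1.0, 0.0, 0.0});
  Direction b = a;
  assert(b.point().x == a.point().x);
}
```

One consequence of this design is that any state guarded by the mutex (such as a cached basis) is not carried over by the copy and would have to be recomputed on first use, unless the copy constructor also copies it while holding the source object's lock.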

  4. Frank Dellaert

    Since you, @cbeall3, are the only one who gets that error (puzzling!), this seems like a quick thing to try?

  5. Chris Beall reporter

    Must be the compiler/TBB version. I'm on the very latest:
    Xcode 7.1 with Clang (Apple LLVM version 7.0.0, clang-700.1.76) and TBB 4.4-20150728_0.
