Issue #24 resolved

PlannerThreadedTerminationCondition locking problem causes RRT* planner to freeze while improving solution

Anonymous created an issue

PlannerThreadedTerminationCondition has a locking problem that causes a deadlock. The deadlock appears intermittently when improving an existing solution with the RRT* planner. The planner was given 20 seconds and invoked through SimpleSetup with:

ob::PlannerStatus solution = setup->solve(solve_t_limit);

Comments (7)

  1. Antti Valli

    I forgot the stack trace and now I can't attach it for some reason, so here it is as a comment.

    Attaching to process 27473
    (gdb) bt
    #0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:132
    #1  0x00007f7035417065 in _L_lock_858 () from /lib/x86_64-linux-gnu/
    #2  0x00007f7035416eba in __pthread_mutex_lock (mutex=0x1b83798)
        at pthread_mutex_lock.c:61
    #3  0x00007f70311b2172 in boost::thread::interrupt() ()
       from /usr/lib/
    #4  0x00007f703713eacf in ompl::base::PlannerThreadedTerminationCondition::stopEvalThread() () from /usr/local/lib/
    #5  0x00007f703713eb2f in ompl::base::PlannerThreadedTerminationCondition::~PlannerThreadedTerminationCondition() () from /usr/local/lib/
    #6  0x00007f70371426c3 in ompl::base::Planner::solve(double) ()
       from /usr/local/lib/
    #7  0x00007f70371de784 in ompl::geometric::SimpleSetup::solve(double) ()
       from /usr/local/lib/
    #8  0x0000000000445b57 in MotionSolver::improve (this=0x7ffff5e84630)
        at MotionSolver.cpp:327
    #9  0x000000000042a6c5 in main (argc=1, argv=0x7ffff5e84af8) at PlannerTest.cpp:90
    (gdb) frame 2
    #2  0x00007f7035416eba in __pthread_mutex_lock (mutex=0x1b83798)
        at pthread_mutex_lock.c:61
    61  pthread_mutex_lock.c: No such file or directory.
    (gdb) print *mutex
    $2 = {__data = {__lock = 2, __count = 0, __owner = 7237, __nusers = 1, __kind = 0, 
        __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
      __size = "\002\000\000\000\000\000\000\000E\034\000\000\001", '\000' <repeats 26 times>, __align = 2}
    (gdb) info threads
      Id   Target Id         Frame 
      2    Thread 0x7f702d0ac700 (LWP 7237) "PlannerTest" __lll_lock_wait ()
        at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:132
    * 1    Thread 0x7f7038224740 (LWP 27473) "PlannerTest" __lll_lock_wait ()
        at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:132
    (gdb) thread 2
    [Switching to thread 2 (Thread 0x7f702d0ac700 (LWP 7237))]
    #0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:132
    132 ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.
    (gdb) bt
    #0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:132
    #1  0x00007f7035417065 in _L_lock_858 () from /lib/x86_64-linux-gnu/
    #2  0x00007f7035416eba in __pthread_mutex_lock (mutex=0x1b836f0)
        at pthread_mutex_lock.c:61
    #3  0x00007f70311b24a4 in boost::this_thread::interruption_point() ()
       from /usr/lib/
    #4  0x00007f70311b445e in boost::this_thread::sleep(boost::posix_time::ptime const&)
        () from /usr/lib/
    #5  0x00007f703713fee6 in ompl::base::PlannerThreadedTerminationCondition::periodicEval() () from /usr/local/lib/
    #6  0x00007f70311b1ce9 in thread_proxy () from /usr/lib/
    #7  0x00007f7037b31b74 in ?? () from /usr/lib/nvidia-current/
    #8  0x00007f7035414e9a in start_thread (arg=0x7f702d0ac700) at pthread_create.c:308
    #9  0x00007f7035a19cbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
    #10 0x0000000000000000 in ?? ()
  2. Ioan Sucan

    What version of OMPL are you using? Does the problem happen for every execution of the algorithm? How is RRT* configured to improve the solution (how is the optimization objective set up)? Are you able to send a bit of code that exhibits this problem?

  3. Antti Valli

    The problem first appeared with ompl-0.11.1, and updating to 0.12.2 did not resolve it. It appears randomly for some executions of the algorithm, but often enough that it has affected every longer benchmarking run I have done. BTRRT* does not appear to suffer from this problem. The planner is set up with ompl::geometric::SimpleSetup and delay_cc is set to 1. The configuration space is a 6-dimensional CompoundStateSpace with SO2 and RealVector state spaces as components. The solve method of SimpleSetup is called repeatedly until a specified minimum improvement time has been used.

  4. Ioan Sucan

    Sorry for the delayed reply -- somehow I did not catch your answer. I suspect this could be a numerical precision issue. This has been fixed in the source code. Would it be possible for you to try this using the latest source code?

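The two backtraces in the first comment show the main thread blocked inside boost::thread::interrupt() (called from stopEvalThread() in the destructor) on a mutex whose __owner is LWP 7237, while that thread is itself blocked in boost::this_thread::interruption_point() during its sleep in periodicEval(). For illustration only, here is a minimal sketch of a periodic evaluation thread that is stopped without thread interruption, using a flag and a condition variable from the C++ standard library. This is not OMPL's implementation and not the fix that was applied; the class `PeriodicTerminationCondition`, its members, and the helper `waitUntilTerminated` are hypothetical names chosen for this example.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for a threaded termination condition.
// It is NOT ompl::base::PlannerThreadedTerminationCondition.
class PeriodicTerminationCondition
{
public:
    PeriodicTerminationCondition(std::function<bool()> fn, std::chrono::milliseconds period)
      : fn_(std::move(fn)), period_(period), thread_([this] { periodicEval(); })
    {
    }

    ~PeriodicTerminationCondition() { stopEvalThread(); }

    PeriodicTerminationCondition(const PeriodicTerminationCondition &) = delete;
    PeriodicTerminationCondition &operator=(const PeriodicTerminationCondition &) = delete;

    // Cheap check for the planner loop: reads the cached result only.
    bool operator()() const { return terminate_.load(); }

private:
    void stopEvalThread()
    {
        {
            // Hold the mutex only long enough to set the flag.
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        cv_.notify_one();  // wake the evaluation thread if it is sleeping
        assert(thread_.joinable());
        thread_.join();
    }

    void periodicEval()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        while (!stop_)
        {
            lock.unlock();  // do not hold the mutex while running user code
            const bool fired = fn_();
            lock.lock();
            if (fired)
                terminate_.store(true);
            // Releases the mutex while waiting; returns early once stop_ is set.
            cv_.wait_for(lock, period_, [this] { return stop_; });
        }
    }

    std::function<bool()> fn_;
    std::chrono::milliseconds period_;
    std::mutex mutex_;
    std::condition_variable cv_;
    bool stop_ = false;                   // guarded by mutex_
    std::atomic<bool> terminate_{false};  // latched result of fn_
    std::thread thread_;                  // declared last so it starts after the other members
};

// Hypothetical usage: poll a time-limit condition, then let the destructor stop the thread.
bool waitUntilTerminated(std::chrono::milliseconds limit)
{
    const auto deadline = std::chrono::steady_clock::now() + limit;
    PeriodicTerminationCondition ptc(
        [deadline] { return std::chrono::steady_clock::now() >= deadline; },
        std::chrono::milliseconds(1));
    while (!ptc())
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    return ptc();
}
```

In this sketch only one mutex is involved, and the stopping thread releases it before calling join(), so the destructor never waits for a lock that the evaluation thread needs in order to exit.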