LaserScan causes gazebo to crash when moving around

Issue #339 wontfix
John Hoare created an issue

This was originally asked on answers.gazebosim.org [1], but on closer inspection I'm convinced this is a bug.

Using the DRCSim robot (I've tried versions 1.1, 1.2, and 1.3, both built from source and from the packaged versions), gazebo will crash when I use the "fake walking" while subscribed to the laser scan from the head.

Steps for re-creating the bug:

  1. roslaunch atlas_utils drc_sim_v0.launch
  2. rosrun pr2_teleop teleop_pr2_keyboard cmd_vel:=atlas/cmd_vel
  3. rostopic hz scan
  4. Using teleop_pr2_keyboard, drive the robot around a bit. For me this causes a crash in under a minute. The crash doesn't appear to happen when the robot is stationary.

Here is a stack trace of what I get when gzserver crashes: http://pastebin.com/LYv3tt8t

I've filed this as a bug against gazebo rather than DRCSim because of the contents of the stack trace.

[1] http://answers.gazebosim.org/question/591/drcsim-1213-with-gazebo-crashes-often-when-moving/

Comments (12)

  1. John Hsu

    I just tried running default branches of gazebo and drcsim in debug mode, but am unable to reproduce the crash.

    It looks like a race condition in tmpPlaneBuffer array, where tmpPlaneBuffer[i + 1] is not accessible (with address of 0x110).

    Can you do a "thread apply all bt" and see if there are multiple threads trying to access heightfield.cpp?

    Thanks, John
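
The check requested above can be partly automated once the output of "thread apply all bt" has been saved to a text file (for example, with gdb run in batch mode against gzserver and its core file). The following is a minimal sketch, not part of the original report: the function name `threads_referencing` and the sample trace are hypothetical, and it assumes gdb's usual layout of a `Thread N (...)` header followed by frame lines starting with `#`.

```python
import re
from collections import defaultdict

# Header gdb prints before each thread's frames, e.g.
# "Thread 26 (Thread 0x7f3a2c0f9700 (LWP 4242)):"
THREAD_HEADER = re.compile(r"^Thread (\d+) \(")


def threads_referencing(backtrace_text, needle="heightfield.cpp"):
    """Map thread number -> list of frame lines that mention `needle`."""
    hits = defaultdict(list)
    current = None
    for line in backtrace_text.splitlines():
        stripped = line.strip()
        header = THREAD_HEADER.match(stripped)
        if header:
            # Start of a new thread's section.
            current = int(header.group(1))
        elif current is not None and stripped.startswith("#") and needle in stripped:
            # A frame line ("#0 ...", "#1 ...") that mentions the needle.
            hits[current].append(stripped)
    return dict(hits)


# Invented sample for illustration only; NOT the trace from the pastebin links.
SAMPLE = """\
Thread 26 (Thread 0x7f0000000026 (LWP 1026)):
#0  0x00007f0000001000 in malloc () from /lib/libc.so.6

Thread 1 (Thread 0x7f0000000001 (LWP 1001)):
#0  0x00007f0000002000 in example_frame () at heightfield.cpp:100
#1  0x00007f0000003000 in main () at main.cc:10
"""

if __name__ == "__main__":
    for thread, frames in sorted(threads_referencing(SAMPLE).items()):
        print(f"thread {thread}: {len(frames)} matching frame(s)")
```

If more than one thread number comes back, several threads were inside heightfield.cpp when the crash happened, which would support the race-condition theory. A single thread does not rule a race out, since the other thread may already have left that code.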

  2. John Hoare reporter

    I did a backtrace across all threads, but I only see references to heightfield.cpp in the crashing thread (thread 1).

    Thread 26 is in a malloc() call during the crash, though; all the other threads look to be blocked or sleeping.

    Stacktrace: http://pastebin.com/XJ6MBVKz

    Edit: I am running tagged gazebo_1.3.1 and drcsim_1.3.1. I will try the default branches now and verify whether the bug still occurs.

    It does seem strange that you cannot reproduce the crash. We are able to replicate this bug on several different models of computers, although all of them are Core i7 or Xeon machines.

  3. John Hoare reporter

    Okay, I just ran the default branches of both gazebo and drcsim, and I am no longer seeing the crash occur. Or at least I ran the robot around for about 5x longer than I would have been able to before, without any crashes.

  4. John Hoare reporter

    Have you tried reproducing it with gazebo_1.3.1 and drcsim_1.3.1?

    My concern with closing it is that there hasn't been a specific fix for it, so the bug is potentially still there. As John implied, this is likely caused by a race condition. The race condition may just be much less likely to occur in the newer (default) branches. I have had gazebo/drcsim crash on me with the newer branches, but with much less frequency and never with a resulting core dump or debugger attached to it.

  5. John Hoare reporter

    I'll just close this out because I believe the issue still exists on the reported version, but I have (obviously) not been able to reproduce on newer versions.
