1. OpenSourceRoboticsFoundation
  2. Simulation
  3. gazebo
  4. Issues
Issue #706 closed

intermittent segmentation fault

gerkey NA
created an issue

We're seeing segmentation faults in Gazebo intermittently. The most common situation seems to be running a VRC world with logging enabled. We'll collect evidence here.

Debug procedure: after a crash, there should be a core file at ~/.ros/core, so:

sudo apt-get install gdb
gdb gzserver ~/.ros/core
bt

Also grab the last line of the score log, to get the time elapsed. E.g. if the run was vrc_task_2:

tail -n1  /tmp/vrc_task_2/score.log

Comments (16)

  1. gerkey NA reporter

    Here's a backtrace that we've seen twice, both times in vrc_task_2:

    Core was generated by `gzserver -s libgazebo_ros_api_plugin.so vrc_task_2.world -q -r --record_encodin'.
    Program terminated with signal 6, Aborted.
    #0  0x00007fe94e8ad425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
    (gdb) bt
    #0  0x00007fe94e8ad425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
    #1  0x00007fe94e8b0b8b in abort () from /lib/x86_64-linux-gnu/libc.so.6
    #2  0x00007fe94ef0369d in __gnu_cxx::__verbose_terminate_handler() ()
       from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #3  0x00007fe94ef01846 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #4  0x00007fe94ef01873 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #5  0x00007fe94ef0196e in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #6  0x00007fe94ef01e4d in operator new(unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #7  0x00007fe94eeeaa89 in std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #8  0x00007fe94fe89e40 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string<char*>(char*, char*, std::allocator<char> const&) () from /usr/lib/libprotobuf.so.7
    #9  0x00007fe94eee46b8 in std::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::str() const () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #10 0x00007fe9511cc94a in gazebo::util::LogRecord::Log::Update() () from /usr/lib/libgazebo_util.so.1
    #11 0x00007fe9511cd251 in gazebo::util::LogRecord::Update() () from /usr/lib/libgazebo_util.so.1
    #12 0x00007fe9511cd7c8 in gazebo::util::LogRecord::RunUpdate() () from /usr/lib/libgazebo_util.so.1
    #13 0x00007fe94f5c1ce9 in thread_proxy () from /usr/lib/libboost_thread.so.1.46.1
    #14 0x00007fe950129e9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
    #15 0x00007fe94e96accd in clone () from /lib/x86_64-linux-gnu/libc.so.6
    #16 0x0000000000000000 in ?? ()
    

    Last line of score.log:

    1039.631,881.000,0.000,0.000,0,0,""
    
  2. gerkey NA reporter

    Another one from vrc_task_2:

    Core was generated by `gzserver -s libgazebo_ros_api_plugin.so vrc_task_2.world -q -r --record_encodin'.
    Program terminated with signal 11, Segmentation fault.
    #0  0x00007fe42a94d258 in gazebo::physics::World::OnLog(std::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >&) () from /usr/lib/libgazebo_physics.so.1
    (gdb) bt
    #0  0x00007fe42a94d258 in gazebo::physics::World::OnLog(std::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >&) () from /usr/lib/libgazebo_physics.so.1
    #1  0x00007fe42b1248a0 in gazebo::util::LogRecord::Log::Update() () from /usr/lib/libgazebo_util.so.1
    #2  0x00007fe42b125251 in gazebo::util::LogRecord::Update() () from /usr/lib/libgazebo_util.so.1
    #3  0x00007fe42b1257c8 in gazebo::util::LogRecord::RunUpdate() () from /usr/lib/libgazebo_util.so.1
    #4  0x00007fe429519ce9 in thread_proxy () from /usr/lib/libboost_thread.so.1.46.1
    #5  0x00007fe42a081e9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
    #6  0x00007fe4288c2ccd in clone () from /lib/x86_64-linux-gnu/libc.so.6
    #7  0x0000000000000000 in ?? ()
    

    score.log:

    995.320,877.046,0.000,0.000,0,16,""
    
  3. Steven Peters

    I've gone back through some of our Jenkins build history. The earliest segmentation fault I can find in a drcsim-2.6_gazebo-1.8 branch is from May 24th.

    I also looked at the drcsim-default-gazebo-default builds back to May 16th and didn't see any seg-faults, and the first seg-fault I see is on May 23rd when we merged gazebo_1.8 back to default in pull request #540.

    That included pull request #529, pull request #531, pull request #533, pull request #530, pull request #535, and pull request #537.

  4. Nathan Koenig

    Here a good backtrace.

    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
    btCore was generated by `gzserver -s libgazebo_ros_api_plugin.so vrc_task_2.world -q -r --record_encodin'.
    Program terminated with signal 11, Segmentation fault.
    
    #0  0x00002b7c180f8808 in operator+= (__n=0, this=<synthetic pointer>)
        at /usr/include/c++/4.6/bits/stl_deque.h:199
    199         _M_cur = _M_first + (__offset - __node_offset
    (gdb) bt
    #0  0x00002b7c180f8808 in operator+= (__n=0, this=<synthetic pointer>)
        at /usr/include/c++/4.6/bits/stl_deque.h:199
    #1  operator+ (__n=0, this=0x3adb528)
        at /usr/include/c++/4.6/bits/stl_deque.h:209
    #2  operator[] (__n=0, this=0x3adb528)
        at /usr/include/c++/4.6/bits/stl_deque.h:225
    #3  operator[] (__n=0, this=0x3adb518)
        at /usr/include/c++/4.6/bits/stl_deque.h:1217
    #4  gazebo::physics::World::OnLog (this=0x3adb110, _stream=...)
        at /home/ubuntu/work/gazebo/gazebo/physics/World.cc:1730
    #5  0x00002b7c17883e00 in operator() (a0=..., this=0x827d380)
        at /usr/include/boost/function/function_template.hpp:1013
    #6  gazebo::util::LogRecord::Log::Update (this=0x827d370)
        at /home/ubuntu/work/gazebo/gazebo/util/LogRecord.cc:551
    #7  0x00002b7c178848e9 in gazebo::util::LogRecord::Update (this=0x64eac0)
        at /home/ubuntu/work/gazebo/gazebo/util/LogRecord.cc:473
    #8  0x00002b7c17884f28 in gazebo::util::LogRecord::RunUpdate (this=0x64eac0)
        at /home/ubuntu/work/gazebo/gazebo/util/LogRecord.cc:455
    #9  0x00002b7c19499ce9 in thread_proxy ()
       from /usr/lib/libboost_thread.so.1.46.1
    #10 0x00002b7c18920e9a in start_thread ()
       from /lib/x86_64-linux-gnu/libpthread.so.0
    #11 0x00002b7c1a11accd in clone () from /lib/x86_64-linux-gnu/libc.so.6
    #12 0x0000000000000000 in ?? ()
    
  5. gerkey NA reporter

    Interesting. I guess that this->states is corrupt at that point, perhaps from an earlier concurrent write? Out of curiosity, why use [0] to pull the first element out of a deque, instead of front()?

  6. Log in to comment