Boxes dropping on boxes

Issue #116 resolved
Steven Gray created an issue

Just noticed this one, but it happened twice in a row. My code starts the competition, then starts the conveyor at power 100, and another box drops right onto the first box.

box_on_box.png

Comments (16)

  1. Deanna Hood

    Do you have any sensors positioned in such a way that they prevent boxes from moving freely along the belt by any chance?

  2. Steven Gray reporter

    I realized the picture I put up might look like that, but the box is clear of the camera.

    The double box stacking occurred at the start of the conveyor belt, and I have no sensors over there. Unfortunately I didn't have logging on when I saw it.

  3. Deanna Hood

    OK. This is something that only occurs occasionally by the sounds of it, is that correct?

    Sounds like the belt is not moving the box (for a reason other than sensors blocking it), despite the belt being turned on. It starts moving properly once the second box lands.

    I'll see if I can find what's causing this. If you have any additional observations about what makes it likely to occur, they are welcome.

  4. Steven Gray reporter

    Yes, this has only occurred twice in ~30 trials or so. I was watching the output of my code and Gazebo when this happened. The new box dropped immediately after I commanded the conveyor belt to start moving. I haven't seen any indication the belt isn't moving the box properly. The alignment in the picture shows how far the bottom box moves at 100% power in the time it takes the new box to descend.

  5. Deanna Hood

    OK, thanks for clarifying that when this occurs it seems that two boxes are spawned at the same time, as opposed to the issue being that one box is spawned but not moving until the second box is spawned.

    We'll look into resolving this ASAP, but in the context of qual1, if this occurs during a participant's automated trial run it will warrant a rerun.

  6. Steven Gray reporter

    @d_hood I just switched everything over to the new arm. When I start Gazebo/gear, shipping_box_0 is already on the conveyor belt. When I start the competition service, another box drops right on top of it. Here is a video of this -- you can see on the bottom left terminal that my 'advance belt' service hasn't come up yet and so I have not sent any commands to the belt when the new box drops. This is now happening every time with ariac 2.1.2 (though I haven't checked each previous release).

    It seems like a quick fix would be to not have shipping_box_0 spawn on the conveyor belt when Gazebo loads. What do you think?

  7. Steven Gray reporter

    Just realized that I can't delete shipping_box_0 in Gazebo, it crashes it every time...

    *** Error in `gzserver': double free or corruption (fasttop): 0x00007ff320047eb0 ***
    ======= Backtrace: =========
    /lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7ff4031ad7e5]
    /lib/x86_64-linux-gnu/libc.so.6(+0x8037a)[0x7ff4031b637a]
    /lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7ff4031ba53c]
    /opt/ros/kinetic/lib/libSideContactPlugin.so(_ZNSt14_Function_base13_Base_managerIN5boost3_bi6bind_tIvNS1_4_mfi3mf1IvN6gazebo17SideContactPluginERKNS6_6common10UpdateInfoEEENS2_5list2INS2_5valueIPS7_EENS1_3argILi1EEEEEEEE10_M_managerERSt9_Any_dataRKSM_St18_Manager_operation+0x22)[0x7ff319e861b2]
    /usr/lib/x86_64-linux-gnu/libgazebo_physics.so.8(+0x292c97)[0x7ff402912c97]
    /usr/lib/x86_64-linux-gnu/libgazebo_physics.so.8(_ZN6gazebo7physics5World6UpdateEv+0xfb)[0x7ff4028fac5b]
    /usr/lib/x86_64-linux-gnu/libgazebo_physics.so.8(_ZN6gazebo7physics5World4StepEv+0x36e)[0x7ff4029094ce]
    /usr/lib/x86_64-linux-gnu/libgazebo_physics.so.8(_ZN6gazebo7physics5World7RunLoopEv+0x22d)[0x7ff402909a1d]
    /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xb8c80)[0x7ff4037cec80]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7ff402c2f6ba]
    /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7ff40323d41d]
    ======= Memory map: ========
    ...
    
  8. Deanna Hood

    Yeah, don't delete the shipping box 😄 You can just drag it out of the way if you want to get rid of it.

    Unfortunately, on all the machines I've tested with, the second box only appears after shipping_box_0 has moved, so not starting with shipping_box_0 on the belt wouldn't necessarily be a good approach "in general". I can definitely appreciate that this is happening to you, it's just not something I've been able to reproduce: your machine might be faster and so more prone to race conditions. I'll investigate the root cause because it's clearly more of an issue now than it was before.

    While I'm looking into it, if you're building from source you can prevent shipping_box_0 from starting on the belt by removing lines 170-178 of this file: https://bitbucket.org/osrf/ariac/src/4b2af2ea8fac3e6cdbbd686997573ead073a53c9/osrf_gear/worlds/gear.world.template?at=master&fileviewer=file-view-default#gear.world.template-170 As I mentioned, this change won't make sense in general, but it might help you side-step the issue on your machine specifically so you can keep developing.

  9. Steven Gray reporter

    Thanks, I had been dragging the boxes off to the side for the time being. I'm weirded out by this; my machine is 5 years old, an i7-3770k, so it shouldn't be that fast. I'll play with timings and see if I can avoid having this happen. I'll also see if I can create a minimal example to reproduce it. When you test it, are you calling the services from the command line? Or are they in a test script and happen in rapid succession?

  10. Deanna Hood

    Mostly command-line; we do have unit tests that call the services automatically though.

    I have an idea of what the issue could be even without the reproducible example, so no need to spend too much time on it. I'll work on a patch and let you know when you can test it if you're interested in a fix right now, otherwise I'll push it out in the next release.

  11. Steven Gray reporter

    Thanks for the quick fix. I had been running the 2.1.2 release. I just built 2.1.3 from source (commit 808063d) and cannot reproduce this issue any more, even without your fix. For sanity's sake, I ran with 2.1.3 that you packaged today and saw the issue intermittently (2 of 5 runs). I can confirm I cannot reproduce the issue with the early_box_drop_debug branch, but I also can't get it to recur when I build from source regardless of branch.

  12. Deanna Hood

    Must have just been bad luck when you first updated (the issue as I understand it was sensitive to the reception order of messages, so just luck of the draw).

    I'm going to take the liberty of assuming that this has been fixed for good 😄, let me know if you run into it again! Otherwise I'll close this when the patch is released.
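
For readers wondering how a dependence on message reception order can produce a stacked box, here is a minimal sketch of one way to make a drop decision independent of that order. It is not the ARIAC fix and does not use any ARIAC or Gazebo API: `DropGuard`, `on_spawn_request`, `on_box_travel`, and the 0.5 m `clearance` value are all hypothetical names and numbers chosen for illustration. The idea is that a spawn request is queued and only honoured once the previous box is known to have left the drop zone, whichever message arrives first.

```python
# Hypothetical sketch: none of these names come from ARIAC or Gazebo.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DropGuard:
    """Spawns a box only once the previous box has left the drop zone."""
    clearance: float = 0.5                   # assumed travel (m) needed before the next drop
    last_box_travel: Optional[float] = None  # None means no box is on the belt yet
    pending: int = 0                         # spawn requests waiting for clearance
    spawned: List[str] = field(default_factory=list)

    def on_spawn_request(self) -> None:
        """Queue a request; it is served now only if the drop zone is clear."""
        self.pending += 1
        self._try_spawn()

    def on_box_travel(self, travel: float) -> None:
        """Record how far the most recent box has moved from the drop point."""
        self.last_box_travel = travel
        self._try_spawn()

    def _try_spawn(self) -> None:
        zone_clear = (self.last_box_travel is None
                      or self.last_box_travel >= self.clearance)
        if self.pending and zone_clear:
            self.pending -= 1
            self.spawned.append(f"box_{len(self.spawned)}")
            self.last_box_travel = 0.0  # the new box now occupies the drop zone


# A box is already sitting at the drop point, as shipping_box_0 does at startup.
guard = DropGuard(last_box_travel=0.0)
guard.on_spawn_request()   # request arrives before any movement update: deferred
guard.on_box_travel(0.6)   # box has cleared the zone: the queued drop happens now
```

With this structure, delivering the movement update before the request gives the same final state, so a box is never dropped while the previous one is still underneath.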

  13. Deanna Hood

    Thanks for reporting, @dan77062. The fix will be released this week; if it keeps happening and is interrupting your development in the meantime, you can build from source and the fix will be included.
