new gear is very slow when using qual1a.yaml

Issue #117 resolved
Former user created an issue

I just updated to the new ARIAC version 2.1.0. My Gazebo simulation slowed down from what was formerly a real-time factor of about 0.8 to what is now a real-time factor of 0.02.

Is there a way to get back to 40x faster?

Comments (17)

  1. Deanna Hood

    How's the performance with roslaunch osrf_gear sample_environment.launch?

    If performance is fine with the sample, then perhaps your team's sensor configuration hasn't been updated to account for the change in conveyor belt location, and your sensors are in collision with the belt. If that's the problem, the fix is to reposition your sensors so that they no longer collide with the conveyor belt in the new environment (see the example at the end of this comment).

    If the sample has the same issue, could you please attach:

    • a gazebo state log generated using roslaunch osrf_gear sample_environment.launch state_logging:=true (you'll have to let it run for ~5 min since the sim clock is running so slowly). This will let us inspect any unusual collisions that may be in the environment.
    • a video recording of the gazebo client window (there's a button in the top right corner). This will let us see if any of your simulation models are displaying differently on your machine specifically.
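
    For reference, a sensor entry in the team configuration file looks roughly like the snippet below. The sensor name and coordinates are only illustrative; the sensor configuration tutorial has the exact format for your GEAR version. Raising the z value or shifting the sensor in x/y is usually enough to move it out of the belt's collision volume.

    sensors:
      break_beam_over_belt:    # sensor name is up to you (illustrative)
        type: break_beam
        pose:                  # values here are made up; use your own placement
          xyz: [1.6, 2.0, 0.95]
          rpy: [0, 0, 3.14159]
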
  2. Wyatt Newman

    aha. I think that's it. Thanks!

    Can you point me to the appropriate way to launch qual2 with OSRF config together with a team config?

  3. Wyatt Newman

    But, if I use NO team config file and launch with just:

    rosrun osrf_gear gear.py -f $(catkin_find --share --first-only osrf_gear)/config/quals/qual1a.yaml

    I get a real-time factor of 0.02. Is this your experience?

  4. Deanna Hood

    I have seen the real-time factor take a hit when sensors collide with the belt, but I haven't seen it happen with just the trial config file, no. There are some default sensors that are loaded even if a team sensor configuration isn't provided. In particular, the break beam at the end of the belt may be in collision with the belt in your environment for some reason. This is what it should look like: congestion_sensor.jpg.

    Do you see that?

    You might be impacted by mixed GEAR versions again. Please follow the same steps as in https://bitbucket.org/osrf/ariac/issues/93 and see if that changes anything.
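
    A quick sanity check for mixed versions (assuming you installed the ariac Debian package at some point; the package name is an assumption here) is to list what's installed and see which copy of osrf_gear ROS actually resolves to:

    dpkg -l | grep -i ariac     # shows any ariac debs installed (package name assumed)
    rospack find osrf_gear      # shows which osrf_gear install ROS is picking up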

  5. Wyatt Newman

    to clarify, this works fine:

    roslaunch osrf_gear sample_environment.launch

    It runs on my machine at a real-time factor of about 0.90. However, I need to launch with our own config files.

    The following launch (with NO team config):

    rosrun osrf_gear gear.py -f $(catkin_find --share --first-only osrf_gear)/config/quals/qual1a.yaml

    runs at an RTF of 0.02. So something about the qual1a.yaml file is incompatible with the new osrf_gear simulation.

    My work-around is currently to run this:

    rosrun osrf_gear gear.py --visualize-sensor-views -f $(catkin_find --share --first-only osrf_gear)/config/sample.yaml ~/ariac_ws/ariac-docker/team_config/team_case_config/qual2_config.yaml

    This is somewhat odd, though, as it begins with 2 boxes on the conveyor (one near the first quality sensor). I also don't know what scenarios it will respond to (publication of different types of orders, simulation of dropped parts, inverted parts, faulty parts, etc).

    Are these scenarios embedded in the osrf_gear/config/sample.yaml type files? And if so, can we access alternative variations that are compatible with the new layout?

    thanks again, Wyatt

  6. Deanna Hood

    OK, it's not for all trials, just for some of them. That sounds like a different problem, thanks for clarifying.

    The sample.yaml config file is a special one that starts in the middle of the belt for the tutorials; it's not "standard" behaviour. There are various samples in the config directory, each with a description at the top: sample_interruption1, for example, explains that an interruption is expected (see the commands at the end of this comment). You can use those for testing, or see the config file tutorial for how to make a custom config file.

    What might be happening with qual1a.yaml is just that it has more products than your machine can handle; you might simply be running out of RAM. If that's the case you may need to switch computers.
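
    To see what's available, something along these lines lists the bundled trial configs and shows the description at the top of one of them (assuming the interruption sample is installed as sample_interruption1.yaml, and using the same catkin_find pattern as your commands):

    ls $(catkin_find --share --first-only osrf_gear)/config/          # list bundled trial configs
    head -n 10 $(catkin_find --share --first-only osrf_gear)/config/sample_interruption1.yaml   # read the description at the top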

  7. Wyatt Newman

    Thanks. I'll check those resources and the tutorial.

    I'm using <25% of RAM when I launch with qual1a.yaml, and there are 14 parts present. However, one of my cores is saturated, and the RTF is 0.02. This does sound like a case of collisions slowing the sim down.

    When I run rosrun osrf_gear gear.py --visualize-sensor-views -f $(catkin_find --share --first-only osrf_gear)/config/sample.yaml ~/ariac_ws/ariac-docker/team_config/team_case_config/qual2_config.yaml

    there are about twice as many parts present (30, in fact). The RTF is over 0.97, memory is still below 25%, and all cores are < 50% busy.

    It seems qual1a.yaml is not compatible with the new gear, though perhaps it isn't intended to be.

    I have my work-around, plus your tips for making alternative configs, so this is no longer urgent. But I don't know if it will hold up any other teams.

    thanks again, Wyatt

  8. Deanna Hood

    It might be the state logging, then, which is now enabled by default in qual1a (and only in qual1a); it was not enabled by default in previous versions. If that's the case then I'm afraid your workaround will only help until you want to use state logging. This matters because the automated evaluation setup has state logging enabled by default.

    Most config files have state logging disabled by default. If you add --state-logging=yes to your gear.py invocation of the trial configs that are otherwise working fine, does it affect the RTF? And conversely, does adding --state-logging=no to the invocation of qual1a fix its RTF? (Concrete invocations are at the end of this comment.)

    If state logging is the issue, then first check if it's still an issue in the automated evaluation setup. ./run_trial.bash example_team sample should take only a few minutes to run; if it doesn't, then the RTF might be low in the automated setup too, which will be an issue when you are preparing your team's submission. If your automated evaluation setup is recording state logs fine, then that's what matters most, and so I'd agree with you that getting to the bottom of this is not too urgent.
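
    To be concrete, the two invocations I mean above would look something like this (reusing the paths from your earlier commands; the team config path is just a placeholder):

    # trial config that was otherwise fine, now with state logging forced on
    rosrun osrf_gear gear.py --state-logging=yes -f $(catkin_find --share --first-only osrf_gear)/config/sample.yaml <path/to/your/team_config.yaml>
    # qual1a with state logging forced off
    rosrun osrf_gear gear.py --state-logging=no -f $(catkin_find --share --first-only osrf_gear)/config/quals/qual1a.yaml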

  9. Deanna Hood

    @wnewman I have a (simple) idea about what's causing your low RTF.

    What's your gazebo version?

    Gazebo 8.3 caused a low RTF when state logging was enabled. This has been fixed in Gazebo 8.4 (we referenced this on the announcements page, but it's easy to miss). If your version is 8.3, please upgrade Gazebo and it should be fixed!
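
    To check which version you have, and to upgrade (assuming you installed the gazebo8 packages from the OSRF apt repository):

    gazebo --version            # prints the installed Gazebo version
    sudo apt-get update && sudo apt-get install gazebo8 libgazebo8-dev   # upgrade, assuming OSRF apt packages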

  10. Steven Gray

    Just to chime in, this indeed seemed to be caused by the state logging. My RTF only dropped to about 0.63, but disabling the logging brought it back up to 0.93 or so.

  11. Deanna Hood

    Thanks for the input @stgray; state logging will normally lower the RTF a bit, which is expected behaviour. @wnewman unfortunately experienced quite a significant drop, which is not expected, but a likely explanation is an old Gazebo version.

  12. Wyatt Newman

    my version of Gazebo is 8.3, so that's not it. I think it has to do with the qual1a.yaml; perhaps a collision with the new environment?
