Topic name mismatch on heavy worlds with custom names

Issue #2053 resolved
Louise Poubel created an issue

When loading a heavy world with a custom name, it seems that some topics are mistakenly created under /gazebo/default, while others are correctly created under /gazebo/custom_name.

Comments (14)

  1. Louise Poubel reporter

    It was a world with a few thousand boxes. If I recall correctly, while the world was still being loaded, some classes didn't receive the world name in time and started referring to entities with the default name. Entities loaded after the world name was received would get the correct name.

  2. Louise Poubel reporter

    Also worth pointing out that I used the word "heavy" because I think this is more related to the time it takes to load a world than to the number of models. I can imagine the same issue can happen with a world containing a single model which is really large.

  3. Louise Poubel reporter

    This issue hit me again today while working on ServiceSim. The side effect is that Gazebo freezes at startup with a black screen and the splash screen never goes away. What happens:

    1. gzclient and gzserver are set up with different Gazebo transport namespaces (i.e. gzserver has /gazebo/ServiceSim according to the SDF world file, but gzclient falls back to the default /gazebo/default namespace).
    2. gzclient's scene sends a request to /gazebo/default/request.
    3. gzserver never responds.
    4. The client scene is never initialized.
    5. The splash screen never disappears.

    If you run gz topic -l while Gazebo is hanging, you can see that there are topics on different namespaces.
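
    The check described above can also be automated. The following is a minimal sketch, assuming the topic names have already been collected (for example from the output of gz topic -l). TopicNamespace and DistinctNamespaces are hypothetical helpers written for illustration, not part of Gazebo:

    ```cpp
    #include <cstddef>
    #include <set>
    #include <string>
    #include <vector>

    // Hypothetical helper, not part of Gazebo: extracts the namespace
    // component of a topic, e.g. "/gazebo/ServiceSim/request" -> "ServiceSim".
    // Returns an empty string if the topic does not start with "/gazebo/".
    std::string TopicNamespace(const std::string &_topic)
    {
      const std::string prefix = "/gazebo/";
      if (_topic.compare(0, prefix.size(), prefix) != 0)
        return "";

      // The namespace ends at the next '/', or at the end of the string.
      const std::size_t end = _topic.find('/', prefix.size());
      if (end == std::string::npos)
        return _topic.substr(prefix.size());
      return _topic.substr(prefix.size(), end - prefix.size());
    }

    // Hypothetical helper: collects every distinct namespace in a topic list.
    // More than one entry in the result indicates the mismatch described here.
    std::set<std::string> DistinctNamespaces(
        const std::vector<std::string> &_topics)
    {
      std::set<std::string> result;
      for (const auto &topic : _topics)
      {
        const std::string ns = TopicNamespace(topic);
        if (!ns.empty())
          result.insert(ns);
      }
      return result;
    }
    ```

    If DistinctNamespaces returns both "default" and the custom world name, the client and server have ended up on different namespaces.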

    A quick workaround is to set <world name="default"> in the SDF world file.

  4. d_hood

    I hit this in ARIAC and @mxgrey ran into this on a project they're working on. It's affecting community members too:

    While investigating for I identified it as something that was always possible (this issue is from 2016, so we knew that already) but it has been made much more likely in gazebo7.11 / gazebo8.3 releases (compared to gazebo8.2).

    Since there's a known workaround, it might make sense to announce it to users until the issue itself can be resolved, because this has been affecting a lot of users lately.

  5. d_hood

    @chapulina pointed out that I spoke too soon about! We haven't confirmed that that's the same issue yet (just a similar symptom) 😄

    If the issue is just affecting internal projects, and we all have a workaround, then maybe this isn't high priority. The question is how many people would be using custom world names externally.

  6. Michael Grey

    The project I'm working on is an external project that I'm assisting a client with, so I think this should be a high priority issue. The world file that's being launched is one that's installed to the /opt/ros/ filesystem, so changing the name in the .world file requires elevated privileges, which isn't a very appealing workaround (although it does get the job done).

  7. Louise Poubel reporter

    I'm tracking down the code path. So going backwards:

    1. gzclient nodes start falling back to the default namespace and print the infamous No namespaces found message.
    2. That's because transport::waitForNamespaces times out.
    3. Because transport::ConnectionManager::GetTopicNamespaces repeatedly returns empty.
    4. Because the packet topic_namepaces_init (yes, without an s) has never been received from Master::OnAccept.

    I didn't go deeper. It seems to me that a quick and dirty solution would be to make the scene wait in an infinite loop until there are namespaces before it calls node->Init. This assumes that the rendering should never be initialized first, which I think is a fair assumption based on the way Gazebo currently works.
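
    The "wait until there are namespaces" idea can be sketched as follows. This is only an illustration under stated assumptions: WaitForNamespacesForever is a hypothetical helper, and the _getNamespaces callback stands in for whatever call reports the namespaces known so far. It does not use the real Gazebo transport API.

    ```cpp
    #include <atomic>
    #include <chrono>
    #include <functional>
    #include <list>
    #include <string>
    #include <thread>

    // Hypothetical helper, not Gazebo API. Polls _getNamespaces until it
    // returns a non-empty list, with no timeout. The _stop flag is the only
    // way out before namespaces arrive, so the caller can still shut down.
    std::list<std::string> WaitForNamespacesForever(
        const std::function<std::list<std::string>()> &_getNamespaces,
        const std::atomic<bool> &_stop,
        std::chrono::milliseconds _pollInterval = std::chrono::milliseconds(100))
    {
      while (!_stop.load())
      {
        std::list<std::string> namespaces = _getNamespaces();
        if (!namespaces.empty())
          return namespaces;

        // Nothing yet: sleep briefly instead of spinning.
        std::this_thread::sleep_for(_pollInterval);
      }

      // Stopped before any namespace was received.
      return {};
    }
    ```

    Only after this returns a non-empty list would the scene go on to initialize its node, so it never falls back to the default namespace.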

  8. Michael Grey

    Maybe a cleaner way would be to have a flag that tracks whether initialization has ever happened successfully. While that flag is dirty, the client will periodically check whether topic_namepaces_init has been received, and then perform initialization once it has.

    The benefit would be that the gzclient would feel snappier for users. The disadvantage is that users might think something bugged out if the GUI pops up empty while the server is still booting up.
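
    A minimal sketch of the flag-based approach, assuming the client has some periodic hook (a timer or update loop) from which it can poll. DeferredInit and its callbacks are hypothetical names used for illustration, not Gazebo classes:

    ```cpp
    #include <functional>
    #include <list>
    #include <string>
    #include <utility>

    // Hypothetical class, not Gazebo API. Tracks whether initialization has
    // ever succeeded and retries on each Poll() until namespaces are known.
    class DeferredInit
    {
      public: DeferredInit(
          std::function<std::list<std::string>()> _getNamespaces,
          std::function<void(const std::string &)> _init)
        : getNamespaces(std::move(_getNamespaces)), init(std::move(_init))
      {
      }

      // Call periodically. Returns true once initialization has happened.
      public: bool Poll()
      {
        if (this->initialized)
          return true;

        const std::list<std::string> namespaces = this->getNamespaces();
        if (namespaces.empty())
          return false;

        // Initialize exactly once, with the first namespace received.
        this->init(namespaces.front());
        this->initialized = true;
        return true;
      }

      private: std::function<std::list<std::string>()> getNamespaces;
      private: std::function<void(const std::string &)> init;
      private: bool initialized = false;
    };
    ```

    Because Poll() returns immediately when nothing has arrived, the GUI stays responsive while it waits, which matches the trade-off described above.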

  9. Louise Poubel reporter

    A flag sounds nice too.

    I'm thinking here that for this to scale well, the fix will need to go into transport. The black screen problem comes from not receiving the scene message, but all other gzclient nodes should also wait for the actual namespace.

  10. Michael Grey

    Ah, so if the scene message is never received, the screen will just remain black anyway.

    In that case, I agree with your original proposal. If all we can get is a black screen until we're listening to the correct topics, then we should wait indefinitely for the topic namespace message. We just need to make sure that we handle SIGINT gracefully so that users can cleanly exit the program while we're in the indefinitely long loop.
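
    One way to keep an indefinite wait interruptible is sketched below, using only the C++ standard library. The names g_interrupted, OnSigInt and WaitUntilReadyOrInterrupted are hypothetical, and the sketch installs its own handler for simplicity; Gazebo's actual signal handling is not shown and may differ.

    ```cpp
    #include <chrono>
    #include <csignal>
    #include <functional>
    #include <thread>

    // Written by the signal handler and read by the wait loop.
    // sig_atomic_t is the type that is safe to write from a handler.
    volatile std::sig_atomic_t g_interrupted = 0;

    // The handler only sets the flag; all real work stays in the loop.
    void OnSigInt(int /*_signal*/)
    {
      g_interrupted = 1;
    }

    // Hypothetical helper: waits with no timeout until _ready() returns true.
    // Returns false if SIGINT arrived first, so the caller can exit cleanly.
    bool WaitUntilReadyOrInterrupted(
        const std::function<bool()> &_ready,
        std::chrono::milliseconds _pollInterval = std::chrono::milliseconds(100))
    {
      std::signal(SIGINT, OnSigInt);

      while (!g_interrupted)
      {
        if (_ready())
          return true;
        std::this_thread::sleep_for(_pollInterval);
      }
      return false;
    }
    ```

    Here _ready would be a check that the topic namespace message has been received. A false return means the user pressed Ctrl+C while waiting, and the program should shut down instead of continuing with initialization.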
