Unable to launch multiple vehicles

Issue #168 closed
Derek Knowles created an issue

Since the recent changes to cloudsim_bridge, I have been unable to set up a working development environment for launching multiple agents.

This morning, I rebuilt the cloudsim_sim and cloudsim_bridge images according to the tutorial, but I get the following error whenever I try to launch the controllers.

[X2] CommsClient::Register: Problem registering with broker
[CommsClient] Retrying register.. 

I used to be able to start multiple bridges at one time similar to this tutorial, but now that doesn’t work. (Which is a logical change).

I also tried the docker-compose tutorial, but got this error:

sim_1        | [Err] [Ogre2RenderEngine.cc:338] Unable to open display: :1
sim_1        | No protocol specified
sim_1        | terminate called after throwing an instance of 'Ogre::RenderingAPIException'
sim_1        |   what():  OGRE EXCEPTION(3:RenderingAPIException): Couldn`t open X display :1 in GLXGLSupport::getGLDisplay at /var/lib/jenkins/workspace/ogre-2.1-debbuilder/repo/RenderSystems/GL3Plus/src/windowing/GLX/OgreGLXGLSupport.cpp (line 789)

Does that error have something to do with the fact that the docker install instructions tell you to remove docker docker-ce docker-engine docker.io, but never tell you how to reinstall docker? I guessed sudo apt install docker-ce.

Any suggestions for me?

Comments (23)

  1. Alfredo Bencomo

    Derek,

    Can you check if this file exists on your local system?

    ls -l /tmp/.docker.xauth
    

    Regarding the docker install instructions: if you install nvidia-container-toolkit, it will install docker engine for you. Therefore, you don’t need to explicitly run sudo apt install docker-ce.

  2. Derek Knowles reporter
    REPOSITORY                  TAG                           IMAGE ID            CREATED             SIZE
    cloudsim_sim                2019_Sep_04_0944              4198f0f242ab        6 hours ago         4.55GB
    cloudsim_sim                31f67e3801db                  4198f0f242ab        6 hours ago         4.55GB
    cloudsim_sim                latest                        4198f0f242ab        6 hours ago         4.55GB
    cloudsim_bridge             2019_Sep_04_0944              8e114576bf73        6 hours ago         3.41GB
    cloudsim_bridge             31f67e3801db                  8e114576bf73        6 hours ago         3.41GB
    cloudsim_bridge             latest                        8e114576bf73        6 hours ago         3.41GB
    simple_submit               2019_Sep_04_0924              a05a3663a1b2        7 hours ago         4.49GB
    simple_submit               latest                        a05a3663a1b2        7 hours ago         4.49GB
    osrf/subt-virtual-testbed   cloudsim_sim_latest           f11b467a9f4b        5 days ago          4.55GB
    osrf/subt-virtual-testbed   cloudsim_bridge_latest        f9b59a945ead        5 days ago          3.41GB
    osrf/subt-virtual-testbed   subt_solution_latest          c44241493b84        6 days ago          3.21GB
    nvidia/cuda                 9.0-base                      1443caa429f9        9 days ago          137MB
    nvidia/cuda                 latest                        010a71dc59db        2 months ago        2.81GB
    nvidia/cudagl               10.1-devel-ubuntu18.04        95fbf8b375b9        3 months ago        3.08GB
    nvidia/opengl               1.0-glvnd-devel-ubuntu18.04   debc9feabdef        3 months ago        424MB
    hello-world                 latest                        fce289e99eb9        8 months ago        1.84kB
    
  3. Hector Escobar

    I am having the same issues as Derek with cloud_sim and the docker-compose tutorial. If I run your previous command, I get a similar error:

    No protocol specified
    [Err] [Ogre2RenderEngine.cc:338] Unable to open display: :0
    No protocol specified
    terminate called after throwing an instance of 'Ogre::RenderingAPIException'
    what(): OGRE EXCEPTION(3:RenderingAPIException): Couldn`t open X display :0 in GLXGLSupport::getGLDisplay at /var/lib/jenkins/workspace/ogre-2.1-debbuilder/repo/RenderSystems/GL3Plus/src/windowing/GLX/OgreGLXGLSupport.cpp (line 789)

  4. Derek Knowles reporter

    The simulation does not launch for me with that command. First error I got was:

    QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-developer'
    No protocol specified
    qt.qpa.screen: QXcbConnection: Could not connect to display :1
    Could not connect to any X display.
    

    And then:

    No protocol specified
    [Err] [Ogre2RenderEngine.cc:338] Unable to open display: :1
    No protocol specified
    terminate called after throwing an instance of 'Ogre::RenderingAPIException'
      what():  OGRE EXCEPTION(3:RenderingAPIException): Couldn`t open X display :1 in GLXGLSupport::getGLDisplay at /var/lib/jenkins/workspace/ogre-2.1-debbuilder/repo/RenderSystems/GL3Plus/src/windowing/GLX/OgreGLXGLSupport.cpp (line 789)
    ./run_sim.bash: line 6:    49 Aborted                 (core dumped) ign launch -v 4 $@
    
  5. Hector Escobar

    @Alfredo Bencomo ,

    I ran the same commands you specified and got the same error messages as Derek.

    with

    Docker version 19.03.1, build 74b1e89

  6. Alfredo Bencomo

    Derek/Hector:

    Run these commands:

    $ ls -la /tmp | grep docker
    
    # IF YOU SEE THIS OUTPUT, THEN RUN THE COMMANDS BELOW...
    drwxr-xr-x  2 root    root     4096 Sep  3 14:27 .docker.xauth
    -rw-------  1 root    root      100 Sep  4 16:16 .docker.xauth-n
    
    $ sudo rmdir /tmp/.docker.xauth
    $ sudo rm /tmp/.docker.xauth-n
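
    The listing above shows /tmp/.docker.xauth as a directory, while the earlier check in this thread refers to it as a file. Docker creates a missing host path as a directory when it is bind-mounted with -v, which may be how the directory appeared; that cause is an assumption and is not confirmed in this thread. A minimal sketch that only reports the state of both paths and deletes nothing (the helper name xauth_path_state is hypothetical):

```shell
#!/usr/bin/env bash
# Classify a path as "missing", "directory", "file", or "other".
# xauth_path_state is a hypothetical helper, not part of the SubT scripts.
xauth_path_state() {
    local path="$1"
    if [ ! -e "$path" ]; then
        echo "missing"
    elif [ -d "$path" ]; then
        echo "directory"
    elif [ -f "$path" ]; then
        echo "file"
    else
        echo "other"
    fi
}

# Report on the two paths from the listing above; nothing is removed here.
for p in /tmp/.docker.xauth /tmp/.docker.xauth-n; do
    echo "$p: $(xauth_path_state "$p")"
done
```

    If the first path is reported as "directory", the rmdir and rm commands above apply; if both are "missing", there is nothing to clean up.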
    
  7. Derek Knowles reporter

    I removed all of my subt docker images and am rebuilding the cloudsim images through the tutorial again to test it.

    As in the initial post, I still get stuck on the sudo systemctl restart docker step of the docker install instructions because docker isn't installed at that point. The nvidia-container-toolkit installation step doesn't actually install anything new, since I already have nvidia-container-toolkit installed and it's not removed earlier in the instructions. At this point dpkg -l | grep docker shows rc docker-ce 5:19.03.2~3-0~ubuntu-bionic. Again, I explicitly installed docker with sudo apt install docker-ce.
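
    In dpkg -l output, the leading "rc" means the package was removed but its configuration files remain, while "ii" means it is installed. That matches the observation that docker-ce is not actually present at the restart step. A small sketch that decodes the status column of one such line (dpkg_line_state is a hypothetical helper, and only the two codes relevant here are spelled out):

```shell
#!/usr/bin/env bash
# Translate the two-letter status at the start of a `dpkg -l` line.
# dpkg_line_state is a hypothetical helper name.
dpkg_line_state() {
    local line="$1"
    local code="${line%% *}"   # text before the first space, e.g. "rc"
    case "$code" in
        ii) echo "installed" ;;
        rc) echo "removed, config files remain" ;;
        *)  echo "other ($code)" ;;
    esac
}

# The line reported above:
dpkg_line_state "rc docker-ce 5:19.03.2~3-0~ubuntu-bionic"
# prints: removed, config files remain
```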

  8. Derek Knowles reporter

    I ran ls -la /tmp | grep docker if that’s what you’re asking. It didn’t output anything, so I didn’t remove anything from the /tmp/ directory

    My last comments are regarding this tutorial to install docker. I don’t understand in which step docker is actually supposed to be installed.

  9. Derek Knowles reporter

    The cloudsim docker images finished building and I tried finishing the tutorial, but am still encountering the same problem with the comms clients.

    [X2] CommsClient::Register: Problem registering with broker
    [CommsClient] Retrying register.. 
    

    Thanks for looking into the docker-compose issue, but the original cloudsim issue I posted doesn’t seem to have been resolved. Did anything change in the cloudsim docker images? I am confused about what was supposed to change between when I first posted and when I retried everything just now.

  10. Hector Escobar

    @Alfredo Bencomo ,

    I still have the same issues even after running:

    sudo rmdir /tmp/.docker.xauth
    sudo rm /tmp/.docker.xauth-n
    

    Also did you move your previous comments? Can’t see them anymore.

  11. Alfredo Bencomo

    Hector,
    So when you now run ls -l /tmp/.docker.xauth, nothing is listed, and you still have the same issues when you run ONLY this tutorial (don’t run docker-compose yet)?

  12. Alfredo Bencomo
    • changed status to open

    Run these commands and update your workspace branch (if applicable).

    $ sudo rm -rf /tmp/.docker.xauth
    
    $ docker-compose down
    
    $ cd ~/subt_ws/src/subt
    
    $ hg pull && hg up -C
    

    And use ./run_docker_compose.sh to run with Docker-Compose instead of the former docker-compose up command.

  13. Hector Escobar

    @Alfredo Bencomo ,

    I tested both tutorials and here are my results.

    For tutorial 1, with ls -l /tmp/.docker.xauth showing no file, both robots communicate and move, but I get some error messages in different terminals (including a bind error). Are these expected?

    So here is a detailed description:

    Terminal 1 (No Error):

    ./run.bash cloudsim_sim cloudsim_sim.ign robotName1:=X1 robotConfig1:=X1_SENSOR_CONFIG1 robotName2:=X2 robotConfig2:=X2_SENSOR_CONFIG1

    Terminal 2 (bridge 1, tf_world_static-4 finished, death of process):

    ./run.bash cloudsim_bridge robotName1:=X1 robotConfig1:=X1_SENSOR_CONFIG1

    [Dbg] [Manager.cc:421] Death of process[52] with name[x1_description].
    process[X1/ros1_ign_bridge_imu-1]: started with pid [108]
    process[subt_ros_relay-3]: started with pid [127]
    process[X1/ros1_ign_bridge_pose-2]: started with pid [140]
    process[tf_world_static-4]: started with pid [153]
    ….

    process[X1/pose_tf_broadcaster-9]: started with pid [249]
    [tf_world_static-4] process has finished cleanly

    Terminal 3 (bridge 2, death of process)

    ./run.bash cloudsim_bridge robotName2:=X2 robotConfig2:=X2_SENSOR_CONFIG1

    [Dbg] [Manager.cc:421] Death of process[52] with name[x2_description].

    process[X2/ros1_ign_bridge_odom-7]: started with pid [182]
    process[X2/ros1_ign_bridge_battery_state-8]: started with pid [194]
    process[X2/pose_tf_broadcaster-9]: started with pid [206]

    Terminals 4 and 5 show the same error (controllers 1 and 2):

    roslaunch subt_example example_robot.launch name:=X1

    [CommsClient] Retrying register..
    [X1] CommsClient::Register: Problem registering with broker
    [CommsClient] Validation service not available, invalid address or model not available
    [X1] Bind() error: Trying to bind before communications are enabled!
    [FATAL] [1567765907.647300154, 55.940000000]: subt_example_node did not successfully bind to the CommsClient
    [ INFO] [1567765907.649587710, 55.944000000]: Starting competitor

    [ INFO] [1567765946.030639903, 93.116000000]: TeleopCommCallback

  14. Hector Escobar

    Using ./run_docker_compose.sh launches a world with two robots that head to the entrance and seems to work fine.

  15. Hector Escobar

    The main problem is that we are able to launch 4 vehicles with the catkin method but not with cloudsim. When we launch the vehicles with cloudsim, the following error appears after a few seconds:

    [X2_1] CommsClient::Register: Problem registering with broker
    [CommsClient] Validation service not available, invalid address or model not available
    [X2_1] Bind() error: Trying to bind before communications are enabled!

    I’ll try to remove our comms and try again.

  16. Alfredo Bencomo

    Hi Hector,

    Thank you for taking the time to provide more details. I was able to reproduce the CommsClient problem registering with the broker, so we’re looking into it right now. I’ll post an update here.

    In the meantime, can you try to run your 4 vehicles using docker-compose by editing the docker-compose.yml file? Notice that you need to add a bridge, a solution, and a relay for each of the vehicles.

    Let me know if you have any questions.
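
    To make the "one bridge, one solution, and one relay per vehicle" requirement concrete, here is a rough sketch that enumerates the services a given set of robots would need, which is 12 for 4 vehicles. The "<role>_<robot>" naming and the function name are hypothetical and are not taken from the SubT docker-compose.yml; the actual service definitions (images, commands, networks) have to follow that file.

```shell
#!/usr/bin/env bash
# List the three services each vehicle needs in docker-compose.yml.
# The "<role>_<robot>" naming is hypothetical, not the repository's scheme.
services_for_robots() {
    local robot role
    for robot in "$@"; do
        for role in bridge solution relay; do
            echo "${role}_${robot}"
        done
    done
}

# Four vehicles; the robot names here are only examples.
services_for_robots X1 X2 X3 X4
```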

  17. Alfredo Bencomo

    Hector/Derek,

    PR#286 and PR#287 have been merged. The former addresses the Couldn`t open X display :1 issue, and the documentation has been updated.

    # Use now the new script to launch docker-compose
    
    $ ./run_docker_compose.sh
    

    The latter fixes the CommsClient::Register: Problem registering with broker issue, and the documentation has also been updated (i.e. use a single bridge for a local system in the same network).

    # Update your SubT workspace
    
    $ cd ~/subt_ws/src/
    $ hg pull && hg up -C
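
    The simulation command quoted earlier in this thread passes one robotNameN:=... robotConfigN:=... pair per robot. A small sketch that builds those arguments for any number of robots is below. The helper robot_args is hypothetical, and whether the single bridge takes the same full argument list is an assumption; the updated documentation is the authority on the exact invocation.

```shell
#!/usr/bin/env bash
# Build "robotNameN:=<name> robotConfigN:=<config>" arguments from
# name:config pairs, numbering the robots from 1.
# robot_args is a hypothetical helper, not part of the SubT scripts.
robot_args() {
    local i=1
    local out=""
    local pair name config
    for pair in "$@"; do
        name="${pair%%:*}"      # text before the first colon
        config="${pair#*:}"     # text after the first colon
        out="${out:+$out }robotName${i}:=${name} robotConfig${i}:=${config}"
        i=$((i + 1))
    done
    echo "$out"
}

robot_args X1:X1_SENSOR_CONFIG1 X2:X2_SENSOR_CONFIG1
# prints: robotName1:=X1 robotConfig1:=X1_SENSOR_CONFIG1 robotName2:=X2 robotConfig2:=X2_SENSOR_CONFIG1

# Possible use with the simulation command quoted earlier in this thread:
# ./run.bash cloudsim_sim cloudsim_sim.ign $(robot_args X1:X1_SENSOR_CONFIG1 X2:X2_SENSOR_CONFIG1)
```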
    
  18. Hector Escobar

    Hi @Alfredo Bencomo ,

    I tested the new instructions for local cloudsim and can now launch my 4 vehicles again. I also tested Docker Compose with your example using ./run_docker_compose.sh, and it works too.

    I’ll be testing the docker compose with our own image next and I’ll report back. I just have to figure out how to adapt the entry point for each different robot with a single solution image.

    Could you share the Dockerfile of your “osrf/subt-virtual-testbed:subt_solution_latest”?
