#2955 Merged at 9f027b1
Source repository
enwaytech
Source branch
gpu_vertical_jacob
Destination repository
osrf
Destination branch
gazebo7
Author
  1. Jacob Seibert
Reviewers
Description

Fixes GpuRaySensor with more than one vertical ray (see issue #946).

The configuration is identical to that of the CPU RaySensor.

Comments (37)

  1. Javier Choclin

    Hi, Jacob, if I run the INTEGRATION_gpu_laser test I get:

    :~/gazebo/build$ ./test/integration/INTEGRATION_gpu_laser 
    [==========] Running 5 tests from 1 test case.
    [----------] Global test environment set-up.
    [----------] 5 tests from GPURaySensorTest
    
    ....
    
    [       OK ] GPURaySensorTest.Heightmap (2084 ms)
    [ RUN      ] GPURaySensorTest.LaserVertical
    [Msg] Waiting for master.
    [Msg] Connected to gazebo master @ http://127.0.0.1:11345
    [Msg] Publicized address: 172.17.0.2
    [Dbg] [ServerFixture.cc:207] ServerFixture load in 1.1 seconds, timeout after 600 seconds
    [Wrn] [msgs.cc:1808] Conversion of sensor type[gpu_ray] not suppported.
    /home/docker/gazebo/test/integration/gpu_laser.cc:465: Failure
    The difference between raySensor->Range(i*samples + mid) and expectedRange is 0.079109901099460034, which exceeds 1e-3, where
    raySensor->Range(i*samples + mid) evaluates to 0.43266329169273376,
    expectedRange evaluates to 0.35355339059327373, and
    1e-3 evaluates to 0.001.
    /home/docker/gazebo/test/integration/gpu_laser.cc:465: Failure
    The difference between raySensor->Range(i*samples + mid) and expectedRange is 0.065850478119853451, which exceeds 1e-3, where
    raySensor->Range(i*samples + mid) evaluates to 0.39836812019348145,
    expectedRange evaluates to 0.33251764207362799, and
    1e-3 evaluates to 0.001.
    /home/docker/gazebo/test/integration/gpu_laser.cc:465: Failure
    The difference between raySensor->Range(i*samples + mid) and expectedRange is 0.053652972699826496, which exceeds 1e-3, where
    raySensor->Range(i*samples + mid) evaluates to 0.36877107620239258,
    expectedRange evaluates to 0.31511810350256608, and
    
    ....
    
    /home/docker/gazebo/test/integration/gpu_laser.cc:465: Failure
    The difference between raySensor->Range(i*samples + mid) and expectedRange is 0.043081311625098206, which exceeds 1e-3, where
    raySensor->Range(i*samples + mid) evaluates to 0.34375375509262085,
    expectedRange evaluates to 0.30067244346752264, and
    1e-3 evaluates to 0.001.
    /home/docker/gazebo/test/integration/gpu_laser.cc:465: Failure
    The difference between raySensor->Range(i*samples + mid) and expectedRange is 0.053652764083569726, which exceeds 1e-3, where
    raySensor->Range(i*samples + mid) evaluates to 0.36877086758613586,
    expectedRange evaluates to 0.31511810350256614, and
    1e-3 evaluates to 0.001.
    /home/docker/gazebo/test/integration/gpu_laser.cc:465: Failure
    The difference between raySensor->Range(i*samples + mid) and expectedRange is 0.065850299305919124, which exceeds 1e-3, where
    raySensor->Range(i*samples + mid) evaluates to 0.39836794137954712,
    expectedRange evaluates to 0.33251764207362799, and
    1e-3 evaluates to 0.001.
    /home/docker/gazebo/test/integration/gpu_laser.cc:465: Failure
    The difference between raySensor->Range(i*samples + mid) and expectedRange is 0.07910975208784804, which exceeds 1e-3, where
    raySensor->Range(i*samples + mid) evaluates to 0.43266314268112183,
    expectedRange evaluates to 0.35355339059327379, and
    1e-3 evaluates to 0.001.
    [Dbg] [ServerFixture.cc:133] ServerFixture::Unload
    [  FAILED  ] GPURaySensorTest.LaserVertical (1762 ms)
    [ RUN      ] GPURaySensorTest.LaserScanResolution
    [Msg] Waiting for master.
    [Msg] Connected to gazebo master @ http://127.0.0.1:11345
    [Msg] Publicized address: 172.17.0.2
    [Dbg] [ServerFixture.cc:207] ServerFixture load in 1 seconds, timeout after 600 seconds
    [Wrn] [msgs.cc:1808] Conversion of sensor type[gpu_ray] not suppported.
    [Dbg] [ServerFixture.cc:133] ServerFixture::Unload
    [       OK ] GPURaySensorTest.LaserScanResolution (1448 ms)
    [----------] 5 tests from GPURaySensorTest (10228 ms total)
    
    [----------] Global test environment tear-down
    [==========] 5 tests from 1 test case ran. (10228 ms total)
    [  PASSED  ] 4 tests.
    [  FAILED  ] 1 test, listed below:
    [  FAILED  ] GPURaySensorTest.LaserVertical
    

    This is consistent with what I see in Gazebo using a simple world:

    Screenshot from 2018-05-09 16-55-19.png

    We still have that saw-looking border. It is better than it was before, but the issue remains.

    Did you get the same results I did?

  2. Jacob Seibert author

    That's strange because the tests passed yesterday on my machine. I will check it again tomorrow.

    Yes, the borders look similar for me, but this also depends on the resolution. My guess about the saw-looking border: I think it is due to the (vertical) resolution of the depth image. At large incidence angles (near 90 degrees) there are big depth steps between neighbouring pixels. You can increase the resolution to reduce that effect, but because of the way the depth image is computed, I think we can never avoid it completely (see the sketch at the end of this comment).

    laser_saw_shape.JPG

    Ideally we would have a parameter to increase the internal resolution of the first pass depth image without changing the resulting scan size. Unfortunately we can't use the "resolution" parameter for that without breaking consistency with the CPU laser.

    Alternatively, one could try applying smoothing (anti-aliasing?) to the textures sampled in the second pass, but I'm not sure that is a good idea; it could also distort the ranges.
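
    To make the depth-step argument concrete, here is a back-of-the-envelope sketch (my own illustration with made-up numbers, not taken from this PR): on a flat surface, the range difference between two neighbouring depth pixels grows roughly like r * dphi * tan(theta), so it blows up as the incidence angle theta approaches 90 degrees.

    #include <cmath>
    #include <cstdio>

    int main()
    {
      const double r = 1.0;                    // distance to the surface [m]
      const double dphi = 0.1 * M_PI / 180.0;  // angular pitch between pixels
      const double degs[] = {30.0, 60.0, 80.0, 89.0};
      for (double deg : degs)
      {
        // theta: incidence angle between the ray and the surface normal.
        // Depth step between neighbouring pixels ~ r * dphi * tan(theta).
        const double step = r * dphi * std::tan(deg * M_PI / 180.0);
        std::printf("theta = %2.0f deg -> depth step ~ %.4f m\n", deg, step);
      }
      return 0;
    }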

  3. Javier Choclin

    Check that you are actually running INTEGRATION_gpu_laser. For that you need graphics tests enabled; check the display capabilities in the CMake output. You can force those tests to compile with `-DFORCE_GRAPHIC_TESTS_COMPILATION=True`.
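
    For example (the make target name is my assumption; the run path matches your log):

    cd ~/gazebo/build
    cmake -DFORCE_GRAPHIC_TESTS_COMPILATION=True ..
    make INTEGRATION_gpu_laser   # per-executable target; name assumed
    ./test/integration/INTEGRATION_gpu_laser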

    Maybe an option is to increase the first-pass resolution with a constant multiplier on height and width. At least that way we can be sure whether that is the issue. Once we confirm it, we can think about leaving the multiplier in. It is not ideal, but it keeps consistency with the CPU laser.

  4. Jacob Seibert author

    I suggest we add a multiplier that ensures the first pass has a minimum resolution of x pixels per degree of the respective field-of-view angle, with x being a hardcoded constant (see the sketch below). Then we can see whether that is satisfactory.
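
    A minimal sketch of what I mean, assuming an arbitrary value for x (names and numbers are placeholders, not the actual implementation):

    #include <algorithm>
    #include <cmath>

    // The hardcoded "x": minimum first-pass pixels per degree of FOV (assumed).
    static const double kPixelsPerDegree = 20.0;

    // Sample count for one axis of the first-pass depth image, given the
    // user-requested samples and the field of view in radians.
    unsigned int FirstPassSamples(unsigned int requestedSamples, double fovRad)
    {
      const double fovDeg = fovRad * 180.0 / M_PI;
      const unsigned int minSamples =
          static_cast<unsigned int>(std::ceil(kPixelsPerDegree * fovDeg));
      return std::max(requestedSamples, minSamples);
    }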

    BTW: I also did some profiling today and observed that the GpuRaySensor spends a large amount of time reading parameters out of the SDF inside loops. I modified the code to improve this, roughly as sketched below, and can push it once I have tested it.
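
    The change is along these lines (illustrative names, not the actual diff): hoist the SDF lookup out of the loop so each iteration compares against a cached double instead of re-reading the element.

    #include <algorithm>
    #include <vector>
    #include <sdf/sdf.hh>  // sdf::ElementPtr, sdf::Element::Get<T>()

    // Before, the loop body called rangeElem->Get<double>("max") on every
    // iteration; now the SDF lookup happens exactly once.
    void ClampRanges(const sdf::ElementPtr &rangeElem,
                     std::vector<double> &ranges)
    {
      const double maxRange = rangeElem->Get<double>("max");  // one lookup
      for (auto &range : ranges)
        range = std::min(range, maxRange);
    }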

  5. Javier Choclin

    Screenshot from 2018-05-11 14-10-30.png

    OK, I tested it myself. It looks much better; you can see the result above. I just multiplied the resolution of the first image by 10 here. We can add a parameter or have it hardcoded. I haven't tested the performance impact in detail, but I can see that the FPS drops to 50, and I am using a desktop GTX 1070.

    1. Ian Chen

      We just need to be careful about the multiplier. We should set an upper limit, since different graphics cards support different maximum texture sizes; it used to be 2048 or 4096, and high-end cards these days have much larger limits (a query sketch is below).
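
      For instance, the supported maximum can be queried at runtime and used to clamp whatever the multiplier produces. This is a plain OpenGL sketch, not Gazebo's actual code, and it needs a current GL context:

      #include <GL/gl.h>
      #include <algorithm>

      // Clamp a requested first-pass texture width to the GPU's maximum.
      unsigned int ClampTextureWidth(unsigned int requestedWidth)
      {
        GLint maxSize = 0;
        glGetIntegerv(GL_MAX_TEXTURE_SIZE, &maxSize);  // e.g. 2048 on old cards
        return std::min(requestedWidth, static_cast<unsigned int>(maxSize));
      }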

  6. Javier Choclin

    Well, another option is to use a fixed resolution value on one axis and derive the other to keep the aspect ratio; this way we can guarantee compatibility. Or we set a maximum resolution, which guarantees that the result of the multiplier is safe.

  7. Jacob Seibert author

    I'm not yet sure which solution for the resolution I'd prefer. At least for me, the motivation for switching to the GPU laser is better performance when simulating robots with not just one but multiple LIDARs on board. So I think the resolution should be as high as needed, but no higher.

  8. Jacob Seibert author

    I have now implemented a simple solution that ensures a minimum horizontal resolution of 2048 pixels per camera/texture, with the vertical resolution set accordingly (roughly as sketched below). I think that should be safe with respect to the GPU's maximum texture size. Of course one could exceed 2048 by setting a higher sample count, but I think that's unlikely for now. And I think the results are "good enough", or do you think that resolution is too low?
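
    The idea, in rough pseudocode (illustrative names, not the exact implementation):

    #include <algorithm>

    static const unsigned int kMinFirstPassWidth = 2048;  // chosen minimum

    // First-pass depth-image size: enforce the minimum horizontal resolution
    // and derive the height so the angular pixel density matches on both axes.
    void FirstPassSize(unsigned int hSamples, double hFovRad, double vFovRad,
                       unsigned int &width, unsigned int &height)
    {
      width = std::max(hSamples, kMinFirstPassWidth);
      height = std::max(1u,
          static_cast<unsigned int>(width * vFovRad / hFovRad + 0.5));
    }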

  9. Jacob Seibert author

    Today I discovered a bug: the plugin only worked correctly when the vertical min and max angles were symmetrical (i.e. vertical half angle = 0). But I was able to fix it, so it should now also be possible to create sensors with asymmetric angles like this:

    Unsymmetrix_angles_fix_small.png

    This matters, for example, when simulating the Velodyne HDL-32E, which has a +10° to -30° vertical FOV (see the configuration sketch below).
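
    For reference, an asymmetric vertical scan can be configured like this (the values only approximate the HDL-32E, and the rest of the sensor element is omitted):

    <sensor name="hdl32e" type="gpu_ray">
      <ray>
        <scan>
          <horizontal>
            <samples>1800</samples>
            <min_angle>-3.14159</min_angle>
            <max_angle>3.14159</max_angle>
          </horizontal>
          <vertical>
            <samples>32</samples>
            <min_angle>-0.5236</min_angle>  <!-- -30 deg -->
            <max_angle>0.1745</max_angle>   <!-- +10 deg -->
          </vertical>
        </scan>
        <range>
          <min>0.1</min>
          <max>70.0</max>
        </range>
      </ray>
    </sensor>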

    Please feel free to test this behaviour as well; mistakes are always possible...

  10. Javier Choclin

    It is looking good. Just in case, can you share the SDF for the Velodyne sensor you are using? I have the feeling that yours goes from -10° to -30°; is that possible? Besides that, I made two very small comments, and then I think we can approve this PR. 2048 as the minimum value seems a little big, but 1048 didn't look very good, so I understand your decision and it is OK from my side.

    I feel we should add more tests to cover more cases like the one you pointed out and fixed, with different vertical FOV values. If you want to do that, great; if not, I volunteer to add them in a separate PR later. What is your opinion, @Ian Chen?

  11. Jacob Seibert author

    I can't share the actual SDF, but for the screenshot above I was just using two arbitrary angles for testing; they were not related to any real existing sensor. In fact, I only found the bug now because most of the time I tested with settings for the Velodyne VLP-16, i.e. a symmetrical -15° to +15°. It's very similar to https://bitbucket.org/DataspeedInc/velodyne_simulator/src/master/velodyne_description/urdf/VLP-16.urdf.xacro

    I'm on vacation for the next week, so I won't be able to work on this now. I would also vote for more tests, but personally I would prefer them to be added in a separate PR.
    I think the most common sensor types should be working now, and even if we find more bugs, it's probably already much more usable than the current GpuRaySensor.

    1. Ian Chen

      xenial-nvidia's Unit_Visual test failed. I cannot reproduce it locally, so it could be a red herring.

      homebrew's gpu_laser integration test failed; I'll look into it.

      1. Ian Chen

        I tested on OS X and get the gpu_laser integration test failure there. This is the result I'm getting:

        gpulaser_osx.jpg

        The laser penetrates the ground plane and the ray intersections near the bottom of the box are also a little off.

        Going forward, I think we should still try to get this PR merged, as it works well on xenial, and we'll leave the issue open to capture these OS X problems.

    2. Javier Choclin

      It seems the test didn't run on xenial-nvidia. No idea why. As you said, it works locally for me too.

      About the gpu_laser integration test: I am back to inconsistent results. Sometimes I get a segmentation fault, sometimes it passes. Let me know if you want me to look into it too.

      1. Ian Chen

        I don't get the gpu_laser integration test failure on xenial locally. So yes, please take a look at the issue on your end; it could be related to the homebrew test failure.

        1. Javier Choclin

          This is the backtrace with gdb:

          [       OK ] GPURaySensorTest.LaserUnitBox (2542 ms)
          [ RUN      ] GPURaySensorTest.NameCollision
          
          Thread 1 "INTEGRATION_gpu" received signal SIGSEGV, Segmentation fault.
          0x00007ffff3dcd545 in _int_malloc (av=av@entry=0x7ffff410fb20 <main_arena>, bytes=bytes@entry=8192) at malloc.c:3727
          3727    malloc.c: No such file or directory.
          (gdb) backtrace
          #0  0x00007ffff3dcd545 in _int_malloc (av=av@entry=0x7ffff410fb20 <main_arena>, bytes=bytes@entry=8192) at malloc.c:3727
          #1  0x00007ffff3dcf184 in __GI___libc_malloc (bytes=8192) at malloc.c:2913
          #2  0x00007ffff46c1e78 in operator new(unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
          #3  0x00007ffff46c1f19 in operator new[](unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
          #4  0x00007ffff4720468 in std::basic_filebuf<char, std::char_traits<char> >::_M_allocate_internal_buffer() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
          #5  0x00007ffff47246b2 in std::basic_filebuf<char, std::char_traits<char> >::open(char const*, std::_Ios_Openmode) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
          #6  0x00007ffff47248c0 in std::basic_ofstream<char, std::char_traits<char> >::open(char const*, std::_Ios_Openmode) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
          #7  0x00007ffff5babd25 in gazebo::common::FileLogger::Init (this=0x7ffff5ec6c60 <gazebo::common::Console::log>, _prefix="test-", _filename="test.log") at /home/docker/gazebo/gazebo/common/Console.cc:211
          #8  0x0000000000477961 in gazebo::ServerFixture::ServerFixture (this=0x7cb5b0) at /home/docker/gazebo/gazebo/test/ServerFixture.cc:91
          #9  0x000000000043faba in GPURaySensorTest::GPURaySensorTest (this=0x7cb5b0) at /home/docker/gazebo/test/integration/gpu_laser.cc:28
          #10 0x000000000043fc90 in GPURaySensorTest_NameCollision_Test::GPURaySensorTest_NameCollision_Test (this=0x7cb5b0) at /home/docker/gazebo/test/integration/gpu_laser.cc:216
          #11 0x000000000044606e in testing::internal::TestFactoryImpl<GPURaySensorTest_NameCollision_Test>::CreateTest (this=0x7b8dd0) at /home/docker/gazebo/test/gtest/include/gtest/internal/gtest-internal.h:454
          #12 0x000000000046bc3f in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::TestFactoryBase, testing::Test*> (object=0x7b8dd0, 
              method=&virtual testing::internal::TestFactoryBase::CreateTest(), location=0x4b4468 "the test fixture's constructor") at /home/docker/gazebo/test/gtest/src/gtest.cc:2079
          #13 0x0000000000466fa3 in testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::TestFactoryBase, testing::Test*> (object=0x7b8dd0, 
              method=&virtual testing::internal::TestFactoryBase::CreateTest(), location=0x4b4468 "the test fixture's constructor") at /home/docker/gazebo/test/gtest/src/gtest.cc:2115
          #14 0x000000000044cc13 in testing::TestInfo::Run (this=0x7c9ee0) at /home/docker/gazebo/test/gtest/src/gtest.cc:2320
          #15 0x000000000044d2c6 in testing::TestCase::Run (this=0x7b8cb0) at /home/docker/gazebo/test/gtest/src/gtest.cc:2445
          #16 0x00000000004543a4 in testing::internal::UnitTestImpl::RunAllTests (this=0x7b9670) at /home/docker/gazebo/test/gtest/src/gtest.cc:4316
          #17 0x000000000046cad1 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x7b9670, 
              method=(bool (testing::internal::UnitTestImpl::*)(testing::internal::UnitTestImpl * const)) 0x4540e4 <testing::internal::UnitTestImpl::RunAllTests()>, 
              location=0x4b4b08 "auxiliary test code (environments or event listeners)") at /home/docker/gazebo/test/gtest/src/gtest.cc:2079
          #18 0x0000000000467b5f in testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x7b9670, 
              method=(bool (testing::internal::UnitTestImpl::*)(testing::internal::UnitTestImpl * const)) 0x4540e4 <testing::internal::UnitTestImpl::RunAllTests()>, 
              location=0x4b4b08 "auxiliary test code (environments or event listeners)") at /home/docker/gazebo/test/gtest/src/gtest.cc:2115
          #19 0x0000000000452f6f in testing::UnitTest::Run (this=0x6f5820 <testing::UnitTest::GetInstance()::instance>) at /home/docker/gazebo/test/gtest/src/gtest.cc:3927
          #20 0x000000000043eb00 in RUN_ALL_TESTS () at /home/docker/gazebo/test/gtest/include/gtest/gtest.h:2288
          #21 0x000000000043d056 in main (argc=1, argv=0x7fffffffea78) at /home/docker/gazebo/test/integration/gpu_laser.cc:602
          (gdb) 
          

          The issue seems to be that opening the test log fails from time to time, which is why I get SIGSEGVs from time to time. EDIT: The malloc error seems to come from heap corruption.

            1. Javier Choclin

              97% tests passed, 19 tests failed out of 572

              Total Test time (real) = 5809.00 sec

              The following tests FAILED:
              309 - PERFORMANCE_transport_stress (Timeout)
              310 - check_PERFORMANCE_transport_stress (Failed)
              323 - INTEGRATION_contain_plugin (Failed)
              343 - INTEGRATION_harness (Failed)
              375 - INTEGRATION_model_database (Timeout)
              376 - check_INTEGRATION_model_database (Failed)
              417 - INTEGRATION_rest_web (Timeout)
              418 - check_INTEGRATION_rest_web (Failed)
              435 - INTEGRATION_static_map_plugin (OTHER_FAULT)
              436 - check_INTEGRATION_static_map_plugin (Failed)
              463 - INTEGRATION_worlds_installed (Timeout)
              464 - check_INTEGRATION_worlds_installed (Failed)
              475 - INTEGRATION_gpu_laser (SEGFAULT)
              476 - check_INTEGRATION_gpu_laser (Failed)
              483 - INTEGRATION_ogre_log (Failed)
              519 - EXAMPLE_examples_build (Failed)
              565 - REGRESSION_config-cmake (Failed)
              567 - REGRESSION_config-pkgconfig (Failed)
              571 - UNIT_gz_TEST (Failed)

              I would say that something corrupts the heap for this test. It is not an issue in common/Console.cc or test/ServerFixture.cc. It could even be something from NVIDIA or a dependency…

  12. Javier Choclin

    After testing this PR through its commit history and even on Gazebo7 itself, I think the issue does not come from this PR. I have had the problem at least since changeset 36226; I didn't test further back, but that is the parent commit of the initial branch. I suspect this is a problem with the dependencies, not Gazebo itself. I think we can go forward with merging this PR, unless you want to do further research, @Ian Chen.

    1. Ian Chen

      We don't have an estimate yet, but it would be nice to make a new gazebo7 minor release with these changes. We may wait for a few existing PRs targeting gazebo7 to be reviewed and merged first, and then make a new release together.