Invalid write by dxtemplateJobListContainer::ReleaseAJob()

Issue #70 resolved
Bram Stolk created an issue

There is an invalid write performed by demo_moving_trimesh.

To reproduce:

  • Build demos for ode 0.16.1
  • Run valgrind session: valgrind ./demo_moving_trimesh
  • Wait 10 seconds or so. (Sometimes a mouseclick hastens it?)
$ /opt/valgrind/bin/valgrind ./demo_moving_trimesh 
==29821== Memcheck, a memory error detector
==29821== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==29821== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==29821== Command: ./demo_moving_trimesh
==29821== 

Simulation test environment v0.02
   Ctrl-P : pause / unpause (or say `-pause' on command line).
   Ctrl-O : single step when paused.
   Ctrl-T : toggle textures (or say `-notex' on command line).
   Ctrl-S : toggle shadows (or say `-noshadow' on command line).
   Ctrl-V : print current viewpoint coordinates (x,y,z,h,p,r).
   Ctrl-W : write frames to ppm files: frame/frameNNN.ppm
   Ctrl-X : exit.

Change the camera position by clicking + dragging in the window.
   Left button - pan and tilt.
   Right button - forward and sideways.
   Left + Right button (or middle button) - sideways and up.

To drop another object, press:
   b for box.
   s for sphere.
   y for cylinder.
   c for capsule.
   x for a composite object.
   v for a convex object.
   m for a trimesh.
To select an object, press space.
To disable the selected object, press d.
To enable the selected object, press e.
To toggle showing the geom AABBs, press a.
To toggle showing the contact points, press t.
To toggle dropping from random position/orientation, press r.
==29821== Thread 3:
==29821== Invalid write of size 4
==29821==    at 0x146592: dxtemplateJobListContainer<dxtemplateThreadedLull<dxCondvarWakeup, dxOUAtomicsProvider, false>, dxMutexMutex, dxOUAtomicsProvider>::ReleaseAJob(dxThreadedJobInfo*, bool, void (*)(void*)) (threading_impl_templates.h:674)
==29821==    by 0x1466F8: ReleaseAJobAndPickNextPendingOne (threading_impl_templates.h:601)
==29821==    by 0x1466F8: dxtemplateJobListThreadedHandler<dxCondvarWakeup, dxtemplateJobListContainer<dxtemplateThreadedLull<dxCondvarWakeup, dxOUAtomicsProvider, false>, dxMutexMutex, dxOUAtomicsProvider> >::PerformJobProcessingSession() (threading_impl_templates.h:980)
==29821==    by 0x146747: dxtemplateJobListThreadedHandler<dxCondvarWakeup, dxtemplateJobListContainer<dxtemplateThreadedLull<dxCondvarWakeup, dxOUAtomicsProvider, false>, dxMutexMutex, dxOUAtomicsProvider> >::PerformJobProcessingUntilShutdown() (threading_impl_templates.h:959)
==29821==    by 0x1467AE: StickToJobsProcessing (threading_impl_templates.h:942)
==29821==    by 0x1467AE: dxtemplateThreadingImplementation<dxtemplateJobListContainer<dxtemplateThreadedLull<dxCondvarWakeup, dxOUAtomicsProvider, false>, dxMutexMutex, dxOUAtomicsProvider>, dxtemplateJobListThreadedHandler<dxCondvarWakeup, dxtemplateJobListContainer<dxtemplateThreadedLull<dxCondvarWakeup, dxOUAtomicsProvider, false>, dxMutexMutex, dxOUAtomicsProvider> > >::StickToJobsProcessing(void (*)(void*), void*) (threading_impl_templates.h:1249)
==29821==    by 0x14706B: ThreadedServeImplementation (threading_pool_posix.cpp:554)
==29821==    by 0x14706B: dxThreadPoolThreadInfo::RunCommandHandlingLoop() (threading_pool_posix.cpp:545)
==29821==    by 0x147148: dxThreadPoolThreadInfo::ThreadProcedure_Callback(void*) (threading_pool_posix.cpp:461)
==29821==    by 0x587F6DA: start_thread (pthread_create.c:463)
==29821==    by 0x64F7A3E: clone (clone.S:95)
==29821==  Address 0x1ffefff6dc is on thread 1's stack
==29821==  156 bytes below stack pointer
==29821== 

Note: demo_crash has the same issue.

Comments (8)

  1. Oleh Derevenko

    Fixed: Issue #70 fixed

    Threaded job, while being released after execution, first signaled completion to its possible owner and assigned job failure status only afterwards. This was wrong since the owner might immediately start accessing the failure status and dispose its storage yet before the value was assigned there.

    → <<cset 7942e0687a54>>

  2. Oleh Derevenko

    Fixed: Issue #70 fixed

    Threaded job, while being released after execution, first signaled completion to its possible owner and assigned job failure status only afterwards. This was wrong since the owner might immediately start accessing the failure status and dispose its storage yet before the value was assigned there.

    → <<cset 0eec554360a0>>

  3. Oleh Derevenko

    I have pushed the fix to ‘master’ and ‘0.16.x’. I did not test it myself but the issue seems to be pretty obvious and the fix should work. Please give it a try.
    And thank you for the find.

  4. Bram Stolk reporter

    This particular issue seems fixed. Although the master branch does seem to create SIGFPEs in my app that were not there in 0.16.1 I think a division by zero has been introduced recently, possibly in gimpact. I will investigate and create another issue.

  5. Log in to comment