Deteriorated stability of simulations.

Issue #51 resolved
Bram Stolk created an issue

Non trivial OpenDE simulations have become unstable a few years back, when constraint-reordering was changed.

Random constraint reordering used to reorder all constraints. Now, it does it in two batches that do not intermingle.

demo_feedback and demo_convex can be used to observe the difference.

Here is a video of first the new, and then the old behaviour of OpenDE. Notice how the old simulation is stable, whereas the new one is not. https://youtu.be/ifDLqCP9_nw

USENET thread describing this issue can be found here: https://groups.google.com/forum/#!search/Random$20constraint$20reordering/ode-users/vVUcf-JHX6M/QcmhDR3WESgJ

At the very least, OpenDE should add a ./configure flag to revert back to the old style reordering system, so that old behaviour can be reinstated.

Or, alternatively, always reorder fully, skipping the need for a new flag.

Comments (27)

  1. Oleh Derevenko Account Deactivated

    Bram, I have committed a fix to QuickStep. Please try it to see if it improves things for you. The problem was not in the separate constraint randomization. Rather the problem was that originally the constraints were mostly solved in their natural order as generated and only each 8th iteration was randomized. With my changes in #1898 the constraints started to be randomized but never reverted back to the normal order on subsequent iterations. And that, obviously, broke the algorithm. Also, an interesting thing is that in the original implementation, during dependent constraints separation, just due to implementation convenience, those were stored from the tail to the array middle (in reverse order). Somehow, that reversed dependent constraint order is important. If you place them in straight order the simulation becomes worse.

  2. Bram Stolk reporter

    Good catch Oleh,

    Unfortunately, it is only a marginal improvement.

    0.13 and 0.12 are still vastly better than the current branch, with latest fix.

  3. Oleh Derevenko Account Deactivated

    Can you provide something simpler I could see difference with? Demo_convex seems to be fixed with the latest changes. If you want to try the complete re-shuffling you can easily have it by replacing:

                if (ThrsafeExchange(&stage4CallContext->m_SOR_reorderHeadTaken, 1) == 0) {
                    // Process the head
                    const dxQuickStepperLocalContext *localContext = stage4CallContext->m_localContext;
                    ConstraintsReorderingHelper()(stage4CallContext, 0, localContext->m_m - localContext->m_valid_findices);
                }
    
                if (ThrsafeExchange(&stage4CallContext->m_SOR_reorderTailTaken, 1) == 0) {
                    // Process the tail
                    const dxQuickStepperLocalContext *localContext = stage4CallContext->m_localContext;
                    ConstraintsReorderingHelper()(stage4CallContext, localContext->m_m - localContext->m_valid_findices, localContext->m_valid_findices);
                }
    

    with single

                if (ThrsafeExchange(&stage4CallContext->m_SOR_reorderHeadTaken, 1) == 0) {
                    // Process the head
                    const dxQuickStepperLocalContext *localContext = stage4CallContext->m_localContext;
                    ConstraintsReorderingHelper()(stage4CallContext, 0, localContext->m_m);
                }
    

    in lines [2468-2478]

  4. Bram Stolk reporter

    See my pull request for demo_feedback changes, which better show stability.

    In a properly functioning solver, the bridge should not break. I'll send videos.

  5. Oleh Derevenko Account Deactivated

    I have strong impression that demo_feedback depends on CPU speed and behaves differently on slower and faster machines even if you run the same code.

  6. Oleh Derevenko Account Deactivated

    Can you point a revision in the source control at which you believe demo_feedback was working?

  7. Bram Stolk reporter

    Nope, fixed 2ms step, always.

    Also, did you see the youtube video I linked? Day and night difference, each time I run it.

    If you stack weights on a chain, it should never buckle upwards like that. Completely broken.

  8. Bram Stolk reporter

    It broke with the very patch that changed the random reordering.

    In major versions:

    0.12 on sourceforge works. 0.13 on sourceforge is unstable, but is easily fixed by doing the full reorder.

    everything else is broken.

  9. Bram Stolk reporter

    To reduce it ever further: even with 0 boxes on the bridge (STACKCNT=0) the bridge will explode in current ODE tree, even with full reorder change you suggest. ODE 0.12 is fine.

    I've reexamined the demo_feedback code: I see no obvious error in it. It's just a bunch of masses connected together, with a slider on both ends to suspend it in-air. Why can't ODE simulate this without exploding?

    Do you see the explode as well? It starts to shake, first slowly, then violently, and it explodes. I build on linux, CPU is a vanilla Haswell Intel CPU. I tried disabling MT but same result.

  10. Bram Stolk reporter

    And another datapoint to consider:

    demo_feedback is rock solid using dWorldStep().

    This is a QuickStep specific issue.

    worldstep.jpeg

  11. Oleh Derevenko Account Deactivated

    So, I updated to 0.12 in Hg (revision #1741) and configured vs2008 project with the Premake. Then I imported that into Visual Studio 2013 and built. The bridge breaks in all the build configurations (Single/Double, x86/x64). Also I tried building that with MinGW GCC 5.3.0 (Release+Double) and the bridge breaks with that as well. The only thing I must admit, it breaks consistently in the same way across all the debug builds and consistently across x86/x64 for the release single and release double (though release single and release double differ a bit).

  12. Bram Stolk reporter

    You need to set maxforce from 2000 to 6000 to avoid breakage. See my pull rq.

    Or better yet: STaCKCNT to zero.... an empty bridge will break.

  13. Oleh Derevenko Account Deactivated

    All right Bram, I've fixed demo_feedback but the demo_convex still has the fragment popping up.

    Then changes may not work with MT operation mode yet - I still need to review it. Please use the single threaded mode for your testing at this time.

  14. Bram Stolk reporter

    Thank you, Oleh, I will test it.

    Be advised: the demo_convex pop is a different issue, unrelated to this. ODE 0.12 just happened to be more forgiving to the error in the convex code.

    I'm pretty sure the convex-vs-convex collision is suspect, and occasionally produces contacts that are far too deep.

  15. Bram Stolk reporter

    Good job. The demo_feedback is well-behaved now. I also tried my crane simulator, and that one is stable now too (first time since many years again, with current dev branch.) crane.png

  16. Oleh Derevenko Account Deactivated

    Hi Bram,

    I was investigating the issue with single thread working bad while re-routed into multi threaded path and I have found that with independent constraints placed at end for the first 8 iterations it works well in all the cases.

    Could you try this patch with your crane simulator and tell me whether it makes any difference?

    diff -r b802a0b726fa ode/src/quickstep.cpp
    --- a/ode/src/quickstep.cpp Wed Nov 21 02:06:09 2018 +0200
    +++ b/ode/src/quickstep.cpp Sun Dec 02 22:03:27 2018 +0200
    @@ -2231,12 +2231,12 @@
    
         {
             // make sure constraints with findex < 0 come first.
    -        IndexError *orderhead = order, *ordertail = order + (m - valid_findices);
    +        IndexError *orderhead = order, *ordertail = order + (/*m - */valid_findices);
             const int *findex = localContext->m_findex;
    
             // Fill the array from both ends
             for (unsigned int i = 0; i != m; ++i) {
    -            if (findex[i] == -1) {
    +            if (findex[i] != -1) {
                     orderhead->index = i; // Place them at the front
                     ++orderhead;
                 } else {
    @@ -2244,7 +2244,7 @@
                     ++ordertail;
                 }
             }
    -        dIASSERT(orderhead == order + (m - valid_findices));
    +        dIASSERT(orderhead == order + (/*m - */valid_findices));
             dIASSERT(ordertail == order + m);
         }
     }
    
  17. Oleh Derevenko Account Deactivated

    @bram_stolk, in particular, I 'm also interested in multi-threaded mode if you would be so kind.

  18. Bram Stolk reporter

    I don't have much time a.t.m, I tried, but my quickstep.cpp looks totally different at line 2231. I'm struggling with hg.

    $ hg status gives me:

    changeset: 2126:ac56b6ed3333 tag: tip user: oleh_derevenko date: Tue Dec 04 03:00:39 2018 +0200 summary: Changed: Some optimizations to dxJointContact::getInfo1/2

  19. Oleh Derevenko Account Deactivated

    @bram_stolk , no problem. Take your time.

    Your changeset looks correct. But if automatic merge fails you could easily make those three changes manually.

  20. Oleh Derevenko Account Deactivated

    To have multiple solver threads you only need to frame your simulation loop with thread pool allocation and release like the demos do. It's very easy.

  21. Oleh Derevenko Account Deactivated

    Never mind. I did some quantitative estimations on the solver output quality and I do not see any improvements with the reversed order. So I'll leave it as it was.

  22. Log in to comment