McLachlan test case failing due to ADMBase variable differences

Issue #490 closed
Ian Hinder created an issue

The McLachlan test ML_BSSN_sgw3d has been failing since 03-Aug-2011. The only tested variable which is showing differences is kxx, but the differences appear to be significantly larger than roundoff:

kxx.x.asc: substantial differences
    significant differences on 16 (out of 36) lines
    maximum absolute difference in column 13 is 0.000822901437894541
    maximum relative difference in column 13 is 0.0560528644051835
    (insignificant differences on 16 lines)
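
To make the report above concrete, here is a minimal sketch of how a maximum absolute and maximum relative difference can be computed for one data column. It is not the Cactus test-suite code: the function name `compare_columns`, the tolerance values, the definition of the relative difference, the rule that a line counts as significant only when both tolerances are exceeded, and the sample numbers are all assumptions made for illustration.

```python
def compare_columns(old, new, abs_tol=1e-12, rel_tol=1e-12):
    """Return (max_abs, max_rel, n_significant) for two equal-length columns.

    Hypothetical helper: tolerances and the significance rule are assumed,
    not taken from the Cactus test suite.
    """
    if len(old) != len(new):
        raise ValueError("columns differ in length")
    max_abs = 0.0
    max_rel = 0.0
    n_significant = 0
    for a, b in zip(old, new):
        abs_diff = abs(a - b)
        # Relative difference measured against the larger magnitude (assumed).
        scale = max(abs(a), abs(b))
        rel_diff = abs_diff / scale if scale > 0.0 else 0.0
        max_abs = max(max_abs, abs_diff)
        max_rel = max(max_rel, rel_diff)
        # A line is counted only if it exceeds both tolerances (assumed rule).
        if abs_diff > abs_tol and rel_diff > rel_tol:
            n_significant += 1
    return max_abs, max_rel, n_significant


# Made-up sample values, chosen only to be of a similar magnitude to kxx here.
reference = [0.0150, 0.0147, 0.0139]
roundoff = [x * (1.0 + 1e-15) for x in reference]  # roundoff-sized change
shifted = [0.0150, 0.0155, 0.0139]                 # one clearly changed value

roundoff_result = compare_columns(reference, roundoff)
shifted_result = compare_columns(reference, shifted)
print(roundoff_result)
print(shifted_result)
```

With these sample values the roundoff-sized perturbation yields no significant lines, while the shifted column yields one. A relative difference of several percent, as reported for kxx, is many orders of magnitude beyond what floating-point roundoff would produce.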

This coincides with the following commit:

commit 3ba8a55ae2578cb6dc06f0ec8b81f86b3a2654ac
Author: Erik Schnetter <schnetter@cct.lsu.edu>
Date: Tue Aug 2 20:37:19 2011 -0400

Correct schedule, in particular for checkpoint/recovery

Do not mark ADMBase variables for non-checkpointing if they have
multiple timelevels. (Variables with multiple timelevels must always
be checkpointed, because the past timelevels cannot be regenerated
after recovery.)

Finally remove all perl post-processing of the auto-generated code;
instead, use proper Kranc mechanisms.

Schedule the ADM constraints and ADM quantities after MoL_PostStep,
since this is where the ADMBase variables are set.

Schedule enforcing the BSSN constraints in the new schedule group
MoL_PostStepModify, since they should not be enforced after recovery.
(This would lead to inconsistencies at floating-point round-off
level.)

Regenerate all thorns.

and the other commits made to Cactus at the same time as explained in this email:

http://cactuscode.org/pipermail/users/2011-July/002872.html

Attached is a diff which shows what changed between the test passing and failing.

Since this test does not deal with checkpointing or recovery, I don't understand why kxx should change so drastically.
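
One way a pure scheduling change could alter an output variable without any checkpointing involved: if a routine that modifies the evolved variables runs after the routine that derives the ADMBase variables from them, the derived values no longer reflect the modification. The following is a toy sketch of that effect, not McLachlan or MoL code. The routine names `enforce_tracefree` and `set_admbase`, the simplified flat-metric relation between the variables, and the numbers are all hypothetical.

```python
def enforce_tracefree(state):
    """Toy constraint enforcement: remove the trace from the diagonal of A."""
    trace = sum(state["A_diag"])
    state["A_diag"] = [a - trace / 3.0 for a in state["A_diag"]]


def set_admbase(state):
    """Toy conversion to an ADMBase-like variable (flat metric assumed)."""
    state["kxx"] = state["A_diag"][0] + state["trK"] / 3.0


def run(schedule):
    """Apply the routines in the given order to a fresh toy state."""
    state = {"A_diag": [0.3, 0.1, 0.2], "trK": 0.6, "kxx": None}
    for routine in schedule:
        routine(state)
    return state["kxx"]


# Same routines, same initial data; only the order differs.
kxx_enforce_first = run([enforce_tracefree, set_admbase])
kxx_enforce_last = run([set_admbase, enforce_tracefree])
print(kxx_enforce_first, kxx_enforce_last)
```

In this toy the two orderings give clearly different values for `kxx`, because in the second ordering the derived variable is computed from data that has not yet been corrected. Whether this is the mechanism at work in ML_BSSN_sgw3d is not established by the sketch; it only shows that reordering alone can produce differences far above roundoff.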

Keyword: McLachlan

Comments (14)

  1. Barry Wardell

    The specific change which causes this is the scheduling of ML_BSSN_enforce. This was previously scheduled "IN MoL_PostStep AFTER ML_BSSN_SelectBoundConds" whereas now it is scheduled "IN MoL_PostStepModify". When I roll back this change the testsuites pass with zero differences.

  2. Erik Schnetter

    Thanks. Yes, MoL_PostStepModify must be scheduled BEFORE MoL_PostStep. A few other routines scheduled before MoL_PostStep must then be scheduled before MoL_PostStepModify as well. I'm working on this...

  3. Roland Haas

    The proposed patch (again :-) ) removes boundary and SetTmunu calls from after AtmosphereReset in GRHydro. Since AtmosphereReset depends (in part) on values in spacemask which were set only in the interior (in GRHydro_UpdateMask.F90 line 84, they depend on RHS values in GRHydroRHS), it can only reset points to atmosphere in the interior as well (though it is quite possible that it actually loops over everything).

    While I do agree with everyone else that these calls are annoying, violate all kinds of conventions, abuse MoL and can possibly impact performance (since they involve communication), I believe they are still necessary (unless one determines that no "Hydro_Atmosphere/atmosphere" bits are set in any of the regions affected by SYNC and boundaries on any of the processors).

    Can any of the GRHydro users/developers confirm this (or confirm that these schedule items do not need to be present after all)?

  4. Erik Schnetter

    Okay -- I had assumed that my current version was correct. Thanks for being so quick.

    There are two issues. One is whether these calls are really necessary; I suggest adding a test case that will detect this. The other is the scheduling in MoL_Evolution; I am going to change this so that it is scheduled in evol after MoL_Evolution instead.

    Next version of the patch is attached.

  5. Barry Wardell

    For the record, the MoL patch fixes the original reported problem with McLachlan. With it applied, the testsuites all pass with zero differences.

  6. Roland Haas

    I compared schedules with and without the patch and the scheduling of the GRHydro routines does not change. Most of the GRHydro patch had already been applied (in r264 by me). I have attached a reduced patch that applies cleanly. Schedule changes between -r259 (before any GRHydro schedule changes) and with grhydro[2].diff are small, only moving GRHydro_AtmosphereReset out of MoL_Evolution with only sync_GRHydro_C2P_failed slipping in between. So the patch(es) can be applied. Note however #523 on issues with MoL_PostStepModify ordering and that maybe having a dedicated MoL_PostFullStep for final cleanup before MoL_Evolution leaves would reduce the danger of unexpected routines slipping in between MoL_Evolution and AtmosphereReset (and therefore occasionally picking up wrong values for the hydro variables).

  7. Ian Hinder reporter
    • changed status to open

    Almost! The McLachlan tests now pass with the stable Carpet, but ML_BSSN_sgw3d still fails for the development version (http://damiana2.aei.mpg.de/~ianhin/testsuites/einsteintoolkit/). The only difference is in kxx:

    kxx.x.asc: substantial differences
        significant differences on 16 (out of 36) lines
        maximum absolute difference in column 13 is 0.000822901437894541
        maximum relative difference in column 13 is 0.0560528644051835
        (insignificant differences on 16 lines)

    and this is significantly above roundoff. Reopening ticket.
