PreSync syncing more than old code
I have been investigating why the evolution of a BBH using BaikalVacuum slows down when PreSync is turned on. The largest problem was reported in Ticket #2653, but the code is still noticeably slower than without PreSync. I have traced the extra sync calls to restriction. There is one group which is only written, never read. In the old code, the syncs were scheduled manually in schedule.ccl. PreSync (properly) syncs this group at restriction on every refinement level. The observed behavior is that it
1. Syncs the group aux_variables from the coarsest to finest level
Then, it begins heading back up the levels, presumably to begin the restriction operation. However, it then
2. Syncs the groups aux_variables and evol_variables from finest to coarsest, skipping the finest level
The syncs in step 2 are not present in the old code. Either the old code wasn’t properly syncing during restriction, or PreSync is introducing extra syncs into this operation. I’m inclined to believe the latter. My guess as to the possible source of this is the following code
// Restrict a refinement level
void ggf::ref_restrict_all(comm_state &state, int const tl, int const rl,
                           int const ml) {
  if (transport_operator == op_none or transport_operator == op_sync)
    return;
  static Timer timer("ref_restrict_all");
  timer.start();
  // Require same times
  assert(std::fabs(t.get_time(ml, rl, tl) - t.get_time(ml, rl + 1, tl)) <=
         1.0e-8 * (1.0 + std::fabs(t.get_time(ml, rl, tl))));
  transfer_from_all(state, tl, rl, ml, &dh::fast_dboxes::fast_ref_rest_sendrecv,
                    tl, rl + 1, ml);
  timer.stop(0);
  // Update state, both fine and coarse bcome invalid in the boundaries
  // coarse b/c fine was restricted to it, fine b/c prologation from coarse
  // will change values
  int const coarse_old_valid = valid(ml, rl, tl);
  set_valid(ml, rl, tl,
            coarse_old_valid & ~(CCTK_VALID_GHOSTS|CCTK_VALID_BOUNDARY));
  int const fine_old_valid = valid(ml, rl, tl);
  set_valid(ml, rl + 1, tl,
            fine_old_valid & ~(CCTK_VALID_GHOSTS|CCTK_VALID_BOUNDARY));
}
which, at the end, clears the ghost and boundary bits of the validity on both levels, so that at most the interior remains valid. I’ll verify that this code triggers the syncs and comment with more info when I have it.
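To make the mask arithmetic in those set_valid calls concrete, here is a minimal standalone sketch. It is not Carpet code: the flag values (interior = 1, boundary = 2, ghosts = 4) are an assumption made for illustration, the SKETCH_ names stand in for the CCTK_VALID_* macros, and clear_outer is a hypothetical helper.

```cpp
// Standalone sketch of the validity-mask arithmetic used at the end of
// ggf::ref_restrict_all. The numeric flag values are ASSUMED for
// illustration; the SKETCH_ prefix marks them as stand-ins, not the
// actual CCTK_VALID_* macros.
enum : int {
  SKETCH_VALID_NOWHERE = 0,
  SKETCH_VALID_INTERIOR = 1,
  SKETCH_VALID_BOUNDARY = 2,
  SKETCH_VALID_GHOSTS = 4,
  SKETCH_VALID_EVERYWHERE =
      SKETCH_VALID_INTERIOR | SKETCH_VALID_BOUNDARY | SKETCH_VALID_GHOSTS
};

// Hypothetical helper: drop the ghost and boundary bits and keep whatever
// else was set, mirroring `old_valid & ~(GHOSTS | BOUNDARY)`.
constexpr int clear_outer(int old_valid) {
  return old_valid & ~(SKETCH_VALID_GHOSTS | SKETCH_VALID_BOUNDARY);
}

// A level that was valid everywhere is left valid in the interior only.
static_assert(clear_outer(SKETCH_VALID_EVERYWHERE) == SKETCH_VALID_INTERIOR,
              "everywhere -> interior only");
// A level that was valid only in the ghosts ends up valid nowhere.
static_assert(clear_outer(SKETCH_VALID_GHOSTS) == SKETCH_VALID_NOWHERE,
              "ghosts only -> nowhere");
```

Under these assumptions, any level touched by restriction loses its ghost and boundary validity, which would be enough for a later read on those regions to require a sync.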
Comments (3)
-
reporter -
reporter I’m also considering the possibility that this happens because the read/write declarations on two of the functions aren’t right. I based them on the loops (everywhere), but I think the floor_the_lapse and enforce_detgammahat functions actually only need read/write (interior). I will have to carefully count the number of syncs per iteration to figure this out.
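As a rough illustration of why the declared region matters, here is a simplified model of the check, as I understand it: a sync is needed when a function declares a read on a region that is not currently valid. The names (kInterior, kGhosts, needs_sync) and the flag values are hypothetical and are not the PreSync implementation.

```cpp
// Simplified, hypothetical model of a "do we need to sync first?" check.
// Flag values and names are assumptions for illustration only.
enum : int {
  kInterior = 1,
  kBoundary = 2,
  kGhosts = 4,
  kEverywhere = kInterior | kBoundary | kGhosts
};

// A sync is needed if the function declares a read on the ghost zones
// while the ghost zones are not currently marked valid.
constexpr bool needs_sync(int currently_valid, int declared_reads) {
  // Regions the function wants to read that are not valid right now.
  const int missing = declared_reads & ~currently_valid;
  return (missing & kGhosts) != 0;
}

// After restriction only the interior is valid in this model:
// a function declared to read (everywhere) forces a sync ...
static_assert(needs_sync(kInterior, kEverywhere), "everywhere read syncs");
// ... while the same function declared to read (interior) does not.
static_assert(!needs_sync(kInterior, kInterior), "interior read does not");
```

In this model, over-declaring a read as (everywhere) when the function only touches the interior would add one sync per call after every restriction.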
-
reporter - changed status to resolved
After discussions with Eric and Roland, I have managed to determine that the syncs called in Restrict by PreSync are called in the old code in PostRestrict. While they have moved around, the total number of syncs is the same.
As an update, commenting out these set_valid calls stops the syncs from happening. So, something here is changing the behavior of the code. Are we sure that this is what we want to do here? @Roland Haas @Steven R. Brandt