Carpet PreSync triggering syncs for all refinement levels unnecessarily

Issue #2653 new
Samuel Cupp created an issue

In Carpet, the function PreSyncGroups calls SyncProlongateGroups to automatically sync variables based on the reads and writes. I include the function below.

void PreSyncGroups(cFunctionData *attribute, cGH *cctkGH,
                   const std::vector<int> &pre_groups) {
  DECLARE_CCTK_PARAMETERS;

  assert(not CCTK_EQUALS(presync_mode, "off") and
         not CCTK_EQUALS(presync_mode, "warn-only"));

  if (pre_groups.empty())
    return;

  if (reflevel > 0) {
    // recurse to check that all coarsers levels are properly SYNCed
    CCTK_REAL previous_time = cctkGH->cctk_time;
    const int parent_reflevel = reflevel - 1;
    BEGIN_GLOBAL_MODE(cctkGH) {
      ENTER_LEVEL_MODE(cctkGH, parent_reflevel) {
        cctkGH->cctk_time = tt->get_time(mglevel, reflevel, timelevel);
        PreSyncGroups(attribute, cctkGH, pre_groups);
      }
      LEAVE_LEVEL_MODE;
    }
    END_GLOBAL_MODE;
    cctkGH->cctk_time = previous_time;
  }

  // ask Carpet to do the SYNC, this will apply BC as well
  SyncProlongateGroups(cctkGH, pre_groups, attribute);
}

Starting at line 11, the code claims to “check that all coarser levels are properly SYNCed”. However, this “check” amounts to blindly syncing every coarser refinement level! This is very inefficient and a huge performance problem that grows with the number of refinement levels. With a parfile for a binary black hole simulation with BaikalVacuum using 12 refinement levels, I found that each iteration was taking 6 times longer. For now in my work, I am commenting lines 11-24 out and assuming that coarser levels are fine. However, this definitely needs to be fixed in a more ‘proper’ way.

This ties into a similar issue I ran into when I didn’t have boundary conditions registered (due to my own mistake). Since boundaries weren’t getting filled, every READ(everywhere) would trigger sync+BCs, but there were no BCs to apply. Therefore, even if it should have been synced already, it would sync again with the next READ(everywhere) because there were no BCs applied. As an aside, I feel that this case might have needed to generate an error somewhere, as a READ(everywhere) was running fine even though it was only valid for interior + ghosts.

I propose that a reasonable fix would be to add a check in SyncProlongateGroups to see what is actually needed. This function eventually ends up with a list of goodgroups that are to be synced and have boundary conditions applied. Once this preliminary list is constructed, PreSync should further refine this list into those that need syncs and those that need BCs using validities. Then, it can pass two different lists to each, if necessary. I don’t know if there’s another way for PreSync simulations to get to SyncGroups other than through this function, so there may have to be a check in SyncGroups if that is the case.

Comments (8)

  1. Samuel Cupp reporter

    I also feel I should add that I’m not convinced that this recursive behavior is necessary. Do we actually need to do this?

  2. Samuel Cupp reporter

    I attempted to put in some logic to check whether it should actually call SyncProlongateBoundaries. Leaving the recursion loop unchanged, I have

      std::vector<int> check_groups;
      std::set<int> tmpgroups;
      for (int g = 0; g < pre_groups.size(); g++) {
        int gi = pre_groups[g];
        for (int vi = 0; vi < CCTK_NumVarsInGroupI(gi); vi++) {
          int const map0 = 0;
          ggf *const ff = arrdata.AT(gi).AT(map0).data.AT(vi);
          assert(ff);
          int const valid = ff->valid(mglevel, reflevel, timelevel);
          if (not is_set(valid, CCTK_VALID_EVERYWHERE)) {
            tmpgroups.insert(gi);
            break;
          }
        }
      }
      check_groups.assign(tmpgroups.begin(), tmpgroups.end());
      if(!check_groups.empty()) {
      // ask Carpet to do the SYNC, this will apply BC as well
      SyncProlongateGroups(cctkGH, check_groups, attribute);
      }
    

    This loops over the groups in pre_groups. It then loops over the variables in each group and checks if they are valid everywhere. If not, then the group is added to tmpgroups and the loops are cycled to the next group. At the end, the vector check_groups is assigned all the group indices that need to be passed on. I mimicked what is done in the PreCheckValid function, so all of this should be fine.

    I did this here because it will be a bit more complicated to do it in SyncProlongateBoundaries because we will have to do it separately for syncs and BC application. There’s also a ProlongateBoundaries call that I don’t know what logic I would need to use.

  3. Samuel Cupp reporter

    I haven’t yet run a simulation with a lot of reflevels, but I verified that this code doesn’t call SyncProlongateGroups except for the current reflevel when coarser levels don’t need to be synced. Thus, it is equivalent to commenting out the loop. The only added time will be from looping over the groups and comparing ints at each reflevel, which should be negligible. I’ll run the BaikalVacuum with this and check performance.

    Since I don’t have a code where the coarser levels aren’t synced before getting to the finer level, I can’t do a ‘real’ test of the case where this loop is actually necessary. I’m still not convinced that’s something that can happen without a thorn having incorrect reads/writes.

  4. Samuel Cupp reporter

    I pushed a branch and made a pull request. I’m doing some final testing now to verify that it behaves as I expect (that is to say, it removes the extra calls just like commenting out the recursion does). I’ll comment again when I’m done.

  5. Samuel Cupp reporter

    I’ve verified that this fixes the behavior, though actually iterating through all the variables in the pre_groups on every level for every sync is slower than simply commenting out the recursion (~17 time/hr vs ~19 time/hr). I think that we should consider whether this check is necessary or not, and if it is then see if there’s any way to improve the method.

  6. Samuel Cupp reporter

    Thinking about this problem again, I think that this PR is not the proper solution. The recursive syncs are not the core problem if I remember correctly. It has been some time since I’ve looked at this, but as I recall the core issue was when I had no BCs registered for a variable. I had another PR (accepted a while ago) that added a warning for this, but originally there was a silent issue when trying to sync + apply BCs to a variable with no BCs registered. Carpet PreSync sees that it is valid in the interior and schedules these routines, but the BC call is effectively empty because nothing is registered. It silently returns to the function and then continues on.

    This means that it syncs every time something tries to read it because it is always at best interior+ghosts, and applying BCs always includes syncs. This ticket comes from the fact that the recursive part of the code accumulates these syncs, recursively syncs everything all the time, which is terrible.

    There should be something in Carpet or Boundary that properly warns/errors due to this, but that doesn’t seem to be the case. Once this behavior is fixed, we can see if we need this change to the recursive part of the code or not. However, my current thinking (having admittedly not looked at this for quite some time) is that this is more a mitigation of the issue than a true fix.

  7. Log in to comment