gh::regrid contains code whose runtime is quadratic in the number of components

Issue #2211 open
Roland Haas
created an issue

The member function gh::regrid contains this code

// Check component consistency
for (int ml = 0; ml < mglevels(); ++ml) {
  for (int rl = 0; rl < reflevels(); ++rl) {
    assert(components(rl) >= 0);
    for (int c = 0; c < components(rl); ++c) {
      ibbox const &b = extent(ml, rl, c);
      ibbox const &b0 = extent(ml, rl, 0);
      assert(all(b.stride() == b0.stride()));
      assert(b.is_aligned_with(b0));
      for (int cc = c + 1; cc < components(rl); ++cc) {
        assert((b & extent(ml, rl, cc)).empty());
      }
    }
  }
}

which, because of the nested c and cc loops, is quadratic in the number of components (and hence in the number of MPI ranks).

For many (tens of thousands) components this becomes a dominant cost of the regrid operation.

The Carpet branch rhaas/quadratic_regrid_time contains a test thorn ArrayTest and a hacked version of Carpet that can simulate a number of MPI ranks using just one rank.

One can run:

mpirun -n 1 exe/cactus_sim arrangements/Carpet/ArrayTest/par/array.par

and control the number of pretend ranks by setting ArrayTest::size in array.par.

On my workstation regrid takes ~3.5s for 16k ranks with the consistency check and ~0.5s without.

This is per grid array and per grid scalar. So for a typical setup with ~400 grid arrays and grid scalars (no matter whether they have storage or not), this amounts to roughly 400 * 3 s = 1200 s spent in the consistency check.

This happens only once per simulation (since grid arrays and grid scalars are only regridded once), but even for a test simulation it can be quite significant at scale.

The branch actually arranges for the consistency check to be skipped if one defines CARPET_OPTIMISE.

I attach a gnuplot script that takes carpet-timing-statistics.0000.txt and shows how much time is spent in dh and gh regrid.

Keyword: None

Comments (9)

  1. Erik Schnetter

    For obvious reasons, I am in the unique position to confirm that the person who wrote this code wasn't thinking straight. There are several straightforward ways to improve this code.

    1. There is no need to first calculate the intersection and then check whether it is empty. It is much more efficient to check whether b and extent are disjoint.
    2. Instead of checking whether the bounding boxes are pairwise disjoint, one can calculate their union, and then compare the cardinality of the union with the sum of the cardinality of the individual boxes. This algorithm has complexity n * log(n).
    3. This check is most likely superfluous, as dh::regrid, which does the actual work, contains much more stringent checks. In fact, to my recollection, none of the gh consistency checks ever caught a bug. We could put all of them into an #ifdef CARPET_DEBUG clause.
  2. Roland Haas reporter

My branch does not really contain changes that should be applied as-is (since it hacks Carpet and adds a testing thorn).

    The possibly applicable changes are surrounding the consistency checks in gh::regrid with a

#ifndef CARPET_OPTIMISE
    ...
    #endif
    

    and the same for the consistency checks in dh::regrid (the part timed by the "tests" timer).

    Erik's suggestion is basically to surround the gh::regrid stuff with

    #ifdef CARPET_DEBUG
    

which disables the tests unless they are explicitly asked for, while my suggestion would have left them enabled unless explicitly disabled. The ones in dh::regrid are already (mostly) no-ops for CARPET_OPTIMISE, since the ASSERT_XXX macros become no-ops in that case.
