I have been trying to debug why some runs I was performing could not recover from a checkpoint file, but would otherwise proceed as normal.
I have attached a minimal parfile that shows the problem. A small grid is manually distributed over 8 processors, and the run terminates at iteration 2. An attempt to recover from the checkpoint then fails with NaNs in grid::x. If the manual topology section is commented out, no problems are seen.
The issue seems to be that, with manual topology, a region_t structure has its map entry set incorrectly.
What happens is that bool gh::recompose contains the check
bool const do_recompose = level_did_change(rl);
In level_did_change, the level is considered to have changed because
the new region_t is
while the old is
region_t(extent=([41,0,0]:[80,10,10]:[1,1,1]/[41,0,0]:[80,10,10]/[40,11,11]/4840),outer_boundaries=[[0,1,1],[1,1,1]],map=0,processor=1)
The only difference is that the new map is 51.
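To illustrate why a stray map value alone is enough to trigger this, here is a minimal sketch of a field-by-field region comparison. It is not Carpet's actual code: toy_region, same_region and toy_level_did_change are hypothetical stand-ins, and I am assuming the real comparison includes the map entry, as the behaviour above suggests.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, stripped-down stand-in for Carpet's region_t.
// Only the fields relevant to this report are kept.
struct toy_region {
  int lower[3];   // lower corner of the extent
  int upper[3];   // upper corner of the extent
  int map;        // map index (expected to be 0 here)
  int processor;  // owning processor
};

// Two regions are equal only if every field matches, including map.
bool same_region(const toy_region& a, const toy_region& b) {
  for (int d = 0; d < 3; ++d) {
    if (a.lower[d] != b.lower[d] || a.upper[d] != b.upper[d]) return false;
  }
  return a.map == b.map && a.processor == b.processor;
}

// Hypothetical analogue of level_did_change: the level counts as changed
// if the number of regions differs or any region differs in any field.
bool toy_level_did_change(const std::vector<toy_region>& old_regs,
                          const std::vector<toy_region>& new_regs) {
  if (old_regs.size() != new_regs.size()) return true;
  for (std::size_t i = 0; i < old_regs.size(); ++i) {
    if (!same_region(old_regs[i], new_regs[i])) return true;
  }
  return false;
}
```

With the extent and processor taken from the old region above, changing only map from 0 to 51 makes toy_level_did_change return true, even though the grid layout itself is identical.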
If I add a line in Carpet/src/Recompose.cc:SplitRegions_AsSpecified to force the map entry to zero, then everything seems to work.
Without the change, Carpet recomposes the grid but never calls the postregrid functions, hence the NaNs in grid::x.
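For reference, the workaround amounts to something like the sketch below. This is not the real SplitRegions_AsSpecified, whose signature and body I am not reproducing here; sketch_region and force_map_zero are hypothetical names used only to show the idea of resetting the map entry on every region produced by the manual split.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for region_t; only the two fields discussed above.
struct sketch_region {
  int map;
  int processor;
};

// Hypothetical helper: after the manual split has filled in the regions,
// explicitly set the map entry of each one to zero so that it cannot
// carry a leftover or uninitialised value.
void force_map_zero(std::vector<sketch_region>& regs) {
  for (std::size_t i = 0; i < regs.size(); ++i) {
    regs[i].map = 0;
  }
}
```

The processor assignment is left untouched; only the map entry is reset.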