Possible attempts to save_for_restart with an unallocated gnew

Issue #171 new
David Dickinson created an issue

In simulations with save_for_restart = T we attempt to save restart files at the end of the simulation as part of finalising the diagnostics. In non-linear runs it is possible for us to exit the main run_equations routine with the distribution function in an unallocated state. The attempt to save the restart files will then fail, potentially corrupting any previously written restart files.

Specifically, this can happen when we attempt to change the time step and this fails because we are trying to change it too much at once.

At https://bitbucket.org/gyrokinetics/gs2/src/142c78781c1bc4c06d234b32fee70eaf100d68e4/src/gs2_reinit.f90#lines-199 we set the initialisation state such that we can change the timestep. In doing this we end up deallocating gnew (and other arrays). Usually this is fixed when we call init at cket.org/gyrokinetics/gs2/src/142c78781c1bc4c06d234b32fee70eaf100d68e4/src/gs2_reinit.f90#lines-256 to go back up to the “full” level, during which gnew will be reallocated and restored from the stored state. Unfortunately, between these two init calls we have some logic which may see us decide to abort (i.e. return from this routine with the exit flag set). If any of these abort conditions is triggered then we return without reallocating gnew.
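To make the failure mode concrete, here is a rough toy model of the control flow described above. It is written in Python purely for illustration and is not GS2 code: ToyState, init_down, init_up, the halving rule, dt_cfl, dt_min and max_attempts are all hypothetical stand-ins, and the real abort conditions in gs2_reinit.f90 may differ.

```python
class ToyState:
    """Hypothetical stand-in for the solver state; not GS2 code."""

    def __init__(self):
        self.gnew = [1.0, 2.0, 3.0]  # "allocated" distribution function
        self.saved = None            # stored copy used to restore gnew
        self.code_dt = 0.1

    def init_down(self):
        # Analogue of the first init call: store gnew, then deallocate it.
        self.saved = list(self.gnew)
        self.gnew = None

    def init_up(self):
        # Analogue of the second init call: reallocate gnew and restore it.
        self.gnew = list(self.saved)
        self.saved = None


def reset_time_step(state, dt_cfl, dt_min=1e-3, max_attempts=3):
    """Return True if the exit flag is set (assumed abort rules)."""
    state.init_down()
    attempts = 0
    while state.code_dt > dt_cfl:
        state.code_dt /= 2.0
        attempts += 1
        if attempts > max_attempts or state.code_dt < dt_min:
            return True  # early return: init_up() never runs
    state.init_up()
    return False


def save_for_restart(state):
    # Analogue of the restart save during finalisation.
    if state.gnew is None:
        raise RuntimeError("gnew is not allocated")
    return list(state.gnew)
```

In this model, a modest change such as reset_time_step(state, dt_cfl=0.04) goes through both init calls and leaves gnew intact, whereas a request that is too large, such as dt_cfl=1e-6, returns with the exit flag set and gnew unallocated, so the later save_for_restart call fails.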

There are several possible fixes:

  1. Replace the return with an mp_abort.
  2. Move the logic before the first init call such that we decide to exit before we deallocate gnew etc.
  3. Repeat the second init call just before the return statements.

Option 1 should work but isn’t very friendly and means we won’t get restart files from the final state (if in_memory = T).

Option 3 should work but re-initialising can be expensive so this seems like a waste of effort given we are about to stop.

Option 2 is probably preferred but does have a potential pitfall/gotcha. The logic as it stands attempts to keep on changing the timestep until it is small/large enough or until we’ve tried too many times. This changes the timestep, so exiting after this logic without going through the init calls means that the code_dt value will be left inconsistent with the arrays in the rest of the code that involve code_dt. This probably isn’t a major issue and we can probably choose to just reset the value of code_dt to the value prior to this logic at the return point.
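As a rough illustration of option 2, the following Python toy model (again not GS2 code) runs the time step search before anything is deallocated and resets code_dt at the return point if we decide to abort. ToyState, init_down, init_up, the halving rule, dt_cfl, dt_min and max_attempts are hypothetical names and rules chosen only for this sketch.

```python
class ToyState:
    """Hypothetical stand-in for the solver state; not GS2 code."""

    def __init__(self):
        self.gnew = [1.0, 2.0, 3.0]  # "allocated" distribution function
        self.saved = None
        self.code_dt = 0.1

    def init_down(self):
        # Analogue of the first init call: store gnew, then deallocate it.
        self.saved = list(self.gnew)
        self.gnew = None

    def init_up(self):
        # Analogue of the second init call: reallocate gnew and restore it.
        self.gnew = list(self.saved)
        self.saved = None


def reset_time_step_checked(state, dt_cfl, dt_min=1e-3, max_attempts=3):
    """Return True if the exit flag is set (assumed abort rules)."""
    dt_start = state.code_dt  # remember the value the arrays were built with
    attempts = 0
    while state.code_dt > dt_cfl:
        state.code_dt /= 2.0
        attempts += 1
        if attempts > max_attempts or state.code_dt < dt_min:
            # Abort before touching gnew; reset code_dt so it stays
            # consistent with the arrays that still use the old value.
            state.code_dt = dt_start
            return True
    # Only now drop and rebuild the state with the new time step.
    state.init_down()
    state.init_up()
    return False
```

With this ordering an aborted change leaves both gnew and code_dt as they were on entry, so a restart save during finalisation would still see an allocated distribution function.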
