Improve the response matrix dump/read functionality

Issue #96 resolved
David Dickinson created an issue

Both fields_implicit and fields_local offer the ability to save and load the response matrices to/from file. Whilst this works, it is not as useful as it could be.

The current problems are:

  1. Currently it typically only makes sense to have one of dump_response/read_response set to true in a particular run.
  2. In nonlinear runs, response matrix files will be overwritten each time we change the timestep.
  3. There’s no information about what timestep was used to generate a particular file.
  4. Others

Proposed enhancements:

  1. Add the current timestep to the response matrix file name. This should allow us to identify the timestep used to produce a file.
  2. Restructure the code to allow us to test whether the expected response matrix file exists before we try to read from it. This should allow us to enable read_response without fear that the run will abort because the file is absent; instead we can just calculate the response matrix directly. Similar functionality could perhaps be used to stop dump_response from overwriting existing files.
  3. Consider storing some meta-data in the response matrix file to better describe compatibility etc.

With proposals 1 and 2 we should be able to set dump_response = .true. and read_response = .true. in a nonlinear simulation and reuse previously calculated response matrix data. For example, consider a nonlinear simulation that oscillates between timestep sizes A and B. Currently, each time we change the timestep we must recalculate the full response matrices; with the proposed enhancements we would calculate the response matrices for timestep size A only once, and likewise for size B. This could be a significant saving for large simulations and should help us avoid much of the computational penalty of our implicit scheme in nonlinear runs.
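A minimal sketch of how proposals 1 and 2 could fit together is given below. The module, routine and argument names (response_file_sketch, response_file_name, response_file_exists, run_name, ik) are illustrative assumptions rather than the actual GS2 interfaces; the point is only that the file name encodes the timestep and that an existence check decides between reading a stored matrix and calculating it directly.

```fortran
!> Illustrative sketch only: the module, routine and argument names are
!> assumptions, not the real GS2 code. It shows (1) a file name that encodes
!> the timestep and (2) an existence check that lets read_response fall back
!> to a direct calculation instead of aborting.
module response_file_sketch
  implicit none
contains

  !> Build a name like "<run_name>_ik_<ik>_dt_<dt>.response".
  function response_file_name(run_name, ik, dt) result(file_name)
    character(len=*), intent(in) :: run_name
    integer, intent(in) :: ik
    real, intent(in) :: dt
    character(len=256) :: file_name
    character(len=32) :: ik_str, dt_str
    write(ik_str, '(I0)') ik
    write(dt_str, '(ES12.5)') dt
    file_name = trim(run_name)//'_ik_'//trim(ik_str)//'_dt_'// &
         trim(adjustl(dt_str))//'.response'
  end function response_file_name

  !> True if the expected response matrix file is already on disk.
  function response_file_exists(file_name) result(exists)
    character(len=*), intent(in) :: file_name
    logical :: exists
    inquire(file=trim(file_name), exist=exists)
  end function response_file_exists

end module response_file_sketch
```

The intended flow is then: if read_response is true and the expected file exists, read it; otherwise calculate the response matrix directly and report the fallback on the error unit. The same check could let dump_response skip writing (or refuse to overwrite) a file with the expected name that is already present.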

Challenges:

  1. This is a backwards-incompatible change (the response matrix filename changes).
  2. Testing

Comments (7)

  1. Joseph Parker

    This seems like a very good idea! Is the size of response matrices also a challenge? Is storing multiple copies on disk likely to be a problem?

  2. David Dickinson reporter

    File size could be an issue, but generally I'd have thought a set of response matrices for a single timestep shouldn't be any more than naky*available_memory_per_core (and in general it can't be any bigger than total_memory_consumed_by_gs2). So with, say, available_memory_per_core = 4 GB and naky = 128 we're talking 0.5 TB; this is a lot, and we'd need to multiply it by, say, 4 different timesteps. It's a lot, but not unrealistic for the systems where you're going to be running these large cases. There's also a program that just generates the dump files for different timesteps; we could use this to precalculate N timesteps and then only use read_response = .true. (with dump_response = .false.) in the main run. This would ensure no additional disk space is required, i.e. we won't write any extra files.
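    To make the rough numbers above concrete, a back-of-envelope estimate is sketched below; all values are example assumptions from this comment, not measurements.

    ```fortran
    ! Back-of-envelope disk usage for stored response matrices, using the
    ! illustrative numbers above (assumptions, not measurements).
    program response_storage_estimate
      implicit none
      real :: mem_per_core_gb, per_timestep_tb
      integer :: naky, n_timesteps

      mem_per_core_gb = 4.0   ! assumed available memory per core (GB)
      naky = 128              ! assumed number of ky modes
      n_timesteps = 4         ! assumed number of distinct timestep sizes kept on disk

      ! A set of response matrices for one timestep is bounded by roughly
      ! naky * available_memory_per_core.
      per_timestep_tb = naky * mem_per_core_gb / 1024.0

      print '(A,F5.2,A)', 'Per timestep:  ', per_timestep_tb, ' TB'
      print '(A,F5.2,A)', 'All timesteps: ', n_timesteps * per_timestep_tb, ' TB'
    end program response_storage_estimate
    ```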

  3. David Dickinson reporter

    The branch now contains code to check whether the expected response matrix file exists and, if not, to quietly carry on and calculate the response matrix instead (the fallback is reported to the error unit). I think this should be enough to deliver enhancement 2, but I've yet to develop a test for it.

  4. David Dickinson reporter

    It might also make sense to skip dumping the response matrix if a file with the expected name already exists. This might have slight performance advantages but should also help reduce peak memory requirements for big jobs that wish to dump the response matrix.

  5. David Dickinson reporter

    PR 293 implements pretty much all of this now. If that is approved and merged I think that will close this issue.
