Restarting nonlinear simulations on archer gives NaNs

Issue #82 resolved

Ollie Beeke created an issue 2019-09-03

I have run and restarted a nonlinear cyclone-base-case simulation on archer. Upon restarting, the first printed values of heat flux and potential match those from the last step of the previous simulation. All subsequent values of potential are NaNs. I have tried a linear simulation with exactly the same input parameters and number of cores, and I do not get NaNs. I have attached the two input files that I used for the initial and restarted simulations.

Comments (8)

David Dickinson
Thanks for the report. What version (or commit hash) are you using here, and what modules on Archer are you using? How many processors are you running with?
- 2019-09-03T11:10:07+00:00
Joseph Parker
I see this locally using next, and these input files with (nx,ny)=(8,24), 4 procs.
- 2019-09-03T11:26:07+00:00
David Dickinson
I've reproduced this on another system with 32 cores and nx=ny=4 to speed things up a bit. This is using current next.
- 2019-09-03T11:27:26+00:00
Joseph Parker
Setting nstep=0 in cbc_restart.in gives sensible values in the restart files, but setting nstep=1 gives NaNs.
- 2019-09-03T11:31:11+00:00
Ollie Beeke reporter
@David Dickinson the latest commit I see from git log is f7c09ab. I used 864 cores, although I realise now that I should have used 432 as I included only one species (though I doubt that would cause the NaNs!). The module list is shown below:
- 2019-09-03T11:41:17+00:00
David Dickinson
@Ollie Beeke great, thanks. Joseph and I have reproduced this independently. Could you tell me the output of ncdump -v delt2 nc/restart.nc.0?
- 2019-09-03T11:43:01+00:00
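For anyone repeating this check across many restart files, the ncdump query above can be scripted. The following is only a rough sketch: the helper names (`cdl_values`, `has_non_finite`, `dump_variable`) are made up for illustration, and it assumes the usual CDL text layout in which values follow a `data:` marker as `name = v1, v2 ;`, with non-finite entries spelled `NaN`, `NaNf`, or `Infinity`.

```python
import math
import re
import subprocess


def cdl_values(cdl_text, name):
    """Extract the values of one variable from CDL text (assumed layout)."""
    # Only look after the "data:" marker so the header declaration is skipped.
    _, _, data = cdl_text.partition("data:")
    match = re.search(rf"^\s*{re.escape(name)}\s*=\s*(.*?);", data, re.M | re.S)
    if match is None:
        raise KeyError(name)
    values = []
    for token in match.group(1).split(","):
        token = token.strip()
        if token == "_":
            # "_" marks a fill value in CDL; treat it as missing here.
            values.append(math.nan)
        else:
            # Drop a trailing float suffix such as the "f" in "NaNf".
            values.append(float(token.rstrip("f")))
    return values


def has_non_finite(values):
    """True if any value is NaN or infinite."""
    return any(not math.isfinite(v) for v in values)


def dump_variable(path, name):
    """Run `ncdump -v name path` and return its text output."""
    result = subprocess.run(
        ["ncdump", "-v", name, path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

A call such as `has_non_finite(cdl_values(dump_variable("nc/restart.nc.0", "delt2"), "delt2"))` would then flag a restart file whose stored `delt2` is not a usable number.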
David Dickinson
Could you try the branch provided in PR #201 (or apply the changes there to your case)? This seems to have fixed the issue in my small reproducer. It boils down to a copy-paste error which meant we saved the oldest timestep in the wrong variable and restored it to the wrong location as well.
- 2019-09-03T12:01:59+00:00
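To make the class of bug described above concrete, here is a minimal sketch of a save/restore round trip. It is not the code from PR #201: the three-entry history (`delt0`, `delt1`, `delt2`, with `delt2` taken to be the oldest), the dictionary "file", and all function names are assumptions for illustration, and NaN stands in for an uninitialised slot.

```python
import math

# Hypothetical step-size history, newest first; "delt2" is the oldest entry.
FIELDS = ("delt0", "delt1", "delt2")


def save_correct(state):
    """Write every history entry under its own name."""
    return {name: state[name] for name in FIELDS}


def restore_correct(record):
    """Read every history entry back into its own slot."""
    return {name: record[name] for name in FIELDS}


def save_buggy(state):
    """Copy-paste slip: the oldest step is stored under the wrong name."""
    record = {"delt0": state["delt0"], "delt1": state["delt1"]}
    record["delt1"] = state["delt2"]  # should have been record["delt2"]
    return record


def restore_buggy(record):
    """Matching slip on the way back in: the "delt2" slot is never filled."""
    state = {name: math.nan for name in FIELDS}  # NaN marks "uninitialised"
    state["delt0"] = record["delt0"]
    state["delt1"] = record["delt1"]  # holds the oldest step, in the wrong slot
    return state


def round_trip_matches(save, restore, state):
    """A restart is only safe if restore(save(state)) reproduces state."""
    return restore(save(state)) == state


if __name__ == "__main__":
    before = {"delt0": 0.04, "delt1": 0.02, "delt2": 0.01}
    print(round_trip_matches(save_correct, restore_correct, before))
    print(round_trip_matches(save_buggy, restore_buggy, before))
    after = restore_buggy(save_buggy(before))
    # Any quantity built from the unfilled slot is NaN from the first step on.
    print(after["delt0"] / after["delt2"])
```

A round-trip check of this kind passes for the correct pair and fails for the buggy pair, which is the sort of regression test that would catch a mismatch between the save and restore paths.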
David Dickinson
- changed status to resolved
Fixed in release 8.0.3
- 2020-02-12T11:51:23+00:00

Assignee: –

Type: bug

Priority: major

Status: resolved

Component: –

Milestone: –

Version: –

Votes: 0

Watchers: 2