Very large .out.nc files are expensive to read or transfer -> restructure the netCDF output

Issue #197 new
Michael Hardman created an issue

When running large simulations, we (myself and Juan Ruiz Ruiz) typically create very large .out.nc files containing a vast range of possibly interesting data, from geometrical data, to fluxes as functions of t and of t, kx, ky, to full fluctuation data as a function of all spatial variables. When we perform very large simulations over multiple scales (resulting in 100GB+ output files), we find that this data takes a long time to transfer between machines for analysis, and is also expensive to read with a simple python script, due to the large amount of data in the file.

It would be convenient to give the user the option to store inexpensive data in a separate netCDF file from the expensive fluctuation data. This is especially pertinent in very large simulations, which we might only be able to perform once or twice, and for which we might be tempted to write out the fluctuation data even if we are not sure that we will need it for analysis. This option would allow rapid analysis of basic diagnostic data, whilst still giving the option to speculatively store the more expensive fluctuation data.

Would it be possible to consider such an option?

@David Dickinson

Comments (4)

  1. Peter Hill

    Which GS2 version are you using, and what python library are you using to read the files? GS2 8.1.2 should use netCDF-4, which is backed by HDF5. This enables “lazy loading”, depending on the python library, so that you only actually read in (the parts of) the variables you actually use. This doesn’t help with transferring large files, of course. xarray is one Python library that definitely supports lazy loading. There are also libraries like dask which can do lazy and/or parallel computation on netCDF files.

    So one option for files you already have is to use xarray to read just the bits you want and write them to a separate file, which you could then transfer much faster.
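    A minimal sketch of that subset-and-copy workflow with xarray's lazy loading. The variable names ("phi2", "phi_t") and file names here are illustrative stand-ins, not guaranteed to match real GS2 output:

    ```python
    import numpy as np
    import xarray as xr

    # Build a small stand-in for a GS2 output file so the sketch is runnable;
    # a real file would be opened directly with xr.open_dataset.
    ds = xr.Dataset(
        {
            "phi2": ("t", np.random.rand(100)),
            "phi_t": (("t", "kx", "ky"), np.random.rand(100, 8, 8)),
        },
        coords={"t": np.linspace(0.0, 10.0, 100)},
    )
    ds.to_netcdf("input.out.nc")

    # open_dataset reads only metadata up front; variable data is loaded
    # lazily, when it is actually accessed.
    with xr.open_dataset("input.out.nc") as full:
        # Keep just the cheap diagnostic (coordinates come along
        # automatically) and write it to a small, fast-to-transfer file.
        full[["phi2"]].to_netcdf("reduced.out.nc")

    with xr.open_dataset("reduced.out.nc") as reduced:
        reduced_vars = sorted(reduced.variables)  # ['phi2', 't']
    ```

    Because the read is lazy, the large "phi_t" array is never pulled into memory when writing the reduced file.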

  2. David Dickinson

    Most systems which provide ncdump should also provide nccopy, which allows one to create a copy of an existing netCDF file whilst changing its properties, including copying only certain variables.

    For example, nccopy -V t,phi2 input.out.nc reduced.out.nc will create reduced.out.nc, which contains all dimensions and attributes from input.out.nc but only the t and phi2 variables. It is also possible to ask nccopy to apply compression to the resulting file with the -d flag, which takes an integer from 0 to 9, with 0 being no compression and 9 being maximal compression.

    In case the font is unclear, that is an upper-case V. A lower-case v has a similar effect but retains the other variable definitions whilst not writing their data (so it still saves space).
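    For completeness, the same select-and-compress copy can be sketched in python via xarray's encoding options, which map onto nccopy's -d level; the file and variable names below are illustrative:

    ```python
    import os

    import numpy as np
    import xarray as xr

    # Stand-in for an existing output file; "phi2" is an illustrative name.
    xr.Dataset({"phi2": ("t", np.zeros(100000))}).to_netcdf("demo.out.nc")

    with xr.open_dataset("demo.out.nc") as full:
        # "complevel" plays the role of nccopy's -d level (0 = none, 9 = max).
        full.to_netcdf(
            "demo_compressed.out.nc",
            encoding={"phi2": {"zlib": True, "complevel": 4}},
        )

    plain = os.path.getsize("demo.out.nc")
    compressed = os.path.getsize("demo_compressed.out.nc")
    ```

    For highly redundant data the compressed copy can be dramatically smaller; for noisy fluctuation data the gain is more modest.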

  3. Michael Hardman reporter

    Thank you for laying out the options above. I have not yet had time to try out these suggestions, but they look promising. I shall have to consider how to modify my (now rather complicated) scripts. I will reply again as soon as I have more information.
