Tests Fail with USE_PARALLEL_NETCDF

Issue #80 resolved
Colin Malcolm Roach created an issue

(1) Build with parallel_netcdf: $ make USE_PARALLEL_NETCDF=on

(2) Run tests: $ make tests USE_PARALLEL_NETCDF=on

(3) Unit tests then fail:

FAILED: gs2_diagnostics_new (mpirun -np 2 ./test_gs2_diagnostics_new test_gs2_diagnostics_new_append.in)

$ more tests/unit_tests/gs2_diagnostics_new/test_gs2_diagnostics_new_append.error

ERROR: No such file or directory in file: ./test_gs2_diagnostics_new_start.nc

ERROR: NetCDF: Not a valid ID in variable: vnm1

ERROR: NetCDF: Not a valid ID in variable: vnm2

(4) This is a knock-on error: in the preceding test (test_gs2_diagnostics_new_start), parallel_netcdf failed to write the restart file that the …_append test needs. That should be a single restart file, because we are specifying USE_PARALLEL_NETCDF.

$ more tests/unit_tests/gs2_diagnostics_new/test_gs2_diagnostics_new_start.error
nf90_create error: NetCDF: Parallel operation on file opened for non-parallel access
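
One quick check for an error like this is whether the NetCDF installation found first on PATH reports parallel I/O support at all. The sketch below is only an illustration: it assumes the `nc-config` helper that ships with NetCDF-C is installed and accepts `--has-parallel`, and the function name `check_parallel_netcdf` is made up for this example. It does not prove which library the GS2 build actually linked against.

```shell
# Report which nc-config is first on PATH and whether it claims parallel I/O support.
# Assumption: nc-config is installed and understands --has-parallel.
check_parallel_netcdf() {
    if ! command -v nc-config >/dev/null 2>&1; then
        echo "nc-config not found on PATH"
        return 0
    fi
    echo "nc-config: $(command -v nc-config)"
    echo "parallel I/O: $(nc-config --has-parallel)"
}

check_parallel_netcdf
```

If this reports that parallel I/O is unavailable, the serial NetCDF package is probably ahead of the MPI-enabled one in the environment used for the build.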

I see this on Fedora 30. Does it affect other OS distributions? Does anyone else have the same issue?

Is USE_PARALLEL_NETCDF broken?

Comments (7)

  1. David Dickinson

    @Colin Malcolm Roach I just tried to reproduce this on a Fedora 27 machine, using the netcdf-fortran-openmpi package on master of GS2 (with GK_SYSTEM=fedora), but I'm afraid I couldn't reproduce the issue. The test correctly created a single restart file.

    Could you report the output of ldd bin/gs2 ?
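
The `ldd` output can be long, so it may help to narrow it to the libraries relevant here. This is a minimal sketch, assuming the executable is at `bin/gs2` as in the request above; the helper name `filter_io_libs` and the choice of patterns are illustrative only.

```shell
# Keep only the NetCDF, HDF5 and MPI entries from ldd-style output on stdin.
# "|| true" stops the function failing when nothing matches.
filter_io_libs() {
    grep -Ei 'netcdf|hdf5|mpi' || true
}

# Only run against the executable if it has been built.
if [ -x bin/gs2 ]; then
    ldd bin/gs2 | filter_io_libs
fi
```

The paths on the right-hand side of each `=>` show whether the NetCDF libraries come from an MPI-specific directory or from the default system location, which is usually where a serial build lives.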

  2. David Dickinson

    Offline conversation suggests this could be due to the path order in LD_LIBRARY_PATH (or equivalent), which results in the executable picking up the serial netcdf library instead of the parallel one.
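
To see how the path order might matter, the sketch below lists, in search order, every LD_LIBRARY_PATH entry that contains a libnetcdf shared library. The function name `list_netcdf_dirs` is hypothetical, and this is only a rough guide: the dynamic linker also consults RPATH/RUNPATH entries and the system cache, which this does not inspect.

```shell
# Print each directory in a colon-separated path list that holds a
# libnetcdf*.so* file, preserving the list order.
# The body runs in a subshell so the IFS and "set" changes do not leak out.
list_netcdf_dirs() (
    IFS=:
    set -f            # no globbing while splitting the path list
    set -- $1
    set +f            # globbing is needed again for the library pattern
    for dir in "$@"; do
        [ -n "$dir" ] || continue
        for f in "$dir"/libnetcdf*.so*; do
            if [ -e "$f" ]; then
                printf '%s\n' "$dir"
                break
            fi
        done
    done
)

list_netcdf_dirs "${LD_LIBRARY_PATH:-}"
```

If a directory holding the serial library is printed before the one holding the parallel build, the serial library is likely to be the one picked up at run time.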
