HDF5 Write crashes in parallel on OSX 10.10 and Debian Wheezy
Dear all,
I think the pre-compiled FEniCS application for OS X (available here: http://www.fenicsproject.org/pub/software/fenics/fenics-1.5.0-p2-osx10.10.dmg) ships with a faulty HDF5 library.
If I run the attached minimal example on OS X 10.10.2 in a single process with
"python H5Crash.py"
everything works normally. But if I run it in parallel with
"mpiexec -np 2 python H5Crash.py"
I get a ton of error messages, all similar to this one:
# 003: /Users/johannr/fenics-1.5.0/fenics-superbuild/build-fenics/CMakeExternals/src/HDF5/src/H5FDmpio.c line 1052 in H5FD_mpio_open(): MPI_File_open failed
    major: Internal error (too specific to document in detail)
    minor: Some MPI function failed
Comments (10)
-
reporter Dear all, I can now confirm that the same thing happens on Debian Wheezy after a "standard" installation using http://fenicsproject.org/fenics-install.sh
-
reporter - changed title to HDF5 Write crashes in parallel on OSX 10.10 and Debian Wheezy
-
I'm having the same issue with reading HDF5 files on OS X 10.10. Reading a mesh stored in an HDF5 file with one process (mpiexec -np 1) works fine, but any process count greater than that errors out like the reported issue.
-
- changed status to invalid
This is not a bug in DOLFIN. It is a problem with older versions of OpenMPI. The fenics-install.sh script has been changed to use MPICH instead of OpenMPI, so this is no longer a problem there. The DMG bundle for OS X still uses an old version of OpenMPI, but I will look at upgrading to the latest version of OpenMPI or switching to MPICH.
-
I'm having the same issue on an HPC cluster. I had also been using MPICH as a workaround until that raised too many red flags with the admins. They use OpenMPI task monitoring and insist on OpenMPI.
Using the latest OpenMPI resolved this for me. I did have to upgrade a number of dependencies along the way.
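Since the fix here depends on which MPI implementation and version the launcher belongs to, a quick check before rerunning the parallel job can help. The following is only a rough sketch: the function names are hypothetical, and the patterns assume the version banners commonly printed by `mpiexec --version` (an "(Open MPI)" or "(OpenRTE)" tag for OpenMPI, a "HYDRA build details" block for MPICH). Other builds may print something different.

```python
import re
import subprocess


def classify_mpi(version_text):
    """Guess (implementation, version) from launcher version output.

    The patterns are assumptions about typical banners, not a
    documented interface of either MPI implementation.
    """
    text = version_text.strip()
    # OpenMPI launchers typically print e.g. "mpiexec (OpenRTE) 1.6.5"
    # or "mpirun (Open MPI) 4.1.2".
    m = re.search(r"\((?:Open MPI|OpenRTE)\)\s+(\d+(?:\.\d+)*)", text)
    if m:
        return ("OpenMPI", m.group(1))
    # MPICH's Hydra launcher typically prints "HYDRA build details:"
    # followed by a "Version:" line.
    m = re.search(
        r"HYDRA build details:.*?Version:\s+(\d+(?:\.\d+)*\w*)", text, re.S
    )
    if m:
        return ("MPICH", m.group(1))
    return ("unknown", None)


def detect_mpi(launcher="mpiexec"):
    """Run `<launcher> --version` and classify its output."""
    try:
        out = subprocess.run(
            [launcher, "--version"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Launcher missing or unresponsive.
        return ("unknown", None)
    return classify_mpi(out.stdout + out.stderr)
```

Calling `detect_mpi()` on the machine in question returns a tuple such as `("OpenMPI", "1.6.5")`, which can then be compared against the version known to work.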
-
Switching from NFS3 to NFS4 is what helped us.
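To see which NFS version the output directory actually sits on, one option on Linux is to look it up in /proc/mounts. This is a minimal sketch with hypothetical helper names. It assumes the usual /proc/mounts layout (device, mount point, filesystem type, options), where NFSv4 mounts normally show up with type `nfs4` and NFSv3 mounts with type `nfs` plus a `vers=3` option. It does not handle mount points containing escaped spaces.

```python
import os


def parse_mounts(mounts_text):
    """Parse /proc/mounts-style text into (mount_point, fstype, options)."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4:
            entries.append((fields[1], fields[2], fields[3]))
    return entries


def fstype_for(path, mounts_text):
    """Return (fstype, options) for the longest mount point containing path."""
    path = os.path.abspath(path)
    best = None
    for mount_point, fstype, options in parse_mounts(mounts_text):
        # Compare with a trailing slash so "/home" does not match "/homework".
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or (path + "/").startswith(prefix):
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, fstype, options)
    if best is None:
        return (None, None)
    return (best[1], best[2])
```

On a Linux machine this could be used as `fstype_for("output.h5", open("/proc/mounts").read())`, where `output.h5` stands in for whatever file the parallel job writes.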