Recovering on a different number of processes than was used to write a checkpoint is painfully slow. Part of the reason seems to be that each process essentially has to read all files to find out where each piece of data it requires is located. The attached patch (not to be included in the main code because of its bad file format and coding) enables CarpetIOHDF5 to read all the information stored in the union of the index files from a single file. This means (together with the other patches proposed today) that CarpetIOHDF5 only ever opens those HDF5 files that are required to restore the simulation on a given process. It significantly speeds up recovery with many more processes than wrote the files (by a factor of more than 4; I don't know the exact speedup since the unpatched version ran out of walltime).
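To illustrate why a single map file helps, here is a minimal sketch of how a process could decide which HDF5 files to open once it has the combined index information in memory. This is not the code from the patch: the `MapEntry` record, its fields, and `files_needed` are hypothetical names, and the sketch assumes each entry describes an inclusive 3D index box.

```cpp
#include <array>
#include <set>
#include <vector>

// Hypothetical map-file entry: which file holds which index-space box.
struct MapEntry {
  int filenum;                // number of the HDF5 file holding the data
  std::array<int, 3> lo, hi;  // inclusive bounds of the stored box
};

// True if the inclusive boxes [alo, ahi] and [blo, bhi] overlap.
bool overlaps(const std::array<int, 3> &alo, const std::array<int, 3> &ahi,
              const std::array<int, 3> &blo, const std::array<int, 3> &bhi) {
  for (int d = 0; d < 3; ++d)
    if (ahi[d] < blo[d] || bhi[d] < alo[d])
      return false;
  return true;
}

// Collect the files whose data intersects the region owned by this process.
std::set<int> files_needed(const std::vector<MapEntry> &entries,
                           const std::array<int, 3> &lo,
                           const std::array<int, 3> &hi) {
  std::set<int> files;
  for (const MapEntry &e : entries)
    if (overlaps(e.lo, e.hi, lo, hi))
      files.insert(e.filenum);
  return files;
}
```

With such a lookup, only the map file has to be read by every process; all other files are opened only if they appear in the returned set.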
It also adds an optimization for the CCTK_VarIndex calls inside CarpetIOHDF5 (which are made for every dataset in the file).
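The patch's actual optimization is not reproduced here. One common way to make a per-dataset name lookup cheaper is to memoize it, sketched below. `slow_var_index` is a hypothetical stand-in for an expensive name-to-index lookup such as CCTK_VarIndex, and the index it returns is a placeholder.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for an expensive name-to-index lookup.
// It counts its calls and returns a placeholder index.
static int lookup_calls = 0;
int slow_var_index(const std::string &fullname) {
  ++lookup_calls;
  return static_cast<int>(fullname.size());  // placeholder, not a real index
}

// Memoizing wrapper: each distinct name is resolved only once.
int cached_var_index(const std::string &fullname) {
  static std::unordered_map<std::string, int> cache;
  auto it = cache.find(fullname);
  if (it != cache.end())
    return it->second;
  const int idx = slow_var_index(fullname);
  cache.emplace(fullname, idx);
  return idx;
}
```

Since many datasets in a checkpoint belong to the same variable, repeated names would then cost only a hash lookup.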
This is intended only as a proof of concept of what might speed up recovery. A proper implementation would need a more sensible file format. Two options seem possible:
1) extend the index file format by adding a "filename" or "filenum" attribute to each dataset, and use a concatenation of all index files as the map file
2) define a custom HDF5 datatype corresponding to the information in a single patch_t, which would have mostly integer fields plus two variable-length / enumerated ASCII fields (for the patch name and the variable name)
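As a rough sketch of option 1, the map can be built by concatenating the entries of all index files and tagging each entry with the number of the file it came from. The record types and `build_map` below are hypothetical and hold only a dataset name; a real index entry carries more metadata.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-dataset record as it might appear in one index file.
struct IndexEntry {
  std::string dataset;  // dataset name as stored in the index file
};

// The same record extended by the number of the file it came from.
struct TaggedEntry {
  int filenum;
  std::string dataset;
};

// Concatenate all index files into one map, tagging each entry with its
// origin. index_files[n] holds the entries read from index file n.
std::vector<TaggedEntry>
build_map(const std::vector<std::vector<IndexEntry>> &index_files) {
  std::vector<TaggedEntry> tagged;
  for (std::size_t n = 0; n < index_files.size(); ++n)
    for (const IndexEntry &e : index_files[n])
      tagged.push_back({static_cast<int>(n), e.dataset});
  return tagged;
}
```

The appeal of this option is that the existing index file layout stays unchanged apart from the one extra attribute.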