Cactus build fails on file systems that do not update timestamps all the time

Issue #1864 closed
Roland Haas created an issue

Ian and I recently ran into an issue caused by a particular file system (NFS v4 and also BeeGFS) not updating the file modification time when doing this:

: >make.checked

which is what the build system uses (since commit 0779c17697d3b4a254065c10836770e355071f41, https://bitbucket.org/cactuscode/cactus/commits/0779c17697d3b4a254065c10836770e355071f41, "Cactus: Replace "echo" by ":" in makefile", Thu Nov 27 16:35:13 2014 -0500) to update the marker files once a directory is finished building. Although such behaviour is not POSIX compliant (https://bugzilla.kernel.org/show_bug.cgi?id=6127), we'd still want to work around it.

The simplest solution seems to me to revert commit 0779c17697d3b4a254065c10836770e355071f41 (https://bitbucket.org/cactuscode/cactus/commits/0779c17697d3b4a254065c10836770e355071f41) and use

echo "" >make.checked

again. Erik: since you made the change, would you see any downside to reverting it?
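
To check whether a given file system shows the behaviour described above, a probe along the following lines may help. This is only a sketch under stated assumptions: the function name marker_gets_newer is made up for illustration, mktemp is assumed to be available, and the temporary directory must live on the file system under test (for example by setting TMPDIR). It rewrites an existing empty marker file with either method and uses find -newer to see whether the marker ended up newer than a reference file.

```shell
#!/bin/sh
# Hypothetical probe (not part of the Cactus build system).
# Returns 0 if rewriting an existing, empty marker file with the given
# method ("colon" or "echo") makes it newer than a reference file,
# 1 if it does not, and 2 on a usage or setup error.
marker_gets_newer() {
    method=$1
    dir=$(mktemp -d) || return 2
    : >"$dir/make.checked"      # marker already exists and is empty
    sleep 1                     # step past 1 s timestamp granularity
    : >"$dir/reference"         # reference file, newer than the marker
    sleep 1
    case $method in
        colon) : >"$dir/make.checked" ;;         # truncate only, no data written
        echo)  echo "" >"$dir/make.checked" ;;   # writes one newline
        *)     rm -rf "$dir"; return 2 ;;
    esac
    # find prints the marker path only if its mtime is newer than the reference
    newer=$(find "$dir/make.checked" -newer "$dir/reference")
    rm -rf "$dir"
    [ -n "$newer" ]
}

for m in colon echo; do
    if marker_gets_newer "$m"; then
        echo "$m: mtime updated"
    else
        echo "$m: mtime NOT updated"
    fi
done
```

On an affected file system one would expect the colon case to report that the mtime was not updated, while the echo case should report an update everywhere.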

While investigating this, Ian also found that some file systems only offer 1 second granularity in their timestamps (e.g. ext3, but also possibly XFS and NFS), which can negatively affect make if a rule takes less than a second to complete. See https://savannah.gnu.org/bugs/?40056#comment0 and https://www.gnu.org/software/autoconf/manual/autoconf-2.61/html_node/Timestamps-and-Make.html . The most conservative approach would be to arrange for sleep 1 to execute after each make recipe.
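
As a rough illustration of the sleep 1 idea, the following sketch runs one recipe command line and then pauses for a second while preserving the recipe's exit status. The helper name run_recipe is hypothetical, and how such a helper would be hooked into the Cactus makefiles is left open here; nothing like it is part of the build system or of the proposed patch.

```shell
#!/bin/sh
# Hypothetical helper: run one recipe command line, then wait one second
# so that whatever is built next gets a strictly later timestamp even on
# file systems with 1 s timestamp granularity.
run_recipe() {
    sh -c "$1"          # execute the recipe line in a child shell
    status=$?           # remember its exit status
    sleep 1             # the conservative pause
    return "$status"    # report the recipe's status, not sleep's
}

# Example: update a marker file in a scratch location.
tmp=$(mktemp) || exit 1
run_recipe "echo '' >'$tmp'" && echo "recipe succeeded"
rm -f "$tmp"
```

The obvious cost is one extra second per recipe, which adds up quickly in a build with many small rules.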

Keyword: backport

Comments (9)

  1. Ian Hinder

    Note that the observed symptom was that changes to the thornlist of a configuration were not reflected in the list of thorns Cactus thought were available at runtime, even though they were linked into the executable at link time. That is, the link line did not match the output of cactus_sim -T. So if a new thorn was added, Cactus failed when activating the thorn, as it thought the thorn was not present. If a thorn was removed, Cactus failed at link time, because the bindings were still referring to the thorn, even though it was not linked in. When this is fixed, we should announce it to the mailing list, in case people are running into this problem.

  2. Erik Schnetter

    The difference between : >FILE and echo 1 >FILE is that the former creates an empty file, while the latter writes something into the file. Writing an empty string (as you suggest) is also fine, but is the same as just writing echo, as both simply write a newline character.

    I don't see a downside, except (tongue-in-cheek) the increase in build time for calling echo, and the increase in disk space for creating non-empty files.

  3. Frank Löffler

    I agree that having a workaround would be nice, and the little overhead (not the pause, but creating a new file instead of changing an existing file) should not hurt.

    However, I suggest you contact the admins of the machines where this happens. BeeGFS seems to (at least by default) properly support modification times, and NFSv4 can be configured to do so, too (I don't know what the default is there). Problems with modification/creation times usually come from machines with unsynchronized clocks between clients and file servers, e.g., http://www.beegfs.com/wiki/FAQ.

  4. Roland Haas reporter

    The machine in question is the new cluster being purchased by the AEI. Please note that we are not talking about the file modification time being incorrect. Instead, it simply is not modified at all unless some data is actually written to the file. I can reproduce this on at least two BeeGFS installations, one of which should be using "default" options.

    I would still prefer that we change our make system such that it works under non-ideal situations. I generally dislike assuming that our scripts operate in a perfect world and much prefer that they assume as little as possible about the environment they run in.

    Unless vetoed I will apply the patch in https://bitbucket.org/cactuscode/cactus/pull-requests/22/cactus-restore-echo-to-create-makechecked/diff on Wednesday.

  5. Frank Löffler

    Of course, please go ahead and make it work in these situations too. I didn't mean to discourage that. I just wanted to suggest you contact the FS vendor about the issue. The BeeGFS guys are actually quite nice to talk to (at least the ones I met at the SC conference). I would expect them to be interested in such reports.

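
The difference between the redirection variants discussed in Erik Schnetter's comment can be checked with a short sketch like the one below. The file names and the size_of helper are made up for illustration, and mktemp is assumed to be available. It creates one file per variant in a scratch directory and prints the resulting sizes in bytes.

```shell
#!/bin/sh
dir=$(mktemp -d) || exit 1

: >"$dir/colon"             # creates/truncates only, writes no data
echo >"$dir/echo_bare"      # writes a single newline
echo "" >"$dir/echo_empty"  # also writes a single newline
echo 1 >"$dir/echo_one"     # writes "1" followed by a newline

# Reading from stdin makes wc print only the byte count;
# tr strips the padding that some wc implementations add.
size_of() { wc -c <"$1" | tr -d ' '; }

size_colon=$(size_of "$dir/colon")
size_bare=$(size_of "$dir/echo_bare")
size_empty=$(size_of "$dir/echo_empty")
size_one=$(size_of "$dir/echo_one")

echo "colon=$size_colon echo_bare=$size_bare echo_empty=$size_empty echo_one=$size_one"
# prints: colon=0 echo_bare=1 echo_empty=1 echo_one=2

rm -rf "$dir"
```

Only the colon variant leaves the file empty, which is why it is the one that writes no data on the affected file systems.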