- changed status to open
Cactus build fails on file systems that do not update timestamps all the time
Ian and I recently ran into an issue caused by a particular file system (NFS v4 and also BeeGFS) not updating the file modification time when doing this:
: >make.checked
which is what the build system uses (since [https://bitbucket.org/cactuscode/cactus/commits/0779c17697d3b4a254065c10836770e355071f41 0779c17697d3b4a254065c10836770e355071f41] "Cactus: Replace "echo" by ":" in makefile" Thu Nov 27 16:35:13 2014 -0500) to update the marker files once a directory is finished building. While such a behaviour is not POSIX compliant (https://bugzilla.kernel.org/show_bug.cgi?id=6127), we'd still want to work around it.
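A minimal sketch for reproducing the behaviour on a given file system, assuming a POSIX shell and find. The helper name check_truncate_mtime and the temporary file names are hypothetical and not part of Cactus. It truncates an already empty file, as the build system does, and reports whether the modification time moved past a reference file created in between.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cactus): report whether ': >file' on an
# already empty file updates that file's modification time.
check_truncate_mtime() {
    dir=${1:-.}
    ref="$dir/mtime-ref.$$"
    marker="$dir/mtime-marker.$$"

    : >"$marker"    # create the empty marker file
    sleep 2         # step past coarse timestamp granularity
    : >"$ref"       # reference file, newer than the marker
    sleep 2
    : >"$marker"    # truncate the marker again; no data is written

    # find prints the marker only if it is now newer than the reference.
    if [ -n "$(find "$marker" -newer "$ref")" ]; then
        result="updated"
    else
        result="NOT updated"
    fi
    rm -f "$ref" "$marker"
    echo "mtime $result by ': >file' in $dir"
}
```

Calling check_truncate_mtime with a directory on the file system in question prints one line stating whether the timestamp was updated there.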
The simplest solution seems to me to revert [https://bitbucket.org/cactuscode/cactus/commits/0779c17697d3b4a254065c10836770e355071f41 0779c17697d3b4a254065c10836770e355071f41] and use
echo "" >make.checked
again. Erik: since you made the change, would you see any downside to reverting it?
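For reference, a small sketch of what each candidate command leaves in a marker file. The file names are made up for illustration, and mktemp -d is assumed to be available (it is not required by POSIX). Only the first variant writes no data at all, which is the case that fails to update the timestamp on the affected file systems.

```shell
#!/bin/sh
# Compare the contents left behind by each way of writing a marker file.
# All file names here are illustrative only.
tmpdir=$(mktemp -d)

: >"$tmpdir/colon.marker"             # creates or truncates; writes no data
echo >"$tmpdir/echo.marker"           # writes a single newline
echo "" >"$tmpdir/echo-empty.marker"  # also a single newline
echo 1 >"$tmpdir/echo-one.marker"     # the character 1 plus a newline

# $(( ... )) strips any padding that wc may add around the number.
size_colon=$(( $(wc -c <"$tmpdir/colon.marker") ))
size_echo=$(( $(wc -c <"$tmpdir/echo.marker") ))
size_echo_empty=$(( $(wc -c <"$tmpdir/echo-empty.marker") ))
size_echo_one=$(( $(wc -c <"$tmpdir/echo-one.marker") ))

echo "colon: $size_colon bytes"            # 0
echo "echo: $size_echo bytes"              # 1
echo "echo \"\": $size_echo_empty bytes"   # 1
echo "echo 1: $size_echo_one bytes"        # 2

rm -rf "$tmpdir"
```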
While investigating this, Ian also found that some file systems offer only 1-second granularity in their timestamps (e.g. ext3, and possibly also XFS and NFS), which can negatively affect make if a rule takes less than a second to complete. See https://savannah.gnu.org/bugs/?40056#comment0 and https://www.gnu.org/software/autoconf/manual/autoconf-2.61/html_node/Timestamps-and-Make.html . The most conservative approach would be to arrange for
sleep 1
to execute after each make recipe.
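One possible shape for this, sketched as a shell helper. The name run_then_sleep is hypothetical, and how it would be hooked into the Cactus makefiles is left open here. It runs a command, waits one second so that anything built afterwards gets a strictly later timestamp on a file system with 1-second granularity, and passes the command's exit status through so that make still sees failures.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of Cactus): run a command, then pause for
# one second, preserving the command's exit status.
run_then_sleep() {
    status=0
    "$@" || status=$?   # remember a failure instead of losing it
    sleep 1             # let the clock advance past the 1-second granularity
    return "$status"
}
```

The obvious cost is one extra second per recipe, which adds up quickly in a build with many small rules.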
Keyword: backport
Comments (9)
-
reporter -
Note that the observed symptom was that changes to the thornlist of a configuration were not reflected in the list of thorns Cactus thought were available at runtime, even though they were linked into the executable at link time; i.e., the link line did not match the output of cactus_sim -T. So if a new thorn was added, Cactus failed when activating the thorn, as it thought the thorn was not present. If a thorn was removed, Cactus failed at link time, because the bindings were still referring to the thorn, even though it was not linked in. When this is fixed, we should announce it to the mailing list, in case people are running into this problem.
-
The difference between
: >FILE
and
echo 1 >FILE
is that the former creates an empty file, while the latter writes something into the file. Writing an empty string (as you suggest) is also fine, but is the same as just writing
echo
, as both simply write a newline character.
I don't see a downside, except (tongue-in-cheek) the increase in build time for calling
echo
, and the increase in disk space for creating non-empty files.
-
I agree that having a workaround would be nice, and the little overhead (not the pause, but creating a new file instead of changing an existing file) should not hurt.
However, I suggest you contact the admins of the machines where this happens. BeeGFS seems to (at least by default) properly support modification times, and NFSv4 can be configured to do so, too (I don't know what the default is there). Usually, problems with modification/creation times come from machines with unsynchronized clocks between clients and file servers; see, e.g., http://www.beegfs.com/wiki/FAQ.
-
reporter -
The machine in question is the new cluster being purchased by the AEI. Please note that we are not talking about the file modification time being incorrect. Instead, it simply is not modified at all unless some data is actually written to the file. I can reproduce this on at least two BeeGFS installations, one of which should be using "default" options.
I would still prefer that we change our make system so that it works in non-ideal situations. I generally dislike assuming that our scripts operate in a perfect world and much prefer that they assume as little as possible about the environment they run in.
Unless vetoed I will apply the patch in https://bitbucket.org/cactuscode/cactus/pull-requests/22/cactus-restore-echo-to-create-makechecked/diff on Wednesday.
-
Of course, please go ahead and make it work in these situations too. I didn't mean to discourage that. I just wanted to suggest you contact the FS vendor about the issue. The BeeGFS guys are actually quite nice to talk to (at least the ones I met at the SC conference). I would expect them to be interested in such reports.
-
reporter - changed status to resolved
Applied in git hash d526b01 "Cactus: restore echo to create make.checked files" of the flesh.
-
Can this be backported to the release?
-
reporter - edited description
- changed status to closed