Many ExternalLibraries thorns build when they shouldn't

Create issue
Issue #1897 closed
Ian Hinder created an issue

With current master of the ET, I see the MPI thorn building OpenMPI even when I provide enough in the optionlist that it doesn't have to.

********************************************************************************
Running configuration script for thorn MPI:
Found MPI compiler wrapper at /opt/local/bin/mpicxx-openmpi-mp!
Successfully configured MPI.
Finished running configuration script for thorn MPI.

********************************************************************************

...

Installing MPI into /Users/ian/Cactus/EinsteinToolkitGit/configs/sim/scratch/external/MPI
MPI: Preparing directory structure...
MPI: Unpacking archive...
MPI: Configuring...

============================================================================
== Configuring Open MPI
============================================================================

*** Startup tests

Keyword: MPI

Comments (19)

  1. Ian Hinder reporter
    • removed comment

    The first section shows that enough options are provided that we don't need to build OpenMPI, but the second section shows that we are building it anyway. The culprit appears to be the "done" file. The MPI "done" file is not being updated with the current date by detect.pl, and make.code.deps has a rule which says to run build.pl if the done file is older than the build.pl. I suspect the relevant commit is

    commit 86b1f5fad74891a044ef1f6f2059f98c3e8b0db0
    Author: eschnett <eschnett@043a8217-7a68-40fe-abfd-36aa7d4fa6a8>
    Date:   Sat Apr 23 05:18:34 2016 +0000
    
        Don't modify the 'done' file unless there was a real change
    
        git-svn-id: https://svn.cactuscode.org/projects/ExternalLibraries/MPI/trunk@75 043a8217-7a68-40fe-abfd-36aa7d4fa6a8
    
    diff --git a/src/detect.pl b/src/detect.pl
    index 3ddaf3c..36712c3 100644
    --- a/src/detect.pl
    +++ b/src/detect.pl
    @@ -154,8 +154,10 @@ if ($mpi_build and !$mpi_info_set) {
     } else {
         my $THORN = "MPI";
         my $DONE_FILE = "$ENV{SCRATCH_BUILD}/done/${THORN}";
    -    mkdir("$ENV{SCRATCH_BUILD}/done");
    -    system("date > ${DONE_FILE}") == 0 or die;
    +    if (! -e $DONE_FILE) {
    +        mkdir("$ENV{SCRATCH_BUILD}/done");
    +        system("date > ${DONE_FILE}") == 0 or die;
    +    }
     }
    

    So it looks like make.code.deps assumes that the done file will be updated unless building is needed, but this change is stopping this from happening.

  2. Frank Löffler
    • changed status to open
    • marked as
    • removed comment

    After conversation with Erik, we came to the conclusion that it seems we need two files to indicate the two different things we need. We need one file to indicate that the thorn is done, and whose date only updates if something in that thorn was updated (done), and we need another file that indicates that the library in external thorns is available, either built or available otherwise. The date of that file has to update whenever that library changed (when built). That second file can be considered 'internal' to that thorn. We considered two possible names for that file: build_unnecessary_or_complete or available. If no objections are raised and no other names are proposed, it will be 'available'. This will need to be changed in all external libraries, and is a release-blocker because currently libraries are built even if they are supposed to not be.

  3. Frank Löffler
    • removed comment

    Replying to [comment:3 knarf]:

    The date of that file has to update whenever that library changed (when built).

    It also needs to change after each invocation of the detect script, if the library was still detected. This is because the make-mechanism can then use this file to determine whether to build or not build the library. This is what is currently missing, and can lead to the unnecessary builds if the library is available elsewhere, and build.sh is updated. (make compares the date of 'done' (and later 'available') to the date of 'build.sh')

  4. Erik Schnetter
    • removed comment

    Note in particular that the mechanism suggested above requires changes only internal to each external library thorn, and no changes outside. The new tag "available" would only be used to communicate between the "detect" and the "build" scripts.

  5. Roland Haas
    • removed comment

    I am confused about which script sets the time of which file and under which circumstances. Would it be possible to provide something like a flow diagram? Namely who sets "available" and "done" under which circumstances and which makefile targets (mostly the Cactus thornlib) depend on "done" or "available".

  6. Frank Löffler
    • removed comment

    'done': - tells Cactus - That that thorn is 'ready'. - The last update time of the status of that thorn (file date). This is used to decide whether dependencies need to be rebuilt. - created by detect.sh if nothing needs to be built (detected installed library) Otherwise not touched there. - created and/or modified (touched) always after build

    'available': - not used by Cactus - tells the make system whether build.sh needs to run, i.e., whether after detection a build is necessary. - is created and/or updated whenever detection is run and something was detected. The update is necessary to make it newer than the file time of 'build.sh', which is what 'make' uses as comparison. The comparison is done so that a re-build is triggered if build.sh is changed (and detection doesn't find anything).

    What is left is the (currently also buggy I think) case of a vanishing system library. In that case, 'done' would need to be removed by 'detect' (but isn't I think at the moment), and 'available' probably also should be removed in that case.

  7. Roland Haas
    • removed comment

    When you say "tells Cactus that thorn is ready" do you mean "cactus thornlib or make.checked target depends on this so that thornlib or make.checked is updated triggering dependent thorns to rebuild themselves"?

    For 'available' it says "is created and/or updated whenever detection is run and something was detected." does that mean "available" is not touched if detection is run and nothing was detected (this would seem to be in direct contradiction of "tells the make system whether build.sh needs to run, i.e., whether after detection a build is necessary. ")? Which script creates/touches 'available'? Only detect.sh or sometimes also build.sh?

  8. Frank Löffler
    • removed comment

    Replying to [comment:9 rhaas]:

    When you say "tells Cactus that thorn is ready" do you mean "cactus thornlib or make.checked target depends on this so that thornlib or make.checked is updated triggering dependent thorns to rebuild themselves"?

    Yes.

    For 'available' it says "is created and/or updated whenever detection is run and something was detected." does that mean "available" is not touched if detection is run and nothing was detected (this would seem to be in direct contradiction of "tells the make system whether build.sh needs to run, i.e., whether after detection a build is necessary. ")? Which script creates/touches 'available'? Only detect.sh or sometimes also build.sh?

    Good questions. I think I might have gotten something wrong.

    'available' will get set both by detect.sh when something was detected, and by build.sh if it was build. Both will update the time stamp. However, this leaves us with the problem that we both want to use it to see if detect.sh found something, but also for preserving the last built date (if build). We cannot do both, as detect.sh should remove the file in case nothing is detected to indicate that. Do we need yet another indicator ('done', 'last_build', 'syslib_detected')?

  9. Roland Haas
    • removed comment

    I think "available" may be misleading. As far as I can tell we may need something like:

    • 'done' which is created/set by detect.sh if a system library is used and by build.sh once it finishes, 'done' is the file that make.depend depends on and hence each time that detect.sh runs and sets 'done' the thornlib is updated and Formaline includes the thorn source
    • 'do_build' which is created/set by detect if build.sh should run, 'done' lists 'do_build' as a prereq and the recipe for 'done' is build.sh

    Neither editing build.sh nor editing detect.sh triggers a rebuild of the library. One has to touch configuration.ccl to do so (this is already true for detect.sh). If we want automated rebuild when build.sh is changed then this would need something like:

    # use wildcard to ignore do_build if no build is requested anyway
    done: build.sh $(wildcard do_build) 
            build.sh
    
  10. Erik Schnetter
    • removed comment

    The libraries will only be rebuilt if the CST stage runs. You can trigger this manually (make *-rebuild), or by touching one of the ccl files.

    "done" is not a file in the current directory; does this work with make?

    If "do_build" doesn't exist, wouldn't the rule trigger all the time?

  11. Roland Haas
    • removed comment

    In one version of the proposal, yes. I went through multiple versions of the rule and now I use $(wildcard do_build) to avoid this effect (must have missed it when testing this on my workstation). Thank you for pointing out the make *-depend target (I can never remember which one it is).

    Yes, as far as I know one can specify the full path on dependency side (I think there is some make var that points to the thorndir, isn't there?). This would all go into the ExternalLibrary's makefile and not be part of Cactus' make.subdir or make.thornlib. A full path for the target also works but make then needs to be called with that string as its target.

  12. Frank Löffler
    • removed comment

    Any new ideas about how to fix this?

    Trying to summarize, we need:

    a) a file that indicates to Cactus that something in the library configuration changed, especially after a re-build of the library by Cactus. So far, this is called 'done'.

    b) a file that indicates that the library needs to be build. This is thorn-internal, and only passes that information from detect.sh to build.sh.

    c) a file that 'saves' the last build date, to be able to skip build if build.sh wasn't updated (no new version available). This is also thorn-internal.

    Initially, I tried to combine b) and c) into one file, 'available', but that was shown to be not sufficient. As it stands now, we don't have a solution which would combine b) and c) into one file. It would be nice to have only two files (simpler), but in absence of a solution using only two files, but the need for a solution in general, I only see one way forward: three files.

    a) 'done': would not change from current behavior. Created/removed by detect.sh (but not updated), created/removed(?)/updated by build (if run). b) 'build_not_needed' (?): created/updated/removed by detect.sh c) 'last_build' (?): created/updated by build.sh

    If b) exists after detection, build doesn't need to run. If c) is newer than build.sh, build doesn't need to run. Otherwise: build and update c).

  13. Ian Hinder reporter
    • changed status to open
    • assigned issue to
    • removed comment

    In the ET call today, we decided that the changes which introduced this bug would be reverted (the changes were intended to avoid parts of Carpet from being rebuilt, but the logic is incompatible with how the libraries actually function). Erik, would you mind taking care of this? It's probably best, as you know exactly which commits in each repository to revert.

    I was just hit by it again when trying to build on my Mac, and this time hwloc tried to build itself (it's supposed to be found from MacPorts).

  14. Log in to comment