Simfactory: potentially serious problem with CACHE directory in the simulations directory

Issue #1772 closed
Bruno Mundim created an issue

Is the CACHE directory in the simulation directory really necessary? We are talking about executables of at most 400 MB, which is nothing compared to current HPC storage systems.

I think I might have found a design flaw in simfactory's use of the CACHE directory, which can go unnoticed until it is too late, with a potential loss of thousands of SUs. Suppose we have the following situation:

1) We build a configuration A and submit a simulation A1 with parameter file 1. Simfactory copies the executable from configuration A to the simulation A1 SIMFACTORY directory and creates a symlink from /scratch/simulations/CACHE/exe/cactus_A to /scratch/simulations/A1/SIMFACTORY/exe/cactus_A.

2) We then create a new simulation A2 with a different parameter file 2. This time simfactory symlinks the simulation executable /scratch/simulations/A2/SIMFACTORY/exe/cactus_A to the cached one, /scratch/simulations/CACHE/exe/cactus_A.

3) After a few days (or restarts) of simulations A1 and A2, you come up with a better idea/fix/new parameter which requires recompiling your configuration A. Note that we don't want to build a new configuration from scratch, since Cactus configurations consume a lot of both time and space to build. So you rebuild your configuration A and its executable cactus_A is updated.

4) Let's say we now submit the updated configuration with the same parameter file 2 in order to test the new idea/fix/parameter and compare it with simulation A2, which is still running and has a few restarts left before completion. Call this simulation A2_updated. Simfactory then copies the updated executable cactus_A from Cactus/exe/cactus_A to the simulation directory /scratch/simulations/A2_updated/SIMFACTORY/exe/cactus_A and updates the CACHE symlink to point to that new simulation directory, i.e.:

$ cd /scratch/simulations/CACHE/exe
$ ls -l cactus_A
cactus_A -> ../../../../scratch/simulations/A2_updated/SIMFACTORY/exe/cactus_A

5) The problem: the restarts of my simulation A2 are now compromised by a new executable. Remember that this simulation's executable is actually a symlink to the one in the CACHE directory, which has just been updated.
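The scenario in steps 1) to 5) can be reproduced outside simfactory. The following is only a minimal sketch, assuming that every link involved is a symbolic link, as reported above; the function name `demo_symlink_cache` and the file contents are made up for illustration and are not simfactory code.

```python
import os


def demo_symlink_cache(root):
    """Rebuild the reported layout under `root` using symbolic links only.

    Returns what simulation A2 sees as its executable before and after
    the cache entry is retargeted to A2_updated.
    """
    def exe_path(sim):
        exe_dir = os.path.join(root, sim, "SIMFACTORY", "exe")
        os.makedirs(exe_dir)
        return os.path.join(exe_dir, "cactus_A")

    def read(path):
        with open(path) as f:
            return f.read()

    def write(path, content):
        with open(path, "w") as f:
            f.write(content)

    cache_dir = os.path.join(root, "CACHE", "exe")
    os.makedirs(cache_dir)
    cache = os.path.join(cache_dir, "cactus_A")

    # Step 1: A1 holds a real copy; the cache entry is a symlink to it.
    a1_exe = exe_path("A1")
    write(a1_exe, "old build")
    os.symlink(a1_exe, cache)

    # Step 2: A2's executable is only a symlink to the cache entry.
    a2_exe = exe_path("A2")
    os.symlink(cache, a2_exe)
    before = read(a2_exe)

    # Steps 3-4: A2_updated gets the rebuilt executable and the cache
    # symlink is retargeted to it.
    new_exe = exe_path("A2_updated")
    write(new_exe, "new build")
    os.remove(cache)
    os.symlink(new_exe, cache)

    # Step 5: A2 now resolves to the rebuilt executable.
    after = read(a2_exe)
    return before, after
```

With hard links instead of symbolic links, A2 would keep pointing at the original inode and the second read would still return the old contents.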

I think this whole cache directory intermediate step introduces unnecessary complexity for the user to track; it is really unnecessary and in my opinion not a good design choice. I would vote to eliminate it from simfactory completely as soon as possible, ideally even for this release. Just use one copy of the executable from cactus/exe to simulation/SIMFACTORY/exe and that's it. This is all we need to have that simulation and future ones running consistently with the same executable.

Thanks!

PS: I actually noticed this issue on the Herschel release (there is no option pointing to the Herschel release on trac). I am working on tests for the development version to confirm this issue, but given the simfactory commits I believe it is still there.

Keyword: CACHE

Comments (26)

  1. Erik Schnetter

    Yes, the CACHE directory is necessary if you are running many similar simulations. This can happen e.g. during benchmarking. While most HPC systems can handle a large number of executables, there are some that cannot, and where one runs out of quota.

    The cache works slightly differently than you describe. First, it uses hard links, not soft (symbolic) links. The cache is just this, a cache -- the actual executables are safely stored in the simulation directories, and are never modified. Here is what actually happens when a simulation is created:

    1. Check cache whether it has the right executable. If not, ignore cache.
    2. If cache is good, create a hard link from cache to simulation directory.
    3. Check simulation directory if it now has the right executable, since the cache may have changed in the mean time. If no, delete hard link again.
    4. If simulation directory does not have an executable, copy it from the Cactus build directory.
    5. The simulation directory now has the correct executable.
    6. If we are not using a hard link from the cache, then delete the cache file, and create a new hard link from the simulation directory to the cache.

    This guarantees that simulations always use the correct executable, and that a simulation's executable is never overwritten. Also, there are fallbacks in place in case any of the operations fails (e.g. creating a hard link).
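The six steps above can be sketched roughly as follows. This is not the simfactory implementation: the function and parameter names are hypothetical, the "right executable" test is assumed to be a byte-for-byte comparison, and the sketch assumes that `sim_exe` does not exist yet and that all directories are already in place.

```python
import filecmp
import os
import shutil


def same_contents(a, b):
    """True if both paths are regular files with identical bytes."""
    return (os.path.isfile(a) and os.path.isfile(b)
            and filecmp.cmp(a, b, shallow=False))


def install_executable(build_exe, cache_exe, sim_exe):
    """Hypothetical sketch of steps 1-6; returns True if the cache was used."""
    used_cache = False

    # Steps 1-2: use the cache only if it holds the right executable.
    if same_contents(cache_exe, build_exe):
        try:
            os.link(cache_exe, sim_exe)
            used_cache = True
        except OSError:
            pass  # hard link failed; fall through to a plain copy

    # Step 3: the cache may have changed in the meantime, so re-check.
    if used_cache and not same_contents(sim_exe, build_exe):
        os.remove(sim_exe)
        used_cache = False

    # Step 4: no executable yet, so copy it from the build directory.
    if not os.path.exists(sim_exe):
        shutil.copy2(build_exe, sim_exe)

    # Step 5: sim_exe is now the correct executable.

    # Step 6: refresh the cache from the simulation directory.
    if not used_cache:
        if os.path.lexists(cache_exe):
            os.remove(cache_exe)  # removes only the cache's name
        try:
            os.link(sim_exe, cache_exe)
        except OSError:
            shutil.copy2(sim_exe, cache_exe)
    return used_cache
```

Because step 6 removes the cache's directory entry rather than writing into it, executables of earlier simulations that share the old inode are left untouched.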

    What is actually a problem currently is that some external libraries that are built are dynamic libraries, which can change or go away when one rebuilds. These are not copied into the simulation directory. We try to enforce using static libraries for this reason, but I'm not sure whether this is the case for all external libraries.

  2. Ian Hinder

    Another problem is that during the transition from simfactory 1 to simfactory 2, a feature was lost. SimFactory 1 used to additionally make a hardlink of the executable in each restart directory. This made it possible to replace the executable at the top-level of the simulation while it was running, so that the next restart would get the new executable (you have to be careful to delete the original hardlink first, because cp by default will copy the new data into the old file, which affects all hard links). If you do that now, the simulation will crash with a "bus error" (at least, that's what happened to me), because each restart executes the top-level executable.
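The caveat about deleting the original hard link first can be illustrated with a small sketch. The two helper names are hypothetical; `shutil.copy`, like a plain `cp new old`, writes into the existing file, so every hard link to that inode sees the new data.

```python
import os
import shutil


def replace_in_place(new_exe, sim_exe):
    """Overwrite the existing inode: all hard links to it change too."""
    shutil.copy(new_exe, sim_exe)


def replace_by_unlink(new_exe, sim_exe):
    """Remove the name first, so the copy creates a fresh inode and
    other hard links keep the old executable."""
    os.remove(sim_exe)
    shutil.copy(new_exe, sim_exe)
```

With a per-restart hard link as in SimFactory 1, only the second variant leaves the executable of a running restart unchanged.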

  3. Bruno Mundim reporter

    Hi Erik,

    thanks a lot for your clarification! I should have tested on other clusters before filing this ticket. Note however that what I described earlier does happen on Loewe! Please see my comments below:

    Replying to [comment:1 eschnett]:

    Yes, the CACHE directory is necessary if you are running many similar simulations. This can happen e.g. during benchmarking. While most HPC systems can handle a large number of executables, there are some that cannot, and where one runs out of quota.

    Ok, point taken! The executable size might not be a problem but the total number of them might take you out of your quota on some systems.

    The cache works slightly differently than you describe. First, it uses hard links, not soft (symbolic) links.

    That doesn't seem to happen on Loewe. I might be completely confused, but as far as I understand, all links created by simfactory for executables in the CACHE directory and in the simulation directories were symbolic links. For example:

    $ pwd 
    /scratch/astro/mundim/simulations/ET_2014_11_herschel/bns_thc/SIMFACTORY/exe
    $ stat cactus_thc_i15_O2 
      File: `cactus_thc_i15_O2' -> `../../../../../../../astro/mundim/simulations/ET_2014_11_herschel/CACHE/exe/cactus_thc_i15_O2'
      Size: 93              Blocks: 1          IO Block: 524288 symbolic link
    Device: 19h/25d Inode: 9962542217151410378  Links: 1
    Access: (0777/lrwxrwxrwx)  Uid: (58311/  mundim)   Gid: (58057/   astro)
    Access: 2015-04-21 22:59:48.000000000 +0200
    Modify: 2015-04-21 22:59:48.000000000 +0200
    Change: 2015-04-21 22:59:48.000000000 +0200
    
    $ cd ../../../../../../../astro/mundim/simulations/ET_2014_11_herschel/CACHE/exe
    $ stat cactus_thc_i15_O2 
      File: `cactus_thc_i15_O2' -> `../../../../../../astro/mundim/simulations/ET_2014_11_herschel/bns_thc_new/SIMFACTORY/exe/cactus_thc_i15_O2'
      Size: 119             Blocks: 1          IO Block: 524288 symbolic link
    Device: 19h/25d Inode: 4055656628534771390  Links: 1
    Access: (0777/lrwxrwxrwx)  Uid: (58311/  mundim)   Gid: (58057/   astro)
    Access: 2015-05-05 15:49:48.000000000 +0200
    Modify: 2015-05-05 15:49:48.000000000 +0200
    Change: 2015-05-05 15:49:48.000000000 +0200
    

    The cache is just this, a cache -- the actual executables are safely stored in the simulation directories, and are never modified. Here is what actually happens when a simulation is created:

    1. Check cache whether it has the right executable. If not, ignore cache.

    If not, then create the cache, no? What you mean by right executable is if the current cache and Cactus/exe/cactus_sim, for example, are the same executable, right?

    2. If cache is good, create a hard link from cache to simulation directory.

    What if it is creating a symbolic link silently as it seems to happen on Loewe? Is there a way of testing if actually a hard link was created?

    3. Check simulation directory if it now has the right executable, since the cache may have changed in the mean time. If no, delete hard link again.
    4. If simulation directory does not have an executable, copy it from the Cactus build directory.

    Unfortunately this is not happening on Loewe. The cache has changed, but the restart of my old simulation still has a symbolic link to the cache directory which now symlinks to the newer simulation with an updated copy of the cactus executable.

    5. The simulation directory now has the correct executable.

    I agree with your explanation, but something is going wrong on Loewe (or in my head, for not noticing something obvious you stated).

    6. If we are not using a hard link from the cache, then delete the cache file, and create a new hard link from the simulation directory to the cache.

    This guarantees that simulations always use the correct executable, and that a simulation's executable is never overwritten. Also, there are fallbacks in place in case any of the operations fails (e.g. creating a hard link).

    Ok, what is the fallback for not creating hard links?

    What is actually a problem currently is that some external libraries that are built are dynamic libraries, which can change or go away when one rebuilds. These are not copied into the simulation directory. We try to enforce using static libraries for this reason, but I'm not sure whether this is the case for all external libraries.

  4. Bruno Mundim reporter

    Hi Ian:

    Replying to [comment:2 hinder]:

    Another problem is that during the transition from simfactory 1 to simfactory 2, a feature was lost. SimFactory 1 used to additionally make a hardlink of the executable in each restart directory. This made it possible to replace the executable at the top-level of the simulation while it was running, so that the next restart would get the new executable (you have to be careful to delete the original hardlink first, because cp by default will copy the new data into the old file, which affects all hard links). If you do that now, the simulation will crash with a "bus error" (at least, that's what happened to me), because each restart executes the top-level executable.

    I think if you have a copy of the executable at simulations/SIMFACTORY/exe you could just swap it for another one and restart the simulation without a problem. I did this for a simulation that someone else ran and that I had to continue. We do need to be extremely careful with this new executable, since it could easily segfault if it is not built from Cactus source very similar to that of the previous simulation.

  5. Erik Schnetter

    If there is a machine where creating a hard link leads to a soft link being created, then someone needs to get a stern talking-to. They have very different behaviour, and this leads to exactly the problem you describe. I assume that Loewe may have a file system that doesn't support hard links, and that someone decided to use file system options that replace hard links by soft links. I'm not happy.

    That's exactly the reason why Simfactory is so complex: It's not that things are difficult, it's that each HPC system is its own fiefdom that does things ever so slightly differently than others. To provide a uniform interface, one needs to check all the corner cases.

    What machine and what file system is Loewe?

    Regarding file size vs. number of files: No, it is really the file size that makes the difference, not the number of files.

    The fallback when a hard link cannot be created (e.g. on a Blue Gene/Q, which does not support hard links) is to copy files instead. In this case the cache becomes meaningless.

    In my steps above, the cache is not updated at step 1. It is only updated (or created if need be) in step 6, after the simulation has the correct executable.

    And with "right executable" I mean that it is the same executable, byte for byte, as determined by an actual comparison, as the one in the source tree.

  6. Ian Hinder

    I've just looked at the relevant function (https://bitbucket.org/simfactory/simfactory2/src/9c19923096de7cefe1efc903037aff0440807f3f/lib/restartlib.py?at=master#cl-576). It is definitely not deliberately creating symbolic links. If the system is configured to create symbolic links instead of hard links, I agree with Erik that that is an incorrectly-configured system. Unless you can make the system administrators see sense, SimFactory is going to have to check whether the created file is a symbolic link, and if so, remove it and use copying instead.

    Replying to [comment:5 eschnett]:

    And with "right executable" I mean that it is the same executable, byte for byte, as determined by an actual comparison, as the one in the source tree.

    The logic is currently a bit weird:

    If there is no cached file with the same leaf name as the destination,
        Copy the source to the destination
        Try to hard link the destination to the cache file
        If it fails, copy the destination to the cache file
    else (there is already a file with the same leaf name as the destination),
        Try to hard link the cache file to the destination
        If it fails, copy the cache file to the destination
        [This is potentially a waste of time if you haven't first compared that the files are equal; it could just as easily be a new (or old) executable]

    If the source file has a modification time before or equal to that of the destination file, and the sizes are equal, call the source and destination files "equal" (??? this means that submitting an older executable with the same name will use the older executable)

    If the files are not "equal" as defined above,
        Remove the destination file
        Copy the source file to the destination
        Remove the cache file
        Hard link the destination to the cache
        If this fails, copy the destination to the cache

    Am I missing something, or is this logic fairly weird (and wrong, in the case of an older executable)?

  7. Erik Schnetter

    The logic to copy instead of hard link was added later, as an error handling mechanism. The logic without this mechanism should be easier to understand. I did not want to rewrite the whole algorithm since it was working correctly. It may now be time to introduce a new function "create_hardlink_or_copy" that does this, and which also checks for erroneous soft links.
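A function along these lines might look like the sketch below. Only the name "create_hardlink_or_copy" comes from the comment above; the signature, the return value and the use of `os.path.islink` for the soft-link check are assumptions, and the sketch expects `src` to be a regular file and `dst` not to exist yet.

```python
import os
import shutil


def create_hardlink_or_copy(src, dst):
    """Hard link src to dst, falling back to a copy.

    Returns "link" or "copy" (a hypothetical convention). A symbolic
    link produced in place of a hard link is treated like any other
    link failure and replaced by a copy.
    """
    try:
        os.link(src, dst)
        if os.path.islink(dst):
            # The system silently created a soft link: undo it and
            # reuse the ordinary error path below.
            os.remove(dst)
            raise OSError("hard link request produced a symbolic link")
        return "link"
    except OSError:
        shutil.copy2(src, dst)
        return "copy"
```

Routing the soft-link case through the same `except` branch means there is still only one fallback path to maintain.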

    Yes, the logic to compare files apparently only looks at sizes and dates. That was presumably a performance optimization that works well in practice.

  8. Ian Hinder

    I think size and date is fine (in fact, date should be sufficient), but why is it doing <= instead of ==? That seems to introduce very unexpected behaviour.

  9. Ian Hinder

    Copying without preserving the mtime? Why not copy the mtime as well, and then compare for equality? Do you worry about different mtime precisions on different filesystems? At the moment, I think it will give wrong behaviour if you try to use an old executable (which happens to have the same size).
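To make the concern concrete, here is a small sketch of the two comparisons. The function names are hypothetical and the first one only mirrors the size-and-date logic as described in this thread, not the actual simfactory code. `shutil.copy2` is used because it copies the modification time along with the data.

```python
import os
import shutil


def looks_equal_le(src, dst):
    """Comparison as described above: src mtime <= dst mtime, equal sizes."""
    s, d = os.stat(src), os.stat(dst)
    return s.st_mtime_ns <= d.st_mtime_ns and s.st_size == d.st_size


def looks_equal_eq(src, dst):
    """Stricter variant: mtime and size must both match exactly."""
    s, d = os.stat(src), os.stat(dst)
    return s.st_mtime_ns == d.st_mtime_ns and s.st_size == d.st_size


def install_preserving_mtime(src, dst):
    """Copy data and metadata, so the equality test holds afterwards."""
    shutil.copy2(src, dst)
```

An older executable of the same size passes the first test and would be skipped, while the second test rejects it. Whether exact equality survives file systems with different timestamp precision is the open question raised above.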

  10. Ian Hinder

    This problem only affects one machine, and there is a workaround (delete the cache directory before submitting a simulation). Should we just add it to the release notes, change the ticket status to major, and add the check to simfactory after the release?

  11. Bruno Mundim reporter

    Replying to [comment:14 rhaas]:

    If you want to check if a file is a symbolic link (instead of checking that it is not multiply hard-linked) you can also use os.path.islink. See https://stackoverflow.com/questions/11068419/check-if-file-is-symlink-in-python and https://docs.python.org/2/library/os.path.html#os.path.islink

    and how would I know if the python runtime supports symlinks?

    https://bugs.python.org/issue13143

    in any case, as long as it works I am happy.

  12. Ian Hinder

    From reading that issue, it appears to be related to Windows. Some Python versions support some notion of symbolic links on Windows, whereas others don't. I would assume that all versions support symbolic links on Unix-like operating systems.

  13. Roland Haas

    Actually the way Bruno does it may be better. The test really should be "did I get a hard link" (which is what I wanted) rather than "did I get something else that I got on this faulty system". One would have to check that e.g. Python's stat, called on a symlink pointing to a file for which hard links exist, does indeed report the number of hard links on the symlink itself and not on the pointed-to file, i.e.:

    touch TestFile
    ln TestFile HardLinkToTestFile
    ln -s TestFile SymbolicLinkToTestFile
    python <<EOF
    import os
    print(os.lstat("SymbolicLinkToTestFile"))
    print(os.stat("SymbolicLinkToTestFile"))
    EOF
    

    which at least on my workstation gives "st_nlink=2" for the "stat" call and "st_nlink=1" for the "lstat" call (see man 2 lstat for the difference between the two).
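Turned into a check, the "did I get a hard link" test could look like this sketch. The helper name `is_real_hard_link` is hypothetical, and the link-count test is only meaningful right after the link has been created, while the original name still exists.

```python
import os
import stat


def is_real_hard_link(path):
    """True if `path` itself is a regular file with at least two names.

    os.lstat does not follow symbolic links, so a soft link created in
    place of a hard link is reported as a symlink with its own link
    count and fails both conditions.
    """
    st = os.lstat(path)
    return stat.S_ISREG(st.st_mode) and st.st_nlink >= 2
```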

  14. Erik Schnetter

    Your current approach checks whether the hard link has been created correctly, and if not, introduces a new error condition. I would prefer a simpler solution: The current code already knows how to handle the case that the link could not be created. I would extend this error check to include the case that a symbolic link was created. You would then reuse the existing error handling, which copies the file instead.

  15. Bruno Mundim reporter

    Replying to [comment:19 eschnett]:

    Your current approach checks whether the hard link has been created correctly, and if not, introduces a new error condition. I would prefer a simpler solution: The current code already knows how to handle the case that the link could not be created. I would extend this error check to include the case that a symbolic link was created. You would then reuse the existing error handling, which copies the file instead.

    I agree your solution is simpler. However os.link(source, link_name) doesn't raise any exception if a symbolic link is created instead of a hard link. This forced me to use this extra if statement.

  16. Erik Schnetter

    Yes, you need an extra if statement. But you don't need to add an extra branch to handle the error. Instead, you can delete the symbolic link, and then use the existing error handling path that copies the file instead.
