Inconsistent behavior and error messages with executable search

Issue #173 resolved
Dan Bonachea created an issue

This issue is the result of a mailing list complaint from an external user.

The upcxx-run handling of executable search with respect to the user's PATH is currently inconsistent and misleading. Consider the following trace, where hello-world is an smp-conduit UPC++ executable in the CWD:

$ upcxx-run -n 2 does-not-exist                   
usage: upcxx-run [-h] [-n NUM] [-N NUM] [-shared-heap HEAPSZ] [-backtrace] [-show] [-info]
                 [-ssh-servers HOSTS] [-localhost] [-v] [-vv]
                 command ...

Error: "does-not-exist" does not appear to execute a UPC++/GASNet executable

$ upcxx-run -n 2 hello-world
Traceback (most recent call last):
  File "/usr/local/bin/upcxx-run", line 401, in <module>
    main()
  File "/usr/local/bin/upcxx-run", line 396, in main
    os.execvp(cmd[0], cmd)
  File "/usr/lib/python2.7/os.py", line 346, in execvp
    _execvpe(file, args)
  File "/usr/lib/python2.7/os.py", line 382, in _execvpe
    func(fullname, *argrest)
OSError: [Errno 2] No such file or directory

$ upcxx-run -n 2 ./hello-world
Hello world from rank 1
Hello world from rank 0

$ setenv PATH ".:$PATH"

$ upcxx-run -n 2 hello-world  
Hello world from rank 1
Hello world from rank 0

$ cd junk/

junk/$ upcxx-run -n 2 ../hello-world
Hello world from rank 1
Hello world from rank 0

$ setenv PATH "..:$PATH"

junk/$ upcxx-run -n 2 hello-world   
usage: upcxx-run [-h] [-n NUM] [-N NUM] [-shared-heap HEAPSZ] [-backtrace] [-show] [-info]
                 [-ssh-servers HOSTS] [-localhost] [-v] [-vv]
                 command ...

Error: "hello-world" does not appear to execute a UPC++/GASNet executable

Problems:

  1. When "." is not in the PATH, invoking upcxx-run on a "bare" executable in the CWD generates a Python crash, instead of a nice explanatory error message.
  2. As shown in the last command above, PATH is not otherwise searched for executables.
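
The traceback in problem 1 is an uncaught OSError raised by os.execvp. A minimal sketch of how a launcher could turn that into a readable message, assuming Python 3. The helper name exec_or_report is hypothetical and is not taken from upcxx-run:

```python
import os

def exec_or_report(cmd):
    """Replace the current process with cmd; on failure, return an error string.

    Hypothetical helper for illustration only; not part of upcxx-run.
    """
    try:
        # On success this call never returns: the process image is replaced.
        os.execvp(cmd[0], cmd)
    except OSError as err:
        # Covers "No such file or directory", permission errors, etc.
        return 'Error: cannot execute "%s": %s' % (cmd[0], err.strerror)
```

A caller could pass the returned string to sys.exit() so the user sees one explanatory line instead of a stack trace.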

I'm not sure what the right behavior is here, but the current behavior seems inconsistent. We should either uniformly treat the exename as a filename that is searched for in $PATH in the same manner as the shell, or treat it as a literal relative or absolute pathname (uniformly ignoring $PATH).
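
For comparison, the first option (a shell-like search) could be sketched with Python 3's shutil.which, which searches PATH for bare names and checks names that contain a directory component directly. The function name resolve_like_shell is hypothetical, and this is only an illustration of the option, not a description of what upcxx-run does:

```python
import shutil

def resolve_like_shell(exename, path=None):
    """Return the path a shell-style lookup would pick, or None if not found.

    path=None means "use the PATH environment variable"; an explicit
    os.pathsep-separated string can be passed instead.
    """
    return shutil.which(exename, path=path)
```

With this approach a bare name in the CWD is found only when "." appears in the search path, matching shell behavior.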

Comments (6)

  1. Paul Hargrove

    I agree we should not produce a python backtrace for this input.

    As to what we should do with this input: I have no strong feeling on a "right" choice, but feel that a good balance between easy to use, to implement and to document is:

    1. If the character / appears in the executable then it is a path (relative or absolute) to be used as-is.
    2. Otherwise, implicitly prepend ./

    This proposal excludes a search of $PATH, which might be challenging to implement correctly (especially given the recent change to allow non-UPCXX executables such as strace to appear in the command).
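
The two-rule proposal above can be sketched in a few lines of Python. The function name normalize_exename is hypothetical:

```python
def normalize_exename(exename):
    """Apply the two rules: keep paths as-is, prefix bare names with ./"""
    if "/" in exename:
        return exename      # relative or absolute path, used unchanged
    return "./" + exename   # bare name: resolved against the CWD, never $PATH
```

Under this rule both hello-world and ./hello-world name the same file, and ../hello-world is passed through untouched.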

    Related:
    My recollection (imperfect, perhaps) is that different mpirun implementations have (or had when I looked many years ago while working on GASNet's spawners) different behaviors with respect to a bare executable (no /). All did search $PATH IIRC. However, if "." was not in $PATH then they differed in whether or not they also searched ".".

  2. Paul Hargrove

    Not sure if it makes any difference at all, but I want to note that this release is an ideal opportunity to drop support for Python2 entirely. I mention this here only because doing so could (in theory) impact how a solution to this issue might be implemented.

  3. Dan Bonachea reporter

    issue #173: upcxx-run executable search and error messages

    This commit alters the handling of the GEX executable identified by upcxx-run (the first command line argument that leads to a valid file containing the ident strings). Currently PATH is NOT searched when identifying the executable - this is intentional but could be added as an extension later if we wanted.

    If the executable is a bare filename in the current directory, then ./ is prepended to that argument, ensuring it will be found when we later pass it to exec or an underlying spawner.
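
A rough sketch of the behavior this commit message describes, assuming Python 3. The function name fix_bare_executable is hypothetical, and the is_gex_executable predicate is a caller-supplied stand-in for the ident-string check, whose details are not shown in this thread:

```python
import os

def fix_bare_executable(cmd, is_gex_executable):
    """Return a copy of cmd with ./ prepended to the identified executable
    when that argument is a bare filename in the current directory.

    is_gex_executable is a hypothetical predicate standing in for the
    ident-string check; PATH is deliberately not searched.
    """
    result = list(cmd)
    for i, arg in enumerate(result):
        # A bare name passed to isfile() is resolved against the CWD.
        if os.path.isfile(arg) and is_gex_executable(arg):
            if "/" not in arg:
                result[i] = "./" + arg
            break  # only the first matching argument is the executable
    return result
```

Arguments that precede the executable, such as a wrapper command that is not a file in the CWD, are left unchanged.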

    Fixes issue #173

    → <<cset c3a707d97dab>>
