Program exit code: advice to implementers
Originally reported on Google Code with ID 45
This is for issue 49.
For thread termination, Section 5.1.2.2, what is the value returned to the OS for this
program:
int main( void ) { return MYTHREAD; }
Reported by yzheng@lbl.gov
on 2012-05-22 23:49:31
Comments (13)
-
Account Deleted ``` The maximum approach is useful, although it does require a max reduction at program exit. For reference on how this problem has been solved elsewhere, here is what the Fortran standard requires:
"When normal termination occurs on more than one image, it is expected that a processor-dependent summary of any stop codes and signaling exceptions will be made available."
Which fits with your suggestion because the max of all exit values would be a reasonable summary. It's the most reasonable one that I can think of actually. How about this language, different in that it also covers the success case and gives the user something stronger to rely upon if their implementation provides an exit value?
"When a UPC program terminates, implementations are encouraged to make the program exit value available to the mechanism that launched the program. If an exit value is provided to the launch mechanism and not all threads terminated with the same exit value, then the maximum exit value shall be provided." ```
Reported by `johnson.troy.a` on 2012-06-15 18:31:33 - Labels added: Milestone-Spec-1.3
-
Account Deleted ``` Keep in mind that the range of valid return values is usually constrained by the OS. For instance, on my Linux laptop, I only get the low 8 bits of the return value:
$ cat t.c
#include <stdlib.h>
int main(int argc, char **argv) { return atoi(argv[1]); }
$ gcc t.c
$ ./a.out 16384
$ echo $?
0
$ ./a.out 16383
$ echo $?
255 ```
Reported by `sdvormwa@cray.com` on 2012-06-15 18:41:39
-
Account Deleted ``` Oh fun, so there it acts like an "unsigned char main()" function instead. I suppose the program is attempting to return the correct value but something else truncates it, otherwise it would not truly be "int main()".
I think the proposed language is still useful. What the launch mechanism does with the value is not specified, so displaying a truncated form of the provided value is allowed. ```
Reported by `johnson.troy.a` on 2012-06-15 18:47:59
-
Account Deleted ``` POSIX defines that the parent process only sees the low 8 bits of the child's exit status. See the manpage for wait(). ```
Reported by `sdvormwa@cray.com` on 2012-06-15 18:56:16
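Editorial sketch of the truncation described above, assuming a POSIX system. The helper name `observed_exit_status` is hypothetical and not part of any UPC runtime; it forks a child that exits with the given value and reports what the parent sees through `waitpid()`.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`, then return the
 * status the parent observes through waitpid(), or -1 on any error. */
int observed_exit_status(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;               /* fork failed */
    if (pid == 0)
        _exit(code);             /* child: exit at once with the requested value */

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;               /* wait failed */
    if (!WIFEXITED(status))
        return -1;               /* child did not terminate normally */
    return WEXITSTATUS(status);  /* only the low 8 bits of `code` survive */
}
```

Called with 16384 and 16383, this should reproduce the 0 and 255 shown in the shell session earlier in the thread.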
-
Account Deleted ``` Also, the rationale in the POSIX manpage for exit() (man 3p exit) goes into good detail on the subject. ```
Reported by `sdvormwa@cray.com` on 2012-06-15 19:16:23
-
Account Deleted ``` So, do we perform the MAX() reduction on the uint8_t value as seen by wait(), or on the int returned by main(), passed to upc_global_exit(), etc.?
I would say the reduction on the truncated value is the better choice because it can be performed either pre-exit (in the runtime) or post-exit (in the job launcher), depending on which is better suited to a given implementation. ```
Reported by `phhargrove@lbl.gov` on 2012-06-15 20:00:27
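Editorial sketch of why the order matters, assuming the POSIX behavior of keeping only the low 8 bits. Both function names are hypothetical, and the arrays stand in for per-thread exit values that a real runtime or launcher would have to collect by its own means.

```c
#include <stddef.h>

/* Reduce the full int values first, then truncate as a POSIX parent would.
 * Assumes n >= 1. */
int max_then_truncate(const int *codes, size_t n)
{
    int max = codes[0];
    for (size_t i = 1; i < n; i++)
        if (codes[i] > max)
            max = codes[i];
    return max & 0xFF;
}

/* Truncate each value to its low 8 bits first, then reduce.
 * Assumes n >= 1. */
int truncate_then_max(const int *codes, size_t n)
{
    int max = codes[0] & 0xFF;
    for (size_t i = 1; i < n; i++) {
        int low = codes[i] & 0xFF;
        if (low > max)
            max = low;
    }
    return max;
}
```

For the two values 256 and 1, the first function yields 0 (the failure on the second thread is hidden), while the second yields 1. A post-exit launcher can only compute the second form, since it never sees the untruncated values.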
-
Account Deleted ``` Note there are systems that don't follow POSIX here--for instance, on Plan 9 one could conceivably return *all* the values, possibly even sorted by thread. I think this is best left as implementation defined behavior, perhaps with a note to implementers with the recommendation that the max of the truncated exit statuses be the resulting status on POSIX systems. ```
Reported by `sdvormwa@cray.com` on 2012-06-15 20:15:35
-
Account Deleted ``` Set default Consensus to "Low". ```
Reported by `gary.funck` on 2012-08-19 23:26:19 - Labels added: Consensus-Low
-
Account Deleted ``` We seem to have loose consensus that we should not try to specify a strong requirement here, since there is wide variance in the behavior of job control systems that are outside our control. I like Paul's wording from comment 1 because it encourages preserving non-zero exit codes, with preference for the max, without actually requiring any specific behavior (which may be difficult or impossible on a given system).
Whatever we come up with should probably also be used to clarify the behavior of upc_global_exit(status) in 7.2.1 - currently the 'status' argument is completely unmentioned in the semantics. ```
Reported by `danbonachea` on 2012-09-21 19:34:57 - Labels added: Consensus-Medium - Labels removed: Consensus-Low
-
Account Deleted ``` I am still in favor of my original language (comment #1) or something equally "loose". I object to Troy's use of "shall" in his proposal (comment #2).
I've also recently realized that we are implicitly assuming zero is "Good". However, in "7.20 General utilities <stdlib.h>", C99 defined EXIT_SUCCESS and EXIT_FAILURE rather than following the Unix "tradition" that SUCCESS==0. So, I think we should state our "advice" in terms of EXIT_SUCCESS instead of "zero", and make the "max is best" text explicitly conditional on the common case EXIT_SUCCESS==0. Thoughts? ```
Reported by `phhargrove@lbl.gov` on 2012-09-21 22:23:56
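Editorial sketch of advice stated in terms of EXIT_SUCCESS. The function name `summarize_exit` is hypothetical, and the fallback used when EXIT_SUCCESS is not zero (report EXIT_FAILURE if any thread failed) is an assumption made for illustration; the thread does not propose it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical summary of per-thread exit values.
 * If EXIT_SUCCESS is 0 (the common case), return the maximum value.
 * Otherwise (assumed fallback), return EXIT_FAILURE if any thread
 * reported something other than EXIT_SUCCESS. */
int summarize_exit(const int *codes, size_t n)
{
    if (n == 0)
        return EXIT_SUCCESS;     /* nothing to summarize */

    if (EXIT_SUCCESS == 0) {
        int max = codes[0];
        for (size_t i = 1; i < n; i++)
            if (codes[i] > max)
                max = codes[i];
        return max;
    }

    for (size_t i = 0; i < n; i++)
        if (codes[i] != EXIT_SUCCESS)
            return EXIT_FAILURE;
    return EXIT_SUCCESS;
}
```

On a platform where EXIT_SUCCESS is 0, exit values of 0, 3, and 2 summarize to 3, so the non-zero codes are preserved with preference for the max.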
-
Account Deleted The issue of exit code behavior (issues 45 & 90) was discussed in the 10/24 telecon. The consensus was that we don't want to impose any hard requirements at all, and that we consider it Best Practice for UPC programmers to never rely upon exit code behavior. We seem to have agreement that the best action is to add a clause to the spec declaring all exit code behavior as implementation-defined, possibly with stated "preferred behavior". This will encourage implementers to document their behavior and discourage users from relying upon non-portable behavior.
Reported by `danbonachea` on 2012-10-27 03:31:12 - Labels added: Consensus-High - Labels removed: Consensus-Medium
-
Account Deleted On the 1/17 telecon we decided (and Troy agreed) that this should be closed.
Reported by `danbonachea` on 2013-01-17 19:44:06 - Status changed: Rejected
-
``` I am OK with leaving the case of non-single-valued return as implementation specific. I base this on the observation that if MPI hasn't yet solved this problem then we are not likely to either. I would also argue that since we don't have a specification for how one launches a UPC program, specifying the exit value of that otherwise unspecified mechanism is getting ahead of ourselves.
I would be quite happy with "advice to implementer" to the effect: If one or more UPC threads terminate with a non-zero exit value, then it is strongly encouraged that the mechanism used to launch the UPC application also return a non-zero value. Preference is given for returning the same value with which the UPC threads exited, or with the maximum value when the values differ. ```
Reported by `phhargrove@lbl.gov` on 2012-05-23 20:18:26