extend-add: should error-check filesystem calls
Currently (as of 666a014), if you run extend-add with no arguments, or with a first argument that names a non-existent file, you get an ugly crash:
{cori[1]} srun -n 1 extend-add_upcxx-seq
nprow is 1
npcol is 1
timer frontal_matrix_creation maximum value: 0.000358297 s
*** Caught a fatal signal (proc 0): SIGSEGV(11)
[0] Invoking GDB for backtrace...
[0] /usr/bin/gdb -nx -batch -x /tmp/gasnet_ZEjIyJ '/global/u1/b/bonachea/UPC/bupcr-icc-hsw/dbg/gasnet/tests/upcr-harness/external-upcxx/./extend-add_upcxx-seq' 6597
[0] [Thread debugging using libthread_db enabled]
[0] Using host libthread_db library "/lib64/libthread_db.so.1".
[0] 0x000000002033fa3a in __waitpid (pid=6600, stat_loc=stat_loc@entry=0x7fffffff2948, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:29
[0] #0 0x000000002033fa3a in __waitpid (pid=6600, stat_loc=stat_loc@entry=0x7fffffff2948, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:29
[0] #1 0x00000000205d6d6f in do_system (line=<optimized out>) at ../sysdeps/posix/system.c:148
[0] #2 0x00000000200af114 in gasneti_system_redirected (cmd=0x40851e20 <cmd> "/usr/bin/gdb -nx -batch -x /tmp/gasnet_ZEjIyJ '/global/u1/b/bonachea/UPC/bupcr-icc-hsw/dbg/gasnet/tests/upcr-harness/external-upcxx/./extend-add_upcxx-seq' 6597", stdout_fd=10) at /global/u1/b/bonachea/UPC/upcr/gasnet/gasnet_tools.c:1275
[0] #3 0x00000000200af7f2 in gasneti_bt_gdb (fd=10) at /global/u1/b/bonachea/UPC/upcr/gasnet/gasnet_tools.c:1531
[0] #4 0x00000000200b0541 in gasneti_print_backtrace (fd=2) at /global/u1/b/bonachea/UPC/upcr/gasnet/gasnet_tools.c:1806
[0] #5 0x00000000200b0d9a in _gasneti_print_backtrace_ifenabled (fd=2) at /global/u1/b/bonachea/UPC/upcr/gasnet/gasnet_tools.c:1938
[0] #6 0x000000002024a650 in gasneti_defaultSignalHandler (sig=11) at /global/u1/b/bonachea/UPC/upcr/gasnet/gasnet_internal.c:704
[0] #7 <signal handler called>
[0] #8 0x0000000020007681 in main (argc=1, argv=0x7fffffff63b8) at src/main.cpp:243
[0] [Inferior 1 (process 6597) detached]
srun: error: nid00188: task 0: Segmentation fault
srun: Terminating job step 23946682.4
The code at src/main.cpp:48 currently assumes, without checking, that the first argument names a valid input file. We should add at least minimal error checking that the user provided an argument and that it names a file that can be opened successfully, and otherwise print an explanatory error message.
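A minimal sketch of the kind of check meant here, assuming the input is opened with a `std::ifstream`. The helper name `open_input` and the wording of the messages are hypothetical and are not taken from src/main.cpp:

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <fstream>
#include <string>

// Hypothetical helper (not the actual code in src/main.cpp): verify that an
// input-file argument was given and that the file can be opened for reading.
// Returns true and leaves `in` open on success; otherwise fills `err`.
bool open_input(int argc, char** argv, std::ifstream& in, std::string& err) {
    assert(argv != nullptr && argc >= 1);
    if (argc < 2) {
        // No argument at all: explain the expected usage instead of crashing.
        err = std::string("usage: ") + argv[0] + " <input-file>";
        return false;
    }
    errno = 0;
    in.open(argv[1]);
    if (!in.is_open()) {
        err = std::string("cannot open input file '") + argv[1] + "'";
        // errno is usually set by the underlying open call, but the C++
        // standard does not guarantee it, so only report it when non-zero.
        if (errno != 0) err += std::string(": ") + std::strerror(errno);
        return false;
    }
    return true;
}
```

The caller would print `err` to stderr and exit with a non-zero status before any further setup depends on the input.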
Ideally the input file parsing logic would also detect early EOF or other forms of invalid/truncated input file and similarly issue an error message instead of a crash.
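As an illustration of detecting truncated or malformed input, here is a sketch that checks the stream state after every read. The format it parses (an element count followed by that many whitespace-separated integers) is an assumption made up for this example and is not the actual extend-add input format; `read_values` is likewise a hypothetical name:

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical format, for illustration only: a count N followed by N
// whitespace-separated integers. Returns false and fills `err` on early EOF
// or on a token that is not an integer.
bool read_values(std::istream& in, std::vector<long>& out, std::string& err) {
    std::size_t n = 0;
    if (!(in >> n)) {
        err = "input is empty or does not start with an element count";
        return false;
    }
    out.clear();
    for (std::size_t i = 0; i < n; ++i) {
        long v = 0;
        if (!(in >> v)) {
            // eof() distinguishes a truncated file from a malformed token.
            err = in.eof()
                ? "unexpected end of input after " + std::to_string(i) +
                  " of " + std::to_string(n) + " values"
                : "malformed value at index " + std::to_string(i);
            return false;
        }
        out.push_back(v);
    }
    assert(out.size() == n);  // every announced value was read
    return true;
}
```

The same pattern applies to binary input: compare the number of bytes actually read against the number requested before using the data.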
Similarly, it appears that src/main.cpp:40 is unconditionally opening logfiles in the $CWD and not checking for success, which could easily fail if the directory or filesystem is read-only.
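One possible way to handle the logfile case, sketched under the assumption that the logfiles are `std::ofstream` objects. The helper names `open_logfile` and `log_stream` are hypothetical, and falling back to stderr is only one reasonable policy:

```cpp
#include <cassert>
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical helper: try to open a logfile and report failure instead of
// silently continuing with a stream that discards everything written to it.
bool open_logfile(const std::string& path, std::ofstream& log) {
    assert(!path.empty());
    log.open(path, std::ios::out | std::ios::trunc);
    if (!log.is_open()) {
        std::cerr << "warning: cannot open logfile '" << path
                  << "' for writing; logging to stderr instead\n";
        return false;
    }
    return true;
}

// Pick the stream to log to: the file if it opened, otherwise stderr.
std::ostream& log_stream(std::ofstream& log) {
    return log.is_open() ? static_cast<std::ostream&>(log) : std::cerr;
}
```

Code that logs would then write to `log_stream(log)` rather than to the `std::ofstream` directly, so a read-only directory degrades to stderr output instead of losing the log.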
We want this code to be "exemplary", so it should check for plausible errors when interacting with the file system.
Comments (4)
-
reporter -
reporter - changed milestone to 2020.3.0 release
- changed title to extend-add: should error-check filesystem calls
Bulk roll-over of unresolved issues to next milestone
-
- changed status to resolved
Resolved by commit #fdb3023
-
Thanks, @Mathias Jacquelin
On a closely related topic, a common error mode is to pass the wrong input file, which also results in an ugly crash and no indication of what the user did wrong - for example: