Nobs Build regression on multiple systems: I/O error in os.write

Issue #189 resolved
Dan Bonachea created an issue

Nightly testing reveals that commit 3297eb5 has caused a regression in nobs build behavior on at least two Mac systems and our WSL system (so far).

Thus far our Linux test systems appear unaffected, so my first guess is a problem with case-insensitive file systems, although the cause might also be something else, such as the Python version. However, I've confirmed on our el-capitan system that the same failure mode also occurs with the latest Python 2.7.15.
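
To help test the case-insensitivity guess on each affected and unaffected system, a probe along the following lines could be run against the build directory. This is only a diagnostic sketch: the function name `is_case_insensitive` and the probe file prefix are made up for illustration and are not part of nobs.

```python
import os
import tempfile

def is_case_insensitive(directory):
    """Return True if `directory` appears to be on a case-insensitive file system.

    Hypothetical helper for diagnosis only; not part of nobs.
    """
    # Create a probe file whose name contains lowercase letters.
    fd, path = tempfile.mkstemp(prefix="nobs_case_probe_", dir=directory)
    os.close(fd)
    try:
        head, tail = os.path.split(path)
        # On a case-insensitive file system the upper-cased name resolves
        # to the same file; on a case-sensitive one it does not exist.
        return os.path.exists(os.path.join(head, tail.upper()))
    finally:
        os.remove(path)

if __name__ == "__main__":
    print(is_case_insensitive(os.getcwd()))
```

If the Linux systems report False while the Mac and WSL systems report True, that would support the file system hypothesis; mixed results would point elsewhere.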

The failure mode for a manual install of 3297eb5 on high-sierra (@jdbachan you have access on our private network) with Python 2.7.10 is shown below:

{ihigh ~/UPC/upcxx} rm -Rf .nobs ; ./install inst
UPCXX revision: upcxx-2018.9.5-7-g3297eb5
System: Darwin high-sierra.local 17.7.0 Darwin Kernel Version 17.7.0: Fri Nov 2 20:43:16 PDT 2018; root:xnu-4570.71.17~1/RELEASE_X86_64 x86_64
ProductName:    Mac OS X
ProductVersion: 10.13.6
BuildVersion:   17G4015
Xcode 9.0
Build version 9A235

Date: Fri Jan 4 09:35:19 PST 2019
Current directory: /Users/bonachea/UPC/upcxx
Install directory:
Settings:

/usr/bin/python:  Python 2.7.10

/usr/bin/clang++
Apple LLVM version 9.0.0 (clang-900.0.37)
Target: x86_64-apple-darwin17.7.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
/usr/bin/clang
Apple LLVM version 9.0.0 (clang-900.0.37)
Target: x86_64-apple-darwin17.7.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

Downloading https://upc-bugs.lbl.gov/nightly/unlisted/GASNet-EX-collaborator-snapshot.tar.gz
Finished    https://upc-bugs.lbl.gov/nightly/unlisted/GASNet-EX-collaborator-snapshot.tar.gz
Configuring GASNet...

...

g++ -std=c++11 -D_GNU_SOURCE=1 -I/Users/bonachea/UPC/upcxx/.nobs/art/ea55e9aa18cf1f09810abb6ed04156d0763ac597 -DNOBS_DISCOVERY -MM -MT x /Users/bonachea/UPC/upcxx/src/reduce.cpp

g++ -std=c++11 -D_GNU_SOURCE=1 -I/Users/bonachea/UPC/upcxx/.nobs/art/ea55e9aa18cf1f09810abb6ed04156d0763ac597 -DNOBS_DISCOVERY -MM -MT x /Users/bonachea/UPC/upcxx/src/rget.cpp

g++ -std=c++11 -D_GNU_SOURCE=1 -I/Users/bonachea/UPC/upcxx/.nobs/art/ea55e9aa18cf1f09810abb6ed04156d0763ac597 -DNOBS_DISCOVERY -MM -MT x /Users/bonachea/UPC/upcxx/src/rput.cpp

g++ -std=c++11 -D_GNU_SOURCE=1 -I/Users/bonachea/UPC/upcxx/.nobs/art/ea55e9aa18cf1f09810abb6ed04156d0763ac597 -DNOBS_DISCOVERY -MM -MT x /Users/bonachea/UPC/upcxx/src/vis.cpp

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/Users/bonachea/UPC/upcxx/nobs/nobs/subexec.py", line 93, in io_thread_fn
    os.write(fd, rev_bufs.pop())
OSError: [Errno 5] Input/output error

gcc -std=c11 -D_GNU_SOURCE=1 -I/Users/bonachea/UPC/upcxx/.nobs/art/ea55e9aa18cf1f09810abb6ed04156d0763ac597 -DNOBS_DISCOVERY -MM -MT x /Users/bonachea/UPC/upcxx/src/backend/gasnet/upc_link.c

gcc -std=c11 -D_GNU_SOURCE=1 -I/Users/bonachea/UPC/upcxx/.nobs/art/ea55e9aa18cf1f09810abb6ed04156d0763ac597 -DNOBS_DISCOVERY -MM -MT x /Users/bonachea/UPC/upcxx/src/dl_malloc.c

At this point the build hangs forever.

The CI log looks a bit different since it uses ./install -single, but the Python exception traceback is the same.
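
The traceback shows only that an uncaught OSError from os.write killed the I/O thread; the rest of subexec.py is not shown here, so the cause of the hang is a guess. Assuming the hang comes from the rest of the build waiting on a thread that died without reporting, one possible mitigation is to catch the error in the writer, hand it back to the caller, and always close the descriptor. The sketch below is not the nobs implementation, and the name `write_all` is hypothetical.

```python
import errno
import os

def write_all(fd, chunks):
    """Write each bytes chunk to fd, then close fd.

    Returns None on success, or the OSError that stopped the writes.
    Hypothetical sketch; not the actual code in nobs/subexec.py.
    """
    try:
        for chunk in chunks:
            data = chunk
            while data:
                try:
                    written = os.write(fd, data)
                except OSError as e:
                    if e.errno == errno.EINTR:
                        continue  # interrupted by a signal: retry the write
                    return e      # e.g. EIO or EPIPE: report instead of raising
                data = data[written:]  # handle partial writes
        return None
    finally:
        # Closing the descriptor lets the other end see EOF even on failure.
        os.close(fd)
```

A caller running this in a thread could store the return value and check it after join(), so that an I/O error surfaces as a build failure rather than a silent hang. This does not explain why the write fails on these systems, only how the failure could be made visible.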

Please fix or revert ASAP.
