Intel compiler floor untestable on Cray XC (and broken elsewhere)

Issue #415 resolved
Paul Hargrove created an issue

Of the three Cray XC systems I have access to, only NERSC's Cori has our documented floor of intel/17.0.2.174. On Piz-Daint the oldest is 17.0.4.196, and on Theta 18.0.2.199.

On both Cori and Piz-Daint the intel/17.* compilers currently fail to configure GASNet due to an interaction between GASNet, the glibc system headers, and older Intel compilers. This problem was not evident before the Cray systems were updated to a more recent SLES release, nor was it visible in tests on Cori in which we pair the Intel floor with a (built-by-Paul) GCC-6.4.0 (but issue #414 has us raising that floor).

The failures occur when GASNet's configure is looking for sizeof() various types, and look like:

/usr/include/stdlib.h(133): error: identifier "_Float128" is undefined
  extern _Float128 strtof128 (const char *__restrict __nptr,
         ^

This occurs because the glibc headers use _Float128 based on (after some expansions) __GNUC__ >= 7 && __USE_GNU, but the Intel compilers don't grok this type. One part of this mess that bothers me is that the __USE_GNU part of that is the direct result of GASNET_EXTRA_DEFINES=-D_GNU_SOURCE=1, for which we don't seem to have a kill-switch.
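
One way to check whether a given compiler is affected is a compile-only probe of <stdlib.h>, with and without the define. The following is a minimal sketch and not part of GASNet's configure: probe_command and probe are hypothetical helper names, and the compiler name you pass in (for example icc) is an assumption about what is on your PATH.

```python
import os
import subprocess
import tempfile

# Minimal translation unit: including <stdlib.h> is enough to reach the
# strtof128 declaration quoted in the error above.
PROBE_SOURCE = "#include <stdlib.h>\nint main(void) { return 0; }\n"


def probe_command(cc, source_path, gnu_source):
    """Build the argv for a compile-only probe (hypothetical helper)."""
    cmd = [cc, "-c", source_path, "-o", os.devnull]
    if gnu_source:
        # Same define that GASNET_EXTRA_DEFINES adds, per the text above.
        cmd.insert(1, "-D_GNU_SOURCE=1")
    return cmd


def probe(cc, gnu_source):
    """Return True if `cc` compiles the probe, False if compilation fails.

    Raises FileNotFoundError if `cc` is not on the PATH.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.c")
        with open(src, "w") as handle:
            handle.write(PROBE_SOURCE)
        result = subprocess.run(
            probe_command(cc, src, gnu_source),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return result.returncode == 0
```

On an affected system, the analysis above suggests probe("icc", True) would return False while probe("icc", False) would return True. That is a prediction from the header logic described here, not a recorded result.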

It is worth noting that the Intel 17.0.x releases came out before or soon after GCC 7.1's release (2017-05-02), so (IMO) it is not reasonable to expect correct behavior from those Intel compilers when backed by GCC 7.x (though that combo worked w/ older glibc). In that case we can characterize this issue as "pilot error".

Some testing shows that intel/18.0.1.163 works with gcc/7.1.0 from the old s/w archive on Cori.
I don't find intel/18.0.0.128 on any Cray system (Cori, Theta, Piz-Daint).

Note that this is NOT a Cray issue, just an issue that manifests only with a glibc newer than the one on Dirac. I have no doubt that non-Cray systems with the same glibc as a Cray login node have the same problem.

So, options include (at least):

  1. Just clarify/disclaim that you need to match your GCC/libstdc++ with your vendor compiler (similar issues w/ PGI), with maybe some "and this may depend on your glibc too" sort of text.
  2. Make GASNet's configure smarter to not pass -D_GNU_SOURCE=1 if it makes things "blowup". If that is the path we follow, then this should move to the GASNet Bugzilla. HOWEVER, we don't have any certainty that this is the only source of problems.
  3. Raise our Intel floor to 18+ on Cray XC to match the gcc/7+ floor of issue #414

My preference is for 3. in the near-term, and longer-term a documentation issue along the lines of 1., clarifying that mixing too-new GCC with vendor compilers is dangerous territory, and that pairs that work on one system might not on another.
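
Option 3 comes down to comparing the dotted version of the intel module against a floor. The sketch below illustrates that comparison only; it is not the actual configure-time check, parse_version and meets_floor are hypothetical names, and the default floor of 18.0.0 is taken from the resolving commit later in this issue.

```python
def parse_version(text):
    """Turn a dotted version such as '17.0.2.174' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def meets_floor(version, floor="18.0.0"):
    """True if `version` is at or above `floor`, comparing field by field."""
    v, f = parse_version(version), parse_version(floor)
    # Pad the shorter tuple with zeros so '18.0' and '18.0.0' compare equal.
    width = max(len(v), len(f))
    v += (0,) * (width - len(v))
    f += (0,) * (width - len(f))
    return v >= f
```

With this comparison, the 17.0.2.174 and 17.0.4.196 modules mentioned above fall below the floor, while 18.0.1.163 and 18.0.2.199 meet it.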

Comments (4)

  1. Dan Bonachea

    Re "option 2": As documented in the configure script, GASNet unconditionally passes -D_GNU_SOURCE=1 because it enables the widest possible range of functions in the system headers, making them available for configure detection and use. _GNU_SOURCE is a fully supported and widely used user-controllable switch for the Linux system headers, and I believe we are using it correctly.

    The fact that some combos of vendor compiler + glibc "blowup" on the system headers in the presence of -D_GNU_SOURCE=1 is IMO a bug in the compiler and/or pilot error for attempting a glibc+compiler combo that the vendor has not validated/fixed. I do not think we should endorse attempts to "kill switch" this define away - because the result would be that functions that should always be available on Linux would disappear from configure detection. This in turn could lead to subtle bugs in our Linux-specific code that assume such entry points are always present. I'd rather reject the compiler/glibc combo outright than find and hack up all such places to support "degraded" Linux system headers (likely leading to rarely tested, lower-performance code).

    Based on the description here my preference concurs with Paul's - pursue both option 1 (improve documentation to disclaim unsupported combos) and option 3 (raise the Intel floor on XC to 18.x since 17.x cannot handle the new gcc/7.1.0 floor on XC).

  2. Paul Hargrove reporter

    I have created a distinct issue 417 for the documentation updates which Dan and I seem to agree are warranted.

    This issue is now devoted solely to the question of what does or does not compile.
    Unless I hear dissenting opinions, I am planning to proceed with raising the Intel floor to 18.x on Cray XC only.

  3. Paul Hargrove reporter

    Raise minimum Intel C version on Cray XC to 18.0.0

    This commit resolves issue #415 by raising our "floor" version of the intel environment module on the Cray XC from 17.0.2 to 18.0.0. Changes include updates to documentation and to configure-time enforcement.

    → <<cset 60c98623a394>>
