nobs dropping linkage flags from CXX

Issue #23 resolved
Paul Hargrove created an issue

###Summary###

It appears to me that nobs is eliding compiler/linker options in the GASNET_{CC,CXX} variables which are required for correct linkage of applications on some systems.

###Background 1###

Because upc++ requires a fairly new libstdc++ there are a few systems where I have compiled a more recent gcc or clang than provided by the distros, and installed them in /usr/local (or similar).

In order to ensure that the newer libstdc++ is actually the one loaded at runtime, I need to add the installed compiler's library directory to the runtime library search path. Note that I am not talking about LD_LIBRARY_PATH.

In many/most cases the issue can be avoided by having the LD_LIBRARY_PATH variable set, since the runtime linker/loader will also examine it. However, for multi-node launch there is a chicken-and-egg problem of how does one set the LD_LIBRARY_PATH remotely. So, in general one needs to use the runtime library search path, which is embedded in the executable itself.
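
One way to confirm which search path actually ended up embedded in an executable is to inspect its dynamic section. The following is a minimal sketch, assuming binutils `readelf` is installed and that its `-d` output reports the path on `(RPATH)` or `(RUNPATH)` lines in the usual `Library runpath: [...]` form. The function names and the sample text are illustrative only and are not part of nobs or GASNet.

```python
import re
import subprocess

# Matches lines such as:
#  0x000000000000001d (RUNPATH)  Library runpath: [/usr/local/pkg/gcc/7.1.0/lib64]
_PATH_LINE = re.compile(r"\((RPATH|RUNPATH)\)\s+Library (?:rpath|runpath): \[(.*?)\]")

def embedded_search_paths(readelf_output):
    """Return the directories listed in RPATH/RUNPATH entries of `readelf -d` output."""
    dirs = []
    for _tag, value in _PATH_LINE.findall(readelf_output):
        # A single entry may hold several colon-separated directories.
        dirs.extend(d for d in value.split(":") if d)
    return dirs

def read_dynamic_section(executable):
    """Run `readelf -d` on an executable and return its text output."""
    result = subprocess.run(["readelf", "-d", executable],
                            check=True, capture_output=True, text=True)
    return result.stdout

# Illustrative sample of `readelf -d` output (not taken from a real build).
SAMPLE = """\
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x000000000000001d (RUNPATH)            Library runpath: [/usr/local/pkg/gcc/7.1.0/lib64]
"""
print(embedded_search_paths(SAMPLE))
```

On a real build one would call `embedded_search_paths(read_dynamic_section("./hello_upcxx"))`; an empty list would suggest that no rpath option reached the linker.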

On Linux there are two means one can use to specify this path. One is to set the LD_RUN_PATH variable. If that was effective then all would be fine in the world and I'd not be filing this issue. Unfortunately, the documented linker behavior is that LD_RUN_PATH is only honored if there are no explicit -rpath options passed to the linker. Alas, both Open MPI and MPICH pass -rpath options when they link, and we must use mpicc or mpicxx to link ibv-conduit executables if we want mpirun-based launch (the default) and, of course, for mpi-conduit. So, LD_RUN_PATH is ignored when mpicxx is our linker.
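
The rule described above can be modeled in a few lines. This is only a sketch of the documented behavior, not of the linker's implementation. It assumes the input is the fully expanded link command that the compiler wrapper ends up running, and it only recognizes rpath options passed through `-Wl,`. The function names and example paths are hypothetical.

```python
import shlex

def has_explicit_rpath(link_command):
    """True if the command passes an -rpath option through to the linker."""
    for token in shlex.split(link_command):
        if not token.startswith("-Wl,"):
            continue
        # -Wl,a,b,c hands a, b and c to the linker as separate arguments.
        for arg in token[len("-Wl,"):].split(","):
            if arg in ("-rpath", "--rpath") or arg.startswith(("-rpath=", "--rpath=")):
                return True
    return False

def ld_run_path_honored(link_command):
    """Model of the rule: LD_RUN_PATH counts only when no explicit -rpath is given."""
    return not has_explicit_rpath(link_command)

# Hypothetical commands: a plain g++ link, and one where a wrapper added its own rpath.
print(ld_run_path_honored("g++ -o hello hello.o"))
print(ld_run_path_honored("g++ -Wl,-rpath,/opt/mpi/lib -o hello hello.o"))
```

Under this model the first command would still pick up LD_RUN_PATH, while the second would not, which is the situation with an MPI wrapper that adds its own -rpath.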

###Background 2###

There are two methods in common use to deal with locally installed compiler's library directories. One either sets LDFLAGS to contain the necessary options, or one sets CC and CXX to contain them. These are common recommendations that have nothing at all to do with GASNet, MPI, etc. However, GASNet supports both and I have had success with both Berkeley UPC and Chapel based on those approaches.

Example 1: configure GASNet with LDFLAGS

--with-ldflags=-Wl,-rpath=/usr/local/pkg/gcc/7.1.0/lib64

Example 2: configure GASNet to use compilers with rpath options

--with-cc="gcc -Wl,-rpath=/usr/local/pkg/gcc/7.1.0/lib64"
--with-cxx="g++ -Wl,-rpath=/usr/local/pkg/gcc/7.1.0/lib64"

###The Issue###

I have tried both approaches, above, when configuring an external GASNet to use a local install of gcc-7.1.0. However, with the second approach the nobs-based build of tests such as hello_upcxx appears to be somehow removing these options from the link command. I believe I had observed the same behavior with the first approach, but am not currently able to reproduce (but more on that below).

I had thought that by setting CXX=mpicxx to link the mpi-spawner support, I may have been causing part of the problem myself. However, setting CXX="mpicxx -Wl,-rpath=/usr/local/pkg/gcc/7.1.0/lib64" in my environment when building tests still does not produce a properly linked executable. Below are the (abbreviated) compile and link commands from nobs exe test/hello_upcxx.cpp for ibv-conduit, with GASNet configured as in the second example above, and CXX set with the -rpath (as in the previous sentence):

mpicxx -Wl,-rpath=/usr/local/pkg/gcc/7.1.0/lib64 -std=c++11 -D_GNU_SOURCE=1 -I/tmp/BLD-ex-gcc7/dbg/upcxx/.nobs/art/e13756621cc0a5106e066a2162baa24a3bb937d4 -DUPCXX_BACKEND=gasnet1_seq -D_GNU_SOURCE=1 -DGASNET_SEQ -D_REENTRANT -I/home/pcp1/phargrov/UPC/upc-runtime/gasnet -I/home/pcp1/phargrov/UPC/upc-runtime/gasnet/ibv-conduit -I/home/pcp1/phargrov/UPC/upc-runtime/gasnet/other -I/home/pcp1/phargrov/UPC/upc-runtime/gasnet/other/firehose -I/home/pcp1/phargrov/UPC/upc-runtime/gasnet/extended-ref -I/tmp/BLD-ex-gcc7/dbg/gasnet -MM -MT x /tmp/BLD-ex-gcc7/dbg/upcxx/test/hello_upcxx.cpp

mpicxx -Wl,-rpath=/usr/local/pkg/gcc/7.1.0/lib64 -std=c++11 -D_GNU_SOURCE=1 -I/tmp/BLD-ex-gcc7/dbg/upcxx/.nobs/art/e13756621cc0a5106e066a2162baa24a3bb937d4 -DUPCXX_BACKEND=gasnet1_seq -D_GNU_SOURCE=1 -DGASNET_SEQ -D_REENTRANT -I/home/pcp1/phargrov/UPC/upc-runtime/gasnet -I/home/pcp1/phargrov/UPC/upc-runtime/gasnet/ibv-conduit -I/home/pcp1/phargrov/UPC/upc-runtime/gasnet/other -I/home/pcp1/phargrov/UPC/upc-runtime/gasnet/other/firehose -I/home/pcp1/phargrov/UPC/upc-runtime/gasnet/extended-ref -I/tmp/BLD-ex-gcc7/dbg/gasnet -O0 -g -Wall -g3 -Wall -Wpointer-arith -Wwrite-strings -Wmissing-format-attribute -Wno-unused -Wno-unused-parameter -Wno-address -c /tmp/BLD-ex-gcc7/dbg/upcxx/test/hello_upcxx.cpp -o /tmp/BLD-ex-gcc7/dbg/upcxx/.nobs/art/838cf7b86cc4cb94ae9cf4146d69bbdb9b04c008.hello_upcxx.cpp.o

mpicxx -Wl,-rpath=/usr/local/pkg/gcc/7.1.0/lib64 -std=c++11 -D_GNU_SOURCE=1 -I/tmp/BLD-ex-gcc7/dbg/upcxx/.nobs/art/b06adf4bcc4fa6811bf69f74d6e70da924f900eb -MM -MT x /tmp/BLD-ex-gcc7/dbg/upcxx/src/future/core.cpp

mpicxx -Wl,-rpath=/usr/local/pkg/gcc/7.1.0/lib64 -std=c++11 -D_GNU_SOURCE=1 -I/tmp/BLD-ex-gcc7/dbg/upcxx/.nobs/art/b06adf4bcc4fa6811bf69f74d6e70da924f900eb -MM -MT x /tmp/BLD-ex-gcc7/dbg/upcxx/src/diagnostic.cpp

mpicxx -Wl,-rpath=/usr/local/pkg/gcc/7.1.0/lib64 -std=c++11 -D_GNU_SOURCE=1 -I/tmp/BLD-ex-gcc7/dbg/upcxx/.nobs/art/b06adf4bcc4fa6811bf69f74d6e70da924f900eb -MM -MT x /tmp/BLD-ex-gcc7/dbg/upcxx/src/packing.cpp

[...]

mpicxx -D_GNU_SOURCE=1 -g3 -Wall -Wpointer-arith -Wnested-externs -Wwrite-strings -Wmissing-format-attribute -Wno-unused -Wno-unused-parameter -Wno-address -o /tmp/BLD-ex-gcc7/dbg/upcxx/.nobs/art/6ac906127f33140b7cd63ddc637bbfb72fdaf2e2.x /tmp/BLD-ex-gcc7/dbg/upcxx/.nobs/art/e374a57ebe102e206b4b7979f563936b02b5f3eb.diagnostic.cpp.o /tmp/BLD-ex-gcc7/dbg/upcxx/.nobs/art/d4c307606f00ae6de9c19a3b836ae00c28ed5da0.dl_malloc.c.o /tmp/BLD-ex-gcc7/dbg/upcxx/.nobs/art/838cf7b86cc4cb94ae9cf4146d69bbdb9b04c008.hello_upcxx.cpp.o /tmp/BLD-ex-gcc7/dbg/upcxx/.nobs/art/78c1173c59838130d44b8e04d2eb57c342430a6c.core.cpp.o /tmp/BLD-ex-gcc7/dbg/upcxx/.nobs/art/baa56324adaef387784c79a284777379fa797d2f.packing.cpp.o /tmp/BLD-ex-gcc7/dbg/upcxx/.nobs/art/a9cd75fa568a60349718f96a292f644842db078a.backend.cpp.o -L/tmp/BLD-ex-gcc7/dbg/gasnet/ibv-conduit -lgasnet-ibv-seq -libverbs -lpthread -lrt -L/usr/local/pkg/gcc/7.1.0/lib/gcc/x86_64-pc-linux-gnu/7.1.0 -lgcc -lm

As you can see, my -rpath is not present in the (final) link command, but oddly is present in (nearly?) every other command! As a result, multi-node runs fail because they don't find a new-enough libstdc++:
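
A quick way to spot this kind of omission in a build log is to tokenize each logged command and test for the expected option. The sketch below is illustrative only: `missing_flags` is a hypothetical helper, and the two command strings are shortened stand-ins for the compile and link commands above, with short file names in place of the long artifact paths.

```python
import shlex

def missing_flags(required, command):
    """Return the required flags that do not appear as tokens of the command."""
    tokens = set(shlex.split(command))
    return [flag for flag in required if flag not in tokens]

RPATH = "-Wl,-rpath=/usr/local/pkg/gcc/7.1.0/lib64"

# Shortened stand-ins for the logged commands; file names are placeholders.
compile_cmd = ("mpicxx " + RPATH + " -std=c++11 -D_GNU_SOURCE=1 "
               "-c hello_upcxx.cpp -o hello_upcxx.cpp.o")
link_cmd = ("mpicxx -D_GNU_SOURCE=1 -g3 -Wall -o hello_upcxx.x hello_upcxx.cpp.o "
            "-lgasnet-ibv-seq -libverbs -lpthread -lrt -lgcc -lm")

print(missing_flags([RPATH], compile_cmd))  # nothing missing from the compile step
print(missing_flags([RPATH], link_cmd))     # the rpath option is absent from the link step
```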

$ [path-to]/gasnetrun_ibv -np 2 ./hello_upcxx
/home/data2/phargrov/WORK/gasnet/tests/upcr-harness/external-upcxx/./hello_upcxx: /lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /home/data2/phargrov/WORK/gasnet/tests/upcr-harness/external-upcxx/./hello_upcxx)
/home/data2/phargrov/WORK/gasnet/tests/upcr-harness/external-upcxx/./hello_upcxx: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /home/data2/phargrov/WORK/gasnet/tests/upcr-harness/external-upcxx/./hello_upcxx)

The results with the -rpath option in GASNET_LDFLAGS and CXX=mpicxx are currently passing (though I swear they did not at some point in the past week).

While I think nobs is filtering the flags it uses, @bonachea has speculated that there might be a length-related aspect to this issue. If that is the case, then it may explain why I recall seeing issues with -rpath in GASNET_LDFLAGS that have now vanished.

###To Reproduce###

As somebody who deals with bug reports from others on a regular basis, I regret that I don't yet have a simple set of steps to reproduce this problem. The actual path from start to failure is using various scripts that do unrelated things and introduce unrelated dependencies.

I will try to get a simple reproducer assembled if you indicate this cannot be resolved without one. However, based on the behavior I am seeing I hope this is something that can be fixed in nobs by code inspection (by somebody who understands Python, which I do not).

I am concerned that if nobs is filtering compiler or linker options that it does not recognize in CC and/or CXX, then we have a pretty fragile state of affairs. If there are options which affect ABI (such as -fpie or -m64) that are discarded, then linking will fail.

Comments (8)

  1. Paul Hargrove reporter

    I have managed to reproduce the failures I was seeing for the case of adding the -rpath to GASNET_LDFLAGS.

    In that case I am now looking at just smp-conduit, and thus no MPI compiler/linker is involved.
    My GASNet has been configured using

    CC=/opt/cfarm/gcc-latest/bin/gcc
    CXX=/opt/cfarm/gcc-latest/bin/g++
    --with-ldflags=-Wl,-rpath=/opt/cfarm/gcc-latest/lib64
    

    If I look at the output for building hello_upcxx, I see the proper -rpath option being used:

    /opt/cfarm/gcc-latest/bin/g++ -g3 -Wall -Wpointer-arith -Wnested-externs -Wwrite-strings -Wmissing-format-attribute -Wno-unused -Wno-unused-parameter -Wno-address -Wl,-rpath=/opt/cfarm/gcc-latest/lib64 -o /home/phargrov/upcnightly/EX-ppc64el-smp-gcc-pshm/runtime/bld/dbg/upcxx/.nobs/art/57e49dd41be2f1bc041aab5c382f8a9737a2fad8.x /home/phargrov/upcnightly/EX-ppc64el-smp-gcc-pshm/runtime/bld/dbg/upcxx/.nobs/art/2326a7fea77d0de128e3c559d2b38d82f82ba074.diagnostic.cpp.o /home/phargrov/upcnightly/EX-ppc64el-smp-gcc-pshm/runtime/bld/dbg/upcxx/.nobs/art/5b42aaa91fceda566aba22c5a3828c356a0b4afe.hello_upcxx.cpp.o /home/phargrov/upcnightly/EX-ppc64el-smp-gcc-pshm/runtime/bld/dbg/upcxx/.nobs/art/b121737eefcfb752e2c22b5cc4c29c16548e0ba2.core.cpp.o /home/phargrov/upcnightly/EX-ppc64el-smp-gcc-pshm/runtime/bld/dbg/upcxx/.nobs/art/b237fc16c7093885db1778d960ea75235d3252d3.dl_malloc.c.o /home/phargrov/upcnightly/EX-ppc64el-smp-gcc-pshm/runtime/bld/dbg/upcxx/.nobs/art/6d3c7f7821975c2bf004ee9da4dc4edab314c05d.packing.cpp.o /home/phargrov/upcnightly/EX-ppc64el-smp-gcc-pshm/runtime/bld/dbg/upcxx/.nobs/art/2a7ead6b2c8bbc35bfef75a907fc3d025260a912.backend.cpp.o -L/home/phargrov/upcnightly/EX-ppc64el-smp-gcc-pshm/runtime/inst/dbg/lib -lgasnet-smp-seq -lrt -L/home/iulius/autobuild/bin/gcc-7.1.0/lib/gcc/powerpc64le-unknown-linux-gnu/7.1.0 -lgcc -lm
    

    However, when building the future test, the -rpath is missing entirely:

    /opt/cfarm/gcc-latest/bin/g++ -std=c++17 -std=c++11 -D_GNU_SOURCE=1 -I/home/phargrov/upcnightly/EX-ppc64el-smp-gcc-pshm/runtime/bld/dbg/upcxx/.nobs/art/e75e006b168a3f190fe5c4fcff4cc3af204d7321 -MM -MT x /home/phargrov/upcnightly/EX-ppc64el-smp-gcc-pshm/runtime/bld/dbg/upcxx/src/future/core.cpp
    /opt/cfarm/gcc-latest/bin/g++ -std=c++17 -std=c++11 -D_GNU_SOURCE=1 -I/home/phargrov/upcnightly/EX-ppc64el-smp-gcc-pshm/runtime/bld/dbg/upcxx/.nobs/art/e75e006b168a3f190fe5c4fcff4cc3af204d7321 -O0 -g -Wall -c /home/phargrov/upcnightly/EX-ppc64el-smp-gcc-pshm/runtime/bld/dbg/upcxx/src/future/core.cpp -o /home/phargrov/upcnightly/EX-ppc64el-smp-gcc-pshm/runtime/bld/dbg/upcxx/.nobs/art/b121737eefcfb752e2c22b5cc4c29c16548e0ba2.core.cpp.o
    

    As a result, future does not run:

    ./future: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by ./future)
    

    So, both approaches to passing a non-default rpath fail, but on different tests.

  2. Former user Account Deleted

    Generally nobs tries not to filter flags, but in the CXX="... -Wl,..." case you've stumbled upon a little workaround of mine for a gasnet bug where GASNET_LD isn't able to link c++. I compute the linker as such: LD = CXX[0] + GASNET_LD[1:], which is Python for "throw away the first token of GASNET_LD and replace it with the first token from CXX". The hope is that GASNET_LD=gcc <flags> would be translated to LD=g++ <flags>. It would be better if I didn't chop CXX down and instead included all of its tokens. That would fix this, so I'll do that. But is there a better way to determine the linker given that I can't trust GASNET_LD (and should I be expecting that? How does gasnet know if I need a c++ linker or not?).

    The GASNET_LDFLAGS='-rpath ..' case isn't working because test/future.cpp isn't parallel and has no dependency on gasnet. The fact that you're providing a GASNET=... is very kind but completely irrelevant. The above linker flag hack only kicks in if there is a library dependency which claims to know the linker, and for gasnet-less tests like future this isn't the case, so CXX is being used in full.

  3. Paul Hargrove reporter

    Re: "How does GASNet know if I need a c++ linker or not":
    Dan and I have been working on this particular point for a (near-)future release.
    Since Chapel and Legion also need C++ linkers (and some conduits need MPI linkers) this is non-trivial (but important) to fix right.

    For now (meaning the Sep 30 release) I think that the best option is for me (and others who need to build a newer g++) to use GASNET_LDFLAGS (via GASNET_CONFIGURE_ARGS=--with-ldflags=-Wl,-rpath=[something]). Dan and I are planning to write up some documentation for UPC++ users with locally-built compilers and will probably make this our primary recommendation.

    Re: GASNET_LDFLAGS=.. and test/future.cpp:
    What is the "right way" for upc++ to use a CXX that requires an RPATH when one does not use GASNet?
    Normally I'd say this should be specified when running configure or cmake, but that is not an option.
    If I set CXX='clang++ -Wl,-rpath=....' then every single compilation warns about the unused linker flags! (though this may depend on clang version)

  4. Former user Account Deleted

    Didn't realize putting rpath in CXX was noisy. Given that, the right way to give rpath to test/future.cpp doesn't currently exist. I'll add an LDFLAGS env var for nobs to sniff. You can put whatever you want in there.

  5. Former user Account Deleted

    The linker flag chopping has been changed to: LD = CXX + GASNET_LD[1:]

    Nobs now sniffs the LDFLAGS environment variable for flags to put near the front of the link line.
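
The difference between the original and revised linker derivations discussed in the comments can be sketched as follows. This is not the nobs source. It assumes CXX and GASNET_LD are handled as whitespace-separated token lists, and the exact position of the LDFLAGS tokens (here, directly after CXX) is a guess at what "near the front of the link line" means. The function names and the GASNET_LD value are hypothetical.

```python
import os
import shlex

def linker_before(cxx, gasnet_ld):
    """Original workaround: first token of CXX, then GASNET_LD minus its first token."""
    return shlex.split(cxx)[:1] + shlex.split(gasnet_ld)[1:]

def linker_after(cxx, gasnet_ld, environ=os.environ):
    """Revised form: all of CXX, then LDFLAGS (assumed position), then the rest of GASNET_LD."""
    ldflags = shlex.split(environ.get("LDFLAGS", ""))
    return shlex.split(cxx) + ldflags + shlex.split(gasnet_ld)[1:]

cxx = "mpicxx -Wl,-rpath=/usr/local/pkg/gcc/7.1.0/lib64"
gasnet_ld = "mpicc -D_GNU_SOURCE=1 -g3 -Wall"  # hypothetical GASNET_LD value

# The original form keeps only "mpicxx" from CXX, so the rpath option is lost.
print(linker_before(cxx, gasnet_ld))
# The revised form keeps every CXX token; an empty environment is passed for determinism.
print(linker_after(cxx, gasnet_ld, environ={}))
```

For a test with no GASNet dependency, such as future, the same rpath option could instead be supplied through `LDFLAGS`, which avoids the unused-flag warnings that come from putting linker options in CXX.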
