- changed component to Build/install scripts
- marked as proposal
- marked as minor
App build with nvcc wrapper around configured compiler leads to errors (eg. __builtin_launder)
I am unable to build the Kokkos interop example on Summit with 2021.3.6-snapshot due to missing __builtin_launder
support with nvcc. It is invoked with the Kokkos nvcc_wrapper
, which may possibly have something to do with it. I’m unsure why the configure detected it as being available. I’m working around this problem at the moment by including upcxx_config.hpp
and undefining UPCXX_HAVE___BUILTIN_LAUNDER
.
Comments (13)
-
-
- changed title to App build with nvcc wrapper around configured compiler leads to errors (eg. __builtin_launder)
-
@Dan Bonachea I believe you have stated above that our own UPC++/Kokkos interoperability example for the CUDA-enabled case is an embodiment of "pilot error". Have I misunderstood something?
-
our own UPC++/Kokkos interoperability example for the CUDA-enabled case is an embodiment of "pilot error"
Correct. The Makefile (notably not yet on master) for
KOKKOS_DEVICES=Cuda
includes:CXX = $(KOKKOS_PATH)/bin/nvcc_wrapper -ccbin $(UPCXX)
which I believe technically constitutes "pilot error" for a UPC++ install built with
CXX=mpicxx
, because it means our installed headers will be parsed by a frontend other than the configuredmpicxx
(namelynvcc_wrapper
). I'd kept silent about this when it didn't cause a problem, but now it apparently has. -
reporter Kokkos interoperability being supported depends on nvcc being supported. There aren’t separate
.cu
files, as Kokkos is single source. Thus, the source is always going to be parsed with nvcc. Separating Kokkos code into different files from UPC++ code would be quite a pain.
It looks like
__builtin_launder
support was added with CUDA 11, but UPC++ is built with CUDA 10 on Summit so it will compile but not run.
We could check for
__NVCC__
to handle some of this. Perhaps something like this:#define UPCXX_HAVE___BUILTIN_LAUNDER_CXX /* configure generated value */ #define UPCXX_HAVE___BUILTIN_LAUNDER_NVCC /* configure generated value */ #ifdef __NVCC__ #define UPCXX_HAVE___BUILTIN_LAUNDER UPCXX_HAVE___BUILTIN_LAUNDER_NVCC #else #define UPCXX_HAVE___BUILTIN_LAUNDER UPCXX_HAVE___BUILTIN_LAUNDER_CXX #endif
-
We could check for
__NVCC__
to handle some of this.Yep, that's exactly what I had in mind when I said "additionally probe properties of nvcc [..], and adjust upcxx_config.hpp accordingly". However as I also observed this inherently assumed we've correctly identified the only relevant way in which the behavior of the nvcc frontend parsing diverges from the underlying CXX frontend (which remains TBD).
-
My initial thought when Colin reported the problem in Slack was that we could add something similar to
UPCXX_USE_NODISCARD
to allow the user to override our "probe" logic (regardless of whether it resides in the configure step or an as-shipped header).One alternate to the
UPCXX_USE_*
approach of introducing a new identifier would be to adjust the generatedupcxx_config.h
to honor any existing setting for this (and other?) probe results, by wrapping generated definitions with#ifndef
:#ifndef UPCXX_HAVE___BUILTIN_LAUNDER
I am not opposed to improving also the "out of box" experience when using
nvcc
, but like the idea of also adding a user-facing "opt-out" if for no other reason than a vague concern that AMD and Intel GPU support may have similar issues.Thinking in that general direction makes me think that
UPCXX_HAVE___BUILTIN_ASSUME_ALIGNED
would be another candidate for opt-out when using a non-configured compiler.In GASNet-EX there is precedent for detecting when using a non-configured compiler and discarding some probed compiler properties in favor of conservative assumptions. Additionally there is a precedent of using preprocessor identifiers to override probes. For instance
GASNETT_USE_GCC_ATTRIBUTE_WARNUNUSEDRESULT
can be defined to1
or0
to control use of__attrtibute__((__warn_unused_result__))
. I think thatUPCXX_USE_NODISCARD
is already a move in that direction.@Dan Bonachea Thoughts on extending "compiler mismatch" and/or the
GASNETT_USE_*
approach to the (currently few) compiler properties used conditionally by UPC++? As I said above, I don't think doing so would need to preclude adding better logic for any specific cases we care about (e.g. something like Colin's proposedUPCXX_HAVE___BUILTIN_LAUNDER_NVCC
in this case). -
Dan and I discussed my suggestion, above, that UPC++ might embrace some of the precedents we'd developed in GASNet-EX. However, Dan rightly observed that UPC++ has knowledge of the
nvcc
(and its arguments) at configure time and can therefore perform the necessary probes. Additionally, Dan reminded me that somethings like launder don't have "safe default" implementations to fall back on in the "compiler mismatch" case. These mean we are looking to address a different problem space than the GASNet-EX solutions were built for.We sketched a design for work which would generalize our existing logic used to generate
upcxx_config.hpp
to have it probe multiple compilers (CXX, nvcc, hipcc, ...) while keeping the individual probes mostly unchanged and providing for reasonable maintainability as compilers and probes are added (common case is O(1) work to add either).So I withdraw my suggestions in the previous comment, and instead endorse the proposal to "additionally probe properties of nvcc [..], and adjust upcxx_config.hpp accordingly".
-
- changed milestone to 2022.3.0 release
Mass roll-over of open issues to next release milestone
-
- changed milestone to 2022.9.0 release
Mass roll-over of open issues to next release milestone
-
- changed milestone to 2023.3.0 release
Mass roll-over of open issues to next release milestone
-
- changed milestone to 2023.9.0 release
Mass roll-over of open issues to next release milestone
-
- removed milestone
Clear past Milestone for open issues
- Log in to comment
UPC++ configure detection runs on the C++ compiler provided via
CXX
at configure time, which for the module installs on summit is the MPI wrapper around the C++ compiler (ie mpicxx wrapping g++ or pgc++).It sounds like the configure script correctly detected support for
__builtin_launder
in the provided C++ compiler, but the nvcc wrapper added at app build time (which also has to parse the headers) lacks the required support.We don't officially support this mode of operation, for exactly this reason (ie passing the installed UPC++ headers to any compiler other than the CXX provided to configure). The configure script detects properties of the CXX compiler it is given, but cannot detect anything about a compiler wrapper added later at app build time that imposes additional restrictions, affects ABI compatibility or might otherwise invalidate the configure results. The problem with
__builtin_launder
led to an obvious error in this case, but there might theoretically be other more subtle problems lurking as a result of this compiler mismatch. Because this conflict arose from what is currently considered "pilot error", I'm changing this issue from "bug" to "proposal".One possible solution is to configure and install UPC++ with
CXX="nvcc -ccbin=mpicxx"
, which should fix the configure detection problem. However we do not officially support use of nvcc wrapper as the target C++ compiler, so there might be other problems with that approach. This might also suffer from problems due to recursion when nvcc calls upcxx calls nvcc.Another possibility is we could special-case configure in
--with-cuda
mode to additionally probe properties ofnvcc
(or more generally, the provided--with-nvcc
), and adjustupcxx_config.hpp
accordingly. This is somewhat "fragile" in that we'd need to manually curate the list of items that might diverge in behavior and handle each accordingly.