Cactus' link command uses CPPFLAGS and CXXFLAGS
The Cactus link line in make.configuration currently looks like this:
$(LD) $(CREATEEXE)$(OPTIONSEP)"$(call TRANSFORM_DIRS,$@)" $(CPPFLAGS) $(CXXFLAGS) $(LDFLAGS) $(EXTRAFLAGS) "$(call TRANSFORM_DIRS,$(TOP)/datestamp.o)" $(BEGIN_WHOLE_ARCHIVE_FLAGS) $(CACTUSLIBLINKLINE) $(END_WHOLE_ARCHIVE_FLAGS) $(GENERAL_LIBRARIES)
ie it passes $(CPPFLAGS) and $(CXXFLAGS) to $(LD). While we do normally set LD to the same value as CXX, this is not always correct, eg if one would like to use g++ for C++ files but link with nvcc.
Neither of these options is required or should be there, since $(LD) does not compile anything at all. Any subset of options from $(CPPFLAGS) or $(CXXFLAGS) that may be needed (eg -fopenmp) should be set in $(LDFLAGS).
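As a rough sketch of the proposal, the link recipe would keep everything except $(CPPFLAGS) and $(CXXFLAGS), and an option list would repeat in LDFLAGS whatever the linker driver still needs. The recipe line below is the one quoted above with the two variables dropped. The option list values are assumptions for illustration only and presume a GCC-style compiler.

```make
# Proposed link recipe line (sketch; it belongs inside the existing link rule):
# identical to the current one, but without $(CPPFLAGS) $(CXXFLAGS)
$(LD) $(CREATEEXE)$(OPTIONSEP)"$(call TRANSFORM_DIRS,$@)" $(LDFLAGS) $(EXTRAFLAGS) "$(call TRANSFORM_DIRS,$(TOP)/datestamp.o)" $(BEGIN_WHOLE_ARCHIVE_FLAGS) $(CACTUSLIBLINKLINE) $(END_WHOLE_ARCHIVE_FLAGS) $(GENERAL_LIBRARIES)

# Illustrative option list entries (assumed values, not taken from a real option list):
# -fopenmp is repeated in LDFLAGS so that the OpenMP runtime is still linked in
CXXFLAGS = -g -O2 -fopenmp
LDFLAGS  = -fopenmp
```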
Comments (22)
-
reporter I see. I had not thought of link time optimizations.
To me it seems that passing CPPFLAGS to the compiler call makes sense since we actually call the compiler driver, which internally first preprocesses the code (with cpp) and then calls the actual compiler (cc1). This is different from say FPP, where Cactus explicitly calls the preprocessor and then passes the output file to the Fortran compiler (though by now it seems that many Fortran compiler drivers can also call the C preprocessor internally).
Right now the linking stage in Cactus is really only linking object files and library files into the executable and does not include any compilation, so the compiler and preprocessor options are (except for LTO) not needed.
The cases where this matters are probably limited, and I would have thought that one should be able to use g++ to link CUDA code, but at least my initial tries failed until I used nvcc to link (otherwise it would report undefined symbols for some code using CUDA, though not for libcuda or libcudart, which I had already included in LIBS).
I will try and see if the build system / option lists would support having LDFLAGS default to $(CPPFLAGS) $(CXXFLAGS) so that one can override its value completely via an option list. Currently it is impossible to remove $(CXXFLAGS) from the linker command line.
-
reporter - changed status to open
-
reporter - changed title to Cactus' link command uses CPPFLAGS and CXXFLAGS
-
reporter Apparently using g++ to link CUDA code is advanced use (or at least was in 2014), see https://developer.nvidia.com/blog/separate-compilation-linking-cuda-device-code/ for details. It requires that one pre-links the object files to extract the CUDA device code into a form that g++ understands.
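A minimal standalone sketch of that pre-link step, outside of Cactus, might look as follows. The file names (kernel.cu, driver.cu, gpucode.o, app) and the CUDA_HOME variable are hypothetical, and the sketch assumes nvcc, g++ and the CUDA runtime library are installed. The flags -dc and -dlink are nvcc's options for relocatable device code and device linking.

```make
# Hypothetical installation prefix; adjust to the local CUDA installation
CUDA_HOME ?= /usr/local/cuda

# Compile CUDA sources into objects that contain relocatable device code
%.o: %.cu
	nvcc -dc $< -o $@

# Device-link step: combine the device code of all objects into one
# additional object file that a host linker can handle
gpucode.o: kernel.o driver.o
	nvcc -dlink $^ -o $@

# Final link with the host compiler: original objects plus the device-linked one
app: kernel.o driver.o gpucode.o
	g++ $^ -L$(CUDA_HOME)/lib64 -lcudart -o $@
```

The point of the extra gpucode.o object is that g++ never has to understand device code itself; it only sees ordinary host objects.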
-
We can certainly change the way Cactus handles things.
-
reporter Well, making LD be nvcc gives me extra trouble when compiling ExternalLibraries (the same sort of issues that come from having Cactus LIBS mean something different than autoconf’s LIBS).
Instead I peeked at Formaline and wrote a short ExternalLibraries/CUDA thorn that adapts Cactus' link step based on the documentation by NVIDIA. The thorn is here: https://github.com/rhaas80/ExternalLibraries-CUDA.git and the trick is to collect all CUDA code in a new library using nvcc (just as NVIDIA shows):
CACTUSLIBLINKLINE += -l$(CCTK_LIBNAME_PREFIX)CUDA-gpucode

CUDA-LIB = $(CCTK_LIBDIR)/$(LIBNAME_PREFIX)$(CCTK_LIBNAME_PREFIX)CUDA-gpucode$(LIBNAME_SUFFIX)

$(EXEDIR)$(DIRSEP)$(EXE): $(CUDA-LIB)

# TODO: make this depend on only the thorns that REQUIRE CUDA
# TODO: check if depending on LINKLIST would be enough
$(CUDA-LIB): $(CONFIG)/make.thornlist $(CONFIG)/cctki_version.h $(patsubst %,$(CCTK_LIBDIR)/$(LIBNAME_PREFIX)$(CCTK_LIBNAME_PREFIX)%$(LIBNAME_SUFFIX),$(notdir $(THORNS) $(CACTUSLIBS))) $(CCTK_LIBDIR)/LINKLIST
	$(CUCC) $(patsubst %,$(CCTK_LIBDIR)/$(LIBNAME_PREFIX)%$(LIBNAME_SUFFIX),$(ALLCACTUSLIBS)) -dlink -o $@
	if test "x$(USE_RANLIB)" = "xyes"; then $(RANLIB) $(RANLIBFLAGS) $@; fi
	@echo $(DIVIDER)
which lets me compile this thornlist:
ExternalLibraries/CUDA
and run this parfile:
ActiveThorns = CUDA
CUDA::test = yes
I think this should work, though I have not tested it on a system where -filelist is actually supported (given that Linux does not, I assume nothing does anymore), in which case maybe there are no libthornFOO.a files but only the raw object files.
-
reporter And version 8aa7a36 lets me compile CarpetX, Z4c, Weyl, AMReX.
-
I didn’t realize you had trouble compiling these thorns.
For me, these settings work:
CXX = /home/eschnetter/Cactus/view-cuda/bin/nvcc --compiler-bindir /home/eschnetter/Cactus/view-cuda-compilers/bin/g++ -x cu
CXXFLAGS = -pipe -g --compiler-options -march=native -std=c++17 --compiler-options -std=gnu++17 --expt-relaxed-constexpr --extended-lambda --gpu-architecture sm_75 --forward-unknown-to-host-compiler --Werror cross-execution-space-call --Werror ext-lambda-captures-this --relocatable-device-code=true --objdir-as-tempdir
with standard Cactus. This uses nvcc for all C++ code as well as for linking.
-
reporter That one also worked for me, and is what I used at first. However it fails to compile the ExternalLibraries. I am not 100% sure why, but two issues seem to be that CMake really does not like $CXX to be more than one word, and that “-x cu” seems to also interpret *.o files as source files to compile, which of course leads to failures.
I also have to add -DSIMD_CPU to avoid issues in Arith, where I otherwise get an error that the return value of a constexpr function must be of integral type (this is with gcc 10.1 as the host compiler, so sufficiently modern).
This work is part of an effort to avoid having to do any of this and to be able to use a more regular Cactus build setup where one can use the ExternalLibraries to automatically build the required libraries, which should help reduce the number of build failures when one starts with CarpetX.
I will attach a thornlist once I have tested things once or twice on my laptop (with and without GPU) and on some cluster.
-
Yes, external libraries really want traditional compiler settings. I always build external libraries externally; I think it was a mistake to use Cactus’s build system for that (downloads are large, build times are large, external libraries are not shared across configurations, etc.) I’m using Spack for all external libraries these days.
We could introduce EXTERNAL_CC etc., which we would use in external libraries.
-
I’m working on a Spack recipe for the Einstein Toolkit. It’s working (with many external libraries, of course), but it currently uses plain make instead of Simfactory. Once that’s done I’ll submit it as an official package.
-
reporter My biggest worry with Spack is that, unless the cluster’s module system is also based on Spack, it is often hard to integrate Spack generated packages with the cluster ones. I realize that one can often just build everything using Spack incl. MPI etc., but I am not sure how confident I am in this always providing a good solution. I expect it will do somewhat poorly on clusters with proprietary hardware, eg Crays of any shape and size or clusters where MPI required a lot of hand tuning.
-
You would need to point Spack to the compiler and MPI libraries that it should use (and maybe a few more), and then build the rest. Building your own MPI library is easy; using it is next to impossible since it’s very difficult to configure it to use the existing hardware (and it’s probably actually impossible on Crays).
-
reporter I have a branch and thornlist for CarpetX that lets me compile for GPUs without having to use nvcc for linking or compiling all CXX code (other than those thorns that mix CUDA and C++ code). See https://bitbucket.org/eschnett/cactusamrex/wiki/Getting Started
-
reporter The ticket got sidetracked a bit. The original question was to remove CPPFLAGS and CXXFLAGS from the link call since there is no actual compilation happening (it really only links library and object files).
Removing CXXFLAGS may result in (fixable) build failures in option lists that set eg -fopenmp only in CXXFLAGS and not also in LDFLAGS (or OPENMP_CXXFLAGS and not also OPENMP_LDFLAGS) since -fopenmp must be present in the link line to link in the OpenMP runtime libraries.
-
I think we should fix this issue now, while the current release is still young. That will give us time to notice any build failures likely to occur.
-
reporter Removing CXXFLAGS and CPPFLAGS from the LD line leads to link time failures about GOMP_parallel not being found. Adding LD_OPENMP_FLAGS = -fopenmp does not help since that variable is not used. The only X_OPENMP_FLAGS variables in configure are:
lib/make/configure:F77_OPENMP_FLAGS="$F90_OPENMP_FLAGS"
lib/make/configure:s%@CPP_OPENMP_FLAGS@%$CPP_OPENMP_FLAGS%g
lib/make/configure:s%@FPP_OPENMP_FLAGS@%$FPP_OPENMP_FLAGS%g
lib/make/configure:s%@C_OPENMP_FLAGS@%$C_OPENMP_FLAGS%g
lib/make/configure:s%@CXX_OPENMP_FLAGS@%$CXX_OPENMP_FLAGS%g
lib/make/configure:s%@CUCC_OPENMP_FLAGS@%$CUCC_OPENMP_FLAGS%g
lib/make/configure:s%@F90_OPENMP_FLAGS@%$F90_OPENMP_FLAGS%g
lib/make/configure:s%@F77_OPENMP_FLAGS@%$F77_OPENMP_FLAGS%g
So one first needs to define a new variable LD_OPENMP_FLAGS, then make sure it gets added to LDFLAGS, and then one can remove CXXFLAGS. Probably also a good idea to set LD_OPENMP_FLAGS to the same default as CXX_OPENMP_FLAGS in lib/make/known-architectures/linux.
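The steps above could be sketched in make syntax roughly as follows. This is only an illustration of the idea: in Cactus the defaults would more likely be set by configure and the known-architectures files rather than in a makefile, and the exact placement is an assumption.

```make
# New variable (sketch): use the C++ OpenMP flags unless LD_OPENMP_FLAGS
# has already been set, eg by an option list
LD_OPENMP_FLAGS ?= $(CXX_OPENMP_FLAGS)

# Make sure the linker sees the OpenMP flags without relying on $(CXXFLAGS)
LDFLAGS += $(LD_OPENMP_FLAGS)
```

With something like this in place, an option list that only sets CXX_OPENMP_FLAGS would still get -fopenmp on the link line after $(CXXFLAGS) is removed from it.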
If one wants to preserve backwards compatibility then one also needs to default LD_OPENMP_FLAGS to CXX_OPENMP_FLAGS.
-
reporter Branch https://bitbucket.org/cactuscode/cactus/branch/rhaas/ld_xxx_flags contains code to avoid using CXXFLAGS and CPPFLAGS in the linker command line. It does (have to) introduce a full set of LD_XXX_FLAGS variables that all default to their CXX_XXX_FLAGS analog if LD is not set (in which case LD defaults to CXX).
-
reporter This changes the way option lists behave, so it should be reconsidered only after the May 2022 release, since it requires use of LD_XXX_FLAGS to set settings that were previously inherited from CPPFLAGS and CXXFLAGS on the linker line.
The branch also has the commit https://bitbucket.org/cactuscode/cactus/commits/f04408489af6800ad8d5e177eabd9c4a53acc1fd which is not strictly related to the subject of the ticket and instead disallows multiple definitions with pgCC by default in the same manner that gcc starting with version 10 does (-fno-common is the default in gcc10 and up).
-
reporter Similar issues also happen with HIP / ROCm compiles and using -x hip in CXXFLAGS.
-
For the upcoming release, we need to compile with nvcc. This leads to errors in ctthorns because Kranc detects cuda and starts putting the device tag on its functions. This leads to compiler failures for ctthorns. If we didn’t have to set CXX globally to nvcc, this problem should go away.
-
There are some optimizations that recompile code at link time. I don’t think these were ever successfully used in Cactus, and I don’t know whether compilers still offer them. (The current set of -lto optimizations might store the compiler flags internally.)
So far, the Cactus policy has been that each stage (preprocessor, compiler, linker) gets passed all the flags of the previous stages.