Cactus' link command uses CPPFLAGS and CXXFLAGS

Issue #2553 open
Roland Haas created an issue

Cactus link line in make.configuration looks like this right now:

$(LD) $(CREATEEXE)$(OPTIONSEP)"$(call TRANSFORM_DIRS,$@)" $(CPPFLAGS) $(CXXFLAGS) $(LDFLAGS) $(EXTRAFLAGS) "$(call TRANSFORM_DIRS,$(TOP)/datestamp.o)" $(BEGIN_WHOLE_ARCHIVE_FLAGS) $(CACTUSLIBLINKLINE) $(END_WHOLE_ARCHIVE_FLAGS) $(GENERAL_LIBRARIES)

ie it passes $(CPPFLAGS) and $(CXXFLAGS) to $(LD). While we normally set LD to the same value as CXX, this is not always correct, eg if one would like to use g++ for C++ files but link with nvcc.

Neither of these options is required or should be there, since $(LD) does not compile anything at all. Any subset of options from $(CPPFLAGS) or $(CXXFLAGS) that may be needed (eg -fopenmp) should be set in $(LDFLAGS).
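
A minimal sketch of what the link line could look like with the two variables dropped, assuming nothing else in make.configuration changes. This is an illustration of the proposal, not a tested patch:

```make
# Sketch only: the current link line minus $(CPPFLAGS) and $(CXXFLAGS).
# Anything the linker still needs (eg -fopenmp) would have to come from $(LDFLAGS).
$(LD) $(CREATEEXE)$(OPTIONSEP)"$(call TRANSFORM_DIRS,$@)" $(LDFLAGS) $(EXTRAFLAGS) "$(call TRANSFORM_DIRS,$(TOP)/datestamp.o)" $(BEGIN_WHOLE_ARCHIVE_FLAGS) $(CACTUSLIBLINKLINE) $(END_WHOLE_ARCHIVE_FLAGS) $(GENERAL_LIBRARIES)
```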

Comments (22)

  1. Erik Schnetter

    There are some optimizations that recompile code at link time. I don’t think these were ever successfully used in Cactus, nor do I know whether compilers still offer them. (The current set of -flto optimizations might store the compiler flags internally.)

    So far, the Cactus policy has been that each stage (preprocessor, compiler, linker) gets passed all the flags of the previous stages.

  2. Roland Haas reporter

    I see. I had not thought of link time optimizations.

    To me it seems that passing CPPFLAGS to the compiler call makes sense since we actually call the compiler driver, which internally first preprocesses the code (with cpp) and then calls the actual compiler (cc1). This is different from, say, FPP, where Cactus explicitly calls the preprocessor and then passes the output file to the Fortran compiler (though by now it seems that many Fortran compiler drivers can also call the C preprocessor internally).

    Right now the linking stage in Cactus is really only linking object files and library files into the executable and does not include any compilation so the compiler and preprocessor options are (except for LTO) not needed.

    The cases where this matters are probably limited, and I would have thought that one should be able to use g++ to link CUDA code, but at least my initial tries failed until I used nvcc to link (otherwise it would report undefined symbols for some CUDA-using code, though not for libcuda or libcudart, which I already included in LIBS).

    I will try and see if the build system / option lists would support having LDFLAGS default to $(CPPFLAGS) $(CXXFLAGS) so that one can override its value completely via an option list. Currently it is impossible to remove $(CXXFLAGS) from the linker command line.

  3. Roland Haas reporter

    Well, making LD be nvcc gives me extra trouble when compiling ExternalLibraries (the same sort of issues that come from having Cactus' LIBS mean something different than autoconf’s LIBS).

    Instead I peeked at Formaline and wrote a short ExternalLibraries/CUDA thorn that adapts Cactus' link step based on the documentation by NVIDIA. The thorn is here: https://github.com/rhaas80/ExternalLibraries-CUDA.git and the trick is to collect all CUDA code in a new library using nvcc (just as NVIDIA shows):

    CACTUSLIBLINKLINE += -l$(CCTK_LIBNAME_PREFIX)CUDA-gpucode
    
    CUDA-LIB = $(CCTK_LIBDIR)/$(LIBNAME_PREFIX)$(CCTK_LIBNAME_PREFIX)CUDA-gpucode$(LIBNAME_SUFFIX)
    
    $(EXEDIR)$(DIRSEP)$(EXE): $(CUDA-LIB)
    
    # TODO: make this depend on only the thorns that REQUIRE CUDA
    # TODO: check if depending on LINKLIST would be enough
    $(CUDA-LIB): $(CONFIG)/make.thornlist $(CONFIG)/cctki_version.h $(patsubst %,$(CCTK_LIBDIR)/$(LIBNAME_PREFIX)$(CCTK_LIBNAME_PREFIX)%$(LIBNAME_SUFFIX),$(notdir $(THORNS) $(CACTUSLIBS))) $(CCTK_LIBDIR)/LINKLIST
            $(CUCC) $(patsubst %,$(CCTK_LIBDIR)/$(LIBNAME_PREFIX)%$(LIBNAME_SUFFIX),$(ALLCACTUSLIBS)) -dlink -o $@
            if test "x$(USE_RANLIB)" = "xyes"; then $(RANLIB) $(RANLIBFLAGS) $@; fi
            @echo $(DIVIDER)
    

    which lets me compile this thornlist:

    ExternalLibraries/CUDA
    

    and run this parfile:

    ActiveThorns = CUDA
    CUDA::test = yes
    

    I think this should work, though I have not tested it on a system where -filelist is actually supported (given that Linux does not, I assume nothing does anymore), in which case maybe there are no libthornFOO.a files but only the raw object files.

  4. Erik Schnetter

    I didn’t realize you had trouble compiling these thorns.

    For me, these settings work:

    CXX = /home/eschnetter/Cactus/view-cuda/bin/nvcc --compiler-bindir /home/eschnetter/Cactus/view-cuda-compilers/bin/g++ -x cu
    CXXFLAGS = -pipe -g --compiler-options -march=native -std=c++17 --compiler-options -std=gnu++17 --expt-relaxed-constexpr --extended-lambda --gpu-architecture sm_75 --forward-unknown-to-host-compiler --Werror cross-execution-space-call --Werror ext-lambda-captures-this --relocatable-device-code=true --objdir-as-tempdir
    

    with standard Cactus. This uses nvcc for all C++ code as well as for linking.

  5. Roland Haas reporter

    That one also worked for me, and is what I used at first. However, it fails to compile the ExternalLibraries. I am not 100% sure why, but two issues seem to be that CMake really does not like $CXX to be more than one word, and that “-x cu” seems to also interpret *.o files as source files to compile, which of course leads to failures.

    I also have to add -DSIMD_CPU to avoid issues in Arith, where I otherwise get an error that the return value of a constexpr function must be of integral type (this is with gcc 10.1 as the host compiler, so sufficiently modern).

    This work is part of an effort to avoid having to do any of this and to be able to use a more regular Cactus build setup, where one can use the ExternalLibraries to automatically build the required libraries. That should help reduce the number of build failures when one starts with CarpetX.

    I will attach a thornlist once I have tested things once or twice on my laptop (with and without GPU) and on some cluster.

  6. Erik Schnetter

    Yes, external libraries really want traditional compiler settings. I always build external libraries externally; I think it was a mistake to use Cactus’s build system for that (downloads are large, build times are large, external libraries are not shared across configurations, etc.) I’m using Spack for all external libraries these days.

    We could introduce EXTERNAL_CC etc. which we would use in external libraries.

  7. Erik Schnetter

    I’m working on a Spack recipe for the Einstein Toolkit. It’s working (with many external libraries, of course), but it currently uses plain make instead of Simfactory. Once that’s done I’ll submit it as an official package.

  8. Roland Haas reporter

    My biggest worry with Spack is that, unless the cluster’s module system is also based on Spack, it is often hard to integrate Spack-generated packages with the cluster ones. I realize that one can often just build everything using Spack, incl. MPI etc., but I am not sure how confident I am in this always providing a good solution. I expect it will do somewhat poorly on clusters with proprietary hardware, eg Crays of any shape and size, or clusters where MPI required a lot of hand tuning.

  9. Erik Schnetter

    You would need to point Spack to the compiler and MPI libraries that it should use (and maybe a few more), and then build the rest. Building your own MPI library is easy; using it is next to impossible since it’s very difficult to configure it to use the existing hardware (and it’s probably actually impossible on Crays).

  10. Roland Haas reporter

    The ticket got sidetracked a bit. The original question was to remove CPPFLAGS and CXXFLAGS from the link call since there is no actual compilation happening (it really only links library and object files).
    Removing CXXFLAGS may result in (fixable) build failures in option lists that set eg -fopenmp only in CXXFLAGS and not also in LDFLAGS (or OPENMP_CXXFLAGS and not also OPENMP_LDFLAGS) since -fopenmp must be present in the link line to link in the OpenMP runtime libraries.
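
    As a sketch of the fix on the option list side (the flag values here are illustrative and assume a GCC-style compiler, they are not taken from any particular option list):

    ```make
    # Sketch of an option list fragment; values are examples only.
    CXXFLAGS = -g -std=c++17 -fopenmp
    # Repeat -fopenmp for the linker so the OpenMP runtime is still linked in
    # once $(CXXFLAGS) is no longer passed to $(LD).
    LDFLAGS  = -fopenmp
    ```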

  11. Steven R. Brandt

    I think we should fix this issue now, while the current release is still young. That will give us time to notice any build failures likely to occur.

  12. Roland Haas reporter

    Removing CXXFLAGS and CPPFLAGS from the LD line leads to link time failures about GOMP_parallel not being found. Adding LD_OPENMP_FLAGS = -fopenmp does not help since that variable is not used. The only X_OPENMP_FLAGS variables in configure are:

    lib/make/configure:F77_OPENMP_FLAGS="$F90_OPENMP_FLAGS"
    lib/make/configure:s%@CPP_OPENMP_FLAGS@%$CPP_OPENMP_FLAGS%g
    lib/make/configure:s%@FPP_OPENMP_FLAGS@%$FPP_OPENMP_FLAGS%g
    lib/make/configure:s%@C_OPENMP_FLAGS@%$C_OPENMP_FLAGS%g
    lib/make/configure:s%@CXX_OPENMP_FLAGS@%$CXX_OPENMP_FLAGS%g
    lib/make/configure:s%@CUCC_OPENMP_FLAGS@%$CUCC_OPENMP_FLAGS%g
    lib/make/configure:s%@F90_OPENMP_FLAGS@%$F90_OPENMP_FLAGS%g
    lib/make/configure:s%@F77_OPENMP_FLAGS@%$F77_OPENMP_FLAGS%g
    

    So one first needs to define a new variable LD_OPENMP_FLAGS, then make sure it gets added to LDFLAGS, and only then can one remove CXXFLAGS. It is probably also a good idea to set LD_OPENMP_FLAGS to the same default as CXX_OPENMP_FLAGS in lib/make/known-architectures/linux.

    If one wants to preserve backwards compatibility then one also needs to default LD_OPENMP_FLAGS to CXX_OPENMP_FLAGS.
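
    The intended defaulting could be sketched roughly like this. It is written in make syntax for brevity; the real change would have to go through lib/make/configure and the files it generates, and LD_OPENMP_FLAGS does not exist in Cactus at this point:

    ```make
    # Sketch only: LD_OPENMP_FLAGS is the proposed new variable.
    # Fall back to the C++ setting so existing option lists keep linking.
    LD_OPENMP_FLAGS ?= $(CXX_OPENMP_FLAGS)
    # Let the linker see the OpenMP flags without relying on $(CXXFLAGS).
    LDFLAGS += $(LD_OPENMP_FLAGS)
    ```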

  13. Roland Haas reporter

    This changes the way option lists behave, so should be reconsidered only after the May 2022 release since it requires use of LD_XXX_FLAGS to set settings that were previously inherited from CPPFLAGS and CXXFLAGS on the linker line.

    The branch also has the commit https://bitbucket.org/cactuscode/cactus/commits/f04408489af6800ad8d5e177eabd9c4a53acc1fd which is not strictly related to the subject of the ticket and instead disallows multiple definitions with pgCC by default, in the same manner that gcc does starting with version 10 (-fno-common is the default in gcc10 and up).

  14. Steven R. Brandt

    For the upcoming release, we need to compile with nvcc. This leads to errors in ctthorns because Kranc detects cuda and starts putting the device tag on its functions. This leads to compiler failures for ctthorns. If we didn’t have to set CXX globally to nvcc, this problem should go away.
