compiling Baikal with gcc >= 9.3 is very slow

Issue #2410 resolved
Roland Haas created an issue

Compiling the Baikal code was found to be very slow with gcc 9.3.0 or newer (10.1 is also affected):

  • it takes about 30 minutes to compile the 8th order FD RHS with gcc 9.3.0 using -O1 and -march=core2
  • Zach and Roland have been looking into this
  • the slowness goes away if, instead of -O1, one passes the individual options listed by gcc -Q -O1 --help=optimizers, which claims to report the options that are used by -O1

Zach wanted to look into moving the operators into non-inlined files anyway, which may fix the issue.

Comments (7)

  1. Erik Schnetter

    In McLachlan, I find that the derivative operators themselves are quite large. I declare them CCTK_ATTRIBUTE_NOINLINE, but make their definition still available when compiling the caller. GCC then specializes the function, i.e. uses a special calling convention that is more efficient than the regular one.
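
    The pattern described here can be sketched as follows. This is a minimal illustration, not McLachlan's or Baikal's actual code: the names fd4_dx and deriv_line are hypothetical, and the macro definition is only a stand-in for the CCTK_ATTRIBUTE_NOINLINE that Cactus provides, so that the sketch builds outside the toolkit. The derivative operator is marked noinline, but its definition stays in the same translation unit as the caller, so the compiler can still see the body.

```c
#include <stddef.h>

/* Stand-in so the sketch builds outside Cactus (an assumption of this
   example); with GCC or Clang it expands to the noinline attribute. */
#ifndef CCTK_ATTRIBUTE_NOINLINE
#  if defined(__GNUC__)
#    define CCTK_ATTRIBUTE_NOINLINE __attribute__((noinline))
#  else
#    define CCTK_ATTRIBUTE_NOINLINE
#  endif
#endif

/* Hypothetical 4th-order centered first derivative along a unit-stride line:
   (u[i-2] - 8 u[i-1] + 8 u[i+1] - u[i+2]) / (12 dx).
   Defined in the same translation unit as its caller, but not inlined. */
static CCTK_ATTRIBUTE_NOINLINE double
fd4_dx(const double *restrict u, size_t i, double inv_dx)
{
  return inv_dx * ((u[i - 2] - u[i + 2]) / 12.0
                   + 2.0 * (u[i + 1] - u[i - 1]) / 3.0);
}

/* Hypothetical caller: fills du on the interior points 2 <= i < n-2 and
   leaves the two boundary points on each side untouched. */
void deriv_line(const double *restrict u, double *restrict du,
                size_t n, double dx)
{
  const double inv_dx = 1.0 / dx;
  for (size_t i = 2; i + 2 < n; ++i)
    du[i] = fd4_dx(u, i, inv_dx);
}
```

    Because the stencil body is emitted once rather than at every call site, the caller stays small, which is the property that matters for this ticket. Whether the compiler additionally specializes the call depends on the compiler and its flags.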

  2. Zach Etienne

    @Erik Schnetter Thanks for the tip! I have just refactored NRPy+'s finite-difference generating code so that it generates CCTK_ATTRIBUTE_NOINLINE finite difference functions within Baikal* instead of inlined code.

    The net result is far faster compiles (>10x faster for gcc 10.1 on a Linux machine), and far faster codegens (~2.4x faster to generate Baikal* thorns using NRPy+). Further, in early tests, I have found no degradation in runtime performance (same performance within error bars).

    I confirmed that the updated Baikal* thorns still pass the testsuite, so I have replaced the Baikal* thorns in WVUThorns master with the updated ones. @Roland Haas will be retrying on the same machine used to produce the original benchmarks for this ticket.

  3. Roland Haas reporter

    Time spent compiling Baikal and BaikalVacuum at version 8b2d570 ("WVUThorns/Baikal*: Compute finite difference derivatives within functions instead of inlined. Results in ~2.4x faster codegen and much faster compiles with GCC 9.3 and later") with gcc 9.3.0 using -O1 -march=core, on the same OSX VM using MacPorts as in the description (only files taking more than 1s are shown):

    File name                                                   Time to compile (s)
    Baikal/src/driver_enforcedetgammabar_constraint.c           12.2738
    Baikal/src/BSSN_RHSs_enable_Tmunu_True_FD_order_4.c         7.50983
    Baikal/src/driver_BSSN_T4UU.c                               5.40932
    Baikal/src/driver_pt2_BSSN_RHSs.c                           3.18587
    Baikal/src/BSSN_Ricci_FD_order_4.c                          2.23229
    BaikalVacuum/src/BSSN_RHSs_enable_Tmunu_False_FD_order_8.c  20.5416
    BaikalVacuum/src/driver_pt2_BSSN_RHSs.c                     13.5228
    BaikalVacuum/src/BSSN_RHSs_enable_Tmunu_False_FD_order_6.c  11.0136
    BaikalVacuum/src/BSSN_Ricci_FD_order_8.c                    5.86034
    BaikalVacuum/src/BSSN_Ricci_FD_order_6.c                    3.13888
    BaikalVacuum/src/BSSN_to_ADM.c                              1.22795

    I am recompiling the release code to compare, but it has already been compiling for a couple of minutes, so it is much slower to compile.

  4. Roland Haas reporter

    Table of compile times for gcc 9.3.0 using -O1 -march=core and the ET_2020_05_v0 version of the code:

    File name                                                   Time to compile (s)
    Baikal/src/driver_enforcedetgammabar_constraint.c           458.326
    Baikal/src/driver_pt2_BSSN_RHSs.c                           65.7796
    Baikal/src/BSSN_RHSs_enable_Tmunu_True_FD_order_4.c         46.865
    Baikal/src/BSSN_Ricci_FD_order_4.c                          9.34175
    Baikal/src/driver_BSSN_T4UU.c                               7.00304
    BaikalVacuum/src/BSSN_RHSs_enable_Tmunu_False_FD_order_4.c  8947.04
    BaikalVacuum/src/BSSN_RHSs_enable_Tmunu_False_FD_order_8.c  2453.11
    BaikalVacuum/src/BSSN_Ricci_FD_order_4.c                    1100.26
    BaikalVacuum/src/driver_enforcedetgammabar_constraint.c     433.173
    BaikalVacuum/src/BSSN_Ricci_FD_order_8.c                    305.455
    BaikalVacuum/src/driver_pt2_BSSN_RHSs.c                     66.2126
    BaikalVacuum/src/BSSN_to_ADM.c                              1.22881

    The updated code is thus about a factor of 430 faster to compile for the slowest file (BSSN_RHSs_enable_Tmunu_False_FD_order_4.c), and the compile time changes from several hours to a minute or so.
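
    The arithmetic behind that factor can be reproduced with a small helper. This is only a sketch: compile_speedup is a hypothetical name, and it assumes that the factor of about 430 compares the slowest file in each of the two tables (8947.04 s in the ET_2020_05_v0 table against 20.5416 s in the 8b2d570 table).

```c
/* Ratio of old to new compile time in seconds; a value above 1 means the
   new code compiles faster. Hypothetical helper for illustration only. */
double compile_speedup(double old_seconds, double new_seconds)
{
  return old_seconds / new_seconds;
}
```

    Under that assumption, compile_speedup(8947.04, 20.5416) comes to roughly 436, in line with the factor of about 430 quoted above. For a file that appears under the same name in both tables, Baikal/src/driver_enforcedetgammabar_constraint.c, compile_speedup(458.326, 12.2738) gives roughly 37.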

  5. Zach Etienne

    430x! I’ve confirmed roundoff-level agreement with the original version (and no *runtime* performance degradation) and pushed this updated version to the WVUThorns repo (master branch). Can we consider this ticket closed, then?
