Reduce WeylScal4 code size

Create issue
Issue #1020 closed
Erik Schnetter created an issue

The attached patch reduces the size of the auto-generated code of WeylScal4 by introducing some strategic temporary variables.

This prevents a build failure on BlueGene/Q.


Comments (11)

  1. Ian Hinder
    • removed comment

    Is this patch still necessary now that the symmetries have been implemented? I'm concerned that the code becomes harder to read and verify, and the test cases might have certain symmetries which hide problems in the implementation. These sorts of optimisations would ideally be performed by Kranc itself. Does CSE not work well enough in this case?

  2. Erik Schnetter reporter
    • removed comment

    I checked: CSE is not efficient enough in reducing the code size in the same way. For example, the code size of psis_calc_Nth decreases from 74 kB to 66 kB with the optimisations above.

    One example to look at is gamma111 in the Kranc-generated sources. CSE does not manage to pull out the equivalent of a Christoffel symbol with all lowercase indices (the gammal in my changes). I don't know why -- maybe the pattern that needs to be recognized is too complex, or maybe gammal has symmetries that I declared explicitly but which CSE does not find automatically.

  3. Ian Hinder
    • removed comment

    Do we have a mechanism to give Kranc hints for subexpressions to look for? Could such a thing be added to the CSE code?

  4. Erik Schnetter reporter
    • removed comment

    No, there is no mechanism to give hints to CSE.


    gamma[ua,lb,lc] -> gu[ua,ud] (PD[g[ld,lb],lc] + PD[lb,ld],lc] - PD[g[lb,lc],ld]) CSEHint -> {PD[g[ld,lb],lc] + PD[lb,ld],lc] - PD[g[lb,lc],ld]} That doesn't look nicer than my solution:

    gammal[ld,lb,lc] -> PD[g[ld,lb],lc] + PD[lb,ld],lc] - PD[g[lb,lc],ld], gamma[ua,lb,lc] -> gu[ua,ud] gammal[ld,lb,lc]

    Or maybe:

    gamma[ua,lb,lc] -> gu[ua,ud] Save[PD[g[ld,lb],lc] + PD[lb,ld],lc] - PD[g[lb,lc],ld]] where "Save" indicates that the expression should be saved in a temporary variable.

    Or maybe Kranc should do this grouping automatically when it expands indices for implicit sums.

    Maybe it would suffice to factor out those terms in a produce that are independent of the index that is expanded. That is, a term

    A B[ui] C[li] becomes then

    A (B1 C1 + B2 C2 + B3 C3) instead of repeating A three times. In long expressions, especially if multiple indices are expanded in the same product, this could be a big winner.

    Alternatively, and much more difficult, CSE could be looking for expressions with this pattern:

    A X1 + A X2 + A X3 and either transform this to

    A (X1 + X2 + X3) or at least identify A as a common subexpression. Of course, A would be a whole expression, and could be "spread out", i.e. wildly intermixed with the Xs.

  5. Ian Hinder
    • removed comment

    At the moment, I am reluctant to support this patch. The expressions become less readable and are now different to the expressions in the original paper they were copied from, making them less verifiable. Erik's comment that his UseCSE solution doesn't look any nicer than the original applies only if you read all the code (i.e. the expression and the hint). In the new form, the actual expression is in the form that the reader expects, and might be checking from a paper. The CSE hint is something the user can ignore, as Kranc guarantees that this is only an optimisation, and not a change in functionality. I would like for the source expressions to be directly comparable with the paper they are based on.

    The original reason for wanting to implement this change was due to a build failure on BlueGene/Q. Does the code still fail to build, now that some of the Riemann symmetries have been implemented? If it now builds successfully, then in my opinion it is better to have simpler, more readable Kranc source than to reduce the build time and size of the generated C++.

  6. Log in to comment