- changed status to open
- removed comment
Improve McLachlan accuracy
James van Meter provided me with an optimised version of McLachlan. He states:
-
You are not taking full advantage of the chi=exp(-2phi) variable. There are several terms where you divide by chi or chi^2 in expressions with an overall factor of exp(-4phi). I rewrote the BSSN equations to make these cancellations before coding. So where you have an expression of the form chi^2 (A + B/chi^2), I have chi^2 A + B. This gives a slight but noticeable advantage in both accuracy and performance.
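The rewrite he describes can be sketched in a few lines. This is an illustrative toy in Python, not the Kranc-generated McLachlan code; the names `rhs_term_original`/`rhs_term_cancelled` and the scalars A, B are made up for the example.

```python
import math

# With chi = exp(-2*phi) (so chi^2 = exp(-4*phi)), a term of the form
# chi^2 * (A + B/chi^2) can be simplified analytically to chi^2 * A + B,
# removing a division by chi^2 before any code is generated.

def rhs_term_original(phi, A, B):
    chi = math.exp(-2.0 * phi)
    return chi**2 * (A + B / chi**2)

def rhs_term_cancelled(phi, A, B):
    chi = math.exp(-2.0 * phi)
    return chi**2 * A + B
```

For moderate phi the two forms agree to roundoff, but for large phi (chi underflowing to zero, e.g. near a puncture) the original form divides by zero while the cancelled form stays well defined, besides saving the division.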
-
I added Hamiltonian-constraint-damping terms due to Duez et al. These terms don't seem to be well-known but they are effective.
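The effect of such a damping term can be illustrated on a toy system. This is a generic sketch of constraint damping, not the actual Duez et al. BSSN terms; the variables x, y and the coefficient kappa are invented for the illustration.

```python
# Evolve (x, y) with x' = y + eps, y' = x, where eps mimics a small
# truncation-error source.  The "constraint" C = x - y then obeys
# C' = eps - C and settles at eps.  Adding a damping term -kappa*C to
# the x equation changes this to C' = eps - (1 + kappa)*C, so the
# violation is driven down to eps/(1 + kappa) instead.

def final_violation(kappa, eps=1e-3, dt=1e-3, steps=10000):
    x, y = 1.0, 1.0
    for _ in range(steps):
        C = x - y
        dx = y + eps - kappa * C   # damping term added to the RHS
        dy = x
        x, y = x + dt * dx, y + dt * dy
    return abs(x - y)

undamped = final_violation(kappa=0.0)   # settles near eps = 1e-3
damped = final_violation(kappa=10.0)    # settles near eps/11 ~ 9e-5
```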
-
I added a Gamma-constraint-damping term due to Yo et al.
-
I enforce det(g)=1.
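Enforcing det(g)=1 for the conformal 3-metric amounts to a pointwise rescaling. A minimal sketch, in illustrative Python rather than the generated code that operates on Cactus grid functions:

```python
# Rescaling gbar_ij -> det(gbar)^(-1/3) * gbar_ij makes the determinant
# exactly 1, since det scales with the cube of the overall factor.

def det3(g):
    # Determinant of a 3x3 metric given as nested lists.
    return (g[0][0] * (g[1][1] * g[2][2] - g[1][2] * g[2][1])
          - g[0][1] * (g[1][0] * g[2][2] - g[1][2] * g[2][0])
          + g[0][2] * (g[1][0] * g[2][1] - g[1][1] * g[2][0]))

def enforce_unit_det(g):
    s = det3(g) ** (-1.0 / 3.0)
    return [[s * g[i][j] for j in range(3)] for i in range(3)]

g = [[1.1, 0.0, 0.0],
     [0.0, 0.9, 0.1],
     [0.0, 0.1, 1.0]]
g_hat = enforce_unit_det(g)
```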
I have not tested this new version yet, but James suggests including it in the Einstein Toolkit.
Comments (23)
-
reporter -
- removed comment
This seems to be based on 3e4bb765aaff1776ef3da6840d55b4c417875803 "Regenerate ML_ADM", which is from 2011-11-20. So my first question would be whether we could get a patch against a more modern version of McLachlan (ideally one including CCZ4, since it also modifies the equations quite a bit). The second comment (this is turning into a paper review) is that in enforceCalc the file does:
phi -> Max[phi,10^(-16)], which seems wrong for the phi method, where phi is the log of the conformal factor and might well be negative. Finally, I would find it very useful if comments a la PRD XX YYYYY (ZZZZ) eqn EE were added for the constraint damping terms C1-C4. Since the source is the only documentation for ML, it would be nice if all the references required to even understand where the equations come from were contained in it.
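To make the point about the floor concrete: the positive conformal variables can safely be floored, but phi is a logarithm and is negative wherever chi > 1. A hypothetical snippet, not McLachlan's enforceCalc, using the chi = exp(-2 phi) convention quoted earlier in this ticket:

```python
import math

# chi (or W) is strictly positive, so max(chi, eps) is a sensible floor.
# But phi = -0.5*log(chi) is negative wherever chi > 1, so
# max(phi, 1e-16) clips perfectly valid data there.

chi = 4.0                       # conformal factor > 1 ...
phi = -0.5 * math.log(chi)      # ... gives phi < 0

clipped_phi = max(phi, 1e-16)   # the questionable enforcement
floored_chi = max(chi, 1e-16)   # harmless for a positive variable
```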
-
- removed comment
Is there still activity on this ticket?
-
reporter - removed comment
Yes there is.
-
- removed comment
I have just added a patch against the development version. I have tested that all the test suites pass with this version, ran a modified version of the qc0-mclachlan.par parameter file for 5M, and verified that the old and the new versions give the same results when running with both the standard phi- and W-methods. However, I noticed that the RHS1 routine is more expensive now. With the repository version I get (from the TimerReport):
ML_BSSN | ML_BSSN_RHS1 | 224.50275900 |
while with the new version I get:
ML_BSSN | ML_BSSN_RHS1 | 425.99931700 |
while the timings for all other McLachlan routines are essentially unchanged. I currently can't explain this difference in time. It doesn't seem like enough new calculations were added to warrant such an increase. I will investigate further.
-
reporter - removed comment
This could be a "platform effect". A small change in the code size or the number of local variables could now mean that the code runs essentially using the L2 cache instead of the L1 cache on your system, dramatically changing performance. If so, then this effect may or may not be there on other systems or with other compilers.
-
- removed comment
That could be. I will try to run under VTune and see if this is the case.
-
- removed comment
I have run both versions of the code with PAPI hardware performance counters on my laptop (an Intel Core i7 chip) using the Intel 12.1.3.293 compilers. I see a clear increase in L1 instruction cache misses (24 times more in the Goddard version). However, I also see a factor of 2 increase in floating point instructions as well as a factor of 4 increase in L1 data cache misses. I'm not sure whether these other increases are caused by the instruction cache misses. We may not see the same on AMD machines, but a fairly significant fraction of HPC resources are based on Intel chips, so I'm a bit hesitant to apply the patch at the moment.
-
reporter - removed comment
I am currently working on a re-write of McLachlan in a branch "rewrite". This is a general clean-up of the code, including removing duplication etc. From my point of view, it would make sense to apply this improvement to the rewritten McLachlan to avoid divergent branches.
-
- removed comment
One comment and a question:
Comment: We have to make extensive test and ensure that performance is not degraded by these changes.
Question: Does this rewrite also include the gauge splitting stuff from Ticket #590? Or will that have to be considered separately afterwards?
-
reporter - removed comment
Comment: Yes; we need to check not only the performance, but also correctness. After the rewrite, it will be easy to generate several versions, and then pick (maybe even at run time?) the most efficient one for each platform.
Question: The gauge splitting is currently not done, but will be just a few lines of work in the new code. I'll give an overview of the design tomorrow morning in the ET call.
-
- changed status to open
- removed comment
I think Erik is going to test/implement these changes in the new version of McLachlan (the rewrite branch). As a result, I am removing the "review" status from this ticket.
-
- removed comment
Which version are we including in the next release? Master or rewrite?
-
reporter - removed comment
I want to rename some parameters in the rewritten version. Given this, I suggest sticking with master for the next release.
-
Is there any desire to continue working on this? From this ticket it seems the functionality is all there, with only minor changes required (though it possibly no longer applies to the rewritten ML code).
-
Please review.
-
@Peter Diener mentioned during this morning’s ET telecon he would like to take a look at this, but hasn’t had time yet.
-
@Peter Diener did you have time to look into this?
-
@Peter Diener did you have time to look into this?
-
The “Duez et al” trick is quite powerful: I implemented it in Baikal a few months ago and found that H constraint violations drop by about an order of magnitude compared to when it's not enabled, for binary black hole simulations (e.g., a simulation similar to the ET BBH gallery example).
The trick dates back to Yoneda & Shinkai https://arxiv.org/abs/gr-qc/0204002 , who did an analytical (non-numerical) investigation
… and Duez et al. were, I think, the first to report its use in NR simulations: https://arxiv.org/abs/gr-qc/0209102
In addition Raithel & Paschalidis recently found that this trick improves H constraint convergence behavior in the context of binary neutron star simulations: https://arxiv.org/abs/2204.00698
Future versions of Baikal/BaikalVacuum will have this trick enabled by default – I’m now calibrating the magnitude and scaling of the coefficient.
-
hackathon
-
@Peter Diener @Zach Etienne did you have time to look into this?
-
I plan to incorporate these improvements:
https://arxiv.org/abs/2404.01137 in the November 2024 ET release of Baikal*.