- changed status to open
CactusNumeric/Slab: Improve performance
The attached patch does the following, prompted by performance problems reported by Christian Ott:
Reorganise some of the internals of thorn Slab:
Use LoopControl to parallelise loops via OpenMP.
Refactor the "workhorse" routines that perform the actual copying. These routines are specialised for common cases that need to execute efficiently, in particular the cases encountered in RotatingSymmetry90 and RotatingSymmetry180 when handling CCTK_REAL variables.
Offer an additional API (Slab_MultiTransfer_Init, Slab_MultiTransfer_Apply, Slab_MultiTransfer_Finalize) that calculates the communication schedule only once, and then re-uses it in further calls. This avoids some communication overhead.
Remove old CVS header comments.
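To illustrate the idea behind the Init/Apply/Finalize split described above, here is a minimal sketch in plain C of a "plan once, apply many times" copy. It is not the Slab thorn's code or API: the names `transfer_plan`, `copy_run`, `plan_init`, `plan_apply` and `plan_finalize` are hypothetical, and the sketch covers only a local strided-to-dense copy of doubles, with no inter-process communication.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration only; not part of thorn Slab. */
typedef struct {
  size_t src_off; /* offset of the run in the source array (elements) */
  size_t dst_off; /* offset of the run in the destination array */
  size_t len;     /* number of contiguous elements in the run */
} copy_run;

typedef struct {
  size_t nruns;
  copy_run *runs;
} transfer_plan;

/* Compute the schedule once: nrows rows of row_len elements each,
   spaced src_stride elements apart in the source and packed densely
   in the destination. Returns NULL if allocation fails. */
transfer_plan *plan_init(size_t nrows, size_t row_len, size_t src_stride) {
  transfer_plan *p = malloc(sizeof *p);
  if (!p) return NULL;
  p->nruns = nrows;
  p->runs = malloc(nrows * sizeof *p->runs);
  if (!p->runs) {
    free(p);
    return NULL;
  }
  for (size_t i = 0; i < nrows; ++i) {
    p->runs[i].src_off = i * src_stride;
    p->runs[i].dst_off = i * row_len;
    p->runs[i].len = row_len;
  }
  return p;
}

/* Re-use the stored schedule; no offsets are recomputed here. */
void plan_apply(const transfer_plan *p, const double *src, double *dst) {
  for (size_t i = 0; i < p->nruns; ++i) {
    memcpy(dst + p->runs[i].dst_off, src + p->runs[i].src_off,
           p->runs[i].len * sizeof(double));
  }
}

/* Release the schedule once no further transfers are needed. */
void plan_finalize(transfer_plan *p) {
  if (!p) return;
  free(p->runs);
  free(p);
}
```

A caller would invoke `plan_init` once during setup, `plan_apply` at every step where the data must be transferred, and `plan_finalize` at shutdown, so the cost of building the schedule is paid only once.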
Comments (3)
repo owner - removed comment
I tried this patch with the current set of test suites for the symmetry thorns. The test suites still pass. Please apply.
-
reporter - changed status to resolved
Applied.