The attached patch does the following, prompted by performance problems reported by Christian Ott:
Reorganise some of the internals of thorn Slab:
Use LoopControl to parallelise loops via OpenMP.
Refactor the "work horse" routines that perform the actual copying. These routines are specialised for common cases that need to execute efficiently, in particular for the cases encountered in RotatingSymmetry90 and RotatingSymmetry180 when handling CCTK_REAL variables.
Offer an additional API (Slab_MultiTransfer_Init, Slab_MultiTransfer_Apply, Slab_MultiTransfer_Finalize) that calculates the communication schedule only once, and then re-uses it in further calls. This avoids some communication overhead.
Remove old CVS header comments.
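
To illustrate the idea behind the Init/Apply/Finalize split, here is a minimal sketch in plain C. It does not use the actual Slab_MultiTransfer_* signatures or LoopControl macros; the names copy_plan, plan_init, plan_apply and plan_finalize are hypothetical, and the "schedule" is reduced to a precomputed index map for a 1D reversal (loosely analogous to a 180-degree rotation). The point is only that the expensive setup happens once and the copy loop, parallelised here with a plain OpenMP pragma, is re-used for every variable.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a transfer schedule: for each destination
   index i, src[i] holds the source index to copy from. */
typedef struct {
  size_t n;
  size_t *src;
} copy_plan;

/* Init: compute the index map once. Returns NULL on allocation failure. */
copy_plan *plan_init(size_t n)
{
  copy_plan *p = malloc(sizeof *p);
  if (!p) return NULL;
  p->n = n;
  p->src = malloc((n ? n : 1) * sizeof *p->src);
  if (!p->src) { free(p); return NULL; }
  for (size_t i = 0; i < n; ++i)
    p->src[i] = n - 1 - i;   /* reversal, chosen only for illustration */
  return p;
}

/* Apply: re-use the precomputed map; no setup cost per call.
   The pragma is ignored when compiling without OpenMP support. */
void plan_apply(const copy_plan *p, const double *in, double *out)
{
  assert(p && in && out);
  #pragma omp parallel for
  for (long i = 0; i < (long)p->n; ++i)
    out[i] = in[p->src[i]];
}

/* Finalize: release the plan. */
void plan_finalize(copy_plan *p)
{
  if (p) { free(p->src); free(p); }
}
```

A caller would invoke plan_init once, call plan_apply for each variable or iteration, and call plan_finalize at shutdown. In the real thorn the schedule also covers inter-process communication, which this sketch leaves out entirely.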