Feature/try to reduce memory pressure in inversion
David Dickinson
Branch: feature/try_to_reduce_memory_pressure_in_inversion
Branch: next
Merged
Merged in feature/try_to_reduce_memory_pressure_in_inversion (pull request #433)
Only one processor inverts in fields local and then broadcasts the result
This changes the previous behaviour, in which each processor duplicated the inversion so that no additional communication was required. This routine already requires communication, so one additional synchronisation is probably not too costly.
This change has been found to improve performance on ARCHER2, where the problem can often be memory-bandwidth constrained.
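The invert-once-then-broadcast pattern can be sketched as below. This is a toy illustration, not the code from this pull request: Python threads stand in for processors, and `invert`, `ToyBroadcast`, and `run` are hypothetical names introduced only for this sketch. It assumes a non-singular input matrix.

```python
import threading
from fractions import Fraction


def invert(matrix):
    """Gauss-Jordan inversion using exact rational arithmetic."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row with a non-zero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


class ToyBroadcast:
    """Hypothetical stand-in for a blocking broadcast from a root rank."""

    def __init__(self):
        self._ready = threading.Event()
        self._value = None

    def bcast(self, value, rank, root=0):
        if rank == root:
            self._value = value
            self._ready.set()
        else:
            self._ready.wait()  # non-root ranks block until the root has posted
        return self._value


def run(matrix, nranks=4, root=0):
    comm = ToyBroadcast()
    results = [None] * nranks
    inversions = []  # records which ranks actually performed an inversion

    def worker(rank):
        local = None
        if rank == root:
            # New behaviour: only the root pays the cost of the inversion.
            local = invert(matrix)
            inversions.append(rank)
        results[rank] = comm.bcast(local, rank, root)

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(nranks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, inversions
```

Calling `run([[2, 1], [1, 1]])` gives every simulated rank the same inverse while `inversions` contains only the root. In the previous behaviour every rank would have called `invert` itself, trading the broadcast for duplicated memory traffic.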
If the communication of the inverted result could be done in a non-blocking manner then this could help free up processors to proceed to their next inversion.
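The non-blocking idea can be illustrated with a small sketch, again a toy analogy rather than the code in this branch. A single background thread stands in for the communication layer, and `invert_2x2`, `deliver`, and `owner_loop` are hypothetical names. In an MPI code the analogous calls would typically be `MPI_Ibcast` followed later by `MPI_Wait` or `MPI_Waitall`; that mapping is an assumption, as the description does not say how it would be implemented.

```python
from concurrent.futures import ThreadPoolExecutor
from fractions import Fraction


def invert_2x2(m):
    """Closed-form inverse of a non-singular 2x2 matrix, using exact rationals."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


def deliver(key, inverse, inboxes):
    """Toy stand-in for the data movement of a broadcast."""
    for inbox in inboxes:
        inbox[key] = inverse
    return key


def owner_loop(blocks, inboxes):
    """Invert each block, starting its delivery without waiting for it to finish."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        requests = []
        for key, block in blocks.items():
            inverse = invert_2x2(block)
            # Analogue of starting a non-blocking broadcast: submit returns at once,
            # so the loop can move straight on to the next inversion.
            requests.append(pool.submit(deliver, key, inverse, inboxes))
        # Analogue of waiting on all outstanding requests before using the results.
        return [req.result() for req in requests]
```

The point of the design is that the owner only waits once, after all of its inversions have been started, instead of once per inversion.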
This is built on top of PR #416