Feature/try to reduce memory pressure in inversion

Merged
#433

Merged pull request

Merged in feature/try_to_reduce_memory_pressure_in_inversion (pull request #433)

ead1230 · 2021-04-08

Description

Only one processor performs the inversion in fields local and then broadcasts the result

This changes the previous behaviour, in which each processor duplicated the inversion so that no additional communication was required. This routine already requires communication, so one additional synchronisation probably isn't too costly.

This change has been found to improve performance on Archer2, where the problem is often memory-bandwidth constrained.
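
As a rough illustration of the pattern described above (not the code in this PR), the following Python sketch has a single root rank do the inversion and then broadcast the result. The names `SingleRankComm`, `invert_2x2` and `invert_on_root_then_broadcast` are hypothetical, and the 2x2 inverse stands in for the real inversion. The communicator is assumed to offer mpi4py-style `Get_rank()` and `bcast(obj, root=...)` methods; `SingleRankComm` is a one-rank stand-in so the sketch runs without MPI.

```python
class SingleRankComm:
    """Hypothetical stand-in for an MPI communicator with exactly one rank.

    It mimics the two mpi4py-style methods used below so that the sketch
    runs without MPI installed.
    """

    def Get_rank(self):
        return 0

    def bcast(self, obj, root=0):
        # With a single rank, broadcasting just hands the object back.
        return obj


def invert_2x2(m):
    """Invert a 2x2 matrix given as nested lists (illustration only)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def invert_on_root_then_broadcast(comm, matrix, root=0):
    """Only `root` performs the inversion; every rank receives the inverse."""
    inverse = None
    if comm.Get_rank() == root:
        inverse = invert_2x2(matrix)
    # Collective call: every rank must reach it, so it also acts as the
    # additional synchronisation point mentioned in the description.
    return comm.bcast(inverse, root=root)


if __name__ == "__main__":
    result = invert_on_root_then_broadcast(
        SingleRankComm(), [[4.0, 7.0], [2.0, 6.0]]
    )
    print(result)
```

With mpi4py installed, passing `MPI.COMM_WORLD` instead of `SingleRankComm()` should give the same behaviour across several ranks: the non-root ranks skip the inversion and only take part in the broadcast.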

If the inverted result could be communicated in a non-blocking manner, this could help free up processors to proceed to their next inversion.

 

This is built on top of PR #416
