Alternative approach to linked boundary conditions?

Issue #181 new
David Dickinson created an issue

Currently, for our linked boundary conditions we need to send the local processor’s boundary values to all connected cells to the left/right. The code that calculates this communication pattern (in init_connected_bc) scales poorly with problem size (quadratic in nx) and is effectively serial, as it loops over the entire domain. We only do this calculation once per simulation.

The resulting communication pattern is then employed in every invert_rhs call and is implemented using point-to-point communications to send the minimal amount of information to just the processors which need it. Whilst this minimises the data transferred, it complicates the code and means we don’t take full advantage of tuned MPI routines. Could we instead simply send the boundary information to all processors which may need it using an mpi_allgather or mpi_allreduce? Sending more data may not be any more expensive if we’re latency bound.
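
As a rough illustration only: the sketch below gathers the two boundary values for every locally owned iglo onto all processors with a single collective. The subroutine and argument names (gather_linked_boundaries, g_bound_local, n_local, comm_all) are hypothetical rather than anything in the code, and it assumes every processor owns the same number of iglo points (otherwise mpi_allgatherv would be needed).

```fortran
! Hypothetical sketch, not the existing interface: replace the point-to-point
! exchange with one collective that gives every processor every boundary value.
subroutine gather_linked_boundaries(g_bound_local, n_local, nproc, comm_all, g_bound_all)
  use mpi
  implicit none
  integer, intent(in) :: n_local    ! number of iglo points owned locally (assumed equal on all procs)
  integer, intent(in) :: nproc      ! number of processors in comm_all
  integer, intent(in) :: comm_all   ! communicator spanning all processors
  complex, intent(in)  :: g_bound_local(2, n_local)        ! left/right theta-boundary value per local iglo
  complex, intent(out) :: g_bound_all(2, n_local, nproc)   ! everyone's boundary values, indexed by rank
  integer :: ierr

  ! One tuned collective instead of many small sends/receives; more data moves
  ! than strictly necessary, but that may be acceptable if we are latency bound.
  call mpi_allgather(g_bound_local, 2*n_local, MPI_COMPLEX, &
                     g_bound_all,   2*n_local, MPI_COMPLEX, &
                     comm_all, ierr)
end subroutine gather_linked_boundaries
```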

Comments (1)

  1. David Dickinson (reporter)

    We essentially want to send the values at the two theta = ±pi boundaries for each iglo to all the processors with the same {ky,l,e,s} but different kx. If we sent this to all processors, then each processor would need to hold an array of size {nkx,2,nky*nlambda*negrid*nspec}, which is only a factor of ntheta smaller than the entire global distribution function and doesn't scale with nproc. This is probably too big, so we do need to be smarter! If we could guarantee that each processor needed to communicate with the same set of processors for all the {ky,l,e,s} which it owns, then one could probably replace the separate point-to-point messages with a single collective on a sub-communicator (see the sketch at the end of this comment).

    It may still be possible to replace the point-to-point calls with a single mpi_alltoallv call (also sketched at the end of this comment), but I expect this is unlikely to gain much.
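
    A minimal sketch of the sub-communicator idea, assuming a colour value can be derived from the layout such that all processors sharing a kx-connected set of {ky,l,e,s} blocks get the same colour; make_linked_subcomm and its arguments are illustrative names only:

    ```fortran
    ! Illustrative only: group the processors that need to exchange linked
    ! boundary data into their own communicator, so the exchange can be a
    ! single collective on that group rather than many point-to-point calls.
    subroutine make_linked_subcomm(comm_all, colour, comm_linked)
      use mpi
      implicit none
      integer, intent(in)  :: comm_all     ! communicator spanning all processors
      integer, intent(in)  :: colour       ! same value on every proc in one kx-connected set
      integer, intent(out) :: comm_linked  ! sub-communicator for that set
      integer :: rank, ierr

      call mpi_comm_rank(comm_all, rank, ierr)
      ! Processors with equal colour land in the same sub-communicator;
      ! using rank as the key preserves the original ordering within it.
      call mpi_comm_split(comm_all, colour, rank, comm_linked, ierr)
    end subroutine make_linked_subcomm
    ```

    A collective such as mpi_allgatherv on comm_linked would then only carry the boundary data for the connected set a processor actually belongs to, keeping the received array far smaller than the {nkx,2,nky*nlambda*negrid*nspec} estimate above.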
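
    For completeness, a hedged sketch of the mpi_alltoallv variant; the counts and displacements would have to be built from the same connection information init_connected_bc already derives, and all names here are illustrative:

    ```fortran
    ! Illustrative only: fold the existing point-to-point pattern into a single
    ! mpi_alltoallv. The data volume is unchanged; any benefit comes from the
    ! MPI library scheduling the exchange better than hand-written send/recv pairs.
    subroutine exchange_boundaries_alltoallv(nproc, comm_all, sendbuf, sendcounts, sdispls, &
                                             recvbuf, recvcounts, rdispls)
      use mpi
      implicit none
      integer, intent(in) :: nproc, comm_all
      integer, intent(in) :: sendcounts(nproc), sdispls(nproc)  ! per-destination counts/offsets
      integer, intent(in) :: recvcounts(nproc), rdispls(nproc)  ! per-source counts/offsets
      complex, intent(in)  :: sendbuf(*)   ! boundary values packed by destination
      complex, intent(out) :: recvbuf(*)   ! boundary values received, packed by source
      integer :: ierr

      call mpi_alltoallv(sendbuf, sendcounts, sdispls, MPI_COMPLEX, &
                         recvbuf, recvcounts, rdispls, MPI_COMPLEX, &
                         comm_all, ierr)
    end subroutine exchange_boundaries_alltoallv
    ```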
