Excessive communication
Issue #176
resolved
When a process finishes calculating its part of the loss function before the others, it sits there pinging the master for new parameters until all the rest finish. I think this is creating excessive network traffic and slowing things down. Code line
I think the solution is to switch to a publish-subscribe model, rather than having everything go through request-reply. That is, we need multiple ZMQ sockets.
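A minimal sketch of what the publish-subscribe layout could look like with pyzmq, not the actual Amp code. The master publishes parameters on a PUB socket and collects partial losses on a PULL socket, so an idle worker blocks in `recv` instead of pinging. The endpoint names, `worker`, `run_master`, and the toy loss are all hypothetical, and threads with `inproc://` endpoints are used only so the example runs in a single process; a real setup would presumably use `tcp://` endpoints.

```python
import threading
import zmq

PARAMS_ENDPOINT = "inproc://params"    # hypothetical; tcp://... across machines
RESULTS_ENDPOINT = "inproc://results"  # hypothetical


def worker(ctx, worker_id, n_rounds):
    # SUB socket: the worker waits here for parameters, with no polling.
    sub = ctx.socket(zmq.SUB)
    sub.setsockopt(zmq.RCVTIMEO, 5000)  # fail instead of hanging forever
    sub.connect(PARAMS_ENDPOINT)
    sub.setsockopt(zmq.SUBSCRIBE, b"")  # subscribe to every message
    # PUSH socket: partial losses go back to the master.
    push = ctx.socket(zmq.PUSH)
    push.connect(RESULTS_ENDPOINT)
    # Tell the master we are subscribed, so the first publish is not missed.
    push.send_pyobj(("ready", worker_id, None))
    for _ in range(n_rounds):
        params = sub.recv_pyobj()
        # Toy stand-in for this worker's share of the loss function.
        partial = sum(p * (worker_id + 1) for p in params)
        push.send_pyobj(("result", worker_id, partial))
    sub.close()
    push.close()


def run_master(n_workers=3, rounds=((1.0, 2.0), (0.5, 0.5))):
    ctx = zmq.Context()
    pub = ctx.socket(zmq.PUB)
    pub.bind(PARAMS_ENDPOINT)
    pull = ctx.socket(zmq.PULL)
    pull.setsockopt(zmq.RCVTIMEO, 5000)
    pull.bind(RESULTS_ENDPOINT)
    threads = [
        threading.Thread(target=worker, args=(ctx, i, len(rounds)))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    # Wait until every worker has subscribed before publishing anything.
    for _ in range(n_workers):
        pull.recv_pyobj()
    losses = []
    for params in rounds:
        pub.send_pyobj(list(params))  # one publish reaches all workers
        losses.append(sum(pull.recv_pyobj()[2] for _ in range(n_workers)))
    for t in threads:
        t.join()
    pub.close()
    pull.close()
    ctx.term()
    return losses
```

With this layout the only traffic per iteration is one published parameter message and one result per worker; a worker that finishes early generates no messages while it waits.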
See also Esben's emails on amp-users.
Comments (2)
reporter -
reporter - changed status to resolved
This should be addressed largely by commit 81a38e7dd691171310f91fb3a16dc8da2730e44f and following.
However, we could probably cut network traffic further (and perhaps gain speed through shared memory, less message passing, and doing a little of the cost-function summing on the nodes) by switching to in-process communication within each node; that is, each node would communicate with the master only once to get parameters and return its cost-function components.
This could be done with zmq's inproc transport instead of tcp.
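A rough sketch of the node-level idea, under assumptions: `local_worker`, `node_cost`, the endpoint name, and the toy squared-error cost are all made up for illustration. One thing to keep in mind is that zmq's inproc transport only connects threads that share a single `zmq.Context` in one process; if the workers on a node are separate processes, the `ipc://` transport would be the closer fit. Here the local workers are threads, the parameters are shared in memory, and the node sums the partial costs so that only one number would need to go back to the master.

```python
import threading
import zmq


def local_worker(ctx, endpoint, params, chunk):
    # Each local worker pushes its partial cost to the node-level collector.
    push = ctx.socket(zmq.PUSH)
    push.connect(endpoint)
    slope, intercept = params
    # Toy cost: squared error of a linear model on this worker's data chunk.
    cost = sum((slope * x + intercept - y) ** 2 for x, y in chunk)
    push.send_pyobj(cost)
    push.close()


def node_cost(params, chunks, endpoint="inproc://node-costs"):
    """Sum the partial costs of all local workers on one node."""
    ctx = zmq.Context()
    pull = ctx.socket(zmq.PULL)
    pull.setsockopt(zmq.RCVTIMEO, 5000)  # fail instead of hanging forever
    pull.bind(endpoint)  # bind before the workers connect
    threads = [
        threading.Thread(target=local_worker, args=(ctx, endpoint, params, c))
        for c in chunks
    ]
    for t in threads:
        t.start()
    # Summing here means the master would receive one value per node,
    # not one per worker.
    total = sum(pull.recv_pyobj() for _ in chunks)
    for t in threads:
        t.join()
    pull.close()
    ctx.term()
    return total
```

For example, `node_cost((2.0, 1.0), [[(0.0, 1.0), (1.0, 3.0)], [(2.0, 5.0)], [(3.0, 8.0)]])` runs three local workers and returns a single summed cost for the node.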