Mandatory data flush needs to be phased out of DTD

Issue #159 new
Reazul Hoque created an issue

Flushing all data is currently mandatory in a DTD program. This makes DTD inflexible when composing multiple DTD kernels: flushing forces all data back to the initial data distribution in a distributed run, which introduces additional dependencies and, in unfavorable cases, extra data transfer.

Why do we have this data_flush? In DTD, data_flush has two meanings: it indicates the last usage of a datum, and it instructs the engine to transport the datum back to its original owner (the owner according to the initial data distribution). This lets us reuse memory buffers and, in turn, achieve better performance.

How to phase it out: we need two additional structures in the DTD system for more intelligent data management, the first being a cache and the second an LRU list. Every time we receive remote data we will push it into a cache indexed by the data_key and the version. When we flush the data, it will be moved into the LRU and, based on some threshold, reused as we need memory buffers for incoming data. Any data still lingering in the LRU will be cleaned up when we destroy the data completely.
