Need an efficient dynamic hash table

Issue #142 resolved
Thomas Herault created an issue

Hash tables are currently of static size. They are used by DTD and PTG to store tasks and keep track of the dependency information.

Ideally, the size of these tables should be in the order of the parallelism available on the DAG. In PTG, for specific domains (e.g. DPLASMA), the user might provide a reasonable hint for this size. But in DTD, and sometimes in PTG for dynamic graphs, this is not possible.

A better approach would be to use dynamic hash tables that can resize themselves as the initial size hint is proven wrong. We discussed two approaches:

  • Have an MCA parameter that defines the maximal number of collisions in a given bucket; if that bucket exceeds that number, switch from linear list to a secondary / ternary hash table; Keep the lock at the highest level bucket to manage atomic switching, and rely on the O(1) acceleration to reduce/limit the time that lock is taken.
  • Have an MCA parameter that defines the maximal fillup ratio of an hash table, and if that ratio is reached, create a new hash table for insertion. Use a help-first lock-free strategy to move elements from the old hash table to the new one when elements are looked-up; only insert elements in the new hash table, removing/migrating elements from the old as they are used.

The best strategy should be investigated.

Comments (2)

  1. Log in to comment