What is HeteroDoop?

HeteroDoop is a MapReduce framework that employs both CPUs and GPUs in a cluster. HeteroDoop offers the following novel features: (i) a small set of directives can be placed on an existing sequential, CPU-only program, expressing MapReduce semantics; (ii) an optimizing compiler translates the directive-augmented program into a GPU code; (iii) a runtime system assists the compiler in handling MapReduce semantics on the GPU; and (iv) a tail scheduling scheme minimizes job execution time in light of disparate processing capabilities of CPUs and GPUs. HeteroDoop is built on top of the state-of-the-art, CPU-only Hadoop MapReduce framework, inheriting its functionality.

Why should one use HeteroDoop?

The deluge of data has inspired big-data processing frameworks that span across large clusters. Frameworks for MapReduce, a state-of-the-art programming model, have primarily made use of the CPUs in distributed systems, leaving out computationally powerful accelerators such as GPUs. HeteroDoop is a framework that exploits both CPUs and GPUs in a cluster, and hence, can obtain a good performance boost on compute-intensive MapReduce applications.

Need help? Send us an email

Wiki

HeteroDoop / Home

What is HeteroDoop?

Why should one use HeteroDoop?