Volume 2, Issue 4, November 2014, Pages 211 - 220
Embedding GPU Computations in Hadoop
- Jie Zhu, Hai Jiang, Juanjuan Li, Erikson Hardesty, Kuan-Ching Li, Zhongwen Li
- Corresponding Author
- Jie Zhu
Available Online 15 October 2017.
- https://doi.org/10.2991/ijndc.2014.2.4.2
- Hadoop, MapReduce, GPU, CUDA
- As the size of high performance applications increases, four major challenges, namely heterogeneity, programmability, fault resilience, and energy efficiency, have arisen in the underlying distributed systems. To tackle all of them without sacrificing performance, traditional approaches to resource utilization, task scheduling, and programming paradigms should be reconsidered. While Hadoop has handled data-intensive applications well in Clouds, GPUs have demonstrated their effectiveness in accelerating computation-intensive ones. This paper addresses approaches that let Hadoop exploit both CPU and GPU resources effectively to handle the aforementioned challenges. Hadoop schedules MapReduce’s Map and Reduce functions across multiple computing nodes through Java, whereas CUDA code further accelerates local computations on attached GPUs, so that all available heterogeneous computational power is utilized. MapReduce in Hadoop eases the programming task by hiding communication and scheduling details. The Hadoop Distributed File System helps achieve data-level fault resilience. The GPU’s energy efficiency helps reduce the power consumption of the whole system. To utilize GPUs in Hadoop, four approaches, JCuda, JNI, Hadoop Streaming, and Hadoop Pipes, have been implemented and analyzed. Experimental results demonstrate and compare their effectiveness.
- Open Access
- This is an open access article distributed under the CC BY-NC license.
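Of the four approaches named in the abstract, Hadoop Streaming is the easiest to illustrate in isolation, because a Streaming mapper is simply a program that reads records from standard input and writes tab-separated key/value records to standard output. The following is a minimal sketch under stated assumptions, not the authors' implementation: the squaring workload, the names `square_batch_cpu` and `map_records`, and the batch size are all hypothetical, and the kernel shown runs on the CPU as a stand-in for the point where a CUDA call would be made. Batching is used because offloading to a GPU generally pays off only when many records are processed per call.

```python
import sys
from typing import Callable, Iterable, Iterator, List


def square_batch_cpu(values: List[float]) -> List[float]:
    """CPU stand-in for a GPU kernel (hypothetical workload: squaring).

    In a GPU-enabled mapper this function would be replaced by one that
    copies the batch to the device, launches a CUDA kernel, and copies
    the results back.
    """
    return [v * v for v in values]


def map_records(
    lines: Iterable[str],
    batch_size: int = 4096,
    kernel: Callable[[List[float]], List[float]] = square_batch_cpu,
) -> Iterator[str]:
    """Yield 'key<TAB>value' records, processing values in batches.

    Assumes each non-empty input line has the form 'key<TAB>number'.
    """
    keys: List[str] = []
    values: List[float] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, _, raw = line.partition("\t")
        keys.append(key)
        values.append(float(raw))
        if len(values) >= batch_size:
            # One kernel call per full batch.
            for k, r in zip(keys, kernel(values)):
                yield f"{k}\t{r}"
            keys, values = [], []
    if values:
        # Flush the final, possibly partial, batch.
        for k, r in zip(keys, kernel(values)):
            yield f"{k}\t{r}"


def main() -> None:
    """Entry point when the file is used as a Streaming mapper script."""
    for record in map_records(sys.stdin):
        print(record)
```

For example, `list(map_records(["a\t2", "b\t3"], batch_size=2))` returns `["a\t4.0", "b\t9.0"]`. When the file is run as a mapper, `main()` would be called at the bottom of the script so that Hadoop Streaming can feed input splits through standard input.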
Cite this article
TY - JOUR
AU - Jie Zhu
AU - Hai Jiang
AU - Juanjuan Li
AU - Erikson Hardesty
AU - Kuan-Ching Li
AU - Zhongwen Li
PY - 2017
DA - 2017/10
TI - Embedding GPU Computations in Hadoop
JO - International Journal of Networked and Distributed Computing
SP - 211
EP - 220
VL - 2
IS - 4
SN - 2211-7946
UR - https://doi.org/10.2991/ijndc.2014.2.4.2
DO - https://doi.org/10.2991/ijndc.2014.2.4.2
ID - Zhu2017
ER -