Accelerating the Shuffle Phase to Speed up MapReduce Systems
- https://doi.org/10.2991/iceeecs-16.2016.15
- RDMA; MapReduce; Shuffle Phase
Traditional CPU-centric network protocol processing limits the utilization of network bandwidth, even on high-speed networks (100 Gbps), and the limitation is more pronounced in big data systems. Remote Direct Memory Access (RDMA), a high-performance network technology, accesses a remote application's memory directly without involving the destination CPU, which extends the achievable performance. In this paper, we build a pluggable shuffle module based on the Unreliable Datagram transport of RDMA to accelerate MapReduce systems; the module includes data fragmentation and a "Dynamic Controllably Apply and Allot" mechanism to ensure reliable data transmission. Experimental results show that RDMA-based Spark performs about 16% better than IPoIB-based Spark.
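The data fragmentation step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the header layout, the names `fragment` and `Reassembler`, and the 4096-byte limit are assumptions chosen for illustration (an InfiniBand Unreliable Datagram message must fit in a single MTU, which is at most 4096 bytes). The sketch only splits and reassembles a shuffle block; it does not model the "Dynamic Controllably Apply and Allot" mechanism, whose details are not given in the abstract.

```python
import struct

# Hypothetical fragment header: message id, fragment index, fragment count.
HEADER = struct.Struct("!IHH")
# Assumed per-datagram size limit in bytes (header included).
MTU = 4096


def fragment(msg_id, payload, mtu=MTU):
    """Split one payload into datagrams that each fit within mtu bytes."""
    chunk = mtu - HEADER.size
    # Ceiling division; an empty payload still yields one (empty) fragment.
    total = max(1, -(-len(payload) // chunk))
    if total > 0xFFFF:
        raise ValueError("payload too large for a 16-bit fragment count")
    return [
        HEADER.pack(msg_id, i, total) + payload[i * chunk:(i + 1) * chunk]
        for i in range(total)
    ]


class Reassembler:
    """Collects fragments that may arrive out of order or duplicated."""

    def __init__(self):
        self.pending = {}  # msg_id -> {fragment index: fragment bytes}

    def accept(self, datagram):
        """Store one datagram; return the full payload once it is complete."""
        msg_id, index, total = HEADER.unpack_from(datagram)
        parts = self.pending.setdefault(msg_id, {})
        parts[index] = datagram[HEADER.size:]  # a duplicate simply overwrites
        if len(parts) < total:
            return None
        del self.pending[msg_id]
        return b"".join(parts[i] for i in range(total))

    def missing(self, msg_id, total):
        """Fragment indices not yet received, e.g. to request retransmission."""
        parts = self.pending.get(msg_id, {})
        return [i for i in range(total) if i not in parts]
```

A receiver would call `accept` for every incoming datagram and could use `missing` to decide which fragments to request again, since an unreliable transport may drop or reorder them.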
- © 2016, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Rujie Yu
AU - Songping Yu
AU - Nong Xiao
PY - 2016/12
DA - 2016/12
TI - Accelerating the Shuffle Phase to Speed up MapReduce Systems
BT - Proceedings of the 2016 4th International Conference on Electrical & Electronics Engineering and Computer Science (ICEEECS 2016)
PB - Atlantis Press
SP - 71
EP - 74
SN - 2352-538X
UR - https://doi.org/10.2991/iceeecs-16.2016.15
DO - https://doi.org/10.2991/iceeecs-16.2016.15
ID - Yu2016/12
ER -