International Journal of Networked and Distributed Computing

Volume 4, Issue 3, July 2016, Pages 173 - 181

Paralleled Fast Search and Find of Density Peaks Clustering Algorithm on GPUs with CUDA

Authors
Mi Li, Jie Huang, Jingpeng Wang
Corresponding Author
Jie Huang
Available Online 1 July 2016.
DOI
10.2991/ijndc.2016.4.3.4
Keywords
Clustering; FSFDP; CUDA; Shared memory; Stream; GPU clusters.
Abstract

Fast Search and Find of Density Peaks (FSFDP) is a recently proposed clustering algorithm that has already been applied successfully in many applications. However, it performs poorly on large datasets because computing the distance matrix and the potentials is time-consuming. In this paper, we propose a GPU-accelerated FSFDP implemented with CUDA to improve its performance. The thread/block models and the shared memory usage are designed specifically to maximize utilization of the GPU's hardware resources, and a merge accumulation algorithm based on the odd and even positions of an array is also introduced. Experimental results show that our parallel implementation of FSFDP achieves speedups of 4.39X for the distance matrix calculation and 15.75X for the potential calculation, compared with a serial program on a single CPU core. Higher speedups can be expected for larger datasets until the device limits are reached. In addition, the CUDA stream mechanism is employed, and extra time savings are obtained by hiding the memory latency of multiple kernels through two-way stream scheduling. Finally, we evaluate our GPU-based implementation on a GPU cluster of 9 nodes; compared with a single GPU node, the program achieves a further 7.55X speedup.
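
For orientation, the two costly steps named in the abstract can be sketched serially. This is a minimal Python illustration, not the authors' CUDA implementation. It assumes that the "potentials" correspond to the per-point quantities of the original FSFDP formulation (local density with a cutoff distance, and distance to the nearest point of higher density). The function names, the cutoff parameter `d_c`, and the `pairwise_sum` helper are hypothetical. The helper is only one plausible reading of "merge accumulation based on the odd and even positions of an array"; the abstract does not specify the scheme.

```python
import math


def distance_matrix(points):
    """Symmetric matrix of Euclidean distances between all pairs of points."""
    n = len(points)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = math.dist(points[i], points[j])
    return d


def pairwise_sum(values):
    """Tree-style sum (assumed scheme): fold each odd-position element
    into its even-position neighbour, then repeat on the shorter array."""
    vals = list(values)
    while len(vals) > 1:
        merged = [vals[k] + vals[k + 1] for k in range(0, len(vals) - 1, 2)]
        if len(vals) % 2:  # an unpaired last element is carried over
            merged.append(vals[-1])
        vals = merged
    return vals[0] if vals else 0


def local_density(d, d_c):
    """Number of other points closer than the cutoff d_c, for each point."""
    n = len(d)
    return [
        pairwise_sum(1 if (j != i and d[i][j] < d_c) else 0 for j in range(n))
        for i in range(n)
    ]


def nearest_higher_distance(d, rho):
    """Distance to the nearest point of strictly higher density; points with
    no denser neighbour get their largest distance instead."""
    n = len(d)
    result = []
    for i in range(n):
        higher = [d[i][j] for j in range(n) if rho[j] > rho[i]]
        result.append(min(higher) if higher else max(d[i]))
    return result
```

Each row of the distance matrix and each per-point sum is independent of the others, which is the property that makes these steps natural candidates for a one-thread-per-element or one-block-per-row mapping on a GPU.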

Copyright
© 2017, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Journal
International Journal of Networked and Distributed Computing
Volume-Issue
4 - 3
Pages
173 - 181
Publication Date
2016/07/01
ISSN (Online)
2211-7946
ISSN (Print)
2211-7938

Cite this article

TY  - JOUR
AU  - Mi Li
AU  - Jie Huang
AU  - Jingpeng Wang
PY  - 2016
DA  - 2016/07/01
TI  - Paralleled Fast Search and Find of Density Peaks Clustering Algorithm on GPUs with CUDA
JO  - International Journal of Networked and Distributed Computing
SP  - 173
EP  - 181
VL  - 4
IS  - 3
SN  - 2211-7946
UR  - https://doi.org/10.2991/ijndc.2016.4.3.4
DO  - 10.2991/ijndc.2016.4.3.4
ID  - Li2016
ER  -