Performance Analysis of Data Mining Algorithms Based on PCA
- DOI
- 10.2991/meic-15.2015.345
- Keywords
- PCA; Classification; Clustering; Spectrum; Cataclysmic Variable Star
- Abstract
Data mining algorithms behave differently in different application contexts, so characterizing the relevant algorithms is an important topic. This paper studies PCA-based dimension reduction and the performance of four data mining algorithms (ANN, Bayes, KNN, and K-means) under different dimension reduction rates when finding Cataclysmic Variable Stars (CVs) in a hybrid celestial spectra dataset. The dataset, selected from SDSS (Sloan Digital Sky Survey), contains 1417 spectra in total, of which 15 are CVs and the rest are other types of celestial bodies. Through a series of experiments, the classification accuracy and time cost of the four algorithms in finding CVs are analyzed in detail under different PCA dimensions. The study helps clarify the inherent characteristics of the four algorithms and supports better choices in future data mining applications.
- Copyright
- © 2015, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
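The experimental setup described in the abstract (reduce the data with PCA to several dimensions, then record accuracy and time cost for a classifier at each dimension) can be illustrated with a minimal sketch. This is not the authors' code. It assumes synthetic, balanced two-class data in place of the SDSS spectra, uses only KNN out of the four algorithms, and approximates the principal directions with power iteration plus Gram-Schmidt orthogonalization so that it runs with the standard library alone. All function names and parameters below are hypothetical.

```python
import random
import time


def make_data(n_per_class=40, n_features=12, seed=0):
    """Synthetic stand-in for spectra (assumption): two well-separated classes."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            X.append([centre + rng.gauss(0.0, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


def pca_fit(X, n_components, n_iter=100, seed=0):
    """Approximate the leading principal directions by power iteration."""
    n, d = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(d)]
    centred = [[row[j] - mean[j] for j in range(d)] for row in X]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    rng = random.Random(seed)
    components = []
    for _ in range(n_components):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(n_iter):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            # Gram-Schmidt: keep w orthogonal to the directions already found.
            for c in components:
                dot = sum(a * b for a, b in zip(w, c))
                w = [a - dot * b for a, b in zip(w, c)]
            norm = sum(a * a for a in w) ** 0.5
            v = [a / norm for a in w]
        components.append(v)
    return mean, components


def pca_transform(X, mean, components):
    """Project centred rows onto the fitted directions."""
    return [[sum((x - m) * c for x, m, c in zip(row, mean, comp))
             for comp in components] for row in X]


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k nearest training points (squared Euclidean)."""
    dists = sorted((sum((a - b) ** 2 for a, b in zip(row, query)), label)
                   for row, label in zip(train_X, train_y))
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)


def evaluate(n_components, X_train, y_train, X_test, y_test):
    """Return (accuracy, elapsed seconds) for PCA + KNN at one dimension."""
    start = time.perf_counter()
    mean, comps = pca_fit(X_train, n_components)
    Z_train = pca_transform(X_train, mean, comps)
    Z_test = pca_transform(X_test, mean, comps)
    preds = [knn_predict(Z_train, y_train, z) for z in Z_test]
    elapsed = time.perf_counter() - start
    accuracy = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
    return accuracy, elapsed


X, y = make_data()
idx = list(range(len(X)))
random.Random(1).shuffle(idx)          # fixed seed for a repeatable split
cut = int(0.75 * len(idx))
X_train, y_train = [X[i] for i in idx[:cut]], [y[i] for i in idx[:cut]]
X_test, y_test = [X[i] for i in idx[cut:]], [y[i] for i in idx[cut:]]

results = {k: evaluate(k, X_train, y_train, X_test, y_test) for k in (1, 2, 4, 8)}
for k, (acc, secs) in results.items():
    print(f"PCA dims={k}: accuracy={acc:.2f}, time={secs * 1000:.1f} ms")
```

With real spectra the classes are heavily imbalanced (15 CVs among 1417 spectra), so plain accuracy is likely to be a weak indicator on its own. A practical pipeline would also use a library eigen-solver rather than power iteration, and would time the reduction and classification steps separately.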
Cite this article
TY - CONF
AU - Ruifeng Bai
AU - Lin Yang
AU - Jie Wang
AU - Jingchang Pan
PY - 2015/04
DA - 2015/04
TI - Performance Analysis of Data Mining Algorithms Based on PCA
BT - Proceedings of the 2015 International Conference on Mechatronics, Electronic, Industrial and Control Engineering
PB - Atlantis Press
SP - 1506
EP - 1509
SN - 2352-5401
UR - https://doi.org/10.2991/meic-15.2015.345
DO - 10.2991/meic-15.2015.345
ID - Bai2015/04
ER -