Proceedings of the 2nd International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2016)

Bagging-Based Logistic Regression With Spark: A Medical Data Mining Method

Authors
Jian Pan, Yiang Hua, Xingtian Liu, Zhiqiang Chen, Zhaofeng Yan
Corresponding Author
Jian Pan
Available Online April 2016.
DOI
10.2991/ameii-16.2016.288
Keywords
Medical Data Mining, Bagging, Logistic Regression, Spark
Abstract

Medical data, held in various organizational forms, is voluminous and heterogeneous, so efficient data mining techniques are important for exploring the patterns by which diverse diseases develop. However, many single-node data analysis tools lack sufficient memory and computing power, which creates strong demand for distributed and parallel computing. In this paper, we propose a comprehensive medical data mining method consisting of data preprocessing and bagging-based logistic regression with Spark (the BLR algorithm), which is adapted for better compatibility with Spark, a fast parallel computing framework. Experimental results indicate that, although the BLR algorithm took slightly longer to run than logistic regression (LR), its accuracy was 2.12% higher and it outperformed LR on other common evaluation metrics.
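The general idea behind bagging a logistic regression model can be sketched as follows. This is a minimal single-machine illustration in plain Python, not the authors' BLR algorithm and not Spark code: the function names (`train_lr`, `train_bagged_lr`, `predict_bagged`), the toy data, the gradient-descent settings, and the majority-vote rule are all assumptions made for the example. In a Spark setting, the per-sample training loop is the part one would expect to distribute.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_lr(X, y, epochs=200, lr=0.1):
    """Fit logistic regression by batch gradient descent.

    Returns the weights as [bias, w1, w2, ...].
    """
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def predict_lr(w, x):
    # Classify as 1 when the predicted probability is at least 0.5.
    p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))
    return 1 if p >= 0.5 else 0

def train_bagged_lr(X, y, n_models=11, seed=0):
    """Train n_models LR models, each on a bootstrap sample of (X, y)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw n row indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(train_lr([X[i] for i in idx], [y[i] for i in idx]))
    return models

def predict_bagged(models, x):
    # Aggregate the base models by majority vote.
    votes = sum(predict_lr(w, x) for w in models)
    return 1 if 2 * votes > len(models) else 0

# Hypothetical toy data: one feature, two well-separated classes.
X = [[-3.0], [-2.5], [-2.0], [-1.5], [-1.0],
     [1.0], [1.5], [2.0], [2.5], [3.0]]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

models = train_bagged_lr(X, y)
print(predict_bagged(models, [-2.0]), predict_bagged(models, [2.0]))
```

An odd number of base models is used here so that the majority vote cannot tie. Training several models instead of one also suggests why a bagged ensemble would take somewhat longer to run than a single LR model, which is consistent with the trade-off the abstract reports.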

Copyright
© 2016, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the 2nd International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2016)
Series
Advances in Engineering Research
Publication Date
April 2016
ISSN
2352-5401

Cite this article

TY  - CONF
AU  - Jian Pan
AU  - Yiang Hua
AU  - Xingtian Liu
AU  - Zhiqiang Chen
AU  - Zhaofeng Yan
PY  - 2016/04
DA  - 2016/04
TI  - Bagging-Based Logistic Regression With Spark: A Medical Data Mining Method
BT  - Proceedings of the 2nd International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2016)
PB  - Atlantis Press
SN  - 2352-5401
UR  - https://doi.org/10.2991/ameii-16.2016.288
DO  - 10.2991/ameii-16.2016.288
ID  - Pan2016/04
ER  -