Proceedings of the First Mandalika International Multi-Conference on Science and Engineering 2022, MIMSE 2022 (Informatics and Computer Science) (MIMSE-I-C-2022)

An Optimized Framework for Breast Cancer Prediction Using Classification and Regression Tree

Authors
Asma Agaal¹,*, Mansour Essgaer¹
¹Department of Artificial Intelligence, Faculty of Information Technology, Sebha University, Sebha, Libya
*Corresponding author. Email: asma.agaal@sebhau.edu.ly
Corresponding Author
Asma Agaal
Available Online 26 December 2022.
DOI
10.2991/978-94-6463-084-8_33
Keywords
Breast Cancer; CART; Sebha Oncology Center; Dimensionality reduction with cross-validation; Grid search; Hyper-parameter tuning
Abstract

Several machine learning algorithms have been proposed in recent years to build accurate classification systems for a wide range of diseases, such as cancers, hepatitis, and coronavirus. In this study, the Classification and Regression Tree (CART) is proposed for predicting breast cancer at an early stage and is then applied to real data collected from the Sebha Oncology Center. The study focuses on improving CART accuracy through three methods: (1) cross-validation, (2) dimensionality reduction, and (3) hyper-parameter tuning. Two cross-validation strategies are investigated: K-fold and stratified K-fold. Dimensionality reduction is then used to determine the most effective features, using two methods: recursive feature elimination with cross-validation and principal component analysis. Finally, the best CART parameters are sought using two optimization algorithms: grid search and random search. The experimental results show that the best CART model, which achieved 97% accuracy, uses stratified K-fold as the cross-validation strategy, recursive feature elimination with cross-validation for dimensionality reduction, and grid search as the parameter-tuning algorithm. Compared with the original CART, the accuracy of the proposed CART improved from 63% to 97%.
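
The best-performing combination named in the abstract (stratified K-fold, recursive feature elimination with cross-validation, and grid search over CART parameters) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: it assumes scikit-learn, whose DecisionTreeClassifier is an optimized CART variant; it uses the breast cancer dataset bundled with scikit-learn as a stand-in for the Sebha Oncology Center data; and the fold count, split ratio, and parameter grid are hypothetical choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFECV
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (assumption): the dataset bundled with scikit-learn,
# not the Sebha Oncology Center data described in the abstract.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Stratified folds keep the class ratio roughly equal in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Recursive feature elimination with cross-validation, ranking features
# by the importances of a CART-style decision tree.
selector = RFECV(DecisionTreeClassifier(random_state=0), cv=cv, scoring="accuracy")
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Hypothetical parameter grid; the abstract does not list the values searched.
param_grid = {
    "criterion": ["gini", "entropy"],
    "max_depth": [3, 5, None],
    "min_samples_leaf": [1, 5],
}
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0), param_grid, cv=cv, scoring="accuracy"
)
search.fit(X_train_sel, y_train)

n_selected = selector.n_features_
test_accuracy = search.score(X_test_sel, y_test)
print(f"features kept: {n_selected} of {X.shape[1]}")
print(f"best parameters: {search.best_params_}")
print(f"held-out accuracy: {test_accuracy:.3f}")
```

For brevity, feature selection here runs once before the grid search, so the cross-validated scores inside the search are slightly optimistic; the held-out test split is untouched by both steps. The numbers this sketch prints come from the stand-in data and are not comparable to the 63% and 97% figures reported above.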

Copyright
© 2022 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the First Mandalika International Multi-Conference on Science and Engineering 2022, MIMSE 2022 (Informatics and Computer Science) (MIMSE-I-C-2022)
Series
Advances in Computer Science Research
Publication Date
26 December 2022
ISBN
978-94-6463-084-8
ISSN
2352-538X

Cite this article

TY  - CONF
AU  - Asma Agaal
AU  - Mansour Essgaer
PY  - 2022
DA  - 2022/12/26
TI  - An Optimized Framework for Breast Cancer Prediction Using Classification and Regression Tree
BT  - Proceedings of the First Mandalika International Multi-Conference on Science and Engineering 2022, MIMSE 2022 (Informatics and Computer Science) (MIMSE-I-C-2022)
PB  - Atlantis Press
SP  - 398
EP  - 412
SN  - 2352-538X
UR  - https://doi.org/10.2991/978-94-6463-084-8_33
DO  - 10.2991/978-94-6463-084-8_33
ID  - Agaal2022
ER  -