International Journal of Computational Intelligence Systems

Volume 11, Issue 1, 2018, Pages 1229 - 1247

A Three-Stage Based Ensemble Learning for Improved Software Fault Prediction: An Empirical Comparative Study

Authors
Chubato Wondaferaw Yohannese (freewwin@yahoo.com), Tianrui Li (trli@swjtu.edu.cn), Kamal Bashir (kamalbashir1@yahoo.com)
School of Information Science and Technology, Southwest Jiaotong University, Chengdu 611756, China
Received 20 March 2018, Accepted 19 June 2018, Available Online 4 July 2018.
DOI
10.2991/ijcis.11.1.92
Keywords
Software Fault Prediction; Software Testing; Ensemble Learning Algorithms; Feature Selection; Data Balancing; Noise Filtering
Abstract

Software Fault Prediction (SFP) research has made considerable efforts to accurately predict the fault proneness of software modules, and thereby to make the best use of scarce software test resources, reduce maintenance costs, and contribute to the production of quality software products. In this regard, Machine Learning (ML) has been successfully applied to solve classification problems for SFP. However, SFP faces many challenges caused by redundant and irrelevant features, class imbalance, and the presence of noise in software defect datasets. No single ML technique handles all of these challenges, and they may degrade performance depending on the predictor's sensitivity to data corruption. In the literature, it is widely claimed that building ensemble classifiers from preprocessed datasets and combining their predictions is an interesting method of overcoming the individual problems of each classifier. This claim is usually not supported by thorough empirical studies that consider the problems arising when ensembles are combined with techniques for resolving the different challenges in defect datasets and, therefore, it must be carefully studied. Thus, the objective of this paper is to conduct large-scale, comprehensive experiments to study the effect of resolving those challenges in SFP in three stages, in order to improve the practice and performance of SFP. In addition, the paper presents a thorough and statistically sound comparison of these techniques in each stage. Accordingly, a new three-stage ensemble learning framework that efficiently handles those challenges in combination is proposed. The experimental results confirm the robustness of the combined techniques in each stage. In particular, high performance was achieved using combined Ensemble Learning Algorithms (ELA) on selected features of balanced data after removing noisy instances. Therefore, as shown in this study, ensemble techniques used for SFP must be carefully examined and combined with techniques that resolve those challenges in order to obtain robust performance and accurately identify fault-prone software modules.
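
As an illustration only, the ordering named in the abstract (noise removal, then data balancing, then feature selection, then an ensemble) can be sketched in plain Python. The abstract does not say which concrete techniques the paper uses, so every component here is an assumption chosen for brevity: k-nearest-neighbour editing as the noise filter, random oversampling as the balancer, a class-mean gap as the feature ranker, and bagged nearest-centroid classifiers as the ensemble. All function names and the toy data are hypothetical.

```python
import random
from collections import Counter

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def filter_noise(X, y, k=3):
    # Assumed noise filter (k-NN editing): drop a row when the majority
    # label of its k nearest neighbours disagrees with its own label.
    keep = []
    for i in range(len(X)):
        near = sorted((sq_dist(X[i], X[j]), j) for j in range(len(X)) if j != i)[:k]
        majority = Counter(y[j] for _, j in near).most_common(1)[0][0]
        if majority == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]

def balance(X, y, rng):
    # Assumed balancer: randomly duplicate minority rows until every
    # class has as many rows as the largest class.
    counts = Counter(y)
    target = max(counts.values())
    Xb, yb = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            Xb.append(rng.choice(pool))
            yb.append(label)
    return Xb, yb

def select_features(X, y, top):
    # Assumed ranker: keep the features with the largest absolute gap
    # between the class-0 mean and the class-1 mean.
    def gap(f):
        means = []
        for label in (0, 1):
            vals = [x[f] for x, lab in zip(X, y) if lab == label]
            means.append(sum(vals) / len(vals))
        return abs(means[0] - means[1])
    ranked = sorted(range(len(X[0])), key=gap, reverse=True)
    return sorted(ranked[:top])

def train_centroids(X, y):
    # Base learner: one mean vector per class.
    cents = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        cents[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return cents

def nearest_centroid(cents, x):
    return min(cents, key=lambda lab: sq_dist(x, cents[lab]))

def fit_three_stage(X, y, top=1, n_models=5, seed=0):
    rng = random.Random(seed)
    X, y = filter_noise(X, y)            # remove noisy instances
    X, y = balance(X, y, rng)            # balance the classes
    feats = select_features(X, y, top)   # select features
    Xs = [[x[f] for f in feats] for x in X]
    models = []
    for _ in range(n_models):            # bagging: one model per bootstrap sample
        idx = [rng.randrange(len(Xs)) for _ in Xs]
        models.append(train_centroids([Xs[i] for i in idx], [y[i] for i in idx]))
    return feats, models

def predict(fitted, x):
    # Majority vote over the bagged base learners.
    feats, models = fitted
    xs = [x[f] for f in feats]
    return Counter(nearest_centroid(m, xs) for m in models).most_common(1)[0][0]

# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
# Label 1 marks a fault-prone module; the last row is deliberately mislabeled.
X = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0], [1.1, 6.0], [0.9, 2.0], [1.3, 4.5],
     [8.0, 5.0], [8.2, 3.5], [7.9, 4.2], [1.0, 4.1]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]

fitted = fit_three_stage(X, y)
print("selected features:", fitted[0])
print("prediction for [7.5, 4.0]:", predict(fitted, [7.5, 4.0]))
```

Each stage is a separate function so that any one of them could be swapped for a different technique and the stages compared in isolation, which is the kind of per-stage comparison the abstract describes.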

Copyright
© 2018, the Authors. Published by Atlantis Press.
Open Access
This is an open access article under the CC BY-NC license (http://creativecommons.org/licences/by-nc/4.0/).


Journal
International Journal of Computational Intelligence Systems
Volume-Issue
11 - 1
Pages
1229 - 1247
Publication Date
2018/07/04
ISSN (Online)
1875-6883
ISSN (Print)
1875-6891

Cite this article

TY  - JOUR
AU  - Chubato Wondaferaw Yohannese
AU  - Tianrui Li
AU  - Kamal Bashir
PY  - 2018
DA  - 2018/07/04
TI  - A Three-Stage Based Ensemble Learning for Improved Software Fault Prediction: An Empirical Comparative Study
JO  - International Journal of Computational Intelligence Systems
SP  - 1229
EP  - 1247
VL  - 11
IS  - 1
SN  - 1875-6883
UR  - https://doi.org/10.2991/ijcis.11.1.92
DO  - 10.2991/ijcis.11.1.92
ID  - Yohannese2018
ER  -