
Software fault prediction (SFP) is a challenging process that any successful software project should go through to ensure that its components are free of faults. Soft computing and machine learning methods are generally useful in tackling this problem. Fault data is usually large because it is mined from historical software repositories, and it consists of a large number of features (metrics). Determining the most valuable features, i.e., feature selection (FS), is an effective way to reduce data dimensionality. In this paper, we propose an enhanced version of the Whale Optimization Algorithm (WOA) that combines it with a single-point crossover method. The proposed enhancement helps the WOA escape from local optima by strengthening the exploration process. Five different selection methods are employed: tournament, roulette wheel, linear rank, stochastic universal sampling, and random-based selection. To evaluate the performance of the proposed enhancement, 17 available SFP datasets are adopted from the PROMISE repository. A detailed analysis shows that the proposed approach outperforms the original WOA and six other state-of-the-art methods, and also improves the overall performance of the machine learning classifier.
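As a rough illustration of the two operators named in the abstract, the following minimal sketch shows tournament selection and single-point crossover on binary feature masks. It is not the authors' implementation: the list-of-bits representation, the assumption that lower fitness is better, and the function names `tournament_select` and `single_point_crossover` are all hypothetical, and the abstract does not state where in the WOA loop the crossover is applied.

```python
import random


def tournament_select(population, fitness, k=2, rng=random):
    """Return the fittest of k randomly drawn individuals.

    Assumption: fitness is minimized (e.g., a classification error rate).
    """
    contenders = rng.sample(range(len(population)), k)
    best = min(contenders, key=lambda i: fitness[i])
    return population[best]


def single_point_crossover(parent_a, parent_b, rng=random):
    """Swap the tails of two equal-length binary masks at one random cut point."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    # Choose the cut strictly inside the vector so both parents contribute.
    cut = rng.randint(1, len(parent_a) - 1)
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b
```

In a sketch like this, each bit of a mask marks whether the corresponding software metric is kept, so crossing a whale's position with a selected partner produces a new candidate feature subset that may lie outside the current search neighborhood.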
feature selection, classification, Software fault prediction, binary whale optimization algorithm, adaptive synthetic sampling, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
