
The term ‘Code Smells’ was first coined in the book Refactoring: Improving the design of existing code by M Fowler in 1999. Code smells are poor design choices which have the potential to cause an error or failure in a computer program. The objective of this study is to use code smells as a candidate metric to build a bug prediction model. In this study we have built a bug prediction model using both source code metrics and code smell based metrics proposed in the literature. We used Naive Bayes, Random Forest and Logistic Regression as our candidate algorithms to build the model. We have trained our model against multiple versions of 13 different Java based open source projects. The trained model was used to predict bugs in a particular version of a project, within a particular project and among different projects. We were able to demonstrate, that code smell based metrics can significantly improve the accuracy of a bug prediction model when integrated with source code metrics. Random Forest algorithm based model showed higher accuracy within a version, within a project and among projects when compared to other algorithms.
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 7 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
