Dynamic stacking ensemble for cross-language code smell detection

Publication type: Article, Other literature type

Date: 15 Aug 2024

Language: English

Publisher: PeerJ

Journal: PeerJ Computer Science, volume 10, page e2254 (eissn: 2376-5992)

Authors: Hamoud Aljamaan;

doi: 10.7717/peerj-cs.2254

pmid: 39314734

pmc: PMC11419637

Abstract

Code smells are poor design and implementation choices by software engineers that might affect overall software quality. Detecting code smells with machine learning models has become a popular research area, with the aim of building effective models that can detect different code smells in multiple programming languages. However, the process of building such models has not yet reached a stable state, and most existing research focuses on detecting Java code smells. The main objective of this article is to propose dynamic ensembles built using two strategies, namely greedy search and backward elimination, that can accurately detect code smells in two programming languages (i.e., Java and Python) and that are less complex than full stacking ensembles. The detection performance of the dynamic ensembles was investigated for four Java and two Python code smells. The greedy search and backward elimination strategies yielded different lists of base models for building the dynamic ensembles. Compared with full stacking ensembles, dynamic ensembles yielded less complex models for most of the investigated Java and Python code smells, with the backward elimination strategy resulting in less complex models. Dynamic ensembles performed comparably to full stacking ensembles, with no significant loss in detection performance. This article concludes that dynamic stacking ensembles facilitated effective and stable detection of Java and Python code smells over all base models, with less complexity than full stacking ensembles.
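
The abstract names the two selection strategies but does not spell out their steps. The following is a minimal sketch of how greedy (forward) search and backward elimination over a pool of candidate base models are commonly implemented, not the article's actual procedure. The model names (`dt`, `svm`, `nb`, `knn`), the `toy_score` function, and its numbers are hypothetical placeholders. In practice the score would be something like the cross-validated detection performance of a stacking ensemble built from the given subset.

```python
from typing import Callable, FrozenSet, Iterable, List

# A scoring function maps a subset of base-model names to a number
# (higher is better). Hypothetical: in a real study this would train
# and validate a stacking ensemble built from that subset.
Score = Callable[[FrozenSet[str]], float]


def greedy_search(candidates: Iterable[str], score: Score) -> List[str]:
    """Forward selection: start empty, add the best model while the score improves."""
    remaining = list(candidates)
    selected: List[str] = []
    best = float("-inf")
    while remaining:
        # Score the current selection extended by each remaining model.
        trials = {m: score(frozenset(selected + [m])) for m in remaining}
        winner = max(remaining, key=lambda m: trials[m])
        if trials[winner] <= best:
            break  # no candidate improves the ensemble any further
        selected.append(winner)
        remaining.remove(winner)
        best = trials[winner]
    return selected


def backward_elimination(candidates: Iterable[str], score: Score) -> List[str]:
    """Start with all models, drop one at a time while the score does not get worse."""
    selected = list(candidates)
    best = score(frozenset(selected))
    while len(selected) > 1:
        # Score the current selection with each model removed in turn.
        trials = {m: score(frozenset(x for x in selected if x != m)) for m in selected}
        loser = max(selected, key=lambda m: trials[m])
        if trials[loser] < best:
            break  # every possible removal hurts the score
        selected.remove(loser)
        best = trials[loser]
    return selected


# Hypothetical per-model contributions, used only to make the sketch runnable.
TOY_GAIN = {"dt": 0.70, "svm": 0.10, "nb": 0.00, "knn": 0.05}


def toy_score(subset: FrozenSet[str]) -> float:
    # Sum of contributions minus a small penalty per model (a complexity cost).
    return sum(TOY_GAIN[m] for m in sorted(subset)) - 0.02 * len(subset)


if __name__ == "__main__":
    print(greedy_search(TOY_GAIN, toy_score))
    print(backward_elimination(TOY_GAIN, toy_score))
```

With this toy score both strategies happen to keep the same three models and drop `nb`, which contributes nothing. With a real, non-additive score the two strategies can end at different subsets, which is consistent with the abstract's remark that they yielded different lists of base models.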

Related Organizations

King Fahd University of Petroleum and Minerals
Saudi Arabia

Keywords

Stacking ensemble, Detection, Artificial Intelligence, Ensemble learning, Electronic computers. Computer science, Dynamic ensemble, Machine learning, QA75.5-76.95, Code smell

Impact by BIP!

	selected citations: 3. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
	popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
	influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
	impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.