
Abstract

Context: A code smell indicates a poor implementation choice that often worsens software quality. Thus, code smell detection is an elementary technique for identifying refactoring opportunities in software systems. Unfortunately, there is limited knowledge of how similarly two or more developers detect smells in code. In particular, few studies have investigated whether developers agree or disagree when recognizing a smell and which factors can influence such (dis)agreement.

Objective: We perform a broader study to investigate how similarly developers detect code smells. We also analyze whether certain factors related to the developers' profiles, concerning background and experience, may influence such (dis)agreement. Moreover, we analyze whether the heuristics developers adopt when detecting code smells may influence their (dis)agreement.

Method: We conducted an empirical study with 75 developers who evaluated instances of 15 different code smell types. For each smell type, we analyzed the agreement among the developers and assessed the influence of 6 different factors on the developers' evaluations. Altogether, more than 2700 evaluations were collected, resulting in substantial quantitative and qualitative analyses.

Results: The results indicate that the developers showed low agreement in detecting all 15 smell types analyzed in our study. The results also suggest that factors related to background and experience did not have a consistent influence on the agreement among the developers. On the other hand, the results show that the agreement was consistently influenced by specific heuristics employed by developers.

Conclusions: Our findings reveal that developers detect code smells in significantly different ways. As a consequence, these findings raise questions about the results of previous studies that did not consider developers' different perceptions when detecting code smells. Moreover, our findings shed light on how to improve state-of-the-art techniques for accurate, customized detection of code smells.
