
doi: 10.3390/app15073632
Automated program repair (APR) plays a vital role in improving software quality and reducing developers' maintenance effort. Neural machine translation (NMT)-based methods show notable potential by learning translation patterns from bug-fix code pairs. However, traditional approaches are constrained by limited model capacity and training data scale, which creates performance bottlenecks when generalizing to unseen defect patterns. In this paper, we propose CodeTransFix, a novel APR approach that combines NMT methods with large language models of code (LLMCs) such as CodeBERT. CodeTransFix learns contextual embeddings of bug-related code through CodeBERT and feeds these representations as supplementary inputs to the Transformer model, enabling context-aware patch generation. Repair performance is evaluated on the widely used Defects4J v1.2 benchmark. Experimental results show that CodeTransFix achieves a 54.1% performance improvement over the best NMT-based baseline model and a 23.3% performance improvement over the best LLMC in fixing bugs. In addition, CodeTransFix outperforms existing APR methods in the Defects4J v2.0 generalization test.
Technology, Chemistry, QH301-705.5, T, Physics, QC1-999, context-aware patch generation, automated program repair (APR), TA1-2040, Biology (General), Engineering (General). Civil engineering (General), neural machine translation (NMT), QD1-999
