Name: An Empirical Study on Automatically Detecting AI-Generated Source Code: How Far are We?
Keywords: Software Engineering (cs.SE), FOS: Computer and information sciences, Computer Science - Software Engineering

Publication: Article, Preprint
Date: 26 Apr 2025
Embargo end date: 01 Jan 2024
Publisher: IEEE
Journal: 2025 IEEE/ACM 47th International Conference on Software Engineering (ICSE)

Authors: Hyunjae Suh; Mahan Tafreshipour; Jiawei Li; Adithya Bhattiprolu; Iftekhar Ahmed

doi: 10.1109/icse55347.2025.00064, 10.32388/sbl97o, 10.48550/arxiv.2411.04299

arXiv: 2411.04299

Abstract

Artificial Intelligence (AI) techniques, especially Large Language Models (LLMs), are gaining popularity among researchers and software developers for generating source code. However, LLMs have been shown to generate code with quality issues and to incur copyright/licensing infringements. Therefore, detecting whether a piece of source code was written by a human or by AI has become necessary. This study first presents an empirical analysis of how effective existing AI detection tools are at detecting AI-generated code. The results show that they all perform poorly and lack sufficient generalizability to be practically deployed. Then, to improve the performance of AI-generated code detection, we propose a range of approaches, including fine-tuning LLMs and machine learning-based classification with static code metrics or code embeddings generated from the Abstract Syntax Tree (AST). Our best model outperforms the state-of-the-art AI-generated code detector (GPTSniffer) and achieves an F1 score of 82.55. We also conduct an ablation study on our best-performing model to investigate the impact of different source code features on its performance.
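To make two terms from the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows what a few static code metrics derived from an AST could look like for a Python snippet, and how an F1 score is computed from binary predictions. The function names `static_metrics` and `f1_score` and the chosen metrics are assumptions made for illustration; the abstract does not list the paper's feature set. The function returns F1 in the range 0 to 1, whereas the 82.55 reported above is presumably on a 0 to 100 scale.

```python
import ast


def static_metrics(source: str) -> dict:
    """Compute a few illustrative static metrics for a Python snippet.

    These features are hypothetical examples, not the paper's feature set.
    """
    tree = ast.parse(source)

    def depth(node: ast.AST) -> int:
        # Depth of the AST rooted at `node`, counting the node itself.
        children = list(ast.iter_child_nodes(node))
        return 1 + max((depth(child) for child in children), default=0)

    nodes = list(ast.walk(tree))
    return {
        "lines": len(source.splitlines()),
        "ast_nodes": len(nodes),
        "ast_depth": depth(tree),
        "functions": sum(isinstance(n, ast.FunctionDef) for n in nodes),
    }


def f1_score(y_true, y_pred, positive=1) -> float:
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        # No true positives: precision and recall are both zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    snippet = "def add_one(x):\n    return x + 1\n"
    print(static_metrics(snippet))
    # Labels: 1 = AI-generated, 0 = human-written (an assumed encoding).
    print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

In a setup like the one the abstract describes, feature dictionaries of this kind would be fed to a conventional classifier, and the F1 score would be computed on held-out labeled snippets.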

Related Organizations

- University of California System (United States)
- University of California, Irvine (United States)
- University of California, San Francisco (United States)

Related research products (3)

- gpt-2-output-dataset, software on GitHub (IsRelatedTo)
- awesome-chatgpt-prompts, software on GitHub (IsRelatedTo)
- copilot, software on GitHub (IsRelatedTo)

Impact by BIP!

- selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.


Access routes: Green, hybrid
