Towards Robust Plagiarism Detection in Programming Education: Introducing Tolerant Token Matching Techniques to Counter Novel Obfuscation Methods

Keywords: ddc:004, Tokenization, Plagiarism Obfuscation, Computer Science Education, DATA processing & computer science, Software Plagiarism Detection, Source Code Plagiarism Detection, Obfuscation Attacks, info:eu-repo/classification/ddc/004, Code Normalization

Robin Maisch; Nathan Hagel; Alexander Bartel



https://doi.org/10.1145/372301...

Article . 2025 . Peer-reviewed

License: CC BY

Data sources: Crossref

KITopen

Conference object . 2025

License: CC BY

Data sources: KITopen

https://dx.doi.org/10.5445/ir/...

Article . 2025

License: CC BY

Data sources: Datacite


Publication: Article, Conference object . 01 Jun 2025
Publisher: ACM
Journal: Proceedings of the 6th European Conference on Software Engineering Education

Authors: Robin Maisch; Nathan Hagel; Alexander Bartel

doi: 10.1145/3723010.3723019, 10.5445/ir/1000180637


Abstract

With the rise of AI-generated code, programming courses face new challenges in detecting code plagiarism. Traditional methods struggle against obfuscation techniques that modify code structure through statement insertion and deletion. To address this, we propose a novel approach based on tolerant token matching, designed to enhance resilience against such attacks. We evaluate our method through three experiments on a real-life dataset with AI-obfuscated plagiarisms. The results show that our approach increased the median similarity gap between originals and plagiarisms by 1 to 6 percentage points.
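The abstract does not spell out the matching algorithm, so the following is only an illustrative sketch of the general idea and not the authors' method. It assumes that programs have already been reduced to token sequences and that "tolerant" means a match may continue across a bounded number of inserted or deleted tokens. The function name `matched_run`, the parameter `max_skips`, and the token names are all hypothetical.

```python
from typing import Sequence


def matched_run(a: Sequence[str], b: Sequence[str], max_skips: int = 0) -> int:
    """Count matching tokens from the start of both sequences.

    On a mismatch, up to `max_skips` single tokens may be skipped in either
    sequence (hypothetical tolerance rule); with max_skips=0 the match is strict.
    """
    i = j = matched = skips = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            matched += 1
            i += 1
            j += 1
        elif skips < max_skips and j + 1 < len(b) and a[i] == b[j + 1]:
            skips += 1
            j += 1  # treat b[j] as a token inserted into b
        elif skips < max_skips and i + 1 < len(a) and a[i + 1] == b[j]:
            skips += 1
            i += 1  # treat a[i] as a token deleted from b
        else:
            break
    return matched


# Hypothetical token streams: the second has one extra VAR_DEF inserted.
original = ["VAR_DEF", "ASSIGN", "LOOP_BEGIN", "APPLY", "LOOP_END", "RETURN"]
obfuscated = ["VAR_DEF", "ASSIGN", "VAR_DEF", "LOOP_BEGIN", "APPLY", "LOOP_END", "RETURN"]

print(matched_run(original, obfuscated, max_skips=0))  # 2: strict match stops at the insertion
print(matched_run(original, obfuscated, max_skips=1))  # 6: tolerant match continues past it
```

In this toy case a single inserted statement cuts a strict match short, while a tolerance of one skipped token recovers the full run. A real detector would also need to bound how much tolerance is allowed so that unrelated programs do not start matching.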

Related Organizations

Karlsruhe Institute of Technology
Germany
Neu Ulm University of Applied Sciences
Germany


Related research (1)

Supplementary Material for "Towards Robust Plagiarism Detection in Programming Education: Introducing Tolerant Token Matching Techniques to Counter Novel Obfuscation Methods"
2025 . IsSupplementedBy

Impact by BIP!

- selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.


Access routes: Green, hybrid
