CodeContrast: A Contrastive Learning Approach for Generating Coherent Programming Exercises

Creator: Nicolás Torres
Keywords: programming exercise generation, contrastive learning, computer science education, code generation, educational content creation, Education



Education Sciences

Article . 2025 . Peer-reviewed

License: CC BY

Data sources: Crossref, DOAJ

Published: 13 Jan 2025
Language: English
Publisher: MDPI AG
Journal: Education Sciences, volume 15, page 80 (eISSN: 2227-7102)


doi: 10.3390/educsci15010080


Abstract

Generating high-quality programming exercises with well-aligned problem descriptions, test cases, and code solutions is crucial for computer science education. However, current methods often lack coherence among these components, reducing their educational value. We present CodeContrast, a novel generative model that uses contrastive learning to map programming problems, test cases, and solutions into a shared feature space. By minimizing the distance between matched components and maximizing it for non-matched ones, CodeContrast learns the intricate relationships necessary to generate coherent programming exercises. Our model architecture includes three encoder networks for problem descriptions, test cases, and solutions. During training, CodeContrast processes positive triplets (matching problem, test case, solution) and negative triplets (non-matching combinations) and uses a contrastive loss to position positive triplets close in the feature space while separating negative ones. Comprehensive evaluations of CodeContrast—through automatic metrics, expert ratings, and student studies—demonstrate its effectiveness. Results show high code correctness (92.3% of test cases passed), strong problem–solution alignment (BLEU score up to 0.826), and robust test case coverage (85.7% statement coverage). Expert feedback and student performance further support the pedagogical value of these generated exercises, with students performing comparably to those using manually curated content. CodeContrast advances the automated generation of high-quality programming exercises, capturing relationships among programming components to enhance educational content and improve the learning experience for students and instructors.
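The training objective described in the abstract (pull matched problem, test case, and solution embeddings together; push non-matched ones apart) can be illustrated with a minimal sketch. The abstract does not give the exact loss, so the margin-based pairwise form below, the Euclidean distance, the `margin` parameter, and the function names are all assumptions made for illustration, not the paper's implementation. Plain lists stand in for the outputs of the three encoders.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_contrastive_loss(problem, tests, solution, is_match, margin=1.0):
    """Margin-based contrastive loss summed over the three pairs of a triplet.

    Assumed form (not taken from the paper):
      - matched triplet: each pair contributes its squared distance,
        so minimizing the loss pulls the embeddings together;
      - non-matched triplet: each pair contributes max(0, margin - d)^2,
        so only pairs closer than `margin` are pushed apart.
    """
    loss = 0.0
    for u, v in combinations((problem, tests, solution), 2):
        d = euclidean(u, v)
        if is_match:
            loss += d ** 2
        else:
            loss += max(0.0, margin - d) ** 2
    return loss


# Toy 2-D embeddings standing in for encoder outputs (hypothetical values).
collapsed = ([0.0, 0.0], [0.0, 0.0], [0.0, 0.0])
spread = ([0.0, 0.0], [3.0, 0.0], [0.0, 4.0])

# A matched triplet is cheap when its embeddings coincide and costly when spread out.
print(triplet_contrastive_loss(*collapsed, is_match=True))
print(triplet_contrastive_loss(*spread, is_match=True))

# A non-matched triplet is cheap when spread out and costly when its embeddings coincide.
print(triplet_contrastive_loss(*spread, is_match=False))
print(triplet_contrastive_loss(*collapsed, is_match=False))
```

In a real system the embeddings would come from trained encoder networks and the loss would be averaged over batches of positive and negative triplets; the sketch only shows the direction in which each kind of triplet is pushed.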

Related Organizations

Federico Santa María Technical University
Chile


Impact (BIP!)

Selected citations: 1
Popularity: Average
Influence: Average
Impulse: Average
