Leveraging deep learning for Python version identification.

Publication: Conference object, Article. 01 Jan 2023. Netherlands. Publisher: CEUR. Journal: CEUR Workshop Proceedings, volume 3567, pages 33-40 (ISSN: 1613-0073, eISSN: 1613-0073)

Authors: Gerhold, Marcus; Solovyeva, Lola; Zaytsev, Vadim

Abstract

Python, recognized for its dynamic and adaptable nature, has found widespread application in a myriad of projects. As the language evolves, determining the Python version employed in a project becomes pivotal to ensure compatibility and facilitate maintenance. Deep learning (DL) has emerged as a promising tool to automate this process. In this research, we assess various DL techniques for determining the minimum Python version required by a code snippet. We explore the complexities of handling Python data and the DL techniques needed to achieve high classification accuracy. Our experimental results show that an LSTM with CodeBERT embeddings achieves an accuracy of 92%. This success can be attributed to the LSTM's proficiency in capturing structural details of the hierarchical nature of source code, complemented by CodeBERT's ability to discern contextual differences between keywords and variable names. This research provides insights into the challenges associated with utilizing programming languages for deep learning models and suggests potential solutions for addressing these issues. The envisioned applications extend to predicting the minimum required version for individual files or an entire code base.
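To make the classification target concrete, the sketch below assigns a snippet a minimum Python 3 version from a handful of syntax features. It is a rule-based illustration only, not the deep learning approach evaluated in the paper. The table `FEATURE_VERSIONS` and the function `min_python_version` are hypothetical names introduced here, and the table is deliberately incomplete.

```python
import ast

# Hypothetical, deliberately incomplete table (an assumption for illustration):
# AST node class name -> first Python 3 release whose grammar accepts it.
FEATURE_VERSIONS = {
    "JoinedStr": (3, 6),   # f-strings
    "AnnAssign": (3, 6),   # variable annotations, e.g. `x: int = 1`
    "NamedExpr": (3, 8),   # assignment expressions (walrus operator)
    "Match": (3, 10),      # structural pattern matching
}


def min_python_version(source):
    """Return a lower bound on the Python 3 version needed to parse `source`.

    The snippet is parsed with the running interpreter, so syntax newer than
    that interpreter raises SyntaxError instead of being classified.
    """
    tree = ast.parse(source)
    required = (3, 0)
    for node in ast.walk(tree):
        # Unknown node types contribute the baseline (3, 0).
        required = max(required, FEATURE_VERSIONS.get(type(node).__name__, (3, 0)))
    return required


if __name__ == "__main__":
    print(min_python_version('name = "x"\nprint(f"hi {name}")'))
```

A learned classifier, as studied in the paper, would predict the same kind of label from the snippet text rather than from a hand-written feature table.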

Country

Netherlands

Related Organizations

University of Twente
Netherlands

Keywords

Deep Learning, CodeBERT, version identification, Python

Related research

vermin software on GitHub (IsRelatedTo)

Impact by BIP!

selected citations: 0
popularity: Average
influence: Average
impulse: Average

Related to Research communities

Netherlands Research Portal
