Powered by OpenAIRE graph
Found an issue? Give us feedback
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/ Electronicsarrow_drop_down
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
Electronics
Article . 2024 . Peer-reviewed
License: CC BY
Data sources: Crossref
versions View all 2 versions
addClaim

Deep Semantics-Enhanced Neural Code Search

Authors: Ying Yin; Longfei Ma; Yuqi Gong; Yucen Shi; Fazal Wahab; Yuhai Zhao;

Deep Semantics-Enhanced Neural Code Search

Abstract

Code search uses natural language queries to retrieve code snippets from a vast database, identifying those that are semantically similar to the query. This enables developers to reuse code and enhance software development efficiency. Most existing code search algorithms focus on capturing semantic and structural features by learning from both text and code graph structures. However, these algorithms often struggle to capture deeper semantic and structural features within these sources, leading to lower accuracy in code search results. To address this issue, this paper proposes a novel semantics-enhanced neural code search algorithm called SENCS, which employs graph serialization and a two-stage attention mechanism. First, the code program dependency graph is transformed into a unique serialized encoding, and a bidirectional long short-term memory (LSTM) model is used to learn the structural information of the code in the graph sequence to generate code vectors rich in structural features. Second, a two-stage attention mechanism enhances the embedded vectors by assigning different weight information to various code features during the code feature fusion phase, capturing significant feature information from different code feature sequences, resulting in code vectors rich in semantic and structural information. To validate the performance of the proposed code search algorithm, extensive experiments were conducted on two widely used code search datasets, CodeSearchNet and JavaNet. The experimental results show that the proposed SENCS algorithm improves the average code search accuracy metrics by 8.30 % (MRR) and 17.85% (DCG) and compared to the best baseline code search model in the literature, with an average improvement of 14.86% in the SR@1 metric. Experiments with two open-source datasets demonstrate SENCS achieves a better search effect than state of-the-art models.

Related Organizations
Keywords

code search, semantic enhance, graph serialization, attention mechanism

  • BIP!
    Impact byBIP!
    selected citations
    These citations are derived from selected sources.
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    1
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Average
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Average
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Average
Powered by OpenAIRE graph
Found an issue? Give us feedback
selected citations
These citations are derived from selected sources.
This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Citations provided by BIP!
popularity
This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
BIP!Popularity provided by BIP!
influence
This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Influence provided by BIP!
impulse
This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
BIP!Impulse provided by BIP!
1
Average
Average
Average