Codellm-Devkit: A Framework for Contextualizing Code LLMs with Program Analysis Insights

Keywords: Software Engineering (cs.SE), FOS: Computer and information sciences, Computer Science - Software Engineering

Rahul Krishna; Rangeet Pan; Saurabh Sinha; Srikanth Tamilselvam; Raju Pavuluri; Maja Vukovic


Preprint . 2024

Data sources: arXiv.org e-Print Archive

https://doi.org/10.1145/369663...

Article . 2025 . Peer-reviewed

Data sources: Crossref

https://dx.doi.org/10.48550/ar...

Article . 2024

License: CC BY

Data sources: Datacite

Publication . Article, Preprint . 23 Jun 2025
Embargo end date: 01 Jan 2024
Publisher: ACM
Journal: Proceedings of the 33rd ACM International Conference on the Foundations of Software Engineering


doi: 10.1145/3696630.3728555, 10.48550/arxiv.2410.13007

arXiv: 2410.13007


Abstract

Large Language Models for Code (or code LLMs) are gaining popularity and capabilities, offering a wide array of functionalities such as code completion, code generation, code summarization, test generation, code translation, and more. To leverage code LLMs to their full potential, developers must provide code-specific contextual information to the models. This information is typically derived and distilled using program analysis tools. However, there exists a significant gap: these static analysis tools are often language-specific and come with a steep learning curve, making their effective use challenging. Because the tools are tailored to specific programming languages, developers must learn and manage multiple tools to cover various aspects of their code base. Moreover, the complexity of configuring and integrating these tools into existing development environments adds another layer of difficulty. This challenge limits the potential benefits that could be gained from more widespread and effective use of static analysis in conjunction with LLMs. To address this challenge, we present codellm-devkit (hereafter, 'CLDK'), an open-source library that significantly simplifies the process of performing program analysis at various levels of granularity for different programming languages to support code LLM use cases. As a Python library, CLDK offers developers an intuitive and user-friendly interface, making it easy to provide rich program analysis context to code LLMs. With this library, developers can integrate detailed, code-specific insights that enhance the operational efficiency and effectiveness of LLMs in coding tasks. CLDK is available as an open-source library at https://github.com/IBM/codellm-devkit.
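The general idea in the abstract, deriving code-specific context with program analysis and handing it to a code LLM, can be illustrated with a minimal sketch. This sketch does not use CLDK or its API. It relies only on Python's standard `ast` module and handles Python source only, and the names `summarize_functions`, `build_prompt`, and `EXAMPLE` are hypothetical, introduced here for illustration.

```python
import ast


def summarize_functions(source: str) -> list[dict]:
    """Collect name, parameters, and directly called names for each function."""
    tree = ast.parse(source)
    summaries = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Only simple calls of the form name(...) are recorded here;
            # method calls such as obj.method(...) are ignored in this sketch.
            calls = sorted({
                child.func.id
                for child in ast.walk(node)
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name)
            })
            summaries.append({
                "name": node.name,
                "params": [arg.arg for arg in node.args.args],
                "calls": calls,
            })
    return summaries


def build_prompt(source: str, target: str) -> str:
    """Assemble a prompt that gives a model the focal function and its callees."""
    by_name = {s["name"]: s for s in summarize_functions(source)}
    focal = by_name[target]
    lines = [f"Focal function: {target}({', '.join(focal['params'])})"]
    for callee in focal["calls"]:
        if callee in by_name:  # skip builtins and functions defined elsewhere
            params = ", ".join(by_name[callee]["params"])
            lines.append(f"Calls: {callee}({params})")
    lines.append("Task: write a unit test for the focal function.")
    return "\n".join(lines)


# Hypothetical input used only to exercise the sketch.
EXAMPLE = '''
def tax(amount, rate):
    return amount * rate

def total(amount, rate):
    return amount + tax(amount, rate)
'''

if __name__ == "__main__":
    print(build_prompt(EXAMPLE, "total"))
```

The prompt text would then be passed to whichever model a developer uses. A multi-language library such as the one the abstract describes would replace the single-language `ast` step with analyses for each supported language.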


Related research products (2)

- WALA software on GitHub (IsRelatedTo)
- codellm-devkit software on GitHub (IsRelatedTo)

Impact by BIP!

- selected citations: 1. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
