ZENODO
Software . 2021
License: CC BY
Data sources: Datacite
ZENODO
Software . 2024
License: CC BY
Data sources: ZENODO

ISSTA artifact

Abstract

This project introduces OPTFUZZ, a compiler testing framework that uses Large Language Models (LLMs) to generate fine-grained compilation options and explore the compilation space comprehensively. OPTFUZZ also uses LLMs to filter bug reports and identify code segments that can serve as test cases, which improves the efficiency and effectiveness of compiler testing.

Project Summary:

Background: Compilers are fundamental infrastructure in software development, and their quality significantly affects the performance and reliability of the final software products. Despite the importance of compiler testing, existing research often focuses on specific subsets of compilation options and overlooks the effects of more complex option combinations.

Challenges: Manual testing of compiler options is labor-intensive and requires advanced developer expertise because the options are complex and diverse.

Approach: The OPTFUZZ framework enhances compiler testing through the following steps:
  • Program Acquisition: Extracts test programs from historical bug reports.
  • Code Abstraction: Abstracts source code using a method based on the intermediate representation (IR), extracting key code information.
  • Compilation Option Generation: Uses LLMs to generate complex compilation options that may trigger compiler defects.
  • Test Result Processing: Executes tests and processes the results, including internal compiler errors (ICE) and hangs.

Contributions: The main contributions of OPTFUZZ are a novel approach that uses LLMs to obtain code snippets from historical issue reports, an abstraction of source code at the IR level, and the use of LLMs to generate complex compilation options for compiler testing.

Experimental Results: In extensive experiments on GCC and LLVM, OPTFUZZ demonstrated stronger bug detection capability than random exploration of the compilation space and other existing techniques.
Practical Utility: OPTFUZZ identified 37 bugs in GCC and LLVM, of which 27 have been confirmed or fixed, highlighting the practical utility of the method.

Note: The scripts require an internet connection to fetch data from Bugzilla and GitHub. They also require a Python environment with the libraries listed below installed.

Dependency libraries:
  • github: Used to access the GitHub API.
  • urlextract: Used to extract URLs from text.
  • mistune: Used to parse Markdown text.
  • requests: Used for HTTP requests.
  • beautifulsoup4: Used to parse HTML files.
  • shlex: Used to split strings using shell-like syntax.
  • tqdm: Used to add a progress bar to loops and other iterables.
  • colorama: Used to emit colored terminal output across operating systems.
  • signal: Used to install signal handlers for asynchronous events.
  • google.generativeai: Used to call the Gemini API (could be replaced with other LLMs).

File structure:

    artifact:
    |
    |--bugreport--|--gcc---|--GCC_report_get.py
    |             |        |--llm_for_code.py
    |             |
    |             |--llvm--|--LLVM_report_get.py
    |                      |--llm_for_code.py
    |
    |--irplugin--|--generate_ir.py
    |            |--makefile
    |            |--plugin.cpp
    |
    |--optionstest--|--gcc--|--gcc_option_test.py
    |               |
    |               |--llvm-|--llvm_option_test.py

Bugs Found by OPTFUZZ:
  • GCC bug-115411: ICE: in expand_call, at calls.cc:3668
  • GCC bug-115412: ICE: canonical types differ for identical types ‘std::is_same<typename foo<T>::type, U>’ and ‘std::is_same<T, U>’
  • GCC bug-115426: ICE: in execute_todo, at passes.cc:2138
  • GCC bug-115431: ICE: tree check: expected tree that contains ‘decl common’ structure, have ‘error_mark’ in decl_template_parm_check, at cp/cp-tree.h:5131
  • GCC bug-115469: [14 Regression] ICE: tree check expected class 'type', have 'exceptional' (error_mark) in poplevel_named_label_1, at cp/decl.cc:579
  • GCC bug-115489: [12/13/14/15 regression] ICE: tree check: expected class 'type', have 'exceptional' (error_mark) in create_tmp_from_val, at gimplify.cc:589 since r12-3278-g823685221de986
  • GCC bug-115501: [13/14 Regression] ICE: in build_call_a with dynamic_cast after invalid definition of __cxxabiv1::__dynamic_cast since r13-3299
  • GCC bug-115510: ICE: Segmentation fault in build_new_method_call and finish_call_expr
  • GCC bug-115560: ICE: in type_dependent_expression_p, at cp/pt.cc:28576
  • GCC bug-115572: ICE: in dependent_type_p, at cp/pt.cc:28020
  • GCC bug-115588: ICE: in tsubst_stmt, at cp/pt.cc:18527
  • GCC bug-115599: ICE: qsort checking failed during GIMPLE pass: reassoc (error: qsort comparator non-negative on sorted output: 150142972)
  • GCC bug-115620: ICE: in tsubst_pack_expansion, at cp/pt.cc:13703
  • GCC bug-115623: ICE: Segmentation fault in finish_for_cond with novector and almost infinite loop
  • GCC bug-115786: ICE: Segmentation fault (add_stmt at ./gcc/gcc/c/c-decl.cc:689 and c_parser_declaration_or_fndef at ./gcc/gcc/c/c-parser.cc:3027)
  • GCC bug-115787: [GIMPLE-FE] ICE: in gimple_build_switch_nlabels, at gimple.cc:807
  • GCC bug-115919: ICE: in tsubst_expr, at cp/pt.cc:20300
  • GCC bug-115930: ICE: tree check: expected type_argument_pack or nontype_argument_pack, have integer_type in template_parm_natural_p, at cp/mangle.cc:1828
  • GCC bug-115940: ICE: tree check: expected record_type or union_type or qual_union_type, have translation_unit_decl in maybe_dummy_object, at cp/tree.cc:4379
  • GCC bug-116002: GCC Compiler time-hog with large basic block in Function
  • GCC bug-116042: ICE: Segmentation fault (in ix86_finalize_stack_frame_flags and ix86_expand_prologue())
  • GCC bug-116113: [15 Regression] ICE: Segmentation fault (maybe_convert_cond)
  • GCC bug-116320: [12/13 Regression] ICE: Segmentation fault (perform_or_defer_access_check) since r11-1350
  • GCC bug-116323: [12/13/14/15 Regression] ICE: tree check: expected record_type or union_type or qual_union_type, have bound_template_template_parm in access_in_type, at cp/search.cc:663
  • GCC bug-117065: ICE: tree check: expected class 'type', have 'exceptional' (error_mark) in type_has_padding_at_level_p, at gimple-fold.cc:4820
  • GCC bug-117083: ICE: in get_expr_operands, at tree-ssa-operands.cc:939
  • GCC bug-117091: switch clustering takes extensive time with large switches even at -O0
  • LLVM bug-95366: Clang-19 crash: /lib/AST/ExprConstant.cpp:1633: void {anonymous}::LValue::setFrom(clang::ASTContext&, const clang::APValue&): Assertion `V.isLValue() && "Setting LValue from a non-LValue?"' failed.
  • LLVM bug-95420: Clang-19 crash: void* clang::DeclarationName::getFETokenInfo() const: Assertion `getPtr() && "getFETokenInfo on an empty DeclarationName!"' failed.
  • LLVM bug-95495: Clang-19 crash: const clang::FieldDecl* clang::APValue::getUnionField() const: Assertion `isUnion() && "Invalid accessor"' failed.
  • LLVM bug-95500: Clang-19 crash: fatal error: error in backend: register rbp is allocatable: function has no frame pointer
  • LLVM bug-97780: Clang-19 crash: bool clang::TreeTransform >::TransformExprs(clang::Expr* const*, unsigned int, bool, llvm::SmallVectorImpl&, bool*) [with Derived = {anonymous}::TemplateInstantiator]: Assertion `!Unexpanded.empty() && "Pack expansion without parameter packs?"' failed.
  • LLVM bug-99516: Clang-19.0: Hang Issue with LLVM: No Output or Error Message
  • LLVM bug-99641: Clang-19.0: Hang Issue with LLVM: No Output or Error Message involving recursive macros within a function
  • LLVM bug-112069: Clang-19: Unable to find instantiation of declaration! UNREACHABLE executed at /llvm-project/clang/lib/Sema/SemaTemplateInstantiateDecl.cpp:6437!
  • LLVM bug-112086: clang-19: Assertion `0 && "Invalid SLocOffset or bad function choice"' failed.
  • LLVM bug-113803: Clang-20: Assertion `!DT.isNull() && "Undeduced types shouldn't reach here."' failed.

Folder irplugin:

plugin.cpp is a GCC plugin that analyzes C/C++ code and emits it as IR.

File Function: The plugin analyzes C/C++ code, records function calls and statement information, and writes this information to a file.

Main Features:
  • Traverses basic blocks within functions.
  • Records function calls and statement information.
  • Outputs IR to a specified file.
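A GCC plugin such as the one described above is normally loaded with GCC's `-fplugin=` flag, and arguments are passed to it with `-fplugin-arg-<name>-<key>=<value>`. The sketch below only assembles such a command line. It is an illustration under stated assumptions, not the artifact's own code: the function name `build_plugin_command`, the argument key `output`, and the object file name `plugin_probe.o` are all hypothetical, and plugin.cpp may expect different arguments or none at all.

```python
import shlex
from pathlib import Path


def build_plugin_command(source, plugin="./plugin.so", ir_out=None,
                         compiler="gcc", obj_out="plugin_probe.o"):
    """Assemble a GCC command line that loads a plugin while compiling `source`.

    -fplugin=<path> and -fplugin-arg-<name>-<key>=<value> are standard GCC
    flags. The key "output" is a hypothetical example; the real plugin may
    use another key or take no arguments.
    """
    # GCC derives the plugin name from the shared object's base name.
    plugin_name = Path(plugin).stem
    cmd = [compiler, f"-fplugin={plugin}"]
    if ir_out is not None:
        cmd.append(f"-fplugin-arg-{plugin_name}-output={ir_out}")
    # -c compiles without linking, so the plugin sees the function bodies.
    cmd += ["-c", str(source), "-o", obj_out]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so no compiler is required.
    print(shlex.join(build_plugin_command("test.c", ir_out="test.ir")))
```

Returning the command as a list rather than a string keeps it safe to pass to `subprocess.run` without shell quoting issues.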
makefile builds a shared library named plugin.so. The plugin.so target depends on plugin.o, is compiled with g++, and is linked into the shared library. Running make is enough to compile the GCC plugin.

generate_ir.py is a Python script that compiles source code files and generates an intermediate representation (IR). Its main parts are:
  • run_command: Executes system commands and handles timeouts and errors.
  • generate_coverage: Generates IR for individual source code files and records compilation results.
  • compile_source_files: Traverses directories, starts a process for each source code file, and manages these processes.
  • if __name__ == "__main__": The main entry point; sets the source code directory and starts the compilation process.
How to run: Modify the file directory in directories and run python generate_ir.py.

Folder bugreport:

GCC_report_get.py extracts bug report information from the GCC (GNU Compiler Collection) Bugzilla tracking system, then classifies and stores it.

Main functions:
  • Obtains the HTML content of bug reports from GCC Bugzilla.
  • Parses the HTML content to extract keywords, status, product, version, code snippets, and attachment links.
  • Classifies bug reports by keywords and status and stores them in corresponding directories.
  • Downloads attachments and saves them to the specified directory.
  • Uses multithreading to process multiple bug reports.
How to run: Modify the number of reports to be crawled, then run GCC_report_get.py directly.

LLVM_report_get.py obtains closed bug reports from the LLVM project on GitHub and analyzes the code blocks and Compiler Explorer URLs in the reports.

Main functional modules:
  • judge_code_blocks(text): Analyzes code blocks in the text and determines whether they contain C or C++ code.
  • get_code_blocks(text, cwd): Analyzes code blocks in the text and saves them to the specified directory.
  • save_issues(): Obtains closed bug reports from the LLVM project's GitHub repository, then classifies and saves them based on the report content.
  • get_compiler_explorer_urls(issue_body, cwd): Extracts Compiler Explorer URLs from bug reports and saves them to log files.
  • judge_compiler_explorer_urls(issue_body): Determines whether a bug report contains Compiler Explorer URLs.
  • download_source_code(issue_body, cwd): Extracts source code URLs from bug reports, then downloads and saves the source code.
  • initial(): Creates directories to save bug reports.
  • main(): The main entry function of the script, which calls the other functions to perform tasks.
How to run: Modify the number of reports to be crawled, then run LLVM_report_get.py directly.

llm_for_code.py extracts continuous C or C++ code snippets from log files and saves them to a file. The script uses the google.generativeai library to interact with Gemini to generate descriptions of code snippets.

Script execution process:
  • The script first sets the API key for the Gemini library.
  • It defines the error log template.
  • In the if __name__ == "__main__": block, it sets the working directory and the subdirectories to be processed.
  • It calls the read_log function, which traverses the specified directories, reads log files, and calls the generate_answer function for processing.
  • The generate_answer function calls the model API to generate descriptions of code snippets and calls the get_code_blocks function to process the descriptions.
  • The get_code_blocks function extracts code blocks and saves them to files.

Folder optionstest:

gcc_option_test.py compiles source code files and tests GCC compiler options.

Main functions:
  • Traverses source code files in the specified directory and identifies compilable files (such as C and C++ source files).
  • Compiles these files with gcc or g++ and applies compilation options obtained from the Gemini API.
  • Sets a compilation timeout and, if compilation exceeds it, records timeout information.
  • Catches compilation errors and records error information in log files.
  • Deletes the generated executable files after successful compilation.
How to run: Modify the file directory in directories and run python gcc_option_test.py.

llvm_option_test.py compiles source code files and tests compilation options. Its main components are:
  • Compile Source Code Files: The script traverses the specified directory for source code files and compiles them with clang or clang++. It supports the file extensions .c, .i, .ii, .C, .cc, .cpp, .cxx, and .c++.
  • Obtain Compilation Options: The script retrieves suggested compilation options for specific source code by calling the Gemini API.
  • Execute Compilation Commands: The script executes compilation commands with a set timeout. If the compilation times out or fails, the script records the error information.
How to run: Modify the file directory in directories and run python llvm_option_test.py.

Overall Program Functionality Summary: This collection of programs automates the compilation workflow: generating intermediate representations, obtaining and analyzing bug reports, testing compiler options, and extracting code snippets from logs. It serves as a toolkit for analyzing C/C++ code, testing compiler options, and acquiring and organizing bug reports. The overall functional structure of the toolkit is depicted in the following diagram.
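The option-testing step described for gcc_option_test.py and llvm_option_test.py (compile with generated options, enforce a timeout, record failures) can be sketched roughly as follows. This is a minimal illustration under assumptions, not the artifact's implementation: the names `build_command`, `classify`, and `run_with_options` are hypothetical, the option string stands in for whatever the LLM returns, and the two stderr substrings are simple heuristics for GCC's "internal compiler error" message and Clang's crash banner.

```python
import shlex
import subprocess
import sys


def build_command(compiler, source, option_string):
    """Split an option string (e.g. one suggested by an LLM) into argv form."""
    return [compiler, *shlex.split(option_string), source]


def classify(returncode, stderr, timed_out):
    """Map one compiler run to a coarse outcome label."""
    if timed_out:
        return "hang"
    # Heuristic markers: GCC prints "internal compiler error", and Clang
    # prints a "PLEASE submit a bug report" banner when it crashes.
    if "internal compiler error" in stderr or "PLEASE submit a bug report" in stderr:
        return "ice"
    return "ok" if returncode == 0 else "error"


def run_with_options(cmd, timeout_s=30.0):
    """Run a command, returning (label, stderr); a timeout counts as a hang."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "hang", ""
    return classify(proc.returncode, proc.stderr, False), proc.stderr


if __name__ == "__main__":
    print(build_command("gcc", "test.c", "-O3 -fno-tree-vrp"))
    # Stand-ins for a compiler, so the demo runs without gcc or clang installed.
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    crash = [sys.executable, "-c",
             "import sys; sys.stderr.write('internal compiler error: demo'); sys.exit(4)"]
    print(run_with_options(slow, timeout_s=0.5)[0])
    print(run_with_options(crash)[0])
```

Keeping `classify` separate from the process handling makes the labeling rules easy to test on saved logs. Note that `subprocess.run` kills only the process it started on timeout; a compiler driver's child processes may need extra handling, for example by starting the driver in its own process group.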
