ZENODO
Journal . 2024
License: CC BY
Data sources: ZENODO
Data sources: Datacite

Artifact of Enhancing Search-Based Testing with LLMs for Finding Bugs in System Simulators

Authors: Dakhama, Aidan; Even-Mendoza, Karine; Langdon, W. B.; Menendez, Hector; Petke, Justyna


Abstract

Despite the wide availability of automated testing techniques such as fuzzing, little attention has been devoted to testing computer architecture simulators. We propose a fully automated approach for this task. Our approach uses large language models (LLMs) to create input programs, including information about their parameters and their types, as test cases for the simulators. The LLM’s output becomes the initial seed for an existing fuzzer, AFL, which has been enhanced with three mutation operators, targeting both the input binary program and its parameters. We implement our approach in a tool called SearchSYS. We use it to test the gem5 system simulator. SearchSYS discovered 21 new bugs in gem5: 14 where gem5’s software prediction differs from the real behaviour on actual hardware and 7 where it crashed. New defects were uncovered with each of the 6 LLMs used.

The SearchSYS tool is available at https://github.com/karineek/SearchGEM5/. Experiments in this publication were done with commit 6514a24. The previous publication of SearchSYS (SearchGEM5) is available at https://zenodo.org/records/10999115.

This record includes all data collected during the experiments between January and August 2024.

  • The code: ASEGem5-main.zip
  • gem5 instrumented with coverage and gcc-9: gem5-instrumented-with-coverage-x86-ubuntu-20.04-gcc-9.zip. The Docker setup in the coverage folder in ASEGem5-main.zip works for more recent gcc versions.

The data of Experiment 1:

  • Initial corpus files (RQ1): LLM_test_inputs.zip
  • Bug analysis: BugHuntingData_exp1_raw.tar.gz
  • Coverage of initial corpus (RQ1-2): LLMresultsCov_beforeFuzzing.zip
  • Configuration results (RQ3): Experiment-1-1h-30_repeat_30_confg.zip (data) and Experiment-1-AnalysisRes-Select-Values4Counters.xlsx (analysis)

The data of Experiment 2 (RQ4):

  • Data after fuzzing (fuzzed corpus): -exp2.tar.gz
  • Coverage after fuzzing: CodeCoverage-Main-PostAFL.zip
  • Analysis of the data: exp2-data.zip (raw) and gem5_mism_bugs_29-August_2024.xlsx (Excel)
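The abstract separates two kinds of findings: cases where gem5's prediction differs from the behaviour on real hardware, and cases where gem5 crashes. The sketch below illustrates one way such a comparison could be expressed. It is not taken from SearchSYS. The names RunResult, run_command, and classify are hypothetical, and it assumes that the simulated program's output has already been separated from the simulator's own log messages and that a negative return code (as Python's subprocess reports on POSIX when a process is killed by a signal) indicates a crash.

```python
import subprocess
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RunResult:
    """Exit code and standard output of one run (hypothetical record type)."""
    returncode: int
    stdout: str


def run_command(cmd: List[str], timeout_s: float = 60.0) -> RunResult:
    """Run a command and capture its exit code and standard output."""
    proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    return RunResult(proc.returncode, proc.stdout)


def classify(native: RunResult, simulated: RunResult) -> str:
    """Compare a native run against a simulated run of the same test case.

    Assumption: a negative return code means the process died from a signal,
    which is how subprocess reports it on POSIX systems.
    """
    # The simulated run was killed by a signal although the native run was not.
    if simulated.returncode < 0 and native.returncode >= 0:
        return "simulator_crash"
    # Both runs finished, but exit code or output disagree.
    if (native.returncode, native.stdout) != (simulated.returncode, simulated.stdout):
        return "mismatch"
    return "agree"
```

In use, the same test program and arguments would be run once directly and once under the simulator via run_command, and the two results passed to classify. The exact commands depend on the local gem5 build and configuration script, so they are left out here.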
