Powered by OpenAIRE graph
Found an issue? Give us feedback
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/ ZENODOarrow_drop_down
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
versions View all 2 versions
addClaim

Artifact for the FASE 26 Paper: Testing in Formal Verification via Witness Generation (Empirical Evaluation)

Authors: Beyer, Dirk; Lemberger, Thomas; Wachowitz, Henrik;

Artifact for the FASE 26 Paper: Testing in Formal Verification via Witness Generation (Empirical Evaluation)

Abstract

Artifact for the FASE 26 Paper Testing in Formal Verification via Witness Generation (Empirical Evaluation) This artifact guides you through reproducing the plots and tables presented in the paper and through rerunning the experimental evaluation. The artifact is shipped as a VirtualBox VM. - The artifact is contained within a Ubuntu 24.04 VM. - We require at least 16 GB of Memory and 4 Cores Getting Started Username and Passworduser: test-to-witnesspassword: fase26 This artifact contains: analysis/ : Files concerning plots backup/ : A backup of the experiment data benchexec/ : The benchmarking tool benchmark-defs/ : Files defining benchmarks benchmark-sets/ : Files describing sets of tasks combinations/ : Code of the portfolio data/ : data from SV-COMP 25 expected/ : The expected plots Makefile results/ : Here are the generated witnesses from test-to-witness results-baseline/ : results of the baseline (just the verifier) for the portfolios results-portfolio/ : results of the portfolio experiments results-validate/ : results of the validation of generated witnesses results-verified/ : Data from Test-Comp 25 sv-benchmarks/ : The sv-benchmark tasks table-defs/ : Files defining how to create csv tables from the raw experiment data test-to-witness/ : The tool test-to-witness tool_info/ : Files telling benchexec how to run our tools validators/ : The validators used in the experiment If you are interested in just the tool Test-To-Witness you may navigate into the test-to-witness folder. There we include a README on how to use test-to-witness. Please note: We did not install a golang distribution in the VM, thus it is not possible to recompile test-to-witness. For the most up-to-date version we refer readers to our public gitlab: https://gitlab.com/sosy-lab/software/test-to-witness. This artifact document consists of the following steps: Downloading and installing the VM Recreating the plots and tables from the paper Running a subset of the experiments Running the full set of experiments For Getting Started we recommend doing Sections 1, 2 and 3.1 these should be able to be completed in less than 30min. Sections 2, 3 and 4 also serve as the Step-by-Step guide. 1. Download and Installation Instructions Download the artifact VM from Zenodo using the DOI provided with the paper. The download consists of a single VirtualBox appliance file with the extension .ova. Install Oracle VirtualBox (version 7 or newer is recommended) from the official website. Start VirtualBox and select File → Import Appliance. Choose the downloaded .ova file and follow the import wizard. You may adjust the VM settings during import; we require allocating at least 16 GB of RAM and 4 CPU cores. After the import finishes, select the VM from the list and click Start to launch it. Log in to the VM using the credentials: user: test-to-witness pw: fase26 Inside the VM To keep the VM as small as possible to Download, we compressed the data inside the VM. Before starting the evaluation please decompress the data. This will take approximately 15-30min. The resulting size is approximately 49Gb. pigz -dc fase26.tar.gz | pv | tar xf - 2. Recreating the Plots and Tables of the Paper To recreate the plots, first open a terminal and change into the directory containing all data and scripts: cd $HOME/fase26 To generate all plots and tables used in the paper, run the following command (this takes approximately 1–2 minutes): make plots Expected Output: Copying data for plots... mkdir -p analysis/raw/ cp table-defs/portfolio-cpachecker.table.csv analysis/raw/ cp table-defs/portfolio-esbmc-kind.table.csv analysis/raw/ cp table-defs/portfolio-uautomizer.table.csv analysis/raw/ cp table-defs/portfolio-symbiotic.table.csv analysis/raw/ cp table-defs/portfolio-bubaak.table.csv analysis/raw/ cp table-defs/portfolio-cpv.table.csv analysis/raw/ cp data/reach_safety.table.csv analysis/raw/ cp results-verified/META_Cover-Error.table.csv analysis/raw/ cp table-defs/creation.table.csv analysis/raw/ cp table-defs/validation.table.csv analysis/raw/ Copied data successfully. Creating plots... cd analysis && python3 plot.py RQ1: Converted tasks (done): 10897 ] Generating validation.xml... Generated validation.xml successfully Generating validation.csv... PYTHONPATH=tool_info benchexec/bin/table-generator -x table-defs/validation.xml --no-diff -o table-defs -f csv INFO: Reading table definition from 'table-defs/validation.xml'... INFO: table-defs/../results-validate/cpachecker-validate-violation-witnesses-2.0.2025-08-05_23-17-34.results.cetfuzz-coverage-error-call-confirmed.cetfuzz-test-suites.xml.bz2 INFO: table-defs/../results-validate/witch-validate-violation-witnesses-2.0.2025-08-05_23-18-21.results.cetfuzz-coverage-error-call-confirmed.cetfuzz-test-suites.xml.bz2 INFO: table-defs/../results-validate/witch-validate-violation-witnesses-2.0.2025-08-05_23-18-21.results.esbmc-incr-coverage-error-call-confirmed.esbmc-incr-test-suites.xml.bz2 INFO: table-defs/../results-validate/cpachecker-validate-violation-witnesses-2.0.2025-08-05_23-17-34.results.esbmc-incr-coverage-error-call-confirmed.esbmc-incr-test-suites.xml.bz2 INFO: table-defs/../results-validate/witch-validate-violation-witnesses-2.0.2025-08-05_23-18-21.results.esbmc-kind-coverage-error-call-confirmed.esbmc-kind-test-suites.xml.bz2 INFO: table-defs/../results-validate/cpachecker-validate-violation-witnesses-2.0.2025-08-05_23-17-34.results.esbmc-kind-coverage-error-call-confirmed.esbmc-kind-test-suites.xml.bz2 INFO: table-defs/../results-validate/witch-validate-violation-witnesses-2.0.2025-08-05_23-18-21.results.fdse-coverage-error-call-confirmed.fdse-test-suites.xml.bz2 INFO: table-defs/../results-validate/cpachecker-validate-violation-witnesses-2.0.2025-08-05_23-17-34.results.fdse-coverage-error-call-confirmed.fdse-test-suites.xml.bz2 INFO: table-defs/../results-validate/witch-validate-violation-witnesses-2.0.2025-08-05_23-18-21.results.fizzer-coverage-error-call-confirmed.fizzer-test-suites.xml.bz2 INFO: table-defs/../results-validate/cpachecker-validate-violation-witnesses-2.0.2025-08-05_23-17-34.results.fizzer-coverage-error-call-confirmed.fizzer-test-suites.xml.bz2 INFO: table-defs/../results-validate/witch-validate-violation-witnesses-2.0.2025-08-05_23-18-21.results.fusebmc-coverage-error-call-confirmed.fusebmc-test-suites.xml.bz2 INFO: table-defs/../results-validate/cpachecker-validate-violation-witnesses-2.0.2025-08-05_23-17-34.results.fusebmc-coverage-error-call-confirmed.fusebmc-test-suites.xml.bz2 INFO: table-defs/../results-validate/witch-validate-violation-witnesses-2.0.2025-08-05_23-18-21.results.fusebmc-ia-coverage-error-call-confirmed.fusebmc-ia-test-suites.xml.bz2 INFO: table-defs/../results-validate/cpachecker-validate-violation-witnesses-2.0.2025-08-05_23-17-34.results.fusebmc-ia-coverage-error-call-confirmed.fusebmc-ia-test-suites.xml.bz2 INFO: table-defs/../results-validate/witch-validate-violation-witnesses-2.0.2025-08-05_23-18-21.results.hybridtiger-coverage-error-call-confirmed.hybridtiger-test-suites.xml.bz2 INFO: table-defs/../results-validate/cpachecker-validate-violation-witnesses-2.0.2025-08-05_23-17-34.results.hybridtiger-coverage-error-call-confirmed.hybridtiger-test-suites.xml.bz2 INFO: table-defs/../results-validate/witch-validate-violation-witnesses-2.0.2025-08-05_23-18-21.results.klee-coverage-error-call-confirmed.klee-test-suites.xml.bz2 INFO: table-defs/../results-validate/cpachecker-validate-violation-witnesses-2.0.2025-08-05_23-17-34.results.klee-coverage-error-call-confirmed.klee-test-suites.xml.bz2 INFO: table-defs/../results-validate/witch-validate-violation-witnesses-2.0.2025-08-05_23-18-21.results.kleef-coverage-error-call-confirmed.kleef-test-suites.xml.bz2 INFO: table-defs/../results-validate/cpachecker-validate-violation-witnesses-2.0.2025-08-05_23-17-34.results.kleef-coverage-error-call-confirmed.kleef-test-suites.xml.bz2 INFO: table-defs/../results-validate/witch-validate-violation-witnesses-2.0.2025-08-05_23-18-21.results.legion-coverage-error-call-confirmed.legion-test-suites.xml.bz2 INFO: table-defs/../results-validate/witch-validate-violation-witnesses-2.0.2025-08-05_23-18-21.results.legion-symcc-coverage-error-call-confirmed.legion-symcc-test-suites.xml.bz2 INFO: table-defs/../results-validate/witch-validate-violation-witnesses-2.0.2025-08-05_23-18-21.results.owic-coverage-error-call-confirmed.owic-test-suites.xml.bz2 INFO: table-defs/../results-validate/witch-validate-violation-witnesses-2.0.2025-08-05_23-18-21.results.prtest-coverage-error-call-confirmed.prtest-test-suites.xml.bz2 INFO: table-defs/../results-validate/cpachecker-validate-violation-witnesses-2.0.2025-08-05_23-17-34.results.legion-coverage-error-call-confirmed.legion-test-suites.xml.bz2 INFO: table-defs/../results-validate/cpachecker-validate-violation-witnesses-2.0.2025-08-05_23-17-34.results.legion-symcc-coverage-error-call-confirmed.legion-symcc-test-suites.xml.bz2 INFO: table-defs/../results-validate/cpachecker-validate-violation-witnesses-2.0.2025-08-05_23-17-34.results.owic-coverage-error-call-confirmed.owic-test-suites.xml.bz2 INFO: table-defs/../results-validate/witch-validate-violation-witnesses-2.0.2025-08-05_23-18-21.results.rizzer-coverage-error-call-confirmed.rizzer-test-suites.xml.bz2 INFO: table-defs/../results-validate/cpachecker-validate-violation-witnesses-2.0.2025-08-05_23-17-34.results.prtest-coverage-error-call-confirmed.prtest-test-suites.xml.bz2 INFO: table-defs/../results-validate/cpachecker-validate-violation-witnesses-2.0.2025-08-05_23-17-34.results.rizzer-coverage-error-call-confirmed.rizzer-test-suites.xml.bz2 INFO: table-defs/../results-validate/witch-validate-violation-witnesses-2.0.2025-08-05_23-18-21.results.symbiotic-coverage-error-call-confirmed.symbiotic-test-suites.xml.bz2 INFO: table-defs/../results-validate/cpachecker-validate-violation-witnesses-2.0.2025-08-05_23-17-34.results.symbiotic-coverage-error-call-confirmed.symbiotic-test-suites.xml.bz2 INFO: table-defs/../results-validate/witch-validate-violation-witnesses-2.0.2025-08-05_23-18-21.results.tracerx-coverage-error-call-confirmed.tracerx-test-suites.xml.bz2 INFO: table-defs/../results-validate/witch-validate-violation-witnesses-2.0.2025-08-05_23-18-21.results.tracerx-wp-coverage-error-call-confirmed.tracerx-wp-test-suites.xml.bz2 INFO: table-defs/../results-validate/cpachecker-validate-violation-witnesses-2.0.2025-08-05_23-17-34.results.tracerx-coverage-error-call-confirmed.tracerx-test-suites.xml.bz2 INFO: table-defs/../results-validate/cpachecker-validate-violation-witnesses-2.0.2025-08-05_23-17-34.results.tracerx-wp-coverage-error-call-confirmed.tracerx-wp-test-suites.xml.bz2 INFO: table-defs/../results-validate/witch-validate-violation-witnesses-2.0.2025-08-05_23-18-21.results.utestgen-coverage-error-call-confirmed.utestgen-test-suites.xml.bz2 INFO: table-defs/../results-validate/witch-validate-violation-witnesses-2.0.2025-08-05_23-18-21.results.wasp-c-coverage-error-call-confirmed.wasp-c-test-suites.xml.bz2 INFO: table-defs/../results-validate/cpachecker-validate-violation-witnesses-2.0.2025-08-05_23-17-34.results.utestgen-coverage-error-call-confirmed.utestgen-test-suites.xml.bz2 INFO: table-defs/../results-validate/cpachecker-validate-violation-witnesses-2.0.2025-08-05_23-17-34.results.wasp-c-coverage-error-call-confirmed.wasp-c-test-suites.xml.bz2 INFO: Merging results... INFO: The resulting table will have 10941 rows and 6 columns (in 2 run sets). INFO: Generating table... INFO: Writing CSV into table-defs/validation.table.csv ... INFO: done Generated validation.csv successfully Generating creation.table.csv... PYTHONPATH=tool_info benchexec/bin/table-generator -x table-defs/creation.xml -f csv --no-diff -o table-defs/ INFO: Reading table definition from 'table-defs/creation.xml'... INFO: table-defs/../results/test2witness.2025-10-16_10-29-27.results.cetfuzz-coverage-error-call-confirmed.cetfuzz-test-suites.xml.bz2 INFO: table-defs/../results/test2witness.2025-10-16_10-29-27.results.esbmc-incr-coverage-error-call-confirmed.esbmc-incr-test-suites.xml.bz2 INFO: table-defs/../results/test2witness.2025-10-16_10-29-27.results.esbmc-kind-coverage-error-call-confirmed.esbmc-kind-test-suites.xml.bz2 INFO: table-defs/../results/test2witness.2025-10-16_10-29-27.results.fdse-coverage-error-call-confirmed.fdse-test-suites.xml.bz2 INFO: table-defs/../results/test2witness.2025-10-16_10-29-27.results.fizzer-coverage-error-call-confirmed.fizzer-test-suites.xml.bz2 INFO: table-defs/../results/test2witness.2025-10-16_10-29-27.results.fusebmc-coverage-error-call-confirmed.fusebmc-test-suites.xml.bz2 INFO: table-defs/../results/test2witness.2025-10-16_10-29-27.results.fusebmc-ia-coverage-error-call-confirmed.fusebmc-ia-test-suites.xml.bz2 INFO: table-defs/../results/test2witness.2025-10-16_10-29-27.results.hybridtiger-coverage-error-call-confirmed.hybridtiger-test-suites.xml.bz2 INFO: table-defs/../results/test2witness.2025-10-16_10-29-27.results.klee-coverage-error-call-confirmed.klee-test-suites.xml.bz2 INFO: table-defs/../results/test2witness.2025-10-16_10-29-27.results.kleef-coverage-error-call-confirmed.kleef-test-suites.xml.bz2 INFO: table-defs/../results/test2witness.2025-10-16_10-29-27.results.legion-coverage-error-call-confirmed.legion-test-suites.xml.bz2 INFO: table-defs/../results/test2witness.2025-10-16_10-29-27.results.legion-symcc-coverage-error-call-confirmed.legion-symcc-test-suites.xml.bz2 INFO: table-defs/../results/test2witness.2025-10-16_10-29-27.results.owic-coverage-error-call-confirmed.owic-test-suites.xml.bz2 INFO: table-defs/../results/test2witness.2025-10-16_10-29-27.results.prtest-coverage-error-call-confirmed.prtest-test-suites.xml.bz2 INFO: table-defs/../results/test2witness.2025-10-16_10-29-27.results.rizzer-coverage-error-call-confirmed.rizzer-test-suites.xml.bz2 INFO: table-defs/../results/test2witness.2025-10-16_10-29-27.results.symbiotic-coverage-error-call-confirmed.symbiotic-test-suites.xml.bz2 INFO: table-defs/../results/test2witness.2025-10-16_10-29-27.results.tracerx-coverage-error-call-confirmed.tracerx-test-suites.xml.bz2 INFO: table-defs/../results/test2witness.2025-10-16_10-29-27.results.tracerx-wp-coverage-error-call-confirmed.tracerx-wp-test-suites.xml.bz2 INFO: table-defs/../results/test2witness.2025-10-16_10-29-27.results.utestgen-coverage-error-call-confirmed.utestgen-test-suites.xml.bz2 INFO: table-defs/../results/test2witness.2025-10-16_10-29-27.results.wasp-c-coverage-error-call-confirmed.wasp-c-test-suites.xml.bz2 INFO: Merging results... INFO: The resulting table will have 10941 rows and 3 columns (in 1 run sets). INFO: Generating table... INFO: Writing CSV into table-defs/creation.table.csv ... INFO: done Generated creation.table.csv successfully Copying data for plots... mkdir -p analysis/raw/ cp table-defs/portfolio-cpachecker.table.csv analysis/raw/ cp table-defs/portfolio-esbmc-kind.table.csv analysis/raw/ cp table-defs/portfolio-uautomizer.table.csv analysis/raw/ cp table-defs/portfolio-symbiotic.table.csv analysis/raw/ cp table-defs/portfolio-bubaak.table.csv analysis/raw/ cp table-defs/portfolio-cpv.table.csv analysis/raw/ cp data/reach_safety.table.csv analysis/raw/ cp results-verified/META_Cover-Error.table.csv analysis/raw/ cp table-defs/creation.table.csv analysis/raw/ cp table-defs/validation.table.csv analysis/raw/ Copied data successfully. Creating plots... cd analysis && python3 plot.py RQ1: Converted tasks (done): 10897 -. To run a specific combination, for example CPAchecker with prtest, execute: make cpachecker-prtest To run all 24 evaluated, execute: make par-all 4.5 Recreating tables and plots To completely recreate the plots and their required data (after the experiments have run): ⚠️ Note: This will run a python script that will permanently update the table-defs/portfolio-*.xml to include the latest run portfolios. We recommend creating a backup of the table-defs cp -r table-defs table-defs.bak. make -B plots

Related Organizations
  • BIP!
    Impact byBIP!
    selected citations
    These citations are derived from selected sources.
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    0
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Average
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Average
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Average
Powered by OpenAIRE graph
Found an issue? Give us feedback
selected citations
These citations are derived from selected sources.
This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Citations provided by BIP!
popularity
This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
BIP!Popularity provided by BIP!
influence
This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Influence provided by BIP!
impulse
This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
BIP!Impulse provided by BIP!
0
Average
Average
Average