# Reproduction Package for "Interpolation and SAT-Based Model Checking Revisited: Adoption to Software Verification"

## Abstract

This artifact is a reproduction package for the paper "Interpolation and SAT-Based Model Checking Revisited: Adoption to Software Verification" (submitted to CAV '22). It contains the code and data used in the evaluation of the paper, including CPAchecker with the implementation described in the paper, the benchmark suite used for the evaluation, the raw data resulting from the evaluation, scripts to execute the experiments, and instructions for reproducing the tables and figures in the paper. A full reproduction of the evaluation needs 16 GB of memory and about half a year of CPU time, but a limited subset of the benchmark tasks can be used for demonstration purposes if at least two CPU cores and 4 GB of memory are available. This artifact is published with DOI 10.5281/zenodo.5873048.

## Content

This artifact contains the following items:

- `README.*`: this documentation
- `License.txt`: license information for the artifact
- `cpachecker/`: a directory containing CPAchecker at revision `cfa-single-loop-transformation:39212` (sources plus binary), which was used in our evaluation
- `sv-benchmarks/`: a directory containing the benchmark suite used in SV-COMP '22 and our evaluation (a subset of tag `svcomp22` of SV-Benchmarks)
- `packages/`: the packages necessary for executing the experiments on Ubuntu 20.04, tested with the TACAS '21 AE VM
- `data-submission/`: raw data produced by our experiments, both from the full experiment and from a small demo subset
- `CAV22-interpolation-model-checking.xml`: the benchmark definition of the experiments for BenchExec
- `Makefile`: a file to execute the commands
- `filter-tasks.py`: a script to filter out specified verification tasks from benchmark results
- `filtered-tasks.csv`: a list of verification tasks for which explicitly tracking the call stack during the analysis prevents the application of IMC

This README will guide you through the following steps:

1. Setting up CPAchecker and BenchExec
2. Running the example verification task used in the paper
3. Performing experiments over a selected category and the complete benchmark suite
4. Analyzing the experimental data
5. Discussing known issues of the artifact

## TL;DR

The raw results of the experiment presented in the paper are available in this table. To reproduce the experiment on an Ubuntu 20.04 machine, type the following commands to:

1. Install Java 11: `make install-java-11`
2. Install BenchExec: `make install-benchexec`
3. Reboot the system
4. `make prepare-benchexec` (to turn off swap memory and allow user namespaces; this needs to be redone after every reboot)
5. `make test-cgroup` (to test whether cgroups are configured correctly)
6. Test whether CPAchecker can be run normally: `make test-cpachecker`
7. Test whether BenchExec was installed successfully: `make test-benchexec`
8. Run an example for IMC (McMillan, 2003): `make example-imc`
9. Run an example for predicate abstraction: `make example-pred`
10. Run an example for k-induction: `make example-kind`
11. Perform a demo experiment on a subset of the tasks: `make demo-run`.
    To require fewer resources, we recommend `make timelimit=60s memlimit=3GB demo-run`, which sets the time limit to 60 seconds and the memory limit to 3 GB per task. Note that this still requires increasing the VM memory when using the TACAS '21 AE VM.
12. Perform the complete experiment: `make full-run` (needs 16 GB of RAM and 180 days of CPU time)
13. Open this file with a browser to view the submitted results from a demo run
14. Open this file with a browser to view the submitted results from a full run (the data used in the paper)

To improve readability, in the following we only excerpt important fragments of the logs. The complete logs for the above commands are listed at the end of this file for reference.
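The task-exclusion step performed by `filter-tasks.py` (dropping from the benchmark results the tasks listed in `filtered-tasks.csv`) can be illustrated with a minimal sketch. The function name, column name, and file contents below are hypothetical; the actual script may work differently:

```python
import csv
import io

def filter_tasks(results_csv, excluded_tasks):
    """Drop result rows whose task name appears in the exclusion list.

    `results_csv` is CSV text with a `task` column; `excluded_tasks` is a
    collection of task names (in the artifact, these would come from
    filtered-tasks.csv).
    """
    reader = csv.DictReader(io.StringIO(results_csv))
    return [row for row in reader if row["task"] not in excluded_tasks]

# Example with made-up task names:
results = "task,status\na.yml,true\nb.yml,false\nc.yml,true\n"
kept = filter_tasks(results, {"b.yml"})
print([row["task"] for row in kept])  # -> ['a.yml', 'c.yml']
```

The point of such a filter is to compare tools on a common task set: tasks for which one configuration is inapplicable (here, where tracking the call stack prevents IMC) are removed from all result sets before the comparison.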
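The 180-day figure for `make full-run` is CPU time, so the wall-clock duration depends on how many tasks run in parallel. A back-of-the-envelope sketch (the core count below is an assumption for illustration, not a requirement of the artifact):

```python
def wall_clock_days(cpu_days, parallel_runs):
    """Rough wall-clock estimate: the CPU-time budget divided evenly
    across parallel run slots (ignores per-run overhead and
    imperfect load balancing)."""
    return cpu_days / parallel_runs

# ~180 CPU-days on a machine running 8 verification tasks in parallel:
print(wall_clock_days(180, 8))  # -> 22.5
```

Remember that each parallel run slot also needs its own share of the 16 GB memory budget, so the degree of parallelism is bounded by available RAM as well as by cores.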
Keywords: Model checking, Software verification, SMT, Program analysis, CPAchecker, SAT, Interpolation
