
Recent concerns about software supply chain security have led to the emergence of different binaries built from the same source code. This will sometimes result in binaries that are not identical and therefore have different cryptographic hashes. The question arises whether those binaries are still equivalent, i.e., whether they have the same behaviour. We explore whether differential testing can be used to provide evidence for non-equivalence. We study this for 3,541 pairs of binaries built for the same Maven artifact version, distributed on Maven Central, Google Assured Open Source Software and/or Oracle Build-From-Source. We use EvoSuite to generate tests for the baseline binary from Maven Central, run these tests against this baseline binary and any available alternately built binaries, and compare the results for consistency. We argue that any differences may indicate variations in program behaviour and could, therefore, be used to detect compromised binaries or failures at runtime. This dataset contains the final results used to produce Figure 1 in different_test_outcomes.tsv, as well as tools to reproduce these results from several starting points. See README.md for instructions. Citation: Jens Dietrich, Tim White, Valerio Terragni and Behnaz Hassanshahi. Towards Cross-Build Differential Testing. 18th IEEE International Conference on Software Testing, Verification and Validation (ICST) 2025.
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
