
Finding and fixing data races is a difficult parallel programming problem, even for experienced programmers. Despite the usage of race detectors at application development time, programmers might not be able to detect all races. Severe damage can be caused after application deployment at clients due to crashes and corrupted data. Run-time race detectors can tackle this problem, but current approaches either slow down application execution by orders of magnitude or require complex hardware. In this paper, we present a new approach to detect and repair races at application run-time. Our approach monitors cache coherency bus traffic for parallel accesses to unprotected shared resources. The technique has low overhead and requires just minor extensions to standard multicore hardware and software to make measurements more accurate. In particular, we exploit synergy effects between data needed for debugging and data made available by standard performance analysis hardware. We demonstrate feasibility and effectiveness using a controlled environment with a fully implemented software-based detector that executes real C/C++ applications. Our evaluations include the Helgrind and SPLASH2 benchmarks, as well as 29 representative parallel bug patterns derived from real-world programs. Experiments show that our technique successfully detects and automatically heals common race patterns, while the cache message overhead increases on average by just 0.2%.
ddc:004, DATA processing & computer science, info:eu-repo/classification/ddc/004, 004
ddc:004, DATA processing & computer science, info:eu-repo/classification/ddc/004, 004
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 1 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
