
doi: 10.1109/12.256449
The cache invalidation patterns of several parallel applications are analyzed. The results are based on multiprocessor simulations with 8, 16, and 32 processors. To provide deeper insight into the observed behavior, the invalidations seen in the simulations are linked to the high-level program objects that cause them. To predict what the invalidation patterns would look like beyond 32 processors, a classification scheme for the data objects found in parallel programs is proposed. The classification scheme provides a powerful conceptual tool for reasoning about the invalidation patterns of parallel applications. Results indicate that it should be possible to scale well-written parallel programs to a large number of processors without an explosion in invalidation traffic. At the same time, the invalidation patterns are such that directory-based schemes with just a few pointers per entry can be very effective. The variations in invalidation behavior with different cache line sizes are also discussed. The results indicate that cache line sizes in the 32-byte range yield the lowest data and invalidation traffic.
