Powered by OpenAIRE graph
Found an issue? Give us feedback
addClaim

Selective GPU caches to eliminate CPU-GPU HW cache coherence

Authors: Neha Agarwal; David W. Nellans; Eiman Ebrahimi; Thomas F. Wenisch; John Danskin; Stephen W. Keckler;

Selective GPU caches to eliminate CPU-GPU HW cache coherence

Abstract

Cache coherence is ubiquitous in shared memory multiprocessors because it provides a simple, high performance memory abstraction to programmers. Recent work suggests extending hardware cache coherence between CPUs and GPUs to help support programming models with tightly coordinated sharing between CPU and GPU threads. However, implementing hardware cache coherence is particularly challenging in systems with discrete CPUs and GPUs that may not be produced by a single vendor. Instead, we propose, selective caching, wherein we disallow GPU caching of any memory that would require coherence updates to propagate between the CPU and GPU, thereby decoupling the GPU from vendor-specific CPU coherence protocols. We propose several architectural improvements to offset the performance penalty of selective caching: aggressive request coalescing, CPU-side coherent caching for GPU-uncacheable requests, and a CPU-GPU interconnect optimization to support variable-size transfers. Moreover, current GPU workloads access many read-only memory pages; we exploit this property to allow promiscuous GPU caching of these pages, relying on page-level protection, rather than hardware cache coherence, to ensure correctness. These optimizations bring a selective caching GPU implementation to within 93% of a hardware cache-coherent implementation without the need to integrate CPUs and GPUs under a single hardware coherence protocol.

Related Organizations
  • BIP!
    Impact byBIP!
    selected citations
    These citations are derived from selected sources.
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    30
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Top 10%
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Top 10%
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Top 10%
Powered by OpenAIRE graph
Found an issue? Give us feedback
selected citations
These citations are derived from selected sources.
This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Citations provided by BIP!
popularity
This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
BIP!Popularity provided by BIP!
influence
This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Influence provided by BIP!
impulse
This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
BIP!Impulse provided by BIP!
30
Top 10%
Top 10%
Top 10%
Upload OA version
Are you the author of this publication? Upload your Open Access version to Zenodo!
It’s fast and easy, just two clicks!