The current trend in recently released Graphics Processing Units (GPUs) is to exploit transistor scaling at the architectural level; hence, every new chip generation ships larger and larger GPUs. Architecturally, this means that the number of clusters of parallel processing elements embedded within a single GPU die keeps increasing, posing novel and interesting performance-engineering challenges in latency-sensitive scenarios. A single GPU kernel is now unlikely to scale linearly when dispatched to a GPU featuring a larger cluster count, either because VRAM bandwidth acts as a bottleneck or because the kernel cannot saturate the massively parallel compute power available in these novel architectures. In this context, novel scheduling approaches can be derived by treating the GPU as a partitionable compute engine in which multiple concurrent kernels are scheduled onto non-overlapping sets of clusters. While such an approach is very effective at improving overall GPU utilization, it makes it significantly harder to estimate kernel execution latencies when kernels are dispatched to variable-sized GPU partitions. Moreover, memory interference among co-running kernels is a mandatory aspect to consider. In this work, we derive a practical yet fairly accurate memory-aware latency estimation model for co-running GPU kernels.
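The kind of model the abstract describes can be illustrated with a simple roofline-style sketch: latency is bounded by whichever is slower, compute on the assigned partition or VRAM traffic under a bandwidth share reduced by co-runners. All function names, parameters, and the proportional-sharing assumption below are hypothetical illustrations, not the model actually derived in the paper.

```python
# Illustrative roofline-style latency estimate for a kernel running on a
# GPU partition of `clusters` processing clusters, sharing VRAM bandwidth
# with co-running kernels. Names and the sharing policy are assumptions.

def predict_latency_ms(flops, dram_bytes, clusters,
                       flops_per_cluster_per_ms, vram_bw_bytes_per_ms,
                       co_runner_dram_bytes=0.0):
    """Return an estimated kernel latency in milliseconds."""
    # Compute time shrinks (ideally linearly) with the partition size.
    compute_ms = flops / (clusters * flops_per_cluster_per_ms)
    # VRAM bandwidth is shared: model interference by granting the kernel
    # a traffic-proportional share of the total bandwidth.
    total_traffic = dram_bytes + co_runner_dram_bytes
    bw_share = dram_bytes / total_traffic if total_traffic else 1.0
    memory_ms = dram_bytes / (vram_bw_bytes_per_ms * bw_share)
    # Latency is bounded by the slower of the two resources.
    return max(compute_ms, memory_ms)
```

Under this toy model, a compute-bound kernel halves its latency when its partition doubles, while a memory-bound kernel's latency is unaffected by extra clusters but degrades in proportion to the co-runners' memory traffic, which is exactly why a memory-aware term is needed.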
Keywords: Latency Prediction; Concurrent Kernels; GPGPU-Simulator; Partitionable GPU
| Indicator | Value |
| --- | --- |
| selected citations | 1 |
| popularity | Average |
| influence | Average |
| impulse | Average |
| views | 100 |
| downloads | 9 |

Views provided by UsageCounts
Downloads provided by UsageCounts