
The rich thread-level parallelism of GPUs has motivated co-running multiple kernels on a single GPU. However, when kernels co-run, one kernel can exploit a buffer overflow to attack another kernel running on the same GPU. Very little work has aimed to detect buffer overflow on GPUs, and existing approaches either incur large performance overhead or have limited detection capability. In this article, we introduce GMODx, a runtime software system that detects GPU buffer overflow. GMODx performs always-on monitoring of allocated memory using a canary-based design. First, for fine-grained memory management, GMODx introduces a set of byte arrays that store buffer information for overflow detection. To achieve high performance, we propose techniques such as lock-free accesses to the byte arrays, delayed memory free, efficient memory reallocation, and garbage collection for the byte arrays. Second, for coarse-grained memory management, GMODx uses unified memory to delegate the always-on monitoring to the CPU. To reduce performance overhead, we propose several techniques, including a customized list data structure and optimizations specific to unified memory. In micro-benchmarks, our experiments show that GMODx detects buffer overflow under fine-grained memory management without performance loss, and that it incurs small runtime overhead (4.2 percent on average and up to 9.7 percent) under coarse-grained memory management. For real workloads, we deploy GMODx on the TensorFlow framework, where it causes only 0.8 percent overhead on average (up to 1.8 percent).
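To make the canary idea concrete, the following is a minimal host-side C++ sketch of canary-guarded allocation in general. It is not GMODx's implementation: the canary value and the names `GuardedBuffer`, `guarded_alloc`, `payload`, `canaries_intact`, and `guarded_free` are all hypothetical, and the sketch uses plain `malloc` on the CPU rather than GPU memory, byte arrays, or unified memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical canary value; any hard-to-guess pattern would do.
constexpr std::uint64_t kCanary = 0xDEADBEEFCAFEBABEULL;

// Hypothetical descriptor for one guarded allocation.
struct GuardedBuffer {
    unsigned char* raw;  // start of the underlying allocation (head canary)
    std::size_t size;    // user-visible payload size in bytes
};

// Allocate `size` payload bytes with one canary word before and one after.
GuardedBuffer guarded_alloc(std::size_t size) {
    auto* raw = static_cast<unsigned char*>(
        std::malloc(size + 2 * sizeof(kCanary)));
    if (raw == nullptr) return {nullptr, 0};
    std::memcpy(raw, &kCanary, sizeof(kCanary));                          // head
    std::memcpy(raw + sizeof(kCanary) + size, &kCanary, sizeof(kCanary)); // tail
    return {raw, size};
}

// Pointer handed to the user: the first byte after the head canary.
unsigned char* payload(const GuardedBuffer& b) {
    assert(b.raw != nullptr);
    return b.raw + sizeof(kCanary);
}

// A monitor would call this repeatedly; false means a canary was overwritten.
bool canaries_intact(const GuardedBuffer& b) {
    std::uint64_t head = 0;
    std::uint64_t tail = 0;
    std::memcpy(&head, b.raw, sizeof(head));
    std::memcpy(&tail, b.raw + sizeof(kCanary) + b.size, sizeof(tail));
    return head == kCanary && tail == kCanary;
}

// Release the allocation and clear the descriptor.
void guarded_free(GuardedBuffer& b) {
    std::free(b.raw);
    b.raw = nullptr;
    b.size = 0;
}
```

In this sketch, a write such as `payload(b)[b.size] = 0x41` lands on the tail canary, so the next `canaries_intact(b)` call returns `false`. An always-on monitor, as described in the abstract, would run such a check continuously rather than only when the buffer is freed.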
