
arXiv: 2310.10168
The growing volume of data in modern applications has led to significant computational costs in conventional processor-centric systems. Processing-in-memory (PIM) architectures alleviate these costs by moving computation closer to memory, reducing data movement overheads. UPMEM is the first commercially available PIM system, featuring thousands of in-order processors (DPUs) integrated within DRAM modules. However, a programming UPMEM-based system remains challenging due to the need for explicit data management and workload partitioning across DPUs. We introduce DaPPA (data-parallel processing-in-memory architecture), a programming framework that eases the programmability of UPMEM systems by automatically managing data movement, memory allocation, and workload distribution. The key idea behind DaPPA is to leverage a high-level data-parallel pattern-based programming interface to abstract hardware complexities away from the programmer. DaPPA comprises three main components: (i) data-parallel pattern APIs, a collection of five primary data-parallel pattern primitives that allow the programmer to express data transformations within an application; (ii) a dataflow programming interface, which allows the programmer to define how data moves across data-parallel patterns; and (iii) a dynamic template-based compilation, which leverages code skeletons and dynamic code transformations to convert data-parallel patterns implemented via the dataflow programming interface into an optimized UPMEM binary. We evaluate DaPPA using six workloads from the PrIM benchmark suite on a real UPMEM system. Compared to hand-tuned implementations, DaPPA improves end-to-end performance by 2.1x, on average, and reduces programming complexity (measured in lines-of-code) by 94%. Our results demonstrate that DaPPA is an effective programming framework for efficient and user-friendly programming on UPMEM systems.
FOS: Computer and information sciences, Hardware Architecture (cs.AR), Computer Science - Hardware Architecture
FOS: Computer and information sciences, Hardware Architecture (cs.AR), Computer Science - Hardware Architecture
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
