
arXiv: 1410.8772
handle: 1885/247402
Energy efficiency is the primary impediment on the path to exascale computing. Consequently, the high-performance computing community is increasingly interested in low-power, high-performance embedded systems as building blocks for large-scale high-performance systems. The Adapteva Epiphany architecture integrates low-power RISC cores on a 2D mesh network and promises up to 70 GFLOPS/Watt of theoretical performance. However, with just 32 KB of memory per eCore for storing both data and code, programming the Epiphany system presents significant challenges. In this paper we evaluate the performance of a 64-core Epiphany system with a variety of basic compute and communication micro-benchmarks. Further, we implement two well-known application kernels in single precision across 64 Epiphany cores: a 5-point star-shaped heat stencil, with a peak performance of 65.2 GFLOPS, and matrix multiplication, with 65.3 GFLOPS. We discuss strategies for implementing high-performance computing application kernels on such memory-constrained low-power devices and compare the Epiphany with competing low-power systems. With future Epiphany revisions expected to house thousands of cores on a single chip, understanding the merits of such an architecture is of prime importance to the exascale initiative.
stencil, FOS: Computer and information sciences, Network-on-chip, Computer Science - Distributed, Parallel, and Cluster Computing, Hardware Architecture (cs.AR), Computer Science - Mathematical Software, Distributed, Parallel, and Cluster Computing (cs.DC), Epiphany, Computer Science - Hardware Architecture, Mathematical Software (cs.MS), matrix-matrix multiplication, parallella
| selected citations | 30 |
| popularity | Top 10% |
| influence | Top 10% |
| impulse | Top 10% |
