
arXiv: 2212.10612
As the landscape of deep neural networks evolves, heterogeneous dataflow accelerators, in the form of multi-core architectures or chiplet-based designs, promise more flexibility and higher inference performance through scalability. So far, these systems exploit the increased parallelism either by coarsely mapping a single layer at a time across cores, which incurs frequent, costly off-chip memory accesses, or by pipelining batches of inputs, which falls short of the demands of latency-critical applications. To alleviate these bottlenecks, this work explores a new fine-grain mapping paradigm, referred to as layer fusion, on heterogeneous dataflow accelerators through a novel design space exploration framework called Stream. Stream captures a wide variety of heterogeneous dataflow architectures and mapping granularities, and implements a memory- and communication-aware latency and energy analysis validated against three distinct state-of-the-art hardware implementations. As such, it facilitates a holistic exploration of architecture and mapping by strategically allocating the workload through constraint optimization. The findings demonstrate that combining layer fusion with heterogeneous dataflow accelerators yields up to 2.2x lower energy-delay product for inference, addressing both energy consumption and latency concerns. The framework is available as open source at: https://github.com/kuleuven-micas/stream.
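The headline metric in the abstract, energy-delay product (EDP), is the product of the energy consumed and the latency of an inference. The following is a minimal sketch of how two mappings might be compared on that metric. The function names and the energy and latency values are hypothetical placeholders chosen for illustration; they are not results from the paper and not part of the Stream API.

```python
def energy_delay_product(energy_joules: float, latency_seconds: float) -> float:
    """Return the energy-delay product (EDP) in joule-seconds."""
    if energy_joules < 0 or latency_seconds < 0:
        raise ValueError("energy and latency must be non-negative")
    return energy_joules * latency_seconds


def edp_improvement(baseline: tuple[float, float], candidate: tuple[float, float]) -> float:
    """Return how many times lower the candidate's EDP is than the baseline's."""
    return energy_delay_product(*baseline) / energy_delay_product(*candidate)


# Hypothetical (energy in J, latency in s) pairs, for illustration only.
layer_by_layer = (2.0e-3, 10.0e-3)
layer_fused = (1.5e-3, 8.0e-3)

ratio = edp_improvement(layer_by_layer, layer_fused)
print(f"EDP improvement: {ratio:.2f}x")
```

Because EDP multiplies the two quantities, a mapping only scores well if it avoids trading a large latency increase for an energy saving, or the reverse.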
12 pages + references, 16 figures
Hardware Architecture, Technology, 1006 Computer Hardware, Science & Technology, Computer Hardware & Architecture, 0803 Computer Software, Engineering, Electrical & Electronic, 0805 Distributed Computing, 4606 Distributed computing and systems software, heterogeneous systems, design space exploration, Engineering, accelerators, 4009 Electronics, sensors and digital hardware, Computer Science, Deep neural networks, layer fusion, Computer Science, Hardware & Architecture, dataflow
| Indicator | Value |
| --- | --- |
| Selected citations | 0 |
| Popularity | Average |
| Influence | Average |
| Impulse | Average |
