
handle: 2117/439326
Next-generation artificial intelligence (AI) workloads pose scalability and robustness challenges in terms of execution time because of their evolving, intrinsically data-intensive nature. In this paper, we analyse the potential bottlenecks caused by the data movement characteristics of AI workloads on scale-out accelerator architectures composed of multiple chiplets. Our methodology captures the unicast and multicast communication traffic of a set of AI workloads and assesses aspects such as the time spent in these communications and the number of multicast messages as a function of the number of chiplets employed. Our studies reveal that some AI workloads are potentially vulnerable to the dominant effects of communication, especially multicast traffic, which can become a performance bottleneck and limit their scalability. Workload profiling insights suggest architecting a flexible interconnect solution at the chiplet level to improve the performance, efficiency and scalability of next-generation AI accelerators.
5 Pages
FOS: Computer and information sciences, AI accelerators, Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors, Multi-chiplet accelerators, Communication, Hardware Architecture (cs.AR), Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial, Network-on-package, Computer Science - Hardware Architecture
| selected citations | 0 |
| popularity | Average |
| influence | Average |
| impulse | Average |
