Downloads provided by UsageCounts
handle: 2117/393224
The scale and heterogeneity of exascale systems increase the complexity of programming the applications that exploit them. Task-based approaches with support for nested tasks are a well-suited model for such systems because of the flexibility inherent in the task concept. Mirroring the hierarchical organization of the hardware, this paper proposes establishing a hierarchy in the application workflow: coarse-grain tasks are mapped to the broader hardware components and finer-grain tasks to the lowest levels of the resource hierarchy, to benefit from lower-latency, higher-bandwidth communication and to exploit locality. Building on a proposed mechanism that encapsulates the management of a task's finer-grain parallelism within the task itself, the paper presents a hierarchical peer-to-peer engine that orchestrates the execution of workflow hierarchies with fully decentralized management. Tests conducted on the MareNostrum 4 supercomputer with a prototype implementation validate the proposal, supporting the execution of up to 707,653 tasks on 2,400 cores and achieving speedups of up to 106 times over executions of a single workflow with centralized management.
This work has been supported by the Spanish Government (PID2019-107255GB), by MCIN/AEI/10.13039/501100011033 (CEX2021-001148-S), by the Departament de Recerca i Universitats de la Generalitat de Catalunya to the Research Group MPiEDist (2021 SGR 00412), and by the European Commission through the Horizon Europe Research and Innovation program under Grant Agreements 101070177 (ICOS project) and 101016577 (AI-Sprint project).
Peer Reviewed
Distributed systems, Workflow, Runtime system, Exascale, Programming model, Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors, Hierarchy, Task-based, Peer-to-peer, Decentralized management, Electronic data processing -- Distributed processing, High performance computing, Càlcul intensiu (Informàtica), Processament distribuït de dades
| indicator | description | value |
| --- | --- | --- |
| selected citations | These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 1 |
| popularity | This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| influence | This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| impulse | This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| views | | 64 |
| downloads | | 11 |

Views provided by UsageCounts