Downloads provided by UsageCounts
handle: 2117/349221
This paper presents the design, implementation, and application of TALP, a lightweight, portable, extensible, and scalable tool for online parallel performance measurement. The efficiency metrics reported by TALP allow HPC users to evaluate the parallel efficiency of their executions, both post-mortem and at runtime. The API that TALP provides allows the running application or resource managers to collect performance metrics at runtime. This enables the opportunity to adapt the execution based on the metrics collected dynamically. The set of metrics collected by TALP are well defined, independent of the tool, and consolidated. We extend the collection of metrics with two additional ones that can differentiate between the load imbalance originated from the intranode or internode imbalance. We evaluate the potential of TALP with three parallel applications that present various parallel issues and carefully analyze the overhead introduced to determine its limitations. This work is partially supported by the Spanish Government through Programa Severo Ochoa (SEV-2015-0493), by the Spanish Ministry of Science and Technology (TIN2015-65316-P), by the Generalitat de Catalunya (2017-SGR-1414), and by the European POP CoE (GA n. 824080). Peer Reviewed
Supercomputadors, Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors::Arquitectures paral·leles, Parallel computer programs, Performance and optimization, High performance computing, Distributed computing systems, Heterogeneous, Supercomputers, Performance Monitoring, :Informàtica::Arquitectura de computadors::Arquitectures paral·leles [Àrees temàtiques de la UPC], Àrees temàtiques de la UPC::Informàtica::Programació, :Informàtica::Programació [Àrees temàtiques de la UPC]
Supercomputadors, Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors::Arquitectures paral·leles, Parallel computer programs, Performance and optimization, High performance computing, Distributed computing systems, Heterogeneous, Supercomputers, Performance Monitoring, :Informàtica::Arquitectura de computadors::Arquitectures paral·leles [Àrees temàtiques de la UPC], Àrees temàtiques de la UPC::Informàtica::Programació, :Informàtica::Programació [Àrees temàtiques de la UPC]
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 11 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Top 10% | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
| views | 32 | |
| downloads | 150 |

Views provided by UsageCounts
Downloads provided by UsageCounts