
<script type="text/javascript">
<!--
document.write('<div id="oa_widget"></div>');
document.write('<script type="text/javascript" src="https://www.openaire.eu/index.php?option=com_openaire&view=widget&format=raw&projectId=undefined&type=result"></script>');
-->
</script>
handle: 2117/117442
Clusters of SMPs are ubiquitous. They have been traditionally programmed by using MPI. But, the productivity of MPI programmers is low because of the complexity of expressing parallelism and communication, and the difficulty of debugging. To try to ease the burden on the programmer new programming models have tried to give the illusion of a global shared-address space (e.g., UPC, Co-array Fortran). Unfortunately, these models do not support, increasingly common, irregular forms of parallelism that require asynchronous task parallelism. Other models, such as X10 or Chapel, provide this asynchronous parallelism but the programmer is required to rewrite entirely his application. We present the implementation of OmpSs for clusters, a variant of OpenMP extended to support asynchrony, heterogeneity and data movement for task parallelism. As OpenMP, it is based on decorating an existing serial version with compiler directives that are translated into calls to a runtime system that manages the parallelism extraction and data coherence and movement. Thus, the same program written in OmpSs can run in a regular SMP machine, in clusters of SMPs, or even can be used for debugging with the serial version. The runtime uses the information provided by the programmer to distribute the work across the cluster while optimizes communications using affinity scheduling and caching of data. We have evaluated our proposal with a set of kernels and the OmpSs versions obtain a performance comparable, or even superior, to the one obtained by the same version of MPI.
Peer Reviewed
Parallel processing (Electronic computers), Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors::Arquitectures paral·leles, Processament en paral·lel (Ordinadors), Task parallelism, Address space, Master node, Remote node, Distribute shared memory, :Informàtica::Arquitectura de computadors::Arquitectures paral·leles [Àrees temàtiques de la UPC]
Parallel processing (Electronic computers), Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors::Arquitectures paral·leles, Processament en paral·lel (Ordinadors), Task parallelism, Address space, Master node, Remote node, Distribute shared memory, :Informàtica::Arquitectura de computadors::Arquitectures paral·leles [Àrees temàtiques de la UPC]
citations This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 73 | |
popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% | |
influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Top 10% | |
impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
views | 58 |