
NestedMP: Taming Complex Configuration Space of Degree of Parallelism for Nested-Parallel Programs

Exploiting multiple levels of parallelism benefits a wide range of applications, because a typical server already has tens of processor cores. As core counts continue to grow rapidly, efficient support for nested parallelism will become more important. However, nested parallelism is much harder to program than single-level parallelism, because its configuration space for the degree of parallelism is far more complex. Current parallel programming models such as OpenMP offer only naive support for nested parallelism, and programmers must explicitly specify the number of threads for each parallel task to obtain reasonable performance. This approach has two drawbacks. First, writing code that determines appropriate configurations for different environments and contexts is complicated. Second, the runtime system lacks sufficient global information about thread allocation to make optimal task-to-core mapping decisions, which can easily cause significant performance loss. To address these problems, we propose NestedMP, a set of directives that extends OpenMP. NestedMP adopts a model that propagates available threads down the task tree from the top. This gives the runtime system global information about thread allocation at the moment high-level parallel tasks are launched, which helps it make locality-aware task-to-core mapping decisions. In addition, instead of configuring the number of threads explicitly, programmers control it through policies defined in NestedMP. We have written several benchmarks with NestedMP, which show that NestedMP makes the code more concise in most cases. We have implemented NestedMP in GCC 4.8.2 and tested the performance of these benchmarks on a 4-way 8-core Sandy Bridge server. The results show that NestedMP significantly improves performance over GCC's OpenMP implementation.
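
For context, the explicit configuration style that the abstract criticizes looks roughly like the following in standard OpenMP. This is a minimal sketch, not code from the paper: the function name `nested_sum` and the hard-coded 4 x 8 thread split are assumptions chosen to mirror the evaluation machine, and NestedMP's own directives are not shown because the abstract does not give their syntax.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Assumed, hand-picked configuration: 4 outer x 8 inner threads.
   These constants are illustrative, not taken from the paper. */
enum { OUTER_THREADS = 4, INNER_THREADS = 8 };

/* Sum a rows x cols row-major matrix using two levels of parallelism.
   The programmer fixes the thread count at each level by hand. */
long nested_sum(const int *data, size_t rows, size_t cols)
{
    assert(data != NULL);
    long total = 0;
#ifdef _OPENMP
    /* Allow the inner parallel region to fork its own team. */
    omp_set_max_active_levels(2);
#endif
    #pragma omp parallel for num_threads(OUTER_THREADS) reduction(+:total)
    for (long i = 0; i < (long)rows; i++) {
        long row_total = 0;
        #pragma omp parallel for num_threads(INNER_THREADS) reduction(+:row_total)
        for (long j = 0; j < (long)cols; j++) {
            row_total += data[(size_t)i * cols + (size_t)j];
        }
        total += row_total;
    }
    return total;
}
```

When built with `-fopenmp`, each outer iteration may fork an inner team of its own. The two constants are exactly the kind of per-level setting that would have to be retuned for a different machine or calling context, and the runtime sees each `num_threads` request only when the corresponding region starts.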
