
handle: 11245/1.473197
With the turn to multicore in chip design and manufacturing, both consumer and high-performance applications can benefit from ubiquitous hardware parallelism. However, the achievable performance improvement is not always in the orders-of-magnitude range. In this paper, we present the challenging example of designing a parallel version of a model-fitting algorithm used to calibrate telescope observation data in radio astronomy. The complexity of the application, together with the limited opportunities for code modification, bounds the performance gain that any parallel system could achieve. However, we show how classical "bound-and-bottleneck" analysis and optimization using multicore architectures help achieve up to 2.3x "wall clock" speedup compared to the original sequential implementation. We further discuss the reasons for this limitation and suggest possible solutions to address it.
530, 620
| selected citations | 0 |
| popularity | Average |
| influence | Average |
| impulse | Average |
