
Strong gravitational lensing is a powerful probe of cosmology and the dark matter distribution. Efficient lensing software is already necessary to exploit its full potential, and the performance demands will only increase with the upcoming generation of telescopes. In this paper, we study the possible impact of High Performance Computing techniques on a performance-critical part of the widely used lens modeling software LENSTOOL. We implement the algorithm for a simple parametric lens model twice: once as a highly optimized CPU version and once with graphics card acceleration. In addition, we study the impact of finite machine precision on the lensing algorithm. While double precision is the default choice for scientific applications, we find that single precision can be sufficiently accurate for our purposes and leads to a large speedup. Therefore, we develop and present a mixed precision algorithm which uses double precision only when necessary. We measure the performance of the different implementations and find that the use of High Performance Computing techniques dramatically improves the code performance on both CPUs and GPUs. Compared to the current LENSTOOL implementation on 12 CPU cores, we obtain speedup factors of up to 170. We achieve this best performance by using our mixed precision algorithm on a high-end GPU of the kind common in modern supercomputers. We also show that these techniques reduce the energy consumption by up to 98%. Furthermore, we demonstrate that a highly competitive speedup can be reached with consumer GPUs. While they are an order of magnitude cheaper than high-end graphics cards, they are rarely used for scientific computations due to their low double precision performance. Our mixed precision algorithm unlocks their full potential: the consumer GPU delivers a speedup that is only a factor of four lower than the best speedup achieved by a high-end GPU.
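The abstract does not state the criterion used to decide when double precision is necessary. Purely as an illustration of the general idea, the sketch below evaluates a difference of the form beta = theta - alpha (the shape of the standard lens equation) in single precision and falls back to double precision only when the result is much smaller than its operands, which is where single-precision rounding of the inputs dominates. The function names, the cancellation test, and the threshold `rel_tol` are assumptions made for this example and are not taken from the paper or from LENSTOOL.

```python
import struct


def f32(x: float) -> float:
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def source_position_mixed(theta: float, alpha: float, rel_tol: float = 1e-3):
    """Hypothetical mixed precision evaluation of beta = theta - alpha.

    Returns (beta, mode), where mode is "single" or "double".
    The fallback rule and rel_tol are illustrative assumptions.
    """
    # Single-precision pass: inputs and result are rounded to float32.
    t32, a32 = f32(theta), f32(alpha)
    beta32 = f32(t32 - a32)

    # If the result is tiny compared with the operands, the rounding of the
    # inputs has been amplified by cancellation, so redo the step in double.
    scale = abs(t32) + abs(a32)
    if abs(beta32) < rel_tol * scale:
        return theta - alpha, "double"
    return beta32, "single"
```

With well-separated inputs, for example `source_position_mixed(1.0, 0.25)`, the single-precision result is kept. With nearly equal inputs, for example `source_position_mixed(1.0, 1.0 - 1e-9)`, the single-precision difference collapses to zero and the double-precision branch is taken instead.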
28 pages, submitted to Astronomy & Computing
Cosmology and Nongalactic Astrophysics (astro-ph.CO), Gravitational lensing, FOS: Physical sciences, Galaxies: halos, Applied computing: Physical sciences and engineering: Astronomy, [PHYS.PHYS.PHYS-INS-DET] Physics [physics]/Physics [physics]/Instrumentation and Detectors [physics.ins-det], Dark matter, Computing methodologies: Parallel computing methodologies: Parallel algorithms: Massively parallel algorithms, Galaxies: clusters: general, [PHYS.ASTR] Physics [physics]/Astrophysics [astro-ph], Astrophysics - Instrumentation and Methods for Astrophysics, Instrumentation and Methods for Astrophysics (astro-ph.IM), Astrophysics - Cosmology and Nongalactic Astrophysics
