Conference object
Data sources: ZENODO
Cross-platform Programming Model for GPU Implementation of OpenFOAM Using ISO C++

Authors: Kumar, Mayank; Castagna, Jony; Janssens, Mattijs; Tan, Raynold; Berrisford, Liam; Liu, Wendi; Tabor, Gavin; +1 Authors

Abstract

In this work, we present our port of OpenFOAM to GPUs using the C++ standard parallel execution model (stdpar) introduced in ISO C++17. With a minimally intrusive strategy—primarily replacing serial loops with stdpar constructs—we offload the full computational workload of typical CFD simulations to multicore and manycore architectures. This approach is vendor‑agnostic, maintains a single unified codebase, and can be integrated easily into the main OpenFOAM release. We demonstrate performance using the icoFoam and simpleFoam solvers across four test cases: the 3D lid‑driven cavity, 3D conical diffuser, HPC motorbike, and drivAer automotive geometry. Experiments were conducted on a range of NVIDIA and AMD systems, including CPU‑only, hybrid CPU–GPU, and unified‑memory CPU–GPU configurations. Measured speedups relative to a fully populated 32‑core CPU socket range from 0.4× to 7.7×, depending on boundary‑condition complexity, turbulence modelling, and solver type. Details of the porting methodology and performance results are provided.
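
As a rough illustration of what replacing a serial loop with a stdpar construct can look like, the following minimal sketch rewrites a generic field update and a reduction using ISO C++17 parallel algorithms. It is not taken from the authors' port: the function names (axpySerial, axpyStdpar, sumSqrStdpar) and the use of std::vector<double> as a stand-in for a field are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cstddef>
#include <execution>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical serial baseline: y[i] = a * x[i] + y[i] for every element.
// Assumes x and y have the same size.
void axpySerial(double a, const std::vector<double>& x, std::vector<double>& y)
{
    for (std::size_t i = 0; i < y.size(); ++i)
    {
        y[i] = a * x[i] + y[i];
    }
}

// The same update expressed as a C++17 parallel algorithm. The execution
// policy allows the implementation to run iterations in parallel; where
// they actually run (CPU threads or a GPU) depends on the compiler and
// its flags, not on this source code.
void axpyStdpar(double a, const std::vector<double>& x, std::vector<double>& y)
{
    std::transform(
        std::execution::par_unseq,
        x.begin(), x.end(),   // first input range
        y.begin(),            // second input range
        y.begin(),            // output range (in-place update of y)
        [a](double xi, double yi) { return a * xi + yi; });
}

// A reduction (sum of squares) written with transform_reduce, the stdpar
// counterpart of a serial accumulation loop.
double sumSqrStdpar(const std::vector<double>& x)
{
    return std::transform_reduce(
        std::execution::par_unseq,
        x.begin(), x.end(),
        0.0,                              // initial value
        std::plus<>{},                    // how partial results are combined
        [](double v) { return v * v; });  // applied to each element first
}
```

The source stays the same for every target, which is the property the abstract describes as a single unified codebase. How the policy is mapped to hardware is toolchain-specific: for example, GCC's standard library typically relies on Intel TBB for multicore execution, and the NVIDIA HPC compiler offers a -stdpar option for GPU offload. The build setup used in the work itself is not stated in the abstract.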
