RefficientLib: an efficient load-rebalanced adaptive mesh refinement algorithm for high-performance computational physics meshes
Baiges Aznar, Joan
Bayona Roa, Camilo Andrés
Mathematics and statistics::Numerical analysis [UPC subject areas] | adaptivity | finite volumes | parallel | finite differences | adaptive mesh refinement | finite elements | high-performance computing | load rebalancing | Numerical analysis
No separate or additional fees are collected for access to or distribution of the work.
In this paper we present a novel algorithm for adaptive mesh refinement of computational physics meshes in a distributed-memory parallel setting. The proposed method is developed for nodally based parallel domain partitions, in which each node of the mesh belongs to a single processor, whereas an element can belong to multiple processors. The main features of the algorithm are its capability of handling multiple element types in two and three dimensions (triangular, quadrilateral, tetrahedral, and hexahedral), the small amount of memory required per processor, and its parallel scalability up to thousands of processors. The algorithm also handles nonbalanced hierarchical refinement, in which jumps of several refinement levels are possible between neighboring elements. A load rebalancing algorithm is also presented; it moves the hierarchical data structure between processors so that the load imbalance is kept below an acceptable level at all times during the simulation. A particular feature of the proposed algorithm is that arbitrary renumbering algorithms can be used in the load rebalancing step, including both graph partitioning and space-filling renumbering algorithms. The algorithm is packaged in the Fortran 2003 object-oriented library RefficientLib, and the interface calls that allow it to be used from any computational physics code are summarized. Finally, numerical experiments illustrating the performance and scalability of the algorithm are presented.
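The idea of rebalancing a nodally based partition with a space-filling renumbering can be illustrated with a minimal sketch. It is not RefficientLib's interface or the paper's algorithm: the function names (`morton_key`, `rebalance`, `imbalance`), the choice of a Z-order (Morton) curve, and the use of integer 2D node coordinates with per-node weights are all assumptions made for illustration. Nodes are sorted along the curve, and the curve is cut into contiguous pieces of roughly equal weight, one per processor.

```python
def morton_key(ix, iy, bits=16):
    """Interleave the bits of two non-negative integers (Z-order key)."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (2 * b)
        key |= ((iy >> b) & 1) << (2 * b + 1)
    return key


def rebalance(nodes, weights, nprocs):
    """Return the owner processor of each node.

    Nodes are ordered along the Z-order curve and the curve is cut into
    nprocs contiguous pieces of roughly equal total weight.
    """
    order = sorted(range(len(nodes)), key=lambda i: morton_key(*nodes[i]))
    total = sum(weights)
    owner = [0] * len(nodes)
    running = 0.0
    for i in order:
        # Use the weight midpoint of this node along the curve to pick a piece.
        mid = running + 0.5 * weights[i]
        owner[i] = min(int(mid * nprocs / total), nprocs - 1)
        running += weights[i]
    return owner


def imbalance(owner, weights, nprocs):
    """Ratio of the maximum processor load to the average load (1.0 is ideal)."""
    loads = [0.0] * nprocs
    for p, w in zip(owner, weights):
        loads[p] += w
    return max(loads) / (sum(loads) / nprocs)


if __name__ == "__main__":
    # Hypothetical 4 x 4 grid of nodes with unit weights, split over 4 processors.
    nodes = [(ix, iy) for ix in range(4) for iy in range(4)]
    weights = [1.0] * len(nodes)
    owner = rebalance(nodes, weights, 4)
    print(owner, imbalance(owner, weights, 4))
```

In a refinement setting the weights would typically grow where the mesh has been refined, so recomputing the cut points shifts nodes between neighboring pieces of the curve. A graph partitioner could be substituted for the sorting step without changing the rest of the sketch.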