Large-memory nodes for energy efficient high-performance computing

Conference object English OPEN
Zivanovic, Darko ; Radulovic, Milan ; Llort, German ; Zaragoza, David ; Strassburg, Janko ; Carpenter, Paul M. ; Radojkovic, Petar ; Ayguadé Parra, Eduard (2016)
  • Publisher: Association for Computing Machinery (ACM)
  • Related identifiers: doi: 10.1145/2989081.2989083
  • Subject: Informàtica [Àrees temàtiques de la UPC] | Computer systems organization | Càlcul intensiu (Informàtica) | Distributed architectures | High performance computing | Power and energy | Hardware

Energy consumption is by far the most important contributor to HPC cluster operational costs, and it accounts for a significant share of the total cost of ownership. Advanced energy-saving techniques in HPC components have received significant research and development effort, but a simple measure that can dramatically reduce energy consumption is often overlooked. We show that, in capacity computing, where many small to medium-sized jobs have to be solved at the lowest cost, a practical energy-saving approach is to scale in the application on large-memory nodes. We evaluate scaling-in, i.e., decreasing the number of application processes and compute nodes (servers) used to solve a fixed-size problem, using a set of HPC applications running in a production system. Using standard-memory nodes, we obtain average energy savings of 36%, already a huge figure. We show that the main source of these energy savings is a decrease in the node-hours (node_hours = #nodes × exe_time), which is a consequence of the more efficient use of hardware resources. Scaling-in is limited by the per-node memory capacity. We therefore consider using large-memory nodes to enable a greater degree of scaling-in. We show that the additional energy savings, of up to 52%, mean that in many cases the investment in upgrading the hardware would be recovered in a typical system lifetime of less than five years.

Peer Reviewed
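The node-hours relation quoted in the abstract (node_hours = #nodes × exe_time) can be illustrated with a minimal sketch. The function names, the constant average per-node power, and all input numbers below are hypothetical and chosen only for illustration; they are not measurements or methods taken from the paper.

```python
def node_hours(num_nodes: int, exe_time_hours: float) -> float:
    """Node-hours as defined in the abstract: #nodes x exe_time."""
    return num_nodes * exe_time_hours


def energy_kwh(num_nodes: int, exe_time_hours: float, avg_node_power_watts: float) -> float:
    """Estimated energy, ASSUMING a constant average power draw per node."""
    return node_hours(num_nodes, exe_time_hours) * avg_node_power_watts / 1000.0


def relative_saving(baseline: float, scaled_in: float) -> float:
    """Fraction saved by the scaled-in run relative to the baseline run."""
    return 1.0 - scaled_in / baseline


# Hypothetical runs of the same fixed-size problem (illustrative numbers only).
baseline_energy = energy_kwh(num_nodes=16, exe_time_hours=1.0, avg_node_power_watts=300.0)
scaled_in_energy = energy_kwh(num_nodes=4, exe_time_hours=2.5, avg_node_power_watts=320.0)

saving = relative_saving(baseline_energy, scaled_in_energy)
print(f"baseline: {baseline_energy:.2f} kWh, scaled-in: {scaled_in_energy:.2f} kWh")
print(f"relative energy saving: {saving:.1%}")
```

In this made-up case the scaled-in run takes longer, yet it uses fewer node-hours (10 instead of 16), so the estimated energy drops even though each node is assumed to draw slightly more power.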