Powered by OpenAIRE graph
https://doi.org/10.1109/hpcsim...
Article . 2016 . Peer-reviewed
Data sources: Crossref

Energy consumption optimization of the Total-FETI solver and BLAS routines by changing the CPU frequency

Authors: Horak, David; Riha, Lubomir; Sojka, Radim; Kruzik, Jakub; Beseda, Martin;

Abstract

The energy consumption of supercomputers is one of the critical problems for the upcoming Exascale supercomputing era. Awareness of power and energy consumption is required on both the software and the hardware side. This poster evaluates the energy consumption of linear system solvers based on the Total Finite Element Tearing and Interconnect (TFETI) method [2], an established method for solving real-world engineering problems, as implemented in the PERMON toolbox [1], and the energy consumption of BLAS routines. The experiments examine the effect of the CPU frequency. This work is performed in the scope of the READEX project (Runtime Exploitation of Application Dynamism for Energy-efficient eXascale computing) [6]. The measurements were performed on the Taurus system installed at TU Dresden, which is based on Intel Xeon E5-2680 processors (Intel Haswell micro-architecture). The system contains over 1400 nodes equipped with FPGA-based power instrumentation called HDEEM (High Definition Energy Efficiency Monitoring), which allows fine-grained and more accurate power and energy measurements. The measurements can be accessed through the HDEEM library, allowing developers to take energy measurements before and after a region of interest. We have evaluated the effect of the CPU frequency on the energy consumption of the TFETI solver for a synthetic 3D cube linear elasticity benchmark. On the dualized problem MPFX = MPd, we have evaluated the effect of frequency tuning on the energy consumption of the essential processing kernels of the TFETI method. TFETI has two main phases: preprocessing and solve. In preprocessing, the stiffness matrix K must be regularized and factorized, and the matrices G and GG^T must be assembled, with GG^T then factorized as well. Both factorizations are among the most time- and energy-consuming operations.
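The before-and-after measurement pattern mentioned above can be sketched as follows. This is a minimal illustration in Python and does not reproduce the HDEEM library's API: `read_energy_joules` is a hypothetical callable standing in for whatever cumulative energy counter is available, and `measure_energy` and `relative_saving` are names introduced here purely for illustration.

```python
from contextlib import contextmanager


@contextmanager
def measure_energy(read_energy_joules, results, label):
    """Accumulate the energy consumed by the enclosed region under `label`.

    `read_energy_joules` is a hypothetical callable returning a cumulative
    energy counter in joules (a stand-in for a real measurement library).
    """
    start = read_energy_joules()  # reading taken before the region
    try:
        yield
    finally:
        # reading taken after the region; the difference is the region's energy
        consumed = read_energy_joules() - start
        results[label] = results.get(label, 0.0) + consumed


def relative_saving(baseline_joules, tuned_joules):
    """Energy saving of a tuned run relative to a baseline run, in percent."""
    return 100.0 * (baseline_joules - tuned_joules) / baseline_joules
```

Wrapping each kernel of interest (for example the factorization or one solver iteration) in `with measure_energy(...)` gives per-region totals, which is the granularity that frequency tuning per region would need.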
The solve phase employs the Preconditioned Conjugate Gradient (PCG) algorithm, which consists of sparse matrix-vector multiplications (by the F, P, M_L, and M_D matrices), vector dot products, and AXPY operations. In each iteration, the direct solver must be applied twice, i.e., forward and backward solves are needed for the action of the pseudoinverse K^+ and for the coarse problem solution, the action of (GG^T)^-1. Multiplication by the dense Schur complement matrix adds a further operator with different computational characteristics, potentially increasing the exploitable dynamism. The poster provides results for two types of frequency tuning: (1) static tuning and (2) dynamic tuning. For static tuning, the frequency is set before execution and kept constant during the runtime. For dynamic tuning, the frequency is changed during program execution to adapt the system to the actual needs of the application. The poster shows that static tuning brings up to 11.84% energy savings compared to the default CPU settings (the highest clock rate). Dynamic tuning improves this by up to a further 2.68%. In total, the presented approach shows the potential to save up to 14.52% of energy for TFETI-based solvers; see Table 1. Further energy consumption evaluations were done with selected sparse and dense BLAS Level 1, 2, and 3 routines. For benchmarking, we used a set of matrices from the University of Florida collection [4]. We employed AXPY, Sparse Matrix-Vector, Sparse Matrix-Matrix, Dense Matrix-Vector, Dense Matrix-Matrix, and Sparse Matrix-Dense Matrix multiplication routines from the Intel Math Kernel Library (MKL) [3]. The measured characteristics illustrate the different energy consumption of the BLAS routines, as some operations are memory-bound and others are compute-bound. Based on our recommendations, one can explore dynamic frequency switching to achieve significant energy savings of up to 23%; for more details see Table 2.
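To make the kernel mix of the solve phase concrete, the following is a generic textbook PCG sketch in Python with NumPy. It is not the PERMON implementation and omits the TFETI-specific projector P and dual operator F: `apply_A` and `apply_M` are placeholder callables for the operator and the preconditioner, and the small tridiagonal test matrix with a Jacobi preconditioner is an assumption chosen only so the example runs.

```python
import numpy as np


def pcg(apply_A, b, apply_M, tol=1e-8, max_iter=200):
    """Preconditioned conjugate gradients for a symmetric positive definite operator.

    apply_A(v) returns A @ v; apply_M(r) applies the preconditioner to r.
    Returns the approximate solution and the number of iterations performed.
    """
    b = np.asarray(b, dtype=float)
    x = np.zeros_like(b)
    r = b.copy()                      # residual for the zero initial guess
    z = apply_M(r)                    # preconditioner application
    p = z.copy()
    rz = r @ z                        # dot product
    b_norm = np.linalg.norm(b)
    for k in range(max_iter):
        if np.linalg.norm(r) <= tol * b_norm:
            return x, k
        Ap = apply_A(p)               # operator application (matrix-vector product)
        alpha = rz / (p @ Ap)         # dot product
        x += alpha * p                # AXPY
        r -= alpha * Ap               # AXPY
        z = apply_M(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p     # scaled vector update
        rz = rz_new
    return x, max_iter


# Illustrative use on a small SPD tridiagonal system with a Jacobi preconditioner.
n = 20
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
diag = np.diag(A)
x, iters = pcg(lambda v: A @ v, np.ones(n), lambda r: r / diag)
print(iters, np.linalg.norm(A @ x - np.ones(n)))
```

Each iteration interleaves operator applications with dot products and AXPY updates, which is why kernels with different memory and compute behavior alternate within a single solve.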

Country
Czech Republic
Keywords

energy conservation, READEX project, power aware computing, microprocessor chips, CPU frequency, BLAS routines, Total-FETI, CPU tuning, TFETI, energy consumption, PERMON toolbox, field programmable gate arrays

  • BIP!
    Impact by BIP!
    citations
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    3
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Average
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Average
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Average