Exploring performance and power properties of modern multicore chips via simple machine models

Preprint English OPEN
Hager, Georg; Treibig, Jan; Habich, Johannes; Wellein, Gerhard
  • Identifiers: doi: 10.1002/cpe.3180
  • Subject: Computer Science - Distributed, Parallel, and Cluster Computing | Computer Science - Performance
    arxiv: Computer Science::Performance | Computer Science::Hardware Architecture | Computer Science::Operating Systems

Modern multicore chips show complex behavior with respect to performance and power. Starting with the Intel Sandy Bridge processor, it has become possible to directly measure the power dissipation of a CPU chip and correlate this data with the performance properties of ...