DTexL: Decoupled Raster Pipeline for Texture Locality

Publication: Article, Conference object; 01 Oct 2022; Publisher: IEEE; Journal: 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO); Funded by: EC | CoCoUnit

Authors: Joseph, Diya; Aragón Alcaraz, Juan Luis; Parcerisa Bundó, Joan Manuel; González Colás, Antonio María;

doi: 10.1109/micro56248.2022.00028

handle: 2117/376016


Abstract

Contemporary GPU architectures have multiple shader cores and a scheduler that distributes work (threads) among them, focusing on load balancing. These load-balancing techniques favor thread distributions that are detrimental to texture memory locality in the L1 texture caches for graphics applications. Texture memory accesses make up the majority of the traffic to the memory hierarchy in typical low-power graphics architectures. This paper improves L1 texture cache locality with a new workload scheduler, exploring various methods to group threads, assign the groups to shader cores, and reorder threads without violating the correctness of the pipeline. To overcome the resulting load imbalance, we also propose a minor modification to the GPU architecture that helps translate the improvement in cache locality into an improvement in GPU performance. We propose DTexL, which encompasses these ideas, and evaluate it on a benchmark suite of ten commercial games, obtaining a 46.8% decrease in L2 accesses, a 19.3% increase in performance, and a 6.3% decrease in total GPU energy, all with negligible overhead.
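The locality problem described in the abstract can be illustrated with a toy model. The sketch below is not the DTexL scheduler and makes several simplifying assumptions that are not from the paper: work items sit on a 2D grid, each item reads the texture block directly under it, each core has a private L1 with unlimited capacity, and only cold misses are counted. The names `l1_fills`, `interleaved`, and `grouped` are hypothetical.

```python
from collections import defaultdict


def l1_fills(width, height, num_cores, block, assign):
    """Count distinct texture blocks each core pulls into its private L1.

    Assumption: item (x, y) reads texture block (x // block, y // block),
    and each L1 is large enough that only cold misses occur.
    """
    touched = defaultdict(set)
    for y in range(height):
        for x in range(width):
            core = assign(x, y)
            touched[core].add((x // block, y // block))
    return [len(touched[c]) for c in range(num_cores)]


def interleaved(num_cores):
    """Fine-grained distribution: neighbouring items go to different cores."""
    return lambda x, y: (x + y) % num_cores


def grouped(num_cores, group, width):
    """Coarse-grained distribution: a group x group square stays on one core."""
    groups_per_row = width // group
    return lambda x, y: ((y // group) * groups_per_row + x // group) % num_cores


if __name__ == "__main__":
    # 16x16 items, 4 cores, texture blocks covering 4x4 items.
    print(l1_fills(16, 16, 4, 4, interleaved(4)))   # [16, 16, 16, 16]
    print(l1_fills(16, 16, 4, 4, grouped(4, 4, 16)))  # [4, 4, 4, 4]
```

Both distributions give every core the same 64 items in this example, yet the interleaved one makes every core fetch all 16 texture blocks (64 L1 fills in total), while the grouped one fetches each block once (16 fills). On real workloads grouping can unbalance the cores, which is the imbalance the abstract says the proposed architectural modification addresses.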

© 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. http://dx.doi.org/10.1109/MICRO56248.2022.00028

This work has been supported by the CoCoUnit ERC Advanced Grant of the EU’s Horizon 2020 program (grant No 833057), the Spanish State Research Agency (MCIN/AEI) under grant PID2020-113172RB-I00, the ICREA Academia program and the AGAUR grant 2020-FISDU-00287.

Peer Reviewed

Related Organizations

Universitat Politècnica de Catalunya
Spain
University of Murcia
Spain

Keywords

UPC subject areas::Computer science::Computer architecture, Scheduling, Texture locality, Cache memory, Low-power, GPU, Graphics, Caches, Graphics processing units

Related research (8)

Dynamic sampling rate: harnessing frame coherence in graphics applications for energy-efficient GPUs (2022)
Characterizing self-driving tasks in general-purpose architectures (2021)
MEGsim: A Novel Methodology for Efficient Simulation of Graphics Workloads in GPUs (2022)
Sliding window support for image processing in autonomous vehicles (2022)
Irregular accesses reorder unit: improving GPGPU memory coalescing for graph-based workloads (2022)
A Survey of Near-Data Processing Architectures for Neural Networks (2022)
TCOR: A Tile Cache with Optimal Replacement (2022)
DNN pruning with principal component analysis and connection importance estimation (2022)

Impact by BIP!

selected citations: 0
popularity: Average
influence: Average
impulse: Average