High Performance Unstructured SpMM Computation Using Tensor Cores

Name: High Performance Unstructured SpMM Computation Using Tensor Cores
Keywords: FOS: Computer and information sciences; Computer Science - Distributed, Parallel, and Cluster Computing; Distributed, Parallel, and Cluster Computing (cs.DC); Mathematics of computing; SpMM; Matrix Multiplication; Tensor Cores

Patrik Okanovic; Grzegorz Kwasniewski; Paolo Sylos Labini; Maciej Besta; Flavio Vella; Torsten Hoefler

arXiv.org e-Print Archive

Preprint . 2024

Data sources: arXiv.org e-Print Archive

IRIS - Institutional Research Information System of the University of Trento

Conference object . 2024

Full-Text: https://iris.unitn.it/bitstream/11572/445354/1/High_Performance_Unstructured_SpMM_Computation_Using_Tensor_Cores.pdf

Data sources: IRIS - Institutional Research Information System of the University of Trento

https://doi.org/10.1109/sc4140...

Article . 2024 . Peer-reviewed

License: STM Policy #29

Data sources: Crossref

http://dx.doi.org/10.1109/sc41...

Conference object

License: STM Policy #29

Full-Text: http://xplorestaging.ieee.org/ielx8/10793057/10793058/10793184.pdf?arnumber=10793184

Data sources: Sygma

https://dx.doi.org/10.48550/ar...

Article . 2024

License: arXiv Non-Exclusive Distribution

Data sources: Datacite

DBLP

Article

Data sources: DBLP

DBLP

Conference object

Data sources: DBLP

Research Collection

Conference object . 2024

Data sources: Research Collection

http://dx.doi.org/10.1109/sc41...

Conference object . 2024

Data sources: European Union Open Data Portal

Publication: Article, Preprint, Conference object . 17 Nov 2024

Embargo end date: 01 Jan 2024

Countries: Switzerland, Italy

Publisher: IEEE

Journal: SC24: International Conference for High Performance Computing, Networking, Storage and Analysis

Funded by: EC | PSAP, EC | DEEP-SEA

doi: 10.1109/sc41406.2024.00060 , 10.48550/arxiv.2408.11551

arXiv: 2408.11551

handle: 11572/445354 , 20.500.11850/714251

Abstract

High-performance sparse matrix-matrix multiplication (SpMM) is paramount for science and industry, as ever-increasing data sizes prohibit the use of dense data structures. Yet existing hardware, such as Tensor Cores (TCs), is ill-suited for SpMM because it imposes strict constraints on data structures that the unstructured sparsity found in many applications cannot meet. To address this, we introduce (S)parse (Ma)trix Matrix (T)ensor Core-accelerated (SMaT): a novel SpMM library that utilizes TCs for unstructured sparse matrices. Our block-sparse library leverages the low-level CUDA MMA (matrix multiply-accumulate) API, maximizing the performance offered by modern GPUs. Algorithmic optimizations such as sparse matrix permutation further improve performance by minimizing the number of non-zero blocks. The evaluation on an NVIDIA A100 shows that SMaT outperforms state-of-the-art (SotA) libraries (DASP, cuSPARSE, and Magicube) by up to 125x (2.6x on average). SMaT can be used to accelerate many workloads in scientific computing, large-model training, inference, and others.

Accepted by the 2024 International Conference on High Performance Computing, Networking, Storage and Analysis (SC'24)
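
The abstract's point that permuting a sparse matrix can reduce the number of non-zero blocks can be illustrated with a toy example. The sketch below is not SMaT's algorithm (this record does not describe the permutation method); it only counts how many fixed-size tiles contain at least one nonzero before and after a hand-picked row reordering. The function name `count_nonzero_blocks`, the 2x2 tile size, and the 4x4 pattern are all assumptions made for illustration.

```python
def count_nonzero_blocks(nonzeros, row_order, block):
    """Count block x block tiles that hold at least one nonzero.

    nonzeros  : iterable of (row, col) coordinates of nonzero entries
    row_order : list where row_order[new_pos] = original row index
    block     : tile edge length (illustrative; real Tensor Core tile shapes differ)
    """
    # Map each original row index to its position after the permutation.
    new_pos = {orig: pos for pos, orig in enumerate(row_order)}
    # A tile is identified by (tile row, tile column); a set removes repeats.
    tiles = {(new_pos[r] // block, c // block) for r, c in nonzeros}
    return len(tiles)


# Toy 4x4 pattern: rows 0 and 2 touch columns 0-1, rows 1 and 3 touch columns 2-3.
nonzeros = [(0, 0), (0, 1), (2, 0), (2, 1), (1, 2), (1, 3), (3, 2), (3, 3)]

identity = [0, 1, 2, 3]
grouped = [0, 2, 1, 3]  # rows with the same column pattern placed side by side

blocks_before = count_nonzero_blocks(nonzeros, identity, block=2)
blocks_after = count_nonzero_blocks(nonzeros, grouped, block=2)
print(blocks_before, blocks_after)
```

In this toy case the grouped ordering halves the number of tiles that a block-sparse kernel would have to process, which is the effect the abstract attributes to permutation.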

Countries

Switzerland, Italy

Related research products (4)

DASP software on GitHub (IsRelatedTo)
CUDALibrarySamples software on GitHub (IsRelatedTo)
smat software on GitHub (IsRelatedTo)
MagicCube software on GitHub (IsRelatedTo)

Impact by BIP!

Selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
Influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
