
arXiv: 2110.05722
Transformer-based neural models are used in many AI applications. Training these models is expensive, requiring large amounts of GPU resources and long running times. It is also challenging because typical inputs such as sentences have variable lengths, and the Transformer's computation patterns are more complex than those of convolutional neural networks. Existing systems either focus only on model inference or optimize only BERT-like encoder models. In this paper, we present LightSeq2, a system that accelerates training for a general family of Transformer models on GPUs. We propose a series of GPU optimization techniques tailored to the specific computation flow and memory access patterns of Transformer models. LightSeq2 supports many model architectures, including BERT (encoder-only), GPT (decoder-only), Transformer (encoder-decoder), and vision Transformer. Our experiments on a variety of models and benchmarks show that LightSeq2 is consistently faster (1.4-3.5x) than previous systems on different GPUs. In particular, it achieves a 308% training speedup over existing systems on a large public machine translation benchmark (WMT14 English-German).
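As a minimal sketch of the drop-in acceleration pattern the abstract describes, the snippet below shows a small PyTorch training step in which the Transformer layer implementation is a swappable component. This is illustrative only: the fused layer class mentioned in the comments is a hypothetical stand-in, not LightSeq2's actual API, and the stock `nn.TransformerEncoderLayer` is used so the sketch runs as written.

```python
# Illustrative sketch only: shows the "swap the layer class, keep the training
# loop" pattern that systems like LightSeq2 target. The accelerated layer class
# referenced below is hypothetical; only stock PyTorch is actually used here.
import torch
import torch.nn as nn


class TinyEncoder(nn.Module):
    """A toy encoder whose layer implementation is injected as a class."""

    def __init__(self, layer_cls, d_model=512, nhead=8, num_layers=2):
        super().__init__()
        self.layers = nn.ModuleList(
            [layer_cls(d_model=d_model, nhead=nhead, batch_first=True)
             for _ in range(num_layers)]
        )

    def forward(self, x, padding_mask=None):
        # Variable-length batches are handled via the key padding mask.
        for layer in self.layers:
            x = layer(x, src_key_padding_mask=padding_mask)
        return x


device = "cuda" if torch.cuda.is_available() else "cpu"

# Baseline: stock PyTorch encoder layers.
model = TinyEncoder(nn.TransformerEncoderLayer).to(device)
# A LightSeq2-style setup would instead pass a fused, GPU-optimized layer class
# with a compatible signature (hypothetical), leaving the optimizer, loss, and
# data pipeline below unchanged.

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
tokens = torch.randn(8, 40, 512, device=device)                  # (batch, seq_len, d_model)
pad_mask = torch.zeros(8, 40, dtype=torch.bool, device=device)   # True marks padded positions

loss = model(tokens, padding_mask=pad_mask).pow(2).mean()        # dummy objective
loss.backward()
optimizer.step()
```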
13 pages, 22 figures, accepted by SC 22
FOS: Computer and information sciences; Computation and Language (cs.CL); Mathematical Software (cs.MS)
| Indicator | Description | Value |
| --- | --- | --- |
| Selected citations | Citations derived from selected sources; an alternative to the "Influence" indicator, which reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 9 |
| Popularity | Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% |
| Influence | Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| Impulse | Reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
