UniAP: Unifying Inter- and Intra-Layer Automatic Parallelism by Mixed Integer Quadratic Programming

Keywords: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Distributed, Parallel, and Cluster Computing, Optimization and Control (math.OC), FOS: Mathematics, Distributed, Parallel, and Cluster Computing (cs.DC), Mathematics - Optimization and Control, Machine Learning (cs.LG)

Hao Lin; Ke Wu; Jie Li; Jun Li; Wu-Jun Li


arXiv.org e-Print Archive (Preprint, 2023; Data sources: arXiv.org e-Print Archive)

https://doi.org/10.1109/cvpr52... (Article, 2025, Peer-reviewed; License: STM Policy #29; Data sources: Crossref)

https://dx.doi.org/10.48550/ar... (Article, 2023; License: arXiv Non-Exclusive Distribution; Data sources: Datacite)

DBLP (Conference object; Data sources: DBLP)

DBLP (Article; Data sources: DBLP)


Publication: Article, Preprint, Conference object
Date: 10 Jun 2025
Embargo end date: 01 Jan 2023
Publisher: IEEE
Journal: 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)


doi: 10.1109/cvpr52734.2025.01951, 10.48550/arxiv.2307.16375

arXiv: 2307.16375


Abstract

Distributed learning is commonly used for training deep learning models, especially large models. In distributed learning, manual parallelism (MP) methods demand considerable human effort and have limited flexibility. Hence, automatic parallelism (AP) methods have recently been proposed for automating the parallel strategy optimization process. Existing AP methods suffer from sub-optimal solutions because they do not jointly optimize the two categories of parallel strategies (i.e., inter-layer parallelism and intra-layer parallelism). In this paper, we propose a novel AP method called UniAP, which unifies inter- and intra-layer automatic parallelism by mixed integer quadratic programming. To the best of our knowledge, UniAP is the first parallel method that can jointly optimize the two categories of parallel strategies to find an optimal solution. Experimental results show that UniAP outperforms state-of-the-art methods by up to 3.80$\times$ in throughput and reduces strategy optimization time by up to 107$\times$ across five Transformer-based models.

17 pages, 10 figures, CVPR 2025
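
The abstract's central claim is that optimizing inter-layer and intra-layer parallelism jointly can beat optimizing them separately. The following is a minimal sketch of that idea on a toy problem, not UniAP's actual formulation: all cost numbers, the two strategy names (`dp`, `tp`), the two-stage pipeline, and the function names are hypothetical, and plain brute-force enumeration stands in for a mixed integer quadratic programming solver. The resharding cost depends on the strategies of two adjacent layers at once, which is the kind of pairwise term that becomes a product of binary variables (a quadratic term) when such a problem is written as a program.

```python
from itertools import product

# Hypothetical compute cost of each layer under each intra-layer strategy.
COMPUTE = [
    {"dp": 4.0, "tp": 3.0},
    {"dp": 2.0, "tp": 5.0},
    {"dp": 6.0, "tp": 3.0},
    {"dp": 3.0, "tp": 3.5},
]
# Hypothetical cost of converting tensor layouts between adjacent layers
# in the same stage. It depends on a PAIR of choices (the quadratic part).
RESHARD = {("dp", "dp"): 0.0, ("tp", "tp"): 0.0,
           ("dp", "tp"): 1.5, ("tp", "dp"): 1.5}
# Hypothetical cost of sending activations across the pipeline stage boundary.
CROSS_STAGE = 1.0


def plan_cost(cut, strategies):
    """Bottleneck stage time; layers [0, cut) are stage 0, the rest stage 1."""
    n = len(COMPUTE)
    stage_time = [0.0, 0.0]
    for i, s in enumerate(strategies):
        stage = 0 if i < cut else 1
        stage_time[stage] += COMPUTE[i][s]
        if i + 1 < n:
            next_stage = 0 if i + 1 < cut else 1
            if next_stage == stage:
                stage_time[stage] += RESHARD[(s, strategies[i + 1])]
            else:
                stage_time[stage] += CROSS_STAGE
    return max(stage_time)


def best_joint_plan():
    """Search the stage cut and the per-layer strategies together."""
    n = len(COMPUTE)
    return min(
        (plan_cost(cut, strat), cut, strat)
        for cut in range(1, n)
        for strat in product(("dp", "tp"), repeat=n)
    )


def best_sequential_plan():
    """Baseline: pick each layer's cheapest strategy first, then the cut."""
    n = len(COMPUTE)
    strat = tuple(min(costs, key=costs.get) for costs in COMPUTE)
    return min((plan_cost(cut, strat), cut, strat) for cut in range(1, n))


if __name__ == "__main__":
    print("joint:     ", best_joint_plan())
    print("sequential:", best_sequential_plan())
```

On these made-up numbers the joint search finds a lower bottleneck time than the sequential baseline, because the baseline ignores resharding costs when it fixes the per-layer strategies. Enumeration grows exponentially with the number of layers, which is why a real system would hand the problem to a solver instead.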

Related Organizations

Nanjing University
China (People's Republic of)


Related research products (1):

Megatron-DeepSpeed software on GitHub (relation: IsRelatedTo)

Impact by BIP!

selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

