Dual-Balancing for Multi-Task Learning

Publication type: Article, Preprint

Publication date: 01 Jan 2025

Embargo end date: 01 Jan 2023

Publisher: Elsevier BV

Journal: Neural Networks, volume 195, page 108317 (ISSN: 0893-6080)

Authors: Baijiong Lin; Weisen Jiang; Feiyang Ye; Yu Zhang; Pengguang Chen; Ying-Cong Chen; Shu Liu; +2 Authors

doi: 10.2139/ssrn.5131389, 10.1016/j.neunet.2025.108317, 10.48550/arxiv.2308.12029

pmid: 41289618

arXiv: 2308.12029

Abstract

Multi-task learning aims to learn multiple related tasks simultaneously and has achieved great success in various fields. However, the disparity in loss and gradient scales among tasks often leads to performance compromises, and balancing the tasks remains a significant challenge. In this paper, we propose Dual-Balancing Multi-Task Learning (DB-MTL) to achieve task balancing from both the loss and gradient perspectives. Specifically, DB-MTL achieves loss-scale balancing by applying a logarithm transformation to each task loss, and achieves gradient-magnitude balancing by normalizing all task gradients to comparable magnitudes using the maximum gradient norm. Extensive experiments on a number of benchmark datasets demonstrate that DB-MTL consistently performs better than the current state-of-the-art.
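
The two balancing steps named in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function name `dual_balance`, the stabilizer `eps`, and the choice to sum the rescaled gradients are assumptions made for illustration, and the abstract does not say how gradients are aggregated or whether any smoothing is applied. The sketch assumes strictly positive task losses and takes the per-task gradients of the raw losses as given.

```python
import numpy as np


def dual_balance(losses, grads, eps=1e-8):
    """Combine per-task gradients with log-loss and max-norm balancing.

    losses: shape (K,), strictly positive task losses (assumption).
    grads:  shape (K, D), gradient of each raw task loss with respect to
            the shared parameters.
    Returns a single update direction of shape (D,).
    """
    losses = np.asarray(losses, dtype=float)
    grads = np.asarray(grads, dtype=float)

    # Loss-scale balancing: the gradient of log(l_k) is grad(l_k) / l_k,
    # so the log transformation divides each gradient by its loss value.
    log_grads = grads / (losses[:, None] + eps)

    # Gradient-magnitude balancing: rescale every task gradient so that
    # its norm matches the largest norm among the tasks.
    norms = np.linalg.norm(log_grads, axis=1)
    max_norm = norms.max()
    scaled = log_grads * (max_norm / (norms + eps))[:, None]

    # Aggregation by summation is an assumption of this sketch.
    return scaled.sum(axis=0)


# Two tasks whose losses differ by four orders of magnitude.
update = dual_balance([100.0, 0.01], [[50.0, 0.0], [0.0, 0.002]])
print(update)
```

In this toy input the raw gradients have norms 50 and 0.002, so a plain sum would be dominated by the first task. After both steps each task contributes a gradient of norm close to 0.5, and the combined direction is approximately [0.5, 0.5].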

Accepted by Neural Networks

Related Organizations

Chinese University of Hong Kong, China (People's Republic of)
Agency for Science, Technology and Research, Singapore
Institute of High Performance Computing, Singapore
Hong Kong University of Science and Technology (HKUST), Hong Kong
The Hong Kong University of Science and Technology (HKUST) / Department of Physics, Hong Kong

Keywords

Machine Learning, FOS: Computer and information sciences, Artificial Intelligence (cs.AI), Artificial Intelligence, Humans, Learning, Neural Networks, Computer, Algorithms, Machine Learning (cs.LG)

Impact by BIP!

selected citations: 3. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Green

Related to Research communities

UArctic