Debiased Fine-Tuning for Vision-Language Models by Prompt Regularization

Publication type: Article, Preprint, Conference object
Date: 26 Jun 2023
Embargo end date: 01 Jan 2023
Publisher: Association for the Advancement of Artificial Intelligence (AAAI)
Journal: Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pages 3834-3842 (ISSN: 2159-5399, eISSN: 2374-3468)

Authors: Beier Zhu; Yulei Niu; Saeil Lee; Minhoe Hur; Hanwang Zhang

doi: 10.1609/aaai.v37i3.25496, 10.48550/arxiv.2301.12429

arXiv: 2301.12429


Abstract

We present a new paradigm for fine-tuning large-scale vision-language pre-trained models on downstream tasks, dubbed Prompt Regularization (ProReg). Unlike traditional fine-tuning, which easily overfits to the downstream task data, ProReg uses the prediction obtained by prompting the pretrained model to regularize the fine-tuning. The motivation is that, by prompting the large model with “a photo of a [CLASS]”, the fill-in answer depends only on the pretrained encyclopedic knowledge and is independent of the task data distribution, which is usually biased. Specifically, given a training sample prediction during fine-tuning, we first calculate its Kullback-Leibler loss against the prompt prediction and its cross-entropy loss against the ground-truth label, and then combine them with a proposed sample-wise adaptive trade-off weight, which automatically adjusts the transfer between the pretrained and downstream domains. On various out-of-distribution benchmarks, we show the consistently strong performance of ProReg compared with conventional fine-tuning, zero-shot prompt, prompt tuning, and other state-of-the-art methods.
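
The loss combination described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The abstract does not state the direction of the Kullback-Leibler term or the rule for computing the sample-wise weight, so the sketch assumes KL(prompt || fine-tuned) and takes the weight as a caller-supplied value in [0, 1]. All function and parameter names (`softmax`, `kl_divergence`, `proreg_style_loss`, `weight`) are hypothetical.

```python
import math

EPS = 1e-12  # guards against log(0) when a probability underflows


def softmax(logits):
    """Convert a list of logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Cross-entropy between predicted probabilities and a hard label."""
    return -math.log(max(probs[label], EPS))


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / max(qi, EPS)) for pi, qi in zip(p, q) if pi > 0)


def proreg_style_loss(finetuned_logits, prompt_logits, label, weight):
    """Weighted sum of a ground-truth term and a prompt-regularization term.

    ASSUMPTIONS (not taken from the source):
      * the KL term is KL(prompt prediction || fine-tuned prediction);
      * `weight` in [0, 1] is supplied per sample by the caller; the paper's
        adaptive rule for choosing it is not reproduced here.
    """
    probs = softmax(finetuned_logits)
    prompt_probs = softmax(prompt_logits)
    ce = cross_entropy(probs, label)            # fit the downstream label
    kl = kl_divergence(prompt_probs, probs)     # stay close to the zero-shot prompt
    return weight * ce + (1.0 - weight) * kl


# Two samples with different (hypothetical) per-sample weights.
batch = [
    # (fine-tuned logits, prompt logits, label, weight)
    ([2.0, 0.5, -1.0], [1.5, 1.0, -0.5], 0, 0.7),
    ([0.1, 0.2, 0.3], [3.0, -1.0, -1.0], 2, 0.3),
]
mean_loss = sum(proreg_style_loss(*sample) for sample in batch) / len(batch)
print(f"mean loss: {mean_loss:.4f}")
```

With `weight = 1` the sketch reduces to ordinary cross-entropy fine-tuning, and with `weight = 0` it only pulls the fine-tuned prediction toward the zero-shot prompt prediction; intermediate values trade off the two.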

Related Organizations

Hyundai Motor Group (South Korea)
Korea (Republic of)
Columbia University
Nanyang Technological University
Singapore

Keywords

FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV)


Impact by BIP!

- Selected citations: 15. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Popularity: Top 10%. This indicator reflects the "current" impact or attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- Influence: Top 10%. This indicator reflects the overall impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.