Dynamic Data-Free Knowledge Distillation by Easy-to-Hard Learning Strategy

Publication: Article, Preprint
Date: 01 Jan 2023
Embargo end date: 01 Jan 2022
Publisher: Elsevier BV
Journal: Information Sciences, volume 642, page 119202 (ISSN: 0020-0255)

Authors: Jingru Li; Sheng Zhou; Liangcheng Li; Haishuai Wang; Jiajun Bu; Zhi Yu

DOI: 10.2139/ssrn.4361656, 10.1016/j.ins.2023.119202, 10.48550/arxiv.2208.13648

arXiv: 2208.13648

Abstract

Data-free knowledge distillation (DFKD) is a widely used strategy for knowledge distillation (KD) when the training data is not available. It trains a lightweight student model with the aid of a large pretrained teacher model without any access to the training data. However, existing DFKD methods suffer from an inadequate and unstable training process, as they do not adjust the generation target dynamically based on the status of the student model during learning. To address this limitation, we propose a novel DFKD method called CuDFKD. It teaches the student with a dynamic strategy that gradually generates easy-to-hard pseudo samples, mirroring how humans learn. In addition, CuDFKD adapts the generation target dynamically according to the status of the student model. Moreover, we provide a theoretical analysis based on the majorization-minimization (MM) algorithm and explain the convergence of CuDFKD. To measure the robustness and fidelity of DFKD methods, we propose two additional metrics, and experiments show that CuDFKD has performance comparable to state-of-the-art (SOTA) DFKD methods on all datasets. Experiments also show that CuDFKD has the fastest convergence and the best robustness among the compared SOTA DFKD methods.
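
The easy-to-hard idea described in the abstract can be illustrated with a generic self-paced weighting sketch. This is not CuDFKD's implementation: the difficulty measure (KL divergence between teacher and student predictions), the hard 0/1 weighting, the linear threshold schedule, and every function name below are assumptions chosen only for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with q > 0 everywhere."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def difficulty(teacher_logits, student_logits):
    # Assumption: a sample is "hard" when the student disagrees with the teacher.
    return kl_divergence(softmax(teacher_logits), softmax(student_logits))

def self_paced_weights(difficulties, threshold):
    # Hard self-paced weighting: keep samples at or below the current threshold.
    return [1.0 if d <= threshold else 0.0 for d in difficulties]

def threshold_schedule(step, total_steps, start, end):
    # Hypothetical linear schedule: admit harder samples as training proceeds.
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + frac * (end - start)

# Toy batch of (teacher_logits, student_logits) pairs for pseudo samples.
batch = [
    ([4.0, 0.0, 0.0], [3.5, 0.2, 0.1]),  # student nearly agrees with teacher
    ([4.0, 0.0, 0.0], [1.0, 1.0, 1.0]),  # student is uncertain
    ([4.0, 0.0, 0.0], [0.0, 4.0, 0.0]),  # student confidently disagrees
]
scores = [difficulty(t, s) for t, s in batch]

# Early in training only the easiest sample is used; later all are admitted.
early = self_paced_weights(scores, threshold_schedule(0, 10, 0.1, 5.0))
late = self_paced_weights(scores, threshold_schedule(10, 10, 0.1, 5.0))
print(early, late)
```

In this toy setting the threshold starts low, so only the sample on which the student already matches the teacher contributes to the loss, and the harder samples are admitted as the threshold grows. A method that adapts to the student's status would recompute the difficulty scores from the current student at every step rather than fixing them in advance.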

Accepted by Information Sciences, Proof version provided

Related Organizations

Zhejiang University
China (People's Republic of)
Zhejiang Ocean University
China (People's Republic of)

Keywords

FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition

Impact by BIP!

selected citations: 18. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Fields of Science

natural sciences