Enhancing Model Agnostic Meta-Learning via Gradient Similarity Loss

Jae-Ho Tak; Byung-Woo Hong

Found an issue? Give us feedback

Electronicsarrow_drop_down

Electronics

Article . 2024 . Peer-reviewed

License: CC BY

Data sources: Crossref

Enhancing Model Agnostic Meta-Learning via Gradient Similarity Loss

descriptionPublicationkeyboard_double_arrow_right Article 29 Jan 2024 English Publisher:MDPI AGJournal:Electronics, volume 13, page 535 (eissn: 2079-9292,

Copyright policy )

Authors: Jae-Ho Tak; Byung-Woo Hong;

doi: 10.3390/electronics13030535

Enhancing Model Agnostic Meta-Learning via Gradient Similarity Loss

- Summary
- Metrics

Abstract

Artificial intelligence (AI) technology has advanced significantly, now capable of performing tasks previously believed to be exclusive to skilled humans. However, AI models, in contrast to humans who can develop skills with relatively less data, often require substantial amounts of data to emulate human cognitive abilities in specific areas. In situations where adequate pre-training data is not available, meta-learning becomes a crucial method for enhancing generalization. The Model Agnostic Meta-Learning (MAML) algorithm, which employs second-order derivative calculations to fine-tune initial parameters for better starting points, plays a pivotal role in this area. However, the computational demand of this method can be challenging for modern models with a large number of parameters. The concept of the Approximate Hessian Effect is introduced in this context, examining the effectiveness of second-order derivatives in identifying initial parameters conducive to high generalization performance. The study suggests the use of cosine similarity and squared error (L2 loss) as a loss function within the Approximate Hessian Effect framework to modify gradient weights, aiming for more generalizable model parameters. Additionally, an algorithm that relies on first-order calculations is presented, designed to achieve performance levels comparable to MAML. This approach was tested and compared with traditional MAML methods using both the MiniImagenet dataset and a modified MNIST dataset. The results were analyzed to evaluate its efficiency. Compared to previous studies that achieved good performance using only the first derivative, this approach is more efficient because it does not require iterative loops to converge on additional loss functions. Additionally, there is potential for further performance enhancement through hyperparameter tuning.

Related Organizations

Chung Ang University
Chung-Ang University
Chung-Ang University
Korea (Republic of)

Impact byBIP!

	selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	2
	popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.	Top 10%
	influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	Average
	impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.	Average

Found an issue? Give us feedback

2

Top 10%

Average

gold

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering