Mathematics
Article . 2024 . Peer-reviewed
License: CC BY
Data sources: Crossref, DOAJ

Lottery Rank-Pruning Adaptation Parameter Efficient Fine-Tuning

Authors: Juhyeong Kim; Gyunyeop Kim; Sangwoo Kang


Abstract

Recent studies on parameter-efficient fine-tuning (PEFT) have introduced effective and efficient methods for fine-tuning large language models (LLMs) on downstream tasks with far fewer parameters than full fine-tuning requires. Low-rank adaptation (LoRA) reduces the parameter count to 0.03% of that of full fine-tuning while maintaining satisfactory performance, training only two low-rank matrices per adapted weight. However, limitations remain because so few task-specific parameters take part in training. To mitigate these issues, we propose Lottery Rank-Pruning Adaptation (LoRPA), which draws on the Lottery Ticket Hypothesis to prune the less significant parameters, based on their magnitudes, after an initial training phase. LoRPA first trains with a relatively large rank and then applies pruning, so that subsequent training improves performance while using fewer parameters. We conducted experiments comparing LoRPA with LoRA baselines, including a setting with a relatively large rank. On the GLUE benchmark with RoBERTa, LoRPA achieves comparable results at the base scale and outperforms LoRA across various rank sizes by 0.04% to 0.74% at the large scale on multiple tasks. On generative summarization with BART-base on the CNN/DailyMail and XSum datasets, LoRPA outperforms LoRA at the standard rank size, as well as other PEFT methods, on most metrics. These results validate the efficacy of lottery pruning for LoRA on downstream natural-language understanding and generation tasks.
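The abstract describes a two-phase procedure: train LoRA adapters at a relatively large rank, score each rank component by its magnitude, prune the least significant components in the spirit of the Lottery Ticket Hypothesis, and continue training the smaller adapter. The following is a minimal PyTorch sketch of that idea; the LoRALinear class, the per-rank score ||B[:, i]|| * ||A[i, :]||, and the prune_to_rank helper are illustrative assumptions, not the authors' exact formulation.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer with a trainable low-rank update B @ A."""

    def __init__(self, base: nn.Linear, rank: int, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze the pretrained weights
            p.requires_grad = False
        # Two low-rank matrices, the only trainable parameters.
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.alpha = alpha

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scaling = self.alpha / self.A.shape[0]  # alpha / current rank
        return self.base(x) + (x @ self.A.T @ self.B.T) * scaling

    @torch.no_grad()
    def prune_to_rank(self, target_rank: int) -> None:
        # Hypothetical magnitude criterion: score rank component i by
        # ||B[:, i]|| * ||A[i, :]|| and keep the top `target_rank`.
        scores = self.B.norm(dim=0) * self.A.norm(dim=1)
        keep = scores.topk(target_rank).indices.sort().values
        self.A = nn.Parameter(self.A[keep].clone())
        self.B = nn.Parameter(self.B[:, keep].clone())

# Usage: train at a large rank, prune, then continue training.
layer = LoRALinear(nn.Linear(768, 768), rank=16)
# ... phase 1: fine-tune layer.A and layer.B on the downstream task ...
layer.prune_to_rank(4)
# ... phase 2: rebuild the optimizer (the Parameters were replaced)
#     and continue fine-tuning with a quarter of the adapter weights ...

One design note on this sketch: pruning whole rank components, rather than individual weights, keeps A and B dense, so the smaller adapter trains at full speed without sparse kernels.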

Keywords

parameter-efficient fine-tuning, QA1-939, deep learning, large language model, transfer learning, low-rank adaptation, Mathematics

Impact indicators by BIP!
  • Selected citations: 1 (derived from selected sources; an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network, diachronically)
  • Popularity: Average (reflects the "current" impact/attention, the "hype", of an article in the research community at large, based on the underlying citation network)
  • Influence: Average (reflects the overall/total impact of an article in the research community at large, based on the underlying citation network, diachronically)
  • Impulse: Average (reflects the initial momentum of an article directly after its publication, based on the underlying citation network)
Open Access route: gold