
Injecting external knowledge can improve the performance of pre-trained language models (PLMs) on various downstream NLP tasks. However, deploying a new knowledge injection method or a new knowledge base for downstream tasks typically requires massive retraining. In this work, we are the first to study how to improve the flexibility and efficiency of knowledge injection by reusing existing downstream models. To this end, we explore a new paradigm, plug-and-play knowledge injection, where knowledge bases are injected into frozen existing downstream models via a knowledge plugin. Correspondingly, we propose a plug-and-play injection method, map-tuning, which trains a mapping of knowledge embeddings to enrich model inputs with the mapped embeddings while keeping model parameters frozen. Experimental results on three knowledge-driven NLP tasks show that existing injection methods are not suitable for the new paradigm, whereas map-tuning effectively improves the performance of downstream models. Moreover, we show that a frozen downstream model can be well adapted to different domains with different mapping networks of domain knowledge. Our code and models are available at https://github.com/THUNLP/Knowledge-Plugin.
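To make the mechanism concrete, below is a minimal PyTorch sketch of the general idea behind map-tuning: a small trainable mapping network projects frozen knowledge-base entity embeddings into the PLM's input-embedding space, and the mapped embeddings are added to the input sequence of a frozen downstream model. This is an illustration, not the authors' exact implementation; the linear mapping, the dimensions, the simple concatenation strategy, and the names `MapTuningPlugin` and `augment_inputs` are all assumptions here (the paper and repository specify the actual placement of mapped embeddings and the training objective).

```python
import torch
import torch.nn as nn

class MapTuningPlugin(nn.Module):
    """Illustrative map-tuning plugin: a trainable affine map from the
    knowledge-embedding space to the PLM input-embedding space.
    The mapping network is the ONLY trainable component; the PLM,
    the downstream head, and the KB embeddings all stay frozen."""

    def __init__(self, kb_dim: int, plm_dim: int):
        super().__init__()
        self.mapping = nn.Linear(kb_dim, plm_dim)

    def forward(self, entity_embeds: torch.Tensor) -> torch.Tensor:
        # (batch, num_entities, kb_dim) -> (batch, num_entities, plm_dim)
        return self.mapping(entity_embeds)


def augment_inputs(token_embeds: torch.Tensor,
                   mapped_entity_embeds: torch.Tensor) -> torch.Tensor:
    """Hypothetical input-enrichment step: append mapped knowledge
    embeddings to the token-embedding sequence before the frozen PLM."""
    return torch.cat([token_embeds, mapped_entity_embeds], dim=1)


# Usage sketch with made-up dimensions: 100-dim KB embeddings, 768-dim PLM.
plugin = MapTuningPlugin(kb_dim=100, plm_dim=768)
token_embeds = torch.randn(1, 12, 768)   # frozen PLM token embeddings
entity_embeds = torch.randn(1, 2, 100)   # frozen pretrained KB embeddings
enriched = augment_inputs(token_embeds, plugin(entity_embeds))
print(enriched.shape)  # torch.Size([1, 14, 768])
```

Because only the mapping network is trained, swapping in a different knowledge base (or a domain-specific mapping network) does not require retraining the downstream model, which is what makes the injection plug-and-play.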
ACL 2023
Keywords: Computation and Language (cs.CL), Natural Language Processing, Computational Linguistics, Pretrained Models, Language Modeling, Artificial Intelligence, Computer Science
| Indicator | Description | Value |
| --- | --- | --- |
| Selected citations | Citations derived from selected sources; an alternative to the "Influence" indicator, which reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 6 |
| Popularity | Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% |
| Influence | Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| Impulse | Reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
