Large, pre-trained language models (PLMs) such as BERT and GPT have drastically changed the Natural Language Processing (NLP) field. Approaches leveraging PLMs have achieved state-of-the-art performance on numerous NLP tasks. The key idea is to learn a generic, latent representation of language from a generic task once, then share it across disparate NLP tasks. Language modeling serves as that generic task: raw text is abundant, and it supplies a self-supervised training signal at scale. This article presents the fundamental concepts of PLM architectures and a comprehensive view of the shift to PLM-driven NLP techniques. It surveys work applying the pre-training then fine-tuning, prompting, and text generation approaches. In addition, it discusses PLM limitations and suggests directions for future research.
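As a concrete illustration of two of the usage patterns the abstract contrasts, here is a minimal sketch (not taken from the article itself) using the Hugging Face `transformers` library; the checkpoints `bert-base-uncased` and `gpt2` and the toy sentiment example are assumptions chosen for brevity, not choices made by the surveyed work.

```python
# Illustrative sketch: two ways to reuse a pre-trained language model (PLM).
# Checkpoints and the sentiment example are placeholder assumptions.
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    pipeline,
)

# 1) Pre-training then fine-tuning: load a pre-trained encoder (BERT) and
#    attach a fresh classification head; the whole model would then be
#    trained further on the downstream task's labeled data.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
inputs = tokenizer("A great movie.", return_tensors="pt")
logits = model(**inputs).logits  # task-head output, shaped by fine-tuning

# 2) Prompting / text generation: keep a GPT-style decoder frozen and cast
#    the task as language modeling by writing the input into a prompt.
generator = pipeline("text-generation", model="gpt2")
prompt = "Review: A great movie. Sentiment:"
print(generator(prompt, max_new_tokens=3)[0]["generated_text"])
```

In the first pattern, the new head's weights (and usually the encoder's) are updated on labeled task data; in the second, no weights change and the task is solved purely by conditioning the frozen model on a prompt.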
FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Computation and Language, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Computation and Language (cs.CL), Machine Learning (cs.LG)
| Indicator | Description | Value |
|---|---|---|
| Citations | An alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 639 |
| Popularity | Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 0.1% |
| Influence | Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Top 1% |
| Impulse | Reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 0.01% |