Publication type: Article / Preprint (2019). Embargo end date: 01 Jan 2019

LAMOL: LAnguage MOdeling for Lifelong Language Learning

Sun, Fan-Keng; Ho, Cheng-Hao; Lee, Hung-Yi;
Open Access
Published: 07 Sep 2019
Publisher: arXiv

Most research on lifelong learning applies to images or games, but not language. We present LAMOL, a simple yet effective method for lifelong language learning (LLL) based on language modeling. LAMOL replays pseudo-samples of previous tasks while requiring no extra memory or model capacity. Specifically, LAMOL is a language model that simultaneously learns to solve the tasks and generate training samples. When the model is trained for a new task, it generates pseudo-samples of previous tasks for training alongside data for the new task. The results show that LAMOL prevents catastrophic forgetting without any sign of intransigence and can perform five very different language tasks sequentially with only one model. Overall, LAMOL outperforms previous methods by a considerable margin and is only 2-3% worse than multitasking, which is usually considered the LLL upper bound. The source code is available at
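The core idea in the abstract — replaying model-generated pseudo-samples of previous tasks alongside real data for the new task, with no stored examples or extra capacity — can be sketched as a data-mixing step. This is a minimal illustration of that replay scheme, not the paper's implementation; the function names, the toy generator, and the replay ratio are all illustrative assumptions.

```python
import random

def mix_with_pseudo_samples(new_task_data, generate_fn, replay_ratio=0.2, seed=0):
    """Build a training set for a new task by appending model-generated
    pseudo-samples of previous tasks (generative replay, no stored real data)."""
    rng = random.Random(seed)
    n_pseudo = int(len(new_task_data) * replay_ratio)  # pseudo-samples per real batch
    pseudo = [generate_fn(rng) for _ in range(n_pseudo)]
    combined = list(new_task_data) + pseudo
    rng.shuffle(combined)  # interleave old-task pseudo-samples with new-task data
    return combined

# Toy stand-in for the language model's generation head: in LAMOL the same
# LM that solves tasks also emits (context, answer) pairs resembling
# previous-task training examples. Here we just sample canned pairs.
def toy_generator(rng):
    previous_task_like = [("[GEN] 2+2=", "4"), ("[GEN] capital of France?", "Paris")]
    return rng.choice(previous_task_like)

new_data = [("translate: hello", "bonjour")] * 10
train_set = mix_with_pseudo_samples(new_data, toy_generator, replay_ratio=0.2)
# 10 real new-task examples plus 2 generated pseudo-samples, shuffled together
```

In the actual method, training on this mixed set serves two objectives at once: the model learns to answer the new task and to keep generating plausible samples of all tasks seen so far, which is what prevents forgetting on the next task switch.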


Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); FOS: Computer and information sciences

32 references (partial list)

Aljundi, R.; Babiloni, F.; Elhoseiny, M.; Rohrbach, M.; and Tuytelaars, T. 2018. Memory aware synapses: Learning what (not) to forget. In Proceedings of the European Conference on Computer Vision (ECCV), 139-154.

McCann, B.; Keskar, N. S.; Xiong, C.; and Socher, R. 2018. The natural language decathlon: Multitask learning as question answering. arXiv preprint arXiv:1806.08730.

Chaudhry, A.; Ranzato, M.; Rohrbach, M.; and Elhoseiny, M. 2018. Efficient lifelong learning with A-GEM. arXiv preprint arXiv:1812.00420.

Chen, Z., and Liu, B. 2016. Lifelong Machine Learning. Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan & Claypool Publishers.

Chen, Z.; Ma, N.; and Liu, B. 2015. Lifelong learning for sentiment classification. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers).

de Masson d'Autume, C.; Ruder, S.; Kong, L.; and Yogatama, D. 2019. Episodic memory in lifelong language learning. arXiv preprint arXiv:1906.01076.

Elhoseiny, M.; Babiloni, F.; Aljundi, R.; Rohrbach, M.; Paluri, M.; and Tuytelaars, T. 2018. Exploring the challenges towards lifelong fact learning. In Asian Conference on Computer Vision, 66-84. Springer.

Fernando, C.; Banarse, D.; Blundell, C.; Zwols, Y.; Ha, D.; Rusu, A. A.; Pritzel, A.; and Wierstra, D. 2017. PathNet: Evolution channels gradient descent in super neural networks. arXiv preprint arXiv:1701.08734.

Shin, H.; Lee, J. K.; Kim, J.; and Kim, J. 2017. Continual learning with deep generative replay. arXiv preprint arXiv:1705.08690.

He, L.; Lee, K.; Lewis, M.; and Zettlemoyer, L. 2017. Deep semantic role labeling: What works and what's next. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 473-483.