Gaussian Processes for Natural Language Processing

Publication type: Article, Conference object
Date: 01 Jan 2014
Publisher: Association for Computational Linguistics (ACL)
Journal: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: Tutorials

Authors: Trevor Cohn; Daniel Preotiuc-Pietro; Neil D. Lawrence

doi: 10.3115/v1/p14-6001

Abstract

Gaussian Processes (GPs) are a powerful modelling framework incorporating kernels and Bayesian inference, and are recognised as state-of-the-art for many machine learning tasks. Despite this, GPs have seen few applications in natural language processing (notwithstanding several recent papers by the authors). We argue that the GP framework offers many benefits over commonly used machine learning frameworks, such as linear models (logistic regression, least squares regression) and support vector machines. Moreover, GPs are extremely flexible and can be incorporated into larger graphical models, forming an important additional tool for probabilistic inference. Notably, GPs are one of the few models which support analytic Bayesian inference, avoiding the many approximation errors that plague approximate inference techniques in common use for Bayesian models (e.g. MCMC, variational Bayes). GPs accurately model not just the underlying task, but also the uncertainty in the predictions, such that uncertainty can be propagated through pipelines of probabilistic components. Overall, GPs provide an elegant, flexible and simple means of probabilistic inference and are well overdue for consideration by the NLP community. This tutorial will focus primarily on regression and classification, both fundamental techniques in widespread use in the NLP community. Within NLP, linear models are near ubiquitous, because they provide good results for many tasks, support efficient inference (including dynamic programming in structured prediction) and support simple parameter interpretation. However, linear models are inherently limited in the types of relationships between variables they can model. Often

Related Organizations

University of Pennsylvania, United States
University of Sheffield, United Kingdom
University of Melbourne, Australia

Impact by BIP!

- Selected citations: 8. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- Influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.