Publication: Article, Preprint, Conference object

Publication date: 01 Jan 2019

Embargo end date: 01 Jan 2019

Publisher: Association for Computational Linguistics (ACL)

Journal: Proceedings of the 12th International Conference on Natural Language Generation

Funded by: EC | ELITR

Authors: Niehues, J.; Pham, N.-Q.

doi: 10.18653/v1/w19-8671, 10.5445/ir/1000122830, 10.48550/arxiv.1910.01859

arXiv: http://arxiv.org/abs/1910.01859

Modeling Confidence in Sequence-to-Sequence Models

Abstract

Recently, significant improvements have been achieved in various natural language processing tasks using neural sequence-to-sequence models. While aiming for the best generation quality is important, ultimately it is also necessary to develop models that can assess the quality of their output. In this work, we propose to use the similarity between training and test conditions as a measure for models' confidence. We investigate methods solely using the similarity as well as methods combining it with the posterior probability. While traditionally only target tokens are annotated with confidence measures, we also investigate methods to annotate source tokens with confidence. By learning an internal alignment model, we can significantly improve confidence projection over using state-of-the-art external alignment tools. We evaluate the proposed methods on downstream confidence estimation for machine translation (MT). We show improvements on segment-level confidence estimation as well as on confidence estimation for source tokens. In addition, we show that the same methods can also be applied to other tasks using sequence-to-sequence models. On the automatic speech recognition (ASR) task, we are able to find 60% of the errors by looking at 20% of the data.
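The ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "similarity between training and test conditions" is measured as the mean cosine similarity between a test-time vector and its k most similar training vectors, and that similarity is combined with the posterior probability by linear interpolation. Both choices, and all function names and parameters below, are hypothetical. The last function shows how a claim such as "60% of the errors by looking at 20% of the data" can be computed: rank items by confidence, inspect the least confident share, and count the errors found there.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_confidence(test_vec, train_vecs, k=2):
    """Mean cosine similarity to the k most similar training vectors.

    Assumption: brute-force search over all training vectors; a large
    training set would need an approximate nearest-neighbour index instead.
    """
    sims = sorted((cosine(test_vec, t) for t in train_vecs), reverse=True)
    top = sims[:k]
    return sum(top) / len(top)


def combined_confidence(similarity, posterior, weight=0.5):
    """Hypothetical linear interpolation of similarity and posterior probability."""
    return weight * similarity + (1.0 - weight) * posterior


def error_recall_at_budget(confidences, is_error, budget=0.2):
    """Share of all errors found when inspecting the `budget` fraction
    of items with the lowest confidence."""
    n_inspect = max(1, round(budget * len(confidences)))
    order = sorted(range(len(confidences)), key=lambda i: confidences[i])
    found = sum(is_error[i] for i in order[:n_inspect])
    total = sum(is_error)
    return found / total if total else 0.0


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    train = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    sim = similarity_confidence([1.0, 0.0], train, k=2)
    print(combined_confidence(sim, posterior=0.6))

    confidences = [0.9, 0.2, 0.8, 0.1, 0.7, 0.95, 0.6, 0.85, 0.3, 0.75]
    is_error = [0, 1, 0, 1, 0, 0, 1, 0, 0, 0]
    print(error_recall_at_budget(confidences, is_error, budget=0.2))
```

In this toy run, two of the three errors sit among the two least confident items, so inspecting 20% of the data recovers two thirds of the errors. The numbers are invented and say nothing about the paper's results.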

8 pages; INLG 2019

Related Organizations

Karlsruhe Institute of Technology, Germany
Maastricht University, Netherlands

Keywords

ddc:004, FOS: Computer and information sciences, Computer Science - Computation and Language, DATA processing & computer science, Computation and Language (cs.CL), info:eu-repo/classification/ddc/004, 004

Related research products (2)

annoy software on GitHub (IsRelatedTo)
NMTGMinor software on GitHub (IsRelatedTo)

Impact by BIP!

citations: 3. An alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Average. Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Average. Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Average. Reflects the initial momentum of an article directly after its publication, based on the underlying citation network.