Publication: Article, Preprint. 01 Dec 2020. Embargo end date: 01 Jan 2020. English. Publisher: MIT Press. Journal: Transactions of the Association for Computational Linguistics, volume 8, pages 795-809 (eissn: 2307-387X)

Authors: Tim Vieira; Clara Meister; Ryan Cotterell

doi: 10.1162/tacl_a_00346, 10.3929/ethz-b-000455678, 10.48550/arxiv.2007.03909

arXiv: http://arxiv.org/abs/2007.03909

Best-First Beam Search

Abstract

Decoding for many NLP tasks requires an effective heuristic algorithm for approximating exact search because the problem of searching the full output space is often intractable, or impractical in many settings. The default algorithm for this job is beam search, a pruned version of breadth-first search. Quite surprisingly, beam search often returns better results than exact inference due to beneficial search bias for NLP tasks. In this work, we show that the standard implementation of beam search can be made up to 10x faster in practice. Our method assumes that the scoring function is monotonic in the sequence length, which allows us to safely prune hypotheses that cannot be in the final set of hypotheses early on. We devise effective monotonic approximations to popular nonmonotonic scoring functions, including length normalization and mutual information decoding. Lastly, we propose a memory-reduced variant of best-first beam search, which has a similar beneficial search bias in terms of downstream performance, but runs in a fraction of the time.
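The pruning idea in the abstract can be illustrated with a minimal Python sketch. It is a simplified reading of the abstract, not the authors' implementation: hypotheses sit in a priority queue ordered by score, at most k hypotheses are popped per sequence length, and the first complete hypothesis popped is returned. The function names (`best_first_beam_search`, `toy_next_log_probs`), the toy probability table, and the `max_len` cutoff are all assumptions made for illustration. The scoring function here is a sum of log-probabilities, which can only decrease as a sequence grows, so it satisfies the monotonicity assumption.

```python
import heapq
import math
from collections import defaultdict

EOS = "</s>"  # hypothetical end-of-sequence token


def toy_next_log_probs(prefix):
    """Toy model (illustrative only): next-token log-probabilities for a prefix."""
    table = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"a": 0.6, EOS: 0.4},
        ("b",): {"b": 0.05, EOS: 0.95},
    }
    probs = table.get(prefix, {EOS: 1.0})  # any other prefix must end
    return {tok: math.log(p) for tok, p in probs.items()}


def best_first_beam_search(next_log_probs, k, max_len):
    """Return (score, sequence) of the first complete hypothesis popped, or None."""
    # heapq is a min-heap, so scores are stored negated: best score pops first.
    queue = [(0.0, ())]
    pops = defaultdict(int)  # how many hypotheses of each length were popped
    while queue:
        neg_score, seq = heapq.heappop(queue)
        # A length that already yielded k hypotheses is full: beam search
        # would have pruned this hypothesis, so skip it.
        if pops[len(seq)] >= k:
            continue
        pops[len(seq)] += 1
        if seq and seq[-1] == EOS:
            # Scores never increase with length, so nothing left in the
            # queue can overtake this finished hypothesis.
            return -neg_score, seq
        if len(seq) >= max_len:
            continue  # do not expand past the length cutoff
        for token, log_p in next_log_probs(seq).items():
            heapq.heappush(queue, (neg_score - log_p, seq + (token,)))
    return None
```

On this toy model, calling `best_first_beam_search(toy_next_log_probs, k=2, max_len=5)` returns the sequence `("b", "</s>")` with probability 0.38, while `k=1` behaves like greedy search and returns `("a", "a", "</s>")`. With `k=2` the search stops after popping four hypotheses and never expands the prefix `("a", "a")`, which is the kind of saved work the abstract refers to.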

Related Organizations

Johns Hopkins University, United States
University of Cambridge, United Kingdom
ETH Zurich, Switzerland

Keywords

FOS: Computer and information sciences, Computer Science - Computation and Language, Computer Science - Data Structures and Algorithms, Computational linguistics. Natural language processing, Data Structures and Algorithms (cs.DS), P98-98.5, Computation and Language (cs.CL)

2 Research products

sgnmt software on GitHub (IsRelatedTo)
fairseq software on GitHub (IsRelatedTo)

Impact by BIP!

citations (an alternative to the "influence" indicator, also reflecting the overall impact of an article in the research community, based on the underlying citation network): 32
popularity (the current attention an article receives in the research community, based on the underlying citation network): Top 10%
influence (the overall impact of an article in the research community over time, based on the underlying citation network): Top 10%
impulse (the initial momentum of an article directly after its publication, based on the underlying citation network): Top 10%