ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference

Name: ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference
Keywords: Natural Language Understanding, Language Models, Question-Answering Systems, Natural Language Processing

Association for Computational Linguistics 2022; Bahri, Dara; Chen, Tao; Gupta, Jai; Hui, Kai; Lu, Jing; Ma, Ji; Metzler, Donald; Nogueira dos Santos, Cicero; Qin, Zhen; Tay, Yi; Zhuang, Honglei

Audiovisual . 2022

Data sources: Datacite

Other research product: Audiovisual, 01 Jan 2022. Embargo end date: 11 May 2022. Publisher: Underline Science Inc.

doi: 10.48448/bc8p-pv55

Abstract

State-of-the-art neural models typically encode document-query pairs using cross-attention for re-ranking. To this end, models generally utilize an encoder-only (like BERT) paradigm or an encoder-decoder (like T5) approach. These paradigms, however, are not without flaws: running the model on all query-document pairs at inference time incurs a significant computational cost. This paper proposes a new training and inference paradigm for re-ranking. We propose to fine-tune a pretrained encoder-decoder model in the form of document-to-query generation. Subsequently, we show that this encoder-decoder architecture can be decomposed into a decoder-only language model during inference. This results in significant inference-time speedups, since the decoder-only architecture only needs to learn to interpret static encoder embeddings during inference. Our experiments show that this new paradigm achieves results that are comparable to the more expensive cross-attention ranking approaches while being up to 6.8X faster. We believe this work paves the way for more efficient neural rankers that leverage large pretrained models.
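The split between offline document encoding and online query scoring described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: as a loudly labeled assumption, a smoothed unigram distribution stands in for the static encoder embeddings, and a sum of token log-probabilities stands in for the decoder-only language model. The function names (encode_document, query_log_likelihood, rerank) and the sample documents are hypothetical.

```python
import math
from collections import Counter


def encode_document(text, vocab, alpha=1.0):
    """Offline step: build a static, cacheable document representation.

    Toy stand-in (assumption) for encoder output: an add-alpha smoothed
    unigram distribution over a fixed vocabulary.
    """
    counts = Counter(text.lower().split())
    total = sum(counts.values()) + alpha * len(vocab)
    return {tok: (counts[tok] + alpha) / total for tok in vocab}


def query_log_likelihood(query, doc_memory):
    """Online step: score a query using only the cached representation.

    Toy stand-in (assumption) for the decoder-only language model:
    the sum of log P(token | document) over the query tokens.
    """
    return sum(math.log(doc_memory[tok]) for tok in query.lower().split())


def rerank(query, cache):
    """Order document ids by descending query log-likelihood."""
    scores = {doc_id: query_log_likelihood(query, mem) for doc_id, mem in cache.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical sample data for illustration only.
docs = {
    "d1": "neural rankers use cross attention over query document pairs",
    "d2": "the recipe needs flour sugar and butter",
}
query = "neural query rankers"

# The vocabulary covers all document and query tokens so every lookup succeeds.
vocab = sorted(
    {t for text in docs.values() for t in text.lower().split()}
    | set(query.lower().split())
)

# Documents are encoded once, ahead of time; queries never touch the raw text.
cache = {doc_id: encode_document(text, vocab) for doc_id, text in docs.items()}
ranking = rerank(query, cache)
print(ranking)
```

The point of the sketch is the cost structure: encode_document runs once per document regardless of how many queries arrive, and each query only pays for the lightweight scoring step against the cache.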

Related Organizations

DeepMind (United Kingdom)
United Kingdom
Google (Canada)
Canada

Related research: 7 research products (similar documents)

Compare Encoder-Decoder, Encoder-Only, and Decoder-Only Architectures for Text Generation on Low-Resource Datasets (2021)
Source Coding With Encoder Side Information (2004)
LightSeq2: Accelerated Training for Transformer-Based Models on GPUs (2022)
Source Coding With Distortion Side Information (2008)
Denoising based Sequence-to-Sequence Pre-training for Text Generation (2019)
Findings: ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference (2022)
ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference (2022)

Impact by BIP!

Selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
Influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Related to Research communities

Digital Humanities and Cultural Heritage
