Powered by OpenAIRE graph
Found an issue? Give us feedback
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/ INRIA2arrow_drop_down
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
INRIA2
Conference object . 2021
Data sources: INRIA2
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
HAL-Rennes 1
Conference object . 2021
Data sources: HAL-Rennes 1
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
versions View all 3 versions
addClaim

This Research product is the result of merged Research products in OpenAIRE.

You have already added 0 works in your ORCID record related to the merged Research product.

Generating artificial texts for query expansion

Authors: Claveau, Vincent;

Generating artificial texts for query expansion

Abstract

Un moyen d'améliorer les performances de la recherche de documents consiste à étendre la requête de l'utilisateur. Plusieurs approches ont été proposées dans la littérature, et certaines d'entre elles obtiennent des résultats très compétitifs. Dans cet article, nous explorons l'utilisation de la génération de texte pour étendre automatiquement les requêtes. Nous nous appuyons sur un modèle génératif neuronal bien connu, GPT-2, qui est fourni avec des modèles pré-entraînés pour l'anglais mais qui peut également être affiné sur des corpus spécifiques. À travers différentes expériences, nous montrons que la génération de texte est un moyen très efficace d'améliorer les performances d'un système de RI, avec une marge importante (+10% de gains MAP), et qu'il surpasse des approches état-de-l'art reposant également sur l'expansion des requêtes (LM+RM3). Cette approche conceptuellement simple peut être facilement mise en oeuvre sur n'importe quel système de RI grâce à la disponibilité du code et des modèles GPT.

A well-known way to improve the performance of document retrieval is to expand the user's query. Several approaches have been proposed in the literature, and some of them are considered as yielding state-of-the-art results. In this paper, we explore the use of text generation to automatically expand the queries. We rely on a well-known neural generative model, GPT-2, that comes with pre-trained models for English but can also be fine-tuned on specific corpora. Through different experiments, we show that text generation is a very effective way to improve the performance of an IR system, with a large margin (+10% MAP gains), and that it outperforms strong baselines also relying on query expansion (LM+RM3). This conceptually simple approach can easily be implemented on any IR system thanks to the availability of GPT code and models.

Country
France
Keywords

[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI], GPT2, Text generation, modèle de langue génératif, document retrieval, recherche de documents, [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL], query expansion, expansion de requête, augmentation de données, [INFO.INFO-IR] Computer Science [cs]/Information Retrieval [cs.IR], generative language model, Génération de textes, data augmentation

  • BIP!
    Impact byBIP!
    selected citations
    These citations are derived from selected sources.
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    0
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Average
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Average
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Average
Powered by OpenAIRE graph
Found an issue? Give us feedback
selected citations
These citations are derived from selected sources.
This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Citations provided by BIP!
popularity
This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
BIP!Popularity provided by BIP!
influence
This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Influence provided by BIP!
impulse
This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
BIP!Impulse provided by BIP!
0
Average
Average
Average
Green
Related to Research communities