Exploring the space of information retrieval term scoring functions

Publication: Article, 01 Mar 2017, English. Publisher: Elsevier BV. Journal: Information Processing & Management, volume 53, pages 454-472 (ISSN: 0306-4573)

Authors: Goswami, Parantapa; Gaussier, Eric; Amini, Massih-Reza

doi: 10.1016/j.ipm.2016.11.003

Abstract

Highlights: A novel automated discovery approach to systematically explore the IR function space. Empirical analysis of heuristic IR constraints in light of the new discovery approach. Experimental validation of the effectiveness of the discovered IR scoring functions.

In this paper we are interested in finding good IR scoring functions by exploring the space of all possible IR functions. Earlier approaches explored only a small part of this space, with no control over which part was explored. We aim here at a more systematic exploration: first, we define a grammar that generates possible IR functions up to a certain length (the length being related to the number of elements, that is, variables and operations, involved in a function); second, we rely on IR heuristic constraints to prune the search space and filter out bad scoring functions. The resulting candidate scoring functions are tested on various standard IR collections, and several simple but promising functions are identified. We perform extensive experiments to compare these functions with classical IR models and observe that they yield either better or comparable results. We also compare the performance of functions that satisfy the IR heuristic constraints with those that do not; the former clearly outperform the latter, which supports the validity of IR heuristic constraints for designing new IR models.
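The two-step idea described in the abstract (generate candidate functions from a grammar up to a given length, then prune with heuristic constraints) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the variable set (`tf`, `df`, `dl`), the operator set, the sample points, and the three numeric monotonicity checks that stand in for IR heuristic constraints.

```python
import math

# Hypothetical grammar (an assumption, not the paper's):
# tf = term frequency, df = document frequency, dl = document length.
VARIABLES = ("tf", "df", "dl")
UNARY = {"log1p": math.log1p, "sqrt": math.sqrt}
BINARY = {
    "+": lambda a, b: a + b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}


def generate(length):
    """Yield expression trees with exactly `length` elements.

    An element is a variable or an operation, so `tf` has length 1
    and ("/", "tf", "dl") has length 3.
    """
    if length == 1:
        yield from VARIABLES
        return
    for op in UNARY:
        for sub in generate(length - 1):
            yield (op, sub)
    for left_len in range(1, length - 1):
        right_len = length - 1 - left_len
        for op in BINARY:
            for left in generate(left_len):
                for right in generate(right_len):
                    yield (op, left, right)


def evaluate(expr, env):
    """Evaluate an expression tree against a dict of variable values."""
    if isinstance(expr, str):
        return env[expr]
    if len(expr) == 2:
        return UNARY[expr[0]](evaluate(expr[1], env))
    return BINARY[expr[0]](evaluate(expr[1], env), evaluate(expr[2], env))


# Illustrative sample points; all values are positive, so every
# operator above stays within its domain.
SAMPLES = [
    {"tf": tf, "df": df, "dl": dl}
    for tf in (1, 3, 10)
    for df in (5, 50, 500)
    for dl in (50, 200, 1000)
]


def monotone(expr, var, sign, step=1.0):
    """Check that raising `var` moves the score strictly in direction `sign`."""
    for env in SAMPLES:
        bumped = dict(env, **{var: env[var] + step})
        delta = evaluate(expr, bumped) - evaluate(expr, env)
        if sign * delta <= 0:
            return False
    return True


def satisfies_constraints(expr):
    """Simplified stand-ins for heuristic constraints (assumed here):
    score rises with tf, falls with df, and falls with dl."""
    return (
        monotone(expr, "tf", +1)
        and monotone(expr, "df", -1)
        and monotone(expr, "dl", -1)
    )


def search(max_length):
    """Enumerate all functions up to `max_length` and keep the valid ones."""
    return [
        expr
        for length in range(1, max_length + 1)
        for expr in generate(length)
        if satisfies_constraints(expr)
    ]
```

Calling `search(5)` enumerates every expression of up to five elements and returns those that pass all three checks, for instance `("/", "tf", ("*", "df", "dl"))`. Because the checks are strict, a candidate must involve all three variables to survive, which is why nothing shorter than five elements is kept in this toy setting. A real study would then rank the surviving candidates by retrieval effectiveness on test collections, a step this sketch leaves out.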

Related Organizations

Grenoble Alpes University
France
French National Centre for Scientific Research
France
Centre national de la recherche scientifique
France

Keywords

ACM: H.: Information Systems/H.3: INFORMATION STORAGE AND RETRIEVAL/H.3.3: Information Search and Retrieval, [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG], [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR], ACM: H.: Information Systems/H.3: INFORMATION STORAGE AND RETRIEVAL/H.3.3: Information Search and Retrieval/H.3.3.4: Retrieval models

Impact by BIP!

	selected citations: 13
	popularity: Top 10%
	influence: Top 10%
	impulse: Top 10%

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering
