Powered by OpenAIRE graph
Dataset
Data sources: ZENODO

iSearch++: An Augmented State-of-the-Art Information Retrieval Test Collection for Integrated Academic Search

Authors: Schaer, Philipp; Breuer, Timo; Haak, Fabian; Engelmann, Björn

Abstract

The iSearch test collection remains a unique resource for evaluating information access systems such as academic search engines. Built following the Cranfield evaluation paradigm, it combines arXiv full texts and metadata with detailed descriptions of users’ information needs across different expertise levels. Although the collection is now over 15 years old and relatively small by modern standards (~160,000 documents), its structured relevance assessments make it an ideal foundation for evaluating contemporary systems. The iSearch++ project aims to modernize this dataset by improving full-text extraction (e.g., table extraction), re-evaluating relevance using LLM-as-a-Judge methods, integrating the collection into the ir_datasets framework, and aligning it with FAIR principles. Within the NFDIxCS context, iSearch++ demonstrates how legacy research datasets can be updated to meet current technical and accessibility standards while preserving their original research value. The software is publicly available from the reference landing page on GitHub. This work is funded by the German Research Foundation (DFG) as part of the NFDIxCS consortium (Grant number: 501930651).
