Powered by OpenAIRE graph
Preprint
Data sources: ZENODO

Scaling Down: Multi-Hop Information Retrieval in Resource-Constrained Environments

Authors: Staroverov, Nikolay

Abstract

Multi-Hop Question Answering is a foundational task in Natural Language Processing, but current approaches rely on Large Language Models, which we characterize as computationally prohibitive for local deployment. In this paper, we propose a data-centric approach to Multi-Hop Retrieval built on a 5.38M-parameter model that runs with 195.5 ms of CPU latency and occupies only 20.5 MB of disk space. On standardized numerical benchmarks, it achieves an MRR of 0.61 and a Recall@1 of 0.56, outperforming the non-iterative sBERT baseline by factors of 1.72× and 2.75×, respectively. Through the Broken Chain experiment, we demonstrate that such small models can implicitly learn sequential dependencies, a capability more often reported for larger-scale architectures. We also use Explicit Positional Encoding (EPE) tokens to ground the model’s output in long-sequence environments. Furthermore, we characterize EPE as a structural regularizer and demonstrate its ability to mitigate the Sequence Tax in small recurrent architectures.
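
For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how MRR and Recall@1 are commonly computed. It is not the paper's evaluation code. It assumes one relevant passage per query, and the function names and the toy ranks are hypothetical illustrations, not data from the paper.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(ranks: Sequence[Optional[int]]) -> float:
    """Average of 1/rank over all queries.

    Each entry is the 1-based rank at which the relevant passage was
    retrieved, or None if it was not retrieved at all (contributes 0).
    """
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


def recall_at_1(ranks: Sequence[Optional[int]]) -> float:
    """Fraction of queries whose relevant passage is ranked first.

    Assumes a single relevant passage per query, so Recall@1 reduces
    to a top-1 hit rate.
    """
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1 for r in ranks if r == 1) / len(ranks)


# Hypothetical ranks for five queries (illustrative only).
toy_ranks = [1, 2, 1, None, 4]
print(f"MRR      = {mean_reciprocal_rank(toy_ranks):.2f}")
print(f"Recall@1 = {recall_at_1(toy_ranks):.2f}")
```

Under these definitions, Recall@1 counts only exact top-1 hits, while MRR also gives partial credit to near misses. This is why a model's MRR is never lower than its Recall@1.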
