Powered by OpenAIRE graph
Found an issue? Give us feedback
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/ BOA - Bicocca Open A...arrow_drop_down
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
https://doi.org/10.1145/366419...
Article . 2024 . Peer-reviewed
License: CC BY
Data sources: Crossref
versions View all 4 versions
addClaim

This Research product is the result of merged Research products in OpenAIRE.

You have already added 0 works in your ORCID record related to the merged Research product.

Normalised Precision at Fixed Recall for Evaluating TAR

Authors: Wojciech Kusa; Georgios Peikos; Moritz Staudinger; Aldo Lipani; Allan Hanbury;

Normalised Precision at Fixed Recall for Evaluating TAR

Abstract

A popular approach to High-Recall Information Retrieval (HRIR) is Technology-Assisted Review (TAR), which uses information retrieval and machine learning techniques to aid the review of large document collections. TAR systems are commonly used in legal eDiscovery and medical systematic literature reviews. Successful TAR systems are able to find the majority of relevant documents using the least number of manual assessments. Previous work typically evaluated TAR models retrospectively, assuming that the system achieves a specific, fixed Recall level first and then measuring model quality (for instance, work saved at r% Recall).This paper presents an analysis of one of such measures: Precision at r% Recall (P@r%). We show that minimum Precision at r% scores depends on the dataset, and therefore, this measure should not be used for evaluation across topics or datasets. We propose its min-max normalised version (nP@r%), and show that it is equal to a product of TNR and Precision scores. Our analysis shows that nP@r% is least correlated with the percentage of relevant documents in the dataset and can be used to focus on additional aspects of the TAR tasks that are not captured with current measures. Finally, we introduce a variation of nP@r%, that is a geometric mean of TNR and Precision, preserving the properties of nP@r% and having a lower coefficient of variation.

Country
Italy
Keywords

citation screening; evaluation; high-recall retrieval; normalised precision; precision at recall; systematic reviews; tar;

  • BIP!
    Impact byBIP!
    citations
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    1
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Average
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Average
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Average
Powered by OpenAIRE graph
Found an issue? Give us feedback
citations
This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Citations provided by BIP!
popularity
This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
BIP!Popularity provided by BIP!
influence
This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Influence provided by BIP!
impulse
This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
BIP!Impulse provided by BIP!
1
Average
Average
Average
Green
hybrid
Funded by