Preprint
Data sources: ZENODO
TAMR+: Trust-Aware Multi-Signal Document Retrieval with Graph-Based Compliance Scoring and Gap Attribution for Regulatory AI Systems

Author: Kumar, Harish


Abstract

Retrieval-Augmented Generation (RAG) systems are increasingly deployed for regulatory compliance tasks, yet they lack mechanisms for scoring the trustworthiness of their outputs or explaining score deficits, both of which are required by the EU AI Act (Regulation (EU) 2024/1689). We ask: can deterministic, formula-based compliance scoring provide actionable gap attribution that opaque ML-based evaluation cannot? We present TAMR+ (Trust-Aware Multi-Signal Document Retrieval), a three-stage pipeline that combines (i) a zero-LLM document manifest selector using five deterministic signals; (ii) a multi-phase retrieval pipeline in which ≥60% of the retrieval score derives from structural signals (knowledge graph alignment, causal density) rather than vector similarity; and (iii) TRACE, a five-dimension compliance scoring framework mapped to specific EU AI Act articles via deterministic formulas. Our key contribution is a five-category gap attribution taxonomy that decomposes every score deficit into actionable categories, transforming evaluation from diagnosis to prescription. On a new cross-domain benchmark suite of 250 regulatory questions across four domains, TAMR+ achieves a mean TRACE score of 0.680 (3-hop), a 76.6% improvement over vector-only RAG (0.385, p < 0.001). Systematic ablation confirms that each pipeline component contributes significantly: removing any single component degrades performance by 6–27%. We release the benchmarks and TRACE scoring specification under Apache 2.0 to enable independent validation.
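To make the deterministic-scoring idea concrete, the sketch below shows one way a multi-signal retrieval score with a ≥60% structural weighting could be combined with a formula-based deficit decomposition. The signal names, weights, and functions here are illustrative assumptions for exposition, not the paper's actual TAMR+ or TRACE specification.

```python
# Hypothetical sketch of deterministic multi-signal scoring with gap
# attribution. Signal names and weights are illustrative assumptions,
# not the TAMR+/TRACE specification from the paper.

# Structural signals carry >= 60% of the total weight, per the abstract.
WEIGHTS = {
    "kg_alignment": 0.35,      # structural: knowledge graph alignment
    "causal_density": 0.25,    # structural: causal density
    "vector_similarity": 0.40, # semantic: embedding similarity
}

def retrieval_score(signals: dict) -> float:
    """Deterministic weighted combination of per-document signals in [0, 1]."""
    return sum(w * signals.get(name, 0.0) for name, w in WEIGHTS.items())

def attribute_gaps(signals: dict) -> dict:
    """Decompose the score deficit (1 - score) into per-signal contributions.

    Because the score is a linear formula, the deficit attributes exactly:
    the per-signal deficits sum to 1 - retrieval_score(signals).
    """
    return {name: w * (1.0 - signals.get(name, 0.0))
            for name, w in WEIGHTS.items()}

doc = {"kg_alignment": 0.9, "causal_density": 0.5, "vector_similarity": 0.8}
score = retrieval_score(doc)   # 0.35*0.9 + 0.25*0.5 + 0.40*0.8 = 0.76
gaps = attribute_gaps(doc)     # largest deficit: causal_density (0.125)
```

The point of the linear form is that every point of deficit is traceable to a named signal, which is what turns an evaluation score into a prescription ("improve causal coverage") rather than an opaque number.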
