Preprint
Data sources: ZENODO
TAMR+: Trust-Aware Multi-Signal Document Retrieval with Graph-Based Compliance Scoring and Gap Attribution for Regulatory AI Systems

Author: Kumar, Harish


Abstract

Retrieval-Augmented Generation (RAG) systems are increasingly deployed for regulatory compliance tasks, yet they lack mechanisms for scoring the trustworthiness of their outputs or explaining score deficits, both of which are required by the EU AI Act (Regulation (EU) 2024/1689). We ask: can deterministic, formula-based compliance scoring provide actionable gap attribution that opaque ML-based evaluation cannot? We present TAMR+ (Trust-Aware Multi-Signal Document Retrieval), a three-stage pipeline that combines (i) a zero-LLM document manifest selector using five deterministic signals; (ii) a multi-phase retrieval pipeline in which ≥60% of the retrieval score derives from structural signals (knowledge graph alignment, causal density) rather than vector similarity; and (iii) TRACE, a five-dimension compliance scoring framework mapped to specific EU AI Act articles via deterministic formulas. Our key contribution is a five-category gap attribution taxonomy that decomposes every score deficit into actionable categories, transforming evaluation from diagnosis to prescription. On a new cross-domain benchmark suite of 250 regulatory questions across four domains, TAMR+ achieves a mean TRACE score of 0.680 (3-hop), a 76.6% improvement over vector-only RAG (0.385, p < 0.001). Systematic ablation confirms that each pipeline component contributes significantly: removing any single component degrades performance by 6–27%. We release the benchmarks and TRACE scoring specification under Apache 2.0 to enable independent validation.
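To make the deterministic-scoring idea concrete, the sketch below shows one way a multi-signal retrieval score with a ≥60% structural weighting could be combined with a formula-based deficit decomposition. The signal names, weights, and functions here are illustrative assumptions for exposition, not the paper's actual TAMR+ or TRACE specification.

```python
# Hypothetical sketch of deterministic multi-signal scoring with gap
# attribution. Signal names and weights are illustrative assumptions,
# not the TAMR+/TRACE specification from the paper.

# Structural signals carry >= 60% of the total weight, per the abstract.
WEIGHTS = {
    "kg_alignment": 0.35,      # structural: knowledge graph alignment
    "causal_density": 0.25,    # structural: causal density
    "vector_similarity": 0.40, # semantic: embedding similarity
}

def retrieval_score(signals: dict) -> float:
    """Deterministic weighted combination of per-document signals in [0, 1]."""
    return sum(w * signals.get(name, 0.0) for name, w in WEIGHTS.items())

def attribute_gaps(signals: dict) -> dict:
    """Decompose the score deficit (1 - score) into per-signal contributions.

    Because the score is a linear formula, the deficit attributes exactly:
    the per-signal deficits sum to 1 - retrieval_score(signals).
    """
    return {name: w * (1.0 - signals.get(name, 0.0))
            for name, w in WEIGHTS.items()}

doc = {"kg_alignment": 0.9, "causal_density": 0.5, "vector_similarity": 0.8}
score = retrieval_score(doc)   # 0.35*0.9 + 0.25*0.5 + 0.40*0.8 = 0.76
gaps = attribute_gaps(doc)     # largest deficit: causal_density (0.125)
```

The point of the linear form is that every point of deficit is traceable to a named signal, which is what turns an evaluation score into a prescription ("improve causal coverage") rather than an opaque number.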
