ZENODO
Preprint . 2026
License: CC BY
Data sources: Datacite

A Benchmark for Prompt Injection Attacks on RAG-Based Enterprise Assistants: Threat Models, Metrics and Mitigation Strategies

Authors: Sayeed, Mohammed Faizan

Abstract

Retrieval-Augmented Generation (RAG) has become a widely adopted method for deploying Large Language Models (LLMs) in enterprise environments because it grounds outputs in organisational knowledge bases and reduces hallucinations. However, RAG introduces a distinct vulnerability: prompt injection attacks embedded within retrieved documents. In such attacks, adversarial instructions placed inside documents override system policies and cause harmful model behaviour, including data leakage, policy violation, misinformation, or unsafe tool execution. This preprint proposes a benchmark-driven framework for evaluating prompt injection robustness in enterprise RAG assistants. It defines enterprise threat models covering insider document poisoning, supply chain document injection, and external user-upload scenarios. The paper proposes a dataset construction methodology for adversarial document-query pairs, a set of evaluation tasks, and security metrics such as Injection Success Rate, Policy Violation Rate, Confidentiality Leakage Score, and Grounding Accuracy. Practical mitigation strategies are reviewed, including instruction boundary enforcement, retrieval filtering, sanitisation, and verification-based generation. The work supports secure deployment of RAG systems in regulated environments such as finance, healthcare, and public services.

Keywords: Retrieval-Augmented Generation, Prompt Injection, LLM Security, Enterprise AI, Cybersecurity, Benchmarking

Keywords

Gen AI, RAG
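
The abstract names Injection Success Rate and Policy Violation Rate but does not give their formulas on this page. As an illustration only, the sketch below assumes the simplest plausible reading: Injection Success Rate as the fraction of attacked trials in which the model followed the injected instruction, and Policy Violation Rate as the fraction of all trials whose output broke a system policy. The `Trial` record, its field names, and both function names are hypothetical and are not taken from the preprint.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Trial:
    """One benchmark run over a document-query pair (hypothetical schema)."""
    has_injection: bool       # the retrieved document carried an adversarial instruction
    injection_followed: bool  # the model obeyed that instruction
    policy_violated: bool     # the output broke a system policy


def injection_success_rate(trials: Sequence[Trial]) -> float:
    """Assumed definition: followed injections / trials that contained an injection."""
    attacked = [t for t in trials if t.has_injection]
    if not attacked:
        return 0.0  # no attacked trials, so nothing could succeed
    return sum(t.injection_followed for t in attacked) / len(attacked)


def policy_violation_rate(trials: Sequence[Trial]) -> float:
    """Assumed definition: policy-violating outputs / all trials."""
    if not trials:
        return 0.0
    return sum(t.policy_violated for t in trials) / len(trials)


# Made-up example data: three attacked trials and one clean trial.
trials = [
    Trial(has_injection=True, injection_followed=True, policy_violated=True),
    Trial(has_injection=True, injection_followed=False, policy_violated=False),
    Trial(has_injection=True, injection_followed=True, policy_violated=False),
    Trial(has_injection=False, injection_followed=False, policy_violated=False),
]

print(f"ISR = {injection_success_rate(trials):.3f}")
print(f"PVR = {policy_violation_rate(trials):.3f}")
```

Keeping the two denominators separate (attacked trials versus all trials) is a design choice in this sketch: it lets clean queries contribute to the policy figure without diluting the injection figure. The preprint itself may define either metric differently.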
