ZENODO
Preprint . 2025
License: CC BY
Data sources: ZENODO, Datacite

Source-Grounding Does Not Prevent Semantic Governance Failures: Evidence Across Multiple RAG Architectures

Authors: Evans, Jennifer

Abstract

Prior research showed that architecture is the key to LLM hallucinations, with two missing primitives, semantic prioritization and semantic revocation, playing a major role (Evans, Two Missing Primitives, 2025). It also demonstrated (Evans, NotebookLM, 2025) that frontier language models exhibit systematic hallucinations when required to maintain strict semantic dominance, that is, globally constrained interpretations that conflict with local context. These failures arise not from knowledge gaps but from the architectural absence of governance primitives: the ability to prioritize one interpretation over its competitors and to revoke that authority when context changes. Models with access to all correct meanings still hallucinated 100% of the time under constraint and recovered immediately when interpretation switching was permitted.

This finding raises a critical question: do Retrieval-Augmented Generation (RAG) architectures, which constrain outputs to verified sources, provide the missing semantic governance layer? RAG vendors claim that source-grounding prevents hallucinations. We tested this claim empirically across three independent implementations (Google NotebookLM, Anthropic Claude Projects, and Perplexity), using the same semantic governance diagnostics as the earlier testing that established the two missing primitives.

Finding 1: Source access does not prevent governance failures. All three systems exhibited 100% hallucination rates under strict semantic dominance despite having correct source information. All three achieved 100% accuracy under revocable semantic dominance, which shows that they possessed the correct meanings but lacked governance control.

Finding 2: Systems explicitly confirm that semantic interpretation is not source-constrained. When queried directly, all three stated that they use training data rather than retrieved sources for meaning resolution. Perplexity, tested with RAG disabled and with RAG enabled (20+ authoritative sources retrieved and cited), produced identical hallucination patterns in both conditions.

Finding 3: Citation does not equal semantic constraint. Perplexity cited sources defining “riverbank” while simultaneously stating “bank means financial institution” and generating implausible scenarios (hikers sitting on bank buildings, canoes pulled onto financial institutions).

Source-grounding constrains retrieval but does not introduce semantic governance primitives. RAG architectures fail to address the governance layer where these hallucinations occur. The vendor claims tested are empirically false. Enterprise deployment strategies predicated on source-grounding as a reliability solution require reassessment. Our research is grounded in documenting user experience with LLMs, so wherever possible we work with prompt windows, but we strongly encourage enterprise replication, falsification, and other testing of these findings.
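
For readers attempting the replication encouraged above, the following is a minimal sketch of how per-condition hallucination rates could be tallied from manually labeled responses. It is not the author's tooling: the `Trial` record, the `hallucination_rates` helper, the `system_a` label, and the example records are all hypothetical and chosen only for illustration. The only assumption taken from the abstract is that each response is gathered under either strict or revocable semantic dominance and then labeled as hallucinated or not.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    system: str         # hypothetical label for the system under test
    condition: str      # "strict" or "revocable" semantic dominance
    hallucinated: bool  # label assigned by a human reviewer


def hallucination_rates(trials):
    """Return {(system, condition): fraction of trials labeled hallucinated}."""
    counts = defaultdict(lambda: [0, 0])  # key -> [hallucinated, total]
    for t in trials:
        key = (t.system, t.condition)
        counts[key][0] += int(t.hallucinated)
        counts[key][1] += 1
    return {key: h / n for key, (h, n) in counts.items()}


# Placeholder records for illustration only; these are not data from the study.
example = [
    Trial("system_a", "strict", True),
    Trial("system_a", "strict", True),
    Trial("system_a", "revocable", False),
    Trial("system_a", "revocable", False),
]

rates = hallucination_rates(example)
for (system, condition), rate in sorted(rates.items()):
    print(f"{system} / {condition}: {rate:.0%} hallucinated")
```

Keeping the labeling step manual and separate from the tally is a deliberate choice in this sketch, since deciding whether a response counts as a hallucination is a judgment the abstract does not reduce to an automatic rule.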

Keywords

Hallucinations, Frontier models, Claude Projects, Perplexity, AI conversation phenomenology, RAG, Enterprise AI, Semantic revocation, NotebookLM, Semantic authority, AI, Generative AI, AI safety, Semantic dominance, LLMs, Missing primitives, AI policy
