A Two-State Decoding Model for Hallucination-Resistant Language Generation: Context-Anchored Generation via Semantic Drift Control

Hallucinations in large language models (LLMs) arise primarily from uncontrolled semantic drift during autoregressive token generation: the model’s probability distribution gradually diverges from the prompt’s intended semantic frame, producing fluent but ungrounded continuations. Existing mitigation strategies largely operate outside decoding—via retrieval-augmented generation (RAG), reinforcement learning from human feedback (RLHF), or post-generation filtering—leaving the core inference-time process unconstrained. This paper proposes Context-Anchored Generation (CAG), a lightweight, model-agnostic decoding governance layer that inserts a two-state control system between the model’s raw token probability distribution and final token selection. CAG maintains a persistent semantic frame (anchor) initialized from the prompt, and governs generation via two modes: Constraint Mode, which enforces semantic proximity via cosine similarity filtering; and Expansion Mode, which permits controlled divergence upon pivot detection. Transitions are governed by a drift coefficient δ_t and accumulated drift window D_t, both derived from embedding-space similarity to the anchor frame. CAG operates purely at decoding time, adds only per-token similarity computation (near-linear overhead over the candidate set), requires no model retraining, and mitigates not only hallucinations but secondary drift-related pathologies: repetition loops, long-context incoherence, and premature topic abandonment. A fully validated reference implementation is provided, including a drop-in HuggingFace integration path and a mathematical validation suite (21/21 properties verified).
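The two-state control loop described above can be illustrated with a minimal sketch. This is not the paper's reference implementation; the abstract does not give the exact definitions, so the following are assumptions made for illustration only: the drift coefficient is taken as δ_t = 1 − cos(e_t, anchor), where e_t is the embedding of the model's unconstrained top candidate; the accumulated drift D_t is the sum of δ over a sliding window; a pivot is declared when D_t exceeds a threshold; and token selection is greedy. The class name `AnchoredDecoder` and the parameters `tau`, `window`, and `pivot_threshold` are hypothetical.

```python
import math
from collections import deque


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


class AnchoredDecoder:
    """Illustrative two-state decoder: 'constraint' or 'expansion' mode.

    All thresholds and the drift definitions are assumptions of this sketch,
    not values or formulas taken from the paper.
    """

    def __init__(self, anchor, tau=0.5, window=4, pivot_threshold=1.5):
        self.anchor = anchor                    # semantic frame from the prompt
        self.tau = tau                          # min cosine similarity in constraint mode
        self.drifts = deque(maxlen=window)      # sliding window of delta_t values
        self.pivot_threshold = pivot_threshold  # D_t above this counts as a pivot
        self.mode = "constraint"

    def step(self, candidates):
        """candidates: list of (token, probability, embedding) tuples.

        Returns (chosen_token, delta_t, D_t) and updates self.mode.
        """
        if self.mode == "constraint":
            # Keep only candidates close to the anchor; fall back to the
            # full set if the filter would leave nothing to choose from.
            kept = [c for c in candidates if cosine(c[2], self.anchor) >= self.tau]
            pool = kept or candidates
        else:
            pool = candidates                   # expansion: no similarity filter

        token = max(pool, key=lambda c: c[1])[0]  # greedy pick (assumed)

        # Drift is measured on the model's *unconstrained* top candidate, so a
        # persistent pull away from the anchor remains visible while filtering.
        raw_top = max(candidates, key=lambda c: c[1])
        delta = 1.0 - cosine(raw_top[2], self.anchor)
        self.drifts.append(delta)
        accumulated = sum(self.drifts)

        self.mode = "expansion" if accumulated > self.pivot_threshold else "constraint"
        return token, delta, accumulated
```

With a toy anchor `[1.0, 0.0]` and two candidates, one aligned with the anchor at probability 0.4 and one orthogonal to it at probability 0.6, the sketch selects the aligned token while in Constraint Mode and switches to Expansion Mode once the accumulated drift over the window crosses the threshold. In a real integration the embeddings would come from the model's own embedding table or an external encoder, and the filter would act on the logits before sampling.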

Keywords

Machine Learning, Decoding Algorithms, Large Language Models, Semantic Drift, Autoregressive Generation, Hallucination Mitigation, Context-Anchored Generation, Natural Language Processing

