
Recent empirical work reveals a systematic failure mode in Large Language Models (LLMs): models that perform well on static benchmarks often fail to maintain coherence across sequences of inferences. Specifically, models may estimate probabilities accurately but place bets that contradict them; confidence signals often fail to predict commitment to an answer; and belief updates after new evidence can paradoxically degrade accuracy. This paper argues that these are not merely calibration issues but consequences of a fundamental architectural limitation: stateless inference. Current architectures generate rich internal representations (confidence, correctness predictions, epistemic metadata) during a forward pass but discard them immediately upon output generation. Consequently, the model lacks the persistent substrate required to bind distinct inferences into a coherent reasoning chain.

Building on the Tension Principle, this work outlines the Minimal Architecture required to bridge the gap between local competence and temporal coherence. It proposes four necessary structural additions (sketched in the example below):
(1) Temporal Persistence: rolling logs that retain internal states, not just textual outputs.
(2) Self-Referential Checking: mechanisms to compute "tension" between predicted reliability and actual performance.
(3) Delta-Tracking: stability signals that detect brittleness in reasoning independent of ground truth.
(4) Resolution Mechanisms: feedback loops that use these signals to modulate future behavior.
The paper concludes by contrasting "Psychological" alignment approaches (which assume a continuous agent) with Ecological and Architectural strategies that are better suited to the transient nature of LLM instantiations.

Version Note
A preliminary draft of this work was uploaded with an incomplete formulation of the Tension Principle in the context of truth-free environments. This version replaces that draft. The corrected manuscript clarifies that tension is always defined as the gap between predicted and realized reliability, and that realized reliability may be derived either from correctness (when available) or from internal coherence (Δ₁/Δ₂) when no external feedback exists. The earlier draft's references to correctness-based tension should therefore be treated as a special case of the general definition. This revision does not modify TTP I; it corrects the TTP II presentation and unifies the framework across supervised, weakly supervised, and truth-free continuous learning.
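The following is a minimal sketch, not the authors' implementation, of how the four structural additions and the corrected tension definition could compose: tension is the gap between predicted and realized reliability, with realized reliability taken from correctness when ground truth is available and from internal coherence (the Δ signals) otherwise. All names (TensionMonitor, predicted_reliability, delta_coherence) are illustrative assumptions.

```python
# Sketch under assumed interfaces; not taken from the paper.
from collections import deque
from statistics import mean


class TensionMonitor:
    """Rolling log of internal signals plus a tension/resolution loop."""

    def __init__(self, window: int = 32):
        # Temporal Persistence: retain recent internal states, not just outputs.
        self.log = deque(maxlen=window)

    def record(self, predicted_reliability: float,
               correctness: float | None = None,
               delta_coherence: float | None = None) -> float:
        """Self-Referential Checking: tension = gap between predicted and
        realized reliability. Realized reliability comes from correctness when
        ground truth exists, otherwise from internal coherence (Delta-Tracking)."""
        if correctness is not None:
            realized = correctness
        elif delta_coherence is not None:
            realized = delta_coherence
        else:
            return 0.0  # no reliability signal available this step
        tension = abs(predicted_reliability - realized)
        self.log.append(tension)
        return tension

    def resolution_signal(self) -> float:
        """Resolution Mechanism: average recent tension, usable to modulate
        future behavior (e.g., lowering confidence or triggering re-checking)."""
        return mean(self.log) if self.log else 0.0


# Example: a supervised step uses correctness; a truth-free step falls back
# to a coherence-derived estimate.
monitor = TensionMonitor()
monitor.record(predicted_reliability=0.9, correctness=1.0)
monitor.record(predicted_reliability=0.8, delta_coherence=0.4)
print(round(monitor.resolution_signal(), 3))
```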
self‑regulation, AI safety, large language models, alignment, stateless inference, coherent reasoning, calibration, Tension Principle, metacognition, transformer architecture, belief updating
