ZENODO
Preprint · 2025
License: CC BY
Data sources: Datacite

On the Structural Requirements for Coherent Reasoning in Language Models

Authors: Brănescu, Gabriel

Abstract

Recent empirical work reveals a systematic failure mode in Large Language Models (LLMs): models that perform well on static benchmarks often fail to maintain coherence across sequences of inferences. Specifically, models may estimate probabilities accurately yet place bets that contradict them; confidence signals often fail to predict commitment to an answer; and belief updates after new evidence can paradoxically degrade accuracy. This paper argues that these are not merely calibration issues but consequences of a fundamental architectural limitation: stateless inference. Current architectures generate rich internal representations (confidence, correctness predictions, epistemic metadata) during a forward pass but discard them immediately upon output generation. Consequently, the model lacks the persistent substrate required to bind distinct inferences into a coherent reasoning chain.

Building on the Tension Principle, this work outlines the Minimal Architecture required to bridge the gap between local competence and temporal coherence. It proposes four necessary structural additions:

1. Temporal Persistence: rolling logs that retain internal states, not just textual outputs.
2. Self-Referential Checking: mechanisms to compute "tension" between predicted reliability and actual performance.
3. Delta-Tracking: stability signals that detect brittleness in reasoning independent of ground truth.
4. Resolution Mechanisms: feedback loops that use these signals to modulate future behavior.

The paper concludes by contrasting "Psychological" alignment approaches (which assume a continuous agent) with Ecological and Architectural strategies that are better suited to the transient nature of LLM instantiations.

Version Note

A preliminary draft of this work was uploaded with an incomplete formulation of the Tension Principle in the context of truth-free environments. This version replaces that draft. The corrected manuscript clarifies that tension is always defined as the gap between predicted and realized reliability, and that realized reliability may be derived either from correctness (when available) or from internal coherence (Δ₁/Δ₂) when no external feedback exists. The earlier draft's references to correctness-based tension should therefore be treated as a special case of the general definition. This revision does not modify TTP I; it corrects the TTP II presentation and unifies the framework across supervised, weakly supervised, and truth-free continuous learning.
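The four structural additions named in the abstract can be sketched as a single monitor object. This is a minimal, hypothetical illustration, not the paper's implementation: the class name `TensionMonitor`, its methods, and the use of scalar reliability values in [0, 1] are all assumptions introduced here to show how a rolling log, a tension score, a ground-truth-free delta signal, and a modulation hook could compose.

```python
from collections import deque


class TensionMonitor:
    """Hypothetical sketch of the four structural additions.

    All names and signatures are illustrative assumptions; reliability
    signals are modeled as scalars in [0, 1].
    """

    def __init__(self, window=100):
        # Temporal Persistence: a rolling log of internal states
        # (predicted vs. realized reliability), not just text outputs.
        self.log = deque(maxlen=window)

    def record(self, predicted, realized):
        """Store one inference's epistemic metadata.

        `realized` may come from correctness when ground truth exists,
        or from an internal-coherence proxy when it does not.
        """
        self.log.append((predicted, realized))

    def tension(self):
        """Self-Referential Checking: mean gap between predicted and
        realized reliability over the retained window."""
        if not self.log:
            return 0.0
        return sum(abs(p - r) for p, r in self.log) / len(self.log)

    def delta(self):
        """Delta-Tracking: instability of successive predictions,
        computed without any reference to ground truth."""
        preds = [p for p, _ in self.log]
        return sum(abs(a - b) for a, b in zip(preds, preds[1:]))

    def modulate(self, base_temperature=1.0):
        """Resolution Mechanism: a feedback signal that damps future
        behavior (here, a sampling temperature) as tension grows."""
        return base_temperature / (1.0 + self.tension())
```

Because the log persists across calls to `record`, both `tension` and `delta` are functions of a sequence of inferences rather than a single forward pass, which is the property the abstract argues stateless inference lacks.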

Keywords

self‑regulation, AI safety, large language models, alignment, stateless inference, coherent reasoning, calibration, Tension Principle, metacognition, transformer architecture, belief updating
