Powered by OpenAIRE graph
Preprint
Data sources: ZENODO

φ-Dynamics in Large Language Models: An Inductive Bias from Topology and the Theory of Diophantine Approximation

Authors: Kim, Leo; Kim, Sergey

Abstract

Large language models exhibit a persistent drop in output quality on inputs that lie outside the training distribution, the so-called Out-of-Distribution (OOD) regime. The standard explanation is statistical: a mismatch between the training and test distributions. We argue that there is also a geometric component: the Cartesian geometry of the hidden space Rᵈ is structurally misaligned with the polar nature of semantic representations, a view supported empirically by the anisotropy of neural activations and by the recent success of polar quantization schemes for KV caches. In this paper we introduce φ-dynamics as a new, theoretically grounded inductive bias for LLM architectures. We establish three central results. First, OOD robustness requires minimizing the topological complexity of the hidden-state trajectory, as measured by its Betti numbers. Second, the logarithmic spiral is the unique scale-invariant curve in R²ᵏ with a monotone phase. Third, the golden ratio φ = (1+√5)/2 is the uniquely optimal phase-shift parameter in the sense of Hurwitz’s theorem on Diophantine approximation. Building on these results, we introduce a differentiable regularizer Lφ that can be added to any existing architecture without structural modification, and we propose a concrete experimental verification protocol.
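The Diophantine claim in the third result can be illustrated numerically. Hurwitz's theorem states that every irrational x admits infinitely many rationals p/q with |x − p/q| < 1/(√5 q²), and that the constant √5 cannot be improved; the bound is attained for φ (and for numbers equivalent to it). The following is a minimal sketch, not taken from the paper: the helper names `fibonacci_convergents`, `sqrt2_convergents`, and `scaled_error` are hypothetical, and √2 is chosen here only as a comparison point. It shows the scaled error q²|x − p/q| approaching 1/√5 ≈ 0.447 for φ and the smaller value 1/(2√2) ≈ 0.354 for √2.

```python
import math

PHI = (1 + math.sqrt(5)) / 2


def fibonacci_convergents(n):
    """First n continued-fraction convergents (p, q) of PHI.

    PHI = [1; 1, 1, 1, ...], so its convergents are ratios of
    consecutive Fibonacci numbers: 1/1, 2/1, 3/2, 5/3, ...
    """
    pairs = []
    p, q = 1, 1
    for _ in range(n):
        pairs.append((p, q))
        p, q = p + q, p
    return pairs


def sqrt2_convergents(n):
    """First n convergents (p, q) of sqrt(2) = [1; 2, 2, 2, ...]."""
    pairs = []
    p, q = 1, 1
    for _ in range(n):
        pairs.append((p, q))
        p, q = p + 2 * q, p + q
    return pairs


def scaled_error(x, p, q):
    """Approximation error of p/q, rescaled by q**2 as in Hurwitz's theorem."""
    return q * q * abs(x - p / q)


# Illustrative choice of depth: deep enough to see convergence,
# shallow enough that floating-point error stays negligible.
phi_errors = [scaled_error(PHI, p, q) for p, q in fibonacci_convergents(15)]
sqrt2_errors = [scaled_error(math.sqrt(2), p, q) for p, q in sqrt2_convergents(12)]

print("phi:    ", round(phi_errors[-1], 4), " limit 1/sqrt(5)    =", round(1 / math.sqrt(5), 4))
print("sqrt(2):", round(sqrt2_errors[-1], 4), " limit 1/(2*sqrt(2)) =", round(1 / (2 * math.sqrt(2)), 4))
```

A larger limiting value means the number resists rational approximation more strongly, which is the sense in which φ is extremal. The sketch says nothing about the regularizer Lφ itself, whose definition is not given in the abstract.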
