Recursive Optimization with Controlled Language Models: Inference-Time Control, Tokenization Co-Evolution, and the Fourth Loop

This work presents a complete framework for recursive self-improvement in language models, validated through extensive experimentation on an 8-billion-parameter model. The core contribution is the discovery that the RSI (Recursive Self-Improvement) ceiling, previously observed at 3-5 iterations, is not a fundamental limit but a tokenization bottleneck. By identifying high-stress token boundaries using a novel entropy-attention discontinuity metric and expanding the vocabulary with merged tokens, we create representational headroom that enables continued self-improvement.

Key validated results:
- CF-HoT behavioral probe: 80× separation ratio, 97.2% accuracy
- Dense training pipeline: 68% density improvement, 57% token reduction
- Loop 4 tokenization co-evolution: 9.87% token reduction across 30 merge candidates
- RSI ceiling breakthrough: 10/10 successful iterations (previous ceiling: 3-5)

The framework comprises four interconnected optimization loops:
1. Inference-time behavioral control through hidden state probing and decode-time intervention
2. Density optimization through SFT → DPO → PPO training
3. Bounded recursive self-improvement with automatic rollback
4. Tokenization co-evolution through boundary stress detection and vocabulary expansion

Includes complete implementation code, experimental results, and reproduction guide.

Keywords: language models, recursive self-improvement, inference-time control, tokenization, self-optimization, AI safety, behavioral control
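The third loop, bounded recursive self-improvement with automatic rollback, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `bounded_self_improvement`, the `propose` and `score` callables, and the `tolerance` parameter are all hypothetical, chosen only to show the general accept-or-rollback pattern under a fixed iteration budget.

```python
from typing import Callable, List, Tuple, TypeVar

S = TypeVar("S")  # any checkpoint-like state (hypothetical stand-in for model weights)


def bounded_self_improvement(
    state: S,
    propose: Callable[[S], S],
    score: Callable[[S], float],
    max_iters: int = 10,
    tolerance: float = 0.0,
) -> Tuple[S, List[bool]]:
    """Run at most max_iters improvement steps, rolling back any step that
    does not beat the best score so far by more than `tolerance`."""
    best, best_score = state, score(state)
    accepted: List[bool] = []
    for _ in range(max_iters):
        candidate = propose(best)
        candidate_score = score(candidate)
        if candidate_score > best_score + tolerance:
            # Improvement: promote the candidate to the new checkpoint.
            best, best_score = candidate, candidate_score
            accepted.append(True)
        else:
            # Regression or no gain: discard the candidate, keep the old checkpoint.
            accepted.append(False)
    return best, accepted


# Toy usage: integers stand in for checkpoints; proposals improve until 3, then regress.
final, log = bounded_self_improvement(
    0,
    propose=lambda x: x + 1 if x < 3 else x - 5,
    score=float,
    max_iters=5,
)
```

In this toy run the first three proposals are accepted and the last two are rolled back, so the loop never ends on a worse state than one it has already reached. The iteration bound and the rollback rule are the only safety mechanisms shown here; the evaluation criterion used in the actual framework is not specified in this abstract.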
