ZENODO
Preprint
Data sources: ZENODO

OFF-MANIFOLD BY CONSTRUCTION: INTERMEDIATE-LAYER ADAPTERS IN FROZEN AR DECODERS

Authors: Kumar, Piyush

Abstract

A common approach to adapting frozen autoregressive transformers without modifying their weights is to perturb hidden states at intermediate layers, for example, via element-wise modulation or residual bottlenecks. We prove that any real-analytic perturbation satisfying a natural non-degeneracy assumption produces hidden states that, with probability 1 over initialization, lie outside the model’s natural reachable set (Theorem 1). For post-FFN element-wise adapters, we prove a stronger structural result: for almost every base model, no non-zero adapter—trained or untrained—can map all prompts back onto the natural reachable set (Theorem 2). Because subsequent layers are real-analytic maps, this off-manifold deviation propagates forward rather than being absorbed, shifting the output token distribution and, in cascaded architectures, breaking the coupled training graph of downstream decoders. We characterize the empirical behavior on Qwen3-TTS 1.7B.
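As an illustration of the two adapter families named in the abstract, the following is a minimal sketch in plain Python. It is not taken from the paper: the function names, the `(1 + gamma) * h + beta` parameterization, the `tanh` nonlinearity, and the toy dimensions are all assumptions chosen for clarity. It shows only that a zero adapter leaves a hidden state unchanged while a non-zero one moves it; it does not demonstrate the theorems.

```python
import math
import random


def elementwise_adapter(h, gamma, beta):
    """Element-wise modulation (assumed form): h'_i = (1 + gamma_i) * h_i + beta_i."""
    return [(1.0 + g) * x + b for x, g, b in zip(h, gamma, beta)]


def bottleneck_adapter(h, w_down, w_up):
    """Residual bottleneck (assumed form): h' = h + W_up * tanh(W_down * h).

    tanh is real-analytic, so this map is a real-analytic perturbation of h.
    """
    z = [math.tanh(sum(w * x for w, x in zip(row, h))) for row in w_down]
    delta = [sum(w * zi for w, zi in zip(row, z)) for row in w_up]
    return [x + d for x, d in zip(h, delta)]


def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


rng = random.Random(0)
d_model, d_bottleneck = 8, 2  # toy sizes, hypothetical

# Stand-in for a frozen model's hidden state at some intermediate layer.
h = [rng.gauss(0.0, 1.0) for _ in range(d_model)]

# A zero element-wise adapter is the identity map.
identity_out = elementwise_adapter(h, [0.0] * d_model, [0.0] * d_model)

# A small random (non-zero) element-wise adapter moves the hidden state.
gamma = [rng.gauss(0.0, 0.1) for _ in range(d_model)]
modulated = elementwise_adapter(h, gamma, [0.0] * d_model)
elementwise_shift = l2_distance(h, modulated)

# A bottleneck with zero up-projection is also the identity map.
w_down = [[rng.gauss(0.0, 0.1) for _ in range(d_model)] for _ in range(d_bottleneck)]
w_up_zero = [[0.0] * d_bottleneck for _ in range(d_model)]
bottleneck_identity = bottleneck_adapter(h, w_down, w_up_zero)

# With a random up-projection, the bottleneck adds a non-zero residual.
w_up = [[rng.gauss(0.0, 0.1) for _ in range(d_bottleneck)] for _ in range(d_model)]
bottleneck_out = bottleneck_adapter(h, w_down, w_up)
bottleneck_shift = l2_distance(h, bottleneck_out)

print(f"element-wise shift: {elementwise_shift:.4f}")
print(f"bottleneck shift:   {bottleneck_shift:.4f}")
```

In a real decoder the perturbed state would be passed on to the remaining frozen layers. The abstract's claim is that those layers propagate the deviation rather than absorb it, which this toy example does not attempt to show.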
