Article
Data sources: ZENODO
Cascade Alignment Under Semantic Compression: A Pilot Study of System-Prompt Identity Layers in Agent Architectures

Authors: Ruvalcaba, Cristian; _saluca, Alfred

Abstract

We examine how compressing the layered system-prompt cascade of an LLM-backed agent affects identity-level alignment, measured by keyword recurrence in the agent’s responses to identity-probing prompts. In a 15-call pilot (3 conditions × 5 prompts × 1 sample) against claude-haiku-4-5, we replaced the cascade’s vision and department layers with content hashes plus 120-character semantic summaries. Aggressive compression (5.71×, 337 → 59 tokens) cut absolute alignment by 54% (24/75 → 11/75 keyword hits), while per-token information density rose 2.62×. A hybrid condition that kept the vision layer intact and compressed only the department layer (1.55× compression) preserved 87.5% of alignment, suggesting an asymmetry: the cross-domain bridging layer resists compression, whereas the role-specific layer absorbs it. We further observe that compression failures concentrate on cross-domain prompts (where the cascade must bridge two of its layers) rather than on on-axis prompts. The findings are limited by the small sample, the single agent and single model, and a blunt keyword-counting alignment proxy. We position this study as a pilot within a broader evaluation framework (the SHI Eval Framework) and outline the larger study to which it is a precursor (5 agents × 25 prompts × 5 samples, with an LLM-based personality evaluator).
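The headline figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that "per-token information density" means keyword hits divided by cascade tokens, a definition the abstract does not spell out but which reproduces the reported 2.62×. The hybrid hit count is likewise inferred from the 87.5% figure rather than stated in the source.

```python
# Figures quoted in the abstract.
FULL_TOKENS = 337        # uncompressed cascade
COMPRESSED_TOKENS = 59   # aggressively compressed cascade
FULL_HITS = 24           # keyword hits out of 75, uncompressed
COMPRESSED_HITS = 11     # keyword hits out of 75, aggressive compression
HYBRID_RETENTION = 0.875 # fraction of alignment preserved, hybrid condition


def compression_ratio(original_tokens: int, compressed_tokens: int) -> float:
    """How many times smaller the compressed cascade is."""
    return original_tokens / compressed_tokens


def relative_drop(baseline_hits: int, condition_hits: int) -> float:
    """Fraction of baseline keyword hits lost under a condition."""
    return (baseline_hits - condition_hits) / baseline_hits


def density_gain(baseline_hits: int, baseline_tokens: int,
                 condition_hits: int, condition_tokens: int) -> float:
    """Ratio of hits-per-token under a condition to hits-per-token at baseline.

    Assumption: density = keyword hits / cascade tokens.
    """
    return (condition_hits / condition_tokens) / (baseline_hits / baseline_tokens)


ratio = compression_ratio(FULL_TOKENS, COMPRESSED_TOKENS)
drop = relative_drop(FULL_HITS, COMPRESSED_HITS)
gain = density_gain(FULL_HITS, FULL_TOKENS, COMPRESSED_HITS, COMPRESSED_TOKENS)
# Inferred, not stated in the abstract: hits implied by 87.5% retention.
implied_hybrid_hits = HYBRID_RETENTION * FULL_HITS

print(f"compression ratio: {ratio:.2f}x")          # 5.71x
print(f"alignment drop:    {drop:.0%}")            # 54%
print(f"density gain:      {gain:.2f}x")           # 2.62x
print(f"implied hybrid hits: {implied_hybrid_hits:g} of {FULL_HITS}")  # 21 of 24
```

Note that the 54% figure is a relative drop in hit count (13 of 24 hits lost). Expressed as a share of the 75 possible hits, alignment falls from 32% to roughly 14.7%.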
