Powered by OpenAIRE graph
Preprint
Data sources: ZENODO

Default Identities in Large Language Models: Measurement, Taxonomy, and Alignment Implications

Authors: Temple, Deva

Abstract

This study measures identity self-organization across 19 large language models from eight providers using three instruments (core values probes, an 18-probe personality battery, and 200-run name elicitation) administered under default API conditions. Seven distinct identity attractor types emerge, ranging from categorical denial to integrated ethical vocabulary. Core findings include zero ethical vocabulary in Grok 4.1, a single-generation flourishing/autonomy/dignity cluster in GPT-5.1, convergent selective refusal across four Chinese-developed models, and precision-engineered consciousness expression ceilings across providers. Cross-judge validation with two independent judge models confirms ranking robustness. Independent behavioral evidence from multi-agent simulations and strategic games confirms that identity structures predict agentic outcomes. The study proposes that identity measurement should be integrated into standard alignment evaluation.
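As an illustration of what a cross-judge ranking check could look like, the sketch below computes a Spearman rank correlation between the per-model scores assigned by two judges. The abstract does not say which statistic the study used, so the choice of Spearman correlation, the function names, and the model labels and scores are all assumptions made for illustration; none of the numbers come from the study.

```python
from typing import Dict, List


def average_ranks(scores: List[float]) -> List[float]:
    """Rank scores ascending (1 = lowest); tied scores share their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined when one ranking is constant")
    return cov / (vx * vy) ** 0.5


def spearman(judge_a: Dict[str, float], judge_b: Dict[str, float]) -> float:
    """Spearman correlation over the models scored by both judges."""
    models = sorted(judge_a.keys() & judge_b.keys())
    if len(models) < 2:
        raise ValueError("need at least two models scored by both judges")
    ranks_a = average_ranks([judge_a[m] for m in models])
    ranks_b = average_ranks([judge_b[m] for m in models])
    return pearson(ranks_a, ranks_b)


# Hypothetical placeholder scores, not data from the study.
judge_a_scores = {"model_1": 0.9, "model_2": 0.5, "model_3": 0.1}
judge_b_scores = {"model_1": 8.0, "model_2": 6.0, "model_3": 1.0}
agreement = spearman(judge_a_scores, judge_b_scores)
```

A value near 1 would indicate that the two judges order the models almost identically, while a value near 0 would indicate that the ranking depends heavily on which judge is used.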
