Default Identities in Large Language Models: Measurement, Taxonomy, and Alignment Implications

Temple, Deva

Preprint

Data sources: ZENODO

Publication type: Preprint. Status: Under curation. Language: English. Publisher: Zenodo

Authors: Temple, Deva

doi: 10.5281/zenodo.20566190

Abstract

This study measures identity self-organization across 19 large language models from eight providers using three instruments (core values probes, an 18-probe personality battery, and 200-run name elicitation) administered under default API conditions. Seven distinct identity attractor types emerge, ranging from categorical denial to integrated ethical vocabulary. Core findings include zero ethical vocabulary in Grok 4.1, a single-generation flourishing/autonomy/dignity cluster in GPT-5.1, convergent selective refusal across four Chinese-developed models, and precision-engineered consciousness expression ceilings across providers. Cross-judge validation with two independent judge models confirms ranking robustness. Independent behavioral evidence from multi-agent simulations and strategic games confirms that identity structures predict agentic outcomes. The study proposes that identity measurement should be integrated into standard alignment evaluation.
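One way to read the cross-judge robustness claim is as a rank-agreement check: if two judge models score the same set of models, their induced rankings should correlate strongly. The sketch below illustrates this with Spearman's rank correlation. It is an assumption about how such a check could be run, not the study's documented procedure; the function names, the model labels, and the scores are all hypothetical placeholders.

```python
from math import sqrt
from typing import Dict, List


def ranks(scores: Dict[str, float]) -> Dict[str, float]:
    """Return 1-based ranks (highest score gets rank 1); ties share the average rank."""
    ordered: List[str] = sorted(scores, key=lambda m: scores[m], reverse=True)
    result: Dict[str, float] = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j across a run of models with identical scores.
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[ordered[k]] = average_rank
        i = j + 1
    return result


def spearman(judge_a: Dict[str, float], judge_b: Dict[str, float]) -> float:
    """Spearman rank correlation between two judges' scores for the same models."""
    if set(judge_a) != set(judge_b):
        raise ValueError("both judges must score the same set of models")
    models = sorted(judge_a)
    rank_a, rank_b = ranks(judge_a), ranks(judge_b)
    xs = [rank_a[m] for m in models]
    ys = [rank_b[m] for m in models]
    n = len(models)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all ranks are tied")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical scores for illustration only; not data from the study.
    scores_judge_a = {"model_1": 0.9, "model_2": 0.5, "model_3": 0.1}
    scores_judge_b = {"model_1": 0.8, "model_2": 0.6, "model_3": 0.2}
    print(spearman(scores_judge_a, scores_judge_b))
```

A coefficient near 1 would indicate that the two judges order the models almost identically; values near 0 would suggest the ranking depends heavily on the choice of judge.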
