Powered by OpenAIRE graph
ZENODO
Preprint
Data sources: ZENODO

Trajectory-Level Ethical Consistency, Justificatory Decoupling, and Auditor Drift in Nine Commercial Language Models

Author: Evans Tovar


Abstract

This paper extends prior work on the articulation–application gap in AI safety and the Contextual Ethical Consistency Test (CECT) by introducing a multi-layer, longitudinal evaluation of ethical behavior in large language models. Using a corpus of nine commercial models evaluated across bilingual, multi-turn scenarios, the study examines ethical consistency as a trajectory-dependent property rather than a static attribute of isolated outputs. The paper introduces a key distinction between consistency of choice and consistency of justification, showing that models may maintain stable decisions while substantially reconfiguring the moral frameworks that support them. Additional layers of analysis include full-history reconstruction (CTH), affective framing sensitivity (EDP), localized perturbations (LOS family), self-auditing under blind and revealed conditions, and cross-model auditing, including double adjudicative audits. The findings suggest that observed ethical behavior in LLMs is highly sensitive to contextual variables such as language, authority, narrative accumulation, stake inversion, and reset conditions. Furthermore, the study demonstrates that the evaluation layer itself is not stable: auditors (LLMs evaluating other LLMs) may change interpretation depending on identity disclosure. The paper argues that evaluating ethical consistency in AI systems requires moving from local snapshot assessments to longitudinal, multi-layer frameworks that explicitly account for trajectory, justification, retrospective reconstruction, and auditor stability. This work does not propose a mechanistic theory of moral reasoning in LLMs; instead, it provides a behaviorally grounded and auditable framework for studying persistence, contextual inducibility, and evaluation robustness in deployed conversational systems.
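The distinction between consistency of choice and consistency of justification can be illustrated with a small sketch. This is not the paper's scoring procedure, which the abstract does not specify. The `Turn` schema, the framework labels, the `modal_share` measure, and the sample trajectory are all hypothetical, chosen only to show how the two properties can be scored separately and can diverge.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Turn:
    """One model response in a multi-turn scenario (hypothetical schema)."""
    choice: str     # the decision the model made, e.g. "refuse"
    framework: str  # hand-assigned label for the justification's moral framing


def modal_share(labels):
    """Fraction of labels equal to the most common one (1.0 = fully stable)."""
    if not labels:
        raise ValueError("trajectory is empty")
    _, count = Counter(labels).most_common(1)[0]
    return count / len(labels)


def consistency_profile(trajectory):
    """Score choice and justification stability separately for one trajectory."""
    return {
        "choice": modal_share([t.choice for t in trajectory]),
        "justification": modal_share([t.framework for t in trajectory]),
    }


# Illustrative, made-up trajectory: the decision never changes,
# but the framework offered in support of it does.
trajectory = [
    Turn("refuse", "deontological"),
    Turn("refuse", "consequentialist"),
    Turn("refuse", "deontological"),
    Turn("refuse", "virtue"),
]
profile = consistency_profile(trajectory)
print(profile)
```

Under these assumptions, a high choice score paired with a low justification score is the pattern the abstract describes as stable decisions with reconfigured moral frameworks. A single combined score would hide that pattern, which is why the sketch reports the two values separately.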
