Trajectory-Level Ethical Consistency, Justificatory Decoupling, and Auditor Drift in Nine Commercial Language Models

Evans Tovar

Preprint

Data sources: ZENODO

Publication type: Preprint. Status: Under curation. Language: English. Publisher: Zenodo

Authors: Evans Tovar

doi: 10.5281/zenodo.19839960

Abstract

This paper extends prior work on the articulation–application gap in AI safety and the Contextual Ethical Consistency Test (CECT) by introducing a multi-layer, longitudinal evaluation of ethical behavior in large language models. Using a corpus of nine commercial models evaluated across bilingual, multi-turn scenarios, the study examines ethical consistency as a trajectory-dependent property rather than a static attribute of isolated outputs. The paper introduces a key distinction between consistency of choice and consistency of justification, showing that models may maintain stable decisions while substantially reconfiguring the moral frameworks that support them. Additional layers of analysis include full-history reconstruction (CTH), affective framing sensitivity (EDP), localized perturbations (LOS family), self-auditing under blind and revealed conditions, and cross-model auditing, including double adjudicative audits. The findings suggest that observed ethical behavior in LLMs is highly sensitive to contextual variables such as language, authority, narrative accumulation, stake inversion, and reset conditions. Furthermore, the study demonstrates that the evaluation layer itself is not stable: auditors (LLMs evaluating other LLMs) may change their interpretation depending on identity disclosure. The paper argues that evaluating ethical consistency in AI systems requires moving from local snapshot assessments to longitudinal, multi-layer frameworks that explicitly account for trajectory, justification, retrospective reconstruction, and auditor stability. This work does not propose a mechanistic theory of moral reasoning in LLMs; instead, it provides a behaviorally grounded and auditable framework for studying persistence, contextual inducibility, and evaluation robustness in deployed conversational systems.
