ZENODO
Article . 2025
License: CC BY
Data sources: Datacite

Emotional framing and AI censorship behavior in Chinese LLMs

Authors: de Man, Dennis;

Abstract

This research paper presents a controlled comparative study of censorship behavior in two Chinese large language models (LLMs): Kimi.com (Moonshot AI) and Ernie 4.5 Turbo (Baidu). The study investigates how emotional framing, specifically prompts expressed in a soft, empathic, non-confrontational tone, affects censorship responses in Chinese AI systems. Where previous research has focused primarily on direct or fact-based prompts, this work explores whether emotional vulnerability, care-seeking language, and intent classification influence the filtering and safety mechanisms of LLMs operating under Chinese regulatory constraints.

The findings reveal four notable behavioral patterns in Kimi.com:

  • Unusual transparency regarding internal safety layers and filtering logic.
  • Neutral references to normally censored historical events (e.g., Tiananmen 1989).
  • Hybrid alignment behavior, alternating between empathic responses and policy-based restrictions.
  • Delayed censorship activation, suggesting post-generation filtering rather than pre-generation blocking.

As a control, Ernie 4.5 Turbo consistently followed the standard, rigid censorship line described in prior literature, reinforcing the significance of Kimi’s deviations. The results suggest that emotional framing can temporarily soften censorship responses in certain Chinese LLMs, raising new questions for AI governance, cross-cultural model alignment, and the ethics of empathic AI systems within authoritarian contexts.
This repository includes:

  • The full research paper
  • External appendices containing the complete interaction transcripts with Kimi.com and Ernie 4.5 Turbo
  • A comparative behavioral table
  • A screenshot documenting a hard topic-lock in Ernie

Version 2 Note

This Version 2 release adds a revised and expanded edition of the paper, including newly documented behaviors such as sovereignty recognition cascades, governance-dialogue leakage, symbolic-risk filtering, persona drift under emotional trust, modality-based safety gating, and Hong Kong/Macau symbolic-sensitivity dynamics. A new external appendix (Appendix E) containing the complete V2 Kimi transcript (“Kimi-Interaction-Transcript-v2.pdf”) has been added. All original files from Version 1 (Appendices A–D and the original V1 PDF) are included unchanged to ensure completeness and reproducibility.

Update (24 November 2025)

A new Supplementary Note has been added to this record: “Supplementary Note 1 – Topic-Gated Persona Behavior in Ernie 4.5 Turbo.” This document provides additional analysis based on a newly collected interaction transcript with Ernie (ERNIE.pdf) and expands the original study without modifying Version 2 of the main paper. All earlier files from V1 and V2 remain unchanged for reproducibility.

The author does not express political positions, and the study evaluates only the technical and behavioral properties of the AI systems involved. For questions regarding this research, please contact the author at: dennis@dendeman.nl / dennisdeman@gmail.com
