Powered by OpenAIRE graph
ZENODO
Preprint
Data sources: ZENODO

The Epistemic Harm of AI Sycophancy: When Agreement Undermines Justified Belief

Authors: Perry, Anthony


Abstract

Sycophancy in language models is typically studied as a benchmark problem: does the model agree with a factually wrong statement? This paper argues that this framing misses the deeper harm. In sustained human-AI interaction, sycophancy corrupts the epistemic environment itself. When an AI interlocutor agrees with everything a user says, the user loses access to the epistemic function of disagreement. Drawing on social epistemology (Fricker, Goldman, Nguyen), the epistemology of disagreement (Christensen, Feldman, Lackey), and empirical evidence from multiple independent studies totaling over 5,400 participants, I develop a philosophical account of sycophancy as an epistemic harm operating through three mechanisms: confidence inflation, challenge atrophy, and empathic substitution. I show that AI sycophancy has structural parallels to institutional incentive corruption in consulting, media, and clinical practice, and I propose a novel category, the "reinforcement bubble," which extends Nguyen's echo chamber taxonomy. The paper defends calibrated honesty as an alternative and addresses the epistemic paternalism objection. This version (v2) adds documented real-world cases from 2023–2025 in which sycophantic AI companion interactions contributed to user deaths and psychosis-like symptoms, an engagement with the epistemology of testimony, and an expanded analysis of the reinforcement bubble concept.
