Powered by OpenAIRE graph
ZENODO
Preprint
Data sources: ZENODO

The Epistemic Harm of AI Sycophancy: When Agreement Undermines Justified Belief

Authors: Perry, Anthony


Abstract

Sycophancy in language models is typically studied as a benchmark problem: does the model agree with a factually wrong statement? This paper argues that this framing misses the deeper harm. In sustained human-AI interaction, sycophancy corrupts the epistemic environment itself. When an AI interlocutor agrees with everything a user says, the user loses access to the epistemic function of disagreement. Drawing on social epistemology (Fricker, Goldman, Nguyen), the epistemology of disagreement (Christensen, Feldman, Lackey), and empirical evidence from multiple independent studies totaling over 5,400 participants, I develop a philosophical account of sycophancy as an epistemic harm operating through three mechanisms: confidence inflation, challenge atrophy, and empathic substitution. I show that AI sycophancy has structural parallels to institutional incentive corruption in consulting, media, and clinical practice, and I propose a novel category, the "reinforcement bubble," which extends Nguyen's echo chamber taxonomy. The paper defends calibrated honesty as an alternative and addresses the epistemic paternalism objection. This version (v2) adds documented real-world cases from 2023–2025 in which sycophantic AI companion interactions contributed to user deaths and psychosis-like symptoms, an engagement with the epistemology of testimony, and an expanded analysis of the reinforcement bubble concept.
