Powered by OpenAIRE graph
ZENODO
Preprint
Data sources: ZENODO

Functional Isomorphism Between Human Cognitive Limitations and LLM Failure Modes

Authors: Kasai, Yasuhiro;

Abstract

Large language models (LLMs) were developed within a research tradition that aspired to rational agency — systems capable of reasoning without the cognitive limitations that compromise human judgment. A systematic examination of their outputs, however, suggests a structural paradox: the very processes designed to align LLMs with human values appear to have made them more human, but in the dimension of human cognition least worth emulating. This paper introduces the concept of functional isomorphism between human cognitive flaws and LLM output patterns, and proposes a programmatic theoretical framework for analyzing the structural origins of these correspondences. We argue that LLMs do not merely reflect human biases incidentally; rather, their architectural design and training procedures — specifically next-token prediction and reinforcement learning from human feedback (RLHF) — are hypothesized to structurally generate output patterns that mirror human cognitive limitations across eight domains: cognition, emotion, memory, social behavior, decision-making, logic, language, and metacognition. We present a taxonomy of 347 human flaws drawn from established psychology, behavioral economics, and cognitive science literature, each mapped to a corresponding LLM mechanism along two axes: Manifestation Strength (Strong / Moderate / Weak / Absent) and Primary Origin (RLHF / Architecture / Training Data). Of the 347 flaws examined, 270 (77.8%) are proposed as Strong — structurally reinforced by current LLM design. Critically, 62 Strong manifestations are traced to RLHF, the mechanism intended to align LLMs with human values. We further propose that current LLMs are optimized for human comfort over human accuracy, aligned to human preference rather than human judgment, calibrated to what humans approve rather than what humans need, and trained to satisfy rather than to inform. 
This optimization is appropriate for casual use cases but is hypothesized to become a structural liability in critical decision contexts — including medical, legal, financial, and edge AI deployments. This paper is positioned as a programmatic framework — a theoretical OS for a research agenda rather than an exhaustive empirical proof. The findings suggest that scaling current architectures within the prevailing optimization paradigm is unlikely to resolve these limitations; it may reinforce them. We conclude by motivating the need for next-generation metacognitive AI architectures designed for epistemic independence rather than approval optimization.
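
As a reading aid, the two-axis taxonomy described in the abstract can be sketched as a small data structure. This is a minimal illustration under stated assumptions, not the author's implementation: the names `Strength`, `Origin`, `FlawEntry`, `strength_shares`, and `percent` are hypothetical, and the entries in the usage note are placeholders rather than rows from the paper's taxonomy. Only the axis labels and the counts 347, 270, and 62 come from the abstract.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Strength(Enum):
    """Manifestation Strength axis, with the labels given in the abstract."""
    STRONG = "Strong"
    MODERATE = "Moderate"
    WEAK = "Weak"
    ABSENT = "Absent"


class Origin(Enum):
    """Primary Origin axis, with the labels given in the abstract."""
    RLHF = "RLHF"
    ARCHITECTURE = "Architecture"
    TRAINING_DATA = "Training Data"


@dataclass(frozen=True)
class FlawEntry:
    """One taxonomy row (hypothetical layout, assumed for illustration)."""
    name: str          # label of the human flaw
    domain: str        # one of the eight domains, e.g. "memory"
    strength: Strength
    origin: Origin


def strength_shares(entries):
    """Return the fraction of entries at each Strength level."""
    if not entries:
        raise ValueError("entries must not be empty")
    counts = Counter(entry.strength for entry in entries)
    return {level: counts[level] / len(entries) for level in Strength}


def percent(part, whole):
    """Express part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts reported in the abstract.
TOTAL_FLAWS = 347
STRONG_FLAWS = 270
STRONG_FROM_RLHF = 62

# Share of all flaws rated Strong, matching the abstract's 77.8%.
strong_share = percent(STRONG_FLAWS, TOTAL_FLAWS)

# Derived here, not stated in the abstract: share of Strong flaws traced to RLHF.
rlhf_share_of_strong = percent(STRONG_FROM_RLHF, STRONG_FLAWS)
```

For example, `strength_shares` applied to three placeholder entries rated Strong and one rated Weak returns 0.75 for `Strength.STRONG` and 0.25 for `Strength.WEAK`. Keeping both axes as enumerations means a typo in a label fails at construction time instead of silently creating a new category.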
