
Abstract: In human error-based learning, the size and direction of a scalar error (the "directed error") are used to update future actions. Modern deep reinforcement learning (RL) methods perform a similar operation, but on scalar rewards. Despite this similarity, the relationship between the action updates of deep RL and those of human error-based learning has not been investigated. Here, we systematically compare the three major families of deep RL algorithms to human error-based learning. We show that all three are qualitatively different from human error-based learning, as assessed by a mirror-reversal perturbation experiment. To bridge this gap, we develop an alternative deep RL algorithm inspired by human error-based learning: model-based deterministic policy gradients (MB-DPG). We show that MB-DPG captures human error-based learning under mirror-reversal and rotational perturbations, learns faster than canonical model-free algorithms on complex arm-based reaching tasks, and is more robust to (forward-)model misspecification than model-based RL.
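The core idea the abstract attributes to MB-DPG, routing a directed error through a learned forward model to update a deterministic policy, can be illustrated with a deliberately minimal one-dimensional sketch. Everything here (the linear environment, the learning rates, the function name) is an illustrative assumption for exposition, not the paper's implementation:

```python
import numpy as np

def mb_dpg_sketch(true_gain, target=1.0, steps=300, lr_pi=0.1, lr_m=0.2, seed=0):
    """Toy error-based, model-based update: the directed error
    (target - outcome) is passed through the gradient of a *learned*
    forward model to adjust a deterministic action. Illustrative only."""
    rng = np.random.default_rng(seed)
    a = rng.normal()  # deterministic "policy": a single action parameter
    w = rng.normal()  # learned forward-model gain; prediction y_hat = w * a
    for _ in range(steps):
        y = true_gain * a            # outcome from the (unknown) environment
        w += lr_m * (y - w * a) * a  # fit the forward model to observed outcomes
        e = target - y               # directed error: size and sign
        a += lr_pi * e * w           # policy step through model Jacobian dy_hat/da = w
    return a, abs(target - true_gain * a)

# Intact environment, then a mirror-reversal (sign-flipped gain): the
# forward model relearns the flipped sign, so the error-based update
# still pushes the action in the correct direction.
_, err_intact = mb_dpg_sketch(true_gain=2.0)
_, err_mirror = mb_dpg_sketch(true_gain=-2.0)
print(err_intact, err_mirror)
```

Under this toy setup, the residual error shrinks in both conditions because the policy update inherits the sign of the (re)learned model gradient; this is a caricature of the mirror-reversal adaptation behavior the abstract describes, not a reproduction of the paper's experiments.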
Keywords: Mind and Brain (Psychological Science); Cognitive Science; Learning; Memory and learning in psychology; Deep Learning; Humans; Neural Networks, Computer; Reinforcement, Psychology; Human error; Algorithms; Artificial neural networks and deep learning
