
Abstract: In human error-based learning, the size and direction of a scalar error (the "directed error") are used to update future actions. Modern deep reinforcement learning (RL) methods perform a similar operation, but on scalar rewards. Despite this similarity, the relationship between the action updates of deep RL and those of human error-based learning has not been investigated. Here, we systematically compare the three major families of deep RL algorithms to human error-based learning. We show that all three are qualitatively different from human error-based learning, as assessed by a mirror-reversal perturbation experiment. To bridge this gap, we develop an alternative deep RL algorithm inspired by human error-based learning: model-based deterministic policy gradients (MB-DPG). We show that MB-DPG captures human error-based learning under mirror-reversal and rotational perturbations, learns faster than canonical model-free algorithms on complex arm-based reaching tasks, and is more robust to (forward-)model misspecification than model-based RL.
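The core idea the abstract attributes to MB-DPG, routing a directed error through a learned forward model to update a deterministic policy, can be illustrated with a deliberately minimal one-dimensional sketch. Everything here (the linear environment, the learning rates, the function name) is an illustrative assumption for exposition, not the paper's implementation:

```python
import numpy as np

def mb_dpg_sketch(true_gain, target=1.0, steps=300, lr_pi=0.1, lr_m=0.2, seed=0):
    """Toy error-based, model-based update: the directed error
    (target - outcome) is passed through the gradient of a *learned*
    forward model to adjust a deterministic action. Illustrative only."""
    rng = np.random.default_rng(seed)
    a = rng.normal()  # deterministic "policy": a single action parameter
    w = rng.normal()  # learned forward-model gain; prediction y_hat = w * a
    for _ in range(steps):
        y = true_gain * a            # outcome from the (unknown) environment
        w += lr_m * (y - w * a) * a  # fit the forward model to observed outcomes
        e = target - y               # directed error: size and sign
        a += lr_pi * e * w           # policy step through model Jacobian dy_hat/da = w
    return a, abs(target - true_gain * a)

# Intact environment, then a mirror-reversal (sign-flipped gain): the
# forward model relearns the flipped sign, so the error-based update
# still pushes the action in the correct direction.
_, err_intact = mb_dpg_sketch(true_gain=2.0)
_, err_mirror = mb_dpg_sketch(true_gain=-2.0)
print(err_intact, err_mirror)
```

Under this toy setup, the residual error shrinks in both conditions because the policy update inherits the sign of the (re)learned model gradient; this is a caricature of the mirror-reversal adaptation behavior the abstract describes, not a reproduction of the paper's experiments.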
Keywords: Mind and Brain (Psychological Science); Cognitive Science; Learning; Memory and learning in psychology; Deep Learning; Humans; Neural Networks, Computer; Reinforcement, Psychology; Human error; Algorithms; Artificial neural networks and deep learning
