
Reinforcement learning is a learning paradigm in which an agent learns from feedback from the environment rather than from labeled examples. Prior research has proposed both cognitive (e.g., Instance-based Learning, or IBL) and statistical (e.g., Q-learning) reinforcement learning algorithms. However, these algorithms have not been evaluated against each other in a single dynamic environment. This paper presents a comparison between the statistical Q-learning algorithm and the cognitive IBL algorithm. A well-known environment, “Frozen Lake,” is used to train both algorithms and to test how well they generalize and scale. To test generalization, the Q-learning and IBL agents were trained on one version of Frozen Lake and tested on a permuted version of the same environment. To test scaling, the two algorithms were tested on a larger version of the Frozen Lake environment. Results revealed that the IBL algorithm needed less training time and generalized better to different environment variants. The IBL algorithm also scaled well, retaining its superior performance in the larger environment. These results indicate that the IBL algorithm could serve as an alternative to standard reinforcement learning algorithms based on dynamic programming, such as Q-learning. The inclusion of human factors (such as memory) in the IBL algorithm makes it suitable for robust learning in complex and dynamic environments.
openAI, frozen lake, Reinforcement learning, Q-learning, instance-based learning, Electrical engineering. Electronics. Nuclear engineering, cognitive modeling, TK1-9971
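For readers unfamiliar with the Q-learning baseline named in the abstract, the following is a minimal sketch of tabular Q-learning on a hand-coded 4x4 Frozen Lake grid. It is not the paper's implementation. The deterministic (non-slippery) transitions, the map layout, the helper names (`step`, `train_q_learning`, `greedy_rollout`), and all hyperparameters (`alpha`, `gamma`, `epsilon`, episode count) are assumptions chosen for illustration, and the IBL algorithm is not sketched here.

```python
import random

# Assumed layout (the common 4x4 map): S = start, F = frozen, H = hole, G = goal.
LAKE = ["SFFF", "FHFH", "FFFH", "HFFG"]
SIZE = 4
ACTIONS = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # left, down, right, up as (d_row, d_col)


def step(state, action):
    """Deterministic transition (assumption: no slipping). Returns (next_state, reward, done)."""
    row, col = divmod(state, SIZE)
    d_row, d_col = ACTIONS[action]
    row = min(max(row + d_row, 0), SIZE - 1)  # moving into a wall leaves the agent in place
    col = min(max(col + d_col, 0), SIZE - 1)
    tile = LAKE[row][col]
    return row * SIZE + col, (1.0 if tile == "G" else 0.0), tile in "HG"


def train_q_learning(episodes=3000, alpha=0.5, gamma=0.95, epsilon=0.3, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (hyperparameters are illustrative)."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(SIZE * SIZE)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                best = max(q[state])
                # Break ties at random so an all-zero table does not favor one action.
                action = rng.choice([a for a in range(len(ACTIONS)) if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Q-learning target: bootstrap from the best next action unless the episode ended.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_rollout(q, max_steps=100):
    """Follow the greedy policy from the start; returns (reached_goal, steps_taken)."""
    state = 0
    for t in range(1, max_steps + 1):
        action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
        state, reward, done = step(state, action)
        if done:
            return reward == 1.0, t
    return False, max_steps


if __name__ == "__main__":
    q_table = train_q_learning()
    print(greedy_rollout(q_table))
```

Under these assumptions, a generalization test of the kind the abstract describes would amount to training on one `LAKE` layout and running `greedy_rollout` on a permuted layout, and a scaling test to repeating the procedure with a larger grid.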
