Opposition-Based Reinforcement Learning

descriptionPublicationkeyboard_double_arrow_right Article 20 Jul 2006 English Publisher:Fuji Technology Press Ltd.Journal:Journal of Advanced Computational Intelligence and Intelligent Informatics, volume 10, pages 578-585 (issn: 1343-0130, eissn: 1883-8014,

Copyright policy )

Authors: Hamid R. Tizhoosh;

doi: 10.20965/jaciii.2006.p0578

Opposition-Based Reinforcement Learning

- Summary
- Metrics

Abstract

Reinforcement learning is a machine intelligence scheme for learning in highly dynamic, probabilistic environments. By interaction with the environment, reinforcement agents learn optimal control policies, especially in the absence of a priori knowledge and/or a sufficiently large amount of training data. Despite its advantages, however, reinforcement learning suffers from a major drawback - high calculation cost because convergence to an optimal solution usually requires that all states be visited frequently to ensure that policy is reliable. This is not always possible, however, due to the complex, high-dimensional state space in many applications. This paper introduces opposition-based reinforcement learning, inspired by opposition-based learning, to speed up convergence. Considering opposite actions simultaneously enables individual states to be updated more than once shortening exploration and expediting convergence. Three versions of Q-learning algorithm will be given as examples. Experimental results for the grid world problem of different sizes demonstrate the superior performance of the proposed approach.

Related Organizations

University of Waterloo
Canada

Impact byBIP!

	selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	161
	popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.	Top 1%
	influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	Top 1%
	impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.	Top 1%

Found an issue? Give us feedback

161

Top 1%

gold

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering