Multi-Agent Inverse Reinforcement Learning

Publication type: Article, Conference object
Date: 01 Dec 2010
Publisher: IEEE
Journal: 2010 Ninth International Conference on Machine Learning and Applications

Authors: Sriraam Natarajan; Gautam Kunapuli; Kshitij Judah; Prasad Tadepalli; Kristian Kersting; Jude W. Shavlik

doi: 10.1109/icmla.2010.65

Abstract

Learning the reward function of an agent by observing its behavior is termed inverse reinforcement learning and has applications in learning from demonstration or apprenticeship learning. We introduce the problem of multi-agent inverse reinforcement learning, where reward functions of multiple agents are learned by observing their uncoordinated behavior. A centralized controller then learns to coordinate their behavior by optimizing a weighted sum of reward functions of all the agents. We evaluate our approach on a traffic-routing domain, in which a controller coordinates actions of multiple traffic signals to regulate traffic density. We show that the learner is not only able to match but even significantly outperform the expert.
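The idea of a centralized controller optimizing a weighted sum of per-agent reward functions can be illustrated with a minimal sketch. This is not the authors' implementation: the reward functions below are hand-written stand-ins for rewards that would be learned by inverse reinforcement learning, and the names `combined_reward`, `best_joint_action`, and `make_signal_reward`, the equal weights, and the toy queue-length state are all assumptions made for illustration. The controller here is a greedy one-step chooser over joint actions, not a full reinforcement learner.

```python
from itertools import product


def combined_reward(reward_fns, weights, state, joint_action):
    """Weighted sum of per-agent rewards for one joint action."""
    return sum(w * r(state, joint_action) for w, r in zip(weights, reward_fns))


def best_joint_action(reward_fns, weights, state, actions_per_agent):
    """Greedy one-step choice: enumerate all joint actions, keep the best."""
    return max(
        product(*actions_per_agent),
        key=lambda a: combined_reward(reward_fns, weights, state, a),
    )


def make_signal_reward(i):
    """Hypothetical reward for signal i (assumed, not learned).

    state[i] holds the (north-south, east-west) queue lengths at signal i.
    The reward is the negative number of cars left waiting after the
    chosen direction is served.
    """
    def reward(state, joint_action):
        ns, ew = state[i]
        served = ns if joint_action[i] == "ns" else ew
        return served - (ns + ew)
    return reward


# Toy setup: two traffic signals, each choosing which direction gets green.
reward_fns = [make_signal_reward(0), make_signal_reward(1)]
weights = [0.5, 0.5]                      # assumed equal weighting
state = [(4, 1), (2, 6)]                  # queue lengths at each signal
actions = [("ns", "ew"), ("ns", "ew")]    # available actions per signal

choice = best_joint_action(reward_fns, weights, state, actions)
print(choice, combined_reward(reward_fns, weights, state, choice))
```

In this toy the per-signal rewards are independent, so the best joint action is simply each signal's individually best action. Changing `weights` would shift the trade-off between signals once their rewards depend on each other's actions.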

Related Organizations

Oregon State University, United States
University of Wisconsin–Oshkosh, United States
University of Wisconsin–Madison, United States
Fraunhofer Society, Germany

Impact by BIP!

selected citations: 39. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.