
Multi-robot learning is challenging not only because of the large, continuous state and action spaces, but also because of the uncertainty and partial observability encountered during learning. This paper presents a distributed policy gradient reinforcement learning (PGRL) method for a multi-robot system that uses a neural network as the function approximator. The distributed PGRL algorithm enables each robot to decide its policy independently, although that policy is still affected by all the other robots. The neural network generalizes over the continuous state space as well as over discrete and continuous action spaces. A case study on a leader-follower formation task demonstrates the effectiveness of the proposed learning method.
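To make the policy-gradient idea in the abstract concrete, the following is a minimal sketch, not the paper's algorithm: a single follower robot at position x learns to close the gap to a leader fixed at 0. A linear softmax policy stands in for the paper's neural-network approximator, and learning is reduced to a one-step REINFORCE update with an exact baseline. The task, actions, step sizes, and reward here are all illustrative assumptions.

```python
import math
import random

# Toy stand-in for a leader-follower tracking task: the follower sees its
# tracking error x (leader assumed at position 0) and picks a discrete move.
ACTIONS = [-0.5, 0.0, 0.5]             # illustrative discrete moves
theta = [[0.0, 0.0] for _ in ACTIONS]  # per-action weights: [w_x, bias]

def probs(x):
    """Softmax action probabilities for tracking error x."""
    logits = [w[0] * x + w[1] for w in theta]
    m = max(logits)                    # subtract max for numerical stability
    e = [math.exp(v - m) for v in logits]
    s = sum(e)
    return [v / s for v in e]

def train_step(alpha=0.1):
    """One-step REINFORCE update on a randomly sampled tracking error."""
    x = random.uniform(-2.0, 2.0)
    p = probs(x)
    a = random.choices(range(len(ACTIONS)), weights=p)[0]
    r = -abs(x + ACTIONS[a])           # reward: small error after the move
    # Exact baseline: expected reward under the current policy.
    b = sum(p[i] * -abs(x + ACTIONS[i]) for i in range(len(ACTIONS)))
    adv = r - b
    # grad log pi(a|x) with respect to logit_i is (1[i==a] - p_i).
    for i in range(len(ACTIONS)):
        ind = 1.0 if i == a else 0.0
        theta[i][0] += alpha * adv * (ind - p[i]) * x
        theta[i][1] += alpha * adv * (ind - p[i])

random.seed(0)
for _ in range(5000):
    train_step()

# After training, the policy should steer the follower toward the leader:
print(probs(2.0))   # highest mass on -0.5 (move left, toward the leader)
print(probs(-2.0))  # highest mass on +0.5 (move right, toward the leader)
```

In the paper's distributed setting, each robot would run its own such update on its local observations, with the other robots' behavior entering through the sampled states and rewards; the sketch above shows only the single-agent gradient step that this builds on.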
