
This paper proposes a fully distributed multiagent-based reinforcement learning method for optimal reactive power dispatch. According to the method, two agents communicate with each other only if their corresponding buses are electrically coupled. The global rewards that are required for learning are obtained with a consensus-based global information discovery algorithm, which has been demonstrated to be efficient and reliable. Based on the discovered global rewards, a distributed Q-learning algorithm is implemented to minimize the active power loss while satisfying operational constraints. The proposed method does not require accurate system model and can learn from scratch. Simulation studies with power systems of different sizes show that the method is very computationally efficient and able to provide near-optimal solutions. It can be observed that prior knowledge can significantly speed up the learning process and decrease the occurrences of undesirable disturbances. The proposed method has good potential for online implementation.
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 113 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 1% | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Top 1% | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
