
In this paper, we investigate an energy cost minimization problem for prosumers participating in peer-to-peer energy trading. Obtaining energy trading decisions that minimize each prosumer's individual energy cost is challenging because of (i) uncertainties in renewable energy generation and consumption, (ii) difficulties in developing an accurate and efficient energy trading model, and (iii) the need to satisfy distribution network constraints. To address this challenge, we first formulate the problem as a Markov decision process and propose a multi-agent deep deterministic policy gradient algorithm to learn optimal energy trading decisions. To satisfy the distribution network constraints, we propose distribution network tariffs, which we incorporate into the algorithm to reward energy trading decisions that help satisfy the constraints and to penalize decisions that violate them. The proposed algorithm is model-free and allows the agents to learn optimal energy trading decisions without prior information about other agents in the network. Simulation results based on real-world datasets show the effectiveness and robustness of the proposed algorithm.
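As a rough illustration of how a network tariff could enter an agent's learning signal, the sketch below shapes a per-step reward from an energy cost and a tariff term. It is a minimal sketch under stated assumptions, not the paper's formulation: the function name `shaped_reward`, the single-line flow limit, and the linear tariff are all hypothetical and are not taken from the abstract.

```python
def shaped_reward(energy_cost, line_flow, line_limit, tariff_rate):
    """Hypothetical per-step reward for one prosumer agent.

    energy_cost : cost of the agent's trades in this step (currency units)
    line_flow   : power flow the trades induce on one monitored line (kW)
    line_limit  : assumed flow limit of that line (kW)
    tariff_rate : assumed tariff per kW of headroom or overload
    """
    # Remaining capacity on the line; negative when the limit is violated.
    headroom = line_limit - abs(line_flow)
    # Assumed linear tariff: a credit while the limit is respected,
    # a charge once the flow exceeds the limit.
    network_tariff = -tariff_rate * headroom
    # The agent minimizes cost, so the reward is the negated total cost.
    return -(energy_cost + network_tariff)


# Same trade cost, once within the limit and once above it.
r_within = shaped_reward(10.0, 90.0, 100.0, 0.5)
r_violating = shaped_reward(10.0, 120.0, 100.0, 0.5)
```

With this shaping, a decision that violates the assumed limit always receives a lower reward than an equally priced decision that respects it, which is the incentive-and-penalty effect the abstract describes.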
15 pages, 19 figures
T1, FOS: Electrical engineering, electronic engineering, information engineering, Systems and Control (eess.SY), Q1, QA, Electrical Engineering and Systems Science - Systems and Control
| selected citations | 62 |
| popularity | Top 1% |
| influence | Top 10% |
| impulse | Top 1% |
