
The use of photovoltaic (PV) arrays in smart grid systems is growing in response to rising energy demand and concern over greenhouse gas emissions. Because PV output is intermittent, a maximum power point tracking (MPPT) algorithm is typically employed to optimize the system’s energy production. The conventional perturb and observe (P&O) method was proposed earlier for solar PV MPPT control. While the P&O method can estimate the PV maximum power under uniform irradiation, it often tracks sluggishly, oscillates in steady state, and fails to track the global maximum power point (GMPP) under partial shading conditions (PSCs). These problems have been addressed using deep reinforcement learning algorithms, such as the deep deterministic policy gradient (DDPG) algorithm. However, owing to DDPG’s intrinsic drawbacks, such as unstable training, Q-value overestimation, brittle convergence, and hyperparameter sensitivity, it often produces steady-state power oscillations near the GMPP under PSCs, resulting in power loss. This paper presents a soft actor-critic (SAC) algorithm for solving PV MPPT control problems under PSCs. Unlike DDPG, which uses only one Q-network in the critic, SAC uses two Q-networks in the critic and a maximum-entropy policy term in the reward function, which stabilizes its training and improves its exploration and its robustness to estimation and model errors. Despite its potential, the SAC-based MPPT approach has not been extensively explored or compared with DDPG to determine the superior method for PV MPPT control. This paper provides a comprehensible comparative analysis of DDPG and SAC, including their optimal hyperparameter configurations for PV MPPT control. To solve the MPPT control problem, mathematical models of the boost converter and the solar PV system were developed. Then, a Markov decision process (MDP) model representing the PV system’s behavior was formulated.
For completeness in the comparison, the conventional P&O algorithm was also included. Simulation results show that the SAC and DDPG algorithms outperform the P&O method under PSCs and varying irradiance levels. Of the two, the SAC algorithm performs better: compared with DDPG, it achieves high tracking efficiency and eliminates power oscillations near the PV MPP and GMPP.
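
As a rough illustration of why P&O can miss the GMPP under partial shading, the sketch below runs a basic hill-climbing P&O loop on a toy two-peak power-voltage curve. Everything in it is an assumption made for illustration: the `pv_power` curve is a hypothetical sum of two Gaussians rather than a PV cell model, the loop perturbs the operating voltage directly instead of a boost-converter duty cycle, and the step size and starting points are arbitrary. It is not the paper's simulation setup.

```python
import math


def pv_power(v):
    """Hypothetical two-peak P-V curve (toy stand-in for a partially shaded array)."""
    local_peak = 60.0 * math.exp(-((v - 12.0) / 4.0) ** 2)    # local maximum near 12 V
    global_peak = 100.0 * math.exp(-((v - 30.0) / 4.0) ** 2)  # global maximum near 30 V
    return local_peak + global_peak


def perturb_and_observe(power_fn, v_start, step=0.5, iterations=200):
    """Basic P&O: keep perturbing in the same direction while power rises,
    reverse direction when power falls. Returns the final (voltage, power)."""
    v = v_start
    p_prev = power_fn(v)
    direction = 1.0
    for _ in range(iterations):
        v += direction * step
        p = power_fn(v)
        if p < p_prev:
            direction = -direction  # power dropped: perturb the other way
        p_prev = p
    return v, p_prev


# Starting left of the local peak, P&O settles (and keeps oscillating) around it.
v_local, p_local = perturb_and_observe(pv_power, v_start=8.0)
# Starting near the global peak, the same loop settles around the GMPP.
v_global, p_global = perturb_and_observe(pv_power, v_start=26.0)
print(f"start 8 V  -> {v_local:.1f} V, {p_local:.1f} W")
print(f"start 26 V -> {v_global:.1f} V, {p_global:.1f} W")
```

In this toy setting the outcome depends entirely on the starting point, and the operating point never stops stepping back and forth around whichever peak it reaches, which mirrors the steady-state oscillation and local-maximum trapping described above.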
photovoltaic system (PV), soft actor-critic (SAC), Electrical engineering. Electronics. Nuclear engineering, Deep deterministic policy gradient (DDPG), partial shading conditions (PSCs), maximum power tracking (MPPT), TK1-9971
