A Comparative Study Between Soft Actor-Critic (SAC) and Deep Deterministic Policy Gradient (DDPG) Algorithms for Solar PV MPPT Control Under Partial Shading Conditions

Keywords: photovoltaic system (PV), soft actor-critic (SAC), Electrical engineering. Electronics. Nuclear engineering, Deep deterministic policy gradient (DDPG), partial shading conditions (PSCs), maximum power tracking (MPPT), TK1-9971

Sampson E. Nwachukwu; Komla A. Folly; Kehinde O. Awodele

IEEE Access

Article . 2025 . Peer-reviewed

License: CC BY

Data sources: Crossref, DOAJ, DBLP

Publication: Article, 01 Jan 2025. Publisher: Institute of Electrical and Electronics Engineers (IEEE). Journal: IEEE Access, volume 13, pages 71738-71754 (eISSN: 2169-3536)

doi: 10.1109/access.2025.3561807

Abstract

The use of photovoltaic (PV) arrays in smart grid systems is growing due to increasing energy demand and rising greenhouse gas emissions. However, because PV generation is intermittent, a maximum power point tracking (MPPT) algorithm is typically employed to optimize the system’s energy production. In the past, the conventional perturb and observe (P&O) method was proposed for solar PV MPPT control. While the P&O method can estimate the PV maximum power under uniform irradiation, it often exhibits sluggish tracking and steady-state oscillations, and it fails to track the global maximum power point (GMPP) under partial shading conditions (PSCs). These problems have been addressed using deep reinforcement learning algorithms, such as the deep deterministic policy gradient (DDPG) algorithm. However, due to DDPG’s intrinsic drawbacks, such as unstable training, Q-value overestimation, brittle convergence, and hyperparameter sensitivity, it often produces steady-state power oscillations near the GMPP under PSCs, resulting in power loss. This paper presents a soft actor-critic (SAC) algorithm for solving PV MPPT control problems under PSCs. Unlike DDPG, which utilizes only one Q-network in the critic, SAC utilizes two Q-networks in the critic and adds a maximum-entropy term to the reward, which stabilizes its training and improves its exploration and robustness in the presence of estimation and model errors. Despite its potential, the SAC-based MPPT approach has not been extensively explored or compared with DDPG to determine the superior method for PV MPPT control. This paper provides a comprehensible comparative analysis of DDPG and SAC, including their optimal hyperparameter configurations for PV MPPT control. To solve the MPPT control problem, mathematical models of the boost converter and the solar PV system were developed. Then, a Markov decision process (MDP) model representing the PV system’s behavior was formulated.
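
The difference between the two critics can be illustrated with a minimal sketch of the bootstrap targets in their standard textbook form. This is not the authors' implementation: the function names `ddpg_target` and `sac_target`, the scalar inputs standing in for network outputs, and the values of `alpha` and `gamma` are all assumptions made for illustration.

```python
def ddpg_target(reward, q_next, gamma=0.99, done=False):
    """DDPG-style target: bootstrap from a single target critic.

    q_next stands in for the target critic's value at the next state
    and the target actor's action (a plain float here, an assumption).
    """
    return reward + (0.0 if done else gamma * q_next)


def sac_target(reward, q1_next, q2_next, log_prob_next,
               alpha=0.2, gamma=0.99, done=False):
    """SAC-style soft target: minimum of two target critics plus an entropy bonus.

    Taking the minimum of the two estimates counteracts Q-value
    overestimation; subtracting alpha * log_prob_next rewards policies
    that keep their action distribution spread out (higher entropy).
    """
    soft_value = min(q1_next, q2_next) - alpha * log_prob_next
    return reward + (0.0 if done else gamma * soft_value)


if __name__ == "__main__":
    # Hypothetical numbers: the two critics disagree about the next state.
    print(ddpg_target(1.0, 10.0, gamma=0.9))
    print(sac_target(1.0, 10.0, 8.0, log_prob_next=-1.0, alpha=0.2, gamma=0.9))
```

With the same reward, the SAC-style target follows the more pessimistic of the two critic estimates, which is the mechanism usually credited with damping the overestimation that a single-critic target can accumulate.
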
For completeness in the comparison, the conventional P&O algorithm was also included. Simulation results show that the SAC and DDPG algorithms outperform the P&O method under PSCs and varying irradiance levels. Compared with DDPG, the SAC algorithm performs better, achieving high tracking efficiency and eliminating power oscillations near the PV MPP and GMPP.
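
The P&O behavior described in the abstract can be reproduced on a toy example. The sketch below is a generic hill-climbing P&O loop, not the paper's simulation: the two-peak curve `toy_pv_power`, its peak locations and heights, the step size, and the helper `tracking_efficiency` are all hypothetical choices made only to mimic a partially shaded power-voltage curve.

```python
def toy_pv_power(v):
    """Hypothetical two-peak P-V curve (watts vs volts), not a real PV model.

    Local peak: 60 W at 12 V. Global peak (GMPP): 100 W at 30 V.
    """
    return max(60.0 - (v - 12.0) ** 2, 100.0 - (v - 30.0) ** 2, 0.0)


def perturb_and_observe(power_fn, v0, step=0.5, n_steps=200):
    """Generic P&O: keep perturbing in the same direction while power
    does not drop, and reverse direction when it does."""
    v, direction = v0, 1.0
    p_prev = power_fn(v)
    history = []
    for _ in range(n_steps):
        v += direction * step          # perturb the operating voltage
        p = power_fn(v)                # observe the resulting power
        if p < p_prev:                 # power dropped: reverse direction
            direction = -direction
        p_prev = p
        history.append((v, p))
    return history


def tracking_efficiency(history, p_gmpp, tail=50):
    """Mean power over the last `tail` steps divided by the GMPP power."""
    powers = [p for _, p in history[-tail:]]
    return sum(powers) / len(powers) / p_gmpp


if __name__ == "__main__":
    for v_start in (5.0, 25.0):
        hist = perturb_and_observe(toy_pv_power, v_start)
        print(v_start, hist[-1][0], round(tracking_efficiency(hist, 100.0), 4))
```

Started at 5 V, the loop climbs to the 12 V local peak and keeps stepping back and forth around it, so it delivers only about 60% of the global peak power on this toy curve; started at 25 V, it reaches the 30 V peak but still oscillates by one step on either side. Both effects, the local-maximum trap and the steady-state oscillation, are the shortcomings the abstract attributes to P&O under PSCs.
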

Related Organizations

University of Cape Town, South Africa

Impact by BIP!

- Selected citations: 0 (citations derived from selected sources)
- Popularity: Average (current attention, based on the underlying citation network)
- Influence: Average (overall impact, based on the underlying citation network)
- Impulse: Average (initial momentum directly after publication, based on the underlying citation network)
