
Path planning and its optimization are critical and difficult tasks for a mobile robot operating in a complex, unknown environment. To tackle this problem, we propose an improved SAC algorithm (HM-SAC-CA) for path planning in such environments. First, building on the SAC maximum entropy framework, we propose a deep reinforcement learning algorithm with clipped automatic entropy adjustment, which improves the quality of policy learning by suppressing entropy evaluation. Second, we construct a hierarchical experience storage structure for experience replay and use a bias-free sampling strategy to eliminate the overfitting caused by using good experiences. Finally, we propose a posture reward function and a staged incentive mechanism. The staged incentive mechanism applies the sparse reward function and the posture reward function in separate stages, which reduces blind exploration and accelerates training. Experiments with a simulated Turtlebot3 and a real mobile robot validate the performance of the proposed method.
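The abstract does not give the update rule for the clipped automatic entropy adjustment, so the following is only a minimal sketch of one plausible reading: the standard SAC temperature update on `log_alpha`, followed by a clip to a fixed range. The function name `update_temperature`, the bounds `alpha_min` and `alpha_max`, and the learning rate are illustrative assumptions, not values taken from the paper.

```python
import math


def update_temperature(log_alpha, log_probs, target_entropy,
                       lr=3e-4, alpha_min=0.01, alpha_max=1.0):
    """One gradient step on the SAC temperature, then a clip (assumed form).

    Uses the common loss J = -log_alpha * mean(log_prob + target_entropy),
    whose gradient with respect to log_alpha is -(mean(log_prob) + target_entropy).
    The clip range [alpha_min, alpha_max] is a hypothetical choice.
    """
    mean_log_prob = sum(log_probs) / len(log_probs)
    grad = -(mean_log_prob + target_entropy)
    log_alpha = log_alpha - lr * grad
    # Clip in log space so that alpha stays inside [alpha_min, alpha_max].
    return min(max(log_alpha, math.log(alpha_min)), math.log(alpha_max))


# Policy entropy (0.5) is above the target (-2.0), so the temperature decreases.
new_log_alpha = update_temperature(0.0, [-0.5, -0.5], target_entropy=-2.0, lr=0.1)
alpha = math.exp(new_log_alpha)
```

Under this reading, the clip keeps the entropy weight from growing or collapsing without bound during training; whether the paper clips the temperature itself or some other quantity cannot be determined from the abstract.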
deep reinforcement learning, Complex environment, Electrical engineering. Electronics. Nuclear engineering, mobile robot, path planning, TK1-9971
