An Analytic Policy Gradient-Based Deep Reinforcement Learning Motion Cueing Algorithm for Driving Simulators

Keywords: Motion cueing algorithm, deep reinforcement learning, analytic policy gradient (APG), driving simulator, differentiable simulator, Electrical engineering. Electronics. Nuclear engineering, TK1-9971

Xiaowei Huang; Xuhua Shi; Peiyao Wang; Hongzan Xu; Xiaojun Tang; Gaoran Zhang

IEEE Access

Article . 2025 . Peer-reviewed

License: CC BY

Data sources: Crossref

IEEE Access

Article . 2025

Data sources: DOAJ

DBLP

Article

Data sources: DBLP

Publication: Article, 01 Jan 2025
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Journal: IEEE Access, volume 13, pages 81507-81523 (eISSN: 2169-3536)

doi: 10.1109/access.2025.3564597

Abstract

The proposed motion cueing algorithm (MCA) is based on a reinforcement learning method that uses gradient information to update the control policy directly, and it introduces three significant enhancements. First, the complex simulator environment is transformed into a differentiable simulator environment that provides gradient information at each time step, and this gradient information is used to update the control policy directly. Second, the network architecture is reconfigured into a concurrent controller format, similar to Model Predictive Control (MPC). This controller processes a sequence of vehicle motion reference signals over a future period and uses a multi-layer perceptron to generate the simulator’s motion reference control signal sequence for the same duration. Unlike the online optimization employed in MPC, this algorithm is an offline optimization method, which provides substantial computational advantages when integrated into the driving simulator. As the prediction horizon increases, the algorithm demonstrates superior computational efficiency, which helps reduce the incidence of motion sickness during use of the driving simulator. Third, a loss function specifically designed for the motion simulator is proposed. This function incorporates constraints derived from the MPC framework to address workspace limitations and applies them to workspace management. These constraints restrict the platform’s acceleration and speed near the workspace boundaries, allowing better utilization of the available space. The algorithm is validated using the CARLA autonomous driving simulation software as the dataset generator. During training, the proposed algorithm achieves an order-of-magnitude improvement in convergence speed compared with the conventional training methods PPO and DDPG. Simulations with a 10-step prediction horizon indicate that the Root Mean Square Error (RMSE) produced by this algorithm is comparable to that of the MPC-based MCA (MPC-MCA) and significantly lower than that of the MCA based on classical washout (CW-MCA). At longer prediction horizons, the algorithm achieves performance on par with state-of-the-art MPC-based motion cueing algorithms while exhibiting reduced algorithmic delay. Additionally, the proposed algorithm delivers quicker results and improved tracking performance across all prediction horizons, ultimately surpassing the current state-of-the-art MPC-MCA.
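The idea of updating a policy directly from simulator gradients, combined with a loss that penalizes leaving the workspace, can be illustrated with a deliberately small sketch. It is not the authors' implementation. It assumes a hypothetical 1-DOF platform modeled as a double integrator, replaces the multi-layer perceptron with a single scalar gain `theta`, and uses made-up values for the time step, workspace half-width, and penalty weight. The exact gradient is obtained by propagating the sensitivities of the state with respect to `theta` alongside the rollout, which is what an automatic-differentiation framework would do for a full network.

```python
# Illustrative only: every name and constant here is an assumption,
# not something taken from the paper.
DT = 0.05      # simulator time step [s] (assumed)
X_MAX = 0.5    # workspace half-width [m] (assumed)
LAMBDA = 10.0  # weight of the soft workspace penalty (assumed)


def rollout_loss_and_grad(theta, a_ref):
    """Roll out a double-integrator platform under u_t = theta * a_ref[t].

    Returns (loss, d loss / d theta). The derivative is exact: the
    sensitivities dx/dtheta and dv/dtheta are stepped with the state.
    """
    x, v = 0.0, 0.0    # platform position and velocity
    dx, dv = 0.0, 0.0  # sensitivities of x and v with respect to theta
    loss, grad = 0.0, 0.0
    for a in a_ref:
        u = theta * a  # policy output: commanded platform acceleration
        du = a         # d u / d theta
        # Tracking term: squared error between platform and vehicle acceleration.
        err = u - a
        loss += err * err
        grad += 2.0 * err * du
        # Differentiable simulator step (semi-implicit Euler) and its derivative.
        v += u * DT
        dv += du * DT
        x += v * DT
        dx += dv * DT
        # Soft workspace penalty: active only outside [-X_MAX, X_MAX].
        excess = abs(x) - X_MAX
        if excess > 0.0:
            sign = 1.0 if x > 0.0 else -1.0
            loss += LAMBDA * excess * excess
            grad += 2.0 * LAMBDA * excess * sign * dx
    return loss, grad


def train(a_ref, theta=1.0, lr=1e-3, steps=200):
    """Plain gradient descent on theta using the analytic gradient."""
    for _ in range(steps):
        _, g = rollout_loss_and_grad(theta, a_ref)
        theta -= lr * g
    return theta
```

Calling `train([1.0] * 40)` on a constant 1 m/s² reference drives the gain below 1, because reproducing the full acceleration for two seconds would push this toy platform past its assumed workspace limit. The learning rate was chosen for that particular reference and these constants; other settings may need a smaller step.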

Related Organizations

Ningbo University
China (People's Republic of)

Impact by BIP!

Selected citations: 0
Popularity: Average
Influence: Average
Impulse: Average
