IEEE Access
Article . 2025 . Peer-reviewed
License: CC BY
Data sources: Crossref
Data sources: DOAJ
DBLP
Article
Data sources: DBLP


An Analytic Policy Gradient-Based Deep Reinforcement Learning Motion Cueing Algorithm for Driving Simulators

Authors: Xiaowei Huang; Xuhua Shi; Peiyao Wang; Hongzan Xu; Xiaojun Tang; Gaoran Zhang

Abstract

This paper proposes a motion cueing algorithm (MCA) based on a reinforcement learning method that uses gradient information to update the control policy directly. It introduces three enhancements. First, the complex simulator environment is transformed into a differentiable simulator environment that provides gradient information at each time step, and this gradient information drives the policy update. Second, the network architecture is reconfigured into a concurrent controller format, similar to Model Predictive Control (MPC). This controller processes a sequence of vehicle motion reference signals over a future period and uses a multi-layer perceptron to generate the simulator’s motion reference control signal sequence for the same duration. Unlike the online optimization employed in MPC, this algorithm is an offline optimization method, which provides substantial computational advantages when it is integrated into the driving simulator. As the prediction horizon increases, the algorithm demonstrates superior computational efficiency, which helps reduce the incidence of motion sickness during use of the driving simulator. Third, a loss function designed specifically for the motion simulator is proposed. This function incorporates constraints derived from the MPC framework to manage the limited workspace. These constraints restrict the platform’s acceleration and speed near the workspace boundaries, allowing better utilization of the available space. The algorithm is validated using the CARLA autonomous driving simulator as the dataset generator. During training, the proposed algorithm achieves an order-of-magnitude improvement in convergence speed compared with the conventional PPO and DDPG training methods.
Simulations with a 10-step prediction horizon indicate that the root mean square error (RMSE) produced by this algorithm is comparable to that of the MPC-based MCA (MPC-MCA) and significantly lower than that of the MCA based on classical washout (CW-MCA). At longer prediction horizons, the algorithm achieves performance on par with state-of-the-art MPC-based motion cueing algorithms while exhibiting reduced algorithmic delay. Additionally, the proposed algorithm computes its output faster and tracks the reference better across all prediction horizons, ultimately surpassing the current state-of-the-art MPC-MCA.
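
The idea of updating a policy with gradients taken through a differentiable simulator can be illustrated with a small sketch. It is not the authors' implementation and makes several simplifying assumptions: the platform is a 1-DOF double integrator, the policy is linear rather than a multi-layer perceptron and outputs one acceleration per step rather than a sequence, the workspace constraint is a plain quadratic penalty rather than the MPC-derived constraints of the paper, the gradient is written out by hand instead of using an autodiff library, and the reference is a toy sine wave rather than CARLA data. All names and constants (`DT`, `P_LIM`, `LAM`, `features`, `loss_and_grad`, `train`) are hypothetical.

```python
import math

DT, H = 0.05, 10          # time step [s]; 10-step horizon as in the abstract
P_LIM, LAM = 0.5, 50.0    # ASSUMED workspace half-width [m] and penalty weight

def features(ref, t, p, v):
    """Policy input: next H reference accelerations (zero-padded), state, bias."""
    window = [ref[t + k] if t + k < len(ref) else 0.0 for k in range(H)]
    return window + [p, v, 1.0]

def loss_and_grad(w, ref):
    """Roll out the differentiable platform model, then backpropagate through time."""
    T, p, v, loss = len(ref), 0.0, 0.0, 0.0
    tape = []
    for t in range(T):
        x = features(ref, t, p, v)
        a = sum(wi * xi for wi, xi in zip(w, x))  # linear policy (stand-in for an MLP)
        v += DT * a                               # double-integrator dynamics
        p += DT * v
        excess = max(0.0, abs(p) - P_LIM)         # distance outside the workspace
        loss += (a - ref[t]) ** 2 + LAM * excess ** 2
        tape.append((x, a, p))
    # gp, gv hold dLoss/dp and dLoss/dv for the state after step t
    grad, gp, gv = [0.0] * len(w), 0.0, 0.0
    for t in reversed(range(T)):
        x, a, p_next = tape[t]
        excess = max(0.0, abs(p_next) - P_LIM)
        gp += 2.0 * LAM * excess * math.copysign(1.0, p_next)
        gv += DT * gp                             # through p_next = p + DT * v_next
        ga = 2.0 * (a - ref[t]) + DT * gv         # through v_next = v + DT * a
        grad = [g + ga * xi for g, xi in zip(grad, x)]
        gp += ga * w[H]                           # the policy also reads p ...
        gv += ga * w[H + 1]                       # ... and v
    return loss / T, [g / T for g in grad]

def train(ref, steps=200, lr=1e-2):
    """Gradient descent on the analytic gradient; halve the step if the loss rises."""
    w = [0.0] * (H + 3)
    loss, grad = loss_and_grad(w, ref)
    for _ in range(steps):
        cand = [wi - lr * gi for wi, gi in zip(w, grad)]
        cand_loss, cand_grad = loss_and_grad(cand, ref)
        if cand_loss < loss:
            w, loss, grad = cand, cand_loss, cand_grad
        else:
            lr *= 0.5
    return w, loss

if __name__ == "__main__":
    # ASSUMED toy reference acceleration, not CARLA data
    ref = [math.sin(math.pi * DT * t) for t in range(100)]
    untrained = loss_and_grad([0.0] * (H + 3), ref)[0]
    _, trained = train(ref)
    print(f"loss before: {untrained:.4f}, after: {trained:.4f}")
```

Because the whole rollout is differentiable, one forward and one backward pass yield the exact gradient of the trajectory loss with respect to the policy weights. The step-halving rule in `train` is only a safeguard for this sketch so that the loss never increases.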

Keywords

Motion cueing algorithm, deep reinforcement learning, analytic policy gradient (APG), driving simulator, differentiable simulator, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
