
arXiv: 2304.08280
Reinforcement learning has attracted substantial research interest for developing planning approaches in automated driving. Most prior works consider the end-to-end planning task, which yields direct control commands, and rarely deploy their algorithms on real vehicles. In this work, we propose a method that employs a trained deep reinforcement learning policy for dedicated high-level behavior planning. By populating an abstract objective interface, established motion planning algorithms can be leveraged to derive smooth and drivable trajectories. Given the current environment model, we propose to use a built-in simulator to predict the traffic scene over a given horizon into the future. The behavior of automated vehicles in mixed traffic is determined by querying the learned policy. To the best of our knowledge, this work is the first to apply deep reinforcement learning in this manner, and as such it lacks a state-of-the-art benchmark. We therefore validate the proposed approach by comparing an idealistic single-shot plan with cyclic replanning through the learned policy. Experiments with a real test vehicle on proving grounds demonstrate the potential of our approach to narrow the simulation-to-real-world gap of deep reinforcement learning based planning approaches. Additional simulative analyses reveal that more complex multi-agent maneuvers can be managed by employing the cyclic replanning approach.
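To illustrate the planning scheme the abstract describes, the following is a minimal sketch of one cyclic replanning step: predict the traffic scene with a built-in simulator, query the learned policy for a high-level behavior, populate an abstract objective interface, and hand the objective to an established motion planner. All class, method, and field names (EnvironmentModel, simulator.rollout, policy.act, Objective, planner.plan) are hypothetical placeholders, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Objective:
    """Abstract high-level objective handed to the motion planner (hypothetical interface)."""
    target_lane: int        # desired lane index chosen by the behavior policy
    target_velocity: float  # desired speed in m/s


def replanning_cycle(env_model, simulator, policy, planner, horizon_s=5.0):
    """One planning cycle: predict, decide behavior, derive a drivable trajectory."""
    # 1. Predict the traffic scene for a fixed horizon with the built-in simulator.
    predicted_scene = simulator.rollout(env_model, horizon=horizon_s)

    # 2. Query the trained deep RL policy for a high-level behavior decision.
    observation = predicted_scene.to_observation()
    behavior = policy.act(observation)

    # 3. Populate the abstract objective interface from the chosen behavior.
    objective = Objective(
        target_lane=behavior.lane,
        target_velocity=behavior.velocity,
    )

    # 4. An established motion planner turns the objective into a smooth,
    #    drivable trajectory for the low-level controller.
    return planner.plan(env_model.ego_state, objective)
```

In contrast to an idealistic single-shot plan, this cycle would be repeated at a fixed rate so the policy can react to the evolving traffic scene; this is the cyclic replanning the paper compares against.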
8 pages, 10 figures, to be published in the 34th IEEE Intelligent Vehicles Symposium (IV)
FOS: Computer and information sciences, Computer Science - Machine Learning, Vehicle-to-Infrastructure, Behavior Planning, Connected Vehicles, Reinforcement Learning (Artificial Intelligence), Machine Learning (cs.LG), Computer Science - Robotics, Reinforcement learning, Car-to-Car Communication, Motion Planning, Robotics (cs.RO), Test Vehicle, info:eu-repo/classification/ddc/620
| Indicator | Description | Value |
| --- | --- | --- |
| Selected citations | Citations derived from selected sources; an alternative to the "Influence" indicator, which reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 3 |
| Popularity | Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% |
| Influence | Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| Impulse | Reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
