
PMID: 38232148
arXiv: 2309.15462
Legged locomotion is a complex control problem that requires both accuracy and robustness to cope with real-world challenges. Legged systems have traditionally been controlled using trajectory optimization with inverse dynamics. Such hierarchical model-based methods are appealing because of intuitive cost function tuning, accurate planning, generalization, and, most importantly, the insightful understanding gained from more than a decade of extensive research. However, model mismatch and violated assumptions are common sources of faulty operation. Simulation-based reinforcement learning, on the other hand, produces locomotion policies with unprecedented robustness and recovery skills. Yet, all learning algorithms struggle with sparse rewards emerging from environments where valid footholds are rare, such as gaps or stepping stones. In this work, we propose a hybrid control architecture that combines the advantages of both worlds to simultaneously achieve greater robustness, foot-placement accuracy, and terrain generalization. Our approach uses a model-based planner to roll out a reference motion during training. A deep neural network policy is trained in simulation to track the optimized footholds. We evaluate the accuracy of our locomotion pipeline on sparse terrains, where purely data-driven methods are prone to fail. Furthermore, we demonstrate superior robustness on slippery or deformable ground compared with model-based counterparts. Finally, we show that our proposed tracking controller generalizes across different trajectory optimization methods not seen during training. In conclusion, our work unites the predictive capabilities and optimality guarantees of online planning with the inherent robustness attributed to offline learning.
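The core idea of the hybrid architecture above can be sketched as a training loop in which a model-based planner supplies reference footholds and the learned policy is rewarded for tracking them. The sketch below is illustrative only: the planner stub, the Gaussian-kernel tracking reward, and all parameter values are assumptions for demonstration, not the paper's implementation.

```python
# Hedged sketch of the hybrid planner-plus-tracker idea: a stand-in
# "trajectory optimization" planner emits reference footholds, and a
# stand-in "policy" is scored by a dense foothold-tracking reward.
# All names and numbers are hypothetical.

import math
import random

def plan_footholds(start, num_steps, stride=0.3):
    """Stand-in for a model-based planner: evenly spaced footholds
    along the walking direction (hypothetical)."""
    return [start + i * stride for i in range(1, num_steps + 1)]

def tracking_reward(foot_pos, target, sigma=0.05):
    """Dense reward that peaks when the foot lands on the planned
    foothold; a Gaussian kernel turns a sparse foothold constraint
    into a smooth learning signal."""
    return math.exp(-((foot_pos - target) ** 2) / (2 * sigma ** 2))

def rollout(policy_noise, num_steps=5, seed=0):
    """Simulate one episode: the 'policy' steps toward each planned
    foothold with some placement noise; return the summed reward."""
    rng = random.Random(seed)
    targets = plan_footholds(0.0, num_steps)
    total = 0.0
    for target in targets:
        foot = target + rng.gauss(0.0, policy_noise)
        total += tracking_reward(foot, target)
    return total

# A more accurate policy (lower placement noise) earns a higher
# return, which is the signal that drives foothold tracking.
accurate = rollout(policy_noise=0.01)
sloppy = rollout(policy_noise=0.2)
```

In the actual system, the planner would be a trajectory optimizer and the policy a deep neural network trained with reinforcement learning in simulation; the sketch only shows how a planner-generated reference can densify an otherwise sparse foothold objective.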
Optimization, FOS: Computer and information sciences, Computer Science - Machine Learning, ANYmal, Technology (applied sciences), Robotics, Systems and Control (eess.SY), Electrical Engineering and Systems Science - Systems and Control, Machine Learning (cs.LG), Computer Science - Robotics, Locomotion Control, Legged Robots, FOS: Electrical engineering, electronic engineering, information engineering, Legged locomotion, info:eu-repo/classification/ddc/600, REINFORCEMENT LEARNING (ARTIFICIAL INTELLIGENCE), MPC (Model-based Predictive Control), Robotics (cs.RO)
| Indicator | Description | Value |
| --- | --- | --- |
| Selected citations | Citations derived from selected sources; an alternative to the "Influence" indicator. | 76 |
| Popularity | The "current" impact/attention (the "hype") of the article in the research community at large, based on the underlying citation network. | Top 1% |
| Influence | The overall/total impact of the article in the research community at large, based on the underlying citation network (diachronically). | Top 1% |
| Impulse | The initial momentum of the article directly after its publication, based on the underlying citation network. | Top 1% |
