On-Policy Robot Imitation Learning from a Converging Supervisor

Preprint | English | Open
Balakrishna, Ashwin; Thananjeyan, Brijen; Lee, Jonathan; Li, Felix; Zahed, Arsh; Gonzalez, Joseph E.; Goldberg, Ken
  • Subject: Computer Science - Machine Learning | Computer Science - Artificial Intelligence | Computer Science - Robotics

Existing on-policy imitation learning algorithms, such as DAgger, assume access to a fixed supervisor. However, there are many settings where the supervisor may evolve during policy learning, such as a human performing a novel task or an improving algorithmic controller...
