Semiparametrical Gaussian Processes Learning of Forward Dynamical Models for Navigating in a Circular Maze


This paper addresses model learning for navigating a ball to a goal state in a circular maze environment with two degrees of freedom. The motion of the ball in the maze is influenced by several non-linear effects, such as dry friction and contacts, which are difficult to model physically. We propose a semiparametric model to estimate the motion dynamics of the ball, based on Gaussian Process Regression equipped with basis functions obtained from physics first principles. The accuracy of this semiparametric model is shown not only in estimation but also in n-step-ahead prediction, and it is compared with standard algorithms for model learning. The learned model is then used in a trajectory optimization algorithm to compute ball trajectories. We propose the system presented in the paper as a benchmark problem for reinforcement and robot learning, for its interesting and challenging dynamics and its relative ease of reproducibility.
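
The idea of a semiparametric Gaussian Process model can be illustrated with a minimal sketch. This is not the paper's implementation: the two physics features (a gravity-like sine term and a damping-like velocity term), the toy ground-truth dynamics, the hyperparameter values, and the Euler rollout are all assumptions chosen for illustration. The kernel is the sum of a linear kernel on the physics features (the parametric part) and an RBF kernel (the non-parametric part that absorbs unmodeled effects such as friction).

```python
import numpy as np

def physics_features(X):
    """Hypothetical physics-derived basis functions of the state (theta, omega)."""
    theta, omega = X[:, 0], X[:, 1]
    return np.column_stack([np.sin(theta), omega])

def rbf_kernel(A, B, lengthscale=1.0, variance=0.1):
    """Squared-exponential kernel between the rows of A and B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def semiparametric_kernel(A, B, prior_var=10.0):
    """Linear kernel on the physics features plus an RBF kernel on the raw state."""
    parametric = prior_var * physics_features(A) @ physics_features(B).T
    return parametric + rbf_kernel(A, B)

def gp_fit(X, y, noise_var=1e-2):
    """Return alpha = (K + noise_var * I)^{-1} y via a Cholesky factorization."""
    K = semiparametric_kernel(X, X) + noise_var * np.eye(len(X))
    L = np.linalg.cholesky(K)
    return np.linalg.solve(L.T, np.linalg.solve(L, y))

def gp_predict(X_train, alpha, X_test):
    """Posterior mean of the GP at the test inputs."""
    return semiparametric_kernel(X_test, X_train) @ alpha

def true_acceleration(X, rng=None, noise_std=0.05):
    """Toy dynamics (an assumption): gravity + viscous damping + dry-friction-like term."""
    theta, omega = X[:, 0], X[:, 1]
    acc = -9.81 * 0.7 * np.sin(theta) - 0.3 * omega - 0.2 * np.tanh(20.0 * omega)
    if rng is not None:
        acc = acc + noise_std * rng.standard_normal(len(X))
    return acc

rng = np.random.default_rng(0)
X_train = rng.uniform(-1.0, 1.0, size=(200, 2))
y_train = true_acceleration(X_train, rng)
alpha = gp_fit(X_train, y_train)

# One-step estimation error on held-out states.
X_test = rng.uniform(-1.0, 1.0, size=(50, 2))
pred = gp_predict(X_train, alpha, X_test)
rmse = float(np.sqrt(np.mean((pred - true_acceleration(X_test)) ** 2)))

def rollout(x0, n_steps, dt=0.01):
    """n-step-ahead prediction by feeding the model its own output (explicit Euler)."""
    traj = [np.asarray(x0, dtype=float)]
    for _ in range(n_steps):
        theta, omega = traj[-1]
        acc = gp_predict(X_train, alpha, traj[-1][None, :])[0]
        traj.append(np.array([theta + dt * omega, omega + dt * acc]))
    return np.array(traj)

traj = rollout([0.5, 0.0], n_steps=20)
print("held-out RMSE:", rmse)
print("state after 20 steps:", traj[-1])
```

In this sketch the physics features capture the dominant structure, so the RBF term only has to explain the residual; the rollout shows how one-step errors can compound when predicting several steps ahead.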


  • Related News & Events

    •  NEWS    New robotics benchmark system
      Date: November 16, 2020
      MERL Contacts: Devesh K. Jha; Daniel N. Nikovski; Diego Romeres
      Research Areas: Artificial Intelligence, Machine Learning, Robotics
      • MERL researchers, in collaboration with researchers from MELCO and the Department of Brain and Cognitive Science at MIT, have released the Circular Maze Environment (CME) simulation software. This system can be used as a new benchmark for evaluating different control and robot learning algorithms. The control objective is to tip and tilt the maze so as to drive one or more marbles to the innermost ring of the circular maze. Although the system is very intuitive for humans to control, it is very challenging for artificial intelligence agents to learn efficiently. It poses several challenges for both model-based and model-free methods, due to its non-smooth dynamics, long planning horizon, and non-linear dynamics. The released Python package provides the simulation environment for the circular maze, in which the movement of multiple marbles can be simulated simultaneously. The package also provides a trajectory optimization algorithm to design a model-based controller in simulation.
    •  NEWS    MERL researcher Diego Romeres gave an invited talk at University of Connecticut on Reinforcement Learning for Robotics
      Date: November 20, 2019
      MERL Contact: Diego Romeres
      Research Areas: Artificial Intelligence, Data Analytics, Machine Learning, Robotics
      • Diego Romeres, a Research Scientist in MERL's Data Analytics group, gave a seminar lecture at the Electrical and Computer Engineering Colloquium of the University of Connecticut. The talk described novel reinforcement learning algorithms that combine physical models with non-parametric models of robotic systems derived from data.