TR2017-075

Towards Stability in Learning-based Control: A Bayesian Optimization-based Adaptive Controller


    •  Farahmand, A.-M., Benosman, M., "Towards Stability in Learning-based Control: A Bayesian Optimization-based Adaptive Controller", The Multi-disciplinary Conference on Reinforcement Learning and Decision Making, June 2017.
      BibTeX:
      @inproceedings{Farahmand2017jun,
        author = {Farahmand, Amir-massoud and Benosman, Mouhacine},
        title = {Towards Stability in Learning-based Control: A Bayesian Optimization-based Adaptive Controller},
        booktitle = {The Multi-disciplinary Conference on Reinforcement Learning and Decision Making},
        year = 2017,
        month = jun,
        url = {https://www.merl.com/publications/TR2017-075}
      }
Research Areas:

Artificial Intelligence, Control, Optimization, Dynamical Systems, Machine Learning

Abstract:

We propose to merge techniques from control theory and machine learning to design a stable learning-based controller for a class of nonlinear systems. We adopt a modular adaptive control design approach that has two components. The first is a model-based robust nonlinear state feedback, which guarantees stability during learning by rendering the closed-loop system input-to-state stable (ISS). Here the ISS input is the error in the estimation of the uncertain parameters of the dynamics, and the state is the closed-loop output tracking error. The second component is a data-driven Bayesian optimization method that estimates the uncertain parameters of the dynamics and improves the overall performance of the closed-loop system. In particular, we suggest using the Gaussian Process Upper Confidence Bound (GP-UCB) algorithm, which is a method for trading off exploration and exploitation in continuous-armed bandits. GP-UCB searches the space of uncertain parameters and gradually finds the parameters that maximize the performance of the closed-loop system. Together, these two components yield a stable learning-based control algorithm.
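To make the GP-UCB step concrete, the following is a minimal sketch in Python, not the authors' implementation. It treats each candidate parameter estimate as a bandit arm, fits a Gaussian process to the rewards observed so far, and picks the next estimate by maximizing the posterior mean plus a multiple of the posterior standard deviation. Everything specific here is an assumption for illustration: the scalar parameter, the finite candidate grid, the RBF kernel and its length scale, the fixed `beta` (the theory typically uses a schedule that grows with the round index), and `toy_performance`, a hypothetical stand-in for the closed-loop performance that would be measured while the robust ISS controller runs. The controller itself is not modeled.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior(x_obs, y_obs, x_query, noise=1e-3):
    """Posterior mean and std of a zero-mean, unit-variance GP prior."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_qo = rbf_kernel(x_query, x_obs)
    mean = k_qo @ np.linalg.solve(k_oo, y_obs)
    v = np.linalg.solve(k_oo, k_qo.T)
    var = 1.0 - np.sum(k_qo * v.T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))


def gp_ucb(reward, candidates, n_rounds=25, beta=4.0):
    """Run GP-UCB on a finite grid; return the best observed candidate."""
    # Start from the middle of the grid (an arbitrary choice).
    x_obs = np.array([candidates[len(candidates) // 2]])
    y_obs = np.array([reward(x_obs[0])])
    for _ in range(n_rounds):
        mean, std = gp_posterior(x_obs, y_obs, candidates)
        # Upper confidence bound: exploit high mean, explore high uncertainty.
        nxt = candidates[np.argmax(mean + np.sqrt(beta) * std)]
        x_obs = np.append(x_obs, nxt)
        y_obs = np.append(y_obs, reward(nxt))
    return x_obs[np.argmax(y_obs)], x_obs, y_obs


THETA_TRUE = 0.7  # hypothetical "true" uncertain parameter


def toy_performance(theta_hat):
    """Hypothetical reward: higher when the estimate is closer to THETA_TRUE."""
    return -(theta_hat - THETA_TRUE) ** 2


grid = np.linspace(0.0, 1.0, 101)
best_theta, tried, rewards = gp_ucb(toy_performance, grid)
print(f"best estimate: {best_theta:.2f} after {len(tried)} evaluations")
```

In the setting described in the abstract, each call to the reward function would correspond to running the closed-loop system with a given parameter estimate, which is why the stability guarantee of the first component matters: poor estimates tried during exploration must not destabilize the system.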