TR2017-075

Towards Stability in Learning-based Control: A Bayesian Optimization-based Adaptive Controller

- Farahmand, A.-M., Benosman, M., "Towards Stability in Learning-based Control: A Bayesian Optimization-based Adaptive Controller", The Multi-disciplinary Conference on Reinforcement Learning and Decision Making, June 2017.
  BibTeX:
    @inproceedings{Farahmand2017jun,
      author = {Farahmand, Amir-massoud and Benosman, Mouhacine},
      title = {{Towards Stability in Learning-based Control: A Bayesian Optimization-based Adaptive Controller}},
      booktitle = {The Multi-disciplinary Conference on Reinforcement Learning and Decision Making},
      year = 2017,
      month = jun,
      url = {https://www.merl.com/publications/TR2017-075}
    }
Research Areas:

Artificial Intelligence, Control, Optimization, Dynamical Systems, Machine Learning

Abstract:

We propose to merge techniques from control theory and machine learning to design a stable learning-based controller for a class of nonlinear systems. We adopt a modular adaptive control design approach that has two components. The first is a model-based robust nonlinear state feedback, which guarantees stability during learning by rendering the closed-loop system input-to-state stable (ISS). Here the input is the error in the estimation of the uncertain parameters of the dynamics, and the state is the closed-loop output tracking error. The second component is a data-driven Bayesian optimization method for estimating the uncertain parameters of the dynamics and improving the overall performance of the closed-loop system. In particular, we suggest using the Gaussian Process Upper Confidence Bound (GP-UCB) algorithm, a method for trading off exploration and exploitation in continuous-armed bandits. GP-UCB searches the space of uncertain parameters and gradually finds the parameters that maximize the performance of the closed-loop system. Together, these two components ensure that we have a stable learning-based control algorithm.
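
The parameter search described in the abstract can be illustrated with a minimal GP-UCB sketch in plain Python. This is not the authors' implementation. It assumes a single scalar uncertain parameter, a finite grid of candidate values, a squared-exponential kernel, and a fixed exploration weight `beta` in place of a theoretically derived schedule. The function `closed_loop_performance` is a hypothetical stand-in for running the ISS-stabilized closed loop with a given parameter estimate and scoring its tracking performance. All names and numeric values are illustrative assumptions.

```python
import math

def rbf(a, b, length_scale=0.3):
    """Squared-exponential kernel between two scalar parameter values."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_posterior(xs, ys, x_star, noise=1e-4):
    """Zero-mean GP posterior mean and variance at x_star given data (xs, ys)."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, ys))      # alpha = K^{-1} y
    k_star = [rbf(a, x_star) for a in xs]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = max(rbf(x_star, x_star) - sum(vi * vi for vi in v), 0.0)
    return mean, var

def closed_loop_performance(theta_hat, theta_true=0.7):
    """Hypothetical performance score: higher when the estimate is closer
    to the (assumed) true parameter. Stands in for a closed-loop experiment."""
    return -(theta_hat - theta_true) ** 2

def gp_ucb(candidates, n_rounds=15, beta=4.0):
    """Pick parameters by maximizing mean + sqrt(beta * variance) each round."""
    xs = [candidates[0]]
    ys = [closed_loop_performance(xs[0])]
    for _ in range(n_rounds - 1):
        def ucb(x):
            # The GP is refit for every candidate: simple, not efficient.
            mean, var = gp_posterior(xs, ys, x)
            return mean + math.sqrt(beta * var)
        x_next = max(candidates, key=ucb)
        xs.append(x_next)
        ys.append(closed_loop_performance(x_next))
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], xs

candidates = [i / 20 for i in range(21)]   # grid over the assumed range [0, 1]
best_theta, history = gp_ucb(candidates)
print("best parameter estimate:", best_theta)
print("evaluation order:", history)
```

In this sketch the upper confidence bound favors untested regions early on, when the posterior variance is large, and shifts toward the region with the best predicted performance as evaluations accumulate. In the setting of the abstract, each evaluation would be a closed-loop run that the robust feedback component keeps stable regardless of the candidate parameter being tested.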
