Robust Optimization for Trajectory-Centric Model-based Reinforcement Learning

This paper presents a method for robust trajectory optimization in trajectory-centric Model-Based Reinforcement Learning (MBRL). The proposed method uses the uncertainty estimates in the predictions of a model-learning algorithm to generate robustness certificates for trajectory optimization. This is achieved by simultaneously solving for a time-invariant controller that is optimized to satisfy a constraint guaranteeing the robustness certificate. We first present a novel formulation of the robust optimization problem that incorporates local uncertainty sets around a trajectory, within which the closed-loop dynamics of the system are stabilized by the time-invariant policy. The method is demonstrated on an inverted pendulum system with parametric uncertainty: a Gaussian process is used to learn the residual dynamics, and the uncertainty sets it produces are then used to generate trajectories with the local stabilizing policy.
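To make the residual-learning step concrete, the following is a minimal sketch (not the paper's implementation) of fitting a Gaussian process to the residual dynamics of a pendulum with parametric uncertainty, and extracting a confidence interval that could serve as a local uncertainty set. The pendulum parameters, the mass mismatch, and the use of scikit-learn's `GaussianProcessRegressor` are all illustrative assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Hypothetical pendulum parameters; the "true" system has a different
# mass than the nominal model, giving a parametric model error.
g, l = 9.81, 1.0
m_nominal, m_true = 1.0, 1.2

def accel(theta, u, m):
    # Pendulum angular acceleration: -(g/l) sin(theta) + u / (m l^2)
    return -(g / l) * np.sin(theta) + u / (m * l**2)

# Sample state-action pairs and compute the residual between the true
# dynamics and the nominal model's prediction.
rng = np.random.default_rng(0)
theta = rng.uniform(-np.pi, np.pi, 200)
u = rng.uniform(-2.0, 2.0, 200)
residual = accel(theta, u, m_true) - accel(theta, u, m_nominal)

X = np.column_stack([theta, u])
gp = GaussianProcessRegressor(
    kernel=RBF(length_scale=1.0) + WhiteKernel(noise_level=1e-4),
    normalize_y=True,
).fit(X, residual)

# Predictive mean and standard deviation at a query state-action pair;
# a 2-sigma interval around the mean gives a local uncertainty set.
x_query = np.array([[0.3, 0.5]])
mu, sigma = gp.predict(x_query, return_std=True)
lower, upper = mu - 2 * sigma, mu + 2 * sigma
```

In a trajectory optimizer, intervals like `[lower, upper]` queried along the nominal trajectory would bound the model error that the time-invariant stabilizing policy must reject.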