TR2019-156

Robust Optimization for Trajectory-Centric Model-based Reinforcement Learning


This paper presents a method for robust trajectory optimization in trajectory-centric Model-based Reinforcement Learning (MBRL). The method uses the uncertainty estimates in the predictions of a model-learning algorithm to generate robustness certificates for trajectory optimization. It does so by simultaneously solving for a time-invariant controller that is optimized to satisfy a constraint, which yields the robustness certificate. We present a novel formulation of the robust optimization that incorporates local sets around a trajectory, within which the closed-loop dynamics of the system are stabilized by a time-invariant policy. The method is demonstrated on an inverted pendulum system with parametric uncertainty. A Gaussian process learns the residual dynamics, and the uncertainty sets it generates are then used to compute the trajectories together with the local stabilizing policy.
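To make the last step of the abstract concrete, here is a minimal sketch of learning residual dynamics with a Gaussian process and reading off an uncertainty interval. It is an illustration under stated assumptions, not the paper's implementation: the pendulum constants, the mass mismatch standing in for the parametric uncertainty, the squared-exponential kernel with fixed hyperparameters, the 2-sigma interval, and all function names are hypothetical choices made for this example.

```python
import numpy as np

G, LENGTH, DAMPING = 9.81, 1.0, 0.1  # assumed pendulum constants (hypothetical)


def pendulum_accel(theta, omega, u, mass):
    """Angular acceleration of a damped pendulum with point mass `mass`."""
    inertia = mass * LENGTH ** 2
    return (u - DAMPING * omega - mass * G * LENGTH * np.sin(theta)) / inertia


def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dist = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale ** 2)


def gp_predict(X, y, X_query, noise_var=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at X_query."""
    K = rbf_kernel(X, X) + noise_var * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    K_s = rbf_kernel(X, X_query)
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = rbf_kernel(X_query, X_query).diagonal() - (v ** 2).sum(axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))


def sample_residual_data(n=60, nominal_mass=1.0, true_mass=1.2, seed=0):
    """Sample (theta, omega, u) inputs and the true-minus-nominal acceleration."""
    rng = np.random.default_rng(seed)
    X = rng.uniform([-np.pi, -2.0, -2.0], [np.pi, 2.0, 2.0], size=(n, 3))
    theta, omega, u = X.T
    y = pendulum_accel(theta, omega, u, true_mass) - pendulum_accel(
        theta, omega, u, nominal_mass
    )
    return X, y


# Fit on sampled residuals, then query one point near the data and one far away.
X_train, y_train = sample_residual_data()
X_query = np.array([[0.1, 0.0, 0.5], [50.0, 50.0, 50.0]])
mean, std = gp_predict(X_train, y_train, X_query)

# Assumed 2-sigma interval around the predicted residual, used as the uncertainty set.
lower, upper = mean - 2.0 * std, mean + 2.0 * std
```

The predictive standard deviation grows away from the training data, so an interval such as `[lower, upper]` widens where the learned model is less trustworthy. A robust trajectory optimizer could treat such intervals as bounds on the model error along the trajectory; how the paper actually constructs and uses its uncertainty sets is not specified here.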

 

  • Related Publications

  • Jha, D., Kolaric, P., Raghunathan, A., Lewis, F., Benosman, M., Romeres, D., Nikovski, D.N., "Local Policy Optimization for Trajectory-Centric Reinforcement Learning", IEEE International Conference on Robotics and Automation (ICRA), Ayanna Howard, Ed., May 2020, pp. 5094-5100.
    @inproceedings{Jha2020may,
      author = {Jha, Devesh and Kolaric, Patrik and Raghunathan, Arvind and Lewis, Frank and Benosman, Mouhacine and Romeres, Diego and Nikovski, Daniel N.},
      title = {Local Policy Optimization for Trajectory-Centric Reinforcement Learning},
      booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
      year = 2020,
      editor = {Ayanna Howard},
      pages = {5094--5100},
      month = may,
      publisher = {IEEE},
      isbn = {978-1-7281-7395-5},
      url = {https://www.merl.com/publications/TR2020-062}
    }
  •  Kolaric, P., Jha, D., Raghunathan, A., Lewis, F., Benosman, M., Romeres, D., Nikovski, D.N., "Local Policy Optimization for Trajectory-Centric Reinforcement Learning", arXiv, January 2020.
    @article{Kolaric2020jan,
      author = {Kolaric, Patrik and Jha, Devesh and Raghunathan, Arvind and Lewis, Frank and Benosman, Mouhacine and Romeres, Diego and Nikovski, Daniel N.},
      title = {Local Policy Optimization for Trajectory-Centric Reinforcement Learning},
      journal = {arXiv},
      year = 2020,
      month = jan,
      url = {https://arxiv.org/abs/2001.08092}
    }