TR2020-108

Safe Approximate Dynamic Programming via Kernelized Lipschitz Estimation


    •  Chakrabarty, A., Jha, D., Buzzard, G.T., Wang, Y., Vamvoudakis, K., "Safe Approximate Dynamic Programming via Kernelized Lipschitz Estimation", IEEE Transactions on Neural Networks and Learning Systems, DOI: 10.1109/TNNLS.2020.2978805, July 2020.
      @article{Chakrabarty2020jul2,
        author = {Chakrabarty, Ankush and Jha, Devesh and Buzzard, Gregery T. and Wang, Yebin and Vamvoudakis, Kyriakos},
        title = {Safe Approximate Dynamic Programming via Kernelized Lipschitz Estimation},
        journal = {IEEE Transactions on Neural Networks and Learning Systems},
        year = 2020,
        month = jul,
        doi = {10.1109/TNNLS.2020.2978805},
        url = {https://www.merl.com/publications/TR2020-108}
      }
  • Research Areas: Control, Machine Learning, Optimization

We develop a method for obtaining safe initial policies for reinforcement learning via approximate dynamic programming (ADP) techniques for uncertain systems evolving with discrete-time dynamics. We employ kernelized Lipschitz estimation to learn multiplier matrices that are used in semidefinite programming frameworks for computing admissible initial control policies with provably high probability. Such admissible controllers enable safe initialization and constraint enforcement while providing exponential stability of the equilibrium of the closed-loop system.
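The abstract does not spell out the estimator, so the following is only a minimal sketch of one way a Lipschitz constant could be estimated from sampled data with a kernel density estimate. It assumes a Gaussian kernel, a rule-of-thumb bandwidth, and a high quantile of the smoothed slope distribution as the estimate. The function names `pairwise_slopes` and `estimate_lipschitz` and the parameter `prob` are hypothetical and are not taken from the paper; the semidefinite programming step is not shown.

```python
import math
import numpy as np

# Elementwise error function built from the standard library (no SciPy needed).
_erf = np.vectorize(math.erf)


def pairwise_slopes(X, Y):
    """Return ||y_i - y_j|| / ||x_i - x_j|| for all distinct sample pairs."""
    X = np.asarray(X, dtype=float).reshape(len(X), -1)
    Y = np.asarray(Y, dtype=float).reshape(len(Y), -1)
    i, j = np.triu_indices(len(X), k=1)
    dx = np.linalg.norm(X[i] - X[j], axis=1)
    dy = np.linalg.norm(Y[i] - Y[j], axis=1)
    keep = dx > 1e-12  # skip (near-)duplicate inputs to avoid division by zero
    return dy[keep] / dx[keep]


def estimate_lipschitz(X, Y, prob=0.999):
    """Estimate a Lipschitz constant as the `prob`-quantile of a Gaussian KDE
    fitted to the pairwise slopes (assumed estimator, for illustration only).

    `prob` should stay below about 0.9999 so the bisection bracket is valid.
    """
    s = pairwise_slopes(X, Y)
    # Rule-of-thumb bandwidth (an assumption), floored to stay positive.
    h = max(1.06 * s.std() * len(s) ** (-0.2), 1e-9)

    def cdf(t):
        # CDF of a Gaussian KDE is the average of the kernel CDFs.
        z = (t - s) / (h * math.sqrt(2.0))
        return float(np.mean(0.5 * (1.0 + _erf(z))))

    # Bisection on the monotone CDF; the bracket spans five bandwidths
    # beyond the smallest and largest observed slopes.
    lo, hi = s.min() - 5.0 * h, s.max() + 5.0 * h
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < prob:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    x = np.linspace(-1.0, 1.0, 80)
    print(estimate_lipschitz(x, np.sin(3.0 * x)))  # true constant on this interval is 3
```

Raising `prob` makes the estimate more conservative, at the cost of a looser bound. Because the estimate is data-driven, it holds only with some probability, which is consistent with the high-probability guarantee stated in the abstract.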

  • Related Publication

  •  Chakrabarty, A., Jha, D., Buzzard, G.T., Wang, Y., Vamvoudakis, K., "Safe Approximate Dynamic Programming Via Kernelized Lipschitz Estimation", arXiv, July 2019.
    @article{Chakrabarty2019jul2,
      author = {Chakrabarty, Ankush and Jha, Devesh and Buzzard, Gregery T. and Wang, Yebin and Vamvoudakis, Kyriakos},
      title = {Safe Approximate Dynamic Programming Via Kernelized Lipschitz Estimation},
      journal = {arXiv},
      year = 2019,
      month = jul,
      url = {https://arxiv.org/abs/1907.02151}
    }