Value-Aware Loss Function for Model Learning in Reinforcement Learning

    •  Farahmand, A.-M., Barreto, A.M.S., Nikovski, D.N., "Value-Aware Loss Function for Model Learning in Reinforcement Learning", European Workshop on Reinforcement Learning (EWRL), December 2016.
      TR2016-153
      @inproceedings{Farahmand2016dec2,
        author = {Farahmand, Amir-massoud and Barreto, Andre M.S. and Nikovski, Daniel N.},
        title = {Value-Aware Loss Function for Model Learning in Reinforcement Learning},
        booktitle = {European Workshop on Reinforcement Learning (EWRL)},
        year = 2016,
        month = dec,
        url = {}
      }
  • Research Areas: Data Analytics, Optimization


We consider the problem of estimating the transition probability kernel to be used by a model-based reinforcement learning (RL) algorithm. We argue that estimating a generative model by minimizing a probabilistic loss, such as the log-loss, may be overkill, because such a loss takes into account neither the underlying structure of the decision problem nor the RL algorithm intended to solve it. We introduce a loss function that takes the structure of the value function into account. We provide a finite-sample upper bound for the loss function that shows how the error depends on the model approximation error and the number of samples.
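
To make the contrast between a probabilistic loss and a value-aware loss concrete, here is a minimal sketch on a toy problem. It rests on several assumptions that are not stated in the abstract: a finite state space, a single state-action pair, and a small finite set of candidate value functions standing in for the function class. The names `kl_loss` and `value_aware_loss` are hypothetical, and the code is an illustration rather than the authors' implementation; the exact definition of the loss is given in the paper.

```python
import numpy as np


def kl_loss(p_true, p_model):
    """KL(p_true || p_model) between two next-state distributions.

    Stands in for a probabilistic loss such as the log-loss.
    """
    mask = p_true > 0
    return float(np.sum(p_true[mask] * np.log(p_true[mask] / p_model[mask])))


def value_aware_loss(p_true, p_model, value_functions):
    """Worst-case squared error in the expected next-state value.

    value_functions has shape (num_candidates, num_states); the maximum over
    its rows is an assumed finite stand-in for a supremum over a function class.
    """
    diffs = value_functions @ (p_true - p_model)
    return float(np.max(diffs ** 2))


# Toy next-state distributions over 3 states for one state-action pair.
p_true = np.array([0.5, 0.4, 0.1])
p_model = np.array([0.5, 0.1, 0.4])  # swaps the mass on states 1 and 2

# Assumed candidate value functions: states 1 and 2 always have equal value.
candidates = np.array([
    [0.0, 1.0, 1.0],
    [2.0, 5.0, 5.0],
])

kl = kl_loss(p_true, p_model)
val = value_aware_loss(p_true, p_model, candidates)
print(f"KL loss:          {kl:.3f}")
print(f"Value-aware loss: {val:.3f}")
```

In this toy setup the model misplaces probability mass between two states that every candidate value function treats as equivalent. The KL loss penalizes that mismatch, while the value-aware quantity is essentially zero, because the expected next-state value is unchanged.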