TR2018-028

Reinforcement Learning with Function-Valued Action Spaces for Partial Differential Equation Control


    •  Pan, Y., Farahmand, A.-M., White, M., Nabi, S., Grover, P., Nikovski, D.N., "Reinforcement Learning with Function-Valued Action Spaces for Partial Differential Equation Control," Tech. Rep. TR2018-028, arXiv, February 2018.
      @techreport{MERL_TR2018-028,
        author = {Pan, Y. and Farahmand, A.-M. and White, M. and Nabi, S. and Grover, P. and Nikovski, D.N.},
        title = {Reinforcement Learning with Function-Valued Action Spaces for Partial Differential Equation Control},
        institution = {MERL - Mitsubishi Electric Research Laboratories},
        address = {Cambridge, MA 02139},
        number = {TR2018-028},
        month = feb,
        year = 2018,
        url = {https://www.merl.com/publications/TR2018-028/}
      }
Research Areas: Artificial Intelligence, Data Analytics, Optimization


Recent work has shown that reinforcement learning (RL) is a promising approach to controlling partial differential equations (PDEs) with discrete actions. This paper shows how to use RL algorithms to solve more general and common PDE control problems in which the action lies in a continuous, high-dimensional space with spatial relationships among the action dimensions. In particular, we propose the idea of action descriptors, which encode regularities among spatially-extended action dimensions and enable the agent to control PDEs with high-dimensional actions. Based on a covering number argument, we provide theoretical evidence suggesting that this approach can be more sample efficient than a conventional approach that treats each action dimension separately and does not explicitly exploit the spatial regularity of the action space. The action descriptor approach is then used within the deep deterministic policy gradient (DDPG) algorithm, and experiments are conducted on two PDE control domains with up to 256-dimensional continuous actions. The empirical results show the advantage of the proposed approach over the conventional one. We believe the action descriptor-based approach has the potential to solve various PDE control problems with high-dimensional action spaces, as well as other classical high-dimensional action problems in which the action dimensions have regularities among themselves.
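To illustrate the core idea in the abstract, the sketch below shows a minimal action-descriptor-style policy: a single network, with parameters shared across all action dimensions, is evaluated once per spatial location (descriptor) to produce a high-dimensional action. All names, dimensions, and the choice of 1-D coordinates as descriptors are illustrative assumptions, not the paper's exact architecture; the point is only that parameter count stays fixed as the number of action dimensions grows.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: an 8-dimensional state summary and 16
# spatially-extended action dimensions (e.g., actuators along a rod).
state_dim, n_actions = 8, 16

# Action descriptors: here, the normalized 1-D spatial coordinate of
# each actuator (an assumption; any encoding of location would do).
descriptors = np.linspace(0.0, 1.0, n_actions).reshape(-1, 1)

# One shared policy maps (state, descriptor) -> a single action value.
# Because weights are shared across locations, the parameter count
# does not depend on n_actions.
hidden = 32
W1 = rng.normal(0.0, 0.1, (state_dim + 1, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(0.0, 0.1, (hidden, 1))
b2 = np.zeros(1)

def policy(state):
    """Evaluate the shared network once per action descriptor."""
    # Pair the same state with every descriptor: shape (n_actions, state_dim + 1).
    inp = np.hstack([np.tile(state, (n_actions, 1)), descriptors])
    h = np.tanh(inp @ W1 + b1)
    return (h @ W2 + b2).ravel()  # one action value per spatial location

state = rng.normal(size=state_dim)
action = policy(state)
print(action.shape)  # (16,)
```

In the conventional approach the output layer would grow linearly with the number of action dimensions; here, scaling from 16 to 256 actuators only changes `descriptors`, which is the regularity the covering-number argument exploits.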