Learning Optimization-based Control Policies Directly from Digital Twin Simulations


This paper proposes using a digital twin of a dynamical system directly for optimization-based control. It introduces an algorithm based on an Unscented Kalman Filter (UKF) for solving optimization-based control problems in which the system dynamics are encoded in the digital twin. The UKF-based algorithm optimizes the control policy directly from simulations of the digital twin and requires no gradient computations, which makes it suitable for systems with differential-algebraic constraints, where gradients may be inaccessible. It also requires no explicit knowledge of the digital twin's internal model or of the control map; that is, it is a purely simulation-data-driven approach. The main advantage is that a high-precision, simulation-oriented digital twin can approximate the physical dynamical system more accurately than an analytical, control-oriented model and can thus improve controller performance. The digital twin-based optimal control approach is evaluated in two case studies. First, a controller for a pendulum on a cart is optimized to swing the pendulum up and stabilize it. Second, a crane controller is optimized to avoid oscillations of the load.
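To illustrate how a UKF can optimize a policy from simulations alone, the following is a minimal sketch and not the algorithm of this paper. It assumes one common formulation: the policy parameters are treated as the UKF state, the simulated closed-loop output is treated as the measurement, and the desired output (here, all zeros) plays the role of the observed data. The toy double integrator in `simulate` is a hypothetical stand-in for a black-box digital twin, and all names (`simulate`, `ukf_step`, `optimize_policy`, `cost`), the linear feedback policy, and the tuning values are assumptions made for illustration.

```python
import numpy as np

def simulate(theta, n_steps=50, dt=0.1):
    """Hypothetical stand-in for a black-box digital twin: a double
    integrator under the linear feedback u = -theta[0]*pos - theta[1]*vel."""
    pos, vel = 1.0, 0.0
    out = []
    for _ in range(n_steps):
        u = -theta[0] * pos - theta[1] * vel
        pos, vel = pos + dt * vel, vel + dt * u   # explicit Euler step
        out.extend([pos, 0.5 * u])                # position and weighted effort
    return np.array(out)

def cost(theta):
    """Sum of squared simulated outputs (the reference output is zero)."""
    return float(np.sum(simulate(theta) ** 2))

def ukf_step(theta, P, y_ref, Q, R, kappa=1.0):
    """One UKF update of the policy parameters; uses only simulator calls."""
    n = theta.size
    P_pred = P + Q                                # random-walk parameter model
    L = np.linalg.cholesky((n + kappa) * P_pred)
    # Sigma points: the mean plus/minus each column of the scaled Cholesky factor.
    sigma = np.vstack([theta, theta + L.T, theta - L.T])
    w = np.full(2 * n + 1, 0.5 / (n + kappa))
    w[0] = kappa / (n + kappa)
    Y = np.array([simulate(s) for s in sigma])    # one simulation per sigma point
    y_mean = w @ Y
    dY = Y - y_mean
    dT = sigma - theta
    P_yy = dY.T @ (w[:, None] * dY) + R           # innovation covariance
    P_ty = dT.T @ (w[:, None] * dY)               # parameter/output cross-covariance
    K = np.linalg.solve(P_yy, P_ty.T).T           # gain K = P_ty @ inv(P_yy)
    theta_new = theta + K @ (y_ref - y_mean)
    P_new = P_pred - K @ P_yy @ K.T
    return theta_new, 0.5 * (P_new + P_new.T)     # symmetrize against round-off

def optimize_policy(theta0, n_iter=50):
    """Repeat the UKF update against a fixed all-zero reference output."""
    theta = np.asarray(theta0, dtype=float)
    y_ref = np.zeros_like(simulate(theta))
    P = 0.01 * np.eye(theta.size)                 # assumed initial spread
    Q = 1e-3 * np.eye(theta.size)                 # assumed process noise
    R = np.eye(y_ref.size)                        # assumed measurement noise
    for _ in range(n_iter):
        theta, P = ukf_step(theta, P, y_ref, Q, R)
    return theta

if __name__ == "__main__":
    theta0 = np.array([0.5, 1.5])
    theta = optimize_policy(theta0)
    print("initial cost:", cost(theta0), "final cost:", cost(theta))
```

Each iteration calls the simulator once per sigma point (five calls for two parameters) and never differentiates it, which is the property that lets a simulator with inaccessible gradients be used as is. In this sketch the covariances `Q` and `R` act as tuning knobs for step size and damping rather than as physical noise models.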