Simulation to Real Transfer Learning with Robustified Policies for Robot Tasks

Learning tasks from simulated data using reinforcement learning has been shown to be effective. A major advantage of using simulation data for training is that it reduces the burden of acquiring real data. This matters especially when robots are involved: time a robot spends learning is time it cannot spend on its intended (manufacturing) task. A policy learned on simulation data can be transferred to, and refined on, real data. In this paper we propose to learn a robustified policy during reinforcement learning using simulation data. A robustified policy is learned by exploiting the ability to change the simulation parameters (appearance and dynamics) for successive training episodes. We demonstrate that a robustified policy requires less transfer learning when moving from a simulated to a real task. We focus on tasks that involve real-time non-linear dynamics, since non-linear dynamics can only be approximately modeled in physics engines, which makes the need for robustness in learned policies more evident.
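
The idea of changing simulation parameters between successive training episodes can be illustrated with a minimal sketch. It is not the implementation used in this paper. The parameter names (friction, mass_scale, light_intensity), their ranges, and the run_episode callback are all hypothetical placeholders for whatever appearance and dynamics settings a given simulator exposes.

```python
import random
from dataclasses import dataclass


@dataclass
class SimParams:
    """Hypothetical simulation parameters; names and ranges are assumptions."""
    friction: float         # dynamics parameter
    mass_scale: float       # dynamics parameter
    light_intensity: float  # appearance parameter


def sample_sim_params(rng):
    """Draw a fresh set of dynamics and appearance parameters."""
    return SimParams(
        friction=rng.uniform(0.5, 1.5),
        mass_scale=rng.uniform(0.8, 1.2),
        light_intensity=rng.uniform(0.3, 1.0),
    )


def train(num_episodes, run_episode, seed=0):
    """Run training episodes, re-sampling simulation parameters before each one.

    run_episode is a caller-supplied function (an assumption of this sketch)
    that configures the simulator with the given parameters, runs one episode
    with a policy update, and returns the episode return.
    """
    rng = random.Random(seed)
    history = []
    for _ in range(num_episodes):
        params = sample_sim_params(rng)       # new parameters for this episode
        episode_return = run_episode(params)  # simulate and update the policy
        history.append((params, episode_return))
    return history
```

Because each episode is run under different parameters, the policy cannot rely on any single simulator configuration, which is the sense in which it is robustified.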