Analysis of the contribution and temporal dependency of LSTM layers for reinforcement learning tasks

Long short-term memory (LSTM) architectures are widely used in deep neural networks (DNN) when the input data is time-varying, because of their ability to capture (often unknown) long-term dependencies of sequential data. In this paper, we present an approach to analyze the temporal dependencies needed by an LSTM layer. Our approach first locates so-called salient LSTM cells that contribute most to the neural network output, by combining both forward and backward propagation. For these salient cells, we compare their output contributions and the internal gates of LSTM to see whether the activation of gates precedes the increasing of contribution, and how far beforehand the precedence occurs. We apply our analysis in the context of reinforcement learning (RL) for robot control to understand how the LSTM layer reacts under different circumstances.