TR2020-011

DynamicsExplorer: Visual Analytics for Robot Control Tasks involving Dynamics and LSTM-based Control Policies

- He, W., Lee, T.-Y., van Baar, J., Wittenburg, K.B., Shen, H.-W., "DynamicsExplorer: Visual Analytics for Robot Control Tasks involving Dynamics and LSTM-based Control Policies", IEEE Pacific Visualization Symposium (PacificVis), DOI: 10.1109/PacificVis48177.2020.7127, January 2020, pp. 36-45.
  @inproceedings{He2020jan,
    author = {He, Wenbin and Lee, Teng-Yok and {van Baar}, Jeroen and Wittenburg, Kent B. and Shen, Han-Wei},
    title = {{DynamicsExplorer: Visual Analytics for Robot Control Tasks involving Dynamics and LSTM-based Control Policies}},
    booktitle = {IEEE Pacific Visualization Symposium (PacificVis)},
    year = 2020,
    pages = {36--45},
    month = jan,
    doi = {10.1109/PacificVis48177.2020.7127},
    url = {https://www.merl.com/publications/TR2020-011}
  }
Research Areas:

Artificial Intelligence, Computer Vision, Data Analytics, Machine Learning

Abstract:

Deep reinforcement learning (RL), where a policy represented by a deep neural network is trained, has shown some success in playing video games and chess. However, applying RL to real-world tasks like robot control is still challenging. Because generating a massive number of samples to train control policies using RL on real robots is very expensive, and hence impractical, it is common to train in simulation and then transfer to real environments. The trained policy, however, may fail in the real world because of the difference between the training and the real environments, especially the difference in dynamics. To diagnose the problems, it is crucial for experts to understand (1) how the trained policy behaves under different dynamics settings, (2) which part of the policy affects the behaviors the most when the dynamics setting changes, and (3) how to adjust the training procedure to make the policy robust. This paper presents DynamicsExplorer, a visual analytics tool to diagnose the trained policy on robot control tasks under different dynamics settings. DynamicsExplorer allows experts to overview the results of multiple tests with different dynamics-related parameter settings so experts can visually detect failures and analyze the sensitivity of different parameters. Experts can further examine the internal activations of the policy for selected tests and compare the activations between success and failure tests. Such comparisons help experts form hypotheses about the policy and allow them to verify the hypotheses via DynamicsExplorer. Multiple use cases are presented to demonstrate the utility of DynamicsExplorer.
