April 10, 2017
MERL researcher Tim K. Marks presented an invited talk at the University of Utah School of Computing, entitled "Action Detection from Video and Robust Real-Time 2D Face Alignment."
Abstract: The first part of the talk describes our multi-stream bi-directional recurrent neural network for action detection from video. In addition to a two-stream convolutional neural network (CNN) on full-frame appearance (images) and motion (optical flow), our system trains two additional streams on appearance and motion that have been cropped to a bounding box from a person tracker. To model long-term temporal dynamics within and between actions, the multi-stream CNN is followed by a bi-directional Long Short-Term Memory (LSTM) layer. Our method outperforms the previous state of the art on two action detection datasets: the MPII Cooking 2 Dataset, and a new MERL Shopping Dataset that we have made available to the community. The second part of the talk describes our method for face alignment, which is the localization of a set of facial landmark points in a 2D image or video of a face. Face alignment is particularly challenging when there are large variations in pose (in-plane and out-of-plane rotations) and facial expression. To address this issue, we propose a cascade in which each stage consists of a Mixture of Invariant eXperts (MIX), where each expert learns a regression model that is specialized to a different subset of the joint space of pose and expressions. We also present a method to include deformation constraints within the discriminative alignment framework, which makes the algorithm more robust. Our face alignment system outperforms the previous results on standard datasets. The talk will end with a live demo of our face alignment system.
University of Utah School of Computing
- Related Publications:
- "Robust Face Alignment Using a Mixture of Invariant Experts", European Conference on Computer Vision (ECCV), DOI: 10.1007/978-3-319-46454-1_50, October 2016, vol. 9909, pp. 825-841. ,
- "A Multi-Stream Bi-Directional Recurrent Neural Network for Fine-Grained Action Detection", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), DOI: 10.1109/CVPR.2016.216, June 2016, pp. 1961-1970. ,