Robust Attentional Pooling via Feature Selection


In this paper we propose a novel network module, Robust Attentional Pooling (RAP), which can be plugged into an arbitrary network to generate single-vector representations for classification. Taking a feature matrix per data sample as input, RAP learns data-dependent weights that produce a vector through linear transformations of the feature matrix. We use feature selection to control the sparsity of these weights, compressing the feature matrices while enhancing the robustness of attentional pooling. As exemplary applications, we plug RAP into PointNet and ResNet for point cloud and image recognition, respectively. We demonstrate that RAP significantly improves recognition performance for both networks, particularly when sparsity is high. For instance, in the extreme case where only one feature per matrix is selected for recognition, RAP improves accuracy over PointNet by more than 60% on the ModelNet40 dataset.
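To make the pooling idea concrete, the following is a minimal numpy sketch of attentional pooling with hard feature selection. The scoring vector `W_att`, the top-k selection rule, and the function name are illustrative assumptions, not the paper's actual parameterization: the point is only that sparse, data-dependent weights collapse an (N, D) feature matrix into a single D-vector.

```python
import numpy as np

def robust_attentional_pooling(X, W_att, k):
    """Pool an (N, D) feature matrix into a single D-vector.

    X:     (N, D) per-sample feature matrix
    W_att: (D,)   learned scoring vector (a stand-in for the
                  module's data-dependent attention parameters)
    k:     number of features kept (the sparsity level)
    """
    scores = X @ W_att                  # (N,) one score per feature
    top_k = np.argsort(scores)[-k:]     # indices of the k best-scoring features
    masked = np.full_like(scores, -np.inf)
    masked[top_k] = scores[top_k]       # hard feature selection
    e = np.exp(masked - masked[top_k].max())
    attn = e / e.sum()                  # sparse attention weights (sum to 1)
    return attn @ X                     # (D,) pooled representation
```

With k = 1 the pooled vector degenerates to the single highest-scoring feature row, mirroring the extreme one-feature-per-matrix setting discussed above.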