Anomalous sound detection using attentive neural processes


A typical approach to unsupervised anomaly detection of machine sounds learns an autoencoder model that reconstructs the spectrograms of normal sounds. During inference, the fidelity of the reconstruction can be used to identify anomalous sounds that differ from the normal sounds encountered during training. Recent improvements to the baseline autoencoder approach mask certain regions of the spectrogram at the input to the autoencoder, and then use the reconstruction error over the masked regions as the anomaly score. We propose an alternative approach based on the attentive neural process, a recently proposed meta-learning technique for estimating distributions over signals. A benefit of our approach is that the masked regions of the spectrogram do not need to be pre-specified at training time, and can instead be determined based on signal properties or prior knowledge. Furthermore, we present an iterative approach that finds difficult-to-reconstruct spectrogram regions and uses the reconstruction error over only those regions as the anomaly score. We demonstrate the effectiveness of our approach in experiments on the six machines of the DCASE 2020 Task 2 dataset, including in the case of zero-shot domain adaptation, where our approach outperforms baseline approaches in predicting anomalies for unseen machine instances.
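As a rough illustration of the scoring idea described above, the following minimal sketch computes a reconstruction error restricted to selected spectrogram bins. It is not the attentive neural process model itself: the reconstruction is assumed to come from some already-trained model, the function names `masked_anomaly_score` and `hardest_region_score` are hypothetical, and selecting the largest-error bins in a single pass is a simplified stand-in for the iterative region search, not the procedure used in this work. Spectrograms are represented as nested lists to keep the example dependency-free.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def masked_anomaly_score(spec: Matrix, recon: Matrix,
                         mask: Sequence[Sequence[bool]]) -> float:
    """Mean squared reconstruction error over masked (True) bins only."""
    sq_errors = [
        (s - r) ** 2
        for s_row, r_row, m_row in zip(spec, recon, mask)
        for s, r, m in zip(s_row, r_row, m_row)
        if m
    ]
    if not sq_errors:
        raise ValueError("mask selects no bins")
    return sum(sq_errors) / len(sq_errors)


def hardest_region_score(spec: Matrix, recon: Matrix,
                         fraction: float = 0.1) -> float:
    """Mean squared error over the `fraction` of bins with the largest error.

    Assumption: a one-pass top-k selection, used here only to illustrate
    scoring over difficult-to-reconstruct bins.
    """
    sq_errors = sorted(
        ((s - r) ** 2
         for s_row, r_row in zip(spec, recon)
         for s, r in zip(s_row, r_row)),
        reverse=True,
    )
    if not sq_errors:
        raise ValueError("empty spectrogram")
    k = max(1, round(fraction * len(sq_errors)))  # keep at least one bin
    return sum(sq_errors[:k]) / k


# Toy 2x3 "spectrogram" and an imperfect reconstruction of it.
spec = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
recon = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
mask = [[True, False, False], [False, True, False]]

print(masked_anomaly_score(spec, recon, mask))
print(hardest_region_score(spec, recon, fraction=0.5))
```

In this sketch the mask is passed in at scoring time, which mirrors the point that masked regions need not be fixed when the model is trained; a higher score indicates a sound that the model reconstructs poorly over the selected bins.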