An Improved Deep Learning Architecture for Person Re-Identification


In this work, we propose a method for simultaneously learning features and a corresponding similarity metric for person re-identification. We present a deep convolutional architecture with layers specially designed to address the problem of re-identification. Given a pair of images as input, our network outputs a similarity value indicating whether the two input images depict the same person. Novel elements of our architecture include a layer that computes cross-input neighborhood differences, which capture local relationships between the two input images based on mid-level features from each input image. A high-level summary of the outputs of this layer is computed by a layer of patch summary features, which are then spatially integrated in subsequent layers. Our method significantly outperforms the state of the art on both a large data set (CUHK03) and a medium-sized data set (CUHK01), and is resistant to overfitting. We also demonstrate that by initially training on an unrelated large data set before fine-tuning on a small target data set, our network can achieve results comparable to the state of the art even on a small data set (VIPeR).
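
To make the idea of cross-input neighborhood differences concrete, the following is a minimal NumPy sketch, not the implementation used in this work. It assumes two mid-level feature maps `f` and `g` of identical shape (channels, height, width), one per input image, and subtracts from each value of `f` the values of `g` in a square window centered at the same location. The function name, the 5x5 window size, and the zero padding at the borders are illustrative assumptions that the text above does not specify.

```python
import numpy as np


def cross_input_neighborhood_differences(f, g, k=5):
    """Illustrative sketch (hypothetical helper, not the paper's code).

    f, g : arrays of shape (C, H, W), mid-level features of the two images.
    k    : odd window size (assumed here; 5 is only an example).

    Returns an array of shape (C, H, W, k, k) where
    out[c, y, x, dy, dx] = f[c, y, x] - g[c, y + dy - k//2, x + dx - k//2],
    with g treated as zero outside its borders (assumed padding).
    """
    assert f.shape == g.shape, "feature maps must have the same shape"
    assert k % 2 == 1, "window size must be odd so it has a center"
    c, h, w = f.shape
    r = k // 2
    # Zero-pad only the two spatial axes of g.
    g_pad = np.pad(g, ((0, 0), (r, r), (r, r)))
    out = np.empty((c, h, w, k, k), dtype=np.result_type(f, g))
    for dy in range(k):
        for dx in range(k):
            # Shifted view of g aligned with every location of f.
            out[:, :, :, dy, dx] = f - g_pad[:, dy:dy + h, dx:dx + w]
    return out


# Usage: two random 4-channel feature maps of spatial size 6x3.
rng = np.random.default_rng(0)
f = rng.standard_normal((4, 6, 3))
g = rng.standard_normal((4, 6, 3))
diff = cross_input_neighborhood_differences(f, g)
print(diff.shape)  # (4, 6, 3, 5, 5)
```

Calling the function with the arguments swapped gives the differences taken in the other direction. A patch summary layer, as described above, would then reduce each k-by-k block to a single feature; that step is omitted here because its exact form is not given in this text.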