TR2007-031
| Sparse Overcomplete Decomposition for Single Channel Speaker Separation | |||
| Citation: | Shashanka, M.V.S.; Raj, B.; Smaragdis, P., "Sparse Overcomplete Decomposition for Single Channel Speaker Separation", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), ISSN: 1520-6149, Vol. 2, pp. II-641 - II-644, April 2007 (IEEE Xplore) | ||
| Date: | April 2007 | ||
| MERL Contact: | Bhiksha Raj | ||
We present an algorithm for separating multiple speakers from a mixed single-channel recording. The algorithm builds on a model proposed by Raj and Smaragdis (2005). The idea is to extract characteristic spectro-temporal basis functions from training data for each speaker and to decompose the mixed signal as a linear combination of these learned bases. In other words, their model extracts a compact code of basis functions that explains the space spanned by a speaker's spectral vectors. Our model instead generates a sparse-distributed code, with more basis functions than the dimensionality of the space (an overcomplete set). We propose a probabilistic framework to achieve sparsity. Experiments show that the resulting sparse code better captures the structure in the data and hence leads to better separation.
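The general idea of learning per-speaker bases and decomposing a mixture over them can be sketched as follows. This is only an illustration under stated assumptions, not the method of the paper: it swaps in plain non-negative matrix factorization with KL-divergence multiplicative updates for the probabilistic model, and it omits the sparsity framework entirely. The function names `learn_bases` and `separate`, and all parameters, are hypothetical. Inputs are assumed to be non-negative magnitude spectrograms of shape (frequency bins, time frames).

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def learn_bases(V, n_bases, n_iter=200, seed=0):
    """Fit V ~= W @ H with KL-NMF multiplicative updates (a stand-in
    for the paper's probabilistic model) and return the bases W.

    V: non-negative training spectrogram for ONE speaker, shape (F, T).
    n_bases may exceed F (overcomplete), but no sparsity is enforced here.
    """
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, n_bases)) + 1e-3
    H = rng.random((n_bases, T)) + 1e-3
    for _ in range(n_iter):
        R = V / (W @ H + EPS)
        W *= (R @ H.T) / (H.sum(axis=1) + EPS)
        R = V / (W @ H + EPS)
        H *= (W.T @ R) / (W.sum(axis=0)[:, None] + EPS)
    # Normalize each basis (column) to sum to 1.
    return W / (W.sum(axis=0, keepdims=True) + EPS)


def separate(V_mix, W_list, n_iter=200, seed=0):
    """Decompose a mixture over the concatenated, FIXED speaker bases
    and return one spectrogram estimate per speaker.

    Only the activations H are updated; each estimate is the mixture
    scaled by that speaker's share of the reconstruction (a soft mask).
    """
    rng = np.random.default_rng(seed)
    W = np.hstack(W_list)
    H = rng.random((W.shape[1], V_mix.shape[1])) + 1e-3
    for _ in range(n_iter):
        R = V_mix / (W @ H + EPS)
        H *= (W.T @ R) / (W.sum(axis=0)[:, None] + EPS)
    total = W @ H + EPS
    estimates, start = [], 0
    for Wk in W_list:
        stop = start + Wk.shape[1]
        estimates.append(V_mix * (Wk @ H[start:stop]) / total)
        start = stop
    return estimates
```

In this sketch one would call `learn_bases` once per speaker on that speaker's training spectrogram, then pass the resulting list of basis matrices to `separate` together with the mixture spectrogram. Because the soft masks sum to one, the per-speaker estimates add back up to the mixture.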
