TR2005-136

Recognizing Speech from Simultaneous Speakers

- Raj, B., Singh, R., Smaragdis, P., "Recognizing Speech from Simultaneous Speakers", Eurospeech, September 2005.
  BibTeX:
  @inproceedings{Raj2005sep,
    author = {Raj, B. and Singh, R. and Smaragdis, P.},
    title = {Recognizing Speech from Simultaneous Speakers},
    booktitle = {Eurospeech},
    year = 2005,
    month = sep,
    url = {https://www.merl.com/publications/TR2005-136}
  }
Research Areas:

Artificial Intelligence, Speech & Audio

Abstract:

In this paper we present and evaluate factored methods for recognition of simultaneous speech from multiple speakers in single-channel recordings. Factored methods decompose the problem of jointly recognizing the speech from each of the speakers by separately recognizing the speech from each speaker. In order to achieve this, the signal components of the target speaker in each case must be enhanced in some manner. We do this in two ways: using an NMF-based speaker separation algorithm that generates separated spectra for each speaker, and a mask estimation method that generates spectral masks for each speaker that must be used in conjunction with a missing-feature method that can recognize speech from partial spectral data. Experiments on synthetic mixtures of signals from the Wall Street Journal corpus show that both approaches can greatly improve the recognition of the individual signals in the mixture.
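
The two enhancement routes named in the abstract (separated spectra and spectral masks) can be illustrated with a generic NMF sketch. This is not the paper's algorithm: it assumes that non-negative spectral bases for each speaker have already been learned from clean speech, uses the standard Euclidean multiplicative update for the activations, and derives soft masks as a simple ratio of the reconstructions. The function names (`estimate_activations`, `separate_two_speakers`) and all parameters are hypothetical.

```python
import numpy as np


def estimate_activations(V, W, n_iter=200, eps=1e-9, seed=0):
    """Estimate non-negative H such that V ~= W @ H, with W held fixed.

    Uses the standard multiplicative update for the Euclidean NMF cost.
    V: (n_freq, n_frames) magnitude spectrogram of the mixture.
    W: (n_freq, n_bases) non-negative bases (assumed pre-trained).
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    WtV = W.T @ V
    WtW = W.T @ W
    for _ in range(n_iter):
        H *= WtV / (WtW @ H + eps)
    return H


def separate_two_speakers(V, W_a, W_b, n_iter=200, eps=1e-9):
    """Split a mixture spectrogram into per-speaker spectra and soft masks.

    W_a, W_b: per-speaker bases (an assumption of this sketch).
    Returns (S_a, S_b, mask_a, mask_b), each shaped like V.
    """
    W = np.hstack([W_a, W_b])
    H = estimate_activations(V, W, n_iter=n_iter, eps=eps)
    k = W_a.shape[1]
    S_a = W_a @ H[:k]   # separated spectrum attributed to speaker A
    S_b = W_b @ H[k:]   # separated spectrum attributed to speaker B
    total = S_a + S_b + eps
    mask_a = S_a / total  # soft spectral mask for speaker A
    mask_b = S_b / total  # soft spectral mask for speaker B
    return S_a, S_b, mask_a, mask_b
```

In a pipeline of this kind, the separated spectra would be passed to a conventional recognizer, while thresholding a soft mask (for example `mask_a > 0.5`) would yield the reliable/unreliable labels that a missing-feature recognizer expects. Both uses are assumptions about how the pieces fit together, not details taken from the paper.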

Related News & Events

NEWS Eurospeech 2005: 2 publications by MERL researchers and others
Date: September 4, 2005
Where: Eurospeech
Brief
- The papers "Bandwidth Expansion of Narrowband Speech Using non-Negative Matrix Factorization" by Bansal, D., Raj, B. and Smaragdis, P. and "Recognizing Speech from Simultaneous Speakers" by Raj, B., Singh, R. and Smaragdis, P. were presented at Eurospeech.
