TR2004-088

Likelihood-Maximizing Beamforming for Robust Hands-Free Speech Recognition

- Seltzer, M.L., Raj, B., Stern, R.M., "Likelihood-Maximizing Beamforming for Robust Hands-Free Speech Recognition", IEEE Transactions on Speech and Audio Processing, Vol. 12, No. 5, pp. 489-498, September 2004.
  BibTeX:
    @article{Seltzer2004sep1,
      author = {Seltzer, M.L. and Raj, B. and Stern, R.M.},
      title = {{Likelihood-Maximizing Beamforming for Robust Hands-Free Speech Recognition}},
      journal = {IEEE Transactions on Speech and Audio Processing},
      year = 2004,
      volume = 12,
      number = 5,
      pages = {489--498},
      month = sep,
      note = {Awarded Best Young Author, March 2007},
      issn = {1063-6676},
      url = {https://www.merl.com/publications/TR2004-088}
    }
Research Areas:

Artificial Intelligence, Speech & Audio

Abstract:

Speech recognition performance degrades significantly in distant-talking environments, where the speech signals can be severely distorted by additive noise and reverberation. In such environments, the use of microphone arrays has been proposed as a means of improving the quality of captured speech signals. Currently, microphone-array-based speech recognition is performed in two independent stages: array processing and then recognition. Array processing algorithms designed for signal enhancement are applied in order to reduce the distortion in the speech waveform prior to feature extraction and recognition. This approach assumes that improving the quality of the speech waveform will necessarily result in improved recognition performance, and it ignores the manner in which speech recognition systems operate. In this paper, a new approach to microphone-array processing is proposed in which the goal of the array processing is not to generate an enhanced output waveform but rather to generate a sequence of features which maximizes the likelihood of generating the correct hypothesis. In this approach, called Likelihood-Maximizing Beamforming (LIMABEAM), information from the speech recognition system itself is used to optimize a filter-and-sum beamformer. Speech recognition experiments performed in a real distant-talking environment confirm the efficacy of the proposed approach.
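
For readers unfamiliar with the filter-and-sum structure mentioned in the abstract, the following is a minimal Python/NumPy sketch of a generic filter-and-sum beamformer: each microphone channel is passed through its own FIR filter and the filtered channels are summed. It is an illustration only, not the authors' implementation. The function names `filter_and_sum` and `delay_and_sum_filters`, the array shapes, and the fixed delay-based taps are all assumptions made for the example. In LIMABEAM the taps would instead be chosen to maximize the recognizer's likelihood, and that optimization is not shown here.

```python
import numpy as np


def filter_and_sum(signals, filters):
    """Generic filter-and-sum beamformer (illustrative sketch).

    signals: array of shape (M, N), one row per microphone.
    filters: array of shape (M, P), one FIR filter per microphone.
    Returns the length-N beamformer output.
    """
    signals = np.asarray(signals, dtype=float)
    filters = np.asarray(filters, dtype=float)
    if signals.shape[0] != filters.shape[0]:
        raise ValueError("one filter per microphone is required")
    n_samples = signals.shape[1]
    output = np.zeros(n_samples)
    for x_m, h_m in zip(signals, filters):
        # Causal FIR filtering: full convolution truncated to the input length.
        output += np.convolve(x_m, h_m)[:n_samples]
    return output


def delay_and_sum_filters(delays, n_taps):
    """Hypothetical helper: fixed taps that reduce filter-and-sum to
    delay-and-sum (a scaled unit impulse at each channel's delay)."""
    filters = np.zeros((len(delays), n_taps))
    for m, d in enumerate(delays):
        filters[m, d] = 1.0 / len(delays)
    return filters


# Toy usage: microphone 0 hears the source two samples later than microphone 1.
source = np.arange(1, 9, dtype=float)
mic0 = np.concatenate([[0.0, 0.0], source[:-2]])
mic1 = source.copy()
taps = delay_and_sum_filters([0, 2], n_taps=3)  # delay mic1 to align with mic0
aligned = filter_and_sum(np.stack([mic0, mic1]), taps)
```

With these fixed taps the two channels are time-aligned and averaged, which is the classical delay-and-sum special case. A likelihood-maximizing variant would treat the entries of `filters` as free parameters and adjust them using feedback from the recognizer.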

Related News & Events

AWARD IEEE Young Author Best Paper Award
Date: March 16, 2007
Awarded to: Michael Seltzer
Awarded for: "Likelihood-Maximizing Beamforming for Robust Hands-Free Speech Recognition"
Awarded by: IEEE Signal Processing Society
Research Area: Speech & Audio
NEWS IEEE Transactions on Speech and Audio Processing: publication by MERL researchers and others
Date: September 30, 2004
Where: IEEE Transactions on Speech and Audio Processing
Research Area: Speech & Audio
Brief
- The article "Likelihood-Maximizing Beamforming for Robust Hands-Free Speech Recognition" by Seltzer, M.L., Raj, B. and Stern, R.M. was published in IEEE Transactions on Speech and Audio Processing.
