Speech Recognizer Based Maximum Likelihood Beamforming

    •  Raj, B., Seltzer, M.L., Reyes-Gomez, M.J., "Speech Recognizer Based Maximum Likelihood Beamforming", NSF Workshop on Perspectives on Speech Separation, October 2003.
      BibTeX TR2003-87 PDF
      @inproceedings{Raj2003oct,
        author = {Raj, B. and Seltzer, M.L. and Reyes-Gomez, M.J.},
        title = {Speech Recognizer Based Maximum Likelihood Beamforming},
        booktitle = {NSF Workshop on Perspectives on Speech Separation},
        year = 2003,
        month = oct,
        url = {}
      }
  • Research Areas: Artificial Intelligence, Speech & Audio


In this paper we present a speech-recognizer-based maximum-likelihood beamforming technique that can be used for both signal enhancement and speaker separation. The technique uses an HMM-based speech recognizer as a statistical model of the target signal to be enhanced or separated. The parameters of a filter-and-sum array processor are estimated to maximize the likelihood of the array output, as measured by the speech recognizer. The filter-and-sum operation may be performed in either the time domain or the frequency domain. When the technique is used for speaker separation, the beamforming must be performed individually for each speaker. Because the competing signal in this case is also in-domain speech, the statistical model used for beamforming becomes a factorial HMM formed from the HMM for the target and the HMM(s) for the competing speaker(s).
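
The idea of choosing filter-and-sum parameters by likelihood can be illustrated with a minimal time-domain sketch. It is not the authors' implementation. The function names (`filter_and_sum`, `gaussian_log_likelihood`, `pick_best_filters`) are hypothetical. A per-sample Gaussian with known means stands in for the HMM-based recognizer, which in the paper scores recognition features rather than raw samples. A search over a small candidate list stands in for the actual parameter estimation.

```python
import math


def filter_and_sum(channels, filters):
    """Filter each microphone channel with its own FIR filter, then sum.

    channels: list of equal-length sample lists, one per microphone.
    filters:  list of FIR tap lists, one per microphone.
    """
    if len(channels) != len(filters):
        raise ValueError("need exactly one filter per channel")
    n = len(channels[0])
    out = [0.0] * n
    for x, h in zip(channels, filters):
        for t in range(n):
            # Causal convolution: samples before t = 0 are treated as zero.
            for k, tap in enumerate(h):
                if t - k >= 0:
                    out[t] += tap * x[t - k]
    return out


def gaussian_log_likelihood(y, means, var=1.0):
    """Stand-in for the recognizer score (assumption): independent Gaussians
    with the given per-sample means and a shared variance."""
    const = math.log(2.0 * math.pi * var)
    return sum(-0.5 * (const + (v - m) ** 2 / var) for v, m in zip(y, means))


def pick_best_filters(channels, candidates, means, var=1.0):
    """Return the index of the candidate filter set whose filter-and-sum
    output has the highest likelihood under the stand-in model."""
    scores = [
        gaussian_log_likelihood(filter_and_sum(channels, f), means, var)
        for f in candidates
    ]
    return max(range(len(candidates)), key=lambda i: scores[i])


# Toy two-microphone example: the second microphone hears the source one
# sample later than the first.
source = [1.0, -2.0, 3.0, 0.5, 0.0]
mic0 = source
mic1 = [0.0] + source[:-1]

candidates = [
    ([0.5], [0.5]),       # plain averaging, channels left misaligned
    ([0.0, 0.5], [0.5]),  # delay mic0 by one sample, then average
]
best = pick_best_filters([mic0, mic1], candidates, means=mic1)
print(best)
```

In this toy case the second candidate aligns the two channels before averaging, so its output matches the model means and it receives the higher score. For speaker separation the abstract replaces the single-source model with a factorial HMM, which this sketch does not attempt to model.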

