TR2004-086

A Bayesian Classifier for Spectrographic Mask Estimation for Missing Feature Speech Recognition


    •  Seltzer, M.L., Raj, B., Stern, R.M., "A Bayesian Classifier for Spectrographic Mask Estimation for Missing Feature Speech Recognition", Speech Communication, Vol. 43, No. 4, pp. 379-393, September 2004.
      @article{Seltzer2004sep2,
        author = {Seltzer, M.L. and Raj, B. and Stern, R.M.},
        title = {A Bayesian Classifier for Spectrographic Mask Estimation for Missing Feature Speech Recognition},
        journal = {Speech Communication},
        year = 2004,
        volume = 43,
        number = 4,
        pages = {379--393},
        month = sep,
        url = {https://www.merl.com/publications/TR2004-086}
      }
  • Research Areas:

    Artificial Intelligence, Speech & Audio

Abstract:

Missing feature methods of noise compensation for speech recognition operate by first identifying components of a spectrographic representation of speech that are considered to be corrupt. Recognition is then performed either using only the remaining reliable components or after reconstructing the corrupt components. These methods require a spectrographic mask that accurately labels the reliable and corrupt regions of the spectrogram. Depending on the missing feature method applied, these masks must contain either binary values or probabilistic values. Current mask estimation techniques rely on explicit estimation of the characteristics of the corrupting noise. The estimation process usually assumes that the noise is pseudo-stationary or varies slowly with time. This is a significant drawback, since the missing feature methods themselves have no such restrictions. We present a new mask estimation technique that uses a Bayesian classifier to determine the reliability of spectrographic elements. The features used for classification were designed to make no assumptions about the corrupting noise signal and instead exploit characteristics of the speech signal itself. Experiments were performed on speech corrupted by a variety of noises, using missing feature compensation methods that require binary masks and methods that require probabilistic masks. In all cases, the proposed Bayesian mask estimation method resulted in significantly better recognition accuracy than conventional mask estimation approaches.
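
As a rough illustration of how a Bayesian classifier can yield both kinds of mask, the sketch below computes a posterior probability of reliability for each spectrogram element and thresholds it to obtain a binary mask. It is not the paper's implementation: the abstract does not specify the features or the class-conditional model, so the diagonal-covariance Gaussian model, the two-dimensional synthetic features, and all function names here are assumptions made for illustration only.

```python
import numpy as np

def fit_gaussian(x):
    """Fit a diagonal-covariance Gaussian to rows of x (n_samples, n_features)."""
    # A small floor keeps the variance strictly positive.
    return x.mean(axis=0), x.var(axis=0) + 1e-6

def log_likelihood(x, mean, var):
    """Log density of x (..., n_features) under a diagonal Gaussian."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var, axis=-1)

def soft_mask(features, reliable, corrupt, prior_reliable=0.5):
    """Posterior P(reliable | features) for every spectrogram element."""
    log_r = np.log(prior_reliable) + log_likelihood(features, *reliable)
    log_c = np.log(1.0 - prior_reliable) + log_likelihood(features, *corrupt)
    # Bayes' rule in the log domain; clipping avoids overflow in exp.
    return 1.0 / (1.0 + np.exp(np.clip(log_c - log_r, -50.0, 50.0)))

def binary_mask(posterior, threshold=0.5):
    """Label an element reliable when its posterior exceeds the threshold."""
    return posterior > threshold

# Synthetic training data (hypothetical): labelled feature vectors per class.
rng = np.random.default_rng(0)
reliable = fit_gaussian(rng.normal(2.0, 1.0, size=(500, 2)))
corrupt = fit_gaussian(rng.normal(-2.0, 1.0, size=(500, 2)))

# A toy "spectrogram" of 3 frames x 4 frequency bands, 2 features per element.
features = np.full((3, 4, 2), -2.0)
features[:, :2, :] = 2.0  # lower bands look reliable in this toy example

posterior = soft_mask(features, reliable, corrupt)  # probabilistic mask
mask = binary_mask(posterior)                       # binary mask
```

The probabilistic mask would suit missing feature methods that accept soft reliability values, while the thresholded version would suit methods that require a hard reliable/corrupt decision.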

 

  • Related News & Events

    •  NEWS    Speech Communication: 2 publications by MERL researchers and others
      Date: September 12, 2004
      Where: Speech Communication
      Brief
      • The articles "Reconstruction of Missing Features for Robust Speech Recognition" by Raj, B., Seltzer, M.L. and Stern, R.M. and "A Bayesian Classifier for Spectrographic Mask Estimation for Missing Feature Speech Recognition" by Seltzer, M.L., Raj, B. and Stern, R.M. were published in Speech Communication.