TR2005-160

Reconstructing Spectral Vectors with Uncertain Spectrographic Masks for Robust Speech Recognition


    •  Raj, B., Singh, R., "Reconstructing Spectral Vectors with Uncertain Spectrographic Masks for Robust Speech Recognition", IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), November 2005, pp. 27-32.
      @inproceedings{Raj2005nov,
        author = {Raj, B. and Singh, R.},
        title = {Reconstructing Spectral Vectors with Uncertain Spectrographic Masks for Robust Speech Recognition},
        booktitle = {IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)},
        year = 2005,
        pages = {27--32},
        month = nov,
        url = {https://www.merl.com/publications/TR2005-160}
      }
  • Research Areas:

    Artificial Intelligence, Speech & Audio

Figure: Soft mask from MaxVQ.
Abstract:

Missing-feature methods improve automatic recognition of noisy speech by removing unreliable, noise-corrupted spectrographic components from the signal. Recognition is performed either by modifying the recognizer to work from incomplete spectra, or by estimating the missing components to reconstruct complete spectra. While the former approach performs optimal classification with incomplete spectrograms, the latter permits recognition with cepstral features derived from reconstructed spectra. Traditionally, spectral components are considered unequivocally reliable or unreliable. Research has shown that soft masks, which instead assign each spectral component a probability of being reliable, can improve the performance of missing-feature methods that modify the recognizer. However, soft masks have not been employed by methods that reconstruct the spectrogram. In this paper we present a new MMSE algorithm for spectrogram reconstruction that uses soft masks. Experiments show that the use of soft masks results in significantly improved performance as compared to reconstruction methods that use binary masks.
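
To make the idea of soft-mask reconstruction concrete, the following is a minimal illustrative sketch, not the algorithm from the paper. It assumes a simplified model: each log-spectral component is reliable with probability p (in which case the clean value equals the observed value), and otherwise the clean value follows a single Gaussian prior truncated above by the observed value, since additive noise can only raise the energy. Under that assumed model the MMSE estimate is the probability-weighted mix of the two cases. The function names, the single-Gaussian prior, and all numbers are hypothetical.

```python
import math


def truncated_mean(mu, sigma, upper):
    """Mean of a Gaussian N(mu, sigma^2) conditioned on x <= upper."""
    a = (upper - mu) / sigma
    pdf = math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))
    if cdf < 1e-12:
        # Bound lies far below the prior mean; the conditional mean
        # is then approximately the bound itself.
        return upper
    return mu - sigma * pdf / cdf


def reconstruct(noisy, mask, prior_mean, prior_std):
    """Soft-mask estimate per component (assumed model, see lead-in).

    noisy      : observed log-spectral values
    mask       : probability that each component is reliable (0..1)
    prior_mean : assumed Gaussian prior mean of clean speech
    prior_std  : assumed Gaussian prior standard deviation
    """
    estimate = []
    for y, p, mu, sigma in zip(noisy, mask, prior_mean, prior_std):
        # Expected clean value if the component is unreliable.
        bounded = truncated_mean(mu, sigma, y)
        # Weight the reliable and unreliable cases by the mask.
        estimate.append(p * y + (1.0 - p) * bounded)
    return estimate


if __name__ == "__main__":
    noisy = [2.0, 0.0, 1.5]
    prior_mean = [0.0, 0.0, 1.0]
    prior_std = [1.0, 1.0, 0.5]
    binary_mask = [1.0, 0.0, 0.0]
    soft_mask = [0.9, 0.5, 0.2]
    print(reconstruct(noisy, binary_mask, prior_mean, prior_std))
    print(reconstruct(noisy, soft_mask, prior_mean, prior_std))
```

A binary mask is the special case where every mask value is 0 or 1, so the estimate switches between the observation and the bounded prior mean rather than blending them.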

 

  • Related News & Events

    •  NEWS    ASRU 2005: 2 publications by MERL researchers and others
      Date: November 28, 2005
      Where: IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
      Brief
      • The papers "A Robust Voice Activity Detector Using an Acoustic Doppler Radar" by Hu, R. and Raj, B. and "Reconstructing Spectral Vectors with Uncertain Spectrographic Masks for Robust Speech Recognition" by Raj, B. and Singh, R. were presented at the IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).