TR2005-160

Reconstructing Spectral Vectors with Uncertain Spectrographic Masks for Robust Speech Recognition
Citation: Raj, B.; Singh, R., "Reconstructing Spectral Vectors with Uncertain Spectrographic Masks for Robust Speech Recognition", IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp. 27-32, November 2005 (IEEE Xplore)
Date: November 2005
MERL Contact: Bhiksha Raj

Missing-feature methods improve automatic recognition of noisy speech by removing unreliable, noise-corrupted spectrographic components from the signal. Recognition is then performed either by modifying the recognizer to work from incomplete spectra or by estimating the missing components to reconstruct complete spectra. While the former approach performs optimal classification with incomplete spectrograms, the latter permits recognition with cepstral features derived from the reconstructed spectra. Traditionally, spectral components are considered unequivocally reliable or unreliable. Research has shown that soft masks, which instead assign each spectral component a probability of being reliable, can improve the performance of missing-feature methods that modify the recognizer. However, soft masks have not been employed by methods that reconstruct the spectrogram. In this paper we present a new minimum mean-squared error (MMSE) algorithm for spectrogram reconstruction that uses soft masks. Experiments show that the use of soft masks results in significantly improved performance compared to reconstruction methods that use binary masks.
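
To make the difference between binary and soft masks concrete, the sketch below shows one way a soft mask could enter an MMSE-style estimate of a single spectral vector. It is an illustration only and is not the algorithm of the report, whose details the abstract does not give. Everything here is an assumption made for the example: the function names (bounded_mean, reconstruct), the single diagonal Gaussian prior (mu, sigma) over clean log-spectral components, and the premise that noise only adds energy, so the clean value of an unreliable component is bounded above by the observed noisy value.

```python
import math


def _std_normal_pdf(z):
    # Density of the standard normal distribution at z.
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _std_normal_cdf(z):
    # Cumulative distribution of the standard normal at z.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bounded_mean(y, mu, sigma):
    """E[x | x <= y] for x ~ N(mu, sigma^2): mean of a Gaussian truncated at y.

    Assumption for this sketch: the observed noisy value y is an upper
    bound on the clean value x of an unreliable component.
    """
    z = (y - mu) / sigma
    cdf = _std_normal_cdf(z)
    if cdf < 1e-12:
        # The bound lies far below the prior; fall back to the bound itself.
        return y
    return mu - sigma * _std_normal_pdf(z) / cdf


def reconstruct(noisy, mask, mu, sigma):
    """Soft-mask estimate of one spectral vector (illustrative only).

    noisy: observed log-spectral components
    mask:  probability in [0, 1] that each component is reliable
    mu, sigma: assumed per-component Gaussian prior on clean speech

    With probability p the component is reliable and the estimate is the
    observation; with probability 1 - p it is unreliable and the estimate
    is the bounded prior mean. The result is the weighted combination.
    """
    return [
        p * y + (1.0 - p) * bounded_mean(y, m, s)
        for y, p, m, s in zip(noisy, mask, mu, sigma)
    ]


if __name__ == "__main__":
    noisy = [10.0, 12.0]
    mu = [10.0, 8.0]
    sigma = [2.0, 1.0]
    print(reconstruct(noisy, [1.0, 0.0], mu, sigma))  # binary mask
    print(reconstruct(noisy, [0.5, 0.5], mu, sigma))  # soft mask
```

A binary mask is the special case in which every mask value is 0 or 1, so each component is either copied from the observation or replaced entirely by the bounded prior estimate. A soft mask interpolates between the two according to how confident the mask estimator is.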

 Read the full technical report (PDF: 174 kB)