TR2014-104

Discriminatively Trained Recurrent Neural Networks for Single-Channel Speech Separation


    •  Weninger, F., Le Roux, J., Hershey, J.R., Schuller, B., "Discriminatively Trained Recurrent Neural Networks for Single-Channel Speech Separation", IEEE Global Conference on Signal and Information Processing (GlobalSIP), DOI: 10.1109/GlobalSIP.2014.7032183, December 2014, pp. 577-581.
      @inproceedings{Weninger2014dec,
        author = {Weninger, F. and {Le Roux}, J. and Hershey, J.R. and Schuller, B.},
        title = {Discriminatively Trained Recurrent Neural Networks for Single-Channel Speech Separation},
        booktitle = {IEEE Global Conference on Signal and Information Processing (GlobalSIP)},
        year = 2014,
        pages = {577--581},
        month = dec,
        publisher = {IEEE},
        doi = {10.1109/GlobalSIP.2014.7032183},
        url = {https://www.merl.com/publications/TR2014-104}
      }
  • MERL Contact:
  • Research Areas: Artificial Intelligence, Speech & Audio

This paper describes an in-depth investigation of training criteria, network architectures, and feature representations for regression-based single-channel speech separation with deep neural networks (DNNs). We use a generic discriminative training criterion corresponding to optimal source reconstruction from time-frequency masks, and introduce its application to speech separation in a reduced feature space (the Mel domain). A comparative evaluation of time-frequency mask estimation by DNNs, recurrent DNNs, and non-negative matrix factorization on the 2nd CHiME Speech Separation and Recognition Challenge shows consistent improvements from discriminative training, with long short-term memory recurrent DNNs obtaining the overall best results. Furthermore, our results confirm the importance of fine-tuning the feature representation for DNN training.
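The discriminative criterion described above can be illustrated with a minimal sketch: rather than penalizing the error of the estimated mask itself, the loss is computed on the source reconstructed by applying the mask to the mixture spectrogram. The function and variable names below (`signal_approximation_loss`, `mel_fb`, etc.) are illustrative assumptions, not the paper's actual implementation, and the Mel-domain variant simply projects both spectra through a filterbank before computing the error.

```python
import numpy as np

def signal_approximation_loss(mask, mixture_mag, source_mag):
    """Discriminative (signal-approximation) objective: penalize the
    reconstruction error of the masked mixture against the target
    source magnitude, rather than the mask error directly."""
    estimate = mask * mixture_mag  # source estimate via time-frequency masking
    return float(np.mean((estimate - source_mag) ** 2))

def mel_signal_approximation_loss(mask, mixture_mag, source_mag, mel_fb):
    """Hypothetical reduced-feature-space variant: compare estimate and
    target after projection onto a Mel filterbank (mel_fb: bands x freq)."""
    estimate = mask * mixture_mag
    return float(np.mean((mel_fb @ estimate - mel_fb @ source_mag) ** 2))
```

When the mask perfectly reconstructs the source from the mixture, both losses vanish; in training, gradients of this loss would be backpropagated through the mask-producing network.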


  • Related News & Events

    •  NEWS   IEEE Spectrum's "Cars That Think" highlights MERL's speech enhancement research
      Date: March 9, 2015
      MERL Contact: Jonathan Le Roux
      Research Area: Speech & Audio
      Brief
      • Recent research on speech enhancement by MERL's Speech and Audio team was highlighted in "Cars That Think", IEEE Spectrum's blog on smart technologies for cars. IEEE Spectrum is the flagship publication of the Institute of Electrical and Electronics Engineers (IEEE), the world's largest association of technical professionals with more than 400,000 members.
    •  NEWS   MERL's noise suppression technology featured in Mitsubishi Electric Corporation press release
      Date: February 17, 2015
      MERL Contact: Jonathan Le Roux
      Research Area: Speech & Audio
      Brief
      • Mitsubishi Electric Corporation announced that it has developed breakthrough noise-suppression technology that significantly improves the quality of hands-free voice communication in noisy conditions, such as making a voice call via a car navigation system. Speech clarity is improved by removing 96% of surrounding sounds, including rapidly changing noise from turn signals or wipers, which is difficult to suppress using conventional methods. The technology is based on recent research on speech enhancement by MERL's Speech and Audio team.