NEWS    MERL's speech research featured in NPR's All Things Considered

Date released: February 9, 2018


  • Date:

    February 5, 2018

  • Where:

    National Public Radio (NPR)

  • Description:

    MERL's speech separation technology was featured on NPR's All Things Considered, as part of "Can Computers Learn Like Humans?", an episode of All Tech Considered on artificial intelligence. An example in which the technology separated the overlapping speech of two of the show's hosts was played on the air.
    The technology is based on a proprietary deep learning method called Deep Clustering. It is the world's first technology that separates in real time the simultaneous speech of multiple unknown speakers recorded with a single microphone. It is a key step towards building machines that can interact in noisy environments, in the same way that humans can have meaningful conversations in the presence of many other conversations.
    A live demonstration was featured in Mitsubishi Electric Corporation's Annual R&D Open House last year, and was also covered in international media at the time.

    (Photo credit: Sam Rowe for NPR)

    Links:
    "Can Computers Learn Like Humans?" (NPR, All Things Considered)
    MERL Deep Clustering Demo

  • Research Area:

    Speech & Audio

    •  Isik, Y., Le Roux, J., Chen, Z., Watanabe, S., Hershey, J.R., "Single-Channel Multi-Speaker Separation using Deep Clustering", Interspeech, DOI: 10.21437/Interspeech.2016-1176, September 2016, pp. 545-549.
      @inproceedings{Isik2016sep,
        author = {Isik, Yusuf and Le Roux, Jonathan and Chen, Zhuo and Watanabe, Shinji and Hershey, John R.},
        title = {Single-Channel Multi-Speaker Separation using Deep Clustering},
        booktitle = {Interspeech},
        year = 2016,
        pages = {545--549},
        month = sep,
        doi = {10.21437/Interspeech.2016-1176},
        url = {https://www.merl.com/publications/TR2016-073}
      }
    •  Hershey, J.R., Chen, Z., Le Roux, J., Watanabe, S., "Deep Clustering: Discriminative Embeddings for Segmentation and Separation", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), DOI: 10.1109/ICASSP.2016.7471631, March 2016, pp. 31-35.
      @inproceedings{Hershey2016mar,
        author = {Hershey, John R. and Chen, Zhuo and Le Roux, Jonathan and Watanabe, Shinji},
        title = {Deep Clustering: Discriminative Embeddings for Segmentation and Separation},
        booktitle = {IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
        year = 2016,
        pages = {31--35},
        month = mar,
        doi = {10.1109/ICASSP.2016.7471631},
        url = {https://www.merl.com/publications/TR2016-003}
      }