News & Events

46 were found.




  •  NEWS   MERL's breakthrough speech separation technology featured in Mitsubishi Electric Corporation's Annual R&D Open House
    Date: May 24, 2017
    Where: Tokyo, Japan
    MERL Contacts: Bret Harsham; John Hershey; Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
    Brief
    • Mitsubishi Electric Corporation announced that it has created the world's first technology that separates in real time the simultaneous speech of multiple unknown speakers recorded with a single microphone. It's a key step towards building machines that can interact in noisy environments, in the same way that humans can have meaningful conversations in the presence of many other conversations. In tests, the simultaneous speech of two and three people was separated with up to 90 and 80 percent accuracy, respectively. The novel technology, which was realized with Mitsubishi Electric's proprietary "Deep Clustering" method based on artificial intelligence (AI), is expected to contribute to more intelligible voice communications and more accurate automatic speech recognition. A characteristic feature of this approach is its versatility, in the sense that voices can be separated regardless of their language or the gender of the speakers. A live speech separation demonstration that took place on May 24 in Tokyo, Japan, was widely covered by the Japanese media, with reports by three of the main Japanese TV stations and multiple articles in print and online newspapers. The technology is based on recent research by MERL's Speech and Audio team.
      Links:
      Mitsubishi Electric Corporation Press Release
      MERL Deep Clustering Demo

      Media Coverage:

      Fuji TV, News, "Minna no Mirai" (Japanese)
      The Nikkei (Japanese)
      Nikkei Technology Online (Japanese)
      Sankei Biz (Japanese)
      EE Times Japan (Japanese)
      ITpro (Japanese)
      Nikkan Sports (Japanese)
      Nikkan Kogyo Shimbun (Japanese)
      Dempa Shimbun (Japanese)
      Il Sole 24 Ore (Italian)
      IEEE Spectrum (English)
  •  
  •  NEWS   MERL to present 10 papers at ICASSP 2017
    Date: March 5, 2017 - March 9, 2017
    Where: New Orleans
    MERL Contacts: Petros Boufounos; Chen Feng; John Hershey; Takaaki Hori; Jonathan Le Roux; Dehong Liu; Hassan Mansour; Dong Tian; Anthony Vetro; Ye Wang
    Research Areas: Multimedia, Computer Vision, Computational Geometry, Computational Sensing, Digital Video, Information Security, Speech & Audio
    Brief
    • MERL researchers will present 10 papers at the upcoming IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP), to be held in New Orleans from March 5-9, 2017. Topics to be presented include recent advances in speech recognition and audio processing; graph signal processing; computational imaging; and privacy-preserving data analysis.

      ICASSP is the flagship conference of the IEEE Signal Processing Society, and the world's largest and most comprehensive technical conference focused on the research advances and latest technological development in signal and information processing. The event attracts more than 2000 participants each year.
  •  
  •  EVENT   John Hershey to present tutorial at the 2016 IEEE SLT Workshop
    Date: Tuesday, December 13, 2016
    Speaker: John Hershey, MERL
    MERL Contacts: John Hershey; Jonathan Le Roux
    Location: 2016 IEEE Spoken Language Technology Workshop, San Diego, California
    Research Areas: Multimedia, Machine Learning, Speech & Audio
    Brief
    • MERL researcher John Hershey presents an invited tutorial at the 2016 IEEE Workshop on Spoken Language Technology, in San Diego, California. The topic, "developing novel deep neural network architectures from probabilistic models", stems from MERL work with collaborators Jonathan Le Roux and Shinji Watanabe on a principled framework that seeks to improve our understanding of deep neural networks, and draws inspiration for new types of deep networks from the arsenal of principles and tools developed over the years for conventional probabilistic models. The tutorial covers a range of parallel ideas in the literature that have formed a recent trend, as well as their application to speech and language.
  •  
  •  EVENT   SANE 2016 - Speech and Audio in the Northeast
    Date: Friday, October 21, 2016
    MERL Contacts: John Hershey; Jonathan Le Roux
    Location: MIT, McGovern Institute for Brain Research, Cambridge, MA
    Research Areas: Multimedia, Speech & Audio
    Brief
    • SANE 2016, a one-day event gathering researchers and students in speech and audio from the Northeast of the American continent, will be held on Friday October 21, 2016 at MIT's Brain and Cognitive Sciences Department, at the McGovern Institute for Brain Research, in Cambridge, MA.

      It is a follow-up to SANE 2012 (Mitsubishi Electric Research Labs - MERL), SANE 2013 (Columbia University), SANE 2014 (MIT CSAIL), and SANE 2015 (Google NY). Since the first edition, the audience has steadily grown, gathering 140 researchers and students in 2015.

      SANE 2016 will feature invited talks by leading researchers: Juan P. Bello (NYU), William T. Freeman (MIT/Google), Nima Mesgarani (Columbia University), Dan Ellis (Google), Shinji Watanabe (MERL), Josh McDermott (MIT), and Jesse Engel (Google). It will also feature a lively poster session during lunch time, open to both students and researchers.

      SANE 2016 is organized by Jonathan Le Roux (MERL), Josh McDermott (MIT), Jim Glass (MIT), and John R. Hershey (MERL).
  •  
  •  NEWS   MERL Speech & Audio researchers present two sold-out tutorials at Interspeech 2016
    Date: September 8, 2016
    Where: Interspeech 2016, San Francisco, CA
    MERL Contact: Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
    Brief
    • MERL Speech and Audio Team researchers Shinji Watanabe and Jonathan Le Roux presented two tutorials on September 8 at the Interspeech 2016 conference, held in San Francisco, CA. Shinji collaborated with Marc Delcroix (NTT Communication Science Laboratories, Japan) to deliver a three-hour lecture on "Recent Advances in Distant Speech Recognition", drawing upon their experience organizing and participating in six different recent robust speech processing challenges. Jonathan teamed with Emmanuel Vincent (Inria, France) and Hakan Erdogan (Sabanci University, Microsoft Research) to give an in-depth tour of the latest advances in "Learning-based Approaches to Speech Enhancement And Separation". This collaboration stemmed from extensive stays at MERL by Emmanuel and Hakan, Emmanuel as a summer visitor, and Hakan as a MERL visiting research scientist for over a year while on sabbatical.

      Both tutorials were sold out, each attracting more than 100 researchers and students in related fields, and received high praise from audience members.
  •  
  •  NEWS   MERL researchers present 12 papers at ICASSP 2016
    Date: March 20, 2016 - March 25, 2016
    Where: Shanghai, China
    MERL Contacts: Petros Boufounos; John Hershey; Chiori Hori; Takaaki Hori; Kyeong Jin (K.J.) Kim; Jonathan Le Roux; Dehong Liu; Hassan Mansour; Philip Orlik; Milutin Pajovic; Dong Tian; Anthony Vetro
    Research Areas: Electronics & Communications, Multimedia, Computational Sensing, Digital Video, Speech & Audio, Wireless Communications & Signal Processing, Signal Processing, Wireless Communications
    Brief
    • MERL researchers have presented 12 papers at the recent IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP), which was held in Shanghai, China from March 20-25, 2016. ICASSP is the flagship conference of the IEEE Signal Processing Society, and the world's largest and most comprehensive technical conference focused on the research advances and latest technological development in signal and information processing, with more than 1200 papers presented and over 2000 participants.
  •  
  •  NEWS   John Hershey gives invited talk at Johns Hopkins University on MERL's "Deep Clustering" breakthrough
    Date: March 4, 2016
    Where: Johns Hopkins Center for Language and Speech Processing
    MERL Contacts: John Hershey; Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
    Brief
    • MERL researcher and speech team leader, John Hershey, was invited by the Center for Language and Speech Processing at Johns Hopkins University to give a talk on MERL's breakthrough audio separation work, known as "Deep Clustering". The talk was entitled "Speech Separation by Deep Clustering: Towards Intelligent Audio Analysis and Understanding," and was given on March 4, 2016.

      This work was conducted by MERL researchers John Hershey, Jonathan Le Roux, and Shinji Watanabe, and MERL interns Zhuo Chen of Columbia University and Yusuf Isik of Sabanci University.
  •  
  •  AWARD   MERL's Speech Team Achieves World's 2nd Best Performance at the Third CHiME Speech Separation and Recognition Challenge
    Date: December 15, 2015
    Awarded to: John R. Hershey, Takaaki Hori, Jonathan Le Roux and Shinji Watanabe
    MERL Contacts: John Hershey; Takaaki Hori; Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
    Brief
    • The results of the third 'CHiME' Speech Separation and Recognition Challenge were publicly announced on December 15 at the IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2015) held in Scottsdale, Arizona, USA. MERL's Speech and Audio Team, in collaboration with SRI, ranked 2nd out of 26 teams from Europe, Asia and the US. The task this year was to recognize speech recorded using a tablet in real environments such as cafes, buses, or busy streets. Due to the high levels of noise and the distance from the speaker's mouth to the microphones, this is a very challenging task, where the baseline system only achieved a 33.4% word error rate. The MERL/SRI system featured state-of-the-art techniques including a multi-channel front-end, noise-robust feature extraction, and deep learning for speech enhancement, acoustic modeling, and language modeling, leading to a dramatic 73% relative reduction in word error rate, down to 9.1%. The core of the system has since been released as a new official challenge baseline for the community to use.
  •  
  •  EVENT   SANE 2015 - Speech and Audio in the Northeast
    Date: Thursday, October 22, 2015
    MERL Contacts: Jonathan Le Roux; John Hershey
    Location: Google, New York City, NY
    Research Areas: Multimedia, Speech & Audio
    Brief
    • SANE 2015, a one-day event gathering researchers and students in speech and audio from the Northeast of the American continent, will be held on Thursday October 22, 2015 at Google, in New York City, NY.

      It is a follow-up to SANE 2012, held at Mitsubishi Electric Research Labs (MERL), SANE 2013, held at Columbia University, and SANE 2014, held at MIT, which each gathered 70 to 90 researchers and students.

      SANE 2015 will feature invited talks by leading researchers from the Northeast, as well as from the international community: Rohit Prasad (Amazon), Michael Mandel (Brooklyn College, CUNY), Ron Weiss (Google), John Hershey (MERL), Pablo Sprechmann (NYU), Tuomas Virtanen (Tampere University of Technology), and Paris Smaragdis (UIUC). It will also feature a lively poster session during lunch time, open to both students and researchers.

      SANE 2015 is organized by Jonathan Le Roux (MERL), Hank Liao (Google), Andrew Senior (Google), and John R. Hershey (MERL).
  •  
  •  NEWS   Multimedia Group researchers presented 8 papers at ICASSP 2015
    Date: April 19, 2015 - April 24, 2015
    Where: IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP)
    MERL Contacts: Anthony Vetro; Hassan Mansour; Petros Boufounos; John Hershey; Jonathan Le Roux
    Research Area: Multimedia
    Brief
    • Multimedia Group researchers have presented 8 papers at the recent IEEE International Conference on Acoustics, Speech & Signal Processing, which was held in Brisbane, Australia from April 19-24, 2015.
  •  
  •  NEWS   IEEE Spectrum's "Cars That Think" highlights MERL's speech enhancement research
    Date: March 9, 2015
    MERL Contacts: Jonathan Le Roux; John Hershey
    Research Areas: Multimedia, Speech & Audio
    Brief
    • Recent research on speech enhancement by MERL's Speech and Audio team was highlighted in "Cars That Think", IEEE Spectrum's blog on smart technologies for cars. IEEE Spectrum is the flagship publication of the Institute of Electrical and Electronics Engineers (IEEE), the world's largest association of technical professionals with more than 400,000 members.
  •  
  •  NEWS   MERL's noise suppression technology featured in Mitsubishi Electric Corporation press release
    Date: February 17, 2015
    MERL Contacts: Jonathan Le Roux; John Hershey
    Research Areas: Multimedia, Speech & Audio
    Brief
    • Mitsubishi Electric Corporation announced that it has developed breakthrough noise-suppression technology that significantly improves the quality of hands-free voice communication in noisy conditions, such as making a voice call via a car navigation system. Speech clarity is improved by removing 96% of surrounding sounds, including rapidly changing noise from turn signals or wipers, which are difficult to suppress using conventional methods. The technology is based on recent research on speech enhancement by MERL's Speech and Audio team.
  •  
  •  EVENT   SANE 2014 - Speech and Audio in the Northeast
    Date: Thursday, October 23, 2014
    MERL Contacts: Jonathan Le Roux; John Hershey
    Location: MIT, Cambridge, MA
    Research Areas: Multimedia, Speech & Audio
    Brief
    • SANE 2014, a one-day event gathering researchers and students in speech and audio from the Northeast of the American continent, will be held on Thursday October 23, 2014 at MIT, in Cambridge, MA.

      It is a follow-up to SANE 2012, held at Mitsubishi Electric Research Labs (MERL), and SANE 2013, held at Columbia University, which each gathered around 70 researchers and students.

      SANE 2014 will feature invited talks by leading researchers from the Northeast as well as Europe: Najim Dehak (MIT), Hakan Erdogan (MERL/Sabanci University), Gael Richard (Telecom ParisTech), George Saon (IBM Research), Andrew Senior (Google Research), Stavros Tsakalidis (BBN - Raytheon), and David Wingate (Lyric). It will also feature a lively poster session during lunch time, open to both students and researchers.

      SANE 2014 is organized by Jonathan Le Roux (MERL), Jim Glass (MIT), and John R. Hershey (MERL).
  •  
  •  AWARD   Awaya Prize Young Researcher Award
    Date: March 11, 2014
    Awarded to: Yuuki Tachioka
    Awarded for: "Effectiveness of discriminative approaches for speech recognition under noisy environments on the 2nd CHiME Challenge"
    Awarded by: Acoustical Society of Japan (ASJ)
    MERL Contacts: Jonathan Le Roux; John Hershey
    Research Areas: Multimedia, Speech & Audio
    Brief
    • MELCO researcher Yuuki Tachioka received the Awaya Prize Young Researcher Award from the Acoustical Society of Japan (ASJ) for "effectiveness of discriminative approaches for speech recognition under noisy environments on the 2nd CHiME Challenge", which was based on joint work with MERL Speech & Audio team researchers Shinji Watanabe, Jonathan Le Roux and John R. Hershey.
  •  
  •  NEWS   Members of the Speech & Audio team elected to IEEE Technical Committees
    Date: January 1, 2014
    MERL Contacts: Jonathan Le Roux; John Hershey
    Research Areas: Multimedia, Speech & Audio
    Brief
    • Jonathan Le Roux, Shinji Watanabe and John R. Hershey have been elected for 3-year terms to Technical Committees of the IEEE Signal Processing Society. Jonathan has been elected to the IEEE Audio and Acoustic Signal Processing Technical Committee (AASP-TC), and Shinji and John to the Speech and Language Processing Technical Committee (SL-TC). Members of the Speech & Audio team now together hold four TC positions, as John also serves on the AASP-TC.
  •  
  •  NEWS   Prediction algorithms developed by MERL showcased for automotive HMI
    Date: February 10, 2014
    MERL Contacts: John Hershey; Bret Harsham; Jonathan Le Roux; Daniel Nikovski; Anthony Vetro
    Research Areas: Multimedia, Data Analytics
    Brief
    • Mitsubishi Electric Corporation demonstrated an ultra-simple HMI for in-car device operation using algorithms developed by MERL to predict user actions and destinations.
  •  
  •  EVENT   SANE 2013 - Speech and Audio in the Northeast
    Date & Time: Thursday, October 24, 2013; 8:45 AM - 5:00 PM
    MERL Contacts: Jonathan Le Roux; John Hershey
    Location: Columbia University
    Research Areas: Multimedia, Speech & Audio
    Brief
    • SANE 2013, a one-day event gathering researchers and students in speech and audio from the Northeast of the American continent, will be held on Thursday October 24, 2013 at Columbia University, in New York City.

      A follow-up to SANE 2012 held in October 2012 at MERL in Cambridge, MA, this year's SANE will be held in conjunction with the WASPAA workshop, held October 20-23 in upstate New York. WASPAA attendees are welcome and encouraged to attend SANE.

      SANE 2013 will feature invited speakers from the Northeast, as well as from the international community. It will also feature a lively poster session during lunch time, open to both students and researchers.

      SANE 2013 is organized by Prof. Dan Ellis (Columbia University), Jonathan Le Roux (MERL) and John R. Hershey (MERL).
  •  
  •  TALK   Efficiently sampling wave fields
    Date & Time: Thursday, October 17, 2013; 12:00 PM
    Speaker: Prof. Laurent Daudet, Paris Diderot University, France
    MERL Host: Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
    Brief
    • In acoustics, one may wish to acquire a wavefield over a whole spatial domain, while we can only make point measurements (i.e., with microphones). Even with few sources, this remains a difficult problem because of reverberation, which can be hard to characterize. This can be seen as a sampling / interpolation problem, and it raises a number of interesting questions: how many sample points are needed, where to choose the sampling points, etc. In this presentation, we will review some case studies, in 2D (vibrating plates) and 3D (room acoustics), with numerical and experimental data, where we have developed sparse models, possibly with additional 'structures', based on a physical modeling of the acoustic field. These types of models are well suited to reconstruction techniques known as compressed sensing. These principles can also be used for sub-Nyquist optical imaging: we will show preliminary experimental results of a new compressive imager, remarkably simple in its principle, using a multiply scattering medium.
  •  
  •  AWARD   Awaya Prize Young Researcher Award
    Date: September 26, 2013
    Awarded to: Jonathan Le Roux
    Awarded for: "A new non-negative dynamical system for speech and audio modeling"
    Awarded by: Acoustical Society of Japan (ASJ)
    MERL Contact: Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
  •  
  •  AWARD   CHiME 2012 Speech Separation and Recognition Challenge Best Performance
    Date: June 1, 2013
    Awarded to: Yuuki Tachioka, Shinji Watanabe, Jonathan Le Roux and John R. Hershey
    Awarded for: "Discriminative Methods for Noise Robust Speech Recognition: A CHiME Challenge Benchmark"
    Awarded by: International Workshop on Machine Listening in Multisource Environments (CHiME)
    MERL Contacts: Jonathan Le Roux; John Hershey
    Research Areas: Multimedia, Speech & Audio
    Brief
    • The results of the 2nd 'CHiME' Speech Separation and Recognition Challenge are out! The team formed by MELCO researcher Yuuki Tachioka and MERL Speech & Audio team researchers Shinji Watanabe, Jonathan Le Roux and John Hershey obtained the best results in the continuous speech recognition task (Track 2). This very challenging task consisted of recognizing speech corrupted by highly non-stationary noises recorded in a real living room. Our proposal, which also included a simple yet extremely efficient denoising front-end, focused on investigating and developing state-of-the-art automatic speech recognition back-end techniques: feature transformation methods, as well as discriminative training methods for acoustic and language modeling. Our system significantly outperformed those of other participants. Our code has since been released as an improved baseline for the community to use.
  •