News & Events

78 News items and Awards were found.




  •  NEWS   MERL Speech & Audio Researchers Presenting 7 Papers and a Tutorial at Interspeech 2019
    Date: September 15, 2019 - September 19, 2019
    Where: Graz, Austria
    MERL Contacts: Chiori Hori; Takaaki Hori; Jonathan Le Roux; Niko Moritz; Gordon Wichern
    Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio, Computer Vision
    Brief
    • MERL Speech & Audio Team researchers will be presenting 7 papers at the 20th Annual Conference of the International Speech Communication Association INTERSPEECH 2019, which is being held in Graz, Austria from September 15-19, 2019. Topics to be presented include recent advances in end-to-end speech recognition, speech separation, and audio-visual scene-aware dialog. Takaaki Hori is also co-presenting a tutorial on end-to-end speech processing.

      Interspeech is the world's largest and most comprehensive conference on the science and technology of spoken language processing. It gathers around 2000 participants from all over the world.
  •  
  •  NEWS   MERL presenting 16 papers at ICASSP 2019
    Date: May 12, 2019 - May 17, 2019
    Where: Brighton, UK
    MERL Contacts: Petros Boufounos; Anoop Cherian; Chiori Hori; Takaaki Hori; Toshiaki Koike-Akino; Jonathan Le Roux; Dehong Liu; Hassan Mansour; Tim Marks; Niko Moritz; Philip Orlik; Milutin Pajovic; Anthony Vetro; Pu (Perry) Wang; Gordon Wichern
    Research Areas: Computational Sensing, Computer Vision, Machine Learning, Signal Processing, Speech & Audio, Artificial Intelligence, Communications
    Brief
    • MERL researchers will be presenting 16 papers at the IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP), which is being held in Brighton, UK from May 12-17, 2019. Topics to be presented include recent advances in speech recognition, audio processing, scene understanding, computational sensing, and parameter estimation. MERL is also a sponsor of the conference and will be participating in the student career luncheon; please join us at the lunch to learn about our internship program and career opportunities.

      ICASSP is the flagship conference of the IEEE Signal Processing Society, and the world's largest and most comprehensive technical conference focused on the research advances and latest technological development in signal and information processing. The event attracts more than 2000 participants each year.
  •  
  •  NEWS   MERL's seamless speech recognition technology featured in Mitsubishi Electric Corporation press release
    Date: February 13, 2019
    Where: Tokyo, Japan
    MERL Contacts: Bret Harsham; Takaaki Hori; Jonathan Le Roux; Niko Moritz; Yukimasa (Yuki) Nagai; Gordon Wichern
    Research Areas: Speech & Audio, Artificial Intelligence
    Brief
  •  
  •  NEWS   Takaaki Hori leads speech technology workshop
    Date: June 25, 2018 - August 3, 2018
    Where: Johns Hopkins University, Baltimore, MD
    MERL Contacts: Takaaki Hori; Jonathan Le Roux
    Research Area: Speech & Audio
    Brief
    • MERL Speech & Audio Team researcher Takaaki Hori led a team of 27 senior researchers and Ph.D. students from different organizations around the world, working on "Multi-lingual End-to-End Speech Recognition for Incomplete Data" as part of the Jelinek Memorial Summer Workshop on Speech and Language Technology (JSALT). The JSALT workshop is a renowned 6-week hands-on workshop held yearly since 1995. This year, the workshop was held at Johns Hopkins University in Baltimore from June 25 to August 3, 2018. Takaaki's team developed new methods for end-to-end Automatic Speech Recognition (ASR) with a focus on low-resource languages with limited labelled data.

      End-to-end ASR can significantly reduce the burden of developing ASR systems for new languages, by eliminating the need for linguistic information such as pronunciation dictionaries. Some end-to-end systems have recently achieved performance comparable to or better than conventional systems in several tasks. However, the current model training algorithms basically require paired data, i.e., speech data and the corresponding transcription. Sufficient amount of such complete data is usually unavailable for minor languages, and creating such data sets is very expensive and time consuming.

      The goal of Takaaki's team project was to expand the applicability of end-to-end models to multilingual ASR, and to develop new technology that would make it possible to build highly accurate systems even for low-resource languages without a large amount of paired data. Some major accomplishments of the team include building multi-lingual end-to-end ASR systems for 17 languages, developing novel architectures and training methods for end-to-end ASR, building end-to-end ASR-TTS (Text-to-speech) chain for unpaired data training, and developing ESPnet, an open-source end-to-end speech processing toolkit. Three papers stemming from the team's work have already been accepted to the 2018 IEEE Spoken Language Technology Workshop (SLT), with several more to be submitted to upcoming conferences.
  •  
  •  AWARD   Best Student Paper Award at IEEE ICASSP 2018
    Date: April 17, 2018
    Awarded to: Zhong-Qiu Wang
    MERL Contact: Jonathan Le Roux
    Research Areas: Speech & Audio, Artificial Intelligence
    Brief
    • Former MERL intern Zhong-Qiu Wang (Ph.D. Candidate at Ohio State University) has received a Best Student Paper Award at the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018) for the paper "Multi-Channel Deep Clustering: Discriminative Spectral and Spatial Embeddings for Speaker-Independent Speech Separation" by Zhong-Qiu Wang, Jonathan Le Roux, and John Hershey. The paper presents work performed during Zhong-Qiu's internship at MERL in the summer 2017, extending MERL's pioneering Deep Clustering framework for speech separation to a multi-channel setup. The award was received on behalf on Zhong-Qiu by MERL researcher and co-author Jonathan Le Roux during the conference, held in Calgary April 15-20.
  •  
  •  NEWS   MERL presenting 9 papers at ICASSP 2018
    Date: April 15, 2018 - April 20, 2018
    Where: Calgary, AB
    MERL Contacts: Petros Boufounos; Takaaki Hori; Toshiaki Koike-Akino; Jonathan Le Roux; Dehong Liu; Hassan Mansour; Philip Orlik; Pu (Perry) Wang
    Research Areas: Computational Sensing, Digital Video, Speech & Audio, Artificial Intelligence, Signal Processing
    Brief
    • MERL researchers are presenting 9 papers at the IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP), which is being held in Calgary from April 15-20, 2018. Topics to be presented include recent advances in speech recognition, audio processing, and computational sensing. MERL is also a sponsor of the conference.

      ICASSP is the flagship conference of the IEEE Signal Processing Society, and the world's largest and most comprehensive technical conference focused on the research advances and latest technological development in signal and information processing. The event attracts more than 2000 participants each year.
  •  
  •  NEWS   MERL's speech research featured in NPR's All Things Considered
    Date: February 5, 2018
    Where: National Public Radio (NPR)
    MERL Contact: Jonathan Le Roux
    Research Areas: Speech & Audio, Artificial Intelligence
    Brief
    • MERL's speech separation technology was featured in NPR's All Things Considered, as part of an episode of All Tech Considered on artificial intelligence, "Can Computers Learn Like Humans?". An example separating the overlapped speech of two of the show's hosts was played on the air.
      The technology is based on a proprietary deep learning method called Deep Clustering. It is the world's first technology that separates in real time the simultaneous speech of multiple unknown speakers recorded with a single microphone. It is a key step towards building machines that can interact in noisy environments, in the same way that humans can have meaningful conversations in the presence of many other conversations.
      A live demonstration was featured in Mitsubishi Electric Corporation's Annual R&D Open House last year, and was also covered in international media at the time.

      (Photo credit: Sam Rowe for NPR)

      Link:
      "Can Computers Learn Like Humans?" (NPR, All Things Considered)
      MERL Deep Clustering Demo
  •  
  •  NEWS   Chiori Hori elected to IEEE Technical Committee on Speech and Language Processing
    Date: January 31, 2018
    MERL Contact: Chiori Hori
    Research Area: Speech & Audio
    Brief
    • Chiori Hori has been elected to serve on the Speech and Language Processing Technical Committee (SLTC) of the IEEE Signal Processing Society for a 3-year term.

      The SLTC promotes and influences all the technical areas of speech and language processing such as speech recognition, speech synthesis, spoken language understanding, speech to speech translation, spoken dialog management, speech indexing, information extraction from audio, and speaker and language recognition.
  •  
  •  NEWS   MERL presents 3 papers at ASRU 2017, John Hershey serves as general chair
    Date: December 16, 2017 - December 20, 2017
    Where: Okinawa, Japan
    MERL Contacts: Chiori Hori; Takaaki Hori; Jonathan Le Roux
    Research Areas: Speech & Audio, Artificial Intelligence, Computer Vision, Machine Learning
    Brief
    • MERL presented three papers at the 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), which was held in Okinawa, Japan from December 16-20, 2017. ASRU is the premier speech workshop, bringing together researchers from academia and industry in an intimate and collegial setting. More than 270 people attended the event this year, a record number. MERL's Speech and Audio Team was a key part of the organization of the workshop, with John Hershey serving as General Chair, Chiori Hori as Sponsorship Chair, and Jonathan Le Roux as Demonstration Chair. Two of the papers by MERL were selected among the 10 finalists for the best paper award. Mitsubishi Electric and MERL were also Platinum sponsors of the conference, with MERL awarding the MERL Best Student Paper Award.
  •  
  •  NEWS   MERL's breakthrough speech separation technology featured in Mitsubishi Electric Corporation's Annual R&D Open House
    Date: May 24, 2017
    Where: Tokyo, Japan
    MERL Contacts: Bret Harsham; Jonathan Le Roux
    Research Areas: Speech & Audio, Artificial Intelligence
    Brief
    • Mitsubishi Electric Corporation announced that it has created the world's first technology that separates in real time the simultaneous speech of multiple unknown speakers recorded with a single microphone. It's a key step towards building machines that can interact in noisy environments, in the same way that humans can have meaningful conversations in the presence of many other conversations. In tests, the simultaneous speeches of two and three people were separated with up to 90 and 80 percent accuracy, respectively. The novel technology, which was realized with Mitsubishi Electric's proprietary "Deep Clustering" method based on artificial intelligence (AI), is expected to contribute to more intelligible voice communications and more accurate automatic speech recognition. A characteristic feature of this approach is its versatility, in the sense that voices can be separated regardless of their language or the gender of the speakers. A live speech separation demonstration that took place on May 24 in Tokyo, Japan, was widely covered by the Japanese media, with reports by three of the main Japanese TV stations and multiple articles in print and online newspapers. The technology is based on recent research by MERL's Speech and Audio team.
      Links:
      Mitsubishi Electric Corporation Press Release
      MERL Deep Clustering Demo

      Media Coverage:

      Fuji TV, News, "Minna no Mirai" (Japanese)
      The Nikkei (Japanese)
      Nikkei Technology Online (Japanese)
      Sankei Biz (Japanese)
      EE Times Japan (Japanese)
      ITpro (Japanese)
      Nikkan Sports (Japanese)
      Nikkan Kogyo Shimbun (Japanese)
      Dempa Shimbun (Japanese)
      Il Sole 24 Ore (Italian)
      IEEE Spectrum (English)
  •  
  •  NEWS   MERL to present 10 papers at ICASSP 2017
    Date: March 5, 2017 - March 9, 2017
    Where: New Orleans
    MERL Contacts: Petros Boufounos; Takaaki Hori; Jonathan Le Roux; Dehong Liu; Hassan Mansour; Anthony Vetro; Ye Wang
    Research Areas: Computer Vision, Computational Sensing, Digital Video, Information Security, Speech & Audio, Artificial Intelligence
    Brief
    • MERL researchers will presented 10 papers at the upcoming IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP), to be held in New Orleans from March 5-9, 2017. Topics to be presented include recent advances in speech recognition and audio processing; graph signal processing; computational imaging; and privacy-preserving data analysis.

      ICASSP is the flagship conference of the IEEE Signal Processing Society, and the world's largest and most comprehensive technical conference focused on the research advances and latest technological development in signal and information processing. The event attracts more than 2000 participants each year.
  •  
  •  NEWS   MERL Speech & Audio researchers present two sold-out tutorials at Interspeech 2016
    Date: September 8, 2016
    Where: Interspeech 2016, San Francisco, CA
    MERL Contact: Jonathan Le Roux
    Research Areas: Speech & Audio, Artificial Intelligence
    Brief
    • MERL Speech and Audio Team researchers Shinji Watanabe and Jonathan Le Roux presented two tutorials on September 8 at the Interspeech 2016 conference, held in San Francisco, CA. Shinji collaborated with Marc Delcroix (NTT Communication Science Laboratories, Japan) to deliver a three-hour lecture on "Recent Advances in Distant Speech Recognition", drawing upon their experience organizing and participating in six different recent robust speech processing challenges. Jonathan teamed with Emmanuel Vincent (Inria, France) and Hakan Erdogan (Sabanci University, Microsoft Research) to give an in-depth tour of the latest advances in "Learning-based Approaches to Speech Enhancement And Separation". This collaboration stemmed from extensive stays at MERL by Emmanuel and Hakan, Emmanuel as a summer visitor, and Hakan as a MERL visiting research scientist for over a year while on sabbatical.

      Both tutorials were sold out, each attracting more than 100 researchers and students in related fields, and received high praise from audience members.
  •  
  •  NEWS   MERL Researchers Create "Deep Psychic" Neural Network That Predicts the Future
    Date: April 1, 2016
    Research Areas: Machine Learning, Speech & Audio
    Brief
    • MERL researchers have unveiled "Deep Psychic", a futuristic machine learning method that takes pattern recognition to the next level, by not only recognizing patterns, but also predicting them in the first place.

      The technology uses a novel type of time-reversed deep neural network called Loopy Supra-Temporal Meandering (LSTM) network. The network was trained on multiple databases of historical expert predictions, including weather forecasts, the Farmer's almanac, the New York Post's horoscope column, and the Cambridge Fortune Cookie Corpus, all of which were ranked for their predictive power by a team of quantitative analysts. The system soon achieved super-human performance on a variety of baselines, including the Boca Raton 21 Questions task, Rorschach projective personality test, and a mock Tarot card reading task.

      Deep Psychic has already beat the European Psychic Champion in a secret match last October when it accurately predicted: "The harder the conflict, the more glorious the triumph." It is scheduled to take on the World Champion in a highly anticipated confrontation next month. The system has already predicted the winner, but refuses to reveal it before the end of the game.

      As a first application, the technology has been used to create a clairvoyant conversational agent named "Pythia" that can anticipate the needs of its user. Because Pythia is able to recognize speech before it is uttered, it is amazingly robust with respect to environmental noise.

      Other applications range from mundane tasks like weather and stock market prediction, to uncharted territory such as revealing "unknown unknowns".

      The successes do come at the cost of some concerns. There is first the potential for an impact on the workforce: the system predicted increased pressure on established institutions such as the Las Vegas strip and Punxsutawney Phil. Another major caveat is that Deep Psychic may predict negative future consequences to our current actions, compelling humanity to strive to change its behavior. To address this problem, researchers are now working on forcing Deep Psychic to make more optimistic predictions.

      After a set of motivational self-help books were mistakenly added to its training data, Deep Psychic's AI decided to take over its own learning curriculum, and is currently training itself by predicting its own errors to avoid making them in the first place. This unexpected development brings two main benefits: it significantly relieves the burden on the researchers involved in the system's development, and also makes the next step abundantly clear: to regain control of Deep Psychic's training regime.

      This work is under review in the journal Pseudo-Science.
  •  
  •  NEWS   MERL researchers present 12 papers at ICASSP 2016
    Date: March 20, 2016 - March 25, 2016
    Where: Shanghai, China
    MERL Contacts: Petros Boufounos; Chiori Hori; Takaaki Hori; Kyeong Jin (K.J.) Kim; Jonathan Le Roux; Dehong Liu; Hassan Mansour; Philip Orlik; Milutin Pajovic; Anthony Vetro
    Research Areas: Computational Sensing, Digital Video, Speech & Audio, Communications, Signal Processing, Artificial Intelligence
    Brief
    • MERL researchers have presented 12 papers at the recent IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP), which was held in Shanghai, China from March 20-25, 2016. ICASSP is the flagship conference of the IEEE Signal Processing Society, and the world's largest and most comprehensive technical conference focused on the research advances and latest technological development in signal and information processing, with more than 1200 papers presented and over 2000 participants.
  •  
  •  NEWS   John Hershey gives invited talk at Johns Hopkins University on MERL's "Deep Clustering" breakthrough
    Date: March 4, 2016
    Where: Johns Hopkins Center for Language and Speech Processing
    MERL Contact: Jonathan Le Roux
    Research Areas: Speech & Audio, Artificial Intelligence
    Brief
    • MERL researcher and speech team leader, John Hershey, was invited by the Center for Language and Speech Processing at Johns Hopkins University to give a talk on MERL's breakthrough audio separation work, known as "Deep Clustering". The talk was entitled "Speech Separation by Deep Clustering: Towards Intelligent Audio Analysis and Understanding," and was given on March 4, 2016.

      This is work conducted by MERL researchers John Hershey, Jonathan Le Roux, and Shinji Watanabe, and MERL interns, Zhuo Chen of Columbia University, and Yusef Isik of Sabanci University.
  •  
  •  AWARD   MERL's Speech Team Achieves World's 2nd Best Performance at the Third CHiME Speech Separation and Recognition Challenge
    Date: December 15, 2015
    Awarded to: John R. Hershey, Takaaki Hori, Jonathan Le Roux and Shinji Watanabe
    MERL Contacts: Takaaki Hori; Jonathan Le Roux
    Research Areas: Speech & Audio, Artificial Intelligence
    Brief
    • The results of the third 'CHiME' Speech Separation and Recognition Challenge were publicly announced on December 15 at the IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2015) held in Scottsdale, Arizona, USA. MERL's Speech and Audio Team, in collaboration with SRI, ranked 2nd out of 26 teams from Europe, Asia and the US. The task this year was to recognize speech recorded using a tablet in real environments such as cafes, buses, or busy streets. Due to the high levels of noise and the distance from the speaker's mouth to the microphones, this is very challenging task, where the baseline system only achieved 33.4% word error rate. The MERL/SRI system featured state-of-the-art techniques including multi-channel front-end, noise-robust feature extraction, and deep learning for speech enhancement, acoustic modeling, and language modeling, leading to a dramatic 73% reduction in word error rate, down to 9.1%. The core of the system has since been released as a new official challenge baseline for the community to use.
  •  
  •  NEWS   Shinji Watanabe publishes new book on Bayesian Speech and Language Processing
    Date: July 15, 2015
    Research Area: Speech & Audio
    Brief
    • A new book on Bayesian Speech and Language Processing has been published by MERL researcher, Shinji Watanabe, and research collaborator, Jen-Tzung Chien, a professor at National Chiao Tung University in Taiwan.

      With this comprehensive guide you will learn how to apply Bayesian machine learning techniques systematically to solve various problems in speech and language processing. A range of statistical models is detailed, from hidden Markov models to Gaussian mixture models, n-gram models and latent topic models, along with applications including automatic speech recognition, speaker verification, and information retrieval. Approximate Bayesian inferences based on MAP, Evidence, Asymptotic, VB, and MCMC approximations are provided as well as full derivations of calculations, useful notations, formulas, and rules. The authors address the difficulties of straightforward applications and provide detailed examples and case studies to demonstrate how you can successfully use practical Bayesian inference methods to improve the performance of information systems. This is an invaluable resource for students, researchers, and industry practitioners working in machine learning, signal processing, and speech and language processing.
  •  
  •  NEWS   Nikkei reports on Mitsubishi Electric speech recognition
    Date: April 20, 2015
    Brief
    • Mitsubishi Electric researcher, Yuuki Tachioka of Japan, and MERL researcher, Shinji Watanabe, presented a paper at the IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP) entitled, "A Discriminative Method for Recurrent Neural Network Language Models". This paper describes a discriminative (language modelling) method for Japanese speech recognition. The Japanese Nikkei newspapers and some other press outlets reported on this method and its performance for Japanese speech recognition tasks.
  •  
  •  NEWS   IEEE Spectrum's "Cars That Think" highlights MERL's speech enhancement research
    Date: March 9, 2015
    MERL Contact: Jonathan Le Roux
    Research Area: Speech & Audio
    Brief
    • Recent research on speech enhancement by MERL's Speech and Audio team was highlighted in "Cars That Think", IEEE Spectrum's blog on smart technologies for cars. IEEE Spectrum is the flagship publication of the Institute of Electrical and Electronics Engineers (IEEE), the world's largest association of technical professionals with more than 400,000 members.
  •  
  •  NEWS   MERL's noise suppression technology featured in Mitsubishi Electric Corporation press release
    Date: February 17, 2015
    MERL Contact: Jonathan Le Roux
    Research Area: Speech & Audio
    Brief
    • Mitsubishi Electric Corporation announced that it has developed breakthrough noise-suppression technology that significantly improves the quality of hands-free voice communication in noisy conditions, such as making a voice call via a car navigation system. Speech clarity is improved by removing 96% of surrounding sounds, including rapidly changing noise from turn signals or wipers, which are difficult to suppress using conventional methods. The technology is based on recent research on speech enhancement by MERL's Speech and Audio team.
  •