News & Events

14 News items, Awards, Events or Talks found.

Learn about the MERL Seminar Series.

  •  NEWS    MERL presenting 8 papers at ICASSP 2022
    Date: May 22, 2022 - May 27, 2022
    Where: Singapore
    MERL Contacts: Anoop Cherian; Chiori Hori; Toshiaki Koike-Akino; Jonathan Le Roux; Tim K. Marks; Philip V. Orlik; Kuan-Chuan Peng; Pu (Perry) Wang; Gordon Wichern
    Research Areas: Artificial Intelligence, Computer Vision, Signal Processing, Speech & Audio
    Brief
    • MERL researchers are presenting 8 papers at the IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP), which is being held in Singapore from May 22-27, 2022. A week of virtual presentations also took place earlier this month.

      Topics to be presented include recent advances in speech recognition, audio processing, scene understanding, computational sensing, and classification.

      ICASSP is the flagship conference of the IEEE Signal Processing Society, and the world's largest and most comprehensive technical conference focused on the research advances and latest technological development in signal and information processing. The event attracts more than 2000 participants each year.
  •  NEWS    MERL work on scene-aware interaction featured in IEEE Spectrum
    Date: March 1, 2022
    MERL Contacts: Anoop Cherian; Chiori Hori; Jonathan Le Roux; Tim K. Marks; Alan Sullivan; Anthony Vetro
    Research Areas: Artificial Intelligence, Computer Vision, Machine Learning, Speech & Audio
    Brief
    • MERL's research on scene-aware interaction was recently featured in an IEEE Spectrum article. The article, titled "At Last, A Self-Driving Car That Can Explain Itself" and authored by MERL Senior Principal Research Scientist Chiori Hori and MERL Director Anthony Vetro, gives an overview of MERL's efforts towards developing a system that can analyze multimodal sensing information for highly natural and intuitive interaction with humans through context-dependent generation of natural language. The technology recognizes contextual objects and events based on multimodal sensing information, such as images and video captured with cameras, audio information recorded with microphones, and localization information measured with LiDAR.

      Scene-Aware Interaction for car navigation, one target application that the article focuses on, will provide drivers with intuitive route guidance. Scene-Aware Interaction technology is expected to have wide applicability, including human-machine interfaces for in-vehicle infotainment, interaction with service robots in building and factory automation systems, systems that monitor the health and well-being of people, surveillance systems that interpret complex scenes for humans and encourage social distancing, support for touchless operation of equipment in public areas, and much more. MERL's Scene-Aware Interaction Technology had previously been featured in a Mitsubishi Electric Corporation Press Release.

      IEEE Spectrum is the flagship magazine and website of the IEEE, the world’s largest professional organization devoted to engineering and the applied sciences. IEEE Spectrum has a circulation of over 400,000 engineers worldwide, making it one of the leading science and engineering magazines.
  •  TALK    [MERL Seminar Series 2022] Learning Speech Representations with Multimodal Self-Supervision
    Date & Time: Tuesday, March 1, 2022; 1:00 PM EST
    Speaker: David Harwath, The University of Texas at Austin
    MERL Host: Chiori Hori
    Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
    Abstract
    • Humans learn spoken language and visual perception at an early age by being immersed in the world around them. Why can't computers do the same? In this talk, I will describe our ongoing work to develop methodologies for grounding continuous speech signals at the raw waveform level to natural image scenes. I will first present self-supervised models capable of discovering discrete, hierarchical structure (words and sub-word units) in the speech signal. Instead of conventional annotations, these models learn from correspondences between speech sounds and visual patterns such as objects and textures. Next, I will demonstrate how these discrete units can be used as a drop-in replacement for text transcriptions in an image captioning system, enabling us to directly synthesize spoken descriptions of images without the need for text as an intermediate representation. Finally, I will describe our latest work on Transformer-based models of visually-grounded speech. These models significantly outperform the prior state of the art on semantic speech-to-image retrieval tasks, and also learn representations that are useful for a multitude of other speech processing tasks.
  •  NEWS    Chiori Hori will give keynote on scene understanding via multimodal sensing at AI Electronics Symposium
    Date: February 15, 2021
    Where: The 2nd International Symposium on AI Electronics
    MERL Contact: Chiori Hori
    Research Areas: Artificial Intelligence, Computer Vision, Machine Learning, Speech & Audio
    Brief
    • Chiori Hori, a Senior Principal Researcher in MERL's Speech and Audio Team, will be a keynote speaker at the 2nd International Symposium on AI Electronics, alongside Alex Acero, Senior Director of Apple Siri; Roberto Cipolla, Professor of Information Engineering at the University of Cambridge; and Hiroshi Amano, Professor at Nagoya University and winner of the Nobel Prize in Physics for his work on blue light-emitting diodes. The symposium, organized by Tohoku University, will be held online on February 15, 2021, 10am-4pm (JST).

      Chiori's talk, titled "Human Perspective Scene Understanding via Multimodal Sensing", will present MERL's work towards the development of scene-aware interaction. One important capability still missing from human-machine interaction is natural, context-aware interaction, in which machines understand their surrounding scene from the human perspective and can share that understanding with humans using natural language. To bridge this communication gap, MERL has been working at the intersection of research fields such as spoken dialog, audio-visual understanding, sensor signal understanding, and robotics to build a new AI paradigm, called scene-aware interaction, that enables machines to translate their perception and understanding of a scene into natural language and thereby interact more effectively with humans. The talk will survey these technologies and introduce an application to future car navigation.
  •  NEWS    MERL's Scene-Aware Interaction Technology Featured in Mitsubishi Electric Corporation Press Release
    Date: July 22, 2020
    Where: Tokyo, Japan
    MERL Contacts: Anoop Cherian; Chiori Hori; Jonathan Le Roux; Tim K. Marks; Alan Sullivan; Anthony Vetro
    Research Areas: Artificial Intelligence, Computer Vision, Machine Learning, Speech & Audio
    Brief
    • Mitsubishi Electric Corporation announced that the company has developed what it believes to be the world’s first technology capable of highly natural and intuitive interaction with humans based on a scene-aware capability to translate multimodal sensing information into natural language.

      The novel technology, Scene-Aware Interaction, incorporates Mitsubishi Electric’s proprietary Maisart® compact AI technology to analyze multimodal sensing information for highly natural and intuitive interaction with humans through context-dependent generation of natural language. The technology recognizes contextual objects and events based on multimodal sensing information, such as images and video captured with cameras, audio information recorded with microphones, and localization information measured with LiDAR.

      Scene-Aware Interaction for car navigation, one target application, will provide drivers with intuitive route guidance. The technology is also expected to have applicability to human-machine interfaces for in-vehicle infotainment, interaction with service robots in building and factory automation systems, systems that monitor the health and well-being of people, surveillance systems that interpret complex scenes for humans and encourage social distancing, support for touchless operation of equipment in public areas, and much more. The technology is based on recent research by MERL's Speech & Audio and Computer Vision groups.
  •  NEWS    MERL presenting 13 papers and an industry talk at ICASSP 2020
    Date: May 4, 2020 - May 8, 2020
    Where: Virtual Barcelona
    MERL Contacts: Karl Berntorp; Petros T. Boufounos; Chiori Hori; Toshiaki Koike-Akino; Jonathan Le Roux; Dehong Liu; Yanting Ma; Hassan Mansour; Philip V. Orlik; Anthony Vetro; Pu (Perry) Wang; Gordon Wichern
    Research Areas: Computational Sensing, Computer Vision, Machine Learning, Signal Processing, Speech & Audio
    Brief
    • MERL researchers are presenting 13 papers at the IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP), which is being held virtually from May 4-8, 2020. Petros Boufounos is also presenting a talk on the Computational Sensing Revolution in Array Processing (video) in ICASSP’s Industry Track, and Siheng Chen is co-organizing and chairing a special session on a Signal-Processing View of Graph Neural Networks.

      Topics to be presented include recent advances in speech recognition, audio processing, scene understanding, computational sensing, array processing, and parameter estimation. Videos for all talks are available on MERL's YouTube channel, with corresponding links in the references below.

      This year again, MERL is a sponsor of the conference and will be participating in the Student Job Fair; please join us to learn about our internship program and career opportunities.

      ICASSP is the flagship conference of the IEEE Signal Processing Society, and the world's largest and most comprehensive technical conference focused on the research advances and latest technological development in signal and information processing. The event attracts more than 2000 participants each year. Originally planned to be held in Barcelona, Spain, ICASSP has moved to a fully virtual setting due to the COVID-19 crisis, with free registration for participants who are not presenting a paper.
  •  NEWS    MERL Speech & Audio Researchers Presenting 7 Papers and a Tutorial at Interspeech 2019
    Date: September 15, 2019 - September 19, 2019
    Where: Graz, Austria
    MERL Contacts: Chiori Hori; Jonathan Le Roux; Gordon Wichern
    Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
    Brief
    • MERL Speech & Audio Team researchers will be presenting 7 papers at the 20th Annual Conference of the International Speech Communication Association (INTERSPEECH 2019), which is being held in Graz, Austria from September 15-19, 2019. Topics to be presented include recent advances in end-to-end speech recognition, speech separation, and audio-visual scene-aware dialog. Takaaki Hori is also co-presenting a tutorial on end-to-end speech processing.

      Interspeech is the world's largest and most comprehensive conference on the science and technology of spoken language processing. It gathers around 2000 participants from all over the world.
  •  NEWS    MERL presenting 16 papers at ICASSP 2019
    Date: May 12, 2019 - May 17, 2019
    Where: Brighton, UK
    MERL Contacts: Petros T. Boufounos; Anoop Cherian; Chiori Hori; Toshiaki Koike-Akino; Jonathan Le Roux; Dehong Liu; Hassan Mansour; Tim K. Marks; Philip V. Orlik; Anthony Vetro; Pu (Perry) Wang; Gordon Wichern
    Research Areas: Computational Sensing, Computer Vision, Machine Learning, Signal Processing, Speech & Audio
    Brief
    • MERL researchers will be presenting 16 papers at the IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP), which is being held in Brighton, UK from May 12-17, 2019. Topics to be presented include recent advances in speech recognition, audio processing, scene understanding, computational sensing, and parameter estimation. MERL is also a sponsor of the conference and will be participating in the student career luncheon; please join us at the lunch to learn about our internship program and career opportunities.

      ICASSP is the flagship conference of the IEEE Signal Processing Society, and the world's largest and most comprehensive technical conference focused on the research advances and latest technological development in signal and information processing. The event attracts more than 2000 participants each year.
  •  EVENT    MERL is a Proud Sponsor of the Grace Hopper Celebration 2018!
    Date: Wednesday, September 26, 2018 - Friday, September 28, 2018
    Location: Houston, Texas
    MERL Contacts: Chiori Hori; Elizabeth Phillips
    Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
    Brief
    • "MERL, in partnership with Mitsubishi Electric was a Gold Sponsor of the Grace Hopper Celebration 2018 (GHC18) held in Houston, TX on September 26-28th. Presented by AnitaB.org and the Association for Computing Machinery, this is world's largest gathering of women technologists. Chiori Hori and Elizabeth Phillips from MERL, and Yoshiyuki Umei, Jared Baker and Lien Randle from MEUS, proudly represented Mitsubishi Electric at the recruiting expo, that drew over 20,000 female technologists this year.
  •  NEWS    Chiori Hori elected to IEEE Technical Committee on Speech and Language Processing
    Date: January 31, 2018
    MERL Contact: Chiori Hori
    Research Area: Speech & Audio
    Brief
    • Chiori Hori has been elected to serve on the Speech and Language Processing Technical Committee (SLTC) of the IEEE Signal Processing Society for a 3-year term.

      The SLTC promotes and influences all the technical areas of speech and language processing, such as speech recognition, speech synthesis, spoken language understanding, speech-to-speech translation, spoken dialog management, speech indexing, information extraction from audio, and speaker and language recognition.
  •  NEWS    MERL presents 3 papers at ASRU 2017, John Hershey serves as general chair
    Date: December 16, 2017 - December 20, 2017
    Where: Okinawa, Japan
    MERL Contacts: Chiori Hori; Jonathan Le Roux
    Research Area: Speech & Audio
    Brief
    • MERL presented three papers at the 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), which was held in Okinawa, Japan from December 16-20, 2017. ASRU is the premier speech workshop, bringing together researchers from academia and industry in an intimate and collegial setting. More than 270 people attended the event this year, a record number. MERL's Speech and Audio Team played a key role in organizing the workshop, with John Hershey serving as General Chair, Chiori Hori as Sponsorship Chair, and Jonathan Le Roux as Demonstration Chair. Two of MERL's papers were selected among the 10 finalists for the best paper award. Mitsubishi Electric and MERL were also Platinum sponsors of the conference, with MERL awarding the MERL Best Student Paper Award.
  •  EVENT    MERL leads organization of dialog technology challenges and associated workshop
    Date: Sunday, December 10, 2017
    Location: Hyatt Regency, Long Beach, CA
    MERL Contact: Chiori Hori
    Research Area: Speech & Audio
    Brief
    • MERL researcher Chiori Hori led the organization of the 6th edition of the Dialog System Technology Challenges (DSTC6). This year's edition of DSTC is split into three tracks: End-to-End Goal Oriented Dialog Learning, End-to-End Conversation Modeling, and Dialogue Breakdown Detection. A total of 23 teams from all over the world competed in the various tracks, and will meet at the Hyatt Regency in Long Beach, CA, USA on December 10 to present their results at a dedicated workshop colocated with NIPS 2017.

      MERL's Speech and Audio Team and Mitsubishi Electric Corporation jointly submitted a set of systems to the End-to-End Conversation Modeling Track, obtaining the best rank among 19 submissions in terms of objective metrics.
  •  TALK    Generative Model-Based Text-to-Speech Synthesis
    Date & Time: Wednesday, February 1, 2017; 12:00-13:00
    Speaker: Dr. Heiga Zen, Google
    MERL Host: Chiori Hori
    Research Area: Speech & Audio
    Abstract
    • Recent progress in generative modeling has improved the naturalness of synthesized speech significantly. In this talk I will summarize these generative model-based approaches for speech synthesis such as WaveNet, a deep generative model of raw audio waveforms. We show that WaveNets are able to generate speech which mimics any human voice and which sounds more natural than the best existing Text-to-Speech systems.
      See https://deepmind.com/blog/wavenet-generative-model-raw-audio/ for further details.
  •  NEWS    MERL researchers present 12 papers at ICASSP 2016
    Date: March 20, 2016 - March 25, 2016
    Where: Shanghai, China
    MERL Contacts: Petros T. Boufounos; Chiori Hori; Kyeong Jin (K.J.) Kim; Jonathan Le Roux; Dehong Liu; Hassan Mansour; Philip V. Orlik; Anthony Vetro
    Research Areas: Computational Sensing, Digital Video, Speech & Audio, Communications, Signal Processing
    Brief
    • MERL researchers presented 12 papers at the recent IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP), which was held in Shanghai, China from March 20-25, 2016. ICASSP is the flagship conference of the IEEE Signal Processing Society, and the world's largest and most comprehensive technical conference focused on the research advances and latest technological development in signal and information processing, with more than 1200 papers presented and over 2000 participants.