News & Events

  •  NEWS   MERL researchers present 12 papers at ICASSP 2016
    Date: March 20, 2016 - March 25, 2016
    Where: Shanghai, China
    MERL Contacts: Petros Boufounos; Chiori Hori; Takaaki Hori; Kyeong Jin (K.J.) Kim; Jonathan Le Roux; Dehong Liu; Hassan Mansour; Philip Orlik; Milutin Pajovic; Dong Tian; Anthony Vetro
    Research Areas: Electronics & Communications, Multimedia, Computational Sensing, Digital Video, Speech & Audio, Wireless Communications & Signal Processing
    Brief
    • MERL researchers presented 12 papers at the recent IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP), held in Shanghai, China, from March 20-25, 2016. ICASSP is the flagship conference of the IEEE Signal Processing Society, and the world's largest and most comprehensive technical conference focused on research advances and the latest technological developments in signal and information processing, with more than 1200 papers presented and over 2000 participants.
  •  TALK   Driver's mental workload estimation based on the reflex eye movement
    Date & Time: Tuesday, March 15, 2016; 12:45 PM - 1:30 PM
    Speaker: Prof. Hirofumi Aoki, Nagoya University
    Research Areas: Multimedia, Speech & Audio
    Brief
    • Driving is a complex skill involving the vehicle itself (e.g., speed control and instrument operation), other road users (e.g., other vehicles, pedestrians), the surrounding environment, and so on. During driving, visual cues are the main source of information supplied to the brain. To stabilize visual information while you are moving, the eyes move in the opposite direction based on input to the vestibular system. This involuntary eye movement is called the vestibulo-ocular reflex (VOR), and its physiological models have been studied extensively. Obinata et al. found that the VOR can be used to estimate mental workload. Since then, our research group has been developing methods to quantitatively estimate mental workload during driving by means of reflex eye movement. In this talk, I will explain the basic mechanism of the reflex eye movement and how to apply it to mental workload estimation. I will also introduce our latest work combining the VOR and OKR (optokinetic reflex) models for naturalistic driving environments.
  •  TALK   A data-centric approach to driving behavior research: How can signal processing methods contribute to the development of autonomous driving?
    Date & Time: Tuesday, March 15, 2016; 12:00 PM - 12:45 PM
    Speaker: Prof. Kazuya Takeda, Nagoya University
    Research Areas: Multimedia, Speech & Audio
    Brief
    • Thanks to advanced "internet of things" (IoT) technologies, situation-specific human behavior has become an area of development for practical signal processing applications. One important such area is driving behavior research. Since 1999, I have been collecting driving behavior data across a wide range of signal modalities, including speech/sound, video, physical/physiological sensors, CAN bus, LIDAR and GNSS. The objective of this data collection is to evaluate how well signal models can represent human behavior while driving. In this talk, I will summarize our 10 years of study of driving behavior signal processing based on these signal corpora. In particular, statistical signal models of the interaction between traffic contexts and driving behavior, i.e., stochastic driver modeling, will be discussed in the context of risky lane change detection. I greatly look forward to discussing the scalability of such corpus-based approaches, which could be applied to almost any traffic situation.
  •  TALK   Emotion Detection for Health Related Issues
    Date & Time: Tuesday, February 16, 2016; 12:00 PM - 1:00 PM
    Speaker: Dr. Najim Dehak, MIT
    Research Areas: Multimedia, Speech & Audio
    Brief
    • Recently, there has been a great increase in interest in the field of emotion recognition based on different human modalities, such as speech and heart rate. Emotion recognition systems can be very useful in several areas, such as medicine and telecommunications. In the medical field, identifying emotions can be an important tool for detecting and monitoring patients with mental health disorders. In addition, identifying the emotional state from voice opens opportunities for automated dialogue systems capable of producing reports for the physician based on frequent phone communication between the system and the patients. In this talk, we will describe a health-related application that uses a voice-based emotion recognition system to detect and monitor people's emotional state.
  •  NEWS   John Hershey gives invited talk at Johns Hopkins University on MERL's "Deep Clustering" breakthrough
    Date: March 4, 2016
    Where: Johns Hopkins Center for Language and Speech Processing
    MERL Contact: Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
    Brief
    • MERL researcher and speech team leader, John Hershey, was invited by the Center for Language and Speech Processing at Johns Hopkins University to give a talk on MERL's breakthrough audio separation work, known as "Deep Clustering". The talk was entitled "Speech Separation by Deep Clustering: Towards Intelligent Audio Analysis and Understanding," and was given on March 4, 2016.

      This is work conducted by MERL researchers John Hershey, Jonathan Le Roux, and Shinji Watanabe, and MERL interns, Zhuo Chen of Columbia University, and Yusef Isik of Sabanci University.
  •  AWARD   MERL's Speech Team Achieves World's 2nd Best Performance at the Third CHiME Speech Separation and Recognition Challenge
    Date: December 15, 2015
    Awarded to: John R. Hershey, Takaaki Hori, Jonathan Le Roux and Shinji Watanabe
    MERL Contacts: Takaaki Hori; Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
    Brief
    • The results of the third 'CHiME' Speech Separation and Recognition Challenge were publicly announced on December 15 at the IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2015), held in Scottsdale, Arizona, USA. MERL's Speech and Audio Team, in collaboration with SRI, ranked 2nd out of 26 teams from Europe, Asia and the US. The task this year was to recognize speech recorded using a tablet in real environments such as cafes, buses, or busy streets. Due to the high levels of noise and the distance from the speaker's mouth to the microphones, this is a very challenging task, on which the baseline system achieved only a 33.4% word error rate. The MERL/SRI system featured state-of-the-art techniques including a multi-channel front-end, noise-robust feature extraction, and deep learning for speech enhancement, acoustic modeling, and language modeling, leading to a dramatic 73% relative reduction in word error rate, down to 9.1%. The core of the system has since been released as a new official challenge baseline for the community to use.
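      The quoted figures are consistent: going from the 33.4% baseline to 9.1% is indeed roughly a 73% relative reduction in word error rate. A quick arithmetic check (illustrative only, not part of the original announcement):

      ```python
      # Verify the relative word error rate (WER) reduction quoted above.
      baseline_wer = 33.4  # challenge baseline WER, in percent
      system_wer = 9.1     # MERL/SRI system WER, in percent

      relative_reduction = (baseline_wer - system_wer) / baseline_wer * 100
      print(f"Relative WER reduction: {relative_reduction:.1f}%")  # 72.8%, i.e. about 73%
      ```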
  •  EVENT   SANE 2015 - Speech and Audio in the Northeast
    Date: Thursday, October 22, 2015
    MERL Contact: Jonathan Le Roux
    Location: Google, New York City, NY
    Research Areas: Multimedia, Speech & Audio
    Brief
    • SANE 2015, a one-day event gathering researchers and students in speech and audio from the Northeast of the American continent, will be held on Thursday October 22, 2015 at Google, in New York City, NY.

      It is a follow-up to SANE 2012, held at Mitsubishi Electric Research Labs (MERL), SANE 2013, held at Columbia University, and SANE 2014, held at MIT, which each gathered 70 to 90 researchers and students.

      SANE 2015 will feature invited talks by leading researchers from the Northeast, as well as from the international community: Rohit Prasad (Amazon), Michael Mandel (Brooklyn College, CUNY), Ron Weiss (Google), John Hershey (MERL), Pablo Sprechmann (NYU), Tuomas Virtanen (Tampere University of Technology), and Paris Smaragdis (UIUC). It will also feature a lively poster session during lunch time, open to both students and researchers.

      SANE 2015 is organized by Jonathan Le Roux (MERL), Hank Liao (Google), Andrew Senior (Google), and John R. Hershey (MERL).
  •  NEWS   Shinji Watanabe publishes new book on Bayesian Speech and Language Processing
    Date: July 15, 2015
    Research Areas: Multimedia, Speech & Audio
    Brief
    • A new book on Bayesian Speech and Language Processing has been published by MERL researcher, Shinji Watanabe, and research collaborator, Jen-Tzung Chien, a professor at National Chiao Tung University in Taiwan.

      With this comprehensive guide you will learn how to apply Bayesian machine learning techniques systematically to solve various problems in speech and language processing. A range of statistical models is detailed, from hidden Markov models to Gaussian mixture models, n-gram models and latent topic models, along with applications including automatic speech recognition, speaker verification, and information retrieval. Approximate Bayesian inferences based on MAP, Evidence, Asymptotic, VB, and MCMC approximations are provided as well as full derivations of calculations, useful notations, formulas, and rules. The authors address the difficulties of straightforward applications and provide detailed examples and case studies to demonstrate how you can successfully use practical Bayesian inference methods to improve the performance of information systems. This is an invaluable resource for students, researchers, and industry practitioners working in machine learning, signal processing, and speech and language processing.
  •  NEWS   Nikkei reports on Mitsubishi Electric speech recognition
    Date: April 20, 2015
    Research Area: Multimedia
    Brief
    • Mitsubishi Electric researcher Yuuki Tachioka (Japan) and MERL researcher Shinji Watanabe presented a paper at the IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP) entitled “A Discriminative Method for Recurrent Neural Network Language Models”. This paper describes a discriminative (language modelling) method for Japanese speech recognition. The Japanese Nikkei newspapers and some other press outlets reported on this method and its performance on Japanese speech recognition tasks.
  •  NEWS   IEEE Spectrum's "Cars That Think" highlights MERL's speech enhancement research
    Date: March 9, 2015
    MERL Contact: Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
    Brief
    • Recent research on speech enhancement by MERL's Speech and Audio team was highlighted in "Cars That Think", IEEE Spectrum's blog on smart technologies for cars. IEEE Spectrum is the flagship publication of the Institute of Electrical and Electronics Engineers (IEEE), the world's largest association of technical professionals with more than 400,000 members.
  •  NEWS   MERL's noise suppression technology featured in Mitsubishi Electric Corporation press release
    Date: February 17, 2015
    MERL Contact: Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
    Brief
    • Mitsubishi Electric Corporation announced that it has developed breakthrough noise-suppression technology that significantly improves the quality of hands-free voice communication in noisy conditions, such as making a voice call via a car navigation system. Speech clarity is improved by removing 96% of surrounding sounds, including rapidly changing noise from turn signals or wipers, which are difficult to suppress using conventional methods. The technology is based on recent research on speech enhancement by MERL's Speech and Audio team.
  •  EVENT   SANE 2014 - Speech and Audio in the Northeast
    Date: Thursday, October 23, 2014
    MERL Contact: Jonathan Le Roux
    Location: Massachusetts Institute of Technology (MIT), Cambridge, MA
    Research Areas: Multimedia, Speech & Audio
    Brief
    • SANE 2014, a one-day event gathering researchers and students in speech and audio from the Northeast of the American continent, will be held on Thursday October 23, 2014 at MIT, in Cambridge, MA. It is a follow-up to SANE 2012, held at Mitsubishi Electric Research Labs (MERL), and SANE 2013, held at Columbia University, which each gathered around 70 researchers and students. SANE 2014 will feature invited talks by leading researchers from the Northeast as well as Europe: Najim Dehak (MIT), Hakan Erdogan (MERL/Sabanci University), Gael Richard (Telecom ParisTech), George Saon (IBM Research), Andrew Senior (Google Research), Stavros Tsakalidis (BBN - Raytheon), and David Wingate (Lyric). It will also feature a lively poster session during lunch time, open to both students and researchers. SANE 2014 is organized by Jonathan Le Roux (MERL), Jim Glass (MIT), and John R. Hershey (MERL).
  •  NEWS   Second Place in REVERB Challenge
    Date: May 10, 2014
    Where: REVERB Workshop
    Research Areas: Multimedia, Speech & Audio
    Brief
    • Mitsubishi Electric's submission to the REVERB workshop achieved the second best performance among all participating institutes. The team included Yuuki Tachioka and Tomohiro Narita of MELCO in Japan, and Shinji Watanabe and Felix Weninger of MERL. The challenge addresses automatic speech recognition systems that are robust against varying room acoustics.
  •  NEWS   MERL to co-sponsor HSCMA 2014 Joint Workshop on Hands-free Speech Communication and Microphone Arrays
    Date: May 12, 2014 - May 14, 2014
    Where: Hands-free Speech Communication and Microphone Arrays (HSCMA)
    Research Areas: Multimedia, Speech & Audio
    Brief
    • MERL is a sponsor for the 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA 2014), held in Nancy, France, in May 2014.
  •  NEWS   MERL Researcher named co-chair of GlobalSIP 2014 Symposium on Machine Learning Applications in Speech Processing
    Date: May 1, 2014
    Where: IEEE Global Conference on Signal and Information Processing (GlobalSIP)
    Research Areas: Multimedia, Speech & Audio
    Brief
    • John R. Hershey is Co-Chair of the GlobalSIP 2014 Symposium on Machine Learning Applications in Speech Processing.
  •  AWARD   Awaya Prize Young Researcher Award
    Date: March 11, 2014
    Awarded to: Yuuki Tachioka
    Awarded for: "Effectiveness of discriminative approaches for speech recognition under noisy environments on the 2nd CHiME Challenge"
    Awarded by: Acoustical Society of Japan (ASJ)
    MERL Contact: Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
    Brief
    • MELCO researcher Yuuki Tachioka received the Awaya Prize Young Researcher Award from the Acoustical Society of Japan (ASJ) for "effectiveness of discriminative approaches for speech recognition under noisy environments on the 2nd CHiME Challenge", which was based on joint work with MERL Speech & Audio team researchers Shinji Watanabe, Jonathan Le Roux and John R. Hershey.
  •  NEWS   Guest Editor for IEEE Signal Processing Magazine, Special Issue on Signal Processing Techniques for Assisted Listening
    Date: March 1, 2014
    Where: IEEE Signal Processing Society
    Research Areas: Multimedia, Speech & Audio
    Brief
    • John R. Hershey is Guest Editor for the IEEE Signal Processing Magazine Special Issue on Signal Processing Techniques for Assisted Listening.
  •  NEWS   Members of the Speech & Audio team elected to IEEE Technical Committees
    Date: January 1, 2014
    MERL Contact: Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
    Brief
    • Jonathan Le Roux, Shinji Watanabe and John R. Hershey have been elected for 3-year terms to Technical Committees of the IEEE Signal Processing Society. Jonathan has been elected to the IEEE Audio and Acoustic Signal Processing Technical Committee (AASP-TC), and Shinji and John to the Speech and Language Processing Technical Committee (SL-TC). Members of the Speech & Audio team now together hold four TC positions, as John also serves on the AASP-TC.
  •  EVENT   SANE 2013 - Speech and Audio in the Northeast
    Date & Time: Thursday, October 24, 2013; 8:45 AM - 5:00 PM
    MERL Contact: Jonathan Le Roux
    Location: Columbia University
    Research Areas: Multimedia, Speech & Audio
    Brief
    • SANE 2013, a one-day event gathering researchers and students in speech and audio from the Northeast of the American continent, will be held on Thursday October 24, 2013 at Columbia University, in New York City.

      A follow-up to SANE 2012 held in October 2012 at MERL in Cambridge, MA, this year's SANE will be held in conjunction with the WASPAA workshop, held October 20-23 in upstate New York. WASPAA attendees are welcome and encouraged to attend SANE.

      SANE 2013 will feature invited speakers from the Northeast, as well as from the international community. It will also feature a lively poster session during lunch time, open to both students and researchers.

      SANE 2013 is organized by Prof. Dan Ellis (Columbia University), Jonathan Le Roux (MERL) and John R. Hershey (MERL).
  •  TALK   Efficiently sampling wave fields
    Date & Time: Thursday, October 17, 2013; 12:00 PM
    Speaker: Prof. Laurent Daudet, Paris Diderot University, France
    MERL Host: Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
    Brief
    • In acoustics, one may wish to acquire a wave field over a whole spatial domain, whereas one can only make point measurements (i.e., with microphones). Even with few sources, this remains a difficult problem because of reverberation, which can be hard to characterize. This can be seen as a sampling/interpolation problem, and it raises a number of interesting questions: how many sample points are needed, where to place the sampling points, etc. In this presentation, we will review some case studies, in 2D (vibrating plates) and 3D (room acoustics), with numerical and experimental data, for which we have developed sparse models, possibly with additional 'structures', based on physical modeling of the acoustic field. These types of models are well suited to reconstruction techniques known as compressed sensing. The same principles can also be used for sub-Nyquist optical imaging: we will show preliminary experimental results from a new compressive imager, remarkably simple in its principle, using a multiply scattering medium.
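      As a side note (not from the talk itself), the core compressed-sensing idea, recovering a sparse signal from far fewer measurements than unknowns, can be sketched with a greedy solver such as Orthogonal Matching Pursuit. The dimensions and the generic random measurement matrix below are arbitrary illustrative assumptions; the talk's setting would instead use structured sparsity and physical wave models.

      ```python
      import numpy as np

      rng = np.random.default_rng(0)

      # Toy setup: m point measurements of an n-dimensional, k-sparse field.
      n, m, k = 50, 25, 3
      A = rng.standard_normal((m, n)) / np.sqrt(m)   # random measurement matrix
      x = np.zeros(n)
      x[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
      y = A @ x                                      # noiseless measurements

      # Orthogonal Matching Pursuit: greedily add the column most correlated
      # with the residual, then least-squares re-fit on the selected support.
      support = []
      residual = y.copy()
      for _ in range(k):
          support.append(int(np.argmax(np.abs(A.T @ residual))))
          coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
          residual = y - A[:, support] @ coeffs

      x_hat = np.zeros(n)
      x_hat[support] = coeffs
      print("relative recovery error:", np.linalg.norm(x_hat - x) / np.linalg.norm(x))
      ```

      With noiseless measurements and this mildly underdetermined regime (25 measurements, 3 nonzeros among 50 unknowns), OMP typically recovers the exact support, illustrating why far fewer samples than Nyquist can suffice when the field admits a sparse model.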