News & Events

  •  TALK   A data-centric approach to driving behavior research: How can signal processing methods contribute to the development of autonomous driving?
    Date & Time: Tuesday, March 15, 2016; 12:00 PM - 12:45 PM
    Speaker: Prof. Kazuya Takeda, Nagoya University
    Research Areas: Multimedia, Speech & Audio
    Brief
    • Thanks to advanced "internet of things" (IoT) technologies, situation-specific human behavior has become an area of development for practical applications involving signal processing. One important area of development of such practical applications is driving behavior research. Since 1999, I have been collecting driving behavior data in a wide range of signal modalities, including speech/sound, video, physical/physiological sensors, CAN bus, LIDAR and GNSS. The objective of this data collection is to evaluate how well signal models can represent human behavior while driving. In this talk, I would like to summarize our 10 years of study of driving behavior signal processing, which has been based on these signal corpora. In particular, statistical signal models of interactions between traffic contexts and driving behavior, i.e., stochastic driver modeling, will be discussed, in the context of risky lane change detection. I greatly look forward to discussing the scalability of such corpus-based approaches, which could be applied to almost any traffic situation.
  •  TALK   Driver's mental workload estimation based on the reflex eye movement
    Date & Time: Tuesday, March 15, 2016; 12:45 PM - 1:30 PM
    Speaker: Prof. Hirofumi Aoki, Nagoya University
    Research Areas: Multimedia, Speech & Audio
    Brief
    • Driving requires complex skills involving the vehicle itself (e.g., speed control and instrument operation), other road users (e.g., other vehicles, pedestrians), the surrounding environment, and so on. During driving, visual cues are the main source of information for the brain. To stabilize visual information while the body is moving, the eyes move in the opposite direction based on input to the vestibular system. This involuntary eye movement is called the vestibulo-ocular reflex (VOR), and physiological models of it have been studied. Obinata et al. found that the VOR can be used to estimate mental workload. Since then, our research group has been developing methods to quantitatively estimate mental workload during driving by means of reflex eye movement. In this talk, I will explain the basic mechanism of reflex eye movement and how to apply it to mental workload estimation. I will also introduce our latest work on combining the VOR and OKR (optokinetic reflex) models for naturalistic driving environments.
  •  TALK   Emotion Detection for Health Related Issues
    Date & Time: Tuesday, February 16, 2016; 12:00 PM - 1:00 PM
    Speaker: Dr. Najim Dehak, MIT
    Research Areas: Multimedia, Speech & Audio
    Brief
    • Recently, there has been a great increase of interest in the field of emotion recognition based on different human modalities, such as speech and heart rate. Emotion recognition systems can be very useful in several areas, such as medicine and telecommunications. In the medical field, identifying emotions can be an important tool for detecting and monitoring patients with mental health disorders. In addition, identifying the emotional state from voice provides opportunities for the development of automated dialogue systems capable of producing reports to the physician based on frequent phone communication between the system and the patients. In this talk, we will describe a health-related application of a voice-based emotion recognition system for detecting and monitoring people's emotional state.
  •  EVENT   SANE 2015 - Speech and Audio in the Northeast
    Date: Thursday, October 22, 2015
    MERL Contacts: Jonathan Le Roux; John Hershey
    Location: Google, New York City, NY
    Research Areas: Multimedia, Speech & Audio
    Brief
    • SANE 2015, a one-day event gathering researchers and students in speech and audio from the Northeast of the American continent, will be held on Thursday October 22, 2015 at Google, in New York City, NY.

      It is a follow-up to SANE 2012, held at Mitsubishi Electric Research Labs (MERL), SANE 2013, held at Columbia University, and SANE 2014, held at MIT, which each gathered 70 to 90 researchers and students.

      SANE 2015 will feature invited talks by leading researchers from the Northeast, as well as from the international community: Rohit Prasad (Amazon), Michael Mandel (Brooklyn College, CUNY), Ron Weiss (Google), John Hershey (MERL), Pablo Sprechmann (NYU), Tuomas Virtanen (Tampere University of Technology), and Paris Smaragdis (UIUC). It will also feature a lively poster session during lunch time, open to both students and researchers.

      SANE 2015 is organized by Jonathan Le Roux (MERL), Hank Liao (Google), Andrew Senior (Google), and John R. Hershey (MERL).
  •  EVENT   Celebrating "Women in Science at MERL" luncheon
    Date & Time: Tuesday, August 4, 2015; 12:00
    MERL Contacts: Elizabeth Phillips; Jinyun Zhang
    Location: Mitsubishi Electric Research Laboratories
    Research Areas: Algorithms, Electronics & Communications, Data Analytics, Multimedia, Mechatronics, Computer Vision
    Brief
    • To celebrate "Women in Science at MERL," a luncheon event was organized on August 4. Eleven female interns, three female researchers, and female members of HQ staff, interns’ hosts/managers and MERL executives participated in that event. All female interns introduced their research projects and their positive experiences at MERL; female researchers shared their own career development stories; and at the end all discussed how to be successful in the field of science. Every participant was inspired to continue contributing to the future of science.
  •  EVENT   ICME 2015 - IEEE International Conference on Multimedia & Expo
    Date: Wednesday, July 1, 2015
    MERL Contact: Anthony Vetro
    Location: Torino, Italy
    Research Area: Multimedia
    Brief
    • Anthony Vetro is the General Co-chair of ICME 2015, the IEEE International Conference on Multimedia & Expo, to be held in Torino, Italy, in July 2015.
  •  EVENT   SANE 2014 - Speech and Audio in the Northeast
    Date: Thursday, October 23, 2014
    MERL Contacts: Jonathan Le Roux; John Hershey
    Location: MIT, Cambridge, MA
    Research Areas: Multimedia, Speech & Audio
    Brief
    • SANE 2014, a one-day event gathering researchers and students in speech and audio from the Northeast of the American continent, will be held on Thursday October 23, 2014 at MIT, in Cambridge, MA.

      It is a follow-up to SANE 2012, held at Mitsubishi Electric Research Labs (MERL), and SANE 2013, held at Columbia University, which each gathered around 70 researchers and students.

      SANE 2014 will feature invited talks by leading researchers from the Northeast as well as Europe: Najim Dehak (MIT), Hakan Erdogan (MERL/Sabanci University), Gael Richard (Telecom ParisTech), George Saon (IBM Research), Andrew Senior (Google Research), Stavros Tsakalidis (BBN - Raytheon), and David Wingate (Lyric). It will also feature a lively poster session during lunch time, open to both students and researchers.

      SANE 2014 is organized by Jonathan Le Roux (MERL), Jim Glass (MIT), and John R. Hershey (MERL).
  •  EVENT   107th MPEG meeting
    Date: Monday, January 13, 2014 - Friday, January 17, 2014
    MERL Contact: Anthony Vetro
    Location: San Jose, CA
    Research Area: Multimedia
    Brief
    • MERL is a sponsor for the 107th MPEG meeting to be held in San Jose, CA, in January 2014. MERL researcher Anthony Vetro serves as Head of the US Delegation to MPEG.
  •  EVENT   SANE 2013 - Speech and Audio in the Northeast
    Date & Time: Thursday, October 24, 2013; 8:45 AM - 5:00 PM
    MERL Contacts: Jonathan Le Roux; John Hershey
    Location: Columbia University
    Research Areas: Multimedia, Speech & Audio
    Brief
    • SANE 2013, a one-day event gathering researchers and students in speech and audio from the Northeast of the American continent, will be held on Thursday October 24, 2013 at Columbia University, in New York City.

      A follow-up to SANE 2012 held in October 2012 at MERL in Cambridge, MA, this year's SANE will be held in conjunction with the WASPAA workshop, held October 20-23 in upstate New York. WASPAA attendees are welcome and encouraged to attend SANE.

      SANE 2013 will feature invited speakers from the Northeast, as well as from the international community. It will also feature a lively poster session during lunch time, open to both students and researchers.

      SANE 2013 is organized by Prof. Dan Ellis (Columbia University), Jonathan Le Roux (MERL) and John R. Hershey (MERL).
  •  TALK   Efficiently sampling wave fields
    Date & Time: Thursday, October 17, 2013; 12:00 PM
    Speaker: Prof. Laurent Daudet, Paris Diderot University, France
    MERL Host: Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
    Brief
    • In acoustics, one may wish to acquire a wavefield over a whole spatial domain, while only point measurements (i.e., with microphones) are possible. Even with few sources, this remains a difficult problem because of reverberation, which can be hard to characterize. This can be seen as a sampling/interpolation problem, and it raises a number of interesting questions: how many sample points are needed, where to place them, etc. In this presentation, we will review some case studies, in 2D (vibrating plates) and 3D (room acoustics), with numerical and experimental data, where we have developed sparse models, possibly with additional 'structures', based on physical modeling of the acoustic field. These types of models are well suited to reconstruction techniques known as compressed sensing. These principles can also be used for sub-Nyquist optical imaging: we will show preliminary experimental results of a new compressive imager, remarkably simple in its principle, using a multiply scattering medium.
  •  EVENT   IEEE Transactions on Image Processing, Special Issue on 3D Representation, Compression & Rendering
    Date: Thursday, August 1, 2013
    MERL Contact: Anthony Vetro
    Research Area: Multimedia
    Brief
    • Anthony Vetro is Guest Editor for the Special Issue on 3D Representation, Compression & Rendering of the IEEE Transactions on Image Processing.
  •  EVENT   QoMEX 2013 - Fifth International Workshop on Quality of Multimedia Experience
    Date: Wednesday, July 3, 2013 - Friday, July 5, 2013
    MERL Contact: Anthony Vetro
    Location: Klagenfurt am Worthersee, Austria
    Research Area: Multimedia
    Brief
    • Anthony Vetro is the publicity chair for America of QoMEX 2013, the Fifth International Workshop on Quality of Multimedia Experience, to be held in Klagenfurt am Worthersee, Austria, in July 2013.
  •  EVENT   CHiME 2013 - The 2nd International Workshop on Machine Listening in Multisource Environments
    Date & Time: Saturday, June 1, 2013; 9:00 AM - 6:00 PM
    MERL Contact: Jonathan Le Roux
    Location: Vancouver, Canada
    Research Areas: Multimedia, Speech & Audio
    Brief
    • MERL researchers Shinji Watanabe and Jonathan Le Roux are members of the organizing committee of CHiME 2013, the 2nd International Workshop on Machine Listening in Multisource Environments, with Jonathan acting as Program Co-Chair. MERL is also a sponsor for the event.

      CHiME 2013 is a one-day workshop to be held in conjunction with ICASSP 2013 that will consider the challenge of developing machine listening applications for operation in multisource environments, i.e., real-world conditions with acoustic clutter, where the number and nature of the sound sources are unknown and changing over time. CHiME brings together researchers from a broad range of disciplines (computational hearing, blind source separation, speech recognition, machine learning) to discuss novel and established approaches to this problem. The cross-fertilisation of ideas will foster fresh approaches that efficiently combine the complementary strengths of each research field.
  •  EVENT   ICASSP 2013 - Student Career Luncheon
    Date & Time: Thursday, May 30, 2013; 12:30 PM - 2:30 PM
    MERL Contacts: Anthony Vetro; Petros Boufounos; Jonathan Le Roux
    Location: Vancouver, Canada
    Research Areas: Multimedia, Speech & Audio
    Brief
    • MERL is a sponsor for the first ICASSP Student Career Luncheon that will take place at ICASSP 2013. MERL members will take part in the event to introduce MERL and talk with students interested in positions or internships.
  •  EVENT   ISCAS 2013 - IEEE International Symposium on Circuits & Systems
    Date: Sunday, May 19, 2013 - Thursday, May 23, 2013
    MERL Contact: Anthony Vetro
    Location: Beijing, China
    Research Area: Multimedia
    Brief
    • Anthony Vetro is the Demo Co-chair of ISCAS 2013, the IEEE International Symposium on Circuits & Systems, to be held in Beijing, China, in May 2013.
  •  TALK   Practical kernel methods for automatic speech recognition
    Date & Time: Tuesday, May 7, 2013; 2:30 PM
    Speaker: Dr. Yotaro Kubo, NTT Communication Science Laboratories, Kyoto, Japan
    Research Areas: Multimedia, Speech & Audio
    Brief
    • Kernel methods are important for achieving both convexity in estimation and the ability to represent nonlinear classification. However, in the field of automatic speech recognition, kernel methods have not been widely used. In this presentation, I will introduce several attempts to practically incorporate kernel methods into acoustic models for automatic speech recognition. The presentation will consist of two parts. The first part will describe maximum entropy discrimination and its application to kernel machine training. The second part will describe dimensionality reduction of kernel-based features.
  •  TALK   Visual Signal Analysis and Compression: Focus on Texture Similarity
    Date & Time: Friday, May 3, 2013; 12:00 PM
    Speaker: Prof. Thrasyvoulos N. Pappas, Northwestern University
    MERL Host: Anthony Vetro
    Research Area: Multimedia
    Brief
    • Texture is an important visual attribute both for human perception and image analysis systems. We present new structural texture similarity metrics and applications that critically depend on such metrics, with emphasis on image compression and content-based retrieval. The new metrics account for human visual perception and the stochastic nature of textures. They rely entirely on local image statistics and allow substantial point-by-point deviations between textures that according to human judgment are similar or essentially identical.

      We also present new testing procedures for objective texture similarity metrics. We identify three operating domains for evaluating the performance of such similarity metrics: the top of the similarity scale, where a monotonic relationship between metric values and subjective scores is desired; the ability to distinguish between perceptually similar and dissimilar textures; and the ability to retrieve "identical" textures. Each domain has different performance goals and requires different testing procedures. Experimental results demonstrate both the performance of the proposed metrics and the effectiveness of the proposed subjective testing procedures.
  •  TALK   Communication/computation tradeoffs and other practical considerations in distributed convex optimization
    Date & Time: Thursday, March 21, 2013; 12:00 PM
    Speaker: Konstantinos Tsianos, McGill University, Montreal, Canada
    MERL Host: Petros Boufounos
    Research Area: Multimedia
    Brief
    • Distributed algorithms are becoming necessary to harness the computational resources needed for solving the large-scale optimization problems that arise in areas such as machine learning, computational biology, and others. We study a very general distributed setting where the data is distributed over many machines that can communicate with one another over a network that does not have any specialized communication infrastructure. In this setting, the role of the network becomes critical to the performance of a distributed algorithm. From a more theoretical standpoint, we discuss two questions: 1) How many nodes should we use for a given problem before communication becomes a bottleneck? and 2) How often should the nodes communicate with one another for the communication cost to be worth the transmission? In addition, we discuss some more practical issues that one needs to consider in implementing algorithms that are asynchronous and robust to communication delays.
  •  TALK   Signal Processing on Graphs: Theory and Applications
    Date & Time: Thursday, March 21, 2013; 12:00 PM
    Speaker: Prof. Antonio Ortega, University of Southern California
    MERL Host: Anthony Vetro
    Research Area: Multimedia
    Brief
    • Graphs have long been used in a wide variety of problems, such as analysis of social networks, machine learning, network protocol optimization, decoding of LDPC codes, or image processing. Techniques based on spectral graph theory provide a "frequency" interpretation of graph data and have proven to be quite popular in multiple applications.

      In the last few years, a growing amount of work has started extending and complementing spectral graph techniques, leading to the emergence of "Graph Signal Processing" as a broad research field. A common characteristic of this recent work is that it considers the data attached to the vertices as a "graph-signal" and seeks to create new techniques (filtering, sampling, interpolation), similar to those commonly used in conventional signal processing (for audio, images or video), so that they can be applied to these graph signals.

      In this talk, we first introduce some of the basic tools needed in developing new graph signal processing operations. We then introduce our design of wavelet filterbanks on graphs, which for the first time provides multi-resolution, critically sampled, frequency- and graph-localized transforms for graph signals. We conclude by providing several examples of how these new transforms and tools can be applied to existing problems. Time permitting, we will discuss applications to image processing, depth video compression, recommendation system design and network optimization.
  •  TALK   Probabilistic Latent Tensor Factorisation
    Date & Time: Tuesday, February 26, 2013; 12:00 PM
    Speaker: Prof. Taylan Cemgil, Bogazici University, Istanbul, Turkey
    MERL Host: Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
    Brief
    • Algorithms for decompositions of matrices are of central importance in machine learning, signal processing and information retrieval, with SVD and NMF (Nonnegative Matrix Factorisation) being the most widely used examples. Probabilistic interpretations of matrix factorisation models are also well known and are useful in many applications (Salakhutdinov and Mnih 2008; Cemgil 2009; Fevotte et al. 2009). In recent years, decompositions of multiway arrays, known as tensor factorisations, have gained significant popularity for the analysis of large data sets with more than two entities (Kolda and Bader 2009; Cichocki et al. 2008). We will discuss a subset of these models from a statistical modelling perspective, building upon probabilistic Bayesian generative models and generalised linear models (McCullagh and Nelder). In both views, the factorisation is implicit in a well-defined hierarchical statistical model and factorisations can be computed via maximum likelihood.

      We express a tensor factorisation model using a factor graph and the factor tensors are optimised iteratively. In each iteration, the update equation can be implemented by a message passing algorithm, reminiscent of variable elimination in a discrete graphical model. This setting provides a structured and efficient approach that enables very easy development of application-specific custom models, as well as algorithms for the so-called coupled (collective) factorisations, where an arbitrary set of tensors is factorised simultaneously with shared factors. Extensions to full Bayesian inference for model selection, via variational approximations or MCMC, are also feasible. Well-known models of multiway analysis such as Nonnegative Matrix Factorisation (NMF), Parafac, Tucker, and audio processing (Convolutive NMF, NMF2D, SF-SSNTF) appear as special cases, and new extensions can easily be developed. We will illustrate the approach with applications in link prediction and audio and music processing.