News & Events

146 Events and Talks were found.


  •  TALK   Probabilistic Latent Tensor Factorisation
    Date & Time: Tuesday, February 26, 2013; 12:00 PM
    Speaker: Prof. Taylan Cemgil, Bogazici University, Istanbul, Turkey
    MERL Host: Jonathan Le Roux
    Research Area: Speech & Audio
    Brief
    • Algorithms for decompositions of matrices are of central importance in machine learning, signal processing and information retrieval, with SVD and NMF (Nonnegative Matrix Factorisation) being the most widely used examples. Probabilistic interpretations of matrix factorisation models are also well known and are useful in many applications (Salakhutdinov and Mnih 2008; Cemgil 2009; Fevotte et al. 2009). In recent years, decompositions of multiway arrays, known as tensor factorisations, have gained significant popularity for the analysis of large data sets with more than two entities (Kolda and Bader, 2009; Cichocki et al. 2008). We will discuss a subset of these models from a statistical modelling perspective, building upon probabilistic Bayesian generative models and generalised linear models (McCullagh and Nelder). In both views, the factorisation is implicit in a well-defined hierarchical statistical model and factorisations can be computed via maximum likelihood.

      We express a tensor factorisation model using a factor graph, and the factor tensors are optimised iteratively. In each iteration, the update equation can be implemented by a message passing algorithm, reminiscent of variable elimination in a discrete graphical model. This setting provides a structured and efficient approach that enables very easy development of application-specific custom models, as well as algorithms for the so-called coupled (collective) factorisations, where an arbitrary set of tensors is factorised simultaneously with shared factors. Extensions to full Bayesian inference for model selection, via variational approximations or MCMC, are also feasible. Well-known models of multiway analysis such as Nonnegative Matrix Factorisation (NMF), PARAFAC and Tucker, and of audio processing (Convolutive NMF, NMF2D, SF-SSNTF), appear as special cases, and new extensions can easily be developed. We will illustrate the approach with applications in link prediction and audio and music processing.
  •  TALK   Bayesian Group Sparse Learning
    Date & Time: Monday, January 28, 2013; 11:00 AM
    Speaker: Prof. Jen-Tzung Chien, National Chiao Tung University, Taiwan
    Research Area: Speech & Audio
    Brief
    • Bayesian learning provides attractive tools to model, analyze, search, recognize and understand real-world data. In this talk, I will introduce a new Bayesian group sparse learning approach and its application to speech recognition and signal separation. First, I present the group sparse hidden Markov models (GS-HMMs), where a sequence of acoustic features is driven by a Markov chain and each feature vector is represented by two groups of basis vectors. The features across states and within states are represented accordingly. The sparse prior is imposed by introducing the Laplacian scale mixture (LSM) distribution. The robustness of speech recognition is illustrated. The LSM distribution is also incorporated into Bayesian group sparse learning based on nonnegative matrix factorization (NMF). This approach is developed to estimate the reconstructed rhythmic and harmonic music signals from a single-channel source signal. A Monte Carlo procedure is presented to infer the two groups of parameters. Future work on Bayesian learning will be discussed.
  •  TALK   Estimation of time-varying parameters in nonlinear systems: Application to building systems and Extremum seeking control
    Date & Time: Tuesday, December 18, 2012; 12:00 PM
    Speaker: Prof. Martin Guay, Queen's University
    Brief
    • In this presentation, an adaptive technique for the estimation of time-varying parameters for a class of continuous-time nonlinear systems is proposed. In the first part of the talk, we present an application of the estimation routine to the estimation of unknown heat loads and heat sinks in building systems. The proposed technique is a set-based adaptive estimation that can be used to estimate the time-varying parameters along with an uncertainty set. The method is such that the uncertainty set update is guaranteed to contain the true value of the parameters. Unlike existing techniques that rely on polynomial approximations of the time-varying behaviour of the parameters, the proposed technique does not require a functional representation of the time-varying behaviour of the parameter estimates.

      In the second part of the talk, we consider the application of the estimation technique to the solution of a class of real-time optimization problems. It is assumed that the equations describing the dynamics of the nonlinear system and the cost function to be minimized are unknown and that the objective function is measured. The main contribution is to formulate the extremum-seeking problem as a time-varying estimation problem. The proposed approach is shown to avoid the need for averaging results, which minimizes the impact of the choice of dither signal on the performance of the extremum-seeking control system.
  •  TALK   Electromagnetic Remote Sensing for the Detection of Concealed Objects
    Date & Time: Thursday, December 13, 2012; 12:00 PM
    Speaker: Dr. Tomasz M. Grzegorczyk, Delpsi LLC
    MERL Host: Anthony Vetro
    Brief
    • Electromagnetic (EM) remote sensing is a well-established modality for the detection, tracking, and identification of concealed targets. The degree of freedom offered by the operating frequency (and the associated propagation or induction regimes) makes EM waves sufficiently versatile to interrogate both large and small structures, metallic as well as dielectric objects, in close proximity or further away. This wide flexibility has made EM remote sensing a modality of choice in many applications. This presentation will focus on two implementations of non-destructive and non-contact EM sensing. The first is based on a tomographic approach, whereby EM waves are used to infer material properties within the volume of accessible structures. The two examples to be discussed are breast cancer detection, i.e. locating areas of high vascularity in otherwise healthy biological tissues, and inspection of concrete structures, i.e. identifying volumetric material property variations to locate rebars and cracks. The second area we will discuss is that of subsurface target detection, again with two very different applications. The first pertains to ground penetrating radars with frequencies in the GHz range aimed at the detection of buried weak dielectric scatterers, whereas the second focuses on the detection of metallic targets in the magnetic induction regime, for which much lower frequencies are used. In all these applications, the data collected by the appropriate hardware are processed by combining fundamental EM concepts with inverse methods for parameter estimation. We will discuss both a deterministic method -- Gauss-Newton -- and a stochastic method -- Kalman filters for real-time target detection.
  •  TALK   Speech recognition for closed-captioning
    Date & Time: Tuesday, December 11, 2012; 12:00 PM
    Speaker: Takahiro Oku, NHK Science & Technology Research Laboratories
    Research Area: Speech & Audio
    Brief
    • In this talk, I will present human-friendly broadcasting research conducted at NHK and research on speech recognition for real-time closed-captioning. The goal of human-friendly broadcasting research is to make broadcasting more accessible and enjoyable for everyone, including children, the elderly, and physically challenged persons. The automatic speech recognition technology that NHK has developed makes it possible to create captions for the hearing impaired automatically and in real time. For sports programs such as professional sumo wrestling, a closed-captioning system has already been implemented in which captions are created by using speech recognition on a captioning re-speaker. In 2011, NHK General Television started broadcasting closed captions for the information program "Morning Market". After introducing the implemented closed-captioning system, I will talk about our recent improvement obtained by an adaptation method that creates a more effective acoustic model using error correction results. The method reflects recognition error tendencies more effectively.
  •  EVENT   APSIPA 2012
    Date: Monday, December 3, 2012 - Thursday, December 6, 2012
    MERL Contact: Anthony Vetro
    Location: Hollywood, CA
    Brief
    • MERL is a sponsor for APSIPA 2012, the fourth annual conference organized by the Asia-Pacific Signal and Information Processing Association.
  •  TALK   Sensitive Manipulation
    Date & Time: Thursday, November 15, 2012; 12:00 PM
    Speaker: Dr. Eduardo Torres-Jara, Worcester Polytechnic Institute
    MERL Host: Jay Thornton
    Research Area: Computer Vision
    Brief
    • This talk presents an alternative approach to robotic manipulation. In this approach, manipulation is mainly guided by tactile feedback as opposed to vision. The motivation behind this approach stems from the fact that manipulating an object necessarily implies coming into contact with it. As a result, directly sensing physical contact seems more important than vision to control the interaction of the object and the robot. In this work, the traditional approach of a highly precise arm guided by a vision system is replaced by one that uses a low mechanical impedance arm with dense tactile sensing and exploration capabilities.

      The robots OBRERO and GoBot have been built to implement this approach. We have developed a novel tactile sensing technology and mounted our sensors on the robots' hands. These sensors are biologically inspired and present adequate features for manipulation. The success of this approach is shown by picking up objects in a poorly modeled environment. This task, simple for humans, has been a challenge for robots. The robot can deal with new, unmodeled objects. Specifically, OBRERO can gently contact, explore, lift, and place an object in a different location. It can also detect basic slippage and external forces acting on an object while it is held. These tasks can be performed successfully with very light objects, without fixtures, and on slippery surfaces. Similarly, GoBot is capable of manipulating small objects such as the stones in the game of Go. Both OBRERO and GoBot perform all of their manipulations using tactile feedback.
  •  TALK   Robust Preconditioners for a boundary control elliptic problem
    Date & Time: Wednesday, November 7, 2012; 12:00 PM
    Speaker: Prof. Marcus Sarkis, Worcester Polytechnic Institute
    Brief
    • We discuss the following problem: Given a target function on a domain, what is the Neumann data on the boundary such that its harmonic extension into the domain is the closest function to the target function in the L2 norm? For convex polygonal domains, we show that regularization is not needed provided the space for the Neumann data is chosen properly. In the second part of the talk, we discuss solvers for the associated discrete Hessian which are robust with respect to regularization parameters and mesh sizes.
  •  TALK   Zero-Resource Speech Pattern and Sub-Word Unit Discovery
    Date & Time: Wednesday, October 24, 2012; 9:10 AM
    Speaker: Prof. Jim Glass and Chia-ying Lee, MIT CSAIL
    MERL Host: Jonathan Le Roux
    Research Area: Speech & Audio
  •  TALK   Self-Organizing Units (SOUs): Training Speech Recognizers Without Any Transcribed Audio.
    Date & Time: Wednesday, October 24, 2012; 2:15 PM
    Speaker: Dr. Herb Gish, BBN - Raytheon
    MERL Host: Jonathan Le Roux
    Research Area: Speech & Audio
  •  TALK   A new class of dynamical system models for speech and audio
    Date & Time: Wednesday, October 24, 2012; 4:05 PM
    Speaker: Dr. John R. Hershey, MERL
    MERL Host: Jonathan Le Roux
    Research Area: Speech & Audio
  •  TALK   Recognizing and Classifying Environmental Sounds
    Date & Time: Wednesday, October 24, 2012; 11:00 AM
    Speaker: Prof. Dan Ellis, Columbia University
    MERL Host: Jonathan Le Roux
    Research Area: Speech & Audio
  •  TALK   Understanding Audition via Sound Analysis and Synthesis
    Date & Time: Wednesday, October 24, 2012; 11:45 AM
    Speaker: Josh McDermott, MIT, BCS
    MERL Host: Jonathan Le Roux
    Research Area: Speech & Audio
  •  TALK   Factorial Hidden Restricted Boltzmann Machines for Noise Robust Speech Recognition
    Date & Time: Wednesday, October 24, 2012; 3:20 PM
    Speaker: Dr. Steven J. Rennie, IBM Research
    MERL Host: Jonathan Le Roux
    Research Area: Speech & Audio
  •  EVENT   SANE 2012 - Speech and Audio in the Northeast
    Date & Time: Wednesday, October 24, 2012; 8:30 AM - 5:00 PM
    MERL Contact: Jonathan Le Roux
    Location: MERL
    Research Area: Speech & Audio
    Brief
    • SANE 2012, a one-day event gathering researchers and students in speech and audio from the northeast of the American continent, will be held on Wednesday October 24, 2012 at Mitsubishi Electric Research Laboratories (MERL) in Cambridge, MA.
  •  TALK   Latent Topic Modeling of Conversational Speech
    Date & Time: Wednesday, October 24, 2012; 1:30 PM
    Speaker: Dr. Timothy J. Hazen and David Harwath, MIT Lincoln Labs / MIT CSAIL
    MERL Host: Jonathan Le Roux
    Research Area: Speech & Audio
  •  TALK   Advances in Acoustic Modeling at IBM Research: Deep Belief Networks, Sparse Representations
    Date & Time: Wednesday, October 24, 2012; 9:55 AM
    Speaker: Dr. Tara Sainath, IBM Research
    MERL Host: Jonathan Le Roux
    Research Area: Speech & Audio
  •  EVENT   Automotive UI 2012 - 4th International Conference on Automotive User Interfaces and Interactive Vehicular Applications
    Date: Wednesday, October 17, 2012 - Friday, October 19, 2012
    MERL Contact: Anthony Vetro
    Location: Portsmouth, NH
    Brief
    • MERL is a sponsor for the Fourth International Conference on Automotive User Interfaces and Interactive Vehicular Applications, Automotive UI 2012.
  •  TALK   Interactive Visual Analysis for Engineering Applications
    Date & Time: Thursday, October 11, 2012; 12:00 PM
    Speaker: Kresimir Matkovic, VRVis Research Center, Vienna
    Brief
    • Increasing complexity and a large number of control parameters make the design and understanding of modern engineering systems impossible without simulation. Advances in simulation technology and the ability to run multiple simulations with different sets of parameters pose new challenges for analysis techniques. In this talk we will present our experiences in the exploration and analysis of simulation ensembles, gained in several projects with experts from the automotive, meteorology, and medical domains. We tightly integrate simulation, numerical optimization, and interactive visual analysis in a unified framework. Our new data model supports families of curves and families of surfaces. Accompanying interactive visual analysis techniques offer new possibilities for data exploration and analysis. It is possible to start with a simple analysis, to continue with identifying hidden features, and finally to explore very complex dependencies using advanced interaction and on-the-fly data derivation and aggregation. All proposed techniques will be illustrated using a coordinated multiple views system and real-life data from various projects with scientists and engineers, including the optimization of an automotive rail injection system.
  •  TALK   Non-negative Hidden Markov Modeling of Audio
    Date & Time: Thursday, October 11, 2012; 2:30 PM
    Speaker: Dr. Gautham J. Mysore, Adobe
    Research Area: Speech & Audio
    Brief
    • Non-negative spectrogram factorization techniques have become quite popular in the last decade as they are effective in modeling the spectral structure of audio. They have been extensively used for applications such as source separation and denoising. These techniques, however, fail to account for non-stationarity and temporal dynamics, which are two important properties of audio. In this talk, I will introduce the non-negative hidden Markov model (N-HMM) and the non-negative factorial hidden Markov model (N-FHMM) to model single sound sources and sound mixtures, respectively. They jointly model the spectral structure and temporal dynamics of sound sources, while accounting for non-stationarity. I will also discuss the use of these models in various applications such as source separation, denoising, and content-based audio processing, showing why they yield improved performance when compared to non-negative spectrogram factorization techniques.
  •  EVENT   ICIP 2012 - IEEE International Conference on Image Processing
    Date: Sunday, September 30, 2012 - Wednesday, October 3, 2012
    MERL Contact: Anthony Vetro
    Location: Orlando, FL
    Brief
    • Anthony Vetro is the Industrial Co-chair of ICIP 2012, the IEEE International Conference on Image Processing, to be held in Orlando, Florida, in September 2012.
  •  TALK   Tensor representation of speaker space for arbitrary speaker conversion
    Date & Time: Thursday, September 6, 2012; 12:00 PM
    Speaker: Dr. Daisuke Saito, The University of Tokyo
    Research Area: Speech & Audio
    Brief
    • In voice conversion studies, conversion from or to an arbitrary speaker's voice is one of the important objectives. For this purpose, eigenvoice conversion (EVC) based on an eigenvoice Gaussian mixture model (EV-GMM) was proposed. In EVC, similarly to speaker recognition approaches, a speaker space is constructed based on GMM supervectors, which are high-dimensional vectors derived by concatenating the mean vectors of each of the speaker GMMs. In the speaker space, each speaker is represented by a small number of weight parameters of eigen-supervectors. In this talk, we revisit the construction of the speaker space by introducing tensor analysis of the training data set. In our approach, each speaker is represented as a matrix whose rows and columns correspond, respectively, to the Gaussian components and the dimensions of the mean vector, and the speaker space is derived by tensor analysis of the set of matrices. Our approach can solve an inherent problem of the supervector representation, and it improves the performance of voice conversion. Experimental results of one-to-many voice conversion demonstrate the effectiveness of the proposed approach.
  •  TALK   Challenges on shape acquisition of moving object
    Date & Time: Friday, August 17, 2012; 12:00 PM
    Speaker: Prof. Hiroshi Kawasaki, Kagoshima University
    Research Area: Computer Vision
    Brief
    • In this talk, I will give an overview of my research projects on 3D shape acquisition of moving objects. The talk mainly focuses on two parts: the first is our 3D shape acquisition technique using a projector and camera system, and the second is entire shape acquisition using a multi-view pro-cam system. I will also briefly cover the following topics:

      -- Theory of the shape-from-coplanarity technique
      -- Texture recovery method for the pro-cam system
      -- Future plans for medical applications of our scanner

      This research is conducted jointly with Prof. Katsushi Ikeuchi (Univ. of Tokyo), Prof. Ryo Furukawa (Hiroshima City Univ.) and Prof. Ryusuke Sagawa (AIST).
  •  TALK   Communication Systems for Oilfield Applications
    Date & Time: Tuesday, August 7, 2012; 12:00 PM
    Speaker: Dr. Julius Kusuma, Schlumberger-Doll Research
    MERL Host: Petros Boufounos
    Brief
    • The oilfield is a rich area for research and engineering in communication and signal processing. Communication over non-standard channels, constrained sources, noisy environments, and limited computational and energy resources are some of the key challenges in this domain. In this talk I will first give an introduction to the role of science and technology, in particular communication and signal processing, in the oilfield. Due to its unique role in the industry, Schlumberger has a rich variety of communication systems over EM wireless, wired, acoustic, and even fluid pressure channels.

      We then give a brief tour of the state of the art and showcase how technology has revolutionized the practice of the industry, enabling innovations such as horizontal drilling, logging-while-drilling, and well placement. At the same time, we give a tutorial on how the lifecycle of a reservoir is managed, including imaging, drilling, logging, sampling, testing, and completing. Throughout, we will show how communication has revolutionized practice in the industry.
  •  TALK   Feedback Particle Filter and its Applications
    Date & Time: Wednesday, August 1, 2012; 12:00 PM
    Speaker: Prof. Prashant Mehta, University of Illinois at Urbana-Champaign
    MERL Host: Scott Bortoff
    Brief
    • In my talk, I will present a self-contained introduction to nonlinear filtering and describe some recent developments. Specifically, I will introduce the feedback particle filter and show how it admits an innovations error-based feedback control structure. The control is chosen so that the posterior distribution of any particle matches the posterior distribution of the true state given the observations. The subject of my talk is a new formulation of the nonlinear filter (for Bayesian inference) that is based on concepts from optimal control and mean-field game theory. Nonlinear filtering is important to many applications in engineering, biology, economics, atmospheric sciences and neuroscience. Several applications will be described to illustrate the theoretical concepts.

      This is joint work with Tao Yang and Sean Meyn at the University of Illinois.