News & Events

80 Talks were found.


  •  TALK   Factorial Hidden Restricted Boltzmann Machines for Noise Robust Speech Recognition
    Date & Time: Wednesday, October 24, 2012; 3:20 PM
    Speaker: Dr. Steven J. Rennie, IBM Research
    MERL Host: Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
  •  TALK   Zero-Resource Speech Pattern and Sub-Word Unit Discovery
    Date & Time: Wednesday, October 24, 2012; 9:10 AM
    Speaker: Prof. Jim Glass and Chia-ying Lee, MIT CSAIL
    MERL Host: Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
  •  TALK   A new class of dynamical system models for speech and audio
    Date & Time: Wednesday, October 24, 2012; 4:05 PM
    Speaker: Dr. John R. Hershey, MERL
    MERL Host: Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
  •  TALK   Self-Organizing Units (SOUs): Training Speech Recognizers Without Any Transcribed Audio.
    Date & Time: Wednesday, October 24, 2012; 2:15 PM
    Speaker: Dr. Herb Gish, BBN - Raytheon
    MERL Host: Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
  •  TALK   Recognizing and Classifying Environmental Sounds
    Date & Time: Wednesday, October 24, 2012; 11:00 AM
    Speaker: Prof. Dan Ellis, Columbia University
    MERL Host: Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
  •  TALK   Understanding Audition via Sound Analysis and Synthesis
    Date & Time: Wednesday, October 24, 2012; 11:45 AM
    Speaker: Josh McDermott, MIT, BCS
    MERL Host: Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
  •  TALK   Latent Topic Modeling of Conversational Speech
    Date & Time: Wednesday, October 24, 2012; 1:30 PM
    Speaker: Dr. Timothy J. Hazen and David Harwath, MIT Lincoln Labs / MIT CSAIL
    MERL Host: Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
  •  TALK   Advances in Acoustic Modeling at IBM Research: Deep Belief Networks, Sparse Representations
    Date & Time: Wednesday, October 24, 2012; 9:55 AM
    Speaker: Dr. Tara Sainath, IBM Research
    MERL Host: Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
  •  TALK   Non-negative Hidden Markov Modeling of Audio
    Date & Time: Thursday, October 11, 2012; 2:30 PM
    Speaker: Dr. Gautham J. Mysore, Adobe
    MERL Host: John Hershey
    Research Areas: Multimedia, Speech & Audio
    Brief
    • Non-negative spectrogram factorization techniques have become quite popular in the last decade as they are effective in modeling the spectral structure of audio. They have been extensively used for applications such as source separation and denoising. These techniques, however, fail to account for non-stationarity and temporal dynamics, which are two important properties of audio. In this talk, I will introduce the non-negative hidden Markov model (N-HMM) and the non-negative factorial hidden Markov model (N-FHMM) to model single sound sources and sound mixtures, respectively. They jointly model the spectral structure and temporal dynamics of sound sources, while accounting for non-stationarity. I will also discuss the use of these models in applications such as source separation, denoising, and content-based audio processing, showing why they yield improved performance when compared to non-negative spectrogram factorization techniques.
  •  TALK   Interactive Visual Analysis for Engineering Applications
    Date & Time: Thursday, October 11, 2012; 12:00 PM
    Speaker: Kresimir Matkovic, VRVis Research Center, Vienna
    MERL Host: Kent Wittenburg
    Brief
    • Increasing complexity and a large number of control parameters make the design and understanding of modern engineering systems impossible without simulation today. Advances in simulation technology and the ability to run multiple simulations with different sets of parameters pose new challenges for analysis techniques. In this talk we will present our experiences in exploration and analysis of simulation ensembles realized in several projects with experts from the automotive, meteorology, and medical domains. We tightly integrate simulation, numerical optimization, and interactive visual analysis in a unified framework. Our new data model supports families of curves and families of surfaces. Accompanying interactive visual analysis techniques offer new possibilities for data exploration and analysis. It is possible to start with a simple analysis, to continue with identifying hidden features, and finally to explore very complex dependencies using advanced interaction and on-the-fly data derivation and aggregation. All proposed techniques will be illustrated using a coordinated multiple views system and real-life data from various projects with scientists and engineers, including the optimization of an automotive rail injection system.
  •  TALK   Tensor representation of speaker space for arbitrary speaker conversion
    Date & Time: Thursday, September 6, 2012; 12:00 PM
    Speaker: Dr. Daisuke Saito, The University of Tokyo
    Research Areas: Multimedia, Speech & Audio
    Brief
    • In voice conversion studies, realizing conversion from/to an arbitrary speaker's voice is one of the important objectives. For this purpose, eigenvoice conversion (EVC) based on an eigenvoice Gaussian mixture model (EV-GMM) was proposed. In EVC, similarly to speaker recognition approaches, a speaker space is constructed based on GMM supervectors, which are high-dimensional vectors derived by concatenating the mean vectors of each of the speaker GMMs. In the speaker space, each speaker is represented by a small number of weight parameters of eigen-supervectors. In this talk, we revisit construction of the speaker space by introducing tensor analysis of the training data set. In our approach, each speaker is represented as a matrix whose rows and columns correspond, respectively, to the Gaussian components and the dimensions of the mean vector, and the speaker space is derived by tensor analysis of the set of these matrices. Our approach can solve an inherent problem of the supervector representation, and it improves the performance of voice conversion. Experimental results of one-to-many voice conversion demonstrate the effectiveness of the proposed approach.
  •  TALK   Challenges on shape acquisition of moving object
    Date & Time: Friday, August 17, 2012; 12:00 PM
    Speaker: Prof. Hiroshi Kawasaki, Kagoshima University
    MERL Host: Yuichi Taguchi
    Research Area: Computer Vision
    Brief
    • In this talk, I will give an overview of my research projects on 3D shape acquisition of moving objects. The talk focuses mainly on two parts: the first is our 3D shape acquisition technique using a projector-camera (pro-cam) system, and the second is entire shape acquisition using a multi-view pro-cam system. I will also briefly cover the following topics:

      -- Theory of the shape-from-coplanarity technique
      -- Texture recovery method for the pro-cam system
      -- Future plans for medical applications of our scanner

      This research is joint work with Prof. Katsushi Ikeuchi (Univ. of Tokyo), Prof. Ryo Furukawa (Hiroshima City Univ.) and Prof. Ryusuke Sagawa (AIST).
  •  TALK   Communication Systems for Oilfield Applications
    Date & Time: Tuesday, August 7, 2012; 12:00 PM
    Speaker: Dr. Julius Kusuma, Schlumberger-Doll Research
    MERL Host: Petros Boufounos
    Research Area: Multimedia
    Brief
    • The oilfield is a rich area for research and engineering in communication and signal processing. Non-standard channels, constrained sources, noisy environments, and limited computational and energy resources are some of the key challenges in this domain. In this talk I will first introduce the role of science and technology, in particular communication and signal processing, in the oilfield. Due to its unique role in the industry, Schlumberger has a rich variety of communication systems over EM wireless, wired, acoustic, and even fluid pressure channels.

      In this talk we give a brief tour of the state of the art and showcase how technology has revolutionized the practice of the industry, enabling innovations such as horizontal drilling, logging-while-drilling, and well-placement. At the same time, we give a tutorial on how the lifecycle of a reservoir is managed, including imaging, drilling, logging, sampling, testing, and completing. Throughout, we will show how communication has revolutionized practice in the industry.
  •  TALK   Feedback Particle Filter and its Applications
    Date & Time: Wednesday, August 1, 2012; 12:00 PM
    Speaker: Prof. Prashant Mehta, University of Illinois at Urbana-Champaign
    MERL Host: Scott Bortoff
    Research Area: Mechatronics
    Brief
    • In my talk, I will present a self-contained introduction to nonlinear filtering, and describe some recent developments. Specifically, I will introduce the feedback particle filter and show how it admits an innovations error-based feedback control structure. The control is chosen so that the posterior distribution of any particle matches the posterior distribution of the true state given the observations. The subject of my talk is a new formulation of the nonlinear filter (for Bayesian inference) that is based on concepts from optimal control and mean-field game theory. Nonlinear filtering is important to many applications in engineering, biology, economics, atmospheric sciences and neuroscience. Several applications will be described to illustrate the theoretical concepts.

      This is joint work with Tao Yang and Sean Meyn at the University of Illinois.
  •  TALK   Nonparametric Bayesian Latent Variable Models
    Date & Time: Friday, July 27, 2012; 12:00 PM
    Speaker: Mingyuan Zhou, Duke University
    MERL Host: Dehong Liu
    Research Area: Multimedia
    Brief
    • Bayesian nonparametrics, using stochastic processes as prior distributions, is a relatively young and rapidly growing research area in statistics and machine learning. In this talk, we first briefly review completely random measures, a family of pure-jump non-negative stochastic processes that are simple to construct and amenable to posterior computation. We then present nonparametric Bayesian latent variable models based on the beta process, Bernoulli process, gamma process, Poisson process, and in particular, the negative binomial process. Specifically, for continuous data, we discuss dictionary learning with the beta-Bernoulli process and dependent hierarchical beta process, and for count data, we present the beta-negative binomial process and Poisson factor analysis. Furthermore, we discuss how the seemingly disjoint problems of count modeling and mixture modeling can be united under the negative binomial process framework, providing new opportunities to build mixture and hierarchical mixture models with better data fitting, more efficient inference and more flexible model constructions. We show successful applications of our nonparametric Bayesian latent variable models to image processing, topic modeling and count data analysis.
  •  TALK   A Pole-Placement Approach to the Design of Robust Linear Multivariable Control Systems
    Date: Thursday, July 19, 2012
    Speaker: Rick Vaccaro, University of Rhode Island
    MERL Host: Scott Bortoff
    Research Area: Mechatronics
    Brief
    • The ability to directly specify the closed-loop poles of a multivariable control system is a major benefit of pole-placement algorithms for calculating state-feedback and observer gains. The drawback of these algorithms is the lack of any guarantee on the stability robustness of the resulting control system. The optimal control approach for calculating state-feedback gains (LQR) has a certain guaranteed robustness, but adding an observer (i.e., Kalman filter, LQG) can result in arbitrarily poor robustness. In this talk, a new pole-placement approach is introduced for calculating state-feedback and observer gains. The new approach optimizes robustness and gives impressive results, particularly for output-feedback, observer-based control systems.
  •  TALK   Threat Assessment and Semi-Autonomous Control of Manned and Unmanned Vehicles
    Date & Time: Monday, July 16, 2012; 2:00 PM
    Speaker: Dr. Karl Iagnemma, Director, MIT Robotic Mobility Group
    MERL Host: Stefano Di Cairano
    Research Area: Mechatronics
    Brief
    • Operator error is a significant factor in a majority of manned and unmanned vehicle accidents. In this talk, a framework for semi-autonomous vehicle accident avoidance will be presented that has been shown to effectively mitigate collisions caused by operator error. The framework analyzes sensor data (from vision and/or LIDAR) to identify "no go" regions in the environment, and automatically synthesizes constraints on vehicle position. An optimal trajectory and associated control inputs are then found via linear or nonlinear model predictive control. The "threat" to the vehicle is quantified from various metrics computed over the optimal trajectory. A number of approaches for arbitrating between operator and control system authority, based on the predicted threat, will be discussed. Extensive simulation and experimental testing will be described for both manned and unmanned scenarios. Future directions in threat assessment and semi-autonomous control, based on the integration of vision-based sensing and active steering control, will also be discussed.
  •  TALK   Applications of Mobile Augmented Reality and Pervasive Computing in Architecture, Engineering, and Construction
    Date & Time: Tuesday, July 10, 2012; 11:00 AM
    Speaker: Prof. Vineet Kamat, University of Michigan
    MERL Host: Yuichi Taguchi
    Research Area: Computer Vision
    Brief
    • This talk will present ongoing research at the University of Michigan Laboratory for Interactive Visualization in Engineering (LIVE) that is exploring applications of mobile pervasive computing and visualization in design, engineering, and construction. Findings from three specific research projects will be presented: Interactive Visualization of Construction Operations in Mobile Outdoor Augmented Reality; Rapid Building Damage Evaluation using Augmented Reality and Structural Simulation; and Location-Aware Contextual Information Access and Retrieval for Rapid On-Site Decision Making. In each case, the development of fundamental algorithms, their implementation as reusable and modular software, and their use in the engineering applications will be described.
  •  TALK   Quadratic Gaussian Multiterminal Source Coding
    Date & Time: Friday, July 6, 2012; 12:00 PM
    Speaker: Zixiang Xiong, Texas A&M University
    MERL Host: Anthony Vetro
    Research Area: Multimedia
    Brief
    • Driven by a host of emerging applications, distributed source coding has attracted renewed interest in the past decade. Although the Slepian-Wolf theorem has been known for almost 40 years and progress has been made recently on the rate region of quadratic Gaussian two-terminal source coding, finding the sum-rate bound of quadratic Gaussian multiterminal source coding with more than two terminals is still an open problem. In this talk, I'll briefly go over existing results on distributed source coding problems before describing a set of new results we obtained recently.
  •  TALK   Sparse projections onto convex sets
    Date: Tuesday, July 3, 2012
    Speaker: Prof. Volkan Cevher, EPFL
    MERL Host: Petros Boufounos
    Research Area: Multimedia
    Brief
    • Many natural and man-made signals exhibit a few degrees of freedom relative to their dimension due to natural parameterizations or constraints. The inherent low-dimensional structure of such signals is mathematically modeled via combinatorial and geometric concepts, such as sparsity, unions-of-subspaces, or spectral sets, and is now revolutionizing the way we address linear inverse problems from incomplete data.

      In this talk, we describe a set of structured sparse models for constrained linear inverse problems that feature exact and epsilon-approximate projections in polynomial time. We pay particular attention to the sparsity models based on matroids, multi-knapsack, and clustering as well as spectrally constrained models. We then study sparse projections onto convex sets, such as the (general) simplex, and ell-1,2,inf balls. Finally, we describe a hybrid optimization framework which explicitly leverages these non-convex models along with additional convex constraints to obtain better recovery performance in compressive sensing, learn interpretable sparse densities from finite samples, and construct improved sparse Markowitz portfolios with better return/cost performance.