News & Events

  •  TALK   Non-negative Hidden Markov Modeling of Audio
    Date & Time: Thursday, October 11, 2012; 2:30 PM
    Speaker: Dr. Gautham J. Mysore, Adobe
    Research Areas: Multimedia, Speech & Audio
    Brief
    • Non-negative spectrogram factorization techniques have become quite popular in the last decade as they are effective in modeling the spectral structure of audio. They have been extensively used for applications such as source separation and denoising. These techniques, however, fail to account for non-stationarity and temporal dynamics, which are two important properties of audio. In this talk, I will introduce the non-negative hidden Markov model (N-HMM) and the non-negative factorial hidden Markov model (N-FHMM) to model single sound sources and sound mixtures, respectively. They jointly model the spectral structure and temporal dynamics of sound sources, while accounting for non-stationarity. I will also discuss the use of these models in applications such as source separation, denoising, and content-based audio processing, showing why they yield improved performance when compared to non-negative spectrogram factorization techniques.
  •  EVENT   ICIP 2012 - IEEE International Conference on Image Processing
    Date: Sunday, September 30, 2012 - Wednesday, October 3, 2012
    MERL Contact: Anthony Vetro
    Location: Orlando, FL
    Research Area: Multimedia
    Brief
    • Anthony Vetro is the Industrial Co-chair of ICIP 2012, the IEEE International Conference on Image Processing, to be held in Orlando, Florida, in September 2012.
  •  NEWS   ICIP 2012: 4 publications by Anthony Vetro, Shantanu D. Rane, Huifang Sun and Dong Tian
    Date: September 30, 2012
    Where: IEEE International Conference on Image Processing (ICIP)
    MERL Contacts: Dong Tian; Anthony Vetro; Huifang Sun
    Research Area: Multimedia
    Brief
    • The papers "An Attribute-Based Framework for Privacy Preserving Image Querying" by Rane, S. and Sun, W., "Emerging Cryptographic Challenges in Image and Video Processing" by Puech, W., Erkin, Z., Barni, M., Rane, S. and Lagendijk, R.L., "A Local Depth Image Enhancement Scheme for View Synthesis" by Wang, Y., Tian, D. and Vetro, A. and "On Modeling the Rendering Error in 3D Video" by Cheung, N.-M., Tian, D., Vetro, A. and Sun, H. were presented at the IEEE International Conference on Image Processing (ICIP).
  •  NEWS   MMSP 2012: publication by Petros T. Boufounos, Shantanu D. Rane and others
    Date: September 17, 2012
    Where: IEEE International Workshop on Multimedia Signal Processing (MMSP)
    MERL Contact: Petros Boufounos
    Research Areas: Multimedia, Digital Video, Computational Sensing
    Brief
    • The paper "Quantized Embeddings of Scale-Invariant Image Features for Mobile Augmented Reality" by Li, M., Rane, S. and Boufounos, P. was presented at the IEEE International Workshop on Multimedia Signal Processing (MMSP).
  •  TALK   Tensor representation of speaker space for arbitrary speaker conversion
    Date & Time: Thursday, September 6, 2012; 12:00 PM
    Speaker: Dr. Daisuke Saito, The University of Tokyo
    Research Areas: Multimedia, Speech & Audio
    Brief
    • In voice conversion studies, converting from or to an arbitrary speaker's voice is one of the important objectives. For this purpose, eigenvoice conversion (EVC) based on an eigenvoice Gaussian mixture model (EV-GMM) was proposed. In EVC, as in speaker recognition approaches, a speaker space is constructed from GMM supervectors, which are high-dimensional vectors derived by concatenating the mean vectors of each of the speaker GMMs. In the speaker space, each speaker is represented by a small number of weight parameters of eigen-supervectors. In this talk, we revisit the construction of the speaker space by introducing tensor analysis of the training data set. In our approach, each speaker is represented as a matrix whose rows and columns correspond to the Gaussian components and the dimensions of the mean vector, respectively, and the speaker space is derived by tensor analysis of the set of these matrices. Our approach can solve an inherent problem of the supervector representation, and it improves the performance of voice conversion. Experimental results of one-to-many voice conversion demonstrate the effectiveness of the proposed approach.
  •  AWARD   MMSP 2012 Top 10% Paper Award
    Date: September 1, 2012
    Awarded to: Mu Li, Shantanu Rane and Petros Boufounos
    Awarded for: "Quantized Embeddings of Scale-Invariant Image Features for Mobile Augmented Reality"
    Awarded by: IEEE International Workshop on Multimedia Signal Processing (MMSP)
    MERL Contact: Petros Boufounos
    Research Areas: Multimedia, Digital Video, Computational Sensing
  •  NEWS   International Workshops APPROX/RANDOM 2012: publication by Petros T. Boufounos and others
    Date: August 15, 2012
    Where: International Workshops APPROX/RANDOM
    MERL Contact: Petros Boufounos
    Research Areas: Multimedia, Computational Sensing
    Brief
    • The paper "What's the Frequency, Kenneth?: Sublinear Fourier Sampling Off the Grid" by Boufounos, P., Cevher, V., Gilbert, A.C., Li, Y. and Strauss, M.J. was presented at the International Workshops APPROX/RANDOM.
  •  NEWS   SPIE Conference on Applications of Digital Image Processing 2012: publication by Anthony Vetro and Dong Tian
    Date: August 12, 2012
    Where: SPIE Conference on Applications of Digital Image Processing
    MERL Contacts: Dong Tian; Anthony Vetro
    Research Areas: Multimedia, Digital Video
    Brief
    • The paper "Analysis of 3D and Multiview Extensions of the Emerging HEVC Standard" by Vetro, A. and Tian, D. was presented at the SPIE Conference on Applications of Digital Image Processing.
  •  TALK   Communication Systems for Oilfield Applications
    Date & Time: Tuesday, August 7, 2012; 12:00 PM
    Speaker: Dr. Julius Kusuma, Schlumberger-Doll Research
    MERL Host: Petros Boufounos
    Research Area: Multimedia
    Brief
    • The oilfield is a rich area for research and engineering in communication and signal processing. Communication over non-standard channels, using constrained sources, noisy environments, and limited computational and energy resources, are some of the key challenges in this domain. In this talk, I will first introduce the role of science and technology, in particular communication and signal processing, in the oilfield. Due to its unique role in the industry, Schlumberger has a rich variety of communication systems over EM wireless, wired, acoustic, and even fluid pressure channels.

      We give a brief tour of the state of the art and showcase how technology has revolutionized the practice of the industry, enabling innovations such as horizontal drilling, logging-while-drilling, and well placement. At the same time, we give a tutorial on how the lifecycle of a reservoir is managed, including imaging, drilling, logging, sampling, testing, and completing. Throughout, we will show how communication has revolutionized the practice in the industry.
  •  TALK   Nonparametric Bayesian Latent Variable Models
    Date & Time: Friday, July 27, 2012; 12:00 PM
    Speaker: Mingyuan Zhou, Duke University
    MERL Host: Dehong Liu
    Research Area: Multimedia
    Brief
    • Bayesian nonparametrics, using stochastic processes as prior distributions, is a relatively young and rapidly growing research area in statistics and machine learning. In this talk, we first briefly review completely random measures, a family of pure-jump non-negative stochastic processes that are simple to construct and amenable to posterior computation. We then present nonparametric Bayesian latent variable models based on the beta process, Bernoulli process, gamma process, Poisson process, and in particular, the negative binomial process. Specifically, for continuous data, we discuss dictionary learning with the beta-Bernoulli process and dependent hierarchical beta process, and for count data, we present the beta-negative binomial process and Poisson factor analysis. Furthermore, we discuss how the seemingly disjoint count and mixture modeling can be united under the negative binomial process framework, providing new opportunities to build mixture and hierarchical mixture models with better data fitting, more efficient inference and more flexible model constructions. We show successful applications of our nonparametric Bayesian latent variable models to image processing, topic modeling and count data analysis.
  •  NEWS   IEEE Transactions on Information Forensics and Security: publication by Ye Wang, Shantanu D. Rane and others
    Date: July 24, 2012
    Where: IEEE Transactions on Information Forensics and Security
    MERL Contact: Ye Wang
    Research Areas: Multimedia, Information Security
    Brief
    • The article "A Theoretical Analysis of Authentication, Privacy, and Reusability Across Secure Biometric Systems" by Wang, Y., Rane, S., Draper, S.C. and Ishwar, P. was published in IEEE Transactions on Information Forensics and Security.
  •  NEWS   IGARSS 2012: publication by Petros T. Boufounos and Dehong Liu
    Date: July 22, 2012
    Where: IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
    MERL Contacts: Dehong Liu; Petros Boufounos
    Research Areas: Multimedia, Digital Video, Computational Sensing
    Brief
    • The paper "Pan-Sharpening with Multi-scale Wavelet Dictionary" by Liu, D. and Boufounos, P.T. was presented at the IEEE International Geoscience and Remote Sensing Symposium (IGARSS).
  •  NEWS   ICME 2012: publication by Shantanu D. Rane and others
    Date: July 9, 2012
    Where: IEEE International Conference on Multimedia and Expo (ICME)
    Research Areas: Multimedia, Information Security
    Brief
    • The paper "A Distance-sensitive Attribute Based Cryptosystem for Privacy-Preserving Querying" by Sun, W. and Rane, S. was presented at the IEEE International Conference on Multimedia and Expo (ICME).
  •  TALK   Quadratic Gaussian Multiterminal Source Coding
    Date & Time: Friday, July 6, 2012; 12:00 PM
    Speaker: Zixiang Xiong, Texas A&M University
    MERL Host: Anthony Vetro
    Research Area: Multimedia
    Brief
    • Driven by a host of emerging applications, distributed source coding has attracted renewed interest in the past decade. Although the Slepian-Wolf theorem has been known for almost 40 years and progress has been made recently on the rate region of quadratic Gaussian two-terminal source coding, finding the sum-rate bound of quadratic Gaussian multiterminal source coding with more than two terminals is still an open problem. In this talk, I'll briefly go over existing results on distributed source coding problems before describing a set of new results we obtained recently.
  •  TALK   Sparse projections onto convex sets
    Date: Tuesday, July 3, 2012
    Speaker: Prof. Volkan Cevher, EPFL
    MERL Host: Petros Boufounos
    Research Area: Multimedia
    Brief
    • Many natural and man-made signals exhibit a few degrees of freedom relative to their dimension due to natural parameterizations or constraints. The inherent low-dimensional structure of such signals is mathematically modeled via combinatorial and geometric concepts, such as sparsity, unions-of-subspaces, or spectral sets, and is now revolutionizing the way we address linear inverse problems from incomplete data.

      In this talk, we describe a set of structured sparse models for constrained linear inverse problems that feature exact and epsilon-approximate projections in polynomial time. We pay particular attention to the sparsity models based on matroids, multi-knapsack, and clustering, as well as spectrally constrained models. We then study sparse projections onto convex sets, such as the (general) simplex and ell-1,2,inf balls. Finally, we describe a hybrid optimization framework that explicitly leverages these non-convex models along with additional convex constraints to obtain better recovery performance in compressive sensing, learn interpretable sparse densities from finite samples, and construct sparse Markowitz portfolios with better return/cost performance.
  •  NEWS   PCS 2012: publication by Anthony Vetro, Robert A. Cohen, Huifang Sun and others
    Date: May 7, 2012
    Where: Picture Coding Symposium (PCS)
    MERL Contacts: Anthony Vetro; Huifang Sun
    Research Areas: Multimedia, Digital Video
    Brief
    • The paper "Predictive Coding of Intra Prediction Modes for High Efficiency Video Coding" by Xu, X., Cohen, R., Vetro, A. and Sun, H. was presented at the Picture Coding Symposium (PCS).
  •  NEWS   IWSML 2012: publication by Jonathan Le Roux, John R. Hershey and others
    Date: March 31, 2012
    Where: International Workshop on Statistical Machine Learning for Speech Processing (IWSML)
    MERL Contact: Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
    Brief
    • The paper "Latent Dirichlet Reallocation for Term Swapping" by Heaukulani, C., Le Roux, J. and Hershey, J.R. was presented at the International Workshop on Statistical Machine Learning for Speech Processing (IWSML).
  •  EVENT   ICASSP 2012 - Special Session on Signal-Processing Challenges and Opportunities in Depth Cameras
    Date & Time: Friday, March 30, 2012; 2:00 PM - 4:00 PM
    MERL Contact: Anthony Vetro
    Location: Kyoto, Japan
    Research Area: Multimedia
    Brief
    • Anthony Vetro co-organized a Special Session of ICASSP 2012 on Signal-Processing Challenges and Opportunities in Depth Cameras. ICASSP 2012 will be held in Kyoto, Japan, in March 2012.
  •  NEWS   ICASSP 2012: 8 publications by Petros T. Boufounos, Dehong Liu, John R. Hershey, Jonathan Le Roux and Zafer Sahinoglu
    Date: March 25, 2012
    Where: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
    MERL Contacts: Dehong Liu; Jonathan Le Roux; Petros Boufounos
    Research Areas: Multimedia, Electronics & Communications
    Brief
    • The papers "Dictionary Learning Based Pan-Sharpening" by Liu, D. and Boufounos, P.T., "Multiple Dictionary Learning for Blocking Artifacts Reduction" by Wang, Y. and Porikli, F., "A Compressive Phase-Locked Loop" by Schnelle, S.R., Slavinsky, J.P., Boufounos, P.T., Davenport, M.A. and Baraniuk, R.G., "Indirect Model-based Speech Enhancement" by Le Roux, J. and Hershey, J.R., "A Clustering Approach to Optimize Online Dictionary Learning" by Rao, N. and Porikli, F., "Parametric Multichannel Adaptive Signal Detection: Exploiting Persymmetric Structure" by Wang, P., Sahinoglu, Z., Pun, M.-O. and Li, H., "Additive Noise Removal by Sparse Reconstruction on Image Affinity Nets" by Sundaresan, R. and Porikli, F. and "Depth Sensing Using Active Coherent Illumination" by Boufounos, P.T. were presented at the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP).
  •  NEWS   ASJ 2012: publication by Jonathan Le Roux and John R. Hershey
    Date: March 13, 2012
    Where: Acoustical Society of Japan Spring Meeting (ASJ)
    MERL Contact: Jonathan Le Roux
    Research Areas: Multimedia, Speech & Audio
    Brief
    • The paper "Speech Enhancement by Indirect VTS" by Le Roux, J. and Hershey, J.R. was presented at the Acoustical Society of Japan Spring Meeting (ASJ).