TR2013-021

Non-negative Dynamical System with Application to Speech and Audio

- Fevotte, C., Le Roux, J., Hershey, J.R., "Non-negative Dynamical System with Application to Speech and Audio", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), May 2013.
  BibTeX:
    @inproceedings{Fevotte2013may,
      author = {Fevotte, C. and {Le Roux}, J. and Hershey, J.R.},
      title = {{Non-negative Dynamical System with Application to Speech and Audio}},
      booktitle = {IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
      year = 2013,
      month = may,
      url = {https://www.merl.com/publications/TR2013-021}
    }
MERL Contact:
- Jonathan Le Roux
Research Areas:

Artificial Intelligence, Speech & Audio

Abstract:

Non-negative data arise in a variety of important signal processing domains, such as power spectra of signals, pixels in images, and count data. This paper introduces a novel non-negative dynamical system (NDS) for sequences of such data, and describes its application to modeling speech and audio power spectra. The NDS model can be interpreted both as an adaptation of linear dynamical systems (LDS) to non-negative data, and as an extension of non-negative matrix factorization (NMF) to support Markovian dynamics. Learning and inference algorithms were derived and experiments on speech enhancement were conducted by training sparse non-negative dynamical systems on speech data and adapting a noise model to the unknown noise condition. Results show that the model can capture the dynamics of speech in a useful way.
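
The two interpretations in the abstract (a non-negative counterpart of an LDS, and NMF with Markovian dynamics) can be pictured with a small simulation. The sketch below is illustrative only and is not taken from this page: it assumes a state update h_t = (A h_{t-1}) * eps_t and an observation v_t = (W h_t) * delta_t, where eps_t and delta_t are unit-mean Gamma noise applied element-wise. The function name `simulate_nds`, the matrices `W` and `A`, the dimensions, and the shape parameters `alpha` and `beta` are all hypothetical choices made for the example.

```python
import numpy as np


def simulate_nds(W, A, h0, n_frames, alpha=10.0, beta=10.0, seed=0):
    """Simulate a toy non-negative sequence (illustrative assumption only).

    State:        h_t = (A @ h_{t-1}) * eps_t
    Observation:  v_t = (W @ h_t) * delta_t
    eps_t and delta_t are element-wise Gamma noise with mean 1, so every
    quantity stays non-negative as long as W, A and h0 are non-negative.
    """
    rng = np.random.default_rng(seed)
    n_states = A.shape[0]
    n_bins = W.shape[0]
    H = np.empty((n_states, n_frames))  # activations, one column per frame
    V = np.empty((n_bins, n_frames))    # simulated power spectra
    h = np.asarray(h0, dtype=float)
    for t in range(n_frames):
        # Multiplicative innovation: Gamma(shape=alpha, scale=1/alpha) has mean 1.
        eps = rng.gamma(shape=alpha, scale=1.0 / alpha, size=n_states)
        h = (A @ h) * eps
        # Multiplicative observation noise, also with mean 1.
        delta = rng.gamma(shape=beta, scale=1.0 / beta, size=n_bins)
        H[:, t] = h
        V[:, t] = (W @ h) * delta
    return V, H


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    W = rng.random((6, 3))            # hypothetical non-negative dictionary
    A = np.full((3, 3), 1.0 / 3.0)    # hypothetical non-negative transition matrix
    V, H = simulate_nds(W, A, h0=np.ones(3), n_frames=20)
    print(V.shape, H.shape)
```

In this sketch, dropping the dependence on the previous state leaves each frame as a noisy product of a dictionary and an activation vector, which is the NMF view; keeping it gives each activation vector a Markovian dependence on its predecessor, which is the LDS-like view.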

Software & Data Downloads

Non-negative Dynamical System model

Related News & Events

NEWS ICASSP 2013: 9 publications by Jonathan Le Roux, Dehong Liu, Robert A. Cohen, Dong Tian, Shantanu D. Rane, Jianlin Guo, John R. Hershey, Shinji Watanabe, Petros T. Boufounos, Zafer Sahinoglu and Anthony Vetro
Date: May 26, 2013
Where: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
MERL Contacts: Dehong Liu; Jianlin Guo; Anthony Vetro; Petros T. Boufounos; Jonathan Le Roux
Brief
- The following papers were presented at the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP):
  - "Stereo-based Feature Enhancement Using Dictionary Learning" by Watanabe, S. and Hershey, J.R.
  - "Effectiveness of Discriminative Training and Feature Transformation for Reverberated and Noisy Speech" by Tachioka, Y., Watanabe, S. and Hershey, J.R.
  - "Non-negative Dynamical System with Application to Speech and Audio" by Fevotte, C., Le Roux, J. and Hershey, J.R.
  - "Source Localization in Reverberant Environments using Sparse Optimization" by Le Roux, J., Boufounos, P.T., Kang, K. and Hershey, J.R.
  - "A Keypoint Descriptor for Alignment-Free Fingerprint Matching" by Garg, R. and Rane, S.
  - "Transient Disturbance Detection for Power Systems with a General Likelihood Ratio Test" by Song, JX., Sahinoglu, Z. and Guo, J.
  - "Disparity Estimation of Misaligned Images in a Scanline Optimization Framework" by Rzeszutek, R., Tian, D. and Vetro, A.
  - "Screen Content Coding for HEVC Using Edge Modes" by Hu, S., Cohen, R.A., Vetro, A. and Kuo, C.C.J.
  - "Random Steerable Arrays for Synthetic Aperture Imaging" by Liu, D. and Boufounos, P.T.

Related Research Highlights

Speech Enhancement
