Mitsubishi Electric Research Laboratories

MPEG-7 Sound Recognition

MPEG-7 is the newest member of the family of industry standards for media technology published by ISO. Released in July 2002, MPEG-7 standardizes media content indexing and retrieval for professional and consumer applications. MPEG-7 makes sounds, pictures and video as searchable as the text in Internet Web pages.

Background & Objective:  One of the major challenges in the design of sound recognition systems is selecting features and probability model parameters that are robust across a broad range of sound types. Robust systems should require no human intervention for feature extraction or model parameter estimation. To this end, we sought fully automatic methods for building recognition systems using training data.

Technical Discussion:  The MPEG-7 standardized features for sound recognition are dimension-reduced spectral vectors obtained by applying a linear transformation to a spectrogram. Dimension reduction uses a MERL-patented technology, based on the singular value decomposition (SVD) and independent component analysis (ICA), to find a set of basis functions that maximize the information content of the features while minimizing their size. Such compact features are essential for efficient training of automatic classifiers and for robust performance.

Within the standard, these features are used with hidden Markov models (HMMs) to build robust automatic classifiers. HMM classifiers are represented within MPEG-7 using XML-based description schemes that enable interoperability and portability of models between different applications. The system successfully identifies sound events as diverse as speech, singing, environmental noises, animal sounds, musical instruments and music genres. Industry uses for this technology include remote audio monitoring, media archive searching and automatic music monitoring for broadcast facilities.
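
The pipeline described above (spectrogram, basis extraction, projection, HMM scoring) can be illustrated with a minimal Python sketch. It is not the MERL-patented method or the normative MPEG-7 extraction procedure: it uses a plain SVD only, omits the ICA step and any normalization the standard may specify, and stands in hypothetical single-state Gaussian models for trained HMMs. The function names (log_spectrogram, svd_basis, hmm_log_likelihood), the frame sizes, the synthetic test signal and the model parameters are all assumptions made for illustration.

```python
import numpy as np

def log_spectrogram(signal, frame_len=256, hop=128):
    """Return a (frames x bins) power spectrogram in decibels."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return 10.0 * np.log10(power + 1e-10)  # small floor avoids log(0)

def svd_basis(spec, n_basis):
    """Top n_basis right singular vectors of spec, as columns (bins x n_basis)."""
    _, _, vt = np.linalg.svd(spec, full_matrices=False)
    return vt[:n_basis].T

def hmm_log_likelihood(feats, log_start, log_trans, means, variances):
    """Forward algorithm in the log domain for a diagonal-Gaussian HMM."""
    # Per-frame, per-state observation log-densities: shape (frames, states).
    diff = feats[:, None, :] - means[None, :, :]
    log_b = -0.5 * np.sum(diff ** 2 / variances
                          + np.log(2 * np.pi * variances), axis=2)
    alpha = log_start + log_b[0]
    for t in range(1, len(feats)):
        # Sum over previous states in the log domain, then add the emission term.
        alpha = np.logaddexp.reduce(alpha[:, None] + log_trans, axis=0) + log_b[t]
    return np.logaddexp.reduce(alpha)

# Synthetic stand-in for a recording (assumption): a 440 Hz tone plus noise.
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
signal = np.sin(2 * np.pi * 440.0 * t) + 0.1 * rng.standard_normal(t.size)

spec = log_spectrogram(signal)        # 61 frames x 129 frequency bins
basis = svd_basis(spec, n_basis=10)   # 129 x 10 basis functions
features = spec @ basis               # 61 x 10 dimension-reduced vectors

# Two hypothetical one-state models; real classifiers would use trained,
# multi-state HMMs (for example, estimated with Baum-Welch).
mean = features.mean(axis=0)[None, :]
var = (features.var(axis=0) + 1e-6)[None, :]
models = {
    "tone": (np.zeros(1), np.zeros((1, 1)), mean, var),
    "other": (np.zeros(1), np.zeros((1, 1)), mean + 5.0, var),
}
scores = {name: hmm_log_likelihood(features, *params)
          for name, params in models.items()}
best = max(scores, key=scores.get)    # label of the highest-scoring model
```

In this sketch the projection reduces each 129-bin spectral frame to 10 coefficients, and classification selects the model with the highest log-likelihood for the whole feature sequence. A fuller implementation would presumably follow the SVD with ICA, as the text describes, and would train one HMM per sound class from labeled data.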

Contact:  Bent Schmidt-Nielsen

Technology Area:  Audio Video Processing

Modification Date:  November 1, 2007