Consistent Anisotropic Wiener Filtering for Audio Source Separation

For audio source separation applications, it is common to apply Wiener-like filtering to a time-frequency (TF) representation of the data, such as the short-time Fourier transform (STFT). This approach, which boils down to assigning the phase of the original mixture to each component, is limited when sources overlap in the TF domain. In this paper, we propose a more sophisticated version of this technique for improved phase recovery. First, we model the sources as anisotropic Gaussian variables: this model accounts for the non-uniformity of the phase, which allows us to incorporate prior phase information derived from a sinusoidal model. Second, we exploit STFT consistency, that is, the relationship between STFT coefficients that results from the redundancy of the transform. We derive a conjugate gradient algorithm for estimating the corresponding filter, which we call the consistent anisotropic Wiener filter. Experiments conducted on music pieces show that accounting for both phase properties jointly outperforms accounting for either one alone.
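To make the baseline concrete, the following is a minimal sketch of the classical Wiener-like filter that the abstract takes as its starting point, not the consistent anisotropic filter proposed in the paper. It assumes the source variances are known (here, oracle values computed from synthetic sources); the function name `wiener_like_filter`, the array shapes, and the `eps` regularizer are illustrative choices rather than anything specified in the text. Because each mask is real and nonnegative, every estimate inherits the phase of the mixture, which is the limitation described above.

```python
import numpy as np

def wiener_like_filter(mixture_stft, source_variances, eps=1e-12):
    """Classical (isotropic) Wiener-like filtering in the TF domain.

    mixture_stft:     complex array of shape (F, T), the mixture STFT.
    source_variances: nonnegative array of shape (J, F, T), one variance
                      (power) estimate per source and TF bin (assumed known).
    Returns a complex array of shape (J, F, T) with the source estimates.
    """
    total = source_variances.sum(axis=0) + eps   # (F, T), eps avoids 0/0
    masks = source_variances / total             # real gains in [0, 1]
    return masks * mixture_stft[None, :, :]      # real gain: phase is unchanged

# Synthetic example: two random complex "sources" and their mixture.
rng = np.random.default_rng(0)
J, F, T = 2, 5, 4
sources = rng.standard_normal((J, F, T)) + 1j * rng.standard_normal((J, F, T))
mixture = sources.sum(axis=0)

# Oracle variances, used here only for illustration.
variances = np.abs(sources) ** 2
estimates = wiener_like_filter(mixture, variances)

# The estimates add up to the mixture (the filter is conservative) ...
conservative = np.allclose(estimates.sum(axis=0), mixture)
# ... and each estimate is a nonnegative real multiple of the mixture,
# i.e. it carries exactly the mixture phase.
same_phase = all(
    np.allclose(est * np.abs(mixture), np.abs(est) * mixture)
    for est in estimates
)
```

In this sketch both `conservative` and `same_phase` hold by construction, regardless of the true source phases. When sources overlap in a TF bin, their true phases generally differ from the mixture phase, so a real-valued gain cannot recover them.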