Separating style and content
MERL Report: TR96-36, J. B. Tenenbaum, W. T. Freeman
Where Published: Adv. in Neural Info. Proc. Systems, volume 9, MIT Press, 1997
We seek to analyze and manipulate two factors, which we generically call style and content, underlying a set of observations. We fit training data with bilinear models which explicitly represent the two-factor structure. These models can adapt easily during testing to new styles or content, allowing us to solve three general tasks: extrapolation of a new style to unobserved content; classification of content observed in a new style; and translation of new content observed in a new style. For classification, we embed bilinear models in a probabilistic framework, Separable Mixture Models (SMMs), which generalizes earlier work on factorial mixture models (Hinton '94, Ghahramani '95). Significant performance improvement on a benchmark speech dataset shows the benefits of our approach.
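To make the two-factor idea concrete, the following is a minimal sketch of style extrapolation with one simple bilinear form, in which each observation is modeled as y = A_s b_c (a style-specific matrix applied to a content vector). The abstract does not state the model's exact form or how it is fitted, so the SVD-based fit, the function names, the array layout, and the synthetic data are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def fit_asymmetric_bilinear(Y, J):
    """Fit y[s, c] ~= A[s] @ B[:, c] by truncated SVD (assumed fitting method).

    Y has shape (S, C, K): S styles, C content classes, K-dim observations.
    Returns A with shape (S, K, J) and B with shape (J, C).
    """
    S, C, K = Y.shape
    # Stack each style's K x C block vertically into an (S*K) x C matrix.
    stacked = Y.transpose(0, 2, 1).reshape(S * K, C)
    U, sing, Vt = np.linalg.svd(stacked, full_matrices=False)
    A = (U[:, :J] * sing[:J]).reshape(S, K, J)  # style matrices
    B = Vt[:J]                                  # content vectors as columns
    return A, B


def extrapolate_style(Y_new_seen, B, seen):
    """Estimate a new style from a few observed content classes,
    then synthesize that style for every content class.

    Y_new_seen has shape (K, len(seen)).
    """
    # Least-squares style matrix for the new style, content vectors held fixed.
    A_new = Y_new_seen @ np.linalg.pinv(B[:, seen])
    return A_new @ B


# Synthetic, exactly bilinear data (hypothetical sizes chosen for the demo).
rng = np.random.default_rng(0)
S, C, K, J = 4, 6, 5, 3
A_true = rng.normal(size=(S + 1, K, J))  # last style is held out
B_true = rng.normal(size=(J, C))
Y_all = np.einsum("skj,jc->sck", A_true, B_true)

A_fit, B_fit = fit_asymmetric_bilinear(Y_all[:S], J)

# The held-out style is observed only for the first four content classes.
seen = [0, 1, 2, 3]
Y_new_full = A_true[-1] @ B_true            # ground truth, shape (K, C)
pred = extrapolate_style(Y_new_full[:, seen], B_fit, seen)

max_error = float(np.max(np.abs(pred - Y_new_full)))
print("max abs error on all content classes:", max_error)
```

Because the synthetic data are noise-free and exactly bilinear, the unobserved content classes are recovered up to numerical precision; with real data the fit would only be approximate, and the choice of J would trade off fidelity against generalization.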