Multiview Video Coding

We are developing advanced video compression algorithms for multiview video, i.e., video sequences recorded simultaneously from multiple cameras. We also participate actively in the multiview video coding standardization activity in MPEG. Target applications for this work include 3D displays and free-viewpoint video.

Background & Objective: The need for multiview video coding is driven by two recent technological developments: new 3D display technologies and the growing use of multi-camera arrays. A variety of companies are starting to produce 3D displays that do not require glasses and can be viewed by multiple people simultaneously. The immersive experience provided by these 3D displays is compelling and has the potential to create a growing market for 3D video and hence for multiview video compression. Furthermore, even with 2D displays, multi-camera arrays are increasingly being used to capture a scene from many angles. The resulting multiview data sets allow the viewer to observe a scene from any viewpoint and serve as another application of multiview video compression.

Technical Discussion: Our multiview video compression codec extends H.264/AVC to take advantage of correlations between different cameras. An important tool in standard video codecs is motion-compensated prediction, in which the encoder predicts the current frame from past or future frames in the same sequence. Coding only the resulting prediction error, rather than the entire frame, yields significant savings. In addition to standard temporal prediction, our multiview codec allows the encoder to predict the current frame from frames captured by other cameras or from virtual interpolated views. Specifically, we have modified the MPEG JSVM reference software to allow insertion of multiview frames into the Decoded Picture Buffer (DPB) and the various reference lists. By decomposing the multi-camera sequence in various ways, we can obtain spatio-temporal prediction that is more efficient than pure temporal prediction. Furthermore, when camera parameters are available, we can interpolate a virtual view to use as a reference. For example, our codec can combine left and right views to interpolate a synthetic center view, which is then used to predict the center sequence. Such interpolated views often serve as better predictors than temporal references.
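The reference selection described above can be illustrated with a small sketch. This is not the codec's implementation: it works on whole toy frames rather than macroblocks, performs no motion or disparity search, and replaces depth-based view interpolation with a simple per-pixel average of the left and right views. All names (sad, residual, average_views, pick_reference) and the pixel values are hypothetical, chosen only to show how an encoder might compare temporal, inter-view, and synthesized references and code the prediction error against the best one.

```python
from typing import Dict, List, Tuple

Frame = List[List[int]]  # grayscale frame stored as rows of pixel values


def sad(a: Frame, b: Frame) -> int:
    """Sum of absolute differences between two equally sized frames."""
    return sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def residual(current: Frame, reference: Frame) -> Frame:
    """Prediction error that would be coded instead of the full frame."""
    return [[pc - pr for pc, pr in zip(rc, rr)]
            for rc, rr in zip(current, reference)]


def average_views(left: Frame, right: Frame) -> Frame:
    """Toy stand-in for view interpolation (assumption): per-pixel average.
    A real synthesizer would warp pixels using depth and camera parameters."""
    return [[(pl + pr) // 2 for pl, pr in zip(rl, rr)]
            for rl, rr in zip(left, right)]


def pick_reference(current: Frame,
                   candidates: Dict[str, Frame]) -> Tuple[str, Frame]:
    """Return the lowest-SAD candidate's name and the residual against it."""
    best = min(candidates, key=lambda name: sad(current, candidates[name]))
    return best, residual(current, candidates[best])


# Made-up 2x2 frames: the center view at the previous time instant, and the
# left, right, and center views at the current time instant.
center_prev = [[10, 10], [10, 10]]
left_now = [[20, 22], [24, 26]]
right_now = [[24, 26], [28, 30]]
center_now = [[22, 24], [26, 28]]

# Candidate references, analogous to entries in a reference list that holds
# temporal, inter-view, and synthesized pictures.
candidates = {
    "temporal": center_prev,
    "left_view": left_now,
    "right_view": right_now,
    "synthesized": average_views(left_now, right_now),
}

best, res = pick_reference(center_now, candidates)
print(best, res)
```

The toy data is constructed so that the averaged view matches the center view exactly; it demonstrates the selection mechanism only and says nothing about how often synthesized references win on real sequences.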

Future Direction: We are working on improving the quality of our multiview codec and remain active in MPEG standardization activities, including core experiments on buffer management, random access, and view synthesis.

Contacts:
Sehoon Yea
Huifang Sun
Anthony Vetro

Publications:
Zwicker, M.; Vetro, A.; Yea, S.; Matusik, W.; Pfister, H.; Durand, F., "Resampling, Antialiasing, and Compression in Multiview 3-D Displays", IEEE Signal Processing Magazine, ISSN: 1053-5888, Vol. 24, Issue 6, pp. 88-96, November 2007 (IEEE Xplore, TR2007-084)

Zwicker, M.; Yea, S.; Vetro, A.; Forlines, C.; Matusik, W.; Pfister, H., "Display Pre-filtering for Multi-view Video Compression", International Conference on Multimedia (ACM Multimedia), ISBN: 978-1-59593-702-5, Session: Systems 4 - Coding Support, pp. 1046-1053, September 2007 (ACM Portal, TR2007-073)

Technical Reports:
TR2007-025 Depth Estimation for View Synthesis in Multiview Video Coding
TR2006-048 Extensions of H.264/AVC for Multiview Video Compression
TR2006-035 View Synthesis for Multiview Video Compression
TR2004-137 Coding Approaches for End-To-End 3D TV Systems

Technology Areas:
Multimedia
Digital Video

Modification Date:  July 3, 2007