Perceived quality and bandwidth characterization of layered MPEG-2 video encoding

Junichi Kimura; Fouad A. Tobagi; Jose-Miguel Pulido; Peder J. Emstad

doi:10.1117/12.371214

22 November 1999

Proceedings Volume 3845, Multimedia Systems and Applications II; (1999) https://doi.org/10.1117/12.371214
Event: Photonics East '99, 1999, Boston, MA, United States

Abstract

The current Internet is not well suited for the transmission of high-quality video such as MPEG-2 because of severe quality degradation during network congestion episodes. One possible solution is the combination of layered video coding with the Differentiated Services (DiffServ) architecture: different video layers are mapped to different priority levels, and packets with different priorities receive different dropping treatment in the network. With layering and priority dropping, video quality is expected to degrade gracefully during congestion episodes. We consider various layering mechanisms defined in the MPEG-2 standards, namely temporal scalability, data partitioning (DP), and signal-to-noise ratio (SNR) scalability. The main question addressed in this paper is how layers should be created to maximize perceived video quality over a given range of network conditions. Key to our study is the use of real-life video sequences and a video quality measure consisting of a perceptual distortion metric based on the Human Visual System (HVS). Our results show that video quality is sensitive to how layering is accomplished, and that there is an optimum layering that maximizes the quality for a given network condition. Our results also show that layering can achieve higher network loading for a given minimum quality target than non-layered video, and can achieve graceful degradation over a wider range of network conditions. We also find that the wider the range of network conditions, the more layers are required to remain at the highest possible quality level for each network condition. In particular, we demonstrate how three or four layers achieve better results than two layers; however, additional layers beyond four provide only marginal improvement. Therefore, from a practical point of view, three or four layers are sufficient to attain most of the benefits of layering.
We compare the various scalability techniques in terms of complexity and video quality. Temporal scalability, which restricts the layering to frame boundaries, is the simplest to implement and introduces no overhead, but performs poorly compared to data partitioning, which allows coefficients to be grouped into layers independently of the frames they belong to. This shows that, contrary to common belief, dropping data in B frames prior to dropping data in P or I frames is a poor layering technique. DP is much simpler to implement and introduces significantly lower overhead than SNR scalability. However, SNR scalability provides higher quality than DP when network conditions are particularly poor.
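
The priority-dropping idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' simulation model: it assumes an idealized fluid bottleneck in which lower-priority layers are dropped entirely before any higher-priority data is touched, and it contrasts this with a non-layered stream whose losses are spread uniformly. The function names and the layer rates in the usage note are hypothetical.

```python
from typing import List


def delivered_fractions(layer_rates_kbps: List[float],
                        capacity_kbps: float) -> List[float]:
    """Fraction of each layer that survives strict priority dropping.

    layer_rates_kbps[0] is the base layer (highest priority); later
    entries are enhancement layers of decreasing priority.
    Assumption: a fluid model with a single bottleneck of fixed capacity.
    """
    fractions = []
    remaining = max(capacity_kbps, 0.0)
    for rate in layer_rates_kbps:
        if rate <= 0:
            fractions.append(1.0)  # empty layer: nothing to drop
            continue
        served = min(rate, remaining)  # serve this layer with what is left
        fractions.append(served / rate)
        remaining -= served
    return fractions


def non_layered_fraction(total_rate_kbps: float,
                         capacity_kbps: float) -> float:
    """Surviving fraction when loss hits all data uniformly (no priorities)."""
    if total_rate_kbps <= 0:
        return 1.0
    return min(1.0, max(capacity_kbps, 0.0) / total_rate_kbps)
```

For example, with hypothetical layer rates of 1500, 1500, and 1000 kbit/s and a 2500 kbit/s bottleneck, `delivered_fractions([1500, 1500, 1000], 2500)` leaves the base layer intact, delivers two thirds of the second layer, and drops the third, whereas `non_layered_fraction(4000, 2500)` loses 37.5% of all data regardless of its importance. How such loss patterns translate into perceived quality is what the paper evaluates; this sketch models only the bandwidth side.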

Citation

Junichi Kimura, Fouad A. Tobagi, Jose-Miguel Pulido, and Peder J. Emstad "Perceived quality and bandwidth characterization of layered MPEG-2 video encoding", Proc. SPIE 3845, Multimedia Systems and Applications II, (22 November 1999); https://doi.org/10.1117/12.371214
