The new MPEG-4 standard allows for interactivity, high compression, and/or universal accessibility and portability of multimedia content (natural audio and video, synthetic content). The visual part of MPEG-4 specifies algorithms for object-oriented audio-visual coding. In addition to the conventional frame-based functionalities of the MPEG-1 and MPEG-2 standards, MPEG-4 video coding also supports arbitrarily shaped video objects. To this end, the concept of video object planes (VOPs) has been introduced. Each frame of an input video sequence is segmented into a number of regions, which may cover image or video content of interest. These regions are encoded in the so-called alpha plane, which gives the object's contour. In contrast to the MPEG-1/2 standards, the video input is no longer considered a rectangular region. The shape, motion and texture information of a VOP is encoded and transmitted in a video object layer, which covers all information of one video object (VO). Similar to the MPEG-1/2 coders, the MPEG-4 video coding scheme processes the successive images of a VOP sequence in a block-based manner (e.g. motion estimation/compensation, DCT). Coding and decoding are based on macroblocks (MBs) of 16x16 pixels. Since block-based processing needs a defined value for every pixel of a macroblock, an image padding technique is used for those macroblocks of an MPEG-4 image that contain the shape edge of an object. These blocks are called contour macroblocks. Their non-object pixels are filled using the padding technique. This technique, which is described in detail in chapter two, turned out to be a computationally complex and very irregular operation. A dedicated hardware accelerator for the MPEG-4 padding algorithm has been designed to remove this task from a general-purpose host processor. The accelerator architecture exploits the data dependency of the padding algorithm to allow for a very high macroblock throughput rate.
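To make the notion of contour macroblocks concrete, the following minimal Python sketch labels the 16x16 macroblocks of a binary alpha plane and fills the non-object pixels of a single row. It is an illustration under stated assumptions, not the algorithm of chapter two and not the accelerator's data path: the function names, the labels "transparent" and "opaque", the restriction to a binary alpha plane whose dimensions are multiples of 16, and the rounding used when averaging are all assumptions. The row filling follows a commonly described form of repetitive padding, in which a non-object pixel copies the nearest object pixel in its row, or the average of the two nearest object pixels if it lies between them.

```python
from typing import List

MB_SIZE = 16  # macroblock edge length in pixels, as stated in the text


def classify_macroblocks(alpha: List[List[int]]) -> List[List[str]]:
    """Label each macroblock of a binary alpha plane.

    alpha[y][x] is nonzero for object pixels and 0 otherwise.
    Assumption: width and height are multiples of MB_SIZE.
    """
    height, width = len(alpha), len(alpha[0])
    labels = []
    for mb_y in range(0, height, MB_SIZE):
        row_labels = []
        for mb_x in range(0, width, MB_SIZE):
            # Count the object pixels inside this macroblock.
            inside = sum(
                1
                for y in range(mb_y, mb_y + MB_SIZE)
                for x in range(mb_x, mb_x + MB_SIZE)
                if alpha[y][x]
            )
            if inside == 0:
                row_labels.append("transparent")  # hypothetical label
            elif inside == MB_SIZE * MB_SIZE:
                row_labels.append("opaque")       # hypothetical label
            else:
                row_labels.append("contour")      # contains the shape edge
        labels.append(row_labels)
    return labels


def pad_row(pixels: List[int], mask: List[int]) -> List[int]:
    """Fill the non-object pixels of one row from its object pixels.

    Rows without any object pixel are returned unchanged; a complete
    padding scheme would have to fill them in a further pass.
    Assumption: averages are rounded down.
    """
    out = list(pixels)
    object_positions = [x for x, m in enumerate(mask) if m]
    if not object_positions:
        return out
    for x, m in enumerate(mask):
        if m:
            continue  # object pixels keep their value
        left = max((p for p in object_positions if p < x), default=None)
        right = min((p for p in object_positions if p > x), default=None)
        if left is not None and right is not None:
            out[x] = (pixels[left] + pixels[right]) // 2
        elif left is not None:
            out[x] = pixels[left]
        else:
            out[x] = pixels[right]
    return out
```

Even in this simplified form, the work per row depends on where the object pixels lie, which gives an impression of why the operation is irregular and data dependent.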
The global architecture is described in chapter three, while the data-dependent scheduling of operations is sketched in chapter four. Chapter five concludes with the performance results and the hardware cost of the module.