Unit VII MM Chap10 Basic Video Compression Techniques

8/3/2019 Unit VII MM Chap10 Basic Video Compression Techniques


Chapter 10 Basic Video Compression Techniques

10.1 Introduction to Video Compression

10.2 Video Compression with Motion Compensation

10.3 Search for Motion Vectors

10.4 H.261

10.5 H.263

10.6 Further Exploration


10.1 Introduction to Video Compression

A video consists of a time-ordered sequence of frames, i.e., images.

An obvious solution to video compression would be predictive coding based on previous frames.

Compression proceeds by subtracting images: subtract in time order and code the residual error.

It can be done even better by searching for just the right parts of the image to subtract from the previous frame.


10.2 Video Compression with Motion Compensation

Consecutive frames in a video are similar: temporal redundancy exists.

Temporal redundancy is exploited so that not every frame of the video needs to be coded independently as a new image.

The difference between the current frame and other frame(s) in the sequence is coded instead: it has small values and low entropy, which is good for compression.

Steps of video compression based on Motion Compensation (MC):

1. Motion Estimation (motion vector search).

2. MC-based Prediction.

3. Derivation of the prediction error, i.e., the difference.
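The claim that a frame difference has small values and low entropy can be checked on a toy example. The sketch below is only an illustration under stated assumptions: the two 4 × 4 frames are made up, the helper names `entropy` and `frame_difference` are hypothetical, and no motion search is performed (the difference is taken at zero displacement).

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits per symbol, of a sequence of integers."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def frame_difference(current, previous):
    """Pixel-wise difference between two equally sized frames (lists of rows)."""
    return [[c - p for c, p in zip(crow, prow)]
            for crow, prow in zip(current, previous)]


# Hypothetical 4 x 4 luminance frames: only one pixel changes between them.
previous = [[10, 20, 30, 40],
            [50, 60, 70, 80],
            [90, 100, 110, 120],
            [130, 140, 150, 160]]
current = [row[:] for row in previous]
current[1][2] = 75

residual = frame_difference(current, previous)
frame_entropy = entropy([v for row in current for v in row])
residual_entropy = entropy([v for row in residual for v in row])
print(frame_entropy, residual_entropy)
```

In this toy case the residual is almost entirely zeros, so its entropy is far below that of the frame itself; real video behaves this way only where the frames are well aligned, which is what motion estimation tries to achieve.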


Each image is divided into macroblocks of size N × N.

By default, N = 16 for luminance images. For chrominance images, N = 8 if 4:2:0 chroma subsampling is adopted.

Motion compensation is performed at the macroblock level.

The current image frame is referred to as the Target frame.

A match is sought between the macroblock in the Target frame and the most similar macroblock in previous and/or future frame(s), referred to as Reference frame(s).

The displacement of the reference macroblock to the target macroblock is called a motion vector MV.

Figure 10.1 shows the case of forward prediction, in which the Reference frame is taken to be a previous frame.


The MV search is usually limited to a small immediate neighborhood: both horizontal and vertical displacements lie in the range [−p, p].

This makes a search window of size (2p + 1) × (2p + 1).


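10.3 Search for Motion Vectors

The difference between two macroblocks can then be measured by their Mean Absolute Difference (MAD):

$$\mathrm{MAD}(i, j) = \frac{1}{N^2} \sum_{k=0}^{N-1} \sum_{l=0}^{N-1} \left| C(x+k,\, y+l) - R(x+i+k,\, y+j+l) \right| \qquad (10.1)$$

where N is the size of the macroblock, k and l are indices for pixels in the macroblock, i and j are the horizontal and vertical displacements, C(x + k, y + l) are pixels in the macroblock in the Target frame, and R(x + i + k, y + j + l) are pixels in the macroblock in the Reference frame.

The goal of the search is to find a vector (i, j) as the motion vector MV = (u, v), such that MAD(i, j) is minimum:

$$(u, v) = \{\, (i, j) \mid \mathrm{MAD}(i, j) \text{ is minimum},\ i \in [-p, p],\ j \in [-p, p] \,\} \qquad (10.2)$$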
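As a concrete reading of the MAD measure, the following sketch computes it for one candidate displacement. It assumes frames are stored as lists of rows, so pixel (x, y) is `frame[y][x]`; the function name `mad` and the tiny 4 × 4 frames are illustrative choices, not part of the slides.

```python
def mad(target, reference, x, y, i, j, n):
    """Mean Absolute Difference between the n x n Target macroblock whose
    top-left corner is (x, y) and the Reference macroblock displaced by (i, j).
    Frames are lists of rows, so pixel (x, y) is frame[y][x] (an assumption)."""
    total = 0
    for k in range(n):          # horizontal index inside the macroblock
        for l in range(n):      # vertical index inside the macroblock
            c = target[y + l][x + k]
            r = reference[y + j + l][x + i + k]
            total += abs(c - r)
    return total / (n * n)


# Hypothetical frames: a 2 x 2 patch of 9s sits at (1, 1) in the Reference
# frame and at (0, 0) in the Target frame.
reference = [[0, 0, 0, 0],
             [0, 9, 9, 0],
             [0, 9, 9, 0],
             [0, 0, 0, 0]]
target = [[9, 9, 0, 0],
          [9, 9, 0, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 0]]

print(mad(target, reference, 0, 0, 1, 1, 2))  # displacement that matches the patch
print(mad(target, reference, 0, 0, 0, 0, 2))  # zero displacement
```

The displacement (1, 1) lines the two patches up and gives a MAD of 0, whereas the zero displacement leaves three of the four pixels mismatched.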


Sequential search: sequentially search the whole (2p + 1) × (2p + 1) window in the Reference frame (also referred to as Full search).

A macroblock centered at each of the positions within the window is compared to the macroblock in the Target frame pixel by pixel, and their respective MAD is then derived.

The vector (i, j) that offers the least MAD is designated as the MV (u, v) for the macroblock in the Target frame.

The sequential search method is very costly. Assuming each pixel comparison requires three operations (subtraction, absolute value, addition), the cost for obtaining a motion vector for a single macroblock is (2p + 1) · (2p + 1) · N² · 3, i.e., O(p²N²).


PROCEDURE 10.1 Motion-vector: sequential-search
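The body of the procedure did not survive in this copy of the slides. The sketch below is one possible Python rendering of a sequential (full) search, written from the prose description rather than from the original pseudocode. The names `mad` and `sequential_search`, the `frame[y][x]` indexing, and the rule of skipping candidates that fall outside the Reference frame are all assumptions.

```python
def mad(target, reference, x, y, i, j, n):
    """MAD between the n x n Target block at (x, y) and the Reference block
    displaced by (i, j); frames are lists of rows (frame[y][x])."""
    total = 0
    for k in range(n):
        for l in range(n):
            total += abs(target[y + l][x + k] - reference[y + j + l][x + i + k])
    return total / (n * n)


def sequential_search(target, reference, x, y, n, p):
    """Full search of the (2p + 1) x (2p + 1) window; returns ((u, v), min_mad)."""
    height, width = len(reference), len(reference[0])
    best_mv, min_mad = (0, 0), float("inf")
    for i in range(-p, p + 1):
        for j in range(-p, p + 1):
            # Assumption: candidates that would leave the Reference frame are skipped.
            if x + i < 0 or y + j < 0 or x + i + n > width or y + j + n > height:
                continue
            cur = mad(target, reference, x, y, i, j, n)
            if cur < min_mad:
                min_mad, best_mv = cur, (i, j)
    return best_mv, min_mad


# Hypothetical 8 x 8 frames: a 2 x 2 patch sits at (x=3, y=4) in the Reference
# frame and at (x=2, y=2) in the Target frame.
reference = [[0] * 8 for _ in range(8)]
target = [[0] * 8 for _ in range(8)]
patch = [[5, 6], [7, 8]]
for r in range(2):
    for c in range(2):
        reference[4 + r][3 + c] = patch[r][c]
        target[2 + r][2 + c] = patch[r][c]

mv, err = sequential_search(target, reference, x=2, y=2, n=2, p=3)
print(mv, err)
```

For the Target macroblock at (2, 2), the search returns the displacement (1, 2) with a MAD of 0, which is the offset to the matching patch in the Reference frame.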


2D Logarithmic search: a cheaper version that is suboptimal but still usually effective.

The procedure for 2D Logarithmic Search of motion vectors takes several iterations and is akin to a binary search:

Initially, only nine locations in the search window are used as seeds for a MAD-based search; they are marked as '1'.

After the one that yields the minimum MAD is located, the center of the new search region is moved to it and the step-size ("offset") is reduced to half.

In the next iteration, the nine new locations are marked as '2', and so on.
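The iterations described above can be sketched in Python as follows. This is an illustration, not the procedure from the slides: the starting offset of ⌈p/2⌉, the stopping rule (finish after the iteration with offset 1), the handling of candidates outside the window or frame, and the names `mad` and `log_search` are assumptions. The test frames use a smooth synthetic intensity surface, because the method relies on the error decreasing toward the true match.

```python
import math


def mad(target, reference, x, y, i, j, n):
    """MAD between the n x n Target block at (x, y) and the Reference block
    displaced by (i, j); frames are lists of rows (frame[y][x])."""
    total = 0
    for k in range(n):
        for l in range(n):
            total += abs(target[y + l][x + k] - reference[y + j + l][x + i + k])
    return total / (n * n)


def log_search(target, reference, x, y, n, p):
    """2D logarithmic search; returns ((u, v), mad_at_that_vector)."""
    height, width = len(reference), len(reference[0])
    cu, cv = 0, 0                      # current center, as a displacement
    offset = math.ceil(p / 2)          # assumed initial step size
    while True:
        best = None
        for di in (-offset, 0, offset):
            for dj in (-offset, 0, offset):
                i, j = cu + di, cv + dj
                inside_window = abs(i) <= p and abs(j) <= p
                inside_frame = (0 <= x + i and x + i + n <= width and
                                0 <= y + j and y + j + n <= height)
                if not (inside_window and inside_frame):
                    continue
                cur = mad(target, reference, x, y, i, j, n)
                if best is None or cur < best[0]:
                    best = (cur, i, j)
        min_mad, cu, cv = best         # move the center to the best of the nine
        if offset == 1:
            return (cu, cv), min_mad
        offset = math.ceil(offset / 2) # halve the step size


# Hypothetical smooth frames: the Target is the Reference shifted by (2, 1).
size = 16
reference = [[c * c + 100 * r * r for c in range(size)] for r in range(size)]
target = [[(c + 2) ** 2 + 100 * (r + 1) ** 2 for c in range(size)]
          for r in range(size)]

mv, err = log_search(target, reference, x=4, y=4, n=4, p=4)
print(mv, err)
```

With p = 4 the search needs two iterations (offsets 2 and 1) and evaluates at most 18 candidates instead of the 81 a full search would visit. On less regular content it can settle on a local minimum, which is why the method is described as suboptimal.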


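Using the same example as in the previous subsection, the total number of operations per second drops considerably, since the cost per macroblock falls from O(p²N²) to O(log p · N²).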


Hierarchical search: the search can benefit from a hierarchical (multiresolution) approach in which an initial estimate of the motion vector is obtained from images with a significantly reduced resolution.

In a three-level hierarchical search, the original image is at Level 0, images at Levels 1 and 2 are obtained by down-sampling from the previous levels by a factor of 2, and the initial search is conducted at Level 2.

Since the size of the macroblock is smaller and p can also be proportionally reduced, the number of operations required is greatly reduced.
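A minimal sketch of the multiresolution idea follows. It is written from the description above, not from the original procedure, and it makes several assumptions: down-sampling is done by averaging 2 × 2 neighbourhoods, the coarse estimate at each finer level is doubled and refined within ±1 pixel, and x, y, n and the frame dimensions are divisible by 2^(levels − 1). All function names are hypothetical.

```python
def downsample(frame):
    """Halve the resolution by averaging each 2 x 2 neighbourhood."""
    return [[(frame[2 * r][2 * c] + frame[2 * r][2 * c + 1] +
              frame[2 * r + 1][2 * c] + frame[2 * r + 1][2 * c + 1]) // 4
             for c in range(len(frame[0]) // 2)]
            for r in range(len(frame) // 2)]


def mad(target, reference, x, y, i, j, n):
    """MAD between the n x n Target block at (x, y) and the Reference block
    displaced by (i, j); frames are lists of rows (frame[y][x])."""
    total = 0
    for k in range(n):
        for l in range(n):
            total += abs(target[y + l][x + k] - reference[y + j + l][x + i + k])
    return total / (n * n)


def search_around(target, reference, x, y, n, cu, cv, radius):
    """Full search over displacements within `radius` of (cu, cv).
    Returns (min_mad, u, v); candidates outside the frame are skipped."""
    height, width = len(reference), len(reference[0])
    best = None
    for i in range(cu - radius, cu + radius + 1):
        for j in range(cv - radius, cv + radius + 1):
            if 0 <= x + i and x + i + n <= width and 0 <= y + j and y + j + n <= height:
                cur = mad(target, reference, x, y, i, j, n)
                if best is None or cur < best[0]:
                    best = (cur, i, j)
    return best


def hierarchical_search(target, reference, x, y, n, p, levels=3):
    """Estimate the MV at the coarsest level, then refine level by level."""
    targets, references = [target], [reference]
    for _ in range(levels - 1):
        targets.append(downsample(targets[-1]))
        references.append(downsample(references[-1]))
    k = levels - 1
    s = 2 ** k
    # Initial search at the lowest resolution with a proportionally reduced p.
    _, u, v = search_around(targets[k], references[k],
                            x // s, y // s, n // s, 0, 0, max(1, p // s))
    while k > 0:
        k -= 1
        s = 2 ** k
        # Double the coarse estimate and refine it within +/-1 pixel.
        _, u, v = search_around(targets[k], references[k],
                                x // s, y // s, n // s, 2 * u, 2 * v, 1)
    return (u, v)


# Hypothetical 48 x 48 ramp frames: the Target is the Reference shifted by (4, 8).
size = 48
reference = [[c + 64 * r for c in range(size)] for r in range(size)]
target = [[(c + 4) + 64 * (r + 8) for c in range(size)] for r in range(size)]

mv = hierarchical_search(target, reference, x=16, y=16, n=16, p=8)
print(mv)
```

In this example the Level 2 search covers only 25 candidates on a 4 × 4 block, and each refinement adds 9 more, compared with 289 candidates on a 16 × 16 block for a full search with p = 8.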


PROCEDURE 10.3 Motion-vector: hierarchical-search


10.4 H.261

H.261 is an earlier digital video compression standard; its principle of MC-based compression is retained in all later video compression standards.

The standard was designed for videophone, video conferencing, and other audiovisual services over ISDN.

The video codec supports bit-rates of p × 64 kbps, where p ranges from 1 to 30 (hence it is also known as p × 64).

It requires that the delay of the video encoder be less than 150 msec so that the video can be used for real-time bidirectional video conferencing.


ITU Recommendations & H.261 Video Formats

H.261 belongs to the following set of ITU recommendations for visual telephony systems:

H.221: Frame structure for an audiovisual channel supporting 64 to 1,920 kbps.

H.230: Frame control signals for audiovisual systems.

H.242: Audiovisual communication protocols.

H.261: Video encoder/decoder for audiovisual services at p × 64 kbps.

H.320: Narrow-band audiovisual terminal equipment for p × 64 kbps transmission.


Two types of image frames are defined: Intra-frames (I-frames) and Inter-frames (P-frames):

I-frames are treated as independent images. A transform coding method similar to JPEG is applied within each I-frame, hence "Intra".

P-frames are not independent: they are coded by a forward predictive coding method (prediction from a previous P-frame is allowed, not just from a previous I-frame).

Temporal redundancy removal is included in P-frame coding, whereas I-frame coding performs only spatial redundancy removal.

To avoid propagation of coding errors, an I-frame is usually sent a couple of times in each second of the video.

Motion vectors in H.261 are always measured in units of full pixels and they have a limited range of ±15 pixels, i.e., p = 15.


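Intra-frame (I-frame) coding: macroblocks are of size 16 × 16 pixels for the Y frame, and 8 × 8 for the Cb and Cr frames, since 4:2:0 chroma subsampling is employed. A macroblock consists of four Y, one Cb, and one Cr 8 × 8 blocks.

For each 8 × 8 block a DCT transform is applied; the DCT coefficients then go through quantization, zigzag scan, and entropy coding.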
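The zigzag scan step orders the 64 coefficients of an 8 × 8 block so that low-frequency coefficients come first. The sketch below generates the familiar JPEG-style zigzag order; that H.261 uses exactly this ordering is taken here as an assumption, and the function names are illustrative.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs of an n x n block in zigzag order.

    Positions are grouped by anti-diagonal (row + col). Odd diagonals are
    walked with the row increasing, even diagonals with the row decreasing."""
    positions = [(r, c) for r in range(n) for c in range(n)]
    return sorted(positions,
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))


def zigzag_scan(block):
    """Flatten a square block of coefficients into a zigzag-ordered list."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


order = zigzag_order(8)
print(order[:6])

# A hypothetical 3 x 3 block makes the traversal easy to follow.
small = [[1, 2, 6],
         [3, 5, 7],
         [4, 8, 9]]
print(zigzag_scan(small))
```

After quantization most high-frequency coefficients are zero, so this ordering tends to gather the zeros into long runs at the end of the list, which suits run-length based entropy coding.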



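Inter-frame (P-frame) predictive coding: the H.261 P-frame coding scheme is based on motion compensation.

For each macroblock in the Target frame, a motion vector is allocated by one of the search methods discussed earlier.

After the prediction, a difference macroblock is derived to measure the prediction error.

Each of its 8 × 8 blocks then goes through DCT, quantization, zigzag scan, and entropy coding.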


The P-frame coding encodes the difference macroblock (not the Target macroblock itself).

Sometimes a good match cannot be found, i.e., the prediction error exceeds a certain acceptable level. The MB itself is then encoded (treated as an Intra MB), and in this case it is termed a non-motion compensated MB.

For the motion vector, the difference MVD between the motion vectors of the preceding and the current macroblock is sent for entropy coding.


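Quantization in H.261: the quantization uses a constant step size for all DCT coefficients within a macroblock.

If we use DCT and QDCT to denote the DCT coefficients before and after the quantization, then for DC coefficients in Intra mode:

$$QDCT = \mathrm{round}\left(\frac{DCT}{step\_size}\right) = \mathrm{round}\left(\frac{DCT}{8}\right) \qquad (10.4)$$

for all other coefficients:

$$QDCT = \left\lfloor \frac{DCT}{step\_size} \right\rfloor = \left\lfloor \frac{DCT}{2 \times scale} \right\rfloor \qquad (10.5)$$

where scale is an integer in the range [1, 31].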
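The two quantization rules can be tried out with a small function. This sketch assumes a fixed step size of 8 for Intra DC coefficients and a step size of 2 × scale, with scale in [1, 31], for all other coefficients; the function name `quantize_h261` is hypothetical, and real codecs may treat negative values and ties differently from the plain floor and round used here.

```python
import math


def quantize_h261(dct, scale=1, intra_dc=False):
    """Quantize one DCT coefficient.

    intra_dc=True : assumed rule round(DCT / 8); `scale` is ignored.
    otherwise     : assumed rule floor(DCT / (2 * scale)), scale in [1, 31].
    Note: Python's round() rounds exact ties to the nearest even integer."""
    if intra_dc:
        return round(dct / 8)
    if not 1 <= scale <= 31:
        raise ValueError("scale must be an integer in [1, 31]")
    return math.floor(dct / (2 * scale))


print(quantize_h261(104, intra_dc=True))   # Intra DC coefficient, step size 8
print(quantize_h261(45, scale=4))          # other coefficient, step size 8
print(quantize_h261(45, scale=31))         # coarsest step size, 62
```

A larger scale gives a larger step size, so more coefficients collapse to zero: the value 45 survives as 5 with scale = 4 but becomes 0 with scale = 31.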


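H.261 encoder and decoder: Fig. 10.7 shows a relatively complete picture of how the H.261 encoder and decoder work.

A scenario is used where frames I, P1, and P2 are encoded and then decoded.

Note: decoded frames (not the original frames) are used as reference frames in motion estimation.

The data that go through the observation points indicated by the circled numbers are summarized in the accompanying tables.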


A Glance at Syntax of H.261 Video Bitstream

The syntax of the H.261 video bitstream is a hierarchy of four layers: Picture, Group of Blocks (GOB), Macroblock, and Block.

The Picture layer: PSC (Picture Start Code) delineates boundaries between pictures. TR (Temporal Reference) provides a time-stamp for the picture.

The GOB layer: H.261 pictures are divided into regions of 11 × 3 macroblocks, each of which is called a Group of Blocks (GOB).

Fig. 10.9 depicts the arrangement of GOBs in a CIF or QCIF luminance image.

For instance, the CIF image has 2 × 6 GOBs, corresponding to its image resolution of 352 × 288 pixels. Each GOB has its Start Code (GBSC) and Group number (GN).


In case a network error causes a bit error or the loss of some bits, H.261 video can be recovered and resynchronized at the next identifiable GOB.

GQuant indicates the Quantizer to be used in the GOB unless it is overridden by any subsequent MQuant (Quantizer for Macroblock). GQuant and MQuant are referred to as scale in Eq. (10.5).

The Macroblock layer: each Macroblock (MB) has its own Address indicating its position within the GOB, a Quantizer (MQuant), and six 8 × 8 image blocks (4 Y, 1 Cb, 1 Cr).

The Block layer: for each 8 × 8 block, the bitstream starts with the DC value, followed by pairs of the length of a zero-run (Run) and the subsequent non-zero value (Level) for the ACs, and finally the End of Block (EOB) code. The range of Run is [0, 63]. Level reflects quantized values: its range is [−127, 127] and Level ≠ 0.
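The Block-layer structure (DC value, then Run/Level pairs, then EOB) can be illustrated with a short run-level encoder. This is a sketch of the symbol sequence only: the function name `run_level_encode` and the string `"EOB"` are placeholders, and the variable-length codes that a real bitstream would assign to each symbol are not modelled.

```python
def run_level_encode(zigzag_coeffs):
    """Turn 64 zigzag-ordered quantized coefficients into
    (dc, [(run, level), ...], "EOB").

    run   : number of zero ACs preceding a non-zero AC (0..63)
    level : that non-zero quantized AC value
    Trailing zeros are not emitted; the EOB marker stands in for them."""
    dc, acs = zigzag_coeffs[0], zigzag_coeffs[1:]
    pairs, run = [], 0
    for level in acs:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return dc, pairs, "EOB"


# Hypothetical quantized block in zigzag order: DC = 12, three non-zero ACs.
coeffs = [12, 5, 0, 0, -3, 0, 1] + [0] * 57
print(run_level_encode(coeffs))
```

Here the 64 coefficients reduce to the DC value, three (Run, Level) pairs, and the end marker, which is where most of the compression of sparse blocks comes from.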


10.5 H.263

H.263 is an improved video coding standard for video conferencing and other audiovisual services transmitted on Public Switched Telephone Networks (PSTN).

It aims at low bit-rate communications at bit-rates of less than 64 kbps.

It uses predictive coding for inter-frames to reduce temporal redundancy and transform coding for the remaining signal to reduce spatial redundancy (for both Intra-frames and inter-frame prediction).


H.263 and Group of Blocks (GOB): as in H.261, the H.263 standard also supports the notion of Group of Blocks (GOB).

The difference is that GOBs in H.263 do not have a fixed size, and they always start and end at the left and right borders of the picture.

As shown in Fig. 10.10, each QCIF luminance image consists of 9 GOBs and each GOB has 11 × 1 MBs (176 × 16 pixels), whereas each 4CIF luminance image consists of 18 GOBs and each GOB has 44 × 2 MBs (704 × 32 pixels).


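Motion compensation in H.263: the horizontal and vertical components of the MV are predicted from the median values of the horizontal and vertical components, respectively, of MV1, MV2, and MV3 from the "previous", "above", and "above and right" MBs (see Fig. 10.11 (a)). That is, u_p = median(u1, u2, u3) and v_p = median(v1, v2, v3).

Instead of coding the MV (u, v) itself, the error vector (δu, δv) is coded, where δu = u − u_p and δv = v − v_p.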
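The median prediction and the resulting error vector can be worked through in a few lines. The function names below are illustrative, and the three neighbouring vectors are made-up values; how missing neighbours at picture or GOB borders are handled is outside this sketch.

```python
def median3(a, b, c):
    """Median of three numbers."""
    return sorted((a, b, c))[1]


def predict_mv(mv1, mv2, mv3):
    """Component-wise median of the three neighbouring motion vectors."""
    return (median3(mv1[0], mv2[0], mv3[0]),
            median3(mv1[1], mv2[1], mv3[1]))


def mv_error(mv, mv1, mv2, mv3):
    """Error vector (u - u_p, v - v_p) that would be coded instead of (u, v)."""
    up, vp = predict_mv(mv1, mv2, mv3)
    return (mv[0] - up, mv[1] - vp)


# Hypothetical vectors: "previous", "above", and "above and right" MBs.
mv1, mv2, mv3 = (2, 0), (4, -2), (3, 5)
print(predict_mv(mv1, mv2, mv3))
print(mv_error((3, -1), mv1, mv2, mv3))
```

The prediction is (3, 0), so the current vector (3, −1) is coded as the small error (0, −1). Because the median is taken per component, the predicted vector need not equal any single neighbour.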


Half-pixel precision: in order to reduce the prediction error, half-pixel precision is supported in H.263, vs. full-pixel precision only in H.261.

The default range for both the horizontal and vertical components u and v of MV (u, v) is now [−16, 15.5].

The pixel values needed at half-pixel positions are generated by a simple bilinear interpolation method, as shown in Fig. 10.12.
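Since the figure is not reproduced here, the following sketch shows one common way to carry out the bilinear interpolation. The rounding offsets (+1 before halving, +2 before dividing by four), the use of half-pixel integer coordinates, and the function name are assumptions for illustration; positions are assumed non-negative and far enough from the right and bottom edges that the neighbouring pixels exist.

```python
def half_pixel_value(frame, x2, y2):
    """Value at position (x2 / 2, y2 / 2), with x2 and y2 given in half-pixel
    units. frame is a list of rows of integer pixel values (frame[y][x])."""
    x, y = x2 // 2, y2 // 2          # full-pixel position to the upper left
    fx, fy = x2 % 2, y2 % 2          # 1 if the coordinate is at a half position
    a = frame[y][x]
    if not fx and not fy:            # full-pixel position: no interpolation
        return a
    if fx and not fy:                # halfway between two horizontal neighbours
        return (a + frame[y][x + 1] + 1) // 2
    if fy and not fx:                # halfway between two vertical neighbours
        return (a + frame[y + 1][x] + 1) // 2
    # center of four full pixels
    return (a + frame[y][x + 1] + frame[y + 1][x] + frame[y + 1][x + 1] + 2) // 4


# Hypothetical 2 x 2 patch of full-pixel values.
frame = [[10, 20],
         [30, 40]]
print(half_pixel_value(frame, 0, 0))   # full-pixel position
print(half_pixel_value(frame, 1, 0))   # half a pixel to the right
print(half_pixel_value(frame, 0, 1))   # half a pixel down
print(half_pixel_value(frame, 1, 1))   # center of the four pixels
```

A motion vector component of 1.5 pixels would be passed as 3 in these half-pixel units, which keeps all arithmetic in integers.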


Optional H.263 coding modes: H.263 specifies many negotiable coding options in its various Annexes. Four of the common options are as follows:

Unrestricted motion vector mode:

The pixels referenced are no longer restricted to be within the boundary of the image.

When the motion vector points outside the image boundary, the value of the boundary pixel that is geometrically closest to the referenced pixel is used.

The maximum range of motion vectors is [−31.5, 31.5].
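The rule of using the geometrically closest boundary pixel amounts to clamping the referenced coordinates to the picture area. A minimal sketch, with hypothetical function names and `frame[y][x]` indexing assumed:

```python
def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(value, high))


def reference_pixel(frame, x, y):
    """Pixel at (x, y); coordinates outside the picture are mapped to the
    nearest boundary pixel."""
    height, width = len(frame), len(frame[0])
    return frame[clamp(y, 0, height - 1)][clamp(x, 0, width - 1)]


# Hypothetical 2 x 2 picture.
frame = [[1, 2],
         [3, 4]]
print(reference_pixel(frame, -3, 0))   # left of the picture -> leftmost pixel of row 0
print(reference_pixel(frame, 5, 5))    # beyond the bottom-right corner
print(reference_pixel(frame, 1, -2))   # above the picture -> top pixel of column 1
```

This has the same effect as padding the Reference frame by replicating its edge pixels, which is helpful when objects move into or out of the picture.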


Syntax-based arithmetic coding mode:

As in H.261, variable length coding (VLC) is used in H.263 as a default coding method for the DCT coefficients.

Similar to H.261, the syntax of H.263 is also structured as a hierarchy of four layers. Each layer is coded using a combination of fixed length code and variable length code.

Advanced prediction mode:

In this mode, the macroblock size for MC is reduced from 16 to 8.

Four motion vectors (from each of the 8 × 8 blocks) are generated for each macroblock in the luminance image.


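PB-frames mode:

In H.263, a PB-frame consists of two pictures being coded as one unit.

The use of the PB-frames mode is indicated in the PTYPE field of the picture header.

The PB-frames mode yields satisfactory results for videos with moderate motions.

Under large motions, PB-frames do not compress as well as B-frames, and an improved new mode has been developed in Version 2 of H.263.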


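H.263+ and H.263++: the aim of H.263+ is to broaden the potential applications and offer additional flexibility in terms of custom source formats, different pixel aspect ratios, and clock frequencies.

H.263+ provides new negotiable modes in addition to the four optional modes in H.263.

It uses Reversible Variable Length Coding (RVLC) to encode the difference motion vectors.

A slice structure is used to replace GOB to offer additional flexibility.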


H.263+ implements temporal, SNR, and spatial scalabilities.

It supports an Improved PB-frames mode, in which the two motion vectors of the B-frame do not have to be derived from the forward motion vector of the P-frame as in Version 1.

H.263+ includes deblocking filters in the coding loop to reduce blocking effects.


H.263++ includes the baseline coding methods of H.263 and additional recommendations for Enhanced Reference Picture Selection (ERPS), Data Partition Slice (DPS), and Additional Supplemental Enhancement Information.

The ERPS mode operates by managing a multi-frame buffer for stored frames, which enhances coding efficiency and error resilience.

The DPS mode provides additional enhancement to error resilience by separating header and motion vector data from DCT coefficient data in the bitstream, and protects the motion vector data by using a reversible code.


10.6 Further Exploration

Textbooks:

Digital Video Processing by A.M. Tekalp

Digital Video and HDTV: Algorithms and Interfaces by C.A. Poynton

Image and Video Compression Standards by V. Bhaskaran and K. Konstantinides

Video Coding: An Introduction to Standard Codecs by M. Ghanbari

Video Processing and Communications by Y. Wang et al.

Web resources, including:

Tutorials and white papers on H.261 and H.263

H.261 and H.263 software implementations

An H.263/H.263+ library

A Java H.263 decoder
