This presentation surveys current video compression techniques and shows how they have evolved and improved.
Video Compression Techniques
By H.N. Gunasinghe | AS2010379
University of Sri Jayewardenepura – Sri Lanka
CSC 362 1.5 Seminar I
Content
• Introduction
• Video compression standards
• MPEG-1, MPEG-2, MPEG-4
• Summary
Video Compression Introduction
Problem: Raw video contains an immense amount of data. Communication and storage capabilities are limited and expensive.
Example: an uncompressed HDTV video signal (1280 x 720 at 60 fps) requires about 1.33 Gb/s.
Video Compression: Why?
Application          | Resolution & frame rate | Uncompressed | Compressed
Video Conference     | 352 x 240 @ 15 fps      | 30.4 Mbps    | 64 - 768 kbps
CD-ROM Digital Video | 352 x 240 @ 30 fps      | 60.8 Mbps    | 1.5 - 4 Mbps
Broadcast Video      | 720 x 480 @ 30 fps      | 248.8 Mbps   | 3 - 8 Mbps
HDTV                 | 1280 x 720 @ 60 fps     | 1.33 Gbps    | 20 Mbps
Bandwidth Reduction
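The uncompressed figures in the table follow directly from resolution, frame rate, and 24 bits per RGB pixel. A quick sketch to verify them (function name is my own, for illustration only):

```python
def uncompressed_bitrate(width, height, fps, bits_per_pixel=24):
    """Raw data rate in bits per second for uncompressed video."""
    return width * height * bits_per_pixel * fps

# Rates from the table above:
print(uncompressed_bitrate(352, 240, 15) / 1e6)   # ~30.4 Mbps (video conference)
print(uncompressed_bitrate(720, 480, 30) / 1e6)   # ~248.8 Mbps (broadcast video)
print(uncompressed_bitrate(1280, 720, 60) / 1e9)  # ~1.33 Gbps (HDTV)
```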
Video Compression Standards
STANDARD | APPLICATION                                           | BIT RATE
H.261    | Video telephony and teleconferencing over ISDN        | p x 64 kb/s
MPEG-1   | Video on digital storage media (CD-ROM)               | 1.5 Mb/s
MPEG-2   | Digital television                                    | > 2 Mb/s
H.263    | Video telephony over PSTN                             | < 33.6 kb/s
MPEG-4   | Object-based coding, synthetic content, interactivity | Variable
H.264    | From low-bit-rate coding to HD encoding: HD-DVD, surveillance, video conferencing (improved video compression) | tens to hundreds of kb/s
Moving Picture Experts Group (MPEG)

MPEG is the method used to compress video. In principle, a motion picture is a rapid sequence of frames, each of which is a picture. In other words, a frame is a spatial combination of pixels, and a video is a temporal combination of frames sent one after another. Compressing video therefore means spatially compressing each frame and temporally compressing a set of frames.
Achieving Compression
Reduce redundancy and irrelevancy. Sources of redundancy:
• Temporal – adjacent frames are highly correlated.
• Spatial – nearby pixels are often correlated with each other.
• Color space – RGB components are correlated among themselves.
Irrelevancy – perceptually unimportant information.
Basic Video Compression Architecture
Exploiting the redundancies:
• Temporal – motion-compensated (MC) prediction and interpolation
• Spatial – block DCT (discrete cosine transform)
• Color – color space conversion

Scalar quantization of DCT coefficients; run-length and Huffman coding of the non-zero quantized DCT coefficients.
Current Video Compression Standards
Classification & Characterization of different standards
Based on the same fundamental building blocks:
• Motion-compensated prediction and interpolation
• 2-D Discrete Cosine Transform (DCT)
• Color space conversion
• Scalar quantization, run-length, and Huffman coding

Other tools added for different applications:
• Progressive or interlaced video
• Improved compression, error resilience, scalability, etc.
MPEG-1 and MPEG-2
MPEG-1 (1991)
• Goal: compression for digital storage media (CD-ROM)
• Achieves VHS-quality video at ~1.5 Mb/s
• Frame rate locked at 25 fps (PAL) or 30 fps (NTSC)
• Low computation times for encoding and decoding

MPEG-2 (1993)
• Superset of MPEG-1 to support higher bit rates, higher resolutions, and interlaced pictures
• Original goal: support interlaced video from conventional television; eventually extended to support HDTV
• More scalable than MPEG-1; able to play the same video at different resolutions and frame rates
MPEG-4 (1998)
Primary goals: new functionalities, not just better compression
• Lower bit rates (10 kb/s to 1 Mb/s) with good quality
• Object-based (content-based) representation – separate coding of individual visual objects; content-based access and manipulation
• Integration of natural and synthetic objects
• Interactivity – multimedia applications and video communication
• Communication over error-prone environments
• Includes frame-based coding techniques from earlier standards
Video Structure
MPEG Structure
Spatial compression
The spatial compression of each frame is done with JPEG, or a modification of it. Each frame is a picture that can be independently compressed.

Temporal compression
In temporal compression, redundant frames are removed. When we watch television, for example, we receive 30 frames per second. However, most of the consecutive frames are almost the same. For example, in a static scene in which someone is talking, most frames are the same except for the segment around the speaker's lips, which changes from one frame to the next.
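The idea behind temporal compression can be sketched by differencing two consecutive frames: in a static scene the difference image is mostly zeros, which compresses far better than the full frame. A minimal illustration (the helper name is my own, not part of any standard):

```python
def frame_difference(prev, curr):
    """Pixel-wise difference between consecutive frames; in a static
    scene most entries are zero and compress very well."""
    return [[c - p for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]

# Two nearly identical 2x2 frames: only one pixel changed.
prev = [[10, 10], [10, 10]]
curr = [[10, 10], [10, 15]]
print(frame_difference(prev, curr))  # [[0, 0], [0, 5]]
```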
Figure 15.16 MPEG frames
The MPEG Compression
The MPEG compression algorithm encodes the data in five steps:
• Step 1: Reduction of the resolution
• Step 2: Motion estimation
• Step 3: Discrete Cosine Transform (DCT)
• Step 4: Quantization
• Step 5: Entropy coding
(Steps 1–2 condition the data; steps 3–5 perform the actual compression.)
Step 1: Reduction of the Resolution
• The human eye is less sensitive to colour information than to dark–bright contrasts.
• Converting from RGB colour space into YUV colour components helps exploit this effect for compression.
• The chrominance components U and V can be reduced (subsampled) to half of the pixels in the horizontal direction (4:2:2), or half of the pixels in both the horizontal and vertical directions (4:2:0).
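A minimal sketch of this step, assuming the common BT.601 conversion with rounded coefficients (function names are my own; real codecs use fixed-point arithmetic and chroma filtering rather than simple sample dropping):

```python
def rgb_to_ycbcr(r, g, b):
    """Approximate BT.601 conversion from 8-bit RGB to luma Y and
    chroma Cb, Cr (coefficients rounded for readability)."""
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.169 * r - 0.331 * g + 0.500 * b + 128
    cr =  0.500 * r - 0.419 * g - 0.081 * b + 128
    return y, cb, cr

def subsample_420(chroma):
    """4:2:0 subsampling sketch: keep one chroma sample per 2x2 block
    (here simply the top-left sample of each block)."""
    return [row[::2] for row in chroma[::2]]

# A 4x4 chroma plane shrinks to 2x2 -> 1/4 of the chroma samples remain.
plane = [[c for c in range(4)] for _ in range(4)]
print(subsample_420(plane))  # [[0, 2], [0, 2]]
```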
Step 2: Motion Estimation
• Because two successive frames of a video sequence often have only small differences (except at scene changes), MPEG offers a way of reducing this temporal redundancy.
• It uses three types of frames:
  – I-frames (intra), coded independently of all other frames
  – P-frames (predicted), coded based on a previously coded frame
  – B-frames (bidirectional), coded based on both previous and future coded frames
Comparison of frame types

I-frames
• No reference to other frames
• Compression is not that high

P-frames
• Predicted from an earlier I-frame or P-frame
• Only the differences are stored
• Cannot be reconstructed without their reference frame
• Need less space than I-frames

B-frames
• Two-directional version of the P-frame, referring to both directions (one forward frame and one backward frame)
• Cannot be referenced by other P- or B-frames, because they are interpolated from forward and backward frames
Schematic process of motion estimation
• The actual frame is divided into non-overlapping blocks (macroblocks), usually of size 16x16 pixels.
• Motion vectors are only calculated if the difference between two blocks at the same position is higher than a threshold; otherwise the whole block is transmitted.
• Block matching is the most time-consuming part of encoding: each block of the current frame is compared with a past frame within a search area.
• After prediction, the predicted and the original frame are compared, and their differences are coded.
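The block-matching search described above can be sketched as an exhaustive search minimizing the sum of absolute differences (SAD). This is a toy version with my own function names; real encoders use much larger blocks, fast search strategies, and sub-pixel refinement:

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b) for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def best_match(ref, block, top, left, search=2):
    """Exhaustive block matching: slide `block` over a small search
    window in the reference frame and return the best motion vector."""
    n = len(block)
    best, best_cost = (0, 0), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + n > len(ref) or x + n > len(ref[0]):
                continue  # candidate block falls outside the frame
            cand = [row[x:x + n] for row in ref[y:y + n]]
            cost = sad(cand, block)
            if cost < best_cost:
                best_cost, best = cost, (dy, dx)
    return best, best_cost

# Example: a 2x2 patch that moved one pixel down and one pixel right.
ref = [[0] * 8 for _ in range(8)]
for y, x in [(3, 4), (3, 5), (4, 4), (4, 5)]:
    ref[y][x] = 9
mv, cost = best_match(ref, [[9, 9], [9, 9]], top=2, left=3)
print(mv, cost)  # (1, 1) 0
```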
Step 3: Discrete Cosine Transform
• The DCT is computationally very expensive, and its complexity increases disproportionately (O(N²)).
• That is the reason why images compressed using DCT are divided into blocks.
• Another disadvantage of the DCT is its inability to decompose a broad signal into high and low frequencies at the same time.
• Therefore the use of small blocks allows a description of high frequencies with fewer cosine terms.

Visualization of the 64 basis functions (cosine frequencies) of an 8x8 DCT:
• The direct-current (DC) term is constant and describes the average grey level of the block.
• The 63 remaining terms are called alternating-current (AC) terms.
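A direct (slow) sketch of the 2-D DCT-II on an NxN block, matching the description above; production codecs use fast separable implementations, but the result is the same. For a flat block, all the energy lands in the DC term:

```python
import math

def dct_2d(block):
    """Naive 2-D DCT-II of an NxN block (as applied to 8x8 blocks)."""
    n = len(block)
    def c(k):  # orthonormal scaling factor
        return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for y in range(n):
                for x in range(n):
                    s += (block[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * n)))
            out[u][v] = c(u) * c(v) * s
    return out

# A flat 8x8 block of grey level 128: only the DC term is non-zero.
flat = [[128] * 8 for _ in range(8)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0]))  # 1024 = 8 * 128; all 63 AC terms are ~0
```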
Step 4: Quantization
• F_quantized(u, v) = F(u, v) DIV Q(u, v), where F holds the DCT terms and Q is the quantization matrix of dimension N.
• Only the differences between the DC terms are stored.
• The AC terms are then stored along a zig-zag path with increasing frequency values.
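The DIV formula above is plain integer division of each coefficient by its quantization step. A minimal sketch with an illustrative 2x2 matrix (real codecs use standard 8x8 quantization tables; the numbers here are made up):

```python
def quantize(coeffs, q):
    """Quantize DCT coefficients: F_quantized = F DIV Q, element-wise.
    int() truncates toward zero, discarding small high-frequency terms."""
    return [[int(f / s) for f, s in zip(frow, srow)]
            for frow, srow in zip(coeffs, q)]

# Illustrative values: large DC term, small AC terms.
coeffs = [[1024, 37], [-52, 4]]
q      = [[  16, 11], [ 12, 10]]
print(quantize(coeffs, q))  # [[64, 3], [-4, 0]]
```

Note how the small AC coefficient 4 becomes 0: this deliberate loss is where most of the compression (and the quality trade-off) comes from.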
Step 5: Entropy Coding

Entropy coding takes two steps:
– Run-Length Encoding (RLE)
– Huffman coding
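The RLE step can be sketched as follows for the zig-zag-ordered coefficients: each non-zero value is stored with the count of zeros preceding it, and trailing zeros collapse into an end-of-block marker. This is a simplified illustration (pair format and 'EOB' token are my own; standards use specific run/level codes that then feed the Huffman coder):

```python
def run_length_encode(values):
    """Encode a sequence as (zero_run, value) pairs; trailing zeros
    collapse into a single end-of-block ('EOB',) marker."""
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    if run:
        pairs.append(("EOB",))  # only zeros remain in the block
    return pairs

# Zig-zag-ordered coefficients after quantization (illustrative values):
zigzag = [64, 3, 0, 0, -4, 0, 0, 0]
print(run_length_encode(zigzag))  # [(0, 64), (0, 3), (2, -4), ('EOB',)]
```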
All five Steps together
References
• Anurag Jain, "Introduction to Video Compression Techniques", www.slideshare.net (accessed May 7, 2013)
• http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/AV0506/s0561282.pdf (accessed June 10, 2013)