This presentation surveys current video compression techniques and shows how they have evolved and improved.
Video Compression Techniques
By H.N. Gunasinghe | AS2010379
University of Sri Jayewardenepura – Sri Lanka
CSC 362 1.5 Seminar I
Content
• Introduction
• Video compression standards
• MPEG-1, MPEG-2, MPEG-4
• Summary
Video Compression Introduction
Problem: Raw video contains an immense amount of data. Communication and storage capabilities are limited and expensive.
Example: an uncompressed HDTV video signal (1280 x 720 at 60 fps) requires about 1.33 Gb/s.
Video Compression: Why?
Application          | Resolution & frame rate | Uncompressed | Compressed
Video Conference     | 352 x 240 @ 15 fps      | 30.4 Mbps    | 64 - 768 kbps
CD-ROM Digital Video | 352 x 240 @ 30 fps      | 60.8 Mbps    | 1.5 - 4 Mbps
Broadcast Video      | 720 x 480 @ 30 fps      | 248.8 Mbps   | 3 - 8 Mbps
HDTV                 | 1280 x 720 @ 60 fps     | 1.33 Gbps    | 20 Mbps
Bandwidth Reduction
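The uncompressed figures in the table follow directly from resolution, frame rate, and 24 bits per RGB pixel. A quick sketch to verify them (function name is my own, for illustration only):

```python
def uncompressed_bitrate(width, height, fps, bits_per_pixel=24):
    """Raw data rate in bits per second for uncompressed video."""
    return width * height * bits_per_pixel * fps

# Rates from the table above:
print(uncompressed_bitrate(352, 240, 15) / 1e6)   # ~30.4 Mbps (video conference)
print(uncompressed_bitrate(720, 480, 30) / 1e6)   # ~248.8 Mbps (broadcast video)
print(uncompressed_bitrate(1280, 720, 60) / 1e9)  # ~1.33 Gbps (HDTV)
```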
Video Compression Standards
STANDARD | APPLICATION                                           | BIT RATE
H.261    | Video telephony and teleconferencing over ISDN        | p x 64 kb/s
MPEG-1   | Video on digital storage media (CD-ROM)               | 1.5 Mb/s
MPEG-2   | Digital television                                    | > 2 Mb/s
H.263    | Video telephony over PSTN                             | < 33.6 kb/s
MPEG-4   | Object-based coding, synthetic content, interactivity | Variable
H.264    | From low-bit-rate coding to HD encoding: HD-DVD, surveillance, video conferencing (improved video compression) | tens to hundreds of kb/s
Moving Picture Experts Group (MPEG)

MPEG is the method used to compress video. In principle, a motion picture is a rapid sequence of frames, each of which is a picture. In other words, a frame is a spatial combination of pixels, and a video is a temporal combination of frames sent one after another. Compressing video therefore means spatially compressing each frame and temporally compressing a set of frames.
Achieving Compression
Reduce redundancy and irrelevancy. Sources of redundancy:
• Temporal – adjacent frames are highly correlated.
• Spatial – nearby pixels are often correlated with each other.
• Color space – RGB components are correlated among themselves.
Irrelevancy – perceptually unimportant information.
Basic Video Compression Architecture
Exploiting the redundancies:
• Temporal – motion-compensated (MC) prediction and interpolation
• Spatial – block DCT (discrete cosine transform)
• Color – color space conversion

Scalar quantization of DCT coefficients; run-length and Huffman coding of the non-zero quantized DCT coefficients.
Current Video Compression Standards
Classification & Characterization of different standards
Based on the same fundamental building blocks:
• Motion-compensated prediction and interpolation
• 2-D Discrete Cosine Transform (DCT)
• Color space conversion
• Scalar quantization, run-length, and Huffman coding

Other tools added for different applications:
• Progressive or interlaced video
• Improved compression, error resilience, scalability, etc.
MPEG-1 and MPEG-2
MPEG-1 (1991)
• Goal: compression for digital storage media (CD-ROM)
• Achieves VHS-quality video at ~1.5 Mb/s
• Frame rate locked at 25 fps (PAL) or 30 fps (NTSC)
• Low computation times for encoding and decoding

MPEG-2 (1993)
• Superset of MPEG-1 to support higher bit rates, higher resolutions, and interlaced pictures
• Original goal: support interlaced video from conventional television; eventually extended to support HDTV
• More scalable than MPEG-1; able to play the same video at different resolutions and frame rates
MPEG-4 (1998)
Primary goals: new functionalities, not just better compression
• Lower bit rates (10 kb/s to 1 Mb/s) with good quality
• Object-based (content-based) representation – separate coding of individual visual objects; content-based access and manipulation
• Integration of natural and synthetic objects
• Interactivity – multimedia applications and video communication
• Communication over error-prone environments
• Includes frame-based coding techniques from earlier standards
Video Structure
MPEG Structure
Spatial compression
The spatial compression of each frame is done with JPEG, or a modification of it. Each frame is a picture that can be independently compressed.

Temporal compression
In temporal compression, redundant frames are removed. When we watch television, for example, we receive 30 frames per second. However, most of the consecutive frames are almost the same. For example, in a static scene in which someone is talking, most frames are the same except for the segment around the speaker's lips, which changes from one frame to the next.
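The idea behind temporal compression can be sketched by differencing two consecutive frames: in a static scene the difference image is mostly zeros, which compresses far better than the full frame. A minimal illustration (the helper name is my own, not part of any standard):

```python
def frame_difference(prev, curr):
    """Pixel-wise difference between consecutive frames; in a static
    scene most entries are zero and compress very well."""
    return [[c - p for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]

# Two nearly identical 2x2 frames: only one pixel changed.
prev = [[10, 10], [10, 10]]
curr = [[10, 10], [10, 15]]
print(frame_difference(prev, curr))  # [[0, 0], [0, 5]]
```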
Figure 15.16 MPEG frames
The MPEG Compression
The MPEG compression algorithm encodes the data in five steps:
• Step 1: Reduction of the resolution
• Step 2: Motion estimation
• Step 3: Discrete Cosine Transform (DCT)
• Step 4: Quantization
• Step 5: Entropy coding
(Steps 1–2 condition the data; steps 3–5 perform the actual compression.)
Step 1: Reduction of the Resolution
• The human eye is less sensitive to colour information than to dark–bright contrasts.
• Converting from RGB colour space into YUV colour components helps exploit this effect for compression.
• The chrominance components U and V can be reduced (subsampled) to half of the pixels in the horizontal direction (4:2:2), or half of the pixels in both the horizontal and vertical directions (4:2:0).
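A minimal sketch of this step, assuming the common BT.601 conversion with rounded coefficients (function names are my own; real codecs use fixed-point arithmetic and chroma filtering rather than simple sample dropping):

```python
def rgb_to_ycbcr(r, g, b):
    """Approximate BT.601 conversion from 8-bit RGB to luma Y and
    chroma Cb, Cr (coefficients rounded for readability)."""
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.169 * r - 0.331 * g + 0.500 * b + 128
    cr =  0.500 * r - 0.419 * g - 0.081 * b + 128
    return y, cb, cr

def subsample_420(chroma):
    """4:2:0 subsampling sketch: keep one chroma sample per 2x2 block
    (here simply the top-left sample of each block)."""
    return [row[::2] for row in chroma[::2]]

# A 4x4 chroma plane shrinks to 2x2 -> 1/4 of the chroma samples remain.
plane = [[c for c in range(4)] for _ in range(4)]
print(subsample_420(plane))  # [[0, 2], [0, 2]]
```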
Step 2: Motion Estimation
• Because two successive frames of a video sequence often have only small differences (except at scene changes), MPEG offers a way of reducing this temporal redundancy.
• It uses three types of frames:
  – I-frames (intra), coded independently of all other frames
  – P-frames (predicted), coded based on a previously coded frame
  – B-frames (bidirectional), coded based on both previous and future coded frames
Comparison of frame types

I-frames
• No reference to other frames
• Compression is not that high

P-frames
• Predicted from an earlier I-frame or P-frame
• Only the differences are stored
• Cannot be reconstructed without their reference frame
• Need less space than I-frames

B-frames
• Two-directional version of the P-frame, referring to both directions (one forward frame and one backward frame)
• Cannot be referenced by other P- or B-frames, because they are interpolated from forward and backward frames
Schematic process of motion estimation
• The actual frame is divided into non-overlapping blocks (macroblocks), usually of size 16x16 pixels.
• Motion vectors are only calculated if the difference between two blocks at the same position is higher than a threshold; otherwise the whole block is transmitted.
• Block matching is the most time-consuming part of encoding: each block of the current frame is compared with a past frame within a search area.
• After prediction, the predicted and the original frame are compared, and their differences are coded.
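The block-matching search described above can be sketched as an exhaustive search minimizing the sum of absolute differences (SAD). This is a toy version with my own function names; real encoders use much larger blocks, fast search strategies, and sub-pixel refinement:

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b) for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def best_match(ref, block, top, left, search=2):
    """Exhaustive block matching: slide `block` over a small search
    window in the reference frame and return the best motion vector."""
    n = len(block)
    best, best_cost = (0, 0), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + n > len(ref) or x + n > len(ref[0]):
                continue  # candidate block falls outside the frame
            cand = [row[x:x + n] for row in ref[y:y + n]]
            cost = sad(cand, block)
            if cost < best_cost:
                best_cost, best = cost, (dy, dx)
    return best, best_cost

# Example: a 2x2 patch that moved one pixel down and one pixel right.
ref = [[0] * 8 for _ in range(8)]
for y, x in [(3, 4), (3, 5), (4, 4), (4, 5)]:
    ref[y][x] = 9
mv, cost = best_match(ref, [[9, 9], [9, 9]], top=2, left=3)
print(mv, cost)  # (1, 1) 0
```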
Step 3: Discrete Cosine Transform
• The DCT is computationally very expensive, and its complexity increases disproportionately (O(N²)).
• That is the reason why images compressed using DCT are divided into blocks.
• Another disadvantage of the DCT is its inability to decompose a broad signal into high and low frequencies at the same time.
• Therefore the use of small blocks allows a description of high frequencies with fewer cosine terms.

Visualization of the 64 basis functions (cosine frequencies) of an 8x8 DCT:
• The direct-current (DC) term is constant and describes the average grey level of the block.
• The 63 remaining terms are called alternating-current (AC) terms.
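A direct (slow) sketch of the 2-D DCT-II on an NxN block, matching the description above; production codecs use fast separable implementations, but the result is the same. For a flat block, all the energy lands in the DC term:

```python
import math

def dct_2d(block):
    """Naive 2-D DCT-II of an NxN block (as applied to 8x8 blocks)."""
    n = len(block)
    def c(k):  # orthonormal scaling factor
        return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for y in range(n):
                for x in range(n):
                    s += (block[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * n)))
            out[u][v] = c(u) * c(v) * s
    return out

# A flat 8x8 block of grey level 128: only the DC term is non-zero.
flat = [[128] * 8 for _ in range(8)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0]))  # 1024 = 8 * 128; all 63 AC terms are ~0
```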
Step 4: Quantization
• F_quantized(u, v) = F(u, v) DIV Q(u, v), where F holds the DCT terms and Q is the quantization matrix of dimension N.
• Only the differences between the DC terms are stored.
• The AC terms are then stored along a zig-zag path with increasing frequency values.
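The DIV formula above is plain integer division of each coefficient by its quantization step. A minimal sketch with an illustrative 2x2 matrix (real codecs use standard 8x8 quantization tables; the numbers here are made up):

```python
def quantize(coeffs, q):
    """Quantize DCT coefficients: F_quantized = F DIV Q, element-wise.
    int() truncates toward zero, discarding small high-frequency terms."""
    return [[int(f / s) for f, s in zip(frow, srow)]
            for frow, srow in zip(coeffs, q)]

# Illustrative values: large DC term, small AC terms.
coeffs = [[1024, 37], [-52, 4]]
q      = [[  16, 11], [ 12, 10]]
print(quantize(coeffs, q))  # [[64, 3], [-4, 0]]
```

Note how the small AC coefficient 4 becomes 0: this deliberate loss is where most of the compression (and the quality trade-off) comes from.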
Step 5: Entropy Coding

Entropy coding takes two steps:
– Run-Length Encoding (RLE)
– Huffman coding
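The RLE step can be sketched as follows for the zig-zag-ordered coefficients: each non-zero value is stored with the count of zeros preceding it, and trailing zeros collapse into an end-of-block marker. This is a simplified illustration (pair format and 'EOB' token are my own; standards use specific run/level codes that then feed the Huffman coder):

```python
def run_length_encode(values):
    """Encode a sequence as (zero_run, value) pairs; trailing zeros
    collapse into a single end-of-block ('EOB',) marker."""
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    if run:
        pairs.append(("EOB",))  # only zeros remain in the block
    return pairs

# Zig-zag-ordered coefficients after quantization (illustrative values):
zigzag = [64, 3, 0, 0, -4, 0, 0, 0]
print(run_length_encode(zigzag))  # [(0, 64), (0, 3), (2, -4), ('EOB',)]
```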
All five Steps together
References
• Anurag Jain, "Introduction to Video Compression Techniques", www.slideshare.net (accessed May 7, 2013)
• http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/AV0506/s0561282.pdf (accessed June 10, 2013)