Image and Video Compression Fundamentals
Heejune AHN, Embedded Communications Laboratory
Seoul National Univ. of Technology, Fall 2013
Last updated 2013. 9. 07
Heejune AHN: Image and Video Compression p. 2
1. Driving Force of Video Compression
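Uncompressed Video Bandwidth
• Vertical resolution x horizontal resolution x temporal resolution (frames per second) x color depth (bits per pixel)
• E.g. CCIR 601 (TV quality): 720 x 480 x 30 x 24 = 248,832,000 bps
Typical Storage and Network
• DVD: 4.7 GB (about 150 sec of uncompressed CCIR 601 video)
• ADSL/VDSL downlink: 2 to 100 Mbps < CCIR 601 bandwidth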
[Figure: a video sequence drawn as frames with axes x, y and time t, labelled 720 x 480 at 30 frames per second]
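The bandwidth formula on this slide can be checked with a few lines of Python. This is a minimal sketch; the function names `raw_bitrate_bps` and `seconds_of_video` are illustrative, and the DVD capacity is taken as 4.7 x 10^9 bytes.

```python
def raw_bitrate_bps(width, height, fps, bits_per_pixel):
    """Uncompressed bit rate: pixels per frame x frames per second x bits per pixel."""
    return width * height * fps * bits_per_pixel


def seconds_of_video(capacity_bytes, bitrate_bps):
    """How many seconds of video at `bitrate_bps` fit into `capacity_bytes`."""
    return capacity_bytes * 8 / bitrate_bps


ccir_bps = raw_bitrate_bps(720, 480, 30, 24)
print(ccir_bps)                            # 248832000
print(seconds_of_video(4.7e9, ccir_bps))   # roughly 151 seconds on a 4.7 GB disc
print(100e6 < ccir_bps)                    # True: a 100 Mbps link is too slow
```

Changing any one factor (resolution, frame rate or color depth) scales the bit rate linearly, which is why raw video outgrows storage and links so quickly.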
Typical values
Typical Video Bandwidth (L = luminance samples, C = chrominance samples)
• ITU CCIR 601: L (858x525), C (429x525), 30 fps => 216.0 Mbps
• CIF: L (352x288), C (176x144), 30 fps => 36.5 Mbps
• QCIF: L (176x144), C (88x72), 15 fps => 4.6 Mbps
Typical Storage/Transmission Capacity
• Terrestrial TV broadcasting channel: ~20 Mbps
• CD/DVD-5: 640 MB/4.7 GB
• Ethernet/Fast Ethernet: <10/100 Mbps
• ADSL/VDSL downlink: 2048 kbps/100 Mbps
• Wireless cellular (2G/3G/3G+): 9.6/384/2000 kbps
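The bandwidth figures above can be reproduced under two assumptions that the slide does not state explicitly: 8 bits per sample, and two chrominance planes (Cb and Cr) of the listed size. The function name `ycbcr_bitrate_bps` is illustrative.

```python
def ycbcr_bitrate_bps(luma, chroma, fps, bits_per_sample=8):
    """Bit rate of one luma plane plus two chroma planes (assumed Cb and Cr)."""
    luma_samples = luma[0] * luma[1]
    chroma_samples = 2 * chroma[0] * chroma[1]
    return (luma_samples + chroma_samples) * fps * bits_per_sample


formats = {
    "CCIR 601": ((858, 525), (429, 525), 30000 / 1001),  # NTSC rate, about 29.97 fps
    "CIF": ((352, 288), (176, 144), 30),
    "QCIF": ((176, 144), (88, 72), 15),
}

mbps = {name: round(ycbcr_bitrate_bps(l, c, fps) / 1e6, 1)
        for name, (l, c, fps) in formats.items()}
print(mbps)  # {'CCIR 601': 216.0, 'CIF': 36.5, 'QCIF': 4.6}
```

With exactly 30 fps the CCIR 601 figure comes out as 216.2 Mbps; the 216.0 Mbps value matches the NTSC frame rate of 30000/1001 fps.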
2. Image and Video Compression
Information Theory
• Pioneered by Claude Shannon (Bell Labs) in the late 1940s and 1950s
• Provides mathematical limits for information processing and communications
Coding
• Source coding: how to reduce the data needed to represent information
• Channel coding: how to transmit data through noisy/distorted channels
Note: TDMA, FDMA, CDMA, OFDMA, and MIMO are all channelization methods
[Photo: Claude Elwood Shannon (April 30, 1916 – February 24, 2001)]
Typical Visual Communication System
Typical path
Info source -> Source coder -> Channel coder -> Modulator -> Channel (wired/wireless/storage) -> Demodulator -> Channel decoder -> Source decoder -> Info output
Codec
Codec = enCOder & DECoder
X -> Encoder -> Y -> Decoder -> X’
Codec Types
Lossless compression
• X == X’
• Used for document files (ZIP) and medical images (JPEG lossless)
• Entropy coding (arithmetic coding, Huffman coding), predictive coding
Lossy compression
• X ~ X’
• Used for entertainment and communication multimedia
• Transform coding (DCT), quantization
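The difference between X == X’ and X ~ X’ can be shown with a small sketch. The lossless half uses Python's standard `zlib` module (the same DEFLATE family as ZIP); the lossy half is only a coarse rounding of made-up 8-bit samples with an assumed step size `STEP`, not a real image codec.

```python
import zlib

samples = bytes([100, 101, 103, 102, 104, 180, 181, 179] * 64)  # made-up 8-bit samples

# Lossless: decoding returns exactly the input (X == X').
compressed = zlib.compress(samples)
restored = zlib.decompress(compressed)
print(restored == samples, len(samples), len(compressed))

# Lossy: coarse quantization keeps X' close to X but not equal (X ~ X').
STEP = 8
indices = [(s + STEP // 2) // STEP for s in samples]   # encoder output Y
approx = bytes(min(255, i * STEP) for i in indices)    # decoder output X'
max_error = max(abs(a - b) for a, b in zip(samples, approx))
print(approx == samples, max_error)
```

The lossy path trades a bounded error (at most half a step here) for indices that take fewer distinct values and are therefore cheaper to entropy code.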
[Figure: the same video uncompressed, zipped, and H.264-encoded]
Video Compression System Features
Source model
• Note: zip is a source-independent encoding
Human Visual System (HVS)
• The HVS does not notice many distortions
3. Predictive Coding
DPCM (Differential Pulse Code Modulation)
• Pixel values are highly correlated in the spatial domain
• Code the current pixel (S0) using previously coded ones (S1, S2, S3, etc.)
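[Figure: pixel neighborhood with the current pixel S0 and previously coded pixels S1 to S4 on the current line of pixels and the line of pixels above]
Coder Block Diagram
[Figure: Encoder: the prediction p from the Predictor is subtracted from the input x[n] to give the prediction error e[n], which goes to the Entropy Coder. Decoder: the Entropy Decoder recovers e[n], and adding the prediction p from the Predictor gives x[n].]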
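The DPCM loop can be sketched in one dimension. This is a simplified illustration that assumes the simplest possible predictor (the previous sample on the same line, i.e. S1 only) and an initial prediction of 0; the entropy coder is left out, so the output is the raw prediction-error sequence e[n].

```python
def dpcm_encode(samples):
    """Prediction errors e[n] = x[n] - p[n], with p[n] = previous sample."""
    errors = []
    prediction = 0            # assumed start value, known to encoder and decoder
    for x in samples:
        errors.append(x - prediction)
        prediction = x        # lossless case: the decoder will know x exactly
    return errors


def dpcm_decode(errors):
    """Rebuild x[n] = e[n] + p[n] with the same predictor as the encoder."""
    samples = []
    prediction = 0
    for e in errors:
        x = e + prediction
        samples.append(x)
        prediction = x
    return samples


line = [100, 102, 103, 103, 105, 110, 111, 111]   # made-up line of pixels
errors = dpcm_encode(line)
print(errors)                        # [100, 2, 1, 0, 2, 5, 1, 0]
print(dpcm_decode(errors) == line)   # True
```

After the first sample the errors are small numbers clustered around zero, which is exactly the kind of skewed distribution that an entropy coder compresses well.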
DPCM example
[Figure: original image and DPCM example images, labelled with the values 1 / 0 0 0, 0 / 0 1 0, and 0.5 / 0 0.5 0]
Motion Compensation Prediction
Temporal domain prediction
• How to use the temporal correlation?
• Model and representation methods
[Figure: two successive video frames and the resulting change detection mask]
Model-Based MC
2D/3D model
• dx, dy, dz and rotations
• Estimate (i.e. calculate) the parameters in the encoder and use them in the decoder
Difficulties
• Shape encoding and estimation complexity are too high for now
• Used in MPEG-4 object-oriented coding
[Figure: background, moving area picked up by the change detector, and moving areas missed by the change detector]
Block-Based MC
• Segment the frame into fixed-size blocks and find the best-matching displacement for each block
• Easier to implement in HW and SW
[Figure: real motion versus the motion vector (MV) between frames X(t) and X(t+1)]
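Finding the best-matching displacement can be sketched as an exhaustive (full) search. The matching cost used here, the sum of absolute differences (SAD), is a common choice but an assumption, since the slide does not name one; the frames, block size and search range are made up.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block of `cur` and a displaced block of `ref`."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def best_motion_vector(cur, ref, bx, by, size, search):
    """Try every displacement within +/- `search` pixels and keep the cheapest one."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # skip candidates that fall outside the reference frame
            if not (0 <= by + dy and by + dy + size <= height):
                continue
            if not (0 <= bx + dx and bx + dx + size <= width):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# made-up 8x8 frames: a bright 2x2 patch moves 2 pixels right and 1 pixel down
ref = [[0] * 8 for _ in range(8)]
cur = [[0] * 8 for _ in range(8)]
for y in range(2):
    for x in range(2):
        ref[2 + y][2 + x] = 200
        cur[3 + y][4 + x] = 200

cost, dx, dy = best_motion_vector(cur, ref, bx=4, by=3, size=2, search=3)
print(cost, dx, dy)  # 0 -2 -1
```

The vector (-2, -1) points from the block in the current frame back to where its content sits in the reference frame. The nested loops also show why motion search dominates encoder complexity: the cost grows with the square of the search range.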
4. Transform Coding
Transform from the spatial domain to the frequency domain
Makes quantization easier
• Energy compaction and HVS properties
• The transform itself gives no compression
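Encoder
• Transform: $y = T x$ (image $x$ -> samples $y$)
• Quantizer: $q = Q(y)$ (samples $y$ -> indices $q$)
• Encoder: $c = C(q)$ (indices $q$ -> bit-stream $c$)
Decoder
• Decoder: $q = C^{-1}(c)$ (bit-stream $c$ -> indices $q$)
• Dequantizer: $\hat{y} = Q^{-1}(q)$ (indices $q$ -> samples $\hat{y}$)
• Inverse transform: $\hat{x} = T^{-1} \hat{y}$ (samples $\hat{y}$ -> reconstructed image $\hat{x}$)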
Block transform
(Fixed-size) block transform
• Easy to implement
• Normally a 2-D separable transform
[Figure: plots of an image block, the DCT coefficients of the block, the quantized DCT coefficients of the block, and the block reconstructed from the quantized coefficients]
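A separable 2-D block transform can be sketched with plain matrix products: a 1-D transform applied along one dimension, then along the other. The example below uses an orthonormal 8x8 DCT-II matrix and a made-up smooth block; the helper names (`matmul`, `dct2`, `idct2`) are illustrative and no fast algorithm is attempted.

```python
import math

N = 8

# Orthonormal DCT-II basis: row k holds the k-th cosine basis function.
C = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
      * math.cos((2 * n + 1) * k * math.pi / (2 * N))
      for n in range(N)] for k in range(N)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct2(block):
    """Separable 2-D DCT: 1-D DCT on the columns, then on the rows (Y = C X C^T)."""
    return matmul(matmul(C, block), transpose(C))


def idct2(coeffs):
    """Inverse 2-D DCT (X = C^T Y C), valid because C is orthonormal."""
    return matmul(matmul(transpose(C), coeffs), C)


# made-up smooth block: a gentle horizontal ramp, identical in every row
block = [[100 + 2 * x for x in range(N)] for _ in range(N)]
coeffs = dct2(block)
restored = idct2(coeffs)

print(round(coeffs[0][0], 1))   # 856.0, i.e. 8 x the block mean of 107
print(max(abs(v) for row in coeffs[1:] for v in row) < 1e-9)   # True
```

All the energy of this block lands in the first row of coefficients, which illustrates energy compaction. The transform alone saves nothing, since 64 samples become 64 coefficients; the saving comes when the many near-zero coefficients are quantized and entropy coded.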
Transform types
• The KL transform is provably optimal, but it depends on the signal statistics
• The DCT is fixed and performs close to the KL transform for image signals
• Wavelet and fractal transforms, etc.
(1) Karhunen Loève transform [1948/1960]
(2) Haar transform [1910]
(3) Walsh-Hadamard transform [1923]
(4) Slant transform [Enomoto, Shibata, 1971]
(5) Discrete Cosine Transform (DCT) [Ahmed, Natarajan, Rao, 1974]
[Figure: panels (1) to (5), one per transform listed above]
Transform size
• The larger the block, the more efficient the compression, but the higher the computational complexity
• 8x8 or 4x4 blocks are used in the standards
5. Quantization
• Approximation of values
• Lossy coding (the key data-reduction step)
• Applied to the 2D transform coefficients
Qstep (or qscale)
Distortion range
The larger (coarser) the Q step
• The more compression
• The larger the distortion
Rate-distortion theory
In video coding
• Applied to the 2D transform coefficients
HVS
• Smaller Q step at low frequencies
• Larger Q step at high frequencies
[Figure: quantizer characteristic, quantizer output x̂ versus quantizer input x]
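The trade-off between Q step, compression and distortion can be sketched with a plain uniform quantizer. This is a simplified model that rounds to the nearest multiple of the step; real codecs often add a dead zone around zero and frequency-dependent steps, which are not modelled here. The coefficient values are made up.

```python
import math


def quantize(value, qstep):
    """Index of the nearest multiple of `qstep` (ties round upward)."""
    return math.floor(value / qstep + 0.5)


def dequantize(index, qstep):
    return index * qstep


coeffs = [52.0, -13.4, 6.1, -2.7, 0.9]   # made-up transform coefficients

results = {}
for qstep in (4, 16):
    indices = [quantize(c, qstep) for c in coeffs]
    rebuilt = [dequantize(i, qstep) for i in indices]
    max_error = max(abs(c - r) for c, r in zip(coeffs, rebuilt))
    results[qstep] = (indices, max_error)
    print(qstep, indices, round(max_error, 2))
```

With the step of 4 four indices are nonzero and the error stays below 2; with the step of 16 only two indices are nonzero but the error grows to about 6. In both cases the error is bounded by half the step size.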
6. Entropy Coding
Statistical redundancy in video coding
• Many zeros in the quantized transform coefficients
• Non-uniform distribution (histogram) of control information, such as motion vectors and coding type
Entropy coding
Principle
• “Shorter code words for more frequent events”
• Variable Length Coding (VLC)
Huffman coding
• Integer VLC: each code word has an integer number of bits
• Used in most standards
Arithmetic coding
• Fractional-length coding
• Introduced in H.263+ but widely used in practice only from H.264
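The principle of shorter code words for more frequent events can be illustrated by computing Huffman code lengths for a made-up symbol stream and comparing the average length with the entropy. This sketch tracks only code lengths, not the bit patterns, and the symbol statistics are invented for illustration.

```python
import heapq
import math
from collections import Counter


def entropy_bits(counts):
    """Shannon entropy in bits per symbol for the given symbol counts."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def huffman_code_lengths(counts):
    """Code length per symbol, built by repeatedly merging the two rarest nodes."""
    # heap entries: (weight, tie_breaker, {symbol: depth in the tree})
    heap = [(c, i, {s: 0}) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


# made-up stream of quantized coefficients, dominated by zeros
symbols = [0] * 8 + [1] * 4 + [-1] * 2 + [2] * 1 + [3] * 1
counts = Counter(symbols)
lengths = huffman_code_lengths(counts)
average = sum(counts[s] * lengths[s] for s in counts) / len(symbols)

print(lengths)                        # {0: 1, 1: 2, -1: 3, 2: 4, 3: 4}
print(entropy_bits(counts), average)  # 1.875 1.875
```

A fixed-length code would need 3 bits for these five symbols. The Huffman code reaches the entropy exactly here only because every probability is a power of 1/2; otherwise the integer code lengths leave a gap, which is the gap that arithmetic coding narrows.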
VLC coding in Image Coding
• Zigzag scan is used to obtain more statistical correlation
• 2-D run-length code: (number of zeros, nonzero value)
Transformed 8x8 block
1480 26.0 9.5 8.9 -26.4 15.1 -8.1 0.3
11.0 8.3 -8.2 3.8 -8.4 -6.0 -2.8 10.6
-5.5 4.5 9.0 5.3 -8.0 4.0 -5.1 4.9
10.7 9.8 4.9 -8.3 -2.1 -1.9 2.8 -8.1
1.6 1.4 8.2 4.3 3.4 4.1 -7.9 1.0
-4.5 -5.0 -6.4 4.1 -4.4 1.8 -3.2 2.1
5.9 5.8 2.4 2.8 -2.0 5.9 3.2 1.1
-3.0 2.5 -1.0 0.7 4.1 -6.1 6.0 5.7
After quantization, Q (8)
185 3 1 1 -3 2 -1 0
1 1 -1 0 -1 0 0 1
0 0 1 0 -1 0 0 0
1 1 0 -1 0 0 0 -1
0 0 1 0 0 0 -1 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
After zig-zag scan and run-level coding
Mean of Block: 185
(0,3) (0,1) (1,1) (0,1) (0,1) (0,1) (0,-1) (1,1)
(1,1) (0,1) (1,-3) (0,2) (0,-1) (6,1) (0,-1) (0,-1)
(1,-1) (14,1) (9,-1) (0,-1) EOB
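The scan and run-level steps can be sketched as follows. To keep the output short, the example uses a made-up 4x4 block of quantized coefficients rather than the 8x8 block above; the zigzag order generated here is the conventional one that starts (0,0), (0,1), (1,0), (2,0), and the function names are illustrative.

```python
def zigzag_order(n):
    """(row, col) positions of an n x n block in zigzag order, low to high frequency."""
    order = []
    for s in range(2 * n - 1):          # s = row + col, one anti-diagonal at a time
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # even diagonals run from bottom-left to top-right, odd ones the other way
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order


def run_level_encode(block):
    """Return the DC value and (zero run, nonzero level) pairs for the AC coefficients."""
    scan = [block[r][c] for r, c in zigzag_order(len(block))]
    dc, ac = scan[0], scan[1:]
    pairs, run = [], 0
    for value in ac:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return dc, pairs   # trailing zeros are implied by the end-of-block (EOB) marker


block = [
    [23, 3, 1, 0],
    [-2, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]
dc, pairs = run_level_encode(block)
print(dc, pairs)  # 23 [(0, 3), (0, -2), (1, 1), (0, 1), (3, 1)]
```

Sixteen coefficients shrink to one DC value and five pairs. The zigzag order helps because it visits low frequencies first, so the zeros produced by quantization collect into long runs and a single EOB at the end.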
7. Codec Design
Hybrid Codec
• Used by most standard codecs
• MC => DCT => Quant => Entropy Coding
[Figure: hybrid encoder block diagram. Blocks: Coder Control, Intra/Inter switch (with a 0 input), Intra-frame DCT Coder, Quant, DeQ, Intra-frame Decoder, Motion-Compensated Predictor, Motion Estimator, Entropy Coder, Decoder. Signals: Control Data, DCT Coefficients, Motion Data.]
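One detail of the hybrid structure is worth a sketch: the encoder contains a copy of the decoder (DeQ and reconstruction), so that it predicts from the same reconstructed frame the decoder will have. The toy model below is heavily simplified and rests on stated assumptions: frames are short 1-D lists, prediction is the previous reconstructed frame with zero motion, and the transform and entropy coding stages are omitted.

```python
import math

QSTEP = 4  # assumed quantizer step size


def quantize(residual):
    return [math.floor(r / QSTEP + 0.5) for r in residual]


def dequantize(indices):
    return [i * QSTEP for i in indices]


def encode(frames):
    """Quantized residual indices per frame; prediction = previous reconstructed frame."""
    reference = [0] * len(frames[0])      # first frame is predicted from zeros
    stream = []
    for frame in frames:
        residual = [x - p for x, p in zip(frame, reference)]
        indices = quantize(residual)
        stream.append(indices)
        # embedded decoder: rebuild exactly what the real decoder will see
        reference = [p + r for p, r in zip(reference, dequantize(indices))]
    return stream


def decode(stream):
    reference = [0] * len(stream[0])
    frames = []
    for indices in stream:
        reference = [p + r for p, r in zip(reference, dequantize(indices))]
        frames.append(reference)
    return frames


frames = [[10, 20, 30, 40], [11, 22, 29, 41], [13, 25, 27, 45]]   # made-up frames
decoded = decode(encode(frames))
worst = max(abs(a - b) for f, d in zip(frames, decoded) for a, b in zip(f, d))
print(decoded)
print(worst <= QSTEP / 2)   # True: the error does not accumulate over frames
```

If the encoder predicted from the original previous frame instead of the reconstructed one, the quantization errors would add up from frame to frame (drift). Predicting from the reconstruction keeps the error of every frame within half a quantizer step.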
Complexity Consideration
Asymmetric complexity
• Encoders are more complex than decoders in most standards
• Non-real-time encoding but real-time decoding (e.g. broadcasting, storage)
• Encode once, decode many times
• Encoder and decoder cost
Parallel processing and HW/SW implementation (in MPEG-2)
• Motion compensation (~55%)
• DCT/IDCT (~15%)
• VLC encoding/decoding (~15%)
• Other (post-processing) (~15%)