Complexity Reduction Algorithm for Intra Mode Selection in H.264/AVC Video Coding
By Amruta Kulkarni, Student ID: 1000666836
Under the supervision of Dr. K.R. Rao
Slide 2
H.264/MPEG-4 Part 10, or AVC, is an advanced video compression
standard developed by the ITU-T Video Coding Experts Group (VCEG)
together with the ISO/IEC Moving Picture Experts Group (MPEG). It is a
widely used video codec in mobile applications, internet video (YouTube,
Flash players), set-top boxes, DTV, etc. An H.264 encoder converts
video into a compressed format (.264), and a decoder converts
compressed video back into an uncompressed format.
Slide 3
How does an H.264 codec work? An H.264 video encoder carries out
prediction, transform and encoding processes to produce a
compressed H.264 bit stream. The block diagram of the H.264 video
encoder is shown in Fig. 1. A decoder carries out the complementary
process of decoding, inverse transform and reconstruction to output
a decoded video sequence. The block diagram of the H.264 video
decoder is shown in Fig. 2.
Profiles in H.264 The H.264 standard defines sets of
capabilities, referred to as profiles, each targeting a
specific class of applications (Fig. 3). Different features are
supported in different profiles depending on the application. Table 1
lists some profiles and their applications.

Profile    Applications
Baseline   Video conferencing, videophone
Main       Digital storage media, television broadcasting
High       Streaming video
Extended   Content distribution, post processing

Table 1. List of H.264 profiles and applications [2]
Slide 7
Profiles in H.264 [9] Fig. 3 Profiles in H.264 [9]
Slide 8
Prediction The H.264 encoder forms a prediction of the current
macro block:
a) Based on the current frame, using intra (spatial) prediction.
b) Based on previous frames that have already been coded, using
inter prediction.
Slide 9
Intra prediction Supports the following block sizes:
a) 16x16 (for Luma) [9]
Fig. 4 H.264 Intra 16x16 prediction modes.
Slide 10
Intra Prediction H.264 Intra 16x16 prediction modes are shown
in Fig. 4.
Mode 0 (vertical): extrapolation from upper samples (H).
Mode 1 (horizontal): extrapolation from left samples (V).
Mode 2 (DC): mean of upper and left-hand samples (H+V).
Mode 3 (plane): a linear plane function is fitted to the upper and
left-hand samples H and V. This works well in areas of
smoothly varying luminance.
b) 8x8 (for Chroma): DC, Plane, Horizontal and Vertical modes.
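The first three 16x16 modes can be sketched in a few lines of Python. This is an illustrative sketch only, not the JM 17.2 code: the function name `predict_16x16` and the list-of-rows block layout are assumptions made here, `top` holds the 16 upper samples (H), `left` holds the 16 left-hand samples (V), and both neighbours are assumed to be available. The DC rounding `(sum + 16) >> 5` follows the H.264 standard; the plane mode is left out for brevity.

```python
def predict_16x16(mode, top, left):
    """Return a 16x16 prediction block (list of rows) for modes 0-2.

    top  -- 16 reconstructed samples above the block (H)
    left -- 16 reconstructed samples to the left of the block (V)
    Hypothetical helper for illustration; not taken from JM 17.2.
    """
    if len(top) != 16 or len(left) != 16:
        raise ValueError("top and left must each hold 16 samples")
    if mode == 0:
        # Vertical: copy the upper samples down every column.
        return [list(top) for _ in range(16)]
    if mode == 1:
        # Horizontal: copy each left sample across its row.
        return [[left[y]] * 16 for y in range(16)]
    if mode == 2:
        # DC: rounded mean of all 32 neighbouring samples.
        dc = (sum(top) + sum(left) + 16) >> 5
        return [[dc] * 16 for _ in range(16)]
    raise ValueError("only modes 0 (vertical), 1 (horizontal) and 2 (DC) are sketched")
```

For example, `predict_16x16(2, [100] * 16, [100] * 16)` yields a flat block in which every sample is 100.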
Slide 11
Intra prediction c) 4x4 (for Luma): the modes for intra prediction
are shown in Fig. 5 [2]. Fig. 5 H.264 Intra 4x4 prediction
modes [2]
Slide 12
Introduction In H.264/AVC intra-prediction is conducted for
block sizes such as 4x4 luma blocks, 16x16 luma blocks, and 8x8
chroma blocks for baseline profile 4:2:0 format. The residual
between the current block and its prediction is then transformed,
quantized, and entropy coded. To obtain the best mode for these
blocks, the H.264/AVC encoder performs the rate-distortion
optimization (RDO) technique for each macro block.
Slide 13
RDO procedure for one macro block
Set the macro block parameters: QP (quantization parameter) and the
Lagrangian multiplier ($\lambda_{MODE}$).
Calculate: $\lambda_{MODE} = 0.85 \times 2^{(QP-12)/3}$
Then calculate the cost, which determines the best mode:
$Cost = D + \lambda_{MODE} \times R$, where
D - distortion
R - bit rate with the given QP
$\lambda_{MODE}$ - Lagrange multiplier
Distortion (D) is obtained by SATD (sum of absolute transformed
differences) between the original macro block and its reconstructed
block. Bit rate (R) includes the bits for the mode information and the
transform coefficients of the macro block. The quantization parameter
(QP) can vary from 0 to 51. The Lagrange multiplier ($\lambda_{MODE}$)
is a value representing the relationship between bit cost and quality.
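The two formulas above can be written out directly. The following is a minimal sketch, with the function names `lagrange_multiplier` and `rd_cost` chosen here for illustration (they are not JM 17.2 identifiers); distortion and rate are passed in as plain numbers.

```python
def lagrange_multiplier(qp):
    """Lagrangian multiplier for mode decision: 0.85 * 2^((QP - 12) / 3)."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must lie in the range 0-51")
    return 0.85 * 2 ** ((qp - 12) / 3)


def rd_cost(distortion, rate_bits, qp):
    """Rate-distortion cost: D + lambda * R for the given QP."""
    return distortion + lagrange_multiplier(qp) * rate_bits
```

Because the exponent grows by one for every three QP steps, the multiplier doubles each time QP increases by 3, so the rate term weighs more heavily at coarser quantization.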
Slide 14
RDO procedure for one macro block
1. For a 4x4 luma block, select the best mode, which minimizes the
cost among 9 modes.
2. For a 16x16 luma block, choose the mode with minimum SATD (sum of
absolute transformed differences) among the 4 modes as the best
mode.
3. For an 8x8 chroma block, choose the mode with minimum cost
among 4 modes.
4. Choose the best one as the prediction mode of one macro block by
comparing the RD costs of the 4x4 mode obtained from step (1) and
the 16x16 mode from step (2).
5. Choose the best one as the prediction mode of one chroma block.
Slide 15
RDO (rate-distortion optimization) Considering the RDO
procedure for intra mode selection in H.264/AVC, the number of mode
combinations in one macro block is $N_8 \times (16 \times N_4 + N_{16})$, where
$N_8$ - number of modes of an 8x8 chroma block (4 modes)
$N_4$ - number of modes of a 4x4 luma block (9 modes)
$N_{16}$ - number of modes of a 16x16 luma block (4 modes)
$N_8 \times (16 \times N_4 + N_{16}) = 4 \times (16 \times 9 + 4) = 592$
Thus, to select the best mode for one macro block in intra prediction,
the H.264/AVC encoder carries out 592 RDO calculations. As a result,
the complexity of the encoder increases significantly.
Slide 16
SATD (sum of absolute transformed differences) calculation
The following steps are performed in JM 17.2 to calculate SATD:
a) Find the absolute difference (magnitude) between the original 16x16
block and the predicted 16x16 block.
b) Apply the Hadamard transform to every 4x4 block.
c) Take the sum of the transform coefficients of every 4x4 block,
except the DC coefficient.
d) Check whether sum_cost > max_cost. If yes, return max_cost.
e) If no, apply the Hadamard transform on every 4x4 block and add the
sum of all 16 DC coefficients to sum_cost.
f) Check whether sum_cost > max_cost. If yes, return max_cost;
otherwise return sum_cost.
where
sum_cost -> cost calculated for each block partition
max_cost -> maximum cost value allowed for each block size
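The core of the SATD measure, a Hadamard transform of a 4x4 difference block followed by a sum of absolute coefficients, can be sketched as below. This shows the general technique under stated assumptions and is not the JM 17.2 routine: the names `H4`, `_matmul4` and `satd_4x4` are introduced here, the signed difference is transformed (the usual SATD definition), and JM's normalisation, its separate handling of the 16 DC coefficients and the max_cost early exit are not reproduced. The optional `skip_dc` flag only mirrors the "except the DC coefficient" idea of step c).

```python
# 4x4 Hadamard matrix (symmetric, entries +1/-1).
H4 = [
    [1,  1,  1,  1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
    [1, -1,  1, -1],
]


def _matmul4(a, b):
    """Multiply two 4x4 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def satd_4x4(original, predicted, skip_dc=False):
    """Sum of absolute Hadamard-transformed differences of two 4x4 blocks."""
    diff = [[original[y][x] - predicted[y][x] for x in range(4)]
            for y in range(4)]
    h4_t = [list(col) for col in zip(*H4)]           # transpose of H4
    coeffs = _matmul4(_matmul4(H4, diff), h4_t)      # H * diff * H^T
    total = sum(abs(c) for row in coeffs for c in row)
    if skip_dc:
        total -= abs(coeffs[0][0])                   # drop the DC term
    return total
```

A block that differs from its prediction by a constant offset puts all of its energy into the DC coefficient, which is why a flat error contributes nothing once the DC term is skipped.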
Slide 17
Introduction This project uses the baseline profile, as it is
simple to implement. Its important features are:
a) I and P slice coding
b) Enhanced error resilience tools such as flexible macro block
ordering (FMO), arbitrary slice ordering (ASO) and redundant slices (RS)
c) Context adaptive variable length coding (CAVLC)
The baseline profile is primarily used for low-cost applications
that require robustness to data loss. The joint model (JM 17.2)
[6] implementation of the H.264 encoder is used in this
project.
Slide 18
Complexity reduction algorithm for intra prediction This
project will implement the complexity reduction algorithm for all
three block sizes: 1) 16x16 luma 2) 4x4 luma 3) 8x8 chroma
Slide 19
Algorithm The proposed intra mode selection algorithm for a
16x16 luma block is summarized as follows:
Step 1 - Examine the sizes of the adjacent blocks: if both blocks
(upper block and left block) are 16x16, go to Step 2; otherwise go to
Step 4.
Step 2 - Examine the modes of the adjacent blocks: if both modes are
the same, go to Step 3; otherwise select the best mode for the 16x16
luma block as the one with the minimum SATD (sum of absolute
transformed differences) between the two adjacent modes, modeA and
modeB.
Step 3 - If both adjacent modes are DC mode, go to Step 4; otherwise
select the best mode for the 16x16 luma block as the one with the
minimum SATD between the adjacent mode and DC mode.
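Steps 1 to 3 amount to a small decision function that either returns two candidate modes or defers to Step 4. The sketch below is one possible reading of those steps; the names `candidates_from_neighbours` and `best_mode` are hypothetical, modes are identified by the Intra 16x16 mode numbers (0 vertical, 1 horizontal, 2 DC, 3 plane), and the SATD of a mode is supplied by the caller as a function.

```python
DC_MODE = 2  # Intra 16x16 mode numbering: 0 vertical, 1 horizontal, 2 DC, 3 plane


def candidates_from_neighbours(upper_size, left_size, mode_a, mode_b):
    """Return the candidate modes from Steps 1-3, or None to fall back to Step 4."""
    # Step 1: both adjacent blocks must be coded as 16x16.
    if upper_size != 16 or left_size != 16:
        return None
    # Step 2: different neighbour modes -> compare those two modes.
    if mode_a != mode_b:
        return [mode_a, mode_b]
    # Step 3: both neighbours DC -> no extra information, go to Step 4.
    if mode_a == DC_MODE:
        return None
    # Step 3: same non-DC mode -> compare it against DC.
    return [mode_a, DC_MODE]


def best_mode(candidates, satd_of):
    """Pick the candidate with the minimum SATD; satd_of maps mode -> SATD."""
    return min(candidates, key=satd_of)
```

In every branch at most two modes are evaluated instead of four, which is where the saving for 16x16 blocks comes from.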
Slide 20
Algorithm Figure: Modes of upper and left blocks used as candidate
modes [8]. Figure: Indices used in 4x4 luma block [8].
Slide 22
Algorithm Step 4 - Let V be the vertical difference between the upper
boundary pixels of the current block and the boundary pixels of the
upper block, and H be the horizontal difference between the left boundary
pixels of the current block and the boundary pixels of the left block,
as follows:
$V = \sum_{i=0}^{15} |u(i) - q(i)|$
$H = \sum_{i=0}^{15} |l(i) - r(i)|$
where
u(i) -> boundary pixels of the upper block
q(i) -> upper boundary pixels of the current block
l(i) -> boundary pixels of the left block
r(i) -> left boundary pixels of the current block
$T_2$ -> threshold level 2 ($T_2 = 32$)
Slide 23
Algorithm Obtain candidate modes as follows by using the two
difference values, V and H:
if $|V - H|$ is smaller than $2T_2$, the candidate modes are DC mode and
plane mode;
if $(V - H)$ is larger than $T_2$, the candidate modes are DC mode and
horizontal mode;
if $(V - H)$ is smaller than $-T_2$, the candidate modes are DC mode and
vertical mode,
where $T_2$ is a positive threshold value equal to 32.
The calculation of V and H is shown in Fig. 6. Finally, select the
best mode among the candidate modes by choosing the mode with
minimum SATD.
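Step 4 can be sketched as follows. Several points are assumptions rather than statements from the slides: V and H are taken as sums of absolute differences over the 16 boundary positions, the three conditions are tested in the order listed so that the first match wins, the third condition is read as V - H below minus T2, and the names `boundary_differences` and `step4_candidates` are hypothetical.

```python
T2 = 32  # threshold level 2, as given on the slide


def boundary_differences(u, q, l, r):
    """Return (V, H): summed absolute boundary differences.

    u, q -- boundary pixels of the upper block / upper boundary of current block
    l, r -- boundary pixels of the left block / left boundary of current block
    """
    v = sum(abs(a - b) for a, b in zip(u, q))
    h = sum(abs(a - b) for a, b in zip(l, r))
    return v, h


def step4_candidates(v, h, t2=T2):
    """Map the difference V - H to two candidate Intra 16x16 modes."""
    d = v - h
    if abs(d) < 2 * t2:
        return ["DC", "plane"]        # neither direction clearly dominates
    if d > t2:
        return ["DC", "horizontal"]   # upper boundary matches poorly
    return ["DC", "vertical"]         # here d <= -2*t2, so d is below -t2
```

With this ordering every value of V - H falls into exactly one branch, so two candidate modes are always returned.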
Slide 24
Algorithm Fig. 6 Calculation of V & H [8]
Slide 25
Comparison of number of RDO computations
After implementing the complexity reduction algorithms, the number of
mode decisions for every intra prediction block size is reduced.
Table 2 lists the reduction in mode decisions.

Intra block size   Number of modes (original JM 17.2)   Number of modes (complexity reduction algorithm)
16x16              4                                    2
8x8                4                                    2
4x4                9                                    4

Table 2. Reduction in the number of mode decisions for intra block
prediction [8]
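As a rough illustration, the mode counts in Table 2 can be put through the combination formula N8 x (16 x N4 + N16) quoted earlier for the RDO procedure. Applying that formula to the reduced counts is an assumption made here, not a figure reported in the slides, and it counts mode combinations rather than encoding time. The function name `rdo_combinations` is hypothetical.

```python
def rdo_combinations(n8, n4, n16):
    """Number of mode combinations per macro block: N8 * (16 * N4 + N16)."""
    return n8 * (16 * n4 + n16)


original = rdo_combinations(n8=4, n4=9, n16=4)   # JM 17.2 mode counts
reduced = rdo_combinations(n8=2, n4=4, n16=2)    # counts from Table 2

print(original, reduced)                          # 592 132
print(round(100 * (original - reduced) / original, 1), "% fewer combinations")
```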
Slide 26
CIF and QCIF sequences. CIF (Common Intermediate Format) is a
format used to standardize the horizontal and vertical resolutions,
in pixels, of the Y, Cb, Cr components of video signals; it is commonly
used in video teleconferencing systems. QCIF means "Quarter CIF": the
frame has one quarter of the area of a CIF frame, so its height and
width are each halved. The differences in Y, Cb, Cr between
CIF and QCIF are shown in Fig. 7. [16] Fig. 7 CIF and QCIF
resolutions (Y, Cb, Cr).
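The relation between the two formats can be checked with a short sketch. The luma resolutions used here (352x288 for CIF, 176x144 for QCIF) are the standard values and are supplied as an assumption, since the figure with the actual numbers did not survive in the slide text; 4:2:0 sampling, as used with the baseline profile, halves each chroma dimension. The helper name `yuv420_planes` is hypothetical.

```python
CIF = (352, 288)    # standard CIF luma resolution (width, height)
QCIF = (176, 144)   # standard QCIF luma resolution


def yuv420_planes(width, height):
    """Plane sizes for 4:2:0 video: chroma is halved in both dimensions."""
    chroma = (width // 2, height // 2)
    return {"Y": (width, height), "Cb": chroma, "Cr": chroma}


def frame_bytes(width, height):
    """Bytes per 8-bit 4:2:0 frame: one luma plane plus two quarter-size chroma planes."""
    return width * height * 3 // 2
```

For instance, `yuv420_planes(*QCIF)["Cb"]` gives `(88, 72)`, and a QCIF frame needs a quarter of the bytes of a CIF frame.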
Slide 27
Results The following QCIF and CIF sequences were used to test
the complexity reduction algorithm [10]: Akiyo, Foreman, Car phone,
Hall monitor, Silent, News, Container, Coastguard
Slide 28
Test Sequences: News, Foreman, Akiyo, Coastguard, Car phone, Bus
Slide 29
Test Sequences: Hall monitor, Silent
Slide 30
Results The complexity reduction algorithm was successfully
implemented in the JM 17.2 software.
Profile used: Baseline
Total number of frames: 100 (I frames only)
QP (quantization parameter): 28
Table 1 shows the simulation results for QCIF sequences. Table
2 shows the simulation results for CIF sequences. Computational
efficiency is measured by the amount of time reduction, which is
computed as follows: Time is calculated as :
Slide 31
Results MSE is calculated as: Bit rate is calculated as: PSNR
is calculated as :
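The formulas themselves were images and did not survive in the slide text. The sketch below shows the conventional definitions of these measures, which may differ in detail from the ones on the slides: a percentage change relative to the reference encoder (usable for both encoding time and bit rate), the mean squared error between two sample sequences, and PSNR for 8-bit video. All function names are hypothetical.

```python
import math


def percent_change(proposed, reference):
    """Relative change in percent; negative means the proposed value is smaller."""
    return (proposed - reference) / reference * 100.0


def mse(original, reconstructed):
    """Mean squared error between two equally long sample sequences."""
    if len(original) != len(reconstructed):
        raise ValueError("sequences must have the same length")
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def psnr(mse_value, peak=255):
    """Peak signal-to-noise ratio in dB for the given MSE (8-bit peak by default)."""
    if mse_value == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / mse_value)
```

Under these definitions an encoder that takes 90 s where the reference takes 100 s gives `percent_change(90, 100)` of about -10, matching the sign convention in which negative values indicate a gain.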
Slide 32
Results

Sequence (QCIF)   Time (%)   PSNR (dB)   Bit rate (%)   MSE
Akiyo             -10.203     0.014      3.59            0.033
Foreman           -10.942    -0.004      2.03           -0.012
Car phone          -9.768    -0.002      4.33           -0.012
Hall monitor      -10.826     0.002      2.78            0.011
Silent            -10.669    -0.002      2.98           -0.007
News              -10.566     0.004      1.81            0.080
Container          -9.107     0.008      1.33           -0.019
Coastguard        -10.629    -0.021      2.72           -0.082

*Negative values indicate the gain (e.g. decrease in encoding time)
*Positive values indicate the loss (e.g. increase in the bit rate)
Table 1 : Test results for QCIF sequences
Slide 33
Results

Sequence (CIF)   Time (%)   PSNR (dB)   Bit rate (%)   MSE
Bus              -10.459    -0.006      5.37            0.111
Container        -10.495     0.050      3.93           -0.027
Coastguard       -10.287    -0.001      2.72           -0.020

*Negative values indicate the gain (e.g. decrease in encoding time)
*Positive values indicate the loss (e.g. increase in the bit rate)
Table 2 : Test results for CIF sequences
Slide 34
References:
1. ITU-T Rec. H.264 | ISO/IEC 14496-10: Information Technology - Coding of Audio-visual Objects, Part 10: Advanced Video Coding, 2002.
2. T. Wiegand, et al., "Overview of the H.264/AVC Video Coding Standard," IEEE Trans. Circuits and Syst. for Video Technol., Vol. 13, pp. 560-576, July 2003.
3. Z. Chen, et al., "Fast Integer Pixel and Fractional Pixel Motion Estimation for JVT," Doc. #JVT-F017, Dec. 2002.
4. B. Hsieh, et al., "Fast Motion Estimation for H.264/MPEG-4 AVC by Using Multiple Reference Frame Skipping Criteria," VCIP 2003, Proceedings of SPIE, Vol. 5150, pp. 1551-1560, Oct. 2003.
5. A. Puri, et al., "Video coding using the H.264/MPEG-4 AVC compression standard," Signal Processing: Image Communication, vol. 19, pp. 793-849, Oct. 2004.
Slide 35
References
6. H.264/AVC JM software: http://iphome.hhi.de/suehring/tml/
7. An overview of the H.264 encoder: www.vcodex.com
8. J. Kim, et al., "Complexity reduction algorithm for intra mode selection in H.264/AVC video coding," J. Blanc-Talon et al. (Eds.): ACIVS 2006, LNCS 4179, pp. 454-465, Springer-Verlag Berlin Heidelberg, 2006.
9. I. Richardson, "The H.264 advanced video compression standard," second edition, Wiley, 2010.
10. YUV test video sequences: http://trace.eas.asu.edu/yuv/