UNIVERSITY OF ENGINEERING AND TECHNOLOGY TAXILA
DEPARTMENT OF ELECTRICAL ENGINEERING
DESIGN AND IMPLEMENTATION OF IMPROVED QUALITY
LOW BIT RATE VIDEO CODING
A dissertation submitted in partial fulfillment of the requirements for the degree of
Doctor of Philosophy
in
Electrical Engineering
by
Gulistan Raja
03-UET/PhD-EE-14
Research Committee in charge:
Prof. Dr. Muhammad Javed Mirza – Supervisor
Prof. Dr. Habibullah Jamal
Prof. Dr. Muhammad Khawar Islam
Dr. Shoab A. Khan
2008
Design and Implementation of Improved Quality Low Bit Rate
Video Coding
Copyright © 2008
by
Gulistan Raja
All rights reserved
Dedicated to my family
SUMMARY
Design and Implementation of Improved Quality
Low Bit Rate Video Coding
Gulistan Raja
03-UET/PhD-EE-14
Most of today’s video coding standards use block-based discrete cosine transform
coding schemes to exploit spatial redundancy. The basic approach is to partition
the whole image into blocks, then transform and quantize each block. At low bit
rates, coarse quantization causes a loss of correlation between adjacent blocks.
This introduces visually disturbing discontinuities along block edges, known as
blocking artifacts.
The latest H.264/AVC video coding standard employs a normative adaptive loop
deblocking filter algorithm to reduce blocking artifacts. Performance analysis of
the deblocking filter has proved its effectiveness in suppressing these artifacts.
However, it is computationally very complex. There is therefore a need to reduce
this computational complexity to make the filter suitable for low bit rate
applications, e.g., real-time mobile video. Various attempts have been made to
reduce the computational complexity of the deblocking algorithm, but most of them
deal with hardware implementation using efficient architectures. We propose an
optimized deblocking algorithm based on the motion activity of video sequences.
First, we carried out a performance analysis of the latest H.264/AVC standard
against other video coding standards for low bit rate applications; the results
show a significant performance gain for H.264/AVC in comparison with the other
standards. Second, we have shown that the H.264/AVC deblocking filter is very
effective in suppressing the blocking artifacts generated at low bit rates.
However, due to its high computational cost it takes one third of the computing
resources of the decoder. The main cause of this enormously high computational
complexity is the boundary strength computation, which is primarily used to
select one type of filter out of two: normal and strong. More than 90% of the
computational resources of the H.264/AVC deblocking filter are spent on boundary
strength computations. Third, with reduction of the computational complexity of
the H.264/AVC deblocking filter as the objective, a motion activity based
deblocking algorithm is proposed. Based on the sum of motion vectors at frame
level, thresholds have been set through experimentation to categorize video
sequences according to their motion activity into three groups: (1) low motion,
(2) moderate motion and (3) high motion. It has been observed through
experimentation in the proposed research that the strong filter of the
H.264/AVC deblocking filter can be replaced by the normal filter for low to
moderate motion sequences. A new decision criterion for the application of the
filter, based on the motion activity of video sequences, is proposed. As a
result, boundary strength computations are not used for low to moderate motion
sequences in the proposed deblocking algorithm.
Various simulations were conducted to evaluate the proposed technique. A
significant reduction in the average number of operations is achieved without
loss of subjective video quality: a reduction of 45.29% in the average number of
operations is attained. The objective and subjective results are in conformity
with those of the original H.264/AVC deblocking filter for low and moderate
motion video sequences. The proposed research can be used for real-time low bit
rate video applications, for example mobile video on portable devices, video
telephony, and video conferencing over low-bandwidth Internet connections.
Keywords: Video coding, H.264/AVC, deblocking filter, motion activity, computing
complexity.
ACKNOWLEDGMENTS
First of all, I would like to express my overflowing gratitude to Almighty Allah for
granting me wisdom, resources and strength to complete this work.
The first acknowledgment goes to my supervisor, Prof. Dr. Muhammad Javed Mirza,
for his invaluable guidance, constructive advice, precise criticism and
encouragement during the course of this research. I have great appreciation for
Professor Mirza’s wisdom, and he has lent me great support in academic matters.
I would like to express my deep gratitude to the members of my research committee,
Prof. Dr. Habibullah Jamal, Prof. Dr. Muhammad Khawar Islam and Prof. Dr. Shoab
A. Khan, for their interesting discussions and helpful comments in the evaluation
of this research work. They always encouraged me and were a significant driving
force during my dissertation work.
Special thanks to Tian Song, with whom I studied during my master’s studies at
Osaka University, for his valuable comments and suggestions.
I am grateful to Prof. Ahmad Khalil Khan for useful discussions, encouragement and
motivation during the course of studies.
I would also like to thank all my colleagues and friends especially Prof. Dr.
Muhammad Amin, Prof. Dr. Umar Farooq, Prof. Dr. Zafrullah, Prof. Dr. Muhammad
Ahmad, Prof. Tahir Nadeem Malik, Prof. Aftab Ahmad, Prof. Iram Baig, Prof. Dr.
Adeel Akram, Tahir Mahmood, Ilyas Ahmad, Amir Hanif, Zahid Suleman Butt, Riffat
Asim Pasha and Dr. Mirza Jahanzaib for their encouragement.
I am thankful to all the people who supported me during my research, especially
Prof. Dr. Qaiser-uz-Zaman, Director ASR & TD, and Zafar Iqbal Sabir, Admin
Officer, ASR & TD office, and their staff.
At the end, my heartfelt appreciation is expressed to all the members of my family
especially my mother and wife for their love, inspiration, patience, continuous support,
encouragement and prayers during my PhD studies.
CURRICULUM VITA
Education
1996 B.Sc. Electrical Engineering, University of Engineering and
Technology, Taxila
2002 M.S. Information Systems Engineering, Osaka University, Osaka
2008 Ph.D. Electrical Engineering, University of Engineering and
Technology, Taxila
Professional Experience
1997 ~ 2003 Research Associate, Electrical Engineering Department,
University of Engineering and Technology, Taxila
2000~2002 Intern, Synthesis Corporation, Osaka
2003 ~ present Assistant Professor, Electrical Engineering Department,
University of Engineering and Technology, Taxila
Pertinent Publications
1. Gulistan Raja, M. J. Mirza, and T. Song, “H.264/AVC Deblocking Filter based on
Motion Activity in Video Sequences” Journal of IEICE Electronics Express, Japan,
Vol. 5, No. 19, 2008, pp. 809-814.
2. Gulistan Raja and M. J. Mirza, “A New Scheme of Suppressing Blocking Artifacts
in H.264/AVC Deblocking Filter for Low Bit Rate Video Coding,” World
Scientific and Engineering Academy and Society Transactions on Circuits and
Systems, Greece, Issue 1, Vol. 6, 2007, pp. 182-186.
3. Gulistan Raja and M. J. Mirza, “Evaluation of Loop Filtering for Reduction of
Blocking Effects in Real Time Low Bit Rate Video Coding,” MUET Research
Journal of Engineering & Technology, Pakistan, Vol. 26, No. 3, 2007, pp. 211-218.
4. Gulistan Raja and M. J. Mirza, “In-Loop Deblocking Filter for JVT H.264/AVC,”
World Scientific and Engineering Academy and Society Transactions on Signal
Processing, Greece, (selected paper from International Conference on Signal
Processing, Robotics and Automation, ISPRA, 06, Madrid, Spain), Issue 2, Vol. 2,
pp. 143-148.
5. Gulistan Raja, M. J. Mirza, “JVT H.264/AVC: Evaluation with Existing Standards
for Low Bit Rate Video Coding,” Proceedings of 17th IEEE International
Conference on Microelectronics, Islamabad, Pakistan, December 13-15, 2005, pp.
301-304.
6. Gulistan Raja, M. J. Mirza, “Evaluation of Emerging JVT H.264/AVC with
MPEG Video,” Proceedings of 9th IEEE International Multi-topic Conference,
Karachi, Pakistan, December 24-25, 2005, pp. 626-629.
7. Gulistan Raja, M. J. Mirza, “Performance Comparison of Advanced Video Coding
H.264 Standard with Baseline H.263 and H.263+ Standards,” Proceedings of 4th
IEEE International Symposium on Communications & Information Technologies,
Sapporo, Japan, October 26-29, 2004, pp. 743-746.
TABLE OF CONTENTS
Summary................................................................................................................................... v
Acknowledgments..................................................................................................................viii
Curriculum Vita ....................................................................................................................... x
Table of Contents ................................................................................................................... xii
List of Figures ........................................................................................................................xiv
List of Tables ..........................................................................................................................xvi
Chapter 1: Introduction………………………………………………………………...1
1.1 Background ........................................................................................................ 1
1.2 Objectives ........................................................................................................... 2
1.3 Approach ............................................................................................................ 3
1.4 Thesis Outline..................................................................................................... 4
Chapter 2: Literature Review…………………………………………………………..6
2.1 Theory of Blocking Artifacts at Low Bit Rates ................................................ 6
2.2 Methods to Reduce Blocking Artifacts – Deblocking Filters........................... 9
2.2.1 Categories of Techniques for Reducing Blocking Artifacts .............. 9
2.2.2 Literature Review for Deblocking Filters......................................... 12
2.2.3 Summary of Salient Techniques used in Deblocking Filters .........25
2.3 Motion Activity Detection Metrics - A Deblocking Filters Perspective........ 27
2.3.1 Review of Significant Motion Activity Detection Approaches ....... 27
2.3.2 Analysis of Motion Activity Detection Techniques......................... 32
Chapter 3: Case Studies – Analysis with respect to Low Bit Rate Video Coding…..33
3.1 Performance Analysis of H.264/AVC with Existing Standards..................... 33
3.1.1 H.264/AVC Profiles and Levels ....................................................... 34
3.1.2 Main Blocks of H.264/AVC ............................................................. 36
3.1.3 Test Environment and Simulation Results ....................................... 39
3.2 Evaluation of H.264/AVC Deblocking Filter ................................................. 47
3.2.1 H.264/AVC Loop Deblocking Filter ................................................ 47
3.2.2 Experimental Methodology and Results........................................... 53
Chapter 4: Design and Implementation of Proposed Deblocking Filter for Improved
Quality Low Bit Rate Video Coding..............................................................62
4.1 Analysis of Strong Filter and Normal Filter Employment in H.264/AVC
Deblocking Filter.............................................................................................. 63
4.2 Classification using Motion Activity in Video Sequences ............................. 72
4.3 Motion Vectors Thresholds for Motion Activity ............................................ 74
4.4 Proposed Deblocking Filter ............................................................................. 76
4.5 Experimental Environment .............................................................................. 82
4.6 Computational Complexity Comparison......................................................... 83
4.7 Objective Comparison ..................................................................................... 90
4.8 Subjective Comparison .................................................................................... 93
Conclusions .......................................................................................................................... 105
Future Recommendations .................................................................................................... 107
References ............................................................................................................................ 108
LIST OF FIGURES
Fig. 2.1 Blocking Artifacts at 40 Kbps................................................................................... 8
Fig. 2.2 Post processing for reduction of blocking artifacts ................................................. 9
Fig. 2.3 Loop filter for reduction of blocking artifacts ....................................................... 10
Fig. 3.1 JVT H.264/AVC encoder........................................................................................ 36
Fig. 3.2 Rate distortion comparison of H.264/AVC at low bit rates with MPEG2 and MPEG4 .. 44
Fig. 3.3 Rate distortion comparison of H.264/AVC at low bit rates with H.263 Baseline and H.263+.. 44
Fig. 3.4 Subjective comparison at low bit rates: QCIF CARPHONE frame 57 encoded at 22 Kbps...
Fig. 3.5 Subjective comparison at low bit rates: QCIF FOREMAN frame 134 encoded at 40 Kbps . 46
Fig. 3.6 Position of deblocking filter in H.264/AVC encoder............................................ 47
Fig. 3.7 Filtering order at macroblock level ........................................................................ 48
Fig. 3.8 Boundary strength (bS) computation flowchart ...................................................... 49
Fig. 3.9 H.264/AVC deblocking filter.................................................................................. 53
Fig. 3.10 Rate-PSNR Comparison at Low Bit Rates: with- & without deblocking filter . 56
Fig. 3.11 Subjective comparison for various QCIF sequences ............................................ 58
Fig. 3.12 Subjective comparison for various QCIF sequences ............................................ 59
Fig. 3.13 Subjective comparison for various CIF sequences ............................................... 60
Fig. 3.14 Subjective comparison for various CIF sequences ............................................... 61
Fig. 4.1 QCIF CONTAINER at 30 Kbps: Use of (a) Normal Filter (b) Strong Filter ...... 65
Fig. 4.2 QCIF SALESMAN at 30 Kbps: Use of (a) Normal Filter (b) Strong Filter........ 66
Fig. 4.3 QCIF MOTHER DAUGHTER at 30 Kbps: Use of (a) Normal Filter (b)
Strong Filter ......................................................................................................................67
Fig. 4.4 QCIF FOOTBALL at 30 Kbps: Use of (a) Normal Filter (b) Strong Filter ........ 68
Fig. 4.5 Use of Strong and Normal Filter at Frame Level in H.264/AVC Deblocking
Filter encoded at 30 Kbps (a) QCIF CONTAINER (b) QCIF SALESMAN...................... 69
Fig. 4.6 Use of Strong and Normal Filter at Frame Level in H.264/AVC Deblocking
Filter encoded at 30 Kbps (a) QCIF MOTHER DAUGHTER (b) QCIF CARPHONE .. 70
Fig. 4.7 Use of Strong and Normal Filter at Frame Level in H.264/AVC Deblocking
Filter encoded at 30 Kbps (a) QCIF FOREMAN (b) QCIF FOOTBALL ........................ 71
Fig. 4.8 Thresholds for classification of QCIF video sequences........................................ 75
Fig. 4.9 Thresholds for classification of CIF video sequences........................................... 76
Fig. 4.10 Adjacent samples to vertical & horizontal edge .................................................. 77
Fig. 4.11 Proposed Deblocking Filter .................................................................................. 82
Fig. 4.12 Comparison of addition operations (a) QCIF sequences (b) CIF sequences ..... 86
Fig. 4.13 Comparison of shift operations (a) QCIF sequences (b) CIF sequences ........... 87
Fig. 4.14 Comparison operations (a) QCIF sequences (b) CIF sequences ........................ 88
Fig. 4.15 Objective comparison between H.264/AVC deblocking filter and proposed
deblocking filter for various (a) QCIF sequences (b) CIF sequences .................................. 92
Fig. 4.16 Subjective comparison for various QCIF sequences........................................... 94
Fig. 4.17 Subjective comparison for various QCIF sequences .......................................... 95
Fig. 4.18 Subjective comparison for various QCIF sequences........................................... 96
Fig. 4.19 Subjective comparison for various QCIF sequences........................................... 97
Fig. 4.20 CONTAINER frame 1 encoded at 35 Kbps ......................................................... 98
Fig. 4.21 CIF BRIDGE frame 4 encoded at 40 Kbps........................................................... 99
Fig. 4.22 CIF MOTHER DAUGHTER frame 6 encoded at 40 Kbps ............................... 100
Fig. 4.23 CIF HIGHWAY frame 5 encoded at 40 Kbps .................................................... 101
Fig. 4.24 CIF SILENT frame 3 encoded at 40 Kbps ......................................................... 102
Fig. 4.25 CIF IRENE frame 14 encoded at 40 Kbps .......................................................... 103
Fig. 4.26 CIF FOREMAN frame 9 encoded at 35 Kbps .................................................... 104
LIST OF TABLES
Table 2.1 Deblocking filters for various standards............................................................... 10
Table 2.2 Comparison of post- and loop filtering................................................................. 11
Table 2.3 Comparison of deblocking algorithms.................................................................. 25
Table 2.4 SD thresholds of motion vector magnitude .......................................................... 31
Table 3.1 Coding tools supported by baseline, main and extended profile ......................... 35
Table 3.2 Objective comparison of H.264/AVC with MPEG-2 and MPEG-4 at different
bit rates for various QCIF sequences .................................................................................... 42
Table 3.3 Objective comparison of H.264/AVC with H.263 Baseline and H.263+ at
different bit rates for various QCIF sequences ..................................................................... 43
Table 3.4 Average luminance PSNR at different low bit rates for QCIF sequences with-
and without deblocking filter................................................................................................. 54
Table 3.5 Average luminance PSNR at different low bit rates for CIF sequences with-
and without deblocking filter................................................................................................. 55
Table 4.1 MVsum for QCIF video sequences......................................................................... 73
Table 4.2 MVsum for CIF video sequences............................................................................. 74
Table 4.3 Various parameters for experimental environment .............................................. 83
Table 4.4 Average number of operations spent on QCIF sequences using H.264/AVC
deblocking filter and proposed filter ..................................................................................... 84
Table 4.5 Average number of operations spent on CIF sequences using H.264/AVC
deblocking filter and proposed filter ..................................................................................... 85
Table 4.6 Computing complexity analysis of proposed filter with H.264/AVC
deblocking filter ..................................................................................................................... 89
Table 4.7 Average Luminance PSNR at different bit rates for QCIF sequences ................ 90
Table 4.8 Average Luminance PSNR at different bit rates for CIF sequences ................... 91
CHAPTER 1
Introduction
1.1 Background
Imagine that you want to transmit or store TV-quality digital video. Transmission
or storage capacity of 37.32 megabytes is required for one second of video, and
134.37 gigabytes for a one-hour uncompressed (raw) video program. Such data rates
demand enormously high transmission bandwidth and/or storage capacity, beyond the
capabilities of today’s systems. Therefore, compression is needed to deal with
this kind of high-rate data. Moreover, the demand for digital video communication
applications such as video conferencing, video e-mail, network games and other
value-added services has increased considerably. However, transmission rates over
public switched telephone networks (PSTN) and wireless networks are still very
restricted due to bandwidth limitations. Consequently, separate international
video coding standards have been recommended for different applications, such as
H.261 [1-2, 4], MPEG-1 [2-4], MPEG-2 [2, 5-6], H.263 [7], MPEG-4 [8] and H.263+
[9-10]. These standards address a wide range of applications with different
requirements in terms of bit rate, picture quality, error resilience and delay.
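The figures quoted above can be reproduced with a short calculation. The sketch below assumes TV-quality video of 720 × 576 pixels at 30 frames per second with 24-bit colour; these parameters are my inference from the quoted numbers, not values stated explicitly in the text.

```python
# Raw data rate of uncompressed TV-quality video.
# Assumed parameters: 720x576 resolution, 30 fps, 24-bit colour.
WIDTH, HEIGHT = 720, 576
FPS = 30
BYTES_PER_PIXEL = 3  # 24 bits per pixel

bytes_per_second = WIDTH * HEIGHT * FPS * BYTES_PER_PIXEL
megabytes_per_second = bytes_per_second / 1e6       # one second of video
gigabytes_per_hour = bytes_per_second * 3600 / 1e9  # one hour of video

print(f"{megabytes_per_second:.2f} MB/s")   # 37.32 MB for 1 second
print(f"{gigabytes_per_hour:.2f} GB/hour")  # 134.37 GB for 1 hour
```

Both results match the figures in the paragraph, which supports the assumed parameters.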
The latest video coding standard, H.264/AVC, was developed by the Joint Video
Team (JVT), which includes experts from the Moving Picture Experts Group (MPEG)
and the ITU-T Video Coding Experts Group (VCEG). The official title of the new
standard is Advanced Video Coding (AVC); however, it is widely known by its ITU
document number, H.264, or as MPEG-4 Part 10. The final drafting work on the
first version of the standard was completed in May 2003. H.264/AVC supersedes
previous video coding standards in almost every aspect. The salient enhancements
made by this standard are: variable block size motion compensation with small
block sizes, quarter-pixel accurate motion compensation, multiple reference
picture motion compensation, decoupling of referencing order from display order,
weighted prediction, improved skipped and direct motion inference, and loop
deblocking filtering [11]. Its wide range of target applications can be
categorized as: (1) broadcast over cable, satellite, cable modem, DSL,
terrestrial channels, etc.; (2) storage on optical and magnetic devices, DVD,
etc.; (3) conversational services over ISDN, Ethernet, LAN, DSL, wireless and
mobile networks; (4) video-on-demand; and (5) multimedia streaming services over
ISDN, cable modem, DSL, LAN, wireless networks, etc., and multimedia messaging
services over ISDN, DSL, Ethernet, etc. [12].
1.2 Objectives
The H.264/AVC video coding standard, like other standards, uses a block-based
transform coding scheme to exploit spatial redundancy. However, loss of
correlation occurs between adjacent blocks due to coarse quantization at low bit
rates. This produces visually disturbing discontinuities along the block edges,
known as blocking artifacts. H.264/AVC employs a normative adaptive loop
deblocking filter for the reduction of blocking artifacts [13]. The filter is
applied to the reconstructed frame in both the encoder and the decoder. The
filtered frames are used as reference frames for motion compensation of
subsequently coded frames. Performance analysis of the H.264/AVC deblocking
filter shows that it reduces blocking artifacts significantly at low bit rates
[14-17]. However, it is computationally very complex, as it takes one third of
the computing resources of the decoder [18]. The main reason for the high
computational complexity of the filter is the heavy conditional processing at
block-edge and pixel level required for the filtering decision, and to select
one type of filter out of two: normal and strong.
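The kind of conditional processing involved can be illustrated by the boundary-strength (bS) decision. The sketch below is a simplified rendering of the main luma cases only; the function and parameter names are mine, and the standard's actual decision tree additionally handles chroma, slice boundaries and field coding.

```python
def boundary_strength(p_intra, q_intra, on_mb_edge, p_coeffs, q_coeffs,
                      p_mv, q_mv, same_ref):
    """Simplified sketch of H.264/AVC boundary-strength (bS) selection.

    p_*/q_* describe the two 4x4 blocks adjacent to the edge. bS = 4
    selects the strong filter; bS in 1..3 selects the normal filter;
    bS = 0 disables filtering for this edge.
    """
    if p_intra or q_intra:
        return 4 if on_mb_edge else 3   # intra block: strongest filtering
    if p_coeffs or q_coeffs:
        return 2                        # nonzero residual coefficients present
    # Motion differs by at least one integer sample (4 quarter-pel units),
    # or the blocks predict from different reference frames.
    if (not same_ref or abs(p_mv[0] - q_mv[0]) >= 4
            or abs(p_mv[1] - q_mv[1]) >= 4):
        return 1
    return 0

print(boundary_strength(True, False, True, False, False, (0, 0), (0, 0), True))   # 4: intra, MB edge
print(boundary_strength(False, False, False, False, False, (5, 0), (0, 0), True)) # 1: MV gap >= 1 sample
```

Evaluating this cascade for every 4x4 block edge of every macroblock is what makes the boundary-strength stage so expensive in practice.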
Most of the research reported in the literature on reducing the cost of blocking
artifact removal uses efficient architectures and hardware implementations of the
deblocking filter [19-22]; very little work has been reported on algorithmic
optimization of the deblocking algorithm. The main focus of our research is to
reduce the computational complexity of the deblocking algorithm for H.264/AVC
video, so that it can be used effectively for real-time low bit rate applications
such as mobile video.
1.3 Approach
This thesis describes the design and implementation of a reduced-complexity
deblocking filter for low bit rate video coding. The novel idea of incorporating
the motion activity of video sequences into the deblocking filter to reduce its
computational complexity is proposed. It has been found that motion compensation
vectors can be used to detect the motion activity of video sequences. Based on
this criterion, video sequences are categorized into three groups: low motion
activity, moderate motion activity and high motion activity. Thresholds on the
absolute sum of motion vectors have been set to classify video sequences into
these three groups. Using this classification, new modified conditions for
filtering edge pixels are implemented. This results in a significant reduction
in the computational complexity of the deblocking algorithm, since the decision
to select between the two types of filter (strong/normal) takes considerable
computing operations. Experimental simulations conducted in our research show a
significant reduction in computational complexity without loss of subjective
video quality for low to moderate motion video sequences.
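A minimal sketch of the classification step described above follows. The threshold values used in the example are hypothetical placeholders; the actual thresholds are derived experimentally in Chapter 4.

```python
def motion_activity(motion_vectors, low_thr, high_thr):
    """Classify a frame's motion activity from its motion vectors.

    motion_vectors: iterable of (mvx, mvy) pairs for one frame.
    low_thr / high_thr: thresholds on the absolute sum of vector
    components (values here are illustrative only).
    """
    mv_sum = sum(abs(mvx) + abs(mvy) for mvx, mvy in motion_vectors)
    if mv_sum <= low_thr:
        return "low"
    if mv_sum <= high_thr:
        return "moderate"
    return "high"

# Hypothetical thresholds for illustration.
mvs = [(1, 0), (0, 2), (-1, 1)]
print(motion_activity(mvs, low_thr=10, high_thr=100))  # low (absolute sum = 5)
```

Because this classification needs only one pass over motion vectors already produced by motion estimation, it is far cheaper than per-edge boundary-strength evaluation.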
1.4 Thesis Outline
The rest of the thesis is organized as follows:
Chapter 2 provides the literature review for the thesis. First, the mathematical
background for the occurrence of blocking artifacts at low bit rates is
introduced. Second, various schemes for the reduction of blocking artifacts
reported in the literature are discussed. Third, we introduce some important
methods for detecting motion activity in video sequences, with a view to
incorporating them into deblocking filters.
In chapter 3, performance evaluations for low bit rate video coding are carried
out. First, a performance analysis of the H.264/AVC standard against existing
video coding standards for low bit rate video coding is presented. Then, the
effectiveness of the H.264/AVC deblocking filter in reducing blocking artifacts
at low bit rates is evaluated.
In chapter 4, the design and implementation of a new criterion for the deblocking
algorithm for low bit rate video coding is presented. First, the usage of the
strong and normal filters in the original H.264/AVC deblocking filter is
examined. Second, the categorization of various video sequences according to
motion activity is introduced. Third, thresholds on the absolute sum of motion
compensation vectors for low, moderate and high motion sequences are provided.
Fourth, the design and implementation of the proposed deblocking algorithm is
discussed. Fifth, experimental results for computational complexity and the
subjective and objective comparison of the proposed scheme with the original
H.264/AVC deblocking algorithm are given.
Finally, conclusions and future research directions are provided.
CHAPTER 2
Literature Review
This chapter reviews the central methods for the reduction of blocking artifacts.
Section 2.1 describes the theory related to the occurrence of blocking artifacts
at low bit rates, while the categories of deblocking techniques and some central
methods from the literature for blocking artifact reduction are discussed in
section 2.2. As this thesis presents a novel approach of incorporating the motion
activity of video sequences into deblocking filters, section 2.3 elaborates some
central schemes for motion activity detection found in the literature.
2.1 Theory of Blocking Artifacts at Low Bit Rates
The basic approach in block-based discrete cosine transform schemes for image and
video coding is to divide the whole image into blocks, transform each block using
the discrete cosine transform, and then quantize and entropy code the
coefficients [23]. An image is divided into M × N blocks, generally of size
8 × 8. The Discrete Cosine Transform (DCT) of an 8 × 8 block is given by Eq. 2.1.
W(x,y) = \frac{C(x)\,C(y)}{4} \sum_{i=0}^{7} \sum_{j=0}^{7} w(i,j) \cos\frac{(2i+1)x\pi}{16} \cos\frac{(2j+1)y\pi}{16}          (2.1)

where w(i,j) are the 64 samples of the input block, W(x,y) are the 64 DCT
coefficients, and C(x), C(y) are constants as described by Eq. 2.2.
C(x) = \begin{cases} 1/\sqrt{2} & x = 0 \\ 1 & x \neq 0 \end{cases}          (2.2)
After this transform, the DCT coefficients are quantized. The inverse DCT (IDCT)
reconstructs a block of image samples from an array of DCT coefficients. The IDCT
takes as input a block of 8 × 8 DCT coefficients W(x,y) and reconstructs a block
of 8 × 8 image samples w(i,j) by Eq. 2.3.

w(i,j) = \sum_{x=0}^{7} \sum_{y=0}^{7} \frac{C(x)\,C(y)}{4} W(x,y) \cos\frac{(2i+1)x\pi}{16} \cos\frac{(2j+1)y\pi}{16}          (2.3)
In the quantization step, the transformed coefficients are divided by the
corresponding entries of a quantization table and rounded to integers. At low bit
rates, the high-order DCT coefficients are quantized most severely (usually to
zero).
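The transform of Eqs. 2.1-2.3 and the effect of coarse quantization can be demonstrated numerically. The sketch below builds the orthonormal 8 × 8 DCT matrix implied by Eqs. 2.1 and 2.2, verifies that the transform alone is lossless, and shows coarse quantization zeroing most coefficients of a smooth block; the step size of 64 is an arbitrary illustrative choice, not a value taken from any standard.

```python
import numpy as np

def dct_matrix():
    """8x8 DCT basis matrix per Eqs. 2.1-2.2: C(0) = 1/sqrt(2), C(x != 0) = 1."""
    T = np.zeros((8, 8))
    for x in range(8):
        c = 1 / np.sqrt(2) if x == 0 else 1.0
        for i in range(8):
            T[x, i] = 0.5 * c * np.cos((2 * i + 1) * x * np.pi / 16)
    return T

T = dct_matrix()
# A smooth ramp block, typical of natural image content.
block = np.add.outer(np.arange(8.0) * 12, np.arange(8.0) * 6) + 100

coeffs = T @ block @ T.T   # forward DCT, Eq. 2.1
recon = T.T @ coeffs @ T   # inverse DCT, Eq. 2.3

# Coarse quantization: dividing by a large step and rounding drives the
# high-order coefficients to zero, the source of blocking artifacts.
step = 64                  # illustrative quantization step
quantized = np.round(coeffs / step) * step

print(np.allclose(recon, block))   # the DCT itself is lossless
print(np.count_nonzero(quantized), "of", coeffs.size, "coefficients survive")
```

Note that the loss comes entirely from quantization: the orthonormal transform round-trips the block exactly.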
In video coding, motion compensation is another source of propagation of these
blocking artifacts [14]. Motion compensated blocks are generated by copying
interpolated pixel data from various locations in different reference frames.
Since there is never a perfect fit for this data, discontinuities arise at the
edges of the copied blocks. Moreover, during the copying process, existing edge
discontinuities in the reference frames are carried into the interior of the
motion compensated block. Blocking artifacts make decompressed images and video
unacceptable to human viewers at low bit rates and often limit the maximum
compression performance that can be achieved. Fig. 2.1 compares the uncompressed
(raw) and reconstructed frames of CIF MOTHER DAUGHTER, CIF CONTAINER and CIF
FOREMAN, each encoded at 40 Kbps. It is apparent that the reconstructed frames
contain blocking artifacts.
Fig. 2.1 Blocking Artifacts at 40 Kbps (a) Uncompressed (raw) frame 3 of sequence ‘CIF Mother and Daughter’ (b) Reconstructed frame of (a) by H.264/AVC (c) Original frame 2 of sequence “CIF Container” (d) Reconstructed frame of (c) by H.264/AVC (e) Uncompressed (raw) frame 4 of sequence “CIF Foreman” (f) Reconstructed frame of (e) by H.264/AVC
2.2 Methods to Reduce Blocking Artifacts – Deblocking Filters
This section outlines the categories of deblocking filters and reviews some core
algorithms used for suppression of blocking artifacts in existing literature. Section
2.2.1 provides overview of two main types used for reduction of blocking artifacts.
Literature review of some significant methods used for deblocking are explained in
section 2.2.2 while summary of these methods is given in section 2.2.3.
2.2.1 Categories of Techniques for Reducing Blocking Artifacts
There are two types of techniques employed for the reduction of blocking artifacts [14]:
1. Post filtering
2. Loop filtering
In post filtering, as shown in Fig. 2.2, the deblocking filter is applied after
the decoder and utilizes the decoded parameters. It operates on the display
buffer, outside the coding loop. The frame is decoded into the reference frame
buffer and filtered before being passed to the display device. An additional
buffer may be required to implement a post filter.
Fig. 2.2 Post processing for reduction of blocking artifacts
The use of a post filter is optional in most standards, as it is not a normative
part of the standard. In loop filtering, the deblocking filter works within the
coding loop. The filtered frames are used as reference frames for motion
compensation of the following frames. As a result, a standard-conformant decoder
must carry out filtering identical to that of the encoder. Filtering takes place
for each macroblock during the decoding process, and the reference frame buffer
is used to store the filtered output. Fig. 2.3 shows the position of the loop
deblocking filter in the coding loop at the encoder and decoder respectively.
Fig. 2.3 Loop filter for reduction of blocking artifacts (a) encoder (b) decoder
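The structural difference between the two arrangements can be sketched as follows. This is a schematic toy model, not the actual codec structure; `decode_frame` and `deblock` stand in for the real decoding and filtering stages.

```python
def decode_with_post_filter(bitstream, decode_frame, deblock):
    """Post filtering: the reference buffer holds UNFILTERED frames;
    filtering happens only on the way to the display."""
    reference = None
    for frame_data in bitstream:
        reference = decode_frame(frame_data, reference)  # unfiltered reference
        yield deblock(reference)                         # filtered for display only

def decode_with_loop_filter(bitstream, decode_frame, deblock):
    """Loop filtering: the filtered frame goes back into the reference
    buffer, so encoder and decoder must filter identically."""
    reference = None
    for frame_data in bitstream:
        reference = deblock(decode_frame(frame_data, reference))
        yield reference                                  # same frame displayed and referenced

# Toy stand-ins to show the two data flows diverging.
toy_decode = lambda data, ref: (ref or 0) + data
toy_filter = lambda frame: frame * 2
print(list(decode_with_post_filter([1, 1], toy_decode, toy_filter)))  # [2, 4]
print(list(decode_with_loop_filter([1, 1], toy_decode, toy_filter)))  # [2, 6]
```

The diverging outputs illustrate why loop filtering is normative: once filtered frames feed prediction, any decoder that filters differently from the encoder will drift.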
Different video coding standards have proposed deblocking filters for blocking artifact
reduction. Table 2.1 shows the deblocking filters used by various standards [11, 23].
Table 2.1 Deblocking filters for various standards

Standard   Deblocking filter
H.261      Optional in-loop filter
MPEG-1     No filter
MPEG-2     No filter; post-filter processing often used
H.263      No filter; post filter in H.263+
MPEG-4     Optional in-loop filter; post-filter processing suggested
H.264      Mandatory in-loop filter; post-filter processing may also be used
The research reported in the literature on post- and loop filtering for reduction of
blocking artifacts is very diverse. Much attention has been given to post filtering, but
comparatively little work has been reported on loop filtering. Table 2.2
compares the pros and cons of post filtering and loop filtering.
Table 2.2 Comparison of post- and loop filtering

Post filtering:
- Independent of the coding standard
- Implemented without any increase in bit rate or modification of the encoding procedure
- Filtering only at the decoder
- No compatibility issues, as it works outside the coding loop
- Extra buffer required at the decoder

Loop filtering:
- Improved quality of the reconstructed frame results in more accurate motion compensation
- Exactly the same filtering at encoder and decoder
- No extra frame buffer required at the decoder
- Compatibility with the coding standard required
- Difficult to incorporate in commercial coding products built on existing standards
The use of either post filtering or loop filtering has pros and cons on both sides. For
example, in decoder implementations, post filtering offers maximum independence
and no amendment of the video coding standard is needed. However, it requires
an extra buffer at the decoder. On the other hand, applying the deblocking filter within
the coding loop also has advantages [14]. First, loop filtering improves the quality of
the reconstructed frame; the outcome is a better prediction frame and, as a
consequence, more accurate motion-compensated prediction for the next encoded
frame. Second, the quality level of deblocking is guaranteed, since exactly the same
filtering is done at the encoder and decoder, resulting in the expected (predicted)
quality of video at the decoder side. Third, an extra frame buffer is not required at the
decoder, as is the case for post filters. Fourth, empirical results have revealed that loop
filtering improves the objective and subjective quality of video with a major reduction
in decoder complexity in comparison with post filtering [24-25].
2.2.2 Literature Review for Deblocking Filters
Many algorithms have been proposed for reduction of blocking artifacts in block-based
transform coding schemes. Among them are:
1. Projection on Convex Set (POCS) based algorithms
2. Maximum a Posteriori (MAP) technique
3. Constrained Least Square (CLS) deblocking
4. Combined Transform Coding (CTC) scheme
5. AC prediction based deblocking
6. Wavelet based deblocking algorithms
7. Multilayer perceptron (MLP) neural network based deblocking method
8. Deblocking filtering using weighted sum of symmetrically aligned pixels
9. Deblocking using gradient projection method
10. Lapped orthogonal transform (LOT) based deblocking
11. Deblocking using genetic algorithm (GA)
12. Deblocking based on Human Visual System (HVS)
13. Non-linear spatial filters deblocking
14. Adaptive linear spatial filters deblocking
A brief description of the above mentioned approaches follows:
Projection on Convex Set (POCS) based Algorithms
The POCS algorithms [26] use an iterative blocking-artifact reduction technique based
on the theory of projection onto convex sets. A number of constraints on the coded
image are used to restore it towards its original form. For example, one constraint can
be derived from the observation that a blocky image has high frequency components
across the boundaries of neighboring blocks. Such high frequency components are
absent from the original image, so the artifact image is projected towards the original by an
iterative procedure. These iterations are repeated until an artifact-free image is obtained.
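A generic POCS iteration alternates projections between constraint sets. The sketch below uses a simple box blur as a stand-in smoothness projection and a +/-8 quantization-cell clamp as the data constraint; both are illustrative, not the constraint sets of the cited algorithms.

```python
import numpy as np

def smoothness_projection(img):
    """Stand-in smoothness constraint: mild 3x3 box averaging."""
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros_like(img, dtype=np.float64)
    h, w = img.shape
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
    return out / 9.0

def pocs_deblock(decoded, n_iter=5):
    """Alternate projections until the image is (approximately) artifact-free."""
    x = decoded.astype(np.float64)
    for _ in range(n_iter):
        x = smoothness_projection(x)               # project onto the smoothness set
        x = np.clip(x, decoded - 8, decoded + 8)   # stay within the quantization cell
    return x
```

A real POCS deblocker replaces both projections with the directional smoothness sets and DCT-domain quantization constraints described above.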
In the POCS-based algorithm proposed by Yang et al [27], a new family of directional
smoothness constraint sets is described, based on line-process modeling of the image
edge structure. Because the visibility of artifacts in an image is spatially varying, the
authors also adapt the definition of the smoothness sets. The numerical difficulty of
computing the projections onto these sets is overcome by a divide-and-conquer (DAC)
strategy, in which new smoothness sets are derived such that their projections are
easier to compute. The algorithm can remove blocking artifacts from both compressed
images and video. Paek et al [28] assume highly correlated images to reduce blocking
artifacts based on POCS. Since the images are assumed to be highly correlated, the
global frequency characteristics of two adjacent blocks are similar to the local
characteristics of each block. High frequency components that appear in the global
characteristics of a decoded image but not in the local ones are considered to result
from blocking artifacts. An N-point DCT is used to obtain the local characteristics, a
2N-point DCT to obtain the global ones, and the relation between N-point and 2N-point
DCT coefficients is then employed. The undesired high frequency components
caused by blocking artifacts are detected by comparing the N-point with the 2N-point
DCT coefficients. Novel convex sets and their projection operators in the DCT
domain are then proposed, and the authors claim that the method yields significantly better
performance than conventional techniques in terms of objective quality, subjective
quality, and convergence behavior.
Maximum a Posteriori (MAP) Technique
The MAP-based technique is based on a stochastic model of the image data [29]. It
selects the best image from a set of candidate images. The quantization step partitions
the transform coefficient space and maps all points in a partition cell to a
reconstruction point, taken as the centroid of the cell. The technique selects the
reconstruction point within the quantization partition cell which results in the
reconstructed image that best fits a non-Gaussian Markov random field (MRF) image
model. The gradient projection method is used to iteratively update the estimate based
on the image model. In [30], probabilistic models are used both for the degradation
introduced by the coding and for a "good" image. The restored video sequence is the
MAP estimate based on these models. The authors first describe a generic model for
video compression, which also accounts for the effects of motion compensation used
in many video compression techniques. A decompression algorithm is then outlined
based on a previously proposed image model. Experimental results show that the
reconstructed image sequence exhibits a reduction in many of the most noticeable artifacts.
Constrained Least Square (CLS) deblocking
Yang et al [31] describe the reconstruction of images from incomplete block discrete
cosine transform (BDCT) data. Prior knowledge about the smoothness of the original
image is transmitted along with the image data, and the decoder reconstructs the image
by using both. Two methods are proposed in this paper, based on POCS and CLS
respectively. In CLS, the proposed objective function captures the smoothness
properties of the original image. The recovered image is obtained by minimizing an
objective function which is the weighted sum of two functions that impose conflicting
requirements on the recovered image: one penalizes deviation from the available data,
while the other penalizes the undesired effects that arise if an image is reconstructed
only from the available data. In this sense, the second function introduces prior
knowledge that complements the available data or, in other words, constrains the
behavior of the reconstructed image. Iterative algorithms are introduced for its
minimization. With the help of experimental results, the authors claim that blocking
artifacts can be reduced drastically. In another paper, based on adaptive constrained
least squares restoration, André Kaup [32] proposes a numerically simple
post-processing scheme. The spatial adaptation of the post processing to local image
structure preserves the high frequency details of the image. The author claims that the
proposed technique almost completely removes blocking artifacts.
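The CLS idea of minimizing a weighted sum of a data-fidelity term and a smoothness term can be sketched with plain gradient descent. The squared horizontal first difference used as the smoothness functional here is an illustrative choice, not the exact functional of the cited papers.

```python
import numpy as np

def cls_restore(y, lam=0.5, step=0.1, n_iter=50):
    """Minimize ||x - y||^2 + lam * ||horizontal first differences of x||^2.

    The smoothness term (squared horizontal differences) is illustrative;
    the cited papers use their own smoothness functionals and constraints.
    """
    x = y.astype(np.float64).copy()
    for _ in range(n_iter):
        grad = 2.0 * (x - y)            # gradient of the data-fidelity term
        d = np.diff(x, axis=1)          # horizontal first differences
        grad[:, :-1] -= 2.0 * lam * d   # gradient of the smoothness term
        grad[:, 1:] += 2.0 * lam * d
        x -= step * grad
    return x
```

The two gradient terms pull in opposite directions: the first keeps the result near the decoded data, the second smooths away the block discontinuities.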
Combined Transform Coding (CTC) scheme
In the Combined Transform Coding (CTC) scheme [33], the image is divided into two
sets with different correlation properties: the upper image set (UIS) and the lower
image set (LIS). The UIS contains the most significant information and tends to be
highly correlated, whereas the LIS contains the less significant information and carries
less correlation. The UIS is then compressed losslessly without being divided into
blocks, and the LIS is coded by conventional block transform coding. This suppresses
blocking effects in the image, since the correlation in the UIS is removed without
distortion and, as a result, the inter-block correlation is significantly reduced. An
additional advantage of the CTC scheme is the removal of ringing effects.
AC Prediction based Deblocking
Taehwan Shin et al [34] proposed a blocking effect reduction method based on
content-based AC prediction for MPEG-2 video. The algorithm first detects the block
that has caused the blocking artifact. A DC sequence is then generated and the position
of the block is searched in the image content. The AC coefficients are predicted by a
content-based AC prediction algorithm. Simulations performed by the authors show
that the proposed algorithm reduces blocking artifacts effectively. Changick Kim
[35] proposes another AC-prediction-based blocking artifact reduction method. For
each block, its DC value and the DC values of the surrounding eight neighboring
blocks are exploited to predict the low frequency AC coefficients. Each block is
categorized as a low activity or high activity block using these predicted AC
coefficients, and two types of low pass filter are then applied adaptively based on the
categorization of each block. A strong low pass filter is applied in low activity
regions, where blocking artifacts are most noticeable; high activity regions are filtered
by a weak low pass filter. Computer simulations performed by the author show that the
proposed algorithm is effective in reducing blocking artifacts as well as ringing
artifacts. The Hadamard transform is used by K. Veeraswamy et al [36] for AC
coefficient prediction to reduce blocking artifacts. In the proposed method, Hadamard
transform DC values are transmitted, and an AC restoration method is used for image
reconstruction. The proposed method improves the peak signal-to-noise ratio and
reduces blocking effects significantly.
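The core idea, predicting low frequency AC coefficients from the DC values of neighboring blocks, can be sketched as follows. The 1.13885/8 weighting is a commonly quoted quadratic-surface prediction constant; the cited papers use their own content-based variants, and the activity threshold here is illustrative.

```python
import numpy as np

def predict_low_freq_ac(dc):
    """Predict the two lowest AC coefficients of the centre block from the
    3x3 grid of neighboring DC values `dc` (centre block at dc[1, 1]).

    The 1.13885/8 weights follow the classic quadratic-surface fit;
    the cited methods refine this with content-based prediction.
    """
    ac01 = 1.13885 * (dc[1, 0] - dc[1, 2]) / 8.0   # horizontal DC gradient
    ac10 = 1.13885 * (dc[0, 1] - dc[2, 1]) / 8.0   # vertical DC gradient
    return ac01, ac10

def block_activity(dc, thresh=4.0):
    """Classify the centre block as 'low' or 'high' activity (illustrative threshold)."""
    ac01, ac10 = predict_low_freq_ac(dc)
    return "low" if abs(ac01) + abs(ac10) < thresh else "high"
```

A strong low pass filter would then be applied to "low" activity blocks and a weak one to "high" activity blocks, as in Kim's scheme.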
Wavelet Based Deblocking Algorithms
The wavelet-based deblocking algorithm of [37] computes soft threshold values, based
on the difference between the wavelet transform coefficients of image blocks and
those of the entire image, to threshold the high-frequency wavelet coefficients in
different sub-bands using different values and strategies. An adaptive threshold value
is employed for different images and characteristics of the blocking effects. The
filtered image is obtained by thresholding the different sub-bands of a three-level
decomposition. Liew et al [38] proposed a non-iterative wavelet-based deblocking
algorithm. It exploits the fact that block discontinuities are constrained by the DC
quantization interval of the quantization table, as well as the behavior of wavelet
modulus maxima across wavelet scales, to derive appropriate threshold maps at
different wavelet scales. The algorithm can suppress blocking artifacts as well as
ringing artifacts effectively while preserving true edges and textural information.
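The thresholding mechanism can be illustrated with a one-level Haar decomposition, soft-thresholding only the high-frequency sub-bands. The Haar filters and the single fixed threshold are simplifications of the adaptive multi-scale schemes above.

```python
import numpy as np

def haar2d(x):
    """One-level 2-D Haar transform: returns (LL, LH, HL, HH) sub-bands."""
    a = (x[0::2, :] + x[1::2, :]) / 2.0
    d = (x[0::2, :] - x[1::2, :]) / 2.0
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def ihaar2d(ll, lh, hl, hh):
    """Inverse of haar2d."""
    a = np.empty((ll.shape[0], 2 * ll.shape[1]))
    d = np.empty_like(a)
    a[:, 0::2], a[:, 1::2] = ll + lh, ll - lh
    d[:, 0::2], d[:, 1::2] = hl + hh, hl - hh
    x = np.empty((2 * a.shape[0], a.shape[1]))
    x[0::2, :], x[1::2, :] = a + d, a - d
    return x

def soft(c, t):
    """Soft thresholding: shrink coefficients towards zero by t."""
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)

def wavelet_deblock(img, t=2.0):
    """Soft-threshold only the high-frequency sub-bands (threshold illustrative)."""
    ll, lh, hl, hh = haar2d(img.astype(np.float64))
    return ihaar2d(ll, soft(lh, t), soft(hl, t), soft(hh, t))
```

The cited algorithms differ mainly in how the threshold `t` is chosen per sub-band and per scale, rather than in this basic shrink-and-reconstruct structure.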
Multilayer Perceptron (MLP) Neural Network based Deblocking Method
Multilayer perceptron (MLP) neural network deblocking is based on the concept of
adaptive learning by examples. In this scheme [39], relevant information is extracted
from the image and given as input to a neural network, which tries to learn to
reconstruct the original image. On the encoder side, the image is compressed and
decompressed by the image compression algorithm. From the decompressed image,
features representing the occurrence of blocking effects, the numerical artifact
indicators (NAIs), are extracted and given as input to the MLP network. The MLP
tries to produce an output approximating the difference between the original image
and the decompressed image. To train the MLP network, a suitable supervised
learning algorithm is used, with the difference between the original and the
decompressed image as the desired output. After training is complete, the weights of
the MLP network are transmitted or stored together with the compressed image data.
When the compressed data is received at the decoder, decompression and extraction
of the blocking effect features are performed, and the features are given as input to
the MLP network. The output of the MLP network is added to the decompressed
image to form the final decoded image.
Deblocking Filtering using Weighted Sum of Symmetrically Aligned Pixels
In deblocking filtering using the weighted sum of symmetrically aligned pixels [40], a
new class of deblocking algorithms for reduction of blocking artifacts in images and
video is proposed. A symmetrically aligned weighted sum of pixel quartets with
respect to block boundaries is employed for image deblocking. The basic weights are
obtained from a function which obeys predefined constraints. A deblocked image
produced using these weights contains blurred edges near real edges; the authors refer
to these blurred edges as the ghosting phenomenon. To prevent this, in non-monotone
areas the weights of pixels are modified by dividing each pixel's weight by a
predefined factor called a grade. This scheme is referred to as weight adaptation by
grading (WABG). Better deblocking of monotone areas is achieved by performing
three iterations of WABG; a fourth iteration is then performed on the rest of the
image to deblock the detailed blocks. The authors call this deblocking frames of
variable size (DFOVS). The WABG and DFOVS approaches automatically adapt
themselves to different bit rates and, as claimed by the authors, produce very good
results for decompressed images at extremely low to medium bit rates.
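The basic operation, replacing the two boundary pixels by weighted sums over a symmetrically aligned quartet, can be sketched as below. The weight values are illustrative; in [40] they are derived from a constrained 2-D function and then graded (WABG).

```python
import numpy as np

def quartet_smooth(row, b, w=(0.375, 0.25)):
    """Smooth the pixel quartet (p2, p1 | q1, q2) around boundary index b.

    `row[b]` is the first pixel right of the boundary. The weights (self,
    outer-pair) are illustrative stand-ins for the constrained weight
    function of the cited scheme.
    """
    p2, p1, q1, q2 = row[b - 2], row[b - 1], row[b], row[b + 1]
    wb, wn = w  # weight of the mirrored partner vs. the outer pair
    row[b - 1] = (1 - wb - wn) * p1 + wb * q1 + wn * (p2 + q2) / 2.0
    row[b]     = (1 - wb - wn) * q1 + wb * p1 + wn * (p2 + q2) / 2.0
    return row
```

Because each boundary pixel borrows symmetrically from the other side, a pure block step is flattened while a genuine ramp is largely preserved.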
Deblocking using Gradient Projection Method
The gradient projection based method [41] exploits the correlation between the
intensity values of boundary pixels of two neighboring blocks. It is based on the
theoretical and empirical observation that, under mild assumptions, quantization of the
DCT coefficients of two neighboring blocks increases the expected value of the Mean
Squared Difference of Slope (MSDS) between the slope across the two adjacent
blocks and the average of the boundary slopes of each of the two blocks. This increase
in the expected value of the MSDS depends on the width of the quantization intervals
of the transform coefficients. Consequently, among all permitted inverse-quantized
coefficients, the set which reduces the expected value of this MSDS by a suitable
amount is most likely to decrease the blocking artifacts. In order to estimate the set of
unquantized coefficients, a constrained quadratic programming problem is solved, in
which the quantization decision intervals provide upper and lower bound constraints
on the coefficients. With the help of simulations, the authors claim that, from a
subjective viewpoint, the blocking effect is less noticeable in the processed images
than in those produced by existing filtering techniques.
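A simplified form of the MSDS at a vertical block boundary can be computed directly from its definition, comparing the cross-boundary slope with the average of the boundary slopes inside each block. This is a sketch of the quantity, not the constrained quadratic program of [41].

```python
import numpy as np

def msds(left, right):
    """Mean Squared Difference of Slope across a vertical block boundary.

    `left` and `right` are the two adjacent blocks (rows x columns). A
    smooth ramp crossing the boundary gives 0; a block step gives a large
    value, which is exactly what quantization tends to inflate.
    """
    across = right[:, 0].astype(np.float64) - left[:, -1]            # slope over the boundary
    inside = (left[:, -1] - left[:, -2] + right[:, 1] - right[:, 0]) / 2.0
    return float(np.mean((across - inside) ** 2))
```

The deblocking method then searches, within the quantization cells, for coefficients that pull this quantity back down.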
Lapped Orthogonal Transform (LOT) based Deblocking
The lapped orthogonal transform (LOT) [42] can reduce blocking artifacts to very low
levels. It is a transform whose basis functions overlap adjacent blocks. Malvar et al [43]
proposed an optimal LOT that is related to the DCT in such a way that a fast algorithm
for a nearly optimal LOT can be derived. The LOT is distinguished by the fact that each
block of size N is mapped into a set of N basis functions, each longer than N
samples. As coding noise is mainly a function of the quantization process, it is
virtually unaffected by the LOT, and blocking effects are reduced to a level where they
can hardly be detected by the human eye. However, the LOT requires about 20-30 percent
more computation, mostly additions, in comparison with the DCT [43]. In later
research by Malvar [44], the lapped biorthogonal transform (LBT) and hierarchical
lapped biorthogonal transform (HLBT) are used for image coding. The HLBT has a
significantly lower computational complexity than the lapped orthogonal transform
(LOT) and almost no blocking artifacts in comparison with the DCT. Experimental
results by the author show better performance of the LBT and HLBT, with fewer
ringing artifacts.
Deblocking using Genetic Algorithm (GA)
Chih-Chin et al [45] proposed a hybrid approach using an L-filter (a modified linear
finite impulse response (FIR) filter, or a generalization of the median filter) and a
genetic algorithm (GA) to reduce blocking artifacts. The authors treat blocking
artifact removal as a de-noising problem, since the blocking artifacts can be regarded
as the superposition of an image and quantization noise. An L-filter is an order
statistic filter that combines order information of the observation data and applies a
linear operation to the ranked data. An L-filter can remove different types of
noise if its parameters are properly chosen [46]. The search for proper L-filter
parameters is carried out with genetic algorithms (GAs). GAs are well known for
their ability to perform parallel search in complex solution spaces and have the
following advantages over traditional search methods: (i) GAs work directly with a
coding of the parameter set; (ii) the search is carried out from a population of points;
(iii) payoff information is used instead of derivatives or auxiliary knowledge; and (iv)
probabilistic transition rules are used instead of deterministic ones [47]. In the method
proposed by Chih-Chin et al, the L-filter is used for reduction of blocking artifacts and
the GA is used to search for the proper parameters of the L-filter. The approach works
as follows: at the sender side, a reconstructed image is obtained by taking the
inverse transform of the transmitted transform data; the proper L-filter parameters are
then found by running a GA between the original and reconstructed images; finally,
the L-filter parameters are transmitted to the receiver side for removing the blocking
artifacts. The authors claim, with experimental results, that the proposed approach
is a practicable technique for reducing blocking artifacts in block-based
compressed images.
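The L-filter itself is simple to state: sort the observation window and take a weighted sum of the order statistics. The weight vector is what the GA searches for; the weight vectors below are only illustrative.

```python
import numpy as np

def l_filter(window, weights):
    """Order-statistic L-filter: sort the window, then take a weighted sum.

    In the GA-based scheme the `weights` vector is the chromosome being
    optimized against the original image; here it is just a parameter.
    """
    return float(np.dot(np.sort(np.asarray(window, dtype=np.float64)), weights))

# A one-hot weight on the middle order statistic degenerates to the median filter,
# while uniform weights give the mean filter.
median_weights = np.array([0.0, 0.0, 1.0, 0.0, 0.0])
print(l_filter([9, 1, 5, 3, 7], median_weights))   # 5.0
```

This flexibility is why a properly tuned L-filter can behave anywhere between a median filter (edge preserving) and a mean filter (strongly smoothing).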
Deblocking based on Human Visual System (HVS)
B. Macq et al [48] proposed a criterion based on a visual model for reduction of
blocking artifacts. The aim is to decompose the corrupted image into perceptual
channels and to cancel the channels where the noise is above the visibility threshold;
the image is then reconstructed using only the channels where the estimated noise is
below the visibility threshold. More specifically, the noisy picture is first split into
several perceptual channels by means of filters tuned to specific spatial frequencies
and orientations. Each resulting filtered picture is then weighted by a masking
function in order to cancel the visible noise. The masking is a function of the
perceptual component contrast of the original picture, and the contrast is derived from
the difference between the noisy picture and the noise estimate. The addition of the
masked pictures finally yields the restored picture. Tao Chen et al [49] proposed an
approach that works in the transform domain for reduction of quantization noise. An
adaptive weighting mechanism is integrated by considering the masking effect of the
human visual system. The approach makes use of the transform coefficients of
shifted blocks, rather than those of the neighboring blocks, in order to obtain a close
correlation between the DCT coefficients at the same frequency. The filtering is
operated in a location-variant manner based on the local activity of blocks, to achieve
artifact reduction and detail preservation simultaneously. More exactly, an adaptively
weighted low-pass filtering technique is applied to image blocks of different
activities, which represent their inherent masking abilities for artifacts. Human visual
system sensitivity at different frequencies is used to characterize the block activity.
Blocking artifacts are more noticeable in low-activity blocks, so post-filtering of the
transform coefficients is applied within a large neighborhood to smooth out the
artifacts. For high activity blocks, a small window and a large central weight are used
to preserve image details, since the eye has difficulty discerning small intensity
variations in portions of an image where strong edges and other abrupt intensity
changes occur. Finally, the quantization constraint is also applied to the filtered DCT
coefficients prior to reconstruction of the image from the coefficients. Another
approach for reduction of blocking artifacts based on the masking effect of the human
visual system is proposed by Shen-Chuan Tai et al [50]. The scheme is based on
three separate modes that classify the local characteristics of images. Region
classification with respect to activity across the block boundary is performed before
one of the three modes of the deblocking filter is applied. The regions are classified as
smooth, complex, and intermediate. Flat areas around a block boundary are strongly
filtered, whereas a weak filter is applied in areas of high spatial or temporal activity. An
intermediate mode is used to avoid either excessive blurring or inadequate removal of
the blocking effect.
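The three-mode classification can be sketched as follows; the mean-absolute-difference activity measure and the thresholds are illustrative stand-ins, not the exact criteria of the cited scheme.

```python
import numpy as np

def classify_boundary(p, q, t_smooth=2.0, t_complex=8.0):
    """Classify the region around a block boundary into one of three modes.

    `p` and `q` are pixel runs on either side of the boundary; the activity
    measure and thresholds are illustrative assumptions.
    """
    activity = np.abs(np.diff(np.concatenate([p, q]))).mean()
    if activity < t_smooth:
        return "smooth"        # apply strong filtering
    if activity > t_complex:
        return "complex"       # apply weak filtering to preserve detail
    return "intermediate"      # intermediate mode
```

The returned mode then selects a strong, weak, or intermediate deblocking filter, trading blur against residual blockiness as described above.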
Deblocking using Non-linear Spatial Filters
An algorithm based on non-linear smoothing of pixels for deblocking is proposed by
Jim Chou et al [51]. The deblocking is performed in two steps. In step 1, the
difference between actual image edges and the artificial discontinuities produced by
quantization noise at block boundaries is taken into account; a probabilistic framework
is used to derive estimates for the reconstructed DCT coefficients and for the
quantization error of each image coefficient. In step 2, blockiness is removed by
reducing the discontinuities at block edges. The principle used is to reduce the
discontinuities of artificial edges at block boundaries to a level that is imperceptible to
the eye. First, the discontinuities are computed by differencing the pixels across each
block boundary; the authors then attempt to reduce these discontinuities below the
visibility threshold. Experimental results show significant improvement in the visual
quality of images. Gaetano Scognamiglio et al [52] proposed a technique based on
unsharp masking (UM) for noise smoothing and edge enhancement. The authors used
the approach described in [53] with additional new features. An important new feature
of this technique is that it takes into account the amount of coding artifacts and the
fact that blocking artifacts can be located at any position in the video sequence; the
method does not need any information about the position and size of blocks. In
another approach, by Kee-Koo Kwon et al [54], an adaptive post-processing algorithm
using block boundary classification and a simple adaptive filter (SAF) is proposed.
The deblocking method can be described as follows. First, each block boundary is
classified into smooth or complex sub-regions. For smooth-smooth sub-regions with
blocking artifacts, a non-linear 1-D 8-tap filter is applied, while a non-linear 1-D
variant filter is applied to smooth-complex and complex-smooth regions for
suppression of artifacts. For complex-complex sub-regions, a non-linear 1-D 2-tap
filter is applied only to adjust the two block boundary pixels, so as to preserve image
details. The authors' experimental simulations show that the proposed algorithm
produces better results than conventional algorithms, both subjectively and
objectively.
Adaptive Linear Spatial Filters Deblocking
A deblocking algorithm that adaptively uses spatial frequency and temporal
information extracted from the compressed data is proposed by Hyun Wook Park et al
[55]. The authors investigated the distribution of the inverse-quantized coefficients
and the motion vectors to extract semaphores of the blocking artifacts and
ringing noise in each 8 x 8 block. For reduction of blocking artifacts, a 1-D low pass
filter (LPF) and a 2-D signal adaptive filter are applied adaptively to every 8 x 8 block
using the blocking and ringing semaphores. Computer simulations performed by the
authors on several images show the proposed method's better performance over the
MPEG-4 VM (verification model). Yonghun Kim et al [56] proposed a deblocking
algorithm for reduction of ringing and blocking artifacts. One of its important features
is that blocking artifacts are removed at the decoder without blurring the edge regions.
The authors argue that low pass filters used to reduce the blocking and ringing
artifacts are applied strongly, but excessive smoothing also removes important
high-frequency content. They propose a separable low pass filter with a
Gaussian-shaped impulse response for reduction of artifacts. An adaptive algorithm based on the
filtering of block boundaries for reduction of blocking artifacts in compressed images
and video is proposed by Nam Ik Cho et al [57]. The authors model a scan line of an
image with blocking artifacts as a continuous-time step function, and the output as the
continuous-time step response of a low pass filter. The continuous-time response can
be found from the given boundary difference and the duration of the blocky region
without computation; deblocking is then just sampling of this output at the pixel
locations. As a consequence, an appropriately filtered output is obtained without actual
filtering. Simulations carried out by the authors show that the objective and subjective
quality of the proposed method is comparable to other conventional algorithms and
POCS.
2.2.3 Summary of Salient Techniques used in Deblocking Filters
A summary of the central schemes used in the literature for reduction of blocking
artifacts is given in Table 2.3.
Table 2.3 (a) Comparison of deblocking algorithms

POCS based algorithms:
- Based on the theory of projection onto convex sets
- Iterative blocking-artifact reduction technique
- May take a number of iterations to converge
- High computing complexity for real time video

MAP based deblocking:
- Based on a stochastic model of the image data
- Selects an image from a set of candidate images
- Iterative technique
- High computing complexity

CLS deblocking:
- Uses smoothness properties of the original image
- Iterative technique
- High computing complexity
- Not suitable for real time video

CTC scheme:
- Divides the image into two sets that represent different correlation properties
- Uses these two sets for reduction of blocking
- Moderate computing complexity
- Not suitable for real time video

AC prediction:
- DC sequence is generated from image content
- AC coefficients are predicted from DC values
- Used for deblocking of video as well as images

Wavelet based deblocking:
- Non-iterative deblocking
- Uses statistical characteristics of block discontinuities and the behavior of wavelet coefficients for different image features
- Primarily used for image deblocking

MLP neural network (NN):
- Relevant information is extracted from the image and given as input to a neural network
- The NN tries to learn to reconstruct the original image
- Deblocking achieved by adding the NN's output to the compressed image

Weighted sum of symmetrically aligned pixels:
- Applies weighted sums to pixel quartets
- Weights obtained by a 2-D function using predefined constraints
- Suitable for images at very low bit rates
- Higher computing complexity for video
Table 2.3 (b) Comparison of deblocking algorithms

Gradient projection method:
- Based on theoretical and empirical observations
- Exploits the correlation between intensity values of boundary pixels of neighboring blocks
- Suitable for still images

LOT based method:
- Coding noise is virtually unaffected by the LOT, as it is mainly due to the quantization process
- Reduces blocking effects to a level that can barely be detected by the human eye
- 20-30% more computation in comparison with the DCT

Filtering using GA:
- L-filter used to remove blocking noise by choosing proper parameters
- Proper parameters found by running a GA between the original and reconstructed images

HVS deblocking:
- A change in an image is not perceived by humans if the contrast value is below the visibility threshold (the masking effect)
- Considering the masking effect, adaptive deblocking is applied
- Can be used for images as well as video

Non-linear spatial filters:
- Filtering based on non-linear operations
- May blur the images
- At low bit rates, block discontinuities cannot be completely eliminated
- Used for deblocking of images as well as video

Linear spatial filters:
- No additional information to be transmitted and no additional operation on the encoder side
- Low computing complexity
- May blur the images at low bit rates
- Used for deblocking of images as well as video
Most of the methods described above are used for removal of blocking artifacts in
still images and are too computationally complex for low bit rate applications such as
real time mobile video on portable devices and video conferencing over
low-bandwidth Internet connections. Furthermore, it has been found that deblocking
using AC prediction, HVS-based deblocking, and blocking effect reduction using
non-linear and linear spatial filtering can be used for deblocking of video sequences;
however, they may cause blurring. The latest H.264/AVC video coding standard
incorporates an adaptive spatial filter for reduction of blocking artifacts.
2.3 Motion Activity Detection Metrics – A Deblocking
Filters Perspective
As this thesis proposes a novel approach of incorporating the motion activity of video
sequences into deblocking filters, it is worthwhile to review some central techniques
used in the literature for detection of motion activity. This section first reviews some
methods for detection of motion activity, and then presents an analysis of these
methods with respect to computational complexity.
2.3.1 Review of Significant Motion Activity Detection Approaches
A person watching a video sequence perceives various types of motion in it: slow,
moderate, and fast. Among the standard test sequences, examples of slow motion
sequences are CONTAINER, AKIYO, CLAIRE, and GRAND MA; examples of
moderate motion sequences are SALESMAN, MOTHER-DAUGHTER, NEWS, and
FOREMAN; while SOCCER and FOOTBALL belong to the high motion sequences.
There are various methods for detection of motion activity in video sequences.
Among them are:
1. Motion intensity histogram
2. SAD based descriptors
3. Motion vectors based descriptors
The above mentioned methods are briefly described, in the light of the literature, as
follows:
Motion Intensity Histogram
The histogram of the motion intensity can be used to characterize the temporal motion
intensity distribution of a video sequence [58]. The histogram can be scaled to multiple
video levels, as it does not depend on the video segment size. In order to compute
motion intensity histograms, the motion intensity must first be quantized into levels;
vector quantization methods can then be used to transform the quantized intensity
levels. The scene intensity can be described as very low, low, medium, high, or very
high. For a given video unit, the motion intensity histogram (MIH) can be defined as
in Eq. 2.4 [59].
MIH = {p_0, p_1, p_2, p_3, ..., p_{N_i}}        (2.4)
where p_i is the percentage of quantized motion intensity corresponding to the i-th
quantization level. The experiments performed by the authors show that the MIH
captures the human perception of motion quite well. There are some other simple
metrics that can be used to detect the motion activity of video sequences. Among them
are the difference of histograms (DH), the histogram of difference (HD), and the block
histogram difference (BHD), as described in Eq. 2.5 through Eq. 2.7 respectively [60].
DH(i,j) = \frac{1}{D_f} \sum_{k=0}^{L} \left| h_i(k) - h_j(k) \right|        (2.5)

HD(i,j) = \frac{1}{D_f} \left[ \sum_{k=-L/2}^{-\alpha} h_{i-j}(k) + \sum_{k=\alpha}^{L/2} h_{i-j}(k) \right]        (2.6)

BHD(i,j) = \sum_{b=0}^{D_f/D_B} \sum_{k=0}^{L} \left| h_i(b,k) - h_j(b,k) \right|        (2.7)

where i and j are frame indices, h is the histogram operator with L levels, D_B
and D_f are the block and frame sizes respectively, h_{i-j} is the histogram of
the difference of frames i and j, h_i(b,k) is the histogram of block b in frame
i, and \alpha is the threshold that represents closeness to the origin. The
authors [60] found that DH and HD work at the frame level and detect changes at
the global level. HD is fairly effective for high motion sequences since, in
high motion, significant changes occur between frames and more pixels are
distributed away from the origin. The BHD metric is more sensitive to local
motion.
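As an illustration, the three metrics can be sketched in a few lines of Python. This is our own illustration, not code from [60]; the list-of-rows frame format, the function names, and the pixel-counting form of HD are assumptions.

```python
# Illustrative sketch of the DH, HD, and BHD activity metrics.
# Frames are lists of pixel rows with values in [0, levels).

def histogram(pixels, levels):
    h = [0] * levels
    for p in pixels:
        h[p] += 1
    return h

def dh(frame_i, frame_j, levels):
    """Difference of histograms (Eq. 2.5), normalized by frame size."""
    pi = [p for row in frame_i for p in row]
    pj = [p for row in frame_j for p in row]
    hi, hj = histogram(pi, levels), histogram(pj, levels)
    return sum(abs(a - b) for a, b in zip(hi, hj)) / len(pi)

def hd(frame_i, frame_j, alpha):
    """Histogram of difference (Eq. 2.6): fraction of the difference-image
    histogram mass farther than alpha from the origin."""
    diffs = [p - q for ri, rj in zip(frame_i, frame_j)
                   for p, q in zip(ri, rj)]
    return sum(1 for d in diffs if abs(d) >= alpha) / len(diffs)

def bhd(frame_i, frame_j, levels, block):
    """Block histogram difference (Eq. 2.7): histogram distance per block."""
    h, w, total = len(frame_i), len(frame_i[0]), 0
    for by in range(0, h, block):
        for bx in range(0, w, block):
            bi = [frame_i[y][x] for y in range(by, by + block)
                                for x in range(bx, bx + block)]
            bj = [frame_j[y][x] for y in range(by, by + block)
                                for x in range(bx, bx + block)]
            total += sum(abs(a - b) for a, b in
                         zip(histogram(bi, levels), histogram(bj, levels)))
    return total
```

As the text notes, DH and HD react to global changes, while BHD, being block-wise, is sensitive to local motion even when the global histograms match.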
SAD based Descriptors
Hu Weiwei et al [61] describe that the motion activity of video sequences can also
be detected using the sum of absolute differences (SAD). Through experiments, the
authors found that a video sequence with strong motion activity has a higher SAD
value than a slow motion sequence, which has a lower SAD value. They classified
video sequences into three categories based on their motion content, as follows:
Class A sequences – slow motion activity
Class B sequences – median motion activity
Class C sequences – strong motion activity
After computing SAD values of different video sequences, the authors proposed two
thresholds: T1 = 1300 and T2 = 4000. If SAD < T1, the current macroblock is
classified as slow motion activity (class A sequence). If T1 < SAD < T2, the
current macroblock is classified as median motion activity (class B sequence),
while a macroblock with SAD > T2 is treated as strong motion activity (class C
sequence). The authors show experimentally that their method can successfully
detect the motion activity of video sequences.
Motion Vectors based Descriptors
Kader A. Peker et al [62] provide an in-depth overview of motion activity
descriptors based on motion vectors. Motion vectors are readily available because
they are computed during motion estimation, and they are easily extracted from
compressed video. Though compressed-domain motion vectors are not accurate enough
for object motion analysis, they are adequate for measuring the gross motion in
video. There are a number of low-complexity statistical descriptors of the
overall motion activity in a frame. Among them are the average of the motion
vector magnitudes and the variance of the motion vector magnitudes. These
descriptors can be computed for a frame by Eq. 2.8 and Eq. 2.9 respectively [63].
MV_{avg} = \frac{1}{N} \sum_{i=1}^{N} |MV_i|        (2.8)

MV_{var} = \frac{1}{N} \sum_{i=1}^{N} \left( |MV_i| - \frac{1}{N} \sum_{j=1}^{N} |MV_j| \right)^2        (2.9)

where MV_i are the motion vectors in the frame (i = 1 ... N) and N is the number
of motion vectors in the frame.
The higher the average motion vector magnitude, the higher the motion activity,
and vice versa. The variance of the motion vector magnitudes measures motion
activity through its non-uniformity [64]. In another approach, Sylvie Jeannin et
al [65] use the quantized standard deviation of the motion vector magnitude to
detect the motion activity of video sequences. The authors first constructed a
ground truth data set consisting of 637 video segments from the MPEG-7 test set.
They then had human subjects classify the video segments into five motion
classes: (1) very low intensity; (2) low intensity; (3) medium intensity; (4)
high intensity; (5) very high intensity, and averaged the ratings among subjects
to classify each segment. The authors found that the standard deviation of the
motion vector magnitude approximates the ground truth slightly better than the
average of the motion vector magnitude. They computed the standard deviation
(SD) thresholds described in Table 2.4.
Table 2.4 SD thresholds of motion vector magnitude [65]

Motion Intensity    Range of σ
Very low            0 ≤ σ < 3.9
Low                 3.9 ≤ σ < 10.7
Medium              10.7 ≤ σ < 17.1
High                17.1 ≤ σ < 32
Very high           32 ≤ σ
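The Table 2.4 thresholds give a simple five-class quantizer of the standard deviation. The function name and class labels below are our own sketch of that rule.

```python
# Sketch of motion intensity classification by the standard deviation of
# the motion vector magnitudes, using the Table 2.4 thresholds of [65].
import math

def motion_intensity(vectors):
    mags = [math.hypot(dx, dy) for dx, dy in vectors]
    mean = sum(mags) / len(mags)
    sigma = math.sqrt(sum((m - mean) ** 2 for m in mags) / len(mags))
    for label, upper in [("very low", 3.9), ("low", 10.7),
                         ("medium", 17.1), ("high", 32.0)]:
        if sigma < upper:
            return label
    return "very high"
```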
Experimental simulations performed by the authors show that video sequences can
successfully be categorized into different motion intensity levels using the
standard deviation of the motion vector magnitude. Dong Tian et al used the
modulus of the motion vector to describe motion activity. The authors define
∆x_k(i,j), ∆y_k(i,j) as the motion vector of image block (i,j) in frame k. The
motion activity of an image block is given by Eq. 2.10 [66].

MA_k(i,j) = 1 + \frac{1}{MA_{max}} \sqrt{\Delta x_k(i,j)^2 + \Delta y_k(i,j)^2}        (2.10)

where MA_{max} is the maximum value of the motion vector modulus. The motion
activity of a whole frame, MA_k, is the mean value of MA_k(i,j) over all i and
j. Denoting the number of image blocks in the frame as N_{MB}, the motion
activity of the whole frame is given by Eq. 2.11 [66].

MA_k = \frac{1}{N_{MB}} \sum_{i} \sum_{j} MA_k(i,j)        (2.11)

Experimental analysis performed by the authors depicts that the proposed method
can detect motion activity successfully.
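The block measure of Eq. 2.10 and its frame-level mean of Eq. 2.11 can be sketched as follows. This is our own illustration; it assumes the modulus is the Euclidean magnitude of the motion vector, and the motion-field format and helper names are not from [66].

```python
# Sketch of the block (Eq. 2.10) and frame (Eq. 2.11) motion activity
# measures based on the motion vector modulus.
import math

def block_activity(dx, dy, ma_max):
    """Eq. 2.10: 1 plus the modulus of the block's motion vector,
    normalized by the maximum modulus MA_max."""
    return 1.0 + math.hypot(dx, dy) / ma_max

def frame_activity(motion_field, ma_max):
    """Eq. 2.11: mean block activity over all N_MB blocks of the frame.
    motion_field is a 2-D grid of (dx, dy) pairs."""
    blocks = [block_activity(dx, dy, ma_max)
              for row in motion_field for dx, dy in row]
    return sum(blocks) / len(blocks)
```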
2.3.2 Analysis of Motion Activity Detection Techniques
The motion intensity histogram technique and SAD based descriptors of motion
activity are computationally complex in comparison with motion vector based
methods; SAD based detection of motion activity is even more complex than the
histogram based techniques. In real time low bit rate video applications such as
video conferencing over the internet using PSTN, mobile video, and video
telephony, computational complexity is an important factor: these applications
need video to be transmitted in real time, and computationally complex systems
take more time. Because motion vectors are readily available from the motion
estimation process of video coding, they do not need to be computed separately
and can be extracted with minimal decoding. This availability of motion vectors
results in lower computational complexity compared with the other methods, which
makes motion vector based detection of motion activity suitable for low bit rate
real time video applications.
CHAPTER 3
Case Studies – Analysis with respect to Low Bit
Rate Video Coding
This chapter presents two case studies of performance evaluation for low bit
rate video coding. Section 3.1 describes the performance evaluation of the latest
H.264/AVC standard against existing standards (baseline H.263, H.263+, MPEG-2,
and MPEG-4) for low bit rate video coding, while the performance analysis of the
H.264/AVC deblocking filter for reduction of blocking artifacts is explained in
section 3.2.
3.1 Performance Analysis of H.264/AVC Standard*
The major aim behind the emerging H.264/AVC standard was to develop an advanced
video coding standard for generic audiovisual devices and to provide a means to
attain significantly higher video quality than the existing video coding
standards. The first version of the standard was finalized in May 2003. The H.264/AVC
video coding standard has the following salient features in comparison to existing
standards [67]:
Enhanced motion estimation with variable block size
Integer block transform
Improved loop deblocking filter
Enhanced entropy encoding
* This independent evaluation was carried out in 2004 and cited 4 times in literature till 2008 [89-92].
The H.264/AVC is designed to cover wide range of applications [13]. Some of the
important applications are as follows:
Cable TV on optical networks, copper
Direct broadcast satellite video services
Digital subscriber line video services
Digital terrestrial television broadcasting
Interactive storage media, e.g., optical disks
Multimedia mailing
Multimedia services over packet networks
Real-time conversational services, e.g., videoconferencing, videophone
Remote video surveillance
Serial storage media, e.g., digital VTR
Our research presents performance comparison of H.264/AVC with MPEG-2, MPEG-
4 ASP, H.263 baseline and H.263+ standards for low bit rate video coding.
3.1.1 H.264/AVC Profiles and Levels
The capabilities and limitations required to decode a bit-stream are specified by
the profiles and levels of the standard [68]. Each profile specifies a subset of
algorithmic features and limits that must be supported by all decoders
conforming to that profile. H.264/AVC defines the following three profiles:
Baseline profile
Main profile
Extended profile
The Baseline profile supports all features in H.264/AVC except the following two
feature sets:
Set 1: B slices, CABAC, weighted prediction, field coding, and macroblock
adaptive switching between frame and field coding.
Set 2: SP and SI slices.
The first set of features is supported by the Main profile. On the other hand,
the Flexible Macroblock Ordering (FMO) feature supported by the Baseline profile
is not supported by the Main profile. The Extended profile supports both sets of
features over the Baseline profile, excluding macroblock adaptive switching
between frame and field coding and CABAC. Table 3.1 shows the coding tools
supported by each profile.
Table 3.1 Coding tools supported by baseline, main and extended profile

Baseline                     Main                  Extended
I slices                     I slices              I slices
P slices                     P slices              P slices
CAVLC                        B slices              SP and SI slices
Slice groups                 CABAC                 CAVLC
Redundant slices             Field coding          Data partitioning
Arbitrary slice order (ASO)  Weighted prediction   Slice groups & redundant slices
On the other hand, each level specifies a set of limits on parameters such as
sample processing rate, picture size, coded bit rate, and memory requirements.
The H.264/AVC design is divided into two layers, i.e., the Network Abstraction
Layer (NAL) and the Video Coding Layer (VCL). The NAL formats the VCL data into a
form appropriate for transmission by a variety of transport layers, whereas
compression of the video is performed by the VCL. The video coding layer of
H.264/AVC is similar in spirit to other existing video coding standards: a
hybrid of spatial and temporal prediction with transform coding constitutes the
VCL. Fig. 3.1 shows a block diagram of the H.264/AVC video coding layer for a
macroblock. The picture is split into blocks.
Fig. 3.1 JVT H.264/AVC encoder
3.1.2 Main Blocks of H.264/AVC
The H.264/AVC encoder can be divided into the following blocks:
Intra Frame Prediction and Compensation
Inter Motion Compensation
Transform
Quantization
Entropy Coding
Loop Filter
The working of the individual encoder blocks is briefly explained in the
following sections [12].
Intra Frame Prediction and Compensation
In intra frame coding, intra macroblocks are predicted from other macroblocks in
the same frame. As a result, encoded I-pictures are large, since a large amount
of information is present in the frame and no temporal information is used in
the encoding process. H.264/AVC employs the spatial correlation between adjacent
macroblocks to increase encoding efficiency: the difference between the actual
macroblock and the predicted macroblock is coded, which requires fewer bits. The
difference is then transmitted to the decoder, together with the information
required for the decoder to repeat the prediction process (motion vectors,
prediction mode, etc.).
Inter Motion Compensation
Inter frames are predicted from previously encoded video frames or fields using
block based motion compensation to exploit the temporal redundancies that exist
between successive frames. Important differences from earlier standards are the
inclusion of SP-pictures, which enable efficient switching between bit streams
with similar content encoded at different bit rates, as well as random access
and fast playback modes. At the block level, support for block sizes from
16 x 16 down to 4 x 4 is added, enabling motion compensation for each 16 x 16
macroblock to be performed using a number of different block sizes. The
availability of smaller blocks improves prediction in general, and the ability
to handle fine motion detail in particular, which results in better video
quality by reducing large blocking artifacts. The prediction capability of
motion compensation is further improved by quarter-pixel motion compensation.
Transform
The information contained in a prediction error block resulting from either
inter prediction or intra prediction is then transformed. H.264/AVC employs
three transforms depending on the type of data to be coded: a Hadamard transform
for the 4 x 4 array of luminance DC coefficients in intra macroblocks predicted
in 16 x 16 mode, a Hadamard transform for the 2 x 2 array of chrominance DC
coefficients (in any macroblock), and an integer DCT transform for all other
4 x 4 blocks. The DCT transform is a pure integer spatial transform without
rounding error, which eliminates any mismatch between the encoder and decoder in
the inverse transform. The small block size helps reduce blocking and ringing
artifacts, which improves picture quality.
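The 4 x 4 integer transform computes Y = C X C^T with the well-known H.264/AVC core matrix C; the sketch below (our own illustration) omits the post-scaling, which the standard folds into quantization.

```python
# Sketch of the 4 x 4 integer forward transform used for residual blocks
# in H.264/AVC: Y = C X C^T (scaling folded into quantization, omitted).

C = [[1,  1,  1,  1],
     [2,  1, -1, -2],
     [1, -1, -1,  1],
     [1, -2,  2, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def forward_transform(x):
    """Integer-only 4 x 4 forward transform of a residual block x."""
    return matmul(matmul(C, x), transpose(C))
```

Because C contains only the integers ±1 and ±2, the transform needs no multiplications beyond shifts and adds, which is the source of the exact encoder/decoder match mentioned above.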
Quantization
After applying the transform, the resulting data is quantized. This process
significantly compresses the data. H.264/AVC uses scalar quantization with a
total of 52 quantization steps. The wide range of quantization steps makes it
possible for the encoder to control the trade-off between bit rate and quality
accurately and flexibly.
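The 52 steps grow geometrically: the step size doubles for every increase of 6 in QP. A sketch of this relation, assuming the standard's base step sizes for QP = 0..5:

```python
# Sketch of the H.264/AVC quantization step size: Qstep doubles for
# every increase of 6 in QP over the base values for QP = 0..5.

QSTEP_BASE = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]

def qstep(qp):
    """Quantization step size for QP in 0..51."""
    assert 0 <= qp <= 51
    return QSTEP_BASE[qp % 6] * (2 ** (qp // 6))
```

This geometric spacing is what lets the encoder trade bit rate for quality in roughly constant perceptual increments per QP step.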
Entropy Coding
The last step in the encoding process is entropy coding, which assigns shorter
codes to symbols with higher probabilities of occurrence and longer codes to
symbols with lower probabilities. Two types of entropy coding methods are
specified: Variable-Length Coding (VLC) and Context-Based Adaptive Binary
Arithmetic Coding (CABAC). In the VLC method, H.264/AVC uses a single universal
VLC (UVLC) table for entropy coding of all data except transform coefficients,
whereas transform coefficients are coded using Context Adaptive VLC (CAVLC).
CABAC employs an adaptive probability model that tracks the changing statistics
of a video frame and provides estimates of the conditional probabilities of the
coding symbols.
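The single UVLC table is built from Exp-Golomb codes. A sketch of the unsigned Exp-Golomb encoding that underlies it (our own illustration):

```python
# Sketch of unsigned Exp-Golomb encoding, the code family behind the
# universal VLC (UVLC) table: code_num v is sent as M zero bits, a one
# bit, and then the remaining bits of the binary form of v + 1.

def exp_golomb(v):
    """Return the Exp-Golomb codeword for unsigned integer v as a bit string."""
    assert v >= 0
    info = bin(v + 1)[2:]           # binary representation of v + 1
    prefix = "0" * (len(info) - 1)  # M leading zeros signal the length
    return prefix + info
```

Small values get short codes (0 maps to a single bit), matching the entropy coding principle stated above.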
Loop Filter
H.264/AVC employs an adaptive loop filter that reduces block distortion. It
operates on the horizontal and vertical block edges of each decoded macroblock
after the inverse transform to remove artifacts caused by block prediction
errors. It is also possible for the encoder to alter the filtering strength or
to disable the filter. The filtering is applied to the vertical and horizontal
edges of the 4 x 4 blocks in a macroblock.
3.1.3 Test Environment and Simulation Results
The standard is analyzed with respect to low bit rate video coding; each
sequence is coded at a bit rate below 160 Kbps. The set of sequences represents
a range of typical video content from low and high latency applications. The
QCIF sequences used are FOREMAN, NEWS, HALL, MOTHER-DAUGHTER, CARPHONE,
COASTGUARD, SALESMAN, and TENNIS [69-70]. We encoded 150 frames of each
sequence. Coding performance is compared on the output bit rate and the peak
signal to noise ratio (PSNR) of the encoded video sequences. The Joint Model
reference software version 7.6 encoder [71] is used for the H.264/AVC tests.
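The PSNR figures reported throughout this section follow the usual definition for 8-bit video, PSNR = 10 log10(255^2 / MSE); a minimal sketch (our own, with frames as lists of rows):

```python
# Sketch of luminance PSNR between two equally sized 8-bit frames.
import math

def psnr(original, encoded):
    """Peak signal-to-noise ratio in dB; infinite for identical frames."""
    pairs = [(o, e) for ro, re in zip(original, encoded)
                    for o, e in zip(ro, re)]
    mse = sum((o - e) ** 2 for o, e in pairs) / len(pairs)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(255.0 ** 2 / mse)
```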
Comparison of H.264/AVC with MPEG-2 Standard
MPEG-2 [5-6, 72] is the most common standard for video storage and transmission.
The key features of the MPEG-2 standard are support for coding of interlaced
video and efficient coding of television-quality video. We compared the coding
performance of MPEG-2 and H.264/AVC using different sets of sequences. We used
an MPEG-2 video encoder based on the MPEG-2 Test Model 5 codec developed by the
MPEG Software Simulation Group [73]. The encoder was configured for the main
profile with the following configuration: six frames per GOP, I/P frame distance
equal to 2, 4:3 aspect ratio, and intra DC precision of 1; alternate scan and
half pixel search were also turned on. The H.264/AVC encoder was configured for
quarter pixel motion vector resolution, five frames for inter motion search,
context-based adaptive binary arithmetic coding (CABAC) for symbol coding, and
rate distortion optimized mode decision. The Hadamard transform and inter
prediction block sizes of 16 x 16, 16 x 8, 8 x 16, 8 x 8, 4 x 8, 8 x 4, and
4 x 4 were also used. The coding gains of H.264/AVC over MPEG-2 for various
sequences are shown in Table 3.2, and the rate distortion comparison with
H.264/AVC for QCIF COASTGUARD and QCIF FOREMAN in Fig. 3.2.
H.264/AVC vs. MPEG-4 ASP Standard
The MPEG-4 [8, 74-76] standard was developed with the aim of extending the
capabilities of the earlier standards. The main features of MPEG-4 are efficient
compression of progressive and interlaced video sequences, coding of video
objects, support for effective transmission over practical networks, and coding
of texture data. We used the Microsoft MPEG-4 Visual Reference Software [77],
Microsoft-FDAM1-2.3-001213 version. The MPEG-4 ASP profile was used with the
following configuration: basic GM and sprite mode, MPEG quant type, VLC entropy
coding, a motion search range window size of 16, and a texture quant step of 26
for I/B/P VOPs; quarter pixel motion search and TM5 rate control were also
turned on. The same H.264/AVC parameters were used for comparison. Table 3.2
shows the Y-PSNR gain of H.264/AVC over MPEG-4 ASP for various sequences, while
Fig. 3.2 depicts the rate distortion comparison with H.264/AVC for QCIF
COASTGUARD and QCIF FOREMAN.
Comparison of H.264/AVC with H.263 Baseline Standard
H.263 [7, 78] is commonly used for low-delay and low to medium bit rate
applications such as video conferencing and surveillance. We used the H.263
codec Version 2 by UBC [79] to produce the baseline encoding results. The
configuration of H.264/AVC was the same as for the previous comparisons. Table
3.3 shows the H.264/AVC coding gain over baseline H.263 for various sequences,
and Fig. 3.3 shows the objective gain of H.264/AVC over H.263 baseline for QCIF
CARPHONE and QCIF TENNIS for low bit rate video coding.
Comparison of H.264/AVC vs H.263+ Encoder
H.263+ [9-10], recommended in 1998, is an extension of baseline H.263 providing
12 negotiable modes and features to improve coding performance and to enhance
error resilience. We compared the H.264/AVC coding performance with the H.263+
encoder. The TMN H.263 version 3.0 codec by UBC [79] was used to generate the
results, configured with the features required for H.263+. For H.263+, the
advanced intra coding mode, deblocking filter mode, supplemental enhancement
information mode, advanced prediction mode, and syntax-based arithmetic coding
options were turned on, and a search window of 15 x 15 and five frames for inter
motion search were used. The H.264/AVC configuration was the same as in the
comparison with baseline H.263. Table 3.3 shows the H.264/AVC PSNR gain achieved
over H.263+ at various bit rates, and the rate distortion comparison of H.263+
with H.264/AVC for QCIF CARPHONE and QCIF TENNIS at low bit rates is given in
Fig. 3.3.
Table 3.2 Objective comparison of H.264/AVC with MPEG-2 and MPEG-4 at different
bit rates for various QCIF sequences

Luminance PSNR (dB)
Sequence            Bit rate (Kbps)   MPEG-2   MPEG-4   H.264
Foreman             42                27.99    27.54    35.58
                    74                28.02    30.71    37.78
                    94                28.06    31.89    38.75
                    115               28.87    32.89    39.62
News                42                27.03    29.56    38.35
                    63                27.05    32.08    40.57
                    84                27.10    33.63    42.36
                    106               27.19    34.87    43.90
Hall                42                27.62    31.93    40.13
                    62                27.65    34.22    41.30
                    93                27.69    35.77    42.40
                    114               28.75    36.43    42.91
Mother & Daughter   48                30.51    34.94    41.67
                    74                30.56    36.91    43.65
                    100               31.42    38.28    44.98
                    125               34.39    39.08    46.01
Carphone            42                28.40    28.62    37.89
                    63                28.42    32.11    39.75
                    83                28.44    32.64    41.01
                    103               28.95    35.01    42.01
Coastguard          53                27.75    28.09    32.01
                    69                27.77    28.53    32.97
                    94                28.19    29.97    34.24
                    116               30.68    30.90    35.03
Salesman            42                27.42    31.24    38.62
                    62                27.44    33.27    40.72
                    83                27.47    34.72    42.23
                    103               28.33    35.64    43.38
Tennis              42                29.24    26.35    33.44
                    68                29.28    29.68    35.70
                    94                30.35    32.12    37.14
                    120               32.36    33.44    38.36
Table 3.3 Objective comparison of H.264/AVC with H.263 Baseline and H.263+ at
different bit rates for various QCIF sequences

Luminance PSNR (dB)
Sequence            Bit rate (Kbps)   H.263   H.263+   H.264
Foreman             42                30.81   30.80    35.58
                    74                32.85   33.29    37.78
                    94                33.73   34.23    38.75
                    115               34.53   35.03    39.62
News                42                33.68   33.27    38.35
                    63                35.99   35.67    40.57
                    84                38.19   37.59    42.36
                    106               40.28   38.92    43.90
Hall                42                35.41   34.93    40.13
                    63                37.62   37.01    41.30
                    93                39.25   38.92    42.40
                    114               39.90   39.51    42.90
Mother & Daughter   48                37.78   37.35    41.67
                    74                39.21   38.91    43.65
                    100               40.70   40.50    44.98
                    125               41.43   41.35    46.00
Carphone            42                33.68   33.87    37.89
                    62                35.24   35.47    39.75
                    83                36.45   36.73    41.01
                    103               37.39   37.70    42.02
Coastguard          37                27.74   27.57    30.71
                    68                29.86   30.01    32.97
                    95                31.06   31.31    34.24
                    116               31.82   32.08    35.03
Salesman            42                34.01   33.18    38.62
                    62                36.28   35.48    40.72
                    82                38.06   37.02    42.23
                    103               39.34   38.53    43.38
Tennis              43                29.32   29.72    33.44
                    68                31.30   32.05    35.70
                    94                32.74   33.60    37.14
                    120               33.90   34.76    38.36
Fig. 3.2 Rate distortion comparison of H.264/AVC at low bit rates with MPEG-2 and MPEG-4 (a) QCIF COASTGUARD (b) QCIF FOREMAN
Fig. 3.3 Rate distortion comparison of H.264/AVC at low bit rates with H.263 Baseline and H.263+ (a) QCIF CARPHONE (b) QCIF TENNIS
A subjective comparison of H.264/AVC with the existing standards for QCIF
CARPHONE and QCIF FOREMAN at low bit rates is also given. Fig. 3.4 shows the
H.264/AVC subjective comparison of QCIF CARPHONE frame 57 with the MPEG-4, H.263
baseline, and H.263+ standards encoded at 22 Kbps, while the H.264/AVC
comparison of QCIF FOREMAN frame 134 with MPEG-2, MPEG-4, H.263 baseline, and
H.263+ encoded at 40 Kbps is shown in Fig. 3.5. These comparisons depict the
clear superiority of the latest H.264/AVC standard over the existing standards
for low bit rate video coding.
Fig. 3.4 Subjective comparison at low bit rates: QCIF CARPHONE frame 57 encoded at 22 Kbps with (a) H.264/AVC (b) MPEG-4 (c) H.263 Baseline (d) H.263+
Fig. 3.5 Subjective comparison at low bit rates: QCIF FOREMAN frame 134 encoded at 40 Kbps (a) Original Uncompressed (b) with H.264/AVC (c) with MPEG-2 (d) with MPEG-4 (e) with H.263 Baseline (f) with H.263 +
3.2 Evaluation of H.264/AVC Deblocking Filter
Performance analysis of the latest standard, H.264/AVC, has shown significant
objective and subjective gains in comparison with other existing video coding
standards for low bit rate video communication [12, 68, 80-82]. Moreover,
H.264/AVC employs a mandatory adaptive loop deblocking filter for the reduction
of blocking artifacts at low bit rates [13]. The deblocking filter is applied to
the reconstructed frame in both the encoder and the decoder. This section
describes the performance analysis of the H.264/AVC deblocking filter.
3.2.1 H.264/AVC Loop Deblocking Filter
H.264/AVC employs an adaptive loop deblocking filter after the inverse transform
in both the encoder and the decoder [13], as shown in Fig. 3.6.
Fig. 3.6 Position of deblocking filter in H.264/AVC encoder
The filter is applied to each macroblock to reduce blocking artifacts without
reducing the sharpness of the picture; the net effect is an improvement in the
subjective quality of the compressed video. The output of the filter is used for
motion compensated prediction of further frames. The deblocking filter process
is invoked for the luminance and chrominance components separately. Filtering is
applied to the vertical and horizontal edges of each block, except for edges on
slice boundaries. The order of filtering at the macroblock level is shown in
Fig. 3.7.
Fig. 3.7 Filtering order at macroblock level
First, the 4 vertical edges of the luminance component, i.e., VLE1, VLE2, VLE3,
and VLE4, are filtered. Then, the horizontal edges of the luminance component,
i.e., HLE1, HLE2, HLE3, and HLE4, are filtered. Finally, the vertical edges of
the chrominance component, VCE1 and VCE2, and the horizontal edges of the
chrominance component, HCE1 and HCE2, are filtered. It is also possible for the
encoder to alter the filter strength or to disable the filter. The filtering
operation affects up to three samples on either side of the boundary. The
operation of the deblocking filter can be divided into two main steps, i.e.,
filter boundary strength computation and strong/normal filter application.
Filter Boundary Strength Computation
The filter strength, i.e., the amount of filtering, is computed with the help of
the boundary strength parameter (bS). The boundary strength of the filter
depends on the current quantizer, the macroblock type, the motion vectors, the
gradient of the image samples across the boundary, and other parameters [13].
The boundary strength is derived for each edge between neighboring 4 x 4
luminance blocks, and for each edge the bS parameter is assigned an integer
value from 0 to 4. The rules for selecting the integer value of the boundary
strength parameter are illustrated in the flow chart of Fig. 3.8.
Fig. 3.8 Boundary strength (bS) computation flowchart
The bS values for filtering of chrominance block edges are not calculated
independently; the values calculated for the luminance edges are used.
Application of these rules results in strong filtering in areas where there is
significant blocking distortion, such as the boundary of an intra coded
macroblock or a boundary between blocks that contain coded coefficients.
Strong/Normal Filter Application
The filtering decision does not depend on a non-zero boundary strength alone,
i.e., filtering cannot be started on the basis of non-zero boundary strength
only [13]. Deblocking filtering may not be needed even in the case of non-zero
boundary strength. This is especially true when there is a real sharp transition
across the edge: applying the filter to such an edge would blur the image.
Blocking artifacts are most noticeable in very smooth regions, where pixels do
not change much across the block edge. Therefore, another condition in addition
to non-zero boundary strength is required for the filtering decision. As a
consequence, a set of samples across an edge, say p2, p1, p0 and q0, q1, q2, is
filtered only if it meets the following two conditions:

(1) bS is greater than zero
(2) abs(p0 - q0) < α, abs(p1 - p0) < β, and abs(q1 - q0) < β

where α and β are thresholds defined in the standard [13]; they increase with
the average quantizer QP of the two blocks containing samples p and q. When QP
is small, a small transition across the boundary is likely due to image features
rather than blocking effects and should be preserved, so the thresholds α and β
are low. When QP is large, blocking distortion is likely to be significant, and
α and β are higher, so more boundary samples are filtered [11]. The filter can
be switched off,
when there is a real significant change across the boundary of an original image,
which is not due to blocking distortion. The luminance deblocking filtering is
performed on four 16-sample edges and on two 8-sample edges for chrominance
components in the horizontal and vertical directions respectively. The following rules apply
for filter implementation [11, 13, 68].
Pixel values above and to the left of the current macroblock (MB) that may
have already been modified by filter on previous MBs shall be used as
input to the filter on the current MB and may be further modified during
the filtering of current MB.
Pixel values modified during filtering of vertical edges are used as input
for filtering of horizontal edges for the same MB.
Pixel values modified during the filtering of previous edges are used as input
for the filtering of the next edge in both horizontal and vertical directions.
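The two filtering conditions above reduce to a single predicate per edge. The sketch below is a simplified illustration: α and β are passed in directly rather than looked up in the standard's QP-indexed tables.

```python
# Sketch of the per-edge filtering decision: filtering requires bS > 0
# plus the three sample-difference conditions against alpha and beta.

def filter_edge(bs, p1, p0, q0, q1, alpha, beta):
    """True if the samples across this edge should be deblocked."""
    return (bs > 0
            and abs(p0 - q0) < alpha
            and abs(p1 - p0) < beta
            and abs(q1 - q0) < beta)
```

Note how a large step across p0/q0 (a real image edge) disables filtering even when bS is non-zero, which is exactly the behavior the text motivates.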
The procedure for calculating the filtered pixel samples is as follows. When the
integer value of the boundary strength is 1 to 3, the steps for computing the
filtered samples are:
A 4-tap filter is applied with inputs p1, p0, q0, q1, producing filtered
outputs p'0 and q'0.
If abs(p2 - p0) is less than the threshold β, another 4-tap filter is applied
with inputs p2, p1, p0, q0, producing filtered output p'1 for the luminance
component only.
If abs(q2 - q0) is less than the threshold β, a 4-tap filter is applied with
inputs q2, q1, q0, p0, producing filtered output q'1 for the luminance
component.
When the integer value of the boundary strength equals 4, the following
procedure is used to obtain the filtered output:
If abs(p2 - p0) is less than β, abs(p0 - q0) is less than α/4, and the current
block is a luminance block, then p'0 is produced by 5-tap filtering of p2, p1,
p0, q0, q1; p'1 by 4-tap filtering of p2, p1, p0, q0; and p'2 by 5-tap
filtering of p3, p2, p1, p0, q0.
Otherwise, p'0 is produced by 3-tap filtering of p1, p0, and q1.
If abs(q2 - q0) is less than β, abs(p0 - q0) is less than α/4, and the current
block is a luminance block, then q'0 is produced by 5-tap filtering of q2, q1,
q0, p0, p1; q'1 by 4-tap filtering of q2, q1, q0, p0; and q'2 by 5-tap
filtering of q3, q2, q1, q0, p0.
Otherwise, q'0 is produced by 3-tap filtering of q1, q0, and p1.
The whole procedure of generating filtered sample values is illustrated in Fig. 3.9.
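For the p side, the bS = 4 procedure can be sketched with the standard's tap weights; the 5-, 4-, and 3-tap structure matches the input lists given in the text (the q side is symmetric). The function name and flat argument list are our own illustration.

```python
# Sketch of the bS = 4 strong luminance filtering on the p side of an
# edge, using integer arithmetic with rounding (the q side mirrors it).

def strong_filter_p(p3, p2, p1, p0, q0, q1, strong):
    """Return (p'0, p'1, p'2) if `strong`, else the 3-tap fallback p'0."""
    if strong:
        p0n = (p2 + 2 * p1 + 2 * p0 + 2 * q0 + q1 + 4) >> 3  # 5-tap
        p1n = (p2 + p1 + p0 + q0 + 2) >> 2                   # 4-tap
        p2n = (2 * p3 + 3 * p2 + p1 + p0 + q0 + 4) >> 3      # 5-tap
        return p0n, p1n, p2n
    return (2 * p1 + p0 + q1 + 2) >> 2                       # 3-tap fallback
```

A useful sanity check is that a flat signal passes through unchanged: when all six samples are equal, every filtered output equals the input value.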
Fig. 3.9 H.264/AVC deblocking filter
3.2.2 Experimental Methodology and Results
The H.264/AVC Joint Model reference software encoder [83] is used for the tests.
We used QCIF (176 x 144) and CIF (352 x 288) video sequences. The set of
sequences represents a range of typical video content from low and high latency
applications. The QCIF sequences used for experimentation are MISS AMERICA,
CARPHONE, TENNIS, and FOREMAN, while the CIF sequences used are HALL,
COASTGUARD, MOBILE&CALENDAR, and TEMPETE [69-70]. We encoded 50 frames of each
QCIF and CIF sequence. QCIF sequences were encoded at 15 fps and CIF sequences
at 30 fps. Each sequence was coded at five different bit rates. Coding
performance is compared on the output bit rate and the PSNR of the encoded video
sequences. Only the luminance component is considered, since the human visual
system is less sensitive to color than to luminance. The H.264/AVC encoder was
configured for quarter pixel motion vector resolution, five frames for inter
motion search, context-based adaptive binary arithmetic coding (CABAC) for
symbol coding, and rate distortion optimized mode decision. The PSNR is compared
by measuring coding performance with and without the deblocking filter mode.
Table 3.4 shows the luminance PSNR for various QCIF sequences, while Table 3.5
shows the CIF sequences.
Table 3.4 Average luminance PSNR at different low bit rates for QCIF sequences
with and without deblocking filter

Average PSNR(Y), dB
Sequence            Bit rate (Kbps)   No Filter   With Filter
QCIF Carphone       130               41.72       41.69
                    90                39.71       39.74
                    70                38.39       38.40
                    50                36.64       36.77
QCIF Miss America   120               45.93       45.94
                    100               45.39       45.41
                    60                43.80       43.78
                    30                41.58       41.60
QCIF Foreman        100               37.56       37.55
                    80                36.53       36.53
                    60                35.25       35.26
                    30                32.14       32.28
QCIF Coastguard     115               34.00       34.03
                    95                33.17       33.21
                    70                31.91       31.96
                    40                29.87       29.95
QCIF Hall           120               42.25       42.25
                    90                41.37       41.41
                    60                40.08       40.15
                    20                34.18       34.48
QCIF News           120               44.15       44.13
                    70                40.23       40.42
                    50                37.93       37.92
                    25                33.70       33.76
Table 3.5 Average luminance PSNR at different low bit rates for CIF sequences
with and without deblocking filter

Average PSNR(Y), dB
Sequence              Bit rate (Kbps)   No Filter   With Filter
CIF Foreman           130               34.71       34.91
                      90                33.15       33.42
                      60                31.22       31.64
                      40                28.94       29.57
CIF Mother Daughter   140               41.42       41.43
                      120               40.79       40.80
                      90                39.48       39.54
                      60                37.50       37.73
CIF Irene             140               37.00       37.08
                      100               35.45       35.63
                      70                33.61       33.89
                      40                31.18       31.53
CIF Container         130               36.88       36.87
                      90                35.39       35.44
                      70                34.32       34.41
                      40                32.31       32.45
CIF Highway           120               38.14       38.26
                      85                37.27       37.50
                      65                36.47       36.81
                      40                35.04       35.50
CIF Bridge            125               38.24       38.24
                      90                37.81       37.87
                      70                37.50       37.59
                      40                36.94       37.10

Rate-PSNR graphs with and without the loop filter for the QCIF COASTGUARD
sequence and the CIF FOREMAN sequence at various bit rates are shown in Fig.
3.10 (a) and Fig. 3.10 (b) respectively. These graphs depict that the PSNR with
the H.264/AVC deblocking filter is slightly better than without the filter.
(a)
(b)
Fig. 3.10 Rate-PSNR comparison at low bit rates: with- & without deblocking filter (a) QCIF COASTGUARD (b) CIF FOREMAN
Fig. 3.11 through Fig. 3.14 show the subjective comparison with and without the
loop filter for various QCIF and CIF sequences at low bit rates. The frames and
encoded bit rates of the QCIF sequences are as follows: Fig. 3.11 (a) QCIF
CARPHONE frame 6 at 30 Kbps, Fig. 3.11 (b) QCIF CLAIRE frame 2 at 30 Kbps,
Fig. 3.11 (c) QCIF FOREMAN frame 3 at 30 Kbps and Fig. 3.11 (d) QCIF HALL
frame 4 at 20 Kbps, while Fig. 3.12 (a) shows QCIF MISS AMERICA frame 1 at 30
Kbps, Fig. 3.12 (b) QCIF NEWS frame 5 at 25 Kbps, Fig. 3.12 (c) QCIF AKIYO
frame 1 at 20 Kbps and Fig. 3.12 (d) QCIF SALESMAN frame 6 at 20 Kbps. The
CIF sequences used for comparison are: Fig. 3.13 (a) CIF BRIDGE frame 7 at 40
Kbps, Fig. 3.13 (b) CIF CONTAINER frame 1 at 40 Kbps, Fig. 3.13 (c) CIF
FOREMAN frame 9 at 35 Kbps, Fig. 3.14 (a) CIF HIGHWAY frame 8 at 40 Kbps,
Fig. 3.14 (b) CIF MOTHER DAUGHTER frame 8 at 40 Kbps and Fig. 3.14 (c) CIF
SILENT frame 13 at 40 Kbps. The perceptual comparison of these QCIF and CIF
sequences shows that the H.264/AVC deblocking filter can significantly reduce
blocking artifacts at low bit rates.
Fig. 3.11 Subjective comparison for various QCIF sequences (i) No deblocking filter (ii) H.264/AVC deblocking filter
Fig. 3.12 Subjective comparison for various QCIF sequences (i) No deblocking filter (ii) H.264/AVC deblocking filter
Fig. 3.13 Subjective comparison for various CIF sequences (i) No deblocking filter (ii) H.264/AVC deblocking filter
Fig. 3.14 Subjective comparison for various CIF sequences (i) No deblocking filter (ii) H.264/AVC deblocking filter
CHAPTER 4
Design and Implementation of Proposed
Deblocking Filter for Improved Quality Low Bit
Rate Video Coding
This chapter describes a novel approach adopted in the deblocking filter for
H.264/AVC video. Performance analyses [14-17] of the H.264/AVC loop filter show
that it reduces blocking artifacts significantly at low bit rates. However, it is highly
computationally complex: it takes one-third of the computational resources of the
decoder, according to an analysis of run-time profiles of decoder sub-functions [18].
The main reasons for the high computational complexity of the filter are: (1) heavy
conditional processing for edge strength computation on block edges, (2) the
pixel-level analysis required for the filtering decision, and (3) the selection of one of
two filter types, strong or normal. Kin-Hung Lam [84] reported that more than 90% of
computational resources are spent on boundary strength computations in the
H.264/AVC deblocking filter.
Very little work has been reported on algorithmic optimization in comparison with
efficient hardware implementations of the H.264/AVC deblocking filter [85-87]. In
this chapter, a novel approach of incorporating the motion activity of video sequences
into the deblocking filter is proposed, which results in a deblocking algorithm
optimized with respect to computing complexity. The proposed algorithm not only
reduces blocking artifacts without significant loss of subjective quality but also
achieves a significant reduction in computing complexity in comparison with the
original H.264/AVC deblocking filter. Section 4.1 examines the employment of the
strong and normal filters in the H.264/AVC deblocking filter. The classification of
video sequences using motion compensation vectors is elaborated in section 4.2,
while section 4.3 explains the thresholds on the sum of motion vectors used to gauge
motion activity for various video sequences. The details of the proposed deblocking
algorithm are given in section 4.4. The experimental environment for comparing the
proposed methodology with the existing H.264/AVC deblocking algorithm is
described in section 4.5, while the computing complexity analysis, objective
comparison and subjective comparison are elaborated in sections 4.6, 4.7 and 4.8
respectively.
4.1 Analysis of Strong and Normal Filter Employment in
H.264/AVC Deblocking Filter
Boundary strength computation is a major contributor to the computational
complexity of the H.264/AVC deblocking filter, and these computations are primarily
used to select between the strong and normal filters. It is therefore worthwhile to
analyze how frequently the strong and normal filters are used in the H.264/AVC
deblocking filter for various video sequences. The H.264/AVC loop deblocking filter
employs two kinds of filters based on boundary strength: (1) the normal filter and (2)
the strong filter. In the proposed research, the source code of the H.264/AVC
deblocking filter [88] has been modified to insert flags at the pixel, macroblock and
frame levels to study the employment of these two filters. All sequences used in the
experimentation are configured with an intra period of 0, i.e., only the first frame is
coded as an intra frame. Fig. 4.1 through Fig. 4.4 compare the usage of strong and
normal deblocking at the macroblock level over different frames for QCIF
CONTAINER, QCIF SALESMAN, QCIF MOTHER DAUGHTER and QCIF
FOOTBALL respectively, while Fig. 4.5 through Fig. 4.7 compare the usage of the
strong and normal filter at the frame level for QCIF CONTAINER, QCIF
SALESMAN, QCIF MOTHER DAUGHTER, QCIF CARPHONE, QCIF
FOREMAN and QCIF FOOTBALL respectively. This experimentation shows that,
excluding the first intra frame, the usage of the strong filter is minimal in all
sequences except QCIF FOOTBALL, where it increases as the sequence progresses.
The normal filter, on the other hand, is used extensively in all video sequences.
(a)
(b)
Fig. 4.1 QCIF CONTAINER at 30 Kbps: Use of (a) Normal Filter (b) Strong Filter
(a)
(b)
Fig. 4.2 QCIF SALESMAN at 30 Kbps: Use of (a) Normal Filter (b) Strong Filter
(a)
(b)
Fig. 4.3 QCIF MOTHER DAUGHTER at 30 Kbps: Use of (a) Normal Filter (b) Strong Filter
(a)
(b)
Fig. 4.4 QCIF FOOTBALL at 30 Kbps: Use of (a) Normal Filter (b) Strong Filter
(a)
(b)
Fig. 4.5 Use of Strong and Normal Filter at Frame Level in H.264/AVC Deblocking Filter encoded at 30 Kbps (a) QCIF CONTAINER (b) QCIF SALESMAN
(a)
(b)
Fig. 4.6 Use of Strong and Normal Filter at Frame Level in H.264/AVC Deblocking Filter encoded at 30 Kbps (a) QCIF MOTHER DAUGHTER (b) QCIF CARPHONE
(a)
(b)
Fig. 4.7 Use of Strong and Normal Filter at Frame Level in H.264/AVC Deblocking Filter encoded at 30 Kbps (a) QCIF FOREMAN (b) QCIF FOOTBALL
4.2 Classification using Motion Activity in Video Sequences
Different video sequences can be categorized based on their motion activity. Before
explaining the method for classifying video sequences using motion vectors, it is
useful to briefly describe how motion vectors are found during motion estimation.

Motion estimation represents the difference between two consecutive frames (the
current and reference frames) as a set of motion vectors, which the motion
compensation block uses to produce a predicted block. The frame is partitioned into
blocks and a motion vector is computed for each block; each block is assigned a
two-dimensional vector with x and y components. For example, in the full search
motion estimation algorithm, all candidate blocks within the search area of the
previous frame are checked to find the block most closely matching the current one.
Several matching criteria can be used, such as the sum of absolute differences (SAD),
mean square error (MSE) and mean absolute difference (MAD); SAD is the most
popular. If SAD is used, the motion vector of each block is found by examining the
SAD values of the candidate motion vectors within the search area and choosing the
motion vector with the minimum SAD value.
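The full search just described can be sketched in a few lines. This is an illustrative implementation only: the frame representation (lists of pixel rows), the block size and the search range are assumptions made for the example, not the settings of the reference encoder.

```python
# Illustrative full-search block matching with SAD.  Frame layout (lists
# of pixel rows), block size n and search range r are assumptions of
# this sketch, not the reference encoder's configuration.

def sad(block_a, block_b):
    """Sum of absolute differences between two equal-sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def block_at(frame, y, x, n):
    """Extract the n x n block whose top-left corner is (y, x)."""
    return [row[x:x + n] for row in frame[y:y + n]]

def full_search(cur, ref, y, x, n=4, r=2):
    """Return the (dy, dx) motion vector minimizing SAD within +/- r."""
    target = block_at(cur, y, x, n)
    best_mv, best_sad = (0, 0), float("inf")
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            ry, rx = y + dy, x + dx
            if 0 <= ry <= len(ref) - n and 0 <= rx <= len(ref[0]) - n:
                cost = sad(target, block_at(ref, ry, rx, n))
                if cost < best_sad:  # keep the best match found so far
                    best_mv, best_sad = (dy, dx), cost
    return best_mv
```

Because the search is exhaustive, it is guaranteed to find the minimum-SAD match inside the window; this is also why faster sub-optimal searches are usually preferred in practical encoders.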
Motion vectors are used to categorize video sequences because they have a low
computational cost in comparison with other techniques: the motion vectors already
calculated during motion estimation can be reused in the deblocking filter with
minimal additional computing complexity. Experimentation conducted in the
proposed research found that the absolute sum of the motion vectors of all
macroblocks and sub-blocks in each frame can be used to detect motion activity on a
frame-by-frame basis. Eq. 4.1 gives the absolute sum of motion vectors for Quarter
Common Intermediate Format (QCIF) sequences, while the sum of motion vectors for
Common Intermediate Format (CIF) sequences is given by Eq. 4.2.
$$MV_{sum} = \sum_{i=0}^{98} \left| MV_{MB_i} \right| \qquad (4.1)$$

$$MV_{sum} = \sum_{i=0}^{395} \left| MV_{MB_i} \right| \qquad (4.2)$$

$$\left| MV_{MB_i} \right| = \sum_{A \times B} \left( |MV_x| + |MV_y| \right) \qquad (4.3)$$

where A x B are the sub-blocks of a macroblock, such as 16 x 16, 16 x 8, 8 x 16,
8 x 8, 8 x 4, 4 x 8 and 4 x 4; MVx and MVy are the horizontal and vertical motion
vector components respectively; and |MV_MBi| is the absolute sum of the horizontal
and vertical motion vectors of all sub-blocks in macroblock i. Table 4.1 shows the
maximum sum of motion vectors, MVsum, for various QCIF video sequences.
Table 4.1 MVsum for QCIF video sequences

QCIF Video Sequence    MVsum
Container              166
Akiyo                  260
Miss America           442
Claire                 516
Grand Ma               525
Hall                   580
Salesman               999
Mother Daughter        1575
News                   1618
Silent                 2068
Carphone               2411
Foreman                4059
Soccer                 36903
Football               48602
The maximum sum of motion vectors, MVsum, for various CIF video sequences is
given in Table 4.2.
Table 4.2 MVsum for CIF video sequences

CIF Video Sequence     MVsum
Container              442
Hall                   2707
Bridge                 4439
Mother Daughter        5611
Paris                  7092
Highway                8949
Silent                 9324
Irene                  12995
Foreman                20259
Coastguard             47008
Football               114568
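As an illustration, the sums of Eq. 4.1 through Eq. 4.3 can be computed as follows. Representing a macroblock simply as a list of (MVx, MVy) pairs, one per sub-block, is a simplification assumed for this sketch, not the data structure of the reference software.

```python
# Frame-level motion-vector sum of Eq. 4.1 through Eq. 4.3.  A macroblock
# is modeled as a list of (MVx, MVy) pairs, one per sub-block (an
# assumption of this sketch).

def mv_mb(sub_block_mvs):
    """|MV_MB|: absolute MV sum over one macroblock's sub-blocks (Eq. 4.3)."""
    return sum(abs(mvx) + abs(mvy) for mvx, mvy in sub_block_mvs)

def mv_sum(frame_mvs):
    """MV_sum over a frame's macroblocks: 99 for QCIF, 396 for CIF."""
    return sum(mv_mb(mb) for mb in frame_mvs)
```

For example, `mv_sum([[(1, -2)], [(0, 3), (-1, 0)]])` evaluates to 7.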
4.3 Motion Vector Thresholds for Motion Activity
Video sequences with a higher frame-level sum of motion vectors, MVsum, have
stronger motion activity, and vice versa. Therefore, based on the MVsum value, video
sequences can be classified into low, moderate and high motion activity. From
Fig. 4.8, two thresholds for QCIF (176 x 144 samples) sequences have been chosen:
TH1MV = 600 and TH2MV = 4000. The pseudocode for classifying video sequences
into three classes, i.e., low motion, moderate motion and high motion, is shown
below:
if MVsum < TH1MV
    Low Motion Sequence
else if TH1MV < MVsum < TH2MV
    Moderate Motion Sequence
else (MVsum > TH2MV)
    High Motion Sequence
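A runnable version of this classification, defaulting to the QCIF thresholds TH1MV = 600 and TH2MV = 4000 chosen above, might look like:

```python
# Runnable version of the motion-activity classification pseudocode.
# Defaults are the QCIF thresholds from the text; CIF sequences would
# pass th1=4500, th2=20000 instead.

def classify_motion(mv_sum, th1=600, th2=4000):
    """Map a frame's MV_sum to a motion-activity class."""
    if mv_sum < th1:
        return "low"
    if mv_sum < th2:
        return "moderate"
    return "high"
```

With the Table 4.1 values, Container (166) classifies as low, Carphone (2411) as moderate and Football (48602) as high motion activity.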
[Figure: bar chart titled "Motion Vectors Thresholds for QCIF Sequences", plotting MVsum (0-10000) for the QCIF sequences CONTAINER, AKIYO, MISS AMERICA, CLAIRE, GRAND MA, HALL, SALESMAN, MOTHER DAUGHTER, NEWS, SILENT, CARPHONE, FOREMAN, SOCCER and FOOTBALL, with the LOW, MODERATE and HIGH motion activity regions marked]

Fig. 4.8 Thresholds for classification of QCIF video sequences
As CIF (352 x 288 samples) video sequences are larger than QCIF sequences, the
QCIF thresholds cannot be used. Fig. 4.9 shows the thresholds for CIF sequences.
[Figure: bar chart titled "Motion Vectors Thresholds for CIF Sequences", plotting MVsum (0-30000) for the CIF sequences CONTAINER, HALL, BRIDGE, MOTHER DAUGHTER, PARIS, HIGHWAY, SILENT, IRENE, FOREMAN, COASTGUARD and FOOTBALL, with the LOW, MODERATE and HIGH motion activity regions marked]

Fig. 4.9 Thresholds for classification of CIF video sequences
From Fig. 4.9, two thresholds for CIF sequences have been chosen: TH1MV = 4500
and TH2MV = 20000. For MVsum < 4500 a sequence is classified as low motion
activity; for 4500 < MVsum < 20000 as moderate motion activity; and for
MVsum > 20000 as high motion activity.
4.4 Proposed Deblocking Filter
The operation of the H.264/AVC deblocking filter can be divided into two main
steps: edge strength computation and filter application. The edge strength is
computed with the help of the boundary strength parameter (bS) and depends on the
current quantizer, the macroblock type, the motion vectors and the gradient of the
image samples across the boundary. The parameter bS is derived for each edge
between neighboring 4 x 4 luminance blocks and is assigned an integer value from 0
to 4. The bS values calculated for luminance edges are also used for filtering
chrominance block edges and need not be calculated independently. The four samples
on each side of a vertical or horizontal edge in adjacent blocks, p0, p1, p2, p3 and q0,
q1, q2, q3, are shown in Fig. 4.10. In addition to a non-zero bS, another condition
needs to be satisfied before the samples across a block edge are filtered.
Fig. 4.10 Adjacent samples to vertical and horizontal edge
The samples are filtered only if they satisfy the two conditions of Eq. 4.4 and Eq. 4.5.

$$bS > 0 \qquad (4.4)$$

$$|p_0 - q_0| < \alpha \;\; \&\& \;\; |p_1 - p_0| < \beta \;\; \&\& \;\; |q_1 - q_0| < \beta \qquad (4.5)$$
where α and β are thresholds. These thresholds are computed from the parameters
Index A and Index B, which are derived from the quantization parameter values of the
macroblocks containing samples p0 and q0, as given by Eq. 4.6 and Eq. 4.7 [13].
$$\text{Index A} = \begin{cases} 0, & QP_{AV} + \text{filteroffsetA} < 0 \\ 51, & QP_{AV} + \text{filteroffsetA} > 51 \\ QP_{AV} + \text{filteroffsetA}, & \text{otherwise} \end{cases} \qquad (4.6)$$

$$\text{Index B} = \begin{cases} 0, & QP_{AV} + \text{filteroffsetB} < 0 \\ 51, & QP_{AV} + \text{filteroffsetB} > 51 \\ QP_{AV} + \text{filteroffsetB}, & \text{otherwise} \end{cases} \qquad (4.7)$$
where QPAV is the average quantization parameter, and filteroffsetA and filteroffsetB
are numerical values in the range -6 to +6, set in the parameter file when the
deblocking filter is enabled. The values of α and β are specified in tabular form
against the Index A and Index B values, and can be approximated by Eq. 4.8 [68].
$$\alpha(x) = \frac{4}{5}\left(2^{x/6} - 1\right), \qquad \beta(x) = \frac{x}{2} - 7 \qquad (4.8)$$
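For illustration, the threshold approximation of Eq. 4.8 and the sample-level condition of Eq. 4.5 can be combined into a small decision routine. Note that the standard derives α and β from lookup tables indexed by Index A and Index B; the closed-form expressions below are only the approximation quoted above, so this is a sketch rather than a conformant implementation.

```python
# Sketch: edge-filtering decision of Eq. 4.5 using the closed-form
# threshold approximations of Eq. 4.8 (the standard uses lookup tables,
# so this is an approximation, not the normative computation).

def alpha(index_a):
    """Approximate alpha threshold: 4 * (2**(x/6) - 1) / 5 (Eq. 4.8)."""
    return 4 * (2 ** (index_a / 6) - 1) / 5

def beta(index_b):
    """Approximate beta threshold: x/2 - 7 (Eq. 4.8)."""
    return index_b / 2 - 7

def should_filter(p1, p0, q0, q1, index_a, index_b):
    """Eq. 4.5: decide whether the samples across the edge are filtered."""
    a, b = alpha(index_a), beta(index_b)
    return abs(p0 - q0) < a and abs(p1 - p0) < b and abs(q1 - q0) < b
```

At index 40, for example, a small step such as p1, p0, q0, q1 = 10, 12, 20, 22 passes the test and is filtered, while a sharp natural edge such as 10, 10, 200, 200 fails the α check and is left untouched, which is exactly the content-adaptive behavior the conditions are designed to produce.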
The experimentation conducted in the proposed research reveals that the strong
filtering of the H.264/AVC deblocking filter can be omitted without significant loss
of subjective quality for low to moderate motion activity video sequences. Hence, the
boundary strength (bS) computations, used primarily to select between the strong and
normal filters, can be eliminated for low to moderate motion video sequences. As a
result, for low to moderate motion sequences, filtering of the samples can be decided
by fulfilling only one condition, Eq. 4.5, whereas for high motion video sequences
fulfillment of both conditions, Eq. 4.4 and Eq. 4.5, is required. This significantly
reduces the computational complexity of the proposed deblocking algorithm, owing
to the elimination of boundary strength computations for low to moderate motion
video sequences.
The application of the proposed algorithm is decided on the basis of inter frames
(P-pictures). First, the motion vector sum MVsum of each P-picture is computed and
compared with the pre-defined thresholds to decide the motion activity class at frame
level. If the picture is classified as low or moderate motion activity, only the normal
filter [13] is applied, on fulfillment of Eq. 4.5. The operation of the normal filter is
given by Eq. 4.9 through Eq. 4.26. The unfiltered input samples are p0, p1, p2, q0, q1
and q2, whereas P0, P1, P2, Q0, Q1 and Q2 are the filtered output samples.
Normal filter for luminance edge:

P0 = Min(Max(0, p0 + d), 255)    (4.9)
Q0 = Min(Max(0, q0 - d), 255)    (4.10)

where d = Min(Max(-K, (((q0 - p0) << 2) + (p1 - q1) + 4) >> 3), K)    (4.11)

if AP = |p2 - p0| < β    (4.12)
    P1 = p1 + Min(Max(-K0, (p2 + ((p0 + q0 + 1) >> 1) - (p1 << 1)) >> 1), K0)
else P1 = p1    (4.13)

if AQ = |q2 - q0| < β    (4.14)
    Q1 = q1 + Min(Max(-K0, (q2 + ((p0 + q0 + 1) >> 1) - (q1 << 1)) >> 1), K0)
else Q1 = q1    (4.15)

where K = K0 + ((AP < β) ? 1 : 0) + ((AQ < β) ? 1 : 0)    (4.16)

P2 = p2    (4.17)
Q2 = q2    (4.18)

Normal filter for chrominance edge:

P0 = Min(Max(0, p0 + d), 255)    (4.19)
Q0 = Min(Max(0, q0 - d), 255)    (4.20)

where d = Min(Max(-K, (((q0 - p0) << 2) + (p1 - q1) + 4) >> 3), K)    (4.21)
and K = K0 + 1    (4.22)

P1 = p1    (4.23)
Q1 = q1    (4.24)
P2 = p2    (4.25)
Q2 = q2    (4.26)
The variable K0 is a function of Index A and the boundary strength (bS), as defined
in the standard [13].
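The normal luminance filtering of Eq. 4.9 through Eq. 4.16 transcribes directly to code. In the sketch below, the table-derived clipping value K0 is passed in as a parameter rather than looked up from Index A and bS, which is an assumption of this example; note also that Python's `>>` is an arithmetic (floor) shift, as the equations require.

```python
# Sketch of the normal luminance filtering (Eq. 4.9-4.16).  k0 is the
# table-derived clipping value, passed in directly here instead of being
# looked up from Index A and bS (an assumption of this example).

def clip3(lo, hi, v):
    """Clip v to the inclusive range [lo, hi]."""
    return max(lo, min(hi, v))

def normal_filter_luma(p2, p1, p0, q0, q1, q2, k0, b):
    """Return filtered (P1, P0, Q0, Q1); b is the beta threshold."""
    ap, aq = abs(p2 - p0), abs(q2 - q0)                        # Eq. 4.12/4.14
    k = k0 + (1 if ap < b else 0) + (1 if aq < b else 0)       # Eq. 4.16
    d = clip3(-k, k, (((q0 - p0) << 2) + (p1 - q1) + 4) >> 3)  # Eq. 4.11
    P0 = clip3(0, 255, p0 + d)                                 # Eq. 4.9
    Q0 = clip3(0, 255, q0 - d)                                 # Eq. 4.10
    if ap < b:                                                 # Eq. 4.13
        P1 = p1 + clip3(-k0, k0,
                        (p2 + ((p0 + q0 + 1) >> 1) - (p1 << 1)) >> 1)
    else:
        P1 = p1
    if aq < b:                                                 # Eq. 4.15
        Q1 = q1 + clip3(-k0, k0,
                        (q2 + ((p0 + q0 + 1) >> 1) - (q1 << 1)) >> 1)
    else:
        Q1 = q1
    return P1, P0, Q0, Q1
```

For example, a step edge with p-side samples of 60 and q-side samples of 70, with k0 = 2 and β = 10, is smoothed into the ramp 62, 64, 66, 68.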
For high motion activity, the strong filter [13] is applied in addition to the normal
filter, on fulfillment of Eq. 4.4 and Eq. 4.5. The operation of the strong filter is given
by Eq. 4.27 through Eq. 4.38.
Strong filter for luminance edge:

P0 = (p2 + 2p1 + 2p0 + 2q0 + q1 + 4) >> 3    (4.27)
P1 = (p2 + p1 + p0 + q0 + 2) >> 2    (4.28)
P2 = (2p3 + 3p2 + p1 + p0 + q0 + 4) >> 3    (4.29)
Q0 = (q2 + 2q1 + 2q0 + 2p0 + p1 + 4) >> 3    (4.30)
Q1 = (q2 + q1 + q0 + p0 + 2) >> 2    (4.31)
Q2 = (2q3 + 3q2 + q1 + q0 + p0 + 4) >> 3    (4.32)

Strong filter for chrominance edge:

P0 = (2p1 + p0 + q1 + 2) >> 2    (4.33)
P1 = p1    (4.34)
P2 = p2    (4.35)
Q0 = (2q1 + q0 + p1 + 2) >> 2    (4.36)
Q1 = q1    (4.37)
Q2 = q2    (4.38)
where P0, P1, P2, Q0, Q1 and Q2 are the filtered output samples. The entire
procedure for deblocking in the proposed algorithm is shown in Fig. 4.11.
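The strong luminance filtering of Eq. 4.27 through Eq. 4.32 also maps directly to code; all four samples on each side of the edge are required. This is a direct transcription of the equations above, shown for illustration:

```python
# Direct transcription of the strong luminance filtering (Eq. 4.27-4.32;
# in the standard this corresponds to the bS = 4 case).

def strong_filter_luma(p3, p2, p1, p0, q0, q1, q2, q3):
    """Return filtered (P2, P1, P0, Q0, Q1, Q2)."""
    P0 = (p2 + 2 * p1 + 2 * p0 + 2 * q0 + q1 + 4) >> 3  # Eq. 4.27
    P1 = (p2 + p1 + p0 + q0 + 2) >> 2                   # Eq. 4.28
    P2 = (2 * p3 + 3 * p2 + p1 + p0 + q0 + 4) >> 3      # Eq. 4.29
    Q0 = (q2 + 2 * q1 + 2 * q0 + 2 * p0 + p1 + 4) >> 3  # Eq. 4.30
    Q1 = (q2 + q1 + q0 + p0 + 2) >> 2                   # Eq. 4.31
    Q2 = (2 * q3 + 3 * q2 + q1 + q0 + p0 + 4) >> 3      # Eq. 4.32
    return P2, P1, P0, Q0, Q1, Q2
```

A flat region (all samples equal) passes through unchanged, while a 40/60 step edge is smoothed into the ramp 43, 45, 48, 53, 55, 58 across the boundary.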
Fig. 4.11 Proposed Deblocking Filter
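The frame-level flow of Fig. 4.11 can be summarized in a short dispatch sketch. The two callables stand in for "normal filter only, gated by the Eq. 4.5 check" and "full H.264/AVC path with bS computation"; they, the threshold value and the macroblock representation are placeholders assumed for this example, not the JM implementation.

```python
# Frame-level dispatch of the proposed algorithm (after Fig. 4.11).
# normal_only and full_h264 are hypothetical callables standing in for
# the two filtering paths described in the text.

def deblock_frame(frame_mvs, th2, normal_only, full_h264):
    """Pick the filtering path from the frame's motion-vector sum."""
    mv_sum = sum(abs(x) + abs(y) for mb in frame_mvs for x, y in mb)
    if mv_sum > th2:       # high motion: keep bS-based strong/normal path
        return full_h264()
    return normal_only()   # low/moderate: skip bS, apply normal filter only
```

The design point is that the whole boundary-strength machinery is bypassed with a single comparison per frame, reusing motion vectors the encoder has already computed.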
4.5 Experimental Environment
The proposed algorithm is tested using the H.264/AVC joint model reference
software, version JM 10.2 [88]. The computational complexity and the objective and
subjective quality of the proposed scheme are compared with those of the original
H.264/AVC deblocking algorithm as implemented in JM 10.2. To compare the
proposed filter with the H.264/AVC deblocking filter, video sequences with low and
moderate motion activity are used. The Quarter Common Intermediate Format
(QCIF) sequences used for experimentation are CONTAINER, AKIYO, MISS
AMERICA, CLAIRE, GRAND MA, HALL, SALESMAN, MOTHER DAUGHTER,
NEWS, SILENT, CARPHONE and FOREMAN, while the Common Intermediate
Format (CIF) sequences used are CONTAINER, BRIDGE, MOTHER DAUGHTER,
PARIS, HIGHWAY, SILENT, IRENE and FOREMAN [69-70]. Each sequence
consists of 150 frames and is coded at four different bit rates. The parameters used in
the simulation model are given in Table 4.3.
Table 4.3 Various parameters for experimental environment

Reference software version      JM 10.2
Video format                    352 x 288 & 176 x 144
Total frames                    150
Frame rate                      15
Profile                         Main
Intra period                    0 (only 1st frame)
Max search range                16
GOP structure                   IPPP
Hadamard transform              Used
Transform 8x8 mode              Not used
No. of reference frames         1
Frame skip                      0
Inter search                    16x16, 16x8, 8x16, 8x8, 8x4, 4x8, 4x4
B frames                        Not used
Entropy method                  CABAC
Rate distortion optimization    Not used

4.6 Computing Complexity Comparison

The source code of the H.264/AVC reference software is modified to count the
addition, shift and comparison operations performed in both the original H.264/AVC
and the proposed deblocking algorithm. The average numbers of addition, shift and
comparison operations required for various low to moderate motion QCIF and CIF
sequences by the H.264/AVC deblocking filter versus the proposed deblocking
algorithm, for the same number of frames, are shown in Table 4.4 and Table 4.5
respectively.
Table 4.4 Average number of operations spent on QCIF sequences using H.264/AVC deblocking filter and proposed filter (multiplications and divisions are zero for both filters in all sequences)

QCIF Sequence      Operations     H.264 filter    Proposed
Container          Additions      28,476,460      16,292,188
                   Shifts         36,463,742      20,778,158
                   Comparisons    144,531,133     75,207,656
Akiyo              Additions      27,371,101      15,662,701
                   Shifts         35,959,334      20,425,673
                   Comparisons    142,740,797     74,374,293
Miss America       Additions      26,357,322      15,492,921
                   Shifts         35,123,180      20,263,459
                   Comparisons    141,444,678     73,892,811
Grand Ma           Additions      29,389,037      16,902,641
                   Shifts         37,057,379      21,227,732
                   Comparisons    146,022,631     76,782,643
Salesman           Additions      35,902,147      19,433,303
                   Shifts         40,511,946      22,403,383
                   Comparisons    150,930,703     80,338,883
Mother Daughter    Additions      36,937,707      20,088,158
                   Shifts         40,911,885      23,061,611
                   Comparisons    151,959,176     81,907,953
Silent             Additions      41,848,796      22,148,680
                   Shifts         43,690,091      23,992,012
                   Comparisons    156,258,335     85,144,423
Foreman            Additions      51,547,739      27,782,268
                   Shifts         48,376,735      27,384,781
                   Comparisons    163,081,086     93,638,591
Table 4.5 Average number of operations spent on CIF sequences using H.264/AVC deblocking filter and proposed filter (multiplications and divisions are zero for both filters in all sequences)

CIF Sequence       Operations     H.264 filter    Proposed
Container          Additions      209,073,508     106,534,689
                   Shifts         194,629,505     109,146,848
                   Comparisons    658,904,456     372,531,724
Bridge             Additions      162,641,668     84,435,369
                   Shifts         172,012,906     96,177,357
                   Comparisons    626,390,344     339,840,985
Paris              Additions      231,164,441     115,687,209
                   Shifts         205,766,183     114,703,405
                   Comparisons    677,279,548     337,074,592
Highway            Additions      269,656,850     134,292,043
                   Shifts         224,782,071     125,383,657
                   Comparisons    703,963,638     416,485,739
Foreman            Additions      261,440,800     133,611,987
                   Shifts         220,835,749     125,154,325
                   Comparisons    697,709,576     414,675,307
Fig. 4.12 through Fig. 4.14 show the comparison of addition, shift and comparison
operations between the H.264/AVC deblocking filter and the proposed deblocking
algorithm for different low to moderate motion activity video sequences. These
graphs clearly show a significant reduction in addition, shift and comparison
operations for the proposed deblocking filter in comparison with the H.264/AVC
deblocking filter.
[Figure: bar charts of the number of addition operations (millions), H.264 DF vs. Proposed, for (a) the QCIF sequences Akiyo, Grand Ma, Salesman, Mother Daughter and Silent, and (b) the CIF sequences Container, Bridge, Paris, Highway and Foreman]

Fig. 4.12 Comparison of addition operations (a) QCIF sequences (b) CIF sequences
[Figure: bar charts of the number of shift operations (millions), H.264 DF vs. Proposed, for (a) the QCIF sequences Akiyo, Grand Ma, Salesman, Mother Daughter and Silent, and (b) the CIF sequences Container, Bridge, Paris, Highway and Foreman]

Fig. 4.13 Comparison of shift operations (a) QCIF sequences (b) CIF sequences
[Figure: bar charts of the number of comparison operations (millions), H.264 DF vs. Proposed, for (a) the QCIF sequences Akiyo, Grand Ma, Salesman, Mother Daughter and Silent, and (b) the CIF sequences Container, Bridge, Paris, Highway and Foreman]

Fig. 4.14 Comparison of comparison operations (a) QCIF sequences (b) CIF sequences
Table 4.6 describes the overall computing complexity of the proposed filter in
comparison with the H.264/AVC deblocking filter. A significant reduction in the total
number of operations in the proposed algorithm can be seen from the table. This
reduction is achieved for two reasons: in video sequences of low and moderate
motion activity, the strong filter is not used and edge strength computations are
eliminated. For the sequences used in the simulation, a 45.29% reduction in the
average number of operations is achieved in comparison with the H.264/AVC
deblocking filter.
Table 4.6 Computing complexity analysis of proposed filter with H.264/AVC deblocking filter

                 H.264 deblocking filter          Proposed deblocking filter
Sequence         Total no. of    Avg. no. of ops  Total no. of    Avg. no. of ops  Reduction in     % reduction
                 operations      on edge samples  operations      on edge samples  no. of ops       (total ops)
QCIF Hall        216,706,920     20,617,213       120,100,701     11,815,795       96,606,219       44.58
QCIF News        220,906,614     17,687,001       118,106,515     10,566,838       102,800,099      46.54
QCIF Salesman    215,415,348     13,175,349       113,959,513     7,573,386        101,455,835      47.10
QCIF Carphone    230,203,012     23,755,877       125,376,191     14,849,401       104,826,820      45.54
QCIF Foreman     226,194,657     20,798,789       122,495,837     13,175,185       103,698,820      45.84
CIF Container    1,062,607,469   200,665,721      588,213,260     122,801,488      474,394,208      44.64
CIF Bridge       961,044,917     117,720,165      520,453,711     81,144,899       440,591,206      45.85
CIF Paris        1,114,210,172   242,037,000      567,465,206     145,811,190      546,744,966      49.07
CIF Highway      1,198,402,559   312,258,491      676,161,439     191,747,607      522,241,119      43.58
CIF Foreman      1,179,986,124   296,785,670      673,441,618     190,093,590      506,544,505      42.93
4.7 Objective Comparison
The objective comparison of the proposed algorithm with the H.264/AVC deblocking
algorithm is performed by measuring the Peak Signal-to-Noise Ratio (PSNR) of both
algorithms at various low bit rates. Table 4.7 and Table 4.8 compare the luminance
PSNR at different bit rates using the H.264/AVC filter and the proposed algorithm
for various QCIF and CIF sequences respectively.
Table 4.7 Average luminance PSNR at different bit rates for QCIF sequences

Sequence            Bit rate (Kbps)   Average Y-PSNR (dB)
                                      H.264 filter   Proposed
QCIF Miss America   120               45.94          45.94
                    80                44.70          44.71
                    60                43.78          43.81
                    30                41.60          41.64
QCIF Hall           120               42.25          42.25
                    90                41.41          41.40
                    30                36.84          37.23
                    20                34.48          34.67
QCIF Salesman       120               43.66          43.66
                    60                39.57          39.59
                    35                36.04          36.10
                    20                32.72          32.74
QCIF News           120               44.11          44.11
                    100               42.71          42.77
                    50                37.92          38.04
                    25                33.76          33.78
QCIF Carphone       130               41.69          41.69
                    70                38.40          38.44
                    50                36.77          36.84
                    30                34.36          34.41
QCIF Foreman        100               37.55          37.56
                    80                36.53          36.59
                    60                35.26          35.44
                    30                32.28          32.47
Table 4.8 Average luminance PSNR at different bit rates for CIF sequences

Sequence              Bit rate (Kbps)   Average Y-PSNR (dB)
                                        H.264 filter   Proposed
CIF Container         130               36.87          36.92
                      90                35.44          35.49
                      70                34.41          34.42
                      40                32.45          32.47
CIF Bridge            125               38.24          38.29
                      90                37.87          37.87
                      70                37.59          37.63
                      40                37.10          37.11
CIF Mother Daughter   140               41.43          41.44
                      120               40.80          40.84
                      90                39.54          39.59
                      60                37.73          37.76
CIF Highway           120               38.26          38.28
                      85                37.50          37.52
                      65                36.81          36.82
                      40                35.50          35.54
CIF Irene             140               37.08          37.15
                      100               35.63          35.63
                      70                33.89          33.96
                      40                31.53          31.53
CIF Foreman           130               34.91          35.06
                      90                33.42          33.56
                      60                31.64          31.70
                      40                29.57          29.57
For most sequences, the PSNR values of the proposed and original H.264/AVC
deblocking filter are the same; however, a slight improvement in PSNR is observed,
within a range of 0.01-0.18 dB for QCIF sequences and 0.01-0.15 dB for CIF
sequences. Fig. 4.15 shows the objective performance by comparing the average
PSNR of the proposed filter with that of the H.264/AVC deblocking filter at various
bit rates for various QCIF and CIF sequences.
(a)
(b)
Fig. 4.15 Objective comparison between H.264/AVC deblocking filter and proposed deblocking filter for various (a) QCIF sequences (b) CIF sequences
4.8 Subjective Comparison
A subjective comparison between the original raw (uncompressed) frame, no filter,
the H.264/AVC deblocking filter and the proposed deblocking filter is made for
various low to moderate motion activity QCIF and CIF sequences at low bit rates.
The QCIF frames used are: Fig. 4.16 (a) QCIF CONTAINER frame 1 at 20 Kbps,
(b) QCIF MISS AMERICA frame 2 at 30 Kbps; Fig. 4.17 (a) QCIF CLAIRE frame 3
at 30 Kbps, (b) QCIF HALL frame 5 at 20 Kbps; Fig. 4.18 (a) QCIF SALESMAN
frame 6 at 20 Kbps, (b) QCIF NEWS frame 4 at 25 Kbps; and Fig. 4.19 (a) QCIF
CARPHONE frame 3 at 30 Kbps, (b) QCIF FOREMAN frame 2 at 30 Kbps.
Fig. 4.20 through Fig. 4.26 show the subjective comparison of CIF CONTAINER,
CIF BRIDGE, CIF MOTHER DAUGHTER, CIF HIGHWAY, CIF SILENT, CIF
IRENE and CIF FOREMAN for various frames at low bit rates respectively.

The subjective analysis of the proposed algorithm against the H.264/AVC deblocking
filter is performed by comparing the perceptual quality of the video over different
frames of the QCIF and CIF sequences. The set of sequences used for
experimentation represents a wide range of typical content for low and high latency
applications. The analysis shows that the perceptual quality of the proposed algorithm
is comparable with that of the H.264/AVC deblocking filter. Further observation of
the low to moderate motion sequences used in the experimentation reveals that the
proposed algorithm effectively suppresses blocking artifacts without significant
blurring, substantially preserving the original edges, with the benefit of reduced
computational complexity.
Fig. 4.16 Subjective comparison for various QCIF sequences (i) Raw frame (ii) No deblocking filter (iii) H.264/AVC deblocking algorithm (iv) Proposed
Fig. 4.17 Subjective comparison for various QCIF sequences (i) Raw frame (ii) No deblocking filter (iii) H.264/AVC deblocking algorithm (iv) Proposed
Fig. 4.18 Subjective comparison for various QCIF sequences (i) Raw frame (ii) No deblocking filter (iii) H.264/AVC deblocking algorithm (iv) Proposed
Fig. 4.19 Subjective comparison for various QCIF sequences (i) Raw frame (ii) No deblocking filter (iii) H.264/AVC deblocking algorithm (iv) Proposed
Fig. 4.20 CIF CONTAINER frame 1 encoded at 35 Kbps (i) Raw frame (ii) No deblocking filter (iii) H.264/AVC deblocking algorithm (iv) Proposed
Fig. 4.21 CIF BRIDGE frame 4 encoded at 40 Kbps (i) Raw frame (ii) No deblocking filter (iii) H.264/AVC deblocking algorithm (iv) Proposed
Fig. 4.22 CIF MOTHER DAUGHTER frame 6 encoded at 40 Kbps (i) Raw frame (ii) No deblocking filter (iii) H.264/AVC deblocking algorithm (iv) Proposed
Fig. 4.23 CIF HIGHWAY frame 5 encoded at 40 Kbps (i) Raw frame (ii) No deblocking filter (iii) H.264/AVC deblocking algorithm (iv) Proposed
Fig. 4.24 CIF SILENT frame 3 encoded at 40 Kbps (i) Raw frame (ii) No deblocking filter (iii) H.264/AVC deblocking algorithm (iv) Proposed
Fig. 4.25 CIF IRENE frame 14 encoded at 40 Kbps (i) Raw frame (ii) No deblocking filter (iii) H.264/AVC deblocking algorithm (iv) Proposed
Fig. 4.26 CIF FOREMAN frame 9 encoded at 35 Kbps (i) Raw frame (ii) No deblocking filter (iii) H.264/AVC deblocking algorithm (iv) Proposed
CONCLUSIONS
The main objective of this thesis was the design and implementation of a deblocking
filter for low bit rate video coding with reduced computing complexity.
In chapter 3, we presented two case studies in low bit rate video coding: a
performance analysis of the H.264/AVC video coding standard against existing
standards, and an evaluation of the H.264/AVC deblocking filter. The performance
analysis proved the superiority of H.264/AVC over the other video coding standards
at low bit rates, while the evaluation of the H.264/AVC deblocking filter
demonstrated its effectiveness in suppressing blocking artifacts. At the same time, the
high computing complexity of the H.264/AVC deblocking filter was recognized.
In chapter 4, taking computing complexity reduction as the target, a deblocking
algorithm based on the motion activity of video sequences was proposed. Because
motion vectors are already available from motion estimation, the absolute sum of
motion compensation vectors was used to categorize different video sequences
according to motion activity. Furthermore, the impact of H.264/AVC's strong and
normal filters was analyzed on various video sequences. It was found through
experimentation that, for low to moderate motion video sequences, the strong filter
can be replaced by the normal filter. As a result, the boundary strength computation,
which takes the major share of the computations, is not used for low to moderate
motion activity video sequences in the proposed approach. The computational
complexity comparison of the proposed approach with the original deblocking
algorithm revealed a significant reduction in the number of computing operations.
Furthermore, the objective analysis of the proposed approach shows that the PSNR of
low to moderate motion video sequences changes little in comparison with the
original algorithm, and the extensive subjective comparison demonstrates that the
perceptual quality of video using the proposed method is comparable with that of the
original H.264/AVC deblocking algorithm.
The proposed deblocking algorithm can be used in applications where reduced
computing complexity with low bandwidth is desired, for example real time low bit
rate applications such as mobile video, video conferencing on the Internet, and video
telephony over low bandwidth lines.
FUTURE RECOMMENDATIONS
The proposed research work can be further extended as follows:
Motion compensation vectors have been used in the proposed research to categorize
video sequences according to motion activity. Other metrics for detecting motion
activity can also be investigated; for example, block size and coding mode
information could be used for the classification.
In the proposed research, thresholds have been found for QCIF and CIF resolutions.
An adaptive threshold definition approach may be researched further to cope with
other resolutions; for example, thresholds for higher resolution sequences can be
investigated.
Efficient hardware solutions are necessary for low bit rate applications, so further
research on an LSI architecture for the proposed method can be done. For example,
the loading of unfiltered samples and the storing of filtered samples can be optimized,
as memory access is the most time-consuming operation.