A frequency domain approach for Intra-coding pictures
Presenter: Andy C. Yu
for Microsoft Research Cambridge
16th May 2006
Outline:
• Introduction - H.264/AVC Intra coding v. Motion JPEG-2000 coding.
• The proposed Fintra algorithm – based on the frequency domain
• Simulation results – comparison with state-of-the-art algorithms
• Conclusions
Introduction
• H.264/AVC is a joint standard of ITU-T and ISO/IEC MPEG for commercial video coding; their Joint Video Team was formed in Dec. 2001.
• It represents a major step forward in the development of video standards*.
• It typically outperforms all earlier standards by a factor of two in compression efficiency.
• Another important fact is that H.264/AVC is a public and open standard*.
* R. Schäfer, T. Wiegand and H. Schwarz, “The emerging H.264/AVC standard”, EBU Technical Review, Jan 2003
Conventional Intra-coding structure
[Figure: video sequence → decomposition into 8x8 blocks → 8x8 DCT → quantization process (lossy) → entropy coding]
H.264/AVC Intra-coding structure
[Figure: video sequence → decomposition into 4x4 blocks → mode selection gives a predicted block → residue data (original block minus predicted block) → integer 4x4 DCT & quantisation process → entropy coding]
Intra-Prediction in H.264/AVC
[Figure: 4x4 block of samples with rows (a b c d), (e f g h), (i j k l), (m n o p), bordered by the 17 neighbouring samples A to Q]
A to Q (upper case): Neighbouring samples that are already reconstructed at both the encoder and the decoder.
a to p (lower case): Samples to be predicted.
Picture from I. Richardson, H.264 and MPEG-4 Video Compression, Wiley, 1st edition, 2003.
Lagrangian Evaluation
• The final mode decision is selected to minimise the Lagrangian cost:
$$J(\mathrm{mode} \mid Qp) = D(\mathrm{mode} \mid Qp) + \lambda(Qp) \cdot R(\mathrm{mode} \mid Qp)$$
• Where D: measure of the distortion.
R: the number of bits.
λ: Lagrange parameter.
Qp: Quantisation factor.
Lagrangian evaluation (cont.)
$$\lambda(Qp) = 0.85 \times 2^{(Qp - 12)/3}$$
[Figure: Lagrange parameter (0 to 6000) plotted against quantisation factor Qp (0 to 50)]
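The two slides above can be tied together with a small sketch. It evaluates λ from Qp and then the cost J = D + λ·R for a few candidate modes. The function names and the (D, R) pairs are made up for illustration only; they are not taken from the JM software or from the talk.

```python
def lagrange_parameter(qp: int) -> float:
    """lambda(Qp) = 0.85 * 2^((Qp - 12) / 3), as given on the slide."""
    return 0.85 * 2.0 ** ((qp - 12) / 3.0)


def lagrangian_cost(distortion: float, rate_bits: float, qp: int) -> float:
    """J = D + lambda(Qp) * R for one candidate mode."""
    return distortion + lagrange_parameter(qp) * rate_bits


def best_mode(candidates: dict, qp: int) -> int:
    """candidates maps mode -> (D, R); return the mode with the smallest J."""
    return min(candidates, key=lambda mode: lagrangian_cost(*candidates[mode], qp))


# Hypothetical (distortion, bits) pairs for three modes of one block.
CANDIDATES = {0: (1200.0, 40), 1: (1000.0, 60), 2: (1500.0, 30)}

if __name__ == "__main__":
    for qp in (12, 28):
        print(qp, round(lagrange_parameter(qp), 2), best_mode(CANDIDATES, qp))
```

With these made-up numbers the low-distortion mode wins at Qp = 12, while at Qp = 28 the larger λ favours the mode that spends the fewest bits.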
H.264 Intra-coding v. Motion JPEG2000 coding
Picture from J. Ostermann, J. Bormans et al., “Video Coding with H.264/AVC: Tools, Performance, and Complexity,” IEEE Circuits and Systems Magazine, first quarter, 2004
The Proposed Fast Intra-coding Algorithm (Fintra)
The search strategy for the proposed Fintra algorithm
• The proposed algorithm selects fewer modes to undergo full Lagrangian cost evaluation.
• The entire selection process operates in the discrete cosine transform (DCT) domain.
• Intra-coding generally produces residue blocks with larger energy than inter-coding.
• It is observed that the modes with the least residue energy tend to give the minimum rate R and hence the minimum Lagrangian cost.
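Algorithm formulation
• The selection criterion can be measured from the SAD of the residue block in the Discrete Cosine Transform domain:
$$\mathrm{SAD}_{\mathrm{DCT(residue)}} = \sum_{m=0}^{3} \sum_{n=0}^{3} \left| T\{ B(m,n) - P_{\mathrm{mode}}(m,n) \} \right|$$
where B is the original block, P_mode is the block predicted by the given mode and T{·} is the transform.
The definition of 2D-DCT is:
$$F_{uv} = \sqrt{\frac{2}{M}} \sqrt{\frac{2}{N}} \, c(u) \, c(v) \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} B(m,n) \cos\frac{(2m+1)u\pi}{2M} \cos\frac{(2n+1)v\pi}{2N}$$
with the usual normalisation c(0) = 1/√2 and c(k) = 1 for k > 0.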
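As a minimal sketch, the 2D-DCT definition above can be evaluated directly. The normalisation c(0) = 1/√2 is the standard orthonormal choice and is assumed here; the function names are illustrative. Note that the H.264/AVC reference encoder uses an integer approximation of this transform, which the sketch does not reproduce.

```python
import math


def c(k: int) -> float:
    """Normalisation factor: 1/sqrt(2) for the zero frequency, 1 otherwise."""
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0


def dct2(block):
    """Direct evaluation of F_uv for an M x N block (list of rows)."""
    M, N = len(block), len(block[0])
    scale = 2.0 / math.sqrt(M * N)  # equals sqrt(2/M) * sqrt(2/N)
    out = [[0.0] * N for _ in range(M)]
    for u in range(M):
        for v in range(N):
            acc = 0.0
            for m in range(M):
                for n in range(N):
                    acc += (block[m][n]
                            * math.cos((2 * m + 1) * u * math.pi / (2 * M))
                            * math.cos((2 * n + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale * c(u) * c(v) * acc
    return out


if __name__ == "__main__":
    flat = [[10.0] * 4 for _ in range(4)]
    coeffs = dct2(flat)
    # A flat block has all of its energy in the DC coefficient.
    print(round(coeffs[0][0], 6))
```

For a flat 4x4 block every AC coefficient vanishes, which is the reason the DC term is treated separately from the AC terms in what follows.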
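• Thus, the selected mode is the one that produces the least residue energy:
$$\arg\min_{\mathrm{mode}} \sum_{m=0}^{3} \sum_{n=0}^{3} \left| T\{ B(m,n) - P_{\mathrm{mode}}(m,n) \} \right|$$
$$= \arg\min_{\mathrm{mode}} \left( \left| DC_{B} - DC_{P|\mathrm{mode}} \right| + \sum_{i=1}^{15} \left| AC_{B}(i) - AC_{P|\mathrm{mode}}(i) \right| \right)$$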
• We separate the low frequency spectrum, DC, from other AC frequencies.
• DC and low-frequency AC coefficients contain more energy than the high-frequency coefficients.
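The criterion above can be sketched as follows: transform the original block and each predicted block, then add the DC difference and the 15 AC differences. The floating-point 4x4 DCT, the function names and the two toy predictions are assumptions made for illustration; this is not the JM implementation.

```python
import math

SIZE = 4
# Orthonormal 4x4 DCT-II matrix: C[u][m] = sqrt(2/4) * c(u) * cos((2m+1) u pi / 8)
C = [[math.sqrt(2.0 / SIZE) * (1.0 / math.sqrt(2.0) if u == 0 else 1.0)
      * math.cos((2 * m + 1) * u * math.pi / (2 * SIZE)) for m in range(SIZE)]
     for u in range(SIZE)]


def dct4x4(x):
    """Separable 2D-DCT: T{x} = C * x * C^T."""
    tmp = [[sum(C[u][m] * x[m][n] for m in range(SIZE)) for n in range(SIZE)]
           for u in range(SIZE)]
    return [[sum(tmp[u][n] * C[v][n] for n in range(SIZE)) for v in range(SIZE)]
            for u in range(SIZE)]


def dct_sad(block, pred):
    """|DC_B - DC_P| + sum over the 15 AC positions of |AC_B(i) - AC_P(i)|."""
    tb, tp = dct4x4(block), dct4x4(pred)
    dc = abs(tb[0][0] - tp[0][0])
    ac = sum(abs(tb[u][v] - tp[u][v])
             for u in range(SIZE) for v in range(SIZE) if (u, v) != (0, 0))
    return dc + ac


def select_mode(block, predictions):
    """predictions maps mode -> predicted 4x4 block; return the arg-min mode."""
    return min(predictions, key=lambda mode: dct_sad(block, predictions[mode]))


if __name__ == "__main__":
    block = [[10.0, 20.0, 30.0, 40.0] for _ in range(4)]   # columns are constant
    predictions = {
        0: [[10.0, 20.0, 30.0, 40.0] for _ in range(4)],    # toy "vertical" guess
        1: [[25.0] * 4 for _ in range(4)],                  # toy flat guess
    }
    print(select_mode(block, predictions))
```

Because the transform is linear, the coefficient-wise difference used here equals the transform of the residue B − P, so both forms of the criterion give the same ranking.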
Task: Develop an efficient approach to calculate $DC_{P|\mathrm{mode}}$ and $AC_{P|\mathrm{mode}}(i)$
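[Figure: the 4x4 block of samples a to p and its neighbouring samples A to Q; the 17 neighbours are stacked into a 17x1 column vector N = (A, B, C, …, O, P, Q)^T]
A 9x17 conversion matrix maps N to the DC coefficients of the predicted block for all nine modes:
$$
\begin{bmatrix} DC_{P|\mathrm{mode}\,0} \\ DC_{P|\mathrm{mode}\,1} \\ \vdots \\ DC_{P|\mathrm{mode}\,7} \\ DC_{P|\mathrm{mode}\,8} \end{bmatrix}_{9 \times 1}
=
\begin{bmatrix}
\omega_{1,1} & \omega_{1,2} & \cdots & \omega_{1,17} \\
\omega_{2,1} & & & \omega_{2,17} \\
\vdots & & & \vdots \\
\omega_{8,1} & & & \omega_{8,17} \\
\omega_{9,1} & \omega_{9,2} & \cdots & \omega_{9,17}
\end{bmatrix}_{9 \times 17}
\cdot
\begin{bmatrix} A \\ B \\ C \\ \vdots \\ O \\ P \\ Q \end{bmatrix}_{17 \times 1}
$$
In compact form, for the DC coefficient and for each AC coefficient:
$$\mathbf{DC}_{P} = \tilde{\Omega}_{DC} \cdot \mathbf{N}, \qquad \mathbf{AC}_{P}(i) = \tilde{\Omega}_{AC(i)} \cdot \mathbf{N}$$
Thus, the selection process,
$$\arg\min_{\mathrm{mode}} \sum_{m=0}^{3} \sum_{n=0}^{3} \left| T\{ B(m,n) - P_{\mathrm{mode}}(m,n) \} \right| = \arg\min_{\mathrm{mode}} \left( \left| DC_{B} - DC_{P|\mathrm{mode}} \right| + \sum_{i=1}^{15} \left| AC_{B}(i) - AC_{P|\mathrm{mode}}(i) \right| \right)$$
becomes
$$\arg\min_{\mathrm{row}} \left( \left| DC_{B} \, \mathbf{I} - \tilde{\Omega}_{DC} \, \mathbf{N} \right| + \sum_{i=1}^{15} \left| AC_{B}(i) \, \mathbf{I} - \tilde{\Omega}_{AC(i)} \, \mathbf{N} \right| \right)$$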
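Conversion matrix $\tilde{\Omega}$ for AC(1,0) (rows: modes, columns: neighbouring samples):
       |    A |    B |    C |    D |    E |    F |    G |    H |    I |    J |    K |    L |    M |    N |    O |    P |    Q
Mode 0 |  1.3 |  0.5 | -0.5 | -1.3 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0
Mode 1 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0
Mode 2 |    0 |  0.1 |  0.2 |  0.1 |    0 | -0.1 | -0.2 | -0.1 |    0 |  0.1 |  0.2 |  0.1 |    0 | -0.1 | -0.2 | -0.1 |    0
Mode 3 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0 |    0
Mode 4 |  0.2 | -0.5 | -0.7 | -0.2 |    0 |    0 |    0 |    0 |  0.4 |  0.2 |  0.1 |    0 |    0 |    0 |    0 |    0 |  0.5
Mode 5 |  0.1 |  0.5 |  0.5 | -0.2 | -0.7 | -0.4 |  0.1 |    0 |    0 |    0 |  0.1 |  0.1 |    0 |    0 |    0 |    0 |    0
Mode 6 |    0 |    0 |  0.1 |  0.1 |    0 |    0 |    0 |    0 |  0.1 |  0.2 |  0.1 |    0 | -0.2 | -0.3 | -0.1 |    0 |    0
Mode 7 | -0.3 | -0.4 | -0.3 | -0.1 |    0 |    0 |    0 |    0 |  0.3 |  0.4 |  0.3 |  0.1 |    0 |    0 |    0 |    0 |    0
Mode 8 | -0.2 | -0.2 | -0.1 |    0 |    0 |    0 |    0 |    0 |    0 |  0.1 |  0.3 |  0.2 |    0 |    0 |    0 |    0 | -0.1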
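A row of such a conversion matrix can be derived by exploiting linearity: if a predictor only copies or averages neighbours, each DCT coefficient of the predicted block is a fixed weighted sum of the neighbours, so feeding unit vectors through the predictor and the transform yields the weights. The sketch below does this for a vertical and a horizontal predictor. The positions assumed for the neighbours (A to D above the block, I to L to its left), the floating-point DCT, the (u, v) index order and all function names are assumptions; the integer rounding used by the standard's predictors is ignored.

```python
import math

LABELS = "ABCDEFGHIJKLMNOPQ"  # 17 neighbours in the order of the table header


def predict_vertical(nb):
    """Every row copies the four samples above the block (assumed A to D)."""
    return [list(nb[0:4]) for _ in range(4)]


def predict_horizontal(nb):
    """Every column copies the four samples left of the block (assumed I to L)."""
    return [[nb[8 + m]] * 4 for m in range(4)]


def dct_coeff(block, u, v):
    """Single coefficient F_uv of the orthonormal 4x4 2D-DCT."""
    cu = 1.0 / math.sqrt(2.0) if u == 0 else 1.0
    cv = 1.0 / math.sqrt(2.0) if v == 0 else 1.0
    acc = sum(block[m][n]
              * math.cos((2 * m + 1) * u * math.pi / 8.0)
              * math.cos((2 * n + 1) * v * math.pi / 8.0)
              for m in range(4) for n in range(4))
    return 0.5 * cu * cv * acc


def omega_row(predict, u, v):
    """Weights w such that F_uv(predict(nb)) == sum(w[k] * nb[k])."""
    row = []
    for k in range(len(LABELS)):
        unit = [0.0] * len(LABELS)
        unit[k] = 1.0
        row.append(dct_coeff(predict(unit), u, v))
    return row


if __name__ == "__main__":
    # First horizontal-frequency AC coefficient (F_01 in this sketch's indexing).
    row = omega_row(predict_vertical, 0, 1)
    print({LABELS[k]: round(w, 1) for k, w in enumerate(row) if abs(w) > 1e-9})
```

For the vertical predictor this produces weights of magnitude about 1.3 and 0.5 on A to D and zero elsewhere, in line with the leading entries of the first row of the table above; the same coefficient is identically zero for the horizontal predictor.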
Compared with the 2D-DCT:
$$F_{uv} = \sqrt{\frac{2}{M}} \sqrt{\frac{2}{N}} \, c(u) \, c(v) \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} B(m,n) \cos\frac{(2m+1)u\pi}{2M} \cos\frac{(2n+1)v\pi}{2N}$$
General practice: apply the 1D-DCT twice, row-wise and column-wise.
General practice: 0.1 × (B + D + J + L − F − H − N − P) + 0.2 × (C + K − G − O), i.e. 2 multiplications + 11 additions.
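The operation count can be checked with a short sketch: grouping the neighbours that share a weight needs 2 multiplications and 11 additions or subtractions, whereas a plain 17-term dot product with the same weights needs 17 multiplications and 16 additions. The weights are read off the expression above; the function names and the sample values are illustrative.

```python
LABELS = "ABCDEFGHIJKLMNOPQ"

# Weight of each neighbour, taken from the expression above (0 if absent).
WEIGHTS = {label: 0.0 for label in LABELS}
WEIGHTS.update({k: 0.1 for k in "BDJL"})
WEIGHTS.update({k: -0.1 for k in "FHNP"})
WEIGHTS.update({k: 0.2 for k in "CK"})
WEIGHTS.update({k: -0.2 for k in "GO"})


def coeff_dot(nb):
    """Plain dot product: one multiplication per neighbour."""
    return sum(WEIGHTS[k] * nb[k] for k in LABELS)


def coeff_factored(nb):
    """Grouped form: 7 + 3 additions inside the groups, 2 multiplications, 1 final addition."""
    g1 = nb["B"] + nb["D"] + nb["J"] + nb["L"] - nb["F"] - nb["H"] - nb["N"] - nb["P"]
    g2 = nb["C"] + nb["K"] - nb["G"] - nb["O"]
    return 0.1 * g1 + 0.2 * g2


if __name__ == "__main__":
    # Illustrative neighbour values: A = 1, B = 2, ..., Q = 17.
    nb = {label: float(i + 1) for i, label in enumerate(LABELS)}
    print(coeff_dot(nb), coeff_factored(nb))
```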
Features of conversion matrix Ω
• A unique matrix exists for each position in the frequency domain.
• Independent of the video sequences.
• Low computational demand.
• The coefficients can be calculated and stored in advance.
We may select at least 2 modes to undergo the full Lagrangian process.
Match Percentage between the least residue energy and the least rate-distortion cost
[Figure: match percentage (%, 25 to 95) against number of candidates (1 to 4) for Akiyo (Class A), Foreman (Class B) and Mobile (Class C)]
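The idea of passing only a few candidates to the full Lagrangian process can be sketched as a two-stage search: rank all nine modes by the cheap frequency-domain cost, keep the best M, and run the expensive rate-distortion evaluation only on those. The cost values below are invented for illustration and the function names are hypothetical.

```python
def shortlist(cheap_cost, m):
    """Return the m modes with the smallest cheap cost, best first."""
    return sorted(cheap_cost, key=cheap_cost.get)[:m]


def choose_mode(cheap_cost, full_rd_cost, m=2):
    """Evaluate the full cost only for the shortlisted modes."""
    candidates = shortlist(cheap_cost, m)
    costs = {mode: full_rd_cost(mode) for mode in candidates}
    best = min(costs, key=costs.get)
    return best, candidates


# Hypothetical costs for the nine 4x4 intra modes of one block.
CHEAP = {0: 50.0, 1: 12.0, 2: 15.0, 3: 40.0, 4: 33.0,
         5: 61.0, 6: 48.0, 7: 57.0, 8: 29.0}
FULL_J = {0: 300.0, 1: 210.0, 2: 190.0, 3: 280.0, 4: 260.0,
          5: 330.0, 6: 295.0, 7: 320.0, 8: 250.0}

if __name__ == "__main__":
    for m in (1, 2, 3):
        best, tried = choose_mode(CHEAP, FULL_J.get, m)
        print(m, tried, best)
```

In this made-up case the mode with the least residue energy is not the one with the least rate-distortion cost, so a single candidate misses the best mode while two candidates recover it, at the price of one extra full evaluation.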
Simulations and Results
• Comparison of the results with the two best-performing algorithms reported in the literature.
• Simulation settings
– JM6.1e reference software in Microsoft Visual C++
– Various values of M (the number of candidate modes) are selected.
• Measurements: picture quality (PSNR, dB), bit rate (Kbits/sec) and entire encoding time (sec).
Simulation 1:
Compared with
C. Kim, H. Shih, and C. Kuo, “Fast H.264 Intra-prediction mode selection using joint spatial and transform domain features”, Journal of Visual Communication and Image Representation, vol. 17, pp. 291-310, 2006.
by selecting M=3
[Figure: rate-distortion curves for Foreman QCIF 30Hz, 300 frames, all intra-coding. Axes: rate (Kbits/sec, 0 to 1800) and PSNR (dB, 26 to 44). Curves: JM6.1e, Proposed, Result in [6]. The perception sensitivity area is marked.]
[Figure: close-up of the same curves over 300 to 420 Kbits/sec and 30 to 31.5 dB, with the gaps annotated as Δ1≈0.1dB and Δ2≈0.3dB]
Simulation 2:
Compared with
F. Pan, X. Lin, et al., “Efficient prediction mode selection 4x4 blocks in H.264,” JVT-013, ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, Mar. 2003, Pattaya, Thailand.
F. Pan, X. Lin, et al., “Fast mode decision algorithm for Intraprediction in H.264/AVC video coding,” IEEE Transactions on Circuits and Systems for Video Technology, pp. 813-823, Jul. 2005.
by selecting M=2
Sequences   | PSNR difference       | Bit rate difference   | Speed up c.f. JM6.1e
            | Pan et al  | Proposed | Pan et al  | Proposed | Pan et al  | Proposed
Bus (CIF)   | -0.22 dB   | -0.12 dB | 3.85 %     | 1.46 %   | 58.12 %    | 75.22 %
Coastguard  | -0.11 dB   | -0.06 dB | 2.36 %     | 1.68 %   | 55.03 %    | 72.48 %
Container   | -0.23 dB   | -0.10 dB | 3.70 %     | 0.77 %   | 56.36 %    | 71.86 %
Foreman     | -0.29 dB   | -0.07 dB | 4.44 %     | 1.97 %   | 65.38 %    | 72.47 %
News        | -0.29 dB   | -0.09 dB | 3.90 %     | 1.18 %   | 55.39 %    | 73.13 %
Paris (CIF) | -0.23 dB   | -0.09 dB | 3.21 %     | 1.47 %   | 57.78 %    | 73.43 %
Silent      | -0.18 dB   | -0.05 dB | 3.54 %     | 2.00 %   | 65.17 %    | 71.86 %
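To summarise the table, the short sketch below averages each column over the seven sequences. The figures are copied from the table; the plain arithmetic mean is a choice made here for illustration and is not a statistic reported in the talk.

```python
# (PSNR Pan, PSNR proposed, bit rate Pan, bit rate proposed,
#  speed-up Pan, speed-up proposed), copied from the table above.
ROWS = {
    "Bus (CIF)":   (-0.22, -0.12, 3.85, 1.46, 58.12, 75.22),
    "Coastguard":  (-0.11, -0.06, 2.36, 1.68, 55.03, 72.48),
    "Container":   (-0.23, -0.10, 3.70, 0.77, 56.36, 71.86),
    "Foreman":     (-0.29, -0.07, 4.44, 1.97, 65.38, 72.47),
    "News":        (-0.29, -0.09, 3.90, 1.18, 55.39, 73.13),
    "Paris (CIF)": (-0.23, -0.09, 3.21, 1.47, 57.78, 73.43),
    "Silent":      (-0.18, -0.05, 3.54, 2.00, 65.17, 71.86),
}
COLUMNS = ("PSNR Pan (dB)", "PSNR proposed (dB)", "bit rate Pan (%)",
           "bit rate proposed (%)", "speed-up Pan (%)", "speed-up proposed (%)")


def column_mean(index: int) -> float:
    """Arithmetic mean of one column over all sequences."""
    return sum(row[index] for row in ROWS.values()) / len(ROWS)


if __name__ == "__main__":
    for i, name in enumerate(COLUMNS):
        print(f"{name}: {column_mean(i):.2f}")
```

Averaged this way, the proposed method loses about 0.08 dB against about 0.22 dB for Pan et al, raises the bit rate by about 1.5 % against about 3.6 %, and saves about 72.9 % of the encoding time against about 59.0 %.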
Conclusions
• A fast algorithm using matrix formulae in the DCT domain is presented.
• The proposed Fintra algorithm does not need any a priori knowledge or magic numbers.
• The simulation results verify that the proposed algorithm outperforms the two other algorithms on all three measurements.
• The rate-distortion curves show that the proposed algorithm achieves almost the same coding performance (a PSNR loss of at most 0.12 dB in the sequences tested), yet reduces the computation requirement by up to 75%.
Future Work
• The proposed Finter algorithm has been developed to speed up the encoding of P- and B-frames.
• The Finter algorithm produces results competitive with other algorithms reported in the literature.
• The proposed Fintra and Finter algorithms have been integrated, giving a time saving of up to 86% without sacrificing either picture quality or bit-rate efficiency.
References
[1] Andy C. Yu, Ngan King Ngi, and Graham Martin, “Efficient Intra- and Inter-mode Selection Algorithms for H.264/AVC,” in Journal of Visual Communication and Image Representation, Special Issue on Emerging H.264/AVC Video Coding Standard, vol. 17, issue 2, pp. 322-343, Elsevier Press, Apr. 2006.
[2] Andy C. Yu, Graham Martin, and Heechan Park, “Improved Schemes for Inter-frame Coding in the H.264/AVC Standard,” in Proc. of 12th IEEE Conference on Image Processing (ICIP) 05, vol. 2, pp. 902-905, Genoa, Italy, Sep. 2005.
[3] Andy C. Yu, Graham Martin, and Heechan Park, “A Frequency Domain Approach to Intra Mode Selection in H.264/AVC,” in Proc. of 13th European Signal Processing Conference (EUSIPCO) 05, 4 pp., Antalya, Turkey, Sep. 2005.
[4] Andy C. Yu and Graham Martin, “Advanced Block Size Selection Algorithm for INTER frame Coding in H.264/AVC,” in Proc. of 11th IEEE International Conference on Image Processing (ICIP) 04, vol. 1, pp. 95-98, Singapore, Oct. 2004.
[5] Andy C. Yu, “Efficient Block Size Selection Algorithm for INTER frame Coding in H.264/AVC,” in Proc. of 29th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 04, vol. 3, pp. 169-172, Montreal, Canada, May 2004.
[6] F. Pan, X. Lin, et al., “Fast mode decision algorithm for Intraprediction in H.264/AVC video coding,” IEEE Transactions on Circuits and Systems for Video Technology, pp. 813-823, Jul. 2005.
[7] C. Kim, H. Shih, and C. Kuo, “Fast H.264 Intra-prediction mode selection using joint spatial and transform domain features,” Journal of Visual Communication and Image Representation, vol. 17, pp. 291-310, 2006.
[8] “Information technology – Coding of audio-visual objects – Part 10: Advanced Video Coding,” ISO/IEC 14496-10:2003, Dec. 2003.
[9] JVT reference software, JM6.1e, downloaded from http://bs.hhi.de/~suehring/
Q & A