A frequency domain approach for Intra-coding pictures. Presenter: Andy C. Yu, for Microsoft Research Cambridge, 16th May 2006


Outline:

• Introduction - H.264/AVC Intra coding v. Motion JPEG-2000 coding.

• The proposed Fintra algorithm – based on frequency domain

• Simulation results – comparison with state-of-the-art algorithms

• Conclusions

Introduction

• H.264/AVC was standardised jointly by ITU-T and ISO/IEC MPEG for commercial video coding; their Joint Video Team was formed in Dec. 2001.

• It represents a major step forward in the development of video standards*.

• It typically outperforms earlier standards by a factor of two in compression efficiency, i.e. about half the bit rate at comparable picture quality.

• Another important fact is that H.264/AVC is a public and open standard*.

* R. Schäfer, T. Wiegand and H. Schwarz, “The emerging H.264/AVC standard”, EBU Technical Review, Jan 2003

Conventional Intra-coding structure

[Block diagram: video sequence → decomposition into 8x8 blocks → 8x8 DCT → quantization process (lossy) → entropy coding]

H.264/AVC Intra-coding structure

[Block diagram: video sequence → decomposition into 4x4 blocks → mode selection → residue data (original block minus predicted block) → integer 4x4 DCT & quantisation process → entropy coding]

Intra-Prediction in H.264/AVC

Neighbouring samples: A B C D E F G H, Q, I J K L M N O P

Samples to be predicted (4x4 block):

a b c d
e f g h
i j k l
m n o p

A to Q: neighbouring samples that are already reconstructed at both the encoder and the decoder.

a to p: samples to be predicted.

Pictures from I. Richardson, H.264 and MPEG-4 Video Compression, Wiley, 1st edition, 2003.

Lagrangian Evaluation

• The final mode is selected to minimise the Lagrangian cost:

$$J(\mathrm{mode} \mid Qp) = D(\mathrm{mode} \mid Qp) + \lambda \cdot R(\mathrm{mode} \mid Qp)$$

• where D is the measure of the distortion, R is the number of bits, λ is the Lagrange parameter, and Qp is the quantisation factor.

Lagrangian evaluation (cont.)

$$\lambda = 0.85 \times 2^{(Qp - 12)/3}$$

[Plot: Lagrangian parameter λ (0 to 6000) against quantisation factor Qp (0 to 50)]
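The Lagrangian evaluation can be sketched in a few lines of Python. This is only an illustration of the two formulas on these slides, not the JM reference code; the function names and the `candidates` dictionary (mode index mapped to a distortion and bit-count pair) are hypothetical.

```python
def lagrange_parameter(qp: int) -> float:
    """Lagrange parameter from the slide: lambda = 0.85 * 2^((Qp - 12) / 3)."""
    return 0.85 * 2.0 ** ((qp - 12) / 3.0)


def lagrangian_cost(distortion: float, rate_bits: float, qp: int) -> float:
    """J(mode | Qp) = D(mode | Qp) + lambda * R(mode | Qp)."""
    return distortion + lagrange_parameter(qp) * rate_bits


def best_mode(candidates: dict, qp: int) -> int:
    """Return the mode with the smallest Lagrangian cost.

    `candidates` is a hypothetical mapping: mode index -> (distortion, rate_bits).
    """
    return min(candidates, key=lambda m: lagrangian_cost(*candidates[m], qp))
```

Because λ grows exponentially with Qp, the same pair of candidates can yield a different winner at a coarse quantiser than at a fine one: at high Qp the bit count dominates the cost.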

H.264 Intra-coding v. Motion JPEG2000 coding

Picture from J. Ostermann, J. Bormans et al., “Video Coding with H.264/AVC: Tools, Performance, and Complexity,” IEEE Circuits and Systems Magazine, first quarter, 2004

The Proposed Fast Intra-coding Algorithm (Fintra)


[Block diagram, repeated over two slides, with the blocks: video sequence, decomposition into 4x4 blocks, original block, predicted block, residue data, mode selection, entropy coding]

The search strategy for the proposed Fintra algorithm

• The proposed algorithm selects fewer modes to undergo full Lagrangian cost evaluation.

• The entire selection process operates in the discrete cosine transform (DCT) domain.

• Intra-coding generally produces larger residue block energy than inter-coding.

• It is observed that the modes with the least residue energy tend to give the minimum rate R and hence minimise the Lagrangian cost.
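The search strategy in the bullets above can be sketched as a two-stage selection. This is a minimal illustration, assuming the residue energy of every mode is already available; `residue_energy`, `full_rd_cost` and the default `m=2` are hypothetical names and values, not taken from the Fintra implementation.

```python
def select_candidates(residue_energy, m=2):
    """Return the m modes with the smallest residue energy."""
    return sorted(residue_energy, key=residue_energy.get)[:m]


def fast_mode_decision(residue_energy, full_rd_cost, m=2):
    """Run the expensive Lagrangian cost only on the m shortlisted modes.

    residue_energy: dict mapping mode index -> cheap energy measure.
    full_rd_cost:   callable returning the full Lagrangian cost of a mode.
    """
    shortlist = select_candidates(residue_energy, m)
    return min(shortlist, key=full_rd_cost)
```

With nine 4x4 modes and m = 2, the full cost function is called twice per block instead of nine times, which is where the time saving would come from.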

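Algorithm formulation

• The selection criterion can be measured from the SAD of the residue block in the discrete cosine transform (DCT) domain:

$$\mathrm{SAD}_{\mathrm{DCT(residue)}} = \sum_{m=0}^{3} \sum_{n=0}^{3} \left| T\{ B(m,n) - P_{\mathrm{mode}}(m,n) \} \right|$$

The definition of the 2D-DCT is:

$$F_B(u,v) = \frac{2}{\sqrt{MN}}\, c(u)\, c(v) \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} B(m,n) \cos\frac{(2m+1)u\pi}{2M} \cos\frac{(2n+1)v\pi}{2N}$$

where c(0) = 1/√2 and c(k) = 1 for k > 0.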

• Thus, the selected mode is the one that produces the least residue energy:

$$\arg\min_{\mathrm{mode}} \sum_{m=0}^{3} \sum_{n=0}^{3} \left| T\{ B(m,n) - P_{\mathrm{mode}}(m,n) \} \right| = \arg\min_{\mathrm{mode}} \Big\{ \left| DC_B - DC_{P|\mathrm{mode}} \right| + \sum_{i=1}^{15} \left| AC_B(i) - AC_{P|\mathrm{mode}}(i) \right| \Big\}$$

• We separate the lowest-frequency component, DC, from the other (AC) frequencies.

• DC and low-frequency AC coefficients contain more energy than the high-frequency coefficients.

Task: develop an efficient approach to calculate $DC_{P|\mathrm{mode}}$ and $AC_{P|\mathrm{mode}}(i)$.
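The criterion can be tried out directly before any speed-up is applied. The sketch below evaluates the 2D-DCT as defined on the slide in floating point and takes the SAD of the transformed residue. It is a straightforward baseline under stated assumptions: H.264/AVC itself uses an integer 4x4 transform, and the helper names (`dct2`, `sad_dct_residue`, `choose_mode`) are hypothetical.

```python
import math


def dct2(block):
    """2D-DCT of an M x N block (list of rows), following the slide's definition."""
    M, N = len(block), len(block[0])

    def c(k):
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(M)]
    for u in range(M):
        for v in range(N):
            s = 0.0
            for m in range(M):
                for n in range(N):
                    s += (block[m][n]
                          * math.cos((2 * m + 1) * u * math.pi / (2 * M))
                          * math.cos((2 * n + 1) * v * math.pi / (2 * N)))
            out[u][v] = 2.0 / math.sqrt(M * N) * c(u) * c(v) * s
    return out


def sad_dct_residue(original, predicted):
    """Sum of absolute DCT coefficients of the residue (original - predicted)."""
    residue = [[o - p for o, p in zip(ro, rp)]
               for ro, rp in zip(original, predicted)]
    return sum(abs(x) for row in dct2(residue) for x in row)


def choose_mode(original, predictions):
    """Pick the mode whose predicted block gives the least DCT-domain SAD.

    `predictions` is a hypothetical dict: mode index -> predicted 4x4 block.
    """
    return min(predictions,
               key=lambda mode: sad_dct_residue(original, predictions[mode]))
```

This baseline transforms one residue block per mode, which is exactly the cost the task statement above asks to avoid.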

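The 17 neighbouring samples A to Q are stacked into a 17x1 vector:

$$N = [A, B, C, \dots, O, P, Q]^{T}$$

A 9x17 conversion matrix with entries $\omega_{i,j}$ maps $N$ to the 9x1 vector of predicted DC coefficients, one per mode:

$$\begin{bmatrix} DC_{P|\mathrm{mode0}} \\ DC_{P|\mathrm{mode1}} \\ \vdots \\ DC_{P|\mathrm{mode7}} \\ DC_{P|\mathrm{mode8}} \end{bmatrix}_{9 \times 1} = \tilde{\Omega}_{DC} \cdot N, \qquad \tilde{\Omega}_{DC} = [\omega_{i,j}]_{9 \times 17}$$

In compact form, for the DC coefficient and for each AC coefficient:

$$DC_P = \tilde{\Omega}_{DC} \cdot N, \qquad AC_P(i) = \tilde{\Omega}_{AC(i)} \cdot N$$

Thus, the selection process,

$$\arg\min_{\mathrm{mode}} \sum_{m=0}^{3} \sum_{n=0}^{3} \left| T\{ B(m,n) - P_{\mathrm{mode}}(m,n) \} \right| = \arg\min_{\mathrm{mode}} \Big\{ \left| DC_B - DC_{P|\mathrm{mode}} \right| + \sum_{i=1}^{15} \left| AC_B(i) - AC_{P|\mathrm{mode}}(i) \right| \Big\}$$

becomes

$$\arg\min_{\mathrm{row}} \Big\{ \left| I \cdot DC_B - \tilde{\Omega}_{DC} \cdot N \right| + \sum_{i=1}^{15} \left| I \cdot AC_B(i) - \tilde{\Omega}_{AC(i)} \cdot N \right| \Big\}$$

Here each row corresponds to one mode, the absolute values are taken element by element, and $I$ must be a 9x1 vector of ones for the dimensions to agree.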
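Because each predictor is a linear function of the neighbouring samples, a conversion matrix can be built once by feeding unit vectors through the predictor and the DCT. The sketch below does this for three modes only. Everything about the layout is an assumption made for illustration: A to D are taken to lie above the block and I to L to its left, the DC mode ignores rounding, and the names `predict`, `dct_coeff`, `conversion_matrix` and `apply_matrix` are hypothetical.

```python
import math

NAMES = "ABCDEFGHIJKLMNOPQ"  # assumed ordering of the 17x1 neighbour vector N


def predict(mode, nb):
    """Simplified 4x4 predictors (assumed layout: A-D above, I-L to the left)."""
    top, left = nb[0:4], nb[8:12]
    if mode == 0:                       # vertical: copy the row above
        return [list(top) for _ in range(4)]
    if mode == 1:                       # horizontal: copy the left column
        return [[left[m]] * 4 for m in range(4)]
    mean = (sum(top) + sum(left)) / 8.0  # DC mode, rounding ignored
    return [[mean] * 4 for _ in range(4)]


def dct_coeff(block, u, v):
    """Single 4x4 2D-DCT coefficient F(u, v); u indexes rows, v columns."""
    def c(k):
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0
    s = sum(block[m][n]
            * math.cos((2 * m + 1) * u * math.pi / 8)
            * math.cos((2 * n + 1) * v * math.pi / 8)
            for m in range(4) for n in range(4))
    return 0.5 * c(u) * c(v) * s


def conversion_matrix(u, v, modes=(0, 1, 2)):
    """One row per mode, one column per neighbour; independent of the video."""
    rows = []
    for mode in modes:
        row = []
        for j in range(len(NAMES)):
            unit = [0.0] * len(NAMES)
            unit[j] = 1.0               # response to a single neighbour
            row.append(dct_coeff(predict(mode, unit), u, v))
        rows.append(row)
    return rows


def apply_matrix(matrix, nb):
    """Matrix-vector product: predicted coefficient for every mode at once."""
    return [sum(w * x for w, x in zip(row, nb)) for row in matrix]
```

Under these assumptions, the vertical-mode row for coefficient (u=0, v=1) has weights of roughly 1.3, 0.5, -0.5 and -1.3 on A to D and zero elsewhere, so a predicted coefficient costs a handful of multiply-adds instead of a prediction followed by a transform.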

Conversion matrix $\tilde{\Omega}_{AC(1,0)}$ (rows: prediction modes; columns: neighbouring samples A to Q)

Mode     A     B     C     D     E     F     G     H     I     J     K     L     M     N     O     P     Q
0      1.3   0.5  -0.5  -1.3     0     0     0     0     0     0     0     0     0     0     0     0     0
1        0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
2        0   0.1   0.2   0.1     0  -0.1  -0.2  -0.1     0   0.1   0.2   0.1     0  -0.1  -0.2  -0.1     0
3        0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
4      0.2  -0.5  -0.7  -0.2     0     0     0     0   0.4   0.2   0.1     0     0     0     0     0   0.5
5      0.1   0.5   0.5  -0.2  -0.7  -0.4   0.1     0     0     0   0.1   0.1     0     0     0     0     0
6        0     0   0.1   0.1     0     0     0     0   0.1   0.2   0.1     0  -0.2  -0.3  -0.1     0     0
7     -0.3  -0.4  -0.3  -0.1     0     0     0     0   0.3   0.4   0.3   0.1     0     0     0     0     0
8     -0.2  -0.2  -0.1     0     0     0     0     0     0   0.1   0.3   0.2     0     0     0     0  -0.1

How does this compare with the 2D-DCT?

$$F_B(u,v) = \frac{2}{\sqrt{MN}}\, c(u)\, c(v) \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} B(m,n) \cos\frac{(2m+1)u\pi}{2M} \cos\frac{(2n+1)v\pi}{2N}$$

General practice: apply the 1D-DCT twice, row-wise and column-wise.


General practice: 0.1 × (B + D + J + L - F - H - N - P) + 0.2 × (C + K - G - O), i.e. 2 multiplications + 11 additions.
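The operation count in the expression above comes from grouping samples that share a weight. The following sketch compares a plain 17-term dot product with the grouped form; the weights are copied from that expression, while the names `ROW`, `dense_dot` and `factored` are hypothetical.

```python
NAMES = "ABCDEFGHIJKLMNOPQ"

# Non-zero weights of the example row; all other samples have weight 0.
ROW = {"B": 0.1, "D": 0.1, "J": 0.1, "L": 0.1,
       "F": -0.1, "H": -0.1, "N": -0.1, "P": -0.1,
       "C": 0.2, "K": 0.2, "G": -0.2, "O": -0.2}


def dense_dot(nb):
    """Plain dot product over all 17 neighbours (17 multiplications)."""
    weights = [ROW.get(name, 0.0) for name in NAMES]
    return sum(w * x for w, x in zip(weights, nb))


def factored(nb):
    """Grouped form: 2 multiplications and 11 additions/subtractions."""
    s = dict(zip(NAMES, nb))
    return (0.1 * (s["B"] + s["D"] + s["J"] + s["L"]
                   - s["F"] - s["H"] - s["N"] - s["P"])
            + 0.2 * (s["C"] + s["K"] - s["G"] - s["O"]))
```

Both functions return the same value up to floating-point rounding; only the number of arithmetic operations differs.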

Features of conversion matrix Ω

• A unique conversion matrix exists for each coefficient position in the frequency domain.

• Independent of the video sequences.

• Low computational demand.

• The coefficients can be calculated and stored in advance.

We may select at least 2 candidate modes to undergo the full Lagrangian evaluation.

[Plot: match percentage (%) between the least residue energy and the least rate-distortion cost (axis 25 to 95) against the number of candidates (1 to 4), for Akiyo (Class A), Foreman (Class B) and Mobile (Class C)]

Simulations and Results

• Comparison of the results with the two best-performing algorithms detailed in the literature.

• Simulation settings:
– JM6.1e reference software, compiled with Microsoft Visual C++
– Various values of M (the number of candidate modes) are selected.

• Measurements: picture quality (PSNR, dB), bit rate (Kbits/sec) and total encoding time (sec).

Simulation 1:

Compared with

C. Kim, H. Shih, and C. Kuo, “Fast H.264 Intra-prediction mode selection using joint spatial and transform domain features”, Journal of Visual Communication and Image Representation, vol. 17, pp. 291-310, 2006.

by selecting M=3

Foreman, QCIF, 30 Hz, 300 frames, all Intra-coding

[Rate-distortion plot: PSNR (dB, 26 to 44) against rate (Kbits/sec, 0 to 1800); legend: JM6.1e, Proposed, Result in [6]. A zoom on the perception sensitivity area (300 to 420 Kbits/sec, 30 to 31.5 dB) marks two gaps, Δ1 ≈ 0.1 dB and Δ2 ≈ 0.3 dB.]

Simulation 2:

Compared with

F. Pan, X. Lin, et al., “Efficient prediction mode selection 4x4 blocks in H.264,” JVT-013, ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, Mar. 2003, Pattaya, Thailand.

F. Pan, X. Lin, et al., “Fast mode decision algorithm for Intraprediction in H.264/AVC video coding,” IEEE Transactions on Circuits and Systems for Video Technology, pp. 813-823, Jul. 2005.

by selecting M=2

Sequence      PSNR difference         Bit rate difference     Speed-up c.f. JM6.1e
              Pan et al.  Proposed    Pan et al.  Proposed    Pan et al.  Proposed
Bus (CIF)     -0.22 dB    -0.12 dB    3.85 %      1.46 %      58.12 %     75.22 %
Coastguard    -0.11 dB    -0.06 dB    2.36 %      1.68 %      55.03 %     72.48 %
Container     -0.23 dB    -0.10 dB    3.70 %      0.77 %      56.36 %     71.86 %
Foreman       -0.29 dB    -0.07 dB    4.44 %      1.97 %      65.38 %     72.47 %
News          -0.29 dB    -0.09 dB    3.90 %      1.18 %      55.39 %     73.13 %
Paris (CIF)   -0.23 dB    -0.09 dB    3.21 %      1.47 %      57.78 %     73.43 %
Silent        -0.18 dB    -0.05 dB    3.54 %      2.00 %      65.17 %     71.86 %
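For a quick summary of the table, the per-sequence figures can be averaged. The snippet below only does arithmetic on the numbers listed above; the averages are derived here and are not quoted on the slides, and the names `RESULTS` and `column_means` are hypothetical.

```python
from statistics import mean

# Per sequence: (PSNR diff in dB, bit-rate diff in %, speed-up in %)
# for (Pan et al., proposed), copied from the table above.
RESULTS = {
    "Bus (CIF)":   ((-0.22, 3.85, 58.12), (-0.12, 1.46, 75.22)),
    "Coastguard":  ((-0.11, 2.36, 55.03), (-0.06, 1.68, 72.48)),
    "Container":   ((-0.23, 3.70, 56.36), (-0.10, 0.77, 71.86)),
    "Foreman":     ((-0.29, 4.44, 65.38), (-0.07, 1.97, 72.47)),
    "News":        ((-0.29, 3.90, 55.39), (-0.09, 1.18, 73.13)),
    "Paris (CIF)": ((-0.23, 3.21, 57.78), (-0.09, 1.47, 73.43)),
    "Silent":      ((-0.18, 3.54, 65.17), (-0.05, 2.00, 71.86)),
}


def column_means(index):
    """Average each measurement over all sequences (0: Pan et al., 1: proposed)."""
    columns = zip(*(row[index] for row in RESULTS.values()))
    return tuple(round(mean(col), 2) for col in columns)


pan_avg = column_means(0)
proposed_avg = column_means(1)
```

Each tuple holds the mean PSNR difference, mean bit-rate difference and mean speed-up, in that order.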

Conclusions

• A fast algorithm using matrix formulae in the DCT domain is presented.

• The proposed Fintra algorithm does not need any a priori knowledge or magic numbers.

• The simulation results verify that the proposed algorithm outperforms the two other algorithms in terms of all measurements.

• The rate-distortion curves show that the proposed algorithm achieves nearly the same coding performance (a PSNR loss of about 0.1 dB), yet reduces the computation requirement by up to 75%.

Future Work

• The proposed Finter algorithm has been developed to speed up the encoding of P- and B-frames.

• The Finter algorithm produces results competitive with other algorithms reported in the literature.

• The proposed Fintra and Finter algorithms have been integrated, attaining a time saving of up to 86% without sacrificing either picture quality or bit-rate efficiency.

References

[1] Andy C. Yu, Ngan King Ngi, and Graham Martin, “Efficient Intra- and Inter-mode Selection Algorithms for H.264/AVC,” Journal of Visual Communication and Image Representation – Special Issue on Emerging H.264/AVC Video Coding Standard, vol. 17, issue 2, pp. 322-343, Elsevier Press, Apr. 2006.

[2] Andy C. Yu, Graham Martin, and Heechan Park, “Improved Schemes for Inter-frame Coding in the H.264/AVC Standard,” in Proc. of 12th IEEE Conference on Image Processing (ICIP) 05, vol. 2, pp. 902-905, Genoa, Italy, Sep. 2005.

[3] Andy C. Yu, Graham Martin, and Heechan Park, “A Frequency Domain Approach to Intra Mode Selection in H.264/AVC,” in Proc. of 13th European Signal Processing Conference (EUSIPCO) 05, 4 pp., Antalya, Turkey, Sep. 2005.

[4] Andy C. Yu and Graham Martin, “Advanced Block Size Selection Algorithm for INTER frame Coding in H.264/AVC,” in Proc. of 11th IEEE International Conference on Image Processing (ICIP) 04, vol. 1, pp. 95-98, Singapore, Oct. 2004.

[5] Andy C. Yu, “Efficient Block Size Selection Algorithm for INTER frame Coding in H.264/AVC,” in Proc. of 29th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 04, vol. 3, pp. 169-172, Montreal, Canada, May 2004.

[6] F. Pan, X. Lin, et al., “Fast mode decision algorithm for Intraprediction in H.264/AVC video coding,” IEEE Transactions on Circuits and Systems for Video Technology, pp. 813-823, Jul. 2005.

[7] C. Kim, H. Shih, and C. Kuo, “Fast H.264 Intra-prediction mode selection using joint spatial and transform domain features,” Journal of Visual Communication and Image Representation, vol. 17, pp. 291-310, 2006.

[8] __, “Information technology – Coding of audio-visual objects – Part 10: Advanced Video Coding,” ISO/IEC 14496-10:2003, Dec. 2003.

[9] JVT reference software, JM6.1e, downloaded from http://bs.hhi.de/~suehring/

Q & A
