Overview of the Scalable Video Coding Extension of the H.264/AVC Standard Heiko Schwarz, Detlev Marpe, and Thomas Wiegand CSVT, Sept. 2007 2009/5 MC2008, VCLAB 1

Overview of the Scalable Video Coding Extension of the H.264/AVC Standard


Page 1: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Overview of the Scalable Video Coding Extension of the H.264/AVC Standard
Heiko Schwarz, Detlev Marpe, and Thomas Wiegand

CSVT, Sept. 2007

2009/5 MC2008, VCLAB 1

Page 2: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Outline

Introduction
  • Problems
  • Definition
  • Functionality
  • Goal
  • Competition
  • Applications
  • Targets
History of SVC
Structure of SVC
Temporal Scalability
Spatial Scalability
Quality Scalability
Combined Scalability
Profiles of SVC
Conclusions

Page 3: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Introduction - problem

Non-scalable video streaming
  • Multiple video streams are needed for heterogeneous clients

[Figure: separate streams at 8 Mb/s, 6 Mb/s, 4 Mb/s, 1 Mb/s, and 512 Kb/s for different clients]

Page 4: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Introduction - definition

Scalable video stream

Scalability
  • Removal of parts of the video bit-stream to adapt to the various needs of end users and to varying terminal capabilities or network conditions

[Figure: a scalable video stream made of sub-streams 1, 2, …, n; a subset of sub-streams k1, k2, …, ki is used for reconstruction, ranging from low quality to high quality]

Page 5: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Introduction - functionality

Functionality of SVC
  • Graceful degradation when the “right” parts of the bit-stream are lost
  • Bit-rate adaptation to match the channel throughput
  • Format adaptation for backwards-compatible extension
  • Power adaptation for a trade-off between runtime and quality

Page 6: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Introduction - goal

Goal of SVC

Scalability mode
  • Fidelity reduction (SNR scalability)
  • Picture size reduction (spatial scalability)
  • Frame rate reduction (temporal scalability)
  • Sharpness reduction (frequency scalability)
  • Selection of content (ROI or object-based scalability)

[Figure: sub-streams k1, k2, …, ki = H.264/AVC bit-stream (in quality)]

Page 7: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Introduction - competition

Scalable video coding is an old research topic (> 20 years) and has been included in H.262/MPEG-2, H.263, and MPEG-4 Visual.
Rarely used because of
  • The characteristics of traditional video transmission systems
  • Significant loss of coding efficiency and large increase in decoder complexity
Competition
  • Simulcast
  • Transcoding

Page 8: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Introduction - applications

Applications
  • Heterogeneous clients
  • Unequal protection
  • Surveillance
Problems
  • Increased decoder complexity
  • Decreased coding efficiency
  • Temporal scalability is more often supported than spatial and quality scalability.

Page 9: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Introduction - targets

Targets
  • Little decrease in coding efficiency
  • Little increase in decoding complexity
  • Support of temporal, spatial, and quality scalability
  • A backward compatible base layer
  • Simple bit-stream adaptations after encoding

Page 10: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

History of SVC

October 2003: MPEG issues a call for proposals on Scalable Video Coding
  • 12 wavelet-based proposals
  • 2 extensions of H.264/AVC

~October 2004: MSRA vs. HHI proposal (wavelet-based vs. H.264 extension)

October 2004: HHI proposal adopted as starting point (due to reduced encoder and decoder complexity and improvements in coding efficiency)

January 2005: MPEG and VCEG agree to jointly finalize the SVC project as an Amendment of H.264/AVC

Spring 2007: Finalization

Page 11: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Structure of SVC

[Figure: SVC coder structure. The input is spatially decimated; each layer passes through temporal scalable coding, prediction, base layer coding, and SNR scalable coding; the outputs are combined by a multiplexer]

Page 12: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Outline

Introduction
History of SVC
Structure of SVC
Temporal Scalability
  • Hierarchical prediction structure
Spatial Scalability
Quality Scalability
Combined Scalability
Profiles of SVC
Conclusions

Page 13: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Temporal Scalability

Hierarchical prediction structures

[Figure: three prediction structures over numbered pictures within a GOP: hierarchical B pictures, non-dyadic hierarchical prediction, and hierarchical prediction with zero delay]
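For the dyadic case (hierarchical B pictures), the temporal layer of each picture follows from its position inside the GOP. The sketch below is illustrative only: the function names `temporal_layer` and `extract`, and the convention that layer 0 holds the key pictures at multiples of the GOP size, are assumptions made for this example and are not taken from the slides or from JSVM.

```python
def temporal_layer(poc: int, gop_size: int) -> int:
    """Temporal layer of a picture in a dyadic hierarchical GOP.

    poc is the picture index in display order (assumed to start at 0),
    gop_size is the distance between key pictures (a power of two).
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    levels = gop_size.bit_length() - 1      # log2(gop_size)
    pos = poc % gop_size
    if pos == 0:
        return 0                            # key picture
    trailing_zeros = (pos & -pos).bit_length() - 1
    return levels - trailing_zeros


def extract(pocs, gop_size: int, max_layer: int):
    """Keep only pictures whose temporal layer does not exceed max_layer."""
    return [p for p in pocs if temporal_layer(p, gop_size) <= max_layer]


# Layers for one GOP of size 8 plus the next key picture.
layers = [temporal_layer(p, 8) for p in range(9)]
# Dropping every layer above 1 quarters the frame rate.
quarter_rate = extract(range(17), 8, 1)
```

Dropping the highest layer halves the frame rate each time, which is why no picture in a lower layer may reference a picture in a higher one.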

Page 14: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Temporal Scalability

Combination with multiple reference pictures
  • Arbitrary modification of the prediction structure
Issue of quantization
  • Lower layers with higher fidelity: smaller QPs are used in lower layers
  • Propagation of quantization error: lower layers are references for higher layers, so larger QPs are used in higher layers

Page 15: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Temporal Scalability

[Figure: prediction structures for N = 1, 2, 4, and 8, built from I, P, B0, B1, and B2 pictures, illustrating temporal scalability]

Video coding experiment with H.264/MPEG4-AVC
  • Foreman, CIF 30 Hz @ 1320 kbps
  • Performance as a function of N

Cascaded QP assignment
  • QP(P) = QP(B0) - 3 = QP(B1) - 4 = QP(B2) - 5

This slide is copied from JVT-W132-Talk
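Reading the cascaded QP assignment on this slide as QP(P) = QP(B0) - 3 = QP(B1) - 4 = QP(B2) - 5, the per-picture QP can be sketched as a fixed offset on top of the QP of the P pictures. The table name `QP_OFFSET`, the function name `cascaded_qp`, and the clipping to the 0..51 range of 8-bit H.264/AVC are assumptions added for this example.

```python
# Offsets relative to QP(P), as read from the slide (assumed interpretation).
QP_OFFSET = {"P": 0, "B0": 3, "B1": 4, "B2": 5}


def cascaded_qp(qp_p: int, picture_type: str) -> int:
    """QP for a picture type, given the QP used for P pictures."""
    qp = qp_p + QP_OFFSET[picture_type]
    return max(0, min(51, qp))  # assumed clipping to the 8-bit QP range


# QPs for one hierarchy level each, starting from QP(P) = 30.
example = {t: cascaded_qp(30, t) for t in ("P", "B0", "B1", "B2")}
```

Pictures higher in the hierarchy are referenced by fewer pictures, so a coarser quantizer there costs less overall quality.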

Page 16: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Temporal Scalability

Coding efficiency of hierarchical prediction
  • JSVM11, High profile with CABAC
  • Only one reference frame

[Figure: coding efficiency results for CIF]

Page 17: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Temporal Scalability

Compared with IPPP (With and without delay constraint)

Providing temporal scalability usually doesn’t have any negative impact on coding efficiency


Page 18: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Outline

Introduction
History of SVC
Structure of SVC
Temporal Scalability
Spatial Scalability
  • Inter-layer prediction
  • Inter-layer motion prediction
  • Inter-layer residual prediction
  • Inter-layer intra prediction
Quality Scalability
Combined Scalability
Profiles of SVC
Conclusions

Page 19: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Spatial Scalability

[Figure: SVC encoder with three spatial layers. The input is spatially decimated for the lower layers. The lowest layer uses H.264/AVC MCP and intra-prediction in an H.264/AVC compatible coder; the higher layers use hierarchical MCP and intra-prediction. Each layer feeds texture and motion into base layer coding, with inter-layer prediction (intra, motion, residual) between layers. A multiplexer produces the scalable bit-stream, which contains an H.264/AVC compatible base layer bit-stream]

Page 20: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Spatial Scalability

  • Similar to MPEG-2, H.263, and MPEG-4
  • Arbitrary resolution ratio
  • The same coding order in all spatial layers
  • Combination with temporal scalability
  • Inter-layer prediction

[Figure: pictures of spatial layers 0 and 1 at temporal levels 0, 1, and 2, with intra pictures marked]

Page 21: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Spatial Scalability

The prediction signals are formed by
  • MCP inside the enhancement layer (temporal), suited to small motion and high spatial detail
  • Up-sampling from the lower layer (spatial)
  • Average of the above two predictions (temporal + spatial)
Inter-layer prediction
  • Three kinds of inter-layer prediction
      • Inter-layer motion prediction
      • Inter-layer residual prediction
      • Inter-layer intra prediction
  • Base mode MB
      • Only the residual is transmitted, with no additional side information.

Page 22: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Spatial Scalability

Inter-layer motion prediction
  • base_mode_flag = 1
      • The reference layer is inter-coded
      • Data are derived from the reference layer: MB partitioning, reference indices, MVs
  • motion_pred_flag
      • 1: MV predictors are obtained from the reference layer
      • 0: MV predictors are obtained by conventional spatial predictors.

[Figure: an 8×8 block in the reference layer with motion vectors (x1,y1) and (x2,y2) corresponds to a 16×16 macroblock with motion vectors (2x1,2y1) and (2x2,2y2)]
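The figure on this slide shows the dyadic case: an 8×8 reference-layer block becomes a 16×16 macroblock, and each motion vector (x, y) becomes (2x, 2y). A minimal sketch of that derivation follows; the names `upscale_motion` and `upscale_partition` and the integer `ratio` parameter are hypothetical and cover only integer resolution ratios, not the general case.

```python
from typing import List, Tuple

MotionVector = Tuple[int, int]


def upscale_motion(mvs: List[MotionVector], ratio: int = 2) -> List[MotionVector]:
    """Scale reference-layer motion vectors by the spatial resolution ratio."""
    return [(ratio * x, ratio * y) for x, y in mvs]


def upscale_partition(width: int, height: int, ratio: int = 2) -> Tuple[int, int]:
    """Scale a reference-layer partition size to the enhancement layer."""
    return (ratio * width, ratio * height)


# Two motion vectors of an 8x8 reference-layer block, in the same units
# as the reference layer uses.
derived_mvs = upscale_motion([(3, -1), (0, 2)])
derived_size = upscale_partition(8, 8)
```

With base_mode_flag = 1 the derived values are used directly; with motion_pred_flag = 1 they serve only as predictors for the transmitted motion vectors.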

Page 23: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Spatial Scalability

Inter-layer residual prediction
  • residual_pred_flag = 1
  • Predictor
      • Block-wise up-sampling by a bi-linear filter from the corresponding 8×8 sub-MB in the reference layer
      • Transform block basis

Page 24: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Spatial Scalability

Inter-layer intra prediction
  • base_mode_flag = 1
  • The reference layer is intra-coded
  • Up-sampling from the reference layer
      • Luma: one-dimensional 4-tap FIR filter
      • Chroma: bi-linear filter

Page 25: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Spatial Scalability

Past spatial scalable video:
  • Inter-layer intra prediction requires completely decoding the base layer.
  • Multiple motion compensation and deblocking filter operations are needed.
  • Full decoding + inter-layer prediction: complexity > simulcast.
Single-loop decoding
  • Inter-layer intra prediction is restricted to MBs for which the co-located base layer is intra-coded

Page 26: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Spatial Scalability

Single-loop vs. multi-loop decoding

This slide is copied from http://iphome.hhi.de/wiegand/assets/pdfs/H264AVC_SVC.pdf

[Figure: single-loop vs. multi-loop decoding, with I, B, and P pictures and inter-coded blocks]

Page 27: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Spatial Scalability

Generalized spatial scalability in SVC
  • Arbitrary ratio
      • Only restriction: neither the horizontal nor the vertical resolution can decrease from one layer to the next.
  • Cropping
      • Containing new regions
      • Higher quality of interesting regions
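The only restriction stated here, that neither dimension may shrink from one layer to the next, is easy to check for a list of layer resolutions. The function name `valid_spatial_layers` and the (width, height) tuple representation are assumptions made for this sketch.

```python
from typing import List, Tuple


def valid_spatial_layers(layers: List[Tuple[int, int]]) -> bool:
    """True if width and height never decrease from one layer to the next.

    layers is ordered from the base layer upward, each entry (width, height).
    """
    return all(
        upper_w >= lower_w and upper_h >= lower_h
        for (lower_w, lower_h), (upper_w, upper_h) in zip(layers, layers[1:])
    )


dyadic = valid_spatial_layers([(176, 144), (352, 288), (704, 576)])
non_dyadic = valid_spatial_layers([(320, 240), (480, 360)])
shrinking = valid_spatial_layers([(352, 288), (320, 480)])
```

Note that a ratio of 1.5 passes, while a layer that grows vertically but shrinks horizontally does not.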

Page 28: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Spatial Scalability

Coding efficiency: multi-loop > single-loop

Page 29: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard
Page 30: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Spatial Scalability

Coding efficiency (IPPP…): multi-loop > single-loop

Page 31: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Spatial Scalability

Encoder control (JSVM)
  • Base layer
      • p0 is optimized for the base layer

\[ p_0' = \arg\min_{\{p_0\}} \{ D_0(p_0) + \lambda_0 R_0(p_0) \} \]

  • Enhancement layer
      • p1 is optimized for the enhancement layer

\[ p_1' = \arg\min_{\{p_1 | p_0\}} \{ D_1(p_1 | p_0) + \lambda_1 R_1(p_1 | p_0) \} \]

  • Decisions of p1 depend on p0
  • Efficient base layer coding but inefficient enhancement layer coding

Page 32: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Spatial Scalability

Encoder control (optimization)
  • Base layer
      • Considering enhancement layer coding
      • Eliminating choices of p0 that disadvantage enhancement layer coding
  • Enhancement layer
      • No change
  • Weighting factor w
      • w = 0: JSVM encoder control
      • w = 1: single-loop encoder control (base layer is not controlled)

\[ \{p_0', p_1'\} = \arg\min_{\{p_0, p_1 | p_0\}} \{ (1 - w) [ D_0(p_0) + \lambda_0 R_0(p_0) ] + w [ D_1(p_1 | p_0) + \lambda_1 R_1(p_1 | p_0) ] \} \]
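The weighted criterion can be illustrated on a toy decision table. This is a sketch under stated assumptions, not the JSVM implementation: the function names, the dictionary layout (each option maps to a distortion/rate pair), and the sample numbers are all hypothetical. For each base layer choice p0 the best enhancement choice p1 given p0 is picked by the enhancement layer cost, which leaves the minimum unchanged for w > 0 and makes w = 0 reproduce the sequential control.

```python
def lagrangian(distortion: float, rate: float, lam: float) -> float:
    """Rate-distortion cost D + lambda * R."""
    return distortion + lam * rate


def best_enhancement(enh_options, p0, lam1):
    """Best p1 given p0, and its enhancement layer cost."""
    p1 = min(enh_options[p0], key=lambda p: lagrangian(*enh_options[p0][p], lam1))
    return p1, lagrangian(*enh_options[p0][p1], lam1)


def sequential_control(base_options, enh_options, lam0, lam1):
    """Sequential control: optimize p0 alone, then p1 given p0."""
    p0 = min(base_options, key=lambda p: lagrangian(*base_options[p], lam0))
    return p0, best_enhancement(enh_options, p0, lam1)[0]


def joint_control(base_options, enh_options, w, lam0, lam1):
    """Minimize (1 - w) * J0(p0) + w * J1(p1 | p0) over both decisions."""
    def joint_cost(p0):
        j0 = lagrangian(*base_options[p0], lam0)
        j1 = best_enhancement(enh_options, p0, lam1)[1]
        return (1 - w) * j0 + w * j1
    p0 = min(base_options, key=joint_cost)
    return p0, best_enhancement(enh_options, p0, lam1)[0]


# Hypothetical (distortion, rate) pairs for two base and two enhancement options.
base = {"a": (10.0, 4.0), "b": (11.0, 4.0)}
enh = {"a": {"x": (8.0, 6.0), "y": (7.0, 8.0)},
       "b": {"x": (4.0, 5.0), "y": (5.0, 5.0)}}
```

In this toy table option "b" costs slightly more in the base layer but much less in the enhancement layer, so any sufficiently large w switches the base layer decision from "a" to "b".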

Page 33: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Spatial Scalability

Coding efficiency of optimal encoder control
  • Optimized encoder vs. JSVM encoder (QPE = QPB + 4)

Page 34: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Outline

Introduction
History of SVC
Structure of SVC
Temporal Scalability
Spatial Scalability
Quality Scalability
  • CGS
  • MGS
  • Drift control
Combined Scalability
Profiles of SVC
Conclusions

Page 35: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Quality Scalability

Coarse-grain quality scalability (CGS)
  • A special case of spatial scalability
      • Identical sizes (resolution) for base and enhancement layers
      • Smaller quantization step sizes for higher enhancement residual layers
  • Designed for only several selected bit-rate points
      • Supported bit-rate points = number of layers
  • Switch can only occur at IDR access units

Page 36: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Quality Scalability

Medium-grain quality scalability (MGS)
  • More enhancement layers are supported
      • Refinement quality layers of residual
  • Key pictures
      • Drift control
  • Switch can occur at any access unit
  • CGS + key pictures + refinement quality layers

Page 37: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Quality Scalability

Drift control
  • Drift: the effect caused by unsynchronized MCP at the encoder and decoder side
  • Trade-off of MCP in quality SVC
      • Coding efficiency vs. drift

Page 38: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Quality Scalability

MPEG-4 quality scalability with FGS

  • Base layer is stored and used for MCP of following pictures
  • Drift: drift-free
  • Complexity: low
  • Efficiency: efficient base layer but inefficient enhancement layer
      • Refinement data are not used for MCP

[Figure: base layer and refinement (possibly lost or truncated)]

Page 39: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Quality Scalability

MPEG-2 quality scalability (without FGS)

  • Only 1 reference picture is stored and used for MCP of following pictures
  • Drift: both base layer and enhancement layer
      • Frequent intra updates are necessary
  • Complexity: low
  • Efficiency: efficient enhancement layer but inefficient base layer

[Figure: base layer and refinement (possibly lost or truncated)]

Page 40: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Quality Scalability

2-loop prediction

  • Several closed encoder loops run at different bit-rate points in a layered structure
  • Drift: enhancement layer
  • Complexity: high
  • Efficiency: efficient base layer and moderately efficient enhancement layer

[Figure: base layer and refinement (possibly lost or truncated)]

Page 41: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Quality Scalability

SVC concepts

Key picture
  • Trade-off between coding efficiency and drift
  • MPEG-4 FGS: all pictures are key pictures
  • MPEG-2 quality scalability: all pictures are non-key pictures

[Figure: base layer and refinement (possibly lost or truncated)]

Page 42: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Quality Scalability

Drift control with hierarchical prediction

  • Key pictures
      • Base layer is stored and used for the MCP of following pictures
  • Other pictures
      • Enhancement layer is stored and used for the MCP of following pictures
  • GOP size adjusts the trade-off between enhancement layer coding efficiency and drift

[Figure: base layer and refinement (possibly lost or truncated) over a hierarchical structure of P, B1, and B2 pictures]

Page 43: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Quality Scalability

Comparisons of drift control

[Figure: drift control schemes compared on two axes: low efficiency to high efficiency, and drift to drift-free]

Page 44: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Quality Scalability

Comparisons of coding efficiency

[Figure: coding efficiency comparison for high dQP and low dQP]

\[ Q_{STEP} = 2^{(QP - 4)/6} \]
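The relation between QP and quantization step size on this slide, QSTEP = 2^((QP-4)/6), means that the step size doubles for every increase of 6 in QP. A small sketch makes the effect of a QP difference (dQP) between layers concrete; the function names `qstep` and `step_ratio` are chosen for this example only.

```python
import math


def qstep(qp: float) -> float:
    """Quantization step size for a given QP: 2 ** ((QP - 4) / 6)."""
    return 2.0 ** ((qp - 4) / 6.0)


def step_ratio(dqp: float) -> float:
    """Ratio of base layer to enhancement layer step size for a QP difference."""
    return 2.0 ** (dqp / 6.0)


# A dQP of 6 between base and enhancement layer halves the step size.
base_step = qstep(34)
enhancement_step = qstep(28)
```

A low dQP therefore gives closely spaced quality layers, while a high dQP gives a much finer quantizer in the enhancement layer.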

Page 45: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Quality Scalability

MGS with key pictures using optimized encoder control

[Figure: coding efficiency results, including a curve labeled "Only base layer"]

Page 46: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Outline

Introduction
History of SVC
Structure of SVC
Temporal Scalability
Spatial Scalability
Quality Scalability
Combined Scalability
  • SVC encoder structure
  • Dependency and quality refinement layers
  • Bit-stream format
  • Bit-stream switching
Profiles of SVC
Conclusions

Page 47: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Combined Scalability

SVC encoder structure

[Figure: SVC encoder structure; labels: dependency layer, temporal decomposition, the same motion/prediction information]

Page 48: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Combined Scalability

Dependency and Quality refinement layers

[Figure: scalable bit-stream organized into dependency layers D = 0, 1, 2, each containing quality refinement layers Q = 0, 1, 2]

Page 49: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Combined Scalability

[Figure: access units at temporal levels T0, T2, T1, T2, T0, each with dependency layers D0 and D1 containing quality layers Q0 and Q1]

Page 50: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Combined Scalability

Bit-stream format

[Figure: NAL unit layout: NAL unit header, NAL unit header extension (carrying the fields P, T, D, and Q), NAL unit payload]

  • P (priority_id): indicates the importance of a NAL unit
  • T (temporal_id): indicates temporal level
  • D (dependency_id): indicates spatial/CGS layer
  • Q (quality_id): indicates MGS/FGS layer
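Because every NAL unit carries T, D, and Q in its header extension, a sub-stream can be extracted by inspecting headers only, without decoding payloads. The sketch below assumes the header fields have already been parsed into a hypothetical `NalUnit` record; the class, the function `extract_substream`, and the simple keep rule (all quality layers of lower dependency layers are kept) are illustrative assumptions, not the behavior of any particular extractor.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class NalUnit:
    """Hypothetical record of already-parsed header extension fields."""
    priority_id: int
    temporal_id: int
    dependency_id: int
    quality_id: int
    payload: bytes = b""


def extract_substream(units: Iterable[NalUnit], max_t: int, max_d: int,
                      max_q: int) -> List[NalUnit]:
    """Keep NAL units needed for the target (T, D, Q) operating point.

    Assumed rule: drop higher temporal levels and higher dependency layers;
    within the target dependency layer, drop higher quality layers.
    """
    return [
        u for u in units
        if u.temporal_id <= max_t
        and u.dependency_id <= max_d
        and (u.dependency_id < max_d or u.quality_id <= max_q)
    ]


stream = [
    NalUnit(0, 0, 0, 0), NalUnit(0, 1, 0, 0),
    NalUnit(0, 0, 1, 0), NalUnit(0, 0, 1, 1), NalUnit(0, 2, 1, 1),
]
base_only = extract_substream(stream, max_t=1, max_d=0, max_q=0)
low_rate_enh = extract_substream(stream, max_t=0, max_d=1, max_q=0)
```

A network element could apply the same filter on the fly, which is what makes bit-stream adaptation after encoding simple.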

Page 51: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Combined Scalability

Bit-stream switching
  • Inside a dependency layer
      • Switching everywhere
  • Outside a dependency layer
      • Switching up only at IDR access units
      • Switching down everywhere if using multiple-loop decoding
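The switching rules on this slide can be written as a small predicate. The function name `can_switch` and its parameters are hypothetical, and one case is an assumption the slide does not state: switching down without multiple-loop decoding is treated here as possible only at IDR access units.

```python
def can_switch(current_d: int, target_d: int, at_idr: bool,
               multi_loop: bool) -> bool:
    """Whether a decoder may switch dependency layers at this access unit.

    current_d and target_d are dependency_id values; at_idr tells whether
    the access unit is an IDR access unit; multi_loop tells whether the
    decoder uses multiple-loop decoding.
    """
    if target_d == current_d:
        return True          # inside a dependency layer: switching everywhere
    if target_d > current_d:
        return at_idr        # switching up only at IDR access units
    # Switching down: everywhere with multiple-loop decoding; otherwise
    # assumed (not stated on the slide) to require an IDR access unit.
    return multi_loop or at_idr
```

Quality changes within one dependency layer fall under the first case, so they are allowed at every access unit.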

Page 52: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Outline

Introduction
History of SVC
Structure of SVC
Temporal Scalability
Spatial Scalability
Quality Scalability
Combined Scalability
Profiles of SVC
  • Scalable Baseline
  • Scalable High
  • Scalable High Intra
Conclusions

Page 53: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Profiles of SVC

Scalable Baseline
  • For conversational and surveillance applications requiring low decoding complexity
  • Spatial scalability: fixed ratio (1, 1.5, or 2) and MB-aligned cropping
  • Temporal and quality scalability: arbitrary
  • No interlaced coding tools
  • B-slices, weighted prediction, CABAC, and 8x8 luma transform are supported in enhancement layers
  • The base layer conforms to the Baseline profile of H.264/AVC

Page 54: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Profiles of SVC

Scalable High
  • For broadcast, streaming, and storage
  • Spatial, temporal, and quality scalability: arbitrary
  • The base layer conforms to the High profile of H.264/AVC
Scalable High Intra
  • Scalable High + all IDR pictures

Page 55: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Conclusions

Temporal scalability
  • Hierarchical prediction structure
Spatial and quality scalability
  • Inter-layer prediction of intra, motion, and residual information
  • Single-loop MC decoding
  • Identical size for each spatial layer: CGS
  • CGS + key pictures + quality refinement layer: MGS
Applications
  • Power adaptation: decoding only the needed part of the video stream
  • Graceful degradation: when the “right” parts are lost
  • Format adaptation: backwards-compatible extension in mobile TV
What’s next in SVC?
  • Bit-depth scalability (8-bit 4:2:0 → 10-bit 4:2:0)
  • Color format scalability (4:2:0 → 4:4:4)

Page 56: Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

References

H. Schwarz, D. Marpe, and T. Wiegand, “Overview of the Scalable Video Coding Extension of the H.264/AVC Standard,” CSVT 2007.

T. Wiegand, “Scalable Video Coding,” Joint Video Team, doc. JVT-W132, San Jose, USA, April 2007.

T. Wiegand, “Scalable Video Coding,” Digital Image Communication, Course at Technical University of Berlin, 2006. (Available on http://iphome.hhi.de/wiegand/dic.htm)

H. Schwarz, D. Marpe, and T. Wiegand, “Constrained Inter-Layer Prediction for Single-Loop Decoding in Spatial Scalability,” Proc. of ICIP’05.
