Overview of the Scalable Video Coding Extension of the H.264/AVC Standard
Heiko Schwarz, Detlev Marpe, and Thomas Wiegand
CSVT, Sept. 2007
2009/5 MC2008, VCLAB 1
Outline
• Introduction
  • Problems
  • Definition
  • Functionality
  • Goal
  • Competition
  • Applications
  • Targets
• History of SVC
• Structure of SVC
• Temporal Scalability
• Spatial Scalability
• Quality Scalability
• Combined Scalability
• Profiles of SVC
• Conclusions
Introduction - problem
• Non-scalable video streaming
  • Multiple video streams are needed for heterogeneous clients
[Figure: separate streams at 8 Mb/s, 6 Mb/s, 4 Mb/s, 1 Mb/s, and 512 kb/s]
Introduction - definition
• Scalable video stream
• Scalability
  • Removal of parts of the video bit-stream to adapt to the various needs of end users and to varying terminal capabilities or network conditions
[Figure: a scalable stream consisting of sub-streams 1, 2, …, n; selected sub-streams k1, k2, …, ki are used for reconstruction, with quality ranging from low to high]
Introduction - functionality
• Functionality of SVC
  • Graceful degradation when the “right” parts of the bit-stream are lost
  • Bit-rate adaptation to match the channel throughput
  • Format adaptation for backwards compatible extension
  • Power adaptation for trade-off between runtime and quality
Introduction - goal
• Goal of SVC
[Figure: sub-streams k1, k2, …, ki compared in quality (=) with an H.264/AVC bit-stream]
• Scalability modes
  • Fidelity reduction (SNR scalability)
  • Picture size reduction (spatial scalability)
  • Frame rate reduction (temporal scalability)
  • Sharpness reduction (frequency scalability)
  • Selection of content (ROI or object-based scalability)
Introduction - competition
• SVC is an old research topic (> 20 years) and has been included in H.262/MPEG-2, H.263, and MPEG-4 Visual.
• Rarely used because of
  • The characteristics of traditional video transmission systems
  • Significant loss of coding efficiency and large increase in decoder complexity
• Competition
  • Simulcast
  • Transcoding
Introduction - applications
• Applications
  • Heterogeneous clients
  • Unequal protection
  • Surveillance
• Problems
  • Increased decoder complexity
  • Decreased coding efficiency
  • Temporal scalability is more often supported than spatial and quality scalability.
Introduction - targets
• Targets
  • Little decrease in coding efficiency
  • Little increase in decoding complexity
  • Support of temporal, spatial, and quality scalability
  • A backward compatible base layer
  • Simple bit-stream adaptations after encoding
History of SVC
• October 2003: MPEG issues a call for proposals on Scalable Video Coding
  • 12 wavelet-based proposals
  • 2 extensions of H.264/AVC
• ~October 2004: MSRA vs. HHI proposal (wavelet-based vs. H.264/AVC extension)
• October 2004: HHI proposal adopted as starting point (due to reduced encoder and decoder complexity and improvements in coding efficiency)
• January 2005: MPEG and VCEG agree to jointly finalize the SVC project as an Amendment of H.264/AVC
• Spring 2007: Finalization
Structure of SVC
[Figure: SVC encoder block diagram with the blocks spatial decimation; per layer, temporal scalable coding, prediction, base layer coding, and SNR scalable coding; and a final multiplex]
Outline
• Introduction
• History of SVC
• Structure of SVC
• Temporal Scalability
  • Hierarchical prediction structure
• Spatial Scalability
• Quality Scalability
• Combined Scalability
• Profiles of SVC
• Conclusions
Temporal Scalability
• Hierarchical prediction structures
  • Hierarchical B pictures
  • Non-dyadic hierarchical prediction
  • Hierarchical prediction with zero delay
[Figure: picture numbering within a GOP for each of the three structures]
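As a rough illustration of how a dyadic hierarchical-B structure yields temporal scalability, the sketch below assigns a temporal level to each picture and drops the higher levels to reduce the frame rate. The function names and the assumption that pictures at multiples of the GOP size form the temporal base layer are illustrative choices, not taken from the slides.

```python
def temporal_level(index: int, gop_size: int) -> int:
    """Temporal level of picture `index` in a dyadic hierarchy (assumed layout)."""
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    pos = index % gop_size
    if pos == 0:
        return 0  # GOP boundary picture: temporal base layer
    max_level = gop_size.bit_length() - 1           # log2(gop_size)
    trailing_zeros = (pos & -pos).bit_length() - 1  # how "coarse" the position is
    return max_level - trailing_zeros


def drop_to_level(indices, gop_size: int, max_level: int):
    """Keep only pictures whose temporal level does not exceed max_level."""
    return [i for i in indices if temporal_level(i, gop_size) <= max_level]


levels = [temporal_level(i, 8) for i in range(9)]
half_rate = drop_to_level(range(9), 8, 2)  # discard the highest temporal level
```

With a GOP size of 8, discarding the highest level keeps every second picture, which halves the frame rate without touching the remaining pictures.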
Temporal Scalability
• Combination with multiple reference pictures
  • Arbitrary modification of the prediction structure
• Issue of quantization
  • Lower layers with higher fidelity
    • Smaller QPs are used in lower layers
  • Propagation of quantization error
    • Larger QPs are used in higher layers
Temporal Scalability
[Figure: temporal scalability prediction structures for N = 1, 2, 4, and 8, built from I, P, B0, B1, and B2 pictures]
• Video coding experiment with H.264/MPEG4-AVC
  • Foreman, CIF 30 Hz @ 1320 kbps
  • Performance as a function of N
• Cascaded QP assignment: QP(P) = QP(B0) - 3 = QP(B1) - 4 = QP(B2) - 5
• This slide is copied from JVT-W132-Talk
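A minimal sketch of the cascaded QP assignment, assuming the slide's relation reads QP(P) = QP(B0) - 3 = QP(B1) - 4 = QP(B2) - 5, so that each B level adds a fixed offset to the P-picture QP. The table and function names are hypothetical; the clamp to 0..51 is the QP range of 8-bit H.264/AVC.

```python
# Offsets relative to QP(P), as read from the slide (assumed interpretation).
QP_OFFSETS = {"P": 0, "B0": 3, "B1": 4, "B2": 5}


def cascaded_qp(qp_p: int, picture_type: str) -> int:
    """QP for a picture type, given the QP of the P (base level) pictures."""
    qp = qp_p + QP_OFFSETS[picture_type]
    return max(0, min(51, qp))  # stay inside the H.264/AVC QP range


example = {t: cascaded_qp(30, t) for t in QP_OFFSETS}
```

Pictures higher in the hierarchy are referenced by fewer other pictures, so a coarser quantizer there costs less overall quality.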
Temporal Scalability
• Coding efficiency of hierarchical prediction
  • JSVM11, High profile with CABAC
  • Only one reference frame
[Figure: coding efficiency results (CIF)]
Temporal Scalability
• Compared with IPPP (with and without delay constraint)
• Providing temporal scalability usually doesn’t have any negative impact on coding efficiency
Outline
• Introduction
• History of SVC
• Structure of SVC
• Temporal Scalability
• Spatial Scalability
  • Inter-layer prediction
    • Inter-layer motion prediction
    • Inter-layer residual prediction
    • Inter-layer intra prediction
• Quality Scalability
• Combined Scalability
• Profiles of SVC
• Conclusions
Spatial Scalability
[Figure: SVC encoder for spatial scalability. The input is spatially decimated for the lower layers. The lowest layer uses H.264/AVC MCP and intra-prediction in an H.264/AVC compatible coder and yields an H.264/AVC compatible base layer bit-stream. The higher layers use hierarchical MCP and intra-prediction, with inter-layer prediction (intra, motion, residual) of texture and motion data from the layer below. Base layer coding outputs are multiplexed into the scalable bit-stream.]
Spatial Scalability
• Similar to MPEG-2, H.263, and MPEG-4
• Arbitrary resolution ratio
• The same coding order in all spatial layers
• Combination with temporal scalability
• Inter-layer prediction
[Figure: spatial layers 0 and 1 with temporal levels 0, 1, and 2 and intra pictures]
Spatial Scalability
• The prediction signals are formed by
  • MCP inside the enhancement layer (temporal), suited to small motion and high spatial detail
  • Up-sampling from the lower layer (spatial)
  • Average of the above two predictions (temporal + spatial)
• Inter-layer prediction
  • Three kinds of inter-layer prediction
    • Inter-layer motion prediction
    • Inter-layer residual prediction
    • Inter-layer intra prediction
  • Base mode MB
    • Only the residual is transmitted, with no additional side information
Spatial Scalability
• Inter-layer motion prediction
  • base_mode_flag = 1
    • The reference layer is inter-coded
    • Data are derived from the reference layer
      • MB partitioning
      • Reference indices
      • MVs
  • motion_pred_flag
    • 1: MV predictors are obtained from the reference layer
    • 0: MV predictors are obtained by conventional spatial predictors
[Figure: an 8×8 block of the reference layer with motion vectors (x1,y1) and (x2,y2) corresponds to a 16×16 block with motion vectors (2x1,2y1) and (2x2,2y2)]
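For the dyadic case shown in the figure, deriving enhancement-layer motion data from the reference layer amounts to scaling both the partition size and the motion vector by two. The sketch below shows only that case; the `Partition` type and `upscale_partition` function are hypothetical names, and the standard's handling of arbitrary ratios and reference indices is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    width: int    # partition width in luma samples
    height: int   # partition height in luma samples
    mv: tuple     # motion vector (x, y)


def upscale_partition(ref: Partition, factor: int = 2) -> Partition:
    """Scale a reference-layer partition and its MV to the enhancement layer."""
    return Partition(
        width=ref.width * factor,
        height=ref.height * factor,
        mv=(ref.mv[0] * factor, ref.mv[1] * factor),
    )


derived = upscale_partition(Partition(8, 8, (3, -1)))
```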
Spatial Scalability
• Inter-layer residual prediction
  • residual_pred_flag = 1
  • Predictor
    • Block-wise up-sampling by a bi-linear filter from the corresponding 8×8 sub-MB in the reference layer
    • Transform block basis
Spatial Scalability
• Inter-layer intra prediction
  • base_mode_flag = 1
    • The reference layer is intra-coded
    • Up-sampling from the reference layer
      • Luma: one-dimensional 4-tap FIR filter
      • Chroma: bi-linear filter
Spatial Scalability
• Past spatial scalable video
  • Inter-layer intra prediction requires completely decoding the base layer.
  • Multiple motion compensation and deblocking filter operations are needed.
  • Full decoding + inter-layer prediction: complexity > simulcast.
• Single-loop decoding
  • Inter-layer intra prediction is restricted to MBs for which the co-located base layer signal is intra-coded
Spatial Scalability
• Single-loop vs. multi-loop decoding
• This slide is copied from http://iphome.hhi.de/wiegand/assets/pdfs/H264AVC_SVC.pdf
[Figure: single-loop vs. multi-loop decoding of I, B, and P pictures]
Spatial Scalability
• Generalized spatial scalability in SVC
  • Arbitrary ratio
    • Only restriction: neither the horizontal nor the vertical resolution can decrease from one layer to the next.
  • Cropping
    • Containing new regions
    • Higher quality of interesting regions
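The single restriction on generalized spatial scalability can be expressed as a check over the layer resolutions, ordered from the base layer upward. This is a sketch with a hypothetical function name; it tests only the non-decreasing rule stated on the slide and ignores cropping.

```python
def valid_layer_resolutions(layers) -> bool:
    """layers: (width, height) pairs ordered from base to highest layer."""
    return all(
        w_hi >= w_lo and h_hi >= h_lo
        for (w_lo, h_lo), (w_hi, h_hi) in zip(layers, layers[1:])
    )


dyadic = valid_layer_resolutions([(176, 144), (352, 288), (704, 576)])
non_dyadic = valid_layer_resolutions([(640, 480), (1280, 720)])
shrinking = valid_layer_resolutions([(352, 288), (320, 288)])
```

The second example passes even though the horizontal and vertical ratios differ, which is exactly what "arbitrary ratio" allows.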
Spatial Scalability
• Coding efficiency
  • Multiple-loop > single-loop
Spatial Scalability
• Coding efficiency (IPPP…)
  • Multi-loop > single-loop
Spatial Scalability
• Encoder control (JSVM)
  • Base layer
    • p0 is optimized for the base layer
      \[ p_0' = \arg\min_{\{p_0\}} \{ D_0(p_0) + \lambda_0 R_0(p_0) \} \]
  • Enhancement layer
    • p1 is optimized for the enhancement layer
      \[ p_1' = \arg\min_{\{p_1 | p_0\}} \{ D_1(p_1 | p_0) + \lambda_1 R_1(p_1 | p_0) \} \]
  • Decisions of p1 depend on p0
    • Efficient base layer coding but inefficient enhancement layer coding
Spatial Scalability
• Encoder control (optimization)
  • Base layer
    • Considering enhancement layer coding
    • Eliminating choices of p0 that disadvantage enhancement layer coding
  • Enhancement layer
    • No change
  • Joint optimization with weight w
    \[ \{p_0', p_1'\} = \arg\min_{\{p_0, p_1 | p_0\}} \{ (1 - w) [ D_0(p_0) + \lambda_0 R_0(p_0) ] + w [ D_1(p_1 | p_0) + \lambda_1 R_1(p_1 | p_0) ] \} \]
    • w = 0: JSVM encoder control
    • w = 1: Single-loop encoder control (base layer is not controlled)
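The weighted encoder control can be sketched as an exhaustive search over candidate parameter pairs that minimizes (1 - w)·[D0 + λ0·R0] + w·[D1 + λ1·R1]. Everything below is illustrative: the candidate layout, the distortion and rate numbers, and the function names are made up, and a real encoder would not enumerate choices this way. Ties in the weighted cost are broken by the enhancement-layer cost, which is an added assumption so that w = 0 still returns a sensible p1.

```python
def lagrangian(distortion: float, rate: float, lam: float) -> float:
    return distortion + lam * rate


def joint_choice(candidates, w: float, lam0: float, lam1: float):
    """candidates: list of (p0, D0, R0, [(p1, D1, R1), ...]) with p1 conditioned on p0."""
    best_key, best_pair = None, None
    for p0, d0, r0, enhancement in candidates:
        j0 = lagrangian(d0, r0, lam0)
        for p1, d1, r1 in enhancement:
            j1 = lagrangian(d1, r1, lam1)
            key = ((1 - w) * j0 + w * j1, j1)  # weighted cost, then tie-break
            if best_key is None or key < best_key:
                best_key, best_pair = key, (p0, p1)
    if best_pair is None:
        raise ValueError("no candidates given")
    return best_pair


# Hypothetical numbers: "a" is the better base-layer choice on its own,
# but "b" allows a much cheaper enhancement layer.
cands = [
    ("a", 10.0, 4.0, [("a1", 9.0, 6.0), ("a2", 7.0, 8.0)]),
    ("b", 12.0, 4.0, [("b1", 5.0, 5.0), ("b2", 6.0, 3.0)]),
]
base_only = joint_choice(cands, 0.0, 1.0, 1.0)
balanced = joint_choice(cands, 0.5, 1.0, 1.0)
```

Raising w moves the base-layer decision away from its own optimum whenever that choice would penalize the enhancement layer.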
Spatial Scalability
• Coding efficiency of optimal encoder control
  • Optimized encoder vs. JSVM encoder (QPE = QPB + 4)
Outline
• Introduction
• History of SVC
• Structure of SVC
• Temporal Scalability
• Spatial Scalability
• Quality Scalability
  • CGS
  • MGS
  • Drift control
• Combined Scalability
• Profiles of SVC
• Conclusions
Quality Scalability
• Coarse-grain quality scalability (CGS)
  • A special case of spatial scalability
    • Identical sizes (resolution) for base and enhancement layers
    • Smaller quantization step sizes for higher enhancement residual layers
  • Designed for only several selected bit-rate points
    • Supported bit-rate points = number of layers
  • Switching can only occur at IDR access units
Quality Scalability
• Medium-grain quality scalability (MGS)
  • More enhancement layers are supported
    • Refinement quality layers of residual
  • Key pictures
    • Drift control
  • Switching can occur at any access unit
  • CGS + key pictures + refinement quality layers
Quality Scalability
• Drift control
  • Drift: the effect caused by unsynchronized MCP at the encoder and decoder side
  • Trade-off of MCP in quality SVC
    • Coding efficiency vs. drift
Quality Scalability
• MPEG-4 quality scalability with FGS
  • Base layer is stored and used for MCP of following pictures
  • Drift: drift-free
  • Complexity: low
  • Efficiency: efficient base layer but inefficient enhancement layer
    • Refinement data are not used for MCP
[Figure: base layer and refinement (possibly lost or truncated)]
Quality Scalability
• MPEG-2 quality scalability (without FGS)
  • Only 1 reference picture is stored and used for MCP of following pictures
  • Drift: both base layer and enhancement layer
    • Frequent intra updates are necessary
  • Complexity: low
  • Efficiency: efficient enhancement layer but inefficient base layer
[Figure: base layer and refinement (possibly lost or truncated)]
Quality Scalability
• 2-loop prediction
  • Several closed encoder loops run at different bit-rate points in a layered structure
  • Drift: enhancement layer
  • Complexity: high
  • Efficiency: efficient base layer and medium efficient enhancement layer
[Figure: base layer and refinement (possibly lost or truncated)]
Quality Scalability
• SVC concepts
  • Key picture
    • Trade-off between coding efficiency and drift
    • MPEG-4 FGS: all key pictures
    • MPEG-2 quality scalability: non-key pictures
[Figure: base layer and refinement (possibly lost or truncated)]
Quality Scalability
• Drift control with hierarchical prediction
  • Key pictures
    • Base layer is stored and used for the MCP of following pictures
  • Other pictures
    • Enhancement layer is stored and used for the MCP of following pictures
  • GOP size adjusts the trade-off between enhancement layer coding efficiency and drift
[Figure: base layer and refinement (possibly lost or truncated) for a GOP of P, B1, and B2 pictures]
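The key-picture rule can be sketched as a choice of which reconstruction each picture stores as an MCP reference. The sketch assumes key pictures sit at multiples of the GOP size, and the function names are hypothetical. The second function only illustrates the trade-off named above: the larger the GOP, the larger the share of pictures that rely on enhancement-layer references.

```python
def stored_reference(index: int, gop_size: int) -> str:
    """Which reconstruction a picture keeps as reference for later pictures."""
    is_key_picture = index % gop_size == 0  # assumed key-picture placement
    return "base" if is_key_picture else "enhancement"


def enhancement_reference_share(gop_size: int) -> float:
    """Fraction of pictures per GOP that use enhancement-layer references."""
    return (gop_size - 1) / gop_size


refs = [stored_reference(i, 4) for i in range(9)]
shares = [enhancement_reference_share(n) for n in (1, 2, 4, 8)]
```

A GOP size of 1 makes every picture a key picture, which corresponds to the drift-free MPEG-4 FGS case.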
Quality Scalability
• Comparisons of drift control
[Figure: drift control schemes placed between low efficiency and high efficiency, and between drift and drift-free]
Quality Scalability
• Comparisons of coding efficiency
[Figure: coding efficiency for high dQP and low dQP]
• QSTEP = 2^((QP-4)/6)
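The relation QSTEP = 2^((QP-4)/6) means the quantization step size doubles for every increase of 6 in QP. A small sketch, with a hypothetical function name:

```python
def qstep(qp: int) -> float:
    """Quantization step size from the relation QSTEP = 2^((QP - 4) / 6)."""
    return 2.0 ** ((qp - 4) / 6)


step_at_4 = qstep(4)
ratio_for_dqp_6 = qstep(34) / qstep(28)  # one quality layer with dQP = 6
```

So a quality enhancement layer coded with a QP that is 6 lower than its reference layer uses half the step size.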
Quality Scalability
• MGS with key pictures using optimized encoder control
[Figure: coding efficiency results, including a curve labeled "Only base layer"]
Outline
• Introduction
• History of SVC
• Structure of SVC
• Temporal Scalability
• Spatial Scalability
• Quality Scalability
• Combined Scalability
  • SVC encoder structure
  • Dependency and quality refinement layers
  • Bit-stream format
  • Bit-stream switching
• Profiles of SVC
• Conclusions
Combined Scalability
• SVC encoder structure
[Figure: encoder structure organized by dependency layer; representations within a dependency layer use the same motion/prediction information; temporal decomposition]
Combined Scalability
• Dependency and quality refinement layers
[Figure: scalable bit-stream organized into dependency layers D = 0, 1, 2, each with quality refinement layers Q = 0, 1, 2]
Combined Scalability
[Figure: pictures at temporal levels T0, T2, T1, T2, T0, each with dependency layers D0 and D1 and quality layers Q0 and Q1]
Combined Scalability
• Bit-stream format
[Figure: NAL unit layout with NAL unit header, NAL unit header extension (carrying the P, T, D, and Q fields), and NAL unit payload]
  • P (priority_id): indicates the importance of a NAL unit
  • T (temporal_id): indicates temporal level
  • D (dependency_id): indicates spatial/CGS layer
  • Q (quality_id): indicates MGS/FGS layer
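Because each NAL unit carries T, D, and Q identifiers, a sub-stream can be obtained by discarding NAL units above a chosen operating point. The sketch below is a simplified filter on already-parsed identifiers. The `NalUnit` type and `extract_substream` function are hypothetical, header bit parsing is omitted, and a real extractor must also keep any lower-layer quality data that the target layer's inter-layer prediction depends on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NalUnit:
    priority_id: int
    temporal_id: int
    dependency_id: int
    quality_id: int
    payload: bytes = b""


def extract_substream(nal_units, max_t: int, max_d: int, max_q: int):
    """Keep NAL units at or below the target (T, D, Q) operating point."""
    return [
        n for n in nal_units
        if n.temporal_id <= max_t
        and n.dependency_id <= max_d
        and n.quality_id <= max_q
    ]


units = [
    NalUnit(0, 0, 0, 0),
    NalUnit(0, 1, 0, 0),
    NalUnit(0, 0, 1, 0),
    NalUnit(0, 0, 1, 1),
    NalUnit(0, 2, 1, 1),
]
base_only = extract_substream(units, max_t=0, max_d=0, max_q=0)
mid_point = extract_substream(units, max_t=1, max_d=1, max_q=0)
```

The extraction never re-encodes anything, which is what makes bit-stream adaptation after encoding simple.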
Combined Scalability
• Bit-stream switching
  • Inside a dependency layer
    • Switching everywhere
  • Outside a dependency layer
    • Switching up only at IDR access units
    • Switching down everywhere if using multiple-loop decoding
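The switching rules can be written as a small predicate. The function name and arguments are hypothetical. The slide does not say when switching down is possible without multiple-loop decoding, so the sketch assumes it is then limited to IDR access units.

```python
def can_switch(current_d: int, target_d: int, at_idr: bool, multi_loop: bool) -> bool:
    """Whether switching from dependency layer current_d to target_d is allowed."""
    if target_d == current_d:
        return True               # inside a dependency layer: switch anywhere
    if target_d > current_d:
        return at_idr             # switching up: only at IDR access units
    return multi_loop or at_idr   # switching down (IDR fallback is an assumption)


up_mid_gop = can_switch(0, 1, at_idr=False, multi_loop=True)
down_mid_gop = can_switch(1, 0, at_idr=False, multi_loop=True)
```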
Outline
• Introduction
• History of SVC
• Structure of SVC
• Temporal Scalability
• Spatial Scalability
• Quality Scalability
• Combined Scalability
• Profiles of SVC
  • Scalable Baseline
  • Scalable High
  • Scalable High Intra
• Conclusions
Profiles of SVC
• Scalable Baseline
  • For conversational and surveillance applications requiring low decoding complexity
  • Spatial scalability: fixed ratio (1, 1.5, or 2) and MB-aligned cropping
  • Temporal and quality scalability: arbitrary
  • No interlaced coding tools
  • B-slices, weighted prediction, CABAC, and 8x8 luma transform
  • The base layer conforms to the Baseline profile of H.264/AVC
Profiles of SVC
• Scalable High
  • For broadcast, streaming, and storage
  • Spatial, temporal, and quality scalability: arbitrary
  • The base layer conforms to the High profile of H.264/AVC
• Scalable High Intra
  • Scalable High + all IDR pictures
Conclusions
• Temporal scalability
  • Hierarchical prediction structure
• Spatial and quality scalability
  • Inter-layer prediction of intra, motion, and residual information
  • Single-loop MC decoding
  • Identical size for each spatial layer – CGS
  • CGS + key pictures + quality refinement layer – MGS
• Applications
  • Power adaptation – decoding only the needed part of the video stream
  • Graceful degradation – when the “right” parts are lost
  • Format adaptation – backwards compatible extension in mobile TV
• What’s next in SVC?
  • Bit-depth scalability (8-bit 4:2:0 → 10-bit 4:2:0)
  • Color format scalability (4:2:0 → 4:4:4)
References
H. Schwarz, D. Marpe, and T. Wiegand, “Overview of the Scalable Video Coding Extension of the H.264/AVC Standard,” CSVT 2007.
T. Wiegand, “Scalable Video Coding,” Joint Video Team, doc. JVT-W132, San Jose, USA, April 2007.
T. Wiegand, “Scalable Video Coding,” Digital Image Communication, Course at Technical University of Berlin, 2006. (Available on http://iphome.hhi.de/wiegand/dic.htm)
H. Schwarz, D. Marpe, and T. Wiegand, “Constrained Inter-Layer Prediction for Single-Loop Decoding in Spatial Scalability,” Proc. of ICIP’05.