High-Quality Video View Interpolation
Larry Zitnick, Interactive Visual Media Group
Microsoft Research
3D video
Lumigraph / Light field
Geometry centric vs. image centric
Polygon rendering + texture mapping / Warping / Interpolation
Fixed geometry
View-dependent geometry
View-dependent texture
Sprites with depth
Layered depth image
Current practice vs. free viewpoint video
Many cameras
Motion jitter
Video view interpolation
Fewer cameras
Smooth motion
Automatic and real-time rendering
System overview
OFFLINE: Video Capture → Stereo → Representation → Compression → File
ONLINE: File → Selective Decompression → Render
Video Capture
[Capture hardware: cameras → concentrators → hard disks, with a controlling laptop]
Calibration (Zhengyou Zhang, 2000)
Input videos
System overview
OFFLINE: Video Capture → Stereo → Representation → Compression → File
ONLINE: File → Selective Decompression → Render
Stereo
Key to view interpolation: Geometry
Stereo Geometry
[Figure: cameras 1 and 2 with a virtual camera between them; candidate depths scored by match quality, good vs. bad]
Image correspondence
[Figure: correct vs. incorrect correspondences between image 1 and image 2, e.g. a leg matched against a wall]
Why segments?
Better delineation of boundaries.
Why segments?
Larger support for matching.
Handle gain and offset differences without global model (Kim, Kolmogorov and Zabih, 2003.)
Why segments?
More efficient.
786,432 pixels vs. 1000 segments
Compute disparities per segment rather than per pixel.
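A per-segment matcher can be sketched as follows (a minimal illustration, not the talk's actual code; `segment_cost` and `best_disparity` are hypothetical names): the matching cost is aggregated over every pixel in a segment, and the disparity search runs once per segment instead of once per pixel.

```python
# Sketch: aggregate a per-pixel SSD matching cost over all pixels in a
# segment, so each segment gets one score per disparity hypothesis
# instead of ~786k independent per-pixel searches.

def segment_cost(img1, img2, segment_pixels, disparity):
    """SSD cost of matching a segment of img1 into img2 at `disparity`."""
    cost = 0.0
    for (r, c) in segment_pixels:
        c2 = c + disparity                     # horizontal shift between views
        if 0 <= c2 < len(img2[r]):
            diff = img1[r][c] - img2[r][c2]
            cost += diff * diff
        else:
            cost += 1e6                        # pixel falls outside the image
    return cost

def best_disparity(img1, img2, segment_pixels, max_disp):
    """Pick the disparity level with the lowest aggregated segment cost."""
    return min(range(max_disp + 1),
               key=lambda d: segment_cost(img1, img2, segment_pixels, d))
```

Searching over segments also gives each hypothesis far more pixels of support than a single-pixel window would.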
Segmentation
Many methods will work:
Graph-based (Felzenszwalb and Huttenlocher, 2004)
Mean Shift (Comaniciu, et al. 2001)
Min-cut (Boykov et al. 2001)
Others…
Segmentation: Important properties
Not too large, not too small…
As large as possible while not spanning multiple objects.
Segmentation: Important properties
Stable Regions
Segmentation: Our Approach
First average, then segment.
Anisotropic smoothing
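The "first average, then segment" idea can be sketched on a single scanline (illustrative names, not the talk's implementation): anisotropic smoothing averages only across weak edges, so region interiors flatten while object boundaries survive for the segmentation step.

```python
# Sketch: edge-preserving smoothing followed by grouping, on a 1-D row.

def anisotropic_smooth(row, thresh=20, iters=5):
    """Average each pixel only with neighbors of similar color."""
    row = list(row)
    for _ in range(iters):
        new = row[:]
        for i in range(len(row)):
            vals = [row[i]]
            for j in (i - 1, i + 1):
                if 0 <= j < len(row) and abs(row[j] - row[i]) < thresh:
                    vals.append(row[j])       # average across weak edges only
            new[i] = sum(vals) / len(vals)
        row = new
    return row

def segment(row, thresh=20):
    """Group adjacent pixels whose smoothed colors are within `thresh`."""
    labels = [0]
    for i in range(1, len(row)):
        labels.append(labels[-1] + (abs(row[i] - row[i - 1]) >= thresh))
    return labels
```

Because the strong edge is never averaged across, the boundary stays exactly where it was while the interiors become uniform.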
Segmentation: Result
Close-up
Matching segments
Many measures will work: SSD, normalized correlation, mutual information.
The choice depends on color balancing and image quality.
Matching segments: Important properties
Never remove correct matches.
Remove as many false matches as possible.
Use global methods to remove remaining false positives.
Matching segments: Our approach
Create gain histogram
[Figure: gain histograms p(gain) for a good match (sharply peaked) and a bad match (spread across 0.8–1.25)]
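One way to sketch the gain-histogram test (helper names are hypothetical): compute per-pixel intensity ratios between the two candidate segments; a correct match under a gain difference produces tightly clustered ratios, while a false match spreads them out.

```python
# Sketch: per-pixel gain ratios between two candidate matching segments.
# A correct match under a pure gain change has ratios concentrated around
# one value; a false match produces a spread-out histogram.

def gain_ratios(seg1, seg2):
    return [a / b for a, b in zip(seg1, seg2) if b != 0]

def is_good_match(seg1, seg2, tol=0.1, frac=0.9):
    """Accept if at least `frac` of ratios lie within `tol` of the median."""
    ratios = sorted(gain_ratios(seg1, seg2))
    med = ratios[len(ratios) // 2]
    inside = sum(abs(r - med) <= tol * med for r in ratios)
    return inside >= frac * len(ratios)
```

Testing concentration around the median (rather than a fixed window) handles gain differences between cameras without any global color model.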
Local matching
Low texture
Number of states = number of depth levels
Image 1, Image 2
Global regularization
Create MRF (Markov Random Field):
Each segment is a node.
[Figure: segments A–F in image 1 and P–U in image 2 as graph nodes]
Global regularization
P(Disparities | Images) ∝ Likelihood (data term) × Prior (regularization term)
Global regularization
color_A ≈ color_B → z_A ≈ z_B
[Figure: neighboring segments A–F and P–U]
Global regularization
Prior (reconstructed): P(d_i) ∝ ∏_{k ∈ S_i} N(d_i; d_k, σ_{ik}²)
Each normal distribution's variance σ_{ik}² is set by the percentage of shared border and the color similarity between segments i and k.
[Figure: three candidate disparity configurations for segments A–F, scored by the prior]
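A segment-level MRF with this kind of data term and disparity-smoothness prior can be minimized with a simple solver; below, iterated conditional modes (ICM) stands in for the system's actual global optimizer, and all names are illustrative.

```python
# Sketch: ICM on a segment-level MRF. `data[s][d]` is segment s's matching
# cost at disparity level d; `nbrs[s]` lists (neighbor, weight) pairs, with
# larger weights for similar-colored neighbors sharing a long border.

def icm(data, nbrs, levels, lam=1.0, iters=10):
    # Initialize each segment at its own best data-term disparity.
    disp = {s: min(range(levels), key=lambda d: data[s][d]) for s in data}
    for _ in range(iters):
        for s in data:
            # Re-pick s's disparity given its neighbors' current disparities.
            disp[s] = min(
                range(levels),
                key=lambda d: data[s][d] + lam * sum(
                    w * (d - disp[t]) ** 2 for t, w in nbrs.get(s, [])))
    return disp
```

With the smoothness term on, an ambiguous (e.g. low-texture) segment is pulled toward the disparity of its confident neighbors; with `lam=0` it keeps its noisy local answer.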
Multiple disparity maps
Compute a disparity map for each image.
We want the disparity maps to be consistent across images…
Image 1, Image 2
Consistent disparities
z_A ≈ z_P, z_Q, z_S — segment A's depth should agree with the segments it projects onto in the other image.
[Figure: segment A projected into image 2 over segments P, Q, S]
Consistent disparities
Disparities depend on the disparities in neighboring views.
The likelihood term includes the neighboring views' disparities.
Consistent disparities
Use the original data term if the segment is not occluded.
When occluded, bias disparities to lie behind the known surfaces.
Is the segment occluded?
[Figure: segment projected into image I_i — not occluded vs. occluded, with the disparity range allowed when occluded]
Iteratively solve MRF
Depth through time
Matting
[Figure: interpolated view without matting]
[Figure: camera viewing a foreground surface in front of a background surface; foreground color, alpha, and background color are estimated along the boundary]
Bayesian Matting (Chuang et al., 2001)
[Figure: boundary strip of given width between foreground and background]
Rendering with matting
[Comparison: no matting vs. matting]
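Rendering with matting reduces to standard alpha compositing of the boundary layer over the projected main layer, C = αF + (1 − α)B; a minimal sketch:

```python
# Sketch: composite the boundary layer's foreground color F, with alpha a,
# over the projected main layer's color B: C = a*F + (1 - a)*B.
# Fractional alpha along depth discontinuities is what removes the hard
# cut-out halos visible when rendering without matting.

def composite(fg, alpha, bg):
    return [a * f + (1.0 - a) * b for f, a, b in zip(fg, alpha, bg)]
```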
System overview
OFFLINE: Video Capture → Stereo → Representation → Compression → File
ONLINE: File → Selective Decompression → Render
Representation
Main Layer: Color, Depth
Boundary Layer: Color, Depth, Alpha
[Figure: boundary strip of given width between foreground and background]
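The two-layer representation might be modeled like this (field names are illustrative, not the actual file format): a main layer with color and depth everywhere, plus a narrow boundary layer carrying color, depth, and alpha only along depth discontinuities.

```python
# Sketch of a per-camera, per-frame two-layer representation.
from dataclasses import dataclass

@dataclass
class MainLayer:
    color: list      # per-pixel color
    depth: list      # per-pixel depth

@dataclass
class BoundaryLayer:
    color: list      # colors in the boundary strip only
    depth: list
    alpha: list      # fractional coverage from matting

@dataclass
class Frame:
    main: MainLayer
    boundary: BoundaryLayer
```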
System overview
OFFLINE: Video Capture → Stereo → Representation → Compression → File
ONLINE: File → Selective Decompression → Render
Compression
[Figure: grid of frames from cameras 1–4 at times 0 and 1]
Temporal prediction: a camera's frame is predicted from its own earlier frames.
Spatial prediction: a camera's frame is predicted from a neighboring camera at the same time instant.
Warp the reference camera's depth and texture into the predicted camera's view.
Compute the error signal: the predicted frame minus the warped reference.
Reconstruct the predicted view by adding the error signal back to the warped reference.
[Figure: reference and predicted cameras — warped depth and texture, error signal, reconstruction]
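The spatial-prediction round trip can be sketched on a 1-D scanline, with the depth-based warp reduced to a single disparity shift (an illustration, not the actual codec):

```python
# Sketch: warp the reference toward the predicted camera, encode only the
# residual (error signal), and reconstruct by adding the residual back.

def warp(ref, shift, fill=0):
    """Shift the reference scanline; out-of-range samples get `fill`."""
    return [ref[i - shift] if 0 <= i - shift < len(ref) else fill
            for i in range(len(ref))]

def encode(predicted, ref, shift):
    w = warp(ref, shift)
    return [p - q for p, q in zip(predicted, w)]   # error signal

def decode(residual, ref, shift):
    w = warp(ref, shift)
    return [q + r for q, r in zip(w, residual)]    # reconstruction
```

The residual is sparse wherever the warp predicts well, which is what makes inter-camera prediction pay off.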
Boundary layer coding
[Figure: boundary-layer color texture, depth, and alpha matte]
We use our own shape coding method, similar to MPEG-4.
System overview
OFFLINE: Video Capture → Stereo → Representation → Compression → File
ONLINE: File → Selective Decompression → Render
Rendering
For each source camera: project the main layer and the boundary layer, then composite.
Rendering
[Figure: background video depth and color projected on the GPU — the vertex shader passes position and texture coordinates to the pixel shader]
Rendering the main layer (Step 1)
[Figure: the GPU pixel shader renders the projected main-layer depth into the Z-buffer and color buffer; the CPU generates an erase mesh]
Rendering the main layer (Step 2)
Locate depth discontinuities.
[Figure: color buffer and Z-buffer]
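Locating depth discontinuities can be sketched as a simple thresholded scan (the threshold value is illustrative): these are the places where stretched "rubber sheet" triangles in the projected main-layer mesh must be erased and the boundary layer takes over.

```python
# Sketch: find positions along a depth scanline where adjacent samples
# differ by more than a threshold, marking a depth discontinuity.

def discontinuities(depth, thresh=5.0):
    return [i for i in range(len(depth) - 1)
            if abs(depth[i + 1] - depth[i]) > thresh]
```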
Rendering the boundary layer
[Figure: the CPU generates a boundary mesh from boundary depth and RGBA; the GPU composites it over the projected main layer using vertex colors, the color buffer, and the Z-buffer]
Graphics for Vision
Use the GPU for vision.
Real-time stereo (Yang and Pollefeys, CVPR 03)
Rendering
[Figure: cameras 1 and 2 each project main and boundary layers; the GPU pixel shader composites them]
Final Result
Final composite
Weights based on proximity to virtual viewpoint
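Proximity-based blending can be sketched as follows (function names are illustrative): the camera nearer to the virtual viewpoint gets the larger weight, and the weights vary continuously so the view fades smoothly from one camera to the next as the viewpoint moves.

```python
# Sketch: blend two source cameras with weights proportional to the
# virtual viewpoint's proximity to each (1-D camera positions).

def blend_weights(virtual_pos, cam1_pos, cam2_pos):
    d1 = abs(virtual_pos - cam1_pos)
    d2 = abs(virtual_pos - cam2_pos)
    w1 = d2 / (d1 + d2)          # closer camera gets the larger weight
    return w1, 1.0 - w1

def blend(view1, view2, w1, w2):
    return [w1 * a + w2 * b for a, b in zip(view1, view2)]
```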
Compositing views
DEMO
“Massive Arabesque” video clip
Future work
Mesh simplification
More complicated scenes
Temporal interpolation (using optical flow)
Wider range of virtual motion
2D grid of cameras
Summary
Sparse camera configuration
High-quality depth recovery
Automatic matting
New two-layer representation
Inter-camera compression
Real-time rendering