Sequence-to-Sequence Alignment and Applications. Video > Collection of image frames

Sequence-to-Sequence Alignment and Applications

Video > Collection of image frames

Video = Space-time volume

[Figure: space-time volume with axes X, Y, Time]

Sequence-to-Sequence Alignment

[work with Yaron Caspi]

[Figure: Sequence 1 and Sequence 2, each a stack of frames (Frame 1, Frame 2, Frame 3, ..., Frame n)]

[Figure: Video 1 and Video 2, each a stack of frames (Frame 1, Frame 2, Frame 3, ..., Frame n)]

(a) Find temporal correspondences

(b) Find spatial correspondences

(x, y, t) ↔ (x’, y’, t’)   [figure axes: x, y, t]

Align and Integrate Space-Time Info [work with Yaron Caspi]

“Super Sensors” Exceed Optical Bounds of Visual Sensors:

• Spatial resolution

• Temporal resolution

• Spectral range

• Depth of focus

• Dynamic range

• Field-of-View (FOV)

• View point

Align and Integrate space-time info

Image-to-Image Alignment

[Figure: Image 1, Image 2]

Not enough info for alignment in individual frames

Information in Video:

• Appearance info (within frames)

• Dynamic info (between frames)

• Moving objects

• Non-rigid motion

• Varying illumination

Alignment uniquely defined

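Problem Formulation

$$S_1(x, y, t) = S_2(x + u,\; y + v,\; t + w)$$

Where:

$$u = u(x, y, t;\, P_{\text{spatial}})$$

$$v = v(x, y, t;\, P_{\text{spatial}})$$

$$w = w(x, y, t;\, P_{\text{temporal}})$$

$P_{\text{spatial}}$: homography (2D projective), $P_{\text{temporal}}$: 1D affine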
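The parameterization above (a spatial homography plus a 1D affine map in time) can be made concrete with a small sketch. It assumes the spatial model is a 3×3 matrix `H` and the temporal model is `t' = s * t + dt`. The function name and the `(s, dt)` layout are illustrative choices, not taken from the slides.

```python
import numpy as np

def space_time_displacement(x, y, t, H, temporal):
    """Return the displacement (u, v, w) of one space-time point.

    H        -- 3x3 homography (assumed spatial model, P_spatial)
    temporal -- (s, dt) of an assumed 1D affine model t' = s * t + dt (P_temporal)
    """
    s, dt = temporal
    px, py, pz = np.asarray(H, dtype=float) @ np.array([x, y, 1.0])
    x2, y2 = px / pz, py / pz      # projective division
    t2 = s * t + dt                # 1D affine map in time
    return x2 - x, y2 - y, t2 - t

# Illustrative case: shift 2 pixels in x, 1 pixel in y, half a frame in time.
H_shift = np.array([[1.0, 0.0, 2.0],
                    [0.0, 1.0, 1.0],
                    [0.0, 0.0, 1.0]])
u, v, w = space_time_displacement(10.0, 20.0, 5.0, H_shift, (1.0, 0.5))
```

Note that a non-integer `dt` expresses a sub-frame temporal offset between the two sequences.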

Spatio-Temporal Alignment

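SSD Minimization:

$$\mathrm{Err}(P) = \sum_{x,y,t} \big( S_2(x + u,\; y + v,\; t + w) - S_1(x, y, t) \big)^2$$

First-order approximation in (u, v, w), with $S_1$, $S_2$ and $\nabla S_2$ evaluated at (x, y, t):

$$\mathrm{Err}(P) \approx \sum_{x,y,t} \left( S_2 - S_1 + \nabla S_2^{T} \begin{bmatrix} u \\ v \\ w \end{bmatrix} \right)^2$$

Gauss-Newton (coarse-to-fine) iterations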
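To illustrate the Gauss-Newton idea, here is a deliberately reduced sketch: it estimates only a single global temporal shift `w` (no spatial homography, one resolution level) by repeatedly linearizing the SSD error. The function names, the linear interpolation in time, and the `margin` parameter are assumptions made for this example, not the authors' implementation.

```python
import numpy as np

def sample_in_time(S, t):
    """Linearly interpolate volume S (time on axis 0) at real-valued times t."""
    t = np.clip(t, 0, S.shape[0] - 1)
    t0 = np.minimum(np.floor(t).astype(int), S.shape[0] - 2)
    a = (t - t0).reshape((-1,) + (1,) * (S.ndim - 1))
    return (1 - a) * S[t0] + a * S[t0 + 1]

def estimate_temporal_shift(S1, S2, w0=0.0, n_iters=20, margin=5):
    """Gauss-Newton estimate of w such that S1(t) ~ S2(t + w)."""
    frames = np.arange(margin, S1.shape[0] - margin, dtype=float)
    target = S1[margin:-margin]
    w = w0
    for _ in range(n_iters):
        warped = sample_in_time(S2, frames + w)
        # Temporal derivative of S2 at the warped positions (central difference).
        grad_t = sample_in_time(S2, frames + w + 0.5) - sample_in_time(S2, frames + w - 0.5)
        residual = warped - target
        # Minimizer of sum((residual + grad_t * dw)^2) over dw.
        w += -np.sum(grad_t * residual) / np.sum(grad_t ** 2)
    return w

# Synthetic test: 8 "pixels" oscillating with different phases, offset by 1.3 frames.
t = np.arange(100, dtype=float)[:, None]
phases = np.linspace(0.0, 3.0, 8)[None, :]
S2 = np.sin(0.15 * t + phases)
S1 = np.sin(0.15 * (t + 1.3) + phases)
w_est = estimate_temporal_shift(S1, S2)
```

The full problem replaces the scalar `w` with all parameters of the spatial and temporal models and runs the same kind of update at each pyramid level.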

Coarse-to-Fine Minimization

[Figure: space-time pyramids of Sequence 1 and Sequence 2. Each level halves the volume in x, y and time: 256 × 256 × 100 → 128 × 128 × 50 → 64 × 64 × 25 → …]
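A space-time pyramid like the one in the figure can be sketched as follows. The 2×2×2 block averaging used here is an assumption chosen for brevity (the slides do not say which filter is used), and the array layout `(t, y, x)` and function names are illustrative.

```python
import numpy as np

def downsample_volume(S):
    """Halve a (t, y, x) volume along all three axes by 2x2x2 block averaging."""
    nt, ny, nx = (n - n % 2 for n in S.shape)   # drop an odd trailing sample
    S = S[:nt, :ny, :nx]
    return S.reshape(nt // 2, 2, ny // 2, 2, nx // 2, 2).mean(axis=(1, 3, 5))

def build_space_time_pyramid(S, n_levels):
    """Return [finest, ..., coarsest] levels of the space-time pyramid."""
    pyramid = [np.asarray(S)]
    for _ in range(n_levels - 1):
        pyramid.append(downsample_volume(pyramid[-1]))
    return pyramid

# Same sizes as in the figure: 100 frames of 256 x 256 pixels.
volume = np.zeros((100, 256, 256), dtype=np.float32)
shapes = [level.shape for level in build_space_time_pyramid(volume, n_levels=3)]
```

Alignment would start at the coarsest level and pass its estimate down as the initial guess for the next finer level.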

Sequence 1 Sequence 2

Before Alignment After Alignment







Illumination changes:



Increasing Space-Time Resolution in Video [work with Eli Shechtman & Yaron Caspi]

Super-resolution in space and in time.

[Figure: several low-resolution input sequences along the time axis are combined into one high-resolution output sequence]

Spatial Super-Resolution

Multiple low-resolution input images:

High-resolution output image:

Recover small details

What is Super-Resolution in Time?

Recover dynamic events that are “faster” than frame-rate (Generate a “high-speed” camera)

• Application areas: sports events, scientific imaging, etc...

• Effects of “fast” events imaged by “slow cameras”:

(1) Motion aliasing (2) Motion blur

(1) Motion Aliasing

The “Wagon wheel” effect:

[Figure: the same signal along time, shown as a continuous signal, sub-sampled in time, and played back in “slow motion”]
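The wagon-wheel effect follows directly from temporal sub-sampling, and a few lines of arithmetic show it. The sketch below assumes a single marked spoke (so one full turn is needed before the wheel looks the same) and a hypothetical helper name `apparent_rate`.

```python
def apparent_rate(true_rate_hz, frame_rate_hz):
    """Rotation rate (rev/s) perceived when a rotation is sampled at frame_rate_hz.

    The rotation per frame is only observable modulo one full turn, so it is
    wrapped into [-0.5, 0.5) turns per frame before converting back to rev/s.
    """
    turns_per_frame = true_rate_hz / frame_rate_hz
    wrapped = (turns_per_frame + 0.5) % 1.0 - 0.5
    return wrapped * frame_rate_hz

slow = apparent_rate(2.0, 25.0)     # well below half the frame rate: seen correctly
fast = apparent_rate(24.0, 25.0)    # aliased: appears to turn slowly backwards
frozen = apparent_rate(25.0, 25.0)  # exactly one turn per frame: appears still
```

Playing the sub-sampled sequence in slow motion does not help, because the information between frames was never recorded.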

(2) Motion Blur

Space-Time Super-Resolution

[Figure: low-resolution input sequences $S^l_1, \dots, S^l_n$ (axes x, y, t) and the high-resolution space-time volume $S^h(x_h, y_h, t_h)$]

Blur kernel: PSF (in space), exposure time (in time)

$$A\,\vec{h} = \vec{l}$$
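The relation between the low-resolution inputs and the high-resolution volume can be written as a linear system: a matrix A that blurs and samples the unknown high-resolution values h to produce the measured low-resolution values l. The toy below is a 1D (time only) sketch under strong assumptions: box-shaped exposure, known sub-frame offsets, no noise, and a hypothetical camera description `(period, offset, exposure)` in units of high-resolution time steps.

```python
import numpy as np

def build_system(n_high, cameras):
    """Stack one row per low-resolution sample: the mean of h over its exposure."""
    rows = []
    for period, offset, exposure in cameras:
        for start in range(offset, n_high - exposure + 1, period):
            row = np.zeros(n_high)
            row[start:start + exposure] = 1.0 / exposure   # box exposure (assumed)
            rows.append(row)
    return np.vstack(rows)

def super_resolve(A, l):
    """Least-squares solution of A h = l."""
    h, *_ = np.linalg.lstsq(A, l, rcond=None)
    return h

# Five slow cameras with different offsets and two different exposure lengths.
cameras = [(2, 0, 2), (2, 1, 2), (3, 0, 3), (3, 1, 3), (3, 2, 3)]
n_high = 30
A = build_system(n_high, cameras)

rng = np.random.default_rng(0)
h_true = rng.normal(size=n_high)      # "fast" signal at the high temporal resolution
l = A @ h_true                        # what the slow, blurry cameras record
h_est = super_resolve(A, l)
```

Mixing two exposure lengths makes this particular toy system full rank, so the noise-free signal is recovered exactly. With real data the system is typically noisy and may be underdetermined, in which case some form of regularization would be needed.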

Super Resolution in Time

Input 1 Input 2

Input 3 Input 4

(25 frames/sec)

Input sequence in slow motion:

Super Resolution in Time

Output sequence (super-resolved):

(75 frames/sec)

Motion Blur

Overlay of frames

Simulated sequences of “fast” event:

• Very long exposure-time

• Very low frame-rate

One low-res sequence: Another low-res sequence: And another one...

Motion Blur

Output trajectory: (overlay of frames)

Deblurring:

3 out of 18 low-resolution input sequences: (frame overlays)

Output:

Input:

Output sequence:

(x15 frame-rate)

Without estimating motion of the ball!

Motion-Blur

4 input sequences:

Input (low-res) frames at collision:

Output (high-res) frame at collision:

[Figure: Video 1, Video 2, Video 3, Video 4]

Optical Limits of Visual Sensors:

• Spatial resolution

• Temporal resolution

• Spectral range

• Depth of focus

• Dynamic range

• Field-of-View (FOV)

• View point

Very little common visual information!!!

Alignment of Non-Overlapping Sequences

Coherent appearance (Image-to-Image Alignment)

Sequence-to-Sequence Alignment: Alignment in time and in space

Coherent camera behavior

Coherent scene dynamics (Seq-to-Seq Alignment)

[work with Yaron Caspi]

The scene

When is it possible?

1) same center of projection

2) cameras fixed relative to each other

[Figure labels: $T_i$, $S_{i+\Delta t}$, $H$]

$$T_i = H\, S_{i+\Delta t}\, H^{-1}$$

$H = ?$   $\Delta t = ?$

Problem formulation

Input: $\{T_i\}$, $\{S_i\}$

Output: $H$ and $\Delta t$ such that $\forall i:\; T_i = H\, S_{i+\Delta t}\, H^{-1}$


Conjugate matrices have the same eigenvalues:

$$\mathrm{eigenvalues}(T_i) = \mathrm{eigenvalues}(S_{i+\Delta t})$$

Recovering Temporal Alignment

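$$T_i = H\, S_{i+\Delta t}\, H^{-1}, \qquad \Delta t = ?$$

$$\mathrm{eigenvalues}(T_i) = [\lambda_1, \lambda_2, \lambda_3]^{T}, \qquad \mathrm{eigenvalues}(S_{i+\Delta t}) = [\lambda'_1, \lambda'_2, \lambda'_3]^{T}$$

T and S have the same eigenvalues, up to scale:

$$\mathrm{eigenvalues}(T_i) \propto \mathrm{eigenvalues}(S_{i+\Delta t})$$

Search for the temporal shift which minimizes:

$$\Delta t = \arg\min_{\Delta t} \sum_i \mathrm{dist}\big( \mathrm{eigenvalues}(T_i),\; \mathrm{eigenvalues}(S_{i+\Delta t}) \big)$$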
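The eigenvalue-based search for the temporal shift can be sketched as below. Several choices here are assumptions for the sake of a runnable example: the unknown scale is taken to be real and positive, each eigenvalue set is summarized by its sorted real parts and sorted absolute imaginary parts (which avoids ordering ambiguities for complex-conjugate pairs), the distance is Euclidean between normalized descriptors, and the cost is averaged so that shifts with fewer overlapping frames are not favored. All names are illustrative.

```python
import numpy as np

def eig_descriptor(M):
    """Scale-normalized summary of the eigenvalues of a 3x3 matrix."""
    lam = np.linalg.eigvals(M)
    d = np.concatenate([np.sort(lam.real), np.sort(np.abs(lam.imag))])
    return d / np.linalg.norm(d)

def recover_temporal_shift(T, S, candidate_shifts):
    """Return the integer shift dt that best matches eigenvalues of T[i] and S[i + dt]."""
    best_shift, best_cost = None, np.inf
    for dt in candidate_shifts:
        pairs = [(T[i], S[i + dt]) for i in range(len(T)) if 0 <= i + dt < len(S)]
        if not pairs:
            continue
        cost = np.mean([np.linalg.norm(eig_descriptor(t) - eig_descriptor(s))
                        for t, s in pairs])
        if cost < best_cost:
            best_shift, best_cost = dt, cost
    return best_shift

# Synthetic check: T_i is a scaled conjugate of S_{i+3} by an unknown H.
rng = np.random.default_rng(0)
H = rng.normal(size=(3, 3)) + 3.0 * np.eye(3)
S = [np.eye(3) + 0.3 * rng.normal(size=(3, 3)) for _ in range(15)]
true_shift = 3
T = [2.0 * H @ S[i + true_shift] @ np.linalg.inv(H) for i in range(10)]
estimated_shift = recover_temporal_shift(T, S, range(0, 6))
```

The search never needs H itself, which is why the temporal alignment can be recovered before the spatial one.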

Recovering Spatial Transformation

Given $\Delta t$:

$$T_i = H\, S_{i+\Delta t}\, H^{-1} \;\Longrightarrow\; T_i H - H S_{i+\Delta t} = 0$$

Solve a homogeneous set of linear equations in H:

$$T_1 H - H S_{1+\Delta t} = 0$$

$$\vdots$$

$$T_n H - H S_{n+\Delta t} = 0$$
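One way to solve such a homogeneous linear system in H is to vectorize H and take the null vector of the stacked coefficient matrix via SVD. The sketch below assumes the S matrices have already been shifted by the recovered Δt and that each pair is normalized to a common scale. The function name and the Kronecker-product formulation are choices made for this example.

```python
import numpy as np

def recover_homography(T_list, S_list):
    """Least-squares H (up to scale and sign) with T_i H - H S_i = 0 for all i."""
    I = np.eye(3)
    # Row-major vec identities: vec(T H) = kron(T, I) vec(H),
    #                           vec(H S) = kron(I, S.T) vec(H).
    M = np.vstack([np.kron(T, I) - np.kron(I, S.T)
                   for T, S in zip(T_list, S_list)])
    _, _, Vt = np.linalg.svd(M)
    H = Vt[-1].reshape(3, 3)   # right singular vector of the smallest singular value
    return H / np.linalg.norm(H)

# Synthetic check with a known H.
rng = np.random.default_rng(1)
H_true = rng.normal(size=(3, 3)) + 3.0 * np.eye(3)
S_list = [np.eye(3) + 0.3 * rng.normal(size=(3, 3)) for _ in range(5)]
T_list = [H_true @ S @ np.linalg.inv(H_true) for S in S_list]

H_est = recover_homography(T_list, S_list)
H_ref = H_true / np.linalg.norm(H_true)
error = min(np.linalg.norm(H_est - H_ref), np.linalg.norm(H_est + H_ref))
```

A single pair (T, S) leaves several degrees of freedom in H, so multiple pairs with sufficiently different transformations are needed for a unique solution up to scale.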

Sequence 1:

Sequence 2:

Exceed Limited FOV

Combined Sequence:

Sequence 1:

Sequence 2:

Exceed Limited Field of View – Wide-Screen Movies

Wide-screen movie:

Fused Sequence:

Visible light (video): Infra-Red:

Exceed Limited Spectral Range – Day and Night Vision

Zoomed-out Zoomed-in

Exceed Limited Focal Length –

Zoomed-in

Zoomed-out

Exceed Limited Focal Length

Copyright, 1996 © Dale Carnegie & Associates, Inc.

Summary

• Forget image frames

Video = space-time volume >> collection of images

• Use all available spatio-temporal info for analysis, representation, and exploitation.

Applies to many problem areas:
1. Quick search in video.
2. Alignment and integration of information to exceed optical bounds of visual sensors.
3. Action analysis and recognition.
4. Synthesis of video data.

and many more…


A few comments and clarifications regarding Exercise 4

ON THE BOARD

(Please ask a friend if you were not in class)
