DoCoMo USA Labs, All Rights Reserved. Sandeep Kanumuri, NML

Fast super-resolution of video sequences using sparse directional transforms*

Sandeep Kanumuri, Onur G. Guleryuz

DoCoMo USA Labs

*Presented at 2008 SIAM Conference on Imaging Science on 07/09/2008

(Animated slides, please use slide show mode)

Outline

• System Model

• Motivation

• Prior Work

• Our Solution: SWAT (Sparse Warped transform and Adaptive Thresholding)

– Algorithm Flowchart

– Over-complete Transform

– Warped (Directional) Transform

– Over-complete Inverse Transform

– Adaptive Thresholding

• Performance Comparison

• Conclusion

System Model

• Design goals

1. High Quality Rendering

2. Fast Algorithm (Lower Complexity) – Single Frame, Simple Transform

Motivation

Broadcast Video – TV application

• Low-resolution video signal for mobile phones

• Path A (A.1, A.2): via a docking station

– Low-resolution video is sent to the docking station

– Docking station uses the SWAT algorithm to convert low-resolution video to high-resolution video

– High-resolution video is sent to a TV or a large display

• Path B: Low-resolution video is converted to high-resolution video by the cell phone itself using the SWAT algorithm, and the high-resolution video is transmitted to the TV using local wireless technologies

• Only one path (Path A or Path B) is used

BENEFIT: Broadcast programming aimed at mobile phones can also be used in stationary environments

Broadcast Video – VGA phones

Low-resolution video signal for mobile phones

BENEFIT: SWAT capability allows this cell phone to convert low-resolution video to high-resolution video

VGA phone with SWAT capability

VGA phone without SWAT capability

More Applications…

• Video Quality Enhancement Service

– SWAT algorithm can be deployed as a service to enhance the resolution and quality of videos

• Video Conferencing

– A SWAT equipped terminal can show video at a higher zoom level and with improved quality

• High-quality Image Zooming

– SWAT algorithm enables the mobile phone to convert the low quality, low resolution image into a high quality, high resolution image

Prior Work

• Linear solutions

– Filter design

• Non-linear solutions

– Regularization (Projection onto the model space)

– Data Consistency (Projection onto the input space)

• Signal Sparsity

– Iterated Denoising / Shrinkage

– Lp-Norm Minimization

• Optical Flow

• Adaptive filtering

• Example-based approaches

SWAT Algorithm Flowchart

1. Input Image/Video (low-resolution, low quality)

2. Linear Interpolation Filter, giving a high-resolution, low quality estimate

3. Regularization: Directional Over-complete Transform, then Adaptive Thresholding, then Directional Over-complete Inverse Transform

4. Enforce Data Consistency

5. More iterations? If yes, return to step 3; if no, continue

6. Output Image/Video (high-resolution, high quality)
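The flowchart can be read as a short loop. The following Python outline is only a structural sketch under assumptions: `swat_like_loop` and the callables `interpolate`, `regularize` and `enforce_consistency` are hypothetical names standing in for the flowchart blocks, and the toy stand-ins at the bottom exist only so the outline runs. It is not the authors' implementation.

```python
from typing import Callable, List

Signal = List[float]


def swat_like_loop(
    low_res: Signal,
    interpolate: Callable[[Signal], Signal],
    regularize: Callable[[Signal], Signal],
    enforce_consistency: Callable[[Signal, Signal], Signal],
    iterations: int = 2,
) -> Signal:
    """Structural outline of the flowchart (all names are hypothetical)."""
    # Initial high-resolution, low-quality estimate.
    estimate = interpolate(low_res)
    for _ in range(iterations):
        # Regularization: forward transform, thresholding, inverse transform.
        estimate = regularize(estimate)
        # Bring the estimate back into agreement with the low-resolution input.
        estimate = enforce_consistency(estimate, low_res)
    return estimate


# Toy stand-ins so the outline runs; they are NOT the blocks from the slides.
def repeat_samples(x: Signal) -> Signal:
    return [v for v in x for _ in range(2)]


def identity(x: Signal) -> Signal:
    return list(x)


def keep_estimate(high: Signal, low: Signal) -> Signal:
    return list(high)


result = swat_like_loop([1.0, 2.0], repeat_samples, identity, keep_estimate)
```

The default of 2 iterations mirrors the setting reported in the performance comparison; any other value is equally valid for this outline.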

Linear Interpolation Filter

• A linear interpolation filter is used to form an initial estimate of the high-resolution image/video

– However, the quality of interpolation is relatively low

• Popular filter choices

– Low pass filter of Daubechies 7/9 Inverse Wavelet

– H.264 Interpolation Filter

• A customized linear interpolation filter can be used if any of the following is known

– Downsampling filter (if the input was obtained by downsampling a higher resolution original)

– Filtering caused by the camera acquisition process
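A minimal 1-D sketch of interpolation by a linear filter: zeros are inserted between the input samples and the result is filtered. The kernel `(0.5, 1.0, 0.5)` is an assumption chosen for brevity (it gives plain linear interpolation); it is not the Daubechies 7/9 or H.264 filter named above, and samples outside the signal are treated as zero. The function name `upsample_by_two` is hypothetical.

```python
def upsample_by_two(samples, kernel=(0.5, 1.0, 0.5)):
    """Zero-stuff by a factor of two, then apply a symmetric FIR kernel."""
    # Insert a zero after every input sample.
    zero_stuffed = []
    for value in samples:
        zero_stuffed.extend([value, 0.0])

    half = len(kernel) // 2
    output = []
    for i in range(len(zero_stuffed)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i + j - half
            # Samples outside the signal are treated as zero (assumption).
            if 0 <= idx < len(zero_stuffed):
                acc += weight * zero_stuffed[idx]
        output.append(acc)
    return output


upsampled = upsample_by_two([1.0, 3.0])
```

The last output sample is pulled towards zero by the zero-padding at the border, which is one reason practical systems use a more careful boundary treatment.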

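Core idea – Exploit Signal Sparsity

[Figure: a signal $s(n)$, $n = 0, \ldots, N-1$ (Signal Domain) and its coefficients $S(k)$, $k = 0, \ldots, N-1$ (Sparse Decomposition Domain). The observed coefficients are $C(k) = S(k) + W(k)$, where $W(k)$ is "noise". Thresholds $+T$ and $-T$ are applied to obtain the denoised estimate $\hat{C}(k)$.]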
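A small numeric illustration of the idea, assuming hard thresholding (the slides do not say whether hard or soft thresholding is meant) and made-up coefficient values. `hard_threshold` is a hypothetical helper name.

```python
def hard_threshold(coeffs, threshold):
    """Keep coefficients whose magnitude exceeds the threshold, zero the rest."""
    return [c if abs(c) > threshold else 0.0 for c in coeffs]


# A sparse set of coefficients S(k): only two entries are significant.
clean = [10.0, 0.0, 0.0, -8.0, 0.0, 0.0, 0.0, 0.0]
# Small "noise" W(k) spread over all coefficients (illustrative values).
noise = [0.3, -0.2, 0.4, 0.1, -0.3, 0.2, -0.1, 0.25]
# Observed coefficients C(k) = S(k) + W(k).
observed = [s + w for s, w in zip(clean, noise)]

denoised = hard_threshold(observed, threshold=1.0)
surviving = [k for k, c in enumerate(denoised) if c != 0.0]
```

Because the signal energy is concentrated in a few large coefficients while the noise is spread thinly, thresholding removes most of the noise and keeps the significant coefficients at positions 0 and 3.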

Over-complete Transform

• Transform size: 4x4 (used for description), 3x3

• Transform used: DCT, Hadamard

• For an Over-complete Transform

– All possible 4x4 blocks in the image/frame are selected using a non-directional mask

– Each 4x4 block undergoes a transform to produce a set of transformed coefficients

– Each pixel is involved in multiple transforms (16 on average)

– Total number of transformed coefficients ~ 16 x number of pixels

• Directional Over-complete Transform

– Here, each of the 4x4 blocks is formed by applying a directional mask followed by a warping process (see next slide)

[Figure: Blocks of an Over-complete Transform. A non-directional mask is used to select a 4x4 block at every position, from Block (1,1), Block (1,2), …, Block (1,W-3) in the first row to Block (H-3,1), Block (H-3,2), …, Block (H-3,W-3) in the last row. H = Height of image; W = Width of image]
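The non-directional case can be sketched in a few lines of Python. This is an illustration under assumptions: it uses a 4x4 Hadamard transform with orthonormal scaling, 0-based block indices (the slide counts from 1), and hypothetical function names `hadamard_4x4` and `overcomplete_transform`.

```python
# 4x4 Hadamard matrix (symmetric, H4 * H4^T = 4 * I).
H4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def hadamard_4x4(block):
    """Return H4 * block * H4^T / 4, i.e. an orthonormally scaled 2-D transform."""
    tmp = [[sum(H4[i][k] * block[k][j] for k in range(4)) for j in range(4)]
           for i in range(4)]
    return [[sum(tmp[i][k] * H4[j][k] for k in range(4)) / 4.0 for j in range(4)]
            for i in range(4)]


def overcomplete_transform(image):
    """Transform every 4x4 block of the image, one block per top-left position."""
    height, width = len(image), len(image[0])
    coeffs = {}
    for r in range(height - 3):
        for c in range(width - 3):
            block = [row[c:c + 4] for row in image[r:r + 4]]
            coeffs[(r, c)] = hadamard_4x4(block)
    return coeffs


image = [[1] * 8 for _ in range(6)]  # flat 6x8 test image
coeffs = overcomplete_transform(image)
```

For an H x W image this yields (H-3) x (W-3) blocks of 16 coefficients each, which is where the "~ 16 x number of pixels" count comes from; interior pixels belong to exactly 16 blocks, border pixels to fewer.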

Warped (Directional) Transform

• Signal sparsity in DCT domain holds for horizontal and vertical edges but is violated on directional edges

• Transform domain: 4x4 DCT

• Let us consider 4 blocks along the edge

– First, using Non-directional masks

– Now, using Directional masks

– Directional masks lead to sparse representation

• Transform support is warped

• For Directional Over-complete Transform, Directional masks replace the Non-directional mask

[Figure: Non-directional mask and Directional masks]

How to choose a mask?

• Decision made for a block (4x4) of pixels

– At each pixel, a vote is cast for the mask that minimizes the signal variance along the mask direction

– The mask with the most votes is chosen

• Reduces inconsistency in directions

[Figure: Example masks]
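The voting rule can be sketched as follows. The sketch simplifies each mask to a single step direction; the four entries in `DIRECTIONS`, the `reach` of two pixels on each side, the border clamping, and all function names are assumptions made for illustration, since the actual mask shapes appear only in the figure.

```python
from collections import Counter

# Hypothetical set of directions, each given as a (row step, column step).
DIRECTIONS = {
    "horizontal": (0, 1),
    "vertical": (1, 0),
    "diagonal_down": (1, 1),
    "diagonal_up": (-1, 1),
}


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def directional_variance(image, row, col, step, reach=2):
    """Variance of the pixels sampled along one direction through (row, col)."""
    height, width = len(image), len(image[0])
    samples = []
    for t in range(-reach, reach + 1):
        r = min(max(row + t * step[0], 0), height - 1)  # clamp at the borders
        c = min(max(col + t * step[1], 0), width - 1)
        samples.append(image[r][c])
    return variance(samples)


def choose_direction(image, top, left, size=4):
    """Each pixel of the block votes for its minimum-variance direction."""
    votes = Counter()
    for row in range(top, top + size):
        for col in range(left, left + size):
            best = min(
                DIRECTIONS,
                key=lambda name: directional_variance(image, row, col, DIRECTIONS[name]),
            )
            votes[best] += 1
    return votes.most_common(1)[0][0]


# Test image that is constant along the down-right diagonal.
image = [[r - c for c in range(8)] for r in range(8)]
chosen = choose_direction(image, top=2, left=2)
```

Taking a majority over the block, rather than letting each pixel decide alone, is what keeps neighbouring pixels from picking conflicting directions.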

Over-complete Inverse Transform

• For an Over-complete Inverse Transform

– Each set of transformed coefficients is converted back to pixel domain

– Each pixel has multiple estimates from different blocks and a weighted combination is used to arrive at its final estimate

[Figure: estimates from overlapping blocks are combined with weights W1, W2, W3, and so on with all the blocks]
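A sketch of the weighted combination, assuming the final estimate of a pixel is the weight-normalized sum of all block estimates that cover it. The slides do not say how the weights are chosen, so the weights here are plain inputs, and `combine_block_estimates` is a hypothetical name.

```python
def combine_block_estimates(height, width, block_estimates):
    """Merge overlapping block estimates into one image.

    block_estimates: iterable of (top, left, pixels, weight), where pixels
    is the block already converted back to the pixel domain.
    """
    weighted_sum = [[0.0] * width for _ in range(height)]
    weight_total = [[0.0] * width for _ in range(height)]
    for top, left, pixels, weight in block_estimates:
        for i, row in enumerate(pixels):
            for j, value in enumerate(row):
                weighted_sum[top + i][left + j] += weight * value
                weight_total[top + i][left + j] += weight
    # Normalize by the total weight; uncovered pixels are left at 0.0.
    return [
        [weighted_sum[r][c] / weight_total[r][c] if weight_total[r][c] > 0 else 0.0
         for c in range(width)]
        for r in range(height)
    ]


# Two overlapping 1x2 "blocks" on a 1x3 image, with weights 1 and 3.
combined = combine_block_estimates(
    1, 3,
    [(0, 0, [[2.0, 2.0]], 1.0), (0, 1, [[4.0, 4.0]], 3.0)],
)
```

The middle pixel is covered by both blocks, so its value is (1 x 2 + 3 x 4) / 4 = 3.5, closer to the estimate with the larger weight.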

Adaptive Thresholding

• Transform coefficients are thresholded for denoising

• A master threshold ($T$) is used for an initial pass

• A local threshold ($\hat{T}$) is calculated and finally used: $\hat{T} = f(E_{lost}) \cdot T$

– $E_{lost}$: Energy lost due to thresholding when $T$ is used as threshold

• Parameters f1 to fn and E1 to En are tuned to achieve a local optimum

[Figure: plot of f() against Elost, with origin (0,0), breakpoints E1, E2, …, En on the horizontal axis and levels f1, f2, …, fn and 1 on the vertical axis]
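A sketch of how a local threshold could be derived from the master threshold. Several points are assumptions, not taken from the slides: f() is modelled as piecewise linear between the breakpoints and clamped outside them, the energy lost is the sum of squares of the coefficients that the master threshold would zero, and the breakpoint values are invented for the example. All function names are hypothetical.

```python
def energy_lost(coeffs, master_threshold):
    """Energy of the coefficients that the master threshold would zero out."""
    return sum(c * c for c in coeffs if abs(c) <= master_threshold)


def scale_factor(e_lost, breakpoints):
    """Piecewise-linear f() through (E_i, f_i) pairs sorted by E_i (assumed shape)."""
    if e_lost <= breakpoints[0][0]:
        return breakpoints[0][1]
    for (e0, f0), (e1, f1) in zip(breakpoints, breakpoints[1:]):
        if e_lost <= e1:
            return f0 + (f1 - f0) * (e_lost - e0) / (e1 - e0)
    return breakpoints[-1][1]


def local_threshold(coeffs, master_threshold, breakpoints):
    """Local threshold = f(E_lost) * master threshold."""
    e_lost = energy_lost(coeffs, master_threshold)
    return scale_factor(e_lost, breakpoints) * master_threshold


# Invented breakpoints (E_i, f_i); in the slides these are tuned parameters.
breakpoints = [(0.0, 0.0), (2.0, 0.5), (4.0, 1.0)]
t_hat = local_threshold([5.0, 1.0, -1.0, 1.0], 2.0, breakpoints)
```

Here three coefficients fall below the master threshold of 2, so the lost energy is 3.0, f(3.0) is 0.75, and the local threshold becomes 1.5.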

Enforcing Data Consistency

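• Role of data consistency module

– Ensure that the high-resolution estimate, when downsampled, can produce the low-resolution input

[Figure: Data Consistency module. The High-resolution Input is passed through the Downsampling Filter and the result is subtracted from the Low-resolution Input; the difference is passed through the Linear Interpolation Filter and added to the High-resolution Input to give the High-resolution Output]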
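A 1-D sketch of the data consistency step, reading the module as: output = high-resolution input + interpolate(low-resolution input - downsample(high-resolution input)). The pair-averaging downsampler and sample-repeating interpolator are assumptions chosen so the example is easy to check by hand; they are not the filters used in the talk, and the function names are hypothetical.

```python
def downsample(high):
    """Average each pair of samples (stand-in for the downsampling filter)."""
    return [(high[i] + high[i + 1]) / 2.0 for i in range(0, len(high) - 1, 2)]


def interpolate(low):
    """Repeat each sample twice (stand-in for the linear interpolation filter)."""
    return [v for v in low for _ in range(2)]


def enforce_data_consistency(high, low):
    """Add the interpolated low-resolution residual back to the estimate."""
    residual = [l - d for l, d in zip(low, downsample(high))]
    correction = interpolate(residual)
    return [h + c for h, c in zip(high, correction)]


high_estimate = [1.0, 3.0, 5.0, 7.0]
low_input = [3.0, 5.0]
corrected = enforce_data_consistency(high_estimate, low_input)
```

With this particular filter pair, downsampling the corrected estimate reproduces the low-resolution input exactly; with other filter pairs the agreement would generally be approximate.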

Performance Comparison

• Super-resolution of QCIF to CIF sequences

– Low pass filter from Daubechies 7/9 wavelet filter bank

– Compression is done using H.264/AVC codec (JM12.0)

• SWAT run with 2 iterations

• Compared with

– Bilinear interpolation

– H.264 interpolation

– Simple Inverse

– Iterated Denoising / Shrinkage (ID)

• 2 iterations (similar complexity compared to SWAT)

• 10 iterations

PSNR comparison (uncompressed)
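For reference, PSNR is conventionally computed from the mean squared error between a reference and an estimate. The sketch below uses the standard definition with an assumed peak value of 255 (8-bit samples); the slides do not state the exact settings used for the plots, and `psnr` is a hypothetical helper name.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally long sample lists."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak * peak / mse)


worst_case = psnr([0.0, 0.0, 0.0, 0.0], [255.0, 255.0, 255.0, 255.0])
identical = psnr([10.0, 20.0], [10.0, 20.0])
```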

PSNR comparison (uncompressed)

PSNR comparison (uncompressed)

Visual Comparison (uncompressed)

[Figure panels: H.264, ID (2 iterations), SWAT]

Visual Comparison (uncompressed)

[Figure panels: H.264, ID (2 iterations), SWAT]

PSNR comparison (compression at QP=20)

PSNR comparison (compression at QP=25)

Visual Comparison (compression at QP=25)

[Figure panels: H.264, SWAT]

Visual Comparison (compression at QP=25)

[Figure panels: H.264, SWAT]

Conclusion

• SWAT algorithm renders high quality output and yet remains fast

– Quality comparable to ID (10 iterations)

– Complexity comparable to ID (2 iterations)

• Enabling Features

– Over-complete transform representation

– Simple basic transform (Hadamard, Integer DCT)

– Sparse warped transform

– Adaptive thresholding

– Weighted inverse transform

• Reference

– S. Kanumuri, O. G. Guleryuz and M. R. Civanlar, "Fast super-resolution reconstructions of mobile video using warped transforms and adaptive thresholding", SPIE Applications of Digital Image Processing XXX, August 2007

• Flicker Reduction Application

– To appear in SPIE 2008 (Applications of Digital Image Processing XXXI)

• E-mail:

– Sandeep Kanumuri ([email protected])

– Onur G. Guleryuz ([email protected])