Blur Processing Using Double Discrete Wavelet Transform

Yi Zhang    Keigo Hirakawa
Department of Electrical & Computer Engineering, University of Dayton

http://campus.udayton.edu/˜ISSL

Abstract

We propose a notion of double discrete wavelet transform (DDWT) that is designed to sparsify the blurred image and the blur kernel simultaneously. DDWT greatly enhances our ability to analyze, detect, and process blur kernels and blurry images—the proposed framework handles both global and spatially varying blur kernels seamlessly, and unifies the treatment of blur caused by object motion, optical defocus, and camera shake. To illustrate the potential of DDWT in computer vision and image processing, we develop example applications in blur kernel estimation, deblurring, and near-blur-invariant image feature extraction.

1. Introduction

Image blur is caused by a pixel recording light from multiple sources. Illustrated in Figure 1 are three common types of blur: defocus, camera shake, and object motion. Defocus blur is caused by a wide aperture that prevents light rays originating from the same point from converging. Camera motion during the exposure produces global motion blur, where the same point on the scene is observed by multiple moving pixel sensors. Object motion causes each pixel to observe multiple points on the scene, which produces spatially variant motion blur. Assuming Lambertian surfaces, blur is typically represented by the implied blur kernel that acts on the unobserved sharp in-focus image.

Blur is a part of everyday photography. On one hand, long exposure is needed to overcome poor lighting conditions, but it increases the risk of camera shake and object motion blur that severely deteriorate the sharpness of the image. Automatic or manual focus is also a challenge when the scene covers a wide range of depths or is rapidly changing (e.g. sports photography), often causing unwanted defocus blur. On the other hand, professional photographers use well-controlled blur to enhance the aesthetics of a photograph. Thus the ability to manipulate blur in post-processing would offer greater flexibility in consumer and professional photography.

Blur is also valuable for computer vision. Blur may vary across spatial location (e.g. a scene with multiple moving objects or multiple depths) or be global (e.g. camera shake). We may learn the temporal state of the camera and the scene from blur caused by camera shake or object motion, respectively. The defocus blur kernel varies with the object distance/depth, which can be useful for three-dimensional scene retrieval from a single camera [23]. Blur also interferes with recognition tasks, as feature extraction from blurry images is a real challenge.

Figure 1. Variations on the types of image blur: (a) defocus, (b) camera shake, (c) object motion.

In this paper, we propose a novel framework to address the analysis, detection, and processing of blur kernels and blurry images. Central to this work is the notion of double discrete wavelet transform (DDWT), which is designed to sparsify the blurred image and the blur kernel simultaneously. We contrast DDWT with the work of [3], which regularizes the image and blur kernel in terms of their sparsity in a linear transform domain. The major disadvantage of the regularization approach is that the image/blur coefficients are not directly observed, hence requiring a computationally taxing search to minimize some "cost" function. On the other hand, our DDWT provides a way to observe the wavelet coefficients of the image and blur kernel directly. This gives DDWT coefficients a very intuitive interpretation, and simplifies the task of decoupling the blur from the signal, regardless of why the blur occurred (e.g. object motion, defocus, or camera shake) or the type of blur (e.g. global or spatially varying). In this sense, DDWT is likely to impact computer vision and image processing applications broadly. Although the primary goal of this article is to develop DDWT as an analytical tool, we also show example applications in blur kernel estimation, deblurring, and near-blur-invariant image feature extraction to illustrate the potential of DDWT.


(a) DDWT analysis in (2)  (b) DDWT analysis in (3)
Figure 2. The two processing pipelines above are equivalent. Though (a) is the direct result of applying DDWT ($d_i \ast d_j$) to the observed blurry image $y$, (b) is the interpretation we give to the DDWT coefficients.

2. Related Work

Recent advances in blind and non-blind deblurring have enabled us to handle complex uniform blur kernels (e.g. [20, 19, 12, 3, 2, 21, 22, 9]). By comparison, progress in blind and non-blind deblurring for spatially varying blur kernels (e.g. [16, 17, 4, 6]) has been slow, since there is limited data available to support a localized blur kernel. For this reason, it is more common to address this problem using multiple input images [1, 7] and additional hardware [21, 18, 8]. Computational approaches include supervised [16] or unsupervised [17, 10] foreground/background segmentation, statistical modeling [4], homography-based blur kernel modeling methods [24, 15], and partial differential equation (PDE) methods [6]. In particular, sparsifying transforms have played key roles in the detection of blur kernels—the gradient operator [4, 6, 10, 9] and wavelet/framelet/curvelet transforms [3, 11] have been used for this purpose.

However, existing works have shortcomings, such as ringing artifacts in deblurring [20, 5] or an inability to handle spatially varying blur [19, 12, 3, 2, 21, 22, 9]. It is also common for deblurring algorithms to require iteration [3, 20, 5, 12, 24, 15], which is highly undesirable for many real-time applications. Besides PDE, the authors are unaware of any existing framework that unifies analysis, detection, and processing of camera shake, object motion, defocus, global, and spatially varying blurs.

3. Double Discrete Wavelet Transform

3.1. Definitions

We begin by defining the single discrete wavelet transform (DWT) and the proposed double discrete wavelet transform (DDWT). Both transforms are defined to take an over-complete (a.k.a. undecimated) form and to be invertible.

Definition 1 (DWT). Let $y : \mathbb{Z}^2 \to \mathbb{R}$ be an image signal and $n \in \mathbb{Z}^2$. Denote by $d_j : \mathbb{Z}^2 \to \mathbb{R}$ a wavelet analysis filter of the $j$th subband. Then
$$w_j(n) := \{d_j \ast y\}(n)$$
is the $j$th-subband, $n$th-location over-complete single discrete wavelet transform coefficient of the image $y(n)$, where $\ast$ denotes the convolution operator.

Definition 2 (DDWT). The over-complete double discrete wavelet transform is defined by the relation
$$v_{ij}(n) := \{d_i \ast w_j\}(n),$$
where $v_{ij}(n)$ is the transform coefficient of the image $y(n)$ in the $(i,j)$th subband at location $n$.

In the special case that $d_j(n)$ is a 1D horizontal wavelet analysis filter and $d_i(n)$ is a vertical one, $v_{ij}(n)$ is an ordinary separable 2D wavelet transform. In our work, however, we allow $d_j(n)$ and $d_i(n)$ to be arbitrarily defined (e.g. both horizontal). Technically speaking, the DWT/DDWT definitions above may apply to non-wavelet transforms $d_j$ and $d_i$, so long as they are invertible.
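To make the definitions concrete, the following sketch (our own illustration, not code from the paper) computes over-complete DWT and DDWT coefficients for one subband pair by direct 2D convolution; the Haar-like filter choice and the helper names are assumptions made for the example.

```python
import numpy as np
from scipy.signal import convolve2d

def dwt_coeffs(y, d):
    """Over-complete (undecimated) DWT subband: w_j = d_j * y."""
    return convolve2d(y, d, mode="same", boundary="symm")

def ddwt_coeffs(y, d_i, d_j):
    """DDWT subband: v_ij = d_i * (d_j * y)."""
    return dwt_coeffs(dwt_coeffs(y, d_j), d_i)

# Example: both analysis filters are the horizontal Haar difference [-1, 1],
# which is the setting used for horizontal motion blur in Section 4.
d_haar = np.array([[-1.0, 1.0]])

y = np.random.rand(64, 64)                 # stand-in for an observed blurry image
w_j = dwt_coeffs(y, d_haar)                # single DWT coefficients
v_ij = ddwt_coeffs(y, d_haar, d_haar)      # double DWT coefficients
```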

3.2. DDWT Analysis of Blurred Images

Assuming Lambertian reflectance, let $x : \mathbb{Z}^2 \to \mathbb{R}$ be the latent sharp image, $y : \mathbb{Z}^2 \to \mathbb{R}$ the observed blurry image, and $n \in \mathbb{Z}^2$ the pixel location index. Then the observation $y$ is assumed to be given by
$$y(n) = \{x \ast h_n\}(n) + \varepsilon(n), \tag{1}$$
where $\varepsilon : \mathbb{Z}^2 \to \mathbb{R}$ is measurement noise. The point spread function $h_n : \mathbb{Z}^2 \to \mathbb{R}$ denotes a (possibly local) blur kernel acting at pixel location $n$, which may not be known a priori. However, $h_n$ may take a parametric form in the case of motion blur (parameterized by object speed and direction) or defocus blur (by aperture radius or depth). In order for the convolution model of (1) to hold, the Lambertian reflectance assumption is necessary, since objects may be observed from a different angle (e.g. as the camera or objects move). Although the degree of deviation of non-Lambertian reflectance from the model of (1) depends on the properties of the surface material (such as the Fresnel constant), it is common practice in the blur/deblur literature to approximate real-world reflectance with (1), as we do in this paper as well. Where understood, the subscript $n$ is omitted from $h_n(n)$.

When DDWT is applied to the observation $y$, the corresponding DDWT coefficients $v_{ij}$ are related to the latent sharp image $x$ and the blur kernel $h$ by
$$v_{ij}(n) = \{d_i \ast d_j \ast y\}(n) \tag{2}$$
$$\phantom{v_{ij}(n)} = \{d_i \ast h\} \ast \{d_j \ast x\}(n) + \{d_i \ast d_j \ast \varepsilon\}(n) = \{q_i \ast u_j\}(n) + \eta_{ij}(n), \tag{3}$$
where $u_j := d_j \ast x$ and $q_i := d_i \ast h$ are the DWT decompositions of $x$ and $h$, respectively, and $\eta_{ij} := d_i \ast d_j \ast \varepsilon$ is the noise in the DDWT domain. The relation between (2) and (3) is also illustrated in Figure 2. By the commutativity and associativity of convolution, the processes in Figure 2(a) and 2(b) are equivalent—2(a) is the direct result of applying DDWT to the observed blurry image,¹ but 2(b) is the interpretation we give to the DDWT coefficients (though 2(b) is not directly computable).

(a) Blurred input image y  (b) DWT coefficients wj  (c) DDWT coefficients vij  (d) Estimated pixel velocity kn
Figure 3. Example of DWT and DDWT coefficients using real camera sensor data. The motion blur manifests itself as a double edge in DDWT, where the distance between the double edges (yellow arrows in (c)) corresponds to the speed of the object. The average velocity of the moving pixels in (d) is 38 pixels, and the direction of motion is 40 degrees above the horizontal.

Suppose that $u_j$ and $q_i$ are sufficiently sparse. Then the DDWT coefficients $v_{ij}$ are the result of applying a "sparse filter" $q_i$ to a "sparse signal" $u_j$. For a filter $q_i$ supported on $n \in \{n_1, \ldots, n_K\}$, we have
$$v_{ij}(n) = \sum_{k=1}^{K} q_i(n_k)\, u_j(n - n_k). \tag{4}$$
When $K$ is small, $v_{ij}$ is nothing more than a sum of $K$ DWT coefficients $u_j$. Thanks to sparsity, many of the $u_j$ are already zero, and so $v_{ij}$ is actually a sum of only a few (far fewer than $K$) DWT coefficients $u_j$. In this paper, we call a DDWT coefficient aliased if $v_{ij}$ is a sum of more than one "active" $u_j$ coefficient. One can reduce the risk of aliasing by choosing $d_j$ and $d_i$ to make $u_j$ and $q_i$ as sparse as possible. By symmetry, one may also interpret the DDWT coefficients as $v_{ij} = \{q_j \ast u_i\} + \eta_{ij}$—this is equally valid. In practice, however, the "confusion" between $(q_i, u_j)$ and $(q_j, u_i)$ does not seem to be a concern for algorithm development when $q_i$ is more sparse than $q_j$.

Recovery of $u_j$ from $v_{ij}$ leads to image deblurring, while reconstructing $q_i$ is the blur kernel detection problem. Clearly, it is easy to decouple $u_j$ and $q_i$ if $v_{ij}$ is unaliased, and reasonably uncomplicated when $u_j$ and $q_i$ are sufficiently sparse. In the subsequent sections, we demonstrate the power of DDWT by analyzing specific blur types and designing example applications. Owing to the page limit, we describe object motion blur processing at length. The DDWT treatment of other blur types is brief, but their details follow the example of object motion processing closely. Where understood, the subband indices $i$ and $j$ are omitted from $v_{ij}$, $w_j$, $u_j$, and $q_i$.

¹Images captured by a color image sensor undergo demosaicking, which makes (1) void. One can circumvent this problem by using the demosaicking method in [13], designed to recover the DWT coefficients $w_j$ from color filter array data directly.

4. Object Motion Blur

4.1. DDWT Analysis

Consider for the moment a horizontal motion blur kernel. Assuming constant velocity of the object during the exposure, the blur kernel can be modeled as
$$h(n) = \frac{1}{k}\left[\mathrm{step}\!\left(n + \binom{0}{k/2}\right) - \mathrm{step}\!\left(n - \binom{0}{k/2}\right)\right], \tag{5}$$
where $k$ is the speed of the object. Letting $d_i$ denote a Haar wavelet transform $[-1, 1]$, the DWT coefficient $q_i$ is just a difference of two impulse functions:
$$q(n) = \frac{1}{k}\left[\delta\!\left(n + \binom{0}{k/2}\right) - \delta\!\left(n - \binom{0}{k/2}\right)\right]. \tag{6}$$
Hence $\{q \ast u\}$ is a "difference of two DWT coefficients" placed $k$ pixels apart:
$$\{q \ast u\}(n) = \frac{1}{k}\left[u\!\left(n + \binom{0}{k/2}\right) - u\!\left(n - \binom{0}{k/2}\right)\right]. \tag{7}$$
Figure 3 shows an example of DDWT coefficients. Recall that DWT coefficients $u$ typically capture directional image features (say, vertical edges). By (7), we intuitively expect the DDWT of moving objects to yield "double edges" where the distance between the two edges² corresponds exactly to the speed of the moving object.

²Red lines denote positive DDWT coefficients while blue lines denote negative ones.

(a) DWT (or DDWT with no blur)  (b) DDWT with object motion blur  (c) DDWT with defocus blur
Figure 4. Autocorrelation examples. The minimum coincides with the pixel speed/blur radius we seek. Red lines denote $\ell \geq L$ or $s \geq S$.

Indeed, detection of local object speed simplifies to the task of detecting these double edges. Deblurring is equally straightforward: the double edges in the DDWT coefficients $v$ are copies of $u$.
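To make the double-edge intuition concrete, here is a small synthetic check (our own illustration, with an assumed blur length k and a toy edge image): a length-k horizontal box blur followed by two applications of the Haar difference filter produces two opposite-signed copies of the edge separated by k pixels.

```python
import numpy as np
from scipy.signal import convolve2d

k = 9                                    # assumed motion-blur length in pixels
x = np.zeros((64, 64))
x[:, 32:] = 1.0                          # sharp vertical edge (latent image)

h = np.ones((1, k)) / k                  # horizontal box blur of length k
y = convolve2d(x, h, mode="same", boundary="symm")

d = np.array([[-1.0, 1.0]])              # horizontal Haar difference filter
w = convolve2d(y, d, mode="same", boundary="symm")   # DWT coefficients
v = convolve2d(w, d, mode="same", boundary="symm")   # DDWT coefficients

row = v[32]                              # one scan line through the edge
peaks = np.nonzero(np.abs(row) > 1e-6)[0]
print(peaks.max() - peaks.min())         # separation of the double edge = k
```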

4.2. Object Motion Detection

Naturally, the human eye is well suited for the task of identifying replicated DWT coefficients $u$ in the DDWT coefficients $v$—in military applications, for example, a human "analyst" can easily detect complex motion (such as rotation) when presented with a picture of DDWT coefficients (such as Figure 3(c)). For computational object motion detection, our ability to detect complex motion is limited primarily by the capability of computer vision algorithms to extract local image features from DDWT and to find similar features located an unknown ($k$-pixel) distance away. There are a number of ways this can be accomplished—we emphasize that the correlation-based technique we present below should not be taken as the best way, but rather as a proof of concept for DDWT. A more advanced image feature extraction strategy founded on computer vision principles would likely improve the overall performance of object motion detection—we leave this as a future research plan.

Assume that $u$ in (7) is wide-sense stationary. A typical autocorrelation function $R_u(\ell) := E\!\left[u\!\left(n - \binom{0}{\ell/2}\right) u\!\left(n + \binom{0}{\ell/2}\right)\right]$ of $u$ is shown in Figure 4(a). When $v(n) = u(n) + \eta(n)$ (i.e. no blur),
$$R_v(\ell) = R_u(\ell) + R_\eta(\ell). \tag{8}$$
On the other hand, in the presence of motion blur, $R_v(\ell)$ is a linear combination of $R_u(\ell)$ and $R_\eta(\ell)$:
$$R_v(\ell) = \frac{2R_u(\ell) - R_u(\ell - k) - R_u(\ell + k)}{k^2} + R_\eta(\ell). \tag{9}$$
As illustrated by Figure 4(b), the maximum of $R_v(\ell) \approx \sigma_u^2/k^2 + \sigma_\eta^2$ occurs at $\ell = 0$. The two minima of $R_v(\ell)$ coincide with the minimum of $R_u(\ell)$ and with $\ell = \pm k$ caused by the $R_u(\ell \pm k)$ terms. Hence the candidate $\ell$ that produces the smallest secondary autocorrelation yields the estimate of the blur kernel length:
$$\hat{k} = \arg\min_{\ell \geq L} R_v(\ell), \tag{10}$$
where $\ell \in [L, \infty)$ is the candidate search range (because the first minimum is expected to live in $\ell \in [0, L)$). The autocorrelation function of (9) is essentially an indicator function for the double edges evidenced in Figure 3(c).

To estimate local object motion, the autocorrelation needs to be localized as well. The expectation operator in (8) can be approximated by a local weighted average
$$R_v(n, \ell) \approx \frac{\sum_{m \in \Lambda} a(n+m, \ell)\, v\!\left(n+m+\binom{0}{\ell/2}\right) v\!\left(n+m-\binom{0}{\ell/2}\right)}{\sum_{m \in \Lambda} a(n+m, \ell)}, \tag{11}$$
where $\Lambda$ defines the local neighborhood, $n$ is the center pixel location, and $a(n, \ell)$ denotes the averaging weight at location $n$. Drawing on the principles of bilateral filtering, the weights $a(n, \ell)$ promote averaging of $v\!\left(n+m+\binom{0}{\ell/2}\right) v\!\left(n+m-\binom{0}{\ell/2}\right)$ when $y\!\left(n+m+\binom{0}{\ell/2}\right)$ and $y\!\left(n+m-\binom{0}{\ell/2}\right)$ are similar to $y(n)$, and limit the contributions of DDWT coefficients unlikely to be associated with the object at $n$. Borrowing the idea of image simplification, the bilateral filtering of $R_v(n, \ell)$ can be repeated multiple times to yield a smoothed $R_v(n, \ell)$ that favors piecewise-constant object motion speeds over more complex ones.

For non-horizontal/vertical motion, we use image shearing to skew the input image $y$ by an angle $\phi \in [0, \pi]$ (compared to image rotation, shearing avoids interpolation error). Denoting by $R_v(n, \phi, \ell)$ the autocorrelation function of $v(n)$ in the sheared direction $\phi$, we detect the local blur angle $\theta$ and length $k$ by
$$(\hat\theta_n, \hat{k}_n) = \arg\min_{(\phi, \ell) \in [0, \pi] \times [L, \infty)} R_v(n, \phi, \ell). \tag{12}$$
Figure 3(d) shows the result of estimating the angle $\theta$ and the length $k$ of the blur at every pixel location, corresponding to the input image in Figure 3(a).
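The following is a minimal sketch of the global blur-length estimator in (10), assuming a horizontal Haar DDWT, a global blur, and a noise-free synthetic image; the sample autocorrelation replaces the expectation with an average over the whole image, and the search bounds L and ell_max are parameters chosen here for illustration.

```python
import numpy as np
from scipy.signal import convolve2d

def estimate_blur_length(y, L=3, ell_max=60):
    """Estimate a global horizontal blur length via the minimum of R_v(ell), as in (10)."""
    d = np.array([[-1.0, 1.0]])                             # horizontal Haar filter
    w = convolve2d(y, d, mode="same", boundary="symm")
    v = convolve2d(w, d, mode="same", boundary="symm")      # DDWT coefficients

    R = []
    for ell in range(L, ell_max + 1):
        # sample autocorrelation at horizontal lag ell (expectation -> image mean)
        R.append(np.mean(v[:, ell:] * v[:, :-ell]))
    return L + int(np.argmin(R))

# Example with a synthetic horizontally blurred image:
k_true = 15
x = np.random.rand(128, 128)
h = np.ones((1, k_true)) / k_true
y = convolve2d(x, h, mode="same", boundary="symm")
print(estimate_blur_length(y))   # expected to be close to k_true
```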

4.3. Object Motion Deblurring

Complementary to the detection of $q_i$ from $v_{ij}$ is the notion of recovering $u_j$ from $v_{ij}$. When the inverse DWT is applied to the recovered $u_j$, the reconstructed image is the latent sharp image $x$ (i.e. the deblurred image). In the discussion below, we assume that the motion blur angle $\theta$ and length $k$ are already known via (12). We continue to assume horizontal blur—the non-horizontal case follows from image shearing.

Recall the relations in (3) and (7). We first note that the noise $\eta_{ij}$ can be accounted for by applying one of many standard wavelet shrinkage operators to the DDWT coefficients $v_{ij}$ [14].

Figure 5. Result of object motion deblurring using real camera sensor data with (top row) global and (bottom row) spatially varying blurs; each row shows, from left to right, the input, the proposed method, Lucy [20], Chan et al. [5], and Shan et al. [22]. Since the methods in [5, 20, 22] cannot handle non-global blur, the top row allows for a fair comparison of reconstruction quality, while the bottom row shows a more realistic scenario. The bottom row was rendered with the average velocity of moving pixels for [5, 20, 22] and using Figure 3(d) for the proposed deblurring method.

Such a procedure effectively removes the noise $\eta_{ij}$ from $v_{ij}$ to yield a robust estimate $\hat{v} \approx \{q_i \ast u_j\}(n) = \frac{1}{k}\left[u\!\left(n + \binom{0}{k/2}\right) - u\!\left(n - \binom{0}{k/2}\right)\right]$ for low and moderate noise. Hence the main deblurring task is the estimation of $u(n)$ from $\hat{v}(n)$.

The key insight we exploit for deblurring is that the denoised DDWT coefficients $\hat{v}\!\left(n + \binom{0}{k/2}\right)$ and $\hat{v}\!\left(n - \binom{0}{k/2}\right)$ share the same DWT coefficient $u(n)$. But $u(n)$ in $\hat{v}\!\left(n + \binom{0}{k/2}\right)$ and $\hat{v}\!\left(n - \binom{0}{k/2}\right)$ may be contaminated by $u\!\left(n + \binom{0}{k}\right)$ and $u\!\left(n - \binom{0}{k}\right)$, respectively. It follows from the usual DWT arguments that the DWT coefficients $u$ of a natural image $x$ are indeed sparse, and thus it is a rare event that the contaminants $u\!\left(n + \binom{0}{k}\right)$ and $u\!\left(n - \binom{0}{k}\right)$ are both active at the same time. To this effect, we have the following result.

Claim 1 (Robust Regression). Let
$$v(n) = u\!\left(n + \binom{0}{k/2}\right) - u\!\left(n - \binom{0}{k/2}\right).$$
Suppose further that the probability density function of $u$ is symmetric (with zero mean), and $P[u(n) = 0] = \rho$ ($u$ is said to be "$\rho$-sparse"). Then
$$P\!\left[u\!\left(n + \binom{0}{k}\right) = 0 \,\middle|\, \left\|v\!\left(n + \binom{0}{k/2}\right)\right\| < \left\|v\!\left(n - \binom{0}{k/2}\right)\right\|\right] \geq \rho.$$
The proof is provided in the supplementary document. By the above claim, the following reconstruction scheme is "correct" with probability greater than $\rho$:
$$\hat{u}_j(n) = \begin{cases} k\,\hat{v}_{ij}\!\left(n + \binom{0}{k/2}\right) & \text{if } \left\|\hat{v}\!\left(n + \binom{0}{k/2}\right)\right\| < \left\|\hat{v}\!\left(n - \binom{0}{k/2}\right)\right\| \\ k\,\hat{v}_{ij}\!\left(n - \binom{0}{k/2}\right) & \text{otherwise.} \end{cases} \tag{13}$$
Since this deblurring scheme improves as $P[u(n) = 0] = \rho \approx 1$ (i.e. as $u$ becomes more sparse), the choice of the sparsifying transform $d_j$ is the determining factor for the effectiveness of the proposed DDWT-based blur processing.

We highlight a few notable features of the proposed deblurring scheme. First, the recovery of $u_j$ in (13) is simple, and works regardless of whether the blur kernel is global or spatially varying (simply replace $k$ with $k_n$). Second, the deblurring technique in (13) is a single-pass method. Contrast this with the overwhelming majority of existing deconvolution techniques, which require iteration [20, 5, 12]. Third, owing to the fact that no DDWT coefficient $\hat{v}(n)$ can influence the reconstruction of a DWT coefficient $u(n)$ that is more than $\binom{0}{k/2}$ pixels away, the proposed method is not prone to ringing artifacts. Finally, one can easily incorporate any wavelet-domain denoising scheme into the design of the deblurring algorithm. Reconstructions using real camera sensor data in Figure 5 show the superiority of the proposed DDWT approach.
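A minimal sketch of the selection rule in (13) for a known global horizontal blur length k, assuming the DDWT coefficients have already been denoised; the integer rounding of k/2 and the array-shift convention are our own choices for illustration.

```python
import numpy as np

def deblur_dwt_from_ddwt(v_hat, k):
    """Recover DWT coefficients u_j from denoised DDWT coefficients v_hat
    via the pick-the-less-aliased-copy rule of (13), horizontal blur of length k."""
    half = k // 2                             # integer shift standing in for k/2
    v_plus = np.roll(v_hat, -half, axis=1)    # v_hat(n + (0, k/2))
    v_minus = np.roll(v_hat, +half, axis=1)   # v_hat(n - (0, k/2))
    pick_plus = np.abs(v_plus) < np.abs(v_minus)
    return k * np.where(pick_plus, v_plus, v_minus)
```

The recovered $u_j$ for each subband would then be passed through the corresponding inverse (synthesis) DWT to form the deblurred image; for spatially varying blur, $k$ is replaced by the per-pixel estimate $k_n$ from (12).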

5. Optical Defocus Blur

In this section, we extend the DDWT analysis to optical defocus blur. The support of the defocus blur kernel takes the shape of the aperture opening, which is a circular disk in most typical cameras ($\mathrm{supp}\{h\} = \{n : \|n\| \leq r\}$, where $r$ is the radius of the disk). Though $h(n)$ may not be known exactly, consider the following approximation:
$$h(n) \approx \begin{cases} \frac{1}{\pi r^2} & \text{if } \|n\| \leq r \\ 0 & \text{otherwise.} \end{cases} \tag{14}$$

Letting $d_i$ denote a Haar wavelet transform $[-1, 1]$, the corresponding sparse blur kernel $q_i$ is (see Figure 6(a))
$$q_i(n) \approx \begin{cases} \frac{1}{\pi r^2} & \text{if } \|n\| = r \text{ and } n_2 > 0 \\ \frac{-1}{\pi r^2} & \text{if } \|n\| = r \text{ and } n_2 < 0 \\ 0 & \text{otherwise.} \end{cases} \tag{15}$$
As it turns out, the crude approximation in (14) is acceptable for DDWT-based blur processing—with the discontinuities at the disk boundary and the smoothness of $h(n)$ inside the disk, the sparse blur kernel $q_i$ is similar to (15).

Drawing parallels between (15) and (6), DDWT-based processing of optical defocus blur requires only minor modifications to (12) and (13). For detection, one would redefine the autocorrelation function to integrate over the circumference of the circle in (14):
$$R_v(n, s) = \frac{1}{s}\, E\!\int_{-\pi/2}^{\pi/2} v\!\left(n - s\binom{\cos\theta}{\sin\theta}\right) v\!\left(n + s\binom{\cos\theta}{\sin\theta}\right) d\theta.$$
The estimated defocus blur radius is given by
$$\hat{r}_n = \arg\min_{s \geq S} R_v(n, s),$$
where $s \in [S, \infty)$ is the candidate search range. The modified autocorrelation function is shown in Figure 4(c), where its second minimum corresponds to the detected blur radius. Figure 6(b) shows the estimate of the defocus blur radius $r_n$ at every location, corresponding to the input in Figure 1(a).

For deblurring, we recover the latent wavelet coefficient $u_j$ by comparing $\hat{v}_{ij}\!\left(n - r\binom{\cos\theta}{\sin\theta}\right)$ and $\hat{v}_{ij}\!\left(n + r\binom{\cos\theta}{\sin\theta}\right)$—it is an unlikely event that both are aliased. As such,
$$\hat{u}_j(n, \theta) = \pi r^2 \left[\beta(n, \theta)\, \hat{v}_{ij}\!\left(n - r\binom{\cos\theta}{\sin\theta}\right) - (1 - \beta(n, \theta))\, \hat{v}_{ij}\!\left(n + r\binom{\cos\theta}{\sin\theta}\right)\right],$$
$$\beta(n, \theta) = \frac{\left|\hat{v}_{ij}\!\left(n + r\binom{\cos\theta}{\sin\theta}\right)\right|}{\left|\hat{v}_{ij}\!\left(n - r\binom{\cos\theta}{\sin\theta}\right)\right| + \left|\hat{v}_{ij}\!\left(n + r\binom{\cos\theta}{\sin\theta}\right)\right|}$$
is a possible reconstruction of $u$ based on a pair of DDWT coefficients. Figure 7 shows the final deblurring result obtained by marginalizing out $\theta$:
$$\hat{u}_j(n) = \frac{1}{\pi r}\int_{-\pi/2}^{\pi/2} \hat{u}_j(n, \theta)\, d\theta. \tag{16}$$
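A sketch of the circular autocorrelation above for a global defocus blur, assuming a vertical Haar difference filter and replacing the expectation and the integral with averages over the image interior and a discrete set of angles; the search bounds and angle count are illustrative parameters, not values from the paper.

```python
import numpy as np
from scipy.ndimage import map_coordinates
from scipy.signal import convolve2d

def estimate_defocus_radius(y, S=2, s_max=20, n_angles=32):
    """Global defocus-radius estimate via the minimum of the circular autocorrelation."""
    d = np.array([[-1.0], [1.0]])               # vertical Haar difference filter
    w = convolve2d(y, d, mode="same", boundary="symm")
    v = convolve2d(w, d, mode="same", boundary="symm")   # DDWT coefficients

    H, W = v.shape
    rows, cols = np.mgrid[s_max:H - s_max, s_max:W - s_max]   # interior pixels only
    thetas = np.linspace(-np.pi / 2, np.pi / 2, n_angles, endpoint=False)

    R = []
    for s in range(S, s_max + 1):
        acc = 0.0
        for t in thetas:
            dr, dc = s * np.sin(t), s * np.cos(t)
            v_plus = map_coordinates(v, [rows + dr, cols + dc], order=1)
            v_minus = map_coordinates(v, [rows - dr, cols - dc], order=1)
            acc += np.mean(v_plus * v_minus)
        R.append(acc / (s * n_angles))           # 1/s factor and average over angles
    return S + int(np.argmin(R))
```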

6. Camera Shake Blur

In this section, we develop a non-blind deblurring algorithm for camera shake based on the DDWT framework. Camera shake differs from object motion and defocus blurs because it is difficult to parameterize the blur kernel. Although various works have shown that DWT sparsifies most "reasonable" functions, certain types of transforms seem to be well suited for modeling $q_i$ [9]. Define the reverse-ordered sparse filter $q'(n) = q(-n)$. If DWT successfully decorrelates the blur kernel $h$ and the image signal $x$, then the following approximation will hold: $E[q(n)\, q(m)] \approx 0, \;\forall n \neq m$. By this approximation, we have the following property:
$$E\big[\{q' \ast \hat{v}\}(n) \,\big|\, u\big] = E\!\left[\sum_{m, m'} q(m - n)\, q(m - m')\, u(m') \,\middle|\, u\right] = u(n)\, E\!\left[\sum_m q(m)^2\right].$$
Hence the following is an unbiased estimator for the latent DWT coefficient $u$:
$$\hat{u}(n) = \frac{\{q' \ast \hat{v}\}(n)}{\sum_m q(m)^2}. \tag{17}$$
The generalization of (17) to the spatially varying blur kernel case is straightforward (simply replace $q_i$ with $q_{i,n}$). On the other hand, if $h$ is a global blur kernel, then $y \mapsto v$, $v \mapsto \hat{u}$, and $\hat{u} \mapsto \hat{x}$ are completely linear, shift-invariant operations. When the wavelet denoising step $v \mapsto \hat{v}$ can be ignored, the entire deblurring procedure $y \mapsto v \mapsto \hat{u} \mapsto \hat{x}$ may be reduced to a single convolution operation, which significantly reduces the computational complexity:
$$\hat{x} = \left\{\sum_j e_j \ast \frac{q_i'}{\sum_m q_i(m)^2} \ast \{d_i \ast d_j\}\right\} \ast y, \tag{18}$$
where $e_j$ is the $j$th synthesis filter corresponding to $d_j$.
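A minimal sketch of the estimator in (17) for one subband pair with a known global kernel, assuming Haar analysis filters and omitting the wavelet denoising and synthesis steps; the box-shaped stand-in kernel below is purely illustrative, not a real camera-shake trajectory.

```python
import numpy as np
from scipy.signal import convolve2d

def estimate_u(v_hat, q):
    """Unbiased estimate of DWT coefficients u via (17): convolve the (denoised)
    DDWT coefficients with the reversed sparse kernel q' and normalize."""
    q_rev = q[::-1, ::-1]                        # q'(n) = q(-n)
    return convolve2d(v_hat, q_rev, mode="same") / np.sum(q ** 2)

# Example with a known global blur kernel h and Haar analysis filters:
h = np.ones((1, 7)) / 7.0                        # stand-in blur kernel
d_i = np.array([[-1.0, 1.0]])                    # analysis filter d_i (d_j = d_i here)
q_i = convolve2d(h, d_i)                         # sparse kernel q_i = d_i * h

y = np.random.rand(128, 128)                     # stand-in blurry observation
v = convolve2d(convolve2d(y, d_i, mode="same"), d_i, mode="same")   # DDWT
u_hat = estimate_u(v, q_i)                       # estimate of u_j = d_j * x
```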

One weakness of the proposed deblurring scheme is the lack of adaptivity to the image content. Unlike (13) or (16), the method in (17) makes no distinction between aliased and non-aliased coefficients $\hat{v}$. For example, suppose we apply (17) to the problem of object motion blur (instead of camera shake). In this case, the deblurring procedure would simply average $\hat{v}\!\left(n + \binom{0}{k/2}\right)$ and $\hat{v}\!\left(n - \binom{0}{k/2}\right)$ instead of choosing the one with the far lower risk of aliasing (à la (13)). The deblurred image is likely to suffer from minor ringing artifacts near edges due to the fact that aliasing is not resolved. Signal-adaptive reconstruction in the DDWT domain will be addressed in our future work—the main challenge is that the DWT coefficients $q_i$ of the blur kernel are more difficult to parameterize than the object motion and defocus ones. Nevertheless, (17) is a simple, non-iterative deblurring method yielding reasonable output image quality, and it clearly validates the overall DDWT blur processing framework.

7. Image Recognition

Blur interferes with recognition tasks, as feature extraction from blurry images is a real challenge. For example, the license plate shown in Figure 9(a) is blurred by the motion of the car.

Figure 6. (a) The DWT of the defocus blur kernel; the distance between the double edges is 9. (b) The defocus diameter estimate for Figure 1(a), with an average diameter of 7.

Figure 7. Result of optical defocus deblurring using real camera sensor data with (top row) global blur and (bottom row) varying depth; each row shows, from left to right, the input, the proposed method, Lucy [20], and Chan et al. [5]. Although the methods in [5, 20] cannot handle non-global blur, the top row is a fair comparison of reconstruction quality, while the bottom row shows a more realistic scenario. The bottom row was rendered with background blur for [5, 20] and using Figure 6(b) for the proposed deblurring method.

Figure 8. Result of camera shake deblurring using synthetic data; from left to right: input, proposed, Lucy [20], and Chan et al. [5].

As evidenced by Figure 9(b), edge detection fails to yield meaningful features because the image lacks sharp transitions. Character recognition on Figure 9(b) would also likely fail, not only due to degraded image quality but also because the letters are elongated in the horizontal direction by an arbitrary amount. One obvious way to cope with this is to deblur as a pre-processing step to the computer vision algorithms. The analysis in Section 3 suggests an alternative approach of extracting near-blur-invariant image features.

Specifically, consider the DWT and DDWT of the input image $y$, shown in Figures 9(c-d). Though the conventional interpretation of Figure 9(c) is that it is a wavelet decomposition of an image, one can equally understand it as a sparse filter applied to a sharp image, as follows:
$$w_j(n) := \{q_j \ast x\}(n) + \{d_j \ast \varepsilon\}(n).$$
If a Haar wavelet transform $[-1, 1]$ is used, the above reduces to a difference of the latent sharp image $x$:
$$w_j(n) := x\!\left(n + \binom{0}{k/2}\right) - x\!\left(n - \binom{0}{k/2}\right) + \{d_j \ast \varepsilon\}(n).$$
Hence the characteristics of the latent image $x$ are well preserved in the DWT coefficients $w_j$. Indeed, the characters in Figure 9(c) are more readable—their appearance is sharp with strong edges, and they have not been elongated in the direction of the motion. However, each character appears twice (two 7's, etc.), so character recognition on $w_j$ would require a post-processing step to prevent double counting of detected characters. The DDWT shown in Figure 9(d) essentially performs edge detection on Figure 9(c). Compared to the failed edge detection of Figure 9(b), this is clearly an improvement. Note also that we were never required to detect the blur kernel to produce Figures 9(c-d).

(a) Real camera sensor data y  (b) Edge detection on y  (c) DWT coefficients wj  (d) DDWT coefficients vij
Figure 9. Example of license plate identification using DDWT on real camera sensor data.

The above analysis raises the possibility of carrying out recognition tasks on blurry images without image deblurring. DWT and DDWT are near-blur-invariant representations of the latent image $x$, and computer vision algorithms can be trained to work with them directly.
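As a quick numerical check of the relation $w_j = q_j \ast x + d_j \ast \varepsilon$ (our own illustration, noise-free and with an assumed horizontal box blur): the DWT of the blurred image matches a scaled difference of two horizontally shifted copies of the latent image away from the boundary.

```python
import numpy as np
from scipy.signal import convolve2d

k = 9
x = np.random.rand(64, 64)              # stand-in latent sharp image
h = np.ones((1, k)) / k                 # horizontal motion blur of length k
y = convolve2d(x, h, mode="full")       # blurred observation (no noise)

d = np.array([[-1.0, 1.0]])             # Haar difference filter d_j
w = convolve2d(y, d, mode="full")       # DWT of the blurred image, w_j = d_j * y

# Away from the boundary, w_j equals a scaled difference of two shifted copies of x:
lhs = w[:, k:-k]                        # interior columns of w_j
rhs = (x[:, :-k] - x[:, k:]) / k        # (x shifted by k) difference, scaled by 1/k
print(np.allclose(lhs, rhs))            # True
```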

8. Conclusions

We proposed the double discrete wavelet transform—a novel analytical tool for blur processing. DDWT sparsifies the latent sharp image and the blur kernel simultaneously using DWT. Sparse representation is key to decoupling the blur and image signals, enabling blur kernel recovery and deblurring to occur in the wavelet domain. This framework also inspires a new generation of blur-tolerant recognition tasks aimed at exploiting the near-blur-invariant properties of DDWT coefficients. We validated the power of the DDWT framework via example applications and experiments using real camera sensor data, but further development in blur kernel detection, deblurring, and recognition tasks is possible. Potential applications of DDWT include object velocity and defocus blur estimation, which are useful for making inferences about object activities or depths.

Acknowledgments. We thank the authors of [3, 4, 5, 22, 24] for providing their code. This work was funded in part by Texas Instruments and the University of Dayton Graduate School Summer Fellowship program.

References

[1] L. Bar, B. Berkels, M. Rumpf, and G. Sapiro. A variational framework for simultaneous motion estimation and restoration of motion-blurred video. In ICCV, 2007.
[2] M. Ben-Ezra and S. Nayar. Motion deblurring using hybrid imaging. In CVPR, volume 1, pages I-657, 2003.
[3] J. Cai, H. Ji, and Z. Shen. Blind motion deblurring from a single image using sparse approximation. In CVPR, 2009.
[4] A. Chakrabarti, T. Zickler, and W. T. Freeman. Analyzing spatially-varying blur. In CVPR, 2010.
[5] S. Chan, R. Khoshabeh, K. Gibson, P. Gill, and T. Nguyen. An augmented Lagrangian method for total variation video restoration. IEEE Transactions on Image Processing, 20(11):3097-3111, 2011.
[6] S. Chan and T. Nguyen. Single image spatially-variant out-of-focus blur removal. In ICIP, 2011.
[7] J. Chen, L. Yuan, C. Tang, and L. Quan. Robust dual motion deblurring. In CVPR, pages 1-8, 2008.
[8] T. Cho, A. Levin, F. Durand, and W. Freeman. Motion blur removal with orthogonal parabolic exposures. In ICCP, pages 1-8, 2010.
[9] T. Cho, S. Paris, B. Horn, and W. Freeman. Blur kernel estimation using the Radon transform. In CVPR, pages 241-248, 2011.
[10] S. Dai and Y. Wu. Motion from blur. In CVPR, pages 1-8, 2008.
[11] D. Donoho and M. Raimondo. A fast wavelet algorithm for image deblurring. ANZIAM Journal, 46:C29-C46, 2005.
[12] R. Fergus, B. Singh, A. Hertzmann, S. Roweis, and W. Freeman. Removing camera shake from a single photograph. ACM SIGGRAPH, 2006.
[13] K. Hirakawa, X. Meng, and P. Wolfe. A framework for wavelet-based analysis and processing of color filter array images with applications to denoising and demosaicing. In ICASSP, volume 1, pages I-597, 2007.
[14] K. Hirakawa and P. Wolfe. Skellam shrinkage: Wavelet-based intensity estimation for inhomogeneous Poisson data. IEEE Transactions on Information Theory, 58(2):1080-1093, 2012.
[15] M. Hirsch, C. J. Schuler, S. Harmeling, and B. Scholkopf. Fast removal of non-uniform camera shake. In ICCV, pages 463-470, 2011.
[16] J. Jia. Single image motion deblurring using transparency. In CVPR, pages 1-8, 2007.
[17] A. Levin. Blind motion deblurring using image statistics. Advances in Neural Information Processing Systems, 19:841, 2007.
[18] A. Levin, P. Sand, T. Cho, F. Durand, and W. Freeman. Motion-invariant photography. ACM Transactions on Graphics (TOG), volume 27, page 71, 2008.
[19] A. Levin, Y. Weiss, F. Durand, and W. Freeman. Understanding and evaluating blind deconvolution algorithms. In CVPR, pages 1964-1971, 2009.
[20] L. Lucy. Bayesian-based iterative method of image restoration. Journal of Astronomy, 79:745-754, 1974.
[21] S. Nayar and M. Ben-Ezra. Motion-based motion deblurring. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(6):689-698, 2004.
[22] Q. Shan, J. Jia, and A. Agarwala. High-quality motion deblurring from a single image. ACM Transactions on Graphics (SIGGRAPH), 2008.
[23] M. Subbarao and G. Surya. Depth from defocus: a spatial domain approach. International Journal of Computer Vision, 13(3):271-294, 1994.
[24] O. Whyte, J. Sivic, A. Zisserman, and J. Ponce. Non-uniform deblurring for shaken images. In CVPR, pages 491-498, 2010.
