
IEEE ICASSP 2011: 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, May 22-27, 2011

QUADRATIC OPTIMIZATION BASED SMALL SCALE DETAILS EXTRACTION

Zhengguo Li, Jinghong Zheng, Chuohao Yeo and Susanto Rahardja

Signal Processing Department, Institute for Infocomm Research, 1 Fusionopolis Way, Singapore

ABSTRACT

In many image processing problems, it is required to extract small scale details from an image or a set of images. In this paper, we introduce a new framework for extracting small scale details from a single input image or a set of input images. We then show how to apply the framework to address several important problems in the field of image processing, including tone mapping of high dynamic range images, de-noising of a non-flash image with a pair of non-flash and flash images, as well as details enhancement via multi-light images and a single input image. Experimental results show that the proposed framework outperforms existing methods.

1. INTRODUCTION

In the field of image processing, many applications, such as de-noising of images [1], tone mapping of high dynamic range (HDR) images [2], and detail enhancement via multi-light images [3], require the extraction of small scale details from an image or a set of images. The small scale details can be either noise, e.g., a random pattern with zero mean, or texture, such as a repeated pattern with regular structure. When there is only one input image, it is straightforward to extract the desired details by using the total variation based method in [4], the half quadratic optimization based method in [1], or the quadratic optimization based method in [2]. However, when there are multiple input images, using these methods directly would be complex because each input image needs to be decomposed individually. Given the increasing popularity of multi-shot imaging and the immediate response from Canon, Fuji, Nikon, Olympus and Ricoh with their latest digital cameras offering multi-shot capability, it is desirable to provide a better solution that extracts the optimally captured details from a number of individual photographs and puts them together to form a better image.

In this paper, we introduce a method to extract small scale details from a single input image or a set of input images. Our proposed method is based on the following two observations: 1) the details of an image are usually captured by its luminance variations, i.e., the gradient field of its luminance; and 2) it is easy to generate a gradient field for a given image, such as the gradient field of its luminance, but more complex to calculate an image for a given guidance field via solving the Poisson equation [5, 8]. In the proposed method, a guidance field is first constructed to include all desired details by using the luminance variations of the input image(s). A new quadratic optimization problem, which we refer to as the "small scale details extraction problem", is then formulated for the extraction of small scale details from the guidance field. Its cost function comprises two terms. One is on the fidelity of the gradient field of small scale details to the guidance field. The other is on the energy of the desired small scale details. A regularization factor is adopted to obtain a tradeoff between them. Finally, the

desired details are obtained by using an iterative method to solve the new optimization problem.

There are two key procedures in the proposed method. One is to build up a guidance field, and the other is to select a function for measuring the fidelity of the gradient field of small scale details to the guidance field. To illustrate these two procedures, three example applications are used: de-noising of a non-flash image with a pair of non-flash and flash images, tone mapping of HDR images, and details enhancement via multi-light images and a single input image. Since there are multiple input images in the third problem, the computational cost would be high if the methods in [1, 2, 4] were used. The proposed scheme can be regarded as a unified framework for the extraction of small scale details because it can address different types of inputs and applications. Furthermore, the proposed framework only involves a quadratic optimization problem that can be easily solved via an iterative method. Therefore, it could be a powerful and flexible tool for use in the field of image processing.

The rest of this paper is organized as follows. The proposed small scale details extraction problem is formulated in Section 2. Section 3 discusses its application to the de-noising of a non-flash image with a pair of non-flash and flash images, the tone mapping of high dynamic range (HDR) images, as well as details enhancement via multi-light images and a single input image. Finally, concluding remarks are provided in Section 4.

2. SMALL SCALE DETAILS EXTRACTION PROBLEM

In many applications, an input image, $L(x, y)$, is decomposed into two parts, $L_b(x, y)$ and $L_d(x, y)$, as [1, 2, 4]

$$L(x, y) = L_b(x, y) + L_d(x, y), \qquad (1)$$

where $L_b(x, y)$ is an image formed by homogeneous regions with sharp edges and $L_d(x, y)$ is noise or texture. Popular methods include half quadratic regularization [1], quadratic regularization [2], total variation minimization [4], and so on. The input and output of these methods are $L(x, y)$ and $L_b(x, y)$, respectively. In this section, we introduce a new approach. In the proposed approach, a guidance field $\vec{v}(x, y) = (\vec{v}_1(x, y), \vec{v}_2(x, y))$ is first generated by using the luminance variations of the input image(s), as in [5, 11]. The desired small scale details $L_d(x, y)$ are then extracted from the guidance field $\vec{v}(x, y)$ by solving a quadratic optimization problem.

When the desired small scale details are included in only one input image, $\vec{v}(x, y)$ can be selected as the gradient field of the input image. Users can also adjust the guidance field so that the quality of the final image is improved. When the small scale details are included in multiple input images, $\vec{v}(x, y)$ can be generated from the gradient fields of all input images, and it might no longer be a gradient field. The new details extraction problem is

978-1-4577-0539-7/11/$26.00 ©2011 IEEE


formulated as follows:

$$\min_{L_d(x,y)} \left\{ \sum_{(x,y)} \left[ \Phi\left(\nabla L_d(x, y), \vec{v}(x, y), \vec{u}(x, y)\right) + \frac{1}{\lambda} L_d^2(x, y) \right] \right\}, \qquad (2)$$

where the optimization variable $L_d(x, y)$ represents the desired details. $\vec{u}(x, y)$ is an auxiliary vector field, and it is usually the same as $\vec{v}(x, y)$ except in certain special cases. For example, consider the de-noising of a non-flash image $L(x, y)$ when there is a flash image $F(x, y)$ [9]; there, $\vec{v}(x, y)$ and $\vec{u}(x, y)$ are the gradient fields of $L(x, y)$ and $F(x, y)$, respectively. In the following, $\vec{u}(x, y)$ is the same as $\vec{v}(x, y)$ unless otherwise specified. The first term is a quadratic function of $\left(\frac{\partial L_d(x,y)}{\partial x} - \vec{v}_1(x, y)\right)$ and $\left(\frac{\partial L_d(x,y)}{\partial y} - \vec{v}_2(x, y)\right)$. Its major function is to measure the fidelity of $\nabla L_d(x, y)$ with respect to $\vec{v}(x, y)$. The second term is on the energy of the details signal, requiring the details signal to have a small energy. A regularization factor $\lambda$ is adopted to obtain a tradeoff between these two terms.

The problem (2) is a new optimization problem for image processing. Its inputs are a guidance field $\vec{v}(x, y)$ and an auxiliary vector field $\vec{u}(x, y)$ rather than an image $L(x, y)$, and its output is a details image $L_d(x, y)$. It is well known that it is very easy to compute a guidance field for a given image but more complex to compute an image for a given guidance field via solving the Poisson equation [5, 8]. The proposed method is a unified framework for handling both single and multiple inputs, and can address a wide variety of applications.
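Since problem (2) is quadratic in $L_d(x, y)$, setting its gradient to zero yields a sparse linear system. The sketch below, which is our own illustration and not the authors' implementation (the paper leaves the iterative solver unspecified), assembles that system with forward-difference operators and solves it directly with SciPy. The per-pixel weights $1/\psi(\vec{u}_i)$ anticipate the fidelity function of Equation (5); all function and variable names are ours.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def diff_ops(h, w):
    """Sparse forward-difference operators along x (columns) and y (rows),
    with zero derivative at the far image border (Neumann boundary)."""
    n = h * w
    dx_main = -np.ones(n); dx_main[np.arange(w - 1, n, w)] = 0.0
    dx_off = np.ones(n - 1); dx_off[np.arange(w - 1, n - 1, w)] = 0.0
    Dx = sp.diags([dx_main, dx_off], [0, 1], shape=(n, n), format="csr")
    dy_main = -np.ones(n); dy_main[(h - 1) * w:] = 0.0
    Dy = sp.diags([dy_main, np.ones(n - w)], [0, w], shape=(n, n), format="csr")
    return Dx, Dy

def extract_details(v1, v2, u1, u2, lam, gamma=1.2):
    """Minimize the quadratic cost of problem (2) with the weighted fidelity
    of Eq. (5):  sum (Dx Ld - v1)^2/psi(u1) + (Dy Ld - v2)^2/psi(u2) + Ld^2/lam.
    The zero-gradient condition gives the sparse normal equations M Ld = b."""
    h, w = v1.shape
    Dx, Dy = diff_ops(h, w)
    a1 = 1.0 / (np.abs(u1).ravel() ** gamma + 1e-4)   # 1/psi, cf. Eq. (6)
    a2 = 1.0 / (np.abs(u2).ravel() ** gamma + 1e-4)
    M = (Dx.T @ sp.diags(a1) @ Dx + Dy.T @ sp.diags(a2) @ Dy
         + sp.eye(h * w) / lam)
    b = Dx.T @ (a1 * v1.ravel()) + Dy.T @ (a2 * v2.ravel())
    return spsolve(M.tocsc(), b).reshape(h, w)
```

For large images a direct solve becomes expensive; an iterative method on the same system, e.g. `scipy.sparse.linalg.cg`, matches the iterative flavor described in the paper.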

There are two key procedures in the proposed framework. One is the generation of the guidance and auxiliary fields, and the other is the selection of the fidelity function. They will be further investigated for three different problems in the field of image processing in the following section.

3. APPLICATIONS OF THE PROPOSED METHOD

3.1. De-noising of a Non-flash Image with a Pair of Non-flash and Flash Images

It is very challenging to capture a high quality image under low light environments. An interesting method was proposed by Petschnigg et al. [9] that uses a pair of images: one taken with flash to capture detail and the other taken without flash to capture ambient illumination. The flash image exhibits a better SNR than the non-flash one, and it provides a better estimate of the high frequency detail. De-noising of the non-flash image is a key component of the method in [9]. In this subsection, we show how to apply our proposed framework to address this problem.

Suppose that the non-flash and flash images are represented by $L(x, y)$ and $F(x, y)$, respectively. $\vec{v}(x, y)$ and $\vec{u}(x, y)$ are the gradient fields of $L(x, y)$ and $F(x, y)$, respectively, i.e.,

$$\vec{v}_1(x, y) = \frac{\partial L(x, y)}{\partial x}; \quad \vec{v}_2(x, y) = \frac{\partial L(x, y)}{\partial y}, \qquad (3)$$

$$\vec{u}_1(x, y) = \frac{\partial F(x, y)}{\partial x}; \quad \vec{u}_2(x, y) = \frac{\partial F(x, y)}{\partial y}. \qquad (4)$$

The function $\Phi(\nabla L_d(x, y), \vec{v}(x, y), \vec{u}(x, y))$ is given by

$$\Phi = \frac{\left(\frac{\partial L_d(x,y)}{\partial x} - \vec{v}_1(x, y)\right)^2}{\psi(\vec{u}_1(x, y))} + \frac{\left(\frac{\partial L_d(x,y)}{\partial y} - \vec{v}_2(x, y)\right)^2}{\psi(\vec{u}_2(x, y))}, \qquad (5)$$

where the function $\psi(z)$ is

$$\psi(z) = |z|^{\gamma} + 0.0001, \qquad (6)$$

and the values of $\gamma$ and $\lambda$ are 1.2 and 0.25, respectively. The proposed method is compared with the half quadratic regularization based method in [1] by testing the pair of non-flash and flash images in Figs. 1(a) and 1(b). As shown in the red box of Fig. 1(c), the half quadratic based method tends to leave the region either over-blurred or under-blurred, while the proposed method overcomes this problem, as shown in Fig. 1(d).
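As a concrete illustration, the fields of Equations (3)-(6) can be assembled in a few lines of NumPy. This is a sketch with stand-in random arrays in place of real captures; the helper names `grad` and `psi` are ours.

```python
import numpy as np

def grad(img):
    """Forward-difference gradient field, zero at the far border (Eqs. (3)-(4))."""
    gx = np.zeros_like(img); gx[:, :-1] = img[:, 1:] - img[:, :-1]
    gy = np.zeros_like(img); gy[:-1, :] = img[1:, :] - img[:-1, :]
    return gx, gy

def psi(z, gamma=1.2):
    """Edge-stopping function of Eq. (6)."""
    return np.abs(z) ** gamma + 0.0001

# Stand-in luminance images; in practice L is the noisy non-flash capture
# and F is the flash capture of the same scene.
rng = np.random.default_rng(1)
L = rng.random((8, 8)); F = rng.random((8, 8))
v1, v2 = grad(L)                       # guidance field, Eq. (3)
u1, u2 = grad(F)                       # auxiliary field, Eq. (4)
w1, w2 = 1.0 / psi(u1), 1.0 / psi(u2)  # per-pixel fidelity weights in Eq. (5)
```

Note the effect of the weights: at a strong flash edge $\psi(\vec{u}_i)$ is large, so the fidelity weight is small and the energy term pushes $L_d$ toward zero there, keeping real edges in the base layer $L_b = L - L_d$ rather than in the extracted noise.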

Fig. 1. Comparison of different de-noising schemes. (a) The flash image. (b) The non-flash image. (c) By using the half quadratic regularization based method. (d) By using the proposed method.

3.2. Tone Mapping of High Dynamic Range Images

It is shown in [2, 7, 10] that a key component of tone mapping of an HDR image is to decompose the luminance of the HDR image into a base layer and a details layer, as in [2]. The proposed framework can also be applied to address this problem. The guidance field $\vec{v}(x, y)$ is chosen as the gradient field of $L(x, y)$. The fidelity function is selected as in Equation (5), and the function $\psi(z)$ is given as in Equation (6), with the values of $\gamma$ and $\lambda$ as 1.2 and 2, respectively.

The proposed tone mapping method is compared with the methods in [6, 7, 8] using their default settings. The results of Fattal's and Durand's methods are generated by using the open source software qtpfsgui [12], and Reinhard's method by HDRShop [13]. We test a commonly used HDR radiance map, Cadik desk01, and the corresponding results are illustrated in Fig. 2. Local contrasts are preserved very well by Fattal's scheme. However, the lamp in the image "Cadik desk" looks darker than the wall, which differs from the original radiance map. In other words, the relative brightness could have been changed by Fattal's scheme. As a result, the generated image



does not look very natural even though it is visually satisfying. The overall appearance of the images generated by Reinhard's and Durand's schemes is more natural, but the details of the lamp are not preserved very well. On the other hand, it can be observed from Fig. 2 that local details are preserved very well by the proposed tone mapping, and the generated image also looks more natural.

Fig. 2. Comparison of different schemes. (a) Fattal's method. (b) Durand's method. (c) Reinhard's method. (d) Our method. HDR image resource: Martin Cadik.

3.3. Details Enhancement via Multi-light Images

A set of multi-light images that capture the same scene under different lighting positions and lighting intensities can collectively provide a much more detailed description of the scene than a single image. The information lost in one image can be complemented by information from the other images. This feature of multi-light images can be applied to enhance the shape and surface details of a scene [3].

Suppose that $L_j(x, y)$ $(1 \le j \le N)$ are $N$ input images. The guidance field $\vec{v}(x, y)$ is built up to include the desired details of all these $N$ input images. As the contents of an image are usually indicated by its intensity variation, the maximal gradient at each location over the input images is usually used for the construction of the guidance field. However, shading information from different images could also be captured in the guidance field, because a shadow often has large intensity variations at its boundary. As such, shadow regions will cause artifacts in the final image. To remove the intensity variation caused by shadow edges from the guidance field, a shadow detection approach can be adopted to divide an image into shadow and non-shadow areas, so that the gradients near shadow edges are not included in the guidance field [11]. However, it can be complex to detect shadows in each input image. Here, a simpler method is proposed to determine the guidance field $\vec{v}(x, y)$. Our new method is based on the following two observations on two images $L_j(x, y)$ and $L_{j'}(x, y)$: 1) the value of $\left|\frac{\partial L_j(x,y)}{\partial x}\right|$ (or $\left|\frac{\partial L_j(x,y)}{\partial y}\right|$) is much larger than that of $\left|\frac{\partial L_{j'}(x,y)}{\partial x}\right|$ (or $\left|\frac{\partial L_{j'}(x,y)}{\partial y}\right|$) when one of the pixels related to $\frac{\partial L_j(x,y)}{\partial x}$ (or $\frac{\partial L_j(x,y)}{\partial y}$) is at a shadow edge and neither of the pixels related to $\frac{\partial L_{j'}(x,y)}{\partial x}$ (or $\frac{\partial L_{j'}(x,y)}{\partial y}$) is at a shadow edge; and 2) the gap between $\left|\frac{\partial L_j(x,y)}{\partial x}\right|$ (or $\left|\frac{\partial L_j(x,y)}{\partial y}\right|$) and $\left|\frac{\partial L_{j'}(x,y)}{\partial x}\right|$ (or $\left|\frac{\partial L_{j'}(x,y)}{\partial y}\right|$) is usually not large in other cases. The value of $\vec{v}_1(x, y)$ is

$$\vec{v}_1(x, y) = \begin{cases} g_1(j_{1,2}(x, y), x, y), & \text{if } \left|\frac{\partial L_{j_{1,1}}(x,y)}{\partial x}\right| > \zeta \left|\frac{\partial L_{j_{1,2}}(x,y)}{\partial x}\right| \\ g_1(j_{1,1}(x, y), x, y), & \text{otherwise} \end{cases}$$

and the value of $\vec{v}_2(x, y)$ is

$$\vec{v}_2(x, y) = \begin{cases} g_2(j_{2,2}(x, y), x, y), & \text{if } \left|\frac{\partial L_{j_{2,1}}(x,y)}{\partial y}\right| > \zeta \left|\frac{\partial L_{j_{2,2}}(x,y)}{\partial y}\right| \\ g_2(j_{2,1}(x, y), x, y), & \text{otherwise} \end{cases}$$

where $\zeta$ is a constant with default value 3, and $j_{i,1}(x, y)$ and $j_{i,2}(x, y)$ $(i = 1, 2)$ denote the indices of the two images with the largest and the second largest gradient at position $(x, y)$ along the $x$ and $y$ directions, respectively. The functions $g_1(j, x, y)$ and $g_2(j, x, y)$ are defined as

$$g_1(j, x, y) = w(L_j(x, y))\, w(L_j(x + 1, y))\, \frac{\partial L_j(x, y)}{\partial x}$$

$$g_2(j, x, y) = w(L_j(x, y))\, w(L_j(x, y + 1))\, \frac{\partial L_j(x, y)}{\partial y}$$

and the weighting function $w(z)$ is

$$w(z) = \begin{cases} z/30, & \text{if } z \le 30 \\ (255 - z)/30, & \text{if } z > 225 \\ 1, & \text{if } 30 < z \le 225 \end{cases} \qquad (7)$$

The above equation shows that $L_j(x, y)$ has a large weighting factor when it is reliable and a small weighting factor when it is not. The fidelity function is computed as in Equation (5), and the function $\psi(z)$ is given in Equation (6), with the values of $\gamma$ and $\lambda$ as 1.2 and 0.5, respectively. The optimization problem (2) is again a quadratic optimization problem, and it can also be easily solved by using an iterative method.
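A minimal sketch of this guidance-field construction is given below. It assumes $N \ge 2$ input images on a 0-255 scale and that the per-pixel ranking uses the magnitudes of the weighted gradients $g_i(j, x, y)$ (the paper does not pin down whether the ranking uses raw or weighted gradients); the function names are ours.

```python
import numpy as np

def w(z):
    """Reliability weight of Eq. (7): fades out very dark (<= 30) and
    near-saturated (> 225) pixel values on a 0-255 scale."""
    out = np.ones_like(z, dtype=float)
    out = np.where(z <= 30, z / 30.0, out)
    out = np.where(z > 225, (255.0 - z) / 30.0, out)
    return out

def guidance_field(images, zeta=3.0):
    """Per pixel and per direction, take the weighted gradient of the image
    with the largest gradient magnitude, falling back to the second largest
    whenever the largest exceeds zeta times the second (a likely shadow edge).
    images: list of N >= 2 two-dimensional arrays."""
    gx, gy = [], []
    for img in images:
        img = img.astype(float)
        dx = np.zeros_like(img); dx[:, :-1] = img[:, 1:] - img[:, :-1]
        dy = np.zeros_like(img); dy[:-1, :] = img[1:, :] - img[:-1, :]
        wl = w(img)
        gx.append(wl * np.roll(wl, -1, axis=1) * dx)  # g1(j, x, y)
        gy.append(wl * np.roll(wl, -1, axis=0) * dy)  # g2(j, x, y)
    gx, gy = np.stack(gx), np.stack(gy)               # shape (N, h, w)

    def pick(g):
        order = np.argsort(np.abs(g), axis=0)         # ascending by magnitude
        g1 = np.take_along_axis(g, order[-1][None], axis=0)[0]  # largest
        g2 = np.take_along_axis(g, order[-2][None], axis=0)[0]  # second largest
        return np.where(np.abs(g1) > zeta * np.abs(g2), g2, g1)

    return pick(gx), pick(gy)
```

The fallback to the second largest gradient is what suppresses shadow edges: a gradient that dwarfs its runner-up across the light set is, by observation 1), likely a shadow boundary rather than scene detail.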

We first compare the case in which there is only one input image with the case in which there are multi-light input images. The red boxes of Figs. 3(a) and 3(b) show that more details are included in the final output image, especially in the shadow areas, when multi-light input images are used. We next compare the proposed guidance field with the maximum gradient field. Clearly, there are artifacts due to shadow edges in the final image obtained with the maximum gradient field, as in Fig. 3(c), while these are removed by using the proposed guidance field. Finally, we compare the proposed details enhancement method with the scheme presented in [3]. In [3], the base layer is calculated as a weighted sum of the input images. Even though users can control the strength and location of shadows by adjusting the weight factors of the input images, the shadows of the final images are still somewhat messy, as demonstrated in Fig. 3(d), while the proposed details enhancement method provides natural shading information in the final output images.

3.4. Details Enhancement via a Single Input Image

In this subsection, we present one more advantage of the proposed method, i.e., a user can adjust a guidance field to improve the quality of the final image.

Consider the detail enhancement via a single input image [2],



Fig. 3. Comparison of different schemes. (a) The proposed details extraction scheme with only one input image. (b) The proposed details extraction scheme with the proposed guidance field. (c) The proposed details extraction scheme with the maximum gradient field. (d) The details enhancement scheme in [3].

$L_1(x, y)$. The guidance field $\vec{v}(x, y)$ is computed as

$$\vec{v}_i(x, y) = g_i(1, x, y); \quad i = 1, 2. \qquad (8)$$

The weights $w(L_1(x, y))\,w(L_1(x + 1, y))$ and $w(L_1(x, y))\,w(L_1(x, y + 1))$ are applied to $\vec{v}_1(x, y)$ and $\vec{v}_2(x, y)$, respectively, so that less noise is included in the guidance field $\vec{v}(x, y)$. This shows that, with the proposed method, the guidance field can be adjusted according to a user's requirements to improve the quality of the final image. The fidelity function is computed as in Equation (5), and the function $\psi(z)$ is given in Equation (6), with the values of $\gamma$ and $\lambda$ as 4.0 and 1.5, respectively.
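Equation (8) amounts to the weighted gradient field of the single input image. A self-contained sketch, repeating the weighting of Equation (7) with our own function names:

```python
import numpy as np

def w(z):
    """Reliability weight of Eq. (7) on a 0-255 scale."""
    out = np.ones_like(z, dtype=float)
    out = np.where(z <= 30, z / 30.0, out)
    out = np.where(z > 225, (255.0 - z) / 30.0, out)
    return out

def single_image_guidance(L1):
    """Eq. (8): the gradient field of L1 damped by the reliability weights,
    so that gradients touching very dark or near-saturated pixels
    contribute less noise to the guidance field."""
    L1 = L1.astype(float)
    dx = np.zeros_like(L1); dx[:, :-1] = L1[:, 1:] - L1[:, :-1]
    dy = np.zeros_like(L1); dy[:-1, :] = L1[1:, :] - L1[:-1, :]
    wl = w(L1)
    v1 = wl * np.roll(wl, -1, axis=1) * dx   # w(L1(x,y)) w(L1(x+1,y)) dL1/dx
    v2 = wl * np.roll(wl, -1, axis=0) * dy   # w(L1(x,y)) w(L1(x,y+1)) dL1/dy
    return v1, v2
```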

The proposed method is compared with the method in [2]; the input image is shown in Fig. 4(a). It is demonstrated in Fig. 4(b) that the image becomes sharper with the proposed guidance field in Equation (8), while there are artifacts in a shadow region in Fig. 4(c), obtained by the method in [2]. Therefore, compared with the method in [2], the proposed method provides a user with the flexibility to adjust the guidance field such that the quality of the final image is improved.

Fig. 4. Comparison of different guidance fields. (a) The input image. (b) By using the proposed method. (c) By using the method in [2].

4. CONCLUSION

In this paper, a unified framework based on a new optimization problem is introduced for the extraction of small scale details from a guidance field. We give examples of how the proposed framework can be applied to address several important problems in the field of image processing, including the de-noising of a non-flash image with a pair of non-flash and flash images, tone mapping of high dynamic range (HDR) images, as well as details enhancement via multi-light images and a single input image. In our future research, we will apply the framework to address other challenging problems in the field of image processing, such as the extraction of desired details from a set of low dynamic range images with different exposures.

References

[1] P. Charbonnier, L. Blanc-Feraud, G. Aubert, and M. Barlaud, "Deterministic edge-preserving regularization in computed imaging," IEEE Transactions on Image Processing, vol. 6, no. 2, pp. 298-311, Feb. 1997.

[2] Z. Farbman, R. Fattal, D. Lischinski, and R. Szeliski, "Edge-preserving decompositions for multi-scale tone and detail manipulation," ACM Transactions on Graphics (Proc. SIGGRAPH), vol. 27, no. 3, pp. 67:1-67:10, Aug. 2008.

[3] R. Fattal, M. Agrawala, and S. Rusinkiewicz, "Multiscale shape and detail enhancement for multi-light image collections," ACM Transactions on Graphics (Proc. SIGGRAPH), vol. 26, no. 3, pp. 51:1-51:10, Aug. 2007.

[4] L. Rudin, S. Osher, and E. Fatemi, "Nonlinear total variation based noise removal algorithms," Physica D, vol. 60, pp. 259-268, 1992.

[5] P. Perez, M. Gangnet, and A. Blake, "Poisson image editing," ACM Transactions on Graphics (Proc. SIGGRAPH), vol. 22, no. 3, pp. 313-318, Aug. 2003.

[6] E. Reinhard, M. Stark, P. Shirley, and J. Ferwerda, "Photographic tone reproduction for digital images," ACM Transactions on Graphics, vol. 21, no. 3, pp. 267-276, Aug. 2002.

[7] F. Durand and J. Dorsey, "Fast bilateral filtering for the display of high-dynamic-range images," ACM Transactions on Graphics, vol. 21, no. 3, pp. 257-266, Aug. 2002.

[8] R. Fattal, D. Lischinski, and M. Werman, "Gradient domain high dynamic range compression," ACM Transactions on Graphics, vol. 21, no. 3, pp. 249-256, Aug. 2002.

[9] G. Petschnigg, M. Agrawala, and H. Hoppe, "Digital photography with flash and no-flash image pairs," ACM Transactions on Graphics (Proc. SIGGRAPH), vol. 23, no. 3, pp. 664-672, Aug. 2004.

[10] Z. G. Li, S. Rahardja, S. S. Yao, J. H. Zheng, and W. Yao, "High dynamic range compression by half quadratic regularization," in 2009 IEEE International Conference on Image Processing, Cairo, Egypt, Nov. 2009, pp. 3169-3172.

[11] J. H. Zheng, Z. G. Li, S. Rahardja, S. S. Yao, and W. Yao, "Collaborative image processing algorithm for details refinement and enhancement via multi-light images," in IEEE ICASSP 2010, Mar. 2010, pp. 1382-1385.

[12] http://qtpfsgui.sourceforge.net/, Dec. 2007.

[13] http://www.hdrshop.com/, Oct. 2007.
