

Surface Normal Deconvolution: Photometric Stereo for Optically Thick Translucent Objects

Chika Inoshita¹, Yasuhiro Mukaigawa², Yasuyuki Matsushita³, and Yasushi Yagi¹

¹Osaka University  ²Nara Institute of Science and Technology  ³Microsoft Research Asia

Abstract. This paper presents a photometric stereo method that works for optically thick translucent objects exhibiting subsurface scattering. Our method is built upon previous studies showing that subsurface scattering can be approximated as convolution with a blurring kernel. We extend this observation and show that the original surface normal convolved with the scattering kernel corresponds to the blurred surface normal obtained by a conventional photometric stereo technique. Based on this observation, we cast the photometric stereo problem for optically thick translucent objects as a deconvolution problem and develop a method to recover accurate surface normals. Experimental results on both synthetic and real-world scenes show the effectiveness of the proposed method.

1 Introduction

Photometric stereo estimates the surface normals of a scene from multiple shading images taken under different lighting conditions [1]. While conventional methods were developed for simple Lambertian diffuse surfaces [2], recent generalizations can handle more complex reflections in real-world scenes [3, 4]. However, surface normal estimation remains difficult for translucent materials, where subsurface scattering is significant [5].

In a translucent object, incident light travels randomly and exits from various neighboring locations. This global light transport effect makes it hard to directly associate the shading observations with the surface geometry. As a result, shape-from-intensity techniques that assume only local illumination models naturally suffer from the unmodeled error of subsurface scattering. One direction for addressing this issue is to remove the subsurface scattering component from the observations, so that conventional shape-from-intensity techniques can be applied to the remaining direct lighting component. Recently, Nayar et al. [6] demonstrated an approach to effectively remove subsurface scattering from scene observations, at the expense of additional measurements under high-frequency illuminations. In general, removing the subsurface scattering component requires an additional preprocessing stage, and a shape-from-intensity method that directly accounts for subsurface scattering is desirable.


While exact modeling of subsurface scattering remains difficult and requires complicated models, prior studies in computer graphics show that the image formation of subsurface scattering is well approximated as a convolution of a scattering kernel with the surface radiance for optically thick materials, which diffuse light regardless of the incident direction [7]. We use this approximation to develop surface normal deconvolution, which recovers the original surface normal from the blurry surface normal obtained by conventional photometric stereo on translucent objects. This idea is similar to Dong et al.'s method [8], which estimates surface normals from deconvolved input images in order to remove the subsurface scattering effect. While Dong et al. assume parametric subsurface scattering, i.e., photon beam diffusion in optically homogeneous media, we represent subsurface scattering by non-parametric convolution kernels for either optically homogeneous or inhomogeneous media. The convolution kernels can be either calibrated or estimated, and various deconvolution techniques in the literature (such as image deblurring methods) can be used in the implementation to recover deblurred surface normals. In the experiments, we show estimation results obtained both with our deconvolution formulation and with existing deconvolution methods.

2 Related Work

Conventional photometric stereo methods recover surface normals at pixel-level detail based on local illumination models. While the original work of Woodham [1] uses a simple Lambertian reflection model [2], more recent approaches make various generalizations, either by explicitly accounting for more flexible reflectance models [9, 10] or by robust estimation frameworks [3, 4]. These methods are shown to be effective; however, for translucent objects, global light interactions need to be accounted for to achieve accurate surface normal estimation. A seminal work of Nayar et al. [11] explicitly takes interreflections into account, which are global light transports among opaque surfaces. While the problem is similar, subsurface scattering remains an unaddressed light transport effect in shape-from-intensity methods.

Recently, structured-light methods for measuring the shape of translucent surfaces have been proposed. To reduce the effect of subsurface scattering, combinations of polarizers and phase shifting [12], multiplication of low-frequency and high-frequency projection patterns [13], and high-frequency illumination with multiplexed light sources [14] have been used. In addition, Gupta et al. [15] use several binary projection codes to decrease estimation errors caused by subsurface scattering and interreflections. These techniques are shown to be effective, at the expense of specialized hardware setups.

Modeling subsurface scattering has been studied more extensively in computer graphics as the bidirectional scattering surface reflectance distribution function (BSSRDF). Although general BSSRDFs can represent various translucent appearances, it is difficult to model them exactly because of their high dimensionality. Hence, prior work approximates BSSRDFs as low-dimensional functions under an assumption of homogeneous media [16], or as isotropic scattering based on diffusion theory [17-19]. In our work, we also approximate subsurface scattering by a simple model under an assumption of optically thick materials. In optically thick materials, incident light scatters repeatedly and loses its directionality; as a result, the scattering strength becomes invariant to the illumination and observation directions. Based on this characteristic, subsurface scattering is approximated by a convolutional model in [7]. Our method is built upon this observation and extends it to develop a photometric stereo method for optically thick translucent objects.

Fig. 1. Light interactions on a translucent surface. Incident light is partially reflected off the surface, while the rest of the light transmits and spreads inside the subsurface.

3 Convolutional Image Formation Model

We begin with the image formation model for a translucent surface. When light illuminates a translucent surface, it is partially reflected, transmitted, and also absorbed, as depicted in Fig. 1. A portion of the transmitted light comes back to the surface via subsurface scattering; thus, the radiance I(x, l, v) at a scene point x with incident vector l and observation vector v becomes the sum of the reflection Ir(x, l, v) and subsurface scattering Is(x, l, v) components as

I(x, l, v) = Ir(x, l, v) + Is(x, l, v).  (1)

The subsurface scattering component Is(x, l, v) is modeled as [17]

Is(x, l, v) = γ(x) F(v, n(x), η) ∫_{y∈A} R(x, y) F(l, n(y), η) n(y)ᵀ l dy,  (2)

where γ(x) is a scale factor for the subsurface scattering component, F represents Fresnel transmission, and v, n, l ∈ R³ are the observation, surface normal, and incident vectors, respectively. η is the refractive index, R(x, y) is an extinction term for light traveling from scene point x to its neighbor y, e.g., the dipole model [17], and A defines a neighboring area. In general, the subsurface scattering component describes a nonlinear relation between the surface normals and the observed intensity due to the Fresnel transmission terms. To relax this complexity, we approximate the original model in a simpler form by assuming an optically thick material, as in [7]. On the surface of an optically thick material, subsurface scattering does not depend on the direction of the light, because the transmitted light scatters countless times and loses its directionality, as in the diffusion approximation. Thus, subsurface scattering is invariant to the incident and outgoing directions, and the Fresnel terms F can be regarded as constants on optically thick materials. As a result, the subsurface scattering component simplifies to

Is(x, l) = γ′(x) ∫_{y∈A} R(x, y) n(y)ᵀ l dy,  (3)

where γ′(x) is a new scale factor of subsurface scattering that includes the constant Fresnel transmission terms.

Assuming a Lambertian reflectance model for the reflection component, Ir(x, l) = ρ(x) n(x)ᵀ l with a diffuse albedo ρ(x), the intensity observation I(x, l) can be written as

I(x, l) = ( ρ(x) n(x) + γ′(x) ∫_{y∈A} R(x, y) n(y) dy )ᵀ l.  (4)

The first factor of Eq. (4) can be regarded as a simple convolution model as

I(x, l) = ( ∫_{y∈A} h(x, y) n(y) dy )ᵀ l = (h ∗ n(x))ᵀ l,  (5)

where ∗ is the convolution operation and the kernel h represents the scattering effect on the surface normals as

h(x, y) = ρ(x) δ(x − y) + γ′(x) R(x, y).  (6)

A similar convolutional approximation of subsurface scattering is discussed in the work of Munoz et al. [7] for the forward rendering of optically thick materials. Their method is in turn inspired by the convolutional approximations of subsurface scattering by d'Eon et al. [18] for the rendering of human skin and by Donner et al. [16] for multi-layered materials. Unlike these methods, where the extinction term R(x, y) is a function parameterized only by the relative position of x and y, our method allows more flexibility for the extinction term R(x, y), so that inhomogeneous translucent materials can also be handled.
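To make Eqs. (4)-(6) concrete, the following minimal sketch renders shading images under the convolutional model, assuming an optically homogeneous material (so that R(x, y) depends only on x − y), a constant albedo, and an illustrative exponential-falloff kernel in place of a measured or dipole-based R; none of these specific choices are prescribed by the paper.

```python
import numpy as np
from scipy.signal import fftconvolve

def scattering_kernel(size=21, sigma=3.0):
    """Illustrative isotropic extinction kernel R; a dipole model could be used instead."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    R = np.exp(-np.sqrt(xx**2 + yy**2) / sigma)
    return R / R.sum()

def render_observation(n, l, rho=0.6, gamma=0.4, R=None):
    """Eq. (5): I(x, l) = (h * n)(x)^T l with h = rho*delta + gamma*R (Eq. (6)).

    n: H x W x 3 unit normal map; l: 3-vector light direction."""
    if R is None:
        R = scattering_kernel()
    # The delta term passes the normals through unblurred (direct reflection);
    # the extinction term blurs each normal component with R.
    hn = np.stack([rho * n[..., c] + gamma * fftconvolve(n[..., c], R, mode="same")
                   for c in range(3)], axis=-1)
    return hn @ np.asarray(l)  # per-pixel inner product with the light direction
```

Stacking such renderings over k light directions yields the observation matrix D used in Eq. (8) below.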


4 Solution Method

Based on the convolutional image formation model, we develop a photometric stereo method for estimating the surface normals of an optically thick translucent surface. Our input is the same as in traditional photometric stereo: a set of images taken under varying lighting directions from a fixed viewpoint. To simplify the discussion, we assume that the light directions are calibrated and that the observations do not include shadows. In the rest of the paper, we work with discretized pixel sites u and v that correspond to scene points x and y, respectively; thus Eq. (5) becomes

I(u, l) = (h(u, v) ∗ n(u))ᵀ l.  (7)

The convolution equation Eq. (7) has a simple linear algebraic expression as

D = HNL, (8)

where D ∈ R^{m×k} is an observation matrix, m and k are the numbers of pixels and light directions, respectively, H ∈ R^{m×m} is a scattering matrix, N ∈ R^{m×3} is a surface normal matrix, and L ∈ R^{3×k} is an incident light matrix, which is assumed to be known. This linear expression has a close similarity to Lambertian photometric stereo [1], where the observation D, scaled surface normals Ns, and light matrix L have the following relationship:

D = NsL. (9)

From Eqs. (8) and (9), we can see that the scaled surface normal Ns corresponds to HN as

Ns = HN. (10)

Therefore, we can regard the scaled surface normal Ns as a blurry version of the original surface normal N that we wish to estimate. In the following, we call Ns the smoothed surface normal.

Based on this observation, we estimate the surface normal N by taking the following two-step approach: (a) obtain the smoothed surface normal Ns by Lambertian photometric stereo [1]; (b) estimate the surface normal N in a deconvolution framework using the subsurface scattering matrix H.

(a) Estimation of smoothed surface normal Ns. We use conventional Lambertian photometric stereo [1] to derive the smoothed surface normal Ns as

Ns = DL†,  (11)

where † denotes the Moore-Penrose pseudo-inverse.
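In matrix form this step is a single pseudo-inverse; a minimal sketch (the function name is ours):

```python
import numpy as np

def smoothed_normals(D, L):
    """Eq. (11): Ns = D L^† via the Moore-Penrose pseudo-inverse.

    D: m x k observations (pixels x lights); L: 3 x k light directions.
    The returned m x 3 normals are deliberately left unnormalized, since
    Eq. (10) relates them, including their scale, to HN."""
    return D @ np.linalg.pinv(L)
```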


(b) Estimation of original surface normal N. Once the smoothed surface normal Ns is obtained, we use Eq. (10) to derive the original surface normal N. If the scattering matrix H is available and invertible, we can directly obtain an estimate of the original surface normal in a linear least-squares fashion as N = H⁻¹Ns. Since the estimate produced by such a simple deconvolution is often degraded by ringing artifacts due to the loss of high-frequency information in the original signal, we use a smoothness constraint to stabilize the estimation. We design the smoothness term as a weighted second-order difference of n(u) among u's neighboring locations t and v as

n′′(u) = w(t, u)(n(t) − n(u)) − w(u, v)(n(u) − n(v)).  (12)

The weight w(u, v) controls the discontinuity of surface normals based on the difference of intensity observations across the varying lightings l_i as

w(u, v) = exp( −(1/m) Σ_{i=1}^{k} (I(u, l_i) − I(v, l_i))² ).  (13)

The matrix expression of the smoothness term N′′ is given as

N′′ = WN,  (14)

where W ∈ R^{a×m} is the matrix of second-order derivative filters, and a is the number of triplets used for computing the second-order differences. In our case, we define the triplets along the horizontal and vertical directions in the image coordinates. Finally, our estimation problem becomes a ridge regression problem:

N̂ = argmin_N ||HN − Ns||²_F + λ||WN||²_F,  (15)

where λ controls the smoothness of the estimates. An explicit solution to this problem is obtained by setting its first-order derivative to zero:

N = (HᵀH + λWᵀW)⁻¹ HᵀNs.  (16)

In this manner, estimates of the original surface normal N are obtained in closed form.
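Below is a minimal sparse-matrix sketch of step (b). The flattening convention (row-major over an h × w pixel grid), the helper names, and the use of a sparse LU factorization are our choices; the scattering matrix H is assumed to be given.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import factorized

def normal_deconvolution(H_mat, Ns, I_obs, shape, lam=0.1):
    """Solve Eq. (16) with the weighted second-order smoothness of Eqs. (12)-(14).

    H_mat: sparse m x m scattering matrix; Ns: m x 3 smoothed normals;
    I_obs: m x k observations (for the weights of Eq. (13)); shape: (h, w)."""
    h, w = shape
    m = h * w
    idx = np.arange(m).reshape(h, w)

    def weight(u, v):
        # Eq. (13): large intensity differences across lightings -> small weight,
        # normalized by m as written in the paper.
        return np.exp(-np.sum((I_obs[u] - I_obs[v]) ** 2, axis=-1) / m)

    rows, cols, vals, nrows = [], [], [], 0
    # Second-order triplets (t, u, v) along horizontal and vertical directions.
    for t, u, v in [(idx[:, :-2], idx[:, 1:-1], idx[:, 2:]),
                    (idx[:-2, :], idx[1:-1, :], idx[2:, :])]:
        t, u, v = t.ravel(), u.ravel(), v.ravel()
        wtu, wuv = weight(t, u), weight(u, v)
        r = np.arange(nrows, nrows + t.size)
        nrows += t.size
        rows += [r, r, r]
        cols += [t, u, v]
        vals += [wtu, -(wtu + wuv), wuv]   # Eq. (12) expanded per triplet
    W = sp.csr_matrix((np.concatenate(vals),
                       (np.concatenate(rows), np.concatenate(cols))),
                      shape=(nrows, m))

    A = (H_mat.T @ H_mat + lam * W.T @ W).tocsc()
    solve = factorized(A)                  # sparse LU, reused for x, y, z
    b = H_mat.T @ Ns
    N = np.column_stack([solve(b[:, c]) for c in range(3)])
    return N / np.linalg.norm(N, axis=1, keepdims=True)  # unit normals
```

For optically homogeneous media H is a convolution (block-Toeplitz) matrix and FFT-based deconvolution would be far cheaper; the sparse formulation is what accommodates spatially varying kernels (cf. the computation-time discussion in Sec. 5.3).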

The mathematical expression of the problem is equivalent to that of image deblurring, where the original sharp image is recovered via deconvolution. The important difference, however, is that our problem deals with the deconvolution of surface normals; therefore, conventional image priors developed for natural images may not be suitable. Apart from this aspect, existing deconvolution techniques can alternatively be used to estimate the surface normal N from the smoothed surface normal Ns. The convolution kernel H is generally unknown, but it can be either calibrated (non-blind deconvolution) or estimated (blind deconvolution). While most image deblurring techniques are limited to spatially invariant point spread functions (PSFs), which corresponds to handling optically homogeneous materials in our case, the formulation of Eq. (16) naturally handles optically inhomogeneous materials, corresponding to spatially varying PSFs.
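As an example of such an off-the-shelf alternative for homogeneous media, a frequency-domain Wiener filter can be applied to each component of the smoothed normal map. A sketch, where the noise-to-signal constant nsr is a tuning parameter of our choosing rather than a value from the paper:

```python
import numpy as np

def wiener_deconvolve_normals(Ns_map, psf, nsr=1e-2):
    """Per-channel Wiener deconvolution of an h x w x 3 smoothed normal map.

    psf: 2-D scattering kernel; nsr: noise-to-signal regularization constant."""
    h, w, _ = Ns_map.shape
    ph, pw = psf.shape
    pad = np.zeros((h, w))
    pad[:ph, :pw] = psf
    # Shift the kernel center to the origin so the filter introduces no offset.
    P = np.fft.fft2(np.roll(pad, (-(ph // 2), -(pw // 2)), axis=(0, 1)))
    G = np.conj(P) / (np.abs(P) ** 2 + nsr)          # Wiener filter
    N = np.stack([np.fft.ifft2(np.fft.fft2(Ns_map[..., c]) * G).real
                  for c in range(3)], axis=-1)
    return N / np.linalg.norm(N, axis=-1, keepdims=True)
```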


Fig. 2. Setting for measuring the convolution kernel. The projector casts a thin light ray onto the target object. We estimate the convolution kernel from the incident pattern and the light distribution on the target object. In the case of inhomogeneous media, we capture light distributions at optically different regions.

4.1 Calibration of Convolution Kernel

As mentioned above, surface normal deconvolution can be performed without knowing the convolution kernel by using blind deconvolution techniques; however, knowledge of the convolution kernel is useful for stabilizing the estimation. Here we describe a simple procedure for measuring the convolution kernel. Fig. 2 shows our setting. By individually illuminating a diffuse surface and the target translucent material with a thin ray emitted from a projector, we obtain measurements of the incident light distribution and of the scattering response on the surface, respectively. The measured scattering response corresponds to the convolution of the incident light distribution with the convolution kernel; from this relationship, we calibrate the convolution kernel h. When the target medium is optically inhomogeneous, we need to calibrate a convolution kernel for each optically distinct region.
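In sketch form, with p the incident pattern measured on the diffuse surface and s the response measured on the translucent target, s ≈ h ∗ p, so a regularized Fourier division recovers h. The regularization constant eps and the centering step are our assumptions:

```python
import numpy as np

def calibrate_kernel(p, s, eps=1e-3):
    """Recover the convolution kernel h from s = h * p (both 2-D images)."""
    P, S = np.fft.fft2(p), np.fft.fft2(s)
    H = S * np.conj(P) / (np.abs(P) ** 2 + eps)   # regularized deconvolution
    return np.fft.fftshift(np.fft.ifft2(H).real)  # center the kernel for inspection
```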

5 Experiments

We evaluate our method using both synthetic and real-world data for the purposes of quantitative and qualitative evaluation.

5.1 Synthetic data

Homogeneous media. For the synthetic data, we use two scenes, A and B, as shown in Fig. 3. The image sizes of scenes A and B are 150×113 and 160×160 pixels, respectively. For synthesizing the input images under varying lightings, we use Eq. (1) with the subsurface scattering model of Eq. (2). For the extinction term R(x, y) in Eq. (2), we use the dipole model [17] with the same parameters described in their paper. The camera model is orthographic, and a pixel area corresponds to (4/15)² mm².

Figure 3 (b) shows the result of Lambertian photometric stereo based on Eq. (11) and its angular error in pseudo color. Although the estimated surface normals are smoothed out due to subsurface scattering, especially around edges, the low-frequency component of the overall surface normal directions is largely recovered.

To apply our surface normal deconvolution of Eq. (16), we use the extinction term R(x, y) as the convolution kernel. The distance between scene points x and y is approximated by the distance between pixel sites u and v in the image coordinates. Figures 3 (c) and (d) show the results of our method with varying smoothness factors, λ = 0.01 and λ = 0.1, respectively. While the results with the small smoothness factor λ = 0.01 are sharper, they suffer from ringing artifacts around surface normal edges. Although the proper value of λ is scene-dependent and thus difficult to choose, as with any regularization technique, with a proper value of λ our method significantly improves the reconstruction accuracy over Lambertian photometric stereo, which considers only the local illumination model, even though we assume the same Lambertian reflectance component. Table 1 summarizes the maximum and mean angular errors of the surface normal estimates for various material parameters. In general, we observed that a smaller magnitude of subsurface scattering yields better accuracy, because stronger subsurface scattering cuts off the high-frequency signals more severely. By properly accounting for subsurface scattering, the accuracy improves by roughly 2 to 5 times in comparison with the baseline technique that considers only the local illumination model.

For optically homogeneous materials, we can also use conventional deconvolution methods in place of solving Eq. (16). Figures 4 and 5 show the results of conventional non-blind and blind deconvolution for scene B, respectively. For the non-blind deconvolution methods, we use the same convolution kernel as the one used to produce the results of Fig. 3. The results show consistent improvement over Lambertian photometric stereo, although these methods are not specifically designed for deblurring surface normal fields. The results of the blind deconvolution methods in Fig. 5, where the convolution kernel is not given but simultaneously estimated, also show improvement. While blind deconvolution is a harder problem than non-blind deconvolution and the results are generally worse, it is a viable option when knowledge of the convolution kernel is unavailable.

Inhomogeneous media. Our solution method is naturally applicable to inhomogeneous materials as long as the convolution kernel H in Eq. (16) is defined. To evaluate the performance of our method on inhomogeneous materials, we produce synthetic images that contain different optical thicknesses, using masks that indicate the material regions as shown in Fig. 6 (a) and (b). Due to the different magnitudes of subsurface scattering in the material regions, the surface normal estimates obtained by Lambertian photometric stereo, shown in Fig. 6 (d) and (e), exhibit varying smoothness: smoother in the gray mask region and sharper in the white mask region.

By applying our method, the surface normal field is consistently improved regardless of the material regions, as shown in the figure. This recovery shows higher accuracy than that of Fig. 3 because this inhomogeneous example contains a region where scattering is less significant.


Fig. 3. Results for synthetic scenes A and B. (a) shows an example of the synthetic images and the PSF used for deconvolution, generated with the dipole model and the skim milk parameters in [17]. (b) shows the surface normal and angular error maps of Lambertian photometric stereo (scene A: max 56.4°, mean 11.7°; scene B: max 61.3°, mean 15.7°). More faithful surface normals are obtained with our method in (c) λ = 0.01 (scene A: max 65.6°, mean 7.2°; scene B: max 52.1°, mean 6.0°) and (d) λ = 0.1 (scene A: max 48.8°, mean 6.8°; scene B: max 98.4°, mean 10.7°).

Table 1. Max and mean angular errors [deg.] of scenes A and B with various materials. The parameters of each material are described in [17]. (The table's side panel plots the scattering kernels of the five materials.)

Scene A           Lambertian PS    Ours (λ = 0.01)   Ours (λ = 0.1)
                  max     mean     max     mean      max     mean
1. Marble         41.5     7.7     40.1     3.0      30.5     2.6
2. Skim milk      56.4    11.7     65.6     7.2      48.8     6.8
3. Whole milk     37.7     6.6     30.7     2.5      25.4     1.9
4. Skin1          54.4    10.7     61.4     7.6      50.6     6.3
5. Skin2          50.3     9.8     59.5     5.1      44.1     4.7

Scene B           Lambertian PS    Ours (λ = 0.01)   Ours (λ = 0.1)
                  max     mean     max     mean      max     mean
1. Marble         56.2    11.9     29.1     1.9      36.2     5.6
2. Skim milk      61.3    15.7     52.1     6.0      98.4    10.7
3. Whole milk     52.4    10.7     22.1     1.5      28.5     4.4
4. Skin1          63.4    15.3     43.1     6.5     105.1    10.6
5. Skin2          61.5    14.3     47.9     4.2      86.2     8.7


Fig. 4. Surface normal estimates of scene B using non-blind deconvolution methods: (a) Wiener filter [20] (max 47.0°, mean 6.4°), (b) Richardson-Lucy [21] (max 47.8°, mean 6.9°), (c) Cho et al. [22] (max 56.0°, mean 6.8°), (d) Dong et al. [23] (max 54.6°, mean 11.0°).

Fig. 5. Surface normal estimates of scene B using blind deconvolution methods: (a) Levin et al. [24] (max 53.6°, mean 9.1°), (b) Krishnan et al. [25] (max 54.9°, mean 11.5°).

5.2 Real-world data

We also tested our method on real-world translucent objects. Figure 7 (a) shows our experimental setting. We used a Nikon D90 camera with a linear radiometric response function (RAW mode) and a telescopic lens to approximate an orthographic projection. The target scenes are illuminated under directional lightings, and the light directions are calibrated using a dark specular sphere. In addition, to avoid specular reflections from the scene, we placed polarizers in front of both the light source and the camera. We used three target objects: a soap as a homogeneous medium, and angel and unicorn ornaments as inhomogeneous media, as shown in Fig. 7 (b). Each scene is recorded under 12 different lighting directions. The image sizes of the soap, angel, and unicorn scenes are 232×164, 206×257, and 158×230 pixels, respectively. Prior to the measurement, the convolution kernels are measured using the procedure described in Sec. 4.1. For the inhomogeneous objects, we measured two distinct kernels for the different material regions, one for the white region and the other for the pink region. Examples of the recorded intensity images are shown in the left-most column of Fig. 8.


Fig. 6. Results with optically inhomogeneous media using scenes A and B. (a) and (b) show the masks indicating the different material regions (white and gray) and one of the synthesized images for each scene. We use the two convolution kernels shown in (c) for the distinct regions. (d) and (e) show the smoothed surface normals obtained by Lambertian photometric stereo (scene A: max 50.7°, mean 10.2°; scene B: max 63.7°, mean 13.3°) and our results with λ = 0.01 (scene A: max 39.3°, mean 4.5°; scene B: max 36.4°, mean 3.2°).

These images are not significantly blurry, but the details are smoothed out due to subsurface scattering.

The top row of Fig. 8 shows the result for the soap scene. While the surface normal recovered by Lambertian photometric stereo is significantly smoothed out, our method produces a much sharper estimate. The middle and bottom rows show the results for the angel and unicorn scenes, respectively. To assess the reconstruction accuracy, we created replicas of these objects by making plaster molds. Assuming that the plaster reflectance is close to Lambertian, we obtained reference surface normals by applying Lambertian photometric stereo to the replicas. The surface normals of the plaster replicas exhibit the details of the original shapes, while the result of Lambertian photometric stereo is smoother. Our method makes the blurry surface normals much sharper and closer to the reference surface normals.

5.3 Discussion

Computation time. The previous experiments show that, in the case of optically homogeneous materials, we can apply various fast deconvolution methods for image deblurring to recover the surface normals. In the case of inhomogeneous media, however, we have to solve Eq. (16) to deal with spatially varying convolution kernels. Our MATLAB implementation on an Intel Core i7 CPU (3.5 GHz) takes about 17.6, 39, and 3.5 seconds to recover the surface normals of the soap, angel, and unicorn scenes, respectively. The density of non-zero elements of the matrix HᵀH + λWᵀW in Eq. (16) is about 2.5%. The computation time depends on the size and the number of non-zero elements of this matrix, which are determined by the input image size and the apparent sizes of the PSFs in the image coordinates.


Fig. 7. Experimental setting and target objects: a soap (homogeneous medium) and angel and unicorn ornaments (inhomogeneous media). We use a projector (3M MP220) as the light source, and the camera (Nikon D90) is equipped with a telescopic lens. Polarizers are used to reduce the effects of specular reflection on the target object, and a specular sphere is used to calibrate the light directions.

Fig. 8. Results for the real-world scenes: (a) examples of the measured images and observed PSFs, and (b) surface normals estimated by Lambertian photometric stereo, by our method (λ = 0.1), and from the plaster replicas. The soap scene in the top row is a homogeneous medium, while the middle and bottom rows (angel and unicorn scenes) are inhomogeneous media with distinct kernels for the pink and white regions.



Limitations. Our method has a couple of limitations. First, we have ignored the influence of Fresnel transmission, so our method is restricted to optically thick materials. As a material shows more directional scattering, the accuracy of our method may gradually decrease. We are interested in exploring an iterative estimation framework that adaptively updates the convolution kernels to incorporate Fresnel transmission effects. Second, our method in practice relies on known convolution kernels, especially when dealing with optically inhomogeneous materials. Although a sophisticated blind deconvolution method may resolve this issue, at this point the knowledge of the convolution kernel plays an important role in obtaining accurate surface normal estimates. We are interested in investigating a good prior for surface normal fields that could improve blind deconvolution.

6 Discussion and Summary

In this paper, we proposed a photometric stereo method for optically thick translucent objects. We extended a previous study on the convolutional approximation of subsurface scattering and developed a surface normal deconvolution technique, which consists of conventional photometric stereo followed by deconvolution. Our experiments show that the surface normals of translucent objects are reliably estimated by our method. As shown in the experimental section, our method can benefit from the large body of image deblurring methods in the literature, including blind deconvolution methods. In addition, we have shown that our method can handle optically inhomogeneous media.

References

1. Woodham, R.J.: Photometric Method For Determining Surface Orientation From Multiple Images. Optical Engineering 19 (1980) 139-144

2. Lambert, J.H.: Photometria sive de mensura et gradibus luminis, colorum et umbrae. Eberhard Klett (1760)

3. Ikehata, S., Wipf, D., Matsushita, Y., Aizawa, K.: Robust Photometric Stereo using Sparse Regression. In: Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). (2012)

4. Wu, L., Ganesh, A., Shi, B., Matsushita, Y., Wang, Y., Ma, Y.: Robust Photometric Stereo via Low-Rank Matrix Completion and Recovery. In: Proc. of Asian Conference on Computer Vision (ACCV). (2010)


5. Moore, K.D., Peers, P.: An empirical study on the effects of translucency on photometric stereo. The Visual Computer 29(6-8) (2013) 817-824

6. Nayar, S.K., Krishnan, G., Grossberg, M.D., Raskar, R.: Fast separation of direct and global components of a scene using high frequency illumination. ACM Trans. on Graph. (ToG) 25(3) (2006) 935-944

7. Munoz, A., Echevarria, J.I., Seron, F.J., Gutierrez, D.: Convolution-based simulation of homogeneous subsurface scattering. Computer Graphics Forum 30(8) (2011) 2279-2287

8. Dong, B., Moore, K., Zhang, W., Peers, P.: Scattering Parameters and Surface Normals from Homogeneous Translucent Materials using Photometric Stereo. In: Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). (2014)

9. Alldrin, N., Zickler, T., Kriegman, D.: Photometric stereo with non-parametric and spatially-varying reflectance. In: Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). (2008)

10. Goldman, D.B., Curless, B., Hertzmann, A., Seitz, S.M.: Shape and spatially-varying BRDFs from photometric stereo. IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI) 32 (2010) 1060-1071

11. Nayar, S.K., Ikeuchi, K., Kanade, T.: Shape from interreflections. In: Proc. of International Conference on Computer Vision (ICCV). (1990)

12. Chen, T., Lensch, H.P.A., Fuchs, C., Seidel, H.P.: Polarization and Phase-shifting for 3D Scanning of Translucent Objects. In: Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). (2007)

13. Chen, T., Seidel, H.P., Lensch, H.P.A.: Modulated phase-shifting for 3D scanning. In: Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). (2008)

14. Gu, J., Kobayashi, T., Gupta, M., Nayar, S.K.: Multiplexed Illumination for Scene Recovery in the Presence of Global Illumination. In: Proc. of International Conference on Computer Vision (ICCV). (2011)

15. Gupta, M., Agrawal, A., Veeraraghavan, A., Narasimhan, S.G.: A Practical Approach to 3D Scanning in the Presence of Interreflections, Subsurface Scattering and Defocus. International Journal of Computer Vision (IJCV) 102 (2012) 33-55

16. Donner, C., Lawrence, J., Ramamoorthi, R., Hachisuka, T., Jensen, H.W., Nayar, S.: An empirical BSSRDF model. ACM Trans. on Graph. (ToG) 28(3) (2009)

17. Jensen, H.W., Marschner, S.R., Levoy, M., Hanrahan, P.: A practical model for subsurface light transport. In: Proc. ACM SIGGRAPH. (2001) 511-518

18. d'Eon, E., Irving, G.: A quantized-diffusion model for rendering translucent materials. ACM Trans. on Graph. (ToG) 30 (2011)

19. Borshukov, G., Lewis, J.: Realistic human face rendering for "The Matrix Reloaded". In: Proc. ACM SIGGRAPH Sketches and Applications. (2003)

20. Wiener, N.: Extrapolation, Interpolation, and Smoothing of Stationary Time Series: With Engineering Applications. 1 edn. MIT Press (1949)

21. Richardson, W.H.: Bayesian-based iterative method of image restoration. Journal of the Optical Society of America (JOSA) 62 (1972) 55-59

22. Cho, S., Wang, J., Lee, S.: Handling outliers in non-blind image deconvolution. In: Proc. of International Conference on Computer Vision (ICCV). (2011)

23. Dong, W., Zhang, L., Shi, G.: Centralized sparse representation for image restoration. In: Proc. of International Conference on Computer Vision (ICCV). (2011)

24. Levin, A., Weiss, Y., Durand, F., Freeman, W.T.: Efficient marginal likelihood optimization in blind deconvolution. In: Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). (2011)


25. Krishnan, D., Tay, T., Fergus, R.: Blind deconvolution using a normalized sparsity measure. In: Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). (2011)