

Optik 124 (2013) 2097–2100

Contents lists available at SciVerse ScienceDirect

Optik

journal homepage: www.elsevier.de/ijleo

Segmentation-based adaptive vergence control for parallel multiview stereoscopic images

Jianjun Lei∗, Hailong Zhang, Chunping Hou, Liyuan Lin
School of Electronic Information Engineering, Tianjin University, Tianjin 300072, China

Article info

Article history:
Received 20 January 2012
Accepted 17 June 2012

Keywords:
Multiview
Autostereoscopic display
Vergence control

Abstract

The vergence control is important for multiview stereoscopic images captured by a camera array to acquire natural and clear stereoscopic effects. In this paper, we derive the basic characteristics of vergence control for parallel multiview stereoscopic images. Then we segment the stereoscopic images based on Mean Shift, and shift the images by the disparity of the object in the central region to achieve the adaptive vergence control. We have evaluated our method on Middlebury data sets and artificially synthesized images. Experimental results show that the proposed method is effective, and the generated stereoscopic images could reproduce the 3D scenes vividly in a multiview autostereoscopic display.

© 2012 Elsevier GmbH. All rights reserved.

1. Introduction

Stereoscopic display technology is receiving increasing attention because it enables us to obtain the depth information of the objects and improve the perception of the distribution of the objects in the scene. There are many types of stereoscopic display technology, ranging from classic stereo systems that require glasses to more sophisticated multiview autostereoscopic displays that do not need glasses [1–3]. Multiview autostereoscopic displays provide highly realistic stereoscopic images and free-view navigation to viewers by generating various viewpoints of the scene [4–6].

There are two means to get the stereoscopic images for multiview autostereoscopic displays: the toed-in camera array and the parallel camera array [7,8]. Stereoscopic images captured directly usually have fusion problems which lead to blurriness, so vergence control is essential to acquire natural and clear stereoscopic effects. In the toed-in method, vergence control can be achieved by rotating all optical axes to converge on the key object. However, even with vergence control, inherent vertical disparity and keystone distortion remain and result in visual fatigue. For the parallel method, vergence control is achieved by moving the CCD horizontally to alter its relative position with respect to the lens, or by shifting the captured images. Kwon et al. [9] implemented vergence control via disparity information for binocular stereoscopic images, but did not consider the multiview stereoscopic images. Deng et al. [10] shifted the images captured by a tri-view parallel camera to implement the vergence control. The processed stereoscopic images have both positive and negative disparity and are free from vertical disparities and keystone distortions. Nevertheless, the method only suits scenes of fixed depth because the images must be shifted manually or the shifting range preset in the camera.

In this paper, we derive the characteristics of vergence control in parallel multiview stereoscopic images. After the disparity of the object in the central region is obtained by means of Mean Shift segmentation, the images can be shifted by this disparity to realize the vergence control.

∗ Corresponding author. E-mail address: [email protected] (J. Lei).

0030-4026/$ – see front matter © 2012 Elsevier GmbH. All rights reserved.
http://dx.doi.org/10.1016/j.ijleo.2012.06.021

2. Characteristics of vergence control

Vergence control is indispensable to get natural and clear stereoscopic images using a parallel camera array. Taking eight views as an example, the principle of vergence control for multiview stereoscopic images is derived as follows. First, we establish the coordinate system based on view 1. Let $V_1, V_2, \ldots, V_8$ denote the horizontal coordinates of view 1 to view 8, $V'_1, V'_2, \ldots, V'_8$ the horizontal coordinates of view 1 to view 8 after vergence control, $d_1, d_2, \ldots, d_7$ the horizontal disparities of adjacent views, and $d'_1, d'_2, \ldots, d'_7$ the horizontal disparities of adjacent views after vergence control. Then, view 2 to view 8 are shifted by $\Delta x_1, \Delta x_2, \ldots, \Delta x_7$ to reduce the disparity as:

$$V'_{i+1} = V_{i+1} - \Delta x_i, \quad i = 1, 2, \ldots, 7 \tag{1}$$

$$d'_i = V'_{i+1} - V'_i =
\begin{cases}
(V_2 - \Delta x_1) - V_1 = d_1 - \Delta x_1, & i = 1 \\
(V_{i+1} - \Delta x_i) - (V_i - \Delta x_{i-1}) = d_i - \Delta x_i + \Delta x_{i-1}, & i = 2, 3, \ldots, 7
\end{cases} \tag{2}$$
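To make Eqs. (1) and (2) concrete, the short NumPy sketch below is our own numerical illustration (not code from the paper); the coordinate and shift values are hypothetical. It computes the shifted coordinates $V'$ and the resulting adjacent disparities $d'$.

```python
import numpy as np

# Hypothetical horizontal coordinates of one scene point in views 1..8
# (a parallel, equally spaced camera array gives equal adjacent disparities).
V = np.array([100, 112, 124, 136, 148, 160, 172, 184], dtype=float)
d = np.diff(V)                      # d_1..d_7, here all equal to 12

# Arbitrary shifts applied to views 2..8 (view 1 is the reference).
dx = np.array([3, 6, 9, 12, 15, 18, 21], dtype=float)

V_new = V.copy()
V_new[1:] -= dx                     # Eq. (1): V'_{i+1} = V_{i+1} - dx_i
d_new = np.diff(V_new)              # new adjacent disparities d'_i

# Eq. (2) written out: d'_1 = d_1 - dx_1;  d'_i = d_i - dx_i + dx_{i-1} for i >= 2
check = d - dx + np.concatenate(([0.0], dx[:-1]))
assert np.allclose(d_new, check)
print(d_new)                        # [9. 9. 9. 9. 9. 9. 9.]
```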




Fig. 1. Illustration of characteristics of vergence control.

It has been proved that the most comfortable and ideal stereoscopic effect occurs when $d_1 = d_2 = \cdots = d_7 = d$. Fig. 1 shows the desirable situation that the multiview stereoscopic images meet the condition $d_1 = d_2 = \cdots = d_7 = d$. We set zone A as the zero disparity object in view 1 and shift view 2, view 3, ..., view 8 by $\Delta x, 2\Delta x, \ldots, 7\Delta x$ as in the following equations:

$$V'_{i+1} = V_{i+1} - \Delta x_i = V_{i+1} - i \times \Delta x, \quad i = 1, 2, \ldots, 7 \tag{3}$$

$$d'_i = V'_{i+1} - V'_i = (V_{i+1} - i \times \Delta x) - (V_i - (i-1) \times \Delta x) = d - \Delta x, \quad i = 1, 2, \ldots, 7 \tag{4}$$

The zero disparity, negative disparity, and positive disparity zones will exist in the images with a sufficient shifting distance.
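In practice this means choosing $\Delta x$ as the disparity of the object that should sit at zero disparity (in our method, the object in the central region) and shifting view $i+1$ by $i \times \Delta x$. The following is a minimal sketch of such an image shift, assuming integer pixel disparities and zero-padding at the exposed border; the function name and interface are our own, not the paper's.

```python
import numpy as np

def vergence_shift(views, dx):
    """Shift view i (0-based) horizontally by i*dx pixels, per Eqs. (3)-(4).

    views : list of H x W x 3 arrays from a parallel camera array, view 1 first.
    dx    : disparity (in pixels) of the chosen zero-disparity object, e.g.
            the central-region object located by Mean Shift segmentation.
    """
    shifted = []
    for i, img in enumerate(views):
        s = int(round(i * dx))
        out = np.zeros_like(img)
        if s > 0:
            out[:, :-s] = img[:, s:]      # content moves left; right border padded
        elif s < 0:
            out[:, -s:] = img[:, :s]      # content moves right; left border padded
        else:
            out[:] = img                  # the reference view is untouched
        shifted.append(out)
    return shifted

# Example (hypothetical values): if the central object has a 12-pixel adjacent
# disparity, it lands at zero disparity in every shifted view.
# views_out = vergence_shift(views_in, dx=12)
```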

3. Disparity estimation based on Mean Shift

The adaptive and robust image segmentation based on Mean Shift is insensitive to smooth and textured regions and is similar to the analysis mechanism of the human eye. In this paper, we segment the images based on Mean Shift, and then use the segmentation results to generate the smoothed dense disparity map.

An image is represented by spatial and color information. The feature vector for the spatial-color Mean Shift filter is defined as $X = (x_s, x_r)^T$. Because space and color are mutually independent, the multivariate kernel is defined as the product of two radially symmetric kernels [11].

$$K_{h_s,h_r}(x) = \frac{B}{h_s^2 h_r^3}\, k\!\left(\left\|\frac{x_s}{h_s}\right\|^2\right) k\!\left(\left\|\frac{x_r}{h_r}\right\|^2\right) \tag{5}$$

where $x_s$ is the spatial information, $x_r$ is the color feature, $k(x)$ is the common kernel profile used in both domains, $h_s$ and $h_r$ are the employed kernel bandwidths, and $B$ is the corresponding normalization constant.

Every point finally converges by iterating the Mean Shift algorithm with the multivariate kernel. The initial points which converge to the same point are classified into the same region and then labeled.
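As a rough, self-contained stand-in for this step (not the authors' implementation), OpenCV's pyramid mean-shift filtering followed by labeling of connected equal-color regions behaves comparably; here hs and hr play the role of the spatial and range bandwidths in Eq. (5), and the L*u*v conversion mirrors the color space reported in the experiments section.

```python
import cv2
import numpy as np
from skimage.measure import label

def mean_shift_segment(bgr, hs=16, hr=16):
    """Illustrative spatial-color Mean Shift segmentation (sketch only).

    hs, hr : spatial and range (color) bandwidths, analogous to Eq. (5).
    Returns an integer label map; pixels whose filtered colors form one
    connected region share a label.
    """
    luv = cv2.cvtColor(bgr, cv2.COLOR_BGR2Luv)        # experiments use L*u*v
    # Mean-shift filtering drives every pixel towards a local mode of the
    # joint spatial-color density.
    filtered = cv2.pyrMeanShiftFiltering(luv, sp=hs, sr=hr)

    # Approximate "initial points that converge to the same point": quantize
    # the filtered colors and label connected runs of identical codes.
    q = (filtered >> 3).astype(np.int32)              # 32 levels per channel
    codes = q[..., 0] * 1024 + q[..., 1] * 32 + q[..., 2]
    return label(codes, background=-1, connectivity=2)
```

Regions smaller than the minimum region size (20 pixels in Section 4) would then be merged into a neighboring segment; that post-processing is omitted in this sketch.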

After the segmentation, we get the initial disparity map through a window-based local method. We introduce the gradient information in light of [12] and combine it with BT (Birchfield and Tomasi) [13], so the cost function is finally designed as:

$$C(p, \bar{p}_d) = w \cdot C_{BT}(p, \bar{p}_d) + (1 - w) \cdot C_{GRAD}(p, \bar{p}_d) \tag{6}$$

where

$$C_{BT}(p, \bar{p}_d) = \sum_{c \in \{r,g,b\}} \min\{\bar{C}(p, \bar{p}_d), \bar{C}(\bar{p}_d, p)\} \tag{7}$$

$$C_{GRAD}(p, \bar{p}_d) = \sum_{c \in \{r,g,b\}} \left|\nabla_x I_c(p) - \nabla_x I'_c(\bar{p}_d)\right|^2 + \sum_{c \in \{r,g,b\}} \left|\nabla_y I_c(p) - \nabla_y I'_c(\bar{p}_d)\right|^2 \tag{8}$$

$$\bar{C}(p, \bar{p}_d) = \max\{0,\; I_c(p) - I'_{c,\max},\; I'_{c,\min} - I_c(p)\} \tag{9}$$

$$\bar{C}(\bar{p}_d, p) = \max\{0,\; I'_c(\bar{p}_d) - I_{c,\max},\; I_{c,\min} - I'_c(\bar{p}_d)\} \tag{10}$$

where $I$, $I'$ are the stereo pair, $p$, $\bar{p}_d$ are the two corresponding points in the stereo pair, $\nabla_x$ and $\nabla_y$ represent the horizontal and vertical gradients, and $w$ is a weighting factor between 0 and 1. $I'_{c,\max}$, $I'_{c,\min}$ are respectively the maximum and minimum of $I'(\bar{p}_d)$, $I'_+$, and $I'_-$ (see [13]).
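The NumPy sketch below is our own vectorized reading of Eqs. (6)–(10) for a single candidate integer disparity d; window aggregation, the disparity search loop, and border handling are omitted, the weighting value is arbitrary, and the sign convention of d depends on which image is taken as the reference.

```python
import numpy as np

def _half_pixel_bounds(img):
    """Per-pixel min/max over {I(x), (I(x)+I(x-1))/2, (I(x)+I(x+1))/2}, as in BT [13]."""
    left = 0.5 * (img + np.roll(img, 1, axis=1))
    right = 0.5 * (img + np.roll(img, -1, axis=1))
    lo = np.minimum(np.minimum(left, right), img)
    hi = np.maximum(np.maximum(left, right), img)
    return lo, hi

def matching_cost(I, Ip, d, w=0.7):
    """Combined BT + gradient cost of Eq. (6) for one integer disparity d.

    I, Ip : H x W x 3 float arrays (the stereo pair I and I').
    d     : candidate disparity; p in I is matched against p shifted by d in I'.
    Returns an H x W cost map; np.roll wrap-around at the borders is ignored.
    """
    Ip_d = np.roll(Ip, d, axis=1)                      # I'(p_bar_d) aligned with I(p)

    # Eqs. (9)-(10): pixel vs. interpolated-pixel dissimilarity in both directions.
    lo_p, hi_p = _half_pixel_bounds(Ip)
    lo_p, hi_p = np.roll(lo_p, d, axis=1), np.roll(hi_p, d, axis=1)
    lo_q, hi_q = _half_pixel_bounds(I)
    c_fwd = np.maximum(0.0, np.maximum(I - hi_p, lo_p - I))
    c_bwd = np.maximum(0.0, np.maximum(Ip_d - hi_q, lo_q - Ip_d))
    c_bt = np.minimum(c_fwd, c_bwd).sum(axis=2)        # Eq. (7), summed over r, g, b

    # Eq. (8): squared differences of horizontal and vertical gradients.
    gx_I, gy_I = np.gradient(I, axis=1), np.gradient(I, axis=0)
    gx_Ip, gy_Ip = np.gradient(Ip_d, axis=1), np.gradient(Ip_d, axis=0)
    c_grad = ((gx_I - gx_Ip) ** 2).sum(axis=2) + ((gy_I - gy_Ip) ** 2).sum(axis=2)

    return w * c_bt + (1.0 - w) * c_grad               # Eq. (6)
```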

Then we utilize a cross-check to locate the mismatched points in the initial disparity map and smooth the disparity map considering the result of the image segmentation [14]. Specifically, supposing that the points lying in the same segmented region possess the same disparity, we compute the mean disparity of the correctly matched points in each segmented region and assign it to that region.
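A compact sketch of this refinement step is given below, assuming integer disparity maps computed in both directions; the cross-check tolerance, function name, and data layout are our own choices rather than details given in the paper.

```python
import numpy as np

def refine_disparity(d_left, d_right, labels, tol=1):
    """Cross-check, then assign each segment the mean disparity of its
    correctly matched pixels (illustrative sketch of the step in [14]).

    d_left  : H x W integer disparity map of the reference view.
    d_right : disparity map computed in the opposite direction.
    labels  : segment label map from the Mean Shift segmentation.
    tol     : maximum allowed left-right disagreement in pixels.
    """
    h, w = d_left.shape
    # Follow each reference-view disparity into the other map and require
    # that it points back to (almost) the same disparity: the cross-check.
    x_other = np.clip(np.arange(w)[None, :] - d_left, 0, w - 1)
    back = d_right[np.arange(h)[:, None], x_other]
    valid = np.abs(d_left - back) <= tol

    refined = d_left.astype(np.float64).copy()
    for seg in np.unique(labels):
        mask = labels == seg
        good = mask & valid
        if good.any():
            # All points in one segment get the mean of its reliable disparities.
            refined[mask] = d_left[good].mean()
    return refined
```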

4. Experiments

3DSMAX simulation experiments and multiview autostereoscopic display experiments using stereoscopic image sets are conducted to evaluate our algorithms. Mean Shift segmentation is implemented in the L*u*v color space, and we use 16, 16, and 20 for $h_s$, $h_r$, and the minimum region size, respectively.

4.1. 3DSMAX simulation experiments

We use 3DSMAX to establish a stereoscopic scene with four cubes. The largest cube is 30 cm long, 120 cm wide, and 106 cm high, and the other three cubes are each 10 cm long, 10 cm wide, and 10 cm high. The distance between the small cube in the middle and the largest cube is 80 cm. The other two small cubes, which are 15 cm apart, are located 110 cm in front of the largest cube. Three cameras, spaced 5 cm apart, are placed 50 cm ahead of the two small cubes. The focal length and the horizontal view angle are 43 mm and 45°. The cameras are arranged in the toed-in array and in the parallel array to capture the stereoscopic images from the frontal view.

Fig. 2 shows the images captured by the toed-in camera array. The principal axes of the cameras converge on the small cube in the middle. The largest cube has positive disparity, and its positions from the left boundaries of the three images move rightward one by one from the left image to the right image. The middle cube has zero disparity and is located at the same position in the three images. The two small symmetrical cubes have negative disparities, and their positions from the left boundaries move leftward one by one from the left image to the right image. However, vertical disparities and keystone distortions exist in the images at the edge of the largest cube.

Fig. 2. Images captured by toed-in camera array.


Fig. 3. Images captured by parallel camera array.


Fig. 5. Eight viewpoint stereoscopic images.

Fig. 6. Image composed directly.

Fig. 4. Images obtained after vergence control.

Fig. 3 shows the images captured by the parallel camera array. All the objects' positions from the left boundaries of the three images move leftward one by one from the left image to the right image due to the negative disparity. The images cannot realize a vivid reproduction of the real scene for the lack of positive disparities. Nevertheless, vertical disparities and keystone distortions do not exist.

We regulate the images captured by the parallel camera array with the proposed vergence control method, and the results are shown in Fig. 4. The small cube in the middle is set to zero disparity using the region segmentation. The positions from the left boundaries of the largest cube, the small cube in the middle, and the two small symmetrical cubes move rightward, stay fixed, and move leftward, respectively, as a result of their positive, zero, and negative disparities. The three images realize a vivid reproduction of the real scene without vertical disparities and keystone distortions.

4.2. Stereoscopic display experiments

We utilize view 1, view 5, and the corresponding ground truth map of the Moebius scene in the Middlebury stereoscopic image database [15], and then generate six virtual images between view 1 and view 5 using a virtual viewpoint synthesis algorithm. Then, view 1, the six virtual images, and view 5 make up the eight viewpoint stereoscopic images of the parallel camera array configuration. Finally, the composition images with a resolution of 1920 × 1080 are obtained through the composition algorithm. Fig. 5 shows the eight viewpoint stereoscopic images. Fig. 6 shows the image composed from the eight viewpoint stereoscopic images through the composition algorithm directly. Fig. 7 shows the image composed from the eight viewpoint stereoscopic images through the composition algorithm after the proposed vergence control method.

As shown in Fig. 6, each region object in the image is blurred, which illustrates that the zero disparity region does not exist and the disparities of the images are relatively large. Only stereoscopic effects out of the screen can be observed when the composition image is displayed on the parallax barrier autostereoscopic display. Fig. 7 shows that the blue lattice object in the middle region of the composition image is clear, and it is the zero disparity object. The yellow and pink objects at the sides of the image are blurred due to negative disparity. The blue lattice object can be observed on the surface of the screen when the composition image is displayed on the autostereoscopic display. Meanwhile, the negative disparity objects (the yellow and pink objects) and the positive disparity objects (the white background) will appear in front of and behind the screen, respectively, which realizes a vivid reproduction of the real scene.

Fig. 7. Image composed after vergence control.

5. Conclusions

A segmentation-based adaptive vergence control method for parallel multiview stereoscopic images is proposed in this paper. The stereoscopic images are segmented based on Mean Shift, and the object in the middle of the image is set as the zero disparity object based on the observing characteristics of human eyes. Then we shift the images by this object's disparity and change the disparity range of the images, which realizes adaptive vergence control. Consequently, we obtain natural and clear images and achieve a vivid reproduction of real scenes.

Acknowledgments

We would like to thank Professor Ming-Ting Sun and Professor Fei Ji for comments and suggestions. This research was partially supported by the Natural Science Foundation of China (nos. 61002029, 60932007, 61072062, 61101224), the International Science and Technology Cooperation Program (no. 2010DFA12780), the Natural Science Foundation of Tianjin (nos. 12JCYBJC10400, 12JCQNJC00500, 12JCQNJC00300), the Research Fund for the Doctoral Program of Higher Education of China (no. 20110032120029), and the Opening Project of Jiangsu Province Web TV Research and Development Center for Engineering Technology (no. SIIT111001).

References

[1] P. Merkle, K. Müller, T. Wiegand, 3D video: acquisition, coding, and display, IEEE Trans. Consum. Electron. 56 (2) (2010) 946–950.
[2] Y.C. Fan, Y.T. Kung, B.L. Lin, Three-dimensional auto-stereoscopic image recording, mapping and synthesis system for multiview 3D display, IEEE Trans. Magnet. 47 (3) (2011) 683–686.
[3] Q.H. Wang, A.H. Wang, W.X. Zhao, Y.H. Tao, D.H. Li, Autostereoscopic display based on multi-layer lenticular lens, Optik – Int. J. Light Electron Opt. 122 (15) (2011) 1326–1328.
[4] M. Salmimaa, T. Jarvenpaa, Characterizing autostereoscopic 3-D displays, J. Inf. Display 25 (2009) 8–11.
[5] K.J. Oh, A. Vetro, Y.S. Ho, Depth coding using a boundary reconstruction filter for 3-D video systems, IEEE Trans. Circ. Syst. Video Technol. 21 (3) (2011) 350–359.
[6] A. Vetro, T. Wiegand, G.J. Sullivan, Overview of the stereo and multiview video coding extensions of the H.264/MPEG-4 AVC standard, Proc. IEEE 99 (4) (2011) 626–642.
[7] A. Woods, T. Docherty, R. Koch, Image distortions in stereoscopic video systems, Proc. SPIE 1915 (1993) 36–48.
[8] T. Jarvenpaa, M. Salmimaa, Optical characterization of autostereoscopic 3-D displays, J. Soc. Inf. Display 16 (8) (2008) 825–833.
[9] K.C. Kwon, Y.T. Lim, N. Kim, Vergence control of binocular stereoscopic camera using disparity information, J. Opt. Soc. Korea 13 (3) (2009) 379–385.
[10] H. Deng, Q.H. Wang, D.H. Li, Disparity images acquired by parallel camera array with shift, Acta Photon. Sin. 38 (11) (2009) 2985–2988.
[11] D. Comaniciu, P. Meer, Mean Shift: a robust approach toward feature space analysis, IEEE Trans. Pattern Anal. Mach. Intell. 24 (5) (2002) 603–619.
[12] S. Lee, K. Oh, Y. Ho, Segment-based multi-view depth map estimation using belief propagation from dense multi-view video, Proc. 3DTV (2008) 193–196.
[13] S. Birchfield, C. Tomasi, A pixel dissimilarity measure that is insensitive to image sampling, IEEE Trans. Pattern Anal. Mach. Intell. 20 (4) (1998) 401–406.
[14] C. Lü, H. Wang, H. Ren, Y. Shen, Virtual view synthesis for multi-view 3D display, in: Proceedings of the Third International Joint Conference on Computational Science and Optimization, 2010, pp. 444–446.
[15] H. Hirschmüller, D. Scharstein, Evaluation of cost functions for stereo matching, in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2007, pp. 1–8.