Fig. 1. Block diagram of 2D to 3D conversion.
Fig. 2. Model of Depth Image Based Rendering (DIBR).
Fig. 3. Block diagram of 3D stereo view synthesis architecture.
Abstract-- As consumer electronics products evolve rapidly, 3D display technology has become popular in recent years. There are now more and more 3D products, such as 3D cameras, 3D projectors, and 3D-TVs. However, converting traditional 2D video content to 3D remains a vital issue. This paper proposes a real-time 3D rendering processor for view synthesis in 2D-to-3D conversion. Moreover, the proposed fractional-precision image warping and hole filling architecture effectively decreases the output buffer size by about 94.4% compared with the traditional design. The real-time processor achieves a throughput of up to 60 frames per second at full HD resolution (1920x1080) on stereoscopic displays.
I. INTRODUCTION
As 3D display technology advances rapidly, 3D-related products have become popular in recent years. Stereo vision techniques generally use stereo matching and view synthesis methods. Stereoscopic display technology has been standardized as an extension of H.264/AVC in 3DAV [1]. For backward compatibility with 2D video content, depth-image-based rendering (DIBR) [2][3] is the key technology in 2D-to-3D conversion. Moreover, depth map estimation [4] also plays an important role that influences video/image quality. From the viewpoint of hardware implementation, several designs [5][6] have been reported. This paper adopts true depth map information and focuses on the hardware architecture of 3D rendering (3D image warping and hole filling), as shown in Fig. 1. The proposed rendering processor can be efficiently applied to stereoscopic displays and achieves up to 60 frames per second at full HD resolution (1920x1080) in real time.
II. ALGORITHM
Fig. 2 shows a model of DIBR, in which the original image points at location (Pm, y) are mapped to left points (Pl, y) and right points (Pr, y). From the geometric relationship in Fig. 2, we obtain (3):
P_l = P_m + Bf/(2Z)  and  P_r = P_m - Bf/(2Z)    (3)
where f is the focal length of the camera, B is the baseline distance between Ol and Or, and Z stands for the depth of the object from the view plane. Pl and Pr are produced by 3D image warping. However, the coordinate transformation from the 2D view plane to the 3D image plane generates holes and occlusion problems, which must be fixed by a hole filling method.
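The warping relation in (3) can be sketched as follows; the parameter values in the usage note are illustrative assumptions, not taken from the paper.

```python
def warp_positions(p_m, depth_z, focal_f, baseline_b):
    """Map an original column position P_m to the left/right view
    positions per Eq. (3): P_l = P_m + Bf/(2Z), P_r = P_m - Bf/(2Z)."""
    shift = focal_f * baseline_b / (2.0 * depth_z)  # half the disparity
    p_left = p_m + shift
    p_right = p_m - shift
    return p_left, p_right
```

Since the shift is inversely proportional to Z, nearer objects move farther apart between the two views, which is what produces the depth impression; for example, with f = 5, B = 4, and Z = 10, a point at column 100 maps to 101 in the left view and 99 in the right view.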
III. PROPOSED ARCHITECTURE
Fig. 3 shows the hardware architecture of the 3D view synthesis engine. To reduce computational complexity, the intermediate-view texture and its depth map are stored in external memory. The fractional-precision warping engine adopts a look-up table to speed up the warping computation. After image warping, the left and right frames may contain holes caused by the coordinate transformation.

Real-Time 3D Rendering Processor for 2D-to-3D Conversion of Stereoscopic Displays
Yeong-Kang Lai, Senior Member, IEEE, and Yu-Chieh Chung, Member, IEEE
Department of Electrical Engineering, National Chung Hsing University, Taichung 402, Taiwan, R.O.C.
Yeong-Kang Lai: [email protected]; Yu-Chieh Chung: [email protected]
2014 IEEE International Conference on Consumer Electronics (ICCE)
978-1-4799-1291-9/14/$31.00 ©2014 IEEE

Fig. 4. Detailed hole filling architecture for 2D-to-3D conversion.
Fig. 5. Comparison between original and proposed buffer sizes.
Fig. 6. 3D rendering after performing the proposed processor.
The hole filling algorithm uses image interpolation with neighboring pixels to fill the residual holes, because the integer grid points in the reference view are warped to irregular points in the virtual views. The detailed hole filling architecture is shown in Fig. 4. The 3D view synthesis engine adopts the proposed two-stage hole filling and occlusion warping pixel generation engine, which applies weighted interpolation according to pixel distance and the edge detection of holes. Finally, it writes the 3D data of the left and right views to external memory (DDR2 SDRAM).
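A minimal sketch of distance-weighted hole filling on one scanline is shown below. The exact weighting and edge-detection logic of the proposed engine is not given in the paper, so this assumes the simplest scheme consistent with the text: each hole is filled from its nearest valid neighbor on each side, with the closer neighbor weighted more heavily.

```python
def fill_holes_row(row):
    """Fill None holes in a scanline by distance-weighted interpolation
    of the nearest valid neighbor on each side (assumed scheme)."""
    n = len(row)
    out = list(row)
    for i, v in enumerate(row):
        if v is not None:
            continue
        # nearest valid pixel indices to the left and right of the hole
        l = next((j for j in range(i - 1, -1, -1) if row[j] is not None), None)
        r = next((j for j in range(i + 1, n) if row[j] is not None), None)
        if l is not None and r is not None:
            dl, dr = i - l, r - i
            # closer neighbor receives the larger weight
            out[i] = (row[l] * dr + row[r] * dl) / (dl + dr)
        elif l is not None:
            out[i] = row[l]   # hole at the right border: extrapolate
        elif r is not None:
            out[i] = row[r]   # hole at the left border: extrapolate
    return out
```

For example, a hole midway between values 10 and 40 is filled with 25, while a border hole simply copies its only valid neighbor.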
IV. EXPERIMENTAL RESULTS
We implemented the proposed architecture and verified it on 3D displays. To determine the output buffer size, we analyzed the column resolution of a frame and the number of pixels produced by the PE; the maximum difference between these pixel counts gives the minimum required output buffer size. As Fig. 5 shows, this decreases the output buffer size by about 94.4% compared with the original design. Fig. 6 shows the final results as displayed on a 3D display. The occlusion and hole problems are solved by the proposed algorithm and architecture. Moreover, we invited 10 viewers to evaluate the quality of the 3D sequences, and the average score was 86.
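The buffer sizing analysis described above can be sketched as a peak-backlog computation; the per-cycle production and consumption rates in the usage note are illustrative assumptions, not figures from the paper.

```python
def min_buffer_size(produced_per_cycle, consumed_per_cycle):
    """Minimum output buffer = peak backlog between pixels produced by
    the PE and pixels consumed downstream (sketch of the analysis)."""
    backlog, peak = 0, 0
    for p, c in zip(produced_per_cycle, consumed_per_cycle):
        backlog = max(0, backlog + p - c)  # pixels waiting in the buffer
        peak = max(peak, backlog)          # worst-case occupancy
    return peak
```

For instance, if the PE bursts 4 pixels per cycle for two cycles while the display side drains 2 per cycle, the backlog peaks at 4, so a 4-pixel buffer suffices; matching the buffer to this peak rather than to a full row is what yields the large size reduction.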
V. CONCLUSION
In this paper, we propose a high-quality view synthesis architecture for 3D display systems. According to pixel distance and edge detection, hole filling is classified into three types for image inpainting. The experimental results show that the proposed algorithm effectively distinguishes foreground from background information and achieves better human visual perception for 3D-TV. Moreover, the proposed two-stage hole filling and occlusion warping pixel generation engine provides a more efficient architecture than prior designs. Finally, the image inpainting interpolation needs only 3 adders and 2 multipliers per pixel for each view synthesis parallel unit. Hence, the design is suitable for real-time 3D applications on stereoscopic displays.
REFERENCES
[1] Joint Draft 7.0 on Multiview Video Coding, Joint Video Team of ISO/IEC MPEG and ITU-T VCEG, ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, Apr. 2008.
[2] P. Merkle, K. Müller, and T. Wiegand, “3D Video: Acquisition, Coding, and Display,” IEEE Trans. Consum. Electron., vol. 56, no. 2, pp. 946-950, May 2010.
[3] F. Bruls, S. Zinger, and L. Do, “Multi-view coding and view synthesis for 3DTV,” in Proc. IEEE International Conference on Consumer Electronics, pp. 685-686, Jan. 2011.
[4] Yeong-Kang Lai, Yu-Fan Lai, and Ying-Chang Chen, “An Effective Hybrid Depth-Perception Algorithm for 2D-to-3D Conversion in 3D Display Systems,” in Proc. IEEE International Conference on Consumer Electronics, pp. 612-613, Jan. 2012.
[5] Ying-Rung Horng, Yu-Cheng Tseng, and Tian-Sheuan Chang, “VLSI Architecture for Real-Time HD1080p View Synthesis Engine,” IEEE Trans. Circuits Syst. Video Technol., vol. 21, no. 9, pp. 1329-1340, Sep. 2011.
[6] Shen-Fu Hsiao, Jin-Wen Cheng, Wen-Ling Wang, and Guan-Fu Yeh, “Low Latency Design of Depth-Image-Based Rendering Using Hybrid Warping and Hole-Filling,” in Proc. IEEE International Symposium on Circuits and Systems, pp. 608-611, May 2012.
[7] W.-Y. Chen, Y.-L. Chang, H.-K. Chiu, S.-Y. Chien, and L.-G. Chen, “Real-time depth image based rendering hardware accelerator for advanced three dimensional television system,” in Proc. IEEE ICME, pp. 2069–2072, Jul. 2006.