
[IEEE 2014 33rd Chinese Control Conference (CCC) - Nanjing, China (2014.7.28-2014.7.30)] Proceedings of the 33rd Chinese Control Conference - Local feature descriptor based rapid 3D


Local Feature Descriptor Based Rapid 3D Ear Recognition

ZENG Hui1, Zhang Rui1, Mu Zhichun1, WANG Xiuqing2

1. School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083 E-mail: [email protected]

2. Vocational and Technical Institute, Hebei Normal University, Shijiazhuang 050031 E-mail: [email protected]

Abstract: This paper presents a rapid 3D ear recognition method based on local feature descriptors. A coarse-to-fine pose alignment method is proposed to alleviate the local extremum problem of the ICP algorithm. First, the LSP descriptors of 3D ear feature points are used to perform coarse pose alignment. Then the improved ICP algorithm (MICP algorithm) is used for fine pose alignment. Three kinds of descriptors, namely the LBP descriptor, the 3D LBP descriptor and the 3D CS-LBP descriptor, are used for feature extraction and matching. Experimental results show that the 3D CS-LBP descriptor based method has better recognition performance and computational efficiency than the methods based on the other two descriptors.

Key Words: MICP Algorithm, 3D LBP Descriptor, 3D CS-LBP Descriptor, 3D Ear Recognition

1 Introduction

Ear recognition is one of the research fields in biometric identification and can be used in customs security, video monitoring and so on [1,2]. Compared with other popular human biometric features, such as the face and fingerprints, the ear has many distinctive advantages. For example, the ear has a rich and stable structure that changes little with age and is not affected by changes in facial expression [3,4]. Existing ear recognition methods can be classified into two categories: 2D image based methods and 3D data based methods. Compared with 2D image based methods, 3D data based methods are more robust to pose variation and varying lighting conditions, so in recent years more and more researchers have turned their attention to 3D ear recognition.

Up to now, there has not been much research on 3D ear recognition, and most existing methods are based on the ICP (Iterative Closest Point) algorithm. Ping Yan and Kevin W. Bowyer developed an automatic ear recognition system that uses ICP-based 3D shape matching to perform human recognition [4]. To further improve the recognition speed, they proposed a rapid 3D ear recognition method based on the ICP algorithm, which first preprocesses the 3D model to reduce the computational complexity of the recognition step [5]. Hui Chen and Bir Bhanu proposed an ear recognition method based on the LSP (Local Surface Patch) descriptor and the ICP algorithm, which first uses the LSP descriptor for initial alignment and then uses the ICP algorithm for fine alignment [6,7]. Islam et al. proposed a 3D ear recognition method based on local 3D features and the ICP algorithm [8]. Similar to the previous methods, initial coarse alignment is first performed using the local 3D features of the test sample and the candidate sample, and then fine alignment and recognition are carried out using the ICP algorithm. All of the above methods can obtain a good recognition rate, but they usually require a good initial match to ensure the global convergence of the algorithm.

* This work was supported by the National Natural Science Foundation of China under Grant Nos. 61375010 and 61005009, and by the Beijing Higher Education Young Elite Teacher Project under Grant No. YETP0375.

In this paper, we use a coarse-to-fine strategy for 3D ear pose alignment to address the local extremum problem of the ICP algorithm. First, we use the shape index values for feature point detection. Second, the LSP descriptor is used for coarse pose alignment. Finally, we use the improved ICP algorithm (MICP) for fine pose alignment. In the feature extraction and recognition step, the LBP descriptor, the 3D LBP descriptor and the 3D CS-LBP descriptor are each used for feature extraction, and the number of matching points is used for recognition.

2 3D Ear Pose Alignment

For the same person, 3D ear data collected at different times inevitably exhibit pose variation, so a pose alignment step is necessary before feature extraction. In this paper, a set of 3D ear data under a nearly normal pose is selected as the alignment template. Then the improved ICP algorithm is used to match the 3D ear data to the alignment template so as to achieve pose alignment.

2.1 Feature Point Detection

The shape index values of the 3D ear depth map are used for feature point detection. The shape index is a quantitative curvature measure of local shape and is sensitive to subtle changes in surface shape. Given a 3D point p on a surface, its shape index value is defined as:

$$SI(p) = \frac{1}{2} - \frac{1}{\pi}\tan^{-1}\frac{\kappa_1(p) + \kappa_2(p)}{\kappa_1(p) - \kappa_2(p)} \qquad (1)$$

where $\kappa_1$ and $\kappa_2$ are the maximum and minimum principal curvatures of the surface at point p respectively, with $\kappa_1 \ge \kappa_2$. As the principal curvatures are invariant to rotation and translation, the shape index is also invariant to rotation and translation. From the definition we can see that the range of the shape index is [0, 1]. From equation (1) we can also conclude that the shape index provides a continuous gradation between shapes, so the shape index value can be used to describe the local shape of a 3D model. Fig. 1 (a) shows a 3D ear depth map and Fig. 1 (b) shows its corresponding shape index value map.

(a) 3D ear depth map (b) its corresponding shape index value map

Fig. 1 3D ear depth map and its corresponding shape index value map
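As an illustration of the shape index, the following minimal Python sketch evaluates it from two principal curvatures, reading equation (1) as SI = 1/2 - (1/π) arctan((κ1 + κ2)/(κ1 - κ2)). The function name is hypothetical, and the use of `atan2` to cover the case κ1 = κ2 (where the ratio is undefined) is our own assumption, not something stated in the paper.

```python
import math


def shape_index(k1: float, k2: float) -> float:
    """Shape index in [0, 1] from two principal curvatures.

    The larger value is taken as the maximum curvature, so the
    argument order does not matter.
    """
    k_max, k_min = max(k1, k2), min(k1, k2)
    # k_max - k_min >= 0, so atan2 equals atan of the ratio whenever the
    # ratio is defined, and tends to +/- pi/2 when the two curvatures
    # coincide (assumption: a flat point, 0/0, is mapped to 0.5).
    return 0.5 - math.atan2(k_max + k_min, k_max - k_min) / math.pi
```

For example, `shape_index(1.0, -1.0)` gives 0.5 (equal and opposite curvatures), while equal positive curvatures give 0 and equal negative curvatures give 1, which matches the stated range [0, 1].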

After obtaining the shape index value of each 3D ear point, we select the points whose shape index is a local extremum as feature points. In this paper, the size of the local window is 5×5. Fig. 2 shows the results of feature point detection on two sets of 3D ear data from one person. From Fig. 2 we can see that some feature points correspond to the same physiological ear area, and these can be used in the following coarse pose alignment step.

(a) (b)

Fig. 2 Results of feature point detection with one person’s 3D ear data
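The feature point selection described above can be sketched as a local-extremum search over the shape index map. The sketch below is only illustrative: the function name is hypothetical, the extremum is taken to be strict, and border pixels without a full window are skipped; the paper does not specify these details.

```python
def detect_feature_points(si_map, window=5):
    """Return (row, col) positions whose shape index is a strict local
    maximum or minimum inside a window x window neighbourhood.

    si_map is a 2D list of shape index values (rows of equal length).
    """
    half = window // 2
    rows, cols = len(si_map), len(si_map[0])
    points = []
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            centre = si_map[r][c]
            # All values in the window except the centre itself.
            neighbours = [
                si_map[rr][cc]
                for rr in range(r - half, r + half + 1)
                for cc in range(c - half, c + half + 1)
                if (rr, cc) != (r, c)
            ]
            if centre > max(neighbours) or centre < min(neighbours):
                points.append((r, c))
    return points
```

A real depth map would also need invalid (missing-depth) pixels to be masked out before this step, which is omitted here.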

2.2 LSP Descriptor Based Coarse Pose Alignment

After obtaining the 3D ear feature points, we compute their corresponding LSP descriptors. An LSP descriptor is constructed from the feature point and its neighboring points, and its feature vector consists of the feature point, the type of local surface, the coordinates of the centroid and a shape histogram based on the shape index values and the surface normal vectors. The detailed construction method of the LSP descriptor is described in [6]. As the shape index value and the surface normal vector are invariant to rigid transformation, the LSP descriptor is also invariant to rigid transformation and is therefore robust to pose variation.

In this paper, we compare the LSP descriptors of the 3D ear data with those of the alignment template. The comparison covers the type of local surface and the shape histogram. The $\chi^2$ distance is used to calculate the dissimilarity between two two-dimensional shape histograms:

$$\chi^2(Q, V) = \sum_i \frac{(q_i - v_i)^2}{q_i + v_i} \qquad (2)$$

where Q and V are shape histograms, and $q_i$ and $v_i$ are the components of Q and V respectively. If two LSP descriptors have the same type of local surface and their $\chi^2$ value is less than the threshold, they are labeled as a pair of matching points. From experimental results we found that matching with the LSP descriptor alone is time-consuming and produces a certain number of false matching points. In this paper, we use the following constraints to improve the matching result:

$$\left| d_{S_1,S_2} - d_{M_1,M_2} \right| < \varepsilon_1, \qquad \max\left(d_{S_1,S_2},\, d_{M_1,M_2}\right) > \varepsilon_2 \qquad (3)$$

where $\varepsilon_1 = 9.4$ mm and $\varepsilon_2 = 3.7$ mm, $d_{S_1,S_2}$ is the Euclidean distance between two LSP descriptors of the test data, and $d_{M_1,M_2}$ is the Euclidean distance between the two corresponding LSP descriptors of the alignment template. These two geometric constraints ensure geometric consistency between the matching points and help to remove matching points that lie too close to each other. Fig. 3 shows the final matching points after filtering with the above constraints. After obtaining the matching points of the test sample and the alignment template, singular value decomposition is used to calculate the initial transformation matrix between the two sets of 3D data. This transformation matrix is used as the initial estimate for the fine pose alignment step.
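The geometric constraints can be illustrated with a short Python sketch, reading constraint (3) as |d_S - d_M| < 9.4 mm and max(d_S, d_M) > 3.7 mm. The paper does not say how the pairwise test is turned into a final set of matches, so the greedy grouping below (grow a group from each seed match and keep the largest) is an assumption, and all names are hypothetical.

```python
import math

EPS1_MM = 9.4  # maximum allowed difference between the two distances
EPS2_MM = 3.7  # minimum required separation of the two matches


def is_consistent(match_a, match_b, eps1=EPS1_MM, eps2=EPS2_MM):
    """Pairwise geometric test for two candidate matches.

    Each match is (test_point, template_point) with 3D coordinates in mm.
    """
    (s1, m1), (s2, m2) = match_a, match_b
    d_s = math.dist(s1, s2)  # distance on the test data
    d_m = math.dist(m1, m2)  # distance on the alignment template
    return abs(d_s - d_m) < eps1 and max(d_s, d_m) > eps2


def filter_matches(matches, eps1=EPS1_MM, eps2=EPS2_MM):
    """Greedy grouping (assumed strategy): return the largest group of
    matches that are mutually consistent with each other."""
    best = []
    for seed in matches:
        group = [seed]
        for cand in matches:
            if cand is seed:
                continue
            if all(is_consistent(cand, g, eps1, eps2) for g in group):
                group.append(cand)
        if len(group) > len(best):
            best = group
    return best
```

The grouping is quadratic to cubic in the number of candidate matches, which is acceptable for the small number of LSP feature points but would need a cheaper strategy for dense matching.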

(a) (b)

Fig. 3 Final matching points
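The SVD step used to obtain the initial transformation from the matched points can be sketched with the standard least-squares rigid alignment (often called the Kabsch method). This is a generic NumPy sketch under the assumption that the paper uses this standard formulation; the function name is hypothetical.

```python
import numpy as np


def rigid_transform_svd(src, dst):
    """Least-squares rotation R and translation t such that
    dst[i] is approximately R @ src[i] + t for matched 3D points."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # 3x3 cross-covariance of the centred point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation
    # (determinant +1) rather than a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t
```

At least three non-collinear matched points are needed for the rotation to be determined; the resulting `R` and `t` would then serve as the starting pose for the fine alignment stage.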

2.3 MICP Algorithm Based Fine Pose Alignment

In this paper, the MICP (Modified Iterative Closest Point) algorithm is used for fine pose alignment of the 3D ear [9]. Its improvements include the following three aspects: 1) A KD-tree is used to search for the closest points between the test sample and the alignment template, with the Euclidean distance between corresponding points as the metric. Compared with the ICP algorithm, this greatly improves the computational efficiency. 2) A closest point distance constraint is used to remove noise points. In each iteration, the current rigid transformation matrix is used to transform the test sample, and the distance between each 3D point and its corresponding closest point in the alignment template is computed. If the distance is more than 2 times the average distance, the point is labeled as a noise point. 3) A uniqueness constraint is used to remove false matching points. The above process can effectively reduce the number of iterations and improve the matching accuracy. Fig. 4 (a) and (b) show the 3D depth map of a human ear before and after pose alignment respectively.

Fig. 4 3D depth map of a human ear (a) before and (b) after pose alignment
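One correspondence step of the fine alignment, with the distance and uniqueness constraints, might look like the sketch below. It is not the authors' implementation: a brute-force nearest-neighbour search stands in for the KD-tree to keep the example dependency-free, the uniqueness constraint is interpreted as "each template point keeps only its closest test point" (the paper does not define it), and the names are hypothetical.

```python
import math


def micp_correspondences(test_pts, template_pts, reject_factor=2.0):
    """Return sorted (test_index, template_index) pairs that survive
    the distance constraint and the uniqueness constraint."""
    # Closest template point for every (already transformed) test point.
    nearest = []
    for i, p in enumerate(test_pts):
        j = min(range(len(template_pts)),
                key=lambda k: math.dist(p, template_pts[k]))
        nearest.append((i, j, math.dist(p, template_pts[j])))

    # Distance constraint: pairs farther apart than reject_factor times
    # the average closest-point distance are treated as noise.
    mean_d = sum(d for _, _, d in nearest) / len(nearest)
    kept = [(i, j, d) for i, j, d in nearest if d <= reject_factor * mean_d]

    # Uniqueness constraint (assumed reading): a template point may be
    # matched by one test point only, the closest one.
    best = {}
    for i, j, d in kept:
        if j not in best or d < best[j][1]:
            best[j] = (i, d)
    return sorted((i, j) for j, (i, _) in best.items())
```

In a full iteration the surviving pairs would be passed to a rigid transform estimator, the test points would be transformed, and the step would repeat until the alignment error stops decreasing.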

3 3D Local Descriptor Based Feature Matching

3.1 3D LBP Descriptor and 3D CS-LBP Descriptor

LBP (Local Binary Pattern), first proposed by Ojala, is one of the most effective texture analysis methods for 2D images and is widely used in texture classification, image retrieval and other fields [10-12]. In recent years, with the development of 3D recognition technology, researchers have proposed the 3D LBP descriptor based on the characteristics of depth maps [14]. Compared with the 2D LBP descriptor, the 3D LBP descriptor performs better in describing 3D structural information. Its basic idea is to generate a 4-bit binary number from the difference between the center point and each neighboring point.

To reduce the dimension of the 3D LBP descriptor without reducing its descriptive ability, we proposed the 3D CS-LBP descriptor [15]. The 3D CS-LBP descriptor is more robust to noise and its dimension is $4 \times 2^{N/2}$, which is far less than the dimension $4 \times 2^{N}$ of the 3D LBP descriptor. The construction method effectively combines the advantages of the CS-LBP descriptor and the 3D LBP descriptor. The detailed steps can be found in [15].
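To make the dimension comparison concrete, take N to be the number of neighboring points, as in standard LBP notation (an assumption, since the text does not define N). With N = 8, the 3D LBP descriptor has 4 × 2^8 = 1024 bins and the 3D CS-LBP descriptor has 4 × 2^4 = 64. The reduction comes from the center-symmetric comparison of the 2D CS-LBP operator [13], which compares opposite neighbors in pairs. The sketch below shows only that 2D operator, not the 3D CS-LBP construction of [15]; the function name and the default threshold are illustrative.

```python
def cs_lbp_code(neighbours, threshold=0.0):
    """2D CS-LBP code of one pixel from its N circularly ordered
    neighbour values.

    Opposite neighbours are compared in pairs, so N neighbours give
    N/2 bits and 2**(N/2) possible codes (plain LBP gives 2**N).
    """
    n = len(neighbours)
    if n % 2:
        raise ValueError("CS-LBP needs an even number of neighbours")
    half = n // 2
    code = 0
    for i in range(half):
        # Bit i is set when neighbour i exceeds its opposite neighbour
        # by more than the threshold.
        if neighbours[i] - neighbours[i + half] > threshold:
            code |= 1 << i
    return code


N = 8
dim_3d_lbp = 4 * 2 ** N            # 4 layers of 2**N bins
dim_3d_cs_lbp = 4 * 2 ** (N // 2)  # 4 layers of 2**(N/2) bins
```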

3.2 3D CS-LBP Based Feature Matching

As only a small number of matching points can be obtained from the feature detection result of the method described in Section 2.1, it cannot meet the needs of recognition. Therefore, for the test 3D data and the alignment template, every point of the corresponding depth map is treated as a feature point. We then compute the 3D CS-LBP descriptor of every feature point. In the feature point matching step, the initial matching points are determined according to the dissimilarity of the 3D CS-LBP descriptors. In this paper, the dissimilarity is calculated with the $\chi^2$ distance shown in equation (2). Here, Q and V are the descriptor histograms being compared, and $q_i$ and $v_i$ are the components of Q and V respectively. Fig. 5 shows one initial matching result, which contains 321 pairs of matching points. From Fig. 5 we can see that there are a certain number of false matching points.
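The initial matching step can be sketched as follows, reading equation (2) as the sum over bins of (q_i - v_i)^2 / (q_i + v_i). The nearest-descriptor-plus-threshold rule and the function names are assumptions for illustration; the paper only states that matches are determined from the descriptor dissimilarity.

```python
def chi_square(q, v):
    """Chi-square dissimilarity between two histograms of equal length.
    Bins that are empty in both histograms contribute nothing."""
    total = 0.0
    for qi, vi in zip(q, v):
        s = qi + vi
        if s > 0:
            total += (qi - vi) ** 2 / s
    return total


def initial_matches(test_desc, template_desc, threshold):
    """For each test descriptor, find the most similar template
    descriptor and keep the pair if its dissimilarity is below
    the threshold. Returns (test_index, template_index) pairs."""
    matches = []
    for i, q in enumerate(test_desc):
        j, d = min(
            ((j, chi_square(q, v)) for j, v in enumerate(template_desc)),
            key=lambda item: item[1],
        )
        if d < threshold:
            matches.append((i, j))
    return matches
```

Because every depth-map point is treated as a feature point, an exhaustive comparison like this is expensive; restricting the search to a spatial neighborhood after pose alignment would be a natural optimization, though the paper does not say whether this is done.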

Fig. 5 Initial matching result based on 3D CS-LBP

To remove the initial mismatched points, the geometric constraints and spatial constraints mentioned in Section 2.2 are used. For the geometric constraints, the threshold values are set as follows: $\varepsilon_1 = 9.4$ mm, $\varepsilon_2 = 3.7$ mm. The spatial location constraint requires that the 3D spatial relationship of two feature points of the test ear depth map be the same as that of the corresponding matching feature points of the alignment template. Fig. 6 shows the result after removing the false matching feature points; the number of matching points decreases to 55.

Fig. 6 Result after removing false matching feature points

4 Experimental Results

In this paper, the experimental data are all from UND’s J2 3D ear database. We selected 830 sets of data from J2, covering pose variation, varying lighting conditions, hair occlusion, earrings and so on. The data come from 415 people (237 men and 178 women) with 2 sets of data each, including 70 people with earrings and 40 people with hair occlusion. Fig. 7 shows three pairs of 3D ear depth maps under pose variation, with earrings and with hair occlusion. All experiments were performed in Matlab on a computer with an Intel Pentium D 2.8 GHz CPU and 2 GB of memory.

(a) (b) (c)

Fig. 7 3D ear depth maps

In this paper, the 3D ear recognition experiment mainly includes the following steps: 1) Randomly select 100 × 2 sets of data from the 830 sets of 3D ear data of 415 people. The first set of data of each person is used as test data, and the second set is used as model data. 2) For the 2D texture image, the Adaboost algorithm is used for ear region detection, and the corresponding 3D ear data are then determined as the subsequent experimental data. 3) For each set of 3D ear data, the MICP algorithm based coarse-to-fine method described in Section 2 is used for pose alignment. 4) For every point of the normalized ear depth image, compute its corresponding 3D CS-LBP descriptor. 5) Use the method described in Section 3 to perform feature point matching between the test data and the model data. The number of matching points is used for recognition.

Following the steps above, we compared and analyzed the performance of 3D ear recognition methods based on three local descriptors: the LBP descriptor, the 3D LBP descriptor and the 3D CS-LBP descriptor. “MICP+LBP”, “MICP+3D LBP” and “MICP+3D CS-LBP” denote the three 3D ear recognition methods based on the LBP descriptor, the 3D LBP descriptor and the 3D CS-LBP descriptor respectively, where “MICP” denotes that the coarse-to-fine pose alignment based on the MICP algorithm is applied before ear recognition. From Table 1 and Table 2 we can see that the MICP method can effectively improve the recognition rate and that the methods based on 3D LBP and 3D CS-LBP give better recognition results than the method based on LBP.

Table 1: Recognition results of different methods

Recognition method    Rank-1    Rank-2    Rank-3    Rank-4    Rank-5
MICP+LBP              87%       89%       92%       93%       95%
MICP+3D LBP           94%       95%       97%       97%       98%
MICP+3D CS-LBP        94%       95%       96%       97%       98%
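Rank-k figures such as those in Table 1 are commonly computed as cumulative match rates: a test sample counts as recognized at rank k if its true identity is among the k model samples with the most matching points. The sketch below assumes this convention, which the paper does not spell out; the function name and the data layout are hypothetical.

```python
def rank_k_rate(match_counts, true_ids, k):
    """Fraction of test samples whose true identity is among the k
    model identities with the highest number of matching points.

    match_counts[probe][model_id] is the number of matching points
    between a test sample and a model sample.
    """
    hits = 0
    for probe, counts in match_counts.items():
        # Model identities ordered from most to fewest matching points.
        ranked = sorted(counts, key=counts.get, reverse=True)
        if true_ids[probe] in ranked[:k]:
            hits += 1
    return hits / len(match_counts)
```

For example, with two test samples where only one has its true identity ranked first and the other has it ranked second, the Rank-1 rate is 0.5 and the Rank-2 rate is 1.0.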

Table 2: Average number of matching points and average extraction time

Recognition method    Average number of matching points    Average extraction time (s)
MICP+LBP              59                                   4.89
MICP+3D LBP           45                                   9.89
MICP+3D CS-LBP        57                                   5.32

5 Conclusion

This paper presents a 3D ear recognition method based on the MICP algorithm and local descriptors. The main contributions are as follows: 1) A pose alignment method based on the LSP descriptor and the MICP algorithm has been proposed. During iteration, KD-tree search, the closest point distance constraint and the uniqueness constraint are used to eliminate noise points and decrease the number of iterations. 2) The LBP descriptor, the 3D LBP descriptor and the 3D CS-LBP descriptor were used for feature extraction, and the number of matching points was then used for recognition. Experimental results show that the 3D ear recognition method based on the MICP algorithm and the 3D CS-LBP descriptor has the best performance in terms of recognition rate and computational efficiency.

References

[1] Abaza A, Ross A, Herbert C, et al. A survey on ear biometrics[J]. ACM Computing Surveys, 2013, 45(2): Article No. 22.

[2] Islam S, Bennamoun M, Owens R A. et al. A review of recent advances in 3D ear and expression invariant face biometrics[J]. ACM Computing Surveys, 2012, 44(3): Article No. 14

[3] Zhang B, Mu Z, Jiang C, et al. A robust algorithm for ear recognition under partial occlusion. Proceedings of the 32nd Chinese Control Conference (CCC), 2013.

[4] Ping Yan and Kevin W. Bowyer. An automatic 3D ear recognition system, Proceedings of the Third International Symposium on 3D Data Processing, Visualization and Transmission, University of North Carolina, Chapel Hill, 2006: 326-333.

[5] Ping Yan and Kevin W. Bowyer. Biometric Recognition Using 3D Ear Shape, IEEE Transactions on Pattern Analysis And Machine Intelligence, vol. 29, no. 8, pp. 1297-1308, August 2007.

[6] Hui Chen and Bir Bhanu, Human Ear Recognition in 3D. IEEE Transactions on Pattern Analysis And Machine Intelligence, vol. 29, no. 4, pp. 718-737, April 2007.

[7] Hui Chen and Bir Bhanu. Efficient Recognition of Highly Similar 3D Objects in Range Images, IEEE Transactions on Pattern Analysis And Machine Intelligence, vol. 31, no. 1, pp. 172-179, January 2009.

[8] Islam S, Davies R, Bennamoun M, et al. Efficient Detection and Recognition of 3D Ears[J]. International Journal of Computer Vision, 2011,95(1): 52-73.

[9] Kai W, Zhichun M, Zhijun H. A fast 3D ear recognition method based on local surface patch[J]. International Journal of Advancements in Computing Technology, 2012, 4(20): 516-524.

[10] Ojala T, Pietikäinen M, and Mäenpää T. Multiresolution gray-scale and rotation invariant texture classification with Local Binary Patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, 24(7): 971-987.

[11] Ahonen T, Hadid A, and Pietikäinen M. Face Description with Local Binary Patterns: Application to Face Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, 28(12): 2037-2041

[12] Ahonen T, Hadid A, and Pietikäinen M. Face Description with Local Binary Patterns: Application to Face Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, 28(12): 2037-2041

[13] Heikkilä M, Pietikäinen M, Schmid C. Description of interest regions with local binary patterns. Pattern Recognition, 2009, 42(3): 425-436.

[14] Yonggang Huang, Yunhong Wang and Tieniu Tan. Combining Statistics of Geometrical and Correlative Features for 3D Face Recognition, the 17th British Machine Vision Conference, 2006, pp. 879-888.

[15] Hui Zeng, Ji-Yuan Dong, Zhi-chun Mu, Yin Guo. Ear Recognition Based on 3D Keypoint Matching. The 10th IEEE International Conference on Signal Processing, October 24-28, Beijing, China, 1694-1697, 2010.
