7/23/2019 SIFT Algorithm for Verification of Ear Biometric
http://slidepdf.com/reader/full/sift-algorithm-for-verification-of-ear-biometric 1/13
Using SIFT Algorithm for Verification of Ear Biometric
Amrita Pathak, Gauri Gupta
Y7211062, Y7104022
[email protected], [email protected]
CS698x Project
Instructor: Dr P Gupta
Abstract- This project presents a method for extracting distinctive invariant features
from ear images that can be used to perform reliable matching between different views
of an ear. It shows the extraction of keypoints in an ear image and the results of
matching them against other samples of both the same and different subjects. It also
validates the behaviour of the SIFT algorithm on ear images of different subjects by
varying the illumination, scale and rotation, and by occluding parts of the image.
Feature matching is found to perform well under these varying conditions. The FAR, FRR
and accuracy curves, obtained by experimenting with data from 50 subjects, show that
SIFT features are highly distinctive: a single feature can be correctly matched with
high probability against a large database of features.
1. Introduction

An ear biometric system uses features of the ear (Figure 1) to identify or verify an
individual's identity. The comparison is based on variations in the complex structure
of the ear. Ear growth after the first four months of birth is proportional, although
gravity can cause the ear to undergo stretching. Although humans lack the ability to
recognize one another from ears, ears have reliable and robust features which are
extractable from a distance. Other biometrics such as fingerprints, hand geometry and
retinal scans require close contact and may be considered invasive by users. A study by
Iannarelli showed the feasibility of ear biometrics [1]. Possible applications of ear
biometrics are ATM machines, surveillance and recognition systems, and access control
for restricted areas.
Figure 1: Structure of External Ear
2. Related Work

There has been quite a bit of work on ear biometrics. The Iannarelli system is an
anthropometric technique based on 12 ear measurements. It requires exact alignment and
normalization of the ear photograph and allows comparable measurements. All measurements
are relative to an origin, which is a point chosen on the image. Hui Chen and Bir Bhanu
introduced a two-step ICP (Iterative Closest Point) algorithm for matching 3D ears and
contour matching for 3D ear recognition [2]. Eigenear and PCA-based approaches have also
been used for ear identification [3].
3. SIFT: Scale Invariant Feature Transform
The algorithm has four stages [4]:
3.1 Scale Space Extrema Detection
This is the stage where the interest points, which are called keypoints in the SIFT framework,
are detected. For this, the image is convolved with Gaussian filters at different scales, and
then the difference of successive Gaussian-blurred images is taken. Key points are then taken
as maxima/minima of the Difference of Gaussians (DoG) that occur at multiple scales.
Specifically, a DoG image D(x, y, σ) is given by

    D(x, y, σ) = L(x, y, k_i σ) − L(x, y, k_j σ),

where L(x, y, kσ) is the original image I(x, y) convolved with the Gaussian blur
G(x, y, kσ) at scale kσ, i.e.,

    L(x, y, kσ) = G(x, y, kσ) * I(x, y).

Hence a DoG image between scales k_i σ and k_j σ is just the difference of the
Gaussian-blurred images at those two scales. For scale-space extrema detection in the
SIFT algorithm, the image is first convolved with Gaussian blurs at different scales.
The convolved images are grouped by octave (an octave corresponds to doubling the value
of σ), and the values of k_i are selected so that we obtain a fixed number of convolved
images per octave. The Difference-of-Gaussian images are then taken from adjacent
Gaussian-blurred images within each octave.
Once the DoG images have been obtained, keypoints are identified as local minima/maxima
of the DoG images across scales. This is done by comparing each pixel in a DoG image to
its eight neighbours at the same scale and the nine corresponding neighbouring pixels in
each of the two neighbouring scales. If the pixel value is the maximum or minimum among
all 26 compared pixels, it is selected as a candidate keypoint.
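The octave construction and 26-neighbour extremum test above can be sketched in Python. This is a minimal, single-octave illustration, not the authors' code; the function name, number of scales and contrast threshold are illustrative choices.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_extrema(image, sigma=1.6, k=2 ** 0.5, n_scales=5, thresh=0.01):
    """Build one octave of Gaussian blurs, take their differences (DoG),
    and return candidates that are extrema over their 26 neighbours."""
    blurred = [gaussian_filter(image.astype(float), sigma * k ** i)
               for i in range(n_scales)]
    dogs = np.stack([blurred[i + 1] - blurred[i]
                     for i in range(n_scales - 1)])
    keypoints = []
    for s in range(1, dogs.shape[0] - 1):          # interior scales only
        for y in range(1, dogs.shape[1] - 1):
            for x in range(1, dogs.shape[2] - 1):
                cube = dogs[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
                v = dogs[s, y, x]
                if abs(v) > thresh and (v == cube.max() or v == cube.min()):
                    keypoints.append((x, y, s))
    return keypoints
```

Applied to an image containing a single bright blob, the candidate list concentrates around the blob's centre at the scale closest to the blob's size.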
3.2 Keypoint Localization

After scale-space extrema are detected, the SIFT algorithm discards low-contrast
keypoints and then filters out those located on edges. The resulting set of keypoints is
shown in Figure 2(a).
Scale space extrema detection produces too many keypoint candidates, some of which are
unstable. The next step in the algorithm is to perform a detailed fit to the nearby data for
accurate location, scale, and ratio of principal curvatures. This information allows points to
be rejected that have low contrast (and are therefore sensitive to noise) or are poorly localized
along an edge.
3.2.1 Interpolation of nearby data for accurate position
First, for each candidate key point, interpolation of nearby data is used to accurately
determine its position. The approach calculates the interpolated location of the maximum,
which substantially improves matching and stability. The interpolation is done using the
quadratic Taylor expansion of the Difference-of-Gaussian scale-space function D(x, y, σ),
with the candidate keypoint as the origin. This Taylor expansion is given by

    D(x) = D + (∂D/∂x)^T x + (1/2) x^T (∂²D/∂x²) x,

where D and its derivatives are evaluated at the candidate keypoint and x = (x, y, σ) is
the offset from this point. The location of the extremum, x̂, is determined by taking the
derivative of this function with respect to x and setting it to zero:

    x̂ = −(∂²D/∂x²)^{-1} (∂D/∂x).

If the offset x̂ is larger than 0.5 in any dimension, this indicates that the extremum
lies closer to another candidate keypoint. In this case, the candidate keypoint is
changed and the interpolation is performed about that point instead. Otherwise the
offset is added to the candidate keypoint's location to obtain the interpolated estimate
of the location of the extremum.
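The quadratic fit can be sketched with finite differences over a small DoG stack. This is an illustrative helper, not the authors' code; the array layout (scale, row, col) and the function name are assumptions.

```python
import numpy as np

def refine_extremum(dogs, s, y, x):
    """Quadratic fit around a candidate extremum in a DoG stack indexed
    (scale, row, col); returns the sub-pixel offset x_hat = -H^{-1} g."""
    # Finite-difference gradient of D with respect to (s, y, x).
    g = 0.5 * np.array([dogs[s + 1, y, x] - dogs[s - 1, y, x],
                        dogs[s, y + 1, x] - dogs[s, y - 1, x],
                        dogs[s, y, x + 1] - dogs[s, y, x - 1]])
    # Finite-difference Hessian of D.
    dss = dogs[s + 1, y, x] - 2 * dogs[s, y, x] + dogs[s - 1, y, x]
    dyy = dogs[s, y + 1, x] - 2 * dogs[s, y, x] + dogs[s, y - 1, x]
    dxx = dogs[s, y, x + 1] - 2 * dogs[s, y, x] + dogs[s, y, x - 1]
    dsy = 0.25 * (dogs[s + 1, y + 1, x] - dogs[s + 1, y - 1, x]
                  - dogs[s - 1, y + 1, x] + dogs[s - 1, y - 1, x])
    dsx = 0.25 * (dogs[s + 1, y, x + 1] - dogs[s + 1, y, x - 1]
                  - dogs[s - 1, y, x + 1] + dogs[s - 1, y, x - 1])
    dyx = 0.25 * (dogs[s, y + 1, x + 1] - dogs[s, y + 1, x - 1]
                  - dogs[s, y - 1, x + 1] + dogs[s, y - 1, x - 1])
    H = np.array([[dss, dsy, dsx], [dsy, dyy, dyx], [dsx, dyx, dxx]])
    return -np.linalg.solve(H, g)   # offset in (scale, row, col) order
```

On an exactly quadratic stack the finite differences are exact, so the recovered offset equals the true sub-sample location of the extremum.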
3.2.2 Discarding low-contrast keypoints
To discard keypoints with low contrast, the value of the second-order Taylor expansion
D(x) is computed at the offset x̂. If this value is less than 0.03, the candidate
keypoint is discarded. Otherwise it is kept, with final location y + x̂ and scale σ,
where y is the original location of the keypoint at scale σ.
3.2.3 Eliminating edge responses
The DoG function has strong responses along edges, even when the location along the edge
is poorly determined and therefore unstable to small amounts of noise. To increase
stability, we therefore need to eliminate keypoints that have poorly determined
locations but high edge responses. For poorly defined peaks in the DoG function, the
principal curvature across the edge is much larger than the principal curvature along
it. Finding these principal curvatures amounts to solving for the eigenvalues of the
second-order Hessian matrix

    H = [ D_xx  D_xy ]
        [ D_xy  D_yy ].

The eigenvalues of H are proportional to the principal curvatures of D. It turns out
that the ratio of the two eigenvalues (say α is the larger one and β the smaller one,
with r = α/β) is sufficient for SIFT's purposes. The trace of H, Tr(H) = D_xx + D_yy,
gives the sum of the two eigenvalues, while its determinant, Det(H) = D_xx D_yy − D_xy²,
yields their product. The ratio

    R = Tr(H)² / Det(H) = (r + 1)² / r

depends only on the ratio of the eigenvalues rather than on their individual values. R
is minimum when the eigenvalues are equal. Therefore, the higher the absolute difference
between the two eigenvalues (equivalently, between the two principal curvatures of D),
the higher the value of R. It follows that, for some threshold eigenvalue ratio r_th, if
R for a candidate keypoint is larger than (r_th + 1)² / r_th, that keypoint is poorly
localized and hence rejected. The approach uses r_th = 10.
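The curvature-ratio test above can be sketched as a small predicate over a 2-D DoG image, with the second derivatives approximated by finite differences. The function name is illustrative, not from the original.

```python
import numpy as np

def passes_edge_test(dog, x, y, r_th=10.0):
    """Reject keypoints on edges via the principal-curvature ratio test.
    dog is a 2-D Difference-of-Gaussian image; (x, y) a candidate pixel."""
    # Second derivatives via finite differences.
    dxx = dog[y, x + 1] - 2 * dog[y, x] + dog[y, x - 1]
    dyy = dog[y + 1, x] - 2 * dog[y, x] + dog[y - 1, x]
    dxy = (dog[y + 1, x + 1] - dog[y + 1, x - 1]
           - dog[y - 1, x + 1] + dog[y - 1, x - 1]) / 4.0
    tr = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:                 # curvatures of opposite sign: reject
        return False
    return tr * tr / det < (r_th + 1) ** 2 / r_th
```

An isotropic peak (equal curvatures, R = 4) passes comfortably below the threshold of 12.1, while a pure ridge (one curvature zero) is rejected.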
3.3 Orientation Assignment
In this step, each keypoint is assigned one or more orientations based on local image gradient
directions. This is the key step in achieving invariance to rotation, as the keypoint
descriptor can be represented relative to this orientation and thereby becomes invariant to image
rotation. First, the Gaussian-smoothed image L(x, y, σ) at the keypoint's scale σ is
taken so that all computations are performed in a scale-invariant manner. For an image
sample L(x, y) at scale σ, the gradient magnitude m(x, y) and orientation θ(x, y) are
precomputed using pixel differences:

    m(x, y) = sqrt( (L(x+1, y) − L(x−1, y))² + (L(x, y+1) − L(x, y−1))² ),
    θ(x, y) = atan2( L(x, y+1) − L(x, y−1), L(x+1, y) − L(x−1, y) ).
The magnitude and direction calculations for the gradient are done for every pixel in a
neighbouring region around the keypoint in the Gaussian-blurred image L. An orientation
histogram with 36 bins is formed, with each bin covering 10 degrees. Each sample in the
neighbouring window added to a histogram bin is weighted by its gradient magnitude and
by a Gaussian-weighted circular window with a σ that is 1.5 times the scale of the
keypoint. The peaks in this histogram correspond to dominant orientations. Once the
histogram is filled, the orientations corresponding to the highest peak, and to any
local peaks within 80% of the highest peak, are assigned to the keypoint. In the case of
multiple orientations being assigned, an additional keypoint is created, having the same
location and scale as the original keypoint, for each additional orientation.
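The orientation-histogram construction above can be sketched as follows. This is a simplified illustration (the window radius and function name are our own choices; the peak-interpolation refinement used in the full algorithm is omitted).

```python
import numpy as np

def dominant_orientations(L, x, y, sigma, radius=8):
    """36-bin gradient-orientation histogram around (x, y), Gaussian
    weighted with 1.5*sigma; returns peak angles (degrees) within 80%
    of the maximum bin."""
    hist = np.zeros(36)
    for j in range(-radius, radius + 1):
        for i in range(-radius, radius + 1):
            yy, xx = y + j, x + i
            if not (0 < xx < L.shape[1] - 1 and 0 < yy < L.shape[0] - 1):
                continue
            dx = L[yy, xx + 1] - L[yy, xx - 1]
            dy = L[yy + 1, xx] - L[yy - 1, xx]
            m = np.hypot(dx, dy)
            theta = np.degrees(np.arctan2(dy, dx)) % 360
            w = np.exp(-(i * i + j * j) / (2 * (1.5 * sigma) ** 2))
            hist[int(theta // 10) % 36] += w * m
    return [b * 10 for b in range(36) if hist[b] >= 0.8 * hist.max()]
```

For an image whose intensity increases linearly along x, every gradient points in the 0-degree direction, so a single dominant orientation of 0 degrees is returned.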
3.4 Keypoint Descriptor
The previous steps found keypoint locations at particular scales and assigned
orientations to them, ensuring invariance to image location, scale and rotation. We now
compute descriptor vectors for these keypoints such that the descriptors are highly
distinctive and partially invariant to the remaining variations, such as illumination
and 3D viewpoint. The feature descriptor is computed as a set of orientation histograms
on 4 × 4 pixel neighbourhoods. The orientation histograms are relative to the keypoint
orientation, and the orientation data comes from the Gaussian image closest in scale to
the keypoint's scale; the contribution of each pixel is weighted by the gradient
magnitude and by a Gaussian with σ equal to 1.5 times the scale of the keypoint. Each
histogram contains 8 bins, and each descriptor contains a 4 × 4 array of 16 histograms
around the keypoint. This leads to a SIFT feature vector with 4 × 4 × 8 = 128 elements.
This vector is normalized to enhance invariance to changes in illumination.
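The descriptor layout can be sketched as below. This is a deliberately simplified illustration: the rotation of the patch to the keypoint orientation, the Gaussian weighting, and the trilinear interpolation and clipping used in the full algorithm are omitted, and the function name is our own.

```python
import numpy as np

def sift_descriptor(L, x, y):
    """Simplified 128-D descriptor: a 16x16 patch split into a 4x4 grid
    of 4x4-pixel cells, an 8-bin orientation histogram per cell, then
    normalized to unit length."""
    patch = L[y - 8:y + 8, x - 8:x + 8].astype(float)
    dx = np.gradient(patch, axis=1)
    dy = np.gradient(patch, axis=0)
    mag = np.hypot(dx, dy)
    ang = np.degrees(np.arctan2(dy, dx)) % 360
    desc = []
    for cy in range(4):
        for cx in range(4):
            hist = np.zeros(8)            # 8 bins of 45 degrees each
            for j in range(4):
                for i in range(4):
                    yy, xx = cy * 4 + j, cx * 4 + i
                    hist[int(ang[yy, xx] // 45) % 8] += mag[yy, xx]
            desc.extend(hist)
    v = np.array(desc)
    n = np.linalg.norm(v)
    return v / n if n > 0 else v          # 16 cells x 8 bins = 128-D
```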
4. Algorithms

We have framed some algorithms to obtain the results, which are described in the
following subsections.
4.1 Matching Algorithm

• Find the SIFT keypoints for each image.
• Initialize a distance ratio threshold (dRatio).
• For each descriptor vector in the first image (des1):
  – Compute the vector of dot products with the descriptor vectors of the second
    image (des2), i.e. dProd = des1 · des2.
  – Take the inverse cosine and sort the results: [value, index] = sort(acos(dProd)).
  – If the nearest neighbour's angle is less than dRatio times that of the second
    nearest neighbour, count it as a match; otherwise do not.
• Return the number and percentage of keypoints matched.
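The matching steps above translate almost directly to Python. This is a sketch under the assumption that the descriptors are unit-normalized (as SIFT descriptors are), so the dot product is the cosine of the angle between them; the function name is illustrative.

```python
import numpy as np

def match_keypoints(des1, des2, d_ratio=0.6):
    """Ratio-test matching on unit-norm descriptor arrays: the angle to
    the nearest neighbour must be less than d_ratio times the angle to
    the second nearest. Returns (match count, percentage matched)."""
    matches = 0
    for d in des1:
        # Angles between d and every descriptor of the second image.
        angles = np.sort(np.arccos(np.clip(des2 @ d, -1.0, 1.0)))
        if len(angles) >= 2 and angles[0] < d_ratio * angles[1]:
            matches += 1
    return matches, 100.0 * matches / len(des1)
```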
4.2 Algorithm to Find FRR

• Total samples in the database = 50.
• Total number of queries = 50.
• Match each query image with the corresponding other sample in the database.
• Get the percentage match and store it.
• For each threshold T ranging from 0 to 100:
  – For each score value i (ranging from 1 to 50):
    – If score(i) < T, count it as falsely rejected, i.e. FRR(T) = FRR(T) + 1.
• Find the percentage for each FRR value.
• Plot FRR against the threshold ranging from 0 to 100.
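The FRR sweep above is a one-liner per threshold; a minimal sketch (the function name is our own):

```python
import numpy as np

def frr_curve(genuine_scores):
    """FRR(T): percentage of genuine match scores falling below each
    threshold T in 0..100."""
    s = np.asarray(genuine_scores, dtype=float)
    return np.array([100.0 * np.mean(s < t) for t in range(101)])
```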
4.3 Algorithm to Find FAR

• Total samples in the database = 50.
• Total number of queries = 50.
• Match each query image with all the other images in the database except its own
  sample.
• Get the percentage match and store it.
• For each threshold T ranging from 0 to 100:
  – For each score value i (ranging from 1 to 2450):
    – If score(i) > T, count it as falsely accepted, i.e. FAR(T) = FAR(T) + 1.
• Find the percentage for each FAR value.
• Plot FAR against the threshold ranging from 0 to 100.
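The FAR sweep is the mirror image of the FRR one, counting impostor scores above each threshold; a minimal sketch (function name is our own):

```python
import numpy as np

def far_curve(impostor_scores):
    """FAR(T): percentage of impostor match scores exceeding each
    threshold T in 0..100."""
    s = np.asarray(impostor_scores, dtype=float)
    return np.array([100.0 * np.mean(s > t) for t in range(101)])
```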
4.4 Algorithm to Find Accuracy

• From the percentage scores for FAR and FRR:
  – Accuracy = 100 − ((FAR + FRR) / 2).
• Plot accuracy against the threshold ranging from 0 to 100.
• Find the threshold for maximum accuracy.
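The accuracy formula and the search for its peak can be sketched as follows (the function name is our own):

```python
import numpy as np

def accuracy_curve(far, frr):
    """Accuracy(T) = 100 - (FAR(T) + FRR(T)) / 2, plus the threshold
    index at which accuracy peaks."""
    acc = 100.0 - (np.asarray(far, dtype=float)
                   + np.asarray(frr, dtype=float)) / 2.0
    return acc, int(np.argmax(acc))
```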
5. Results

5.1 Results of using SIFT on the ear for general matching under a controlled environment

(a) Keypoints shown on an ear image
(b) Keypoint matching of the same ear (65 matches found out of 83)
(c) Keypoint matching of different ears (0 matches found out of 79)
(d) Keypoint matching after changing brightness uniformly (44 matches found out of 83)
(e) Keypoint matching after rotation by 45 degrees (39 matches found out of 83)

Figure 2: Results obtained after matching ears under a controlled environment
5.2 Results of using SIFT on the ear for matching under a strained environment
5.2.1 Results Obtained by Changing Illumination
Subject | Total keypoints found | Keypoints matched | Percentage matching | Illumination change | See figure
--------|-----------------------|-------------------|---------------------|---------------------|-----------
1       | 96                    | 0                 | 0                   | Large               | 3(a)
2       | 29                    | 0                 | 0                   | Large               | 3(b)
3       | 168                   | 35                | 20.833              | Medium              | 3(c)
4       | 234                   | 72                | 30.769              | Small               | 3(d)

Conclusion: SIFT gives good results for small to medium illumination changes but not for
large changes.
(a) Subject 1  (b) Subject 2
(c) Subject 3  (d) Subject 4

Figure 3: Matching results obtained after changing illumination for 4 subjects
5.2.2 Results Obtained by Changing Scale

Subject | Database image size | Query image size | Keypoints found | Keypoints matched | Percentage matching | See figure
--------|---------------------|------------------|-----------------|-------------------|---------------------|-----------
1       | 302×418             | 516×713          | 152             | 52                | 34.210              | 4(a)
1       | 390×266             | 666×440          | 93              | 18                | 19.354              | 4(b)
2       | 191×381             | 338×642          | 546             | 36                | 6.593               | 4(c)
2       | 586×476             | 342×265          | 843             | 49                | 5.812               | 4(d)

Conclusion: SIFT gives acceptable match results even when the scale of the image is
changed.
(a) Subject 1  (b) Subject 1
(c) Subject 2  (d) Subject 2

Figure 4: Matching results obtained after changing scale for 2 subjects
5.2.3 Results Obtained by Changing Rotation

Subject | Total keypoints found | Keypoints matched | Percentage matching | Type of ear rotation | See figure
--------|-----------------------|-------------------|---------------------|----------------------|-----------
1       | 431                   | 13                | 3.016               | Completely forward   | 5(a)
1       | 843                   | 18                | 2.135               | Completely backward  | 5(b)
2       | 126                   | 8                 | 6.349               | Completely forward   | 5(c)
3       | 370                   | 15                | 4.054               | Completely backward  | 5(d)

Conclusion: SIFT still finds matches even when the rotation of the ear is changed.
(a) Subject 1 (b) Subject 1
(c) Subject 2 (d) Subject 3
Figure 5: Matching Results Obtained After Changing Rotation for 3 subjects
5.2.4 Results Obtained by Occluding a Part of the Image

Subject | Total keypoints found | Keypoints matched | Percentage matching | Amount of occlusion | See figure
--------|-----------------------|-------------------|---------------------|---------------------|-----------
1       | 291                   | 3                 | 1.031               | Large               | 6(a)
2       | 968                   | 1                 | 0.103               | Large               | 6(b)
3       | 267                   | 40                | 14.981              | Medium              | 6(c)
3       | 284                   | 51                | 17.957              | Small               | 6(d)

Conclusion: SIFT gives good results for small to medium amounts of occlusion but not for
large occlusion.
(a) Subject 1 (b) Subject 2
(c) Subject 3 (d) Subject 3
Figure 6: Matching Results Obtained After occluding images of 3 subjects
5.3 Results obtained after plotting FAR, FRR and accuracy
Figure 7: FAR-FRR Curve
Figure 8: Accuracy Curve
Figure 9: ROC Curve
Figure 10: DET Curve
6. Conclusion

The results obtained by plotting the curves are as follows:

Accuracy = 99.4485%
Threshold = 4

i.e. we get maximum accuracy at threshold = 4.

From the several experiments conducted, we conclude that the SIFT algorithm is robust to
changes in scale and rotation for the ear biometric. It gives good results for small to
medium illumination changes but not for large changes. It gives good results for small
to medium occlusion but not for large occlusion. It is not an efficient algorithm (with
respect to computational speed) for large databases.
7. References
1. A. Iannarelli, Ear Identification. Forensic Identification Series. Paramount
   Publishing Company, Fremont, California, 1989.
2. Hui Chen and Bir Bhanu, "Contour Matching for 3D Ear Recognition", Center for
   Research in Intelligent Systems, University of California, Riverside, California.
3. Mark Burge and Wilhelm Burger, "Ear Biometrics in Computer Vision", 15th
   International Conference on Pattern Recognition (ICPR'00), Volume 2, pp. 2822, 2000.
4. David G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints",
   International Journal of Computer Vision, 2004.
5. P. Yan and K. B. Bowyer, "2D and 3D Ear Recognition", Biometric Consortium
   Conference, 2004.
6. www.cse.huji.ac.il/course/2006/compvis/lectures/