
A study of quantitative comparisons of photographs and video images based on landmark derived feature vectors

Krista F. Kleinberg a,*, J. Paul Siebert b

a Forensic Medicine and Science, Joseph Black Building, University of Glasgow, Glasgow G12 8QQ, UK

b Department of Computing Science, Sir Alwyn Williams Building, University of Glasgow, Glasgow G12 8QQ, UK

1. Introduction

As a result of the wide deployment of surveillance cameras, there is both opportunity and motivation, given the amount of visual material being collected digitally, to identify suspects from CCTV. Although rapidly improving in terms of spatial resolution, the majority of video surveillance equipment does not produce images of sufficient quality to provide identifications when other, more conclusive evidence, such as DNA or fingerprints, is not available. It is in these kinds of cases that anthropometry may have the potential to provide a useful identification technique. Surveillance video can be important supportive evidence because it may show a crime being committed, although it is not always easy to recognise, and therefore convict, a criminal caught on CCTV. Video surveillance can be more reliable than eyewitness testimony because the story told is always consistent and also corroborates what the eyewitness reported [1]. However, a more comprehensive analysis is necessary because even when facial video images are of sufficient quality, it is possible that two people may look similar to each other in this medium.

The roles of anthropometry and forensic science have been intertwined since Bertillon in the 1800s [2,3], and anthropometry was one of the identification methods used in [4–6]. Although more sophisticated vision-based methods of image comparison are being developed [7,8], it remains to be seen what can be achieved by utilizing ratios between key facial landmarks on single 2D images. Even if reliable automatic methods for face image comparison can be developed, manual intervention in landmark placement is likely to be required where low-quality images have to be analysed, such as those generated by many currently installed CCTV systems. In contrast to comparing two images, Halberstein's 2001 paper [9] compared anthropometric proportions from the face and body of live suspects against 2D images; this was one of the identification methods that resulted in convictions in two out of three cases. One of the fundamental problems with comparing 2D images is facial pose.

Forensic Science International 219 (2012) 248–258

A R T I C L E I N F O

Article history:

Received 6 July 2011

Received in revised form 22 November 2011

Accepted 4 January 2012

Available online 24 January 2012

Keywords:

Facial identification

Anthropometry

Image comparison

Face database

A B S T R A C T

An abundance of surveillance cameras highlights the necessity of identifying the individuals recorded. Images captured are often unintelligible and are unable to provide irrefutable identifications by sight, and therefore a more systematic method for identification is required to address this problem. An existing database of video and photographic images was examined, which had previously been used in a psychological research project; material consisted of 80 video (Sample 1) and 119 photographic (Sample 2) images, though taken with different cameras. A set of 38 anthropometric landmarks was placed by hand, capturing 59 ratios of inter-landmark distances to conduct within sample and between sample comparisons using normalised correlation calculations: mean absolute value between ratios, Euclidean distance and Cosine θ distance between ratios. The statistics of the two samples were examined to determine which calculation best ascertained if there were any detectable correlation differences between faces that fall under the same conditions. Each face in Sample 1 was then compared against the database of faces in Sample 2. We present pilot results showing that the Cosine θ distance equation using Z-normalised values achieved the largest separation between True Positive and True Negative faces. Having applied the Cosine θ distance equation, we were then able to determine that if a match value returned is greater than 0.7, it is likely that the best match will be a True Positive, allowing a decrease in the number of database images to be verified by a human. However, a much larger sample of images needs to be tested to verify these outcomes.

© 2012 Elsevier Ireland Ltd. All rights reserved.

* Corresponding author. Present address: PEACH Unit, University of Glasgow, Queen Mother's Hospital, 8th Floor Tower Block, Dalnair Street, Glasgow G3 8SJ, UK. Tel.: +44 141 201 1988; fax: +44 141 201 6943.

E-mail addresses: [email protected] , [email protected]

(K.F. Kleinberg), [email protected] (J.P. Siebert).

Contents lists available at SciVerse ScienceDirect

Forensic Science International

journal homepage: www.elsevier.com/locate/forsciint

0379-0738/$ – see front matter © 2012 Elsevier Ireland Ltd. All rights reserved.

doi:10.1016/j.forsciint.2012.01.014

Attempts to rectify this in facial recognition pose invariant

systems described in [10,11] reported greater recognition rates

than when used without the pose transformations. Using soft

biometric traits was shown to be beneficial in improving

recognition accuracy when combined with a commercial based

face matching program [12].

Three questions should be asked of a comparison method: is it possible to carry out the comparison objectively, is it possible to avoid manual input, and is it applicable to checking large databases? An identification made based on 2D images will be more decisive if there is a way to quantify the comparison, rather than if the identification is based solely on a subjective analysis, as the result is a comparison that is objective with minimal bias. Once quantification of a comparison is achieved, the process should be automated. An automated process would decrease the error from involving many different operators in the comparison process and would allow large databases to be checked quickly. As a face search could potentially be extended to full populations by reviewing international databases, e.g. Interpol [13], the need for automation is high. According to the Ministry of Justice Statistics (UK) bulletin, the reoffending rate for criminals in England and Wales in 2006 was 146.1 offences per 100 offenders [14]. Although this is a decrease of 22.9% from 2000, the numbers indicate there is justification for a database of convicted criminal images that could be checked quickly and automatically.

We document an investigation into the comparison of anthropometric ratios of facial landmark pairs manually located on 2D images. The constraints in this study are that we consider best-case scenario situations as a benchmark, given that scenarios in the field, by definition, cannot be as benign. The subject matter is based on the analysis of comparing high quality full-face frontal video and photographic images of individuals of a similar ethnic background with neutral expressions.

This investigation expanded previous research carried out by Kleinberg, Vanezis and Burton [15] and was conducted to test the hypothesis: Using a comparison of anthropometric facial ratios, it is possible to discriminate between individuals of two samples. The objective of this study was to derive measurements between specific landmarks on the face in both print and video media and incorporate them into a feature vector to use in statistical analysis to determine if identifications of an individual can be made based on these measurements. Knowledge of the type of information gathered in this study may help in future to rank potential suspects for human identification verification. However, in order to establish that two faces were the same and use this identification method to identify positively rather than eliminate suspects, it would be necessary to show that the probability of a false match in the rest of the population at random was acceptably low [16]. To investigate the hypothesis in this study, we seek to address the following questions:

Of the proposed images, can similar faces be separated from dissimilar faces within a single sample using vector comparisons?

How distinguishable are individual faces in the samples? Is it possible to distinguish true positive faces from true negative faces using vector comparisons where the statistics from two samples are known?

Using a small sample of re-landmarked images, how significant is the error contribution in re-landmarked images and what is the operator induced measurement spread under ideal conditions?

Given a specific example and set of comparisons with the database, what constitutes a manageable subsample, worthy of further manual verification?

2. Materials

A total of 199 images of Caucasian male police volunteers were available which had been used previously in research conducted by Bruce et al. [17]. The 199 images comprised 80 different video still faces (Sample 1) and 119 different photographic faces (Sample 2). According to Bruce et al. [17], "The image quality on the videos was high – equivalent to what would be produced by a good amateur photographer trying to reveal a good likeness of someone on a home videotape." The photographic images in Sample 2 included the same 80 faces depicted in the video cohort, and an additional 39 new faces not included as video stills. The photographs were of policemen, both retired and presently working, and, except for photographs that have already been published elsewhere, cannot be exhibited in this paper. However, an example of each type of image is provided in Fig. 1. Both sets of images, taken on the same day, were displayed from the frontal viewpoint, showing features from the neck up, in what appeared to be the format of police identification photographs. In this study the identity of the subjects in the video images was known and could be cross referenced with the corresponding photographic images. This means that identifications made on the basis of facial anthropometry could be designated as true or false. One positive feature of these video images was that because they were recorded on the same day as the photographs, the study images did not have any of the possible facial changes which can occur due to time factors such as weight loss/gain, increase in age or presence of facial hair.

3. Methodology

Given a set of landmarks, there is a need to quantify the landmarks numerically such that they can be used to compare faces. Ideally, the measure should be invariant to in-plane translations and rotations and be tolerant to a degree of out-of-plane rotation in order to accommodate the variability inherent when posing a subject for full frontal image capture. Thirty-eight landmarks (Table 1), ten unilateral and 14 bilateral, were chosen for inclusion in the anthropometric study and are shown in Fig. 2. Careful consideration was given to the selection of landmarks that were used in this study. Anthropometric research by Farkas [18], Purkait [19], Fieller [20], Evison [21], and facial recognition research by Craw et al. [22] and Okada et al. [23] were consulted when choosing the landmarks that were included in the present study. When choosing a landmark it was important that it was one that could be placed consistently: an operator performing the comparison had to be able to locate it in the same place within an acceptable error. According to Fieller [20], the criteria used to determine a successful/reliable landmark are: observer knowledge, consistency of landmark placement, discriminatory power, and landmark visible in majority of cases. Excluded landmarks were eliminated on the basis of their inability to be located on photographs.

Fig. 1. High resolution video image (a) and selection of ten database photographs (b).

Although the number of possible linear measurements increases combinatorially with the number of landmarks, not all are reliable or pertinent to the research undertaken for this study. A total of 73 linear measurements (21 unilateral, 26 bilateral) were chosen for this study. The majority of these were chosen by consulting the literature [18,22,24]. Two of these measurements used in a previous study [25], ex-n and ex-sto, were chosen because they utilise landmarks that were considered to be less affected by facial expression than others and also because they would be visible even if the subject was wearing a hat. Three bilateral measurements were unique to the present study.

From these landmarks and linear measurements, a total of 59 ratios (also unilateral and bilateral) were selected for comparison of images (Table 2). The linear measurements that make up the ratios are shown in Fig. 3. A ratio was derived by dividing the smaller linear measurement (numerator) by the larger linear measurement (denominator). The ratios were chosen to achieve a balance of the horizontal and vertical regions of the face. Intuitively, it is expected that longer lines between landmarks located on different sections of the face would make a more reliable proportion than two short lines in the same section of the face. This is because small variations in the placement of the landmarks making up short lines would result in large changes in proportions, which may not accurately portray true variations between individuals. The ratios utilised in this research were deliberately chosen to include linear measurements between landmarks in different sections of the face and others that covered a small section of the face, such as the length vs. the width of the eye. As it is more common to use absolute measurements in anthropometric comparisons [18,19,26,27] rather than ratios, there was less guidance with respect to which ratios would be more reliable or more relevant than others in the present study. Halberstein used a combination of up to twelve face and body ratios when comparing a photograph to a live subject, and three of these ratios were used in the present study [9]. These ratios were ear length/facial height (sa-sba/n-gn), nasal height/ear length (n-sn/sa-sba) and nasal width/nasal height (al-al/n-sn). The remainder of the ratios that were used by Halberstein were not incorporated into this research because they either included facial landmarks that were not chosen for the present study or

Table 1

Landmarks and their definitions used in this study [18,24].

1. Glabella (g): the most prominent midline point between the eyebrows.

2. Nasion (n): the point in the midline of both the nasal root and the

nasofrontal suture. This point is always above the line that connects the

two inner canthi. A canthus is the angle at either end of the fissure

between the eyelids.

3. Exocanthion (ex): the point at the outer commissure of the eye fissure. A

commissure is the site of union of corresponding parts and a fissure is any

cleft or groove, in this case of the eye [bilateral].

4. Endocanthion (en): the point at the inner commissure of the eye fissure [bilateral].

5. Palpebrale superius (ps): highest point in the midportion of the free

margin of each upper eyelid. The free margin portion of the eyelid is the

unattached edge [bilateral].

6. Palpebrale inferius (pi): the lowest point in the midportion of the free

margin of each lower eyelid [bilateral].

7. Orbitale (or): the lowest point on the margin of the orbit. The orbit is the

bony cavity that contains the eyeball [bilateral].

8. Superaurale (sa): the highest point of the free margin of the auricle. The

auricle is the portion of the external ear that is not contained within the

head [bilateral].

9. Subaurale (sba): the lowest point on the free margin of the ear lobe

[bilateral].

10. Postaurale (pa): the most posterior point on the free margin of the ear

helix. The helix refers to the coiled structure of the ear. [bilateral].

11. Otobasion inferius (obi): the lowest point of attachment of the external

ear to the head [bilateral].

12. Alare (al): the most lateral point on each nostril contour [bilateral].

13. Subnasale (sn): the midpoint of the angle at the columella (fleshy, lower

margin) base where the lower border of the nasal septum and the surface

of the upper lip meet.

14. Pronasale (prn): the most protruded point of the nasal tip.

15. Subalare (sbal): the point on the lower margin of the base of the nasal

ala where the ala disappears into the upper lip skin [bilateral].

16. Stomion (sto): the imaginary point at the crossing of the vertical facial

midline and the horizontal labial (lip) fissure between gently closed lips,

with teeth shut in the natural position.

17. Crista philtri landmark (cph): the point on the elevated margin of the

philtrum just above the vermilion line. The philtrum is the vertical

groove in the median portion of the upper lip and vermilion refers to the

exposed red portion of the upper or lower lip [bilateral].

18. Cheilion (ch): the point located at each labial commissure [bilateral].

19. Labiale inferius (li): the midpoint of the vermilion border of the lower lip.

20. Labiale superius (ls): the midpoint of the vermilion border of the upper lip.

21. Gonion (go): the most lateral point at the angle of the mandible. The

mandible is the bone of the lower jaw [bilateral].

22. Sublabiale (sl): determines the lower border of the lower lip or the upper

border of the chin.

23. Pogonion (pg): the most anterior midpoint of the chin.

24. Gnathion (gn): the lowest point in the midline on the lower border of the

chin.

Fig. 2. Facial landmarks and their location.

Table 2

Ratios used in this study.

go-go/n-gn sn-sto/sto-sl sn-gn/n-sto li-sl/sn-ls

n-prn/g-pg sbal-sn/sn-prn [bilateral] gn-go/n-gn [bilateral] sl-gn/sto-gn

al-al/ex-ex ex-go/go-go [bilateral] al-al/n-sn n-sn/n-sto

sa-sba/n-gn [bilateral] n-gn/n-sto n-sn/sa-sba [bilateral] en-al/ex-ch [bilateral]

ex-ex/go-go obi-ch/g-sa [bilateral] ex-n/ex-sto [bilateral] sbal-ls/n-al [bilateral]

ex-n/n-sto [bilateral] pi-al/sa-ex [bilateral] ex-sto/n-sto [bilateral] ex-obi/ex-ch [bilateral]

en-ex/ps-pi [bilateral] ex-al/ch-gn [bilateral] en-en/ex-ex ch-ls/n-prn [bilateral]

pi-or/en-ex [bilateral] al-ls/ch-gn [bilateral] sa-sba/pa-obi [bilateral] ch-li/ex-ch [bilateral]

cph-cph/sn-ls ex-sto/rt ex-lt ch [bilateral] ls-sto/ch-ch sn-gn/ex-gn [bilateral]

sto-li/ch-ch


because they were body ratios, such as shoulder width, leg or shoe lengths. Two ratios (n-sn/n-sto, n-gn/n-sto) were used by Catterick for his research [28]. The remainder of the ratios chosen were unique to this study. In order to continue with a best case scenario situation, one volunteer, with previous experience in placing landmarks on 2D images, placed the 38 landmarks on all 199 images using the measurement programme produced in-house, Facial Identification Centre Version 0.32 (Forensic Medicine and Science, Glasgow University).

The group of 59 ratios is treated as a 59 dimensional vector and this has been

evaluated as a means of comparing all faces. In this study, the feature vector is the

series of 59 ratios derived from chosen linear measurements between facial

landmarks. The alternative for comparing ratios between landmarks is to compare

the raw distances between the landmarks. Comparing raw distances can be

accomplished using the Procrustes [29] alignment techniques, and although

outside of the scope of the current project, may be used in future studies. The advantage of using ratios is that they are both scale and rotation invariant and also, to a slight degree, auto-corrective (in terms of errors added during landmarking). In addition, ratios exhibit a degree of invariance to the effects of out-of-plane rotations for small angles (when the effects of such rotations are sufficiently small to approximate a 2D affine transformation on the imaging plane).
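The scale and in-plane rotation invariance claimed for ratios can be checked numerically. The following is a minimal sketch, not the in-house measurement programme: the landmark coordinates, the `_r`/`_l` naming for bilateral landmarks and the helper functions are all hypothetical, chosen only to illustrate the nasal width/nasal height ratio (al-al/n-sn) from Table 2.

```python
import math

def distance(p, q):
    """Straight-line distance between two 2D landmark positions."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def ratio(landmarks, numerator, denominator):
    """Ratio of two inter-landmark distances, e.g. al-al / n-sn."""
    num = distance(landmarks[numerator[0]], landmarks[numerator[1]])
    den = distance(landmarks[denominator[0]], landmarks[denominator[1]])
    return num / den

def transform(landmarks, scale, angle_deg, shift):
    """Apply an in-plane rotation, uniform scaling and translation."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    return {
        name: (scale * (c * x - s * y) + shift[0],
               scale * (s * x + c * y) + shift[1])
        for name, (x, y) in landmarks.items()
    }

# Hypothetical pixel coordinates (assumed values, not taken from the study).
face = {
    "n": (100.0, 80.0),      # nasion
    "sn": (100.0, 130.0),    # subnasale
    "al_r": (85.0, 125.0),   # right alare
    "al_l": (115.0, 125.0),  # left alare
}

r_original = ratio(face, ("al_r", "al_l"), ("n", "sn"))
moved = transform(face, scale=2.5, angle_deg=12.0, shift=(40.0, -15.0))
r_moved = ratio(moved, ("al_r", "al_l"), ("n", "sn"))
print(r_original, r_moved)
```

Both values agree to within floating-point error, because scaling multiplies both distances by the same factor and rotation and translation leave distances unchanged. Out-of-plane rotation is not modelled here.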

Three equations were used to test the comparison of a feature vector from one sample against another: mean absolute difference, Euclidean distance and Cosine θ distance. The first two equations compare the length of the difference vector and the third equation compares the angle between the vectors. The three equations are as follows:

3.1. The mean absolute difference between ratio vectors

Eq. (1) determines the distance that separates one face from another by taking the absolute value of the difference between each ratio of one face and the same ratio of a second face. This is carried out for each ratio element in the feature vector. The sum of these absolute differences is then divided by the total number of elements (59 ratios in this case). A difference of 0 between two faces establishes that those two faces have identical facial ratio vectors. A smaller difference in facial ratios is indicative of a smaller difference between faces. A disadvantage of using this equation is that the maximum difference between faces is not bounded:

\[
\text{Mean abs diff} = \frac{\sum_{n=1}^{N} \lvert F_1(n) - F_2(n) \rvert}{N} \tag{1}
\]
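As an illustration of Eq. (1), the following minimal sketch computes the mean absolute difference for two short, made-up ratio vectors (three elements rather than 59). The function name and the example values are assumptions for illustration only.

```python
def mean_abs_diff(f1, f2):
    """Mean absolute difference between two equal-length ratio vectors (Eq. (1))."""
    if len(f1) != len(f2):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(f1, f2)) / len(f1)

# Hypothetical ratio vectors for two faces.
face_1 = [0.60, 0.85, 0.40]
face_2 = [0.70, 0.80, 0.45]

d = mean_abs_diff(face_1, face_2)
print(d)                              # roughly 0.0667
print(mean_abs_diff(face_1, face_1))  # 0.0 for identical vectors
```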

3.2. The Euclidean distance between ratios

The Euclidean distance (Eq. (2)) also measures the distance between two multi-dimensional vectors. This is the square root of the sum of the squared differences between corresponding elements, in this case ratios. A difference of 0 between two faces establishes that those two faces have identical facial ratio vectors. A smaller difference in facial ratios is indicative of a smaller difference between faces. A disadvantage of using this equation is that the maximum difference between faces is not bounded:

\[
\text{Euclidean distance} = \sqrt{\sum_{n=1}^{N} \bigl( F_1(n) - F_2(n) \bigr)^2} \tag{2}
\]
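Eq. (2) can be sketched in the same way. The function name and the three-element vectors below are hypothetical and serve only to show the arithmetic.

```python
import math

def euclidean_distance(f1, f2):
    """Euclidean distance between two equal-length ratio vectors (Eq. (2))."""
    if len(f1) != len(f2):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))

# Hypothetical ratio vectors: element-wise differences are 0.3, 0.4 and 0.0.
face_1 = [0.6, 0.8, 0.4]
face_2 = [0.3, 0.4, 0.4]

d_euclid = euclidean_distance(face_1, face_2)
print(d_euclid)  # approximately 0.5
```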

3.3. The Cosine θ distance

The Cosine θ distance equation (Eq. (3)) is a similarity measurement and is used to measure the angle between two vectors. A cosine value of 1.0 between two faces establishes that those two faces have vectors of facial ratios that point in the same direction (identical up to a scale factor). An advantage of using this equation is that the range of values is bounded from −1.0 to +1.0 and useful comparisons range from zero to one. A value of zero is indicative of a face that shows no correlation whereas a result of 0.5 is achieved by random chance. Any negative result shows the face comparison produces an inverse correlation:

\[
\cos\theta = \frac{\sum_{n=1}^{N} F_1(n)\, F_2(n)}{\lVert F_1 \rVert \, \lVert F_2 \rVert} \tag{3}
\]
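A minimal sketch of Eq. (3) follows, showing the bounded range with three hand-picked cases. The function name and vectors are assumptions for illustration; the vectors stand in for Z-normalised ratio vectors, which can take negative values.

```python
import math

def cosine_similarity(f1, f2):
    """Cosine of the angle between two equal-length feature vectors (Eq. (3))."""
    if len(f1) != len(f2):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(f1, f2))
    norm_1 = math.sqrt(sum(a * a for a in f1))
    norm_2 = math.sqrt(sum(b * b for b in f2))
    if norm_1 == 0.0 or norm_2 == 0.0:
        raise ValueError("cosine is undefined for a zero-length vector")
    return dot / (norm_1 * norm_2)

same = cosine_similarity([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])         # close to +1
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])             # 0
opposite = cosine_similarity([1.0, -2.0, 0.5], [-1.0, 2.0, -0.5])  # close to -1
print(same, orthogonal, opposite)
```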

A comparison between two faces was deemed a true positive match (TP) if the match was a correct match between the video image and photograph of the same subject. A true negative match (TN) was one that excluded the faces and was a correct exclusion because it involved a video image and a photograph of two different subjects. A false positive match (FP) was an incorrect match between a video image and a photograph of two different subjects, and a false negative match (FN) was one that was excluded but was an incorrect exclusion because it involved the video image and photograph of the same subject.

To answer the questions laid out in Section 1, the three equations were applied in the following four scenarios to test the comparison of Sample 1 faces to Sample 2 faces: within sample comparisons, between sample comparisons, error in landmark placement, and the potential sample of photographs subject to manual verification.
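The four outcome labels defined above can be expressed as a small decision function. This is a sketch under stated assumptions: the subject identifiers, similarity scores and the use of a single decision threshold are hypothetical and are not taken from the study's data.

```python
def classify(same_subject, declared_match):
    """Label one video/photograph comparison as TP, TN, FP or FN."""
    if declared_match:
        return "TP" if same_subject else "FP"
    return "FN" if same_subject else "TN"

# Hypothetical comparisons: (video subject, photograph subject, similarity score).
comparisons = [
    ("subject_01", "subject_01", 0.82),
    ("subject_01", "subject_02", 0.10),
    ("subject_02", "subject_02", 0.35),
    ("subject_02", "subject_03", 0.74),
]

THRESHOLD = 0.7  # assumed decision threshold, for illustration only

labels = [
    classify(video == photo, score > THRESHOLD)
    for video, photo, score in comparisons
]
print(labels)
```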

4. Results

4.1. Within sample comparisons

To test if similar faces were separable from dissimilar faces within a single sample, the equations were applied so that every face in a single sample was compared to itself and every other face within this sample. Each sample contained only one image of each face and for this reason all that could be determined was the true negativity of this collection of different faces. Therefore, no estimate of the degree to which two same faces (true positives) would match when captured at different times could be made from this data. The same tests were carried out on Sample 1 (video) and then separately on Sample 2 (photographs). Testing all combinations of pairs of faces within each sample was important because it compared faces acquired under the same capture conditions, allowing the tests to ascertain if it were possible to discriminate between different (true negative) faces. Therefore, in this experiment the primary source of variability between faces should be attributable to differences in the measured facial landmark ratios, i.e. generated by genuine face shape differences, whilst the statistics of the remaining sources of variability remain constant: same media, same operator placing landmarks and same facial pose.

The similarity or dissimilarity between the faces in a single sample is cross-checked by comparing the distributions of the similarity statistics of Sample 1 to those of Sample 2. A Sample 1 to Sample 2 cross-comparison of the statistics produced by matching faces within their own samples can be used in future to predict how discriminable faces are when making comparisons between these two samples. If both samples exhibit similar statistics, this would indicate that it is possible to distinguish faces between samples, because any difference between faces would be a result of the true difference in faces rather than a result of the different media recording each of the two samples of images. Results are summarised in Fig. 4a–c and are illustrated by superimposing the normal distribution curves and similarity density histograms of the two samples. In Table 3a the standard deviation scaled difference between the Sample 1 and Sample 2 means indicates the difference in statistics between media, the cosine distance by far exhibiting the greatest difference.

Fig. 3. Linear measurements that created the ratios utilised in this study.

In order to equalise the absolute ranges of the feature vector values in each sample and address the observed difference in the statistics of the comparisons of Sample 1 and Sample 2, the equations were applied to Z-normalised ratio values, illustrated in Fig. 4a–c. Each element F(n) in the measurement vector F is represented by a population of measurement ratios within a sample. Z-normalisation potentially enhances the range of variation (and accentuates any differences) about the ratio mean of this sub-population of ratios, allowing small differences in the data to become more apparent. This was accomplished by dividing the mean-subtracted element by the sample standard deviation for the particular ratio (Eq. (4)). Z-normalisation can be applied to all three of the equations:

\[
\text{Z-normalised element:} \quad F_Z(n) = \frac{F(n) - \mu_{F(n)}}{\sigma_{F(n)}} \tag{4}
\]

Therefore, by applying Z-normalisation it becomes possible to

force the distributions into a standardised range. Taking the Z-

normalised cosine distance as an example, the result of this

process, driving the means and standard deviations together and

reducing difference between the variance-scaled means is

illustrated in Table 3b.
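The Z-normalisation of Eq. (4) can be sketched as follows. This is an illustration under assumptions: the function name and the tiny three-face, two-ratio sample are hypothetical, and the standard deviation is taken to be the sample (n − 1) form, which the text suggests but does not state explicitly.

```python
import statistics

def z_normalise(sample):
    """Z-normalise each ratio (column) across all faces (rows) in a sample.

    Each element has the mean of its ratio subtracted and is divided by the
    standard deviation of that ratio within the sample (Eq. (4)).
    A ratio with zero spread would raise ZeroDivisionError.
    """
    columns = list(zip(*sample))
    means = [statistics.fmean(col) for col in columns]
    sds = [statistics.stdev(col) for col in columns]  # assumed n - 1 form
    return [
        [(value - m) / s for value, m, s in zip(face, means, sds)]
        for face in sample
    ]

# Hypothetical sample: three faces, two ratios each.
sample = [
    [0.60, 0.80],
    [0.70, 0.90],
    [0.80, 0.70],
]

normalised = z_normalise(sample)
for face in normalised:
    print([round(v, 3) for v in face])
```

After normalisation every ratio has zero mean and unit standard deviation within the sample, so no single ratio dominates the distance or cosine calculation because of its absolute range.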

4.2. Between sample comparisons

Results from applying the equations were illustrated using distribution histograms, separating TP faces from TN faces (Fig. 5a–c). These normal histogram distribution curves of TP faces and TN faces were superimposed to determine if it was possible to distinguish between faces in the two groups. The amount of overlap shows the possibility of achieving either a FP or FN face match, also known as the rate of misclassification. The smaller the area of overlap, the smaller the chances of obtaining a FP or FN face match.

In the graphs, TP face matches are represented by the dotted lines and the solid lines represent TN face matches. In order to ensure equal numbers of faces in the two samples, the 39 faces in Sample 2 that were not in Sample 1 were not included in this

Table 3b

Mean, standard deviation, and standard deviation scaled difference between the Sample 1 and Sample 2 means of Z-normalised: cosine distance for Sample 1 and Sample 2.

Comparison method     Sample   Sample mean μ   Sample SD   μ/SD      Difference in μ/SD   N, number of samples
Z-normalised Cos(θ)   1        0.0113          0.2460      0.04594   0.01474              3160
Z-normalised Cos(θ)   2        0.0077          0.2468      0.0312                         7021

Table 3a

Mean, standard deviation, and standard deviation scaled difference between the Sample 1 and Sample 2 means of unnormalised: mean absolute distance (MAD), Euclidean distance and cosine distance for Sample 1 and Sample 2.

Comparison method    Sample   Sample mean μ   Sample SD   μ/SD      Difference in μ/SD   N, number of samples
MAD                  1        0.08354         0.02316     3.607     0.561                3160
MAD                  2        0.08507         0.02041     4.168                          7021
Euclidean distance   1        1.015           0.4121      2.463     1.042                3160
Euclidean distance   2        1.149           0.3278      3.505                          7021
Cos(θ)               1        0.9859          0.01314     75.03     74.955               3160
Cos(θ)               2        0.9887          0.006592    149.985                        7021
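The derived columns of Table 3a can be reproduced from the reported means and standard deviations. The sketch below is a worked arithmetic check only: the helper names are hypothetical, and reading N as the number of distinct face pairs, n(n − 1)/2, is an interpretation that happens to match the tabulated values for 80 and 119 faces.

```python
def scaled_mean(mean, sd):
    """Mean expressed in units of its own standard deviation."""
    return mean / sd

def pair_count(n_faces):
    """Number of distinct pairs among n_faces faces."""
    return n_faces * (n_faces - 1) // 2

# (mean, SD) for Sample 1 and Sample 2, copied from Table 3a.
table_3a = {
    "MAD": ((0.08354, 0.02316), (0.08507, 0.02041)),
    "Euclidean distance": ((1.015, 0.4121), (1.149, 0.3278)),
    "Cos(theta)": ((0.9859, 0.01314), (0.9887, 0.006592)),
}

scaled_differences = {
    method: abs(scaled_mean(*s1) - scaled_mean(*s2))
    for method, (s1, s2) in table_3a.items()
}

for method, diff in scaled_differences.items():
    print(f"{method}: {diff:.3f}")

print(pair_count(80), pair_count(119))
```

The unnormalised cosine measure has a very small standard deviation relative to its mean, which is why its scaled difference is so much larger than that of the other two measures.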

Fig. 4. (a–c) Summary of the conditions imposed and results achieved in the within sample comparisons of faces. Histograms and superimposed mean and standard deviation of unnormalised: mean absolute distance (a), Euclidean distance (b) and cosine distance (c) within sample comparisons for Sample 1 (dotted lower line) and Sample 2 (solid upper line).




analysis and every face in Sample 1 was compared to every face in Sample 2. The mean absolute difference, the Euclidean distance and the Cosine θ distance equations (all distance measures Z-normalised) were applied in the between sample comparisons. Superimposed normal histogram distribution curves of TP and TN face matches were used to illustrate the discrimination between the two groups. In general, a slightly narrower distribution was seen for the TP faces. This was most likely because the distribution contained only TP matches and therefore the data should be centred on a smaller range of values. The amount of overlap between the TP and TN face matches correlated to the possibility of achieving either a FP or FN face match.

Superimposing the normal curves to demonstrate the separation between TP and TN face matches, the Cosine θ distance (Z-normalised) equation produced the smallest amount of overlap and, of the three equations tested, was determined to be the best equation to test the discrimination between faces of two samples. Examination of the superimposed curves showed approximately a 30% chance of the best match between compared faces corresponding to a correct identification. The TP distribution is very small and difficult to see on the graphs. However, it is still possible to see the 0.7 threshold emerging for the Cosine θ distance with careful observation of Fig. 5c.

Table 4 illustrates that, following Z-normalisation, the cosine distance provides the greatest separation of TP comparisons from the TN comparisons, based on the difference between the TP and TN standard deviation scaled means. The conclusion drawn from this investigation was that the cosine distance equation was the best predictor of face discrimination tested thus far, and it was the sole equation used to test the error in landmark placement and to determine the sample of images from the database that could be narrowed down for further verification by an operator.
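The separation figure used in this comparison can be sketched in a few lines of Python. The sketch assumes that the standard deviation scaled mean is the mean divided by the standard deviation and that the separation is the absolute difference of the TP and TN scaled means; the function names are hypothetical. The inputs are the means and standard deviations listed in Table 4, with the TN mean for Cos(θ) taken as negative, an assumption that is consistent with the tabulated difference of 2.3950.

```python
def sd_scaled_mean(mean, sd):
    """Mean expressed in units of its own standard deviation."""
    return mean / sd

def sd_scaled_separation(mean_tp, sd_tp, mean_tn, sd_tn):
    """Absolute difference between the TP and TN SD-scaled means."""
    return abs(sd_scaled_mean(mean_tp, sd_tp) - sd_scaled_mean(mean_tn, sd_tn))

# Means and SDs as listed in Table 4 (all measures Z-normalised).
mad_sep = sd_scaled_separation(0.7771, 0.2234, 1.1053, 0.2574)
euc_sep = sd_scaled_separation(7.594, 2.258, 10.572, 2.442)
# Assumption: the TN mean for Cos(theta) is negative (-0.0061).
cos_sep = sd_scaled_separation(0.4822, 0.2035, -0.0061, 0.2389)

for name, value in [("MAD", mad_sep), ("Euclidean", euc_sep), ("Cos(theta)", cos_sep)]:
    print(f"{name}: {value:.3f}")
```

Under these assumptions the three values agree with the tabulated differences to within rounding, and the cosine measure gives the largest separation of the three.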

In pattern matching based on the cosine distance between two unit vectors, the returned measure can be interpreted as a match probability. Whilst a cosine distance of 1 indicates a 100% probability of the compared vectors being the same, and 0 indicates zero probability, a distance of 0.5 indicates the 50% chance level of correlation between compared vectors. The mean of the TP distribution barely reaches this 50% level, although this of course indicates that approximately half of the TP comparisons will at least exceed a chance match value. A standard deviation scaled mean of 2.37 (Table 4) for the TP distribution, whose mean is 0.48, indicates that over 17.5% of the TP matches will exceed a 70% chance of producing a best (closest) match for the database tested.

Table 4

Summary of the conditions imposed and results achieved in the between sample TP and TN face comparisons. Mean, standard deviation, and standard deviation scaled difference between TP and TN comparisons for Z-normalised vectors: mean absolute distance, Euclidean distance and cosine distance TP and TN data sets.

Comparison method (all Z-normalised)  Sample  Sample mean (μ)  Sample SD  μ/SD      |μTP/SDTP - μTN/SDTN|  N (number of samples)
MAD                                   TP      0.7771           0.2234     3.479     0.815                  80
MAD                                   TN      1.1053           0.2574     4.294                            6320
Euclidean distance                    TP      7.594            2.258      3.3632    0.9660                 80
Euclidean distance                    TN      10.572           2.442      4.3292                           6320
Cos(θ)                                TP      0.4822           0.2035     2.370     2.3950                 80
Cos(θ)                                TN      -0.0061          0.2389     -0.02553                         6320

Fig. 5. (a-c) Summary of the conditions imposed and results achieved in the between sample TP and TN face comparisons. Results are illustrated by the superimposed normal histogram curves showing the amount of overlap in TP (dotted lower line) and TN faces (solid upper line). Mean absolute distance (a), Euclidean distance (b) and cosine distance (c).

4.3. Error in landmark placement

A small inter-operator study was carried out to assess the influence of landmark placement conducted by multiple operators. It has been reported that landmark placement, tested on 3D images in a clinical setting, reveals that average operator error can vary widely [30]. Therefore the effect of landmark placement error is important to test because, although landmark placement on images in the two samples used in this study was conducted by a single operator, this would not likely occur in practice.

Facial landmarks were placed on a total of six video images, chosen at random, six times each by five different operators. One operator had previous experience in using the equipment and knowledge of the landmarks; landmark locations were studied using the definitions provided in [18] and [24]. The remaining operators had no experience in using the equipment and no previous knowledge of anthropometric landmarks. The inexperienced operators were given a list of landmark definitions (Table 1) adapted from the literature [18,24] as well as a single photocopy of an enlarged male face (A4 sized), front facing, with previously placed landmarks to use as a guide. The same equipment was used by all operators and each operator conducted their landmark placement of images in a single day. Using the Cosine θ distance equation, comparisons of re-landmarked images were analysed first from the single experienced operator and second, from all operators (Fig. 6a and b).

The Cosine θ distance (Z-normalised) equation was used to compare the re-landmarked images because, when applied in the comparison of faces between samples, it was found to be the equation in which the statistics of the TP and TN populations were the most separated. Each face in the subset sample was compared to every other face in the subset sample and the resulting data were illustrated as superimposed normal histogram curves of TP and TN face matches. It was hypothesised that an inter-operator test using high resolution research material, but completed by inexperienced operators, would produce a greater amount of variation than one completed by an experienced operator; this hypothesis was tested and found to hold.
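The numbers of comparisons reported for this test (N in Table 5) are consistent with comparing every re-landmarked image with every other one. The short Python sketch below counts the pairs under that assumption, and under the further assumption that each operator landmarked each of the six images six times; the function name and arguments are hypothetical.

```python
from math import comb

def pair_counts(n_images, n_repeats, n_operators):
    """Count same-face (TP) and different-face (TN) pairs in an all-pairs design."""
    per_image = n_repeats * n_operators   # landmarkings made of each image
    total = n_images * per_image          # all landmarked instances
    tp = n_images * comb(per_image, 2)    # pairs drawn from the same image
    tn = comb(total, 2) - tp              # all remaining pairs
    return tp, tn

print(pair_counts(6, 6, 1))  # one experienced operator: (90, 540)
print(pair_counts(6, 6, 5))  # all five operators: (2610, 13500)
```

Both results match the TP and TN sample sizes listed in Table 5, which suggests, without confirming it, that the comparisons were made in this way.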

The effect that inexperienced operators had on the separation rate of TP and TN faces was compared to that of an experienced operator and is summarised in Fig. 6a and b and Table 5. Compared to that of the experienced operator, the effect of landmark placement by inexperienced operators can clearly be seen in the separation rates of TP and TN face matches: the mean for TP comparisons collapses from 0.8 for the experienced operator to 0.44 for the mix of experienced and inexperienced operators. The TP vs. TN standard deviation scaled means separation is more than double for the experienced operator compared to that of the mix of operators. A similar observation can be made by inspecting the superimposed histograms for the TP and TN comparisons for the experienced operator vs. the mix of experienced and inexperienced operators. Although a bimodal result is generated by the experienced operator for both TP and TN, a much greater separation of the TP-TN distributions is observed. However, for all operators a typical averaged picture emerges and the effect of the expert can be seen as a small additional bump at the top of the TP distribution.

The most important point witnessed in Fig. 6a and b was the strong effect that the experienced operator had in creating a larger separation of TP and TN face matches. As multiple experienced operators were not tested, it cannot be stated whether this difference in separation rates between the experienced operator and all operators was due to the experience of the operators or, instead, to the effect that will naturally occur with multiple operators. This could be tested by conducting a study using a pool of experienced operators. A further study analysing the distribution achieved from the re-landmarked images of each operator after applying the Cosine θ distance (Z-normalised) equation could determine if any of the inexperienced operators also achieved the same strong separation rate as the experienced operator. An inexperienced operator producing a similar degree of separation to the experienced operator would signify that the large difference in separation rate seen for all operators was caused by the inclusion of multiple operators rather than their experience. However, from the literature in [31], it can be predicted that the spread from a single inexperienced operator would be larger than that from an experienced operator.

Fig. 6. (a and b) Superimposed normal curve histograms illustrating TP (dotted lower line) and TN (solid upper line) face comparisons of the Cosine θ (Z-normalised) distance equations in six re-landmarked images from Sample 1 using one experienced operator (a) and multiple operators (b).

Table 5

Mean, standard deviation, and standard deviation scaled difference between TP and TN comparisons in landmark placement error study: mean absolute distance, Euclidean distance and cosine distance for TP and TN data sets illustrating TP and TN face comparisons in six re-landmarked images from Sample 1 using one experienced operator and multiple operators.

Comparison method: cosine distance (all Z-normalised)  Sample  Sample mean (μ)  Sample SD  μ/SD     |μTP/SDTP - μTN/SDTN|  N (number of samples)
One experienced operator                               TP      0.8029           0.1489     5.392    5.8224                 90
One experienced operator                               TN      -0.1666          0.3871     -0.4304                         540
Multiple operators: experienced and inexperienced      TP      0.4409           0.2655     1.6606   2.01351                2610
Multiple operators: experienced and inexperienced      TN      -0.08657         0.2453     -0.3529                         13,500

4.4. Potential sample of photographs subject to manual verification

The amount of overlap between the distributions of TP and TN faces illustrates an approximation of the misclassification rate (see Fig. 5a-c). However, given the task of comparing a suspect's image to a large database of identity photographs, the ability to decrease the number of possible face matches could potentially save significant numbers of investigation hours. This smaller sample of suspect photographs could then be more closely scrutinised by an expert. For this analysis, each face in Sample 1 was compared to each face in Sample 2 for a total of 80 comparisons. The Cosine θ (Z-normalised) distance equation was used and the resulting values were placed in descending order, noting the rank of the TP. The best match was defined as the match that returned the Cosine θ value that was highest or closest to 1.0. This was used to determine, within a confidence range and given a best match value, how many additional faces in the database would need to be verified before the true positive match was found.
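The ranking step described above can be sketched as follows. The identities, the scores and the function name are hypothetical; the sketch only assumes that, for one probe face, a Cosine θ value is available for every face in the database and that higher values indicate closer matches.

```python
def rank_of_true_match(scores, true_id):
    """Return (rank of the true identity, best match value) for one probe.

    scores maps each database identity to its Cosine theta value.
    """
    # Sort identities so that the highest Cosine theta value comes first.
    ordered = sorted(scores, key=scores.get, reverse=True)
    best_value = scores[ordered[0]]
    return ordered.index(true_id) + 1, best_value

# Hypothetical scores for one probe face against four database faces.
scores = {"A": 0.31, "B": 0.66, "C": 0.58, "D": -0.12}
rank, best_value = rank_of_true_match(scores, true_id="C")
print(rank, best_value)  # 2 0.66
```

In this toy case the true face is ranked second, so one additional database image would have to be checked after the best match.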

Best match values were placed in intervals of 0.1. The mean rank of the TP, the SD, and the 2SD confidence interval for each match interval were found and the results are shown in Table 6. In this instance the confidence interval indicates, for a given confidence range, how many database images should be looked at in total. Results in Table 6 indicate that, given match values of 0.7 or above, the best match from the database is also likely to be the TP face. This result is consistent with the observed degree of overlap between the TP and TN distributions shown in Fig. 5. A match threshold of 0.7 is not an unreasonably high value to set, given that a distance of 0.5 indicates the 50% chance level of correlation between compared vectors. A larger sample of images should be tested to determine if results consistent with those presented here are produced.
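The final column of Table 6 (mean + 2SD) can be illustrated with a small Python sketch. It assumes that the SD is the sample standard deviation (n - 1 denominator) and that the bound is rounded up to a whole number of images; both assumptions are consistent with the tabulated values but are not stated in the text, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev

def images_to_check(tp_ranks):
    """Return (mean rank, SD of rank, mean + 2SD rounded up) for one interval."""
    m = mean(tp_ranks)
    # Sample SD (n - 1 denominator); a single rank has no spread.
    sd = stdev(tp_ranks) if len(tp_ranks) > 1 else 0.0
    return m, sd, math.ceil(m + 2 * sd)

def images_from_summary(mean_rank, sd_rank):
    """Same bound, computed from an already tabulated mean and SD."""
    return math.ceil(mean_rank + 2 * sd_rank)

# Two TP ranks of 1 and 3 give a mean of 2 and an SD of about 1.4.
print(images_to_check([1, 3]))
# Tabulated mean 3.6 and SD 7.6 for the 0.60-0.69 interval.
print(images_from_summary(3.6, 7.6))
```

With these assumptions the 0.60-0.69 interval gives 19 images and the two ranks of 1 and 3 give 5 images, as listed in Table 6.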

5. Discussion

Using high resolution photographic research material, the object of the study was to assess if a facial anthropometric feature vector could be utilised to distinguish between individuals of a similar age group, ancestry and sex. Given a database of subjects, knowledge of the type of information gathered in this study may help in future to narrow down the number of possible suspects in an investigation. The technique presented here entailed analysing vector comparisons to differentiate between images of two samples. The feature vector was utilised in three types of equations testing the differences between faces in the samples. Normalisation was applied to the ratio values as a way to equalise the feature vector values in each sample and account (to some degree) for the statistics that different camera parameters would produce. Z-normalisation enhances any differences between means and makes the interpretation of the data more straightforward, allowing small differences in the data to be more simply seen.
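A minimal Python sketch of per-sample Z-normalisation is given below to illustrate how it can equalise feature values recorded under different conditions. The data are invented: the hypothetical video ratios are a shifted and rescaled copy of the hypothetical photograph ratios, standing in for a difference in camera parameters. The use of the sample standard deviation and the function name are assumptions.

```python
from statistics import mean, stdev

def z_normalise_sample(sample):
    """Z-normalise each feature using the mean and SD of that sample.

    sample is a list of feature vectors, one per face, from one image source.
    """
    columns = list(zip(*sample))
    mus = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]
    return [[(x - m) / s for x, m, s in zip(face, mus, sds)] for face in sample]

# Hypothetical ratio vectors for the same three faces in two media.
photos = [[1.0, 0.50], [1.2, 0.54], [1.1, 0.52]]
video = [[0.9, 0.60], [1.1, 0.68], [1.0, 0.64]]

z_photos = z_normalise_sample(photos)
z_video = z_normalise_sample(video)
print(z_photos[0], z_video[0])
```

In this constructed case the raw values differ between the two media, yet the normalised vectors for each face coincide to within floating point error. Real data would not align this cleanly.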

We found that the face matching technology investigated in this study can assist in a database search; however, it does not provide an unequivocal means of confirming facial identifications suitable for use in court. Therefore, future work should concentrate on the potential for this approach to extract facial information that improves the search of databases, leaving humans as the ultimate authenticator.

The first step in answering the objectives laid out in the introduction was to evaluate each sample of images to determine if, once the equations were applied, any differences could be seen between the two samples. Testing faces against those found in the same sample is important because it allows the equations to ascertain if there are any differences between faces which fall under the same conditions. This means that, other than the possibility of slight changes in facial expression, the facial ratios will be the only changeable variable between faces as all other variables remain constant: same media, same operator placing landmarks and same facial pose.

Once samples were looked at individually, a between sample comparison was conducted to determine how distinguishable the faces were in the two samples. Once the respective equations were applied, superimposed normal histogram distribution curves of true positive faces and true negative faces were used to illustrate the discrimination of the two groups. In general, a narrower distribution was seen for the true positive faces. This was because, as the distribution contained only true positive matches, the data should be centred on a smaller range of values. The amount of overlap correlated to the possibility of achieving either a false positive or false negative face match.

Although other research used the squared Euclidean distance to measure the likeness between pairs of faces [32], we found by superimposing the normal curves to demonstrate the separation between true positive and true negative faces that the Cosine θ distance (Z-normalised) equation produced the least amount of overlap between true positive faces and true negative faces when statistics of the two samples were known. The match values of true negative faces in the superimposed histogram normal curves begin to trail off at 0.7, indicating that although it is still possible to achieve a true negative identification above this value, it is likely that a returned match score of below 0.7 will result in a true negative face after closer examination. Although this result occurred in this study, it may not be replicated with a larger test database. The investigations undertaken in this study to determine if it is possible to discriminate between individuals of two samples using a multi-dimensional facial feature vector found that the Cosine θ distance was the best discriminator, but this could be further improved upon by administering a more comprehensive statistical analysis.

A small inter-operator study was carried out to assess the influence of landmark placement conducted by multiple operators. This is important to test because, although landmark placement on all images used in the comparative process of this study was conducted by a single operator, this would not likely be the case in the real world. Landmark placement has been tested by other researchers on 3D images in a clinical setting and it was suggested that average operator error varies widely [30]. Using a digital sliding calliper to measure photographs, researchers carried out an intra-observer study to test reliability of measurements and results showed a low reliability in measurements of ls-sto and n-sn [33]. The current analysis was conducted with one experienced operator but the remaining operators were inexperienced. It would be beneficial to analyse this data further in an inter-operator study using experienced operators located in different geographical regions because this scenario would be more likely as a police procedure. Experience was shown to be a benefiting factor when the inter-operator variation in taking standard skeletal measurements was

Table 6

Interval showing two standard deviations of how many images in the database should be manually investigated.

Interval of best match values  n (number of best matches in the interval)  Mean of TP rank  SD of TP rank  Min of TP rank  Max of TP rank  Number of images to manually investigate in database (mean + 2SD)
0.90-0.99                      0                                           N/A              N/A            N/A             N/A             N/A
0.80-0.89                      2                                           1                0              1               1               1
0.70-0.79                      7                                           1                0              1               1               1
0.60-0.69                      23                                          3.6              7.6            1               37              19
0.50-0.59                      30                                          8.9              13.5           1               65              36
0.40-0.49                      16                                          16.7             28.5           1               110             74
0.30-0.39                      2                                           2                1.4            1               3               5




[29] I.L. Dryden, K.V. Mardia, Statistical Shape Analysis, Wiley-Blackwell, West Sussex, 1998.

[30] A. Ayoub, et al., Validation of a vision-based, three-dimensional facial imaging system, Cleft Palate: Cran. J. 40 (2003) 523-529.

[31] B.J. Adams, J.E. Byrd, Interobserver variation of selected postcranial skeletal measurements, J. Forensic Sci. 47 (2002) 1193-1202.

[32] J.P. Davis, T. Valentine, R.E. Davis, Computer assisted photo-anthropometric analyses of full-face and profile facial images, Forensic Sci. Int. 200 (2010) 165-176.

[33] M. Roelofse, et al., Photo identification: facial metrical and morphological features in South African males, Forensic Sci. Int. 177 (2008) 168-175.

[34] B. Murphy, R.D. Morrison, Introduction to Environmental Forensics, Academic Press, 2007.

[35] D. Sheskin, Handbook of Parametric and Nonparametric Statistical Procedures, Chapman & Hall/CRC, 2007.

[36] T. Fawcett, An introduction to ROC analysis, Pattern Recogn. Lett. 27 (2006) 861-874.

[37] D. DeCarlo, et al., An anthropometric face model using variational techniques, in: Proceedings of the 25th Annual Conference on Computer Graphics and Interactive Techniques, 1998, pp. 67-74.

[38] C. Zhang, S.F. Cohen, 3-D face structure extraction and recognition from images using 3-D morphing and distance mapping, IEEE Trans. Image Proc. 11 (2002) 1249-1259.

[39] M.I.M. Goos, et al., 2D/3D image (facial) comparison using camera matching, Forensic Sci. Int. 163 (2006) 10-17.

[40] J. Lee, et al., Efficient height measurement method of surveillance camera image, Forensic Sci. Int. 177 (2008) 17-23.

[41] H.C. Longuet-Higgins, A computer algorithm for reconstructing a scene from two projections, Nature 293 (1981) 133-135.

