
Determination of Suitable Boundary in Biometric Authentication

MILOSLAV HUB Institute of System Engineering and Informatics

Faculty of Economics and Administration, University of Pardubice Studentska 84, 532 10 Pardubice

CZECH REPUBLIC

Abstract: This article focuses on the problem of setting a suitable boundary in biometric authentication. Biometric authentication differs somewhat from other kinds of authentication. While a password either matches the relevant template saved in a database or does not, biometric characteristics are somewhat stochastic. This means that biometric characteristics (e.g. voice) can never match their relevant templates exactly. For this reason the system analyst has to set how strict the similarity between the submitted biometric characteristics and the relevant template must be for the subject to be considered a valid user and not an impostor. Non-Bayes tasks of statistical decision seem to be a quite suitable tool for solving this problem.

Key-Words: Authentication, identification, biometric, keystroke dynamics, data security, Bayes theory, classification, decision.

1 Introduction

Authentication as a data security measure is very important for keeping data as safe as possible in the framework of the information society [2], [10]. The aim of authentication is to decide whether a subject really is the one it claims to be [18]. There are three types of authentication: authentication by knowledge, authentication by ownership of something, and authentication by attribute. Each has both advantages and disadvantages, and they can be combined to increase the security of information [8].

One possible way to increase the security of access to information systems is to combine authentication by knowledge with authentication by attribute, i.e. to use passwords and keystroke dynamics in parallel. Everyone has a different style of keyboard typing [11], much as everyone has their own signature. This is why this kind of biometric authentication was selected for the research whose results are discussed in this article. However, the principles used for keyboard typing authentication can also be applied to other kinds of authentication, e.g. hand geometry [1].

In keystroke dynamics it is possible to recognize various measurable identifying characteristics: duration times (the difference between the time of a key press and the time of the release of the same key), latency times (the time between the release of one key and the press of the next key), typing speed, position of the finger on the key, pressure on the key and so on.
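The duration and latency times described above can be sketched in a few lines of Python. This is only an illustration: the `KeyEvent` record, the millisecond timestamps and the sample values are assumptions made for the example and do not come from the experiments discussed in this article.

```python
from dataclasses import dataclass


@dataclass
class KeyEvent:
    """Hypothetical record of one keystroke (timestamps in milliseconds)."""
    key: str
    press_ms: float
    release_ms: float


def duration_times(events):
    """Duration time of each key: release time minus press time of the same key."""
    return [e.release_ms - e.press_ms for e in events]


def latency_times(events):
    """Latency time: press time of the next key minus release time of the previous key."""
    return [nxt.press_ms - prev.release_ms for prev, nxt in zip(events, events[1:])]


# Invented sample: the user types "p", "a", "s".
sample = [
    KeyEvent("p", 0.0, 95.0),
    KeyEvent("a", 140.0, 220.0),
    KeyEvent("s", 300.0, 410.0),
]
print(duration_times(sample))  # [95.0, 80.0, 110.0]
print(latency_times(sample))   # [45.0, 80.0]
```

Together the two lists could form the array of identifying characteristics of a subject; a latency time becomes negative when the next key is pressed before the previous one is released.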

2 Problem formulation

The principle of authentication is apparently simple. An authenticated subject submits some previously agreed identifying characteristics, and if these characteristics are recognized the subject is considered to be the claimed subject. Let us denote such a subject as a valid user V. A subject that is not considered a valid user is considered an impostor. Consequently, each subject can be described as given by (1).

$$ s_i = (I_i^*, \vec{x}_i, T_i^*, C_i) \tag{1} $$

$s_i$ ..... $i$-th authenticated subject

$I_i^*$ .... measurable nominal parameter called “assertion about identity by the $i$-th subject”

$\vec{x}_i$ .... measurable parameter called “array of identifying characteristics of the $i$-th subject”

$T_i^*$ ... template of the subject that the $i$-th authenticated subject claims to be

$C_i$ .... class to which the $i$-th authenticated subject belongs (hidden nominal parameter)

The class $C_i$ to which the $i$-th authenticated subject $s_i$ belongs can be defined as (2).

$$ C_i = \begin{cases} V & I_i^* = I_i' \\ I & I_i^* \neq I_i' \end{cases} \tag{2} $$

V .... valid user

I ...... impostor

Proceedings of the 6th WSEAS International Conference on Applied Informatics and Communications, Elounda, Greece, August 18-20, 2006 (pp449-453)

As stated before, the main task of every authentication is to decide whether the submitted proof of identity is to be accepted or rejected, see (3).

$$ D: X \times \mathcal{T} \rightarrow \{A, R\} \tag{3} $$

$D$ ..... set of all possible decisions $d$

$\mathcal{T}$ ..... set of all possible templates

$X$ ..... set of all possible values of identifying arrays

$A$ ..... acceptance of the submitted identifying array as a proof of identity

$R$ ..... rejection of the submitted identifying array as a proof of identity

This decision process does not work perfectly every time. Some valid users might be classified as impostors, and some impostors might be classified as valid users and granted access to the information system. The following two kinds of error are recognized:

1. acceptance error $E_A$

2. rejection error $E_R$

Based on the Bernoulli principle [7] we can derive the following important indexes of the quality of authentication, see (4) and (5).

$$ P(E_R/V) = \lim_{N_V \to \infty} \frac{N_{E_R}}{N_V} \tag{4} $$

$$ P(E_A/I) = \lim_{N_I \to \infty} \frac{N_{E_A}}{N_I} \tag{5} $$

$P(E_R/V)$ . conditional probability of occurrence of a rejection error when a valid user is authenticated

$P(E_A/I)$ .. conditional probability of occurrence of an acceptance error when an impostor is authenticated

$N_{E_R}$ ........... number of occurrences of rejection errors

$N_{E_A}$ ........... number of occurrences of acceptance errors

$N_V$ ............ number of valid users

$N_I$ ............. number of impostors
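For a finite sample, the limits in (4) and (5) can only be approximated by relative frequencies. The following sketch shows that estimate; the function name `error_rates`, the encoding of an attempt as a `(true_class, decision)` pair and the sample data are assumptions made for the illustration.

```python
def error_rates(attempts):
    """Estimate P(E_R/V) and P(E_A/I) as relative frequencies.

    attempts is a list of (true_class, decision) pairs, where true_class is
    "V" (valid user) or "I" (impostor) and decision is "A" (accept) or "R" (reject).
    """
    n_v = sum(1 for c, _ in attempts if c == "V")                 # N_V
    n_i = sum(1 for c, _ in attempts if c == "I")                 # N_I
    n_er = sum(1 for c, d in attempts if c == "V" and d == "R")   # N_ER
    n_ea = sum(1 for c, d in attempts if c == "I" and d == "A")   # N_EA
    return n_er / n_v, n_ea / n_i


# Invented sample: 4 attempts by valid users, 5 attempts by impostors.
attempts = [
    ("V", "A"), ("V", "A"), ("V", "R"), ("V", "A"),
    ("I", "R"), ("I", "A"), ("I", "R"), ("I", "R"), ("I", "R"),
]
p_er, p_ea = error_rates(attempts)
print(p_er, p_ea)  # 0.25 0.2
```

The estimates approach the conditional probabilities only as the numbers of attempts grow, which is what the limits in (4) and (5) express.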

Now the fundamental problem of designing a suitable authentication method is evident. It is necessary to find a function $f$ with whose help the system is able to decide whether to accept or reject the provided proof of identity.

The problem of finding this function $f$ can be divided into the following two steps (sub-problems):

1. To find a suitable similarity measure $d(\vec{x}, T^*)$ between the submitted identifying characteristics and the relevant template of the authenticated subject.

2. To determine a suitable boundary $b$ of the similarity measure for classifying a subject as a valid user or an impostor.

A lot of research has been devoted to the first sub-problem, but the second has been neglected, which is one of the reasons why it is studied here. Keystroke dynamics is used as the biometric authentication method, and residual dispersion using the Laplace probability density is used as the similarity measure $d(\vec{x}, T^*)$ [9].
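The two sub-problems can be sketched as two small functions. Note the assumptions: the `similarity` function below is a plain mean squared deviation used only as a stand-in (it is not the residual dispersion with Laplace probability density of [9]), smaller values are taken to mean greater similarity, and the template and boundary values are invented.

```python
def similarity(x, template):
    """Stand-in for d(x, T*): mean squared deviation (smaller = more similar)."""
    if len(x) != len(template):
        raise ValueError("identifying array and template must have equal length")
    return sum((a - t) ** 2 for a, t in zip(x, template)) / len(x)


def f(x, template, b):
    """Accept ("A") when the similarity measure does not exceed boundary b, else reject ("R")."""
    return "A" if similarity(x, template) <= b else "R"


template = [95.0, 80.0, 110.0]                      # hypothetical duration times
print(f([97.0, 79.0, 112.0], template, b=10.0))     # A, since d = (4 + 1 + 4) / 3 = 3.0
print(f([120.0, 60.0, 90.0], template, b=10.0))     # R, since d = (625 + 400 + 400) / 3 = 475.0
```

Separating the two functions mirrors the split above: the similarity measure can be replaced without touching the rule that applies the boundary.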

3 Bayes task of statistical decision

The Bayes tasks of statistical decision, which follow from Bayes' theorem [4], are formulated in [5], [6] and [16]. The goal of a Bayes task of statistical decision is to find a strategy $Q$ that minimizes the Bayes risk $R(Q)$. Applying this to authentication yields equation (6).

$$ R(Q) = \int_{d(\vec{x}, T^*)} \Big[ P(d(\vec{x}, T^*), V) \cdot W(V, Q(d(\vec{x}, T^*))) + P(d(\vec{x}, T^*), I) \cdot W(I, Q(d(\vec{x}, T^*))) \Big] \, \mathrm{d}\,d(\vec{x}, T^*) \tag{6} $$

$R(Q)$ ........ Bayes risk of strategy $Q$

$P(d(\vec{x}, T^*), V)$ ........ probability density of the situation when a valid user $V$ is authenticated and the similarity measure is $d(\vec{x}, T^*)$

$P(d(\vec{x}, T^*), I)$ ........ probability density of the situation when an impostor $I$ is authenticated and the similarity measure is $d(\vec{x}, T^*)$

$W(V, Q(d(\vec{x}, T^*)))$ .. cost of the decision of the strategy $Q$, which is based on $d(\vec{x}, T^*)$, when the claimed user is a valid user $V$

$W(I, Q(d(\vec{x}, T^*)))$ ... cost of the decision of the strategy $Q$, which is based on $d(\vec{x}, T^*)$, when the claimed user is an impostor $I$

As is evident from equation (6), the Bayes task of statistical decision cannot be used for authentication purposes for the following reasons:


• The penalty function $W$ need not be known.

• The a priori probabilities $P(V)$ and $P(I)$ are not known. These probabilities are necessary for the computation of $P(d(\vec{x}, T^*), V)$ and $P(d(\vec{x}, T^*), I)$.

• The probabilities $P(d(\vec{x}, T^*)/V)$ and $P(d(\vec{x}, T^*)/I)$ are in fact contingent probabilities. These probabilities are necessary for the computation of $P(d(\vec{x}, T^*), V)$ and $P(d(\vec{x}, T^*), I)$.

4 Non-Bayes tasks of statistical decision

4.1 Neyman-Pearson task

Each object can occur in a normal state or a dangerous state. The goal of the Neyman-Pearson task is to find a strategy that decomposes the set of features of the observed objects so that the conditional probability of overlooked danger does not exceed a predetermined value. Among the strategies that satisfy this condition, the one with the minimum conditional probability of false alarm is chosen [12], [13].

In this problem, a strategy is sought that minimizes the conditional probability of wrong rejection of a valid user $P(E_R/V)$ under the condition that the conditional probability of wrong acceptance of an impostor $P(E_A/I)$ is at most $\varepsilon_A$. In this model the set of features is transformed into a single characteristic: the similarity measure $d(\vec{x}, T^*)$ between the submitted identifying characteristics and the relevant template.

This approach does not consider any alternative way of authentication. This is why it must be guaranteed that each valid user can be accepted at least once per session; that is, a problem-user must have a chance to be correctly accepted. The boundary of the similarity measure at which every valid user is correctly accepted at least once per session is marked in the graph as point A (see Fig. 1). At point A the conditional probability of wrong acceptance of an impostor is $\varepsilon_A^{\min}$. At the same time, it is not advantageous to set a boundary for which $\varepsilon_A > \varepsilon_A^{\max}$, because while $P(E_R/V) = \text{const.} = 0$, $P(E_A/I)$ keeps increasing. This situation is marked as point B in the graph in Fig. 1.

Fig. 1: Neyman-Pearson task

It is evident that the concrete value of $\varepsilon_A$ has to lie between the values $\varepsilon_A^{\min}$ and $\varepsilon_A^{\max}$, hence $\varepsilon_A \in \langle \varepsilon_A^{\min}, \varepsilon_A^{\max} \rangle$.

On the basis of the experiments in which keyboard dynamics authentication was investigated, the conditional probabilities of wrong acceptance of an impostor were determined as $\varepsilon_A^{\min} = 0.226$ and $\varepsilon_A^{\max} = 0.485$.

These values correspond to the boundaries $b_{\min} = 0.614\ \mu s^2$ and $b_{\max} = 0.681\ \mu s^2$.
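A minimal sketch of how such a boundary could be searched for on measured data follows. It assumes that smaller values of the similarity measure mean greater similarity and that an attempt is accepted when its value does not exceed the boundary; the function name, the restriction of candidate boundaries to observed values and the sample scores are illustrative assumptions, not the data or procedure of the experiments reported here. The additional requirement that every problem-user is accepted at least once per session is not modelled.

```python
def neyman_pearson_boundary(valid_scores, impostor_scores, eps_a):
    """Return (b, P(E_R/V), P(E_A/I)) minimising P(E_R/V) subject to P(E_A/I) <= eps_a.

    Only observed scores are tried as candidate boundaries. Returns None when
    no candidate boundary satisfies the constraint.
    """
    candidates = sorted(set(valid_scores) | set(impostor_scores))
    best = None
    for b in candidates:
        # Relative frequency of accepted impostors and of rejected valid users at boundary b.
        p_ea = sum(d <= b for d in impostor_scores) / len(impostor_scores)
        p_er = sum(d > b for d in valid_scores) / len(valid_scores)
        if p_ea <= eps_a and (best is None or p_er < best[1]):
            best = (b, p_er, p_ea)
    return best


# Invented similarity values for attempts by valid users and by impostors.
valid = [0.40, 0.52, 0.58, 0.61, 0.70]
impostors = [0.55, 0.63, 0.66, 0.72, 0.80]
print(neyman_pearson_boundary(valid, impostors, eps_a=0.2))  # (0.61, 0.2, 0.2)
```

Because the relative frequency of rejection errors can only fall as the boundary is loosened, the search effectively returns the loosest candidate boundary that still respects the limit on acceptance errors.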

This task can also be reformulated. The conditional probability of overlooked danger (here the conditional probability of wrong acceptance of an impostor $P(E_A/I)$) is minimized under the condition that the conditional probability of false alarm (here the conditional probability of wrong rejection of a valid user $P(E_R/V)$) is at most $\varepsilon_R$. Problem-users must be considered in this reformulated Neyman-Pearson task too: each valid user, including problem-users, must have a chance to be correctly accepted. This situation is marked as point A in the graph in Fig. 1; at this point $\varepsilon_R^{\max} = 0.174$. However, the boundary of the similarity measure can be higher (less strict), which leads to a decrease of $P(E_R/V)$. Then $\varepsilon_R \in \langle 0, \varepsilon_R^{\max} \rangle$ can be defined. Again, it is not advantageous to set a boundary of the similarity measure for which $P(E_A/I) > \varepsilon_A^{\max}$. For this reason, and on the basis of the experiments, it is more appropriate to set $b \in \langle 0.614\ \mu s^2, 0.681\ \mu s^2 \rangle$ in keyboard authentication.

4.2 Mini-max task

In this task, a decomposition of the set of features is sought for which the maximum of the conditional probabilities of wrong categorization of a subject is minimal [14]. The solution of this task is the boundary at which $P(E_A/I) = P(E_R/V)$. In keystroke dynamics the boundary has to be set to $b = 0.602\ \mu s^2$, because at this boundary $P(E_A/I) = P(E_R/V) = 0.190$. But this boundary brings a problem: it is smaller than the boundary at which each valid user is correctly accepted at least once.

4.3 Wald task

The goal of the Wald task is to find a decomposition of the feature set of the surveyed subject such that the conditional probability of wrong classification into groups is at most $\varepsilon$, where $\varepsilon$ is some predetermined value [15], [17]. These requirements can be contradictory in some situations. That is why this task defines not two subsets, as the previous tasks do, but three. In the investigated problem these subsets mean “accept”, “reject” and “I do not know”. When a subject is classified into the third subset, “I do not know”, an alternative authentication system has to make the final decision whether the subject is a valid user or an impostor.

The Wald task assumes the same value of $\varepsilon$ for both the maximum conditional probability of wrong acceptance of an impostor and the maximum conditional probability of wrong rejection of a valid user. For our purposes let us define different values of these conditional probabilities: $\varepsilon_A$ as the maximum conditional probability of wrong acceptance of an impostor and $\varepsilon_R$ as the maximum conditional probability of wrong rejection of a valid user. This modified task has the following solution. Let $b_A$ denote the boundary at which $P(E_R/V)$ is minimal under the condition that $P(E_A/I) = \varepsilon_A$. Similarly, let $b_R$ denote the boundary at which $P(E_A/I)$ is minimal under the condition that $P(E_R/V) = \varepsilon_R$. Then, if $b_R \geq b_A$, the authentication decision $d$ is made as given by (7). When $b_R < b_A$, the task has infinitely many solutions and can be resolved by the Neyman-Pearson task of statistical decision.

$$ d = \begin{cases} A & d(\vec{x}, T^*) \leq b_A \\ \text{I don't know} & d(\vec{x}, T^*) > b_A \wedge d(\vec{x}, T^*) < b_R \\ R & d(\vec{x}, T^*) \geq b_R \end{cases} \tag{7} $$

$d$ ............. decision of the authentication system

$A$ ............. acceptance of the submitted identifying characteristics as a proof of identity

$R$ ............. rejection of the submitted identifying characteristics as a proof of identity

$d(\vec{x}, T^*)$ . value of the similarity measure between the submitted identifying characteristics and the relevant template

$b_A$ ............ boundary of the similarity measure at which $P(E_R/V)$ is minimal under the condition that $P(E_A/I) = \varepsilon_A$

$b_R$ ............ boundary of the similarity measure at which $P(E_A/I)$ is minimal under the condition that $P(E_R/V) = \varepsilon_R$

Fig. 2 shows a possible solution of this task. When the particular value of the similarity measure $d(\vec{x}, T^*)$ is no more than $b_A$, the proof of identity is accepted and the user is assumed to be a valid user. If this value is at least $b_R$, the proof of identity is rejected. An alternative authentication is used if $d(\vec{x}, T^*)$ lies strictly between $b_A$ and $b_R$.

Fig. 2: Wald task
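The three-way rule of (7) can be written directly as a small function. The sketch below is only illustrative: the function name, the returned labels and the boundary values in the usage lines are hypothetical and are not results of the experiments.

```python
def wald_decision(d, b_a, b_r):
    """Three-way decision of (7): accept, reject, or defer to an alternative authentication.

    d   -- value of the similarity measure d(x, T*)
    b_a -- boundary b_A, below or at which the proof of identity is accepted
    b_r -- boundary b_R, at or above which the proof of identity is rejected
    """
    if b_r < b_a:
        # The modified Wald task in the text is only defined for b_R >= b_A.
        raise ValueError("b_R must not be smaller than b_A")
    if d <= b_a:
        return "A"
    if d >= b_r:
        return "R"
    return "I don't know"


# Hypothetical boundaries chosen only for the demonstration.
print(wald_decision(0.55, b_a=0.60, b_r=0.68))  # A
print(wald_decision(0.64, b_a=0.60, b_r=0.68))  # I don't know
print(wald_decision(0.70, b_a=0.60, b_r=0.68))  # R
```

When the two boundaries coincide, the undecided zone is empty and the rule reduces to a single-boundary decision.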


5 Conclusion

The Bayes task of statistical decision is not suitable for determining the boundary of the similarity measure, because it is not possible to evaluate the Bayes risk of any strategy. This is why non-Bayes tasks of statistical decision have to be used. When an alternative authentication (i.e. an authentication used when the standard authentication is not able to decide whether a subject is a valid user or an impostor) cannot be guaranteed, the Neyman-Pearson task of statistical decision can be used. When an alternative authentication is available, the Wald task can be used. The mini-max task of statistical decision might not be a suitable strategy, because some valid users (problem-users) might not be granted access to the information system. Although this approach is applied to keystroke authentication in this research, the same principles can be used for other kinds of biometric authentication as well.

References:

[1] Artazi, P. R., et al.: Hand geometric and hand print texture based prototype for identity authentication. WSEAS Transactions on Systems, Issue 2, Vol. 3, April 2004, pp. 526-532. ISSN 1109-2777.

[2] Čapek, J.: User identification by information system (original in Czech). Scientific Papers of the University of Pardubice, Ser. D, pp. 21-25. Pardubice, 2004. ISSN 1211-555X, ISBN 80-7194-716-4.

[3] Čapek, J., Hub, M.: Fuzzy Approach in Biometric Authentication by Keystroke Dynamics. WSEAS Transactions on Systems, 2005, Issue 4, Volume 4. ISSN 1109-2777.

[4] Bayes, T.: An essay towards solving a problem in the doctrine of chances. Philosophical Transactions of the Royal Society, London, 1763. Reprinted in Biometrika, Vol. 45, 1958, pp. 298-315.

[5] Devijver, P. A., Kittler, J.: Pattern Recognition: A Statistical Approach. 1st printing. New York: Prentice-Hall, Englewood Cliffs, 1982.

[6] Duda, R. O., Hart, P. E.: Pattern Classification and Scene Analysis. 1st printing. New York: John Wiley and Sons, 1973.

[7] Hendl, J.: Overview of statistical data analysis (original in Czech). 1st ed. Praha: Portál, 2004. 583 pp. ISBN 80-7178-820-1.

[8] Hub, M.: Strategy of the choice identification signs within multifactorial authentication (original in Czech). E+M Economics and Management, pp. 147-150. Liberec, 2003. ISSN 1212-3609.

[9] Hub, M.: Data security – authentication (original in Czech). 1st ed. Pardubice: Univerzita Pardubice, 2005. ISBN 80-7194-825-X.

[10] Komárková, J., Šimonová, S., Dušek, V.: Geographic Information on the Web. WSEAS Transactions on Information Science and Applications, November 2004, Vol. 1, Issue 5, pp. 1185-1188. ISSN 1790-0832.

[11] Leggett, J., Williams, G., Usnick, M.: Dynamic identity verification via keystroke characteristics. International Journal of Man-Machine Studies, Vol. 36, pp. 859-870, Sept. 1990.

[12] Neyman, J., Pearson, E. S.: On the use and interpretation of certain test criteria for purposes of statistical inference. Biometrika, 1928, 20A, pp. 175-240.

[13] Neyman, J., Pearson, E. S.: On the problem of the most efficient tests of statistical hypotheses. Phil. Trans. Royal Soc. London, 1933, 231, pp. 289-337.

[14] Schlesinger, M. I., Hlaváč, V.: Deset přednášek z teorie statistického a strukturálního rozpoznávání [Ten lectures on the theory of statistical and structural recognition] (original in Czech). 1st ed. Praha: Vydavatelství ČVUT, 1999. 521 pp. ISBN 80-01-01998-5.

[15] Wald, A.: Sequential Analysis. New York: John Wiley, 1974.

[16] Wald, A.: Basic ideas of a general theory of statistical decision rules. Proceedings of the International Congress of Mathematicians, 1950, Vol. I.

[17] Wald, A., Wolfowitz, J.: Optimum character of the sequential probability ratio test. Ann. Math. Stat., 1948, 19(3), pp. 326-339.

[18] A Guide to Understanding I&A. National Computer Security Center. NCSC-TG-017, Library No. 5-235,479. Version 1.