Keystroke recognition using Android devices
João Paulo Sim-Sim Lopes
Thesis to obtain the Master of Science Degree in
Electrical and Computer Engineering
Supervisor: Prof. Paulo Luís Serras Lobato Correia
Examination Committee
Chairperson: Prof. José Eduardo Charters Ribeiro da Cunha Sanguino
Supervisor: Prof. Paulo Luís Serras Lobato Correia
Members of the committee: Prof. Rui Jorge Henrique Calado Lopes
April 2015
Abstract

The term "biometrics" derives from the Greek words "bios" (life) and "metron" (measure). Biometric recognition is therefore concerned with recognizing people based on their characteristics. Automatic biometric recognition systems have become available over the last decades, thanks to significant advances in computing. Until recently, however, dedicated devices were needed for biometric recognition. Nowadays smartphones have considerable processing power, enough to implement some biometric algorithms. Their increased market penetration has created a demand for biometric recognition, since these devices hold sensitive personal information. Securing access to that information requires some form of protection against illegitimate users. Biometric security is therefore a must on these devices, given that traditional PINs (Personal Identification Numbers) can be stolen, forgotten or cracked. Personal characteristics, on the other hand, are unique, cannot be forgotten and are hard to steal, making biometric validation superior to PIN usage and creating a demand for biometric validation applications to secure people's information. To increase security, a PIN is commonly used together with biometric identification. The system proposed in this dissertation monitors the typing pattern of mobile phone users while they write on keyboards (keystroke dynamics) and then uses this pattern to protect the phone from unauthorized users. For the classification stage, the proposed system can use an algorithm based either on Euclidean distances or on Support Vector Machines (SVM). Encouraging results were obtained with the SVM classifier.
Keywords
Biometric recognition, personal identification, keystroke dynamics, smartphone, Euclidean distances,
Support Vector Machine.
Resumo

The term "biometrics" is derived from the Greek words "bio" (life) and "metrics" (to measure). Biometric recognition is concerned with recognizing people based on their characteristics. Automatic biometric recognition systems have become available over the last decades, due to significant advances in computing. Until recently, however, specific devices were needed for biometric recognition. Nowadays, smartphones have considerable processing power, allowing the implementation of some biometric algorithms. A demand for biometric recognition was created by the increased market penetration of smartphones, since these devices hold sensitive personal information. To have secure access to sensitive information, some kind of protection against illegitimate users is needed. Biometric security is therefore a must on these devices, given that traditional PINs (Personal Identification Numbers) can be stolen, forgotten or discovered. Personal characteristics, on the other hand, are unique, cannot be forgotten and are hardly ever stolen, making biometric validation superior to PIN usage and creating a demand for biometric validation applications to secure people's information. To increase the security of a PIN, it is common to use biometric identification techniques alongside it. The system proposed in this dissertation aims to monitor the typing pattern of mobile phone users while they write on keyboards (keystroke) and then use this pattern to protect the phone from unauthorized users. For the classification stage, the proposed system can use an algorithm based either on Euclidean distances or on Support Vector Machines (SVM). Encouraging results were obtained with the SVM classifier.

Keywords

Biometric validation, personal identification, keystroke dynamics, smartphone, Euclidean distances, Support Vector Machine.
Table of Contents
Abstract ................................................................................................................................................... i
Resumo .................................................................................................................................................. iii
1 Introduction .................................................................................................................................... 1
1.1 Context ................................................................................................................................... 1
1.2 Biometric systems ................................................................................................................. 1
1.3 Objectives .............................................................................................................................. 4
1.4 Contributions ......................................................................................................................... 5
1.5 Organization of the text ........................................................................................................ 6
2 Mobile biometrics state of the art ............................................................................................... 7
2.1 Biometric recognition techniques ....................................................................................... 7
2.1.1 Keystroke dynamics ..................................................................................................... 7
2.1.2 Face recognition ........................................................................................................... 8
2.1.3 Iris scan .......................................................................................................................... 9
2.1.4 Voice recognition .......................................................................................................... 9
2.1.5 Hand geometry ............................................................................................................ 10
2.1.6 Gait ................................................................................................................................ 10
2.1.7 Handwritten biometric signatures ............................................................................. 11
2.1.8 Choosing the technique ............................................................................................. 12
2.2 Keystroke dynamics as a biometric trait .......................................................................... 12
2.2.1 Input sensor ................................................................................................................. 12
2.2.2 Features ....................................................................................................................... 13
2.2.3 Classification techniques ........................................................................................... 14
2.2.4 Keystroke models ....................................................................................................... 17
2.2.5 Conclusion ................................................................................................................... 19
3 Proposed keystroke dynamics recognition application ......................................................... 20
3.1 Architecture .......................................................................................................................... 20
3.2 Capturing user input ........................................................................................................... 21
3.3 Classification and decision ................................................................................................ 22
4 Results .......................................................................................................................................... 26
4.1.1 Average key timing measures .................................................................................. 26
4.1.2 Euclidean distances ................................................................................................... 29
4.1.3 SVM .............................................................................................................................. 32
4.1.4 Conclusion ................................................................................................................... 36
5 Using the application .................................................................................................................. 36
6 Conclusions and further Work .................................................................................................. 42
6.1 Summary and conclusion .................................................................................................. 42
6.2 Further work......................................................................................................................... 42
7 References .................................................................................................................................. 44
Index of Figures
Figure 1 – Generic biometric system ..... 2
Figure 2 – Identification vs. Verification (griaulebiometrics, 2014) ..... 3
Figure 3 – Classification of user authentication approaches ..... 3
Figure 4 – FRR, FAR and CER ..... 4
Figure 5 – Generic architecture of a biometric recognition system ..... 5
Figure 6 – Timing intervals between consecutive key presses (McLoughlin & Mohanavel, 2009) ..... 8
Figure 7 – Face preprocessing (Tao & Veldhuis, 2006) ..... 9
Figure 8 – Voice recognition process (Shabeer & Suganthi, 2007) ..... 10
Figure 9 – Gait cycle phases (physio-pedia, 2015) ..... 11
Figure 10 – Different types of input ..... 13
Figure 11 – Various keystroke features ..... 14
Figure 12 – Classification techniques for keystroke dynamics (Support Vector Machine, Back-Propagation Neural Network, Predictive Adaptive Resonance Theory, Radial Basis Function Network) ..... 16
Figure 13 – Digraph ..... 16
Figure 14 – Monograph ..... 17
Figure 15 – User enrollment process (Awad & Traore, 2013) ..... 18
Figure 16 – Verification process (Awad & Traore, 2013) ..... 19
Figure 17 – System architecture ..... 20
Figure 18 – Dwell time ..... 21
Figure 19 – Flight time ..... 21
Figure 20 – Soft keyboard ..... 22
Figure 21 – Illustrative configuration of hashmap from training labels ..... 23
Figure 22 – Illustrative configuration of hashmap from train and test features ..... 24
Figure 23 – SVM linear kernel illustration (Ranga, 2015) ..... 25
Figure 24 – SVM RBF kernel illustration (openclassroom.stanford.edu, 2015) ..... 26
Figure 25 – Average dwell time from all users (mxplayer) ..... 27
Figure 26 – Average flight time from all users (mxplayer) ..... 27
Figure 27 – Average dwell time from all users (Lisboa2014) ..... 28
Figure 28 – Average flight time from all users (Lisboa2014) ..... 28
Figure 29 – Average dwell time from all users (tecnicoLisboa) ..... 29
Figure 30 – Average flight time from all users (tecnicoLisboa) ..... 29
Figure 31 – ROC curve for 'mxplayer' (Euclidean distances) ..... 31
Figure 32 – ROC curve for 'Lisboa2014' (Euclidean distances) ..... 31
Figure 33 – ROC curve for 'tecnicoLisboa' (Euclidean distances) ..... 32
Figure 34 – Probability of the claimed user being the true user (mxplayer) ..... 33
Figure 35 – Probability of the claimed user being the true user (Lisboa2014) ..... 33
Figure 36 – Probability of the claimed user being the true user (tecnicoLisboa) ..... 34
Figure 37 – ROC curve for 'mxplayer' (SVM) ..... 35
Figure 38 – ROC curve for 'Lisboa2014' (SVM) ..... 35
Figure 39 – ROC curve for 'tecnicoLisboa' (SVM) ..... 36
Figure 40 – Login screen ..... 37
Figure 41 – Password choices ..... 37
Figure 42 – Confirmation box for a user that already exists ..... 37
Figure 43 – Main screen ..... 38
Figure 44 – Training screen ..... 39
Figure 45 – Training accepted screen ..... 39
Figure 46 – Box to choose an algorithm to proceed with verification ..... 40
Figure 47 – Imposter message ..... 40
Figure 48 – Verification screen after a user is approved ..... 41
Index of Tables
Table 1 – ROC evaluation for the best accuracy achieved (Euclidean distances) ..... 30
Table 2 – ROC evaluation for the best accuracy achieved (SVM) ..... 34
List of Acronyms
ARTMAP Predictive Adaptive Resonance Theory
AUC Area Under the Curve
BPNN Back-Propagation Neural Networks
CER Crossover Error Rate
DTW Dynamic Time Warping
EER Equal Error Rate
FRR False Rejection Rate
FAR False Acceptance Rate
FP False Positive
FN False Negative
GPS Global Positioning System
ID Identity
MCS Multiple Classifier System
OS Operating System
PIN Personal Identification Number
RBFN Radial Basis Function Network
RBF Radial Basis Function
ROC Receiver Operating Characteristic
SVM Support Vector Machine
TP True Positive
TN True Negative
1 Introduction
1.1 Context
Mobile phones have a central role in everyday life. Worldwide, the number of active cellphones
now exceeds the world population, and the same penetration growth trend is observed in Portugal.
Among these, smartphones are assuming an increasing share of the market. Smartphones are in fact
small computers, with increasingly powerful processors and considerable amounts of memory and
storage capabilities. Also, smartphones include displays capable of providing friendly graphical
interfaces and offer touch sensitive screens. This allows the development of advanced applications
covering all aspects of life: from voice communications, to internet access, personal entertainment, or
even to handle mobile payments and other financial applications. Being small computers, smartphones are governed by an operating system. The dominant operating system in the market today is Android, with more than 50% market share, followed by iOS with 42%, the remainder being split among Microsoft, BlackBerry and Symbian, according to (Mobile Marketing, 2015).
Since smartphones appeared and took charge of our information and communications, there has been a need to enhance the security of these devices. For example, there are applications to track smartphones through their GPS unit and control the device remotely, antivirus tools, backup services, etc. In addition, most people use a pattern (a combination of movements that unlocks the phone screen) or a PIN to access the device; however, these are easy to observe and crack. Nowadays, a few smartphones already offer biometric recognition, such as face (Nexus phones) and fingerprint (iPhone 5S). However, combining a biometric trait with PINs has not been commercially explored. This work explores a person's typing on smartphones as a biometric trait in order to increase access security.
1.2 Biometric systems
Knowledge-based authentication identifies the user through something the user knows, such as credentials used to access a service (e.g., a website). Object-based authentication relies on something the user has, comparing the attributes of the presented object to what is known about the legitimate one. Finally, biometric-based authentication uses a unique characteristic of the user (e.g., hand, iris, face, keystroke pattern) to create a template that is stored in the system. When the user later wants to authenticate, the presented traits are compared to the ones stored in the template.

Figure 3 represents three different types of user authentication currently available. The knowledge-based type seeks to identify the user by requiring personal information; a good example is the security questions some websites use when recovering a password. Object-based authentication is somewhat different, as it requires something external to the user to establish identity. Finally, and most important here, biometric authentication is divided into two categories: physiological, associated with the physical characteristics of the user, and behavioral, which describes how a user behaves while performing an action.
Biometric systems are automated methods that verify or recognize the identity of a person based on a physical, physiological or behavioral characteristic. When combined with traditional security methods, they provide an extra level of security. Examples of biometric characteristics are fingerprints, face, iris, and others enumerated in the section entitled 'Mobile biometrics state of the art'.
Enrollment is the process of collecting biometric data from a user and storing it in the system. Authentication is the identification or verification of the user's identity, by matching the data provided by the user against the data stored in the system. During enrollment, the biometric system stores the biometric traits of the user; during authentication, these traits are used to recognize the user who provides a biometric sample. Depending on how the biometric system is designed, it can operate in two different modes, verification and identification, as shown in Figure 2. In identification mode, the user is not required to claim an identity: the biometric trait provided by the user is matched against all the users enrolled in the template database, in order to find (or fail to find) a matching identity. In verification mode, on the other hand, the biometric trait is matched only against the claimed user's enrollment template.
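The contrast between the two operating modes can be sketched in a few lines of Python. The one-dimensional "templates", the similarity score and the threshold below are purely illustrative assumptions, not the system described later in this work.

```python
# Toy contrast between identification (1:N) and verification (1:1).
templates = {"alice": 100, "bob": 200, "carol": 300}  # enrolled "features"

def score(template, sample):
    # Similarity: the closer the sample is to the template, the higher the score
    return -abs(template - sample)

def identify(sample):
    # Identification: compare against every enrolled template, pick the best
    return max(templates, key=lambda uid: score(templates[uid], sample))

def verify(claimed_id, sample, threshold=-25):
    # Verification: compare only against the claimed user's template
    return score(templates[claimed_id], sample) >= threshold

print(identify(210))         # -> bob
print(verify("bob", 210))    # -> True
print(verify("alice", 210))  # -> False
```

Note that identification must scan the whole database, while verification performs a single comparison, which is why verification is the cheaper mode on a resource-limited device.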
Enrollment is an important step for the accuracy of the template, so it should not be limited to a one-time step; the user template should keep being updated. As observed in Figure 1, a generic biometric system is composed of five major components: a sensor, feature extraction, a matcher, a decision maker and a template store. The first component, the biometric sensor, is responsible for scanning the user's biometric trait, being the interface between the user and the authentication system. Next, feature extraction extracts the salient data that distinguishes different users; during enrollment, the extracted data is stored in a template. The matcher compares the input with the template and outputs a similarity score between the two. Finally, the decision module makes the authentication decision.
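As a rough illustration of how these five components fit together, the following Python sketch implements the pipeline under toy assumptions (timing vectors as features, negative Euclidean distance as similarity, an arbitrary threshold); it is not the implementation described in section 3.

```python
# Hypothetical sketch of the five-component pipeline: sensor input,
# feature extraction, template storage, matcher, decision.

def extract_features(raw_sample):
    # Feature extraction: reduce a raw sensor sample to salient numbers.
    # Here the "sensor" already delivers a list of timing measurements.
    return list(raw_sample)

def enroll(template_db, user_id, raw_sample):
    # Enrollment: store the extracted features as the user's template
    template_db[user_id] = extract_features(raw_sample)

def match(template, features):
    # Matcher: similarity expressed as negative Euclidean distance
    return -(sum((a - b) ** 2 for a, b in zip(template, features)) ** 0.5)

def verify(template_db, claimed_id, raw_sample, threshold=-10.0):
    # Decision: accept if the new sample is similar enough to the template
    return match(template_db[claimed_id], extract_features(raw_sample)) >= threshold

db = {}
enroll(db, "alice", [120, 95, 110])
print(verify(db, "alice", [118, 97, 112]))  # similar timings -> True
print(verify(db, "alice", [300, 20, 250]))  # very different timings -> False
```

Repeating `enroll` with fresh samples and averaging them into the stored template is one simple way to realize the template updating mentioned above.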
[Block diagram: Sensor → Feature extraction → Matcher (with Template) → Decision → Application device]
Figure 1 – Generic biometric system
Figure 2 – Identification vs. Verification (griaulebiometrics, 2014)
Any biometric system will exhibit occasional false acceptances of intruders and false rejections of legitimate users. The corresponding False Accept Rate (FAR) and False Reject Rate (FRR), as well as the Equal Error Rate (EER, or CER for Crossover Error Rate), the operating point where FAR equals FRR, are important metrics for evaluating a biometric system. FAR ought to be low, as it specifies the probability that an impostor can use the device; FRR should also be low, since a high rejection rate is inconvenient for the legitimate user. An illustration of these parameters can be seen in Figure 4, which shows that FAR and FRR move in opposite directions as the decision threshold changes.
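A minimal way to obtain FAR, FRR and an approximate EER from lists of genuine and impostor match scores is sketched below; the score values and the threshold sweep are made-up assumptions for demonstration only.

```python
# FAR/FRR at a given threshold, plus a threshold sweep to locate the EER.
def far_frr(genuine, impostor, threshold):
    far = sum(s >= threshold for s in impostor) / len(impostor)  # accepted impostors
    frr = sum(s < threshold for s in genuine) / len(genuine)     # rejected genuines
    return far, frr

def equal_error_rate(genuine, impostor):
    # Sweep candidate thresholds; the EER is where FAR and FRR meet
    candidates = sorted(set(genuine) | set(impostor))
    best = min(candidates,
               key=lambda t: abs(far_frr(genuine, impostor, t)[0]
                                 - far_frr(genuine, impostor, t)[1]))
    far, frr = far_frr(genuine, impostor, best)
    return best, (far + frr) / 2

genuine = [0.9, 0.8, 0.85, 0.7, 0.6]   # scores from the legitimate user
impostor = [0.2, 0.3, 0.4, 0.65, 0.1]  # scores from other users
threshold, eer = equal_error_rate(genuine, impostor)
```

Raising the threshold lowers FAR but raises FRR, which is exactly the trade-off Figure 4 depicts.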
Figure 3 - Classification of user authentication approaches
Figure 4 – FRR, FAR and CER
Besides the FRR and FAR metrics, there are others such as sensitivity, specificity and accuracy. These five metrics will be used to assess the obtained results in section 4. Sensitivity, also called the true positive rate, measures the proportion of actual positives that are correctly identified, and is complementary to the false negative rate. Specificity, also called the true negative rate, measures the proportion of negatives that are correctly identified, being complementary to the false positive rate. Sensitivity and specificity can be calculated according to formulas (1) and (2), respectively, where TP, TN, FP and FN stand for true positives, true negatives, false positives and false negatives. Accuracy, calculated using equation (3), assesses how well the system behaves overall, and allows choosing the optimal operating threshold for a system.
Sensitivity = TP / (TP + FN)    (1)

Specificity = TN / (TN + FP)    (2)

Accuracy = (TP + TN) / (TP + TN + FP + FN)    (3)
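Formulas (1)-(3) can be exercised with a small helper; the TP/TN/FP/FN counts below are made-up numbers used only to illustrate the arithmetic.

```python
# Sensitivity, specificity and accuracy from confusion-matrix counts.
def rates(tp, tn, fp, fn):
    sensitivity = tp / (tp + fn)                 # eq. (1), true positive rate
    specificity = tn / (tn + fp)                 # eq. (2), true negative rate
    accuracy = (tp + tn) / (tp + tn + fp + fn)   # eq. (3)
    return sensitivity, specificity, accuracy

sens, spec, acc = rates(tp=40, tn=50, fp=10, fn=0)
print(sens, acc)  # -> 1.0 0.9
```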
1.3 Objectives
This work focuses on biometric authentication using keystroke dynamics on mobile devices, which exploits a person's unique typing pattern to help identify that person. This pattern is difficult to observe directly but is produced by anyone able to press a keypad. Keystroke dynamics is a biometric trait usable for authentication that has not been widely exploited so far. However, there is much interest in developing it for mobile phones, since smartphone usage is growing rapidly and these devices carry a lot of personal information, currently secured by PINs or patterns which can be stolen or forgotten.
Given the rapid increase observed in the adoption of smartphones and of the corresponding
applications market, there is also an opportunity for the development of biometric security applications.
As such, there are three main types of mobile keyboards that need to be considered: numeric, thumb-based (QWERTY) and soft keyboards (touch keyboards). Due to this variety, adapting keystroke analysis methods originally developed for PC/traditional keyboards to the mobile phone case raises some challenges. The main aspects to take into consideration include:

- Usage of small keys – mobile devices are limited in size, leading to smaller keyboards. Users tend to make more writing errors and may pause mid-sentence, raising the challenge of identifying which keystrokes are valid.
- Key shape and response to the applied pressure make keystroke analysis for mobile handsets significantly different from the analysis performed on traditional keyboards.
- Mobile devices have limited memory and CPU capability, so the algorithms used ought to be simple.
1.4 Contributions
The main objective of this work is to develop an application for Android OS smartphones which performs biometric verification of the user based on keystroke dynamics when entering a password. The developed biometric recognition system follows the general architecture represented in Figure 5. Based on that architecture, the author has designed, implemented and tested a biometric verification system capable of identifying users. However, as the smartphone's standard keyboard does not expose some of the functionalities necessary for this work, another keyboard had to be developed in order to proceed with the remaining work. The majority of the corresponding software implementation has been developed by the author. A deeper explanation of all these steps is provided in section 3.
[Block diagram: Input → Feature extraction → Classification → Decision]
Figure 5 – Generic architecture of a biometric recognition system
1.5 Organization of the text
This work follows in detail the development of a biometric keystroke application for an Android-powered mobile device. The text is organized in six sections, with the current one introducing the context, the demand for biometric security, biometric systems, the objectives, and the organization of the topics addressed in this work. Each section focuses on a different step, all of them necessary to achieve the proposed application.

Section 2 presents a general overview of the main biometric techniques, to better understand their behavior, after which the author provides a more detailed overview of the technique chosen for this work.

Section 3 details the approach to the development of the application: its architecture, capturing the user input, and classification and decision. In the architecture section, the essential steps of the implementation are explained, although this description does not cover every coding detail.

Section 4 targets the evaluation of the data with the proposed algorithms; the performance results are then presented and discussed.

Section 5 presents a walkthrough of the application usage.

Section 6 is reserved for conclusions and further work.
2 Mobile biometrics state of the art

From law enforcement to military forces, public transportation, border control and commercial shipping authorities, mobile biometrics are quickly becoming a lifesaver for these industries, speeding up the processing of people and goods. Access to business data from mobile devices requires secure authentication, but traditional password schemes based on a mix of alphanumeric characters and symbols are cumbersome and unpopular, leading users to avoid accessing business data on their personal devices altogether (Trewin, et al., 2015).

This section overviews the main solutions currently available for biometric recognition, such as face recognition, iris scan, voice recognition, hand geometry, gait and handwritten signature. It discusses some of the most used biometric recognition approaches as well as their main advantages and disadvantages.
Section 2.1 is dedicated to presenting the main biometric recognition techniques employed in
mobile devices, while Section 2.2, focuses on the main theme of this work, mobile keystroke
dynamics, providing more detail about this biometric trait.
2.1 Biometric recognition techniques
Biometric techniques for recognition can rely on information extracted from different
modalities, or traits. The usage of several biometric traits in a mobile environment is discussed in this
section.
2.1.1 Keystroke dynamics
Keystroke dynamics recognition consists of recognizing an individual based on the way he types, here using a mobile keyboard. This is the goal of this work, and the topic is further elaborated in the final part of this chapter; this subsection just introduces the problem and defines the main concepts. In particular, when employing keystroke dynamics as a biometric trait, there are two major authentication strategies that can be employed: static or continuous.

In static biometric authentication, each participant provides his biometric features during enrollment, and these features are stored in a template. Whenever a person tries to authenticate, a new sample of the same biometric feature is provided and compared to the ones previously stored in the template. If they are similar enough, the input matches and the user is validated.

Authentication using static keystroke dynamics is based on measuring the duration of the user's key presses and the time latency between consecutive keystrokes. For enrollment, the user is asked to type a fixed text a number of times, and each time the measures are stored in a template. When attempting authentication, the user types the text once again while the duration and latency timings are measured and compared against the stored values (Crawford, 2010).
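The duration and latency measurements described above can be sketched as follows; the (key, press time, release time) event format and the millisecond values are assumptions for illustration, not the representation used by the application developed in this work.

```python
# Dwell time: how long each key is held down.
# Flight time (latency): gap between releasing one key and pressing the next.
def keystroke_features(events):
    # events: list of (key, press_ms, release_ms) tuples, in typing order
    dwell = [release - press for _, press, release in events]
    flight = [events[i + 1][1] - events[i][2] for i in range(len(events) - 1)]
    return dwell, flight

events = [("p", 0, 95), ("i", 180, 260), ("n", 340, 430)]
dwell, flight = keystroke_features(events)
print(dwell)   # -> [95, 80, 90]
print(flight)  # -> [85, 80]
```

Note that a flight value can be negative when the next key is pressed before the previous one is released, which fast typists routinely do; a real implementation must tolerate this.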
In continuous biometric authentication, instead of typing a fixed text, the system is used with unconstrained textual input (free-text), typically for a longer period of time. During enrollment, information is collected over that time on how the user types on the keyboard. During authentication, the features computed from the input text are compared to the ones stored in the template (Awad & Traore, 2013).

Figure 6 represents the timing of one keystroke: its duration is the interval between the moment the key is pressed and the moment it is released.
Figure 6 – Timing intervals between consecutive key presses (McLoughlin & Mohanavel, 2009)
2.1.2 Face recognition
The face is probably the biometric trait most frequently used for recognition purposes. Systems implementing face recognition algorithms are usually composed of face preprocessing, face authentication and information fusion modules.
The face preprocessing module is responsible for the segmentation of an adequate facial
image from the available photo or video footage. This process typically includes three steps, namely
face detection, face registration and illumination normalization.
Face detection can rely on a simple scheme using rectangular binary features and the integral image. There are two classes of face detection methods: heuristic-based and classification-based. The first class comprises skin-color and facial-geometry methods; heuristic-based methods are simple to implement but not reliable, as they are vulnerable to changes in external conditions. Classification-based methods, on the other hand, treat face detection as a pattern classification problem and thus benefit from existing pattern classification resources, being able to deal with more complex scenarios. However, the patterns to be classified have to cover the exhaustive set of image patches at every location and scale of the input image, so classification-based methods typically have a high computational load (Tao & Veldhuis, Biometric Authentication System on Mobile Personal Devices, 2010).
The next step is to register the face, which is done by combining facial features (Tao & Veldhuis, 2006). It is common to use holistic or local methods. Holistic methods take advantage of both the global face texture information and the local facial feature information, but have a relatively high computational load. A local registration method is more direct and faster, as it only takes the locations of local facial features to calculate the transformation (Tao & Veldhuis, 2010).
Furthermore, illumination normalization follows one of two approaches: one models the illumination problem itself, while the other works directly on the image pixel values (Tao & Veldhuis, 2010). Finally, after detecting, registering and normalizing the image comes verification, which distinguishes two classes, the user and the impostor. Overlapping regions are classified with the minimal possible error, using the likelihood ratio in the Neyman-Pearson sense.
An example of this process is represented in Figure 7.
Figure 7 – Face preprocessing (Tao & Veldhuis, 2006)
2.1.3 Iris scan
The iris is a biometric trait that exhibits good recognition properties, notably because it does not change with aging.
Most iris recognition methods require infrared illumination to highlight the characteristics of the iris; on mobile phones, however, that is not possible, as they work in the visible spectrum. There are two main approaches to cope with the noisy images in a color iris recognition system: either apply image enhancement techniques, or extract multiple types of features and apply a fusion mechanism (Radu P., 2012).
2.1.4 Voice recognition
Voice activity detection plays an important role in an efficient voice interface between humans and mobile devices. As the user records his voice, the speech is digitized and the frequency spectrum of the speech signal is encoded and stored (Shabeer & Suganthi, 2007). An illustration of this process is shown in Figure 8.
Furthermore, when a person starts using a cell phone, his speech spectrum can be coded for recognition purposes and then compared with the stored coded spectrum. To save computational power, a voice trigger system could be implemented using a keyword-dependent speaker recognition technique (Lee, Chang, Yook, & Kim, 2009). The goal of this component is to avoid false activations of the voice recognition.
Figure 8 - Voice recognition process (Shabeer & Suganthi, 2007)
2.1.5 Hand geometry
Hand geometry recognition not only performs well in identifying the user but is also known as a non-invasive biometric technique. This approach works well with low resolution cameras, which is an advantage for mobile phones. However, it can be difficult to distinguish the hand from the background, due to illumination, lack of contrast between hand and background, or even blur effects within the image, making image segmentation a challenge on mobile phones. The segmentation step is essential in hand biometrics, given that the subsequent feature extraction depends on an accurate and precise hand isolation; otherwise template features could be inappropriately extracted, degrading individual identification (Sierra, Casanova, Ávila, & Vera, 2009).
In (Franzgrote, et al., 2011), flash illumination is used to enhance the hand silhouette while darkening the background, together with an effective method for extracting and representing palm line orientation information.
2.1.6 Gait
Gait corresponds to the particular manner of walking of a subject. It has two basic components, the swing phase and the stance phase. The stance phase is when one foot is in contact with the ground, and the swing phase is when one of the feet is in the air for limb advancement. Each person has a specific stride, which makes gait a unique personal signature. The most basic form of gait analysis is step detection and characterization. An illustrative image of a gait cycle is included in Figure 9.
Figure 9 - Gait cycle phases (physio-pedia, 2015)
As smartphones are equipped with various sensors such as gyroscopes and accelerometers, the gait cycle is often measured with a combination of sensors. This is the case in (Minh Thang, Quang Viet, Dinh Thuc, & Choi, 2012), where data acquisition is performed with a built-in accelerometer while the user walks naturally. However, to save battery power the sampling rate is low, and the time intervals between two consecutive acceleration values are not equal. Moreover, the biometric gait of each individual differs from day to day (D.S., M.S., S., & J.N., 2012), making this method not very reliable for biometric validation. Finally, gait analysis on consumer devices must overcome several difficulties that specialized gait sensors do not face, for example compensating for the different positions in which the mobile device may be placed during motion. For the readings to be accurate, the mobile phone must always be held in the same position.
2.1.7 Handwritten biometric signatures
Although handwritten signatures are well established on dedicated devices, they are not on mobile devices. The growth of smartphones and tablets (touch screens) created a new opportunity to migrate handwritten signature authentication to mobile devices. However, some of the signals captured by traditional on-line signature systems are not available on portable devices, making algorithm implementation a challenge.
In (Blanco-Gonzalo, Miguel-Hurtado, Mendaza-Ormaza, & Sanchez-Reillo, 2012), 7 different devices with different characteristics were tested, using only the time, X and Y signals. This restriction comes from the fact that some devices do not capture pressure, making the approach less robust. Regarding results, they concluded that visual feedback of the signature is a major performance parameter, and that small devices offer the best CERs on average. Capacitive screens performed better than resistive screens.
Moreover, in (Mendaza-Ormaza, Miguel-Hurtado, Blanco-Gonzalo, & Jose Diez-Jimeno, 2011), 4 different devices were used, all running Android OS, including devices with both resistive and capacitive screens. Due to hardware constraints, azimuth and inclination angles were not captured. In addition, the precision of the pressure values obtained from resistive screens was very low, with great variation between devices, making them useless for implementing an algorithm. Capacitive screens, on the other hand, provide information about the size of the surface in contact with the touch screen. Given that, 3 different temporal signals were used for each type of screen: x-axis, y-axis and size signals for capacitive screens, and x-axis, y-axis and pressure signals for resistive screens. SVM and DTW algorithms were used for signature classification.
2.1.8 Choosing the technique
After presenting a brief overview of several biometric traits that may be used to achieve a
stronger authentication of smartphone users, the author has developed a particular interest in
techniques using keystroke dynamics as a biometric trait. The reason for this choice is related to the
fact that people often have to write messages on their smartphones. Moreover, adding an extra layer
of security while writing passwords or PINs seemed an excellent idea to protect the information within
the device.
2.2 Keystroke dynamics as a biometric trait
As discussed above, this work is about using keystroke dynamics as a biometric trait.
Therefore, this section provides a more detailed overview of the published work on this topic, to lay the
foundations for the subsequent development of a smartphone-based authentication system based on
keystroke dynamics analysis.
2.2.1 Input sensor
A system relying on the analysis of keystroke dynamics needs to capture the relevant
information by accessing the user typing characteristics. For that purpose the nature of the keyboard
used as an input sensor needs to be known. As summarized in Figure 10, the input can be done via a
soft or a hard keyboard. A hard keyboard is an external input device used to type data into some sort
of computer system whether it be a mobile device, a personal computer, or another electronic
machine. A keyboard usually includes alphabetic, numerical and common symbols used in everyday
writing. On the other hand, a soft keyboard is a system that replaces the hard keyboard on a
computing device with an on-screen image map. These keyboards are typically used to enable input
on a handheld device so that a keyboard does not have to be carried with it.
For the purpose of this work, only soft keys will be considered, as smartphones have software
keyboards. The touch screens used in these devices for displaying the keyboard and receiving the corresponding input can be either resistive or capacitive. Resistive screens can read the pressure applied on the screen, as they operate based on finger pressure. Capacitive screens, in turn, can read the size of the finger's contact surface, since they detect anything that is conductive.
Figure 10 – Different types of input (hard keys: desktop keyboard, hard mobile keyboard; soft keys: mobile keyboard, either capacitive or resistive)
2.2.2 Features
A feature is a distinctive attribute or characteristic of something. Every human has different characteristics or attributes, which make it possible to distinguish, i.e., identify them. With that in mind, in this work features will be defined based on the typing rhythm of the user.
There are different methods and metrics upon which keystroke analysis can be based (Shanmugapriya & Padmavathi, 2009):
- Static at login: a known keyword, phrase or predetermined text is captured and then compared against stored typing patterns.
- Periodic-dynamic: the user typing pattern is captured during part of a logged session and then compared against stored typing patterns to determine deviations.
- Continuous-dynamic: similar to the periodic-dynamic, but the authentication is done during the entire logged session.
- Keyword-specific: an extension of continuous or periodic dynamic, but related to specific keywords.
- Application-specific: continuous or periodic dynamic applied to a specific application.
- Keyword latency: considers the overall latency for a complete word.
Some additional features can be considered when using smartphones and tablets:
- Pressure during typing.
- Fingertip size.
- Physics of the mobile device: how the user holds the device and which hand is preferred.
Figure 11 proposes a compact representation organizing the information presented above.
Figure 11 – Various keystroke features (static extraction: static at login, periodic dynamic, keyword specific, keyword latency, using a template of specific times; continuous extraction: continuous dynamic, trigraph latency, keyword latency, using free-text over the whole session; in both cases key press duration, key dwell time, pressure during typing and finger size can be used)
2.2.3 Classification techniques
Classification techniques enable a classifier to identify to which of a set of categories a new observation belongs. This is made possible by training the algorithm on a set of data containing observations whose category membership is known. A category is defined by the set of features that characterizes it. An example would be assigning a password input to the true or false user class.
In this context, there are two main classification approaches followed for keystroke analysis,
statistical techniques and neural networks techniques or a combination of both (Karman & Krishnaraj,
2010). Furthermore, both need a matcher and stored data, to allow the processing of the keystroke
timings. Figure 12 illustrates some classification techniques.
For statistical analysis, some of the methods commonly applied to keystroke dynamics are:
- Euclidean or Manhattan distance measures between two vectors of typing characteristics; in addition, total time periods and pressure are measured and stored as a template.
- SVM, which separates vector samples in Rⁿ. Each feature corresponds to an axis xᵢ on which a binary set is represented. The goal is to design a hyperplane that classifies all training vectors into two or more classes while leaving the maximum margin from all classes.
Pattern recognition and neural network approaches comprise fuzzy ARTMAP (Predictive Adaptive Resonance Theory), RBFN (Radial Basis Function Network), BPNN (Back-Propagation Neural Network) and Bayes' rule algorithms.
As an example, one can consider the bio password system (Karman & Krishnaraj, 2010), where 3 statistical measurements (mean, standard deviation and median) are submitted to a feature selection algorithm. After that, a classification step aims to find the class closest to the pattern being classified.
The essential features to be used in the classification step are keystroke timings, i.e., the timings between successive key press and release events. The time between the press and release of a key is called dwell time, while the time between a release event and the press event of the next key is named flight time. The template used for recognition is constructed based on these concepts. Template selection refers to the process of determining, from a given set of available biometric acquisitions, which are best suited to represent the collected data and the statistics of the considered users' biometrics (Maiorana, Campisi, González-Carballo, & Neri, 2011). For example, the mean is relevant because typing biometrics are behavioral, so the collected samples will not be consistent; as a result, the trend of the samples is used for authentication. This can be done following different approaches.
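The per-feature statistics used for template building (mean, standard deviation, median) can be sketched as follows, assuming each feature is observed over several enrollment samples (class and method names are illustrative, not from the thesis):

```java
import java.util.Arrays;

// Illustrative sketch: template statistics computed over the enrollment
// samples of one timing feature (e.g. the dwell time of a given key).
public class TemplateStats {
    public static double mean(double[] xs) {
        double s = 0;
        for (double x : xs) s += x;
        return s / xs.length;
    }

    // population standard deviation of the samples
    public static double stdDev(double[] xs) {
        double m = mean(xs), s = 0;
        for (double x : xs) s += (x - m) * (x - m);
        return Math.sqrt(s / xs.length);
    }

    public static double median(double[] xs) {
        double[] c = xs.clone();
        Arrays.sort(c);
        int n = c.length;
        return n % 2 == 1 ? c[n / 2] : (c[n / 2 - 1] + c[n / 2]) / 2.0;
    }
}
```

The template would then store one such statistic per timing feature, capturing the trend of the behavioral samples rather than any single acquisition.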
A simplistic approach is used in (McLoughlin & Mohanavel, 2009), where the variance helps determine the statistical dispersion of the samples. Fixed weights are assigned to each variance value, with the highest weight assigned to the smallest variance. Three verifiers are used to determine the authenticity of the user: press timings, release timings and overall timings. Each of the verifiers has mean values, time durations and weighted variances that are used for authentication.
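The fixed-weight idea can be sketched as follows: features are ranked by variance, and the smallest variance receives the largest of a fixed set of weights. This is a simplified reading of the cited scheme; the class name and the exact weight set are assumptions.

```java
import java.util.Arrays;
import java.util.Comparator;

// Illustrative sketch: assign fixed weights to features ranked by variance,
// so the most stable feature (smallest variance) gets the highest weight.
public class VarianceWeights {
    public static double[] assign(double[] variances, double[] fixedWeightsDescending) {
        Integer[] order = new Integer[variances.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        // sort feature indices by ascending variance
        Arrays.sort(order, Comparator.comparingDouble(i -> variances[i]));
        double[] w = new double[variances.length];
        for (int rank = 0; rank < order.length; rank++) {
            w[order[rank]] = fixedWeightsDescending[rank];
        }
        return w;
    }
}
```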
A more complex approach is used in (Karman & Krishnaraj, 2010) and (Maiorana, Campisi, González-Carballo, & Neri, 2011). In the first reference, mean, standard deviation and median are calculated for the features; the next step is to select a subset of features through stochastic algorithms and then classify them to find the best match. In the second reference, Euclidean and Manhattan distances are used to compute the distances between keystrokes. Then, to characterize the keystroke variability of each user, four statistical values are computed for each latency feature. After that, an algorithm is used to select a template for the keystroke dynamics. The authentication step is done by comparing the current acquisition with the reference ones (stored and computed during training).
Even though these models performed well, they are only suitable for alphanumerical passwords, not for free-text. Due to the many word combinations possible in free text, it would be necessary to enroll all words before putting the system to the test.
Some recent studies have begun to use neural networks as a pattern classification method. Common approaches include Feed Forward Multilayered Perceptron Networks, Radial Basis Function Networks and Generalized Regression Networks. However, some mobile devices lack the computing power necessary to employ neural networks when the processing is done on the device itself. The networks operate on monographs, digraphs and n-graphs, each describing the human behavior in performing the corresponding typing action. A monograph represents the action of pressing a key on the keyboard. A digraph represents a typing action performed by the user from one specific key to another. As shown in Figure 14, the monograph network consists of two layers, with one input node representing the mapped key code and one output node representing the fly time associated with the input key. In turn, as shown in Figure 13, the digraph network consists of two layers, with two input nodes representing the 'from' and 'to' mapped keys and an output layer with one node representing the fly time of the input digraph (Awad & Traore, 2013).
Figure 12 – Classification techniques for keystroke dynamics: ARTMAP, SVM, BPNN, RBFN and Bayes algorithms, plus standard measures such as Euclidean and Manhattan distances and fixed weights (Support Vector Machine, Back-Propagation Neural Network, Predictive Adaptive Resonance Theory, Radial Basis Function Network)
Figure 13 – Digraph network (input layer with 'from' and 'to' key nodes, a hidden layer with n nodes, and an output layer with the fly time)
Figure 14 – Monograph network (input layer with the mapped key node, a hidden layer with n nodes, and an output layer with the fly time)
2.2.4 Keystroke models
Keystroke authentication models can be classified as either static or continuous. As referred previously, static authentication refers to keystroke analysis performed only at specific times, such as during a login process. An example is the PIN model (Karman & Krishnaraj, 2010), where the PIN number is introduced by the user several times during enrollment and the user timing vector is captured in each keystroke acquisition. Other keystroke features are extracted, and their mean, standard deviation and median are calculated and given as input to the feature subset selection.
In addition, continuous authentication performs the same analysis but during the whole session. This method also provides a tool to detect user substitution after a successful login. The free-text model is a continuous authentication system that checks for the continuous presence of the authorized user. This is done by analyzing the typing rhythm users show during their normal interaction with a computer. Data collection takes a long time due to the many possible combinations of words. In (Sim & Janakiraman, 2007), word-specific digraphs are constructed from the most common words used, due to sample dispersion. Nevertheless, the achieved results are not very accurate: for the best sequences the accuracy is 80% or more. On the other hand, in (Awad & Traore, 2013) very good accuracy was achieved. They used monograph and digraph analysis and a neural network to predict missing digraphs. The technique assumes that it is possible to enroll the user by covering the most frequently occurring keys (or most frequently occurring monographs), but not all expected digraphs. It further assumes that it is possible to approximate the remaining digraphs based on the relation between the monitored ones. However, all missing monographs are ignored during the analysis. In addition, the model is valid for a diversity of keyboards: a key mapping technique sorts the key codes based on their associated average time and maps them in the corresponding order. This assists the neural network in building and approximating the relation between the keys based on the behavioral distance between them, expressed in time. For the elimination of outliers, Peirce's criterion is used.
There are two neural networks (Mono and Di) that are trained for each of the enrolled users. The enrollment process is shown in Figure 15, and the verification process in Figure 16.
The flight time for a missing digraph is obtained from the output of the trained digraph neural network. The digraph neural network takes as input the DKO (Digraph Key Order) of the 'to' and 'from' keys of a missing digraph and returns as output an estimate of the corresponding fly time. The neural network architecture remains the same for all users, although the weights for each key are user specific. For 53 users, in a heterogeneous experiment, they achieved a FAR of 0.0152% and a FRR of 4.82%. Neural networks require a high level of data processing, difficult for a mobile device, but with the fast advance in technology this might become feasible on a mobile phone. However, their approach does not have an adaptive enrollment scheme, so the method cannot detect whether the user has changed his behavior.
Figure 15 - User enrollment process (Awad & Traore, 2013)
Figure 16- Verification process (Awad & Traore, 2013)
Finally, there has to be a decision on whether the classification is accepted or not. For that, a compromise has to be found between the FAR and FRR. These two metrics can be visualized on a ROC curve, which shows the trade-off between them. In general, the decision is based on a threshold, which determines how close to the template the input needs to be for it to be considered a match. If the threshold is reduced, the FAR will increase while the FRR will decrease; conversely, if the threshold is raised, the opposite happens.
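The two error rates can be sketched for a distance-based matcher, where a larger distance threshold accepts more attempts (the class name and the convention that lower distance means a closer match are assumptions of this sketch):

```java
// Illustrative sketch: FAR is the fraction of impostor attempts accepted,
// FRR the fraction of genuine attempts rejected, at a given distance
// threshold (an attempt is accepted when its distance is at most the
// threshold).
public class ErrorRates {
    public static double far(double[] impostorDistances, double threshold) {
        int accepted = 0;
        for (double d : impostorDistances) if (d <= threshold) accepted++;
        return (double) accepted / impostorDistances.length;
    }

    public static double frr(double[] genuineDistances, double threshold) {
        int rejected = 0;
        for (double d : genuineDistances) if (d > threshold) rejected++;
        return (double) rejected / genuineDistances.length;
    }
}
```

Sweeping the threshold and plotting the two rates against each other produces the ROC curve mentioned above.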
2.2.5 Conclusion
In general, neural networks tend to produce better results than statistical methods, although their performance is highly variable, since the number of layers and the number of neurons per layer directly affect the quality of the results. As the number of layers and neurons increases, so does the complexity of the network and thus the amount of time required to process results. Also, each time a user is added, the neural network must be retrained, which further increases the amount of processing power required by these methods. So, despite the lower quality of the results obtained with statistical methods, there are still scenarios where they may be preferable, such as when the available processing power is limited. However, statistical classifiers do not provide a strong enough level of pattern classification to support the needs of some authentication systems.
3 Proposed keystroke dynamics recognition application

An application, or app, is software designed to run on smartphones and other mobile devices. In
this case, a new app is created for the Android operating system. It is developed in the Java
programming language using the Android Software Development Kit (SDK), which includes a
comprehensive set of development tools including a debugger, a set of libraries, an emulator,
documentation, sample code, and tutorials.
3.1 Architecture
As illustrated in Figure 2, a biometric system can be designed for identification or verification. In this work, as the smartphone is a personal device, the application is developed for verification. In verification mode, the features extracted from the selected biometric trait are matched only against the corresponding user enrollment template. Figure 17 illustrates the architecture of the proposed system.
Figure 17 – System architecture (fixed-text input → feature extraction of key times → classification by Euclidean measures or SVM, supported by a database → threshold-based decision)
In the first step, the input is an alphanumeric password, as the writing done on smartphones is relatively short, making free-text input unsuitable for them. As the app deals with keystroke dynamics as a biometric trait, the algorithms used in the classification step need training. Because of that, the developed application has two operation modes. One is the training mode, in which the user inserts the chosen password to allow the system to capture the corresponding features and store them in a database. For the purpose of this work, to be able to compare different types of passwords, three passwords have been chosen: mxplayer, Lisboa2014 and tecnicoLisboa. These passwords have different characteristics: the first only has lowercase letters, the second has upper and lower case letters as well as numbers, and the third has upper and lowercase letters. The second mode, verification, is the mode where the actual input (of the true user or of an intruder) is compared against the data gathered during training. The app allows different users to register and choose one password. Figure 18 and Figure 19 illustrate the two time features being analyzed, dwell and flight time.
Figure 18 – Dwell time (interval between key press and key release)
Figure 19 – Flight time (interval between key i and key i+1)
3.2 Capturing user input
Each time a key is pressed or released, the time metrics are measured and stored in a SQLite database, which is available through an Android library. Each training operation has a different id, and each id has an associated set of key codes, dwell times and flight times for the chosen password. The application records key timings while the user is typing the password; if the user makes a typing mistake, that input is not considered valid, because it would corrupt the training data, as the key timings would be greater than expected.
To be able to capture key times, the developed application implements two functions, OnKeyDown and OnKeyUp, which handle key down and key up events, respectively, when they occur. However, in conjunction with the text editor function from the Android library, the standard keyboard only generates key events for numbers, while alphabetic keys do not generate any events.
To circumvent this, a new soft keyboard had to be developed and included in the application. This keyboard allows generating the events with the key timings, as well as providing information about the pressed key codes. While the keyboard is active, it is always ready to receive a key press from any key. When a key is pressed, it is identified and the necessary action is applied. These actions are applied through the binding of the OnKeyboardActionListener interface to the keyboard view, which is illustrated in Figure 20. This interface implements a listener for virtual keyboard events through the onPress, onRelease, and onKey methods. The first two, as the names suggest, are called when a key is pressed or released; they are responsible for sending a key event with the system clock time, in milliseconds, and the key code of the pressed/released key. Then, onKey is responsible for sending a key press to the listener, which translates into writing in the EditText.
Through the steps described above, the OnKeyDown and OnKeyUp functions are able to capture the key timings and key codes necessary for keystroke dynamics analysis. In addition, the screen also detects motion. However, when the keyboard pops up, that functionality is not available; but in the same way that it is possible to bind the OnKeyboardActionListener interface to the keyboard view, it is also possible to bind the onTouchListener interface. This allows implementing the onTouch method, which is called when a touch event is dispatched to the keyboard, allowing pressure and size to be captured. Capacitive touch screens detect size much better than pressure, returning a normalized value between 0 and 1. However, in this work this method is implemented but not tested due to lack of time: the author only achieved this after all the data had been analyzed, and testing it would require creating new databases and repeating the whole analysis. Moreover, when a key event is detected by OnKeyDown, the interface will follow the event until OnKeyUp, preventing the release time of the key from being missed.
Figure 20 – Soft keyboard
3.3 Classification and decision
When the user enters the verification mode, he can choose which algorithm to use: Euclidean distances or SVM. For both, the user has to enter the same password one more time, to verify whether the user being tested is the true user.
When using Euclidean distances, the key times of the user attempting authentication are loaded into a vector with twice the size of the given password, in which the odd positions contain dwell times and the even positions contain flight times; another vector contains the values of the registered user's password entered after choosing the algorithm.
Each time a Euclidean distance is calculated, according to the formula in (4), the result is stored in a vector that holds all the distances between the entered password and each of the training passwords from that user. Then, based on a threshold, the algorithm decides whether the user is valid or not. In section 4, the choice of the threshold is explained for both algorithms.
d(x, y) = √( Σᵢ (xᵢ − yᵢ)² ) (4)
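A minimal sketch of this matching step, assuming the attempt is accepted when its smallest distance to any stored training vector falls below the threshold (the acceptance rule and the class name are assumptions, not the thesis's exact implementation):

```java
// Illustrative sketch: Euclidean distance between two timing vectors and a
// threshold-based acceptance decision over the stored training vectors.
public class EuclideanMatcher {
    public static double distance(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) {
            s += (a[i] - b[i]) * (a[i] - b[i]);
        }
        return Math.sqrt(s);
    }

    // accept if the closest training vector is within the threshold
    public static boolean accept(double[] attempt, double[][] training, double threshold) {
        double best = Double.MAX_VALUE;
        for (double[] t : training) {
            best = Math.min(best, distance(attempt, t));
        }
        return best <= threshold;
    }
}
```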
Regarding SVM, the implemented algorithm uses the 'libsvm' library for training and testing of the features. This approach is very different from the first, as it needs samples from the true user and samples from a false user with the same password. To be able to do that, the data is loaded into one simple hashmap for the training labels and two nested hashmaps for the train and test features. A hashmap has a number of "boxes" which are used to store key-value pairs. Each "box" has a unique number, and when a key-value pair is stored into the map, the hashmap looks at the hash code of the key and stores the pair in the "box" whose identifier is the hash code of the key. Figure 21 represents the hashmap of training labels, where the value of each "box" represents the classified class: number 1 represents the true user class and '−n' the false user(s) class(es).
There is a correlation between the keys of Figure 21 and Figure 22: the key values in each of the hashmaps correspond to the same trained password. In other words, if key zero holds a true user's sample in Figure 22, the same key in Figure 21 will have the value 1, which represents the true user. In Figure 22, there is another hashmap associated with each key, which holds the dwell and flight times for each password.
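The data layout just described can be sketched in plain Java (the class and field names are illustrative, and the real app may organize its maps differently):

```java
import java.util.*;

// Illustrative sketch of the described layout: one map holds a class label
// per sample key (1 = true user, -n = false users), and a second map holds,
// under the same key, an inner map of interleaved dwell/flight times.
public class SvmDataLayout {
    public final Map<Integer, Integer> labels = new HashMap<>();
    public final Map<Integer, Map<Integer, Double>> features = new HashMap<>();

    public void addSample(int sampleKey, int label, double[] times) {
        labels.put(sampleKey, label);
        Map<Integer, Double> f = new HashMap<>();
        for (int i = 0; i < times.length; i++) {
            f.put(i + 1, times[i]); // 1-based feature indices
        }
        features.put(sampleKey, f);
    }
}
```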
Figure 21 – Illustrative configuration of the hashmap for training labels (each key maps to a class value: 1 for the true user, −n for the false users)
Figure 22 – Illustrative configuration of the hashmaps for train and test features (each key maps to an inner hashmap holding the interleaved dwell and flight times: D time 1, F time 1, D time 2, F time 2, …)
Before the data is copied into the hashmaps, it has to be normalized, a common procedure in machine learning. The normalization consists in converting each feature vector into a unit vector, with components between 0 and 1. This trains the SVM on the relative values of the features, not their magnitudes. Normalization matters because of the way the SVM optimization problem is defined: features with higher variance have a greater effect on the margin, which is usually undesirable, as the classifier should be 'unit invariant'. The procedure is done by dividing each value by the norm of the vector, calculated using the formula in (5).
‖x‖ = √( Σᵢ xᵢ² ) (5)
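A minimal sketch of this normalization step (the class and method names are assumptions):

```java
// Illustrative sketch: scale a feature vector to unit Euclidean norm, so
// the SVM sees relative values of the features rather than magnitudes.
public class UnitNorm {
    public static double[] normalize(double[] v) {
        double norm = 0;
        for (double x : v) norm += x * x;
        norm = Math.sqrt(norm);
        if (norm == 0) return v.clone(); // avoid division by zero
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = v[i] / norm;
        }
        return out;
    }
}
```

Since keystroke timings are non-negative, each component of the resulting unit vector falls between 0 and 1.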
After this, the SVM nodes are created, where each key-value pair from Figure 22 represents the index and the value of a node. Each training vector, composed of the nodes that belong to one password, is labeled with the corresponding value from Figure 21, in order to distinguish the vectors. For the training, there are some parameters to be set: the kernel type, the parameter C and gamma. The kernel can be linear or nonlinear, as illustrated in Figure 23 and Figure 24, respectively. The decision between the two takes some facts into consideration. Typically, the best possible predictive performance of a nonlinear kernel is better than, or at least as good as, that of the linear one. It has been shown that the linear kernel is a degenerate version of the RBF (Gaussian) kernel, which is nonlinear, hence the linear kernel is never more accurate than a properly tuned RBF kernel. The exception is when the number of features is large relative to the number of samples (Ng., 2015); in that case the linear kernel is good enough, because nonlinear kernels do not score better than the linear one. In this work, both the number of features and the number of samples are small, so, to sum up, the RBF kernel is the chosen one. With this kernel, there are
two parameters to select: C and gamma. Parameter C tells the SVM optimization how much to avoid misclassifying each training example. For large values of C, the optimization will choose a smaller-margin hyperplane, which does a better job of getting all the training points classified correctly. The opposite happens for small values of C: the optimization will look for a larger-margin hyperplane, even if that hyperplane misclassifies more points. Parameter gamma, in turn, should be chosen according to the magnitudes of the pairwise distances between the data points. If gamma is very small, the RBF kernel is very wide, meaning all the data points could fall into one class. If gamma is very large, the RBF kernel is very narrow, meaning that probably all training vectors will end up as support vectors. These two extreme situations are not desirable, so a compromise between the two extremes should be found. The chosen parameters for each dataset are specified in section 4.1.3.
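The influence of gamma can be seen directly in the RBF kernel formula, K(x, y) = exp(−gamma·‖x − y‖²). A minimal Java sketch (illustrative, not part of "libsvm" or of the application code):

```java
// Illustrative RBF (Gaussian) kernel between two feature vectors,
// showing how gamma controls the kernel width.
class RbfKernel {

    public static double value(double[] x, double[] y, double gamma) {
        double squaredDistance = 0.0;
        for (int i = 0; i < x.length; i++) {
            double diff = x[i] - y[i];
            squaredDistance += diff * diff;
        }
        return Math.exp(-gamma * squaredDistance);
    }
}
```

With a very small gamma, the kernel value is close to 1 for any pair of points (a very wide kernel); with a very large gamma, it is close to 0 for any two distinct points (a very narrow kernel).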
The final results are obtained with a function from "libsvm" which predicts a probability for each of the classes. Finally, based on a threshold, the user is validated or rejected.
Figure 23 - SVM linear kernel illustration (Ranga, 2015)
Figure 24- SVM RBF kernel illustration (openclassroom.stanford.edu, 2015)
4 Results
In this section, results from the two implemented algorithms are presented and discussed. One drawback of the analysis is that the database does not have many users, making generalization rather limited. However, there are 3 different passwords. These passwords were chosen carefully, so that each one has different characteristics: one is lowercase only, another includes lowercase, uppercase and numbers, while the third one includes lower and uppercase letters.
To analyze the performance of the algorithms and calculate the best operating thresholds for each user, a ROC curve should be plotted. A ROC curve is a graphical plot that illustrates the performance of a classifier as its threshold varies, obtained by plotting the true positive rate against the false positive rate at various threshold settings.
In the case of this work, thresholds are represented by distances or by the probability of the claimed user being the true user. To plot these curves it is necessary to enroll different users and test them against each other, to cover both the true and the false cases. All ROC curves were plotted using the XLSTAT software, an add-on for Microsoft Excel that plots ROC curves given the correct and incorrect values from the users. Subsequently, the threshold is chosen based on the best accuracy for that ROC curve. However, depending on the goal of the verification, the threshold should be chosen accordingly. If the goal is to secure sensitive information, the threshold should be stricter, so as not to allow false acceptances. On the other hand, a more permissive threshold may be preferred, so that the legitimate user does not have to enter the password more than once. If a system has its ROC curve below or along the line where the true positive rate equals the false positive rate, it is a random system. In contrast, a perfect system has an AUC (Area Under the Curve) equal to 1, meaning the curve runs along the true positive rate axis and, upon reaching the value 1, continues to the end of the false positive rate axis.
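The quantities behind a ROC curve can be sketched as follows, assuming score vectors for genuine and impostor attempts (the class is illustrative; the actual curves in this work were produced with XLSTAT):

```java
// Illustrative ROC quantities: per-threshold rates and the AUC,
// given scores where higher means "more likely the true user".
class RocAnalysis {

    // {true positive rate, false positive rate} at a given threshold.
    public static double[] ratesAt(double[] genuine, double[] impostor, double threshold) {
        int tp = 0, fp = 0;
        for (double s : genuine) if (s >= threshold) tp++;
        for (double s : impostor) if (s >= threshold) fp++;
        return new double[] { (double) tp / genuine.length,
                              (double) fp / impostor.length };
    }

    // AUC via the rank statistic: the fraction of (genuine, impostor)
    // pairs ranked correctly; 1.0 for a perfect system, 0.5 for random.
    public static double auc(double[] genuine, double[] impostor) {
        double correct = 0.0;
        for (double g : genuine) {
            for (double i : impostor) {
                if (g > i) correct += 1.0;
                else if (g == i) correct += 0.5;
            }
        }
        return correct / (genuine.length * impostor.length);
    }
}
```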
4.1.1 Average key timing measures
As the key timings are the main input to the algorithms, it makes sense to look at them first to get a rough prediction of the final output values, since the way they behave influences the performance of the algorithms. A good way to get a general view of the key timings is to average their values. The average timings are represented in Figure 25 to Figure 30.
Figure 25 and Figure 26 correspond to the mxplayer password. From the dwell time it is hard to draw any conclusion, since it is not constant between users; however, that does not happen with the flight time. There, it is clear that the letter 'x' has the highest flight time, while the letter 'l' has the lowest. The letter 'x' is uncommon to type, which can explain its higher flight time. Regarding the letter 'l', the flight time is significantly reduced because 'l' is very close to 'p' on the keyboard.
Figure 27 and Figure 28 correspond to the Lisboa2014 password. With this password, the dwell time has its highest value on the first letter. The flight time is similar between users, and the transitions between letters and numbers have the highest values, due to the shift key being pressed.
Finally, Figure 29 and Figure 30 correspond to the tecnicoLisboa password. The dwell time is very unstable, although it clearly increases for all users on the letter 'L', as also occurs with the Lisboa2014 password. The flight time shows the same behavior as the dwell time: very unstable, except for the letter 'L'.
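The averages shown in Figures 25 to 30 are simple per-key means over the captured samples of each password, which can be sketched as follows (illustrative code, not taken from the application):

```java
// Illustrative per-key averaging of dwell (or flight) times:
// samples[s][k] is the timing of key k in capture s of the password.
class KeyTimingStats {

    public static double[] averagePerKey(double[][] samples) {
        int keys = samples[0].length;
        double[] mean = new double[keys];
        for (double[] sample : samples) {
            for (int k = 0; k < keys; k++) {
                mean[k] += sample[k];
            }
        }
        for (int k = 0; k < keys; k++) {
            mean[k] /= samples.length;
        }
        return mean;
    }
}
```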
Figure 25 - Average dwell time from all users (mxplayer)

Average dwell time [ms]:
          m       x       p       l       a       y       e       r
jpl     85.16   79.32   81.16   88.76   69.12   77.16   74.84   86.84
carla1  97.20   97.48   91.80   94.48   78.88   98.48   98.00  114.32
kiwi1   66.80   78.12   67.00   58.52   73.08   56.48   76.64   70.76

Figure 26 - Average flight time from all users (mxplayer)

Average flight time [ms]:
          m        x        p        l        a        y        e        r
jpl      0.00   175.60   136.64   138.44    98.96   206.88   181.04   131.40
carla1   0.00  1028.56   720.00   288.48   665.56   632.60   469.72   231.56
kiwi1    0.00   358.84   238.72   122.88   268.32   337.20   283.96   182.48
Figure 27 - Average dwell time from all users (Lisboa2014)

Average dwell time [ms]:
          L       i       s       b       o       a       2       0       1       4
joao    97.81   69.81   78.58   79.31   89.31   76.27   90.08   94.77   83.19   75.15
mj      59.88   43.46   58.96   63.92   55.23   68.62   57.08   56.54   64.73   61.12
susy   114.48  101.92  100.12   80.64   99.00   97.12   98.48   99.52  102.16   67.64

Figure 28 - Average flight time from all users (Lisboa2014)

Average flight time [ms]:
          L       i       s       b       o       a       2       0       1       4
joao     0.00   257.6   303.2   122.4   162.6   40.42   727.3   36.12   35.19   208.3
mj       0.00   466.0   847.8   739.3   851.6   682.2   1660.   803.0   536.4   406.0
susy     0.00   377.9   536.8   489.9   429.1   437.7   1350.   488.4   426.3   489.5
Figure 29 - Average dwell time from all users (tecnicoLisboa)

Average dwell time [ms]:
           t      e      c      n      i      c      o      L      i      s      b      o      a
jpssl    83.0   84.5   77.7   85.6   90.0   74.6   71.4   91.9   53.0   82.9   91.9   85.2   77.3
kiwi     56.2   79.2   53.2   55.3   51.2   55.9   55.3   71.8   55.6   75.1   52.8   52.5   72.2
vanessa  82.2   123.   128.   110.   83.2   114.   80.7   114.   94.9   123.   115.   88.8   113.
mary     71.4   68.7   65.3   70.1   81.2   61.9   71.2   96.6   67.1   64.9   64.9   74.1   65.4

Figure 30 - Average flight time from all users (tecnicoLisboa)

Average flight time [ms]:
           t      e      c      n      i      c      o      L      i      s      b      o      a
jpssl    0.00   122.   222.   92.3   148.   46.3   67.1   403.   152.   216.   118.   162.   54.2
kiwi     0.00   273.   331.   411.   141.   302.   203.   1247   131.   209.   240.   167.   184.
vanessa  0.00   319.   362.   225.   177.   255.   175.   1023   228.   244.   211.   283.   178.
mary     0.00   244.   367.   217.   228.   205.   81.1   932.   179.   243.   202.   263.   62.0

4.1.2 Euclidean distances
In mathematics, the Euclidean distance is the distance between two points in a space with two or more dimensions. To plot ROC curves when the decision module is based on Euclidean distances, intra- and inter-user distances have to be calculated. Intra-user distances are measured by
comparing the distance of one password capture to each of the remaining captures from the same user, until all combinations have been made. Inter-user distances use the same method, but between different users. With that data, it is possible to plot ROC curves.
In Table 1, the figures for the best accuracy achieved are presented; each row gives the threshold for the best accuracy achieved for that password. With this algorithm the threshold represents a distance: the user is accepted when the distance of the inserted password is lower than the calculated threshold. Despite the thresholds indicated in the table being an optimal solution, other thresholds can be used depending on the goal of the application. If security is important the threshold should be lower; on the other hand, a higher threshold value might be chosen if the user prefers to minimize the number of access attempts required by the system, at the expense of an increased false accept rate.
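The distance computation and the acceptance rule described above can be sketched as follows (illustrative Java; the threshold value would come from the ROC analysis, such as the values in Table 1):

```java
// Illustrative Euclidean-distance decision between two timing vectors.
class EuclideanDecision {

    // Euclidean distance between two vectors of equal length.
    public static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Accept when the distance between the inserted password's timings
    // and the stored template is below the chosen threshold.
    public static boolean accept(double[] attempt, double[] template, double threshold) {
        return distance(attempt, template) < threshold;
    }
}
```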
Table 1 - ROC evaluation for the best accuracy achieved (Euclidean distances)
Password Threshold Sensitivity Specificity TP TN FP FN Accuracy
mxplayer 460 0.769 0.758 76.9% 75.8% 24.2% 23.1% 76.4%
Lisboa2014 707 0.64 0.874 64% 87.4% 12.6% 36% 76.3%
tecnicoLisboa 363 0.542 0.946 54.2% 94.6% 5.4% 45.8% 77.1%
To be able to evaluate the performance of the system that uses Euclidean distances as a
metric, and set an optimal threshold, the corresponding ROC curves were plotted – see
Figure 31, Figure 32 and Figure 33. From Table 1, it is possible to see that using the password
tecnicoLisboa leads to the best performance, while the other two passwords present similar results.
Figure 31- ROC curve for 'mxplayer' (Euclidean distances)
Figure 32 - ROC curve for ‘Lisboa2014’ (Euclidean distances)
Figure 33 - ROC curve for ‘tecnicoLisboa’ (Euclidean distances)
4.1.3 SVM
In machine learning, an SVM is a learning model with associated learning algorithms that analyze data and recognize patterns, used for classification and regression analysis. To learn how to analyze data and recognize patterns, there has to be training data consisting of a set of training examples, each marked as belonging to one of two categories. Given that, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary classifier. The two categories are divided by a clear gap that is as wide as possible, obtained through the construction of a hyperplane, which can be linear or nonlinear. A good separation is achieved by the hyperplane that has the largest distance to the nearest training data point of any class, since in general the larger the margin, the lower the generalization error of the classifier.
To train and test the data, parameters C and gamma had to be chosen for each set of users. The parameters were chosen based on the best output result: a range of value combinations was tested, and the combination with the best output was selected. When training the data, whether the test user was the true user or a false user, the password being tested was excluded from the training, so the tested password never corresponds to any sample in the training data. To analyze the performance values for each user, histograms were used. Each histogram represents the probability of the claimed user being the true user: each test returns an output probability, which is counted in the corresponding bin of the histogram. After some preliminary tests, parameter C was set to 10000 for every password. Parameter gamma was set to 1 for mxplayer and to 10 for Lisboa2014 and tecnicoLisboa.
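The parameter search described above can be sketched as an exhaustive scan over candidate values, where the scoring function is a stand-in for the real SVM validation run (illustrative code, not taken from the application):

```java
import java.util.function.DoubleBinaryOperator;

// Illustrative grid search: try each (C, gamma) combination and keep
// the pair with the best validation score. The scoring function here
// is a stand-in for the real SVM train-and-evaluate step.
class GridSearch {

    public static double[] bestParameters(double[] cValues, double[] gammaValues,
                                          DoubleBinaryOperator score) {
        double bestScore = Double.NEGATIVE_INFINITY;
        double bestC = cValues[0], bestGamma = gammaValues[0];
        for (double c : cValues) {
            for (double g : gammaValues) {
                double s = score.applyAsDouble(c, g); // e.g. validation accuracy
                if (s > bestScore) {
                    bestScore = s;
                    bestC = c;
                    bestGamma = g;
                }
            }
        }
        return new double[] { bestC, bestGamma };
    }
}
```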
Figure 34, Figure 35 and Figure 36 illustrate the recognition rate, in percentage, for each user
of the three passwords, mxplayer, Lisboa2014 and tecnicoLisboa, respectively. The algorithm returns
a probability of recognition between 0 and 1, where the higher the value, the higher the probability of the claimed user being the true user. So, when the claimed user inserts the password, the probability should be higher than the threshold.
Figure 34 - Probability of the claimed user being the true user (mxplayer)
Figure 35 - Probability of the claimed user being the true user (Lisboa2014)
Figure 36 - Probability of the claimed user being the true user (tecnicoLisboa)
To evaluate the performance of the system when using the SVM classifier, and to set an optimal operation threshold, the corresponding ROC curves were plotted – see Figure 37, Figure 38 and Figure 39. These curves also allow comparing the performance of the two approaches, using either the SVM classifier or the Euclidean distance metric. When plotting these figures, the true user's template would be enrolled without the captures used for testing, and then tested against other captures from the same user as well as from false users, with the output probability being recorded. The XLSTAT software was used to produce the ROC curves.
In Table 2, the figures for the best accuracy achieved are presented. Despite the thresholds indicated in the table corresponding to an optimal solution, there are other possible threshold values that can be used, depending on the goal of the application. If security is important the threshold should be higher.
Table 2 - ROC evaluation for the best accuracy achieved (SVM)
Password Threshold Sensitivity Specificity TP TN FP FN Accuracy
mxplayer 0.65 0.92 0.973 92% 97.3% 2.7% 8% 95.6%
Lisboa2014 0.39 0.987 0.973 98.6% 97.3% 2.7% 1.4% 97.8%
tecnicoLisboa 0.3 0.93 0.958 93% 95.8% 4.2% 7% 95%
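Assuming the standard definitions, the sensitivity, specificity and accuracy columns of Tables 1 and 2 relate to the raw counts of a verification experiment as sketched below (illustrative code, not taken from the application):

```java
// Standard verification metrics from raw confusion-matrix counts
// (TP/TN/FP/FN), assuming the usual textbook definitions.
class ConfusionMetrics {

    public static double sensitivity(int tp, int fn) {
        return (double) tp / (tp + fn); // true positive rate
    }

    public static double specificity(int tn, int fp) {
        return (double) tn / (tn + fp); // true negative rate
    }

    public static double accuracy(int tp, int tn, int fp, int fn) {
        return (double) (tp + tn) / (tp + tn + fp + fn);
    }
}
```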
The password Lisboa2014 has the best performance, while the other two have a similar performance. However, it is clear that using the SVM classifier allowed achieving a significant
improvement over the results obtained with the Euclidean distance metric, as the SVM is able to distinguish each user's characteristics much better, which is reflected in the classification and, consequently, in the recognition results.
Figure 37 - ROC curve for ‘mxplayer’ (SVM)
Figure 38 - ROC curve for ‘Lisboa2014’ (SVM)
Figure 39 - ROC curve for ‘tecnicoLisboa’ (SVM)
4.1.4 Conclusion
From the analysis above it is clear that the SVM performs better than the Euclidean distances. In the ROC analysis, tecnicoLisboa performed best with the Euclidean distances, while Lisboa2014 performed best with the SVM. As stated above, a password should not contain only lowercase or correlated letters; a good password has at least one uppercase letter and at least one number. The password Lisboa2014 is a very good example: it is the one that performed best because it fulfills all the characteristics stated at the beginning of this section.
5 Using the application
The purpose of this section is to explain how to use the developed application. The description guides the user on how to create a new account or use an existing one, how to train the algorithms and how to test them.
To run the application it is necessary to install the APKs. To do that, the smartphone should be connected to the computer. Then go to 'Computer' in the 'Start Menu' and, under 'Portable Devices', click on the phone icon to open the internal storage, and drag Keystroke.apk and SoftKeyboard.apk to the internal storage root. The phone can then be unplugged. Next, open a file manager on the phone (Astro File Manager is a good option if none is installed) and tap the APKs that were previously copied to the phone. After the installation, the SoftKeyboard should be selected: go to settings, language & input, current keyboard, choose keyboards, and select Soft Keyboard. This way the keyboard pops up and the user is able to switch between the standard keyboard and the Soft Keyboard.
After completing the steps described above, the next step is to create a new user when the login screen appears, as seen in Figure 40. In the case of an existing user, a confirmation box appears (Figure 42), where the user can confirm his identity or reject it. After inserting the username, a box pops up with the three passwords, allowing the user to choose the one to be used, as shown in Figure 41. Of course, in a real deployment of the application the password would be freely chosen by the user.
Figure 40 - Login screen
Figure 41 - Password choices
Figure 42 - Confirmation box for a user that already exists
After a successful login comes the main screen, Figure 43, where the user can choose between data training and verification. The app does not allow verification while the user has fewer than 5 training attempts, so that the database template for that user can be created.
When the user presses the training button for the first time, a screen as shown in Figure 44 will appear. For each new user, 25 training attempts are recommended, and a counter keeps track of how many attempts are left until the user reaches 25. It is possible to see the password while writing it, by pressing the 'show' button in the upper right corner. The chosen password is also shown below the text box, to avoid mistakes while typing it. If the user makes a mistake while writing the password, that attempt will not be valid: since the system relies on key timings, it does not make sense to accept typing errors, as correcting them would alter the normal key timings. In case of a mistake, pressing 'send' resets the text box. If the password is correctly typed, a new screen pops up, shown in Figure 45. Then, by pressing the 'Go Back' button, the main screen appears again to proceed with another training attempt.
Figure 43 – Main screen
Figure 44 - Training screen
Figure 45 - Training accepted screen
After completing the recommended training, the user should proceed to the verification step. When the verification button is pressed, a pop-up window appears (Figure 46), where the user may choose between the two algorithms or return to the previous screen. Then, a screen similar to the training one appears. After the password is written and the 'send' button pressed, the algorithm analyzes the writing sample. Two outputs can appear: if the user is considered an impostor, a message appears on the main screen (Figure 47); otherwise, a new screen appears (Figure 48) with the name of the user in question and two buttons, one to go back to the main screen and another to erase the user and all the training data from the database.
Figure 46 - Box to choose an algorithm to proceed with verification
Figure 47 - Impostor message
Figure 48 - Verification screen after a user is approved
6 Conclusions and Further Work
This section summarizes the work presented, highlights the main conclusions and provides guidelines for the development of future work.
6.1 Summary and conclusion
The first section presented the dependence on smartphones in everyday life: they carry personal information, and their built-in security is easy to break, so adding a biometric trait to the already existing security is a good way to improve it. That section also overviews the theoretical basis of biometric system solutions. Finally, the work objectives and the document structure are presented by summarizing the purpose of each section.
Section 2 starts by reviewing some of the main biometric techniques and their structure. After that, the technique adopted in this work is chosen and studied in more depth. In this way, the reader gains a better understanding of the relevant state of the art, allowing a better understanding of the choices made by the author in the design and implementation of the developed keystroke recognition solution.
Section 3 presents the detailed architecture of the implemented solution and describes the interaction between the various steps. Next, the capture of the user input (notably how the key timings are captured), the classification and the decision are described separately, together with the training of the presented algorithms.
Section 4 presents the performance analysis. First, the average key timings of each password are examined, due to their influence on the evaluation. For each algorithm, the test conditions are presented, as well as the test results and the respective discussion.
Section 5 describes the installation of the application on the smartphone, as well as a guide on how to use it.
In the last few years there has been some research on mobile keystroke dynamics, and this Thesis is one more effort to enrich it. However, with all the coding done by the author and the limited time available, there is still room to improve the application's performance and to expand the database. Even so, it may be concluded that, despite the limitations of the implementation, the application performs well and should allow increasing the security when entering alphanumeric passwords. Furthermore, by analyzing the system performance, notably the ROC curves, it can be stated that using the SVM classifier the application can improve the password security for a user. On the other hand, a system based on the Euclidean distance metric would still need further improvement before being applied to help improve security.
6.2 Further work
The author hopes that the developed app will be revisited in the future and that better performance may be achieved through the addition of new features. In this context, some features to enhance the algorithms' classification are presented here:
- Pressure: adding the user's finger pressure on each key may improve the classification performance of the algorithms.
- Fixed weights: adding fixed weights based on the variance of the samples. A sample that is much more statistically dispersed naturally indicates a less reliable mean than a sample with a smaller variance, so the variance is calculated to determine the statistical dispersion of the samples. Fixed weights would then be assigned to each value in the template, in inverse order of the ranking of their variance.
- Phone orientation: analyzing, through the gyroscope, how the user holds the phone while entering data may help identify the true user.
7 References
(2014, March 4). Retrieved from wikipedia: http://en.wikipedia.org/
(2014, 6 17). Retrieved from whatsnext.nuance: http://whatsnext.nuance.com/biometrics‐
smartphone‐future‐mobile‐authentication/
(2014, June 17). Retrieved from zdnet: http://www.zdnet.com/30‐percent‐of‐companies‐will‐use‐
biometric‐identification‐by‐2016‐7000025942/
(2014, Jul 08). Retrieved from griaulebiometrics: http://www.griaulebiometrics.com/en‐
us/book/understanding‐biometrics/introduction/model/types
(2015, January). Retrieved from www.sibelle.info/oped4.htm
(2015, January). Retrieved from openclassroom.stanford.edu:
http://openclassroom.stanford.edu/MainFolder/DocumentPage.php?course=MachineLearni
ng&doc=exercises/ex8/ex8.html
Awad, A., & Traore, I. (2013). Biometric Recognition Based on Free‐Text Keystroke Dynamics. (p. 1).
Cybernetics, IEEE Transactions on (Volume:PP , Issue: 99 ).
Blanco‐Gonzalo, R., Miguel‐Hurtado, O., Mendaza‐Ormaza, A., & Sanchez‐Reillo, R. (2012).
Handwritten signature recognition in mobile scenarios: Performance evaluation. Security
Technology (ICCST), 2012 IEEE International Carnahan Conference on, (pp. 174‐179). Boston,
MA.
Cho, D.‐h., Ryoung Park, K., Woong Rhee, D., Kim, Y., & Yang, J. (2006). Pupil and Iris Localization for
Iris Recognition in Mobile Phones. Software Engineering, Artificial Intelligence, Networking,
and Parallel/Distributed Computing, 2006. SNPD 2006. Seventh ACIS International Conference
on, (pp. 197‐201). Las Vegas, NV.
Crawford, H. (2010). Keystroke dynamics: characteristics and opportunities. Eight annual
international conference on privacy, security and trust.
D.S., M., M.S., N., S., M., & J.N., C. (2012). The Effect of Time on Gait Recognition Performance.
Information Forensics and Security, IEEE Transactions on (Volume:7 , Issue: 2 ) , (pp. 543‐552).
Franzgrote, M., Borg, C., Ries, B. J., Bussemaker, S., Jiang, X., Fieseler, M., & Zhang, L. (2011).
Palmprint Verification on Mobile Phones Using Accelerated Competitive Code. Hand‐Based
Biometrics (ICHB), 2011 International Conference on, (pp. 1‐6). Hong Kong.
Karman, M., & Krishnaraj, N. (2010). Bio password — Keystroke dynamic approach to secure mobile
devices. (pp. 1‐4). Coimbatore: Computational Intelligence and Computing Research (ICCIC),
2010 IEEE International Conference on.
Kurkovsky, S., Carpenter, T., & MacDonald, C. (2010). Experiments with Simple Iris Recognition for
Mobile Phones. (pp. 1293‐1294). Las Vegas, NV: Information Technology: New Generations
(ITNG), 2010 Seventh International Conference on.
Lee, H., Chang, S., Yook, D., & Kim, Y. (2009). A Voice Trigger System using Keyword and Speaker
Recognition. Consumer Electronics, IEEE Transactions on (Volume:55 , Issue: 4 ), (pp. 2377‐
2384).
Maiorana, E., Campisi, P., González‐Carballo, N., & Neri, A. (2011). Keystroke dynamics
authentication for mobile phones. SAC '11 Proceedings of the 2011 ACM Symposium on
Applied Computing (pp. 21‐26). New York: ACM New York, NY, USA ©2011.
McLoughlin, & Mohanavel. (2009). Keypress biometrics for user validation in mobile consumer
devices. (pp. 280‐284). Kyoto: Consumer Electronics, 2009. ISCE '09. IEEE 13th International
Symposium on.
Mendaza‐Ormaza, A., Miguel‐Hurtado, O., Blanco‐Gonzalo, R., & Jose Diez‐Jimeno, F. (2011). Analysis
of handwritten signature performances using mobile devices. Security Technology (ICCST),
2011 IEEE International Carnahan Conference on, (pp. 1‐6). Barcelona.
Minh Thang, H., Quang Viet, V., Dinh Thuc, N., & Choi, D. (2012). Gait Identification Using
Accelerometer on Mobile Phone. Control, Automation and Information Sciences (ICCAIS),
2012 International Conference on, (pp. 344‐348). Ho Chi Minh City.
Mobile Marketing. (2015, February). Retrieved from Marketing land:
http://marketingland.com/nearing‐75‐percent‐smartphone‐penetration‐year‐end‐94903
Ng., P. A. (2015, January). Retrieved from opencourseonline:
https://www.youtube.com/watch?v=i25MEJeX0Eg
Oner, M., Pulcifer‐Stump, J., Seeling, P., & Kaya, T. (2012). Towards the run and walk activity
classification through step detection ‐ an android application. 34th Annual International
Conference of the IEEE EMBS. San Diego, California, USA.
physio‐pedia. (2015, February). Retrieved from http://www.physio‐
pedia.com/images/b/b0/Figure2.jpg
Radu, P. (2012). Image Enhancement vs Feature Fusion in Colour Iris Recognition. (pp. 53‐57). Lisbon:
Emerging Security Technologies (EST), 2012 Third International Conference on.
Radu, R., Sirlantzis, K., Howells, W., Hoque, S., & Deravi, F. (2012). Image Enhancement vs Feature
Fusion in Colour Iris Recognition. (pp. 53‐57). Lisbon: Emerging Security Technologies (EST),
2012 Third International Conference on.
Ranga, A. (2015, January). Retrieved from https://amitranga.wordpress.com/machine‐
learning/support‐vector‐machines/
Ritchie, R., Rubino, D., Michaluk, K., & Nickison, P. (2013, 09 24). Retrieved from android central:
http://www.androidcentral.com/talk‐mobile/future‐authentication‐biometrics‐multi‐factor‐
and‐co‐dependency‐talk‐mobile
Shabeer, H., & Suganthi, P. (2007). Mobile Phones Security Using Biometrics. Conference on
Computational Intelligence and Multimedia Applications, 2007. International Conference on,
(pp. 270‐274). Sivakasi, Tamil Nadu.
Shanmugapriya, D., & Padmavathi, G. (2009). A Survey of Biometric keystroke Dynamics: Approaches,
Security and Challenges. (IJCSIS) International Journal of Computer Science and Information
Security, (pp. Vol. 5, No. 1).
Sierra, A. d., àvila, C. S., del Pozo, G. B., & Casanova, J. G. (2011). Gaussian multiscale aggregation
oriented to hand biometric segmentation in mobile devices. Nature and Biologically Inspired
Computing (NaBIC), 2011 Third World Congress on, (pp. 237‐242). Salamanca.
Sierra, A. d., Casanova, J. G., Ávila, C. S., & Vera, V. J. (2009). Silhouette‐based hand recognition on
mobile devices. Security Technology, 2009. 43rd Annual 2009 International Carnahan
Conference on, (pp. 160‐166). Zurich.
Sim, T., & Janakiraman, R. (2007). Are Digraphs Good for Free‐Text Keystroke Dynamics? (pp. 1‐6).
Minneapolis, MN: Computer Vision and Pattern Recognition, 2007. CVPR '07. IEEE
Conference on.
Tao, Q., & Veldhuis, R. (2006). Biometric Authentication for a Mobile Personal Device (pp. 1‐3). San
Jose, CA: Mobile and Ubiquitous Systems: Networking & Services, 2006 Third Annual
International Conference on.
Tao, Q., & Veldhuis, R. (2010). Biometric Authentication System on Mobile Personal Devices. (pp.
763‐779). Instrumentation and Measurement, IEEE Transactions on (Volume:59 , Issue: 4 ) .
Trewin, S., Swart, C., Koved, L., Martino, J., Singh, K., & Ben‐David, S. (2015, January). Biometric
Authentication on a Mobile Device: A Study of User Effort, Error and Task Disruption.
http://researcher.ibm.com/. Retrieved from http://researcher.ibm.com/researcher/files/us‐
kapil/ACSAC12.pdf
Trojahn, M., & Ortmeier, F. (2013). Toward mobile authentication with keystroke dynamics on
mobile phones and tablets. Advanced Information Networking and Applications Workshops
(WAINA), 2013 27th International Conference on, (pp. 697‐702). Barcelona.