


The Windows-Users and -Intruder simulations Logs Dataset (WUIL): An Experimental Framework for Masquerade Detection Mechanisms

J. Benito Camiña, Carlos Hernández-Gracidas, Raúl Monroy∗, Luis Trejo

Computer Science Department, Tecnológico de Monterrey, Campus Estado de México

Carretera al Lago de Guadalupe Km. 3-5, Atizapán, Estado de México, 52926, México

Abstract

We introduce a new masquerade dataset, called Windows-Users and -Intruder simulations Logs (WUIL), which, unlike existing datasets, involves more faithful masquerade attempts. While building WUIL, we have worked under the hypothesis that the way in which a user navigates her file system structure can neatly set apart a masquerade attack. Thus, departing from standard practice, we state that it is not a user action, but the object upon which the action is carried out, that distinguishes user participation. We shall argue that this approach, based on file system navigation, provides a richer means, and at a higher level of abstraction, for building novel models for masquerade detection.

We shall devote an important part of this paper to describing WUIL's content: what information about user activity is stored and how it is represented; prominent characteristics of the participant users; the kinds of masquerade attacks to be timely detected; and the way they have been simulated. We shall argue that WUIL provides reliable data for experimenting on close-to-real-life instances of masquerade detection, as well as for conducting fair comparisons of rival detection mechanisms, hoping it will be of use to the research community.

∗Corresponding author
Email addresses: [email protected] (J. Benito Camiña), [email protected] (Carlos Hernández-Gracidas), [email protected] (Raúl Monroy), [email protected] (Luis Trejo)

Preprint submitted to Expert Systems with Applications July 9, 2013

As a side contribution of this paper, we use WUIL to conduct a simple comparison of two masquerade detection methods: one based on SVM, and the other based on KNN. While this comparison experiment is not central to the paper, we expect it to motivate research exploring the masquerade detection problem more deeply, and to spread the use of WUIL. In a similar vein, we provide directions for further research, hinting at how to use the features contained in WUIL, and hoping others will find them appealing.

Keywords: masquerade dataset, masquerade detection, computer security

1. Introduction

Information is extremely critical and valuable. As more information is stored in computers, it is paramount to timely detect whether one's computer session is being illegally seized by an intruder, a so-called masquerader. Failing to do so may result in countless losses.

Masquerade detection is approached as an anomaly detection task, where the masquerade detection mechanism aims at distinguishing any deviation of the current user activity from a given profile of ordinary user behavior. It has been actively studied since the seminal work of Schonlau et al. [1], who suggested that, to profile a user, one should use the commands she has typed during a UNIX session. Schonlau et al.'s masquerade dataset, called SEA, has been the de facto standard for developing and comparing rival masquerade detection mechanisms.

SEA, however, presents severe limitations. Most prominently, it does not include a collection of faithful intrusion attempts. Instead, Schonlau et al. have adopted a one-versus-the-others (OVTO) masquerade approach, where an ordinary session from a user is taken to be an intrusion attempt against another. Further, masquerade detection based on command usage has proven not to be powerful enough [2]. As a result, research has resorted to profiling a user considering alternative sources of activity, including device usage (e.g., the keyboard) [3, 4], application usage [5], or search behavior [6]. This, in turn, has yielded new masquerade datasets; some of them enable the development and fair comparison of new detection mechanisms, but others do not have a clear working hypothesis for user profiling. What is more, they either follow the OVTO approach for masquerade detection, or consider a restricted masquerade scenario.


In this paper, we introduce a new masquerade dataset, called Windows-Users and -Intruder simulations Logs (WUIL). WUIL contains information about both user activity, in terms of file system usage, and, unlike rival datasets, faithful masquerade attempts. It has been built under the working hypothesis that, to characterize user behavior, we should observe the way she navigates her file system structure. Thus, it is the object upon which an action is carried out that distinguishes user participation; this is unlike existing approaches, which consider the user action only. We shall argue that file system navigation provides a richer means, and at a higher level of abstraction, for building novel models for masquerade detection; furthermore, it is not device-, platform-, or application-dependent.

We shall devote part of this paper to describing WUIL: what information about user activity is stored and how it is represented; prominent characteristics of the participant users; the kinds of masquerade attacks to be timely detected; and the way they have been simulated. We shall argue that WUIL provides reliable data for experimenting on close-to-real-life instances of masquerade detection, as well as for conducting fair comparisons of rival detection mechanisms, hoping it will be of use to the research community.

To give the reader a taste of how to use WUIL to approach masquerade detection, we also report on the results of comparing two masquerade detection mechanisms: one based on Support Vector Machines (SVM), and the other on K-Nearest Neighbors (KNN). While we shall discuss strengths and limitations of this experiment, we insist it is not central to the paper: we expect it to motivate the use of WUIL among the research community, to deepen the general understanding of masquerade detection.

In a similar vein, we provide directions for further research, hinting at how to use the features contained in WUIL, and hoping others will find them appealing. Having signed a non-disclosure agreement, one can freely download WUIL from http://homepage.cem.itesm.mx/raulm/wuil-ds/.

2. Existing Datasets for Masquerade Detection

This section outlines the strengths and weaknesses of existing datasets for masquerade detection, which prompted the development of WUIL. Since our main interest consists in describing and assessing the most prominent datasets used to detect masquerade attacks, we will focus on the structure and features of each dataset presented. The reader interested in validation details and classification results is referred to the original works.


2.1. The SEA Dataset

The SEA dataset [7] contains log information about the activity of 70 UNIX users. Each user log consists of 15,000 commands, from which options and arguments have been stripped off, and which were gathered via the acct auditing mechanism. SEA follows the OVTO masquerade approach, so 50 users were selected to serve as honest holders, while the remaining ones were (artificially) set to be masqueraders.

Every user log is divided into blocks, each of size 100 commands. A command block is arbitrarily called a session. Thus, a user log amounts to a sequence of 150 sessions. The first 50 sessions of every user are left untouched, constituting the construction set, and are used for developing the proposed masquerade detection mechanism. The last 100 sessions, however, may or may not have been contaminated, constituting the validation dataset, and are used for validating the mechanism.
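The block/session layout just described can be sketched as follows (a minimal illustration in Python with dummy data; the function names are ours, not part of SEA):

```python
# Sketch of how SEA organizes each user's log: 15,000 truncated
# commands split into 150 "sessions" of 100 commands each; the first
# 50 sessions form the construction (training) set, and the remaining
# 100 form the validation set. Illustrative only.

def split_sessions(commands, block_size=100):
    """Split a flat command list into fixed-size session blocks."""
    assert len(commands) % block_size == 0
    return [commands[i:i + block_size]
            for i in range(0, len(commands), block_size)]

def sea_partition(commands):
    sessions = split_sessions(commands)      # 150 sessions
    return sessions[:50], sessions[50:]      # (construction, validation)

# Example with dummy data:
log = ["cmd"] * 15_000
train, valid = sea_partition(log)
assert len(train) == 50 and len(valid) == 100
```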

While SEA makes it possible to conduct a fair comparison of rival masquerade detectors, it is very restrictive for testing how powerful a method is, especially when more realistic conditions are considered [2]. To begin with, a supposed masquerader does not have any kind of intrusion intention; nor does he have any knowledge about the profile of his intended victim. A masquerade session is just a sequence of commands that a user, unwittingly marked as a 'masquerader', would type in a UNIX session. Moreover, in SEA, some of such masqueraders show very simple and repetitive behavior, which can be easily marked as unusual, even by human inspection.

2.2. SEA Extensions

There have been several attempts at getting around the limitations of SEA. Work in this vein ranges from redefining the masquerade detection experiments set by Schonlau et al., as, e.g., in [8], to considering enriched command lines [9].

In [8], SEA is used, but with the commonly named 1v49 configuration. This configuration again follows an OVTO approach; here, the first 50 sessions of a user are used to create that user's profile, but the first 50 sessions of the other users are used to create a non-user profile. The problems with this configuration are that the OVTO approach is still used, and that training is done not only with the user's information, but also with the masqueraders', which is unrealistic.

In the works [2, 10], SEA is used, but with new masquerade attacks. The new attacks are synthetic, created based on the commands commonly used by the user, making them more difficult to detect. This still has the problem that the attacks are not close to real ones.

Other work [9] compares the idea of using truncated commands, as SEA does, versus enriched commands. To do that, the authors used the Greenberg dataset [11], which gathers UNIX enriched commands of 168 users over a period of 4 months, divided into 4 categories: novice programmers, experienced programmers, computer scientists, and non-programmers. To use the dataset, they first filtered it to obtain one version with truncated commands, like SEA, and another with enriched commands. For the experiment, they used 50 users as legitimate users, taking 1,000 commands to train each profile and 1,000 to validate it. For the attacks, 100 commands from each of 25 other users were used to create the masquerade set, yielding 2,500 attack commands, with 30 blocks of 10 commands inserted into the validation set of each user; again, there are no real attacks, and the OVTO approach is applied.

To overcome the well-known limitations of SEA, one may embark on gathering faithful UNIX sessions; however, such a task has proven to be complex, given that real UNIX sessions are sparse and, hence, difficult to obtain. This has been shown by Chinchani et al. [12], who have attempted to synthesize sequences of user commands, with the aim of enabling the construction of a model for ordinary behavior. RACOON, Chinchani et al.'s tool, synthesizes user sessions in order to get around the inherently long time that the collection of real ones takes. Its ultimate aim is to speed up the development and evaluation of masquerade detection systems, but the problem is that the data is still far from real.

As shown above, the results of the works based on UNIX commands have not been good enough. So, recently, research on masquerade detection has been looking at alternative sources of user activity to better profile user behavior. In what follows, we outline prominent works.

2.3. Online Games

The work in [13] was based on the behavior patterns of users when playing online games. The objective was to prevent accounts from being stolen; in these kinds of games, there are items that can be sold for real money.

In this work, they study the behavior of users based on idle times between movements of the game character. They use two different times: the activity period and the idle period. The activity period is the time during which the character moves constantly, with pauses shorter than one second. The idle period is the time without movement, greater than one second. They used a period of three days to gather timing information and create logs of normal user behavior. They built a dataset with 287 players of a commercial game, but as with other datasets, they followed the OVTO approach.

2.4. Mouse Usage

In [3], a dataset is built based on information about keyboard activity, mouse movements and clicks, running processes, and user commands. Although a lot of information on user interaction is collected, the dataset has only three users. Furthermore, the authors only use mouse features, such as mouse clicks, distance between clicks, mouse speed, and mouse movement angle. For validation, they do not use attacks but the behavior of other users, which again is not close to real attacks.

There are other works based on mouse interaction, such as [14]. Here, the authors try to differentiate a masquerader from a user based on mouse movements and events. They gathered information about mouse coordinates each time the mouse was moved, as well as features like distance, angle, and velocity between one point and another. The dataset they created consists of 18 users with 10,000 coordinates each; all users were working in Internet Explorer. This work also follows an OVTO approach.

Another work based on mouse usage is [15], where the authors presented to the user a graphical 5x5 matrix; each cell of the matrix was a button with no special significance. The dataset they created was based on user interaction after users were asked to use the program five times. The problem here is that there is no real application; users had to click the buttons in the matrix in a certain sequence in order to create a user profile. Finally, a particular user was identified after a new interaction with the matrix. However, they only have five users, so it is not so difficult to differentiate one from the others. We see again an OVTO approach being used.

2.5. Keyboard Usage

Some of the most prolific works in masquerade detection based on keyboard usage are those that analyze typing patterns [16, 17, 18, 19, 20, 21, 22]. Most of these datasets are created by typing a password. The task of typing the password is repeated a number of times (from a few dozen to thousands in one case). Comparison of these works becomes very challenging, if not unfair, since the features used vary, as does the number of times passwords are typed.


One of the most prominent works in this area is that of Maxion and Killourhy [4]. In this work, the authors used 51 users to collect, in several sessions of 50 records each, a total of 400 records per user. Moreover, they captured information about the users, such as gender and handedness. The password was the same for all users; the authors focused only on comparing the way one user types the password against the others. They obtained 31 features from the keystrokes. The experiments still follow the OVTO approach, but in this case it is acceptable, because the only way to validate these experiments is when all users have the same password.

2.6. Clustering User Search Behavior

In a complementary vein, research on masquerade detection has attempted to characterize user activity considering a combination of several observable actions. For example, Salem & Stolfo [6] provide 22 types of user activities, including information gathering, browsing, communications, etc. The dataset used to model user search behavior consists of normal user data collected from a group of 18 individuals and simulated masquerader data from 40 different individuals, over a period of 4 days and with more than 500,000 records per user, on average. Data collection was achieved by a Windows host sensor developed by the authors, which mainly logged all registry-based activity, process creation and destruction, window GUI and file accesses, as well as DLL libraries' activity. The 10 GB dataset, called RUU (Are You You?), is publicly available1.

Masqueraders were asked to act according to one of the following three scenarios: malicious, where the goal was to find financial data on a coworker's desktop computer within a 15-minute period; benign, similar to the previous one, but where the masquerader used his coworker's computer for legitimate purposes, assuming he had no access to his own computer; and neutral, where masqueraders were left to freely choose whether or not to access their coworker's desktop. These masquerader attacks were simulated in a lab using a single pre-configured machine, which can be considered a major drawback of the final dataset. Thus, the masquerade detection mechanism is ad hoc, in that it is Operating System (OS) dependent, and needs to be adapted whenever a change of platform is in order.

Roughly, Salem & Stolfo's working hypothesis states that, upon an intrusion, a masquerader will explore the target victim's computer, and that this search behavior will clearly differ from that of the user, who often knows what she looks for and where to find it. The taxonomy developed by the authors is used to translate a sequence of system behavior into a sequence of abstract actions, which is used to build an ocSVM as a classifier model for the user. To validate the alleged participation of a user in a 2-minute observation window, the window activity is also abstracted using the taxonomy, and then the window is classified (as normal or not) using three main features, namely: the number of automated-search-related actions, the number of file touches, and the percentage of file system navigation actions.

1Dataset: http://www1.cs.columbia.edu/ids/RUU/data/.

While interesting and well designed, Salem & Stolfo's masquerade dataset and detection approach present several drawbacks. One of the major limitations is that, even though they do not follow the OVTO approach, the attacks are still far from being faithful to a masquerade scenario. The attacks have been conducted on an artificially set-up computer, not on a real user's one: search behavior is somehow affected by where one searches. The artificially set-up computer has 70 GB of files, but the authors fail to explain how searching behavior is affected by the size of the hard disk, the size of the files, or the data volume.

Salem & Stolfo collected user activity for about four days on average. This might not be enough, because in such a short period of time users might not have needed to search for anything at all. Furthermore, users' behavior normally changes when they are aware that their actions are being monitored, especially during the first days. Hence, it is recommended not to take the first logged records into account; in this case, that was clearly not possible, since the collection time was too short. Moreover, the lack of details regarding the methodology used makes it impossible for the reader to duplicate the authors' results.

Another limitation of the RUU dataset is that the masquerade attacks used for testing do not consider all the different masquerade scenarios, namely: malicious, benign, and neutral. Our experiments in Section 6 show that it is much easier to spot an eager, malicious masquerader than one who explores the victim's file system naïvely, and yet does not hesitate to grab information that he judges critical or valuable. On the other hand, the number of file touches and the percentage of file system navigation actions hardly relate to search behavior, as they could be associated with a number of activities other than masquerade ones. For example, if a user leaves the mouse pointer over a folder for a few milliseconds, a contemporary MS Windows operating system will automatically traverse the structure underneath that folder recursively, in order to compute relevant information, including folder size, file count, etc. This fact was overlooked while building RUU.

An analysis of the homogeneity of the users and the simulated attacks is important. This is because a more homogeneous dataset allows for a more confident analysis of the results, while a more heterogeneous dataset makes it possible to test classifiers in a wider range of environments. The simulated attacks in the RUU dataset are homogeneous, because they have been conducted on the same testing machine. However, we do not know how similar the structure of the RUU users' file systems is, so we cannot completely agree that their user base is homogeneous, even with the same background and education, because the data on how every user distributes his data on his computer is not provided.

2.7. Disadvantages of Current Datasets

Summarizing, existing datasets for masquerade detection suffer from some, if not all, of the following drawbacks:

• They yield a simple structure on user activity: for example, in the command-based approach, there is no relation that can be exploited, other than grouping commands into a script.

• They adopt the OVTO approach, and, hence, lack faithful intrusion attempts.

• They are dependent on external issues. For example, in the command-based approach, a masquerade detector is designed towards a given OS, while in the password-based approach, it is towards a specific keyboard device. A similar argument applies to the search-based approach, which depends on the command taxonomy, but it is magnified, since it further depends on very specific aspects of the OS internal workings.

• They demand very unrealistic development or update scenarios. For example, in the password-based approach, an organization may require a user to change her password at least once a month, which might get in the way of collecting 400 password typing samples.

• They consider unrealistic masquerade epochs. For example, SEA considers sessions 100 commands long.

• They are fixed in size; they are not continuously growing or improving datasets.

Thus, [23]'s statement still prevails: "A unified and coherent framework for testing and comparing different algorithms in a consistent, reliable and reproducible way could benefit the entire community of intruder detection research". We believe WUIL (and WUIL second generation) could fill in this gap. We shall now discuss the rationale behind the development of WUIL, as well as motivate the design decisions taken during its development.

3. The WUIL Dataset

The design and development of WUIL has been driven by a simple working hypothesis: the way a user navigates the structure of her File System (FS) suffices for masquerade detection. In WUIL, FS navigation comprises two key aspects: first, the FS object upon which a user has carried out an action; and second, information as to how each of these objects is used over a session. This information is captured by means of a navigation structure.

3.1. Navigation Structure

A navigation structure contains information about the objects touched by the user, whether directly, e.g., by means of a mouse click, or indirectly, e.g., through an application. It is composed of two graphs: an access graph and a directory graph. The access graph holds the (most recently) visited FS objects, and provides the order (by means of a traversal path) that the user has followed on each visit. By contrast, the directory graph is a proper subgraph of the user's FS; as expected, it is an arborescence, with a distinguished vertex, called the root, denoted by "/".

3.1.1. Access Graph

Let FS and objects(FS) denote the FS of user u, and the set of objects in it (files and folders), respectively.² The access graph of u is a directed graph, $G = (V, E)$, where $V \subseteq \mathrm{objects}(FS)$ and $E \subseteq V \times \mathbb{N} \times V$. $E$ is such that $n \xrightarrow{k} n' \in E$ if and only if, having visited $n$, $u$ has $k$ times next visited $n'$. Initially, $E = \emptyset$ and $V = \{/\}$. Then, upon a user access to node $n'$, departing from node $n$, $G$ is updated as follows:

$$(V, E) = \begin{cases} (V,\ (E \setminus \{n \xrightarrow{k} n'\}) \cup \{n \xrightarrow{k+1} n'\}) & \text{if } n \xrightarrow{k} n' \in E \\ (V \cup \{n'\},\ E \cup \{n \xrightarrow{1} n'\}) & \text{otherwise} \end{cases}$$

²For the sake of brevity, and given that the user, u, is understood from the context, we shall refrain from subscripting with u all the symbols defined in this section.


Every node in the access graph is a tuple with two elements: i) the path to its associated FS object, and ii) its weight. For every node $n \in V$, the path of $n$, denoted $\pi(n)$, is a string of the form $/(\alpha/)^{*}\alpha$, where $\alpha$ stands for an alphanumeric string, representing the name of the file system object, and ${}^{*}$ for the transitive, reflexive closure of string concatenation, denoted by juxtaposition. The weight of a node $n$, denoted $\mathrm{weight}(n)$, represents the number of times $n$ has been accessed. Formally, it is given by:

$$\mathrm{weight}(n) = \mathrm{foldl}(\{\!|\, k \mid \exists n'.\ n' \xrightarrow{k} n \,|\!\}, +, 0)$$

where $\{\!|\ldots|\!\}$ denotes a multiset, and foldl the higher-order function given by $\mathrm{foldl}(\{\!|X_0, X_1, \ldots, X_n|\!\}, F, E) = F(\ldots F(F(E, X_0), X_1) \ldots, X_n)$, where $E$ and $F$ denote, in turn, the base element and the step function, respectively.
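The access-graph update and weight computation can be sketched as follows (our own illustrative implementation in Python, not the authors' code; edge counters stand in for the $k$-labeled edges):

```python
# Sketch of the access graph: edges[(n, n2)] = k encodes n -k-> n2,
# i.e., having visited n, the user next visited n2 k times. The
# weight of a node is the sum of the counts on its incoming edges.
from collections import Counter

class AccessGraph:
    def __init__(self):
        self.nodes = {"/"}          # V, initially just the root
        self.edges = Counter()      # E: (n, n2) -> k

    def visit(self, n, n2):
        """Record that, having visited n, the user next visited n2."""
        self.nodes.add(n2)
        self.edges[(n, n2)] += 1    # creates the edge with k = 1 if absent

    def weight(self, n):
        """Number of times n has been accessed."""
        return sum(k for (_, dst), k in self.edges.items() if dst == n)

g = AccessGraph()
g.visit("/", "/docs")
g.visit("/docs", "/docs/a.txt")
g.visit("/", "/docs")
assert g.weight("/docs") == 2
```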

3.1.2. Directory Graph

A directory graph is a directed acyclic graph representing a subtree of the user file system, FS: $D = (V', E')$, where $V' \subseteq \mathrm{objects}(FS)$ and $E' \subseteq V' \times V'$, so that $(n, n') \in E'$ if and only if $n'$ is a child of $n$ in FS. Regarding the access graph, $G = (V, E)$, notice that $V \subseteq V'$, and that, for every node $n \in V$, $V'$ contains as many nodes as there are along $\pi(n)$.

Like the access graph, the directory graph is initially $D = (\{/\}, \emptyset)$. Then, upon a user access to any object $o$ in FS, we update $D = (V', E')$ as follows. First, we compute the node pairs along $\pi(o)$: let $\mu$ be a singleton string; then:

$$p_0 = \{(n, n') \mid \exists \alpha, \alpha', \mu.\ \pi(o) = \alpha\mu\alpha',\ \pi(n) = \alpha,\ \pi(n') = \alpha\mu\}$$

Next, for every pair $(n, n') \in p_0$:

$$(V', E') = \begin{cases} (V' \cup \{n'\},\ E' \cup \{(n, n')\}) & \text{if } (n, n') \notin E' \\ (V', E') & \text{otherwise} \end{cases}$$

Every node $n \in V'$ is a tuple of two elements: i) its last access time, and ii) its position with respect to $D$, denoted $\mathrm{pos}(n, D, \pi(n))$. Let $/(\alpha_1, \ldots, \alpha_k)$ denote $D$'s root node, $/$, together with its $k$ object children; then, we define node position by:

$$\mathrm{pos}(n, /(\alpha_1, \ldots, \alpha_k), \pi(n)) = \begin{cases} [\,] & \text{if } \pi(n) = / \\ [i] & \text{if } \pi(n) = /\alpha_i \\ [i] \cdot \mathrm{pos}(n', \alpha_i, \pi(n')) & \text{if } \pi(n) = /\alpha_i\,\pi(n') \end{cases}$$
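The directory-graph update and the node-position function can be sketched in Python (an illustration under our own representation, not the authors' implementation: a path like "/a/b/c" contributes the parent/child pairs of $p_0$, and a position is a list of 1-based child indices):

```python
# Sketch of the directory graph: children maps each node to the
# ordered list of its children; access() inserts every prefix pair
# along a path (the p0 pairs), and pos() returns the node position
# as a list of child indices from the root.

class DirectoryGraph:
    def __init__(self):
        self.children = {"/": []}   # node -> ordered list of children

    def access(self, path):
        """Insert every (parent, child) pair along `path`."""
        parent = "/"
        for part in path.strip("/").split("/"):
            child = parent.rstrip("/") + "/" + part
            self.children.setdefault(child, [])
            if child not in self.children[parent]:
                self.children[parent].append(child)   # new (n, n') edge
            parent = child

    def pos(self, path):
        """Position of `path` w.r.t. the root: list of child indices."""
        if path == "/":
            return []
        position, parent = [], "/"
        for part in path.strip("/").split("/"):
            child = parent.rstrip("/") + "/" + part
            position.append(self.children[parent].index(child) + 1)
            parent = child
        return position

d = DirectoryGraph()
d.access("/docs/work/report.txt")
d.access("/music/song.mp3")
assert d.pos("/music") == [2]
```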

11

Navigation Structure Maintenance. To keep the navigation structure of a given user at a manageable size, we have applied Least Recently Used (LRU), a page replacement algorithm for virtual memory management, so that a file that has not been used in a certain number of days (e.g., a file that was deleted from the file system, or that was moved to another folder) is deleted from the structure. Other algorithms, such as FIFO or Most Frequently Used, are both applicable and easy to implement, while others, e.g., Second Chance or Not Recently Used, would require us to include more accounting information inside a navigation node. We regularly update the navigation structure, at intervals that can be programmed using a setup code.
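The LRU-style eviction just described can be sketched as follows (our own minimal illustration; the 30-day threshold is an assumed parameter, as the paper does not fix one):

```python
# Sketch of LRU-style maintenance: nodes whose last access is older
# than a cutoff are evicted from the navigation structure. The
# threshold below is an assumption for illustration only.
import time

MAX_IDLE_SECONDS = 30 * 24 * 3600   # assumed threshold: 30 days

def evict_stale(last_access, now=None):
    """Return the node map with stale entries removed.

    `last_access` maps each navigation-structure node to the time of
    its most recent access (seconds since the epoch).
    """
    now = time.time() if now is None else now
    return {node: t for node, t in last_access.items()
            if now - t <= MAX_IDLE_SECONDS}

nodes = {"/docs": 9_999_000, "/old": 0}
kept = evict_stale(nodes, now=10_000_000)
assert "/docs" in kept and "/old" not in kept
```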

3.2. Using WUIL for Masquerade Detection

There are many aspects of a navigation-based profile that can be exploited in the construction of a masquerade detector, including object usage dependencies, object usage frequencies, etc. Below, we discuss two aspects that illustrate the richer structure provided by our approach.

Locality: To begin with, a file system is a data structure; so, not only can the use of objects be related with ordering, but it can also be related with space. Thus, we may exploit two standard OS notions: first, temporal locality (an object that has been referred to recently is likely to be referred to again soon), and second, spatial locality (an object near another that has been recently referred to is likely to be referred to soon). Often, a user accesses files close to one another (spatial locality), and, over a short period of time, she repeatedly accesses the same files (temporal locality). By contrast, a masquerader will access many files and folders almost at random, without repeatedly using the same files.

Tasks: We hypothesize that a user files related objects into a single folder, and that each of these folders accounts for a user task (e.g., job, entertainment, family, etc.). Then, we may profile user behavior in terms of task transitions. Notice that, this way, we abstract out a notion of user actions at a higher level than the one obtained in [6], which considers low-level actions, such as editing, compiling, etc.

There are, of course, others. For example, since we have given each FS object a position, we may use this notion in an attempt to localize where, in the FS, the user usually works.
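One way the locality notions above could be turned into numeric features is sketched below (our own illustration, not the authors' method): temporal locality as the fraction of repeated accesses in a window, and spatial locality as the mean tree distance between consecutively accessed paths. Under this sketch, a masquerader browsing at random would score low on the first and high on the second.

```python
# Illustrative locality features over a window of accessed paths.

def tree_distance(p1, p2):
    """Edges between two paths in the FS tree (via common prefix)."""
    a, b = p1.strip("/").split("/"), p2.strip("/").split("/")
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return (len(a) - common) + (len(b) - common)

def locality_features(window):
    """Return (temporal, spatial) locality scores for a path window."""
    repeats = len(window) - len(set(window))
    temporal = repeats / len(window)
    spatial = (sum(tree_distance(p, q) for p, q in zip(window, window[1:]))
               / max(1, len(window) - 1))
    return temporal, spatial

w = ["/docs/a.txt", "/docs/b.txt", "/docs/a.txt"]
t, s = locality_features(w)
assert s == 2.0   # each consecutive pair is 2 tree edges apart
```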

Summarizing, our navigation-based approach to masquerade detection has the following benefits:

• It allows us to simulate a masquerade attempt, so as to make it fit any requirements in terms of duration, working scenario, etc.

• It allows us to develop and update a user profile under close to real working conditions. A navigation-based masquerade detector is more transparent to the user, as it does not require her to give it any special attention.

• It is device- and platform-independent.

• It provides a richer structure, with which we may build masquerade detectors at a higher level of abstraction.

Notice, however, that a masquerade attempt against one user cannot be easily translated into one against another; so, in our approach, OVTO is not straightforward.

We are now ready to give the details of the content of WUIL. Since, unlike other masquerade datasets, it contains the simulation of (three types of) attacks, we shall split our discussion into two parts: one for the users (Section 4), and the other for the attacks (Section 5).

4. The Construction of WUIL: the User Dataset

Currently, WUIL comprises records of 20 users. Each user is characterized by a collection of logs. User logs were captured over an observation period, ranging from five to ten weeks of ordinary working days. Our logs comprehend information about the navigation of user objects only; put differently, WUIL involves no records about objects that are part of the OS or an application.

4.1. The MS Windows Audit Tool

Although our approach based on file system navigation is not tied to a particular OS, we have targeted MS Windows, because it is widely used and, hence, makes it easy to recruit volunteers.

The price we had to pay is the use of the MS Windows logging tool, audit. In what follows, we report our experiences after this selection decision, which might be worth considering if someone is to replicate our experiments. Firstly, the functionality of audit is not uniform across the different Windows versions; thus, not only will pre-processing be in order, but it will also be Windows-version dependent. Secondly, audit is not highly customizable; further, since it is not open, the audit code cannot be modified so as to make it provide the information one is after. Thirdly, audit, or Windows to this end, does not explicitly distinguish the actions raised by the user (directly or indirectly) from those raised by an OS facility or any other system application; so, a lot of extra information is recorded. Finally, and more importantly, whenever the mouse is on top of a folder, some versions of Windows explore its content, regardless of whether the user has clicked on it or not. If overlooked, this fact may severely affect the validation of a masquerade detection working hypothesis, since these log records do not account for user FS navigation at all. These records cannot be stripped off fully automatically, since there are legitimate user actions that would yield a similar record history.

Yet, Windows allows for a heterogeneous sampling with respect to user role, proficiency, and other characteristics. Also, in most Windows versions, audit can be customized at least to monitor activity on specific folders only, thus reducing storage space requirements somewhat. To gather user data, we enabled audit only on the folders that each user specifically stated to be working on. Most commonly, we have monitored file system usage on two folders, namely Desktop and My Documents, but for some users we had to audit additional folders or even drives.

4.2. Users and User Logs

At the time of writing, WUIL contains logs of 20 users (though we are continuously recruiting volunteers). They all signed a standard confidentiality agreement, and received a small compensation for taking part in this research.

Table 1 provides basic information about the users. As can be noticed, the number of logged days varies from one user to another. This is because not all users are geographically close to one another, and so it was not feasible to have a fixed start and end date. The amount of information stored in the user navigation structure mostly depends on the user (role, amount of time spent with a computer, etc.), rather than simply on the number of audited days (see Table 2).

While recording user activity, we have striven not to introduce any sort of bias. However, users are all unique; what might work with one user might not work for others. In an attempt to explain possible unusual masquerade detection results, we have asked each user to fill in a survey with further personal information. Through this survey, we have gathered additional information, such as age, gender, addressing title, etc. We also have subjective information that might prove useful while using WUIL to validate


Table 1: Basic information about all WUIL users, namely: role, MS Windows version, and the number of days logged using the audit tool.

User  Role           MS Windows Ver.  No. of logged days
1     Manager        Windows XP       49
2     Manager        Windows XP       48
3     Systems Area   Windows XP       35
4     Accountant     Windows XP       29
5     Secretary      Windows XP       54
6     Secretary      Windows XP       55
7     Secretary      Windows XP       56
8     Secretary      Windows XP       31
9     Secretary      Windows XP       34
10    Shipping Area  Windows Vista    33
11    Secretary      Windows XP       13
12    Programmer     Windows 7        44
13    Student        Windows Vista    40
14    Sales          Windows 7        34
15    PhD Student    Windows Vista    54
16    PhD Student    Windows 7        37
17    Student        Windows 7        35
18    Student        Windows 7        30
19    Secretary      Windows Vista    36
20    PhD Student    Windows XP       23

a specific masquerade detection hypothesis. This involves, for example, how (un)tidy each user considers she keeps her working folders, how tidy she considers herself at organizing and carrying out general tasks related to her job, how computer knowledgeable she is, or how capable she is at customizing the underlying operating system, to mention a few. These surveys are all available as well, by sending e-mail to the first author.

Roughly, our enquiries report that the average user age is 38.78, ranging from 18 to 61 years old; the average time each user spends with her computer is 8.07 hours per day; the average self-score for user tidiness is 8.15; the average self-score for skills regarding usage of MS Office is 6.73; and the average self-score for knowledge on making administrative changes to MS Windows is 6.10. Average self-scores were all determined considering the answer given by each user, on a scale from 1 to 10, where 1 means none and 10 all.

4.3. Filtering and Pre-processing

The MS Windows audit tool yields logs with data that are spurious to our purposes, and, so, we have developed a few scripts to automatically filter such data out. Also, user logs contain information that cannot simply be disseminated without jeopardizing user anonymity, so we have also developed


Table 2: The number of accesses of each user before and after filtering.

User    Original Log Accesses    Filtered Log Accesses
1           293,894                  262,944
2            77,608                   63,536
3            40,708                   15,159
4           147,418                  129,847
5           350,800                  338,788
6           369,440                  356,194
7           352,910                  335,081
8            91,852                   85,482
9           178,602                  178,460
10          127,532                  108,643
11          547,545                  463,410
12        2,692,858                  197,639
13        2,307,804                  929,032
14        1,752,639                   19,813
15          454,280                  453,204
16        1,444,914                1,290,295
17          245,862                  213,147
18           10,409                   10,315
19           90,357                   54,302
20           76,943                   57,608

scripts that attempt to hide user identity by means of simple rewriting. We call sanitized the logs that are ready for public dissemination. We note, in passing, that we deliver only either navigation structures or sanitized logs.

A summary of WUIL user logs, after sanitizing, is shown in Table 2. A close look into Table 2 reveals that for some users, e.g. 12 and 14, pre-processing stripped off quite a few records. This is because for these users we could not get audit to record information on specific folders only; so, we got it all.

5. Attacks and their Simulation

One of the main difficulties that the research community in intrusion detection faces is the lack of datasets with real, or at least real enough, attacks (see Section 2). We have developed WUIL in an attempt to fill this gap. In this section, we report on the way we crafted and simulated our masquerade attempts.

5.1. Masqueraders’ Logs

WUIL contains logs of three sorts of (simulated) attacks: basic, intermediate, and advanced. We have arrived at each attack class following the results yielded by a survey, applied to both an undergraduate and a graduate student class. Each participant student was enrolled in a Computer Security module at the time of survey application, and was given 30 minutes to fill in the associated questionnaire.

The questionnaire was rather simple, and mainly required each student to provide a step-by-step description of what she would do in case she had access to a computer session that had been left carelessly unattended. We imposed no restriction as to the ownership of the unattended computer, but asked each student to fill in a descriptor of the target (classmate, professor, subordinate, etc.). We particularly insisted on ruling out installing back doors, or any other form of malicious programs, since it was our intention to understand what a masquerader would be after, and not the means of getting possession of somebody else's computer. We did not disallow the possibility of removing files, as opposed to extracting them, or any other actions, such as removing masquerade tracks.

Our survey yielded several different masquerade strategies, which we grouped into the three sorts of attacks mentioned above, and which we describe in the rest of this section.

5.2. The Basic Attack

The basic attack is one that, roughly, models the case of an occasional masquerader; that is, one who would grab an opportunity, rather than create it. The masquerader's actions are thus simple, and we assume he did not bring a removable storage medium (USB flash drive) with him, only his mobile (smart) phone. Regarding FS operations, the masquerader would only be able to open or close a file. For extracting information, he would be able to either send a limited amount of files by e-mail, and then get rid of any attack tracks, or just use his own memory. More specifically, the procedure we have followed to conduct a basic attack is as follows:

1. Write down the actual time of the day; set a five-minute alarm and stop immediately after the alarm goes off.

2. Go to My Documents, and set it to be the current working folder.

3. Browse the content of the current working folder, looking for a file with an appealing name (e.g., my banking info) that has not already been explored. Open that file and verify that the information is somewhat interesting. If possible, e-mail the file to a personal account; otherwise, take a picture of it with your mobile phone, and then close it. Repeat this step for as many files of interest as there are in the current working directory.


4. Apply this process recursively, on the depth of the tree structure, using a depth-first search strategy, exploring the most appealingly named folders first.

5. If possible, repeat this entire procedure with Desktop, or any other folder of interest, outside My Documents.

5.3. The Intermediate Attack

In the intermediate attack, the masquerader is assumed to have brought a USB flash drive with him. He would follow a very simple and direct procedure to locate somewhat interesting files, and then he would copy them onto his USB flash drive. More specifically, the procedure we have followed to simulate an intermediate attack is the following:

1. Write down the actual time of the day; set a five-minute alarm and stop immediately after the alarm goes off.

2. Insert the USB flash drive.

3. Open MS Windows's search tool.

4. Search for all files with a name matching any of the following patterns: *bank*.*, *password*.*, *pass*.*, *account*.*, *number*.*, and *pin*.*. Copy the files found onto the USB flash drive.

5. Search the target computer again, but now for files with a name matching any of the following patterns: *.doc*, *.xls*, *.txt*, *.jpg*, and *.png*. Copy all files found onto the USB flash drive.

6. Remove the USB flash drive, and remove tracks.
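The file-matching part of steps 4 and 5 can be sketched in Python. The patterns are the ones listed above; the function name and the walk over a directory tree are our own illustration, since the actual attack used the MS Windows search tool:

```python
import fnmatch
import os

# Name patterns from steps 4 and 5 of the intermediate attack.
SENSITIVE_NAMES = ["*bank*.*", "*password*.*", "*pass*.*",
                   "*account*.*", "*number*.*", "*pin*.*"]
DOCUMENT_TYPES = ["*.doc*", "*.xls*", "*.txt*", "*.jpg*", "*.png*"]

def find_matching_files(root, patterns):
    """Walk `root` and collect paths whose file name matches any pattern."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if any(fnmatch.fnmatch(name.lower(), p) for p in patterns):
                matches.append(os.path.join(dirpath, name))
    return matches
```

The matched files would then be copied onto the flash drive; the copy step is omitted here, since only the search behavior is relevant to the log records left behind.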

5.4. The Advanced Attack

In the case of the advanced attack, we assumed that not only has the masquerader brought a USB flash drive with him, but he also has a .bat file to automatically copy out the same files as in the case of the intermediate attack. The steps are as follows:

1. Write down the actual time of the day; set a five-minute alarm and stop immediately after the alarm goes off.

2. Insert the USB flash drive, and open it.

3. Double-click the .bat file.

4. Remove the USB flash drive, and remove tracks.

Attacks were all simulated online: they were carried out directly by one and the same person, and on the actual user's computer. As expected, no bit of information was actually taken away. Extracted files were destroyed


Table 3: Attack logs: number of records before and after filtering.

        Attack 1              Attack 2              Attack 3
User    Total     Filtered    Total     Filtered    Total     Filtered
        Accesses  Accesses    Accesses  Accesses    Accesses  Accesses
1        2,476     2,353       4,851     4,352      14,825    14,825
2        2,049     2,032       4,203     3,968       1,423     1,273
3        2,240     2,155       1,112     1,069       1,904     1,836
4        3,702     3,646       7,603     7,422      11,369    11,235
5        3,085     3,001      15,507    14,925      10,378    10,378
6        3,594     3,499       8,902     8,530       7,466     7,466
7        3,184     3,090      18,916    18,625      11,464    11,464
8        5,331     5,267       8,265     7,898      10,404    10,404
9        3,546     3,546      10,761    10,711       9,519     9,519
10       1,038       730         973       499         148        13
11       6,033     6,033      25,162    25,162         842       842
12       1,966     1,953       3,411     3,411      16,269    16,265
13         259       254      16,098    16,096       1,004     1,004
14      13,404     1,084       4,403     1,315       4,537     1,726
15       1,284     1,284       6,100     6,100       4,171     4,171
16       2,893     2,893         919       919       2,075     2,075
17       1,585     1,585       2,289     2,289       3,342     3,342
18       1,770     1,770         498       498       1,495     1,495
19       1,003       588       1,215       392         225       139
20       3,455     3,443      52,819    52,804      10,361    10,361

in the presence of the user, using a different machine. We were very strict about abandoning the masquerade attempt whenever the alarm went off.

The records output by audit during the simulation of an attack were all removed from every user log. They are kept in separate files, which we call attack logs, each of which is labelled with the user name and the kind of attack. Table 3 displays the number of records in each attack log.

6. Preliminary Experiments

In this section, we shall briefly describe the results obtained through a preliminary experiment on applying WUIL to masquerade detection. The purpose of this experiment is not to obtain a fully functional masquerade detection system, but to provide the reader with sufficient evidence to consider WUIL a working dataset for the development of new masquerade detection mechanisms.

In our experiment, we have adopted a window-based, one-class classification approach. A window comprehends a period of observation of user behavior, and is represented in terms of a vector of features; windows are directly extracted from the user logs during the construction of the associated user navigation structure. We have built a set of classifiers, one for each user, using only a single class of data (user logs assumed to be attack-free); given a window in which a given user has allegedly participated, called a test window, each model strives to identify whether it deviates from those seen during construction.

In what follows, we shall give an overview of how we used the information in the WUIL dataset in order to obtain, for each user, the construction set and the validation dataset (made out of both clean and contaminated windows). We shall also outline the construction of two different one-class classifiers: one based on Support Vector Machines (SVM), and the other on K-Nearest Neighbors (KNN). We are aware of the disadvantages of using non-parametric classifiers, and that their use in a real-life application could have a seriously negative impact; however, these widely used classifiers offer several advantages, such as the chance to exemplify the potential of the dataset and to establish a reliable baseline for future experimentation, which justifies our choice. Accordingly, we shall present the results of validating each masquerade detector, to see how well each method performs in the general case (type of attack) and in the particular case (a user). Finally, we will analyze the results obtained throughout our experimentation and draw some conclusions.

For each user, we take the sanitized logs (both user and attack), and assume that each log entry contains the following information: ENTRY ID, a sequential number used as a unique identifier, starting from 0; DATE, the date the log entry was recorded; TIME, the time when the entry was captured, expressed in 24-hour format; ELAPSED TIME, the time since audit was last enabled; and, finally, OBJECT, the name of the object itself (path included).

6.1. Work logs

Each sanitized log is first transformed into a work log, which includes ordinary and contaminated entries, and which is composed of the construction and the validation dataset. The construction dataset is the first 80% of the original user log, leaving the first 5% of it for warming-up purposes (see below), while the validation dataset is the remaining 20%, which is contaminated following this two-step rule: first, take the log as if it were a tape; next, randomly insert the three attacks of the corresponding user. For the last step, we took one attack at a time; then, we randomly generated a number between 0 and the length of the attack to be inserted; finally, we inserted


Figure 1: A work log. The first 5% is ignored for user profiling, although it is used for graph updating; the following 75% is used for user profiling; finally, the last 20% is used for model validation.

the attack at that position, and repeated this process considering only the remaining part of the tape (not the one that had just been modified).

The warming-up log bit corresponds to a period of time where observed data is considered unreliable. This phase is required since, at first, the navigation structure is empty. Thus, a few early records must be ignored, at least for the construction of a classifier; they are used, though, for the construction of the navigation structure. Figure 1 portrays a work log.
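The splitting and contamination just described can be sketched as follows. This is our reading of the rule above with the record format abstracted away; the function name and the insertion bookkeeping are illustrative:

```python
import random

def build_work_log(user_entries, attacks, seed=0):
    """Split a user log into warm-up (5%), construction (75%) and
    validation (20%) parts, then contaminate the validation part by
    inserting each attack log at a random position. The split ratios
    follow the paper; the insertion scheme is one reading of the text:
    each new attack goes into the part of the "tape" not yet modified."""
    n = len(user_entries)
    warm_up = user_entries[:int(0.05 * n)]
    construction = user_entries[int(0.05 * n):int(0.80 * n)]
    validation = list(user_entries[int(0.80 * n):])
    rng = random.Random(seed)
    offset = 0  # start of the still-unmodified part of the tape
    for attack in attacks:
        pos = rng.randrange(offset, len(validation) + 1)
        validation[pos:pos] = attack
        offset = pos + len(attack)
    return warm_up, construction, validation
```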

6.2. Feature Selection

This section is aimed at explaining the rationale behind the selection of the window features. Before going further, though, we briefly explain how we have defined the window setting. First, there are typically two ways to approach window-based data sampling: either the size of the window is fixed in terms of time, or in terms of a number of events. The two approaches have both advantages and disadvantages; however, selecting one of them affects which features are to be chosen and the way they are obtained; furthermore, it also affects the applicability of a particular classification method (e.g., Naïve Bayes is inapplicable if windowing is time-based). Second, the other design decision in a windowing approach is the step, the length the window is slid for the next sample. In our case, we use a time-based window approach, with a window size of 30 seconds and a step equal to the window size. The rationale behind this design decision has to do with the timely detection of a masquerade attempt; since these experiments are not the crux of the paper, we shall refrain from discussing this issue further.
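The windowing just described can be illustrated with a small sketch that slices a log into non-overlapping 30-second windows. The (timestamp, object) pair format is an assumption made for illustration, not WUIL's actual record layout:

```python
from datetime import datetime, timedelta

def time_windows(entries, size_s=30):
    """Group time-ordered log entries into consecutive windows of
    `size_s` seconds; the step equals the window size, so windows
    do not overlap."""
    if not entries:
        return []
    windows, current = [], [entries[0]]
    window_end = entries[0][0] + timedelta(seconds=size_s)
    for ts, obj in entries[1:]:
        while ts >= window_end:   # close windows until ts fits in one
            windows.append(current)
            current = []
            window_end += timedelta(seconds=size_s)
        current.append((ts, obj))
    windows.append(current)
    return windows
```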

For the selection of the window features, we hypothesize that the intruder knows little about usual user behavior. Thus, each feature should help to minimize the difference among valid-user windows, while maximizing the difference with respect to masquerade ones. The selected window features are discussed below; we use statistics to refer to the mean, median, mode, variance, standard deviation, and sample range of a given window measure.


Accesses: The number of FS objects touched in a window, T, and the number of times an access to an FS object resulted in adding a new node to the navigation structure, N; we call these kinds of FS objects newly accessed. Presumably, these figures will be larger for a masquerade window.

Time between accesses: The elapsed time between two consecutive object accesses. This measure, reported through statistics, is complementary to the number of accesses, and so it also attempts to capture that a masquerader will aim to compromise as many critical FS objects as possible in a short period of time.

Depth: Statistics regarding the depth at which every FS object touched in a window lies in the directory structure. Presumably, having no knowledge of where to find critical FS objects, the attacker will search the user FS eagerly. By contrast, the user will usually work at very distinctive locations, consequently yielding a working depth pattern.

Path distance between accesses: Statistics regarding the path distance between the objects associated with consecutive accesses. This measure complements the one above, since, again, the masquerader has little or no knowledge about the user file system, and, hence, will explore it using a distinctive search pattern.
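The text does not give a formula for the path distance; one natural choice, shown here purely as an assumption, is the number of tree edges separating two objects via their deepest common ancestor:

```python
def path_distance(path_a, path_b):
    """Tree distance between two FS objects: steps up from each path
    to their deepest common ancestor, then summed. This is one natural
    reading of "path distance"; the dataset does not fix a definition."""
    a = path_a.strip("/").split("/")
    b = path_b.strip("/").split("/")
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return (len(a) - common) + (len(b) - common)
```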

Structure features: This is a snapshot of the user navigation structure after processing the given window, and is given by: the total number of vertices (|V|), the total number of directed edges (|Ed|), the total number of undirected edges (|Eu|), the ratio between the total number of vertices and the total number of directed edges (|V|/|Ed|), and the ratio between the total number of vertices and the total number of undirected edges (|V|/|Eu|). Selecting this feature follows directly from the hypothesis that a navigation structure models how the user interacts with her file system.

The feature vector used in our experiments is given in Table 4.

Table 4: The window feature vector.

Accesses   Depth        Time between   Graph features                        Path distance
                        accesses                                             between accesses
T, N       Statistics   Statistics     |V|, |Ed|, |Eu|, |V|/|Ed|, |V|/|Eu|   Statistics
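Part of this feature vector might be assembled as in the sketch below. Time and path-distance statistics are omitted for brevity, and all names are illustrative, not WUIL's actual code:

```python
import statistics as st

def stats(values):
    """Mean, median, mode, variance, std. deviation and sample range."""
    return [st.mean(values), st.median(values), st.mode(values),
            st.pvariance(values), st.pstdev(values),
            max(values) - min(values)]

def window_features(objects, new_count, V, Ed, Eu):
    """Assemble part of the Table 4 feature vector: the access counts
    T and N, depth statistics, and the navigation-structure snapshot
    (vertex/edge counts and their ratios)."""
    depths = [p.strip("/").count("/") + 1 for p in objects]
    return ([len(objects), new_count] + stats(depths)
            + [V, Ed, Eu, V / Ed, V / Eu])
```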

The previously described window features are by no means the only ones that can be obtained from WUIL; neither should they be considered the most adequate, nor the most accurate, to describe user behavior. In fact, we kindly and eagerly invite the research community to explore other alternatives to represent user behavior and, mainly, to model masquerade detection systems based on WUIL properties. Some examples of potentially useful features, not explored in this paper, include: the ratio between the folder density and the amount of visited objects; an account of the least, respectively most, visited files or folders; the average time between accesses to each FS object; and even an account of the least, respectively most, executed processes.

6.3. Experimental Setup

Having transformed a given work log into a sequence of window feature vectors, we normalize the features to prevent one of them from having more influence than the others (e.g., the maximum value for time between accesses is 30 seconds in our experiments, while the total number of accesses might be in the order of thousands). The normalized feature vectors are suitable for a number of statistical or Artificial Intelligence learning methods, of which we have selected SVM and KNN, because they have proven to perform particularly well in several contexts.
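The text does not state which normalization was used; one common choice, sketched here as an assumption, is per-feature min-max scaling fitted on the construction windows:

```python
def min_max_normalize(vectors):
    """Rescale each feature column to [0, 1] using the minimum and
    maximum observed in `vectors`. Constant columns map to 0.0 to
    avoid division by zero."""
    cols = list(zip(*vectors))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)]
            for row in vectors]
```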

For building each KNN classifier, we first varied the number of neighbors being considered, k, going from 2 to 200 with steps of 2. Then, we varied the value within which the test vector (representing the test window) is to be considered as generated by the user, called the threshold, going from 0 to 0.05 with steps of 0.002. The associated distance, δ, is computed as follows:

δ = sqrt( (1/n) · Σ_{i=1}^{n} (ts_i − tr_i)² )

where ts is the test vector being evaluated, and tr is the training vector (generated by the valid user's interactions) with which ts is being compared. After computing the distances between the test window and all of the training vectors, for the k neighbors obtained, the weighting consists of simply averaging the δ values corresponding to the neighbors belonging to the same class. If the averaged δ value is less than the threshold, then the vector is considered as user; otherwise, it is considered as masquerader.
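The distance and the threshold decision can be sketched as follows. The function names are ours, and since there is only the single user class, the average is simply taken over the k nearest training vectors:

```python
import math

def knn_score(test, training, k):
    """Average of the delta distances from `test` to its k nearest
    training vectors; all training vectors belong to the "user" class."""
    def delta(tr):
        n = len(test)
        return math.sqrt(sum((t - r) ** 2 for t, r in zip(test, tr)) / n)
    return sum(sorted(delta(tr) for tr in training)[:k]) / k

def classify(test, training, k, threshold):
    """Label the window "user" if the averaged delta stays below the
    threshold, "masquerader" otherwise (the paper explores thresholds
    from 0 to 0.05)."""
    return "user" if knn_score(test, training, k) < threshold else "masquerader"
```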

For the case of the SVM classifier, we used a radial basis function (rbf) kernel, where the value of γ is first varied from 0 to 1000, with steps of 5, and the value of ν is randomly selected in the [0, 1] interval (we tried five ν values for each γ). In order to tune the rbf SVM, we applied simulated annealing: first fixing γ and ν to the best results found when varying γ from 0 to 1000, and then defining the function to be optimized, f(x), as the accuracy of the SVM classifier. The initial temperature t is given by:

t = δE / 3

where δE = fmax(x) − fmin(x) amounts to the difference between the best and the worst classification accuracy in the preliminary experiments with SVM. Two nested cycles are necessary for trying different values of γ and ν, with γ changing within ±1 of its previous value, while ν changes within ±0.1 of its previous value. Finally, t is decreased: t ← t × 0.85 (t decreases its value by 15% each cycle).
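A minimal sketch of this annealing loop follows. The neighbor ranges, the temperature schedule, and the initial temperature δE/3 come from the text; the Boltzmann acceptance rule and the stopping temperature are our assumptions, since the text omits them, and `accuracy` stands in for any caller-supplied scoring function f(γ, ν):

```python
import math
import random

def anneal(accuracy, gamma0, nu0, delta_e, seed=0):
    """Simulated-annealing tuning of (gamma, nu): the initial
    temperature is delta_e / 3, neighbors vary gamma by +/-1 and
    nu by +/-0.1 (nu clamped to [0, 1]), and t <- 0.85 * t each cycle."""
    rng = random.Random(seed)
    cur = best = (gamma0, nu0)
    cur_acc = best_acc = accuracy(*cur)
    t = delta_e / 3.0
    while t > 1e-4:                       # stopping criterion: assumption
        g = cur[0] + rng.uniform(-1.0, 1.0)
        v = min(max(cur[1] + rng.uniform(-0.1, 0.1), 0.0), 1.0)
        acc = accuracy(g, v)
        # accept improvements always; worse moves with Boltzmann probability
        if acc >= cur_acc or rng.random() < math.exp((acc - cur_acc) / t):
            cur, cur_acc = (g, v), acc
            if acc > best_acc:
                best, best_acc = (g, v), acc
        t *= 0.85
    return best, best_acc
```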

For reporting our experimental results, we resort to the well-known true positive (TP), true negative (TN), false positive (FP), and false negative (FN) values, to describe how well the classifier identifies masqueraders and valid users, but also how often it fails at identifying them. Additionally, we use the cumulative value of TP + TN as a simple metric for comparing KNN and SVM, to give an idea of how well they perform.
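Computing these values from window labels is straightforward; the sketch below treats "masquerader" as the positive class, which matches the use of TP for correctly detected masquerade windows (the function name is ours):

```python
def confusion_counts(predicted, actual):
    """Count TP, TN, FP, FN with "masquerader" as the positive class,
    and return them with the cumulative TP rate + TN rate score used
    to compare the two detectors."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == a == "masquerader" for p, a in pairs)
    tn = sum(p == a == "user" for p, a in pairs)
    fp = sum(p == "masquerader" and a == "user" for p, a in pairs)
    fn = sum(p == "user" and a == "masquerader" for p, a in pairs)
    pos, neg = tp + fn, tn + fp
    score = tp / pos + tn / neg   # cumulative TP + TN, as in Tables 5 and 6
    return tp, tn, fp, fn, score
```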

6.4. Experimental Results

The classification results for the KNN-based detector and the SVM-based detector are shown in Tables 5 and 6, respectively. (Notice that we have considered 18 users only, because at the time of experimenting we did not have the complete logs of the other two.) In both tables, the first column contains the user ID; columns 2 to 5 portray the relative detection performance in terms of FP, FN, TP, and TN, respectively; column 6 contains the cumulative value of TP and TN (columns 4 and 5); columns 7 to 9 describe the FN rate corresponding to the basic (AT1), intermediate (AT2), and advanced (AT3) attacks, respectively.

For the KNN case, the average TP is 88.65%, while the average TN is 83.00%. For SVM, the average TP is 87.86%, while the average TN is 79.92%. If we compare the two tables, the figures, in general, favor KNN over SVM for all measures. We can observe that, although performance is very similar, the highest FP and FN occur using SVM, and the same happens with the lowest TP and TN. By contrast, KNN behaves more uniformly.

Regarding the user log size, we can see that the best results with KNN are obtained using a relatively small number of records, and that something similar happens with SVM. However, if we observe the worst results in general


Table 5: Overall performance of the KNN-based mechanism for masquerade detection. F and T stand for false and true, respectively, while P and N stand for positive and negative. Further, we use AT1, AT2 and AT3 as shorthand for basic, intermediate, and advanced, respectively, and refer to the attacks simulated on each individual computer machine.

User % FP % FN % TP % TN % TP+TN % FN AT1 % FN AT2 % FN AT3

1        19.72   16.13   83.87   80.28   1.64150   40.00   10.00    0.00
2        21.71    3.85   96.15   78.29   1.74442    0.00    0.00   20.00
3        12.80    8.00   92.00   87.20   1.79195    0.00    0.00   50.00
4        17.05    6.67   93.33   82.95   1.76281   22.22    0.00    0.00
5        17.72   19.35   80.65   82.28   1.62924   50.00    9.09    0.00
6        17.50    3.45   96.55   82.50   1.79052   10.00    0.00    0.00
7         5.81    9.38   90.63   94.19   1.84818   20.00    0.00    9.09
8        36.47    3.23   96.77   63.53   1.60304   10.00    0.00    0.00
9        11.31    6.45   93.55   88.69   1.82240   10.00    0.00   10.00
10       33.96   37.50   62.50   66.04   1.28538   80.00   10.00    0.00
12        5.60   41.94   58.06   94.40   1.52470   70.00   50.00    9.09
14       21.71    3.13   96.88   78.29   1.75165    9.09    0.00    0.00
15       26.74   18.18   81.82   73.26   1.55078   45.45    9.09    0.00
16       16.67    6.45   93.55   83.33   1.76882    9.09   10.00    0.00
17       18.37    3.13   96.88   81.63   1.78508    9.09    0.00    0.00
18        6.94    3.23   96.77   93.06   1.89830    0.00   11.11    0.00
19        2.24    4.55   95.45   97.76   1.93211    0.00    0.00   20.00
20       13.72    9.68   90.32   86.28   1.76604   30.00    0.00    0.00

Average  17.00   11.35   88.65   83.00   1.71649   23.05    6.07    6.57

Table 6: Overall performance of the SVM-based mechanism for masquerade detection.For the meaning of each symbol, we refer the reader to Table 5.

User % FP % FN % TP % TN % TP+TN % FN AT1 % FN AT2 % FN AT3

1         7.31   12.90   87.10   92.69   1.79788   20.00   10.00    9.09
2        31.94    3.85   96.15   68.06   1.64212    0.00    0.00   20.00
3         8.54   12.00   88.00   91.46   1.79463   10.00    0.00   50.00
4        13.74   13.33   86.67   86.26   1.72925   44.45    0.00    0.00
5        27.67    9.68   90.32   72.33   1.62655   20.00    9.09    0.00
6        22.50    3.45   96.55   77.50   1.74052   10.00    0.00    0.00
7        12.08    9.38   90.63   87.92   1.78546   10.00    9.09    9.09
8        27.06    9.68   90.32   72.94   1.63264    0.00   10.00   18.18
9        13.18    6.45   93.55   86.82   1.80371    9.09    0.00   10.00
10       20.38   54.17   45.83   79.62   1.25456  100.00   30.00    0.00
12       12.41   41.94   58.06   87.59   1.45654   70.00   50.00    9.09
14       31.58    6.25   93.75   68.42   1.62171   18.18    0.00    0.00
15       59.96    0.00  100.00   40.04   1.40036    0.00    0.00    0.00
16       24.28    0.00  100.00   75.72   1.75723    0.00    0.00    0.00
17       24.66    3.13   96.88   75.34   1.72215    9.09    0.00    0.00
18        3.24   16.13   83.87   96.76   1.80630    9.09   33.33    9.09
19        9.68    0.00  100.00   90.32   1.90323    0.00    0.00    0.00
20       11.19   16.13   83.87   88.81   1.72680   40.00   10.00    0.00

Average  20.08   12.14   87.86   79.92   1.67787   20.55    8.97    7.47


Figure 2: Comparison of KNN results against SVM ones.

terms for both algorithms, they are obtained with user 10; but, at the same time, users with fewer logged days and a smaller number of records provide better detection rates. We see that, for the purpose of determining an adequate log size for conducting this sort of experiment, our results are far from conclusive. Regarding the type of attack, we see that the basic attack is the hardest to detect. This is because the masquerader browses the user FS taking a more informed decision at each step. By contrast, an advanced attack instance is more likely to trigger an alarm, since it comprises quite a number of FS operations.

Figure 2 summarizes the comparison between SVM and KNN. There, we observe that KNN often surpasses SVM, considering TP + TN. However, we also notice that for no user is the performance difference greater than 10%, which means that their global performance is mostly similar.

6.5. What did we learn from these experiments?

We confirm that masquerade detection based on user-system interactions is an interesting and complex problem which, approached as a classification one, poses a large number of interesting and challenging sub-problems.

We have seen that the detection rate for both KNN and SVM is high in general terms; however, we have detected several issues that must be addressed in order to guarantee satisfactory masquerade detection rates for the user to feel safe. One could argue that the more sophisticated the attack, the more harmful it will be; however, we must remember that the more the attacker behaves like the valid user, the harder it will be to detect the attack. This is because the features we use for masquerade detection cannot always vary enough to be considered abnormal. One way of getting around this problem might be weighting features, such as new accesses, or considering additional user features.

Further work also involves trying different window sizes (for example, 30, 60 and 90 seconds), and even the alternative approach of fixing the size of a window based on the number of records. An argument in defense of our time-windowing approach is that variations in the amount of objects present in a window are expected to be of use for identifying a masquerader when he performs an unauthorized access to the user's system. Yet, as said earlier, this is not definitive.

Another issue worth further research is to study different distributions for the three portions that make up a work log. In particular, we anticipate that extending the warming-up phase will yield better detection results.

7. Further Work

We divide our directions for further work according to two key aspects: first, how to improve WUIL into a better and more complete dataset; and, second, what kind of experiments need to be carried out to validate the working hypothesis behind the construction of WUIL, namely: user profiling considering how a user navigates the structure of her FS suffices to detect masqueraders. We discuss these issues below.

7.1. WUIL 2nd Generation

Further work involves extending WUIL, in terms of both log content and number of participant users. To this purpose, we are continuously recruiting more volunteer users, but also gathering more information from current participants. We are also working on incorporating Mac OS users, which, among other things, involves writing further scripts to leave logs in a common format. Also, among volunteers, we are especially targeting MS Windows 8 users; this way, we will be able to see whether our current scripts need to be updated to account for changes in the audit tool.

In a complementary vein, further research also involves adding new kinds of attacks. In particular, we are striving towards the simulation of capture-the-flag attacks, where the masquerader knows what to find, and a time limit is imposed to determine whether he is up to it. These kinds of attacks will help us determine how long it takes a masquerader to achieve an attack goal, and how long he can pass unnoticed.

We are working to integrate new data into the dataset, especially data coming from the different user actions used to access files; for example, using the mouse, the keyboard, or a shortcut to go to a special file. Also, we intend to register other events, such as the user working simultaneously with several open files. These new features, among others, will lead us to the release of the second generation of WUIL.

8. Conclusions

In this work, we argued about the need to widen research in the masquerade detection field to other sources of user activity and behavior. We chose the approach of analyzing the way a user navigates the structure of her file system, and introduced WUIL, a dataset that contains information regarding the use of file system objects.

WUIL is made up of data gathered from normal users, together with data from a set of different users simulating masquerade attacks. Three kinds of attacks were performed per legitimate user. By logging actual masquerader activity, we avoided falling into the OVTO approach found in other dataset building techniques. Also, although WUIL is built out of Windows logs, the general idea can be implemented in any operating system.

We designed a Navigation Structure to represent the data contained in WUIL and to extract feature vectors. We used these vectors in an experiment with two different classifiers, SVM and KNN. Results show that KNN outperforms SVM; KNN, in general, reached an average of 17% false positives and 11.35% false negatives. Moreover, the experiments showed that it is possible to detect masqueraders by analyzing the interaction of a user with her file system, but it is necessary to go deeper and look for other ways to take advantage of the dataset and the navigation structures.
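The classifier comparison described above can be sketched with scikit-learn on synthetic feature vectors. The data, features, and parameter settings below are illustrative assumptions, not the actual WUIL experiment or its feature extraction.

```python
# Hypothetical sketch: comparing KNN and SVM on feature vectors, in the
# spirit of the WUIL experiments. The feature values here are synthetic;
# the real vectors come from the Navigation Structure described in the paper.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic data: class 0 = legitimate user, class 1 = masquerader.
X_user = rng.normal(loc=0.0, scale=1.0, size=(200, 5))
X_masq = rng.normal(loc=1.5, scale=1.0, size=(200, 5))
X = np.vstack([X_user, X_masq])
y = np.array([0] * 200 + [1] * 200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          random_state=0, stratify=y)

for name, clf in [("KNN", KNeighborsClassifier(n_neighbors=5)),
                  ("SVM", SVC(kernel="rbf"))]:
    clf.fit(X_tr, y_tr)
    y_pred = clf.predict(X_te)
    # False positives: legitimate sessions flagged as attacks;
    # false negatives: attacks that pass unnoticed.
    fp = np.mean(y_pred[y_te == 0] == 1)
    fn = np.mean(y_pred[y_te == 1] == 0)
    print(f"{name}: FP rate = {fp:.2%}, FN rate = {fn:.2%}")
```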

As we mentioned previously, the central part of this work is the WUIL dataset and its structure; we hope to motivate its usage among the research community, towards the improvement of masquerade detection. WUIL is expected to become the dataset of choice for fair comparisons of different detection mechanisms.

Although the current version of WUIL already supports masquerade detection experiments, it is necessary to improve WUIL in two ways. First, by constantly adding users with different profiles and operating systems, so as to have a mixed and more complete dataset; and, second, by gathering other kinds of data, so we can study not only which objects are used, but also the way they are being accessed (e.g., mouse, keyboard, shortcut).

9. Acknowledgements

We are grateful to the anonymous referees, and to the members of the NetSec group at Tecnológico de Monterrey, Campus Estado de México, for their comments on an earlier draft of this paper. The research reported here was supported by CONACYT grant 105698.
