International Journal of Advances in Engineering & Technology, Nov. 2012.
©IJAET ISSN: 2231-1963
387 Vol. 5, Issue 1, pp. 387-398
STATISTICAL TECHNIQUES IN ANOMALY INTRUSION
DETECTION SYSTEM
Hari Om & Tanmoy Hazra Department of Computer Science & Engineering, Indian School of Mines, Dhanbad, India
ABSTRACT
In this paper, we analyze an anomaly-based intrusion detection system (IDS) for outlier detection in hardware profiles using statistical techniques: the Chi-square distribution, the Gaussian mixture distribution, and principal component analysis. Anomaly detection methods can detect new intrusions, but they suffer from false alarms. Host-based intrusion detection systems (HIDSs) use anomaly detection to identify malicious attacks, i.e. intrusions. The features span a large number of dimensions, and the system becomes extremely slow when processing this huge amount of data (especially in the host-based case). We show comparative results using three different approaches: principal component analysis (PCA), the Chi-square distribution, and clustering with a Gaussian mixture distribution. We obtain good results using these techniques.
KEYWORDS: Principal Component Analysis, Outlier, Mahalanobis Distance, Confusion Matrix, Anomaly
Detection, Gaussian mixture distribution, Chi-square distribution, Expectation maximization algorithm.
I. INTRODUCTION
Intrusion detection is the process of monitoring the events occurring in a computer system and analyzing them for signs of intrusion. An intrusion is an attack on a network or system by an intruder that compromises security goals such as integrity, confidentiality, and authentication, i.e. a violation of the security policy of the system. There are various types of attacks: external attacks, internal penetrations, and misfeasance. An intruder tries to gain unauthorized access to a valid user's system. An intrusion detection system (IDS) is a program that analyzes what happens, or has happened, during an execution and tries to find indications that the computer has been misused. With the rapid expansion of computers in the past few years, their security has become an important issue. Host-based intrusion detection systems (HIDSs) are used to monitor suspicious activity on a system. HIDSs can be classified into two types: anomaly detection, which is based on statistical measures, and misuse detection, which is based on signatures. Anomaly detection captures changes in behavior that deviate from the normal behavior. These methods take training data as input to build models of normal system behavior. Alarms are raised when any activity deviates from the normal model. These models may be generated using statistical analysis, data mining algorithms, genetic algorithms, artificial neural networks, fuzzy logic, rough sets, etc.
Anomaly detection methods may raise alarms for normal activity (false positives) or fail to sound alarms during attacks (false negatives). Nowadays, the number of new attacks is increasing, and variations of known attacks cannot be recognized by misuse detection. Here, we develop an intrusion detection system using the Chi-square distribution and the Gaussian mixture distribution to detect outlier data, and we present results using three different approaches.
Outlier detection is one of the most important tasks in data analysis. Outliers describe abnormal data behavior, i.e. data that deviate from the natural data variability. The cut-off value, or threshold, that numerically separates anomalous from non-anomalous data is often the basis for important decisions. Many methods have been proposed for univariate outlier detection. They are based on (robust) estimation of location and scatter, or on data quantiles. Their major disadvantage is that these rules are independent of the sample size. Moreover, by the definition of most rules, outliers are identified even for "clean" data, or at least no distinction is made between outliers and the extremes of a distribution. The basis for multivariate outlier detection is the Mahalanobis distance. The standard method for multivariate outlier detection is robust estimation of the parameters in the Mahalanobis distance and comparison with a critical value of the Chi-square distribution. However, values larger than this critical value are not necessarily outliers; they could still belong to the data distribution. To distinguish between the extremes of a distribution and outliers, Garrett introduced the Chi-square plot, which draws the empirical distribution of the robust Mahalanobis distances against the Chi-square distribution. The rest of the paper is organized as follows. Section 2 discusses the related work and Section 3 discusses the proposed work.
II. RELATED WORK
Many designs have been developed for intrusion detection [1-6]. Shyu has developed a network-based intrusion predictive model using principal component analysis (PCA) and the Chi-square distribution for the KDD 1999 dataset [1]. Denning describes an intrusion detection model that is capable of detecting break-ins, penetrations, and other types of computer attacks. This model is based on the hypothesis that security violations can be detected by monitoring the audit records of the system for abnormal patterns of
system usage [2]. Errors in multivariate data have been detected using PCA [3]. Ye discusses an anomaly detection technique based on a Chi-square statistic for information systems that achieves a 100% detection rate [4]. Puketza discusses methodologies for testing intrusion detection systems and obtains satisfactory results in the course of testing IDSs [6]. Ye further finds experimentally that the performance of the Chi-square distribution is better than that of Hotelling's T2 [14]. In [34], the PCA methodology is discussed to
detect intrusion. Filzmoser applies the Chi-square method to detect multivariate outlier in exploration
geochemistry [17]. Garrett has made a tool for multivariate outlier detection [18]. Gascon et al. have
shown a comprehensive statistical analysis in relation to the vulnerability disclosure time, updates of
vulnerability detection systems (VDS), software patching releases and publication of exploits [22]. Davis
and Clark have shown a trend toward deeper packet inspection to construct more relevant features
through targeted content parsing [23]. Jin et al. discuss direct utilization of the covariance matrices of
sequential samples to detect multiple network attacks [24]. Yeung and Ding show in their experimental
results that the dynamic modeling approach is better than the static modeling approach for the system call
datasets, while the dynamic modeling approach is worse for the shell command datasets [25]. Hussein and
Zulkernine present a framework to develop components with intrusion detection capabilities [26]. Wang
et al. present several cross frequency attribute weights to model user and program behaviors for anomaly
intrusion detection [27]. Chen et al. discuss an efficient filtering scheme that can reduce system workload
and only 0.3% of the original traffic volume is required to examine for anomaly [28]. Casas et al. present
an unsupervised network intrusion detection system that is capable of detecting unknown network attacks
without using any kind of signatures, labeled traffic, or training [29]. Trailović discusses variance
estimation and ranking methods for stochastic processes modeled by Gaussian mixture distributions. It is
shown that the variance estimate from a Gaussian mixture distribution has the same properties as a
variance estimate from a single Gaussian distribution based on a reduced number of samples [31].
Dasgupta presents provably the first correct algorithm for learning a mixture of Gaussians, which is very
simple and returns the true centers of the Gaussians to within the precision specified by the user, with
high probability and has linear complexity in the dimension of data and polynomial in the number of
Gaussians [32]. Dempster et al. present a general approach to the iterative computation of maximum-likelihood estimates for incomplete data, in which each iteration of the algorithm consists of an expectation step followed by a maximization step [33]. Chandola et al. have presented a brief survey of anomaly detection [35]. Teodoro et al. have discussed anomaly detection methodologies and different types of problems [36]. In the next section, we discuss statistical techniques for intrusion detection.
III. STATISTICAL TECHNIQUES IN INTRUSION DETECTION
In this work, we discuss three approaches, based on PCA, the Chi-square distribution, and the Gaussian mixture distribution, and then compare their performance for a host-based intrusion detection system. We may mention that these techniques have been discussed in the literature for network-based intrusion detection systems, not for host-based systems. In [34], PCA has been used for outlier detection in hardware profiles. We now discuss PCA, the Chi-square distribution, and the Gaussian mixture distribution, each for a host-based intrusion detection system.
3.1 Principal Component Analysis
Principal component analysis (PCA) is a common technique for finding patterns in high-dimensional data by reducing its dimensionality without losing the information contained in it. It produces a set of principal components, which are orthonormal eigenvalue/eigenvector pairs, i.e., it projects the data onto a new set of axes that best fit the data. In our proposed scheme, this set of axes represents the normal feature data. Outlier detection proceeds by mapping the data onto these 'normal' axes and calculating the distances from them. If the distance is greater than a certain threshold, we may conclude there is an attack, i.e. an outlier is detected. Principal components are particular linear combinations of m random variables X1, X2, ..., Xm that have two important properties:
a. The principal components are uncorrelated and sorted in descending order of variance.
b. The total variance X of X1, X2, ..., Xm is preserved, where

X = Var(X1) + Var(X2) + ... + Var(Xm)   (1)

These components are found from an eigenanalysis of the correlation matrix of the original variables X1, X2, ..., Xm. The values of the correlation matrix and the covariance matrix are not the same [9-10], [15].
Let the original data X be an n × m data matrix of n observations, each observation consisting of m fields (dimensions) X1, X2, ..., Xm, and let R be the m × m correlation matrix of X1, X2, ..., Xm. The eigenvalues of R are the roots of the polynomial equation

| R – λI | = 0

Each eigenvalue λ of R has a corresponding non-zero vector e, called an eigenvector, satisfying

Re = λe
If (λ1, e1), (λ2, e2), ..., (λm, em) are the m eigenvalue-eigenvector pairs of the correlation matrix R, the ith principal component is given by

yi = eiT (x – x̄) = ei1 (x1 – x̄1) + ei2 (x2 – x̄2) + ... + eim (xm – x̄m),  i = 1, 2, ..., m   (2)

where
the eigenvalues are ordered as λ1 ≥ λ2 ≥ ... ≥ λm ≥ 0,
eiT = (ei1, ei2, ..., eim) is the ith eigenvector,
x = (x1, x2, ..., xm)T is the transpose of the observation data,
x̄ = (x̄1, x̄2, ..., x̄m)T is the transpose of the sample mean vector of the observation data.
For any attribute xi, if the observed data are a1, a2, ..., an, then

x̄i = (a1 + a2 + ... + an) / n
The principal components derived from the covariance matrix are usually different from the principal components generated from the correlation matrix. When some values are much larger than others, their corresponding eigenvalues have larger weights. One useful metric is the Mahalanobis distance

d2(x, y) = (x − y)T R-1 (x − y)   (3)

where R-1 is the inverse of the sample correlation matrix, x and y are vectors, and T denotes the transpose. Using the correlation matrix, the relationships between the data fields are represented more effectively. There are two main issues in applying PCA: interpretation of the set of principal components, and calculation of distance.
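As an illustration, the Mahalanobis distance of Eq. (3) can be computed as follows. This is a minimal sketch with made-up data, not the authors' implementation; the variable names and the use of NumPy are our own assumptions.

```python
# Sketch: squared Mahalanobis distance d^2(x, y) = (x - y)^T R^-1 (x - y)
# from Eq. (3), with R the sample correlation matrix. Data are made up.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))            # 100 observations, 3 features
R = np.corrcoef(X, rowvar=False)         # sample correlation matrix
R_inv = np.linalg.inv(R)

def mahalanobis_sq(x, y, R_inv):
    """Squared Mahalanobis distance between vectors x and y."""
    d = np.asarray(x) - np.asarray(y)
    return float(d @ R_inv @ d)

d2 = mahalanobis_sq(X[0], X.mean(axis=0), R_inv)
```

In practice, the data would be standardized first so that the correlation matrix is the appropriate scale matrix.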
Each eigenvalue of a principal component corresponds to the amount of variation it encompasses; larger eigenvalues are more significant. The principal components are sorted in descending order of eigenvalue. The eigenvectors of the principal components represent axes that best fit a data sample. Points that lie far from these axes are assumed to exhibit abnormal behavior and can be easily identified. Using a threshold value, data generated by the normal system with a Mahalanobis distance greater than the threshold are considered outliers and, here, intrusions. However, the system may sometimes alert the user to an intrusion when the data lie on the threshold boundary.
Consider the sample principal components y1, y2, ..., ym of an observation X, where

yi = eiT (X – x̄),  i = 1, 2, ..., m

The sum of the squared partial principal component scores,

Σ_{i=1}^{m} yi2/λi = y12/λ1 + y22/λ2 + ... + ym2/λm   (4)

equals the Mahalanobis distance of the observation X from the mean of the normal sample dataset [15]. The major principal component score is used to detect extreme deviations with large values on the original features. The minor principal component score is used to detect attacks that may not follow the same correlation model. As a result, two thresholds are needed to detect attacks. Let q be the number of most significant (major) principal components and r the number of least significant (minor) principal components; the major principal component score threshold is denoted Tq and the minor principal component score threshold is denoted Tr. An attack is flagged for an observation X if

Σ_{i=1}^{q} yi2/λi > Tq   or   Σ_{i=m−r+1}^{m} yi2/λi > Tr   (5)
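The two-threshold test of Eqs. (4)-(5) can be sketched as below. This is an illustrative implementation under our own assumptions (the choice of q and r, the standardization step, and all names are ours), not the authors' code.

```python
# Sketch of the PCA outlier test in Eq. (5): flag an observation when its
# major or minor principal-component score exceeds a threshold.
import numpy as np

def pc_scores(X_train, x, q=2, r=2):
    """Return (major, minor) scores of observation x against X_train."""
    mean, std = X_train.mean(axis=0), X_train.std(axis=0)
    Z = (X_train - mean) / std                  # standardize: cov(Z) ~ corr(X)
    lam, E = np.linalg.eigh(np.cov(Z, rowvar=False))
    lam, E = lam[::-1], E[:, ::-1]              # sort eigenvalues descending
    y = E.T @ ((x - mean) / std)                # principal component scores
    major = float(np.sum(y[:q] ** 2 / lam[:q]))    # q most significant PCs
    minor = float(np.sum(y[-r:] ** 2 / lam[-r:]))  # r least significant PCs
    return major, minor

def is_attack(X_train, x, Tq, Tr, q=2, r=2):
    major, minor = pc_scores(X_train, x, q, r)
    return major > Tq or minor > Tr
```

In practice the thresholds Tq and Tr would be set empirically, e.g. from high percentiles of the scores of the normal training data.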
3.2 Chi-square Distribution
The Chi-square distribution is an asymmetric distribution that has a minimum value of 0 but no maximum value. The curve reaches a peak to the right of 0 and then gradually declines. The mean of the Chi-square distribution equals its degrees of freedom, and the variance is twice the degrees of freedom. For each number of degrees of freedom there is a different X2 distribution; the distribution is more spread out, with a peak farther to the right, for larger degrees of freedom. As a result, for any given level of significance, the critical region begins at a larger Chi-square value the larger the degrees of freedom. The spread of multivariate data is quantified by the covariance matrix. A well-known distance measure that takes the covariance matrix into account is the Mahalanobis distance. For a p-dimensional multivariate sample xi (i = 1, ..., n), the Mahalanobis distance is defined as

MDi = ((xi − t)T C-1 (xi − t))^(1/2),  i = 1, ..., n

where t is the estimated multivariate location and C the estimated covariance matrix. Usually, t is the multivariate arithmetic mean and C the sample covariance matrix. For multivariate normally distributed data, the squared distances are approximately Chi-square distributed with p degrees of freedom (Xp2). Multivariate outliers can now simply be defined as observations having a large (squared) Mahalanobis distance. For this purpose, a quantile of the Chi-square distribution (e.g., the 99.5% quantile) could be
considered. However, this approach has several limitations. The Mahalanobis distances need to be estimated by a robust procedure in order to provide reliable measures for recognizing outliers. Single extreme observations, or groups of observations, departing from the main data structure can severely influence this distance measure, because both location and covariance are usually estimated in a non-robust manner. The minimum covariance determinant (MCD) estimator is probably the most commonly used in practice. Using robust estimators of location and scatter in the above equation leads to robust distances (RD). If the squared RD for an observation is larger than, say, X2p;0.995, it can be declared an outlier. The Chi-square plot is obtained by plotting the squared Mahalanobis distances (computed on the basis of robust estimates of location and scatter) against the quantiles of Xp2; the most extreme points are deleted until the remaining points follow a straight line, and the deleted points are identified as outliers. This method needs user interaction and experience on the part of the analyst. Moreover, especially for large datasets, it can be time consuming and, to some extent, subjective. In the next section, a procedure is introduced that does not require analyst intervention and is therefore reproducible and objective.
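The quantile-based rule described above can be sketched as follows. For brevity this sketch uses the non-robust sample mean and covariance, whereas the text recommends robust estimates such as MCD; the function name and the 97.5% quantile are our own illustrative choices.

```python
# Sketch: declare multivariate outliers whose squared Mahalanobis distance
# exceeds a chi-square quantile with p degrees of freedom.
import numpy as np
from scipy.stats import chi2

def chi_square_outliers(X, quantile=0.975):
    n, p = X.shape
    diff = X - X.mean(axis=0)                          # center at the mean
    C_inv = np.linalg.inv(np.cov(X, rowvar=False))     # sample covariance
    md2 = np.einsum("ij,jk,ik->i", diff, C_inv, diff)  # squared MD_i
    return md2 > chi2.ppf(quantile, df=p)              # boolean outlier mask
```

Swapping the mean/covariance estimates for robust ones (location and scatter from MCD) yields the robust distances (RD) discussed above.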
3.3 Gaussian Mixture Distribution
Assuming the covariance matrix Σ is non-singular, the probability density function (pdf) of X can be written as

Nx(μ, Σ) = |2πΣ|^(-1/2) exp(-(1/2)(x − μ)T Σ-1 (x − μ))   (6)

Here μ denotes the mean vector, Σ the covariance matrix, and | . | the determinant. It is possible to have multivariate Gaussian distributions with a singular covariance matrix; in that case, the above expression cannot be used for the probability density function, so we assume non-singular covariance matrices. For a mixture model with K components, each zi lies between 1 and K. The sum to be maximized is

Σi Σk p(k|xi; θt) [log p(k; θ) + log p(xi|k; θ)]   (7)

Using the mixture model notation from above, we have p(k; θ) = αk and p(xi|k; θ) = f(xi; λk). Then the sum to be maximized can be written as

E = Σi Σk p(k|xi; θt) [log αk + log f(xi; λk)]   (8)
For the E-step, we use Bayes' rule:

wik = p(k|xi; θt) = αk f(xi; λk) / Σj αj f(xi; λj)   (9)

For the M-step, the two terms inside the square brackets in E involve disjoint sets of parameters, so we can perform two separate maximizations. The first quantity to be maximized is

Σi Σk wik log αk = Σk ck log αk   (10)

where ck = Σi wik, subject to the constraint Σk αk = 1. Using a Lagrange multiplier, its solution is given by

αk = ck / Σj cj   (11)

The second quantity to be maximized is

Σi Σk wik log f(xi; λk)   (12)

This can be divided into K separate maximizations, each of the following form:

λk = argmaxλ Σi wik log f(xi; λ)   (13)
Clustering is used to automatically estimate the parameters of a Gaussian mixture model from sample data. This process is essentially similar to conventional clustering, except that it allows the cluster parameters to be accurately estimated even when the clusters overlap substantially.
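The EM iteration of Eqs. (9)-(13) can be sketched for a one-dimensional Gaussian mixture as below. This is a minimal illustration, not the authors' implementation; the quantile-based initialization and the fixed iteration count are our own assumptions.

```python
# Minimal EM sketch for a K-component 1-D Gaussian mixture, following
# Eqs. (9)-(13): the E-step computes responsibilities w_ik via Bayes' rule,
# the M-step updates the weights alpha_k and the component parameters.
import numpy as np

def em_gmm_1d(x, K=2, iters=100):
    mu = np.quantile(x, np.linspace(0.1, 0.9, K))   # deterministic init
    sigma = np.full(K, x.std())
    alpha = np.full(K, 1.0 / K)
    for _ in range(iters):
        # E-step (Eq. 9): w[i, k] = alpha_k f(x_i; mu_k, sigma_k) / sum_j ...
        dens = (np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2)
                / (sigma * np.sqrt(2 * np.pi)))
        w = alpha * dens
        w /= w.sum(axis=1, keepdims=True)
        # M-step (Eq. 11): alpha_k = c_k / sum_j c_j, then means and variances
        c = w.sum(axis=0)
        alpha = c / c.sum()
        mu = (w * x[:, None]).sum(axis=0) / c
        sigma = np.sqrt((w * (x[:, None] - mu) ** 2).sum(axis=0) / c)
    return alpha, mu, sigma
```

On data drawn from two well-separated Gaussians, the recovered means converge to the true component centers.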
IV. EXPERIMENT
To carry out the experiments, performance logs need to be generated. The steps for generating the performance logs are as follows.
1. On the Start menu, go to Settings, then Control Panel. Double-click Administrative Tools and then double-click Computer Management.
2. Expand Performance Logs and Alerts, right-click Counter Logs, and then click New Log Settings.
3. Type a name for the counter log and then click OK.
4. Click Add Counters.
5. In the Performance object box, select a performance object to be monitored, and add the counters required for the experiment.
6. On the General tab, under Sample data every, configure a sampling interval of 15 seconds.
7. On the Log Files tab, configure the log file properties as comma-delimited files that can be viewed later in reporting tools such as Microsoft Excel.
After the performance log has been generated for each day, the log is divided into 4 groups, and the average values for each column of the table are calculated. The averages are stored in another table and used as our normal dataset.
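The group-averaging step above can be sketched as follows (an illustrative helper under our own assumptions; the log is taken as a numeric array of samples × counters):

```python
# Sketch: split one day's performance log into 4 groups and average each
# counter column, producing one row of the normal dataset per group.
import numpy as np

def group_averages(log, groups=4):
    """log: (samples, counters) array -> (groups, counters) column means."""
    parts = np.array_split(log, groups)          # split rows into groups
    return np.vstack([p.mean(axis=0) for p in parts])
```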
Meanwhile, for one day the system was left to run with the graphics driver, audio driver, and USB driver disabled. This generated system performance logs that are treated as intruded data. We took the same number and the same types of attributes in our experiments. The normal dataset and the testing dataset, i.e. the mixture dataset (normal and intrusion), used in our experiment are shown in Tables 1 and 2, respectively.
Table 1: Normal dataset with some selective attributes. Columns: Committed bytes in use, Available MBytes, Cache faults/sec, Page faults/sec, Page writes/sec, Page op/sec, Pool Nonpaged Allocs, Pool Paged Allocs, System driver total bytes, Write copies/sec.
[Numeric table entries not reproduced.]
Table 2: Testing dataset with some selective attributes. Columns: Committed bytes in use, Available MBytes, Cache faults/sec, Page faults/sec, Page writes/sec, Page op/sec, Pool Nonpaged Allocs, Pool Paged Allocs, System driver total bytes, Write copies/sec.
[Numeric table entries not reproduced.]
4.1. False alarm rate
The false alarm rate and the detection rate can be calculated from a confusion matrix, shown below:

                     Predicted Class
                       C        NC
  Actual Class   C     TN       FP
                 NC    FN       TP

Fig. 1 Confusion matrix

where
C – Anomaly class, NC – Normal class
TN – True Negative, FN – False Negative
TP – True Positive, FP – False Positive

Recall (R) = TP / (TP + FN), Precision (P) = TP / (TP + FP)
F-measure = ((1 + β²)RP) / (β²R + P), where R and P denote Recall and Precision, respectively, and β is the relative importance of precision vs. recall, usually set to 1. Therefore, the F-measure reduces to 2RP / (R + P).
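The metric definitions above can be sketched directly in code; the confusion-matrix counts used here are illustrative placeholders, not values from our experiments.

```python
# Detection metrics from the 2x2 confusion matrix of Section 4.1.
def detection_metrics(tp, fp, fn, tn, beta=1.0):
    """Return (recall, precision, F-measure, false alarm rate)."""
    recall = tp / (tp + fn)                 # R = TP / (TP + FN)
    precision = tp / (tp + fp)              # P = TP / (TP + FP)
    b2 = beta ** 2
    f_measure = (1 + b2) * recall * precision / (b2 * recall + precision)
    false_alarm = fp / (fp + tn)            # fraction of normal data flagged
    return recall, precision, f_measure, false_alarm

# Illustrative counts only: 40 anomalous and 40 normal sessions.
r, p, f, fa = detection_metrics(tp=39, fp=1, fn=1, tn=39)
print(r, p, f, fa)   # recall, precision, F ≈ 0.975; false alarm ≈ 0.025
```

With β = 1 the F-measure above is exactly 2RP / (R + P), consistent with the text.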
4.2. Comparative results
Table 3: Comparative results of PCA, Chi-square distribution and cluster with Gaussian mixture distribution
Name of Used Techniques Detection rate False alarm rate
Principal Component Analysis 97.5% 2.5%
Chi-square distribution 90% 10%
Cluster with Gaussian Mixture Distribution 97.5% 2.5%
Here, we have used three different techniques in our experiments. PCA and the cluster with Gaussian mixture distribution each achieve a 97.5% detection rate, and the Chi-square distribution also gives a good result. Using these techniques, we can easily detect hardware-based intrusion. All three techniques have been implemented for a HIDS on the basis of the performance log, and two of them (Principal Component Analysis and cluster with Gaussian mixture distribution) give the best results: PCA and the Gaussian mixture distribution reach a maximum detection rate of 97.5% each, while the Chi-square distribution gives 90%. Generally, anomaly detection systems suffer from false alarms. Another problem is that they cannot reliably classify boundary data, i.e., a detection system sometimes reports normal data as intrusion data and vice versa. That is why we have used the confusion matrix, which gives accurate detection and false alarm rates. These techniques can also be applied to any other type of outlier detection and to huge datasets.
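As an illustration of the Chi-square approach, the sketch below flags multivariate outliers whose squared Mahalanobis distance from the sample mean exceeds a chi-square quantile. The synthetic data and the 97.5% cutoff are illustrative assumptions, not our experimental profile.

```python
import numpy as np
from scipy.stats import chi2

def chi_square_outliers(X, quantile=0.975):
    """Flag rows of X whose squared Mahalanobis distance exceeds the
    chi-square quantile with p degrees of freedom (p = no. of features)."""
    diff = X - X.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(X, rowvar=False))
    d2 = np.einsum('ij,jk,ik->i', diff, inv_cov, diff)  # squared distances
    return d2 > chi2.ppf(quantile, df=X.shape[1])

# Synthetic profile: 200 normal observations plus one injected anomaly.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(200, 3)), [[8.0, 8.0, 8.0]]])
flags = chi_square_outliers(X)
print(flags[-1])   # the injected point is flagged: True
```

By construction, about 2.5% of genuinely normal observations also exceed the 97.5% quantile, which matches the usual trade-off between detection rate and false alarms discussed above.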
V. FUTURE WORK
In future work, we will try to achieve a higher detection rate and a lower false alarm rate. We will also explore other statistical techniques that can further improve these results.
VI. CONCLUSIONS
In this paper, we have analyzed an intrusion detection system using the Chi-square distribution and the Gaussian mixture distribution, and have compared the results of Principal Component Analysis, the Chi-square distribution, and the Gaussian mixture distribution. Our experiments show that PCA and the Gaussian mixture distribution each give a maximum detection rate of 97.5%, while the Chi-square distribution gives 90%. Generally, anomaly detection systems suffer from false alarms. These methods can be applied to any type of outlier detection and also to huge datasets.
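For completeness, a minimal sketch of Gaussian-mixture-based anomaly scoring, using scikit-learn's GaussianMixture as an assumed stand-in for our implementation, on synthetic two-cluster data:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Fit a two-component mixture on "normal" training data, then flag test
# points whose log-likelihood falls below a low percentile of the
# training scores (2.5% here, an illustrative threshold).
rng = np.random.default_rng(1)
train = np.vstack([rng.normal(0.0, 1.0, size=(150, 2)),
                   rng.normal(5.0, 1.0, size=(150, 2))])

gmm = GaussianMixture(n_components=2, random_state=0).fit(train)
threshold = np.percentile(gmm.score_samples(train), 2.5)

test = np.array([[0.1, -0.2], [5.2, 4.8], [12.0, -9.0]])
is_anomaly = gmm.score_samples(test) < threshold
print(is_anomaly)   # only the far-away point should be flagged
```

The threshold controls the same detection-rate vs. false-alarm trade-off measured by the confusion matrix in Section 4.1.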
ACKNOWLEDGEMENT
The authors express their sincere thanks to Prof. S. Chand for his invaluable comments and suggestions.
AUTHORS
Hari Om is presently working as an Assistant Professor in the Department of Computer Science & Engineering at the Indian School of Mines, Dhanbad, India. He earned his Ph.D. in Computer Science from Jawaharlal Nehru University, New Delhi, India. He has published more than 50 research papers in international and national journals, including various IEEE Transactions and Springer and Elsevier journals, and in international and national conferences of high repute. His research areas are Video-on-Demand, Cryptography, Network Security, Data Mining, and Image Processing.

Tanmoy Hazra completed his M.Tech in Computer Applications in the Department of Computer Science and Engineering at the Indian School of Mines, Dhanbad, in 2012. He is presently working as an Assistant Professor in the Department of Computer Engineering at ISB & M School of Technology, Pune. His research interest is mainly in Intrusion Detection Systems.