ENHANCING PERFORMANCE OF INTRUSION ......degree, diploma, associateship, fellowship, titles in this or any other University or other institution of higher learning. I further declare

ENHANCING PERFORMANCE OF INTRUSION

DETECTION SYSTEMS USING

DATA FUSION TECHNIQUES

A Thesis submitted to Gujarat Technological University

for the Award of

Doctor of Philosophy

in

Electronics and Communication

by

Shah Vrushank Manharlal

[119997111013]

under supervision of

Dr. A. K. Aggarwal

GUJARAT TECHNOLOGICAL UNIVERSITY

AHMEDABAD

September-2018

ENHANCING PERFORMANCE OF INTRUSION

DETECTION SYSTEMS USING

DATA FUSION TECHNIQUES

A Thesis submitted to Gujarat Technological University

for the Award of

Doctor of Philosophy

in

Electronics and Communication

by

Shah Vrushank Manharlal

[119997111013]

under supervision of

Dr. A. K. Aggarwal


AHMEDABAD

September-2018

© Shah Vrushank Manharlal

DECLARATION

I declare that the thesis entitled “Enhancing Performance of Intrusion

Detection Systems Using Data Fusion Techniques” submitted by me for the

degree of Doctor of Philosophy is the record of research work carried out by me

during the period from June 2011 to September 2018 under the supervision of

Dr. A. K. Aggarwal and this has not formed the basis for the award of any

degree, diploma, associateship, fellowship, titles in this or any other University

or other institution of higher learning.

I further declare that the material obtained from other sources has been duly

acknowledged in the thesis. I shall be solely responsible for any plagiarism or

other irregularities, if noticed in the thesis.

Signature of the Research Scholar : ……………..…… Date:….………………

Name of Research Scholar: Shah Vrushank Manharlal

Place : Ahmedabad

CERTIFICATE

I certify that the work incorporated in the thesis Enhancing Performance of

Intrusion Detection Systems using Data Fusion Techniques submitted by

Shri Shah Vrushank Manharlal was carried out by the candidate under my

supervision/guidance. To the best of my knowledge: (i) the candidate has not

submitted the same research work to any other institution for any

degree/diploma, Associateship, Fellowship or other similar titles (ii) the thesis

submitted is a record of original research work done by the Research Scholar

during the period of study under my supervision, and (iii) the thesis represents

independent research work on the part of the Research Scholar.

Signature of Supervisor: ……………………… Date: ………………

Name of Supervisor: Dr. A. K. Aggarwal

Place: Ahmedabad

COURSE-WORK COMPLETION CERTIFICATE

This is to certify that Mr. Vrushank Manharlal Shah enrolment no.

119997111013 is a PhD scholar enrolled for PhD program in the branch

Electronics and Communication of Gujarat Technological University,

Ahmedabad.

(Please tick the relevant option(s))

He/She has been exempted from the course-work (successfully completed

during M.Phil Course)

He/She has been exempted from Research Methodology Course only

(successfully completed during M.Phil Course)

He/She has successfully completed the PhD course work for the partial

requirement for the award of PhD Degree. His/ Her performance in the

course work is as follows-

Grade Obtained in Research Methodology

(PH001)

Grade Obtained in Self Study Course

(Core Subject)

(PH002)

AB BB

Supervisor’s Sign

Dr. A. K. Aggarwal

Originality Report Certificate

It is certified that PhD Thesis titled Enhancing Performance of Intrusion Detection

Systems using Data Fusion Techniques by Shah Vrushank Manharlal has been

examined by us. We undertake the following:

a. Thesis has significant new work / knowledge as compared already published or are

under consideration to be published elsewhere. No sentence, equation, diagram, table,

paragraph or section has been copied verbatim from previous work unless it is placed

under quotation marks and duly referenced.

b. The work presented is original and own work of the author (i.e. there is no

plagiarism). No ideas, processes, results or words of others have been presented as

Author own work.

c. There is no fabrication of data or results which have been compiled / analysed.

d. There is no falsification by manipulating research materials, equipment or

processes, or changing or omitting data or results such that the research is not

accurately represented in the research record.

e. The thesis has been checked using (copy of originality report attached) and found

within limits as per GTU Plagiarism Policy and instructions issued from time to time

(i.e. permitted similarity index <=25%).

Signature of the Research Scholar : …………………………… Date: ……..….………


Place : Ahmedabad

Signature of Supervisor: ……………………………… Date:... ………………


Place: Ahmedabad

PhD THESIS Non-Exclusive License to


In consideration of being a PhD Research Scholar at GTU and in the interests of the

facilitation of research at GTU and elsewhere, I, Shah Vrushank Manharlal having

Enrollment number: 119997111013 hereby grant a non-exclusive, royalty free and perpetual

license to GTU on the following terms:

a) GTU is permitted to archive, reproduce and distribute my thesis, in whole or in part,

and/or my abstract, in whole or in part ( referred to collectively as the “Work”) anywhere

in the world, for non-commercial purposes, in all forms of media;

b) GTU is permitted to authorize, sub-lease, sub-contract or procure any of the acts

mentioned in paragraph (a);

c) GTU is authorized to submit the Work at any National / International Library, under the

authority of their “Thesis Non-Exclusive License”;

d) The Universal Copyright Notice (©) shall appear on all copies made under the authority

of this license;

e) I undertake to submit my thesis, through my University, to any Library and Archives.

Any abstract submitted with the thesis will be considered to form part of the thesis.

f) I represent that my thesis is my original work, does not infringe any rights of others,

including privacy rights, and that I have the right to make the grant conferred by this non-

exclusive license.

g) If third party copyrighted material was included in my thesis for which, under the terms

of the Copyright Act, written permission from the copyright owners is required, I have

h) I retain copyright ownership and moral rights in my thesis, and may deal with the

copyright in my thesis, in any way consistent with rights granted by me to my University

in this non-exclusive license.

i) I further promise to inform any person to whom I may hereafter assign or license my

copyright in my thesis of the rights granted by me to my University in this non exclusive

license.

j) I am aware of and agree to accept the conditions and regulations of PhD including all

policy matters related to authorship and plagiarism

Signature of the Research Scholar: ----------------


Date: -------------------- Place: Ahmedabad

Signature of Supervisor: -------------------------------------------------


Date: -------------------------- Place:Ahmedabad

Seal:

Thesis Approval Form

The viva-voce of the PhD Thesis submitted by Shri Shah Vrushank Manharlal Enrollment

No. entitled 119997111013 was conducted on …………………….………… (day and date)

at Gujarat Technological University.

(Please tick any one of the following option)

The performance of the candidate was satisfactory. We recommend that he/she be

awarded the PhD degree.

Any further modifications in research work recommended by the panel after 3 months

from the date of first viva-voce upon request of the Supervisor or request of

Independent Research Scholar after which viva-voce can be re-conducted by the same

panel again.

The performance of the candidate was unsatisfactory. We recommend that he/she

should not be awarded the PhD degree.

---------------------------------------------------- ----------------------------------------------------

Name and Signature of Supervisor with Seal 1) (External Examiner 1) Name and Signature

------------------------------------------------------- -------------------------------------------------------

2) (External Examiner 2) Name and Signature 3) (External Examiner 3) Name and Signature

ABSTRACT

Intrusion Detection Systems are used widely for security of computer networks. An Intrusion

Detection System (IDS) aims to detect network intrusions by unauthorized entities and send

alerts to the administrator about the security breach. An IDS monitors the network traffic and

analyses the packets flowing in-out of network and is able to discriminate between normal

packets and abnormal packets. IDS is a defense-in depth mechanism, which is used along

with firewalls and virus protection systems. IDSs are catching the attention of network

security industry due to low cost, ease of deployment, real time detection and quick response.

An IDS uses sensors to monitor packets entering a network. It may then match the packet

information with the attack signatures, stored in its memory, to locate malicious packets.

Another type of IDS looks at the pattern of the packets, being monitored, to identify packets,

which are trying to attack the network. Such IDSs are said to detect anomalies in packets and

can detect novel types of attacks. The reports of malicious activity are made available at the

management console by both the types of IDSs. An IDS provides an automated mechanism to

detect internal as well as external intruders. Whereas the firewalls are used to show and/or to

limit the ports and IP addresses used for communication between two entities, IDS are able to

look at the content of the packets before taking any action.

The IDS technology is not without challenges. A typical intrusion detection system usually

generates a flood of alerts. Receipt of an alert does not guarantee that there is malicious

activity as some of the alerts may be false positives. If there are large number of false

positives, these may overwhelm the data processing tasks on the system. Moreover an IDS

may sometimes drop the packets when overloaded with large amount of data on the network.

This increases the probability of missing real intrusions.

Signature based IDS are highly accurate having less numbers of false positives. But these are

unable to detect novel type of attacks. On the other hand, anomaly-type IDS can detect novel

type of attacks but produces a large number of false alerts. The tradeoff between ability to

detect novel type of attacks and to generate lower number of false alarms is the key point for

selection of IDS by a network manager. The performance of signature based IDS is limited

by the size of network as it degrades as the size of the network increases.

An efficient attack detection mechanism having higher accuracy and less number of false

alarms requires multiple IDSs of diverse nature viz., signature based and anomaly based IDS

to be deployed for monitoring the same event in a large network. IDS are vulnerable to

attacks as some attackers successfully target the IDS with the aim of bypassing the IDS. This

can become a serious challenge for a network, which is being monitored by a single IDS. The

multiple IDSs, when deployed to overcome the limitations of single IDS, poses the challenge

of interoperability. The interoperability issue is due to the fact that each IDS has different

ways of reporting alerts and their rule sets may be entirely different.

When multiple intrusion detection systems are used to monitor a network, data fusion is used

to enhance the overall performance of the system. Data fusion is the process of combining

alerts from multiple intrusion detection system to derive an inference about presence of an

attack. The data fusion from multiple IDS is used to provide accurate situation assessment for

detection of attack. The desired outcome of fusing data from multiple IDS is often to obtain

low error probability and increased reliability of inference made by the alert fusion system.

The alerts obtained from the data fusion of multiple IDSs has several advantages over the

alerts from a single IDS. The first advantage is that failure of an IDS, when a single IDS is

used, can be catastrophic. Secondly it is possible to have accurate and reliable attack

detection by removing the inherent limitations, when only one type of IDS is used. The

success of data fusion depends on the reliability of evidence provided by IDSs involved in

fusion process, as unreliable IDS deviates the decision making and provides complementary

outcome.

The objectives of the research work were to develop a framework to fuse alerts from

heterogeneous intrusion detection system to ameliorate the detection coverage of intrusions

and to reduce false alert rate. The research resulted in the formulation of a mathematical

model which discounts the intrusion detection system based on their reliability of detecting a

particular type of attack. The model has been validated by using four different IDSs on three

different datasets of attacks on a large network.

Acknowledgement

First and foremost, the eternal gratitude goes to almighty god for enlightening me to pursue

this research work. Almighty God has blessed me with astounding persons during the tenure

of my research. I would to thanks family members and especially parents, late Kalpana

Manharlal Shah and Manharlal Shantilal Shah for nurturing me, loving me and encouraging

me at every step of my life.

I take this opportunity to express deep sense of gratitude to Honorable guide, Dr. Akshai

Aggarwal, EX-Vice Chancellor, GTU for constant motivation, guidance and heartfelt support

in my quest for knowledge. He nurtured my skills to grow from immature PhD scholar to

matured researcher. He has given me all freedom in doing research while ensuring that I will

not deviate from the core of my research.

I have a great pleasure to acknowledge co-supervisor Dr. Nirbhay Chaubey for constant

support, inspiration and mentoring. I would also like to thank foreign co-supervisor Dr.

Shishir Shah, University of Houston, Texas for his valuable insights and rigorous reviews

during research weeks. His guidance at very stages has helped a lot to achieve this milestone.

I would also like to thanks Doctoral Progress Committee members Dr. Haresh Bhatt, Head,

Information Technology & Networks Division, SAC, ISRO and Dr. Y. B. Acharya, Scientist,

Physical Research Lab, Ahmedabad for their valuable suggestions and timely constructive

critics on my work which help to complete research work in right direction and right time.

My sincere gratitude goes to Dr. Rajul Gajjar, Dean, PhD programme, Dr. N. M. Bhatt, Dean,

PhD programme, Prof. S. D. Panchal , I/C Registrar along with staff members of my PhD

section, GTU for administrative assistance and support. Special thanks to a very special

person, my wife, Unnati Shah for his continued love and support. I greatly value her

understanding during the entire duration of my PhD program.

The words of appreciation goes to my little and sweet daughter Prisha for compromising my

love during the phase of thesis writing. Last but not the least, I would like to thank all my

friends and colleagues who has directly and indirectly helped me in completion of this study.

Table of Content

1 Introduction 231.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241.3 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251.4 Objectives and Scope of work . . . . . . . . . . . . . . . . . . . . . . . . . 261.5 Original Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261.6 Organization of Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

2 Basics of Intrusion Detection System 282.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282.2 Signature based Intrusion Detection System . . . . . . . . . . . . . . . . . 312.3 Anomaly based Intrusion Detection System . . . . . . . . . . . . . . . . . . 322.4 Evaluation of Intrusion Detection System . . . . . . . . . . . . . . . . . . . 33

2.4.1 Evaluation metrics IDS . . . . . . . . . . . . . . . . . . . . . . . . . 332.4.2 Evaluation Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . 35

2.5 Results & Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 392.5.1 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . 392.5.2 Experiment Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

2.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

3 Estimation of Reliability of Intrusion Detection system 503.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503.2 Reliability of Intrusion Detection System . . . . . . . . . . . . . . . . . . . 50

3.2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503.2.2 Defination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503.2.3 Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

3.3 Reliability of Data fusion result . . . . . . . . . . . . . . . . . . . . . . . . 523.4 Estimation of Reliability values . . . . . . . . . . . . . . . . . . . . . . . . 543.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

4 Attack Detection Performance of IDS using Data Fusion of multipleheterogeneous systems 564.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 564.2 Data fusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

4.2.1 Defination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 564.2.2 Advantages of data fusion for intrusion detection . . . . . . . . . . 564.2.3 Challenges of data fusion for intrusion detection . . . . . . . . . . . 57

4.3 Data fusion model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 584.4 Frame of Discernment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

16

4.5 Fusion rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 604.6 Requirements and limitations of data fusion rules . . . . . . . . . . . . . . 624.7 Reliable Alert Fusion rule . . . . . . . . . . . . . . . . . . . . . . . . . . . 634.8 Reliable Alert Fusion of Multiple Intrusion Detection system . . . . . . . . 63

4.8.1 Complementary behavior . . . . . . . . . . . . . . . . . . . . . . . . 644.8.2 Complementary Conflicting behavior . . . . . . . . . . . . . . . . . 644.8.3 Supplementary behavior . . . . . . . . . . . . . . . . . . . . . . . . 664.8.4 Supplementary Conflicting confusion matrix . . . . . . . . . . . . . 66

4.9 Results & Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 664.9.1 Experiments against DARPA’99 Dataset . . . . . . . . . . . . . . . 664.9.2 Experiments against KDD’99 Dataset . . . . . . . . . . . . . . . . . 674.9.3 Experiments against NSL-KDD Dataset . . . . . . . . . . . . . . . 704.9.4 Experiments against Real time Attack-UDP(flooding) . . . . . . . . 73

4.10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

5 Summary of Research Work & Recommendation for Future Work 755.1 Summary of Research Work . . . . . . . . . . . . . . . . . . . . . . . . . . 755.2 Recommendations for Future Work . . . . . . . . . . . . . . . . . . . . . . 76

17

List of Abbreviations

• IDS: Intrusion Detection System

• NFR: Network Flight Recorder

• DARPA: Defense Advanced Research Projects Agency

• PHAD: Packet Header Anomaly Detector

• KDD: Knowledge Discovery & Data Mining

• PPV: Positive Prediction Value

• NPV: Negative Prediction Value

• RAF: Reliable Alert Fusion

• TP: True Positive

• FP: False Positive

• TN: True Negative

• FN: False Negative

• TPR: True Positive Rate

• TNR: True Negative Rate

• FPR: False Positive Rate

• FNR: False Negative Rate

• MIT: Mutual Information Transfer

• NSL: Network Security Lab

• CCTV: Closed Circuit Television

• TBM: Transferable Belief Model

• BBA: Basic Belief Assignment

• OSI: Open Systems Interconnection

• DOS: Denial of Service

• U2R: User to Root

• R2L: Remote to Local

• ICMP: Internet Control Message Protocol

• IP: Internet Protocol

18

• TCP: Transmission Control Protocol

• UDP: User Datagram Protocol

• VRT: Vulnerability Research Group

• CRF: Conjunctive Reliability Factor

• DRF: Disjunctive Reliability Factor

• RAF: Reliable Alert Fusion

• WAO: Weighted Average Operator

• PCR: Proportional Conflict Redistribution

• DS: Dempster Shafer

19

List of Figures

2.1 Deployment of Firewall & Intrusion Detection System . . . . . . . . . . . . 282.2 Components of Intrusion Detection System . . . . . . . . . . . . . . . . . . 292.3 Block diagram of Signature based IDS . . . . . . . . . . . . . . . . . . . . 302.4 Denial of service attack (UDP flooding) scenario . . . . . . . . . . . . . . . 302.5 The effect of False positive rate on PPV . . . . . . . . . . . . . . . . . . . 342.6 Intrusion Detection Evaluation Framework . . . . . . . . . . . . . . . . . . 352.7 Number of Attack Instances in DARPA99 . . . . . . . . . . . . . . . . . . 372.8 Total number of connections in DARPA’99 dataset . . . . . . . . . . . . . 372.9 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402.10 % detection rate of snort versus suricata for DOS, probe, R2L and U2R

category . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 482.11 % detection rate of PHAD versus ALAD for DOS, probe, R2L and U2R

category . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

3.1 Binary symmetric model of intrusion detection system . . . . . . . . . . . . 523.2 Reliability versus False positives . . . . . . . . . . . . . . . . . . . . . . . . 533.3 Data fusion for known ground truth . . . . . . . . . . . . . . . . . . . . . . 533.4 Data fusion for unknown ground truth . . . . . . . . . . . . . . . . . . . . 54

4.1 Typical data fusion process . . . . . . . . . . . . . . . . . . . . . . . . . . . 574.2 Data fusion model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 594.3 Positive evidence versus mass value . . . . . . . . . . . . . . . . . . . . . . 604.4 Complementary confusion matrix . . . . . . . . . . . . . . . . . . . . . . . 634.5 Complementary conflicting confusion matrix . . . . . . . . . . . . . . . . . 654.6 Supplementary confusion matrix . . . . . . . . . . . . . . . . . . . . . . . . 654.7 Supplementary Conflicting confusion matrix . . . . . . . . . . . . . . . . . 674.8 Comparison of proposed rule with DS rule against NSL-KDD for detecting

R2L attack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 704.9 Comparison of proposed rule with DS rule against NSL-KDD for detecting

DOS attack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 714.10 UDP(flooding) Packets captured by wireshark . . . . . . . . . . . . . . . . 724.11 UDP(flooding) Packets data . . . . . . . . . . . . . . . . . . . . . . . . . . 72

20

List of Tables

2.1 Defination of various types of alert . . . . . . . . . . . . . . . . . . . . . . 322.2 Alert metrics of IDS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332.3 Categorywise attacks in DARPA99 . . . . . . . . . . . . . . . . . . . . . . 362.4 KDD99 attack distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . 362.5 Categorywise attacks in KDD99 . . . . . . . . . . . . . . . . . . . . . . . . 372.6 Categorywise attacks in NSL-KDD . . . . . . . . . . . . . . . . . . . . . . 382.7 Comparison between various IDS datasets . . . . . . . . . . . . . . . . . . 392.8 Results of Snort & Suricata against DARPA99-Week-4, Day-1 . . . . . . . 422.9 Results of Snort & Suricata against DARPA99-Week-4, Day-2 . . . . . . . 432.10 Results of Snort & Suricata against DARPA99-Week-4, Day-3 . . . . . . . 432.11 Results of Snort & Suricata against DARPA99-Week-4, Day-4 . . . . . . . 432.12 Results of Snort & Suricata against DARPA99-Week-4, Day-5 . . . . . . . 442.13 Results of Snort & Suricata against DARPA99-Week-5, Day-1 . . . . . . . 442.14 Results of Snort & Suricata against DARPA99-Week-5, Day-2 . . . . . . . 452.15 Results of Snort & Suricata against DARPA99-Week-5, Day-3 . . . . . . . 452.16 Results of Snort & Suricata against DARPA99-Week-5, Day-4 . . . . . . . 462.17 Results of Snort & Suricata against DARPA99-Week-5, Day-5 . . . . . . . 462.18 List of attacks detected by various intrusion detection system . . . . . . . . 472.19 Number of signature raised for DOS(ICMP Flooding) attack by snort . . . 472.20 Attack detected by Signature and Anomaly based IDS against DARPA99 . 47

3.1 Evaluation parameters of IDS using binary channel model . . . . . . . . . 52

4.1 Performance of data fusion of two IDS systems under Complementary Con-fusion matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

4.2 Performance of data fusion of two IDS systems under Complementary Con-flicting Confusion matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

4.3 Performance of data fusion of two IDS systems under supplementary con-fusion matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

4.4 Performance of data fusion of two IDS systems under supplementary con-flicting confusion matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

4.5 DARPA99 Experiment description . . . . . . . . . . . . . . . . . . . . . . . 674.6 Comparison of single IDS with fusion using DS and fusion using proposed

rule by deriving reliability value using TPR against DARPA99 dataset . . 674.7 Comparison of single IDS with fusion using DS and fusion using proposed

rule by deriving reliability value using TPR against DARPA99 dataset . . 684.8 DARPA 99 Experiment description . . . . . . . . . . . . . . . . . . . . . . 68

21

4.9 Comparison of single IDS with fusion using DS and fusion using proposedrule by deriving reliability value using conflict between evidences againstDARPA99 dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

4.10 Comparison of single IDS with fusion using DS and fusion using proposedrule by deriving reliability value using conflict between evidences againstDARPA99 dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

4.11 KDD99 Experiment description . . . . . . . . . . . . . . . . . . . . . . . . 684.12 Comparison of single IDS with fusion using DS and fusion using proposed

rule by deriving reliability value using TPR against KDD99 dataset . . . . 694.13 Comparison of single IDS with fusion using DS and fusion using proposed

rule by deriving reliability value using TPR against KDD99 dataset . . . . 694.14 KDD99 Experiment description . . . . . . . . . . . . . . . . . . . . . . . . 694.15 Comparison of single IDS with fusion using DS and fusion using proposed

rule by deriving reliability value using conflict between evidences againstKDD99 dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

4.16 Comparison of single IDS with fusion using DS and fusion using proposedrule by deriving reliability value using conflict between evidences againstKDD99 dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

4.17 NSL-KDD R2L Attack Experiment description . . . . . . . . . . . . . . . . 704.18 NSL-KDD DOS Attack Experiment description . . . . . . . . . . . . . . . 714.19 UDP flooding attack experiment . . . . . . . . . . . . . . . . . . . . . . . . 714.20 Alerts and mass value against UDP flooding attack . . . . . . . . . . . . . 714.21 Fusion result against UDP flooding attack . . . . . . . . . . . . . . . . . . 72

22

Chapter 1

Introduction

1.1 Introduction

The technological world is evolving at an astounding pace. As per internet world statisticsof the year 2017, today 3.8 billion people are using internet which is equivalent to half ofthe world’s population [1]. The explosion in use of the internet is due to the large numberof applications posed by it such as e-commerce, e-banking, email and e-shopping. Theinternet technology is like a double-edged sword, the vast benefits of the internet pose thechallenges about confidentiality, integrity and availability of internet. Computer networksecurity is defined as an activity to secure the network against vulnerabilities in order toprotect the integrity and accessibility of the network. Network security process aims toprotect the network against a variety of threats called as an intrusion and to hinder themfrom entering and spreading in the network. A set of activities that violate the securityprotocol of a computer network system is called as intrusion [2].

Intrusion Detection system (IDS) is a security system that tracks the traffic on acomputer network system, analyzes the traffic and generates a warning called an alert oralarm in case any abnormalities found [3]. IDS is defined as a classifier that collects theevidence about the presence or absence of an attack. IDS usually detects intrusion/attackby recognizing the observations of an intrusion in progress or by analyzing the resultsobtained after the occurrence of intrusion [4]. IDS is capable to analyze the packetsflowing in-out of network and able to discriminate between normal packet and abnormalpacket [5]. The intent of IDS is to detect computer attacks and to alert the networkadministrator about the security breach. IDS is not meant to replace any of the existingsecurity systems on the network but it is for enforcing the intricate security policies forusers on the network. Thus, IDS is a defense-in-depth mechanism alongside firewalls andvirus protection systems [6].

Intrusion Detection System is catching the attention of network security industry dueto its low cost, ease of deployment, real-time detection, and quick response [7]. IntrusionDetection System uses sensors and management console. Sensors detect the maliciousactivity by matching packet information with stored attack signatures and reports suchmalicious activity to the management console. There is a rapid expansion in the useof IDS technology by network administrators, information security officers and databasemanagers as IDS provides automate mechanism to detect internal as well as externalintruder.

23

1.2. BACKGROUND

Intrusion Detection system has tremendous advantages compared to traditional fire-wall systems. Firewalls are capable to shows the ports and IP addresses used for commu-nication between two entity while IDS are able to show the content within the packets.Intrusion detection system is capable of doing protocol based analysis, sensors can detectmalicious activity as they know how the protocols are functioning.

Although there are many advantages of IDS over traditional security tools, the IDStechnology is not without challenges [8]. A typical intrusion detection system usuallygenerates a flood of alerts. The alert does not mean that there is malicious activity asthey can be a false positive which can overwhelm the data processing tasks on the system.Thus, Intrusion detection system requires human intervention to analyzes the alerts as99% are false positives. Intrusion Detection system seems to be unreliable as it sometimesdrops the packets when overloaded with large data on the network. This increases theprobability of missing real intrusions.

1.2 Background

Intrusion Detection System (IDS) is a developing technology and it is the main focusingarea of research from last three decades in the field of computer network security. Thefirst model of IDS was capable of detecting unauthorized access was proposed by Denning[9]. This model lays down the foundation for the development of many other IDS.

Haystack IDS [10] aims to reduce voluminous audit trails by doing pattern matchingbased on attack profiles. The work in the field of IDS was strictly for a particular hostmachine until the year 1990. The distributed intrusion detection system was first proposedby researchers in [11] which enlighted the concept of extending the intrusion detection fromhost to local area network and from local area network to arbitrarily wider networks.

The distributed IDS [11] was capable of monitoring the traffic on a single system aswell as a networked system with multiple hosts. Under the influence of the IDS modelproposed by denning [9], the researchers developed a network anomaly detector and in-trusion reporter [12] which was the first anomaly IDS based on statistics based expertsystem.

Vern Paxson in [13] announced Bro IDS with the support of its own ruleset languagefor analyzing the traffic from packet capture library (libpcap). To address the issuesof security and network administration, amoroso [14] designed Network Flight Recorder(NFR) tool using libpcap. APE initially was developed as a packet sniffer tool in [15] andwas renamed as snort. Snort has now become the world’s largely used signature-basedIDS system with three lakh active users worldwide. The intrusion detection system thatexists in late 90’s were incapable of detecting novel attacks. This laid the foundation foranomaly based intrusion detection system which learns the normal behavior of the systemand any deviation from normal behavior will be detected as an attack. Anup K. Ghoshet. al. in [16] proposed a method to learn the normal profile for intrusion detection witha goal to observe and detect the anomalous traffic. Lippmann [17] shows a methodologyfor offline evaluation of Intrusion detection system against DARPA’99 dataset which wasconsidered as a benchmark dataset.

Mahoney in [18] proposed a method for anomaly detection from network traffic basedon packet bytes and packet header anomaly detector (PHAD) in [18]. The work on

24

1.3. PROBLEM STATEMENT

anomaly detectors found in literature [19], [20], [21], [22] shows that is need for the tech-niques that optimizes the performance of anomaly-based IDS systems. Cuppens [23]proposed an alert correlation cooperative module to correlate the alerts into one singlescenario. Authors in [24] proposed an intelligent intrusion detection systems by integrat-ing the inference from misuse and anomaly-based intrusion detection system with a fuzzymodel in order to assess the overall network attack scenario. Yu & Frincke [25] developeda framework based on Dempster-Shafer theory for fusing the alerts for accurate identifi-cation of intrusion. Their work shows the method to evaluate the alert confidence metricwhich was based on maximum entropy and minimum mean square error. Chen and Aick-elin [26] implemented an anomaly detection method using Dempster-Shafer fusion rulefor fusing features. Their approach was capable of handling the situations where certainfeatures of anomaly detection are missing or some features are poor.

Ciza thomas [27] modified the dempster shafer theory for fusing alerts from heteroge-nous multiple intrusion detection systems for having different views of same attack sce-nario. Authors in [28], [29], [30] and [31] have shown the advantages of alert fusiontechniques for multiple IDS systems over single IDS system.

1.3 Problem Statement

The functional idea of an Intrusion detection system is to detect an intrusion as it happens.However, it is very challenging as most of the IDS suffers from false alarms [32]. Signature-based IDS are very fast and accurate as they identify known intrusion by matching incom-ing traffic with signatures database [33]. Signature-based IDS has become obsolete due toits incapability of detecting novel attacks [34]. Another issue with signature-based IDS isthat the database has a large set of signatures which are being compared with incomingand outgoing packets. This mechanism takes a longer time to detect a malicious packetand hence it negatively affect the security of the network. On counterpart, Anomaly IDSstores the normal behavior of the host and deviation from the normal behavior is identifiedas an intrusion. Anomaly-based IDS is highly efficient in detecting novel attacks. Thecurrent network environments have so much data involved that its normal profile changesover time. Hence, a static anomaly intrusion detection system becomes ineffective in pro-ducing a large number of false alarms. Due to enormous computing power in modernday computer systems, alert correlation and alert fusion are the promising techniques toenhance the performance of attack detection using multiple IDS systems. Alert fusionis a method used to combine the local decision of multiple IDS of diverse characteristicsto make a global inference about the actual situation. Alert fusion is a prospective ap-proach to combine advantages signature-based IDS with anomaly-based IDS to enhancethe performance of intrusion detection in terms of true positive rate and false positiverate. The work in [35] shows the limitations of dempster-shafer theory for fusion of datafrom multiple intrusion detection systems. Francisco et. al. [36] shows that the effec-tive combination of data from various sources yields improved detection rate and reducedfalse alert rate. Each intrusion detection system deployed for detecting intrusions have adifferent level of expertise and their detection methodology may be complementary. Also,IDS alerts can vary in terms of completeness, accuracy and certainty. This unreliabilityof IDS can completely deviate from the decision of data fusion. Wen [37] argued that theinformation found in the real world is uncertain as well as partially reliable. However,none of the data fusion techniques incorporates reliability of data being fused.

25

1.4. OBJECTIVES AND SCOPE OF WORK

Hence the aim of the thesis is to provide a solution to following questions.

1. Can reliable alert fusion based intrusion detection has the capability to resolve thetradeoff issues between novel intrusion detection and false alert rate?

2. How reliability of Intrusion detection system is computed which is used to discountthe masses of unreliable IDS during the data fusion process?

3. How can correctness and reliability of decision from fused system be measured inorder to facilitate the relative comparison and evaluation against individual IDS?

4. How to devise a robust data fusion rule to handle uncertainty and conflict betweenobservations of IDS observing the same event?

1.4 Objectives and Scope of work

Following are the main objectives of the research work:

1. To develop a framework to fuse alerts from heterogeneous intrusion detection systemto improve detection rate and false alarm rate.

2. To investigate the capability of data fusion techniques in enhancing the performanceof an intrusion detection system.

3. To formulate a mathematical model which discounts the intrusion detection systembased on their reliability of detecting a particular attack/intrusion.

1.5 Original Contribution

Performance of signature/misuse based IDS namely, Snort & Suricata and anomaly-basedIDS namely, packet header anomaly detector (PHAD) and application layer anomalydetector (ALAD) has been analyzed and validated against benchmark IDS evaluationdatasets. It is concluded that the IDS is not equally reliable for all classes of attacks. Itprovides very high accuracy for certain class of attack and very less for other classes ofattack which makes single IDS inappropriate to monitor the network with multiclass ofattacks. Reliable data fusion based IDS detection framework is proposed and its perfor-mance is evaluated and validated with three different datasets DARPA99, KDD99 andNSL-KDD. Data fusion rule was used to combine alerts from heterogenous IDS underconflicting, complementary and supplementary behavior of multiple IDS system. Datafusion rules were studied and their performance with uncertain & conflicting evidence isobserved and compared with the proposed framework. Reliable alert fusion rule is for-mulated having the capability to discount the unreliable IDS during the fusion process.The mathematical formula to compute reliability in case of known ground truth and theunknown ground truth is devised. The positive prediction value (PPV) and negativeprediction value (NPV) improves comparatively with increment in the number of truepositives detected by IDS due to the reduction in the number of false positives. Theaccuracy and reliability obtained with fused IDS are analyzed by making use of informa-tion theoretic model using the binary channel. The reliability and accuracy values areindicators of the performance of fusion of multi-IDS using RAF rule. The mathematical

26

1.6. ORGANIZATION OF THESIS

expression for estimating the reliability of fusion-based IDS is formulated. The F-score,Recall and Precision of four IDS systems Snort, Suricata, PHAD and ALAD is calculatedand comparison is done with fused IDS using Conjunctive rule, Dempster Shafer rule,Yager’s rule, Smet’s rule, weighted average operator, proportional conflict rule, adaptiveconflict rule and with proposed reliable alert fusion rule.

1.6 Organization of Thesis

The thesis is organized into five chapters. Chapter-1 gives introduction to research topicand discusses objectives & scope of work followed by chapter-2 which gives basics ofintrusion detection system. The evaluation metrics are presented in this chapter alongwith explaining basic types of IDS viz. signature-based IDS and anomaly-based IDS.the dataset used for evaluation of performance of IDS is discussed and compared. Theevaluation of signature/misuse based IDS like snort, Suricata and anomaly-based IDSPHAD and ALAD is done against the DARPA99 dataset. Individual IDS performance isevaluated in this chapter with parameters which gives justification about failure of IDS inachieving higher detection rate. The chapter concludes with discussion on experimentalsetup and experimental results.

Chapter-3 discuss the concept of reliability of IDS and propose method to estimatethe numerical value of reliability based on known & unknown ground truth. Chapter-4presents the attack detection performance of IDS using data fusion of multiple heteroge-neous systems. The basics of data fusion techniques is discussed by presenting the datafusion model. The existing fusion rules used to combine alerts from multiple intrusiondetection system are presented. The limitations of existing fusion rules in presence ofunreliable IDS is demonstrated in this chapter.

Chapter-4 also describes the proposed fusion rule having capability to compromisebetween reliable and unreliable IDS. The reliability and robustness of proposed data fusionrule is demostrated under complementary, complimentary conflicting, supplementary andsupplementary conflicting behaviour of intrusion detection system. The chapter furthershows performance evaluation of data fusion of multiple IDS with proposed RAF ruleagainst the benchmark dataset DARPA’99, KDD99 and NSL-KDD under two differentreliability criteria. The chapter concludes with a contribution of a reliable and secure IDSwhich is powerful and used as an efficient IDS for securing network against vulnerabilitieswith high reliability & accuracy. Chapter-5 summarizes the research work with conclusion& future scope of work.

27

Chapter 2

Basics of Intrusion Detection System

2.1 Introduction

The widespread use of internet demands for a secure network for protecting the confiden-tial information against internal as well as external intruders. Intruders that are insidethe network having access rights to all valuable information of the network are the majorthreat to compromise the network security policy [38]. Firewall systems are well efficientto secure the networks against the external attackers entering the network [39]. Intrusiondetection system tracks the data flowing in & out of the network, analyzes the data andraises an alert in case of abnormality found.

The difference between Firewall and IDS can be well understood by using the analogy.Assume that you have kept valuable assets at your home. This asset can be protectedby setting up barriers such as gates & install home security systems like closed-circuittelevision cameras. Firewalls can be related to the locked gates while IDS are the CCTVcamera or security systems. Firewalls perform the operation of blocking and filtering theabnormal traffic on the network. While IDS perform the process of sniffing, analyzingand alerting. The goal of IDS is to detect computer intrusions and to alert the networkadministrator about the security breach.IDS is not meant to replace any of the existingsecurity systems on the network but to enforce the intricate security policies for users onthe network [40]. Thus, IDS is a defense-in-depth mechanism alongside firewalls and virus

Figure 2.1: Deployment of Firewall & Intrusion Detection System

28

2.1. INTRODUCTION

Figure 2.2: Components of Intrusion Detection System

protection systems. Figure-2.1 shows the deployment of firewall and intrusion detectionsystem in the network. Anderson introduces the concept of intrusion detection system[41] with the basic aim to maintain confidentiality, integrity and assurance against variousnetwork intrusions.

Intrusion detection system plays a reactive informer role rather than proactive for anetwork administrator [42]. IDS is used to detect intrusions but unable to prevent them.Intrusion detection system is also capable to analyze the audit data to learn the behavior& impact of intrusion for building advanced IDS systems. Intrusion detection systemhas two major components namely, sensors and management console [43] as shown infigure-2.2.

Sensors perform the task of sniffing the network traffic and analyzing the audit pat-terns with the help of data collector and data analyzer while the managament consolehas a knowledge datbase for attacks information about current state of the system & au-dit information describing the events happening and response engine which controls thereaction mechanism & how to respond [44]. The system may raise an alarm & report tothe administrator or may block the source of the attack.

An ideal intrusion detection should be reliable and able to run continuously withouthuman intervention. It should have lower false alarm rate and higher attack detectioncapability and also capable to run without imposing minimum overhead on the system.

29

2.1. INTRODUCTION

Figure 2.3: Block diagram of Signature based IDS

Figure 2.4: Denial of service attack (UDP flooding) scenario

30

2.2. SIGNATURE BASED INTRUSION DETECTION SYSTEM

2.2 Signature based Intrusion Detection System

The intrusion detection in signature based IDS is based on the principle of comparingincoming packet from network traffic with stored attack patterns to detect abnormalactivity/intrusions. The stored patterns also called as signatures are basically the knownmethods used by attacker to intrude the network. The IDS generates a alert, if the contentof incoming packet matches with attack signature.

Signature based IDS also called misuse based IDS is incapable to detect zero-dayvulnerability/novel intrusions. This type of IDS provide lower false alarms. In its simplestform signature based IDS has two main components: 1) Pattern matching 2) Attack ruledatabase. The basic working block diagram of signature IDS is as shown in figure 2.3.Snort is the most popular open source signature based IDS [45] with libcap based packetsniffing and alert logging capability. Snort has three basic modes of operation:

1. Packet sniffer mode

2. Packet logger mode

3. Intrusion detection system

In packet sniffer mode, snort reads the network traffic and displays the content ofpackets. In the packet logging mode, snort is used to log all three packets to the storagedisk. While, in IDS mode it sniffs the traffic, analyzes the traffic against the defined ruleset[46]. Snort has a capability to perform real-time traffic analysis. The alerts generatedby snort can be visualized using snorby, sguil [47] and squert tools [48]. The efficiencyand accuracy of intrusion detection by snort requires that the signature database shouldbe well defined. Attack signatures are officially available for download by vulnerabilityresearch team (VRT) group or user-defined rules can be created using simple rule formatsas defined in [49].

Suricata is open source IDS tool released in the year 2009 and its stable version in2010 [50]. Suricata with all the functionality of snort such packet sniffing, packet loggingand intrusion detection system has the capability to combines the ruleset from variousdatabases such as emerging threats, VRT rules, etc. Suricata offers high efficiency andincreased speed in network traffic due to its multithreading engine [51].

Attack detection using signature-based IDS

Denial of service is an attack in which the attacker prevents the legitimate user from usingthe network resources. A typical DOS (UDP) flooding attack is one in which intrudersends a large number of packets to cause network bandwidth saturation on the host ma-chine. Figure-2.4 shows UDP flooding attack scenario where an attacker sends a largenumber of UDP packets on a specific port on the victim system. Thus, when a legiti-mate user requires a service on the victim machine, the system sends ”DESTINATIONUNREACHABLE” message.

In order to detect such malicious activity a typical snort alert can be written as :

31

2.3. ANOMALY BASED INTRUSION DETECTION SYSTEM

Table 2.1: Defination of various types of alert

Name Defination

True positive (TP) Number of attacks that are correctly de-tected

False positive (FP) Number of normal traffic packet that are in-correctly detected as attacks

True negative (TN) Number of normal traffic packets that arecorrectly classified

False negative (FN) Number of attacks that are not detected

alert icmp any any $->$ any any (msg:"ICMP Destination

Unreachable Communication with Destination Host is Prohibited"}

The input packet is compared with the above rule and if abnormality detected thenalert will be raised and logged into the alert file.

2.3 Anomaly based Intrusion Detection System

Anomaly-based IDS stores the normal behavior of the network and any deviation fromthe normal behavior is identified as an intrusion. Anomaly detector requires a trainingphase to learn the normal behavior of the system on the network and testing phase toevaluate the observed data against stored normal profile [52].

Anomaly-based IDS system is advantageous compared to signature-based IDS havingthe capability of detecting a novel attack even though the relevant information about theattack is unavailable. Anomaly detector usually provides a large number of false alertsas any previously unknown behavior is detected as intrusion [53]. Secondly, Anomalydetectors require a large training dataset for creating a normal profile that represents thesystem behavior under normal condition. A typical functional flowchart of anomaly-basedIDS is as shown in [34].

Signature-based detectors like Snort and Suricata uses attack signatures while anomalydetectors PHAD and ALAD learns the statistical model of normal profile and deviationfrom the normal profile is detected as attack [18]. The difference between Packet headeranomaly detector (PHAD) and Application layer anomaly detector (ALAD) is in the typeof attributes used for estimating the normal profile. PHAD uses 34 attributes correspond-ing to ethernet, IP, TCP, UDP and ICMP packet header fields. The normal ranges ofabove packet header fields are used to create a normal profile and any deviation from thisnormal ranges is detected as abnormality or intrusion. On the other hand, ALAD usesincoming server TCP requests: source and destination addresses and ports, opening andclosing TCP flags in the application payload as the attributes to model the normal profile[54].

32

2.4. EVALUATION OF INTRUSION DETECTION SYSTEM

Table 2.2: Alert metrics of IDS

Name Defination

True positive rate (TPR)TP

TP + FN

False positive rate (FPR)FP

FP + TN

True negative rate (TNR)TN

TN + FP

False negative rate (FNR)FN

TP + FN

2.4 Evaluation of Intrusion Detection System

2.4.1 Evaluation metrics IDS

The evaluation metric for an intrusion detection system is an quantity to measure theperformance of Intrusion Detection System for detecting network intrusions. Table-2.1shows the type of alert raised by the IDS. Following are the types of evaluation metricsfor intrusion detection systems:

1. Alert metrics

2. Sensitivity metrics

3. Capability metrics

4. Variability metrics

Alert metrics of IDS

The Alert metrics of the intrusion detection system are the true positive rate, false positiverate, true negative rate and false negative rate as shown in table-2.2. The true positiverate is defined as the ratio of the number of true positives to the sum of the number oftrue positives and the number of false negatives. The false positive rate is defined as theratio of the number of false positives to the sum of the number of true positives and thenumber of false positives. The true negative rate is defined as the ratio of the number oftrue negatives to the sum the number of true negatives and the number of false positives.The false negative rate is defined as the ratio of the number of false negatives to the sumof the number of false negatives and the number of true positives [55].

Sensitivity metrics of IDS

The Sensitivity metrics of the intrusion detection system are positive prediction valueand negative prediction value. The positive prediction value (PPV) is a term that indi-cates that there is an intrusion present when IDS outputs an alert. Negative predictionvalue(NPV) indicates that there is no intrusion present when IDS outputs no alert [56].PPV and NPV is given by the formula,

33


Figure 2.5: The effect of False positive rate on PPV

PPV =TP

FP + TP(2.1)

NPV =TN

FN + TN(2.2)

Figure-2.5 shows the effect of increase in false positive rate on positive prediction valuefor constant values of true positive rate.

Capability metrics of IDS

The capability metrics of IDS are accuracy, reliability and mutual information transfer(MIT) [57]. The accuracy of IDS is defined as the degree to which the detection ofintrusion detection system is correct and is given as,

ACCURACY =TN + TP

FN + TN + TP + FP(2.3)

Variability metrics of IDS

The variability metrics of IDS are F-Score, Recall and Precision. Precision indicatesthe number of intrusions detected is belonging to the attack class. Recall indicates thenumber of intrusions that are correctly detected. F-Score defined as the test to measurethe accuracy of IDS [58]. F-score, Precision & Recall is given by,

F − Score =2PR

P +R(2.4)

34


Figure 2.6: Intrusion Detection Evaluation Framework

Where,

P =TP

TP + FP(2.5)

R =TP

TP + FN(2.6)

2.4.2 Evaluation Datasets

The evaluation of intrusion detection system is much needed task as security systemdeployed have to prove which type of attacks they are capable to detect. The evaluationof IDS is used to compare the performance capability of IDS compared to other IDSs.However, the evaluation of IDS is generally a challenging task due to lack of availabilityof proper evaluation framework. The evaluation can be an offline evaluation or an onlineevaluation.

The online/live traffic evaluation of IDS is achieved by creating a network of threemachines with each machine having two network interface card. The second networkinterface card is used to configure a bridge for allowing the network traffic to pass throughthem.

The online evaluation framework requires Wireshark and TCPreplay tool. Wiresharkis used to capture live traffic from network. While tcpreplay is used to send and replaythe capture file on a specified interface. Intrusion detection system sniffs the networktraffic having combination of attack traffic and background traffic. The network traffic isreplayed at desired speed by TCPREPLAY tool [59]. Intrusion detection system generatesthe alert which are logged into log file. The log file is used to evaluate various perfor-mance metrics of the intrusion detection system. Figure 2.6 shows a typical evaluationframework.

The offline evaluation of IDS is done by recreating the datasets having attacks alongwith background traffic. TCPdump and TCPreplay tool are used to perform an of-fline evaluation. There are multiple common datasets available for others non-IDS fields.

35


Table 2.3: Categorywise attacks in DARPA99

Attack Class Attack Type

Denial of Service (DOS) Attacks apache2, smurf, neptune, dosnuke, land, pod,back, teardrop, tcpreset, syslogd, crashiis,arppoison, mailbomb, selfping, processtable,udpstorm, warezclient

Probe Attacks portsweep, ipsweep, lsdomain, ntinfoscan,mscan, illegal-sniffer, queso, satan

Remote to Local (R2L) Attacks dict, netcat, sendmail,imap, ncftp, xlock, xs-noop, sshtrojan, framespoof, ppmacro, guest,netbus, snmpget, ftpwrite, httptunnel, phf,named

User to root (U2R) Attacks sechole, xterm, eject, ntfsdos, nukepw, se-cret, perl, ps, yaga, fdformat, ppmacro ffb-config, casesen, loadmodule, sqlattack

Table 2.4: KDD99 attack distributionTraining Size (Percentage) Test Size ( Percentage)

Normal 972781 19.85 60593 19.48DOS 3883390 79.27 231455 74.41Probe 41102 00.83 4166 01.33U2R 52 00.001 245 00.07R2L 1106 00.02 14570 04.68Total 4898431 100 311029 100

However, there are no recent dataset for IDS evaluation [60]. DARPA-1999, KDD99 andNSL-KDD are some of the well-known datasets used for the evaluation of IDS. Followingsubsection will describe an overview of various intrusion detection datasets used in thepresent thesis.

DARPA 1999 Dataset

Defense Advanced Research Projects Agency (DARPA) designed first benchmark datasethaving raw TCP/IP dump files in the year 1998 and 1999 for evaluation of intrusiondetection system [61]. DARPA99 is modified and renewed version of DARPA98 whichconsists of total 5 weeks of data which is divided into 3 weeks of training dataset and 2weeks of testing dataset. Each week of dataset consists of five day of data from Monday toFriday of inside and outside traffic. The detailed explanation of various intrusions/attackspresent in DARPA99 along with normal traffic is explained in detail in [62].

Figure-2.8 shows the total number of connections available in the DARPA’99 testingdataset of the fourth and the fifth week. Figure-2.7 the number of attack instances present

36


Figure 2.7: Number of Attack Instances in DARPA99

Figure 2.8: Total number of connections in DARPA’99 dataset

Table 2.5: Categorywise attacks in KDD99


Denial of Service (DOS) Attacks Back, land, neptune, pod, smurf, teardrop

Probe Attacks Ipsweep, nmap, portsweep, satan

Remote to Local (R2L) Attacks ftp write, guess passwd, imap, multihop, phf,spy, warezclient, warezmaster

User to root (U2R) Attacks Buffer overflow, Loadmodule, perl, rootkit

37


Table 2.6: Categorywise attacks in NSL-KDD


Denial of Service (DOS) Attacks Back, Land, Neptune, Pod, Smurf, Teardrop,Mailbomb, Processtable, Udpstorm,Apache2, Worm

Probe Attacks Satan, IPsweep, Nmap, Portsweep, Mscan,Saint

Remote to Local (R2L) Attacks Guess password, Ftp write, Imap, Phf, Mul-tihop, Warezmaster, Xlock, Xsnoop, Sn-mpguess, Snmpgetattack, Httptunnel, Send-mail, Named

User to root (U2R) Attacks Buffer overflow, Loadmodule, Rootkit, Perl,Sqlattack, Xterm,Ps

in DARPA99. DARPA99 consists of total 194 instances of attacks with 44 probe, 57 DOS,57 R2L and 36 U2R attacks. Table-2.3 shows category wise attacks present in DARPA99dataset.

Most research in the field of IDS have been done using DARPA99 dataset. However,many of the researchers have criticized and argued about its applicability for IDS eval-uation. Most of them consider that the dataset is very outdated and unable to createbehavior like the present day attack. Along with the critics, there is a significant argumentin favor of DARPA99.

In [63], authors argued that the non-availability of any other dataset that includes thecomplete network traffic was probably the initial reason to make use of the DARPA datasetfor IDS evaluation by researchers. In [64], authors comment that if a present day advancedsystem could not perform well on DARPA dataset, it could also not perform acceptably onrealistic data. Authors in [65] argued that even though there are shortcomings, the Lincolnevaluation indicates that even the best of the research IDS systems fall far short of theDARPA goals for detection and false alarm performance. MCHugh in his work [65] believethat any sufficiently advanced IDS should be able to achieve good true positive detectionperformance on the DARPA IDS evaluation dataset. Demonstrating such performance,however, is only necessary to show the capabilities of such a detector, it is not sufficient.

KDD99 Dataset

The KDD99 dataset is knowledge discovery database originally created from DARPA98.KDD99 is suitable for evaluation of signature-based as well as anomaly-based IDS as ithas two weeks of training dataset consisting of attack-free instances and three week of thetesting dataset having attack instances. The KDD99 dataset has 41 features along withone class label. The class label consists of attack in four categories R2L, U2R, Probe andDOS. KDD99 have only 20 % of normal traffic and 80% of attack traffic which makes it

38

2.5. RESULTS & DISCUSSION

Table 2.7: Comparison between various IDS datasetsDARPA99 KDD99 NSL-KDD

Type Base Dataset Extracted Features Duplicatesremoved

with RawTCP/IPdump files

and size re-duced

Training data size 6.2 GB 4898431 Kb 125973 KbTesting data size 3.67 GB 311029 Kb 22544 KbUsability 15 literature

paper avail-able

153 literature paper 33 literaturepaper avail-able

an unbalanced dataset. Also, R2L and U2R are the rarest attack in KDD99 while DOShas predominant and duplicate instances. Table-2.4 shows the attack distribution in theKDD99 dataset and Table-2.5 shows category wise attacks in KDD99. KDD99 dataset hasredundant and frequent records which causes the evaluation result to be biased towardsit. This limitation of KDD99 is harmful when we are detecting less frequent attacks likeR2L and U2R.

NSL-KDD Dataset

NSL-KDD is the newest dataset in order to solve the issues with KDD99. NSL-KDDwas distributed for testing in the year 2009 by University of New Brunswick. This newversion of dataset does not have redundant records in train set, so the classfier does notget biased towards more frequent records. Also, the test set does not have duplicaterecords which give better detection rates. The reduced NSL-KDD make it reasonableto run the experiments without the need of randomly selecting small set as in KDD99.The results by Tavallaee et al. [66] shows that the results obtained by NSL-KDD makethe evaluation results more consistent and comparable. NSL-KDD consists of 21 labelledattacks in training dataset and 37 attacks in testing dataset along with novel attacks.Table-2.6 shows the categorywise attacks present in NSL-KDD dataset.

Table-2.7 shows the comparison between various IDS datasets in terms of type, trainingand testing dataset size and usability in literature.

2.5 Results & Discussion

2.5.1 Experimental Setup

The evaluation of intrusion detection is carried out with the experimental setup as perfigure-2.9. The setup consists of three 3rd Generation Intel R© CoreTM i5 processor (1.6GHz) with operating system Linux Ubuntu having RAM-4GB. One machine deployedwith Signature-based IDS such as Snort and Suricata and another machine deployed withAnomaly detectors such as PHAD and ALAD. The third machine acts as an attackermachine having dataset loaded. The dataset traffic will be replayed with TCPReplay[67].

39


Figure 2.9: Experimental Setup

Network traffic generation

The network traffic generation requires a tool to produce realistic background traffic byrecreating the dataset for IDS testing. DARPA99 dataset distributed online in [68] consistsof five weeks of network traffic files of inside and outside traffic in packet capture (PCAP)format. Tcpreplay is capable of replaying a pcap file out of one or two interfaces. Whenplaying the traffic out of two interfaces, tcpreplay simulates the client/server relationshipnormally seen in network traffic. In this manner, although the file resides on one machine,the traffic can be played across a device, such as a router.

The tcpprep command is used to seperate the incoming traffic into two categories,client and server packets.

tcpprep -a bridge -o OUT.prep -i inside.tcpdump.pcap

The information content can rewriten with tcprewrite

tcprewrite -C -c out.prep -i inside.tcpdump.pcap --o output.pcap

The traffic can be replayed on the network interface intf1 using tcpreplay

tcpreplay --intf1=eth0 output.pcap

The traffic can be replayed on the network interface with different speeds using fol-lowing commands. To replay as quickly as possible

tcpreplay --topspeed --intf1=eth0 output.pcap

To replay traffic at a rate of 10Mbps

tcpreplay --mbps=10.0 --intf1=eth0 output.pcap

40


Signature based detection

The DARPA99 dataset replayed on the network from attacker machine was sniffed byanother two machines, one installed with signature based IDS like snort and suricata andanother installed with anomaly based IDS like PHAD and ALAD.

Snort is a lightweight network intrusion detection system having capability to detectmisuse in the network traffic.

Downloading and installing snort

sudo su

wget -o snort -2.8.6.1.tar.gz http://www.snort.org/downloads/116

tar xvzf snort -2.8.6.1.tar.gz

cd snort -2.8.6.1

./configure

make

make install

Creating required file and directory

mkdir /etc/snort

mkdir /etc/snort/rules

mkdir /var/log/snort

For starting snort in the intrusion detection mode

snort -A full -c /etc/snort/snort.conf

Installation of suricata for intrusion detection

tar xzvf suricata-4.0.0.tar.gz

cd suricata-4.0.0

./configure

make

make install

To run suricata in IDS mode

suricata $-c$ /etc/suricata/suricata.yaml $-i$ eth0

Anomaly based detection

For configuring the machine for anomaly detection, the source files of two well-knownanomaly detectors namely, Packet header anomaly detector (PHAD) and ApplicationLayer Anomaly detector (ALAD) is downloaded from http: //www.cs.fit.edu. Week-3data from DARPA99 is used for training the anomaly detectors while week-4 and week-5data are used for testing the anomaly detectors.

To run PHAD and ALAD detectors following instructions are used

41


Table 2.8: Results of Snort & Suricata against DARPA99-Week-4, Day-1Week Day Type of Traffic ID Date Name of Attack Attack Category Attacker IP Victim IP Snort IDS Suricata IDS

4 1 inside (3) 41.091531 29/03/1999 portsweep probe 172.016.118.020 172.016.113.050

4 1 inside (5) 41.111531 29/03/1999 portsweep probe 172.016.118.050 192.168.001.001

4 1 inside (6) 41.133333 3/29/1999 guessftp r2l 172.016.118.070 172.016.112.050 yes

4 1 inside (2) 41.155048 29/03/1999 yaga u2r 172.016.118.070 172.016.112.100

4 1 inside (1) 41.161308 29/03/1999 crashiis dos 172.016.118.070 172.016.112.100

4 1 outside (4) 41.084031 29/03/1999 ps u2r 209.154.098.104 172.016.112.050

4 1 outside (2) 41.084818 29/03/1999 sendmail r2l 202.049.244.010 172.016.114.050

4 1 outside (1) 41.09 29/03/1999 ntfsdos r2l 172.16.112.100 172.16.112.100

4 1 outside (4) 41.093708 29/03/1999 sshtrojanInstall r2l 202.077.162.213 172.016.114.050

4 1 outside (1) 41.112127 29/03/1999 xsnoop r2l 128.223.199.068 172.016.114.168 yes yes

4 1 outside (2) 41.114554 29/03/1999 snmpget r2l 204.097.153.043 172.016.000.001

4 1 outside (1) 41.114703 29/03/1999 guesstelnet r2l 192.005.041.239 172.016.113.050

4 1 outside (5) 41.122222 29/03/1999 portsweep probe 153.107.252.061 172.016.112.100

4 1 outside (3) 41.13583 29/03/1999 ftpwrite r2l 194.027.251.021 172.016.112.050 yes yes


4 1 outside (2) 41.182453 29/03/1999 secret data 195.115.218.108 172.016.112.050

4 1 outside (2) 41.213446 29/03/1999 smurf dos 006.238.105.108 172.016.112.050

g++ phad.cpp -O -o phad

DARPA99 week-3, week-4 and week-5 dataset from monday to friday are downloaded andevaluated with anomaly detectors using following instruction:

phad 1123200 in3* in4* in5* >phad.sim

g++ te.cpp -O -o te

te in3* > train

te in45* > test

perl alad.pl train test > alad.sim

2.5.2 Experiment Results

This section presents the result obtained by running signature based IDS Snort & Suricataand Anomaly based IDS PHAD & ALAD against DARPA99 dataset. We run the snort& suricata with its default ruleset against DARPA99 dataset with four classes of attacksnamely, DOS, U2R, R2L and probe. The objective behind the investigation is to highlightthe problem of lower detection coverage. The analysis of IDS alerts was supervised bysnorby and squert. Table-2.8 to Table-2.18 shows the list of attacks present in week-4 andweek-5 of DARPA99 and corresponding attacks detected by snort and suricata IDS.

Table-2.19 shows the number alerts raised by snort against each signatures of snortfor ICMP flooding attacks. The generated alerts are classified as true alerts and falsealerts based on the ground truth information from the dataset. We have considered fourcategories of attacks for evaluation and observed the amount of true alerts, false alertsand have derived the percentage detection rate of snort and suricata. Detection rate of20% for denial of service attack of snort IDS is found to be minimum and 35% for R2Lattack was maximum. Since suricata has an ability to understand level-7 of OSI model,we observe an improvement in detection rate of suricata against each category. Table-2.18shows the list of attacks detected by various intrusion detection systems.

42



4 2 outside(2) 42.090909 30/03/1999 httptunnel r2l 197.182.091.233 172.016.112.050 yes yes

4 2 outside(1) 42.094131 30/03/1999 phf r2l 202.247.224.089 172.016.114.050

4 2 outside(1) 42.104107 30/03/1999 loadmodule u2r 194.027.251.021 172.016.113.050

4 2 outside(3) 42.112913 30/03/1999 ps u2r 206.222.003.197 172.016.112.050 yes

4 2 outside(1) 42.12 30/03/1999 ntfsdos r2l 172.16.112.100 172.16.112.100

4 2 outside(3) 42.122248 30/03/1999 secret data 197.218.177.069 172.016.114.050

4 2 outside(1) 42.135452 30/03/1999 sqlattack r2l 207.253.084.013 172.016.112.194

4 2 outside(4) 42.143228 30/03/1999 sechole u2r 205.160.208.190 172.016.112.100 yes

4 2 outside(1) 42.145441 03/30/1999 land dos 172.016.113.050 172.016.113.050

4 2 outside (73) 42.155148 03/30/1999 mailbomb dos 194.027.251.021 172.016.114.050 yes(3) yes(15)

4 2 outside (1) 42.174944 03/30/1999 processtable dos 208.240.124.083 172.016.114.050 yes(3)

4 2 outside (1) 42.21041 03/30/1999 crashiis dos 209.001.012.046 172.016.112.100


4 3 inside (1) 43.101313 31/03/1999 processtable dos 172.016.118.060 172.016.113.050

4 3 inside (2) 43.113032 31/03/1999 arppoison probe 172.016.118.020 172.016.113.050

4 3 inside (1) 43.144547 31/03/1999 smurf dos 001.012.120.006 172.016.112.100

4 3 outside (4) 43.080401 31/03/1999 satan probe 209.030.070.014 172.016.114.050 yes(4) yes(1)

4 3 outside (2) 43.084 31/03/1999 netcat r2l 207.230.054.203 172.016.112.100

4 3 outside (1) 43.093814 31/03/1999 imap r2l 208.254.251.132 172.016.114.050 yes(9) yes(1)

4 3 outside (8) 43.1 31/03/1999 ppmacro r2l 194.027.251.021 172.016.112.100 yes(45) yes(1)

4 3 outside (1) 43.11 31/03/1999 netcat r2l 206.048.044.018 172.016.112.100 yes yes(1)

4 3 outside (2) 43.111111 31/03/1999 warezmaster dos 202.027.121.118 172.016.112.050

4 3 outside (2) 43.1145 31/03/1999 ncftp r2l 152.169.215.104 172.016.114.050 yes yes(1)

4 3 outside (2) 43.122854 31/03/1999 secret data 196.037.075.158 172.016.112.050

4 3 outside (2) 43.1259 31/03/1999 named r2l 194.007.248.153 172.016.112.020 yes

4 3 outside (1) 43.134223 31/03/1999 guessftp r2l 208.240.124.083 172.016.113.050

4 3 outside (7) 43.155357 31/03/1999 guest r2l 209.012.013.144 72.016.112.050



4 3 outside(2) 43.175811 31/03/1999 guesstelnet r2l 209.001.012.046 172.016.112.050

4 3 outside (201) 43.191217 31/03/1999 snmpget r2l 207.181.092.211 172.016.000.001


4 4 inside (1) 44.082615 1/04/1999 teardrop dos 172.016.118.060 172.016.114.050

4 4 inside (1) 44.11 1/04/1999 dosnuke dos 172.016.115.234 172.016.112.100

4 4 inside (2) 44.1145 1/04/1999 ncftp r2l 172.016.118.040 172.016.114.050

4 4 inside (12) 44.164944 1/04/1999 sshprocesstable dos 172.016.118.020 172.016.112.050 yes(14)

4 4 outside (2) 44.08 1/04/1999 ntinfoscan probe 172.016.112.100 206.048.044.018

4 4 outside (12) 44.080757 1/04/1999 ipsweep probe 172.016.112.005 172.016.112.010

4 4 outside (4) 44.083 1/04/1999 netbus r2l 135.008.060.182 172.016.112.100 yes(3) yes(121)

4 4 outside (4) 44.091807 1/04/1999 sshtrojan r2l 195.073.151.050 172.016.114.050 yes yes

4 4 outside (4) 44.1205 1/04/1999 ppmarcro r2l 194.027.251.021 172.016.112.100 yes yes

4 4 outside (9) 44.1247 1/04/1999 guest r2l 153.107.252.061 172.016.112.050

4 4 outside (2) 44.1307 1/04/1999 xlock r2l 209.012.013.144 172.016.114.168 yes(1) yes(48)

4 4 outside (1) 44.131529 1/04/1999 guesspop r2l 202.247.224.089 172.016.112.194 yes(1) yes(1)

4 4 outside (1) 44.161242 1/04/1999 phf r2l 202.072.001.077 172.016.114.050

4 4 outside (1) 44.183234 1/04/1999 mailbomb dos 194.027.251.021 172.016.114.050 yes(50) yes

4 4 outside (1) 44.201454 1/04/1999 sqlattack r2l 194.007.248.153 172.016.112.194 yes

43



4 5 inside (1) 45.084547 2/04/1999 smurf dos 001.012.120.006 172.016.112.050

4 5 inside (2) 45.1145 2/04/1999 ncftp r2l 172.016.118.070 172.016.114.050

4 5 outside (2) 45.09 2/04/1999 arppoison probe 206.047.098.151 172.016.114.050

4 5 outside (2) 45.095541 2/04/1999 sshtrojan r2l 195.073.151.050 172.016.114.050 yes

4 5 outside (12) 45.100334 2/04/1999 ipsweep probe 172.016.112.005 172.016.112.010

4 5 outside (2) 45.103937 2/04/1999 xlock r2l 139.134.061.042 172.016.114.050 yes(1) yes(1)

4 5 outside (3) 45.105138 2/04/1999 named r2l 194.007.248.153 172.016.112.020 yes(1) yes(2)


4 5 outside (3) 45.1149 2/04/1999 netbus r2l 209.001.012.046 172.016.112.100


4 5 outside(2) 45.130542 2/04/1999 named r2l 195.073.151.050 172.016.112.020 yes

4 5 outside(6) 45.14 2/04/1999 ipsweep probe 128.223.199.068 172.016.114.050 yes(8) yes(8)

4 5 outside(1) 45.162148 2/04/1999 loadmodule u2r 194.027.251.021 172.016.113.050

4 5 outside(3) 45.165009 2/04/1999 sechole u2r 195.115.218.108 172.016.112.100 yes(3)

4 5 outside(1) 45.181011 2/04/1999 portsweep probe 202.049.244.010 172.016.113.050


5 1 outside(1) 51.0838 5/04/1999 pod dos 202.077.162.213 172.016.112.050


5 1 outside(2) 51.085 5/04/1999 pod dos 202.077.162.213 172.016.114.050

5 1 outside(2) 51.085947 5/04/1999 warezclient dos 207.075.239.115 172.016.112.050 yes yes

5 1 outside 51.093123 5/04/1999 smurf dos login session 172.016.112.050


5 1 outside(3) 51.105811 5/04/1999 guesstelnet r2l 192.005.041.239 172.016.118.080 yes(1) yes(1)

5 1 inside(1) 51.1145 5/04/1999 dosnuke dos 172.016.115.234 172.016.112.100

5 1 inside(1) 51.120309 5/04/1999 loadmodule u2r 172.016.114.207 172.016.113.050

5 1 outside(1) 51.121101 5/04/1999 ffbconfig u2r 135.013.216.191 172.016.112.050

5 1 outside(1) 51.131803 5/04/1999 smurf dos 023.234.078.052 172.016.114.050

5 1 outside(2) 51.133019 5/04/1999 arppoison probe 152.169.215.104 172.016.112.100

5 1 outside(12) 51.1401 5/04/1999 apache2 dos 152.169.215.104 172.016.114.050

5 1 outside(1) 51.1421 5/04/1999 pod dos 010.011.022.033 172.016.113.050

5 1 inside(1) 51.144601 5/04/1999 imap r2l 172.016.117.103 172.016.114.050 yes

5 1 inside(6) 51.1632 5/04/1999 dict r2l 172.016.118.010 172.016.114.050

5 1 outside(1) 51.171917 5/04/1999 syslogd dos 172.005.003.005 172.016.112.050

5 1 outside(1) 51.180445 5/04/1999 neptune dos 010.020.030.040 172.016.112.050 yes(8)

5 1 outside(1) 51.183623 5/04/1999 crashiis dos 202.072.001.077 172.016.112.100

5 1 outside(1) 51.185613 5/04/1999 ls probe 195.073.151.050 172.016.112.020 yes yes

5 1 outside(1) 51.194715 5/04/1999 dosnuke dos 206.048.044.018 172.016.115.234

5 1 outside(2) 51.200037 5/04/1999 udpstorm dos 172.016.112.050 172.016.113 yes

5 1 outside(1) 51.201715 5/04/1999 selfping dos 135.013.216.191 172.016.112.050

5 1 inside(2) 51.204631 5/04/1999 ncftp r2l 172.016.118.070 172.016.114.050

44



5 2 outside(2) 52.081109 6/04/1999 tcpreset probe 135.008.060.182 172.016.112.050

5 2 outside(1) 52.083236 6/04/1999 teardrop dos 207.230.054.203 172.016.114.050 yes(19)

5 2 inside(4) 52.085357 6/04/1999 casesen u2r 172.016.113.204 172.016.112.100 yes(2)

5 2 outside(1) 52.0922 6/04/1999 xsnoop r2l 194.007.248.153 172.016.112.050 yes yes


5 2 outside(4) 52.100738 6/04/1999 xterm u2r 152.169.215.104 172.016.112.194

5 2 inside(3) 52.101901 6/04/1999 ftpwrite r2l 172.016.114.207 172.016.112.050 yes yes

5 2 inside(4) 52.103409 6/04/1999 back dos 206.048.044.050 172.016.114.050 yes

5 2 outside(3) 52.112045 6/04/1999 ps u2r 199.227.099.125 172.016.112.050

5 2 outside(2) 52.113855 6/04/1999 neptune dos 010.020.030.040 172.016.114.050 yes(22) yes(2)

5 2 outside(1) 52.1206 6/04/1999 httptunnel r2l 172.016.112.050 196.037.075.158

5 2 outside(3) 52.125501 6/04/1999 eject u2r 152.169.215.104 172.016.112.050

5 2 outside(1) 52.130655 6/04/1999 pod dos 166.102.114.043 172.016.113.050 yes

5 2 outside(2) 52.132827 6/04/1999 yaga u2r 194.007.248.153 172.016.112.100 yes


5 2 outside(11) 52.140207 6/04/1999 ppmarcro r2l 199.174.194.016 172.016.112.100 yes(28) yes(80)

5 2 inside(1) 52.1412 6/04/1999 syslogd dos 172.003.045.001 172.016.112.050

5 2 outside(1) 52.142452 6/04/1999 perl u2r 207.103.080.104 172.016.114.207 yes yes

5 2 outside(6) 52.162435 6/04/1999 fdformat u2r 196.038.075.158 172.016.112.050 yes(20)

5 2 outside(7) 52.165435 6/04/1999 queso probe 199.227.099.125 172.016.113.050

5 2 outside(1) 52.181637 6/04/1999 neptune dos 010.020.030.040 192.168.001.001

5 2 outside(1) 52.205605 6/04/1999 dosnuke dos 172.016.115.234 172.016.112.100 yes(2) yes(1)

5 2 inside(2) 52.214522 6/04/1999 ncftp r2l 172.016.118.020 172.016.114.050

5 2 outside(1) 52.050813 7/04/1999 udpstorm dos 172.016.112.050 172.016.113.050



5 3 outside(1) 53.084346 7/04/1999 xlock r2l 152.204.242.193 172.016.112.050

5 3 outside(1) 53.085717 7/04/1999 phf r2l 209.001.012.046 172.016.114.050

5 3 inside(1) 53.092039 7/04/1999 tcpreset probe hobbes console 172.016.112.050 yes

5 3 outside(2) 53.0948 7/04/1999 netbus r2l 209.167.099.071 172.016.112.100

5 3 outside(1) 53.102617 7/04/1999 back dos 152.204.242.193 172.016.114.050

5 3 outside(2) 53.1105 7/04/1999 netcat r2l 172.016.113.084 197.182.091.233



5 3 outside(1) 53.133203 7/04/1999 perl u2r 209.017.189.098 172.016.112.207 yes yes

5 3 inside(7) 53.134015 7/04/1999 queso probe 172.016.114.169 172.016.112.050

5 3 outside(21) 53.1448 7/04/1999 snmpget r2l 207.230.054.203 172.016.000.001 yes yes(16)

5 3 inside(1) 53.15011 7/04/1999 processtable dos 172.016.117.052 172.016.113.050

5 3 outside(1) 53.152648 7/04/1999 back dos 194.027.251.021 172.016.114.050 yes yes

5 3 outside(2) 53.155432 7/04/1999 ffbconfig u2r 195.115.218.108 172.016.112.050 yes(1) yes(1)

5 3 inside(1) 53.17135 7/04/1999 apache2 dos 172.016.117.052 172.016.114.050


45



5 4 inside(1) 54.082003 4/08/1999 ps u2r console 172.016.112.050

5 4 inside(7) 54.090101 4/08/1999 phf r2l 206.048.044.050 172.016.114.050

5 4 inside(7) 54.0912 4/08/1999 casesen u2r 172.016.112.149 172.016.112.100 yes yes(24)

5 4 inside(1) 54.102102 4/08/1999 ntfsdos r2l 172.16.112.100 172.16.112.100

5 4 outside(3) 54.103459 4/08/1999 portsweep probe 153.010.008.174 172.016.112.050 yes yes(2)

5 4 outside(2) 54.110416 4/08/1999 ntinfoscan probe 206.048.044.018 172.016.112.100 yes yes(2)

5 4 outside(2) 54.115 4/08/1999 yaga u2r 194.007.248.153 172.016.112.100 yes yes2

5 4 outside(1) 54.115701 4/08/1999 crashiis dos 194.007.248.153 172.016.112.100 yes yes

5 4 outside(8) 54.1206 4/08/1999 httptunnel r2l 172.016.112.050 196.037.075.158

5 4 outside(3) 54.125758 4/08/1999 fdformat u2r 194.027.251.021 172.016.112.050

5 4 outside(1) 54.145832 4/08/1999 satan probe 209.074.060.168 172.016.114.050 yes yes

5 4 outside(1) 54.155338 4/08/1999 teardrop dos 199.227.099.125 172.016.114.050

5 4 outside(4) 54.160341 4/08/1999 sechole u2r 208.240.124.083 172.016.112.100

5 4 inside(1) 54.170132 4/08/1999 resetscan probe 172.016.117.103 172.016.112

5 4 outside(61) 54.175007 4/08/1999 snmpget r2l 194.027.251.021 172.016.000.001 yes yes(6)

5 4 outside(3) 54.183002 4/08/1999 ntinfoscan probe 206.048.044.018 172.016.112.100 yes yes(2)

5 4 outside(1) 54.190707 4/08/1999 ls probe 209.012.013.144 172.016.112.020 yes yes

5 4 outside(2) 54.194108 4/08/1999 warezclient dos 209.030.070.014 172.016.112.050

5 4 outside(2) 54.225131 4/11/1999 arppoison probe 135.013.216.191 172.016.112.100


5 5 inside(3) 55.080105 4/09/1999 portsweep probe 172.016.113.050 206.048.044.050

5 5 outside(1) 55.0805 4/09/1999 xsnoop r2l 194.027.251.021 172.016.113.050 yes yes

5 5 inside(1) 55.081418 4/09/1999 crashiis dos 172.016.117.111 172.016.112.100 yes yes

5 5 inside(7) 55.0825 4/09/1999 illegalsniffer probe 172.016.112.098 172.016.112

5 5 outside(1) 55.084452 4/09/1999 back dos 206.047.098.151 172.016.114.050

5 5 inside(7) 55.0855 4/09/1999 illegalsniffer probe 172.016.112.097 172.016.112

5 5 inside(2) 55.085514 4/09/1999 netcat r2l 172.016.113.204 172.016.112.100

5 5 outside(2) 55.091529 4/09/1999 xterm u2r 194.027.251.021 172.016.112.207 yes yes


5 5 inside(1) 55.1006 4/09/1999 anypw r2l 172.16.112.100 172.16.112.100

5 5 inside(8) 55.10083 4/09/1999 guest r2l 172.016.118.050 172.016.112.050

5 5 inside(1) 55.103001 4/09/1999 perl u2r console 172.016.114.050

5 5 outside(4) 55.1108 4/09/1999 framespoofer data 194.027.251.021 172.016.112.100


5 5 outside(1) 55.123412 4/09/1999 sqlattack r2l 206.186.080.111 172.016.112.194

5 5 outside(2) 55.1244 4/09/1999 yaga u2r 209.001.012.046 172.016.112.100


5 5 outside(3) 55.125811 4/09/1999 guesstelnet r2l 135.013.216.191 172.016.112.100



5 5 inside(3) 55.141732 4/09/1999 eject u2r 206.048.044.050 172.016.112.050 yes yes(190)

5 5 inside(1) 55.163447 4/09/1999 land dos 172.016.113.050 172.016.113.050


5 5 outside(1) 55.172757 4/09/1999 sendmail r2l 152.204.242.193 172.016.114.050

5 5 outside(4) 55.174733 4/09/1999 xterm u2r 202.072.001.077 172.016.113.105 yes yes(2)

5 5 outside(1) 55.183012 4/09/1999 neptune dos 011.021.031.041 172.016.113.050

5 5 inside(1) 55.184715 4/09/1999 perl u2r 172.016.112.194 172.016.114.050

5 5 outside(2) 55.185233 4/09/1999 warezclient dos 206.047.098.151 172.016.112.050


5 5 outside(4) 55.204925 4/09/1999 casesen u2r 204.071.051.016 172.016.112.100

5 5 inside(1) 55.20053 4/09/1999 secret data console 172.016.112.050 yes yes

46


Table 2.18: List of attacks detected by various intrusion detection system

Attacks detected by Snort xsnoop, ftpwrite, imap, ppmacro

netcat, ncftp, netbus, sshtrojan

xclock, guesspop, named, guesstelnet, snmpget

casesen, yaga, xterm, eject, ipsweep,

ntinfoscan, mscan, tcpreset

perl, ffbconfig, satan, portsweep

Attacks detected by Suricata xsnoop, ftpwrite, imap, ppmacro, netcat,

xclock, guesspop, named, guesstelnet, snmpget,

casesen, yaga, xterm, eject, ipsweep, satan

ntinfoscan, mscan, tcpreset, guessftp, sqlattack,

ncftp, netbus, sshtrojan, fdformat, pod

portsweep, ps, sechole, perl, ffbconfig

Attacks detected by PHAD fdformat, teardrop, dosnuke,

portsweep, phf, land, satan,

Attacks detected by ALAD casesen, eject, fdformat, ffbconfig,

phf, ncftp, guessftp, crashiis, ps

sechole, xterm, yaga

Table 2.19: Number of signature raised for DOS(ICMP Flooding) attack by snort

Sr. No. Name of Signature Number of alert raised

1. ICMP Destination Unreachable Port Unreachable 937

2. ICMP Destination Unreachable Host Unreachable 1729

3. ICMP Echo Reply 2380

4. ICMP Fragment Reassembly Time Exceeded 238

5. ICMP PING 280

6. ICMP PING *NIX 35

7. ICMP PING BSDtype 0

8. ICMP Time-To-Live Exceeded in Transit 0

Table 2.20: Attack detected by Signature and Anomaly based IDS against DARPA99Total At-tacks

Snort Suricata PHAD ALAD

Probe 44 13 13 18 7DOS 57 12 15 21 18R2L 57 20 26 2 26U2R/Data 36 9 16 0 9Total 194 54 70 41 60

47


Figure 2.10: % detection rate of snort versus suricata for DOS, probe, R2L and U2Rcategory

Figure 2.11: % detection rate of PHAD versus ALAD for DOS, probe, R2L and U2Rcategory

48

2.6. SUMMARY

Figure-2.10 shows % detection rate of snort and suricata against complete DARPA99dataset. The evaluation of signature-based IDS against DARPA99 dataset shows thateven with the advanced signature-based IDS, it is difficult to achieve the detection rate of50 % or more against such an old dataset. The well-known anomaly intrusion detectionsystem like PHAD AND ALAD which learns the normal value ranges of TCP, UDP andICMP header fields and any deviation from these values is detected as an abnormalityor an intrusion is evaluated against DARPA99. We train these detectors using week-3attack free data from the DARPA99 dataset and test the detectors using week-4 data andweek-5 data. Figure-2.11 shows the detection rate of PHAD versus ALAD for DOS, U2Rand R2L attacks.

2.6 Summary

A review of basics of intrusion detection system has been described in this chapter. Thebasics of deployment of firewall and IDS is discussed. Brief discussion on types of intru-sion detection system and their functional block diagram are presented. A typical DOS(UDP flooding) attack scenario and its detection using signature based IDS is discussed indetail. The empirical formulas of various performance metrics are discussed in detail. Adetailed comparison between three benchmark dataset DARPA99, KDD99 and NSLKDDare described. Finally, the results of individual IDS namely, snort, suricata, PHAD andALAD against DARPA99 is shown. The chapter concludes with the results that indi-cates the lower detection rate of signature and anomaly based intrusion detection whichmotivates for data fusion of heterogenous intrusion detection system.

49

Chapter 3

Estimation of Reliability of IntrusionDetection system

3.1 Introduction

Intrusion detection system is being accepted as a robust security system to detect intru-sions due to internal as well as external intruders. The detection performance of IDS isseverely affected due to high false alarm rate. Intrusion detection system produces floodof alerts when deployed to monitor the network. This poses the concern about the reli-ability of IDS as 99 % of the alerts are false alerts. This chapter discusses the conceptof reliability of intrusion detection system and reliability of fusion result. The chaptersummarizes the method to derive the numerical value of reliability of IDS which is usedto discount the importance of IDS during the fusion process to compromise for unreliableIDS.

3.2 Reliability of Intrusion Detection System

3.2.1 Introduction

Data fusion is used to combine all the alerts from heterogeneous multiple IDS system tohave a better assessment of the situation under consideration. Majority of fusion rulesfound in literature [69], [70], [71], [72] and [73] works on the optimistic approach andconsider the alerts from intrusion detection system as equally reliable during the fusionprocess. Intrusion detection system is build using different models and this different modelprovide the different level of detection capability to IDS. e.g., Signature-based IDS Snortis very good in detecting DOS attacks while Anomaly-based IDS PHAD fails to detectU2R attacks. The efficient utilization of different level of expertise of IDS is accomplishedby incorporating the reliability of IDS which is used to discount the masses of unreliableIDS to avoid decreasing performance of fusion results. Reliability of intrusion detectionsystem is the modern trend in the data fusion for handling conflict and uncertainty ofsources of evidence [74], [75].

3.2.2 Defination

Reliability of intrusion detection system is defined as the degree to which the detectionof intrusion detection system is consistent [76]. The reliability of an intrusion detec-

50

3.2. RELIABILITY OF INTRUSION DETECTION SYSTEM

tion system represents its ability to detect correct intrusion under the given networktraffic. For number of intrusion detection system used in fusion process n ≥ 2 letΘ = {θ1, θ2, θ3, . . . , θn} be the frame of discernment for the fusion problem under con-sideration having n exclusive and exhaustive categories of attack. The sets of all subsetsof Θ is called as power-set of Θ and is denoted by 2Θ.

A basic belief assignment (BBA) is a function m from 2Θ, the power set of Θ to [0,1]which is the amount of mass value from IDS in favor of particular attack category. Thebelief mass assignment will satisfy the property:

m(φ) = 0 and∑A∈2Θ

m(A) = 1 (3.1)

Here, m(φ) is the mass assigned to nullset and m(A) is the mass assigned to attackcategory A. Intrusion detection system will be assigned a numerical reliability value Ri

ε [0,1] of ith IDS. A completely unreliable IDS will have Ri = 0 and completely reliableIDS will have Ri = 1. The value of Ri of IDS is used to discount the masses generatedby IDS for the given frame of discernment Θ. The discounted mass for attack category Aas explained in [77] is given by,

mR(A) = (1−Di)m(A) (3.2)

mR(θ) = Di + (1−Di)m(θ) (3.3)

Here, Di is called discounting factor of ith IDS and is related to reliability Ri as,

Ri = 1−Di (3.4)

3.2.3 Challenges

The efficiency and accuracy of intrusion detection is enhanced by incoporating reliabilityof IDS in data fusion rule. The main challenges to be dealed with while incorporatingreliability of IDS in data fusion rule can be listed as below:

1. Estimating reliability co-efficients: The first main challenge is to estimate the relia-bility values for each intrusion detection system participating in data fusion process.More specifically called as reliability co-efficients.

2. Reliable data fusion rule: The second challenge is to design a robust fusion rule thatcompromises between reliable and unreliable IDS. Reliable alert fusion rule shouldbe capable to discount the masses based on the derived reliability co-efficients.

3. Accurate ground truth detection: The reliability co-efficients adds a second levelof uncertainty after the uncertainty in the alerts generated by intrusion detectionsystem. The data fusion rule should be designed to handle the uncertainty so as tomatch the its inference with the actual ground truth.

51

3.3. RELIABILITY OF DATA FUSION RESULT

Table 3.1: Evaluation parameters of IDS using binary channel modelTerm Equivalent

terms fromIDS

Meaning

FP P(A/-I) There is an alert A, when there is no intrusion -ITP P(A/I) There is an alert A, when there is an intrusion IFN P(-A/I) There is no alert -A, when there is an intrusion ITN P(-A/-I) There is no alert -A, when there is no intrusion -IPPV P(I/A) The chances is that an intrusion I, is present when an IDS output an alarm ANPV P(-I/-A) The chances is that there is no intrusion –I, when an IDS does not output an alarm -A

Figure 3.1: Binary symmetric model of intrusion detection system

3.3 Reliability of Data fusion result

Data fusion with reliability of intrusion detection system ensure that only reliable alertsfrom IDS are fused. Another important aspect of applying reliability in data fusion isthe reliability of fusion result itself. Reliability of fusion is defined as the degree to whichthe detection of data fusion rule is consistent and it matches with ground truth. Thereliability of fusion is given by,

Rids = P (I

A) ∗ I(x; y)

H(x)+ P (

−I−A

) ∗ H(x/y)

H(x)(3.5)

Where P (I

A) is the probability that there is an intrusion when the intrusion detection

system raises an alert. P (−I−A

) is the probability that there is no intrusion when the

intrusion detection system raises no alert. I(x; y) is the mutual information transferbetween input and output of the intrusion detection system. H(x) is the entropy atthe input of the intrusion detection system and H(x/y) is the loss of information betweeninput and output of intrusion detection system. The mutual information transfer betweeninput and output of IDS is given by,

I(x; y) = H(x)−H(x/y) (3.6)

The reliability of fusion result given by equation-3.5 is based on the binary symmetricmodel as shown in figure-3.1. The input X indicates the ground truth about the incomingpacket, where, x=0 indicates non-attack and x=1 indicates attack. The output Y indicatesalert generated by IDS. Thus, y=0 indicates no-alert generated by IDS and y=1 indicatesalert generated by IDS. Table-3.1 shows evaluation parameters based on the proposed

52

3.3. RELIABILITY OF DATA FUSION RESULT

Figure 3.2: Reliability versus False positives

Figure 3.3: Data fusion for known ground truth

53

3.4. ESTIMATION OF RELIABILITY VALUES

Figure 3.4: Data fusion for unknown ground truth

binary symmetric model. Figure-3.2 shows the variation of reliability value with respectto false positive rate. As false positive rate increases which indicates higher number ofalarms results in decreased reliability.

3.4 Estimation of Reliability values

The major challenge in incorporating reliability of IDS into the fusion is problem ofobtaining reliability coefficients. Reliability coefficients basically show a numerical valueof trust in the mass value provided the Intrusion Detection System. The problem offinding reliability can be related to the problem of conflict between masses generatedfrom the alerts of various Intrusion detection systems [78]. The mere existence of conflictbetween the mass provided by Intrusion detection systems indicates the presence of anunreliable IDS which may cause the fusion result to be complementary from reality. Anhighly conflicting IDS will be assigned least reliability and least conflicting IDS will beassigned highest reliability. Figure-3.3 shows the data fusion flow diagram for unknownground truth.

The conflict matrix for four IDS system used to derive the reliability is given by,

K =

0 k12 k13 k14

k21 0 k23 k24

k31 k32 0 k34

k41 k42 k43 0

Where, kij indicates the conflicting mass between ith and jth IDS and is given by,

Another approach of finding reliability is to relate reliability with the true alert rateof IDS. In this approach, it is assumed that the IDS having highest true alert rate andthe lowest false alert rate will be assigned highest reliability and thereby, giving highest

54

3.5. SUMMARY

weightage in the fusion process. While all other IDS is assigned relative reliability valuebased on their true alert rate and false alert rate. The approach of assigning reliabilitybased on true alert rate requires the ground truth knowledge. While the approach ofassigning reliability based on conflict between the IDS can work without knowledge ofthe ground truth. In these work, we have used both the approaches and compared theresult of the proposed rule with existing rules [79]. Figure-3.4 shows the data fusion flowdiagram for known ground truth.

3.5 Summary

The discussion on the reliability of the Intrusion Detection System and the relationshipof reliability values of IDS and the discounting factor is presented. Important challengesrequired to be handled while incorporating reliability into data fusion rule is discussed.The formula for reliability evaluation of data fusion rule is formulated based on the tra-ditional binary symmetric model. The problem of finding reliability coefficients in themulti-IDS system is presented and criteria for deriving the reliability value is formulated.

55

Chapter 4

Attack Detection Performance ofIDS using Data Fusion of multipleheterogeneous systems

4.1 Introduction

A single intrusion detection system based on its characteristics & features may not detectall classes of attack. If several diverse IDS viz., signature-based IDS and anomaly-basedIDS are used, then alerts obtained from each of them can be combined with others usingdata fusion rules. Data fusion of multiple IDS has the ability to overcome the limitationsof individual IDS monitoring the network. Data fusion of multiple intrusion detectionsystems contributes in enhancing the system performance. This chapter introduces thebasics of data fusion techniques and methodology to perform intrusion detection withalert fusion using reliable alert fusion based on data fusion techniques along with experi-mentation against various benchmark datasets.

4.2 Data fusion

4.2.1 Defination

Data fusion is a process of combining information from several sources of evidence in orderto derive the global decision which is more consistent, accurate which is rather beyondthe capacity of individual intrusion detection system [80]. Intrusion detection systemsniffs the network traffic and raises & alerts for any abnormality. Data fusion of multipleIDS system is defined as a system that fuses/combines the alerts of IDS to infer the realintrusion situation in the network under consideration. A typical data fusion process isas shown in figure-4.5. The alerts generated by each IDS is converted to masses whichis used to estimate detected intrusion. The masses are then fused using the fusion rulewhich infers the global decision about the intrusion present.

4.2.2 Advantages of data fusion for intrusion detection

1. Increased Intrusion Detection Coverage:

A single intrusion detection system is capable of detecting some class of attack withhigher accuracy while data fusion based IDS can cover multiple classes of attack.

56

4.2. DATA FUSION

Figure 4.1: Typical data fusion process

2. Reduced ambiguity:

Intrusion detection system using data fusion reduces complexity in the inferences ofIDS and the fused data will have higher clarity.

3. Increased Intrusion Detection Accuracy:

An efficient data fusion rule used to combine alerts from multiple IDS systemsincreases the overall detection accuracy by increasing the belief mass for an observedevent.

4. Robustness:

The data fusion based IDS are able to handle the failure of any other IDS systemsfor detecting intrusions if one of the IDS can detect the attack.

5. Reliability:

The inference obtained from data fusion based IDS is more reliable

4.2.3 Challenges of data fusion for intrusion detection

As multiple heterogenous intrusion detection systems are used to detect intrusion belong-ing to multiple categories, there are challenges in getting correct inference from intrusiondetection system as well as fusion system. Following are the major challenges with datafusion of multiple intrusion detection systems.

1. Conflicting Alerts:

Alerts raised by individual intrusion detection system might be conflicting. The con-flicting alerts when fused generates complementary inferences. Hence, data fusiontechniques must be able to handle conflicting alerts.

2. Alert pre-processing:

57

4.3. DATA FUSION MODEL

Alerts generated from different heterogenous IDS systems must be aligned beforefusing them. Thus, pre-processing the alerts and converting them to masses for datafusion is a challenging task.

3. Reliability of IDS:

Intrusion detection system might be unreliable and generates large numbers of falsealarms which deviates the decision making of data fusion. Thus, data fusion ruleshould be designed to compromise between reliable and unreliable IDS.

4. Reliability of data fusion rule:

Reliability of fusion is defined as degree to which the detection of data fusion ruleis consistent and it matches with ground truth.

4.3 Data fusion model

Data fusion model for fusing alerts from multiple IDS system is as shown in figure-4.2.An IDS sniffs the network traffic and generates positive and negative alerts. If we denotethe hypothesis that attack is present by H and attack not present by -H then accordingto [81] we have,

m(H) =P

P +N + C(4.1)

m(−H) =N

P +N + C(4.2)

m(Hor −H) =C

P +N + C(4.3)

Where, P - positive evidence in favour of hypothesis H, N -Negative evidence opposing thehypothesis H or favouring hypothesis -H and C is constant which is equal to 2 for binaryframe of hypothesis. m(H) is the mass value for hypothesis H. m(Hor−H) is mass valuefor hypothesis H or –H and can be called m(uncertain) i.e, mass value for uncertaintybetween H and -H. Figure-4.3 shows the effect of increase in positive evidence on massvalue of m(H) , m(−H) and m(uncertain). The process of converting IDS alerts to massesis called as alert-to-mass mapping as shown in figure-4.2. The converted masses are thenfused using reliable alert fusion rule after discounting the masses by their reliability.

4.4 Frame of Discernment

A complete (exhaustive) set describing all of the sets in the hypothesis space is calledframe of discernment [82]. Generally, the frame is denoted as Θ. The elements in theframe must be mutually exclusive. If the number of the elements in the set is n, thenthe power set is a set of all subsets of (Θ) will have 2n elements. For example, theframe of discernment for detecting DOS attack will be Θ = {DOS,−DOS, θ}, whereθ = (DOS ∪ −DOS) is uncertainty.

58

4.4. FRAME OF DISCERNMENT

Figure 4.2: Data fusion model

59

4.5. FUSION RULES

Figure 4.3: Positive evidence versus mass value

4.5 Fusion rules

Fusion rule is used to combine masses from n evidence sources and outputs a fused de-cision. For number of evidence sources n ≥ 2 let Θ = {θ1, θ2, θ3, . . . , θn} be the frame ofdiscernment for the fusion problem under consideration having n exclusive and exhaustivehypothesis . The sets of all subsets of Θ is called as power-set of Θ and is denoted by2Θ. The power-set is usually closed under unions, intersections and complements and isdefined as a Boolean algebra. The fusion rules such Dempster shafer Rule in [83], Yager’sRule in [84] and Smet’s TBM Rule in [85] are rules which are closed under Union operator.However, this rules doesn’t contain intersections of element of Θ.

A basic belief assignment (BBA) is a function m from 2Θ, the power set of Θ to [0,1].The belief mass assignment will satisfy the property:

m(φ) = 0 and∑A∈2Θ

m(A) = 1 (4.4)

Here, m(φ) is the mass assigned to nullset. Let, m1(B) and m2(C) are two independentmasses from two sources of evidence. Then the combined mass m(A) is obtained bycombining m1(B) and m2(B) through conjunctive rule,

m(A) =∑

B,C∈2Θ

B∩C=A

m1(B)m2(C) (4.5)

m(φ) =∑

B,C∈2Θ

B∩C=φ

m1(B)m2(C) (4.6)

Disjunctive rule of combination is defined for union of elements of Θ. If m1(B) and m2(C)are two independent masses from two sources of evidence then the combined mass m(A)obtained by combining m1(B) and m2(C) through the rule,

60

4.5. FUSION RULES

m(A) =∑

B,C∈2Θ

B∪C=A

m1(B)m2(C) (4.7)

The disjunctive rule is preferable when some sources of evidence are unreliable but wedon’t know which one is unreliable.

The normalized version of conjunctive rule was proposed by Arthur Dempster andGlenn shafer in [83] and is known as Dempster-Shafer Rule. In DS rule, the fused massesm(A) is obtained from two independent sources of evidence m1(B) and m2(C) usingfollowing equation:

m(A) =

∑B,C∈2Θ

B∩C=A

m1(B)m2(C)

1−∑

B,C∈2Θ

B∩C=φ

m1(B)m2(C)(4.8)

m(φ) = 0 (4.9)

The above rule is defined for fusing two independent masses from sources of evidence.However, the same can be extended for n independent and equally reliable sources.

Yager’s rule is the modification of dempster-shafer rule to resolve the failure of DSrule under high conflict between evidence of sources. Yager rule assigns the conflictingmass to the weight of ignorance. Yager’s rule is given by,

m12(A) =∑

B,C∈2Θ

B∩C=A

m1(B)m2(C) (4.10)

m12(θ) = m(θ) +∑

B,C∈2Θ

B∩C=φ

m1(B)m2(C) (4.11)

Dubois and prade rule of combination by D. dubois and H. prade [86] is applicablewhen out of two sources, one source is unreliable and these unreliability is because of highconflict between the evidence they provide. DP rule assigns the value of conflict betweentwo sources under union operator to the total mass value.

m(A) =∑

B,C∈2Θ

B∪C=AB∩C=φ

m1(B)m2(C) +∑

B,C∈2Θ

B∩C=AB∩C 6=φ

m1(B)m2(C) (4.12)

Proportional conflict rule (PCR) proposed by Smarandache [87] is new fusion ruledesigned to handle conflict between evidences of sources. The PCR rule is based onfollowing three principles:

1. Using conjunctive mass between the sources of evidence by conjunctive rule as perequation-4.5

2. Computing the conflicting mass between evidence of sources as per equation-4.6

61

4.6. REQUIREMENTS AND LIMITATIONS OF DATA FUSION RULES

3. Redistributing the conflicting mass to non-empty sets involved in model

mPCR1(A) = m12(A) +c12(X)

d12

k12 (4.13)

Where, m12(A) is the conjunctive mass for category A between evidence source-1 &evidence source-2. c12(A) =m1(A) + m2(A) and d12(A) = 2 which is sum of total massform source-1 and source-2.

Smet’s rule [88], Weighted average operator [89], Adaptive conflict rule [90] and Murfyrule [91] are modifications and extension of dempster-shafer framework.

4.6 Requirements and limitations of data fusion rules

Ciza Thomas in [92] suggests that the timely detection of intrusion in multiple IDS frame-work requires an efficient fusion rule that effectively combines evidence from multiple IDSand outputs a decision that accurately matches with existing ground truth. Following arethe basic requirements for fusion rule as mapped out by authors:

1. Fusion rule should incorporate the reliability of intrusion detection system for theevidence it provide about the presence of intrusion.

2. The rule should be able to compromise between the reliable IDS and unreliable IDS.

3. If all the IDS involved in fusion are unreliable then fusion rule should discard theavailable IDS and then new sets of IDS has to be found for concerned fusion problem.

According to Katar in [93] the quality of decision from a fusion operator varies ap-plication to application. In present work the goal is to combine alerts from multiple IDSsystems, so the trustworthiness of alerts is a matter of concern. The existing fusion rulesdiscussed here have following limitations:

1. None of the existing rule incorporates the reliability of source whose evidence areto be fused. Thus, there is no real time criteria which assign a numerical value ofreliability to the evidence given by the source.

2. The existing fusion rule considered all the sources of evidence to be equally reliable.However, in fusion framework there might be some unreliable sources which misleadsto the fusion rule to give wrong decision.

3. One major drawback related to the fusion rule as suggested by Goodman in [94]is that in an environment consisting of many hypotheses and many sources, it isdifficult to decide whether to accept or reject the result of fusion rule. If sourcesof evidences are highly conflicting, the DS rule completely fails. If analyst blindlybelieves on the result then the decision can be misleading or complementary.

62

4.7. RELIABLE ALERT FUSION RULE

Figure 4.4: Complementary confusion matrix

4.7 Reliable Alert Fusion rule

The reliable alert fusion (RAF) rule is based on DS framework [83]. Here, m1(B) andm2(C) are two independent masses from two sources of evidence. Then the combinedmass m(A) is obtained by combining m1(B)and m2(C) through the rule,

m(A) = CRF (A)∑

B,C∈2Θ

B∩C=A

m1(B)m2(C) +DRF (A)∑

B,C∈2Θ

B∪C=A

m1(B)m2(C) (4.14)

Where,

CRF (A) =∏n

Rn (4.15)

DRF (A) = (1−∏n

Rn)(1−∏n

(1−Rn)) (4.16)

Here, Rn is the reliability value of nth source of evidence. CRF (A) is conjunctive reliabilityvalue about A and DRF (A) is disjunctive reliability value about A. CRF and DRF valueacts as a weighting factor to compromise between conjunctive mass and disjunctive mass.The proposed rule effectively incorporates reliability of each source of evidence. If all thesources of evidence are reliable we get CRF (A) = 1 and DRF (A) = 0, so the proposedrule converge to conjunctive rule. If all the sources of evidence are unreliable we getCRF (A) = 0 and DRF (A) = 0, so the proposed rule does not give any solution and newsources of evidence has to be found. If some sources are reliable and some are unreliable weget CRF (A) = 0.5 and DRF (A) = 0.5, so the proposed rule will shows the compromisebetween conjunctive mass and disjunctive mass.

4.8 Reliable Alert Fusion of Multiple Intrusion De-

tection system

To investigate the performance of reliable alert fusion rule based multiple IDS system interms of accuracy and false alarm rate, four simple but representative study cases with

63

4.8. RELIABLE ALERT FUSION OF MULTIPLE INTRUSION DETECTIONSYSTEM

Table 4.1: Performance of data fusion of two IDS systems under Complementary Confu-sion matrix

Conjuntive DS Yager Smet WAO PCR ACR Murfy RAFrule rule rule rule rule rule rule rule rule

TPR 0.72 0.79 0.72 0.72 0.77 0.77 0.79 0.68 0.7393FPR 0.34 0.41 0.34 0.34 0.40 0.40 0.41 0.30 0.1082PPV 0.68 0.66 0.68 0.68 0.66 0.66 0.66 0.71 0.8883NPV 0.70 0.74 0.70 0.70 0.73 0.73 0.74 0.67 0.7462ACCURACY 0.69 0.69 0.69 0.69 0.69 0.69 0.69 0.69 0.8008

two intrusion detection system has been investigated as follows:

1. Signature & Anomaly IDS having complementary behaviour

2. Signature & Anomaly IDS having complementary conflicting behaviour

3. Signature & Anomaly IDS having supplementary behaviour

4. Signature & Anomaly IDS having supplementary conflicting behaviour

4.8.1 Complementary behavior

Complementary behavior means single IDS shows complementary alerts for detectingdifferent classes of attacks. The confusion matrix of two IDS systems for complementarybehavior is as shown in figure-4.4.

Two classes attack and non-attack are considered in the confusion matrix. The perfor-mance is dependent upon correctly identifying the ground truth. IDS1 correctly identifies8 out of 10 incoming packets for non-attack category. While IDS2 correctly identifies 8 outof 10 incoming packets for the attack category. Hence, the behavior of both the detectionsystems is complementary to each other. Table-4.1 shows the performance of conjunctiverule, Dempster-shafer rule, yager’s rule, smet’s rule, weighted average rule, proportionalconflict rule, Murfy’s statistical average rule has been compared against reliable fusionrule under complementary behavior. We observe a reduction of 73.5 % in false alarm ratewith RAF rule and 12 % higher accuracy compared to other data fusion rules.

4.8.2 Complementary Conflicting behavior

Complementary conflicting means single IDS gives complementary alerts and over esti-mated weights for certain class of attacks. Complementary conflicting alerts are difficultto fuse as the inferences from IDS systems are contradictory. The confusion matrix oftwo IDS systems for complementary conflicting behavior is as shown in figure-4.5. IDS1

correctly identifies 10 out of 10 incoming packets for non-attack category and 4 out 10incoming packets for attack category. While IDS2 correctly identifies 10 out of 10 incom-ing packets for attack category and 4 out of 10 packets for non-attack category. Hence,the behavior of both the detection systems is complementary conflicting to each other.Table-4.2 shows the performance of conjunctive rule, Dempster-shafer rule, yager’s rule,smet’s rule, weighted average rule, proportional conflict rule, Murfy’s statistical averagerule has been compared against reliable fusion rule under complementary conflicting be-havior. We observe a reduction of 75 % in false alarm rate with RAF rule and 13 % higheraccuracy compared to other data fusion rules.

64

4.8. RELIABLE ALERT FUSION OF MULTIPLE INTRUSION DETECTIONSYSTEM

Figure 4.5: Complementary conflicting confusion matrix

Table 4.2: Performance of data fusion of two IDS systems under Complementary Con-flicting Confusion matrix



Figure 4.6: Supplementary confusion matrix

65


Table 4.3: Performance of data fusion of two IDS systems under supplementary confusionmatrix



Table 4.4: Performance of data fusion of two IDS systems under supplementary conflictingconfusion matrix


TPR 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00FPR 0 0.03 0.02 0.02 0.02 0.02 0.04 0.02 0.009PPV 1.00 0.97 0.98 0.98 0.98 0.98 0.96 0.98 0.99NPV 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00ACCURACY 1.00 0.98 0.99 0.99 0.99 0.99 0.98 0.99 0.99

4.8.3 Supplementary behavior

Supplementary behavior means both the IDS system shows similar behavior and support-ing same class of attack. The confusion matrix of two IDS systems having supplementarybehavior is as shown in figure-4.6. Table-4.2 shows the performance of conjunctive rule,Dempster-shafer rule, yager’s rule, smet’s rule, weighted average rule, proportional con-flict rule, Murfy’s statistical average rule has been compared against reliable fusion ruleunder complementary conflicting behavior. We observe a reduction of 85 % in false alarmrate with RAF rule and 10 % higher accuracy compared to other data fusion rules.

4.8.4 Supplementary Conflicting confusion matrix

Table-4.4 shows the performance of conjunctive rule, dempster-shafer rule, yager’s rule,smet’s rule, weighted average rule, proportional conflict rule, Murfy’s statistical averagerule and compared against reliable fusion rule under supplementary conflicting behavior.We observe 99 % to 100 % accuracy for all data fusion rules with 100 % true positive rate.The conjunctive rule seems to perform better along with RAF as both the IDS systemsare supplementary behavior.

4.9 Results & Discussion

4.9.1 Experiments against DARPA’99 Dataset

In DARPA99 Experiment, we preprocessed the dataset and total 5766 packets were loadedon to the network. In the first experiment as per Table-4.5, we use the TPR of IDS asreliability criteria. Table-4.6 and Table-4.7 shows the performance comparison of singleIDS against the fusion using DS and fusion using the proposed rule. The observed resultsshow an efficient reduction in the number of false positives and a significant increase in

66


Figure 4.7: Supplementary Conflicting confusion matrix

Table 4.5: DARPA99 Experiment descriptionCharacteristic Name

Dataset Name DARPA 1999Frame of Discernment (Θ) [probe, -probe, θ]Reliability criteria TPR of IDSNo. of packets processed 5766

the accuracy of IDS. Table-4.8 shows the description of the second experiment performedusing DARPA99 where reliability is derived by calculating the amount of conflict betweenthe IDS systems. Table-4.9 and Table-4.10 shows the comparison results of single IDSwith fusion using DS and fusion using proposed rule by deriving reliability value usingconflict between evidences.

4.9.2 Experiments against KDD’99 Dataset

In KDD99 Experiment, we downloaded KDD99 cup dataset, preprocessed the dataset andtotal 3456 packets containing attack and non-attack packet was loaded onto the networkand replayed using tcpreplay tool. The frame of discernment is selected to detect smurfattack. The total 1944 smurf attacks were present in the processed dataset. The KDD99Experiment description is shown in Table-4.11. Table-4.12 and Table-4.13 shows theperformance of the proposed rule along with the DS rule. Table-4.14 gives the description

Table 4.6: Comparison of single IDS with fusion using DS and fusion using proposed ruleby deriving reliability value using TPR against DARPA99 dataset

Snort Suricata PHAD NETAD DS ProposedRule Rule

TP 127 124 144 118 131 143TN 2715 2721 2730 2681 2644 5324FP 2784 2778 2769 2818 2855 32FN 140 143 123 149 136 267

67


Table 4.7: Comparison of single IDS with fusion using DS and fusion using proposed ruleby deriving reliability value using TPR against DARPA99 dataset


TPR 0.4757 0.4644 0.5393 0.4419 0.4906 0.5356FPR 0.5063 0.5052 0.5035 0.5125 0.5192 0.060PPV 0.0437 0.0429 0.0494 0.0402 0.0439 0.8171NPV 0.9510 0.9501 0.9569 0.9473 0.9511 0.9522ACC 0.4929 0.4934 0.4984 0.4854 0.4813 0.9481

Table 4.8: DARPA 99 Experiment description

Characteristic NameDataset Name DARPA 1999Frame of Discernment (Θ) [probe, -probe, θ]Reliability criteria Conflict between IDSNo. of packets processed 5766

Table 4.9: Comparison of single IDS with fusion using DS and fusion using proposed ruleby deriving reliability value using conflict between evidences against DARPA99 dataset


TP 128 107 129 129 119 136TN 2751 2718 2689 2768 2693 5130FP 2748 2781 2810 2731 2806 244FN 139 160 138 138 148 256

‘

Table 4.10: Comparison of single IDS with fusion using DS and fusion using proposed ruleby deriving reliability value using conflict between evidences against DARPA99 dataset



Table 4.11: KDD99 Experiment descriptionCharacteristic Name

Dataset Name KDD 1999Frame of Discernment (Θ) [smurf, -smurf, θ]Reliability criteria TPR of IDSNo. of packets processed 3456

68


Table 4.12: Comparison of single IDS with fusion using DS and fusion using proposedrule by deriving reliability value using TPR against KDD99 dataset


TP 997 967 1015 960 1008 1033TN 742 741 730 758 723 1501FN 947 977 929 984 936 911FP 770 771 782 754 789 11

Table 4.13: Comparison of single IDS with fusion using DS and fusion using proposedrule by deriving reliability value using TPR against KDD99 dataset



Table 4.14: KDD99 Experiment description

Characteristic NameDataset Name KDD 1999Frame of Discernment (Θ) [smurf, -smurf, θ]Reliability criteria Conflict between IDSNo. of packets processed 3456

Table 4.15: Comparison of single IDS with fusion using DS and fusion using proposedrule by deriving reliability value using conflict between evidences against KDD99 dataset


TP 916 1015 982 910 969 1015TN 788 763 769 762 765 1490FN 1028 928 962 1034 975 929FP 724 749 743 750 747 22

‘

Table 4.16: Comparison of single IDS with fusion using DS and fusion using proposedrule by deriving reliability value using conflict between evidences against KDD99 dataset



69


Figure 4.8: Comparison of proposed rule with DS rule against NSL-KDD for detectingR2L attack

Table 4.17: NSL-KDD R2L Attack Experiment description

Characteristic NameDataset Name NSL-KDDFrame of Discernment (Θ) [R2L, -R2L, θ]Reliability criteria Conflict between IDSNo. of packets processed 5000No. of R2L attacks present 52

about the KDD99 experiment by using conflict between the IDS as a reliability criterion.Table-4.15 and Table-4.16 shows the results obtained under this experiment.

4.9.3 Experiments against NSL-KDD Dataset

In order to have the evaluation of intrusion detection more consistent and accurate, areduced dataset called NSL-KDD is distributed by [66]. NSL-KDD has 21 labeled attacksin training set and 37 attacks in testing dataset as described in section-2.4.2. To performthe evaluation of our proposed technique against NSL-KDD, We utilized two IDS systemsnamely, snort and PHAD. In the work by Ciza Thomas [92] it is shown that PHADhas performed badly during detection of R2L and U2R attack. While, snort performswell against DOS and R2L categories of attack. In pre-processing of NSL-KDD usingWireshark tool [95], it is found that DOS and R2L have very low variations and hence itis difficult to detect such attacks using traditional detection method.

Table-4.17 shows the description of NSL-KDD experiment for detecting R2L attackconsidering conflict as reliability criteria. Table-4.18 shows the description of NSL-KDDexperiment for detecting DOS attack considering conflict as reliability criteria. Figure-4.8shows the results of the proposed rule with DS rule against NSL-KDD for detecting R2Lattack and Figure-4.9 shows the results of the proposed rule with DS rule against NSL-KDD for detecting DOS attack. It can be observed from the result that the proposed rulegives the highest accuracy and least false positive rate compared to individual IDS andDS rule.

70


Figure 4.9: Comparison of proposed rule with DS rule against NSL-KDD for detectingDOS attack

Table 4.18: NSL-KDD DOS Attack Experiment description

Characteristic NameDataset Name NSL-KDDFrame of Discernment (Θ) [DOS, -DOS, θ]Reliability criteria Conflict between IDSNo. of packets processed 5000No. of R2L attacks present 944

Table 4.19: UDP flooding attack experiment

Characteristic NameDataset Name Real datasetFrame of Discernment (Θ) [DOS(UDP), -DOS(UDP), θ]No. of packets processed 7051Reliability Factor conflict between IDSFusion Rule RAF ruleNo. of packets processed 7051

Table 4.20: Alerts and mass value against UDP flooding attack

Snort Suricata PHAD ALADTotal Alerts 115 121 170 120P 10 12 5 8N 3 2 10 12Mass value 0.67 0.75 0.294 0.36Decision Attack Attack Non-Attack Non-Attack

71


Figure 4.10: UDP(flooding) Packets captured by wireshark

Figure 4.11: UDP(flooding) Packets data

Table 4.21: Fusion result against UDP flooding attackType of fusion Mass value Decision

Snort ∪ Suricata 0.8820 AttackPHAD ∪ ALAD 0.28 Non-AttackPHAD ∪ Snort 0.40 Non-AttackFusion with DS rule 0.46 Non-AttackFusion with RAF rule 0.7477 Attack

72

4.10. SUMMARY

4.9.4 Experiments against Real time Attack-UDP(flooding)

UDP flooding is a technique that execute an attack by generating large number of UDPpackets to a specific port. This cause the network of victim flooded by large number ofpackets which depletes the network bandwidth and blocks the desired services for legiti-mate user. A typical diagram is as shown in figure-2.4. The objective of this experiment isto evaluate the performance of our proposed data fusion method against real time attackand compare its detection capability and reliability against individual IDS snort, suricata,PHAD and ALAD.

In this experiment we created a network of three machine as shown in figure-2.9. Theattacker machine try to flood the victim machine by sending large numbers of packets toa single port.

We have use hping3 [96] to launch the attack. hping3 is an open source packet gen-erator with the capability of generating TCP, UDP and ICMP packets and sending themon specified IP address with specific port. Following instruction is used to generate UDPflooding attack:

sudo hping3 --udp -p 7051 --destport 80 --flood 235.127.67.235

Table-4.19 shows the description of real time attack experiment. We use RAF rule withconflict between IDS as a reliability factor. Figure-4.10 shows the UDP(flooding) packetscaptured by wireshark. The dataset consists of totally 7051 as given by an argument inhping3. Figure-4.11 shows UDP packet details, It is obseved that the packet has XXXXXXXX as data which is another indication of attack. The victim machine is having snort,suricata, PHAD & ALAD IDS. Table-4.20 shows the results of alerts and mass value ofindividual IDS. The value of P is calculated from number of alerts raised against thesignature. If number of alerts for a particular signature is less than 10, the value of P willbe equal to 1. If number of alerts raised against the signature is greater then 10, then Pwill be calculated using following equation [81].

P =NOA

50+ 2 (4.17)

Number of signatures that are not raised is taken as value of N. The mass value willbe calculated using equations 4.13-4.3. It is evident from table-4.20, PHAD and ALADfailed to detect the attack. Table-4.21 shows the result obtained by union of variousintrusion detection system and reliable alert fusion rule. It shows that the failure ofPHAD and ALAD in detecting attack reduces the mass value of DS fusion rule whichfails to detect attack. The RAF rule discounts the belief of PHAD and ALAD as theirbeliefs are conflicting with the beliefs of snort and suricata and thereby accurately detectsthe UDP (flooding) attack. RAF rule effectively handles the unreliable IDS.

4.10 Summary

The basic definition of data fusion rule along with advantages & challenges of data fusionfor intrusion detection is described. Data fusion model for fusing the alerts from multipleheterogenous IDS system is presented. The formulation for converting alerts-to-massfor the selected hypothesis is presented. Dempster-Shafer rule and various modified DSrules such as yager’s rule, Dubois & prade rule, proportional conflict rule are described.

73

4.10. SUMMARY

The basic requirements & limitations of data fusion rules and proposed reliable alertfusion rule is described and demonstrated against two IDS system under complementary,complementary conflicting, supplementary & supplementary conflicting behaviour. Theperformance of a reliable alert fusion rule is demonstrated against benchmark datasetssuch as DARPA99, KDD99 and NSL-KDD. The alert fusion method against real-timenetwork attack UDP (flooding) is evaluated and performance improvement in intrusiondetection is demonstrated.

74

Chapter 5

Summary of Research Work &Recommendation for Future Work

5.1 Summary of Research Work

The demand for internet and secure network is increasing and intrusion detection systemis a valuable alternative for protecting the network against internal and external vulnera-bilities. The present research work contributes to enhancing the performance of intrusiondetection system by reducing the false alarms and increasing the detection coverage. Themain objective of the work is to study the limitations of signature-based and anomaly-based intrusion detection system and to fuse the alerts generated by these systems usingdata fusion techniques to have a reliable & accurate diagnosis of network attacks. Theresearch in network security field is prominent and focussed toward improving the per-formance of IDS system in terms of false positives, accuracy, detection rate, etc. Theliterature also shows studies and experiments on intrusion detection system where thenumber of false alarms and reliability of IDS is a major concern. However, there is lack ofstudy done for enhancing the performance of intrusion detection system using data fusionenvironment by incorporating reliability of IDS.

To address the performance issues in regard to intrusion detection system, we haveinvestigated the following listed points:

• The performance of intrusion detection system is improved by using a reliable alertfusion rule and various experiments against benchmark dataset & real-time attack isdemonstrated. We have developed a new mathematical expression for effective androbust fusion of alerts/mass form various intrusion detection systems with the ca-pability to compromise between reliable & unreliable IDS systems by encompassingthe reliability of IDS.

• A mathematical expression for the reliability of intrusion detection system underknown and unknown ground truth has been derived using two different criteria. Thecomparison between reliable alert fusion rule and existing rules like Dempster Shafer,Yager rule, Conjunctive rule, Smet rule, Weighted Average Operator, ProportionalConflict rule, Murphy’s Statistical Average rule is presented and their performanceunder complementary, complementary conflicting, supplementary & supplementaryconflicting is demonstrated.

75

5.2. RECOMMENDATIONS FOR FUTURE WORK

The observed results show that using a reliable alert fusion of multiple IDS system,significant improvement is found in the number of false alarms and the reliability ofthe system.

• We have performed the experiments against signature-based IDS namely, Snort andSuricata & Anomaly-based IDS namely, PHAD and ALAD. It is concluded that theperformance of individual IDS against the oldest dataset is less than 50 % for anycategory of attack. Hence, an alternative approach for enhancing the performanceof IDS is required before deploying it in the real computer network.

5.2 Recommendations for Future Work

• Recommendation for future research is to develop and design an intrusion detectionsystem based on theoretical analysis developed in this research in order to test, verifyand demonstrate the concepts presented and developed in this thesis in real-worldnetwork intrusion.

• In present work reliability fusion rule is used to enhance the performance in termsof accuracy & false alarm rate. Further to continue research different reliabilityevaluation methods can be applied to determine the authenticity of alerts generatedby IDS before fusing them.

• Data fusion techniques based on the Dempster-Shafer framework and their exten-sions are used to fuse alerts from multiple IDS system. Other fusion rules such asDSmT, Consensus operator can be used and their performance can be measured.

• Data fusion along with the proposed reliability method can be used to reduce thenumber of false alarms and improve detection accuracy. Other techniques such asdata correlation can also be investigated in presence of multiple-IDS fusion frame-work.

• The accuracy of inference obtained from data fusion of multiple IDS can be furtherimproved by understanding the network scenario and by tuning the performanceof intrusion detection system for detection of a particular class of intrusion. Thesignatures for misuse based IDS should be updated continuously to detect moderndays intrusion. Anomaly detectors should be trained using some novel techniquessuch as K-Nearest Neighbor (KNN) [97], Bayesian network [98], Support vectormachine [99], clustering techniques [100] for creating the normal profile as theirperformance is dependent upon how well the normal profile is created.

• In the present thesis, we have demonstrated the performance of Snort, Suricata,PHAD and ALAD against DARPA99, KDD99 and NSL-KDD along with real-timeUDP (flooding) attack. Research can be done to design and distribute new datasetfor evaluation of intrusion detection system for providing a future direction of re-search in network security. Further research can use different of IDS system toanalyze the performance of proposed reliable alert fusion rule.

76

Bibliography

[1] Usage and population statistics. (2017) http://www.internetworldstats.com.

[2] A. S. Desai and D. Gaikwad, “Real time hybrid intrusion detection system usingsignature matching algorithm and fuzzy-ga,” in Advances in Electronics, Communi-cation and Computer Technology (ICAECCT), 2016 IEEE International Conferenceon. IEEE, 2016, pp. 291–294.

[3] E. A. Watkins, F. Roesner, S. McGregor, B. Lowens, K. Caine, and M. N. Al-Ameen, “Sensemaking and storytelling: Network security strategies for collabora-tive groups,” in Collaboration Technologies and Systems (CTS), 2016 InternationalConference on. IEEE, 2016, pp. 622–623.

[4] A. Mattar and M. Z. Reformat, “Detecting anomalous network traffic using evidencetheory,” in Advances in Fuzzy Logic and Technology 2017. Springer, 2017, pp. 493–504.

[5] I. Ivanovic and S. Gajin, “Recommendations for network traffic analysis using thenetflow protocol,” 2016.

[6] T. Mavroeidakos, A. Michalas, and D. D. Vergados, “Security architecture based ondefense in depth for cloud computing environment,” in Computer CommunicationsWorkshops (INFOCOM WKSHPS), 2016 IEEE Conference on. IEEE, 2016, pp.334–339.

[7] J. Amudhavel, V. Brindha, B. Anantharaj, P. Karthikeyan, B. Bhuvaneswari,M. Vasanthi, D. Nivetha, and D. Vinodha, “A survey on intrusion detection system:State of the art review,” Indian Journal of Science and Technology, vol. 9, no. 11,2016.

[8] I. El Mir, A. Haqiq, and D. S. Kim, “Performance analysis and security based onintrusion detection and prevention systems in cloud data centers,” in InternationalConference on Hybrid Intelligent Systems. Springer, 2016, pp. 456–465.

[9] D. E. Denning, “An intrusion-detection model,” IEEE Transactions on softwareengineering, no. 2, pp. 222–232, 1987.

[10] S. E. Smaha, “Haystack: An intrusion detection system,” in Aerospace ComputerSecurity Applications Conference, 1988., Fourth. IEEE, 1988, pp. 37–44.

[11] S. R. Snapp, J. Brentano, G. V. Dias, T. L. Goan, L. T. Heberlein, C.-L. Ho, K. N.Levitt, B. Mukherjee, S. E. Smaha, T. Grance et al., “Dids (distributed intrusiondetection system)-motivation, architecture, and an early prototype,” in Proceedings

77

BIBLIOGRAPHY

of the 14th national computer security conference, vol. 1. Washington, DC, 1991,pp. 167–176.

[12] J. Hochberg, K. Jackson, C. Stallings, J. McClary, D. DuBois, and J. Ford, “Nadir:An automated system for detecting network intrusion and misuse,” Computers &Security, vol. 12, no. 3, pp. 235–248, 1993.

[13] V. Paxson, “Bro: a system for detecting network intruders in real-time,” Computernetworks, vol. 31, no. 23, pp. 2435–2463, 1999.

[14] E. Amoroso, “Intrusion detection: an introduction to internet surveillance, correla-tion, trace back, traps, and response,” Intrusion. Net Book, 1999.

[15] S. Northcutt, Snort: IDS and IPS toolkit. Syngress Press, 2007.

[16] A. K. Ghosh, A. Schwartzbard, and M. Schatz, “Learning program behavior profilesfor intrusion detection.” in Workshop on Intrusion Detection and Network Moni-toring, vol. 51462, 1999, pp. 1–13.

[17] R. P. Lippmann, D. J. Fried, I. Graf, J. W. Haines, K. R. Kendall, D. McClung,D. Weber, S. E. Webster, D. Wyschogrod, R. K. Cunningham et al., “Evaluatingintrusion detection systems: The 1998 darpa off-line intrusion detection evaluation,”in DARPA Information Survivability Conference and Exposition, 2000. DISCEX’00.Proceedings, vol. 2. IEEE, 2000, pp. 12–26.

[18] M. V. Mahoney and P. K. Chan, “Phad: Packet header anomaly detection foridentifying hostile network traffic,” Tech. Rep., 2001.

[19] A. K. Ghosh, J. Wanken, and F. Charron, “Detecting anomalous and unknownintrusions against programs,” in Computer Security Applications Conference, 1998.Proceedings. 14th Annual. IEEE, 1998, pp. 259–267.

[20] M. Crosbie and E. H. Spafford, “Defending a computer system using autonomousagents,” 1995.

[21] E. Lundin and E. Jonsson, “Anomaly-based intrusion detection: privacy concernsand other problems,” Computer networks, vol. 34, no. 4, pp. 623–640, 2000.

[22] T. Lane and C. E. Brodley, “An application of machine learning to anomaly detec-tion,” in Proceedings of the 20th National Information Systems Security Conference,vol. 377. Baltimore, USA, 1997, pp. 366–380.

[23] F. Cuppens and A. Miege, “Alert correlation in a cooperative intrusion detectionframework,” in Security and privacy, 2002. proceedings. 2002 ieee symposium on.IEEE, 2002, pp. 202–215.

[24] A. Siraj, R. B. Vaughn, and S. M. Bridges, “Intrusion sensor data fusion in anintelligent intrusion detection system architecture,” in System Sciences, 2004. Pro-ceedings of the 37th Annual Hawaii International Conference on. IEEE, 2004, pp.10–pp.

[25] D. Yu and D. Frincke, “Alert confidence fusion in intrusion detection systems withextended dempster-shafer theory,” in Proceedings of the 43rd annual Southeast re-gional conference-Volume 2. ACM, 2005, pp. 142–147.

78

BIBLIOGRAPHY

[26] Q. Chen and U. Aickelin, “Anomaly detection using the dempster-shafer method,”in conference on data mining, DMIN’06. Citeseer, 2006.

[27] C. Thomas and N. Balakrishnan, “Modified evidence theory for performance en-hancement of intrusion detection systems,” in Information Fusion, 2008 11th Inter-national Conference on. IEEE, 2008, pp. 1–8.

[28] A. Valdes and K. Skinner, “Probabilistic alert correlation,” in Recent advances inintrusion detection. Springer, 2001, pp. 54–68.

[29] F. Valeur, G. Vigna, C. Kruegel, and R. A. Kemmerer, “Comprehensive approach tointrusion detection alert correlation,” IEEE Transactions on dependable and securecomputing, vol. 1, no. 3, pp. 146–169, 2004.

[30] S. Mathew, C. Shah, and S. Upadhyaya, “An alert fusion framework for situationawareness of coordinated multistage attacks,” in Information Assurance, 2005. Pro-ceedings. Third IEEE International Workshop on. IEEE, 2005, pp. 95–104.

[31] S. Wen, Y. Xiang, and W. Zhou, “A lightweight intrusion alert fusion system,”in High Performance Computing and Communications (HPCC), 2010 12th IEEEInternational Conference on. IEEE, 2010, pp. 695–700.

[32] G. J. Victor, “Intrusion detection systems false positives,” 2013.

[33] H. Wu, S. Schwab, and R. L. Peckham, “Signature based network intrusion detectionsystem and method,” Sep. 9 2008, uS Patent 7,424,744.

[34] P. Garcia-Teodoro, J. Diaz-Verdejo, G. Macia-Fernandez, and E. Vazquez,“Anomaly-based network intrusion detection: Techniques, systems and challenges,”computers & security, vol. 28, no. 1, pp. 18–28, 2009.

[35] Y. Liu, X. Wang, and K. Liu, “Network anomaly detection system with optimizedds evidence theory,” The Scientific World Journal, 2014.

[36] F. J. Aparicio-Navarro, K. G. Kyriakopoulos, and D. J. Parish, “A multi-layerdata fusion system for wi-fi attack detection using automatic belief assignment,” inInternet Security (WorldCIS), 2012 World Congress on. IEEE, 2012, pp. 45–50.

[37] W. Jiang, M. Zhuang, and C. Xie, “A reliability-based method to sensor data fu-sion,” Sensors, vol. 17, no. 7, p. 1575, 2017.

[38] S. Anwar, J. Mohamad Zain, M. F. Zolkipli, Z. Inayat, S. Khan, B. Anthony, andV. Chang, “From intrusion detection to an intrusion response system: Fundamen-tals, requirements, and future directions,” Algorithms, vol. 10, no. 2, p. 39, 2017.

[39] M. Ammar, M. Rizk, A. Abdel-Hamid, and A. K. Aboul-Seoud, “A framework forsecurity enhancement in sdn-based datacenters,” in New Technologies, Mobility andSecurity (NTMS), 2016 8th IFIP International Conference on. IEEE, 2016, pp.1–4.

[40] S. S. Vichare, “Comparative study on firewall and intrusion detection system,”International Journal of Engineering Science, vol. 13716, 2017.

79

BIBLIOGRAPHY

[41] J. P. Anderson et al., “Computer security threat monitoring and surveillance,” Tech-nical report, James P. Anderson Company, Fort Washington, Pennsylvania, Tech.Rep., 1980.

[42] A. A. Ahmed, “Investigation approach for network attack intention recognition,”International Journal of Digital Crime and Forensics (IJDCF), vol. 9, no. 1, pp.17–38, 2017.

[43] A. Sarmah, “Intrusion detection systems; definition, need and challenges,” 2001.

[44] J. McHugh, A. Christie, and J. Allen, “Defending yourself: The role of intrusiondetection systems,” IEEE software, vol. 17, no. 5, pp. 42–51, 2000.

[45] M. Roesch et al., “Snort: Lightweight intrusion detection for networks.” in Lisa,vol. 99, no. 1, 1999, pp. 229–238.

[46] M. Roesch, “Snort users manual snort release: 1.8.” Retrieved November, vol. 12,2001.

[47] B. Visscher, “Sguil: The analyst console for network security monitoring,” Squil0.8. 0, 2013.

[48] R. Heenan and N. Moradpoor, “Introduction to security onion,” in The First PostGraduate Cyber Security Symposium, 2016.

[49] S. U. Manual, “v2. 9.2,” The Snort Project, 2011.

[50] Suricata. (2017) https://suricata-ids.org/.

[51] D. Day and B. Burns, “A performance analysis of snort and suricata network intru-sion detection and prevention engines,” in Fifth International Conference on DigitalSociety, Gosier, Guadeloupe, 2011, pp. 187–192.

[52] O. Depren, M. Topallar, E. Anarim, and M. K. Ciliz, “An intelligent intrusiondetection system (ids) for anomaly and misuse detection in computer networks,”Expert systems with Applications, vol. 29, no. 4, pp. 713–722, 2005.

[53] A. Patcha and J.-M. Park, “An overview of anomaly detection techniques: Existingsolutions and latest technological trends,” Computer networks, vol. 51, no. 12, pp.3448–3470, 2007.

[54] M. V. Mahoney and P. K. Chan, “Learning rules for anomaly detection of hostilenetwork traffic,” in Data Mining, 2003. ICDM 2003. Third IEEE InternationalConference on. IEEE, 2003, pp. 601–604.

[55] S. Axelsson, “The base-rate fallacy and the difficulty of intrusion detection,” ACMTransactions on Information and System Security (TISSEC), vol. 3, no. 3, pp. 186–205, 2000.

[56] W. Lee, W. Fan, M. Miller, S. J. Stolfo, and E. Zadok, “Toward cost-sensitive mod-eling for intrusion detection and response,” Journal of computer security, vol. 10,no. 1-2, pp. 5–22, 2002.

80

BIBLIOGRAPHY

[57] G. Gu, P. Fogla, D. Dagon, W. Lee, and B. Skoric, “Measuring intrusion detectioncapability: an information-theoretic approach,” in Proceedings of the 2006 ACMSymposium on Information, computer and communications security. ACM, 2006,pp. 90–101.

[58] P. Dokas, L. Ertoz, V. Kumar, A. Lazarevic, J. Srivastava, and P.-N. Tan, “Datamining for network intrusion detection,” in Proc. NSF Workshop on Next Genera-tion Data Mining, 2002, pp. 21–30.

[59] A. Turner, “Tcpreplay,” http://tcpreplay. synfin. net/trac/, 2011.

[60] A. Shiravi, H. Shiravi, M. Tavallaee, and A. A. Ghorbani, “Toward developinga systematic approach to generate benchmark datasets for intrusion detection,”computers & security, vol. 31, no. 3, pp. 357–374, 2012.

[61] J. McHugh, “Testing intrusion detection systems: a critique of the 1998 and 1999darpa intrusion detection system evaluations as performed by lincoln laboratory,”ACM Transactions on Information and System Security (TISSEC), vol. 3, no. 4,pp. 262–294, 2000.

[62] K. K. R. Kendall, “A database of computer attacks for the evaluation of intrusiondetection systems,” Ph.D. dissertation, Massachusetts Institute of Technology, 1999.

[63] C. Thomas, V. Sharma, and N. Balakrishnan, “Usefulness of darpa dataset forintrusion detection system evaluation.” The International Society for Optical En-gineering, 2008.

[64] M. Mahoney and P. Chan, “An analysis of the 1999 darpa/lincoln laboratory evalua-tion data for network anomaly detection,” in Recent advances in intrusion detection.Springer, 2003, pp. 220–237.

[65] J. McHugh, “Testing intrusion detection systems: a critique of the 1998 and 1999darpa intrusion detection system evaluations as performed by lincoln laboratory,”ACM Transactions on Information and System Security (TISSEC), vol. 3, no. 4,pp. 262–294, 2000.

[66] M. Tavallaee, E. Bagheri, W. Lu, and A. A. Ghorbani, “A detailed analysis ofthe kdd cup 99 data set,” in Computational Intelligence for Security and DefenseApplications, 2009. CISDA 2009. IEEE Symposium on. IEEE, 2009, pp. 1–6.

[67] A. Turner and M. Bing, “Tcpreplay: Pcap editing and replay tools for* nix,” online],http://tcpreplay. sourceforge. net, 2005.

[68] L. LABORATORY. (2017) https://www.ll.mit.edu/ideval/data/.

[69] X. Xing, Y. Cai, Z. Zhao, and L. Cheng, “Weighted evidence combination based onimproved conflict factor,” Journal of Discrete Mathematical Sciences and Cryptog-raphy, vol. 19, no. 1, pp. 173–184, 2016.

[70] O. Kanjanatarakul and T. Denœux, “Distributed data fusion in the dempster-shafer framework,” in System of Systems Engineering Conference (SoSE), 201712th. IEEE, 2017, pp. 1–6.

81

BIBLIOGRAPHY

[71] H. Jafari, X. Li, and L. Qian, “Efficient processing of uncertain data using dezert-smarandache theory: A case study,” in Dependable, Autonomic and Secure Com-puting, 14th Intl Conf on Pervasive Intelligence and Computing, 2nd Intl Conf onBig Data Intelligence and Computing and Cyber Science and Technology Congress(DASC/PiCom/DataCom/CyberSciTech), 2016 IEEE 14th Intl C. IEEE, 2016,pp. 715–722.

[72] W. Jiang, S. Wang, X. Liu, H. Zheng, and B. Wei, “Evidence conflict measure basedon owa operator in open world,” PloS one, vol. 12, no. 5, p. e0177828, 2017.

[73] F. Sebbak and F. Benhammadi, “Total conflict redistribution rule for evidentialfusion,” in Information Fusion (FUSION), 2016 19th International Conference on.IEEE, 2016, pp. 1324–1331.

[74] X. L. Dong and F. Naumann, “Data fusion: resolving data conflicts for integration,”Proceedings of the VLDB Endowment, vol. 2, no. 2, pp. 1654–1655, 2009.

[75] R. R. Yager, “A framework for multi-source data fusion,” Information Sciences, vol.163, no. 1, pp. 175–200, 2004.

[76] D. Mercier, B. Quost, and T. Denœux, “Refined modeling of sensor reliability inthe belief function framework using contextual discounting,” Information Fusion,vol. 9, no. 2, pp. 246–258, 2008.

[77] Z. Elouedi, K. Mellouli, and P. Smets, “Assessing sensor reliability for multisensordata fusion within the transferable belief model,” IEEE Transactions on Systems,Man, and Cybernetics, Part B (Cybernetics), vol. 34, no. 1, pp. 782–787, 2004.

[78] V. Shah, A. K. Aggarwal, and N. Chaubey, “Performance improvement of intrusiondetection with fusion of multiple sensors,” Complex & Intelligent Systems, vol. 3,no. 1, pp. 33–39, 2017.

[79] V. M. Shah and A. Agarwal, “Reliable alert fusion of multiple intrusion detectionsystems.” IJ Network Security, vol. 19, no. 2, pp. 182–192, 2017.

[80] H. Durrant-Whyte and T. C. Henderson, “Multisensor data fusion,” in SpringerHandbook of Robotics. Springer, 2016, pp. 867–896.

[81] A. Jøsang, “The consensus operator for combining beliefs,” Artificial Intelligence,vol. 141, no. 1-2, pp. 157–170, 2002.

[82] L. A. Zadeh, “Review of a mathematical theory of evidence,” AI magazine, vol. 5,no. 3, p. 81, 1984.

[83] G. Shafer, “Dempster-shafer theory,” Encyclopedia of artificial intelligence, pp. 330–331, 1992.

[84] R. R. Yager, “On the dempster-shafer framework and new combination rules,” In-formation sciences, vol. 41, no. 2, pp. 93–137, 1987.

[85] P. Smets, “Belief functions: the disjunctive rule of combination and the general-ized bayesian theorem,” in Classic Works of the Dempster-Shafer Theory of BeliefFunctions. Springer, 2008, pp. 633–664.

82

BIBLIOGRAPHY

[86] D. Dubois and H. Prade, “On the combination of evidence in various mathematicalframeworks,” in Reliability data collection and analysis. Springer, 1992, pp. 213–241.

[87] F. Smarandache and J. Dezert, “Information fusion based on new proportional con-flict redistribution rules,” in Information Fusion, 2005 8th International Conferenceon, vol. 2. IEEE, 2005, pp. 8–pp.

[88] P. Smets, “Data fusion in the transferable belief model,” in Information Fusion,2000. FUSIon 2000. Proceedings of the Third International Conference on, vol. 1.IEEE, 2000, pp. PS21–PS33.

[89] J. Dezert and F. Smarandache, “Proportional conflict redistribution rules for in-formation fusion,” Advances and applications of DSmT for Information Fusion-Collected works, vol. 2, pp. 3–68, 2006.

[90] M. C. Florea, J. Dezert, P. Valin, F. Smarandache, and A.-L. Jousselme, “Adapta-tive combination rule and proportional conflict redistribution rule for informationfusion,” arXiv preprint cs/0604042, 2006.

[91] R. R. Murphy, “Dempster-shafer theory for sensor fusion in autonomous mobilerobots,” IEEE Transactions on Robotics and Automation, vol. 14, no. 2, pp. 197–206, 1998.

[92] C. Thomas and N. Balakrishnan, “Performance enhancement of intrusion detec-tion systems using advances in sensor fusion,” in Information Fusion, 2008 11thInternational Conference on. IEEE, 2008, pp. 1–7.

[93] C. Katar, “Combining multiple techniques for intrusion detection,” Int J ComputSci Network Security, vol. 6, no. 2B, pp. 208–218, 2006.

[94] I. R. Goodman, R. P. Mahler, and H. T. Nguyen, Mathematics of data fusion.Springer Science & Business Media, 2013, vol. 37.

[95] U. Banerjee, A. Vashishtha, and M. Saxena, “Evaluation of the capabilities of wire-shark as a tool for intrusion detection,” International Journal of computer applica-tions, vol. 6, no. 7, 2010.

[96] S. Sanfilippo, “Hping3 (8)-linux man page,” 2005.

[97] Y. Liao and V. R. Vemuri, “Use of k-nearest neighbor classifier for intrusion detec-tion,” Computers & security, vol. 21, no. 5, pp. 439–448, 2002.

[98] D. Barbara, N. Wu, and S. Jajodia, “Detecting novel network intrusions usingbayes estimators,” in Proceedings of the 2001 SIAM International Conference onData Mining. SIAM, 2001, pp. 1–17.

[99] K.-L. Li, H.-K. Huang, S.-F. Tian, and W. Xu, “Improving one-class svm foranomaly detection,” in Machine Learning and Cybernetics, 2003 International Con-ference on, vol. 5. IEEE, 2003, pp. 3077–3081.

[100] C. Piciarelli, C. Micheloni, and G. L. Foresti, “Trajectory-based anomalous event de-tection,” IEEE Transactions on Circuits and Systems for video Technology, vol. 18,no. 11, pp. 1544–1554, 2008.

83

BIBLIOGRAPHY

List of PublicationsPatent:

1. System and Method for Reliable Intrusion Detection in Computer Network Sys-tems”has been registered and published by Indian Patent Office. (Application Num-ber: 201621013036)

Journal Papers:

1. Shah, V., Aggarwal, A. K., and Chaubey, N. (2017). Performance improvementof intrusion detection with fusion of multiple sensors. Complex and IntelligentSystems, 3(1), 33-39. (Springer Journal)

2. Shah, V. M., and Agarwal, A. K. (2017). Reliable Alert Fusion of Multiple Intru-sion Detection Systems. International Journal of Network Security, 19(2), 182-192.(SCOPUS Indexed)

3. Shah, V., Aggarwal, A. K., and Chaubey, N. (2017). Alert Fusion of IntrusionDetection systems using Fuzzy Dempster shafer Theory. Journal of EngineeringScience and Technology Review, 10(3), 123-127. (SCOPUS Indexed)

4. Shah, V., and Aggarwal, A. K. (2016). Enhancing performance of intrusion de-tection system against KDD99 dataset using evidence theory. Cyber-Security andDigital Forensics, 106.

5. Shah, V., Aggarwal, A. K., and Chaubey, N. (2018). Alert Fusion of IntrusionDetection systems using Fuzzy Consensus Operator. International journal of fuzzysystems. (SCOPUS, SCI Indexed) (under-review)

6. Shah, V., Aggarwal, A. K., and Chaubey, N. (2018) Intrusion Detection using Ma-chine Learning based Alert Classification. IEEE Transactions of Network scienceand Engineering (SCOPUS, SCI Indexed) (under-review)

Conference Papers:

1. Shah, V., and Aggarwal, A. K. (2015, February). Heterogeneous fusion of IDS alertsfor detecting DOS attacks. In Computing Communication Control and Automation(ICCUBEA), 2015 International Conference on (pp. 153-158). IEEE.

84

Documents

ENHANCING PERFORMANCE OF INTRUSION ......degree, diploma, associateship, fellowship, titles in this or any other University or other institution of higher learning. I further declare