DISTRIBUTED DENIAL OF SERVICE (DDOS) ATTACK
DETECTION AND PREVENTION MECHANISMS FOR
CLOUD- ASSISTED WIRELESS BODY AREA
NETWORKS (WBAN)
By
Rabia Latif
A thesis submitted to the faculty of the Department of Information Security, Military College of Signals, National University of Sciences and Technology,
Islamabad, Pakistan, in partial fulfillment of the requirements for the degree of PhD in Information Security
February 2016
ABSTRACT
A Distributed Denial of Service (DDoS) attack does not aim to disrupt or interfere with the
real sensor data; rather, it takes advantage of the disparity that exists between the network
bandwidth and the limited resource availability of the victim. Detecting and preventing such
attacks in cloud-assisted Wireless Body Area Networks (WBANs) is an important concern.
Such attacks can be countered by first detecting them and then applying prevention and
mitigation. Attack detection is the initial step of any defense approach and must precede
attack mitigation. Similarly, attack prevention plays an important role in protecting
a network from malicious attacks. This research focuses mainly on DDoS attack
detection and prevention algorithms and proposes a novel solution that not only consumes
fewer resources but also produces efficient results.
The limited resources of a WBAN are not enough to mitigate the huge amount of traffic
generated by a DDoS attack. Therefore, there is a need for lightweight approaches capable
of handling real-time, high-speed sensor data for detecting such attacks in a cloud-assisted
WBAN environment. The problem of detecting and preventing DDoS attacks
in cloud-assisted WBANs remains unresolved: existing solutions proposed for such attacks
in conventional networks are not directly applicable in a cloud-assisted WBAN environment
due to the resource scarcity of these networks. Moreover, multiple entry points into these
networks leave them more vulnerable to such attacks, which makes attack detection and
prevention a challenging task.
The aim of this research is to design a lightweight, in-network, distributed and scalable
approach for detecting DDoS attacks that is capable of handling the high-speed streaming data
generated by WBAN sensors in a cloud-assisted WBAN environment. The goal is to propose
an attack detection technique that outperforms existing
techniques in terms of: i) improved attack detection accuracy; ii) lower overall resource
usage; and iii) lower overall computational cost. An analysis and comparison of the
existing techniques for detecting attacks in both conventional and wireless sensor networks
concludes that the Very Fast Decision Tree (VFDT) is the most promising solution
for identifying the malicious behavior of nodes in these networks through pattern discovery.
Therefore, in this research, we have selected and explored the VFDT technique, which is
lightweight, and have further optimized it for handling high-speed streaming data originating
from WBAN sensors.
The performance evaluation is done through simulation experiments and a real-time WBAN
testbed deployment to test the effectiveness of the proposed attack detection approach. In addition,
the quantitative results obtained from the simulation experiments are benchmarked against
corresponding results obtained from the existing techniques. The comparison shows
the advantages and significance of deploying a stream mining approach in such networks for
detecting DDoS attacks in an efficient and timely manner.
Another objective of this research is to propose an efficient traceback technique specifically
for the cloud-assisted WBAN environment that incurs minimal overhead on the WBAN.
The goal is to propose a technique that is efficient in its packet marking and path
reconstruction procedures, in order to trace back and identify the source of a DDoS attack with
less convergence time. Different traceback techniques have been analyzed, and their comparison
leads to the conclusion that Probabilistic Packet Marking (PPM) is the most appropriate and
widely used approach in both conventional and wireless sensor networks. The key issue of
PPM lies in assigning the marking probability for path reconstruction. Therefore, we model
the traceback of a DDoS attack as a marking probability assignment problem and further optimize
it for efficient traceback of DDoS attacks in the cloud-assisted WBAN environment.
The evaluation is performed through simulation experiments to test the effectiveness of
the proposed traceback technique. In addition, the quantitative results acquired from the
simulations are benchmarked against equivalent results acquired from the Fish Bone Traceback
(FBT) technique. The comparison demonstrates the effectiveness of the proposed traceback technique
in WBANs for identifying the source of DDoS attacks with less convergence time
and minimum overhead.
TABLE OF CONTENTS
ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
DEDICATION
ACKNOWLEDGEMENTS
PUBLICATIONS
ACRONYMS
NOTATIONS
1 INTRODUCTION
1.1 Introduction
1.2 Security Requirements for Cloud-Assisted WBAN in context of Confidentiality, Integrity and Availability (CIA)
1.3 Distributed Denial of Service Attack
1.3.1 Distributed Denial of Service Attack: Conventional Network
1.3.2 Distributed Denial of Service Attack: Cloud-assisted WBAN
1.4 Motivation and Problem Statement
1.5 Contributions and Outcomes
1.6 Thesis Outline
2 DISTRIBUTED DENIAL OF SERVICE ATTACK: A Review
2.1 Cloud-Assisted Wireless Body Area Networks
2.1.1 Integrating WBAN with Cloud Computing Technology
2.1.2 Terminologies
2.1.3 Cloud-Assisted WBAN Applications
2.2 Distributed Denial of Service (DDoS) Attack in Cloud-Assisted WBAN
2.2.1 Classification of DDoS Attack
2.2.2 A Taxonomy of Distributed Denial of Service Attack Defense Mechanisms
2.3 Role of Data Mining in Distributed Denial of Service Attack Detection
2.3.1 Existing Data Mining Techniques for DDoS Attack Detection
2.4 Stream Mining Techniques
2.4.1 Preliminaries
2.4.2 Very Fast Decision Tree (VFDT)
2.4.3 Very Fast Decision Tree based on Predefined Threshold (VFDT-τ)
2.4.4 Optimized Very Fast Decision Tree (OVFDT)
2.4.5 Concept Adaptive VFDT (CVFDT)
2.5 Effect of Noise in Streaming Data
2.6 Traceback Techniques for Distributed Denial of Service (DDoS) Attack
2.6.1 Existing Traceback Techniques for Standard IP-Based Networks
2.6.2 Traceback techniques for Mobile Ad-hoc Networks
2.7 Conclusion
3 PROPOSED DDoS ATTACK DETECTION AND PREVENTION FRAMEWORK FOR CLOUD-ASSISTED WBAN
3.1 Requirements for DDoS Attack Detection in Cloud-Assisted WBAN environment
3.2 Proposed Cloud-Assisted WBAN Architecture
3.2.1 Formulation of Cloud-Assisted WBAN Architecture
3.2.2 Proposed Cloud-assisted WBAN Architecture
3.3 Proposed Framework for Detecting and Preventing DDoS Attack
3.4 Conclusion
4 EVFDT: An Enhanced Very Fast Decision Tree Algorithm for Detecting DDoS Attack in Cloud-Assisted WBAN
4.1 Proposed Distributed Denial of Service attack detection system
4.1.1 Data Collection Phase
4.1.2 Pre-Processing Phase
4.1.3 Attack Classification
4.1.4 Attack Response
4.2 Enhanced Very Fast Decision Tree (EVFDT): A Proposed Classification Algorithm
4.2.1 EVFDT Tree Building Process
4.3 Conclusion
5 ATTACK DETECTION SCHEME: PERFORMANCE ANALYSIS AND BENCHMARKING
5.1 Performance Evaluation Metrics
5.1.1 Attack Detection Accuracy
5.1.2 False Alarm Rate (FAR)
5.1.3 Computational Cost
5.1.4 Sensitivity vs Specificity
5.1.5 Tree Size
5.1.6 Computational Time
5.1.7 Memory Usage
5.2 Simulation-Based Experiments
5.2.1 Synthetic Datasets
5.2.2 DDoS Attack Strategy: Generation and Analysis
5.2.3 Performance Evaluation and Comparative Analysis
5.3 Hardware-Based Experiments
5.3.1 Experimental TestBed
5.3.2 Traffic Generation
5.3.3 Performance Evaluation and Comparative Analysis
5.4 Qualitative Comparison of Classification Algorithms
5.5 Conclusion
6 PROPOSED TRACEBACK SCHEME FOR DISTRIBUTED DENIAL OF SERVICE ATTACK
6.1 Preliminaries
6.1.1 Probabilistic Packet Marking
6.1.2 Key Issues in Selecting Probability
6.2 Proposed Traceback Technique
6.2.1 Finding the Traveling Distance
6.2.2 Uniform Residual Probability
6.3 DDoS Attacker Traceback and Path Reconstruction
6.3.1 Procedure for Aggregate Node Path Reconstruction
6.3.2 Procedure for Sensor Node Path Reconstruction
6.4 Conclusion
7 TRACEBACK SCHEME: PERFORMANCE EVALUATION AND BENCHMARKING
7.1 Simulation Setup
7.2 Evaluation and Comparative Analysis
7.2.1 Convergence time
7.2.2 Uncertainty
7.2.3 Overhead on Nodes
7.3 Conclusion
8 CONCLUSION AND FUTURE DIRECTIONS
8.1 Summary
8.2 Future Work
REFERENCES
LIST OF FIGURES
1.1 DDoS Attack in Conventional Network
1.2 DDoS Attack Illustration in WBANs
2.1 Cloud-Assisted WBAN Conceptual Architecture for E-Health Monitoring
2.2 DDoS Attack Classification
2.3 Taxonomy of DDoS Defense Mechanism
2.4 Data Mining Process
2.5 Effect of Noisy Data
3.1 Flat Topology
3.2 Flat Topology
3.3 Cluster-based Topology
3.4 Data Aggregation Topology
3.5 Proposed cloud-assisted WBAN Architecture
3.6 Sequence of Operations from Patient to Healthcare Professional
3.7 Workflow of Attack Detection Node at Cloud
3.8 Proposed Framework for Detecting and Preventing DDoS Attacks
4.1 Proposed DDoS Attack Detection System
4.2 Proposed EVFDT Flowchart
5.1 Illustration of LEACH Protocol
5.2 Accuracy in different Noise Percentage
5.3 Accuracy vs In in different Noise Percentages
5.4 FPR and FNR vs In (a) False Positive Rate (b) False Negative Rate
5.5 Tree Size vs Noise Percentage
5.6 Computational Time vs Number of Instances In
5.7 Memory Usage vs Number of Instances In
5.8 Arduino XBee Shield
5.9 'Arduino XBee shield' over e-Health sensor shield complete kit
5.10 Complete WBAN Demonstration
5.11 Arduino IDE serial monitor
5.12 Attack Detection Accuracy for Different Noise(%)
5.13 Attack Detection Accuracy Comparison with Different Noise(%)
5.14 Effect of Noise% on FPR and FNR (a) False Positive Rate (b) False Negative Rate
5.15 Effect of In on FPR and FNR (a) False Positive Rate (b) False Negative Rate
5.16 Sensitivity vs Specificity (a) VFDT-τ (b) CVFDT (c) OVFDT (d) EVFDT
5.17 ROC curves showing the tradeoff between Sensitivity and false-positive rate (100-Specificity) of DDoS attacks
5.18 Computational Cost Comparison
5.19 Computational Time Comparison
5.20 Memory Usage Comparison
6.1 Graphical Network Topology
6.2 Residual Probability ϕ1 for node n1
6.3 Unmarked Probability ϕ0
6.4 Falsify Paths
6.5 WBAN Network Topology
6.6 IEEE 802.15.4 with DPPM label
6.7 DPPM label
6.8 Sensor Nodes Connecting with an Edge
6.9 (a): Multi-Hop WBAN Topology
6.10 (b): Sequence of Packet Traveling Along the Path
7.1 Number of packets required by proposed technique and FBT (τi = 0.08)
7.2 Uncertainty values for PPM with Different Marking Probabilities
7.3 A Comparison of Overhead on Individual Nodes
LIST OF TABLES
2.1 DDoS Defense Mechanisms based on Deployment Location
2.2 Data Mining Techniques
2.3 Comparison of existing DDoS attack detection mechanisms
5.1 Performance evaluation metrics
5.2 Confusion Matrix
5.3 Cost Matrix
5.4 Simulation Parameters
5.5 FPR and FNR of Classification Algorithms in Percentage
5.6 Sensitivity and Specificity of Classification Algorithms in Percentage
5.7 Tree Size Comparison with different Noise Percentage
5.8 List of Statistical Features
5.9 Experimental Results of Attack Detection Accuracy(%) for real-time datasets
5.10 Sensitivity and Specificity of Existing Proposed Classification Algorithms in Percentage
5.11 Qualitative Comparison of Proposed and Existing Classification Algorithms
7.1 Simulation Parameters
7.2 Convergence Time Comparison of FBT and proposed Technique
7.3 Total Overhead on Nodes
DEDICATION
This thesis is dedicated to
MY BELOVED PARENTS, HUSBAND,
AND MY DAUGHTER
for their love, endless support and encouragement
ACKNOWLEDGEMENTS
I am grateful to God Almighty, who has bestowed upon me the strength and the passion to
accomplish this thesis, and I am thankful to Him for His mercy and benevolence. Without
His consent I could not have undertaken this task.
I would like to express my sincere gratitude to my advisor, Dr. Haider Abbas, for his
continuous support throughout my degree, and for his patience, motivation, enthusiasm, and
immense knowledge. His guidance helped me throughout the research and writing of this
thesis.
I am grateful to my thesis guidance and evaluation committee members, Dr.
Asif Masood, Dr. Hammad Afzal, and Dr. Mehreen Afzal, for their constant supervision
and support.
Very special thanks go to Dr. Seemab Latif, without whose efforts my job would
have undoubtedly been more difficult. I greatly benefited from her keen scientific insight
and her ability to put complex ideas into simple terms.
I am very grateful to my parents for the endless support they have provided me throughout my
entire life, and in particular during my studies. I must acknowledge my husband, without
whose love, encouragement and editing assistance I would not have finished this thesis.
PUBLICATIONS
The following relevant publications were produced during the PhD period.
1. R. Latif, H. Abbas, S. Latif, A. Masood, "DDOS Attack Source Detection Using Efficient
Traceback Technique (ETT) in Cloud-Assisted Healthcare Environment", Journal
of Medical Systems. Impact factor 2.24 (Under Review).
2. R. Latif, H. Abbas, S. Latif, A. Masood, "EVFDT: An Enhanced Very Fast Decision
Tree Algorithm for Detecting Distributed Denial of Service Attack in Cloud-Assisted
Wireless Body Area Network", Mobile Information Systems, Hindawi Publishing
Corporation, Article ID 260594, 2015. Impact factor 0.94.
3. R. Latif, H. Abbas, S. Latif, A. Masood, "Performance Evaluation of Enhanced Very
Fast Decision Tree (EVFDT) Mechanism for Distributed Denial of Service Attack
Detection in Healthcare Systems", Healthcare on Smart and Mobile Devices, Annals
of Telecommunications, 2015. Impact factor 0.699.
4. R. Latif, H. Abbas, S. Latif, "Distributed Denial of Service DDoS attack detection
using data mining approach in cloud Assisted Wireless Body Area Networks",
International Journal of Ad hoc and Ubiquitous Computing (IJAHUC), 2014. Impact
factor 0.55.
5. R. Latif, H. Abbas, S. Assar, "Distributed Denial of Service (DDoS) Attack in
Cloud-Assisted Wireless Body Area Network: A Systematic Literature Review", Journal
of Medical Systems (JoMS), vol. 38, no. 128, November 2014. Impact factor 2.24.
6. R. Latif, H. Abbas, S. Latif, "Analyzing Feasibility for Deploying Very Fast Decision
Tree for DDoS Attack Detection in Cloud Assisted WBAN", in Proceedings
of the 2014 International Conference on Intelligent Computing (ICIC2014), August
3-6, 2014, Taiyuan, China.
7. R. Latif, H. Abbas, S. Assar, Q. Ali, "Cloud Computing Risk Assessment: A Systematic
Literature Review", Springer Lecture Notes in Electrical Engineering,
vol. 276, pp. 285-295, 2013.
ACRONYMS
Wireless Body Area Network WBAN
Denial of Service DoS
Distributed Denial of Service DDoS
Confidentiality Integrity Availability CIA
Very Fast Decision Tree VFDT
Enhanced Very Fast Decision Tree EVFDT
Concept Adaptive Very Fast Decision Tree CVFDT
Optimized Very Fast Decision Tree OVFDT
Network Simulator-2 NS-2
Efficient Traceback Technique ETT
Body Control Unit BCU
Transmission Control Protocol TCP
User Datagram Protocol UDP
Electrocardiogram ECG
Internet Control Message Protocol ICMP
Intrusion Detection System IDS
Intrusion Prevention System IPS
Genetic Algorithm GA
K-Nearest Neighbor KNN
Hoeffding Bound HB
Mobile Ad Hoc Network MANET
Wireless Sensor Network WSN
Probabilistic Packet Marking PPM
Deterministic Packet Marking DPM
Dynamic Probability Packet Marking DPPM
Cumulative Path CP
Marking Probability Distribution Function MPDF
Electromyography EMG
Secure Sockets Layer SSL
Transport Layer Security TLS
Secure Shell SSH
Role Based Access Control RBAC
Quality of Service QoS
Hoeffding Tree HT
Low Energy Adaptive Clustering Hierarchy LEACH
False Alarm Rate FAR
False Positive Rate FPR
False Negative Rate FNR
Very Fast Machine Learning VFML
Receiver Operating Characteristic ROC
Fish Bone Traceback FBT
Media Access Control MAC
Time To Live TTL
MAC Protocol Data Unit MPDU
NOTATIONS
τ Threshold
AN Aggregate Node
BS Base Station
CH Cluster Head
t Flow of Traffic
s Sensor Node
M Malicious Nodes
k Number of Malicious Nodes in a Network
r Critical Node
ε Hoeffding Bound
G(.)+ Upper Bound
G(.)− Lower Bound
Tr Adaptive Threshold
In Number of Instances
ϕi Residual Probability
l Leaf
HBCount Total Number of Values in Sorted List
XS Sorted List of HB Values
S Stream of Samples
HT Hoeffding Tree
c Number of Classes
lp Length of Pruned Tree
ETT Efficient Traceback Technique
A Attack Path
a Attacker
v Victim
N Total Number of Nodes
τi Marking Probability
ϕi Residual Probability
p Attack Measure from a to v
K Uncertainty Factor
li Legitimate Node
d Travelling Distance
i Total Number of Nodes Along Attack Path
P (s) Label
m Maximum Uncertainty
HN Nth Harmonic Number
Chapter 1
INTRODUCTION
1.1 Introduction
Wireless Body Area Networks (WBANs) have emerged as a promising technology that has
shown enormous potential for improving the quality of healthcare and has thus found a
broad range of medical applications, from ubiquitous health monitoring to emergency medical
response systems. WBANs have the potential to reduce health monitoring costs and improve
the quality of patients' lives. However, the efficient management of the huge amount
of highly sensitive data collected and generated by WBAN sensor nodes requires a scalable
and secure storage and processing infrastructure. Given the limited resources of WBAN
sensors for power, storage and processing, the integration of WBANs and cloud computing
provides a powerful, viable and hybrid platform to process the enormous amount of data collected
from multiple WBAN nodes. Such a platform must also support long-term patient health
monitoring and the analysis of patients' health records under different situations [1] [2].
In cloud computing, wireless devices do not need their own computing facilities, data storage,
powerful configurations such as high-speed CPUs, or other software services, since their data
and complicated computing operations can be shifted to and processed in the cloud, which significantly
reduces operational and maintenance costs [3] [4]. The seamless integration
of WBANs and cloud computing will provide several benefits to e-Healthcare, including better
patient care, cost reduction, a solution to resource scarcity, better health quality, and
research and strategic planning support [5].
However, despite the benefits of cloud-assisted WBANs, several security issues and challenges
remain unresolved. Among these, data availability is the most pressing security issue.
The most serious threat to data availability is a Distributed Denial of Service (DDoS) attack,
which directly affects the all-time availability of a patient's data [1]. The existing solutions
for standalone WBANs and sensor networks are not applicable in the cloud. Detecting
a DDoS attack in a cloud-assisted WBAN requires a defensive approach that
understands the network semantics and the flow of traffic in these networks.
This chapter is organized as follows. Section 1.1 introduces the basic concept of cloud-assisted
WBAN. Section 1.2 highlights the security requirements of cloud-assisted WBAN
in the context of Confidentiality, Integrity and Availability (CIA). Section 1.3 gives a brief overview
of the DDoS attack in both conventional networks and the cloud-assisted WBAN environment.
Section 1.4 highlights the motivation and objectives of this research work. Section 1.5
summarizes the research contributions and outcomes. Finally, Section 1.6
gives the overall structure of the thesis.
1.2 Security Requirements for Cloud-Assisted WBAN in context of Confidentiality,
Integrity and Availability (CIA)
The security of sensors, of data collection at aggregate nodes, and of the transmission of that
data to the cloud over an unsecured network is an important and critical issue. As in other secure
systems, the key security requirements for cloud-assisted WBAN applications
are confidentiality, integrity and availability (CIA) [6]. It is
important to understand these requirements before integrating adequate security
solutions. The fundamental security requirements for provisioning security in
cloud-assisted WBAN are as follows:
1. Data Confidentiality: Data confidentiality is required to protect sensitive data from
eavesdropping by a rogue sensor or intruder. In e-Health applications, the WBAN sensors
send sensitive information about a patient's health status. An adversary can eavesdrop
on the communication and overhear critical information, which may
cause severe damage to the patient's data. Achieving confidentiality requires a cryptographic
key for encrypting the patient's data. However, due to the resource-scarce nature of WBAN
sensors, it is very challenging to generate, store and use cryptographic keys for encryption
[6] [7].
2. Data Integrity: The lack of a data integrity mechanism allows an adversary to modify or
tamper with the patient's information when it is transmitted over an insecure channel. This is very
hazardous, especially for life-critical events. Therefore, it is essential to ensure the
presence of an adequate data integrity mechanism [6] [7].
3. Availability: Availability is threatened by attacks such as DoS and DDoS, in which the
attacker tries to reduce the network's capacity and performance, or even to make the
network unavailable to legitimate users [6] [7].
Several popular schemes [8] [9] [10] have been proposed in the literature to satisfy the data
authentication, integrity and confidentiality requirements for provisioning security in sensor
networks. However, very little research has been done to address the issue of availability of
sensor nodes under attack.
1.3 Distributed Denial of Service Attack
A Distributed Denial of Service (DDoS) attack is defined as an explicit attempt by an attacker
to exhaust the resources of a victim node. Multiple nodes are deployed to launch the
attack by sending a stream of packets towards the victim, thus consuming the key resources
of the victim node and making them unavailable to legitimate nodes. These resources mainly
include network bandwidth, computing power, and memory [13]. Sections
1.3.1 and 1.3.2 explain the DDoS attack in conventional networks and cloud-assisted
WBANs, respectively.
1.3.1 Distributed Denial of Service Attack: Conventional Network
A Distributed Denial of Service (DDoS) attack is defined as an explicit attempt by
malicious nodes to launch an attack against a victim node in order to exhaust the victim's resources
and prevent it from providing services to legitimate users. This type of attack is distributed
in nature, i.e. multiple nodes are deployed to launch the attack by sending a stream of packets
towards the victim, thus consuming the key resources of the victim node and making them
unavailable to legitimate nodes. These resources mainly include network bandwidth,
computing power and memory [11].
In conventional networks, a DDoS attack can be launched either by exploiting a vulnerability (i.e.
exploiting a protocol or a running application by sending malformed packets towards the
victim) or by flooding the victim in order to exhaust the resources of the victim machine
[11].
In a DDoS attack, the attacker follows the principle that "the power of many is greater than
the power of few". These attacks aim to compromise legitimate machines in the network,
which then participate in the attack process as directed by the master attacker [12].
The attacker initiates an attack by first scanning the network for vulnerable machines.
After identifying the vulnerable machines, the attacker exploits
the identified vulnerabilities to gain access to these machines and infect them with malicious
code or install attack patches on them. These machines are thus compromised
and used by the attacker to launch a DDoS attack against the selected victim machines, either to
overwhelm their resources or to crash them [13].
The compromised nodes, also called zombies, are widely scattered over the network and
are remotely controlled by the attacker. The attack code installed on these zombies is
triggered simultaneously by the attacker in order to launch a DDoS attack towards the victim
machine. The complete scenario of a DDoS attack in a conventional network is depicted in
Figure 1.1. As a result, the victim machine is overwhelmed by a huge amount of
traffic from all directions and is unable to respond to legitimate requests.
Figure 1.1: DDoS Attack in Conventional Network
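The flooding effect described above can be illustrated with a minimal Python sketch. It is not part of the thesis: the function name `legit_service_ratio`, the uniform-drop assumption and all numbers are hypothetical and chosen only to show how the share of legitimate requests served shrinks as more zombies join the attack.

```python
def legit_service_ratio(capacity, legit_rate, zombies, zombie_rate):
    """Expected fraction of legitimate requests the victim can serve per second.

    Assumption (hypothetical model): the victim serves at most `capacity`
    requests per second and drops the excess uniformly at random, so
    legitimate and attack requests are dropped in the same proportion.
    """
    total_rate = legit_rate + zombies * zombie_rate
    if total_rate <= capacity:
        return 1.0  # no overload: every legitimate request is served
    return capacity / total_rate


# Illustrative numbers only; none of them come from the thesis.
for zombies in (0, 2, 10, 50):
    ratio = legit_service_ratio(capacity=100, legit_rate=20,
                                zombies=zombies, zombie_rate=40)
    print(f"{zombies:3d} zombies -> {ratio:.1%} of legitimate requests served")
```

Under this simplified model, service to legitimate users stays intact until the combined request rate exceeds the victim's capacity and then degrades in inverse proportion to the total rate, which is why an attacker benefits from recruiting many zombies rather than relying on a single strong machine.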
1.3.2 Distributed Denial of Service Attack: Cloud-assisted WBAN
In a cloud-assisted WBAN, an attacker launches a DDoS attack by triggering attack code
on multiple compromised sensor nodes simultaneously in order to send hoax messages to a
victim node at very short intervals. As a result, the victim node is overwhelmed with
more traffic than its maximum processing capacity can handle, which exhausts its resources
and prevents it from providing services to its legitimate users [1].
In a cloud-assisted WBAN, the attacker node has processing power several orders of magnitude higher
than that of a regular sensor node. These attack nodes compromise the legitimate sensor
nodes in the network and forge their identities with the intention of launching an attack by sending
huge inflows of traffic towards the victim node [1]. Before designing any security
scheme for detecting DDoS attacks in a cloud-assisted WBAN environment, there is a need to
understand the network semantics and the flow of traffic in these networks.
Figure 1.2 illustrates the DDoS attack mechanism in WBANs. The cloud follows the mechanism
of the conventional network shown in Figure 1.1. As shown in Figure 1.2, the attacker
launches a DDoS attack from multiple points towards a single victim sensor node in order to
overwhelm it with a huge number of hoax requests.
Figure 1.2: DDoS Attack Illustration in WBANs
The attacker can be one of the following: a malicious node injected into the network by an
attacker, a legitimate node compromised by an attacker who forges its identity, or a
laptop-class node with additional processing capabilities. In every case, the intention is to
exhaust the energy resources of the victim node.
1.4 Motivation and Problem Statement
Besides other open issues in the WBAN environment, such as energy efficiency, quality of
service, and standardization, security and privacy are key issues that need special attention.
Among the security issues, data availability is the most pressing. Availability determines
whether a sensor node is able to use the resources and whether the network is available for
data communication. Failure of the base station or of the aggregate nodes will eventually
threaten the entire sensor network for health-critical applications. Thus, availability is of
utmost importance for maintaining an operational network.
The DDoS attack is one of the most powerful attacks on the availability of patients' health
data and on the services of healthcare professionals. A DDoS attack severely affects the
capacity and performance of a WBAN if it is not handled in a timely and appropriate manner [14].
A DDoS attack does not aim to disrupt or interfere with the real sensor data. Rather, it
takes advantage of the disparity between the network bandwidth and the limited resource
availability of the victim. Detecting and preventing such attacks in cloud-assisted WBAN is
an important concern. Attacks can be countered by first detecting an attack, followed by
attack prevention and mitigation. Attack detection is the initial step of any defense approach
and needs to be taken prior to attack mitigation. Similarly, attack prevention plays an
important role in protecting a network from malicious attacks. This thesis focuses mainly on
DDoS attack detection and prevention algorithms and proposes a novel solution that not only
consumes fewer resources but also produces accurate results.
The limited resources of WBANs are not enough to mitigate the huge amount of traffic
generated by a DDoS attack. Therefore, there is a need for an approach that is lightweight
and capable of handling real-time, high-speed sensor data for the detection of such attacks
in a cloud-assisted WBAN environment. The problem of detecting and preventing DDoS attacks
in cloud-assisted WBAN remains unresolved. The solutions proposed for such attacks in
conventional networks are not directly applicable in a cloud-assisted WBAN environment due
to the resource scarcity of these networks. The multiple entry points into these networks
leave them more vulnerable to such attacks, which makes the attack detection and prevention
process more complicated.
The aim of this research is to design a lightweight, in-network, distributed and scalable
approach for detecting DDoS attacks that is capable of handling the high-speed streaming data
generated by WBAN sensors in a cloud-assisted WBAN environment. The goal is to propose an
attack detection technique that improves on existing techniques in terms of: i) attack
detection accuracy; ii) overall resource usage; and iii) overall computational cost. An
analysis and comparison of the existing techniques for detecting attacks in both conventional
and wireless sensor networks concludes that data mining techniques have proved to be the most
promising solution for identifying the malicious behavior of nodes in these networks through
pattern discovery. Therefore, in this research we have explored a data mining technique that
is lightweight and have further optimized it for handling high-speed streaming data
originating from WBAN sensors.
Another objective of this research is to propose an efficient traceback technique
specifically for the cloud-assisted WBAN environment that incurs minimal overhead on the
network. The proposed technique is efficient in its packet marking and path reconstruction
procedures for tracing back and identifying the source of a DDoS attack with less convergence
time. Different traceback techniques have been analyzed, and their comparison leads to the
conclusion that Probabilistic Packet Marking (PPM) is the most appropriate and widely used
approach in both conventional and wireless sensor networks [15] [16]. The key issue of PPM
lies in assigning the marking probability for path reconstruction. Therefore, we model the
traceback of a DDoS attack as a marking probability assignment problem and further optimize
it for efficient traceback of DDoS attacks in a cloud-assisted WBAN environment. The purpose
of selecting the PPM technique is to reduce the overhead on sensor nodes.
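To make the role of the marking probability concrete, the following minimal sketch assumes
the classic uniform form of PPM (node sampling), in which every node on the path overwrites a
single mark field with the same probability p. It is not the optimized assignment developed
in this thesis; the function names and the values of p and the path length are illustrative
assumptions. Under this assumption, the mark of a node d hops from the victim reaches the
victim with probability p(1 - p)^(d - 1).

```python
import random


def mark_survival_probability(p: float, d: int) -> float:
    """Probability that the victim receives the mark of the node d hops away.

    Assumes uniform PPM: every node overwrites the mark with probability p,
    so a mark survives only if the d - 1 nodes closer to the victim skip it.
    """
    return p * (1.0 - p) ** (d - 1)


def simulate_marks(p: float, path_length: int, packets: int, seed: int = 0) -> list:
    """Count, per hop distance, how many packets arrive carrying that node's mark."""
    rng = random.Random(seed)
    counts = [0] * (path_length + 1)  # index 0 = packet arrived unmarked
    for _ in range(packets):
        mark = 0
        # Walk from the farthest node (distance path_length) towards the victim.
        for distance in range(path_length, 0, -1):
            if rng.random() < p:
                mark = distance  # a node closer to the victim overwrites earlier marks
        counts[mark] += 1
    return counts


if __name__ == "__main__":
    p, hops, n = 0.2, 5, 100_000  # hypothetical values, chosen only for illustration
    counts = simulate_marks(p, hops, n)
    for d in range(1, hops + 1):
        print(d, counts[d] / n, round(mark_survival_probability(p, d), 4))
```

Because p(1 - p)^(d - 1) shrinks as d grows, the node farthest from the victim is the one
whose mark is seen least often, which is why the choice of marking probability governs how
many packets are needed before the path can be reconstructed.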
1.5 Contributions and Outcomes
Specifically this research has resulted in the following contributions and research outcomes:
Contribution 1: A cloud-assisted WBAN architecture is proposed that integrates a wireless
body area network with cloud computing to store and process the data collected by WBAN
sensors for patients' eHealth monitoring. The proposed system architecture is scalable and is
able to store the huge amount of data generated by WBAN sensors. Based on the proposed
architecture, a framework has been proposed for the detection and prevention of DDoS attacks
in a cloud-assisted WBAN environment. Further, we have identified the possible attack points
where a DDoS attack can occur and for which the solutions have been proposed.
Contribution 2: Proposed, deployed and analyzed an efficient destination-based attack
detection technique for detecting DDoS attack in cloud-assisted WBAN environment.
• Proposed a distributed denial of service attack detection system.
• An algorithm for attack classification has been proposed that is capable of handling
noisy data and detects a DDoS attack efficiently with high accuracy and a low false
alarm rate.
• Performance evaluation and comparative analysis have been performed on both synthetic
data generated by simulation and real-time data generated by deploying an actual
WBAN hardware testbed.
Contribution 3: Proposed, deployed and analyzed an efficient traceback technique to trace
the source of a DDoS attack in a cloud-assisted WBAN environment.
• A novel packet marking technique has been proposed for both single-hop and multi-hop
WBAN topologies.
• A novel labeling technique has been proposed to find the traveling distance of a node
from the source.
• An aggregate node path reconstruction algorithm has been proposed to reconstruct the path
from the victim to the aggregate node of the cluster that contains the attacker and the
source node.
• A sensor node path reconstruction algorithm has been proposed to perform the path
reconstruction from the aggregate node to the source node from which the attack originates.
• Simulations have been performed to evaluate the performance of the proposed traceback
technique. Finally, a comparative analysis has been done to show the advantage of the
proposed technique over existing techniques.
1.6 Thesis Outline
This thesis is divided into eight chapters. A brief overview of each chapter is given in this
section.
Chapter 2 introduces the role of data mining in the detection of DDoS attacks. Further,
the chapter provides background information on existing data mining and stream mining
techniques. The advantages and limitations of different techniques are also presented, along
with their implications when applied in a cloud-assisted WBAN environment. The effect of
noise on streaming data is then discussed. Finally, for the prevention of DDoS attacks, the
background on existing traceback mechanisms is discussed in detail, along with their
drawbacks and limitations when used in a cloud-assisted WBAN environment.
In Chapter 3, the proposed cloud-assisted WBAN architecture is presented. Each module
of the proposed architecture is discussed in detail. Further, we highlight the possible areas
that are vulnerable to DDoS attacks and require a security mechanism for their prevention.
Based on the proposed cloud-assisted architecture, a novel framework for detecting and
preventing DDoS attacks in cloud-assisted WBAN is proposed. For evaluating the proposed
framework, a DDoS attack detection technique is proposed in Chapter 4 and a DDoS attack
prevention technique is proposed in Chapter 6.
In Chapter 4, the proposed DDoS attack detection system is presented. Each phase of the
attack detection system is elaborated. Further, the statistical features that help in the
detection of DDoS attacks are identified. An improvement of the Very Fast Decision Tree
(VFDT) [53], namely Enhanced VFDT (EVFDT), is then proposed that is capable of handling noisy
data and detects a DDoS attack efficiently with high accuracy and a low false alarm rate. The
different procedures of the proposed EVFDT are given and discussed in detail along with the
algorithms.
Chapter 5 discusses the performance evaluation and benchmarking of the proposed DDoS attack
detection technique. The purpose of the performance evaluation is to analyze the effectiveness
of the technique in detecting DDoS attacks. Likewise, a comparative analysis is performed to
show the advantage of the proposed technique over existing techniques for detecting DDoS
attacks. The complete evaluation and comparison process is performed separately on both
synthetic datasets generated by simulation in NS-2 and a dataset generated by deploying an
actual WBAN hardware testbed. The performance metrics used to evaluate and compare the results
include: attack detection accuracy, false alarm rate, sensitivity vs specificity,
computational cost, tree size and resource usage.
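As an illustration of the classification metrics named above, the sketch below computes them
from the four cells of a confusion matrix using the standard textbook definitions. Treating
"attack" as the positive class is an assumption made here, as are the function name and the
example counts; the exact formulations used in Chapter 5 may differ.

```python
def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard confusion-matrix metrics, with 'attack' as the positive class.

    tp: attack records flagged as attack     fp: normal records flagged as attack
    tn: normal records passed as normal      fn: attack records passed as normal
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,        # share of all records classified correctly
        "false_alarm_rate": fp / (fp + tn),   # share of normal traffic raised as an alarm
        "sensitivity": tp / (tp + fn),        # share of attack traffic that is detected
        "specificity": tn / (tn + fp),        # share of normal traffic that is passed
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the arithmetic.
    for name, value in detection_metrics(tp=90, fp=5, tn=95, fn=10).items():
        print(name, value)
```

Note that specificity and false alarm rate always sum to one under these definitions, so
reporting both is a matter of presentation rather than additional information.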
In Chapter 6, a traceback technique called Efficient Traceback Technique (ETT) is proposed,
to be deployed specifically in resource-constrained WBANs for both multi-hop and single-hop
topologies. For packet marking, a novel labeling technique is proposed. Subsequently, a
working example is given to show the effectiveness of the proposed technique. Further, DDoS
attacker traceback algorithms are proposed for path reconstruction and attacker
identification. This mechanism comprises two procedures: the Procedure for Aggregate Node
Path Reconstruction (to reconstruct the path from the victim to the aggregate node of the
cluster that contains the attacker and the source node), and the Procedure for Sensor Node
Path Reconstruction (to perform the path reconstruction from the aggregate node to the source
node from which the attack originates).
In Chapter 7, the performance of the proposed traceback technique is evaluated through
simulation and experiments. The proposed technique assigns a dynamic marking probability
to each node along the path, and further reconstructs the attack path to efficiently trace
back the attacker and make subsequent decisions. The performance of the proposed scheme is
affected by a few network parameters. The variation in these parameters is used to quantify
the results, based on simulation experiments. The network simulator NS-2 is used to compare
and analyze the performance metrics, including convergence time, overhead and uncertainty.
The simulation results are compared with the corresponding results obtained from the
simulation of existing traceback techniques for both multi-hop and single-hop WBANs.
Simulation results show that the proposed technique yields superior results compared
to existing techniques.
Finally, Chapter 8 concludes the thesis with future directions.
Chapter 2
DISTRIBUTED DENIAL OF SERVICE ATTACK: A Review
With the increasing popularity of cloud-assisted WBAN for e-Health applications, the demand
for securing these networks is also increasing. The solutions developed against security
attacks on high-speed networks (the internet) are not directly applicable to the
cloud-assisted WBAN environment. The underlying reasons for this lack of applicability are:
a) it is a fairly new technology that inherits the limitations of both WBAN and cloud;
b) the resource-scarce nature of WBAN sensor nodes: limited processing power, low computation
capabilities and less memory; c) multiple entry points to the WBAN; d) non-triviality in
selecting a particular critical sensor node [17].
For detecting security attacks in these networks, there is a need for the development of
attack defensive approaches that understand and analyze the network semantics and flow of
the traffic in these networks. Attacks can be avoided by first detecting an attack followed by
attack mitigation and prevention. Attack detection is an initial step of any defense approach
that needs to be taken prior to attack mitigation techniques. Similarly, attack prevention also
plays an important role in protecting a network from malicious attacks.
This chapter is organized as follows. Section 2.1 introduces cloud-assisted WBAN, with
emphasis on the security issues related to this technology. Section 2.2 provides an in-depth
analysis of the classification of DDoS attacks, focusing on the types of DDoS attack and
their targets. In section 2.3, we present an overview of data mining techniques and their
importance in detecting malicious behavior in a network. Further, the existing data mining
techniques for DDoS attack detection, along with their limitations, are discussed in this
section. Section 2.4 discusses the stream mining techniques for mining high-speed stream
data. In section 2.5, we discuss the effect of noise on stream data. Section 2.6 provides a
detailed analysis of the existing traceback mechanisms for DDoS attacks and their limitations
for resource-constrained WBANs. Finally, the chapter is summarized in section 2.7.
2.1 Cloud-Assisted Wireless Body Area Networks
Due to advancements in wireless technologies and emerging ideas such as wireless sensor
networks, wireless body area networks, and other types of low-power wireless communication
networks, patient health monitoring and other related services are becoming more and
more popular. These networks will reduce health monitoring costs and improve the quality
of a patient's life [14] [18]. However, the efficient management of the massive amount of
monitored data gathered by various WBAN sensors is a key problem for their large-scale
adoption in healthcare services. Therefore, there is a need for innovative solutions to meet
the growing challenges of handling the exponential growth in data generated by WBAN
sensor nodes. WBAN nodes have limited power, energy, capacity, and computation and
communication capabilities. Yet, at the same time, they need to be scalable and powerful,
with secure storage and high-performance computation, and they require real-time data
processing and storage, especially for e-health applications [19] [20].
2.1.1 Integrating WBAN with Cloud Computing Technology
Cloud computing is a promising technology that is expected to play an important role in
attaining the aforementioned goals [21] for healthcare management. The integration of a WBAN
with cloud computing introduces a hybrid and feasible platform to process the enormous
amount of data gathered from various WBAN sensor nodes. It must also be able to realize
long-term patient health monitoring and the analysis of patients' health records under
different situations. In cloud computing, wireless devices do not need their own computing
facilities, data storage, powerful configuration such as a high-speed CPU, or other software
services, since their data and extensive computing operations can be shifted to and processed
on the cloud, thus significantly reducing the operational and maintenance costs [3] [5]. The
seamless integration of WBAN and cloud computing will provide several benefits to
e-healthcare, including better patient health care, reduced cost, a solution for resource
scarcity, and research and strategic planning support [2]. This cloud-assisted WBAN will
enable medical servers and physicians to universally access the storage and processing
infrastructure on a pay-as-you-go pricing model [22].
Figure 2.1: Cloud-Assisted WBAN Conceptual Architecture for E-Health Monitoring
Figure 2.1 depicts the typical cloud-assisted WBAN conceptual architecture for the e-health
monitoring solution being considered in this research. The architecture is multi-tiered
and described below:
1. Tier 1 - WBANs: This tier represents the WBANs and incorporates a set of small,
intelligent, wireless in-body and on-body sensors that are placed purposely on the patient's
body. These sensors monitor, process, and store information about the patient's physiological
parameters. The mobile devices (PDAs and smart phones) serve as gateways for the
WBAN, also known as the Body Control Unit (BCU). Because the WBAN application
is related to the patient's health, there is a need for a reliable packet delivery system for
data from a WBAN node to the BCU, i.e., acknowledgments of delivered packets and the
retransmission of lost packets. This tier will emphasize the communication channel
used as a transport layer protocol [1].
2. Tier 2 - Transmission: This tier depicts the transmission medium, in which the mobile
devices transmit the sensed data to the e-healthcare service provider over the cloud for
performing healthcare-related tasks. The base station or an access point is responsible
for collecting data from Tier 1 and transferring it to the cloud via an insecure network
(the internet) for further processing. The transport layer of the network stack specifies
the protocol (TCP/UDP) through which the BCU and the e-healthcare service provider
communicate [1].
3. Tier 3 - Cloud Services: This tier is composed of cloud services in which the e-
healthcare service provider categorizes the data based on the attributes chosen by the
patient and transfers it to the health cloud storage. Here again, the transport layer pro-
tocol is responsible for the reliable transmission of data from the e-healthcare service
provider to cloud storage in the cloud environment [1].
Nevertheless, research into a cloud-assisted WBAN platform is still in its infancy.
Current studies in this area focus on architectural design issues for a cloud-assisted WBAN
to realize e-Healthcare services, while they lack an emphasis on security issues. Some of
these issues are malicious in nature. Among these, data availability is the most pressing
security issue. The major threat to data availability is the distributed denial of service
attack, which adversely affects the overall performance and reliability of healthcare
systems, from secure record keeping to seamlessly accessible healthcare data transmission.
In Figure 2.1, the red circles show the entities and areas of emphasis for which the DDoS
attack and available solutions will be analyzed. Therefore, there is a need to put together
all the studies and assess all the available knowledge on the subject.
2.1.2 Terminologies
The following terms are used throughout the thesis:
• WBAN Sensor Node: Deployed on the human body for monitoring a patient's health
parameters, such as ECG, pulse, blood sugar and blood pressure.
• Base Station: Responsible for collecting the information from all the aggregate
nodes and transferring the data to the cloud through the internet.
• Body Control Unit (BCU): An aggregation node that collects all information from the
sensor nodes and forwards it to the base station.
• Malicious/Attack Node: A sensor node launching an attack towards a victim.
• Victim Node: The target node against which a DDoS attack is initiated.
• High Speed Networks (Internet): Standard IP-based computer network.
• E-Health Service Provider: Classifies a patient's health record on the basis of
patient attributes and transfers it to the e-Health cloud storage for permanent storage.
• Health Cloud Storage: Responsible for storing and retrieving data upon request by
authorized users (pharmacists, doctors, health workers, etc.).
• Data Requesters: These include healthcare professionals (doctors, nurses, health
workers) who use application-specific services (SaaS) to access a patient's stored data.
2.1.3 Cloud-Assisted WBAN Applications
Cloud-assisted WBANs are deployed for various applications [6]. Some of these are
discussed below:
1. Medical Application: The proposed solution helps in monitoring patients' health in
hospitals and disaster areas. In both scenarios, there is a demand for the highest levels
of security, i.e. the medical information should not be leaked and should only be accessible
to authorized personnel. Similarly, the information should be available continuously to
ensure better health monitoring.
2. Gaming, entertainment and consumer electronics: Gaming applications need
wireless devices that can sense different body postures and provide input to the
applications, for example body position and movement sensors. Entertainment systems
such as wireless headphones demand higher bandwidth and consume more power, since
duty cycling cannot be used. Typically, gaming sensors connect to a gaming console
where the data is collected for interactive gaming, and entertainment systems connect
to a device which provides data.
3. Lifestyle: In these applications, the environments and devices around the users are
sensitive to the users, their moods and their activities. The goals of these applications
can be achieved using WBAN, which can provide facilities to uniquely identify each user,
recognize his/her mood and monitor activities. WBAN sensors can connect to access
points that activate personalization, identify the users using digital signatures and
periodically transmit sensor data to the system that recognizes the mood and
activity of the user.
2.2 Distributed Denial of Service (DDoS) Attack in Cloud-Assisted WBAN
A Distributed Denial of Service (DDoS) attack is defined as an explicit attempt by an attacker
to exhaust the resources of a victim node. Multiple nodes are deployed to launch an attack
by sending a flow of data packets to the victim, thus consuming the key resources of the
victim node and making them unavailable to legitimate nodes. These resources mainly include
the network bandwidth, computing power and memory [11]. A detailed analysis of the
distributed denial of service attack in the cloud-assisted WBAN environment and its
implications shows that a DDoS attack has the following characteristics:
1. During an attack, the packet length, sequence number and window size remain fixed.
2. The source and destination IP addresses, along with the port numbers, are spoofed and
generated randomly.
3. Packet throughput, defined as the number of bytes transferred from source to
destination per unit time, decreases for legitimate users.
4. Packet loss increases for legitimate users due to the interaction of legitimate
traffic with attack traffic.
5. Packet delay increases as network congestion builds up.
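The characteristics listed above can be turned into simple per-window measurements. The
following minimal sketch is an illustration only, not the feature extraction used in this
thesis: the `Packet` record, its field names and the `window_features` function are
hypothetical, and a real capture would supply these values from packet headers and
delivery logs.

```python
from collections import namedtuple

# Hypothetical per-packet record; the field names are assumptions for illustration.
Packet = namedtuple("Packet", ["timestamp", "length", "window_size", "delivered"])


def window_features(packets, window_seconds):
    """Summarise one observation window of traffic towards a node."""
    sent = len(packets)
    delivered = [p for p in packets if p.delivered]
    return {
        # Characteristic 3: bytes transferred from source to destination per unit time.
        "throughput_bytes_per_s": sum(p.length for p in delivered) / window_seconds,
        # Characteristic 4: fraction of packets that never reached the destination.
        "loss_ratio": (sent - len(delivered)) / sent if sent else 0.0,
        # Characteristic 1: length and window size stay fixed during an attack.
        "fixed_length": len({p.length for p in packets}) == 1,
        "fixed_window_size": len({p.window_size for p in packets}) == 1,
    }


if __name__ == "__main__":
    sample = [
        Packet(0.0, 100, 512, True),
        Packet(0.5, 100, 512, False),
        Packet(0.9, 100, 512, True),
        Packet(1.5, 100, 512, True),
    ]
    print(window_features(sample, window_seconds=2.0))
```

Comparing such window summaries over time is one way to observe the drop in throughput and
the rise in loss that the list describes; the thresholds that separate attack from normal
traffic are deliberately left out, since they are not given in this section.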
2.2.1 Classification of DDoS Attack
Under a DDoS attack, the sensor node or the base station of a wireless body area network
plays the same role as the system or server of a standard IP-based network. As shown in
Figure 2.2, DDoS attacks can be classified into two broad categories, namely bandwidth
depletion attacks and resource depletion attacks. Each of these broad categories can be
further classified into two subcategories. Bandwidth depletion attacks can be subdivided
into flood attacks and amplification attacks, whereas resource depletion attacks can be
subdivided into protocol exploitation and malformed packet attacks. Each of these attacks
and their subcategories are discussed in this section.
Figure 2.2: DDoS Attack Classification
Bandwidth Depletion Attack
In a bandwidth depletion attack, the goal of an attacker is to flood the victim node with a
huge amount of traffic to prevent legitimate traffic from reaching the victim node. It is
further divided into flood attack and amplification attack [11].
• Flood Attack: In a flood attack, zombies send a huge amount of traffic towards a victim
node in order to congest the network bandwidth of the victim node with IP traffic.
As a result, the victim node crashes, slows down or suffers from saturated network
bandwidth, which blocks legitimate users from accessing the victim node.
These attacks are generally launched using UDP (User Datagram Protocol) and ICMP
(Internet Control Message Protocol) packets [11].
In a UDP flood attack, an excessive number of UDP packets are forwarded to a selected
or random port of the victim node. If no application is running on the specified port of
the victim node, an ICMP packet is sent as a reply stating that the destination
port is unreachable. The DDoS attacking program spoofs the source IP address of
the attack packets, which helps to conceal the identity of the secondary victim nodes.
The packets returned from the victim node are therefore not sent back to the zombies but
to the spoofed addresses. UDP flood attacks also consume the bandwidth of connections
near the victim system [11].
In an ICMP flood attack, zombies send an excessive number of ICMP ECHO REQUEST
(ping) packets to the targeted node. These packets prompt the victim node to reply, and
the combined traffic overloads the connection bandwidth of the victim's network.
During an ICMP attack, the attacker also spoofs the source IP address of the ICMP
packets [11].
• Amplification Attack: In an amplification attack, an attacker sends a large number
of messages to a broadcast IP address. This causes all the nodes in the network that
receive the broadcast message to send a reply to the victim node. The attacker uses
amplification in order to raise the attack traffic volume. It includes the
smurf attack and the fraggle attack [11].
Smurf Attack: In a smurf attack, the attacker sends ICMP ECHO REQUEST packets to a
network amplifier (a broadcast address) with the source address spoofed to the IP address
of the victim, so that every replying node directs its response to the victim [11].
Fraggle Attack: In a fraggle attack, the attacker sends packets to the network amplifier
using UDP ECHO packets. Fraggle attacks generate additional malicious traffic,
thus causing more damage to the victim [11].
Resource Depletion Attack
In resource depletion attack, the goal of an attacker is to block the critical resources (pro-
cessor and memory) of a victim node in order to prevent the legitimate user from using
these resources. It is further divided into protocol exploitation attack and malformed packet
attacks [11].
• Protocol Exploitation Attack: The protocol exploitation attack can be further divided
into the PUSH+ACK attack and the TCP SYN attack [11].
TCP SYN Attack: In a TCP SYN attack, an attacker programs zombies to send TCP SYN
requests to the victim node in order to consume its resources and prevent it from
responding to legitimate requests. The attack exploits the three-way handshake between the
source and the destination node: the source nodes spoof their source IP addresses and send
a huge volume of TCP SYN packets to the victim node. In response, the victim node
sends SYN+ACK but never receives the final ACK from the sender. This results in the
exhaustion of the victim's resources, and the victim node is unable to respond to
legitimate requests [11].
PUSH+ACK Attack: In a PUSH+ACK attack, the attacker sends TCP packets with the
PUSH and ACK bits set. These flags in the TCP packet header instruct the victim to
unload the data in its TCP buffer and send an acknowledgement once this is
completed [11].
• Malformed Packet Attacks: In a malformed packet attack, the attacker instructs the
zombies to send incorrectly formed IP packets to the victim node in order to crash it. It is
further divided into IP address attack and IP packet attacks [11].
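The resource exhaustion behind the TCP SYN attack described above can be sketched with a
toy model of the victim's queue of half-open connections. This is an illustration under
stated assumptions, not a model of any real TCP stack: the `SynBacklog` class, its capacity
and the source labels are hypothetical, and timeouts for stale entries are deliberately
left out.

```python
class SynBacklog:
    """Toy model of a listener's queue of half-open TCP connections."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.half_open = set()

    def on_syn(self, source: str) -> bool:
        """Reserve a slot for a new handshake; False means the SYN is dropped."""
        if source in self.half_open:
            return True  # retransmitted SYN, slot already reserved
        if len(self.half_open) >= self.capacity:
            return False  # backlog exhausted, new handshakes are refused
        self.half_open.add(source)
        return True

    def on_ack(self, source: str) -> bool:
        """The final ACK completes the handshake and frees the slot."""
        if source in self.half_open:
            self.half_open.remove(source)
            return True
        return False


def run_demo(capacity: int = 4) -> bool:
    """Return whether a legitimate SYN is accepted after a flood of spoofed SYNs."""
    backlog = SynBacklog(capacity)
    # Spoofed sources never send the final ACK, so their slots stay occupied.
    for i in range(capacity):
        backlog.on_syn(f"spoofed-{i}")
    return backlog.on_syn("legitimate-client")


if __name__ == "__main__":
    print("legitimate SYN accepted:", run_demo())
```

In this sketch a legitimate client that completes the handshake frees its slot at once,
whereas each spoofed SYN holds a slot indefinitely, which is the asymmetry the attack
relies on.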
On the basis of DDoS attack classification discussed above, a complete DDoS attack tax-
onomy is discussed in section 2.2.2 according to which the existing DDoS defense mecha-
nisms are studied and analyzed.
2.2.2 A Taxonomy of Distributed Denial of Service Attack Defense Mechanisms
To combat DDoS attacks, various mechanisms have been proposed in the literature to date. All
of the existing techniques are proposed and implemented for high-speed networks or wireless
sensor networks. None of these mechanisms is suitable for resource-constrained WBANs.
This section classifies the DDoS defense mechanisms against the two types of DDoS
attack (bandwidth depletion attacks and resource depletion attacks) discussed in section
2.2.1 on the basis of two criteria. These classification criteria are crucial for the
formulation of an efficient and robust defense strategy against DDoS attacks [11].
1. The first criterion for classification is the location at which the defense mechanism
is deployed. Based on this criterion, the defense mechanisms are classified into
three categories: source-based, destination-based and network-based defense
mechanisms [11].
2. The second criterion for classification is the time at which the defense mechanism is
deployed in order to respond to a DDoS attack. Depending on this criterion, the defense
mechanisms are divided into three categories: before the attack, during the attack and
after the attack [11].
Figure 2.3 shows the taxonomy of DDoS defense mechanisms according to which the
existing DDoS defense mechanisms are studied and analyzed. The details of the taxonomy
are given below:
Figure 2.3: Taxonomy of DDoS Defense Mechanism
Deployment Location
1. Source-Based Defense Mechanisms: Source-based mechanisms are deployed close to
the source of the attack in order to prevent network users from generating DDoS
attacks. They are deployed either at the entry point (edge router) of the source's core
network or at the access router of a routing domain that connects to the source's local
network through the edge router. Several source-based mechanisms have been proposed
for detecting DDoS attacks at the source and are discussed later in this section [17].
2. Destination-Based Defense Mechanisms: Destination-based mechanisms are deployed
close to the victim of the attack. Both attack detection and response are performed at the
destination of the attack. These mechanisms must be capable of observing the victim
model and its behavior in order to detect anomalies [17].
3. Network-Based Defense Mechanisms: Network-based defense mechanisms are deployed
inside the network. The objective of these mechanisms is to detect the attack
traffic and create a response to stop it at the intermediate network [17].
4. Hybrid (Distributed) Defense Mechanisms: Hybrid defense mechanisms are deployed
at various locations such as source, destination or intermediate networks, and there
is usually collaboration among the deployment points. Detection can be done at the
victim or intermediate network, and the response can be initiated and distributed to
other nodes by the victim [17].
Table 2.1 summarizes the defense mechanisms based on deployment location along with the
features and enumerates the pros and cons of each category.
Table 2.1: DDoS Defense Mechanisms based on Deployment Location

| Category | Features | Pros | Cons |
|---|---|---|---|
| Source-Based | Detection and response are deployed at source | Helps in preventing the loss of resources by filtering the traffic at the source | It is difficult to detect attack traffic at the source due to the low volume of traffic |
| Destination-Based | Detection and response are deployed at victim | These defense mechanisms are easier and cheaper to deploy because of their access to the aggregate traffic near the victim node | The detection and response of the attack cannot be done until it reaches the victim, which causes resource wastage on the paths towards the victim |
| Network-Based | Detection and response are deployed at intermediate network | Detection and traceback of the attack source is easy because the aim is to filter attack traffic as close to the source as possible | Excessive storage and processing overheads at intermediate points; less traffic available for attack detection at intermediate access points; difficult deployment because it requires the reconfiguration of all routers on the network |
| Hybrid (Distributed) | Both detection and response occur at different locations: detection takes place either at the intermediate network or at the destination node, whereas response takes place at the source node | More robust against DDoS attack; more resources available at various levels to handle DDoS attack efficiently | Strong collaboration among the deployment points is required; complexity and overhead due to the communication between distributed components spread all over the network |
Time at Which the Defense Mechanism is deployed
1. Before an Attack (Attack Prevention): The best time to stop a DDoS attack is at its
initial stage, when it is launched. This is done by deploying attack prevention mechanisms
at the source, destination and intermediate network. Attack prevention can be done by
employing filters and installing Intrusion Detection Systems (IDS), firewalls and
Intrusion Prevention Systems (IPS) [11].
2. During an Attack (Attack Detection): Attack detection, which takes place during the
attack, is the key step in defending against a DDoS attack. These mechanisms can also be
deployed at source, destination, intermediate networks and hybrid locations. A number
of attack detection mechanisms exist in the literature and are discussed in section 2.3.1.
3. After an Attack (Traceback the Source of an Attack and Response): The main focus
of a DDoS defense mechanism is to minimize the impact of an attack and maximize the
availability of resources and services for legitimate users. Therefore, the DDoS
detection mechanism must be followed by the two main categories of after-the-attack
mechanisms: (a) the first category is the traceback mechanism, which is responsible for
identifying (tracing) the source of the attack; these mechanisms are discussed later in
Chapter 6; (b) the second category is responsible for initiating an appropriate response
to the attack. The most common response mechanism is throttling (rate limiting) applied
to identified attack flows [3].
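The text above does not specify how throttling is implemented. As an illustration only, a token bucket is one common way to rate-limit an identified flow; the class name, parameters and timestamps below are assumptions made for this sketch and are not taken from [3].

```python
class TokenBucket:
    """Hypothetical per-flow rate limiter (token bucket), for illustration."""

    def __init__(self, rate, capacity):
        self.rate = rate          # tokens added per second (assumed unit)
        self.capacity = capacity  # maximum burst size in packets
        self.tokens = capacity    # the bucket starts full
        self.last = 0.0           # timestamp of the previous packet

    def allow(self, now):
        """Return True if a packet arriving at time `now` may be forwarded."""
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # the flow exceeded its allowance; drop the packet
```

A flow flagged by the detection stage would be assigned its own bucket, so that legitimate flows are not affected by the limit.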
To overcome the effects of DDoS attacks in the cloud-assisted WBAN environment, various
techniques have been studied and explored during this research according to the
classification and taxonomy of DDoS attacks. Among these, data mining classification
techniques have proven themselves a valuable tool for identifying misbehaving nodes and
thus for detecting DDoS attacks. Therefore, in section 2.3, data mining techniques are
studied and analyzed for their effectiveness and efficiency in detecting DDoS attacks.
2.3 Role of Data Mining in Distributed Denial of Service Attack Detection
Data mining, also known as knowledge discovery, is a topic under study in the field of
computer science that employs a number of existing computational techniques from
statistics, information retrieval, machine learning and pattern recognition [23].
According to Patcha et al. [24], "Data mining is concerned with learning patterns,
association, changes, anomalies and statistically significant structures and events from
large quantities of data". Fig 2.4 depicts the process of data mining that transforms the
raw data into valuable knowledge.
Data miners are trained in using specialized automated software to discover regularities
and irregularities in large and complex data sets. In the recent past, data mining
techniques have been considered one of the most promising solutions for identifying the
malicious behavior of nodes in the network and have become an important component for the
detection and prevention of DoS and DDoS attacks [25].
Figure 2.4: Data Mining Process
The aim of deploying WBAN for e-health applications is to make real-time decisions
efficiently, which is a very challenging task because of the highly resource-constrained
computing and communication capacities and the high speed of the non-stationary data
generated by WBAN sensors. This challenge is the motivation for selecting and exploring
data mining techniques that are lightweight and that discover patterns in large
continuous streams of WBAN sensor data. Specific tasks that data mining might contribute
to DDoS attack detection in cloud-assisted WBAN are as follows:
1. Help to mine the sensor data for uncovering patterns in order to make intelligent
decisions immediately after an attack occurs.
2. Detect anomalous activities that expose a real attack.
3. Identify large continuous patterns in ongoing streams of sensor data, e.g., different
IP addresses with the same activity.
4. Identify bad sensor signatures.
5. Detect previously unknown network anomalies.
To fulfill these specific tasks, data miners utilize a single data mining technique or a
combination of techniques. These include statistical techniques, data summarization,
visualization, clustering, association rules and classification techniques [26]. The
effectiveness of each technique depends upon the application scenario in which it is
applied. Table 2.2 shows the details of these techniques along with their pros and
cons [27] [28]. Finally, we analyze the consequences of choosing data mining techniques
in the cloud-assisted WBAN environment.
Table 2.2: Data Mining Techniques
Technique: Association Rule Mining
  Description: These are focused on the discovery of patterns and dependencies in data
    sets. A rule is an expression of the form X ⇒ Y, where X and Y are sets of items.
    Makes use of two measures: support and confidence [27].
  Pros: Formulated to look for sequential patterns [27].
  Cons: It is difficult to detect attack traffic at the source due to the low volume of
    traffic.
  Consequences: As WBAN sensors have less computational capacity, building complex
    algorithms is not a good choice.

Technique: Clustering
  Description: Grouping together objects that are similar to each other but different
    from the objects belonging to other clusters. These algorithms are used for the
    detection of underlying structures within the data [28].
  Pros: Provides end users with an abstract view of database operations. Very fast
    computation on databases [9].
  Cons: The effectiveness of clustering techniques depends on the application. Once a
    merge or a split is committed, it cannot be undone or refined [28].
  Consequences: WBAN data consists of high-speed continuous data streams, which leads to
    missing data values within the input data.

Technique: Visualization
  Description: Visualization is a way to transform poor data into meaningful form by
    using a wide variety of data mining techniques in order to discover hidden
    patterns [28].
  Pros: Visualization is a way to transform poor data into meaningful form by using a
    wide variety of data mining techniques in order to discover hidden patterns [28].
  Cons: Difficult to understand when built hierarchically. As it is difficult for humans
    to understand numbers, summarization is required to put the data into graphical
    form [28].
  Consequences: Computationally intensive for WBAN because summarization is required to
    put the data into human-understandable form.

Technique: Statistics
  Description: Statistics is a branch of mathematics dealing with collecting data and
    counting it [27].
  Pros: Gives a high-level view of the database. Provides useful and important
    information about the database [27].
  Cons: These techniques make certain assumptions about data [27].
  Consequences: As we are dealing with human health, making assumptions about data is
    not a good idea.

Technique: Classification
  Description: Classification techniques predict the category to which a particular
    record belongs [28].
  Pros: Simplicity and interpretability of their rules. Better performance and
    understanding than other DM techniques [28].
  Cons: Data preparation procedures are not restricted and imposed by any
    requirements [28].
  Consequences: Appropriate for WBAN due to the simplicity and interpretability of their
    rules, which in the case of decision trees are derived easily from the organization
    of the tree.
2.3.1 Existing Data Mining Techniques for DDoS Attack Detection
In the past, data mining techniques have been considered one of the most promising
solutions for identifying the malicious behavior of nodes in the network. For this
research, data mining techniques have been studied and evaluated for the detection of
DDoS attacks in the cloud-assisted WBAN environment. From the perspective of DDoS attack
detection, existing data mining techniques (Subbulakshmi [29], Wu et al. [30], Lee et
al. [31], Arun et al. [32], Thwe et al. [33]) can be broadly classified into source-based
and destination-based detection techniques. Source-based detection techniques are
deployed near the source of an attack, whereas destination-based detection techniques are
deployed near the victim of an attack. These detection techniques are discussed below:
Source-Based Detection Techniques
Lee et al. [31] proposed an enhanced traffic matrix-based approach in which the traffic
matrix parameters are optimized using a Genetic Algorithm (GA). Only two features of the
IP header, namely packet arrival time and source IP address, are used to construct a
traffic matrix. From this traffic matrix, the variance is calculated and used to
categorize the traffic as normal (high variance) or a DDoS attack (low variance).
Finally, upon the detection of an attack, alerts are generated.
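The variance test described above can be sketched as follows. This is a simplified illustration and not the implementation of [31]: the mapping of a source IP address to a matrix cell, the matrix size, the use of packet counts per cell within one packet-based window, and the threshold are all assumptions, and the GA-based parameter optimization is omitted.

```python
import ipaddress
from statistics import pvariance


def traffic_matrix(src_ips, size=8):
    """Count the packets of one window per matrix cell (assumed cell mapping)."""
    matrix = [[0] * size for _ in range(size)]
    for ip in src_ips:
        value = int(ipaddress.IPv4Address(ip))
        row = (value >> 8) % size   # hypothetical: third octet selects the row
        col = value % size          # hypothetical: fourth octet selects the column
        matrix[row][col] += 1
    return matrix


def matrix_variance(matrix):
    """Variance over the non-empty cells of the traffic matrix."""
    cells = [count for row in matrix for count in row if count > 0]
    return pvariance(cells) if len(cells) > 1 else 0.0


def is_attack(src_ips, threshold, size=8):
    """Low variance (many sources, similar counts) is flagged as an attack."""
    return matrix_variance(traffic_matrix(src_ips, size)) < threshold
```

In this sketch a window dominated by a few legitimate senders yields very different cell counts and therefore a high variance, whereas a flood from many distinct source addresses fills the cells almost uniformly and yields a low variance.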
Arun and Selvakumar [32] investigated ensemble-based neuro-fuzzy classifiers. The key
contributions of the authors include a weight-update distribution policy, reduction in the
error cost, and ensemble output combination approach. The performance was evaluated
using attack test data, which was not included in the training data set. The results showed
that the proposed scheme is able to detect new attacks.
Destination-Based Detection Techniques
In Wu et al. [30], a destination-based technique is proposed that deploys a decision tree
at the victim node and a traffic pattern matching technique for attack identification and
traceback to the source of an attack. To build the decision tree, fifteen distinct network
and packet features were chosen to monitor the packet rate and byte rate and thereby
reveal the traffic flow pattern. For data classification, the C4.5 classification
algorithm was applied using the chosen network and packet features as tests to detect
abnormal traffic flow.
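C4.5 selects the feature to test at each node by its gain ratio. The sketch below computes that criterion for a single discretized feature. The feature values and labels are hypothetical and are not the fifteen features used in [30].

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of one discrete feature with respect to the class labels."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after splitting on the feature.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(values)  # penalizes features with many distinct values
    return gain / split_info if split_info > 0 else 0.0
```

For example, a hypothetical discretized packet-rate feature that separates attack records from normal records perfectly has a gain ratio of 1, while a feature that is independent of the label has a gain ratio of 0.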
Table 2.3: Comparison of existing DDoS attack detection mechanisms
Scheme: DDoS attack detection with decision tree C4.5 [30]
  Type: Destination based
  Advantages:
    - Capable of handling future attacks
    - Efficient trace back procedure
    - Suitable for any network topology
  Disadvantages:
    - Low classification accuracy when training data is large
    - Requires an entire dataset to remain permanently in memory, which results in
      memory consumption

Scheme: DDoS attack detection using Enhanced Support Vector Machines [29]
  Type: Destination based
  Advantages:
    - Fast computation
    - Does not require dataset to be stored in memory
  Disadvantages:
    - Evaluated on an obsolete data set (KDD Cup)
    - Detection reliability is not very high
    - Low accuracy

Scheme: DDoS attack detection using optimized traffic matrix [31]
  Type: Source based
  Advantages:
    - Time based window is replaced with packet based window size to reduce the
      computation overhead
    - Increases the detection rate by optimizing the features of traffic matrix using
      Genetic algorithm
  Disadvantages:
    - Not suitable for real time network traffic
    - Detection delay is high
    - May generate excessive alerts

Scheme: DDoS attack detection using an ensemble of adaptive and hybrid neuro-fuzzy [32]
  Type: Source based
  Advantages:
    - Suitable for handling large stream of network data
    - No retraining of classifiers
    - Capable of incremental learning
    - Better performance as compared to other algorithms
  Disadvantages:
    - Hybrid approach making it complex in cloud-assisted WBAN environment
    - Prior knowledge of data distribution is needed

Scheme: DDoS attack detection using K-Nearest Neighbour [33]
  Type: Destination based
  Advantages:
    - Discovered unknown attacks
    - Attack characteristics can be analyzed using statistical values
  Disadvantages:
    - Very high complexity dealing with large data sets
    - Computationally expensive
    - Not suitable for real time data mining
Subbulakshmi et al. [29] proposed an Intrusion Detection System (IDS) based defense
mechanism to counter DDoS attacks. The IDS is trained using the datasets obtained from
the extraction of attack traffic features. To strengthen the detection process, weights
are added to these datasets at regular intervals.
Thwe and Thandar [33] proposed a statistical anomaly detection technique based on
K-Nearest Neighbor (KNN) deployed at the victim node. A user-specified threshold is
defined. When the current state of the system differs from the defined model by more than
the specified threshold, an anomaly is raised. At this stage KNN is used to detect an
attack.
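The two-stage idea described for the KNN-based technique (a threshold test against a model of normal behavior, followed by KNN classification) can be sketched as below. The feature vectors, the Euclidean distance measure, the baseline and the threshold are assumptions made for illustration and are not taken from [33].

```python
from collections import Counter
from math import dist


def knn_classify(sample, training, k=3):
    """Majority vote among the k labeled training points nearest to `sample`."""
    nearest = sorted(training, key=lambda item: dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def detect(sample, baseline, threshold, training, k=3):
    """Raise an anomaly only if `sample` deviates from the baseline model,
    then let KNN decide whether the anomaly is an attack."""
    if dist(sample, baseline) <= threshold:
        return "normal"
    return knn_classify(sample, training, k)
```

Here each sample is a hypothetical pair such as (packets per second, distinct sources per second); only samples that exceed the user-specified threshold incur the cost of the nearest-neighbor search.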
A comparative analysis of the existing data mining techniques for the detection of a DDoS
attack is given in Table 2.3. It provides evidence that the existing techniques have high
complexity and low accuracy when a large amount of training data is used. None of the
existing schemes is suitable for the real-time data mining of the high-speed streaming
data coming from a sensor network. Therefore, in this research, stream mining techniques
are analyzed that have the ability to handle real-time streams of data and detect a DDoS
attack efficiently and in less time. Stream mining techniques are discussed in the next
section.
2.4 Stream Mining Techniques
Stream mining is concerned with extracting significant knowledge structures from
continuous and rapid streams of data. A data stream is an ordered sequence of instances
that can be read only a small number of times using limited computational and memory
resources. Taking into account the resource-constrained nature of WBAN sensors, the Very
Fast Decision Tree (VFDT) proves to be a lightweight data mining technique that is able
to process a large amount of high-speed streaming data while consuming little memory
space. It turns out to be efficient in the detection of DDoS attacks at any stage due to
its ability to build a decision tree from scratch. In [17], VFDT is applied for the
detection of DDoS attacks and an objective-based comparison is done. The results show
that VFDT is an accurate tool for DDoS attack detection. Therefore, in this research VFDT
is selected and improved for detecting DDoS attacks efficiently. This section explains
the preliminaries for VFDT and discusses its variants along with their limitations when
used for DDoS attack detection in cloud-assisted WBAN.
2.4.1 Preliminaries
VFDT is a stream-based data classification method that learns using a complete set of
N training samples expressed as (X, y), where X is a vector of n attributes given as
{X1, X2...Xn}. The aim is to construct a model of a mapping function y = f(X) that will
predict the classes of subsequent samples x with maximum accuracy. To design a VFDT
for DDoS attack detection, the mathematical preliminaries used for the classification are
discussed below [34] [53].
Hoeffding Bound: This gives a certain level of confidence about the best attribute on
which to split the node. Suppose we have N independent observations of a real-valued
random variable r whose bounded range is R. The Hoeffding bound states that, with
confidence level 1 − δ, the true mean of variable r is at least r̄ − ε, where r̄ is the
mean of the N observations and ε can be calculated using equation 2.1 [53].
$$\varepsilon = \sqrt{\frac{R^{2}\ln(1/\delta)}{2N}} \tag{2.1}$$
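As a numerical illustration of the Hoeffding bound, the sketch below evaluates ε for given R, δ and N, and inverts the formula to obtain the number of observations needed for a target ε. The function names are chosen for this example only.

```python
from math import ceil, log, sqrt


def hoeffding_bound(value_range, delta, n):
    """epsilon = sqrt(R^2 * ln(1/delta) / (2N))."""
    return sqrt(value_range ** 2 * log(1.0 / delta) / (2.0 * n))


def samples_needed(value_range, delta, epsilon):
    """Smallest N for which the bound does not exceed `epsilon`."""
    return ceil(value_range ** 2 * log(1.0 / delta) / (2.0 * epsilon ** 2))
```

Because ε shrinks with the square root of N, quadrupling the number of observations halves the bound. With R = 1, δ = 10^-7 and a target ε of 0.05, `samples_needed` returns 3224.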
Information Gain: VFDT uses the information gain as a heuristic evaluation function and
bounds it from above and below with high confidence. The upper bound G(.)+ and lower
bound G(.)− are calculated using equations 2.2 and 2.3 [53].
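$$G(A,T)^{+} = \sum_{v \in A} P(T,A,v) + \sqrt{\frac{\ln(1/\delta)}{2N}}\, H(\mathrm{Sel}(T,A,v))^{+} \tag{2.2}$$

$$G(A,T)^{-} = \sum_{v \in A} P(T,A,v) + \sqrt{\frac{\ln(1/\delta)}{2N}}\, H(\mathrm{Sel}(T,A,v))^{-} \tag{2.3}$$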
where A is an attribute in the set T of training samples, P(T, A, v) is the fraction of
the training samples in set T that hold the value v for attribute A, and Sel(T, A, v)
selects all the training samples having value v for attribute A from set T.
Although VFDT classifies stream data efficiently, it has limitations: for example, it
cannot handle noisy data, and its classification accuracy decreases as the noise
increases. In the recent past, a few variations of VFDT have been proposed. In this
section, VFDT and its variants are discussed and briefly analyzed for their feasibility,
along with their limitations when used for DDoS attack detection in cloud-assisted WBAN.
2.4.2 Very Fast Decision Tree (VFDT)
Domingos et al. [53] proposed VFDT, which uses the Hoeffding bound (HB) of equation 2.1
to control the error in the selection of the splitting attribute. The information gain
G(.) given in equation 2.2 and equation 2.3 is used as a Heuristic Evaluation Function
(HEF) to decide the split attribute that converts a leaf into a decision node.
∆G = G(Xa) − G(Xb) defines the difference between the two best attributes Xa (highest
G(.)) and Xb (second highest G(.)). If ∆G > ε, then Xa is considered the attribute with
the highest value of G(.). At this stage, the split occurs on attribute Xa and the leaf
is converted into a decision node. The major drawback of this technique is that there are
cases in which the information gains of the two best attributes differ very little and
both are equally good candidates for the split. At this stage, a tie condition occurs and
the process gets stuck. Resolving the tie is a computation-intensive task that increases
the processing time and decreases the overall accuracy of the decision tree. For this
reason, the technique is considered inappropriate for resource-constrained WBAN.
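The split test ∆G > ε can be sketched as follows. The sketch assumes that the information gain of each candidate attribute has already been estimated from the samples seen at the leaf; the attribute names and numbers in the usage note are hypothetical.

```python
from math import log, sqrt


def hoeffding_bound(value_range, delta, n):
    """epsilon of equation 2.1 for range R, confidence 1 - delta, N samples."""
    return sqrt(value_range ** 2 * log(1.0 / delta) / (2.0 * n))


def choose_split(gains, value_range, delta, n):
    """Return the attribute to split on, or None if more samples are needed.

    `gains` maps each candidate attribute to its estimated information gain.
    """
    ranked = sorted(gains.items(), key=lambda item: item[1], reverse=True)
    (best, g_best), (_, g_second) = ranked[0], ranked[1]
    epsilon = hoeffding_bound(value_range, delta, n)
    return best if g_best - g_second > epsilon else None
```

With gains of 0.40 and 0.15 for the two best attributes, R = 1 and δ = 10^-7, the leaf is not split after 50 samples because ε is still larger than ∆G = 0.25, but it is split after 1000 samples. When the two gains are nearly equal, ∆G may never exceed ε, which is the tie condition described above.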
2.4.3 Very Fast Decision Tree based on Predefined Threshold (VFDT-τ)
To overcome the limitation of VFDT, Hilton et al. [34] proposed a fixed tie-breaking
threshold τ. Whenever the difference between the two information gains is very small, τ
acts as a quick decisive parameter to resolve the tie condition. The node splitting
occurs on the current best attribute regardless of how good the second best attribute
might be. The value of τ is chosen randomly and remains fixed throughout the process. An
excessive number of tie-breaking conditions reduces the performance of VFDT-τ
significantly on noisy and complex streaming data, even with the use of parameter τ.
VFDT-τ does not support pruning, as the tree size itself is very small. While improving
the accuracy, however, the tree size explodes. Therefore, a suitable pruning mechanism is
required at this stage.
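A minimal sketch of the tie-breaking rule follows. It uses the usual VFDT formulation, in which a tie is declared once ε itself has fallen below τ; that formulation, the function name and the numbers used with it are assumptions of this example rather than details given in [34].

```python
def choose_split_with_tie_break(gains, epsilon, tau):
    """Split on the best attribute if it clearly wins (delta_g > epsilon),
    or if epsilon has dropped below the fixed tie-breaking threshold tau."""
    ranked = sorted(gains.items(), key=lambda item: item[1], reverse=True)
    (best, g_best), (_, g_second) = ranked[0], ranked[1]
    delta_g = g_best - g_second
    if delta_g > epsilon or epsilon < tau:
        return best
    return None  # keep collecting samples at this leaf
```

Because τ is fixed, every leaf whose ε falls below τ is split even when the two best attributes are indistinguishable, which is consistent with the tree growth on noisy data noted above.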
2.4.4 Optimized Very Fast Decision Tree (OVFDT)
To overcome the limitation of a fixed tie-breaking threshold, Yang et al. [35] proposed
an algorithm based on an adaptive tie-breaking threshold computed directly from the mean
of the Hoeffding Bound (HB). The value of the HB mean fluctuates intensively as the noise
percentage increases, thus reducing the accuracy of attack classification.
2.4.5 Concept Adaptive VFDT (CVFDT)
CVFDT [36] maintains two trees simultaneously in memory. The tree with the shortest
depth is retained and the other one is discarded. The main drawback of this technique is that
it consumes more memory and time to maintain two trees. Also CVFDT does not handle
noisy data efficiently.
2.5 Effect of Noise in Streaming Data
Noisy data is meaningless or extraneous data that makes the identification of data
patterns more difficult. As the noise in a data stream increases, the number of outliers
also increases. In sensor networks, noise arises from changes in system behavior and from
malicious activity in the network. There are two major sources of noise [37]:
Error: An error is defined as a noisy value coming from an erroneous sensor. Outliers
caused by such errors have a very high probability of occurrence.
Event: An event refers to a particular phenomenon which, in this case, is an attack occur-
rence event that changes the state of a system. Outliers caused by events occur with small
probability but they are lasting and modify the historical patterns of sensor data.
In both cases, the presence of outliers due to noisy data decreases the attack detection
accuracy and increases the false alarm rate. At the same time, the tree size increases,
which results in added memory consumption. The goal is to detect and remove these
outliers from the sensor data in order to ensure high attack detection accuracy while
keeping the resource consumption of the network to a minimum. Figure 2.5 shows the
detrimental effect of noisy data on classification accuracy (figure 2.5(a)) and tree size
(figure 2.5(b)). The experimental data is synthetic and is supplied as input to the
VFDT-τ algorithm with τ=0.05 [36]. The error rate significantly affects both the accuracy
and the size of the decision tree when the number of instances increases manifold. Even a
small error rate increases the tree size several times.
(a) Effect of Noise on Accuracy (b) Effect of Noise on Tree Size
Figure 2.5: Effect of Noisy Data
2.6 Traceback Techniques for Distributed Denial of Service (DDoS) Attack
In a DDoS attack, the key issue lies in detecting the attack and invoking the appropriate
traceback mechanism. Several techniques are available in the literature for detecting
DDoS attacks in sensor networks, as discussed in section 2.3.1, but only a very limited
amount of work is found on traceback mechanisms. Traceback requires reconstructing the
attack path and identifying the source of the DDoS attack [46]. Traceback techniques
proposed for conventional IP-based networks [38], [39], [40], [41] are not directly
applicable to the resource-constrained WBAN environment due to additional overhead
requirements and high convergence time. Similarly, several traceback techniques are also
available for Mobile Ad-hoc Networks (MANET) [42] and Wireless Sensor Networks (WSN) [43]
that overcome the limitation of overhead, but at the cost of additional processing and
storage requirements.
2.6.1 Existing Traceback Techniques for Standard IP- Based Networks
There are four major techniques for tackling the traceback problem in standard IP-based
networks: hash-based traceback [39], Internet Control Message Protocol (ICMP) based
traceback [38], Probabilistic Packet Marking (PPM) [40] and Deterministic Packet Marking
(DPM) [41]. However, these techniques are not appropriate for deployment in the
resource-constrained WBAN environment because they require extra computation and
implementation resources. These techniques are discussed as follows:
Hash-Based Traceback Techniques
Hash-based IP traceback generates audit trails for network traffic and can identify the
source of a single packet recently delivered by the network. These techniques require an
adequate amount of memory and storage space to record and transfer the network audit
trails. The implementation of hash-based traceback in WBAN is therefore of explanatory
value only and is not practical. These techniques are only suitable for traceback in
conventional IP-based networks, where storage space is sufficient for logging
packets [39].
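The audit-trail idea can be illustrated with a toy model in which every router stores a digest of each packet it forwards and the victim walks upstream by querying those digests. The linear topology, the use of SHA-256 over the whole packet and the plain set used as the digest store are simplifying assumptions of this sketch and do not describe the scheme of [39].

```python
import hashlib


class Router:
    """Hypothetical router that logs a digest of every forwarded packet."""

    def __init__(self, name, upstream=None):
        self.name = name
        self.upstream = upstream  # next router towards the packet source
        self.digests = set()      # audit trail; grows with forwarded traffic

    def forward(self, packet: bytes) -> None:
        self.digests.add(hashlib.sha256(packet).digest())

    def has_seen(self, packet: bytes) -> bool:
        return hashlib.sha256(packet).digest() in self.digests


def trace(last_hop, packet):
    """Walk upstream from the victim's router while the packet was logged."""
    path = []
    router = last_hop
    while router is not None and router.has_seen(packet):
        path.append(router.name)
        router = router.upstream
    return path
```

The `digests` set is the storage cost referred to above: it grows with every forwarded packet, which is what makes the approach impractical on memory-constrained WBAN nodes.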
ICMP-Based Traceback Techniques
ICMP traceback techniques utilize ICMP packets that contain information about the
previous and the following routers and send this information to the source and the
destination of the original packet. Using these additional ICMP packets, the victim node
can easily reconstruct the attack path. However, this technique is not appropriate for a
resource-constrained WBAN because it requires the WBAN to make use of the full TCP/IP
protocol stack. In addition, maintaining extra ICMP packets throughout the transmission
and traceback requires a huge amount of memory and computational resources [38].
Probabilistic Packet Marking (PPM) Techniques
In PPM techniques, each router not only forwards the packet but also marks individual
packets with a low marking probability. This mark is a unique identifier corresponding to
that specific router. Compared to other techniques, PPM has a small implementation and
management overhead due to the probabilistic nature of the algorithm. However, its
computational overhead and convergence time are high; the convergence time is the time
taken by the victim node to reconstruct the attack path by collecting at least one marked
packet from each intermediate router. This limits the usefulness of PPM for fast
traceback in the WBAN environment [40].
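The notion of convergence time can be made concrete with a small simulation of node-sampling PPM, in which each router overwrites the packet's single mark with probability p. The path, the marking probability and the overwrite behavior are assumptions of this sketch and not parameters taken from [40].

```python
import random


def mark_packet(path, p, rng):
    """Return the mark carried by a packet after traversing `path`.

    `path` is ordered from the attacker's side to the victim. Each router
    overwrites the mark with its own identifier with probability p, so the
    victim sees the mark of the last router that chose to mark.
    """
    mark = None
    for router in path:
        if rng.random() < p:
            mark = router
    return mark


def packets_until_path_known(path, p, rng):
    """Count packets received until every router on the path has been seen."""
    seen, count = set(), 0
    while len(seen) < len(path):
        count += 1
        mark = mark_packet(path, p, rng)
        if mark is not None:
            seen.add(mark)
    return count
```

Marks from routers far from the victim are frequently overwritten by routers closer to it, so the number of packets the victim must collect grows quickly with the path length. That packet count is the convergence time discussed above.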
Deterministic Packet Marking (DPM) Techniques
Like PPM, DPM also requires each router to mark individual packets. Moreover, the DPM
approach requires all the internet routers to be updated for every packet marking, which
in turn requires a huge number of spare bits in IP packets. Therefore, the scalability of
DPM is very limited. It also requires a huge amount of storage space at the routers for
packet logging. For this reason, DPM is not a good solution for traceback in WBAN [41].
All of the traceback techniques discussed above are for conventional IP-based networks. A
number of traceback approaches also exist in the literature that are proposed
specifically for Mobile Ad-Hoc Networks (MANETs).
2.6.2 Traceback techniques for Mobile Ad-hoc Networks
Jin et al. [42] proposed a traceback technique based on node sampling in which the
complete network is split into various zones and each node knows the ID of the zone to
which it belongs. Upon the arrival of a packet, each node first writes its zone ID into
the packet with a certain probability and then passes it on. Upon the detection of a DDoS
attack, the victim node reconstructs the complete path by collecting a sufficient number
of these marked packets. Analysis shows that the reconstruction process of this technique
is not accurate enough to trace back the source of an attack efficiently.
Things et al. [45] proposed a scheme for MANET named ICMP traceback with Cumulative
Path (CP). This scheme conceals the complete attack path information in the ICMP
traceback CP message. However, this scheme needs to overload some fields of the IP header
and thus requires a heavy protocol stack, which is unavailable in the
resource-constrained WBAN environment.
Bo Chao et al. [44] proposed a traceback scheme specifically for the hierarchical WSN
environment. The proposed scheme is based on a two-layer labeling technique and a Marking
Probability Distribution Function (MPDF) that, for simplicity, assigns a fixed marking
probability to each node. Using a fixed marking probability requires a large number of
packets for attack path reconstruction, which results in high convergence time. Because
the packets are overwritten with the same marking probability by all routers, the marking
is unfair. The proposed scheme is evaluated only by qualitative comparison; no
quantitative comparison is done.
2.7 Conclusion
A Distributed Denial of Service (DDoS) attack is defined as an explicit attempt by an
attacker to exhaust the resources of a victim node. Multiple nodes are deployed to launch
an attack by sending a stream of packets towards the victim, thus consuming the key
resources of the victim node and making them unavailable to legitimate nodes. These
resources mainly include the network bandwidth, computing power and memory resources.
To overcome the effects of DDoS attacks in the cloud-assisted WBAN environment, various
techniques have been studied and explored during this research according to the
classification and taxonomy of DDoS attacks. Among these, data mining classification
techniques have proven themselves a valuable tool for identifying misbehaving nodes and
thus for detecting DDoS attacks. The simulation result shows that the existing data
mining techniques for DDoS attack detection have high complexity and low attack detection
accuracy when applied to high-speed streaming data. Therefore, stream mining techniques
were selected and explored as a solution for the detection of DDoS attacks in the
cloud-assisted WBAN environment. The detailed analysis shows that the mining accuracy of
stream data is affected by the noise present in the data. Therefore, handling the noise
significantly improves the accuracy. Finally, for the prevention of distributed denial of
service attacks, traceback techniques in standard IP-based networks and mobile ad-hoc
networks were studied and explored. The study shows that the traceback techniques in
standard IP-based networks are not appropriate for the cloud-assisted WBAN environment
due to additional computation and implementation resource requirements. Further,
traceback techniques for mobile ad-hoc networks have high convergence time and impose
extra overhead on nodes.
The detailed analysis of DDoS attack detection and prevention techniques shows that the
topological aspects of the WBAN play an important role and must be incorporated into the
proposed attack detection and prevention techniques in order to achieve better results.
Chapter 3
PROPOSED DDoS ATTACK DETECTION AND PREVENTION
FRAMEWORK FOR CLOUD-ASSISTED WBAN
A Distributed Denial of Service (DDoS) attack aims to generate a huge volume of attack
traffic towards a victim node in order to deplete the energy resources of the victim node
rapidly. The attacker launches an attack by compromising a legitimate sensor node and
participating in the network operation with malicious intent. As we are dealing with a
critical health monitoring application of wireless body area networks, the effect of such
an attack can be disastrous to the network operations. The distributed nature of DDoS
attacks in the WBAN environment demands innovative solutions in order to successfully
detect an attack [18] [19].
In this chapter, we first propose a cloud-assisted WBAN architecture and discuss its
modules in detail. Secondly, based on the proposed architecture, a framework is presented
for the detection and prevention of DDoS attacks in the cloud-assisted WBAN environment.
Based on the framework, two techniques are proposed:
1. A distributed attack detection technique is proposed in Chapter 4 that efficiently
detects DDoS attacks in these networks.
2. A traceback technique is proposed in Chapter 6 that efficiently identifies the source
of an attack and blocks the attacker.
In section 3.1, we discuss the requirements for detecting DDoS attacks in the
cloud-assisted WBAN environment. In section 3.2, the proposed cloud-assisted WBAN
architecture is discussed in detail along with its modules. The proposed framework for
detecting and preventing DDoS attacks in the cloud-assisted WBAN environment is presented
in section 3.3. Finally, the concluding remarks are given in section 3.4.
3.1 Requirements for DDoS Attack Detection in Cloud-Assisted WBAN Environment
Due to the wireless communication medium and the limited resources of the cloud-assisted
WBAN environment, it becomes extremely difficult to detect and trace back a DDoS attack
in these networks. The attack class observes the traffic flow in the network and marks
the nodes that are actively taking part in transmitting and receiving data packets. The
marked nodes are named critical nodes and are considered the targets of a DDoS attack. In
this research, we refer to these critical nodes as target or victim nodes.
The nodes of the attack class launch a DDoS attack towards the victim nodes from multiple
locations in the network in order to deplete the limited energy resources of the victim
nodes and make them unavailable to legitimate users. The wireless nature of the
cloud-assisted WBAN allows multiple entry points, which makes it more difficult to detect
and trace back the attack in these networks. The attack class intending to initiate a
DDoS attack can be categorized into the following classes:
1. Injected Sensor Nodes: This class contains either regular sensor nodes with normal
sensing powers or more powerful nodes like a base station.
2. Compromised Sensor Nodes: This class contains the legitimate sensor nodes that are
compromised by attack nodes in order to disrupt the normal operation of the network.
3. Laptop Class Nodes: This class consists of sensor nodes with additional communication
resources for transmitting and receiving data packets. In addition, the sensor nodes of
this class have more power and computational resources.
Figure 3.1: Flat Topology
The DDoS attack model containing the attacker classes is shown in Figure 3.1. The base station,
regular sensor nodes, and aggregate nodes are the legitimate nodes of the network, whereas
the compromised nodes, laptop class nodes, and malicious nodes are the attack nodes that
launch attacks towards the legitimate nodes.
3.2 Proposed Cloud-Assisted WBAN Architecture
The integration of WBAN and cloud computing technology provides a platform to create a
new digital paradigm with leading features called cloud-assisted WBAN. In this section, the
proposed cloud-assisted WBAN architecture is discussed.
3.2.1 Formulation of Cloud-Assisted WBAN Architecture
Before formulating the cloud-assisted WBAN architecture, there is a need to understand the
factors that contribute towards the efficient modeling of cloud-assisted WBAN architecture.
These factors include:
Sensor Data
Sensor data can be seen as a huge volume of high-speed, real-time stream data collected
continuously from WBAN sensors and transferred to the cloud via a base station for further
processing and storage. Because we are dealing with data mining techniques that
identify relationships between data attributes, the attributes can be either homogeneous
or heterogeneous [47]. With homogeneous attributes, each node senses a single attribute, for
example, blood pressure only. With heterogeneous attributes, each node is fitted with
multiple sensors that allow it to sense multiple attributes at one time, for example, ECG,
temperature, and EEG. In the proposed architecture, WBAN sensors have homogeneous attributes,
i.e., each can sense only a single value (e.g., ECG sensors and EEG sensors), whereas the
WBAN network as a whole is heterogeneous.
Network Topology
Network topology is defined as a data delivery model which gives the routing path for trans-
fer of data from the sensor node to the base station. The network topology is classified into
three main categories depending upon the flow of traffic in the network. These include:
1. Flat Topology: In a flat topology, the readings of each sensor node are transmitted
directly to the base station using a single-hop communication mode [48]. The flow of
traffic from sensor node to base station is expressed as $t = (t_{(s,BS)})$ and is illustrated
in Figure 3.2, where $t$ is the traffic from sensor node $s$ to base station $BS$.
Figure 3.2: Flat Topology
2. Cluster-based Topology: In a cluster-based topology, each cluster contains a few sensor
nodes and an Aggregate Node (AN) that acts as the Cluster Head (CH). Each aggregate
node sends data to the base station directly. Aggregate nodes are also sensor nodes
but with additional memory, computation, and communication resources.
These nodes perform data aggregation and control a set of predefined clusters
of sensor nodes in the network. Finally, the aggregated data is forwarded to the base
station either directly or via other aggregate nodes. The data aggregation topology is
deployed for the proposed cloud-assisted WBAN architecture and illustrated in
Figure 3.3 [48]. The flow of traffic from sensor node to base station is represented as
$t = (t_{(s,AN(s))}, t_{(AN(s),AN(s))}, t_{(AN(s),BS)})$, where $t_{(s,AN(s))}$ is the traffic from a sensor node
to its aggregate node, $t_{(AN(s),AN(s))}$ is the traffic from the aggregate node of cluster A to
the aggregate node of cluster B, and $t_{(AN(s),BS)}$ is the traffic from an aggregate node to the base
station.
3. Data Aggregation Topology: In this topology, individual sensor readings travel from
different sensors to the base station via a well-defined tree of interconnected
intermediary nodes called aggregate nodes. This helps to reduce the total traffic flow in
the network and minimizes the number of data transmissions, which saves sensor
energy [48]. The data aggregation topology is shown in Figure 3.4 and expressed as
$t = t_{s_1}, t_{s_2}, \ldots, t_{(s_N,BS)}$, where $t_{(s_N,BS)}$ is the total number of intermediary nodes to
reach the base station.
Figure 3.3: Cluster-based Topology
Figure 3.4: Data Aggregation Topology
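The three topologies above differ only in the sequence of hops a reading takes to reach the base station. The following sketch enumerates those hops as (sender, receiver) pairs. The function names, the node labels, and the dictionary-based routing tables are illustrative assumptions and do not come from the thesis.

```python
def flat_path(sensor):
    """Flat topology: a single hop from the sensor to the base station."""
    return [(sensor, "BS")]

def cluster_path(sensor, head_of, relay_heads=()):
    """Cluster-based topology: sensor -> own aggregate node -> optional
    other aggregate nodes -> base station."""
    hops = [sensor, head_of[sensor], *relay_heads, "BS"]
    return list(zip(hops, hops[1:]))

def aggregation_path(sensor, parent_of):
    """Data aggregation topology: follow parent links in the aggregation
    tree until the base station is reached."""
    hops = [sensor]
    while hops[-1] != "BS":
        hops.append(parent_of[hops[-1]])
    return list(zip(hops, hops[1:]))

# Hypothetical routing tables for a tiny network.
head_of = {"s1": "AN_A"}
parent_of = {"s1": "a1", "a1": "a2", "a2": "BS"}
```

For example, `cluster_path("s1", head_of, relay_heads=("AN_B",))` yields the three traffic components named in the cluster-based expression: sensor to aggregate node, aggregate node to aggregate node, and aggregate node to base station.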
Processing Architecture
Classification techniques process data either in a centralized or distributed manner [47].
Since we are dealing with resource-constrained WBAN nodes, choosing an appropriate pro-
cessing architecture is an important concern. Implementing a centralized approach in a
WBAN network causes a huge amount of data flow and communication towards the base
station, which can create a bottleneck and waste communication bandwidth. On the other
hand, the distributed approach has the advantage of performing classification locally at ag-
gregate nodes and then passing on the results to upper nodes (base station). Since only the
classification result will be forwarded to upper node (base station), the overall energy con-
sumption for transmission can be significantly reduced. A novel classification technique is
proposed for attack detection (discussed in Chapter 4).
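The bandwidth argument can be made concrete with a rough accounting sketch. All quantities below (number of sensors, samples per window, byte sizes) are made-up illustrative values, and the two functions are hypothetical helpers; they only count the bytes that reach the base station under each approach.

```python
def centralized_cost(sensors, samples_per_window, sample_bytes):
    """Bytes reaching the base station when every raw sample is forwarded."""
    return sensors * samples_per_window * sample_bytes

def distributed_cost(aggregate_nodes, result_bytes):
    """Bytes reaching the base station when each aggregate node forwards
    only its classification result for the window."""
    return aggregate_nodes * result_bytes

# Illustrative numbers only: 8 sensors, 250 two-byte samples per window,
# one aggregate node sending a one-byte attack/non-attack result.
raw = centralized_cost(sensors=8, samples_per_window=250, sample_bytes=2)
summary = distributed_cost(aggregate_nodes=1, result_bytes=1)
```

Under these assumed numbers the centralized approach delivers 4000 bytes per window to the base station while the distributed approach delivers 1 byte, which is the kind of reduction the paragraph above refers to.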
Selection of Data Mining Technique
Based on the general data mining approaches discussed in Chapter 2, we have selected
the decision tree as the classification technique for detecting a DDoS attack in a cloud-assisted
WBAN environment. In Chapter 4, a decision-tree-based classification technique is proposed
that is lightweight, capable of incremental learning during online mining, and effective
for stream mining classification [1].
Application Area
Because we are dealing with a patient health monitoring application, security is of utmost
importance. The application area of the proposed cloud-assisted WBAN architecture is a patient
e-Health monitoring system.
Evaluation Method
Evaluations can be done through analytical modeling, simulation, or the real-time deployment
of sensors. Among these, simulation and real-time deployment are the most widely used
and effective approaches for designing and testing a proposed solution in terms of accuracy,
computation, and communication complexity. To evaluate the performance of the proposed
architecture, both simulation and a real-time test-bed are used. For the simulation
experiments, Network Simulator (NS-2) is used, and for the real-time WBAN test-bed environment,
real sensor nodes are deployed.
Data Sets
Data sets are used to validate the proposed scheme experimentally. They can be either syn-
thetic or real. Which data set is more appropriate depends upon the application scenario and
criticality of the situation and results. For this research, both synthetic and real-time datasets
are used.
3.2.2 Proposed Cloud-assisted WBAN Architecture
The proposed architecture integrates a WBAN with cloud computing technology in order to
store and process the data collected by WBAN nodes for patient e-Health monitoring. The
proposed system architecture is scalable and is able to store the enormous amount of data
generated by WBAN sensors. Since the data from these sensors are highly sensitive and
vulnerable to many attacks, a security mechanism is proposed in Chapter 4 to ensure data
availability at all times. DDoS is a major attack that affects the availability of a patient’s
data to data access requesters [49]. Figure 3.5 shows our proposed cloud-assisted WBAN
architecture. The green dotted circle shows the entities that are the victims of DDoS attacks.
Figure 3.5: Proposed cloud-assisted WBAN Architecture
Based on the cloud-assisted WBAN architecture, Figure 3.6 shows the sequence of steps
from transferring a patient’s data to the cloud until its supervision by a healthcare
professional.
The proposed cloud-assisted WBAN architecture consists of the following entities:
1. WBAN Sensor Network: The WBAN is composed of a finite number of sensor
nodes such that $S = \{S_1, S_2, \ldots, S_n\}$, where $|S| = n$ and $n$ is the number of nodes in
the network. The limited energy resources of these $n$ sensor nodes distinguish the
modeling and detection of DDoS attacks in them. The adversary class, which refers
to the set of malicious nodes $M = \{M_1, M_2, \ldots, M_k\}$, where $|M| = k \le n$ and $k$
is the number of malicious nodes, monitors the network traffic flow and identifies the
nodes that are actively taking part in the transmission and reception of data packets.
The identified sensor nodes are marked as critical nodes and are likely targets for a
DDoS attack. These critical nodes are referred to as victim nodes and denoted by
$V = \{V_1, \ldots, V_r\}$, where $V \subset S$, i.e., each critical node in the set $V$ is a target of a DDoS
attack, which implies that $|V| = r \ll n$. In a wireless body area network environment,
a DDoS attack occurs at two points.
Figure 3.6: Sequence of Operations from Patient to Healthcare Professional
Aggregating Nodes: The network traffic from individual sensor nodes toward an aggregate
node can be defined as $t = (t_{(s,AN(s))})$, showing a one-step data transmission
to the aggregate node. Compared to a WSN, a WBAN consists of a small
number of sensor nodes. Therefore, defining a single intermediary aggregate node is
sufficient. Because the aggregate nodes are close to the base station, they
receive more traffic inflows. This makes them critical nodes that are more
vulnerable to DDoS attacks [48].
Base Station: The network traffic from an individual sensor node to the base station
through aggregate nodes is defined as $t = (t_{(s,AN(s))}, t_{(AN(s),AN(s))}, t_{(AN(s),BS)})$,
where $t_{(s,AN(s))}$ is the traffic from a sensor node to its aggregate node, $t_{(AN(s),AN(s))}$ is the
traffic from the aggregate node of cluster A to the aggregate node of cluster B, and
$t_{(AN(s),BS)}$ is the traffic from an aggregate node to the base station. The base station is the
control center of all the activities of the sensor network [48].
2. Cloud Service Provider: The cloud service provider is responsible for providing data
storage facilities for the WBAN network. These consist of the health cloud storage
and e-Health service provider. The health cloud storage is responsible for storing
and retrieving data upon the request of authorized users (pharmacists, doctors, health
workers, etc.). The e-Health service provider classifies a patient’s health record on the
basis of patient attributes and transfers it to the e-Health cloud storage for permanent
storage.
3. Attack Detection Node: In the proposed architecture, we deploy an attack detection
node at the victim side with the aim of detecting attacks in the cloud environment.
The main purpose of deploying this attack detection node is to minimize the direct
attack traffic flow towards the victim. The link between the attack detection node and
the sensor network, as well as other networks, is secured using the Secure Socket Layer/
Transport Layer Security (SSL/TLS) protocol in an attempt to prevent
man-in-the-middle attacks [50].
4. Data Requesters: These include healthcare professionals (doctors, nurses, health
workers) that use application specific services (SaaS) to access a patient’s stored data.
As a result, the cloud storage provider connects with the e-Health service provider to
verify the data requester.
5. Patients: Patients must be registered with an e-Health care service provider before
using cloud services. The patient is accountable to specify an attribute-based access
policy in an attempt to receive e-Health care services.
According to Figure 3.5, the operation of the proposed cloud-assisted WBAN architecture can be
divided into five steps. At each step, we discuss how the attacker launches a DDoS attack
against the victim nodes [17].
Step 1 (Patient health data collection): The WBAN network collects health information
from different body sensors attached to the patient’s body and transmits the information to the
base station/patient’s gateway via aggregate nodes. At this point, a DDoS attack can occur
against two entities.
Aggregate nodes: The attacker compromises the legitimate sensor nodes and launches
a DDoS attack by bringing about adversary class nodes in the network in an attempt to
originate an attack from various points in the network toward the aggregate nodes.
Base station: Because the base station is the control center for all the activities of the
sensor network, an adversary class can generate a DDoS attack towards the base station to
exhaust the energy and, therefore, destroy the whole WBAN sensor network. At this stage,
a DDoS attack detection approach is needed to successfully detect an attack by classifying
the malicious packet. The detection approach should be lightweight and consume little of
the memory resources of the low-power WBAN sensors.
Step 2 (Secure Data transfer to e-Health service provider): In this step, the patient’s
collected health data is transferred to the e-Health care service provider. At this stage, an
attacker may compromise the base stations and launch a DDoS attack against the e-Health
care service provider. Simultaneously, the e-Health care service provider can be a victim
of other outside attackers. In order to protect the e-Health service provider from a direct
DDoS attack, an attack detection node is deployed at the victim (e-Health service provider)
for the purpose of attack detection. The communication channel between the base station
and attack detection node is secured using the SSL/TLS security protocol in order to provide
data confidentiality and integrity. The process and flow chart of the attack detection node are
presented in Figure 3.7.
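As a minimal sketch of how the base-station side of such a secured channel could be configured, the following uses the Python standard library `ssl` module. The thesis does not specify an implementation; the helper name `make_client_context` and the choice of TLS 1.2 as the minimum version are assumptions made for illustration.

```python
import ssl

def make_client_context(ca_file=None):
    """Build a TLS client context that verifies the peer certificate.

    `ca_file` is an optional path to a CA bundle; when omitted, the
    system default trust store is used.
    """
    context = ssl.create_default_context(cafile=ca_file)
    # Refuse legacy protocol versions (assumed policy, not from the thesis).
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context

context = make_client_context()
# A base station would then wrap its socket before sending data, for example:
#   with socket.create_connection((host, port)) as raw:
#       with context.wrap_socket(raw, server_hostname=host) as tls:
#           tls.sendall(payload)
```

The default context enables certificate verification and hostname checking, which is what guards the channel against a man-in-the-middle attacker presenting a forged certificate.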
Step 3 (Patient health data processing at e-Health service provider): After receiving
the patient data securely, the e-Health service provider classifies the patient’s health records
based on the attributes chosen by the patient. It then sets different authentication levels based
on the role of the data requesters using a Role Based Access Control (RBAC) policy [51].
The data requesters include doctors, nurses, and health workers.
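The role-based filtering in Step 3 can be sketched as a lookup from role to permitted attribute set. The role names follow the text, but the specific attributes granted to each role and the function `allowed_attributes` are hypothetical; in the proposed system the attribute policy is chosen by the patient.

```python
# Hypothetical policy: which health-record attributes each role may read.
ROLE_ATTRIBUTES = {
    "doctor": {"ecg", "eeg", "blood_pressure", "temperature"},
    "nurse": {"blood_pressure", "temperature"},
    "health_worker": {"temperature"},
}

def allowed_attributes(role, requested):
    """Return the subset of `requested` attributes that `role` may access.

    Unknown roles receive an empty set, i.e. access is denied by default.
    """
    permitted = ROLE_ATTRIBUTES.get(role, set())
    return set(requested) & permitted

nurse_view = allowed_attributes("nurse", ["ecg", "temperature"])
```

Denying by default for unknown roles keeps an unregistered requester from reading any attribute, whatever it asks for.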
Step 4 (Transfer of patient data to cloud storage): After classification, the data is securely
transferred to the e-Health cloud storage server to make it available to data requesters.
The communication channel between the e-Health service provider and cloud storage is
secured using the Secure Socket Layer/Secure Shell (SSL/SSH) protocol [50]. At this point,
the cloud storage server may also become a victim of a DDoS attack launched by adversary
class users in order to overwhelm its resources. The same attack detection node can be
deployed here to detect the attack.
Figure 3.7: Workflow of Attack Detection Node at Cloud
Step 5 (Data Requester Access): To access a patient’s data, a requester forwards the
request to the health cloud storage with their identity and requests the corresponding attribute
sets of the patient. In return, the cloud storage provider communicates with the e-Health
service provider to authenticate the requester. The standard security protocol (SSH/TLS) is
used to provide the requester’s authentication [17] [50].
Although the proposed architecture presents the complete cloud-WBAN environment,
the key focus of this research is on detecting and preventing DDoS attacks within the WBAN
domain. In the next section, a framework is proposed for the detection and prevention of
DDoS attacks in the WBAN environment.
3.3 Proposed Framework for Detecting and Preventing DDoS Attack
In this section, a framework is presented for the detection and prevention of DDoS
attacks in cloud-assisted WBAN. It is based on the conceptual architecture presented in
Section 3.2. The proposed framework works as follows:
1. Capture the incoming stream of packets originating from the WBAN sensor network.
2. Obtain the packet features of the current traffic flow.
Figure 3.8: Proposed Framework for Detecting and Preventing DDoS Attacks
3. Calculate statistical features from the extracted packet features and pass them to the
attack classification module.
4. The DDoS attack detection module determines whether an attack has occurred based on the
statistical features.
5. The attack detection module uses the proposed algorithm for detecting an attack. The
proposed attack detection module and classification algorithm are discussed
in Chapter 4.
6. If no attack is detected, the traffic is forwarded to the immediate upper node.
7. If an attack is detected, then
(a) The victim node calls the traceback module.
(b) Within the traceback module, the packet marking module marks each packet
passing through each node. The proposed packet marking approach is discussed in
Chapter 6.
(c) The victim node reconstructs the attack path based on the packet marking. The proposed
attack path reconstruction algorithms are presented in Chapter 6.
(d) After successful path reconstruction, the identified attacker is blocked to stop
the attack.
8. Go to Step 1.
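The control flow of the steps above can be sketched as a single function that processes one captured window of packets. The feature extraction, classification, traceback, blocking, and forwarding stages are passed in as callables; all of these names and the stub implementations in the usage example are placeholders, since the actual modules are defined in Chapters 4 and 6.

```python
def process_window(packets, extract, classify, traceback, block, forward):
    """Run one pass of the detect-and-prevent loop on a window of packets."""
    features = extract(packets)        # steps 2-3: packet and statistical features
    if classify(features):             # steps 4-5: True means an attack was detected
        attacker = traceback(packets)  # steps 7(a)-(c): marking and path reconstruction
        block(attacker)                # step 7(d): block the identified attacker
        return "blocked"
    forward(packets)                   # step 6: pass traffic to the upper node
    return "forwarded"

# Usage with trivial stand-in modules (illustrative only).
blocked, forwarded = [], []
result = process_window(
    packets=[{"src": "m1"}] * 5,
    extract=lambda p: {"count": len(p)},
    classify=lambda f: f["count"] > 3,
    traceback=lambda p: p[0]["src"],
    block=blocked.append,
    forward=forwarded.extend,
)
```

Step 8 corresponds to calling `process_window` again on the next captured window.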
3.4 Conclusion
Wireless Body Area Networks (WBANs) are emerging as a promising technology
with considerable potential for improving patient health care services. The integration
of WBAN and cloud computing technology provides a platform to create a new
digital paradigm with leading features called cloud-assisted WBAN. The foremost concern
of cloud-assisted WBAN is the security and privacy of data, whether collected and stored by
WBAN sensors or transmitted to the cloud over an insecure network. Among these concerns,
data availability is the most pressing security issue. The major threat to data availability is the
distributed denial of service (DDoS) attack, normally launched from various distributed locations.
To ensure that patient data is available at all times, this chapter proposes a cloud-assisted
WBAN architecture. Based on the proposed architecture, a victim-based DDoS attack
detection and prevention framework is then proposed that receives the incoming stream
of sensor data and classifies it into attack and non-attack data. After successful attack
detection, a traceback module reconstructs the attack path and identifies the attacker. The attack
detection module is discussed in Chapter 4 and the traceback module in Chapter 6.
Chapter 4
EVFDT: An Enhanced Very Fast Decision Tree Algorithm for Detecting
DDoS Attack in Cloud-Assisted WBAN
Cloud-assisted WBANs for patient health monitoring have recently attracted researchers’
attention. Besides other open issues in the WBAN environment, such as energy efficiency,
Quality of Service (QoS), and standardization, security and privacy are key issues that need
special attention. Among the security issues, data availability is the most pressing.
The DDoS attack is one of the most powerful attacks on the availability of patients’
health data and the services of health care professionals. A DDoS attack severely affects the
capacity and performance of a WBAN if not handled in a timely and appropriate
manner [14] [1].
Detecting a DDoS attack in cloud-assisted WBAN requires a defensive approach
that understands the network semantics and the flow of traffic in the network. When
a victim node is flooded with more packets than it can process,
the excess must be dropped. A packet-based dropping strategy helps in distinguishing
legitimate traffic from flood traffic and is used to limit the impact of attack traffic on
legitimate users. Observation of the network traffic flow shows that no regular structure
of patterns exists in the network; therefore, statistical pattern identification techniques
are needed. Integrating existing attack detection and defense mechanisms into a
resource-constrained WBAN increases the computation and communication cost [52] [?].
The network resources are not enough to mitigate the huge amount of traffic generated
by a DDoS attack [57]. Therefore, there is a need for an approach that is lightweight and
capable of handling real-time streaming data. With this in mind, a number of stream mining
techniques were studied and explored in Section 2.4. Very Fast Decision Tree
(VFDT) [53], a stream mining technique, has proved to be the most prevalent due to
the simplicity and interpretability of its rules and is thus considered more appropriate for
low-power sensor networks [54]. The underlying reasons for the selection of VFDT are:
1. It is lightweight, i.e., it does not require the dataset to be stored in memory, which makes
it suitable for a resource-constrained WBAN.
2. It can progressively build a decision tree from scratch, which helps in detecting a DDoS
attack at any stage.
3. Each time a new segment of sensor data arrives, a test-and-train process is performed
on it, keeping the stored tree up to date.
4. It does not require reading the full dataset, yet it adjusts the decision tree according to the newly
arriving and gathered statistical attributes, thus consuming less memory.
5. It is appropriate for the huge amount of non-stationary streaming data obtained from
WBAN sensors.
6. It provides a transparent learning process.
6. Provides a transparent learning process.
These features make VFDT a suitable candidate for implementing an autonomous decision
maker for DDoS attack detection in cloud-assisted WBAN.
In this chapter, the proposed DDoS attack detection system is presented. Further, an
improvement of VFDT [53], namely Enhanced VFDT (EVFDT), is proposed, which differs
from the existing algorithms in terms of classification accuracy, tree size, computational cost,
memory, and time. Our aim is to build a decision-tree-based classification algorithm capable
of handling noisy data and detecting a DDoS attack efficiently with high accuracy and a low
false alarm rate while allowing legitimate requesters to access the resources. The proposed
algorithm is deployed at the victim node.
This chapter is organized as follows. Section 4.1 presents the proposed DDoS attack
detection system, elaborates each phase of the system, and identifies the statistical
features that help in the detection of a DDoS attack. In Section 4.2, an
improvement of VFDT [53], namely Enhanced VFDT (EVFDT), is proposed that is capable
of handling noisy data and detecting a DDoS attack efficiently with high accuracy and a low
false alarm rate. The different procedures of EVFDT are also discussed. Finally, concluding
remarks are given in Section 4.3.
4.1 Proposed Distributed Denial of Service attack detection system
The proposed Distributed Denial of Service (DDoS) attack detection system studies the
network traffic behavior and classifies it as normal or malicious traffic based on the observed
traffic patterns. The proposed system architecture is shown in Figure 4.1. When the data
stream is generated by the WBAN sensor network, the incoming data is first collected in an online
database, where feature extraction takes place. After feature extraction, the data is stored
in an offline database for training and testing purposes. The final attack classification output
is generated by the proposed algorithm for further decision making. The proposed
attack classification algorithm is given in Section 4.2. Finally, the attack response
either forwards the packet or calls the traceback mechanism, depending on the classification
result. The proposed DDoS attack detection system is broadly divided into four phases, from
the data collection phase to the response generation phase. Each phase is discussed
in the following subsections.
Figure 4.1: Proposed DDoS Attack Detection System
4.1.1 Data Collection Phase
In this phase, the incoming data stream is captured online and stored in a database for training
purposes. The captured data is supplied as input to the pre-processing phase for feature
extraction. Each instance of incoming traffic is defined by a collection of features and is
represented in feature vector space.
4.1.2 Pre-Processing Phase
Pre-processing phase is further divided into the packet feature extraction phase followed by
the labeling phase.
Packet Feature Extraction Phase
In the feature extraction phase, real-time packets are captured from the WBAN network
traffic in order to construct new statistical features that are used for the detection and
analysis of a DDoS attack. The identified features are important in defining the QoS [55] of
the real-time network and in classifying the network traffic pattern under a DDoS attack. These
include:
1. Packet Loss Percentage: It is defined as the number of packets lost or dropped between
nodes due to the interaction of legitimate traffic with attack traffic. It indicates the presence
of traffic congestion and overloading in the network caused by a DDoS attack.
Wireless networks have a high probability of packet loss due to the presence of noise
and interference. Packet loss percentage can be calculated using equation 4.1 [56]:
$$\mathrm{PacketLossPercentage} = \frac{\sum_{i=0}^{n} PL}{\sum_{i=0}^{n} PS} \times 100 \tag{4.1}$$
where $PL$ is the packet loss and $PS$ is the total number of packets sent towards the
destination.
2. Delay or Latency: The amount of time taken by a packet to reach the destination
after being transmitted from the source. Delay depends on the amount of traffic
being transmitted. It increases as network traffic increases and becomes
worst during periods of network congestion. Delay or latency can be calculated using
equation 4.2 [56]:
$$\mathrm{Delay} = \sum \left(PATime_i - PSTime_i\right) \tag{4.2}$$
where $PATime_i$ is the time when the packet reaches the destination and $PSTime_i$ is
the time when the packet originates from the source.
3. Jitter (Packet Delay Variation): The variation in the time between packets arriving
at the destination within a particular window. It is used as an indicator of the consistency
and stability of the network. Jitter occurs when the transmission delay of the packets
varies in the network. Jitter can be calculated using equation 4.3 [56]:
$$\mathrm{Jitter} = \sum_{i=0}^{n} \left(\frac{Delay_i - Delay}{N}\right) \tag{4.3}$$
where $Delay_i$ is the packet duration, $Delay$ is the last packet delay, and $N$ is the
difference of packet sequence numbers.
4. Throughput: Throughput is the amount of data transferred per unit time from the source
end-point to the destination end-point. It is measured in bits per second (bps). An effective DDoS
defense mechanism should increase throughput for legitimate users. Throughput can
be calculated using equation 4.4 [56]:
$$\mathrm{Throughput} = \frac{\sum_{i=0}^{n} PacketReceived}{\sum_{i=0}^{n} (StartTime - StopTime)} \tag{4.4}$$
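The four statistical features can be computed directly from per-packet records. The following is a minimal sketch that follows equations 4.1 to 4.4 term by term, with one labeled deviation: the throughput helper uses stop time minus start time as the elapsed interval so that the result is positive, which is an assumption about the intended sign. The function names and the list-based inputs are illustrative and not part of the thesis.

```python
def packet_loss_percentage(lost, sent):
    """Equation 4.1: total packets lost over total packets sent, times 100."""
    return sum(lost) / sum(sent) * 100

def total_delay(arrival_times, send_times):
    """Equation 4.2: sum of (arrival time - send time) over all packets."""
    return sum(a - s for a, s in zip(arrival_times, send_times))

def jitter(delays, last_delay, n):
    """Equation 4.3: sum of (Delay_i - Delay) / N, where `last_delay` is the
    last packet delay and `n` the difference of packet sequence numbers."""
    return sum((d - last_delay) / n for d in delays)

def throughput(packets_received, start_times, stop_times):
    """Equation 4.4, with elapsed time taken as stop - start (assumed sign)."""
    elapsed = sum(stop - start for start, stop in zip(start_times, stop_times))
    return sum(packets_received) / elapsed

# One feature vector for a small, made-up observation window.
features = (
    packet_loss_percentage([1, 2], [10, 20]),
    total_delay([1.5, 2.5], [1.0, 2.0]),
    jitter([0.2, 0.4], last_delay=0.4, n=2),
    throughput([100, 200], [0.0, 1.0], [1.0, 2.0]),
)
```

Each tuple produced this way corresponds to one instance in the feature vector space used by the labeling and classification phases.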
Labeling Phase
In the labeling phase, classes are assigned to these statistical features. The entire dataset is
divided into two classes, labeled 1 for attack and 0 for non-attack packets. After labeling,
the resulting dataset consists of both attack and non-attack data and is used for training
the classification tree. The mapping of the pre-processing phase to feature vector space is as
follows:
• Let $x$ be the N-dimensional vector of extracted features, i.e., $x = (x_1, x_2, x_3, \ldots, x_n)$,
where $1, 2, 3, \ldots, n$ index the individual packet features.
• Let $P_x$ be the packet of $x$ features.
• Let $C_x$ be the vector space of labeled packets of dimension $C_n$.
4.1.3 Attack Classification
In this phase, the incoming traffic is classified as attack or non-attack by building a classi-
fication tree using the preprocessed data defined in feature vector space. For building the
classification tree, we have proposed an algorithm, which is discussed in Section 4.2.
Before actual attack classification begins, the proposed classifier is trained and tested.
In training, the data from the pre-processing phase is used to train an EVFDT classifier. Based
on the nature of a DDoS attack, the EVFDT classifier is trained on two classes: attack
and non-attack. Testing is then used to measure the accuracy of the classifier. In the simulation
experiments, the data is divided into 80% training data and 20% testing data. This ratio
can vary depending upon the severity of the attack.
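The 80/20 division of labeled feature vectors can be sketched as follows. The shuffle-then-cut strategy, the fixed seed, and the function name `split_dataset` are assumptions made for illustration; the thesis does not state how the split is drawn.

```python
import random

def split_dataset(samples, train_fraction=0.8, seed=0):
    """Shuffle labeled samples reproducibly and cut them into train/test parts."""
    shuffled = samples[:]                   # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)   # seeded for repeatable experiments
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Each sample is (feature_vector, label), with label 1 = attack, 0 = non-attack.
# The feature values here are placeholders.
dataset = [((float(i), 0.0, 0.0, 0.0), i % 2) for i in range(10)]
train, test = split_dataset(dataset)
```

Changing `train_fraction` adjusts the ratio, which is how a different split could be chosen for a different attack scenario.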
4.1.4 Attack Response
The goal of attack response module is to minimize the impact of DDoS attack on the victim
node while allowing the legitimate traffic to move forward. When a DDoS attack is detected,
an appropriate traceback mechanism is applied to trace an attacker by reconstructing the
attack path and block the traffic. The traceback technique is proposed and discussed in
Chapter 6.
4.2 Enhanced Very Fast Decision Tree (EVFDT): A Proposed Classification Algorithm
A new classification algorithm, namely Enhanced Very Fast Decision Tree (EVFDT), is
proposed. It is an optimization of the original VFDT-τ [36] (discussed in Chapter 2) that makes it
efficient in classifying DDoS attacks in a real-time cloud-assisted WBAN environment.
The EVFDT classification algorithm simultaneously trains and tests the decision tree based
on learned traffic patterns and classifies the malicious behavior of an attacker based on these
learned patterns, as shown in Figure 4.1. One obvious aspect of a real-time sensor
network environment is that the noise ratio is very high. The noise is produced by
the presence of meaningless or extraneous data in the network traffic, which makes the
identification of traffic patterns more difficult and challenging. Due to this noise, the detection
accuracy of the classifier decreases, causing a massive increase in the false alarm rate.
To overcome the effect of this noise on detection accuracy and the missing protection
mechanism of WBAN, EVFDT improves the existing VFDT [36] algorithms in terms of the
following parameters: (1) accuracy; (2) tree size; (3) computational time; and (4) memory. The
proposed EVFDT flowchart is given in Figure 4.2, which shows the overall flow of classifying
the stream into attack or non-attack based on the learned traffic patterns. The complete tree
building process and its procedures are discussed in the subsequent section.
Figure 4.2: Proposed EVFDT Flowchart
According to the taxonomy of DDoS attacks discussed in Chapter 2, the proposed DDoS
attack detection algorithm is destination-based, i.e., it is deployed at the victim node and
detects a DDoS attack efficiently once it has been launched by an attacker.
4.2.1 EVFDT Tree Building Process
Algorithm 1 given below is the algorithm for Enhanced VFDT.
Algorithm 1 EVFDT Procedure: Enhanced VFDT
Require:
    S: a stream of examples
    X: a set of symbolic attributes
    G(.): heuristic evaluation function for node splitting
    δ: one minus the desired probability of choosing the correct attribute at any given node
    n_min: number of samples between estimations of growth; size of S0 = n_min
    XS: sorted list of Hoeffding bound values
    m: total number of values in XS
    n: new Hoeffding bound value seen at the node
    Tr: adaptive threshold
    S0: subset of S → S0 ∈ S
    τ: 5% of examples in S0; threshold for checking the node eligibility to be part of HT
 1: BEGIN Procedure EnhancedVFDT(S, X, G, δ, n_min)
 2: A stream of examples S arrives
 3: if HT = φ then
 4:     TreeInitialization(S, X)
 5:     Get an initialized HT with a single root node
 6: end if
 7: if HT ≠ φ then
 8:     NewStreamSample(S, X)
 9: end if
10: Label l with the majority class among the samples seen so far at l
11: Let n_l be the number of samples seen at l
12: if samples seen so far at l are not all of the same class and (n_l mod n_min) = 0 then
13:     Compute G_l(X_i) for each attribute X_i ∈ X_l − X_φ using n_ijk(l)
14:     PrunedMean = AccuracyEVFDT(XS, m, n, Tr)
15:     Let X_a be the attribute with the highest G_l, X_b be the attribute with the second-highest G_l
16:     Compute ε using equation 2.1
17:     Let ∆G_l = G_l(X_a) − G_l(X_b)
18:     if ((∆G_l) > ε or (∆G_l) ≤ PrunedMean) and X_a ≠ X_φ then
19:         Split X_a as a branch
20:         for each branch of the split do
21:             Add a new leaf l_m and let X_m = X − X_a
22:             Let G_m(X_φ) be the G obtained by predicting the most frequent class at l_m
23:             for each class y_k and each value x_ij of each attribute X_i ∈ X_l − X_φ do
24:                 Let n_ijk(l_m) = 0
25:             end for
26:         end for
27:     end if
28: else
29:     Pruning(S, S0, n_min, τ, HT)
30: end if
31: Return HT
32: END Procedure
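The split test in steps 16 to 18 of Algorithm 1 can be sketched in a few lines. Equation 2.1 is not reproduced in this chapter, so the sketch assumes it is the standard Hoeffding bound used by VFDT, $\epsilon = \sqrt{R^2 \ln(1/\delta) / (2n)}$, where $R$ is the range of the heuristic $G$ and $n$ the number of samples seen at the leaf. The function names are hypothetical, and `pruned_mean` stands in for the value returned by AccuracyEVFDT.

```python
import math

def hoeffding_bound(value_range, delta, n):
    """Standard Hoeffding bound (assumed to be equation 2.1)."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))

def should_split(g_best, g_second, epsilon, pruned_mean, best_is_null=False):
    """Step 18: split when the gain difference exceeds epsilon, or when it is
    small enough to count as a tie under the adaptive threshold, provided the
    best attribute is not the null attribute X_phi."""
    delta_g = g_best - g_second
    return (delta_g > epsilon or delta_g <= pruned_mean) and not best_is_null

# Illustrative values: G ranges over [0, 1], delta = 1e-7, 200 samples at the leaf.
eps = hoeffding_bound(value_range=1.0, delta=1e-7, n=200)
clear_winner = should_split(0.50, 0.10, eps, pruned_mean=0.05)
undecided = should_split(0.30, 0.20, eps, pruned_mean=0.05)
tie = should_split(0.22, 0.20, eps, pruned_mean=0.05)
```

The bound shrinks as more samples arrive at the leaf, so a gain difference that is undecided early on can become a clear winner later without any stored data being re-read.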
EVFDT is based on the original VFDT-τ [36] and improves on it in two aspects: accuracy and
tree size, which in turn affect the computational time and memory resources. Algorithm
1 presents the pseudocode of EVFDT. It is divided into four sub-procedures, as described
below:
Procedure for Tree Initialization
In this procedure, the tree is initialized with a single leaf node, which is the root node. The tree
grows as new data arrives at the root node. Algorithm 2 shows the Tree Initialization
Procedure. The procedure executes when the first data stream arrives.
Algorithm 2 EVFDT Procedure: Tree Initialization
 1: BEGIN Procedure TreeInitialization(S, X)
 2: Let HT be a tree with a single leaf l1 (the root)
 3: Let X1 = X ∪ {Xφ}
 4: Let G1(Xφ) be the G obtained by predicting the most frequent class in S
 5: for each class yk do
 6:   for each value xij of each attribute Xi ∈ X do
 7:     Let nijk(l) = 0
 8:   end for
 9: end for
10: Return HT
11: END Procedure
Procedure for New Stream Sample
In this procedure, data from the stream is passed through the tree starting from the root node.
Each time new data arrives, the procedure sorts it from the root node down to a leaf node
and updates the tree statistics at the same time. Algorithm 3 presents the procedure for
traversing a new sample.
Algorithm 3 EVFDT Procedure: New Stream Sample
 1: BEGIN Procedure NewStreamSample(S, X)
 2: for each new instance (x, y) in S do
 3:   Sort (x, y) into a leaf l using HT
 4:   for each xij in X such that Xi ∈ X do
 5:     Increment nijk(l)
 6:   end for
 7: end for
 8: END Procedure
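The statistics update in Algorithm 3 can be illustrated with a short Python sketch. It keeps the counts nijk at a leaf in a dictionary keyed by (attribute index, attribute value, class). The class name `Leaf`, its methods and the toy instances are assumptions made for this illustration only; sorting an instance down the tree is omitted, so a single leaf stands in for HT.

```python
from collections import defaultdict


class Leaf:
    """A single leaf holding the counts n_ijk (hypothetical helper)."""

    def __init__(self):
        # n_ijk[(i, x_ij, y_k)] = number of samples of class y_k whose
        # attribute i took the value x_ij at this leaf
        self.n_ijk = defaultdict(int)
        self.class_counts = defaultdict(int)

    def update(self, x, y):
        """Increment n_ijk for every attribute value of instance (x, y)."""
        self.class_counts[y] += 1
        for i, value in enumerate(x):
            self.n_ijk[(i, value, y)] += 1

    def majority_class(self):
        """Label of the leaf: the most frequent class seen so far."""
        return max(self.class_counts, key=self.class_counts.get)


# Toy stream of (attributes, label) pairs; the values are illustrative only.
stream = [
    (("high", "tcp"), "attack"),
    (("high", "udp"), "attack"),
    (("low", "tcp"), "normal"),
]

leaf = Leaf()
for x, y in stream:
    leaf.update(x, y)

print(leaf.n_ijk[(0, "high", "attack")])
print(leaf.majority_class())
```

Keeping only these counts, rather than the instances themselves, is what allows a leaf to evaluate G for each attribute without storing the stream.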
Procedure for Accuracy Improvement
As stated in Section 2.3.1, Hoeffding Bound (HB) fluctuation intensifies as noise increases,
which degrades the accuracy and inflates the tree size of VFDT-τ [36]. To
overcome this issue, the proposed EVFDT modifies the attribute splitting procedure
by using an adaptive tie-breaking threshold τ that restricts when a decision node may become
a splitting attribute. In the existing VFDT-τ and its variants, the value of τ is pre-configured
by the user and remains fixed throughout the tree building process. The best value of τ cannot
be found without trying all the possibilities by brute force, and testing a large number of
different values for τ is not feasible in a real-time environment. Instead, EVFDT assigns an
adaptive tie-breaking threshold for splitting, equal to the mean of the differences between HB
values, which provides the basis for node splitting throughout the tree building process.
With this method, the tie-breaking threshold τ of EVFDT is no longer fixed and pre-defined
but depends on the arrival of new instances, their HB
values and the mean of the differences between HB values. EVFDT incorporates the procedure
of accuracy enhancement with dynamic τ as presented in Algorithm 4 and explained as follows:
Let XS be the sorted list of HB values seen at leaf l, HBCount be the total number of
HB values in XS seen at leaf l and n be the new HB value seen at leaf l. Starting with
HBCount ≥ 3, the mean of difference between HB values in XS is calculated. This mean
is stored as threshold Tr as given in equation 4.5.
$$T_r = \frac{\sum_{j=1}^{HBCount} \left( XS_{j+1} - XS_j \right)}{HBCount} \qquad (4.5)$$
where XS is the sorted list of HB values and HBCount is the total number of values in XS.
The threshold Tr is updated for each new value added to the sorted list.
Next, the position of the new HB value in XS is found. Let XSn be the new HB value at its
position in XS, with XSn−1 and XSn+1 its neighbours in the sorted list; the PrunedMean is then
calculated with the new HB value as given in equation 4.6.
$$PrunedMean = \frac{(XS_n - XS_{n-1}) + (XS_{n+1} - XS_n)}{2} \qquad (4.6)$$
If this PrunedMean is less than the updated threshold Tr, then the HB value is added to XS and
the PrunedMean is returned; otherwise the HB value is discarded as an outlier [37] and Tr is
returned as the PrunedMean. This process continues throughout the tree building process.
Whenever a new HB value is added to XS, the threshold Tr is updated.
Algorithm 4 EVFDT Procedure: AccuracyEVFDT
 1: BEGIN Procedure AccuracyEVFDT(XS, m, n, Tr)
 2: if (m == 1) then
 3:   $T_r = \sum_{i=1}^{m} \frac{XS}{m}$
 4: else
 5:   if (m == 2) then
 6:     $T_r = \sum_{i=1}^{m} \frac{XS}{m}$
 7:     for j = 1 do
 8:       $XD_j = XS_{j+1} - XS_j$
 9:       Increment j = j + 1
10:       j < m
11:     end for
12:   else
13:     for j = 1 do
14:       $XD_j = XS_{j+1} - XS_j$
15:       Increment j = j + 1
16:       j < m
17:     end for
18:     $T_r = \sum_{i=1}^{m} \frac{XD_i}{m}$
19:     Let XSn be the position of n in sorted list
20:     $PrunedMean = \frac{(XS_n - XS_{n-1}) + (XS_{n+1} - XS_n)}{2}$
21:     if (PrunedMean ≤ Tr) then
22:       Add XSn in sorted list XS
23:       XS ← XS + n
24:     else
25:       Discard n as it is outlier
26:       XS ← XS
27:     end if
28:   end if
29: end if
30: Return PrunedMean
31: END Procedure
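The accuracy-improvement procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the function names are hypothetical, Tr is computed as the sum of consecutive differences divided by the number of stored values (following equation 4.5), lists with fewer than three values skip the outlier test, and a new value that falls at either end of the list uses only the one neighbour it has.

```python
from bisect import bisect_left, insort


def threshold(xs):
    """Tr as in equation 4.5: sum of consecutive differences divided by the
    number of values in the sorted list xs."""
    diffs = [b - a for a, b in zip(xs, xs[1:])]
    return sum(diffs) / len(xs)


def accuracy_evfdt(xs, n):
    """Process a new HB value n against the sorted list xs (modified in place).

    Returns the value used as PrunedMean: the local mean gap if n is accepted,
    otherwise the current threshold Tr.
    """
    if len(xs) < 3:
        # Assumption: with fewer than three values no outlier test is made.
        insort(xs, n)
        return threshold(xs)

    tr = threshold(xs)
    pos = bisect_left(xs, n)
    # Gaps between n and its neighbours; a missing neighbour contributes
    # no gap (assumption for the boundary case).
    gaps = []
    if pos > 0:
        gaps.append(n - xs[pos - 1])
    if pos < len(xs):
        gaps.append(xs[pos] - n)
    pruned_mean = sum(gaps) / 2

    if pruned_mean <= tr:
        insort(xs, n)      # accept n; Tr is recomputed on the next call
        return pruned_mean
    return tr              # n is treated as an outlier and discarded


hb_values = [0.10, 0.12, 0.15, 0.19]
print(accuracy_evfdt(hb_values, 0.13))  # falls between 0.12 and 0.15
print(accuracy_evfdt(hb_values, 0.90))  # far from all stored values
print(hb_values)
```

In this toy run the value 0.13 is accepted because its mean gap to its neighbours is below Tr, while 0.90 is rejected and leaves the list unchanged.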
Procedure for Tree Pruning
Pruning plays an important role in the decision tree learning process: it minimizes
the tree size by cutting off the tree nodes that contribute little to classifying the instances.
Pruning is done to lessen the overall complexity of the decision tree, which arises from the
presence of noisy or erroneous data. Analysis shows that the improvement in classification
accuracy causes an enormous increase in tree size, which in turn takes more memory space and
computational time. To overcome the problem of tree size explosion, Algorithm 5 presents
the proposed tree pruning mechanism, which is explained as follows:
Algorithm 5 EVFDT Procedure: Pruning
 1: BEGIN Procedure Pruning(S, S0, nmin, τ, HT)
 2: Let DataSeenAtLeaf be the number of samples seen at leaf node n0
 3: for each example S ∈ S0 do
 4:   if the sample S traverses to the node n0 which is a leaf node then
 5:     Start counter on node n0
 6:     Increment: DataSeenAtLeaf_n0
 7:   else
 8:     Continue growing EVFDT
 9:   end if
10: end for
11: if DataSeenAtLeaf < τ then
12:   Prune the tree: Delete n0
13:   UPDATE HT
14: end if
15: END Procedure
Let HT be the Hoeffding tree to be pruned, S be an example belonging to S0, where S0 is a
subset of the stream, and DataSeenAtLeaf_n0 be the number of samples seen at leaf node n0. Every
example S in S0 is passed through the tree starting from the root node. If S is filtered to the
leaf node n0, then DataSeenAtLeaf_n0 is incremented; otherwise the HT continues to grow.
If DataSeenAtLeaf_n0 is less than τ, where τ is the threshold for checking the eligibility
of a node to be part of HT, then the tree is pruned by deleting the leaf node n0 and updating
the HT. Otherwise the HT is not pruned. The eligibility of the node to be part of HT is
checked by comparing the number of samples seen at the leaf node with τ. A count below τ
indicates that the leaf node contributes little to classification, because few
instances are filtered to this leaf node on the current HT.
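The pruning rule can be illustrated with a small Python sketch. It assumes the per-leaf counts have already been collected by passing S0 through the tree, and takes τ as 5% of the examples in S0, as stated in the parameters of Algorithm 1. The function name, the leaf identifiers and the counts are hypothetical.

```python
def prune_leaves(data_seen_at_leaf, num_examples, fraction=0.05):
    """Return (kept, pruned) given per-leaf sample counts.

    tau is taken as a fraction of the examples in S0 (5% by default).
    """
    tau = fraction * num_examples
    kept = {leaf: c for leaf, c in data_seen_at_leaf.items() if c >= tau}
    pruned = sorted(leaf for leaf, c in data_seen_at_leaf.items() if c < tau)
    return kept, pruned


# Hypothetical counts of how many of 200 examples reached each leaf.
counts = {"leaf_a": 120, "leaf_b": 74, "leaf_c": 6}
kept, pruned = prune_leaves(counts, num_examples=200)
print(sorted(kept))  # leaves that stay in HT
print(pruned)        # leaves that received fewer than tau samples
```

Here τ is 10 samples, so only the leaf that received 6 samples is removed.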
4.3 Conclusion
Nowadays, zombie-based DDoS attacks occur alongside the legitimate flow of traffic. Therefore,
it is very difficult to detect such attacks even when stored attack traffic signatures are
available. The challenge is to distinguish legitimate traffic from DDoS attack traffic. Data
mining techniques for data classification fail on real-time streaming data, and they also
require a sufficient amount of memory for data storage. On the other hand, stream mining
techniques handle real-time, high-speed streaming data originating from WBAN sensors and
are efficient for resource-scarce WBANs. Therefore, stream mining techniques have
been studied and explored for DDoS attack detection. Successful detection of a DDoS
attack requires an efficient detection system. Therefore, a DDoS attack detection
system is proposed, consisting of four main phases from the data collection phase to the attack
response phase. The real-time streaming data from the WBAN is given as input to the
proposed system. After the successful classification of an attack by the proposed system,
the attack response is generated, in which the traffic is either forwarded to the destination or
blocked for further analysis. The classification and detection of an attack needs
a mechanism that detects attacks efficiently and accurately while
consuming few resources. For this purpose, an algorithm based on VFDT is proposed and
presented in this chapter. The main contribution is a novel EVFDT classification algorithm
that differs from existing algorithms in terms of attack classification accuracy and
tree size. The performance of the proposed EVFDT algorithm is evaluated on a synthetic dataset
generated by implementing the LEACH protocol in NS-2 [59]. The proposed system is also
deployed in a real-time WBAN environment to examine and verify the effectiveness of
the proposed classification algorithm. The evaluation procedure is discussed in the next chapter.
Chapter 5
ATTACK DETECTION SCHEME: PERFORMANCE ANALYSIS
AND BENCHMARKING
The purpose of the performance evaluation is to analyze the effectiveness of the proposed attack
detection technique EVFDT, discussed in Chapter 4, in detecting DDoS attacks. Likewise, the
comparative analysis intends to show the advantage of EVFDT over existing techniques for
detecting DDoS attacks.
This chapter evaluates the performance of the proposed DDoS attack detection technique
through simulation-based and hardware-based experiments. The performance
metrics used to evaluate and compare the results include attack detection
accuracy, false alarm rate, sensitivity vs specificity, computational cost, tree size and
resource usage.
The complete evaluation and comparison process is performed separately on the synthetic
datasets generated by simulation in NS-2 [58] and on the dataset generated by deploying
an actual WBAN hardware testbed. Each of the selected performance metrics is
evaluated on both the synthetic datasets and the real-time WBAN dataset. Finally, a comparative
analysis is made between the simulation results obtained from EVFDT and the corresponding
simulation results acquired from existing techniques.
This chapter is organized as follows. The performance evaluation metrics selected for assessing
the effectiveness of the proposed technique are discussed in Section 5.1. These performance
metrics are evaluated with a varying number of instances In and different noise
percentages present in the datasets. In Section 5.2, using the selected performance metrics,
the proposed technique is evaluated on synthetic datasets generated by simulating the LEACH
protocol in the Network Simulator NS-2. In addition to the evaluation, a quantitative comparison
of the proposed EVFDT with existing techniques is also discussed in this section. In Section 5.3,
a hardware-based testbed is deployed to demonstrate the real-time WBAN environment.
Real-time stream data generated by WBAN sensors are used to evaluate the effectiveness of
EVFDT against the selected performance metrics. Computational cost is calculated only for
the hardware-based generated data, in order to measure the cost of computation
on sensor nodes. At the end of this section, a comparative analysis is given by applying
EVFDT and existing techniques to the real-time datasets. The qualitative comparison of
EVFDT and existing classification algorithms is given in Section 5.4. Finally, Section 5.5
concludes the chapter.
5.1 Performance Evaluation Metrics
EVFDT is evaluated using the following quantified performance evaluation metrics: attack
detection accuracy, false alarm rate, sensitivity vs specificity, computational cost, tree size,
computational time and memory usage. These performance metrics are important for
the evaluation of any attack classification technique. All of them are
used in both the simulation-based and the hardware-based experiments, except computational
cost and tree size. Table 5.1 shows the performance metrics used in the simulation-based
and hardware-based experiments. A tick (✓) shows that the metric is used for evaluation,
whereas a cross (✗) shows that it is not. These performance
metrics are discussed below.
Table 5.1: Performance evaluation metrics
Experiment  Detection  False  Cost  Sensitivity vs  Tree  Time  Memory
Method      Accuracy   Alarm        Specificity     Size        Usage
Simulation  ✓          ✓      ✗     ✓               ✓     ✓     ✓
Hardware    ✓          ✓      ✓     ✓               ✗     ✓     ✓
5.1.1 Attack Detection Accuracy
Attack detection accuracy is the ratio of the number of correct predictions to the total number
of tested examples. A confusion matrix is used for measuring detection accuracy. Table 5.2
presents the confusion matrix.
Table 5.2: Confusion Matrix
         Normal          Attack
Normal   True Negative   False Positive
Attack   False Negative  True Positive
True Positive (TP) = Samples that are correctly classified as attack class.
True Negative (TN) = Samples that are correctly classified as non-attack class.
False Positive (FP) = Samples that are incorrectly classified as attack class.
False Negative (FN) = Samples that are incorrectly classified as non-attack class.
In general, the accuracy of the proposed EVFDT increases with the number of data
stream samples. As the number of instances grows, EVFDT
becomes more and more accurate as long as the data is noise free. As soon as
noise is injected, the classification accuracy starts decreasing. For the conducted simulation
experiments, attack detection accuracy was calculated using equation 5.1.
$$\text{Attack Detection Accuracy} = \frac{\text{True Positives} + \text{True Negatives}}{\text{Total Number of Tested Examples}} \qquad (5.1)$$
EVFDT is compared with existing stream mining classification algorithms in terms of detection
accuracy on both the simulation-based datasets in Section 5.2 and the datasets generated by
deploying the real-time WBAN testbed in Section 5.3.
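As a small worked example of equation 5.1, the following Python sketch computes the detection accuracy from the four cells of the confusion matrix. The function name and the counts are hypothetical and are not taken from the experiments.

```python
def detection_accuracy(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / total number of tested examples (equation 5.1)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no tested examples")
    return (tp + tn) / total


# Hypothetical confusion-matrix counts for 100 tested examples.
print(detection_accuracy(tp=45, tn=50, fp=3, fn=2))
```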
5.1.2 False Alarm Rate (FAR)
The number of false alarms generated by the attack detection technique is computed as the
combination of both False Positive Rate (FPR) and False Negative Rate (FNR). These are
discussed below:
False Positive Rate (FPR)
The FPR of an attack classification technique is defined as the ratio of the number of
legitimate packets classified as malicious to the total number of packets. It is
calculated using equation 5.2.
$$\text{False Positive Rate} = \frac{\text{Legitimate Packets Incorrectly Classified as Malicious}}{\text{Total Number of Packets}} \qquad (5.2)$$
False Negative Rate (FNR)
The FNR of an attack classification technique is defined as the ratio of the number of
attack packets classified as legitimate to the total number of packets. It is calculated
using equation 5.3.
$$\text{False Negative Rate} = \frac{\text{Attack Packets Incorrectly Classified as Legitimate}}{\text{Total Number of Packets}} \qquad (5.3)$$
FPR and FNR increase as the noise percentage increases, which gives rise to
false alarms. Conversely, FPR and FNR decrease as the
number of instances In increases, which means a lower false alarm rate.
EVFDT is compared with existing stream mining classification algorithms in terms of FPR
and FNR on both the simulation-based datasets and the datasets generated by deploying the
real-time WBAN testbed.
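The two rates can be computed as in the following Python sketch, which follows equations 5.2 and 5.3 and therefore divides both counts by the total number of packets. The function name and the packet counts are hypothetical.

```python
def false_alarm_rates(fp, fn, total_packets):
    """Return (FPR, FNR) as defined in equations 5.2 and 5.3.

    Both rates are divided by the total number of packets, following the
    definitions used in this chapter.
    """
    if total_packets <= 0:
        raise ValueError("total_packets must be positive")
    return fp / total_packets, fn / total_packets


# Hypothetical counts: 3 legitimate packets flagged, 2 attack packets missed.
fpr, fnr = false_alarm_rates(fp=3, fn=2, total_packets=100)
print(f"FPR = {fpr:.2%}, FNR = {fnr:.2%}")
```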
5.1.3 Computational Cost
In addition to detection accuracy and FAR, it is also important to consider computational
cost when evaluating the performance of a classification algorithm on the real-time WBAN
testbed.
A cost matrix is a means of influencing the decision making of a classification model. It
provides the basis for the classification model to minimize costly misclassifications and
maximize useful accurate classifications. A cost matrix is used to calculate the computational
cost of EVFDT. Table 5.3 shows the cost matrix.
Table 5.3: Cost Matrix
         Normal  Attack
Normal   0       λ
Attack   1       0
As the objective of EVFDT is to avoid rejecting legitimate users' access, the
cost of a false alarm was assumed to be high. In this research, the cost of a false positive is
assumed to be five times the cost of a false negative. The cost function is employed to
simplify the performance comparison of the EVFDT algorithm with existing stream mining
algorithms. The formula for calculating the cost function is given in equation 5.4. It is based on
the number of samples that are classified incorrectly. A lower cost means better performance of
the detection system.
$$\text{Cost} = (1 - \text{Attack Detection Accuracy}) + \lambda \, (\text{False Positive Rate}) \qquad (5.4)$$
where the parameter λ weights the cost of a false alarm relative to a miss. For evaluation, λ is
set to 5. The computational costs of the proposed algorithm and existing classification algorithms
are calculated and compared in Section 5.3.
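The effect of λ in equation 5.4 can be seen in a short Python sketch. The function name and the two sets of input values are hypothetical; they only show that, at equal accuracy, the classifier with the higher false positive rate is penalized more heavily.

```python
def detection_cost(accuracy, fpr, lam=5.0):
    """Cost = (1 - accuracy) + lam * FPR, as in equation 5.4."""
    return (1.0 - accuracy) + lam * fpr


# Two hypothetical classifiers with the same accuracy but different FPR.
cost_a = detection_cost(accuracy=0.95, fpr=0.01)
cost_b = detection_cost(accuracy=0.95, fpr=0.04)
print(cost_a, cost_b)
```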
5.1.4 Sensitivity vs Specificity
The confusion matrix given in Table 5.2 is also used to calculate other important statistical
measures that are useful for evaluating the performance of the proposed classification algorithm.
These measures include:
Sensitivity
It is the probability that the classification algorithm correctly identifies attack
traffic. The sensitivity of the proposed and existing algorithms was calculated using equation
5.5.
$$\text{Sensitivity} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Negative}} \qquad (5.5)$$
Specificity
It is the probability that the classification algorithm correctly identifies non-attack
traffic. The specificity of the proposed and existing algorithms was calculated using equation
5.6.
$$\text{Specificity} = \frac{\text{True Negative}}{\text{True Negative} + \text{False Positive}} \qquad (5.6)$$
Analysis shows that sensitivity and specificity decrease as the noise percentage increases.
The sensitivity and specificity of the proposed and existing classification algorithms are
further evaluated and compared in Section 5.2 and Section 5.3.
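Equations 5.5 and 5.6 translate directly into code. The following Python sketch uses hypothetical function names and counts to show how the two measures are read off the confusion matrix.

```python
def sensitivity(tp, fn):
    """Share of attack samples correctly identified (equation 5.5)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of non-attack samples correctly identified (equation 5.6)."""
    return tn / (tn + fp)


# Hypothetical counts: 50 attack samples and 50 non-attack samples.
print(sensitivity(tp=45, fn=5))
print(specificity(tn=48, fp=2))
```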
5.1.5 Tree Size
Tree size is a key evaluation metric for assessing the performance of any decision-tree-based
classification algorithm. The amount of memory required to build a decision tree depends on the
tree size. A significant characteristic of a classification algorithm is its ability to build
a decision tree with reduced tree size and increased classification accuracy simultaneously.
The tree size metric is used to evaluate the proposed pruning algorithm. Preliminary analysis
shows that the tree size gets bigger as the noise percentage increases. A detailed analysis of
tree size with respect to noise percentage and number of instances In is given in Section
5.2.
5.1.6 Computational Time
Early detection of an attack is a desirable property of any attack detection technique. In
addition to being accurate, a detection technique should also be fast in detecting an attack.
Computational time is the total time taken, in seconds, to process a full set of data
stream samples. The computational complexity of the proposed algorithm is expressed in big O
notation as O(l_p d v c), where l_p is the length of the pruned tree, d is the total number
of attributes, v is the number of values per attribute and c is the number of classes.
The computational time of the proposed and existing algorithms is calculated and compared for
both the simulation-based and the hardware-based experiments.
5.1.7 Memory Usage
Given the resource scarcity of the WBAN environment, EVFDT should consume
few memory resources. The advantage of stream mining algorithms over existing machine
learning algorithms is that they do not require the full dataset to be stored in memory but
perform mining at run time. The memory required for running the proposed EVFDT is expressed
in big O notation as O(n d v c), where n is the number of decision
nodes in the tree, d is the total number of attributes, v is the number of values per attribute and
c is the number of classes. The total amount of memory required to run the proposed and
existing classification algorithms is the sum of the memory allocated for learning and the memory
allocated for training. The memory usage of the proposed and existing algorithms is calculated and
compared for both the simulation-based and the hardware-based experiments.
5.2 Simulation-Based Experiments
In this section, we evaluate the performance of the proposed EVFDT in detecting DDoS attacks
in the incoming data stream generated by simulating the LEACH protocol in NS-2 [60]. EVFDT
is compared with the existing VFDT and its variants in terms of the performance evaluation
metrics discussed in Section 5.1.
5.2.1 Synthetic Datasets
The Low Energy Adaptive Clustering Hierarchy (LEACH) protocol [60] was implemented
in NS-2 to generate a synthetic data stream containing one million data values, which
are divided into five datasets. Each dataset contains a different number of instances and a
different noise percentage. The LEACH protocol is selected because it is lightweight and closely
reflects the WBAN scenario. Cluster heads act as aggregate nodes or Body Control Units (BCUs)
and all other nodes act as regular sensor nodes, as depicted in Figure 5.1. The LEACH protocol is
responsible for transferring the data from WBAN sensor nodes to BCUs. The DDoS attack code is
attached to regular sensor nodes to make them malicious. The number of malicious sensor
nodes varies across the attack datasets.
Figure 5.1: Illustration of LEACH Protocol
Table 5.4: Simulation Parameters
Parameters                  Values
Sensing Field               50 X 50
Topology                    Star
Simulation Time             900s, 1200s, 1500s, 1800s, 2000s
Packet Size                 1000 bytes
Radio Communication Range   2m Standard, 5m Special use
No. of Nodes                50
BAN Coordinator             Directional Mode
Sensor Nodes                Omni directional Mode
Routing Protocol            LEACH
The simulation runs for 900s, 1200s, 1500s, 1800s and 2000s to generate the data streams.
Other simulation parameters and network configurations are shown in Table 5.4. After passing
through the data collection and pre-processing phases discussed in Chapter 4, the resulting
dataset includes both attack and non-attack data. This dataset is given as input
to EVFDT for attack classification.
5.2.2 DDoS Attack Strategy: Generation and Analysis
In this subsection, an attack dataset is generated and analyzed for DDoS attacks. The analysis
is performed on the basis of the DDoS attack characteristics discussed in Chapter 4. In the
simulation experiment, the legitimate traffic is considered for analysis first. Second,
the attack is generated and its intensity under flooding attack traffic is analyzed. An attack
algorithm is written and the attack is generated online. The resulting dataset (attack dataset)
is stored in a database at the base station for pre-processing. The EVFDT classification
algorithm is applied to the pre-processed dataset for attack classification. The DDoS attack
pseudocode is presented in Algorithm 6.
Algorithm 6 Distributed Denial of Service Attack Algorithm
Require:
  SN: Set of sensor nodes for aggregate node (AN) selection
  R: Number of rounds
  Simulation Parameters of LEACH protocol (Table 5.4)
Output: DDOS Attack Dataset
 1: Procedure DDOSAttackAlgorithm(SN, R)
 2: BEGIN
 3: if r = 0 then
 4:   Initial round AN selection
 5: end if
 6: for (maximum number of rounds r) do
 7:   Choose r rounds AN randomly
 8:   AN announces schedule time T to all SN
 9: end for
10: Attach attack code with random nodes Nr
11: Randomly initiates malicious nodes towards victim node v
12: Malicious nodes start and stop randomly according to time T during formation
13: Malicious nodes starts compromising v
14: Malicious nodes forward flooding packets to v with high rate to overflow v and consume resources available to victim v
15: v receives packet with rate
16: More malicious nodes starts compromising v at their schedule time T causing DDoS flooding attack
17: END Procedure
Java code is written to randomly add noise to the dataset. For this purpose, the N-dimensional
feature vector is multiplied by a vector of random variables drawn from the normal
distribution N(0, σ²), where σ² is the noise variance, adjusted according to the percentage of
noise added.
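The noise step described above can be sketched as follows. This is a Python illustration rather than the Java code used in the experiments: each feature is multiplied element-wise by an independent draw from N(0, σ²). The function name, the example vector and the value of σ are assumptions, and the mapping from a noise percentage to σ² is not reproduced here because it is not specified in the text.

```python
import random


def add_multiplicative_noise(features, sigma, rng):
    """Multiply each feature by an independent draw from N(0, sigma^2)."""
    return [x * rng.gauss(0.0, sigma) for x in features]


rng = random.Random(42)  # fixed seed so the sketch is repeatable
vector = [1.0, 2.5, 0.0, 4.0]
noisy = add_multiplicative_noise(vector, sigma=0.1, rng=rng)
print(noisy)
```

Passing the random generator explicitly keeps the perturbation reproducible across runs, which helps when the same noisy dataset must be fed to several classifiers.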
5.2.3 Performance Evaluation and Comparative Analysis
To evaluate the performance of EVFDT in the simulation-based experiments, noise is added
to the datasets in order to compare the results of EVFDT with those of the VFDT variants under
different noise percentages. Each dataset contains a different number of instances and is divided
into 80% training data to learn the classifier and 20% testing data. The performance is evaluated
using the following parameters, which are detailed in Section 5.1.
• Attack Detection Accuracy
• False Alarm Rate
• Sensitivity and Specificity
• Tree Size
• Resource Usage
Attack Detection Accuracy
For the conducted simulation-based experiments, attack detection accuracy was calculated
using equation 5.1. In Figure 5.2, we evaluate and compare the impact of varying the
noise percentage on the attack detection accuracy of the proposed and existing classification
algorithms. As observed from the graph, the attack detection accuracy of all algorithms
is highest for noise-free data. As the noise percentage increases, the detection
accuracy begins to decrease. The figure shows that EVFDT maintains good accuracy
even in the presence of noise.
Similarly, the number of instances In is a second important factor that contributes to
the accuracy of the attack detection algorithm. As the number of instances In increases,
EVFDT becomes more and more accurate as long as the data is noise free. Figure
5.3 shows the attack detection accuracy comparison of EVFDT with existing classification
algorithms on varying datasets with noise percentages of 10% and 20%. On all experimental
datasets, EVFDT maintains higher detection accuracy with a lower false alarm rate. A similar
increase in the attack detection accuracy is noticeable for other noise percentages.
Figure 5.2: Accuracy in different Noise Percentage
Figure 5.3: Accuracy vs In in different Noise Percentages
False Alarm Rate
FPR and FNR are the two key factors that contribute to the generation of false alarms.
They are calculated using equation 5.2 (FPR) and equation 5.3 (FNR). To analyze the
performance of the proposed EVFDT in terms of false alarm generation, we examine the impact
of noise percentage and number of instances In on FPR and FNR. Table 5.5 compares the
proposed and existing classification algorithms in terms of FPR and FNR for varying noise
percentages. FPR and FNR increase with the noise percentage in all cases. The goal is to
maintain low FPR and FNR even under extreme noise, and the proposed EVFDT achieves this
objective by maintaining lower FPR and FNR than the compared algorithms at every noise level
in Table 5.5.
Table 5.5: FPR and FNR of Classification Algorithms in Percentage
        VFDT-τ       CVFDT        OVFDT        Proposed
Noise   FPR   FNR    FPR   FNR    FPR   FNR    FPR   FNR
0%      4.6   5.4    5.3   5.1    1.4   2.4    1.0   0.9
5%      5.1   5.9    5.9   5.7    1.8   2.7    1.6   1.2
10%     5.7   6.8    6.5   6.2    2.2   3.2    1.9   1.7
15%     6.2   7.3    7.7   7.0    2.9   3.5    2.3   2.1
20%     6.8   7.9    8.1   7.8    3.8   4.1    2.9   2.8
Similarly, the effect of varying In on FPR and FNR is illustrated in Figure 5.4. As seen in the
figure, increasing In decreases FPR and FNR. For the highest In, CVFDT has the highest FPR and
FNR (FPR = 3.4 and FNR = 3.1), whereas for the same In EVFDT has the lowest FPR and
FNR (FPR = 0.9 and FNR = 1.0), which means that EVFDT generates fewer false alarms than
the other techniques for varying In.
Figure 5.4: FPR and FNR vs In (a) False Positive Rate (b) False Negative Rate
Sensitivity and Specificity
Sensitivity and specificity are key statistical measures for evaluating the performance of
EVFDT and the existing classification algorithms. They are calculated using equation 5.5
(sensitivity) and equation 5.6 (specificity). Table 5.6 compares the sensitivity and specificity
of EVFDT with those of the existing classification algorithms for different noise percentages.
As shown in the table, the average sensitivity of VFDT-τ and CVFDT is low compared
to their average specificity, which is high; this results in a high number of false negatives.
OVFDT and EVFDT initially have high sensitivity and high specificity for noise-free data,
which is the ideal case. As soon as noise is injected and its percentage starts increasing,
the specificity of OVFDT starts increasing whereas its sensitivity starts decreasing, which again
results in a high number of false negatives.
On the other hand, EVFDT maintains the ideal case initially. As the noise percentage
increases to 15%, the specificity starts increasing whereas the sensitivity starts decreasing, but
the average sensitivity and specificity still meet the ideal condition.
Table 5.6: Sensitivity and Specificity of Classification Algorithms in Percentage
        VFDT-τ                    CVFDT                     OVFDT                     Proposed
Noise   Sensitivity  Specificity  Sensitivity  Specificity  Sensitivity  Specificity  Sensitivity  Specificity
0%      90.8         98.4         89.8         99.4         97.6         94.9         97.9         94.7
5%      79.2         91.7         78.9         91.7         89.2         91.7         82.3         90.0
10%     78.2         88.8         69.2         90.8         77.0         90.7         80.2         88.4
15%     76.3         88.3         67.7         88.9         76.9         86.8         79.2         86.8
20%     77.7         81.3         66.9         79.1         73.9         81.0         78.1         83.4
Avg     80.4         88.9         74.5         89.1         82.5         89.0         83.6         88.6
Tree Size
A significant characteristic of EVFDT is its ability to build a decision tree with reduced
tree size and increased classification accuracy simultaneously. The tree size is the depth of
the decision tree.
The tree size of the proposed and existing classification algorithms is evaluated and compared
with respect to the increase in noise percentage. Table 5.7 shows the impact of noise on tree
size for different numbers of instances In. The same data are plotted in Figure 5.5 in order to
compare the tree sizes of EVFDT and the existing classification algorithms.
Table 5.7 shows that the tree size gets bigger as the noise percentage increases. Similarly,
the tree size grows with the number of instances In. The goal is to achieve maximum accuracy
while maintaining a small tree size. Although VFDT-τ obtains a small tree size in our simulation,
it results in increased classification error, which is not acceptable. As shown in Figure 5.5,
EVFDT maintains a smaller tree size. In a few cases, EVFDT and VFDT-τ produce the
same tree size, but the average tree size of EVFDT is smaller than that of VFDT-τ. The simulation
experiment shows the following ordering of tree size (TS): TS_EVFDT < TS_VFDT-τ < TS_CVFDT < TS_OVFDT.
Table 5.7: Tree Size Comparison with different Noise Percentage
Data Set     Noise     VFDT-τ  CVFDT  OVFDT  EVFDT
Dataset-10   0%        4       5      6      3
             5%        6       7      8      6
             10%       9       9      11     8
             15%       9       10     13     9
             20%       9       9      15     8
             Average   7       8      9      6
Dataset-20   0%        5       6      5      5
             5%        7       7      8      7
             10%       8       9      10     6
             15%       10      11     11     8
             20%       10      11     13     8
             Average   8       9      10     6
Figure 5.5: Tree Size vs Noise Percentage
Resource Usage
The resources used in the simulation experiments comprise the overall CPU time and
memory needed to run the classification algorithm. Both computational time and memory
usage are characterized using big O notation, as discussed in Section 5.1. Computational time is
the time taken, in seconds, by the CPU to process a full set of data stream samples. Figure 5.6
shows the computational time of the proposed and existing classification algorithms. The
graph shows that CVFDT takes the most time of all the techniques because it builds
and processes two trees simultaneously. Among all classification algorithms, VFDT-τ has the
smallest CPU processing time due to the small and fixed value of τ. The proposed EVFDT takes
slightly more time than VFDT-τ because of tree pruning and adaptive threshold computation.
The computational time (CT) is ordered as CT_VFDT-τ < CT_EVFDT < CT_OVFDT < CT_CVFDT.
Figure 5.6: Computational Time vs Number of Instances In
Similarly, the total amount of memory required to run a classification algorithm is the sum
of the memory allocated for learning and the memory allocated for training. The amount of memory
grows with the number of instances In, because more instances require additional memory for
learning and training.
Moreover, the presence of noise has no impact on the memory requirement of a classification
algorithm. Figure 5.7 compares the total amount of memory required for running EVFDT
and the existing algorithms. As shown in the figure, CVFDT consumes more memory space than
the other classification algorithms due to the simultaneous learning and training of
two classification trees in memory. The memory usage of the proposed EVFDT
is lower due to pruning and the run-time computation of the tie-breaking threshold. Compared
to the proposed EVFDT, VFDT-τ consumes slightly more memory due to the initial selection and
declaration of the decisive parameter τ. The memory usage (M) of the classification algorithms
is ordered as M_EVFDT < M_OVFDT < M_CVFDT < M_VFDT-τ.
Figure 5.7: Memory Usage vs Number of Instances In
5.3 Hardware-Based Experiments
5.3.1 Experimental Testbed
To evaluate the performance of the EVFDT algorithm for DDoS attack detection, a real-time
WBAN testbed is deployed using the e-Health sensor platform. The EVFDT algorithm has been
implemented using the Very Fast Machine Learning (VFML) libraries [61]. The experiments
were run on a 64-bit Ubuntu 14.04 workstation with a 2.8 GHz processor and 8 GB RAM, with
all unnecessary background processes switched off. Wireshark 1.10.3 is installed to capture
the real-time packets coming from the WBAN via the ZigBee module.
Testbed Setup
The testbed uses the e-Health sensor shield v2.0 by Libelium Comunicaciones Distribuidas to
demonstrate a real-time WBAN scenario [62]. It monitors patients' health in real time by
deploying different medical sensors on the patient's body and collects sensitive patient data
for subsequent analysis by physicians. The gathered information can be sent wirelessly to the
base station using the Arduino XBee shield shown in Figure 5.8.
Figure 5.9 shows the Arduino XBee shield mounted on the complete e-Health sensor shield kit.
It is an 802.15.4 Arduino shield fitted with a Digi XBee 802.15.4 Original Equipment
Manufacturer Radio Frequency (OEM-RF) module with a transmission range of up to 100 m. It is
designed specifically for low-power wireless communication applications such as WBANs and
sensor networks. The Arduino XBee shield fits into the XBee socket of the e-Health sensor
shield in order to sense and control data from the physical world and transfer it to the
base station. From the base station, the data is moved to the cloud for permanent storage.
Similarly, the patients' data can also be visualized by physicians in real time by sending it
directly to a laptop or smartphone.
Figure 5.8: Arduino XBee Shield
Figure 5.9: ’Arduino XBee shield’ over e-Health sensor shield complete kit
Nine different medical sensors are deployed at different locations on the patient's body, as
shown in Figure 5.10. These sensors are: air flow (breathing), oxygen in blood (SPO2),
body temperature, electrocardiogram (ECG), glucometer, galvanic skin response (GSR,
sweating), blood pressure (sphygmomanometer), patient position (accelerometer) and
muscle/electromyography sensor (EMG). The details of the sensors are given as follows [62]:
1. Air Flow (Breathing): The air flow sensor measures a patient's breathing rate when
respiratory help is required. It consists of a set of two prongs that are placed in the
nostrils to measure the breathing rate and an elastic thread that fits behind the ears.
2. Pulse and Oxygen in Blood: The pulse oximeter sensor is used to measure the amount
of oxygen in blood and pulse of a patient.
3. Body Temperature: The body temperature sensor is used to measure the temperature of
different body parts. The reading varies according to the part of the body at which the
temperature is measured and the time of measurement.
4. Electrocardiogram (ECG): The electrocardiogram (ECG) is a diagnostic tool used
to assess the electrical and muscular functions of the heart.
5. Glucometer: A glucometer is a medical device for measuring the approximate con-
centration of glucose in the patient’s blood.
6. Galvanic Skin Response (GSR): The galvanic skin response (GSR) sensor is used for
measuring the electrical conductance of the skin, which varies with its moisture level. The
GSR sensor measures the electrical conductance between two points and is essentially a type
of ohmmeter.
7. Sphygmomanometer: It is used to measure the blood pressure of a patient.
Each sensor is loaded with code that allows it to sense and read the patient's data and
transfer it to the e-Health sensor shield. Similarly, code is written for and loaded onto the
e-Health sensor shield so that it can act as an aggregate node. The aggregate node is
responsible for managing the data of each individual sensor separately and transferring it to
the base station. The code for the sensor nodes and the aggregate node is written in the
Arduino software using Arduino libraries. The Arduino IDE serial monitor is used to visualize
the data coming from the sensor nodes. Figure 5.11 displays a screenshot of the serial monitor
showing the data of the pulse oximeter sensor.
Topology Design
In the WBAN testbed demonstration, a star topology is deployed for simplicity: each sensor is
connected directly to the aggregate node (e-Health sensor shield), which in turn is connected
wirelessly to the base station (PC/laptop) through the 802.15.4/ZigBee shield, as shown in
Figure 5.10. Sensor data is further transferred to the cloud server for permanent storage.
Figure 5.10: Complete WBAN Demonstration
5.3.2 Traffic Generation
The Arduino application is installed at the base station to gather and visualize the data
coming from the WBAN sensors via the aggregate node. The purpose of the sensor network testbed
deployment is to demonstrate the DDoS attack detection scenario in a real-time environment,
which depends upon the rate of flow of packets in the network. Therefore, an application is
also needed that captures the packets coming from the deployed sensor network for further
analysis and attack classification. For this purpose, the Wireshark application is installed
at the base station. The two applications, i.e. the Arduino application and Wireshark, run
simultaneously at the base station in order to generate and compare the real-time sensor
network traffic.
The Arduino application is used to gather and visualize data from the WBAN sensors, whereas
Wireshark is used to capture the packets coming from these sensors. These packets are then
used to calculate the statistical features listed in Table 5.8 and discussed in Chapter 4.
Figure 5.11: Arduino IDE serial monitor
Table 5.8: List of Statistical Features
Feature | Description
Packet Loss Percentage | The number of packets or bytes lost due to the interaction of the legitimate traffic with the attack traffic
Delay or Latency | The time taken by the packets to travel from source to destination
Jitter | The variation in delay or packet delay variation. It is the variation in the time between packets arriving within a particular window
Throughput | The number of bytes transferred per unit time from source to destination
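As a rough illustration of how the four features in Table 5.8 might be derived from a packet capture, the following Python sketch computes them for one observation window. The record layout (`send_time`, `recv_time`, `size_bytes`), the function name `window_features` and the use of the mean absolute change in delay between consecutive packets as the jitter estimate are assumptions made for this example; they are not taken from the testbed implementation.

```python
from statistics import mean

def window_features(sent_count, received, window_seconds):
    """Compute Table 5.8-style features for one observation window.

    sent_count     -- number of packets the sensors sent in the window
    received       -- list of (send_time, recv_time, size_bytes) tuples, in
                      arrival order, for packets that reached the base station
    window_seconds -- length of the window in seconds
    """
    delays = [recv - send for send, recv, _ in received]
    # Packet loss percentage: share of sent packets that never arrived.
    loss_pct = 100.0 * (sent_count - len(received)) / sent_count
    # Delay: mean source-to-destination time of the delivered packets.
    avg_delay = mean(delays) if delays else 0.0
    # Jitter (assumed estimator): mean absolute change in delay between
    # consecutive packets.
    diffs = [abs(b - a) for a, b in zip(delays, delays[1:])]
    jitter = mean(diffs) if diffs else 0.0
    # Throughput: bytes delivered per second.
    throughput = sum(size for _, _, size in received) / window_seconds
    return {"loss_pct": loss_pct, "delay": avg_delay,
            "jitter": jitter, "throughput": throughput}
```

One feature vector per window and per sensor node ID could then be passed on to the pre-processing phase.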
Sensor nodes are identified by their sensor node ID. Legitimate traffic is generated with a
fixed delay, a fixed packet size and a fixed packet rate, whereas the attack traffic is
generated by loading the DDoS attack code onto four sensor nodes using the Arduino application.
The packet rate of the attack traffic is 150 packets per second, while the packet size remains
fixed. The complete experiment is run for 1 hour, during which 50,000 data instances are
collected. These data instances are divided into five datasets containing different numbers of
instances for evaluation purposes.
The resulting dataset contains both attack and non-attack data and is fed into the
pre-processing phase. After pre-processing, the dataset is divided into 20% testing data and
80% training data to train the classifier in the attack classification phase.
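The 80/20 division described above can be sketched in a few lines of Python. Whether the instances were shuffled before splitting is not stated here, so the shuffle, the fixed seed and the helper name `split_dataset` are assumptions of this example.

```python
import random

def split_dataset(instances, test_fraction=0.2, seed=0):
    """Shuffle the instances and return (training, testing) lists."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)   # assumed: random split, fixed seed
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Stand-in for the 50,000 collected instances: integers instead of feature rows.
training, testing = split_dataset(range(50_000))
print(len(training), len(testing))
```

For a data stream in which arrival order matters, a split by time (first 80% for training, last 20% for testing) would be an equally plausible reading.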
5.3.3 Performance Evaluation and Comparative Analysis
To evaluate the performance of the proposed algorithm on the real-time cloud-assisted WBAN
testbed, the following performance evaluation metrics are selected and employed. These
evaluation metrics are discussed in Section 5.1.
• Attack Detection Accuracy
• False Alarm Rate
• Sensitivity and Specificity
• Cost
• Resource Usage
Attack Detection Accuracy
As given in Section 5.1, attack detection accuracy is defined as the ratio of the number of
correct predictions to the total number of tested examples and is calculated using Equation
5.1. In Figure 5.12, we analyze the effect of noise on the attack detection accuracy of EVFDT
for different numbers of instances $I_n$. As shown in the figure, for 0% noise, the attack
detection accuracy increases as $I_n$ increases. As the noise percentage increases, however,
the attack detection accuracy decreases even at the maximum $I_n$. For instance, for 0% noise,
the attack detection accuracy is nearly 100% for $I_n$ = 50,000, whereas for 15% noise the
detection accuracy decreases to 90.1% for the same value of $I_n$. A similar decrease in attack
detection accuracy is observable with each increase in the noise percentage in this experiment.
In Figure 5.13, the proposed classification algorithm EVFDT is compared with existing
stream mining classification algorithms [34] [35] [36] in terms of attack detection accuracy
at different noise percentages for $I_n$ = 40,000. It is evident from the figure that EVFDT
maintains a high accuracy of up to 98.7% at 0% noise compared with the other VFDT variants.
Although increasing noise affects the detection accuracy, it remains higher than that of the
other VFDT variants.
Figure 5.12: Attack Detection Accuracy for Different Noise(%)
Figure 5.13: Attack Detection Accuracy Comparison with Different Noise(%)
Table 5.9 presents the attack detection accuracy of EVFDT and the existing classification
algorithms. From the table, it can be seen that the detection accuracy of EVFDT is 95.7% for
$I_n$ = 10,000 and increases as $I_n$ increases. For $I_n$ = 50,000, the detection accuracy
reaches 98.8%. For the same dataset, EVFDT achieves a maximum gain of 6.6% and a minimum gain
of 2.4% in attack detection accuracy over the existing algorithms.
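The quoted gains can be checked directly against the $I_n$ = 50,000 row of Table 5.9. The short Python sketch below does this arithmetic; the dictionary layout and the helper name `gains_over_baselines` are illustrative only.

```python
# Detection accuracy (%) at I_n = 50,000, copied from Table 5.9.
accuracy_50k = {"VFDT-tau": 92.2, "CVFDT": 93.8, "OVFDT": 96.4, "EVFDT": 98.8}

def gains_over_baselines(accuracy, proposed="EVFDT"):
    """Return the accuracy gain of the proposed algorithm over each baseline."""
    return {name: round(accuracy[proposed] - value, 1)
            for name, value in accuracy.items() if name != proposed}

gains = gains_over_baselines(accuracy_50k)
# The largest gain is over VFDT-tau and the smallest over OVFDT.
print(max(gains.values()), min(gains.values()))
```

The differences are percentage points: the largest is against VFDT-τ and the smallest against OVFDT.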
False Alarm Rate
As discussed in Section 5.1, the rate of false alarms generated by an attack detection
technique depends on the combination of two important factors: the false positive rate (FPR)
and the false negative rate (FNR). These can be calculated using Equation 5.2 (FPR) and
Equation 5.3 (FNR).
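Equations 5.1 to 5.3 are defined in Section 5.1 and are not reproduced here. As a minimal sketch, assuming they follow the usual confusion-matrix definitions, accuracy, FPR and FNR could be computed as below. The function name and the example counts are hypothetical and are not measurements from the testbed.

```python
def detection_metrics(tp, tn, fp, fn):
    """Return (accuracy, FPR, FNR) in percent from confusion-matrix counts.

    tp / fn -- attack instances classified as attack / as normal
    tn / fp -- normal instances classified as normal / as attack
    """
    total = tp + tn + fp + fn
    accuracy = 100.0 * (tp + tn) / total   # correct predictions / tested examples
    fpr = 100.0 * fp / (fp + tn)           # normal traffic flagged as attack
    fnr = 100.0 * fn / (fn + tp)           # attack traffic that was missed
    return accuracy, fpr, fnr

# Made-up counts for a 10,000-instance test set, for illustration only.
accuracy, fpr, fnr = detection_metrics(tp=4900, tn=4950, fp=50, fn=100)
print(accuracy, fpr, fnr)   # 98.5 1.0 2.0
```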
Table 5.9: Experimental Results of Attack Detection Accuracy (%) for Real-Time Datasets
$I_n$ | VFDT-τ | CVFDT | OVFDT | EVFDT
10,000 | 88.1 | 90.0 | 91.3 | 95.7
20,000 | 88.9 | 91.2 | 92.5 | 96.8
30,000 | 89.6 | 92.1 | 93.7 | 97.6
40,000 | 91.0 | 93.0 | 94.6 | 98.1
50,000 | 92.2 | 93.8 | 96.4 | 98.8
Average | 89.8 | 92.0 | 93.6 | 97.4
Figure 5.14: Effect of Noise% on FPR and FNR (a) False Positive Rate (b) False Negative Rate
To assess the performance of EVFDT in terms of false alarm generation, we analyze the effect
of the noise percentage and the number of instances $I_n$ on the false positive rate and false
negative rate. Figure 5.14 shows the effect of the noise percentage on the false positive rate
and false negative rate. It is observed that the FPR and FNR increase as the noise percentage
increases. The goal is to maintain a low FPR and FNR even under extreme noise, and the
proposed EVFDT achieves this objective by maintaining a lower FPR and FNR.
Similarly, Figure 5.15 illustrates the false positive rate and false negative rate for varying
$I_n$. It is seen that increasing $I_n$ decreases the FPR and FNR. For $I_n$ = 50,000, CVFDT
has the highest FPR and FNR, i.e. FPR = 4.1 and FNR = 3.1, whereas for the same $I_n$ the
proposed EVFDT has the lowest FPR and FNR, i.e. FPR = 0.8 and FNR = 1.1. This means that EVFDT
generates fewer false alarms than the other techniques for varying $I_n$.
Figure 5.15: Effect of $I_n$ on FPR and FNR (a) False Positive Rate (b) False Negative Rate
Sensitivity and Specificity
Sensitivity and specificity are two important statistical measures that are useful for
evaluating the performance of the proposed classification algorithm. They are calculated using
Equation 5.5 (sensitivity) and Equation 5.6 (specificity). Table 5.10 shows the sensitivity and
specificity of the proposed and existing attack classification algorithms at different noise
percentages.
Table 5.10: Sensitivity and Specificity of Existing and Proposed Classification Algorithms in Percentage
(Sens. = Sensitivity, Spec. = Specificity)
Noise | VFDT-τ Sens. | VFDT-τ Spec. | CVFDT Sens. | CVFDT Spec. | OVFDT Sens. | OVFDT Spec. | EVFDT Sens. | EVFDT Spec.
0% | 87.8 | 97.4 | 88.8 | 99.3 | 92.2 | 90.9 | 98.9 | 96.7
5% | 79.4 | 91.7 | 79.9 | 91.7 | 90.2 | 86.7 | 95.3 | 90.0
10% | 79.2 | 88.8 | 69.8 | 93.8 | 88.5 | 84.7 | 86.2 | 88.4
15% | 75.3 | 87.3 | 68.7 | 88.9 | 86.2 | 80.8 | 80.2 | 86.4
20% | 78.7 | 85.3 | 66.9 | 81.1 | 83.9 | 76.2 | 78.1 | 84.5
Avg | 80.0 | 89.0 | 74.8 | 90.9 | 88.2 | 83.8 | 87.7 | 89.2
In Figure 5.16, the average sensitivity vs specificity is plotted, showing the true positives,
true negatives, false positives and false negatives of the existing and proposed classification
algorithms. From Figure 5.16, three possible outcomes are identified:
1. High Specificity vs Low Sensitivity: In this case, a positive test means that an attack is
probable, since there are few false positives. A negative test, however, is not very helpful
in decision making, because low sensitivity implies many false negatives. As shown in
Figure 5.16, VFDT-τ and CVFDT fall into this category, which leads to the conclusion that
both have high false negatives.
2. Low Specificity vs High Sensitivity: In this case, a negative test means that an attack is
not probable, since there are few false negatives. A positive test, however, is not very
helpful in decision making, because low specificity implies many false positives. As shown
in Figure 5.16, OVFDT falls into this category, which leads to the conclusion that OVFDT
suffers from high false positives.
3. High Specificity vs High Sensitivity: This is the ideal case. A positive test result
means that an attack is probable and a negative test result means that an attack is not
probable. In Figure 5.16(d), EVFDT shows this ideal case, with a sensitivity of 87.7% and a
specificity of 89.2%.
Figure 5.16: Sensitivity vs Specificity (a) VFDT-τ (b) CVFDT (c) OVFDT (d) EVFDT
Based on the sensitivity and specificity, a Receiver Operating Characteristic (ROC) curve
is plotted to assess the effectiveness of a given detection technique. Figure 5.17 shows the
ROC curves of EVFDT and the existing algorithms. From the figure, it is evident that the
proposed EVFDT has a high sensitivity (true positive rate) with a lower false positive rate
(100 - Specificity) than the other existing algorithms.
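To make the mapping from Table 5.10 to ROC coordinates concrete, the Python sketch below converts the average sensitivity and specificity of each algorithm into a single operating point (false positive rate, true positive rate). It only illustrates the coordinate transform: a full ROC curve needs one such point per decision threshold, and the data structure and function name are assumptions of this example.

```python
# Average sensitivity and specificity (%) per algorithm, copied from Table 5.10.
table_5_10_avg = {
    "VFDT-tau": (80.0, 89.0),
    "CVFDT": (74.8, 90.9),
    "OVFDT": (88.2, 83.8),
    "EVFDT": (87.7, 89.2),
}

def roc_point(sensitivity, specificity):
    """Map (sensitivity, specificity) to ROC coordinates (FPR, TPR) in percent."""
    return (round(100.0 - specificity, 1), sensitivity)

roc_points = {name: roc_point(*pair) for name, pair in table_5_10_avg.items()}
for name, (fpr, tpr) in roc_points.items():
    print(f"{name}: FPR = {fpr}%, TPR = {tpr}%")
```

A point closer to the top-left corner (low FPR, high TPR) indicates a better trade-off.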
Figure 5.17: ROC curves showing the tradeoff between Sensitivity and false-positive rate (100 - Specificity) of DDoS attacks
Cost
In addition to accuracy and false alarm rate, it is also important to consider computational
cost when evaluating the performance of a classification algorithm on the real-time WBAN
testbed. The computational cost of the proposed and existing classification algorithms is
calculated using Equation 5.3. For the evaluation and comparison of the classification
algorithms, the parameter λ is set to 5. Figure 5.18 compares the computational cost of the
proposed and existing classification algorithms as a bar chart. From the figure, it is evident
that the computational cost per sample of EVFDT is lower than that of VFDT and its variants.
CVFDT has the highest computational cost of all because it builds two classification trees
simultaneously. VFDT has a low computational cost, nearly equal to that of EVFDT, but at the
same time it yields very low attack detection accuracy and a high false alarm rate.
Figure 5.18: Computational Cost Comparison
Resource Usage
Resource usage includes the overall CPU time and memory used to run the classification
algorithm. Both computational time and memory usage are calculated using Big O notation.
Computational time is the time, in seconds, taken by the CPU to process a full set of data
stream samples. Figure 5.19 shows the computational time of the proposed and existing
classification algorithms. It is evident from the figure that VFDT-τ has a small running time
owing to its small, fixed value of τ. EVFDT takes slightly more time than VFDT-τ because of
pruning and tie-breaking threshold computation.
Figure 5.19: Computational Time Comparison
Similarly, the total amount of memory required to run EVFDT is the sum of the memory allocated
for learning and the memory allocated for training. The total memory required for running
EVFDT and the existing algorithms is shown in Figure 5.20. As shown in the figure, the amount
of memory increases as $I_n$ increases. CVFDT consumes more memory than the other techniques
because it maintains two classification trees in memory simultaneously. The memory usage of
EVFDT is lower because it calculates the tie-breaking threshold τ at run time. Compared with
EVFDT, VFDT-τ consumes a little more memory because of the initial selection and declaration
of the decisive parameter τ.
5.4 Qualitative Comparison of Classification Algorithms
A comparison of the EVFDT classification algorithm with the existing VFDT and its variants is
shown in Table 5.11. Only OVFDT and EVFDT can handle noisy data. However, OVFDT handles noisy
data only to some extent and becomes inaccurate as the noise percentage increases, owing to
the presence of outliers. VFDT-τ, CVFDT and Improved VFDT (IVFDT) [63] do not provide tree
pruning. They maintain a small tree size at the beginning, but as the noise percentage
increases the tree size starts to grow, which in turn requires more computational time and
consumes more memory. Only the EVFDT classification algorithm handles noisy data efficiently
while maintaining a small tree size with low resource usage.
Figure 5.20: Memory Usage Comparison
5.5 Conclusion
In this chapter, we have evaluated and compared the performance of the DDoS attack detection
technique proposed in Chapter 4 against several performance evaluation parameters, including
attack detection accuracy, false alarm rate, cost, tree size, sensitivity vs specificity and
resource usage. For the evaluation of the proposed technique, the attack detection accuracy is
analyzed for variations in the number of instances $I_n$ and in the percentage of noise present
in the data. The overall analysis shows that the attack detection accuracy increases as $I_n$
increases, whereas it decreases significantly as the noise percentage increases. Further,
simulation experiments are performed to assess the false positive rate and false negative
rate. The results show a significant increase in FPR and FNR as the noise percentage
increases, whereas increasing $I_n$ decreases the rate of false positives and false negatives.
Similar results were obtained for sensitivity and specificity, i.e. an increase in the noise
percentage decreases both. The sensitivity and specificity of a system directly reflect its
accuracy. The high sensitivity and specificity of the proposed system indicate that it is more
accurate in detecting attacks than the existing techniques. The computational cost is
calculated only for the real-time generated data, in order to measure the cost of computation
on sensor nodes. It is evident from the simulation results that the computational cost of the
proposed algorithm is lower
Table 5.11: Qualitative Comparison of Proposed and Existing Classification Algorithms
Features | VFDT-τ | CVFDT | OVFDT | EVFDT
Detection Accuracy | Very low | Very low | Good; does not handle outliers | Excellent
Computational Cost | Very low | Very high | High | Very low
Resource Usage (Time/Memory) | Less time; more memory | More time in building two trees and requires additional memory | Less time; less memory | Same time as VFDT but consumes very little memory
Noisy Data Handling | Does not handle noisy data | Not appropriate under noisy data | HB fluctuation intensifies under noisy data; accuracy decreases | Handles noisy data efficiently
Tree Size/Pruning | Small tree size; no pruning | Same tree size as VFDT; no pruning | Small tree size; incremental pruning | Small tree size; iterative pruning
Computational Resources | Consumes fewer resources | Consumes more resources by maintaining two trees | Consumes fewer resources | Consumes very few resources by cutting off HB outliers
compared to existing algorithms. Likewise, the resource usage of the proposed technique is
superior to that of the other techniques. The computational time of the proposed technique is
slightly greater than that of VFDT-τ because of pruning and threshold computation; VFDT-τ does
not perform pruning at all and therefore takes less computation time. Finally, a qualitative
comparison is performed to show the superiority of the proposed attack detection algorithm.
The qualitative analysis shows that only OVFDT and EVFDT can handle noisy data, and that
VFDT-τ, CVFDT and IVFDT do not provide tree pruning at all.
Chapter 6
PROPOSED TRACEBACK SCHEME FOR DISTRIBUTED DENIAL
OF SERVICE ATTACK
With the increasing popularity of cloud-assisted WBANs for critical health applications, the
demand for securing these networks is also increasing. One of the major threats to these
networks is the DDoS attack, which not only exhausts the network capacity but also prevents
these networks from performing their desired tasks [64].
In a DDoS attack, the key issue lies in detecting the attack and invoking the appropriate
traceback mechanism. In Chapter 4, a machine-learning-based attack detection algorithm
is proposed to detect distributed denial of service attacks in the WBAN environment. In this
chapter, a novel traceback technique is proposed and discussed.
Traceback requires reconstructing the attack path and identifying the source of the
distributed denial of service attack. Traceback techniques proposed for conventional IP-based
networks [38], [39], [40], [41] are not directly applicable to the resource-constrained WBAN
environment because of their additional overhead requirements and high convergence time.
Similarly, several traceback techniques are available for MANET [42] and WSN [43] that
overcome the limitation of overhead, but at the cost of additional processing and storage
requirements.
Results and analysis show that none of the available solutions is appropriate for traceback
of DDoS attacks in the cloud-assisted WBAN environment. Among the available techniques,
fishbone traceback (FBT) [44] is specifically proposed for hierarchical Wireless Sensor
Networks (WSN). It is based on the edge sampling approach [40] and appears to be more
appropriate than the other techniques because it is lightweight and easily implemented in a
WSN. FBT uses a marking probability distribution function that assigns a fixed marking
probability to all the nodes in order to minimize the convergence time but, at the same time,
it increases the overhead on the nodes.
In this chapter, a new traceback technique called Efficient Traceback Technique (ETT) is
proposed, to be deployed specifically in the resource-constrained WBAN environment. The
proposed technique assigns a dynamic marking probability to each node based on the number
of hops the packet has travelled since it originated from the source. The number of hops can
be calculated as the distance travelled by the packet from the source. Finally, a path
reconstruction algorithm is proposed to trace back the attacker. Results and comparison show
that the proposed technique has a lower convergence time than fixed PPM. Similarly, the
proposed technique results in less computational overhead on the nodes than other available
schemes.
Section 6.1 presents the preliminaries and gives an introduction to PPM. Further, the
problems related to choosing the marking probability are examined, and the three key issues
relevant to this choice are discussed in detail. Section 6.2 describes the proposed packet
marking technique for both multi-hop and single-hop topologies. For packet marking, a novel
labeling technique is proposed to find the distance a packet has travelled from its origin.
Subsequently, a working example is given to show the effectiveness of the proposed technique.
In Section 6.3, DDoS attacker traceback algorithms are proposed for path reconstruction and
identification of an attacker. This mechanism comprises two procedures: the Procedure for
Aggregate Node Path Reconstruction (which reconstructs the path from the victim to the
aggregate node of the cluster that contains the attacker and the source node), and the
Procedure for Sensor Node Path Reconstruction (which reconstructs the path from the aggregate
node to the source node from which the attack originates). Finally, in Section 6.4,
performance evaluation and benchmarking are given.
6.1 Preliminaries
In a sensor network environment, one key feature is that the source node itself inserts
its source address in the MAC header before it sends any packet. This allows a number of
anonymous attacks on sensor networks [40].
A number of approaches are available to trace back the source of an attack, and packet
marking is one of them. In the packet marking approach, each node places some path information
in every passing packet until it reaches the victim. The victim node reconstructs the attack
path by collecting a certain number of packets along the network path.
Among packet marking approaches, probabilistic packet marking (PPM) is considered the
most well-known solution for traceback of DDoS attacks because PPM has a small implementation
and management overhead owing to the probabilistic nature of the algorithm.
6.1.1 Probabilistic Packet Marking
PPM-based traceback can be divided into a packet marking phase and a path reconstruction
phase. In the packet marking phase, each originating packet is marked with some probability as
it passes through the intermediate nodes along the attack path. In the reconstruction phase,
the victim node uses the path information recorded in the packets to reconstruct the attack
path and locate the source of the attack. For recording path information, node sampling, node
append and edge sampling are widely used techniques [40].
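The two phases can be illustrated with a small simulation of the simplest recording scheme, node sampling, under a fixed marking probability. The sketch below is a toy model rather than the technique evaluated in this thesis: it assumes a single linear path, a one-slot marking field that later nodes overwrite, and a victim that orders nodes by how rarely their marks arrive. All function names are hypothetical.

```python
import random
from collections import Counter

def simulate_ppm(num_nodes, tau, num_packets, seed=0):
    """Marking phase: each node on the path n1..nN overwrites the mark with
    probability tau. Returns a Counter mapping the node whose mark survived
    (1 = nearest the attacker, 0 = packet arrived unmarked) to a packet count.
    """
    rng = random.Random(seed)
    final_marks = Counter()
    for _ in range(num_packets):
        mark = 0                      # 0 means "no node marked this packet"
        for node in range(1, num_nodes + 1):
            if rng.random() < tau:    # node decides to mark
                mark = node           # a later mark overwrites an earlier one
        final_marks[mark] += 1
    return final_marks

def reconstruct_path(final_marks):
    """Reconstruction phase: nodes whose marks arrive less often are assumed
    to be farther from the victim, so sort by ascending mark count."""
    nodes = [node for node in final_marks if node != 0]
    return sorted(nodes, key=lambda node: final_marks[node])

marks = simulate_ppm(num_nodes=5, tau=0.2, num_packets=20_000, seed=1)
print(reconstruct_path(marks))   # attacker side first, victim side last
```

With enough packets the ordering converges to the true path; with few packets, marks from the nodes nearest the attacker may be missing altogether, which is the convergence problem examined in the next section.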
6.1.2 Key Issues in Selecting Probability
In a DDoS attack, the traceback mechanism is carried out between an attacker and the victim.
Attackers hide their identity using spoofing and restrict the number of attack packets,
whereas the victim needs to choose an appropriate marking scheme to locate the attacker. For
an efficient PPM mechanism, the key issue lies in selecting a suitable marking probability τ
for easy and accurate traceback in the WBAN environment [46]. The key issues are explained as
follows:
1. At-Least-One-Marking per Sensor Node: According to the graphical network topology shown
in Figure 6.1, let $A$ be the attack path such that $A = \{a, n_1, n_2, \ldots, n_N, v\}$,
where $a$ represents the attacker, $v$ denotes the victim of the DDoS attack and $n_i$
$(i = 1, 2, \ldots, N)$ represents the $N$ sensor nodes (including the aggregate node) along
the attack path.
Figure 6.1: Graphical Network Topology
Suppose node $n_i$ has a marking probability $\tau_i$. The residual probability $\varphi_i$
is defined as the probability that an attack packet was last marked by node $n_i$ and not by
any other node further down the attack path. From the perspective of the victim $v$,
$\varphi_i$ is the probability that $v$ learns that node $n_i$ is on the attack path by
inspecting an incoming packet. The residual probability $\varphi_i$ is given by:
$$\varphi_i = \begin{cases} \prod_{j=1}^{N} (1 - \tau_j) & i = 0 \\ \tau_i \prod_{j=i+1}^{N} (1 - \tau_j) & 1 \le i < N \\ \tau_i & i = N \end{cases} \tag{6.1}$$
If all nodes use the same fixed marking probability, then $\tau_1 = \tau_2 = \ldots = \tau_N \equiv \tau$.
From Equation 6.1, we have:
$$\varphi_i = \tau (1 - \tau)^{N-i} \quad \text{for } 1 \le i \le N \tag{6.2}$$
From Equation 6.2, it is concluded that the residual probability $\varphi_i$ of node $n_i$
becomes geometrically smaller the closer the node is to the attacker:
$$\varphi_1 < \varphi_2 < \ldots < \varphi_N \tag{6.3}$$
From Equation 6.3, it is concluded that node $n_1$ has the lowest and node $n_N$ the highest
probability of delivering its marking information to the victim node $v$. The victim $v$
cannot determine that node $n_i$ is on the attack path until it receives a packet that
contains a marking left by node $n_i$. Therefore, the victim must receive at least one marking
from each node $n_i$ on the attack path for the attack path to be reconstructed successfully.
Let $P$ be the attack volume, i.e. the number of attack packets sent from the attacker $a$ to
the victim $v$. To satisfy the at-least-one-marking requirement for every node $n_i$, an
efficient PPM-based traceback must meet the following criterion:
$$P \varphi_1 = P \tau (1 - \tau)^{N-1} \ge 1 \tag{6.4}$$
In Figure 6.2, the possible values of $\varphi_1$ for node $n_1$ are plotted with respect to
$\tau$ and $N$ using Equation 6.2. It is evident from Figure 6.2 that, for each value of $N$,
the peak occurs at $\tau = 1/N$; e.g., for $N = 25$, the peak occurs at $\tau = 1/25$, for
which $\varphi_1 = 0.0277$. Because the value of $N$ (the total number of nodes between $a$
and $v$) varies and is unknown to the victim, it is difficult to decide the ideal marking
probability a priori.
Figure 6.2: Residual Probability $\varphi_1$ for node $n_1$
One possible solution is to select a small $\tau$. However, this allows the attacker to reduce
the attack volume so that only a limited range of $\tau$ remains available for successful
traceback.
2. Spoofing: Besides spoofing the source address, the attacker may also spoof the marking
field of the packets by writing false data into it in order to conceal his/her identity or
the attack path. This process is termed a spoofed marking attack [46]. From the victim's
perspective, if a packet is not marked by any intermediate node $n_i$ along the path, the
false data left in the marking field by the attacker may lead to inaccurate path
reconstruction. The probability that the packet remains unmarked is expressed as:
$$\varphi_0 = (1 - \tau)^N \tag{6.5}$$
Taking $\varphi_0$ along the y-axis, a graph is plotted with respect to $\tau$ and $N$, as
shown in Figure 6.3. The graph shows that $\varphi_0$ decreases as $\tau$ increases, for every
number of nodes $N$.
Figure 6.3: Unmarked Probability $\varphi_0$
3. Uncertainty: Packets whose marking fields are spoofed with false data also cause
uncertainty in traceback. The concept of uncertainty was introduced by Park and Lee [65] and
is explained with the help of Figure 6.4. Recall the previous assumption, in which the attack
path is defined as $A = \{a, n_1, n_2, \ldots, n_N, v\}$. As shown in Figure 6.4, an attacker
initiates an attack by spoofing the marking field with the false data $(l_1, n_1)$, where
$l_1$ is the legitimate node that is spoofed. If the spoofed packet remains unmarked by the
other nodes along the path before reaching the victim node $v$, it is considered a legitimate
packet originating from $l_1$. A similar scenario is assumed for the other nodes
$l_2, l_3, \ldots, l_K$, where $K$ is the uncertainty factor, defined as the total number of
fake sources of an attack besides the actual attacker $a$. Hence, the total number of possible
attack sources identified by a traceback technique is $(K + 1)$.
Figure 6.4: Falsify Paths
As discussed before, the node closest to the attacker has the smallest residual probability
$\varphi_i$ of all the nodes. The attacker takes advantage of this by keeping all the spoofed
packets unmarked and sending them to the victim $v$ so that they appear to have been marked by
node $n_1$. This scenario is represented as:
$$K \varphi_1 = \varphi_0 \tag{6.6}$$
Solving Equation 6.6 by substituting $\varphi_1$ from Equation 6.2 and $\varphi_0$ from
Equation 6.5, we obtain
$$K = \frac{1}{\tau} - 1 \tag{6.7}$$
From Equation 6.7, it is observed that the uncertainty factor $K$ decreases as the marking
probability $\tau$ increases. In the original PPM approach the marking probability $\tau$ is
fixed, and increasing this fixed $\tau$ reduces the uncertainty factor $K$.
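The three quantities discussed in this section can be evaluated numerically with the Python sketch below. It takes the exponent in Equation 6.2 as $N - i$, which is what the ordering in Equation 6.3 and the $i = 1$ case in Equation 6.4 imply. The function names and the helper `min_attack_packets` (the smallest $P$ satisfying Equation 6.4) are introduced here for illustration only.

```python
import math

def residual_probability(i, n, tau):
    """Eq. 6.2 (fixed tau): probability that node n_i's mark reaches the victim."""
    return tau * (1 - tau) ** (n - i)

def unmarked_probability(n, tau):
    """Eq. 6.5: probability that no node on the path marks the packet."""
    return (1 - tau) ** n

def uncertainty_factor(tau):
    """Eq. 6.7: K = 1/tau - 1, obtained from K * phi_1 = phi_0 (Eq. 6.6)."""
    return 1 / tau - 1

def min_attack_packets(n, tau):
    """Smallest attack volume P with P * phi_1 >= 1 (Eq. 6.4)."""
    return math.ceil(1 / residual_probability(1, n, tau))

n, tau = 5, 0.2
print([round(residual_probability(i, n, tau), 4) for i in range(1, n + 1)])
print(round(unmarked_probability(n, tau), 4), uncertainty_factor(tau))
print(min_attack_packets(n, tau))
```

For a fixed $N$, `residual_probability(1, N, tau)` is largest at `tau = 1 / N`, which matches the peak described for Figure 6.2, while `uncertainty_factor` shrinks as `tau` grows; these two pull the choice of a fixed $\tau$ in opposite directions.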
6.2 Proposed Traceback Technique
The existing PPM approaches proposed for sensor networks use a fixed marking probability
$\tau_i$ for packet marking, which results in high convergence time, additional overhead and
uncertainty, as discussed in Section 3.2. The root cause of these problems is the assignment
of an uneven residual probability $\varphi_i$ to the sensor nodes $n_i$ along the attack path.
Liu et al. [46] introduced the dynamic probabilistic packet marking (DPPM) approach, in which
the marking probability assigned to each node is based on the distance travelled by the
packet. DPPM uses the Time-to-Live (TTL) field in the IP header to determine the travelling
distance of each packet passing through a router. Sensor networks, however, operate at the MAC
layer, so determining the travelling distance is a key issue. Using the TCP protocol in a WSN
would itself increase the overhead because of the three-way handshake. In the following
section, we present a new traceback technique specifically for resource-constrained WBANs.
The proposed technique is based on DPPM and uses the MAC header instead of the IP header. The
proposed technique has the following features:
1. It assigns a uniform residual probability ϕi to all the nodes ni along the attack path, with
the aim of reducing the overall convergence time.
2. It reduces the overhead on all the nodes by assigning a variable marking probability
that decreases as the packet travels along the attack path towards the victim node.
3. It ensures that each packet is marked at least once, in order to remove the uncertainty
caused by spoofed marking.
The proposed technique works as follows:
Let d denote the travelling distance of a packet such that 1 ≤ d ≤ i, where i is the total
number of nodes along the attack path. Each node ni marks the packet r with a marking
probability equal to the reciprocal of the distance travelled by packet r from its source until
it reaches that particular node. It can be expressed as:
$$\tau_i = \frac{1}{d} \tag{6.8}$$
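The marking rule of Equation 6.8 can be illustrated with a minimal Python sketch. The function names `marking_probability` and `should_mark` are hypothetical and are not part of the thesis; the sketch only assumes that a node already knows the hop distance d of the packet it is handling.

```python
import random


def marking_probability(distance: int) -> float:
    """Marking probability of Equation 6.8: tau = 1/d for a hop distance d >= 1."""
    if distance < 1:
        raise ValueError("distance must be at least 1")
    return 1.0 / distance


def should_mark(distance: int, rng: random.Random) -> bool:
    """Decide whether the node at hop `distance` marks the packet (hypothetical helper)."""
    return rng.random() < marking_probability(distance)


# Marking probabilities for the first five hops on a path.
probabilities = [marking_probability(d) for d in range(1, 6)]
```

Because `marking_probability(1)` is 1, the first node on the path always marks the packet in this sketch, which is consistent with the third feature listed above.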
Given how the proposed technique works, the key issue is how to find the travelling distance
of each packet r from its source. The following section answers this question. To the best
of our knowledge, this is the first attempt to deploy DPPM in a WSN environment.
6.2.1 Finding the Traveling Distance
Before finding the traveling distance of each packet from its source, we first look at the
WBAN network topology shown in Figure 6.5. The network topology can be either multi-hop
or single-hop. Figure 6.5(a) shows the multi-hop WBAN topology, in which sensor
nodes transmit their data to an aggregate node via intermediate nodes. Figure 6.5(b) shows
the single-hop topology, in which each sensor node sends its data directly to an aggregate
node and onward to the base station (BS) via intermediate aggregate nodes.
(a) Multi-Hop WBAN (b) Single-Hop WBAN
Figure 6.5: WBAN Network Topology
To find the traveling distance of each packet from its source, a small number of bytes are
reserved in the data payload of the MAC protocol data unit (MPDU) and labeled as the DPPM
label. Figure 6.6 shows the MPDU with the DPPM label. The labeling mechanism requires very
little change to the IEEE 802.15.4 MAC header. In each packet, only 12 bytes are reserved to
carry the DPPM label for multi-hop WBAN and 10 bytes for single-hop WBAN. Because the label
uses the data payload of the MPDU, which is variable in length, it is acceptable to carry this
amount of data to perform the traceback operation in a WBAN environment.
The length of the DPPM label depends on the topology employed for the WBAN. Next, we
discuss labeling in detail for both the multi-hop and single-hop WBAN topologies.
Figure 6.6: IEEE 802.15.4 with DPPM label
Multi-hop WBAN Topology:
For the multi-hop WBAN topology, 12 bytes are reserved in the MAC data payload and labeled as
P(s) = (Source, End, Initial, Head, Tail, Distance), as shown in Figure 6.7. Each P(s)
represents the packet fields marked at sensor node s along the path. (Source, End) is
associated with regular sensor nodes, whereas (Initial, Head, Tail) is associated with aggregate
nodes and helps in path reconstruction; Distance is used to find the distance travelled
by each packet from its origin.
Figure 6.7: DPPM label
The details of each field are as follows:
• Source: The ID of the originating sensor node of an edge connecting two sensor
nodes; e.g., in Figure 6.8, A is the source node sending a packet to node B. When the
attack packet first originates, the source node writes its node ID to this field of the
packet P(s).
Figure 6.8: Sensor Nodes Connecting with an Edge
• End: The node at the other end of the edge, which receives the packet from the source;
e.g., in Figure 6.8, B is End. Upon receiving the packet, the end node checks the
following conditions before writing its ID in the field: (1) Source field != EMPTY,
(2) Distance field == 0, and (3) the End node and the Source node are in the same cluster.
When these conditions are met, the node writes its ID into the End field of packet P(s). At
this point the Distance field becomes 1.
• Distance: It is defined as the traveling distance from the source to the victim. This
field is incremented by each intermediate node as the packet travels along the attack
path.
• Initial: This field of a packet is written by an aggregate node of the cluster where
source node is present and remains same along the path until the packet reaches the
victim. The attack node also lies in the same cluster as the aggregate node.
• Head: This field is written by aggregate node of current cluster and contains the head
of an edge for aggregate nodes. This field is updated by every downstream aggregate
node upon the arrival of packet.
• Tail: Upon receiving the packet, the aggregate node updates this field with the tail of
an edge for aggregate node. This field is also written by aggregate nodes only.
Working Example: This section gives a detailed working example of finding the traveling
distance of a packet. A multi-hop WBAN network topology is shown in Figure
6.9(a). It consists of four clusters, where each cluster has regular sensor nodes and one
aggregate node that acts as a cluster head. Each sensor node sends its data to an aggregate
node either directly or via other regular sensor nodes. Similarly, each aggregate node forwards
its data to the base station (BS) either directly or via intermediate aggregate nodes. Suppose
attacker a1 launches a DDoS attack towards the BS by sending out spikes of packets. Figure 6.9(b)
shows the sequence of packets traveling along the path towards the BS. Every node updates
the fields of a packet P(s) so that the distance can be found and the path reconstructed
successfully. The sequence is as follows:
• Sensor node a2 writes its ID into the Source field of packet P(s). After reaching node
a2, the DPPM label becomes (a2, 0, 0, 0, 0, 0).
• At node a3, the DPPM label is updated to P(a3) = (a2, a3, 0, 0, 0, 1). At this stage, the
Distance field is incremented by 1.
• Upon reaching aggregate node A, the DPPM label is updated to P(A) =
(a2, a3, A, A, 0, 2).
• When aggregate node B receives the packet, it puts its ID in the Tail field, giving
P(B) = (a2, a3, A, A, B, 3). The values of Initial and Head remain the
same and Distance is incremented by 1.
• Similarly, aggregate nodes C and D successively update the packet.
• Finally, the packet reaches the base station with DPPM label (a2, a3, A, C, D, 5).
Figure 6.9: (a): Multi-Hop WBAN Topology
Figure 6.10: (b): Sequence of Packet Traveling Along the Path
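The label updates in the working example can be replayed with a short Python sketch. The class `DppmLabel` and the functions `mark_at_sensor` and `mark_at_aggregate` are hypothetical names. The rule used for aggregate nodes (fill Initial and Head first, then Tail, then shift Tail into Head) is an assumption inferred from the labels listed in the example, not a rule stated by the thesis. The sketch also updates the label at every hop and ignores the probabilistic marking decision.

```python
from dataclasses import dataclass, astuple

EMPTY = 0  # the example writes unset fields as 0


@dataclass
class DppmLabel:
    source: object = EMPTY
    end: object = EMPTY
    initial: object = EMPTY
    head: object = EMPTY
    tail: object = EMPTY
    distance: int = 0


def mark_at_sensor(label: DppmLabel, node_id: str) -> DppmLabel:
    """Update applied by a regular sensor node (same cluster assumed)."""
    if label.source == EMPTY:
        label.source = node_id  # first node on the path writes Source
        return label
    if label.end == EMPTY and label.distance == 0:
        label.end = node_id  # End conditions from the field description
    label.distance += 1
    return label


def mark_at_aggregate(label: DppmLabel, node_id: str) -> DppmLabel:
    """Update applied by an aggregate node (rule inferred from the example)."""
    if label.initial == EMPTY:
        label.initial = node_id
        label.head = node_id
    elif label.tail == EMPTY:
        label.tail = node_id
    else:
        label.head, label.tail = label.tail, node_id  # slide the edge forward
    label.distance += 1
    return label


label = DppmLabel()
trace = []
for node in ("a2", "a3"):
    mark_at_sensor(label, node)
    trace.append(astuple(label))
for node in ("A", "B", "C", "D"):
    mark_at_aggregate(label, node)
    trace.append(astuple(label))
```

Under these assumptions, `trace` holds one label per hop, and its last entry is the label that arrives at the base station.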
Single-hop WBAN Topology:
For the single-hop WBAN topology, only 10 bytes are reserved in the MAC data payload and
labeled as P(s) = (Source, Initial, Head, Tail, Distance). The End field is redundant
and is therefore eliminated. The rest of the marking procedure for the single-hop topology is
the same as discussed for the multi-hop WBAN topology.
6.2.2 Uniform Residual Probability
As discussed in Section 6.1.2, the key feature of the proposed technique is that it maintains a
uniform residual probability ϕi. To attain this, each node chooses its marking probability
τi = 1/d, where d = 1, 2, ..., N is the traveling distance of the packet from its source.
For N sensor nodes, the residual probability of the last node is given as:
$$\phi_N = \frac{1}{N} \tag{6.9}$$
Similarly, for the other nodes the residual probability ϕi is calculated by solving Equation 6.1:
$$\phi_i = \tau_i \prod_{j=i+1}^{N} (1 - \tau_j), \qquad 1 \le i < N \tag{6.10}$$
With τj = 1/j the product telescopes, (1/i)(i/(i+1)) ... ((N−1)/N), which gives:
$$\phi_i = \frac{1}{N} \quad \text{for } 1 \le i < N \tag{6.11}$$
From Equations 6.9 and 6.11, it is concluded that each node ni along the attack path
maintains a uniform residual probability ϕi of marking each packet before it reaches the
victim. Because the node at distance d = 1 marks with probability 1, every packet is marked
legitimately and none is left unmarked, which results in no uncertainty at all. This is
evaluated further in Chapter 7.
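The uniformity claimed in Equations 6.9 and 6.11 can be checked numerically. The sketch below is only an illustration: it assumes that the node at distance i marks with probability 1/i and that a mark survives only if no later node (distances i+1 to N) overwrites it. The function name `residual_probability` is hypothetical, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def residual_probability(i: int, n: int) -> Fraction:
    """phi_i = tau_i * prod_{j=i+1..n} (1 - tau_j) with tau_j = 1/j."""
    phi = Fraction(1, i)
    for j in range(i + 1, n + 1):
        phi *= 1 - Fraction(1, j)
    return phi


N = 15
residuals = [residual_probability(i, N) for i in range(1, N + 1)]
```

Under these assumptions every entry of `residuals` equals 1/N and the entries sum to 1, so each received packet carries the mark of exactly one node.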
6.3 DDoS Attacker Traceback and Path Reconstruction
After successful packet marking, the next step is path reconstruction and identification
of the attacker. Based on the collected marked packets, victim v executes the attack path
reconstruction process. The proposed technique divides the reconstruction process into two
procedures: (1) aggregate node path reconstruction, and (2) sensor node path reconstruction
within the cluster.
6.3.1 Procedure for Aggregate Node Path Reconstruction
This procedure reconstructs the path from the victim to the aggregate node of the cluster that
contains the attacker and the source node. Algorithm 7 gives the procedure for aggregate
node path reconstruction.
Algorithm 7 Aggregate Node Path Reconstruction at Victim
Require: S: Set of attack packets at victim v
         Packet x, y
         Stack S1
         String path
BEGIN Procedure PathReconstructionAtAggregateNode()
    Group the packets in set S based on the Initial field
    for each Group G1 in S do
        x = FindLeaf(G1)                     // Function Call
        S1.push(x.Head)
        y = FindParent(x.Head, G1)           // Function Call
        while y ≠ 0 do
            S1.push(y.Head)
            x = y
            y = FindParent(x.Head, G1)       // Function Call
        end while
    end for
    path = AggregateNodePathReconstruction(S1)   // Function Call
END Procedure

BEGIN Procedure Packet FindLeaf(Group G1)    // Function Definition
    for each packet j in G1 do
        if j.Tail == 0 then
            RETURN j
        end if
    end for
END Procedure

BEGIN Procedure Packet FindParent(k, Group G1)   // Function Definition; k is a Head value
    for each packet j in G1 do
        if j.Tail == k then
            RETURN j
        end if
    end for
    RETURN 0                                 // no parent packet found
END Procedure

BEGIN Procedure String AggregateNodePathReconstruction(Stack S1)   // Function Definition
Require: String path
    while S1.IsEmpty() == false do
        path += S1.pop()
    end while
    RETURN path
END Procedure
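A runnable Python sketch of the aggregate node reconstruction is given here for illustration. It is not the thesis implementation. It assumes that each collected packet is reduced to a hypothetical tuple (initial, head, tail) and that edges are oriented as in the worked example of Section 6.2.1, with Head as the upstream aggregate node and Tail as the node that received the packet. For that reason the next hop is looked up by Head, whereas Algorithm 7 as written matches on Tail.

```python
from collections import defaultdict

EMPTY = 0  # unset field, as in the DPPM label


def reconstruct_aggregate_paths(packets):
    """Return {initial: [node nearest the victim, ..., initial aggregate node]}.

    `packets` is an iterable of (initial, head, tail) tuples (assumed layout).
    """
    groups = defaultdict(set)
    for initial, head, tail in packets:
        groups[initial].add((head, tail))  # group by the Initial field

    paths = {}
    for initial, edges in groups.items():
        # Leaf packet: marked only by the first aggregate node, so Tail is empty.
        leaf = next((head for head, tail in edges if tail == EMPTY), None)
        if leaf is None:
            continue  # not enough packets collected for this group yet
        downstream = {head: tail for head, tail in edges if tail != EMPTY}
        stack = [leaf]
        current = leaf
        # Follow edges toward the victim; the membership check guards against loops.
        while current in downstream and downstream[current] not in stack:
            current = downstream[current]
            stack.append(current)
        paths[initial] = stack[::-1]  # popping the stack yields victim-to-source order
    return paths


collected = [("A", "A", 0), ("A", "A", "B"), ("A", "B", "C"), ("A", "C", "D")]
example_paths = reconstruct_aggregate_paths(collected)
```

With the four edges taken from the worked example, the sketch returns the aggregate nodes for the group with Initial A in the order D, C, B, A, that is, from the victim side back to the cluster that contains the attacker.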
6.3.2 Procedure for Sensor Node Path Reconstruction
This procedure reconstructs the path from the aggregate node to the source node
from which the attack originates. The procedure for sensor node path reconstruction is given
in Algorithm 8.
Algorithm 8 Sensor Node Path Reconstruction
Require: Packet j, k
         Stack S2
         String path
BEGIN Procedure PathReconstructionAtSensorNode()
    // Find Aggregate Node Packet AggPacket at Aggregate Node A
    for every packet i in A do
        if i.Initial == A then
            AggPacket = i
        end if
    end for
    // Find Parent of Aggregate Node A
    for every packet i in A do
        if (i.Source == AggPacket.Source) && (i.End == AggPacket.End) && (i.Initial == 0) then
            j = i
        end if
    end for
    S2.push(j.End)
    k = FindParent(j.Source)     // returns the packet whose End value equals the input parameter
    while k ≠ 0 do
        S2.push(k.End)
        j = k
        k = FindParent(j.Source) // Function Call
    end while
    S2.push(j.Source)            // j.Source is the Intruder Node
    path = CompromisedNodePathReconstruction(S2)   // Function Call
END Procedure

BEGIN Procedure String CompromisedNodePathReconstruction(Stack S2)   // Function Definition
Require: String path
    while S2.IsEmpty() == false do
        path += S2.pop()
    end while
    RETURN path
END Procedure
6.4 Conclusion
In cloud-assisted WBAN, identifying the source of a distributed denial of service attack and
reconstructing the attack path are key challenges because of the resource-constrained nature of
these networks. Traceback techniques proposed for standard IP-based networks are not
appropriate for sensor networks because of their additional overhead and high convergence
time. Similarly, existing techniques proposed for mobile ad hoc networks require additional
processing and storage. In this chapter, an efficient traceback technique is proposed
that can be deployed in cloud-assisted WBAN environments. The proposed technique
assigns a dynamic marking probability to each node based on the number of hops
the packet has travelled since it left the source, i.e. the distance travelled by the
packet from the source. Finally, path reconstruction algorithms are proposed that
efficiently trace back the attacker.
Chapter 7
TRACEBACK SCHEME: PERFORMANCE EVALUATION AND
BENCHMARKING
A DDoS attack is one of the major attacks in the WBAN environment; it not only exhausts the
available resources but also affects the reliability of the information being transmitted. After
the successful detection of DDoS attacks, the next key challenge lies in identifying the source
of these attacks and reconstructing the attack path. Among the existing traceback techniques,
PPM is the most widely and successfully implemented technique for preventing
DDoS attacks. However, since the marking probability assignment has a significant effect on both
the convergence time and the performance of a scheme, PPM is not directly applicable in the WBAN
environment because of its high convergence time and the overhead on intermediate nodes. Therefore,
a new scheme called the efficient traceback technique (ETT) is proposed in Chapter 6 in order to
improve the effectiveness and compatibility of PPM in WBAN.
In this chapter, the performance of the proposed traceback technique is evaluated through
simulation and experiments. The proposed scheme, discussed in Chapter 6, assigns
a dynamic marking probability to each node along the path and then reconstructs the
attack path in order to trace back the attacker efficiently and support subsequent decisions. The
performance of the proposed scheme is affected by a few network parameters, and the variation in
these parameters is used to quantify the results of the simulation experiments. The network
simulator NS-2 was used to compare and analyze the performance metrics: convergence
time, overhead and uncertainty. The simulation results are compared with the
corresponding results obtained from the simulation of the fishbone traceback technique (FBT)
for both multi-hop and single-hop sensor networks. The results show that the proposed technique
performs better than FBT, which uses a fixed marking probability.
7.1 Simulation Setup
To analyze the feasibility and evaluate the performance of the proposed marking technique and
path reconstruction algorithms, extensive simulation experiments have been performed. The
key purpose of the simulation experiments is to investigate the performance metrics
related to the packet marking approach: convergence time, overhead on nodes and uncertainty
in marking packets.
To conduct the simulation experiments, the network simulator NS-2 is used. A multi-hop
WBAN topology is constructed consisting of a base station and fifty sensor nodes, which
are divided into six clusters. Each cluster has an aggregate node that acts as a cluster head
and a few regular nodes. Other simulation parameters and network configurations are shown
in Table 7.1.
Table 7.1: Simulation Parameters
Parameters                   Values
Sensing Field                50*50
Simulation Time              1200 s
Packet Size                  1000 bytes
Radio Communication Range    2 m Standard, 5 m Special use
No. of Nodes                 50
BAN Coordinator              Directional Mode
Sensor Nodes                 Omni directional Mode
MAC Type                     IEEE 802.15.4
The attack paths are chosen randomly from different clusters, and on each of the chosen paths
different numbers of packets are initiated and transmitted. As a packet travels along the path,
each intermediate node ni marks it according to the proposed marking
technique discussed in Chapter 6. Finally, the victim node v applies the proposed
attack path reconstruction algorithm in order to reconstruct all the attack paths and trace the
attacker. The number of attackers in the simulation varies from 1 to 50.
The simulation is run ten times for both FBT and the proposed technique, and the average of the
values obtained from the experiments is taken as the result. The results of the evaluation and
comparative analysis are discussed in Section 7.2.
7.2 Evaluation and Comparative Analysis
Simulations are performed to evaluate the proposed traceback technique and generate results
for the chosen metrics: convergence time, uncertainty and overhead on nodes.
7.2.1 Convergence time
Convergence time is measured as the number of packets needed for a successful attack path
reconstruction [44]. It depends on the residual probability ϕi. From Equation 6.2,
the dominant term of the traceback convergence time for PPM (FBT) satisfies
ConvergenceTime_FBT · τ(1 − τ)^(N−1) ≥ 1. Thus, keeping τ and N fixed for FBT, we get:
$$\text{ConvergenceTime}_{FBT} \ge \frac{1}{\tau(1-\tau)^{N-1}} \tag{7.1}$$
As we learned from Equations 6.9 and 6.11 that ϕi = 1/N, the traceback convergence time
for the proposed technique is given as:
$$\text{ConvergenceTime}_{ETT} \ge N \tag{7.2}$$
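The two bounds can be evaluated side by side with a small Python sketch. The function names `fbt_min_packets` and `ett_min_packets` are hypothetical. The sketch computes only the analytic lower bounds of Equations 7.1 and 7.2; it does not reproduce the NS-2 simulation, so its output need not match simulated packet counts exactly.

```python
import math


def fbt_min_packets(tau: float, n: int) -> float:
    """Lower bound of Equation 7.1: 1 / (tau * (1 - tau)**(n - 1))."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    return 1.0 / (tau * (1.0 - tau) ** (n - 1))


def ett_min_packets(n: int) -> int:
    """Lower bound of Equation 7.2: N packets for a path of N nodes."""
    return n


# Compare the two bounds for the path lengths used in the evaluation (tau = 0.08 for FBT).
for n in (10, 15, 20, 25, 30):
    print(n, math.ceil(fbt_min_packets(0.08, n)), ett_min_packets(n))
```

The FBT bound grows geometrically with the path length because of the factor (1 − τ)^(N−1), whereas the ETT bound grows linearly in N.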
Figure 7.1 shows the number of packets required by the proposed traceback technique and by
FBT to reconstruct the attack path. For FBT, we assume a fixed marking probability of
0.08. The graph clearly indicates that the proposed packet marking technique has a lower
convergence time. For FBT, the convergence time grows exponentially with the length of the
attack path.
Figure 7.1: Number of packets required by proposed technique and FBT (τi = 0.08)
Table 7.2 compares the numerical values of ConvergenceTime_FBT and ConvergenceTime_ETT for
different distances of the node from the source. It is evident from the table that the
proposed technique requires fewer packets for attack path reconstruction,
which means that it has a lower convergence time than FBT with
different marking probabilities.
Table 7.2: Convergence Time Comparison of FBT and proposed Technique
Nodes  FBT       FBT       FBT       FBT       FBT       FBT       FBT       FBT       Proposed
       (τi=.02)  (τi=.04)  (τi=.06)  (τi=.08)  (τi=.10)  (τi=.20)  (τi=.30)  (τi=.35)
10     59        38        29        27        24        36        79        129       10
15     64        42        37        41        43        125       489       1209      15
20     73        55        56        61        74        337       2835      10432     20
25     83        66        75        93        125       1102      17386     87983     25
30     89        85        99        141       302       3182      101625    765292    30
7.2.2 Uncertainty
For PPM, the maximum uncertainty is given in [65] as m = (1/τ) − 1, which shows
that under a spoofed marking attack PPM can only locate a few possible attackers. Figure 7.2
shows the uncertainty values of PPM for different marking probabilities τi. As the value of τ
increases, the uncertainty factor decreases. However, choosing a large value of τ is not a good
solution. As discussed in Section 4.2, each node ni along the attack path maintains a uniform
residual probability ϕi of marking each packet before it reaches the victim. This
means that each packet has been marked legitimately and no packet has been left unmarked
(ϕi = 0) by any node, which results in no uncertainty at all, i.e. m = 0 for the
proposed technique. This indicates that the proposed ETT allows the actual attacker to be
located under a DDoS attack.
Figure 7.2: Uncertainty values for PPM with Different Marking Probabilities
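The relationship m = (1/τ) − 1 can be tabulated with a few lines of Python. This is a minimal sketch: the function name `ppm_max_uncertainty` is hypothetical, and the marking probabilities are the ones used for FBT in Table 7.2.

```python
from fractions import Fraction


def ppm_max_uncertainty(tau: Fraction) -> Fraction:
    """Maximum uncertainty m = 1/tau - 1 of fixed-probability PPM."""
    if not 0 < tau <= 1:
        raise ValueError("tau must lie in (0, 1]")
    return 1 / tau - 1


taus = [Fraction(p, 100) for p in (2, 4, 6, 8, 10, 20, 30, 35)]
uncertainty = {float(t): float(ppm_max_uncertainty(t)) for t in taus}
```

The values fall from 49 at τ = 0.02 to 4 at τ = 0.20 and reach 0 only at τ = 1, which illustrates why a fixed τ cannot remove the uncertainty without marking every packet at every node.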
7.2.3 Overhead on Nodes
A key issue of WBAN is its resource scarcity. Therefore, any traceback technique should
impose little overhead on individual WBAN nodes as well as on the nodes collectively. In
this section, we estimate and compare the individual and total overheads on nodes
under FBT and the proposed technique.
The proposed technique has to determine the traveling distance of each packet from its
origin, so its packet-marking overhead might be expected to exceed that of
FBT, which uses a fixed marking probability. This assumption is not
correct, because each node only inspects the packet and increments the distance field by one
for each incoming packet. Hence, the cost of the proposed technique turns out to be lower
than that of FBT. For simplicity, we first calculate the overhead on individual nodes and then
compute the total overhead on all the nodes along the path. For FBT, a fixed marking
probability τi is assigned to every node for packet marking. If there are n packets
in a DDoS attack, the overhead on every individual node is calculated as:
$$\text{Overhead}_{FBT} = n\tau_i \tag{7.3}$$
For the proposed technique, every node chooses a marking probability of 1/d (for d =
1, 2, ..., N) to mark packets. In this case, the overhead on the node at distance d turns out to be:
$$\text{Overhead}_{ETT} = \frac{n}{d} \tag{7.4}$$
Figure 7.3 compares the overhead on individual nodes for FBT and the proposed
technique, where the number of packets n = 10,000, the total number of nodes N = 15 and
the marking probability for FBT is assumed to be τi = 0.3.
It is evident from the graph that under FBT all nodes have the same overhead. In contrast,
under ETT only the first two nodes experience high overhead, after which the overhead drops
rapidly as the path length increases. Similarly, the total overhead for FBT and the proposed
ETT depends on N, defined as the total number of nodes on the reconstruction path.
Recalling Equations 7.3 and 7.4, the total overhead under FBT is calculated as:
$$\text{TotalOverhead}_{FBT} = n\tau_i N \tag{7.5}$$
Figure 7.3: A Comparison of Overhead on Individual Nodes
Table 7.3: Total Overhead on Nodes
Nodes  FBT (τi = .20)  FBT (τi = .30)  FBT (τi = .35)  Proposed
10     2               3               3.5             2.93
15     3               4.5             5.25            3.32
20     4               6               7               3.6
25     5               7.5             8.75            3.82
30     6               9               10.5            4
For proposed ETT, total overhead is calculated by summing all N terms and is represented
as:
$$\text{TotalOverhead}_{ETT} = n\left(\frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{N}\right) = nH_N \tag{7.6}$$
where H_N is the Nth harmonic number. Table 7.3 compares the total overhead of the proposed
scheme with that of the FBT scheme; the tabulated values correspond to n = 1, i.e. the
overhead per attack packet. Under FBT, three marking probabilities, 0.20, 0.30 and 0.35, are
chosen for comparison. For a small number of nodes, the
total overhead of FBT and of the proposed technique is almost equal, but as the number of nodes
increases, the total overhead of FBT increases rapidly whereas the overhead of the proposed
technique increases gradually.
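Equations 7.3 to 7.6 can be evaluated directly. The following Python sketch uses hypothetical function names and exact rational arithmetic. With n = 1 and τi = 0.30 it should reproduce the FBT column of Table 7.3, and the harmonic numbers should agree with the Proposed column to within rounding.

```python
from fractions import Fraction


def harmonic(n: int) -> Fraction:
    """N-th harmonic number H_N = 1 + 1/2 + ... + 1/N."""
    return sum((Fraction(1, d) for d in range(1, n + 1)), Fraction(0))


def fbt_total_overhead(packets: int, tau: Fraction, n_nodes: int) -> Fraction:
    """Equation 7.5: n * tau * N."""
    return packets * tau * n_nodes


def ett_node_overhead(packets: int, d: int) -> Fraction:
    """Equation 7.4: n / d for the node at distance d."""
    return Fraction(packets, d)


def ett_total_overhead(packets: int, n_nodes: int) -> Fraction:
    """Equation 7.6: n * H_N."""
    return packets * harmonic(n_nodes)


# (N, FBT total with tau = 0.30, ETT total), all per attack packet (n = 1).
rows = [
    (n,
     float(fbt_total_overhead(1, Fraction(3, 10), n)),
     round(float(ett_total_overhead(1, n)), 2))
    for n in (10, 15, 20, 25, 30)
]
```

The FBT total grows linearly in N while the ETT total grows like the harmonic series, roughly logarithmically, which is the behaviour described in the paragraph above.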
7.3 Conclusion
In this chapter, the performance of the proposed DDoS attack traceback technique is evaluated
and compared. The results acquired from the simulation experiments
were analyzed and compared in terms of the following performance metrics: convergence time,
overhead on individual nodes, total overhead on all nodes and uncertainty in marking
packets.
Convergence time is analyzed in terms of the number of packets needed for a
successful path reconstruction as the number of nodes in the network varies.
Subsequently, simulation experiments are carried out to evaluate and compare the uncertainty
and overhead of the proposed technique with the existing FBT technique, which uses a fixed
marking probability. The convergence time results show that the proposed technique
requires fewer packets for path reconstruction than FBT. For the
proposed technique, an increase in the number of nodes gives rise to a slight, acceptable
increase in convergence time, whereas under FBT the convergence time grows exponentially
with the length of the attack path.
The overhead on nodes is compared for a varying number of packets with respect to the number
of nodes in the network. For individual node overhead, FBT places the same overhead on
all the nodes, whereas the overhead of the proposed technique drops rapidly as the path length
increases. Similarly, the total overhead of the proposed technique is lower than that of FBT.
Finally, the uncertainty of the proposed technique is zero, which means that the proposed ETT
allows the actual attacker to be located under a DDoS attack.
Chapter 8
CONCLUSION AND FUTURE DIRECTIONS
8.1 Summary
A distributed denial of service (DDoS) attack does not aim to disrupt or interfere with the
genuine sensor data; rather, it exploits the disparity between the network
bandwidth and the limited resource availability of the victim. Detecting and preventing
such attacks in cloud-assisted WBAN is an important concern. Such attacks can be avoided by
first detecting an attack, followed by attack prevention and mitigation. Attack detection
is the initial step of any defense approach and must be taken prior to attack
mitigation. Likewise, attack prevention plays an important role in protecting a
network from malicious attacks.
This thesis focuses mainly on DDoS attack detection and prevention algorithms and
proposes a novel solution that not only consumes fewer resources but also produces accurate
results. The limited resources of a WBAN are not enough to mitigate the huge amount of
traffic generated by a DDoS attack. Therefore, there is a need for an approach that is
lightweight and capable of handling real-time, high-speed sensor data for the detection of such
attacks in the cloud-assisted WBAN environment. The problem of detecting and preventing
DDoS attacks in cloud-assisted WBAN remains unresolved: the solutions proposed for
such attacks in conventional networks are not directly applicable in the cloud-assisted WBAN
environment because of the resource scarcity of these networks. The multiple entry points into
these wireless sensor networks leave them more vulnerable to such attacks, which makes the
attack detection and prevention process more complicated.
The aim of this research is to design a lightweight, distributed and scalable
approach for detecting DDoS attacks that is capable of handling the high-speed streaming data
generated by WBAN sensors in the cloud-assisted WBAN environment. The goal is to propose
an attack detection technique with improved performance compared with existing
techniques in terms of: i) improved attack detection accuracy; ii) minimized overall resource
usage; and iii) reduced overall computational cost. Analysis and comparison of the existing
techniques for detecting attacks in both conventional and wireless sensor networks leads to the
conclusion that data mining techniques are the most promising solution for
identifying the malicious behavior of nodes in these networks through pattern discovery. Therefore,
this research selects and explores a lightweight data mining technique and further
optimizes it for handling high-speed streaming data originating from WBAN sensors.
Another objective of this thesis is to propose an efficient traceback technique specifically
for the cloud-assisted WBAN environment that incurs minimal overhead on the network. The
goal is to propose a technique whose packet marking and path reconstruction
procedures are efficient, so that the source of a DDoS attack can be traced back and identified
with less convergence time. Different traceback techniques have been analyzed, and their
comparison led to the conclusion that Probability Packet Marking (PPM) is the most appropriate
and widely used approach in both conventional and wireless sensor networks. The key issue of PPM
lies in assigning the marking probability for path reconstruction. Therefore, we model the
traceback of a DDoS attack as a marking probability assignment problem and further optimize
it for efficient traceback of DDoS attacks in the cloud-assisted WBAN environment. The purpose of
selecting the PPM technique is to reduce the overhead on sensor nodes.
In Chapter 3, we first propose a cloud-assisted WBAN architecture and discuss its
modules in detail. Secondly, based on the proposed architecture, a framework is presented for
the detection and prevention of DDoS attacks in the cloud-assisted WBAN environment. Based
on the framework, (1) a distributed attack detection technique is proposed in Chapter 4 that
efficiently detects DDoS attacks in wireless sensor networks and (2) a traceback technique is
proposed in Chapter 6 that efficiently identifies the source of an attack and blocks the attacker.
In Chapter 4, a victim-based DDoS attack detection algorithm is presented. This
algorithm, the Enhanced Very Fast Decision Tree (EVFDT), is an improvement of the Very Fast
Decision Tree and differs from the existing algorithms in terms of classification accuracy, false
alarm rate, sensitivity and specificity, computational cost, tree size, memory and time. Our
proposed classification algorithm is capable of handling noisy data and detects a DDoS
attack efficiently with high accuracy and a low false alarm rate while allowing legitimate
requesters to access the resources. The proposed algorithm is deployed at the victim node.
In Chapter 5, the performance of the proposed DDoS attack detection scheme is analyzed
and compared with respect to varying noise percentage and number of instances. Both
simulation-based and hardware-based experiments are performed for the
analysis. The simulation results are quantified using
the following metrics: attack detection accuracy, tree size, computational cost, resource
usage, false alarm rate, and sensitivity vs. specificity. Each of the selected performance metrics
is evaluated separately on both synthetic datasets and a real-time WBAN dataset. Finally, a
comparative analysis is carried out between the simulation results obtained from the proposed
technique and the corresponding simulation results acquired from existing techniques.
The overall analysis shows that the attack detection accuracy increases with the
number of instances, whereas it decreases significantly as the noise percentage
increases. Further, simulation experiments are performed to assess the false positive rate (FPR)
and false negative rate (FNR). The results show a significant increase in FPR and FNR as the
noise percentage increases, whereas both rates decrease as the number of instances
increases. Similar results are obtained for sensitivity and specificity, i.e. an
increase in noise percentage decreases both. The sensitivity and
specificity of any system directly reflect its accuracy. The high sensitivity and specificity of
the proposed system show that it is more accurate in detecting attacks than existing
techniques. The computational cost is calculated only for real-time generated data in order to
measure the cost of computation on sensor nodes. It is evident from the simulation results that
the computational cost of the proposed algorithm is lower than that of existing algorithms.
Likewise, the resource usage of the proposed technique is better than that of other techniques. The
computational time of the proposed technique is slightly greater than that of VFDT-t because of
pruning and tie-breaking threshold computation. VFDT-t does not perform pruning at all and
therefore takes less computation time.
Finally, a qualitative comparison is performed to show the superiority of the proposed
attack detection algorithm. The qualitative analysis shows that only OVFDT and EVFDT
can handle noisy data. VFDT-t, CVFDT and IVFDT do not provide tree pruning at all.
In Chapter 6, a novel traceback technique is proposed that efficiently traces back the source
of a DDoS attack in cloud-assisted WBAN. The proposed technique assigns a dynamic
marking probability to each node based on the number of hops the packet has travelled since it
left the source, i.e. the distance travelled
by the packet from the source. Finally, a path reconstruction algorithm is proposed to
trace back the attacker. Results and comparison show that the proposed technique has a lower
convergence time than fixed PPM. Similarly, the proposed technique results in
less computational overhead on nodes than other available schemes.
The performance of the proposed traceback technique is evaluated through simulation and
experiments and discussed in Chapter 7. The performance of the proposed scheme is affected by
a few network parameters, and the variation in these parameters is used to quantify the results
of the simulation experiments. The network simulator NS-2 was used to compare and
analyze the performance metrics: convergence time, overhead and uncertainty.
The simulation results are compared with the corresponding results obtained from the
simulation of the Fish Bone Traceback (FBT) technique for both multi-hop and single-hop sensor
networks. The results show that the proposed technique is superior to FBT,
which uses a fixed marking probability.
Convergence time is analyzed as the number of packets needed for a successful
path reconstruction with respect to a varying number of nodes in the network. Results
show that the proposed technique requires fewer packets for path reconstruction
than FBT. For the proposed technique, an increase in the number of nodes
gives rise to a slight, acceptable increase in convergence time, whereas under FBT the
convergence time is exponential in the length of the attack path. Further, the overhead on nodes
is compared for a varying number of packets with respect to the number of nodes in the network.
In the case of individual node overhead, FBT imposes the same overhead on all nodes, whereas
the overhead of the proposed technique drops rapidly as the path length increases. Similarly, the
total overhead of the proposed technique is much lower than that of FBT. The uncertainty of
the proposed technique is zero, which means that the proposed Efficient Traceback Technique
locates the actual attacker under a DDoS attack.
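The difference in convergence behaviour can be made concrete with a short analytical sketch; it does not reproduce the NS-2 results of Chapter 7. For a fixed marking probability p it uses the upper bound usually quoted for probabilistic packet marking, ln(d) / (p(1 - p)^(d-1)) [40], which grows exponentially with the path length d. For dynamic marking it assumes that every node's mark reaches the victim with equal probability 1/d, so that collecting all d marks is a coupon collector problem with expectation d(1 + 1/2 + ... + 1/d). The function names and the value p = 0.04 are illustrative assumptions.

```python
import math


def fixed_ppm_packet_bound(path_length: int, p: float) -> float:
    """Upper bound on the expected packets needed with a fixed marking probability p."""
    if path_length < 2:
        raise ValueError("path_length must be >= 2 for this bound")
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(path_length) / (p * (1.0 - p) ** (path_length - 1))


def dynamic_expected_packets(path_length: int) -> float:
    """Expected packets when all marks are equally likely to arrive: d * H_d."""
    if path_length < 1:
        raise ValueError("path_length must be >= 1")
    harmonic = sum(1.0 / k for k in range(1, path_length + 1))
    return path_length * harmonic


if __name__ == "__main__":
    # Illustrative path lengths; p = 0.04 is an assumed fixed marking probability.
    for d in (5, 10, 20, 40):
        fixed = fixed_ppm_packet_bound(d, 0.04)
        dynamic = dynamic_expected_packets(d)
        print(f"d={d:2d}  fixed bound={fixed:8.1f}  dynamic expectation={dynamic:6.1f}")
```

The first quantity is an upper bound and the second an exact expectation under the stated assumption, so the comparison indicates the growth trend only: exponential in d for a fixed probability, and roughly d ln d when every mark is equally likely to survive.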
8.2 Future Work
The proposed architecture given in Figure 3.4 shows the complete cloud-assisted WBAN
architecture, which efficiently senses data from WBAN sensors and transfers it to the cloud
service provider for permanent storage via the insecure Internet. The communication channel
between the base station and the cloud service provider is secured using the SSL/TLS security
protocol in order to provide data confidentiality and integrity.
However, the main focus of this thesis is DDoS attack detection and traceback within the
WBAN domain, i.e. the transfer of data from sensor nodes to the aggregate node and from
multiple aggregate nodes to the base station. In DDoS attack detection, the aggregate node and
the base station are the victims of a DDoS attack; they can be overwhelmed or flooded with
attack traffic in order to consume the network bandwidth or deplete their resources. Similarly,
in traceback, the victim of the DDoS attack, either the base station or the aggregate node,
reconstructs the attack path and identifies the attacker for subsequent blocking. The major aim
is to propose a lightweight, in-network and distributed approach for DDoS attack detection and
traceback that fulfills the requirements of the resource-constrained WBAN domain and efficiently
transfers the data to the cloud for further processing. In future, this work can be extended to
the cloud domain by following the proposed architecture (Figure 3.4) to detect a DDoS attack and
further trace back the source of the attack. A private cloud will be deployed, and the proposed
DDoS attack detection and traceback techniques will be applied on the attack detection node to
protect the cloud service provider and cloud storage server from such attacks. In addition, the
proposed techniques can be further enhanced to achieve better results and for deployment in a
public cloud.
Another direction for future work is the deployment of the proposed attack detection technique
for intrusion detection, covering slow and fast scans, SYN floods, smurf attacks, traffic
regulation conditions and other attacks.
REFERENCES
[1] R. Latif, H. Abbas, and S. Assar, "Distributed denial of service (DDoS) attack in cloud-assisted wireless body area networks: a systematic literature review," Journal of Medical Systems, vol. 38, no. 128, 2014.
[2] E. AbuKhousa, H. A. Najati, "UAE-IHC: Steps towards Integrated EHealth Environment," Proceedings of the 4th e-Health and Environment Conference, February 2012.
[3] A. Waqar, A. Raza, H. Abbas, and M. K. Khurram, "A framework for preservation of cloud users data privacy using dynamic reconstruction of metadata," Journal of Network and Computer Applications, vol. 36, no. 1, pp. 235-248, January 2013.
[4] R. Latif, H. Abbas, S. Assar, and Q. Ali, "Cloud Computing Risk Assessment: A Systematic Literature Review," Book Chapter: Future Information Technology, Springer Lecture Notes in Electrical Engineering, vol. 276, pp. 285-295, 2013.
[5] A. A. Moshaddique, L. Jingwei, and K. Kyungsup, "Security and Privacy Issues in Wireless Sensor Networks for Healthcare Applications," Journal of Medical Systems, vol. 36, no. 1, pp. 93-101, 2012.
[6] D. Ashraf, and H. Aboul Ella, "Wearable and Implantable Wireless Sensor Network Solutions for Healthcare Monitoring," Journal of Sensors (Basel), vol. 12, no. 9, September 2012.
[7] S. Shahnaz, Sana Ullah, and K. S. Kwak, "A Study of IEEE 802.15.4 Security Framework for Wireless Body Area Networks," Journal of Sensors (Basel), vol. 11, no. 2, Jan 2011.
[8] N. D. Han, L. Han, D. M. Tuan, H. Peter, and M. Jo, "A scheme for data confidentiality in Cloud-assisted Wireless Body Area Networks," Journal of Information Sciences, vol. 284, pp. 157-166, 2014.
[9] K. Zhang, X. Liang, M. Baura, R. Lu, and X. S. Shen, "PHDA: A priority based health data aggregation with privacy preservation for cloud assisted WBANs," Journal of Information Sciences, vol. 284, pp. 130-141, 2014.
[10] T. Hayajneh, A. V. Vasilakos, G. Almashaqbeh, B. J. Mohd, M. A. Imran, M. Z. Shakir, and K. A. Qaraqe, "Public-key authentication for cloud-based WBANs," BodyNets '14 Proceedings of the 9th International Conference on Body Area Networks, pp. 286-292, 2014.
[11] S. T. Zargar, J. Joshi, and D. Tipper, "A survey of defense mechanisms against distributed denial of service (DDOS) flooding attacks," IEEE Communications Surveys and Tutorials, vol. 15, no. 4, pp. 2046-2069, 2013.
[12] Z. A. Baig, M. Baqer, and A. I. Khan, "A Pattern Recognition Scheme for Distributed Denial of Service (DDoS) Attacks in Wireless Sensor Networks," Proceedings of the IEEE International Conference on Pattern Recognition (ICPR'06), 2006.
[13] A. Mittal, A. K. Shrivastava, and M. Manoria, "A Review of DDOS Attack and its Countermeasures in TCP Based Networks," International Journal of Computer Science and Engineering Survey (IJCSES), vol. 2, no. 4, November 2011.
[14] R. Latif, H. Abbas, S. Assar, and S. Latif, "Analyzing feasibility for deploying very fast decision tree For DDoS attack detection in cloud-assisted WBAN," Intelligent Computing Theory: Proceedings of the 10th International Conference, ICIC 2014, pp. 507-519, August 3-6, 2014.
[15] J. Xu, X. Zhou, and F. Yang, "Traceback in wireless sensor networks with packet marking and logging," Journal Frontiers of Computer Science in China archive, vol. 5, no. 3, pp. 308-315, 2011.
[16] M. T. Goodrich, "Probabilistic packet marking for large-scale IP traceback," Journal of IEEE ACM Transactions on Networking (TON), vol. 16, no. 1, pp. 15-24, 2008.
[17] R. Latif, H. Abbas, and S. Latif, "Distributed Denial of Service (DDoS) attack detection using data mining approach in cloud-Assisted Wireless Body Area Networks," International Journal of Ad Hoc and Ubiquitous Computing (IJAHUC), 2015 (In Press).
[18] S. Irum, A. Ali, F. K. Aslam, and H. Abbas, "A Hybrid Security Mechanism for intra-WBAN and inter-WBAN Communication," International Journal of Distributed Sensor Networks, vol. 2013, Article ID 842608, 11 pages, 2013.
[19] A. Ali, and F. K. Aslam, "A Broadcast-Based Key Agreement Scheme using Set Reconciliation for Wireless Body Area Networks," Journal of Medical Systems (Springer), vol. 38, no. 5, May 2014.
[20] R. Latif, H. Abbas, and S. Assar, "Cloud Computing Risk Assessment: A Systematic Literature Review," Future Information Technology, Future Tech vol. 276, pp. 285-295, 2013.
[21] W. Jiafu, Z. Caifeng, S. Ullah, L. Chin-Feng, Z. Ming, and W. Xiaofei, "IoT Sensing Framework with Inter-cloud Computing Capability in Vehicular Networking," Journal of IEEE Network, vol. 27, pp. 56-61, 2013.
[22] I. Foster, Y. Zhao, L. Raicu, S. Lu, "Cloud Computing and Grid Computing 360-Degree Compared," Proceedings of the Grid Computing Environments Workshop (GCE), pp. 1-10, November 2008.
[23] T. B. Manohar, E. V. N. Jyothi, and B. Rajani, "Traceback of DDoS Attacks Based on Decision Trees Model Using Intrusion Detection System," International Journal of Computer Science and Management Research, vol. 1, no. 4, 2012.
[24] A. Patcha, and J. Park, "An overview of anomaly detection techniques: Existing solutions and latest technological trends," The International Journal of Computer and Telecommunications Networking, vol. 51, no. 12, 2007.
[25] N. Jain, Sharma, and Shikha, "The Role of Decision Tree Techniques for Automating Intrusion Detection System," International Journal of Computational Engineering Research, vol. 2, no. 4, 2012.
[26] A. Mahmood, Ke Shi, and K. Shaheen, "Data Mining Techniques for Wireless Sensor Networks: A Survey," International Journal of Distributed Sensor Networks, vol. 2013.
[27] M. K. Sarat, and G. H. Christopher, "Summarization Techniques for Visualization of Large Multidimensional Datasets," Technical Report.
[28] E. Y. Moawia, and M. El-mukashfi, "A New Approach for Evaluation of Data Mining Techniques," IJCSI International Journal of Computer Science Issues, vol. 7, no. 5, 2010.
[29] T. Subbulakshmi, S. M. Shalinie, V. Ganapathisubramanian, K. Balakrishnan, D. Anandkumar, and K. Kannathal, "Detection of DDoS attacks using enhanced support vector machines with real time generated dataset," Proceedings of the 3rd International Conference on Advanced Computing (ICoAC 11), pp. 17-22, IEEE, Chennai, India, December 2011.
[30] Y. C. Wu, H. R. Tseng, W. Yang, and R. H. Jan, "DDoS detection and traceback with decision tree and grey relational analysis," International Journal of Ad Hoc and Ubiquitous Computing, vol. 7, no. 2, pp. 121-136, 2011.
[31] S. M. Lee, D. S. Kim, J. H. Lee, and J. S. Park, "Detection of DDoS attacks using optimized traffic matrix," Computers and Mathematics with Applications, vol. 63, no. 2, pp. 501-510, 2012.
[32] R. K. Arun and S. Selvakumar, "Detection of distributed denial of service attacks using an ensemble of adaptive and hybrid neuro-fuzzy systems," Computer Communications, vol. 36, no. 3, pp. 303-319, 2013.
[33] T. Thwe and P. Thandar, "Statistical anomaly detection of DDoS attacks using K-nearest neighbor," International Journal of Computer and Communication Engineering Research, vol. 2, no. 1, pp. 315-319, 2014.
[34] G. Hulten, P. Domingos, and L. Spencer, "Mining Massive Data Streams," Journal of Machine Learning, vol. 6, no. 4, pp. 1431-1452, 2005.
[35] H. Yang and S. Fong, "Moderated VFDT in stream mining using adaptive tie threshold and incremental pruning," Proceedings of the 13th International Conference on Data Warehousing and Knowledge Discovery (DaWaK 11), pp. 471-483, August 2011.
[36] G. Hulten, L. Spencer, and P. Domingos, "Mining time changing data streams," Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 01), pp. 97-106, August 2001.
[37] A. Fawzy, H. M. O. Mokhtar, and O. Hegazy, "Outliers detection and classification in wireless sensor networks," Egyptian Informatics Journal, vol. 14, no. 2, pp. 157-164, 2013.
[38] S. M. Bellovin, "ICMP Traceback Messages - Internet Engineering Task Force," Internet Draft: draft-ietf-itrace-04.txt, August 2003.
[39] A. C. Snoeren, C. Partridge, L. A. Sanchez, and C. E. Jones, "Hash-Based IP Traceback," Proceedings of the ACM SIGCOMM, pp. 3-14, August 2001.
[40] S. Savage, D. Wetherall, A. Karlin, and T. Anderson, "Practical network support for IP traceback," Proceedings of the ACM SIGCOMM, pp. 295-306, October 2000.
[41] B. Andrey, and A. Nirwan, "IP Traceback With Deterministic Packet Marking," IEEE Communications Letters, vol. 7, no. 4, April 2003.
[42] X. Jin, Y. Zhang, Y. Pan, and Y. Zhou, "ZSBT: A Novel Algorithm for tracing DoS Attacker in MANETs," EURASIP Journal of Wireless Communications and Networking, 2006:9, 2006.
[43] D. Sy, and L. Bao, "CAPTRA: coordinated packet traceback," Proceedings of the 5th International Conference on Information Processing in Sensor Networks (IPSN), pp. 152-159, April 2006.
[44] C. Bo-Chao, C. Huan, and L. Guo-Tan, "FBT: an efficient traceback scheme in hierarchical wireless sensor network," Journal of Security and Communication Networks, vol. 2, no. 2, pp. 133-144, 2009.
[45] V. L. L. Thing, H. C. J. Lee, M. Sloman, and J. Zhou, "Enhanced ICMP traceback with cumulative path," Proceedings of 61st IEEE Vehicular Technology Conference, VTC 2005, vol. 4, pp. 2415-2419, June 2005.
[46] J. Liu, Z. Lee, and Y. Chung, "Dynamic probabilistic packet marking for efficient IP traceback," Computer Networks: The International Journal of Computer and Telecommunications Networking, vol. 51, no. 3, pp. 866-882, Feb 2007.
[47] A. Mahmood, K. Shi, S. Khatoon, and M. Xiao, "Data Mining Techniques for Wireless Sensor Networks: A Survey," International Journal of Distributed Sensor Networks, vol. 2013, Article ID 406316, 24 pages, 2013. doi:10.1155/2013/406316.
[48] A. Z. Baig, and A. I. Khan, "DDoS Attack Modelling and Detection in Wireless Sensor Networks," In book: Mobile Intelligence: Mobile Computing and Computational Intelligence, John Wiley and Sons, pp. 595-626, 2010.
[49] E. Petana, and S. Kumar, "EKG monitoring over Wireless Sensor Networks and DDoS vulnerabilities: Remote EKG monitoring over wireless sensor networks and Impact of Internet Distributed Denial of Service (DDoS) Attacks," Paperback, 2011.
[50] R. Roman, and J. Lopez, "Integrating Wireless Sensor Networks and the Internet: a Security Analysis," Internet Research, vol. 19, no. 2, pp. 246-259, 2009.
[51] S. Misra, and A. Vaish, "Reputation-based Role Assignment for Role-based Access Control in Wireless Sensor Networks," Journal of Computer Communications, vol. 34, no. 3, pp. 281-294, 2011.
[52] R. K. Arun and S. Selvakumar, "Distributed denial of service attack detection using an ensemble of neural classifier," Computer Communications, vol. 34, no. 11, pp. 1328-1341, 2011.
[53] P. Domingos and G. Hulten, "Mining high-speed data streams," Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 71-80, August 2000.
[54] H. Yang, S. Fong, G. Sun, and R. Wong, "A Very Fast Decision Tree Algorithm for Real-Time Data Mining of Imperfect Data Streams in a Distributed Wireless Sensor Network," International Journal of Distributed Sensor Networks, vol. 2012, Article ID 863545, 16 pages, 2012. doi:10.1155/2012/863545.
[55] J. Mirkovic, P. Reiher, S. Fahmy, R. Thomas, A. Hussain, S. Schwab, and C. Ko, "Measuring denial Of service," Conference on Computer and Communications Security. Proceedings of the 2nd ACM workshop on Quality of protection, pp. 53-58, 2006.
[56] A. Bhandari, A. L. Sangal, and K. Kumar, "Performance Metrics for Defense Framework against Distributed Denial of Service Attacks," International Journal of Network Security, vol. 6, pp. 38-47, April 2014.
[57] D. Arora, P. Singh, and V. Singh, "Impact analysis of denial of service (DoS) due to packet flooding," International Journal of Engineering Research and Applications, vol. 4, no. 6, pp. 144-149, 2014.
[58] N. Gu, Y. Jiang, J. Zhang, H. Zheng, "An Implementation of WBAN Module Based on NS-2," Proceedings of the 2013 International Conference on Computer Sciences and Applications (CSA), pp. 114-118, December 2013.
[59] J. A. Pamplin, "NS2 Leach Implementation," http://read.pudn.com/downloads87/ebook/334495/ns2leach.pdf.
[60] L. Hughes, X. Wang, and T. Chen, "A review of protocol implementations and energy efficient cross-layer design for wireless body area networks," Sensors Journal, vol. 12, no. 11, pp. 14730-14773, 2012.
[61] VFML (Very Fast Machine Learning) toolkit, 2014, http://www.cs.washington.edu/dm/vfml/.
[62] E-Health Sensor Platform V2.0 for Arduino and Raspberry Pi (Biometric / Medical Applications). https://www.cooking-hacks.com/documentation/tutorials/ehealth-biometric-sensor-platform-arduino-raspberry-pi-medical
[63] H. Geoffrey, K. Richard, P. Bernhard, "Tie Breaking in Hoeffding trees," In: Gama, J., Aguilar-Ruiz, J. S. (eds) Proceedings Workshop W6: Second International Workshop on Knowledge Discovery in Data Streams, pp. 107-116, 2005.
[64] R. Latif, H. Abbas, and S. Assar, "Distributed denial of service (DDoS) attack in cloud-assisted wireless body area networks: a systematic literature review," Journal of Medical Systems (Springer), vol. 38, no. 128, pp. 1-10, 2014.
[65] K. Park, and H. Lee, "On the Effectiveness of Probabilistic Packet Marking for IP Traceback Under Denial of Service Attack," Proceedings of 2001 IEEE INFOCOM Conference, June 2001.