40
Evaluation of Support Vector Machine Kernels for Detecting Network Anomalies Prerna Batta, Maninder Singh, Zhida Li, Qingye Ding, and Ljiljana Trajković Communication Networks Laboratory http://www.ensc.sfu.ca/~ljilja/cnl/ School of Engineering Science Simon Fraser University

Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Evaluation of Support Vector Machine Kernels for Detecting

Network Anomalies

Prerna Batta, Maninder Singh, Zhida Li, Qingye Ding, and Ljiljana Trajković

Communication Networks Laboratoryhttp://www.ensc.sfu.ca/~ljilja/cnl/School of Engineering Science

Simon Fraser University

Page 2: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Roadmap

n Introduction:n Border Gateway Protocol (BGP)n Machine learning

n Feature extraction and selectionn Support vector machine and kernelsn Experimental procedure and classification resultsn Conclusions and references

ISCAS 2018, Florence, Italy 2May 28, 2018

Page 3: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Roadmap

n Introductionn Border Gateway Protocol (BGP)n Machine learning

n BGP feature extraction and selectionn Support vector machine and kernelsn Experimental procedure and classification resultsn Conclusions and references

ISCAS 2018, Florence, Italy 3May 28, 2018

Page 4: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Introduction: Border Gateway Protocol

n BGP’s main function is to optimally route data between Autonomous Systems (ASes)

n AS: a collection of BGP routers (peers) within a single administrative domain

n Four types of BGP messages: n open, keepalive, update, and notification

n BGP anomalies: n Slammer, Nimda, Code Red I, routing

misconfigurations

ISCAS 2018, Florence, Italy 4May 28, 2018

Page 5: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Introduction: Machine learning

n Machine learning models classify data using a feature matrix

n Feature matrix: n rows: data points n columns: feature values

n Algorithms: n Logistic Regression, Naïve Bayes,

Support Vector Machine (SVM)n SVM defines decision boundary to geometrically lie

midway between the support vectors

ISCAS 2018, Florence, Italy 5May 28, 2018

Page 6: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Roadmap

n Introductionn Border Gateway Protocol (BGP)n Machine learning

n BGP feature extraction and selectionn Support vector machine and kernelsn Experimental procedure and classification resultsn Conclusions and references

ISCAS 2018, Florence, Italy 6May 28, 2018

Page 7: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Feature extraction: BGP messages

n Extract 37 featuresn Sample every minute during a five-day period:

n the peak day of an anomalyn two days prior and two days after the peak day

n 7,200 samples for each anomalous event:n 5,760 regular samples (non-anomalous)n 1,440 anomalous samplesn imbalanced dataset

ISCAS 2018, Florence, Italy 7May 28, 2018

Page 8: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

BGP features

ISCAS 2018, Florence, Italy

Feature Definition Category

1 Number of announcements Volume

2 Number of withdrawals Volume

3 Number of announced NLRI prefixes Volume

4 Number of withdrawn NLRI prefixes Volume

5 Average AS-PATH length AS-path

6 Maximum AS-PATH length AS-path

7 Average unique AS-PATH length AS-path

8 Number of duplicate announcements Volume

9 Number of duplicate withdrawals Volume

10 Number of implicit withdrawals Volume

May 28, 2018 8

Page 9: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

BGP features

ISCAS 2018, Florence, Italy

Feature Definition Category

11 Average edit distance AS-path

12 Maximum edit distance AS-path

13 Inter-arrival time Volume

14–24Maximum edit distance = n,

where n = (7, ..., 17)AS-path

25–33Maximum AS-path length = n,

where n = (7, ..., 15)AS-path

34 Number of IGP packets Volume

35 Number of EGP packets Volume

36 Number of incomplete packets Volume

37 Packet size (B) Volume

May 28, 2018 9

Page 10: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Feature extraction: BGP messages

n Border Gateway Protocol (BGP) enables exchange

of routing information between gateway routers

using update messages

n Collections of BGP update message:

n Réseaux IP Européens (RIPE) under the

Routing Information Service (RIS) project

n Route Views

n Available in multi-threaded routing toolkit (MRT)

binary format

ISCAS 2018, Florence, Italy 10May 28, 2018

Page 11: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

BGP anomalies

n Slammer:n infected Microsoft SQL servers through a small

piece of code that generated IP addresses at random

n Nimda: n exploited vulnerabilities in the Microsoft Internet

Information Services (IIS) web servers for Internet Explorer 5

n Code Red I: n attacked Microsoft IIS web servers by replicating

itself through IIS server weaknesses

ISCAS 2018, Florence, Italy 11May 28, 2018

Page 12: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Duration of BGP events

ISCAS 2018, Florence, Italy 12May 28, 2018

Anomaly Date Anomaly (min)

Regular (min)

Slammer January 25, 2003 869 6,331Nimda September 18-20, 2001 3,521 3,679Code Red I July 19, 2001 600 6,600

Page 13: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

May 28, 2018 ISCAS 2018, Florence, Italy 13

Number of BGP announcements: Slammer

Page 14: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

May 28, 2018 ISCAS 2018, Florence, Italy 14

Number of BGP announcements: Nimda

Page 15: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

May 28, 2018 ISCAS 2018, Florence, Italy 15

Number of BGP announcements: Code Red I

Page 16: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

May 28, 2018 ISCAS 2018, Florence, Italy 16

Number of BGP announcements: Regular

Page 17: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Roadmap

n Introductionn Border Gateway Protocol (BGP)n Machine learning

n Feature extraction and selectionn Support vector machine and kernelsn Experimental procedure and classification resultsn Conclusions and references

ISCAS 2018, Florence, Italy 17May 28, 2018

Page 18: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Feature selection

n Reduces redundancy among features and improves the classification accuracy

n Decision tree algorithm was used for for feature selection:n one of the most successful techniques for

supervised classification learningn It can handle both numerical and categorical

featuresn Publicly available software tool: C5

ISCAS 2018, Florence, Italy 18May 28, 2018

Page 19: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Feature selection: decision tree

ISCAS 2018, Florence, Italy

Dataset Training data Selected features

Dataset 1 Slammer + Nimda 1–21, 23–29, 34–37Dataset 2 Slammer + Code Red I 1–22, 24–29, 34–37Dataset 3 Code Red I + Nimda 1–29, 34–37

May 28, 2018 19

§ Either four (30, 31, 32, 33) or five (22, 30, 31, 32, 33) features are removed in the constructed trees mainly because:n features are numerical and some are used

repeatedly

Page 20: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Roadmap

n Introductionn Border Gateway Protocol (BGP)n Machine learning

n Feature extraction and selectionn Support vector machine and kernelsn Experimental procedure and classification resultsn Conclusions and references

ISCAS 2018, Florence, Italy 20May 28, 2018

Page 21: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Support Vector Machine

ISCAS 2018, Florence, Italy 21May 28, 2018

n SVM defines a separating hyperplane in order to assign the target variables into distinct categories

n It is a non-probabilistic binary classifiern Used for classification problems and in pattern

recognition applicationsn Modified version of logistic regression

Page 22: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Support Vector Machine

n For a given dataset x with n number of training data, SVM finds the maximum margin hyperplane separating different classes of data:! = !#, %# , !# ∈ '(, %# ∈ 1,−1 , ∀ , = 1,2, … ,/

n !#: p-dimensional input vectorn %#: output value (1 or -1)

n Decision vector separating two classes is given by:12 3 ! + 5 = 0

n 12: optimal weighing vectorn b: bias

ISCAS 2018, Florence, Italy 22May 28, 2018

Page 23: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Support Vector Machine

n For linearly separable training data, margins are defined as:n !" # $ + & = 1n !" # $ + & = −1

ISCAS 2018, Florence, Italy 23May 28, 2018

Page 24: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

SVM with linear kernel: correctly classified regular (circles) and anomalous (stars) data points as well as one incorrectly classified regular

(circle) data point

May 28, 2018 ISCAS 2018, Florence, Italy 24

Support Vector Machine

Page 25: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Support Vector Machine

n Distance between the margins: 2/∥ $% ∥n Objective function: minimize ∥ $% ∥n Let C be the regularization parameter that defines the

separation of two classes and the error when using a training dataset. The hyperplane is acquired by minimizing the margins:

& '()*

(+( +

12 ∥ . ∥/,

with constraints 1(2 3( ≥ 1 − +(, 6 = 1,… ,9n 1(: target valuen +(: set of slack variables

ISCAS 2018, Florence, Italy 25May 28, 2018

Page 26: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Support Vector Machine: kernel trick

n Instead of calculating each mapping, the “kernel trick” is used to directly calculate the inner product in the input space

n The mapping defines feature space and generates a decision boundary for input data points

n Using the “kernel trick” reduces the complexity of the optimization problem that now only depends on the input space instead if the feature space

ISCAS 2018, Florence, Italy 26May 28, 2018

Page 27: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Support Vector Machine

n Instead of employing a minimization model, the problem be formulated using Lagrangian dual multiplier ! as:

max %&'(

&!& −

12%&'(

,%-'(

,!&!-.&.- /&, /- ,

subject to:

0 ≤ !3 ≤ 4 ∀ 6 = 1,2, … , 9 and%3'(

&!3.3 = 0

ISCAS 2018, Florence, Italy 27May 28, 2018

Page 28: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

SVM with the nonlinear kernel function: the three-dimensional space shows a hyperplane dividing regular (circles) and anomalous (stars) data

pointsMay 28, 2018 ISCAS 2018, Florence, Italy 28

Support Vector Machine

Page 29: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Roadmap

n Introductionn Border Gateway Protocol (BGP)n Machine learning

n Feature extraction and selectionn Support vector machine and kernelsn Experimental procedure and classification resultsn Conclusions and references

ISCAS 2018, Florence, Italy 29May 28, 2018

Page 30: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Experimental procedure

ISCAS 2018, Florence, Italy 30May 28, 2018

Step 1:

n Use 37 features or select the most relevant

features using the decision tree algorithm

n Step 2:

n Train the SVM with linear, quadratic, or cubic

kernels

n Step 3:

n Test the models using various datasets

n Step 4:

n Evaluate the SVM kernels based on accuracy and

F-Score

Page 31: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Training and test datasets

ISCAS 2018, Florence, Italy 31May 28, 2018

Training dataset Test dataset

Dataset 1 Slammer and Nimda Code Red IDataset 2 Nimda and Code Red I Slammer

Page 32: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Experimental procedure

n MATLAB 2017b Statistics and Machine Learning Toolbox

n The performance of SVM with various kernels is evaluated using combinations of datasets

n SVM performance was measured based on accuracy and F-Score

n The confusion matrix is used to evaluate performance of classification algorithms

n True positive (TP) and false negative (FN) are the number of anomalous data points that are classified as anomaly and regular, respectively

ISCAS 2018, Florence, Italy 32May 28, 2018

Page 33: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

ISCAS 2018, Florence, Italy 33May 28, 2018

n Accuracy: n (TP+TN)/(TP+TN+FP+FN)

n F-Score signifies harmonic mean between precision and sensitivity:n 2 x (precision x sensitivity)/(precision + sensitivity)n precision: TP/(TP+FP)n sensitivity: TP/(TP+FN)

Performance measures

Page 34: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

SVM with linear kernel

ISCAS 2018, Florence, Italy 34May 28, 2018

Linear kernel Accuracy (%) F-Score (%)

Selected features Training dataset Test RIPE BCNET Test

1-37Dataset 1 72.76 61.34 54.21 73.60Dataset 2 70.81 52.89 45.36 73.19

1-21, 23-29, 34-37Dataset 1 74.71 63.26 55.39 76.29Dataset 2 73.27 54.12 49.38 74.48

Page 35: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

SVM with nonlinear kernels

ISCAS 2018, Florence, Italy 35May 28, 2018

Quadratic kernel Accuracy (%) F-Score (%)

Selected features Training dataset Test RIPE BCNET Test

1-37Dataset 1 58.55 52.73 43.68 58.85Dataset 2 61.27 42.87 35.52 63.15

1-21, 23-29, 34-37Dataset 1 63.84 58.51 46.39 67.24Dataset 2 63.36 46.55 38.73 64.68

Page 36: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

SVM with nonlinear kernels

ISCAS 2018, Florence, Italy 36May 28, 2018

Cubic kernel Accuracy (%) F-Score (%)

Selected features Training dataset Test RIPE BCNET Test

1-37Dataset 1 65.33 54.31 45.53 68.55Dataset 2 63.15 46.23 40.17 65.49

1-21, 23-29, 34-37Dataset 1 69.21 58.12 49.26 70.14Dataset 2 67.79 49.78 42.36 69.55

Page 37: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Roadmap

n Introductionn Border Gateway Protocol (BGP)n Machine learning

n Feature extraction and selectionn Support vector machine and kernelsn Experimental procedure and classification resultsn Conclusions and references

ISCAS 2018, Florence, Italy 37May 28, 2018

Page 38: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

Conclusions

n SVM algorithm is one of the most efficient ML toolsn Kernels are used to transform the input data into a high

dimensional spacen Their performance depends on both the feature

selection and the type of datasetsn Analyzed BGP anomaly datasets are linearly separablen SVM with linear kernel outperforms SVMs with

quadratic and cubic kernels

ISCAS 2018, Florence, Italy 38May 28, 2018

Page 39: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

References: sources of data

n RIPE RIS raw data [Online]. Available: http://www.ripe.net/data-tools/.

n University of Oregon Route Views project [Online]. Available:http:// www.routeviews.org/.

n BCNET [Online]. Available: http://www.bc.net/.

ISCAS 2018, Florence, Italy 39May 28, 2018

Page 40: Evaluation of Support Vector Machine Kernels for Detecting ...ljilja/cnl/presentations/ljilja/iscas2018/ISCAS_2018_final_revised.pdf · Evaluation of Support Vector Machine Kernels

References:http://www.sfu.ca/~ljilja/cnl

n Q. Ding, Z. Li, S. Haeri, and Lj. Trajković, “Application of machine learning techniques to detecting anomalies in communication networks: Datasets and Feature Selection Algorithms” in Cyber Threat Intelligence, M. Conti, A. Dehghantanha, and T. Dargahi, Eds., Berlin: Springer, to appear.

n Q. Ding, Z. Li, S. Haeri, and Lj. Trajković, “Application of machine learning techniques to detecting anomalies in communication networks: Classification Algorithms” in Cyber Threat Intelligence, M. Conti, A. Dehghantanha, and T. Dargahi, Eds., Berlin: Springer, to appear.

n Q. Ding, Z. Li, P. Batta, and Lj. Trajković, "Detecting BGP anomalies using machine learning techniques," in Proc. IEEE International Conference on Systems, Man, and Cybernetics (SMC 2016), Budapest, Hungary, Oct. 2016, pp. 3352-3355.

n Y. Li, H. J. Xing, Q. Hua, X.-Z. Wang, P. Batta, S. Haeri, and Lj. Trajković, “Classification of BGP anomalies using decision trees and fuzzy rough sets,” in Proc. IEEE International Conference on Systems, Man, and Cybernetics, SMC 2014, San Diego, CA, October 2014, pp. 1312-1317.

n N. Al-Rousan, S. Haeri, and Lj. Trajković, “Feature selection for classification of BGP anomalies using Bayesian models," in Proc. International Conference on Machine Learning and Cybernetics, ICMLC 2012, Xi'an, China, July 2012, pp. 140-147.

n N. Al-Rousan and Lj. Trajković, “Machine learning models for classification of BGP anomalies,” in Proc. IEEE Conf. High Performance Switching and Routing, HPSR 2012, Belgrade, Serbia, June 2012, pp. 103-108.

ISCAS 2018, Florence, Italy 40May 28, 2018