IDS - Analysis of SVM and decision trees

IDS [Intrusion Detection System]: Analysis of Decision Trees and SVM

S. V. Farrahi, H. Manzari, N. Kharazmi

Shiraz University of Technology

What is intrusion detection?

» Intrusion detection systems (IDSs) are software or hardware systems that automate the process of monitoring the events occurring in a computer system or network and analyzing them for signs of security problems.

» Intrusion detection is the process of monitoring the events occurring in a computer system or network and analyzing them for signs of intrusions, defined as attempts to compromise the confidentiality, integrity, or availability of a computer or network, or to bypass its security mechanisms.

Why do we need intrusion detection?

» Intrusions are caused by attackers accessing the systems from the Internet, by authorized users who attempt to gain additional privileges for which they are not authorized, and by authorized users who misuse the privileges given to them.

Classification of intrusion detection systems

Generally speaking, intrusion detection systems are classified in two ways:

» By data source, into host-based IDSs and network-based IDSs.

» By analysis method, into misuse detection and anomaly detection.

Host-based and network-based IDS

» Host-based systems base their decisions on information obtained from a single host (usually audit trails), while network-based intrusion detection systems obtain data by monitoring the traffic in the network to which the hosts are connected.

Misuse Detection and Anomaly Detection

» A signature detection system identifies patterns of traffic or application data presumed to be malicious, while anomaly detection systems compare activities against a “normal” baseline.

» Anomaly detection assumes that an intrusion will always reflect some deviation from normal patterns.

» Misuse detection is based on knowledge of system vulnerabilities and known attack patterns.

Signature-based detection

[Figure: activities are compared with known intrusion patterns by pattern matching; a match indicates an intrusion.]

Example: if (src_ip == dst_ip) then “land attack”
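The land attack rule can be written as a small signature matcher. This is a minimal sketch, not a production rule engine: the `Packet` type, the `SIGNATURES` list and the `match_signatures` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    """Hypothetical, heavily simplified packet record."""
    src_ip: str
    dst_ip: str


# Each signature pairs an attack name with a predicate over a packet.
SIGNATURES = [
    ("land attack", lambda p: p.src_ip == p.dst_ip),
]


def match_signatures(packet):
    """Return the names of all signatures that the packet matches."""
    return [name for name, predicate in SIGNATURES if predicate(packet)]


suspicious = Packet(src_ip="10.0.0.1", dst_ip="10.0.0.1")
benign = Packet(src_ip="10.0.0.1", dst_ip="10.0.0.2")
print(match_signatures(suspicious))
print(match_signatures(benign))
```

A new attack is covered only after a new entry is added to the signature list, which is the reactive character of misuse detection.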

Anomaly-based detection

[Figure: activity measures are compared with a profile of normal behavior; a deviation indicates a probable intrusion.]
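As a rough illustration of the anomaly-based idea, the sketch below builds a profile from one activity measure and flags values that deviate strongly from it. The choice of mean and standard deviation as the profile, the threshold of three standard deviations, and the function names are assumptions made for this example; real anomaly detectors use richer models.

```python
from statistics import mean, stdev


def build_profile(normal_values):
    """Summarize normal activity by its mean and standard deviation."""
    return mean(normal_values), stdev(normal_values)


def is_probable_intrusion(value, profile, threshold=3.0):
    """Flag a value that lies more than `threshold` deviations from normal."""
    mu, sigma = profile
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical activity measure: connections per minute during normal use.
profile = build_profile([10, 12, 11, 9, 10, 11])
print(is_probable_intrusion(11, profile))
print(is_probable_intrusion(50, profile))
```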

Misuse detection Advantages and disadvantages

» The primary advantage of signature detection is that known attacks can be detected fairly reliably with a low false positive rate.

» The drawback of the signature detection approach is that such systems typically require a signature to be defined for all of the possible attacks that an attacker may launch against a network

» The main disadvantage of misuse detection approaches is that they detect only the attacks they have been trained to detect.

» Novel or unknown attacks, and even variants of common attacks, often go undetected. At a time when new security vulnerabilities in software are discovered and exploited every day, the reactive approach embodied by misuse detection methods is not feasible for defeating malicious attacks.

Anomaly detection advantages and disadvantages

» Anomaly detection systems have two major advantages over signature-based intrusion detection systems. The first is their ability to detect unknown attacks as well as “zero day” attacks.

» The second is that profiles of normal activity are customized for every system, application, and/or network, which makes it very difficult for an attacker to know with certainty what activities can be carried out without being detected.

» A disadvantage of the anomaly detection approach is that well-known attacks may not be detected, particularly if they fit the established profile of the user.

» If an attacker knows that a profile of his activity is being kept, he can change his behavior gradually and so train the system to treat the attack as normal behavior.

Process model for Intrusion Detection

» Three fundamental functional components of an IDS:

» Information Sources – the different sources of event information used to determine whether an intrusion has taken place. These sources can be drawn from different levels of the system, with network, host, and application monitoring most common.

» Analysis – the part of the intrusion detection system that organizes and makes sense of the events derived from the information sources, deciding when those events indicate that intrusions are occurring or have already taken place.

» Response – send an alarm to the administrator.

Architecture

Architecture of an intrusion detection system

KDD Cup 99 dataset: a benchmark

» There are approximately 4,940,000 records in the training dataset.

» The training data contains 23 types of attacks and the test data contains 37, that is, 14 attack types that do not appear in the training data.

» Each record (row) has 41 features plus one class variable.

» The test data can therefore be used to assess the detection capacity for unknown attacks.

KDD Cup 99 dataset attacks

» The KDD Cup 99 dataset contains four types of attacks:

» Probe: strictly speaking, this is not a true attack but a preparation step that attackers carry out before launching attacks.

» DoS (denial of service): such an attack can stop the server from operating so that it cannot provide services. The attack usually consumes all of the server's system resources or its bandwidth, which disables the system and halts operation.

» U2R (user to root): a user exploits a system vulnerability to gain the privileges of a legitimate user or of the administrator.

» R2L (remote to local): an attacker sends packets to a machine over a network, then exploits the machine’s vulnerability to illegally gain local access as a user.

Evaluation steps

Classification tree

» A classification tree, also called a decision tree, is one of the main techniques used in data mining.

» Its main goal is to learn from class-labeled training tuples in order to predict the classes of new or previously unseen data.

» Building a tree involves two phases: top-down tree construction and bottom-up pruning.

» ID3 and C4.5, two common decision tree algorithms, construct the tree in a top-down manner.

Steps of Classification tree

1) Compute the information gain for each attribute.

2) Select the attribute with the highest information gain as the splitting attribute.

3) If the selected attribute is discrete (categorical), branch the node on all of its possible values. If the attribute is continuous, select the cut point with the highest information gain.

4) After splitting, check whether the new nodes are leaves (all of their data belong to the same class); otherwise, the new nodes become the roots of sub-trees.

5) Repeat the steps above until all new nodes are leaves.
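Steps 1 and 2 can be sketched as follows for categorical attributes. This is an illustrative sketch of plain information gain as described in the steps above, not the full C4.5 algorithm; the toy records and the attribute names `protocol` and `flag` are made up for the example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting on a categorical attribute."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in partitions.values()
    )
    return entropy(labels) - remainder


def best_split(rows, labels, attrs):
    """Step 2: pick the attribute with the highest information gain."""
    return max(attrs, key=lambda a: information_gain(rows, labels, a))


# Toy data (hypothetical): four connection records with two attributes.
rows = [
    {"protocol": "tcp", "flag": "SF"},
    {"protocol": "tcp", "flag": "S0"},
    {"protocol": "udp", "flag": "SF"},
    {"protocol": "icmp", "flag": "S0"},
]
labels = ["normal", "attack", "normal", "attack"]
print(best_split(rows, labels, ["protocol", "flag"]))
```

In this toy data `flag` separates the two classes completely, so it has the higher gain and is chosen as the splitting attribute.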

SVM – Support Vector Machine

[Figure: left, a separating hyperplane with a small distance to the data; right, a separating hyperplane with a large distance to the data.]
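The distance in question is the margin: the distance from the hyperplane w·x + b = 0 to the nearest data point, which an SVM tries to make as large as possible. The sketch below compares two candidate hyperplanes on made-up two-dimensional points; the points, the weights and the function names are assumptions chosen for illustration, and no SVM training is performed.

```python
from math import sqrt


def distance_to_hyperplane(w, b, x):
    """Distance from point x to the hyperplane w.x + b = 0."""
    norm = sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def margin(w, b, points):
    """Distance from the hyperplane to the closest point."""
    return min(distance_to_hyperplane(w, b, p) for p in points)


# Two classes: (1, 1) and (2, 2) on one side, (-1, -1) and (-2, -1) on the other.
points = [(1, 1), (2, 2), (-1, -1), (-2, -1)]

small = margin((1, 0), 0, points)  # vertical separator x1 = 0
large = margin((1, 1), 0, points)  # diagonal separator x1 + x2 = 0
print(small, large)
```

Both hyperplanes separate the two classes, but the diagonal one keeps the data farther away and is the one a maximum-margin classifier would prefer.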

Percentage of each data class in the 10% subset (kddcup.data_10_percent.gz)

Preprocess of data

» The experiments sample from the training dataset (the 10% subset, kddcup.data_10_percent.gz) and from the test dataset.

» From each of the training and test datasets, groups of 10,000 records are selected so that the proportion of normal records is 10%, 20%, 30%, ..., 90%.
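One way to draw such a group is sketched below. It assumes each record is a dictionary with a `label` field equal to `"normal"` for normal traffic, and that records are drawn without replacement; the slides do not state how the sampling was actually done, so both points are assumptions.

```python
import random


def sample_by_normal_proportion(records, normal_fraction, size=10_000, seed=0):
    """Draw `size` records in which `normal_fraction` of them are normal."""
    rng = random.Random(seed)
    normal = [r for r in records if r["label"] == "normal"]
    attack = [r for r in records if r["label"] != "normal"]
    n_normal = round(size * normal_fraction)
    # Sample each part without replacement, then mix them.
    sample = rng.sample(normal, n_normal) + rng.sample(attack, size - n_normal)
    rng.shuffle(sample)
    return sample


# Toy usage with 200 made-up records and a group size of 50.
records = [{"id": i, "label": "normal" if i % 2 else "smurf"} for i in range(200)]
group = sample_by_normal_proportion(records, 0.3, size=50)
print(sum(r["label"] == "normal" for r in group), len(group))
```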

Comparison

Accuracy = (TP + TN) / (TP + TN + FP + FN) * 100%

False alarm rate = FP / (FP + TN) * 100%

Detection rate = TP / (TP + FN) * 100%

Precision = TP / (TP + FP) * 100%

Recall = TP / (TP + FN) * 100%
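The five measures can be computed directly from the four confusion-matrix counts. The sketch below is a straightforward transcription, with accuracy taking TP + TN as its numerator; the function name and the example counts are made up for illustration.

```python
def ids_metrics(tp, tn, fp, fn):
    """Evaluation measures, in percent, from confusion-matrix counts."""
    return {
        "accuracy": 100.0 * (tp + tn) / (tp + tn + fp + fn),
        "false_alarm_rate": 100.0 * fp / (fp + tn),
        "detection_rate": 100.0 * tp / (tp + fn),
        "precision": 100.0 * tp / (tp + fp),
        "recall": 100.0 * tp / (tp + fn),
    }


# Hypothetical counts: 100 attack records and 100 normal records.
metrics = ids_metrics(tp=80, tn=90, fp=10, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}%")
```

Detection rate and recall are the same quantity under two names.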

Accuracy comparison between C4.5 and SVM



» When the proportion of normal data is large (>70%), the accuracy of the two classifiers is close, but SVM is better.

» On average, C4.5 is slightly better than SVM.

Detection rate comparison between C4.5 and SVM

Comparison of Detection Rate(cont…)

» The detection rate of C4.5 declines as the percentage of normal data rises, whereas that of SVM shows no consistent trend.

» Overall, the C4.5 curve lies above the SVM curve.

» The detection rate of C4.5 is therefore better than that of SVM.

False alarm rate comparison between C4.5 and SVM

False alarm rate comparison between C4.5 and SVM (cont..)

» In the false alarm rate comparison, SVM is inferior to C4.5 only when the proportion of normal data is 30%, 50%, or 60%; otherwise it is better than C4.5.

» On average, SVM is better than C4.5 in false alarm rate.

Comparison

» Comparing C4.5 and SVM, we find that C4.5 is superior to SVM in accuracy and detection rate, while SVM is better in false alarm rate.

Feature Selection

» In complex classification domains, features may contain false correlations, which hinder the process of detecting intrusions.

» Further, some features may be redundant, since the information they add is contained in other features.

» Extra features can increase computation time and can have an impact on the accuracy of the IDS.

Feature Selection(cont..)

» Empirical results indicate that selecting the significant input features is important for designing an IDS that is lightweight, efficient, and effective in real-world detection systems.

» IDSs try to perform their task in real time. Some data may not be useful to the IDS and can thus be eliminated before processing.

» Feature selection can help reduce the time needed to construct a model.

Correlation coefficient (preprocessing)

» The correlation coefficient of A and B is defined as follows:
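A minimal sketch, assuming that the coefficient meant here is the standard Pearson correlation coefficient, that is, the covariance of A and B divided by the product of their standard deviations. The function name and the sample values are illustrative only.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# Two made-up feature columns: the second is an exact multiple of the first.
print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))
print(pearson([1, 2, 3], [3, 2, 1]))
```

A coefficient close to +1 or -1 suggests that one of the two features adds little information beyond the other and is a candidate for removal.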

Detection rate comparison between C4.5 and SVM

Classification and Regression Trees (CART)

» The Classification and Regression Trees (CART) methodology is based on binary recursive partitioning.

» The process is binary because parent nodes are always split into exactly two child nodes and recursive because the process is repeated by treating each child node as a parent

» For splitting, the Gini rule is used which essentially is a measure of how well the splitting rule separates the classes contained in the parent node
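The Gini rule can be illustrated with a small sketch that scores every binary cut on one numeric feature and keeps the cut with the lowest weighted impurity. It covers a single split on a single feature only; the function names and the toy values are assumptions, and this is not a full CART implementation.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0 means a pure node)."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def gini_of_split(left, right):
    """Size-weighted Gini impurity of the two child nodes."""
    total = len(left) + len(right)
    return len(left) / total * gini(left) + len(right) / total * gini(right)


def best_binary_split(values, labels):
    """Return (cut, impurity) for the best split of the form value <= cut."""
    best = None
    for cut in sorted(set(values))[:-1]:
        left = [l for v, l in zip(values, labels) if v <= cut]
        right = [l for v, l in zip(values, labels) if v > cut]
        score = gini_of_split(left, right)
        if best is None or score < best[1]:
            best = (cut, score)
    return best


# Toy feature (hypothetical): small values are normal, large values are attacks.
values = [1, 2, 3, 10, 11, 12]
labels = ["normal"] * 3 + ["attack"] * 3
print(gini(labels), best_binary_split(values, labels))
```

Each child node would then be treated as a parent and split in the same way, which is the recursive part of the procedure.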

Classification and Regression Trees (CART)(cont…)

» Unlike other methods, CART does not stop in the middle of the tree growing process, because there might still be important information to be discovered by drilling down several more levels.

» Once the maximal tree is grown and a set of sub-trees is derived from it, CART determines the best tree by testing for error rates or costs

Classification and Regression Trees (CART)(cont…)

» The best sub-tree is the one with the lowest or near-lowest cost, which may be a relatively small tree

» The best variable selected at each node of the tree is called the primary variable.

» Surrogate variables are defined as the variables that most accurately predict the action of the primary variable

Result of CART

» The KDD Cup 99 data set has 41 features, which makes it high-dimensional.

» Intrusion detection is a real-time task, so feature reduction can help reduce the time needed to construct a model.

» Feature selection with CART resulted in a reduced 12-variable data set with C, E, F, L, W, X, Y, AB, AE, AF, AG and AI as variables.
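A sketch of how a record could be reduced to these 12 variables, assuming that the letters are spreadsheet-style column labels for the 41 features in their original order (A for the first feature, Z for the 26th, AA for the 27th, and so on). That reading of the labels is an assumption, and the helper names are made up for the example.

```python
SELECTED = ["C", "E", "F", "L", "W", "X", "Y", "AB", "AE", "AF", "AG", "AI"]


def column_index(label):
    """Convert a spreadsheet-style column label to a zero-based index."""
    index = 0
    for ch in label:
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


def reduce_record(record):
    """Keep only the 12 selected features of a 41-feature record."""
    return [record[column_index(label)] for label in SELECTED]


# A placeholder record whose values are simply the feature positions 0..40.
record = list(range(41))
print(reduce_record(record))
```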

Performance of CART

Experimental Result

Conclusion and future work

» Decision trees can help an IDS construct an accurate model, but they do not do well on R2L and U2R attacks.

» The U2R and R2L classes have little training data, and the decision tree gives better performance than SVM on them; from these empirical results we can say that the decision tree works well with small training data.

» We found that reducing the number of features does not necessarily reduce the test time. This depends on the relationships among the dataset features rather than on their number.
