Research Article
LNNLS-KH: A Feature Selection Method for Network Intrusion Detection

Xin Li,1 Peng Yi,1 Wei Wei,2 Yiming Jiang,1 and Le Tian1

1 Information Technology Institute, PLA Strategic Support Force Information Engineering University, Zhengzhou 450001, China
2 Center for Energy Environment & Economy Research, Zhengzhou University, Zhengzhou 450001, China

Correspondence should be addressed to Peng Yi; yipengndsc@163.com

Received 15 September 2020; Revised 27 November 2020; Accepted 17 December 2020; Published 6 January 2021

Academic Editor: Jesús Díaz-Verdejo

Hindawi, Security and Communication Networks, Volume 2021, Article ID 8830431, 22 pages, https://doi.org/10.1155/2021/8830431

Copyright © 2021 Xin Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

As an important part of intrusion detection, feature selection plays a significant role in improving detection performance. The krill herd (KH) algorithm is an efficient swarm intelligence algorithm with excellent performance in data mining. To address the low efficiency and high false positive rate in intrusion detection caused by increasingly high-dimensional data, an improved krill herd algorithm based on the linear nearest neighbor lasso step (LNNLS-KH) is proposed for feature selection in network intrusion detection. The number of selected features and the classification accuracy are introduced into the fitness evaluation function of the LNNLS-KH algorithm, and the physical diffusion motion of the krill individuals is transformed by a nonlinear method. Meanwhile, linear nearest neighbor lasso step optimization is performed on the updated krill herd position in order to derive the global optimal solution. Experiments show that the LNNLS-KH algorithm retains 7 features in the NSL-KDD dataset and 10.2 features in the CICIDS2017 dataset on average, which effectively eliminates redundant features while ensuring high detection accuracy. Compared with the CMPSO, ACO, KH, and IKH algorithms, it reduces features by 44%, 42.86%, 34.88%, and 24.32% in the NSL-KDD dataset and by 57.85%, 52.34%, 27.14%, and 25% in the CICIDS2017 dataset, respectively. The classification accuracy increased by 10.03% and 5.39%, and the detection rate increased by 8.63% and 5.45%. The time of intrusion detection decreased by 12.41% and 4.03% on average. Furthermore, the LNNLS-KH algorithm quickly jumps out of local optimal solutions and shows good performance in the optimal fitness iteration curve, convergence speed, and false positive rate of detection.

1. Introduction

With the advent of the era of big data, the dimension of information has increased exponentially. In many fields such as machine learning, data analysis, and text mining [1], it is increasingly difficult to handle large amounts of high-dimensional data. Irrelevant and redundant features increase the dimensional complexity and interfere with accurate classification, resulting in poor algorithm performance. An intrusion detection system (IDS) [2] relies on a large amount of network data; it monitors network transmission in real time and identifies and handles malicious use of computers and network resources. The "curse of dimensionality" (COD) caused by the massive data of IDS leads to a low detection rate, poor effectiveness, and a high false positive rate, which seriously affect the efficiency of intrusion detection. How to improve the efficiency of intrusion detection while ensuring detection accuracy has become an urgent problem.

As a common method of data dimensionality reduction, feature selection has attracted more and more attention. It reduces the complexity of data by deleting unnecessary features, which is of great significance to IDS. Feature selection algorithms filter out redundant data to reduce the dimensions of network data. As a result, the computational load of the IDS decreases and the detection speed improves. Consequently, feature selection is one of the critical links of data preprocessing in IDS and has a significant impact on detection accuracy and model generalization ability. Generally, the feature selection framework is composed of four parts: search module, evaluation criterion, judgment condition and verification, and output. The search module includes the search starting point and the search strategy. After the original feature set is processed by the
search module, the corresponding feature subset is generated. Appropriate evaluation criteria are constructed to evaluate the feature subsets. When the termination condition of the feature selection process is reached, the final selected feature subset is output. Meanwhile, it is verified to evaluate the quality of the feature selection algorithm. The framework of feature selection is shown in Figure 1.
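The framework described above can be sketched as a generic loop. This is a minimal illustration only, assuming a single-bit-flip search strategy and a toy evaluation criterion; `select_features`, `toy_score`, and all parameter values are hypothetical names chosen here and do not come from the paper.

```python
import random

def select_features(n_features, score, max_iter=200, seed=0):
    """Generic feature selection loop: search, evaluate, check termination, output."""
    rng = random.Random(seed)
    # Search starting point: a random binary mask (1 = feature kept).
    best = [rng.randint(0, 1) for _ in range(n_features)]
    best_score = score(best)
    for _ in range(max_iter):              # judgment condition: iteration budget
        candidate = best[:]
        flip = rng.randrange(n_features)   # search strategy: flip one random bit
        candidate[flip] = 1 - candidate[flip]
        s = score(candidate)               # evaluation criterion
        if s > best_score:
            best, best_score = candidate, s
    return best, best_score                # output: the selected subset

def toy_score(mask):
    """Hypothetical criterion: features 0 and 2 are useful, every kept feature costs a little."""
    useful = {0, 2}
    gain = sum(1.0 for i, bit in enumerate(mask) if bit and i in useful)
    return gain - 0.1 * sum(mask)

mask, value = select_features(6, toy_score)
```

In a real IDS setting the evaluation criterion would be replaced by a measure computed on the dataset, and verification would be done on held-out data.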

The swarm intelligence optimization method is a kind of group-oriented random search technology, which provides new ideas for solving the feature selection problem. The krill herd (KH) algorithm is a new type of swarm intelligence optimization method that studies the foraging rules and clustering behavior of krill herds in nature. By simulating the movement induced by other krill individuals, the foraging activity, and the physical diffusion motion of the krill herd, the positions of individuals are constantly updated. While looking for food and the highest krill herd density, they move towards the best solution and finally reach the global optimal solution. The KH algorithm has attracted wide attention from scholars and engineers for its excellent optimization performance and is considered to be one of the fastest developing nature-inspired heuristic algorithms for solving practical optimization problems [3]. It integrates the local robust search method with the population-based method and performs well in high-dimensional data processing. It is widely used in network path optimization [4], text clustering analysis [5], neural network training [6], multiple continuous optimization [7–9], combinatorial optimization [10, 11], constraint optimization [12–14], and other scenarios [3]. The KH algorithm has good exploitation ability, but its exploration ability is not satisfactory, which means that the algorithm easily falls into local optimal solutions when solving practical problems. Although improved variants of the KH algorithm exist, research on variants that provide both a high convergence rate and the global optimal solution is continuing. Therefore, improving the KH algorithm to balance global exploration and local exploitation is of great significance for solution accuracy and optimization efficiency.

In this paper, an optimized LNNLS-KH algorithm for feature selection is proposed to address the large volume and high dimensionality of intrusion detection datasets. It filters out the redundant features of IDS data so that the efficiency of intrusion detection is significantly improved and the time cost is greatly reduced.

The main contributions of this paper are listed as follows:

(i) The number of dimensions and the detection accuracy of feature selection were introduced into the fitness function, which improved the ability of feature selection.

(ii) To accelerate the convergence of the algorithm, we modified the physical diffusion motion of krill individuals by a nonlinear method.

(iii) The LNNLS-KH algorithm was proposed for feature selection of intrusion detection data, which effectively enhanced the local exploitation ability and global exploration ability of the algorithm.

(iv) The proposed algorithm was comprehensively evaluated by conducting a large number of experiments on the NSL-KDD and CICIDS2017 datasets. The experimental results show that the LNNLS-KH algorithm is competitive on the evaluation indicators for intrusion detection.

The remaining sections of this paper are organized as follows. Section 2 presents related work on feature selection methods and the variants of the KH algorithm. Section 3 provides a detailed description of the proposed LNNLS-KH algorithm. Section 4 presents the experimental results and discussion. Section 5 concludes the paper and outlines future research.

2. Related Works

In this section, we review three types of feature selection methods based on evaluation criteria and feature selection algorithms in IDS. We also summarize swarm intelligence algorithms, especially the KH algorithm and its variants.

2.1. Feature Selection Methods Based on the Evaluation Criteria. There are three types of feature selection methods based on the evaluation criteria: the filter method, the wrapper method, and the embedded method [15]. The filter method assigns weights to the features of each dimension, filters the features in order of weight, and uses the resulting feature subset to train the classification algorithm. The process of feature selection is therefore independent of the classification algorithm. Although the filter method uses fewer computing resources and less time for feature selection, the selected feature subset is not tuned to the classification algorithm, resulting in low classification accuracy. The wrapper method takes into account the performance of the classification algorithm on the feature subsets, so it achieves high classification accuracy, but at an enormous cost in computation and time. The embedded method integrates the feature selection process with the classification algorithm and performs feature selection during classification training. Its computation cost and classification accuracy lie between those of the filter method and the wrapper method. Feature selection for intrusion detection data requires high accuracy, and the training time on offline data is not a concern. Therefore, the wrapper method is adopted as the feature selection method in this paper. The frameworks of the three types of feature selection methods based on the evaluation criteria are shown in Figure 2.
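The difference between a filter criterion and a wrapper criterion can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: the class-mean-difference score, the leave-one-out 1-nearest-neighbor classifier, the function names, and the toy data are all hypothetical choices made here, not the criteria or classifier used in the paper.

```python
def filter_rank(X, y):
    """Filter criterion: rank features by |mean(class 1) - mean(class 0)|, classifier-free."""
    n_feat = len(X[0])
    scores = []
    for j in range(n_feat):
        a = [x[j] for x, lab in zip(X, y) if lab == 1]
        b = [x[j] for x, lab in zip(X, y) if lab == 0]
        scores.append(abs(sum(a) / len(a) - sum(b) / len(b)))
    return sorted(range(n_feat), key=lambda j: -scores[j])

def loo_1nn_accuracy(X, y, mask):
    """Wrapper criterion: leave-one-out 1-NN accuracy using only the masked features."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    correct = 0
    for i, xi in enumerate(X):
        best_d, best_label = float("inf"), None
        for k, xk in enumerate(X):
            if k == i:
                continue  # leave the query sample out
            d = sum((xi[j] - xk[j]) ** 2 for j in cols)
            if d < best_d:
                best_d, best_label = d, y[k]
        if best_label == y[i]:
            correct += 1
    return correct / len(X)

# Toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 4.0], [1.0, 4.5], [1.1, 0.5], [0.9, 5.5]]
y = [0, 0, 0, 1, 1, 1]
ranking = filter_rank(X, y)
acc_f0 = loo_1nn_accuracy(X, y, [1, 0])
acc_all = loo_1nn_accuracy(X, y, [1, 1])
```

The filter score never calls a classifier, whereas the wrapper score retrains and evaluates the classifier for every candidate subset, which is where its higher computation cost comes from.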

[Figure 1: Framework of feature selection. Blocks: search module (search starting point, search strategy), evaluation criterion, judgment condition (yes/no), verification, and output.]

[Figure 2: Frameworks of the three types of feature selection methods: (a) framework of the filter method for feature selection, (b) framework of the wrapper method for feature selection, and (c) framework of the embedded method for feature selection.]

2.2. Feature Selection Algorithms in IDS. Feature selection is one of the most important parts of data preprocessing in intrusion detection, which is of great significance to IDS. Network intrusion detection data are characterized by many features and large scale. Features of different categories have different attribute values, including redundant features that interfere with the classification results. A large number of redundant features reduces the efficiency of detection algorithms and increases the false positive rate of intrusion detection. A well-performing feature selection algorithm, however, decreases the dimensionality of network data and improves the accuracy and detection speed of IDS.

In recent years, there has been a great deal of research on feature selection in intrusion detection. Smith et al. combined a Bayesian network and principal component analysis (PCA) to conduct feature selection for intrusion detection data [16]. They used Bayesian networks to adjust the correlation of attributes and PCA to extract the primary features on an institute-wide cloud system. The disadvantage is that the detection accuracy leaves room for improvement. Zhao et al. [17] proposed a feature selection method based on Mahalanobis distance and applied it to network intrusion detection to obtain the optimal feature subset. Feature ranking based on Mahalanobis distance was used as the principal selection mechanism, and an improved exhaustive search was used to select the optimal ranking features. The experimental results on the KDD CUP 99 dataset show that the algorithm performs well with both the support vector machine and the k-nearest neighbor classifier. Singh and Tiwari proposed an efficient approach for intrusion detection on reduced features of the KDD CUP 99 dataset in 2015 [18]. The Iterative Dichotomiser 3 (ID3) algorithm was used for feature reduction of large datasets, and KNNGA was used as the classifier for intrusion detection. The method performs well on the evaluation measures of sensitivity, specificity, and accuracy. However, both Zhao et al. [17] and Singh and Tiwari [18] conducted experiments on outdated datasets, which can hardly reflect the new attack features of modern networks. In [19], Ambusaidi et al. proposed a feature selection algorithm based on mutual information to deal with linearly and nonlinearly related data features. They established an intrusion detection system based on the least-squares support vector machine. Experimental results show that the proposed algorithm performs well in accuracy but poorly in false positive rate. Shone et al. proposed an unsupervised feature learning method based on a nonsymmetric deep autoencoder (NDAE) and a novel deep learning classification model constructed using stacked NDAEs [20]. The results demonstrated that the approach offers high levels of accuracy, precision, and recall together with reduced training time. Notably, the stacked NDAE model requires 98.81% less training time than the mainstream DBN technology. The limitation is that the model's capability to handle zero-day attacks still needs to be assessed and extended.

In [21], a self-adaptive differential evolution (SaDE) algorithm was proposed to deal with the feature selection problem. It uses an adaptive mechanism to select the most appropriate of four candidate solution generation strategies, which effectively reduces the number of features. The disadvantage is that the experiment uses small sample data, and more data are needed to support the conclusion. Shen et al. adopted principal component analysis and linear discriminant analysis to decrease the dimensionality of the dataset and combined them with Bayesian classification to construct an intrusion detection model [22]. Simulation experiments on the CICIDS2017 dataset show that the proposed algorithm filters out noise in the data and improves time performance to a certain extent. However, the algorithm still needs to be optimized to further improve classification accuracy. In [23], a hybrid network feature selection method based on a convolutional neural network (CNN) and a long short-term memory network (LSTM) was applied to IDS. According to the experimental results, the proposed feature selection algorithm achieves better accuracy than the CNN-only and LSTM-only models. However, the detection accuracy for Heartbleed and SSH-Patator attacks is low. In [24], Farahani proposed a new cross-correlation-based feature selection (CCFS) method to reduce the feature dimension of intrusion detection datasets. Compared with the cuttlefish algorithm (CFA) and mutual information-based feature selection (MIFS), the proposed algorithm was shown to perform well in the accuracy, precision, and recall of classification. However, the author simply replaced the categorical attributes with numeric values when dealing with symbolic data, without considering the more reasonable one-hot encoding method. The summary of feature selection methods in IDS is shown in Table 1.

2.3. Swarm Intelligence Algorithms for Feature Selection. The core of feature selection is the search strategy for generating feature subsets. Although an exhaustive search strategy can find the globally optimal feature subset, its excessive time complexity consumes huge computing resources. In recent years, swarm intelligence optimization methods inspired by natural phenomena have provided a new approach to the feature selection problem [10–17]. Therefore, we propose the LNNLS-KH algorithm, which has high search efficiency, as the search strategy for feature subsets. Swarm intelligence optimization methods simulate the survival of the fittest in nature and are group-oriented random search techniques that can be used to solve complex problems in large-scale data analysis [25]. Common swarm intelligence optimization methods include particle swarm optimization (PSO) [26], the ant colony optimization algorithm (ACO) [27], the cuckoo algorithm (CA) [28], the artificial fish swarm algorithm (AFSA) [29], the artificial bee colony algorithm (ABC) [30], the fruit fly optimization algorithm (FOA) [31], the monkey algorithm (MA) [32], the bat algorithm (BA) [33], and the salp swarm algorithm (SSA) [34].

Moreover, Ahmed et al. proposed a new chaotic chicken swarm algorithm (CCSO) for feature selection [35]. By combining logical maps and chaotic trend maps, the CSO algorithm acquires a strong spatial search ability. The experimental results show that the classification accuracy of the model is further improved after CCSO feature selection. The disadvantage is the lack of comparison with other chaotic algorithms. Ahmtabakh proposed an unsupervised feature selection method based on ant colony optimization (UFSACO) [36], which iteratively filters features using the heuristic and previous-stage information of the ant colony. Simultaneously, the similarity between features is quantified to reduce the redundancy of data features. However, the efficiency of the feature selection process needs to be improved.

To address the tendency to fall into local optimal solutions, Arora and Anand proposed a butterfly optimization algorithm (BOA) based on binary variables [37]. Based on the foraging behavior of butterflies, the algorithm uses each butterfly as a search agent to iteratively optimize the fitness function; it has good convergence ability and avoids premature convergence to a certain extent. Experimental results show that the algorithm reduces the length of the feature subset while selecting the optimal feature subset and improves classification accuracy to a certain extent. However, its time cost is larger than that of the genetic algorithm and the particle swarm optimization algorithm, and the optimization results for the feature subset in repeated experiments are inaccurate and show poor robustness.

In [38], Yan et al. proposed a hybrid optimization algorithm (BCROSAT) based on simulated annealing and binary coral reefs, which is used for feature selection in high-dimensional biomedical datasets. The algorithm increases the diversity of the initial population through the league selection strategy and uses the simulated annealing algorithm and binary coding to improve the search ability of the coral reef optimization algorithm. However, the algorithm has high time complexity. In [39], a new chaotic dragonfly algorithm (CDA) was proposed by Sayed et al., which combines 10 different chaotic maps with the search iteration process of the dragonfly algorithm so as to accelerate convergence and improve the efficiency of feature selection. The algorithm uses the worst fitness value, best fitness value, average fitness value, standard deviation, and average feature length as evaluation criteria. The experimental results show that the adjustment variable of the Gauss map significantly improves the dragonfly algorithm in classification performance, stability, number of selected features, and convergence speed. The disadvantage is that the experimental data are small, and the algorithm needs to be verified on large-scale datasets. Zhang et al. [40] combined the genetic algorithm and the particle swarm optimization algorithm and conducted tabu search on the resulting optimal initial solution; the result of this second-stage feature selection is the global optimal feature subset. The algorithm not only guarantees good classification performance but also greatly reduces the false positive rate and false negative rate of the classification results. The disadvantage is that the algorithm incurs a large computation cost and a long offline training time.

2.4. Krill Herd (KH) Algorithm and Variants. The krill herd (KH) algorithm is a population-based swarm intelligence optimization method proposed by Gandomi and Alavi in 2012 [41]. The algorithm studies the foraging rules and clustering behavior of krill swarms in nature and simulates the induced movement, foraging activity, and random diffusion movement of the krill herd. It obtains the optimal solution by continuously updating the positions of krill individuals.

Abualigah et al. introduced a multicriteria mixed function based on the global optimal concept into the KH algorithm and applied it to text clustering [5]. By combining the advantages of local neighborhood search and global wide-area search, the algorithm balances the exploitation and exploration processes of the krill herd. In [42], the influence of excellent neighbor individuals on the krill herd during evolution is considered, and an improved KH algorithm is proposed to enhance the local search ability of the algorithm. In [43], a hybrid data clustering algorithm (IKH-KHM) based on an improved KH algorithm and k-harmonic means was proposed to address the sensitivity of the K-means algorithm to the clustering centers. This algorithm increases the diversity of the krill herd by alternately using the random walk of Lévy flight and the crossover operator of the genetic algorithm. It improves the global search ability of the algorithm and avoids premature convergence to some degree. Simulation experiments on 5 datasets from the UCI database show that the IKH-KHM algorithm overcomes the noise sensitivity problem to a certain extent and has a significant effect on the optimization of the objective function. However, its slow recovery speed results in a high time cost. In 2017, Li and Liu adopted a combined update mechanism of selection operator and mutation operator to enhance the global optimization ability of the KH algorithm. They solved the problem of unbalanced local search and global search in the original KH algorithm [44].

To enhance the global search ability of the KH algorithm, a KH algorithm improved with a global search operator was proposed by Jensi and Jiji [9] and applied to data clustering. The algorithm continuously searches around the original area to guide the krill herd toward the global optimum. It defines a new step size formula, which makes it convenient for krill individuals to fine-tune their positions in the search space. At the same time, an elite selection strategy is introduced into the krill herd update process, which helps the algorithm jump out of local optimal solutions. Experimental results show that the improved KH algorithm has higher accuracy and better robustness.

Table 1: Summary of feature selection methods in IDS.

| Method | Author | Year | Ref. no. |
|---|---|---|---|
| Bayesian network-based dimensionality reduction and principal component analysis (PCA) | Smith et al. | 2010 | [16] |
| Ranking based on Mahalanobis distance and exhaustive search | Zhao et al. | 2013 | [17] |
| Iterative Dichotomiser 3 (ID3) algorithm | Singh and Tiwari | 2015 | [18] |
| Mutual information method | Ambusaidi et al. | 2016 | [19] |
| Nonsymmetric deep autoencoder (NDAE) | Shone et al. | 2018 | [20] |
| Self-adaptive differential evolution (SaDE) | Xue et al. | 2018 | [21] |
| Principal component analysis (PCA) and linear discriminant analysis (LDA) | Shen et al. | 2019 | [22] |
| Hybrid network of convolutional neural network (CNN) and long short-term memory network (LSTM) | Sun et al. | 2020 | [23] |
| Cross-correlation-based feature selection (CCFS) method | Farahani | 2020 | [24] |

In [45], Wang et al. proposed a stud KH algorithm. The method adopts a new krill herd genetics and reproduction mechanism, replacing the random selection in the standard KH algorithm with a columnar selection operator and a crossover operator. To balance the exploration and exploitation abilities of the KH algorithm, Li et al. proposed a linear decreasing step KH algorithm [46]. In this algorithm, the step size scaling factor decreases linearly as the number of iterations increases, thereby enhancing the search ability of the algorithm.
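A linearly decreasing step size factor of the kind described for [46] can be sketched as follows. This is an illustration only: the function name `step_scale` and the bounds `c_max` and `c_min` are hypothetical, and the exact formula used in [46] is not given in this section.

```python
def step_scale(iteration, max_iter, c_max=1.0, c_min=0.1):
    """Linearly interpolate the step size factor from c_max (first iteration) to c_min (last)."""
    if max_iter < 1 or not 0 <= iteration <= max_iter:
        raise ValueError("iteration must lie in [0, max_iter] with max_iter >= 1")
    return c_max - (c_max - c_min) * iteration / max_iter

# Hypothetical schedule over 10 iterations: large steps early, small steps late.
schedule = [step_scale(t, 10) for t in range(11)]
```

Large early steps favor exploration of the search space, and small late steps favor exploitation around the current best solution.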

Although the KH algorithm and its enhanced versions show better performance than other swarm intelligence algorithms, deficiencies such as unbalanced exploration and exploitation remain. In this paper, to minimize the number of selected features and achieve high classification accuracy, both quantities are introduced into the fitness evaluation function. The physical diffusion motion of krill individuals is improved nonlinearly so that the random diffusion amplitude is adjusted dynamically, which accelerates the convergence of the algorithm. At the same time, linear nearest neighbor lasso step optimization is performed after the krill herd position is updated, which effectively enhances the global exploration ability. This helps the algorithm achieve better performance, reduce the data dimension through feature selection, and improve the efficiency of intrusion detection.
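One common way to combine the two quantities in a single fitness value is a weighted sum of the classification error and the fraction of features kept. The sketch below assumes that form purely for illustration; the function name, the `weight` parameter, and its default value are hypothetical, and the paper's own fitness function is not stated in this section.

```python
def fitness(mask, accuracy, weight=0.9):
    """Lower is better: weighted sum of classification error and fraction of features kept."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    kept_ratio = sum(mask) / len(mask)
    return weight * (1.0 - accuracy) + (1.0 - weight) * kept_ratio

# At equal accuracy, the subset with fewer features gets the better (lower) fitness.
small = fitness([1, 0, 0, 0], accuracy=0.9)
large = fitness([1, 1, 1, 1], accuracy=0.9)
```

With a weight close to 1, accuracy dominates and the feature count mainly breaks ties between subsets of similar accuracy.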

3. Algorithm Design

In this section, we first provide a brief description of the KH algorithm; subsequently, we present an improved version of KH, named LNNLS-KH, to address the large number of features and high dimensionality in feature selection for intrusion detection.

3.1. Standard KH Algorithm. The framework of the KH algorithm is shown in Figure 3. It includes the three actions of krill individuals, the crossover operation, position updating, and calculation of the fitness function. After initialization, krill individuals change their positions according to the three actions. Then the crossover operator is executed to complete the position update, and the new fitness value is calculated. If the number of iterations has not reached the maximum, krill individuals repeat the process until the iterations are completed.
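The loop described above can be sketched as a skeleton with pluggable motion and crossover steps. This is a minimal sketch under stated assumptions, not the paper's implementation: the explicit step `x + dt * velocity`, the best-so-far bookkeeping, and the toy `sphere`, `toy_motion`, and `no_crossover` functions are hypothetical stand-ins introduced here.

```python
import random

def krill_herd(fitness, herd, motion, crossover, max_iter, dt=1.0):
    """Skeleton of the KH loop; lower fitness is assumed to be better."""
    herd = [list(x) for x in herd]                      # initialization
    scores = [fitness(x) for x in herd]
    best_score = min(scores)
    best = herd[scores.index(best_score)][:]
    for _ in range(max_iter):                           # stop at the iteration limit
        moved = []
        for x in herd:
            velocity = motion(x, best)                  # stands in for the three actions
            candidate = [xi + dt * v for xi, v in zip(x, velocity)]
            moved.append(crossover(candidate, best))    # crossover completes the update
        herd = moved
        scores = [fitness(x) for x in herd]             # recompute fitness
        if min(scores) < best_score:                    # remember the best position seen
            best_score = min(scores)
            best = herd[scores.index(best_score)][:]
    return best, best_score

rng = random.Random(1)

def sphere(x):
    """Toy objective (assumption): squared distance from the origin."""
    return sum(v * v for v in x)

def toy_motion(x, best):
    """Hypothetical motion: drift toward the best krill plus small random diffusion."""
    return [0.5 * (b - v) + rng.uniform(-0.1, 0.1) for v, b in zip(x, best)]

def no_crossover(candidate, best):
    """Placeholder crossover that leaves the candidate unchanged."""
    return candidate

start = [[3.0, 4.0], [1.0, -2.0], [-2.0, 0.5]]
best, best_score = krill_herd(sphere, start, toy_motion, no_crossover, max_iter=50)
```

Passing the motion and crossover steps as functions keeps the loop itself fixed while the individual actions are swapped or refined.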

As a novel biologically inspired algorithm for solving optimization tasks, the KH algorithm represents each possible solution of the problem by a krill individual. By simulating foraging behavior, the krill herd position is continuously updated to obtain the global optimal solution. The motions of krill individuals are mainly affected by the following three aspects:

(1) Movement induced by other krill individuals
(2) Foraging activity
(3) Physical diffusion motion

The KH algorithm adopts the Lagrange model to search in multidimensional space. The position update of krill individuals is shown as follows:

$$\frac{dX_i}{dt} = N_i + F_i + D_i, \tag{1}$$

where $X_i = \{X_{i,1}, X_{i,2}, \ldots, X_{i,NV}\}$, $N_i$ is the movement induced by other krill individuals, $F_i$ is the foraging activity of the krill individual, and $D_i$ is the random physical diffusion based on density region.

3.1.1. Movement Induced by Other Krill Individuals. The movement induced by other krill individuals is described as follows:

$$N_i^{new} = N^{max}\alpha_i + \omega_n N_i^{old}, \tag{2}$$

$$\alpha_i = \alpha_i^{local} + \alpha_i^{target}, \tag{3}$$

where $N^{max}$ is the maximum induction velocity of surrounding krill individuals, taken as 0.01 (m s$^{-1}$) [5]; $\omega_n$ represents the inertia weight in the range [0, 1]; $N_i^{old}$ is the result of the last motion induced by other krill individuals; $\alpha_i^{local}$ is a parameter indicating the direction of guidance; and $\alpha_i^{target}$ is the direction effect of the global optimal krill individual.

$\alpha_i^{local}$ is defined as follows:

$$\alpha_i^{local} = \sum_{j=1}^{NN} \hat{K}_{i,j}\hat{X}_{i,j}, \qquad \hat{X}_{i,j} = \frac{X_j - X_i}{\|X_j - X_i\| + \varepsilon}, \qquad \hat{K}_{i,j} = \frac{K_i - K_j}{K^{worst} - K^{best}}, \tag{4}$$

where $K^{best}$ and $K^{worst}$ are the best and worst fitness values of the krill herd, $K_i$ is the fitness value of the ith krill individual, $K_j$ represents the fitness value of the jth neighbor krill individual ($j = 1, 2, \ldots, NN$), and $NN$ represents the total number of neighbors. The $\varepsilon$ in the denominator is a small positive number that avoids the singularity caused by a zero denominator.

When selecting surrounding krill individuals, the KH algorithm finds the nearest neighbors of the ith krill individual by defining a "neighborhood ratio." It is a circular area with the ith krill individual as the center and the perception distance $d_{s,i}$ as the radius. $d_{s,i}$ is described as follows:

$$d_{s,i} = \frac{1}{5N}\sum_{j=1}^{N}\|X_i - X_j\|, \tag{5}$$

where $N$ is the number of krill individuals, and $X_i$ and $X_j$ represent the positions of the ith and jth krill individuals. $\alpha_i^{target}$ is defined as follows:

$$\alpha_i^{target} = C^{best}\hat{K}_{i,best}\hat{X}_{i,best}, \tag{6}$$

where $C^{best}$ is the effective coefficient between the ith and the global optimal krill individuals:

$$C^{best} = 2\left(rand + \frac{I}{I_{max}}\right), \tag{7}$$

where $I$ is the current iteration number, $I_{max}$ is the maximum number of iterations, and $rand$ is a random number in [0, 1], which is used to enhance the exploration ability.
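As an illustration, equations (2) to (7) can be sketched in plain Python as follows. This is a minimal sketch and not the authors' implementation: the function names, the inertia weight `w_n = 0.5`, the use of smaller fitness as better, and the extra `eps` guard in the fitness denominator are assumptions made for the example.

```python
import math
import random

def sensing_distance(i, X):
    """Equation (5): d_{s,i} = (1 / (5N)) * sum_j ||X_i - X_j||."""
    return sum(math.dist(X[i], xj) for xj in X) / (5 * len(X))

def unit_effect(xi, ki, xj, kj, k_best, k_worst, eps=1e-12):
    """K_hat_ij * X_hat_ij from equation (4) for one pair of krill.
    Assumption: eps is also added to the fitness denominator as a guard."""
    d = math.dist(xi, xj)
    k_hat = (ki - kj) / (k_worst - k_best + eps)
    return [k_hat * (b - a) / (d + eps) for a, b in zip(xi, xj)]

def induced_motion(i, X, K, N_old, it, it_max, n_max=0.01, w_n=0.5, rng=random):
    """Equations (2), (3), (6), (7): N_new = N_max * alpha_i + w_n * N_old."""
    k_best, k_worst = min(K), max(K)   # assumption: smaller fitness is better
    best = K.index(k_best)
    ds = sensing_distance(i, X)
    alpha = [0.0] * len(X[i])
    # Local effect: sum over neighbors inside the sensing distance.
    for j in range(len(X)):
        if j != i and math.dist(X[i], X[j]) < ds:
            effect = unit_effect(X[i], K[i], X[j], K[j], k_best, k_worst)
            alpha = [a + e for a, e in zip(alpha, effect)]
    # Target effect of the global best individual, scaled by C_best.
    c_best = 2.0 * (rng.random() + it / it_max)
    target = unit_effect(X[i], K[i], X[best], K[best], k_best, k_worst)
    alpha = [a + c_best * t for a, t in zip(alpha, target)]
    return [n_max * a + w_n * o for a, o in zip(alpha, N_old)]
```

Calling `induced_motion(0, X, K, [0.0] * dim, it, it_max)` for every krill index yields the first of the three motion components.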

3.1.2. Foraging Activity. Foraging activity is affected by the food distance and the experience of food location, and it is described as follows:

$$F_i = V_f\beta_i + \omega_f F_i^{old}, \tag{8}$$

$$\beta_i = \beta_i^{food} + \beta_i^{best}, \tag{9}$$

where $V_f$ is the foraging speed, taken as 0.02 (m s$^{-1}$) [41]; $\omega_f$ is the inertia weight in the range [0, 1]; and $\beta_i$ indicates the foraging direction, which consists of the food induction direction $\beta_i^{food}$ and the induction direction of the historically optimal krill individual $\beta_i^{best}$. The food is essentially a virtual location based on the concept of "centroid." It is defined as follows:

$$X^{food} = \frac{\sum_{i=1}^{N}(1/K_i)X_i}{\sum_{i=1}^{N}(1/K_i)}. \tag{10}$$

(1) The induced direction of food to the ith krill individual is expressed as follows:

$$\beta_i^{food} = C^{food}\hat{K}_{i,food}\hat{X}_{i,food}, \tag{11}$$

where $C^{food}$ is the food coefficient, determined as follows:

$$C^{food} = 2\left(1 - \frac{I}{I_{max}}\right). \tag{12}$$

(2) The induced direction of the historical best krill individual to the ith krill individual is expressed as follows:

$$\beta_i^{best} = \hat{K}_{i,best}\hat{X}_{i,best}, \tag{13}$$

where $\hat{K}_{i,best}$ represents the influence of the historical best individual on the ith krill individual.
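The food centroid of equation (10), the food coefficient of equation (12), and the foraging update of equation (8) can be sketched as below. This is an illustrative sketch only: the function names and the inertia weight `w_f = 0.5` are assumptions, and the fitness values are assumed to be strictly positive so that the weights $1/K_i$ are defined.

```python
def food_position(X, K):
    """Equation (10): centroid of the herd weighted by 1 / K_i."""
    weights = [1.0 / k for k in K]        # assumes every K_i > 0
    total = sum(weights)
    dim = len(X[0])
    return [sum(w * x[m] for w, x in zip(weights, X)) / total for m in range(dim)]

def food_coefficient(it, it_max):
    """Equation (12): C_food = 2 * (1 - I / I_max), shrinking to 0 over time."""
    return 2.0 * (1.0 - it / it_max)

def foraging_motion(beta, F_old, v_f=0.02, w_f=0.5):
    """Equation (8): F_i = V_f * beta_i + w_f * F_i_old."""
    return [v_f * b + w_f * f for b, f in zip(beta, F_old)]
```

Because the weights are the inverse fitness values, krill with smaller (better) fitness pull the virtual food position toward themselves.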

3.1.3. Physical Diffusion Motion. Physical diffusion is a stochastic process. The expression is as follows:

$$D_i = D^{max}\left(1 - \frac{I}{I_{max}}\right)\delta, \tag{14}$$

where $D^{max}$ is the maximum diffusion velocity in the range [0.002, 0.010] (m s$^{-1}$). According to [41], it is taken as 0.005 (m s$^{-1}$). $\delta$ represents the random direction vector, whose values are random numbers in [−1, 1].

Figure 3: The framework of KH algorithm (flowchart: the three actions of a krill individual, namely movement induced by other krill individuals, foraging movement, and physical diffusion movement; crossover operation; updating position; calculating the fitness function).

3.1.4. Crossover. The crossover operator is an effective global optimization strategy. An adaptive vectorization crossover scheme is added to the standard KH algorithm to further enhance the global search ability of the algorithm [41]. It is given as follows:

$$X_{i,m} = \begin{cases} X_{i,m}\cdot Cr + X_{r,m}\cdot(1 - Cr), & rand_{i,m} < Cr, \\ X_{i,m}, & \text{else}, \end{cases} \qquad Cr = 0.2\hat{K}_{i,best}, \tag{15}$$

where $r$ is a random index with $r \in \{1, 2, \ldots, i-1, i+1, \ldots, N\}$, $X_{i,m}$ represents the mth dimension of the ith krill individual, $X_{r,m}$ represents the mth dimension of the rth krill individual, and $Cr$ is the crossover probability, which decreases as the fitness improves; the crossover probability of the globally optimal individual is zero.
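A minimal sketch of the crossover in equation (15) is given below. The function name and signature are hypothetical, and drawing a fresh partner `r` for every dimension is an assumption, since the text does not state whether `r` is drawn once per individual or once per dimension.

```python
import random

def crossover(i, X, k_hat_i_best, rng=random):
    """Equation (15): blend dimension m of krill i with a random other krill r
    whenever rand_{i,m} < Cr, where Cr = 0.2 * K_hat_{i,best}."""
    cr = 0.2 * k_hat_i_best
    others = [r for r in range(len(X)) if r != i]
    child = list(X[i])
    for m in range(len(child)):
        if rng.random() < cr:
            r = rng.choice(others)   # assumption: new partner per dimension
            child[m] = child[m] * cr + X[r][m] * (1.0 - cr)
    return child
```

For the globally best individual, $\hat{K}_{i,best} = 0$, so `cr` is zero and the position is returned unchanged, which matches the remark that its crossover probability is zero.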

3.1.5. Movement Process of KH Algorithm. Affected by the movement induced by other krill individuals, foraging activity, and physical diffusion, the krill herd changes its position towards the direction of optimal fitness. The position vector of a krill individual over the interval $[t, t + \Delta t]$ is described as follows:

$$X_i(t + \Delta t) = X_i(t) + \Delta t\,\frac{dX_i}{dt}, \tag{16}$$

where $\Delta t$ is the scaling factor of the velocity vector. It depends entirely on the search space:

$$\Delta t = C_t\sum_{j=1}^{NV}\left(UB_j - LB_j\right), \tag{17}$$

where $NV$ represents the dimension of the decision variables, $LB_j$ and $UB_j$ are the lower and upper bounds of the jth variable ($j = 1, 2, \ldots, NV$), and $C_t$ is the step scaling factor in the range [0, 2].
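Equations (1), (16), and (17) combine into a single position update, sketched below under stated assumptions: the function names and the default `c_t = 0.5` are illustrative, and clipping the new position to the bounds is an added safeguard that the text does not describe.

```python
def time_step(lower, upper, c_t=0.5):
    """Equation (17): delta_t = C_t * sum_j (UB_j - LB_j)."""
    return c_t * sum(u - l for l, u in zip(lower, upper))

def update_position(x, n_i, f_i, d_i, dt, lower, upper):
    """Equations (1) and (16): X(t + dt) = X(t) + dt * (N_i + F_i + D_i).
    Assumption: the result is clipped to [LB_j, UB_j] in every dimension."""
    moved = [xi + dt * (n + f + d) for xi, n, f, d in zip(x, n_i, f_i, d_i)]
    return [min(max(v, l), u) for v, l, u in zip(moved, lower, upper)]
```

With feature positions in [0, 1] for all `NV` dimensions, `time_step` grows linearly with `NV`, so a smaller `c_t` gives a finer search in high-dimensional feature spaces.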

3.2. The LNNLS-KH Algorithm. In view of the unbalanced exploitation and exploration abilities of the KH algorithm, we propose the LNNLS-KH algorithm for feature selection to improve performance and pursue a high accuracy rate, a high detection rate, and a low false positive rate in intrusion detection. The improvement is reflected in the following three aspects.

3.2.1. A New Fitness Evaluation Function. To improve the classification accuracy of feature subset detection, we introduce the feature selection dimension and the classification accuracy into the fitness evaluation function. The specific expression of fitness is as follows:

$$fitness = \alpha\cdot\frac{Feature_{selected}}{Feature_{all}} + (1 - \alpha)\cdot(1 - Accuracy), \tag{18}$$

where $\alpha \in [0, 1]$ is a weighting factor used to tune the relative importance of the number of selected features and the classification accuracy, $Feature_{selected}$ is the number of selected features, $Feature_{all}$ represents the total number of features, and $Accuracy$ indicates the accuracy of the classification results. Moreover, k-nearest neighbor (KNN) is used as the classification algorithm, and the classification accuracy is defined as follows:

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}, \tag{19}$$

where TP, TN, FP, and FN are defined in the confusion matrix, as shown in Table 2.
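Equations (18) and (19) translate directly into code. The sketch below uses hypothetical function names, and the default `alpha = 0.02` is the weight reported later for the experiments; the counts in the usage note are made up for illustration.

```python
def accuracy(tp, tn, fp, fn):
    """Equation (19): share of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)

def fitness(n_selected, n_all, acc, alpha=0.02):
    """Equation (18): weighted sum of the feature ratio and the error rate.
    Smaller values are better."""
    return alpha * n_selected / n_all + (1.0 - alpha) * (1.0 - acc)
```

For instance, with 7 of 41 features and an accuracy of 0.95, the feature term contributes about 0.0034 and the error term 0.049, so with a small `alpha` the fitness is dominated by classification error.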

3.2.2. Nonlinear Optimization of Physical Diffusion Motion. The physical diffusion of the krill herd is a random diffusion process. The closer the individuals are to the food, the less random the movement is. Due to the strong convergence of the algorithm, the movement of krill individuals presents a nonlinear change from fast to slow, and the fitness function gradually decreases as the algorithm converges. According to equations (2) and (9), the movement induced by other krill individuals and the foraging activity are nonlinear. In the physical diffusion equation (14), the diffusion velocity $D_i$ of the ith krill individual decreases linearly as the number of iterations increases. In order to fit the nonlinear motion of the krill herd, we introduce the optimization coefficient $\lambda$ and the fitness factor $\mu_{fit}$ of the krill herd into the physical diffusion motion. The optimized physical diffusion motion is defined as follows:

$$D_i = D^{max}\left[1 - \lambda\frac{I}{I_{max}} - (1 - \lambda)\mu_{fit}\right]\delta, \tag{20}$$

where $\lambda$ is in the range [0, 1] and $\mu_{fit}$ is defined as follows:

$$\mu_{fit} = \frac{K^{best}}{K_i}, \tag{21}$$

where $K^{best}$ is the fitness value of the current optimal individual and $K_i$ represents the fitness value of the ith krill individual. As the number of iterations increases, $K_i$ gradually decreases until it approaches $K^{best}$. Therefore, $\mu_{fit}$ is in the range (0, 1]. Introducing the fitness factor $\mu_{fit}$ into equation (20) gives the new physical diffusion motion equation:

$$D_i = D^{max}\left[1 - \lambda\frac{I}{I_{max}} - (1 - \lambda)\frac{K^{best}}{K_i}\right]\delta. \tag{22}$$

According to equation (22), the number of iterations $I$, the fitness $K_i$ of the krill individual, and the fitness $K^{best}$ of the current optimal krill individual jointly determine the physical diffusion motion, so as to further adjust the random diffusion amplitude. In the early stage of the iteration, the number of iterations is small and the fitness value of the individual is large, so the fitness factor is small, which is conducive to a large random diffusion of the krill herd. As the number of iterations gradually increases, the algorithm converges quickly and the fitness of krill individuals approaches the global optimal solution. At the same time, the fitness factor increases nonlinearly, which makes the random diffusion more consistent with the movement process of krill individuals.
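The difference between the linear diffusion of equation (14) and the nonlinear form of equation (22) can be sketched as follows. The function names and the default `lam = 0.5` are assumptions made for illustration; `d_max = 0.005` follows the value quoted for the standard KH algorithm, and the random direction is drawn per dimension from [−1, 1].

```python
import random

def diffusion_amplitude(it, it_max, k_i, k_best, lam=0.5, d_max=0.005):
    """Bracket of equation (22) times D_max:
    D_max * [1 - lam * I / I_max - (1 - lam) * K_best / K_i]."""
    return d_max * (1.0 - lam * it / it_max - (1.0 - lam) * k_best / k_i)

def diffusion_nonlinear(it, it_max, k_i, k_best, dim, lam=0.5, d_max=0.005, rng=random):
    """Equation (22): amplitude times a random direction vector delta."""
    amp = diffusion_amplitude(it, it_max, k_i, k_best, lam, d_max)
    return [amp * rng.uniform(-1.0, 1.0) for _ in range(dim)]

def diffusion_linear(it, it_max, dim, d_max=0.005, rng=random):
    """Equation (14): amplitude decreases linearly with the iteration count."""
    amp = d_max * (1.0 - it / it_max)
    return [amp * rng.uniform(-1.0, 1.0) for _ in range(dim)]
```

Setting `lam = 1` recovers the linear schedule of equation (14); with `lam < 1`, an individual whose fitness is already close to the current best diffuses less, even early in the run.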

To further evaluate the effect of the KH algorithm with nonlinear optimization of physical diffusion motion (NO-KH), we conducted experiments on two classical benchmark functions. $F_1(x)$ is the Schwefel 2.22 function, which is a unimodal benchmark function. $F_2(x)$ is the Ackley function, which is a multimodal benchmark function. The experimental parameters of $F_1(x)$ and $F_2(x)$ are shown in Table 3.

Figure 4 shows the Ackley function and the Schwefel 2.22 function graphs for n = 2. We use the standard KH algorithm and the NO-KH algorithm to find the optimal value on the unimodal benchmark function and the multimodal benchmark function, respectively. The number of krill and the number of iterations are set to 25 and 500. Table 4 shows the best value, worst value, mean value, and standard deviation obtained by running the algorithms 20 times. We can see that, compared with the standard KH algorithm, the NO-KH algorithm finds smaller optimal solutions on both the unimodal and the multimodal benchmark functions, and its global exploration ability is improved. The smaller standard deviation obtained from repeated experiments shows that the NO-KH algorithm has better stability. Therefore, nonlinear optimization of the physical diffusion motion of the KH algorithm is effective.
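For reference, the two benchmark formulas listed in Table 3 can be written in a few lines of Python. The sketch below uses the standard textbook definitions of the Schwefel 2.22 function (sum plus product of absolute values) and the Ackley function; the function names are chosen for this example.

```python
import math

def schwefel_2_22(x):
    """Unimodal benchmark: sum_i |x_i| + prod_i |x_i|, minimum 0 at x = 0."""
    return sum(abs(v) for v in x) + math.prod(abs(v) for v in x)

def ackley(x):
    """Multimodal benchmark with many local minima, minimum 0 at x = 0."""
    n = len(x)
    mean_sq = sum(v * v for v in x) / n
    mean_cos = sum(math.cos(2.0 * math.pi * v) for v in x) / n
    return -20.0 * math.exp(-0.2 * math.sqrt(mean_sq)) - math.exp(mean_cos) + 20.0 + math.e
```

Both functions accept a position vector of any dimension, so the same code covers the 10-dimensional experiments and the n = 2 plots.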

The above analysis shows that introducing the optimization coefficient $\lambda$ and the fitness factor $\mu_{fit}$ into the physical diffusion motion of the krill herd is conducive to dynamically adjusting the random diffusion amplitude of the krill individuals and accelerating the convergence speed of the algorithm. Meanwhile, it increases the nonlinearity of the physical diffusion motion and the global exploration ability of the algorithm.

3.2.3. Linear Nearest Neighbor Lasso Step Optimization. When the KH algorithm is used to solve multidimensional complex function optimization problems, its local search ability is weak, and exploitation and exploration are difficult to balance. To enhance the local exploitation and global exploration abilities of the algorithm, the influence of excellent neighbor individuals on the krill herd during evolution is considered, and an improved KH algorithm is proposed in [42]. That algorithm introduces the nearest neighbor lasso operator to mine the neighborhood of potentially excellent individuals and improve the local search ability of krill individuals, but the random parameters introduced in the lasso operator increase the uncertainty of the algorithm. To cope with this problem, we introduce an improved krill herd based on linear nearest neighbor lasso step optimization (LNNLS-KH), which finds the nearest neighbor of each krill individual after the position update and moves linearly by a defined step to derive a better fitness value. With linearization, the nearest neighbor lasso step of the algorithm changes linearly with the number of iterations, thereby balancing the exploitation and exploration abilities of the algorithm. In the early iterations, a large linear nearest neighbor lasso step is selected so that the krill individuals can quickly adjust their positions, which improves the search efficiency of the algorithm. In the later stage of iteration, the nearest neighbor lasso step decreases linearly to obtain the global optimal solution.

In the krill herd $X = \{X_1, X_2, \ldots, X_n\}$, assuming that the jth krill individual is the nearest neighbor of the ith krill individual, the Euclidean distance between the two krill individuals is defined as follows:

$$distance_{ij} = \|X_i - X_j\|, \tag{23}$$

where $\{X_i, X_j\} \subset S$ and $i \neq j$. The linear nearest neighbor lasso step is defined as follows:

$$step = \begin{cases} \dfrac{I}{I_{max}} \times \left(X_i - X_j\right), & K_i > K_j, \\[2mm] \dfrac{I}{I_{max}} \times \left(X_j - X_i\right), & K_j > K_i. \end{cases} \tag{24}$$

The fitness function is given by equation (18). Therefore, a smaller fitness value means that fewer features are selected at a higher accuracy, i.e., the position of the krill individual is better. The schematic diagram of LNNLS-KH is shown in Figure 5. The new position $Y_k$ of the jth krill individual is expressed as follows:

$$Y_k = \begin{cases} X_j + \dfrac{I}{I_{max}} \times \left(X_i - X_j\right), & K_i > K_j, \\[2mm] X_i + \dfrac{I}{I_{max}} \times \left(X_j - X_i\right), & K_j > K_i. \end{cases} \tag{25}$$

When the ith and jth krill individuals lie at opposite ends of the food, the new position $Y_k$ will be far from the optimal solution after the linear nearest neighbor lasso step optimization, as shown in Figure 6.
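The nearest neighbor search and the candidate position of equations (23) to (25) can be sketched as follows, together with a greedy acceptance rule. This is a sketch under assumptions rather than the authors' code: the function names are hypothetical, ties in fitness are handled by the second case, and replacing the worse krill of the pair when the candidate improves on it is one reading of the comparison step, which the text leaves open.

```python
import math

def nearest_neighbor(i, X):
    """Index j != i minimizing the Euclidean distance of equation (23)."""
    return min((j for j in range(len(X)) if j != i),
               key=lambda j: math.dist(X[i], X[j]))

def lnnls_candidate(xi, ki, xj, kj, it, it_max):
    """Equation (25): start from the better krill and move by the step of (24)."""
    ratio = it / it_max
    if ki > kj:                       # X_j is better (smaller fitness)
        return [b + ratio * (a - b) for a, b in zip(xi, xj)]
    return [a + ratio * (b - a) for a, b in zip(xi, xj)]

def lnnls_update(i, X, K, fitness_fn, it, it_max):
    """Keep the candidate Y_k only if it beats the worse krill of the pair
    (assumption: that krill is the one replaced)."""
    j = nearest_neighbor(i, X)
    y = lnnls_candidate(X[i], K[i], X[j], K[j], it, it_max)
    k_y = fitness_fn(y)
    worse = i if K[i] > K[j] else j
    if k_y < K[worse]:
        X[worse], K[worse] = y, k_y
    return X, K
```

A candidate that lands far from the optimum, as in the opposite-ends case of Figure 6, has a worse fitness than both parents and is simply discarded by the comparison.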

Table 2: Confusion matrix.

| Confusion matrix | True condition positive | True condition negative |
|---|---|---|
| Predicted condition positive | True positive (TP) | False positive (FP) |
| Predicted condition negative | False negative (FN) | True negative (TN) |


Table 3: Benchmark functions in the experiment.

| Benchmark function | Dim | Range | $f_{min}$ |
|---|---|---|---|
| $F_1(x) = \sum_{i=1}^{n}\lvert x_i\rvert + \prod_{i=1}^{n}\lvert x_i\rvert$ | 10 | [−10, 10] | 0 |
| $F_2(x) = -20\exp\left(-0.2\sqrt{\frac{1}{n}\sum_{i=1}^{n}x_i^2}\right) - \exp\left(\frac{1}{n}\sum_{i=1}^{n}\cos(2\pi x_i)\right) + 20 + e$ | 10 | [−32, 32] | 0 |

Figure 4: Schwefel 2.22 function and Ackley function graphs for n = 2 (axes: $x_1$, $x_2$, and function value). (a) Unimodal benchmark function $F_1$ (Schwefel 2.22). (b) Multimodal benchmark function $F_2$ (Ackley).

Table 4: The statistical results of KH and NO-KH algorithms on two benchmark functions.

| f(x) | Algorithm | Best value | Worst value | Mean value | Standard deviation |
|---|---|---|---|---|---|
| $F_1$ | KH | 1.692E−04 | 1.099E−02 | 1.508E−03 | 3.342E−03 |
| $F_1$ | NO-KH | 3.277E−05 | 9.632E−04 | 4.221E−04 | 3.908E−04 |
| $F_2$ | KH | 5.716E−05 | 2.168 | 0.329 | 0.816 |
| $F_2$ | NO-KH | 8.309E−06 | 1.155 | 0.116 | 0.362 |

Figure 5: Optimization of linear nearest neighbor lasso step for krill individuals at the same end of food (legend: the position of food; the position of krill $X_i$; the position of new krill $Y_i$ after LNNLS; the distance between two krill; the length of LNNLS).

Figure 6: Optimization of linear neighboring lasso step for krill individuals at both ends of food (legend: the position of food; the position of krill $X_i$; the position of new krill $Y_i$ after LNNLS; the distance between two krill, $distance_{ij}$; the length of LNNLS).


The pseudocode of the LNNLS-KH algorithm is shown in Algorithm 1.

3.3. Analysis of Time Complexity. In the KH algorithm, each krill individual updates its position after the movement induced by other krill individuals, the foraging activity, and the physical diffusion motion, with a time complexity of $O(N)$. After $I_{max}$ iterations, the time complexity of the algorithm is $O(I_{max} \cdot N)$. In the LNNLS-KH algorithm, the modified fitness function and the nonlinear optimization of physical diffusion motion require hardly any additional calculations, so the time complexity is not changed. In addition, the linear nearest neighbor lasso step optimization adds the calculations of equations (24) and (25) after each krill individual completes its position update during an iteration, with a time complexity of $O(I_{max} \cdot N)$. Therefore, the total time complexity of the LNNLS-KH algorithm is $O(2I_{max} \cdot N)$, which is of the same order as that of the standard KH algorithm.

3.4. Description of the LNNLS-KH Algorithm for IDS Feature Selection. An IDS is a system that recognizes and processes malicious usage of computers and network resources. The intrusion detection dataset records normal and abnormal traffic, including network traffic data and types of network attacks, and provides data support for the research and development of intrusion detection technology. An IDS is generally composed of data acquisition, data preprocessing, detection units, and response actions, as shown in Figure 7.

The LNNLS-KH algorithm is used to select high-quality feature subsets for the IDS. The features of the intrusion detection dataset are randomly initialized to different real numbers in the range [0, 1], which constitute the position vectors of the krill herd. By calculating the fitness function and carrying out the LNNLS-KH algorithm, the position vectors of the krill herd are constantly updated. The fitness function is determined by the number of selected features and the accuracy of classification, so the position vectors of the krill herd move toward the optimal fitness value. According to [47], it is appropriate to set the feature selection threshold to 0.7. When the maximum number of iterations is reached, the components of the krill position vector that are larger than the threshold are selected. The selected features constitute the feature subset of the intrusion detection data. The selected feature subset is then sent to the detection units. Since the k-nearest neighbor (KNN) algorithm is relatively mature in theory, the detection units adopt the KNN algorithm to construct the intrusion detection classifier. Finally, the intrusion detection results are evaluated on the test dataset. The process of the LNNLS-KH algorithm for IDS feature selection is shown in Figure 8.
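The encoding of a feature subset as a real-valued position vector can be sketched as follows. The function names are hypothetical and the returned indices are zero-based, which is a choice made for this example; the threshold of 0.7 is the value quoted above.

```python
import random

def init_position(n_features, rng=random):
    """Random initial position: one real number in [0, 1) per feature."""
    return [rng.random() for _ in range(n_features)]

def select_features(position, threshold=0.7):
    """Indices of the features whose position component exceeds the threshold."""
    return [m for m, value in enumerate(position) if value > threshold]
```

The number of selected features that enters the fitness function is then `len(select_features(position))`, and the classifier is trained only on those columns.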

4. Results and Discussion

To verify the performance of the LNNLS-KH algorithm in IDS feature selection, we adopt the NSL-KDD network intrusion detection dataset and the CICIDS2017 dataset for experiments.

4.1. Datasets Analysis. The NSL-KDD dataset is a classic dataset in the field of anomaly detection. As an improved version of the KDD CUP 99 dataset, it is currently one of the most reliable and influential intrusion detection datasets. Compared with the KDD CUP 99 dataset, the NSL-KDD dataset eliminates duplicate data, so the dataset hardly contains redundant records. Meanwhile, the proportion of each type of record in the NSL-KDD dataset has been adjusted to make it reasonable. Each record in the NSL-KDD dataset includes 41-dimensional features and a classification label. KDDTrain+ and KDDTest+ in the NSL-KDD dataset are selected as the training subset and the test subset. The attacks are divided into four types: denial of service (DoS), scan and probe (Probe), remote to local (R2L), and user to root (U2R). The detailed attack names and the distribution of sample categories are shown in Tables 5 and 6. The features of the NSL-KDD dataset are shown in Table 7.

The NSL-KDD dataset includes four types of features, which are the basic features of TCP connections (9 in total), the content features of TCP connections (13 in total), the time-based network traffic statistics (9 in total), and the host-based network traffic statistics (10 in total). Among all the features, "Protocol_type," "service," and "flag" are character-type features, which need to be preprocessed and mapped to numeric values. Because mixed numeric and character data types are difficult to deal with, one-hot encoding is used to map different characters to different values. For example, the "Protocol_type" feature includes three types of protocol, denoted by icmp = [1, 0, 0], tcp = [0, 1, 0], and udp = [0, 0, 1]. Similarly, the 70 attributes in "service" and the 11 attributes in "flag" are numeralized in the same way. The 41-dimensional feature is expanded to 122 dimensions after one-hot encoding. At the same time, the dataset is normalized to eliminate the influence of features of different orders of magnitude on the calculation results, thus reducing the experimental error. The data preprocessing helps to improve the accuracy of classification and ensure the reliability of the results. The values corresponding to each feature are normalized to the interval [0, 1], and the normalization expression is as follows:

$$X^{*} = \frac{X - X_{min}}{X_{max} - X_{min}}, \tag{26}$$

where $X^{*}$ is the normalized feature value, $X$ is the original feature value, and $X_{max}$ and $X_{min}$ represent the maximum and minimum values of the same feature dimension.
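The two preprocessing steps, one-hot encoding of character features and min-max normalization as in equation (26), can be sketched as follows. The function names are hypothetical, and mapping a constant column to zeros is an assumption made to avoid division by zero.

```python
def one_hot(value, categories):
    """Map a categorical value to a 0/1 vector with a single 1."""
    return [1 if value == c else 0 for c in categories]

def min_max_normalize(column):
    """Equation (26): X* = (X - X_min) / (X_max - X_min), applied per feature."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]   # assumption: constant columns map to 0
    return [(v - lo) / (hi - lo) for v in column]

# Dimension after encoding: 38 numeric features plus 3 + 70 + 11 one-hot columns.
encoded_dim = (41 - 3) + 3 + 70 + 11
```

Here `one_hot("tcp", ["icmp", "tcp", "udp"])` gives `[0, 1, 0]`, and `encoded_dim` evaluates to 122, which agrees with the dimension stated in the text.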

Although NSL-KDD is a benchmark dataset in the field of network intrusion detection, some of the attack types are outdated due to the rapid development of network technology. Therefore, it hardly reflects the current real network environment. CICIDS2017 is a novel network intrusion detection dataset released by the Canadian Institute for Cybersecurity (CIC) in 2017. The dataset collected traffic data for five days, with only normal traffic on Monday and attacks occurring in the morning and afternoon from Tuesday to Friday. It includes "FTP patator," "SSH patator," "DoS GoldenEye," "DoS Slowhttptest," "Dos Slowloris," "Heartbleed," "Web Attack Brute Force," "Web Attack Sql Injection," "Web Attack XSS," "Infiltration Attack," "Bot," "DDoS," and "PortScan," which are common types of attacks in modern networks. The distribution of attack time and types of the CICIDS2017 dataset is shown in Table 8. We use the MachineLearningCVE file in the CICIDS2017 dataset as the dataset, which contains 78 features and an attack type label. The number and name of the features are shown in Table 9. Compared with the NSL-KDD dataset, the attack types in the CICIDS2017 dataset are more in line with the situation of modern networks.

Figure 7: The framework of IDS (data acquisition, data preprocessing, detection units, response actions).

Figure 8: The process of LNNLS-KH algorithm for IDS feature selection (flowchart: input the IDS dataset; initialize parameters N, NV, $I_{max}$, UB, LB; initialize the krill herd position; calculate the fitness of individuals; calculate the three actions; genetic operator; update the position and fitness values of individuals; find the nearest krill and calculate the linear lasso step; compare the fitness of $Y_k$ with that of $X_i$ or $X_j$ and keep the better position; update $X_{gb}$ and $K_{gb}$ of the global optimal individuals; repeat until the iteration count exceeds $I_{max}$; output the optimal solution and the number of selected features; KNN algorithm for intrusion detection; evaluate intrusion detection results).

Algorithm 1: The LNNLS-KH algorithm.

Input: Training set
Output: Global best solution, the number of selected features, and feature selection time

Begin
  Initialize algorithm parameters $N^{max}$, $V_f$, $D^{max}$, NV, $I_{max}$, UB, LB
  Initialize the krill herd position
  Evaluate the fitness of krill individuals and find the individuals with the best and worst fitness values
  for I = 1 to $I_{max}$ do
    for each krill individual i (i = 1, 2, …, m) do
      Calculate the three components of motion:
        (1) The motion induced by other krill individuals
        (2) The foraging activity
        (3) The nonlinear optimized physical diffusion
      Implement crossover operator
      Update krill herd position and fitness values
      Calculate the linear nearest neighbor lasso step and new position using equations (24) and (25), and update new fitness values
      if $K_{yk} > K_i$ or ($K_j$)
        Leave $K_i$ or ($K_j$) and delete $K_{yk}$
      else
        Leave $K_{yk}$ and delete $K_i$ or ($K_j$)
      end if
    end for
    Update $X_{gb}$ and $K_{gb}$ of the globally optimal individuals
  end for
  Output the global best solution, the number of selected features, and feature selection time
End

Table 5: The distribution of sample categories.

| Attack types | Attack names |
|---|---|
| DoS | Neptune, back, land, pod, smurf, teardrop, mailbomb, Apache2, processtable, udpstorm, worm |
| Probe | Ipsweep, nmap, portsweep, Satan, mscan, saint |
| R2L | ftp_write, guess_passwd, imap, multihop, phf, spy, warezclient, warezmaster, sendmail, named, snmpgetattack, snmpguess, xlock, xsnoop, httptunnel |
| U2R | buffer_overflow, loadmodule, perl, rootkit, ps, sqlattack, xterm |

Table 6: The distribution of sample categories.

| Data category | KDDTrain+ samples | KDDTest+ samples | Total number of samples |
|---|---|---|---|
| Normal | 65120 | 11536 | 76656 |
| DoS | 36944 | 6251 | 43195 |
| Probe | 10786 | 2421 | 13207 |
| R2L | 995 | 2653 | 3648 |
| U2R | 52 | 67 | 119 |
| All | 113897 | 22928 | 136825 |

4.2. Experimental Results and Discussion of NSL-KDD Dataset. The experiment is conducted in MATLAB R2016a on a Windows 64-bit operating system with an Intel(R) Core(TM) i7-4790 processor. Since the training of the algorithm requires normal and abnormal samples, we mix normal samples and different types of attack samples to construct training sets and test sets for the four attack types. In order to reduce the time of searching for the optimal feature subset, we randomly select 50% of the Probe attack samples, 10% of the DoS attack samples, 100% of the U2R attack samples, and 100% of the R2L attack samples in the KDDTrain+ dataset as the training dataset, and 100% of the Probe dataset, 50% of the DoS dataset, 100% of the U2R dataset, and 20% of the R2L dataset in the KDDTest+ dataset as the test dataset.

For the LNNLS-KH algorithm, the maximum number of iterations $I_{max}$ and the number of krill individuals $N$ are set to 100 and 30, respectively. Following [41], the foraging speed of krill individuals $V_f$ is set to 0.02, the maximum random diffusion rate $D^{max}$ is set to 0.05, and the maximum induction speed $N^{max}$ is set to 0.01. Following [47], the threshold $\theta$ is set to 0.7. As the LNNLS-KH algorithm is designed first to ensure high accuracy and then to reduce the number of features, the weight factor $\alpha$ in the fitness function is set to 0.02.

$$FPR = \frac{FP}{TN + FP}, \tag{27}$$

$$DR = \frac{TP}{TP + FN}. \tag{28}$$
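The false positive rate and detection rate of equations (27) and (28) can be computed from the confusion matrix counts of Table 2. The sketch below uses hypothetical function names and treats label 1 as "attack" and label 0 as "normal," which is an assumption made for the example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp, tn, fp, fn

def false_positive_rate(fp, tn):
    """Equation (27): normal samples wrongly flagged, over all normal samples."""
    return fp / (tn + fp)

def detection_rate(tp, fn):
    """Equation (28): attacks correctly detected, over all attacks (recall)."""
    return tp / (tp + fn)
```

Both rates are undefined when their denominator is zero, for example when a test set contains no normal samples, so such cases need to be handled by the caller.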

We adopt the iterative curve of the global optimal fitness value, feature selection time, test set detection time, data dimension after feature selection, classification accuracy, detection rate (DR), and false positive rate (FPR) as evaluation measures of feature selection for IDS. The accuracy represents the ratio of correctly classified samples to the total number of samples, as defined in equation (19). FPR, also known as false alarm rate (FAR), represents the ratio of samples that are incorrectly detected as intrusions to all normal samples, as shown in equation (27). DR, also known as recall or sensitivity, represents the probability of being correctly detected among all abnormalities, as shown in equation (28). The crossover-mutation PSO (CMPSO) algorithm [47], ACO algorithm [48], KH algorithm [41], and IKH algorithm [9] are used for comparative experiments. The experimental results on the Probe, DoS, R2L, and U2R datasets are shown as follows.

Table 7: The features of NSL-KDD dataset.

| Classification of features | Number | Serial number and name of features |
|---|---|---|
| The basic characteristics of TCP connections | 9 | (1) duration, (2) protocol_type, (3) service, (4) flag, (5) src_bytes, (6) dst_bytes, (7) land, (8) wrong_fragment, (9) urgent |
| The content characteristics of a TCP connection | 13 | (10) hot, (11) num_failed_logins, (12) logged_in, (13) num_compromised, (14) root_shell, (15) num_root, (16) su_attempted, (17) num_file_creations, (18) num_shells, (19) num_access_files, (20) num_outbound_cmds, (21) is_host_login, (22) is_guest_login |
| Time-based statistical characteristics of network traffic | 9 | (23) count, (24) srv_count, (25) serror_rate, (26) srv_serror_rate, (27) rerror_rate, (28) srv_rerror_rate, (29) same_srv_rate, (30) diff_srv_rate, (31) srv_diff_host_rate |
| Host-based network traffic statistics | 10 | (32) dst_host_count, (33) dst_host_srv_count, (34) dst_host_same_srv_rate, (35) dst_host_diff_srv_rate, (36) dst_host_same_src_port_rate, (37) dst_host_srv_diff_host_rate, (38) dst_host_serror_rate, (39) dst_host_srv_serror_rate, (40) dst_host_rerror_rate, (41) dst_host_srv_rerror_rate |

Table 8: Attack time and attack types of the CICIDS2017 dataset.

| Time | Type | Label | Amount | Total |
|---|---|---|---|---|
| Monday | Normal | BENIGN | 529918 | 529918 |
| Tuesday | Normal | BENIGN | 432074 | 445909 |
| | Brute force | FTP patator | 7938 | |
| | | SSH patator | 5897 | |
| Wednesday | Normal | BENIGN | 440031 | 692703 |
| | DoS | DoS GoldenEye | 10293 | |
| | | DoS slowhttptest | 5499 | |
| | | Dos slowloris | 5796 | |
| | | Heartbleed | 11 | |
| Thursday morning | Normal | BENIGN | 168186 | 170366 |
| | Web attack | Web attack brute force | 1507 | |
| | | Web attack sql injection | 21 | |
| | | Web attack XSS | 652 | |
| Thursday afternoon | Normal | BENIGN | 288566 | 288602 |
| | Infiltration | Infiltration | 36 | |
| Friday morning | Normal | BENIGN | 189067 | 191033 |
| | Botnet | Bot | 1966 | |
| Friday afternoon (1) | Normal | BENIGN | 97718 | 225745 |
| | DDoS | DDoS | 128027 | |
| Friday afternoon (2) | Normal | BENIGN | 127537 | 286467 |
| | PortScan | PortScan | 158930 | |

To reflect the performance of the LNNLS-KH algorithm intuitively, the convergence curves of the fitness function for the Probe, DoS, U2R, and R2L datasets are shown in Figure 9. The results show that the LNNLS-KH algorithm achieves a good fitness function value when the number of iterations reaches about 20, which demonstrates its strong exploitation ability and good convergence performance. As the number of iterations increases, the other algorithms show varying degrees of convergence stagnation, while the LNNLS-KH algorithm repeatedly jumps out of local optima and finds the global optimal solution with better fitness. The fitness function values after 100 iterations reach 0.0328, 0.0393, 0.0292, and 0.0036, respectively, for the four attack datasets, showing excellent exploration ability. Therefore, compared with the CMPSO, ACO, KH, and IKH algorithms, the LNNLS-KH algorithm exhibits faster convergence speed and stronger abilities of exploitation and exploration.

The results of the different feature selection algorithms are shown in Table 10. The bold number in front of the brackets indicates the quantity of features after feature selection, and the specific feature numbers are listed in the brackets. The comparison of feature selection dimensions is shown in Figure 10, where different colours are used to distinguish the five algorithms. The proposed LNNLS-KH algorithm, marked in red, is in the innermost circle of Figure 10 for the Probe, DoS, U2R, and R2L datasets. This indicates that, compared with the other four feature selection algorithms, the LNNLS-KH algorithm retains the fewest features while ensuring accuracy. According to Figure 10, the LNNLS-KH algorithm selects 7 main features of the NSL-KDD dataset on average, accounting for 17.07% of the total number of features. Compared with the CMPSO, ACO, KH, and IKH algorithms, the proposed LNNLS-KH algorithm reduces the features by 44%, 42.86%, 34.88%, and 24.32%, respectively, on the datasets of the four attack types. Meanwhile, the total number of features in the four types of attack datasets is reduced by 37.43%.

Figure 9: Convergence curve of fitness functions for the four attack datasets (panels: DoS, Probe, R2L, U2R; x-axis: number of iterations, 0 to 100; y-axis: fitness function; curves: CMPSO, ACO, KH, IKH, LNNLS-KH).
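The percentages in this paragraph follow from simple ratios, which the short sketch below reproduces. The average of 7 selected features out of 41 is taken from the text; the baseline count of 12.5 features passed to `reduction` in the usage note is a hypothetical value chosen only to illustrate the formula, not a figure reported in this section.

```python
def reduction(n_other, n_ours):
    """Relative reduction in feature count versus another algorithm."""
    return (n_other - n_ours) / n_other

selected, total = 7, 41
share = 100.0 * selected / total   # share of features retained, in percent
```

Here `round(share, 2)` gives 17.07, and `reduction(12.5, 7)` gives 0.44, i.e., a 44% reduction for the hypothetical baseline.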

To further evaluate the performance of the feature selection algorithms, we show the feature selection time and detection time of the five algorithms in Table 11. Feature selection time represents the time spent filtering out redundant features. The detection time represents the time from inputting the most representative feature subset into the KNN classifier to the end of detection. It can be seen from Table 11 that the feature selection time of the standard KH algorithm is shorter than that of the CMPSO algorithm and the ACO algorithm, which indicates that the KH algorithm achieves faster speed and better performance. In addition, compared with the standard KH algorithm, the feature selection time of the LNNLS-KH algorithm is longer, which is mainly due to the nonlinear optimization of physical diffusion motion and the linear nearest neighbor lasso step optimization performed after the krill herd position is updated. Although the feature selection time increases somewhat, the convergence speed and global search ability are greatly improved. At the same time, the LNNLS-KH algorithm removes redundant features, which considerably increases the detection speed. In comparison with the other four feature selection algorithms, the detection time of the LNNLS-KH algorithm is reduced by 16.83%, 16.91%, 8.94%, and 6.96% on average on the test dataset samples of Probe, DoS, R2L, and U2R.

The feature subsets selected by the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are then used to classify the test dataset with the KNN classifier. The classification accuracy of the different algorithms is shown in Table 12. Comparing the results, we find that the LNNLS-KH feature selection algorithm achieves a classification accuracy above 90% for the Probe, DoS, U2R, and R2L test

Table 9: The number and name of the features in the CICIDS2017 dataset.

| Feature number | Feature name | Feature number | Feature name | Feature number | Feature name |
|---|---|---|---|---|---|
| 1 | Destination port | 27 | Bwd IAT mean | 53 | Average packet size |
| 2 | Flow duration | 28 | Bwd IAT std | 54 | Avg fwd segment size |
| 3 | Total fwd packets | 29 | Bwd IAT max | 55 | Avg bwd segment size |
| 4 | Total backward packets | 30 | Bwd IAT min | 56 | Fwd header length |
| 5 | Total length of fwd packets | 31 | Fwd PSH flags | 57 | Fwd avg bytes/bulk |
| 6 | Total length of bwd packets | 32 | Bwd PSH flags | 58 | Fwd avg packets/bulk |
| 7 | Fwd packet length max | 33 | Fwd URG flags | 59 | Fwd avg bulk rate |
| 8 | Fwd packet length min | 34 | Bwd URG flags | 60 | Bwd avg bytes/bulk |
| 9 | Fwd packet length mean | 35 | Fwd header length | 61 | Bwd avg packets/bulk |
| 10 | Fwd packet length std | 36 | Bwd header length | 62 | Bwd avg bulk rate |
| 11 | Bwd packet length max | 37 | Fwd packets/s | 63 | Subflow fwd packets |
| 12 | Bwd packet length min | 38 | Bwd packets/s | 64 | Subflow fwd bytes |
| 13 | Bwd packet length mean | 39 | Min packet length | 65 | Subflow bwd packets |
| 14 | Bwd packet length std | 40 | Max packet length | 66 | Subflow bwd bytes |
| 15 | Flow bytes/s | 41 | Packet length mean | 67 | Init_Win_bytes_forward |
| 16 | Flow packets/s | 42 | Packet length std | 68 | Init_Win_bytes_backward |
| 17 | Flow IAT mean | 43 | Packet length variance | 69 | act_data_pkt_fwd |
| 18 | Flow IAT std | 44 | FIN flag count | 70 | min_seg_size_forward |
| 19 | Flow IAT max | 45 | SYN flag count | 71 | Active mean |
| 20 | Flow IAT min | 46 | RST flag count | 72 | Active std |
| 21 | Fwd IAT total | 47 | PSH flag count | 73 | Active max |
| 22 | Fwd IAT mean | 48 | ACK flag count | 74 | Active min |
| 23 | Fwd IAT std | 49 | URG flag count | 75 | Idle mean |
| 24 | Fwd IAT max | 50 | CWE flag count | 76 | Idle std |
| 25 | Fwd IAT min | 51 | ECE flag count | 77 | Idle max |
| 26 | Bwd IAT total | 52 | Down/up ratio | 78 | Idle min |

Figure 10: Comparison of feature selection dimensions produced by different algorithms (categories: Probe, DoS, U2R, R2L; legend: CMPSO, ACO, KH, IKH, LNNLS-KH).


dataset samples. Furthermore, the LNNLS-KH algorithm improves the average classification accuracy on the Probe, DoS, U2R, and R2L test dataset samples by 9.95%, 12.04%, 9.47%, and 8.66%, respectively.
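These accuracy gains can be checked against Table 12. The sketch below assumes that, for each attack type, the gain is the LNNLS-KH accuracy minus the mean accuracy of the four comparison algorithms, expressed in percentage points; the function name is illustrative.

```python
# Classification accuracy (%) per attack type, read from Table 12.
accuracy = {
    "Probe": {"CMPSO": 80.46, "ACO": 86.56, "KH": 92.42, "IKH": 93.74, "LNNLS-KH": 98.24},
    "DoS":   {"CMPSO": 81.74, "ACO": 83.36, "KH": 86.03, "IKH": 88.74, "LNNLS-KH": 97.01},
    "U2R":   {"CMPSO": 82.74, "ACO": 84.57, "KH": 85.59, "IKH": 91.89, "LNNLS-KH": 95.67},
    "R2L":   {"CMPSO": 78.70, "ACO": 81.62, "KH": 88.78, "IKH": 90.49, "LNNLS-KH": 93.56},
}


def gain_over_baselines(row, proposed="LNNLS-KH"):
    """Accuracy of the proposed method minus the mean accuracy of all other methods."""
    others = [value for name, value in row.items() if name != proposed]
    return row[proposed] - sum(others) / len(others)


# One gain per attack type, in percentage points.
gains = {attack: gain_over_baselines(row) for attack, row in accuracy.items()}

# Mean gain over the four attack types.
mean_gain = sum(gains.values()) / len(gains)

for attack, gain in gains.items():
    print(f"{attack}: +{gain:.2f} points")
print(f"mean: +{mean_gain:.2f} points")
```

The per-type gains come out at roughly 9.95, 12.04, 9.47, and 8.66 points, and their mean is about 10.03 points, the NSL-KDD accuracy improvement quoted in the abstract.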

Table 13 shows the false positive rate and detection rate of the feature subsets produced by the different feature selection algorithms. To visualize the difference, we show the comparison in Figure 11. For the Probe, DoS, U2R, and R2L datasets, the average false positive rate of the LNNLS-KH feature selection algorithm is 4.00%, which is 20.70%, 15.30%, 8.88%, and 3.34% lower than that of the CMPSO, ACO, KH, and IKH algorithms, respectively. The proposed LNNLS-KH feature selection algorithm also performs well in terms of detection rate. The average detection rate of

Table 10: The feature selection results of different feature selection algorithms (NSL-KDD dataset).

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Probe | 14 (2, 3, 4, 7, 8, 10, 11, 17, 19, 20, 21, 27, 30, 33) | 15 (1, 3, 4, 6, 15, 16, 17, 19, 21, 23, 29, 35, 39, 40, 41) | 13 (3, 4, 5, 7, 8, 13, 14, 18, 19, 21, 26, 28, 40) | 11 (2, 3, 5, 8, 10, 17, 18, 29, 34, 35, 41) | 8 (3, 4, 8, 11, 15, 29, 34, 40) |
| DoS | 16 (3, 4, 5, 6, 8, 13, 14, 17, 18, 22, 23, 26, 30, 32, 35, 41) | 16 (3, 4, 7, 12, 14, 19, 20, 25, 27, 28, 30, 33, 34, 37, 40, 41) | 12 (2, 3, 4, 5, 8, 9, 12, 15, 19, 24, 26, 30) | 12 (2, 3, 4, 6, 12, 18, 20, 22, 27, 28, 30, 31) | 10 (3, 4, 6, 15, 17, 19, 20, 21, 30, 37) |
| U2R | 9 (3, 4, 5, 9, 12, 19, 32, 33, 41) | 8 (3, 4, 6, 8, 20, 24, 33, 36) | 8 (3, 4, 10, 12, 19, 23, 31, 32) | 6 (3, 10, 11, 21, 36, 39) | 3 (3, 33, 36) |
| R2L | 11 (2, 3, 4, 8, 21, 22, 25, 27, 37, 40, 41) | 10 (3, 4, 7, 12, 17, 21, 29, 37, 38, 40) | 10 (2, 3, 4, 6, 13, 18, 19, 22, 32, 41) | 8 (3, 4, 5, 8, 11, 14, 21, 31) | 7 (2, 3, 4, 10, 15, 21, 36) |

Table 11: Feature selection time and detection time of different feature selection algorithms (NSL-KDD dataset).

| Data categories | Time of feature selection (second) | | | | | Time of detection (second) | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | CMPSO | ACO | KH | IKH | LNNLS-KH | CMPSO | ACO | KH | IKH | LNNLS-KH |
| Probe | 5231.78 | 4998.14 | 4745.33 | 5348.87 | 5490.48 | 37.13 | 38.23 | 35.30 | 34.05 | 31.06 |
| DoS | 7892.35 | 7630.86 | 7168.52 | 8038.16 | 8296.92 | 118.69 | 118.15 | 106.66 | 105.14 | 98.44 |
| U2R | 154.87 | 147.29 | 144.18 | 157.79 | 172.24 | 0.087 | 0.086 | 0.086 | 0.086 | 0.078 |
| R2L | 2556.75 | 2369.08 | 2240.92 | 2669.51 | 2727.70 | 9.55 | 9.13 | 9.07 | 8.62 | 8.03 |

Table 12: The classification accuracy of different feature selection algorithms (NSL-KDD dataset).

| Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%) |
|---|---|---|---|---|---|
| Probe | 80.46 | 86.56 | 92.42 | 93.74 | 98.24 |
| DoS | 81.74 | 83.36 | 86.03 | 88.74 | 97.01 |
| U2R | 82.74 | 84.57 | 85.59 | 91.89 | 95.67 |
| R2L | 78.70 | 81.62 | 88.78 | 90.49 | 93.56 |

Figure 11: Comparison of classification FPR and DR of different feature selection algorithms on the Probe, DoS, U2R, and R2L datasets. (a) FPR (%) of different feature selection algorithms. (b) DR (%) of different feature selection algorithms.


the LNNLS-KH algorithm is 96.48%, which is 13.47%, 9.32%, 7.02%, and 4.72% higher than that of the CMPSO, ACO, KH, and IKH feature selection algorithms, respectively.
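Unlike the per-attack-type accuracy gains, the FPR and DR comparisons are averaged over the four attack types first and then compared algorithm by algorithm. The following sketch, with illustrative names and values read from Table 13, shows that convention; the differences are in percentage points.

```python
# FPR and DR (%) in the order Probe, DoS, U2R, R2L, read from Table 13.
fpr = {
    "CMPSO":    [22.37, 21.27, 24.51, 30.66],
    "ACO":      [18.04, 14.08, 21.04, 24.05],
    "KH":       [8.50, 11.45, 16.13, 15.42],
    "IKH":      [4.05, 7.88, 8.45, 8.99],
    "LNNLS-KH": [1.18, 2.85, 4.30, 7.67],
}
dr = {
    "CMPSO":    [82.32, 79.12, 87.02, 83.56],
    "ACO":      [89.18, 82.08, 89.79, 87.56],
    "KH":       [95.01, 83.77, 90.14, 88.91],
    "IKH":      [95.22, 85.23, 93.67, 92.89],
    "LNNLS-KH": [97.73, 96.80, 95.52, 95.85],
}


def mean(values):
    return sum(values) / len(values)


# Average each algorithm over the four attack types.
mean_fpr = {name: mean(values) for name, values in fpr.items()}
mean_dr = {name: mean(values) for name, values in dr.items()}

# Compare every baseline with LNNLS-KH on the averaged metrics.
baselines = [name for name in fpr if name != "LNNLS-KH"]
fpr_drop = {name: mean_fpr[name] - mean_fpr["LNNLS-KH"] for name in baselines}
dr_gain = {name: mean_dr["LNNLS-KH"] - mean_dr[name] for name in baselines}

for name in baselines:
    print(f"{name}: FPR -{fpr_drop[name]:.2f}, DR +{dr_gain[name]:.2f}")
```

With the Table 13 values, the averaged FPR of LNNLS-KH is about 4.00% and its averaged DR about 96.48%, and the four baseline differences agree with the figures quoted in the text to within rounding.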

In conclusion, the LNNLS-KH feature selection algorithm performs well in terms of the global optimal fitness iteration curve, test set detection time, number of dimensions of the feature subset, classification accuracy, false positive rate, and detection rate. Although the offline training time of the LNNLS-KH algorithm is longer than that of the CMPSO, ACO, KH, and IKH algorithms, its lower feature dimension reduces the detection time. Moreover, the algorithm has faster convergence speed, higher detection accuracy and detection rate, and a lower classification false positive rate.

4.3. Experimental Results and Discussion of CICIDS2017 Dataset. The experiment is conducted in MATLAB R2016a on a 64-bit Windows operating system with an Intel(R) Core(TM) i7-4790 processor. The MachineLearningCVE file in the CICIDS2017 dataset includes 8 CSV files of all traffic data, which, after some duplicate features are removed, contain 78 features plus an attack-type tag. We annotate traffic records according to the different attack periods and types, and we standardize and normalize the dataset. Because the analyzed CSV files contain an excessive amount of data, training the model on the host would take too long and converge slowly. Therefore, we simplified and reintegrated these CSV data files while preserving the original attack timing features. We selected a total of 12090 records and 5 types of traffic, comprising 1 type of normal traffic and 4 types of attack traffic: “DoS”, “DDoS”, “PortScan”, and “WebAttack”. The data are randomly divided into training sets and test sets in a 2:1 ratio, and the experiments are independently repeated.
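The preprocessing described here can be sketched as follows. This is only an illustration under stated assumptions: the paper reports using MATLAB and does not specify its normalization formula, so min-max scaling, the helper names, and the synthetic records below are all hypothetical stand-ins.

```python
import random


def min_max_normalize(rows):
    """Scale every column of a list of numeric rows to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return [
        [(value - lo) / (hi - lo) if hi > lo else 0.0  # constant columns map to 0
         for value, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def split_two_to_one(records, seed):
    """Shuffle the records and split them into training and test sets at a 2:1 ratio."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]


# Synthetic stand-in for the 12090 selected records (three toy features each).
rng = random.Random(0)
records = [[rng.uniform(0, 1000), rng.uniform(-5, 5), 7.0] for _ in range(12090)]

normalized = min_max_normalize(records)
train, test = split_two_to_one(normalized, seed=1)
print(len(train), len(test))
```

Repeating the split with different seeds corresponds to the independent repeated experiments mentioned in the text; with 12090 records, a 2:1 split gives 8060 training and 4030 test records.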

The CMPSO, ACO, KH, and IKH algorithms are used for comparison with the LNNLS-KH algorithm. The preprocessed Normal, DoS, DDoS, PortScan, and WebAttack subsets are input into the algorithm model successively, and the dimension and feature subsets of feature selection are obtained. We adopt the KNN classification model as the classifier and obtain the accuracy of intrusion detection from the test set data. The feature selection results for the CICIDS2017 dataset are shown in Table 14. The LNNLS-KH algorithm selects different features for different attack types. For example, the selected features of the DoS subset are “Total Length of Bwd Packets”, “Fwd Packet Length Min”, “Flow IAT Min”, “FIN Flag Count”, “RST Flag Count”, “URG Flag Count”, “Bwd Avg Packets/Bulk”, “Idle Mean”, and “Idle Std”. For the WebAttack subset, “Total Fwd Packets”, “Bwd IAT Max”, “Bwd PSH Flags”, “Fwd Packets/s”, “Bwd Avg Packets/Bulk”, “Subflow Fwd Bytes”, “Active Max”, and “Idle Max” are selected as attack features by the LNNLS-KH algorithm. It reduces the feature dimension of the IDS dataset while ensuring high accuracy. The average feature dimension selected by the LNNLS-KH algorithm is 10.2, accounting for 13.08% of the total number of features in the CICIDS2017 dataset. It decreases the

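Table 13: The classification FPR and DR of different feature selection algorithms (NSL-KDD dataset).

| Data categories | False positive rate (FPR) (%) | | | | | Detection rate (DR) (%) | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | CMPSO | ACO | KH | IKH | LNNLS-KH | CMPSO | ACO | KH | IKH | LNNLS-KH |
| Probe | 22.37 | 18.04 | 8.50 | 4.05 | 1.18 | 82.32 | 89.18 | 95.01 | 95.22 | 97.73 |
| DoS | 21.27 | 14.08 | 11.45 | 7.88 | 2.85 | 79.12 | 82.08 | 83.77 | 85.23 | 96.80 |
| U2R | 24.51 | 21.04 | 16.13 | 8.45 | 4.30 | 87.02 | 89.79 | 90.14 | 93.67 | 95.52 |
| R2L | 30.66 | 24.05 | 15.42 | 8.99 | 7.67 | 83.56 | 87.56 | 88.91 | 92.89 | 95.85 |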

Figure 12: Comparison of feature selection time and intrusion detection time for different feature selection algorithms on the Normal, DoS, DDoS, PortScan, and WebAttack subsets. (a) Feature selection time (seconds) for different feature selection algorithms. (b) Intrusion detection time (seconds) of different feature selection algorithms.


number of features by 57.85%, 52.34%, 27.14%, and 25%, respectively, compared with the CMPSO, ACO, KH, and IKH algorithms.
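The feature names listed for each subset follow from looking up the indices of Table 14 in Table 9. A minimal sketch of that lookup, together with the average subset size, is given below; the dictionary holds only the Table 9 entries needed for the two subsets discussed, and all identifiers are illustrative.

```python
# Part of the Table 9 index-to-name mapping (only the entries used below).
FEATURE_NAMES = {
    3: "Total fwd packets", 6: "Total length of bwd packets",
    8: "Fwd packet length min", 20: "Flow IAT min", 29: "Bwd IAT max",
    32: "Bwd PSH flags", 37: "Fwd packets/s", 44: "FIN flag count",
    46: "RST flag count", 49: "URG flag count", 61: "Bwd avg packets/bulk",
    64: "Subflow fwd bytes", 73: "Active max", 75: "Idle mean",
    76: "Idle std", 77: "Idle max",
}

# LNNLS-KH selections for two subsets, as feature indices from Table 14.
lnnls_kh_indices = {
    "DoS": [6, 8, 20, 44, 46, 49, 61, 75, 76],
    "WebAttack": [3, 29, 32, 37, 61, 64, 73, 77],
}

# Subset sizes of LNNLS-KH for Normal, DoS, DDoS, PortScan, WebAttack (Table 14).
subset_sizes = [8, 9, 14, 12, 8]
TOTAL_FEATURES = 78  # features in the preprocessed CICIDS2017 data

# Translate indices into feature names.
selected_names = {
    subset: [FEATURE_NAMES[index] for index in indices]
    for subset, indices in lnnls_kh_indices.items()
}

mean_size = sum(subset_sizes) / len(subset_sizes)
share_of_all = 100.0 * mean_size / TOTAL_FEATURES

print(selected_names["DoS"])
print(mean_size, round(share_of_all, 2))
```

The average of the five subset sizes is 10.2, which is 13.08% of the 78 features.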

Figure 12 shows the feature selection time and intrusion detection time of the 5 feature selection algorithms to further evaluate their performance. Figure 12(a) shows that, in the feature selection stage, the LNNLS-KH algorithm takes a long time to find the optimal feature subset because of the linear nearest neighbor lasso step optimization performed after the position update of the krill herd. Compared with the KH and IKH algorithms, it increases the time by an average of 14.38% and 9.32%, respectively. Although the LNNLS-KH algorithm requires more calculation time, the convergence speed and global search ability are improved. Figure 12(b) shows the intrusion detection time of the 5 feature selection algorithms. This is the time the KNN classifier takes to detect the sample dataset after the feature subset has been found, excluding the time spent searching for the optimal feature subset. The feature dimension of the LNNLS-KH algorithm is low, so the amount of data processed when classifying the detection sample dataset is small, which results in a shorter classification detection time. Compared with the CMPSO, ACO, KH, and IKH algorithms, the intrusion detection time of the LNNLS-KH algorithm is reduced by 6.52%, 5.17%, 2.14%, and 2.28% on average, respectively.

The feature subsets selected by the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are then used to classify the test dataset with the KNN classifier. The classification accuracy of the different algorithms is shown in Table 15. Over the five types of subsets, the average classification accuracy of the proposed LNNLS-KH algorithm is 95.86%. In particular, the classification accuracy reaches 97.55% for the PortScan subset. Compared with the other four feature selection methods, the LNNLS-KH algorithm has an average increase of 3.11%, 8.52%, 8.58%, 2.45%, and 4.29% on the

Table 14: The number of feature selection for different algorithms (CICIDS2017 dataset).

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Normal | 28 (3, 7, 13, 15, 16, 17, 20, 22, 24, 26, 30, 35, 37, 38, 42, 43, 44, 45, 46, 49, 50, 56, 59, 62, 63, 64, 65, 76) | 25 (1, 3, 4, 7, 10, 11, 12, 13, 15, 19, 29, 32, 34, 35, 37, 43, 46, 47, 51, 55, 56, 58, 73, 76, 78) | 14 (11, 19, 33, 39, 43, 49, 55, 56, 58, 65, 66, 68, 71, 73) | 14 (5, 10, 19, 20, 21, 23, 27, 33, 43, 56, 69, 70, 73, 78) | 8 (6, 12, 16, 32, 38, 50, 54, 73) |
| DoS | 24 (1, 3, 4, 13, 16, 17, 24, 26, 30, 33, 35, 39, 40, 44, 48, 51, 53, 57, 58, 59, 60, 62, 67, 70) | 19 (3, 6, 12, 13, 15, 26, 35, 39, 51, 55, 60, 61, 66, 69, 71, 73, 75, 77, 78) | 13 (8, 16, 21, 30, 45, 50, 52, 57, 59, 63, 66, 67) | 14 (2, 12, 15, 16, 19, 21, 32, 34, 44, 46, 65, 68, 76, 77) | 9 (6, 8, 20, 44, 46, 49, 61, 75, 76) |
| DDoS | 29 (15, 18, 19, 20, 23, 25, 26, 33, 34, 35, 38, 39, 42, 43, 46, 47, 49, 51, 55, 56, 57, 59, 60, 61, 62, 63, 71, 72, 78) | 27 (6, 9, 10, 13, 16, 19, 24, 28, 31, 41, 42, 45, 47, 48, 50, 51, 52, 53, 54, 56, 59, 60, 61, 62, 65, 68, 72) | 21 (10, 12, 13, 15, 18, 23, 27, 30, 34, 35, 41, 42, 45, 55, 61, 63, 65, 66, 68, 70, 76) | 18 (1, 11, 13, 14, 19, 24, 32, 35, 36, 40, 42, 47, 51, 57, 60, 69, 70, 75) | 14 (2, 5, 8, 9, 11, 22, 26, 33, 41, 43, 47, 51, 74, 77) |
| PortScan | 24 (1, 3, 6, 15, 16, 28, 30, 33, 35, 37, 44, 45, 52, 56, 59, 60, 61, 63, 65, 68, 70, 75, 77, 78) | 21 (1, 2, 6, 10, 15, 17, 26, 27, 29, 39, 42, 43, 46, 49, 58, 61, 66, 69, 70, 71, 76) | 14 (15, 20, 22, 27, 37, 44, 49, 50, 53, 59, 62, 65, 67, 78) | 15 (1, 24, 30, 32, 33, 43, 49, 53, 54, 58, 60, 61, 63, 64, 69) | 12 (2, 6, 15, 24, 25, 28, 32, 57, 59, 63, 66, 76) |
| WebAttack | 16 (2, 7, 26, 29, 45, 47, 50, 52, 53, 54, 63, 66, 68, 69, 72, 78) | 15 (3, 9, 10, 12, 19, 26, 40, 46, 50, 54, 64, 65, 68, 69, 73) | 8 (1, 17, 19, 36, 48, 49, 53, 60) | 7 (14, 17, 35, 39, 44, 48, 54) | 8 (3, 29, 32, 37, 61, 64, 73, 77) |

Table 15: The classification accuracy of different feature selection algorithms (CICIDS2017 dataset).

| Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%) |
|---|---|---|---|---|---|
| Normal | 89.78 | 89.06 | 92.70 | 94.58 | 94.64 |
| DoS | 77.03 | 82.69 | 90.90 | 93.34 | 94.51 |
| DDoS | 81.73 | 86.94 | 91.85 | 88.19 | 95.76 |
| PortScan | 92.38 | 95.64 | 95.05 | 97.35 | 97.55 |
| WebAttack | 89.12 | 93.08 | 93.77 | 94.26 | 96.85 |

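Table 16: The classification FPR and DR of different feature selection algorithms (CICIDS2017 dataset).

| Data categories | False positive rate (FPR) (%) | | | | | Detection rate (DR) (%) | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | CMPSO | ACO | KH | IKH | LNNLS-KH | CMPSO | ACO | KH | IKH | LNNLS-KH |
| Normal | 9.25 | 8.72 | 6.41 | 4.93 | 3.67 | 88.05 | 88.51 | 89.25 | 92.46 | 93.89 |
| DoS | 5.41 | 4.48 | 4.06 | 2.83 | 1.94 | 72.57 | 82.89 | 87.86 | 92.56 | 92.64 |
| DDoS | 6.85 | 4.92 | 4.54 | 6.33 | 3.18 | 79.03 | 83.47 | 90.22 | 87.52 | 92.98 |
| PortScan | 4.65 | 3.02 | 2.84 | 1.86 | 1.16 | 88.25 | 93.80 | 94.33 | 95.14 | 95.42 |
| WebAttack | 5.33 | 3.16 | 2.52 | 2.11 | 1.60 | 87.40 | 91.35 | 92.19 | 92.94 | 94.77 |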


Normal, DoS, DDoS, PortScan, and WebAttack subsets, respectively. Table 16 shows the classification FPR and DR of the different feature selection algorithms on the test sets. On all five test sets, the LNNLS-KH algorithm has a lower FPR and a higher DR than the other four algorithms.
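For readers who want to recompute such figures from raw classifier output, the sketch below derives accuracy, DR, and FPR from confusion-matrix counts. It assumes the standard definitions (DR as the true positive rate, FPR as the share of normal records flagged as attacks); the counts are hypothetical and are not taken from the paper's experiments.

```python
def detection_metrics(tp, fp, tn, fn):
    """Return accuracy, detection rate, and false positive rate from confusion counts.

    tp: attacks classified as attacks      fn: attacks classified as normal
    fp: normal traffic flagged as attack   tn: normal traffic classified as normal
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    detection_rate = tp / (tp + fn)
    false_positive_rate = fp / (fp + tn)
    return accuracy, detection_rate, false_positive_rate


# Hypothetical counts for a test set with 1000 attack and 1000 normal records.
acc, dr_value, fpr_value = detection_metrics(tp=930, fp=30, tn=970, fn=70)
print(f"accuracy={acc:.2%}, DR={dr_value:.2%}, FPR={fpr_value:.2%}")
```

A lower FPR together with a higher DR, as reported in Table 16, means fewer normal records are misflagged while more attacks are caught.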

We propose the LNNLS-KH algorithm, a novel feature selection algorithm for intrusion detection. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the algorithm has good feature selection performance and improves the efficiency of intrusion detection.

5. Conclusions

With the rapid development of network technology, intrusion detection plays an increasingly important role in network security. However, the “dimensional disaster” caused by massive data results in problems such as slow response and poor accuracy of the intrusion detection system. The KH algorithm is a new population-based swarm intelligence optimization method that performs well in high-dimensional data processing, providing a new approach for reducing the dimension of intrusion detection data and selecting useful features. In this paper, an improved KH algorithm named LNNLS-KH is proposed for feature selection of IDS datasets by linear nearest neighbor lasso optimization. The LNNLS-KH algorithm introduces a new fitness function, which is composed of the number of feature selection dimensions and the classification accuracy. Nonlinear optimization is introduced into the physical diffusion motion of krill individuals to accelerate the convergence speed of the algorithm. Moreover, the linear nearest neighbor lasso step optimization is proposed to balance the exploration and exploitation abilities and to obtain the global optimal solution of the feature subset effectively. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the LNNLS-KH algorithm retains 7 and 10.2 features on average, respectively, which greatly reduces the dimension of the features. In the NSL-KDD dataset, features are reduced by 44%, 42.86%, 34.88%, and 24.32% compared with the CMPSO, ACO, KH, and IKH algorithms, and in the CICIDS2017 dataset they are reduced by 57.85%, 52.34%, 27.14%, and 25%, respectively. In addition, the classification accuracy of the LNNLS-KH feature selection algorithm is increased by 10.03% and 5.39%, and the time of intrusion detection is reduced by 12.41% and 4.03% on the two datasets. Furthermore, the LNNLS-KH algorithm enhances the ability to jump out of the local optimal solution and shows good performance in the optimal fitness iteration curve, false positive rate of detection, and convergence speed, which demonstrates that the proposed LNNLS-KH algorithm is an efficient feature selection method for network intrusion detection.

In this research, we realized that the initialization of the LNNLS-KH algorithm has a certain degree of randomness. Therefore, we conducted independent and repeated experiments to mitigate the problem, and the results were reasonable and convincing. Although the proposed algorithm shows encouraging performance, it could be further improved.

In future work, we will consider using data balancing techniques to preprocess the experimental dataset to obtain more accurate feature selection results and stronger algorithm stability. Meanwhile, we will combine LNNLS-KH with other algorithms to improve the exploration and exploitation abilities, thereby further shortening the time of training feature subsets and of classification detection. In addition, as the LNNLS-KH algorithm is universally applicable, it can be applied to more feature selection systems and used to solve optimization problems in other fields.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was sponsored by the National Key Research and Development Program of China (Grants 2018YFB0804002 and 2017YFB0803204), National Natural Science Foundation of PR China (Grant 72001191), Henan Natural Science Foundation (Grant 202300410442), and Henan Philosophy and Social Science Program (Grant 2020CZH009).

References

[1] W. Wei and C. Guo, “A text semantic topic discovery method based on the conditional co-occurrence degree,” Neurocomputing, vol. 368, pp. 11–24, 2019.

[2] C.-R. Wang, R.-F. Xu, S.-J. Lee, and C.-H. Lee, “Network intrusion detection using equality constrained-optimization-based extreme learning machines,” Knowledge-Based Systems, vol. 147, pp. 68–80, 2018.

[3] G.-G. Wang, A. H. Gandomi, A. H. Alavi, and D. Gong, “A comprehensive review of krill herd algorithm: variants, hybrids and applications,” Artificial Intelligence Review, vol. 51, no. 1, pp. 119–148, 2019.

[4] J. Amudhavel, D. Sathian, R. S. Raghav et al., “A fault tolerant distributed self-organization in peer to peer (p2p) using krill herd optimization,” in Proceedings of the 2015 International Conference on Advanced Research in Computer Science Engineering & Technology (ICARCSET 2015), pp. 1–5, Unnao, India, 2015.

[5] L. M. Abualigah, A. T. Khader, and E. S. Hanandeh, “Hybrid clustering analysis using improved krill herd algorithm,” Applied Intelligence, vol. 48, no. 11, pp. 4047–4071, 2018.

[6] P. A. Kowalski and S. Łukasik, “Training neural networks with krill herd algorithm,” Neural Processing Letters, vol. 44, no. 1, pp. 5–17, 2016.

[7] C. Stasinakis, G. Sermpinis, I. Psaradellis, and T. Verousis, “Krill-Herd Support Vector Regression and heterogeneous autoregressive leverage: evidence from forecasting and trading commodities,” Quantitative Finance, vol. 16, no. 12, pp. 1901–1915, 2016.


[8] L. Wang, P. Jia, T. Huang, S. Duan, J. Yan, and L. Wang, “A novel optimization technique to improve gas recognition by electronic noses based on the enhanced krill herd algorithm,” Sensors, vol. 16, no. 8, p. 1275, 2016.

[9] R. Jensi and G. W. Jiji, “An improved krill herd algorithm with global exploration capability for solving numerical function optimization problems and its application to data clustering,” Applied Soft Computing, vol. 46, pp. 230–245, 2016.

[10] H. Pulluri, R. Naresh, and V. Sharma, “Application of stud krill herd algorithm for solution of optimal power flow problems,” International Transactions on Electrical Energy Systems, vol. 27, no. 6, Article ID e2316, 2017.

[11] D. Rodrigues, L. A. M. Pereira, J. P. Papa et al., “A binary krill herd approach for feature selection,” in Proceedings of the 2014 22nd International Conference on Pattern Recognition, pp. 1407–1412, IEEE, Stockholm, Sweden, August 2014.

[12] A. Mukherjee and V. Mukherjee, “Chaotic krill herd algorithm for optimal reactive power dispatch considering FACTS devices,” Applied Soft Computing, vol. 44, pp. 163–190, 2016.

[13] S. Sun, H. Qi, F. Zhao, L. Ruan, and B. Li, “Inverse geometry design of two-dimensional complex radiative enclosures using krill herd optimization algorithm,” Applied Thermal Engineering, vol. 98, pp. 1104–1115, 2016.

[14] S. Sultana and P. K. Roy, “Oppositional krill herd algorithm for optimal location of capacitor with reconfiguration in radial distribution system,” International Journal of Electrical Power & Energy Systems, vol. 74, pp. 78–90, 2016.

[15] L. Brezocnik, I. Fister, and V. Podgorelec, “Swarm intelligence algorithms for feature selection: a review,” Applied Sciences, vol. 8, no. 9, 2018.

[16] D. Smith, Q. Guan, and S. Fu, “An anomaly detection framework for autonomic management of compute cloud systems,” in Proceedings of the 2010 IEEE 34th Annual Computer Software and Applications Conference Workshops, pp. 376–381, IEEE, Seoul, South Korea, July 2010.

[17] Y. Zhao, Y. Zhang, W. Tong et al., “An improved feature selection algorithm based on MAHALANOBIS distance for network intrusion detection,” in Proceedings of 2013 International Conference on Sensor Network Security Technology and Privacy Communication System, pp. 69–73, IEEE, Nangang, China, May 2013.

[18] P. Singh and A. Tiwari, “An efficient approach for intrusion detection in reduced features of KDD99 using ID3 and classification with KNNGA,” in Proceedings of the 2015 Second International Conference on Advances in Computing and Communication Engineering, pp. 445–452, IEEE, Dehradun, India, May 2015.

[19] M. A. Ambusaidi, X. He, P. Nanda, and Z. Tan, “Building an intrusion detection system using a filter-based feature selection algorithm,” IEEE Transactions on Computers, vol. 65, no. 10, pp. 2986–2998, 2016.

[20] N. Shone, T. N. Ngoc, V. D. Phai, and Q. Shi, “A deep learning approach to network intrusion detection,” IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 2, no. 1, pp. 41–50, 2018.

[21] Y. Xue, W. Jia, X. Zhao et al., “An evolutionary computation based feature selection method for intrusion detection,” Security and Communication Networks, vol. 2018, Article ID 2492956, 10 pages, 2018.

[22] Z. Shen, Y. Zhang, and W. Chen, “A Bayesian classification intrusion detection method based on the fusion of PCA and LDA,” Security and Communication Networks, vol. 2019, Article ID 6346708, 11 pages, 2019.

[23] P. Sun, P. Liu, Q. Li et al., “DL-IDS: extracting features using CNN-LSTM hybrid network for intrusion detection system,” Security and Communication Networks, vol. 2020, Article ID 8890306, 11 pages, 2020.

[24] G. Farahani, “Feature selection based on cross-correlation for the intrusion detection system,” Security & Communication Networks, vol. 2020, Article ID 8875404, 17 pages, 2020.

[25] F. G. Mohammadi, M. H. Amini, and H. R. Arabnia, “Applications of nature-inspired algorithms for dimension reduction: enabling efficient data analytics,” in Advances in Intelligent Systems and Computing, Optimization, Learning, and Control for Interdependent Complex Networks, pp. 67–84, Springer, Cham, Switzerland, 2020.

[26] J. Kennedy and R. Eberhart, “Particle swarm optimization,” in Proceedings of the ICNN’95-International Conference on Neural Networks, no. 4, pp. 1942–1948, IEEE, Perth, WA, Australia, December 1995.

[27] M. Dorigo, M. Birattari, and T. Stutzle, “Ant colony optimization,” IEEE Computational Intelligence Magazine, vol. 1, no. 4, pp. 28–39, 2006.

[28] R. Rajabioun, “Cuckoo optimization algorithm,” Applied Soft Computing, vol. 11, no. 8, pp. 5508–5518, 2011.

[29] M. Neshat, G. Sepidnam, M. Sargolzaei, and A. N. Toosi, “Artificial fish swarm algorithm: a survey of the state-of-the-art, hybridization, combinatorial and indicative applications,” Artificial Intelligence Review, vol. 42, no. 4, pp. 965–997, 2014.

[30] D. Karaboga, “An idea based on honey bee swarm for numerical optimization,” Technical Report-tr06, Erciyes University, Engineering Faculty, Computer Engineering Department, Kayseri, Turkey, 2005.

[31] W.-T. Pan, “A new Fruit Fly Optimization Algorithm: taking the financial distress model as an example,” Knowledge-Based Systems, vol. 26, pp. 69–74, 2012.

[32] R. Zhao and W. Tang, “Monkey algorithm for global numerical optimization,” Journal of Uncertain Systems, vol. 2, no. 3, pp. 165–176, 2008.

[33] X. S. Yang and X. He, “Bat algorithm: literature review and applications,” International Journal of Bio-Inspired Computation, vol. 5, no. 3, pp. 141–149, 2013.

[34] S. Mirjalili, A. H. Gandomi, S. Z. Mirjalili, S. Saremi, H. Faris, and S. M. Mirjalili, “Salp Swarm Algorithm: a bio-inspired optimizer for engineering design problems,” Advances in Engineering Software, vol. 114, pp. 163–191, 2017.

[35] K. Ahmed, A. E. Hassanien, and S. Bhattacharyya, “A novel chaotic chicken swarm optimization algorithm for feature selection,” in Proceedings of the 2017 Third International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), pp. 259–264, IEEE, Kolkata, India, November 2017.

[36] S. Tabakhi, P. Moradi, F. Akhlaghian et al., “An unsupervised feature selection algorithm based on ant colony optimization,” Engineering Applications of Artificial Intelligence, vol. 32, pp. 112–123, 2014.

[37] S. Arora and P. Anand, “Binary butterfly optimization approaches for feature selection,” Expert Systems with Applications, vol. 116, pp. 147–160, 2019.

[38] C. Yan, J. Ma, H. Luo, and A. Patel, “Hybrid binary coral reefs optimization algorithm with simulated annealing for feature selection in high-dimensional biomedical datasets,” Chemometrics and Intelligent Laboratory Systems, vol. 184, pp. 102–111, 2019.

[39] G. I. Sayed, A. Tharwat, and A. E. Hassanien, “Chaotic dragonfly algorithm: an improved metaheuristic algorithm for feature selection,” Applied Intelligence, vol. 49, no. 1, pp. 188–205, 2019.

[40] Z. Zhang, P. Wei, Y. Li et al., “Feature selection algorithm based on improved particle swarm joint taboo search,” Journal of Communication, vol. 39, no. 12, pp. 60–68, 2018.

[41] A. H. Gandomi and A. H. Alavi, “Krill herd: a new bio-inspired optimization algorithm,” Communications in Nonlinear Science and Numerical Simulation, vol. 17, no. 12, pp. 4831–4845, 2012.

[42] Q. Tan and Z. Huang, “Krill herd with nearest neighbor lasso operator,” Computer Engineering and Applications, vol. 55, no. 9, pp. 124–129, 2019.

[43] Q. Wang, C. Ding, and X. Wang, “A hybrid data clustering algorithm based on improved krill herd algorithm and KHM clustering,” Control and Decision, vol. 35, no. 10, pp. 2449–2458, 2018.

[44] Q. Li and B. Liu, “Clustering using an improved krill herd algorithm,” Algorithms, vol. 10, no. 2, p. 56, 2017.

[45] G.-G. Wang, A. H. Gandomi, and A. H. Alavi, “Stud krill herd algorithm,” Neurocomputing, vol. 128, pp. 363–370, 2014.

[46] J. Li, Y. Tang, C. Hua, and X. Guan, “An improved krill herd algorithm: krill herd with linear decreasing step,” Applied Mathematics and Computation, vol. 234, pp. 356–367, 2014.

[47] H. B. Nguyen, B. Xue, P. Andreae et al., “Particle swarm optimisation with genetic operators for feature selection,” in Proceedings of the 2017 IEEE Congress on Evolutionary Computation (CEC), pp. 286–293, IEEE, San Sebastian, Spain, June 2017.

[48] M. H. Aghdam and P. Kabiri, “Feature selection for intrusion detection system using ant colony optimization,” International Journal of Network Security, vol. 18, no. 3, pp. 420–432, 2016.



search module, the corresponding feature subset is generated. Appropriate evaluation criteria are constructed to evaluate the feature subsets. When the termination condition of the feature selection process is reached, the final selected feature subset is output. The selected subset is then verified to evaluate the quality of the feature selection algorithm. The framework of feature selection is shown in Figure 1.

The swarm intelligence optimization method is a kind of group-oriented random search technology, which provides new ideas for solving the feature selection problem. The krill herd (KH) algorithm is a new type of swarm intelligence optimization method that studies the foraging rule and clustering behavior of krill herds in nature. By simulating the movement induced by other krill individuals, the foraging activity, and the physical diffusion motion of the krill herd, the position of each individual is constantly updated. While looking for food and the highest krill herd density, the individuals move towards the best solution and finally reach the global optimal solution. The KH algorithm has attracted wide attention from scholars and engineers for its excellent optimization performance and is considered to be one of the fastest developing nature-inspired heuristic algorithms for solving practical optimization problems [3]. It integrates the local robust search method with the population-based method and performs well in high-dimensional data processing. It is widely used in network path optimization [4], text clustering analysis [5], neural network training [6], multiple continuous optimization [7–9], combinatorial optimization [10, 11], constraint optimization [12–14], and other scenarios [3]. The KH algorithm has good exploitation ability, but its exploration ability is not satisfactory, which means that the algorithm easily falls into a local optimal solution when solving practical problems. Although improved variants of the KH algorithm exist, research on variants that provide both a high convergence rate and the global optimal solution is continuing. Therefore, improving the KH algorithm to balance its global exploration and local exploitation abilities is of great significance for improving solution accuracy and optimization efficiency.
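For orientation, the three motions named in this paragraph combine as follows in the standard KH formulation of the original KH paper [41]. This summary is a sketch of that baseline model only, not of the LNNLS-KH modifications; the symbol names follow common usage and may differ from the notation used elsewhere in this paper.

```latex
% Standard KH model: N_i = induced motion, F_i = foraging motion,
% D_i = physical diffusion of krill i.
\begin{aligned}
\frac{dX_i}{dt} &= N_i + F_i + D_i, \\
N_i^{\mathrm{new}} &= N^{\max}\,\alpha_i + \omega_n\, N_i^{\mathrm{old}}, \\
F_i &= V_f\,\beta_i + \omega_f\, F_i^{\mathrm{old}}, \\
D_i &= D^{\max}\left(1 - \frac{I}{I_{\max}}\right)\delta, \\
X_i(t + \Delta t) &= X_i(t) + \Delta t\,\frac{dX_i}{dt},
\qquad \Delta t = C_t \sum_{j=1}^{NV} \left(UB_j - LB_j\right).
\end{aligned}
```

Here $\alpha_i$ and $\beta_i$ are the direction terms induced by neighbouring krill and by food, $\omega_n$ and $\omega_f$ are inertia weights, $\delta$ is a random direction vector, $I$ is the current iteration out of $I_{\max}$, and $UB_j$, $LB_j$ are the bounds of the $j$-th of $NV$ variables. In the standard model the diffusion term decreases linearly with the iteration count.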

In this paper, an optimized LNNLS-KH algorithm for feature selection is proposed to address the problem of the large number and high dimension of intrusion detection datasets. It filters out the redundant features of IDS data so that the efficiency of intrusion detection is significantly improved and the time cost is greatly reduced.

The main contributions of this paper are listed as follows:

(i) The number of dimensions and the detection accuracy of feature selection were introduced into the fitness function, which improved the ability of feature selection.

(ii) To accelerate the convergence speed of the algorithm, we modified the physical diffusion motion of krill individuals by the nonlinear method.

(iii) The LNNLS-KH algorithm was proposed for feature selection of intrusion detection data, which effectively enhanced the local exploitation ability and global exploration ability of the algorithm.

(iv) The proposed algorithm was comprehensively evaluated by conducting a large number of experiments on the NSL-KDD and CICIDS2017 datasets. The experimental results show that the LNNLS-KH algorithm exhibited competitive performance in the evaluation indicators for intrusion detection.

The remaining sections of this paper are organized as follows. Section 2 presents the related works on feature selection methods and the variants of the KH algorithm. Section 3 provides a detailed description of the proposed LNNLS-KH algorithm. Section 4 shows the experimental results and discussion. Section 5 concludes the paper and outlines future research.

2. Related Works

In this section, we show three feature selection methods based on evaluation criteria and feature selection algorithms in IDS. Meanwhile, we summarize swarm intelligence algorithms, especially the KH algorithm and its variants.

2.1. Feature Selection Methods Based on the Evaluation Criteria. There are three types of feature selection methods based on the evaluation criteria: the filter method, the wrapper method, and the embedded method [15]. The filter method assigns weights to the features of each dimension, filters the features in the order of weight, and uses the resulting feature subsets to train the classification algorithm. Therefore, the process of feature selection is independent of the classification algorithm. Although the filter method occupies fewer computing resources and takes less time for feature selection, the selected feature subset is not tuned to the classification algorithm, resulting in low classification accuracy. The wrapper method takes into account the performance of the classification algorithm on the feature subsets, so it achieves a high classification accuracy, but it consumes a great deal of computation and time. The embedded method integrates the feature selection process with the classification algorithm and performs feature selection during classification training. Its computation cost and classification accuracy lie between those of the filter method and the wrapper method. The feature selection of intrusion detection data requires high accuracy, and the training time of offline data is not a concern. Therefore, the wrapper method is adopted as the feature selection method in this paper. The frameworks of the three types of feature selection methods based on the evaluation criteria are shown in Figure 2.
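To make the wrapper idea concrete, the sketch below scores a candidate feature subset by running a classifier on it and combining the error rate with the fraction of features kept. It is a simplified illustration under stated assumptions: the weighted-sum form, the weight `alpha`, the leave-one-out 1-nearest-neighbour evaluation, and the toy data are hypothetical stand-ins and are not the fitness function or KNN setup defined in Section 3.

```python
def squared_distance(a, b, mask):
    """Squared Euclidean distance restricted to the features kept by the mask."""
    return sum((x - y) ** 2 for x, y, keep in zip(a, b, mask) if keep)


def loo_1nn_accuracy(samples, labels, mask):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on masked features."""
    correct = 0
    for i, sample in enumerate(samples):
        nearest = min(
            (j for j in range(len(samples)) if j != i),
            key=lambda j: squared_distance(sample, samples[j], mask),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(samples)


def wrapper_fitness(samples, labels, mask, alpha=0.9):
    """Lower is better: weighted sum of error rate and fraction of features kept.

    alpha is an assumed trade-off weight, not a value taken from the paper.
    """
    if not any(mask):
        return 1.0  # an empty subset cannot classify anything
    error = 1.0 - loo_1nn_accuracy(samples, labels, mask)
    return alpha * error + (1.0 - alpha) * sum(mask) / len(mask)


# Toy data: feature 0 separates the two classes; features 1 and 2 are noise.
samples = [[0.10, 0.9, 0.3], [0.20, 0.1, 0.8], [0.15, 0.5, 0.5],
           [0.90, 0.8, 0.2], [0.80, 0.2, 0.9], [0.85, 0.4, 0.6]]
labels = [0, 0, 0, 1, 1, 1]

fit_all = wrapper_fitness(samples, labels, [1, 1, 1])
fit_first = wrapper_fitness(samples, labels, [1, 0, 0])
fit_noise = wrapper_fitness(samples, labels, [0, 1, 0])
print(fit_all, fit_first, fit_noise)
```

On this toy data the single informative feature scores better than the full feature set, which in turn scores better than a noise-only subset. A search algorithm such as KH would propose the masks; the classifier in the loop is what makes the approach a wrapper, and also what makes it expensive.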

2.2. Feature Selection Algorithms in IDS. Feature selection is one of the most important parts of data preprocessing in intrusion detection, which is of great significance to IDS. The characteristics of network intrusion detection data are multiple features and large scale. Features of different categories have different attribute values, including redundant features that interfere with the classification results. A large number of redundant features reduce the efficiency of detection algorithms and increase the false positive rate of intrusion detection. However, a feature selection algorithm with good performance decreases the dimensionality of network data and improves the accuracy and detection speed of IDS.

2 Security and Communication Networks

Figure 1: Framework of feature selection. (Flowchart: a search module comprising the search starting point and the search strategy, followed by the evaluation criterion, a yes/no judgment condition, verification, and output.)

Figure 2: Frameworks of the three types of feature selection methods: (a) framework of the filter method for feature selection; (b) framework of the wrapper method for feature selection; (c) framework of the embedded method for feature selection.

In recent years, there has been a great deal of research on feature selection in intrusion detection. Smith et al. combined a Bayesian network and principal component analysis (PCA) to conduct feature selection for intrusion detection data [16]. They used Bayesian networks to adjust the correlation of attributes and PCA to extract the primary features on an institute-wide cloud system. The disadvantage is that the detection accuracy still needs to be improved. Zhao et al. [17] proposed a feature selection method based on Mahalanobis distance and applied it to network intrusion detection to obtain the optimal feature subset. Feature ranking based on Mahalanobis distance was used as the principal selection mechanism, and an improved exhaustive search was used to select the optimal ranking features. The experimental results on the KDD CUP 99 dataset show that the algorithm performs well with both the support vector machine and the k-nearest neighbor classifier. Singh and Tiwari proposed an efficient approach for intrusion detection on reduced features of the KDD CUP 99 dataset in 2015 [18]. The Iterative Dichotomiser 3 (ID3) algorithm was used for feature reduction of large datasets, and KNNGA was used as a classifier for intrusion detection. The method performs well on the evaluation measures of sensitivity, specificity, and accuracy. However, both Zhao et al. and Singh and Tiwari [17, 18] conducted their experiments on outdated datasets, which hardly reflect the new attack features of modern networks. In [19], Ambusaidi et al. proposed a feature selection algorithm based on mutual information to deal with linearly and nonlinearly related data features. They established an intrusion detection system based on the least-squares support vector machine. Experimental results show that the proposed algorithm performs well in accuracy but poorly in false positive rate. Shone et al. proposed an unsupervised feature learning method based on a nonsymmetric deep autoencoder (NDAE) and a novel deep learning classification model constructed using stacked NDAEs [20]. The results demonstrated that the approach offers high levels of accuracy, precision, and recall together with reduced training time. Notably, the stacked NDAE model requires 98.81% less training time than the mainstream DBN technology. The limitation is that the capability of the model to handle zero-day attacks still needs to be assessed and extended.

In [21], a self-adaptive differential evolution (SaDE) algorithm was proposed to deal with the feature selection problem. It uses an adaptive mechanism to select the most appropriate of four candidate solution generation strategies, which effectively reduces the number of features. The disadvantage is that the experiment uses small sample data, and more data are needed to further support the conclusion. Shen et al. adopted principal component analysis and linear discriminant analysis to decrease the dimensionality of the dataset and combined them with Bayesian classification to construct an intrusion detection model [22]. Simulation experiments on the CICIDS2017 dataset show that the proposed algorithm filters out the noise in the data and improves the time performance to a certain extent. However, the algorithm still needs to be optimized to further improve the classification accuracy. In [23], a hybrid network feature selection method based on a convolutional neural network (CNN) and a long short-term memory network (LSTM) was applied to IDS. According to the experimental results, the proposed feature selection algorithm achieves better accuracy than the CNN-only model and the LSTM-only model. However, the detection accuracy for Heartbleed and SSH-Patator attacks is low. In [24], Farahani proposed a new cross-correlation-based feature selection (CCFS) method to reduce the feature dimension of intrusion detection datasets. Compared with the cuttlefish algorithm (CFA) and mutual information-based feature selection (MIFS), the proposed algorithm was demonstrated to perform well in the accuracy, precision, and recall of classification. However, the author simply replaced the categorical attributes with numeric values when dealing with symbolic data, without considering the more reasonable one-hot encoding method. A summary of feature selection methods in IDS is shown in Table 1.

2.3. Swarm Intelligence Algorithms for Feature Selection. The core of feature selection is the search strategy for generating feature subsets. Although a complete search, whether exhaustive or nonexhaustive, can find the globally optimal feature subset, its excessive time complexity consumes huge computing resources. In recent years, swarm intelligence optimization methods inspired by natural phenomena have provided a new approach to the problem of feature selection [10–17]. Therefore, we propose the LNNLS-KH algorithm, which has high search efficiency, as the search strategy for feature subsets. Swarm intelligence optimization methods simulate the evolutionary principle of survival of the fittest in nature; they are group-oriented random search techniques that can be used to solve complex problems in large-scale data analysis [25]. Common swarm intelligence optimization methods include particle swarm optimization (PSO) [26], ant colony optimization algorithm (ACO) [27], cuckoo algorithm (CA) [28], artificial fish swarm algorithm (AFSA) [29], artificial bee colony algorithm (ABC) [30], fruit fly optimization algorithm (FOA) [31], monkey algorithm (MA) [32], bat algorithm (BA) [33], and salp swarm algorithm (SSA) [34].

Moreover, Ahmed et al. proposed a new chaotic chicken swarm algorithm (CCSO) for feature selection [35]. By combining logistic maps and chaotic tent maps, the CSO algorithm acquires a strong spatial search ability. The experimental results show that the classification accuracy of the model is further improved after CCSO feature selection. The disadvantage is the lack of comparison with other chaotic algorithms. Ahmtabakh proposed an unsupervised feature selection method based on ant colony optimization (UFSACO) [36], which iteratively filters features using the heuristic and previous-stage information of the ant colony. At the same time, the similarity between features is quantified to reduce the redundancy of data features. However, the efficiency of the feature selection process needs to be improved.

To address the tendency to fall into local optimal solutions, Arora and Anand proposed a butterfly optimization algorithm (BOA) based on binary variables [37]. Based on the foraging behavior of butterflies, the algorithm uses each butterfly as a search agent to iteratively optimize the fitness function; it has good convergence ability and avoids premature convergence to a certain extent. Experimental results show that the algorithm reduces the length of the feature subset while selecting the optimal feature subset and improves the classification accuracy to a certain extent. However, its time cost is larger than that of the genetic algorithm and the particle swarm optimization algorithm, and the optimized feature subset varies across repeated experiments, which indicates poor robustness.

In [38], Yan et al. proposed a hybrid optimization algorithm (BCROSAT) based on simulated annealing and binary coral reefs, which is used for feature selection in high-dimensional biomedical datasets. The algorithm increases the diversity of the initial population through the league selection strategy and uses the simulated annealing algorithm and binary coding to improve the search ability of the coral reef optimization algorithm. However, the algorithm has high time complexity. In [39], a new chaotic dragonfly algorithm (CDA) is proposed by Sayed et al., which combines 10 different chaotic maps with the search iteration process of the dragonfly algorithm so as to accelerate the convergence of the algorithm and improve the efficiency of feature selection. The algorithm uses the worst fitness value, best fitness value, average fitness value, standard deviation, and average feature length as evaluation criteria. The experimental results show that adjusting the variables with the Gauss map significantly improves the dragonfly algorithm in terms of classification performance, stability, number of selected features, and convergence speed. The disadvantage is that the experimental data are small, and the algorithm needs to be verified on large-scale datasets. Zhang et al. [40] mixed the genetic algorithm and the particle swarm optimization algorithm to conduct a tabu search on the produced optimal initial solution, and the result of the quadratic feature selection is the global optimal feature subset. The algorithm not only guarantees good classification performance but also greatly reduces the false positive rate and false negative rate of the classification results. The disadvantage is that the algorithm incurs a large computation cost and a long offline training time.

2.4. Krill Herd (KH) Algorithm and Variants. The krill herd (KH) algorithm is a population-based swarm intelligence optimization method proposed by Gandomi and Alavi in 2012 [41]. The algorithm studies the foraging rules and clustering behavior of krill swarms in nature and simulates the induced movement, foraging activity, and random diffusion movement of the krill herd. It obtains the optimal solution by continuously updating the positions of krill individuals.

Abualigah et al. introduced a multicriteria mixed function based on the global optimal concept into the KH algorithm and applied it to text clustering [5]. By combining the advantages of local neighborhood search and global wide-area search, the algorithm balances the exploitation and exploration processes of the krill herd. In [42], the influence of excellent neighbor individuals on the krill herd during evolution is considered, and an improved KH algorithm is proposed to enhance the local search ability of the algorithm. In [43], a hybrid data clustering algorithm (IKH-KHM) based on an improved KH algorithm and k-harmonic means was proposed to address the sensitivity of the K-means algorithm to its cluster centers. This algorithm increases the diversity of the krill herd by alternately using the random walk of Lévy flight and the crossover operator of the genetic algorithm. It improves the global search ability of the algorithm and avoids premature convergence to some degree. Simulation experiments on 5 datasets from the UCI database show that the IKH-KHM algorithm overcomes the noise sensitivity problem to a certain extent and has a significant effect on the optimization of the objective function. However, its slow recovery speed results in a high time cost. In 2017, Li and Liu adopted a combined update mechanism of a selection operator and a mutation operator to enhance the global optimization ability of the KH algorithm. They addressed the imbalance between local search and global search in the original KH algorithm [44].

To enhance the global search ability of the KH algorithm, Jensi and Jiji proposed an improved KH algorithm with a global search operator [9] and applied it to data clustering.

Table 1: Summary of feature selection methods in IDS.

| Method | Author | Year | Ref. no. |
|---|---|---|---|
| Bayesian network-based dimensionality reduction and principal component analysis (PCA) | Smith et al. | 2010 | [16] |
| Ranking based on Mahalanobis distance and exhaustive search | Zhao et al. | 2013 | [17] |
| Iterative Dichotomiser 3 (ID3) algorithm | Singh and Tiwari | 2015 | [18] |
| Mutual information method | Ambusaidi et al. | 2016 | [19] |
| Nonsymmetric deep autoencoder (NDAE) | Shone et al. | 2018 | [20] |
| Self-adaptive differential evolution (SaDE) | Xue et al. | 2018 | [21] |
| Principal component analysis (PCA) and linear discriminant analysis (LDA) | Shen et al. | 2019 | [22] |
| Hybrid network of convolutional neural network (CNN) and long short-term memory network (LSTM) | Sun et al. | 2020 | [23] |
| Cross-correlation-based feature selection (CCFS) method | Farahani | 2020 | [24] |


The algorithm continuously searches around the original area to guide the krill herd toward the global optimum. It defines a new step size formula, which makes it convenient for krill individuals to fine-tune their positions in the search space. At the same time, an elite selection strategy is introduced into the krill herd update process, which helps the algorithm jump out of local optimal solutions. Experimental results show that the improved KH algorithm has higher accuracy and better robustness.

In [45], Wang et al. proposed a stud KH algorithm. The method adopts a new krill herd genetics and reproduction mechanism, replacing the random selection in the standard KH algorithm with a columnar selection operator and a crossover operator. To balance the exploration and exploitation abilities of the KH algorithm, Li et al. proposed a linear decreasing step KH algorithm [46]. In this algorithm, the step size scaling factor is adjusted linearly so that it decreases as the number of iterations increases, thereby enhancing the search ability of the algorithm.

Although the KH algorithm and its enhanced versions show better performance than other swarm intelligence algorithms, there are still deficiencies such as unbalanced exploration and exploitation. In this paper, to minimize the number of selected features and achieve high classification accuracy, both quantities are introduced into the fitness evaluation function. The physical diffusion motion of krill individuals is improved nonlinearly to dynamically adjust the random diffusion amplitude and accelerate the convergence of the algorithm. At the same time, a linear nearest neighbor lasso step optimization is performed after updating the position of the krill herd, which effectively enhances the global exploration ability. It helps the algorithm achieve better performance, reduce the data dimension through feature selection, and improve the efficiency of intrusion detection.

3. Algorithm Design

In this section, we first provide a brief description of the KH algorithm. Subsequently, we present an improved version of KH, named LNNLS-KH, to address the problem of the large number and high dimension of features in feature selection for intrusion detection.

3.1. Standard KH Algorithm. The framework of the KH algorithm is shown in Figure 3. It includes the three actions of krill individuals, the crossover operation, the position update, and the calculation of the fitness function. After initialization, krill individuals change their positions according to the three actions. Then, the crossover operator is executed to complete the position update, and the new fitness function is calculated. If the number of iterations has not reached the maximum, krill individuals repeat the process until the iteration is completed.

As a novel biologically inspired algorithm for solving optimization tasks, the KH algorithm represents each possible solution of the problem with a krill individual. By simulating the foraging behavior, the position of the krill herd is continuously updated to obtain the global optimal solution. The motions of krill individuals are mainly affected by the following three aspects:

(1) Movement induced by other krill individuals
(2) Foraging activity
(3) Physical diffusion motion

The KH algorithm adopts the Lagrange model to search in multidimensional space. The position update of krill individuals is shown as follows:

$$\frac{dX_i}{dt} = N_i + F_i + D_i \tag{1}$$

where $X_i = \{X_{i1}, X_{i2}, \ldots, X_{iNV}\}$, $N_i$ is the movement induced by other krill individuals, $F_i$ is the foraging activity of the krill individual, and $D_i$ is the random physical diffusion based on density region.

3.1.1. Movement Induced by Other Krill Individuals. The movement induced by other krill individuals is described as follows:

$$N_i^{new} = N^{max}\alpha_i + \omega_n N_i^{old} \tag{2}$$

$$\alpha_i = \alpha_i^{local} + \alpha_i^{target} \tag{3}$$

where $N^{max}$ is the maximum induction velocity of the surrounding krill individuals, taken as 0.01 (m s$^{-1}$) [5]; $\omega_n$ represents the inertia weight in the range [0, 1]; $N_i^{old}$ is the result of the last motion induced by other krill individuals; $\alpha_i^{local}$ is a parameter indicating the direction of guidance; and $\alpha_i^{target}$ is the direction effect of the global optimal krill individual.

$\alpha_i^{local}$ is defined as follows:

$$\alpha_i^{local} = \sum_{j=1}^{NN} \hat{K}_{ij}\hat{X}_{ij}, \qquad \hat{X}_{ij} = \frac{X_j - X_i}{\lVert X_j - X_i \rVert + \varepsilon}, \qquad \hat{K}_{ij} = \frac{K_i - K_j}{K^{worst} - K^{best}} \tag{4}$$

where $K^{best}$ and $K^{worst}$ are the best and worst fitness values of the krill herd, $K_i$ is the fitness value of the ith krill individual, $K_j$ represents the fitness value of the jth neighbor krill individual ($j = 1, 2, \ldots, NN$), and $NN$ represents the total number of neighbors. The $\varepsilon$ in the denominator is a small positive number that avoids the singularity caused by a zero denominator.

When selecting surrounding krill individuals, the KH algorithm determines the nearest neighbors of the ith krill individual by defining a "neighborhood ratio." The neighborhood is a circular area with the ith krill individual as the center and the perception distance $d_{s,i}$ as the radius. $d_{s,i}$ is described as follows:

$$d_{s,i} = \frac{1}{5N}\sum_{j=1}^{N} \lVert X_i - X_j \rVert \tag{5}$$


where $N$ is the number of krill individuals and $X_i$ and $X_j$ represent the positions of the ith and jth krill individuals. $\alpha_i^{target}$ is defined as follows:

$$\alpha_i^{target} = C^{best}\hat{K}_{i,best}\hat{X}_{i,best} \tag{6}$$

where $C^{best}$ is the effective coefficient between the ith and the global optimal krill individuals:

$$C^{best} = 2\left(rand + \frac{I}{I_{max}}\right) \tag{7}$$

where $I$ is the current iteration number, $I_{max}$ is the maximum number of iterations, and $rand$ is a random number in [0, 1], which is used to enhance the exploration ability.
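Equations (2)–(7) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: fitness is assumed to be minimized, positions are lists of floats, the function names, the inertia weight `omega_n=0.5`, and the guard against identical fitness values are hypothetical choices made here.

```python
import math
import random


def unit_direction(x_from, x_to, eps=1e-12):
    """X_hat of equation (4): (X_to - X_from) / (||X_to - X_from|| + eps)."""
    diff = [b - a for a, b in zip(x_from, x_to)]
    scale = math.sqrt(sum(c * c for c in diff)) + eps
    return [c / scale for c in diff]


def sensing_distance(i, positions):
    """Equation (5): d_{s,i} = 1 / (5N) * sum_j ||X_i - X_j||."""
    total = sum(math.dist(positions[i], p) for p in positions)
    return total / (5 * len(positions))


def induced_motion(i, positions, fitness, n_old, it, it_max,
                   n_max=0.01, omega_n=0.5, rng=random):
    """Equations (2)-(7) for krill i, assuming fitness is minimized."""
    k_best, k_worst = min(fitness), max(fitness)
    # Assumption: fall back to 1.0 when every krill has the same fitness.
    spread = (k_worst - k_best) or 1.0
    x_i, k_i = positions[i], fitness[i]
    alpha = [0.0] * len(x_i)

    # Local effect, equation (4): neighbors inside the sensing distance.
    d_s = sensing_distance(i, positions)
    for j, x_j in enumerate(positions):
        if j != i and math.dist(x_i, x_j) <= d_s:
            k_hat = (k_i - fitness[j]) / spread
            alpha = [a + k_hat * d
                     for a, d in zip(alpha, unit_direction(x_i, x_j))]

    # Target effect, equations (6) and (7): pull toward the best krill.
    best = fitness.index(k_best)
    c_best = 2.0 * (rng.random() + it / it_max)
    k_hat_best = (k_i - k_best) / spread
    alpha = [a + c_best * k_hat_best * d
             for a, d in zip(alpha, unit_direction(x_i, positions[best]))]

    # Equation (2): combine with the previous induced motion.
    return [n_max * a + omega_n * o for a, o in zip(alpha, n_old)]
```

With two krill in one dimension, the worse krill is pulled toward the better one, while the best krill itself receives no induced motion because both of its normalized fitness differences vanish.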

3.1.2. Foraging Activity. Foraging activity is affected by the food distance and the experience of the food location, and it is described as follows:

$$F_i = V_f\beta_i + \omega_f F_i^{old} \tag{8}$$

$$\beta_i = \beta_i^{food} + \beta_i^{best} \tag{9}$$

where $V_f$ is the foraging speed, taken as 0.02 (m s$^{-1}$) [41]; $\omega_f$ is the inertia weight in the range [0, 1]; and $\beta_i$ indicates the foraging direction, which consists of the food induction direction $\beta_i^{food}$ and the induction direction $\beta_i^{best}$ of the historically optimal krill individual. The food is essentially a virtual location defined using the concept of "centroid":

$$X^{food} = \frac{\sum_{i=1}^{N} \frac{1}{K_i} X_i}{\sum_{i=1}^{N} \frac{1}{K_i}} \tag{10}$$

(1) The induced direction of food on the ith krill individual is expressed as follows:

$$\beta_i^{food} = C^{food}\hat{K}_{i,food}\hat{X}_{i,food} \tag{11}$$

where $C^{food}$ is the food coefficient, determined as follows:

$$C^{food} = 2\left(1 - \frac{I}{I_{max}}\right) \tag{12}$$

(2) The induced direction of the historical best krill individual on the ith krill individual is expressed as follows:

$$\beta_i^{best} = \hat{K}_{i,best}\hat{X}_{i,best} \tag{13}$$

where $\hat{K}_{i,best}$ represents the influence of the historical best individual on the ith krill individual.
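The foraging terms in equations (8)–(13) can be sketched as follows. This is an illustrative sketch with assumed names: `food_position` and `foraging_motion` are hypothetical helpers, the inertia weight `omega_f=0.5` is an arbitrary value in [0, 1], fitness is assumed to be minimized and nonzero, and the caller is assumed to evaluate the fitness `k_food` of the virtual food position.

```python
import math


def unit_direction(x_from, x_to, eps=1e-12):
    """Normalized direction from x_from to x_to, as in equation (4)."""
    diff = [b - a for a, b in zip(x_from, x_to)]
    scale = math.sqrt(sum(c * c for c in diff)) + eps
    return [c / scale for c in diff]


def food_position(positions, fitness):
    """Equation (10): centroid of the herd weighted by 1 / K_i.
    Assumes every fitness value is nonzero."""
    weights = [1.0 / k for k in fitness]
    total = sum(weights)
    dim = len(positions[0])
    return [sum(w * p[m] for w, p in zip(weights, positions)) / total
            for m in range(dim)]


def foraging_motion(x_i, k_i, x_food, k_food, x_ibest, k_ibest,
                    k_best, k_worst, f_old, it, it_max,
                    v_f=0.02, omega_f=0.5):
    """Equations (8), (9), (11), (12), and (13) for one krill individual."""
    spread = (k_worst - k_best) or 1.0           # assumption: guard against 0/0
    c_food = 2.0 * (1.0 - it / it_max)           # equation (12)
    k_hat_food = (k_i - k_food) / spread
    k_hat_ibest = (k_i - k_ibest) / spread
    to_food = unit_direction(x_i, x_food)
    to_ibest = unit_direction(x_i, x_ibest)
    # Equation (9): food attraction (11) plus historical-best attraction (13).
    beta = [c_food * k_hat_food * f + k_hat_ibest * b
            for f, b in zip(to_food, to_ibest)]
    # Equation (8): combine with the previous foraging motion.
    return [v_f * b + omega_f * o for b, o in zip(beta, f_old)]
```

Because the weights are $1/K_i$, krill with lower (better) fitness pull the virtual food toward themselves, and the food coefficient fades linearly from 2 to 0 over the run.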

3.1.3. Physical Diffusion Motion. Physical diffusion is a stochastic process. It is expressed as follows:

$$D_i = D^{max}\left(1 - \frac{I}{I_{max}}\right)\delta \tag{14}$$

where $D^{max}$ is the maximum diffusion velocity in the range [0.002, 0.010] (m s$^{-1}$); according to [41], it is taken as 0.005 (m s$^{-1}$). $\delta$ represents the random direction vector, whose elements are random values in [−1, 1].

Figure 3: The framework of KH algorithm. (Flowchart: the three actions of a krill individual, namely movement induced by other krill individuals, foraging movement, and physical diffusion movement, followed by the crossover operation, updating position, and calculating the fitness function.)
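A minimal sketch of the linear diffusion term in equation (14) is given below. The function name `linear_diffusion` and the use of Python's `random.uniform` to draw each component of $\delta$ are assumptions made for illustration.

```python
import random


def linear_diffusion(dim, it, it_max, d_max=0.005, rng=random):
    """Equation (14): D_i = D_max * (1 - I / I_max) * delta,
    with each component of delta drawn uniformly from [-1, 1]."""
    scale = d_max * (1.0 - it / it_max)
    return [scale * rng.uniform(-1.0, 1.0) for _ in range(dim)]
```

The amplitude bound shrinks linearly from `d_max` at the first iteration to zero at the last one, independently of the fitness of the individual.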

3.1.4. Crossover. The crossover operator is an effective global optimization strategy. An adaptive vectorization crossover scheme is added to the standard KH algorithm to further enhance the global search ability of the algorithm [41]. It is given as follows:

$$X_{i,m} = \begin{cases} X_{i,m} \cdot Cr + X_{r,m} \cdot (1 - Cr), & rand_{i,m} < Cr \\ X_{i,m}, & \text{else} \end{cases} \qquad Cr = 0.2\hat{K}_{i,best} \tag{15}$$

where $r$ is a randomly chosen index with $r \in \{1, 2, \ldots, i-1, i+1, \ldots, N\}$, $X_{i,m}$ represents the mth dimension of the ith krill individual, $X_{r,m}$ represents the mth dimension of the rth krill individual, and $Cr$ is the crossover probability, which decreases as the fitness improves; the crossover probability of the globally optimal individual is zero.
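Equation (15), as written above, can be sketched as follows. The function name `crossover` and the choice to pass the normalized fitness difference $\hat{K}_{i,best}$ in as an argument are assumptions of this illustration.

```python
import random


def crossover(i, positions, k_hat_i_best, rng=random):
    """Equation (15): blend krill i with a random partner r != i,
    dimension by dimension, with probability Cr = 0.2 * K_hat_{i,best}."""
    cr = 0.2 * k_hat_i_best
    partner = rng.choice([r for r in range(len(positions)) if r != i])
    x_i, x_r = positions[i], positions[partner]
    return [xi * cr + xr * (1.0 - cr) if rng.random() < cr else xi
            for xi, xr in zip(x_i, x_r)]
```

For the globally best krill, $\hat{K}_{i,best} = 0$, so `cr` is zero and the position is returned unchanged, which matches the statement that the globally optimal crossover probability is zero.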

3.1.5. Movement Process of KH Algorithm. Affected by the movement induced by other krill individuals, foraging activity, and physical diffusion, the krill herd changes its position towards the direction of optimal fitness. The position vector of a krill individual over the interval $[t, t + \Delta t]$ is described as follows:

$$X_i(t + \Delta t) = X_i(t) + \Delta t\,\frac{dX_i}{dt} \tag{16}$$

where $\Delta t$ is the scaling factor of the velocity vector. It depends entirely on the search space:

$$\Delta t = C_t \sum_{j=1}^{NV} \left(UB_j - LB_j\right) \tag{17}$$

where $NV$ represents the dimension of the decision variables, $LB_j$ and $UB_j$ are the lower and upper bounds of the jth variable ($j = 1, 2, \ldots, NV$), and $C_t$ is the step scaling factor in the range [0, 2].
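The position update of equations (1), (16), and (17) amounts to one explicit step along the summed velocity. The sketch below uses hypothetical helper names and an arbitrary `c_t=0.5` from the stated range [0, 2].

```python
def time_step(lower, upper, c_t=0.5):
    """Equation (17): Delta t = C_t * sum_j (UB_j - LB_j)."""
    return c_t * sum(u - l for l, u in zip(lower, upper))


def update_position(x, n, f, d, dt):
    """Equations (1) and (16): X(t + dt) = X(t) + dt * (N + F + D)."""
    return [xi + dt * (ni + fi + di) for xi, ni, fi, di in zip(x, n, f, d)]
```

For a feature selection problem in which every component lies in [0, 1], the sum of the ranges equals the number of features, so $\Delta t$ grows with the dimension of the dataset.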

3.2. The LNNLS-KH Algorithm. In view of the unbalanced exploitation and exploration abilities of the KH algorithm, we propose the LNNLS-KH algorithm for feature selection to improve performance and to pursue a high accuracy rate, a high detection rate, and a low false positive rate in intrusion detection. The improvement is reflected in the following three aspects.

3.2.1. A New Fitness Evaluation Function. To improve the classification accuracy of feature subset detection, we introduce the feature selection dimension and the classification accuracy into the fitness evaluation function. The specific expression of fitness is as follows:

$$fitness = \alpha \cdot \frac{Feature_{selected}}{Feature_{all}} + (1 - \alpha) \cdot (1 - Accuracy) \tag{18}$$

where $\alpha \in [0, 1]$ is a weighting factor used to tune the relative importance of the number of selected features and the classification accuracy, $Feature_{selected}$ is the number of selected features, $Feature_{all}$ represents the total number of features, and $Accuracy$ indicates the accuracy of the classification results. Moreover, k-nearest neighbor (KNN) is used as the classification algorithm, and the classification accuracy is defined as follows:

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \tag{19}$$

where TP, TN, FP, and FN are defined in the confusion matrix, as shown in Table 2.
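Equations (18) and (19) translate directly into code. In the sketch below the function names and the default `alpha=0.5` are assumptions; this section does not state the value of $\alpha$ used in the experiments.

```python
def accuracy(tp, tn, fp, fn):
    """Equation (19): fraction of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)


def fitness(n_selected, n_all, acc, alpha=0.5):
    """Equation (18): lower is better. The first term penalizes large
    feature subsets; the second term penalizes classification error."""
    return alpha * n_selected / n_all + (1.0 - alpha) * (1.0 - acc)
```

As a usage illustration, a subset of 7 out of 41 features with 90% accuracy scores `fitness(7, 41, 0.9)`, and removing a feature without losing accuracy always lowers the value, which is the behavior the weighting is meant to reward.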

3.2.2. Nonlinear Optimization of Physical Diffusion Motion. The physical diffusion of the krill herd is a random diffusion process. The closer the individuals are to the food, the less random the movement is. Due to the strong convergence of the algorithm, the movement of krill individuals changes nonlinearly from fast to slow, and the fitness function gradually decreases as the algorithm converges. According to equations (2) and (9), the movement induced by other krill individuals and the foraging activity are nonlinear. In the physical diffusion equation (14), however, the diffusion velocity $D_i$ of the ith krill individual decreases linearly as the number of iterations increases. To fit the nonlinear motion of the krill herd, we introduce the optimization coefficient $\lambda$ and the fitness factor $\mu_{fit}$ of the krill herd into the physical diffusion motion. The optimized physical diffusion motion is defined as follows:

$$D_i = D^{max}\left[1 - \lambda\frac{I}{I_{max}} - (1 - \lambda)\mu_{fit}\right]\delta \tag{20}$$

where $\lambda$ is in the range [0, 1] and $\mu_{fit}$ is defined as follows:

$$\mu_{fit} = \frac{K^{best}}{K_i} \tag{21}$$

where $K^{best}$ is the fitness value of the current optimal individual and $K_i$ represents the fitness value of the ith krill individual. As the number of iterations increases, $K_i$ gradually decreases until it approaches $K^{best}$. Therefore, $\mu_{fit}$ is in the range (0, 1]. Substituting the fitness factor $\mu_{fit}$ into equation (20) gives the new physical diffusion motion equation:

$$D_i = D^{max}\left[1 - \lambda\frac{I}{I_{max}} - (1 - \lambda)\frac{K^{best}}{K_i}\right]\delta \tag{22}$$

According to equation (22), the iteration number $I$, the fitness $K_i$ of the krill individual, and the fitness $K^{best}$ of the current optimal krill individual jointly determine the physical diffusion motion, so as to further adjust the random diffusion amplitude. In the early stage of the algorithm, the number of iterations is small and the fitness value of the individual is large, so the fitness factor is small, which is conducive to a large random diffusion of the krill herd. As the number of iterations gradually increases, the algorithm converges quickly, and the fitness of krill individuals approaches the global optimal solution. At the same time, the fitness factor increases nonlinearly, which makes the random diffusion more consistent with the movement process of krill individuals.
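The effect of equation (22) on the diffusion amplitude can be sketched as follows. The function names and the default `lam=0.5` are assumptions (the section only states that $\lambda \in [0, 1]$), and $K_i$ is assumed to be positive.

```python
import random


def diffusion_amplitude(it, it_max, k_best, k_i, lam=0.5, d_max=0.005):
    """Bracketed factor of equation (22) times D_max.
    mu_fit = K_best / K_i lies in (0, 1] when fitness is positive and minimized."""
    mu_fit = k_best / k_i
    return d_max * (1.0 - lam * it / it_max - (1.0 - lam) * mu_fit)


def nonlinear_diffusion(dim, it, it_max, k_best, k_i,
                        lam=0.5, d_max=0.005, rng=random):
    """Equation (22): amplitude times a random direction vector in [-1, 1]^dim."""
    scale = diffusion_amplitude(it, it_max, k_best, k_i, lam, d_max)
    return [scale * rng.uniform(-1.0, 1.0) for _ in range(dim)]
```

Setting `lam=1.0` recovers the linear schedule of equation (14), while smaller values let a krill that is still far from the best fitness keep a larger diffusion amplitude at the same iteration.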

To further evaluate the effect of the nonlinear optimization of the physical diffusion motion on the KH algorithm (NO-KH), we conducted experiments on two classical benchmark functions. $F_1(x)$ is the Schwefel 2.22 function, which is a unimodal benchmark function; $F_2(x)$ is the Ackley function, which is a multimodal benchmark function. The experimental parameters of $F_1(x)$ and $F_2(x)$ are shown in Table 3.

Figure 4 shows the graphs of the Schwefel 2.22 function and the Ackley function for n = 2. We use the standard KH algorithm and the NO-KH algorithm to find the optimal value on both the unimodal and the multimodal benchmark function. The number of krill and the number of iterations are set to 25 and 500, respectively. Table 4 shows the best value, worst value, mean value, and standard deviation obtained by running each algorithm 20 times. Compared with the standard KH algorithm, the NO-KH algorithm finds smaller optimal values on both the unimodal and the multimodal benchmark function, which indicates that its global exploration ability is improved. The smaller standard deviation obtained from the repeated experiments shows that the NO-KH algorithm has better stability. Therefore, the nonlinear optimization of the physical diffusion motion of the KH algorithm is effective.

The above analysis shows that introducing the optimization coefficient $\lambda$ and the fitness factor $\mu_{fit}$ into the physical diffusion motion of the krill herd is conducive to dynamically adjusting the random diffusion amplitude of the krill individuals and accelerating the convergence of the algorithm. Meanwhile, it increases the nonlinearity of the physical diffusion motion and the global exploration ability of the algorithm.

3.2.3. Linear Nearest Neighbor Lasso Step Optimization. When the KH algorithm is used to solve multidimensional complex function optimization problems, its local search ability is weak, and exploitation and exploration are difficult to balance. To enhance the local exploitation and global exploration abilities of the algorithm, the influence of excellent neighbor individuals on the krill herd during evolution is considered, and an improved KH algorithm is proposed in [42]. That algorithm introduces the nearest neighbor lasso operator to mine the neighborhood of potentially excellent individuals and improve the local search ability of krill individuals, but the random parameters introduced in the lasso operator increase the uncertainty of the algorithm. To cope with this problem, we introduce an improved krill herd algorithm based on linear nearest neighbor lasso step optimization (LNNLS-KH), which finds the nearest neighbor of each krill individual after the position update and moves linearly by a defined step to obtain a better fitness value. Through linearization, the nearest neighbor lasso step changes linearly with the number of iterations, thereby balancing the exploitation and exploration abilities of the algorithm. In the early iterations, a large linear nearest neighbor lasso step is selected so that krill individuals can quickly adjust their positions, which improves the search efficiency of the algorithm. In the later stage of iteration, the nearest neighbor lasso step decreases linearly to obtain the global optimal solution.

In the krill herd $X = \{X_1, X_2, \ldots, X_n\}$, assuming that the jth krill individual is the nearest neighbor of the ith krill individual, the Euclidean distance between the two krill individuals is defined as follows:

$$distance_{ij} = \lVert X_i - X_j \rVert \tag{23}$$

where $X_i$ and $X_j$ belong to the krill herd and $i \neq j$. The linear nearest neighbor lasso step is defined as follows:

$$step = \begin{cases} \dfrac{I}{I_{max}} \times \left(X_i - X_j\right), & K_i > K_j \\[2ex] \dfrac{I}{I_{max}} \times \left(X_j - X_i\right), & K_j > K_i \end{cases} \tag{24}$$

The fitness function is given by equation (18). Therefore, a smaller fitness value means that fewer features are selected at a higher accuracy, i.e., the position of the krill individual is better. The schematic diagram of LNNLS-KH is shown in Figure 5. The new position $Y_k$ of the jth krill individual is expressed as follows:

$$Y_k = \begin{cases} X_j + \dfrac{I}{I_{max}} \times \left(X_i - X_j\right), & K_i > K_j \\[2ex] X_i + \dfrac{I}{I_{max}} \times \left(X_j - X_i\right), & K_j > K_i \end{cases} \tag{25}$$

When the ith and jth krill individuals lie on opposite sides of the food, the new position $Y_k$ can end up far from the optimal solution after the linear nearest neighbor lasso step optimization, as shown in Figure 6.
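Equations (23)–(25) can be sketched as below, following the formulas as they are written in this section. The function names are hypothetical, the tie case $K_i = K_j$ is not covered by equation (25) and is handled here by returning $X_i$ unchanged, and any rule for accepting or rejecting $Y_k$ (for example, keeping it only if its fitness improves) would be a further assumption that is not shown.

```python
import math


def nearest_neighbor(i, positions):
    """Index j != i minimizing the Euclidean distance of equation (23)."""
    return min((j for j in range(len(positions)) if j != i),
               key=lambda j: math.dist(positions[i], positions[j]))


def lnnls_candidate(x_i, k_i, x_j, k_j, it, it_max):
    """Equation (25): start from the krill with the lower fitness value and
    move by the lasso step of equation (24), scaled by I / I_max."""
    ratio = it / it_max
    if k_i > k_j:
        return [b + ratio * (a - b) for a, b in zip(x_i, x_j)]
    if k_j > k_i:
        return [a + ratio * (b - a) for a, b in zip(x_i, x_j)]
    return list(x_i)  # assumption: no move when the two fitness values are equal
```

The candidate always lies on the segment between the two krill, so when the food sits outside that segment or the pair straddles it, the candidate may not improve on either endpoint.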

Table 2: Confusion matrix.

| Confusion matrix | True condition positive | True condition negative |
|---|---|---|
| Predicted condition positive | True positive (TP) | False positive (FP) |
| Predicted condition negative | False negative (FN) | True negative (TN) |


Table 3: Benchmark functions in the experiment.

| Benchmark function | Dim | Range | $f_{min}$ |
|---|---|---|---|
| $F_1(x) = \sum_{i=1}^{n} \lvert x_i \rvert + \prod_{i=1}^{n} \lvert x_i \rvert$ | 10 | [−10, 10] | 0 |
| $F_2(x) = -20\exp\left(-0.2\sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}\right) - \exp\left(\frac{1}{n}\sum_{i=1}^{n}\cos(2\pi x_i)\right) + 20 + e$ | 10 | [−32, 32] | 0 |
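For reference, the two benchmark formulas of Table 3 can be written as short Python functions. The function names follow the standard names of these formulas (Schwefel 2.22 for the sum-plus-product form, Ackley for the exponential form); they are labels chosen for this sketch.

```python
import math


def schwefel_2_22(x):
    """F1: sum of absolute values plus product of absolute values (unimodal)."""
    return sum(abs(v) for v in x) + math.prod(abs(v) for v in x)


def ackley(x):
    """F2: Ackley function in its standard form (multimodal)."""
    n = len(x)
    mean_sq = sum(v * v for v in x) / n
    mean_cos = sum(math.cos(2.0 * math.pi * v) for v in x) / n
    return (-20.0 * math.exp(-0.2 * math.sqrt(mean_sq))
            - math.exp(mean_cos) + 20.0 + math.e)
```

Both functions reach their minimum value of 0 at the origin, which is the $f_{min}$ listed in the table.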

Figure 4: Graphs of the two benchmark functions for n = 2, plotted over $x_1$ and $x_2$: (a) unimodal benchmark function $F_1$ (Schwefel 2.22); (b) multimodal benchmark function $F_2$ (Ackley).

Table 4: The statistical results of KH and NO-KH algorithms on two benchmark functions.

| f(x) | Algorithm | Best value | Worst value | Mean value | Standard deviation |
|---|---|---|---|---|---|
| $F_1$ | KH | 1.692E-04 | 1.099E-02 | 1.508E-03 | 3.342E-03 |
| $F_1$ | NO-KH | 3.277E-05 | 9.632E-04 | 4.221E-04 | 3.908E-04 |
| $F_2$ | KH | 5.716E-05 | 2.168 | 0.329 | 0.816 |
| $F_2$ | NO-KH | 8.309E-06 | 1.155 | 0.116 | 0.362 |

Figure 5: Optimization of linear nearest neighbor lasso step for krill individuals at the same end of food. (Legend: the position of food; the position of krill $X_i$; the position of the new krill after LNNLS; the distance between two krill; the length of LNNLS.)

Figure 6: Optimization of linear neighboring lasso step for krill individuals at both ends of food.


The pseudocode of the LNNLS-KH algorithm is shown in Algorithm 1.

3.3. Analysis of Time Complexity. In the KH algorithm, each krill individual updates its position after the movement induced by other krill individuals, the foraging activity, and the physical diffusion motion, with a time complexity of O(N). After Imax iterations, the time complexity of the algorithm is O(Imax · N). In the LNNLS-KH algorithm, the modified fitness function and the nonlinear optimization of the physical diffusion motion add hardly any calculations, so the time complexity is not changed. In addition, the linear nearest neighbor lasso step optimization adds the calculations of equations (24) and (25) after each krill individual completes its position update in every iteration, which costs a further O(Imax · N). Therefore, the total time complexity of the LNNLS-KH algorithm is O(2Imax · N), which is of the same order as that of the KH algorithm.

3.4. Description of the LNNLS-KH Algorithm for IDS Feature Selection. An IDS is a system that recognizes and processes malicious usage of computers and network resources. The intrusion detection dataset records normal and abnormal traffic, including network traffic data and types of network attacks, and provides data support for the research and development of intrusion detection technology. An IDS is generally composed of data acquisition, data preprocessing, detection units, and response actions, as shown in Figure 7.

The LNNLS-KH algorithm is used to select high-quality feature subsets for the IDS. The features of the intrusion detection dataset are randomly initialized to real numbers in the range [0, 1], which constitute the position vectors of the krill herd. By calculating the fitness function and carrying out the LNNLS-KH algorithm, the position vectors of the krill herd are constantly updated. The fitness function is determined by the number of selected features and the classification accuracy, so the position vectors of the krill herd move toward the optimal fitness value. According to [47], it is appropriate to set the feature selection threshold to 0.7. When the maximum number of iterations is reached, the features whose components in the position vector are larger than the threshold are selected. The selected features constitute the feature subset of the intrusion detection data. The selected feature subset is then sent to the detection units. Because the K-Nearest Neighbor (KNN) algorithm is relatively mature in theory, the detection units adopt it to construct the intrusion detection classifier. Finally, the intrusion detection results are evaluated on the test dataset. The process of the LNNLS-KH algorithm for IDS feature selection is shown in Figure 8.
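The thresholding step described above, which turns a real-valued position vector into a feature subset, can be sketched as follows. This is a minimal sketch under stated assumptions: the function name is hypothetical, a strict "larger than" comparison is used as in the text, and the returned indices are zero-based (the tables in this paper number features from 1).

```python
import numpy as np


def select_features(position, threshold=0.7):
    """Return zero-based indices of features whose position component exceeds the threshold."""
    position = np.asarray(position, dtype=float)
    return np.flatnonzero(position > threshold)
```

For a data matrix `X` with one column per feature, `X[:, select_features(position)]` would then give the reduced dataset that is passed to the KNN classifier.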

4. Results and Discussion

To verify the performance of the LNNLS-KH algorithm in IDS feature selection, we adopt the NSL-KDD network intrusion detection dataset and the CICIDS2017 dataset for experiments.

4.1. Datasets Analysis. The NSL-KDD dataset is a classic dataset in the field of anomaly detection. As an improved version of the KDD CUP 99 dataset, it is currently one of the most reliable and influential intrusion detection datasets. Compared with the KDD CUP 99 dataset, the NSL-KDD dataset eliminates duplicate data, so it contains hardly any redundant records. Meanwhile, the proportion of each type of record in the NSL-KDD dataset has been adjusted to be reasonable. Each record in the NSL-KDD dataset includes 41-dimensional features and a classification label. KDDTrain+ and KDDTest+ in the NSL-KDD dataset are selected as the training subset and the test subset. The attacks are divided into four types: denial of service (DoS), scan and probe (Probe), remote to local (R2L), and user to root (U2R). The detailed attack names and the distribution of sample categories are shown in Tables 5 and 6. The features of the NSL-KDD dataset are shown in Table 7.

The NSL-KDD dataset includes four types of features: the basic features of TCP connections (9 in total), the content features of TCP connections (13 in total), the time-based network traffic statistics (9 in total), and the host-based network traffic statistics (10 in total). Among all the features, "Protocol_type," "service," and "flag" are character-type features, which need to be preprocessed and mapped to numerical values. Because mixed numeric and character data types are difficult to handle, one-hot encoding is used to map different characters to different values. For example, the "Protocol_type" feature includes three types of protocol, denoted by icmp = [1, 0, 0], tcp = [0, 1, 0], and udp = [0, 0, 1]. Similarly, the 70 attributes in "service" and the 11 attributes in "flag" are numeralized in the same way. The 41-dimensional feature vector is expanded to 122 dimensions after one-hot encoding. At the same time, the dataset is normalized to eliminate the influence of features of different orders of magnitude on the calculation results, thus reducing the experimental error. The data preprocessing helps to improve the accuracy of classification and ensure the reliability of the results. The values corresponding to each feature are normalized to the interval [0, 1], and the normalization expression is as follows:

$$X^{*} = \frac{X - X_{\min}}{X_{\max} - X_{\min}} \tag{26}$$

where X* is the normalized feature value, X is the original feature value, and Xmax and Xmin represent the maximum and minimum values of the same feature dimension.
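The preprocessing described above (one-hot encoding of character features followed by min-max normalization to [0, 1]) can be illustrated with a short sketch. The helper names are hypothetical, the category order is taken from the icmp/tcp/udp example in the text, and mapping a constant column to zeros is an assumption made here to avoid division by zero.

```python
import numpy as np

# Category order as in the example: icmp = [1, 0, 0], tcp = [0, 1, 0], udp = [0, 0, 1].
PROTOCOLS = ["icmp", "tcp", "udp"]


def one_hot(value, categories):
    """Encode one categorical value as a 0/1 vector over the given categories."""
    vector = [0] * len(categories)
    vector[categories.index(value)] = 1
    return vector


def min_max_normalize(column):
    """Scale one feature column to [0, 1] as (X - Xmin) / (Xmax - Xmin)."""
    column = np.asarray(column, dtype=float)
    low, high = column.min(), column.max()
    if high == low:
        # Constant feature: return zeros (assumption, not specified in the text).
        return np.zeros_like(column)
    return (column - low) / (high - low)


# 38 numeric features plus 3 + 70 + 11 one-hot columns give 122 dimensions.
expanded_dimension = (41 - 3) + 3 + 70 + 11
```

The last line only re-derives the 122-dimensional size quoted in the text from the category counts.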

Although NSL-KDD is a benchmark dataset in the field of network intrusion detection, some of its attack types are outdated due to the rapid development of network technology. Therefore, it hardly reflects the current real network environment. CICIDS2017 is a novel network intrusion detection dataset released by the Canadian Institute for

Figure 7: The framework of IDS.


Cybersecurity (CIC) in 2017. The dataset collected traffic data for five days, with only normal traffic on Monday and attacks occurring in the morning and afternoon from Tuesday to Friday. It includes "FTP patator," "SSH patator," "DoS GoldenEye," "DoS Slowhttptest," "DoS Slowloris," "Heartbleed," "Web Attack Brute Force," "Web Attack Sql Injection," "Web Attack XSS," "Infiltration Attack," "Bot," "DDoS," and "PortScan," which are common types of attacks

Figure 8: The process of LNNLS-KH algorithm for IDS feature selection.


in modern networks. The distribution of attack time and types in the CICIDS2017 dataset is shown in Table 8. We use the MachineLearningCVE file in the CICIDS2017 dataset, which contains 78 features and an attack type label. The numbers and names of the features are shown in Table 9. Compared with the NSL-KDD dataset, the attack types in the CICIDS2017 dataset are more in line with the situation of modern networks.

4.2. Experimental Results and Discussion of NSL-KDD Dataset. The experiment is conducted in MATLAB R2016a on a 64-bit Windows operating system with an Intel(R) Core(TM) i7-4790 processor. Since the training of the algorithm requires both normal and abnormal samples, we mix normal samples with each type of attack samples to construct the training sets and test sets for the four attack types. In order to reduce the time of searching the optimal feature

Input: Training set
Output: Global best solution, the number of selected features, and feature selection time

Begin
  Initialize algorithm parameters Nmax, Vf, Dmax, NV, Imax, UB, LB
  Initialize the krill herd position
  Evaluate the fitness of krill individuals and find the individuals with the best and worst fitness values
  for I = 1 to Imax do
    for each krill individual i (i = 1, 2, ..., m) do
      Calculate the three components of motion:
        (1) The motion induced by other krill individuals
        (2) The foraging activity
        (3) The nonlinear optimized physical diffusion
      Implement crossover operator
      Update krill herd position and fitness values
      Calculate the linear nearest neighbor lasso step and new position using equations (24) and (25) and update new fitness values
      if Kyk > Ki (or Kj)
        Leave Ki (or Kj) and delete Kyk
      else
        Leave Kyk and delete Ki (or Kj)
      end if
    end for
    Update Xgb and Kgb of the globally optimal individuals
  end for
  Output the global best solution, the number of selected features, and feature selection time
End

Algorithm 1: The LNNLS-KH algorithm.

Table 5: The distribution of sample categories.

| Attack types | Attack names |
|---|---|
| DoS | Neptune, back, land, pod, smurf, teardrop, mailbomb, Apache2, processtable, udpstorm, worm |
| Probe | Ipsweep, nmap, portsweep, Satan, mscan, saint |
| R2L | ftp_write, guess_passwd, imap, multihop, phf, spy, warezclient, warezmaster, sendmail, named, snmpgetattack, snmpguess, xlock, xsnoop, httptunnel |
| U2R | buffer_overflow, loadmodule, perl, rootkit, ps, sqlattack, xterm |

Table 6: The distribution of sample categories.

| Data category | KDDTrain+ samples | KDDTest+ samples | Total number of samples |
|---|---|---|---|
| Normal | 65120 | 11536 | 76656 |
| DoS | 36944 | 6251 | 43195 |
| Probe | 10786 | 2421 | 13207 |
| R2L | 995 | 2653 | 3648 |
| U2R | 52 | 67 | 119 |
| All | 113897 | 22928 | 136825 |


subset, we randomly select 50% of the Probe attack samples, 10% of the DoS attack samples, 100% of the U2R attack samples, and 100% of the R2L attack samples in the KDDTrain+ dataset as the training dataset, and 100% of the Probe samples, 50% of the DoS samples, 100% of the U2R samples, and 20% of the R2L samples in the KDDTest+ dataset as the test dataset.

For the LNNLS-KH algorithm, the maximum number of iterations Imax and the number of krill individuals N are set to 100 and 30, respectively. Following [41], the foraging speed of krill individuals Vf is set to 0.02, the maximum random diffusion rate Dmax is set to 0.05, and the maximum induction speed Nmax is set to 0.01. Following [47], the threshold θ is set to 0.7. As the LNNLS-KH algorithm is designed to ensure high accuracy first and to reduce the number of features second, the weight factor α in the fitness function is set to 0.02.

$$\mathrm{FPR} = \frac{FP}{TN + FP} \tag{27}$$

$$\mathrm{DR} = \frac{TP}{TP + FN} \tag{28}$$

We adopt the iteration curve of the global optimal fitness value, feature selection time, test set detection time, data dimension after feature selection, classification accuracy, detection rate (DR), and false positive rate (FPR) as evaluation measures of feature selection for IDS. The accuracy is the ratio of correctly classified samples to the total number of samples, as defined in equation (19). FPR, also known as false alarm rate (FAR), is the ratio of normal samples that are incorrectly detected as intrusions to all normal samples, as shown in

Table 7: The features of NSL-KDD dataset.

| Classification of features | Number | Serial number and name of features |
|---|---|---|
| The basic characteristics of TCP connections | 9 | (1) duration, (2) protocol_type, (3) service, (4) flag, (5) src_bytes, (6) dst_bytes, (7) land, (8) wrong_fragment, (9) urgent |
| The content characteristics of a TCP connection | 13 | (10) hot, (11) num_failed_logins, (12) logged_in, (13) num_compromised, (14) root_shell, (15) num_root, (16) su_attempted, (17) num_file_creations, (18) num_shells, (19) num_access_files, (20) num_outbound_cmds, (21) is_host_login, (22) is_guest_login |
| Time-based statistical characteristics of network traffic | 9 | (23) count, (24) srv_count, (25) serror_rate, (26) srv_serror_rate, (27) rerror_rate, (28) srv_rerror_rate, (29) same_srv_rate, (30) diff_srv_rate, (31) srv_diff_host_rate |
| Host-based network traffic statistics | 10 | (32) dst_host_count, (33) dst_host_srv_count, (34) dst_host_same_srv_rate, (35) dst_host_diff_srv_rate, (36) dst_host_same_src_port_rate, (37) dst_host_srv_diff_host_rate, (38) dst_host_serror_rate, (39) dst_host_srv_serror_rate, (40) dst_host_rerror_rate, (41) dst_host_srv_rerror_rate |

Table 8: Attack time and attack types of the CICIDS2017 dataset.

| Time | Type | Label | Amount | Total |
|---|---|---|---|---|
| Monday | Normal | BENIGN | 529918 | 529918 |
| Tuesday | Normal | BENIGN | 432074 | 445909 |
| | Brute force | FTP patator | 7938 | |
| | | SSH patator | 5897 | |
| Wednesday | Normal | BENIGN | 440031 | 692703 |
| | DoS | DoS GoldenEye | 10293 | |
| | | DoS slowhttptest | 5499 | |
| | | DoS slowloris | 5796 | |
| | | Heartbleed | 11 | |
| Thursday morning | Normal | BENIGN | 168186 | 170366 |
| | Web attack | Web attack brute force | 1507 | |
| | | Web attack sql injection | 21 | |
| | | Web attack XSS | 652 | |
| Thursday afternoon | Normal | BENIGN | 288566 | 288602 |
| | Infiltration | Infiltrationdnt | 36 | |
| Friday morning | Normal | BENIGN | 189067 | 191033 |
| | Botnet | Bot | 1966 | |
| Friday afternoon (1) | Normal | BENIGN | 97718 | 225745 |
| | DDoS | DDoS | 128027 | |
| Friday afternoon (2) | Normal | BENIGN | 127537 | 286467 |
| | PortScan | PortScan | 158930 | |


equation (27). DR, also known as recall or sensitivity, is the probability that an abnormal sample is correctly detected, as shown in equation (28). The crossover-mutation PSO (CMPSO) algorithm [47], ACO algorithm [48], KH algorithm [41], and IKH algorithm [9] are used for comparison. The experimental results on the Probe, DoS, R2L, and U2R datasets are presented in the remainder of this section.
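The three classification measures used here follow directly from the confusion matrix in Table 2. A minimal sketch is given below; the function name and the example counts are illustrative only and are not taken from the experiments.

```python
def detection_metrics(tp, fp, fn, tn):
    """Return (accuracy, FPR, DR) computed from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)  # correctly classified / all samples
    fpr = fp / (tn + fp)                        # equation (27)
    dr = tp / (tp + fn)                         # equation (28)
    return accuracy, fpr, dr


# Hypothetical counts, for illustration only.
accuracy, fpr, dr = detection_metrics(tp=90, fp=5, fn=10, tn=95)
```

Note that FPR is normalized by the number of normal samples (TN + FP), whereas DR is normalized by the number of attack samples (TP + FN), so the two measures can be compared across datasets with different class balances.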

To reflect the performance of the LNNLS-KH algorithm intuitively, the convergence curves of the fitness function for the Probe, DoS, U2R, and R2L datasets are shown in Figure 9. The results show that the LNNLS-KH algorithm achieves a good fitness function value when the number of iterations reaches about 20, which demonstrates its strong exploitation ability and good convergence performance. As the number of iterations increases, the other algorithms show varying degrees of convergence stagnation, while the LNNLS-KH algorithm repeatedly jumps out of local optima and finds solutions with better fitness. The fitness function values after 100 iterations reach 0.0328, 0.0393, 0.0292, and 0.0036, respectively, for the four attack datasets, showing excellent exploration ability. Therefore, compared with the CMPSO, ACO, KH, and IKH algorithms, the LNNLS-KH algorithm exhibits faster convergence speed and stronger abilities of exploitation and exploration.

The results of different feature selection algorithms are shown in Table 10. The bold number in front of the brackets indicates the quantity of features after feature selection, and the specific feature numbers are listed in the brackets. The comparison of feature selection dimensions is shown in Figure 10, and different colours are used to distinguish the five algorithms. The proposed LNNLS-KH algorithm, marked in red, is in the innermost circle of Figure 10 for the Probe, DoS, U2R, and R2L datasets. This indicates that, compared with the other four feature selection algorithms, the LNNLS-KH algorithm retains the fewest features while ensuring accuracy. According to Figure 10, the LNNLS-KH algorithm selects 7 main features of the NSL-KDD dataset on average, accounting

Figure 9: Convergence curve of fitness functions for the four attack datasets.


for 17.07% of the total number of features. Compared with the CMPSO, ACO, KH, and IKH algorithms, the proposed LNNLS-KH algorithm reduces the number of features by 44%, 42.86%, 34.88%, and 24.32%, respectively, on the datasets of the four attack types. Overall, the number of features on the four attack datasets is reduced by 37.43% relative to the average of the other four algorithms.
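These percentages can be re-derived from the feature counts listed in Table 10. The short check below is a sketch added for illustration; the variable and function names are hypothetical, and the counts are copied from Table 10 in the order Probe, DoS, U2R, R2L.

```python
# Number of selected features per dataset (Probe, DoS, U2R, R2L), from Table 10.
selected = {
    "CMPSO": [14, 16, 9, 11],
    "ACO": [15, 16, 8, 10],
    "KH": [13, 12, 8, 10],
    "IKH": [11, 12, 6, 8],
    "LNNLS-KH": [8, 10, 3, 7],
}


def mean(values):
    return sum(values) / len(values)


def reduction_percent(baseline_mean, proposed_mean):
    """Relative reduction of the proposed method with respect to a baseline, in percent."""
    return 100 * (baseline_mean - proposed_mean) / baseline_mean


proposed = mean(selected["LNNLS-KH"])
share_of_all_features = 100 * proposed / 41
reductions = {
    name: round(reduction_percent(mean(counts), proposed), 2)
    for name, counts in selected.items()
    if name != "LNNLS-KH"
}
baseline_means = [mean(c) for n, c in selected.items() if n != "LNNLS-KH"]
overall = round(reduction_percent(mean(baseline_means), proposed), 2)
```

The check confirms that each reduction is computed on the average number of features over the four attack datasets, and that the overall figure compares LNNLS-KH with the mean of the four baseline averages.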

To further evaluate the performance of the feature selection algorithms, we show the feature selection time and detection time of the five algorithms in Table 11. The feature selection time is the time spent filtering out redundant features. The detection time is the time from inputting the most representative feature subset into the KNN classifier to the end of detection. It can be seen from Table 11 that the feature selection time of the standard KH algorithm is shorter than that of the CMPSO and ACO algorithms, which indicates that the KH algorithm is faster. Compared with the standard KH algorithm, the feature selection time of the LNNLS-KH algorithm is longer, which is mainly due to the nonlinear optimization of the physical diffusion motion and the linear nearest neighbor lasso step optimization performed after the krill herd position is updated. Although the feature selection time increases, the convergence speed and global search ability are greatly improved. At the same time, the LNNLS-KH algorithm removes redundant features, which considerably increases the detection speed. In comparison with the other four feature selection algorithms, the detection time of the LNNLS-KH algorithm is reduced by 16.83%, 16.91%, 8.94%, and 6.96% on average in the test dataset samples of Probe, DoS, R2L, and U2R.

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the test dataset is classified using the KNN classifier. The classification accuracy of different algorithms is shown in Table 12. Comparing the accuracy results, we find that the LNNLS-KH feature selection algorithm achieves a classification accuracy above 90% for the Probe, DoS, U2R, and R2L test

Table 9: The number and name of the features in the CICIDS2017 dataset.

| Feature number | Feature name | Feature number | Feature name | Feature number | Feature name |
|---|---|---|---|---|---|
| 1 | Destination port | 27 | Bwd IAT mean | 53 | Average packet size |
| 2 | Flow duration | 28 | Bwd IAT std | 54 | Avg fwd segment size |
| 3 | Total fwd packets | 29 | Bwd IAT max | 55 | Avg bwd segment size |
| 4 | Total backward packets | 30 | Bwd IAT min | 56 | Fwd header length |
| 5 | Total length of fwd packets | 31 | Fwd PSH flags | 57 | Fwd avg bytes/bulk |
| 6 | Total length of bwd packets | 32 | Bwd PSH flags | 58 | Fwd avg packets/bulk |
| 7 | Fwd packet length max | 33 | Fwd URG flags | 59 | Fwd avg bulk rate |
| 8 | Fwd packet length min | 34 | Bwd URG flags | 60 | Bwd avg bytes/bulk |
| 9 | Fwd packet length mean | 35 | Fwd header length | 61 | Bwd avg packets/bulk |
| 10 | Fwd packet length std | 36 | Bwd header length | 62 | Bwd avg bulk rate |
| 11 | Bwd packet length max | 37 | Fwd packets/s | 63 | Subflow fwd packets |
| 12 | Bwd packet length min | 38 | Bwd packets/s | 64 | Subflow fwd bytes |
| 13 | Bwd packet length mean | 39 | Min packet length | 65 | Subflow bwd packets |
| 14 | Bwd packet length std | 40 | Max packet length | 66 | Subflow bwd bytes |
| 15 | Flow bytes/s | 41 | Packet length mean | 67 | Init_Win_bytes_forward |
| 16 | Flow packets/s | 42 | Packet length std | 68 | Init_Win_bytes_backward |
| 17 | Flow IAT mean | 43 | Packet length variance | 69 | act_data_pkt_fwd |
| 18 | Flow IAT std | 44 | FIN flag count | 70 | min_seg_size_forward |
| 19 | Flow IAT max | 45 | SYN flag count | 71 | Active mean |
| 20 | Flow IAT min | 46 | RST flag count | 72 | Active std |
| 21 | Fwd IAT total | 47 | PSH flag count | 73 | Active max |
| 22 | Fwd IAT mean | 48 | ACK flag count | 74 | Active min |
| 23 | Fwd IAT std | 49 | URG flag count | 75 | Idle mean |
| 24 | Fwd IAT max | 50 | CWE flag count | 76 | Idle std |
| 25 | Fwd IAT min | 51 | ECE flag count | 77 | Idle max |
| 26 | Bwd IAT total | 52 | Down/up ratio | 78 | Idle min |

Figure 10: Comparison of feature selection dimensions produced by different algorithms.


dataset samples. Furthermore, relative to the mean accuracy of the other four algorithms, the LNNLS-KH algorithm improves the classification accuracy on the Probe, DoS, U2R, and R2L test dataset samples by 9.95%, 12.04%, 9.47%, and 8.66%, respectively.

Table 13 shows the false positive rate and detection rate of the feature subsets produced by different feature selection algorithms. To visualize the difference, we show the comparison in Figure 11. For the Probe, DoS, U2R, and R2L datasets, the average false positive rate of the LNNLS-KH feature selection algorithm is 4.00%, which is 20.70%, 15.30%, 8.88%, and 3.34% lower than that of the CMPSO, ACO, KH, and IKH algorithms, respectively. Similarly, for the detection rate, the proposed LNNLS-KH feature selection algorithm exhibits excellent performance. The average detection rate of

Table 10: The feature selection results of different feature selection algorithms (NSL-KDD dataset).

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Probe | 14 (2, 3, 4, 7, 8, 10, 11, 17, 19, 20, 21, 27, 30, 33) | 15 (1, 3, 4, 6, 15, 16, 17, 19, 21, 23, 29, 35, 39, 40, 41) | 13 (3, 4, 5, 7, 8, 13, 14, 18, 19, 21, 26, 28, 40) | 11 (2, 3, 5, 8, 10, 17, 18, 29, 34, 35, 41) | 8 (3, 4, 8, 11, 15, 29, 34, 40) |
| DoS | 16 (3, 4, 5, 6, 8, 13, 14, 17, 18, 22, 23, 26, 30, 32, 35, 41) | 16 (3, 4, 7, 12, 14, 19, 20, 25, 27, 28, 30, 33, 34, 37, 40, 41) | 12 (2, 3, 4, 5, 8, 9, 12, 15, 19, 24, 26, 30) | 12 (2, 3, 4, 6, 12, 18, 20, 22, 27, 28, 30, 31) | 10 (3, 4, 6, 15, 17, 19, 20, 21, 30, 37) |
| U2R | 9 (3, 4, 5, 9, 12, 19, 32, 33, 41) | 8 (3, 4, 6, 8, 20, 24, 33, 36) | 8 (3, 4, 10, 12, 19, 23, 31, 32) | 6 (3, 10, 11, 21, 36, 39) | 3 (3, 33, 36) |
| R2L | 11 (2, 3, 4, 8, 21, 22, 25, 27, 37, 40, 41) | 10 (3, 4, 7, 12, 17, 21, 29, 37, 38, 40) | 10 (2, 3, 4, 6, 13, 18, 19, 22, 32, 41) | 8 (3, 4, 5, 8, 11, 14, 21, 31) | 7 (2, 3, 4, 10, 15, 21, 36) |

Table 11: Feature selection time and detection time of different feature selection algorithms (NSL-KDD dataset).

| Data categories | Time of feature selection (second): CMPSO | ACO | KH | IKH | LNNLS-KH | Time of detection (second): CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|---|---|---|---|---|
| Probe | 523178 | 499814 | 474533 | 534887 | 549048 | 3713 | 3823 | 3530 | 3405 | 3106 |
| DoS | 789235 | 763086 | 716852 | 803816 | 829692 | 11869 | 11815 | 10666 | 10514 | 9844 |
| U2R | 15487 | 14729 | 14418 | 15779 | 17224 | 0087 | 0086 | 0086 | 0086 | 0078 |
| R2L | 255675 | 236908 | 224092 | 266951 | 272770 | 955 | 913 | 907 | 862 | 803 |

Table 12: The classification accuracy of different feature selection algorithms (NSL-KDD dataset).

| Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%) |
|---|---|---|---|---|---|
| Probe | 80.46 | 86.56 | 92.42 | 93.74 | 98.24 |
| DoS | 81.74 | 83.36 | 86.03 | 88.74 | 97.01 |
| U2R | 82.74 | 84.57 | 85.59 | 91.89 | 95.67 |
| R2L | 78.70 | 81.62 | 88.78 | 90.49 | 93.56 |

Figure 11: Comparison of classification FPR and DR of different feature selection algorithms. (a) FPR of different feature selection algorithms. (b) DR of different feature selection algorithms.


the LNNLS-KH algorithm is 96.48%, which is 13.47%, 9.32%, 7.02%, and 4.72% higher than that of the CMPSO, ACO, KH, and IKH feature selection algorithms, respectively.

In conclusion, the LNNLS-KH feature selection algorithm performs excellently in terms of the global optimal fitness iteration curve, test set detection time, number of dimensions of the feature subset, classification accuracy, false positive rate, and detection rate. Although the offline training time of the LNNLS-KH algorithm is longer than that of the CMPSO, ACO, KH, and IKH algorithms, its lower feature dimension reduces the detection time. Moreover, the algorithm has faster convergence speed, higher detection accuracy, a lower false positive rate, and a higher detection rate.

4.3. Experimental Results and Discussion of CICIDS2017 Dataset. The experiment is conducted in MATLAB R2016a on a 64-bit Windows operating system with an Intel(R) Core(TM) i7-4790 processor. The MachineLearningCVE file in the CICIDS2017 dataset includes 8 CSV files of all traffic data, which contain 78 features plus an attack type label after some duplicate features are removed. We annotate traffic records according to the different attack periods and types, and we standardize and normalize the dataset. Because the CSV files contain an excessive amount of data, training the model on the host would take too long and converge slowly. Therefore, we simplified and reintegrated these CSV data files while preserving the original attack timing features. We selected a total of 12090 records and 5 types of traffic, including 1 type of normal traffic and 4 types of attack traffic, namely "DoS," "DDoS," "PortScan," and "WebAttack." The data are randomly divided into training sets and test sets in a 2:1 ratio, with independent and repeated experiments.
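A random 2:1 division of the records can be sketched as follows. The function name, the fixed seed, and the use of record indices in place of actual traffic records are assumptions for illustration; the paper does not state how the split was implemented.

```python
import random


def split_two_to_one(records, seed=0):
    """Shuffle the records and split them into training and test parts in a 2:1 ratio."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility (assumption)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]


# Record indices stand in for the 12090 selected traffic records.
train, test = split_two_to_one(range(12090))
```

With 12090 records, this yields 8060 training records and 4030 test records. Repeating the split with different seeds corresponds to the independent, repeated experiments mentioned above.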

The CMPSO, ACO, KH, and IKH algorithms are used for comparison with the LNNLS-KH algorithm. The preprocessed Normal, DoS, DDoS, PortScan, and WebAttack subsets are input into the algorithm model successively, and the dimension and feature subsets of feature selection are obtained. We adopt the KNN classification model as the classifier and obtain the accuracy of intrusion detection on the test set data. The results of feature selection dimension for the CICIDS2017 dataset are shown in Table 14. The LNNLS-KH algorithm selects different features for different attack types. For example, the selected features of the DoS subset are "Total Length of Bwd Packets," "Fwd Packet Length Min," "Flow IAT Min," "FIN Flag Count," "RST Flag Count," "URG Flag Count," "Bwd Avg Packets/Bulk," "Idle Mean," and "Idle Std." For the WebAttack subset, "Total Fwd Packets," "Bwd IAT Max," "Bwd PSH Flags," "Fwd Packets/s," "Bwd Avg Packets/Bulk," "Subflow Fwd Bytes," "Active Max," and "Idle Max" are selected as attack features by the LNNLS-KH algorithm. It reduces the feature dimension of the IDS dataset while ensuring high accuracy. The average feature dimension selected by the LNNLS-KH algorithm is 10.2, accounting for 13.08% of the total number of features in the CICIDS2017 dataset. It decreases the

Table 13: The classification FPR and DR of different feature selection algorithms (NSL-KDD dataset).

| Data categories | FPR (%): CMPSO | ACO | KH | IKH | LNNLS-KH | DR (%): CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|---|---|---|---|---|
| Probe | 22.37 | 18.04 | 8.50 | 4.05 | 1.18 | 82.32 | 89.18 | 95.01 | 95.22 | 97.73 |
| DoS | 21.27 | 14.08 | 11.45 | 7.88 | 2.85 | 79.12 | 82.08 | 83.77 | 85.23 | 96.80 |
| U2R | 24.51 | 21.04 | 16.13 | 8.45 | 4.30 | 87.02 | 89.79 | 90.14 | 93.67 | 95.52 |
| R2L | 30.66 | 24.05 | 15.42 | 8.99 | 7.67 | 83.56 | 87.56 | 88.91 | 92.89 | 95.85 |

Figure 12: Comparison of feature selection time and intrusion detection time for different feature selection algorithms. (a) Feature selection time for different feature selection algorithms. (b) Intrusion detection time of different feature selection algorithms.


number of features by 57.85%, 52.34%, 27.14%, and 25%, respectively, compared with the CMPSO, ACO, KH, and IKH algorithms.

Figure 12 shows the feature selection time and intrusion detection time of the 5 feature selection algorithms to further evaluate their performance. It can be seen from Figure 12(a) that, in the feature selection stage, the LNNLS-KH algorithm takes a long time to find the optimal feature subset because of the linear nearest neighbor lasso step optimization performed after the position update of the krill herd. Compared with the KH and IKH algorithms, it increases the time by an average of 14.38% and 9.32%, respectively. Although the LNNLS-KH algorithm requires more calculation time, its convergence speed and global search ability are improved. Figure 12(b) shows the intrusion detection time of the 5 feature selection algorithms. This is the time the KNN classifier takes to classify the sample dataset after the feature subset has been found, excluding the time of searching for the optimal feature subset. The feature dimension of the LNNLS-KH algorithm is low, so less data is processed when classifying the detection sample dataset, which reduces the classification detection time. Compared with the CMPSO, ACO, KH, and IKH algorithms, the intrusion detection time of the LNNLS-KH algorithm is reduced by 6.52%, 5.17%, 2.14%, and 2.28% on average, respectively.

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the KNN classifier is used to classify the test dataset. The classification accuracy of different algorithms is shown in Table 15. For the five types of subsets, the average classification accuracy of the proposed LNNLS-KH algorithm is 95.86%. In particular, the classification accuracy reaches 97.55% for the PortScan subset. Compared with the other four feature selection methods, the LNNLS-KH algorithm has an average increase of 3.11%, 8.52%, 8.58%, 2.45%, and 4.29% on the

Table 14: The number of feature selection for different algorithms (CICIDS2017 dataset).

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Normal | 28 (3, 7, 13, 15, 16, 17, 20, 22, 24, 26, 30, 35, 37, 38, 42, 43, 44, 45, 46, 49, 50, 56, 59, 62, 63, 64, 65, 76) | 25 (1, 3, 4, 7, 10, 11, 12, 13, 15, 19, 29, 32, 34, 35, 37, 43, 46, 47, 51, 55, 56, 58, 73, 76, 78) | 14 (11, 19, 33, 39, 43, 49, 55, 56, 58, 65, 66, 68, 71, 73) | 14 (5, 10, 19, 20, 21, 23, 27, 33, 43, 56, 69, 70, 73, 78) | 8 (6, 12, 16, 32, 38, 50, 54, 73) |
| DoS | 24 (1, 3, 4, 13, 16, 17, 24, 26, 30, 33, 35, 39, 40, 44, 48, 51, 53, 57, 58, 59, 60, 62, 67, 70) | 19 (3, 6, 12, 13, 15, 26, 35, 39, 51, 55, 60, 61, 66, 69, 71, 73, 75, 77, 78) | 13 (8, 16, 21, 30, 45, 50, 52, 57, 59, 63, 66, 67) | 14 (2, 12, 15, 16, 19, 21, 32, 34, 44, 46, 65, 68, 76, 77) | 9 (6, 8, 20, 44, 46, 49, 61, 75, 76) |
| DDoS | 29 (15, 18, 19, 20, 23, 25, 26, 33, 34, 35, 38, 39, 42, 43, 46, 47, 49, 51, 55, 56, 57, 59, 60, 61, 62, 63, 71, 72, 78) | 27 (6, 9, 10, 13, 16, 19, 24, 28, 31, 41, 42, 45, 47, 48, 50, 51, 52, 53, 54, 56, 59, 60, 61, 62, 65, 68, 72) | 21 (10, 12, 13, 15, 18, 23, 27, 30, 34, 35, 41, 42, 45, 55, 61, 63, 65, 66, 68, 70, 76) | 18 (1, 11, 13, 14, 19, 24, 32, 35, 36, 40, 42, 47, 51, 57, 60, 69, 70, 75) | 14 (2, 5, 8, 9, 11, 22, 26, 33, 41, 43, 47, 51, 74, 77) |
| PortScan | 24 (1, 3, 6, 15, 16, 28, 30, 33, 35, 37, 44, 45, 52, 56, 59, 60, 61, 63, 65, 68, 70, 75, 77, 78) | 21 (1, 2, 6, 10, 15, 17, 26, 27, 29, 39, 42, 43, 46, 49, 58, 61, 66, 69, 70, 71, 76) | 14 (15, 20, 22, 27, 37, 44, 49, 50, 53, 59, 62, 65, 67, 78) | 15 (1, 24, 30, 32, 33, 43, 49, 53, 54, 58, 60, 61, 63, 64, 69) | 12 (2, 6, 15, 24, 25, 28, 32, 57, 59, 63, 66, 76) |
| WebAttack | 16 (2, 7, 26, 29, 45, 47, 50, 52, 53, 54, 63, 66, 68, 69, 72, 78) | 15 (3, 9, 10, 12, 19, 26, 40, 46, 50, 54, 64, 65, 68, 69, 73) | 8 (1, 17, 19, 36, 48, 49, 53, 60) | 7 (14, 17, 35, 39, 44, 48, 54) | 8 (3, 29, 32, 37, 61, 64, 73, 77) |

Table 15: The classification accuracy of different feature selection algorithms (CICIDS2017 dataset).

| Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%) |
|---|---|---|---|---|---|
| Normal | 89.78 | 89.06 | 92.70 | 94.58 | 94.64 |
| DoS | 77.03 | 82.69 | 90.90 | 93.34 | 94.51 |
| DDoS | 81.73 | 86.94 | 91.85 | 88.19 | 95.76 |
| PortScan | 92.38 | 95.64 | 95.05 | 97.35 | 97.55 |
| WebAttack | 89.12 | 93.08 | 93.77 | 94.26 | 96.85 |

Table 16: The classification FPR and DR of different feature selection algorithms (CICIDS2017 dataset).

| Data categories | FPR (%): CMPSO | ACO | KH | IKH | LNNLS-KH | DR (%): CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|---|---|---|---|---|
| Normal | 9.25 | 8.72 | 6.41 | 4.93 | 3.67 | 88.05 | 88.51 | 89.25 | 92.46 | 93.89 |
| DoS | 5.41 | 4.48 | 4.06 | 2.83 | 1.94 | 72.57 | 82.89 | 87.86 | 92.56 | 92.64 |
| DDoS | 6.85 | 4.92 | 4.54 | 6.33 | 3.18 | 79.03 | 83.47 | 90.22 | 87.52 | 92.98 |
| PortScan | 4.65 | 3.02 | 2.84 | 1.86 | 1.16 | 88.25 | 93.80 | 94.33 | 95.14 | 95.42 |
| WebAttack | 5.33 | 3.16 | 2.52 | 2.11 | 1.60 | 87.40 | 91.35 | 92.19 | 92.94 | 94.77 |


Normal, DoS, DDoS, PortScan, and WebAttack subsets, respectively. Table 16 shows the classification FPR and DR of different feature selection algorithms on the test sets. On all five test sets, the LNNLS-KH algorithm has a lower FPR and a higher DR than the other four algorithms.

We propose the LNNLS-KH algorithm, a novel feature selection algorithm for intrusion detection. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the algorithm has good feature selection performance and improves the efficiency of intrusion detection.

5. Conclusions

With the rapid development of network technology, intrusion detection plays an increasingly important role in network security. However, the "curse of dimensionality" caused by massive data results in problems such as slow response and poor accuracy of the intrusion detection system. The KH algorithm is a population-based swarm intelligence optimization method that shows good performance in high-dimensional data processing, providing a new approach for reducing the dimension of intrusion detection data and selecting useful features. In this paper, an improved KH algorithm named LNNLS-KH is proposed for feature selection of IDS datasets by linear nearest neighbor lasso step optimization. The LNNLS-KH algorithm introduces a new fitness function, which is composed of the number of selected feature dimensions and the classification accuracy. Nonlinear optimization is introduced into the physical diffusion motion of krill individuals to accelerate the convergence of the algorithm. Moreover, the linear nearest neighbor lasso step optimization is proposed to balance the exploration and exploitation abilities and to obtain the global optimal solution of the feature subset effectively. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the LNNLS-KH algorithm retains 7 and 10.2 features on average, respectively, which greatly reduces the dimension of the features. In the NSL-KDD dataset, features are reduced by 44%, 42.86%, 34.88%, and 24.32% compared with the CMPSO, ACO, KH, and IKH algorithms, and in the CICIDS2017 dataset, they are reduced by 57.85%, 52.34%, 27.14%, and 25%, respectively. In addition, the classification accuracy of the LNNLS-KH feature selection algorithm is increased by 10.03% and 5.39%, and the time of intrusion detection is reduced by 12.41% and 4.03% on the two datasets. Furthermore, the LNNLS-KH algorithm enhances the ability to jump out of local optimal solutions and shows good performance in the optimal fitness iteration curve, false positive rate of detection, and convergence speed, which demonstrates that the proposed LNNLS-KH algorithm is an efficient feature selection method for network intrusion detection.

In this research, we realized that the initialization of the LNNLS-KH algorithm has a certain degree of randomness. Therefore, we conducted independent and repeated experiments to mitigate this problem, and the results were reasonable and convincing. Although the proposed algorithm shows encouraging performance, it could be further improved.

In future work, we will consider using data balancing techniques to preprocess the experimental datasets to obtain more accurate feature selection results and stronger algorithm stability. Meanwhile, we will combine LNNLS-KH with other algorithms to improve the exploration and exploitation abilities, thereby further shortening the time of feature subset training and classification detection. In addition, as the LNNLS-KH algorithm is universally applicable, it can be applied to more feature selection systems and used to solve optimization problems in other fields.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was sponsored by the National Key Research and Development Program of China (Grants 2018YFB0804002 and 2017YFB0803204), National Natural Science Foundation of P.R. China (Grant 72001191), Henan Natural Science Foundation (Grant 202300410442), and Henan Philosophy and Social Science Program (Grant 2020CZH009).

References

[1] W. Wei and C. Guo, “A text semantic topic discovery method based on the conditional co-occurrence degree,” Neurocomputing, vol. 368, pp. 11–24, 2019.

[2] C.-R. Wang, R.-F. Xu, S.-J. Lee, and C.-H. Lee, “Network intrusion detection using equality constrained-optimization-based extreme learning machines,” Knowledge-Based Systems, vol. 147, pp. 68–80, 2018.

[3] G.-G. Wang, A. H. Gandomi, A. H. Alavi, and D. Gong, “A comprehensive review of krill herd algorithm: variants, hybrids and applications,” Artificial Intelligence Review, vol. 51, no. 1, pp. 119–148, 2019.

[4] J. Amudhavel, D. Sathian, R. S. Raghav et al., “A fault tolerant distributed self-organization in peer to peer (p2p) using krill herd optimization,” in Proceedings of the 2015 International Conference on Advanced Research in Computer Science Engineering & Technology (ICARCSET 2015), pp. 1–5, Unnao, India, 2015.

[5] L. M. Abualigah, A. T. Khader, and E. S. Hanandeh, “Hybrid clustering analysis using improved krill herd algorithm,” Applied Intelligence, vol. 48, no. 11, pp. 4047–4071, 2018.

[6] P. A. Kowalski and S. Łukasik, “Training neural networks with krill herd algorithm,” Neural Processing Letters, vol. 44, no. 1, pp. 5–17, 2016.

[7] C. Stasinakis, G. Sermpinis, I. Psaradellis, and T. Verousis, “Krill-Herd Support Vector Regression and heterogeneous autoregressive leverage: evidence from forecasting and trading commodities,” Quantitative Finance, vol. 16, no. 12, pp. 1901–1915, 2016.


[8] L. Wang, P. Jia, T. Huang, S. Duan, J. Yan, and L. Wang, “A novel optimization technique to improve gas recognition by electronic noses based on the enhanced krill herd algorithm,” Sensors, vol. 16, no. 8, p. 1275, 2016.

[9] R. Jensi and G. W. Jiji, “An improved krill herd algorithm with global exploration capability for solving numerical function optimization problems and its application to data clustering,” Applied Soft Computing, vol. 46, pp. 230–245, 2016.

[10] H. Pulluri, R. Naresh, and V. Sharma, “Application of stud krill herd algorithm for solution of optimal power flow problems,” International Transactions on Electrical Energy Systems, vol. 27, no. 6, Article ID e2316, 2017.

[11] D. Rodrigues, L. A. M. Pereira, J. P. Papa et al., “A binary krill herd approach for feature selection,” in Proceedings of the 2014 22nd International Conference on Pattern Recognition, pp. 1407–1412, IEEE, Stockholm, Sweden, August 2014.

[12] A. Mukherjee and V. Mukherjee, “Chaotic krill herd algorithm for optimal reactive power dispatch considering FACTS devices,” Applied Soft Computing, vol. 44, pp. 163–190, 2016.

[13] S. Sun, H. Qi, F. Zhao, L. Ruan, and B. Li, “Inverse geometry design of two-dimensional complex radiative enclosures using krill herd optimization algorithm,” Applied Thermal Engineering, vol. 98, pp. 1104–1115, 2016.

[14] S. Sultana and P. K. Roy, “Oppositional krill herd algorithm for optimal location of capacitor with reconfiguration in radial distribution system,” International Journal of Electrical Power & Energy Systems, vol. 74, pp. 78–90, 2016.

[15] L. Brezocnik, I. Fister, and V. Podgorelec, “Swarm intelligence algorithms for feature selection: a review,” Applied Sciences, vol. 8, no. 9, 2018.

[16] D. Smith, Q. Guan, and S. Fu, “An anomaly detection framework for autonomic management of compute cloud systems,” in Proceedings of the 2010 IEEE 34th Annual Computer Software and Applications Conference Workshops, pp. 376–381, IEEE, Seoul, South Korea, July 2010.

[17] Y. Zhao, Y. Zhang, W. Tong et al., “An improved feature selection algorithm based on MAHALANOBIS distance for network intrusion detection,” in Proceedings of 2013 International Conference on Sensor Network Security Technology and Privacy Communication System, pp. 69–73, IEEE, Nangang, China, May 2013.

[18] P. Singh and A. Tiwari, “An efficient approach for intrusion detection in reduced features of KDD99 using ID3 and classification with KNNGA,” in Proceedings of the 2015 Second International Conference on Advances in Computing and Communication Engineering, pp. 445–452, IEEE, Dehradun, India, May 2015.

[19] M. A. Ambusaidi, X. He, P. Nanda, and Z. Tan, “Building an intrusion detection system using a filter-based feature selection algorithm,” IEEE Transactions on Computers, vol. 65, no. 10, pp. 2986–2998, 2016.

[20] N. Shone, T. N. Ngoc, V. D. Phai, and Q. Shi, “A deep learning approach to network intrusion detection,” IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 2, no. 1, pp. 41–50, 2018.

[21] Y. Xue, W. Jia, X. Zhao et al., “An evolutionary computation based feature selection method for intrusion detection,” Security and Communication Networks, vol. 2018, Article ID 2492956, 10 pages, 2018.

[22] Z. Shen, Y. Zhang, and W. Chen, “A bayesian classification intrusion detection method based on the fusion of PCA and LDA,” Security and Communication Networks, vol. 2019, Article ID 6346708, 11 pages, 2019.

[23] P. Sun, P. Liu, Q. Li et al., “DL-IDS: Extracting features using CNN-LSTM hybrid network for intrusion detection system,” Security and Communication Networks, vol. 2020, Article ID 8890306, 11 pages, 2020.

[24] G. Farahani, “Feature selection based on cross-correlation for the intrusion detection system,” Security & Communication Networks, vol. 2020, Article ID 8875404, 17 pages, 2020.

[25] F. G. Mohammadi, M. H. Amini, and H. R. Arabnia, “Applications of nature-inspired algorithms for dimension Reduction: enabling efficient data analytics,” in Advances in Intelligent Systems and Computing, Optimization, Learning, and Control for Interdependent Complex Networks, pp. 67–84, Springer, Cham, Switzerland, 2020.

[26] J. Kennedy and R. Eberhart, “Particle swarm optimization,” in Proceedings of the ICNN’95-International Conference on Neural Networks, no. 4, pp. 1942–1948, IEEE, Perth, WA, Australia, December 1995.

[27] M. Dorigo, M. Birattari, and T. Stutzle, “Ant colony optimization,” IEEE Computational Intelligence Magazine, vol. 1, no. 4, pp. 28–39, 2006.

[28] R. Rajabioun, “Cuckoo optimization algorithm,” Applied Soft Computing, vol. 11, no. 8, pp. 5508–5518, 2011.

[29] M. Neshat, G. Sepidnam, M. Sargolzaei, and A. N. Toosi, “Artificial fish swarm algorithm: a survey of the state-of-the-art, hybridization, combinatorial and indicative applications,” Artificial Intelligence Review, vol. 42, no. 4, pp. 965–997, 2014.

[30] D. Karaboga, “An idea based on honey bee swarm for numerical optimization,” Technical Report-tr06, Erciyes University, Engineering Faculty, Computer Engineering Department, Kayseri, Turkey, 2005.

[31] W.-T. Pan, “A new Fruit Fly Optimization Algorithm: taking the financial distress model as an example,” Knowledge-Based Systems, vol. 26, pp. 69–74, 2012.

[32] R. Zhao and W. Tang, “Monkey algorithm for global numerical optimization,” Journal of Uncertain Systems, vol. 2, no. 3, pp. 165–176, 2008.

[33] X. S. Yang and X. He, “Bat algorithm: literature review and applications,” International Journal of Bio-Inspired Computation, vol. 5, no. 3, pp. 141–149, 2013.

[34] S. Mirjalili, A. H. Gandomi, S. Z. Mirjalili, S. Saremi, H. Faris, and S. M. Mirjalili, “Salp Swarm Algorithm: a bio-inspired optimizer for engineering design problems,” Advances in Engineering Software, vol. 114, pp. 163–191, 2017.

[35] K. Ahmed, A. E. Hassanien, and S. Bhattacharyya, “A novel chaotic chicken swarm optimization algorithm for feature selection,” in Proceedings of the 2017 Third International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), pp. 259–264, IEEE, Kolkata, India, November 2017.

[36] S. Tabakhi, P. Moradi, F. Akhlaghian et al., “An unsupervised feature selection algorithm based on ant colony optimization,” Engineering Applications of Artificial Intelligence, vol. 32, pp. 112–123, 2014.

[37] S. Arora and P. Anand, “Binary butterfly optimization approaches for feature selection,” Expert Systems with Applications, vol. 116, pp. 147–160, 2019.

[38] C. Yan, J. Ma, H. Luo, and A. Patel, “Hybrid binary coral reefs optimization algorithm with simulated annealing for feature selection in high-dimensional biomedical datasets,” Chemometrics and Intelligent Laboratory Systems, vol. 184, pp. 102–111, 2019.

[39] G. I. Sayed, A. Tharwat, and A. E. Hassanien, “Chaotic dragonfly algorithm: an improved metaheuristic algorithm for feature selection,” Applied Intelligence, vol. 49, no. 1, pp. 188–205, 2019.

[40] Z. Zhang, P. Wei, Y. Li et al., “Feature selection algorithm based on improved particle swarm joint taboo search,” Journal of Communication, vol. 39, no. 12, pp. 60–68, 2018.

[41] A. H. Gandomi and A. H. Alavi, “Krill herd: a new bio-inspired optimization algorithm,” Communications in Nonlinear Science and Numerical Simulation, vol. 17, no. 12, pp. 4831–4845, 2012.

[42] Q. Tan and Z. Huang, “Krill herd with nearest neighbor lasso operator,” Computer Engineering and Applications, vol. 55, no. 9, pp. 124–129, 2019.

[43] Q. Wang, C. Ding, and X. Wang, “A hybrid data clustering algorithm based on improved krill herd algorithm and KHM clustering,” Control and Decision, vol. 35, no. 10, pp. 2449–2458, 2018.

[44] Q. Li and B. Liu, “Clustering using an improved krill herd algorithm,” Algorithms, vol. 10, no. 2, p. 56, 2017.

[45] G.-G. Wang, A. H. Gandomi, and A. H. Alavi, “Stud krill herd algorithm,” Neurocomputing, vol. 128, pp. 363–370, 2014.

[46] J. Li, Y. Tang, C. Hua, and X. Guan, “An improved krill herd algorithm: krill herd with linear decreasing step,” Applied Mathematics and Computation, vol. 234, pp. 356–367, 2014.

[47] H. B. Nguyen, B. Xue, P. Andreae et al., “Particle swarm optimisation with genetic operators for feature selection,” in Proceedings of the 2017 IEEE Congress on Evolutionary Computation (CEC), pp. 286–293, IEEE, San Sebastian, Spain, June 2017.

[48] M. H. Aghdam and P. Kabiri, “Feature selection for intrusion detection system using ant colony optimization,” International Journal of Network Security, vol. 18, no. 3, pp. 420–432, 2016.



Figure 1: Framework of feature selection. (Diagram omitted. Its labeled blocks are a search module containing the search starting point and the search strategy, an evaluation criterion, a judgment condition with yes/no branches, verification, and output.)

Figure 2: Frameworks of the three types of feature selection methods: (a) framework of the filter method for feature selection, (b) framework of the wrapper method for feature selection, and (c) framework of the embedded method for feature selection. (Diagrams omitted. In all three panels, raw data are preprocessed and split into training and testing datasets, and the optimal subset of features is passed to a classification algorithm for the final evaluation. In (a), feature scores are calculated from evaluation criteria and ranked, and the ranking is filtered to get a subset of features. In (b), feature search, classification, and performance evaluation are repeated on candidate feature subsets. In (c), the feature subset is obtained automatically through the training of the classification model.)


characteristics of network intrusion detection data are multiple features and large scale. Features of different categories have different attribute values, including redundant features that interfere with the classification results. A large number of redundant features reduce the efficiency of detection algorithms and increase the false positive rate of intrusion detection. However, a feature selection algorithm with good performance decreases the dimensionality of network data and improves the accuracy and detection speed of IDS.

In recent years, there has been a great deal of research on feature selection in intrusion detection. Smith et al. combined Bayesian networks and principal component analysis (PCA) to conduct feature selection for intrusion detection data [16]. They used Bayesian networks to adjust the correlation of attributes and PCA to extract the primary features on an institute-wide cloud system. The disadvantage is that the detection accuracy still needs further improvement. Zhao et al. [17] proposed a feature selection method based on Mahalanobis distance and applied it to network intrusion detection to obtain the optimal feature subset. Feature ranking based on Mahalanobis distance was used as the principal selection mechanism, and an improved exhaustive search was used to select the optimal ranking features. The experimental results based on the KDD CUP 99 dataset show that the algorithm has good performance on both the support vector machine and the k-nearest neighbor classifier. Singh and Tiwari proposed an efficient approach for intrusion detection in reduced features of the KDD CUP 99 dataset in 2015 [18]. The Iterative Dichotomiser 3 (ID3) algorithm was used for feature reduction of large datasets, and KNNGA was used as a classifier for intrusion detection. The method performs well on the evaluation measures of sensitivity, specificity, and accuracy. However, both Zhao et al. and Singh and Tiwari [17, 18] conducted experiments on outdated datasets, which can hardly reflect the new attack features of modern networks. In [19], Ambusaidi et al. proposed a feature selection algorithm based on mutual information to deal with linearly and nonlinearly related data features. They established an intrusion detection system based on the least-squares support vector machine. Experimental results show that the proposed algorithm performs well in accuracy but poorly in false positive rate. Shone et al. proposed an unsupervised feature learning method based on the nonsymmetric deep autoencoder (NDAE) and a novel deep learning classification model constructed using stacked NDAEs [20]. The results demonstrated that the approach offers high levels of accuracy, precision, and recall together with reduced training time. Meanwhile, it is worth noting that the stacked NDAE model requires 98.81% less training time than the mainstream DBN technology. The limitation is that the model's capability to handle zero-day attacks still needs to be assessed and extended.

In [21], a self-adaptive differential evolution (SaDE) algorithm was proposed to deal with the feature selection problem. It uses an adaptive mechanism to select the most appropriate among four candidate solution generation strategies, which effectively reduces the number of features. The disadvantage is that the experiment uses small sample data, and more data are needed to further support the conclusion. Shen et al. adopted principal component analysis and linear discriminant analysis to decrease the dimensionality of the dataset and combined them with Bayesian classification to construct an intrusion detection model [22]. Simulation experiments based on the CICIDS2017 dataset show that the proposed algorithm filters out the noise in the data and improves the time performance to a certain extent. However, the algorithm still needs to be optimized to further improve the classification accuracy. In [23], a hybrid network feature selection method based on a convolutional neural network (CNN) and a long short-term memory network (LSTM) was applied to IDS. According to the experimental results, the proposed feature selection algorithm achieves better accuracy than the CNN-only model and the LSTM-only model. However, the detection accuracy of Heartbleed and SSH-Patator attacks is low. In [24], Farahani proposed a new cross-correlation-based feature selection (CCFS) method to reduce the feature dimension of intrusion detection datasets. Compared with the cuttlefish algorithm (CFA) and mutual information-based feature selection (MIFS), the proposed algorithm was demonstrated to have good performance in the accuracy, precision, and recall rate of classification. However, the author simply replaced the categorical attributes with numeric values when dealing with symbolic data, without considering the more reasonable one-hot encoding method. The summary of feature selection methods in IDS is shown in Table 1.

2.3. Swarm Intelligence Algorithms for Feature Selection. The core of feature selection is the search strategy for generating feature subsets. Although the exhaustive search strategy can find the globally optimal feature subset, its excessive time complexity consumes huge computing resources. In recent years, swarm intelligence optimization methods inspired by natural phenomena have provided a new approach to solving the problem of feature selection [10–17]. Therefore, we propose the LNNLS-KH algorithm, with its high search efficiency, as the search strategy for feature subsets. Swarm intelligence optimization methods simulate the evolutionary principle of survival of the fittest in nature and are group-oriented random search techniques that can be used to solve complex problems in large-scale data analysis [25]. Common swarm intelligence optimization methods include particle swarm optimization (PSO) [26], the ant colony optimization algorithm (ACO) [27], the cuckoo algorithm (CA) [28], the artificial fish swarm algorithm (AFSA) [29], the artificial bee colony algorithm (ABC) [30], the fruit fly optimization algorithm (FOA) [31], the monkey algorithm (MA) [32], the bat algorithm (BA) [33], and the salp swarm algorithm (SSA) [34].

Moreover, Ahmed et al. proposed a new chaotic chicken swarm optimization algorithm (CCSO) for feature selection [35]. By combining logistic maps and tent maps, the CSO algorithm acquires a strong spatial search ability. The experimental results show that the classification accuracy of the model is further improved after CCSO feature selection. The disadvantage is the lack of comparison with other chaotic algorithms. Tabakhi et al. proposed an unsupervised feature selection method based on ant colony optimization (UFSACO) [36], which iteratively filters features using heuristic information and the information from previous stages of the ant colony. Simultaneously, the similarity between features is quantified to reduce the redundancy of data features. However, the efficiency of the feature selection process needs to be improved.

To address the tendency to fall into local optimal solutions, Arora and Anand proposed a butterfly optimization algorithm (BOA) based on binary variables [37]. Based on the foraging behavior of butterflies, the algorithm uses each butterfly as a search agent to iteratively optimize the fitness function; it has good convergence ability and avoids premature convergence to a certain extent. Experimental results show that the algorithm reduces the length of the feature subset while selecting the optimal feature subset and improves the classification accuracy to a certain extent. However, the time cost is larger than that of the genetic algorithm and the particle swarm optimization algorithm, and the optimized feature subset is inaccurate across repeated experiments, indicating poor robustness.

In [38], Yan et al. proposed a hybrid optimization algorithm (BCROSAT) based on simulated annealing and binary coral reefs, which is used for feature selection in high-dimensional biomedical datasets. The algorithm increases the diversity of the initial population through the tournament selection strategy and uses the simulated annealing algorithm and binary coding to improve the search ability of the coral reefs optimization algorithm. However, the algorithm has high time complexity. In [39], a new chaotic dragonfly algorithm (CDA) was proposed by Sayed et al., which combines 10 different chaotic maps with the search iteration process of the dragonfly algorithm so as to accelerate the convergence speed of the algorithm and improve the efficiency of feature selection. The algorithm uses the worst fitness value, best fitness value, average fitness value, standard deviation, and average feature length as evaluation criteria. The experimental results show that the adjustment variable of the Gauss map significantly improves the performance of the dragonfly algorithm in classification performance, stability, number of selected features, and convergence speed. The disadvantage is that the experimental data are small, and the algorithm needs to be verified on large-scale datasets. Zhang et al. [40] combined the genetic algorithm and the particle swarm optimization algorithm and conducted a taboo search on the resulting optimal initial solution; the result of this second round of feature selection is the global optimal feature subset. The algorithm not only guarantees good classification performance but also greatly reduces the false positive rate and false negative rate of the classification results. The disadvantage is that the algorithm incurs a large computational cost and a long offline training time.

2.4. Krill Herd (KH) Algorithm and Variants. The krill herd (KH) algorithm is a population-based swarm intelligence optimization method proposed by Gandomi and Alavi in 2012 [41]. The algorithm studies the foraging rules and clustering behavior of krill swarms in nature and simulates the induced movement, foraging activity, and random diffusion movement of the krill herd. It obtains the optimal solution by continuously updating the positions of krill individuals.

Abualigah et al. introduced a multicriteria mixed function based on the global optimal concept into the KH algorithm and applied it to text clustering [5]. By combining the advantages of local neighborhood search and global wide-area search, the algorithm balances the exploitation and exploration processes of the krill herd. In [42], the influence of excellent neighbor individuals on the krill herd during evolution is considered, and an improved KH algorithm is proposed to enhance the local search ability of the algorithm. In [43], a hybrid data clustering algorithm (IKH-KHM) based on an improved KH algorithm and k-harmonic means was proposed to solve the problem that the K-means algorithm is sensitive to the clustering centers. This algorithm increases the diversity of the krill herd by alternately using the random walk of Lévy flight and the crossover operator of the genetic algorithm. It improves the global search ability of the algorithm and avoids premature convergence to some degree. Simulation experiments on 5 datasets from the UCI database show that the IKH-KHM algorithm overcomes the noise sensitivity problem to a certain extent and has a significant effect on the optimization of the objective function. However, its slow recovery speed results in a high time cost. In 2017, Li and Liu adopted a combined update mechanism of the selection operator and the mutation operator to enhance the global optimization ability of the KH algorithm. They solved the problem of unbalanced local search and global search in the original KH algorithm [44].

To enhance the global search ability of the KH algorithm, Jensi and Jiji proposed an improved KH algorithm with a global search operator [9] and applied it to data clustering.

Table 1: Summary of feature selection methods in IDS.

| Method | Author | Year | Ref. no. |
|---|---|---|---|
| Bayesian network-based dimensionality reduction and principal component analysis (PCA) | Smith et al. | 2010 | [16] |
| Ranking based on Mahalanobis distance and exhaustive search | Zhao et al. | 2013 | [17] |
| Iterative Dichotomiser 3 (ID3) algorithm | Singh and Tiwari | 2015 | [18] |
| Mutual information method | Ambusaidi et al. | 2016 | [19] |
| Nonsymmetric deep autoencoder (NDAE) | Shone et al. | 2018 | [20] |
| Self-adaptive differential evolution (SaDE) | Xue et al. | 2018 | [21] |
| Principal component analysis (PCA) and linear discriminant analysis (LDA) | Shen et al. | 2019 | [22] |
| Hybrid network of convolutional neural network (CNN) and long short-term memory network (LSTM) | Sun et al. | 2020 | [23] |
| Cross-correlation-based feature selection (CCFS) method | Farahani | 2020 | [24] |


The algorithm continuously searches around the original area to guide the krill herd toward the global optimum. It defines a new step size formula, which makes it convenient for krill individuals to fine-tune their positions in the search space. At the same time, an elite selection strategy is introduced into the krill herd update process, which helps the algorithm jump out of local optimal solutions. Experimental results show that the improved KH algorithm has higher accuracy and better robustness.

In [45], Wang et al. proposed a stud KH algorithm. The method adopts a new krill herd genetic reproduction mechanism, replacing the random selection in the standard KH algorithm with a stud selection operator and a crossover operator. To balance the exploration and exploitation abilities of the KH algorithm, Li et al. proposed a linear decreasing step KH algorithm [46]. In this algorithm, the step size scaling factor is adjusted linearly so that it decreases as the number of iterations increases, thereby enhancing the search ability of the algorithm.
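The idea of a linearly decreasing step can be illustrated with a short sketch. The exact schedule used in [46] is not given in this section, so the formula below (a straight line from a maximum to a minimum value) and the names `linear_decreasing_step`, `c_max`, and `c_min` are assumptions chosen only for illustration.

```python
def linear_decreasing_step(it: int, it_max: int,
                           c_max: float = 0.5, c_min: float = 0.1) -> float:
    """Step scaling factor that falls linearly from c_max to c_min.

    c_max and c_min are hypothetical bounds, not values taken from [46].
    """
    if it_max <= 0:
        raise ValueError("it_max must be positive")
    # Clamp the progress ratio so out-of-range iterations stay within bounds.
    progress = min(max(it / it_max, 0.0), 1.0)
    return c_max - (c_max - c_min) * progress


# Large steps early (exploration), small steps late (exploitation).
schedule = [linear_decreasing_step(i, 4) for i in range(5)]
```

A large factor early in the run favors exploration, while the small factor near the end favors fine-grained exploitation around the current best region.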

Although the KH algorithm and its enhanced versions show better performance than other swarm intelligence algorithms, there are still deficiencies such as unbalanced exploration and exploitation. In this paper, to minimize the number of selected features and achieve high classification accuracy, both quantities are introduced into the fitness evaluation function. The physical diffusion motion of krill individuals is nonlinearly improved to dynamically adjust the random diffusion amplitude and thus accelerate the convergence rate of the algorithm. At the same time, a linear nearest neighbor lasso step optimization is performed after updating the position of the krill herd, which effectively enhances the global exploration ability. It helps the algorithm achieve better performance, reduce the data dimension in feature selection, and improve the efficiency of intrusion detection.
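As a rough illustration of a fitness function that combines the number of selected features with classification accuracy, one common choice in wrapper-based feature selection is a weighted sum of the classification error and the fraction of features kept. This section does not state the formula used by LNNLS-KH, so the weighted-sum form, the function name `selection_fitness`, and the weight value below are assumptions, not the authors' definition.

```python
def selection_fitness(accuracy: float, n_selected: int, n_total: int,
                      weight: float = 0.9) -> float:
    """Hypothetical fitness to minimize: weighted error plus weighted subset size."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    if n_total <= 0 or not 0 <= n_selected <= n_total:
        raise ValueError("need 0 <= n_selected <= n_total and n_total > 0")
    error_term = 1.0 - accuracy          # lower classification error is better
    size_term = n_selected / n_total     # fewer selected features is better
    return weight * error_term + (1.0 - weight) * size_term


# Two subsets with equal accuracy on a 41-feature dataset such as NSL-KDD:
small = selection_fitness(accuracy=0.95, n_selected=7, n_total=41)
large = selection_fitness(accuracy=0.95, n_selected=20, n_total=41)
```

Under this assumed form, the smaller subset receives the lower (better) fitness when accuracy is equal, which is the trade-off the text describes.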

3. Algorithm Design

In this section, we first provide a brief description of the KH algorithm; subsequently, we present an improved version of KH, named LNNLS-KH, to address the problem of the large number and high dimension of features in feature selection for intrusion detection.

3.1. Standard KH Algorithm. The framework of the KH algorithm is shown in Figure 3. It includes the three actions of krill individuals, the crossover operation, position updating, and fitness function calculation. After initialization, krill individuals change their positions according to the three actions. Then the crossover operator is executed to complete the position update, and the new fitness function is calculated. If the number of iterations has not reached the maximum, the krill individuals repeat the process until the iterations are completed.
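The control flow of this framework can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the name `kh_loop`, its parameters, and the toy motion used at the bottom are hypothetical, the fitness is assumed to be minimized, and the motion components and the crossover are passed in as callables so that only the loop structure is shown.

```python
from typing import Callable, List

Vector = List[float]


def kh_loop(fitness: Callable[[Vector], float],
            herd: List[Vector],
            motions: List[Callable[..., Vector]],
            crossover: Callable[..., Vector],
            dt: float,
            max_iter: int) -> Vector:
    """Generic loop: three motions -> crossover -> position update -> fitness."""
    herd = [list(x) for x in herd]
    best = min(herd, key=fitness)
    for it in range(max_iter):
        K = [fitness(x) for x in herd]            # fitness of every krill
        new_herd = []
        for i in range(len(herd)):
            # Sum the motion components (induced, foraging, diffusion).
            parts = [m(i, herd, K, it, max_iter) for m in motions]
            velocity = [sum(c) for c in zip(*parts)]
            x = crossover(i, herd, K)             # crossover before the move
            new_herd.append([xm + dt * vm for xm, vm in zip(x, velocity)])
        herd = new_herd
        candidate = min(herd, key=fitness)
        if fitness(candidate) < fitness(best):    # keep the best seen so far
            best = candidate
    return best


# Toy usage: one stand-in "motion" that pulls each krill toward the origin.
def sphere(x: Vector) -> float:
    return sum(v * v for v in x)


def pull_to_origin(i, herd, K, it, max_iter) -> Vector:
    return [-v for v in herd[i]]


def no_crossover(i, herd, K) -> Vector:
    return list(herd[i])


best = kh_loop(sphere, [[1.0], [2.0]], [pull_to_origin], no_crossover,
               dt=0.5, max_iter=10)
```

In the toy run, every iteration halves each coordinate, so the loop structure can be checked by hand before real motion operators are plugged in.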

As a novel biologically inspired algorithm for solving optimization tasks, the KH algorithm represents each possible solution of the problem by a krill individual. By simulating foraging behavior, the krill herd position is continuously updated to obtain the global optimal solution. The motions of krill individuals are mainly affected by the following three aspects:

(1) Movement induced by other krill individuals
(2) Foraging activity
(3) Physical diffusion motion

The KH algorithm adopts the Lagrangian model to search in multidimensional space. The position update of krill individuals is shown as follows:

$$\frac{dX_i}{dt} = N_i + F_i + D_i, \tag{1}$$

where $X_i = \{X_{i,1}, X_{i,2}, \ldots, X_{i,NV}\}$, $N_i$ is the movement induced by other krill individuals, $F_i$ is the foraging activity of the krill individual, and $D_i$ is the random physical diffusion based on density region.

3.1.1. Movement Induced by Other Krill Individuals. The movement induced by other krill individuals is described as follows:

$$N_i^{\mathrm{new}} = N^{\max}\alpha_i + \omega_n N_i^{\mathrm{old}}, \tag{2}$$

$$\alpha_i = \alpha_i^{\mathrm{local}} + \alpha_i^{\mathrm{target}}, \tag{3}$$

where $N^{\max}$ is the maximum induction velocity of surrounding krill individuals, taken as 0.01 (m s$^{-1}$) [5], $\omega_n$ represents the inertia weight in the range $[0, 1]$, $N_i^{\mathrm{old}}$ is the result of the last motion induced by other krill individuals, $\alpha_i^{\mathrm{local}}$ is a parameter indicating the direction of guidance, and $\alpha_i^{\mathrm{target}}$ is the direction effect of the global optimal krill individual.

$\alpha_i^{\mathrm{local}}$ is defined as follows:

$$\alpha_i^{\mathrm{local}} = \sum_{j=1}^{NN} \hat{K}_{i,j}\hat{X}_{i,j}, \qquad \hat{X}_{i,j} = \frac{X_j - X_i}{\lVert X_j - X_i \rVert + \varepsilon}, \qquad \hat{K}_{i,j} = \frac{K_i - K_j}{K^{\mathrm{worst}} - K^{\mathrm{best}}}, \tag{4}$$

where $K^{\mathrm{best}}$ and $K^{\mathrm{worst}}$ are the best and worst fitness values of the krill herd, $K_i$ is the fitness value of the $i$th krill individual, $K_j$ represents the fitness value of the $j$th neighbor krill individual ($j = 1, 2, \ldots, NN$), and $NN$ represents the total number of neighbors. The $\varepsilon$ in the denominator is a small positive number that avoids the singularity caused by a zero denominator.

When selecting surrounding krill individuals, the KH algorithm determines the nearest neighbors of the $i$th krill individual by defining a “neighborhood ratio”. The neighborhood is a circular area with the $i$th krill individual as the center and the perception distance $d_{s,i}$ as the radius. $d_{s,i}$ is described as follows:

$$d_{s,i} = \frac{1}{5N}\sum_{j=1}^{N} \lVert X_i - X_j \rVert, \tag{5}$$


where $N$ is the number of krill individuals and $X_i$ and $X_j$ represent the positions of the $i$th and $j$th krill individuals. $\alpha_i^{\mathrm{target}}$ is defined as follows:

$$\alpha_i^{\mathrm{target}} = C^{\mathrm{best}}\hat{K}_{i,\mathrm{best}}\hat{X}_{i,\mathrm{best}}, \tag{6}$$

where $C^{\mathrm{best}}$ is the effective coefficient between the $i$th and the global optimal krill individuals:

$$C^{\mathrm{best}} = 2\left(\mathrm{rand} + \frac{I}{I_{\max}}\right), \tag{7}$$

where $I$ is the current iteration number, $I_{\max}$ is the maximum number of iterations, and rand is a random number in $[0, 1]$, which is used to enhance the exploration ability.
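Equations (2) to (7) can be combined into a small sketch of the induced motion of one krill individual. It is an illustration under stated assumptions, not the authors' code: fitness is assumed to be minimized (so the best krill has the smallest value), the inertia weight `w_n = 0.5` is an arbitrary value inside the stated range $[0, 1]$, and all function names are hypothetical.

```python
import math
import random
from typing import List, Sequence

Vector = List[float]


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two position vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sensing_distance(i: int, X: List[Vector]) -> float:
    """Equation (5): summed distance to the herd divided by 5N."""
    return sum(distance(X[i], xj) for xj in X) / (5.0 * len(X))


def effect(x_i, x_j, k_i, k_j, k_best, k_worst, eps=1e-12) -> Vector:
    """K_hat * X_hat for one pair (i, j), following equation (4)."""
    span = k_worst - k_best
    k_hat = (k_i - k_j) / span if span else 0.0
    d = distance(x_i, x_j) + eps          # eps avoids division by zero
    return [k_hat * (b - a) / d for a, b in zip(x_i, x_j)]


def induced_motion(i, X, K, n_old, it, it_max,
                   n_max=0.01, w_n=0.5, rng=random) -> Vector:
    """Equations (2), (3), (6), (7): new induced motion of krill i."""
    k_best, k_worst = min(K), max(K)      # assumes lower fitness is better
    best = K.index(k_best)
    d_s = sensing_distance(i, X)
    alpha = [0.0] * len(X[i])
    for j in range(len(X)):               # local effect of sensed neighbors
        if j != i and distance(X[i], X[j]) <= d_s:
            e = effect(X[i], X[j], K[i], K[j], k_best, k_worst)
            alpha = [a + v for a, v in zip(alpha, e)]
    c_best = 2.0 * (rng.random() + it / it_max)
    target = effect(X[i], X[best], K[i], K[best], k_best, k_worst)
    alpha = [a + c_best * t for a, t in zip(alpha, target)]
    return [n_max * a + w_n * o for a, o in zip(alpha, n_old)]


# Two krill on a line; krill 1 has the better (lower) fitness.
X = [[0.0], [1.0]]
K = [1.0, 0.0]
n_new = induced_motion(0, X, K, n_old=[0.0], it=1, it_max=10,
                       rng=random.Random(0))
```

In this two-krill example the sensing distance of krill 0 is 0.1, so krill 1 lies outside the neighborhood and only the target term acts, pulling krill 0 in the positive direction toward the best individual.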

3.1.2. Foraging Activity. Foraging activity is affected by the food distance and the experience of the food location, and it is described as follows:

$$F_i = V_f\beta_i + \omega_f F_i^{\mathrm{old}}, \tag{8}$$

$$\beta_i = \beta_i^{\mathrm{food}} + \beta_i^{\mathrm{best}}, \tag{9}$$

where $V_f$ is the foraging speed, taken as 0.02 (m s$^{-1}$) [41], $\omega_f$ is the inertia weight in the range $[0, 1]$, and $\beta_i$ indicates the foraging direction, which consists of the food induction direction $\beta_i^{\mathrm{food}}$ and the induction direction of the historically optimal krill individual $\beta_i^{\mathrm{best}}$. The food is essentially a virtual location defined using the concept of “centroid”:

$$X^{\mathrm{food}} = \frac{\sum_{i=1}^{N} (1/K_i)\, X_i}{\sum_{i=1}^{N} 1/K_i}. \tag{10}$$

(1) The induced direction of food to the $i$th krill individual is expressed as follows:

$$\beta_i^{\mathrm{food}} = C^{\mathrm{food}}\hat{K}_{i,\mathrm{food}}\hat{X}_{i,\mathrm{food}}, \tag{11}$$

where $C^{\mathrm{food}}$ is the food coefficient, and it is determined as follows:

$$C^{\mathrm{food}} = 2\left(1 - \frac{I}{I_{\max}}\right). \tag{12}$$

(2) The induced direction of the historical best krill individual to the $i$th krill individual is expressed as follows:

$$\beta_i^{\mathrm{best}} = \hat{K}_{i,\mathrm{best}}\hat{X}_{i,\mathrm{best}}, \tag{13}$$

where $\hat{K}_{i,\mathrm{best}}$ represents the influence of the historical best individual on the $i$th krill individual.
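The foraging terms in equations (8) to (13) can be sketched in the same way. The sketch assumes strictly positive fitness values (so that $1/K_i$ is defined) and a minimized fitness; the inertia weight `w_f = 0.5` and all function and parameter names are hypothetical choices for illustration.

```python
import math
from typing import List, Sequence

Vector = List[float]


def food_position(X: List[Vector], K: Sequence[float]) -> Vector:
    """Equation (10): centroid of the herd weighted by 1/K_i (needs K_i > 0)."""
    weights = [1.0 / k for k in K]
    total = sum(weights)
    return [sum(w * x[m] for w, x in zip(weights, X)) / total
            for m in range(len(X[0]))]


def effect(x_i, x_j, k_i, k_j, k_best, k_worst, eps=1e-12) -> Vector:
    """K_hat * X_hat of krill i with respect to a position x_j of fitness k_j."""
    span = k_worst - k_best
    k_hat = (k_i - k_j) / span if span else 0.0
    d = math.sqrt(sum((b - a) ** 2 for a, b in zip(x_i, x_j))) + eps
    return [k_hat * (b - a) / d for a, b in zip(x_i, x_j)]


def foraging_motion(x_i, k_i, x_food, k_food, x_ibest, k_ibest,
                    k_best, k_worst, f_old, it, it_max,
                    v_f=0.02, w_f=0.5) -> Vector:
    """Equations (8), (9), (11), (12), (13) for a single krill individual."""
    c_food = 2.0 * (1.0 - it / it_max)                       # equation (12)
    beta_food = [c_food * v
                 for v in effect(x_i, x_food, k_i, k_food, k_best, k_worst)]
    beta_best = effect(x_i, x_ibest, k_i, k_ibest, k_best, k_worst)
    return [v_f * (bf + bb) + w_f * fo
            for bf, bb, fo in zip(beta_food, beta_best, f_old)]


# The food centroid is pulled toward the krill with the lower fitness value.
food = food_position([[0.0], [2.0]], [1.0, 3.0])

# A krill at 0 whose own best position is its current one, with food at 1.
f_new = foraging_motion(x_i=[0.0], k_i=2.0, x_food=[1.0], k_food=1.0,
                        x_ibest=[0.0], k_ibest=2.0, k_best=1.0, k_worst=2.0,
                        f_old=[0.0], it=0, it_max=10)
```

Because $C^{\mathrm{food}}$ shrinks linearly to zero, the attraction toward the food centroid dominates early iterations and fades toward the end of the run.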

Figure 3: The framework of the KH algorithm. (Diagram omitted. The three actions of a krill individual, namely movement induced by other krill individuals, foraging movement, and physical diffusion movement, are followed by the crossover operation, position updating, and fitness function calculation.)

3.1.3. Physical Diffusion Motion. Physical diffusion is a stochastic process. The expression is as follows:

$$D_i = D^{\max}\left(1 - \frac{I}{I_{\max}}\right)\delta, \tag{14}$$

where $D^{\max}$ is the maximum diffusion velocity in the range $[0.002, 0.010]$ (m s$^{-1}$); according to [41], it is taken as 0.005 (m s$^{-1}$). $\delta$ represents the random direction vector, whose components are random values in $[-1, 1]$.

3.1.4. Crossover. The crossover operator is an effective global optimization strategy. An adaptive vectorization crossover scheme is added to the standard KH algorithm to further enhance the global search ability of the algorithm [41]. It is given as follows:

$$X_{i,m} = \begin{cases} X_{i,m} \cdot Cr + X_{r,m} \cdot (1 - Cr), & \mathrm{rand}_{i,m} < Cr \\ X_{i,m}, & \text{else} \end{cases} \qquad Cr = 0.2\,\hat{K}_{i,\mathrm{best}} \tag{15}$$

where $r$ is a random index with $r \in \{1, 2, \ldots, i-1, i+1, \ldots, N\}$, $X_{i,m}$ represents the $m$th dimension of the $i$th krill individual, $X_{r,m}$ represents the $m$th dimension of the $r$th krill individual, and $Cr$ is the crossover probability, which decreases as the fitness increases; the crossover probability of the globally optimal individual is zero.
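A minimal Python sketch of the crossover rule in equation (15) is given below. It is an illustration under stated assumptions: the text does not say whether the partner index $r$ is drawn once per individual or once per dimension, so the sketch draws it per dimension, and the function name and sample herd are hypothetical.

```python
import random

def crossover(positions, i, k_hat_i_best, rng=random):
    """Equation (15) applied to krill i; returns the new position as a list."""
    cr = 0.2 * k_hat_i_best                      # crossover probability Cr
    partners = [r for r in range(len(positions)) if r != i]
    child = list(positions[i])
    for m in range(len(child)):
        if rng.random() < cr:                    # rand_{i,m} < Cr
            r = rng.choice(partners)             # assumed: partner drawn per dimension
            child[m] = child[m] * cr + positions[r][m] * (1.0 - cr)
    return child

# Hypothetical herd of two krill in four dimensions.
herd = [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]]
print(crossover(herd, 0, 1.0, random.Random(0)))  # each entry is either 1.0 or 0.2
print(crossover(herd, 0, 0.0))                    # Cr = 0, so the position is unchanged
```

With $\hat{K}_{i,\mathrm{best}} = 0$ the probability $Cr$ is zero and the individual is left untouched, which matches the statement that the globally optimal individual does not undergo crossover.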

3.1.5. Movement Process of KH Algorithm. Affected by the movement induced by other krill individuals, foraging activity, and physical diffusion, the krill herd changes its position toward the direction of optimal fitness. The position vector of a krill individual over the interval $[t, t + \Delta t]$ is described as follows:

$$X_i(t + \Delta t) = X_i(t) + \Delta t \frac{dX_i}{dt} \tag{16}$$

where $\Delta t$ is the scaling factor of the velocity vector. It depends entirely on the search space:

$$\Delta t = C_t \sum_{j=1}^{NV} \left(UB_j - LB_j\right) \tag{17}$$

where $NV$ represents the dimension of the decision variables, $LB_j$ and $UB_j$ are the lower and upper bounds of the $j$th variable ($j = 1, 2, \ldots, NV$), and $C_t$ is the step scaling factor in the range [0, 2].
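The position update of equations (16) and (17) can be sketched as below. The sketch assumes the usual KH Lagrangian model in which the velocity $dX_i/dt$ is the sum of the induced motion, the foraging motion, and the physical diffusion; the function names and all numerical values are hypothetical.

```python
def time_step(lower, upper, c_t):
    """Equation (17): delta_t = C_t * sum over j of (UB_j - LB_j)."""
    return c_t * sum(ub - lb for lb, ub in zip(lower, upper))

def update_position(x, induced, foraging, diffusion, dt):
    """Equation (16), assuming dX/dt = N_i + F_i + D_i in every dimension."""
    return [xi + dt * (n + f + d)
            for xi, n, f, d in zip(x, induced, foraging, diffusion)]

dt = time_step([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], c_t=0.5)
print(dt)  # 1.5
new_x = update_position([0.5, 0.5], [0.01, 0.0], [0.02, 0.0], [0.0, -0.005], dt)
print(new_x)  # approximately [0.545, 0.4925]
```

A wider search space gives a larger $\Delta t$, so the same velocity moves a krill individual further per iteration.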

3.2. The LNNLS-KH Algorithm. In view of the unbalanced exploitation and exploration abilities of the KH algorithm, we propose the LNNLS-KH algorithm for feature selection to improve performance and to pursue a high accuracy rate, a high detection rate, and a low false positive rate in intrusion detection. The improvement is reflected in the following three aspects.

3.2.1. A New Fitness Evaluation Function. To improve the classification accuracy of feature subset detection, we introduce the feature selection dimension and the classification accuracy into the fitness evaluation function. The specific expression of fitness is as follows:

$$\mathrm{fitness} = \alpha \cdot \frac{\mathrm{Feature}_{\mathrm{selected}}}{\mathrm{Feature}_{\mathrm{all}}} + (1 - \alpha) \cdot (1 - \mathrm{Accuracy}) \tag{18}$$

where $\alpha \in [0, 1]$ is a weighting factor used to tune the relative importance of the number of selected features and the classification accuracy, $\mathrm{Feature}_{\mathrm{selected}}$ is the number of selected features, $\mathrm{Feature}_{\mathrm{all}}$ represents the total number of features, and Accuracy indicates the accuracy of the classification results. Moreover, k-nearest neighbor (KNN) is used as the classification algorithm, and the classification accuracy is defined as follows:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{19}$$

where TP, TN, FP, and FN are defined in the confusion matrix, as shown in Table 2.
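Equations (18) and (19) translate directly into code. The sketch below uses the weight $\alpha = 0.02$ that the experiments in this article report; the confusion-matrix counts and the accuracy of 0.98 are hypothetical inputs chosen only to show the arithmetic.

```python
def accuracy(tp, tn, fp, fn):
    """Equation (19): share of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)

def fitness(n_selected, n_total, acc, alpha=0.02):
    """Equation (18): smaller is better (fewer features, higher accuracy)."""
    return alpha * n_selected / n_total + (1.0 - alpha) * (1.0 - acc)

print(accuracy(tp=90, tn=96, fp=4, fn=10))   # 0.93
print(fitness(7, 41, 0.98))                  # about 0.0230
print(fitness(41, 41, 0.98))                 # about 0.0396
```

With such a small $\alpha$ the accuracy term dominates: keeping all 41 features instead of 7 at the same accuracy raises the fitness by only about 0.017, whereas each percentage point of lost accuracy costs about 0.0098.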

3.2.2. Nonlinear Optimization of Physical Diffusion Motion. The physical diffusion of the krill herd is a random diffusion process. The closer the individuals are to the food, the less random the movement is. Due to the strong convergence of the algorithm, the movement of krill individuals changes nonlinearly from fast to slow, and the fitness function gradually decreases as the algorithm converges. According to equations (2) and (9), the movement induced by other krill individuals and the foraging activity are nonlinear. In the physical diffusion equation (14), the diffusion velocity $D_i$ of the $i$th krill individual decreases linearly as the number of iterations increases. To fit the nonlinear motion of the krill herd, we introduce the optimization coefficient $\lambda$ and the fitness factor $\mu_{\mathrm{fit}}$ of the krill herd into the physical diffusion motion. The optimized physical diffusion motion is defined as follows:

$$D_i = D^{\max}\left[1 - \lambda \frac{I}{I_{\max}} - (1 - \lambda)\,\mu_{\mathrm{fit}}\right]\delta \tag{20}$$

where $\lambda$ is in the range [0, 1] and $\mu_{\mathrm{fit}}$ is defined as follows:

$$\mu_{\mathrm{fit}} = \frac{K^{\mathrm{best}}}{K_i} \tag{21}$$

where $K^{\mathrm{best}}$ is the fitness value of the current optimal individual and $K_i$ represents the fitness value of the $i$th krill individual. As the number of iterations increases, $K_i$ gradually decreases until it approaches $K^{\mathrm{best}}$. Therefore, $\mu_{\mathrm{fit}}$ is in the range (0, 1]. Substituting the fitness factor $\mu_{\mathrm{fit}}$ into equation (20) gives the new physical diffusion motion equation:

$$D_i = D^{\max}\left[1 - \lambda \frac{I}{I_{\max}} - (1 - \lambda)\frac{K^{\mathrm{best}}}{K_i}\right]\delta \tag{22}$$

According to equation (22), the iteration number $I$, the fitness $K_i$ of the krill individual, and the fitness $K^{\mathrm{best}}$ of the current optimal krill individual jointly determine the physical diffusion motion, so as to further adjust the random diffusion amplitude. In the early stage of the iteration, the number of iterations is small and the fitness value of the individual is large, so the fitness factor is small, which is conducive to a large random diffusion of the krill herd. As the number of iterations gradually increases, the algorithm converges quickly and the fitness of krill individuals approaches the global optimal solution. At the same time, the fitness factor increases nonlinearly, which makes the random diffusion more consistent with the movement process of a krill individual.
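To make the difference between the linear rule of equation (14) and the fitness-aware rule of equation (22) concrete, the following sketch evaluates both for one krill individual. The value $\lambda = 0.5$ and the fitness values are hypothetical, since the text only constrains $\lambda$ to [0, 1]; the direction vector $\delta$ is passed in explicitly so the result is deterministic.

```python
def diffusion_linear(d_max, iteration, max_iterations, delta):
    """Equation (14): amplitude decays linearly with the iteration count."""
    scale = 1.0 - iteration / max_iterations
    return [d_max * scale * d for d in delta]

def diffusion_nonlinear(d_max, iteration, max_iterations, k_best, k_i, lam, delta):
    """Equation (22): amplitude also depends on mu_fit = K_best / K_i."""
    mu_fit = k_best / k_i
    scale = 1.0 - lam * iteration / max_iterations - (1.0 - lam) * mu_fit
    return [d_max * scale * d for d in delta]

# Iteration 10 of 100, D_max = 0.005, a krill whose fitness is five times the best one.
print(diffusion_linear(0.005, 10, 100, [1.0]))                        # about [0.0045]
print(diffusion_nonlinear(0.005, 10, 100, 0.05, 0.25, 0.5, [1.0]))    # about [0.00425]
```

Setting `lam=1.0` removes the fitness term and recovers equation (14), while a krill whose fitness equals the best one ($\mu_{\mathrm{fit}} = 1$) diffuses least for a given iteration.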

To further evaluate the effect of nonlinear optimization of the physical diffusion motion on the KH algorithm (NO-KH), we conducted experiments on two classical benchmark functions. $F_1(x)$ is the Schwefel 2.22 function, which is a unimodal benchmark function; $F_2(x)$ is the Ackley function, which is a multimodal benchmark function. The experimental parameters of $F_1(x)$ and $F_2(x)$ are shown in Table 3.

Figure 4 shows the Ackley function and the Schwefel 2.22 function graphs for $n = 2$. We use the standard KH algorithm and the NO-KH algorithm to find the optimal value on the unimodal benchmark function and the multimodal benchmark function, respectively. The numbers of krill and iterations are set to 25 and 500, respectively. Table 4 shows the best value, worst value, mean value, and standard deviation obtained by running the algorithms 20 times. Compared with the standard KH algorithm, the NO-KH algorithm finds smaller optimal solutions on both the unimodal and the multimodal benchmark function, so its global exploration ability is improved. The smaller standard deviation obtained from repeated experiments shows that the NO-KH algorithm has better stability. Therefore, nonlinear optimization of the physical diffusion motion of the KH algorithm is effective.
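For readers who want to reproduce the benchmark setting, the two test functions of Table 3 can be written as below. The second function is given in its standard form with an exponential around the averaged cosine term, which is an assumption where the table is hard to read; with that form both functions have the minimum value 0 at the origin, as Table 3 states.

```python
import math

def f1(x):
    """F1 of Table 3: sum of |x_i| plus product of |x_i|."""
    abs_x = [abs(v) for v in x]
    return sum(abs_x) + math.prod(abs_x)

def f2(x):
    """F2 of Table 3, written in the standard exponential/cosine form."""
    n = len(x)
    mean_sq = sum(v * v for v in x) / n
    mean_cos = sum(math.cos(2.0 * math.pi * v) for v in x) / n
    return (-20.0 * math.exp(-0.2 * math.sqrt(mean_sq))
            - math.exp(mean_cos) + 20.0 + math.e)

origin = [0.0] * 10          # Dim = 10, as in Table 3
print(f1(origin))            # 0.0
print(abs(f2(origin)) < 1e-12)  # True
print(f1([1.0, -2.0]))       # 5.0
```

Any optimizer under test, including a KH variant, can then be scored by the best value of `f1` or `f2` it reaches within the allowed number of iterations.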

The above analysis shows that introducing the optimization coefficient $\lambda$ and the fitness factor $\mu_{\mathrm{fit}}$ into the physical diffusion motion of the krill herd helps to dynamically adjust the random diffusion amplitude of the krill individuals and to accelerate the convergence of the algorithm. Meanwhile, it increases the nonlinearity of the physical diffusion motion and the global exploration ability of the algorithm.

3.2.3. Linear Nearest Neighbor Lasso Step Optimization. When the KH algorithm is used to solve multidimensional complex function optimization problems, its local search ability is weak, and exploitation and exploration are difficult to balance. To enhance the local exploitation and global exploration abilities of the algorithm, the influence of excellent neighbor individuals on the krill herd during evolution is considered, and an improved KH algorithm is proposed in [42]. That algorithm introduces the nearest neighbor lasso operator to mine the neighborhood of potentially excellent individuals and thereby improve the local search ability of krill individuals, but the random parameters introduced in the lasso operator increase the uncertainty of the algorithm. To cope with this problem, we introduce an improved krill herd algorithm based on linear nearest neighbor lasso step optimization (LNNLS-KH), which finds the nearest neighbor of each krill individual after its position is updated and moves linearly by a defined step to derive a better fitness value. With linearization, the nearest neighbor lasso step of the algorithm changes linearly with the number of iterations, balancing the exploitation and exploration abilities of the algorithm. In the early iterations, a large linear nearest neighbor lasso step is selected so that krill individuals quickly adjust their positions, which improves the search efficiency of the algorithm. In the later stage of iteration, the nearest neighbor lasso step decreases linearly to obtain the global optimal solution.

In the krill herd $X = \{X_1, X_2, \ldots, X_n\}$, assuming that the $j$th krill individual is the nearest neighbor of the $i$th krill individual, the Euclidean distance between the two krill individuals is defined as follows:

$$\mathrm{distance}_{ij} = \lVert X_i - X_j \rVert \tag{23}$$

where $\{X_i, X_j\} \subset S$ and $i \neq j$. The equation of the linear nearest neighbor lasso step is defined as follows:

$$\mathrm{step} = \begin{cases} \dfrac{I}{I_{\max}} \times \left(X_i - X_j\right), & K_i > K_j \\[2ex] \dfrac{I}{I_{\max}} \times \left(X_j - X_i\right), & K_j > K_i \end{cases} \tag{24}$$

The fitness function is expressed as equation (18). Therefore, a smaller fitness value means that fewer features are selected at a higher accuracy, i.e., the position of the krill individual is better. The schematic diagram of LNNLS-KH is shown in Figure 5. The new position $Y_k$ of the $j$th krill individual is expressed as follows:

$$Y_k = \begin{cases} X_j + \dfrac{I}{I_{\max}} \times \left(X_i - X_j\right), & K_i > K_j \\[2ex] X_i + \dfrac{I}{I_{\max}} \times \left(X_j - X_i\right), & K_j > K_i \end{cases} \tag{25}$$

When the $i$th and $j$th krill individuals lie at opposite ends of the food, the new position $Y_k$ will be far from the optimal solution after the linear nearest neighbor lasso step optimization, as shown in Figure 6.
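The lasso step of equations (24) and (25), together with the keep-or-discard rule listed in Algorithm 1, can be sketched as follows. This is an illustrative reading, not the authors' code: the function names are hypothetical, the tie case $K_i = K_j$ (which equation (25) leaves undefined) is folded into the second branch, and a candidate with equal fitness is kept.

```python
import math

def nearest_neighbor(positions, i):
    """Index of the krill closest to krill i in Euclidean distance (equation (23))."""
    return min((j for j in range(len(positions)) if j != i),
               key=lambda j: math.dist(positions[i], positions[j]))

def lasso_position(x_i, x_j, k_i, k_j, iteration, max_iterations):
    """Equation (25): candidate position Y_k between krill i and its neighbor j."""
    ratio = iteration / max_iterations
    if k_i > k_j:
        return [b + ratio * (a - b) for a, b in zip(x_i, x_j)]
    return [a + ratio * (b - a) for a, b in zip(x_i, x_j)]   # also used for ties

def keep_candidate(k_candidate, k_current):
    """Algorithm 1: Y_k replaces the current krill unless its fitness is larger."""
    return k_candidate <= k_current

herd = [(0.0, 0.0), (1.0, 1.0), (0.2, 0.1)]
print(nearest_neighbor(herd, 0))                                        # 2
print(lasso_position([0.0, 0.0], [1.0, 1.0], 0.3, 0.1, 25, 100))        # [0.75, 0.75]
print(lasso_position([0.0, 0.0], [1.0, 1.0], 0.1, 0.3, 25, 100))        # [0.25, 0.25]
```

In both branches the candidate starts from the individual with the smaller fitness and moves a fraction $I/I_{\max}$ of the way toward the other one; the greedy acceptance test then prevents a worse candidate from entering the herd, which covers the situation of Figure 6.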

Table 2: Confusion matrix.

| Confusion matrix | True condition positive | True condition negative |
|---|---|---|
| Predicted condition positive | True positive (TP) | False positive (FP) |
| Predicted condition negative | False negative (FN) | True negative (TN) |


Table 3: Benchmark functions in the experiment.

| Benchmark function | Dim | Range | $f_{\min}$ |
|---|---|---|---|
| $F_1(x) = \sum_{i=1}^{n} \lvert x_i \rvert + \prod_{i=1}^{n} \lvert x_i \rvert$ | 10 | [−10, 10] | 0 |
| $F_2(x) = -20\exp\left(-0.2\sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}\right) - \exp\left(\frac{1}{n}\sum_{i=1}^{n}\cos(2\pi x_i)\right) + 20 + e$ | 10 | [−32, 32] | 0 |

Figure 4: Schwefel 2.22 function and Ackley function graphs for $n = 2$. (a) Unimodal benchmark function Schwefel 2.22 ($F_1$). (b) Multimodal benchmark function Ackley ($F_2$). [Surface plots over $x_1$ and $x_2$.]

Table 4: The statistical results of KH and NO-KH algorithms on two benchmark functions.

| $f(x)$ | Algorithm | Best value | Worst value | Mean value | Standard deviation |
|---|---|---|---|---|---|
| $F_1$ | KH | 1.692E−04 | 1.099E−02 | 1.508E−03 | 3.342E−03 |
| $F_1$ | NO-KH | 3.277E−05 | 9.632E−04 | 4.221E−04 | 3.908E−04 |
| $F_2$ | KH | 5.716E−05 | 2.168 | 0.329 | 0.816 |
| $F_2$ | NO-KH | 8.309E−06 | 1.155 | 0.116 | 0.362 |

Figure 5: Optimization of linear nearest neighbor lasso step for krill individuals at the same end of food. [Schematic legend: the position of food, the position of krill $X_i$, the position of new krill $Y_i$ after LNNLS, the distance between two krill, the length of LNNLS.]

Figure 6: Optimization of linear neighboring lasso step for krill individuals at both ends of food. [Schematic with the same legend; $\mathrm{distance}_{ij}$ is the distance between $X_i$ and $X_j$.]


The pseudocode of the LNNLS-KH algorithm is shown in Algorithm 1.

3.3. Analysis of Time Complexity. In the KH algorithm, each krill individual updates its position after the movement induced by other krill individuals, foraging activity, and physical diffusion motion, with a time complexity of $O(N)$. After $I_{\max}$ iterations, the time complexity of the algorithm is $O(I_{\max} \cdot N)$. In the LNNLS-KH algorithm, the modified fitness function and the nonlinear optimization of the physical diffusion motion add hardly any calculations, so the time complexity is unchanged. In addition, the linear nearest neighbor lasso step optimization adds the calculations of equations (24) and (25) after each krill individual completes its position update in every iteration, with a time complexity of $O(I_{\max} \cdot N)$. Therefore, the total time complexity of the LNNLS-KH algorithm is $O(2 I_{\max} \cdot N)$.
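As a rough worked instance of this count, assume one unit of work per krill individual per iteration, as the text does, and take the settings $I_{\max} = 100$ and $N = 30$ that the NSL-KDD experiments in this article use:

```latex
\begin{aligned}
\underbrace{I_{\max} N}_{\text{position updates}} + \underbrace{I_{\max} N}_{\text{lasso steps}}
  &= 2 I_{\max} N = 2 \cdot 100 \cdot 30 = 6000, \\
O(2 I_{\max} N) &= O(I_{\max} N).
\end{aligned}
```

The factor 2 doubles the number of per-individual operations but does not change the asymptotic order.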

3.4. Description of the LNNLS-KH Algorithm for IDS Feature Selection. IDS is a system that recognizes and processes malicious usage of computers and network resources. The intrusion detection dataset records normal and abnormal traffic, including network traffic data and types of network attacks, and provides data support for the research and development of intrusion detection technology. IDS is generally composed of data acquisition, data preprocessing, detection units, and response actions, as shown in Figure 7.

The LNNLS-KH algorithm is used to select high-quality feature subsets for IDS. The features of the intrusion detection dataset are randomly initialized to different real numbers in the range [0, 1], which constitute the position vectors of the krill herd. By calculating the fitness function and carrying out the LNNLS-KH algorithm, the position vectors of the krill herd are constantly updated. The fitness function is determined by the number of selected features and the classification accuracy, so the position vectors of the krill herd move toward the optimal fitness value. According to [47], it is appropriate to set the feature selection threshold to 0.7. When the maximum number of iterations is reached, the components of the krill position vector that are larger than the threshold are selected. The selected features constitute the feature subset of the intrusion detection data. The selected feature subset is then sent to the detection units. Because the K-Nearest Neighbor (KNN) algorithm is relatively mature in theory, the detection units adopt it to construct the intrusion detection classifier. Finally, the intrusion detection results are evaluated on the test dataset. The process of the LNNLS-KH algorithm for IDS feature selection is shown in Figure 8.
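The final thresholding step, which turns a real-valued krill position into a feature subset, can be sketched as below. The threshold 0.7 is the value quoted from [47]; the function name, the 1-based feature numbering, and the sample position vector are hypothetical.

```python
def select_features(position, threshold=0.7):
    """Return the 1-based indices of features whose component exceeds the threshold."""
    return [idx + 1 for idx, value in enumerate(position) if value > threshold]

# Hypothetical position of the best krill for a five-feature dataset.
best_position = [0.91, 0.12, 0.75, 0.70, 0.33]
print(select_features(best_position))  # [1, 3]
```

A component exactly equal to the threshold is not selected in this sketch, following the wording "larger than the threshold".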

4. Results and Discussion

To verify the performance of the LNNLS-KH algorithm in IDS feature selection, we adopt the NSL-KDD network intrusion detection dataset and the CICIDS2017 dataset for experiments.

4.1. Datasets Analysis. The NSL-KDD dataset is a classic dataset in the field of anomaly detection. As an improved version of the KDD CUP 99 dataset, it is currently one of the most reliable and influential intrusion detection datasets. Compared with the KDD CUP 99 dataset, the NSL-KDD dataset eliminates duplicate data, so the dataset hardly contains redundant records. Meanwhile, the proportion of each type of record in the NSL-KDD dataset has been adjusted to be reasonable. Each record in the NSL-KDD dataset includes 41-dimensional features and a classification label. KDDTrain+ and KDDTest+ in the NSL-KDD dataset are selected as the training subset and the test subset. The attacks are divided into four types: denial of service (DoS), scan and probe (Probe), remote to local (R2L), and user to root (U2R). The detailed attack names and the distribution of sample categories are shown in Tables 5 and 6. The features of the NSL-KDD dataset are shown in Table 7.

The NSL-KDD dataset includes four types of features: the basic features of TCP connections (9 in total), the content features of TCP connections (13 in total), the time-based network traffic statistics (9 in total), and the host-based network traffic statistics (10 in total). Among all the features, "protocol_type", "service", and "flag" are character-type features, which need to be preprocessed and mapped to numerical values. Because mixed numeric and character data types are difficult to deal with, one-hot encoding is used to map different characters to different values. For example, the "protocol_type" feature includes three types of protocol, denoted by icmp = [1, 0, 0], tcp = [0, 1, 0], and udp = [0, 0, 1]. Similarly, the 70 attributes in "service" and the 11 attributes in "flag" are numeralized in the same way. The 41-dimensional feature is expanded to 122 dimensions after one-hot encoding. At the same time, the dataset is normalized to eliminate the influence of features of different orders of magnitude on the calculation results, thus reducing the experimental error. The data preprocessing helps to improve the accuracy of classification and ensures the reliability of the results. The values corresponding to each feature are normalized to the interval [0, 1], and the normalization expression is as follows:

$$X^{*} = \frac{X - X_{\min}}{X_{\max} - X_{\min}} \tag{26}$$

where $X^{*}$ is the normalized eigenvalue, $X$ is the original eigenvalue, and $X_{\max}$ and $X_{\min}$ represent the maximum and minimum values of the same feature dimension.
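The two preprocessing steps, one-hot encoding of character features and min-max normalization with denominator $X_{\max} - X_{\min}$, can be illustrated with plain Python. The function names and the handling of a constant column (mapped to zeros) are assumptions of this sketch; the protocol order icmp, tcp, udp follows the example in the text.

```python
def one_hot(value, categories):
    """Encode a categorical value as a 0/1 vector over the given category order."""
    return [1 if value == c else 0 for c in categories]

def min_max(column):
    """Min-max normalization of one feature column to the interval [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]   # assumed handling of a constant column
    return [(v - lo) / (hi - lo) for v in column]

protocols = ["icmp", "tcp", "udp"]
print(one_hot("tcp", protocols))     # [0, 1, 0]
print(min_max([0.0, 5.0, 10.0]))     # [0.0, 0.5, 1.0]
print(38 + 3 + 70 + 11)              # 122 dimensions after encoding
```

The last line reproduces the dimension count in the text: the 38 numeric features stay as they are, and the three character features expand to 3, 70, and 11 columns.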

Although NSL-KDD is a benchmark dataset in the field of network intrusion detection, some of its attack types are outdated due to the rapid development of network technology. Therefore, it hardly reflects the current real network environment. CICIDS2017 is a novel network intrusion detection dataset released by the Canadian Institute for Cybersecurity (CIC) in 2017. The dataset collected traffic data for five days, with only normal traffic on Monday and attacks occurring in the morning and afternoon from Tuesday to Friday. It includes "FTP patator", "SSH patator", "DoS GoldenEye", "DoS Slowhttptest", "DoS Slowloris", "Heartbleed", "Web Attack Brute Force", "Web Attack Sql Injection", "Web Attack XSS", "Infiltration Attack", "Bot", "DDoS", and "PortScan", which are common types of attacks in modern networks. The distribution of attack time and types of the CICIDS2017 dataset is shown in Table 8. We use the MachineLearningCVE file in the CICIDS2017 dataset as the dataset, which contains 78 features and an attack type label. The number and name of each feature are shown in Table 9. Compared with the NSL-KDD dataset, the attack types in the CICIDS2017 dataset are more in line with the situation of modern networks.

Figure 7: The framework of IDS. [Flowchart: data acquisition, data preprocessing, detection units, response actions.]

Figure 8: The process of LNNLS-KH algorithm for IDS feature selection. [Flowchart: input the IDS dataset; initialize parameters (N, NV, Imax, UB, LB) and the krill herd position; calculate the fitness of individuals; calculate the three actions (movement induced by other krill individuals, foraging activity, nonlinear physical diffusion motion); genetic operator; update the position and fitness values of individuals; find the nearest krill and calculate the linear lasso step; compare the fitness of Yk with that of Xi (or Xj) and keep the better position; update Xgb and Kgb of the global optimal individual; repeat until the iteration count exceeds Imax; output the optimal solution and the number of selected features; KNN algorithm for intrusion detection; evaluate intrusion detection results.]

4.2. Experimental Results and Discussion of NSL-KDD Dataset. The experiment is conducted in MATLAB R2016a on a Windows 64-bit operating system with an Intel(R) Core(TM) i7-4790 processor. Since the training of the algorithm requires normal and abnormal samples, we mix normal samples and different types of attack samples to construct training sets and test sets for the four attack types. To reduce the time spent searching for the optimal feature subset, we randomly select 50% of the Probe attack samples, 10% of the DoS attack samples, 100% of the U2R attack samples, and 100% of the R2L attack samples in the KDDTrain+ dataset as the training dataset, and 100% of the Probe samples, 50% of the DoS samples, 100% of the U2R samples, and 20% of the R2L samples in the KDDTest+ dataset as the test dataset.

Algorithm 1: The LNNLS-KH algorithm.

Input: Training set
Output: Global best solution, the number of selected features, and feature selection time

Begin
  Initialize algorithm parameters Nmax, Vf, Dmax, NV, Imax, UB, LB
  Initialize the krill herd position
  Evaluate the fitness of krill individuals and find the individuals with the best and worst fitness values
  for I = 1 to Imax do
    for each krill individual i (i = 1, 2, ..., N) do
      Calculate the three components of motion:
        (1) The motion induced by other krill individuals
        (2) The foraging activity
        (3) The nonlinear optimized physical diffusion
      Implement crossover operator
      Update krill herd position and fitness values
      Calculate the linear nearest neighbor lasso step and new position using equations (24) and (25) and update new fitness values
      if K_Yk > Ki (or Kj) then
        Keep Ki (or Kj) and delete K_Yk
      else
        Keep K_Yk and delete Ki (or Kj)
      end if
    end for
    Update Xgb and Kgb of the globally optimal individuals
  end for
  Output the global best solution, the number of selected features, and feature selection time
End

Table 5: The distribution of sample categories.

| Attack type | Attack names |
|---|---|
| DoS | Neptune, back, land, pod, smurf, teardrop, mailbomb, Apache2, processtable, udpstorm, worm |
| Probe | Ipsweep, nmap, portsweep, Satan, mscan, saint |
| R2L | ftp_write, guess_passwd, imap, multihop, phf, spy, warezclient, warezmaster, sendmail, named, snmpgetattack, snmpguess, xlock, xsnoop, httptunnel |
| U2R | buffer_overflow, loadmodule, perl, rootkit, ps, sqlattack, xterm |

Table 6: The distribution of sample categories.

| Data category | KDDTrain+ samples | KDDTest+ samples | Total number of samples |
|---|---|---|---|
| Normal | 65120 | 11536 | 76656 |
| DoS | 36944 | 6251 | 43195 |
| Probe | 10786 | 2421 | 13207 |
| R2L | 995 | 2653 | 3648 |
| U2R | 52 | 67 | 119 |
| All | 113897 | 22928 | 136825 |

For the LNNLS-KH algorithm, the maximum number of iterations $I_{\max}$ and the number of krill individuals $N$ are set to 100 and 30, respectively. Following [41], the foraging speed of krill individuals $V_f$ is set to 0.02, the maximum random diffusion rate $D^{\max}$ is set to 0.05, and the maximum induction speed $N^{\max}$ is set to 0.01. Following [47], the threshold $\theta$ is set to 0.7. As the LNNLS-KH algorithm is designed to ensure high accuracy first and to reduce the number of features second, the weight factor $\alpha$ in the fitness function is set to 0.02.

$$\mathrm{FPR} = \frac{FP}{TN + FP} \tag{27}$$

$$\mathrm{DR} = \frac{TP}{TP + FN} \tag{28}$$

We adopt the iterative curve of the global optimal fitness value, feature selection time, test set detection time, data dimension after feature selection, classification accuracy, detection rate (DR), and false positive rate (FPR) as evaluation measures of feature selection for IDS. The accuracy represents the ratio of correctly classified samples to the total number of samples, as defined in equation (19). FPR, also known as false alarm rate (FAR), represents the ratio of samples that are incorrectly detected as intrusions to all normal samples, as shown in equation (27). DR, also known as recall or sensitivity, represents the probability of being correctly detected among all abnormal samples, as shown in equation (28). The crossover-mutation PSO (CMPSO) algorithm [47], ACO algorithm [48], KH algorithm [41], and IKH algorithm [9] are used for comparison. The experimental results on the Probe, DoS, R2L, and U2R datasets are as follows.

Table 7: The features of NSL-KDD dataset.

| Classification of features | Number | Serial number and name of features |
|---|---|---|
| The basic characteristics of TCP connections | 9 | (1) duration, (2) protocol_type, (3) service, (4) flag, (5) src_bytes, (6) dst_bytes, (7) land, (8) wrong_fragment, (9) urgent |
| The content characteristics of a TCP connection | 13 | (10) hot, (11) num_failed_logins, (12) logged_in, (13) num_compromised, (14) root_shell, (15) num_root, (16) su_attempted, (17) num_file_creations, (18) num_shells, (19) num_access_files, (20) num_outbound_cmds, (21) is_host_login, (22) is_guest_login |
| Time-based statistical characteristics of network traffic | 9 | (23) count, (24) srv_count, (25) serror_rate, (26) srv_serror_rate, (27) rerror_rate, (28) srv_rerror_rate, (29) same_srv_rate, (30) diff_srv_rate, (31) srv_diff_host_rate |
| Host-based network traffic statistics | 10 | (32) dst_host_count, (33) dst_host_srv_count, (34) dst_host_same_srv_rate, (35) dst_host_diff_srv_rate, (36) dst_host_same_src_port_rate, (37) dst_host_srv_diff_host_rate, (38) dst_host_serror_rate, (39) dst_host_srv_serror_rate, (40) dst_host_rerror_rate, (41) dst_host_srv_rerror_rate |

Table 8: Attack time and attack types of the CICIDS2017 dataset.

| Time | Type | Label | Amount | Total |
|---|---|---|---|---|
| Monday | Normal | BENIGN | 529918 | 529918 |
| Tuesday | Normal | BENIGN | 432074 | 445909 |
| | Brute force | FTP patator | 7938 | |
| | Brute force | SSH patator | 5897 | |
| Wednesday | Normal | BENIGN | 440031 | 692703 |
| | DoS | DoS GoldenEye | 10293 | |
| | DoS | DoS slowhttptest | 5499 | |
| | DoS | DoS slowloris | 5796 | |
| | DoS | Heartbleed | 11 | |
| Thursday morning | Normal | BENIGN | 168186 | 170366 |
| | Web attack | Web attack brute force | 1507 | |
| | Web attack | Web attack sql injection | 21 | |
| | Web attack | Web attack XSS | 652 | |
| Thursday afternoon | Normal | BENIGN | 288566 | 288602 |
| | Infiltration | Infiltration | 36 | |
| Friday morning | Normal | BENIGN | 189067 | 191033 |
| | Botnet | Bot | 1966 | |
| Friday afternoon (1) | Normal | BENIGN | 97718 | 225745 |
| | DDoS | DDoS | 128027 | |
| Friday afternoon (2) | Normal | BENIGN | 127537 | 286467 |
| | PortScan | PortScan | 158930 | |
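The two rate measures reduce to simple ratios of confusion-matrix counts, with the detection rate written as TP/(TP + FN) in line with its description as recall. The counts in the example are hypothetical and serve only to show the calculation.

```python
def false_positive_rate(fp, tn):
    """Equation (27): normal samples wrongly flagged as intrusions, over all normal samples."""
    return fp / (tn + fp)

def detection_rate(tp, fn):
    """Equation (28): correctly detected intrusions over all intrusions (recall)."""
    return tp / (tp + fn)

# Hypothetical test set: 100 normal records and 100 attack records.
print(false_positive_rate(fp=4, tn=96))   # 0.04
print(detection_rate(tp=90, fn=10))       # 0.9
```

A good feature subset keeps the first ratio low while keeping the second one high, which is why the two are reported side by side.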

To reflect the performance of the LNNLS-KH algorithm intuitively, the convergence curves of the fitness function for the Probe, DoS, U2R, and R2L datasets are shown in Figure 9. The results show that the LNNLS-KH algorithm achieves a good fitness function value when the number of iterations reaches about 20, which demonstrates its strong exploitation ability and good convergence performance. As the number of iterations increases, the other algorithms show varying degrees of convergence stagnation, while the LNNLS-KH algorithm repeatedly jumps out of local optima and finds the global optimal solution with better fitness. The fitness function values after 100 iterations reach 0.0328, 0.0393, 0.0292, and 0.0036, respectively, for the four attack datasets, showing excellent exploration ability. Therefore, compared with the CMPSO, ACO, KH, and IKH algorithms, the LNNLS-KH algorithm exhibits faster convergence speed and stronger abilities of exploitation and exploration.

The results of different feature selection algorithms are shown in Table 10. The number in front of the brackets indicates the quantity of features after feature selection, and the specific feature numbers are listed in the brackets. The comparison of feature selection dimensions is shown in Figure 10, and different colours are used to distinguish the five algorithms. The proposed LNNLS-KH algorithm, marked in red, is in the innermost circle of Figure 10 for the Probe, DoS, U2R, and R2L datasets. This indicates that, compared with the other four feature selection algorithms, the LNNLS-KH algorithm retains the fewest features while ensuring accuracy. According to Figure 10, the LNNLS-KH algorithm selects on average 7 main features of the NSL-KDD dataset, accounting for 17.07% of the total number of features. Compared with the CMPSO, ACO, KH, and IKH algorithms, the proposed LNNLS-KH algorithm reduces the number of features by 44%, 42.86%, 34.88%, and 24.32%, respectively, on the datasets of the four attack types. Meanwhile, the total number of features in the four types of attack datasets is reduced by 37.43%.

Figure 9: Convergence curve of fitness functions for the four attack datasets. [Four panels (Probe, DoS, U2R, R2L); x-axis: number of iterations (0 to 100); y-axis: fitness function; curves: CMPSO, ACO, KH, IKH, LNNLS-KH.]
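The percentages quoted above follow from the feature counts listed in Table 10, as the short check below illustrates. The dictionary simply restates those counts in the order Probe, DoS, U2R, R2L; the function name is hypothetical.

```python
# Numbers of selected features per dataset (Probe, DoS, U2R, R2L), taken from Table 10.
selected = {
    "CMPSO": [14, 16, 9, 11],
    "ACO": [15, 16, 8, 10],
    "KH": [13, 12, 8, 10],
    "IKH": [11, 12, 6, 8],
    "LNNLS-KH": [8, 10, 3, 7],
}
TOTAL_FEATURES = 41
ours = sum(selected["LNNLS-KH"])

def reduction_vs(name):
    """Percentage by which LNNLS-KH reduces the summed feature count of another algorithm."""
    other = sum(selected[name])
    return 100.0 * (other - ours) / other

for name in ("CMPSO", "ACO", "KH", "IKH"):
    print(name, round(reduction_vs(name), 2))          # 44.0, 42.86, 34.88, 24.32

others_mean = sum(sum(selected[n]) for n in ("CMPSO", "ACO", "KH", "IKH")) / 4
print(round(100.0 * (others_mean - ours) / others_mean, 2))   # 37.43
print(round(100.0 * (ours / 4) / TOTAL_FEATURES, 2))          # 17.07
```

The 37.43% figure is therefore the reduction relative to the mean of the four competing totals, and 17.07% is the average of 7 selected features out of 41.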

To further evaluate the performance of the feature selection algorithms, we show the feature selection time and detection time of the five algorithms in Table 11. Feature selection time represents the time spent filtering out redundant features. The detection time represents the time from inputting the most representative feature subset into the KNN classifier to the end of detection. It can be seen from Table 11 that the feature selection time of the standard KH algorithm is shorter than that of the CMPSO algorithm and the ACO algorithm, which indicates that the KH algorithm achieves faster speed and better performance. In addition, compared with the standard KH algorithm, the feature selection time of the LNNLS-KH algorithm is longer, which is mainly due to the nonlinear optimization of the physical diffusion motion and the linear nearest neighbor lasso step optimization performed after the krill herd position is updated. Although the feature selection time increases somewhat, the convergence speed and global search ability are greatly improved. At the same time, the LNNLS-KH algorithm removes redundant features, which considerably increases the detection speed. In comparison with the other four feature selection algorithms, the detection time of the LNNLS-KH algorithm is reduced by 16.83%, 16.91%, 8.94%, and 6.96% on average on the test dataset samples of Probe, DoS, R2L, and U2R, respectively.

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the test dataset is classified using the KNN classifier. The classification accuracy of different algorithms is shown in Table 12. Comparing the accuracy results, the LNNLS-KH feature selection algorithm achieves a classification accuracy above 90% for the Probe, DoS, U2R, and R2L test dataset samples. Furthermore, the LNNLS-KH algorithm improves the average classification accuracy on the Probe, DoS, U2R, and R2L test dataset samples by 9.95%, 12.04%, 9.47%, and 8.66%, respectively.

Table 9: The number and name of the features in the CICIDS2017 dataset.

| Feature number | Feature name | Feature number | Feature name | Feature number | Feature name |
|---|---|---|---|---|---|
| 1 | Destination port | 27 | Bwd IAT mean | 53 | Average packet size |
| 2 | Flow duration | 28 | Bwd IAT std | 54 | Avg fwd segment size |
| 3 | Total fwd packets | 29 | Bwd IAT max | 55 | Avg bwd segment size |
| 4 | Total backward packets | 30 | Bwd IAT min | 56 | Fwd header length |
| 5 | Total length of fwd packets | 31 | Fwd PSH flags | 57 | Fwd avg bytes/bulk |
| 6 | Total length of bwd packets | 32 | Bwd PSH flags | 58 | Fwd avg packets/bulk |
| 7 | Fwd packet length max | 33 | Fwd URG flags | 59 | Fwd avg bulk rate |
| 8 | Fwd packet length min | 34 | Bwd URG flags | 60 | Bwd avg bytes/bulk |
| 9 | Fwd packet length mean | 35 | Fwd header length | 61 | Bwd avg packets/bulk |
| 10 | Fwd packet length std | 36 | Bwd header length | 62 | Bwd avg bulk rate |
| 11 | Bwd packet length max | 37 | Fwd packets/s | 63 | Subflow fwd packets |
| 12 | Bwd packet length min | 38 | Bwd packets/s | 64 | Subflow fwd bytes |
| 13 | Bwd packet length mean | 39 | Min packet length | 65 | Subflow bwd packets |
| 14 | Bwd packet length std | 40 | Max packet length | 66 | Subflow bwd bytes |
| 15 | Flow bytes/s | 41 | Packet length mean | 67 | Init_Win_bytes_forward |
| 16 | Flow packets/s | 42 | Packet length std | 68 | Init_Win_bytes_backward |
| 17 | Flow IAT mean | 43 | Packet length variance | 69 | act_data_pkt_fwd |
| 18 | Flow IAT std | 44 | FIN flag count | 70 | min_seg_size_forward |
| 19 | Flow IAT max | 45 | SYN flag count | 71 | Active mean |
| 20 | Flow IAT min | 46 | RST flag count | 72 | Active std |
| 21 | Fwd IAT total | 47 | PSH flag count | 73 | Active max |
| 22 | Fwd IAT mean | 48 | ACK flag count | 74 | Active min |
| 23 | Fwd IAT std | 49 | URG flag count | 75 | Idle mean |
| 24 | Fwd IAT max | 50 | CWE flag count | 76 | Idle std |
| 25 | Fwd IAT min | 51 | ECE flag count | 77 | Idle max |
| 26 | Bwd IAT total | 52 | Down/up ratio | 78 | Idle min |

Figure 10: Comparison of feature selection dimensions produced by different algorithms. [Radar chart with axes Probe, DoS, U2R, and R2L (scale 0 to 20); series: CMPSO, ACO, KH, IKH, LNNLS-KH.]
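The accuracy gains quoted above can be recomputed from the values of Table 12 by comparing LNNLS-KH with the mean accuracy of the other four algorithms on each dataset. The sketch below restates the table values; reading the gain as a difference in percentage points against that mean is an interpretation made here, which the numbers support.

```python
# Classification accuracy in percent, taken from Table 12.
accuracy_table = {
    "Probe": {"CMPSO": 80.46, "ACO": 86.56, "KH": 92.42, "IKH": 93.74, "LNNLS-KH": 98.24},
    "DoS":   {"CMPSO": 81.74, "ACO": 83.36, "KH": 86.03, "IKH": 88.74, "LNNLS-KH": 97.01},
    "U2R":   {"CMPSO": 82.74, "ACO": 84.57, "KH": 85.59, "IKH": 91.89, "LNNLS-KH": 95.67},
    "R2L":   {"CMPSO": 78.70, "ACO": 81.62, "KH": 88.78, "IKH": 90.49, "LNNLS-KH": 93.56},
}

def gain(dataset):
    """LNNLS-KH accuracy minus the mean accuracy of the four other algorithms."""
    row = accuracy_table[dataset]
    others = [v for name, v in row.items() if name != "LNNLS-KH"]
    return row["LNNLS-KH"] - sum(others) / len(others)

gains = {name: gain(name) for name in accuracy_table}
for name, value in gains.items():
    print(name, f"{value:.4f}")   # about 9.945, 12.0425, 9.4725, 8.6625
print(f"{sum(gains.values()) / 4:.4f}")   # about 10.0306
```

The four gains agree with the reported 9.95, 12.04, 9.47, and 8.66 to within rounding, and their mean matches the 10.03% improvement stated in the abstract.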

Table 13 shows the false positive rate and detection rate of the feature subsets produced by different feature selection algorithms. To visualize the difference, we show the comparison in Figure 11. For the Probe, DoS, U2R, and R2L datasets, the average false positive rate of the LNNLS-KH feature selection algorithm is 4.00%, which is 20.70%, 15.30%, 8.88%, and 3.34% lower than those of the CMPSO, ACO, KH, and IKH algorithms, respectively. Similarly, for the detection rate, the proposed LNNLS-KH feature selection algorithm exhibits excellent performance. The average detection rate of

Table 10: The feature selection results of different feature selection algorithms (NSL-KDD dataset).

Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH
Probe | 14 (2, 3, 4, 7, 8, 10, 11, 17, 19, 20, 21, 27, 30, 33) | 15 (1, 3, 4, 6, 15, 16, 17, 19, 21, 23, 29, 35, 39, 40, 41) | 13 (3, 4, 5, 7, 8, 13, 14, 18, 19, 21, 26, 28, 40) | 11 (2, 3, 5, 8, 10, 17, 18, 29, 34, 35, 41) | 8 (3, 4, 8, 11, 15, 29, 34, 40)
DoS | 16 (3, 4, 5, 6, 8, 13, 14, 17, 18, 22, 23, 26, 30, 32, 35, 41) | 16 (3, 4, 7, 12, 14, 19, 20, 25, 27, 28, 30, 33, 34, 37, 40, 41) | 12 (2, 3, 4, 5, 8, 9, 12, 15, 19, 24, 26, 30) | 12 (2, 3, 4, 6, 12, 18, 20, 22, 27, 28, 30, 31) | 10 (3, 4, 6, 15, 17, 19, 20, 21, 30, 37)
U2R | 9 (3, 4, 5, 9, 12, 19, 32, 33, 41) | 8 (3, 4, 6, 8, 20, 24, 33, 36) | 8 (3, 4, 10, 12, 19, 23, 31, 32) | 6 (3, 10, 11, 21, 36, 39) | 3 (3, 33, 36)
R2L | 11 (2, 3, 4, 8, 21, 22, 25, 27, 37, 40, 41) | 10 (3, 4, 7, 12, 17, 21, 29, 37, 38, 40) | 10 (2, 3, 4, 6, 13, 18, 19, 22, 32, 41) | 8 (3, 4, 5, 8, 11, 14, 21, 31) | 7 (2, 3, 4, 10, 15, 21, 36)

Table 11: Feature selection time and detection time of different feature selection algorithms (NSL-KDD dataset).

Time of feature selection (second)
Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH
Probe | 523178 | 499814 | 474533 | 534887 | 549048
DoS | 789235 | 763086 | 716852 | 803816 | 829692
U2R | 15487 | 14729 | 14418 | 15779 | 17224
R2L | 255675 | 236908 | 224092 | 266951 | 272770

Time of detection (second)
Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH
Probe | 3713 | 3823 | 3530 | 3405 | 3106
DoS | 11869 | 11815 | 10666 | 10514 | 9844
U2R | 0087 | 0086 | 0086 | 0086 | 0078
R2L | 955 | 913 | 907 | 862 | 803

Table 12: The classification accuracy of different feature selection algorithms (NSL-KDD dataset).

Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%)
Probe | 80.46 | 86.56 | 92.42 | 93.74 | 98.24
DoS | 81.74 | 83.36 | 86.03 | 88.74 | 97.01
U2R | 82.74 | 84.57 | 85.59 | 91.89 | 95.67
R2L | 78.70 | 81.62 | 88.78 | 90.49 | 93.56

Figure 11: Comparison of classification FPR and DR of different feature selection algorithms. (a) FPR (%) of different feature selection algorithms. (b) DR (%) of different feature selection algorithms.


the LNNLS-KH algorithm is 96.48%, which is 13.47%, 9.32%, 7.02%, and 4.72% higher than that of the CMPSO, ACO, KH, and IKH feature selection algorithms, respectively.

In conclusion, the LNNLS-KH feature selection algorithm performs excellently in terms of the global optimal fitness iteration curve, test set detection time, number of dimensions of the feature subset, classification accuracy, false positive rate, and detection rate. Although the offline training time of the LNNLS-KH algorithm is longer than that of the CMPSO, ACO, KH, and IKH algorithms, its lower feature dimension reduces the detection time. Moreover, the algorithm has a faster convergence speed, higher detection accuracy and detection rate, and a lower classification false positive rate.
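For reference, the two rates compared above can be computed from the entries of a binary confusion matrix. The sketch below assumes the usual definitions, DR = TP / (TP + FN) and FPR = FP / (FP + TN); the counts are hypothetical and serve only to illustrate the arithmetic.

```python
def detection_metrics(tp, fn, fp, tn):
    """Return (detection rate, false positive rate), both in percent.

    Assumed definitions: DR = TP / (TP + FN), FPR = FP / (FP + TN).
    """
    dr = 100.0 * tp / (tp + fn)
    fpr = 100.0 * fp / (fp + tn)
    return dr, fpr


# Hypothetical counts, not taken from the paper's experiments.
dr, fpr = detection_metrics(tp=965, fn=35, fp=40, tn=960)
print(f"DR = {dr:.2f}%, FPR = {fpr:.2f}%")
```

A feature subset that keeps the discriminative features should raise DR while lowering FPR, which is the pattern reported for LNNLS-KH in Table 13.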

4.3. Experimental Results and Discussion of CICIDS2017 Dataset. The experiment is conducted in MATLAB R2016a on a Windows 64-bit operating system with an Intel(R) Core(TM) i7-4790 processor. The MachineLearningCVE file in the CICIDS2017 dataset includes 8 CSV files of all traffic data, which contain 78 features plus an attack type tag after some duplicate features are removed. We annotate traffic records according to different attack periods and types, and we standardize and normalize the dataset. Because of the excessive amount of data contained in the analyzed CSV files, problems such as excessively long running time and slow convergence of the model occur when the host is used for model training. Therefore, we simplified and reintegrated these CSV data files while preserving the original attack timing features. We selected a total of 12090 records and 5 types of traffic, including 1 type of normal traffic and 4 types of attack traffic: "DoS," "DDoS," "PortScan," and "WebAttack." The data are randomly divided into training sets and test sets in a 2 : 1 ratio, with independent and repeated experiments.
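The normalization and the random 2 : 1 split described above can be sketched as follows. This is a minimal illustration using only the standard library: min-max scaling is assumed as the normalization step, the fixed seed is an arbitrary choice, and the records are synthetic stand-ins for the 12090 selected traffic records.

```python
import random


def min_max_normalize(rows):
    """Scale every column of a list of numeric rows into [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        # A constant column carries no information; map it to 0.0.
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def split_two_to_one(records, seed=0):
    """Shuffle the records and return (training set, test set) in a 2:1 ratio."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]


# Synthetic records with three numeric columns (hypothetical data).
records = [[float(i), float(i % 7), 5.0] for i in range(12090)]
train, test = split_two_to_one(min_max_normalize(records))
print(len(train), len(test))
```

Repeating the split with different seeds corresponds to the independent and repeated experiments mentioned in the text.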

CMPSO, ACO, KH, and IKH algorithms are used for comparison with the LNNLS-KH algorithm. The preprocessed Normal, DoS, DDoS, PortScan, and WebAttack subsets are input into the algorithm model successively, and the dimension and feature subsets of feature selection are obtained. We adopt the KNN classification model as the classifier and obtain the accuracy of intrusion detection through the test set data. The results of feature selection dimension for the CICIDS2017 dataset are shown in Table 14. According to different attack types, the LNNLS-KH algorithm selects different features. For example, the selected features of the DoS subset are "Total Length of Bwd Packets," "Fwd Packet Length Min," "Flow IAT Min," "FIN Flag Count," "RST Flag Count," "URG Flag Count," "Bwd Avg Packets/Bulk," "Idle Mean," and "Idle Std." For the WebAttack subset, "Total Fwd Packets," "Bwd IAT Max," "Bwd PSH Flags," "Fwd Packets/s," "Bwd Avg Packets/Bulk," "Subflow Fwd Bytes," "Active Max," and "Idle Max" are selected as attack features by the LNNLS-KH algorithm. It reduces the feature dimension of the IDS dataset while ensuring high accuracy. The average feature dimension selected by the LNNLS-KH algorithm is 10.2, accounting for 13.08% of the total number of features in the CICIDS2017 dataset. It decreases the

Table 13: The classification FPR and DR of different feature selection algorithms (NSL-KDD dataset).

False positive rate (FPR) (%)
Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH
Probe | 22.37 | 18.04 | 8.50 | 4.05 | 1.18
DoS | 21.27 | 14.08 | 11.45 | 7.88 | 2.85
U2R | 24.51 | 21.04 | 16.13 | 8.45 | 4.30
R2L | 30.66 | 24.05 | 15.42 | 8.99 | 7.67

Detection rate (DR) (%)
Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH
Probe | 82.32 | 89.18 | 95.01 | 95.22 | 97.73
DoS | 79.12 | 82.08 | 83.77 | 85.23 | 96.80
U2R | 87.02 | 89.79 | 90.14 | 93.67 | 95.52
R2L | 83.56 | 87.56 | 88.91 | 92.89 | 95.85

Figure 12: Comparison of feature selection time and intrusion detection time for different feature selection algorithms on the Normal, DoS, DDoS, PortScan, and WebAttack subsets. (a) Feature selection time (second) for different feature selection algorithms. (b) Intrusion detection time (second) of different feature selection algorithms.


number of features by 57.85%, 52.34%, 27.14%, and 25%, respectively, compared with the CMPSO, ACO, KH, and IKH algorithms.
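These reduction figures follow directly from the per-subset feature counts in Table 14. The following sketch recomputes them; the dictionary layout and names are illustrative only.

```python
# Number of selected features per subset, read from Table 14.
# Subset order: Normal, DoS, DDoS, PortScan, WebAttack.
selected = {
    "CMPSO":    [28, 24, 29, 24, 16],
    "ACO":      [25, 19, 27, 21, 15],
    "KH":       [14, 13, 21, 14, 8],
    "IKH":      [14, 14, 18, 15, 7],
    "LNNLS-KH": [8, 9, 14, 12, 8],
}
TOTAL_FEATURES = 78  # features in the preprocessed CICIDS2017 data


def mean(values):
    return sum(values) / len(values)


proposed = mean(selected["LNNLS-KH"])        # average retained features
share = 100.0 * proposed / TOTAL_FEATURES    # share of all features, in percent

# Relative reduction of LNNLS-KH against each comparison algorithm, in percent.
reduction = {
    name: 100.0 * (1.0 - proposed / mean(counts))
    for name, counts in selected.items()
    if name != "LNNLS-KH"
}

print(f"average retained: {proposed:.1f} ({share:.2f}% of {TOTAL_FEATURES})")
for name, value in reduction.items():
    print(f"reduction vs {name}: {value:.2f}%")
```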

Figure 12 shows the feature selection time and intrusion detection time of 5 different feature selection algorithms to further evaluate the performance of the feature selection algorithm. It can be seen from Figure 12(a) that, in the feature selection stage, the LNNLS-KH algorithm takes a long time to find the optimal feature subset because of the linear nearest neighbor lasso step optimization performed after the position update of the krill herd. Compared with the KH and IKH algorithms, it increases the time by an average of 14.38% and 9.32%, respectively. Although the LNNLS-KH algorithm requires more calculation time, the convergence speed and global search ability are improved. Figure 12(b) shows the intrusion detection time of 5 different feature selection algorithms. It is the detection time of the sample dataset by the KNN classifier after the feature subset is found, excluding the time of searching for the optimal feature subset. The feature dimension of the LNNLS-KH algorithm is low, and the amount of data processed in the classification of the detection sample dataset is small, which results in the reduction of classification detection time. Compared with the CMPSO, ACO, KH, and IKH algorithms, the intrusion detection time of the LNNLS-KH algorithm is reduced by 6.52%, 5.17%, 2.14%, and 2.28% on average, respectively.
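The detection step can be pictured as a k-nearest-neighbor classifier that only looks at the selected feature columns, so that fewer selected features mean less work per distance computation. The sketch below is a plain-Python illustration under stated assumptions: squared Euclidean distance, majority voting, k = 3, and toy data are all choices made here, not details given in the paper.

```python
from collections import Counter


def knn_predict(train_x, train_y, sample, selected, k=3):
    """Classify one sample with k-nearest neighbors on the selected columns only."""
    def distance(row):
        # Squared Euclidean distance restricted to the selected feature indices.
        return sum((row[i] - sample[i]) ** 2 for i in selected)

    nearest = sorted(range(len(train_x)), key=lambda j: distance(train_x[j]))[:k]
    votes = Counter(train_y[j] for j in nearest)
    return votes.most_common(1)[0][0]


# Toy training data with three features; only columns 0 and 1 are "selected".
train_x = [[0.1, 0.9, 0.2], [0.2, 0.8, 0.9], [0.9, 0.1, 0.3], [0.8, 0.2, 0.7]]
train_y = ["normal", "normal", "attack", "attack"]
label = knn_predict(train_x, train_y, [0.85, 0.15, 0.5], selected=[0, 1], k=3)
print(label)
```

Each distance evaluation costs time proportional to the number of selected features, which is consistent with the shorter detection times observed for the smaller LNNLS-KH subsets.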

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the KNN classifier is used to detect the test dataset. The classification accuracy of different algorithms is shown in Table 15. For the five types of subsets, the average classification accuracy of the proposed LNNLS-KH algorithm is 95.86%. In particular, the classification accuracy reached 97.55% for the PortScan subset. Compared with the other four feature selection methods, the LNNLS-KH algorithm has an average increase of 3.11%, 8.52%, 8.58%, 2.45%, and 4.29% on the

Table 14: The number of features selected by different algorithms (CICIDS2017 dataset).

Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH
Normal | 28 (3, 7, 13, 15, 16, 17, 20, 22, 24, 26, 30, 35, 37, 38, 42, 43, 44, 45, 46, 49, 50, 56, 59, 62, 63, 64, 65, 76) | 25 (1, 3, 4, 7, 10, 11, 12, 13, 15, 19, 29, 32, 34, 35, 37, 43, 46, 47, 51, 55, 56, 58, 73, 76, 78) | 14 (11, 19, 33, 39, 43, 49, 55, 56, 58, 65, 66, 68, 71, 73) | 14 (5, 10, 19, 20, 21, 23, 27, 33, 43, 56, 69, 70, 73, 78) | 8 (6, 12, 16, 32, 38, 50, 54, 73)
DoS | 24 (1, 3, 4, 13, 16, 17, 24, 26, 30, 33, 35, 39, 40, 44, 48, 51, 53, 57, 58, 59, 60, 62, 67, 70) | 19 (3, 6, 12, 13, 15, 26, 35, 39, 51, 55, 60, 61, 66, 69, 71, 73, 75, 77, 78) | 13 (8, 16, 21, 30, 45, 50, 52, 57, 59, 63, 66, 67) | 14 (2, 12, 15, 16, 19, 21, 32, 34, 44, 46, 65, 68, 76, 77) | 9 (6, 8, 20, 44, 46, 49, 61, 75, 76)
DDoS | 29 (15, 18, 19, 20, 23, 25, 26, 33, 34, 35, 38, 39, 42, 43, 46, 47, 49, 51, 55, 56, 57, 59, 60, 61, 62, 63, 71, 72, 78) | 27 (6, 9, 10, 13, 16, 19, 24, 28, 31, 41, 42, 45, 47, 48, 50, 51, 52, 53, 54, 56, 59, 60, 61, 62, 65, 68, 72) | 21 (10, 12, 13, 15, 18, 23, 27, 30, 34, 35, 41, 42, 45, 55, 61, 63, 65, 66, 68, 70, 76) | 18 (1, 11, 13, 14, 19, 24, 32, 35, 36, 40, 42, 47, 51, 57, 60, 69, 70, 75) | 14 (2, 5, 8, 9, 11, 22, 26, 33, 41, 43, 47, 51, 74, 77)
PortScan | 24 (1, 3, 6, 15, 16, 28, 30, 33, 35, 37, 44, 45, 52, 56, 59, 60, 61, 63, 65, 68, 70, 75, 77, 78) | 21 (1, 2, 6, 10, 15, 17, 26, 27, 29, 39, 42, 43, 46, 49, 58, 61, 66, 69, 70, 71, 76) | 14 (15, 20, 22, 27, 37, 44, 49, 50, 53, 59, 62, 65, 67, 78) | 15 (1, 24, 30, 32, 33, 43, 49, 53, 54, 58, 60, 61, 63, 64, 69) | 12 (2, 6, 15, 24, 25, 28, 32, 57, 59, 63, 66, 76)
WebAttack | 16 (2, 7, 26, 29, 45, 47, 50, 52, 53, 54, 63, 66, 68, 69, 72, 78) | 15 (3, 9, 10, 12, 19, 26, 40, 46, 50, 54, 64, 65, 68, 69, 73) | 8 (1, 17, 19, 36, 48, 49, 53, 60) | 7 (14, 17, 35, 39, 44, 48, 54) | 8 (3, 29, 32, 37, 61, 64, 73, 77)

Table 15: The classification accuracy of different feature selection algorithms (CICIDS2017 dataset).

Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%)
Normal | 89.78 | 89.06 | 92.70 | 94.58 | 94.64
DoS | 77.03 | 82.69 | 90.90 | 93.34 | 94.51
DDoS | 81.73 | 86.94 | 91.85 | 88.19 | 95.76
PortScan | 92.38 | 95.64 | 95.05 | 97.35 | 97.55
WebAttack | 89.12 | 93.08 | 93.77 | 94.26 | 96.85

Table 16: The classification FPR and DR of different feature selection algorithms (CICIDS2017 dataset).

False positive rate (FPR) (%)
Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH
Normal | 9.25 | 8.72 | 6.41 | 4.93 | 3.67
DoS | 5.41 | 4.48 | 4.06 | 2.83 | 1.94
DDoS | 6.85 | 4.92 | 4.54 | 6.33 | 3.18
PortScan | 4.65 | 3.02 | 2.84 | 1.86 | 1.16
WebAttack | 5.33 | 3.16 | 2.52 | 2.11 | 1.60

Detection rate (DR) (%)
Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH
Normal | 88.05 | 88.51 | 89.25 | 92.46 | 93.89
DoS | 72.57 | 82.89 | 87.86 | 92.56 | 92.64
DDoS | 79.03 | 83.47 | 90.22 | 87.52 | 92.98
PortScan | 88.25 | 93.80 | 94.33 | 95.14 | 95.42
WebAttack | 87.40 | 91.35 | 92.19 | 92.94 | 94.77


Normal, DoS, DDoS, PortScan, and WebAttack subsets, respectively. Table 16 shows the classification FPR and DR of different feature selection algorithms on the test sets. Based on the detection of five different test sets, the LNNLS-KH algorithm has lower FPR and higher DR than the other four algorithms.

We propose the LNNLS-KH algorithm, a novel feature selection algorithm for intrusion detection. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the algorithm has good feature selection performance and improves the efficiency of intrusion detection.

5. Conclusions

With the rapid development of network technology, intrusion detection plays an increasingly important role in network security. However, the "dimensional disaster" caused by massive data results in problems such as slow response and poor accuracy of the intrusion detection system. The KH algorithm is a new population-based swarm intelligence optimization method, which shows good performance in high-dimensional data processing, providing a new approach for reducing the dimension of intrusion detection data and selecting useful features. In this paper, an improved KH algorithm named LNNLS-KH is proposed for feature selection of IDS datasets by linear nearest neighbor lasso step optimization. The LNNLS-KH algorithm introduces a new fitness function, which is composed of the number of selected feature dimensions and the classification accuracy. Nonlinear optimization is introduced into the physical diffusion motion of krill individuals to accelerate the convergence speed of the algorithm. Moreover, the linear nearest neighbor lasso step optimization is proposed to balance the exploration and exploitation abilities and obtain the global optimal solution of the feature subset effectively. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the LNNLS-KH algorithm retains 7 and 10.2 features on average, respectively, which greatly reduces the dimension of the features. In the NSL-KDD dataset, features are reduced by 44%, 42.86%, 34.88%, and 24.32% compared with the CMPSO, ACO, KH, and IKH algorithms, and in the CICIDS2017 dataset, they are reduced by 57.85%, 52.34%, 27.14%, and 25%, respectively. In addition, the classification accuracy of the LNNLS-KH feature selection algorithm is increased by 10.03% and 5.39%, and the time of intrusion detection is reduced by 12.41% and 4.03% on the two datasets. Furthermore, the LNNLS-KH algorithm enhances the ability to jump out of the local optimal solution and shows good performance in the optimal fitness iteration curve, false positive rate of detection, and convergence speed, which demonstrates that the proposed LNNLS-KH algorithm is an efficient feature selection method for network intrusion detection.

In this research, we realized that the initialization of the LNNLS-KH algorithm has a certain degree of randomness. Therefore, we conducted independent and repeated experiments to mitigate this problem, and the results were reasonable and convincing. Although the proposed algorithm shows encouraging performance, it could be further improved.

In future work, we will consider using data balancing techniques to preprocess the experimental dataset in order to obtain more accurate feature selection results and stronger algorithm stability. Meanwhile, we will combine LNNLS-KH with other algorithms to improve the exploration and exploitation abilities, thereby further shortening the time of feature subset training and classification detection. In addition, as the LNNLS-KH algorithm is universally applicable, it can be applied to more feature selection systems and used to solve optimization problems in other fields.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was sponsored by the National Key Research and Development Program of China (Grants 2018YFB0804002 and 2017YFB0803204), National Natural Science Foundation of P.R. China (Grant 72001191), Henan Natural Science Foundation (Grant 202300410442), and Henan Philosophy and Social Science Program (Grant 2020CZH009).

References

[1] W. Wei and C. Guo, "A text semantic topic discovery method based on the conditional co-occurrence degree," Neurocomputing, vol. 368, pp. 11–24, 2019.

[2] C.-R. Wang, R.-F. Xu, S.-J. Lee, and C.-H. Lee, "Network intrusion detection using equality constrained-optimization-based extreme learning machines," Knowledge-Based Systems, vol. 147, pp. 68–80, 2018.

[3] G.-G. Wang, A. H. Gandomi, A. H. Alavi, and D. Gong, "A comprehensive review of krill herd algorithm: variants, hybrids and applications," Artificial Intelligence Review, vol. 51, no. 1, pp. 119–148, 2019.

[4] J. Amudhavel, D. Sathian, R. S. Raghav et al., "A fault tolerant distributed self-organization in peer to peer (p2p) using krill herd optimization," in Proceedings of the 2015 International Conference on Advanced Research in Computer Science Engineering & Technology (ICARCSET 2015), pp. 1–5, Unnao, India, 2015.

[5] L. M. Abualigah, A. T. Khader, and E. S. Hanandeh, "Hybrid clustering analysis using improved krill herd algorithm," Applied Intelligence, vol. 48, no. 11, pp. 4047–4071, 2018.

[6] P. A. Kowalski and S. Łukasik, "Training neural networks with krill herd algorithm," Neural Processing Letters, vol. 44, no. 1, pp. 5–17, 2016.

[7] C. Stasinakis, G. Sermpinis, I. Psaradellis, and T. Verousis, "Krill-Herd Support Vector Regression and heterogeneous autoregressive leverage: evidence from forecasting and trading commodities," Quantitative Finance, vol. 16, no. 12, pp. 1901–1915, 2016.


[8] L. Wang, P. Jia, T. Huang, S. Duan, J. Yan, and L. Wang, "A novel optimization technique to improve gas recognition by electronic noses based on the enhanced krill herd algorithm," Sensors, vol. 16, no. 8, p. 1275, 2016.

[9] R. Jensi and G. W. Jiji, "An improved krill herd algorithm with global exploration capability for solving numerical function optimization problems and its application to data clustering," Applied Soft Computing, vol. 46, pp. 230–245, 2016.

[10] H. Pulluri, R. Naresh, and V. Sharma, "Application of stud krill herd algorithm for solution of optimal power flow problems," International Transactions on Electrical Energy Systems, vol. 27, no. 6, Article ID e2316, 2017.

[11] D. Rodrigues, L. A. M. Pereira, J. P. Papa et al., "A binary krill herd approach for feature selection," in Proceedings of the 2014 22nd International Conference on Pattern Recognition, pp. 1407–1412, IEEE, Stockholm, Sweden, August 2014.

[12] A. Mukherjee and V. Mukherjee, "Chaotic krill herd algorithm for optimal reactive power dispatch considering FACTS devices," Applied Soft Computing, vol. 44, pp. 163–190, 2016.

[13] S. Sun, H. Qi, F. Zhao, L. Ruan, and B. Li, "Inverse geometry design of two-dimensional complex radiative enclosures using krill herd optimization algorithm," Applied Thermal Engineering, vol. 98, pp. 1104–1115, 2016.

[14] S. Sultana and P. K. Roy, "Oppositional krill herd algorithm for optimal location of capacitor with reconfiguration in radial distribution system," International Journal of Electrical Power & Energy Systems, vol. 74, pp. 78–90, 2016.

[15] L. Brezocnik, I. Fister, and V. Podgorelec, "Swarm intelligence algorithms for feature selection: a review," Applied Sciences, vol. 8, no. 9, 2018.

[16] D. Smith, Q. Guan, and S. Fu, "An anomaly detection framework for autonomic management of compute cloud systems," in Proceedings of the 2010 IEEE 34th Annual Computer Software and Applications Conference Workshops, pp. 376–381, IEEE, Seoul, South Korea, July 2010.

[17] Y. Zhao, Y. Zhang, W. Tong et al., "An improved feature selection algorithm based on MAHALANOBIS distance for network intrusion detection," in Proceedings of 2013 International Conference on Sensor Network Security Technology and Privacy Communication System, pp. 69–73, IEEE, Nangang, China, May 2013.

[18] P. Singh and A. Tiwari, "An efficient approach for intrusion detection in reduced features of KDD99 using ID3 and classification with KNNGA," in Proceedings of the 2015 Second International Conference on Advances in Computing and Communication Engineering, pp. 445–452, IEEE, Dehradun, India, May 2015.

[19] M. A. Ambusaidi, X. He, P. Nanda, and Z. Tan, "Building an intrusion detection system using a filter-based feature selection algorithm," IEEE Transactions on Computers, vol. 65, no. 10, pp. 2986–2998, 2016.

[20] N. Shone, T. N. Ngoc, V. D. Phai, and Q. Shi, "A deep learning approach to network intrusion detection," IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 2, no. 1, pp. 41–50, 2018.

[21] Y. Xue, W. Jia, X. Zhao et al., "An evolutionary computation based feature selection method for intrusion detection," Security and Communication Networks, vol. 2018, Article ID 2492956, 10 pages, 2018.

[22] Z. Shen, Y. Zhang, and W. Chen, "A bayesian classification intrusion detection method based on the fusion of PCA and LDA," Security and Communication Networks, vol. 2019, Article ID 6346708, 11 pages, 2019.

[23] P. Sun, P. Liu, Q. Li et al., "DL-IDS: Extracting features using CNN-LSTM hybrid network for intrusion detection system," Security and Communication Networks, vol. 2020, Article ID 8890306, 11 pages, 2020.

[24] G. Farahani, "Feature selection based on cross-correlation for the intrusion detection system," Security & Communication Networks, vol. 2020, Article ID 8875404, 17 pages, 2020.

[25] F. G. Mohammadi, M. H. Amini, and H. R. Arabnia, "Applications of nature-inspired algorithms for dimension reduction: enabling efficient data analytics," in Advances in Intelligent Systems and Computing, Optimization, Learning, and Control for Interdependent Complex Networks, pp. 67–84, Springer, Cham, Switzerland, 2020.

[26] J. Kennedy and R. Eberhart, "Particle swarm optimization," in Proceedings of the ICNN'95-International Conference on Neural Networks, no. 4, pp. 1942–1948, IEEE, Perth, WA, Australia, December 1995.

[27] M. Dorigo, M. Birattari, and T. Stutzle, "Ant colony optimization," IEEE Computational Intelligence Magazine, vol. 1, no. 4, pp. 28–39, 2006.

[28] R. Rajabioun, "Cuckoo optimization algorithm," Applied Soft Computing, vol. 11, no. 8, pp. 5508–5518, 2011.

[29] M. Neshat, G. Sepidnam, M. Sargolzaei, and A. N. Toosi, "Artificial fish swarm algorithm: a survey of the state-of-the-art, hybridization, combinatorial and indicative applications," Artificial Intelligence Review, vol. 42, no. 4, pp. 965–997, 2014.

[30] D. Karaboga, "An idea based on honey bee swarm for numerical optimization," Technical Report-tr06, Erciyes University, Engineering Faculty, Computer Engineering Department, Kayseri, Turkey, 2005.

[31] W.-T. Pan, "A new Fruit Fly Optimization Algorithm: taking the financial distress model as an example," Knowledge-Based Systems, vol. 26, pp. 69–74, 2012.

[32] R. Zhao and W. Tang, "Monkey algorithm for global numerical optimization," Journal of Uncertain Systems, vol. 2, no. 3, pp. 165–176, 2008.

[33] X. S. Yang and X. He, "Bat algorithm: literature review and applications," International Journal of Bio-Inspired Computation, vol. 5, no. 3, pp. 141–149, 2013.

[34] S. Mirjalili, A. H. Gandomi, S. Z. Mirjalili, S. Saremi, H. Faris, and S. M. Mirjalili, "Salp Swarm Algorithm: a bio-inspired optimizer for engineering design problems," Advances in Engineering Software, vol. 114, pp. 163–191, 2017.

[35] K. Ahmed, A. E. Hassanien, and S. Bhattacharyya, "A novel chaotic chicken swarm optimization algorithm for feature selection," in Proceedings of the 2017 Third International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), pp. 259–264, IEEE, Kolkata, India, November 2017.

[36] S. Tabakhi, P. Moradi, F. Akhlaghian et al., "An unsupervised feature selection algorithm based on ant colony optimization," Engineering Applications of Artificial Intelligence, vol. 32, pp. 112–123, 2014.

[37] S. Arora and P. Anand, "Binary butterfly optimization approaches for feature selection," Expert Systems with Applications, vol. 116, pp. 147–160, 2019.

[38] C. Yan, J. Ma, H. Luo, and A. Patel, "Hybrid binary coral reefs optimization algorithm with simulated annealing for feature selection in high-dimensional biomedical datasets," Chemometrics and Intelligent Laboratory Systems, vol. 184, pp. 102–111, 2019.

[39] G. I. Sayed, A. Tharwat, and A. E. Hassanien, "Chaotic dragonfly algorithm: an improved metaheuristic algorithm for feature selection," Applied Intelligence, vol. 49, no. 1, pp. 188–205, 2019.

[40] Z. Zhang, P. Wei, Y. Li et al., "Feature selection algorithm based on improved particle swarm joint taboo search," Journal of Communication, vol. 39, no. 12, pp. 60–68, 2018.

[41] A. H. Gandomi and A. H. Alavi, "Krill herd: a new bio-inspired optimization algorithm," Communications in Nonlinear Science and Numerical Simulation, vol. 17, no. 12, pp. 4831–4845, 2012.

[42] Q. Tan and Z. Huang, "Krill herd with nearest neighbor lasso operator," Computer Engineering and Applications, vol. 55, no. 9, pp. 124–129, 2019.

[43] Q. Wang, C. Ding, and X. Wang, "A hybrid data clustering algorithm based on improved krill herd algorithm and KHM clustering," Control and Decision, vol. 35, no. 10, pp. 2449–2458, 2018.

[44] Q. Li and B. Liu, "Clustering using an improved krill herd algorithm," Algorithms, vol. 10, no. 2, p. 56, 2017.

[45] G.-G. Wang, A. H. Gandomi, and A. H. Alavi, "Stud krill herd algorithm," Neurocomputing, vol. 128, pp. 363–370, 2014.

[46] J. Li, Y. Tang, C. Hua, and X. Guan, "An improved krill herd algorithm: krill herd with linear decreasing step," Applied Mathematics and Computation, vol. 234, pp. 356–367, 2014.

[47] H. B. Nguyen, B. Xue, P. Andreae et al., "Particle swarm optimisation with genetic operators for feature selection," in Proceedings of the 2017 IEEE Congress on Evolutionary Computation (CEC), pp. 286–293, IEEE, San Sebastian, Spain, June 2017.

[48] M. H. Aghdam and P. Kabiri, "Feature selection for intrusion detection system using ant colony optimization," International Journal of Network Security, vol. 18, no. 3, pp. 420–432, 2016.


characteristics of network intrusion detection data are multiple features and large scale. Features of different categories have different attribute values, including redundant features that interfere with the classification results. A large number of redundant features reduce the efficiency of detection algorithms and increase the false positive rate of intrusion detection. However, a feature selection algorithm with good performance decreases the dimensionality of network data and improves the accuracy and detection speed of IDS.

In recent years, there has been a great deal of research on feature selection in intrusion detection. Smith et al. combined a Bayesian network and principal component analysis (PCA) to conduct feature selection for intrusion detection data [16]. They used Bayesian networks to adjust the correlation of attributes and PCA to extract the primary features on an institute-wide cloud system. The disadvantage is that the detection accuracy still needs to be improved. Zhao et al. [17] proposed a feature selection method based on Mahalanobis distance and applied it to network intrusion detection to obtain the optimal feature subset. Feature ranking based on Mahalanobis distance was used as the principal selection mechanism, and an improved exhaustive search was used to select the optimal ranking features. The experimental results based on the KDD CUP 99 dataset show that the algorithm has good performance on both the support vector machine and the k-nearest neighbor classifier. Singh and Tiwari proposed an efficient approach for intrusion detection in reduced features of the KDD CUP 99 dataset in 2015 [18]. The Iterative Dichotomiser 3 (ID3) algorithm was used for feature reduction of large datasets, and KNNGA was used as a classifier for intrusion detection. The method performs well on the evaluation measures of sensitivity, specificity, and accuracy. However, both Zhao et al. and Singh and Tiwari [17, 18] conducted experiments on outdated datasets, which can hardly reflect the new attack features of modern networks. In [19], Ambusaidi et al. proposed a feature selection algorithm based on mutual information to deal with linearly and nonlinearly related data features. They established an intrusion detection system based on a least-squares support vector machine. Experimental results show that the proposed algorithm performs well in accuracy but poorly in false positive rate. Shone et al. proposed an unsupervised feature learning method based on a nonsymmetric deep autoencoder (NDAE) and a novel deep learning classification model constructed using stacked NDAEs [20]. The results demonstrated that the approach offers high levels of accuracy, precision, and recall together with reduced training time. Meanwhile, it is worth noting that the stacked NDAE model requires 98.81% less training time than the mainstream DBN technology. The limitation is that the model's capability to handle zero-day attacks still needs to be assessed and extended.

In [21], a self-adaptive differential evolution (SaDE) algorithm was proposed to deal with the feature selection problem. It uses an adaptive mechanism to select the most appropriate among four candidate solution generation strategies, which effectively reduces the number of features. The disadvantage is that the experiment uses a small data sample, and more data are needed to further support the conclusion. Shen et al. adopted principal component analysis and linear discriminant analysis to decrease the dimensionality of the dataset and combined them with Bayesian classification to construct an intrusion detection model [22]. Simulation experiments based on the CICIDS2017 dataset show that the proposed algorithm filters out the noise in the data and improves the time performance to a certain extent. However, the algorithm still needs to be optimized to further improve the classification accuracy. In [23], a hybrid network feature selection method based on a convolutional neural network (CNN) and a long short-term memory network (LSTM) was applied to IDS. According to the experimental results, the proposed feature selection algorithm achieves better accuracy than the CNN-only model and the LSTM-only model. However, the detection accuracy for Heartbleed and SSH-Patator attacks is low. In [24], Farahani proposed a new cross-correlation-based feature selection (CCFS) method to reduce the feature dimension of the intrusion detection dataset. Compared with the cuttlefish algorithm (CFA) and mutual information-based feature selection (MIFS), the proposed algorithm was demonstrated to have good performance in the accuracy, precision, and recall rate of classification. However, the author simply replaced the categorical attributes with numeric values when dealing with symbolic data, without considering the more reasonable one-hot encoding method. The summary of feature selection methods in IDS is shown in Table 1.

2.3. Swarm Intelligence Algorithms for Feature Selection. The core of feature selection is the search strategy for generating feature subsets. Although the exhaustive search strategy can find the globally optimal feature subset, its excessive time complexity consumes huge computing resources. In recent years, swarm intelligence optimization methods inspired by natural phenomena have provided a new approach to solving the problem of feature selection [10–17]. Therefore, we propose the LNNLS-KH algorithm, which has high search efficiency, as the search strategy for the feature subset. Swarm intelligence optimization methods simulate the evolutionary principle of survival of the fittest in nature and are group-oriented random search techniques that can be used to solve complex problems in large-scale data analysis [25]. Common swarm intelligence optimization methods include particle swarm optimization (PSO) [26], ant colony optimization algorithm (ACO) [27], cuckoo algorithm (CA) [28], artificial fish swarm algorithm (AFSA) [29], artificial bee colony algorithm (ABC) [30], fruit fly optimization algorithm (FOA) [31], monkey algorithm (MA) [32], bat algorithm (BA) [33], and salp swarm algorithm (SSA) [34].
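To make the cost of exhaustive search concrete, the number of candidate feature subsets grows as 2^N with the number of features N. The short sketch below counts the nonempty subsets for the two datasets used in this paper, taking 41 features for NSL-KDD and 78 for CICIDS2017; the function name is illustrative.

```python
def subset_count(n_features):
    """Number of nonempty feature subsets an exhaustive search must evaluate."""
    return 2 ** n_features - 1


nsl_kdd = subset_count(41)   # NSL-KDD: 41 features
cicids = subset_count(78)    # CICIDS2017: 78 features

print(f"NSL-KDD:    {nsl_kdd:.3e} subsets")
print(f"CICIDS2017: {cicids:.3e} subsets")
```

Even at one million subset evaluations per second, enumerating the NSL-KDD subsets alone would take roughly 25 days, which is why heuristic search strategies such as swarm intelligence methods are used instead.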

Moreover, Ahmed et al. proposed a new chaotic chicken swarm optimization algorithm (CCSO) for feature selection [35]. By combining logistic maps and chaotic tent maps, the CSO algorithm acquires a strong spatial search ability. The experimental results show that the classification accuracy of the model is further improved after CCSO feature selection. The disadvantage is the lack of comparison with other chaotic algorithms. Tabakhi et al. proposed an unsupervised feature selection method based on ant colony optimization (UFSACO) [36], which iteratively filters features through the heuristic and previous-stage information of the ant colony. Simultaneously, the similarity between features is quantified to reduce the redundancy of data features. However, the efficiency of the feature selection process needs to be improved.

To solve the problem of easily falling into a local optimal solution, Arora and Anand proposed a butterfly optimization algorithm (BOA) based on binary variables [37]. Based on the foraging behavior of butterflies, the algorithm uses each butterfly as a search agent to iteratively optimize the fitness function, which gives it good convergence ability and avoids premature convergence to a certain extent. Experimental results show that the algorithm reduces the length of the feature subset while selecting the optimal feature subset and improves the classification accuracy to a certain extent. However, the time cost is larger than that of the genetic algorithm and the particle swarm optimization algorithm, and the optimized feature subset varies across repeated experiments, which indicates poor robustness.

In [38], Yan et al. proposed a hybrid optimization algorithm (BCROSAT) based on simulated annealing and binary coral reefs, which is used for feature selection in high-dimensional biomedical datasets. The algorithm increases the diversity of the initial population individuals through the tournament selection strategy and uses the simulated annealing algorithm and binary coding to improve the search ability of the coral reef optimization algorithm. However, the algorithm has high time complexity. In [39], a new chaotic dragonfly algorithm (CDA) is proposed by Sayed et al., which combines 10 different chaotic maps with the search iteration process of the dragonfly algorithm so as to accelerate the convergence speed of the algorithm and improve the efficiency of feature selection. The algorithm uses the worst fitness value, best fitness value, average fitness value, standard deviation, and average feature length as evaluation criteria. The experimental results show that the adjustment variable of the Gauss map significantly improves the dragonfly algorithm in terms of classification performance, stability, number of selected features, and convergence speed. The disadvantage is that the experimental data are small, and the algorithm needs to be verified on large-scale datasets. Zhang et al. [40] combined the genetic algorithm and the particle swarm optimization algorithm and conducted a tabu search on the resulting optimal initial solution; the result of this second round of feature selection is the global optimal feature subset. The algorithm not only guarantees good classification performance but also greatly reduces the false positive rate and false negative rate of the classification results. The disadvantage is that the algorithm has a large computational cost and a long offline training time.

2.4. Krill Herd (KH) Algorithm and Variants. The krill herd (KH) algorithm is a population-based swarm intelligence optimization method proposed by Gandomi and Alavi in 2012 [41]. The algorithm studies the foraging rules and clustering behavior of krill swarms in nature and simulates the induced movement, foraging activity, and random diffusion movement of the krill herd. It obtains the optimal solution by continuously updating the positions of krill individuals.

Abualigah et al. introduced a multicriteria mixed function based on the global optimal concept into the KH algorithm and applied it to text clustering [5]. By combining the advantages of local neighborhood search and global wide-area search, the algorithm balances the exploitation and exploration processes of the krill herd. In [42], the influence of excellent neighbor individuals on the krill herd during evolution is considered, and an improved KH algorithm is proposed to enhance the local search ability of the algorithm. In [43], a hybrid data clustering algorithm (IKH-KHM) based on an improved KH algorithm and k-harmonic means was proposed to solve the problem that the K-means algorithm is sensitive to the clustering centers. This algorithm increases the diversity of KH by alternately using the random walk of Lévy flight and the crossover operator of the genetic algorithm. It improves the global search ability of the algorithm and avoids premature convergence to some degree. Simulation experiments on 5 datasets from the UCI database show that the IKH-KHM algorithm overcomes the noise sensitivity problem to a certain extent and has a significant effect on the optimization of the objective function. However, its slow recovery speed results in a high time cost. In 2017, Li and Liu adopted a combined update mechanism of selection operator and mutation operator to enhance the global optimization ability of the KH algorithm. They solved the problem of unbalanced local search and global search in the original KH algorithm [44].

To enhance the global search ability of the KH algorithm, a KH algorithm improved with a global search operator was proposed by Jensi and Jiji [9] and applied to data clustering.

Table 1: Summary of feature selection methods in IDS.

| Method | Author | Year | Ref. no. |
|---|---|---|---|
| Bayesian network-based dimensionality reduction and principal component analysis (PCA) | Smith et al. | 2010 | [16] |
| Ranking based on Mahalanobis distance and exhaustive search | Zhao et al. | 2013 | [17] |
| Iterative Dichotomiser 3 (ID3) algorithm | Singh and Tiwari | 2015 | [18] |
| Mutual information method | Ambusaidi et al. | 2016 | [19] |
| Nonsymmetric deep autoencoder (NDAE) | Shone et al. | 2018 | [20] |
| Self-adaptive differential evolution (SaDE) | Xue et al. | 2018 | [21] |
| Principal component analysis (PCA) and linear discriminant analysis (LDA) | Shen et al. | 2019 | [22] |
| Hybrid network of convolutional neural network (CNN) and long short-term memory network (LSTM) | Sun et al. | 2020 | [23] |
| Cross-correlation-based feature selection (CCFS) method | Farahani | 2020 | [24] |


The algorithm continuously searches around the original area to guide the krill herd toward the global optimum. It defines a new step size formula, which makes it convenient for krill individuals to fine-tune their positions in the search space. At the same time, an elite selection strategy is introduced into the krill herd update process, which helps the algorithm jump out of local optimal solutions. Experimental results show that the improved KH algorithm has higher accuracy and better robustness.

In [45], Wang et al. proposed a stud KH algorithm. The method adopts a new krill herd genetics and reproduction mechanism, replacing the random selection in the standard KH algorithm with a columnar selection operator and a crossover operator. To balance the exploration and exploitation abilities of the KH algorithm, Li et al. proposed a linear decreasing step KH algorithm [46]. In this algorithm, the step size scaling factor is made to decrease linearly as the number of iterations increases, thereby enhancing the search ability of the algorithm.

Although the KH algorithm and its enhanced versions show better performance than other swarm intelligence algorithms, there are still deficiencies such as unbalanced exploration and exploitation. In this paper, to minimize the number of selected features and achieve high classification accuracy, both quantities are introduced into the fitness evaluation function. The physical diffusion motion of krill individuals is nonlinearly improved to dynamically adjust the random diffusion amplitude and accelerate the convergence rate of the algorithm. At the same time, a linear nearest neighbor lasso step optimization is performed after updating the position of the krill herd, which effectively enhances the global exploration ability. It helps the algorithm achieve better performance, reduce the data dimension in feature selection, and improve the efficiency of intrusion detection.

3. Algorithm Design

In this section, we first provide a brief description of the KH algorithm. Subsequently, we present an improved version of KH, named LNNLS-KH, to address the problem of the large number and high dimension of features in feature selection for intrusion detection.

3.1. Standard KH Algorithm. The framework of the KH algorithm is shown in Figure 3. It includes the three actions of a krill individual, the crossover operation, the position update, and the calculation of the fitness function. After initialization, krill individuals change their positions according to the three actions. Then, the crossover operator is executed to complete the position update, and the new fitness function is calculated. If the number of iterations has not reached the maximum, krill individuals repeat the process until the iteration is completed.

As a novel biologically inspired algorithm for solving optimization tasks, the KH algorithm represents each possible solution of the problem with a krill individual. By simulating the foraging behavior, the krill herd position is continuously updated to obtain the global optimal solution. The motions of krill individuals are mainly affected by the following three aspects:

(1) Movement induced by other krill individuals
(2) Foraging activity
(3) Physical diffusion motion

The KH algorithm adopts the Lagrange model to search in multidimensional space. The position update of krill individuals is shown as follows:

$$\frac{dX_i}{dt} = N_i + F_i + D_i, \tag{1}$$

where $X_i = \{X_{i,1}, X_{i,2}, \ldots, X_{i,NV}\}$, $N_i$ is the movement induced by other krill individuals, $F_i$ is the foraging activity of the krill individual, and $D_i$ is the random physical diffusion based on density region.

3.1.1. Movement Induced by Other Krill Individuals. The movement induced by other krill individuals is described as follows:

$$N_i^{\mathrm{new}} = N^{\max}\alpha_i + \omega_n N_i^{\mathrm{old}}, \tag{2}$$

$$\alpha_i = \alpha_i^{\mathrm{local}} + \alpha_i^{\mathrm{target}}, \tag{3}$$

where $N^{\max}$ is the maximum induction velocity of surrounding krill individuals, taken as 0.01 (m s$^{-1}$) [5]; $\omega_n$ represents the inertia weight in the range [0, 1]; $N_i^{\mathrm{old}}$ is the result of the last motion induced by other krill individuals; $\alpha_i^{\mathrm{local}}$ is a parameter indicating the direction of guidance; and $\alpha_i^{\mathrm{target}}$ is the direction effect of the global optimal krill individual.

$\alpha_i^{\mathrm{local}}$ is defined as follows:

$$\alpha_i^{\mathrm{local}} = \sum_{j=1}^{NN} \hat{K}_{i,j}\hat{X}_{i,j}, \qquad \hat{X}_{i,j} = \frac{X_j - X_i}{\lVert X_j - X_i \rVert + \varepsilon}, \qquad \hat{K}_{i,j} = \frac{K_i - K_j}{K^{\mathrm{worst}} - K^{\mathrm{best}}}, \tag{4}$$

where $K^{\mathrm{best}}$ and $K^{\mathrm{worst}}$ are the best and worst fitness values of the krill herd, $K_i$ is the fitness value of the ith krill individual, $K_j$ represents the fitness value of the jth neighbor krill individual ($j = 1, 2, \ldots, NN$), and $NN$ represents the total number of neighbors. The $\varepsilon$ in the denominator is a small positive number that avoids the singularity caused by a zero denominator.

When selecting surrounding krill individuals, the KH algorithm finds the nearest neighbors of the ith krill individual by defining a "neighborhood ratio". The neighborhood is a circular area with the ith krill individual as the center and the perception distance $d_{s,i}$ as the radius. $d_{s,i}$ is described as follows:

$$d_{s,i} = \frac{1}{5N}\sum_{j=1}^{N} \lVert X_i - X_j \rVert, \tag{5}$$

where $N$ is the number of krill individuals, and $X_i$ and $X_j$ represent the positions of the ith and jth krill individuals. $\alpha_i^{\mathrm{target}}$ is defined as follows:

$$\alpha_i^{\mathrm{target}} = C^{\mathrm{best}}\hat{K}_{i,\mathrm{best}}\hat{X}_{i,\mathrm{best}}, \tag{6}$$

where $C^{\mathrm{best}}$ is the effective coefficient between the ith and the global optimal krill individuals:

$$C^{\mathrm{best}} = 2\left(\mathrm{rand} + \frac{I}{I_{\max}}\right), \tag{7}$$

where $I$ is the number of iterations, $I_{\max}$ is the maximum number of iterations, and rand is a random number in [0, 1], which is used to enhance the exploration ability.
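Equations (2) to (7) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes a minimization problem (so the best fitness is the smallest), the function names and the inertia weight `omega_n=0.5` are hypothetical, and the guard for a herd with identical fitness values is an added assumption.

```python
import math
import random


def unit_direction(x_from, x_to, eps=1e-12):
    """X_hat = (x_to - x_from) / (||x_to - x_from|| + eps), as in equation (4)."""
    diff = [b - a for a, b in zip(x_from, x_to)]
    norm = math.sqrt(sum(d * d for d in diff))
    return [d / (norm + eps) for d in diff]


def sensing_distance(i, positions):
    """d_{s,i} = 1/(5N) * sum_j ||X_i - X_j||, equation (5)."""
    return sum(math.dist(positions[i], p) for p in positions) / (5 * len(positions))


def induced_motion(i, positions, fitness, n_old, it, it_max,
                   n_max=0.01, omega_n=0.5, rng=random):
    """N_i^new = N^max * alpha_i + omega_n * N_i^old, equations (2)-(7)."""
    k_best, k_worst = min(fitness), max(fitness)   # assumption: minimization
    span = (k_worst - k_best) or 1.0               # assumed guard: all-equal herd
    alpha = [0.0] * len(positions[i])

    # Local effect: sum over neighbors inside the sensing distance.
    d_s = sensing_distance(i, positions)
    for j, x_j in enumerate(positions):
        if j != i and math.dist(positions[i], x_j) <= d_s:
            k_hat = (fitness[i] - fitness[j]) / span
            x_hat = unit_direction(positions[i], x_j)
            alpha = [a + k_hat * x for a, x in zip(alpha, x_hat)]

    # Target effect: pull toward the current best individual, equations (6)-(7).
    best = fitness.index(k_best)
    c_best = 2.0 * (rng.random() + it / it_max)
    k_hat_best = (fitness[i] - k_best) / span
    x_hat_best = unit_direction(positions[i], positions[best])
    alpha = [a + c_best * k_hat_best * x for a, x in zip(alpha, x_hat_best)]

    return [n_max * a + omega_n * o for a, o in zip(alpha, n_old)]
```

With two krill, the worse individual is pulled toward the better one, while the best individual receives no target effect because its normalized fitness difference is zero.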

3.1.2. Foraging Activity. Foraging activity is affected by the food distance and the experience of the food location, and it is described as follows:

$$F_i = V_f\beta_i + \omega_f F_i^{\mathrm{old}}, \tag{8}$$

$$\beta_i = \beta_i^{\mathrm{food}} + \beta_i^{\mathrm{best}}, \tag{9}$$

where $V_f$ is the foraging speed, taken as 0.02 (m s$^{-1}$) [41]; $\omega_f$ is the inertia weight in the range [0, 1]; and $\beta_i$ indicates the foraging direction, which consists of the food induction direction $\beta_i^{\mathrm{food}}$ and the induction direction $\beta_i^{\mathrm{best}}$ of the historically optimal krill individual. The food is essentially a virtual location defined using the concept of a "centroid":

$$X^{\mathrm{food}} = \frac{\sum_{i=1}^{N} (1/K_i)\,X_i}{\sum_{i=1}^{N} 1/K_i}. \tag{10}$$

(1) The induced direction of food to the ith krill individual is expressed as follows:

$$\beta_i^{\mathrm{food}} = C^{\mathrm{food}}\hat{K}_{i,\mathrm{food}}\hat{X}_{i,\mathrm{food}}, \tag{11}$$

where $C^{\mathrm{food}}$ is the food coefficient, and it is determined as follows:

$$C^{\mathrm{food}} = 2\left(1 - \frac{I}{I_{\max}}\right). \tag{12}$$

(2) The induced direction of the historical best krill individual to the ith krill individual is expressed as follows:

$$\beta_i^{\mathrm{best}} = \hat{K}_{i,\mathrm{best}}\hat{X}_{i,\mathrm{best}}, \tag{13}$$

where $\hat{K}_{i,\mathrm{best}}$ represents the influence of the historical best individual on the ith krill individual.
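The foraging terms of equations (8), (10), and (12) can be sketched as follows. The function names and the inertia weight `omega_f=0.5` are hypothetical, and the sketch assumes strictly positive fitness values so that the weights 1/K_i in equation (10) are defined.

```python
def food_position(positions, fitness):
    """Equation (10): centroid of the herd weighted by 1/K_i (assumes K_i > 0)."""
    weights = [1.0 / k for k in fitness]
    total = sum(weights)
    dim = len(positions[0])
    return [sum(w * p[m] for w, p in zip(weights, positions)) / total
            for m in range(dim)]


def food_coefficient(it, it_max):
    """Equation (12): C_food = 2 * (1 - I / I_max), shrinking from 2 to 0."""
    return 2.0 * (1.0 - it / it_max)


def foraging_motion(beta_food, beta_best, f_old, v_f=0.02, omega_f=0.5):
    """Equations (8)-(9): F_i = V_f * (beta_food + beta_best) + omega_f * F_old."""
    return [v_f * (bf + bb) + omega_f * fo
            for bf, bb, fo in zip(beta_food, beta_best, f_old)]
```

Because the weights are the reciprocals of the fitness values, individuals with smaller (better) fitness pull the virtual food location toward themselves.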

3.1.3. Physical Diffusion Motion. Physical diffusion is a stochastic process. The expression is as follows:

$$D_i = D^{\max}\left(1 - \frac{I}{I_{\max}}\right)\delta, \tag{14}$$

where $D^{\max}$ is the maximum diffusion velocity in the range [0.002, 0.010] (m s$^{-1}$). According to [41], it is taken as 0.005 (m s$^{-1}$). $\delta$ represents the random direction vector, whose values are drawn at random from [−1, 1].

Figure 3: The framework of KH algorithm (three actions of krill individual: movement induced by other krill individuals, foraging movement, and physical diffusion movement; crossover operation; updating position; calculating the fitness function).

3.1.4. Crossover. The crossover operator is an effective global optimization strategy. An adaptive vectorization crossover scheme is added to the standard KH algorithm to further enhance the global search ability of the algorithm [41]. It is given as follows:

$$X_{i,m} = \begin{cases} X_{i,m}\cdot C_r + X_{r,m}\cdot(1 - C_r), & \mathrm{rand}_{i,m} < C_r, \\ X_{i,m}, & \text{else}, \end{cases} \qquad C_r = 0.2\,\hat{K}_{i,\mathrm{best}}, \tag{15}$$

where $r$ is a random index with $r \in \{1, 2, \ldots, i-1, i+1, \ldots, N\}$, $X_{i,m}$ represents the mth dimension of the ith krill individual, $X_{r,m}$ represents the mth dimension of the rth krill individual, and $C_r$ is the crossover probability, which decreases as the fitness of the individual improves, so that the crossover probability of the globally optimal individual is zero.
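The crossover rule of equation (15) can be illustrated with a short sketch. The function name and argument layout are hypothetical; `k_hat_i_best` stands for the normalized fitness difference between the ith individual and the best individual, and the mixing rule follows equation (15) as written above.

```python
import random


def crossover(i, positions, k_hat_i_best, rng=random):
    """Equation (15): per-dimension crossover with probability Cr = 0.2 * K_hat_{i,best}."""
    cr = 0.2 * k_hat_i_best
    # Pick a partner r uniformly from all individuals except i.
    r = rng.choice([j for j in range(len(positions)) if j != i])
    x_i, x_r = positions[i], positions[r]
    return [xi * cr + xr * (1.0 - cr) if rng.random() < cr else xi
            for xi, xr in zip(x_i, x_r)]
```

For the best individual the normalized difference is zero, so `cr` is zero and the position is returned unchanged.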

3.1.5. Movement Process of KH Algorithm. Affected by the movement induced by other krill individuals, the foraging activity, and the physical diffusion, the krill herd changes its position towards the direction of optimal fitness. The position vector of a krill individual over the interval from $t$ to $t + \Delta t$ is described as follows:

$$X_i(t + \Delta t) = X_i(t) + \Delta t\,\frac{dX_i}{dt}, \tag{16}$$

where $\Delta t$ is the scaling factor of the velocity vector. It depends completely on the search space:

$$\Delta t = C_t \sum_{j=1}^{NV}\left(UB_j - LB_j\right), \tag{17}$$

where $NV$ represents the dimension of the decision variables, $LB_j$ and $UB_j$ are the lower and upper bounds of the jth variable ($j = 1, 2, \ldots, NV$), and $C_t$ is the step scaling factor in the range [0, 2].
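A minimal sketch of equations (16) and (17) follows. The function names and the default `c_t=0.5` are hypothetical, and clipping the new position to the bounds is an added assumption that the text does not state.

```python
def time_step(lower, upper, c_t=0.5):
    """Equation (17): dt = C_t * sum_j (UB_j - LB_j)."""
    return c_t * sum(u - l for l, u in zip(lower, upper))


def update_position(x, n_i, f_i, d_i, dt, lower, upper):
    """Equation (16): X(t + dt) = X(t) + dt * (N_i + F_i + D_i)."""
    moved = [xi + dt * (n + f + d) for xi, n, f, d in zip(x, n_i, f_i, d_i)]
    # Assumption (not from the text): keep every coordinate inside [LB_j, UB_j].
    return [min(max(v, l), u) for v, l, u in zip(moved, lower, upper)]
```

A larger search space therefore yields a larger step, while the three motion components determine the direction.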

3.2. The LNNLS-KH Algorithm. In view of the unbalanced exploitation and exploration abilities of the KH algorithm, we propose the LNNLS-KH algorithm for feature selection to improve the performance and pursue a high accuracy rate, a high detection rate, and a low false positive rate in intrusion detection. The improvement is reflected in the following three aspects.

3.2.1. A New Fitness Evaluation Function. To improve the classification accuracy of feature subset detection, we introduce the feature selection dimension and the classification accuracy into the fitness evaluation function. The specific expression of fitness is as follows:

$$\mathrm{fitness} = \alpha \cdot \frac{\mathrm{Feature}_{\mathrm{selected}}}{\mathrm{Feature}_{\mathrm{all}}} + (1 - \alpha)\cdot(1 - \mathrm{Accuracy}), \tag{18}$$

where $\alpha \in [0, 1]$ is a weighting factor used to tune the relative importance of the number of selected features and the classification accuracy, $\mathrm{Feature}_{\mathrm{selected}}$ is the number of selected features, $\mathrm{Feature}_{\mathrm{all}}$ represents the total number of features, and Accuracy indicates the accuracy of the classification results. Moreover, k-nearest neighbor (KNN) is used as the classification algorithm, and the classification accuracy is defined as follows:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \tag{19}$$

where TP, TN, FP, and FN are defined in the confusion matrix, as shown in Table 2.
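Equations (18) and (19) translate directly into code. In the sketch below the function names are hypothetical, and the default weight `alpha=0.02` is the value reported later in the experimental setup.

```python
def accuracy(tp, tn, fp, fn):
    """Equation (19): share of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)


def fitness(n_selected, n_total, acc, alpha=0.02):
    """Equation (18): weighted sum of feature ratio and classification error."""
    return alpha * n_selected / n_total + (1.0 - alpha) * (1.0 - acc)
```

As a worked example, selecting 7 of 41 features with an accuracy of 0.95 gives 0.02 × 7/41 + 0.98 × 0.05 ≈ 0.0524. A smaller value is better, and with a small weight the accuracy term dominates.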

3.2.2. Nonlinear Optimization of Physical Diffusion Motion. The physical diffusion of the krill herd is a random diffusion process. The closer the individuals are to the food, the less random the movement is. Due to the strong convergence of the algorithm, the movement of krill individuals presents a nonlinear change from fast to slow, and the fitness function gradually decreases as the algorithm converges. According to equations (2) and (9), the movement induced by other krill individuals and the foraging activity are nonlinear. In the physical diffusion equation (14), the diffusion velocity $D_i$ of the ith krill individual decreases linearly as the number of iterations increases. In order to fit the nonlinear motion of the krill herd, we introduce the optimization coefficient $\lambda$ and the fitness factor $\mu_{\mathrm{fit}}$ of the krill herd into the physical diffusion motion. The optimized physical diffusion motion expression is defined as follows:

$$D_i = D^{\max}\left[1 - \lambda\frac{I}{I_{\max}} - (1 - \lambda)\mu_{\mathrm{fit}}\right]\delta, \tag{20}$$

where $\lambda$ is in the range [0, 1] and $\mu_{\mathrm{fit}}$ is defined as follows:

$$\mu_{\mathrm{fit}} = \frac{K^{\mathrm{best}}}{K_i}, \tag{21}$$

where $K^{\mathrm{best}}$ is the fitness value of the current optimal individual and $K_i$ represents the fitness value of the ith krill individual. As the number of iterations increases, $K_i$ gradually decreases until it approaches $K^{\mathrm{best}}$. Therefore, $\mu_{\mathrm{fit}}$ is in the range (0, 1]. Introducing the fitness factor $\mu_{\mathrm{fit}}$ into equation (20) gives the new physical diffusion motion equation:

$$D_i = D^{\max}\left[1 - \lambda\frac{I}{I_{\max}} - (1 - \lambda)\frac{K^{\mathrm{best}}}{K_i}\right]\delta. \tag{22}$$

According to equation (22), the number of iterations $I$, the fitness $K_i$ of the krill individual, and the fitness $K^{\mathrm{best}}$ of the current optimal krill individual jointly determine the physical diffusion motion so as to further adjust the random diffusion amplitude. In the early stage of the algorithm iteration, the number of iterations is small and the fitness value of the individual is large, so the fitness factor is small, which is conducive to a large random diffusion of the krill herd. As the number of iterations gradually increases, the algorithm converges quickly and the fitness of krill individuals approaches the global optimal solution. At the same time, the fitness factor increases nonlinearly, which makes the random diffusion more consistent with the movement process of krill individuals.
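The difference between the linear diffusion of equation (14) and the nonlinear form of equation (22) can be sketched as below. The function names and the value `lam=0.5` are hypothetical (no value of the optimization coefficient is given in this section), and strictly positive fitness values are assumed so that the ratio K_best/K_i is defined.

```python
import random


def diffusion_amplitude_linear(it, it_max, d_max=0.005):
    """Amplitude of equation (14): D_max * (1 - I / I_max)."""
    return d_max * (1.0 - it / it_max)


def diffusion_amplitude_nonlinear(it, it_max, k_i, k_best, lam=0.5, d_max=0.005):
    """Amplitude of equation (22), with fitness factor mu_fit = K_best / K_i."""
    mu_fit = k_best / k_i  # assumes K_i > 0; lies in (0, 1] when minimizing
    return d_max * (1.0 - lam * it / it_max - (1.0 - lam) * mu_fit)


def diffusion_vector(amplitude, dim, rng=random):
    """Multiply the amplitude by a random direction delta in [-1, 1]^dim."""
    return [amplitude * rng.uniform(-1.0, 1.0) for _ in range(dim)]
```

An individual far from the best fitness early in the run keeps a large amplitude, while the amplitude vanishes once the iteration count reaches its maximum and the individual matches the best fitness.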

To further evaluate the effect of the KH algorithm with nonlinear optimization of physical diffusion motion (NO-KH), we conducted experiments on two classical benchmark functions. F1(x) is the Schwefel 2.22 function, which is a unimodal benchmark function; F2(x) is the Ackley function, which is a multimodal benchmark function. The experimental parameters of F1(x) and F2(x) are shown in Table 3.

Figure 4 shows the Ackley function and the Schwefel 2.22 function graphs for n = 2. We use the standard KH algorithm and the NO-KH algorithm to find the optimal value on both the unimodal and the multimodal benchmark functions. The number of krill and the number of iterations are set to 25 and 500, respectively. Table 4 shows the best value, worst value, mean value, and standard deviation obtained by running the algorithms 20 times. Compared with the standard KH algorithm, the NO-KH algorithm finds smaller optimal solutions on both the unimodal and the multimodal benchmark functions, and its global exploration ability is improved. The smaller standard deviation obtained from repeated experiments shows that the NO-KH algorithm has better stability. Therefore, the nonlinear optimization of the physical diffusion motion of the KH algorithm is effective.
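For reference, the two benchmark functions listed in Table 3 can be written as follows. The formulas are the standard textbook definitions of the Schwefel 2.22 and Ackley functions; the Python function names are hypothetical.

```python
import math


def schwefel_2_22(x):
    """Sum of absolute values plus their product; unimodal, minimum 0 at x = 0."""
    abs_x = [abs(v) for v in x]
    return sum(abs_x) + math.prod(abs_x)


def ackley(x):
    """Standard Ackley function; multimodal, minimum 0 at x = 0."""
    n = len(x)
    mean_sq = sum(v * v for v in x) / n
    mean_cos = sum(math.cos(2.0 * math.pi * v) for v in x) / n
    return -20.0 * math.exp(-0.2 * math.sqrt(mean_sq)) - math.exp(mean_cos) + 20.0 + math.e
```

Both functions reach their minimum value of 0 at the origin, which matches the f_min column of Table 3.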

The above analysis shows that introducing the optimization coefficient $\lambda$ and the fitness factor $\mu_{\mathrm{fit}}$ into the physical diffusion motion of the krill herd is conducive to dynamically adjusting the random diffusion amplitude of the krill individuals and accelerating the convergence speed of the algorithm. Meanwhile, it increases the nonlinearity of the physical diffusion motion and the global exploration ability of the algorithm.

3.2.3. Linear Nearest Neighbor Lasso Step Optimization. When the KH algorithm is used to solve multidimensional complex function optimization problems, its local search ability is weak, and exploitation and exploration are difficult to balance. To enhance the local exploitation and global exploration abilities of the algorithm, the influence of excellent neighbor individuals on the krill herd during evolution is considered, and an improved KH algorithm is proposed in [42]. That algorithm introduces the nearest neighbor lasso operator to mine the neighborhood of potentially excellent individuals and thus improve the local search ability of krill individuals, but the random parameters introduced in the lasso operator increase the uncertainty of the algorithm. To cope with this problem, we introduce an improved krill herd algorithm based on linear nearest neighbor lasso step optimization (LNNLS-KH), which finds the nearest neighbor of each krill individual after the individual positions are updated and moves linearly by a defined step to derive a better fitness value. By introducing linearization, the nearest neighbor lasso step of the algorithm changes linearly with the number of iterations, thereby balancing the exploitation and exploration abilities of the algorithm. In the early iterations, a large linear nearest neighbor lasso step is selected so that krill individuals can quickly adjust their positions, which improves the search efficiency of the algorithm. In the later stage of iteration, the nearest neighbor lasso step decreases linearly to obtain the global optimal solution.

In the krill herd $X = \{X_1, X_2, \ldots, X_n\}$, assuming that the jth krill individual is the nearest neighbor of the ith krill individual, the Euclidean distance between the two krill individuals is defined as follows:

$$\mathrm{distance}_{ij} = \lVert X_i - X_j \rVert, \tag{23}$$

where $\{X_i, X_j\} \subset S$ and $i \neq j$. The equation of the linear nearest neighbor lasso step is defined as follows:

$$\mathrm{step} = \begin{cases} \dfrac{I}{I_{\max}} \times \left(X_i - X_j\right), & K_i > K_j, \\[2mm] \dfrac{I}{I_{\max}} \times \left(X_j - X_i\right), & K_j > K_i. \end{cases} \tag{24}$$

The fitness function is expressed as equation (18). Therefore, a smaller fitness value means that fewer features are selected at a higher accuracy, i.e., the position of the krill individual is better. The schematic diagram of LNNLS-KH is shown in Figure 5. The new position $Y_k$ of the jth krill individual is expressed as follows:

$$Y_k = \begin{cases} X_j + \dfrac{I}{I_{\max}} \times \left(X_i - X_j\right), & K_i > K_j, \\[2mm] X_i + \dfrac{I}{I_{\max}} \times \left(X_j - X_i\right), & K_j > K_i. \end{cases} \tag{25}$$

When the ith and jth krill individuals lie on opposite sides of the food, the new position $Y_k$ obtained by the linear nearest neighbor lasso step optimization will be far from the optimal solution, as shown in Figure 6.
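The lasso step of equations (24) and (25) can be sketched as follows. This is one possible reading, not the authors' code: the function names are hypothetical, a tie in fitness is treated like the second case of equation (25), and the acceptance rule (the candidate replaces the worse of the two individuals only if its fitness is smaller) is an assumption based on the comparison step in Algorithm 1.

```python
import math


def nearest_neighbor(i, positions):
    """Index of the individual with the smallest Euclidean distance to krill i."""
    others = (j for j in range(len(positions)) if j != i)
    return min(others, key=lambda j: math.dist(positions[i], positions[j]))


def lasso_candidate(x_i, x_j, k_i, k_j, it, it_max):
    """Equation (25): candidate position Y_k on the segment between X_i and X_j."""
    ratio = it / it_max
    if k_i > k_j:
        return [b + ratio * (a - b) for a, b in zip(x_i, x_j)]  # X_j + ratio * (X_i - X_j)
    return [a + ratio * (b - a) for a, b in zip(x_i, x_j)]      # X_i + ratio * (X_j - X_i)


def lnnls_refine(i, positions, fitness, fitness_fn, it, it_max):
    """Try the candidate and keep it only if it beats the worse of the pair (assumed rule)."""
    j = nearest_neighbor(i, positions)
    y = lasso_candidate(positions[i], positions[j], fitness[i], fitness[j], it, it_max)
    k_y = fitness_fn(y)
    worse = i if fitness[i] >= fitness[j] else j
    if k_y < fitness[worse]:
        positions[worse], fitness[worse] = y, k_y
    return positions, fitness
```

Rejecting candidates with a larger fitness covers the situation illustrated in Figure 6, where the candidate lands farther from the optimum than both original individuals.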

Table 2: Confusion matrix.

| Confusion matrix | True condition positive | True condition negative |
|---|---|---|
| Predicted condition positive | True positive (TP) | False positive (FP) |
| Predicted condition negative | False negative (FN) | True negative (TN) |


Table 3: Benchmark functions in the experiment.

| Benchmark function | Dim | Range | $f_{\min}$ |
|---|---|---|---|
| $F_1(x) = \sum_{i=1}^{n} \lvert x_i \rvert + \prod_{i=1}^{n} \lvert x_i \rvert$ | 10 | [−10, 10] | 0 |
| $F_2(x) = -20\exp\left(-0.2\sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}\right) - \exp\left(\frac{1}{n}\sum_{i=1}^{n}\cos(2\pi x_i)\right) + 20 + e$ | 10 | [−32, 32] | 0 |

Figure 4: Graphs of the two benchmark functions for n = 2, plotted over x1 and x2. (a) Unimodal benchmark function (Schwefel 2.22). (b) Multimodal benchmark function (Ackley).

Table 4: The statistical results of KH and NO-KH algorithms on two benchmark functions.

| f(x) | Algorithm | Best value | Worst value | Mean value | Standard deviation |
|---|---|---|---|---|---|
| F1 | KH | 1.692E-04 | 1.099E-02 | 1.508E-03 | 3.342E-03 |
| F1 | NO-KH | 3.277E-05 | 9.632E-04 | 4.221E-04 | 3.908E-04 |
| F2 | KH | 5.716E-05 | 2.168 | 0.329 | 0.816 |
| F2 | NO-KH | 8.309E-06 | 1.155 | 0.116 | 0.362 |

Figure 5: Optimization of linear nearest neighbor lasso step for krill individuals at the same end of food (legend: the position of food; the position of krill Xi; the position of new krill Yi after LNNLS; the distance between two krill; the length of LNNLS).

Figure 6: Optimization of linear neighboring lasso step for krill individuals at both ends of food (legend: the position of food; the position of krill Xi; the position of new krill Yi after LNNLS; the distance between two krill; the length of LNNLS).


The pseudocode of the LNNLS-KH algorithm is shown in Algorithm 1.

3.3. Analysis of Time Complexity. In the KH algorithm, each krill individual updates its position after the movement induced by other krill individuals, the foraging activity, and the physical diffusion motion, with a time complexity of $O(N)$. After $I_{\max}$ iterations, the time complexity of the algorithm is $O(I_{\max} \cdot N)$. In the LNNLS-KH algorithm, the modified fitness function and the nonlinear optimization of the physical diffusion motion add hardly any calculations, so the time complexity is not changed. In addition, the linear nearest neighbor lasso step optimization adds the calculations of equations (24) and (25) after each krill individual completes its position update during an iteration, with a time complexity of $O(I_{\max} \cdot N)$. Therefore, the total time complexity of the LNNLS-KH algorithm is $O(2I_{\max} \cdot N)$, which differs from that of the standard KH algorithm only by a constant factor.

3.4. Description of the LNNLS-KH Algorithm for IDS Feature Selection. An IDS is a system that recognizes and processes malicious usage of computers and network resources. An intrusion detection dataset records normal and abnormal traffic, including network traffic data and types of network attacks, and provides data support for the research and development of intrusion detection technology. An IDS is generally composed of data acquisition, data preprocessing, detection units, and response actions, as shown in Figure 7.

The LNNLS-KH algorithm is used to select high-quality feature subsets for the IDS. The features of the intrusion detection dataset are randomly initialized to different real numbers in the range [0, 1], which constitute the position vectors of the krill herd. By calculating the fitness function and carrying out the LNNLS-KH algorithm, the position vectors of the krill herd are constantly updated. The fitness function is determined by the number of selected features and the accuracy of classification, so the position vectors of the krill herd move toward the optimal fitness value. According to [47], it is appropriate to set the feature selection threshold to 0.7. When the maximum number of iterations is reached, the features whose components in the position vector of the krill population are larger than the threshold are selected. The selected features constitute the feature subset of the intrusion detection data. Furthermore, the selected feature subset is sent to the detection units. Because the K-nearest neighbor (KNN) algorithm is relatively mature in theory, the detection units adopt the KNN algorithm to construct the intrusion detection classifier. Finally, the intrusion detection results are evaluated on the test dataset. The process of the LNNLS-KH algorithm for IDS feature selection is shown in Figure 8.
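The thresholding step described above can be illustrated with a short sketch. The function names are hypothetical; the threshold of 0.7 is the value quoted from [47], and the strict comparison follows the wording "larger than the threshold".

```python
def select_features(position, threshold=0.7):
    """Indices of the features whose position component exceeds the threshold."""
    return [m for m, value in enumerate(position) if value > threshold]


def project(rows, selected):
    """Keep only the selected feature columns of every record."""
    return [[row[m] for m in selected] for row in rows]
```

For example, a position vector of [0.9, 0.1, 0.71, 0.7] selects the features with indices 0 and 2, and the reduced records are then passed to the classifier.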

4. Results and Discussion

To verify the performance of the LNNLS-KH algorithm in IDS feature selection, we adopt the NSL-KDD network intrusion detection dataset and the CICIDS2017 dataset for experiments.

4.1. Datasets Analysis. The NSL-KDD dataset is a classic dataset in the field of anomaly detection. As an improved version of the KDD CUP 99 dataset, it is currently one of the most reliable and influential intrusion detection datasets. Compared with the KDD CUP 99 dataset, the NSL-KDD dataset eliminates duplicate data, so it hardly contains redundant records. Meanwhile, the proportion of each type of record in the NSL-KDD dataset has been adjusted to make it reasonable. Each record in the NSL-KDD dataset includes 41-dimensional features and a classification label. KDDTrain+ and KDDTest+ in the NSL-KDD dataset are selected as the training subset and the test subset. The attacks are divided into four types: denial of service (DoS), scan and probe (Probe), remote to local (R2L), and user to root (U2R). The detailed attack names and the distribution of sample categories are shown in Tables 5 and 6. The features of the NSL-KDD dataset are shown in Table 7.

The NSL-KDD dataset includes four types of features: the basic features of TCP connections (9 in total), the content features of TCP connections (13 in total), the time-based network traffic statistics (9 in total), and the host-based network traffic statistics (10 in total). Among all the features, "Protocol_type", "service", and "flag" are character-type features, which need to be preprocessed and mapped to numeric values. Because mixed numeric and character data types are difficult to deal with, one-hot encoding is used to map different characters to different values. For example, the "Protocol_type" feature includes three types of protocol, denoted by icmp = [1, 0, 0], tcp = [0, 1, 0], and udp = [0, 0, 1]. Similarly, the 70 attributes in "service" and the 11 attributes in "flag" are numeralized in the same way. The 41-dimensional feature is thus expanded to 122 dimensions after one-hot encoding. At the same time, the dataset is normalized to eliminate the influence of features with different orders of magnitude on the calculation results, thus reducing the experimental error. The data preprocessing helps improve the accuracy of classification and ensure the reliability of the results. The values corresponding to each feature are normalized to the interval [0, 1], and the normalization expression is as follows:

$$X^{*} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}, \tag{26}$$

where $X^{*}$ is the normalized eigenvalue, $X$ is the original eigenvalue, and $X_{\max}$ and $X_{\min}$ represent the maximum and minimum values of the same feature dimension.
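The two preprocessing steps, one-hot encoding and min-max normalization, can be sketched as follows. The function names are hypothetical, and mapping a constant column to zeros is an added assumption that avoids a zero denominator in the normalization.

```python
def one_hot(value, categories):
    """Map a categorical value to a 0/1 vector with one slot per category."""
    return [1 if value == c else 0 for c in categories]


def min_max_normalize(column):
    """Min-max normalization of one feature column to [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption (not from the text): a constant feature is mapped to 0.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]
```

With the category order icmp, tcp, udp, the value tcp is encoded as [0, 1, 0]. Replacing the three character features by 3 + 70 + 11 = 84 binary columns turns the 41 original features into 41 − 3 + 84 = 122 dimensions.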

Although NSL-KDD is a benchmark dataset in the field of network intrusion detection, some of its attack types are outdated due to the rapid development of network technology. Therefore, it hardly reflects the current real-network environment. CICIDS2017 is a novel network intrusion detection dataset released by the Canadian Institute for Cybersecurity (CIC) in 2017. The dataset collected traffic data for five days, with only normal traffic on Monday and attacks occurring in the morning and afternoon from Tuesday to Friday. It includes "FTP patator", "SSH patator", "DoS GoldenEye", "DoS Slowhttptest", "DoS Slowloris", "Heartbleed", "Web Attack Brute Force", "Web Attack Sql Injection", "Web Attack XSS", "Infiltration Attack", "Bot", "DDoS", and "PortScan", which are common types of attacks in modern networks. The distribution of attack times and types in the CICIDS2017 dataset is shown in Table 8. We use the MachineLearningCVE file in the CICIDS2017 dataset as the dataset, which contains 78 features and an attack type label. The numbers and names of the features are shown in Table 9. Compared with the NSL-KDD dataset, the attack types in the CICIDS2017 dataset are more in line with the situation of modern networks.

Figure 7: The framework of IDS (data acquisition, data preprocessing, detection units, and response actions).

Figure 8: The process of LNNLS-KH algorithm for IDS feature selection.

42 Experimental Results and Discussion of NSL-KDDDataset 2e experiment is conducted in MATLAB R2016aon Windows 64 bit operating system with the processor ofIntel (R) core (TM) i7-4790 Since the training of the al-gorithm requires normal and abnormal samples we mixnormal samples and different types of attack samples toconstruct train sets and test sets of four different attack typesIn order to reduce the time of searching the optimal feature

Input: Training set
Output: Global best solution, the number of selected features, and feature selection time

Begin
  Initialize algorithm parameters: Nmax, Vf, Dmax, NV, Imax, UB, LB
  Initialize the krill herd position
  Evaluate the fitness of krill individuals, and find the individuals with the best and worst fitness values
  for I = 1 to Imax do
    for each krill individual i (i = 1, 2, ..., m) do
      Calculate the three components of motion:
        (1) The motion induced by other krill individuals
        (2) The foraging activity
        (3) The nonlinear optimized physical diffusion
      Implement crossover operator
      Update krill herd position and fitness values
      Calculate the linear nearest neighbor lasso step and new position using equations (24) and (25), and update new fitness values
      if Kyk > Ki or (Kj)
        Leave Ki or (Kj) and delete Kyk
      else
        Leave Kyk and delete Ki or (Kj)
      end if
    end for
    Update Xgb and Kgb of the globally optimal individuals
  end for
  Output the global best solution, the number of selected features, and feature selection time
End

ALGORITHM 1: The LNNLS-KH algorithm.

Table 5: The distribution of sample categories.

| Attack types | Attack names |
|---|---|
| DoS | Neptune, back, land, pod, smurf, teardrop, mailbomb, Apache2, processtable, udpstorm, worm |
| Probe | Ipsweep, nmap, portsweep, Satan, mscan, saint |
| R2L | ftp_write, guess_passwd, imap, multihop, phf, spy, warezclient, warezmaster, sendmail, named, snmpgetattack, snmpguess, xlock, xsnoop, httptunnel |
| U2R | buffer_overflow, loadmodule, perl, rootkit, ps, sqlattack, xterm |

Table 6: The distribution of sample categories.

| Data category | KDDTrain+ samples | KDDTest+ samples | Total number of samples |
|---|---|---|---|
| Normal | 65120 | 11536 | 76656 |
| DoS | 36944 | 6251 | 43195 |
| Probe | 10786 | 2421 | 13207 |
| R2L | 995 | 2653 | 3648 |
| U2R | 52 | 67 | 119 |
| All | 113897 | 22928 | 136825 |


subset, we randomly select 50% of Probe attack samples, 10% of DoS attack samples, 100% of U2R attack samples, and 100% of R2L attack samples in the KDDTrain+ dataset as the training dataset, and 100% of the Probe dataset, 50% of the DoS dataset, 100% of the U2R dataset, and 20% of the R2L dataset in the KDDTest+ dataset as the test dataset.
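The sampling scheme above can be illustrated with a short Python sketch. It draws the stated fraction of attack records per class, using the per-class counts from Table 6. The function names, the fixed seed, and the rounding rule are assumptions made for illustration; the text does not state how many normal samples are mixed in, so they are left out here.

```python
import random

# Attack-sample counts per class, taken from Table 6.
TRAIN_COUNTS = {"Probe": 10786, "DoS": 36944, "U2R": 52, "R2L": 995}
TEST_COUNTS = {"Probe": 2421, "DoS": 6251, "U2R": 67, "R2L": 2653}

# Sampling fractions stated in the text.
TRAIN_FRACTIONS = {"Probe": 0.50, "DoS": 0.10, "U2R": 1.00, "R2L": 1.00}
TEST_FRACTIONS = {"Probe": 1.00, "DoS": 0.50, "U2R": 1.00, "R2L": 0.20}


def sample_indices(n_records, fraction, seed=0):
    """Randomly pick round(fraction * n_records) distinct record indices.

    The rounding rule and the seed are assumptions, not taken from the paper.
    """
    k = round(fraction * n_records)
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_records), k))


def subset_sizes(counts, fractions):
    """Number of attack records drawn for each attack type."""
    return {
        name: len(sample_indices(counts[name], fractions[name]))
        for name in counts
    }


train_sizes = subset_sizes(TRAIN_COUNTS, TRAIN_FRACTIONS)
test_sizes = subset_sizes(TEST_COUNTS, TEST_FRACTIONS)
print(train_sizes)
print(test_sizes)
```

Under these assumptions, the Probe training subset holds about half of the 10786 Probe records, while the U2R subsets keep every available record because that class is so small.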

For the LNNLS-KH algorithm, the maximum number of iterations Imax and the quantity of krill individuals N are set to 100 and 30, respectively. Following [41], the foraging speed of krill individuals Vf is set to 0.02, the maximum random diffusion rate Dmax is set to 0.05, and the maximum induction speed Nmax is set to 0.01. Following [47], the threshold θ is set to 0.7. As the LNNLS-KH algorithm is designed first to ensure high accuracy and then to reduce the number of features, the weight factor α in the fitness function is set to 0.02.
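To make the roles of α and θ concrete, the following Python sketch shows one plausible way such parameters enter a wrapper-style fitness evaluation. It is an assumption-laden illustration, not the paper's definition: the fitness function itself is defined in an earlier section, and both the thresholding rule (`select_features`) and the weighted-sum form (`fitness`) are hypothetical.

```python
import numpy as np

ALPHA = 0.02  # weight factor alpha from the text
THETA = 0.7   # threshold theta from the text


def select_features(position, theta=THETA):
    """Turn a continuous krill position into a boolean feature mask.

    Assumed rule: a feature is kept when its coordinate exceeds theta.
    """
    return np.asarray(position, dtype=float) > theta


def fitness(position, accuracy, alpha=ALPHA, theta=THETA):
    """Assumed weighted sum: small weight on the share of kept features,
    large weight on the classification error (lower is better)."""
    mask = select_features(position, theta)
    feature_ratio = mask.sum() / mask.size
    return alpha * feature_ratio + (1.0 - alpha) * (1.0 - accuracy)


# Hypothetical 4-dimensional position and a classifier accuracy of 95%.
example = fitness([0.9, 0.1, 0.8, 0.3], accuracy=0.95)
print(example)
```

With α as small as 0.02, a one-point change in accuracy moves this value far more than dropping a single feature, which matches the stated priority of accuracy first and dimensionality second.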

$$\mathrm{FPR} = \frac{FP}{TN + FP}, \tag{27}$$

$$\mathrm{DR} = \frac{TP}{TP + FN}. \tag{28}$$

We adopt the iterative curve of the global optimal fitness value, feature selection time, test set detection time, data dimension after feature selection, classification accuracy, detection rate (DR), and false positive rate (FPR) as evaluation measures of feature selection for IDS. The accuracy represents the ratio of the correctly classified samples to the total number of samples, which is defined as equation (19). FPR, also known as false alarm rate (FAR), represents the ratio of samples that are incorrectly detected as intrusions to all normal samples, as shown in

Table 7: The features of NSL-KDD dataset.

| Classification of features | Number | Serial number and name of features |
|---|---|---|
| The basic characteristics of TCP connections | 9 | (1) duration, (2) protocol_type, (3) service, (4) flag, (5) src_bytes, (6) dst_bytes, (7) land, (8) wrong_fragment, (9) urgent |
| The content characteristics of a TCP connection | 13 | (10) hot, (11) num_failed_logins, (12) logged_in, (13) num_compromised, (14) root_shell, (15) num_root, (16) su_attempted, (17) num_file_creations, (18) num_shells, (19) num_access_files, (20) num_outbound_cmds, (21) is_host_login, (22) is_guest_login |
| Time-based statistical characteristics of network traffic | 9 | (23) count, (24) srv_count, (25) serror_rate, (26) srv_serror_rate, (27) rerror_rate, (28) srv_rerror_rate, (29) same_srv_rate, (30) diff_srv_rate, (31) srv_diff_host_rate |
| Host-based network traffic statistics | 10 | (32) dst_host_count, (33) dst_host_srv_count, (34) dst_host_same_srv_rate, (35) dst_host_diff_srv_rate, (36) dst_host_same_src_port_rate, (37) dst_host_srv_diff_host_rate, (38) dst_host_serror_rate, (39) dst_host_srv_serror_rate, (40) dst_host_rerror_rate, (41) dst_host_srv_rerror_rate |

Table 8: Attack time and attack types of the CICIDS2017 dataset.

| Time | Type | Label | Amount | Total |
|---|---|---|---|---|
| Monday | Normal | BENIGN | 529918 | 529918 |
| Tuesday | Normal | BENIGN | 432074 | 445909 |
| | Brute force | FTP patator | 7938 | |
| | | SSH patator | 5897 | |
| Wednesday | Normal | BENIGN | 440031 | 692703 |
| | DoS | DoS GoldenEye | 10293 | |
| | | DoS slowhttptest | 5499 | |
| | | DoS slowloris | 5796 | |
| | | Heartbleed | 11 | |
| Thursday morning | Normal | BENIGN | 168186 | 170366 |
| | Web attack | Web attack brute force | 1507 | |
| | | Web attack sql injection | 21 | |
| | | Web attack XSS | 652 | |
| Thursday afternoon | Normal | BENIGN | 288566 | 288602 |
| | Infiltration | Infiltration | 36 | |
| Friday morning | Normal | BENIGN | 189067 | 191033 |
| | Botnet | Bot | 1966 | |
| Friday afternoon (1) | Normal | BENIGN | 97718 | 225745 |
| | DDoS | DDoS | 128027 | |
| Friday afternoon (2) | Normal | BENIGN | 127537 | 286467 |
| | PortScan | PortScan | 158930 | |


equation (27). DR, also known as recall or sensitivity, represents the probability of being correctly detected in all abnormalities, as shown in equation (28). The crossover-mutation PSO (CMPSO) algorithm [47], ACO algorithm [48], KH algorithm [41], and IKH algorithm [9] are used as comparative experiments. The experimental results of the Probe, DoS, R2L, and U2R datasets are shown as follows.
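The three classification measures can be computed directly from confusion-matrix counts. The sketch below is a minimal Python illustration: accuracy follows the verbal definition given for equation (19), and FPR and DR follow equations (27) and (28). The function name and the example counts are hypothetical.

```python
def detection_metrics(tp, tn, fp, fn):
    """Return (accuracy, FPR, DR) from confusion-matrix counts.

    tp: attacks detected as attacks     fn: attacks missed
    tn: normal traffic left alone       fp: normal traffic flagged as attack
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total  # correctly classified / all samples
    fpr = fp / (tn + fp)          # equation (27)
    dr = tp / (tp + fn)           # equation (28)
    return accuracy, fpr, dr


# Hypothetical counts: 100 attack samples and 100 normal samples.
accuracy, fpr, dr = detection_metrics(tp=90, tn=95, fp=5, fn=10)
print(accuracy, fpr, dr)
```

FPR is normalized by the number of normal samples and DR by the number of attack samples, so the two measures are not affected by the class ratio in the way accuracy is.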

To reflect the performance of the LNNLS-KH algorithm intuitively, the convergence curves of the fitness function for the Probe, DoS, U2R, and R2L datasets are shown in Figure 9. The results show that the LNNLS-KH algorithm achieves a good fitness function value when the number of iterations reaches about 20, which demonstrates the strong exploitation ability and good convergence performance of the LNNLS-KH algorithm. As the number of iterations increases, other algorithms show varying degrees of convergence stagnation, while the LNNLS-KH algorithm constantly jumps out of local optima and finds the global optimal solution with better fitness. The fitness function values after 100 iterations reach 0.0328, 0.0393, 0.0292, and 0.0036, respectively, for the four attack datasets, showing excellent exploration ability. Therefore, compared with the CMPSO, ACO, KH, and IKH algorithms, the LNNLS-KH algorithm exhibits faster convergence speed and stronger abilities of exploitation and exploration.

The results of different feature selection algorithms are shown in Table 10. The bold number in front of the brackets indicates the quantity of features after feature selection, and the specific feature numbers are listed in the brackets. The comparison of feature selection dimensions is shown in Figure 10, and different colours are used to distinguish the five algorithms. The proposed LNNLS-KH algorithm, marked in red, is in the innermost circle of Figure 10 for the Probe, DoS, U2R, and R2L datasets. This indicates that, compared with the other four feature selection algorithms, the LNNLS-KH algorithm retains the fewest features while ensuring accuracy. According to Figure 10, the LNNLS-KH algorithm selects 7 main features of the NSL-KDD dataset on average, accounting

Figure 9: Convergence curve of fitness functions for the four attack datasets. (Four panels: Probe, DoS, R2L, and U2R; x-axis: number of iterations, 0 to 100; y-axis: fitness function; curves: CMPSO, ACO, KH, IKH, and LNNLS-KH.)


for 17.07% of the total number of features. Compared with the CMPSO, ACO, KH, and IKH algorithms, the proposed LNNLS-KH algorithm reduces the features by 44%, 42.86%, 34.88%, and 24.32%, respectively, in the dataset of four attack types. Meanwhile, the total number of features in the four types of attack datasets is reduced by 37.43%.
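These percentages can be reproduced from the per-dataset feature counts in Table 10. The following Python sketch shows the arithmetic; the variable names are illustrative, and reading the 37.43% figure as a reduction relative to the mean total of the four baseline algorithms is an interpretation that happens to match the reported value.

```python
# Number of selected features per dataset (Probe, DoS, U2R, R2L), from Table 10.
SELECTED = {
    "CMPSO": [14, 16, 9, 11],
    "ACO": [15, 16, 8, 10],
    "KH": [13, 12, 8, 10],
    "IKH": [11, 12, 6, 8],
    "LNNLS-KH": [8, 10, 3, 7],
}
TOTAL_FEATURES = 41  # NSL-KDD feature count (Table 7)

ours = sum(SELECTED["LNNLS-KH"])                          # 28 features over four datasets
average_selected = ours / len(SELECTED["LNNLS-KH"])       # 7 features on average
share_of_all = 100 * average_selected / TOTAL_FEATURES    # about 17.07%

# Relative reduction against each baseline, using totals over the four datasets.
reductions = {
    name: 100 * (sum(counts) - ours) / sum(counts)
    for name, counts in SELECTED.items()
    if name != "LNNLS-KH"
}

# Reduction against the mean baseline total (assumed reading of the 37.43% figure).
baseline_mean = sum(sum(c) for n, c in SELECTED.items() if n != "LNNLS-KH") / 4
overall_reduction = 100 * (baseline_mean - ours) / baseline_mean

for name, value in reductions.items():
    print(f"{name}: {value:.2f}%")
print(f"overall: {overall_reduction:.2f}%")
```

For example, CMPSO keeps 50 features in total across the four datasets and LNNLS-KH keeps 28, so the reduction is (50 − 28)/50 = 44%.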

To further evaluate the performance of the feature selection algorithms, we show the feature selection time and detection time of five different algorithms in Table 11. Feature selection time represents the time of filtering out redundant features. The detection time represents the time from inputting the most representative feature subsets into the KNN classifier to the end of detection. It can be seen from Table 11 that the feature selection time of the standard KH algorithm is shorter than that of the CMPSO algorithm and the ACO algorithm, which indicates that the KH algorithm achieves faster speed and better performance. In addition, compared with the standard KH algorithm, the feature selection time of the LNNLS-KH algorithm is longer, which is mainly due to the nonlinear optimization of physical diffusion motion and the linear nearest neighbor lasso step optimization performed after the krill herd position is updated. Although the feature selection time is somewhat increased, the convergence speed and global search ability are greatly improved. At the same time, the LNNLS-KH algorithm removes redundant features, which considerably increases the detection speed. In comparison to the other four feature selection algorithms, the detection time of the LNNLS-KH algorithm is reduced by 16.83%, 16.91%, 8.94%, and 6.96% on average in test dataset samples of Probe, DoS, R2L, and U2R.

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the test dataset is detected using the KNN classifier. The classification accuracy of different algorithms is shown in Table 12. Comparing the accuracy of the results, it is found that the LNNLS-KH feature selection algorithm achieves a classification accuracy of above 90% for Probe, DoS, U2R, and R2L test

Table 9: The number and name of the features in the CICIDS2017 dataset.

| Feature number | Feature name | Feature number | Feature name | Feature number | Feature name |
|---|---|---|---|---|---|
| 1 | Destination port | 27 | Bwd IAT mean | 53 | Average packet size |
| 2 | Flow duration | 28 | Bwd IAT std | 54 | Avg fwd segment size |
| 3 | Total fwd packets | 29 | Bwd IAT max | 55 | Avg bwd segment size |
| 4 | Total backward packets | 30 | Bwd IAT min | 56 | Fwd header length |
| 5 | Total length of fwd packets | 31 | Fwd PSH flags | 57 | Fwd avg bytes/bulk |
| 6 | Total length of bwd packets | 32 | Bwd PSH flags | 58 | Fwd avg packets/bulk |
| 7 | Fwd packet length max | 33 | Fwd URG flags | 59 | Fwd avg bulk rate |
| 8 | Fwd packet length min | 34 | Bwd URG flags | 60 | Bwd avg bytes/bulk |
| 9 | Fwd packet length mean | 35 | Fwd header length | 61 | Bwd avg packets/bulk |
| 10 | Fwd packet length std | 36 | Bwd header length | 62 | Bwd avg bulk rate |
| 11 | Bwd packet length max | 37 | Fwd Packets/s | 63 | Subflow fwd packets |
| 12 | Bwd packet length min | 38 | Bwd Packets/s | 64 | Subflow fwd bytes |
| 13 | Bwd packet length mean | 39 | Min packet length | 65 | Subflow bwd packets |
| 14 | Bwd packet length std | 40 | Max packet length | 66 | Subflow bwd bytes |
| 15 | Flow bytes/s | 41 | Packet length mean | 67 | Init_Win_bytes_forward |
| 16 | Flow packets/s | 42 | Packet length std | 68 | Init_Win_bytes_backward |
| 17 | Flow IAT mean | 43 | Packet length variance | 69 | act_data_pkt_fwd |
| 18 | Flow IAT std | 44 | FIN flag count | 70 | min_seg_size_forward |
| 19 | Flow IAT max | 45 | SYN flag count | 71 | Active mean |
| 20 | Flow IAT min | 46 | RST flag count | 72 | Active std |
| 21 | Fwd IAT total | 47 | PSH flag count | 73 | Active max |
| 22 | Fwd IAT mean | 48 | ACK flag count | 74 | Active min |
| 23 | Fwd IAT std | 49 | URG flag count | 75 | Idle mean |
| 24 | Fwd IAT max | 50 | CWE flag count | 76 | Idle std |
| 25 | Fwd IAT min | 51 | ECE flag count | 77 | Idle max |
| 26 | Bwd IAT total | 52 | Down/up ratio | 78 | Idle min |

Figure 10: Comparison of feature selection dimensions produced by different algorithms. (Radial chart with axes Probe, DoS, U2R, and R2L on a scale of 0 to 20 features; series: CMPSO, ACO, KH, IKH, and LNNLS-KH.)


dataset samples. Furthermore, the LNNLS-KH algorithm improves the average classification accuracy of Probe, DoS, U2R, and R2L test dataset samples by 9.95%, 12.04%, 9.47%, and 8.66%.
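The evaluation step described above, in which a selected feature subset is passed to a KNN classifier, can be sketched in Python as follows. This is a minimal illustration rather than the MATLAB implementation used in the experiments: the KNN routine, the value k = 3, the binary 0/1 labels, and the synthetic data are all assumptions, and the records are assumed to be numerically encoded already. Only the feature numbers (3, 33, 36), the LNNLS-KH subset for U2R in Table 10, come from the paper.

```python
import numpy as np


def knn_predict(train_x, train_y, test_x, k=3):
    """Binary KNN with Euclidean distance and majority voting (labels 0/1)."""
    diff = test_x[:, None, :] - train_x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))         # shape (n_test, n_train)
    nearest = np.argsort(dist, axis=1)[:, :k]       # indices of the k closest rows
    votes = train_y[nearest]                        # shape (n_test, k)
    return (votes.sum(axis=1) * 2 > k).astype(int)  # 1 if most neighbours are attacks


def classify_with_subset(train_x, train_y, test_x, feature_numbers, k=3):
    """Keep only the selected columns (1-based feature numbers), then run KNN."""
    cols = np.asarray(feature_numbers) - 1
    return knn_predict(train_x[:, cols], train_y, test_x[:, cols], k)


def accuracy(predicted, actual):
    return float((predicted == actual).mean())


U2R_SUBSET = [3, 33, 36]  # LNNLS-KH result for U2R in Table 10
rng = np.random.default_rng(0)


def make_toy(n_per_class):
    """Synthetic 41-feature records; only the subset columns separate the classes."""
    x = rng.normal(size=(2 * n_per_class, 41))
    y = np.repeat([0, 1], n_per_class)
    attack_rows = np.flatnonzero(y == 1)
    x[np.ix_(attack_rows, np.asarray(U2R_SUBSET) - 1)] += 10.0
    return x, y


train_x, train_y = make_toy(20)
test_x, test_y = make_toy(10)
predicted = classify_with_subset(train_x, train_y, test_x, U2R_SUBSET)
toy_accuracy = accuracy(predicted, test_y)
print(toy_accuracy)
```

Because distances are computed only over the selected columns, a smaller subset shrinks the per-sample cost of the distance computation, which is the mechanism behind the shorter detection times reported for LNNLS-KH.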

Table 13 shows the false positive rate and detection rate of the feature subsets produced by different feature selection algorithms. To visualize the difference, we show the comparison in Figure 11. For the Probe, DoS, U2R, and R2L datasets, the average false positive rate of the LNNLS-KH feature selection algorithm is 4.00%. It is lower by 20.70%, 15.30%, 8.88%, and 3.34%, respectively, compared with the CMPSO, ACO, KH, and IKH algorithms. Similarly, for the detection rate, the proposed LNNLS-KH feature selection algorithm exhibits excellent performance. The average detection rate of

Table 10: The feature selection results of different feature selection algorithms (NSL-KDD dataset).

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Probe | 14 (2, 3, 4, 7, 8, 10, 11, 17, 19, 20, 21, 27, 30, 33) | 15 (1, 3, 4, 6, 15, 16, 17, 19, 21, 23, 29, 35, 39, 40, 41) | 13 (3, 4, 5, 7, 8, 13, 14, 18, 19, 21, 26, 28, 40) | 11 (2, 3, 5, 8, 10, 17, 18, 29, 34, 35, 41) | 8 (3, 4, 8, 11, 15, 29, 34, 40) |
| DoS | 16 (3, 4, 5, 6, 8, 13, 14, 17, 18, 22, 23, 26, 30, 32, 35, 41) | 16 (3, 4, 7, 12, 14, 19, 20, 25, 27, 28, 30, 33, 34, 37, 40, 41) | 12 (2, 3, 4, 5, 8, 9, 12, 15, 19, 24, 26, 30) | 12 (2, 3, 4, 6, 12, 18, 20, 22, 27, 28, 30, 31) | 10 (3, 4, 6, 15, 17, 19, 20, 21, 30, 37) |
| U2R | 9 (3, 4, 5, 9, 12, 19, 32, 33, 41) | 8 (3, 4, 6, 8, 20, 24, 33, 36) | 8 (3, 4, 10, 12, 19, 23, 31, 32) | 6 (3, 10, 11, 21, 36, 39) | 3 (3, 33, 36) |
| R2L | 11 (2, 3, 4, 8, 21, 22, 25, 27, 37, 40, 41) | 10 (3, 4, 7, 12, 17, 21, 29, 37, 38, 40) | 10 (2, 3, 4, 6, 13, 18, 19, 22, 32, 41) | 8 (3, 4, 5, 8, 11, 14, 21, 31) | 7 (2, 3, 4, 10, 15, 21, 36) |

Table 11: Feature selection time and detection time of different feature selection algorithms (NSL-KDD dataset).

Time of feature selection (second):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Probe | 523178 | 499814 | 474533 | 534887 | 549048 |
| DoS | 789235 | 763086 | 716852 | 803816 | 829692 |
| U2R | 15487 | 14729 | 14418 | 15779 | 17224 |
| R2L | 255675 | 236908 | 224092 | 266951 | 272770 |

Time of detection (second):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Probe | 3713 | 3823 | 3530 | 3405 | 3106 |
| DoS | 11869 | 11815 | 10666 | 10514 | 9844 |
| U2R | 0087 | 0086 | 0086 | 0086 | 0078 |
| R2L | 955 | 913 | 907 | 862 | 803 |

Table 12: The classification accuracy of different feature selection algorithms (NSL-KDD dataset).

| Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%) |
|---|---|---|---|---|---|
| Probe | 80.46 | 86.56 | 92.42 | 93.74 | 98.24 |
| DoS | 81.74 | 83.36 | 86.03 | 88.74 | 97.01 |
| U2R | 82.74 | 84.57 | 85.59 | 91.89 | 95.67 |
| R2L | 78.70 | 81.62 | 88.78 | 90.49 | 93.56 |

Figure 11: Comparison of classification FPR and DR of different feature selection algorithms. (a) FPR of different feature selection algorithms. (b) DR of different feature selection algorithms. (Bar charts over the Probe, DoS, U2R, and R2L datasets; y-axes: FPR (%) and DR (%); series: CMPSO, ACO, KH, IKH, and LNNLS-KH.)


the LNNLS-KH algorithm is 96.48%, which is 13.47%, 9.32%, 7.02%, and 4.72% higher than the CMPSO, ACO, KH, and IKH feature selection algorithms, respectively.
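The FPR and DR differences quoted above are differences between dataset-averaged rates, so they are expressed in percentage points rather than as relative changes. The following Python sketch reproduces them from the Table 13 entries, read as percentages; the helper names are illustrative.

```python
# Table 13 values in percent, ordered as Probe, DoS, U2R, R2L.
FPR = {
    "CMPSO": [22.37, 21.27, 24.51, 30.66],
    "ACO": [18.04, 14.08, 21.04, 24.05],
    "KH": [8.50, 11.45, 16.13, 15.42],
    "IKH": [4.05, 7.88, 8.45, 8.99],
    "LNNLS-KH": [1.18, 2.85, 4.30, 7.67],
}
DR = {
    "CMPSO": [82.32, 79.12, 87.02, 83.56],
    "ACO": [89.18, 82.08, 89.79, 87.56],
    "KH": [95.01, 83.77, 90.14, 88.91],
    "IKH": [95.22, 85.23, 93.67, 92.89],
    "LNNLS-KH": [97.73, 96.80, 95.52, 95.85],
}


def mean(values):
    return sum(values) / len(values)


def baseline_minus_proposed(table, proposed="LNNLS-KH"):
    """Average rate of each baseline minus the average rate of the proposed method."""
    ours = mean(table[proposed])
    return {
        name: mean(values) - ours
        for name, values in table.items()
        if name != proposed
    }


# Positive values mean the baseline has the higher false positive rate.
fpr_gaps = baseline_minus_proposed(FPR)
# Sign flipped so that positive values mean LNNLS-KH has the higher detection rate.
dr_gaps = {name: -gap for name, gap in baseline_minus_proposed(DR).items()}

print({name: round(gap, 2) for name, gap in fpr_gaps.items()})
print({name: round(gap, 2) for name, gap in dr_gaps.items()})
```

For instance, the average FPR of CMPSO over the four datasets is about 24.70% and that of LNNLS-KH is 4.00%, which gives the 20.70-point gap.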

In conclusion, the LNNLS-KH feature selection algorithm performs excellently in the global optimal fitness iteration curve, test set detection time, number of dimensions of the feature subset, classification accuracy, false positive rate, and detection rate. Although the offline training time of the LNNLS-KH algorithm is longer than that of the CMPSO, ACO, KH, and IKH algorithms, its lower feature dimension reduces the detection time. Moreover, the algorithm has faster convergence speed, higher detection accuracy, a lower false positive rate, and a higher detection rate.

4.3. Experimental Results and Discussion of CICIDS2017 Dataset. The experiment is conducted in MATLAB R2016a on a Windows 64-bit operating system with an Intel(R) Core(TM) i7-4790 processor. The MachineLearningCVE file in the CICIDS2017 dataset includes 8 CSV files of all traffic data, which contain 78 features plus an attack type tag after removing some duplicate features. We annotate traffic records according to different attack periods and types, and standardize and normalize the dataset. Due to the excessive amount of data contained in the analyzed CSV files, problems such as excessively long running time and slow convergence of the model occur when the host is used for model training. Therefore, we simplified and reintegrated these CSV data files while preserving the original attack timing features. We selected a total of 12090 records and 5 types of traffic, including 1 type of normal traffic and 4 types of attack traffic, namely “DoS,” “DDoS,” “PortScan,” and “WebAttack.” The data are randomly divided into training sets and test sets in a 2:1 ratio with independent and repeated experiments.
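The preprocessing and splitting steps described above can be sketched in Python as follows. The text does not give the exact formulas, so z-score standardization followed by min-max normalization is an assumption, as are the function names, the seed, and the toy input.

```python
import numpy as np


def standardize(x):
    """Zero mean and unit variance per column (constant columns become zero)."""
    std = x.std(axis=0)
    std[std == 0] = 1.0
    return (x - x.mean(axis=0)) / std


def min_max_normalize(x):
    """Rescale each column to [0, 1] (constant columns become zero)."""
    low = x.min(axis=0)
    span = x.max(axis=0) - low
    span[span == 0] = 1.0
    return (x - low) / span


def split_two_to_one(x, y, seed=0):
    """Randomly split records into training and test sets in a 2:1 ratio."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    cut = (2 * len(x)) // 3
    train, test = order[:cut], order[cut:]
    return x[train], y[train], x[test], y[test]


# Toy stand-in for the traffic records: 12 rows, 2 features, binary labels.
demo_x = np.arange(24, dtype=float).reshape(12, 2)
demo_y = np.array([0, 1] * 6)
prepared = min_max_normalize(standardize(demo_x))
train_x, train_y, test_x, test_y = split_two_to_one(prepared, demo_y)
print(train_x.shape, test_x.shape)
```

Applied to the 12090 selected records, a 2:1 split of this kind yields 8060 training records and 4030 test records; repeating the split with different seeds corresponds to the independent, repeated experiments mentioned in the text.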

The CMPSO, ACO, KH, and IKH algorithms are used for comparison with the LNNLS-KH algorithm. The preprocessed Normal, DoS, DDoS, PortScan, and WebAttack subsets are input into the algorithm model successively, and the dimension and feature subsets of feature selection are obtained. We adopt the KNN classification model as the classifier and obtain the accuracy of intrusion detection on the test set data. The results of feature selection dimension for the CICIDS2017 dataset are shown in Table 14. According to different attack types, the LNNLS-KH algorithm selects different features. For example, the selected features of the DoS subset are “Total Length of Bwd Packets,” “Fwd Packet Length Min,” “Flow IAT Min,” “FIN Flag Count,” “RST Flag Count,” “URG Flag Count,” “Bwd Avg Packets/Bulk,” “Idle Mean,” and “Idle Std.” For the WebAttack subset, “Total Fwd Packets,” “Bwd IAT Max,” “Bwd PSH Flags,” “Fwd Packets/s,” “Bwd Avg Packets/Bulk,” “Subflow Fwd Bytes,” “Active Max,” and “Idle Max” are selected as attack features by the LNNLS-KH algorithm. It reduces the feature dimension of the IDS dataset while ensuring high accuracy. The average feature dimension selected by the LNNLS-KH algorithm is 10.2, accounting for 13.08% of the total number of features in the CICIDS2017 dataset. It decreases the

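Table 13: The classification FPR and DR of different feature selection algorithms (NSL-KDD dataset).

False positive rate (FPR) (%):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Probe | 22.37 | 18.04 | 8.50 | 4.05 | 1.18 |
| DoS | 21.27 | 14.08 | 11.45 | 7.88 | 2.85 |
| U2R | 24.51 | 21.04 | 16.13 | 8.45 | 4.30 |
| R2L | 30.66 | 24.05 | 15.42 | 8.99 | 7.67 |

Detection rate (DR) (%):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Probe | 82.32 | 89.18 | 95.01 | 95.22 | 97.73 |
| DoS | 79.12 | 82.08 | 83.77 | 85.23 | 96.80 |
| U2R | 87.02 | 89.79 | 90.14 | 93.67 | 95.52 |
| R2L | 83.56 | 87.56 | 88.91 | 92.89 | 95.85 |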

Figure 12: Comparison of feature selection time and intrusion detection time for different feature selection algorithms. (a) Feature selection time for different feature selection algorithms. (b) Intrusion detection time of different feature selection algorithms. (Horizontal bar charts over the Normal, DoS, DDoS, PortScan, and WebAttack subsets; x-axes: time of feature selection (second), 0 to 10000, and time of intrusion detection (second), 0 to 2.5; series: CMPSO, ACO, KH, IKH, and LNNLS-KH.)


number of features by 57.85%, 52.34%, 27.14%, and 25%, respectively, compared with the CMPSO, ACO, KH, and IKH algorithms.

Figure 12 shows the feature selection time and intrusion detection time of 5 different feature selection algorithms to further evaluate the performance of the feature selection algorithm. It can be seen from Figure 12(a) that, in the feature selection stage, the LNNLS-KH algorithm takes a long time to find the optimal feature subset due to the linear nearest neighbor lasso step optimization after the position update of the krill herd. Compared with the KH and IKH algorithms, it increases the time by an average of 14.38% and 9.32%. Although the LNNLS-KH algorithm requires more calculation time, the convergence speed and global search ability are improved. Figure 12(b) shows the intrusion detection time of 5 different feature selection algorithms. It is the detection time of the sample dataset by the KNN classifier after the feature subset is found, excluding the time of searching for the optimal feature subset. The feature dimension of the LNNLS-KH algorithm is low, and the amount of data processed in the classification of the detection sample dataset is small, which results in the reduction of classification detection time. Compared with the CMPSO, ACO, KH, and IKH algorithms, the intrusion detection time of the LNNLS-KH algorithm is reduced by 6.52%, 5.17%, 2.14%, and 2.28% on average.

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the KNN classifier is used to detect the test dataset. The classification accuracy of different algorithms is shown in Table 15. For the five types of subsets, the average classification accuracy of the proposed LNNLS-KH algorithm is 95.86%. In particular, the classification accuracy reached 97.55% for the PortScan subset. Compared with the other four feature selection methods, the LNNLS-KH algorithm has an average increase of 3.11%, 8.52%, 8.58%, 2.45%, and 4.29% on the

Table 14: The number of features selected by different algorithms (CICIDS2017 dataset).

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Normal | 28 (3, 7, 13, 15, 16, 17, 20, 22, 24, 26, 30, 35, 37, 38, 42, 43, 44, 45, 46, 49, 50, 56, 59, 62, 63, 64, 65, 76) | 25 (1, 3, 4, 7, 10, 11, 12, 13, 15, 19, 29, 32, 34, 35, 37, 43, 46, 47, 51, 55, 56, 58, 73, 76, 78) | 14 (11, 19, 33, 39, 43, 49, 55, 56, 58, 65, 66, 68, 71, 73) | 14 (5, 10, 19, 20, 21, 23, 27, 33, 43, 56, 69, 70, 73, 78) | 8 (6, 12, 16, 32, 38, 50, 54, 73) |
| DoS | 24 (1, 3, 4, 13, 16, 17, 24, 26, 30, 33, 35, 39, 40, 44, 48, 51, 53, 57, 58, 59, 60, 62, 67, 70) | 19 (3, 6, 12, 13, 15, 26, 35, 39, 51, 55, 60, 61, 66, 69, 71, 73, 75, 77, 78) | 13 (8, 16, 21, 30, 45, 50, 52, 57, 59, 63, 66, 67) | 14 (2, 12, 15, 16, 19, 21, 32, 34, 44, 46, 65, 68, 76, 77) | 9 (6, 8, 20, 44, 46, 49, 61, 75, 76) |
| DDoS | 29 (15, 18, 19, 20, 23, 25, 26, 33, 34, 35, 38, 39, 42, 43, 46, 47, 49, 51, 55, 56, 57, 59, 60, 61, 62, 63, 71, 72, 78) | 27 (6, 9, 10, 13, 16, 19, 24, 28, 31, 41, 42, 45, 47, 48, 50, 51, 52, 53, 54, 56, 59, 60, 61, 62, 65, 68, 72) | 21 (10, 12, 13, 15, 18, 23, 27, 30, 34, 35, 41, 42, 45, 55, 61, 63, 65, 66, 68, 70, 76) | 18 (1, 11, 13, 14, 19, 24, 32, 35, 36, 40, 42, 47, 51, 57, 60, 69, 70, 75) | 14 (2, 5, 8, 9, 11, 22, 26, 33, 41, 43, 47, 51, 74, 77) |
| PortScan | 24 (1, 3, 6, 15, 16, 28, 30, 33, 35, 37, 44, 45, 52, 56, 59, 60, 61, 63, 65, 68, 70, 75, 77, 78) | 21 (1, 2, 6, 10, 15, 17, 26, 27, 29, 39, 42, 43, 46, 49, 58, 61, 66, 69, 70, 71, 76) | 14 (15, 20, 22, 27, 37, 44, 49, 50, 53, 59, 62, 65, 67, 78) | 15 (1, 24, 30, 32, 33, 43, 49, 53, 54, 58, 60, 61, 63, 64, 69) | 12 (2, 6, 15, 24, 25, 28, 32, 57, 59, 63, 66, 76) |
| WebAttack | 16 (2, 7, 26, 29, 45, 47, 50, 52, 53, 54, 63, 66, 68, 69, 72, 78) | 15 (3, 9, 10, 12, 19, 26, 40, 46, 50, 54, 64, 65, 68, 69, 73) | 8 (1, 17, 19, 36, 48, 49, 53, 60) | 7 (14, 17, 35, 39, 44, 48, 54) | 8 (3, 29, 32, 37, 61, 64, 73, 77) |

Table 15: The classification accuracy of different feature selection algorithms (CICIDS2017 dataset).

| Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%) |
|---|---|---|---|---|---|
| Normal | 89.78 | 89.06 | 92.70 | 94.58 | 94.64 |
| DoS | 77.03 | 82.69 | 90.90 | 93.34 | 94.51 |
| DDoS | 81.73 | 86.94 | 91.85 | 88.19 | 95.76 |
| PortScan | 92.38 | 95.64 | 95.05 | 97.35 | 97.55 |
| WebAttack | 89.12 | 93.08 | 93.77 | 94.26 | 96.85 |

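Table 16: The classification FPR and DR of different feature selection algorithms (CICIDS2017 dataset).

False positive rate (FPR) (%):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Normal | 9.25 | 8.72 | 6.41 | 4.93 | 3.67 |
| DoS | 5.41 | 4.48 | 4.06 | 2.83 | 1.94 |
| DDoS | 6.85 | 4.92 | 4.54 | 6.33 | 3.18 |
| PortScan | 4.65 | 3.02 | 2.84 | 1.86 | 1.16 |
| WebAttack | 5.33 | 3.16 | 2.52 | 2.11 | 1.60 |

Detection rate (DR) (%):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Normal | 88.05 | 88.51 | 89.25 | 92.46 | 93.89 |
| DoS | 72.57 | 82.89 | 87.86 | 92.56 | 92.64 |
| DDoS | 79.03 | 83.47 | 90.22 | 87.52 | 92.98 |
| PortScan | 88.25 | 93.80 | 94.33 | 95.14 | 95.42 |
| WebAttack | 87.40 | 91.35 | 92.19 | 92.94 | 94.77 |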


Normal, DoS, DDoS, PortScan, and WebAttack subsets, respectively. Table 16 shows the classification FPR and DR of different feature selection algorithms on the test sets. Based on the detection of five different test sets, the LNNLS-KH algorithm has lower FPR and higher DR than the other four algorithms.

We propose the LNNLS-KH algorithm, a novel feature selection algorithm for intrusion detection. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the algorithm has good feature selection performance and improves the efficiency of intrusion detection.

5. Conclusions

With the rapid development of network technology, intrusion detection plays an increasingly important role in network security. However, the “dimensional disaster” caused by massive data results in problems such as slow response and poor accuracy of the intrusion detection system. The KH algorithm is a new population-based swarm intelligence optimization method, which shows good performance in high-dimensional data processing, providing a new approach for reducing the dimension of intrusion detection data and selecting useful features. In this paper, an improved KH algorithm named LNNLS-KH is proposed for feature selection of IDS datasets by linear nearest neighbor lasso optimization. The LNNLS-KH algorithm introduces a new fitness function, which is composed of the number of feature selection dimensions and classification accuracy. Nonlinear optimization is introduced into the physical diffusion motion of krill individuals to accelerate the convergence speed of the algorithm. Moreover, the linear nearest neighbor lasso step optimization is proposed to balance the exploration and exploitation abilities and obtain the global optimal solution of the feature subset effectively. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the LNNLS-KH algorithm retains 7 and 10.2 features on average, which greatly reduces the dimension of the features. In the NSL-KDD dataset, features are reduced by 44%, 42.86%, 34.88%, and 24.32% compared with the CMPSO, ACO, KH, and IKH algorithms, and in the CICIDS2017 dataset, they are reduced by 57.85%, 52.34%, 27.14%, and 25%, respectively. In addition, the classification accuracy of the LNNLS-KH feature selection algorithm is increased by 10.03% and 5.39%, and the time of intrusion detection is reduced by 12.41% and 4.03% on the two datasets. Furthermore, the LNNLS-KH algorithm enhances the ability of jumping out of the local optimal solution and shows good performance in the optimal fitness iteration curve, false positive rate of detection, and convergence speed, which demonstrates that the proposed LNNLS-KH algorithm is an efficient feature selection method for network intrusion detection.

In this research, we found that the initialization of the LNNLS-KH algorithm has a certain degree of randomness. Therefore, we conducted independent and repeated experiments to reduce its influence, and the results were reasonable and convincing. Although the proposed algorithm shows encouraging performance, it could be further improved.

In future work, we will consider using data balancing techniques to preprocess the experimental dataset to obtain more accurate feature selection results and stronger algorithm stability. Meanwhile, we will combine the LNNLS-KH with other algorithms to improve the exploration and exploitation abilities, thereby further shortening the time of training feature subsets and classification detection. In addition, as the LNNLS-KH algorithm is universally applicable, it can be applied to more feature selection systems and to optimization problems in other fields.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was sponsored by the National Key Research and Development Program of China (Grants 2018YFB0804002 and 2017YFB0803204), National Natural Science Foundation of PR China (Grant 72001191), Henan Natural Science Foundation (Grant 202300410442), and Henan Philosophy and Social Science Program (Grant 2020CZH009).

References

[1] W. Wei and C. Guo, “A text semantic topic discovery method based on the conditional co-occurrence degree,” Neurocomputing, vol. 368, pp. 11–24, 2019.

[2] C.-R. Wang, R.-F. Xu, S.-J. Lee, and C.-H. Lee, “Network intrusion detection using equality constrained-optimization-based extreme learning machines,” Knowledge-Based Systems, vol. 147, pp. 68–80, 2018.

[3] G.-G. Wang, A. H. Gandomi, A. H. Alavi, and D. Gong, “A comprehensive review of krill herd algorithm: variants, hybrids and applications,” Artificial Intelligence Review, vol. 51, no. 1, pp. 119–148, 2019.

[4] J. Amudhavel, D. Sathian, R. S. Raghav et al., “A fault tolerant distributed self-organization in peer to peer (p2p) using krill herd optimization,” in Proceedings of the 2015 International Conference on Advanced Research in Computer Science Engineering & Technology (ICARCSET 2015), pp. 1–5, Unnao, India, 2015.

[5] L. M. Abualigah, A. T. Khader, and E. S. Hanandeh, “Hybrid clustering analysis using improved krill herd algorithm,” Applied Intelligence, vol. 48, no. 11, pp. 4047–4071, 2018.

[6] P. A. Kowalski and S. Łukasik, “Training neural networks with krill herd algorithm,” Neural Processing Letters, vol. 44, no. 1, pp. 5–17, 2016.

[7] C. Stasinakis, G. Sermpinis, I. Psaradellis, and T. Verousis, “Krill-Herd Support Vector Regression and heterogeneous autoregressive leverage: evidence from forecasting and trading commodities,” Quantitative Finance, vol. 16, no. 12, pp. 1901–1915, 2016.


[8] L Wang P Jia T Huang S Duan J Yan and L Wang ldquoAnovel optimization technique to improve gas recognition byelectronic noses based on the enhanced krill herd algorithmrdquoSensors vol 16 no 8 p 1275 2016

[9] R Jensi and GW Jiji ldquoAn improved krill herd algorithmwithglobal exploration capability for solving numerical functionoptimization problems and its application to data clusteringrdquoApplied Soft Computing vol 46 pp 230ndash245 2016

[10] H Pulluri R Naresh and V Sharma ldquoApplication of studkrill herd algorithm for solution of optimal power flowproblemsrdquo International Transactions on Electrical EnergySystems vol 27 no 6 Article ID e2316 2017

[11] D Rodrigues L A M Pereira J P Papa et al ldquoA binary krillherd approach for feature selectionrdquo in Proceedings of the 201422nd International Conference on Pattern Recognitionpp 1407ndash1412 IEEE Stockholm Sweden August 2014

[12] A Mukherjee and V Mukherjee ldquoChaotic krill herd algo-rithm for optimal reactive power dispatch considering FACTSdevicesrdquo Applied Soft Computing vol 44 pp 163ndash190 2016

[13] S Sun H Qi F Zhao L Ruan and B Li ldquoInverse geometrydesign of two-dimensional complex radiative enclosures usingkrill herd optimization algorithmrdquo Applied ermal Engi-neering vol 98 pp 1104ndash1115 2016

[14] S Sultana and P K Roy ldquoOppositional krill herd algorithmfor optimal location of capacitor with reconfiguration inradial distribution systemrdquo International Journal of ElectricalPower amp Energy Systems vol 74 pp 78ndash90 2016

[15] L Brezocnik I Fister and V Podgorelec ldquoSwarm intelligencealgorithms for feature selection a reviewrdquo Applied Sciencesvol 8 no 9 2018

[16] D Smith Q Guan and S Fu ldquoAn anomaly detectionframework for autonomic management of compute cloudsystemsrdquo in Proceedings of the 2010 IEEE 34th AnnualComputer Software and Applications Conference Workshopspp 376ndash381 IEEE Seoul South Korea July 2010

[17] Y Zhao Y Zhang W Tong et al ldquoAn improved featureselection algorithm based on MAHALANOBIS distance fornetwork intrusion detectionrdquo in Proceedings of 2013 Inter-national Conference on Sensor Network Security Technologyand Privacy Communication System pp 69ndash73 IEEE Nan-gang China May 2013

[18] P. Singh and A. Tiwari, "An efficient approach for intrusion detection in reduced features of KDD99 using ID3 and classification with KNNGA," in Proceedings of the 2015 Second International Conference on Advances in Computing and Communication Engineering, pp. 445–452, IEEE, Dehradun, India, May 2015.

[19] M. A. Ambusaidi, X. He, P. Nanda, and Z. Tan, "Building an intrusion detection system using a filter-based feature selection algorithm," IEEE Transactions on Computers, vol. 65, no. 10, pp. 2986–2998, 2016.

[20] N. Shone, T. N. Ngoc, V. D. Phai, and Q. Shi, "A deep learning approach to network intrusion detection," IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 2, no. 1, pp. 41–50, 2018.

[21] Y. Xue, W. Jia, X. Zhao et al., "An evolutionary computation based feature selection method for intrusion detection," Security and Communication Networks, vol. 2018, Article ID 2492956, 10 pages, 2018.

[22] Z. Shen, Y. Zhang, and W. Chen, "A bayesian classification intrusion detection method based on the fusion of PCA and LDA," Security and Communication Networks, vol. 2019, Article ID 6346708, 11 pages, 2019.

[23] P. Sun, P. Liu, Q. Li et al., "DL-IDS: extracting features using CNN-LSTM hybrid network for intrusion detection system," Security and Communication Networks, vol. 2020, Article ID 8890306, 11 pages, 2020.

[24] G. Farahani, "Feature selection based on cross-correlation for the intrusion detection system," Security & Communication Networks, vol. 2020, Article ID 8875404, 17 pages, 2020.

[25] F. G. Mohammadi, M. H. Amini, and H. R. Arabnia, "Applications of nature-inspired algorithms for dimension reduction: enabling efficient data analytics," in Advances in Intelligent Systems and Computing, Optimization, Learning, and Control for Interdependent Complex Networks, pp. 67–84, Springer, Cham, Switzerland, 2020.

[26] J. Kennedy and R. Eberhart, "Particle swarm optimization," in Proceedings of the ICNN'95-International Conference on Neural Networks, no. 4, pp. 1942–1948, IEEE, Perth, WA, Australia, December 1995.

[27] M. Dorigo, M. Birattari, and T. Stutzle, "Ant colony optimization," IEEE Computational Intelligence Magazine, vol. 1, no. 4, pp. 28–39, 2006.

[28] R. Rajabioun, "Cuckoo optimization algorithm," Applied Soft Computing, vol. 11, no. 8, pp. 5508–5518, 2011.

[29] M. Neshat, G. Sepidnam, M. Sargolzaei, and A. N. Toosi, "Artificial fish swarm algorithm: a survey of the state-of-the-art, hybridization, combinatorial and indicative applications," Artificial Intelligence Review, vol. 42, no. 4, pp. 965–997, 2014.

[30] D. Karaboga, "An idea based on honey bee swarm for numerical optimization," Technical Report-tr06, Erciyes University, Engineering Faculty, Computer Engineering Department, Kayseri, Turkey, 2005.

[31] W.-T. Pan, "A new Fruit Fly Optimization Algorithm: taking the financial distress model as an example," Knowledge-Based Systems, vol. 26, pp. 69–74, 2012.

[32] R. Zhao and W. Tang, "Monkey algorithm for global numerical optimization," Journal of Uncertain Systems, vol. 2, no. 3, pp. 165–176, 2008.

[33] X. S. Yang and X. He, "Bat algorithm: literature review and applications," International Journal of Bio-Inspired Computation, vol. 5, no. 3, pp. 141–149, 2013.

[34] S. Mirjalili, A. H. Gandomi, S. Z. Mirjalili, S. Saremi, H. Faris, and S. M. Mirjalili, "Salp Swarm Algorithm: a bio-inspired optimizer for engineering design problems," Advances in Engineering Software, vol. 114, pp. 163–191, 2017.

[35] K. Ahmed, A. E. Hassanien, and S. Bhattacharyya, "A novel chaotic chicken swarm optimization algorithm for feature selection," in Proceedings of the 2017 Third International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), pp. 259–264, IEEE, Kolkata, India, November 2017.

[36] S. Tabakhi, P. Moradi, F. Akhlaghian et al., "An unsupervised feature selection algorithm based on ant colony optimization," Engineering Applications of Artificial Intelligence, vol. 32, pp. 112–123, 2014.

[37] S. Arora and P. Anand, "Binary butterfly optimization approaches for feature selection," Expert Systems with Applications, vol. 116, pp. 147–160, 2019.

[38] C. Yan, J. Ma, H. Luo, and A. Patel, "Hybrid binary coral reefs optimization algorithm with simulated annealing for feature selection in high-dimensional biomedical datasets," Chemometrics and Intelligent Laboratory Systems, vol. 184, pp. 102–111, 2019.

[39] G. I. Sayed, A. Tharwat, and A. E. Hassanien, "Chaotic dragonfly algorithm: an improved metaheuristic algorithm for feature selection," Applied Intelligence, vol. 49, no. 1, pp. 188–205, 2019.

[40] Z. Zhang, P. Wei, Y. Li et al., "Feature selection algorithm based on improved particle swarm joint taboo search," Journal of Communication, vol. 39, no. 12, pp. 60–68, 2018.

[41] A. H. Gandomi and A. H. Alavi, "Krill herd: a new bio-inspired optimization algorithm," Communications in Nonlinear Science and Numerical Simulation, vol. 17, no. 12, pp. 4831–4845, 2012.

[42] Q. Tan and Z. Huang, "Krill herd with nearest neighbor lasso operator," Computer Engineering and Applications, vol. 55, no. 9, pp. 124–129, 2019.

[43] Q. Wang, C. Ding, and X. Wang, "A hybrid data clustering algorithm based on improved krill herd algorithm and KHM clustering," Control and Decision, vol. 35, no. 10, pp. 2449–2458, 2018.

[44] Q. Li and B. Liu, "Clustering using an improved krill herd algorithm," Algorithms, vol. 10, no. 2, p. 56, 2017.

[45] G.-G. Wang, A. H. Gandomi, and A. H. Alavi, "Stud krill herd algorithm," Neurocomputing, vol. 128, pp. 363–370, 2014.

[46] J. Li, Y. Tang, C. Hua, and X. Guan, "An improved krill herd algorithm: krill herd with linear decreasing step," Applied Mathematics and Computation, vol. 234, pp. 356–367, 2014.

[47] H. B. Nguyen, B. Xue, P. Andreae et al., "Particle swarm optimisation with genetic operators for feature selection," in Proceedings of the 2017 IEEE Congress on Evolutionary Computation (CEC), pp. 286–293, IEEE, San Sebastian, Spain, June 2017.

[48] M. H. Aghdam and P. Kabiri, "Feature selection for intrusion detection system using ant colony optimization," International Journal of Network Security, vol. 18, no. 3, pp. 420–432, 2016.


feature selection method based on ant colony optimization (UFSACO) [36], which iteratively filters features using heuristic information and information from previous stages of the ant colony. Simultaneously, the similarity between features is quantified to reduce the redundancy of data features. However, the efficiency of the feature selection process needs to be improved.

To address the tendency to fall into local optima, Arora and Anand proposed a butterfly optimization algorithm (BOA) based on binary variables [37]. Based on the foraging behavior of butterflies, the algorithm uses each butterfly as a search agent to iteratively optimize the fitness function; it has good convergence ability and avoids premature convergence to a certain extent. Experimental results show that the algorithm reduces the length of the feature subset while selecting the optimal feature subset and improves the classification accuracy to a certain extent. However, its time cost is larger than that of the genetic algorithm and the particle swarm optimization algorithm, and the feature subset obtained over repeated experiments is inaccurate and shows poor robustness.

In [38], Yan et al. proposed a hybrid optimization algorithm (BCROSAT) based on simulated annealing and binary coral reefs, which is used for feature selection in high-dimensional biomedical datasets. The algorithm increases the diversity of the initial population through the league selection strategy and uses the simulated annealing algorithm and binary coding to improve the search ability of the coral reef optimization algorithm. However, the algorithm has high time complexity. In [39], Sayed et al. proposed a new chaotic dragonfly algorithm (CDA), which combines 10 different chaotic maps with the search iteration process of the dragonfly algorithm so as to accelerate convergence and improve the efficiency of feature selection. The algorithm uses the worst fitness value, best fitness value, average fitness value, standard deviation, and average feature length as evaluation criteria. The experimental results show that the adjustment variable of the Gauss map significantly improves the dragonfly algorithm in classification performance, stability, number of selected features, and convergence speed. The disadvantage is that the experimental data are small, and the algorithm needs to be verified on large-scale datasets. Zhang et al. [40] combined the genetic algorithm and the particle swarm optimization algorithm and conducted a taboo search on the resulting optimal initial solution; the result of this secondary feature selection is the global optimal feature subset. The algorithm not only guarantees good classification performance but also greatly reduces the false positive rate and false negative rate of the classification results. The disadvantage is that the algorithm incurs a large computational cost and a long offline training time.

2.4. Krill Herd (KH) Algorithm and Variants. The krill herd (KH) algorithm is a population-based swarm intelligence optimization method proposed by Gandomi and Alavi in 2012 [41]. The algorithm studies the foraging rules and clustering behavior of krill swarms in nature and simulates the induced movement, foraging activity, and random diffusion movement of the krill herd. It obtains the optimal solution by continuously updating the positions of krill individuals.

Abualigah et al. introduced a multicriteria mixed function based on the global optimal concept into the KH algorithm and applied it to text clustering [5]. By combining the advantages of local neighborhood search and global wide-area search, the algorithm balances the exploitation and exploration processes of the krill herd. In [42], the influence of excellent neighbor individuals on the krill herd during evolution is considered, and an improved KH algorithm is proposed to enhance the local search ability of the algorithm. In [43], a hybrid data clustering algorithm (IKH-KHM) based on an improved KH algorithm and k-harmonic means was proposed to solve the problem that the K-means algorithm is sensitive to the clustering centers. This algorithm increases the diversity of the krill herd by alternately using the random walk of Lévy flight and the crossover operator of the genetic algorithm. It improves the global search ability of the algorithm and avoids premature convergence to some degree. Simulation experiments on 5 datasets from the UCI database show that the IKH-KHM algorithm overcomes the noise sensitivity problem to a certain extent and has a significant effect on the optimization of the objective function. However, its slow recovery speed results in a high time cost. In 2017, Li and Liu adopted a combined update mechanism of selection operator and mutation operator to enhance the global optimization ability of the KH algorithm. They solved the problem of unbalanced local search and global search in the original KH algorithm [44].

To enhance the global search ability of the KH algorithm, Jensi and Jiji proposed a KH algorithm improved with a global search operator [9] and applied it to data clustering. The algorithm continuously searches around the original area to guide the krill herd toward the global optimum. It defines a new step size formula, which makes it convenient for krill individuals to fine-tune their positions in the search space. At the same time, an elite selection strategy is introduced into the krill herd update process, which helps the algorithm jump out of local optima. Experimental results show that the improved KH algorithm has higher accuracy and better robustness.

Table 1: Summary of feature selection methods in IDS.

| Method | Author | Year | Ref. no. |
|---|---|---|---|
| Bayesian network-based dimensionality reduction and principal component analysis (PCA) | Smith et al. | 2010 | [16] |
| Ranking based on Mahalanobis distance and exhaustive search | Zhao et al. | 2013 | [17] |
| Iterative Dichotomiser 3 (ID3) algorithm | Singh and Tiwari | 2015 | [18] |
| Mutual information method | Ambusaidi et al. | 2016 | [19] |
| Nonsymmetric deep autoencoder (NDAE) | Shone et al. | 2018 | [20] |
| Self-adaptive differential evolution (SaDE) | Xue et al. | 2018 | [21] |
| Principal component analysis (PCA) and linear discriminant analysis (LDA) | Shen et al. | 2019 | [22] |
| Hybrid network of convolutional neural network (CNN) and long short-term memory network (LSTM) | Sun et al. | 2020 | [23] |
| Cross-correlation-based feature selection (CCFS) method | Farahani | 2020 | [24] |

In [45], Wang et al. proposed a stud KH algorithm. The method adopts a new krill herd genetics and reproduction mechanism, replacing the random selection in the standard KH algorithm with a columnar selection operator and a crossover operator. To balance the exploration and exploitation abilities of the KH algorithm, Li et al. proposed a linear decreasing step KH algorithm [46]. In this algorithm, the step size scaling factor decreases linearly as the number of iterations increases, thereby enhancing the search ability of the algorithm.

Although the KH algorithm and its enhanced versions show better performance than other swarm intelligence algorithms, deficiencies such as unbalanced exploration and exploitation remain. In this paper, to minimize the number of selected features and achieve high classification accuracy, both quantities are introduced into the fitness evaluation function. The physical diffusion motion of krill individuals is nonlinearly improved to dynamically adjust the random diffusion amplitude and accelerate the convergence of the algorithm. At the same time, a linear nearest neighbor lasso step optimization is performed after updating the position of the krill herd, which effectively enhances the global exploration ability. It helps the algorithm achieve better performance, reduce the data dimension through feature selection, and improve the efficiency of intrusion detection.

3. Algorithm Design

In this section, we first provide a brief description of the KH algorithm; subsequently, we present an improved version of KH, named LNNLS-KH, to address the problem of the large number and high dimension of features in feature selection for intrusion detection.

3.1. Standard KH Algorithm. The framework of the KH algorithm is shown in Figure 3. It includes the three actions of krill individuals, the crossover operation, position updating, and calculation of the fitness function. After initialization, krill individuals change their positions according to the three actions. Then, the crossover operator is executed to complete the position update, and the new fitness value is calculated. If the number of iterations has not reached the maximum, the krill individuals repeat the process until the iterations are completed.

As a novel biologically inspired algorithm for solving optimization tasks, the KH algorithm represents each possible solution of the problem with a krill individual. By simulating foraging behavior, the krill herd position is continuously updated to obtain the global optimal solution. The motion of krill individuals is mainly affected by the following three aspects:

(1) Movement induced by other krill individuals
(2) Foraging activity
(3) Physical diffusion motion

The KH algorithm adopts a Lagrangian model to search in multidimensional space. The position update of krill individuals is given by

$$\frac{dX_i}{dt} = N_i + F_i + D_i, \tag{1}$$

where $X_i = \{X_{i,1}, X_{i,2}, \ldots, X_{i,NV}\}$, $N_i$ is the movement induced by other krill individuals, $F_i$ is the foraging activity of the krill individual, and $D_i$ is the random physical diffusion based on density region.

3.1.1. Movement Induced by Other Krill Individuals. The movement induced by other krill individuals is described as follows:

$$N_i^{\text{new}} = N^{\max}\alpha_i + \omega_n N_i^{\text{old}}, \tag{2}$$

$$\alpha_i = \alpha_i^{\text{local}} + \alpha_i^{\text{target}}, \tag{3}$$

where $N^{\max}$ is the maximum induction velocity of surrounding krill individuals, taken as 0.01 (m s$^{-1}$) [5]; $\omega_n$ represents the inertia weight in the range [0, 1]; $N_i^{\text{old}}$ is the result of the last motion induced by other krill individuals; $\alpha_i^{\text{local}}$ is a parameter indicating the direction of guidance; and $\alpha_i^{\text{target}}$ is the direction effect of the global optimal krill individual.

$\alpha_i^{\text{local}}$ is defined as follows:

$$\alpha_i^{\text{local}} = \sum_{j=1}^{NN} \hat{K}_{i,j}\hat{X}_{i,j}, \qquad \hat{X}_{i,j} = \frac{X_j - X_i}{\lVert X_j - X_i \rVert + \varepsilon}, \qquad \hat{K}_{i,j} = \frac{K_i - K_j}{K^{\text{worst}} - K^{\text{best}}}, \tag{4}$$

where $K^{\text{best}}$ and $K^{\text{worst}}$ are the best and worst fitness values of the krill herd, $K_i$ is the fitness value of the ith krill individual, $K_j$ represents the fitness value of the jth neighbor krill individual ($j = 1, 2, \ldots, NN$), and $NN$ represents the total number of neighbors. The $\varepsilon$ in the denominator is a small positive number that avoids the singularity caused by a zero denominator.

When selecting surrounding krill individuals, the KH algorithm finds the nearest neighbors of the ith krill individual by defining a "neighborhood ratio." The neighborhood is a circular area with the ith krill individual as the center and the perception distance $d_{s,i}$ as the radius. $d_{s,i}$ is described as follows:

$$d_{s,i} = \frac{1}{5N}\sum_{j=1}^{N} \lVert X_i - X_j \rVert, \tag{5}$$

where $N$ is the number of krill individuals and $X_i$ and $X_j$ represent the positions of the ith and jth krill individuals. $\alpha_i^{\text{target}}$ is defined as follows:

$$\alpha_i^{\text{target}} = C^{\text{best}}\hat{K}_{i,\text{best}}\hat{X}_{i,\text{best}}, \tag{6}$$

where $C^{\text{best}}$ is the effective coefficient between the ith and the global optimal krill individuals:

$$C^{\text{best}} = 2\left(\text{rand} + \frac{I}{I_{\max}}\right), \tag{7}$$

where $I$ is the current iteration number, $I_{\max}$ is the maximum number of iterations, and rand is a random number in [0, 1], which is used to enhance the exploration ability.
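Equations (2)–(7) can be sketched in a few lines of plain Python. This is only an illustrative reading of the formulas, not the authors' implementation: the function names are hypothetical, fitness is assumed to be minimized (so the best value is the smallest), and the inertia weight of 0.5 is an arbitrary value inside the stated range [0, 1].

```python
import math
import random


def sensing_distance(i, positions):
    """Equation (5): perception radius of krill i."""
    n = len(positions)
    return sum(math.dist(positions[i], p) for p in positions) / (5 * n)


def neighbors_of(i, positions):
    """Indices of krill that lie inside the perception radius of krill i."""
    d_s = sensing_distance(i, positions)
    return [j for j in range(len(positions))
            if j != i and math.dist(positions[i], positions[j]) < d_s]


def alpha_local(i, positions, fitness, neighbors, eps=1e-12):
    """Equation (4): sum of K_hat_ij * X_hat_ij over the neighbors of krill i."""
    k_best, k_worst = min(fitness), max(fitness)  # assumption: minimization
    spread = (k_worst - k_best) or eps            # guard against identical fitness values
    alpha = [0.0] * len(positions[i])
    for j in neighbors:
        diff = [xj - xi for xi, xj in zip(positions[i], positions[j])]
        norm = math.sqrt(sum(d * d for d in diff)) + eps
        k_hat = (fitness[i] - fitness[j]) / spread
        for m, d in enumerate(diff):
            alpha[m] += k_hat * d / norm
    return alpha


def c_best(iteration, max_iter, rng=random):
    """Equation (7): effective coefficient of the global best krill."""
    return 2.0 * (rng.random() + iteration / max_iter)


def induced_motion(n_old, alpha, n_max=0.01, omega_n=0.5):
    """Equation (2): new induced velocity (omega_n = 0.5 is an assumed value)."""
    return [n_max * a + omega_n * n for a, n in zip(alpha, n_old)]
```

The target term of equation (6) would be added to the output of `alpha_local` before calling `induced_motion`; it is left out here to keep the sketch short.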

3.1.2. Foraging Activity. Foraging activity is affected by the food distance and the experience of food location, and it is described as follows:

$$F_i = V_f\beta_i + \omega_f F_i^{\text{old}}, \tag{8}$$

$$\beta_i = \beta_i^{\text{food}} + \beta_i^{\text{best}}, \tag{9}$$

where $V_f$ is the foraging speed, taken as 0.02 (m s$^{-1}$) [41]; $\omega_f$ is the inertia weight in the range [0, 1]; and $\beta_i$ indicates the foraging direction, which consists of the food induction direction $\beta_i^{\text{food}}$ and the induction direction of the historically optimal krill individual $\beta_i^{\text{best}}$. The food is essentially a virtual location defined using the concept of a "centroid." It is defined as follows:

$$X^{\text{food}} = \frac{\sum_{i=1}^{N} (1/K_i)\,X_i}{\sum_{i=1}^{N} 1/K_i}. \tag{10}$$

(1) The induced direction of food to the ith krill individual is expressed as follows:

$$\beta_i^{\text{food}} = C^{\text{food}}\hat{K}_{i,\text{food}}\hat{X}_{i,\text{food}}, \tag{11}$$

where $C^{\text{food}}$ is the food coefficient, determined as follows:

$$C^{\text{food}} = 2\left(1 - \frac{I}{I_{\max}}\right). \tag{12}$$

(2) The induced direction of the historical best krill individual to the ith krill individual is expressed as follows:

$$\beta_i^{\text{best}} = \hat{K}_{i,\text{best}}\hat{X}_{i,\text{best}}, \tag{13}$$

where $\hat{K}_{i,\text{best}}$ represents the influence of the historical best individual on the ith krill individual.
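The food centroid of equation (10), the food coefficient of equation (12), and the foraging update of equation (8) might be sketched as follows. The helper names are hypothetical, fitness values are assumed to be nonzero so that 1/K_i is defined, and the inertia weight of 0.5 is an illustrative choice within [0, 1].

```python
def food_position(positions, fitness):
    """Equation (10): fitness-weighted centroid (weights 1/K_i, K_i assumed nonzero)."""
    weights = [1.0 / k for k in fitness]
    total = sum(weights)
    dim = len(positions[0])
    return [sum(w * p[m] for w, p in zip(weights, positions)) / total
            for m in range(dim)]


def c_food(iteration, max_iter):
    """Equation (12): food coefficient, decreasing from 2 to 0 over the run."""
    return 2.0 * (1.0 - iteration / max_iter)


def foraging_motion(f_old, beta, v_f=0.02, omega_f=0.5):
    """Equation (8): new foraging velocity (omega_f = 0.5 is an assumed value)."""
    return [v_f * b + omega_f * f for b, f in zip(beta, f_old)]
```

Because the weights are 1/K_i, krill with smaller (better) fitness pull the virtual food location toward themselves.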

3.1.3. Physical Diffusion Motion. Physical diffusion is a stochastic process. The expression is as follows:

$$D_i = D^{\max}\left(1 - \frac{I}{I_{\max}}\right)\delta, \tag{14}$$

where $D^{\max}$ is the maximum diffusion velocity in the range [0.002, 0.010] (m s$^{-1}$); according to [41], it is taken as 0.005 (m s$^{-1}$). $\delta$ represents the random direction vector, whose elements are random values in [−1, 1].

Figure 3: The framework of KH algorithm (three actions of the krill individual: movement induced by other krill individuals, foraging movement, and physical diffusion movement; followed by crossover operation, updating position, and calculating the fitness function).

3.1.4. Crossover. The crossover operator is an effective global optimization strategy. An adaptive vectorization crossover scheme is added to the standard KH algorithm to further enhance the global search ability of the algorithm [41]. It is given as follows:

$$X_{i,m} = \begin{cases} X_{i,m}\,C_r + X_{r,m}\,(1 - C_r), & \text{rand}_{i,m} < C_r, \\ X_{i,m}, & \text{else}, \end{cases} \qquad C_r = 0.2\,\hat{K}_{i,\text{best}}, \tag{15}$$

where $r$ is a random index with $r \in \{1, 2, \ldots, i-1, i+1, \ldots, N\}$, $X_{i,m}$ represents the mth dimension of the ith krill individual, $X_{r,m}$ represents the mth dimension of the rth krill individual, and $C_r$ is the crossover probability, which decreases as the fitness increases; the crossover probability of the global optimum is zero.
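A minimal sketch of the crossover rule in equation (15) is given below. It follows the blended form written in this paper; the function name and the choice to draw a fresh partner index r for each dimension are assumptions made for illustration.

```python
import random


def crossover(i, positions, k_hat_i_best, rng=random):
    """Equation (15): blend dimension m of krill i with a random partner r != i."""
    cr = 0.2 * k_hat_i_best                      # crossover probability C_r
    others = [j for j in range(len(positions)) if j != i]
    child = list(positions[i])
    for m in range(len(child)):
        if rng.random() < cr:
            r = rng.choice(others)               # assumption: new partner per dimension
            child[m] = child[m] * cr + positions[r][m] * (1.0 - cr)
    return child
```

When `k_hat_i_best` is zero, as for the global best krill, `cr` is zero and the position is returned unchanged.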

3.1.5. Movement Process of KH Algorithm. Affected by the movement induced by other krill individuals, foraging activity, and physical diffusion, the krill herd changes its position toward the direction of optimal fitness. The position vector of a krill individual over the interval $[t, t + \Delta t]$ is described as follows:

$$X_i(t + \Delta t) = X_i(t) + \Delta t\,\frac{dX_i}{dt}, \tag{16}$$

where $\Delta t$ is the scaling factor of the velocity vector. It depends entirely on the search space:

$$\Delta t = C_t \sum_{j=1}^{NV} \left(UB_j - LB_j\right), \tag{17}$$

where $NV$ represents the dimension of the decision variables, $LB_j$ and $UB_j$ are the lower and upper bounds of the jth variable ($j = 1, 2, \ldots, NV$), and $C_t$ is the step scaling factor in the range [0, 2].
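As a worked illustration of equations (1), (16), and (17), the following sketch computes the time step from the variable bounds and applies one position update. The function names and the value C_t = 0.5 are assumptions chosen only for the example.

```python
def time_step(lower, upper, c_t=0.5):
    """Equation (17): Delta t = C_t * sum_j (UB_j - LB_j); c_t = 0.5 is an assumed value."""
    return c_t * sum(ub - lb for lb, ub in zip(lower, upper))


def update_position(x, n_i, f_i, d_i, dt):
    """Equations (1) and (16): X(t + dt) = X(t) + dt * (N_i + F_i + D_i)."""
    return [xm + dt * (n + f + d) for xm, n, f, d in zip(x, n_i, f_i, d_i)]
```

For two features bounded in [0, 1] and C_t = 0.5, the time step is 1.0, so the three velocity components are simply added to the current position.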

3.2. The LNNLS-KH Algorithm. In view of the unbalanced exploitation and exploration abilities of the KH algorithm, we propose the LNNLS-KH algorithm for feature selection to improve performance and to pursue a high accuracy rate, a high detection rate, and a low false positive rate in intrusion detection. The improvement is reflected in the following three aspects.

3.2.1. A New Fitness Evaluation Function. To improve the classification accuracy of feature subset detection, we introduce the feature selection dimension and the classification accuracy into the fitness evaluation function. The specific expression of fitness is as follows:

$$\text{fitness} = \alpha \cdot \frac{\text{Feature}_{\text{selected}}}{\text{Feature}_{\text{all}}} + (1 - \alpha) \cdot (1 - \text{Accuracy}), \tag{18}$$

where $\alpha \in [0, 1]$ is a weighting factor used to tune the relative importance of the number of selected features and the classification accuracy, $\text{Feature}_{\text{selected}}$ is the number of selected features, $\text{Feature}_{\text{all}}$ represents the total number of features, and Accuracy indicates the accuracy of the classification results. Moreover, k-nearest neighbor (KNN) is used as the classification algorithm, and the classification accuracy is defined as follows:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \tag{19}$$

where TP, TN, FP, and FN are defined in the confusion matrix, as shown in Table 2.
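Equations (18) and (19) translate directly into code. In the sketch below, the function names are hypothetical and the default weighting factor of 0.5 is an assumption made for illustration; the classifier that produces the confusion-matrix counts is outside the scope of the example.

```python
def accuracy(tp, tn, fp, fn):
    """Equation (19): fraction of correctly classified records."""
    return (tp + tn) / (tp + tn + fp + fn)


def fitness(n_selected, n_total, acc, alpha=0.5):
    """Equation (18): weighted sum of feature ratio and classification error.

    alpha = 0.5 is an assumed default; smaller fitness is better.
    """
    return alpha * n_selected / n_total + (1.0 - alpha) * (1.0 - acc)
```

For instance, keeping 7 of 41 features with an accuracy of 0.9 and alpha = 0.1 gives a fitness of about 0.107, most of which comes from the classification error term.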

3.2.2. Nonlinear Optimization of Physical Diffusion Motion. The physical diffusion of the krill herd is a random diffusion process. The closer the individuals are to the food, the less random the movement is. Due to the strong convergence of the algorithm, the movement of krill individuals changes nonlinearly from fast to slow, and the fitness function gradually decreases as the algorithm converges. According to equations (2) and (9), the movement induced by other krill individuals and the foraging activity are nonlinear. In the physical diffusion equation (14), however, the diffusion velocity $D_i$ of the ith krill individual decreases linearly with the number of iterations. To fit the nonlinear motion of the krill herd, we introduce the optimization coefficient $\lambda$ and the fitness factor $\mu_{\text{fit}}$ of the krill herd into the physical diffusion motion. The optimized physical diffusion motion is defined as follows:

$$D_i = D^{\max}\left[1 - \lambda\frac{I}{I_{\max}} - (1 - \lambda)\,\mu_{\text{fit}}\right]\delta, \tag{20}$$

where $\lambda$ is in the range [0, 1] and $\mu_{\text{fit}}$ is defined as follows:

$$\mu_{\text{fit}} = \frac{K^{\text{best}}}{K_i}, \tag{21}$$

where $K^{\text{best}}$ is the fitness value of the current optimal individual and $K_i$ represents the fitness value of the ith krill individual. As the number of iterations increases, $K_i$ gradually decreases until it approaches $K^{\text{best}}$. Therefore, $\mu_{\text{fit}}$ is in the range (0, 1]. Substituting the fitness factor $\mu_{\text{fit}}$ into equation (20) gives the new physical diffusion motion equation:

$$D_i = D^{\max}\left[1 - \lambda\frac{I}{I_{\max}} - (1 - \lambda)\frac{K^{\text{best}}}{K_i}\right]\delta. \tag{22}$$

According to equation (22), the number of iterations $I$, the fitness $K_i$ of the krill individual, and the fitness $K^{\text{best}}$ of the current optimal krill individual jointly determine the physical diffusion motion so as to further adjust the random diffusion amplitude. In the early stage of the iteration, the number of iterations is small and the fitness value of the individual is large, so the fitness factor is small, which is conducive to a large random diffusion of the krill herd. As the number of iterations gradually increases, the algorithm converges quickly and the fitness of krill individuals approaches the global optimal solution. At the same time, the fitness factor increases nonlinearly, which makes the random diffusion more consistent with the movement process of krill individuals.
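The difference between the linear diffusion of equation (14) and the nonlinear diffusion of equation (22) can be seen in a short sketch. The function names are hypothetical, the coefficient λ = 0.5 is an assumed value in [0, 1], and fitness values are assumed to be positive so that the ratio K_best / K_i lies in (0, 1].

```python
def diffusion_linear(iteration, max_iter, delta, d_max=0.005):
    """Equation (14): amplitude decays linearly with the iteration count."""
    scale = d_max * (1.0 - iteration / max_iter)
    return [scale * d for d in delta]


def diffusion_nonlinear(iteration, max_iter, k_i, k_best, delta,
                        lam=0.5, d_max=0.005):
    """Equation (22): amplitude also shrinks as K_i approaches K_best.

    lam = 0.5 is an assumed value; k_i and k_best are assumed positive.
    """
    mu_fit = k_best / k_i                        # equation (21)
    scale = d_max * (1.0 - lam * iteration / max_iter - (1.0 - lam) * mu_fit)
    return [scale * d for d in delta]
```

At the first iteration, a krill whose fitness is four times the best fitness diffuses with amplitude 0.004375 for a unit direction, while a krill that has reached the best fitness at the final iteration does not diffuse at all.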

To further evaluate the effect of the KH algorithm with nonlinear optimization of physical diffusion motion (NO-KH), we conducted experiments on two classical benchmark functions. F1(x) is the Schwefel 2.22 function, which is a unimodal benchmark function. F2(x) is the Ackley function, which is a multimodal benchmark function. The experimental parameters of F1(x) and F2(x) are shown in Table 3.
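For reference, the two benchmark functions listed in Table 3 can be written in plain Python using their standard textbook definitions: the sum-plus-product form is the Schwefel 2.22 function and the exponential-cosine form is the Ackley function. The function names below are chosen for illustration only.

```python
import math


def schwefel_2_22(x):
    """Sum of |x_i| plus product of |x_i|; unimodal, minimum 0 at the origin."""
    abs_x = [abs(v) for v in x]
    return sum(abs_x) + math.prod(abs_x)


def ackley(x):
    """Standard Ackley function; multimodal, minimum 0 at the origin."""
    n = len(x)
    mean_sq = sum(v * v for v in x) / n
    mean_cos = sum(math.cos(2.0 * math.pi * v) for v in x) / n
    return (-20.0 * math.exp(-0.2 * math.sqrt(mean_sq))
            - math.exp(mean_cos) + 20.0 + math.e)
```

Both functions evaluate to zero (up to floating-point rounding) at the origin, which is the global minimum given in Table 3.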

Figure 4 shows the Ackley function and the Schwefel 2.22 function graphs for n = 2. We use both the standard KH algorithm and the NO-KH algorithm to find the optimal value on the unimodal and multimodal benchmark functions. The number of krill and the number of iterations are set to 25 and 500, respectively. Table 4 shows the best value, worst value, mean value, and standard deviation obtained by running the algorithms 20 times. Compared with the standard KH algorithm, the NO-KH algorithm finds smaller optimal values on both the unimodal and the multimodal benchmark function, and its global exploration ability is improved. The smaller standard deviation obtained from repeated experiments shows that the NO-KH algorithm has better stability. Therefore, the nonlinear optimization of the physical diffusion motion of the KH algorithm is effective.

The above analysis shows that introducing the optimization coefficient $\lambda$ and the fitness factor $\mu_{\text{fit}}$ into the physical diffusion motion of the krill herd is conducive to dynamically adjusting the random diffusion amplitude of the krill individuals and accelerating the convergence of the algorithm. Meanwhile, it increases the nonlinearity of the physical diffusion motion and the global exploration ability of the algorithm.

3.2.3. Linear Nearest Neighbor Lasso Step Optimization. When the KH algorithm is used to solve multidimensional complex function optimization problems, its local search ability is weak, and exploitation and exploration are difficult to balance. To enhance the local exploitation and global exploration abilities of the algorithm, the influence of excellent neighbor individuals on the krill herd during evolution is considered, and an improved KH algorithm is proposed in [42]. That algorithm introduces the nearest neighbor lasso operator to mine the neighborhood of potentially excellent individuals and improve the local search ability of krill individuals, but the random parameters introduced in the lasso operator increase the uncertainty of the algorithm. To cope with this problem, we introduce an improved krill herd algorithm based on linear nearest neighbor lasso step optimization (LNNLS-KH), which finds the nearest neighbor of each krill individual after the position update and moves linearly by a defined step to derive a better fitness value. With linearization, the nearest neighbor lasso step changes linearly with the number of iterations, thereby balancing the exploitation and exploration abilities of the algorithm. In the early iterations, a large linear nearest neighbor lasso step is selected so that krill individuals can quickly adjust their positions, improving the search efficiency of the algorithm. In the later stage of iteration, the nearest neighbor lasso step decreases linearly to obtain the global optimal solution.

In the krill herd $X = \{X_1, X_2, \ldots, X_n\}$, assuming that the jth krill individual is the nearest neighbor of the ith krill individual, the Euclidean distance between the two krill individuals is defined as follows:

$$\text{distance}_{i,j} = \lVert X_i - X_j \rVert, \tag{23}$$

where $\{X_i, X_j\} \subset X$ and $i \neq j$. The linear nearest neighbor lasso step is defined as follows:

$$\text{step} = \begin{cases} \dfrac{I}{I_{\max}} \times \left(X_i - X_j\right), & K_i > K_j, \\[2ex] \dfrac{I}{I_{\max}} \times \left(X_j - X_i\right), & K_j > K_i. \end{cases} \tag{24}$$

The fitness function is expressed as equation (18). Therefore, a smaller fitness value means that fewer features are selected under the condition of higher accuracy, i.e., the position of the krill individual is better. The schematic diagram of LNNLS-KH is shown in Figure 5. The new position $Y_k$ of the jth krill individual is expressed as follows:

$$Y_k = \begin{cases} X_j + \dfrac{I}{I_{\max}} \times \left(X_i - X_j\right), & K_i > K_j, \\[2ex] X_i + \dfrac{I}{I_{\max}} \times \left(X_j - X_i\right), & K_j > K_i. \end{cases} \tag{25}$$
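Equations (23)–(25) can be sketched as a nearest-neighbor lookup followed by a linear move. The function names are hypothetical, and the handling of equal fitness values, which equation (25) does not cover, is an assumption: the krill is simply left where it is.

```python
import math


def nearest_neighbor(i, positions):
    """Index j != i that minimizes the Euclidean distance of equation (23)."""
    return min((j for j in range(len(positions)) if j != i),
               key=lambda j: math.dist(positions[i], positions[j]))


def lnnls_position(x_i, x_j, k_i, k_j, iteration, max_iter):
    """Equation (25): new position Y_k after the linear lasso step."""
    ratio = iteration / max_iter
    if k_i > k_j:   # neighbor j is fitter: start from X_j
        return [b + ratio * (a - b) for a, b in zip(x_i, x_j)]
    if k_j > k_i:   # krill i is fitter: start from X_i
        return [a + ratio * (b - a) for a, b in zip(x_i, x_j)]
    return list(x_i)  # assumption: equal fitness leaves the position unchanged
```

In both branches the new position starts from the individual with the smaller (better) fitness value and moves by the step of equation (24) along the line joining the two krill.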

When the ith and jth krill individuals lie at opposite ends of the food, the new position $Y_k$ will be far from the optimal solution after the linear nearest neighbor lasso step optimization, as shown in Figure 6.

Table 2: Confusion matrix.

| Confusion matrix | True condition positive | True condition negative |
|---|---|---|
| Predicted condition positive | True positive (TP) | False positive (FP) |
| Predicted condition negative | False negative (FN) | True negative (TN) |

Table 3: Benchmark functions in the experiment.

| Benchmark function | Dim | Range | $f_{\min}$ |
|---|---|---|---|
| $F_1(x) = \sum_{i=1}^{n} \lvert x_i \rvert + \prod_{i=1}^{n} \lvert x_i \rvert$ | 10 | [−10, 10] | 0 |
| $F_2(x) = -20\exp\left(-0.2\sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}\right) - \exp\left(\frac{1}{n}\sum_{i=1}^{n}\cos(2\pi x_i)\right) + 20 + e$ | 10 | [−32, 32] | 0 |

Figure 4: Ackley function and Schwefel 2.22 function graphs for n = 2 (axes: x1, x2, and function value). (a) Unimodal benchmark function F1 (Schwefel 2.22). (b) Multimodal benchmark function F2 (Ackley).

Table 4: The statistical results of KH and NO-KH algorithms on two benchmark functions.

| f(x) | Algorithm | Best value | Worst value | Mean value | Standard deviation |
|---|---|---|---|---|---|
| F1 | KH | 1.692E-04 | 1.099E-02 | 1.508E-03 | 3.342E-03 |
| F1 | NO-KH | 3.277E-05 | 9.632E-04 | 4.221E-04 | 3.908E-04 |
| F2 | KH | 5.716E-05 | 2.168 | 0.329 | 0.816 |
| F2 | NO-KH | 8.309E-06 | 1.155 | 0.116 | 0.362 |

Figure 5: Optimization of linear nearest neighbor lasso step for krill individuals at the same end of food (legend: the position of food; the position of krill Xi; the position of new krill Yi after LNNLS; the distance between two krill; the length of LNNLS).

Figure 6: Optimization of linear neighboring lasso step for krill individuals at both ends of food (same legend as Figure 5).

The pseudocode of the LNNLS-KH algorithm is shown in Algorithm 1.

3.3. Analysis of Time Complexity. In the KH algorithm, each krill individual updates its position after the movement induced by other krill individuals, the foraging activity, and the physical diffusion motion, with a time complexity of $O(N)$. After $I_{\max}$ iterations, the time complexity of the algorithm is $O(I_{\max} \cdot N)$. In the LNNLS-KH algorithm, the modified fitness function and the nonlinear optimization of the physical diffusion motion add almost no extra computation, so the time complexity is unchanged. In addition, the linear nearest neighbor lasso step optimization adds the calculations of equations (24) and (25) after each krill individual completes its position update in every iteration, with a time complexity of $O(I_{\max} \cdot N)$. Therefore, the total time complexity of the LNNLS-KH algorithm is $O(2 I_{\max} \cdot N)$, which remains linear in both $I_{\max}$ and $N$.

3.4. Description of the LNNLS-KH Algorithm for IDS Feature Selection. An IDS is a system that recognizes and processes malicious usage of computers and network resources. An intrusion detection dataset records normal and abnormal traffic, including network traffic data and types of network attacks, and provides data support for the research and development of intrusion detection technology. An IDS is generally composed of data acquisition, data preprocessing, detection units, and response actions, as shown in Figure 7.

The LNNLS-KH algorithm is used to select high-quality feature subsets for the IDS. The features of the intrusion detection dataset are randomly initialized to different real numbers in the range [0, 1], which constitute the position vectors of the krill herd. By calculating the fitness function and carrying out the LNNLS-KH algorithm, the position vectors of the krill herd are constantly updated. The fitness function is determined by the number of selected features and the accuracy of classification, so the position vectors of the krill herd move toward the optimal fitness value. According to [47], it is appropriate to set the feature selection threshold to 0.7. When the maximum number of iterations is reached, the components of the krill position vector that are larger than the threshold are selected. The selected features constitute the feature subset of the intrusion detection data. Furthermore, the selected feature subset is sent to the detection units. Because the k-nearest neighbor (KNN) algorithm is relatively mature in theory, the detection units adopt the KNN algorithm to construct the intrusion detection classifier. Finally, the intrusion detection results are evaluated on the test dataset. The process of the LNNLS-KH algorithm for IDS feature selection is shown in Figure 8.
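The mapping from a krill position vector to a feature subset described above can be illustrated as follows. The function names are hypothetical; the threshold of 0.7 is the value quoted from [47], and the random initialization uses Python's standard generator purely for demonstration.

```python
import random


def init_krill(n_features, rng=random):
    """Random position vector with one real number in [0, 1) per feature."""
    return [rng.random() for _ in range(n_features)]


def select_features(position, threshold=0.7):
    """Indices of the features whose position component exceeds the threshold."""
    return [m for m, value in enumerate(position) if value > threshold]
```

For example, the position `[0.1, 0.8, 0.7, 0.95]` selects the features at indices 1 and 3, since only components strictly greater than 0.7 are kept.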

4. Results and Discussion

To verify the performance of the LNNLS-KH algorithm in IDS feature selection, we adopt the NSL-KDD network intrusion detection dataset and the CICIDS2017 dataset for experiments.

4.1. Datasets Analysis. The NSL-KDD dataset is a classic dataset that has been widely used in the field of anomaly detection. As an improved version of the KDD CUP 99 dataset, it is currently one of the most reliable and influential intrusion detection datasets. Compared with the KDD CUP 99 dataset, the NSL-KDD dataset eliminates duplicate data, so the dataset hardly contains redundant records. Meanwhile, the proportion of each type of record in the NSL-KDD dataset has been adjusted to make it reasonable. Each record in the NSL-KDD dataset includes 41-dimensional features and a classification label. KDDTrain+ and KDDTest+ in the NSL-KDD dataset are selected as the training subset and the test subset. The attacks are divided into four types: denial of service (DoS), scan and probe (Probe), remote to local (R2L), and user to root (U2R). The detailed attack names and the distribution of sample categories are shown in Tables 5 and 6. The features of the NSL-KDD dataset are shown in Table 7.

The NSL-KDD dataset includes four types of features: the basic features of TCP connections (9 in total), the content features of TCP connections (13 in total), the time-based network traffic statistics (9 in total), and the host-based network traffic statistics (10 in total). Among all the features, "Protocol_type," "service," and "flag" are character-type features, which need to be preprocessed and mapped to numeric values. Because mixed numeric and character data types are difficult to handle, one-hot encoding is used to map different characters to different values. For example, the "Protocol_type" feature includes three types of protocol, denoted by icmp = [1, 0, 0], tcp = [0, 1, 0], and udp = [0, 0, 1]. Similarly, the 70 attributes in "service" and the 11 attributes in "flag" are numeralized in the same way. The 41-dimensional feature is expanded to 122 dimensions after one-hot encoding. At the same time, the dataset is normalized to eliminate the influence of features of different orders of magnitude on the calculation results, thus reducing the experimental error. Data preprocessing helps to improve the accuracy of classification and ensure the reliability of the results. The values of each feature are normalized to the interval [0, 1], and the normalization expression is as follows:

$$X^{*} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}, \qquad (26)$$

where $X^{*}$ is the normalized feature value, $X$ is the original feature value, and $X_{\max}$ and $X_{\min}$ are the maximum and minimum values of the same feature dimension.
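The two preprocessing steps above (one-hot encoding of a character feature and min-max normalization to [0, 1]) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `one_hot` and `min_max_normalize` are hypothetical, and mapping a constant column to 0.0 is a choice made here to avoid division by zero, which the text does not discuss.

```python
from typing import List, Sequence


def one_hot(value: str, categories: Sequence[str]) -> List[int]:
    """Encode one categorical value as a 0/1 vector over the given categories."""
    return [1 if value == category else 0 for category in categories]


def min_max_normalize(column: Sequence[float]) -> List[float]:
    """Scale one feature column to [0, 1] with (X - Xmin) / (Xmax - Xmin)."""
    lowest, highest = min(column), max(column)
    if highest == lowest:
        # Assumption: a constant column carries no information and is mapped to 0.0.
        return [0.0 for _ in column]
    return [(x - lowest) / (highest - lowest) for x in column]


PROTOCOLS = ("icmp", "tcp", "udp")
print(one_hot("tcp", PROTOCOLS))            # [0, 1, 0]
print(min_max_normalize([2.0, 4.0, 10.0]))  # [0.0, 0.25, 1.0]

# Dimension after encoding: 38 numeric features plus 3 + 70 + 11 one-hot columns.
expanded_dimension = (41 - 3) + (3 + 70 + 11)
print(expanded_dimension)                   # 122
```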

Although NSL-KDD is a benchmark dataset in the field of network intrusion detection, some of its attack types are outdated owing to the rapid development of network technology, so it hardly reflects the current real-network environment. CICIDS2017 is a more recent network intrusion detection dataset released by the Canadian Institute for Cybersecurity (CIC) in 2017. The dataset collected traffic data for five days, with only normal traffic on Monday and attacks occurring in the morning and afternoon from Tuesday to Friday. It includes "FTP patator," "SSH patator," "DoS GoldenEye," "DoS Slowhttptest," "DoS Slowloris," "Heartbleed," "Web Attack Brute Force," "Web Attack Sql Injection," "Web Attack XSS," "Infiltration Attack," "Bot," "DDoS," and "PortScan," which are common types of attacks in modern networks. The distribution of attack time and types of the CICIDS2017 dataset is shown in Table 8. We use the MachineLearningCVE file in the CICIDS2017 dataset, which contains 78 features and an attack type label. The number and name of the features are shown in Table 9. Compared with the NSL-KDD dataset, the attack types in the CICIDS2017 dataset are more in line with modern networks.

Figure 7: The framework of IDS (components: data acquisition, data preprocessing, detection units, and response actions).

Figure 8: The process of LNNLS-KH algorithm for IDS feature selection.

4.2. Experimental Results and Discussion of NSL-KDD Dataset. The experiment is conducted in MATLAB R2016a on a 64-bit Windows operating system with an Intel(R) Core(TM) i7-4790 processor. Since the training of the algorithm requires both normal and abnormal samples, we mix normal samples with different types of attack samples to construct training sets and test sets for the four attack types. In order to reduce the time of searching the optimal feature

Input: Training set
Output: Global best solution, the number of selected features, and feature selection time

Begin
    Initialize algorithm parameters: Nmax, Vf, Dmax, NV, Imax, UB, LB
    Initialize the krill herd position
    Evaluate the fitness of krill individuals and find the individuals with the best and worst fitness values
    for I = 1 to Imax do
        for each krill individual i (i = 1, 2, ..., m) do
            Calculate the three components of motion:
                (1) The motion induced by other krill individuals
                (2) The foraging activity
                (3) The nonlinear optimized physical diffusion
            Implement crossover operator
            Update krill herd position and fitness values
            Calculate the linear nearest neighbor lasso step and new position using equations (24) and (25) and update new fitness values
            if Kyk > Ki or (Kj)
                Leave Ki or (Kj) and delete Kyk
            else
                Leave Kyk and delete Ki or (Kj)
            end if
        end for
        Update Xgb and Kgb of the globally optimal individuals
    end for
    Output the global best solution, the number of selected features, and feature selection time
End

Algorithm 1: The LNNLS-KH algorithm.
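The accept-or-reject step at the end of the inner loop of Algorithm 1 can be sketched as follows. This is a hedged illustration, not the authors' implementation: equations (24) and (25) are not reproduced here, so the candidate position is built with a placeholder (the midpoint between a krill and its nearest neighbour), the candidate is compared only with krill i, smaller fitness is assumed to be better, and the names `nearest_neighbour`, `refine_position`, and `sphere` are hypothetical.

```python
import math
from typing import Callable, List

Vector = List[float]


def nearest_neighbour(herd: List[Vector], i: int) -> int:
    """Index of the krill closest (Euclidean distance) to krill i."""
    others = (j for j in range(len(herd)) if j != i)
    return min(others, key=lambda j: math.dist(herd[i], herd[j]))


def refine_position(herd: List[Vector], fitness: Callable[[Vector], float], i: int) -> Vector:
    """Greedy accept/reject step for krill i (smaller fitness is assumed better)."""
    j = nearest_neighbour(herd, i)
    # PLACEHOLDER for equations (24) and (25): midpoint of krill i and its neighbour.
    candidate = [(a + b) / 2.0 for a, b in zip(herd[i], herd[j])]
    if fitness(candidate) > fitness(herd[i]):
        return herd[i]   # candidate is worse: keep the original position
    return candidate     # candidate is at least as good: keep the candidate


def sphere(x: Vector) -> float:
    """Toy fitness used only for this illustration."""
    return sum(v * v for v in x)


herd = [[2.0, 2.0], [0.0, 0.0], [5.0, 5.0]]
print(refine_position(herd, sphere, 0))  # [1.0, 1.0]
print(refine_position(herd, sphere, 1))  # [0.0, 0.0]
```

In the first call the candidate improves the fitness and replaces the krill; in the second call the krill already sits at the optimum of the toy fitness, so the candidate is discarded.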

Table 5: The detailed attack names of each attack type.

| Attack type | Attack names |
|---|---|
| DoS | Neptune, back, land, pod, smurf, teardrop, mailbomb, Apache2, processtable, udpstorm, worm |
| Probe | Ipsweep, nmap, portsweep, Satan, mscan, saint |
| R2L | ftp_write, guess_passwd, imap, multihop, phf, spy, warezclient, warezmaster, sendmail, named, snmpgetattack, snmpguess, xlock, xsnoop, httptunnel |
| U2R | buffer_overflow, loadmodule, perl, rootkit, ps, sqlattack, xterm |

Table 6: The distribution of sample categories.

| Data category | KDDTrain+ samples | KDDTest+ samples | Total number of samples |
|---|---|---|---|
| Normal | 65120 | 11536 | 76656 |
| DoS | 36944 | 6251 | 43195 |
| Probe | 10786 | 2421 | 13207 |
| R2L | 995 | 2653 | 3648 |
| U2R | 52 | 67 | 119 |
| All | 113897 | 22928 | 136825 |


subset, we randomly select 50% of the Probe attack samples, 10% of the DoS attack samples, 100% of the U2R attack samples, and 100% of the R2L attack samples in the KDDTrain+ dataset as the training dataset, and 100% of the Probe samples, 50% of the DoS samples, 100% of the U2R samples, and 20% of the R2L samples in the KDDTest+ dataset as the test dataset.

For the LNNLS-KH algorithm, the maximum number of iterations Imax and the number of krill individuals N are set to 100 and 30, respectively. Following [41], the foraging speed of krill individuals Vf is set to 0.02, the maximum random diffusion rate Dmax is set to 0.05, and the maximum induction speed Nmax is set to 0.01. Following [47], the threshold θ is set to 0.7. Because the LNNLS-KH algorithm is designed to ensure high accuracy first and to reduce the number of features second, the weight factor α in the fitness function is set to 0.02.

$$\mathrm{FPR} = \frac{FP}{TN + FP}, \qquad (27)$$

$$\mathrm{DR} = \frac{TP}{TP + FN}. \qquad (28)$$
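Equations (27) and (28) translate directly into code once the confusion-matrix counts are known. The sketch below is illustrative only: TP, FN, FP, and TN denote true positives, false negatives, false positives, and true negatives, the function names are hypothetical, the accuracy function follows the verbal definition (correctly classified samples over all samples), and the counts in the example are made up.

```python
def false_positive_rate(fp: int, tn: int) -> float:
    """FPR = FP / (TN + FP): share of normal samples flagged as intrusions."""
    return fp / (tn + fp)


def detection_rate(tp: int, fn: int) -> float:
    """DR = TP / (TP + FN): share of abnormal samples that are detected."""
    return tp / (tp + fn)


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Correctly classified samples divided by the total number of samples."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts for 100 attack samples and 100 normal samples.
tp, fn, fp, tn = 90, 10, 5, 95
print(false_positive_rate(fp, tn))  # 0.05
print(detection_rate(tp, fn))       # 0.9
print(accuracy(tp, tn, fp, fn))     # 0.925
```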

We adopt the iteration curve of the global optimal fitness value, feature selection time, test set detection time, data dimension after feature selection, classification accuracy, detection rate (DR), and false positive rate (FPR) as evaluation measures of feature selection for IDS. The accuracy is the ratio of correctly classified samples to the total number of samples, as defined in equation (19). FPR, also known as false alarm rate (FAR), is the ratio of normal samples that are incorrectly detected as intrusions to all normal samples, as shown in

Table 7: The features of NSL-KDD dataset.

| Classification of features | Number | Serial number and name of features |
|---|---|---|
| The basic characteristics of TCP connections | 9 | (1) duration, (2) protocol_type, (3) service, (4) flag, (5) src_bytes, (6) dst_bytes, (7) land, (8) wrong_fragment, (9) urgent |
| The content characteristics of a TCP connection | 13 | (10) hot, (11) num_failed_logins, (12) logged_in, (13) num_compromised, (14) root_shell, (15) num_root, (16) su_attempted, (17) num_file_creations, (18) num_shells, (19) num_access_files, (20) num_outbound_cmds, (21) is_host_login, (22) is_guest_login |
| Time-based statistical characteristics of network traffic | 9 | (23) count, (24) srv_count, (25) serror_rate, (26) srv_serror_rate, (27) rerror_rate, (28) srv_rerror_rate, (29) same_srv_rate, (30) diff_srv_rate, (31) srv_diff_host_rate |
| Host-based network traffic statistics | 10 | (32) dst_host_count, (33) dst_host_srv_count, (34) dst_host_same_srv_rate, (35) dst_host_diff_srv_rate, (36) dst_host_same_src_port_rate, (37) dst_host_srv_diff_host_rate, (38) dst_host_serror_rate, (39) dst_host_srv_serror_rate, (40) dst_host_rerror_rate, (41) dst_host_srv_rerror_rate |

Table 8: Attack time and attack types of the CICIDS2017 dataset.

| Time | Type | Label | Amount | Total |
|---|---|---|---|---|
| Monday | Normal | BENIGN | 529918 | 529918 |
| Tuesday | Normal | BENIGN | 432074 | 445909 |
| | Brute force | FTP patator | 7938 | |
| | Brute force | SSH patator | 5897 | |
| Wednesday | Normal | BENIGN | 440031 | 692703 |
| | DoS | DoS GoldenEye | 10293 | |
| | DoS | DoS slowhttptest | 5499 | |
| | DoS | DoS slowloris | 5796 | |
| | DoS | Heartbleed | 11 | |
| Thursday morning | Normal | BENIGN | 168186 | 170366 |
| | Web attack | Web attack brute force | 1507 | |
| | Web attack | Web attack sql injection | 21 | |
| | Web attack | Web attack XSS | 652 | |
| Thursday afternoon | Normal | BENIGN | 288566 | 288602 |
| | Infiltration | Infiltration | 36 | |
| Friday morning | Normal | BENIGN | 189067 | 191033 |
| | Botnet | Bot | 1966 | |
| Friday afternoon (1) | Normal | BENIGN | 97718 | 225745 |
| | DDoS | DDoS | 128027 | |
| Friday afternoon (2) | Normal | BENIGN | 127537 | 286467 |
| | PortScan | PortScan | 158930 | |


equation (27). DR, also known as recall or sensitivity, is the probability that an abnormal sample is correctly detected among all abnormal samples, as shown in equation (28). The crossover-mutation PSO (CMPSO) algorithm [47], ACO algorithm [48], KH algorithm [41], and IKH algorithm [9] are used in the comparative experiments. The experimental results on the Probe, DoS, R2L, and U2R datasets are as follows.

To show the performance of the LNNLS-KH algorithm intuitively, the convergence curves of the fitness function for the Probe, DoS, U2R, and R2L datasets are shown in Figure 9. The results show that the LNNLS-KH algorithm achieves a good fitness function value when the number of iterations reaches about 20, which demonstrates its strong exploitation ability and good convergence performance. As the number of iterations increases, the other algorithms show varying degrees of convergence stagnation, while the LNNLS-KH algorithm repeatedly jumps out of local optima and finds solutions with better fitness. The fitness function values after 100 iterations reach 0.0328, 0.0393, 0.0292, and 0.0036, respectively, for the four attack datasets, showing excellent exploration ability. Therefore, compared with the CMPSO, ACO, KH, and IKH algorithms, the LNNLS-KH algorithm exhibits faster convergence speed and stronger abilities of exploitation and exploration.

Figure 9: Convergence curve of fitness functions for the four attack datasets (panels: Probe, DoS, U2R, and R2L; x-axis: number of iterations, 0 to 100; y-axis: fitness function; curves: CMPSO, ACO, KH, IKH, and LNNLS-KH).

The results of different feature selection algorithms are shown in Table 10. The number in front of the brackets indicates the number of features after feature selection, and the specific feature numbers are listed in the brackets. The comparison of feature selection dimensions is shown in Figure 10, where different colours distinguish the five algorithms. The proposed LNNLS-KH algorithm, marked in red, lies in the innermost circle of Figure 10 for the Probe, DoS, U2R, and R2L datasets. This indicates that, compared with the other four feature selection algorithms, the LNNLS-KH algorithm retains the fewest features while ensuring accuracy. According to Figure 10, the LNNLS-KH algorithm selects 7 main features of the NSL-KDD dataset on average, accounting for 17.07% of the total number of features. Compared with the CMPSO, ACO, KH, and IKH algorithms, the proposed LNNLS-KH algorithm reduces the number of features by 44%, 42.86%, 34.88%, and 24.32%, respectively, across the four attack-type datasets. Meanwhile, relative to the four comparison algorithms taken together, the total number of selected features in the four attack datasets is reduced by 37.43%.
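The reduction percentages quoted above can be reproduced from the per-dataset feature counts reported in Table 10 (Probe, DoS, U2R, R2L). The short Python check below is a worked-arithmetic sketch; the variable and function names are hypothetical.

```python
# Number of selected features per dataset (Probe, DoS, U2R, R2L), from Table 10.
selected = {
    "CMPSO": [14, 16, 9, 11],
    "ACO": [15, 16, 8, 10],
    "KH": [13, 12, 8, 10],
    "IKH": [11, 12, 6, 8],
    "LNNLS-KH": [8, 10, 3, 7],
}


def average(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


ours = average(selected["LNNLS-KH"])   # 7.0 features on average
share = 100 * ours / 41                # share of the 41 NSL-KDD features

# Relative reduction of the average feature count against each other algorithm.
reductions = {
    name: 100 * (average(counts) - ours) / average(counts)
    for name, counts in selected.items()
    if name != "LNNLS-KH"
}

# Relative reduction against the mean of the four comparison algorithms.
others = average([average(c) for n, c in selected.items() if n != "LNNLS-KH"])
overall = 100 * (others - ours) / others

print(round(share, 2))                                  # 17.07
print({n: round(r, 2) for n, r in reductions.items()})  # 44.0, 42.86, 34.88, 24.32
print(round(overall, 2))                                # 37.43
```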

To further evaluate the performance of the feature selection algorithms, we show the feature selection time and detection time of the five algorithms in Table 11. Feature selection time is the time spent filtering out redundant features. Detection time is the time from inputting the most representative feature subset into the KNN classifier to the end of detection. Table 11 shows that the feature selection time of the standard KH algorithm is shorter than that of the CMPSO and ACO algorithms, which indicates that the KH algorithm is faster. Compared with the standard KH algorithm, the feature selection time of the LNNLS-KH algorithm is longer, mainly because of the nonlinear optimization of the physical diffusion motion and the linear nearest neighbor lasso step optimization performed after the krill herd position is updated. Although the feature selection time increases, the convergence speed and global search ability are greatly improved. At the same time, the LNNLS-KH algorithm removes redundant features, which considerably increases the detection speed. Compared with the other four feature selection algorithms, the detection time of the LNNLS-KH algorithm is reduced by 16.83%, 16.91%, 8.94%, and 6.96% on average over the Probe, DoS, R2L, and U2R test dataset samples.
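The link between feature count and detection time can be seen in a minimal KNN sketch: every distance evaluation sums over the selected columns only, so a smaller feature subset means less work per test sample. This is an illustration under assumptions, not the classifier used in the experiments: the value k = 3, the Euclidean distance, the function name `knn_predict`, and the toy data are all hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence


def knn_predict(train_x: Sequence[Sequence[float]], train_y: Sequence[str],
                query: Sequence[float], selected: Sequence[int], k: int = 3) -> str:
    """Classify one sample by majority vote of its k nearest training samples,
    measuring distance on the selected (1-based) feature indices only."""
    columns = [index - 1 for index in selected]  # convert to 0-based columns

    def distance(row: Sequence[float]) -> float:
        return math.sqrt(sum((row[c] - query[c]) ** 2 for c in columns))

    nearest = sorted(range(len(train_x)), key=lambda i: distance(train_x[i]))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy data with three features; only features 1 and 3 are treated as selected.
train_x: List[List[float]] = [[0.0, 0.9, 0.1], [0.1, 0.1, 0.0],
                              [0.9, 0.5, 1.0], [1.0, 0.4, 0.9]]
train_y = ["normal", "normal", "attack", "attack"]
print(knn_predict(train_x, train_y, [0.95, 0.0, 0.95], selected=[1, 3]))  # attack
```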

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the test dataset is classified using the KNN classifier. The classification accuracy of different algorithms is shown in Table 12. Comparing the accuracy results, we find that the LNNLS-KH feature selection algorithm achieves a classification accuracy of above 90% for the Probe, DoS, U2R, and R2L test

Table 9: The number and name of the features in the CICIDS2017 dataset.

| Feature number | Feature name | Feature number | Feature name | Feature number | Feature name |
|---|---|---|---|---|---|
| 1 | Destination port | 27 | Bwd IAT mean | 53 | Average packet size |
| 2 | Flow duration | 28 | Bwd IAT std | 54 | Avg fwd segment size |
| 3 | Total fwd packets | 29 | Bwd IAT max | 55 | Avg bwd segment size |
| 4 | Total backward packets | 30 | Bwd IAT min | 56 | Fwd header length |
| 5 | Total length of fwd packets | 31 | Fwd PSH flags | 57 | Fwd avg bytes/bulk |
| 6 | Total length of bwd packets | 32 | Bwd PSH flags | 58 | Fwd avg packets/bulk |
| 7 | Fwd packet length max | 33 | Fwd URG flags | 59 | Fwd avg bulk rate |
| 8 | Fwd packet length min | 34 | Bwd URG flags | 60 | Bwd avg bytes/bulk |
| 9 | Fwd packet length mean | 35 | Fwd header length | 61 | Bwd avg packets/bulk |
| 10 | Fwd packet length std | 36 | Bwd header length | 62 | Bwd avg bulk rate |
| 11 | Bwd packet length max | 37 | Fwd packets/s | 63 | Subflow fwd packets |
| 12 | Bwd packet length min | 38 | Bwd packets/s | 64 | Subflow fwd bytes |
| 13 | Bwd packet length mean | 39 | Min packet length | 65 | Subflow bwd packets |
| 14 | Bwd packet length std | 40 | Max packet length | 66 | Subflow bwd bytes |
| 15 | Flow bytes/s | 41 | Packet length mean | 67 | Init_Win_bytes_forward |
| 16 | Flow packets/s | 42 | Packet length std | 68 | Init_Win_bytes_backward |
| 17 | Flow IAT mean | 43 | Packet length variance | 69 | act_data_pkt_fwd |
| 18 | Flow IAT std | 44 | FIN flag count | 70 | min_seg_size_forward |
| 19 | Flow IAT max | 45 | SYN flag count | 71 | Active mean |
| 20 | Flow IAT min | 46 | RST flag count | 72 | Active std |
| 21 | Fwd IAT total | 47 | PSH flag count | 73 | Active max |
| 22 | Fwd IAT mean | 48 | ACK flag count | 74 | Active min |
| 23 | Fwd IAT std | 49 | URG flag count | 75 | Idle mean |
| 24 | Fwd IAT max | 50 | CWE flag count | 76 | Idle std |
| 25 | Fwd IAT min | 51 | ECE flag count | 77 | Idle max |
| 26 | Bwd IAT total | 52 | Down/up ratio | 78 | Idle min |

Figure 10: Comparison of feature selection dimensions produced by different algorithms (axes: Probe, DoS, U2R, and R2L; scale: 0 to 20 features; series: CMPSO, ACO, KH, IKH, and LNNLS-KH).

dataset samples. Furthermore, the LNNLS-KH algorithm improves the average classification accuracy on the Probe, DoS, U2R, and R2L test dataset samples by 9.95%, 12.04%, 9.47%, and 8.66%, respectively.

Table 13 shows the false positive rate and detection rate of the feature subsets produced by different feature selection algorithms. To visualize the difference, we show the comparison in Figure 11. For the Probe, DoS, U2R, and R2L datasets, the average false positive rate of the LNNLS-KH feature selection algorithm is 4.00%, which is 20.70%, 15.30%, 8.88%, and 3.34% lower than that of the CMPSO, ACO, KH, and IKH algorithms, respectively. Similarly, for the detection rate, the proposed LNNLS-KH feature selection algorithm exhibits excellent performance. The average detection rate of

Table 10: The feature selection results of different feature selection algorithms (NSL-KDD dataset).

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Probe | 14 (2, 3, 4, 7, 8, 10, 11, 17, 19, 20, 21, 27, 30, 33) | 15 (1, 3, 4, 6, 15, 16, 17, 19, 21, 23, 29, 35, 39, 40, 41) | 13 (3, 4, 5, 7, 8, 13, 14, 18, 19, 21, 26, 28, 40) | 11 (2, 3, 5, 8, 10, 17, 18, 29, 34, 35, 41) | 8 (3, 4, 8, 11, 15, 29, 34, 40) |
| DoS | 16 (3, 4, 5, 6, 8, 13, 14, 17, 18, 22, 23, 26, 30, 32, 35, 41) | 16 (3, 4, 7, 12, 14, 19, 20, 25, 27, 28, 30, 33, 34, 37, 40, 41) | 12 (2, 3, 4, 5, 8, 9, 12, 15, 19, 24, 26, 30) | 12 (2, 3, 4, 6, 12, 18, 20, 22, 27, 28, 30, 31) | 10 (3, 4, 6, 15, 17, 19, 20, 21, 30, 37) |
| U2R | 9 (3, 4, 5, 9, 12, 19, 32, 33, 41) | 8 (3, 4, 6, 8, 20, 24, 33, 36) | 8 (3, 4, 10, 12, 19, 23, 31, 32) | 6 (3, 10, 11, 21, 36, 39) | 3 (3, 33, 36) |
| R2L | 11 (2, 3, 4, 8, 21, 22, 25, 27, 37, 40, 41) | 10 (3, 4, 7, 12, 17, 21, 29, 37, 38, 40) | 10 (2, 3, 4, 6, 13, 18, 19, 22, 32, 41) | 8 (3, 4, 5, 8, 11, 14, 21, 31) | 7 (2, 3, 4, 10, 15, 21, 36) |

Table 11: Feature selection time and detection time of different feature selection algorithms (NSL-KDD dataset).

| Data categories | Selection time (second): CMPSO | ACO | KH | IKH | LNNLS-KH | Detection time (second): CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|---|---|---|---|---|
| Probe | 523178 | 499814 | 474533 | 534887 | 549048 | 3713 | 3823 | 3530 | 3405 | 3106 |
| DoS | 789235 | 763086 | 716852 | 803816 | 829692 | 11869 | 11815 | 10666 | 10514 | 9844 |
| U2R | 15487 | 14729 | 14418 | 15779 | 17224 | 0087 | 0086 | 0086 | 0086 | 0078 |
| R2L | 255675 | 236908 | 224092 | 266951 | 272770 | 955 | 913 | 907 | 862 | 803 |

Table 12: The classification accuracy of different feature selection algorithms (NSL-KDD dataset).

| Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%) |
|---|---|---|---|---|---|
| Probe | 80.46 | 86.56 | 92.42 | 93.74 | 98.24 |
| DoS | 81.74 | 83.36 | 86.03 | 88.74 | 97.01 |
| U2R | 82.74 | 84.57 | 85.59 | 91.89 | 95.67 |
| R2L | 78.70 | 81.62 | 88.78 | 90.49 | 93.56 |

Figure 11: Comparison of classification FPR and DR of different feature selection algorithms. (a) FPR of different feature selection algorithms. (b) DR of different feature selection algorithms. (Bar charts over the Probe, DoS, U2R, and R2L datasets; y-axes: FPR (%) and DR (%); series: CMPSO, ACO, KH, IKH, and LNNLS-KH.)

the LNNLS-KH algorithm is 96.48%, which is 13.47%, 9.32%, 7.02%, and 4.72% higher than that of the CMPSO, ACO, KH, and IKH feature selection algorithms, respectively.

In conclusion, the LNNLS-KH feature selection algorithm performs excellently in terms of the global optimal fitness iteration curve, test set detection time, number of dimensions of the feature subset, classification accuracy, false positive rate, and detection rate. Although the offline training time of the LNNLS-KH algorithm is longer than that of the CMPSO, ACO, KH, and IKH algorithms, its lower feature dimension reduces the detection time. Moreover, the algorithm has faster convergence speed, higher detection accuracy and detection rate, and a lower false positive rate.

4.3. Experimental Results and Discussion of CICIDS2017 Dataset. The experiment is conducted in MATLAB R2016a on a 64-bit Windows operating system with an Intel(R) Core(TM) i7-4790 processor. The MachineLearningCVE file in the CICIDS2017 dataset includes 8 CSV files of all traffic data, which contain 78 features plus an attack type label after some duplicate features are removed. We annotate traffic records according to the different attack periods and types, and standardize and normalize the dataset. Owing to the excessive amount of data contained in the CSV files, problems such as excessively long running time and slow convergence of the model occur when the host is used for model training. Therefore, we simplified and reintegrated these CSV data files while preserving the original attack timing features. We selected a total of 12090 records covering 5 types of traffic: 1 type of normal traffic and 4 types of attack traffic, namely, "DoS," "DDoS," "PortScan," and "WebAttack." The data are randomly divided into training sets and test sets in a 2:1 ratio, and the experiments are repeated independently.
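A random 2:1 division of the 12090 selected records can be sketched as below. This is only an illustration: the function name `split_two_to_one`, the fixed seed, and the use of integer record identifiers instead of real traffic records are assumptions made for the example, and the text does not state how the split was implemented.

```python
import random
from typing import List, Sequence, Tuple


def split_two_to_one(records: Sequence[int], seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle the records and return (training set, test set) in a 2:1 ratio."""
    rng = random.Random(seed)      # fixed seed only to make the example repeatable
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]


train, test = split_two_to_one(range(12090))
print(len(train), len(test))  # 8060 4030
```

Repeating the call with different seeds gives the independent repetitions mentioned in the text.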

The CMPSO, ACO, KH, and IKH algorithms are used for comparison with the LNNLS-KH algorithm. The preprocessed Normal, DoS, DDoS, PortScan, and WebAttack subsets are input into the algorithm model successively, and the dimension and feature subsets of feature selection are obtained. We adopt the KNN classification model as the classifier and obtain the accuracy of intrusion detection on the test set data. The results of feature selection dimension for the CICIDS2017 dataset are shown in Table 14. The LNNLS-KH algorithm selects different features for different attack types. For example, the selected features of the DoS subset are "Total Length of Bwd Packets," "Fwd Packet Length Min," "Flow IAT Min," "FIN Flag Count," "RST Flag Count," "URG Flag Count," "Bwd Avg Packets/Bulk," "Idle Mean," and "Idle Std." For the WebAttack subset, "Total Fwd Packets," "Bwd IAT Max," "Bwd PSH Flags," "Fwd Packets/s," "Bwd Avg Packets/Bulk," "Subflow Fwd Bytes," "Active Max," and "Idle Max" are selected as attack features by the LNNLS-KH algorithm. The algorithm reduces the feature dimension of the IDS dataset while ensuring high accuracy. The average feature dimension selected by the LNNLS-KH algorithm is 10.2, accounting for 13.08% of the total number of features in the CICIDS2017 dataset. It decreases the

Table 13: The classification FPR and DR of different feature selection algorithms (NSL-KDD dataset).

| Data categories | FPR (%): CMPSO | ACO | KH | IKH | LNNLS-KH | DR (%): CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|---|---|---|---|---|
| Probe | 22.37 | 18.04 | 8.50 | 4.05 | 1.18 | 82.32 | 89.18 | 95.01 | 95.22 | 97.73 |
| DoS | 21.27 | 14.08 | 11.45 | 7.88 | 2.85 | 79.12 | 82.08 | 83.77 | 85.23 | 96.80 |
| U2R | 24.51 | 21.04 | 16.13 | 8.45 | 4.30 | 87.02 | 89.79 | 90.14 | 93.67 | 95.52 |
| R2L | 30.66 | 24.05 | 15.42 | 8.99 | 7.67 | 83.56 | 87.56 | 88.91 | 92.89 | 95.85 |

Figure 12: Comparison of feature selection time and intrusion detection time for different feature selection algorithms. (a) Feature selection time for different feature selection algorithms. (b) Intrusion detection time of different feature selection algorithms. (Bar charts over the Normal, DoS, DDoS, PortScan, and WebAttack subsets; x-axes: time of feature selection, 0 to 10000 seconds, and time of intrusion detection, 0 to 2.5 seconds; series: CMPSO, ACO, KH, IKH, and LNNLS-KH.)

number of features by 57.85%, 52.34%, 27.14%, and 25%, respectively, compared with the CMPSO, ACO, KH, and IKH algorithms.

Figure 12 shows the feature selection time and intrusion detection time of the 5 feature selection algorithms to further evaluate their performance. Figure 12(a) shows that, in the feature selection stage, the LNNLS-KH algorithm takes a long time to find the optimal feature subset because of the linear nearest neighbor lasso step optimization performed after the position update of the krill herd. Compared with the KH and IKH algorithms, it increases the time by an average of 14.38% and 9.32%, respectively. Although the LNNLS-KH algorithm requires more calculation time, the convergence speed and global search ability are improved. Figure 12(b) shows the intrusion detection time of the 5 feature selection algorithms. This is the time the KNN classifier takes to classify the sample dataset after the feature subset has been found, excluding the time of searching for the optimal feature subset. The feature dimension of the LNNLS-KH algorithm is low, so the amount of data processed when classifying the detection sample dataset is small, which results in a shorter classification detection time. Compared with the CMPSO, ACO, KH, and IKH algorithms, the intrusion detection time of the LNNLS-KH algorithm is reduced by 6.52%, 5.17%, 2.14%, and 2.28% on average.

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the KNN classifier is used to classify the test dataset. The classification accuracy of different algorithms is shown in Table 15. For the five types of subsets, the average classification accuracy of the proposed LNNLS-KH algorithm is 95.86%. In particular, the classification accuracy reaches 97.55% for the PortScan subset. Compared with the other four feature selection methods, the LNNLS-KH algorithm achieves an average increase of 3.11%, 8.52%, 8.58%, 2.45%, and 4.29% on the

Table 14: The number of feature selection for different algorithms (CICIDS2017 dataset).

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Normal | 28 (3, 7, 13, 15, 16, 17, 20, 22, 24, 26, 30, 35, 37, 38, 42, 43, 44, 45, 46, 49, 50, 56, 59, 62, 63, 64, 65, 76) | 25 (1, 3, 4, 7, 10, 11, 12, 13, 15, 19, 29, 32, 34, 35, 37, 43, 46, 47, 51, 55, 56, 58, 73, 76, 78) | 14 (11, 19, 33, 39, 43, 49, 55, 56, 58, 65, 66, 68, 71, 73) | 14 (5, 10, 19, 20, 21, 23, 27, 33, 43, 56, 69, 70, 73, 78) | 8 (6, 12, 16, 32, 38, 50, 54, 73) |
| DoS | 24 (1, 3, 4, 13, 16, 17, 24, 26, 30, 33, 35, 39, 40, 44, 48, 51, 53, 57, 58, 59, 60, 62, 67, 70) | 19 (3, 6, 12, 13, 15, 26, 35, 39, 51, 55, 60, 61, 66, 69, 71, 73, 75, 77, 78) | 13 (8, 16, 21, 30, 45, 50, 52, 57, 59, 63, 66, 67) | 14 (2, 12, 15, 16, 19, 21, 32, 34, 44, 46, 65, 68, 76, 77) | 9 (6, 8, 20, 44, 46, 49, 61, 75, 76) |
| DDoS | 29 (15, 18, 19, 20, 23, 25, 26, 33, 34, 35, 38, 39, 42, 43, 46, 47, 49, 51, 55, 56, 57, 59, 60, 61, 62, 63, 71, 72, 78) | 27 (6, 9, 10, 13, 16, 19, 24, 28, 31, 41, 42, 45, 47, 48, 50, 51, 52, 53, 54, 56, 59, 60, 61, 62, 65, 68, 72) | 21 (10, 12, 13, 15, 18, 23, 27, 30, 34, 35, 41, 42, 45, 55, 61, 63, 65, 66, 68, 70, 76) | 18 (1, 11, 13, 14, 19, 24, 32, 35, 36, 40, 42, 47, 51, 57, 60, 69, 70, 75) | 14 (2, 5, 8, 9, 11, 22, 26, 33, 41, 43, 47, 51, 74, 77) |
| PortScan | 24 (1, 3, 6, 15, 16, 28, 30, 33, 35, 37, 44, 45, 52, 56, 59, 60, 61, 63, 65, 68, 70, 75, 77, 78) | 21 (1, 2, 6, 10, 15, 17, 26, 27, 29, 39, 42, 43, 46, 49, 58, 61, 66, 69, 70, 71, 76) | 14 (15, 20, 22, 27, 37, 44, 49, 50, 53, 59, 62, 65, 67, 78) | 15 (1, 24, 30, 32, 33, 43, 49, 53, 54, 58, 60, 61, 63, 64, 69) | 12 (2, 6, 15, 24, 25, 28, 32, 57, 59, 63, 66, 76) |
| WebAttack | 16 (2, 7, 26, 29, 45, 47, 50, 52, 53, 54, 63, 66, 68, 69, 72, 78) | 15 (3, 9, 10, 12, 19, 26, 40, 46, 50, 54, 64, 65, 68, 69, 73) | 8 (1, 17, 19, 36, 48, 49, 53, 60) | 7 (14, 17, 35, 39, 44, 48, 54) | 8 (3, 29, 32, 37, 61, 64, 73, 77) |

Table 15: The classification accuracy of different feature selection algorithms (CICIDS2017 dataset).

| Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%) |
|---|---|---|---|---|---|
| Normal | 89.78 | 89.06 | 92.70 | 94.58 | 94.64 |
| DoS | 77.03 | 82.69 | 90.90 | 93.34 | 94.51 |
| DDoS | 81.73 | 86.94 | 91.85 | 88.19 | 95.76 |
| PortScan | 92.38 | 95.64 | 95.05 | 97.35 | 97.55 |
| WebAttack | 89.12 | 93.08 | 93.77 | 94.26 | 96.85 |

Table 16: The classification FPR and DR of different feature selection algorithms (CICIDS2017 dataset).

| Data categories | FPR (%): CMPSO | ACO | KH | IKH | LNNLS-KH | DR (%): CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|---|---|---|---|---|
| Normal | 9.25 | 8.72 | 6.41 | 4.93 | 3.67 | 88.05 | 88.51 | 89.25 | 92.46 | 93.89 |
| DoS | 5.41 | 4.48 | 4.06 | 2.83 | 1.94 | 72.57 | 82.89 | 87.86 | 92.56 | 92.64 |
| DDoS | 6.85 | 4.92 | 4.54 | 6.33 | 3.18 | 79.03 | 83.47 | 90.22 | 87.52 | 92.98 |
| PortScan | 4.65 | 3.02 | 2.84 | 1.86 | 1.16 | 88.25 | 93.80 | 94.33 | 95.14 | 95.42 |
| WebAttack | 5.33 | 3.16 | 2.52 | 2.11 | 1.60 | 87.40 | 91.35 | 92.19 | 92.94 | 94.77 |


Normal, DoS, DDoS, PortScan, and WebAttack subsets, respectively. Table 16 shows the classification FPR and DR of different feature selection algorithms on the test sets. Across the five test sets, the LNNLS-KH algorithm has a lower FPR and a higher DR than the other four algorithms.

We propose the LNNLS-KH algorithm, a novel feature selection algorithm for intrusion detection. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the algorithm has good feature selection performance and improves the efficiency of intrusion detection.

5. Conclusions

With the rapid development of network technology, intrusion detection plays an increasingly important role in network security. However, the "curse of dimensionality" caused by massive data results in problems such as slow response and poor accuracy of the intrusion detection system. The KH algorithm is a population-based swarm intelligence optimization method that shows good performance in high-dimensional data processing, providing a new approach for reducing the dimension of intrusion detection data and selecting useful features. In this paper, an improved KH algorithm named LNNLS-KH is proposed for feature selection of IDS datasets by linear nearest neighbor lasso step optimization. The LNNLS-KH algorithm introduces a new fitness function, which is composed of the number of selected feature dimensions and the classification accuracy. Nonlinear optimization is introduced into the physical diffusion motion of krill individuals to accelerate the convergence of the algorithm. Moreover, the linear nearest neighbor lasso step optimization is proposed to balance the exploration and exploitation abilities and to obtain the global optimal solution of the feature subset effectively. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the LNNLS-KH algorithm retains 7 and 10.2 features on average, respectively, which greatly reduces the dimension of the features. In the NSL-KDD dataset, features are reduced by 44%, 42.86%, 34.88%, and 24.32% compared with the CMPSO, ACO, KH, and IKH algorithms, and in the CICIDS2017 dataset, they are reduced by 57.85%, 52.34%, 27.14%, and 25%, respectively. In addition, the classification accuracy of the LNNLS-KH feature selection algorithm is increased by 10.03% and 5.39%, and the time of intrusion detection is reduced by 12.41% and 4.03%, on the two datasets. Furthermore, the LNNLS-KH algorithm enhances the ability to jump out of local optimal solutions and shows good performance in the optimal fitness iteration curve, false positive rate of detection, and convergence speed, which demonstrates that the proposed LNNLS-KH algorithm is an efficient feature selection method for network intrusion detection.

In this research, we found that the initialization of the LNNLS-KH algorithm has a certain degree of randomness. Therefore, we conducted independent and repeated experiments to mitigate its effect, and the results were reasonable and convincing. Although the proposed algorithm shows encouraging performance, it could be further improved.

In future work, we will consider using data balancing techniques to preprocess the experimental dataset to obtain more accurate feature selection results and stronger algorithm stability. Meanwhile, we will combine LNNLS-KH with other algorithms to improve its exploration and exploitation abilities, thereby further shortening the time of feature subset training and classification detection. In addition, as the LNNLS-KH algorithm is generally applicable, it can be applied to more feature selection systems and to optimization problems in other fields.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was sponsored by the National Key Research and Development Program of China (Grants 2018YFB0804002 and 2017YFB0803204), National Natural Science Foundation of P.R. China (Grant 72001191), Henan Natural Science Foundation (Grant 202300410442), and Henan Philosophy and Social Science Program (Grant 2020CZH009).

References

[1] W. Wei and C. Guo, “A text semantic topic discovery method based on the conditional co-occurrence degree,” Neurocomputing, vol. 368, pp. 11–24, 2019.

[2] C.-R. Wang, R.-F. Xu, S.-J. Lee, and C.-H. Lee, “Network intrusion detection using equality constrained-optimization-based extreme learning machines,” Knowledge-Based Systems, vol. 147, pp. 68–80, 2018.

[3] G.-G. Wang, A. H. Gandomi, A. H. Alavi, and D. Gong, “A comprehensive review of krill herd algorithm: variants, hybrids and applications,” Artificial Intelligence Review, vol. 51, no. 1, pp. 119–148, 2019.

[4] J. Amudhavel, D. Sathian, R. S. Raghav et al., “A fault tolerant distributed self-organization in peer to peer (p2p) using krill herd optimization,” in Proceedings of the 2015 International Conference on Advanced Research in Computer Science Engineering & Technology (ICARCSET 2015), pp. 1–5, Unnao, India, 2015.

[5] L. M. Abualigah, A. T. Khader, and E. S. Hanandeh, “Hybrid clustering analysis using improved krill herd algorithm,” Applied Intelligence, vol. 48, no. 11, pp. 4047–4071, 2018.

[6] P. A. Kowalski and S. Łukasik, “Training neural networks with krill herd algorithm,” Neural Processing Letters, vol. 44, no. 1, pp. 5–17, 2016.

[7] C. Stasinakis, G. Sermpinis, I. Psaradellis, and T. Verousis, “Krill-Herd Support Vector Regression and heterogeneous autoregressive leverage: evidence from forecasting and trading commodities,” Quantitative Finance, vol. 16, no. 12, pp. 1901–1915, 2016.

20 Security and Communication Networks

[8] L. Wang, P. Jia, T. Huang, S. Duan, J. Yan, and L. Wang, “A novel optimization technique to improve gas recognition by electronic noses based on the enhanced krill herd algorithm,” Sensors, vol. 16, no. 8, p. 1275, 2016.

[9] R. Jensi and G. W. Jiji, “An improved krill herd algorithm with global exploration capability for solving numerical function optimization problems and its application to data clustering,” Applied Soft Computing, vol. 46, pp. 230–245, 2016.

[10] H. Pulluri, R. Naresh, and V. Sharma, “Application of stud krill herd algorithm for solution of optimal power flow problems,” International Transactions on Electrical Energy Systems, vol. 27, no. 6, Article ID e2316, 2017.

[11] D. Rodrigues, L. A. M. Pereira, J. P. Papa et al., “A binary krill herd approach for feature selection,” in Proceedings of the 2014 22nd International Conference on Pattern Recognition, pp. 1407–1412, IEEE, Stockholm, Sweden, August 2014.

[12] A. Mukherjee and V. Mukherjee, “Chaotic krill herd algorithm for optimal reactive power dispatch considering FACTS devices,” Applied Soft Computing, vol. 44, pp. 163–190, 2016.

[13] S. Sun, H. Qi, F. Zhao, L. Ruan, and B. Li, “Inverse geometry design of two-dimensional complex radiative enclosures using krill herd optimization algorithm,” Applied Thermal Engineering, vol. 98, pp. 1104–1115, 2016.

[14] S. Sultana and P. K. Roy, “Oppositional krill herd algorithm for optimal location of capacitor with reconfiguration in radial distribution system,” International Journal of Electrical Power & Energy Systems, vol. 74, pp. 78–90, 2016.

[15] L. Brezocnik, I. Fister, and V. Podgorelec, “Swarm intelligence algorithms for feature selection: a review,” Applied Sciences, vol. 8, no. 9, 2018.

[16] D. Smith, Q. Guan, and S. Fu, “An anomaly detection framework for autonomic management of compute cloud systems,” in Proceedings of the 2010 IEEE 34th Annual Computer Software and Applications Conference Workshops, pp. 376–381, IEEE, Seoul, South Korea, July 2010.

[17] Y. Zhao, Y. Zhang, W. Tong et al., “An improved feature selection algorithm based on MAHALANOBIS distance for network intrusion detection,” in Proceedings of 2013 International Conference on Sensor Network Security Technology and Privacy Communication System, pp. 69–73, IEEE, Nangang, China, May 2013.

[18] P. Singh and A. Tiwari, “An efficient approach for intrusion detection in reduced features of KDD99 using ID3 and classification with KNNGA,” in Proceedings of the 2015 Second International Conference on Advances in Computing and Communication Engineering, pp. 445–452, IEEE, Dehradun, India, May 2015.

[19] M. A. Ambusaidi, X. He, P. Nanda, and Z. Tan, “Building an intrusion detection system using a filter-based feature selection algorithm,” IEEE Transactions on Computers, vol. 65, no. 10, pp. 2986–2998, 2016.

[20] N. Shone, T. N. Ngoc, V. D. Phai, and Q. Shi, “A deep learning approach to network intrusion detection,” IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 2, no. 1, pp. 41–50, 2018.

[21] Y. Xue, W. Jia, X. Zhao et al., “An evolutionary computation based feature selection method for intrusion detection,” Security and Communication Networks, vol. 2018, Article ID 2492956, 10 pages, 2018.

[22] Z. Shen, Y. Zhang, and W. Chen, “A bayesian classification intrusion detection method based on the fusion of PCA and LDA,” Security and Communication Networks, vol. 2019, Article ID 6346708, 11 pages, 2019.

[23] P. Sun, P. Liu, Q. Li et al., “DL-IDS: Extracting features using CNN-LSTM hybrid network for intrusion detection system,” Security and Communication Networks, vol. 2020, Article ID 8890306, 11 pages, 2020.

[24] G. Farahani, “Feature selection based on cross-correlation for the intrusion detection system,” Security & Communication Networks, vol. 2020, Article ID 8875404, 17 pages, 2020.

[25] F. G. Mohammadi, M. H. Amini, and H. R. Arabnia, “Applications of nature-inspired algorithms for dimension Reduction: enabling efficient data analytics,” in Advances in Intelligent Systems and Computing, Optimization, Learning, and Control for Interdependent Complex Networks, pp. 67–84, Springer, Cham, Switzerland, 2020.

[26] J. Kennedy and R. Eberhart, “Particle swarm optimization,” in Proceedings of the ICNN’95-International Conference on Neural Networks, no. 4, pp. 1942–1948, IEEE, Perth, WA, Australia, December 1995.

[27] M. Dorigo, M. Birattari, and T. Stutzle, “Ant colony optimization,” IEEE Computational Intelligence Magazine, vol. 1, no. 4, pp. 28–39, 2006.

[28] R. Rajabioun, “Cuckoo optimization algorithm,” Applied Soft Computing, vol. 11, no. 8, pp. 5508–5518, 2011.

[29] M. Neshat, G. Sepidnam, M. Sargolzaei, and A. N. Toosi, “Artificial fish swarm algorithm: a survey of the state-of-the-art, hybridization, combinatorial and indicative applications,” Artificial Intelligence Review, vol. 42, no. 4, pp. 965–997, 2014.

[30] D. Karaboga, “An idea based on honey bee swarm for numerical optimization,” Technical Report-tr06, Erciyes university, Engineering Faculty, Computer Engineering Department, Kayseri, Turkey, 2005.

[31] W.-T. Pan, “A new Fruit Fly Optimization Algorithm: taking the financial distress model as an example,” Knowledge-Based Systems, vol. 26, pp. 69–74, 2012.

[32] R. Zhao and W. Tang, “Monkey algorithm for global numerical optimization,” Journal of Uncertain Systems, vol. 2, no. 3, pp. 165–176, 2008.

[33] X. S. Yang and X. He, “Bat algorithm: literature review and applications,” International Journal of Bio-Inspired Computation, vol. 5, no. 3, pp. 141–149, 2013.

[34] S. Mirjalili, A. H. Gandomi, S. Z. Mirjalili, S. Saremi, H. Faris, and S. M. Mirjalili, “Salp Swarm Algorithm: a bio-inspired optimizer for engineering design problems,” Advances in Engineering Software, vol. 114, pp. 163–191, 2017.

[35] K. Ahmed, A. E. Hassanien, and S. Bhattacharyya, “A novel chaotic chicken swarm optimization algorithm for feature selection,” in Proceedings of the 2017 Third International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), pp. 259–264, IEEE, Kolkata, India, November 2017.

[36] S. Tabakhi, P. Moradi, F. Akhlaghian et al., “An unsupervised feature selection algorithm based on ant colony optimization,” Engineering Applications of Artificial Intelligence, vol. 32, pp. 112–123, 2014.

[37] S. Arora and P. Anand, “Binary butterfly optimization approaches for feature selection,” Expert Systems with Applications, vol. 116, pp. 147–160, 2019.

[38] C. Yan, J. Ma, H. Luo, and A. Patel, “Hybrid binary coral reefs optimization algorithm with simulated annealing for feature selection in high-dimensional biomedical datasets,” Chemometrics and Intelligent Laboratory Systems, vol. 184, pp. 102–111, 2019.

[39] G. I. Sayed, A. Tharwat, and A. E. Hassanien, “Chaotic dragonfly algorithm: an improved metaheuristic algorithm for feature selection,” Applied Intelligence, vol. 49, no. 1, pp. 188–205, 2019.

[40] Z. Zhang, P. Wei, Y. Li et al., “Feature selection algorithm based on improved particle swarm joint taboo search,” Journal of Communication, vol. 39, no. 12, pp. 60–68, 2018.

[41] A. H. Gandomi and A. H. Alavi, “Krill herd: a new bio-inspired optimization algorithm,” Communications in Nonlinear Science and Numerical Simulation, vol. 17, no. 12, pp. 4831–4845, 2012.

[42] Q. Tan and Z. Huang, “Krill herd with nearest neighbor lasso operator,” Computer Engineering and Applications, vol. 55, no. 9, pp. 124–129, 2019.

[43] Q. Wang, C. Ding, and X. Wang, “A hybrid data clustering algorithm based on improved krill herd algorithm and KHM clustering,” Control and Decision, vol. 35, no. 10, pp. 2449–2458, 2018.

[44] Q. Li and B. Liu, “Clustering using an improved krill herd algorithm,” Algorithms, vol. 10, no. 2, p. 56, 2017.

[45] G.-G. Wang, A. H. Gandomi, and A. H. Alavi, “Stud krill herd algorithm,” Neurocomputing, vol. 128, pp. 363–370, 2014.

[46] J. Li, Y. Tang, C. Hua, and X. Guan, “An improved krill herd algorithm: krill herd with linear decreasing step,” Applied Mathematics and Computation, vol. 234, pp. 356–367, 2014.

[47] H. B. Nguyen, B. Xue, P. Andreae et al., “Particle swarm optimisation with genetic operators for feature selection,” in Proceedings of the 2017 IEEE Congress on Evolutionary Computation (CEC), pp. 286–293, IEEE, San Sebastian, Spain, June 2017.

[48] M. H. Aghdam and P. Kabiri, “Feature selection for intrusion detection system using ant colony optimization,” International Journal of Network Security, vol. 18, no. 3, pp. 420–432, 2016.


The algorithm continuously searches around the original area to guide the krill herd toward the global optimum. It defines a new step size formula, which makes it convenient for krill individuals to fine-tune their positions in the search space. At the same time, an elite selection strategy is introduced into the krill herd update process, which helps the algorithm jump out of local optimal solutions. Experimental results show that the improved KH algorithm has higher accuracy and better robustness.

In [45], Wang et al. proposed a stud KH algorithm. The method adopts a new krill herd genetics and reproduction mechanism, replacing the random selection in the standard KH algorithm with a columnar selection operator and a crossover operator. To balance the exploration and exploitation abilities of the KH algorithm, Li et al. proposed a linear decreasing step KH algorithm [46]. In this algorithm, the step size scaling factor is adjusted linearly so that it decreases as the number of iterations increases, thereby enhancing the search ability of the algorithm.

Although the KH algorithm and its enhanced versions show better performance than other swarm intelligence algorithms, there are still deficiencies such as unbalanced exploration and exploitation. In this paper, to minimize the number of selected features and achieve high classification accuracy, both quantities are introduced into the fitness evaluation function. The physical diffusion motion of krill individuals is nonlinearly improved to dynamically adjust the random diffusion amplitude and accelerate the convergence rate of the algorithm. At the same time, a linear nearest neighbor lasso step optimization is performed after the position of the krill herd is updated, which effectively enhances the global exploration ability. It helps the algorithm achieve better performance, reduce the data dimension in feature selection, and improve the efficiency of intrusion detection.

3. Algorithm Design

In this section, we first provide a brief description of the KH algorithm. Subsequently, we present an improved version of KH, named LNNLS-KH, to address the problem of the large number and high dimension of features in feature selection for intrusion detection.

3.1. Standard KH Algorithm. The framework of the KH algorithm is shown in Figure 3. It includes the three actions of krill individuals, the crossover operation, position updating, and calculation of the fitness function. After initialization, krill individuals change their positions according to the three actions. Then, the crossover operator is executed to complete the position update, and the new fitness function is calculated. If the number of iterations has not reached the maximum, krill individuals repeat the process until the iteration is completed.

As a novel biologically inspired algorithm for solving optimization tasks, the KH algorithm represents each possible solution of the problem with a krill individual. By simulating the foraging behavior, the krill herd position is continuously updated to obtain the global optimal solution. The motions of krill individuals are mainly affected by the following three aspects:

(1) Movement induced by other krill individuals
(2) Foraging activity
(3) Physical diffusion motion

The KH algorithm adopts the Lagrange model to search in multidimensional space. The position update of krill individuals is shown as follows:

$$\frac{dX_i}{dt} = N_i + F_i + D_i, \tag{1}$$

where $X_i = \{X_{i,1}, X_{i,2}, \ldots, X_{i,NV}\}$, $N_i$ is the movement induced by other krill individuals, $F_i$ is the foraging activity of the krill individual, and $D_i$ is the random physical diffusion based on density region.

3.1.1. Movement Induced by Other Krill Individuals. The movement induced by other krill individuals is described as follows:

$$N_i^{new} = N^{max}\alpha_i + \omega_n N_i^{old}, \tag{2}$$

$$\alpha_i = \alpha_i^{local} + \alpha_i^{target}, \tag{3}$$

where $N^{max}$ is the maximum induction velocity of surrounding krill individuals and is taken as 0.01 (m s$^{-1}$) [5], $\omega_n$ represents the inertia weight in the range [0, 1], $N_i^{old}$ is the result of the last motion induced by other krill individuals, $\alpha_i^{local}$ is a parameter indicating the direction of guidance, and $\alpha_i^{target}$ is the direction effect of the global optimal krill individual.

$\alpha_i^{local}$ is defined as follows:

$$\alpha_i^{local} = \sum_{j=1}^{NN} \hat{K}_{ij}\hat{X}_{ij}, \qquad \hat{X}_{ij} = \frac{X_j - X_i}{\lVert X_j - X_i \rVert + \varepsilon}, \qquad \hat{K}_{ij} = \frac{K_i - K_j}{K^{worst} - K^{best}}, \tag{4}$$

where $K^{best}$ and $K^{worst}$ are the best and worst fitness values of the krill herd, $K_i$ is the fitness value of the ith krill individual, $K_j$ represents the fitness value of the jth neighbor krill individual ($j = 1, 2, \ldots, NN$), and $NN$ represents the total number of neighbors. The $\varepsilon$ in the denominator is a small positive number that avoids the singularity caused by a zero denominator.

When selecting surrounding krill individuals, the KH algorithm finds the nearest neighbors of the ith krill individual by defining a “neighborhood ratio.” The neighborhood is a circular area with the ith krill individual as the center and the perception distance $d_{s,i}$ as the radius. $d_{s,i}$ is described as follows:

$$d_{s,i} = \frac{1}{5N}\sum_{j=1}^{N} \lVert X_i - X_j \rVert, \tag{5}$$

where $N$ is the number of krill individuals and $X_i$ and $X_j$ represent the positions of the ith and jth krill individuals. $\alpha_i^{target}$ is defined as follows:

$$\alpha_i^{target} = C^{best}\hat{K}_{i,best}\hat{X}_{i,best}, \tag{6}$$

where $C^{best}$ is the effective coefficient between the ith and the global optimal krill individuals:

$$C^{best} = 2\left(\mathrm{rand} + \frac{I}{I_{max}}\right), \tag{7}$$

where $I$ is the number of iterations, $I_{max}$ is the maximum number of iterations, and rand is a random number in [0, 1], which is used to enhance the exploration ability.
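The induced movement of equations (2)–(7) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: fitness is minimized (so $K^{best}$ is the smallest value), neighbors are the individuals closer than $d_{s,i}$, and the function names, the default $\omega_n = 0.5$, and the small `eps` added to the fitness denominator are hypothetical choices made only for this example.

```python
import math
import random

def norm(v):
    """Euclidean norm of a vector given as a list."""
    return math.sqrt(sum(c * c for c in v))

def sensing_distance(i, X):
    """Equation (5): d_{s,i} = 1/(5N) * sum_j ||X_i - X_j||."""
    n = len(X)
    return sum(norm([a - b for a, b in zip(X[i], xj)]) for xj in X) / (5 * n)

def unit_effect(x_i, x_j, k_i, k_j, k_best, k_worst, eps=1e-12):
    """K_hat_ij * X_hat_ij from equation (4).

    Assumption: eps is also added to (K_worst - K_best) so that a herd
    with identical fitness values does not divide by zero.
    """
    diff = [b - a for a, b in zip(x_i, x_j)]
    scale = norm(diff) + eps
    k_hat = (k_i - k_j) / (k_worst - k_best + eps)
    return [k_hat * d / scale for d in diff]

def induced_motion(i, X, K, n_old, it, it_max, n_max=0.01, w_n=0.5, rng=random):
    """Equations (2), (3), (6), (7) for krill i (minimization assumed)."""
    k_best, k_worst = min(K), max(K)
    best = K.index(k_best)
    d_s = sensing_distance(i, X)
    alpha = [0.0] * len(X[i])
    # Local effect: sum over neighbours inside the sensing distance.
    for j in range(len(X)):
        if j != i and norm([a - b for a, b in zip(X[i], X[j])]) < d_s:
            e = unit_effect(X[i], X[j], K[i], K[j], k_best, k_worst)
            alpha = [a + c for a, c in zip(alpha, e)]
    # Target effect: attraction toward the current best individual.
    c_best = 2.0 * (rng.random() + it / it_max)
    e = unit_effect(X[i], X[best], K[i], k_best, k_best, k_worst)
    alpha = [a + c_best * c for a, c in zip(alpha, e)]
    return [n_max * a + w_n * o for a, o in zip(alpha, n_old)]
```

With this sign convention, a neighbor with lower (better) fitness yields a positive $\hat{K}_{ij}$ and therefore attracts krill $i$, while a worse neighbor repels it.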

3.1.2. Foraging Activity. Foraging activity is affected by the food distance and the experience of the food location, and it is described as follows:

$$F_i = V_f\beta_i + \omega_f F_i^{old}, \tag{8}$$

$$\beta_i = \beta_i^{food} + \beta_i^{best}, \tag{9}$$

where $V_f$ is the foraging speed, taken as 0.02 (m s$^{-1}$) [41], $\omega_f$ is the inertia weight in the range [0, 1], and $\beta_i$ indicates the foraging direction, which consists of the food induction direction $\beta_i^{food}$ and the induction direction $\beta_i^{best}$ of the historically optimal krill individual. The food is in essence a virtual location based on the concept of a “centroid.” It is defined as follows:

$$X^{food} = \frac{\sum_{i=1}^{N} (1/K_i)\,X_i}{\sum_{i=1}^{N} 1/K_i}. \tag{10}$$

(1) The induced direction of food to the ith krill individual is expressed as follows:

$$\beta_i^{food} = C^{food}\hat{K}_{i,food}\hat{X}_{i,food}, \tag{11}$$

where $C^{food}$ is the food coefficient, which is determined as follows:

$$C^{food} = 2\left(1 - \frac{I}{I_{max}}\right). \tag{12}$$

(2) The induced direction of the historical best krill individual to the ith krill individual is expressed as follows:

$$\beta_i^{best} = \hat{K}_{i,best}\hat{X}_{i,best}, \tag{13}$$

where $\hat{K}_{i,best}$ represents the influence of the historical best individual on the ith krill individual.
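As a small illustration of equations (8), (10), and (12), the sketch below computes the virtual food position as a fitness-weighted centroid, the decaying food coefficient, and the foraging motion for a given direction $\beta_i$. The function names and the default $\omega_f = 0.5$ are assumptions for this example, and all fitness values are assumed to be nonzero so that $1/K_i$ is defined.

```python
def food_position(X, K):
    """Equation (10): centroid of the herd weighted by 1 / K_i."""
    weights = [1.0 / k for k in K]          # assumes every K_i != 0
    total = sum(weights)
    dim = len(X[0])
    return [sum(w * x[m] for w, x in zip(weights, X)) / total
            for m in range(dim)]

def food_coefficient(it, it_max):
    """Equation (12): C_food = 2 * (1 - I / I_max)."""
    return 2.0 * (1.0 - it / it_max)

def foraging_motion(beta_i, f_old, v_f=0.02, w_f=0.5):
    """Equation (8): F_i = V_f * beta_i + w_f * F_i_old."""
    return [v_f * b + w_f * f for b, f in zip(beta_i, f_old)]
```

Because the weights are $1/K_i$, individuals with lower fitness values pull the food location toward themselves, and the food coefficient shrinks from 2 to 0 over the run, so the attraction of the food fades as the iterations proceed.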

3.1.3. Physical Diffusion Motion. Physical diffusion is a stochastic process. The expression is as follows:

$$D_i = D^{max}\left(1 - \frac{I}{I_{max}}\right)\delta, \tag{14}$$

where $D^{max}$ is the maximum diffusion velocity in the range [0.002, 0.010] (m s$^{-1}$). According to [41], it is taken as 0.005 (m s$^{-1}$). $\delta$ represents the random direction vector, whose components are random values in [−1, 1].

Figure 3: The framework of KH algorithm (three actions of krill individual: movement induced by other krill individuals, foraging movement, and physical diffusion movement; crossover operation; updating position; calculating the fitness function).

3.1.4. Crossover. The crossover operator is an effective global optimization strategy. An adaptive vectorization crossover scheme is added to the standard KH algorithm to further enhance the global search ability of the algorithm [41]. It is given as follows:

$$X_{i,m} = \begin{cases} X_{i,m} \cdot Cr + X_{r,m} \cdot (1 - Cr), & \mathrm{rand}_{i,m} < Cr, \\ X_{i,m}, & \text{else}, \end{cases} \qquad Cr = 0.2\,\hat{K}_{i,best}, \tag{15}$$

where $r$ is a random index with $r \in \{1, 2, \ldots, i-1, i+1, \ldots, N\}$, $X_{i,m}$ represents the mth dimension of the ith krill individual, $X_{r,m}$ represents the mth dimension of the rth krill individual, and $Cr$ is the crossover probability, which decreases as the fitness increases; the crossover probability of the global optimum is zero.
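A minimal sketch of the crossover in equation (15) is given below. It is illustrative only: the function name is hypothetical, $\hat{K}_{i,best}$ is passed in as a precomputed number, and drawing a fresh partner index $r$ for every dimension is an assumption, since the text does not say whether $r$ is drawn once per individual or once per dimension.

```python
import random

def crossover(i, X, k_hat_i_best, rng):
    """Equation (15): blend dimension m of krill i with a random partner r.

    k_hat_i_best is the normalized fitness difference between krill i and
    the best individual; Cr = 0.2 * k_hat_i_best.
    """
    cr = 0.2 * k_hat_i_best
    others = [r for r in range(len(X)) if r != i]
    child = list(X[i])
    for m in range(len(child)):
        if rng.random() < cr:
            r = rng.choice(others)  # assumption: new partner per dimension
            child[m] = X[i][m] * cr + X[r][m] * (1.0 - cr)
    return child
```

For the globally best individual $\hat{K}_{i,best} = 0$, so $Cr = 0$ and the position is returned unchanged, which matches the statement that the crossover probability of the global optimum is zero.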

3.1.5. Movement Process of KH Algorithm. Affected by the movement induced by other krill individuals, foraging activity, and physical diffusion, the krill herd changes its position toward the direction of optimal fitness. The position vector of a krill individual over the interval from $t$ to $t + \Delta t$ is described as follows:

$$X_i(t + \Delta t) = X_i(t) + \Delta t\,\frac{dX_i}{dt}, \tag{16}$$

where $\Delta t$ is the scaling factor of the velocity vector. It depends completely on the search space:

$$\Delta t = C_t \sum_{j=1}^{NV} \left(UB_j - LB_j\right), \tag{17}$$

where $NV$ represents the dimension of the decision variables, $LB_j$ and $UB_j$ are the lower and upper bounds of the jth variable ($j = 1, 2, \ldots, NV$), and $C_t$ is the step scaling factor in the range [0, 2].
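Equations (1), (16), and (17) combine into a single update step, sketched below. The function names and the default $C_t = 0.5$ are assumptions for illustration; the three motion vectors are taken as already computed.

```python
def time_step(lower, upper, c_t=0.5):
    """Equation (17): dt = C_t * sum_j (UB_j - LB_j)."""
    return c_t * sum(u - l for l, u in zip(lower, upper))

def update_position(x, n_i, f_i, d_i, dt):
    """Equations (1) and (16): X(t + dt) = X(t) + dt * (N_i + F_i + D_i)."""
    return [xm + dt * (n + f + d)
            for xm, n, f, d in zip(x, n_i, f_i, d_i)]
```

For example, with three variables each bounded in [0, 1] and $C_t = 0.5$, the step is $\Delta t = 0.5 \times 3 = 1.5$, so a wider search space automatically produces larger moves.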

3.2. The LNNLS-KH Algorithm. In view of the unbalanced exploitation and exploration abilities of the KH algorithm, we propose the LNNLS-KH algorithm for feature selection to improve performance and to pursue a high accuracy rate, a high detection rate, and a low false positive rate in intrusion detection. The improvement is reflected in the following three aspects.

3.2.1. A New Fitness Evaluation Function. To improve the classification accuracy of feature subset detection, we introduce the feature selection dimension and the classification accuracy into the fitness evaluation function. The specific expression of fitness is as follows:

$$\mathrm{fitness} = \alpha \cdot \frac{\mathrm{Feature}_{selected}}{\mathrm{Feature}_{all}} + (1 - \alpha) \cdot (1 - \mathrm{Accuracy}), \tag{18}$$

where $\alpha \in [0, 1]$ is a weighting factor used to tune the relative importance of the number of selected features and the classification accuracy, $\mathrm{Feature}_{selected}$ is the number of selected features, $\mathrm{Feature}_{all}$ represents the total number of features, and Accuracy indicates the accuracy of the classification results. Moreover, k-nearest neighbor (KNN) is used as the classification algorithm, and the classification accuracy is defined as follows:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \tag{19}$$

where TP, TN, FP, and FN are defined in the confusion matrix, as shown in Table 2.
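Equations (18) and (19) can be evaluated with a few lines of Python. The sketch below is illustrative: the function names are hypothetical, labels are assumed to be binary (1 for attack, 0 for normal), and the default weighting factor `alpha=0.1` is a placeholder rather than a value taken from the paper.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Equation (19)."""
    return (tp + tn) / (tp + tn + fp + fn)

def fitness(n_selected, n_all, acc, alpha=0.1):
    """Equation (18): lower is better (few features, high accuracy)."""
    return alpha * n_selected / n_all + (1.0 - alpha) * (1.0 - acc)
```

Both terms lie in [0, 1], so the fitness itself lies in [0, 1]; it reaches 0 only in the limiting case of no selected features and perfect accuracy.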

3.2.2. Nonlinear Optimization of Physical Diffusion Motion. The physical diffusion of the krill herd is a random diffusion process. The closer the individuals are to the food, the less random the movement is. Due to the strong convergence of the algorithm, the movement of krill individuals presents a nonlinear change from fast to slow, and the fitness function gradually decreases as the algorithm converges. According to equations (2) and (9), the movement induced by other krill individuals and the foraging activity are nonlinear. In the physical diffusion equation (14), the diffusion velocity $D_i$ of the ith krill individual decreases linearly as the number of iterations increases. In order to fit the nonlinear motion of the krill herd, we introduce the optimization coefficient $\lambda$ and the fitness factor $\mu_{fit}$ of the krill herd into the physical diffusion motion. The optimized physical diffusion motion expression is defined as follows:

$$D_i = D^{max}\left[1 - \lambda\frac{I}{I_{max}} - (1 - \lambda)\mu_{fit}\right]\delta, \tag{20}$$

where $\lambda$ is in the range [0, 1] and $\mu_{fit}$ is defined as follows:

$$\mu_{fit} = \frac{K^{best}}{K_i}, \tag{21}$$

where $K^{best}$ is the fitness value of the current optimal individual and $K_i$ represents the fitness value of the ith krill individual. As the number of iterations increases, $K_i$ gradually decreases until it approaches $K^{best}$. Therefore, $\mu_{fit}$ is in the range (0, 1]. Introducing the fitness factor $\mu_{fit}$ into equation (20) gives the new physical diffusion motion equation:

$$D_i = D^{max}\left[1 - \lambda\frac{I}{I_{max}} - (1 - \lambda)\frac{K^{best}}{K_i}\right]\delta. \tag{22}$$

According to equation (22), the number of iterations $I$, the fitness $K_i$ of the krill individual, and the fitness $K^{best}$ of the current optimal krill individual jointly determine the physical diffusion motion, so as to further adjust the random diffusion amplitude. In the early stage of the iteration, the number of iterations is small and the fitness value of the individual is large, so the fitness factor is small, which is conducive to a large random diffusion of the krill herd. As the number of iterations gradually increases, the algorithm converges quickly and the fitness of krill individuals approaches the global optimal solution. At the same time, the fitness factor increases nonlinearly, which makes the random diffusion more consistent with the movement process of krill individuals.
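To make the difference between equations (14) and (22) concrete, the following sketch computes the diffusion amplitude (the factor multiplying $\delta$) in both forms. The function names are hypothetical, and the default `lam=0.5` is an assumed value of $\lambda$ chosen only for illustration.

```python
import random

def linear_amplitude(it, it_max, d_max=0.005):
    """Equation (14) without delta: D_max * (1 - I / I_max)."""
    return d_max * (1.0 - it / it_max)

def nonlinear_amplitude(it, it_max, k_best, k_i, lam=0.5, d_max=0.005):
    """Equation (22) without delta:
    D_max * [1 - lam * I / I_max - (1 - lam) * K_best / K_i]."""
    mu_fit = k_best / k_i               # equation (21), in (0, 1]
    return d_max * (1.0 - lam * it / it_max - (1.0 - lam) * mu_fit)

def diffusion(amplitude, dim, rng):
    """Multiply the amplitude by a random direction with entries in [-1, 1]."""
    return [amplitude * rng.uniform(-1.0, 1.0) for _ in range(dim)]
```

At iteration 0 with $K^{best}/K_i = 0.1$ and $\lambda = 0.5$, the nonlinear amplitude is $0.005 \times (1 - 0.05) = 0.00475$, close to the maximum; for the best individual at the final iteration it is exactly zero, so a converged krill stops diffusing.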

To further evaluate the effect of the KH algorithm with nonlinear optimization of physical diffusion motion (NO-KH), we conducted experiments on two classical benchmark functions. $F_1(x)$ is the Schwefel 2.22 function, which is a unimodal benchmark function, and $F_2(x)$ is the Ackley function, which is a multimodal benchmark function. The experimental parameters of $F_1(x)$ and $F_2(x)$ are shown in Table 3.

Figure 4 shows the graphs of the Schwefel 2.22 and Ackley functions for n = 2. We use the standard KH algorithm and the NO-KH algorithm to search for the optimal value of both the unimodal and the multimodal benchmark function. The number of krill and the number of iterations are set to 25 and 500, respectively. Table 4 shows the best value, worst value, mean value, and standard deviation obtained by running each algorithm 20 times. Compared with the standard KH algorithm, the NO-KH algorithm finds smaller optimal solutions on both the unimodal and the multimodal benchmark function, which indicates that its global exploration ability is improved. The smaller standard deviation obtained from the repeated experiments shows that the NO-KH algorithm has better stability. Therefore, the nonlinear optimization of the physical diffusion motion of the KH algorithm is effective.
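For readers who want to reproduce the benchmark setup, the two test functions can be written in standard-library Python as below. The sketch uses the conventional textbook definitions: the sum-plus-product form with range [−10, 10] is commonly called Schwefel 2.22 (unimodal), and the exponential form with range [−32, 32] is commonly called Ackley (multimodal). The function names are chosen for this example.

```python
import math

def schwefel_2_22(x):
    """Unimodal benchmark: sum |x_i| + prod |x_i|, minimum 0 at x = 0."""
    abs_x = [abs(v) for v in x]
    return sum(abs_x) + math.prod(abs_x)

def ackley(x):
    """Multimodal benchmark with global minimum 0 at x = 0."""
    n = len(x)
    mean_sq = sum(v * v for v in x) / n
    mean_cos = sum(math.cos(2.0 * math.pi * v) for v in x) / n
    return (-20.0 * math.exp(-0.2 * math.sqrt(mean_sq))
            - math.exp(mean_cos) + 20.0 + math.e)
```

Both functions take a list of any dimension, so the same code covers the 10-dimensional experiments and the two-dimensional surface plots.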

The above analysis shows that introducing the optimization coefficient $\lambda$ and the fitness factor $\mu_{fit}$ into the physical diffusion motion of the krill herd is conducive to dynamically adjusting the random diffusion amplitude of the krill individuals and accelerating the convergence speed of the algorithm. Meanwhile, it increases the nonlinearity of the physical diffusion motion and the global exploration ability of the algorithm.

3.2.3. Linear Nearest Neighbor Lasso Step Optimization. When the KH algorithm is used to solve multidimensional complex function optimization problems, its local search ability is weak, and exploitation and exploration are difficult to balance. To enhance the local exploitation and global exploration abilities of the algorithm, the influence of excellent neighbor individuals on the krill herd during evolution is considered, and an improved KH algorithm is proposed in [42]. That algorithm introduces the nearest neighbor lasso operator to mine the neighborhood of potentially excellent individuals and improve the local search ability of krill individuals, but the random parameters introduced in the lasso operator increase the uncertainty of the algorithm. To cope with this problem, we introduce an improved krill herd based on linear nearest neighbor lasso step optimization (LNNLS-KH), which finds the nearest neighbor of each krill individual after the individual position is updated and moves linearly by a defined step to derive a better fitness value. With linearization, the nearest neighbor lasso step of the algorithm changes linearly with the number of iterations, thereby balancing the exploitation and exploration abilities of the algorithm. In the early iterations, a large linear nearest neighbor lasso step is selected so that krill individuals can quickly adjust their positions, which improves the search efficiency of the algorithm. In the later stage of the iteration, the nearest neighbor lasso step decreases linearly to obtain the global optimal solution.

In the krill herd $X = \{X_1, X_2, \ldots, X_n\}$, assuming that the jth krill individual is the nearest neighbor of the ith krill individual, the Euclidean distance between the two krill individuals is defined as follows:

$$\mathrm{distance}_{ij} = \lVert X_i - X_j \rVert, \tag{23}$$

where $\{X_i, X_j\} \subset S$ and $i \ne j$. The equation of the linear nearest neighbor lasso step is defined as follows:

$$\mathrm{step} = \begin{cases} \dfrac{I}{I_{max}} \times \left(X_i - X_j\right), & K_i > K_j, \\[2ex] \dfrac{I}{I_{max}} \times \left(X_j - X_i\right), & K_j > K_i. \end{cases} \tag{24}$$

The fitness function is expressed as equation (18). Therefore, a smaller fitness value means that fewer features are selected at a higher accuracy, i.e., the position of the krill individual is better. The schematic diagram of LNNLS-KH is shown in Figure 5. The new position $Y_k$ of the jth krill individual is expressed as follows:

$$Y_k = \begin{cases} X_j + \dfrac{I}{I_{max}} \times \left(X_i - X_j\right), & K_i > K_j, \\[2ex] X_i + \dfrac{I}{I_{max}} \times \left(X_j - X_i\right), & K_j > K_i. \end{cases} \tag{25}$$

If the ith and jth krill individuals move toward opposite ends of the food, the new position $Y_k$ will be far from the optimal solution after the linear nearest neighbor lasso step optimization, as shown in Figure 6.
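The lasso step of equations (23)–(25) can be sketched as follows. This is a hedged illustration rather than the authors' code: the function names are hypothetical, and the acceptance rule (the candidate $Y_k$ replaces the worse member of the pair only when its fitness is lower) is an assumption about how the comparison between $Y_k$ and $X_i$ or $X_j$ is resolved.

```python
import math

def nearest_neighbor(i, X):
    """Index j != i minimizing the Euclidean distance of equation (23)."""
    return min((j for j in range(len(X)) if j != i),
               key=lambda j: math.dist(X[i], X[j]))

def lnnls_candidate(x_i, x_j, k_i, k_j, it, it_max):
    """Equation (25); returns None when the two fitness values are equal."""
    ratio = it / it_max
    if k_i > k_j:
        return [b + ratio * (a - b) for a, b in zip(x_i, x_j)]
    if k_j > k_i:
        return [a + ratio * (b - a) for a, b in zip(x_i, x_j)]
    return None

def lnnls_step(i, X, K, fitness_fn, it, it_max):
    """Apply one lasso step to krill i in place (minimization assumed)."""
    j = nearest_neighbor(i, X)
    y = lnnls_candidate(X[i], X[j], K[i], K[j], it, it_max)
    if y is None:
        return
    k_y = fitness_fn(y)
    worse = i if K[i] > K[j] else j
    if k_y < K[worse]:          # assumption: greedy replacement
        X[worse], K[worse] = y, k_y
```

A typical use is to call `lnnls_step` once for every individual after the ordinary position update of each iteration. Rejecting candidates that do not improve the fitness covers the case in which the two individuals lie on opposite sides of the food and $Y_k$ ends up far from the optimum.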

Table 2: Confusion matrix.

| Confusion matrix | True condition positive | True condition negative |
|---|---|---|
| Predicted condition positive | True positive (TP) | False positive (FP) |
| Predicted condition negative | False negative (FN) | True negative (TN) |


Table 3: Benchmark functions in the experiment.

| Benchmark function | Dim | Range | $f_{min}$ |
|---|---|---|---|
| $F_1(x) = \sum_{i=1}^{n} \lvert x_i \rvert + \prod_{i=1}^{n} \lvert x_i \rvert$ | 10 | [−10, 10] | 0 |
| $F_2(x) = -20\exp\left(-0.2\sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}\right) - \exp\left(\frac{1}{n}\sum_{i=1}^{n}\cos(2\pi x_i)\right) + 20 + e$ | 10 | [−32, 32] | 0 |

Figure 4: Schwefel 2.22 function and Ackley function graphs for n = 2 (axes $x_1$, $x_2$, and function value). (a) Unimodal benchmark function $F_1$ (Schwefel 2.22). (b) Multimodal benchmark function $F_2$ (Ackley).

Table 4: The statistical results of KH and NO-KH algorithms on two benchmark functions.

| f(x) | Algorithm | Best value | Worst value | Mean value | Standard deviation |
|---|---|---|---|---|---|
| $F_1$ | KH | 1.692E−04 | 1.099E−02 | 1.508E−03 | 3.342E−03 |
| $F_1$ | NO-KH | 3.277E−05 | 9.632E−04 | 4.221E−04 | 3.908E−04 |
| $F_2$ | KH | 5.716E−05 | 2.168 | 0.329 | 0.816 |
| $F_2$ | NO-KH | 8.309E−06 | 1.155 | 0.116 | 0.362 |

Figure 5: Optimization of linear nearest neighbor lasso step for krill individuals at the same end of food (legend: the position of food; the position of krill $X_i$; the position of new krill $Y_k$ after LNNLS; the distance between two krill; the length of LNNLS).

Figure 6: Optimization of linear nearest neighbor lasso step for krill individuals at both ends of food (same legend as Figure 5).


The pseudocode of the LNNLS-KH algorithm is shown in Algorithm 1.

3.3. Analysis of Time Complexity. In the KH algorithm, each krill individual updates its position after the movement induced by other krill individuals, the foraging activity, and the physical diffusion motion, with a time complexity of $O(N)$. After $I_{max}$ iterations, the time complexity of the algorithm is $O(I_{max} \cdot N)$. In the LNNLS-KH algorithm, the modified fitness function and the nonlinear optimization of the physical diffusion motion require hardly any additional calculations, so the time complexity is unchanged. In addition, the linear nearest neighbor lasso step optimization adds the calculations of equations (24) and (25) after each krill individual completes its position update during an iteration, with a time complexity of $O(I_{max} \cdot N)$. Therefore, the total time complexity of the LNNLS-KH algorithm is $O(2I_{max} \cdot N)$, which is of the same order as that of the standard KH algorithm.

3.4. Description of the LNNLS-KH Algorithm for IDS Feature Selection. An IDS is a system that recognizes and processes malicious usage of computers and network resources. The intrusion detection dataset records normal and abnormal traffic, including network traffic data and types of network attacks, and provides data support for the research and development of intrusion detection technology. An IDS is generally composed of data acquisition, data preprocessing, detection units, and response actions, as shown in Figure 7.

The LNNLS-KH algorithm is used to select high-quality feature subsets for the IDS. The features of the intrusion detection dataset are randomly initialized to different real numbers in the range [0, 1], which constitute the position vectors of the krill herd. By calculating the fitness function and carrying out the LNNLS-KH algorithm, the position vectors of the krill herd are constantly updated. The fitness function is determined by the number of selected features and the accuracy of classification, so the position vectors of the krill herd move toward the optimal fitness value. According to [47], it is appropriate to set the feature selection threshold to 0.7. When the maximum number of iterations is reached, the components of the krill position vector that are larger than the threshold are selected. The selected features constitute the feature subset of the intrusion detection data. Furthermore, the selected feature subset is sent to the detection units. Because the k-nearest neighbor (KNN) algorithm is relatively mature in theory, the detection units adopt the KNN algorithm to construct the intrusion detection classifier. Finally, the intrusion detection results are evaluated on the test dataset. The process of the LNNLS-KH algorithm for IDS feature selection is shown in Figure 8.
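The encoding described above, in which each krill is a real-valued vector in [0, 1] and a feature is kept when its component exceeds the threshold 0.7, can be sketched as follows. The function names are hypothetical and serve only to illustrate the mapping from a position vector to a feature subset.

```python
import random

def init_herd(n_krill, n_features, rng):
    """Random initial positions: one value in [0, 1) per feature."""
    return [[rng.random() for _ in range(n_features)]
            for _ in range(n_krill)]

def select_features(position, threshold=0.7):
    """Indices of the features whose position component exceeds the threshold."""
    return [m for m, v in enumerate(position) if v > threshold]
```

For example, the position `[0.1, 0.8, 0.7, 0.95]` selects features 1 and 3; the component equal to 0.7 is not selected because the comparison is strict. The number of returned indices is the quantity $\mathrm{Feature}_{selected}$ used in equation (18).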

4. Results and Discussion

To verify the performance of the LNNLS-KH algorithm in IDS feature selection, we adopt the NSL-KDD network intrusion detection dataset and the CICIDS2017 dataset for experiments.

4.1. Datasets Analysis. The NSL-KDD dataset is a classic dataset in the field of anomaly detection. As an improved version of the KDD CUP 99 dataset, it is currently one of the most reliable and influential intrusion detection datasets. Compared with the KDD CUP 99 dataset, the NSL-KDD dataset eliminates duplicate data, so it contains hardly any redundant records. Meanwhile, the proportion of each type of record in the NSL-KDD dataset has been adjusted to make the class distribution reasonable. Each record in the NSL-KDD dataset includes 41-dimensional features and a classification label. KDDTrain+ and KDDTest+ in the NSL-KDD dataset are selected as the training subset and the test subset. The attacks are divided into four types: denial of service (DoS), scan and probe (Probe), remote to local (R2L), and user to root (U2R). The detailed attack names and the distribution of sample categories are shown in Tables 5 and 6. The features of the NSL-KDD dataset are shown in Table 7.

2e NSL-KDD dataset includes four types of featureswhich are the basic features of TCP connections (9 in total)the contents of TCP connections (13 in total) the time-basednetwork traffic statistics (9 in total) and the host-basednetwork traffic statistics (10 in total) Among all the featuresldquoProtocol_typerdquo ldquoservicerdquo and ldquoflagrdquo are features of char-acter types which need to be preprocessed and mapped toordered values Because the mixed data types of numeric andcharacter are difficult to deal with the one-hot encoding isused to map different characters to different values Forexample the ldquoProtocol_typerdquo feature includes three types ofprotocol denoted by icmp [1 0 0] tcp [0 1 0] andudp [0 0 1] Similarly the 70 attributes in ldquoservicerdquo andthe 11 attributes in ldquoflagrdquo are also numeralized in the sameway 2e 41-dimensional feature is expanded to 122-di-mensional after one-hot encoding At the same time thedataset is normalized to eliminate the influence of features ofdifferent orders of magnitude on the calculation results thusreducing the experimental error 2e data preprocessing ishelpful to improve the accuracy of classification and ensurethe reliability of the results 2e values corresponding toeach feature are normalized to the interval [0 1] and thenormalization expression is as follows

$$X^{*} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}, \tag{26}$$

where $X^{*}$ is the normalized feature value, $X$ is the original feature value, and $X_{\max}$ and $X_{\min}$ represent the maximum and minimum values of the same feature dimension, respectively.
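The preprocessing described above can be illustrated with a short sketch. The Python code below is not the authors' MATLAB implementation; the function names `one_hot` and `min_max_normalize` and the toy `src_bytes` values are assumptions introduced for illustration. It one-hot encodes a categorical column in the order icmp, tcp, udp and applies the min-max scaling of equation (26) column by column.

```python
import numpy as np

def one_hot(values, categories):
    """Map each categorical value to a 0/1 indicator vector over `categories`."""
    index = {c: i for i, c in enumerate(categories)}
    out = np.zeros((len(values), len(categories)))
    for row, v in enumerate(values):
        out[row, index[v]] = 1.0
    return out

def min_max_normalize(X):
    """Scale each column to [0, 1]; constant columns are mapped to 0."""
    X = np.asarray(X, dtype=float)
    x_min = X.min(axis=0)
    x_range = X.max(axis=0) - x_min
    x_range[x_range == 0] = 1.0  # avoid division by zero for constant features
    return (X - x_min) / x_range

# Toy data (hypothetical): four records with a protocol and a byte count.
protocols = ["tcp", "udp", "icmp", "tcp"]
encoded = one_hot(protocols, ["icmp", "tcp", "udp"])
src_bytes = np.array([[0.0], [50.0], [100.0], [200.0]])
features = np.hstack([encoded, min_max_normalize(src_bytes)])
```

Applied to all three character features, this kind of encoding yields 38 + 3 + 70 + 11 = 122 columns, which matches the dimension stated above.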

Although NSL-KDD is a benchmark dataset in the field of network intrusion detection, some of the attack types are outdated due to the rapid development of network technology. Therefore, it hardly reflects the current real network environment. CICIDS2017 is a novel network intrusion detection dataset released by the Canadian Institute for

Figure 7: The framework of IDS. [Block diagram with the components data acquisition, data preprocessing, detection units, and response actions.]


Cybersecurity (CIC) in 2017. The dataset collected traffic data for five days, with only normal traffic on Monday and attacks occurring in the morning and afternoon from Tuesday to Friday. It includes "FTP patator," "SSH patator," "DoS GoldenEye," "DoS Slowhttptest," "DoS Slowloris," "Heartbleed," "Web Attack Brute Force," "Web Attack Sql Injection," "Web Attack XSS," "Infiltration Attack," "Bot," "DDoS," and "PortScan," which are common types of attacks

Figure 8: The process of LNNLS-KH algorithm for IDS feature selection. [Flowchart: input the IDS dataset; initialize the parameters (N, NV, Imax, UB, LB) and the krill herd position; calculate the fitness of individuals; calculate the three actions (movement induced by other krill individuals, foraging activity, and nonlinear physical diffusion motion); apply the genetic operator; update the position and fitness values of individuals; find the nearest krill and calculate the linear lasso step; compare the fitness of the updated position Yk with that of Xi or Xj and keep the better one; update Xgb and Kgb of the global optimal individuals; repeat until the iteration count exceeds Imax; output the optimal solution and the number of selected features; run the KNN algorithm for intrusion detection and evaluate the intrusion detection results.]


in modern networks. The distribution of attack time and types of the CICIDS2017 dataset is shown in Table 8. We use the MachineLearningCVE file in the CICIDS2017 dataset, which contains 78 features and an attack type label. The number and name of each feature are shown in Table 9. Compared with the NSL-KDD dataset, the attack types in the CICIDS2017 dataset are more in line with the situation of modern networks.

4.2. Experimental Results and Discussion of NSL-KDD Dataset. The experiment is conducted in MATLAB R2016a on a 64-bit Windows operating system with an Intel(R) Core(TM) i7-4790 processor. Since the training of the algorithm requires normal and abnormal samples, we mix normal samples and different types of attack samples to construct training sets and test sets for the four attack types. In order to reduce the time of searching the optimal feature

Input: Training set
Output: Global best solution, the number of selected features, and feature selection time

Begin
  Initialize algorithm parameters Nmax, Vf, Dmax, NV, Imax, UB, LB
  Initialize the krill herd position
  Evaluate the fitness of krill individuals and find the individuals with the best and worst fitness values
  for I = 1 to Imax do
    for each krill individual i (i = 1, 2, ..., m) do
      Calculate the three components of motion:
        (1) The motion induced by other krill individuals
        (2) The foraging activity
        (3) The nonlinear optimized physical diffusion
      Implement crossover operator
      Update krill herd position and fitness values
      Calculate the linear nearest neighbor lasso step and new position using equations (24) and (25), and update new fitness values
      if Kyk > Ki or (Kj)
        Leave Ki or (Kj) and delete Kyk
      else
        Leave Kyk and delete Ki or (Kj)
      end if
    end for
    Update Xgb and Kgb of the globally optimal individuals
  end for
  Output the global best solution, the number of selected features, and feature selection time
End

Algorithm 1: The LNNLS-KH algorithm.

Table 5: The distribution of sample categories.

| Attack types | Attack names |
|---|---|
| DoS | Neptune, back, land, pod, smurf, teardrop, mailbomb, Apache2, processtable, udpstorm, worm |
| Probe | Ipsweep, nmap, portsweep, Satan, mscan, saint |
| R2L | ftp_write, guess_passwd, imap, multihop, phf, spy, warezclient, warezmaster, sendmail, named, snmpgetattack, snmpguess, xlock, xsnoop, httptunnel |
| U2R | buffer_overflow, loadmodule, perl, rootkit, ps, sqlattack, xterm |

Table 6: The distribution of sample categories.

| Data category | KDDTrain+ samples | KDDTest+ samples | Total number of samples |
|---|---|---|---|
| Normal | 65120 | 11536 | 76656 |
| DoS | 36944 | 6251 | 43195 |
| Probe | 10786 | 2421 | 13207 |
| R2L | 995 | 2653 | 3648 |
| U2R | 52 | 67 | 119 |
| All | 113897 | 22928 | 136825 |


subset, we randomly select 50% of the Probe attack samples, 10% of the DoS attack samples, 100% of the U2R attack samples, and 100% of the R2L attack samples in the KDDTrain+ dataset as the training dataset, and 100% of the Probe samples, 50% of the DoS samples, 100% of the U2R samples, and 20% of the R2L samples in the KDDTest+ dataset as the test dataset.
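A minimal sketch of the per-class random sampling described above follows, assuming that records are dictionaries with a hypothetical `label` key; the helper `sample_by_class`, the seed, and the toy data are illustrative and not taken from the paper. Only the training fractions are shown, and the mixing in of normal samples, which the experiments also require, is left out.

```python
import random

def sample_by_class(records, fractions, seed=0):
    """Randomly keep `fractions[label]` of the records of each listed class."""
    rng = random.Random(seed)
    by_label = {}
    for rec in records:
        by_label.setdefault(rec["label"], []).append(rec)
    subset = {}
    for label, frac in fractions.items():
        pool = by_label.get(label, [])
        k = round(len(pool) * frac)  # number of records to draw for this class
        subset[label] = rng.sample(pool, k)
    return subset

# Training fractions stated in the text for the KDDTrain+ subset.
TRAIN_FRACTIONS = {"Probe": 0.5, "DoS": 0.1, "U2R": 1.0, "R2L": 1.0}

# Toy records (hypothetical); the real dataset is far larger.
toy = (
    [{"label": "Probe", "id": i} for i in range(10)]
    + [{"label": "DoS", "id": i} for i in range(20)]
    + [{"label": "U2R", "id": i} for i in range(3)]
)
picked = sample_by_class(toy, TRAIN_FRACTIONS)
```

Fixing the seed makes a single draw repeatable; for repeated independent experiments, a different seed per run would be passed instead.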

For the LNNLS-KH algorithm, the maximum number of iterations $I_{\max}$ and the number of krill individuals $N$ are set to 100 and 30, respectively. Following [41], the foraging speed of krill individuals $V_f$ is set to 0.02, the maximum random diffusion rate $D^{\max}$ is set to 0.05, and the maximum induction speed $N^{\max}$ is set to 0.01. Following [47], the threshold θ is set to 0.7. As the LNNLS-KH algorithm is designed to ensure high accuracy first and to reduce the number of features second, the weight factor α in the fitness function is set to 0.02.
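The role of the weight factor α can be illustrated with a small sketch. The form below, a weighted sum of the classification error and the fraction of features kept, is a common choice in wrapper-based feature selection and is assumed here purely for illustration; it is not claimed to be the exact fitness function of the LNNLS-KH algorithm. The function name and arguments are hypothetical.

```python
def fitness(accuracy, n_selected, n_total, alpha=0.02):
    """Weighted sum of classification error and fraction of features kept.

    Lower is better. This particular form is an assumption for illustration.
    """
    if n_selected < 1 or n_selected > n_total:
        raise ValueError("n_selected must be in 1..n_total")
    error_term = (1.0 - alpha) * (1.0 - accuracy)
    size_term = alpha * n_selected / n_total
    return error_term + size_term

# Hypothetical candidates on a 41-feature dataset.
small_subset = fitness(accuracy=0.97, n_selected=8, n_total=41)
large_subset = fitness(accuracy=0.97, n_selected=16, n_total=41)
less_accurate = fitness(accuracy=0.95, n_selected=8, n_total=41)
```

With α = 0.02, halving the subset size changes the value far less than a two-point drop in accuracy does, which is consistent with the stated priority of accuracy over feature count.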

$$\mathrm{FPR} = \frac{FP}{TN + FP}, \tag{27}$$

$$\mathrm{DR} = \frac{TP}{TP + FN}. \tag{28}$$

We adopt the iterative curve of the global optimal fitness value, feature selection time, test set detection time, data dimension after feature selection, classification accuracy, detection rate (DR), and false positive rate (FPR) as evaluation measures of feature selection for IDS. The accuracy represents the ratio of the correctly classified samples to the total number of samples, which is defined in equation (19). FPR, also known as false alarm rate (FAR), represents the ratio of normal samples that are incorrectly detected as intrusions to all normal samples, as shown in

Table 7: The features of NSL-KDD dataset.

| Classification of features | Number | Serial number and name of features |
|---|---|---|
| The basic characteristics of TCP connections | 9 | (1) duration, (2) protocol_type, (3) service, (4) flag, (5) src_bytes, (6) dst_bytes, (7) land, (8) wrong_fragment, (9) urgent |
| The content characteristics of a TCP connection | 13 | (10) hot, (11) num_failed_logins, (12) logged_in, (13) num_compromised, (14) root_shell, (15) num_root, (16) su_attempted, (17) num_file_creations, (18) num_shells, (19) num_access_files, (20) num_outbound_cmds, (21) is_host_login, (22) is_guest_login |
| Time-based statistical characteristics of network traffic | 9 | (23) count, (24) srv_count, (25) serror_rate, (26) srv_serror_rate, (27) rerror_rate, (28) srv_rerror_rate, (29) same_srv_rate, (30) diff_srv_rate, (31) srv_diff_host_rate |
| Host-based network traffic statistics | 10 | (32) dst_host_count, (33) dst_host_srv_count, (34) dst_host_same_srv_rate, (35) dst_host_diff_srv_rate, (36) dst_host_same_src_port_rate, (37) dst_host_srv_diff_host_rate, (38) dst_host_serror_rate, (39) dst_host_srv_serror_rate, (40) dst_host_rerror_rate, (41) dst_host_srv_rerror_rate |

Table 8: Attack time and attack types of the CICIDS2017 dataset.

| Time | Type | Label | Amount | Total |
|---|---|---|---|---|
| Monday | Normal | BENIGN | 529918 | 529918 |
| Tuesday | Normal | BENIGN | 432074 | 445909 |
| | Brute force | FTP patator | 7938 | |
| | Brute force | SSH patator | 5897 | |
| Wednesday | Normal | BENIGN | 440031 | 692703 |
| | DoS | DoS GoldenEye | 10293 | |
| | DoS | DoS slowhttptest | 5499 | |
| | DoS | DoS slowloris | 5796 | |
| | DoS | Heartbleed | 11 | |
| Thursday morning | Normal | BENIGN | 168186 | 170366 |
| | Web attack | Web attack brute force | 1507 | |
| | Web attack | Web attack sql injection | 21 | |
| | Web attack | Web attack XSS | 652 | |
| Thursday afternoon | Normal | BENIGN | 288566 | 288602 |
| | Infiltration | Infiltration | 36 | |
| Friday morning | Normal | BENIGN | 189067 | 191033 |
| | Botnet | Bot | 1966 | |
| Friday afternoon (1) | Normal | BENIGN | 97718 | 225745 |
| | DDoS | DDoS | 128027 | |
| Friday afternoon (2) | Normal | BENIGN | 127537 | 286467 |
| | PortScan | PortScan | 158930 | |


equation (27). DR, also known as recall or sensitivity, represents the probability that an abnormal sample is correctly detected, as shown in equation (28). The crossover-mutation PSO (CMPSO) algorithm [47], ACO algorithm [48], KH algorithm [41], and IKH algorithm [9] are used for comparison. The experimental results on the Probe, DoS, R2L, and U2R datasets are as follows.
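As a hedged illustration of equations (27) and (28), the following Python sketch computes accuracy, FPR, and DR from confusion-matrix counts. The function name and the example counts are made up for illustration, and attacks are treated as the positive class.

```python
def detection_metrics(tp, tn, fp, fn):
    """Return (accuracy, false positive rate, detection rate) from counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total  # correctly classified / all samples
    fpr = fp / (tn + fp)          # normal samples wrongly flagged as intrusions
    dr = tp / (tp + fn)           # intrusions correctly detected (recall)
    return accuracy, fpr, dr

# Hypothetical test set: 100 attack samples and 100 normal samples.
acc, fpr, dr = detection_metrics(tp=90, tn=95, fp=5, fn=10)
```

In this toy case, 5 of the 100 normal samples are flagged and 90 of the 100 attacks are caught, so FPR and DR are driven by different rows of the confusion matrix and can move independently.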

To reflect the performance of the LNNLS-KH algorithm intuitively, the convergence curves of the fitness function for the Probe, DoS, U2R, and R2L datasets are shown in Figure 9. The results show that the LNNLS-KH algorithm achieves a good fitness function value when the number of iterations reaches about 20, which demonstrates its strong exploitation ability and good convergence performance. As the number of iterations increases, the other algorithms show varying degrees of convergence stagnation, while the LNNLS-KH algorithm repeatedly jumps out of local optima and finds the global optimal solution with better fitness. The fitness function values after 100 iterations reach 0.0328, 0.0393, 0.0292, and 0.0036, respectively, for the four attack datasets, showing excellent exploration ability. Therefore, compared with the CMPSO, ACO, KH, and IKH algorithms, the LNNLS-KH algorithm exhibits faster convergence speed and stronger abilities of exploitation and exploration.

The results of different feature selection algorithms are shown in Table 10. The number in front of the brackets indicates the number of features after feature selection, and the specific feature numbers are listed in the brackets. The comparison of feature selection dimensions is shown in Figure 10, where different colours are used to distinguish the five algorithms. The proposed LNNLS-KH algorithm, marked in red, lies in the innermost circle of Figure 10 for the Probe, DoS, U2R, and R2L datasets. This indicates that, compared with the other four feature selection algorithms, the LNNLS-KH algorithm retains the fewest features while ensuring accuracy. According to Figure 10, the LNNLS-KH algorithm selects 7 main features of the NSL-KDD dataset on average, accounting

Figure 9: Convergence curve of fitness functions for the four attack datasets. [Four panels: Probe, DoS, R2L, and U2R; x-axis: number of iterations (0 to 100); y-axis: fitness function; curves: CMPSO, ACO, KH, IKH, and LNNLS-KH.]


for 17.07% of the total number of features. Compared with the CMPSO, ACO, KH, and IKH algorithms, the proposed LNNLS-KH algorithm reduces the number of features by 44%, 42.86%, 34.88%, and 24.32%, respectively, on the datasets of the four attack types. Meanwhile, the total number of features in the four types of attack datasets is reduced by 37.43% on average.
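The percentages above can be reproduced from the feature counts in Table 10. The following sketch is only a check of the arithmetic, assuming that each reduction is computed on the mean number of features over the four attack datasets; the variable names are illustrative.

```python
# Numbers of selected features for Probe, DoS, U2R, and R2L, read from Table 10.
selected = {
    "CMPSO": [14, 16, 9, 11],
    "ACO": [15, 16, 8, 10],
    "KH": [13, 12, 8, 10],
    "IKH": [11, 12, 6, 8],
    "LNNLS-KH": [8, 10, 3, 7],
}

def mean(values):
    return sum(values) / len(values)

ours = mean(selected["LNNLS-KH"])   # average features kept by LNNLS-KH
share_of_all = 100 * ours / 41      # share of the 41 original features

baseline_means = {
    name: mean(counts) for name, counts in selected.items() if name != "LNNLS-KH"
}
# Relative reduction (%) with respect to each baseline's mean feature count.
reduction = {name: 100 * (m - ours) / m for name, m in baseline_means.items()}

# Relative reduction (%) with respect to the mean of the four baselines.
baseline_avg = mean(list(baseline_means.values()))
overall = 100 * (baseline_avg - ours) / baseline_avg
```

Under this reading, the reductions are relative percentages rather than differences in feature counts.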

To further evaluate the performance of the feature selection algorithms, we show the feature selection time and detection time of the five algorithms in Table 11. The feature selection time represents the time of filtering out redundant features. The detection time represents the time from inputting the most representative feature subsets into the KNN classifier to the end of detection. It can be seen from Table 11 that the feature selection time of the standard KH algorithm is shorter than that of the CMPSO and ACO algorithms, which indicates that the KH algorithm searches faster. In addition, compared with the standard KH algorithm, the feature selection time of the LNNLS-KH algorithm is longer, which is mainly due to the nonlinear optimization of the physical diffusion motion and the linear nearest neighbor lasso step optimization performed after the krill herd position is updated. Although the feature selection time is increased, the convergence speed and global search ability are greatly improved. At the same time, the LNNLS-KH algorithm removes redundant features, which considerably increases the detection speed. In comparison with the other four feature selection algorithms, the detection time of the LNNLS-KH algorithm is reduced by 16.83%, 16.91%, 8.94%, and 6.96% on average on the test dataset samples of Probe, DoS, R2L, and U2R.

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the test dataset is classified using the KNN classifier. The classification accuracy of the different algorithms is shown in Table 12. Comparing the results, we find that the LNNLS-KH feature selection algorithm achieves a classification accuracy above 90% for the Probe, DoS, U2R, and R2L test

Table 9: The number and name of the features in the CICIDS2017 dataset.

| Feature number | Feature name | Feature number | Feature name | Feature number | Feature name |
|---|---|---|---|---|---|
| 1 | Destination port | 27 | Bwd IAT mean | 53 | Average packet size |
| 2 | Flow duration | 28 | Bwd IAT std | 54 | Avg fwd segment size |
| 3 | Total fwd packets | 29 | Bwd IAT max | 55 | Avg bwd segment size |
| 4 | Total backward packets | 30 | Bwd IAT min | 56 | Fwd header length |
| 5 | Total length of fwd packets | 31 | Fwd PSH flags | 57 | Fwd avg bytes/bulk |
| 6 | Total length of bwd packets | 32 | Bwd PSH flags | 58 | Fwd avg packets/bulk |
| 7 | Fwd packet length max | 33 | Fwd URG flags | 59 | Fwd avg bulk rate |
| 8 | Fwd packet length min | 34 | Bwd URG flags | 60 | Bwd avg bytes/bulk |
| 9 | Fwd packet length mean | 35 | Fwd header length | 61 | Bwd avg packets/bulk |
| 10 | Fwd packet length std | 36 | Bwd header length | 62 | Bwd avg bulk rate |
| 11 | Bwd packet length max | 37 | Fwd packets/s | 63 | Subflow fwd packets |
| 12 | Bwd packet length min | 38 | Bwd packets/s | 64 | Subflow fwd bytes |
| 13 | Bwd packet length mean | 39 | Min packet length | 65 | Subflow bwd packets |
| 14 | Bwd packet length std | 40 | Max packet length | 66 | Subflow bwd bytes |
| 15 | Flow bytes/s | 41 | Packet length mean | 67 | Init_Win_bytes_forward |
| 16 | Flow packets/s | 42 | Packet length std | 68 | Init_Win_bytes_backward |
| 17 | Flow IAT mean | 43 | Packet length variance | 69 | act_data_pkt_fwd |
| 18 | Flow IAT std | 44 | FIN flag count | 70 | min_seg_size_forward |
| 19 | Flow IAT max | 45 | SYN flag count | 71 | Active mean |
| 20 | Flow IAT min | 46 | RST flag count | 72 | Active std |
| 21 | Fwd IAT total | 47 | PSH flag count | 73 | Active max |
| 22 | Fwd IAT mean | 48 | ACK flag count | 74 | Active min |
| 23 | Fwd IAT std | 49 | URG flag count | 75 | Idle mean |
| 24 | Fwd IAT max | 50 | CWE flag count | 76 | Idle std |
| 25 | Fwd IAT min | 51 | ECE flag count | 77 | Idle max |
| 26 | Bwd IAT total | 52 | Down/up ratio | 78 | Idle min |

Figure 10: Comparison of feature selection dimensions produced by different algorithms. [Radar-style chart with axes Probe, DoS, U2R, and R2L; radial scale: 0 to 20 features; series: CMPSO, ACO, KH, IKH, and LNNLS-KH.]


dataset samples. Furthermore, compared with the mean of the other four algorithms, the LNNLS-KH algorithm improves the classification accuracy on the Probe, DoS, U2R, and R2L test dataset samples by 9.95%, 12.04%, 9.47%, and 8.66%, respectively.
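The accuracy gains quoted above appear to be differences in percentage points between the LNNLS-KH algorithm and the mean of the four baseline algorithms. The sketch below, with illustrative names, reproduces them from Table 12 under that assumption.

```python
# Classification accuracy (%) from Table 12,
# ordered CMPSO, ACO, KH, IKH, LNNLS-KH.
accuracy = {
    "Probe": [80.46, 86.56, 92.42, 93.74, 98.24],
    "DoS": [81.74, 83.36, 86.03, 88.74, 97.01],
    "U2R": [82.74, 84.57, 85.59, 91.89, 95.67],
    "R2L": [78.70, 81.62, 88.78, 90.49, 93.56],
}

def gain_over_baselines(row):
    """Percentage-point gain of the last entry over the mean of the others."""
    *baselines, ours = row
    return ours - sum(baselines) / len(baselines)

gains = {name: gain_over_baselines(row) for name, row in accuracy.items()}
mean_gain = sum(gains.values()) / len(gains)
```

Averaging the four gains gives roughly 10.03, which agrees with the overall NSL-KDD accuracy improvement reported in the abstract.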

Table 13 shows the false positive rate and detection rate of the feature subsets produced by different feature selection algorithms. To visualize the difference, we show the comparison in Figure 11. For the Probe, DoS, U2R, and R2L datasets, the average false positive rate of the LNNLS-KH feature selection algorithm is 4.00%, which is 20.70%, 15.30%, 8.88%, and 3.34% lower than that of the CMPSO, ACO, KH, and IKH algorithms, respectively. Similarly, for the detection rate, the proposed LNNLS-KH feature selection algorithm exhibits excellent performance. The average detection rate of

Table 10: The feature selection results of different feature selection algorithms (NSL-KDD dataset).

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Probe | 14 (2, 3, 4, 7, 8, 10, 11, 17, 19, 20, 21, 27, 30, 33) | 15 (1, 3, 4, 6, 15, 16, 17, 19, 21, 23, 29, 35, 39, 40, 41) | 13 (3, 4, 5, 7, 8, 13, 14, 18, 19, 21, 26, 28, 40) | 11 (2, 3, 5, 8, 10, 17, 18, 29, 34, 35, 41) | 8 (3, 4, 8, 11, 15, 29, 34, 40) |
| DoS | 16 (3, 4, 5, 6, 8, 13, 14, 17, 18, 22, 23, 26, 30, 32, 35, 41) | 16 (3, 4, 7, 12, 14, 19, 20, 25, 27, 28, 30, 33, 34, 37, 40, 41) | 12 (2, 3, 4, 5, 8, 9, 12, 15, 19, 24, 26, 30) | 12 (2, 3, 4, 6, 12, 18, 20, 22, 27, 28, 30, 31) | 10 (3, 4, 6, 15, 17, 19, 20, 21, 30, 37) |
| U2R | 9 (3, 4, 5, 9, 12, 19, 32, 33, 41) | 8 (3, 4, 6, 8, 20, 24, 33, 36) | 8 (3, 4, 10, 12, 19, 23, 31, 32) | 6 (3, 10, 11, 21, 36, 39) | 3 (3, 33, 36) |
| R2L | 11 (2, 3, 4, 8, 21, 22, 25, 27, 37, 40, 41) | 10 (3, 4, 7, 12, 17, 21, 29, 37, 38, 40) | 10 (2, 3, 4, 6, 13, 18, 19, 22, 32, 41) | 8 (3, 4, 5, 8, 11, 14, 21, 31) | 7 (2, 3, 4, 10, 15, 21, 36) |

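Table 11: Feature selection time and detection time of different feature selection algorithms (NSL-KDD dataset).

Time of feature selection (second):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Probe | 5231.78 | 4998.14 | 4745.33 | 5348.87 | 5490.48 |
| DoS | 7892.35 | 7630.86 | 7168.52 | 8038.16 | 8296.92 |
| U2R | 154.87 | 147.29 | 144.18 | 157.79 | 172.24 |
| R2L | 2556.75 | 2369.08 | 2240.92 | 2669.51 | 2727.70 |

Time of detection (second):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Probe | 37.13 | 38.23 | 35.30 | 34.05 | 31.06 |
| DoS | 118.69 | 118.15 | 106.66 | 105.14 | 98.44 |
| U2R | 0.087 | 0.086 | 0.086 | 0.086 | 0.078 |
| R2L | 9.55 | 9.13 | 9.07 | 8.62 | 8.03 |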

Table 12: The classification accuracy of different feature selection algorithms (NSL-KDD dataset).

| Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%) |
|---|---|---|---|---|---|
| Probe | 80.46 | 86.56 | 92.42 | 93.74 | 98.24 |
| DoS | 81.74 | 83.36 | 86.03 | 88.74 | 97.01 |
| U2R | 82.74 | 84.57 | 85.59 | 91.89 | 95.67 |
| R2L | 78.70 | 81.62 | 88.78 | 90.49 | 93.56 |

Figure 11: Comparison of classification FPR and DR of different feature selection algorithms. (a) FPR of different feature selection algorithms. (b) DR of different feature selection algorithms. [Bar charts over the Probe, DoS, U2R, and R2L datasets; y-axes: FPR (%) and DR (%); series: CMPSO, ACO, KH, IKH, and LNNLS-KH.]


the LNNLS-KH algorithm is 96.48%, which is 13.47%, 9.32%, 7.02%, and 4.72% higher than that of the CMPSO, ACO, KH, and IKH feature selection algorithms, respectively.

In conclusion, the LNNLS-KH feature selection algorithm performs well in terms of the global optimal fitness iteration curve, test set detection time, number of dimensions of the feature subset, classification accuracy, false positive rate, and detection rate. Although the offline training time of the LNNLS-KH algorithm is longer than that of the CMPSO, ACO, KH, and IKH algorithms, its lower feature dimension reduces the detection time. Moreover, the algorithm has faster convergence speed, higher detection accuracy and detection rate, and a lower false positive rate.

4.3. Experimental Results and Discussion of CICIDS2017 Dataset. The experiment is conducted in MATLAB R2016a on a 64-bit Windows operating system with an Intel(R) Core(TM) i7-4790 processor. The MachineLearningCVE file in the CICIDS2017 dataset includes 8 CSV files of all traffic data, which contain 78 features plus an attack type tag after removing some duplicate features. We annotate traffic records according to the different attack periods and types, and we standardize and normalize the dataset. Because the CSV files contain an excessive amount of data, training the model on the host would take excessively long and converge slowly. Therefore, we simplified and reintegrated these CSV data files while preserving the original attack timing features. We selected a total of 12090 records and 5 types of traffic, including 1 type of normal traffic and 4 types of attack traffic, namely, "DoS," "DDoS," "PortScan," and "WebAttack." The data are randomly divided into training sets and test sets in a 2 : 1 ratio, with independent and repeated experiments.

The CMPSO, ACO, KH, and IKH algorithms are used for comparison with the LNNLS-KH algorithm. The preprocessed Normal, DoS, DDoS, PortScan, and WebAttack subsets are input into the algorithm model successively, and the dimension and feature subsets of feature selection are obtained. We adopt the KNN classification model as the classifier and obtain the accuracy of intrusion detection on the test set data. The results of feature selection dimension for the CICIDS2017 dataset are shown in Table 14. The LNNLS-KH algorithm selects different features for different attack types. For example, the selected features of the DoS subset are "Total Length of Bwd Packets," "Fwd Packet Length Min," "Flow IAT Min," "FIN Flag Count," "RST Flag Count," "URG Flag Count," "Bwd Avg Packets/Bulk," "Idle Mean," and "Idle Std." For the WebAttack subset, "Total Fwd Packets," "Bwd IAT Max," "Bwd PSH Flags," "Fwd Packets/s," "Bwd Avg Packets/Bulk," "Subflow Fwd Bytes," "Active Max," and "Idle Max" are selected as attack features by the LNNLS-KH algorithm. The algorithm reduces the feature dimension of the IDS dataset while ensuring high accuracy. The average feature dimension selected by the LNNLS-KH algorithm is 10.2, accounting for 13.08% of the total number of features in the CICIDS2017 dataset. It decreases the

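Table 13: The classification FPR and DR of different feature selection algorithms (NSL-KDD dataset).

False positive rate (FPR) (%):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Probe | 22.37 | 18.04 | 8.50 | 4.05 | 1.18 |
| DoS | 21.27 | 14.08 | 11.45 | 7.88 | 2.85 |
| U2R | 24.51 | 21.04 | 16.13 | 8.45 | 4.30 |
| R2L | 30.66 | 24.05 | 15.42 | 8.99 | 7.67 |

Detection rate (DR) (%):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Probe | 82.32 | 89.18 | 95.01 | 95.22 | 97.73 |
| DoS | 79.12 | 82.08 | 83.77 | 85.23 | 96.80 |
| U2R | 87.02 | 89.79 | 90.14 | 93.67 | 95.52 |
| R2L | 83.56 | 87.56 | 88.91 | 92.89 | 95.85 |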

Figure 12: Comparison of feature selection time and intrusion detection time for different feature selection algorithms. (a) Feature selection time for different feature selection algorithms. (b) Intrusion detection time of different feature selection algorithms. [Horizontal bar charts over the Normal, DoS, DDoS, PortScan, and WebAttack subsets; x-axes: time of feature selection (second), 0 to 10000, and time of intrusion detection (second), 0 to 2.5; series: CMPSO, ACO, KH, IKH, and LNNLS-KH.]


number of features by 57.85%, 52.34%, 27.14%, and 25%, respectively, compared with the CMPSO, ACO, KH, and IKH algorithms.

Figure 12 shows the feature selection time and intrusion detection time of the five feature selection algorithms to further evaluate their performance. It can be seen from Figure 12(a) that, in the feature selection stage, the LNNLS-KH algorithm takes longer to find the optimal feature subset because of the linear nearest neighbor lasso step optimization performed after the position update of the krill herd. Compared with the KH and IKH algorithms, it increases the time by an average of 14.38% and 9.32%, respectively. Although the LNNLS-KH algorithm takes more computation time, its convergence speed and global search ability are improved. Figure 12(b) shows the intrusion detection time of the five algorithms, that is, the time the KNN classifier takes to classify the sample dataset after the feature subset has been found, excluding the time of searching for the optimal feature subset. Because the feature dimension selected by the LNNLS-KH algorithm is low, less data is processed when classifying the detection samples, which reduces the classification detection time. Compared with the CMPSO, ACO, KH, and IKH algorithms, the intrusion detection time of the LNNLS-KH algorithm is reduced by 6.52%, 5.17%, 2.14%, and 2.28% on average, respectively.
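The link between feature dimension and detection time can be seen in a minimal KNN sketch. The code below is an illustrative NumPy implementation, not the classifier used in the experiments; the function name, the value k = 3, and the toy data are assumptions. The distance computation touches only the selected columns, so its cost grows with the number of retained features.

```python
import numpy as np

def knn_predict(train_X, train_y, test_X, feature_idx, k=3):
    """Classify test rows by majority vote of the k nearest training rows,
    using only the columns listed in `feature_idx`."""
    A = np.asarray(train_X, dtype=float)[:, feature_idx]
    B = np.asarray(test_X, dtype=float)[:, feature_idx]
    train_y = np.asarray(train_y)
    # Pairwise squared Euclidean distances, shape (n_test, n_train).
    d2 = ((B[:, None, :] - A[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    preds = []
    for row in nearest:
        labels, counts = np.unique(train_y[row], return_counts=True)
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)

# Toy data (hypothetical): label 0 = normal, label 1 = attack.
train_X = [[0.0, 0.9, 0.1], [0.1, 0.1, 0.2], [0.9, 0.8, 0.9], [1.0, 0.2, 0.8]]
train_y = [0, 0, 1, 1]
test_X = [[0.05, 0.5, 0.15], [0.95, 0.5, 0.85]]

# Only columns 0 and 2 are used, standing in for a selected feature subset.
pred = knn_predict(train_X, train_y, test_X, feature_idx=[0, 2], k=3)
```

Here the middle column carries no class information, so dropping it leaves the predictions unchanged while shrinking the distance computation, which mirrors the effect described above on a small scale.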

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the KNN classifier is used to classify the test dataset. The classification accuracy of the different algorithms is shown in Table 15. For the five types of subsets, the average classification accuracy of the proposed LNNLS-KH algorithm is 95.86%. In particular, the classification accuracy reaches 97.55% for the PortScan subset. Compared with the other four feature selection methods, the LNNLS-KH algorithm has an average increase of 3.11%, 8.52%, 8.58%, 2.45%, and 4.29% on the

Table 14: The number of feature selection for different algorithms (CICIDS2017 dataset).

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Normal | 28 (3, 7, 13, 15, 16, 17, 20, 22, 24, 26, 30, 35, 37, 38, 42, 43, 44, 45, 46, 49, 50, 56, 59, 62, 63, 64, 65, 76) | 25 (1, 3, 4, 7, 10, 11, 12, 13, 15, 19, 29, 32, 34, 35, 37, 43, 46, 47, 51, 55, 56, 58, 73, 76, 78) | 14 (11, 19, 33, 39, 43, 49, 55, 56, 58, 65, 66, 68, 71, 73) | 14 (5, 10, 19, 20, 21, 23, 27, 33, 43, 56, 69, 70, 73, 78) | 8 (6, 12, 16, 32, 38, 50, 54, 73) |
| DoS | 24 (1, 3, 4, 13, 16, 17, 24, 26, 30, 33, 35, 39, 40, 44, 48, 51, 53, 57, 58, 59, 60, 62, 67, 70) | 19 (3, 6, 12, 13, 15, 26, 35, 39, 51, 55, 60, 61, 66, 69, 71, 73, 75, 77, 78) | 13 (8, 16, 21, 30, 45, 50, 52, 57, 59, 63, 66, 67) | 14 (2, 12, 15, 16, 19, 21, 32, 34, 44, 46, 65, 68, 76, 77) | 9 (6, 8, 20, 44, 46, 49, 61, 75, 76) |
| DDoS | 29 (15, 18, 19, 20, 23, 25, 26, 33, 34, 35, 38, 39, 42, 43, 46, 47, 49, 51, 55, 56, 57, 59, 60, 61, 62, 63, 71, 72, 78) | 27 (6, 9, 10, 13, 16, 19, 24, 28, 31, 41, 42, 45, 47, 48, 50, 51, 52, 53, 54, 56, 59, 60, 61, 62, 65, 68, 72) | 21 (10, 12, 13, 15, 18, 23, 27, 30, 34, 35, 41, 42, 45, 55, 61, 63, 65, 66, 68, 70, 76) | 18 (1, 11, 13, 14, 19, 24, 32, 35, 36, 40, 42, 47, 51, 57, 60, 69, 70, 75) | 14 (2, 5, 8, 9, 11, 22, 26, 33, 41, 43, 47, 51, 74, 77) |
| PortScan | 24 (1, 3, 6, 15, 16, 28, 30, 33, 35, 37, 44, 45, 52, 56, 59, 60, 61, 63, 65, 68, 70, 75, 77, 78) | 21 (1, 2, 6, 10, 15, 17, 26, 27, 29, 39, 42, 43, 46, 49, 58, 61, 66, 69, 70, 71, 76) | 14 (15, 20, 22, 27, 37, 44, 49, 50, 53, 59, 62, 65, 67, 78) | 15 (1, 24, 30, 32, 33, 43, 49, 53, 54, 58, 60, 61, 63, 64, 69) | 12 (2, 6, 15, 24, 25, 28, 32, 57, 59, 63, 66, 76) |
| WebAttack | 16 (2, 7, 26, 29, 45, 47, 50, 52, 53, 54, 63, 66, 68, 69, 72, 78) | 15 (3, 9, 10, 12, 19, 26, 40, 46, 50, 54, 64, 65, 68, 69, 73) | 8 (1, 17, 19, 36, 48, 49, 53, 60) | 7 (14, 17, 35, 39, 44, 48, 54) | 8 (3, 29, 32, 37, 61, 64, 73, 77) |

Table 15: The classification accuracy of different feature selection algorithms (CICIDS2017 dataset).

| Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%) |
|---|---|---|---|---|---|
| Normal | 89.78 | 89.06 | 92.70 | 94.58 | 94.64 |
| DoS | 77.03 | 82.69 | 90.90 | 93.34 | 94.51 |
| DDoS | 81.73 | 86.94 | 91.85 | 88.19 | 95.76 |
| PortScan | 92.38 | 95.64 | 95.05 | 97.35 | 97.55 |
| WebAttack | 89.12 | 93.08 | 93.77 | 94.26 | 96.85 |

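Table 16: The classification FPR and DR of different feature selection algorithms (CICIDS2017 dataset).

False positive rate (FPR) (%):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Normal | 9.25 | 8.72 | 6.41 | 4.93 | 3.67 |
| DoS | 5.41 | 4.48 | 4.06 | 2.83 | 1.94 |
| DDoS | 6.85 | 4.92 | 4.54 | 6.33 | 3.18 |
| PortScan | 4.65 | 3.02 | 2.84 | 1.86 | 1.16 |
| WebAttack | 5.33 | 3.16 | 2.52 | 2.11 | 1.60 |

Detection rate (DR) (%):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Normal | 88.05 | 88.51 | 89.25 | 92.46 | 93.89 |
| DoS | 72.57 | 82.89 | 87.86 | 92.56 | 92.64 |
| DDoS | 79.03 | 83.47 | 90.22 | 87.52 | 92.98 |
| PortScan | 88.25 | 93.80 | 94.33 | 95.14 | 95.42 |
| WebAttack | 87.40 | 91.35 | 92.19 | 92.94 | 94.77 |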


Normal, DoS, DDoS, PortScan, and WebAttack subsets, respectively. Table 16 shows the classification FPR and DR of the different feature selection algorithms on the test sets. On all five test sets, the LNNLS-KH algorithm has a lower FPR and a higher DR than the other four algorithms.

We propose the LNNLS-KH algorithm, a novel feature selection algorithm for intrusion detection. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the algorithm has good feature selection performance and improves the efficiency of intrusion detection.

5. Conclusions

With the rapid development of network technology, intrusion detection plays an increasingly important role in network security. However, the "dimensional disaster" caused by massive data results in problems such as slow response and poor accuracy of the intrusion detection system. The KH algorithm is a population-based swarm intelligence optimization method that shows good performance in high-dimensional data processing, providing a new approach for reducing the dimension of intrusion detection data and selecting useful features. In this paper, an improved KH algorithm named LNNLS-KH is proposed for feature selection of IDS datasets by linear nearest neighbor lasso optimization. The LNNLS-KH algorithm introduces a new fitness function, which is composed of the number of selected features and the classification accuracy. Nonlinear optimization is introduced into the physical diffusion motion of krill individuals to accelerate the convergence of the algorithm. Moreover, the linear nearest neighbor lasso step optimization is proposed to balance the exploration and exploitation abilities and to obtain the global optimal solution of the feature subset effectively. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the LNNLS-KH algorithm retains 7 and 10.2 features on average, respectively, which greatly reduces the dimension of the features. In the NSL-KDD dataset, features are reduced by 44%, 42.86%, 34.88%, and 24.32% compared with the CMPSO, ACO, KH, and IKH algorithms, and in the CICIDS2017 dataset, they are reduced by 57.85%, 52.34%, 27.14%, and 25%, respectively. In addition, the classification accuracy of the LNNLS-KH feature selection algorithm is increased by 10.03% and 5.39%, and the time of intrusion detection is reduced by 12.41% and 4.03% on the two datasets. Furthermore, the LNNLS-KH algorithm enhances the ability of jumping out of local optimal solutions and shows good performance in the optimal fitness iteration curve, false positive rate of detection, and convergence speed, which demonstrates that the proposed LNNLS-KH algorithm is an efficient feature selection method for network intrusion detection.

In this research, we noted that the initialization of the LNNLS-KH algorithm has a certain degree of randomness. Therefore, we conducted independent and repeated experiments to mitigate its effect, and the results were reasonable and convincing. Although the proposed algorithm shows encouraging performance, it could be further improved.

In future work, we will consider using data balancing techniques to preprocess the experimental dataset to obtain more accurate feature selection results and stronger algorithm stability. Meanwhile, we will combine the LNNLS-KH algorithm with other algorithms to improve the exploration and exploitation abilities, thereby further shortening the time of feature subset training and classification detection. In addition, as the LNNLS-KH algorithm is universally applicable, it can be applied to more feature selection systems and to optimization problems in other fields.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was sponsored by the National Key Research and Development Program of China (Grants 2018YFB0804002 and 2017YFB0803204), National Natural Science Foundation of P.R. China (Grant 72001191), Henan Natural Science Foundation (Grant 202300410442), and Henan Philosophy and Social Science Program (Grant 2020CZH009).

References

[1] W. Wei and C. Guo, "A text semantic topic discovery method based on the conditional co-occurrence degree," Neurocomputing, vol. 368, pp. 11–24, 2019.

[2] C.-R. Wang, R.-F. Xu, S.-J. Lee, and C.-H. Lee, "Network intrusion detection using equality constrained-optimization-based extreme learning machines," Knowledge-Based Systems, vol. 147, pp. 68–80, 2018.

[3] G.-G. Wang, A. H. Gandomi, A. H. Alavi, and D. Gong, "A comprehensive review of krill herd algorithm: variants, hybrids and applications," Artificial Intelligence Review, vol. 51, no. 1, pp. 119–148, 2019.

[4] J. Amudhavel, D. Sathian, R. S. Raghav, et al., "A fault tolerant distributed self-organization in peer to peer (p2p) using krill herd optimization," in Proceedings of the 2015 International Conference on Advanced Research in Computer Science Engineering & Technology (ICARCSET 2015), pp. 1–5, Unnao, India, 2015.

[5] L. M. Abualigah, A. T. Khader, and E. S. Hanandeh, "Hybrid clustering analysis using improved krill herd algorithm," Applied Intelligence, vol. 48, no. 11, pp. 4047–4071, 2018.

[6] P. A. Kowalski and S. Łukasik, "Training neural networks with krill herd algorithm," Neural Processing Letters, vol. 44, no. 1, pp. 5–17, 2016.

[7] C. Stasinakis, G. Sermpinis, I. Psaradellis, and T. Verousis, "Krill-Herd Support Vector Regression and heterogeneous autoregressive leverage: evidence from forecasting and trading commodities," Quantitative Finance, vol. 16, no. 12, pp. 1901–1915, 2016.


[8] L. Wang, P. Jia, T. Huang, S. Duan, J. Yan, and L. Wang, "A novel optimization technique to improve gas recognition by electronic noses based on the enhanced krill herd algorithm," Sensors, vol. 16, no. 8, p. 1275, 2016.

[9] R. Jensi and G. W. Jiji, "An improved krill herd algorithm with global exploration capability for solving numerical function optimization problems and its application to data clustering," Applied Soft Computing, vol. 46, pp. 230–245, 2016.

[10] H. Pulluri, R. Naresh, and V. Sharma, "Application of stud krill herd algorithm for solution of optimal power flow problems," International Transactions on Electrical Energy Systems, vol. 27, no. 6, Article ID e2316, 2017.

[11] D. Rodrigues, L. A. M. Pereira, J. P. Papa et al., "A binary krill herd approach for feature selection," in Proceedings of the 2014 22nd International Conference on Pattern Recognition, pp. 1407–1412, IEEE, Stockholm, Sweden, August 2014.

[12] A. Mukherjee and V. Mukherjee, "Chaotic krill herd algorithm for optimal reactive power dispatch considering FACTS devices," Applied Soft Computing, vol. 44, pp. 163–190, 2016.

[13] S. Sun, H. Qi, F. Zhao, L. Ruan, and B. Li, "Inverse geometry design of two-dimensional complex radiative enclosures using krill herd optimization algorithm," Applied Thermal Engineering, vol. 98, pp. 1104–1115, 2016.

[14] S. Sultana and P. K. Roy, "Oppositional krill herd algorithm for optimal location of capacitor with reconfiguration in radial distribution system," International Journal of Electrical Power & Energy Systems, vol. 74, pp. 78–90, 2016.

[15] L. Brezocnik, I. Fister, and V. Podgorelec, "Swarm intelligence algorithms for feature selection: a review," Applied Sciences, vol. 8, no. 9, 2018.

[16] D. Smith, Q. Guan, and S. Fu, "An anomaly detection framework for autonomic management of compute cloud systems," in Proceedings of the 2010 IEEE 34th Annual Computer Software and Applications Conference Workshops, pp. 376–381, IEEE, Seoul, South Korea, July 2010.

[17] Y. Zhao, Y. Zhang, W. Tong et al., "An improved feature selection algorithm based on MAHALANOBIS distance for network intrusion detection," in Proceedings of 2013 International Conference on Sensor Network Security Technology and Privacy Communication System, pp. 69–73, IEEE, Nangang, China, May 2013.

[18] P. Singh and A. Tiwari, "An efficient approach for intrusion detection in reduced features of KDD99 using ID3 and classification with KNNGA," in Proceedings of the 2015 Second International Conference on Advances in Computing and Communication Engineering, pp. 445–452, IEEE, Dehradun, India, May 2015.

[19] M. A. Ambusaidi, X. He, P. Nanda, and Z. Tan, "Building an intrusion detection system using a filter-based feature selection algorithm," IEEE Transactions on Computers, vol. 65, no. 10, pp. 2986–2998, 2016.

[20] N. Shone, T. N. Ngoc, V. D. Phai, and Q. Shi, "A deep learning approach to network intrusion detection," IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 2, no. 1, pp. 41–50, 2018.

[21] Y. Xue, W. Jia, X. Zhao et al., "An evolutionary computation based feature selection method for intrusion detection," Security and Communication Networks, vol. 2018, Article ID 2492956, 10 pages, 2018.

[22] Z. Shen, Y. Zhang, and W. Chen, "A bayesian classification intrusion detection method based on the fusion of PCA and LDA," Security and Communication Networks, vol. 2019, Article ID 6346708, 11 pages, 2019.

[23] P. Sun, P. Liu, Q. Li et al., "DL-IDS: Extracting features using CNN-LSTM hybrid network for intrusion detection system," Security and Communication Networks, vol. 2020, Article ID 8890306, 11 pages, 2020.

[24] G. Farahani, "Feature selection based on cross-correlation for the intrusion detection system," Security & Communication Networks, vol. 2020, Article ID 8875404, 17 pages, 2020.

[25] F. G. Mohammadi, M. H. Amini, and H. R. Arabnia, "Applications of nature-inspired algorithms for dimension Reduction: enabling efficient data analytics," in Advances in Intelligent Systems and Computing, Optimization, Learning, and Control for Interdependent Complex Networks, pp. 67–84, Springer, Cham, Switzerland, 2020.

[26] J. Kennedy and R. Eberhart, "Particle swarm optimization," in Proceedings of the ICNN'95-International Conference on Neural Networks, no. 4, pp. 1942–1948, IEEE, Perth, WA, Australia, December 1995.

[27] M. Dorigo, M. Birattari, and T. Stutzle, "Ant colony optimization," IEEE Computational Intelligence Magazine, vol. 1, no. 4, pp. 28–39, 2006.

[28] R. Rajabioun, "Cuckoo optimization algorithm," Applied Soft Computing, vol. 11, no. 8, pp. 5508–5518, 2011.

[29] M. Neshat, G. Sepidnam, M. Sargolzaei, and A. N. Toosi, "Artificial fish swarm algorithm: a survey of the state-of-the-art, hybridization, combinatorial and indicative applications," Artificial Intelligence Review, vol. 42, no. 4, pp. 965–997, 2014.

[30] D. Karaboga, "An idea based on honey bee swarm for numerical optimization," Technical Report-tr06, Erciyes university, Engineering Faculty, Computer Engineering Department, Kayseri, Turkey, 2005.

[31] W.-T. Pan, "A new Fruit Fly Optimization Algorithm: taking the financial distress model as an example," Knowledge-Based Systems, vol. 26, pp. 69–74, 2012.

[32] R. Zhao and W. Tang, "Monkey algorithm for global numerical optimization," Journal of Uncertain Systems, vol. 2, no. 3, pp. 165–176, 2008.

[33] X. S. Yang and X. He, "Bat algorithm: literature review and applications," International Journal of Bio-Inspired Computation, vol. 5, no. 3, pp. 141–149, 2013.

[34] S. Mirjalili, A. H. Gandomi, S. Z. Mirjalili, S. Saremi, H. Faris, and S. M. Mirjalili, "Salp Swarm Algorithm: a bio-inspired optimizer for engineering design problems," Advances in Engineering Software, vol. 114, pp. 163–191, 2017.

[35] K. Ahmed, A. E. Hassanien, and S. Bhattacharyya, "A novel chaotic chicken swarm optimization algorithm for feature selection," in Proceedings of the 2017 Third International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), pp. 259–264, IEEE, Kolkata, India, November 2017.

[36] S. Tabakhi, P. Moradi, F. Akhlaghian et al., "An unsupervised feature selection algorithm based on ant colony optimization," Engineering Applications of Artificial Intelligence, vol. 32, pp. 112–123, 2014.

[37] S. Arora and P. Anand, "Binary butterfly optimization approaches for feature selection," Expert Systems with Applications, vol. 116, pp. 147–160, 2019.

[38] C. Yan, J. Ma, H. Luo, and A. Patel, "Hybrid binary coral reefs optimization algorithm with simulated annealing for feature selection in high-dimensional biomedical datasets," Chemometrics and Intelligent Laboratory Systems, vol. 184, pp. 102–111, 2019.

[39] G. I. Sayed, A. Tharwat, and A. E. Hassanien, "Chaotic dragonfly algorithm: an improved metaheuristic algorithm for feature selection," Applied Intelligence, vol. 49, no. 1, pp. 188–205, 2019.

[40] Z. Zhang, P. Wei, Y. Li et al., "Feature selection algorithm based on improved particle swarm joint taboo search," Journal of Communication, vol. 39, no. 12, pp. 60–68, 2018.

[41] A. H. Gandomi and A. H. Alavi, "Krill herd: a new bio-inspired optimization algorithm," Communications in Nonlinear Science and Numerical Simulation, vol. 17, no. 12, pp. 4831–4845, 2012.

[42] Q. Tan and Z. Huang, "Krill herd with nearest neighbor lasso operator," Computer Engineering and Applications, vol. 55, no. 9, pp. 124–129, 2019.

[43] Q. Wang, C. Ding, and X. Wang, "A hybrid data clustering algorithm based on improved krill herd algorithm and KHM clustering," Control and Decision, vol. 35, no. 10, pp. 2449–2458, 2018.

[44] Q. Li and B. Liu, "Clustering using an improved krill herd algorithm," Algorithms, vol. 10, no. 2, p. 56, 2017.

[45] G.-G. Wang, A. H. Gandomi, and A. H. Alavi, "Stud krill herd algorithm," Neurocomputing, vol. 128, pp. 363–370, 2014.

[46] J. Li, Y. Tang, C. Hua, and X. Guan, "An improved krill herd algorithm: krill herd with linear decreasing step," Applied Mathematics and Computation, vol. 234, pp. 356–367, 2014.

[47] H. B. Nguyen, B. Xue, P. Andreae et al., "Particle swarm optimisation with genetic operators for feature selection," in Proceedings of the 2017 IEEE Congress on Evolutionary Computation (CEC), pp. 286–293, IEEE, San Sebastian, Spain, June 2017.

[48] M. H. Aghdam and P. Kabiri, "Feature selection for intrusion detection system using ant colony optimization," International Journal of Network Security, vol. 18, no. 3, pp. 420–432, 2016.


where $N$ is the number of krill individuals, and $X_i$ and $X_j$ represent the positions of the $i$th and $j$th krill individuals. $\alpha_i^{\mathrm{target}}$ is defined as follows:

$$\alpha_i^{\mathrm{target}} = C^{\mathrm{best}} \hat{K}_{i,\mathrm{best}} \hat{X}_{i,\mathrm{best}} \tag{6}$$

where $C^{\mathrm{best}}$ is the effective coefficient between the $i$th and the global optimal krill individuals:

$$C^{\mathrm{best}} = 2\left(\mathrm{rand} + \frac{I}{I_{\max}}\right) \tag{7}$$

where $I$ is the iteration number, $I_{\max}$ is the maximum number of iterations, and rand is a random number in [0, 1], which is used to enhance the exploration ability.
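As a small illustration of equation (7), the sketch below evaluates the effective coefficient for a given iteration. The function name `best_coefficient` and the `rng` parameter are hypothetical helpers introduced here for illustration; they are not part of the original algorithm description.

```python
import random


def best_coefficient(iteration, max_iterations, rng=random):
    """Eq. (7): C_best = 2 * (rand + I / I_max), with rand drawn from [0, 1)."""
    return 2.0 * (rng.random() + iteration / max_iterations)


# The random term keeps the coefficient spread over an interval of width 2,
# while the I / I_max term shifts that interval upward as iterations proceed.
c_early = best_coefficient(1, 100)
c_late = best_coefficient(100, 100)
```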

3.1.2. Foraging Activity. Foraging activity is affected by the food distance and the experience of the food location, and it is described as follows:

$$F_i = V_f \beta_i + \omega_f F_i^{\mathrm{old}} \tag{8}$$

$$\beta_i = \beta_i^{\mathrm{food}} + \beta_i^{\mathrm{best}} \tag{9}$$

where $V_f$ is the foraging speed, taken as 0.02 m s$^{-1}$ [41]; $\omega_f$ is the inertia weight in the range [0, 1]; and $\beta_i$ indicates the foraging direction, which consists of the food induction direction $\beta_i^{\mathrm{food}}$ and the induction direction $\beta_i^{\mathrm{best}}$ of the historically optimal krill individual. The food is in essence a virtual location based on the concept of a "centroid." It is defined as follows:

$$X^{\mathrm{food}} = \frac{\sum_{i=1}^{N} (1/K_i)\, X_i}{\sum_{i=1}^{N} 1/K_i} \tag{10}$$

(1) The induced direction of food to the $i$th krill individual is expressed as follows:

$$\beta_i^{\mathrm{food}} = C^{\mathrm{food}} \hat{K}_{i,\mathrm{food}} \hat{X}_{i,\mathrm{food}} \tag{11}$$

where $C^{\mathrm{food}}$ is the food coefficient, and it is determined as follows:

$$C^{\mathrm{food}} = 2\left(1 - \frac{I}{I_{\max}}\right) \tag{12}$$

(2) The induced direction of the historical best krill individual to the $i$th krill individual is expressed as follows:

$$\beta_i^{\mathrm{best}} = \hat{K}_{i,\mathrm{best}} \hat{X}_{i,\mathrm{best}} \tag{13}$$

where $\hat{K}_{i,\mathrm{best}}$ represents the influence of the historical best individual on the $i$th krill individual.
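The virtual food location of equation (10) and the food coefficient of equation (12) can be sketched in a few lines. This is a minimal illustration under stated assumptions: positions are plain Python lists, every fitness value $K_i$ is nonzero, and the names `food_position` and `food_coefficient` are hypothetical.

```python
def food_position(positions, fitness):
    """Eq. (10): centroid of the herd weighted by 1/K_i (assumes every K_i != 0)."""
    weights = [1.0 / k for k in fitness]
    total = sum(weights)
    dims = len(positions[0])
    return [
        sum(w * x[d] for w, x in zip(weights, positions)) / total
        for d in range(dims)
    ]


def food_coefficient(iteration, max_iterations):
    """Eq. (12): C_food = 2 * (1 - I / I_max), falling linearly from 2 to 0."""
    return 2.0 * (1.0 - iteration / max_iterations)


herd = [[0.0, 0.0], [1.0, 1.0]]
fitness = [1.0, 0.5]  # the second krill has the smaller fitness, hence the larger weight
x_food = food_position(herd, fitness)  # lies closer to the second krill
```

Because the weights are the reciprocals of the fitness values, individuals with smaller fitness pull the virtual food toward themselves, which matches a minimization setting.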

3.1.3. Physical Diffusion Motion. Physical diffusion is a stochastic process. The expression is as follows:

$$D_i = D^{\max}\left(1 - \frac{I}{I_{\max}}\right)\delta \tag{14}$$

where $D^{\max}$ is the maximum diffusion velocity in the range [0.002, 0.010] m s$^{-1}$. According to [41], it is taken as 0.005 m s$^{-1}$. $\delta$ represents the random direction vector, whose values are taken randomly in [−1, 1].

Figure 3: The framework of KH algorithm. (Flowchart: three actions of krill individual, namely movement induced by other krill individuals, foraging movement, and physical diffusion movement; crossover operation; updating position; calculating the fitness function.)

3.1.4. Crossover. The crossover operator is an effective global optimization strategy. An adaptive vectorization crossover scheme is added to the standard KH algorithm to further enhance the global search ability of the algorithm [41]. It is given as follows:

$$X_{i,m} = \begin{cases} X_{i,m} \cdot C_r + X_{r,m} \cdot (1 - C_r), & \mathrm{rand}_{i,m} < C_r \\ X_{i,m}, & \text{else} \end{cases} \qquad C_r = 0.2\,\hat{K}_{i,\mathrm{best}} \tag{15}$$

where $r$ is a random index with $r \in \{1, 2, \ldots, i-1, i+1, \ldots, N\}$, $X_{i,m}$ represents the $m$th dimension of the $i$th krill individual, $X_{r,m}$ represents the $m$th dimension of the $r$th krill individual, and $C_r$ is the crossover probability, which decreases as the fitness increases; the crossover probability of the globally optimal individual is zero.
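A minimal sketch of the crossover rule in equation (15) follows. It assumes that the normalized quantity $\hat{K}_{i,\mathrm{best}}$ has already been computed and is passed in as `k_hat_best`; the function name `crossover` and the `rng` argument are hypothetical choices made for this illustration.

```python
import random


def crossover(positions, i, k_hat_best, rng=random):
    """Eq. (15): per-dimension crossover of krill i with a random partner r != i."""
    cr = 0.2 * k_hat_best  # crossover probability C_r
    partners = [r for r in range(len(positions)) if r != i]
    r = rng.choice(partners)
    child = list(positions[i])
    for m in range(len(child)):
        if rng.random() < cr:
            # Blend dimension m of krill i with the same dimension of krill r.
            child[m] = positions[i][m] * cr + positions[r][m] * (1.0 - cr)
    return child


herd = [[1.0, 2.0], [3.0, 4.0]]
unchanged = crossover(herd, 0, 0.0)  # C_r = 0, so no dimension is crossed
```

With `k_hat_best = 0` the probability $C_r$ is zero and the individual is returned unchanged, which mirrors the statement that the globally optimal individual is never crossed.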

3.1.5. Movement Process of KH Algorithm. Affected by the movement induced by other krill individuals, foraging activity, and physical diffusion, the krill herd changes its position towards the direction of optimal fitness. The position vector of a krill individual over the interval from $t$ to $t + \Delta t$ is described as follows:

$$X_i(t + \Delta t) = X_i(t) + \Delta t\, \frac{dX_i}{dt} \tag{16}$$

where $\Delta t$ is the scaling factor of the velocity vector. It depends completely on the search space:

$$\Delta t = C_t \sum_{j=1}^{NV} \left(UB_j - LB_j\right) \tag{17}$$

where $NV$ represents the number of decision variables, $LB_j$ and $UB_j$ are the lower and upper bounds of the $j$th variable ($j = 1, 2, \ldots, NV$), and $C_t$ is the step scaling factor in the range [0, 2].
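Equations (16) and (17) amount to one explicit update step whose size is tied to the width of the search space. The sketch below illustrates this; the default `c_t=0.5` is an assumed value inside the stated range [0, 2], and the function names are hypothetical.

```python
def time_step(lower, upper, c_t=0.5):
    """Eq. (17): Delta t = C_t * sum_j (UB_j - LB_j). c_t=0.5 is an assumed default."""
    return c_t * sum(ub - lb for lb, ub in zip(lower, upper))


def update_position(x, velocity, dt):
    """Eq. (16): X_i(t + dt) = X_i(t) + dt * dX_i/dt."""
    return [xj + dt * vj for xj, vj in zip(x, velocity)]


# Four decision variables, each bounded in [0, 1].
dt = time_step([0.0] * 4, [1.0] * 4, c_t=0.5)
new_x = update_position([0.1, 0.2], [0.01, -0.02], dt)
```

Here `velocity` stands for $dX_i/dt$, that is, the sum of the induced movement, the foraging motion, and the physical diffusion of the individual.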

3.2. The LNNLS-KH Algorithm. In view of the unbalanced exploitation and exploration abilities of the KH algorithm, we propose the LNNLS-KH algorithm for feature selection to improve performance and to pursue a high accuracy rate, a high detection rate, and a low false positive rate in intrusion detection. The improvement is reflected in the following three aspects.

3.2.1. A New Fitness Evaluation Function. To improve the classification accuracy of feature subset detection, we introduce the feature selection dimension and the classification accuracy into the fitness evaluation function. The specific expression of fitness is as follows:

$$\mathrm{fitness} = \alpha \cdot \frac{\mathrm{Feature}_{\mathrm{selected}}}{\mathrm{Feature}_{\mathrm{all}}} + (1 - \alpha) \cdot (1 - \mathrm{Accuracy}) \tag{18}$$

where $\alpha \in [0, 1]$ is a weighting factor used to tune the relative importance of the number of selected features and the classification accuracy, $\mathrm{Feature}_{\mathrm{selected}}$ is the number of selected features, $\mathrm{Feature}_{\mathrm{all}}$ represents the total number of features, and Accuracy indicates the accuracy of the classification results. Moreover, k-nearest neighbor (KNN) is used as the classification algorithm, and the classification accuracy is defined as follows:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{19}$$

where TP, TN, FP, and FN are defined in the confusion matrix, as shown in Table 2.
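To make the trade-off in equation (18) concrete, the following sketch computes the accuracy of equation (19) from confusion-matrix counts and combines it with the feature ratio. The default `alpha=0.02` follows the weight reported in the experimental setup of this paper; the function names and the sample counts are illustrative assumptions.

```python
def accuracy(tp, tn, fp, fn):
    """Eq. (19): share of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)


def fitness(n_selected, n_total, acc, alpha=0.02):
    """Eq. (18): weighted sum of the feature ratio and the classification error."""
    return alpha * n_selected / n_total + (1.0 - alpha) * (1.0 - acc)


acc = accuracy(tp=90, tn=5, fp=3, fn=2)       # hypothetical counts
score_small = fitness(7, 41, acc)             # 7 of 41 features kept
score_large = fitness(30, 41, acc)            # same accuracy, more features
```

At equal accuracy, the subset with fewer features receives the smaller fitness value, so lower fitness is better.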

3.2.2. Nonlinear Optimization of Physical Diffusion Motion. The physical diffusion of the krill herd is a random diffusion process. The closer the individuals are to the food, the less random the movement is. Due to the strong convergence of the algorithm, the movement of krill individuals changes nonlinearly from fast to slow, and the fitness function gradually decreases as the algorithm converges. According to equations (2) and (9), the movement induced by other krill individuals and the foraging activity are nonlinear. In the physical diffusion equation (14), however, the diffusion velocity $D_i$ of the $i$th krill individual decreases linearly as the iteration number increases. In order to fit the nonlinear motion of the krill herd, we introduce the optimization coefficient $\lambda$ and the fitness factor $\mu_{\mathrm{fit}}$ of the krill herd into the physical diffusion motion. The optimized physical diffusion motion is defined as follows:

$$D_i = D^{\max}\left[1 - \lambda \frac{I}{I_{\max}} - (1 - \lambda)\,\mu_{\mathrm{fit}}\right]\delta \tag{20}$$

where $\lambda$ is in the range [0, 1] and $\mu_{\mathrm{fit}}$ is defined as follows:

$$\mu_{\mathrm{fit}} = \frac{K^{\mathrm{best}}}{K_i} \tag{21}$$

where $K^{\mathrm{best}}$ is the fitness value of the current optimal individual and $K_i$ represents the fitness value of the $i$th krill individual. As the number of iterations increases, $K_i$ gradually decreases until it approaches $K^{\mathrm{best}}$. Therefore, $\mu_{\mathrm{fit}}$ is in the range (0, 1]. Substituting the fitness factor $\mu_{\mathrm{fit}}$ into equation (20) gives the new physical diffusion motion equation:

$$D_i = D^{\max}\left[1 - \lambda \frac{I}{I_{\max}} - (1 - \lambda)\,\frac{K^{\mathrm{best}}}{K_i}\right]\delta \tag{22}$$

According to equation (22), the iteration number $I$, the fitness $K_i$ of the krill individual, and the fitness $K^{\mathrm{best}}$ of the current optimal krill individual jointly determine the physical diffusion motion, so as to further adjust the random diffusion amplitude. In the early stage of the iteration, the number of iterations is small and the fitness value of the individual is large, so the fitness factor is small, which is conducive to a large random diffusion of the krill herd. As the number of iterations gradually increases, the algorithm converges quickly and the fitness of krill individuals approaches the global optimal solution. At the same time, the fitness factor increases nonlinearly, which makes the random diffusion more consistent with the movement process of krill individuals.
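The nonlinear diffusion term can be sketched as follows. This is an illustration under assumptions: `lam=0.5` is an arbitrary value inside the stated range of $\lambda$, `d_max=0.005` is the value taken from [41] in Section 3.1.3, the direction vector is drawn uniformly from [−1, 1] per dimension, and the function names are hypothetical.

```python
import random


def diffusion_amplitude(iteration, max_iterations, k_best, k_i, lam=0.5, d_max=0.005):
    """Scalar factor of eq. (22): D_max * [1 - lam * I/I_max - (1 - lam) * K_best/K_i]."""
    mu_fit = k_best / k_i  # fitness factor of eq. (21), assumed to lie in (0, 1]
    return d_max * (1.0 - lam * iteration / max_iterations - (1.0 - lam) * mu_fit)


def diffusion(iteration, max_iterations, k_best, k_i, dims, lam=0.5, d_max=0.005, rng=random):
    """Diffusion vector: the amplitude above times a random direction in [-1, 1]^dims."""
    amp = diffusion_amplitude(iteration, max_iterations, k_best, k_i, lam, d_max)
    return [amp * rng.uniform(-1.0, 1.0) for _ in range(dims)]


# lam = 1 removes the fitness factor and recovers the linear rule of eq. (14).
linear_case = diffusion_amplitude(50, 100, k_best=1.0, k_i=2.0, lam=1.0)
# At the last iteration an individual that has reached K_best no longer diffuses.
converged_case = diffusion_amplitude(100, 100, k_best=1.0, k_i=1.0, lam=0.3)
```

Setting `lam` closer to 0 lets the fitness ratio dominate, so individuals far from the current best keep a larger random step even late in the run.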

To further evaluate the effect of the KH algorithm with nonlinear optimization of physical diffusion motion (NO-KH), we conducted experiments on two classical benchmark functions. $F_1(x)$ is the Schwefel 2.22 function, which is a unimodal benchmark function; $F_2(x)$ is the Ackley function, which is a multimodal benchmark function. The experimental parameters of $F_1(x)$ and $F_2(x)$ are shown in Table 3.

Figure 4 shows the Ackley function and the Schwefel 2.22 function graphs for $n = 2$. We use the standard KH algorithm and the NO-KH algorithm to find the optimal value on both the unimodal and the multimodal benchmark function. The number of krill and the number of iterations are set to 25 and 500. Table 4 shows the best value, worst value, mean value, and standard deviation obtained by running the algorithms 20 times. Compared with the standard KH algorithm, the NO-KH algorithm finds smaller optimal solutions on both the unimodal and the multimodal benchmark function, so its global exploration ability is improved. The smaller standard deviation obtained from repeated experiments shows that the NO-KH algorithm has better stability. Therefore, nonlinear optimization of the physical diffusion motion of the KH algorithm is effective.
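For reference, the two benchmarks of Table 3 can be written in their standard textbook forms, which is an assumption about the exact expressions used in the experiments: the sum-plus-product Schwefel 2.22 function and the Ackley function. Both have a minimum value of 0 at the origin.

```python
import math


def schwefel_2_22(x):
    """Schwefel 2.22: sum of |x_i| plus product of |x_i| (unimodal)."""
    return sum(abs(v) for v in x) + math.prod(abs(v) for v in x)


def ackley(x):
    """Ackley function in its standard form (multimodal)."""
    n = len(x)
    mean_square = sum(v * v for v in x) / n
    mean_cos = sum(math.cos(2.0 * math.pi * v) for v in x) / n
    return -20.0 * math.exp(-0.2 * math.sqrt(mean_square)) - math.exp(mean_cos) + 20.0 + math.e


origin = [0.0] * 10  # dimension 10, as in Table 3
f1_min = schwefel_2_22(origin)
f2_min = ackley(origin)
```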

The above analysis shows that introducing the optimization coefficient $\lambda$ and the fitness factor $\mu_{\mathrm{fit}}$ into the physical diffusion motion of the krill herd is conducive to dynamically adjusting the random diffusion amplitude of the krill individuals and accelerating the convergence of the algorithm. Meanwhile, it increases the nonlinearity of the physical diffusion motion and the global exploration ability of the algorithm.

3.2.3. Linear Nearest Neighbor Lasso Step Optimization. When the KH algorithm is used to solve multidimensional complex function optimization problems, its local search ability is weak, and exploitation and exploration are difficult to balance. To enhance the local exploitation and global exploration abilities of the algorithm, the influence of excellent neighbor individuals on the krill herd during evolution is considered, and an improved KH algorithm is proposed in [42]. That algorithm introduces the nearest neighbor lasso operator to mine the neighborhood of potentially excellent individuals and thereby improve the local search ability of krill individuals, but the random parameters introduced in the lasso operator increase the uncertainty of the algorithm. To cope with this problem, we introduce an improved krill herd algorithm based on linear nearest neighbor lasso step optimization (LNNLS-KH), which finds the nearest neighbor of each krill individual after the individual position is updated and moves linearly by a defined step to derive a better fitness value. With linearization, the nearest neighbor lasso step changes linearly with the iteration number, thereby balancing the exploitation and exploration abilities of the algorithm. In the early iterations, a large linear nearest neighbor lasso step is selected so that krill individuals can quickly adjust their positions, which improves the search efficiency of the algorithm. In the later stage of iteration, the nearest neighbor lasso step decreases linearly to obtain the global optimal solution.

In the krill herd $X = \{X_1, X_2, \ldots, X_n\}$, assuming that the $j$th krill individual is the nearest neighbor of the $i$th krill individual, the Euclidean distance between the two krill individuals is defined as follows:

$$\mathrm{distance}_{ij} = \lVert X_i - X_j \rVert \tag{23}$$

where $\{X_i, X_j\} \subset S$ and $i \neq j$. The linear nearest neighbor lasso step is defined as follows:

$$\mathrm{step} = \begin{cases} \dfrac{I}{I_{\max}} \times \left(X_i - X_j\right), & K_i > K_j \\[2ex] \dfrac{I}{I_{\max}} \times \left(X_j - X_i\right), & K_j > K_i \end{cases} \tag{24}$$

The fitness function is given by equation (18). Therefore, a smaller fitness value means that fewer features are selected at a higher accuracy, i.e., the position of the krill individual is better. The schematic diagram of LNNLS-KH is shown in Figure 5. The new position $Y_k$ of the $j$th krill individual is expressed as follows:

$$Y_k = \begin{cases} X_j + \dfrac{I}{I_{\max}} \times \left(X_i - X_j\right), & K_i > K_j \\[2ex] X_i + \dfrac{I}{I_{\max}} \times \left(X_j - X_i\right), & K_j > K_i \end{cases} \tag{25}$$

When the $i$th and $j$th krill individuals lie on opposite sides of the food, the new position $Y_k$ will be far from the optimal solution after the linear nearest neighbor lasso step optimization, as shown in Figure 6.
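The nearest-neighbor search of equation (23) and the candidate position of equation (25) can be sketched as below. This is a minimal illustration: positions are plain lists, the function names are hypothetical, and ties $K_i = K_j$, which equation (25) does not cover, are routed to the second branch as an assumption.

```python
import math


def nearest_neighbor(positions, i):
    """Index of the krill closest to krill i in Euclidean distance, eq. (23)."""
    others = (j for j in range(len(positions)) if j != i)
    return min(others, key=lambda j: math.dist(positions[i], positions[j]))


def lnnls_candidate(positions, fitness, i, iteration, max_iterations):
    """Candidate position Y_k of eq. (25) for krill i and its nearest neighbor j."""
    j = nearest_neighbor(positions, i)
    ratio = iteration / max_iterations
    xi, xj = positions[i], positions[j]
    if fitness[i] > fitness[j]:
        # K_i > K_j: Y_k = X_j + (I / I_max) * (X_i - X_j)
        return [b + ratio * (a - b) for a, b in zip(xi, xj)]
    # K_j > K_i (ties land here by assumption): Y_k = X_i + (I / I_max) * (X_j - X_i)
    return [a + ratio * (b - a) for a, b in zip(xi, xj)]


herd = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
fitness = [0.3, 0.1, 0.9]
y_k = lnnls_candidate(herd, fitness, i=0, iteration=50, max_iterations=100)
```

In Algorithm 1 the candidate is kept only if its fitness is not larger than that of the individual it replaces, so a candidate that lands far from the optimum, as in the opposite-sides case, is simply discarded.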

Table 2: Confusion matrix.

| Confusion matrix | True condition positive | True condition negative |
|---|---|---|
| Predicted condition positive | True positive (TP) | False positive (FP) |
| Predicted condition negative | False negative (FN) | True negative (TN) |


Table 3: Benchmark functions in the experiment.

| Benchmark functions | Dim | Range | $f_{\min}$ |
|---|---|---|---|
| $F_1(x) = \sum_{i=1}^{n} \lvert x_i \rvert + \prod_{i=1}^{n} \lvert x_i \rvert$ | 10 | [−10, 10] | 0 |
| $F_2(x) = -20\exp\left(-0.2\sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}\right) - \exp\left(\frac{1}{n}\sum_{i=1}^{n}\cos(2\pi x_i)\right) + 20 + e$ | 10 | [−32, 32] | 0 |

Figure 4: Ackley function and Schwefel 2.22 function graphs for $n = 2$. (a) Unimodal benchmark function Schwefel 2.22 ($F_1$ over $x_1$, $x_2$). (b) Multimodal benchmark function Ackley ($F_2$ over $x_1$, $x_2$).

Table 4: The statistical results of KH and NO-KH algorithms on two benchmark functions.

| $f(x)$ | Algorithms | Best value | Worst value | Mean value | Standard deviation |
|---|---|---|---|---|---|
| $F_1$ | KH | 1.692E−04 | 1.099E−02 | 1.508E−03 | 3.342E−03 |
| $F_1$ | NO-KH | 3.277E−05 | 9.632E−04 | 4.221E−04 | 3.908E−04 |
| $F_2$ | KH | 5.716E−05 | 2.168 | 0.329 | 0.816 |
| $F_2$ | NO-KH | 8.309E−06 | 1.155 | 0.116 | 0.362 |

Figure 5: Optimization of linear nearest neighbor lasso step for krill individuals at the same end of food. (Legend: the position of food; the position of krill $X_i$; the position of new krill $Y_i$ after LNNLS; the distance between two krill; the length of LNNLS.)

Figure 6: Optimization of linear neighboring lasso step for krill individuals at both ends of food. (Legend: the position of food; the position of krill $X_i$; the position of new krill $Y_i$ after LNNLS; the distance between two krill, $\mathrm{distance}_{ij}$; the length of LNNLS.)


The pseudocode of the LNNLS-KH algorithm is shown in Algorithm 1.

3.3. Analysis of Time Complexity. In the KH algorithm, each krill individual updates its position after the movement induced by other krill individuals, the foraging activity, and the physical diffusion motion, with a time complexity of $O(N)$. After $I_{\max}$ iterations, the time complexity of the algorithm is $O(I_{\max} \cdot N)$. In the LNNLS-KH algorithm, the modified fitness function and the nonlinear optimization of the physical diffusion motion require hardly any additional calculation, so the time complexity is unchanged. In addition, the linear nearest neighbor lasso step optimization adds the calculations of equations (24) and (25) after each krill individual completes its position update in every iteration, with a time complexity of $O(I_{\max} \cdot N)$. Therefore, the total time complexity of the LNNLS-KH algorithm is $O(2 I_{\max} \cdot N)$, which is of the same order as that of the KH algorithm.

3.4. Description of the LNNLS-KH Algorithm for IDS Feature Selection. An IDS is a system that recognizes and processes malicious use of computers and network resources. The intrusion detection dataset records normal and abnormal traffic, including network traffic data and types of network attacks, and provides data support for the research and development of intrusion detection technology. An IDS is generally composed of data acquisition, data preprocessing, detection units, and response actions, as shown in Figure 7.

The LNNLS-KH algorithm is used to select high-quality feature subsets for the IDS. The features of the intrusion detection dataset are randomly initialized to different real numbers in the range [0, 1], which constitute the position vectors of the krill herd. By calculating the fitness function and carrying out the LNNLS-KH algorithm, the position vectors of the krill herd are constantly updated. The fitness function is determined by the number of selected features and the classification accuracy, so the position vectors of the krill herd move toward the optimal fitness value. According to [47], it is appropriate to set the feature selection threshold to 0.7. When the maximum number of iterations is reached, the components of the krill position vector that are larger than the threshold are selected. The selected features constitute the feature subset of the intrusion detection data. The selected feature subset is then sent to the detection units. Because the k-nearest neighbor (KNN) algorithm is relatively mature in theory, the detection units adopt the KNN algorithm to construct the intrusion detection classifier. Finally, the intrusion detection results are evaluated on the test dataset. The process of the LNNLS-KH algorithm for IDS feature selection is shown in Figure 8.
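The mapping from a continuous krill position to a feature subset can be illustrated with a short sketch. It assumes that each component of the position vector corresponds to one feature and that a feature is kept when its component is strictly larger than the threshold of 0.7; the function name `select_features` is hypothetical.

```python
def select_features(position, threshold=0.7):
    """Indices of the features whose position component exceeds the threshold."""
    return [idx for idx, value in enumerate(position) if value > threshold]


position = [0.9, 0.2, 0.71, 0.7]      # one krill, four candidate features
selected = select_features(position)  # components at exactly 0.7 are not kept
```

The number of selected indices is the $\mathrm{Feature}_{\mathrm{selected}}$ term of the fitness function, and the KNN classifier would be trained only on the columns listed in `selected`.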

4. Results and Discussion

To verify the performance of the LNNLS-KH algorithm in IDS feature selection, we adopt the NSL-KDD network intrusion detection dataset and the CICIDS2017 dataset for experiments.

4.1. Datasets Analysis. The NSL-KDD dataset is a classic dataset in the field of anomaly detection. As an improved version of the KDD CUP 99 dataset, it is currently one of the most reliable and influential intrusion detection datasets. Compared with the KDD CUP 99 dataset, the NSL-KDD dataset eliminates duplicate data, so it hardly contains redundant records. Meanwhile, the proportion of each type of record in the NSL-KDD dataset has been adjusted to be reasonable. Each record in the NSL-KDD dataset includes 41-dimensional features and a classification label. KDDTrain+ and KDDTest+ in the NSL-KDD dataset are selected as the training subset and the test subset. The attacks are divided into four types: denial of service (DoS), scan and probe (Probe), remote to local (R2L), and user to root (U2R). The detailed attack names and the distribution of sample categories are shown in Tables 5 and 6. The features of the NSL-KDD dataset are shown in Table 7.

The NSL-KDD dataset includes four types of features: the basic features of TCP connections (9 in total), the content features of TCP connections (13 in total), the time-based network traffic statistics (9 in total), and the host-based network traffic statistics (10 in total). Among all the features, "Protocol_type," "service," and "flag" are character-type features, which need to be preprocessed and mapped to numeric values. Because mixed numeric and character data types are difficult to deal with, one-hot encoding is used to map different characters to different values. For example, the "Protocol_type" feature includes three types of protocol, denoted by icmp = [1, 0, 0], tcp = [0, 1, 0], and udp = [0, 0, 1]. Similarly, the 70 attributes in "service" and the 11 attributes in "flag" are numeralized in the same way. The 41-dimensional feature is expanded to 122 dimensions after one-hot encoding. At the same time, the dataset is normalized to eliminate the influence of features of different orders of magnitude on the calculation results, thus reducing the experimental error. The data preprocessing helps to improve the classification accuracy and ensure the reliability of the results. The values corresponding to each feature are normalized to the interval [0, 1], and the normalization expression is as follows:

$$X^{*} = \frac{X - X_{\min}}{X_{\max} - X_{\min}} \tag{26}$$

where $X^{*}$ is the normalized eigenvalue, $X$ is the original eigenvalue, and $X_{\max}$ and $X_{\min}$ represent the maximum and minimum values of the same feature dimension.
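The two preprocessing steps, one-hot encoding of character features and min-max scaling to [0, 1], can be sketched as follows. The function names are hypothetical, and mapping a constant column to zeros is an assumption made here to avoid division by zero; the text does not say how such columns are handled.

```python
def one_hot(value, categories):
    """One-hot vector of `value` over an ordered list of categories."""
    return [1 if value == c else 0 for c in categories]


def min_max_normalize(column):
    """Scale a feature column to [0, 1] using its minimum and maximum."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant column: return zeros (an assumption, not stated in the text).
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


protocols = ["icmp", "tcp", "udp"]
tcp_vector = one_hot("tcp", protocols)
scaled = min_max_normalize([2.0, 4.0, 6.0])
```

Encoding the 3 protocol values, 70 service values, and 11 flag values this way replaces 3 columns with 84, which accounts for the growth from 41 to 122 dimensions.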

Although NSL-KDD is a benchmark dataset in the field of network intrusion detection, some of its attack types are outdated due to the rapid development of network technology. Therefore, it hardly reflects the current real network environment. CICIDS2017 is a novel network intrusion detection dataset released by the Canadian Institute for Cybersecurity (CIC) in 2017. The dataset collected traffic data for five days, with only normal traffic on Monday and attacks occurring in the morning and afternoon from Tuesday to Friday. It includes "FTP patator," "SSH patator," "DoS GoldenEye," "DoS Slowhttptest," "Dos Slowloris," "Heartbleed," "Web Attack Brute Force," "Web Attack Sql Injection," "Web Attack XSS," "Infiltration Attack," "Bot," "DDoS," and "PortScan," which are common types of attacks in modern networks. The distribution of attack time and types of the CICIDS2017 dataset is shown in Table 8. We use the MachineLearningCVE file in the CICIDS2017 dataset as the dataset, which contains 78 features and an attack type label. The number and name of each feature are shown in Table 9. Compared with the NSL-KDD dataset, the attack types in the CICIDS2017 dataset are more in line with the situation of modern networks.

Figure 7: The framework of IDS. (Block diagram: data acquisition, data preprocessing, detection units, response actions.)

Figure 8: The process of LNNLS-KH algorithm for IDS feature selection. (Flowchart: input the IDS dataset; initialize parameters $N$, $NV$, $I_{\max}$, $UB$, $LB$; initialize the krill herd position; calculate the fitness of individuals; calculate three actions, namely movement induced by other krill individuals, foraging activity, and nonlinear physical diffusion motion; genetic operator; update the position and fitness values of individuals; find the nearest krill and calculate the linear lasso step; compare the fitness of $Y_k$ with that of $X_i$ or $X_j$ and keep the better position; update $X_{gb}$ and $K_{gb}$ of the global optimal individuals; repeat until the iteration count exceeds $I_{\max}$; output the optimal solution and the number of selected features; KNN algorithm for intrusion detection; evaluate intrusion detection results.)

4.2. Experimental Results and Discussion of NSL-KDD Dataset. The experiment is conducted in MATLAB R2016a on a Windows 64-bit operating system with an Intel(R) Core(TM) i7-4790 processor. Since the training of the algorithm requires normal and abnormal samples, we mix normal samples and different types of attack samples to construct training sets and test sets for four different attack types. In order to reduce the time of searching for the optimal feature subset, we randomly select 50% of the Probe attack samples, 10% of the DoS attack samples, 100% of the U2R attack samples, and 100% of the R2L attack samples in the KDDTrain+ dataset as the training dataset, and 100% of the Probe dataset, 50% of the DoS dataset, 100% of the U2R dataset, and 20% of the R2L dataset in the KDDTest+ dataset as the test dataset.

Algorithm 1: The LNNLS-KH algorithm.

Input: Training set
Output: Global best solution, the number of selected features, and feature selection time

Begin
  Initialize algorithm parameters $N^{\max}$, $V_f$, $D^{\max}$, $NV$, $I_{\max}$, $UB$, $LB$
  Initialize the krill herd position
  Evaluate the fitness of krill individuals and find the individuals with the best and worst fitness values
  for $I = 1$ to $I_{\max}$ do
    for each krill individual $i$ ($i = 1, 2, \ldots, m$) do
      Calculate the three components of motion:
        (1) The motion induced by other krill individuals
        (2) The foraging activity
        (3) The nonlinear optimized physical diffusion
      Implement crossover operator
      Update krill herd position and fitness values
      Calculate the linear nearest neighbor lasso step and new position using equations (24) and (25), and update new fitness values
      if $K_{y_k} > K_i$ (or $K_j$) then
        Keep $K_i$ (or $K_j$) and delete $K_{y_k}$
      else
        Keep $K_{y_k}$ and delete $K_i$ (or $K_j$)
      end if
    end for
    Update $X_{gb}$ and $K_{gb}$ of the globally optimal individuals
  end for
  Output the global best solution, the number of selected features, and feature selection time
End

Table 5: The distribution of sample categories.

| Attack types | Attack names |
|---|---|
| DoS | Neptune, back, land, pod, smurf, teardrop, mailbomb, Apache2, processtable, udpstorm, worm |
| Probe | Ipsweep, nmap, portsweep, Satan, mscan, saint |
| R2L | ftp_write, guess_passwd, imap, multihop, phf, spy, warezclient, warezmaster, sendmail, named, snmpgetattack, snmpguess, xlock, xsnoop, httptunnel |
| U2R | buffer_overflow, loadmodule, perl, rootkit, ps, sqlattack, xterm |

Table 6: The distribution of sample categories.

| Data category | KDDTrain+ samples | KDDTest+ samples | Total number of samples |
|---|---|---|---|
| Normal | 65120 | 11536 | 76656 |
| DoS | 36944 | 6251 | 43195 |
| Probe | 10786 | 2421 | 13207 |
| R2L | 995 | 2653 | 3648 |
| U2R | 52 | 67 | 119 |
| All | 113897 | 22928 | 136825 |

For the LNNLS-KH algorithm, the maximum number of iterations $I_{\max}$ and the number of krill individuals $N$ are set to 100 and 30, respectively. Following [41], the foraging speed of krill individuals $V_f$ is set to 0.02, the maximum random diffusion rate $D^{\max}$ is set to 0.05, and the maximum induction speed $N^{\max}$ is set to 0.01. Following [47], the threshold $\theta$ is set to 0.7. As the LNNLS-KH algorithm is designed to ensure high accuracy first and to reduce the number of features second, the weight factor $\alpha$ in the fitness function is set to 0.02.

$$\mathrm{FPR} = \frac{FP}{TN + FP}, \tag{27}$$

$$\mathrm{DR} = \frac{TP}{TP + FN}. \tag{28}$$

We adopt the iterative curve of the global optimal fitness value, feature selection time, test set detection time, data dimension after feature selection, classification accuracy, detection rate (DR), and false positive rate (FPR) as evaluation measures of feature selection for IDS. The accuracy represents the ratio of correctly classified samples to the total number of samples, as defined in equation (19). FPR, also known as false alarm rate (FAR), represents the ratio of samples that are incorrectly detected as intrusions to all normal samples, as shown in

Table 7: The features of NSL-KDD dataset.

| Classification of features | Number | Serial number and name of features |
|---|---|---|
| The basic characteristics of TCP connections | 9 | (1) duration, (2) protocol_type, (3) service, (4) flag, (5) src_bytes, (6) dst_bytes, (7) land, (8) wrong_fragment, (9) urgent |
| The content characteristics of a TCP connection | 13 | (10) hot, (11) num_failed_logins, (12) logged_in, (13) num_compromised, (14) root_shell, (15) num_root, (16) su_attempted, (17) num_file_creations, (18) num_shells, (19) num_access_files, (20) num_outbound_cmds, (21) is_host_login, (22) is_guest_login |
| Time-based statistical characteristics of network traffic | 9 | (23) count, (24) srv_count, (25) serror_rate, (26) srv_serror_rate, (27) rerror_rate, (28) srv_rerror_rate, (29) same_srv_rate, (30) diff_srv_rate, (31) srv_diff_host_rate |
| Host-based network traffic statistics | 10 | (32) dst_host_count, (33) dst_host_srv_count, (34) dst_host_same_srv_rate, (35) dst_host_diff_srv_rate, (36) dst_host_same_src_port_rate, (37) dst_host_srv_diff_host_rate, (38) dst_host_serror_rate, (39) dst_host_srv_serror_rate, (40) dst_host_rerror_rate, (41) dst_host_srv_rerror_rate |

Table 8: Attack time and attack types of the CICIDS2017 dataset.

| Time | Type | Label | Amount | Total |
|---|---|---|---|---|
| Monday | Normal | BENIGN | 529918 | 529918 |
| Tuesday | Normal | BENIGN | 432074 | 445909 |
| | Brute force | FTP patator | 7938 | |
| | | SSH patator | 5897 | |
| Wednesday | Normal | BENIGN | 440031 | 692703 |
| | DoS | DoS GoldenEye | 10293 | |
| | | DoS slowhttptest | 5499 | |
| | | DoS slowloris | 5796 | |
| | | Heartbleed | 11 | |
| Thursday morning | Normal | BENIGN | 168186 | 170366 |
| | Web attack | Web attack brute force | 1507 | |
| | | Web attack sql injection | 21 | |
| | | Web attack XSS | 652 | |
| Thursday afternoon | Normal | BENIGN | 288566 | 288602 |
| | Infiltration | Infiltrationdnt | 36 | |
| Friday morning | Normal | BENIGN | 189067 | 191033 |
| | Botnet | Bot | 1966 | |
| Friday afternoon (1) | Normal | BENIGN | 97718 | 225745 |
| | DDoS | DDoS | 128027 | |
| Friday afternoon (2) | Normal | BENIGN | 127537 | 286467 |
| | PortScan | PortScan | 158930 | |


equation (27). DR, also known as recall or sensitivity, represents the probability of being correctly detected among all abnormal samples, as shown in equation (28). The crossover-mutation PSO (CMPSO) algorithm [47], ACO algorithm [48], KH algorithm [41], and IKH algorithm [9] are used for comparison. The experimental results on the Probe, DoS, R2L, and U2R datasets are as follows.
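As a small worked example of these measures, the sketch below computes accuracy, FPR, and DR from the four confusion-matrix counts. The function name `ids_metrics` and the sample counts are hypothetical and chosen only for illustration; "positive" here means a record flagged as an intrusion.

```python
def ids_metrics(tp, tn, fp, fn):
    """Compute accuracy, false positive rate, and detection rate.

    tp: attacks detected as attacks     fn: attacks missed
    tn: normal traffic passed as normal fp: normal traffic flagged as attack
    """
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,  # correctly classified / all samples
        "fpr": fp / (tn + fp),          # false alarms / all normal samples
        "dr": tp / (tp + fn),           # detected attacks / all attack samples
    }


# Hypothetical counts: 100 attack records and 100 normal records.
metrics = ids_metrics(tp=90, tn=95, fp=5, fn=10)
```

For these counts the accuracy is 185/200, the FPR is 5/100, and the DR is 90/100, which shows that a classifier can have a low false alarm rate while still missing a tenth of the attacks.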

To reflect the performance of the LNNLS-KH algorithm intuitively, the convergence curves of the fitness function for the Probe, DoS, U2R, and R2L datasets are shown in Figure 9. The results show that the LNNLS-KH algorithm achieves a good fitness function value when the number of iterations reaches about 20, which demonstrates its strong exploitation ability and good convergence performance. As the number of iterations increases, the other algorithms show varying degrees of convergence stagnation, while the LNNLS-KH algorithm repeatedly jumps out of local optima and finds solutions with better fitness. The fitness function values after 100 iterations reach 0.0328, 0.0393, 0.0292, and 0.0036, respectively, for the four attack datasets, showing excellent exploration ability. Therefore, compared with the CMPSO, ACO, KH, and IKH algorithms, the LNNLS-KH algorithm exhibits faster convergence speed and stronger abilities of exploitation and exploration.

The results of different feature selection algorithms are shown in Table 10. The number in front of the brackets indicates the quantity of features after feature selection, and the specific feature numbers are listed in the brackets. The comparison of feature selection dimensions is shown in Figure 10, where different colours are used to distinguish the five algorithms. The proposed LNNLS-KH algorithm, marked in red, lies in the innermost circle of Figure 10 for the Probe, DoS, U2R, and R2L datasets. This indicates that, compared with the other four feature selection algorithms, the LNNLS-KH algorithm retains the fewest features while ensuring accuracy. According to Figure 10, the LNNLS-KH algorithm selects 7 main features of the NSL-KDD dataset on average, accounting

Figure 9: Convergence curve of fitness functions for the four attack datasets. (Panels: DoS, Probe, R2L, and U2R; horizontal axis: number of iterations, 0 to 100; vertical axis: fitness function; curves: CMPSO, ACO, KH, IKH, and LNNLS-KH.)


for 17.07% of the total number of features. Compared with the CMPSO, ACO, KH, and IKH algorithms, the proposed LNNLS-KH algorithm reduces the number of features by 44%, 42.86%, 34.88%, and 24.32%, respectively, over the datasets of the four attack types. Meanwhile, the total number of features in the four types of attack datasets is reduced by 37.43%.
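The percentages above can be reproduced from the feature counts reported in Table 10. The sketch below is only a reading of how the figures appear to be derived: each reduction compares the total number of features selected over the four attack types, and the overall figure pools the four baseline algorithms. The variable and function names are illustrative.

```python
# Number of selected features per attack type (Probe, DoS, U2R, R2L), from Table 10.
selected = {
    "CMPSO": [14, 16, 9, 11],
    "ACO": [15, 16, 8, 10],
    "KH": [13, 12, 8, 10],
    "IKH": [11, 12, 6, 8],
    "LNNLS-KH": [8, 10, 3, 7],
}
TOTAL_FEATURES = 41  # features in the NSL-KDD dataset


def reduction_percent(baseline, proposed):
    """Relative reduction of the total feature count, in percent."""
    return 100 * (sum(baseline) - sum(proposed)) / sum(baseline)


proposed = selected["LNNLS-KH"]
baselines = {name: counts for name, counts in selected.items() if name != "LNNLS-KH"}

per_algorithm = {
    name: round(reduction_percent(counts, proposed), 2)
    for name, counts in baselines.items()
}
mean_selected = sum(proposed) / len(proposed)
share_of_total = round(100 * mean_selected / TOTAL_FEATURES, 2)

# Pooled reduction against all four baselines together.
pooled_baseline = sum(sum(counts) for counts in baselines.values())
pooled_proposed = sum(proposed) * len(baselines)
overall = round(100 * (pooled_baseline - pooled_proposed) / pooled_baseline, 2)
```

Read this way, the 37.43% figure is a pooled ratio (67 fewer features out of 179) rather than the arithmetic mean of the four individual percentages.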

To further evaluate the performance of the feature selection algorithms, we show the feature selection time and detection time of the five algorithms in Table 11. The feature selection time is the time spent filtering out redundant features. The detection time is the time from inputting the most representative feature subset into the KNN classifier to the end of detection. It can be seen from Table 11 that the feature selection time of the standard KH algorithm is shorter than that of the CMPSO and ACO algorithms, which indicates that the KH algorithm is faster. In addition, compared with the standard KH algorithm, the feature selection time of the LNNLS-KH algorithm is longer, which is mainly due to the nonlinear optimization of the physical diffusion motion and the linear nearest neighbor lasso step optimization performed after the krill herd position is updated. Although the feature selection time increases, the convergence speed and global search ability are greatly improved. At the same time, the LNNLS-KH algorithm removes redundant features, which considerably increases the detection speed. Compared with the CMPSO, ACO, KH, and IKH algorithms, the detection time of the LNNLS-KH algorithm is reduced by 16.83%, 16.91%, 8.94%, and 6.96%, respectively, over the test dataset samples of Probe, DoS, R2L, and U2R.

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the test dataset is classified using the KNN classifier. The classification accuracy of different algorithms is shown in Table 12. Comparing the accuracy results, we find that the LNNLS-KH feature selection algorithm achieves a classification accuracy above 90% for the Probe, DoS, U2R, and R2L test

Table 9: The number and name of the features in the CICIDS2017 dataset.

| Feature number | Feature name | Feature number | Feature name | Feature number | Feature name |
|---|---|---|---|---|---|
| 1 | Destination port | 27 | Bwd IAT mean | 53 | Average packet size |
| 2 | Flow duration | 28 | Bwd IAT std | 54 | Avg fwd segment size |
| 3 | Total fwd packets | 29 | Bwd IAT max | 55 | Avg bwd segment size |
| 4 | Total backward packets | 30 | Bwd IAT min | 56 | Fwd header length |
| 5 | Total length of fwd packets | 31 | Fwd PSH flags | 57 | Fwd avg bytes/bulk |
| 6 | Total length of bwd packets | 32 | Bwd PSH flags | 58 | Fwd avg packets/bulk |
| 7 | Fwd packet length max | 33 | Fwd URG flags | 59 | Fwd avg bulk rate |
| 8 | Fwd packet length min | 34 | Bwd URG flags | 60 | Bwd avg bytes/bulk |
| 9 | Fwd packet length mean | 35 | Fwd header length | 61 | Bwd avg packets/bulk |
| 10 | Fwd packet length std | 36 | Bwd header length | 62 | Bwd avg bulk rate |
| 11 | Bwd packet length max | 37 | Fwd packets/s | 63 | Subflow fwd packets |
| 12 | Bwd packet length min | 38 | Bwd packets/s | 64 | Subflow fwd bytes |
| 13 | Bwd packet length mean | 39 | Min packet length | 65 | Subflow bwd packets |
| 14 | Bwd packet length std | 40 | Max packet length | 66 | Subflow bwd bytes |
| 15 | Flow bytes/s | 41 | Packet length mean | 67 | Init_Win_bytes_forward |
| 16 | Flow packets/s | 42 | Packet length std | 68 | Init_Win_bytes_backward |
| 17 | Flow IAT mean | 43 | Packet length variance | 69 | act_data_pkt_fwd |
| 18 | Flow IAT std | 44 | FIN flag count | 70 | min_seg_size_forward |
| 19 | Flow IAT max | 45 | SYN flag count | 71 | Active mean |
| 20 | Flow IAT min | 46 | RST flag count | 72 | Active std |
| 21 | Fwd IAT total | 47 | PSH flag count | 73 | Active max |
| 22 | Fwd IAT mean | 48 | ACK flag count | 74 | Active min |
| 23 | Fwd IAT std | 49 | URG flag count | 75 | Idle mean |
| 24 | Fwd IAT max | 50 | CWE flag count | 76 | Idle std |
| 25 | Fwd IAT min | 51 | ECE flag count | 77 | Idle max |
| 26 | Bwd IAT total | 52 | Down/up ratio | 78 | Idle min |

Figure 10: Comparison of feature selection dimensions produced by different algorithms. (Radar chart with axes Probe, DoS, U2R, and R2L, scale 0 to 20; curves: CMPSO, ACO, KH, IKH, and LNNLS-KH.)


dataset samples. Furthermore, compared with the average of the other four algorithms, the LNNLS-KH algorithm improves the classification accuracy on the Probe, DoS, U2R, and R2L test dataset samples by 9.95%, 12.04%, 9.47%, and 8.66%, respectively.
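The improvements quoted above appear to be the difference between the LNNLS-KH accuracy and the mean accuracy of the four baselines for each attack type. The sketch below checks this reading against the accuracy values of Table 12, taken as percentages; the names are illustrative and the interpretation is an assumption.

```python
# Classification accuracy in percent, per attack type, from Table 12.
accuracy = {
    "Probe": {"CMPSO": 80.46, "ACO": 86.56, "KH": 92.42, "IKH": 93.74, "LNNLS-KH": 98.24},
    "DoS": {"CMPSO": 81.74, "ACO": 83.36, "KH": 86.03, "IKH": 88.74, "LNNLS-KH": 97.01},
    "U2R": {"CMPSO": 82.74, "ACO": 84.57, "KH": 85.59, "IKH": 91.89, "LNNLS-KH": 95.67},
    "R2L": {"CMPSO": 78.70, "ACO": 81.62, "KH": 88.78, "IKH": 90.49, "LNNLS-KH": 93.56},
}


def gain_over_baselines(row, proposed="LNNLS-KH"):
    """Accuracy of the proposed method minus the mean accuracy of the others."""
    baselines = [value for name, value in row.items() if name != proposed]
    return row[proposed] - sum(baselines) / len(baselines)


gains = {attack: gain_over_baselines(row) for attack, row in accuracy.items()}
mean_gain = sum(gains.values()) / len(gains)
```

Up to rounding, the per-type gains match the four figures in the text, and their mean is about 10.03 percentage points, which agrees with the overall improvement reported for the NSL-KDD dataset.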

Table 13 shows the false positive rate and detection rate of the feature subsets produced by different feature selection algorithms. To visualize the difference, we show the comparison in Figure 11. For the Probe, DoS, U2R, and R2L datasets, the average false positive rate of the LNNLS-KH feature selection algorithm is 4.00%, which is 20.70%, 15.30%, 8.88%, and 3.34% lower than that of the CMPSO, ACO, KH, and IKH algorithms, respectively. Similarly, for the detection rate, the proposed LNNLS-KH feature selection algorithm exhibits excellent performance. The average detection rate of

Table 10: The feature selection results of different feature selection algorithms (NSL-KDD dataset).

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Probe | 14 (2, 3, 4, 7, 8, 10, 11, 17, 19, 20, 21, 27, 30, 33) | 15 (1, 3, 4, 6, 15, 16, 17, 19, 21, 23, 29, 35, 39, 40, 41) | 13 (3, 4, 5, 7, 8, 13, 14, 18, 19, 21, 26, 28, 40) | 11 (2, 3, 5, 8, 10, 17, 18, 29, 34, 35, 41) | 8 (3, 4, 8, 11, 15, 29, 34, 40) |
| DoS | 16 (3, 4, 5, 6, 8, 13, 14, 17, 18, 22, 23, 26, 30, 32, 35, 41) | 16 (3, 4, 7, 12, 14, 19, 20, 25, 27, 28, 30, 33, 34, 37, 40, 41) | 12 (2, 3, 4, 5, 8, 9, 12, 15, 19, 24, 26, 30) | 12 (2, 3, 4, 6, 12, 18, 20, 22, 27, 28, 30, 31) | 10 (3, 4, 6, 15, 17, 19, 20, 21, 30, 37) |
| U2R | 9 (3, 4, 5, 9, 12, 19, 32, 33, 41) | 8 (3, 4, 6, 8, 20, 24, 33, 36) | 8 (3, 4, 10, 12, 19, 23, 31, 32) | 6 (3, 10, 11, 21, 36, 39) | 3 (3, 33, 36) |
| R2L | 11 (2, 3, 4, 8, 21, 22, 25, 27, 37, 40, 41) | 10 (3, 4, 7, 12, 17, 21, 29, 37, 38, 40) | 10 (2, 3, 4, 6, 13, 18, 19, 22, 32, 41) | 8 (3, 4, 5, 8, 11, 14, 21, 31) | 7 (2, 3, 4, 10, 15, 21, 36) |

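Table 11: Feature selection time and detection time of different feature selection algorithms (NSL-KDD dataset). FS: time of feature selection (second); Det.: time of detection (second).

| Data categories | CMPSO (FS) | ACO (FS) | KH (FS) | IKH (FS) | LNNLS-KH (FS) | CMPSO (Det.) | ACO (Det.) | KH (Det.) | IKH (Det.) | LNNLS-KH (Det.) |
|---|---|---|---|---|---|---|---|---|---|---|
| Probe | 5231.78 | 4998.14 | 4745.33 | 5348.87 | 5490.48 | 37.13 | 38.23 | 35.30 | 34.05 | 31.06 |
| DoS | 7892.35 | 7630.86 | 7168.52 | 8038.16 | 8296.92 | 118.69 | 118.15 | 106.66 | 105.14 | 98.44 |
| U2R | 154.87 | 147.29 | 144.18 | 157.79 | 172.24 | 0.087 | 0.086 | 0.086 | 0.086 | 0.078 |
| R2L | 2556.75 | 2369.08 | 2240.92 | 2669.51 | 2727.70 | 9.55 | 9.13 | 9.07 | 8.62 | 8.03 |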

Table 12: The classification accuracy of different feature selection algorithms (NSL-KDD dataset).

| Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%) |
|---|---|---|---|---|---|
| Probe | 80.46 | 86.56 | 92.42 | 93.74 | 98.24 |
| DoS | 81.74 | 83.36 | 86.03 | 88.74 | 97.01 |
| U2R | 82.74 | 84.57 | 85.59 | 91.89 | 95.67 |
| R2L | 78.70 | 81.62 | 88.78 | 90.49 | 93.56 |

Figure 11: Comparison of classification FPR and DR of different feature selection algorithms. (a) FPR of different feature selection algorithms. (b) DR of different feature selection algorithms. (Horizontal axis: Probe, DoS, U2R, and R2L; vertical axes: FPR (%) and DR (%); bars: CMPSO, ACO, KH, IKH, and LNNLS-KH.)


the LNNLS-KH algorithm is 96.48%, which is 13.47%, 9.32%, 7.02%, and 4.72% higher than that of the CMPSO, ACO, KH, and IKH feature selection algorithms, respectively.

In conclusion, the LNNLS-KH feature selection algorithm performs well in terms of the global optimal fitness iteration curve, test set detection time, number of dimensions of the feature subset, classification accuracy, false positive rate, and detection rate. Although the offline training time of the LNNLS-KH algorithm is longer than that of the CMPSO, ACO, KH, and IKH algorithms, its lower feature dimension reduces the detection time. Moreover, the algorithm has faster convergence speed, higher detection accuracy, a lower false positive rate, and a higher detection rate.

4.3. Experimental Results and Discussion of CICIDS2017 Dataset. The experiment is conducted in MATLAB R2016a on a Windows 64-bit operating system with an Intel(R) Core(TM) i7-4790 processor. The MachineLearningCVE file in the CICIDS2017 dataset includes 8 CSV files of all traffic data, which contain 78 features plus an attack type tag after some duplicate features are removed. We annotate traffic records according to different attack periods and types, and we standardize and normalize the dataset. Due to the excessive amount of data contained in the analyzed CSV files, problems such as excessively long running time and slow convergence of the model occur when the host is used for model training. Therefore, we simplified and reintegrated these CSV data files while preserving the original attack timing features. We selected a total of 12090 records and 5 types of traffic, including 1 type of normal traffic and 4 types of attack traffic, namely, “DoS,” “DDoS,” “PortScan,” and “WebAttack.” The data are randomly divided into training sets and test sets in a 2:1 ratio, with independent and repeated experiments.
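The preprocessing described above can be sketched as follows. This is a minimal illustration, not the authors' MATLAB code: min-max scaling is assumed as the normalization step, the random seed and the number of repetitions are arbitrary, and the function names are hypothetical.

```python
import random


def min_max_normalize(rows):
    """Scale every feature column to [0, 1]; constant columns become 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def split_two_to_one(records, rng):
    """Shuffle a copy of the records and split it 2:1 into train and test."""
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]


def repeated_splits(records, repeats, seed=0):
    """Independent random 2:1 splits for repeated experiments."""
    rng = random.Random(seed)
    return [split_two_to_one(records, rng) for _ in range(repeats)]


# Placeholder records standing in for the 12090 selected traffic records.
records = list(range(12090))
splits = repeated_splits(records, repeats=3)
scaled = min_max_normalize([[1.0, 10.0], [3.0, 10.0], [2.0, 10.0]])
```

With 12090 records, each repetition yields 8060 training records and 4030 test records, and averaging results over several such splits reduces the influence of any single random partition.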

The CMPSO, ACO, KH, and IKH algorithms are used for comparison with the LNNLS-KH algorithm. The preprocessed Normal, DoS, DDoS, PortScan, and WebAttack subsets are input into the algorithm model successively, and the dimension and feature subsets of feature selection are obtained. We adopt the KNN classification model as the classifier and obtain the accuracy of intrusion detection on the test set data. The results of feature selection dimension for the CICIDS2017 dataset are shown in Table 14. The LNNLS-KH algorithm selects different features for different attack types. For example, the selected features of the DoS subset are “Total Length of Bwd Packets,” “Fwd Packet Length Min,” “Flow IAT Min,” “FIN Flag Count,” “RST Flag Count,” “URG Flag Count,” “Bwd Avg Packets/Bulk,” “Idle Mean,” and “Idle Std.” For the WebAttack subset, “Total Fwd Packets,” “Bwd IAT Max,” “Bwd PSH Flags,” “Fwd Packets/s,” “Bwd Avg Packets/Bulk,” “Subflow Fwd Bytes,” “Active Max,” and “Idle Max” are selected as attack features by the LNNLS-KH algorithm. It reduces the feature dimension of the IDS dataset while ensuring high accuracy. The average feature dimension selected by the LNNLS-KH algorithm is 10.2, accounting for 13.08% of the total number of features in the CICIDS2017 dataset. It decreases the

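Table 13: The classification FPR and DR of different feature selection algorithms (NSL-KDD dataset). FPR: false positive rate (%); DR: detection rate (%).

| Data categories | CMPSO (FPR) | ACO (FPR) | KH (FPR) | IKH (FPR) | LNNLS-KH (FPR) | CMPSO (DR) | ACO (DR) | KH (DR) | IKH (DR) | LNNLS-KH (DR) |
|---|---|---|---|---|---|---|---|---|---|---|
| Probe | 22.37 | 18.04 | 8.50 | 4.05 | 1.18 | 82.32 | 89.18 | 95.01 | 95.22 | 97.73 |
| DoS | 21.27 | 14.08 | 11.45 | 7.88 | 2.85 | 79.12 | 82.08 | 83.77 | 85.23 | 96.80 |
| U2R | 24.51 | 21.04 | 16.13 | 8.45 | 4.30 | 87.02 | 89.79 | 90.14 | 93.67 | 95.52 |
| R2L | 30.66 | 24.05 | 15.42 | 8.99 | 7.67 | 83.56 | 87.56 | 88.91 | 92.89 | 95.85 |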

Figure 12: Comparison of feature selection time and intrusion detection time for different feature selection algorithms. (a) Feature selection time for different feature selection algorithms. (b) Intrusion detection time of different feature selection algorithms. (Categories: Normal, DoS, DDoS, PortScan, and WebAttack; horizontal axes: time of feature selection (second), 0 to 10000, and time of intrusion detection (second), 0 to 2.5; bars: CMPSO, ACO, KH, IKH, and LNNLS-KH.)


number of features by 57.85%, 52.34%, 27.14%, and 25%, respectively, compared with the CMPSO, ACO, KH, and IKH algorithms.
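The wrapper-style evaluation described in this subsection, in which a candidate feature subset is scored by the accuracy of a KNN classifier on the test set, can be illustrated with a small pure-Python sketch. It is a simplified stand-in for the MATLAB experiment: the value of k, the Euclidean distance, and the toy data are assumptions, and feature numbers are 1-based to match the tables.

```python
from collections import Counter


def project(rows, feature_ids):
    """Keep only the listed features; ids are 1-based as in the tables."""
    return [[row[f - 1] for f in feature_ids] for row in rows]


def knn_predict(train_x, train_y, query, k):
    """Majority label among the k training points nearest to the query."""
    nearest = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def subset_accuracy(train_x, train_y, test_x, test_y, feature_ids, k=3):
    """Test-set accuracy of KNN restricted to a feature subset."""
    train_sub = project(train_x, feature_ids)
    test_sub = project(test_x, feature_ids)
    hits = sum(
        knn_predict(train_sub, train_y, query, k) == label
        for query, label in zip(test_sub, test_y)
    )
    return hits / len(test_y)


# Toy data: feature 1 separates the classes, feature 2 is noise.
train_x = [[0.0, 5.0], [0.1, -5.0], [1.0, 5.0], [0.9, -5.0]]
train_y = [0, 0, 1, 1]
test_x = [[0.05, -5.0], [0.95, 5.0]]
test_y = [0, 1]

informative = subset_accuracy(train_x, train_y, test_x, test_y, [1], k=1)
noisy = subset_accuracy(train_x, train_y, test_x, test_y, [2], k=1)
```

On the toy data the informative feature alone classifies both test points correctly, while the noise feature alone gets only one of two right, which is the kind of difference a wrapper method exploits when ranking subsets.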

Figure 12 shows the feature selection time and intrusion detection time of the 5 feature selection algorithms to further evaluate their performance. It can be seen from Figure 12(a) that, in the feature selection stage, the LNNLS-KH algorithm takes a long time to find the optimal feature subset because of the linear nearest neighbor lasso step optimization performed after the position update of the krill herd. Compared with the KH and IKH algorithms, it increases the time by an average of 14.38% and 9.32%, respectively. Although the LNNLS-KH algorithm requires more calculation time, the convergence speed and global search ability are improved. Figure 12(b) shows the intrusion detection time of the 5 feature selection algorithms, that is, the time the KNN classifier takes to classify the sample dataset after the feature subset has been found, excluding the time spent searching for the optimal feature subset. The feature dimension of the LNNLS-KH algorithm is low, and the amount of data processed when classifying the detection sample dataset is small, which results in a shorter classification detection time. Compared with the CMPSO, ACO, KH, and IKH algorithms, the intrusion detection time of the LNNLS-KH algorithm is reduced by 6.52%, 5.17%, 2.14%, and 2.28% on average, respectively.

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the KNN classifier is used to classify the test dataset. The classification accuracy of different algorithms is shown in Table 15. For the five types of subsets, the average classification accuracy of the proposed LNNLS-KH algorithm is 95.86%. In particular, the classification accuracy reaches 97.55% for the PortScan subset. Compared with the other four feature selection methods, the LNNLS-KH algorithm achieves an average increase of 3.11%, 8.52%, 8.58%, 2.45%, and 4.29% on the

Table 14: The number of feature selection for different algorithms (CICIDS2017 dataset).

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Normal | 28 (3, 7, 13, 15, 16, 17, 20, 22, 24, 26, 30, 35, 37, 38, 42, 43, 44, 45, 46, 49, 50, 56, 59, 62, 63, 64, 65, 76) | 25 (1, 3, 4, 7, 10, 11, 12, 13, 15, 19, 29, 32, 34, 35, 37, 43, 46, 47, 51, 55, 56, 58, 73, 76, 78) | 14 (11, 19, 33, 39, 43, 49, 55, 56, 58, 65, 66, 68, 71, 73) | 14 (5, 10, 19, 20, 21, 23, 27, 33, 43, 56, 69, 70, 73, 78) | 8 (6, 12, 16, 32, 38, 50, 54, 73) |
| DoS | 24 (1, 3, 4, 13, 16, 17, 24, 26, 30, 33, 35, 39, 40, 44, 48, 51, 53, 57, 58, 59, 60, 62, 67, 70) | 19 (3, 6, 12, 13, 15, 26, 35, 39, 51, 55, 60, 61, 66, 69, 71, 73, 75, 77, 78) | 13 (8, 16, 21, 30, 45, 50, 52, 57, 59, 63, 66, 67) | 14 (2, 12, 15, 16, 19, 21, 32, 34, 44, 46, 65, 68, 76, 77) | 9 (6, 8, 20, 44, 46, 49, 61, 75, 76) |
| DDoS | 29 (15, 18, 19, 20, 23, 25, 26, 33, 34, 35, 38, 39, 42, 43, 46, 47, 49, 51, 55, 56, 57, 59, 60, 61, 62, 63, 71, 72, 78) | 27 (6, 9, 10, 13, 16, 19, 24, 28, 31, 41, 42, 45, 47, 48, 50, 51, 52, 53, 54, 56, 59, 60, 61, 62, 65, 68, 72) | 21 (10, 12, 13, 15, 18, 23, 27, 30, 34, 35, 41, 42, 45, 55, 61, 63, 65, 66, 68, 70, 76) | 18 (1, 11, 13, 14, 19, 24, 32, 35, 36, 40, 42, 47, 51, 57, 60, 69, 70, 75) | 14 (2, 5, 8, 9, 11, 22, 26, 33, 41, 43, 47, 51, 74, 77) |
| PortScan | 24 (1, 3, 6, 15, 16, 28, 30, 33, 35, 37, 44, 45, 52, 56, 59, 60, 61, 63, 65, 68, 70, 75, 77, 78) | 21 (1, 2, 6, 10, 15, 17, 26, 27, 29, 39, 42, 43, 46, 49, 58, 61, 66, 69, 70, 71, 76) | 14 (15, 20, 22, 27, 37, 44, 49, 50, 53, 59, 62, 65, 67, 78) | 15 (1, 24, 30, 32, 33, 43, 49, 53, 54, 58, 60, 61, 63, 64, 69) | 12 (2, 6, 15, 24, 25, 28, 32, 57, 59, 63, 66, 76) |
| WebAttack | 16 (2, 7, 26, 29, 45, 47, 50, 52, 53, 54, 63, 66, 68, 69, 72, 78) | 15 (3, 9, 10, 12, 19, 26, 40, 46, 50, 54, 64, 65, 68, 69, 73) | 8 (1, 17, 19, 36, 48, 49, 53, 60) | 7 (14, 17, 35, 39, 44, 48, 54) | 8 (3, 29, 32, 37, 61, 64, 73, 77) |

Table 15: The classification accuracy of different feature selection algorithms (CICIDS2017 dataset).

| Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%) |
|---|---|---|---|---|---|
| Normal | 89.78 | 89.06 | 92.70 | 94.58 | 94.64 |
| DoS | 77.03 | 82.69 | 90.90 | 93.34 | 94.51 |
| DDoS | 81.73 | 86.94 | 91.85 | 88.19 | 95.76 |
| PortScan | 92.38 | 95.64 | 95.05 | 97.35 | 97.55 |
| WebAttack | 89.12 | 93.08 | 93.77 | 94.26 | 96.85 |

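Table 16: The classification FPR and DR of different feature selection algorithms (CICIDS2017 dataset). FPR: false positive rate (%); DR: detection rate (%).

| Data categories | CMPSO (FPR) | ACO (FPR) | KH (FPR) | IKH (FPR) | LNNLS-KH (FPR) | CMPSO (DR) | ACO (DR) | KH (DR) | IKH (DR) | LNNLS-KH (DR) |
|---|---|---|---|---|---|---|---|---|---|---|
| Normal | 9.25 | 8.72 | 6.41 | 4.93 | 3.67 | 88.05 | 88.51 | 89.25 | 92.46 | 93.89 |
| DoS | 5.41 | 4.48 | 4.06 | 2.83 | 1.94 | 72.57 | 82.89 | 87.86 | 92.56 | 92.64 |
| DDoS | 6.85 | 4.92 | 4.54 | 6.33 | 3.18 | 79.03 | 83.47 | 90.22 | 87.52 | 92.98 |
| PortScan | 4.65 | 3.02 | 2.84 | 1.86 | 1.16 | 88.25 | 93.80 | 94.33 | 95.14 | 95.42 |
| WebAttack | 5.33 | 3.16 | 2.52 | 2.11 | 1.60 | 87.40 | 91.35 | 92.19 | 92.94 | 94.77 |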


Normal, DoS, DDoS, PortScan, and WebAttack subsets, respectively. Table 16 shows the classification FPR and DR of different feature selection algorithms on the test sets. Across the five test sets, the LNNLS-KH algorithm has a lower FPR and a higher DR than the other four algorithms.

We propose the LNNLS-KH algorithm, a novel feature selection algorithm for intrusion detection. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the algorithm has good feature selection performance and improves the efficiency of intrusion detection.

5. Conclusions

With the rapid development of network technology, intrusion detection plays an increasingly important role in network security. However, the “dimensional disaster” caused by massive data results in problems such as slow response and poor accuracy of the intrusion detection system. The KH algorithm is a population-based swarm intelligence optimization method that shows good performance in high-dimensional data processing, providing a new approach for reducing the dimension of intrusion detection data and selecting useful features. In this paper, an improved KH algorithm named LNNLS-KH is proposed for feature selection of IDS datasets by linear nearest neighbor lasso optimization. The LNNLS-KH algorithm introduces a new fitness function, which is composed of the number of selected feature dimensions and the classification accuracy. Nonlinear optimization is introduced into the physical diffusion motion of krill individuals to accelerate the convergence speed of the algorithm. Moreover, the linear nearest neighbor lasso step optimization is proposed to balance the exploration and exploitation abilities and to obtain the global optimal solution of the feature subset effectively. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the LNNLS-KH algorithm retains 7 and 10.2 features on average, respectively, which greatly reduces the dimension of the features. In the NSL-KDD dataset, features are reduced by 44%, 42.86%, 34.88%, and 24.32% compared with the CMPSO, ACO, KH, and IKH algorithms, and in the CICIDS2017 dataset, they are reduced by 57.85%, 52.34%, 27.14%, and 25%, respectively. In addition, the classification accuracy of the LNNLS-KH feature selection algorithm is increased by 10.03% and 5.39%, and the time of intrusion detection is reduced by 12.41% and 4.03% on the two datasets. Furthermore, the LNNLS-KH algorithm enhances the ability to jump out of local optimal solutions and shows good performance in the optimal fitness iteration curve, false positive rate of detection, and convergence speed, which demonstrates that the proposed LNNLS-KH algorithm is an efficient feature selection method for network intrusion detection.

In this research, we observed that the initialization of the LNNLS-KH algorithm has a certain degree of randomness. Therefore, we conducted independent and repeated experiments to mitigate this effect, and the results were reasonable and convincing. Although the proposed algorithm shows encouraging performance, it could be further improved.

In future work, we will consider using data balancing techniques to preprocess the experimental dataset in order to obtain more accurate feature selection results and stronger algorithm stability. Meanwhile, we will combine LNNLS-KH with other algorithms to improve the exploration and exploitation abilities, thereby further shortening the time of feature subset training and classification detection. In addition, as the LNNLS-KH algorithm is generally applicable, it can be applied to more feature selection systems and to optimization problems in other fields.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was sponsored by the National Key Research and Development Program of China (Grants 2018YFB0804002 and 2017YFB0803204), National Natural Science Foundation of P.R. China (Grant 72001191), Henan Natural Science Foundation (Grant 202300410442), and Henan Philosophy and Social Science Program (Grant 2020CZH009).

References

[1] W. Wei and C. Guo, “A text semantic topic discovery method based on the conditional co-occurrence degree,” Neurocomputing, vol. 368, pp. 11–24, 2019.

[2] C.-R. Wang, R.-F. Xu, S.-J. Lee, and C.-H. Lee, “Network intrusion detection using equality constrained-optimization-based extreme learning machines,” Knowledge-Based Systems, vol. 147, pp. 68–80, 2018.

[3] G.-G. Wang, A. H. Gandomi, A. H. Alavi, and D. Gong, “A comprehensive review of krill herd algorithm: variants, hybrids and applications,” Artificial Intelligence Review, vol. 51, no. 1, pp. 119–148, 2019.

[4] J. Amudhavel, D. Sathian, R. S. Raghav et al., “A fault tolerant distributed self-organization in peer to peer (p2p) using krill herd optimization,” in Proceedings of the 2015 International Conference on Advanced Research in Computer Science Engineering & Technology (ICARCSET 2015), pp. 1–5, Unnao, India, 2015.

[5] L. M. Abualigah, A. T. Khader, and E. S. Hanandeh, “Hybrid clustering analysis using improved krill herd algorithm,” Applied Intelligence, vol. 48, no. 11, pp. 4047–4071, 2018.

[6] P. A. Kowalski and S. Łukasik, “Training neural networks with krill herd algorithm,” Neural Processing Letters, vol. 44, no. 1, pp. 5–17, 2016.

[7] C. Stasinakis, G. Sermpinis, I. Psaradellis, and T. Verousis, “Krill-Herd Support Vector Regression and heterogeneous autoregressive leverage: evidence from forecasting and trading commodities,” Quantitative Finance, vol. 16, no. 12, pp. 1901–1915, 2016.


[8] L. Wang, P. Jia, T. Huang, S. Duan, J. Yan, and L. Wang, “A novel optimization technique to improve gas recognition by electronic noses based on the enhanced krill herd algorithm,” Sensors, vol. 16, no. 8, p. 1275, 2016.

[9] R. Jensi and G. W. Jiji, “An improved krill herd algorithm with global exploration capability for solving numerical function optimization problems and its application to data clustering,” Applied Soft Computing, vol. 46, pp. 230–245, 2016.

[10] H. Pulluri, R. Naresh, and V. Sharma, “Application of stud krill herd algorithm for solution of optimal power flow problems,” International Transactions on Electrical Energy Systems, vol. 27, no. 6, Article ID e2316, 2017.

[11] D. Rodrigues, L. A. M. Pereira, J. P. Papa et al., “A binary krill herd approach for feature selection,” in Proceedings of the 2014 22nd International Conference on Pattern Recognition, pp. 1407–1412, IEEE, Stockholm, Sweden, August 2014.

[12] A. Mukherjee and V. Mukherjee, “Chaotic krill herd algorithm for optimal reactive power dispatch considering FACTS devices,” Applied Soft Computing, vol. 44, pp. 163–190, 2016.

[13] S. Sun, H. Qi, F. Zhao, L. Ruan, and B. Li, “Inverse geometry design of two-dimensional complex radiative enclosures using krill herd optimization algorithm,” Applied Thermal Engineering, vol. 98, pp. 1104–1115, 2016.

[14] S. Sultana and P. K. Roy, “Oppositional krill herd algorithm for optimal location of capacitor with reconfiguration in radial distribution system,” International Journal of Electrical Power & Energy Systems, vol. 74, pp. 78–90, 2016.

[15] L. Brezocnik, I. Fister, and V. Podgorelec, “Swarm intelligence algorithms for feature selection: a review,” Applied Sciences, vol. 8, no. 9, 2018.

[16] D. Smith, Q. Guan, and S. Fu, “An anomaly detection framework for autonomic management of compute cloud systems,” in Proceedings of the 2010 IEEE 34th Annual Computer Software and Applications Conference Workshops, pp. 376–381, IEEE, Seoul, South Korea, July 2010.

[17] Y. Zhao, Y. Zhang, W. Tong et al., “An improved feature selection algorithm based on MAHALANOBIS distance for network intrusion detection,” in Proceedings of 2013 International Conference on Sensor Network Security Technology and Privacy Communication System, pp. 69–73, IEEE, Nangang, China, May 2013.

[18] P. Singh and A. Tiwari, “An efficient approach for intrusion detection in reduced features of KDD99 using ID3 and classification with KNNGA,” in Proceedings of the 2015 Second International Conference on Advances in Computing and Communication Engineering, pp. 445–452, IEEE, Dehradun, India, May 2015.

[19] M. A. Ambusaidi, X. He, P. Nanda, and Z. Tan, “Building an intrusion detection system using a filter-based feature selection algorithm,” IEEE Transactions on Computers, vol. 65, no. 10, pp. 2986–2998, 2016.

[20] N. Shone, T. N. Ngoc, V. D. Phai, and Q. Shi, “A deep learning approach to network intrusion detection,” IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 2, no. 1, pp. 41–50, 2018.

[21] Y. Xue, W. Jia, X. Zhao et al., “An evolutionary computation based feature selection method for intrusion detection,” Security and Communication Networks, vol. 2018, Article ID 2492956, 10 pages, 2018.

[22] Z. Shen, Y. Zhang, and W. Chen, “A bayesian classification intrusion detection method based on the fusion of PCA and LDA,” Security and Communication Networks, vol. 2019, Article ID 6346708, 11 pages, 2019.

[23] P. Sun, P. Liu, Q. Li et al., “DL-IDS: Extracting features using CNN-LSTM hybrid network for intrusion detection system,” Security and Communication Networks, vol. 2020, Article ID 8890306, 11 pages, 2020.

[24] G. Farahani, “Feature selection based on cross-correlation for the intrusion detection system,” Security & Communication Networks, vol. 2020, Article ID 8875404, 17 pages, 2020.

[25] F. G. Mohammadi, M. H. Amini, and H. R. Arabnia, “Applications of nature-inspired algorithms for dimension Reduction: enabling efficient data analytics,” in Advances in Intelligent Systems and Computing, Optimization, Learning, and Control for Interdependent Complex Networks, pp. 67–84, Springer, Cham, Switzerland, 2020.

[26] J. Kennedy and R. Eberhart, “Particle swarm optimization,” in Proceedings of the ICNN’95-International Conference on Neural Networks, no. 4, pp. 1942–1948, IEEE, Perth, WA, Australia, December 1995.

[27] M. Dorigo, M. Birattari, and T. Stützle, “Ant colony optimization,” IEEE Computational Intelligence Magazine, vol. 1, no. 4, pp. 28–39, 2006.

[28] R. Rajabioun, “Cuckoo optimization algorithm,” Applied Soft Computing, vol. 11, no. 8, pp. 5508–5518, 2011.

[29] M. Neshat, G. Sepidnam, M. Sargolzaei, and A. N. Toosi, “Artificial fish swarm algorithm: a survey of the state-of-the-art, hybridization, combinatorial and indicative applications,” Artificial Intelligence Review, vol. 42, no. 4, pp. 965–997, 2014.

[30] D. Karaboga, “An idea based on honey bee swarm for numerical optimization,” Technical Report-tr06, Erciyes University, Engineering Faculty, Computer Engineering Department, Kayseri, Turkey, 2005.

[31] W.-T. Pan, “A new Fruit Fly Optimization Algorithm: taking the financial distress model as an example,” Knowledge-Based Systems, vol. 26, pp. 69–74, 2012.

[32] R. Zhao and W. Tang, “Monkey algorithm for global numerical optimization,” Journal of Uncertain Systems, vol. 2, no. 3, pp. 165–176, 2008.

[33] X. S. Yang and X. He, “Bat algorithm: literature review and applications,” International Journal of Bio-Inspired Computation, vol. 5, no. 3, pp. 141–149, 2013.

[34] S. Mirjalili, A. H. Gandomi, S. Z. Mirjalili, S. Saremi, H. Faris, and S. M. Mirjalili, “Salp Swarm Algorithm: a bio-inspired optimizer for engineering design problems,” Advances in Engineering Software, vol. 114, pp. 163–191, 2017.

[35] K. Ahmed, A. E. Hassanien, and S. Bhattacharyya, “A novel chaotic chicken swarm optimization algorithm for feature selection,” in Proceedings of the 2017 Third International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), pp. 259–264, IEEE, Kolkata, India, November 2017.

[36] S. Tabakhi, P. Moradi, F. Akhlaghian et al., “An unsupervised feature selection algorithm based on ant colony optimization,” Engineering Applications of Artificial Intelligence, vol. 32, pp. 112–123, 2014.

[37] S. Arora and P. Anand, “Binary butterfly optimization approaches for feature selection,” Expert Systems with Applications, vol. 116, pp. 147–160, 2019.

[38] C. Yan, J. Ma, H. Luo, and A. Patel, “Hybrid binary coral reefs optimization algorithm with simulated annealing for feature selection in high-dimensional biomedical datasets,” Chemometrics and Intelligent Laboratory Systems, vol. 184, pp. 102–111, 2019.

[39] G. I. Sayed, A. Tharwat, and A. E. Hassanien, "Chaotic dragonfly algorithm: an improved metaheuristic algorithm for feature selection," Applied Intelligence, vol. 49, no. 1, pp. 188–205, 2019.

[40] Z. Zhang, P. Wei, Y. Li et al., "Feature selection algorithm based on improved particle swarm joint taboo search," Journal of Communication, vol. 39, no. 12, pp. 60–68, 2018.

[41] A. H. Gandomi and A. H. Alavi, "Krill herd: a new bio-inspired optimization algorithm," Communications in Nonlinear Science and Numerical Simulation, vol. 17, no. 12, pp. 4831–4845, 2012.

[42] Q. Tan and Z. Huang, "Krill herd with nearest neighbor lasso operator," Computer Engineering and Applications, vol. 55, no. 9, pp. 124–129, 2019.

[43] Q. Wang, C. Ding, and X. Wang, "A hybrid data clustering algorithm based on improved krill herd algorithm and KHM clustering," Control and Decision, vol. 35, no. 10, pp. 2449–2458, 2018.

[44] Q. Li and B. Liu, "Clustering using an improved krill herd algorithm," Algorithms, vol. 10, no. 2, p. 56, 2017.

[45] G.-G. Wang, A. H. Gandomi, and A. H. Alavi, "Stud krill herd algorithm," Neurocomputing, vol. 128, pp. 363–370, 2014.

[46] J. Li, Y. Tang, C. Hua, and X. Guan, "An improved krill herd algorithm: krill herd with linear decreasing step," Applied Mathematics and Computation, vol. 234, pp. 356–367, 2014.

[47] H. B. Nguyen, B. Xue, P. Andreae et al., "Particle swarm optimisation with genetic operators for feature selection," in Proceedings of the 2017 IEEE Congress on Evolutionary Computation (CEC), pp. 286–293, IEEE, San Sebastian, Spain, June 2017.

[48] M. H. Aghdam and P. Kabiri, "Feature selection for intrusion detection system using ant colony optimization," International Journal of Network Security, vol. 18, no. 3, pp. 420–432, 2016.


0.005 (ms$^{-1}$); $\delta$ represents the random direction vector, whose elements are random values in $[-1, 1]$.

3.1.4. Crossover. The crossover operator is an effective global optimization strategy. An adaptive vectorization crossover scheme is added to the standard KH algorithm to further enhance the global search ability of the algorithm [41]. It is given as follows:

$$
X_{i,m} =
\begin{cases}
X_{i,m} \cdot C_r + X_{r,m} \cdot (1 - C_r), & \mathrm{rand}_{i,m} < C_r, \\
X_{i,m}, & \text{else},
\end{cases}
\qquad
C_r = 0.2\,\hat{K}_{i,best},
\tag{15}
$$

where $r$ is a random index with $r \in \{1, 2, \ldots, i-1, i+1, \ldots, N\}$, $X_{i,m}$ represents the $m$th dimension of the $i$th krill individual, $X_{r,m}$ represents the $m$th dimension of the $r$th krill individual, and $C_r$ is the crossover probability, which decreases as the fitness increases; the crossover probability of the globally optimal individual is zero.
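The crossover rule of equation (15) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `crossover`, the argument `k_hat_i_best` (standing for $\hat{K}_{i,best}$, which is assumed to be computed elsewhere), and the injectable `rng` are assumptions made for the example, not part of the original algorithm description.

```python
import random


def crossover(herd, i, k_hat_i_best, rng=random):
    """Return a new position for krill i following equation (15).

    herd          -- list of position vectors (lists of floats)
    k_hat_i_best  -- assumed precomputed normalized fitness term of krill i
    rng           -- hypothetical hook so that the randomness can be replaced
    """
    cr = 0.2 * k_hat_i_best  # crossover probability C_r
    # The donor index r is drawn from every krill except i itself.
    r = rng.choice([idx for idx in range(len(herd)) if idx != i])
    child = []
    for x_im, x_rm in zip(herd[i], herd[r]):
        if rng.random() < cr:
            # Blend dimension m of krill i with the same dimension of krill r.
            child.append(x_im * cr + x_rm * (1.0 - cr))
        else:
            child.append(x_im)
    return child
```

When `k_hat_i_best` is zero, $C_r$ is zero and the position is returned unchanged, which matches the statement that the globally optimal individual is not crossed over.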

3.1.5. Movement Process of KH Algorithm. Affected by the movement induced by other krill individuals, foraging activity, and physical diffusion, the krill herd changes its position in the direction of the optimal fitness. The position vector of a krill individual over the interval from $t$ to $t + \Delta t$ is described as follows:

$$
X_i(t + \Delta t) = X_i(t) + \Delta t \frac{dX_i}{dt},
\tag{16}
$$

where $\Delta t$ is the scaling factor of the velocity vector. It depends entirely on the search space:

$$
\Delta t = C_t \sum_{j=1}^{NV} \left( UB_j - LB_j \right),
\tag{17}
$$

where $NV$ represents the dimension of the decision variables, $LB_j$ and $UB_j$ are the lower and upper bounds of the $j$th variable ($j = 1, 2, \ldots, NV$), and $C_t$ is the step scaling factor in the range $[0, 2]$.
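As a small illustration of equations (16) and (17), the sketch below computes the time step from the variable bounds and applies one position update. The function names and the representation of the velocity $dX_i/dt$ as a plain list are assumptions made for this example.

```python
def time_step(c_t, lower, upper):
    """Equation (17): delta_t = C_t * sum over j of (UB_j - LB_j)."""
    return c_t * sum(ub - lb for lb, ub in zip(lower, upper))


def update_position(x, velocity, delta_t):
    """Equation (16): X_i(t + delta_t) = X_i(t) + delta_t * dX_i/dt."""
    return [xj + delta_t * vj for xj, vj in zip(x, velocity)]


# Two decision variables bounded by [0, 1], with C_t = 0.5.
dt = time_step(0.5, [0.0, 0.0], [1.0, 1.0])
new_x = update_position([0.0, 1.0], [0.1, -0.2], dt)
```

Because $\Delta t$ is proportional to the total width of the search space, a wider feasible range produces proportionally larger moves for the same velocity.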

3.2. The LNNLS-KH Algorithm. In view of the unbalanced exploitation and exploration abilities of the KH algorithm, we propose the LNNLS-KH algorithm for feature selection to improve performance and to pursue a high accuracy rate, a high detection rate, and a low false positive rate in intrusion detection. The improvement is reflected in the following three aspects.

3.2.1. A New Fitness Evaluation Function. To improve the classification accuracy of feature subset detection, we introduce the feature selection dimension and the classification accuracy into the fitness evaluation function. The specific expression of fitness is as follows:

$$
fitness = \alpha \cdot \frac{Feature_{selected}}{Feature_{all}} + (1 - \alpha) \cdot (1 - Accuracy),
\tag{18}
$$

where $\alpha \in [0, 1]$ is a weighting factor used to tune the importance between the number of selected features and the classification accuracy, $Feature_{selected}$ is the number of selected features, $Feature_{all}$ represents the total number of features, and $Accuracy$ indicates the accuracy of the classification results. Moreover, k-nearest neighbor (KNN) is used as the classification algorithm, and the classification accuracy is defined as follows:

$$
Accuracy = \frac{TP + TN}{TP + TN + FP + FN},
\tag{19}
$$

where TP, TN, FP, and FN are defined in the confusion matrix, as shown in Table 2.
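Equations (18) and (19) translate directly into code. The following is a minimal sketch; the function names are illustrative, and the default $\alpha = 0.02$ is the value reported for the experiments in Section 4.2. The accuracy is assumed to come from a classifier evaluated elsewhere.

```python
def accuracy(tp, tn, fp, fn):
    """Equation (19): share of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)


def fitness(n_selected, n_all, acc, alpha=0.02):
    """Equation (18): smaller is better (fewer features, higher accuracy)."""
    return alpha * n_selected / n_all + (1.0 - alpha) * (1.0 - acc)


# Example: 7 of 41 features kept, with counts taken from a confusion matrix.
acc = accuracy(tp=90, tn=5, fp=3, fn=2)
score = fitness(n_selected=7, n_all=41, acc=acc)
```

With a small $\alpha$, the error term $(1 - Accuracy)$ dominates, so the feature-count term mainly acts as a tie-breaker between subsets of similar accuracy.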

3.2.2. Nonlinear Optimization of Physical Diffusion Motion. The physical diffusion of the krill herd is a random diffusion process. The closer the individuals are to the food, the less random the movement is. Due to the strong convergence of the algorithm, the movement of krill individuals changes nonlinearly from fast to slow, and the fitness function gradually decreases as the algorithm converges. According to equations (2) and (9), the movement induced by other krill individuals and the foraging activity are nonlinear. In the physical diffusion equation (14), the diffusion velocity $D_i$ of the $i$th krill individual decreases linearly with the number of iterations. In order to fit the nonlinear motion of the krill herd, we introduce the optimization coefficient $\lambda$ and the fitness factor $\mu_{fit}$ of the krill herd into the physical diffusion motion. The optimized physical diffusion motion is defined as follows:

$$
D_i = D_{max} \left[ 1 - \lambda \frac{I}{I_{max}} - (1 - \lambda)\,\mu_{fit} \right] \delta,
\tag{20}
$$

where $\lambda$ is in the range $[0, 1]$ and $\mu_{fit}$ is defined as follows:

$$
\mu_{fit} = \frac{K_{best}}{K_i},
\tag{21}
$$

where $K_{best}$ is the fitness value of the current optimal individual and $K_i$ represents the fitness value of the $i$th krill individual. As the number of iterations increases, $K_i$ gradually decreases until it approaches $K_{best}$. Therefore, $\mu_{fit}$ is in the range $(0, 1]$. Substituting the fitness factor $\mu_{fit}$ into equation (20) gives the new physical diffusion motion equation:

$$
D_i = D_{max} \left[ 1 - \lambda \frac{I}{I_{max}} - (1 - \lambda) \frac{K_{best}}{K_i} \right] \delta.
\tag{22}
$$

According to equation (22), the iteration number $I$, the fitness $K_i$ of the krill individual, and the fitness $K_{best}$ of the current optimal krill individual jointly determine the physical diffusion motion, so as to further adjust the random diffusion amplitude. In the early stage of the algorithm, the number of iterations is small and the fitness value of the individual is large, so the fitness factor is small, which is conducive to a large random diffusion of the krill herd. As the number of iterations gradually increases, the algorithm converges quickly and the fitness of the krill individuals approaches the global optimal solution. At the same time, the fitness factor increases nonlinearly, which makes the random diffusion more consistent with the movement process of the krill individuals.
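The behaviour described by equations (20) to (22) can be checked with a short sketch. The function below is an assumption-laden illustration: the name `diffusion`, the argument order, and the uniform sampling of $\delta$ in $[-1, 1]$ per dimension are choices made for the example, and $K_i$ is assumed to be strictly positive.

```python
import random


def diffusion(d_max, lam, iteration, max_iter, k_best, k_i, n_dims, rng=random):
    """Nonlinear physical diffusion vector D_i following equation (22)."""
    mu_fit = k_best / k_i  # fitness factor of equation (21), in (0, 1]
    scale = d_max * (1.0 - lam * iteration / max_iter - (1.0 - lam) * mu_fit)
    # delta: random direction vector with components in [-1, 1].
    delta = [rng.uniform(-1.0, 1.0) for _ in range(n_dims)]
    return [scale * d for d in delta]


# Early iteration, poor individual: the amplitude stays close to d_max.
early = diffusion(0.005, 0.5, 1, 100, k_best=0.1, k_i=1.0, n_dims=4)
# Last iteration, individual at the current optimum: the amplitude is zero.
late = diffusion(0.005, 0.5, 100, 100, k_best=0.2, k_i=0.2, n_dims=4)
```

The amplitude shrinks both as $I$ approaches $I_{max}$ and as $K_i$ approaches $K_{best}$, so an individual that is already near the best solution diffuses less than a poor one at the same iteration.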

To further evaluate the effect of the nonlinear optimization of physical diffusion motion on the KH algorithm (NO-KH), we conducted experiments on two classical benchmark functions. $F_1(x)$ is the Schwefel 2.22 function, which is a unimodal benchmark function. $F_2(x)$ is the Ackley function, which is a multimodal benchmark function. The experimental parameters of $F_1(x)$ and $F_2(x)$ are shown in Table 3.

Figure 4 shows the Schwefel 2.22 function and the Ackley function graphs for $n = 2$. We use the standard KH algorithm and the NO-KH algorithm to find the optimal value on both the unimodal and the multimodal benchmark functions. The number of krill and the number of iterations are set to 25 and 500, respectively. Table 4 shows the best value, worst value, mean value, and standard deviation obtained by running each algorithm 20 times. Compared with the standard KH algorithm, the NO-KH algorithm finds smaller optimal solutions on both the unimodal and the multimodal benchmark functions, and its global exploration ability is improved. The smaller standard deviation obtained from the repeated experiments shows that the NO-KH algorithm has better stability. Therefore, the nonlinear optimization of the physical diffusion motion of the KH algorithm is effective.
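For reference, the two benchmark functions used in this experiment can be written as follows. The sketch uses the usual textbook forms of the Schwefel 2.22 function (sum plus product of absolute values) and the Ackley function; the function names are chosen for this example only.

```python
import math


def schwefel_2_22(x):
    """Schwefel 2.22: sum of |x_i| plus product of |x_i| (unimodal)."""
    return sum(abs(v) for v in x) + math.prod(abs(v) for v in x)


def ackley(x):
    """Ackley function in its standard form (multimodal)."""
    n = len(x)
    mean_sq = sum(v * v for v in x) / n
    mean_cos = sum(math.cos(2.0 * math.pi * v) for v in x) / n
    return (-20.0 * math.exp(-0.2 * math.sqrt(mean_sq))
            - math.exp(mean_cos) + 20.0 + math.e)


# Both functions reach their minimum value of 0 at the origin.
origin = [0.0] * 10
values_at_origin = (schwefel_2_22(origin), ackley(origin))
```

Both functions have a global minimum of 0 at the origin, which is why values such as those in Table 4 can be read directly as distances from the optimum.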

The above analysis shows that introducing the optimization coefficient $\lambda$ and the fitness factor $\mu_{fit}$ into the physical diffusion motion of the krill herd helps to dynamically adjust the random diffusion amplitude of the krill individuals and to accelerate the convergence of the algorithm. Meanwhile, it increases the nonlinearity of the physical diffusion motion and the global exploration ability of the algorithm.

3.2.3. Linear Nearest Neighbor Lasso Step Optimization. When the KH algorithm is used to solve multidimensional complex function optimization problems, its local search ability is weak, and exploitation and exploration are difficult to balance. To enhance the local exploitation and global exploration abilities of the algorithm, an improved KH algorithm that considers the influence of excellent neighbor individuals on the krill herd during evolution is proposed in [42]. That algorithm introduces the nearest neighbor lasso operator to mine the neighborhood of potentially excellent individuals and thereby improve the local search ability of krill individuals, but the random parameters introduced in the lasso operator increase the uncertainty of the algorithm. To cope with this problem, we introduce an improved krill herd algorithm based on linear nearest neighbor lasso step optimization (LNNLS-KH), which finds the nearest neighbor of each krill individual after the individual position is updated and moves linearly by a defined step to obtain a better fitness value. With linearization, the nearest neighbor lasso step changes linearly with the number of iterations, which balances the exploitation and exploration abilities of the algorithm. In the early iterations, a large linear nearest neighbor lasso step is selected so that the krill individuals can quickly adjust their positions, which improves the search efficiency of the algorithm. In the later iterations, the nearest neighbor lasso step decreases linearly to obtain the global optimal solution.

In the krill herd $X = \{X_1, X_2, \ldots, X_n\}$, assuming that the $j$th krill individual is the nearest neighbor of the $i$th krill individual, the Euclidean distance between the two krill individuals is defined as follows:

$$
distance_{ij} = \lVert X_i - X_j \rVert,
\tag{23}
$$

where $\{X_i, X_j\} \subset X$ and $i \neq j$. The linear nearest neighbor lasso step is defined as follows:

$$
step =
\begin{cases}
\dfrac{I}{I_{max}} \times (X_i - X_j), & K_i > K_j, \\[2mm]
\dfrac{I}{I_{max}} \times (X_j - X_i), & K_j > K_i.
\end{cases}
\tag{24}
$$

The fitness function is expressed as equation (18). Therefore, a smaller fitness value means that fewer features are selected at a higher accuracy, i.e., the position of the krill individual is better. The schematic diagram of LNNLS-KH is shown in Figure 5. The new position $Y_k$ of the $j$th krill individual is expressed as follows:

$$
Y_k =
\begin{cases}
X_j + \dfrac{I}{I_{max}} \times (X_i - X_j), & K_i > K_j, \\[2mm]
X_i + \dfrac{I}{I_{max}} \times (X_j - X_i), & K_j > K_i.
\end{cases}
\tag{25}
$$

When the $i$th and $j$th krill individuals lie at opposite ends of the food, the new position $Y_k$ will be far from the optimal solution after the linear nearest neighbor lasso step optimization, as shown in Figure 6.
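The nearest neighbor search of equation (23) and the candidate position of equation (25) can be sketched as follows. The function names are illustrative assumptions, and the handling of the tie case $K_i = K_j$ (which equation (25) leaves open) by falling into the second branch is also an assumption.

```python
import math


def nearest_neighbor(herd, i):
    """Index j != i of the krill closest to krill i (Euclidean distance)."""
    others = (j for j in range(len(herd)) if j != i)
    return min(others, key=lambda j: math.dist(herd[i], herd[j]))


def lnnls_candidate(x_i, x_j, k_i, k_j, iteration, max_iter):
    """Candidate position Y_k following equation (25)."""
    ratio = iteration / max_iter  # linear factor I / I_max
    if k_i > k_j:
        # Start from X_j and move by the step of equation (24).
        return [b + ratio * (a - b) for a, b in zip(x_i, x_j)]
    # K_j > K_i (ties are treated the same way here, as an assumption).
    return [a + ratio * (b - a) for a, b in zip(x_i, x_j)]


herd = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
j = nearest_neighbor(herd, 0)
y_k = lnnls_candidate(herd[0], herd[j], k_i=0.5, k_j=0.2,
                      iteration=25, max_iter=100)
```

In Algorithm 1 the candidate $Y_k$ is then compared with the existing individual by fitness and only the better of the two is kept, so a poor candidate such as the one in Figure 6 is simply discarded.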

Table 2: Confusion matrix.

| Confusion matrix | True condition positive | True condition negative |
|---|---|---|
| Predicted condition positive | True positive (TP) | False positive (FP) |
| Predicted condition negative | False negative (FN) | True negative (TN) |


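Table 3: Benchmark functions in the experiment.

| Benchmark function | Dim | Range | $f_{min}$ |
|---|---|---|---|
| $F_1(x) = \sum_{i=1}^{n} \lvert x_i \rvert + \prod_{i=1}^{n} \lvert x_i \rvert$ | 10 | $[-10, 10]$ | 0 |
| $F_2(x) = -20\exp\left(-0.2\sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}\right) - \exp\left(\frac{1}{n}\sum_{i=1}^{n}\cos(2\pi x_i)\right) + 20 + e$ | 10 | $[-32, 32]$ | 0 |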

Figure 4: Schwefel 2.22 function and Ackley function graphs for $n = 2$. (a) Unimodal benchmark function Schwefel 2.22 ($F_1$). (b) Multimodal benchmark function Ackley ($F_2$). [Surface plots over $x_1$ and $x_2$.]

Table 4: The statistical results of KH and NO-KH algorithms on two benchmark functions.

| f(x) | Algorithm | Best value | Worst value | Mean value | Standard deviation |
|---|---|---|---|---|---|
| F1 | KH | 1.692E−04 | 1.099E−02 | 1.508E−03 | 3.342E−03 |
| F1 | NO-KH | 3.277E−05 | 9.632E−04 | 4.221E−04 | 3.908E−04 |
| F2 | KH | 5.716E−05 | 2.168 | 0.329 | 0.816 |
| F2 | NO-KH | 8.309E−06 | 1.155 | 0.116 | 0.362 |

Figure 5: Optimization of linear nearest neighbor lasso step for krill individuals at the same end of food. [Schematic showing the food position, the krill positions $X_1$, $X_2$, $X_3$, $X_i$, $X_j$, and $X_m$, and the new krill positions $Y_{k1}$ and $Y_{k2}$ after LNNLS; the legend marks the distance between two krill and the length of the LNNLS.]

Figure 6: Optimization of linear nearest neighbor lasso step for krill individuals at both ends of food. [Schematic showing the food position, the krill positions $X_1$, $X_2$, $X_3$, $X_i$, and $X_j$ with $distance_{ij}$ between $X_i$ and $X_j$, and the new krill position $Y_{k1}$ after LNNLS; the legend marks the distance between two krill and the length of the LNNLS.]


The pseudocode of the LNNLS-KH algorithm is shown in Algorithm 1.

3.3. Analysis of Time Complexity. In the KH algorithm, each krill individual updates its position after the movement induced by other krill individuals, foraging activity, and physical diffusion motion, with a time complexity of $O(N)$. After $I_{max}$ iterations, the time complexity of the algorithm is $O(I_{max} \cdot N)$. In the LNNLS-KH algorithm, the modified fitness function and the nonlinear optimization of the physical diffusion motion add almost no calculations, so the time complexity is unchanged. In addition, the linear nearest neighbor lasso step optimization adds the calculations of equations (24) and (25) after each krill individual completes its position update in every iteration, with a time complexity of $O(I_{max} \cdot N)$. Therefore, the total time complexity of the LNNLS-KH algorithm is $O(2 I_{max} \cdot N)$, which is of the same order as that of the KH algorithm.

3.4. Description of the LNNLS-KH Algorithm for IDS Feature Selection. An IDS is a system that recognizes and processes malicious usage of computers and network resources. The intrusion detection dataset records normal and abnormal traffic, including network traffic data and types of network attacks, and provides data support for the research and development of intrusion detection technology. An IDS is generally composed of data acquisition, data preprocessing, detection units, and response actions, as shown in Figure 7.

The LNNLS-KH algorithm is used to select high-quality feature subsets for the IDS. The features of the intrusion detection dataset are randomly initialized to different real numbers in the range $[0, 1]$, which constitute the position vectors of the krill herd. By calculating the fitness function and carrying out the LNNLS-KH algorithm, the position vectors of the krill herd are constantly updated. The fitness function is determined by the number of selected features and the classification accuracy, so the position vectors of the krill herd move toward the optimal fitness value. According to [47], it is appropriate to set the feature selection threshold to 0.7. When the maximum number of iterations is reached, the components of the krill position vector that are larger than the threshold are selected. The selected features constitute the feature subset of the intrusion detection data. The selected feature subset is then sent to the detection units. Because the k-nearest neighbor (KNN) algorithm is relatively mature in theory, the detection units adopt it to construct the intrusion detection classifier. Finally, the intrusion detection results are evaluated on the test dataset. The process of the LNNLS-KH algorithm for IDS feature selection is shown in Figure 8.
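The mapping from a krill position vector to a feature subset described above is a simple thresholding step. A minimal sketch follows, assuming that feature numbers are 1-based (as in Table 7) and that "larger than the threshold" means a strict comparison; the function name is illustrative.

```python
def select_features(position, threshold=0.7):
    """Return the 1-based indices of features whose position value exceeds the threshold."""
    return [idx for idx, value in enumerate(position, start=1)
            if value > threshold]


# A 6-dimensional position vector with values in [0, 1].
subset = select_features([0.10, 0.90, 0.70, 0.71, 0.30, 0.95])
# subset == [2, 4, 6]
```

The length of the returned list is the $Feature_{selected}$ term of equation (18), so the same thresholding is needed whenever the fitness of a position vector is evaluated.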

4. Results and Discussion

To verify the performance of the LNNLS-KH algorithm in IDS feature selection, we adopt the NSL-KDD network intrusion detection dataset and the CICIDS2017 dataset for experiments.

4.1. Datasets Analysis. The NSL-KDD dataset is a classic dataset that has been widely used in the field of anomaly detection. As an improved version of the KDD CUP 99 dataset, it is currently one of the most reliable and influential intrusion detection datasets. Compared with the KDD CUP 99 dataset, the NSL-KDD dataset eliminates duplicate data, so it hardly contains redundant records. Meanwhile, the proportion of each type of record in the NSL-KDD dataset has been adjusted to be reasonable. Each record in the NSL-KDD dataset includes 41-dimensional features and a classification label. KDDTrain+ and KDDTest+ in the NSL-KDD dataset are selected as the training subset and the test subset. The attacks are divided into four types: denial of service (DoS), scan and probe (Probe), remote to local (R2L), and user to root (U2R). The detailed attack names and the distribution of sample categories are shown in Tables 5 and 6. The features of the NSL-KDD dataset are shown in Table 7.

The NSL-KDD dataset includes four types of features: the basic features of TCP connections (9 in total), the content features of TCP connections (13 in total), the time-based network traffic statistics (9 in total), and the host-based network traffic statistics (10 in total). Among all the features, "Protocol_type," "service," and "flag" are character-type features, which need to be preprocessed and mapped to numeric values. Because mixed numeric and character data types are difficult to deal with, one-hot encoding is used to map different characters to different values. For example, the "Protocol_type" feature includes three types of protocol, denoted by icmp = [1, 0, 0], tcp = [0, 1, 0], and udp = [0, 0, 1]. Similarly, the 70 attributes in "service" and the 11 attributes in "flag" are numeralized in the same way. The 41-dimensional feature is expanded to 122 dimensions after one-hot encoding. At the same time, the dataset is normalized to eliminate the influence of features of different orders of magnitude on the calculation results, thus reducing the experimental error. The data preprocessing helps to improve the accuracy of classification and ensure the reliability of the results. The values corresponding to each feature are normalized to the interval $[0, 1]$, and the normalization expression is as follows:

$$
X^{*} = \frac{X - X_{min}}{X_{max} - X_{min}},
\tag{26}
$$

where $X^{*}$ is the normalized eigenvalue, $X$ is the original eigenvalue, and $X_{max}$ and $X_{min}$ represent the maximum and minimum values of the same feature dimension.
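The two preprocessing steps described above, one-hot encoding of character features and min-max normalization onto $[0, 1]$, can be sketched as follows. The function names are illustrative, and mapping a constant column to all zeros is an assumption made here to avoid division by zero; the text does not say how such columns are handled.

```python
def one_hot(value, categories):
    """Encode a categorical value as a 0/1 vector over the given categories."""
    return [1 if value == c else 0 for c in categories]


def min_max_normalize(column):
    """Scale a numeric column onto [0, 1] using its minimum and maximum."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant column carries no information, map it to 0.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


protocols = ["icmp", "tcp", "udp"]
encoded = one_hot("tcp", protocols)          # [0, 1, 0]
scaled = min_max_normalize([2.0, 4.0, 6.0])  # [0.0, 0.5, 1.0]
```

The dimension count follows from the encoding: the 38 numeric features are kept, and the three character features expand to 3 + 70 + 11 = 84 columns, giving 122 in total.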

Figure 7: The framework of IDS. [Block diagram with four stages: data acquisition, data preprocessing, detection units, and response actions.]

Figure 8: The process of LNNLS-KH algorithm for IDS feature selection. [Flowchart: input the IDS dataset; initialize the parameters ($N$, $NV$, $I_{max}$, $UB$, $LB$) and the krill herd position; calculate the fitness of the individuals; calculate the three actions (movement induced by other krill individuals, foraging activity, and nonlinear physical diffusion motion); apply the genetic operator; update the positions and fitness values; find the nearest krill and calculate the linear lasso step; keep the better of the new position $Y_k$ and the original $X_i$ or $X_j$; update $X_{gb}$ and $K_{gb}$ of the global optimal individuals; repeat until the iteration count exceeds $I_{max}$; output the optimal solution and the number of selected features; run the KNN algorithm for intrusion detection; evaluate the intrusion detection results.]

Although NSL-KDD is a benchmark dataset in the field of network intrusion detection, some of its attack types are outdated due to the rapid development of network technology. Therefore, it hardly reflects the current real-network environment. CICIDS2017 is a novel network intrusion detection dataset released by the Canadian Institute for Cybersecurity (CIC) in 2017. The dataset collected traffic data for five days, with only normal traffic on Monday and attacks occurring in the morning and afternoon from Tuesday to Friday. It includes "FTP patator," "SSH patator," "DoS GoldenEye," "DoS Slowhttptest," "DoS Slowloris," "Heartbleed," "Web Attack Brute Force," "Web Attack Sql Injection," "Web Attack XSS," "Infiltration Attack," "Bot," "DDoS," and "PortScan," which are common types of attacks in modern networks. The distribution of attack time and types of the CICIDS2017 dataset is shown in Table 8. We use the MachineLearningCVE file in the CICIDS2017 dataset as the dataset, which contains 78 features and an attack type label. The number and name of each feature are shown in Table 9. Compared with the NSL-KDD dataset, the attack types in the CICIDS2017 dataset are more in line with the situation of modern networks.

Algorithm 1: The LNNLS-KH algorithm.

Input: Training set
Output: Global best solution, the number of selected features, and feature selection time

(1) Begin
(2) Initialize algorithm parameters: $N_{max}$, $V_f$, $D_{max}$, $NV$, $I_{max}$, $UB$, $LB$
(3) Initialize the krill herd position
(4) Evaluate the fitness of krill individuals and find the individuals with the best and worst fitness values
(5) for $I = 1$ to $I_{max}$ do
(6)     for each krill individual $i$ ($i = 1, 2, \ldots, m$) do
(7)         Calculate the three components of motion:
(8)             (1) The motion induced by other krill individuals
(9)             (2) The foraging activity
(10)            (3) The nonlinear optimized physical diffusion
(11)        Implement crossover operator
(12)        Update krill herd position and fitness values
(13)        Calculate the linear nearest neighbor lasso step and new position using equations (24) and (25), and update new fitness values
(14)        if $K_{yk} > K_i$ or ($K_j$)
(15)            Leave $K_i$ or ($K_j$) and delete $K_{yk}$
(16)        else
(17)            Leave $K_{yk}$ and delete $K_i$ or ($K_j$)
(18)        end if
(19)    end for
(20)    Update $X_{gb}$ and $K_{gb}$ of the globally optimal individuals
(21) end for
(22) Output the global best solution, the number of selected features, and feature selection time
(23) End

Table 5: The distribution of sample categories.

| Attack types | Attack names |
|---|---|
| DoS | Neptune, back, land, pod, smurf, teardrop, mailbomb, Apache2, processtable, udpstorm, worm |
| Probe | Ipsweep, nmap, portsweep, Satan, mscan, saint |
| R2L | ftp_write, guess_passwd, imap, multihop, phf, spy, warezclient, warezmaster, sendmail, named, snmpgetattack, snmpguess, xlock, xsnoop, httptunnel |
| U2R | buffer_overflow, loadmodule, perl, rootkit, ps, sqlattack, xterm |

Table 6: The distribution of sample categories.

| Data category | KDDTrain+ samples | KDDTest+ samples | Total number of samples |
|---|---|---|---|
| Normal | 65120 | 11536 | 76656 |
| DoS | 36944 | 6251 | 43195 |
| Probe | 10786 | 2421 | 13207 |
| R2L | 995 | 2653 | 3648 |
| U2R | 52 | 67 | 119 |
| All | 113897 | 22928 | 136825 |

4.2. Experimental Results and Discussion of NSL-KDD Dataset. The experiment is conducted in MATLAB R2016a on a Windows 64-bit operating system with an Intel(R) Core(TM) i7-4790 processor. Since training the algorithm requires normal and abnormal samples, we mix normal samples with different types of attack samples to construct training sets and test sets for the four attack types. To reduce the time spent searching for the optimal feature subset, we randomly select 50% of the Probe attack samples, 10% of the DoS attack samples, 100% of the U2R attack samples, and 100% of the R2L attack samples in the KDDTrain+ dataset as the training dataset, and 100% of the Probe dataset, 50% of the DoS dataset, 100% of the U2R dataset, and 20% of the R2L dataset in the KDDTest+ dataset as the test dataset.

For the LNNLS-KH algorithm, the maximum number of iterations $I_{max}$ and the number of krill individuals $N$ are set to 100 and 30, respectively. Following [41], the foraging speed of krill individuals $V_f$ is set to 0.02, the maximum random diffusion rate $D_{max}$ is set to 0.005, and the maximum induction speed $N_{max}$ is set to 0.01. Following [47], the threshold $\theta$ is set to 0.7. Because the LNNLS-KH algorithm is designed to ensure high accuracy first and to reduce the number of features second, the weight factor $\alpha$ in the fitness function is set to 0.02.

$$
FPR = \frac{FP}{TN + FP},
\tag{27}
$$

$$
DR = \frac{TP}{TP + FN}.
\tag{28}
$$
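The false positive rate and detection rate of equations (27) and (28) can be computed from the confusion-matrix counts of Table 2. The helper names below are illustrative, and the counts in the example are made up purely to show the arithmetic.

```python
def false_positive_rate(fp, tn):
    """Equation (27): normal samples wrongly flagged, over all normal samples."""
    return fp / (tn + fp)


def detection_rate(tp, fn):
    """Equation (28): attacks correctly detected, over all attack samples."""
    return tp / (tp + fn)


# Hypothetical counts: 100 normal samples and 100 attack samples.
fpr = false_positive_rate(fp=5, tn=95)  # 0.05
dr = detection_rate(tp=80, fn=20)       # 0.8
```

The two measures use disjoint rows of the confusion matrix, so a classifier can score well on one and poorly on the other; this is why both are reported alongside accuracy.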

Table 7: The features of NSL-KDD dataset.

| Classification of features | Number | Serial number and name of features |
|---|---|---|
| The basic characteristics of TCP connections | 9 | (1) duration, (2) protocol_type, (3) service, (4) flag, (5) src_bytes, (6) dst_bytes, (7) land, (8) wrong_fragment, (9) urgent |
| The content characteristics of a TCP connection | 13 | (10) hot, (11) num_failed_logins, (12) logged_in, (13) num_compromised, (14) root_shell, (15) num_root, (16) su_attempted, (17) num_file_creations, (18) num_shells, (19) num_access_files, (20) num_outbound_cmds, (21) is_host_login, (22) is_guest_login |
| Time-based statistical characteristics of network traffic | 9 | (23) count, (24) srv_count, (25) serror_rate, (26) srv_serror_rate, (27) rerror_rate, (28) srv_rerror_rate, (29) same_srv_rate, (30) diff_srv_rate, (31) srv_diff_host_rate |
| Host-based network traffic statistics | 10 | (32) dst_host_count, (33) dst_host_srv_count, (34) dst_host_same_srv_rate, (35) dst_host_diff_srv_rate, (36) dst_host_same_src_port_rate, (37) dst_host_srv_diff_host_rate, (38) dst_host_serror_rate, (39) dst_host_srv_serror_rate, (40) dst_host_rerror_rate, (41) dst_host_srv_rerror_rate |

Table 8: Attack time and attack types of the CICIDS2017 dataset.

| Time | Type | Label | Amount | Total |
|---|---|---|---|---|
| Monday | Normal | BENIGN | 529918 | 529918 |
| Tuesday | Normal | BENIGN | 432074 | 445909 |
| | Brute force | FTP patator | 7938 | |
| | | SSH patator | 5897 | |
| Wednesday | Normal | BENIGN | 440031 | 692703 |
| | DoS | DoS GoldenEye | 10293 | |
| | | DoS slowhttptest | 5499 | |
| | | DoS slowloris | 5796 | |
| | | Heartbleed | 11 | |
| Thursday morning | Normal | BENIGN | 168186 | 170366 |
| | Web attack | Web attack brute force | 1507 | |
| | | Web attack sql injection | 21 | |
| | | Web attack XSS | 652 | |
| Thursday afternoon | Normal | BENIGN | 288566 | 288602 |
| | Infiltration | Infiltration | 36 | |
| Friday morning | Normal | BENIGN | 189067 | 191033 |
| | Botnet | Bot | 1966 | |
| Friday afternoon (1) | Normal | BENIGN | 97718 | 225745 |
| | DDoS | DDoS | 128027 | |
| Friday afternoon (2) | Normal | BENIGN | 127537 | 286467 |
| | PortScan | PortScan | 158930 | |

We adopt the iterative curve of the global optimal fitness value, feature selection time, test set detection time, data dimension after feature selection, classification accuracy, detection rate (DR), and false positive rate (FPR) as evaluation measures of feature selection for IDS. The accuracy represents the ratio of correctly classified samples to the total number of samples, as defined in equation (19). FPR, also known as the false alarm rate (FAR), represents the ratio of normal samples that are incorrectly detected as intrusions to all normal samples, as shown in equation (27). DR, also known as recall or sensitivity, represents the probability that an abnormal sample is correctly detected among all abnormal samples, as shown in equation (28). The crossover-mutation PSO (CMPSO) algorithm [47], ACO algorithm [48], KH algorithm [41], and IKH algorithm [9] are used for comparison. The experimental results for the Probe, DoS, R2L, and U2R datasets are as follows.

To reflect the performance of the LNNLS-KH algorithm intuitively, the convergence curves of the fitness function for the Probe, DoS, U2R, and R2L datasets are shown in Figure 9. The results show that the LNNLS-KH algorithm achieves a good fitness function value when the number of iterations reaches about 20, which demonstrates the strong exploitation ability and good convergence performance of the LNNLS-KH algorithm. As the number of iterations increases, the other algorithms show varying degrees of convergence stagnation, while the LNNLS-KH algorithm repeatedly jumps out of local optima and finds the global optimal solution with better fitness. The fitness function values after 100 iterations reach 0.0328, 0.0393, 0.0292, and 0.0036, respectively, for the four attack datasets, showing excellent exploration ability. Therefore, compared with the CMPSO, ACO, KH, and IKH algorithms, the LNNLS-KH algorithm exhibits faster convergence and stronger exploitation and exploration abilities.

Figure 9: Convergence curve of fitness functions for the four attack datasets. [Four panels: Probe, DoS, U2R, and R2L; horizontal axis: number of iterations (0 to 100); vertical axis: fitness function; curves: CMPSO, ACO, KH, IKH, and LNNLS-KH.]

The results of different feature selection algorithms are shown in Table 10. The number in front of the brackets indicates the quantity of features after feature selection, and the specific feature numbers are listed in the brackets. The comparison of feature selection dimensions is shown in Figure 10, where different colours are used to distinguish the five algorithms. The proposed LNNLS-KH algorithm, marked in red, is in the innermost circle of Figure 10 for the Probe, DoS, U2R, and R2L datasets. This indicates that, compared with the other four feature selection algorithms, the LNNLS-KH algorithm retains the fewest features while ensuring accuracy. According to Figure 10, the LNNLS-KH algorithm selects 7 main features of the NSL-KDD dataset on average, accounting for 17.07% of the total number of features. Compared with the CMPSO, ACO, KH, and IKH algorithms, the proposed LNNLS-KH algorithm reduces the number of features by 44%, 42.86%, 34.88%, and 24.32%, respectively, over the datasets of the four attack types. Meanwhile, the total number of features over the four types of attack datasets is reduced by 37.43%.
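The reduction percentages quoted above can be reproduced from the feature counts in Table 10. The short check below is a sketch: the dictionary layout and function name are chosen for this example, while the counts themselves (in the order Probe, DoS, U2R, R2L) are taken from Table 10.

```python
# Number of selected features per dataset (Probe, DoS, U2R, R2L), from Table 10.
counts = {
    "CMPSO": [14, 16, 9, 11],
    "ACO": [15, 16, 8, 10],
    "KH": [13, 12, 8, 10],
    "IKH": [11, 12, 6, 8],
    "LNNLS-KH": [8, 10, 3, 7],
}


def reduction_percent(baseline, proposed):
    """Relative reduction of the total feature count, in percent."""
    return 100.0 * (sum(baseline) - sum(proposed)) / sum(baseline)


proposed = counts["LNNLS-KH"]
reductions = {name: round(reduction_percent(c, proposed), 2)
              for name, c in counts.items() if name != "LNNLS-KH"}
# reductions: CMPSO 44.0, ACO 42.86, KH 34.88, IKH 24.32

mean_selected = sum(proposed) / len(proposed)          # 7.0 features on average
share_of_all = round(100.0 * mean_selected / 41, 2)    # 17.07 (% of 41 features)

baseline_total = sum(sum(c) for n, c in counts.items() if n != "LNNLS-KH")
overall = round(100.0 * (baseline_total - 4 * sum(proposed)) / baseline_total, 2)
# overall: 37.43
```

The overall figure compares the 179 features selected in total by the four baseline algorithms with four times the 28 features selected by LNNLS-KH.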

To further evaluate the performance of the feature selection algorithms, we show the feature selection time and detection time of the five algorithms in Table 11. Feature selection time is the time spent filtering out redundant features. Detection time is the time from inputting the most representative feature subset into the KNN classifier to the end of detection. Table 11 shows that the feature selection time of the standard KH algorithm is shorter than that of the CMPSO and ACO algorithms, which indicates that the KH algorithm achieves faster speed and better performance. In addition, compared with the standard KH algorithm, the feature selection time of the LNNLS-KH algorithm is longer, mainly because of the nonlinear optimization of the physical diffusion motion and the linear nearest neighbor lasso step optimization performed after the krill herd position is updated. Although the feature selection time increases somewhat, the convergence speed and global search ability are greatly improved. At the same time, the LNNLS-KH algorithm removes redundant features, which considerably increases the detection speed. Compared with the other four feature selection algorithms, the detection time of the LNNLS-KH algorithm is reduced by 16.83%, 16.91%, 8.94%, and 6.96% on average over the Probe, DoS, R2L, and U2R test samples.

2e selection results of CMPSO ACO KH IKH andLNNLS-KH algorithms are used as feature subsets and thetest dataset is detected using KNN classifier 2e classifi-cation accuracy of different algorithms is shown in Table 12Comparing the accuracy of results it is found that LNNLS-KH feature selection algorithm achieves a classificationaccuracy of above 90 for Probe DoS U2R and R2L test

Table 9 2e number and name of the features in the CICIDS2017 dataset

Feature number Feature name Feature number Feature name Feature number Feature name1 Destination port 27 Bwd IAT mean 53 Average packet size2 Flow duration 28 Bwd IAT std 54 Avg fwd segment size3 Total fwd packets 29 Bwd IAT max 55 Avg bwd segment size4 Total backward packets 30 Bwd IAT min 56 Fwd header length5 Total length of fwd packets 31 Fwd PSH flags 57 Fwd avg bytesbulk6 Total length of bwd packets 32 Bwd PSH flags 58 Fwd avg packetsbulk7 Fwd packet length max 33 Fwd URG flags 59 Fwd avg bulk rate8 Fwd packet length min 34 Bwd URG flags 60 Bwd avg bytesbulk9 Fwd packet length mean 35 Fwd header length 61 Bwd avg packetsbulk10 Fwd packet length std 36 Bwd header length 62 Bwd avg bulk rate11 Bwd packet length max 37 Fwd Packetss 63 Subflow fwd packets12 Bwd packet length min 38 Bwd Packetss 64 Subflow fwd bytes13 Bwd packet length mean 39 Min packet length 65 Subflow bwd packets14 Bwd packet length std 40 Max packet length 66 Subflow bwd bytes15 Flow bytess 41 Packet length mean 67 Init_Win_bytes_forward16 Flow packetss 42 Packet length std 68 Init_Win_bytes_backward17 Flow IAT mean 43 Packet length variance 69 act_data_pkt_fwd18 Flow IAT std 44 FIN flag count 70 min_seg_size_forward19 Flow IAT max 45 SYN flag count 71 Active mean20 Flow IAT min 46 RST flag count 72 Active std21 Fwd IAT total 47 PSH flag count 73 Active max22 Fwd IAT mean 48 ACK flag count 74 Active min23 Fwd IAT std 49 URG flag count 75 Idle mean24 Fwd IAT max 50 CWE flag count 76 Idle std25 Fwd IAT min 51 ECE flag count 77 Idle max26 Bwd IAT total 52 Downup ratio 78 Idle min

Figure 10: Comparison of feature selection dimensions produced by different algorithms (CMPSO, ACO, KH, IKH, and LNNLS-KH) on the Probe, DoS, U2R, and R2L subsets.


dataset samples. Furthermore, the LNNLS-KH algorithm improves the average classification accuracy of the Probe, DoS, U2R, and R2L test dataset samples by 9.95%, 12.04%, 9.47%, and 8.66%, respectively.

Table 13 shows the false positive rate and detection rate of the feature subsets produced by the different feature selection algorithms. To visualize the difference, we show the comparison in Figure 11. For the Probe, DoS, U2R, and R2L datasets, the average false positive rate of the LNNLS-KH feature selection algorithm is 4.00%, which is 20.70%, 15.30%, 8.88%, and 3.34% lower than that of the CMPSO, ACO, KH, and IKH algorithms, respectively. Similarly, for the detection rate, the proposed LNNLS-KH feature selection algorithm exhibits excellent performance. The average detection rate of

Table 10: The feature selection results of different feature selection algorithms (NSL-KDD dataset).

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Probe | 14 (2, 3, 4, 7, 8, 10, 11, 17, 19, 20, 21, 27, 30, 33) | 15 (1, 3, 4, 6, 15, 16, 17, 19, 21, 23, 29, 35, 39, 40, 41) | 13 (3, 4, 5, 7, 8, 13, 14, 18, 19, 21, 26, 28, 40) | 11 (2, 3, 5, 8, 10, 17, 18, 29, 34, 35, 41) | 8 (3, 4, 8, 11, 15, 29, 34, 40) |
| DoS | 16 (3, 4, 5, 6, 8, 13, 14, 17, 18, 22, 23, 26, 30, 32, 35, 41) | 16 (3, 4, 7, 12, 14, 19, 20, 25, 27, 28, 30, 33, 34, 37, 40, 41) | 12 (2, 3, 4, 5, 8, 9, 12, 15, 19, 24, 26, 30) | 12 (2, 3, 4, 6, 12, 18, 20, 22, 27, 28, 30, 31) | 10 (3, 4, 6, 15, 17, 19, 20, 21, 30, 37) |
| U2R | 9 (3, 4, 5, 9, 12, 19, 32, 33, 41) | 8 (3, 4, 6, 8, 20, 24, 33, 36) | 8 (3, 4, 10, 12, 19, 23, 31, 32) | 6 (3, 10, 11, 21, 36, 39) | 3 (3, 33, 36) |
| R2L | 11 (2, 3, 4, 8, 21, 22, 25, 27, 37, 40, 41) | 10 (3, 4, 7, 12, 17, 21, 29, 37, 38, 40) | 10 (2, 3, 4, 6, 13, 18, 19, 22, 32, 41) | 8 (3, 4, 5, 8, 11, 14, 21, 31) | 7 (2, 3, 4, 10, 15, 21, 36) |

Table 11: Feature selection time and detection time of different feature selection algorithms (NSL-KDD dataset).

| Data categories | Feature selection time (second): CMPSO | ACO | KH | IKH | LNNLS-KH | Detection time (second): CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|---|---|---|---|---|
| Probe | 523178 | 499814 | 474533 | 534887 | 549048 | 3713 | 3823 | 3530 | 3405 | 3106 |
| DoS | 789235 | 763086 | 716852 | 803816 | 829692 | 11869 | 11815 | 10666 | 10514 | 9844 |
| U2R | 15487 | 14729 | 14418 | 15779 | 17224 | 0087 | 0086 | 0086 | 0086 | 0078 |
| R2L | 255675 | 236908 | 224092 | 266951 | 272770 | 955 | 913 | 907 | 862 | 803 |

Table 12: The classification accuracy of different feature selection algorithms (NSL-KDD dataset).

| Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%) |
|---|---|---|---|---|---|
| Probe | 80.46 | 86.56 | 92.42 | 93.74 | 98.24 |
| DoS | 81.74 | 83.36 | 86.03 | 88.74 | 97.01 |
| U2R | 82.74 | 84.57 | 85.59 | 91.89 | 95.67 |
| R2L | 78.70 | 81.62 | 88.78 | 90.49 | 93.56 |

Figure 11: Comparison of classification FPR and DR of different feature selection algorithms on the Probe, DoS, U2R, and R2L subsets. (a) FPR (%) of different feature selection algorithms. (b) DR (%) of different feature selection algorithms.


the LNNLS-KH algorithm is 96.48%, which is 13.47%, 9.32%, 7.02%, and 4.72% higher than that of the CMPSO, ACO, KH, and IKH feature selection algorithms, respectively.
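The averaged figures quoted for the NSL-KDD subsets can be re-derived from Table 13. The following is a minimal sketch rather than the authors' code: it assumes the Table 13 entries are percentages with two decimal places, and the names `FPR`, `DR`, and `gap_to_proposed` are introduced here only for illustration. The gaps are differences in percentage points.

```python
# Values (%) for the Probe, DoS, U2R, and R2L subsets, read from Table 13
# on the assumption that every entry carries two decimal places.
FPR = {
    "CMPSO": [22.37, 21.27, 24.51, 30.66],
    "ACO": [18.04, 14.08, 21.04, 24.05],
    "KH": [8.50, 11.45, 16.13, 15.42],
    "IKH": [4.05, 7.88, 8.45, 8.99],
    "LNNLS-KH": [1.18, 2.85, 4.30, 7.67],
}
DR = {
    "CMPSO": [82.32, 79.12, 87.02, 83.56],
    "ACO": [89.18, 82.08, 89.79, 87.56],
    "KH": [95.01, 83.77, 90.14, 88.91],
    "IKH": [95.22, 85.23, 93.67, 92.89],
    "LNNLS-KH": [97.73, 96.80, 95.52, 95.85],
}


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def gap_to_proposed(table, proposed="LNNLS-KH"):
    """Return the proposed method's mean and each baseline's mean minus it."""
    reference = mean(table[proposed])
    gaps = {
        name: mean(values) - reference
        for name, values in table.items()
        if name != proposed
    }
    return reference, gaps


fpr_mean, fpr_reduction = gap_to_proposed(FPR)
dr_mean, dr_gap = gap_to_proposed(DR)
# Flip the sign so that a positive number means the proposed method is higher.
dr_gain = {name: -gap for name, gap in dr_gap.items()}

for name in fpr_reduction:
    print(f"{name}: FPR lower by {fpr_reduction[name]:.4f}, DR higher by {dr_gain[name]:.4f}")
```

Under that reading of the table, the script gives a mean FPR of 4.00 and a mean DR of about 96.48 for LNNLS-KH, in line with the values stated in the text.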

In conclusion, the LNNLS-KH feature selection algorithm performs well in terms of the global optimal fitness iteration curve, test set detection time, number of dimensions of the feature subset, classification accuracy, false positive rate, and detection rate. Although the offline training time of the LNNLS-KH algorithm is longer than that of the CMPSO, ACO, KH, and IKH algorithms, its lower feature dimension reduces the detection time. Moreover, the algorithm has a faster convergence speed, higher detection accuracy and detection rate, and a lower classification false positive rate.

4.3. Experimental Results and Discussion of CICIDS2017 Dataset. The experiment is conducted in MATLAB R2016a on a 64-bit Windows operating system with an Intel(R) Core(TM) i7-4790 processor. The MachineLearningCVE file in the CICIDS2017 dataset includes 8 CSV files of all traffic data, which contain 78 features plus an attack type tag after some duplicate features are removed. We annotate traffic records according to the different attack periods and types, and we standardize and normalize the dataset. Because the analyzed CSV files contain an excessive amount of data, training the model on the host would take excessively long and converge slowly. Therefore, we simplified and reintegrated these CSV data files while preserving the original attack timing features. We selected a total of 12,090 records and 5 types of traffic, including 1 type of normal traffic and 4 types of attack traffic: "DoS," "DDoS," "PortScan," and "WebAttack." The data are randomly divided into training sets and test sets in a 2:1 ratio, with independent and repeated experiments.
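The preprocessing described here (normalization followed by a random 2:1 split) can be sketched as follows. This is an illustrative sketch, not the authors' MATLAB code: min-max scaling is assumed as the normalization, the dataset is normalized before splitting as the order of the text suggests, and the helper names and the synthetic `records` are hypothetical.

```python
import random


def min_max_normalize(rows):
    """Scale every column of `rows` to [0, 1]; constant columns map to 0."""
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    spans = [max(column) - min(column) for column in columns]
    return [
        [(value - low) / span if span else 0.0
         for value, low, span in zip(row, lows, spans)]
        for row in rows
    ]


def split_two_to_one(records, seed=0):
    """Shuffle a copy of the records and return (training set, test set) in a 2:1 ratio."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]


# Synthetic stand-in for the 12,090 selected records (three made-up features each).
records = [[float(i), float(i % 7), 5.0] for i in range(12090)]
train, test = split_two_to_one(min_max_normalize(records))
print(len(train), len(test))
```

Changing `seed` gives a different random partition, which is one way to realize the independent and repeated experiments mentioned in the text.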

The CMPSO, ACO, KH, and IKH algorithms are used for comparison with the LNNLS-KH algorithm. The preprocessed Normal, DoS, DDoS, PortScan, and WebAttack subsets are input into the algorithm model successively, and the dimension and feature subsets of feature selection are obtained. We adopt the KNN classification model as the classifier and obtain the accuracy of intrusion detection on the test set data. The results of feature selection dimension for the CICIDS2017 dataset are shown in Table 14. The LNNLS-KH algorithm selects different features for different attack types. For example, the selected features of the DoS subset are "Total Length of Bwd Packets," "Fwd Packet Length Min," "Flow IAT Min," "FIN Flag Count," "RST Flag Count," "URG Flag Count," "Bwd Avg Packets/Bulk," "Idle Mean," and "Idle Std." For the WebAttack subset, "Total Fwd Packets," "Bwd IAT Max," "Bwd PSH Flags," "Fwd Packets/s," "Bwd Avg Packets/Bulk," "Subflow Fwd Bytes," "Active Max," and "Idle Max" are selected as attack features by the LNNLS-KH algorithm. It reduces the feature dimension of the IDS dataset while ensuring high accuracy. The average feature dimension selected by the LNNLS-KH algorithm is 10.2, accounting for 13.08% of the total number of features in the CICIDS2017 dataset. It decreases the

Table 13: The classification FPR and DR of different feature selection algorithms (NSL-KDD dataset).

| Data categories | FPR (%): CMPSO | ACO | KH | IKH | LNNLS-KH | DR (%): CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|---|---|---|---|---|
| Probe | 22.37 | 18.04 | 8.50 | 4.05 | 1.18 | 82.32 | 89.18 | 95.01 | 95.22 | 97.73 |
| DoS | 21.27 | 14.08 | 11.45 | 7.88 | 2.85 | 79.12 | 82.08 | 83.77 | 85.23 | 96.80 |
| U2R | 24.51 | 21.04 | 16.13 | 8.45 | 4.30 | 87.02 | 89.79 | 90.14 | 93.67 | 95.52 |
| R2L | 30.66 | 24.05 | 15.42 | 8.99 | 7.67 | 83.56 | 87.56 | 88.91 | 92.89 | 95.85 |

Figure 12: Comparison of feature selection time and intrusion detection time (in seconds) for different feature selection algorithms on the Normal, DoS, DDoS, PortScan, and WebAttack subsets. (a) Feature selection time for different feature selection algorithms. (b) Intrusion detection time of different feature selection algorithms.


number of features by 57.85%, 52.34%, 27.14%, and 25%, respectively, compared with the CMPSO, ACO, KH, and IKH algorithms.
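The dimension figures for the CICIDS2017 dataset follow from the per-subset feature counts in Table 14. The short sketch below recomputes them; the dictionary `SELECTED` simply transcribes the counts reported in that table, and the variable names are introduced here for illustration only.

```python
TOTAL_FEATURES = 78  # size of the CICIDS2017 feature set listed in Table 9

# Number of selected features for Normal, DoS, DDoS, PortScan, and WebAttack (Table 14).
SELECTED = {
    "CMPSO": [28, 24, 29, 24, 16],
    "ACO": [25, 19, 27, 21, 15],
    "KH": [14, 13, 21, 14, 8],
    "IKH": [14, 14, 18, 15, 7],
    "LNNLS-KH": [8, 9, 14, 12, 8],
}

# Average number of selected features per algorithm.
average = {name: sum(counts) / len(counts) for name, counts in SELECTED.items()}
proposed = average["LNNLS-KH"]

# Share of the full feature set kept by LNNLS-KH, in percent.
share_of_all = 100 * proposed / TOTAL_FEATURES

# Relative reduction of the average dimension against each baseline, in percent.
reduction = {
    name: 100 * (1 - proposed / value)
    for name, value in average.items()
    if name != "LNNLS-KH"
}

print(f"average dimension: {proposed:.1f} ({share_of_all:.2f}% of all features)")
for name, value in reduction.items():
    print(f"reduction vs {name}: {value:.2f}%")
```

The reductions are relative to each baseline's own average dimension, which is why they differ from the share of the 78 original features.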

Figure 12 shows the feature selection time and intrusion detection time of the 5 feature selection algorithms to further evaluate their performance. It can be seen from Figure 12(a) that, in the feature selection stage, the LNNLS-KH algorithm takes a long time to find the optimal feature subset because of the linear nearest neighbor lasso step optimization performed after the position update of the krill herd. Compared with the KH and IKH algorithms, it increases the time by an average of 14.38% and 9.32%, respectively. Although the LNNLS-KH algorithm requires more calculation time, its convergence speed and global search ability are improved. Figure 12(b) shows the intrusion detection time of the 5 feature selection algorithms. This is the time the KNN classifier takes to detect the sample dataset after the feature subset has been found, excluding the time spent searching for the optimal feature subset. The feature dimension of the LNNLS-KH algorithm is low, and the amount of data processed when classifying the detection sample dataset is small, which results in a reduction of the classification detection time. Compared with the CMPSO, ACO, KH, and IKH algorithms, the intrusion detection time of the LNNLS-KH algorithm is reduced by 6.52%, 5.17%, 2.14%, and 2.28% on average.

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the KNN classifier is used to detect the test dataset. The classification accuracy of the different algorithms is shown in Table 15. For the five types of subsets, the average classification accuracy of the proposed LNNLS-KH algorithm is 95.86%. In particular, the classification accuracy reached 97.55% for the PortScan subset. Compared with the other four feature selection methods, the LNNLS-KH algorithm has an average increase of 3.11%, 8.52%, 8.58%, 2.45%, and 4.29% on the

Table 14: The number of features selected by different algorithms (CICIDS2017 dataset).

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Normal | 28 (3, 7, 13, 15, 16, 17, 20, 22, 24, 26, 30, 35, 37, 38, 42, 43, 44, 45, 46, 49, 50, 56, 59, 62, 63, 64, 65, 76) | 25 (1, 3, 4, 7, 10, 11, 12, 13, 15, 19, 29, 32, 34, 35, 37, 43, 46, 47, 51, 55, 56, 58, 73, 76, 78) | 14 (11, 19, 33, 39, 43, 49, 55, 56, 58, 65, 66, 68, 71, 73) | 14 (5, 10, 19, 20, 21, 23, 27, 33, 43, 56, 69, 70, 73, 78) | 8 (6, 12, 16, 32, 38, 50, 54, 73) |
| DoS | 24 (1, 3, 4, 13, 16, 17, 24, 26, 30, 33, 35, 39, 40, 44, 48, 51, 53, 57, 58, 59, 60, 62, 67, 70) | 19 (3, 6, 12, 13, 15, 26, 35, 39, 51, 55, 60, 61, 66, 69, 71, 73, 75, 77, 78) | 13 (8, 16, 21, 30, 45, 50, 52, 57, 59, 63, 66, 67) | 14 (2, 12, 15, 16, 19, 21, 32, 34, 44, 46, 65, 68, 76, 77) | 9 (6, 8, 20, 44, 46, 49, 61, 75, 76) |
| DDoS | 29 (15, 18, 19, 20, 23, 25, 26, 33, 34, 35, 38, 39, 42, 43, 46, 47, 49, 51, 55, 56, 57, 59, 60, 61, 62, 63, 71, 72, 78) | 27 (6, 9, 10, 13, 16, 19, 24, 28, 31, 41, 42, 45, 47, 48, 50, 51, 52, 53, 54, 56, 59, 60, 61, 62, 65, 68, 72) | 21 (10, 12, 13, 15, 18, 23, 27, 30, 34, 35, 41, 42, 45, 55, 61, 63, 65, 66, 68, 70, 76) | 18 (1, 11, 13, 14, 19, 24, 32, 35, 36, 40, 42, 47, 51, 57, 60, 69, 70, 75) | 14 (2, 5, 8, 9, 11, 22, 26, 33, 41, 43, 47, 51, 74, 77) |
| PortScan | 24 (1, 3, 6, 15, 16, 28, 30, 33, 35, 37, 44, 45, 52, 56, 59, 60, 61, 63, 65, 68, 70, 75, 77, 78) | 21 (1, 2, 6, 10, 15, 17, 26, 27, 29, 39, 42, 43, 46, 49, 58, 61, 66, 69, 70, 71, 76) | 14 (15, 20, 22, 27, 37, 44, 49, 50, 53, 59, 62, 65, 67, 78) | 15 (1, 24, 30, 32, 33, 43, 49, 53, 54, 58, 60, 61, 63, 64, 69) | 12 (2, 6, 15, 24, 25, 28, 32, 57, 59, 63, 66, 76) |
| WebAttack | 16 (2, 7, 26, 29, 45, 47, 50, 52, 53, 54, 63, 66, 68, 69, 72, 78) | 15 (3, 9, 10, 12, 19, 26, 40, 46, 50, 54, 64, 65, 68, 69, 73) | 8 (1, 17, 19, 36, 48, 49, 53, 60) | 7 (14, 17, 35, 39, 44, 48, 54) | 8 (3, 29, 32, 37, 61, 64, 73, 77) |

Table 15: The classification accuracy of different feature selection algorithms (CICIDS2017 dataset).

| Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%) |
|---|---|---|---|---|---|
| Normal | 89.78 | 89.06 | 92.70 | 94.58 | 94.64 |
| DoS | 77.03 | 82.69 | 90.90 | 93.34 | 94.51 |
| DDoS | 81.73 | 86.94 | 91.85 | 88.19 | 95.76 |
| PortScan | 92.38 | 95.64 | 95.05 | 97.35 | 97.55 |
| WebAttack | 89.12 | 93.08 | 93.77 | 94.26 | 96.85 |

Table 16: The classification FPR and DR of different feature selection algorithms (CICIDS2017 dataset).

| Data categories | FPR (%): CMPSO | ACO | KH | IKH | LNNLS-KH | DR (%): CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|---|---|---|---|---|
| Normal | 9.25 | 8.72 | 6.41 | 4.93 | 3.67 | 88.05 | 88.51 | 89.25 | 92.46 | 93.89 |
| DoS | 5.41 | 4.48 | 4.06 | 2.83 | 1.94 | 72.57 | 82.89 | 87.86 | 92.56 | 92.64 |
| DDoS | 6.85 | 4.92 | 4.54 | 6.33 | 3.18 | 79.03 | 83.47 | 90.22 | 87.52 | 92.98 |
| PortScan | 4.65 | 3.02 | 2.84 | 1.86 | 1.16 | 88.25 | 93.80 | 94.33 | 95.14 | 95.42 |
| WebAttack | 5.33 | 3.16 | 2.52 | 2.11 | 1.60 | 87.40 | 91.35 | 92.19 | 92.94 | 94.77 |


Normal, DoS, DDoS, PortScan, and WebAttack subsets, respectively. Table 16 shows the classification FPR and DR of the different feature selection algorithms on the test sets. On all five test sets, the LNNLS-KH algorithm has a lower FPR and a higher DR than the other four algorithms.

We propose the LNNLS-KH algorithm, a novel feature selection algorithm for intrusion detection. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the algorithm has good feature selection performance and improves the efficiency of intrusion detection.

5. Conclusions

With the rapid development of network technology, intrusion detection plays an increasingly important role in network security. However, the "dimensional disaster" caused by massive data results in problems such as slow response and poor accuracy of the intrusion detection system. The KH algorithm is a new population-based swarm intelligence optimization method that shows good performance in high-dimensional data processing, providing a new approach for reducing the dimension of intrusion detection data and selecting useful features. In this paper, an improved KH algorithm named LNNLS-KH is proposed for feature selection of IDS datasets by linear nearest neighbor lasso optimization. The LNNLS-KH algorithm introduces a new fitness function, which is composed of the number of selected feature dimensions and the classification accuracy. Nonlinear optimization is introduced into the physical diffusion motion of krill individuals to accelerate the convergence speed of the algorithm. Moreover, the linear nearest neighbor lasso step optimization is proposed to balance the exploration and exploitation abilities and obtain the global optimal solution of the feature subset effectively. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the LNNLS-KH algorithm retains 7 and 10.2 features on average, respectively, which greatly reduces the dimension of the features. In the NSL-KDD dataset, features are reduced by 44%, 42.86%, 34.88%, and 24.32% compared with the CMPSO, ACO, KH, and IKH algorithms, and in the CICIDS2017 dataset, they are reduced by 57.85%, 52.34%, 27.14%, and 25%, respectively. In addition, the classification accuracy of the LNNLS-KH feature selection algorithm is increased by 10.03% and 5.39%, and the time of intrusion detection is reduced by 12.41% and 4.03% on the two datasets. Furthermore, the LNNLS-KH algorithm enhances the ability of jumping out of the local optimal solution and shows good performance in the optimal fitness iteration curve, false positive rate of detection, and convergence speed, which demonstrates that the proposed LNNLS-KH algorithm is an efficient feature selection method for network intrusion detection.

In this research, we realized that the initialization of the LNNLS-KH algorithm has a certain degree of randomness. Therefore, we conducted independent and repeated experiments to address the problem, and the results were reasonable and convincing. Although the proposed algorithm shows encouraging performance, it could be further improved.

In future work, we will consider using data balancing techniques to preprocess the experimental dataset to obtain more accurate feature selection results and stronger algorithm stability. Meanwhile, we will combine the LNNLS-KH algorithm with other algorithms to improve the exploration and exploitation abilities, thereby further shortening the time of training the feature subset and of classification detection. In addition, as the LNNLS-KH algorithm is universally applicable, it can be applied to more feature selection systems and used to solve optimization problems in other fields.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was sponsored by the National Key Research and Development Program of China (Grants 2018YFB0804002 and 2017YFB0803204), National Natural Science Foundation of PR China (Grant 72001191), Henan Natural Science Foundation (Grant 202300410442), and Henan Philosophy and Social Science Program (Grant 2020CZH009).

References

[1] W. Wei and C. Guo, "A text semantic topic discovery method based on the conditional co-occurrence degree," Neurocomputing, vol. 368, pp. 11–24, 2019.

[2] C.-R. Wang, R.-F. Xu, S.-J. Lee, and C.-H. Lee, "Network intrusion detection using equality constrained-optimization-based extreme learning machines," Knowledge-Based Systems, vol. 147, pp. 68–80, 2018.

[3] G.-G. Wang, A. H. Gandomi, A. H. Alavi, and D. Gong, "A comprehensive review of krill herd algorithm: variants, hybrids and applications," Artificial Intelligence Review, vol. 51, no. 1, pp. 119–148, 2019.

[4] J. Amudhavel, D. Sathian, R. S. Raghav et al., "A fault tolerant distributed self-organization in peer to peer (p2p) using krill herd optimization," in Proceedings of the 2015 International Conference on Advanced Research in Computer Science Engineering & Technology (ICARCSET 2015), pp. 1–5, Unnao, India, 2015.

[5] L. M. Abualigah, A. T. Khader, and E. S. Hanandeh, "Hybrid clustering analysis using improved krill herd algorithm," Applied Intelligence, vol. 48, no. 11, pp. 4047–4071, 2018.

[6] P. A. Kowalski and S. Łukasik, "Training neural networks with krill herd algorithm," Neural Processing Letters, vol. 44, no. 1, pp. 5–17, 2016.

[7] C. Stasinakis, G. Sermpinis, I. Psaradellis, and T. Verousis, "Krill-Herd Support Vector Regression and heterogeneous autoregressive leverage: evidence from forecasting and trading commodities," Quantitative Finance, vol. 16, no. 12, pp. 1901–1915, 2016.


[8] L. Wang, P. Jia, T. Huang, S. Duan, J. Yan, and L. Wang, "A novel optimization technique to improve gas recognition by electronic noses based on the enhanced krill herd algorithm," Sensors, vol. 16, no. 8, p. 1275, 2016.

[9] R. Jensi and G. W. Jiji, "An improved krill herd algorithm with global exploration capability for solving numerical function optimization problems and its application to data clustering," Applied Soft Computing, vol. 46, pp. 230–245, 2016.

[10] H. Pulluri, R. Naresh, and V. Sharma, "Application of stud krill herd algorithm for solution of optimal power flow problems," International Transactions on Electrical Energy Systems, vol. 27, no. 6, Article ID e2316, 2017.

[11] D. Rodrigues, L. A. M. Pereira, J. P. Papa et al., "A binary krill herd approach for feature selection," in Proceedings of the 2014 22nd International Conference on Pattern Recognition, pp. 1407–1412, IEEE, Stockholm, Sweden, August 2014.

[12] A. Mukherjee and V. Mukherjee, "Chaotic krill herd algorithm for optimal reactive power dispatch considering FACTS devices," Applied Soft Computing, vol. 44, pp. 163–190, 2016.

[13] S. Sun, H. Qi, F. Zhao, L. Ruan, and B. Li, "Inverse geometry design of two-dimensional complex radiative enclosures using krill herd optimization algorithm," Applied Thermal Engineering, vol. 98, pp. 1104–1115, 2016.

[14] S. Sultana and P. K. Roy, "Oppositional krill herd algorithm for optimal location of capacitor with reconfiguration in radial distribution system," International Journal of Electrical Power & Energy Systems, vol. 74, pp. 78–90, 2016.

[15] L. Brezocnik, I. Fister, and V. Podgorelec, "Swarm intelligence algorithms for feature selection: a review," Applied Sciences, vol. 8, no. 9, 2018.

[16] D. Smith, Q. Guan, and S. Fu, "An anomaly detection framework for autonomic management of compute cloud systems," in Proceedings of the 2010 IEEE 34th Annual Computer Software and Applications Conference Workshops, pp. 376–381, IEEE, Seoul, South Korea, July 2010.

[17] Y. Zhao, Y. Zhang, W. Tong et al., "An improved feature selection algorithm based on MAHALANOBIS distance for network intrusion detection," in Proceedings of 2013 International Conference on Sensor Network Security Technology and Privacy Communication System, pp. 69–73, IEEE, Nangang, China, May 2013.

[18] P. Singh and A. Tiwari, "An efficient approach for intrusion detection in reduced features of KDD99 using ID3 and classification with KNNGA," in Proceedings of the 2015 Second International Conference on Advances in Computing and Communication Engineering, pp. 445–452, IEEE, Dehradun, India, May 2015.

[19] M. A. Ambusaidi, X. He, P. Nanda, and Z. Tan, "Building an intrusion detection system using a filter-based feature selection algorithm," IEEE Transactions on Computers, vol. 65, no. 10, pp. 2986–2998, 2016.

[20] N. Shone, T. N. Ngoc, V. D. Phai, and Q. Shi, "A deep learning approach to network intrusion detection," IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 2, no. 1, pp. 41–50, 2018.

[21] Y. Xue, W. Jia, X. Zhao et al., "An evolutionary computation based feature selection method for intrusion detection," Security and Communication Networks, vol. 2018, Article ID 2492956, 10 pages, 2018.

[22] Z. Shen, Y. Zhang, and W. Chen, "A bayesian classification intrusion detection method based on the fusion of PCA and LDA," Security and Communication Networks, vol. 2019, Article ID 6346708, 11 pages, 2019.

[23] P. Sun, P. Liu, Q. Li et al., "DL-IDS: extracting features using CNN-LSTM hybrid network for intrusion detection system," Security and Communication Networks, vol. 2020, Article ID 8890306, 11 pages, 2020.

[24] G. Farahani, "Feature selection based on cross-correlation for the intrusion detection system," Security & Communication Networks, vol. 2020, Article ID 8875404, 17 pages, 2020.

[25] F. G. Mohammadi, M. H. Amini, and H. R. Arabnia, "Applications of nature-inspired algorithms for dimension reduction: enabling efficient data analytics," in Advances in Intelligent Systems and Computing, Optimization, Learning, and Control for Interdependent Complex Networks, pp. 67–84, Springer, Cham, Switzerland, 2020.

[26] J. Kennedy and R. Eberhart, "Particle swarm optimization," in Proceedings of the ICNN'95-International Conference on Neural Networks, no. 4, pp. 1942–1948, IEEE, Perth, WA, Australia, December 1995.

[27] M. Dorigo, M. Birattari, and T. Stutzle, "Ant colony optimization," IEEE Computational Intelligence Magazine, vol. 1, no. 4, pp. 28–39, 2006.

[28] R. Rajabioun, "Cuckoo optimization algorithm," Applied Soft Computing, vol. 11, no. 8, pp. 5508–5518, 2011.

[29] M. Neshat, G. Sepidnam, M. Sargolzaei, and A. N. Toosi, "Artificial fish swarm algorithm: a survey of the state-of-the-art, hybridization, combinatorial and indicative applications," Artificial Intelligence Review, vol. 42, no. 4, pp. 965–997, 2014.

[30] D. Karaboga, "An idea based on honey bee swarm for numerical optimization," Technical Report-tr06, Erciyes University, Engineering Faculty, Computer Engineering Department, Kayseri, Turkey, 2005.

[31] W.-T. Pan, "A new Fruit Fly Optimization Algorithm: taking the financial distress model as an example," Knowledge-Based Systems, vol. 26, pp. 69–74, 2012.

[32] R. Zhao and W. Tang, "Monkey algorithm for global numerical optimization," Journal of Uncertain Systems, vol. 2, no. 3, pp. 165–176, 2008.

[33] X. S. Yang and X. He, "Bat algorithm: literature review and applications," International Journal of Bio-Inspired Computation, vol. 5, no. 3, pp. 141–149, 2013.

[34] S. Mirjalili, A. H. Gandomi, S. Z. Mirjalili, S. Saremi, H. Faris, and S. M. Mirjalili, "Salp Swarm Algorithm: a bio-inspired optimizer for engineering design problems," Advances in Engineering Software, vol. 114, pp. 163–191, 2017.

[35] K. Ahmed, A. E. Hassanien, and S. Bhattacharyya, "A novel chaotic chicken swarm optimization algorithm for feature selection," in Proceedings of the 2017 Third International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), pp. 259–264, IEEE, Kolkata, India, November 2017.

[36] S. Tabakhi, P. Moradi, F. Akhlaghian et al., "An unsupervised feature selection algorithm based on ant colony optimization," Engineering Applications of Artificial Intelligence, vol. 32, pp. 112–123, 2014.

[37] S. Arora and P. Anand, "Binary butterfly optimization approaches for feature selection," Expert Systems with Applications, vol. 116, pp. 147–160, 2019.

[38] C. Yan, J. Ma, H. Luo, and A. Patel, "Hybrid binary coral reefs optimization algorithm with simulated annealing for feature selection in high-dimensional biomedical datasets," Chemometrics and Intelligent Laboratory Systems, vol. 184, pp. 102–111, 2019.

[39] G. I. Sayed, A. Tharwat, and A. E. Hassanien, "Chaotic dragonfly algorithm: an improved metaheuristic algorithm for feature selection," Applied Intelligence, vol. 49, no. 1, pp. 188–205, 2019.

[40] Z. Zhang, P. Wei, Y. Li et al., "Feature selection algorithm based on improved particle swarm joint taboo search," Journal of Communication, vol. 39, no. 12, pp. 60–68, 2018.

[41] A. H. Gandomi and A. H. Alavi, "Krill herd: a new bio-inspired optimization algorithm," Communications in Nonlinear Science and Numerical Simulation, vol. 17, no. 12, pp. 4831–4845, 2012.

[42] Q. Tan and Z. Huang, "Krill herd with nearest neighbor lasso operator," Computer Engineering and Applications, vol. 55, no. 9, pp. 124–129, 2019.

[43] Q. Wang, C. Ding, and X. Wang, "A hybrid data clustering algorithm based on improved krill herd algorithm and KHM clustering," Control and Decision, vol. 35, no. 10, pp. 2449–2458, 2018.

[44] Q. Li and B. Liu, "Clustering using an improved krill herd algorithm," Algorithms, vol. 10, no. 2, p. 56, 2017.

[45] G.-G. Wang, A. H. Gandomi, and A. H. Alavi, "Stud krill herd algorithm," Neurocomputing, vol. 128, pp. 363–370, 2014.

[46] J. Li, Y. Tang, C. Hua, and X. Guan, "An improved krill herd algorithm: krill herd with linear decreasing step," Applied Mathematics and Computation, vol. 234, pp. 356–367, 2014.

[47] H. B. Nguyen, B. Xue, P. Andreae et al., "Particle swarm optimisation with genetic operators for feature selection," in Proceedings of the 2017 IEEE Congress on Evolutionary Computation (CEC), pp. 286–293, IEEE, San Sebastian, Spain, June 2017.

[48] M. H. Aghdam and P. Kabiri, "Feature selection for intrusion detection system using ant colony optimization," International Journal of Network Security, vol. 18, no. 3, pp. 420–432, 2016.



physical diffusion motion, so as to further adjust the random diffusion amplitude. In the early stage of the algorithm iteration, the number of iterations is small and the fitness value of the individual is large, so the fitness factor is small, which is conducive to a large random diffusion of the krill herd. As the number of iterations gradually increases, the algorithm converges quickly and the fitness of krill individuals approaches the global optimal solution. At the same time, the fitness factor increases nonlinearly, which makes the random diffusion more consistent with the movement process of krill individuals.

To further evaluate the effect of the KH algorithm with nonlinear optimization of physical diffusion motion (NO-KH), we conducted experiments on two classical benchmark functions. Following the definitions in Table 3, $F_1(x)$ is the Schwefel 2.22 function, which is a unimodal benchmark function, and $F_2(x)$ is the Ackley function, which is a multimodal benchmark function. The experimental parameters of $F_1(x)$ and $F_2(x)$ are shown in Table 3.

Figure 4 shows the graphs of the two benchmark functions for $n = 2$. We use the standard KH algorithm and the NO-KH algorithm to find the optimal value of the unimodal benchmark function and of the multimodal benchmark function. The number of krill and the number of iterations are set to 25 and 500, respectively. Table 4 shows the best value, worst value, mean value, and standard deviation obtained by running the algorithms 20 times. Compared with the standard KH algorithm, the NO-KH algorithm finds smaller optimal solutions on both the unimodal and the multimodal benchmark function, which indicates that its global exploration ability is improved. The smaller standard deviation obtained from the repeated experiments shows that the NO-KH algorithm has better stability. Therefore, the nonlinear optimization of the physical diffusion motion of the KH algorithm is effective.
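For reference, the two benchmark functions of Table 3 can be written down directly. The sketch below uses the standard textbook forms, which is an assumption where the extracted table is ambiguous: the sum-plus-product function is Schwefel 2.22 (unimodal), and Ackley (multimodal) is taken with the usual $1/n$ averaging and an exponential around the cosine term, so that both functions have minimum 0 at the origin as listed under $f_{\min}$.

```python
import math


def schwefel_2_22(x):
    """Schwefel 2.22: sum of |x_i| plus product of |x_i| (unimodal, minimum 0 at 0)."""
    magnitudes = [abs(value) for value in x]
    return sum(magnitudes) + math.prod(magnitudes)


def ackley(x):
    """Ackley function in its standard form (multimodal, minimum 0 at 0)."""
    n = len(x)
    mean_square = sum(value * value for value in x) / n
    mean_cosine = sum(math.cos(2 * math.pi * value) for value in x) / n
    return (-20 * math.exp(-0.2 * math.sqrt(mean_square))
            - math.exp(mean_cosine) + 20 + math.e)


origin = [0.0] * 10  # Table 3 uses dimension 10 for both functions
print(schwefel_2_22(origin), ackley(origin))
print(schwefel_2_22([1.0, -2.0]), ackley([1.0, 1.0]))
```

An optimizer such as KH or NO-KH would be run on these functions within the ranges given in Table 3, $[-10, 10]$ and $[-32, 32]$ respectively.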

The above analysis shows that introducing the optimization coefficient $\lambda$ and the fitness factor $\mu_{\mathrm{fit}}$ into the physical diffusion motion of the krill herd is conducive to dynamically adjusting the random diffusion amplitude of the krill individuals and accelerating the convergence speed of the algorithm. Meanwhile, it increases the nonlinearity of the physical diffusion motion and the global exploration ability of the algorithm.

3.2.3. Linear Nearest Neighbor Lasso Step Optimization. When the KH algorithm is used to solve multidimensional complex function optimization problems, its local search ability is weak, and exploitation and exploration are difficult to balance. To enhance the local exploitation and global exploration abilities of the algorithm, the influence of excellent neighbor individuals on the krill herd during evolution is considered, and an improved KH algorithm is proposed in [42]. That algorithm introduces the nearest neighbor lasso operator to mine the neighborhood of potentially excellent individuals and thereby improve the local search ability of krill individuals, but the random parameters introduced in the lasso operator increase the uncertainty of the algorithm. To cope with this problem, we introduce an improved krill herd algorithm based on linear nearest neighbor lasso step optimization (LNNLS-KH), which finds the nearest neighbor of each krill individual after the individual position update and moves linearly by a defined step to derive a better fitness value. With linearization, the nearest neighbor lasso step of the algorithm changes linearly with the number of iterations, which balances the exploitation and exploration abilities of the algorithm. In the early iterations, a large linear nearest neighbor lasso step is selected so that the krill individuals can quickly adjust their positions, which improves the search efficiency of the algorithm. In the later stage of iteration, the nearest neighbor lasso step decreases linearly to obtain the global optimal solution.

In the krill herd $X = \{X_1, X_2, \ldots, X_n\}$, assuming that the $j$th krill individual is the nearest neighbor of the $i$th krill individual, the Euclidean distance between the two krill individuals is defined as follows:

$$\mathrm{distance}_{ij} = \lVert X_i - X_j \rVert, \tag{23}$$

where $\{X_i, X_j\} \subset S$ and $i \neq j$. The equation of the linear nearest neighbor lasso step is defined as follows:

$$\mathrm{step} = \begin{cases} \dfrac{I}{I_{\max}} \times (X_i - X_j), & K_i > K_j, \\[2ex] \dfrac{I}{I_{\max}} \times (X_j - X_i), & K_j > K_i. \end{cases} \tag{24}$$

The fitness function is expressed as equation (18). Therefore, a smaller fitness value means that fewer features are selected under the condition of higher accuracy, i.e., the position of the krill individual is better. The schematic diagram of LNNLS-KH is shown in Figure 5. The new position $Y_k$ of the $j$th krill individual is expressed as follows:

$$Y_k = \begin{cases} X_j + \dfrac{I}{I_{\max}} \times (X_i - X_j), & K_i > K_j, \\[2ex] X_i + \dfrac{I}{I_{\max}} \times (X_j - X_i), & K_j > K_i. \end{cases} \tag{25}$$

When the $i$th and $j$th krill individuals move to opposite ends of the food, the new position $Y_k$ will be far from the optimal solution after the linear nearest neighbor lasso step optimization, as shown in Figure 6.
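Equations (23) to (25) can be illustrated with a small sketch. It follows the equations as written and makes several assumptions that are not confirmed by this excerpt: $K$ is taken to be the fitness value of each krill individual, $I$ and $I_{\max}$ the current and maximum iteration numbers, and the case $K_i = K_j$, which equation (25) does not cover, is treated as producing no candidate. The function names are hypothetical.

```python
import math


def nearest_neighbor(herd, i):
    """Index j != i that minimizes the Euclidean distance of equation (23)."""
    others = (j for j in range(len(herd)) if j != i)
    return min(others, key=lambda j: math.dist(herd[i], herd[j]))


def lnnls_position(x_i, x_j, k_i, k_j, iteration, max_iterations):
    """Candidate position Y_k of equation (25), built from the step of equation (24)."""
    ratio = iteration / max_iterations  # the factor I / I_max
    if k_i > k_j:
        # Y_k = X_j + ratio * (X_i - X_j)
        return [b + ratio * (a - b) for a, b in zip(x_i, x_j)]
    if k_j > k_i:
        # Y_k = X_i + ratio * (X_j - X_i)
        return [a + ratio * (b - a) for a, b in zip(x_i, x_j)]
    return None  # equal fitness is not defined by equation (25) (assumption)


herd = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
fitness = [0.4, 0.2, 0.9]  # made-up fitness values, smaller is better

i = 0
j = nearest_neighbor(herd, i)
candidate = lnnls_position(herd[i], herd[j], fitness[i], fitness[j],
                           iteration=10, max_iterations=100)
print(j, candidate)
```

In a full implementation the candidate would presumably be kept only if its fitness improves on the current position; that acceptance rule is not stated in this excerpt and is left out of the sketch.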

Table 2: Confusion matrix.

| Confusion matrix | True condition positive | True condition negative |
|---|---|---|
| Predicted condition positive | True positive (TP) | False positive (FP) |
| Predicted condition negative | False negative (FN) | True negative (TN) |
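The evaluation metrics used throughout the experiments can be computed from the four cells of this confusion matrix. The sketch below assumes the standard definitions of accuracy, detection rate (the true positive rate), and false positive rate; the paper's own defining equations lie outside this excerpt, and the function name and the example counts are made up for illustration.

```python
def detection_metrics(tp, fp, fn, tn):
    """Accuracy, detection rate (DR), and false positive rate (FPR) from confusion counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,        # share of all samples classified correctly
        "detection_rate": tp / (tp + fn),      # share of true attacks that are detected
        "false_positive_rate": fp / (fp + tn), # share of normal samples flagged as attacks
    }


# Hypothetical counts: 100 attack samples and 100 normal samples.
metrics = detection_metrics(tp=90, fp=5, fn=10, tn=95)
print(metrics)
```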


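Table 3: Benchmark functions in the experiment.

| Benchmark functions | Dim | Range | $f_{\min}$ |
|---|---|---|---|
| $F_1(x) = \sum_{i=1}^{n} \lvert x_i \rvert + \prod_{i=1}^{n} \lvert x_i \rvert$ | 10 | $[-10, 10]$ | 0 |
| $F_2(x) = -20\exp\left(-0.2\sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}\right) - \exp\left(\frac{1}{n}\sum_{i=1}^{n}\cos(2\pi x_i)\right) + 20 + e$ | 10 | $[-32, 32]$ | 0 |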

Figure 4: Graphs of the two benchmark functions for $n = 2$. (a) Unimodal benchmark function $F_1$ (Schwefel 2.22). (b) Multimodal benchmark function $F_2$ (Ackley).

Table 4: The statistical results of KH and NO-KH algorithms on two benchmark functions.

| f(x) | Algorithms | Best value | Worst value | Mean value | Standard deviation |
|---|---|---|---|---|---|
| F1 | KH | 1.692E-04 | 1.099E-02 | 1.508E-03 | 3.342E-03 |
| F1 | NO-KH | 3.277E-05 | 9.632E-04 | 4.221E-04 | 3.908E-04 |
| F2 | KH | 5.716E-05 | 2.168 | 0.329 | 0.816 |
| F2 | NO-KH | 8.309E-06 | 1.155 | 0.116 | 0.362 |

Figure 5: Optimization of linear nearest neighbor lasso step for krill individuals at the same end of food.

Figure 6: Optimization of linear nearest neighbor lasso step for krill individuals at both ends of food.


The pseudocode of the LNNLS-KH algorithm is shown in Algorithm 1.

3.3. Analysis of Time Complexity. In the KH algorithm, each krill individual updates its position after the movement induced by other krill individuals, the foraging activity, and the physical diffusion motion, with a time complexity of $O(N)$. After $I_{\max}$ iterations, the time complexity of the algorithm is $O(I_{\max} \cdot N)$. In the LNNLS-KH algorithm, the modified fitness function and the nonlinear optimization of the physical diffusion motion add hardly any calculations, so the time complexity is unchanged. In addition, the linear nearest neighbor lasso step optimization adds the calculations of equations (24) and (25) after each krill individual completes its position update during an iteration, with a time complexity of $O(I_{\max} \cdot N)$. Therefore, the total time complexity of the LNNLS-KH algorithm is $O(2 I_{\max} \cdot N)$, which is of the same order as that of the KH algorithm.

3.4. Description of the LNNLS-KH Algorithm for IDS Feature Selection. An IDS is a system that recognizes and processes malicious usage of computers and network resources. The intrusion detection dataset records normal and abnormal traffic, including network traffic data and types of network attacks, and provides data support for the research and development of intrusion detection technology. An IDS is generally composed of data acquisition, data preprocessing, detection units, and response actions, as shown in Figure 7.

The LNNLS-KH algorithm is used to select high-quality feature subsets for the IDS. The features of the intrusion detection dataset are randomly initialized to different real numbers in the range [0, 1], which constitute the position vectors of the krill herd. By calculating the fitness function and carrying out the LNNLS-KH algorithm, the position vectors of the krill herd are constantly updated. The fitness function is determined by the number of selected features and the classification accuracy, so the position vectors of the krill herd move toward the optimal fitness value. According to [47], it is appropriate to set the feature selection threshold to 0.7. When the maximum number of iterations is reached, the components of the krill position vector that are larger than the threshold are selected. The selected features constitute the feature subset of the intrusion detection data, which is then sent to the detection units. Because the K-Nearest Neighbor (KNN) algorithm is relatively mature in theory, the detection units adopt it to construct the intrusion detection classifier. Finally, the intrusion detection results are evaluated on the test dataset. The process of the LNNLS-KH algorithm for IDS feature selection is shown in Figure 8.
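The thresholding step described above can be illustrated with a short sketch. It assumes that a krill position is a list of real numbers in [0, 1], one per feature, and that feature numbers are 1-based as in the tables of this paper; the function name `select_features` and the example vector are hypothetical.

```python
def select_features(position, threshold=0.7):
    """Return the 1-based indices of features whose position component exceeds the threshold."""
    return [idx for idx, value in enumerate(position, start=1) if value > threshold]


# Hypothetical final position of the best krill over six features.
position = [0.12, 0.93, 0.71, 0.70, 0.05, 0.88]
selected = select_features(position)  # features 2, 3, and 6
```

A strict comparison is assumed here, so a component exactly equal to 0.7 is not selected; the text only states that values larger than the threshold are kept.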

4. Results and Discussion

To verify the performance of the LNNLS-KH algorithm in IDS feature selection, we adopt the NSL-KDD network intrusion detection dataset and the CICIDS2017 dataset for experiments.

4.1. Datasets Analysis. The NSL-KDD dataset is a classic dataset in the field of anomaly detection. As an improved version of the KDD CUP 99 dataset, it is currently one of the most reliable and influential intrusion detection datasets. Compared with the KDD CUP 99 dataset, the NSL-KDD dataset eliminates duplicate data, so it hardly contains redundant records. Meanwhile, the proportion of each type of record in the NSL-KDD dataset has been adjusted to make it reasonable. Each record in the NSL-KDD dataset includes 41-dimensional features and a classification label. KDDTrain+ and KDDTest+ in the NSL-KDD dataset are selected as the training subset and the test subset. The attacks are divided into four types: denial of service (DoS), scan and probe (Probe), remote to local (R2L), and user to root (U2R). The detailed attack names and the distribution of sample categories are shown in Tables 5 and 6. The features of the NSL-KDD dataset are shown in Table 7.

The NSL-KDD dataset includes four types of features: the basic features of TCP connections (9 in total), the content features of TCP connections (13 in total), the time-based network traffic statistics (9 in total), and the host-based network traffic statistics (10 in total). Among all the features, "protocol_type," "service," and "flag" are character-type features, which need to be preprocessed and mapped to numeric values. Because mixed numeric and character data types are difficult to deal with, one-hot encoding is used to map different characters to different values. For example, the "protocol_type" feature includes three types of protocol, denoted by icmp = [1, 0, 0], tcp = [0, 1, 0], and udp = [0, 0, 1]. Similarly, the 70 attributes in "service" and the 11 attributes in "flag" are numeralized in the same way. The 41-dimensional feature is thus expanded to 122 dimensions after one-hot encoding (41 - 3 + 3 + 70 + 11 = 122). At the same time, the dataset is normalized to eliminate the influence of features of different orders of magnitude on the calculation results, thus reducing the experimental error. The data preprocessing helps to improve the accuracy of classification and ensure the reliability of the results. The values corresponding to each feature are normalized to the interval [0, 1], and the normalization expression is as follows:

$$
X^{*} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}, \tag{26}
$$

where $X^{*}$ is the normalized eigenvalue, $X$ is the original eigenvalue, and $X_{\max}$ and $X_{\min}$ represent the maximum and minimum values of the same feature dimension.
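The two preprocessing steps, one-hot encoding and min-max normalization, can be sketched in a few lines. The snippet below is illustrative only: the helper names `one_hot` and `min_max` are hypothetical, and mapping a constant column to zeros is an assumption, since equation (26) is undefined when the maximum equals the minimum.

```python
def one_hot(value, categories):
    """Encode a categorical value as a 0/1 vector over the given category order."""
    return [1 if value == c else 0 for c in categories]


def min_max(column):
    """Scale a numeric column to [0, 1] following equation (26)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant column carries no information and maps to zeros.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


PROTOCOLS = ["icmp", "tcp", "udp"]   # order taken from the example in the text
encoded = one_hot("tcp", PROTOCOLS)  # [0, 1, 0]
scaled = min_max([0, 5, 10])         # [0.0, 0.5, 1.0]

# Dimension after encoding: the 3 character features are replaced by 3 + 70 + 11 columns.
expanded_dim = 41 - 3 + 3 + 70 + 11  # 122
```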

Although NSL-KDD is a benchmark dataset in the field of network intrusion detection, some of its attack types are outdated due to the rapid development of network technology. Therefore, it hardly reflects the current real network environment. CICIDS2017 is a novel network intrusion detection dataset released by the Canadian Institute for Cybersecurity (CIC) in 2017. The dataset collected traffic data for five days, with only normal traffic on Monday and attacks occurring in the morning and afternoon from Tuesday to Friday. It includes "FTP Patator," "SSH Patator," "DoS GoldenEye," "DoS Slowhttptest," "DoS Slowloris," "Heartbleed," "Web Attack Brute Force," "Web Attack Sql Injection," "Web Attack XSS," "Infiltration Attack," "Bot," "DDoS," and "PortScan," which are common types of attacks in modern networks. The distribution of attack time and types of the CICIDS2017 dataset is shown in Table 8. We use the MachineLearningCVE file in the CICIDS2017 dataset as the dataset, which contains 78 features and an attack type label. The number and name of each feature are shown in Table 9. Compared with the NSL-KDD dataset, the attack types in the CICIDS2017 dataset are more in line with the situation of modern networks.

Figure 7: The framework of IDS (data acquisition, data preprocessing, detection units, and response actions).

Figure 8: The process of LNNLS-KH algorithm for IDS feature selection.

Algorithm 1: The LNNLS-KH algorithm.

    Input: Training set
    Output: Global best solution, the number of selected features, and feature selection time

    Begin
      Initialize algorithm parameters Nmax, Vf, Dmax, NV, Imax, UB, LB
      Initialize the krill herd position
      Evaluate the fitness of krill individuals and find the individuals with the best and worst fitness values
      for I = 1 to Imax do
        for each krill individual i (i = 1, 2, ..., m) do
          Calculate the three components of motion:
            (1) The motion induced by other krill individuals
            (2) The foraging activity
            (3) The nonlinear optimized physical diffusion
          Implement crossover operator
          Update krill herd position and fitness values
          Calculate the linear nearest neighbor lasso step and new position using equations (24) and (25), and update new fitness values
          if Kyk > Ki (or Kj) then
            Leave Ki (or Kj) and delete Kyk
          else
            Leave Kyk and delete Ki (or Kj)
          end if
        end for
        Update Xgb and Kgb of the globally optimal individuals
      end for
      Output the global best solution, the number of selected features, and feature selection time
    End

Table 5: The distribution of sample categories.

| Attack type | Attack names |
|---|---|
| DoS | Neptune, back, land, pod, smurf, teardrop, mailbomb, Apache2, processtable, udpstorm, worm |
| Probe | Ipsweep, nmap, portsweep, Satan, mscan, saint |
| R2L | ftp_write, guess_passwd, imap, multihop, phf, spy, warezclient, warezmaster, sendmail, named, snmpgetattack, snmpguess, xlock, xsnoop, httptunnel |
| U2R | buffer_overflow, loadmodule, perl, rootkit, ps, sqlattack, xterm |

Table 6: The distribution of sample categories.

| Data category | KDDTrain+ samples | KDDTest+ samples | Total number of samples |
|---|---|---|---|
| Normal | 65120 | 11536 | 76656 |
| DoS | 36944 | 6251 | 43195 |
| Probe | 10786 | 2421 | 13207 |
| R2L | 995 | 2653 | 3648 |
| U2R | 52 | 67 | 119 |
| All | 113897 | 22928 | 136825 |

4.2. Experimental Results and Discussion of NSL-KDD Dataset. The experiment is conducted in MATLAB R2016a on a Windows 64-bit operating system with an Intel(R) Core(TM) i7-4790 processor. Since training the algorithm requires normal and abnormal samples, we mix normal samples and different types of attack samples to construct training sets and test sets for the four attack types. To reduce the time of searching for the optimal feature subset, we randomly select 50% of the Probe attack samples, 10% of the DoS attack samples, 100% of the U2R attack samples, and 100% of the R2L attack samples in the KDDTrain+ dataset as the training dataset, and 100% of the Probe samples, 50% of the DoS samples, 100% of the U2R samples, and 20% of the R2L samples in the KDDTest+ dataset as the test dataset.

For the LNNLS-KH algorithm, the maximum number of iterations $I_{\max}$ and the number of krill individuals $N$ are set to 100 and 30, respectively. Following [41], the foraging speed of krill individuals $V_f$ is set to 0.02, the maximum random diffusion rate $D_{\max}$ is set to 0.05, and the maximum induction speed $N_{\max}$ is set to 0.01. Following [47], the threshold $\theta$ is set to 0.7. As the LNNLS-KH algorithm is designed to ensure high accuracy first and to reduce the number of features second, the weight factor $\alpha$ in the fitness function is set to 0.02.
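To make the role of the weight factor concrete, the sketch below collects the parameter values listed above and evaluates an assumed weighted-sum fitness. Equation (18) is not reproduced in this section, so the form used here (classification error weighted by 1 - α plus the selected-feature ratio weighted by α) is only a stand-in chosen to match the stated behaviour that a smaller value means fewer features at higher accuracy; the names `PARAMS` and `assumed_fitness` are hypothetical.

```python
# Parameter values as stated in the text (keys are illustrative names).
PARAMS = {
    "I_max": 100,   # maximum number of iterations
    "N": 30,        # number of krill individuals
    "V_f": 0.02,    # foraging speed
    "D_max": 0.05,  # maximum random diffusion rate
    "N_max": 0.01,  # maximum induction speed
    "theta": 0.7,   # feature selection threshold
    "alpha": 0.02,  # weight factor in the fitness function
}


def assumed_fitness(accuracy, n_selected, n_total, alpha=PARAMS["alpha"]):
    """Assumed weighted-sum fitness, NOT the paper's equation (18); smaller is better."""
    error_term = (1.0 - alpha) * (1.0 - accuracy)
    size_term = alpha * (n_selected / n_total)
    return error_term + size_term


all_features = assumed_fitness(accuracy=0.95, n_selected=41, n_total=41)
few_features = assumed_fitness(accuracy=0.95, n_selected=7, n_total=41)
more_accurate = assumed_fitness(accuracy=0.96, n_selected=7, n_total=41)
```

With a small α such as 0.02, a one-point gain in accuracy outweighs the removal of many features under this assumed form, which is in line with the stated priority of accuracy over dimensionality.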

Table 7: The features of NSL-KDD dataset.

| Classification of features | Number | Serial number and name of features |
|---|---|---|
| The basic characteristics of TCP connections | 9 | (1) duration, (2) protocol_type, (3) service, (4) flag, (5) src_bytes, (6) dst_bytes, (7) land, (8) wrong_fragment, (9) urgent |
| The content characteristics of a TCP connection | 13 | (10) hot, (11) num_failed_logins, (12) logged_in, (13) num_compromised, (14) root_shell, (15) num_root, (16) su_attempted, (17) num_file_creations, (18) num_shells, (19) num_access_files, (20) num_outbound_cmds, (21) is_host_login, (22) is_guest_login |
| Time-based statistical characteristics of network traffic | 9 | (23) count, (24) srv_count, (25) serror_rate, (26) srv_serror_rate, (27) rerror_rate, (28) srv_rerror_rate, (29) same_srv_rate, (30) diff_srv_rate, (31) srv_diff_host_rate |
| Host-based network traffic statistics | 10 | (32) dst_host_count, (33) dst_host_srv_count, (34) dst_host_same_srv_rate, (35) dst_host_diff_srv_rate, (36) dst_host_same_src_port_rate, (37) dst_host_srv_diff_host_rate, (38) dst_host_serror_rate, (39) dst_host_srv_serror_rate, (40) dst_host_rerror_rate, (41) dst_host_srv_rerror_rate |

Table 8: Attack time and attack types of the CICIDS2017 dataset.

| Time | Type | Label | Amount | Total |
|---|---|---|---|---|
| Monday | Normal | BENIGN | 529918 | 529918 |
| Tuesday | Normal | BENIGN | 432074 | 445909 |
| | Brute force | FTP patator | 7938 | |
| | | SSH patator | 5897 | |
| Wednesday | Normal | BENIGN | 440031 | 692703 |
| | DoS | DoS GoldenEye | 10293 | |
| | | DoS slowhttptest | 5499 | |
| | | DoS slowloris | 5796 | |
| | | Heartbleed | 11 | |
| Thursday morning | Normal | BENIGN | 168186 | 170366 |
| | Web attack | Web attack brute force | 1507 | |
| | | Web attack sql injection | 21 | |
| | | Web attack XSS | 652 | |
| Thursday afternoon | Normal | BENIGN | 288566 | 288602 |
| | Infiltration | Infiltration | 36 | |
| Friday morning | Normal | BENIGN | 189067 | 191033 |
| | Botnet | Bot | 1966 | |
| Friday afternoon (1) | Normal | BENIGN | 97718 | 225745 |
| | DDoS | DDoS | 128027 | |
| Friday afternoon (2) | Normal | BENIGN | 127537 | 286467 |
| | PortScan | PortScan | 158930 | |

We adopt the iterative curve of the global optimal fitness value, the feature selection time, the test set detection time, the data dimension after feature selection, the classification accuracy, the detection rate (DR), and the false positive rate (FPR) as evaluation measures of feature selection for IDS. The accuracy represents the ratio of correctly classified samples to the total number of samples, as defined in equation (19). FPR, also known as the false alarm rate (FAR), represents the ratio of samples that are incorrectly detected as intrusions to all normal samples, as shown in equation (27). DR, also known as recall or sensitivity, represents the probability that an abnormal sample is correctly detected, as shown in equation (28):

$$
\mathrm{FPR} = \frac{FP}{TN + FP}, \tag{27}
$$

$$
\mathrm{DR} = \frac{TP}{TP + FN}. \tag{28}
$$

The crossover-mutation PSO (CMPSO) algorithm [47], the ACO algorithm [48], the KH algorithm [41], and the IKH algorithm [9] are used for comparison. The experimental results for the Probe, DoS, R2L, and U2R datasets are as follows.
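The three confusion-matrix measures can be computed directly from the counts in Table 2. The following sketch uses hypothetical counts purely for illustration; the function name `detection_metrics` is not from the paper.

```python
def detection_metrics(tp, fp, fn, tn):
    """Return (accuracy, FPR, DR) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)  # correctly classified / all samples
    fpr = fp / (tn + fp)                        # equation (27)
    dr = tp / (tp + fn)                         # equation (28), i.e., recall
    return accuracy, fpr, dr


# Hypothetical example: 100 attack samples and 100 normal samples.
acc, fpr, dr = detection_metrics(tp=90, fp=5, fn=10, tn=95)
# acc = 185/200, fpr = 5/100, dr = 90/100
```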

To reflect the performance of the LNNLS-KH algorithm intuitively, the convergence curves of the fitness function for the Probe, DoS, U2R, and R2L datasets are shown in Figure 9. The results show that the LNNLS-KH algorithm achieves a good fitness function value when the number of iterations reaches about 20, which demonstrates its strong exploitation ability and good convergence performance. As the number of iterations increases, the other algorithms show varying degrees of convergence stagnation, while the LNNLS-KH algorithm repeatedly jumps out of local optima and finds solutions with better fitness. The fitness function values after 100 iterations reach 0.0328, 0.0393, 0.0292, and 0.0036, respectively, for the four attack datasets, showing excellent exploration ability. Therefore, compared with the CMPSO, ACO, KH, and IKH algorithms, the LNNLS-KH algorithm exhibits faster convergence and stronger exploitation and exploration abilities.

Figure 9: Convergence curve of fitness functions for the four attack datasets (panels: Probe, DoS, U2R, and R2L; horizontal axis: number of iterations, 0 to 100; vertical axis: fitness function; curves: CMPSO, ACO, KH, IKH, and LNNLS-KH).

The results of the different feature selection algorithms are shown in Table 10. The number in front of the brackets indicates the quantity of features after feature selection, and the specific feature numbers are listed in the brackets. The comparison of feature selection dimensions is shown in Figure 10, where different colours are used to distinguish the five algorithms. The proposed LNNLS-KH algorithm, marked in red, is in the innermost circle of Figure 10 for the Probe, DoS, U2R, and R2L datasets. This indicates that, compared with the other four feature selection algorithms, the LNNLS-KH algorithm retains the fewest features while ensuring accuracy. According to Figure 10, the LNNLS-KH algorithm selects 7 main features of the NSL-KDD dataset on average, accounting for 17.07% of the total number of features. Compared with the CMPSO, ACO, KH, and IKH algorithms, the proposed LNNLS-KH algorithm reduces the number of features by 44%, 42.86%, 34.88%, and 24.32%, respectively, on the datasets of the four attack types. Overall, relative to the pooled feature counts of the four baseline algorithms on the four attack datasets, the number of features is reduced by 37.43%.
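The percentages above follow from the per-dataset feature counts reported in Table 10 (in the order Probe, DoS, U2R, R2L). The short check below recomputes them; the dictionary and helper names are illustrative.

```python
# Number of selected features per dataset (Probe, DoS, U2R, R2L), from Table 10.
counts = {
    "CMPSO": [14, 16, 9, 11],
    "ACO": [15, 16, 8, 10],
    "KH": [13, 12, 8, 10],
    "IKH": [11, 12, 6, 8],
    "LNNLS-KH": [8, 10, 3, 7],
}


def mean(values):
    return sum(values) / len(values)


ours = mean(counts["LNNLS-KH"])  # 7 features on average
share = 100 * ours / 41          # share of the 41 NSL-KDD features, about 17.07%

# Relative reduction of the average feature count against each baseline.
reduction = {
    name: 100 * (mean(vals) - ours) / mean(vals)
    for name, vals in counts.items()
    if name != "LNNLS-KH"
}

# Pooled reduction: all baseline counts together versus LNNLS-KH repeated four times.
baseline_total = sum(sum(v) for n, v in counts.items() if n != "LNNLS-KH")
overall = 100 * (baseline_total - 4 * sum(counts["LNNLS-KH"])) / baseline_total
```

Rounded to two decimals, `reduction` gives 44.00, 42.86, 34.88, and 24.32 for CMPSO, ACO, KH, and IKH, and `overall` gives 37.43, matching the figures in the text.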

To further evaluate the performance of the feature selection algorithms, we show the feature selection time and detection time of the five algorithms in Table 11. The feature selection time is the time spent filtering out redundant features. The detection time is the time from inputting the most representative feature subset into the KNN classifier to the end of detection. It can be seen from Table 11 that the feature selection time of the standard KH algorithm is shorter than that of the CMPSO and ACO algorithms, which indicates that the KH algorithm is faster. Compared with the standard KH algorithm, the feature selection time of the LNNLS-KH algorithm is longer, which is mainly due to the nonlinear optimization of the physical diffusion motion and the linear nearest neighbor lasso step optimization performed after the krill herd position is updated. Although the feature selection time increases, the convergence speed and global search ability are greatly improved. At the same time, the LNNLS-KH algorithm removes redundant features, which considerably increases the detection speed. In comparison with the other four feature selection algorithms, the detection time of the LNNLS-KH algorithm is reduced by 16.83%, 16.91%, 8.94%, and 6.96%, respectively, over the Probe, DoS, R2L, and U2R test samples.

Table 9: The number and name of the features in the CICIDS2017 dataset.

| Feature number | Feature name | Feature number | Feature name | Feature number | Feature name |
|---|---|---|---|---|---|
| 1 | Destination port | 27 | Bwd IAT mean | 53 | Average packet size |
| 2 | Flow duration | 28 | Bwd IAT std | 54 | Avg fwd segment size |
| 3 | Total fwd packets | 29 | Bwd IAT max | 55 | Avg bwd segment size |
| 4 | Total backward packets | 30 | Bwd IAT min | 56 | Fwd header length |
| 5 | Total length of fwd packets | 31 | Fwd PSH flags | 57 | Fwd avg bytes/bulk |
| 6 | Total length of bwd packets | 32 | Bwd PSH flags | 58 | Fwd avg packets/bulk |
| 7 | Fwd packet length max | 33 | Fwd URG flags | 59 | Fwd avg bulk rate |
| 8 | Fwd packet length min | 34 | Bwd URG flags | 60 | Bwd avg bytes/bulk |
| 9 | Fwd packet length mean | 35 | Fwd header length | 61 | Bwd avg packets/bulk |
| 10 | Fwd packet length std | 36 | Bwd header length | 62 | Bwd avg bulk rate |
| 11 | Bwd packet length max | 37 | Fwd packets/s | 63 | Subflow fwd packets |
| 12 | Bwd packet length min | 38 | Bwd packets/s | 64 | Subflow fwd bytes |
| 13 | Bwd packet length mean | 39 | Min packet length | 65 | Subflow bwd packets |
| 14 | Bwd packet length std | 40 | Max packet length | 66 | Subflow bwd bytes |
| 15 | Flow bytes/s | 41 | Packet length mean | 67 | Init_Win_bytes_forward |
| 16 | Flow packets/s | 42 | Packet length std | 68 | Init_Win_bytes_backward |
| 17 | Flow IAT mean | 43 | Packet length variance | 69 | act_data_pkt_fwd |
| 18 | Flow IAT std | 44 | FIN flag count | 70 | min_seg_size_forward |
| 19 | Flow IAT max | 45 | SYN flag count | 71 | Active mean |
| 20 | Flow IAT min | 46 | RST flag count | 72 | Active std |
| 21 | Fwd IAT total | 47 | PSH flag count | 73 | Active max |
| 22 | Fwd IAT mean | 48 | ACK flag count | 74 | Active min |
| 23 | Fwd IAT std | 49 | URG flag count | 75 | Idle mean |
| 24 | Fwd IAT max | 50 | CWE flag count | 76 | Idle std |
| 25 | Fwd IAT min | 51 | ECE flag count | 77 | Idle max |
| 26 | Bwd IAT total | 52 | Down/up ratio | 78 | Idle min |

Figure 10: Comparison of feature selection dimensions produced by different algorithms (axes: Probe, DoS, U2R, and R2L, scaled from 0 to 20; series: CMPSO, ACO, KH, IKH, and LNNLS-KH).

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the test dataset is classified using the KNN classifier. The classification accuracy of the different algorithms is shown in Table 12. Comparing the accuracy results, the LNNLS-KH feature selection algorithm achieves a classification accuracy above 90% for the Probe, DoS, U2R, and R2L test dataset samples. Furthermore, relative to the average of the other four algorithms, the LNNLS-KH algorithm improves the classification accuracy on the Probe, DoS, U2R, and R2L test dataset samples by 9.95%, 12.04%, 9.47%, and 8.66%, respectively.
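The reported accuracy improvements are consistent with taking, for each dataset, the LNNLS-KH accuracy minus the mean accuracy of the four baselines in Table 12. The sketch below reproduces that calculation; the reading of the "average improvement" as a difference in percentage points is an interpretation checked against the table values, not a definition stated by the authors.

```python
# Classification accuracy (%) from Table 12, in the order
# CMPSO, ACO, KH, IKH, LNNLS-KH.
accuracy = {
    "Probe": [80.46, 86.56, 92.42, 93.74, 98.24],
    "DoS": [81.74, 83.36, 86.03, 88.74, 97.01],
    "U2R": [82.74, 84.57, 85.59, 91.89, 95.67],
    "R2L": [78.70, 81.62, 88.78, 90.49, 93.56],
}

# Gain of LNNLS-KH over the mean of the four baselines, in percentage points.
gain = {
    name: vals[-1] - sum(vals[:-1]) / len(vals[:-1])
    for name, vals in accuracy.items()
}

mean_gain = sum(gain.values()) / len(gain)  # about 10.03 percentage points
```

The per-dataset gains come out at roughly 9.95, 12.04, 9.47, and 8.66, and their mean of about 10.03 agrees with the overall accuracy increase quoted for the NSL-KDD dataset.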

Table 10: The feature selection results of different feature selection algorithms (NSL-KDD dataset).

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Probe | 14 (2, 3, 4, 7, 8, 10, 11, 17, 19, 20, 21, 27, 30, 33) | 15 (1, 3, 4, 6, 15, 16, 17, 19, 21, 23, 29, 35, 39, 40, 41) | 13 (3, 4, 5, 7, 8, 13, 14, 18, 19, 21, 26, 28, 40) | 11 (2, 3, 5, 8, 10, 17, 18, 29, 34, 35, 41) | 8 (3, 4, 8, 11, 15, 29, 34, 40) |
| DoS | 16 (3, 4, 5, 6, 8, 13, 14, 17, 18, 22, 23, 26, 30, 32, 35, 41) | 16 (3, 4, 7, 12, 14, 19, 20, 25, 27, 28, 30, 33, 34, 37, 40, 41) | 12 (2, 3, 4, 5, 8, 9, 12, 15, 19, 24, 26, 30) | 12 (2, 3, 4, 6, 12, 18, 20, 22, 27, 28, 30, 31) | 10 (3, 4, 6, 15, 17, 19, 20, 21, 30, 37) |
| U2R | 9 (3, 4, 5, 9, 12, 19, 32, 33, 41) | 8 (3, 4, 6, 8, 20, 24, 33, 36) | 8 (3, 4, 10, 12, 19, 23, 31, 32) | 6 (3, 10, 11, 21, 36, 39) | 3 (3, 33, 36) |
| R2L | 11 (2, 3, 4, 8, 21, 22, 25, 27, 37, 40, 41) | 10 (3, 4, 7, 12, 17, 21, 29, 37, 38, 40) | 10 (2, 3, 4, 6, 13, 18, 19, 22, 32, 41) | 8 (3, 4, 5, 8, 11, 14, 21, 31) | 7 (2, 3, 4, 10, 15, 21, 36) |

Table 11: Feature selection time and detection time of different feature selection algorithms (NSL-KDD dataset).

Time of feature selection (second):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Probe | 5231.78 | 4998.14 | 4745.33 | 5348.87 | 5490.48 |
| DoS | 7892.35 | 7630.86 | 7168.52 | 8038.16 | 8296.92 |
| U2R | 154.87 | 147.29 | 144.18 | 157.79 | 172.24 |
| R2L | 2556.75 | 2369.08 | 2240.92 | 2669.51 | 2727.70 |

Time of detection (second):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Probe | 37.13 | 38.23 | 35.30 | 34.05 | 31.06 |
| DoS | 118.69 | 118.15 | 106.66 | 105.14 | 98.44 |
| U2R | 0.087 | 0.086 | 0.086 | 0.086 | 0.078 |
| R2L | 9.55 | 9.13 | 9.07 | 8.62 | 8.03 |

Table 12: The classification accuracy of different feature selection algorithms (NSL-KDD dataset).

| Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%) |
|---|---|---|---|---|---|
| Probe | 80.46 | 86.56 | 92.42 | 93.74 | 98.24 |
| DoS | 81.74 | 83.36 | 86.03 | 88.74 | 97.01 |
| U2R | 82.74 | 84.57 | 85.59 | 91.89 | 95.67 |
| R2L | 78.70 | 81.62 | 88.78 | 90.49 | 93.56 |

Figure 11: Comparison of classification FPR and DR of different feature selection algorithms on the Probe, DoS, U2R, and R2L datasets. (a) FPR (%) of different feature selection algorithms. (b) DR (%) of different feature selection algorithms.

Table 13 shows the false positive rate and detection rate of the feature subsets produced by the different feature selection algorithms. To visualize the difference, we show the comparison in Figure 11. For the Probe, DoS, U2R, and R2L datasets, the average false positive rate of the LNNLS-KH feature selection algorithm is 4.00%, which is lower by 20.70%, 15.30%, 8.88%, and 3.34%, respectively, than that of the CMPSO, ACO, KH, and IKH algorithms. Similarly, for the detection rate, the proposed LNNLS-KH feature selection algorithm exhibits excellent performance. The average detection rate of the LNNLS-KH algorithm is 96.48%, which is 13.47%, 9.32%, 7.02%, and 4.72% higher than that of the CMPSO, ACO, KH, and IKH feature selection algorithms, respectively.

In conclusion, the LNNLS-KH feature selection algorithm performs well in terms of the global optimal fitness iteration curve, test set detection time, number of dimensions of the feature subset, classification accuracy, false positive rate, and detection rate. Although the offline training time of the LNNLS-KH algorithm is longer than that of the CMPSO, ACO, KH, and IKH algorithms, its lower feature dimension reduces the detection time. Moreover, the algorithm has a faster convergence speed, higher detection accuracy, a lower false positive rate, and a higher detection rate.

4.3. Experimental Results and Discussion of CICIDS2017 Dataset. The experiment is conducted in MATLAB R2016a on a Windows 64-bit operating system with an Intel(R) Core(TM) i7-4790 processor. The MachineLearningCVE file in the CICIDS2017 dataset includes 8 CSV files of all traffic data, which contain 78 features plus an attack type tag after some duplicate features are removed. We annotate traffic records according to the different attack periods and types, and we standardize and normalize the dataset. Because the CSV files contain an excessive amount of data, training the model on the host would take too long and converge slowly. Therefore, we simplified and reintegrated these CSV data files while preserving the original attack timing features. We selected a total of 12090 records and 5 types of traffic, including 1 type of normal traffic and 4 types of attack traffic, namely "DoS," "DDoS," "PortScan," and "WebAttack." The data are randomly divided into training sets and test sets in a 2:1 ratio, with independent and repeated experiments.

Table 13: The classification FPR and DR of different feature selection algorithms (NSL-KDD dataset).

False positive rate (FPR) (%):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Probe | 22.37 | 18.04 | 8.50 | 4.05 | 1.18 |
| DoS | 21.27 | 14.08 | 11.45 | 7.88 | 2.85 |
| U2R | 24.51 | 21.04 | 16.13 | 8.45 | 4.30 |
| R2L | 30.66 | 24.05 | 15.42 | 8.99 | 7.67 |

Detection rate (DR) (%):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Probe | 82.32 | 89.18 | 95.01 | 95.22 | 97.73 |
| DoS | 79.12 | 82.08 | 83.77 | 85.23 | 96.80 |
| U2R | 87.02 | 89.79 | 90.14 | 93.67 | 95.52 |
| R2L | 83.56 | 87.56 | 88.91 | 92.89 | 95.85 |

Figure 12: Comparison of feature selection time and intrusion detection time for different feature selection algorithms on the Normal, DoS, DDoS, PortScan, and WebAttack subsets. (a) Feature selection time (second) for different feature selection algorithms. (b) Intrusion detection time (second) of different feature selection algorithms.

The CMPSO, ACO, KH, and IKH algorithms are used for comparison with the LNNLS-KH algorithm. The preprocessed Normal, DoS, DDoS, PortScan, and WebAttack subsets are input into the algorithm model successively, and the dimension and the feature subsets of feature selection are obtained. We adopt the KNN classification model as the classifier and obtain the accuracy of intrusion detection on the test set data. The results of feature selection for the CICIDS2017 dataset are shown in Table 14. The LNNLS-KH algorithm selects different features for different attack types. For example, the selected features of the DoS subset are "Total Length of Bwd Packets," "Fwd Packet Length Min," "Flow IAT Min," "FIN Flag Count," "RST Flag Count," "URG Flag Count," "Bwd Avg Packets/Bulk," "Idle Mean," and "Idle Std." For the WebAttack subset, "Total Fwd Packets," "Bwd IAT Max," "Bwd PSH Flags," "Fwd Packets/s," "Bwd Avg Packets/Bulk," "Subflow Fwd Bytes," "Active Max," and "Idle Max" are selected as attack features by the LNNLS-KH algorithm. The algorithm reduces the feature dimension of the IDS dataset while ensuring high accuracy. The average feature dimension selected by the LNNLS-KH algorithm is 10.2, accounting for 13.08% of the total number of features in the CICIDS2017 dataset. It decreases the number of features by 57.85%, 52.34%, 27.14%, and 25%, respectively, compared with the CMPSO, ACO, KH, and IKH algorithms.

Figure 12 shows the feature selection time and intrusion detection time of the 5 feature selection algorithms to further evaluate their performance. It can be seen from Figure 12(a) that, in the feature selection stage, the LNNLS-KH algorithm takes a long time to find the optimal feature subset due to the linear nearest neighbor lasso step optimization after the position update of the krill herd. Compared with the KH and IKH algorithms, it increases the time by an average of 14.38% and 9.32%, respectively. Although the LNNLS-KH algorithm requires more calculation time, the convergence speed and global search ability are improved. Figure 12(b) shows the intrusion detection time of the 5 feature selection algorithms, i.e., the time the KNN classifier takes to classify the sample dataset after the feature subset has been found, excluding the time of searching for the optimal feature subset. The feature dimension of the LNNLS-KH algorithm is low, so the amount of data processed when classifying the detection sample dataset is small, which results in a shorter classification detection time. Compared with the CMPSO, ACO, KH, and IKH algorithms, the intrusion detection time of the LNNLS-KH algorithm is reduced by 6.52%, 5.17%, 2.14%, and 2.28% on average, respectively.

Table 14: The number of feature selection for different algorithms (CICIDS2017 dataset).

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Normal | 28 (3, 7, 13, 15, 16, 17, 20, 22, 24, 26, 30, 35, 37, 38, 42, 43, 44, 45, 46, 49, 50, 56, 59, 62, 63, 64, 65, 76) | 25 (1, 3, 4, 7, 10, 11, 12, 13, 15, 19, 29, 32, 34, 35, 37, 43, 46, 47, 51, 55, 56, 58, 73, 76, 78) | 14 (11, 19, 33, 39, 43, 49, 55, 56, 58, 65, 66, 68, 71, 73) | 14 (5, 10, 19, 20, 21, 23, 27, 33, 43, 56, 69, 70, 73, 78) | 8 (6, 12, 16, 32, 38, 50, 54, 73) |
| DoS | 24 (1, 3, 4, 13, 16, 17, 24, 26, 30, 33, 35, 39, 40, 44, 48, 51, 53, 57, 58, 59, 60, 62, 67, 70) | 19 (3, 6, 12, 13, 15, 26, 35, 39, 51, 55, 60, 61, 66, 69, 71, 73, 75, 77, 78) | 13 (8, 16, 21, 30, 45, 50, 52, 57, 59, 63, 66, 67) | 14 (2, 12, 15, 16, 19, 21, 32, 34, 44, 46, 65, 68, 76, 77) | 9 (6, 8, 20, 44, 46, 49, 61, 75, 76) |
| DDoS | 29 (15, 18, 19, 20, 23, 25, 26, 33, 34, 35, 38, 39, 42, 43, 46, 47, 49, 51, 55, 56, 57, 59, 60, 61, 62, 63, 71, 72, 78) | 27 (6, 9, 10, 13, 16, 19, 24, 28, 31, 41, 42, 45, 47, 48, 50, 51, 52, 53, 54, 56, 59, 60, 61, 62, 65, 68, 72) | 21 (10, 12, 13, 15, 18, 23, 27, 30, 34, 35, 41, 42, 45, 55, 61, 63, 65, 66, 68, 70, 76) | 18 (1, 11, 13, 14, 19, 24, 32, 35, 36, 40, 42, 47, 51, 57, 60, 69, 70, 75) | 14 (2, 5, 8, 9, 11, 22, 26, 33, 41, 43, 47, 51, 74, 77) |
| PortScan | 24 (1, 3, 6, 15, 16, 28, 30, 33, 35, 37, 44, 45, 52, 56, 59, 60, 61, 63, 65, 68, 70, 75, 77, 78) | 21 (1, 2, 6, 10, 15, 17, 26, 27, 29, 39, 42, 43, 46, 49, 58, 61, 66, 69, 70, 71, 76) | 14 (15, 20, 22, 27, 37, 44, 49, 50, 53, 59, 62, 65, 67, 78) | 15 (1, 24, 30, 32, 33, 43, 49, 53, 54, 58, 60, 61, 63, 64, 69) | 12 (2, 6, 15, 24, 25, 28, 32, 57, 59, 63, 66, 76) |
| WebAttack | 16 (2, 7, 26, 29, 45, 47, 50, 52, 53, 54, 63, 66, 68, 69, 72, 78) | 15 (3, 9, 10, 12, 19, 26, 40, 46, 50, 54, 64, 65, 68, 69, 73) | 8 (1, 17, 19, 36, 48, 49, 53, 60) | 7 (14, 17, 35, 39, 44, 48, 54) | 8 (3, 29, 32, 37, 61, 64, 73, 77) |

Table 15: The classification accuracy of different feature selection algorithms (CICIDS2017 dataset).

| Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%) |
|---|---|---|---|---|---|
| Normal | 89.78 | 89.06 | 92.70 | 94.58 | 94.64 |
| DoS | 77.03 | 82.69 | 90.90 | 93.34 | 94.51 |
| DDoS | 81.73 | 86.94 | 91.85 | 88.19 | 95.76 |
| PortScan | 92.38 | 95.64 | 95.05 | 97.35 | 97.55 |
| WebAttack | 89.12 | 93.08 | 93.77 | 94.26 | 96.85 |

Table 16: The classification FPR and DR of different feature selection algorithms (CICIDS2017 dataset).

False positive rate (FPR) (%):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Normal | 9.25 | 8.72 | 6.41 | 4.93 | 3.67 |
| DoS | 5.41 | 4.48 | 4.06 | 2.83 | 1.94 |
| DDoS | 6.85 | 4.92 | 4.54 | 6.33 | 3.18 |
| PortScan | 4.65 | 3.02 | 2.84 | 1.86 | 1.16 |
| WebAttack | 5.33 | 3.16 | 2.52 | 2.11 | 1.60 |

Detection rate (DR) (%):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Normal | 88.05 | 88.51 | 89.25 | 92.46 | 93.89 |
| DoS | 72.57 | 82.89 | 87.86 | 92.56 | 92.64 |
| DDoS | 79.03 | 83.47 | 90.22 | 87.52 | 92.98 |
| PortScan | 88.25 | 93.80 | 94.33 | 95.14 | 95.42 |
| WebAttack | 87.40 | 91.35 | 92.19 | 92.94 | 94.77 |

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the KNN classifier is used to classify the test dataset. The classification accuracy of the different algorithms is shown in Table 15. For the five types of subsets, the average classification accuracy of the proposed LNNLS-KH algorithm is 95.86%. In particular, the classification accuracy reaches 97.55% for the PortScan subset. Compared with the average of the other four feature selection methods, the LNNLS-KH algorithm achieves an increase of 3.11%, 8.52%, 8.58%, 2.45%, and 4.29% on the Normal, DoS, DDoS, PortScan, and WebAttack subsets, respectively. Table 16 shows the classification FPR and DR of the different feature selection algorithms on the test sets. Across the five test sets, the LNNLS-KH algorithm has a lower FPR and a higher DR than the other four algorithms.

We propose the LNNLS-KH algorithm, a novel feature selection algorithm for intrusion detection. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the algorithm has good feature selection performance and improves the efficiency of intrusion detection.

5. Conclusions

With the rapid development of network technology, intrusion detection plays an increasingly important role in network security. However, the "dimensional disaster" caused by massive data results in problems such as slow response and poor accuracy of the intrusion detection system. The KH algorithm is a new population-based swarm intelligence optimization method that shows good performance in high-dimensional data processing, providing a new approach for reducing the dimension of intrusion detection data and selecting useful features. In this paper, an improved KH algorithm named LNNLS-KH is proposed for feature selection of IDS datasets by linear nearest neighbor lasso step optimization. The LNNLS-KH algorithm introduces a new fitness function, which is composed of the number of selected feature dimensions and the classification accuracy. Nonlinear optimization is introduced into the physical diffusion motion of krill individuals to accelerate the convergence of the algorithm. Moreover, the linear nearest neighbor lasso step optimization is proposed to balance the exploration and exploitation abilities and obtain the global optimal solution of the feature subset effectively. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the LNNLS-KH algorithm retains 7 and 10.2 features on average, respectively, which greatly reduces the dimension of the features. In the NSL-KDD dataset, features are reduced by 44%, 42.86%, 34.88%, and 24.32% compared with the CMPSO, ACO, KH, and IKH algorithms, and in the CICIDS2017 dataset, they are reduced by 57.85%, 52.34%, 27.14%, and 25%, respectively. In addition, the classification accuracy of the LNNLS-KH feature selection algorithm is increased by 10.03% and 5.39%, and the time of intrusion detection is reduced by 12.41% and 4.03% on the two datasets. Furthermore, the LNNLS-KH algorithm enhances the ability to jump out of local optimal solutions and shows good performance in the optimal fitness iteration curve, false positive rate of detection, and convergence speed, which demonstrates that the proposed LNNLS-KH algorithm is an efficient feature selection method for network intrusion detection.

In this research, we found that the initialization of the LNNLS-KH algorithm has a certain degree of randomness. Therefore, we conducted independent, repeated experiments to mitigate this effect, and the results were reasonable and convincing. Although the proposed algorithm shows encouraging performance, it could be further improved.

In future work, we will consider using data balancing techniques to preprocess the experimental datasets in order to obtain more accurate feature selection results and stronger algorithm stability. Meanwhile, we will combine LNNLS-KH with other algorithms to improve its exploration and exploitation abilities, thereby further shortening the time needed to train the feature subset and to perform classification and detection. In addition, because the LNNLS-KH algorithm is generally applicable, it can be applied to other feature selection systems and to optimization problems in other fields.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was sponsored by the National Key Research and Development Program of China (Grants 2018YFB0804002 and 2017YFB0803204), National Natural Science Foundation of P.R. China (Grant 72001191), Henan Natural Science Foundation (Grant 202300410442), and Henan Philosophy and Social Science Program (Grant 2020CZH009).

References

[1] W. Wei and C. Guo, “A text semantic topic discovery method based on the conditional co-occurrence degree,” Neurocomputing, vol. 368, pp. 11–24, 2019.

[2] C.-R. Wang, R.-F. Xu, S.-J. Lee, and C.-H. Lee, “Network intrusion detection using equality constrained-optimization-based extreme learning machines,” Knowledge-Based Systems, vol. 147, pp. 68–80, 2018.

[3] G.-G. Wang, A. H. Gandomi, A. H. Alavi, and D. Gong, “A comprehensive review of krill herd algorithm: variants, hybrids and applications,” Artificial Intelligence Review, vol. 51, no. 1, pp. 119–148, 2019.

[4] J. Amudhavel, D. Sathian, R. S. Raghav et al., “A fault tolerant distributed self-organization in peer to peer (p2p) using krill herd optimization,” in Proceedings of the 2015 International Conference on Advanced Research in Computer Science Engineering & Technology (ICARCSET 2015), pp. 1–5, Unnao, India, 2015.

[5] L. M. Abualigah, A. T. Khader, and E. S. Hanandeh, “Hybrid clustering analysis using improved krill herd algorithm,” Applied Intelligence, vol. 48, no. 11, pp. 4047–4071, 2018.

[6] P. A. Kowalski and S. Łukasik, “Training neural networks with krill herd algorithm,” Neural Processing Letters, vol. 44, no. 1, pp. 5–17, 2016.

[7] C. Stasinakis, G. Sermpinis, I. Psaradellis, and T. Verousis, “Krill-Herd Support Vector Regression and heterogeneous autoregressive leverage: evidence from forecasting and trading commodities,” Quantitative Finance, vol. 16, no. 12, pp. 1901–1915, 2016.


[8] L. Wang, P. Jia, T. Huang, S. Duan, J. Yan, and L. Wang, “A novel optimization technique to improve gas recognition by electronic noses based on the enhanced krill herd algorithm,” Sensors, vol. 16, no. 8, p. 1275, 2016.

[9] R. Jensi and G. W. Jiji, “An improved krill herd algorithm with global exploration capability for solving numerical function optimization problems and its application to data clustering,” Applied Soft Computing, vol. 46, pp. 230–245, 2016.

[10] H. Pulluri, R. Naresh, and V. Sharma, “Application of stud krill herd algorithm for solution of optimal power flow problems,” International Transactions on Electrical Energy Systems, vol. 27, no. 6, Article ID e2316, 2017.

[11] D. Rodrigues, L. A. M. Pereira, J. P. Papa et al., “A binary krill herd approach for feature selection,” in Proceedings of the 2014 22nd International Conference on Pattern Recognition, pp. 1407–1412, IEEE, Stockholm, Sweden, August 2014.

[12] A. Mukherjee and V. Mukherjee, “Chaotic krill herd algorithm for optimal reactive power dispatch considering FACTS devices,” Applied Soft Computing, vol. 44, pp. 163–190, 2016.

[13] S. Sun, H. Qi, F. Zhao, L. Ruan, and B. Li, “Inverse geometry design of two-dimensional complex radiative enclosures using krill herd optimization algorithm,” Applied Thermal Engineering, vol. 98, pp. 1104–1115, 2016.

[14] S. Sultana and P. K. Roy, “Oppositional krill herd algorithm for optimal location of capacitor with reconfiguration in radial distribution system,” International Journal of Electrical Power & Energy Systems, vol. 74, pp. 78–90, 2016.

[15] L. Brezocnik, I. Fister, and V. Podgorelec, “Swarm intelligence algorithms for feature selection: a review,” Applied Sciences, vol. 8, no. 9, 2018.

[16] D. Smith, Q. Guan, and S. Fu, “An anomaly detection framework for autonomic management of compute cloud systems,” in Proceedings of the 2010 IEEE 34th Annual Computer Software and Applications Conference Workshops, pp. 376–381, IEEE, Seoul, South Korea, July 2010.

[17] Y. Zhao, Y. Zhang, W. Tong et al., “An improved feature selection algorithm based on MAHALANOBIS distance for network intrusion detection,” in Proceedings of 2013 International Conference on Sensor Network Security Technology and Privacy Communication System, pp. 69–73, IEEE, Nangang, China, May 2013.

[18] P. Singh and A. Tiwari, “An efficient approach for intrusion detection in reduced features of KDD99 using ID3 and classification with KNNGA,” in Proceedings of the 2015 Second International Conference on Advances in Computing and Communication Engineering, pp. 445–452, IEEE, Dehradun, India, May 2015.

[19] M. A. Ambusaidi, X. He, P. Nanda, and Z. Tan, “Building an intrusion detection system using a filter-based feature selection algorithm,” IEEE Transactions on Computers, vol. 65, no. 10, pp. 2986–2998, 2016.

[20] N. Shone, T. N. Ngoc, V. D. Phai, and Q. Shi, “A deep learning approach to network intrusion detection,” IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 2, no. 1, pp. 41–50, 2018.

[21] Y. Xue, W. Jia, X. Zhao et al., “An evolutionary computation based feature selection method for intrusion detection,” Security and Communication Networks, vol. 2018, Article ID 2492956, 10 pages, 2018.

[22] Z. Shen, Y. Zhang, and W. Chen, “A bayesian classification intrusion detection method based on the fusion of PCA and LDA,” Security and Communication Networks, vol. 2019, Article ID 6346708, 11 pages, 2019.

[23] P. Sun, P. Liu, Q. Li et al., “DL-IDS: Extracting features using CNN-LSTM hybrid network for intrusion detection system,” Security and Communication Networks, vol. 2020, Article ID 8890306, 11 pages, 2020.

[24] G. Farahani, “Feature selection based on cross-correlation for the intrusion detection system,” Security & Communication Networks, vol. 2020, Article ID 8875404, 17 pages, 2020.

[25] F. G. Mohammadi, M. H. Amini, and H. R. Arabnia, “Applications of nature-inspired algorithms for dimension Reduction: enabling efficient data analytics,” in Advances in Intelligent Systems and Computing, Optimization, Learning, and Control for Interdependent Complex Networks, pp. 67–84, Springer, Cham, Switzerland, 2020.

[26] J. Kennedy and R. Eberhart, “Particle swarm optimization,” in Proceedings of the ICNN’95-International Conference on Neural Networks, no. 4, pp. 1942–1948, IEEE, Perth, WA, Australia, December 1995.

[27] M. Dorigo, M. Birattari, and T. Stutzle, “Ant colony optimization,” IEEE Computational Intelligence Magazine, vol. 1, no. 4, pp. 28–39, 2006.

[28] R. Rajabioun, “Cuckoo optimization algorithm,” Applied Soft Computing, vol. 11, no. 8, pp. 5508–5518, 2011.

[29] M. Neshat, G. Sepidnam, M. Sargolzaei, and A. N. Toosi, “Artificial fish swarm algorithm: a survey of the state-of-the-art, hybridization, combinatorial and indicative applications,” Artificial Intelligence Review, vol. 42, no. 4, pp. 965–997, 2014.

[30] D. Karaboga, “An idea based on honey bee swarm for numerical optimization,” Technical Report-tr06, Erciyes university, Engineering Faculty, Computer Engineering Department, Kayseri, Turkey, 2005.

[31] W.-T. Pan, “A new Fruit Fly Optimization Algorithm: taking the financial distress model as an example,” Knowledge-Based Systems, vol. 26, pp. 69–74, 2012.

[32] R. Zhao and W. Tang, “Monkey algorithm for global numerical optimization,” Journal of Uncertain Systems, vol. 2, no. 3, pp. 165–176, 2008.

[33] X. S. Yang and X. He, “Bat algorithm: literature review and applications,” International Journal of Bio-Inspired Computation, vol. 5, no. 3, pp. 141–149, 2013.

[34] S. Mirjalili, A. H. Gandomi, S. Z. Mirjalili, S. Saremi, H. Faris, and S. M. Mirjalili, “Salp Swarm Algorithm: a bio-inspired optimizer for engineering design problems,” Advances in Engineering Software, vol. 114, pp. 163–191, 2017.

[35] K. Ahmed, A. E. Hassanien, and S. Bhattacharyya, “A novel chaotic chicken swarm optimization algorithm for feature selection,” in Proceedings of the 2017 Third International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), pp. 259–264, IEEE, Kolkata, India, November 2017.

[36] S. Tabakhi, P. Moradi, F. Akhlaghian et al., “An unsupervised feature selection algorithm based on ant colony optimization,” Engineering Applications of Artificial Intelligence, vol. 32, pp. 112–123, 2014.

[37] S. Arora and P. Anand, “Binary butterfly optimization approaches for feature selection,” Expert Systems with Applications, vol. 116, pp. 147–160, 2019.

[38] C. Yan, J. Ma, H. Luo, and A. Patel, “Hybrid binary coral reefs optimization algorithm with simulated annealing for feature selection in high-dimensional biomedical datasets,” Chemometrics and Intelligent Laboratory Systems, vol. 184, pp. 102–111, 2019.

[39] G. I. Sayed, A. Tharwat, and A. E. Hassanien, “Chaotic dragonfly algorithm: an improved metaheuristic algorithm for feature selection,” Applied Intelligence, vol. 49, no. 1, pp. 188–205, 2019.

[40] Z. Zhang, P. Wei, Y. Li et al., “Feature selection algorithm based on improved particle swarm joint taboo search,” Journal of Communication, vol. 39, no. 12, pp. 60–68, 2018.

[41] A. H. Gandomi and A. H. Alavi, “Krill herd: a new bio-inspired optimization algorithm,” Communications in Nonlinear Science and Numerical Simulation, vol. 17, no. 12, pp. 4831–4845, 2012.

[42] Q. Tan and Z. Huang, “Krill herd with nearest neighbor lasso operator,” Computer Engineering and Applications, vol. 55, no. 9, pp. 124–129, 2019.

[43] Q. Wang, C. Ding, and X. Wang, “A hybrid data clustering algorithm based on improved krill herd algorithm and KHM clustering,” Control and Decision, vol. 35, no. 10, pp. 2449–2458, 2018.

[44] Q. Li and B. Liu, “Clustering using an improved krill herd algorithm,” Algorithms, vol. 10, no. 2, p. 56, 2017.

[45] G.-G. Wang, A. H. Gandomi, and A. H. Alavi, “Stud krill herd algorithm,” Neurocomputing, vol. 128, pp. 363–370, 2014.

[46] J. Li, Y. Tang, C. Hua, and X. Guan, “An improved krill herd algorithm: krill herd with linear decreasing step,” Applied Mathematics and Computation, vol. 234, pp. 356–367, 2014.

[47] H. B. Nguyen, B. Xue, P. Andreae et al., “Particle swarm optimisation with genetic operators for feature selection,” in Proceedings of the 2017 IEEE Congress on Evolutionary Computation (CEC), pp. 286–293, IEEE, San Sebastian, Spain, June 2017.

[48] M. H. Aghdam and P. Kabiri, “Feature selection for intrusion detection system using ant colony optimization,” International Journal of Network Security, vol. 18, no. 3, pp. 420–432, 2016.



Table 3: Benchmark functions in the experiment.

Benchmark function | Dim | Range | $f_{\min}$
$F_1(x) = \sum_{i=1}^{n} \lvert x_i \rvert + \prod_{i=1}^{n} \lvert x_i \rvert$ | 10 | [-10, 10] | 0
$F_2(x) = -20\exp\left(-0.2\sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}\right) - \exp\left(\frac{1}{n}\sum_{i=1}^{n}\cos(2\pi x_i)\right) + 20 + e$ | 10 | [-32, 32] | 0

Figure 4: Schwefel 2.22 and Ackley function graphs for n = 2 (axes: $x_1$, $x_2$). (a) Unimodal benchmark function Schwefel 2.22 ($F_1$). (b) Multimodal benchmark function Ackley ($F_2$).

Table 4: The statistical results of KH and NO-KH algorithms on two benchmark functions.

f(x) | Algorithm | Best value | Worst value | Mean value | Standard deviation
F1 | KH | 1.692E-04 | 1.099E-02 | 1.508E-03 | 3.342E-03
F1 | NO-KH | 3.277E-05 | 9.632E-04 | 4.221E-04 | 3.908E-04
F2 | KH | 5.716E-05 | 2.168 | 0.329 | 0.816
F2 | NO-KH | 8.309E-06 | 1.155 | 0.116 | 0.362

Figure 5: Optimization of the linear nearest neighbor lasso step for krill individuals at the same end of food (legend: the position of food; the position of krill $X_i$; the position of new krill $Y_i$ after LNNLS; the distance between two krill; the length of LNNLS).

Figure 6: Optimization of the linear nearest neighbor lasso step for krill individuals at both ends of food (legend: the position of food; the position of krill $X_i$; the position of new krill $Y_i$ after LNNLS; the distance between two krill; the length of LNNLS).


The pseudocode of the LNNLS-KH algorithm is shown in Algorithm 1.

3.3. Analysis of Time Complexity. In the KH algorithm, each krill individual updates its position after the movement induced by other krill individuals, the foraging activity, and the physical diffusion motion, with a time complexity of $O(N)$. After $I_{\max}$ iterations, the time complexity of the algorithm is $O(I_{\max} \cdot N)$. In the LNNLS-KH algorithm, the modified fitness function and the nonlinear optimization of the physical diffusion motion add almost no computation, so this time complexity is unchanged. In addition, the linear nearest neighbor lasso step optimization adds the calculations of equations (24) and (25) after each krill individual completes its position update in every iteration, which contributes a further $O(I_{\max} \cdot N)$. Therefore, the total time complexity of the LNNLS-KH algorithm is $O(2 I_{\max} \cdot N)$, which is of the same order as that of the KH algorithm.
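As a rough worked count of the argument above, the two costs can be written side by side. The numerical values $I_{\max} = 100$ and $N = 30$ are the settings reported for the experiments in Section 4.2; counting one "position update" as one unit of work is a simplifying assumption made only for this illustration.

```latex
\begin{aligned}
T_{\mathrm{KH}} &= O(I_{\max}\cdot N), \\
T_{\mathrm{LNNLS\text{-}KH}} &= O(I_{\max}\cdot N) + O(I_{\max}\cdot N)
  = O(2\,I_{\max}\cdot N) = O(I_{\max}\cdot N), \\
I_{\max} = 100,\ N = 30 &\;\Rightarrow\; I_{\max}\cdot N = 3000,
  \qquad 2\,I_{\max}\cdot N = 6000 .
\end{aligned}
```

Under this counting, the lasso step roughly doubles the number of position updates per run without changing the asymptotic order.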

3.4. Description of the LNNLS-KH Algorithm for IDS Feature Selection. An IDS is a system that recognizes and processes malicious usage of computers and network resources. The intrusion detection dataset records normal and abnormal traffic, including network traffic data and types of network attacks, and provides data support for the research and development of intrusion detection technology. An IDS is generally composed of data acquisition, data preprocessing, detection units, and response actions, as shown in Figure 7.

The LNNLS-KH algorithm is used to select high-quality feature subsets for the IDS. The features of the intrusion detection dataset are randomly initialized to different real numbers in the range [0, 1], which constitute the position vectors of the krill herd. By calculating the fitness function and carrying out the LNNLS-KH algorithm, the position vectors of the krill herd are constantly updated. The fitness function is determined by the number of selected features and the classification accuracy, so the position vectors of the krill herd move toward the optimal fitness value. According to [47], it is appropriate to set the feature selection threshold to 0.7. When the maximum number of iterations is reached, the components of the krill position vector that are larger than the threshold are selected, and the corresponding features constitute the feature subset of the intrusion detection data. The selected feature subset is then sent to the detection units. Because the K-Nearest Neighbor (KNN) algorithm is relatively mature in theory, the detection units adopt the KNN algorithm to construct the intrusion detection classifier. Finally, the intrusion detection results are evaluated on the test dataset. The process of the LNNLS-KH algorithm for IDS feature selection is shown in Figure 8.
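The thresholding step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the helper names `select_features` and `fitness` are hypothetical, and the weighted-sum form of the fitness (feature ratio weighted by α, classification error weighted by 1 − α, lower is better) is an assumption, since the exact fitness equation is not given in this section. The classifier is abstracted as a callable so that any KNN implementation could be plugged in.

```python
def select_features(position, threshold=0.7):
    """Return the indices whose position component exceeds the threshold."""
    return [i for i, p in enumerate(position) if p > threshold]


def fitness(position, accuracy_of, alpha=0.02, threshold=0.7):
    """Assumed weighted-sum fitness (lower is better).

    accuracy_of: hypothetical callable mapping a list of feature indices
    to a classification accuracy in [0, 1] (e.g. from a KNN classifier).
    """
    selected = select_features(position, threshold)
    if not selected:
        # Assumption: an empty subset cannot be classified, so penalize it.
        return 1.0
    feature_ratio = len(selected) / len(position)
    error = 1.0 - accuracy_of(selected)
    return alpha * feature_ratio + (1.0 - alpha) * error


# Toy usage: a 5-dimensional krill position and a stub classifier that
# always reports 95% accuracy.
krill = [0.9, 0.1, 0.75, 0.7, 0.3]
subset = select_features(krill)
score = fitness(krill, accuracy_of=lambda idx: 0.95)
print(subset, round(score, 3))
```

Note that a component exactly equal to the threshold (0.7 in the toy vector) is not selected, because the text asks for values larger than the threshold.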

4. Results and Discussion

To verify the performance of the LNNLS-KH algorithm in IDS feature selection, we adopt the NSL-KDD network intrusion detection dataset and the CICIDS2017 dataset for experiments.

4.1. Datasets Analysis. The NSL-KDD dataset is a classic dataset in the field of anomaly detection. As an improved version of the KDD CUP 99 dataset, it is currently one of the most reliable and influential intrusion detection datasets. Compared with the KDD CUP 99 dataset, the NSL-KDD dataset eliminates duplicate data, so it hardly contains redundant records. Meanwhile, the proportion of each type of record in the NSL-KDD dataset has been adjusted to make it reasonable. Each record in the NSL-KDD dataset includes 41-dimensional features and a classification label. KDDTrain+ and KDDTest+ in the NSL-KDD dataset are selected as the training subset and the test subset. The attacks are divided into four types: denial of service (DoS), scan and probe (Probe), remote to local (R2L), and user to root (U2R). The detailed attack names and the distribution of sample categories are shown in Tables 5 and 6. The features of the NSL-KDD dataset are shown in Table 7.

The NSL-KDD dataset includes four types of features: the basic features of TCP connections (9 in total), the content features of TCP connections (13 in total), the time-based network traffic statistics (9 in total), and the host-based network traffic statistics (10 in total). Among all the features, “Protocol_type,” “service,” and “flag” are character-type features, which need to be preprocessed and mapped to numerical values. Because mixed numeric and character data types are difficult to deal with, one-hot encoding is used to map different characters to different values. For example, the “Protocol_type” feature includes three types of protocol, denoted by icmp = [1, 0, 0], tcp = [0, 1, 0], and udp = [0, 0, 1]. Similarly, the 70 attributes in “service” and the 11 attributes in “flag” are numeralized in the same way. The 41-dimensional feature is thus expanded to 122 dimensions after one-hot encoding. At the same time, the dataset is normalized to eliminate the influence of features of different orders of magnitude on the calculation results, thus reducing the experimental error. The data preprocessing helps to improve the accuracy of classification and ensures the reliability of the results. The values corresponding to each feature are normalized to the interval [0, 1], and the normalization expression is as follows:

$$X^{*} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}, \tag{26}$$

where $X^{*}$ is the normalized eigenvalue, $X$ is the original eigenvalue, and $X_{\max}$ and $X_{\min}$ represent the maximum and minimum values of the same feature dimension.
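The preprocessing described above (one-hot encoding of character features followed by min-max normalization) can be illustrated with a short Python sketch. The function names are hypothetical, and mapping a constant column to 0 is an assumption made here to avoid division by zero; the text does not say how such columns are handled.

```python
def one_hot(value, categories):
    """Encode a categorical value as a 0/1 vector over the given categories."""
    return [1 if value == c else 0 for c in categories]


def min_max_normalize(column):
    """Scale a numeric column to [0, 1] with (X - Xmin) / (Xmax - Xmin)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant column carries no information; map it to 0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


protocols = ["icmp", "tcp", "udp"]
print(one_hot("tcp", protocols))          # [0, 1, 0]
print(min_max_normalize([2, 4, 6]))       # [0.0, 0.5, 1.0]

# Dimension after encoding: the 3 character features are replaced by
# 3 + 70 + 11 binary columns, as stated in the text.
expanded_dim = 41 - 3 + (3 + 70 + 11)
print(expanded_dim)                       # 122
```

The last line reproduces the 122-dimensional figure quoted in the text from the attribute counts of the three character features.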

Although NSL-KDD is a benchmark dataset in the field of network intrusion detection, some of its attack types are outdated due to the rapid development of network technology. Therefore, it hardly reflects the current real-network environment. CICIDS2017 is a novel network intrusion detection dataset released by the Canadian Institute for Cybersecurity (CIC) in 2017. The dataset collected traffic data for five days, with only normal traffic on Monday and attacks occurring in the morning and afternoon from Tuesday to Friday. It includes “FTP patator,” “SSH patator,” “DoS GoldenEye,” “DoS Slowhttptest,” “DoS Slowloris,” “Heartbleed,” “Web Attack Brute Force,” “Web Attack Sql Injection,” “Web Attack XSS,” “Infiltration Attack,” “Bot,” “DDoS,” and “PortScan,” which are common types of attacks in modern networks. The distribution of attack time and types of the CICIDS2017 dataset is shown in Table 8. We use the MachineLearningCVE file in the CICIDS2017 dataset, which contains 78 features and an attack type label. The number and name of each feature are shown in Table 9. Compared with the NSL-KDD dataset, the attack types in the CICIDS2017 dataset are more in line with the situation of modern networks.

Figure 7: The framework of IDS (data acquisition, data preprocessing, detection units, and response actions).

Figure 8: The process of the LNNLS-KH algorithm for IDS feature selection (flowchart: input the IDS dataset; initialize the parameters N, NV, $I_{\max}$, UB, LB and the krill herd position; calculate the fitness of individuals; calculate the three actions, namely, movement induced by other krill individuals, foraging activity, and nonlinear physical diffusion motion; apply the genetic operator; update the position and fitness values of individuals; find the nearest krill, calculate the linear lasso step, and keep the better of $Y_k$ and $X_i$ (or $X_j$); update $X_{gb}$ and $K_{gb}$ of the global optimal individuals; repeat until the iteration count exceeds $I_{\max}$; output the optimal solution and the number of selected features; apply the KNN algorithm for intrusion detection; evaluate the intrusion detection results).

Algorithm 1: The LNNLS-KH algorithm.

Input: Training set
Output: Global best solution, the number of selected features, and feature selection time

Begin
  Initialize algorithm parameters $N_{\max}$, $V_f$, $D_{\max}$, NV, $I_{\max}$, UB, LB
  Initialize the krill herd position
  Evaluate the fitness of krill individuals and find the individuals with the best and worst fitness values
  for I = 1 to $I_{\max}$ do
    for each krill individual i (i = 1, 2, ..., m) do
      Calculate the three components of motion:
        (1) the motion induced by other krill individuals
        (2) the foraging activity
        (3) the nonlinear optimized physical diffusion
      Implement the crossover operator
      Update krill herd position and fitness values
      Calculate the linear nearest neighbor lasso step and new position using equations (24) and (25), and update the new fitness values
      if $K_{yk} > K_i$ (or $K_j$) then
        Leave $K_i$ (or $K_j$) and delete $K_{yk}$
      else
        Leave $K_{yk}$ and delete $K_i$ (or $K_j$)
      end if
    end for
    Update $X_{gb}$ and $K_{gb}$ of the globally optimal individuals
  end for
  Output the global best solution, the number of selected features, and feature selection time
End

Table 5: The distribution of sample categories.

Attack type | Attack names
DoS | Neptune, back, land, pod, smurf, teardrop, mailbomb, Apache2, processtable, udpstorm, worm
Probe | Ipsweep, nmap, portsweep, Satan, mscan, saint
R2L | ftp_write, guess_passwd, imap, multihop, phf, spy, warezclient, warezmaster, sendmail, named, snmpgetattack, snmpguess, xlock, xsnoop, httptunnel
U2R | buffer_overflow, loadmodule, perl, rootkit, ps, sqlattack, xterm

Table 6: The distribution of sample categories.

Data category | KDDTrain+ samples | KDDTest+ samples | Total number of samples
Normal | 65120 | 11536 | 76656
DoS | 36944 | 6251 | 43195
Probe | 10786 | 2421 | 13207
R2L | 995 | 2653 | 3648
U2R | 52 | 67 | 119
All | 113897 | 22928 | 136825

4.2. Experimental Results and Discussion of NSL-KDD Dataset. The experiments are conducted in MATLAB R2016a on a 64-bit Windows operating system with an Intel(R) Core(TM) i7-4790 processor. Since training the algorithm requires both normal and abnormal samples, we mix normal samples with samples of each attack type to construct training sets and test sets for the four attack types. To reduce the time spent searching for the optimal feature subset, we randomly select 50% of the Probe attack samples, 10% of the DoS attack samples, 100% of the U2R attack samples, and 100% of the R2L attack samples in the KDDTrain+ dataset as the training dataset, and 100% of the Probe samples, 50% of the DoS samples, 100% of the U2R samples, and 20% of the R2L samples in the KDDTest+ dataset as the test dataset.
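The random subsampling of KDDTrain+ described above could be sketched as follows. This is an illustrative sketch only: the `subsample` helper, the fixed seed, and the use of integer ranges as stand-ins for records are assumptions, and the way normal samples are mixed in is not specified in the text and is therefore not modelled here. The attack counts are the KDDTrain+ values from Table 6.

```python
import random

# Sampling fractions for the training set, as stated in the text.
TRAIN_FRACTIONS = {"Probe": 0.5, "DoS": 0.1, "U2R": 1.0, "R2L": 1.0}

# KDDTrain+ attack sample counts taken from Table 6.
TRAIN_COUNTS = {"Probe": 10786, "DoS": 36944, "U2R": 52, "R2L": 995}


def subsample(records_by_type, fractions, seed=0):
    """Draw a random subset of each attack type without replacement."""
    rng = random.Random(seed)  # fixed seed (assumption) for repeatable draws
    sampled = {}
    for attack, records in records_by_type.items():
        k = round(len(records) * fractions[attack])
        sampled[attack] = rng.sample(records, k)
    return sampled


# Stand-in records: each record is represented by its row index only.
train_records = {a: list(range(n)) for a, n in TRAIN_COUNTS.items()}
train_subset = subsample(train_records, TRAIN_FRACTIONS)
print({a: len(rows) for a, rows in train_subset.items()})
```

With these fractions the sketch keeps 5393 Probe, 3694 DoS, 52 U2R, and 995 R2L training records; the test set could be drawn the same way with the test fractions given in the text.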

For the LNNLS-KH algorithm, the maximum number of iterations $I_{\max}$ and the number of krill individuals $N$ are set to 100 and 30, respectively. Following [41], the foraging speed of krill individuals $V_f$ is set to 0.02, the maximum random diffusion rate $D_{\max}$ is set to 0.05, and the maximum induction speed $N_{\max}$ is set to 0.01. Following [47], the threshold θ is set to 0.7. As the LNNLS-KH algorithm is designed to ensure high accuracy first and to reduce the number of features second, the weight factor α in the fitness function is set to 0.02.

Table 7: The features of the NSL-KDD dataset.

Classification of features | Number | Serial number and name of features
The basic characteristics of TCP connections | 9 | (1) duration, (2) protocol_type, (3) service, (4) flag, (5) src_bytes, (6) dst_bytes, (7) land, (8) wrong_fragment, (9) urgent
The content characteristics of a TCP connection | 13 | (10) hot, (11) num_failed_logins, (12) logged_in, (13) num_compromised, (14) root_shell, (15) num_root, (16) su_attempted, (17) num_file_creations, (18) num_shells, (19) num_access_files, (20) num_outbound_cmds, (21) is_host_login, (22) is_guest_login
Time-based statistical characteristics of network traffic | 9 | (23) count, (24) srv_count, (25) serror_rate, (26) srv_serror_rate, (27) rerror_rate, (28) srv_rerror_rate, (29) same_srv_rate, (30) diff_srv_rate, (31) srv_diff_host_rate
Host-based network traffic statistics | 10 | (32) dst_host_count, (33) dst_host_srv_count, (34) dst_host_same_srv_rate, (35) dst_host_diff_srv_rate, (36) dst_host_same_src_port_rate, (37) dst_host_srv_diff_host_rate, (38) dst_host_serror_rate, (39) dst_host_srv_serror_rate, (40) dst_host_rerror_rate, (41) dst_host_srv_rerror_rate

Table 8: Attack time and attack types of the CICIDS2017 dataset.

Time | Type | Label | Amount | Total
Monday | Normal | BENIGN | 529918 | 529918
Tuesday | Normal | BENIGN | 432074 | 445909
Tuesday | Brute force | FTP patator | 7938 |
Tuesday | Brute force | SSH patator | 5897 |
Wednesday | Normal | BENIGN | 440031 | 692703
Wednesday | DoS | DoS GoldenEye | 10293 |
Wednesday | DoS | DoS slowhttptest | 5499 |
Wednesday | DoS | DoS slowloris | 5796 |
Wednesday | DoS | Heartbleed | 11 |
Thursday morning | Normal | BENIGN | 168186 | 170366
Thursday morning | Web attack | Web attack brute force | 1507 |
Thursday morning | Web attack | Web attack sql injection | 21 |
Thursday morning | Web attack | Web attack XSS | 652 |
Thursday afternoon | Normal | BENIGN | 288566 | 288602
Thursday afternoon | Infiltration | Infiltration | 36 |
Friday morning | Normal | BENIGN | 189067 | 191033
Friday morning | Botnet | Bot | 1966 |
Friday afternoon (1) | Normal | BENIGN | 97718 | 225745
Friday afternoon (1) | DDoS | DDoS | 128027 |
Friday afternoon (2) | Normal | BENIGN | 127537 | 286467
Friday afternoon (2) | PortScan | PortScan | 158930 |

We adopt the iteration curve of the global optimal fitness value, the feature selection time, the test set detection time, the data dimension after feature selection, the classification accuracy, the detection rate (DR), and the false positive rate (FPR) as evaluation measures of feature selection for IDS. The accuracy represents the ratio of correctly classified samples to the total number of samples, as defined in equation (19). FPR, also known as the false alarm rate (FAR), represents the ratio of normal samples that are incorrectly detected as intrusions to all normal samples, as shown in equation (27). DR, also known as recall or sensitivity, represents the probability that an abnormal sample is correctly detected, as shown in equation (28):

$$\mathrm{FPR} = \frac{FP}{TN + FP}, \tag{27}$$

$$\mathrm{DR} = \frac{TP}{TP + FN}. \tag{28}$$

The crossover-mutation PSO (CMPSO) algorithm [47], ACO algorithm [48], KH algorithm [41], and IKH algorithm [9] are used for comparison. The experimental results on the Probe, DoS, R2L, and U2R datasets are reported below.
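The evaluation measures defined above translate directly into code. The sketch below assumes the usual confusion-matrix convention (attacks are the positive class), and the counts in the usage example are hypothetical values chosen only to exercise the formulas; they are not results from the paper.

```python
def accuracy(tp, tn, fp, fn):
    """Ratio of correctly classified samples to all samples."""
    return (tp + tn) / (tp + tn + fp + fn)


def false_positive_rate(fp, tn):
    """FPR = FP / (TN + FP): normal samples wrongly flagged as intrusions."""
    return fp / (tn + fp)


def detection_rate(tp, fn):
    """DR = TP / (TP + FN): attacks that are correctly detected (recall)."""
    return tp / (tp + fn)


# Hypothetical confusion counts: 100 attack samples and 100 normal samples.
tp, fn, fp, tn = 90, 10, 4, 96
print(accuracy(tp, tn, fp, fn))       # 0.93
print(false_positive_rate(fp, tn))    # 0.04
print(detection_rate(tp, fn))         # 0.9
```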

To show the performance of the LNNLS-KH algorithm intuitively, the convergence curves of the fitness function for the Probe, DoS, U2R, and R2L datasets are shown in Figure 9. The results show that the LNNLS-KH algorithm achieves a good fitness function value when the number of iterations reaches about 20, which demonstrates its strong exploitation ability and good convergence performance. As the number of iterations increases, the other algorithms show varying degrees of convergence stagnation, while the LNNLS-KH algorithm repeatedly jumps out of local optima and finds solutions with better fitness. The fitness function values after 100 iterations reach 0.0328, 0.0393, 0.0292, and 0.0036, respectively, for the four attack datasets, showing excellent exploration ability. Therefore, compared with the CMPSO, ACO, KH, and IKH algorithms, the LNNLS-KH algorithm exhibits faster convergence and stronger abilities of exploitation and exploration.

Figure 9: Convergence curves of the fitness function for the four attack datasets (panels: Probe, DoS, U2R, and R2L; x-axis: number of iterations, 0 to 100; y-axis: fitness function; curves: CMPSO, ACO, KH, IKH, and LNNLS-KH).

The results of the different feature selection algorithms are shown in Table 10. The number in front of the brackets indicates the number of features retained after feature selection, and the specific feature numbers are listed in the brackets. The comparison of feature selection dimensions is shown in Figure 10, where different colours distinguish the five algorithms. The proposed LNNLS-KH algorithm, marked in red, lies in the innermost circle of Figure 10 for the Probe, DoS, U2R, and R2L datasets. This indicates that, compared with the other four feature selection algorithms, the LNNLS-KH algorithm retains the fewest features while ensuring accuracy. According to Figure 10, the LNNLS-KH algorithm selects 7 main features of the NSL-KDD dataset on average, accounting for 17.07% of the total number of features. Compared with the CMPSO, ACO, KH, and IKH algorithms, the proposed LNNLS-KH algorithm reduces the number of features by 44%, 42.86%, 34.88%, and 24.32%, respectively, over the four attack datasets. Meanwhile, relative to the total number of features selected by the four comparison algorithms on the four attack datasets, the number of features is reduced by 37.43%.
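The percentages quoted above can be checked against the feature counts listed in Table 10. The following sketch is a verification aid rather than part of the method; the dictionary layout and variable names are illustrative, and the counts are copied from Table 10 in the order Probe, DoS, U2R, R2L.

```python
# Number of selected features per dataset (Probe, DoS, U2R, R2L), Table 10.
selected = {
    "CMPSO":    [14, 16, 9, 11],
    "ACO":      [15, 16, 8, 10],
    "KH":       [13, 12, 8, 10],
    "IKH":      [11, 12, 6, 8],
    "LNNLS-KH": [8, 10, 3, 7],
}

ours = selected["LNNLS-KH"]
mean_ours = sum(ours) / len(ours)            # average features retained
share = 100 * mean_ours / 41                 # share of the 41 NSL-KDD features

# Reduction relative to each comparison algorithm (totals over four datasets).
reduction = {
    name: 100 * (sum(counts) - sum(ours)) / sum(counts)
    for name, counts in selected.items()
    if name != "LNNLS-KH"
}

# Reduction relative to all four comparison algorithms pooled together.
others_total = sum(sum(c) for n, c in selected.items() if n != "LNNLS-KH")
overall = 100 * (others_total - 4 * sum(ours)) / others_total

print(mean_ours, round(share, 2))
print({name: round(r, 2) for name, r in reduction.items()})
print(round(overall, 2))
```

Run as written, this gives an average of 7 features (17.07% of 41), per-algorithm reductions of 44%, 42.86%, 34.88%, and 24.32%, and a pooled reduction of 37.43%, matching the figures in the text.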

To further evaluate the performance of the feature selection algorithms, we show the feature selection time and detection time of the five algorithms in Table 11. The feature selection time is the time spent filtering out redundant features. The detection time is the time from feeding the most representative feature subset into the KNN classifier to the end of detection. It can be seen from Table 11 that the feature selection time of the standard KH algorithm is shorter than that of the CMPSO and ACO algorithms, which indicates that the KH algorithm is faster. In addition, compared with the standard KH algorithm, the feature selection time of the LNNLS-KH algorithm is longer, which is mainly due to the nonlinear optimization of the physical diffusion motion and the linear nearest neighbor lasso step optimization performed after the krill herd position is updated. Although the feature selection time increases, the convergence speed and global search ability are greatly improved. At the same time, the LNNLS-KH algorithm removes redundant features, which considerably increases the detection speed. Compared with the CMPSO, ACO, KH, and IKH algorithms, the total detection time of the LNNLS-KH algorithm over the Probe, DoS, R2L, and U2R test dataset samples is reduced by 16.83%, 16.91%, 8.94%, and 6.96%, respectively.
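The detection-time reductions quoted above can be approximately reproduced from the detection times in Table 11. In the sketch below, the decimal placement of the detection times (in seconds) is an assumption: it was chosen so that the totals reproduce the quoted percentages, and the values are listed in the order Probe, DoS, U2R, R2L. The helper name `total_reduction` is hypothetical.

```python
# Detection time in seconds (Probe, DoS, U2R, R2L); decimal placement assumed.
detection_time = {
    "CMPSO":    [37.13, 118.69, 0.087, 9.55],
    "ACO":      [38.23, 118.15, 0.086, 9.13],
    "KH":       [35.30, 106.66, 0.086, 9.07],
    "IKH":      [34.05, 105.14, 0.086, 8.62],
    "LNNLS-KH": [31.06, 98.44, 0.078, 8.03],
}


def total_reduction(baseline, proposed):
    """Percentage reduction of the summed detection time."""
    return 100 * (sum(baseline) - sum(proposed)) / sum(baseline)


reductions = {
    name: total_reduction(times, detection_time["LNNLS-KH"])
    for name, times in detection_time.items()
    if name != "LNNLS-KH"
}
mean_reduction = sum(reductions.values()) / len(reductions)

print({name: round(r, 2) for name, r in reductions.items()})
print(round(mean_reduction, 2))
```

Under this reading, the four reductions agree with the quoted 16.83%, 16.91%, 8.94%, and 6.96% to within rounding, and their mean is about 12.41%, the average detection-time reduction reported for the NSL-KDD dataset.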

2e selection results of CMPSO ACO KH IKH andLNNLS-KH algorithms are used as feature subsets and thetest dataset is detected using KNN classifier 2e classifi-cation accuracy of different algorithms is shown in Table 12Comparing the accuracy of results it is found that LNNLS-KH feature selection algorithm achieves a classificationaccuracy of above 90 for Probe DoS U2R and R2L test

Table 9 2e number and name of the features in the CICIDS2017 dataset

Feature number Feature name Feature number Feature name Feature number Feature name1 Destination port 27 Bwd IAT mean 53 Average packet size2 Flow duration 28 Bwd IAT std 54 Avg fwd segment size3 Total fwd packets 29 Bwd IAT max 55 Avg bwd segment size4 Total backward packets 30 Bwd IAT min 56 Fwd header length5 Total length of fwd packets 31 Fwd PSH flags 57 Fwd avg bytesbulk6 Total length of bwd packets 32 Bwd PSH flags 58 Fwd avg packetsbulk7 Fwd packet length max 33 Fwd URG flags 59 Fwd avg bulk rate8 Fwd packet length min 34 Bwd URG flags 60 Bwd avg bytesbulk9 Fwd packet length mean 35 Fwd header length 61 Bwd avg packetsbulk10 Fwd packet length std 36 Bwd header length 62 Bwd avg bulk rate11 Bwd packet length max 37 Fwd Packetss 63 Subflow fwd packets12 Bwd packet length min 38 Bwd Packetss 64 Subflow fwd bytes13 Bwd packet length mean 39 Min packet length 65 Subflow bwd packets14 Bwd packet length std 40 Max packet length 66 Subflow bwd bytes15 Flow bytess 41 Packet length mean 67 Init_Win_bytes_forward16 Flow packetss 42 Packet length std 68 Init_Win_bytes_backward17 Flow IAT mean 43 Packet length variance 69 act_data_pkt_fwd18 Flow IAT std 44 FIN flag count 70 min_seg_size_forward19 Flow IAT max 45 SYN flag count 71 Active mean20 Flow IAT min 46 RST flag count 72 Active std21 Fwd IAT total 47 PSH flag count 73 Active max22 Fwd IAT mean 48 ACK flag count 74 Active min23 Fwd IAT std 49 URG flag count 75 Idle mean24 Fwd IAT max 50 CWE flag count 76 Idle std25 Fwd IAT min 51 ECE flag count 77 Idle max26 Bwd IAT total 52 Downup ratio 78 Idle min

Figure 10: Comparison of feature selection dimensions produced by different algorithms (CMPSO, ACO, KH, IKH, and LNNLS-KH) for the Probe, DoS, U2R, and R2L categories.


dataset samples. Furthermore, the LNNLS-KH algorithm improves the average classification accuracy on the Probe, DoS, U2R, and R2L test dataset samples by 9.95%, 12.04%, 9.47%, and 8.66%, respectively.

Table 13 shows the false positive rate and detection rate of the feature subsets produced by the different feature selection algorithms. To visualize the difference, we show the comparison in Figure 11. For the Probe, DoS, U2R, and R2L datasets, the average false positive rate of the LNNLS-KH feature selection algorithm is 4.00%, which is 20.70%, 15.30%, 8.88%, and 3.34% lower than that of the CMPSO, ACO, KH, and IKH algorithms, respectively. Similarly, for the detection rate, the proposed LNNLS-KH feature selection algorithm exhibits excellent performance. The average detection rate of

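Table 10: The feature selection results of different feature selection algorithms (NSL-KDD dataset).

Data category | Algorithm | Number of features | Selected features
Probe | CMPSO | 14 | 2, 3, 4, 7, 8, 10, 11, 17, 19, 20, 21, 27, 30, 33
Probe | ACO | 15 | 1, 3, 4, 6, 15, 16, 17, 19, 21, 23, 29, 35, 39, 40, 41
Probe | KH | 13 | 3, 4, 5, 7, 8, 13, 14, 18, 19, 21, 26, 28, 40
Probe | IKH | 11 | 2, 3, 5, 8, 10, 17, 18, 29, 34, 35, 41
Probe | LNNLS-KH | 8 | 3, 4, 8, 11, 15, 29, 34, 40
DoS | CMPSO | 16 | 3, 4, 5, 6, 8, 13, 14, 17, 18, 22, 23, 26, 30, 32, 35, 41
DoS | ACO | 16 | 3, 4, 7, 12, 14, 19, 20, 25, 27, 28, 30, 33, 34, 37, 40, 41
DoS | KH | 12 | 2, 3, 4, 5, 8, 9, 12, 15, 19, 24, 26, 30
DoS | IKH | 12 | 2, 3, 4, 6, 12, 18, 20, 22, 27, 28, 30, 31
DoS | LNNLS-KH | 10 | 3, 4, 6, 15, 17, 19, 20, 21, 30, 37
U2R | CMPSO | 9 | 3, 4, 5, 9, 12, 19, 32, 33, 41
U2R | ACO | 8 | 3, 4, 6, 8, 20, 24, 33, 36
U2R | KH | 8 | 3, 4, 10, 12, 19, 23, 31, 32
U2R | IKH | 6 | 3, 10, 11, 21, 36, 39
U2R | LNNLS-KH | 3 | 3, 33, 36
R2L | CMPSO | 11 | 2, 3, 4, 8, 21, 22, 25, 27, 37, 40, 41
R2L | ACO | 10 | 3, 4, 7, 12, 17, 21, 29, 37, 38, 40
R2L | KH | 10 | 2, 3, 4, 6, 13, 18, 19, 22, 32, 41
R2L | IKH | 8 | 3, 4, 5, 8, 11, 14, 21, 31
R2L | LNNLS-KH | 7 | 2, 3, 4, 10, 15, 21, 36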

Table 11 Feature selection time and detection time of different feature selection algorithms (NSL-KDD dataset)

Data categoriesTime of feature selection (second) Time of detection (second)

CMPSO ACO KH IKH LNNLS-KH CMPSO ACO KH IKH LNNLS-KHProbe 523178 499814 474533 534887 549048 3713 3823 3530 3405 3106DoS 789235 763086 716852 803816 829692 11869 11815 10666 10514 9844U2R 15487 14729 14418 15779 17224 0087 0086 0086 0086 0078R2L 255675 236908 224092 266951 272770 955 913 907 862 803

Table 12: The classification accuracy of different feature selection algorithms (NSL-KDD dataset).

Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%)
Probe | 80.46 | 86.56 | 92.42 | 93.74 | 98.24
DoS | 81.74 | 83.36 | 86.03 | 88.74 | 97.01
U2R | 82.74 | 84.57 | 85.59 | 91.89 | 95.67
R2L | 78.70 | 81.62 | 88.78 | 90.49 | 93.56
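The average accuracy improvements quoted in the text can be reproduced from Table 12 under one reading that the paper does not state explicitly: each figure is taken here to be the LNNLS-KH accuracy minus the mean accuracy of the four comparison algorithms. The Python sketch below only illustrates that arithmetic; the helper name `mean_gain` is hypothetical.

```python
# Accuracies (%) copied from Table 12; column order: CMPSO, ACO, KH, IKH, LNNLS-KH.
table12 = {
    "Probe": [80.46, 86.56, 92.42, 93.74, 98.24],
    "DoS":   [81.74, 83.36, 86.03, 88.74, 97.01],
    "U2R":   [82.74, 84.57, 85.59, 91.89, 95.67],
    "R2L":   [78.70, 81.62, 88.78, 90.49, 93.56],
}

def mean_gain(row):
    """LNNLS-KH accuracy minus the mean of the four baseline accuracies (assumed reading)."""
    *baselines, proposed = row
    return proposed - sum(baselines) / len(baselines)

gains = {name: mean_gain(row) for name, row in table12.items()}
for name, gain in gains.items():
    print(f"{name}: +{gain:.3f} percentage points")
```

Under this reading the gains agree with the reported values to within rounding.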

Figure 11: Comparison of classification FPR and DR of different feature selection algorithms. (a) FPR (%) of different feature selection algorithms. (b) DR (%) of different feature selection algorithms.


the LNNLS-KH algorithm is 96.48%, which is 13.47%, 9.32%, 7.02%, and 4.72% higher than that of the CMPSO, ACO, KH, and IKH feature selection algorithms, respectively.
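For reference, the two metrics compared here can be computed from a classifier's predictions as sketched below. The sketch assumes the standard definitions DR = TP / (TP + FN) and FPR = FP / (FP + TN), with label 1 for attack traffic and 0 for normal traffic; the function name and the label encoding are illustrative and not taken from the paper.

```python
def detection_metrics(y_true, y_pred):
    """Return (detection rate, false positive rate) for binary labels (1 = attack, 0 = normal)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # attacks missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal traffic flagged as attack
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal traffic correctly passed
    dr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return dr, fpr

# Toy labels, purely illustrative.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
dr, fpr = detection_metrics(y_true, y_pred)
print(f"DR = {dr:.2%}, FPR = {fpr:.2%}")
```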

In conclusion, the LNNLS-KH feature selection algorithm performs excellently in terms of the global optimal fitness iteration curve, test set detection time, number of dimensions of the feature subset, classification accuracy, false positive rate, and detection rate. Although the offline training time of the LNNLS-KH algorithm is longer than that of the CMPSO, ACO, KH, and IKH algorithms, its lower feature dimension reduces the detection time. Moreover, the algorithm has faster convergence speed, higher detection accuracy and detection rate, and a lower classification false positive rate.

4.3. Experimental Results and Discussion of CICIDS2017 Dataset. The experiment is conducted in MATLAB R2016a on a Windows 64-bit operating system with an Intel(R) Core(TM) i7-4790 processor. The MachineLearningCVE file in the CICIDS2017 dataset includes 8 CSV files of all traffic data, which contain 78 features plus an attack type tag after some duplicate features are removed. We annotate traffic records according to different attack periods and types and standardize and normalize the dataset. Because the analyzed CSV files contain an excessive amount of data, problems such as excessively long running time and slow convergence of the model occur when the host is used for model training. Therefore, we simplified and reintegrated these CSV data files while preserving the original attack timing features. We selected a total of 12090 records and 5 types of traffic, including 1 type of normal traffic and 4 types of attack traffic, namely, "DoS", "DDoS", "PortScan", and "WebAttack". The data are randomly divided into training sets and test sets in a 2:1 ratio, with independent and repeated experiments.
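The random 2:1 division with independent repetitions can be sketched as follows. This is a minimal Python illustration, not the authors' MATLAB procedure: the function name, the seeding scheme, and the number of repetitions are assumptions, and the integer list only stands in for the 12090 traffic records.

```python
import random

def split_two_to_one(records, seed):
    """Shuffle the records and split them into training and test sets in a 2:1 ratio."""
    rng = random.Random(seed)          # a separate generator per run keeps runs independent
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3     # two thirds for training
    return shuffled[:cut], shuffled[cut:]

records = list(range(12090))           # stand-in for the selected traffic records
for run in range(3):                   # hypothetical number of repetitions
    train, test = split_two_to_one(records, seed=run)
    print(f"run {run}: {len(train)} training records, {len(test)} test records")
```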

The CMPSO, ACO, KH, and IKH algorithms are used for comparison with the LNNLS-KH algorithm. The preprocessed Normal, DoS, DDoS, PortScan, and WebAttack subsets are input into the algorithm model successively, and the dimension and feature subsets of feature selection are obtained. We adopt the KNN classification model as the classifier and obtain the accuracy of intrusion detection on the test set data. The results of feature selection dimension for the CICIDS2017 dataset are shown in Table 14. According to different attack types, the LNNLS-KH algorithm selects different features. For example, the selected features of the DoS subset are "Total Length of Bwd Packets", "Fwd Packet Length Min", "Flow IAT Min", "FIN Flag Count", "RST Flag Count", "URG Flag Count", "Bwd Avg Packets/Bulk", "Idle Mean", and "Idle Std". For the WebAttack subset, "Total Fwd Packets", "Bwd IAT Max", "Bwd PSH Flags", "Fwd Packets/s", "Bwd Avg Packets/Bulk", "Subflow Fwd Bytes", "Active Max", and "Idle Max" are selected as attack features by the LNNLS-KH algorithm. It reduces the feature dimension of the IDS dataset while ensuring high accuracy. The average feature dimension selected by the LNNLS-KH algorithm is 10.2, accounting for 13.08% of the total number of features in the CICIDS2017 dataset. It decreases the

Table 13: The classification FPR and DR of different feature selection algorithms (NSL-KDD dataset).

Data categories | FPR (%) CMPSO | FPR (%) ACO | FPR (%) KH | FPR (%) IKH | FPR (%) LNNLS-KH | DR (%) CMPSO | DR (%) ACO | DR (%) KH | DR (%) IKH | DR (%) LNNLS-KH
Probe | 22.37 | 18.04 | 8.50 | 4.05 | 1.18 | 82.32 | 89.18 | 95.01 | 95.22 | 97.73
DoS | 21.27 | 14.08 | 11.45 | 7.88 | 2.85 | 79.12 | 82.08 | 83.77 | 85.23 | 96.80
U2R | 24.51 | 21.04 | 16.13 | 8.45 | 4.30 | 87.02 | 89.79 | 90.14 | 93.67 | 95.52
R2L | 30.66 | 24.05 | 15.42 | 8.99 | 7.67 | 83.56 | 87.56 | 88.91 | 92.89 | 95.85

Figure 12: Comparison of feature selection time and intrusion detection time for different feature selection algorithms on the Normal, DoS, DDoS, PortScan, and WebAttack subsets. (a) Feature selection time (seconds) for different feature selection algorithms. (b) Intrusion detection time (seconds) of different feature selection algorithms.


number of features by 57.85%, 52.34%, 27.14%, and 25%, respectively, compared with the CMPSO, ACO, KH, and IKH algorithms.
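These percentages follow directly from the per-subset feature counts in Table 14. The short Python sketch below reproduces the arithmetic; the variable names are illustrative, and the reduction is computed here as one minus the ratio of average feature counts.

```python
# Number of selected features per subset (Normal, DoS, DDoS, PortScan, WebAttack), from Table 14.
counts = {
    "CMPSO":    [28, 24, 29, 24, 16],
    "ACO":      [25, 19, 27, 21, 15],
    "KH":       [14, 13, 21, 14, 8],
    "IKH":      [14, 14, 18, 15, 7],
    "LNNLS-KH": [8, 9, 14, 12, 8],
}
TOTAL_FEATURES = 78  # features in the CICIDS2017 MachineLearningCVE files

means = {name: sum(values) / len(values) for name, values in counts.items()}
proposed = means["LNNLS-KH"]

# Relative reduction of the average feature count with respect to each baseline.
reductions = {
    name: 100 * (1 - proposed / mean)
    for name, mean in means.items()
    if name != "LNNLS-KH"
}
share = 100 * proposed / TOTAL_FEATURES

print(f"average selected by LNNLS-KH: {proposed:.1f} ({share:.2f}% of all features)")
for name, value in reductions.items():
    print(f"reduction vs {name}: {value:.2f}%")
```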

Figure 12 shows the feature selection time and intrusion detection time of the 5 feature selection algorithms to further evaluate their performance. It can be seen from Figure 12(a) that, in the feature selection stage, the LNNLS-KH algorithm takes a long time to find the optimal feature subset because of the linear nearest neighbor lasso step optimization performed after the position update of the krill herd. Compared with the KH and IKH algorithms, it increases the time by an average of 14.38% and 9.32%, respectively. Although the LNNLS-KH algorithm requires more calculation time, its convergence speed and global search ability are improved. Figure 12(b) shows the intrusion detection time of the 5 feature selection algorithms. It is the time the KNN classifier takes to detect the sample dataset after the feature subset has been found, excluding the time spent searching for the optimal feature subset. The feature dimension of the LNNLS-KH algorithm is low, and the amount of data processed in the classification of the detection sample dataset is small, which results in a reduction of the classification detection time. Compared with the CMPSO, ACO, KH, and IKH algorithms, the intrusion detection time of the LNNLS-KH algorithm is reduced by 6.52%, 5.17%, 2.14%, and 2.28% on average, respectively.

The feature subsets selected by the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used, and the KNN classifier is applied to detect the test dataset. The classification accuracy of the different algorithms is shown in Table 15. For the five types of subsets, the average classification accuracy of the proposed LNNLS-KH algorithm is 95.86%. In particular, the classification accuracy reaches 97.55% for the PortScan subset. Compared with the other four feature selection methods, the LNNLS-KH algorithm has an average increase of 3.11%, 8.52%, 8.58%, 2.45%, and 4.29% on the

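Table 14: The number of features selected by different algorithms (CICIDS2017 dataset).

Data category | Algorithm | Number of features | Selected features
Normal | CMPSO | 28 | 3, 7, 13, 15, 16, 17, 20, 22, 24, 26, 30, 35, 37, 38, 42, 43, 44, 45, 46, 49, 50, 56, 59, 62, 63, 64, 65, 76
Normal | ACO | 25 | 1, 3, 4, 7, 10, 11, 12, 13, 15, 19, 29, 32, 34, 35, 37, 43, 46, 47, 51, 55, 56, 58, 73, 76, 78
Normal | KH | 14 | 11, 19, 33, 39, 43, 49, 55, 56, 58, 65, 66, 68, 71, 73
Normal | IKH | 14 | 5, 10, 19, 20, 21, 23, 27, 33, 43, 56, 69, 70, 73, 78
Normal | LNNLS-KH | 8 | 6, 12, 16, 32, 38, 50, 54, 73
DoS | CMPSO | 24 | 1, 3, 4, 13, 16, 17, 24, 26, 30, 33, 35, 39, 40, 44, 48, 51, 53, 57, 58, 59, 60, 62, 67, 70
DoS | ACO | 19 | 3, 6, 12, 13, 15, 26, 35, 39, 51, 55, 60, 61, 66, 69, 71, 73, 75, 77, 78
DoS | KH | 13 | 8, 16, 21, 30, 45, 50, 52, 57, 59, 63, 66, 67
DoS | IKH | 14 | 2, 12, 15, 16, 19, 21, 32, 34, 44, 46, 65, 68, 76, 77
DoS | LNNLS-KH | 9 | 6, 8, 20, 44, 46, 49, 61, 75, 76
DDoS | CMPSO | 29 | 15, 18, 19, 20, 23, 25, 26, 33, 34, 35, 38, 39, 42, 43, 46, 47, 49, 51, 55, 56, 57, 59, 60, 61, 62, 63, 71, 72, 78
DDoS | ACO | 27 | 6, 9, 10, 13, 16, 19, 24, 28, 31, 41, 42, 45, 47, 48, 50, 51, 52, 53, 54, 56, 59, 60, 61, 62, 65, 68, 72
DDoS | KH | 21 | 10, 12, 13, 15, 18, 23, 27, 30, 34, 35, 41, 42, 45, 55, 61, 63, 65, 66, 68, 70, 76
DDoS | IKH | 18 | 1, 11, 13, 14, 19, 24, 32, 35, 36, 40, 42, 47, 51, 57, 60, 69, 70, 75
DDoS | LNNLS-KH | 14 | 2, 5, 8, 9, 11, 22, 26, 33, 41, 43, 47, 51, 74, 77
PortScan | CMPSO | 24 | 1, 3, 6, 15, 16, 28, 30, 33, 35, 37, 44, 45, 52, 56, 59, 60, 61, 63, 65, 68, 70, 75, 77, 78
PortScan | ACO | 21 | 1, 2, 6, 10, 15, 17, 26, 27, 29, 39, 42, 43, 46, 49, 58, 61, 66, 69, 70, 71, 76
PortScan | KH | 14 | 15, 20, 22, 27, 37, 44, 49, 50, 53, 59, 62, 65, 67, 78
PortScan | IKH | 15 | 1, 24, 30, 32, 33, 43, 49, 53, 54, 58, 60, 61, 63, 64, 69
PortScan | LNNLS-KH | 12 | 2, 6, 15, 24, 25, 28, 32, 57, 59, 63, 66, 76
WebAttack | CMPSO | 16 | 2, 7, 26, 29, 45, 47, 50, 52, 53, 54, 63, 66, 68, 69, 72, 78
WebAttack | ACO | 15 | 3, 9, 10, 12, 19, 26, 40, 46, 50, 54, 64, 65, 68, 69, 73
WebAttack | KH | 8 | 1, 17, 19, 36, 48, 49, 53, 60
WebAttack | IKH | 7 | 14, 17, 35, 39, 44, 48, 54
WebAttack | LNNLS-KH | 8 | 3, 29, 32, 37, 61, 64, 73, 77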

Table 15: The classification accuracy of different feature selection algorithms (CICIDS2017 dataset).

Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%)
Normal | 89.78 | 89.06 | 92.70 | 94.58 | 94.64
DoS | 77.03 | 82.69 | 90.90 | 93.34 | 94.51
DDoS | 81.73 | 86.94 | 91.85 | 88.19 | 95.76
PortScan | 92.38 | 95.64 | 95.05 | 97.35 | 97.55
WebAttack | 89.12 | 93.08 | 93.77 | 94.26 | 96.85

Table 16: The classification FPR and DR of different feature selection algorithms (CICIDS2017 dataset).

Data categories | FPR (%) CMPSO | FPR (%) ACO | FPR (%) KH | FPR (%) IKH | FPR (%) LNNLS-KH | DR (%) CMPSO | DR (%) ACO | DR (%) KH | DR (%) IKH | DR (%) LNNLS-KH
Normal | 9.25 | 8.72 | 6.41 | 4.93 | 3.67 | 88.05 | 88.51 | 89.25 | 92.46 | 93.89
DoS | 5.41 | 4.48 | 4.06 | 2.83 | 1.94 | 72.57 | 82.89 | 87.86 | 92.56 | 92.64
DDoS | 6.85 | 4.92 | 4.54 | 6.33 | 3.18 | 79.03 | 83.47 | 90.22 | 87.52 | 92.98
PortScan | 4.65 | 3.02 | 2.84 | 1.86 | 1.16 | 88.25 | 93.80 | 94.33 | 95.14 | 95.42
WebAttack | 5.33 | 3.16 | 2.52 | 2.11 | 1.60 | 87.40 | 91.35 | 92.19 | 92.94 | 94.77


Normal, DoS, DDoS, PortScan, and WebAttack subsets, respectively. Table 16 shows the classification FPR and DR of the different feature selection algorithms on the test sets. Across the five test sets, the LNNLS-KH algorithm has a lower FPR and a higher DR than the other four algorithms.
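The evaluation step used throughout this section, in which a KNN classifier sees only the selected feature subset, can be sketched as below. This is an illustrative pure-Python version, not the authors' MATLAB implementation: the value of k, the Euclidean distance, the helper names, and the toy data are all assumptions.

```python
import math
from collections import Counter

def restrict(sample, selected):
    """Keep only the features whose (0-based) indices were selected."""
    return [sample[i] for i in selected]

def knn_predict(train_X, train_y, query, k=3):
    """Classify one sample by majority vote among its k nearest training samples."""
    neighbours = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]

# Toy data with three features; only features 0 and 2 are assumed to be selected.
train_X = [[0.1, 0.9, 0.2], [0.2, 0.8, 0.1], [0.9, 0.1, 0.8], [0.8, 0.2, 0.9]]
train_y = ["normal", "normal", "attack", "attack"]
selected = [0, 2]

reduced_train = [restrict(x, selected) for x in train_X]
prediction = knn_predict(reduced_train, train_y, restrict([0.85, 0.5, 0.85], selected))
print(prediction)
```

Because the distance is computed only over the selected features, a smaller subset directly reduces the per-query work, which is consistent with the shorter detection times reported for LNNLS-KH.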

We propose the LNNLS-KH algorithm, a novel feature selection algorithm for intrusion detection. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the algorithm has good feature selection performance and improves the efficiency of intrusion detection.

5. Conclusions

With the rapid development of network technology, intrusion detection plays an increasingly important role in network security. However, the "dimensional disaster" caused by massive data results in problems such as slow response and poor accuracy of the intrusion detection system. The KH algorithm is a new population-based swarm intelligence optimization method that shows good performance in high-dimensional data processing, providing a new approach for reducing the dimension of intrusion detection data and selecting useful features. In this paper, an improved KH algorithm named LNNLS-KH is proposed for feature selection of IDS datasets by linear nearest neighbor lasso optimization. The LNNLS-KH algorithm introduces a new fitness function, which is composed of the number of selected feature dimensions and the classification accuracy. Nonlinear optimization is introduced into the physical diffusion motion of krill individuals to accelerate the convergence of the algorithm. Moreover, the linear nearest neighbor lasso step optimization is proposed to balance the exploration and exploitation abilities and obtain the global optimal solution of the feature subset effectively. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the LNNLS-KH algorithm retains 7 and 10.2 features on average, which greatly reduces the dimension of the features. In the NSL-KDD dataset, features are reduced by 44%, 42.86%, 34.88%, and 24.32% compared with the CMPSO, ACO, KH, and IKH algorithms, and in the CICIDS2017 dataset, they are reduced by 57.85%, 52.34%, 27.14%, and 25%, respectively. In addition, the classification accuracy of the LNNLS-KH feature selection algorithm is increased by 10.03% and 5.39%, and the time of intrusion detection is reduced by 12.41% and 4.03% on the two datasets. Furthermore, the LNNLS-KH algorithm enhances the ability to jump out of local optimal solutions and shows good performance in the optimal fitness iteration curve, false positive rate of detection, and convergence speed, which demonstrates that the proposed LNNLS-KH algorithm is an efficient feature selection method for network intrusion detection.

In this research, we found that the initialization of the LNNLS-KH algorithm has a certain degree of randomness. Therefore, we conducted independent and repeated experiments to mitigate this effect, and the results were reasonable and convincing. Although the proposed algorithm shows encouraging performance, it could be further improved.

In future work, we will consider using data balancing techniques to preprocess the experimental dataset in order to obtain more accurate feature selection results and stronger algorithm stability. Meanwhile, we will combine the LNNLS-KH algorithm with other algorithms to improve the exploration and exploitation abilities, thereby further shortening the time of feature subset training and classification detection. In addition, as the LNNLS-KH algorithm is universally applicable, it can be applied to more feature selection systems and used to solve optimization problems in other fields.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was sponsored by the National Key Research and Development Program of China (Grants 2018YFB0804002 and 2017YFB0803204), National Natural Science Foundation of P.R. China (Grant 72001191), Henan Natural Science Foundation (Grant 202300410442), and Henan Philosophy and Social Science Program (Grant 2020CZH009).

References

[1] W. Wei and C. Guo, "A text semantic topic discovery method based on the conditional co-occurrence degree," Neurocomputing, vol. 368, pp. 11–24, 2019.

[2] C.-R. Wang, R.-F. Xu, S.-J. Lee, and C.-H. Lee, "Network intrusion detection using equality constrained-optimization-based extreme learning machines," Knowledge-Based Systems, vol. 147, pp. 68–80, 2018.

[3] G.-G. Wang, A. H. Gandomi, A. H. Alavi, and D. Gong, "A comprehensive review of krill herd algorithm: variants, hybrids and applications," Artificial Intelligence Review, vol. 51, no. 1, pp. 119–148, 2019.

[4] J. Amudhavel, D. Sathian, R. S. Raghav et al., "A fault tolerant distributed self-organization in peer to peer (p2p) using krill herd optimization," in Proceedings of the 2015 International Conference on Advanced Research in Computer Science Engineering & Technology (ICARCSET 2015), pp. 1–5, Unnao, India, 2015.

[5] L. M. Abualigah, A. T. Khader, and E. S. Hanandeh, "Hybrid clustering analysis using improved krill herd algorithm," Applied Intelligence, vol. 48, no. 11, pp. 4047–4071, 2018.

[6] P. A. Kowalski and S. Łukasik, "Training neural networks with krill herd algorithm," Neural Processing Letters, vol. 44, no. 1, pp. 5–17, 2016.

[7] C. Stasinakis, G. Sermpinis, I. Psaradellis, and T. Verousis, "Krill-Herd Support Vector Regression and heterogeneous autoregressive leverage: evidence from forecasting and trading commodities," Quantitative Finance, vol. 16, no. 12, pp. 1901–1915, 2016.


[8] L. Wang, P. Jia, T. Huang, S. Duan, J. Yan, and L. Wang, "A novel optimization technique to improve gas recognition by electronic noses based on the enhanced krill herd algorithm," Sensors, vol. 16, no. 8, p. 1275, 2016.

[9] R. Jensi and G. W. Jiji, "An improved krill herd algorithm with global exploration capability for solving numerical function optimization problems and its application to data clustering," Applied Soft Computing, vol. 46, pp. 230–245, 2016.

[10] H. Pulluri, R. Naresh, and V. Sharma, "Application of stud krill herd algorithm for solution of optimal power flow problems," International Transactions on Electrical Energy Systems, vol. 27, no. 6, Article ID e2316, 2017.

[11] D. Rodrigues, L. A. M. Pereira, J. P. Papa et al., "A binary krill herd approach for feature selection," in Proceedings of the 2014 22nd International Conference on Pattern Recognition, pp. 1407–1412, IEEE, Stockholm, Sweden, August 2014.

[12] A. Mukherjee and V. Mukherjee, "Chaotic krill herd algorithm for optimal reactive power dispatch considering FACTS devices," Applied Soft Computing, vol. 44, pp. 163–190, 2016.

[13] S. Sun, H. Qi, F. Zhao, L. Ruan, and B. Li, "Inverse geometry design of two-dimensional complex radiative enclosures using krill herd optimization algorithm," Applied Thermal Engineering, vol. 98, pp. 1104–1115, 2016.

[14] S. Sultana and P. K. Roy, "Oppositional krill herd algorithm for optimal location of capacitor with reconfiguration in radial distribution system," International Journal of Electrical Power & Energy Systems, vol. 74, pp. 78–90, 2016.

[15] L. Brezocnik, I. Fister, and V. Podgorelec, "Swarm intelligence algorithms for feature selection: a review," Applied Sciences, vol. 8, no. 9, 2018.

[16] D. Smith, Q. Guan, and S. Fu, "An anomaly detection framework for autonomic management of compute cloud systems," in Proceedings of the 2010 IEEE 34th Annual Computer Software and Applications Conference Workshops, pp. 376–381, IEEE, Seoul, South Korea, July 2010.

[17] Y. Zhao, Y. Zhang, W. Tong et al., "An improved feature selection algorithm based on MAHALANOBIS distance for network intrusion detection," in Proceedings of 2013 International Conference on Sensor Network Security Technology and Privacy Communication System, pp. 69–73, IEEE, Nangang, China, May 2013.

[18] P. Singh and A. Tiwari, "An efficient approach for intrusion detection in reduced features of KDD99 using ID3 and classification with KNNGA," in Proceedings of the 2015 Second International Conference on Advances in Computing and Communication Engineering, pp. 445–452, IEEE, Dehradun, India, May 2015.

[19] M. A. Ambusaidi, X. He, P. Nanda, and Z. Tan, "Building an intrusion detection system using a filter-based feature selection algorithm," IEEE Transactions on Computers, vol. 65, no. 10, pp. 2986–2998, 2016.

[20] N. Shone, T. N. Ngoc, V. D. Phai, and Q. Shi, "A deep learning approach to network intrusion detection," IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 2, no. 1, pp. 41–50, 2018.

[21] Y. Xue, W. Jia, X. Zhao et al., "An evolutionary computation based feature selection method for intrusion detection," Security and Communication Networks, vol. 2018, Article ID 2492956, 10 pages, 2018.

[22] Z. Shen, Y. Zhang, and W. Chen, "A bayesian classification intrusion detection method based on the fusion of PCA and LDA," Security and Communication Networks, vol. 2019, Article ID 6346708, 11 pages, 2019.

[23] P. Sun, P. Liu, Q. Li et al., "DL-IDS: Extracting features using CNN-LSTM hybrid network for intrusion detection system," Security and Communication Networks, vol. 2020, Article ID 8890306, 11 pages, 2020.

[24] G. Farahani, "Feature selection based on cross-correlation for the intrusion detection system," Security & Communication Networks, vol. 2020, Article ID 8875404, 17 pages, 2020.

[25] F. G. Mohammadi, M. H. Amini, and H. R. Arabnia, "Applications of nature-inspired algorithms for dimension Reduction: enabling efficient data analytics," in Advances in Intelligent Systems and Computing, Optimization, Learning, and Control for Interdependent Complex Networks, pp. 67–84, Springer, Cham, Switzerland, 2020.

[26] J. Kennedy and R. Eberhart, "Particle swarm optimization," in Proceedings of the ICNN'95-International Conference on Neural Networks, no. 4, pp. 1942–1948, IEEE, Perth, WA, Australia, December 1995.

[27] M. Dorigo, M. Birattari, and T. Stutzle, "Ant colony optimization," IEEE Computational Intelligence Magazine, vol. 1, no. 4, pp. 28–39, 2006.

[28] R. Rajabioun, "Cuckoo optimization algorithm," Applied Soft Computing, vol. 11, no. 8, pp. 5508–5518, 2011.

[29] M. Neshat, G. Sepidnam, M. Sargolzaei, and A. N. Toosi, "Artificial fish swarm algorithm: a survey of the state-of-the-art, hybridization, combinatorial and indicative applications," Artificial Intelligence Review, vol. 42, no. 4, pp. 965–997, 2014.

[30] D. Karaboga, "An idea based on honey bee swarm for numerical optimization," Technical Report-tr06, Erciyes university, Engineering Faculty, Computer Engineering Department, Kayseri, Turkey, 2005.

[31] W.-T. Pan, "A new Fruit Fly Optimization Algorithm: taking the financial distress model as an example," Knowledge-Based Systems, vol. 26, pp. 69–74, 2012.

[32] R. Zhao and W. Tang, "Monkey algorithm for global numerical optimization," Journal of Uncertain Systems, vol. 2, no. 3, pp. 165–176, 2008.

[33] X. S. Yang and X. He, "Bat algorithm: literature review and applications," International Journal of Bio-Inspired Computation, vol. 5, no. 3, pp. 141–149, 2013.

[34] S. Mirjalili, A. H. Gandomi, S. Z. Mirjalili, S. Saremi, H. Faris, and S. M. Mirjalili, "Salp Swarm Algorithm: a bio-inspired optimizer for engineering design problems," Advances in Engineering Software, vol. 114, pp. 163–191, 2017.

[35] K. Ahmed, A. E. Hassanien, and S. Bhattacharyya, "A novel chaotic chicken swarm optimization algorithm for feature selection," in Proceedings of the 2017 Third International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), pp. 259–264, IEEE, Kolkata, India, November 2017.

[36] S. Tabakhi, P. Moradi, F. Akhlaghian et al., "An unsupervised feature selection algorithm based on ant colony optimization," Engineering Applications of Artificial Intelligence, vol. 32, pp. 112–123, 2014.

[37] S. Arora and P. Anand, "Binary butterfly optimization approaches for feature selection," Expert Systems with Applications, vol. 116, pp. 147–160, 2019.

[38] C. Yan, J. Ma, H. Luo, and A. Patel, "Hybrid binary coral reefs optimization algorithm with simulated annealing for feature selection in high-dimensional biomedical datasets," Chemometrics and Intelligent Laboratory Systems, vol. 184, pp. 102–111, 2019.

[39] G. I. Sayed, A. Tharwat, and A. E. Hassanien, "Chaotic dragonfly algorithm: an improved metaheuristic algorithm for feature selection," Applied Intelligence, vol. 49, no. 1, pp. 188–205, 2019.

[40] Z. Zhang, P. Wei, Y. Li et al., "Feature selection algorithm based on improved particle swarm joint taboo search," Journal of Communication, vol. 39, no. 12, pp. 60–68, 2018.

[41] A. H. Gandomi and A. H. Alavi, "Krill herd: a new bio-inspired optimization algorithm," Communications in Nonlinear Science and Numerical Simulation, vol. 17, no. 12, pp. 4831–4845, 2012.

[42] Q. Tan and Z. Huang, "Krill herd with nearest neighbor lasso operator," Computer Engineering and Applications, vol. 55, no. 9, pp. 124–129, 2019.

[43] Q. Wang, C. Ding, and X. Wang, "A hybrid data clustering algorithm based on improved krill herd algorithm and KHM clustering," Control and Decision, vol. 35, no. 10, pp. 2449–2458, 2018.

[44] Q. Li and B. Liu, "Clustering using an improved krill herd algorithm," Algorithms, vol. 10, no. 2, p. 56, 2017.

[45] G.-G. Wang, A. H. Gandomi, and A. H. Alavi, "Stud krill herd algorithm," Neurocomputing, vol. 128, pp. 363–370, 2014.

[46] J. Li, Y. Tang, C. Hua, and X. Guan, "An improved krill herd algorithm: krill herd with linear decreasing step," Applied Mathematics and Computation, vol. 234, pp. 356–367, 2014.

[47] H. B. Nguyen, B. Xue, P. Andreae et al., "Particle swarm optimisation with genetic operators for feature selection," in Proceedings of the 17 IEEE Congress on Evolutionary Computation (CEC), pp. 286–293, IEEE, San Sebastian, Spain, June 2017.

[48] M. H. Aghdam and P. Kabiri, "Feature selection for intrusion detection system using ant colony optimization," International Journal of Network Security, vol. 18, no. 3, pp. 420–432, 2016.


The pseudocode of the LNNLS-KH algorithm is shown in Algorithm 1.

3.3. Analysis of Time Complexity. In the KH algorithm, each krill individual updates its position after three motions, namely, movement induced by other krill individuals, foraging activity, and physical diffusion motion, with a time complexity of O(N). After Imax iterations, the time complexity of the algorithm is O(Imax · N). In the LNNLS-KH algorithm, the modified fitness function and the nonlinear optimization of the physical diffusion motion add hardly any calculations, so the time complexity is not changed. In addition, the linear nearest neighbor lasso step optimization adds the calculations of equations (24) and (25) after each krill individual completes its position update during an iteration, and its time complexity is O(Imax · N). Therefore, the total time complexity of the LNNLS-KH algorithm is O(2Imax · N), which is of the same order as that of the KH algorithm.

3.4. Description of the LNNLS-KH Algorithm for IDS Feature Selection. An IDS is a system that recognizes and processes malicious usage of computers and network resources. The intrusion detection dataset records normal and abnormal traffic, including network traffic data and types of network attacks, and provides data support for the research and development of intrusion detection technology. An IDS is generally composed of data acquisition, data preprocessing, detection units, and response actions, as shown in Figure 7.

The LNNLS-KH algorithm is used to select high-quality feature subsets for the IDS. The features of the intrusion detection dataset are randomly initialized to different real numbers in the range [0, 1], which constitute the position vectors of the krill herd. By calculating the fitness function and carrying out the LNNLS-KH algorithm, the position vectors of the krill herd are constantly updated. The fitness function is determined by the number of selected features and the accuracy of classification, so the position vectors of the krill herd move toward the optimal fitness value. According to [47], it is appropriate to set the feature selection threshold to 0.7. When the maximum number of iterations is reached, the features whose components in the krill position vector are larger than the threshold are selected. The selected features constitute the feature subset of the intrusion detection data. The selected feature subset is then sent to the detection units. Because the K-Nearest Neighbor (KNN) algorithm is relatively mature in theory, the detection units adopt it to construct the intrusion detection classifier. Finally, the intrusion detection results are evaluated on the test dataset. The process of the LNNLS-KH algorithm for IDS feature selection is shown in Figure 8.
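The thresholding step described above can be sketched as follows. The 0.7 threshold is the value given in the text; everything else is an assumption made for illustration. In particular, the weighted-sum form of `fitness` and its weight `alpha` are hypothetical stand-ins, since the exact fitness function of LNNLS-KH is defined elsewhere in the paper.

```python
THRESHOLD = 0.7  # feature selection threshold reported in the text

def select_features(position, threshold=THRESHOLD):
    """Return the 1-based indices of features whose position component exceeds the threshold."""
    return [i + 1 for i, value in enumerate(position) if value > threshold]

def fitness(error_rate, n_selected, n_total, alpha=0.9):
    """Hypothetical fitness to minimise: weighted sum of the classification error
    rate and the fraction of features kept (not the paper's exact formula)."""
    return alpha * error_rate + (1 - alpha) * n_selected / n_total

# A toy krill position vector with five components in [0, 1].
position = [0.12, 0.85, 0.70, 0.91, 0.33]
selected = select_features(position)
score = fitness(error_rate=0.05, n_selected=len(selected), n_total=len(position))
print(selected, round(score, 3))
```

A component exactly equal to the threshold is not selected here, following the "larger than the threshold" wording.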

4. Results and Discussion

To verify the performance of the LNNLS-KH algorithm in IDS feature selection, we adopt the NSL-KDD network intrusion detection dataset and the CICIDS2017 dataset for experiments.

4.1. Datasets Analysis. The NSL-KDD dataset is a classic dataset that has been widely used in the field of anomaly detection. As an improved version of the KDD CUP 99 dataset, it is currently one of the most reliable and influential intrusion detection datasets. Compared with the KDD CUP 99 dataset, the NSL-KDD dataset eliminates duplicate data, so the dataset hardly contains redundant records. Meanwhile, the proportion of each type of record in the NSL-KDD dataset has been adjusted to make it more reasonable. Each record in the NSL-KDD dataset includes 41-dimensional features and a classification label. KDDTrain+ and KDDTest+ in the NSL-KDD dataset are selected as the training subset and the test subset. The attacks are divided into four types: denial of service (DoS), scan and probe (Probe), remote to local (R2L), and user to root (U2R). The detailed attack names and the distribution of sample categories are shown in Tables 5 and 6. The features of the NSL-KDD dataset are shown in Table 7.

The NSL-KDD dataset includes four types of features: the basic features of TCP connections (9 in total), the content features of TCP connections (13 in total), the time-based network traffic statistics (9 in total), and the host-based network traffic statistics (10 in total). Among all the features, "Protocol_type", "service", and "flag" are character-type features, which need to be preprocessed and mapped to numeric values. Because mixed numeric and character data types are difficult to deal with, one-hot encoding is used to map different characters to different values. For example, the "Protocol_type" feature includes three types of protocol, denoted by icmp = [1, 0, 0], tcp = [0, 1, 0], and udp = [0, 0, 1]. Similarly, the 70 attributes in "service" and the 11 attributes in "flag" are numeralized in the same way. The 41-dimensional feature is thus expanded to 122 dimensions after one-hot encoding. At the same time, the dataset is normalized to eliminate the influence of features of different orders of magnitude on the calculation results, thus reducing the experimental error. The data preprocessing helps to improve the accuracy of classification and ensure the reliability of the results. The values corresponding to each feature are normalized to the interval [0, 1], and the normalization expression is as follows:

$$X^{*} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}, \tag{26}$$

where X* is the normalized feature value, X is the original feature value, and Xmax and Xmin represent the maximum and minimum values of the same feature dimension.
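The two preprocessing steps, one-hot encoding and the min-max normalization of equation (26), can be illustrated with the short Python sketch below. The function names are hypothetical, and mapping a constant column to zeros is an assumption added here, because equation (26) is undefined when the maximum equals the minimum.

```python
def one_hot(value, categories):
    """Encode a categorical value as a 0/1 vector over the given category order."""
    return [1 if value == category else 0 for category in categories]

def min_max_normalize(column):
    """Scale a numeric column to [0, 1] as in equation (26)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant column carries no information and is mapped to zeros.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

PROTOCOLS = ["icmp", "tcp", "udp"]  # category order used in the text
encoded = one_hot("tcp", PROTOCOLS)
scaled = min_max_normalize([2, 4, 10])
print(encoded, scaled)

# Dimension after encoding: the 3 character features are replaced by 3 + 70 + 11 columns.
expanded_dim = 41 - 3 + (3 + 70 + 11)
print(expanded_dim)
```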

Although NSL-KDD is a benchmark dataset in the field of network intrusion detection, some of its attack types are outdated due to the rapid development of network technology. Therefore, it hardly reflects the current real network environment. CICIDS2017 is a novel network intrusion detection dataset released by the Canadian Institute for

Figure 7: The framework of IDS (data acquisition, data preprocessing, detection units, and response actions).


Cybersecurity (CIC) in 2017. The dataset collected traffic data for five days, with only normal traffic on Monday and attacks occurring in the morning and afternoon from Tuesday to Friday. It includes "FTP patator", "SSH patator", "DoS GoldenEye", "DoS Slowhttptest", "DoS Slowloris", "Heartbleed", "Web Attack Brute Force", "Web Attack Sql Injection", "Web Attack XSS", "Infiltration Attack", "Bot", "DDoS", and "PortScan", which are common types of attacks

Figure 8: The process of LNNLS-KH algorithm for IDS feature selection. Flowchart steps: input the IDS dataset; initialize the parameters (N, NV, Imax, UB, LB) and the krill herd position; calculate the fitness of individuals; calculate the three actions (movement induced by other krill individuals, foraging activity, and nonlinear physical diffusion motion); apply the genetic operator; update the position and fitness values of individuals; find the nearest krill and calculate the linear lasso step; compare the fitness of the updated position Yk with that of Xi or Xj and keep the better one; update Xgb and Kgb of the global optimal individual; repeat until the iteration count exceeds Imax; output the optimal solution and the number of selected features; run the KNN algorithm for intrusion detection; evaluate the intrusion detection results.


in modern networks. The distribution of attack time and types of the CICIDS2017 dataset is shown in Table 8. We use the MachineLearningCVE file in the CICIDS2017 dataset as the dataset, which contains 78 features and an attack type label. The number and name of each feature are shown in Table 9. Compared with the NSL-KDD dataset, the attack types in the CICIDS2017 dataset are more in line with the situation of modern networks.

4.2. Experimental Results and Discussion of NSL-KDD Dataset. The experiment is conducted in MATLAB R2016a on a Windows 64-bit operating system with an Intel(R) Core(TM) i7-4790 processor. Since the training of the algorithm requires normal and abnormal samples, we mix normal samples and different types of attack samples to construct training sets and test sets for four different attack types. In order to reduce the time of searching the optimal feature

Input: Training set
Output: Global best solution, the number of selected features, and feature selection time

(1) Begin
(2)   Initialize algorithm parameters Nmax, Vf, Dmax, NV, Imax, UB, LB
(3)   Initialize the krill herd position
(4)   Evaluate the fitness of krill individuals and find the individuals with the best and worst fitness values
(5)   for I = 1 to Imax do
(6)     for each krill individual i (i = 1, 2, ..., m) do
(7)       Calculate the three components of motion:
(8)         (1) The motion induced by other krill individuals
(9)         (2) The foraging activity
(10)        (3) The nonlinear optimized physical diffusion
(11)      Implement crossover operator
(12)      Update krill herd position and fitness values
(13)      Calculate the linear nearest neighbor lasso step and new position using equations (24) and (25) and update new fitness values
(14)      if Kyk > Ki or (Kj)
(15)        Leave Ki or (Kj) and delete Kyk
(16)      else
(17)        Leave Kyk and delete Ki or (Kj)
(18)      end if
(19)    end for
(20)    Update Xgb and Kgb of the globally optimal individuals
(21)  end for
(22)  Output the global best solution, the number of selected features, and feature selection time
(23) End

ALGORITHM 1: The LNNLS-KH algorithm.
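The comparison step of Algorithm 1, which keeps the original krill position or the lasso-step position depending on which has the smaller fitness value, might be sketched as follows. The function names and the toy fitness are hypothetical; the lasso step itself (equations (24) and (25)) is not reproduced here, so the candidate position is simply given.

```python
def keep_better(current, candidate):
    """Each argument is a (position, fitness) pair; lower fitness is better."""
    _, current_fit = current
    _, candidate_fit = candidate
    # Mirrors the if/else of Algorithm 1: a worse candidate is discarded.
    return current if candidate_fit > current_fit else candidate


def toy_fitness(position):
    # Hypothetical stand-in objective: the number of selected features.
    return sum(position)


x_i = [1, 1, 0, 1]   # original krill position (binary feature mask)
y_k = [1, 0, 0, 1]   # candidate position, assumed to come from the lasso step
kept = keep_better((x_i, toy_fitness(x_i)), (y_k, toy_fitness(y_k)))
print(kept)  # ([1, 0, 0, 1], 2)
```

Because the comparison never accepts a worse position, the best fitness in the herd cannot increase from one iteration to the next under this rule.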

Table 5: The distribution of sample categories.

Attack types | Attack names
DoS | Neptune, back, land, pod, smurf, teardrop, mailbomb, Apache2, processtable, udpstorm, worm
Probe | Ipsweep, nmap, portsweep, Satan, mscan, saint
R2L | ftp_write, guess_passwd, imap, multihop, phf, spy, warezclient, warezmaster, sendmail, named, snmpgetattack, snmpguess, xlock, xsnoop, httptunnel
U2R | buffer_overflow, loadmodule, perl, rootkit, ps, sqlattack, xterm

Table 6: The distribution of sample categories.

Data category | KDDTrain+ samples | KDDTest+ samples | Total number of samples
Normal | 65120 | 11536 | 76656
DoS | 36944 | 6251 | 43195
Probe | 10786 | 2421 | 13207
R2L | 995 | 2653 | 3648
U2R | 52 | 67 | 119
All | 113897 | 22928 | 136825


subset, we randomly select 50% of Probe attack samples, 10% of DoS attack samples, 100% of U2R attack samples, and 100% of R2L attack samples in the KDDTrain+ dataset as the training dataset, and 100% of the Probe dataset, 50% of the DoS dataset, 100% of the U2R dataset, and 20% of the R2L dataset in the KDDTest+ dataset as the test dataset.

For the LNNLS-KH algorithm, the maximum number of iterations Imax and the number of krill individuals N are set to 100 and 30, respectively. Following [41], the foraging speed of krill individuals Vf is set to 0.02, the maximum random diffusion rate Dmax is set to 0.05, and the maximum induction speed Nmax is set to 0.01. Following [47], the threshold θ is set to 0.7. As the LNNLS-KH algorithm is designed first to ensure high accuracy and then to reduce the number of features, the weight factor α in the fitness function is set to 0.02.

$$\mathrm{FPR} = \frac{FP}{TN + FP}, \qquad (27)$$

$$\mathrm{DR} = \frac{TP}{TP + FN}. \qquad (28)$$
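Equations (27) and (28) can be evaluated directly from confusion-matrix counts. The sketch below is illustrative only: the label strings `"attack"` and `"normal"` and the helper names are assumptions, and the toy labels are not taken from the experiments.

```python
def confusion_counts(y_true, y_pred, positive="attack"):
    """Count TP, FP, TN, FN, treating `positive` as the intrusion class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def false_positive_rate(fp, tn):
    """Equation (27): share of normal samples flagged as intrusions."""
    return fp / (tn + fp)


def detection_rate(tp, fn):
    """Equation (28): share of intrusions that are detected (recall)."""
    return tp / (tp + fn)


y_true = ["attack", "attack", "attack", "normal", "normal", "normal", "normal", "attack"]
y_pred = ["attack", "attack", "normal", "normal", "attack", "normal", "normal", "attack"]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(false_positive_rate(fp, tn), detection_rate(tp, fn))  # 0.25 0.75
```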

We adopt the iterative curve of the global optimal fitness value, feature selection time, test set detection time, data dimension after feature selection, classification accuracy, detection rate (DR), and false positive rate (FPR) as evaluation measures of feature selection for IDS. The accuracy represents the ratio of the correctly classified samples to the total number of samples, which is defined in equation (19). FPR, also known as false alarm rate (FAR), represents the ratio of normal samples that are incorrectly detected as intrusions to all normal samples, as shown in

Table 7: The features of NSL-KDD dataset.

Classification of features | Number | Serial number and name of features
The basic characteristics of TCP connections | 9 | (1) duration, (2) protocol_type, (3) service, (4) flag, (5) src_bytes, (6) dst_bytes, (7) land, (8) wrong_fragment, (9) urgent
The content characteristics of a TCP connection | 13 | (10) hot, (11) num_failed_logins, (12) logged_in, (13) num_compromised, (14) root_shell, (15) num_root, (16) su_attempted, (17) num_file_creations, (18) num_shells, (19) num_access_files, (20) num_outbound_cmds, (21) is_host_login, (22) is_guest_login
Time-based statistical characteristics of network traffic | 9 | (23) count, (24) srv_count, (25) serror_rate, (26) srv_serror_rate, (27) rerror_rate, (28) srv_rerror_rate, (29) same_srv_rate, (30) diff_srv_rate, (31) srv_diff_host_rate
Host-based network traffic statistics | 10 | (32) dst_host_count, (33) dst_host_srv_count, (34) dst_host_same_srv_rate, (35) dst_host_diff_srv_rate, (36) dst_host_same_src_port_rate, (37) dst_host_srv_diff_host_rate, (38) dst_host_serror_rate, (39) dst_host_srv_serror_rate, (40) dst_host_rerror_rate, (41) dst_host_srv_rerror_rate

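Table 8: Attack time and attack types of the CICIDS2017 dataset.

Time | Type | Label | Amount | Total
Monday | Normal | BENIGN | 529918 | 529918
Tuesday | Normal | BENIGN | 432074 | 445909
Tuesday | Brute force | FTP patator | 7938 |
Tuesday | Brute force | SSH patator | 5897 |
Wednesday | Normal | BENIGN | 440031 | 692703
Wednesday | DoS | DoS GoldenEye | 10293 |
Wednesday | DoS | DoS slowhttptest | 5499 |
Wednesday | DoS | Dos slowloris | 5796 |
Wednesday | DoS | Heart bleed | 11 |
Thursday morning | Normal | BENIGN | 168186 | 170366
Thursday morning | Web attack | Web attack brute force | 1507 |
Thursday morning | Web attack | Web attack sql injection | 21 |
Thursday morning | Web attack | Web attack XSS | 652 |
Thursday afternoon | Normal | BENIGN | 288566 | 288602
Thursday afternoon | Infiltration | Infiltrationdnt | 36 |
Friday morning | Normal | BENIGN | 189067 | 191033
Friday morning | Botnet | Bot | 1966 |
Friday afternoon (1) | Normal | BENIGN | 97718 | 225745
Friday afternoon (1) | DDoS | DDoS | 128027 |
Friday afternoon (2) | Normal | BENIGN | 127537 | 286467
Friday afternoon (2) | PortScan | PortScan | 158930 |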


equation (27). DR, also known as recall or sensitivity, represents the probability that an abnormal sample is correctly detected among all abnormal samples, as shown in equation (28). The crossover-mutation PSO (CMPSO) algorithm [47], ACO algorithm [48], KH algorithm [41], and IKH algorithm [9] are used for comparative experiments. The experimental results of the Probe, DoS, R2L, and U2R datasets are as follows.

To reflect the performance of the LNNLS-KH algorithm intuitively, the convergence curves of the fitness function for the Probe, DoS, U2R, and R2L datasets are shown in Figure 9. The results show that the LNNLS-KH algorithm achieves a good fitness function value when the number of iterations reaches about 20, which demonstrates the strong exploitation ability and good convergence performance of the LNNLS-KH algorithm. As the number of iterations increases, the other algorithms show varying degrees of convergence stagnation, while the LNNLS-KH algorithm repeatedly jumps out of local optima and finds the global optimal solution with better fitness. The fitness function values after 100 iterations reach 0.0328, 0.0393, 0.0292, and 0.0036, respectively, for the four attack datasets, showing excellent exploration ability. Therefore, compared with the CMPSO, ACO, KH, and IKH algorithms, the LNNLS-KH algorithm exhibits faster convergence speed and stronger abilities of exploitation and exploration.

The results of different feature selection algorithms are shown in Table 10. The bold number in front of the brackets indicates the quantity of features after feature selection, and the specific feature numbers are listed in the brackets. The comparison of feature selection dimensions is shown in Figure 10, and different colours are used to distinguish the five algorithms. The proposed LNNLS-KH algorithm, marked in red, is in the innermost circle of Figure 10 for the Probe, DoS, U2R, and R2L datasets. It indicates that, compared with the other four feature selection algorithms, the LNNLS-KH algorithm retains the fewest features while ensuring accuracy. According to Figure 10, the LNNLS-KH algorithm selects 7 main features of the NSL-KDD dataset on average, accounting

Figure 9: Convergence curve of fitness functions for the four attack datasets (panels: Probe, DoS, R2L, and U2R; x-axis: number of iterations, 0 to 100; y-axis: fitness function; curves: CMPSO, ACO, KH, IKH, and LNNLS-KH).


for 17.07% of the total number of features. Compared with the CMPSO, ACO, KH, and IKH algorithms, the proposed LNNLS-KH algorithm reduces the number of features by 44%, 42.86%, 34.88%, and 24.32%, respectively, over the datasets of the four attack types. Meanwhile, relative to the average of the other four algorithms, the total number of features in the four types of attack datasets is reduced by 37.43%.
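The reduction percentages quoted above can be reproduced from the feature counts in Table 10. The short calculation below is a worked check rather than part of the original experiments; the variable names are illustrative.

```python
# Number of selected features per attack type (Probe, DoS, U2R, R2L), from Table 10.
selected = {
    "CMPSO": [14, 16, 9, 11],
    "ACO": [15, 16, 8, 10],
    "KH": [13, 12, 8, 10],
    "IKH": [11, 12, 6, 8],
    "LNNLS-KH": [8, 10, 3, 7],
}

ours = sum(selected["LNNLS-KH"])                      # 28 features in total
average_selected = ours / len(selected["LNNLS-KH"])   # 7.0 features on average
share_of_41 = 100 * average_selected / 41             # share of the 41 original features

# Relative reduction against each competing algorithm, in percent.
reduction = {
    name: 100 * (sum(counts) - ours) / sum(counts)
    for name, counts in selected.items()
    if name != "LNNLS-KH"
}

# Relative reduction against the mean total of the other four algorithms.
others_mean = sum(sum(c) for n, c in selected.items() if n != "LNNLS-KH") / 4
overall = 100 * (others_mean - ours) / others_mean

print(round(share_of_41, 2))                           # 17.07
print({k: round(v, 2) for k, v in reduction.items()})  # 44.0, 42.86, 34.88, 24.32
print(round(overall, 2))                               # 37.43
```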

To further evaluate the performance of the feature selection algorithms, we show the feature selection time and detection time of the five algorithms in Table 11. Feature selection time represents the time spent filtering out redundant features. The detection time represents the time from inputting the most representative feature subset into the KNN classifier to the end of detection. It can be seen from Table 11 that the feature selection time of the standard KH algorithm is shorter than that of the CMPSO and ACO algorithms, which indicates that the KH algorithm achieves faster speed and better performance. In addition, compared with the standard KH algorithm, the feature selection time of the LNNLS-KH algorithm is longer, which is mainly due to the nonlinear optimization of the physical diffusion motion and the linear nearest neighbor lasso step optimization after the krill herd position is updated. Although the feature selection time increases, the convergence speed and global search ability are greatly improved. At the same time, the LNNLS-KH algorithm removes redundant features, which considerably increases the detection speed. In comparison to the other four feature selection algorithms, the detection time of the LNNLS-KH algorithm is reduced by 16.83%, 16.91%, 8.94%, and 6.96% on average in test dataset samples of Probe, DoS, R2L, and U2R.
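To make the notion of detection time concrete, the following sketch classifies a query with a small KNN classifier restricted to a selected feature subset and times the call. It is a pure-Python illustration, not the authors' MATLAB implementation: the value k = 3, the toy samples, and the helper names are all assumptions.

```python
import math
import time
from collections import Counter


def project(sample, feature_idx):
    """Keep only the selected feature columns of one sample."""
    return [sample[i] for i in feature_idx]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data with 4 normalized features; only features 0 and 2 are "selected".
train = [
    ([0.10, 0.9, 0.10, 0.5], "normal"),
    ([0.20, 0.1, 0.20, 0.4], "normal"),
    ([0.15, 0.5, 0.10, 0.9], "normal"),
    ([0.90, 0.5, 0.80, 0.1], "attack"),
    ([0.80, 0.2, 0.90, 0.9], "attack"),
    ([0.90, 0.7, 0.90, 0.3], "attack"),
]
selected_idx = [0, 2]
train_x = [project(x, selected_idx) for x, _ in train]
train_y = [y for _, y in train]

start = time.perf_counter()
label = knn_predict(train_x, train_y, project([0.85, 0.0, 0.9, 0.0], selected_idx))
elapsed = time.perf_counter() - start
print(label, f"{elapsed:.6f} s")
```

Each distance computation touches only the selected columns, so its cost grows with the size of the feature subset; this is the mechanism by which a smaller subset shortens detection time.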

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the test dataset is classified using the KNN classifier. The classification accuracy of the different algorithms is shown in Table 12. Comparing the accuracy results, it is found that the LNNLS-KH feature selection algorithm achieves a classification accuracy of above 90% for the Probe, DoS, U2R, and R2L test

Table 9: The number and name of the features in the CICIDS2017 dataset.

Feature number | Feature name | Feature number | Feature name | Feature number | Feature name
1 | Destination port | 27 | Bwd IAT mean | 53 | Average packet size
2 | Flow duration | 28 | Bwd IAT std | 54 | Avg fwd segment size
3 | Total fwd packets | 29 | Bwd IAT max | 55 | Avg bwd segment size
4 | Total backward packets | 30 | Bwd IAT min | 56 | Fwd header length
5 | Total length of fwd packets | 31 | Fwd PSH flags | 57 | Fwd avg bytes/bulk
6 | Total length of bwd packets | 32 | Bwd PSH flags | 58 | Fwd avg packets/bulk
7 | Fwd packet length max | 33 | Fwd URG flags | 59 | Fwd avg bulk rate
8 | Fwd packet length min | 34 | Bwd URG flags | 60 | Bwd avg bytes/bulk
9 | Fwd packet length mean | 35 | Fwd header length | 61 | Bwd avg packets/bulk
10 | Fwd packet length std | 36 | Bwd header length | 62 | Bwd avg bulk rate
11 | Bwd packet length max | 37 | Fwd Packets/s | 63 | Subflow fwd packets
12 | Bwd packet length min | 38 | Bwd Packets/s | 64 | Subflow fwd bytes
13 | Bwd packet length mean | 39 | Min packet length | 65 | Subflow bwd packets
14 | Bwd packet length std | 40 | Max packet length | 66 | Subflow bwd bytes
15 | Flow bytes/s | 41 | Packet length mean | 67 | Init_Win_bytes_forward
16 | Flow packets/s | 42 | Packet length std | 68 | Init_Win_bytes_backward
17 | Flow IAT mean | 43 | Packet length variance | 69 | act_data_pkt_fwd
18 | Flow IAT std | 44 | FIN flag count | 70 | min_seg_size_forward
19 | Flow IAT max | 45 | SYN flag count | 71 | Active mean
20 | Flow IAT min | 46 | RST flag count | 72 | Active std
21 | Fwd IAT total | 47 | PSH flag count | 73 | Active max
22 | Fwd IAT mean | 48 | ACK flag count | 74 | Active min
23 | Fwd IAT std | 49 | URG flag count | 75 | Idle mean
24 | Fwd IAT max | 50 | CWE flag count | 76 | Idle std
25 | Fwd IAT min | 51 | ECE flag count | 77 | Idle max
26 | Bwd IAT total | 52 | Down/up ratio | 78 | Idle min

Figure 10: Comparison of feature selection dimensions produced by different algorithms (axes: Probe, DoS, U2R, and R2L; scale: 0 to 20 features; series: CMPSO, ACO, KH, IKH, and LNNLS-KH).


dataset samples. Furthermore, relative to the mean of the other four algorithms, the LNNLS-KH algorithm improves the classification accuracy of the Probe, DoS, U2R, and R2L test dataset samples by 9.95, 12.04, 9.47, and 8.66 percentage points, respectively.

Table 13 shows the false positive rate and detection rate of the feature subsets produced by different feature selection algorithms. To visualize the difference, we show the comparison in Figure 11. For the Probe, DoS, U2R, and R2L datasets, the average false positive rate of the LNNLS-KH feature selection algorithm is 4.00%, which is 20.70, 15.30, 8.88, and 3.34 percentage points lower than that of the CMPSO, ACO, KH, and IKH algorithms, respectively. Similarly, for the detection rate, the proposed LNNLS-KH feature selection algorithm exhibits excellent performance. The average detection rate of

Table 10: The feature selection results of different feature selection algorithms (NSL-KDD dataset).

Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH
Probe | 14 (2, 3, 4, 7, 8, 10, 11, 17, 19, 20, 21, 27, 30, 33) | 15 (1, 3, 4, 6, 15, 16, 17, 19, 21, 23, 29, 35, 39, 40, 41) | 13 (3, 4, 5, 7, 8, 13, 14, 18, 19, 21, 26, 28, 40) | 11 (2, 3, 5, 8, 10, 17, 18, 29, 34, 35, 41) | 8 (3, 4, 8, 11, 15, 29, 34, 40)
DoS | 16 (3, 4, 5, 6, 8, 13, 14, 17, 18, 22, 23, 26, 30, 32, 35, 41) | 16 (3, 4, 7, 12, 14, 19, 20, 25, 27, 28, 30, 33, 34, 37, 40, 41) | 12 (2, 3, 4, 5, 8, 9, 12, 15, 19, 24, 26, 30) | 12 (2, 3, 4, 6, 12, 18, 20, 22, 27, 28, 30, 31) | 10 (3, 4, 6, 15, 17, 19, 20, 21, 30, 37)
U2R | 9 (3, 4, 5, 9, 12, 19, 32, 33, 41) | 8 (3, 4, 6, 8, 20, 24, 33, 36) | 8 (3, 4, 10, 12, 19, 23, 31, 32) | 6 (3, 10, 11, 21, 36, 39) | 3 (3, 33, 36)
R2L | 11 (2, 3, 4, 8, 21, 22, 25, 27, 37, 40, 41) | 10 (3, 4, 7, 12, 17, 21, 29, 37, 38, 40) | 10 (2, 3, 4, 6, 13, 18, 19, 22, 32, 41) | 8 (3, 4, 5, 8, 11, 14, 21, 31) | 7 (2, 3, 4, 10, 15, 21, 36)

Table 11: Feature selection time and detection time of different feature selection algorithms (NSL-KDD dataset).

Data categories | Time of feature selection (second): CMPSO | ACO | KH | IKH | LNNLS-KH | Time of detection (second): CMPSO | ACO | KH | IKH | LNNLS-KH
Probe | 523178 | 499814 | 474533 | 534887 | 549048 | 3713 | 3823 | 3530 | 3405 | 3106
DoS | 789235 | 763086 | 716852 | 803816 | 829692 | 11869 | 11815 | 10666 | 10514 | 9844
U2R | 15487 | 14729 | 14418 | 15779 | 17224 | 0087 | 0086 | 0086 | 0086 | 0078
R2L | 255675 | 236908 | 224092 | 266951 | 272770 | 955 | 913 | 907 | 862 | 803

Table 12: The classification accuracy of different feature selection algorithms (NSL-KDD dataset).

Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%)
Probe | 80.46 | 86.56 | 92.42 | 93.74 | 98.24
DoS | 81.74 | 83.36 | 86.03 | 88.74 | 97.01
U2R | 82.74 | 84.57 | 85.59 | 91.89 | 95.67
R2L | 78.70 | 81.62 | 88.78 | 90.49 | 93.56

Figure 11: Comparison of classification FPR and DR of different feature selection algorithms. (a) FPR (%) of different feature selection algorithms. (b) DR (%) of different feature selection algorithms. (Groups: Probe, DoS, U2R, and R2L; series: CMPSO, ACO, KH, IKH, and LNNLS-KH.)


the LNNLS-KH algorithm is 96.48%, which is 13.47, 9.32, 7.02, and 4.72 percentage points higher than that of the CMPSO, ACO, KH, and IKH feature selection algorithms, respectively.
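The averages and differences quoted for FPR and DR follow from the per-dataset values in Table 13. The sketch below recomputes them as a worked check; the percentages are read from Table 13 with two decimal places, and the function names are illustrative.

```python
# FPR and DR in percent for Probe, DoS, U2R, R2L (values from Table 13).
fpr = {
    "CMPSO": [22.37, 21.27, 24.51, 30.66],
    "ACO": [18.04, 14.08, 21.04, 24.05],
    "KH": [8.50, 11.45, 16.13, 15.42],
    "IKH": [4.05, 7.88, 8.45, 8.99],
    "LNNLS-KH": [1.18, 2.85, 4.30, 7.67],
}
dr = {
    "CMPSO": [82.32, 79.12, 87.02, 83.56],
    "ACO": [89.18, 82.08, 89.79, 87.56],
    "KH": [95.01, 83.77, 90.14, 88.91],
    "IKH": [95.22, 85.23, 93.67, 92.89],
    "LNNLS-KH": [97.73, 96.80, 95.52, 95.85],
}


def mean(values):
    return sum(values) / len(values)


def gaps(table, ours="LNNLS-KH"):
    """Absolute difference (percentage points) between each algorithm and ours."""
    base = mean(table[ours])
    return {name: abs(mean(vals) - base) for name, vals in table.items() if name != ours}


fpr_gap = gaps(fpr)   # how much lower the LNNLS-KH average FPR is
dr_gap = gaps(dr)     # how much higher the LNNLS-KH average DR is

print(f"average FPR: {mean(fpr['LNNLS-KH']):.2f}%, average DR: {mean(dr['LNNLS-KH']):.2f}%")
for name in fpr_gap:
    print(f"{name}: FPR gap {fpr_gap[name]:.2f}, DR gap {dr_gap[name]:.2f}")
```

The printed gaps agree with the figures in the text to within rounding of the second decimal place.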

In conclusion, the LNNLS-KH feature selection algorithm performs excellently in terms of the global optimal fitness iteration curve, test set detection time, number of dimensions of the feature subset, classification accuracy, false positive rate, and detection rate. Although the offline training time of the LNNLS-KH algorithm is longer than that of the CMPSO, ACO, KH, and IKH algorithms, its lower feature dimension reduces the detection time. Moreover, the algorithm has faster convergence speed, higher detection accuracy, a lower false positive rate, and a higher detection rate.

4.3. Experimental Results and Discussion of CICIDS2017 Dataset. The experiment is conducted in MATLAB R2016a on a Windows 64-bit operating system with an Intel(R) Core(TM) i7-4790 processor. The MachineLearningCVE file in the CICIDS2017 dataset includes 8 CSV files of all traffic data, which, after removing some duplicate features, contain 78 features plus an attack type label. We annotate traffic records according to different attack periods and types and standardize and normalize the dataset. Due to the excessive amount of data contained in the analyzed CSV files, problems such as excessively long running time and slow convergence of the model occur when the host is used for model training. Therefore, we simplified and reintegrated these CSV data files while preserving the original attack timing features. We selected a total of 12090 records and 5 types of traffic, including 1 type of normal traffic and 4 types of attack traffic, namely, “DoS,” “DDoS,” “PortScan,” and “WebAttack.” The data are randomly divided into training sets and test sets in a 2:1 ratio, with independent and repeated experiments.
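A random 2:1 division of the kind described above can be sketched as follows. This is an illustration under stated assumptions: the function name and the fixed seed are hypothetical, and the integer records stand in for the 12090 selected traffic records.

```python
import random


def split_two_to_one(records, seed=0):
    """Shuffle a copy of `records` and split it 2:1 into (train, test)."""
    rng = random.Random(seed)   # seeded only so that the example is repeatable
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]


records = list(range(12090))    # placeholder records
train_set, test_set = split_two_to_one(records)
print(len(train_set), len(test_set))  # 8060 4030
```

Repeating the call with different seeds gives the independent repetitions mentioned in the text.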

The CMPSO, ACO, KH, and IKH algorithms are used for comparison with the LNNLS-KH algorithm. The preprocessed Normal, DoS, DDoS, PortScan, and WebAttack subsets are input into the algorithm model successively, and the dimension and feature subsets of feature selection are obtained. We adopt the KNN classification model as the classifier and obtain the accuracy of intrusion detection on the test set data. The results of feature selection dimension for the CICIDS2017 dataset are shown in Table 14. The LNNLS-KH algorithm selects different features for different attack types. For example, the selected features of the DoS subset are “Total Length of Bwd Packets,” “Fwd Packet Length Min,” “Flow IAT Min,” “FIN Flag Count,” “RST Flag Count,” “URG Flag Count,” “Bwd Avg Packets/Bulk,” “Idle Mean,” and “Idle Std.” For the WebAttack subset, “Total Fwd Packets,” “Bwd IAT Max,” “Bwd PSH Flags,” “Fwd Packets/s,” “Bwd Avg Packets/Bulk,” “Subflow Fwd Bytes,” “Active Max,” and “Idle Max” are selected as attack features by the LNNLS-KH algorithm. It reduces the feature dimension of the IDS dataset while ensuring high accuracy. The average feature dimension selected by the LNNLS-KH algorithm is 10.2, accounting for 13.08% of the total number of features in the CICIDS2017 dataset. It decreases the

Table 13: The classification FPR and DR of different feature selection algorithms (NSL-KDD dataset).

Data categories | False positive rate (FPR) (%): CMPSO | ACO | KH | IKH | LNNLS-KH | Detection rate (DR) (%): CMPSO | ACO | KH | IKH | LNNLS-KH
Probe | 22.37 | 18.04 | 8.50 | 4.05 | 1.18 | 82.32 | 89.18 | 95.01 | 95.22 | 97.73
DoS | 21.27 | 14.08 | 11.45 | 7.88 | 2.85 | 79.12 | 82.08 | 83.77 | 85.23 | 96.80
U2R | 24.51 | 21.04 | 16.13 | 8.45 | 4.30 | 87.02 | 89.79 | 90.14 | 93.67 | 95.52
R2L | 30.66 | 24.05 | 15.42 | 8.99 | 7.67 | 83.56 | 87.56 | 88.91 | 92.89 | 95.85

Figure 12: Comparison of feature selection time and intrusion detection time for different feature selection algorithms. (a) Feature selection time for different feature selection algorithms. (b) Intrusion detection time of different feature selection algorithms. (Subsets: Normal, DoS, DDoS, PortScan, and WebAttack; time axes in seconds; series: CMPSO, ACO, KH, IKH, and LNNLS-KH.)


number of features by 57.85%, 52.34%, 27.14%, and 25%, respectively, compared with the CMPSO, ACO, KH, and IKH algorithms.

Figure 12 shows the feature selection time and intrusion detection time of the 5 feature selection algorithms to further evaluate their performance. It can be seen from Figure 12(a) that, in the feature selection stage, the LNNLS-KH algorithm takes a long time to find the optimal feature subset due to the linear nearest neighbor lasso step optimization after the position update of the krill herd. Compared with the KH and IKH algorithms, it increases the time by an average of 14.38% and 9.32%, respectively. Although the LNNLS-KH algorithm requires more calculation time, the convergence speed and global search ability are improved. Figure 12(b) shows the intrusion detection time of the 5 feature selection algorithms. It is the time taken by the KNN classifier to classify the sample dataset after the feature subset has been found, excluding the time spent searching for the optimal feature subset. The feature dimension of the LNNLS-KH algorithm is low, and the amount of data processed in classifying the detection sample dataset is small, which results in a shorter classification detection time. Compared with the CMPSO, ACO, KH, and IKH algorithms, the intrusion detection time of the LNNLS-KH algorithm is reduced by 6.52%, 5.17%, 2.14%, and 2.28% on average, respectively.

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the KNN classifier is used to classify the test dataset. The classification accuracy of the different algorithms is shown in Table 15. For the five types of subsets, the average classification accuracy of the proposed LNNLS-KH algorithm is 95.86%. In particular, the classification accuracy reaches 97.55% for the PortScan subset. Compared with the mean of the other four feature selection methods, the LNNLS-KH algorithm achieves an increase of 3.11, 8.52, 8.58, 2.45, and 4.29 percentage points on the

Table 14: The number of feature selection for different algorithms (CICIDS2017 dataset).

Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH
Normal | 28 (3, 7, 13, 15, 16, 17, 20, 22, 24, 26, 30, 35, 37, 38, 42, 43, 44, 45, 46, 49, 50, 56, 59, 62, 63, 64, 65, 76) | 25 (1, 3, 4, 7, 10, 11, 12, 13, 15, 19, 29, 32, 34, 35, 37, 43, 46, 47, 51, 55, 56, 58, 73, 76, 78) | 14 (11, 19, 33, 39, 43, 49, 55, 56, 58, 65, 66, 68, 71, 73) | 14 (5, 10, 19, 20, 21, 23, 27, 33, 43, 56, 69, 70, 73, 78) | 8 (6, 12, 16, 32, 38, 50, 54, 73)
DoS | 24 (1, 3, 4, 13, 16, 17, 24, 26, 30, 33, 35, 39, 40, 44, 48, 51, 53, 57, 58, 59, 60, 62, 67, 70) | 19 (3, 6, 12, 13, 15, 26, 35, 39, 51, 55, 60, 61, 66, 69, 71, 73, 75, 77, 78) | 13 (8, 16, 21, 30, 45, 50, 52, 57, 59, 63, 66, 67) | 14 (2, 12, 15, 16, 19, 21, 32, 34, 44, 46, 65, 68, 76, 77) | 9 (6, 8, 20, 44, 46, 49, 61, 75, 76)
DDoS | 29 (15, 18, 19, 20, 23, 25, 26, 33, 34, 35, 38, 39, 42, 43, 46, 47, 49, 51, 55, 56, 57, 59, 60, 61, 62, 63, 71, 72, 78) | 27 (6, 9, 10, 13, 16, 19, 24, 28, 31, 41, 42, 45, 47, 48, 50, 51, 52, 53, 54, 56, 59, 60, 61, 62, 65, 68, 72) | 21 (10, 12, 13, 15, 18, 23, 27, 30, 34, 35, 41, 42, 45, 55, 61, 63, 65, 66, 68, 70, 76) | 18 (1, 11, 13, 14, 19, 24, 32, 35, 36, 40, 42, 47, 51, 57, 60, 69, 70, 75) | 14 (2, 5, 8, 9, 11, 22, 26, 33, 41, 43, 47, 51, 74, 77)
PortScan | 24 (1, 3, 6, 15, 16, 28, 30, 33, 35, 37, 44, 45, 52, 56, 59, 60, 61, 63, 65, 68, 70, 75, 77, 78) | 21 (1, 2, 6, 10, 15, 17, 26, 27, 29, 39, 42, 43, 46, 49, 58, 61, 66, 69, 70, 71, 76) | 14 (15, 20, 22, 27, 37, 44, 49, 50, 53, 59, 62, 65, 67, 78) | 15 (1, 24, 30, 32, 33, 43, 49, 53, 54, 58, 60, 61, 63, 64, 69) | 12 (2, 6, 15, 24, 25, 28, 32, 57, 59, 63, 66, 76)
WebAttack | 16 (2, 7, 26, 29, 45, 47, 50, 52, 53, 54, 63, 66, 68, 69, 72, 78) | 15 (3, 9, 10, 12, 19, 26, 40, 46, 50, 54, 64, 65, 68, 69, 73) | 8 (1, 17, 19, 36, 48, 49, 53, 60) | 7 (14, 17, 35, 39, 44, 48, 54) | 8 (3, 29, 32, 37, 61, 64, 73, 77)

Table 15: The classification accuracy of different feature selection algorithms (CICIDS2017 dataset).

Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%)
Normal | 89.78 | 89.06 | 92.70 | 94.58 | 94.64
DoS | 77.03 | 82.69 | 90.90 | 93.34 | 94.51
DDoS | 81.73 | 86.94 | 91.85 | 88.19 | 95.76
PortScan | 92.38 | 95.64 | 95.05 | 97.35 | 97.55
WebAttack | 89.12 | 93.08 | 93.77 | 94.26 | 96.85

Table 16: The classification FPR and DR of different feature selection algorithms (CICIDS2017 dataset).

Data categories | False positive rate (FPR) (%): CMPSO | ACO | KH | IKH | LNNLS-KH | Detection rate (DR) (%): CMPSO | ACO | KH | IKH | LNNLS-KH
Normal | 9.25 | 8.72 | 6.41 | 4.93 | 3.67 | 88.05 | 88.51 | 89.25 | 92.46 | 93.89
DoS | 5.41 | 4.48 | 4.06 | 2.83 | 1.94 | 72.57 | 82.89 | 87.86 | 92.56 | 92.64
DDoS | 6.85 | 4.92 | 4.54 | 6.33 | 3.18 | 79.03 | 83.47 | 90.22 | 87.52 | 92.98
PortScan | 4.65 | 3.02 | 2.84 | 1.86 | 1.16 | 88.25 | 93.80 | 94.33 | 95.14 | 95.42
WebAttack | 5.33 | 3.16 | 2.52 | 2.11 | 1.60 | 87.40 | 91.35 | 92.19 | 92.94 | 94.77


Normal, DoS, DDoS, PortScan, and WebAttack subsets, respectively. Table 16 shows the classification FPR and DR of different feature selection algorithms on the test sets. Based on the detection of the five test sets, the LNNLS-KH algorithm has a lower FPR and a higher DR than the other four algorithms.

We propose the LNNLS-KH algorithm, a novel feature selection algorithm for intrusion detection. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the algorithm has good feature selection performance and improves the efficiency of intrusion detection.

5. Conclusions

With the rapid development of network technology, intrusion detection plays an increasingly important role in network security. However, the “dimensional disaster” caused by massive data results in problems such as slow response and poor accuracy of the intrusion detection system. The KH algorithm is a new population-based swarm intelligence optimization method that shows good performance in high-dimensional data processing, providing a new approach for reducing the dimension of intrusion detection data and selecting useful features. In this paper, an improved KH algorithm named LNNLS-KH is proposed for feature selection of IDS datasets by linear nearest neighbor lasso optimization. The LNNLS-KH algorithm introduces a new fitness function, which is composed of the number of selected feature dimensions and the classification accuracy. Nonlinear optimization is introduced into the physical diffusion motion of krill individuals to accelerate the convergence of the algorithm. Moreover, the linear nearest neighbor lasso step optimization is proposed to balance the exploration and exploitation abilities and obtain the global optimal solution of the feature subset effectively. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the LNNLS-KH algorithm retains 7 and 10.2 features on average, respectively, which greatly reduces the dimension of the features. In the NSL-KDD dataset, features are reduced by 44%, 42.86%, 34.88%, and 24.32% compared with the CMPSO, ACO, KH, and IKH algorithms, and in the CICIDS2017 dataset, they are reduced by 57.85%, 52.34%, 27.14%, and 25%, respectively. In addition, the classification accuracy of the LNNLS-KH feature selection algorithm is increased by 10.03% and 5.39%, and the time of intrusion detection is reduced by 12.41% and 4.03% on the two datasets. Furthermore, the LNNLS-KH algorithm enhances the ability to jump out of local optimal solutions and shows good performance in the optimal fitness iteration curve, false positive rate of detection, and convergence speed, which demonstrates that the proposed LNNLS-KH algorithm is an efficient feature selection method for network intrusion detection.

In this research, we found that the initialization of the LNNLS-KH algorithm has a certain degree of randomness. Therefore, we conducted independent and repeated experiments to mitigate this problem, and the results were reasonable and convincing. Although the proposed algorithm shows encouraging performance, it could be further improved.

In future work, we will consider using data balancing techniques to preprocess the experimental dataset in order to obtain more accurate feature selection results and stronger algorithm stability. Meanwhile, we will combine LNNLS-KH with other algorithms to improve the exploration and exploitation abilities, thereby further shortening the time of feature subset training and classification detection. In addition, as the LNNLS-KH algorithm is generally applicable, it can be applied to more feature selection systems and to optimization problems in other fields.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was sponsored by the National Key Research and Development Program of China (Grants 2018YFB0804002 and 2017YFB0803204), National Natural Science Foundation of P.R. China (Grant 72001191), Henan Natural Science Foundation (Grant 202300410442), and Henan Philosophy and Social Science Program (Grant 2020CZH009).

References

[1] W. Wei and C. Guo, “A text semantic topic discovery method based on the conditional co-occurrence degree,” Neurocomputing, vol. 368, pp. 11–24, 2019.

[2] C.-R. Wang, R.-F. Xu, S.-J. Lee, and C.-H. Lee, “Network intrusion detection using equality constrained-optimization-based extreme learning machines,” Knowledge-Based Systems, vol. 147, pp. 68–80, 2018.

[3] G.-G. Wang, A. H. Gandomi, A. H. Alavi, and D. Gong, “A comprehensive review of krill herd algorithm: variants, hybrids and applications,” Artificial Intelligence Review, vol. 51, no. 1, pp. 119–148, 2019.

[4] J. Amudhavel, D. Sathian, R. S. Raghav et al., “A fault tolerant distributed self-organization in peer to peer (p2p) using krill herd optimization,” in Proceedings of the 2015 International Conference on Advanced Research in Computer Science Engineering & Technology (ICARCSET 2015), pp. 1–5, Unnao, India, 2015.

[5] L. M. Abualigah, A. T. Khader, and E. S. Hanandeh, “Hybrid clustering analysis using improved krill herd algorithm,” Applied Intelligence, vol. 48, no. 11, pp. 4047–4071, 2018.

[6] P. A. Kowalski and S. Łukasik, “Training neural networks with krill herd algorithm,” Neural Processing Letters, vol. 44, no. 1, pp. 5–17, 2016.

[7] C. Stasinakis, G. Sermpinis, I. Psaradellis, and T. Verousis, “Krill-Herd Support Vector Regression and heterogeneous autoregressive leverage: evidence from forecasting and trading commodities,” Quantitative Finance, vol. 16, no. 12, pp. 1901–1915, 2016.


[8] L. Wang, P. Jia, T. Huang, S. Duan, J. Yan, and L. Wang, “A novel optimization technique to improve gas recognition by electronic noses based on the enhanced krill herd algorithm,” Sensors, vol. 16, no. 8, p. 1275, 2016.

[9] R. Jensi and G. W. Jiji, “An improved krill herd algorithm with global exploration capability for solving numerical function optimization problems and its application to data clustering,” Applied Soft Computing, vol. 46, pp. 230–245, 2016.

[10] H. Pulluri, R. Naresh, and V. Sharma, “Application of stud krill herd algorithm for solution of optimal power flow problems,” International Transactions on Electrical Energy Systems, vol. 27, no. 6, Article ID e2316, 2017.

[11] D. Rodrigues, L. A. M. Pereira, J. P. Papa et al., “A binary krill herd approach for feature selection,” in Proceedings of the 2014 22nd International Conference on Pattern Recognition, pp. 1407–1412, IEEE, Stockholm, Sweden, August 2014.

[12] A. Mukherjee and V. Mukherjee, “Chaotic krill herd algorithm for optimal reactive power dispatch considering FACTS devices,” Applied Soft Computing, vol. 44, pp. 163–190, 2016.

[13] S. Sun, H. Qi, F. Zhao, L. Ruan, and B. Li, “Inverse geometry design of two-dimensional complex radiative enclosures using krill herd optimization algorithm,” Applied Thermal Engineering, vol. 98, pp. 1104–1115, 2016.

[14] S. Sultana and P. K. Roy, “Oppositional krill herd algorithm for optimal location of capacitor with reconfiguration in radial distribution system,” International Journal of Electrical Power & Energy Systems, vol. 74, pp. 78–90, 2016.

[15] L. Brezocnik, I. Fister, and V. Podgorelec, “Swarm intelligence algorithms for feature selection: a review,” Applied Sciences, vol. 8, no. 9, 2018.

[16] D. Smith, Q. Guan, and S. Fu, “An anomaly detection framework for autonomic management of compute cloud systems,” in Proceedings of the 2010 IEEE 34th Annual Computer Software and Applications Conference Workshops, pp. 376–381, IEEE, Seoul, South Korea, July 2010.

[17] Y. Zhao, Y. Zhang, W. Tong et al., “An improved feature selection algorithm based on MAHALANOBIS distance for network intrusion detection,” in Proceedings of 2013 International Conference on Sensor Network Security Technology and Privacy Communication System, pp. 69–73, IEEE, Nangang, China, May 2013.

[18] P. Singh and A. Tiwari, “An efficient approach for intrusion detection in reduced features of KDD99 using ID3 and classification with KNNGA,” in Proceedings of the 2015 Second International Conference on Advances in Computing and Communication Engineering, pp. 445–452, IEEE, Dehradun, India, May 2015.

[19] M. A. Ambusaidi, X. He, P. Nanda, and Z. Tan, “Building an intrusion detection system using a filter-based feature selection algorithm,” IEEE Transactions on Computers, vol. 65, no. 10, pp. 2986–2998, 2016.

[20] N. Shone, T. N. Ngoc, V. D. Phai, and Q. Shi, “A deep learning approach to network intrusion detection,” IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 2, no. 1, pp. 41–50, 2018.

[21] Y. Xue, W. Jia, X. Zhao et al., “An evolutionary computation based feature selection method for intrusion detection,” Security and Communication Networks, vol. 2018, Article ID 2492956, 10 pages, 2018.

[22] Z. Shen, Y. Zhang, and W. Chen, “A bayesian classification intrusion detection method based on the fusion of PCA and LDA,” Security and Communication Networks, vol. 2019, Article ID 6346708, 11 pages, 2019.

[23] P. Sun, P. Liu, Q. Li et al., “DL-IDS: Extracting features using CNN-LSTM hybrid network for intrusion detection system,” Security and Communication Networks, vol. 2020, Article ID 8890306, 11 pages, 2020.

[24] G. Farahani, “Feature selection based on cross-correlation for the intrusion detection system,” Security & Communication Networks, vol. 2020, Article ID 8875404, 17 pages, 2020.

[25] F. G. Mohammadi, M. H. Amini, and H. R. Arabnia, “Applications of nature-inspired algorithms for dimension Reduction: enabling efficient data analytics,” in Advances in Intelligent Systems and Computing, Optimization, Learning, and Control for Interdependent Complex Networks, pp. 67–84, Springer, Cham, Switzerland, 2020.

[26] J. Kennedy and R. Eberhart, “Particle swarm optimization,” in Proceedings of the ICNN’95-International Conference on Neural Networks, no. 4, pp. 1942–1948, IEEE, Perth, WA, Australia, December 1995.

[27] M. Dorigo, M. Birattari, and T. Stutzle, “Ant colony optimization,” IEEE Computational Intelligence Magazine, vol. 1, no. 4, pp. 28–39, 2006.

[28] R. Rajabioun, “Cuckoo optimization algorithm,” Applied Soft Computing, vol. 11, no. 8, pp. 5508–5518, 2011.

[29] M. Neshat, G. Sepidnam, M. Sargolzaei, and A. N. Toosi, “Artificial fish swarm algorithm: a survey of the state-of-the-art, hybridization, combinatorial and indicative applications,” Artificial Intelligence Review, vol. 42, no. 4, pp. 965–997, 2014.

[30] D. Karaboga, “An idea based on honey bee swarm for numerical optimization,” Technical Report-tr06, Erciyes university, Engineering Faculty, Computer Engineering Department, Kayseri, Turkey, 2005.

[31] W.-T. Pan, “A new Fruit Fly Optimization Algorithm: taking the financial distress model as an example,” Knowledge-Based Systems, vol. 26, pp. 69–74, 2012.

[32] R. Zhao and W. Tang, “Monkey algorithm for global numerical optimization,” Journal of Uncertain Systems, vol. 2, no. 3, pp. 165–176, 2008.

[33] X. S. Yang and X. He, “Bat algorithm: literature review and applications,” International Journal of Bio-Inspired Computation, vol. 5, no. 3, pp. 141–149, 2013.

[34] S. Mirjalili, A. H. Gandomi, S. Z. Mirjalili, S. Saremi, H. Faris, and S. M. Mirjalili, “Salp Swarm Algorithm: a bio-inspired optimizer for engineering design problems,” Advances in Engineering Software, vol. 114, pp. 163–191, 2017.

[35] K. Ahmed, A. E. Hassanien, and S. Bhattacharyya, “A novel chaotic chicken swarm optimization algorithm for feature selection,” in Proceedings of the 2017 Third International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), pp. 259–264, IEEE, Kolkata, India, November 2017.

[36] S. Tabakhi, P. Moradi, F. Akhlaghian et al., “An unsupervised feature selection algorithm based on ant colony optimization,” Engineering Applications of Artificial Intelligence, vol. 32, pp. 112–123, 2014.

[37] S. Arora and P. Anand, “Binary butterfly optimization approaches for feature selection,” Expert Systems with Applications, vol. 116, pp. 147–160, 2019.

[38] C. Yan, J. Ma, H. Luo, and A. Patel, “Hybrid binary coral reefs optimization algorithm with simulated annealing for feature selection in high-dimensional biomedical datasets,” Chemometrics and Intelligent Laboratory Systems, vol. 184, pp. 102–111, 2019.

[39] G. I. Sayed, A. Tharwat, and A. E. Hassanien, “Chaotic dragonfly algorithm: an improved metaheuristic algorithm for feature selection,” Applied Intelligence, vol. 49, no. 1, pp. 188–205, 2019.

[40] Z. Zhang, P. Wei, Y. Li et al., “Feature selection algorithm based on improved particle swarm joint taboo search,” Journal of Communication, vol. 39, no. 12, pp. 60–68, 2018.

[41] A. H. Gandomi and A. H. Alavi, “Krill herd: a new bio-inspired optimization algorithm,” Communications in Nonlinear Science and Numerical Simulation, vol. 17, no. 12, pp. 4831–4845, 2012.

[42] Q. Tan and Z. Huang, “Krill herd with nearest neighbor lasso operator,” Computer Engineering and Applications, vol. 55, no. 9, pp. 124–129, 2019.

[43] Q. Wang, C. Ding, and X. Wang, “A hybrid data clustering algorithm based on improved krill herd algorithm and KHM clustering,” Control and Decision, vol. 35, no. 10, pp. 2449–2458, 2018.

[44] Q. Li and B. Liu, “Clustering using an improved krill herd algorithm,” Algorithms, vol. 10, no. 2, p. 56, 2017.

[45] G.-G. Wang, A. H. Gandomi, and A. H. Alavi, “Stud krill herd algorithm,” Neurocomputing, vol. 128, pp. 363–370, 2014.

[46] J. Li, Y. Tang, C. Hua, and X. Guan, “An improved krill herd algorithm: krill herd with linear decreasing step,” Applied Mathematics and Computation, vol. 234, pp. 356–367, 2014.

[47] H. B. Nguyen, B. Xue, P. Andreae et al., “Particle swarm optimisation with genetic operators for feature selection,” in Proceedings of the 2017 IEEE Congress on Evolutionary Computation (CEC), pp. 286–293, IEEE, San Sebastian, Spain, June 2017.

[48] M. H. Aghdam and P. Kabiri, “Feature selection for intrusion detection system using ant colony optimization,” International Journal of Network Security, vol. 18, no. 3, pp. 420–432, 2016.


Cybersecurity (CIC) in 2017. The dataset collected traffic data for five days, with only normal traffic on Monday and attacks occurring in the morning and afternoon from Tuesday to Friday. It includes “FTP patator,” “SSH patator,” “DoS GoldenEye,” “DoS Slowhttptest,” “DoS Slowloris,” “Heartbleed,” “Web Attack Brute Force,” “Web Attack Sql Injection,” “Web Attack XSS,” “Infiltration Attack,” “Bot,” “DDoS,” and “PortScan,” which are common types of attacks in modern networks. The distribution of attack time and types of the CICIDS2017 dataset is shown in Table 8. We use the MachineLearningCVE file in the CICIDS2017 dataset as the dataset, which contains 78 features and an attack type label. The number and name of each feature are shown in Table 9. Compared with the NSL-KDD dataset, the attack types in the CICIDS2017 dataset are more in line with the situation of modern networks.

[Figure 8: The process of the LNNLS-KH algorithm for IDS feature selection. Flowchart: input the IDS dataset; initialize parameters (N, NV, Imax, UB, LB) and the krill herd position; calculate the fitness of individuals; calculate the three actions (movement induced by other krill individuals, foraging activity, nonlinear physical diffusion motion); apply the genetic operator; update the position and fitness values of individuals; find the nearest krill and calculate the linear lasso step; compare the fitness of the updated position Yk with that of Xi or Xj and keep one of them; update Xgb and Kgb of the global optimal individuals; repeat until the iteration count exceeds Imax; output the optimal solution and the number of selected features; run the KNN algorithm for intrusion detection; evaluate the intrusion detection results.]

4.2. Experimental Results and Discussion of NSL-KDD Dataset. The experiment is conducted in MATLAB R2016a on a Windows 64-bit operating system with an Intel(R) Core(TM) i7-4790 processor. Since the training of the algorithm requires normal and abnormal samples, we mix normal samples and different types of attack samples to construct training sets and test sets for four different attack types. In order to reduce the time of searching the optimal feature

Input: Training set
Output: Global best solution, the number of selected features, and feature selection time

(1) Begin
(2)   Initialize algorithm parameters Nmax, Vf, Dmax, NV, Imax, UB, LB
(3)   Initialize the krill herd position
(4)   Evaluate the fitness of krill individuals and find the individuals with the best and worst fitness values
(5)   for I = 1 to Imax do
(6)     for each krill individual i (i = 1, 2, ..., m) do
(7)       Calculate the three components of motion:
(8)         (1) The motion induced by other krill individuals
(9)         (2) The foraging activity
(10)        (3) The nonlinear optimized physical diffusion
(11)      Implement crossover operator
(12)      Update krill herd position and fitness values
(13)      Calculate the linear nearest neighbor lasso step and new position using equations (24) and (25), and update new fitness values
(14)      if Kyk > Ki or (Kj)
(15)        Leave Ki or (Kj) and delete Kyk
(16)      else
(17)        Leave Kyk and delete Ki or (Kj)
(18)      end if
(19)    end for
(20)    Update Xgb and Kgb of the globally optimal individuals
(21)  end for
(22)  Output the global best solution, the number of selected features, and feature selection time
(23) End

Algorithm 1: The LNNLS-KH algorithm.
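The keep-or-discard test that closes the inner loop of Algorithm 1 can be sketched as follows. This is a minimal illustration, not the authors' MATLAB implementation: it assumes that a lower fitness value is better (consistent with the decreasing convergence curves reported for this method), it compares the candidate only with krill i, and the names `accept_candidate`, `positions`, and `fitness` are hypothetical. The lasso step that produces the candidate position Yk is not reproduced here, so the candidate is passed in directly.

```python
def accept_candidate(positions, fitness, i, candidate, candidate_fitness):
    """Greedy replacement after the lasso step (assumes minimisation).

    If the candidate Y_k has a higher (worse) fitness than krill i, the old
    position X_i is kept and the candidate is discarded; otherwise the
    candidate replaces X_i.
    """
    if candidate_fitness > fitness[i]:
        return False  # keep X_i, delete Y_k
    positions[i] = list(candidate)  # keep Y_k, delete X_i
    fitness[i] = candidate_fitness
    return True


# Toy herd of two krill in a three-dimensional search space.
positions = [[0.2, 0.9, 0.4], [0.7, 0.1, 0.8]]
fitness = [0.10, 0.05]

accepted = accept_candidate(positions, fitness, 0, [0.3, 0.8, 0.1], 0.04)
rejected = accept_candidate(positions, fitness, 1, [0.5, 0.5, 0.5], 0.09)
```

Because a candidate is accepted only when it does not worsen the fitness, the best fitness in the herd can never get worse from one iteration to the next.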

Table 5: The distribution of sample categories.

| Attack types | Attack names |
| --- | --- |
| DoS | Neptune, back, land, pod, smurf, teardrop, mailbomb, Apache2, processtable, udpstorm, worm |
| Probe | Ipsweep, nmap, portsweep, Satan, mscan, saint |
| R2L | ftp_write, guess_passwd, imap, multihop, phf, spy, warezclient, warezmaster, sendmail, named, snmpgetattack, snmpguess, xlock, xsnoop, httptunnel |
| U2R | buffer_overflow, loadmodule, perl, rootkit, ps, sqlattack, xterm |

Table 6: The distribution of sample categories.

| Data category | KDDTrain+ samples | KDDTest+ samples | Total number of samples |
| --- | --- | --- | --- |
| Normal | 65120 | 11536 | 76656 |
| DoS | 36944 | 6251 | 43195 |
| Probe | 10786 | 2421 | 13207 |
| R2L | 995 | 2653 | 3648 |
| U2R | 52 | 67 | 119 |
| All | 113897 | 22928 | 136825 |


subset, we randomly select 50% of Probe attack samples, 10% of DoS attack samples, 100% of U2R attack samples, and 100% of R2L attack samples in the KDDTrain+ dataset as the training dataset, and 100% of the Probe dataset, 50% of the DoS dataset, 100% of the U2R dataset, and 20% of the R2L dataset in the KDDTest+ dataset as the test dataset.

For the LNNLS-KH algorithm, the maximum number of iterations Imax and the number of krill individuals N are set to 100 and 30, respectively. Following [41], the foraging speed of krill individuals Vf is set to 0.02, the maximum random diffusion rate Dmax is set to 0.05, and the maximum induction speed Nmax is set to 0.01. Following [47], the threshold θ is set to 0.7. As the LNNLS-KH algorithm is designed to ensure high accuracy first and to reduce the number of features second, the weight factor α in the fitness function is set to 0.02.
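To make the role of α and θ concrete, the sketch below shows one common way such a fitness value is assembled. Both the weighted-sum form and the use of θ as a cut-off that turns a continuous krill position into a feature mask are assumptions made for illustration; the fitness equation itself is not reproduced in this section, and the function names are hypothetical.

```python
ALPHA = 0.02  # weight factor alpha from the text
THETA = 0.7   # threshold theta from the text; its use as a cut-off is assumed


def select_features(position, theta=THETA):
    """Indices of the features whose position component exceeds theta."""
    return [d for d, x in enumerate(position) if x > theta]


def fitness(position, accuracy, alpha=ALPHA):
    """Assumed form: (1 - alpha) * error rate + alpha * share of features kept.

    A small alpha makes the classification error dominate, so accuracy is
    protected first and the feature count is reduced second.
    """
    kept = len(select_features(position))
    return (1 - alpha) * (1 - accuracy) + alpha * kept / len(position)


position = [0.9, 0.1, 0.75, 0.3]           # toy krill position, 4 features
selected = select_features(position)        # features 0 and 2 pass the cut-off
value = fitness(position, accuracy=0.95)    # 0.98 * 0.05 + 0.02 * 2 / 4
```

With α = 0.02, removing half of the features changes the fitness by only 0.01, whereas a one-point drop in accuracy changes it by 0.0098, which illustrates why accuracy takes priority under this weighting.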

$$\mathrm{FPR} = \frac{FP}{TN + FP}, \tag{27}$$

$$\mathrm{DR} = \frac{TP}{TP + FN}. \tag{28}$$

We adopt the iterative curve of the global optimal fitness value, feature selection time, test set detection time, data dimension after feature selection, classification accuracy, detection rate (DR), and false positive rate (FPR) as evaluation measures of feature selection for IDS. The accuracy represents the ratio of the correctly classified samples to the total number of samples, as defined in equation (19). FPR, also known as false alarm rate (FAR), represents the ratio of samples that are incorrectly detected as intrusions to all normal samples, as shown in

Table 7: The features of NSL-KDD dataset.

| Classification of features | Number | Serial number and name of features |
| --- | --- | --- |
| The basic characteristics of TCP connections | 9 | (1) duration, (2) protocol_type, (3) service, (4) flag, (5) src_bytes, (6) dst_bytes, (7) land, (8) wrong_fragment, (9) urgent |
| The content characteristics of a TCP connection | 13 | (10) hot, (11) num_failed_logins, (12) logged_in, (13) num_compromised, (14) root_shell, (15) num_root, (16) su_attempted, (17) num_file_creations, (18) num_shells, (19) num_access_files, (20) num_outbound_cmds, (21) is_host_login, (22) is_guest_login |
| Time-based statistical characteristics of network traffic | 9 | (23) count, (24) srv_count, (25) serror_rate, (26) srv_serror_rate, (27) rerror_rate, (28) srv_rerror_rate, (29) same_srv_rate, (30) diff_srv_rate, (31) srv_diff_host_rate |
| Host-based network traffic statistics | 10 | (32) dst_host_count, (33) dst_host_srv_count, (34) dst_host_same_srv_rate, (35) dst_host_diff_srv_rate, (36) dst_host_same_src_port_rate, (37) dst_host_srv_diff_host_rate, (38) dst_host_serror_rate, (39) dst_host_srv_serror_rate, (40) dst_host_rerror_rate, (41) dst_host_srv_rerror_rate |

Table 8: Attack time and attack types of the CICIDS2017 dataset.

| Time | Type | Label | Amount | Total |
| --- | --- | --- | --- | --- |
| Monday | Normal | BENIGN | 529918 | 529918 |
| Tuesday | Normal | BENIGN | 432074 | 445909 |
| | Brute force | FTP patator | 7938 | |
| | | SSH patator | 5897 | |
| Wednesday | Normal | BENIGN | 440031 | 692703 |
| | DoS | DoS GoldenEye | 10293 | |
| | | DoS slowhttptest | 5499 | |
| | | DoS slowloris | 5796 | |
| | | Heartbleed | 11 | |
| Thursday morning | Normal | BENIGN | 168186 | 170366 |
| | Web attack | Web attack brute force | 1507 | |
| | | Web attack sql injection | 21 | |
| | | Web attack XSS | 652 | |
| Thursday afternoon | Normal | BENIGN | 288566 | 288602 |
| | Infiltration | Infiltration | 36 | |
| Friday morning | Normal | BENIGN | 189067 | 191033 |
| | Botnet | Bot | 1966 | |
| Friday afternoon (1) | Normal | BENIGN | 97718 | 225745 |
| | DDoS | DDoS | 128027 | |
| Friday afternoon (2) | Normal | BENIGN | 127537 | 286467 |
| | PortScan | PortScan | 158930 | |


equation (27). DR, also known as recall or sensitivity, represents the probability that an abnormal sample is correctly detected among all abnormal samples, as shown in equation (28). The crossover-mutation PSO (CMPSO) algorithm [47], ACO algorithm [48], KH algorithm [41], and IKH algorithm [9] are used as comparison algorithms. The experimental results on the Probe, DoS, R2L, and U2R datasets are as follows.
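The three classification measures defined above can be computed directly from the entries of a binary confusion matrix. The following is a minimal sketch; the function name `detection_metrics` and the example counts are illustrative and not taken from the experiments.

```python
def detection_metrics(tp, tn, fp, fn):
    """Return (accuracy, FPR, DR) for a binary intrusion detector.

    tp: attacks detected as attacks     fn: attacks missed
    tn: normal samples passed           fp: normal samples flagged as attacks
    """
    accuracy = (tp + tn) / (tp + tn + fp + fn)  # correctly classified / all
    fpr = fp / (tn + fp)                        # equation (27)
    dr = tp / (tp + fn)                         # equation (28), i.e. recall
    return accuracy, fpr, dr


# Toy example: 100 attack samples and 100 normal samples.
accuracy, fpr, dr = detection_metrics(tp=90, tn=95, fp=5, fn=10)
```

In this toy case 5 of 100 normal samples raise a false alarm and 90 of 100 attacks are caught, so FPR is 0.05 and DR is 0.90, while the overall accuracy is 0.925.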

To reflect the performance of the LNNLS-KH algorithm intuitively, the convergence curves of the fitness function for the Probe, DoS, U2R, and R2L datasets are shown in Figure 9. The results show that the LNNLS-KH algorithm achieves a good fitness function value when the number of iterations reaches about 20, which demonstrates the strong exploitation ability and good convergence performance of the LNNLS-KH algorithm. As the number of iterations increases, the other algorithms show varying degrees of convergence stagnation, while the LNNLS-KH algorithm constantly jumps out of local optima and finds the global optimal solution with better fitness. The fitness function values after 100 iterations reach 0.0328, 0.0393, 0.0292, and 0.0036, respectively, for the four attack datasets, showing excellent exploration ability. Therefore, compared with the CMPSO, ACO, KH, and IKH algorithms, the LNNLS-KH algorithm exhibits faster convergence speed and stronger abilities of exploitation and exploration.

The results of different feature selection algorithms are shown in Table 10. The number in front of the brackets indicates the number of features after feature selection, and the specific feature numbers are listed in the brackets. The comparison of feature selection dimensions is shown in Figure 10, where different colours are used to distinguish the five algorithms. The proposed LNNLS-KH algorithm, marked in red, is in the innermost circle of Figure 10 for the Probe, DoS, U2R, and R2L datasets. This indicates that, compared with the other four feature selection algorithms, the LNNLS-KH algorithm retains the fewest features while ensuring accuracy. According to Figure 10, the LNNLS-KH algorithm selects 7 main features of the NSL-KDD dataset on average, accounting

[Figure 9: Convergence curve of fitness functions for the four attack datasets. Four panels (Probe, DoS, R2L, U2R); x-axis: number of iterations (0 to 100); y-axis: fitness function; curves: CMPSO, ACO, KH, IKH, LNNLS-KH.]


for 17.07% of the total number of features. Compared with the CMPSO, ACO, KH, and IKH algorithms, the proposed LNNLS-KH algorithm reduces the number of features by 44%, 42.86%, 34.88%, and 24.32%, respectively, across the four attack-type datasets. Overall, the total number of features in the four types of attack datasets is reduced by 37.43%.
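These percentages follow from the feature counts reported in Table 10 for the Probe, DoS, U2R, and R2L subsets. The short check below recomputes them; the variable names are illustrative, and the 37.43% figure is reproduced under the assumption that it compares LNNLS-KH with the mean total of the other four algorithms.

```python
# Number of selected features per subset (Probe, DoS, U2R, R2L) from Table 10.
selected = {
    "CMPSO": [14, 16, 9, 11],
    "ACO": [15, 16, 8, 10],
    "KH": [13, 12, 8, 10],
    "IKH": [11, 12, 6, 8],
    "LNNLS-KH": [8, 10, 3, 7],
}

ours = sum(selected["LNNLS-KH"])                     # 28 features in total
mean_selected = ours / len(selected["LNNLS-KH"])     # 7 features on average
share_of_all = round(100 * mean_selected / 41, 2)    # NSL-KDD has 41 features

# Relative reduction of the total feature count against each baseline.
reduction = {
    name: round(100 * (sum(counts) - ours) / sum(counts), 2)
    for name, counts in selected.items()
    if name != "LNNLS-KH"
}

# Reduction against the mean total of the four baselines.
baseline_mean = sum(sum(c) for n, c in selected.items() if n != "LNNLS-KH") / 4
overall = round(100 * (baseline_mean - ours) / baseline_mean, 2)
```

For example, CMPSO keeps 50 features in total against 28 for LNNLS-KH, which gives (50 - 28) / 50 = 44%.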

To further evaluate the performance of the feature selection algorithms, we show the feature selection time and detection time of the five algorithms in Table 11. Feature selection time is the time spent filtering out redundant features. Detection time is the time from inputting the most representative feature subset into the KNN classifier to the end of detection. It can be seen from Table 11 that the feature selection time of the standard KH algorithm is shorter than that of the CMPSO and ACO algorithms, which indicates that the KH algorithm is faster. Compared with the standard KH algorithm, the feature selection time of the LNNLS-KH algorithm is longer, mainly because of the nonlinear optimization of the physical diffusion motion and the linear nearest neighbor lasso step optimization performed after the krill herd position is updated. Although the feature selection time increases, the convergence speed and global search ability are greatly improved. At the same time, the LNNLS-KH algorithm removes redundant features, which considerably increases the detection speed. In comparison with the other four feature selection algorithms, the detection time of the LNNLS-KH algorithm is reduced by 16.83%, 16.91%, 8.94%, and 6.96% on average on the test dataset samples of Probe, DoS, R2L, and U2R.

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the test dataset is classified using the KNN classifier. The classification accuracy of the different algorithms is shown in Table 12. Comparing the accuracy results, the LNNLS-KH feature selection algorithm achieves a classification accuracy above 90% for the Probe, DoS, U2R, and R2L test

Table 9: The number and name of the features in the CICIDS2017 dataset.

| Feature number | Feature name | Feature number | Feature name | Feature number | Feature name |
| --- | --- | --- | --- | --- | --- |
| 1 | Destination port | 27 | Bwd IAT mean | 53 | Average packet size |
| 2 | Flow duration | 28 | Bwd IAT std | 54 | Avg fwd segment size |
| 3 | Total fwd packets | 29 | Bwd IAT max | 55 | Avg bwd segment size |
| 4 | Total backward packets | 30 | Bwd IAT min | 56 | Fwd header length |
| 5 | Total length of fwd packets | 31 | Fwd PSH flags | 57 | Fwd avg bytes/bulk |
| 6 | Total length of bwd packets | 32 | Bwd PSH flags | 58 | Fwd avg packets/bulk |
| 7 | Fwd packet length max | 33 | Fwd URG flags | 59 | Fwd avg bulk rate |
| 8 | Fwd packet length min | 34 | Bwd URG flags | 60 | Bwd avg bytes/bulk |
| 9 | Fwd packet length mean | 35 | Fwd header length | 61 | Bwd avg packets/bulk |
| 10 | Fwd packet length std | 36 | Bwd header length | 62 | Bwd avg bulk rate |
| 11 | Bwd packet length max | 37 | Fwd Packets/s | 63 | Subflow fwd packets |
| 12 | Bwd packet length min | 38 | Bwd Packets/s | 64 | Subflow fwd bytes |
| 13 | Bwd packet length mean | 39 | Min packet length | 65 | Subflow bwd packets |
| 14 | Bwd packet length std | 40 | Max packet length | 66 | Subflow bwd bytes |
| 15 | Flow bytes/s | 41 | Packet length mean | 67 | Init_Win_bytes_forward |
| 16 | Flow packets/s | 42 | Packet length std | 68 | Init_Win_bytes_backward |
| 17 | Flow IAT mean | 43 | Packet length variance | 69 | act_data_pkt_fwd |
| 18 | Flow IAT std | 44 | FIN flag count | 70 | min_seg_size_forward |
| 19 | Flow IAT max | 45 | SYN flag count | 71 | Active mean |
| 20 | Flow IAT min | 46 | RST flag count | 72 | Active std |
| 21 | Fwd IAT total | 47 | PSH flag count | 73 | Active max |
| 22 | Fwd IAT mean | 48 | ACK flag count | 74 | Active min |
| 23 | Fwd IAT std | 49 | URG flag count | 75 | Idle mean |
| 24 | Fwd IAT max | 50 | CWE flag count | 76 | Idle std |
| 25 | Fwd IAT min | 51 | ECE flag count | 77 | Idle max |
| 26 | Bwd IAT total | 52 | Down/up ratio | 78 | Idle min |

[Figure 10: Comparison of feature selection dimensions produced by different algorithms. Radar chart with axes Probe, DoS, U2R, and R2L (scale 0 to 20); series: CMPSO, ACO, KH, IKH, LNNLS-KH.]


dataset samples. Furthermore, the LNNLS-KH algorithm improves the average classification accuracy on the Probe, DoS, U2R, and R2L test dataset samples by 9.95%, 12.04%, 9.47%, and 8.66%, respectively.
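The evaluation step described here, classifying test samples with KNN using only the selected feature subset, can be sketched in a few lines. This is an illustrative pure-Python version rather than the MATLAB code used in the experiments; the value k = 3, the Euclidean distance, and the toy data are assumptions, since the classifier settings are not stated in this section.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, features, k=3):
    """Majority vote among the k nearest training rows.

    Only the columns listed in `features` (the selected feature subset)
    contribute to the Euclidean distance.
    """
    def distance(row):
        return math.sqrt(sum((row[f] - query[f]) ** 2 for f in features))

    nearest = sorted(range(len(train_x)), key=lambda n: distance(train_x[n]))[:k]
    return Counter(train_y[n] for n in nearest).most_common(1)[0][0]


# Toy data: column 1 is a noisy feature that the selection step has dropped.
train_x = [[0.1, 5.0, 0.2], [0.2, 1.0, 0.1], [0.9, 4.0, 0.8], [0.8, 2.0, 0.9]]
train_y = ["normal", "normal", "attack", "attack"]

label = knn_predict(train_x, train_y, [0.85, 5.0, 0.7], features=[0, 2], k=3)
```

Fewer selected features mean fewer terms in every distance computation, which is the mechanism behind the shorter detection times reported for smaller feature subsets.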

Table 13 shows the false positive rate and detection rate of the feature subsets produced by different feature selection algorithms. To visualize the difference, we show the comparison in Figure 11. For the Probe, DoS, U2R, and R2L datasets, the average false positive rate of the LNNLS-KH feature selection algorithm is 4.00%, which is 20.70, 15.30, 8.88, and 3.34 percentage points lower than that of the CMPSO, ACO, KH, and IKH algorithms, respectively. Similarly, for the detection rate, the proposed LNNLS-KH feature selection algorithm exhibits excellent performance. The average detection rate of

Table 10: The feature selection results of different feature selection algorithms (NSL-KDD dataset).

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
| --- | --- | --- | --- | --- | --- |
| Probe | 14 (2, 3, 4, 7, 8, 10, 11, 17, 19, 20, 21, 27, 30, 33) | 15 (1, 3, 4, 6, 15, 16, 17, 19, 21, 23, 29, 35, 39, 40, 41) | 13 (3, 4, 5, 7, 8, 13, 14, 18, 19, 21, 26, 28, 40) | 11 (2, 3, 5, 8, 10, 17, 18, 29, 34, 35, 41) | 8 (3, 4, 8, 11, 15, 29, 34, 40) |
| DoS | 16 (3, 4, 5, 6, 8, 13, 14, 17, 18, 22, 23, 26, 30, 32, 35, 41) | 16 (3, 4, 7, 12, 14, 19, 20, 25, 27, 28, 30, 33, 34, 37, 40, 41) | 12 (2, 3, 4, 5, 8, 9, 12, 15, 19, 24, 26, 30) | 12 (2, 3, 4, 6, 12, 18, 20, 22, 27, 28, 30, 31) | 10 (3, 4, 6, 15, 17, 19, 20, 21, 30, 37) |
| U2R | 9 (3, 4, 5, 9, 12, 19, 32, 33, 41) | 8 (3, 4, 6, 8, 20, 24, 33, 36) | 8 (3, 4, 10, 12, 19, 23, 31, 32) | 6 (3, 10, 11, 21, 36, 39) | 3 (3, 33, 36) |
| R2L | 11 (2, 3, 4, 8, 21, 22, 25, 27, 37, 40, 41) | 10 (3, 4, 7, 12, 17, 21, 29, 37, 38, 40) | 10 (2, 3, 4, 6, 13, 18, 19, 22, 32, 41) | 8 (3, 4, 5, 8, 11, 14, 21, 31) | 7 (2, 3, 4, 10, 15, 21, 36) |

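Table 11: Feature selection time and detection time of different feature selection algorithms (NSL-KDD dataset).

Time of feature selection (second):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
| --- | --- | --- | --- | --- | --- |
| Probe | 5231.78 | 4998.14 | 4745.33 | 5348.87 | 5490.48 |
| DoS | 7892.35 | 7630.86 | 7168.52 | 8038.16 | 8296.92 |
| U2R | 154.87 | 147.29 | 144.18 | 157.79 | 172.24 |
| R2L | 2556.75 | 2369.08 | 2240.92 | 2669.51 | 2727.70 |

Time of detection (second):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
| --- | --- | --- | --- | --- | --- |
| Probe | 37.13 | 38.23 | 35.30 | 34.05 | 31.06 |
| DoS | 118.69 | 118.15 | 106.66 | 105.14 | 98.44 |
| U2R | 0.087 | 0.086 | 0.086 | 0.086 | 0.078 |
| R2L | 9.55 | 9.13 | 9.07 | 8.62 | 8.03 |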

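Table 12: The classification accuracy of different feature selection algorithms (NSL-KDD dataset).

| Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%) |
| --- | --- | --- | --- | --- | --- |
| Probe | 80.46 | 86.56 | 92.42 | 93.74 | 98.24 |
| DoS | 81.74 | 83.36 | 86.03 | 88.74 | 97.01 |
| U2R | 82.74 | 84.57 | 85.59 | 91.89 | 95.67 |
| R2L | 78.70 | 81.62 | 88.78 | 90.49 | 93.56 |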

[Figure 11: Comparison of classification FPR and DR of different feature selection algorithms. (a) FPR (%) of different feature selection algorithms. (b) DR (%) of different feature selection algorithms. Bar charts over the Probe, DoS, U2R, and R2L datasets; series: CMPSO, ACO, KH, IKH, LNNLS-KH.]


the LNNLS-KH algorithm is 96.48%, which is 13.47, 9.32, 7.02, and 4.72 percentage points higher than that of the CMPSO, ACO, KH, and IKH feature selection algorithms, respectively.
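The averages and gaps quoted for FPR and DR are differences between per-algorithm means over the four NSL-KDD subsets in Table 13. The following check recomputes them from the table values (Probe, DoS, U2R, R2L, in percent); the helper and variable names are illustrative.

```python
# FPR and DR in percent per subset (Probe, DoS, U2R, R2L), from Table 13.
fpr = {
    "CMPSO": [22.37, 21.27, 24.51, 30.66],
    "ACO": [18.04, 14.08, 21.04, 24.05],
    "KH": [8.50, 11.45, 16.13, 15.42],
    "IKH": [4.05, 7.88, 8.45, 8.99],
    "LNNLS-KH": [1.18, 2.85, 4.30, 7.67],
}
dr = {
    "CMPSO": [82.32, 79.12, 87.02, 83.56],
    "ACO": [89.18, 82.08, 89.79, 87.56],
    "KH": [95.01, 83.77, 90.14, 88.91],
    "IKH": [95.22, 85.23, 93.67, 92.89],
    "LNNLS-KH": [97.73, 96.80, 95.52, 95.85],
}


def mean(values):
    return sum(values) / len(values)


mean_fpr = mean(fpr["LNNLS-KH"])  # about 4.00
mean_dr = mean(dr["LNNLS-KH"])    # about 96.48

# Gaps in percentage points between each baseline and LNNLS-KH.
fpr_gap = {n: mean(v) - mean_fpr for n, v in fpr.items() if n != "LNNLS-KH"}
dr_gap = {n: mean_dr - mean(v) for n, v in dr.items() if n != "LNNLS-KH"}
```

Note that these gaps are absolute differences between mean rates, not relative changes; for instance, the mean FPR of CMPSO is about 24.70%, so the gap to LNNLS-KH is about 20.70 points.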

In conclusion, the LNNLS-KH feature selection algorithm performs excellently in terms of the global optimal fitness iteration curve, test set detection time, number of dimensions of the feature subset, classification accuracy, false positive rate, and detection rate. Although the offline training time of the LNNLS-KH algorithm is longer than that of the CMPSO, ACO, KH, and IKH algorithms, its lower feature dimension reduces the detection time. Moreover, the algorithm has faster convergence speed, higher detection accuracy, a lower classification false positive rate, and a higher detection rate.

4.3. Experimental Results and Discussion of CICIDS2017 Dataset. The experiment is conducted in MATLAB R2016a on a Windows 64-bit operating system with an Intel(R) Core(TM) i7-4790 processor. The MachineLearningCVE file in the CICIDS2017 dataset includes 8 CSV files of all traffic data, which contain 78 features plus an attack type tag after some duplicate features are removed. We annotate traffic records according to the different attack periods and types, and we standardize and normalize the dataset. Because the CSV files contain an excessive amount of data, training the model on the host would take excessively long and converge slowly. Therefore, we simplified and reintegrated these CSV data files while preserving the original attack timing features. We selected a total of 12090 records and 5 types of traffic, including 1 type of normal traffic and 4 types of attack traffic, namely, “DoS,” “DDoS,” “PortScan,” and “WebAttack.” The data are randomly divided into training sets and test sets in a 2:1 ratio, with independent and repeated experiments.
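The preprocessing described above can be illustrated with a short sketch. It assumes min-max scaling as the normalization step and a single seeded shuffle for the 2:1 split; the exact normalization and sampling procedure are not specified in the text, and the function names are hypothetical.

```python
import random


def min_max_scale(column):
    """Rescale one feature column to [0, 1] (assumed normalization)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant column carries no information
    return [(x - lo) / (hi - lo) for x in column]


def split_two_to_one(records, seed=0):
    """Shuffle the records and split them into training and test sets, 2:1."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 2 // 3
    return shuffled[:cut], shuffled[cut:]


records = list(range(12090))          # stand-ins for the 12090 selected records
train, test = split_two_to_one(records)
scaled = min_max_scale([2.0, 4.0, 10.0])
```

With 12090 records, a 2:1 split yields 8060 training records and 4030 test records; repeating the split with different seeds corresponds to the independent repeated experiments mentioned above.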

The CMPSO, ACO, KH, and IKH algorithms are used for comparison with the LNNLS-KH algorithm. The preprocessed Normal, DoS, DDoS, PortScan, and WebAttack subsets are input into the algorithm model successively, and the dimension and feature subsets of feature selection are obtained. We adopt the KNN classification model as the classifier and obtain the accuracy of intrusion detection on the test set data. The results of feature selection dimension for the CICIDS2017 dataset are shown in Table 14. The LNNLS-KH algorithm selects different features for different attack types. For example, the selected features of the DoS subset are “Total Length of Bwd Packets,” “Fwd Packet Length Min,” “Flow IAT Min,” “FIN Flag Count,” “RST Flag Count,” “URG Flag Count,” “Bwd Avg Packets/Bulk,” “Idle Mean,” and “Idle Std.” For the WebAttack subset, “Total Fwd Packets,” “Bwd IAT Max,” “Bwd PSH Flags,” “Fwd Packets/s,” “Bwd Avg Packets/Bulk,” “Subflow Fwd Bytes,” “Active Max,” and “Idle Max” are selected as attack features by the LNNLS-KH algorithm. It reduces the feature dimension of the IDS dataset while ensuring high accuracy. The average feature dimension selected by the LNNLS-KH algorithm is 10.2, accounting for 13.08% of the total number of features in the CICIDS2017 dataset. It decreases the

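Table 13: The classification FPR and DR of different feature selection algorithms (NSL-KDD dataset).

False positive rate (FPR) (%):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
| --- | --- | --- | --- | --- | --- |
| Probe | 22.37 | 18.04 | 8.50 | 4.05 | 1.18 |
| DoS | 21.27 | 14.08 | 11.45 | 7.88 | 2.85 |
| U2R | 24.51 | 21.04 | 16.13 | 8.45 | 4.30 |
| R2L | 30.66 | 24.05 | 15.42 | 8.99 | 7.67 |

Detection rate (DR) (%):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
| --- | --- | --- | --- | --- | --- |
| Probe | 82.32 | 89.18 | 95.01 | 95.22 | 97.73 |
| DoS | 79.12 | 82.08 | 83.77 | 85.23 | 96.80 |
| U2R | 87.02 | 89.79 | 90.14 | 93.67 | 95.52 |
| R2L | 83.56 | 87.56 | 88.91 | 92.89 | 95.85 |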

[Figure 12: Comparison of feature selection time and intrusion detection time for different feature selection algorithms. (a) Feature selection time for different feature selection algorithms (x-axis: time of feature selection in seconds, 0 to 10000). (b) Intrusion detection time of different feature selection algorithms (x-axis: time of intrusion detection in seconds, 0 to 2.5). Horizontal bar charts over the Normal, DoS, DDoS, PortScan, and WebAttack subsets; series: CMPSO, ACO, KH, IKH, LNNLS-KH.]


number of features by 57.85%, 52.34%, 27.14%, and 25%, respectively, compared with the CMPSO, ACO, KH, and IKH algorithms.

Figure 12 shows the feature selection time and intrusion detection time of the 5 feature selection algorithms, to further evaluate their performance. It can be seen from Figure 12(a) that, in the feature selection stage, the LNNLS-KH algorithm takes longer to find the optimal feature subset because of the linear nearest neighbor lasso step optimization performed after the position update of the krill herd. Compared with the KH and IKH algorithms, it increases the time by an average of 14.38% and 9.32%, respectively. Although the LNNLS-KH algorithm requires more calculation time, the convergence speed and global search ability are improved. Figure 12(b) shows the intrusion detection time of the 5 feature selection algorithms. This is the time the KNN classifier needs to classify the sample dataset after the feature subset has been found, excluding the time spent searching for the optimal feature subset. Because the feature dimension of the LNNLS-KH algorithm is low, less data is processed when classifying the detection sample dataset, which results in a shorter classification detection time. Compared with the CMPSO, ACO, KH, and IKH algorithms, the intrusion detection time of the LNNLS-KH algorithm is reduced by 6.52%, 5.17%, 2.14%, and 2.28% on average, respectively.

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the KNN classifier is used to classify the test dataset. The classification accuracy of the different algorithms is shown in Table 15. For the five types of subsets, the average classification accuracy of the proposed LNNLS-KH algorithm is 95.86%. In particular, the classification accuracy reaches 97.55% for the PortScan subset. Compared with the other four feature selection methods, the LNNLS-KH algorithm has an average increase of 3.11%, 8.52%, 8.58%, 2.45%, and 4.29% on the

Table 14: The number of feature selection for different algorithms (CICIDS2017 dataset).

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
| --- | --- | --- | --- | --- | --- |
| Normal | 28 (3, 7, 13, 15, 16, 17, 20, 22, 24, 26, 30, 35, 37, 38, 42, 43, 44, 45, 46, 49, 50, 56, 59, 62, 63, 64, 65, 76) | 25 (1, 3, 4, 7, 10, 11, 12, 13, 15, 19, 29, 32, 34, 35, 37, 43, 46, 47, 51, 55, 56, 58, 73, 76, 78) | 14 (11, 19, 33, 39, 43, 49, 55, 56, 58, 65, 66, 68, 71, 73) | 14 (5, 10, 19, 20, 21, 23, 27, 33, 43, 56, 69, 70, 73, 78) | 8 (6, 12, 16, 32, 38, 50, 54, 73) |
| DoS | 24 (1, 3, 4, 13, 16, 17, 24, 26, 30, 33, 35, 39, 40, 44, 48, 51, 53, 57, 58, 59, 60, 62, 67, 70) | 19 (3, 6, 12, 13, 15, 26, 35, 39, 51, 55, 60, 61, 66, 69, 71, 73, 75, 77, 78) | 13 (8, 16, 21, 30, 45, 50, 52, 57, 59, 63, 66, 67) | 14 (2, 12, 15, 16, 19, 21, 32, 34, 44, 46, 65, 68, 76, 77) | 9 (6, 8, 20, 44, 46, 49, 61, 75, 76) |
| DDoS | 29 (15, 18, 19, 20, 23, 25, 26, 33, 34, 35, 38, 39, 42, 43, 46, 47, 49, 51, 55, 56, 57, 59, 60, 61, 62, 63, 71, 72, 78) | 27 (6, 9, 10, 13, 16, 19, 24, 28, 31, 41, 42, 45, 47, 48, 50, 51, 52, 53, 54, 56, 59, 60, 61, 62, 65, 68, 72) | 21 (10, 12, 13, 15, 18, 23, 27, 30, 34, 35, 41, 42, 45, 55, 61, 63, 65, 66, 68, 70, 76) | 18 (1, 11, 13, 14, 19, 24, 32, 35, 36, 40, 42, 47, 51, 57, 60, 69, 70, 75) | 14 (2, 5, 8, 9, 11, 22, 26, 33, 41, 43, 47, 51, 74, 77) |
| PortScan | 24 (1, 3, 6, 15, 16, 28, 30, 33, 35, 37, 44, 45, 52, 56, 59, 60, 61, 63, 65, 68, 70, 75, 77, 78) | 21 (1, 2, 6, 10, 15, 17, 26, 27, 29, 39, 42, 43, 46, 49, 58, 61, 66, 69, 70, 71, 76) | 14 (15, 20, 22, 27, 37, 44, 49, 50, 53, 59, 62, 65, 67, 78) | 15 (1, 24, 30, 32, 33, 43, 49, 53, 54, 58, 60, 61, 63, 64, 69) | 12 (2, 6, 15, 24, 25, 28, 32, 57, 59, 63, 66, 76) |
| WebAttack | 16 (2, 7, 26, 29, 45, 47, 50, 52, 53, 54, 63, 66, 68, 69, 72, 78) | 15 (3, 9, 10, 12, 19, 26, 40, 46, 50, 54, 64, 65, 68, 69, 73) | 8 (1, 17, 19, 36, 48, 49, 53, 60) | 7 (14, 17, 35, 39, 44, 48, 54) | 8 (3, 29, 32, 37, 61, 64, 73, 77) |

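Table 15: The classification accuracy of different feature selection algorithms (CICIDS2017 dataset).

| Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%) |
| --- | --- | --- | --- | --- | --- |
| Normal | 89.78 | 89.06 | 92.70 | 94.58 | 94.64 |
| DoS | 77.03 | 82.69 | 90.90 | 93.34 | 94.51 |
| DDoS | 81.73 | 86.94 | 91.85 | 88.19 | 95.76 |
| PortScan | 92.38 | 95.64 | 95.05 | 97.35 | 97.55 |
| WebAttack | 89.12 | 93.08 | 93.77 | 94.26 | 96.85 |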

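Table 16: The classification FPR and DR of different feature selection algorithms (CICIDS2017 dataset).

False positive rate (FPR) (%):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
| --- | --- | --- | --- | --- | --- |
| Normal | 9.25 | 8.72 | 6.41 | 4.93 | 3.67 |
| DoS | 5.41 | 4.48 | 4.06 | 2.83 | 1.94 |
| DDoS | 6.85 | 4.92 | 4.54 | 6.33 | 3.18 |
| PortScan | 4.65 | 3.02 | 2.84 | 1.86 | 1.16 |
| WebAttack | 5.33 | 3.16 | 2.52 | 2.11 | 1.60 |

Detection rate (DR) (%):

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
| --- | --- | --- | --- | --- | --- |
| Normal | 88.05 | 88.51 | 89.25 | 92.46 | 93.89 |
| DoS | 72.57 | 82.89 | 87.86 | 92.56 | 92.64 |
| DDoS | 79.03 | 83.47 | 90.22 | 87.52 | 92.98 |
| PortScan | 88.25 | 93.80 | 94.33 | 95.14 | 95.42 |
| WebAttack | 87.40 | 91.35 | 92.19 | 92.94 | 94.77 |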


Normal, DoS, DDoS, PortScan, and WebAttack subsets, respectively. Table 16 shows the classification FPR and DR of different feature selection algorithms on the test sets. On all five test sets, the LNNLS-KH algorithm has a lower FPR and a higher DR than the other four algorithms.

We propose the LNNLS-KH algorithm a novel featureselection algorithm for intrusion detection Experimentsbased on NSL-KDD and CICIDS2017 datasets show that thealgorithm has good feature selection performance and im-proves the efficiency of intrusion detection

5. Conclusions

With the rapid development of network technology, intrusion detection plays an increasingly important role in network security. However, the “dimensional disaster” caused by massive data results in problems such as slow response and poor accuracy of the intrusion detection system. The KH algorithm is a new population-based swarm intelligence optimization method that shows good performance in high-dimensional data processing, providing a new approach for reducing the dimension of intrusion detection data and selecting useful features. In this paper, an improved KH algorithm named LNNLS-KH is proposed for feature selection of IDS datasets by linear nearest neighbor lasso optimization. The LNNLS-KH algorithm introduces a new fitness function composed of the number of selected features and the classification accuracy. Nonlinear optimization is introduced into the physical diffusion motion of krill individuals to accelerate the convergence of the algorithm. Moreover, the linear nearest neighbor lasso step optimization is proposed to balance the exploration and exploitation abilities and to obtain the global optimal solution of the feature subset effectively. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the LNNLS-KH algorithm retains 7 and 10.2 features on average, which greatly reduces the dimension of the features. In the NSL-KDD dataset, features are reduced by 44%, 42.86%, 34.88%, and 24.32% compared with the CMPSO, ACO, KH, and IKH algorithms, and in the CICIDS2017 dataset, they are reduced by 57.85%, 52.34%, 27.14%, and 25%, respectively. In addition, the classification accuracy of the LNNLS-KH feature selection algorithm is increased by 10.03% and 5.39%, and the time of intrusion detection is reduced by 12.41% and 4.03% on the two datasets. Furthermore, the LNNLS-KH algorithm enhances the ability to jump out of local optimal solutions and shows good performance in the optimal fitness iteration curve, false positive rate of detection, and convergence speed, which demonstrates that the proposed LNNLS-KH algorithm is an efficient feature selection method for network intrusion detection.

In this research, we found that the initialization of the LNNLS-KH algorithm has a certain degree of randomness. Therefore, we conducted independent and repeated experiments to address the problem, and the results were reasonable and convincing. Although the proposed algorithm shows encouraging performance, it could be further improved.

In future work, we will consider using data balancing techniques to preprocess the experimental dataset in order to obtain more accurate feature selection results and stronger algorithm stability. Meanwhile, we will combine LNNLS-KH with other algorithms to improve the exploration and exploitation abilities, thereby further shortening the time of feature subset training and classification detection. In addition, as the LNNLS-KH algorithm is broadly applicable, it can be applied to more feature selection systems and to optimization problems in other fields.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was sponsored by the National Key Research and Development Program of China (Grants 2018YFB0804002 and 2017YFB0803204), National Natural Science Foundation of P.R. China (Grant 72001191), Henan Natural Science Foundation (Grant 202300410442), and Henan Philosophy and Social Science Program (Grant 2020CZH009).

References

[1] W. Wei and C. Guo, “A text semantic topic discovery method based on the conditional co-occurrence degree,” Neurocomputing, vol. 368, pp. 11–24, 2019.

[2] C.-R. Wang, R.-F. Xu, S.-J. Lee, and C.-H. Lee, “Network intrusion detection using equality constrained-optimization-based extreme learning machines,” Knowledge-Based Systems, vol. 147, pp. 68–80, 2018.

[3] G.-G. Wang, A. H. Gandomi, A. H. Alavi, and D. Gong, “A comprehensive review of krill herd algorithm: variants, hybrids and applications,” Artificial Intelligence Review, vol. 51, no. 1, pp. 119–148, 2019.

[4] J. Amudhavel, D. Sathian, R. S. Raghav et al., “A fault tolerant distributed self-organization in peer to peer (p2p) using krill herd optimization,” in Proceedings of the 2015 International Conference on Advanced Research in Computer Science Engineering & Technology (ICARCSET 2015), pp. 1–5, Unnao, India, 2015.

[5] L. M. Abualigah, A. T. Khader, and E. S. Hanandeh, “Hybrid clustering analysis using improved krill herd algorithm,” Applied Intelligence, vol. 48, no. 11, pp. 4047–4071, 2018.

[6] P. A. Kowalski and S. Łukasik, “Training neural networks with krill herd algorithm,” Neural Processing Letters, vol. 44, no. 1, pp. 5–17, 2016.

[7] C. Stasinakis, G. Sermpinis, I. Psaradellis, and T. Verousis, “Krill-Herd Support Vector Regression and heterogeneous autoregressive leverage: evidence from forecasting and trading commodities,” Quantitative Finance, vol. 16, no. 12, pp. 1901–1915, 2016.


[8] L. Wang, P. Jia, T. Huang, S. Duan, J. Yan, and L. Wang, “A novel optimization technique to improve gas recognition by electronic noses based on the enhanced krill herd algorithm,” Sensors, vol. 16, no. 8, p. 1275, 2016.

[9] R. Jensi and G. W. Jiji, “An improved krill herd algorithm with global exploration capability for solving numerical function optimization problems and its application to data clustering,” Applied Soft Computing, vol. 46, pp. 230–245, 2016.

[10] H. Pulluri, R. Naresh, and V. Sharma, “Application of stud krill herd algorithm for solution of optimal power flow problems,” International Transactions on Electrical Energy Systems, vol. 27, no. 6, Article ID e2316, 2017.

[11] D. Rodrigues, L. A. M. Pereira, J. P. Papa et al., “A binary krill herd approach for feature selection,” in Proceedings of the 2014 22nd International Conference on Pattern Recognition, pp. 1407–1412, IEEE, Stockholm, Sweden, August 2014.

[12] A. Mukherjee and V. Mukherjee, “Chaotic krill herd algorithm for optimal reactive power dispatch considering FACTS devices,” Applied Soft Computing, vol. 44, pp. 163–190, 2016.

[13] S. Sun, H. Qi, F. Zhao, L. Ruan, and B. Li, “Inverse geometry design of two-dimensional complex radiative enclosures using krill herd optimization algorithm,” Applied Thermal Engineering, vol. 98, pp. 1104–1115, 2016.

[14] S. Sultana and P. K. Roy, “Oppositional krill herd algorithm for optimal location of capacitor with reconfiguration in radial distribution system,” International Journal of Electrical Power & Energy Systems, vol. 74, pp. 78–90, 2016.

[15] L. Brezocnik, I. Fister, and V. Podgorelec, “Swarm intelligence algorithms for feature selection: a review,” Applied Sciences, vol. 8, no. 9, 2018.

[16] D. Smith, Q. Guan, and S. Fu, “An anomaly detection framework for autonomic management of compute cloud systems,” in Proceedings of the 2010 IEEE 34th Annual Computer Software and Applications Conference Workshops, pp. 376–381, IEEE, Seoul, South Korea, July 2010.

[17] Y. Zhao, Y. Zhang, W. Tong et al., “An improved feature selection algorithm based on MAHALANOBIS distance for network intrusion detection,” in Proceedings of 2013 International Conference on Sensor Network Security Technology and Privacy Communication System, pp. 69–73, IEEE, Nangang, China, May 2013.

[18] P. Singh and A. Tiwari, “An efficient approach for intrusion detection in reduced features of KDD99 using ID3 and classification with KNNGA,” in Proceedings of the 2015 Second International Conference on Advances in Computing and Communication Engineering, pp. 445–452, IEEE, Dehradun, India, May 2015.

[19] M. A. Ambusaidi, X. He, P. Nanda, and Z. Tan, “Building an intrusion detection system using a filter-based feature selection algorithm,” IEEE Transactions on Computers, vol. 65, no. 10, pp. 2986–2998, 2016.

[20] N. Shone, T. N. Ngoc, V. D. Phai, and Q. Shi, “A deep learning approach to network intrusion detection,” IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 2, no. 1, pp. 41–50, 2018.

[21] Y. Xue, W. Jia, X. Zhao et al., “An evolutionary computation based feature selection method for intrusion detection,” Security and Communication Networks, vol. 2018, Article ID 2492956, 10 pages, 2018.

[22] Z. Shen, Y. Zhang, and W. Chen, “A bayesian classification intrusion detection method based on the fusion of PCA and LDA,” Security and Communication Networks, vol. 2019, Article ID 6346708, 11 pages, 2019.

[23] P. Sun, P. Liu, Q. Li et al., “DL-IDS: Extracting features using CNN-LSTM hybrid network for intrusion detection system,” Security and Communication Networks, vol. 2020, Article ID 8890306, 11 pages, 2020.

[24] G. Farahani, “Feature selection based on cross-correlation for the intrusion detection system,” Security & Communication Networks, vol. 2020, Article ID 8875404, 17 pages, 2020.

[25] F. G. Mohammadi, M. H. Amini, and H. R. Arabnia, “Applications of nature-inspired algorithms for dimension Reduction: enabling efficient data analytics,” in Advances in Intelligent Systems and Computing: Optimization, Learning, and Control for Interdependent Complex Networks, pp. 67–84, Springer, Cham, Switzerland, 2020.

[26] J. Kennedy and R. Eberhart, “Particle swarm optimization,” in Proceedings of the ICNN’95-International Conference on Neural Networks, no. 4, pp. 1942–1948, IEEE, Perth, WA, Australia, December 1995.

[27] M. Dorigo, M. Birattari, and T. Stutzle, “Ant colony optimization,” IEEE Computational Intelligence Magazine, vol. 1, no. 4, pp. 28–39, 2006.

[28] R. Rajabioun, “Cuckoo optimization algorithm,” Applied Soft Computing, vol. 11, no. 8, pp. 5508–5518, 2011.

[29] M. Neshat, G. Sepidnam, M. Sargolzaei, and A. N. Toosi, “Artificial fish swarm algorithm: a survey of the state-of-the-art, hybridization, combinatorial and indicative applications,” Artificial Intelligence Review, vol. 42, no. 4, pp. 965–997, 2014.

[30] D. Karaboga, “An idea based on honey bee swarm for numerical optimization,” Technical Report-tr06, Erciyes university, Engineering Faculty, Computer Engineering Department, Kayseri, Turkey, 2005.

[31] W.-T. Pan, “A new Fruit Fly Optimization Algorithm: taking the financial distress model as an example,” Knowledge-Based Systems, vol. 26, pp. 69–74, 2012.

[32] R. Zhao and W. Tang, “Monkey algorithm for global numerical optimization,” Journal of Uncertain Systems, vol. 2, no. 3, pp. 165–176, 2008.

[33] X. S. Yang and X. He, “Bat algorithm: literature review and applications,” International Journal of Bio-Inspired Computation, vol. 5, no. 3, pp. 141–149, 2013.

[34] S. Mirjalili, A. H. Gandomi, S. Z. Mirjalili, S. Saremi, H. Faris, and S. M. Mirjalili, “Salp Swarm Algorithm: a bio-inspired optimizer for engineering design problems,” Advances in Engineering Software, vol. 114, pp. 163–191, 2017.

[35] K. Ahmed, A. E. Hassanien, and S. Bhattacharyya, “A novel chaotic chicken swarm optimization algorithm for feature selection,” in Proceedings of the 2017 Third International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), pp. 259–264, IEEE, Kolkata, India, November 2017.

[36] S. Tabakhi, P. Moradi, F. Akhlaghian et al., “An unsupervised feature selection algorithm based on ant colony optimization,” Engineering Applications of Artificial Intelligence, vol. 32, pp. 112–123, 2014.

[37] S. Arora and P. Anand, “Binary butterfly optimization approaches for feature selection,” Expert Systems with Applications, vol. 116, pp. 147–160, 2019.

[38] C. Yan, J. Ma, H. Luo, and A. Patel, “Hybrid binary coral reefs optimization algorithm with simulated annealing for feature selection in high-dimensional biomedical datasets,” Chemometrics and Intelligent Laboratory Systems, vol. 184, pp. 102–111, 2019.

[39] G. I. Sayed, A. Tharwat, and A. E. Hassanien, “Chaotic dragonfly algorithm: an improved metaheuristic algorithm for feature selection,” Applied Intelligence, vol. 49, no. 1, pp. 188–205, 2019.

[40] Z. Zhang, P. Wei, Y. Li et al., “Feature selection algorithm based on improved particle swarm joint taboo search,” Journal of Communication, vol. 39, no. 12, pp. 60–68, 2018.

[41] A. H. Gandomi and A. H. Alavi, “Krill herd: a new bio-inspired optimization algorithm,” Communications in Nonlinear Science and Numerical Simulation, vol. 17, no. 12, pp. 4831–4845, 2012.

[42] Q. Tan and Z. Huang, “Krill herd with nearest neighbor lasso operator,” Computer Engineering and Applications, vol. 55, no. 9, pp. 124–129, 2019.

[43] Q. Wang, C. Ding, and X. Wang, “A hybrid data clustering algorithm based on improved krill herd algorithm and KHM clustering,” Control and Decision, vol. 35, no. 10, pp. 2449–2458, 2018.

[44] Q. Li and B. Liu, “Clustering using an improved krill herd algorithm,” Algorithms, vol. 10, no. 2, p. 56, 2017.

[45] G.-G. Wang, A. H. Gandomi, and A. H. Alavi, “Stud krill herd algorithm,” Neurocomputing, vol. 128, pp. 363–370, 2014.

[46] J. Li, Y. Tang, C. Hua, and X. Guan, “An improved krill herd algorithm: krill herd with linear decreasing step,” Applied Mathematics and Computation, vol. 234, pp. 356–367, 2014.

[47] H. B. Nguyen, B. Xue, P. Andreae et al., “Particle swarm optimisation with genetic operators for feature selection,” in Proceedings of the 2017 IEEE Congress on Evolutionary Computation (CEC), pp. 286–293, IEEE, San Sebastian, Spain, June 2017.

[48] M. H. Aghdam and P. Kabiri, “Feature selection for intrusion detection system using ant colony optimization,” International Journal of Network Security, vol. 18, no. 3, pp. 420–432, 2016.



in modern networks. The distribution of attack time and types of the CICIDS2017 dataset is shown in Table 8. We use the MachineLearningCVE file in the CICIDS2017 dataset as the dataset, which contains 78 features and an attack type label. The number and name of each feature are shown in Table 9. Compared with the NSL-KDD dataset, the attack types in the CICIDS2017 dataset are more in line with the situation of modern networks.

4.2. Experimental Results and Discussion of NSL-KDD Dataset. The experiment is conducted in MATLAB R2016a on a 64-bit Windows operating system with an Intel(R) Core(TM) i7-4790 processor. Since training the algorithm requires both normal and abnormal samples, we mix normal samples with different types of attack samples to construct training sets and test sets for four different attack types. In order to reduce the time of searching the optimal feature

Input: Training set
Output: Global best solution, the number of selected features, and feature selection time

(1) Begin
(2)   Initialize algorithm parameters Nmax, Vf, Dmax, NV, Imax, UB, LB
(3)   Initialize the krill herd position
(4)   Evaluate the fitness of krill individuals and find the individuals with the best and worst fitness values
(5)   for I = 1 to Imax do
(6)     for each krill individual i (i = 1, 2, ..., m) do
(7)       Calculate the three components of motion:
(8)         (1) The motion induced by other krill individuals
(9)         (2) The foraging activity
(10)        (3) The nonlinear optimized physical diffusion
(11)      Implement crossover operator
(12)      Update krill herd position and fitness values
(13)      Calculate the linear nearest neighbor lasso step and new position using equations (24) and (25), and update new fitness values
(14)      if Kyk > Ki (or Kj)
(15)        Leave Ki (or Kj) and delete Kyk
(16)      else
(17)        Leave Kyk and delete Ki (or Kj)
(18)      end if
(19)    end for
(20)    Update Xgb and Kgb of the globally optimal individuals
(21)  end for
(22)  Output the global best solution, the number of selected features, and feature selection time
(23) End

Algorithm 1: The LNNLS-KH algorithm.
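The if/else replacement rule near the end of Algorithm 1 can be sketched in a few lines. This is only an illustration, assuming that a smaller fitness value is better (which is consistent with the decreasing convergence curves reported for the algorithm); the function name `keep_better` and the toy fitness function are hypothetical and are not taken from the authors' MATLAB implementation.

```python
from typing import Callable, Sequence, Tuple

Position = Sequence[float]


def keep_better(
    current: Position,
    candidate: Position,
    fitness: Callable[[Position], float],
) -> Tuple[Position, float]:
    """Return whichever position has the lower (better) fitness value."""
    k_current = fitness(current)
    k_candidate = fitness(candidate)
    # If the candidate is worse, keep the current krill and discard the candidate.
    if k_candidate > k_current:
        return current, k_current
    # Otherwise the candidate replaces the current krill.
    return candidate, k_candidate


def sphere(x: Position) -> float:
    """Toy fitness (hypothetical): squared distance from the origin."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    best, value = keep_better([0.5, 0.5], [0.1, 0.2], sphere)
    print(best, value)
```

Because the rule never accepts a worse position, the best fitness in the herd cannot get worse after this step.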

Table 5: The distribution of sample categories.

Attack types | Attack names
DoS | Neptune, back, land, pod, smurf, teardrop, mailbomb, Apache2, processtable, udpstorm, worm
Probe | Ipsweep, nmap, portsweep, Satan, mscan, saint
R2L | ftp_write, guess_passwd, imap, multihop, phf, spy, warezclient, warezmaster, sendmail, named, snmpgetattack, snmpguess, xlock, xsnoop, httptunnel
U2R | buffer_overflow, loadmodule, perl, rootkit, ps, sqlattack, xterm

Table 6: The distribution of sample categories.

Data category | KDDTrain+ samples | KDDTest+ samples | Total number of samples
Normal | 65120 | 11536 | 76656
DoS | 36944 | 6251 | 43195
Probe | 10786 | 2421 | 13207
R2L | 995 | 2653 | 3648
U2R | 52 | 67 | 119
All | 113897 | 22928 | 136825


subset, we randomly select 50% of Probe attack samples, 10% of DoS attack samples, 100% of U2R attack samples, and 100% of R2L attack samples in the KDDTrain+ dataset as the training dataset, and 100% of the Probe dataset, 50% of the DoS dataset, 100% of the U2R dataset, and 20% of the R2L dataset in the KDDTest+ dataset as the test dataset.

For the LNNLS-KH algorithm, the maximum number of iterations Imax and the number of krill individuals N are set to 100 and 30, respectively. Following [41], the foraging speed of krill individuals Vf is set to 0.02, the maximum random diffusion rate Dmax is set to 0.05, and the maximum induction speed Nmax is set to 0.01. Following [47], the threshold θ is set to 0.7. Since the LNNLS-KH algorithm is designed to ensure high accuracy first and to reduce the number of features second, the weight factor α in the fitness function is set to 0.02.

$$\mathrm{FPR} = \frac{FP}{TN + FP}, \tag{27}$$

$$\mathrm{DR} = \frac{TP}{TP + FN}. \tag{28}$$
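Equations (27) and (28) translate directly into code. The sketch below is a minimal illustration; the function names and the confusion-matrix counts in the usage example are hypothetical and are not results from the paper.

```python
def false_positive_rate(fp: int, tn: int) -> float:
    """FPR = FP / (TN + FP): share of normal samples flagged as intrusions."""
    return fp / (tn + fp)


def detection_rate(tp: int, fn: int) -> float:
    """DR = TP / (TP + FN): share of attack samples that are detected."""
    return tp / (tp + fn)


if __name__ == "__main__":
    # Hypothetical confusion-matrix counts, for illustration only.
    tp, fn, fp, tn = 95, 5, 4, 96
    print(f"FPR = {false_positive_rate(fp, tn):.2%}")
    print(f"DR  = {detection_rate(tp, fn):.2%}")
```

Both measures use only one row of the confusion matrix: FPR is computed over the normal samples, and DR over the attack samples.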

We adopt the iterative curve of the global optimal fitness value, feature selection time, test set detection time, data dimension after feature selection, classification accuracy, detection rate (DR), and false positive rate (FPR) as evaluation measures of feature selection for IDS. Accuracy is the ratio of correctly classified samples to the total number of samples, as defined in equation (19). FPR, also known as the false alarm rate (FAR), is the ratio of normal samples that are incorrectly detected as intrusions to all normal samples, as shown in

Table 7: The features of NSL-KDD dataset.

Classification of features | Number | Serial number and name of features
The basic characteristics of TCP connections | 9 | (1) duration, (2) protocol_type, (3) service, (4) flag, (5) src_bytes, (6) dst_bytes, (7) land, (8) wrong_fragment, (9) urgent
The content characteristics of a TCP connection | 13 | (10) hot, (11) num_failed_logins, (12) logged_in, (13) num_compromised, (14) root_shell, (15) num_root, (16) su_attempted, (17) num_file_creations, (18) num_shells, (19) num_access_files, (20) num_outbound_cmds, (21) is_host_login, (22) is_guest_login
Time-based statistical characteristics of network traffic | 9 | (23) count, (24) srv_count, (25) serror_rate, (26) srv_serror_rate, (27) rerror_rate, (28) srv_rerror_rate, (29) same_srv_rate, (30) diff_srv_rate, (31) srv_diff_host_rate
Host-based network traffic statistics | 10 | (32) dst_host_count, (33) dst_host_srv_count, (34) dst_host_same_srv_rate, (35) dst_host_diff_srv_rate, (36) dst_host_same_src_port_rate, (37) dst_host_srv_diff_host_rate, (38) dst_host_serror_rate, (39) dst_host_srv_serror_rate, (40) dst_host_rerror_rate, (41) dst_host_srv_rerror_rate

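Table 8: Attack time and attack types of the CICIDS2017 dataset.

Time | Type | Label | Amount | Total
Monday | Normal | BENIGN | 529918 | 529918
Tuesday | Normal | BENIGN | 432074 | 445909
Tuesday | Brute force | FTP patator | 7938 |
Tuesday | Brute force | SSH patator | 5897 |
Wednesday | Normal | BENIGN | 440031 | 692703
Wednesday | DoS | DoS GoldenEye | 10293 |
Wednesday | DoS | DoS slowhttptest | 5499 |
Wednesday | DoS | DoS slowloris | 5796 |
Wednesday | DoS | Heart bleed | 11 |
Thursday morning | Normal | BENIGN | 168186 | 170366
Thursday morning | Web attack | Web attack brute force | 1507 |
Thursday morning | Web attack | Web attack sql injection | 21 |
Thursday morning | Web attack | Web attack XSS | 652 |
Thursday afternoon | Normal | BENIGN | 288566 | 288602
Thursday afternoon | Infiltration | Infiltrationdnt | 36 |
Friday morning | Normal | BENIGN | 189067 | 191033
Friday morning | Botnet | Bot | 1966 |
Friday afternoon (1) | Normal | BENIGN | 97718 | 225745
Friday afternoon (1) | DDoS | DDoS | 128027 |
Friday afternoon (2) | Normal | BENIGN | 127537 | 286467
Friday afternoon (2) | PortScan | PortScan | 158930 |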


equation (27). DR, also known as recall or sensitivity, represents the probability that an abnormal sample is correctly detected among all abnormal samples, as shown in equation (28). The crossover-mutation PSO (CMPSO) algorithm [47], ACO algorithm [48], KH algorithm [41], and IKH algorithm [9] are used for comparison. The experimental results on the Probe, DoS, R2L, and U2R datasets are as follows.

To reflect the performance of the LNNLS-KH algorithm intuitively, the convergence curves of the fitness function for the Probe, DoS, U2R, and R2L datasets are shown in Figure 9. The results show that the LNNLS-KH algorithm achieves a good fitness function value when the number of iterations reaches about 20, which demonstrates its strong exploitation ability and good convergence performance. As the number of iterations increases, the other algorithms show varying degrees of convergence stagnation, while the LNNLS-KH algorithm repeatedly jumps out of local optima and finds solutions with better fitness. The fitness function values after 100 iterations reach 0.0328, 0.0393, 0.0292, and 0.0036, respectively, for the four attack datasets, showing excellent exploration ability. Therefore, compared with the CMPSO, ACO, KH, and IKH algorithms, the LNNLS-KH algorithm exhibits faster convergence and stronger exploitation and exploration abilities.

The results of different feature selection algorithms are shown in Table 10. The bold number in front of the brackets indicates the number of features retained after feature selection, and the specific feature numbers are listed in the brackets. The comparison of feature selection dimensions is shown in Figure 10, where different colours distinguish the five algorithms. The proposed LNNLS-KH algorithm, marked in red, is in the innermost circle of Figure 10 for the Probe, DoS, U2R, and R2L datasets. This indicates that, compared with the other four feature selection algorithms, the LNNLS-KH algorithm retains the fewest features while ensuring accuracy. According to Figure 10, the LNNLS-KH algorithm selects 7 main features of the NSL-KDD dataset on average, accounting

Figure 9: Convergence curve of fitness functions for the four attack datasets (panels: DoS, Probe, R2L, and U2R; horizontal axis: number of iterations, 0 to 100; vertical axis: fitness function; curves: CMPSO, ACO, KH, IKH, and LNNLS-KH).


for 17.07% of the total number of features. Compared with the CMPSO, ACO, KH, and IKH algorithms, the proposed LNNLS-KH algorithm reduces the number of features by 44%, 42.86%, 34.88%, and 24.32%, respectively, across the four attack datasets. Meanwhile, the total number of features selected in the four attack datasets is reduced by 37.43% relative to the average of the four comparison algorithms.
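The percentages quoted above can be reproduced from the feature counts in Table 10. The short script below is a worked check rather than part of the original experiments; the variable names are illustrative, and the interpretation of the last figure (reduction relative to the mean of the four baseline averages) is an assumption that happens to match the reported 37.43%.

```python
# Number of selected features per attack dataset (Probe, DoS, U2R, R2L), from Table 10.
selected = {
    "CMPSO": [14, 16, 9, 11],
    "ACO": [15, 16, 8, 10],
    "KH": [13, 12, 8, 10],
    "IKH": [11, 12, 6, 8],
    "LNNLS-KH": [8, 10, 3, 7],
}
TOTAL_FEATURES = 41  # NSL-KDD features listed in Table 7


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


avg = {name: mean(counts) for name, counts in selected.items()}
ours = avg["LNNLS-KH"]

# Share of all features retained by LNNLS-KH.
retained_pct = 100 * ours / TOTAL_FEATURES

# Relative reduction against each baseline's average feature count.
reduction_pct = {
    name: 100 * (value - ours) / value
    for name, value in avg.items()
    if name != "LNNLS-KH"
}

# Reduction against the mean of the four baseline averages (assumed reading).
baseline_mean = mean([v for n, v in avg.items() if n != "LNNLS-KH"])
overall_pct = 100 * (baseline_mean - ours) / baseline_mean

if __name__ == "__main__":
    print(f"retained: {retained_pct:.2f}%")
    for name, pct in reduction_pct.items():
        print(f"vs {name}: {pct:.2f}%")
    print(f"overall: {overall_pct:.2f}%")
```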

To further evaluate the performance of the feature selection algorithms, we show the feature selection time and detection time of the five algorithms in Table 11. Feature selection time is the time spent filtering out redundant features. Detection time is the time from inputting the most representative feature subset into the KNN classifier to the end of detection. Table 11 shows that the feature selection time of the standard KH algorithm is shorter than that of the CMPSO and ACO algorithms, which indicates that the KH algorithm is faster. In addition, compared with the standard KH algorithm, the feature selection time of the LNNLS-KH algorithm is longer, mainly because of the nonlinear optimization of the physical diffusion motion and the linear nearest neighbor lasso step optimization performed after the krill herd position is updated. Although the feature selection time increases, the convergence speed and global search ability are greatly improved. At the same time, the LNNLS-KH algorithm removes redundant features, which considerably increases the detection speed. In comparison with the other four feature selection algorithms, the detection time of the LNNLS-KH algorithm is reduced by 16.83%, 16.91%, 8.94%, and 6.96% on average on the test dataset samples of Probe, DoS, R2L, and U2R.

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the test dataset is classified using the KNN classifier. The classification accuracy of the different algorithms is shown in Table 12. Comparing the accuracy results, the LNNLS-KH feature selection algorithm achieves a classification accuracy above 90% for the Probe, DoS, U2R, and R2L test

Table 9: The number and name of the features in the CICIDS2017 dataset.

Feature number | Feature name | Feature number | Feature name | Feature number | Feature name
1 | Destination port | 27 | Bwd IAT mean | 53 | Average packet size
2 | Flow duration | 28 | Bwd IAT std | 54 | Avg fwd segment size
3 | Total fwd packets | 29 | Bwd IAT max | 55 | Avg bwd segment size
4 | Total backward packets | 30 | Bwd IAT min | 56 | Fwd header length
5 | Total length of fwd packets | 31 | Fwd PSH flags | 57 | Fwd avg bytes/bulk
6 | Total length of bwd packets | 32 | Bwd PSH flags | 58 | Fwd avg packets/bulk
7 | Fwd packet length max | 33 | Fwd URG flags | 59 | Fwd avg bulk rate
8 | Fwd packet length min | 34 | Bwd URG flags | 60 | Bwd avg bytes/bulk
9 | Fwd packet length mean | 35 | Fwd header length | 61 | Bwd avg packets/bulk
10 | Fwd packet length std | 36 | Bwd header length | 62 | Bwd avg bulk rate
11 | Bwd packet length max | 37 | Fwd Packets/s | 63 | Subflow fwd packets
12 | Bwd packet length min | 38 | Bwd Packets/s | 64 | Subflow fwd bytes
13 | Bwd packet length mean | 39 | Min packet length | 65 | Subflow bwd packets
14 | Bwd packet length std | 40 | Max packet length | 66 | Subflow bwd bytes
15 | Flow bytes/s | 41 | Packet length mean | 67 | Init_Win_bytes_forward
16 | Flow packets/s | 42 | Packet length std | 68 | Init_Win_bytes_backward
17 | Flow IAT mean | 43 | Packet length variance | 69 | act_data_pkt_fwd
18 | Flow IAT std | 44 | FIN flag count | 70 | min_seg_size_forward
19 | Flow IAT max | 45 | SYN flag count | 71 | Active mean
20 | Flow IAT min | 46 | RST flag count | 72 | Active std
21 | Fwd IAT total | 47 | PSH flag count | 73 | Active max
22 | Fwd IAT mean | 48 | ACK flag count | 74 | Active min
23 | Fwd IAT std | 49 | URG flag count | 75 | Idle mean
24 | Fwd IAT max | 50 | CWE flag count | 76 | Idle std
25 | Fwd IAT min | 51 | ECE flag count | 77 | Idle max
26 | Bwd IAT total | 52 | Down/up ratio | 78 | Idle min

Figure 10: Comparison of feature selection dimensions produced by different algorithms (axes: Probe, DoS, U2R, and R2L, scaled from 0 to 20; curves: CMPSO, ACO, KH, IKH, and LNNLS-KH).


dataset samples. Furthermore, the LNNLS-KH algorithm improves the average classification accuracy on the Probe, DoS, U2R, and R2L test dataset samples by 9.95%, 12.04%, 9.47%, and 8.66%, respectively.
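The evaluation step described here, classifying the test set with KNN while using only the selected features, can be sketched as follows. This is a simplified pure-Python illustration and not the authors' MATLAB code: the helper names, the Euclidean distance, and the choice k = 3 are assumptions, and the toy samples are invented for the example.

```python
from collections import Counter
from math import dist


def select(sample, feature_ids):
    """Keep only the chosen features (1-based numbering, as in the tables)."""
    return [sample[i - 1] for i in feature_ids]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train_x, train_y, test_x, test_y, feature_ids, k=3):
    """Fraction of test samples classified correctly using only feature_ids."""
    tr = [select(x, feature_ids) for x in train_x]
    te = [select(x, feature_ids) for x in test_x]
    hits = sum(knn_predict(tr, train_y, q, k) == y for q, y in zip(te, test_y))
    return hits / len(test_y)


# Toy data (hypothetical): features 1 and 3 separate the classes, 2 and 4 are noise.
TRAIN_X = [
    [0.0, 9.0, 0.1, 5.0], [0.2, 1.0, 0.0, 7.0], [0.1, 4.0, 0.2, 2.0],
    [1.0, 8.0, 0.9, 6.0], [0.9, 2.0, 1.0, 1.0], [0.8, 5.0, 1.1, 3.0],
]
TRAIN_Y = ["normal"] * 3 + ["attack"] * 3
TEST_X = [[0.1, 7.0, 0.1, 4.0], [0.9, 3.0, 1.0, 5.0]]
TEST_Y = ["normal", "attack"]

if __name__ == "__main__":
    print(accuracy(TRAIN_X, TRAIN_Y, TEST_X, TEST_Y, feature_ids=[1, 3]))
```

Dropping the noisy columns before computing distances is what lets a smaller feature subset reduce detection time without necessarily hurting accuracy.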

Table 13 shows the false positive rate and detection rate of the feature subsets produced by different feature selection algorithms. To visualize the difference, we show the comparison in Figure 11. For the Probe, DoS, U2R, and R2L datasets, the average false positive rate of the LNNLS-KH feature selection algorithm is 4.00%, which is 20.70%, 15.30%, 8.88%, and 3.34% lower than that of the CMPSO, ACO, KH, and IKH algorithms, respectively. Similarly, for the detection rate, the proposed LNNLS-KH feature selection algorithm exhibits excellent performance. The average detection rate of

Table 10: The feature selection results of different feature selection algorithms (NSL-KDD dataset).

Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH
Probe | 14 (2, 3, 4, 7, 8, 10, 11, 17, 19, 20, 21, 27, 30, 33) | 15 (1, 3, 4, 6, 15, 16, 17, 19, 21, 23, 29, 35, 39, 40, 41) | 13 (3, 4, 5, 7, 8, 13, 14, 18, 19, 21, 26, 28, 40) | 11 (2, 3, 5, 8, 10, 17, 18, 29, 34, 35, 41) | 8 (3, 4, 8, 11, 15, 29, 34, 40)
DoS | 16 (3, 4, 5, 6, 8, 13, 14, 17, 18, 22, 23, 26, 30, 32, 35, 41) | 16 (3, 4, 7, 12, 14, 19, 20, 25, 27, 28, 30, 33, 34, 37, 40, 41) | 12 (2, 3, 4, 5, 8, 9, 12, 15, 19, 24, 26, 30) | 12 (2, 3, 4, 6, 12, 18, 20, 22, 27, 28, 30, 31) | 10 (3, 4, 6, 15, 17, 19, 20, 21, 30, 37)
U2R | 9 (3, 4, 5, 9, 12, 19, 32, 33, 41) | 8 (3, 4, 6, 8, 20, 24, 33, 36) | 8 (3, 4, 10, 12, 19, 23, 31, 32) | 6 (3, 10, 11, 21, 36, 39) | 3 (3, 33, 36)
R2L | 11 (2, 3, 4, 8, 21, 22, 25, 27, 37, 40, 41) | 10 (3, 4, 7, 12, 17, 21, 29, 37, 38, 40) | 10 (2, 3, 4, 6, 13, 18, 19, 22, 32, 41) | 8 (3, 4, 5, 8, 11, 14, 21, 31) | 7 (2, 3, 4, 10, 15, 21, 36)

Table 11: Feature selection time and detection time of different feature selection algorithms (NSL-KDD dataset).

Data categories | Time of feature selection (second): CMPSO, ACO, KH, IKH, LNNLS-KH | Time of detection (second): CMPSO, ACO, KH, IKH, LNNLS-KH
Probe | 523178, 499814, 474533, 534887, 549048 | 3713, 3823, 3530, 3405, 3106
DoS | 789235, 763086, 716852, 803816, 829692 | 11869, 11815, 10666, 10514, 9844
U2R | 15487, 14729, 14418, 15779, 17224 | 0087, 0086, 0086, 0086, 0078
R2L | 255675, 236908, 224092, 266951, 272770 | 955, 913, 907, 862, 803

Table 12: The classification accuracy of different feature selection algorithms (NSL-KDD dataset).

Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%)
Probe | 80.46 | 86.56 | 92.42 | 93.74 | 98.24
DoS | 81.74 | 83.36 | 86.03 | 88.74 | 97.01
U2R | 82.74 | 84.57 | 85.59 | 91.89 | 95.67
R2L | 78.70 | 81.62 | 88.78 | 90.49 | 93.56

Figure 11: Comparison of classification FPR and DR of different feature selection algorithms on the Probe, DoS, U2R, and R2L datasets. (a) FPR (%) of different feature selection algorithms. (b) DR (%) of different feature selection algorithms.


the LNNLS-KH algorithm is 96.48%, which is 13.47%, 9.32%, 7.02%, and 4.72% higher than that of the CMPSO, ACO, KH, and IKH feature selection algorithms, respectively.
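The averages and gaps quoted for FPR and DR follow from the per-dataset values in Table 13. The script below is a worked check under the assumption that each reported figure is a simple mean over the Probe, DoS, U2R, and R2L datasets; the variable and function names are illustrative.

```python
# FPR and DR (%) on Probe, DoS, U2R, R2L, read from Table 13.
fpr = {
    "CMPSO": [22.37, 21.27, 24.51, 30.66],
    "ACO": [18.04, 14.08, 21.04, 24.05],
    "KH": [8.50, 11.45, 16.13, 15.42],
    "IKH": [4.05, 7.88, 8.45, 8.99],
    "LNNLS-KH": [1.18, 2.85, 4.30, 7.67],
}
dr = {
    "CMPSO": [82.32, 79.12, 87.02, 83.56],
    "ACO": [89.18, 82.08, 89.79, 87.56],
    "KH": [95.01, 83.77, 90.14, 88.91],
    "IKH": [95.22, 85.23, 93.67, 92.89],
    "LNNLS-KH": [97.73, 96.80, 95.52, 95.85],
}


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def gaps(table, ours="LNNLS-KH"):
    """Absolute difference between each baseline's mean and the LNNLS-KH mean."""
    ref = mean(table[ours])
    return {name: abs(mean(vals) - ref) for name, vals in table.items() if name != ours}


fpr_gaps = gaps(fpr)  # how much lower the LNNLS-KH FPR is, in percentage points
dr_gaps = gaps(dr)    # how much higher the LNNLS-KH DR is, in percentage points

if __name__ == "__main__":
    print("mean FPR:", mean(fpr["LNNLS-KH"]), "mean DR:", mean(dr["LNNLS-KH"]))
    print("FPR gaps:", fpr_gaps)
    print("DR gaps:", dr_gaps)
```

Note that the differences are expressed in percentage points (differences of means), not as relative changes.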

In conclusion, the LNNLS-KH feature selection algorithm performs well in terms of the global optimal fitness iteration curve, test set detection time, number of dimensions of the feature subset, classification accuracy, false positive rate, and detection rate. Although the offline training time of the LNNLS-KH algorithm is longer than that of the CMPSO, ACO, KH, and IKH algorithms, its lower feature dimension reduces the detection time. Moreover, the algorithm has faster convergence, higher detection accuracy, a lower false positive rate, and a higher detection rate.

4.3. Experimental Results and Discussion of CICIDS2017 Dataset. The experiment is conducted in MATLAB R2016a on a 64-bit Windows operating system with an Intel(R) Core(TM) i7-4790 processor. The MachineLearningCVE file in the CICIDS2017 dataset includes 8 CSV files of all traffic data, which contain 78 features plus an attack type tag after removing some duplicate features. We annotate traffic records according to the different attack periods and types, and standardize and normalize the dataset. Because of the excessive amount of data contained in the CSV files, problems such as excessively long run time and slow convergence occur when the host is used for model training. Therefore, we simplified and reintegrated these CSV data files while preserving the original attack

timing features. We selected a total of 12090 records and 5 types of traffic, including 1 type of normal traffic and 4 types of attack traffic: “DoS,” “DDoS,” “PortScan,” and “WebAttack.” The data are randomly divided into training sets and test sets in a 2:1 ratio, with independent and repeated experiments.
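A random 2:1 split of this kind can be sketched as below. This is a minimal illustration, not the authors' preprocessing code: the function name `split_two_to_one`, the use of a seed, and the placeholder record ids are assumptions. Repeating the call with different seeds gives the independent, repeated experiments mentioned above.

```python
import random


def split_two_to_one(records, seed=0):
    """Shuffle and split records into training and test sets at a 2:1 ratio."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # 12090 placeholder record ids stand in for the preprocessed traffic records.
    for seed in range(3):
        train, test = split_two_to_one(range(12090), seed=seed)
        print(seed, len(train), len(test))
```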

The CMPSO, ACO, KH, and IKH algorithms are used for comparison with the LNNLS-KH algorithm. The preprocessed Normal, DoS, DDoS, PortScan, and WebAttack subsets are input into the algorithm model successively, and the dimension and feature subsets of feature selection are obtained. We adopt the KNN classification model as the classifier and obtain the accuracy of intrusion detection on the test set data. The results of feature selection dimension for the CICIDS2017 dataset are shown in Table 14. The LNNLS-KH algorithm selects different features for different attack types. For example, the selected features of the DoS subset are “Total Length of Bwd Packets,” “Fwd Packet Length Min,” “Flow IAT Min,” “FIN Flag Count,” “RST Flag Count,” “URG Flag Count,” “Bwd Avg Packets/Bulk,” “Idle Mean,” and “Idle Std.” For the WebAttack subset, “Total Fwd Packets,” “Bwd IAT Max,” “Bwd PSH Flags,” “Fwd Packets/s,” “Bwd Avg Packets/Bulk,” “Subflow Fwd Bytes,” “Active Max,” and “Idle Max” are selected as attack features by the LNNLS-KH algorithm. The algorithm reduces the feature dimension of the IDS dataset while ensuring high accuracy. The average feature dimension selected by the LNNLS-KH algorithm is 10.2, accounting for 13.08% of the total number of features in the CICIDS2017 dataset. It decreases the

Table 13: The classification FPR and DR of different feature selection algorithms (NSL-KDD dataset).

False positive rate (FPR) (%)
Data categories   CMPSO   ACO     KH      IKH     LNNLS-KH
Probe             22.37   18.04    8.50    4.05    1.18
DoS               21.27   14.08   11.45    7.88    2.85
U2R               24.51   21.04   16.13    8.45    4.30
R2L               30.66   24.05   15.42    8.99    7.67

Detection rate (DR) (%)
Data categories   CMPSO   ACO     KH      IKH     LNNLS-KH
Probe             82.32   89.18   95.01   95.22   97.73
DoS               79.12   82.08   83.77   85.23   96.80
U2R               87.02   89.79   90.14   93.67   95.52
R2L               83.56   87.56   88.91   92.89   95.85

Figure 12: Comparison of feature selection time and intrusion detection time for different feature selection algorithms (CMPSO, ACO, KH, IKH, and LNNLS-KH) on the Normal, DoS, DDoS, PortScan, and WebAttack subsets. (a) Feature selection time (seconds) for different feature selection algorithms. (b) Intrusion detection time (seconds) of different feature selection algorithms.

18 Security and Communication Networks

number of features by 57.85%, 52.34%, 27.14%, and 25%, respectively, compared with the CMPSO, ACO, KH, and IKH algorithms.
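The reduction figures can be rechecked from the per-subset feature counts reported in Table 14. The short Python sketch below is only an arithmetic check; the variable names are illustrative and not part of the original experiment.

```python
# Selected-feature counts per subset (Normal, DoS, DDoS, PortScan, WebAttack),
# copied from Table 14.
counts = {
    "CMPSO": [28, 24, 29, 24, 16],
    "ACO": [25, 19, 27, 21, 15],
    "KH": [14, 13, 21, 14, 8],
    "IKH": [14, 14, 18, 15, 7],
    "LNNLS-KH": [8, 9, 14, 12, 8],
}
TOTAL_FEATURES = 78  # number of features in the CICIDS2017 dataset

def mean(values):
    return sum(values) / len(values)

ours = mean(counts["LNNLS-KH"])           # average dimension kept by LNNLS-KH
share = 100 * ours / TOTAL_FEATURES       # percentage of all 78 features
reductions = {
    name: round(100 * (1 - ours / mean(vals)), 2)
    for name, vals in counts.items()
    if name != "LNNLS-KH"
}
print(round(ours, 1), round(share, 2), reductions)
```

Each reduction is computed relative to the average dimension of the compared algorithm, which is how the four percentages above are obtained.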

Figure 12 shows the feature selection time and intrusion detection time of 5 different feature selection algorithms to further evaluate the performance of the feature selection algorithm. It can be seen from Figure 12(a) that, in the feature selection stage, the LNNLS-KH algorithm takes longer to find the optimal feature subset because of the linear nearest neighbor lasso step optimization performed after the position update of the krill herd. Compared with the KH and IKH algorithms, it increases the time by an average of 14.38% and 9.32%. Although the LNNLS-KH algorithm requires more calculation time, the convergence speed and global search ability are improved. Figure 12(b) shows the intrusion detection time of 5 different feature selection algorithms. It is the detection time of the sample dataset by the KNN classifier after the feature subset is found, excluding the time of searching for the optimal feature subset. The feature dimension of the LNNLS-KH algorithm is low, and the amount of data processed in the classification of the detection sample dataset is small, which results in a reduction of the classification detection time. Compared with the CMPSO, ACO, KH, and IKH algorithms, the intrusion detection time of the LNNLS-KH algorithm is reduced by 6.52%, 5.17%, 2.14%, and 2.28% on average.

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the KNN classifier is used to detect the test dataset. The classification accuracy of different algorithms is shown in Table 15. For the five types of subsets, the average classification accuracy of the proposed LNNLS-KH algorithm is 95.86%. In particular, the classification accuracy reaches 97.55% for the PortScan subset. Compared with the other four feature selection methods, the LNNLS-KH algorithm has an average increase of 3.11%, 8.52%, 8.58%, 2.45%, and 4.29% on the

Table 14: The number of selected features for different algorithms (CICIDS2017 dataset). Each entry gives the number of selected features, followed by the selected feature numbers in brackets.

Normal
  CMPSO:    28 (3, 7, 13, 15, 16, 17, 20, 22, 24, 26, 30, 35, 37, 38, 42, 43, 44, 45, 46, 49, 50, 56, 59, 62, 63, 64, 65, 76)
  ACO:      25 (1, 3, 4, 7, 10, 11, 12, 13, 15, 19, 29, 32, 34, 35, 37, 43, 46, 47, 51, 55, 56, 58, 73, 76, 78)
  KH:       14 (11, 19, 33, 39, 43, 49, 55, 56, 58, 65, 66, 68, 71, 73)
  IKH:      14 (5, 10, 19, 20, 21, 23, 27, 33, 43, 56, 69, 70, 73, 78)
  LNNLS-KH: 8 (6, 12, 16, 32, 38, 50, 54, 73)

DoS
  CMPSO:    24 (1, 3, 4, 13, 16, 17, 24, 26, 30, 33, 35, 39, 40, 44, 48, 51, 53, 57, 58, 59, 60, 62, 67, 70)
  ACO:      19 (3, 6, 12, 13, 15, 26, 35, 39, 51, 55, 60, 61, 66, 69, 71, 73, 75, 77, 78)
  KH:       13 (8, 16, 21, 30, 45, 50, 52, 57, 59, 63, 66, 67)
  IKH:      14 (2, 12, 15, 16, 19, 21, 32, 34, 44, 46, 65, 68, 76, 77)
  LNNLS-KH: 9 (6, 8, 20, 44, 46, 49, 61, 75, 76)

DDoS
  CMPSO:    29 (15, 18, 19, 20, 23, 25, 26, 33, 34, 35, 38, 39, 42, 43, 46, 47, 49, 51, 55, 56, 57, 59, 60, 61, 62, 63, 71, 72, 78)
  ACO:      27 (6, 9, 10, 13, 16, 19, 24, 28, 31, 41, 42, 45, 47, 48, 50, 51, 52, 53, 54, 56, 59, 60, 61, 62, 65, 68, 72)
  KH:       21 (10, 12, 13, 15, 18, 23, 27, 30, 34, 35, 41, 42, 45, 55, 61, 63, 65, 66, 68, 70, 76)
  IKH:      18 (1, 11, 13, 14, 19, 24, 32, 35, 36, 40, 42, 47, 51, 57, 60, 69, 70, 75)
  LNNLS-KH: 14 (2, 5, 8, 9, 11, 22, 26, 33, 41, 43, 47, 51, 74, 77)

PortScan
  CMPSO:    24 (1, 3, 6, 15, 16, 28, 30, 33, 35, 37, 44, 45, 52, 56, 59, 60, 61, 63, 65, 68, 70, 75, 77, 78)
  ACO:      21 (1, 2, 6, 10, 15, 17, 26, 27, 29, 39, 42, 43, 46, 49, 58, 61, 66, 69, 70, 71, 76)
  KH:       14 (15, 20, 22, 27, 37, 44, 49, 50, 53, 59, 62, 65, 67, 78)
  IKH:      15 (1, 24, 30, 32, 33, 43, 49, 53, 54, 58, 60, 61, 63, 64, 69)
  LNNLS-KH: 12 (2, 6, 15, 24, 25, 28, 32, 57, 59, 63, 66, 76)

WebAttack
  CMPSO:    16 (2, 7, 26, 29, 45, 47, 50, 52, 53, 54, 63, 66, 68, 69, 72, 78)
  ACO:      15 (3, 9, 10, 12, 19, 26, 40, 46, 50, 54, 64, 65, 68, 69, 73)
  KH:       8 (1, 17, 19, 36, 48, 49, 53, 60)
  IKH:      7 (14, 17, 35, 39, 44, 48, 54)
  LNNLS-KH: 8 (3, 29, 32, 37, 61, 64, 73, 77)

Table 15: The classification accuracy of different feature selection algorithms (CICIDS2017 dataset).

Data categories   CMPSO (%)   ACO (%)   KH (%)   IKH (%)   LNNLS-KH (%)
Normal            89.78       89.06     92.70    94.58     94.64
DoS               77.03       82.69     90.90    93.34     94.51
DDoS              81.73       86.94     91.85    88.19     95.76
PortScan          92.38       95.64     95.05    97.35     97.55
WebAttack         89.12       93.08     93.77    94.26     96.85

Table 16: The classification FPR and DR of different feature selection algorithms (CICIDS2017 dataset).

False positive rate (FPR) (%)
Data categories   CMPSO   ACO     KH      IKH     LNNLS-KH
Normal            9.25    8.72    6.41    4.93    3.67
DoS               5.41    4.48    4.06    2.83    1.94
DDoS              6.85    4.92    4.54    6.33    3.18
PortScan          4.65    3.02    2.84    1.86    1.16
WebAttack         5.33    3.16    2.52    2.11    1.60

Detection rate (DR) (%)
Data categories   CMPSO   ACO     KH      IKH     LNNLS-KH
Normal            88.05   88.51   89.25   92.46   93.89
DoS               72.57   82.89   87.86   92.56   92.64
DDoS              79.03   83.47   90.22   87.52   92.98
PortScan          88.25   93.80   94.33   95.14   95.42
WebAttack         87.40   91.35   92.19   92.94   94.77


Normal, DoS, DDoS, PortScan, and WebAttack subsets, respectively. Table 16 shows the classification FPR and DR of different feature selection algorithms on the test sets. On all five test sets, the LNNLS-KH algorithm has a lower FPR and a higher DR than the other four algorithms.

We propose the LNNLS-KH algorithm, a novel feature selection algorithm for intrusion detection. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the algorithm has good feature selection performance and improves the efficiency of intrusion detection.

5. Conclusions

With the rapid development of network technology, intrusion detection plays an increasingly important role in network security. However, the “dimensional disaster” caused by massive data results in problems such as slow response and poor accuracy of the intrusion detection system. The KH algorithm is a new population-based swarm intelligence optimization method that shows good performance in high-dimensional data processing, providing a new approach for reducing the dimension of intrusion detection data and selecting useful features. In this paper, an improved KH algorithm named LNNLS-KH is proposed for feature selection of IDS datasets by linear nearest neighbor lasso optimization. The LNNLS-KH algorithm introduces a new fitness function, which is composed of the number of feature selection dimensions and the classification accuracy. Nonlinear optimization is introduced into the physical diffusion motion of krill individuals to accelerate the convergence speed of the algorithm. Moreover, the linear nearest neighbor lasso step optimization is proposed to balance the exploration and exploitation abilities and to obtain the global optimal solution of the feature subset effectively. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the LNNLS-KH algorithm retains 7 and 10.2 features on average, which greatly reduces the dimension of the features. In the NSL-KDD dataset, features are reduced by 44%, 42.86%, 34.88%, and 24.32% compared with the CMPSO, ACO, KH, and IKH algorithms, and in the CICIDS2017 dataset they are reduced by 57.85%, 52.34%, 27.14%, and 25%, respectively. In addition, the classification accuracy of the LNNLS-KH feature selection algorithm is increased by 10.03% and 5.39%, and the time of intrusion detection is reduced by 12.41% and 4.03% on the two datasets. Furthermore, the LNNLS-KH algorithm enhances the ability to jump out of local optimal solutions and shows good performance in the optimal fitness iteration curve, false positive rate of detection, and convergence speed, which demonstrates that the proposed LNNLS-KH algorithm is an efficient feature selection method for network intrusion detection.

In this research, we realized that the initialization of the LNNLS-KH algorithm has a certain degree of randomness. Therefore, we conducted independent and repeated experiments to mitigate this problem, and the results were reasonable and convincing. Although the proposed algorithm shows encouraging performance, it could be further improved.

In future work, we will consider using data balancing techniques to preprocess the experimental dataset in order to obtain more accurate feature selection results and stronger algorithm stability. Meanwhile, we will combine LNNLS-KH with other algorithms to improve the exploration and exploitation abilities, thereby further shortening the time of feature subset training and classification detection. Moreover, as the LNNLS-KH algorithm is universally applicable, it can be applied to more feature selection systems and used to solve optimization problems in other fields.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was sponsored by the National Key Research and Development Program of China (Grants 2018YFB0804002 and 2017YFB0803204), National Natural Science Foundation of P.R. China (Grant 72001191), Henan Natural Science Foundation (Grant 202300410442), and Henan Philosophy and Social Science Program (Grant 2020CZH009).

References

[1] W. Wei and C. Guo, “A text semantic topic discovery method based on the conditional co-occurrence degree,” Neurocomputing, vol. 368, pp. 11–24, 2019.

[2] C.-R. Wang, R.-F. Xu, S.-J. Lee, and C.-H. Lee, “Network intrusion detection using equality constrained-optimization-based extreme learning machines,” Knowledge-Based Systems, vol. 147, pp. 68–80, 2018.

[3] G.-G. Wang, A. H. Gandomi, A. H. Alavi, and D. Gong, “A comprehensive review of krill herd algorithm: variants, hybrids and applications,” Artificial Intelligence Review, vol. 51, no. 1, pp. 119–148, 2019.

[4] J. Amudhavel, D. Sathian, R. S. Raghav et al., “A fault tolerant distributed self-organization in peer to peer (p2p) using krill herd optimization,” in Proceedings of the 2015 International Conference on Advanced Research in Computer Science Engineering & Technology (ICARCSET 2015), pp. 1–5, Unnao, India, 2015.

[5] L. M. Abualigah, A. T. Khader, and E. S. Hanandeh, “Hybrid clustering analysis using improved krill herd algorithm,” Applied Intelligence, vol. 48, no. 11, pp. 4047–4071, 2018.

[6] P. A. Kowalski and S. Łukasik, “Training neural networks with krill herd algorithm,” Neural Processing Letters, vol. 44, no. 1, pp. 5–17, 2016.

[7] C. Stasinakis, G. Sermpinis, I. Psaradellis, and T. Verousis, “Krill-Herd Support Vector Regression and heterogeneous autoregressive leverage: evidence from forecasting and trading commodities,” Quantitative Finance, vol. 16, no. 12, pp. 1901–1915, 2016.


[8] L. Wang, P. Jia, T. Huang, S. Duan, J. Yan, and L. Wang, “A novel optimization technique to improve gas recognition by electronic noses based on the enhanced krill herd algorithm,” Sensors, vol. 16, no. 8, p. 1275, 2016.

[9] R. Jensi and G. W. Jiji, “An improved krill herd algorithm with global exploration capability for solving numerical function optimization problems and its application to data clustering,” Applied Soft Computing, vol. 46, pp. 230–245, 2016.

[10] H. Pulluri, R. Naresh, and V. Sharma, “Application of stud krill herd algorithm for solution of optimal power flow problems,” International Transactions on Electrical Energy Systems, vol. 27, no. 6, Article ID e2316, 2017.

[11] D. Rodrigues, L. A. M. Pereira, J. P. Papa et al., “A binary krill herd approach for feature selection,” in Proceedings of the 2014 22nd International Conference on Pattern Recognition, pp. 1407–1412, IEEE, Stockholm, Sweden, August 2014.

[12] A. Mukherjee and V. Mukherjee, “Chaotic krill herd algorithm for optimal reactive power dispatch considering FACTS devices,” Applied Soft Computing, vol. 44, pp. 163–190, 2016.

[13] S. Sun, H. Qi, F. Zhao, L. Ruan, and B. Li, “Inverse geometry design of two-dimensional complex radiative enclosures using krill herd optimization algorithm,” Applied Thermal Engineering, vol. 98, pp. 1104–1115, 2016.

[14] S. Sultana and P. K. Roy, “Oppositional krill herd algorithm for optimal location of capacitor with reconfiguration in radial distribution system,” International Journal of Electrical Power & Energy Systems, vol. 74, pp. 78–90, 2016.

[15] L. Brezocnik, I. Fister, and V. Podgorelec, “Swarm intelligence algorithms for feature selection: a review,” Applied Sciences, vol. 8, no. 9, 2018.

[16] D. Smith, Q. Guan, and S. Fu, “An anomaly detection framework for autonomic management of compute cloud systems,” in Proceedings of the 2010 IEEE 34th Annual Computer Software and Applications Conference Workshops, pp. 376–381, IEEE, Seoul, South Korea, July 2010.

[17] Y. Zhao, Y. Zhang, W. Tong et al., “An improved feature selection algorithm based on MAHALANOBIS distance for network intrusion detection,” in Proceedings of 2013 International Conference on Sensor Network Security Technology and Privacy Communication System, pp. 69–73, IEEE, Nangang, China, May 2013.

[18] P. Singh and A. Tiwari, “An efficient approach for intrusion detection in reduced features of KDD99 using ID3 and classification with KNNGA,” in Proceedings of the 2015 Second International Conference on Advances in Computing and Communication Engineering, pp. 445–452, IEEE, Dehradun, India, May 2015.

[19] M. A. Ambusaidi, X. He, P. Nanda, and Z. Tan, “Building an intrusion detection system using a filter-based feature selection algorithm,” IEEE Transactions on Computers, vol. 65, no. 10, pp. 2986–2998, 2016.

[20] N. Shone, T. N. Ngoc, V. D. Phai, and Q. Shi, “A deep learning approach to network intrusion detection,” IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 2, no. 1, pp. 41–50, 2018.

[21] Y. Xue, W. Jia, X. Zhao et al., “An evolutionary computation based feature selection method for intrusion detection,” Security and Communication Networks, vol. 2018, Article ID 2492956, 10 pages, 2018.

[22] Z. Shen, Y. Zhang, and W. Chen, “A bayesian classification intrusion detection method based on the fusion of PCA and LDA,” Security and Communication Networks, vol. 2019, Article ID 6346708, 11 pages, 2019.

[23] P. Sun, P. Liu, Q. Li et al., “DL-IDS: Extracting features using CNN-LSTM hybrid network for intrusion detection system,” Security and Communication Networks, vol. 2020, Article ID 8890306, 11 pages, 2020.

[24] G. Farahani, “Feature selection based on cross-correlation for the intrusion detection system,” Security & Communication Networks, vol. 2020, Article ID 8875404, 17 pages, 2020.

[25] F. G. Mohammadi, M. H. Amini, and H. R. Arabnia, “Applications of nature-inspired algorithms for dimension Reduction: enabling efficient data analytics,” in Advances in Intelligent Systems and Computing, Optimization, Learning, and Control for Interdependent Complex Networks, pp. 67–84, Springer, Cham, Switzerland, 2020.

[26] J. Kennedy and R. Eberhart, “Particle swarm optimization,” in Proceedings of the ICNN’95-International Conference on Neural Networks, no. 4, pp. 1942–1948, IEEE, Perth, WA, Australia, December 1995.

[27] M. Dorigo, M. Birattari, and T. Stutzle, “Ant colony optimization,” IEEE Computational Intelligence Magazine, vol. 1, no. 4, pp. 28–39, 2006.

[28] R. Rajabioun, “Cuckoo optimization algorithm,” Applied Soft Computing, vol. 11, no. 8, pp. 5508–5518, 2011.

[29] M. Neshat, G. Sepidnam, M. Sargolzaei, and A. N. Toosi, “Artificial fish swarm algorithm: a survey of the state-of-the-art, hybridization, combinatorial and indicative applications,” Artificial Intelligence Review, vol. 42, no. 4, pp. 965–997, 2014.

[30] D. Karaboga, “An idea based on honey bee swarm for numerical optimization,” Technical Report-tr06, Erciyes University, Engineering Faculty, Computer Engineering Department, Kayseri, Turkey, 2005.

[31] W.-T. Pan, “A new Fruit Fly Optimization Algorithm: taking the financial distress model as an example,” Knowledge-Based Systems, vol. 26, pp. 69–74, 2012.

[32] R. Zhao and W. Tang, “Monkey algorithm for global numerical optimization,” Journal of Uncertain Systems, vol. 2, no. 3, pp. 165–176, 2008.

[33] X. S. Yang and X. He, “Bat algorithm: literature review and applications,” International Journal of Bio-Inspired Computation, vol. 5, no. 3, pp. 141–149, 2013.

[34] S. Mirjalili, A. H. Gandomi, S. Z. Mirjalili, S. Saremi, H. Faris, and S. M. Mirjalili, “Salp Swarm Algorithm: a bio-inspired optimizer for engineering design problems,” Advances in Engineering Software, vol. 114, pp. 163–191, 2017.

[35] K. Ahmed, A. E. Hassanien, and S. Bhattacharyya, “A novel chaotic chicken swarm optimization algorithm for feature selection,” in Proceedings of the 2017 Third International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), pp. 259–264, IEEE, Kolkata, India, November 2017.

[36] S. Tabakhi, P. Moradi, F. Akhlaghian et al., “An unsupervised feature selection algorithm based on ant colony optimization,” Engineering Applications of Artificial Intelligence, vol. 32, pp. 112–123, 2014.

[37] S. Arora and P. Anand, “Binary butterfly optimization approaches for feature selection,” Expert Systems with Applications, vol. 116, pp. 147–160, 2019.

[38] C. Yan, J. Ma, H. Luo, and A. Patel, “Hybrid binary coral reefs optimization algorithm with simulated annealing for feature selection in high-dimensional biomedical datasets,” Chemometrics and Intelligent Laboratory Systems, vol. 184, pp. 102–111, 2019.

[39] G. I. Sayed, A. Tharwat, and A. E. Hassanien, “Chaotic dragonfly algorithm: an improved metaheuristic algorithm for feature selection,” Applied Intelligence, vol. 49, no. 1, pp. 188–205, 2019.

[40] Z. Zhang, P. Wei, Y. Li et al., “Feature selection algorithm based on improved particle swarm joint taboo search,” Journal of Communication, vol. 39, no. 12, pp. 60–68, 2018.

[41] A. H. Gandomi and A. H. Alavi, “Krill herd: a new bio-inspired optimization algorithm,” Communications in Nonlinear Science and Numerical Simulation, vol. 17, no. 12, pp. 4831–4845, 2012.

[42] Q. Tan and Z. Huang, “Krill herd with nearest neighbor lasso operator,” Computer Engineering and Applications, vol. 55, no. 9, pp. 124–129, 2019.

[43] Q. Wang, C. Ding, and X. Wang, “A hybrid data clustering algorithm based on improved krill herd algorithm and KHM clustering,” Control and Decision, vol. 35, no. 10, pp. 2449–2458, 2018.

[44] Q. Li and B. Liu, “Clustering using an improved krill herd algorithm,” Algorithms, vol. 10, no. 2, p. 56, 2017.

[45] G.-G. Wang, A. H. Gandomi, and A. H. Alavi, “Stud krill herd algorithm,” Neurocomputing, vol. 128, pp. 363–370, 2014.

[46] J. Li, Y. Tang, C. Hua, and X. Guan, “An improved krill herd algorithm: krill herd with linear decreasing step,” Applied Mathematics and Computation, vol. 234, pp. 356–367, 2014.

[47] H. B. Nguyen, B. Xue, P. Andreae et al., “Particle swarm optimisation with genetic operators for feature selection,” in Proceedings of the 2017 IEEE Congress on Evolutionary Computation (CEC), pp. 286–293, IEEE, San Sebastian, Spain, June 2017.

[48] M. H. Aghdam and P. Kabiri, “Feature selection for intrusion detection system using ant colony optimization,” International Journal of Network Security, vol. 18, no. 3, pp. 420–432, 2016.



subset, we randomly select 50% of the Probe attack samples, 10% of the DoS attack samples, 100% of the U2R attack samples, and 100% of the R2L attack samples in the KDDTrain+ dataset as the training dataset, and 100% of the Probe dataset, 50% of the DoS dataset, 100% of the U2R dataset, and 20% of the R2L dataset in the KDDTest+ dataset as the test dataset.

For the LNNLS-KH algorithm, the maximum number of iterations Imax and the quantity of krill individuals N are set to 100 and 30, respectively. Following [41], the foraging speed of krill individuals Vf is set to 0.02, the maximum random diffusion rate Dmax is set to 0.05, and the maximum induction speed Nmax is set to 0.01. Following [47], the threshold θ is set to 0.7. As the LNNLS-KH algorithm is designed to ensure high accuracy first and to reduce the number of features second, the weight factor α in the fitness function is set to 0.02.
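To make the role of the weight factor concrete, the settings above can be collected in one place together with a two-term fitness. The following Python sketch is illustrative only: the class `KHSettings` and the weighted-sum form of `fitness` are assumptions made for this example (a common choice in wrapper feature selection), not the exact fitness function defined earlier in the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class KHSettings:
    max_iterations: int = 100     # I_max
    population: int = 30          # N, number of krill individuals
    foraging_speed: float = 0.02  # V_f
    max_diffusion: float = 0.05   # D_max
    max_induced: float = 0.01     # N_max
    threshold: float = 0.7        # theta
    alpha: float = 0.02           # weight factor in the fitness function

def fitness(accuracy, n_selected, n_total, alpha):
    """ASSUMED form, to be minimised: weighted error plus weighted subset size."""
    error = 1.0 - accuracy
    return (1.0 - alpha) * error + alpha * (n_selected / n_total)

cfg = KHSettings()
# Two hypothetical candidate subsets on a 41-feature dataset.
larger_but_accurate = fitness(0.97, 10, 41, cfg.alpha)
smaller_but_weaker = fitness(0.95, 5, 41, cfg.alpha)
print(larger_but_accurate < smaller_but_weaker)
```

With a small α such as 0.02, the accuracy term dominates under this assumed form, so a more accurate subset is preferred even if it keeps more features, which matches the stated design priority.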

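$$\mathrm{FPR} = \frac{\mathrm{FP}}{\mathrm{TN} + \mathrm{FP}}, \tag{27}$$

$$\mathrm{DR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}. \tag{28}$$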

We adopt the iterative curve of the global optimal fitness value, feature selection time, test set detection time, data dimension after feature selection, classification accuracy, detection rate (DR), and false positive rate (FPR) as evaluation measures of feature selection for IDS. The accuracy represents the ratio of the correctly classified samples to the total number of samples, which is defined in equation (19). FPR, also known as false alarm rate (FAR), represents the ratio of samples that are incorrectly detected as intrusions to all normal samples, as shown in

Table 7: The features of NSL-KDD dataset.

The basic characteristics of TCP connections (9 features): (1) duration, (2) protocol_type, (3) service, (4) flag, (5) src_bytes, (6) dst_bytes, (7) land, (8) wrong_fragment, (9) urgent.

The content characteristics of a TCP connection (13 features): (10) hot, (11) num_failed_logins, (12) logged_in, (13) num_compromised, (14) root_shell, (15) num_root, (16) su_attempted, (17) num_file_creations, (18) num_shells, (19) num_access_files, (20) num_outbound_cmds, (21) is_host_login, (22) is_guest_login.

Time-based statistical characteristics of network traffic (9 features): (23) count, (24) srv_count, (25) serror_rate, (26) srv_serror_rate, (27) rerror_rate, (28) srv_rerror_rate, (29) same_srv_rate, (30) diff_srv_rate, (31) srv_diff_host_rate.

Host-based network traffic statistics (10 features): (32) dst_host_count, (33) dst_host_srv_count, (34) dst_host_same_srv_rate, (35) dst_host_diff_srv_rate, (36) dst_host_same_src_port_rate, (37) dst_host_srv_diff_host_rate, (38) dst_host_serror_rate, (39) dst_host_srv_serror_rate, (40) dst_host_rerror_rate, (41) dst_host_srv_rerror_rate.

Table 8: Attack time and attack types of the CICIDS2017 dataset.

Time                  Type           Label                      Amount   Total
Monday                Normal         BENIGN                     529918   529918
Tuesday               Normal         BENIGN                     432074   445909
                      Brute force    FTP patator                  7938
                                     SSH patator                  5897
Wednesday             Normal         BENIGN                     440031   692703
                      DoS            DoS GoldenEye               10293
                                     DoS slowhttptest             5499
                                     DoS slowloris                5796
                                     Heartbleed                     11
Thursday morning      Normal         BENIGN                     168186   170366
                      Web attack     Web attack brute force       1507
                                     Web attack sql injection       21
                                     Web attack XSS                652
Thursday afternoon    Normal         BENIGN                     288566   288602
                      Infiltration   Infiltration                   36
Friday morning        Normal         BENIGN                     189067   191033
                      Botnet         Bot                          1966
Friday afternoon (1)  Normal         BENIGN                      97718   225745
                      DDoS           DDoS                       128027
Friday afternoon (2)  Normal         BENIGN                     127537   286467
                      PortScan       PortScan                   158930


equation (27). DR, also known as recall or sensitivity, represents the probability of being correctly detected among all abnormalities, as shown in equation (28). The crossover-mutation PSO (CMPSO) algorithm [47], ACO algorithm [48], KH algorithm [41], and IKH algorithm [9] are used as comparative experiments. The experimental results for the Probe, DoS, R2L, and U2R datasets are shown as follows.
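The three classification measures can be computed directly from the confusion counts. The Python sketch below follows the definitions given in the text (accuracy as the share of correctly classified samples, FPR as in equation (27), DR as in equation (28)); the function names and the example counts are hypothetical.

```python
def accuracy(tp, tn, fp, fn):
    """Correctly classified samples divided by all samples."""
    return (tp + tn) / (tp + tn + fp + fn)

def false_positive_rate(tn, fp):
    """Equation (27): normal samples wrongly flagged as intrusions."""
    return fp / (tn + fp)

def detection_rate(tp, fn):
    """Equation (28): abnormal samples that are correctly detected."""
    return tp / (tp + fn)

# Hypothetical confusion counts for 100 attack and 100 normal samples.
tp, tn, fp, fn = 90, 95, 5, 10
print(accuracy(tp, tn, fp, fn))
print(false_positive_rate(tn, fp))
print(detection_rate(tp, fn))
```

A good feature subset should push the FPR down and the DR up at the same time, which is why both are reported alongside accuracy.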

To reflect the performance of the LNNLS-KH algorithm intuitively, the convergence curves of the fitness function for the Probe, DoS, U2R, and R2L datasets are shown in Figure 9. The results show that the LNNLS-KH algorithm achieves a good fitness function value when the number of iterations reaches about 20, which demonstrates the strong exploitation ability and good convergence performance of the LNNLS-KH algorithm. As the number of iterations increases, the other algorithms show varying degrees of convergence stagnation, while the LNNLS-KH algorithm repeatedly jumps out of local optima and finds the global optimal solution with better fitness. The fitness function values after 100 iterations reach 0.0328, 0.0393, 0.0292, and 0.0036, respectively, for the four attack datasets, showing excellent exploration ability. Therefore, compared with the CMPSO, ACO, KH, and IKH algorithms, the LNNLS-KH algorithm exhibits faster convergence speed and stronger abilities of exploitation and exploration.

The results of different feature selection algorithms are shown in Table 10. The number in front of the brackets indicates the quantity of features after feature selection, and the specific feature numbers are listed in the brackets. The comparison of feature selection dimensions is shown in Figure 10, and different colours are used to distinguish the five algorithms. The proposed LNNLS-KH algorithm, marked in red, is in the innermost circle of Figure 10 for the Probe, DoS, U2R, and R2L datasets. This indicates that, compared with the other four feature selection algorithms, the LNNLS-KH algorithm retains the fewest features while ensuring accuracy. According to Figure 10, the LNNLS-KH algorithm selects 7 main features of the NSL-KDD dataset on average, accounting

Figure 9: Convergence curve of fitness functions for the four attack datasets. Panels: Probe, DoS, U2R, and R2L; horizontal axis: number of iterations (0 to 100); vertical axis: fitness function; curves: CMPSO, ACO, KH, IKH, and LNNLS-KH.


for 17.07% of the total number of features. Compared with the CMPSO, ACO, KH, and IKH algorithms, the proposed LNNLS-KH algorithm reduces the features by 44%, 42.86%, 34.88%, and 24.32%, respectively, in the datasets of the four attack types. Meanwhile, the total number of features in the four types of attack datasets is reduced by 37.43%.

To further evaluate the performance of the feature selection algorithms, we show the feature selection time and detection time of five different algorithms in Table 11. Feature selection time represents the time of filtering out redundant features. The detection time represents the time from inputting the most representative feature subsets into the KNN classifier to the end of detection. It can be seen from Table 11 that the feature selection time of the standard KH algorithm is shorter than that of the CMPSO and ACO algorithms, which indicates that the KH algorithm achieves faster speed and better performance. In addition, compared with the standard KH algorithm, the feature selection time of the LNNLS-KH algorithm is longer, which is mainly due to the nonlinear optimization of the physical diffusion motion and the linear nearest neighbor lasso step optimization performed after the krill herd position is updated. Although the feature selection time increases, the convergence speed and global search ability are greatly improved. At the same time, the LNNLS-KH algorithm removes redundant features, which considerably increases the detection speed. In comparison with the other four feature selection algorithms, the detection time of the LNNLS-KH algorithm is reduced by 16.83%, 16.91%, 8.94%, and 6.96% on average in the test dataset samples of Probe, DoS, R2L, and U2R.
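The detection step described above (a KNN classifier applied only to the selected feature columns, with the elapsed time measured around it) can be sketched as follows. This is a minimal pure-Python illustration with toy data; the helper names `project` and `knn_predict`, the value k = 3, and the sample records are assumptions for the example and are not taken from the experiment.

```python
import math
import time
from collections import Counter

def project(row, selected):
    """Keep only the selected feature columns (1-based numbers, as in the tables)."""
    return [row[i - 1] for i in selected]

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training rows (Euclidean distance)."""
    nearest = sorted(
        (math.dist(row, query), label) for row, label in zip(train_x, train_y)
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Toy records with 4 features; features 2 and 4 are deliberately uninformative.
train_rows = [
    [0.0, 9.0, 0.1, 5.0], [0.1, 1.0, 0.0, 7.0], [0.2, 4.0, 0.2, 2.0],
    [1.0, 8.0, 0.9, 5.0], [0.9, 2.0, 1.0, 1.0], [0.8, 5.0, 0.8, 6.0],
]
train_labels = ["normal", "normal", "normal", "attack", "attack", "attack"]
test_rows = [[0.05, 3.0, 0.1, 9.0], [0.95, 7.0, 0.9, 0.0]]

selected = [1, 3]  # hypothetical output of a feature selection run
train_x = [project(r, selected) for r in train_rows]

start = time.perf_counter()
predictions = [
    knn_predict(train_x, train_labels, project(r, selected)) for r in test_rows
]
detection_time = time.perf_counter() - start
print(predictions)
```

Because each distance is computed over the selected columns only, a smaller feature subset means less work per test sample, which is the mechanism behind the shorter detection times.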

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the test dataset is detected using the KNN classifier. The classification accuracy of different algorithms is shown in Table 12. Comparing the accuracy results shows that the LNNLS-KH feature selection algorithm achieves a classification accuracy above 90% for the Probe, DoS, U2R, and R2L test

Table 9: The number and name of the features in the CICIDS2017 dataset.

No.  Feature name                   No.  Feature name              No.  Feature name
1    Destination port               27   Bwd IAT mean              53   Average packet size
2    Flow duration                  28   Bwd IAT std               54   Avg fwd segment size
3    Total fwd packets              29   Bwd IAT max               55   Avg bwd segment size
4    Total backward packets         30   Bwd IAT min               56   Fwd header length
5    Total length of fwd packets    31   Fwd PSH flags             57   Fwd avg bytes/bulk
6    Total length of bwd packets    32   Bwd PSH flags             58   Fwd avg packets/bulk
7    Fwd packet length max          33   Fwd URG flags             59   Fwd avg bulk rate
8    Fwd packet length min          34   Bwd URG flags             60   Bwd avg bytes/bulk
9    Fwd packet length mean         35   Fwd header length         61   Bwd avg packets/bulk
10   Fwd packet length std          36   Bwd header length         62   Bwd avg bulk rate
11   Bwd packet length max          37   Fwd packets/s             63   Subflow fwd packets
12   Bwd packet length min          38   Bwd packets/s             64   Subflow fwd bytes
13   Bwd packet length mean         39   Min packet length         65   Subflow bwd packets
14   Bwd packet length std          40   Max packet length         66   Subflow bwd bytes
15   Flow bytes/s                   41   Packet length mean        67   Init_Win_bytes_forward
16   Flow packets/s                 42   Packet length std         68   Init_Win_bytes_backward
17   Flow IAT mean                  43   Packet length variance    69   act_data_pkt_fwd
18   Flow IAT std                   44   FIN flag count            70   min_seg_size_forward
19   Flow IAT max                   45   SYN flag count            71   Active mean
20   Flow IAT min                   46   RST flag count            72   Active std
21   Fwd IAT total                  47   PSH flag count            73   Active max
22   Fwd IAT mean                   48   ACK flag count            74   Active min
23   Fwd IAT std                    49   URG flag count            75   Idle mean
24   Fwd IAT max                    50   CWE flag count            76   Idle std
25   Fwd IAT min                    51   ECE flag count            77   Idle max
26   Bwd IAT total                  52   Down/up ratio             78   Idle min

Figure 10: Comparison of feature selection dimensions produced by different algorithms (CMPSO, ACO, KH, IKH, and LNNLS-KH) for the Probe, DoS, U2R, and R2L datasets.


dataset samples. Furthermore, the LNNLS-KH algorithm improves the average classification accuracy of the Probe, DoS, U2R, and R2L test dataset samples by 9.95%, 12.04%, 9.47%, and 8.66%.

Table 13 shows the false positive rate and detection rate of the feature subsets produced by different feature selection algorithms. To visualize the difference, we show the comparison in Figure 11. For the Probe, DoS, U2R, and R2L datasets, the average false positive rate of the LNNLS-KH feature selection algorithm is 4.00%. This is 20.70, 15.30, 8.88, and 3.34 percentage points lower than that of the CMPSO, ACO, KH, and IKH algorithms, respectively. Similarly, for the detection rate, the proposed LNNLS-KH feature selection algorithm exhibits excellent performance. The average detection rate of

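Table 10: The feature selection results of different feature selection algorithms (NSL-KDD dataset). Each entry gives the number of selected features, followed by the selected feature numbers in brackets.

Probe
  CMPSO:    14 (2, 3, 4, 7, 8, 10, 11, 17, 19, 20, 21, 27, 30, 33)
  ACO:      15 (1, 3, 4, 6, 15, 16, 17, 19, 21, 23, 29, 35, 39, 40, 41)
  KH:       13 (3, 4, 5, 7, 8, 13, 14, 18, 19, 21, 26, 28, 40)
  IKH:      11 (2, 3, 5, 8, 10, 17, 18, 29, 34, 35, 41)
  LNNLS-KH: 8 (3, 4, 8, 11, 15, 29, 34, 40)

DoS
  CMPSO:    16 (3, 4, 5, 6, 8, 13, 14, 17, 18, 22, 23, 26, 30, 32, 35, 41)
  ACO:      16 (3, 4, 7, 12, 14, 19, 20, 25, 27, 28, 30, 33, 34, 37, 40, 41)
  KH:       12 (2, 3, 4, 5, 8, 9, 12, 15, 19, 24, 26, 30)
  IKH:      12 (2, 3, 4, 6, 12, 18, 20, 22, 27, 28, 30, 31)
  LNNLS-KH: 10 (3, 4, 6, 15, 17, 19, 20, 21, 30, 37)

U2R
  CMPSO:    9 (3, 4, 5, 9, 12, 19, 32, 33, 41)
  ACO:      8 (3, 4, 6, 8, 20, 24, 33, 36)
  KH:       8 (3, 4, 10, 12, 19, 23, 31, 32)
  IKH:      6 (3, 10, 11, 21, 36, 39)
  LNNLS-KH: 3 (3, 33, 36)

R2L
  CMPSO:    11 (2, 3, 4, 8, 21, 22, 25, 27, 37, 40, 41)
  ACO:      10 (3, 4, 7, 12, 17, 21, 29, 37, 38, 40)
  KH:       10 (2, 3, 4, 6, 13, 18, 19, 22, 32, 41)
  IKH:      8 (3, 4, 5, 8, 11, 14, 21, 31)
  LNNLS-KH: 7 (2, 3, 4, 10, 15, 21, 36)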

Table 11: Feature selection time and detection time of different feature selection algorithms (NSL-KDD dataset).

Time of feature selection (second)
Data categories   CMPSO     ACO       KH        IKH       LNNLS-KH
Probe             5231.78   4998.14   4745.33   5348.87   5490.48
DoS               7892.35   7630.86   7168.52   8038.16   8296.92
U2R                154.87    147.29    144.18    157.79    172.24
R2L               2556.75   2369.08   2240.92   2669.51   2727.70

Time of detection (second)
Data categories   CMPSO    ACO      KH       IKH      LNNLS-KH
Probe              3.713    3.823    3.530    3.405   3.106
DoS               11.869   11.815   10.666   10.514   9.844
U2R                0.087    0.086    0.086    0.086   0.078
R2L                0.955    0.913    0.907    0.862   0.803

Table 12: The classification accuracy of different feature selection algorithms (NSL-KDD dataset).

Data categories   CMPSO (%)   ACO (%)   KH (%)   IKH (%)   LNNLS-KH (%)
Probe             80.46       86.56     92.42    93.74     98.24
DoS               81.74       83.36     86.03    88.74     97.01
U2R               82.74       84.57     85.59    91.89     95.67
R2L               78.70       81.62     88.78    90.49     93.56

Figure 11: Comparison of classification FPR and DR of different feature selection algorithms (CMPSO, ACO, KH, IKH, and LNNLS-KH) for the Probe, DoS, U2R, and R2L datasets. (a) FPR (%) of different feature selection algorithms. (b) DR (%) of different feature selection algorithms.


the LNNLS-KH algorithm is 96.48%, which is 13.47%, 9.32%, 7.02%, and 4.72% higher than that of the CMPSO, ACO, KH, and IKH feature selection algorithms, respectively.
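The averages quoted above can be re-derived from the per-category values in Table 13. The following is a minimal sketch of that arithmetic, assuming the gaps are differences of category means in percentage points; the helper names (`mean`, `gaps_to_proposed`) are illustrative and not part of the original experiments.

```python
# FPR and DR values (percent) transcribed from Table 13 (NSL-KDD dataset).
# Each list is ordered Probe, DoS, U2R, R2L.
fpr = {
    "CMPSO": [22.37, 21.27, 24.51, 30.66],
    "ACO": [18.04, 14.08, 21.04, 24.05],
    "KH": [8.50, 11.45, 16.13, 15.42],
    "IKH": [4.05, 7.88, 8.45, 8.99],
    "LNNLS-KH": [1.18, 2.85, 4.30, 7.67],
}
dr = {
    "CMPSO": [82.32, 79.12, 87.02, 83.56],
    "ACO": [89.18, 82.08, 89.79, 87.56],
    "KH": [95.01, 83.77, 90.14, 88.91],
    "IKH": [95.22, 85.23, 93.67, 92.89],
    "LNNLS-KH": [97.73, 96.80, 95.52, 95.85],
}


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def gaps_to_proposed(table, proposed="LNNLS-KH"):
    """Absolute gap (percentage points) between each baseline mean and the proposed mean."""
    base = mean(table[proposed])
    return {
        name: round(abs(mean(values) - base), 2)
        for name, values in table.items()
        if name != proposed
    }


avg_fpr = round(mean(fpr["LNNLS-KH"]), 2)  # average FPR of LNNLS-KH
avg_dr = round(mean(dr["LNNLS-KH"]), 2)    # average DR of LNNLS-KH
fpr_gaps = gaps_to_proposed(fpr)           # how much lower the LNNLS-KH FPR is
dr_gaps = gaps_to_proposed(dr)             # how much higher the LNNLS-KH DR is
print(avg_fpr, fpr_gaps)
print(avg_dr, dr_gaps)
```

Under this reading, the gaps agree with the figures in the text to within rounding, which suggests the reported improvements are differences of means rather than relative changes.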

In conclusion, the LNNLS-KH feature selection algorithm performs excellently in terms of the global optimal fitness iteration curve, test set detection time, number of dimensions of the feature subset, classification accuracy, false positive rate, and detection rate. Although the offline training time of the LNNLS-KH algorithm is longer than that of the CMPSO, ACO, KH, and IKH algorithms, its lower feature dimension reduces the detection time. Moreover, the algorithm has faster convergence speed, higher detection accuracy and detection rate, and a lower classification false positive rate.

4.3. Experimental Results and Discussion of CICIDS2017 Dataset. The experiment is conducted in MATLAB R2016a on a 64-bit Windows operating system with an Intel(R) Core(TM) i7-4790 processor. The MachineLearningCVE file in the CICIDS2017 dataset includes 8 CSV files of all traffic data, which contain 78 features plus an attack type tag after some duplicate features are removed. We annotate traffic records according to different attack periods and types, and standardize and normalize the dataset. Because the analyzed CSV files contain an excessive amount of data, problems such as excessively long running time and a slow convergence rate of the model occur when the host is used for model training. Therefore, we simplified and reintegrated these CSV data files while preserving the original attack timing features. We selected a total of 12090 records and 5 types of traffic, including 1 type of normal traffic and 4 types of attack traffic, namely, "DoS," "DDoS," "PortScan," and "WebAttack." The data are randomly divided into training sets and test sets in a 2:1 ratio, with independent and repeated experiments.
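The preprocessing described above (normalization followed by a random 2:1 split) can be sketched as follows. This is only an illustration under stated assumptions: the paper does not specify the scaling formula, so min-max scaling to [0, 1] is assumed here, and the function names and toy records are hypothetical.

```python
import random


def min_max_normalize(rows):
    """Scale every column of a list-of-lists to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [max(col) - min(col) for col in columns]
    return [
        [(value - low) / span if span else 0.0
         for value, low, span in zip(row, lows, spans)]
        for row in rows
    ]


def split_two_to_one(rows, labels, seed=0):
    """Shuffle the records and return train/test parts in a 2:1 ratio."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    cut = (2 * len(order)) // 3  # two thirds of the records go to training
    train, test = order[:cut], order[cut:]
    return (
        [rows[i] for i in train], [labels[i] for i in train],
        [rows[i] for i in test], [labels[i] for i in test],
    )


# Toy stand-in for traffic records: 12 records with 3 numeric features each.
records = [[float(i), float(i % 7), 5.0] for i in range(12)]
labels = ["Normal" if i % 2 == 0 else "DoS" for i in range(12)]

scaled = min_max_normalize(records)
train_X, train_y, test_X, test_y = split_two_to_one(scaled, labels, seed=42)
print(len(train_X), len(test_X))
```

With the 12090 records mentioned above, the same cut rule would give 8060 training records and 4030 test records. Changing `seed` between runs is one simple way to obtain the independent, repeated experiments the text refers to.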

The CMPSO, ACO, KH, and IKH algorithms are used for comparison with the LNNLS-KH algorithm. The preprocessed Normal, DoS, DDoS, PortScan, and WebAttack subsets are input into the algorithm model successively, and the dimensions and feature subsets of feature selection are obtained. We adopt the KNN classification model as the classifier and obtain the accuracy of intrusion detection on the test set data. The results of feature selection dimension for the CICIDS2017 dataset are shown in Table 14. According to different attack types, the LNNLS-KH algorithm selects different features. For example, the selected features of the DoS subset are "Total Length of Bwd Packets," "Fwd Packet Length Min," "Flow IAT Min," "FIN Flag Count," "RST Flag Count," "URG Flag Count," "Bwd Avg Packets/Bulk," "Idle Mean," and "Idle Std." For the WebAttack subset, "Total Fwd Packets," "Bwd IAT Max," "Bwd PSH Flags," "Fwd Packets/s," "Bwd Avg Packets/Bulk," "Subflow Fwd Bytes," "Active Max," and "Idle Max" are selected as attack features by the LNNLS-KH algorithm. It reduces the feature dimension of the IDS dataset while ensuring high accuracy. The average feature dimension selected by the LNNLS-KH algorithm is 10.2, accounting for 13.08% of the total number of features in the CICIDS2017 dataset. It decreases the

Table 13: The classification FPR and DR of different feature selection algorithms (NSL-KDD dataset).

False positive rate (FPR) (%)
Data categories  CMPSO  ACO    KH     IKH    LNNLS-KH
Probe            22.37  18.04  8.50   4.05   1.18
DoS              21.27  14.08  11.45  7.88   2.85
U2R              24.51  21.04  16.13  8.45   4.30
R2L              30.66  24.05  15.42  8.99   7.67

Detection rate (DR) (%)
Data categories  CMPSO  ACO    KH     IKH    LNNLS-KH
Probe            82.32  89.18  95.01  95.22  97.73
DoS              79.12  82.08  83.77  85.23  96.80
U2R              87.02  89.79  90.14  93.67  95.52
R2L              83.56  87.56  88.91  92.89  95.85

Figure 12: Comparison of feature selection time and intrusion detection time for different feature selection algorithms. (a) Feature selection time (second) for different feature selection algorithms. (b) Intrusion detection time (second) of different feature selection algorithms.


number of features by 57.85%, 52.34%, 27.14%, and 25%, respectively, compared with the CMPSO, ACO, KH, and IKH algorithms.
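These reduction percentages can be reproduced from the feature counts in Table 14. The sketch below assumes the reduction is the relative change in the average number of selected features across the five subsets; the variable and function names are illustrative.

```python
# Number of selected features per subset, transcribed from Table 14.
# Each list is ordered Normal, DoS, DDoS, PortScan, WebAttack.
counts = {
    "CMPSO": [28, 24, 29, 24, 16],
    "ACO": [25, 19, 27, 21, 15],
    "KH": [14, 13, 21, 14, 8],
    "IKH": [14, 14, 18, 15, 7],
    "LNNLS-KH": [8, 9, 14, 12, 8],
}
TOTAL_FEATURES = 78  # features in the preprocessed CICIDS2017 data


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


proposed_avg = mean(counts["LNNLS-KH"])
# Share of the full feature set retained by LNNLS-KH, in percent.
share = round(100 * proposed_avg / TOTAL_FEATURES, 2)
# Relative reduction of the average feature count versus each baseline, in percent.
reductions = {
    name: round(100 * (mean(values) - proposed_avg) / mean(values), 2)
    for name, values in counts.items()
    if name != "LNNLS-KH"
}
print(proposed_avg, share, reductions)
```

The same calculation applied to the NSL-KDD counts in Table 10 (average 7 of 41 features) gives the corresponding NSL-KDD reductions.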

Figure 12 shows the feature selection time and intrusion detection time of the 5 different feature selection algorithms to further evaluate their performance. It can be seen from Figure 12(a) that, in the feature selection stage, the LNNLS-KH algorithm takes a long time to find the optimal feature subset because of the linear nearest neighbor lasso step optimization performed after the position update of the krill herd. Compared with the KH and IKH algorithms, it increases the time by an average of 14.38% and 9.32%, respectively. Although the LNNLS-KH algorithm requires more calculation time, its convergence speed and global search ability are improved. Figure 12(b) shows the intrusion detection time of the 5 different feature selection algorithms. It is the time the KNN classifier takes to detect the sample dataset after the feature subset has been found, excluding the time spent searching for the optimal feature subset. The feature dimension of the LNNLS-KH algorithm is low, and the amount of data processed when classifying the detection sample dataset is small, which results in a reduction of the classification detection time. Compared with the CMPSO, ACO, KH, and IKH algorithms, the intrusion detection time of the LNNLS-KH algorithm is reduced by 6.52%, 5.17%, 2.14%, and 2.28% on average, respectively.

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the KNN classifier is used to detect the test dataset. The classification accuracy of the different algorithms is shown in Table 15. For the five types of subsets, the average classification accuracy of the proposed LNNLS-KH algorithm is 95.86%. In particular, the classification accuracy reaches 97.55% for the PortScan subset. Compared with the other four feature selection methods, the LNNLS-KH algorithm achieves an average increase of 3.11%, 8.52%, 8.58%, 2.45%, and 4.29% on the

Table 14: The number of feature selection for different algorithms (CICIDS2017 dataset).

Normal
  CMPSO:    28 (3, 7, 13, 15, 16, 17, 20, 22, 24, 26, 30, 35, 37, 38, 42, 43, 44, 45, 46, 49, 50, 56, 59, 62, 63, 64, 65, 76)
  ACO:      25 (1, 3, 4, 7, 10, 11, 12, 13, 15, 19, 29, 32, 34, 35, 37, 43, 46, 47, 51, 55, 56, 58, 73, 76, 78)
  KH:       14 (11, 19, 33, 39, 43, 49, 55, 56, 58, 65, 66, 68, 71, 73)
  IKH:      14 (5, 10, 19, 20, 21, 23, 27, 33, 43, 56, 69, 70, 73, 78)
  LNNLS-KH: 8 (6, 12, 16, 32, 38, 50, 54, 73)

DoS
  CMPSO:    24 (1, 3, 4, 13, 16, 17, 24, 26, 30, 33, 35, 39, 40, 44, 48, 51, 53, 57, 58, 59, 60, 62, 67, 70)
  ACO:      19 (3, 6, 12, 13, 15, 26, 35, 39, 51, 55, 60, 61, 66, 69, 71, 73, 75, 77, 78)
  KH:       13 (8, 16, 21, 30, 45, 50, 52, 57, 59, 63, 66, 67)
  IKH:      14 (2, 12, 15, 16, 19, 21, 32, 34, 44, 46, 65, 68, 76, 77)
  LNNLS-KH: 9 (6, 8, 20, 44, 46, 49, 61, 75, 76)

DDoS
  CMPSO:    29 (15, 18, 19, 20, 23, 25, 26, 33, 34, 35, 38, 39, 42, 43, 46, 47, 49, 51, 55, 56, 57, 59, 60, 61, 62, 63, 71, 72, 78)
  ACO:      27 (6, 9, 10, 13, 16, 19, 24, 28, 31, 41, 42, 45, 47, 48, 50, 51, 52, 53, 54, 56, 59, 60, 61, 62, 65, 68, 72)
  KH:       21 (10, 12, 13, 15, 18, 23, 27, 30, 34, 35, 41, 42, 45, 55, 61, 63, 65, 66, 68, 70, 76)
  IKH:      18 (1, 11, 13, 14, 19, 24, 32, 35, 36, 40, 42, 47, 51, 57, 60, 69, 70, 75)
  LNNLS-KH: 14 (2, 5, 8, 9, 11, 22, 26, 33, 41, 43, 47, 51, 74, 77)

PortScan
  CMPSO:    24 (1, 3, 6, 15, 16, 28, 30, 33, 35, 37, 44, 45, 52, 56, 59, 60, 61, 63, 65, 68, 70, 75, 77, 78)
  ACO:      21 (1, 2, 6, 10, 15, 17, 26, 27, 29, 39, 42, 43, 46, 49, 58, 61, 66, 69, 70, 71, 76)
  KH:       14 (15, 20, 22, 27, 37, 44, 49, 50, 53, 59, 62, 65, 67, 78)
  IKH:      15 (1, 24, 30, 32, 33, 43, 49, 53, 54, 58, 60, 61, 63, 64, 69)
  LNNLS-KH: 12 (2, 6, 15, 24, 25, 28, 32, 57, 59, 63, 66, 76)

WebAttack
  CMPSO:    16 (2, 7, 26, 29, 45, 47, 50, 52, 53, 54, 63, 66, 68, 69, 72, 78)
  ACO:      15 (3, 9, 10, 12, 19, 26, 40, 46, 50, 54, 64, 65, 68, 69, 73)
  KH:       8 (1, 17, 19, 36, 48, 49, 53, 60)
  IKH:      7 (14, 17, 35, 39, 44, 48, 54)
  LNNLS-KH: 8 (3, 29, 32, 37, 61, 64, 73, 77)

Table 15: The classification accuracy of different feature selection algorithms (CICIDS2017 dataset).

Data categories  CMPSO (%)  ACO (%)  KH (%)  IKH (%)  LNNLS-KH (%)
Normal           89.78      89.06    92.70   94.58    94.64
DoS              77.03      82.69    90.90   93.34    94.51
DDoS             81.73      86.94    91.85   88.19    95.76
PortScan         92.38      95.64    95.05   97.35    97.55
WebAttack        89.12      93.08    93.77   94.26    96.85

Table 16: The classification FPR and DR of different feature selection algorithms (CICIDS2017 dataset).

False positive rate (FPR) (%)
Data categories  CMPSO  ACO    KH     IKH    LNNLS-KH
Normal           9.25   8.72   6.41   4.93   3.67
DoS              5.41   4.48   4.06   2.83   1.94
DDoS             6.85   4.92   4.54   6.33   3.18
PortScan         4.65   3.02   2.84   1.86   1.16
WebAttack        5.33   3.16   2.52   2.11   1.60

Detection rate (DR) (%)
Data categories  CMPSO  ACO    KH     IKH    LNNLS-KH
Normal           88.05  88.51  89.25  92.46  93.89
DoS              72.57  82.89  87.86  92.56  92.64
DDoS             79.03  83.47  90.22  87.52  92.98
PortScan         88.25  93.80  94.33  95.14  95.42
WebAttack        87.40  91.35  92.19  92.94  94.77


Normal, DoS, DDoS, PortScan, and WebAttack subsets, respectively. Table 16 shows the classification FPR and DR of the different feature selection algorithms on the test sets. Based on the detection of the five different test sets, the LNNLS-KH algorithm has a lower FPR and a higher DR than the other four algorithms.

We propose the LNNLS-KH algorithm, a novel feature selection algorithm for intrusion detection. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the algorithm has good feature selection performance and improves the efficiency of intrusion detection.

5. Conclusions

With the rapid development of network technology, intrusion detection plays an increasingly important role in network security. However, the "dimensional disaster" caused by massive data results in problems such as slow response and poor accuracy of the intrusion detection system. The KH algorithm is a new population-based swarm intelligence optimization method that shows good performance in high-dimensional data processing, providing a new approach for reducing the dimension of intrusion detection data and selecting useful features. In this paper, an improved KH algorithm named LNNLS-KH is proposed for feature selection of IDS datasets by linear nearest neighbor lasso optimization. The LNNLS-KH algorithm introduces a new fitness function, which is composed of the number of selected feature dimensions and the classification accuracy. Nonlinear optimization is introduced into the physical diffusion motion of krill individuals to accelerate the convergence of the algorithm. Moreover, the linear nearest neighbor lasso step optimization is proposed to balance the exploration and exploitation abilities and to obtain the global optimal solution of the feature subset effectively. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the LNNLS-KH algorithm retains 7 and 10.2 features on average, respectively, which greatly reduces the dimension of the features. In the NSL-KDD dataset, features are reduced by 44%, 42.86%, 34.88%, and 24.32% compared with the CMPSO, ACO, KH, and IKH algorithms, and in the CICIDS2017 dataset, they are reduced by 57.85%, 52.34%, 27.14%, and 25%, respectively. In addition, the classification accuracy of the LNNLS-KH feature selection algorithm is increased by 10.03% and 5.39%, and the time of intrusion detection is reduced by 12.41% and 4.03% on the two datasets. Furthermore, the LNNLS-KH algorithm enhances the ability to jump out of the local optimal solution and shows good performance in the optimal fitness iteration curve, false positive rate of detection, and convergence speed, which demonstrates that the proposed LNNLS-KH algorithm is an efficient feature selection method for network intrusion detection.

In this research, we realized that the initialization of the LNNLS-KH algorithm has a certain degree of randomness. Therefore, we conducted independent and repeated experiments to mitigate this problem, and the results were reasonable and convincing. Although the proposed algorithm shows encouraging performance, it could be further improved.

In future work, we will consider using data balancing techniques to preprocess the experimental dataset in order to obtain more accurate feature selection results and stronger algorithm stability. Meanwhile, we will combine the LNNLS-KH algorithm with other algorithms to improve the exploration and exploitation abilities, thereby further shortening the time of feature subset training and classification detection. In addition, as the LNNLS-KH algorithm is universally applicable, it can be applied to more feature selection systems and used to solve optimization problems in other fields.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was sponsored by the National Key Research and Development Program of China (Grants 2018YFB0804002 and 2017YFB0803204), National Natural Science Foundation of P.R. China (Grant 72001191), Henan Natural Science Foundation (Grant 202300410442), and Henan Philosophy and Social Science Program (Grant 2020CZH009).

References

[1] W. Wei and C. Guo, "A text semantic topic discovery method based on the conditional co-occurrence degree," Neurocomputing, vol. 368, pp. 11–24, 2019.

[2] C.-R. Wang, R.-F. Xu, S.-J. Lee, and C.-H. Lee, "Network intrusion detection using equality constrained-optimization-based extreme learning machines," Knowledge-Based Systems, vol. 147, pp. 68–80, 2018.

[3] G.-G. Wang, A. H. Gandomi, A. H. Alavi, and D. Gong, "A comprehensive review of krill herd algorithm: variants, hybrids and applications," Artificial Intelligence Review, vol. 51, no. 1, pp. 119–148, 2019.

[4] J. Amudhavel, D. Sathian, R. S. Raghav et al., "A fault tolerant distributed self-organization in peer to peer (p2p) using krill herd optimization," in Proceedings of the 2015 International Conference on Advanced Research in Computer Science Engineering & Technology (ICARCSET 2015), pp. 1–5, Unnao, India, 2015.

[5] L. M. Abualigah, A. T. Khader, and E. S. Hanandeh, "Hybrid clustering analysis using improved krill herd algorithm," Applied Intelligence, vol. 48, no. 11, pp. 4047–4071, 2018.

[6] P. A. Kowalski and S. Łukasik, "Training neural networks with krill herd algorithm," Neural Processing Letters, vol. 44, no. 1, pp. 5–17, 2016.

[7] C. Stasinakis, G. Sermpinis, I. Psaradellis, and T. Verousis, "Krill-Herd Support Vector Regression and heterogeneous autoregressive leverage: evidence from forecasting and trading commodities," Quantitative Finance, vol. 16, no. 12, pp. 1901–1915, 2016.


[8] L. Wang, P. Jia, T. Huang, S. Duan, J. Yan, and L. Wang, "A novel optimization technique to improve gas recognition by electronic noses based on the enhanced krill herd algorithm," Sensors, vol. 16, no. 8, p. 1275, 2016.

[9] R. Jensi and G. W. Jiji, "An improved krill herd algorithm with global exploration capability for solving numerical function optimization problems and its application to data clustering," Applied Soft Computing, vol. 46, pp. 230–245, 2016.

[10] H. Pulluri, R. Naresh, and V. Sharma, "Application of stud krill herd algorithm for solution of optimal power flow problems," International Transactions on Electrical Energy Systems, vol. 27, no. 6, Article ID e2316, 2017.

[11] D. Rodrigues, L. A. M. Pereira, J. P. Papa et al., "A binary krill herd approach for feature selection," in Proceedings of the 2014 22nd International Conference on Pattern Recognition, pp. 1407–1412, IEEE, Stockholm, Sweden, August 2014.

[12] A. Mukherjee and V. Mukherjee, "Chaotic krill herd algorithm for optimal reactive power dispatch considering FACTS devices," Applied Soft Computing, vol. 44, pp. 163–190, 2016.

[13] S. Sun, H. Qi, F. Zhao, L. Ruan, and B. Li, "Inverse geometry design of two-dimensional complex radiative enclosures using krill herd optimization algorithm," Applied Thermal Engineering, vol. 98, pp. 1104–1115, 2016.

[14] S. Sultana and P. K. Roy, "Oppositional krill herd algorithm for optimal location of capacitor with reconfiguration in radial distribution system," International Journal of Electrical Power & Energy Systems, vol. 74, pp. 78–90, 2016.

[15] L. Brezocnik, I. Fister, and V. Podgorelec, "Swarm intelligence algorithms for feature selection: a review," Applied Sciences, vol. 8, no. 9, 2018.

[16] D. Smith, Q. Guan, and S. Fu, "An anomaly detection framework for autonomic management of compute cloud systems," in Proceedings of the 2010 IEEE 34th Annual Computer Software and Applications Conference Workshops, pp. 376–381, IEEE, Seoul, South Korea, July 2010.

[17] Y. Zhao, Y. Zhang, W. Tong et al., "An improved feature selection algorithm based on MAHALANOBIS distance for network intrusion detection," in Proceedings of 2013 International Conference on Sensor Network Security Technology and Privacy Communication System, pp. 69–73, IEEE, Nangang, China, May 2013.

[18] P. Singh and A. Tiwari, "An efficient approach for intrusion detection in reduced features of KDD99 using ID3 and classification with KNNGA," in Proceedings of the 2015 Second International Conference on Advances in Computing and Communication Engineering, pp. 445–452, IEEE, Dehradun, India, May 2015.

[19] M. A. Ambusaidi, X. He, P. Nanda, and Z. Tan, "Building an intrusion detection system using a filter-based feature selection algorithm," IEEE Transactions on Computers, vol. 65, no. 10, pp. 2986–2998, 2016.

[20] N. Shone, T. N. Ngoc, V. D. Phai, and Q. Shi, "A deep learning approach to network intrusion detection," IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 2, no. 1, pp. 41–50, 2018.

[21] Y. Xue, W. Jia, X. Zhao et al., "An evolutionary computation based feature selection method for intrusion detection," Security and Communication Networks, vol. 2018, Article ID 2492956, 10 pages, 2018.

[22] Z. Shen, Y. Zhang, and W. Chen, "A bayesian classification intrusion detection method based on the fusion of PCA and LDA," Security and Communication Networks, vol. 2019, Article ID 6346708, 11 pages, 2019.

[23] P. Sun, P. Liu, Q. Li et al., "DL-IDS: Extracting features using CNN-LSTM hybrid network for intrusion detection system," Security and Communication Networks, vol. 2020, Article ID 8890306, 11 pages, 2020.

[24] G. Farahani, "Feature selection based on cross-correlation for the intrusion detection system," Security & Communication Networks, vol. 2020, Article ID 8875404, 17 pages, 2020.

[25] F. G. Mohammadi, M. H. Amini, and H. R. Arabnia, "Applications of nature-inspired algorithms for dimension Reduction: enabling efficient data analytics," in Advances in Intelligent Systems and Computing, Optimization, Learning, and Control for Interdependent Complex Networks, pp. 67–84, Springer, Cham, Switzerland, 2020.

[26] J. Kennedy and R. Eberhart, "Particle swarm optimization," in Proceedings of the ICNN'95-International Conference on Neural Networks, no. 4, pp. 1942–1948, IEEE, Perth, WA, Australia, December 1995.

[27] M. Dorigo, M. Birattari, and T. Stutzle, "Ant colony optimization," IEEE Computational Intelligence Magazine, vol. 1, no. 4, pp. 28–39, 2006.

[28] R. Rajabioun, "Cuckoo optimization algorithm," Applied Soft Computing, vol. 11, no. 8, pp. 5508–5518, 2011.

[29] M. Neshat, G. Sepidnam, M. Sargolzaei, and A. N. Toosi, "Artificial fish swarm algorithm: a survey of the state-of-the-art, hybridization, combinatorial and indicative applications," Artificial Intelligence Review, vol. 42, no. 4, pp. 965–997, 2014.

[30] D. Karaboga, "An idea based on honey bee swarm for numerical optimization," Technical Report-tr06, Erciyes university, Engineering Faculty, Computer Engineering Department, Kayseri, Turkey, 2005.

[31] W.-T. Pan, "A new Fruit Fly Optimization Algorithm: taking the financial distress model as an example," Knowledge-Based Systems, vol. 26, pp. 69–74, 2012.

[32] R. Zhao and W. Tang, "Monkey algorithm for global numerical optimization," Journal of Uncertain Systems, vol. 2, no. 3, pp. 165–176, 2008.

[33] X. S. Yang and X. He, "Bat algorithm: literature review and applications," International Journal of Bio-Inspired Computation, vol. 5, no. 3, pp. 141–149, 2013.

[34] S. Mirjalili, A. H. Gandomi, S. Z. Mirjalili, S. Saremi, H. Faris, and S. M. Mirjalili, "Salp Swarm Algorithm: a bio-inspired optimizer for engineering design problems," Advances in Engineering Software, vol. 114, pp. 163–191, 2017.

[35] K. Ahmed, A. E. Hassanien, and S. Bhattacharyya, "A novel chaotic chicken swarm optimization algorithm for feature selection," in Proceedings of the 2017 Third International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), pp. 259–264, IEEE, Kolkata, India, November 2017.

[36] S. Tabakhi, P. Moradi, F. Akhlaghian et al., "An unsupervised feature selection algorithm based on ant colony optimization," Engineering Applications of Artificial Intelligence, vol. 32, pp. 112–123, 2014.

[37] S. Arora and P. Anand, "Binary butterfly optimization approaches for feature selection," Expert Systems with Applications, vol. 116, pp. 147–160, 2019.

[38] C. Yan, J. Ma, H. Luo, and A. Patel, "Hybrid binary coral reefs optimization algorithm with simulated annealing for feature selection in high-dimensional biomedical datasets," Chemometrics and Intelligent Laboratory Systems, vol. 184, pp. 102–111, 2019.

[39] G. I. Sayed, A. Tharwat, and A. E. Hassanien, "Chaotic dragonfly algorithm: an improved metaheuristic algorithm for feature selection," Applied Intelligence, vol. 49, no. 1, pp. 188–205, 2019.

[40] Z. Zhang, P. Wei, Y. Li et al., "Feature selection algorithm based on improved particle swarm joint taboo search," Journal of Communication, vol. 39, no. 12, pp. 60–68, 2018.

[41] A. H. Gandomi and A. H. Alavi, "Krill herd: a new bio-inspired optimization algorithm," Communications in Nonlinear Science and Numerical Simulation, vol. 17, no. 12, pp. 4831–4845, 2012.

[42] Q. Tan and Z. Huang, "Krill herd with nearest neighbor lasso operator," Computer Engineering and Applications, vol. 55, no. 9, pp. 124–129, 2019.

[43] Q. Wang, C. Ding, and X. Wang, "A hybrid data clustering algorithm based on improved krill herd algorithm and KHM clustering," Control and Decision, vol. 35, no. 10, pp. 2449–2458, 2018.

[44] Q. Li and B. Liu, "Clustering using an improved krill herd algorithm," Algorithms, vol. 10, no. 2, p. 56, 2017.

[45] G.-G. Wang, A. H. Gandomi, and A. H. Alavi, "Stud krill herd algorithm," Neurocomputing, vol. 128, pp. 363–370, 2014.

[46] J. Li, Y. Tang, C. Hua, and X. Guan, "An improved krill herd algorithm: krill herd with linear decreasing step," Applied Mathematics and Computation, vol. 234, pp. 356–367, 2014.

[47] H. B. Nguyen, B. Xue, P. Andreae et al., "Particle swarm optimisation with genetic operators for feature selection," in Proceedings of the 2017 IEEE Congress on Evolutionary Computation (CEC), pp. 286–293, IEEE, San Sebastian, Spain, June 2017.

[48] M. H. Aghdam and P. Kabiri, "Feature selection for intrusion detection system using ant colony optimization," International Journal of Network Security, vol. 18, no. 3, pp. 420–432, 2016.


equation (27). DR, also known as recall or sensitivity, represents the probability of being correctly detected among all abnormalities, as shown in equation (28). The crossover-mutation PSO (CMPSO) algorithm [47], ACO algorithm [48], KH algorithm [41], and IKH algorithm [9] are used as comparative experiments. The experimental results for the Probe, DoS, R2L, and U2R datasets are as follows.
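As a concrete reading of these evaluation metrics, the sketch below computes accuracy, FPR, and DR from confusion-matrix counts. It uses the standard textbook definitions, which are assumed (not confirmed here) to match equations (27) and (28); the function name and the example counts are hypothetical.

```python
def detection_metrics(tp, tn, fp, fn):
    """Return (accuracy, false positive rate, detection rate) from confusion counts.

    tp: attacks correctly flagged       fn: attacks missed
    tn: normal traffic correctly passed fp: normal traffic wrongly flagged
    """
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    fpr = fp / (fp + tn)  # share of normal records raised as alarms
    dr = tp / (tp + fn)   # recall / sensitivity over all abnormal records
    return accuracy, fpr, dr


# Hypothetical counts for a test set with 100 attack and 100 normal records.
accuracy, fpr, dr = detection_metrics(tp=90, tn=95, fp=5, fn=10)
print(f"accuracy={accuracy:.3f} FPR={fpr:.3f} DR={dr:.3f}")
```

Multiplying the three ratios by 100 gives percentages on the same scale as the values reported in the result tables.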

To reflect the performance of the LNNLS-KH algorithm intuitively, the convergence curves of the fitness function for the Probe, DoS, U2R, and R2L datasets are shown in Figure 9. The results show that the LNNLS-KH algorithm achieves a good fitness function value when the number of iterations reaches about 20, which demonstrates the strong exploitation ability and good convergence performance of the LNNLS-KH algorithm. As the number of iterations increases, the other algorithms show varying degrees of convergence stagnation, while the LNNLS-KH algorithm repeatedly jumps out of local optima and finds the global optimal solution with better fitness. The fitness function values after 100 iterations reach 0.0328, 0.0393, 0.0292, and 0.0036, respectively, for the four attack datasets, showing excellent exploration ability. Therefore, compared with the CMPSO, ACO, KH, and IKH algorithms, the LNNLS-KH algorithm exhibits faster convergence speed and stronger abilities of exploitation and exploration.

The results of the different feature selection algorithms are shown in Table 10. The number in front of the brackets indicates the quantity of features after feature selection, and the specific feature numbers are listed in the brackets. The comparison of feature selection dimensions is shown in Figure 10, and different colours are used to distinguish the five algorithms. The proposed LNNLS-KH algorithm, marked in red, is in the innermost circle of Figure 10 for the Probe, DoS, U2R, and R2L datasets. This indicates that, compared with the other four feature selection algorithms, the LNNLS-KH algorithm retains the fewest features while ensuring accuracy. According to Figure 10, the LNNLS-KH algorithm selects 7 main features of the NSL-KDD dataset on average, accounting

Figure 9: Convergence curve of fitness functions for the four attack datasets (panels: Probe, DoS, U2R, and R2L; x-axis: number of iterations, 0 to 100; y-axis: fitness function; curves: CMPSO, ACO, KH, IKH, and LNNLS-KH).


for 17.07% of the total number of features. Compared with the CMPSO, ACO, KH, and IKH algorithms, the proposed LNNLS-KH algorithm reduces the number of features by 44%, 42.86%, 34.88%, and 24.32%, respectively, in the datasets of the four attack types. Meanwhile, the total number of features in the four types of attack datasets is reduced by 37.43%.

To further evaluate the performance of the feature selection algorithms, we show the feature selection time and detection time of the five different algorithms in Table 11. The feature selection time represents the time spent filtering out redundant features. The detection time represents the time from inputting the most representative feature subsets into the KNN classifier to the end of detection. It can be seen from Table 11 that the feature selection time of the standard KH algorithm is shorter than that of the CMPSO and ACO algorithms, which indicates that the KH algorithm achieves faster speed and better performance. In addition, compared with the standard KH algorithm, the feature selection time of the LNNLS-KH algorithm is longer, which is mainly due to the nonlinear optimization of the physical diffusion motion and the linear nearest neighbor lasso step optimization performed after the krill herd position is updated. Although the feature selection time is somewhat increased, the convergence speed and global search ability are greatly improved. At the same time, the LNNLS-KH algorithm removes redundant features, which considerably increases the detection speed. In comparison with the other four feature selection algorithms, the detection time of the LNNLS-KH algorithm is reduced by 16.83%, 16.91%, 8.94%, and 6.96% on average in the test dataset samples of Probe, DoS, R2L, and U2R.
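The notion of detection time used here (classification on the selected feature subset only, excluding the search for that subset) can be illustrated with a small sketch. This is a plain k-nearest-neighbour classifier written for illustration, not the MATLAB implementation used in the experiments; the value k=3, the helper names, and the toy records are all assumptions.

```python
import time
from collections import Counter


def select(rows, feature_ids):
    """Keep only the listed features; feature numbers are 1-based, as in the tables."""
    return [[row[j - 1] for j in feature_ids] for row in rows]


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k training records closest to the query (Euclidean)."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], query)),
    )[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def timed_detection(train_X, train_y, test_X, feature_ids, k=3):
    """Classify the test records on the selected features and time only that step."""
    reduced_train = select(train_X, feature_ids)
    reduced_test = select(test_X, feature_ids)
    start = time.perf_counter()
    predictions = [knn_predict(reduced_train, train_y, q, k) for q in reduced_test]
    return predictions, time.perf_counter() - start


# Toy records with 4 features; only features 1 and 3 separate the two classes.
train_X = [[0.0, 9, 0.0, 1], [0.1, 1, 0.2, 7], [0.05, 5, 0.1, 3],
           [1.0, 8, 1.0, 2], [0.9, 2, 1.1, 6], [0.95, 4, 0.9, 4]]
train_y = ["normal", "normal", "normal", "attack", "attack", "attack"]
test_X = [[0.02, 7, 0.05, 9], [0.97, 3, 1.0, 0]]

predictions, elapsed = timed_detection(train_X, train_y, test_X, feature_ids=[1, 3])
print(predictions, f"{elapsed:.6f} s")
```

Because each distance computation touches only the selected features, a smaller subset means less work per query, which is the effect the detection-time comparison in Table 11 measures.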

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the test dataset is detected using the KNN classifier. The classification accuracy of the different algorithms is shown in Table 12. Comparing the accuracy results, we find that the LNNLS-KH feature selection algorithm achieves a classification accuracy of above 90% for the Probe, DoS, U2R, and R2L test

Table 9: The number and name of the features in the CICIDS2017 dataset.

No.  Feature name                  No.  Feature name             No.  Feature name
1    Destination port              27   Bwd IAT mean             53   Average packet size
2    Flow duration                 28   Bwd IAT std              54   Avg fwd segment size
3    Total fwd packets             29   Bwd IAT max              55   Avg bwd segment size
4    Total backward packets        30   Bwd IAT min              56   Fwd header length
5    Total length of fwd packets   31   Fwd PSH flags            57   Fwd avg bytes/bulk
6    Total length of bwd packets   32   Bwd PSH flags            58   Fwd avg packets/bulk
7    Fwd packet length max         33   Fwd URG flags            59   Fwd avg bulk rate
8    Fwd packet length min         34   Bwd URG flags            60   Bwd avg bytes/bulk
9    Fwd packet length mean        35   Fwd header length        61   Bwd avg packets/bulk
10   Fwd packet length std         36   Bwd header length        62   Bwd avg bulk rate
11   Bwd packet length max         37   Fwd Packets/s            63   Subflow fwd packets
12   Bwd packet length min         38   Bwd Packets/s            64   Subflow fwd bytes
13   Bwd packet length mean        39   Min packet length        65   Subflow bwd packets
14   Bwd packet length std         40   Max packet length        66   Subflow bwd bytes
15   Flow bytes/s                  41   Packet length mean       67   Init_Win_bytes_forward
16   Flow packets/s                42   Packet length std        68   Init_Win_bytes_backward
17   Flow IAT mean                 43   Packet length variance   69   act_data_pkt_fwd
18   Flow IAT std                  44   FIN flag count           70   min_seg_size_forward
19   Flow IAT max                  45   SYN flag count           71   Active mean
20   Flow IAT min                  46   RST flag count           72   Active std
21   Fwd IAT total                 47   PSH flag count           73   Active max
22   Fwd IAT mean                  48   ACK flag count           74   Active min
23   Fwd IAT std                   49   URG flag count           75   Idle mean
24   Fwd IAT max                   50   CWE flag count           76   Idle std
25   Fwd IAT min                   51   ECE flag count           77   Idle max
26   Bwd IAT total                 52   Down/up ratio            78   Idle min

Figure 10: Comparison of feature selection dimensions produced by different algorithms (axes: Probe, DoS, U2R, and R2L; scale: 0 to 20 features; curves: CMPSO, ACO, KH, IKH, and LNNLS-KH).


dataset samples. Furthermore, the LNNLS-KH algorithm improves the average classification accuracy of the Probe, DoS, U2R, and R2L test dataset samples by 9.95%, 12.04%, 9.47%, and 8.66%, respectively.

Table 13 shows the false positive rate and detection rateof feature subset produced by different feature selectionalgorithms To visualize the difference we show the

comparison in Figure 11 For Probe DoS U2R and R2Ldatasets the average false positive rate of LNNLS-KH featureselection algorithm is 400 It reduces by 2070 1530888 and 334 respectively compared with CMPSOACO and IKH algorithms Similarly for the detection ratethe proposed LNNLS-KH feature selection algorithm ex-hibits excellent performance 2e average detection rate of

Table 10 2e feature selection results of different feature selection algorithms (NSL-KDD dataset)

Datacategories CMPSO ACO KH IKH LNNLS-KH

Probe 14 (2 3 4 7 8 10 11 17 1920 21 27 30 33)

15 (1 3 4 6 15 16 17 1921 23 29 35 39 40 41)

13 (3 4 5 7 8 1314 18 19 21 26 28

40)

11 (2 3 5 8 10 1718 29 34 35 41)

8 (3 4 8 11 15 2934 40)

DoS 16 (3 4 5 6 8 13 14 17 1822 23 26 30 32 35 41)

16 (3 4 7 12 14 19 20 2527 28 30 33 34 37 40 41)

12 (2 3 4 5 8 9 1215 19 24 26 30)

12 (2 3 4 6 12 1820 22 27 28 30 31)

10 (3 4 6 15 1719 20 21 30 37)

U2R 9 (3 4 5 9 12 19 32 3341) 8 (3 4 6 8 20 24 33 36) 8 (3 4 10 12 19 23

31 32)6 (3 10 11 21 36

39) 3 (3 33 36)

R2L 11 (2 3 4 8 21 22 25 2737 40 41)

10 (3 4 7 12 17 21 29 3738 40)

10 (2 3 4 6 13 1819 22 32 41)

8 (3 4 5 8 11 1421 31)

7 (2 3 4 10 15 2136)

Table 11 Feature selection time and detection time of different feature selection algorithms (NSL-KDD dataset)

Data categoriesTime of feature selection (second) Time of detection (second)

CMPSO ACO KH IKH LNNLS-KH CMPSO ACO KH IKH LNNLS-KHProbe 523178 499814 474533 534887 549048 3713 3823 3530 3405 3106DoS 789235 763086 716852 803816 829692 11869 11815 10666 10514 9844U2R 15487 14729 14418 15779 17224 0087 0086 0086 0086 0078R2L 255675 236908 224092 266951 272770 955 913 907 862 803

Table 12 2e classification accuracy of different feature selection algorithms (NSL-KDD dataset)

Data categories CMPSO () ACO () KH () IKH () LNNLS-KH ()Probe 8046 8656 9242 9374 9824DoS 8174 8336 8603 8874 9701U2R 8274 8457 8559 9189 9567R2L 7870 8162 8878 9049 9356

05

101520253035

Probe DoS U2R R2L

FPR

()

CMPSOACOKH

IKHLNNLS-KH

(a)

CMPSOACOKH

IKHLNNLS-KH

0

20

40

60

80

100

Probe DoS U2R R2L

DR

()

(b)

Figure 11 Comparison of classification FPR and DR of different feature selection algorithms (a) FPR of different feature selectionalgorithms (b) DR of different feature selection algorithms

Security and Communication Networks 17

the LNNLS-KH algorithm is 96.48%, which is 13.47%, 9.32%, 7.02%, and 4.72% higher than the CMPSO, ACO, KH, and IKH feature selection algorithms, respectively.

In conclusion, the LNNLS-KH feature selection algorithm performs excellently in terms of the global optimal fitness iteration curve, test set detection time, number of dimensions of the feature subset, classification accuracy, false positive rate, and detection rate. Although the offline training time of the LNNLS-KH algorithm is longer than that of the CMPSO, ACO, KH, and IKH algorithms, its lower feature dimension reduces the detection time. Moreover, the algorithm has faster convergence speed, higher detection accuracy, a lower classification false positive rate, and a higher detection rate.
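For readers who want to reproduce the three evaluation metrics, the sketch below computes accuracy, detection rate (DR), and false positive rate (FPR) from confusion-matrix counts. It uses the standard definitions, which are assumed here to match the ones used in the experiments; the function name and the example counts are hypothetical.

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard intrusion detection metrics from confusion-matrix counts.

    tp: attacks flagged as attacks      fn: attacks missed
    fp: normal records flagged          tn: normal records passed
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "detection_rate": tp / (tp + fn),        # share of attacks detected
        "false_positive_rate": fp / (fp + tn),   # share of normal traffic flagged
    }


# Hypothetical counts, for illustration only (not taken from the paper).
example = detection_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in example.items():
    print(f"{name}: {100 * value:.2f}%")
```

A classifier improves when DR rises and FPR falls at the same time, which is why the two are reported side by side in Table 13.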

4.3. Experimental Results and Discussion of CICIDS2017 Dataset. The experiment is conducted in MATLAB R2016a on a Windows 64-bit operating system with an Intel(R) Core(TM) i7-4790 processor. The MachineLearningCVE file in the CICIDS2017 dataset includes 8 CSV files of all traffic data, which contain 78 features plus an attack type tag after removing some duplicate features. We annotate traffic records according to different attack periods and types, and standardize and normalize the dataset. Due to the excessive amount of data contained in the analyzed CSV files, problems such as excessively long training time and a slow convergence rate of the model occur when the host is used for model training. Therefore, we simplified and reintegrated these CSV data files while preserving the original attack timing features. We selected a total of 12090 records and 5 types of traffic, including 1 type of normal traffic and 4 types of attack traffic: “DoS,” “DDoS,” “PortScan,” and “WebAttack.” The data are randomly divided into training sets and test sets in a 2 : 1 ratio, with independent and repeated experiments.

The CMPSO, ACO, KH, and IKH algorithms are used as the comparison for the LNNLS-KH algorithm. The preprocessed Normal, DoS, DDoS, PortScan, and WebAttack subsets are input into the algorithm model successively, and the dimension and feature subsets of feature selection are obtained. We adopt the KNN classification model as the classifier and obtain the accuracy of intrusion detection on the test set data. The results of feature selection dimension for the CICIDS2017 dataset are shown in Table 14. According to different attack types, the LNNLS-KH algorithm selects different features. For example, the selected features of the DoS subset are “Total Length of Bwd Packets,” “Fwd Packet Length Min,” “Flow IAT Min,” “FIN Flag Count,” “RST Flag Count,” “URG Flag Count,” “Bwd Avg Packets/Bulk,” “Idle Mean,” and “Idle Std.” For the WebAttack subset, “Total Fwd Packets,” “Bwd IAT Max,” “Bwd PSH Flags,” “Fwd Packets/s,” “Bwd Avg Packets/Bulk,” “Subflow Fwd Bytes,” “Active Max,” and “Idle Max” are selected as attack features by the LNNLS-KH algorithm. It reduces the feature dimension of the IDS dataset while ensuring high accuracy. The average feature dimension selected by the LNNLS-KH algorithm is 10.2, accounting for 13.08% of the total number of features in the CICIDS2017 dataset. It decreases the number of features by 57.85%, 52.34%, 27.14%, and 25%, respectively, compared with the CMPSO, ACO, KH, and IKH algorithms.

Table 13: The classification FPR and DR of different feature selection algorithms (NSL-KDD dataset).

False positive rate (FPR) (%)
Data categories  CMPSO  ACO    KH     IKH    LNNLS-KH
Probe            22.37  18.04  8.50   4.05   1.18
DoS              21.27  14.08  11.45  7.88   2.85
U2R              24.51  21.04  16.13  8.45   4.30
R2L              30.66  24.05  15.42  8.99   7.67

Detection rate (DR) (%)
Data categories  CMPSO  ACO    KH     IKH    LNNLS-KH
Probe            82.32  89.18  95.01  95.22  97.73
DoS              79.12  82.08  83.77  85.23  96.80
U2R              87.02  89.79  90.14  93.67  95.52
R2L              83.56  87.56  88.91  92.89  95.85

Figure 12: Comparison of feature selection time and intrusion detection time for different feature selection algorithms (CMPSO, ACO, KH, IKH, and LNNLS-KH) on the Normal, DoS, DDoS, PortScan, and WebAttack subsets. (a) Feature selection time (second) for different feature selection algorithms. (b) Intrusion detection time (second) of different feature selection algorithms.
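The evaluation protocol for the CICIDS2017 experiments (normalization, a random 2 : 1 split into training and test sets, and a KNN classifier restricted to the selected feature subset) can be sketched in a few lines. The original experiments were run in MATLAB; the Python version below is only an illustration on synthetic stand-in data. The function names, the choice k = 3, the fixed seeds, and the min-max form of normalization are all assumptions, not details taken from the paper.

```python
import math
import random
from collections import Counter


def min_max_normalize(rows):
    """Scale each column to [0, 1]; a constant column maps to 0.0."""
    lows = [min(col) for col in zip(*rows)]
    highs = [max(col) for col in zip(*rows)]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def split_two_to_one(rows, labels, seed=0):
    """Randomly split the records into training and test sets in a 2:1 ratio."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    cut = 2 * len(order) // 3
    train_idx, test_idx = order[:cut], order[cut:]
    train = ([rows[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([rows[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test


def knn_predict(train_x, train_y, query, selected, k=3):
    """Majority vote of the k nearest training records on the selected columns."""
    def distance(row):
        return math.sqrt(sum((row[j] - query[j]) ** 2 for j in selected))
    nearest = sorted(range(len(train_x)), key=lambda i: distance(train_x[i]))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def knn_accuracy(train, test, selected, k=3):
    """Fraction of test records classified correctly."""
    (train_x, train_y), (test_x, test_y) = train, test
    hits = sum(
        knn_predict(train_x, train_y, q, selected, k) == y
        for q, y in zip(test_x, test_y)
    )
    return hits / len(test_y)


# Synthetic stand-in data: column 0 separates the classes, columns 1-2 are noise.
rng = random.Random(42)
labels = [i % 2 for i in range(90)]
rows = min_max_normalize(
    [[10 * y + rng.random(), rng.random(), rng.random()] for y in labels]
)
train, test = split_two_to_one(rows, labels)
accuracy_on_subset = knn_accuracy(train, test, selected=[0])
print(len(train[0]), len(test[0]), accuracy_on_subset)
```

Because the distance is computed only over the `selected` columns, the cost of classifying one record grows with the size of the feature subset, which is the mechanism behind the shorter detection times reported for smaller subsets.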

Figure 12 shows the feature selection time and intrusion detection time of the 5 feature selection algorithms to further evaluate their performance. It can be seen from Figure 12(a) that, in the feature selection stage, the LNNLS-KH algorithm takes longer to find the optimal feature subset because of the linear nearest neighbor lasso step optimization performed after the position update of the krill herd. Compared with the KH and IKH algorithms, it increases the time by an average of 14.38% and 9.32%, respectively. Although the LNNLS-KH algorithm requires more calculation time, its convergence speed and global search ability are improved. Figure 12(b) shows the intrusion detection time of the 5 feature selection algorithms, that is, the time taken by the KNN classifier to detect the sample dataset after the feature subset has been found, excluding the time of searching for the optimal feature subset. The feature dimension of the LNNLS-KH algorithm is low, and the amount of data processed in the classification of the detection sample dataset is small, which results in a reduction of the classification detection time. Compared with the CMPSO, ACO, KH, and IKH algorithms, the intrusion detection time of the LNNLS-KH algorithm is reduced by 6.52%, 5.17%, 2.14%, and 2.28% on average, respectively.

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the KNN classifier is used to detect the test dataset. The classification accuracy of different algorithms is shown in Table 15. For the five types of subsets, the average classification accuracy of the proposed LNNLS-KH algorithm is 95.86%. In particular, the classification accuracy reached 97.55% for the PortScan subset. Compared with the other four feature selection methods, the LNNLS-KH algorithm has an average increase of 3.11%, 8.52%, 8.58%, 2.45%, and 4.29% on the Normal, DoS, DDoS, PortScan, and WebAttack subsets, respectively. Table 16 shows the classification FPR and DR of different feature selection algorithms on the test sets. Based on the detection of the five different test sets, the LNNLS-KH algorithm has lower FPR and higher DR than the other four algorithms.

Table 14: The number of features selected by different algorithms (CICIDS2017 dataset). Each entry gives the number of selected features, followed by the indices of the selected features in parentheses.

Normal
  CMPSO:    28 (3, 7, 13, 15, 16, 17, 20, 22, 24, 26, 30, 35, 37, 38, 42, 43, 44, 45, 46, 49, 50, 56, 59, 62, 63, 64, 65, 76)
  ACO:      25 (1, 3, 4, 7, 10, 11, 12, 13, 15, 19, 29, 32, 34, 35, 37, 43, 46, 47, 51, 55, 56, 58, 73, 76, 78)
  KH:       14 (11, 19, 33, 39, 43, 49, 55, 56, 58, 65, 66, 68, 71, 73)
  IKH:      14 (5, 10, 19, 20, 21, 23, 27, 33, 43, 56, 69, 70, 73, 78)
  LNNLS-KH: 8 (6, 12, 16, 32, 38, 50, 54, 73)

DoS
  CMPSO:    24 (1, 3, 4, 13, 16, 17, 24, 26, 30, 33, 35, 39, 40, 44, 48, 51, 53, 57, 58, 59, 60, 62, 67, 70)
  ACO:      19 (3, 6, 12, 13, 15, 26, 35, 39, 51, 55, 60, 61, 66, 69, 71, 73, 75, 77, 78)
  KH:       13 (8, 16, 21, 30, 45, 50, 52, 57, 59, 63, 66, 67)
  IKH:      14 (2, 12, 15, 16, 19, 21, 32, 34, 44, 46, 65, 68, 76, 77)
  LNNLS-KH: 9 (6, 8, 20, 44, 46, 49, 61, 75, 76)

DDoS
  CMPSO:    29 (15, 18, 19, 20, 23, 25, 26, 33, 34, 35, 38, 39, 42, 43, 46, 47, 49, 51, 55, 56, 57, 59, 60, 61, 62, 63, 71, 72, 78)
  ACO:      27 (6, 9, 10, 13, 16, 19, 24, 28, 31, 41, 42, 45, 47, 48, 50, 51, 52, 53, 54, 56, 59, 60, 61, 62, 65, 68, 72)
  KH:       21 (10, 12, 13, 15, 18, 23, 27, 30, 34, 35, 41, 42, 45, 55, 61, 63, 65, 66, 68, 70, 76)
  IKH:      18 (1, 11, 13, 14, 19, 24, 32, 35, 36, 40, 42, 47, 51, 57, 60, 69, 70, 75)
  LNNLS-KH: 14 (2, 5, 8, 9, 11, 22, 26, 33, 41, 43, 47, 51, 74, 77)

PortScan
  CMPSO:    24 (1, 3, 6, 15, 16, 28, 30, 33, 35, 37, 44, 45, 52, 56, 59, 60, 61, 63, 65, 68, 70, 75, 77, 78)
  ACO:      21 (1, 2, 6, 10, 15, 17, 26, 27, 29, 39, 42, 43, 46, 49, 58, 61, 66, 69, 70, 71, 76)
  KH:       14 (15, 20, 22, 27, 37, 44, 49, 50, 53, 59, 62, 65, 67, 78)
  IKH:      15 (1, 24, 30, 32, 33, 43, 49, 53, 54, 58, 60, 61, 63, 64, 69)
  LNNLS-KH: 12 (2, 6, 15, 24, 25, 28, 32, 57, 59, 63, 66, 76)

WebAttack
  CMPSO:    16 (2, 7, 26, 29, 45, 47, 50, 52, 53, 54, 63, 66, 68, 69, 72, 78)
  ACO:      15 (3, 9, 10, 12, 19, 26, 40, 46, 50, 54, 64, 65, 68, 69, 73)
  KH:       8 (1, 17, 19, 36, 48, 49, 53, 60)
  IKH:      7 (14, 17, 35, 39, 44, 48, 54)
  LNNLS-KH: 8 (3, 29, 32, 37, 61, 64, 73, 77)

Table 15: The classification accuracy of different feature selection algorithms (CICIDS2017 dataset).

Data categories  CMPSO (%)  ACO (%)  KH (%)  IKH (%)  LNNLS-KH (%)
Normal           89.78      89.06    92.70   94.58    94.64
DoS              77.03      82.69    90.90   93.34    94.51
DDoS             81.73      86.94    91.85   88.19    95.76
PortScan         92.38      95.64    95.05   97.35    97.55
WebAttack        89.12      93.08    93.77   94.26    96.85

Table 16: The classification FPR and DR of different feature selection algorithms (CICIDS2017 dataset).

False positive rate (FPR) (%)
Data categories  CMPSO  ACO   KH    IKH   LNNLS-KH
Normal           9.25   8.72  6.41  4.93  3.67
DoS              5.41   4.48  4.06  2.83  1.94
DDoS             6.85   4.92  4.54  6.33  3.18
PortScan         4.65   3.02  2.84  1.86  1.16
WebAttack        5.33   3.16  2.52  2.11  1.60

Detection rate (DR) (%)
Data categories  CMPSO  ACO    KH     IKH    LNNLS-KH
Normal           88.05  88.51  89.25  92.46  93.89
DoS              72.57  82.89  87.86  92.56  92.64
DDoS             79.03  83.47  90.22  87.52  92.98
PortScan         88.25  93.80  94.33  95.14  95.42
WebAttack        87.40  91.35  92.19  92.94  94.77
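The claim that LNNLS-KH has a lower FPR and a higher DR than the other four algorithms on every CICIDS2017 subset can be checked mechanically against Table 16. The sketch below does so and also reports the margin to the best competing algorithm on each subset; the names `FPR`, `DR`, and `margins` are illustrative.

```python
# Table 16 values in percent.
# Column order in every row: CMPSO, ACO, KH, IKH, LNNLS-KH.
FPR = {
    "Normal":    [9.25, 8.72, 6.41, 4.93, 3.67],
    "DoS":       [5.41, 4.48, 4.06, 2.83, 1.94],
    "DDoS":      [6.85, 4.92, 4.54, 6.33, 3.18],
    "PortScan":  [4.65, 3.02, 2.84, 1.86, 1.16],
    "WebAttack": [5.33, 3.16, 2.52, 2.11, 1.60],
}
DR = {
    "Normal":    [88.05, 88.51, 89.25, 92.46, 93.89],
    "DoS":       [72.57, 82.89, 87.86, 92.56, 92.64],
    "DDoS":      [79.03, 83.47, 90.22, 87.52, 92.98],
    "PortScan":  [88.25, 93.80, 94.33, 95.14, 95.42],
    "WebAttack": [87.40, 91.35, 92.19, 92.94, 94.77],
}


def margins(table, lower_is_better):
    """Gap between LNNLS-KH (last column) and the best competing algorithm."""
    result = {}
    for subset, values in table.items():
        *others, proposed = values
        if lower_is_better:
            result[subset] = min(others) - proposed
        else:
            result[subset] = proposed - max(others)
    return result


fpr_margin = margins(FPR, lower_is_better=True)
dr_margin = margins(DR, lower_is_better=False)
dominates = all(m > 0 for m in fpr_margin.values()) and all(
    m > 0 for m in dr_margin.values()
)

print("LNNLS-KH best on every subset:", dominates)
for subset in FPR:
    print(f"{subset}: FPR margin {fpr_margin[subset]:.2f}, "
          f"DR margin {dr_margin[subset]:.2f} (percentage points)")
```

The margins vary considerably between subsets, so the per-subset view is a useful complement to the averaged figures.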

We propose the LNNLS-KH algorithm, a novel feature selection algorithm for intrusion detection. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the algorithm has good feature selection performance and improves the efficiency of intrusion detection.

5. Conclusions

With the rapid development of network technology, intrusion detection plays an increasingly important role in network security. However, the “dimensional disaster” caused by massive data results in problems such as slow response and poor accuracy of the intrusion detection system. The KH algorithm is a new population-based swarm intelligence optimization method that shows good performance in high-dimensional data processing, providing a new approach for reducing the dimension of intrusion detection data and selecting useful features. In this paper, an improved KH algorithm named LNNLS-KH is proposed for feature selection of IDS datasets by linear nearest neighbor lasso step optimization. The LNNLS-KH algorithm introduces a new fitness function, which is composed of the number of selected feature dimensions and the classification accuracy. Nonlinear optimization is introduced into the physical diffusion motion of krill individuals to accelerate the convergence speed of the algorithm. Moreover, the linear nearest neighbor lasso step optimization is proposed to balance the exploration and exploitation abilities and obtain the global optimal solution of the feature subset effectively. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the LNNLS-KH algorithm retains 7 and 10.2 features on average, respectively, which greatly reduces the dimension of the features. In the NSL-KDD dataset, features are reduced by 44%, 42.86%, 34.88%, and 24.32% compared with the CMPSO, ACO, KH, and IKH algorithms, and in the CICIDS2017 dataset, they are reduced by 57.85%, 52.34%, 27.14%, and 25%, respectively. In addition, the classification accuracy of the LNNLS-KH feature selection algorithm is increased by 10.03% and 5.39%, and the time of intrusion detection is reduced by 12.41% and 4.03% on the two datasets. Furthermore, the LNNLS-KH algorithm enhances the ability to jump out of the local optimal solution and shows good performance in the optimal fitness iteration curve, false positive rate of detection, and convergence speed, which demonstrates that the proposed LNNLS-KH algorithm is an efficient feature selection method for network intrusion detection.
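A fitness function that combines the number of selected features with the classification accuracy can take several forms. The exact expression and weights used by LNNLS-KH are defined in the method section and are not reproduced here; the sketch below shows one common weighted-sum form from wrapper feature selection, purely as an assumption. The weight `alpha`, its default value, and the function names are hypothetical.

```python
def fitness(error_rate, n_selected, n_total, alpha=0.9):
    """Weighted sum to minimise: classification error plus relative subset size.

    alpha is a hypothetical trade-off weight; a value near 1 favours accuracy,
    a value near 0 favours small feature subsets.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if not 0 < n_selected <= n_total:
        raise ValueError("n_selected must lie in (0, n_total]")
    return alpha * error_rate + (1.0 - alpha) * n_selected / n_total


def mask_fitness(mask, error_rate, alpha=0.9):
    """Fitness of a binary feature mask (1 = feature kept, 0 = feature dropped)."""
    return fitness(error_rate, sum(mask), len(mask), alpha)


# Two hypothetical candidates with the same error rate on a 78-feature dataset:
compact = fitness(error_rate=0.05, n_selected=10, n_total=78)
large = fitness(error_rate=0.05, n_selected=28, n_total=78)
print(f"compact subset: {compact:.4f}, large subset: {large:.4f}")
```

With equal error rates the smaller subset always scores better under this form, which is the behaviour a fitness function of this kind is meant to encourage.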

In this research, we realized that the initialization of the LNNLS-KH algorithm has a certain degree of randomness. Therefore, we conducted independent and repeated experiments to mitigate this problem, and the results were reasonable and convincing. Although the proposed algorithm shows encouraging performance, it could be further improved.

In future work, we will consider using data balancing techniques to preprocess the experimental dataset to obtain more accurate feature selection results and stronger algorithm stability. Meanwhile, we will combine the LNNLS-KH algorithm with other algorithms to improve the exploration and exploitation abilities, thereby further shortening the time of feature subset training and classification detection. In addition, as the LNNLS-KH algorithm is generally applicable, it can be applied to more feature selection systems and used to solve optimization problems in other fields.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was sponsored by the National Key Research and Development Program of China (Grants 2018YFB0804002 and 2017YFB0803204), National Natural Science Foundation of P.R. China (Grant 72001191), Henan Natural Science Foundation (Grant 202300410442), and Henan Philosophy and Social Science Program (Grant 2020CZH009).



for 17.07% of the total number of features. Compared with the CMPSO, ACO, KH, and IKH algorithms, the proposed LNNLS-KH algorithm reduces the number of features by 44%, 42.86%, 34.88%, and 24.32%, respectively, in the datasets of the four attack types. Meanwhile, the total number of features in the four types of attack datasets is reduced by 37.43%.

To further evaluate the performance of the feature selection algorithms, we show the feature selection time and detection time of the five algorithms in Table 11. Feature selection time represents the time of filtering out redundant features. The detection time represents the time from inputting the most representative feature subsets into the KNN classifier to the end of detection. It can be seen from Table 11 that the feature selection time of the standard KH algorithm is shorter than that of the CMPSO and ACO algorithms, which indicates that the KH algorithm achieves faster speed and better performance. In addition, compared with the standard KH algorithm, the feature selection time of the LNNLS-KH algorithm is longer, which is mainly due to the nonlinear optimization of the physical diffusion motion and the linear nearest neighbor lasso step optimization performed after the krill herd position is updated. Although the feature selection time increases, the convergence speed and global search ability are greatly improved. At the same time, the LNNLS-KH algorithm removes redundant features, which considerably increases the detection speed. In comparison with the other four feature selection algorithms, the detection time of the LNNLS-KH algorithm is reduced by 16.83%, 16.91%, 8.94%, and 6.96%, respectively, on average over the Probe, DoS, R2L, and U2R test dataset samples.

Figure 10: Comparison of feature selection dimensions produced by different algorithms (CMPSO, ACO, KH, IKH, and LNNLS-KH) for the Probe, DoS, U2R, and R2L categories.

Table 9: The number and name of the features in the CICIDS2017 dataset.

No.  Feature name                  No.  Feature name             No.  Feature name
1    Destination port              27   Bwd IAT mean             53   Average packet size
2    Flow duration                 28   Bwd IAT std              54   Avg fwd segment size
3    Total fwd packets             29   Bwd IAT max              55   Avg bwd segment size
4    Total backward packets        30   Bwd IAT min              56   Fwd header length
5    Total length of fwd packets   31   Fwd PSH flags            57   Fwd avg bytes/bulk
6    Total length of bwd packets   32   Bwd PSH flags            58   Fwd avg packets/bulk
7    Fwd packet length max         33   Fwd URG flags            59   Fwd avg bulk rate
8    Fwd packet length min         34   Bwd URG flags            60   Bwd avg bytes/bulk
9    Fwd packet length mean        35   Fwd header length        61   Bwd avg packets/bulk
10   Fwd packet length std         36   Bwd header length        62   Bwd avg bulk rate
11   Bwd packet length max         37   Fwd Packets/s            63   Subflow fwd packets
12   Bwd packet length min         38   Bwd Packets/s            64   Subflow fwd bytes
13   Bwd packet length mean        39   Min packet length        65   Subflow bwd packets
14   Bwd packet length std         40   Max packet length        66   Subflow bwd bytes
15   Flow bytes/s                  41   Packet length mean       67   Init_Win_bytes_forward
16   Flow packets/s                42   Packet length std        68   Init_Win_bytes_backward
17   Flow IAT mean                 43   Packet length variance   69   act_data_pkt_fwd
18   Flow IAT std                  44   FIN flag count           70   min_seg_size_forward
19   Flow IAT max                  45   SYN flag count           71   Active mean
20   Flow IAT min                  46   RST flag count           72   Active std
21   Fwd IAT total                 47   PSH flag count           73   Active max
22   Fwd IAT mean                  48   ACK flag count           74   Active min
23   Fwd IAT std                   49   URG flag count           75   Idle mean
24   Fwd IAT max                   50   CWE flag count           76   Idle std
25   Fwd IAT min                   51   ECE flag count           77   Idle max
26   Bwd IAT total                 52   Down/up ratio            78   Idle min

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the test dataset is detected using the KNN classifier. The classification accuracy of different algorithms is shown in Table 12. Comparing the accuracy results, it is found that the LNNLS-KH feature selection algorithm achieves a classification accuracy above 90% for the Probe, DoS, U2R, and R2L test dataset samples. Furthermore, the LNNLS-KH algorithm improves the average classification accuracy of the Probe, DoS, U2R, and R2L test dataset samples by 9.95%, 12.04%, 9.47%, and 8.66%, respectively.

Table 13 shows the false positive rate and detection rate of the feature subsets produced by different feature selection algorithms. To visualize the difference, we show the comparison in Figure 11. For the Probe, DoS, U2R, and R2L datasets, the average false positive rate of the LNNLS-KH feature selection algorithm is 4.00%. It is 20.70%, 15.30%, 8.88%, and 3.34% lower than that of the CMPSO, ACO, KH, and IKH algorithms, respectively. Similarly, for the detection rate, the proposed LNNLS-KH feature selection algorithm exhibits excellent performance. The average detection rate of

Table 10 2e feature selection results of different feature selection algorithms (NSL-KDD dataset)

Datacategories CMPSO ACO KH IKH LNNLS-KH

Probe 14 (2 3 4 7 8 10 11 17 1920 21 27 30 33)

15 (1 3 4 6 15 16 17 1921 23 29 35 39 40 41)

13 (3 4 5 7 8 1314 18 19 21 26 28

40)

11 (2 3 5 8 10 1718 29 34 35 41)

8 (3 4 8 11 15 2934 40)

DoS 16 (3 4 5 6 8 13 14 17 1822 23 26 30 32 35 41)

16 (3 4 7 12 14 19 20 2527 28 30 33 34 37 40 41)

12 (2 3 4 5 8 9 1215 19 24 26 30)

12 (2 3 4 6 12 1820 22 27 28 30 31)

10 (3 4 6 15 1719 20 21 30 37)

U2R 9 (3 4 5 9 12 19 32 3341) 8 (3 4 6 8 20 24 33 36) 8 (3 4 10 12 19 23

31 32)6 (3 10 11 21 36

39) 3 (3 33 36)

R2L 11 (2 3 4 8 21 22 25 2737 40 41)

10 (3 4 7 12 17 21 29 3738 40)

10 (2 3 4 6 13 1819 22 32 41)

8 (3 4 5 8 11 1421 31)

7 (2 3 4 10 15 2136)

Table 11 Feature selection time and detection time of different feature selection algorithms (NSL-KDD dataset)

Data categoriesTime of feature selection (second) Time of detection (second)

CMPSO ACO KH IKH LNNLS-KH CMPSO ACO KH IKH LNNLS-KHProbe 523178 499814 474533 534887 549048 3713 3823 3530 3405 3106DoS 789235 763086 716852 803816 829692 11869 11815 10666 10514 9844U2R 15487 14729 14418 15779 17224 0087 0086 0086 0086 0078R2L 255675 236908 224092 266951 272770 955 913 907 862 803

Table 12 2e classification accuracy of different feature selection algorithms (NSL-KDD dataset)

Data categories CMPSO () ACO () KH () IKH () LNNLS-KH ()Probe 8046 8656 9242 9374 9824DoS 8174 8336 8603 8874 9701U2R 8274 8457 8559 9189 9567R2L 7870 8162 8878 9049 9356

05

101520253035

Probe DoS U2R R2L

FPR

()

CMPSOACOKH

IKHLNNLS-KH

(a)

CMPSOACOKH

IKHLNNLS-KH

0

20

40

60

80

100

Probe DoS U2R R2L

DR

()

(b)

Figure 11 Comparison of classification FPR and DR of different feature selection algorithms (a) FPR of different feature selectionalgorithms (b) DR of different feature selection algorithms

Security and Communication Networks 17

the LNNLS-KH algorithm is 9648 which is 1347932 702 and 472 higher than the CMPSO ACOKH and IKH feature selection algorithms respectively

In conclusion LNNLS-KH feature selection algorithmperforms excellent in the global optimal fitness iterationcurve test set detection time number of dimensions offeature subset classification accuracy false positive rate anddetection rate Although the offline training time of theLNNLS-KH algorithm is longer than the CMPSO ACOKH and IKH algorithms its lower feature dimension re-duces the detection time Moreover the algorithm has fasterconvergence speed higher detection accuracy and lowerclassification false positive rate and detection rate

43 Experimental Results and Discussion of CICIDS2017Dataset 2e experiment is conducted in MATLAB R2016aon Windows 64 bit operating system with the processor ofIntel (R) core (TM) i7-4790 2e MachineLearningCVE filein the CICIDS2017 dataset includes 8 csv files of all trafficdata which contain 78 features plus an attack type tag byremoving some duplicate features We annotate trafficrecords according to different attack periods and types andstandardize and normalize the dataset Due to the excessiveamount of data contained in the analyzed CSV file problemssuch as excessively long time consuming and slow con-vergence rate of the model will occur when the host is usedfor model training2erefore we simplified and reintegratedthese CSV data files while preserving the original attack

timing features We selected a total of 12090 records and 5types of traffic including 1 type of normal traffic and 4 typesof attack traffic respectively ldquoDoSrdquo ldquoDDoSrdquo ldquoPortScanrdquoand ldquoWebAttackrdquo 2e data are randomly divided intotraining sets and test sets in a 2 1 ratio with independent andrepeated experiments

CMPSO ACO KH and IKH algorithms are used as thecomparison of LNNLS-KH algorithm 2e preprocessedNormal DoS DDoS PortScan and WebAttack subsets areinput into the algorithm model successively and the di-mension and feature subsets of feature selection are ob-tained We adopt the KNN classification model as theclassifier and get the accuracy of intrusion detectionthrough test set data 2e results of feature selection di-mension for the CICIDS2017 dataset are shown in Table 14According to different attack types LNNLS-KH algorithmselects different features For example the selected featuresof DOS subset are ldquoTotal Length of Bwd Packetsrdquo ldquoFwdPacket Length Minrdquo ldquoFlow IAT Minrdquo ldquoFIN Flag CountrdquoldquoRST Flag Countrdquo ldquoURG PacketsBulkrdquo ldquoBwd AvgPacketsBulkrdquo ldquoIdle Meanrdquo and ldquoIdle Stdrdquo For WebAttacksubset ldquoTotal Fwd Packetsrdquo ldquoBwd IAT Maxrdquo ldquoBwd PSHFlagsrdquo ldquoFwd Packetssrdquo ldquoBwd Avg PacketsBulkrdquo ldquoSubflowFwd Bytesrdquo ldquoActive Maxrdquo and ldquoIdle Maxrdquo are selected asattack features by LNNLS-KH algorithm It reduces thefeature dimension of IDS dataset while ensuring high ac-curacy 2e average feature dimension selected by LNNLS-KH algorithm is 102 accounting for 1308 of the totalnumber of features in CICIIDS2017 dataset It decreases the

Table 13: The classification FPR and DR of different feature selection algorithms (NSL-KDD dataset).

False positive rate (FPR) (%)
Data categories   CMPSO   ACO     KH      IKH     LNNLS-KH
Probe             22.37   18.04   8.50    4.05    1.18
DoS               21.27   14.08   11.45   7.88    2.85
U2R               24.51   21.04   16.13   8.45    4.30
R2L               30.66   24.05   15.42   8.99    7.67

Detection rate (DR) (%)
Data categories   CMPSO   ACO     KH      IKH     LNNLS-KH
Probe             82.32   89.18   95.01   95.22   97.73
DoS               79.12   82.08   83.77   85.23   96.80
U2R               87.02   89.79   90.14   93.67   95.52
R2L               83.56   87.56   88.91   92.89   95.85

Figure 12: Comparison of feature selection time and intrusion detection time for different feature selection algorithms (CMPSO, ACO, KH, IKH, and LNNLS-KH) on the Normal, DoS, DDoS, PortScan, and WebAttack subsets. (a) Feature selection time (seconds) for different feature selection algorithms. (b) Intrusion detection time (seconds) of different feature selection algorithms.


number of features by 57.85%, 52.34%, 27.14%, and 25%, respectively, compared with the CMPSO, ACO, KH, and IKH algorithms.
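These percentages follow directly from the per-subset feature counts listed in Table 14. The short Python check below is an illustrative sketch; the variable names are ours, and the counts are copied from Table 14 in the order Normal, DoS, DDoS, PortScan, WebAttack.

```python
# Number of selected features per subset, copied from Table 14
# (order: Normal, DoS, DDoS, PortScan, WebAttack).
counts = {
    "CMPSO":    [28, 24, 29, 24, 16],
    "ACO":      [25, 19, 27, 21, 15],
    "KH":       [14, 13, 21, 14, 8],
    "IKH":      [14, 14, 18, 15, 7],
    "LNNLS-KH": [8, 9, 14, 12, 8],
}
TOTAL_FEATURES = 78  # features in the preprocessed CICIDS2017 data

def mean(values):
    return sum(values) / len(values)

proposed = mean(counts["LNNLS-KH"])            # average dimension of LNNLS-KH
share = 100 * proposed / TOTAL_FEATURES        # share of all 78 features

# Relative reduction of the average dimension against each baseline.
reductions = {
    name: 100 * (1 - proposed / mean(values))
    for name, values in counts.items()
    if name != "LNNLS-KH"
}

print(f"average dimension: {proposed:.1f} ({share:.2f}% of all features)")
for name, value in reductions.items():
    print(f"reduction vs {name}: {value:.2f}%")
```

The reduction is computed on the averaged dimensions, that is, 1 - 10.2 / (baseline average), which reproduces the reported 57.85%, 52.34%, 27.14%, and 25%.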

Figure 12 shows the feature selection time and the intrusion detection time of the 5 feature selection algorithms to further evaluate their performance. As can be seen from Figure 12(a), in the feature selection stage, the LNNLS-KH algorithm takes longer to find the optimal feature subset because of the linear nearest neighbor lasso step optimization performed after the position update of the krill herd. Compared with the KH and IKH algorithms, it increases the time by an average of 14.38% and 9.32%, respectively. Although the LNNLS-KH algorithm takes more computation time, its convergence speed and global search ability are improved. Figure 12(b) shows the intrusion detection time of the 5 feature selection algorithms, that is, the time the KNN classifier takes to detect the sample dataset after the feature subset has been found, excluding the time spent searching for the optimal feature subset. The feature dimension of the LNNLS-KH algorithm is low, so less data is processed when classifying the detection sample dataset, which reduces the classification detection time. Compared with the CMPSO, ACO, KH, and IKH algorithms, the intrusion detection time of the LNNLS-KH algorithm is reduced by 6.52%, 5.17%, 2.14%, and 2.28% on average, respectively.
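The detection stage described here, in which only the selected columns are passed to a KNN classifier and only the classification itself is timed, might look like the following pure-Python sketch. The toy data, the helper names `project` and `knn_predict`, and the choice k = 3 are assumptions made for illustration; the paper's experiments were run in MATLAB on the real datasets.

```python
import math
import time
from collections import Counter

def project(rows, selected):
    """Keep only the columns whose 0-based indices are listed in `selected`."""
    return [[row[i] for i in selected] for row in rows]

def knn_predict(train_x, train_y, sample, k=3):
    """Classify `sample` by majority vote among its k nearest training rows."""
    distances = sorted(
        (math.dist(row, sample), label) for row, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: columns 0 and 2 are informative, column 1 is noise.
train_x = [[0.1, 5.0, 0.2], [0.2, 1.0, 0.1], [0.9, 4.0, 0.8], [0.8, 2.0, 0.9]]
train_y = ["normal", "normal", "attack", "attack"]
test_x = [[0.15, 3.0, 0.15], [0.85, 3.0, 0.85]]

selected = [0, 2]  # feature subset returned by a feature selection algorithm
train_sel = project(train_x, selected)
test_sel = project(test_x, selected)

# Time only the detection step, not the search for the feature subset.
start = time.perf_counter()
predictions = [knn_predict(train_sel, train_y, sample) for sample in test_sel]
elapsed = time.perf_counter() - start

print(predictions)  # ['normal', 'attack']
print(f"detection time: {elapsed:.6f} s")
```

Because each distance computation touches only the selected columns, a smaller feature subset directly reduces the work done per test sample.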

The feature subsets selected by the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used with the KNN classifier to detect the test dataset. The classification accuracy of the different algorithms is shown in Table 15. Over the five subsets, the average classification accuracy of the proposed LNNLS-KH algorithm is 95.86%. In particular, the classification accuracy reaches 97.55% for the PortScan subset. Compared with the other four feature selection methods, the LNNLS-KH algorithm has an average increase of 3.11%, 8.52%, 8.58%, 2.45%, and 4.29% on the

Table 14: The number of features selected by different algorithms, with the indices of the selected features in parentheses (CICIDS2017 dataset).

Data categories   Algorithm   Number of features (selected feature indices)
Normal            CMPSO       28 (3, 7, 13, 15, 16, 17, 20, 22, 24, 26, 30, 35, 37, 38, 42, 43, 44, 45, 46, 49, 50, 56, 59, 62, 63, 64, 65, 76)
                  ACO         25 (1, 3, 4, 7, 10, 11, 12, 13, 15, 19, 29, 32, 34, 35, 37, 43, 46, 47, 51, 55, 56, 58, 73, 76, 78)
                  KH          14 (11, 19, 33, 39, 43, 49, 55, 56, 58, 65, 66, 68, 71, 73)
                  IKH         14 (5, 10, 19, 20, 21, 23, 27, 33, 43, 56, 69, 70, 73, 78)
                  LNNLS-KH    8 (6, 12, 16, 32, 38, 50, 54, 73)
DoS               CMPSO       24 (1, 3, 4, 13, 16, 17, 24, 26, 30, 33, 35, 39, 40, 44, 48, 51, 53, 57, 58, 59, 60, 62, 67, 70)
                  ACO         19 (3, 6, 12, 13, 15, 26, 35, 39, 51, 55, 60, 61, 66, 69, 71, 73, 75, 77, 78)
                  KH          13 (8, 16, 21, 30, 45, 50, 52, 57, 59, 63, 66, 67)
                  IKH         14 (2, 12, 15, 16, 19, 21, 32, 34, 44, 46, 65, 68, 76, 77)
                  LNNLS-KH    9 (6, 8, 20, 44, 46, 49, 61, 75, 76)
DDoS              CMPSO       29 (15, 18, 19, 20, 23, 25, 26, 33, 34, 35, 38, 39, 42, 43, 46, 47, 49, 51, 55, 56, 57, 59, 60, 61, 62, 63, 71, 72, 78)
                  ACO         27 (6, 9, 10, 13, 16, 19, 24, 28, 31, 41, 42, 45, 47, 48, 50, 51, 52, 53, 54, 56, 59, 60, 61, 62, 65, 68, 72)
                  KH          21 (10, 12, 13, 15, 18, 23, 27, 30, 34, 35, 41, 42, 45, 55, 61, 63, 65, 66, 68, 70, 76)
                  IKH         18 (1, 11, 13, 14, 19, 24, 32, 35, 36, 40, 42, 47, 51, 57, 60, 69, 70, 75)
                  LNNLS-KH    14 (2, 5, 8, 9, 11, 22, 26, 33, 41, 43, 47, 51, 74, 77)
PortScan          CMPSO       24 (1, 3, 6, 15, 16, 28, 30, 33, 35, 37, 44, 45, 52, 56, 59, 60, 61, 63, 65, 68, 70, 75, 77, 78)
                  ACO         21 (1, 2, 6, 10, 15, 17, 26, 27, 29, 39, 42, 43, 46, 49, 58, 61, 66, 69, 70, 71, 76)
                  KH          14 (15, 20, 22, 27, 37, 44, 49, 50, 53, 59, 62, 65, 67, 78)
                  IKH         15 (1, 24, 30, 32, 33, 43, 49, 53, 54, 58, 60, 61, 63, 64, 69)
                  LNNLS-KH    12 (2, 6, 15, 24, 25, 28, 32, 57, 59, 63, 66, 76)
WebAttack         CMPSO       16 (2, 7, 26, 29, 45, 47, 50, 52, 53, 54, 63, 66, 68, 69, 72, 78)
                  ACO         15 (3, 9, 10, 12, 19, 26, 40, 46, 50, 54, 64, 65, 68, 69, 73)
                  KH          8 (1, 17, 19, 36, 48, 49, 53, 60)
                  IKH         7 (14, 17, 35, 39, 44, 48, 54)
                  LNNLS-KH    8 (3, 29, 32, 37, 61, 64, 73, 77)

Table 15: The classification accuracy of different feature selection algorithms (CICIDS2017 dataset).

Data categories   CMPSO (%)   ACO (%)   KH (%)   IKH (%)   LNNLS-KH (%)
Normal            89.78       89.06     92.70    94.58     94.64
DoS               77.03       82.69     90.90    93.34     94.51
DDoS              81.73       86.94     91.85    88.19     95.76
PortScan          92.38       95.64     95.05    97.35     97.55
WebAttack         89.12       93.08     93.77    94.26     96.85

Table 16: The classification FPR and DR of different feature selection algorithms (CICIDS2017 dataset).

False positive rate (FPR) (%)
Data categories   CMPSO   ACO     KH      IKH     LNNLS-KH
Normal            9.25    8.72    6.41    4.93    3.67
DoS               5.41    4.48    4.06    2.83    1.94
DDoS              6.85    4.92    4.54    6.33    3.18
PortScan          4.65    3.02    2.84    1.86    1.16
WebAttack         5.33    3.16    2.52    2.11    1.60

Detection rate (DR) (%)
Data categories   CMPSO   ACO     KH      IKH     LNNLS-KH
Normal            88.05   88.51   89.25   92.46   93.89
DoS               72.57   82.89   87.86   92.56   92.64
DDoS              79.03   83.47   90.22   87.52   92.98
PortScan          88.25   93.80   94.33   95.14   95.42
WebAttack         87.40   91.35   92.19   92.94   94.77


Normal, DoS, DDoS, PortScan, and WebAttack subsets, respectively. Table 16 shows the classification FPR and DR of the different feature selection algorithms on the test sets. On all five test sets, the LNNLS-KH algorithm has a lower FPR and a higher DR than the other four algorithms.
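For reference, accuracy, DR, and FPR can be derived from the four confusion-matrix counts as sketched below. The formulas are the commonly used definitions and are assumed here, since this part of the text does not restate them; the function name `detection_metrics` and the counts in the example are hypothetical and are not taken from the experiments.

```python
def detection_metrics(tp, fp, tn, fn):
    """Return (accuracy, detection rate, false positive rate) in percent.

    tp: attacks flagged as attacks      fn: attacks missed
    fp: normal traffic flagged          tn: normal traffic passed
    """
    accuracy = 100 * (tp + tn) / (tp + fp + tn + fn)
    detection_rate = 100 * tp / (tp + fn)       # share of attacks detected
    false_positive_rate = 100 * fp / (fp + tn)  # share of normal traffic flagged
    return accuracy, detection_rate, false_positive_rate

# Hypothetical counts for a test set of 1000 attack and 1000 normal records.
acc, dr, fpr = detection_metrics(tp=940, fp=20, tn=980, fn=60)
print(f"accuracy={acc:.2f}% DR={dr:.2f}% FPR={fpr:.2f}%")
# accuracy=96.00% DR=94.00% FPR=2.00%
```

A feature subset is preferable when it raises DR and lowers FPR at the same time, which is the pattern Table 16 reports for LNNLS-KH.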

We propose the LNNLS-KH algorithm, a novel feature selection algorithm for intrusion detection. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the algorithm has good feature selection performance and improves the efficiency of intrusion detection.

5. Conclusions

With the rapid development of network technology, intrusion detection plays an increasingly important role in network security. However, the "dimensional disaster" caused by massive data results in problems such as slow response and poor accuracy of the intrusion detection system. The KH algorithm is a new population-based swarm intelligence optimization method that shows good performance in high-dimensional data processing, providing a new approach for reducing the dimension of intrusion detection data and selecting useful features. In this paper, an improved KH algorithm named LNNLS-KH, based on linear nearest neighbor lasso step optimization, is proposed for feature selection of IDS datasets. The LNNLS-KH algorithm introduces a new fitness function composed of the number of selected features and the classification accuracy. Nonlinear optimization is introduced into the physical diffusion motion of krill individuals to accelerate the convergence of the algorithm. Moreover, the linear nearest neighbor lasso step optimization is proposed to balance the exploration and exploitation abilities and to obtain the global optimal solution of the feature subset effectively. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the LNNLS-KH algorithm retains 7 and 10.2 features on average, respectively, which greatly reduces the dimension of the features. In the NSL-KDD dataset, features are reduced by 44%, 42.86%, 34.88%, and 24.32% compared with the CMPSO, ACO, KH, and IKH algorithms, respectively, and in the CICIDS2017 dataset, they are reduced by 57.85%, 52.34%, 27.14%, and 25%, respectively. In addition, the classification accuracy of the LNNLS-KH feature selection algorithm is increased by 10.03% and 5.39%, and the time of intrusion detection is reduced by 12.41% and 4.03% on the two datasets, respectively. Furthermore, the LNNLS-KH algorithm enhances the ability to jump out of local optimal solutions and shows good performance in the optimal fitness iteration curve, false positive rate of detection, and convergence speed, which demonstrates that the proposed LNNLS-KH algorithm is an efficient feature selection method for network intrusion detection.
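As a rough illustration of a fitness function that combines the number of selected features with the classification accuracy, consider the weighted-sum sketch below. The weighted form, the weight `alpha = 0.9`, and the function name `fitness` are assumptions made for this example only; the exact fitness function of LNNLS-KH is the one defined in the method section and may differ.

```python
def fitness(accuracy, n_selected, n_total, alpha=0.9):
    """Lower is better: weighted sum of error rate and fraction of features kept.

    accuracy is a fraction in [0, 1]; alpha (hypothetical) weights the
    classification error against the size of the feature subset.
    """
    error_rate = 1.0 - accuracy
    feature_ratio = n_selected / n_total
    return alpha * error_rate + (1.0 - alpha) * feature_ratio

# DoS subset of CICIDS2017, using the accuracies and counts from Tables 14 and 15.
lnnls_kh = fitness(accuracy=0.9451, n_selected=9, n_total=78)
ikh = fitness(accuracy=0.9334, n_selected=14, n_total=78)
print(lnnls_kh < ikh)  # True
```

Under this assumed form, a candidate subset is rewarded both for classifying well and for being small, so a solution with fewer features and higher accuracy always scores better.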

In this research, we found that the initialization of the LNNLS-KH algorithm has a certain degree of randomness. Therefore, we conducted independent and repeated experiments to mitigate this problem, and the results were reasonable and convincing. Although the proposed algorithm shows encouraging performance, it could be further improved.

In future work, we will consider using data balancing techniques to preprocess the experimental dataset in order to obtain more accurate feature selection results and stronger algorithm stability. Meanwhile, we will combine LNNLS-KH with other algorithms to improve the exploration and exploitation abilities, thereby further shortening the time of feature subset training and classification detection. In addition, as the LNNLS-KH algorithm is universally applicable, it can be applied to more feature selection systems and to optimization problems in other fields.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was sponsored by the National Key Research and Development Program of China (Grants 2018YFB0804002 and 2017YFB0803204), National Natural Science Foundation of P.R. China (Grant 72001191), Henan Natural Science Foundation (Grant 202300410442), and Henan Philosophy and Social Science Program (Grant 2020CZH009).



dataset samples. Furthermore, the LNNLS-KH algorithm improves the average classification accuracy of the Probe, DoS, U2R, and R2L test dataset samples by 9.95%, 12.04%, 9.47%, and 8.66%, respectively.

Table 13 shows the false positive rate and detection rate of the feature subsets produced by the different feature selection algorithms. To visualize the difference, we show the comparison in Figure 11. For the Probe, DoS, U2R, and R2L datasets, the average false positive rate of the LNNLS-KH feature selection algorithm is 4.00%, which is 20.70%, 15.30%, 8.88%, and 3.34% lower than that of the CMPSO, ACO, KH, and IKH algorithms, respectively. Similarly, the proposed LNNLS-KH feature selection algorithm exhibits excellent performance in detection rate. The average detection rate of

Table 10: The feature selection results of different feature selection algorithms (NSL-KDD dataset).

Data categories   Algorithm   Number of features (selected feature indices)
Probe             CMPSO       14 (2, 3, 4, 7, 8, 10, 11, 17, 19, 20, 21, 27, 30, 33)
                  ACO         15 (1, 3, 4, 6, 15, 16, 17, 19, 21, 23, 29, 35, 39, 40, 41)
                  KH          13 (3, 4, 5, 7, 8, 13, 14, 18, 19, 21, 26, 28, 40)
                  IKH         11 (2, 3, 5, 8, 10, 17, 18, 29, 34, 35, 41)
                  LNNLS-KH    8 (3, 4, 8, 11, 15, 29, 34, 40)
DoS               CMPSO       16 (3, 4, 5, 6, 8, 13, 14, 17, 18, 22, 23, 26, 30, 32, 35, 41)
                  ACO         16 (3, 4, 7, 12, 14, 19, 20, 25, 27, 28, 30, 33, 34, 37, 40, 41)
                  KH          12 (2, 3, 4, 5, 8, 9, 12, 15, 19, 24, 26, 30)
                  IKH         12 (2, 3, 4, 6, 12, 18, 20, 22, 27, 28, 30, 31)
                  LNNLS-KH    10 (3, 4, 6, 15, 17, 19, 20, 21, 30, 37)
U2R               CMPSO       9 (3, 4, 5, 9, 12, 19, 32, 33, 41)
                  ACO         8 (3, 4, 6, 8, 20, 24, 33, 36)
                  KH          8 (3, 4, 10, 12, 19, 23, 31, 32)
                  IKH         6 (3, 10, 11, 21, 36, 39)
                  LNNLS-KH    3 (3, 33, 36)
R2L               CMPSO       11 (2, 3, 4, 8, 21, 22, 25, 27, 37, 40, 41)
                  ACO         10 (3, 4, 7, 12, 17, 21, 29, 37, 38, 40)
                  KH          10 (2, 3, 4, 6, 13, 18, 19, 22, 32, 41)
                  IKH         8 (3, 4, 5, 8, 11, 14, 21, 31)
                  LNNLS-KH    7 (2, 3, 4, 10, 15, 21, 36)

Table 11: Feature selection time and detection time of different feature selection algorithms (NSL-KDD dataset).

Time of feature selection (second)
Data categories   CMPSO    ACO      KH       IKH      LNNLS-KH
Probe             523178   499814   474533   534887   549048
DoS               789235   763086   716852   803816   829692
U2R               15487    14729    14418    15779    17224
R2L               255675   236908   224092   266951   272770

Time of detection (second)
Data categories   CMPSO    ACO      KH       IKH      LNNLS-KH
Probe             3713     3823     3530     3405     3106
DoS               11869    11815    10666    10514    9844
U2R               0087     0086     0086     0086     0078
R2L               955      913      907      862      803

Table 12: The classification accuracy of different feature selection algorithms (NSL-KDD dataset).

Data categories   CMPSO (%)   ACO (%)   KH (%)   IKH (%)   LNNLS-KH (%)
Probe             80.46       86.56     92.42    93.74     98.24
DoS               81.74       83.36     86.03    88.74     97.01
U2R               82.74       84.57     85.59    91.89     95.67
R2L               78.70       81.62     88.78    90.49     93.56

Figure 11: Comparison of classification FPR and DR of different feature selection algorithms (CMPSO, ACO, KH, IKH, and LNNLS-KH) on the Probe, DoS, U2R, and R2L subsets. (a) FPR (%) of different feature selection algorithms. (b) DR (%) of different feature selection algorithms.


the LNNLS-KH algorithm is 96.48%, which is 13.47%, 9.32%, 7.02%, and 4.72% higher than that of the CMPSO, ACO, KH, and IKH feature selection algorithms, respectively.
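The averages quoted for Table 13 can be reproduced with a few lines of Python. This is an illustrative check with variable names of our own choosing; the values are copied from Table 13, and the differences are taken between the per-algorithm averages, so they are percentage points.

```python
ALGORITHMS = ["CMPSO", "ACO", "KH", "IKH", "LNNLS-KH"]

# Table 13 (NSL-KDD), one row per subset, columns in ALGORITHMS order.
fpr_rows = {
    "Probe": [22.37, 18.04, 8.50, 4.05, 1.18],
    "DoS":   [21.27, 14.08, 11.45, 7.88, 2.85],
    "U2R":   [24.51, 21.04, 16.13, 8.45, 4.30],
    "R2L":   [30.66, 24.05, 15.42, 8.99, 7.67],
}
dr_rows = {
    "Probe": [82.32, 89.18, 95.01, 95.22, 97.73],
    "DoS":   [79.12, 82.08, 83.77, 85.23, 96.80],
    "U2R":   [87.02, 89.79, 90.14, 93.67, 95.52],
    "R2L":   [83.56, 87.56, 88.91, 92.89, 95.85],
}

def column_means(rows):
    """Average each algorithm's column over the four subsets."""
    columns = zip(*rows.values())
    return {name: sum(col) / len(col) for name, col in zip(ALGORITHMS, columns)}

fpr_mean = column_means(fpr_rows)
dr_mean = column_means(dr_rows)

# Differences between LNNLS-KH and each baseline, in percentage points.
fpr_drop = {n: fpr_mean[n] - fpr_mean["LNNLS-KH"] for n in ALGORITHMS[:-1]}
dr_gain = {n: dr_mean["LNNLS-KH"] - dr_mean[n] for n in ALGORITHMS[:-1]}

print(f"LNNLS-KH mean FPR: {fpr_mean['LNNLS-KH']:.2f}%")
for name in ALGORITHMS[:-1]:
    print(f"{name}: FPR lower by {fpr_drop[name]:.2f}, DR higher by {dr_gain[name]:.2f}")
```

Up to rounding in the second decimal place, the output agrees with the reported FPR reductions of 20.70, 15.30, 8.88, and 3.34 and the DR gains of 13.47, 9.32, 7.02, and 4.72.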

In conclusion, the LNNLS-KH feature selection algorithm performs well in the global optimal fitness iteration curve, test set detection time, dimension of the feature subset, classification accuracy, false positive rate, and detection rate. Although the offline training time of the LNNLS-KH algorithm is longer than that of the CMPSO, ACO, KH, and IKH algorithms, its lower feature dimension reduces the detection time. Moreover, the algorithm has faster convergence speed, higher detection accuracy and detection rate, and a lower classification false positive rate.

43 Experimental Results and Discussion of CICIDS2017Dataset 2e experiment is conducted in MATLAB R2016aon Windows 64 bit operating system with the processor ofIntel (R) core (TM) i7-4790 2e MachineLearningCVE filein the CICIDS2017 dataset includes 8 csv files of all trafficdata which contain 78 features plus an attack type tag byremoving some duplicate features We annotate trafficrecords according to different attack periods and types andstandardize and normalize the dataset Due to the excessiveamount of data contained in the analyzed CSV file problemssuch as excessively long time consuming and slow con-vergence rate of the model will occur when the host is usedfor model training2erefore we simplified and reintegratedthese CSV data files while preserving the original attack

timing features We selected a total of 12090 records and 5types of traffic including 1 type of normal traffic and 4 typesof attack traffic respectively ldquoDoSrdquo ldquoDDoSrdquo ldquoPortScanrdquoand ldquoWebAttackrdquo 2e data are randomly divided intotraining sets and test sets in a 2 1 ratio with independent andrepeated experiments

CMPSO ACO KH and IKH algorithms are used as thecomparison of LNNLS-KH algorithm 2e preprocessedNormal DoS DDoS PortScan and WebAttack subsets areinput into the algorithm model successively and the di-mension and feature subsets of feature selection are ob-tained We adopt the KNN classification model as theclassifier and get the accuracy of intrusion detectionthrough test set data 2e results of feature selection di-mension for the CICIDS2017 dataset are shown in Table 14According to different attack types LNNLS-KH algorithmselects different features For example the selected featuresof DOS subset are ldquoTotal Length of Bwd Packetsrdquo ldquoFwdPacket Length Minrdquo ldquoFlow IAT Minrdquo ldquoFIN Flag CountrdquoldquoRST Flag Countrdquo ldquoURG PacketsBulkrdquo ldquoBwd AvgPacketsBulkrdquo ldquoIdle Meanrdquo and ldquoIdle Stdrdquo For WebAttacksubset ldquoTotal Fwd Packetsrdquo ldquoBwd IAT Maxrdquo ldquoBwd PSHFlagsrdquo ldquoFwd Packetssrdquo ldquoBwd Avg PacketsBulkrdquo ldquoSubflowFwd Bytesrdquo ldquoActive Maxrdquo and ldquoIdle Maxrdquo are selected asattack features by LNNLS-KH algorithm It reduces thefeature dimension of IDS dataset while ensuring high ac-curacy 2e average feature dimension selected by LNNLS-KH algorithm is 102 accounting for 1308 of the totalnumber of features in CICIIDS2017 dataset It decreases the

Table 13 2e classification FPR and DR of different feature selection algorithms (NSL-KDD dataset)

Data categoriesFalse positive rate (FPR) () Detection rate (DR) ()

CMPSO ACO KH IKH LNNLS-KH CMPSO ACO KH IKH LNNLS-KHProbe 2237 1804 850 405 118 8232 8918 9501 9522 9773DoS 2127 1408 1145 788 285 7912 8208 8377 8523 9680U2R 2451 2104 1613 845 430 8702 8979 9014 9367 9552R2L 3066 2405 1542 899 767 8356 8756 8891 9289 9585

WebAttack

PortScan

DDoS

DoS

Normal

Time of feature selection (second) 0 2000 4000 6000 8000 10000

CMPSOACOKH

IKHLNNLS-KH

(a)

WebAttack

PortScan

DDoS

DoS

Normal

Time of intrusion detection (second)

CMPSOACOKH

IKHLNNLS-KH

0 05 1 15 2 25

(b)

Figure 12 Comparison of feature selection time and intrusion detection time for different feature selection algorithms (a) Feature selectiontime for different feature selection algorithms (b) Intrusion detection time of different feature selection algorithms

18 Security and Communication Networks

number of features by 5785 5234 2714 and 25respectively compared with the CMPSO ACO KH andIKH algorithms

Figure 12 shows the feature selection time and intrusiondetection time of 5 different feature selection algorithms tofurther evaluate the performance of the feature selectionalgorithm It can be seen from Figure 12(a) that in thefeature selection stage the LNNLS-KH algorithm consumesa long time in finding the optimal feature subset due to thelinear nearest neighbor lasso step optimization after theposition update of the krill herd Compared with the KH andIKH algorithms it increases the time by an average of1438 and 932 Although the LNNLS-KH algorithmoccupies more calculation time the convergence speed andglobal search ability have been improved Figure 12(b) showsthe intrusion detection time of 5 different feature selectionalgorithms It is the detection time of the sample dataset bythe KNN classifier after the feature subset is searched

excluding the time of searching for the optimal featuresubset 2e feature dimension of LNNLS-KH algorithm islow and the amount of data processed in the classification ofdetection sample dataset is small which result s in the re-duction of classification detection time Compared with theCMPSO ACO KH and IKH algorithms the intrusiondetection time of the LNNLS-KH algorithm is reduced by652 517 214 and 228 on average

2e selection results of CMPSO ACO KH IKH andLNNLS-KH algorithms are used as feature subsets and theKNN classifier is used to detect the test dataset 2e clas-sification accuracy of different algorithms is shown in Ta-ble 15 For five types of subsets the average classificationaccuracy of the proposed LNNLS-KH algorithm is 9586In particular the classification accuracy reached 9755 forthe PortScan subset Compared with the other four featureselection methods the LNNLS-KH algorithm has an averageincrease of 311 852 858 245 and 429 on the

Table 14 2e number of feature selection for different algorithms (CICIDS2017 dataset)

Datacategories CMPSO ACO KH IKH LNNLS-KH

Normal

28 (3 7 13 15 16 17 20 2224 26 30 35 37 38 42 43 4445 46 49 50 56 59 62 63 64

65 76)

25 (1 3 4 7 10 11 12 1315 19 29 32 34 35 3743 46 47 51 55 56 58 73

76 78)

14 (11 19 33 39 4349 55 56 58 65 66

68 71 73)

14 (5 10 19 2021 23 27 33 4356 69 70 73 78)

8 (6 12 16 32 3850 54 73)

DoS24 (1 3 4 13 16 17 24 26 3033 35 39 40 44 48 51 53 57

58 59 60 62 67 70)

19 (3 6 12 13 15 26 3539 51 55 60 61 66 69 71

73 75 77 78)

13 (8 16 21 30 4550 52 57 59 63 66

67)

14 (2 12 15 1619 21 32 34 4446 65 68 76 77)

9 (6 8 20 44 4649 61 75 76)

DDoS

29 (15 18 19 20 23 25 26 3334 35 38 39 42 43 46 47 4951 55 56 57 59 60 61 62 63

71 72 78)

27 (6 9 10 13 16 19 2428 31 41 42 45 47 48 5051 52 53 54 56 59 60 61

62 65 68 72)

21 (10 12 13 15 1823 27 30 34 35 4142 45 55 61 63 65

66 68 70 76)

18 (1 11 13 14 1924 32 35 36 4042 47 51 57 60

69 70 75)

14 (2 5 8 9 1122 26 33 41 4347 51 74 77)

PortScan24 (1 3 6 15 16 28 30 33 3537 44 45 52 56 59 60 61 63

65 68 70 75 77 78)

21 (1 2 6 10 15 17 26 2729 39 42 43 46 49 58 61

66 69 70 71 76)

14 (15 20 22 27 3744 49 50 53 59 62

65 67 78)

15 (1 24 30 32 3343 49 53 54 5860 61 63 64 69)

12 (2 6 15 24 2528 32 57 59 63

66 76)

WebAttack 16 (2 7 26 29 45 47 50 5253 54 63 66 68 69 72 78)

15 (3 9 10 12 19 26 4046 50 54 64 65 68 69

73)

8 (1 17 19 36 48 4953 60)

7 (14 17 35 39 4448 54)

8 (3 29 32 37 6164 73 77)

Table 15 2e classification accuracy of different feature selection algorithms (CICIDS2017 dataset)

Data categories CMPSO () ACO () KH () IKH () LNNLS-KH ()Normal 8978 8906 9270 9458 9464DoS 7703 8269 9090 9334 9451DDoS 8173 8694 9185 8819 9576PortScan 9238 9564 9505 9735 9755WebAttack 8912 9308 9377 9426 9685

Table 16 2e classification FPR and DR of different feature selection algorithms (CICIDS2017 dataset)

Data categoriesFalse positive rate (FPR) () Detection rate (DR) ()

CMPSO ACO KH IKH LNNLS-KH CMPSO ACO KH IKH LNNLS-KHNormal 925 872 641 493 367 8805 8851 8925 9246 9389DoS 541 448 406 283 194 7257 8289 8786 9256 9264DDoS 685 492 454 633 318 7903 8347 9022 8752 9298PortScan 465 302 284 186 116 8825 9380 9433 9514 9542WebAttack 533 316 252 211 160 8740 9135 9219 9294 9477

Security and Communication Networks 19

Normal DoS DDoS PortScan and WebAttack subsetsrespectively Table 16 shows the classification FPR and DR ofdifferent feature selection algorithms on the test sets Basedon the detection of five different test sets the LNNLS-KHalgorithm has lower FPR and higher DR than other fouralgorithms

We propose the LNNLS-KH algorithm a novel featureselection algorithm for intrusion detection Experimentsbased on NSL-KDD and CICIDS2017 datasets show that thealgorithm has good feature selection performance and im-proves the efficiency of intrusion detection

5 Conclusions

With the rapid development of network technology, intrusion detection plays an increasingly important role in network security. However, the “dimensional disaster” caused by massive data results in problems such as slow response and poor accuracy of the intrusion detection system. The KH algorithm is a new population-based swarm intelligence optimization method which shows good performance in high-dimensional data processing, providing a new approach for reducing the dimension of intrusion detection data and selecting useful features. In this paper, an improved KH algorithm named LNNLS-KH is proposed for feature selection of IDS datasets by linear nearest neighbor lasso optimization. The LNNLS-KH algorithm introduces a new fitness function, which is composed of the number of selected feature dimensions and the classification accuracy. Nonlinear optimization is introduced into the physical diffusion motion of krill individuals to accelerate the convergence speed of the algorithm. Moreover, the linear nearest neighbor lasso step optimization is proposed to balance the exploration and exploitation abilities and obtain the global optimal solution of the feature subset effectively. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the LNNLS-KH algorithm retains 7 and 10.2 features on average, respectively, which greatly reduces the dimension of the features. In the NSL-KDD dataset, features are reduced by 44%, 42.86%, 34.88%, and 24.32% compared with the CMPSO, ACO, KH, and IKH algorithms, respectively, and in the CICIDS2017 dataset, they are reduced by 57.85%, 52.34%, 27.14%, and 25%. In addition, the classification accuracy of the LNNLS-KH feature selection algorithm is increased by 10.03% and 5.39%, and the time of intrusion detection is reduced by 12.41% and 4.03% on the two datasets. Furthermore, the LNNLS-KH algorithm enhances the ability to jump out of local optimal solutions and shows good performance in the optimal fitness iteration curve, false positive rate of detection, and convergence speed, which demonstrates that the proposed LNNLS-KH algorithm is an efficient feature selection method for network intrusion detection.
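As an illustration of a fitness function built from the number of selected features and the classification accuracy, the sketch below uses one common weighted-sum form. The exact formula and weights of LNNLS-KH are not restated in this section, so the function name `fitness`, the weight `alpha`, and its default value are assumptions made purely for illustration.

```python
def fitness(accuracy, n_selected, n_total, alpha=0.9):
    """Hypothetical fitness to be minimized.

    Combines the classification error (1 - accuracy) with the fraction of
    retained features. `alpha` is an assumed trade-off weight, not a value
    taken from the paper.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if n_total <= 0 or not 0 <= n_selected <= n_total:
        raise ValueError("invalid feature counts")
    error = 1.0 - accuracy
    return alpha * error + (1.0 - alpha) * (n_selected / n_total)


# Two candidate subsets with equal accuracy: the smaller one scores better.
small = fitness(accuracy=0.95, n_selected=10, n_total=78)
large = fitness(accuracy=0.95, n_selected=20, n_total=78)
```

Under this assumed form, a subset is preferred when it either classifies more accurately or keeps fewer features, which matches the stated goal of removing redundant features while preserving detection accuracy.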

In this research, we realized that the initialization of the LNNLS-KH algorithm has a certain degree of randomness. Therefore, we conducted independent and repeated experiments to mitigate this problem, and the results were reasonable and convincing. Although the proposed algorithm shows encouraging performance, it could be further improved.

In future work, we will consider using data balancing techniques to preprocess the experimental dataset to obtain more accurate feature selection results and stronger algorithm stability. Meanwhile, we will combine LNNLS-KH with other algorithms to improve the exploration and exploitation abilities, thereby further shortening the time of feature subset training and classification detection. In addition, as the LNNLS-KH algorithm is universally applicable, it can be applied to more feature selection systems and used to solve optimization problems in other fields.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was sponsored by the National Key Research and Development Program of China (Grants 2018YFB0804002 and 2017YFB0803204), National Natural Science Foundation of P.R. China (Grant 72001191), Henan Natural Science Foundation (Grant 202300410442), and Henan Philosophy and Social Science Program (Grant 2020CZH009).


20 Security and Communication Networks


the LNNLS-KH algorithm is 96.48%, which is 13.47%, 9.32%, 7.02%, and 4.72% higher than the CMPSO, ACO, KH, and IKH feature selection algorithms, respectively.

In conclusion, the LNNLS-KH feature selection algorithm performs well in terms of the global optimal fitness iteration curve, test set detection time, number of dimensions of the feature subset, classification accuracy, false positive rate, and detection rate. Although the offline training time of the LNNLS-KH algorithm is longer than that of the CMPSO, ACO, KH, and IKH algorithms, its lower feature dimension reduces the detection time. Moreover, the algorithm has a faster convergence speed, higher detection accuracy, a lower classification false positive rate, and a higher detection rate.
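For reference, the false positive rate (FPR) and detection rate (DR) used in this comparison can be computed from a confusion matrix. The sketch below assumes the standard definitions DR = TP / (TP + FN) and FPR = FP / (FP + TN), with every non-normal label treated as an attack; the function names and the `normal_label` parameter are illustrative.

```python
def confusion_counts(y_true, y_pred, normal_label="Normal"):
    """Count TP, FP, TN, FN, treating any label other than `normal_label` as an attack."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        truth_attack = truth != normal_label
        pred_attack = pred != normal_label
        if truth_attack and pred_attack:
            tp += 1          # attack correctly flagged
        elif truth_attack:
            fn += 1          # attack missed
        elif pred_attack:
            fp += 1          # normal traffic flagged as attack
        else:
            tn += 1          # normal traffic correctly passed
    return tp, fp, tn, fn


def detection_rate(tp, fn):
    """Fraction of attack records that are detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def false_positive_rate(fp, tn):
    """Fraction of normal records that are wrongly flagged as attacks."""
    return fp / (fp + tn) if (fp + tn) else 0.0
```

A lower FPR together with a higher DR is the desirable combination, which is the pattern reported for LNNLS-KH in the FPR and DR tables.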

4.3. Experimental Results and Discussion of CICIDS2017 Dataset. The experiment is conducted in MATLAB R2016a on a 64-bit Windows operating system with an Intel(R) Core(TM) i7-4790 processor. The MachineLearningCVE file in the CICIDS2017 dataset includes 8 CSV files of all traffic data, which contain 78 features plus an attack type tag after some duplicate features are removed. We annotate traffic records according to different attack periods and types and standardize and normalize the dataset. Due to the excessive amount of data contained in the analyzed CSV files, problems such as excessively long running time and slow convergence of the model occur when the host is used for model training. Therefore, we simplified and reintegrated these CSV data files while preserving the original attack timing features. We selected a total of 12090 records and 5 types of traffic, including 1 type of normal traffic and 4 types of attack traffic, namely, “DoS,” “DDoS,” “PortScan,” and “WebAttack.” The data are randomly divided into training sets and test sets in a 2 : 1 ratio with independent and repeated experiments.
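The evaluation protocol described above (a random 2 : 1 split followed by KNN classification on the selected features) can be sketched as follows. This is a minimal pure-Python illustration and not the MATLAB code used in the experiments; the function names, the seed, and the value of `k` are assumptions.

```python
import random
from collections import Counter


def split_two_to_one(records, seed=0):
    """Shuffle the records and split them into training and test sets in a 2 : 1 ratio."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]


def knn_predict(train, query, selected, k=3):
    """Majority vote among the k nearest training records.

    Each training record is a (features, label) pair. Distances are squared
    Euclidean distances computed only over the `selected` feature indices.
    """
    def dist(features):
        return sum((features[i] - query[i]) ** 2 for i in selected)

    neighbours = sorted(train, key=lambda rec: dist(rec[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def accuracy(train, test, selected, k=3):
    """Fraction of test records whose predicted label matches the true label."""
    correct = sum(knn_predict(train, x, selected, k) == y for x, y in test)
    return correct / len(test)
```

With 12090 records, this split yields 8060 training records and 4030 test records. Repeating the split with different seeds corresponds to the independent and repeated experiments mentioned in the text.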

Table 13: The classification FPR and DR of different feature selection algorithms (NSL-KDD dataset).

| Data category | FPR (%): CMPSO | FPR (%): ACO | FPR (%): KH | FPR (%): IKH | FPR (%): LNNLS-KH | DR (%): CMPSO | DR (%): ACO | DR (%): KH | DR (%): IKH | DR (%): LNNLS-KH |
|---|---|---|---|---|---|---|---|---|---|---|
| Probe | 22.37 | 18.04 | 8.50 | 4.05 | 1.18 | 82.32 | 89.18 | 95.01 | 95.22 | 97.73 |
| DoS | 21.27 | 14.08 | 11.45 | 7.88 | 2.85 | 79.12 | 82.08 | 83.77 | 85.23 | 96.80 |
| U2R | 24.51 | 21.04 | 16.13 | 8.45 | 4.30 | 87.02 | 89.79 | 90.14 | 93.67 | 95.52 |
| R2L | 30.66 | 24.05 | 15.42 | 8.99 | 7.67 | 83.56 | 87.56 | 88.91 | 92.89 | 95.85 |

Figure 12: Comparison of feature selection time and intrusion detection time for different feature selection algorithms (CMPSO, ACO, KH, IKH, and LNNLS-KH) on the Normal, DoS, DDoS, PortScan, and WebAttack subsets. (a) Feature selection time for different feature selection algorithms (horizontal axis: time of feature selection in seconds, 0 to 10000). (b) Intrusion detection time of different feature selection algorithms (horizontal axis: time of intrusion detection in seconds, 0 to 2.5).

The CMPSO, ACO, KH, and IKH algorithms are used for comparison with the LNNLS-KH algorithm. The preprocessed Normal, DoS, DDoS, PortScan, and WebAttack subsets are input into the algorithm model successively, and the dimensions and feature subsets of feature selection are obtained. We adopt the KNN classification model as the classifier and obtain the accuracy of intrusion detection on the test set data. The feature selection dimensions for the CICIDS2017 dataset are shown in Table 14. The LNNLS-KH algorithm selects different features for different attack types. For example, the selected features of the DoS subset are “Total Length of Bwd Packets,” “Fwd Packet Length Min,” “Flow IAT Min,” “FIN Flag Count,” “RST Flag Count,” “URG Packets/Bulk,” “Bwd Avg Packets/Bulk,” “Idle Mean,” and “Idle Std.” For the WebAttack subset, “Total Fwd Packets,” “Bwd IAT Max,” “Bwd PSH Flags,” “Fwd Packets/s,” “Bwd Avg Packets/Bulk,” “Subflow Fwd Bytes,” “Active Max,” and “Idle Max” are selected as attack features by the LNNLS-KH algorithm. It reduces the feature dimension of the IDS dataset while ensuring high accuracy. The average feature dimension selected by the LNNLS-KH algorithm is 10.2, accounting for 13.08% of the total number of features in the CICIDS2017 dataset. It decreases the number of features by 57.85%, 52.34%, 27.14%, and 25%, respectively, compared with the CMPSO, ACO, KH, and IKH algorithms.
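The average subset size and the reduction percentages quoted for the CICIDS2017 dataset follow directly from the per-category subset sizes reported in Table 14. The short sketch below reproduces that arithmetic; the variable and function names are illustrative.

```python
# Subset sizes per traffic category (Normal, DoS, DDoS, PortScan, WebAttack),
# as reported in Table 14 for the CICIDS2017 dataset.
SUBSET_SIZES = {
    "CMPSO": [28, 24, 29, 24, 16],
    "ACO": [25, 19, 27, 21, 15],
    "KH": [14, 13, 21, 14, 8],
    "IKH": [14, 14, 18, 15, 7],
    "LNNLS-KH": [8, 9, 14, 12, 8],
}
TOTAL_FEATURES = 78


def mean(values):
    return sum(values) / len(values)


def reduction_percent(baseline, proposed):
    """Relative reduction of `proposed` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - proposed) / baseline


proposed_mean = mean(SUBSET_SIZES["LNNLS-KH"])          # average retained features
share_of_total = 100.0 * proposed_mean / TOTAL_FEATURES  # percent of all 78 features
reductions = {
    name: reduction_percent(mean(sizes), proposed_mean)
    for name, sizes in SUBSET_SIZES.items()
    if name != "LNNLS-KH"
}
```

Rounded to two decimals, `proposed_mean`, `share_of_total`, and the entries of `reductions` agree with the figures in the text (10.2, 13.08%, and 57.85%, 52.34%, 27.14%, and 25%).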

Figure 12 shows the feature selection time and intrusion detection time of the 5 feature selection algorithms to further evaluate their performance. It can be seen from Figure 12(a) that, in the feature selection stage, the LNNLS-KH algorithm takes a long time to find the optimal feature subset because of the linear nearest neighbor lasso step optimization performed after the position update of the krill herd. Compared with the KH and IKH algorithms, it increases the time by an average of 14.38% and 9.32%, respectively. Although the LNNLS-KH algorithm requires more computation time, its convergence speed and global search ability are improved. Figure 12(b) shows the intrusion detection time of the 5 feature selection algorithms, that is, the time the KNN classifier takes to classify the sample dataset after the feature subset has been found, excluding the time spent searching for the optimal feature subset. Because the feature dimension of the LNNLS-KH algorithm is low, the amount of data processed when classifying the detection sample dataset is small, which reduces the classification detection time. Compared with the CMPSO, ACO, KH, and IKH algorithms, the intrusion detection time of the LNNLS-KH algorithm is reduced by 6.52%, 5.17%, 2.14%, and 2.28% on average, respectively.

Table 14: The number of selected features for different algorithms (CICIDS2017 dataset).

| Data category | Algorithm | Number of features | Selected feature indices |
|---|---|---|---|
| Normal | CMPSO | 28 | 3, 7, 13, 15, 16, 17, 20, 22, 24, 26, 30, 35, 37, 38, 42, 43, 44, 45, 46, 49, 50, 56, 59, 62, 63, 64, 65, 76 |
| Normal | ACO | 25 | 1, 3, 4, 7, 10, 11, 12, 13, 15, 19, 29, 32, 34, 35, 37, 43, 46, 47, 51, 55, 56, 58, 73, 76, 78 |
| Normal | KH | 14 | 11, 19, 33, 39, 43, 49, 55, 56, 58, 65, 66, 68, 71, 73 |
| Normal | IKH | 14 | 5, 10, 19, 20, 21, 23, 27, 33, 43, 56, 69, 70, 73, 78 |
| Normal | LNNLS-KH | 8 | 6, 12, 16, 32, 38, 50, 54, 73 |
| DoS | CMPSO | 24 | 1, 3, 4, 13, 16, 17, 24, 26, 30, 33, 35, 39, 40, 44, 48, 51, 53, 57, 58, 59, 60, 62, 67, 70 |
| DoS | ACO | 19 | 3, 6, 12, 13, 15, 26, 35, 39, 51, 55, 60, 61, 66, 69, 71, 73, 75, 77, 78 |
| DoS | KH | 13 | 8, 16, 21, 30, 45, 50, 52, 57, 59, 63, 66, 67 |
| DoS | IKH | 14 | 2, 12, 15, 16, 19, 21, 32, 34, 44, 46, 65, 68, 76, 77 |
| DoS | LNNLS-KH | 9 | 6, 8, 20, 44, 46, 49, 61, 75, 76 |
| DDoS | CMPSO | 29 | 15, 18, 19, 20, 23, 25, 26, 33, 34, 35, 38, 39, 42, 43, 46, 47, 49, 51, 55, 56, 57, 59, 60, 61, 62, 63, 71, 72, 78 |
| DDoS | ACO | 27 | 6, 9, 10, 13, 16, 19, 24, 28, 31, 41, 42, 45, 47, 48, 50, 51, 52, 53, 54, 56, 59, 60, 61, 62, 65, 68, 72 |
| DDoS | KH | 21 | 10, 12, 13, 15, 18, 23, 27, 30, 34, 35, 41, 42, 45, 55, 61, 63, 65, 66, 68, 70, 76 |
| DDoS | IKH | 18 | 1, 11, 13, 14, 19, 24, 32, 35, 36, 40, 42, 47, 51, 57, 60, 69, 70, 75 |
| DDoS | LNNLS-KH | 14 | 2, 5, 8, 9, 11, 22, 26, 33, 41, 43, 47, 51, 74, 77 |
| PortScan | CMPSO | 24 | 1, 3, 6, 15, 16, 28, 30, 33, 35, 37, 44, 45, 52, 56, 59, 60, 61, 63, 65, 68, 70, 75, 77, 78 |
| PortScan | ACO | 21 | 1, 2, 6, 10, 15, 17, 26, 27, 29, 39, 42, 43, 46, 49, 58, 61, 66, 69, 70, 71, 76 |
| PortScan | KH | 14 | 15, 20, 22, 27, 37, 44, 49, 50, 53, 59, 62, 65, 67, 78 |
| PortScan | IKH | 15 | 1, 24, 30, 32, 33, 43, 49, 53, 54, 58, 60, 61, 63, 64, 69 |
| PortScan | LNNLS-KH | 12 | 2, 6, 15, 24, 25, 28, 32, 57, 59, 63, 66, 76 |
| WebAttack | CMPSO | 16 | 2, 7, 26, 29, 45, 47, 50, 52, 53, 54, 63, 66, 68, 69, 72, 78 |
| WebAttack | ACO | 15 | 3, 9, 10, 12, 19, 26, 40, 46, 50, 54, 64, 65, 68, 69, 73 |
| WebAttack | KH | 8 | 1, 17, 19, 36, 48, 49, 53, 60 |
| WebAttack | IKH | 7 | 14, 17, 35, 39, 44, 48, 54 |
| WebAttack | LNNLS-KH | 8 | 3, 29, 32, 37, 61, 64, 73, 77 |

Table 15: The classification accuracy of different feature selection algorithms (CICIDS2017 dataset).

| Data category | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%) |
|---|---|---|---|---|---|
| Normal | 89.78 | 89.06 | 92.70 | 94.58 | 94.64 |
| DoS | 77.03 | 82.69 | 90.90 | 93.34 | 94.51 |
| DDoS | 81.73 | 86.94 | 91.85 | 88.19 | 95.76 |
| PortScan | 92.38 | 95.64 | 95.05 | 97.35 | 97.55 |
| WebAttack | 89.12 | 93.08 | 93.77 | 94.26 | 96.85 |

Table 16: The classification FPR and DR of different feature selection algorithms (CICIDS2017 dataset).

| Data category | FPR (%): CMPSO | FPR (%): ACO | FPR (%): KH | FPR (%): IKH | FPR (%): LNNLS-KH | DR (%): CMPSO | DR (%): ACO | DR (%): KH | DR (%): IKH | DR (%): LNNLS-KH |
|---|---|---|---|---|---|---|---|---|---|---|
| Normal | 9.25 | 8.72 | 6.41 | 4.93 | 3.67 | 88.05 | 88.51 | 89.25 | 92.46 | 93.89 |
| DoS | 5.41 | 4.48 | 4.06 | 2.83 | 1.94 | 72.57 | 82.89 | 87.86 | 92.56 | 92.64 |
| DDoS | 6.85 | 4.92 | 4.54 | 6.33 | 3.18 | 79.03 | 83.47 | 90.22 | 87.52 | 92.98 |
| PortScan | 4.65 | 3.02 | 2.84 | 1.86 | 1.16 | 88.25 | 93.80 | 94.33 | 95.14 | 95.42 |
| WebAttack | 5.33 | 3.16 | 2.52 | 2.11 | 1.60 | 87.40 | 91.35 | 92.19 | 92.94 | 94.77 |

The selection results of the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used as feature subsets, and the KNN classifier is used to detect the test dataset. The classification accuracy of the different algorithms is shown in Table 15. Over the five types of subsets, the average classification accuracy of the proposed LNNLS-KH algorithm is 95.86%. In particular, the classification accuracy reaches 97.55% for the PortScan subset. Compared with the other four feature selection methods, the LNNLS-KH algorithm achieves an average increase of 3.11%, 8.52%, 8.58%, 2.45%, and 4.29% on the Normal, DoS, DDoS, PortScan, and WebAttack subsets, respectively. Table 16 shows the classification FPR and DR of different feature selection algorithms on the test sets. Across the five test sets, the LNNLS-KH algorithm has a lower FPR and a higher DR than the other four algorithms.
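The per-subset accuracy gains quoted above can be checked against the values in Table 15: each gain is the LNNLS-KH accuracy minus the mean accuracy of the four comparison algorithms. The sketch below performs that calculation; the names are illustrative.

```python
# Classification accuracy in percent, as reported in Table 15 (CICIDS2017 dataset).
ACCURACY = {
    "Normal":    {"CMPSO": 89.78, "ACO": 89.06, "KH": 92.70, "IKH": 94.58, "LNNLS-KH": 94.64},
    "DoS":       {"CMPSO": 77.03, "ACO": 82.69, "KH": 90.90, "IKH": 93.34, "LNNLS-KH": 94.51},
    "DDoS":      {"CMPSO": 81.73, "ACO": 86.94, "KH": 91.85, "IKH": 88.19, "LNNLS-KH": 95.76},
    "PortScan":  {"CMPSO": 92.38, "ACO": 95.64, "KH": 95.05, "IKH": 97.35, "LNNLS-KH": 97.55},
    "WebAttack": {"CMPSO": 89.12, "ACO": 93.08, "KH": 93.77, "IKH": 94.26, "LNNLS-KH": 96.85},
}


def gain_over_baselines(row, proposed="LNNLS-KH"):
    """Accuracy of the proposed method minus the mean accuracy of the other methods."""
    baselines = [value for name, value in row.items() if name != proposed]
    return row[proposed] - sum(baselines) / len(baselines)


gains = {subset: gain_over_baselines(row) for subset, row in ACCURACY.items()}
mean_accuracy = sum(row["LNNLS-KH"] for row in ACCURACY.values()) / len(ACCURACY)
mean_gain = sum(gains.values()) / len(gains)
```

Up to rounding, `gains` matches the per-subset increases listed in the text, `mean_accuracy` matches the reported 95.86% average, and `mean_gain` matches the 5.39% average accuracy improvement reported for the CICIDS2017 dataset.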

We propose the LNNLS-KH algorithm, a novel feature selection algorithm for intrusion detection. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the algorithm has good feature selection performance and improves the efficiency of intrusion detection.


22 Security and Communication Networks

Page 19: LNNLS-KH:AFeatureSelectionMethodforNetwork IntrusionDetection · ResearchArticle LNNLS-KH:AFeatureSelectionMethodforNetwork IntrusionDetection XinLi ,1PengYi ,1WeiWei,2YimingJiang,1andLeTian

number of features by 57.85%, 52.34%, 27.14%, and 25%, respectively, compared with the CMPSO, ACO, KH, and IKH algorithms.
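The reduction figures can be checked against the per-category feature counts listed in Table 14. The sketch below assumes that each reduction is the relative decrease of the mean feature count over the five traffic categories; the helper names `mean` and `reduction_percent` are illustrative and do not come from the paper.

```python
# Number of features selected per traffic category in Table 14
# (order: Normal, DoS, DDoS, PortScan, WebAttack).
selected = {
    "CMPSO":    [28, 24, 29, 24, 16],
    "ACO":      [25, 19, 27, 21, 15],
    "KH":       [14, 13, 21, 14, 8],
    "IKH":      [14, 14, 18, 15, 7],
    "LNNLS-KH": [8, 9, 14, 12, 8],
}


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def reduction_percent(baseline, proposed):
    """Relative reduction of the mean feature count, in percent (assumed definition)."""
    return 100.0 * (mean(baseline) - mean(proposed)) / mean(baseline)


reductions = {
    name: round(reduction_percent(counts, selected["LNNLS-KH"]), 2)
    for name, counts in selected.items()
    if name != "LNNLS-KH"
}

print(mean(selected["LNNLS-KH"]))  # 10.2
print(reductions)
```

Under this assumption the mean subset size of LNNLS-KH is 10.2 features, and the computed reductions match the quoted 57.85%, 52.34%, 27.14%, and 25%.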

Figure 12 shows the feature selection time and intrusion detection time of the five feature selection algorithms, to further evaluate their performance. It can be seen from Figure 12(a) that, in the feature selection stage, the LNNLS-KH algorithm takes longer to find the optimal feature subset because of the linear nearest neighbor lasso step optimization performed after the position update of the krill herd. Compared with the KH and IKH algorithms, it increases the time by an average of 14.38% and 9.32%, respectively. Although the LNNLS-KH algorithm requires more computation time, its convergence speed and global search ability are improved. Figure 12(b) shows the intrusion detection time of the five feature selection algorithms. This is the time the KNN classifier takes to detect the sample dataset after the feature subset has been found, excluding the time spent searching for the optimal feature subset. The feature dimension selected by the LNNLS-KH algorithm is low, so less data is processed when classifying the detection sample dataset, which reduces the classification detection time. Compared with the CMPSO, ACO, KH, and IKH algorithms, the intrusion detection time of the LNNLS-KH algorithm is reduced by 6.52%, 5.17%, 2.14%, and 2.28% on average, respectively.
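To illustrate why a smaller feature subset shortens detection time, the following minimal sketch implements a KNN classifier that computes distances only over the selected feature indices, so the cost of each comparison grows with the size of the subset. The function `knn_predict`, the toy data, and the chosen indices are hypothetical and are not the experimental setup of the paper.

```python
from collections import Counter


def knn_predict(train_x, train_y, query, feature_idx, k=3):
    """Classify `query` with k-nearest neighbours using only `feature_idx`."""

    def sq_dist(a, b):
        # One term per selected feature: fewer features, less work per comparison.
        return sum((a[i] - b[i]) ** 2 for i in feature_idx)

    order = sorted(range(len(train_x)), key=lambda j: sq_dist(train_x[j], query))
    votes = Counter(train_y[j] for j in order[:k])
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): four features, of which only 0 and 2 carry signal.
train_x = [
    [0.1, 5.0, 0.2, 9.0], [0.2, 1.0, 0.1, 3.0], [0.0, 7.0, 0.3, 2.0],
    [0.9, 4.0, 1.0, 8.0], [1.0, 2.0, 0.8, 1.0], [0.8, 6.0, 0.9, 5.0],
]
train_y = ["normal", "normal", "normal", "attack", "attack", "attack"]

query = [0.85, 5.0, 0.95, 9.0]
label = knn_predict(train_x, train_y, query, feature_idx=[0, 2], k=3)
print(label)  # attack
```

With the subset `[0, 2]` each distance needs two terms instead of four, which is the effect the reduced detection times in Figure 12(b) reflect at a much larger scale.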

The feature subsets selected by the CMPSO, ACO, KH, IKH, and LNNLS-KH algorithms are used with the KNN classifier to detect the test dataset. The classification accuracy of the different algorithms is shown in Table 15. Over the five types of subsets, the average classification accuracy of the proposed LNNLS-KH algorithm is 95.86%. In particular, the classification accuracy reaches 97.55% for the PortScan subset. Compared with the other four feature selection methods, the LNNLS-KH algorithm has an average increase of 3.11%, 8.52%, 8.58%, 2.45%, and 4.29% on the

Table 14: The number of selected features for different algorithms (CICIDS2017 dataset).

| Data categories | CMPSO | ACO | KH | IKH | LNNLS-KH |
|---|---|---|---|---|---|
| Normal | 28 (3, 7, 13, 15, 16, 17, 20, 22, 24, 26, 30, 35, 37, 38, 42, 43, 44, 45, 46, 49, 50, 56, 59, 62, 63, 64, 65, 76) | 25 (1, 3, 4, 7, 10, 11, 12, 13, 15, 19, 29, 32, 34, 35, 37, 43, 46, 47, 51, 55, 56, 58, 73, 76, 78) | 14 (11, 19, 33, 39, 43, 49, 55, 56, 58, 65, 66, 68, 71, 73) | 14 (5, 10, 19, 20, 21, 23, 27, 33, 43, 56, 69, 70, 73, 78) | 8 (6, 12, 16, 32, 38, 50, 54, 73) |
| DoS | 24 (1, 3, 4, 13, 16, 17, 24, 26, 30, 33, 35, 39, 40, 44, 48, 51, 53, 57, 58, 59, 60, 62, 67, 70) | 19 (3, 6, 12, 13, 15, 26, 35, 39, 51, 55, 60, 61, 66, 69, 71, 73, 75, 77, 78) | 13 (8, 16, 21, 30, 45, 50, 52, 57, 59, 63, 66, 67) | 14 (2, 12, 15, 16, 19, 21, 32, 34, 44, 46, 65, 68, 76, 77) | 9 (6, 8, 20, 44, 46, 49, 61, 75, 76) |
| DDoS | 29 (15, 18, 19, 20, 23, 25, 26, 33, 34, 35, 38, 39, 42, 43, 46, 47, 49, 51, 55, 56, 57, 59, 60, 61, 62, 63, 71, 72, 78) | 27 (6, 9, 10, 13, 16, 19, 24, 28, 31, 41, 42, 45, 47, 48, 50, 51, 52, 53, 54, 56, 59, 60, 61, 62, 65, 68, 72) | 21 (10, 12, 13, 15, 18, 23, 27, 30, 34, 35, 41, 42, 45, 55, 61, 63, 65, 66, 68, 70, 76) | 18 (1, 11, 13, 14, 19, 24, 32, 35, 36, 40, 42, 47, 51, 57, 60, 69, 70, 75) | 14 (2, 5, 8, 9, 11, 22, 26, 33, 41, 43, 47, 51, 74, 77) |
| PortScan | 24 (1, 3, 6, 15, 16, 28, 30, 33, 35, 37, 44, 45, 52, 56, 59, 60, 61, 63, 65, 68, 70, 75, 77, 78) | 21 (1, 2, 6, 10, 15, 17, 26, 27, 29, 39, 42, 43, 46, 49, 58, 61, 66, 69, 70, 71, 76) | 14 (15, 20, 22, 27, 37, 44, 49, 50, 53, 59, 62, 65, 67, 78) | 15 (1, 24, 30, 32, 33, 43, 49, 53, 54, 58, 60, 61, 63, 64, 69) | 12 (2, 6, 15, 24, 25, 28, 32, 57, 59, 63, 66, 76) |
| WebAttack | 16 (2, 7, 26, 29, 45, 47, 50, 52, 53, 54, 63, 66, 68, 69, 72, 78) | 15 (3, 9, 10, 12, 19, 26, 40, 46, 50, 54, 64, 65, 68, 69, 73) | 8 (1, 17, 19, 36, 48, 49, 53, 60) | 7 (14, 17, 35, 39, 44, 48, 54) | 8 (3, 29, 32, 37, 61, 64, 73, 77) |

Table 15: The classification accuracy of different feature selection algorithms (CICIDS2017 dataset).

| Data categories | CMPSO (%) | ACO (%) | KH (%) | IKH (%) | LNNLS-KH (%) |
|---|---|---|---|---|---|
| Normal | 89.78 | 89.06 | 92.70 | 94.58 | 94.64 |
| DoS | 77.03 | 82.69 | 90.90 | 93.34 | 94.51 |
| DDoS | 81.73 | 86.94 | 91.85 | 88.19 | 95.76 |
| PortScan | 92.38 | 95.64 | 95.05 | 97.35 | 97.55 |
| WebAttack | 89.12 | 93.08 | 93.77 | 94.26 | 96.85 |

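Table 16: The classification FPR and DR of different feature selection algorithms (CICIDS2017 dataset).

| Data categories | FPR (%): CMPSO | FPR (%): ACO | FPR (%): KH | FPR (%): IKH | FPR (%): LNNLS-KH | DR (%): CMPSO | DR (%): ACO | DR (%): KH | DR (%): IKH | DR (%): LNNLS-KH |
|---|---|---|---|---|---|---|---|---|---|---|
| Normal | 9.25 | 8.72 | 6.41 | 4.93 | 3.67 | 88.05 | 88.51 | 89.25 | 92.46 | 93.89 |
| DoS | 5.41 | 4.48 | 4.06 | 2.83 | 1.94 | 72.57 | 82.89 | 87.86 | 92.56 | 92.64 |
| DDoS | 6.85 | 4.92 | 4.54 | 6.33 | 3.18 | 79.03 | 83.47 | 90.22 | 87.52 | 92.98 |
| PortScan | 4.65 | 3.02 | 2.84 | 1.86 | 1.16 | 88.25 | 93.80 | 94.33 | 95.14 | 95.42 |
| WebAttack | 5.33 | 3.16 | 2.52 | 2.11 | 1.60 | 87.40 | 91.35 | 92.19 | 92.94 | 94.77 |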

Security and Communication Networks 19

Normal, DoS, DDoS, PortScan, and WebAttack subsets, respectively. Table 16 shows the classification FPR and DR of the different feature selection algorithms on the test sets. On all five test sets, the LNNLS-KH algorithm has a lower FPR and a higher DR than the other four algorithms.
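The accuracy figures discussed above can be recomputed from Table 15. The sketch below assumes that each per-category increase is the LNNLS-KH accuracy minus the mean accuracy of the other four algorithms, that is, an absolute difference in percentage points; the variable names are illustrative.

```python
# Classification accuracy (%) from Table 15, CICIDS2017 dataset.
# Column order: CMPSO, ACO, KH, IKH, LNNLS-KH.
accuracy = {
    "Normal":    [89.78, 89.06, 92.70, 94.58, 94.64],
    "DoS":       [77.03, 82.69, 90.90, 93.34, 94.51],
    "DDoS":      [81.73, 86.94, 91.85, 88.19, 95.76],
    "PortScan":  [92.38, 95.64, 95.05, 97.35, 97.55],
    "WebAttack": [89.12, 93.08, 93.77, 94.26, 96.85],
}

gains = {}
for category, row in accuracy.items():
    *baselines, proposed = row
    # Gain of LNNLS-KH over the mean of the four baseline algorithms.
    gains[category] = proposed - sum(baselines) / len(baselines)

mean_proposed = sum(row[-1] for row in accuracy.values()) / len(accuracy)
mean_gain = sum(gains.values()) / len(gains)

print(f"mean LNNLS-KH accuracy: {mean_proposed:.2f}%")
print(f"mean gain over baselines: {mean_gain:.2f} points")
```

Under this assumption the mean LNNLS-KH accuracy is about 95.86 and the mean gain is about 5.39 points, and the per-category gains agree with the quoted 3.11, 8.52, 8.58, 2.45, and 4.29 up to rounding.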

We propose the LNNLS-KH algorithm, a novel feature selection algorithm for intrusion detection. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the algorithm has good feature selection performance and improves the efficiency of intrusion detection.
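For readers who want to compute the two metrics reported in Table 16, the sketch below uses the conventional confusion-matrix definitions, FPR = FP / (FP + TN) and DR = TP / (TP + FN). It is assumed here that these match the definitions used in the experiments; the function name and the counts are hypothetical.

```python
def fpr_and_dr(tp, fp, tn, fn):
    """Return (false positive rate, detection rate) as percentages."""
    fpr = 100.0 * fp / (fp + tn)  # share of normal traffic wrongly flagged
    dr = 100.0 * tp / (tp + fn)   # share of attacks correctly detected
    return fpr, dr


# Hypothetical counts, for illustration only (not taken from the paper).
fpr, dr = fpr_and_dr(tp=940, fp=30, tn=970, fn=60)
print(f"FPR = {fpr:.2f}%, DR = {dr:.2f}%")  # FPR = 3.00%, DR = 94.00%
```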

5. Conclusions

With the rapid development of network technology, intrusion detection plays an increasingly important role in network security. However, the “dimensional disaster” caused by massive data results in problems such as slow response and poor accuracy of the intrusion detection system. The KH algorithm is a population-based swarm intelligence optimization method that shows good performance in high-dimensional data processing, providing a new approach for reducing the dimension of intrusion detection data and selecting useful features. In this paper, an improved KH algorithm, named LNNLS-KH, is proposed for feature selection of IDS datasets by linear nearest neighbor lasso step optimization. The LNNLS-KH algorithm introduces a new fitness function, which is composed of the number of selected features and the classification accuracy. Nonlinear optimization is introduced into the physical diffusion motion of krill individuals to accelerate the convergence of the algorithm. Moreover, the linear nearest neighbor lasso step optimization is proposed to balance the exploration and exploitation abilities and obtain the global optimal solution of the feature subset effectively. Experiments based on the NSL-KDD and CICIDS2017 datasets show that the LNNLS-KH algorithm retains 7 and 10.2 features on average, respectively, which greatly reduces the dimension of the features. In the NSL-KDD dataset, features are reduced by 44%, 42.86%, 34.88%, and 24.32% compared with the CMPSO, ACO, KH, and IKH algorithms, and in the CICIDS2017 dataset, they are reduced by 57.85%, 52.34%, 27.14%, and 25%, respectively. In addition, the classification accuracy of the LNNLS-KH feature selection algorithm is increased by 10.03% and 5.39%, and the time of intrusion detection is reduced by 12.41% and 4.03% on the two datasets. Furthermore, the LNNLS-KH algorithm enhances the ability to jump out of local optimal solutions and shows good performance in the optimal fitness iteration curve, false positive rate of detection, and convergence speed, which demonstrates that the proposed LNNLS-KH algorithm is an efficient feature selection method for network intrusion detection.
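A fitness function that combines classification accuracy with the number of selected features is often written as a weighted sum. The sketch below shows one such form as an assumption; it is not the exact fitness function defined earlier in the paper, and the weight `alpha`, the function name, and the example values are all hypothetical.

```python
def fitness(error_rate, n_selected, n_total, alpha=0.9):
    """Weighted cost of a feature subset; lower is better.

    alpha weights the classification error and (1 - alpha) weights the
    relative subset size. Both the form and alpha=0.9 are illustrative
    assumptions, not values taken from the paper.
    """
    if not 0 < n_selected <= n_total:
        raise ValueError("n_selected must be between 1 and n_total")
    return alpha * error_rate + (1.0 - alpha) * (n_selected / n_total)


# Two candidate subsets with the same (made-up) accuracy of 95%
# on a dataset with 78 features (size chosen for illustration).
small = fitness(error_rate=1 - 0.95, n_selected=10, n_total=78)
large = fitness(error_rate=1 - 0.95, n_selected=24, n_total=78)
print(small < large)  # True: at equal accuracy, the smaller subset wins
```

The design choice in such a form is the trade-off set by `alpha`: a value close to 1 favors accuracy, while a lower value pushes the search toward smaller subsets.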

In this research, we found that the initialization of the LNNLS-KH algorithm has a certain degree of randomness. Therefore, we conducted independent and repeated experiments to reduce its influence, and the results were reasonable and convincing. Although the proposed algorithm shows encouraging performance, it could be further improved.
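Repeating a stochastic selector over several independent runs and averaging the results can be sketched as follows. The function `select_features` is a hypothetical stand-in that only mimics run-to-run variation; it is not the LNNLS-KH algorithm, and the feature count and number of runs are illustrative.

```python
import random
import statistics


def select_features(n_total, seed):
    """Hypothetical stand-in for one run of a stochastic feature selector."""
    rng = random.Random(seed)                 # independent, reproducible run
    k = rng.randint(6, 9)                     # subset size varies between runs
    return sorted(rng.sample(range(1, n_total + 1), k))


def repeated_runs(n_total, n_runs):
    """Mean and sample standard deviation of the subset size over n_runs runs."""
    sizes = [len(select_features(n_total, seed)) for seed in range(n_runs)]
    return statistics.mean(sizes), statistics.stdev(sizes)


mean_size, std_size = repeated_runs(n_total=41, n_runs=20)
print(f"selected features: {mean_size:.1f} +/- {std_size:.1f}")
```

Averaging over runs in this way is also why a mean subset size can be fractional, as with the 10.2 features reported for the CICIDS2017 dataset.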

In future work, we will consider using data balancing techniques to preprocess the experimental dataset, in order to obtain more accurate feature selection results and stronger algorithm stability. Meanwhile, we will combine the LNNLS-KH algorithm with other algorithms to improve the exploration and exploitation abilities, thereby further shortening the time of feature subset training and classification detection. In addition, as the LNNLS-KH algorithm is universally applicable, it can be applied to more feature selection systems and to optimization problems in other fields.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was sponsored by the National Key Research and Development Program of China (Grants 2018YFB0804002 and 2017YFB0803204), National Natural Science Foundation of P.R. China (Grant 72001191), Henan Natural Science Foundation (Grant 202300410442), and Henan Philosophy and Social Science Program (Grant 2020CZH009).

References

[1] W. Wei and C. Guo, “A text semantic topic discovery method based on the conditional co-occurrence degree,” Neurocomputing, vol. 368, pp. 11–24, 2019.

[2] C.-R. Wang, R.-F. Xu, S.-J. Lee, and C.-H. Lee, “Network intrusion detection using equality constrained-optimization-based extreme learning machines,” Knowledge-Based Systems, vol. 147, pp. 68–80, 2018.

[3] G.-G. Wang, A. H. Gandomi, A. H. Alavi, and D. Gong, “A comprehensive review of krill herd algorithm: variants, hybrids and applications,” Artificial Intelligence Review, vol. 51, no. 1, pp. 119–148, 2019.

[4] J. Amudhavel, D. Sathian, R. S. Raghav et al., “A fault tolerant distributed self-organization in peer to peer (p2p) using krill herd optimization,” in Proceedings of the 2015 International Conference on Advanced Research in Computer Science Engineering & Technology (ICARCSET 2015), pp. 1–5, Unnao, India, 2015.

[5] L. M. Abualigah, A. T. Khader, and E. S. Hanandeh, “Hybrid clustering analysis using improved krill herd algorithm,” Applied Intelligence, vol. 48, no. 11, pp. 4047–4071, 2018.

[6] P. A. Kowalski and S. Łukasik, “Training neural networks with krill herd algorithm,” Neural Processing Letters, vol. 44, no. 1, pp. 5–17, 2016.

[7] C. Stasinakis, G. Sermpinis, I. Psaradellis, and T. Verousis, “Krill-Herd Support Vector Regression and heterogeneous autoregressive leverage: evidence from forecasting and trading commodities,” Quantitative Finance, vol. 16, no. 12, pp. 1901–1915, 2016.


[8] L. Wang, P. Jia, T. Huang, S. Duan, J. Yan, and L. Wang, “A novel optimization technique to improve gas recognition by electronic noses based on the enhanced krill herd algorithm,” Sensors, vol. 16, no. 8, p. 1275, 2016.

[9] R. Jensi and G. W. Jiji, “An improved krill herd algorithm with global exploration capability for solving numerical function optimization problems and its application to data clustering,” Applied Soft Computing, vol. 46, pp. 230–245, 2016.

[10] H. Pulluri, R. Naresh, and V. Sharma, “Application of stud krill herd algorithm for solution of optimal power flow problems,” International Transactions on Electrical Energy Systems, vol. 27, no. 6, Article ID e2316, 2017.

[11] D. Rodrigues, L. A. M. Pereira, J. P. Papa et al., “A binary krill herd approach for feature selection,” in Proceedings of the 2014 22nd International Conference on Pattern Recognition, pp. 1407–1412, IEEE, Stockholm, Sweden, August 2014.

[12] A. Mukherjee and V. Mukherjee, “Chaotic krill herd algorithm for optimal reactive power dispatch considering FACTS devices,” Applied Soft Computing, vol. 44, pp. 163–190, 2016.

[13] S. Sun, H. Qi, F. Zhao, L. Ruan, and B. Li, “Inverse geometry design of two-dimensional complex radiative enclosures using krill herd optimization algorithm,” Applied Thermal Engineering, vol. 98, pp. 1104–1115, 2016.

[14] S. Sultana and P. K. Roy, “Oppositional krill herd algorithm for optimal location of capacitor with reconfiguration in radial distribution system,” International Journal of Electrical Power & Energy Systems, vol. 74, pp. 78–90, 2016.

[15] L. Brezocnik, I. Fister, and V. Podgorelec, “Swarm intelligence algorithms for feature selection: a review,” Applied Sciences, vol. 8, no. 9, 2018.

[16] D. Smith, Q. Guan, and S. Fu, “An anomaly detection framework for autonomic management of compute cloud systems,” in Proceedings of the 2010 IEEE 34th Annual Computer Software and Applications Conference Workshops, pp. 376–381, IEEE, Seoul, South Korea, July 2010.

[17] Y. Zhao, Y. Zhang, W. Tong et al., “An improved feature selection algorithm based on MAHALANOBIS distance for network intrusion detection,” in Proceedings of 2013 International Conference on Sensor Network Security Technology and Privacy Communication System, pp. 69–73, IEEE, Nangang, China, May 2013.

[18] P. Singh and A. Tiwari, “An efficient approach for intrusion detection in reduced features of KDD99 using ID3 and classification with KNNGA,” in Proceedings of the 2015 Second International Conference on Advances in Computing and Communication Engineering, pp. 445–452, IEEE, Dehradun, India, May 2015.

[19] M. A. Ambusaidi, X. He, P. Nanda, and Z. Tan, “Building an intrusion detection system using a filter-based feature selection algorithm,” IEEE Transactions on Computers, vol. 65, no. 10, pp. 2986–2998, 2016.

[20] N. Shone, T. N. Ngoc, V. D. Phai, and Q. Shi, “A deep learning approach to network intrusion detection,” IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 2, no. 1, pp. 41–50, 2018.

[21] Y. Xue, W. Jia, X. Zhao et al., “An evolutionary computation based feature selection method for intrusion detection,” Security and Communication Networks, vol. 2018, Article ID 2492956, 10 pages, 2018.

[22] Z. Shen, Y. Zhang, and W. Chen, “A bayesian classification intrusion detection method based on the fusion of PCA and LDA,” Security and Communication Networks, vol. 2019, Article ID 6346708, 11 pages, 2019.

[23] P. Sun, P. Liu, Q. Li et al., “DL-IDS: Extracting features using CNN-LSTM hybrid network for intrusion detection system,” Security and Communication Networks, vol. 2020, Article ID 8890306, 11 pages, 2020.

[24] G. Farahani, “Feature selection based on cross-correlation for the intrusion detection system,” Security & Communication Networks, vol. 2020, Article ID 8875404, 17 pages, 2020.

[25] F. G. Mohammadi, M. H. Amini, and H. R. Arabnia, “Applications of nature-inspired algorithms for dimension reduction: enabling efficient data analytics,” in Advances in Intelligent Systems and Computing, Optimization, Learning, and Control for Interdependent Complex Networks, pp. 67–84, Springer, Cham, Switzerland, 2020.

[26] J. Kennedy and R. Eberhart, “Particle swarm optimization,” in Proceedings of the ICNN’95-International Conference on Neural Networks, no. 4, pp. 1942–1948, IEEE, Perth, WA, Australia, December 1995.

[27] M. Dorigo, M. Birattari, and T. Stutzle, “Ant colony optimization,” IEEE Computational Intelligence Magazine, vol. 1, no. 4, pp. 28–39, 2006.

[28] R. Rajabioun, “Cuckoo optimization algorithm,” Applied Soft Computing, vol. 11, no. 8, pp. 5508–5518, 2011.

[29] M. Neshat, G. Sepidnam, M. Sargolzaei, and A. N. Toosi, “Artificial fish swarm algorithm: a survey of the state-of-the-art, hybridization, combinatorial and indicative applications,” Artificial Intelligence Review, vol. 42, no. 4, pp. 965–997, 2014.

[30] D. Karaboga, “An idea based on honey bee swarm for numerical optimization,” Technical Report-tr06, Erciyes University, Engineering Faculty, Computer Engineering Department, Kayseri, Turkey, 2005.

[31] W.-T. Pan, “A new Fruit Fly Optimization Algorithm: taking the financial distress model as an example,” Knowledge-Based Systems, vol. 26, pp. 69–74, 2012.

[32] R. Zhao and W. Tang, “Monkey algorithm for global numerical optimization,” Journal of Uncertain Systems, vol. 2, no. 3, pp. 165–176, 2008.

[33] X. S. Yang and X. He, “Bat algorithm: literature review and applications,” International Journal of Bio-Inspired Computation, vol. 5, no. 3, pp. 141–149, 2013.

[34] S. Mirjalili, A. H. Gandomi, S. Z. Mirjalili, S. Saremi, H. Faris, and S. M. Mirjalili, “Salp Swarm Algorithm: a bio-inspired optimizer for engineering design problems,” Advances in Engineering Software, vol. 114, pp. 163–191, 2017.

[35] K. Ahmed, A. E. Hassanien, and S. Bhattacharyya, “A novel chaotic chicken swarm optimization algorithm for feature selection,” in Proceedings of the 2017 Third International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), pp. 259–264, IEEE, Kolkata, India, November 2017.

[36] S. Tabakhi, P. Moradi, F. Akhlaghian et al., “An unsupervised feature selection algorithm based on ant colony optimization,” Engineering Applications of Artificial Intelligence, vol. 32, pp. 112–123, 2014.

[37] S. Arora and P. Anand, “Binary butterfly optimization approaches for feature selection,” Expert Systems with Applications, vol. 116, pp. 147–160, 2019.

[38] C. Yan, J. Ma, H. Luo, and A. Patel, “Hybrid binary coral reefs optimization algorithm with simulated annealing for feature selection in high-dimensional biomedical datasets,” Chemometrics and Intelligent Laboratory Systems, vol. 184, pp. 102–111, 2019.

[39] G. I. Sayed, A. Tharwat, and A. E. Hassanien, “Chaotic dragonfly algorithm: an improved metaheuristic algorithm for feature selection,” Applied Intelligence, vol. 49, no. 1, pp. 188–205, 2019.

[40] Z. Zhang, P. Wei, Y. Li et al., “Feature selection algorithm based on improved particle swarm joint taboo search,” Journal of Communication, vol. 39, no. 12, pp. 60–68, 2018.

[41] A. H. Gandomi and A. H. Alavi, “Krill herd: a new bio-inspired optimization algorithm,” Communications in Nonlinear Science and Numerical Simulation, vol. 17, no. 12, pp. 4831–4845, 2012.

[42] Q. Tan and Z. Huang, “Krill herd with nearest neighbor lasso operator,” Computer Engineering and Applications, vol. 55, no. 9, pp. 124–129, 2019.

[43] Q. Wang, C. Ding, and X. Wang, “A hybrid data clustering algorithm based on improved krill herd algorithm and KHM clustering,” Control and Decision, vol. 35, no. 10, pp. 2449–2458, 2018.

[44] Q. Li and B. Liu, “Clustering using an improved krill herd algorithm,” Algorithms, vol. 10, no. 2, p. 56, 2017.

[45] G.-G. Wang, A. H. Gandomi, and A. H. Alavi, “Stud krill herd algorithm,” Neurocomputing, vol. 128, pp. 363–370, 2014.

[46] J. Li, Y. Tang, C. Hua, and X. Guan, “An improved krill herd algorithm: krill herd with linear decreasing step,” Applied Mathematics and Computation, vol. 234, pp. 356–367, 2014.

[47] H. B. Nguyen, B. Xue, P. Andreae et al., “Particle swarm optimisation with genetic operators for feature selection,” in Proceedings of the 2017 IEEE Congress on Evolutionary Computation (CEC), pp. 286–293, IEEE, San Sebastian, Spain, June 2017.

[48] M. H. Aghdam and P. Kabiri, “Feature selection for intrusion detection system using ant colony optimization,” International Journal of Network Security, vol. 18, no. 3, pp. 420–432, 2016.


Page 20: LNNLS-KH:AFeatureSelectionMethodforNetwork IntrusionDetection · ResearchArticle LNNLS-KH:AFeatureSelectionMethodforNetwork IntrusionDetection XinLi ,1PengYi ,1WeiWei,2YimingJiang,1andLeTian

Normal DoS DDoS PortScan and WebAttack subsetsrespectively Table 16 shows the classification FPR and DR ofdifferent feature selection algorithms on the test sets Basedon the detection of five different test sets the LNNLS-KHalgorithm has lower FPR and higher DR than other fouralgorithms

We propose the LNNLS-KH algorithm a novel featureselection algorithm for intrusion detection Experimentsbased on NSL-KDD and CICIDS2017 datasets show that thealgorithm has good feature selection performance and im-proves the efficiency of intrusion detection

5 Conclusions

With the rapid development of network technology in-trusion detection plays an increasingly important role innetwork security However the ldquodimensional disasterrdquo wascaused by massive data results in problems such as slowresponse and poor accuracy of the intrusion detectionsystem KH algorithm is a new swarm intelligence opti-mization method based on population which shows goodperformance in high-dimensional data processing provid-ing a new approach for reducing the dimension of intrusiondetection data and selecting useful features In this paper animproved KH algorithm named LNNLS-KH is proposedfor feature selection of IDS datasets by linear nearestneighbor lasso optimization 2e LNNLS-KH algorithmintroduces a new fitness function which is composed of thenumber of feature selection dimensions and classificationaccuracy Nonlinear optimization is introduced into thephysical diffusion motion of krill individuals to acceleratethe convergence speed of the algorithmMoreover the linearneighbor lasso step optimization is proposed to balance theexploration and exploitation abilities and obtain the globaloptimal solution of the feature subset effectively Experi-ments based on NSL-KDD and CICIDS2017 datasets showthat the LNNLS-KH algorithm retains 7 and 102 features onaverage which greatly reduces the dimension of the featuresIn the NSL-KDD dataset features are reduced by 444286 3488 and 2432 compared with CMPSO ACOKH and IKH algorithms And in the CICIDS2017 datasetthey are reduced by 5785 5234 2714 and 25respectively In addition the classification accuracy of theLNNLS-KH feature selection algorithm is increased by1003 and 539 and the time of intrusion detection isreduced by 1241 and 403 on the two datasets Fur-thermore LNNLS-KH algorithm enhances the ability ofjumping out of the local optimal solution and shows goodperformance in the optimal fitness iteration curve falsepositive rate of detection and convergence speed whichdemonstrated that the proposed LNNLS-KH 
algorithm is anefficient feature selection method for network intrusiondetection

In this research we realized that the initialization of theLNNLS-KH algorithm has a certain degree of randomness2erefore we conducted independent and repeated exper-iments to solve the problem and the results were reasonableand convincing Although the proposed algorithm showsencouraging performance it could be further improved

In future work we consider using data balancingtechniques to preprocess the experimental dataset to obtainmore accurate feature selection results and stronger algo-rithm stability Meanwhile we will combine the LNNLS-KHwith other algorithms to improve the exploration and ex-ploitation abilities thereby further shortening the time oftraining feature subset and classification detection On thecontrary as the LNNLS-KH algorithm is universally ap-plicable the LNNLS-KH algorithm can be applied to morefeature selection systems and solve optimization problems inother fields

Data Availability

2e data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

2e authors declare that there are no conflicts of interestregarding the publication of this paper

Acknowledgments

2is work was sponsored by the National Key Research andDevelopment Program of China (Grants 2018YFB0804002and 2017YFB0803204) National Natural Science Founda-tion of PR China (Grant 72001191) Henan Natural ScienceFoundation (Grant 202300410442) and Henan Philosophyand Social Science Program (Grant 2020CZH009)

References

[1] W Wei and C Guo ldquoA text semantic topic discovery methodbased on the conditional co-occurrence degreerdquo Neuro-computing vol 368 pp 11ndash24 2019

[2] C-R Wang R-F Xu S-J Lee and C-H Lee ldquoNetwork in-trusion detection using equality constrained-optimization-basedextreme learning machinesrdquo Knowledge-Based Systems vol 147pp 68ndash80 2018

[3] G-G Wang A H Gandomi A H Alavi and D Gong ldquoAcomprehensive review of krill herd algorithm variants hy-brids and applicationsrdquo Artificial Intelligence Review vol 51no 1 pp 119ndash148 2019

[4] J Amudhavel D Sathian R S Raghav et al ldquoA fault tolerantdistributed self-organization in peer to peer (p2p) using krillherd optimizationrdquo in Proceedings of the 2015 InternationalConference on Advanced Research in Computer Science En-gineering amp Technology (ICARCSET 2015) pp 1ndash5 UnnaoIndia 2015

[5] L M Abualigah A T Khader and E S Hanandeh ldquoHybridclustering analysis using improved krill herd algorithmrdquoApplied Intelligence vol 48 no 11 pp 4047ndash4071 2018

[6] P A Kowalski and S Łukasik ldquoTraining neural networks withkrill herd algorithmrdquo Neural Processing Letters vol 44 no 1pp 5ndash17 2016

[7] C Stasinakis G Sermpinis I Psaradellis and T VerousisldquoKrill-Herd Support Vector Regression and heterogeneousautoregressive leverage evidence from forecasting and trad-ing commoditiesrdquo Quantitative Finance vol 16 no 12pp 1901ndash1915 2016

20 Security and Communication Networks

[8] L Wang P Jia T Huang S Duan J Yan and L Wang ldquoAnovel optimization technique to improve gas recognition byelectronic noses based on the enhanced krill herd algorithmrdquoSensors vol 16 no 8 p 1275 2016

[9] R Jensi and GW Jiji ldquoAn improved krill herd algorithmwithglobal exploration capability for solving numerical functionoptimization problems and its application to data clusteringrdquoApplied Soft Computing vol 46 pp 230ndash245 2016

[10] H Pulluri R Naresh and V Sharma ldquoApplication of studkrill herd algorithm for solution of optimal power flowproblemsrdquo International Transactions on Electrical EnergySystems vol 27 no 6 Article ID e2316 2017

[11] D Rodrigues L A M Pereira J P Papa et al ldquoA binary krillherd approach for feature selectionrdquo in Proceedings of the 201422nd International Conference on Pattern Recognitionpp 1407ndash1412 IEEE Stockholm Sweden August 2014

[12] A Mukherjee and V Mukherjee ldquoChaotic krill herd algo-rithm for optimal reactive power dispatch considering FACTSdevicesrdquo Applied Soft Computing vol 44 pp 163ndash190 2016

[13] S Sun H Qi F Zhao L Ruan and B Li ldquoInverse geometrydesign of two-dimensional complex radiative enclosures usingkrill herd optimization algorithmrdquo Applied ermal Engi-neering vol 98 pp 1104ndash1115 2016

[14] S Sultana and P K Roy ldquoOppositional krill herd algorithmfor optimal location of capacitor with reconfiguration inradial distribution systemrdquo International Journal of ElectricalPower amp Energy Systems vol 74 pp 78ndash90 2016

[15] L Brezocnik I Fister and V Podgorelec ldquoSwarm intelligencealgorithms for feature selection a reviewrdquo Applied Sciencesvol 8 no 9 2018

[16] D Smith Q Guan and S Fu ldquoAn anomaly detectionframework for autonomic management of compute cloudsystemsrdquo in Proceedings of the 2010 IEEE 34th AnnualComputer Software and Applications Conference Workshopspp 376ndash381 IEEE Seoul South Korea July 2010

[17] Y Zhao Y Zhang W Tong et al ldquoAn improved featureselection algorithm based on MAHALANOBIS distance fornetwork intrusion detectionrdquo in Proceedings of 2013 Inter-national Conference on Sensor Network Security Technologyand Privacy Communication System pp 69ndash73 IEEE Nan-gang China May 2013

[18] P Singh and A Tiwari ldquoAn efficient approach for intrusiondetection in reduced features of KDD99 using ID3 andclassification with KNNGArdquo in Proceedings of the 2015 SecondInternational Conference on Advances in Computing andCommunication Engineering pp 445ndash452 IEEE DehradunIndia May 2015

[19] M A Ambusaidi X He P Nanda and Z Tan ldquoBuilding anintrusion detection system using a filter-based feature se-lection algorithmrdquo IEEE Transactions on Computers vol 65no 10 pp 2986ndash2998 2016

[20] N Shone T N Ngoc V D Phai and Q Shi ldquoA deep learningapproach to network intrusion detectionrdquo IEEE Transactionson Emerging Topics in Computational Intelligence vol 2 no 1pp 41ndash50 2018

[21] Y Xue W Jia X Zhao et al ldquoAn evolutionary computationbased feature selection method for intrusion detectionrdquo Se-curity and Communication Networks vol 2018 Article ID2492956 10 pages 2018

[22] Z Shen Y Zhang and W Chen ldquoA bayesian classificationintrusion detection method based on the fusion of PCA andLDArdquo Security and Communication Networks vol 2019Article ID 6346708 11 pages 2019

[23] P Sun P Liu Q Li et al ldquoDL-IDS Extracting features usingCNN-LSTM hybrid network for intrusion detection systemrdquoSecurity and Communication Networks vol 2020 Article ID8890306 11 pages 2020

[24] G Farahani ldquoFeature selection based on cross-correlation forthe intrusion detection systemrdquo Security amp CommunicationNetworks vol 2020 Article ID 8875404 17 pages 2020

[25] F G Mohammadi M H Amini and H R Arabnia ldquoAp-plications of nature-inspired algorithms for dimension Re-duction enabling efficient data analyticsrdquo in Advances inIntelligent Systems and Computing Optimization Learningand Control for Interdependent Complex Networks pp 67ndash84Springer Cham Switzerland 2020

[26] J Kennedy and R Eberhart ldquoParticle swarm optimizationrdquo inProceedings of the ICNNrsquo95-International Conference onNeural Networks no 4 pp 1942ndash1948 IEEE Perth WAAustralia December 1995

[27] M Dorigo M Birattari and T Stutzle ldquoAnt colony opti-mizationrdquo IEEE Computational Intelligence Magazine vol 1no 4 pp 28ndash39 2006

[28] R Rajabioun ldquoCuckoo optimization algorithmrdquo Applied SoftComputing vol 11 no 8 pp 5508ndash5518 2011

[29] M Neshat G Sepidnam M Sargolzaei and A N ToosildquoArtificial fish swarm algorithm a survey of the state-of-the-art hybridization combinatorial and indicative applicationsrdquoArtificial Intelligence Review vol 42 no 4 pp 965ndash997 2014

[30] D Karaboga ldquoAn idea based on honey bee swarm for nu-merical optimizationrdquo Technical Report-tr06 Erciyes uni-versity Engineering Faculty Computer EngineeringDepartment Kayseri Turkey 2005

[31] W-T Pan ldquoA new Fruit Fly Optimization Algorithm takingthe financial distress model as an examplerdquo Knowledge-BasedSystems vol 26 pp 69ndash74 2012

[32] R Zhao and W Tang ldquoMonkey algorithm for global nu-merical optimizationrdquo Journal of Uncertain Systems vol 2no 3 pp 165ndash176 2008

[33] X S Yang and X He ldquoBat algorithm literature review andapplicationsrdquo International Journal of Bio-Inspired Compu-tation vol 5 no 3 pp 141ndash149 2013

[34] S Mirjalili A H Gandomi S Z Mirjalili S Saremi H Farisand S M Mirjalili ldquoSalp Swarm Algorithm a bio-inspiredoptimizer for engineering design problemsrdquo Advances inEngineering Software vol 114 pp 163ndash191 2017

[35] K. Ahmed, A. E. Hassanien, and S. Bhattacharyya, “A novel chaotic chicken swarm optimization algorithm for feature selection,” in Proceedings of the 2017 Third International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), pp. 259–264, IEEE, Kolkata, India, November 2017.

[36] S. Tabakhi, P. Moradi, F. Akhlaghian et al., “An unsupervised feature selection algorithm based on ant colony optimization,” Engineering Applications of Artificial Intelligence, vol. 32, pp. 112–123, 2014.

[37] S. Arora and P. Anand, “Binary butterfly optimization approaches for feature selection,” Expert Systems with Applications, vol. 116, pp. 147–160, 2019.

[38] C. Yan, J. Ma, H. Luo, and A. Patel, “Hybrid binary coral reefs optimization algorithm with simulated annealing for feature selection in high-dimensional biomedical datasets,” Chemometrics and Intelligent Laboratory Systems, vol. 184, pp. 102–111, 2019.

[39] G. I. Sayed, A. Tharwat, and A. E. Hassanien, “Chaotic dragonfly algorithm: an improved metaheuristic algorithm for feature selection,” Applied Intelligence, vol. 49, no. 1, pp. 188–205, 2019.

[40] Z. Zhang, P. Wei, Y. Li et al., “Feature selection algorithm based on improved particle swarm joint taboo search,” Journal of Communication, vol. 39, no. 12, pp. 60–68, 2018.

[41] A. H. Gandomi and A. H. Alavi, “Krill herd: a new bio-inspired optimization algorithm,” Communications in Nonlinear Science and Numerical Simulation, vol. 17, no. 12, pp. 4831–4845, 2012.

[42] Q. Tan and Z. Huang, “Krill herd with nearest neighbor lasso operator,” Computer Engineering and Applications, vol. 55, no. 9, pp. 124–129, 2019.

[43] Q. Wang, C. Ding, and X. Wang, “A hybrid data clustering algorithm based on improved krill herd algorithm and KHM clustering,” Control and Decision, vol. 35, no. 10, pp. 2449–2458, 2018.

[44] Q. Li and B. Liu, “Clustering using an improved krill herd algorithm,” Algorithms, vol. 10, no. 2, p. 56, 2017.

[45] G.-G. Wang, A. H. Gandomi, and A. H. Alavi, “Stud krill herd algorithm,” Neurocomputing, vol. 128, pp. 363–370, 2014.

[46] J. Li, Y. Tang, C. Hua, and X. Guan, “An improved krill herd algorithm: krill herd with linear decreasing step,” Applied Mathematics and Computation, vol. 234, pp. 356–367, 2014.

[47] H. B. Nguyen, B. Xue, P. Andreae et al., “Particle swarm optimisation with genetic operators for feature selection,” in Proceedings of the 2017 IEEE Congress on Evolutionary Computation (CEC), pp. 286–293, IEEE, San Sebastian, Spain, June 2017.

[48] M. H. Aghdam and P. Kabiri, “Feature selection for intrusion detection system using ant colony optimization,” International Journal of Network Security, vol. 18, no. 3, pp. 420–432, 2016.


Page 21: LNNLS-KH:AFeatureSelectionMethodforNetwork IntrusionDetection · ResearchArticle LNNLS-KH:AFeatureSelectionMethodforNetwork IntrusionDetection XinLi ,1PengYi ,1WeiWei,2YimingJiang,1andLeTian

[8] L Wang P Jia T Huang S Duan J Yan and L Wang ldquoAnovel optimization technique to improve gas recognition byelectronic noses based on the enhanced krill herd algorithmrdquoSensors vol 16 no 8 p 1275 2016

[9] R Jensi and GW Jiji ldquoAn improved krill herd algorithmwithglobal exploration capability for solving numerical functionoptimization problems and its application to data clusteringrdquoApplied Soft Computing vol 46 pp 230ndash245 2016

[10] H Pulluri R Naresh and V Sharma ldquoApplication of studkrill herd algorithm for solution of optimal power flowproblemsrdquo International Transactions on Electrical EnergySystems vol 27 no 6 Article ID e2316 2017

[11] D Rodrigues L A M Pereira J P Papa et al ldquoA binary krillherd approach for feature selectionrdquo in Proceedings of the 201422nd International Conference on Pattern Recognitionpp 1407ndash1412 IEEE Stockholm Sweden August 2014

[12] A Mukherjee and V Mukherjee ldquoChaotic krill herd algo-rithm for optimal reactive power dispatch considering FACTSdevicesrdquo Applied Soft Computing vol 44 pp 163ndash190 2016

[13] S Sun H Qi F Zhao L Ruan and B Li ldquoInverse geometrydesign of two-dimensional complex radiative enclosures usingkrill herd optimization algorithmrdquo Applied ermal Engi-neering vol 98 pp 1104ndash1115 2016

[14] S Sultana and P K Roy ldquoOppositional krill herd algorithmfor optimal location of capacitor with reconfiguration inradial distribution systemrdquo International Journal of ElectricalPower amp Energy Systems vol 74 pp 78ndash90 2016

[15] L Brezocnik I Fister and V Podgorelec ldquoSwarm intelligencealgorithms for feature selection a reviewrdquo Applied Sciencesvol 8 no 9 2018

[16] D Smith Q Guan and S Fu ldquoAn anomaly detectionframework for autonomic management of compute cloudsystemsrdquo in Proceedings of the 2010 IEEE 34th AnnualComputer Software and Applications Conference Workshopspp 376ndash381 IEEE Seoul South Korea July 2010

[17] Y Zhao Y Zhang W Tong et al ldquoAn improved featureselection algorithm based on MAHALANOBIS distance fornetwork intrusion detectionrdquo in Proceedings of 2013 Inter-national Conference on Sensor Network Security Technologyand Privacy Communication System pp 69ndash73 IEEE Nan-gang China May 2013

[18] P Singh and A Tiwari ldquoAn efficient approach for intrusiondetection in reduced features of KDD99 using ID3 andclassification with KNNGArdquo in Proceedings of the 2015 SecondInternational Conference on Advances in Computing andCommunication Engineering pp 445ndash452 IEEE DehradunIndia May 2015

[19] M A Ambusaidi X He P Nanda and Z Tan ldquoBuilding anintrusion detection system using a filter-based feature se-lection algorithmrdquo IEEE Transactions on Computers vol 65no 10 pp 2986ndash2998 2016

[20] N Shone T N Ngoc V D Phai and Q Shi ldquoA deep learningapproach to network intrusion detectionrdquo IEEE Transactionson Emerging Topics in Computational Intelligence vol 2 no 1pp 41ndash50 2018

[21] Y Xue W Jia X Zhao et al ldquoAn evolutionary computationbased feature selection method for intrusion detectionrdquo Se-curity and Communication Networks vol 2018 Article ID2492956 10 pages 2018

[22] Z Shen Y Zhang and W Chen ldquoA bayesian classificationintrusion detection method based on the fusion of PCA andLDArdquo Security and Communication Networks vol 2019Article ID 6346708 11 pages 2019

[23] P Sun P Liu Q Li et al ldquoDL-IDS Extracting features usingCNN-LSTM hybrid network for intrusion detection systemrdquoSecurity and Communication Networks vol 2020 Article ID8890306 11 pages 2020
