CN107832830A

CN107832830A - Intrusion detection system feature selection method based on an improved grey wolf optimization algorithm

Info

Publication number: CN107832830A
Application number: CN201711148008.2A
Authority: CN
Inventors: 钮焱; 龚红波; 李军; 童坤; 刘宇强; 李星
Original assignee: Hubei University of Technology
Current assignee: Hubei University of Technology
Priority date: 2017-11-17
Filing date: 2017-11-17
Publication date: 2018-03-23

Abstract

The invention discloses a feature selection method for intrusion detection systems based on an improved grey wolf optimization algorithm. A sample set is first constructed, and an optimal feature subset is then obtained. The invention adopts a recent intelligent algorithm, the grey wolf algorithm, and improves it for feature selection. Compared with the prior art, the steps of the algorithm are greatly simplified, and the detection accuracy and system efficiency of existing intrusion detection systems are improved. The algorithm carries its own control parameter, so its search strategy changes automatically as the search deepens, which reduces the probability that the algorithm becomes trapped in a locally optimal solution.

Description

Intrusion detection system feature selection method based on an improved grey wolf optimization algorithm

Technical Field

The invention belongs to the technical field of information security and relates to a feature selection method for intrusion detection systems, in particular to a feature selection method for intrusion detection systems based on an improved grey wolf optimization algorithm.

Background

Today the internet is used everywhere, from the industry chains of companies to the assembly lines of factories. With the arrival of the worldwide e-commerce era, the protection of information security is becoming more and more important. Almost all companies, organizations and manufacturing plants must be connected through the internet, which is an open environment: any type of attack can spread over it, and large sites at home and abroad can be attacked by phishing websites and trojan viruses. Constructing an efficient Intrusion Detection System (IDS) is therefore not trivial. An IDS collects and detects intrusion attacks in computers and networks. All IDSs use one of two intrusion detection methods, anomaly-based or signature-based. In an anomaly-based intrusion detection system, the computer collects data on normal operating behavior, constructs a model characterizing that behavior, and then flags any behavior that deviates from the model. The anomaly-based intrusion detection system is characterized by a low false alarm rate and the ability to detect unknown attack behaviors. In a signature-based intrusion detection system, all operational records, including normal and abnormal behavior, are collected, and the system then detects an intrusion attack by comparing the current operational record with the collected operational records.

The purpose of feature selection is to extract important features from the data and remove redundant ones. Feature selection can reduce data dimensionality, improve prediction performance, reduce overfitting, and improve the understanding of the relationship between features and feature values. In the real world, the data to be classified often contains a large number of redundant features: some features can be replaced by others, and the replaced features can be removed in the classification process. Furthermore, the relationships between features have a great influence on the classification output; if these relationships can be found, a large amount of information hidden in the data can be extracted.

Feature selection algorithms can be classified into three categories: filter, embedded and wrapper methods.

The filter method first selects features from the data set and then trains the classifier, so that feature selection is decoupled from the classifier. The key is to find a measure of feature importance, such as the Pearson correlation coefficient or mutual information. The features are then sorted by this measure, and the top-ranked features are selected as the features used for classification. The disadvantage of this method is that it ignores the interdependence between features. On the one hand, if some of the top-ranked features are strongly correlated with one another, selecting all of them introduces redundancy. On the other hand, a lower-ranked feature whose measure is small and which has little predictive value on its own may predict well when combined with other features, so valuable features are lost.

The embedded method integrates feature selection into the training of the learner, so that both are completed in a single process, as in lasso and ridge regression.

The core idea of the wrapper method is that, given a training model and a method for evaluating prediction performance, the prediction performance of different feature subsets in the feature space is evaluated, and the subset with the best prediction performance is chosen as the final training subset. Because it takes the interdependence between features into account, the wrapper method selects feature subsets with better prediction performance than the filter method. Its disadvantage is the larger amount of computation, because the number of feature subsets grows exponentially with the number of features. Different algorithms have been proposed for searching the entire feature space efficiently.
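To make the exponential growth concrete, the following minimal sketch counts the non-empty feature subsets an exhaustive wrapper search would have to evaluate. The function name `subset_count` is a hypothetical helper introduced for illustration; the value 41 is the number of features of the KDD99 data set used later in this document.

```python
def subset_count(n_features: int) -> int:
    """Number of non-empty subsets of a set of n_features features."""
    return 2 ** n_features - 1


# 41 features (KDD99) already give more than two trillion candidate subsets.
print(subset_count(41))
```

Each candidate subset requires training and evaluating a classifier, which is why heuristic search is used instead of enumeration.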

The genetic algorithm was the first intelligent algorithm used to solve this problem. Its idea is derived from the reproductive and genetic processes of natural biological populations: a solution of the optimization problem is regarded as a gene, and genetic exchange, including crossover and mutation, is carried out across the whole population. The natural environment can be regarded as the objective function, and genes that are well adapted to it are retained and passed on to the next generation. Genetic algorithms can solve complex nonlinear optimization problems. However, the genetic algorithm has many disadvantages, such as low computational efficiency and a tendency to fall into locally optimal solutions.

The grey wolf optimization algorithm (GWO) is an evolutionary algorithm that has recently come into use. It simulates the way a wolf pack hunts its prey in order to find the location of the prey, which corresponds to the optimal solution of the optimization problem. The present invention employs this algorithm and improves it for feature selection.

Disclosure of Invention

In order to solve the above technical problem, the invention provides an intrusion detection system feature selection method based on an improved grey wolf optimization algorithm.

The technical scheme adopted by the invention is as follows: an intrusion detection system feature selection method based on an improved grey wolf optimization algorithm, characterized by comprising the following steps:

Step 1: input the captured intrusion data and form a training set T from the labelled samples;

Step 2: initialize the current iteration count, the index of the grey wolf individuals, the population size of the wolf pack and the position vector of each grey wolf individual;

Step 3: calculate the coding vector of each wolf from its position vector, and calculate the fitness value of each wolf from its coding vector;

Step 4: set the maximum number of iterations to max_iter, and select the three best wolves by fitness value as alpha, beta and delta;

Step 5: update the coding vector of each wolf and recalculate the fitness value of each wolf;

This step specifically comprises the following substeps:

Step 5.1: initialize the global parameters needed by the algorithm;

Step 5.2: determine and calculate the vectors to be crossed over;

Step 5.3: cross the vectors over using the specified crossover strategy;

Step 5.4: calculate the fitness value of the new vector obtained after the crossover, using the fitness calculation of step 3;

Step 6: update the coding vectors of alpha, beta and delta;

Step 7: judge whether t is greater than max_iter; if so, output the coding vector of alpha; otherwise, set t = t + 1 and return to step 5.

The invention adopts a recent intelligent algorithm, the grey wolf algorithm, and improves it for feature selection. Compared with the prior art, the steps of the algorithm are greatly simplified, and the detection accuracy and system efficiency of existing intrusion detection systems are improved. The algorithm carries its own control parameter, so its search strategy changes automatically as the search deepens, which reduces the probability that the algorithm falls into a locally optimal solution.

Drawings

FIG. 1 is a flow chart of an embodiment of the present invention;

FIG. 2 shows the change of the detection error rate and the average number of features after feature selection by BGWO as the number of algorithm runs gradually increases in the embodiment of the present invention;

FIG. 3 shows the change of the detection error rate and the average number of features after feature selection by EBGWO as the number of algorithm runs gradually increases in the embodiment of the present invention.

Detailed Description

To help persons of ordinary skill in the art understand and implement the present invention, it is described in further detail below with reference to the drawings and examples. The examples described herein serve only to illustrate and explain the present invention and are not to be construed as limiting it.

Referring to FIG. 1, the intrusion detection system feature selection method based on an improved grey wolf optimization algorithm provided by the invention comprises the following steps:

Step 1: input the captured intrusion data and form a training set from the labelled samples;

The captured intrusion data is input, and each piece of intrusion data is represented by a feature vector. Intrusion data with a known label (namely whether the behavior is an intrusion and, if so, what kind of intrusion it is) is represented by a vector containing a plurality of features, and each dimension of the vector represents one feature of the data. Network intrusion data has many features, such as connection duration, protocol type and number of bytes transmitted. The total number of data features is not fixed and depends on the packet capture tool used by the intrusion detection system.

Step 2: initialize the current iteration count of the algorithm, the index of the grey wolf individuals, the population size of the wolf pack and the position vector of each grey wolf individual;

Initialize the iteration count t = 1 and the initial wolf index i = 1, and set the pack size to K. The position of each wolf is a candidate solution of the intrusion detection problem. For the wolves i = 1, 2, …, K, randomly initialize the position vector of wolf i in (0, max). The vector dimension is N, where max is the maximum value of a wolf's position and can be chosen according to the practical situation; in this embodiment K = 12 and max = 1;
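The initialization can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `init_positions`, the `seed` parameter and the choice N = 41 (the KDD99 feature count mentioned in the experiments) are assumptions made for the example.

```python
import random


def init_positions(k, n, max_pos=1.0, seed=None):
    """Return k position vectors of dimension n with entries in the open interval (0, max_pos)."""
    rng = random.Random(seed)
    positions = []
    for _ in range(k):
        wolf = []
        while len(wolf) < n:
            value = rng.uniform(0.0, max_pos)
            # Reject the endpoints so that values stay strictly inside (0, max_pos).
            if 0.0 < value < max_pos:
                wolf.append(value)
        positions.append(wolf)
    return positions


# K = 12 and max = 1 follow the embodiment; N = 41 is an assumed feature count.
pack = init_positions(k=12, n=41, max_pos=1.0, seed=0)
```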

Step 3: calculate the coding vector of each wolf from its position vector, and calculate the fitness value of each wolf from its coding vector;

A mapping function f is needed that maps values in the interval (0, max) into the discrete set {0, 1} and guarantees that there is a number δ in (0, max) such that f(temp1) < f(temp2) for all temp1 ∈ (0, δ) and temp2 ∈ [δ, max).

The coding vector of each wolf is computed from its position vector: the position of the grey wolf is converted from a continuous value to a binary coded value in {0, 1} by the mapping function;

where position(i, j) denotes the value of the j-th dimension of the position vector of wolf i, and the corresponding quantity indexed by (i, j) denotes the value of the j-th dimension of the coding vector of wolf i;
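One mapping that satisfies the stated condition is a simple step function. The sketch below is only one possible choice: the threshold `delta = 0.5 * max_pos` and the names `encode` and `max_pos` are assumptions for illustration, since the text only requires that some δ in (0, max) exists.

```python
def encode(position, max_pos=1.0, delta=None):
    """Map each continuous position value in (0, max_pos) to a bit in {0, 1}.

    Values below delta map to 0 and values at or above delta map to 1,
    so f(temp1) < f(temp2) for temp1 in (0, delta) and temp2 in [delta, max_pos).
    """
    if delta is None:
        delta = 0.5 * max_pos  # assumed threshold, not specified by the source
    return [1 if value >= delta else 0 for value in position]


print(encode([0.1, 0.5, 0.9]))  # [0, 1, 1]
```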

The fitness value of each wolf is calculated from its binary coding vector according to the following steps:

In the coding vector of a wolf, 1 indicates that the feature is selected and 0 indicates that it is not. Let T_solution denote the training set T restricted to the features selected by the coding vector. A classifier is used to classify T_solution, the average accuracy (or classification error rate) is calculated, and this value is used as the fitness value P_i corresponding to the coding vector of wolf i. The classifier can be chosen according to the actual situation; the classifier selected in the invention is a KNN classifier, and the parameter K of the classifier is set to 5.
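A fitness evaluation of this kind can be sketched with a small pure-Python KNN. This is an illustrative sketch under stated assumptions: the function name `knn_error_rate`, the use of squared Euclidean distance, the evaluation on a separate held-out split, and the convention that an empty feature subset gets the worst error rate 1.0 are all choices made for the example and are not specified by the source.

```python
from collections import Counter


def knn_error_rate(train_x, train_y, test_x, test_y, mask, k=5):
    """Classification error rate of a KNN classifier restricted to the features where mask == 1."""
    cols = [j for j, bit in enumerate(mask) if bit == 1]
    if not cols:
        return 1.0  # assumed convention: an empty subset is the worst candidate

    def project(row):
        # Keep only the selected features (this builds the rows of T_solution).
        return [row[j] for j in cols]

    train_p = [project(row) for row in train_x]
    errors = 0
    for row, label in zip(test_x, test_y):
        query = project(row)
        # Squared Euclidean distance to every training sample, nearest first.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(query, p)), idx)
            for idx, p in enumerate(train_p)
        )
        votes = Counter(train_y[idx] for _, idx in dists[:k])
        predicted = votes.most_common(1)[0][0]
        errors += predicted != label
    return errors / len(test_x)


# Toy data: feature 0 separates the classes, feature 1 is noise.
train_x = [[0, 5], [1, 9], [2, 1], [10, 5], [11, 1], [12, 9]]
train_y = [0, 0, 0, 1, 1, 1]
test_x = [[1.5, 7], [10.5, 3]]
test_y = [0, 1]
error = knn_error_rate(train_x, train_y, test_x, test_y, mask=[1, 0], k=3)
```

With the error rate as fitness, a lower value means a better wolf.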

Step 4: set the maximum number of iterations to max_iter, and select the three best wolves by fitness value as alpha, beta and delta;

The maximum number of iterations max_iter is set, and the coding vector of the wolf with the best fitness value P_i is then used as the coding vector of α. Whether a fitness value is good is relative and depends on the meaning of the chosen fitness function. The invention selects the classification error rate as the fitness value of a wolf: the lower the classification error rate, the better the classification effect and the better the selected wolf.

Therefore, the initialization of α, β and δ in the present invention is divided into the following three substeps:

Substep 4.1: select the wolf j with the lowest fitness value P_i, and initialize the coding vector of α to the coding vector of wolf j;

Substep 4.2: after excluding wolf j, select from the remaining wolves the wolf n with the lowest fitness value P_i, and initialize the coding vector of β to the coding vector of wolf n;

Substep 4.3: after excluding wolf n, select from the remaining wolves the wolf m with the lowest fitness value P_i, and initialize the coding vector of δ to the coding vector of wolf m.
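The three substeps amount to taking the three wolves with the lowest error rate. A minimal sketch, in which the function name `select_leaders` and the representation of a leader as a `(coding vector, fitness)` pair are assumptions for illustration:

```python
def select_leaders(codes, fitness):
    """Return (alpha, beta, delta) as (coding vector, fitness) pairs, lowest error rate first."""
    order = sorted(range(len(fitness)), key=lambda i: fitness[i])
    # Copy the vectors so that later updates of the pack do not alter the leaders.
    return tuple((list(codes[i]), fitness[i]) for i in order[:3])


codes = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
fitness = [0.30, 0.10, 0.20, 0.40]
alpha, beta, delta = select_leaders(codes, fitness)
```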

Step 5: update the coding vector of each wolf according to the crossover strategy, and recalculate the fitness value of each wolf.

This step is the core of the invention and its point of innovation. Here the invention remedies the shortcomings of the existing grey wolf optimization algorithm, so that the data features are further optimized and the detection accuracy and system efficiency of the intrusion detection system are further improved. In a specific implementation, a formal parameter m = 1 is first initialized, and then the following substeps are executed:

Step 5.1: initialize d = 1 and calculate the parameter a and the two coefficient vectors derived from it;

where t represents the current iteration count of the algorithm, and the two random vectors used take values in the range [0, 1] and have dimension N;
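The formulas for this step are not reproduced in the text above. The sketch below therefore assumes the forms commonly used in the standard grey wolf optimizer: a decreases linearly from 2 to 0 over the iterations, A = 2·a·r1 − a and C = 2·r2, with r1 and r2 random vectors in [0, 1]. The names `gwo_coefficients`, `A`, `C`, `r1` and `r2` are assumptions and may differ from the patent's notation.

```python
import random


def gwo_coefficients(t, max_iter, n, rng=random):
    """Return (a, A, C) for iteration t, assuming the standard GWO definitions."""
    a = 2.0 - t * (2.0 / max_iter)         # control parameter, shrinks towards 0
    r1 = [rng.random() for _ in range(n)]  # random vector in [0, 1), dimension n
    r2 = [rng.random() for _ in range(n)]
    A = [2.0 * a * r - a for r in r1]      # each entry lies in [-a, a]
    C = [2.0 * r for r in r2]              # each entry lies in [0, 2)
    return a, A, C


a, A, C = gwo_coefficients(t=1, max_iter=6, n=5, rng=random.Random(1))
```

Because a shrinks as t grows, the entries of A shrink as well, which is the mechanism by which the search moves from exploration to exploitation in the standard algorithm.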

Step 5.2: determine the value of n required to compute the three arguments x₁, x₂, x₃ of the crossover strategy function f(). The value of the d-th dimension of x₁, x₂, x₃ is obtained from two quantities: the value of the leader wolf n in the d-th dimension, and the binary distance step of the selected wolf from the leader wolf n in the d-th dimension;

The value of n is determined by m. For example, when the d-th dimension of x₁ is calculated, m = 1 and therefore n = α, so the binary distance step with respect to α is calculated first and the d-th dimension of x₁ afterwards. In general, the value of n is obtained first, followed by the value of the leader wolf n in the d-th dimension and the corresponding binary distance step. Since the value of the leader wolf n in the d-th dimension was already determined in the previous step, only the binary distance step remains to be calculated.

Step 5.3: compute the vector D of the leader wolf n;

The computation uses the position vectors of α, β and δ in the t-th iteration and the position vector of wolf i in iteration t;

Step 5.4: compute the continuous distance step and the binary distance step;

The continuous distance step is the continuous-valued step of the selected wolf from the leader wolf n; A^d denotes the value of the d-th dimension of the vector A; the D term denotes the value of the d-th dimension of the D vector of the leader wolf n obtained in step 5.3; and rand is a random number in the interval [0, 1];
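The formulas of steps 5.2 to 5.4 are not reproduced in the text above. The sketch below assumes the forms used in the published binary grey wolf optimizer (BGWO): D = |C·x_leader − x|, a sigmoid-shaped continuous step, a binary step obtained by comparing the continuous step with rand, and a leader-relative bit that is 1 when the leader bit plus the binary step is at least 1. All function names are hypothetical, and the patent's own formulas may differ.

```python
import math
import random


def distance(c_d, x_leader_d, x_d):
    """d-th dimension of the D vector, assuming D = |C * x_leader - x|."""
    return abs(c_d * x_leader_d - x_d)


def continuous_step(a_d, d_d):
    """Continuous distance step, assuming cstep = 1 / (1 + exp(-10 * (A^d * D^d - 0.5)))."""
    # For binary vectors |A^d * D^d| stays small, so exp() does not overflow here.
    return 1.0 / (1.0 + math.exp(-10.0 * (a_d * d_d - 0.5)))


def binary_step(cstep, rng=random):
    """Binary distance step: 1 if cstep >= rand, else 0, with rand drawn from [0, 1)."""
    return 1 if cstep >= rng.random() else 0


def leader_bit(x_leader_d, bstep):
    """d-th dimension of x_m: 1 if the leader bit plus the binary step is at least 1."""
    return 1 if (x_leader_d + bstep) >= 1 else 0


rng = random.Random(0)
d_d = distance(c_d=1.2, x_leader_d=1, x_d=0)
x1_d = leader_bit(1, binary_step(continuous_step(0.8, d_d), rng))
```

Under these assumed formulas, a dimension selected by the leader is always kept, while an unselected dimension is switched on only when the binary step is 1.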

Step 5.5: substitute the binary distance step into substep 5.2 to obtain the value of the d-th dimension of x_m;

Step 5.6: judge whether d is equal to N; if not, add 1 to d and repeat steps 5.2-5.5 until the values of all dimensions of x_m have been calculated;

Step 5.7: judge whether m is equal to 3; if not, add 1 to m and repeat steps 5.1-5.6 until x₁, x₂ and x₃ have all been calculated;

Step 5.8: update each dimension of the coding vector of the wolf using the crossover update strategy;

where the d-th dimension of the updated coding vector is determined from the values of x₁, x₂ and x₃ in the d-th dimension, and rand is a random number in the interval [0, 1];
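The update formula is not reproduced in the text above. A common form of this crossover, used in the published BGWO and assumed here, takes each dimension from x₁, x₂ or x₃ with equal probability according to rand. The function name `crossover` and the thresholds 1/3 and 2/3 are assumptions for illustration.

```python
import random


def crossover(x1, x2, x3, rng=random):
    """Build a new coding vector by taking each bit from x1, x2 or x3 with probability 1/3 each."""
    child = []
    for b1, b2, b3 in zip(x1, x2, x3):
        r = rng.random()
        if r < 1.0 / 3.0:
            child.append(b1)
        elif r < 2.0 / 3.0:
            child.append(b2)
        else:
            child.append(b3)
    return child


new_code = crossover([1, 0, 1, 0], [1, 1, 0, 0], [1, 0, 0, 0], rng=random.Random(0))
```

Dimensions on which x₁, x₂ and x₃ agree are always preserved; only the dimensions on which they disagree are decided at random.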

Step 5.9: use the updated coding vector to calculate the new fitness value P_i of wolf i.

From the original training set T, the features selected by the coding vector are retained and the unselected features are deleted, giving a new training set T_solution. A classifier is then used to classify T_solution, the average accuracy (or classification error rate) is calculated, and this value is used as the fitness value P_i corresponding to the coding vector of wolf i.

Then judge whether the binary coding vectors and fitness values of all individuals have been updated in this iteration, namely whether i is equal to the population size K. If not, add 1 to i and return to step 5.1; if so, proceed to the next step.

Step 6: update the coding vectors of α, β and δ.

The coding vectors of α, β and δ are updated as follows. The updated fitness values of the individual wolves are sorted, and the fitness values P_α', P_β' and P_δ' of the three wolves ranked first are selected and compared, one by one, with the fitness values P_α, P_β and P_δ of the original α, β and δ. If a new fitness value is better than the corresponding original fitness value, the corresponding coding vector is replaced by the coding vector belonging to the new fitness value; otherwise no update is performed.
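The pairwise comparison can be sketched as follows, assuming the classification error rate is the fitness (lower is better). The function name `update_leaders` and the representation of a leader as a `(coding vector, fitness)` pair are assumptions for illustration.

```python
def update_leaders(leaders, codes, fitness):
    """Compare the three best updated wolves with the current alpha, beta, delta, position by position."""
    best_three = sorted(range(len(fitness)), key=lambda i: fitness[i])[:3]
    updated = []
    for (old_code, old_fit), i in zip(leaders, best_three):
        if fitness[i] < old_fit:
            # The new wolf is better: it takes over this leader slot.
            updated.append((list(codes[i]), fitness[i]))
        else:
            updated.append((old_code, old_fit))
    return updated


leaders = [([1, 0], 0.10), ([0, 1], 0.20), ([1, 1], 0.30)]
codes = [[0, 0], [1, 1], [0, 1]]
fitness = [0.25, 0.05, 0.15]
leaders = update_leaders(leaders, codes, fitness)
```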

Step 7: judge whether t is greater than max_iter; if so, output the coding vector of α; otherwise, set t = t + 1 and return to step 5.

If t is greater than max_iter, the iteration terminates and the optimal feature subset is determined from the coding vector of α. This vector is a binary string representing the optimal feature subset, in which 1 indicates that a feature is selected and 0 that it is not; the features corresponding to the dimensions with value 1 are extracted and output.

If t is not greater than max_iter, the algorithm has not finished; set t = t + 1 and return to step 5.
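Reading the selected features off the final coding vector is a simple lookup. In the sketch below the function name `selected_features` and the feature names are hypothetical; the names merely echo the example features mentioned in step 1.

```python
def selected_features(alpha_code, feature_names):
    """Return the names of the features whose bit in the alpha coding vector is 1."""
    return [name for bit, name in zip(alpha_code, feature_names) if bit == 1]


names = ["duration", "protocol_type", "bytes_transferred", "flag"]  # hypothetical names
subset = selected_features([1, 0, 1, 0], names)
print(subset)  # ['duration', 'bytes_transferred']
```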

The effects of the present invention can be illustrated by the following comparative experiments.

1. Simulation conditions:

The data set used in the experiment is the KDD99 network intrusion data set, which is divided equally into two parts: one part is used as the training set and the other as the test set. In the experiment, all methods are implemented in MATLAB.

2. Contents and results of the experiments

The KDDCUP1999 data set is used as the data set, and the KNN algorithm in MATLAB is used as the classifier for detection. The detection error rate and the average number of selected features are then obtained for different numbers of runs of two algorithms, the existing intrusion detection feature selection algorithm BGWO and the improved grey wolf optimization algorithm EBGWO of the invention, and are used to evaluate the algorithms.

The KDDCUP1999 data set used contains 25912 sample records. Each record consists of 41 features and a label, and each label represents a type of attack. The first 12956 records are selected as the training set and the last 12956 records as the test set. The number of individuals in the wolf pack is set to 12, the number of iterations max_iter is set to 6, KNN is selected as the classifier, and its parameter K is 5.

The experimental results are shown in FIG. 2 and FIG. 3, which show the change of the detection error rate and the average number of features after feature selection using BGWO and EBGWO, respectively, as the number of algorithm runs gradually increases. As can be seen from FIG. 2, the more often the two algorithms are run, the smaller the detection error rate of the system. When the algorithm has been run 200 times (the 10th test), the detection error rate after feature selection using BGWO is 2.89%, while the detection error rate after feature selection using EBGWO is reduced to 2.73%; the detection effect of EBGWO is thus improved by 5.5% relative to BGWO. It can also be seen that, once the algorithm has been run more than 100 times (the 5th test), the detection effect using EBGWO is better than that using BGWO, and the advantage grows. Likewise, as can be seen from FIG. 3, the more runs are performed, the smaller the average number of features selected by both algorithms becomes. In the 10th test, the average number of features selected using BGWO was 27.2, while the average number of features selected using EBGWO was reduced to 26.2.
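The 5.5% figure is the relative reduction of the error rate, which can be checked directly from the two reported values. The function name `relative_reduction` is a hypothetical helper for this check.

```python
def relative_reduction(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return (baseline - improved) / baseline * 100.0


# Error rates reported for the 10th test: BGWO 2.89 %, EBGWO 2.73 %.
print(round(relative_reduction(2.89, 2.73), 1))  # 5.5
```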

In conclusion, compared with BGWO, EBGWO improves the binarization strategy and replaces the position update of the traditional binary grey wolf algorithm with the crossover strategy of the genetic algorithm. The experiments show that, compared with BGWO, EBGWO further reduces the detection error rate of the system and the number of selected features, and thereby further improves the operating efficiency of the IDS.

Compared with the prior art, the steps of the method are greatly simplified, and the detection accuracy and system efficiency of existing intrusion detection systems are improved. The method carries its own control parameter, so the search strategy of the algorithm changes automatically as the search deepens, which reduces the probability that the method falls into a locally optimal solution.

It should be understood that the parts of the specification not set forth in detail belong to the prior art.

It should be understood that the above description of the preferred embodiments is given for clarity and is not to be regarded as limiting, and that various changes, substitutions and alterations can be made without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. An intrusion detection system feature selection method based on an improved grey wolf optimization algorithm, characterized by comprising the following steps:

step 5: updating the coding vector of each wolf and recalculating the fitness value of each wolf;

step 6: updating the coding vectors of alpha, beta and delta;

2. The intrusion detection system feature selection method based on an improved grey wolf optimization algorithm according to claim 1, wherein: in step 1, the collected intrusion data with known labels is represented by a vector containing a plurality of features, and each dimension of the vector represents one feature of the data; the content of the label comprises whether the behavior is an intrusion and what kind of intrusion it is; the features include connection duration, protocol type and number of bytes transferred.

3. The intrusion detection system feature selection method based on an improved grey wolf optimization algorithm according to claim 1, wherein: in step 2, the iteration count t = 1 and the initial wolf index i = 1 are initialized, the pack size is set to K, the position of each wolf is a candidate solution of the intrusion detection problem, and for the wolves i = 1, 2, …, K the position vector of wolf i is randomly initialized in (0, max); the vector dimension is N, where max represents the maximum value of the position of a wolf.

4. The intrusion detection system feature selection method based on an improved grey wolf optimization algorithm according to claim 3, wherein: in step 3, the coding vector of each wolf is computed from its position vector, the position of the grey wolf being converted from a continuous value to a binary coded value in {0, 1};

the fitness value of the wolf is calculated from its binary coding vector, the specific implementation process being as follows: in the coding vector of the wolf, 1 indicates that the feature is selected and 0 indicates that it is not; T_solution denotes the training set T restricted to the features selected by the coding vector; a classifier is used to classify T_solution, the average accuracy or the classification error rate is calculated, and this value is taken as the fitness value P_i corresponding to the coding vector of wolf i.

5. The intrusion detection system feature selection method based on an improved grey wolf optimization algorithm according to claim 3, wherein: in step 5, the coding vector of each wolf is updated according to a crossover strategy, and the fitness value of each wolf is recalculated; a formal parameter m = 1 is first initialized, and then the following substeps are performed:

step 5.1: initializing d = 1 and calculating the parameter a and the two coefficient vectors derived from it;

step 5.2: determining the value of n required to compute the three arguments x₁, x₂, x₃ of the crossover strategy function f(), the value of the d-th dimension of x₁, x₂, x₃ being obtained from the value of the leader wolf n in the d-th dimension and the binary distance step of the selected wolf from the leader wolf n in the d-th dimension;

step 5.3: calculating the vector D of the leader wolf n, using the position vectors of alpha, beta and delta in the t-th iteration and the position vector of wolf i in iteration t;

step 5.4: calculating the continuous distance step and the binary distance step of the selected wolf from the leader wolf n, where A^d represents the value of the d-th dimension of the vector A, the D term represents the value of the d-th dimension of the D vector of the leader wolf n obtained in step 5.3, and rand is a random number in the interval [0, 1];

step 5.5: substituting the binary distance step into substep 5.2 to obtain the value of the d-th dimension of x_m;

step 5.6: repeating steps 5.2 to 5.5 until the values of all dimensions are calculated, namely judging whether d is equal to N and, if not, adding 1 to d and repeating steps 5.2 to 5.5;

where the values of x₁, x₂ and x₃ in the d-th dimension are used, and rand represents a random number in the interval [0, 1];

step 5.9: using the updated coding vector to calculate the new fitness value P_i of wolf i.

6. The intrusion detection system feature selection method based on an improved grey wolf optimization algorithm according to claim 3, wherein the coding vectors of α, β and δ are updated in step 6 as follows: the updated fitness values of the individual wolves are sorted, and the fitness values P_α', P_β' and P_δ' of the three wolves ranked first are selected and compared, one by one, with the fitness values P_α, P_β and P_δ of the original alpha, beta and delta; if a new fitness value is better than the corresponding original fitness value, the corresponding coding vector is replaced by the coding vector belonging to the new fitness value; otherwise no update is performed.