CN115659323A

CN115659323A - Intrusion detection method based on information entropy theory and convolution neural network

Info

Publication number: CN115659323A
Application number: CN202211094816.6A
Authority: CN
Inventors: 缪祥华; 李响
Original assignee: Kunming University of Science and Technology
Current assignee: Kunming University of Science and Technology
Priority date: 2022-09-05
Filing date: 2022-09-05
Publication date: 2023-01-31

Abstract

An intrusion detection method based on the combination of information entropy theory and a convolutional neural network. First, character-type data are converted into numerical data, and the data are standardized and normalized. The data set is then fed into a convolutional neural network for dimensionality reduction and classification; the uncertainty of each result is calculated using information entropy, and a delayed relearning classification decision is made for part of the data, with a random forest chosen as the delayed-decision method. When intrusion behavior occurs, the trained model can distinguish normal data from attack data. The method exploits the strong feature extraction and classification learning of the convolutional neural network and uses information entropy theory to evaluate the classified data, with the evaluation result serving as the basis for the secondary learning classification decision.

Description

Intrusion detection method based on combination of information entropy theory and convolutional neural network

Technical Field

The invention relates to an intrusion detection method based on the combination of an information entropy theory and a convolutional neural network, and belongs to the technical field of intrusion detection in networks.

Background

In recent years, as network technology has kept changing and network scale has kept expanding, network security incidents have become frequent at home and abroad, and network security has received increasing attention. Research on intrusion detection systems has become an important part of current network security work. Its purpose is to make up for the shortcomings of traditional firewalls in defending against attacks, to strengthen monitoring of the network and of system operation, and to discover attack attempts, attack behaviors, and attack results as far as possible, so as to ensure the confidentiality, integrity, and availability of network system resources. Intrusion detection systems have been developed for decades and have achieved some success, but much room for research remains.

Conventional intrusion detection systems still have several problems: they cannot automatically learn and distinguish attack behaviors, they respond too slowly when detecting on large-scale data, and their signature databases must be updated in real time. As a result, they detect data flows with low accuracy and produce a relatively high false alarm rate. When facing unknown abnormal traffic, they offer no good way to classify it and lack the ability to learn autonomously.

Disclosure of Invention

To remedy the defects of the prior art, the invention provides an intrusion detection method based on the combination of information entropy theory and a convolutional neural network. Detection accuracy is improved through the convolutional neural network's strong feature extraction and classification performance, and information entropy theory is introduced as a secondary judgment to reduce misjudgments and lower the false alarm rate, thereby solving the problems described in the background.

In order to achieve the purpose, the invention provides the following technical scheme: an intrusion detection method based on the combination of an information entropy theory and a convolutional neural network comprises the following specific steps:

the first step: acquiring a training data set and a test data set from the data set and performing data preprocessing on them;

the second step: transmitting the training set after data preprocessing into a network model for training to obtain a trained network model;

the third step: obtaining the probability distribution and confidence of the data through the convolutional neural network model, and calculating an information entropy value through information entropy theory for evaluation;

the fourth step: directly outputting a classification result by the convolutional neural network for the data with the entropy value smaller than the threshold value, and learning and reclassifying the data larger than the threshold value by a random forest model;

the fifth step: and transmitting the test set after data preprocessing into the trained network model to obtain a classification result.

As a preferred scheme, the first step of performing data preprocessing on the intrusion detection data set specifically comprises the following steps:

(1) Because some characteristics in the intrusion detection data set are character type data, the character type data needs to be converted into numerical type data;

(2) In order to reduce the influence of widely dispersed feature values and of differences in numerical magnitude on the model, the converted numerical data need to be standardized;

(3) In order to reduce the amount of computation in the model, the standardized data are normalized so that they are mapped into the [0,1] interval.

As a preferred scheme, the second step trains the network model with the processed network data, and the specific process of obtaining the trained network model is as follows:

(1) Calculating the output value of each neuron in a forward direction;

(2) Determining an optimization objective function;

(3) Carrying out forward and backward propagation to update network weight parameters according to gradient guidance of a convolutional neural network loss function;

(4) And repeating the three steps until the network error is less than a given value, and determining an optimal convolutional neural network model.
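The four steps above can be sketched as a loop. A single linear neuron with a mean-squared-error objective stands in for the convolutional neural network here, purely to keep the example short; the function name `train`, the learning rate, the tolerance, and the toy data are all assumptions and do not come from the source.

```python
def train(xs, ys, lr=0.1, tol=1e-6, max_epochs=10_000):
    """Gradient-descent loop that stops once the error falls below `tol`."""
    w, b = 0.0, 0.0
    n = len(xs)
    loss = float("inf")
    for _ in range(max_epochs):
        # (1) forward pass: compute the output of the neuron for every sample
        preds = [w * x + b for x in xs]
        # (2) objective function: mean squared error (an assumed choice)
        loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / n
        # (4) stop when the network error is smaller than the given value
        if loss < tol:
            break
        # (3) backward pass: gradients of the loss, then parameter update
        grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
        grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, loss

# Toy data generated from y = 2x + 1.
w, b, loss = train([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(round(w, 2), round(b, 2))
```

A `max_epochs` cap is included so the loop still terminates if the tolerance is never reached.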

As a preferred scheme, the specific process of calculating the information entropy value by the information entropy theory to evaluate the probability distribution and the confidence of the data obtained in the third step is as follows:

(1) Carrying out entropy calculation on probability distribution output by the convolutional neural network through an information entropy theory;

(2) Comparing the entropy value with a threshold value to determine which data need secondary learning classification.

As a preferred scheme, in the fourth step the convolutional neural network directly outputs the classification result for data whose entropy value is smaller than the threshold, and data whose entropy value is larger than the threshold are learned and reclassified by the random forest model; the specific process is as follows:

(1) The convolutional neural network outputs the category of data whose entropy value is smaller than the threshold, and the random forest further learns and classifies data whose entropy value is higher than the threshold;

(2) Counting the uncertain sample rate in secondary classification;

(3) And combining and outputting the primary classification result and the secondary learning classification result.

As a preferred scheme, the specific process of the fifth step, transmitting the preprocessed test set into the trained network model to obtain a classification result, is as follows:

(1) Tuning the parameters of the intrusion detection model on the training set, and adjusting the threshold until the uncertain sample rate of secondary classification is lowest to obtain the optimal intrusion detection model;

(2) And inputting a test set to test the intrusion detection model to obtain a final classification result.

Compared with the prior art, the invention provides an intrusion detection method based on the information entropy theory and the convolution neural network, which has the following beneficial effects:

The invention uses a convolutional neural network, an effective deep learning algorithm with strong large-scale feature extraction and classification capability, and applies it to intrusion detection to improve detection accuracy. The confidence values produced by the convolutional neural network's classification are combined with information entropy theory to further evaluate and divide the data. Data with large entropy and high uncertainty are relearned, which reduces the false alarm rate of intrusion detection and thereby improves overall intrusion detection performance.

Drawings

FIG. 1 is a flow chart of a method of the present invention;

FIG. 2 is a diagram of a classification method for a CNN model;

FIG. 3 is a diagram of a classification method for an RF model.

Detailed Description

In order to more clearly understand the technical features, objects, and effects of the present invention, embodiments of the present invention will now be described with reference to the accompanying drawings.

As shown in FIG. 1, an intrusion detection method based on the combination of information entropy theory and a convolutional neural network specifically includes the following steps:

Step one: carrying out data preprocessing on the intrusion detection data set;

the data preprocessing is divided into 3 steps:

1) Conversion of character-type data into numerical-type data

Because some features in the intrusion detection dataset are character-type data, and the convolutional neural network can only process numerical data, the character-type data in the dataset is first converted into numerical data.
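A minimal sketch of this conversion, assuming a simple integer encoding in first-seen order. The source does not say which encoding is used, and the function name `encode_column` and the sample protocol values are hypothetical.

```python
def encode_column(values):
    """Map each distinct string in a column to an integer code (first-seen order)."""
    codes = {}
    encoded = []
    for v in values:
        if v not in codes:
            codes[v] = len(codes)  # assign the next unused integer
        encoded.append(codes[v])
    return encoded, codes

# Hypothetical character-type feature taken from four connection records.
protocol = ["tcp", "udp", "tcp", "icmp"]
encoded, mapping = encode_column(protocol)
print(encoded)  # [0, 1, 0, 2]
print(mapping)  # {'tcp': 0, 'udp': 1, 'icmp': 2}
```

The mapping built on the training set would need to be reused unchanged on the test set so that the same string always receives the same code.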

2) Data standardization

To reduce the influence of widely dispersed feature values and of differences in numerical magnitude on the convolutional neural network, the data converted into numerical values are standardized. The mean and the mean absolute error of each feature are calculated as follows:

$$\bar{x}_k = \frac{1}{N}\sum_{i=1}^{N} x_{ik}, \qquad S_k = \frac{1}{N}\sum_{i=1}^{N} \left| x_{ik} - \bar{x}_k \right|$$

where $\bar{x}_k$ is the mean of the k-th attribute, $S_k$ is the mean absolute error of the k-th attribute, $x_{ik}$ is the k-th attribute of the i-th record, and $N$ is the number of records. Each data record is then standardized as follows:

$$Z_{ik} = \frac{x_{ik} - \bar{x}_k}{S_k}$$

where $Z_{ik}$ is the k-th attribute value of the i-th data record after standardization.
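The standardization step can be sketched for one feature column as follows, assuming the usual form in which each value has the attribute mean subtracted and is then divided by the mean absolute error. The function name `standardize_column`, the zero-spread guard, and the sample values are assumptions.

```python
def standardize_column(column):
    """Standardize one feature: (x - mean) / mean absolute error."""
    n = len(column)
    mean = sum(column) / n
    mad = sum(abs(x - mean) for x in column) / n  # mean absolute error S_k
    if mad == 0:
        # A constant column carries no spread; return zeros instead of dividing by 0.
        return [0.0 for _ in column]
    return [(x - mean) / mad for x in column]

# Hypothetical values of one numeric feature across four records.
durations = [2.0, 4.0, 6.0, 8.0]
standardized = standardize_column(durations)
print(standardized)  # [-1.5, -0.5, 0.5, 1.5]
```

Here the mean is 5.0 and the mean absolute error is 2.0, so each value is shifted by 5 and divided by 2.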

3) Data normalization

The standardized data are then normalized so that every value lies in the [0,1] interval, which reduces the amount of computation in the model. The normalization formula is as follows:
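Mapping values into [0,1] is commonly done with min-max scaling. The sketch below assumes that form, which the text does not confirm; the function name `min_max_scale` and the sample values are hypothetical.

```python
def min_max_scale(column):
    """Map a feature column linearly onto [0, 1] (assumed min-max form)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant column: every value maps to 0 to avoid division by zero.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

# Hypothetical standardized values of one feature.
z = [-1.5, -0.5, 0.5, 1.5]
scaled = min_max_scale(z)
print(scaled)  # smallest value maps to 0.0, largest to 1.0
```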

Step two: as shown in FIG. 2, the preprocessed data are transmitted into the convolutional neural network model for training to obtain a trained network model;

training and tuning the convolutional neural network comprises the following three stages:

1) The preprocessed data are convolved. The kernel produces the mapping for each position it passes over, and once the complete feature matrix has been traversed, the activation function is applied to form a new feature matrix according to the network parameter settings. When all feature samples have completed the feature mapping, the results are stacked to form the feature map of the convolution. The convolution calculation formula is as follows:

$$x^{n} = \varphi\left(\sum_{i \in M} x_i^{n-1} * k_i^{n} + b^{n}\right)$$

where $M$ is the set of input samples, $x_i^{n-1}$ is the feature matrix output from the previous layer, "$*$" is the convolution operation, $n$ is the current layer number, $k_i^{n}$ is a convolution kernel, $b^{n}$ is the offset, and $\varphi$ is the activation function. The ReLU function is selected as the activation function. Compared with other activation functions, the non-saturating characteristic of ReLU can effectively avoid the vanishing gradient phenomenon.
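A minimal sketch of one convolution followed by ReLU on a single-channel feature matrix. It assumes stride 1, no padding, and the sliding-window product without kernel flipping that CNN libraries usually call convolution; the function names and the toy matrices are hypothetical.

```python
def relu(x):
    """ReLU activation: pass positive values, clamp the rest to zero."""
    return x if x > 0 else 0.0

def conv2d_relu(feature, kernel, bias):
    """Slide `kernel` over `feature`, add `bias`, and apply ReLU at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(feature) - kh + 1
    out_w = len(feature[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            s = bias
            for i in range(kh):
                for j in range(kw):
                    s += feature[r + i][c + j] * kernel[i][j]
            row.append(relu(s))
        out.append(row)
    return out

feature = [[1, 2, 0],
           [0, 1, 3],
           [2, 1, 0]]
kernel = [[1, 0],
          [0, -1]]
fmap = conv2d_relu(feature, kernel, bias=0.5)
print(fmap)  # [[0.5, 0.0], [0.0, 1.5]]
```

The two zero entries are positions where the pre-activation sum was negative, which ReLU clamps to zero.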

2) The features output by the convolutional layer are down-sampled by the pooling layer, which summarizes the features within each region by a statistic. This not only reduces the dimensionality but also retains the most useful information. The method uses max pooling.

For the pooled feature matrix, the pooling method has the following calculation formula:
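Max pooling keeps the largest value in each window of the feature matrix. The sketch below assumes a 2×2 window with stride 2, which the source does not specify; the function name `max_pool2d` and the input matrix are hypothetical.

```python
def max_pool2d(feature, size=2, stride=2):
    """Return the maximum of each size x size window, moving by `stride`."""
    out = []
    for r in range(0, len(feature) - size + 1, stride):
        row = []
        for c in range(0, len(feature[0]) - size + 1, stride):
            window = [feature[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(max(window))
        out.append(row)
    return out

fmap = [[1, 3, 2, 0],
        [4, 2, 1, 5],
        [0, 1, 7, 2],
        [3, 2, 1, 1]]
pooled = max_pool2d(fmap)
print(pooled)  # [[4, 5], [3, 7]]
```

The 4×4 input shrinks to 2×2, which is the dimensionality reduction the pooling layer provides.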

3) Forward propagation through the convolutional neural network yields the output value for the original features, and the network parameters are then updated through back propagation. The loss function of the convolutional neural network is defined in terms of $y$, the actual value, and $\hat{y}$, the predicted value. The parameter values are updated once per training pass. While the network trains on the data samples, the weight $\omega$ and the bias $b$ are updated iteratively so as to reduce the network loss:

$$\omega \leftarrow \omega - \alpha \frac{\partial L}{\partial \omega}, \qquad b \leftarrow b - \alpha \frac{\partial L}{\partial b}$$

where $\alpha$ is the learning rate, and $\frac{\partial L}{\partial \omega}$ and $\frac{\partial L}{\partial b}$ are the partial derivatives of the loss function $L$ with respect to the weight $\omega$ and the bias $b$.
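As a worked instance of one update step, assuming the plain gradient-descent form (each parameter minus the learning rate times its partial derivative) and writing L for the loss. All numbers are illustrative and are not taken from the source.

```latex
\begin{aligned}
\alpha &= 0.1,\quad \omega = 0.50,\quad \frac{\partial L}{\partial \omega} = 0.20,\quad b = 0.10,\quad \frac{\partial L}{\partial b} = -0.30 \\
\omega &\leftarrow 0.50 - 0.1 \times 0.20 = 0.48 \\
b &\leftarrow 0.10 - 0.1 \times (-0.30) = 0.13
\end{aligned}
```

A positive partial derivative lowers the parameter and a negative one raises it, so both updates move in the direction that reduces the loss.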

Step three: the CNN output layer obtains the probability distribution values $(P_1, P_2)$ of the two classes of data through a softmax activation function. Entropy is calculated on the resulting probability distribution of the data traffic, and the data are further screened and evaluated using information entropy theory.
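A minimal sketch of how a softmax output layer turns two raw scores into the two class probabilities. The function name `softmax` and the input scores are hypothetical; subtracting the maximum score is a standard numerical-stability step and is not mentioned in the source.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for the classes "normal" and "attack".
p1, p2 = softmax([2.0, 0.0])
print(round(p1, 4), round(p2, 4))  # 0.8808 0.1192
```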

The data evaluation method is as follows:

From the probability distribution values $(P_1, P_2)$ of the two classes of data, the information entropy is calculated as follows:

$$H = -\sum_{i=1}^{n} P_i \log_2 P_i$$

where $H$ is the entropy, which also expresses the uncertainty of the data; $n = 2$ because the data are classified into two classes; and $P_i$ is the predicted probability of class $i$, which also represents the confidence of the data in that class. The CNN output layer obtains the probability distribution values $(P_1, P_2)$ of the two classes through a softmax activation function. The probability distributions of all samples are then used to calculate the uncertainty quickly, and the result reflects how uncertain each sample is. For $n = 2$ the information entropy formula is:

$$H = -\left(P_1 \log_2 P_1 + P_2 \log_2 P_2\right)$$
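The binary entropy formula can be checked numerically with a short sketch. The function name `entropy` is hypothetical, and treating a zero probability as contributing zero entropy is the usual convention rather than something the source states.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; terms with zero probability contribute nothing."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

h_even = entropy([0.5, 0.5])    # most uncertain two-class case
h_skewed = entropy([0.9, 0.1])  # confident prediction
print(h_even)              # 1.0
print(round(h_skewed, 3))  # 0.469
```

An even split gives the maximum of 1 bit, while a 0.9 / 0.1 split gives about 0.469 bits, which matches the statement that a more even distribution means higher uncertainty.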

The more even the probability distribution $(P_1, P_2)$ of a classified sample, the larger the calculated $H$ and the higher the uncertainty of the classification; conversely, the larger the difference between the two probability values, the lower the uncertainty. By thresholding $H$, samples can be divided into those with high uncertainty and those with low uncertainty, and the uncertainty of each data sample serves as the basis for the delayed classification decision.

Step four: for data whose entropy value is smaller than the threshold theta, the convolutional neural network directly outputs the classification result; data whose entropy value is larger than the threshold theta are learned and reclassified by the random forest model shown in FIG. 3, and the uncertain sample rate during secondary classification is counted;
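The routing in step four can be sketched as follows. The callable `rf_predict` is a hypothetical stand-in for a trained random forest, the fraction of deferred samples is only one possible reading of the "uncertain sample rate", and sending samples whose entropy equals theta to the random forest is an assumption, since the source leaves that case open.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a probability distribution."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

def classify_with_deferral(samples, cnn_probs, theta, rf_predict):
    """Keep the CNN label when entropy < theta, otherwise ask the second model."""
    labels = []
    deferred = 0
    for sample, probs in zip(samples, cnn_probs):
        if entropy(probs) < theta:
            labels.append(probs.index(max(probs)))  # CNN decides directly
        else:
            deferred += 1
            labels.append(rf_predict(sample))       # delayed decision
    return labels, deferred / len(samples)

# Hypothetical records with the CNN's softmax output for each one.
samples = ["rec-a", "rec-b", "rec-c"]
cnn_probs = [[0.95, 0.05], [0.55, 0.45], [0.10, 0.90]]
labels, deferred_rate = classify_with_deferral(
    samples, cnn_probs, theta=0.5, rf_predict=lambda s: 1)  # stub second model
print(labels)  # [0, 1, 1]
```

Only the second record, whose two probabilities are nearly equal, exceeds the threshold and is passed to the second model; the combined list is the joint output of both classifiers.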

Step five: the threshold is adjusted until the uncertain sample rate of secondary classification is lowest, which gives the optimal intrusion detection model; test data are then input into the trained optimal model to obtain the classification result;
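Threshold adjustment can be sketched as a search over candidate values of theta. The source does not define how the uncertain sample rate is measured, so the rates below are hypothetical placeholders, and `tune_threshold` is an assumed helper name.

```python
def tune_threshold(candidates, uncertain_rate):
    """Return the candidate theta with the lowest uncertain sample rate."""
    return min(candidates, key=uncertain_rate)

# Hypothetical rates measured on the training set for three thresholds.
measured = {0.3: 0.12, 0.5: 0.08, 0.7: 0.10}
best_theta = tune_threshold(measured, measured.get)
print(best_theta)  # 0.5
```

In practice `uncertain_rate` would rerun the secondary classification for each candidate threshold and return the measured rate.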

In summary, the data first pass through the preprocessing stage and are then fed into the convolutional neural network for classification. The information entropy is calculated from the confidence that the CNN produces for each sample. When the entropy value is smaller than the threshold theta, the CNN directly outputs the classification result; when it is larger than the threshold theta, the random forest model classifies the sample further. The classification results of the two models are combined as the joint output. This can improve the overall performance of the model and reduce the risk of misclassification.

While the present invention has been described in detail with reference to the embodiments, the present invention is not limited to the embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art.

Claims

1. An intrusion detection method based on the combination of an information entropy theory and a convolutional neural network is characterized in that: the method comprises the following specific steps:

the second step is that: transmitting the training set after data preprocessing into a network model for training to obtain a trained network model;

2. The intrusion detection method based on the information entropy theory combined with the convolutional neural network as claimed in claim 1, wherein: the first step is to perform data preprocessing on the intrusion detection data set, and the specific process is as follows:

(2) In order to reduce the influence of widely dispersed feature values and of differences in numerical magnitude on the model, the converted numerical data need to be standardized;

(3) In order to reduce the amount of computation in the model, the standardized data are normalized so that they are mapped into the [0,1] interval.

3. The intrusion detection method based on the information entropy theory combined with the convolutional neural network as claimed in claim 1, wherein: in the second step, the network model is trained by the processed network data, and the specific process of obtaining the trained network model is as follows:

(1) Calculating the output value of each neuron in a forward direction;

(2) Determining an optimization objective function;

(4) And repeating the three steps until the network error is smaller than a given value, and determining the optimal convolutional neural network model.

4. The intrusion detection method based on the information entropy theory combined with the convolutional neural network as claimed in claim 1, wherein: the third step calculates the information entropy value through the information entropy theory to evaluate the probability distribution and the confidence coefficient of the obtained data, and the specific process is as follows:

(2) And dividing data needing secondary learning classification by comparing the entropy value with a threshold value.

5. The intrusion detection method based on the information entropy theory combined with the convolutional neural network as claimed in claim 1, wherein: in the fourth step, the convolutional neural network directly outputs the classification result for data whose entropy value is smaller than the threshold, and data whose entropy value is larger than the threshold are learned and reclassified by the random forest model; the specific process is as follows:

(2) Counting the uncertain sample rate in secondary classification;

6. The intrusion detection method based on the information entropy theory combined with the convolutional neural network as claimed in claim 1, wherein: the specific process of transmitting the preprocessed test set into the trained network model to obtain a classification result is as follows:

(1) Tuning the parameters of the intrusion detection model on the training set, and adjusting the threshold until the uncertain sample rate of secondary classification is lowest to obtain the optimal intrusion detection model;