CN114372530A

CN114372530A - Abnormal flow detection method and system based on deep self-coding convolutional network

Info

Publication number: CN114372530A
Application number: CN202210024041.9A
Authority: CN
Inventors: 李小勇; 邓瑞文; 苑洁; 高雅丽; 李灵慧
Original assignee: Beijing University of Posts and Telecommunications
Current assignee: Beijing University of Posts and Telecommunications
Priority date: 2022-01-11
Filing date: 2022-01-11
Publication date: 2022-04-19

Abstract

The invention discloses an abnormal traffic detection method and system based on a deep self-coding (autoencoder) convolutional network. The method comprises the following steps: S1, training a plurality of deep autoencoders with preprocessed data; S2, inputting the preprocessed data into the plurality of autoencoders to obtain a plurality of different dimension-reduced feature vectors; S3, splicing the dimension-reduced feature vectors with the preprocessed data and training a convolutional neural network on the spliced features to obtain an optimal classification network model; and S4, splicing preprocessed unknown data with the output of the autoencoder module, inputting the result into the trained network model, and classifying the output of the convolutional neural network with a softmax activation function to obtain a prediction result. The detection system comprises a data preprocessing module, a deep autoencoder module, a convolutional neural network module and a system management module. The invention addresses the reliance of traditional abnormal traffic detection schemes on expert systems, and the low accuracy and poor generalization capability of traditional traffic detection models.

Description

Abnormal flow detection method and system based on deep self-coding convolutional network

Technical Field

The invention relates to the technical field of communication networks, in particular to an abnormal flow detection method and system based on a deep self-coding convolutional network.

Background

Communication networks are evolving rapidly, and so are network attacks. New vulnerabilities appear daily and are quickly exploited in zero-day attacks. Signature-based detection cannot detect previously unknown attacks, whereas abnormal traffic detection techniques detect deviations from normal communication patterns and are therefore important tools for improving the security of today's communication networks. Although there is a large body of technical and scientific literature on network traffic anomaly detection, the important step of feature selection is often not fully described or addressed in it. Anomaly detection in a communication network provides a basis for discovering new attacks, misconfigurations and network failures.

As networks have developed, malicious behaviors such as network intrusion, service attacks, information theft and virus propagation have become increasingly common, driven by various interests, and their number, variety and destructiveness continue to grow. Resource-consumption attacks, represented by Distributed Denial of Service (DDoS), are a main threat to Internet security because they are simple to carry out and difficult to defend against. Traditional intrusion detection is mainly deployed at the user end: it protects the user from attack but cannot eliminate malicious traffic at network backbone nodes. Existing core nodes generally cannot identify or control malicious traffic and can only forward it, which consumes most of their resources, and the growth of network resources cannot keep up with this abuse and consumption. Deploying traffic detection and control mechanisms on core nodes is therefore of great significance for guaranteeing network performance, and such mechanisms can further serve as an information platform for locating attack sources.

The rapid development of the Internet brings not only convenience to people's lives but also new challenges. The network security situation remains severe: traditional security threats have not diminished, and new threats keep appearing, so network anomalies come in many types, which hinders detection and defense. Network scale and traffic volume keep growing, which severely tests the storage and detection of network traffic. Dedicated commercial detection appliances based on feature matching can detect known attacks online at high speed, but they cannot also detect unknown anomalies and have no traffic storage capability. Existing machine-learning-based network abnormal traffic detection techniques can detect unknown anomalies, but suffer from low detection efficiency.

Network abnormal traffic detection mainly detects behavior that deviates from normal data. The information source is first modeled and analyzed to create a reference profile of the normal system or network. If a new data sample deviates from or exceeds the current normal profile, the anomaly detection system issues an early warning or reacts. Because the detection system builds the normal profile of the system or network from normal conditions, it is difficult for an external attacker to stay within the normal profile during an attack, so the attack is easily detected; similarly, the anomaly detection system can also detect attacks from within. In addition, anomaly detection systems can detect previously unknown attacks. The main disadvantages are as follows. First, a normal profile model can be established only after an initial system has been trained. Second, adjusting and maintaining the profile model is complex and time-consuming, and an incorrect profile model may lead to a higher false alarm rate. Finally, some carefully constructed attacks can exploit the training of the anomaly detection system so that it gradually accepts malicious behavior as normal, resulting in missed detections.

The existing main abnormal flow detection methods comprise:

1. Abnormal traffic detection based on statistics: statistics-based detection assumes that the current network environment is in a quasi-steady state. The algorithm collects and organizes a large amount of normal traffic data from a preceding period, sets an initial threshold through statistical analysis or data transformation of the historical traffic data, then computes statistics of the current network traffic and compares them with the threshold to judge whether the network is abnormal. If a statistic of the current traffic exceeds its corresponding threshold, the traffic is considered abnormal. Commonly used network traffic features include the number of bytes, number of packets, number of flows, audit record data, number of audit events, interval events, the five-tuple (protocol, source IP address, source port, destination IP address and destination port), resource consumption events and the like.

2. Anomaly detection based on data mining: this approach uses data mining techniques to analyze and extract characteristic information of various flows from massive network traffic. Automatic or semi-automatic modeling algorithms mine characteristic parameters, such as correlations, patterns or trends, that reflect current network conditions, revealing latent features of the data at a higher level of abstraction and thereby determining whether the network is behaving abnormally. Commonly used methods include rule induction, fuzzy logic, genetic algorithms and the like.

3. Anomaly detection based on machine learning: identifying abnormal traffic is essentially a classification problem, which usually presupposes learning. Machine-learning-based abnormal traffic detection abstracts prior experience into a model and is characterized by model building. Different network traffic features, such as the number of bytes, average packet size, number of packets, maximum packet length, flow duration and inter-arrival time, can serve as modeling objects. Bayesian networks, clustering, support vector machines, Markov models and the like have been widely used.

For example, a traffic monitoring method based on a clustering algorithm, whose framework is shown in fig. 1, is divided into four modules: a labeled-data auxiliary module, a hybrid clustering module, an online classification module and a system updating module. The labeled-data auxiliary module is mainly responsible for information gain and feature weighting of network traffic features. The hybrid clustering module consists of several clustering algorithms, which are trained on the same input and whose outputs are combined according to different weights. The system updating module is responsible for adding new data, generally traffic information with new protocols or traffic information selected by an expert system. The online classification module uses an NCC classifier and outputs the final traffic detection result of the clustering-based method.

However, existing common traffic monitoring techniques generally rely on human experts to select and label traffic features, and new expert label information must be added continuously while the model is running. This heavy dependence on an expert system is costly and cannot effectively cope with novel traffic attacks. In addition, the generalization capability of such abnormal traffic detection is insufficient: high accuracy cannot be maintained across different scenarios, and zero-day vulnerability attacks and other novel attacks cannot be handled.

Disclosure of Invention

The invention aims to provide an abnormal flow detection method and system based on a deep self-coding convolutional network, in order to solve two problems: first, that traditional abnormal traffic detection schemes depend on an expert system; and second, that traditional traffic detection models have low accuracy and poor generalization capability.

In order to achieve the above purpose, the invention provides the following technical scheme:

The invention first provides an abnormal flow detection method based on a deep self-coding convolutional network, which comprises the following steps:

S1, training a plurality of deep autoencoders with the preprocessed data;

S2, inputting the preprocessed data into the plurality of autoencoders to obtain a plurality of different dimension-reduced feature vectors;

S3, splicing the obtained dimension-reduced feature vectors with the preprocessed data, and training a convolutional neural network on the spliced features to obtain an optimal classification network model;

and S4, splicing the preprocessed unknown data with the output of the autoencoder module, inputting the result into the trained network model, and classifying the output of the convolutional neural network with a softmax activation function to obtain a prediction result.

Further, the preprocessing in step S1 includes: acquiring the traffic protocol, type, duration and number of bytes, and converting them into a one-dimensional floating-point vector through one-hot encoding.

Further, each deep autoencoder in step S1 is composed of a plurality of fully connected layers.

Further, the deep autoencoder in step S1 uses the mean square loss function as its loss function, the Adam algorithm as its optimizer and tanh as its activation function, and the output of the layer with the fewest dimensions is the required dimension-reduced feature information.

Further, the mean square loss function is calculated as follows:

$$\Upsilon(x, y) = L = \{L_1, \dots, L_N\}^{T}, \quad L_n = (x_n - y_n)^2$$

where L is the vector of computed losses, L_n is its n-th component, T denotes the transpose of the vector, N is the batch size, and x_n and y_n are the elements at corresponding positions of the input and output feature vectors.

Further, the tanh function is calculated as follows:

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$

where tanh, the hyperbolic tangent, is one of the hyperbolic functions.
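As a rough illustration of the two functions named above, the following sketch computes the per-element squared errors and the hyperbolic tangent from its standard exponential definition. The function names and the sample values are hypothetical, and averaging the per-element errors into a single number is an assumption, since the text only defines the per-element terms.

```python
import math

def mse_elements(x, y):
    """Per-element squared errors L_n = (x_n - y_n)^2."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [(a - b) ** 2 for a, b in zip(x, y)]

def mse_loss(x, y):
    """Mean of the per-element errors (the mean reduction is an assumption)."""
    errors = mse_elements(x, y)
    return sum(errors) / len(errors)

def tanh(z):
    """Hyperbolic tangent written out from its exponential definition."""
    return (math.exp(z) - math.exp(-z)) / (math.exp(z) + math.exp(-z))

# Hypothetical autoencoder output and the input it tries to reproduce.
reconstruction = [0.5, -0.2, 0.9]
original = [0.4, 0.0, 1.0]

print(mse_elements(reconstruction, original))
print(mse_loss(reconstruction, original))
print(tanh(0.5), math.tanh(0.5))  # hand-written and library versions agree
```

The closer the reconstruction is to the original, the smaller the loss, which is the training goal stated for the autoencoders.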

Further, step S3 adopts two-dimensional convolution layers and two-dimensional batch normalization, with ReLU as the activation function.

Further, step S3 maps the outputs of the different channels into a result vector using a global pooling technique, and normalizes it with softmax.

Further, step S3 uses the cross-entropy loss function as the loss function and the Adam algorithm as the optimizer.

The invention also provides an abnormal flow detection system based on the deep self-coding convolutional network, which comprises:

a data preprocessing module, used for preliminary processing of the input data, including acquiring the traffic protocol, type, duration and number of bytes, and converting them into a one-dimensional floating-point vector through one-hot encoding;

a deep autoencoder module, used for passing the preprocessed traffic information as input to a plurality of deep autoencoders, which are trained with labeled traffic data; the trained autoencoders extract high-level abstract features from the input traffic information, and these features are spliced with the preprocessed data to serve as the input of the convolutional network;

a convolutional neural network module, used for receiving the input of the preprocessing module and the deep autoencoder module, and training a network model on the output of the trained deep autoencoder module combined with the labeled traffic data to obtain an optimal classification network model;

and a system management module, used for managing the configuration of the system.

Compared with the prior art, the invention has the beneficial effects that:

According to the abnormal flow detection method based on the deep self-coding convolutional network, traffic information is input into deep autoencoders and converted into dimension-reduced abstract features; this better extracts high-level features of the traffic and mines the deeper information it contains. A CNN built from a convolution-normalization-activation sequence serves as the basis of the model, and its accuracy is higher than that of traditional detection models. For classification, a two-dimensional CNN model based on global pooling is used, which maps classification information to specific output channels and thereby strengthens the interpretability of the model.

The invention solves the problems that current abnormal traffic detection systems extract features at a low level and that their models learn abnormal traffic features insufficiently, and it also provides implementation details of the feature extraction and detection algorithms. In tests, the accuracy on the kdd99 data set reaches 94%, known and unknown attacks can be effectively guarded against, and accuracy and generality are clearly improved over the prior art.

Drawings

In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them.

Fig. 1 is a flow monitoring framework based on a clustering algorithm.

Fig. 2 is an abnormal traffic detection system architecture based on a deep self-coding convolutional network according to an embodiment of the present invention.

Fig. 3 is a schematic diagram of a depth self-encoder according to an embodiment of the present invention.

Fig. 4 is a self-encoder model structure according to an embodiment of the present invention.

Fig. 5 is a convolution model structure according to an embodiment of the present invention.

Detailed Description

In order that those skilled in the art may better understand the technical solutions of the present invention, the invention is described in further detail below with reference to the accompanying drawings.

The abnormal flow detection system based on the deep self-coding convolutional network, as shown in fig. 2, includes:

1. Pre-processing module

The module carries out primary processing on input data, including acquiring a flow protocol, a type, duration, byte number and the like, and converts the data into a one-dimensional floating-point number vector through one-hot coding.

One-hot encoding, also known as one-bit-effective encoding, uses an N-bit status register to encode N states: each state has its own independent register bit, and only one bit is active at any time. For a feature with n possible values, one-hot encoding turns the feature into n binary features, exactly one of which is active.
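The encoding described above can be sketched as follows. This is a minimal illustration only: the category vocabularies, the record field names and the choice to append the numeric fields unchanged are all assumptions, since the text does not specify them.

```python
def one_hot(value, categories):
    """Return a float list with a single 1.0 at the position of `value`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == c else 0.0 for c in categories]

# Hypothetical vocabularies; a real system would derive them from its data.
PROTOCOLS = ["tcp", "udp", "icmp"]
TYPES = ["http", "ftp", "smtp", "other"]

def preprocess(record):
    """Flatten one traffic record into a one-dimensional float vector."""
    return (
        one_hot(record["protocol"], PROTOCOLS)
        + one_hot(record["type"], TYPES)
        # Numeric fields are appended as floats (an assumption).
        + [float(record["duration"]), float(record["bytes"])]
    )

record = {"protocol": "udp", "type": "http", "duration": 2, "bytes": 1500}
vector = preprocess(record)
print(vector)
```

A feature with n possible values contributes n positions to the vector, so the three hypothetical protocols and four hypothetical types above yield seven binary positions followed by the two numeric values.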

2. Depth self-encoder module

As shown in fig. 3, the module uses the preprocessed traffic information as input to transmit to a plurality of deep self-encoders, the deep self-encoders need to be trained by using labeled traffic data, and the trained deep self-encoders can extract high-level abstract features from the input traffic information and splice the extracted high-level abstract features with the preprocessed data to serve as input of a convolutional network.

Each deep autoencoder is a deep neural network composed of a plurality of fully connected layers. Its training goal is to make the final output as close to the input as possible. It uses the mean square loss function (MSE loss) as the loss function, the Adam algorithm as the optimizer and tanh as the activation function, and the output of the layer with the fewest dimensions is the required dimension-reduced feature information.

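The mean square loss (MSE loss) function is calculated as follows:

$$\Upsilon(x, y) = L = \{L_1, \dots, L_N\}^{T}, \quad L_n = (x_n - y_n)^2$$

where N is the batch size, and x_n and y_n are the elements at corresponding positions of the input and output feature vectors.

The tanh function is calculated as follows:

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$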

The specific structure of the autoencoder model used in the invention is shown in fig. 4, where FC denotes a fully connected layer and the number is the number of nodes in that layer. The output of the intermediate fully connected layer with the fewest nodes is spliced with the original traffic features and then input into the convolutional neural network module.
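The splicing of bottleneck outputs with the original features can be sketched as below. This is a forward pass only, with random untrained weights; the layer sizes, seeds and function names are hypothetical and do not reproduce fig. 4, and training with Adam is omitted.

```python
import math
import random

def dense(inputs, weights, biases):
    """One fully connected layer followed by the tanh activation."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def build_autoencoder(sizes, seed):
    """Random placeholder weights; a trained model would load learned values."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers

def bottleneck(layers, x):
    """Run the network and return the output of its narrowest layer."""
    activations = []
    for weights, biases in layers:
        x = dense(x, weights, biases)
        activations.append(x)
    return min(activations, key=len)

# Hypothetical preprocessed traffic vector with 8 features.
features = [0.0, 1.0, 0.0, 0.2, 0.7, 0.1, 0.0, 0.5]

# Two autoencoders with different (assumed) bottleneck widths of 3 and 2.
autoencoders = [
    build_autoencoder([8, 6, 3, 6, 8], seed=1),
    build_autoencoder([8, 5, 2, 5, 8], seed=2),
]

# Splice the original features with every bottleneck output.
spliced = list(features)
for layers in autoencoders:
    spliced += bottleneck(layers, features)

print(len(spliced))  # 8 original features + 3 + 2 bottleneck values
```

Using several autoencoders with different bottleneck widths gives the classifier several compressed views of the same record alongside the raw features.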

3. Convolutional neural network module

The module receives the input of the preprocessing module and the self-encoder module, and needs to train a network model by combining the output of the trained depth self-encoder module with the marked flow data to obtain an optimal classification network model. The trained model takes the preprocessed data and the output of the self-encoder module as input, and manual intervention is not needed.

The module adopts two-dimensional convolution layers (with 3 × 3 convolution kernels) and two-dimensional batch normalization, with ReLU as the activation function. Finally, global average pooling (global-avg-pool) maps the outputs of the different channels into a result vector, which is normalized with softmax. The model uses the cross-entropy loss function as the loss function and the Adam algorithm as the optimizer.
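The final normalization and loss can be illustrated with a short sketch. The channel scores and the three-class setup are hypothetical; subtracting the maximum score before exponentiating is a common numerical-stability choice and is not something the text specifies.

```python
import math

def softmax(scores):
    """Normalize raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probabilities, target_index):
    """Negative log-probability assigned to the true class."""
    return -math.log(probabilities[target_index])

# Hypothetical pooled outputs, one per channel (class): normal, attack A, attack B.
channel_scores = [2.0, 0.5, -1.0]
probs = softmax(channel_scores)
predicted = probs.index(max(probs))
loss = cross_entropy(probs, target_index=0)

print(probs, predicted, loss)
```

Because each pooled channel corresponds to one class, the largest probability directly names the predicted traffic category.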

The ReLU function, i.e. the linear rectification function, also called the rectified linear unit, is calculated as follows:

$$f(x) = \max(0, x)$$

The ReLU function is more efficient in gradient descent and back propagation: it helps avoid the gradient explosion and vanishing gradient problems, and it simplifies the calculation.
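A small sketch makes the gradient behavior concrete. The comparison with the derivative of tanh is added here only for illustration, and treating the derivative at exactly zero as 0 is a convention assumed by this sketch.

```python
import math

def relu(x):
    """Rectified linear unit: f(x) = max(0, x)."""
    return max(0.0, x)

def relu_grad(x):
    """Derivative of ReLU (taken as 0 at x = 0 by convention here)."""
    return 1.0 if x > 0 else 0.0

def tanh_grad(x):
    """Derivative of tanh, shown only for comparison."""
    return 1.0 - math.tanh(x) ** 2

for x in (-2.0, 0.5, 3.0):
    print(x, relu(x), relu_grad(x), round(tanh_grad(x), 4))
```

For positive inputs the ReLU gradient stays at 1 regardless of magnitude, whereas the tanh gradient shrinks toward 0 as the input grows, and ReLU needs only a comparison rather than exponentials.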

In the early development of convolutional neural networks, one or more fully connected layers were always required after the convolutional and pooling layers. The drawback is that fully connected layers have too many parameters, which makes the model bloated: the parameter count is excessive, training slows down and overfitting occurs easily. The present model avoids these inherent problems of fully connected layers through global pooling, which reduces the number of parameters and improves operating efficiency.

Global pooling is pooling in which the sliding window is the same size as the entire feature map. Each W × H × C feature map input is thus converted into a 1 × 1 × C output, where W and H are the width and height and C is the number of channels. In the case of global average pooling, this is equivalent to a fully connected layer operation in which the weight of every position is 1/(W × H).

The pooling operation applied within the window may be chosen freely, so global pooling can be subdivided into global average pooling, global max pooling and so on.
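The reduction from W × H × C to 1 × 1 × C can be sketched as follows. The feature-map values are made up for illustration, and the nested-list layout (a list of channels, each a list of rows) is an assumption of this sketch.

```python
def global_avg_pool(feature_maps):
    """Average every channel (H rows of W values) down to one number."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]

def global_max_pool(feature_maps):
    """Keep only the largest value of every channel."""
    return [max(max(row) for row in channel) for channel in feature_maps]

def uniform_weight_sum(channel):
    """Weighted sum in which every position has weight 1 / (W * H)."""
    weight = 1.0 / (len(channel) * len(channel[0]))
    return sum(weight * value for row in channel for value in row)

# Two hypothetical 2 x 2 channels.
maps = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 8.0]],
]

print(global_avg_pool(maps))                  # [2.5, 2.0]
print(global_max_pool(maps))                  # [4.0, 8.0]
print([uniform_weight_sum(c) for c in maps])  # matches the average pooling
```

No weights are learned in either pooling variant, which is where the parameter saving over a fully connected layer comes from.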

The structure of the convolution model used by the convolutional neural network module is shown in fig. 5. All convolution kernels are 3 × 3. The model comprises four similar parts connected in sequence, each consisting of a convolution layer, a normalization layer and a ReLU activation function, followed finally by a global pooling layer and a softmax activation function.

The input of the convolutional neural network is a one-dimensional vector formed by splicing the original traffic features with the output of the autoencoder module. This vector is filled into a 20 × 20 two-dimensional matrix in order, from left to right and from top to bottom, with zero padding where the vector is too short, yielding the reshaped 20 × 20 input.
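The filling rule can be sketched as below. The function name and the sample vector are hypothetical, and raising an error for vectors longer than 400 values is an assumption, since the text only covers the case where the vector is too short.

```python
def to_grid(vector, side=20):
    """Fill a side x side grid row by row, zero-padding the tail."""
    capacity = side * side
    if len(vector) > capacity:
        # Behavior for over-long vectors is not specified in the text.
        raise ValueError(f"vector has {len(vector)} values, grid holds {capacity}")
    padded = list(vector) + [0.0] * (capacity - len(vector))
    return [padded[r * side:(r + 1) * side] for r in range(side)]

# Hypothetical spliced vector with 45 values: 1.0, 2.0, ..., 45.0.
grid = to_grid([float(i) for i in range(1, 46)])

print(len(grid), len(grid[0]))  # 20 20
print(grid[2][:6])              # last real values followed by padding
```

Here the first two rows are completely filled, the third row holds the remaining five values, and everything after that is zero.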

4. System management module

The module is used for managing the configuration of the system; a user can modify the system's parameter information, data processing mode, display effect and the like. Parameter information includes, for example, the traffic sampling frequency, alarm level and alarm mode. The data processing mode includes, for example, whether alarm information uses a coarse two-class display (normal/abnormal) or a detailed display (normal / abnormal a / ... / abnormal b), and whether alarms are notified immediately or after statistics over a fixed time interval.

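The abnormal flow detection method based on the deep self-coding convolutional network, as shown in fig. 2, adopts the above modules and comprises the following steps:

S1, training a plurality of deep autoencoders with the preprocessed data;

S2, inputting the preprocessed data into the plurality of autoencoders to obtain a plurality of different dimension-reduced feature vectors;

S3, splicing the obtained dimension-reduced feature vectors with the preprocessed data, and training a convolutional neural network on the spliced features to obtain an optimal classification network model;

and S4, splicing the preprocessed unknown data with the output of the autoencoder module, inputting the result into the trained network model, and classifying the output of the convolutional neural network with a softmax activation function to obtain a prediction result.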

According to the abnormal flow detection method and system based on the deep self-coding convolutional network, traffic information is input into deep autoencoders and converted into dimension-reduced abstract features; this better extracts high-level features of the traffic and mines the deeper information it contains. A CNN built from a convolution-normalization-activation sequence serves as the basis of the model, and its accuracy is higher than that of traditional detection models. For classification, a two-dimensional CNN model based on global pooling is used, which maps classification information to specific output channels and thereby strengthens the interpretability of the model.

The invention solves the problems that current abnormal traffic detection systems extract features at a low level and that their models learn abnormal traffic features insufficiently, and it also provides implementation details of the feature extraction and detection algorithms. In tests, the accuracy on the kdd99 data set reaches 94%, known and unknown attacks can be effectively guarded against, and accuracy and generality are clearly improved over the prior art; the results are shown in Table 1.

Table 1. Test results of the method of the invention and prior-art methods on the kdd99 data set

Model	Accuracy	Recall	F1 value
K-nearest neighbor algorithm	0.92	0.96	0.94
AdaBoost algorithm	0.77	0.84	0.77
Random forest	0.88	0.92	0.90
Convolutional neural network	0.92	0.97	0.94
The method of the invention	0.94	0.98	0.96

While certain exemplary embodiments of the present invention have been described above by way of illustration only, it will be apparent to those of ordinary skill in the art that the described embodiments may be modified in various different ways without departing from the spirit and scope of the invention. Accordingly, the drawings and description are illustrative in nature and should not be construed as limiting the scope of the invention.

Claims

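1. An abnormal flow detection method based on a deep self-coding convolutional network, characterized by comprising the following steps:

S1, training a plurality of deep autoencoders with the preprocessed data;

S2, inputting the preprocessed data into the plurality of autoencoders to obtain a plurality of different dimension-reduced feature vectors;

S3, splicing the obtained dimension-reduced feature vectors with the preprocessed data, and training a convolutional neural network on the spliced features to obtain an optimal classification network model;

and S4, splicing the preprocessed unknown data with the output of the autoencoder module, inputting the result into the trained network model, and classifying the output of the convolutional neural network with a softmax activation function to obtain a prediction result.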

2. The abnormal traffic detection method based on the deep self-coding convolutional network of claim 1, wherein the preprocessing in step S1 includes: acquiring the traffic protocol, type, duration and number of bytes, and converting them into a one-dimensional floating-point vector through one-hot encoding.

3. The abnormal traffic detection method based on the deep self-coding convolutional network of claim 1, wherein the deep self-coder of step S1 is composed of a plurality of fully-connected layers.

4. The abnormal traffic detection method based on the deep self-coding convolutional network of claim 1, wherein the deep autoencoder of step S1 uses the mean square loss function as its loss function, the Adam algorithm as its optimizer and tanh as its activation function, and the output of the layer with the fewest dimensions is the required dimension-reduced feature information.

5. The abnormal traffic detection method based on the deep self-coding convolutional network of claim 4, wherein the mean square loss function is calculated as follows:

$$\Upsilon(x, y) = L = \{L_1, \dots, L_N\}^{T}, \quad L_n = (x_n - y_n)^2$$

6. The abnormal traffic detection method based on the deep self-coding convolutional network of claim 4, wherein the tanh function is calculated as follows:

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$

7. The abnormal traffic detection method based on the deep self-coding convolutional network of claim 1, wherein step S3 adopts two-dimensional convolution layers and two-dimensional batch normalization, with ReLU as the activation function.

8. The abnormal traffic detection method based on the deep self-coding convolutional network of claim 7, wherein step S3 employs a global pooling technique to map the outputs of the different channels into a result vector, and normalizes it with softmax.

9. The abnormal traffic detection method based on the deep self-coding convolutional network of claim 7, wherein step S3 adopts the cross-entropy loss function as the loss function and the Adam algorithm as the optimizer.

10. An abnormal traffic detection system based on a deep self-coding convolutional network, which is characterized by comprising: