CN111343147A

CN111343147A - Network attack detection device and method based on deep learning

Info

Publication number: CN111343147A
Application number: CN202010080797.6A
Authority: CN
Inventors: 陈双武; 金东�; 杨坚; 张勇东; 刘新民; 王玮
Original assignee: Beijing Zhongke Research Institute; University of Science and Technology of China USTC
Current assignee: Beijing Zhongke Research Institute; University of Science and Technology of China USTC
Priority date: 2020-02-05
Filing date: 2020-02-05
Publication date: 2020-06-26
Anticipated expiration: 2040-02-05
Also published as: CN111343147B

Abstract

The invention discloses a network attack detection device and method based on deep learning. More discriminative expression features are first extracted from the traffic to be tested and used to classify known traffic. The expression features are then reconstructed to obtain reconstruction features, and unknown attack traffic is detected according to the magnitude of the reconstruction error between the reconstruction features and the expression features. Classification of known traffic and detection of unknown attacks are thereby both achieved.

Description

Network attack detection device and method based on deep learning

Technical Field

The invention relates to the field of network security, in particular to a network attack detection device and method based on deep learning.

Background

With the rapid development of the internet, new network attacks emerge continually, and attacks aimed at novel network protocols and network system architectures seriously disrupt the normal operation of communication systems. Conventional network security detection devices rely on static attack features (e.g., IP blacklists) or dynamic attack features (e.g., feature strings) to detect attack behavior in the network. Such detection methods depend on known attack signatures, which typically have to be extracted manually. This extraction relies on expertise and experience and requires a significant amount of time and manpower, and the resulting detectors cannot respond effectively or promptly to unknown attacks.

Network attack detection based on deep learning can extract traffic features automatically and has been widely researched in recent years as a new security detection approach. Such methods fall into two main categories: unsupervised learning and supervised learning. Methods based on unsupervised learning can detect unknown network attacks to a certain extent, but cannot classify the known network attacks they detect. Methods based on supervised learning use normal traffic and traffic of known attack types as training data, so that the trained model can identify the type of the traffic it examines.

In summary, most existing network attack detection methods based on deep learning cannot both classify traffic of known types (including the normal traffic type and known attack types) and detect unknown attack traffic, which limits their use.

Disclosure of Invention

The invention aims to provide a network attack detection device and method based on deep learning, which can classify known type flow and detect unknown attack flow.

The purpose of the invention is realized by the following technical scheme:

A network attack detection device based on deep learning comprises a deep learning network attack detection module, the deep learning network attack detection module comprising: a feature extraction module, a known type classification module, a reconstruction module, an extreme value analysis module and a decision module; wherein:

the feature extraction module is used for extracting expression features from the traffic to be tested;

the known type classification module is used for evaluating, according to the expression features, the probability that the traffic to be tested is of a certain known type;

the reconstruction module is used for reconstructing the expression features to obtain reconstruction features;

the extreme value analysis module is used for evaluating the probability that the traffic to be tested is an unknown attack according to the reconstruction error between the reconstruction features and the expression features;

and the decision module is used for predicting the type of the traffic to be tested according to the probability that it is of a certain known type and the probability that it is an unknown attack.

According to the technical scheme provided by the invention, more discriminative expression features are extracted from the traffic to be tested and used to classify known traffic. The expression features are then reconstructed to obtain reconstruction features, and unknown attack traffic is detected according to the magnitude of the reconstruction error between the reconstruction features and the expression features, so that known traffic is classified and unknown attacks are detected at the same time.

Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic diagram of a network attack detection apparatus based on deep learning according to an embodiment of the present invention;

Fig. 2 is a framework diagram of a deep learning network attack detection module according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of the training and testing phases provided by an embodiment of the present invention;

Fig. 4 is a flowchart of a network attack detection method based on deep learning according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.

Traditional network attack detection methods rely on professional knowledge and experience, have low detection accuracy, and cannot respond effectively or promptly to unknown attacks. Existing methods based on deep learning classify known traffic accurately but detect unknown attacks poorly, and cannot classify traffic of known types and detect unknown attack traffic at the same time, which limits their practical application.

In view of this, embodiments of the present invention provide a network attack detection apparatus based on deep learning. The convolutional layers of a convolutional neural network extract more discriminative expression features from the traffic to be tested, and the fully connected layer of the convolutional neural network classifies the known traffic. An autoencoder then produces reconstruction features, and unknown attack traffic is detected according to the magnitude of the reconstruction error between the reconstruction features and the expression features, so that known traffic is classified and unknown attacks are detected at the same time. In addition, the invention uses the Weibull distribution to calculate the probability that the traffic to be tested belongs to an unknown attack, which improves the accuracy of unknown attack detection.

Fig. 1 is a schematic diagram of a network attack detection apparatus based on deep learning according to an embodiment of the present invention. The apparatus mainly includes a deep learning network attack detection module. As shown in Fig. 2, the deep learning network attack detection module mainly includes: a feature extraction module, a known type classification module, a reconstruction module, an extreme value analysis module and a decision module.

As shown in fig. 1, the apparatus further comprises: the flow extraction module and the flow preprocessing module;

the flow extraction module is used for acquiring original flow from a network;

the flow preprocessing module is used for dividing the original traffic into different flows to be tested according to the five-tuple; the five-tuple comprises: source IP, destination IP, source port, destination port and protocol number;

and the first d bytes of each flow to be tested, as divided by the flow preprocessing module, are used as the input of the deep learning network attack detection module. Because the convolutional neural network requires all inputs to have the same size while flows differ in size, in the embodiment of the present invention only the first d bytes of each flow are taken. The first d bytes contain the full contents of the five-tuple, and the specific value of d can be set according to the actual situation.
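The preprocessing described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Packet` record and its field names are hypothetical, and zero-padding flows shorter than d bytes is an assumption, since the text does not say how short flows are handled.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    # Hypothetical packet record; the field names are illustrative only.
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int
    raw: bytes  # raw bytes of the packet as captured


def split_into_flows(packets):
    """Group packets by five-tuple, keeping arrival order within each flow."""
    flows = defaultdict(list)
    for p in packets:
        key = (p.src_ip, p.dst_ip, p.src_port, p.dst_port, p.protocol)
        flows[key].append(p.raw)
    return flows


def first_d_bytes(flow_packets, d):
    """Concatenate a flow's packets and return exactly d bytes.

    Assumption: flows shorter than d bytes are padded with zero bytes.
    """
    data = b"".join(flow_packets)[:d]
    return data + b"\x00" * (d - len(data))


def build_inputs(packets, d):
    """Map each five-tuple to a fixed-length list of byte values in [0, 255]."""
    return {key: list(first_d_bytes(raws, d))
            for key, raws in split_into_flows(packets).items()}
```

Every flow then yields a vector of the same length d, which is what a fixed-size convolutional input requires.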

In the embodiment of the present invention, a deep learning network attack detection module needs to be trained, and as shown in fig. 3, the training phase mainly includes the following three parts:

1. Training the parameters of the feature extraction module and the known type classification module.

As shown in fig. 2, the feature extraction module and the known type classification module are respectively implemented by a convolutional layer and a fully connected layer of a convolutional neural network.

The training set (containing only the normal traffic type and known attack types) can be modeled as $(x, y) = \{(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)\}$. The input of the feature extraction module, $x_i \in \mathbb{R}^d$, represents the first d bytes of the i-th flow, and $y_i \in \{1, 2, \dots, K\}$ is the label corresponding to $x_i$, where N is the number of training samples and K is the number of types.

The discriminative expression feature $z_i$ output by the feature extraction module is passed through the known type classification module to obtain a normalized output vector $\hat{y}_i$, where $\hat{y}_i(j)$ reflects the probability that $x_i$ belongs to known type j; $j = 1, 2, \dots, K$ is the sequence number of a known type. Known types include the normal traffic type and the various known attack types.

It can be understood that, in order to distinguish the types, the convolutional neural network is trained on the original input x to obtain a more discriminative expression feature. The expression feature z is obtained by applying a function transformation $z = F(x)$ to the original input x, and training amounts to finding the optimal form of the function F.

The feature extraction module and the known type classification module can be modeled as the functions $F: x \to z$ and $G: z \to \hat{y}$, respectively, so that the normalized output vector $\hat{y}$ can be expressed as $\hat{y} = G(F(x))$.

The parameters of the feature extraction module and of the known type classification module are trained simultaneously by stochastic gradient descent, using the cross-entropy loss function:

$$L_{cls} = -\frac{1}{M} \sum_{i=1}^{M} \sum_{j=1}^{K} \tilde{y}_i(j) \log \hat{y}_i(j)$$

In the above formula, M is the number of training samples in a batch; that is, the N training samples are divided into several batches, each containing M training samples. $\hat{y}_i(j)$ is the predicted probability that $x_i$ belongs to type j, and $\tilde{y}_i$ is the one-hot encoding of the label $y_i$: a vector with exactly one element equal to 1 and all other elements equal to 0, in which the position of the 1 indicates the true type of the traffic. $(j)$ denotes an index, i.e., $\tilde{y}_i(j)$ is the j-th element of $\tilde{y}_i$.
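The cross-entropy computation for one batch can be illustrated with a short standard-library sketch. It assumes that the normalized output vector is produced by a softmax over the K scores of the fully connected layer and that the loss is averaged over the batch. Both are common choices rather than details confirmed by the text, and the function names are illustrative.

```python
import math


def softmax(logits):
    """Normalize a list of K scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(label, num_types):
    """One-hot vector for a label in {1, ..., K} (labels are 1-based)."""
    return [1.0 if j == label else 0.0 for j in range(1, num_types + 1)]


def batch_cross_entropy(batch_logits, batch_labels):
    """Mean cross-entropy over a batch of M samples."""
    num_types = len(batch_logits[0])
    loss = 0.0
    for logits, label in zip(batch_logits, batch_labels):
        probs = softmax(logits)
        target = one_hot(label, num_types)
        # Only the position of the true type contributes to the sum.
        loss -= sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)
    return loss / len(batch_logits)
```

The loss is small when the probability assigned to the true type is close to 1 and grows as that probability shrinks, which is what drives the feature extractor toward more discriminative features.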

2. Training the parameters of the reconstruction module.

In the embodiment of the invention, the reconstruction module is implemented by an autoencoder. For each input $x_i$, the trained feature extraction module yields the corresponding expression feature $z_i = F(x_i)$. The reconstruction module uses the expression feature $z_i$ to generate the reconstruction feature $\hat{z}_i$, and can be expressed as a function $R: z \to \hat{z}$.

In the embodiment of the invention, the parameters of the reconstruction module are trained by stochastic gradient descent, using the squared error loss function:

$$L_{rec} = \frac{1}{M} \sum_{i=1}^{M} \lVert z_i - \hat{z}_i \rVert_2^2$$

In the above formula, M is the number of training samples in a batch.
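To make the training of the reconstruction module concrete, the following toy sketch trains a linear autoencoder with a single code value by stochastic gradient descent on the squared error. It is only an illustration under strong assumptions: the real reconstruction module is an autoencoder over CNN features whose architecture is not specified here, and the two-dimensional features, initial weights and learning rate below are made up.

```python
def reconstruct(z, w_enc, w_dec):
    """Toy linear autoencoder: encode z to one code value, then decode it."""
    code = sum(w * v for w, v in zip(w_enc, z))
    return code, [w * code for w in w_dec]


def squared_error(z, z_hat):
    """Squared Euclidean distance between a feature and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(z, z_hat))


def mean_loss(features, w_enc, w_dec):
    """Average squared reconstruction error over a set of features."""
    errors = [squared_error(z, reconstruct(z, w_enc, w_dec)[1]) for z in features]
    return sum(errors) / len(errors)


def train(features, w_enc, w_dec, lr=0.01, epochs=200):
    """Plain SGD, one sample per step, on the squared error loss."""
    w_enc, w_dec = list(w_enc), list(w_dec)
    for _ in range(epochs):
        for z in features:
            code, z_hat = reconstruct(z, w_enc, w_dec)
            residual = [zh - v for zh, v in zip(z_hat, z)]
            # Gradients of the squared error with respect to code and weights.
            grad_code = 2.0 * sum(r * w for r, w in zip(residual, w_dec))
            grad_dec = [2.0 * r * code for r in residual]
            grad_enc = [grad_code * v for v in z]
            w_dec = [w - lr * g for w, g in zip(w_dec, grad_dec)]
            w_enc = [w - lr * g for w, g in zip(w_enc, grad_enc)]
    return w_enc, w_dec
```

Features that resemble the training data are reconstructed well after training, while features of a different structure are not, which is the property the extreme value analysis relies on.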

3. Training the parameters of the extreme value fitting module.

The reconstruction module reconstructs the expression features of unknown traffic less well than those of known traffic; that is, the reconstruction error of unknown attack traffic is larger than that of known traffic.

The extreme value fitting module evaluates the probability that the traffic to be tested is an unknown attack based on the Weibull distribution, a probability distribution model from extreme value theory, whose probability distribution function is:

$$P_{EVT}(x) = 1 - \exp\left(-\left(\frac{x - \tau}{\sigma}\right)^{m}\right)$$

When the parameters of the extreme value fitting module are trained, the reconstruction errors $e = (e_1, e_2, \dots, e_N)$ of all known traffic are first calculated according to the following formula:

$$e_i = \lVert z_i - \hat{z}_i \rVert$$

All reconstruction errors are sorted from small to large, and η reconstruction errors (η < N; η is a hyper-parameter, generally η = 0.1N) are selected from the top of the sequence. These reconstruction errors are used to analyze the boundary between unknown attacks and known types.

For the η reconstruction errors, the parameters (m, τ, σ) are obtained by the maximum likelihood method, where m (m > 0), τ (τ < x) and σ (σ > 0) are the shape parameter, the location parameter and the scale parameter, respectively.

For the traffic to be tested, the reconstruction error e' is substituted into the formula for $P_{EVT}(x)$. The result $P_{EVT}(e')$ is the probability that the traffic to be tested is an unknown attack, and the larger the reconstruction error e', the larger $P_{EVT}(e')$.
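The evaluation step can be sketched with the three-parameter Weibull cumulative distribution function. The sketch assumes that the reconstruction error is the Euclidean distance between the two feature vectors and that the parameters (shape, location, scale) have already been fitted. The maximum likelihood fit itself is not shown; a library routine such as SciPy's `scipy.stats.weibull_min.fit` could be used for it. The parameter values in the usage are hypothetical.

```python
import math


def weibull_cdf(x, shape, location, scale):
    """Three-parameter Weibull CDF; 0 for x at or below the location."""
    if x <= location:
        return 0.0
    return 1.0 - math.exp(-(((x - location) / scale) ** shape))


def reconstruction_error(z, z_hat):
    """Euclidean distance between a feature and its reconstruction (assumed norm)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(z, z_hat)))


def unknown_attack_probability(z, z_hat, params):
    """P_EVT(e') for a tested flow, given fitted (shape, location, scale)."""
    shape, location, scale = params
    error = reconstruction_error(z, z_hat)
    return weibull_cdf(error, shape, location, scale)
```

Because the CDF is non-decreasing in its argument, a larger reconstruction error never yields a smaller unknown-attack probability, matching the monotonic behavior stated above.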

After training is completed, any traffic x' to be tested can be examined by the deep learning network attack detection module as described above. The traffic to be tested may be normal traffic, known attack traffic or unknown attack traffic. Its first d bytes are input into the trained deep learning network attack detection module, and the feature extraction module outputs the corresponding expression feature z'. On the one hand, the known type classification module uses the expression feature z' to output a normalized output vector $\hat{y}'$, whose elements $\hat{y}'(y')$, $y' \in \{1, 2, \dots, K\}$, are the probabilities that the traffic to be tested belongs to each type. On the other hand, the reconstruction module uses the expression feature z' to obtain the reconstruction feature $\hat{z}'$. The extreme value module obtains $P_{EVT}(e')$ from the reconstruction error e' between the expression feature z' and the reconstruction feature $\hat{z}'$. Finally, the decision module obtains the final prediction type from $\hat{y}'$ and $P_{EVT}(e')$.

As shown in fig. 3, the decision module decision process can be expressed as follows:

1) The probability that the traffic to be tested is of a known type is recorded as $p_{y^*}$. With $\hat{y}'$ denoting the normalized output vector of the known type classification module, the calculation is expressed as:

$$y^* = \arg\max_{j} \hat{y}'(j), \qquad p_{y^*} = \hat{y}'(y^*)$$

The first expression selects the known type $y^*$ given by the index j with the maximum probability, where j is the sequence number of a known type. The second expression retains the probability that the traffic to be tested is of the known type $y^*$; $p_{y^*}$ reflects how likely the traffic is to belong to type $y^*$.

2) The known-type probability from step 1) and $P_{EVT}(e')$ are used to update the probability $p_u$ that the traffic to be tested belongs to an unknown attack:

3) The type of the traffic to be tested is decided, where $y^*$ denotes the known type selected in step 1) and α is a hyper-parameter. The final prediction type is $y^*$, i.e., known traffic of that type, if the decision condition for a known type is satisfied; otherwise the traffic is an unknown attack.
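The decision process can be sketched as below. Selecting the known type with the maximum probability follows the description of step 1). The comparison rule is an assumption inferred from the testing stage, where $p_u$ is compared with the known-type probability. Because the update formula for $p_u$ and the exact role of α are not reproduced in this text, `p_unknown` is taken as an input rather than computed.

```python
UNKNOWN_ATTACK = "unknown attack"


def most_likely_known_type(probs):
    """Return (y_star, p_known): the 1-based index and value of the largest probability."""
    best = max(range(len(probs)), key=lambda j: probs[j])
    return best + 1, probs[best]


def decide(probs, p_unknown):
    """Pick the known type y_star or flag the flow as an unknown attack.

    Assumption: the flow keeps its known type unless the unknown-attack
    probability exceeds the probability of the selected known type.
    """
    y_star, p_known = most_likely_known_type(probs)
    return y_star if p_known >= p_unknown else UNKNOWN_ATTACK
```

For example, a flow with class probabilities `[0.1, 0.7, 0.2]` and `p_unknown = 0.3` is assigned known type 2, while the same rule flags `[0.4, 0.35, 0.25]` with `p_unknown = 0.9` as an unknown attack.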

For ease of understanding, the following description is made with reference to examples.

As shown in table 1, the data set contains known traffic types and unknown attack types. Wherein the known traffic types include normal traffic types and known attack types.

TABLE 1 data set

1. Data set preprocessing

The training set consists of 80% of the traffic of known types, and the test set consists of the remaining 20% of the traffic of known types together with all traffic of unknown attack types. As shown in Table 1, the traffic types are labeled and there are 10 known types, so K = 10 in this solution. Because unknown attacks must also be detected during testing, the label of the unknown type is set to 11.
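The split described above can be sketched as follows. The sample representation, the random shuffle and the fixed seed are assumptions for illustration; the text does not say whether the 80/20 split is random or stratified by type.

```python
import random

UNKNOWN_LABEL = 11  # K = 10 known types, so the unknown type is labeled K + 1


def split_dataset(known_samples, unknown_samples, train_fraction=0.8, seed=0):
    """Split labeled known samples 80/20 and add all unknown samples to the test set.

    known_samples: list of (x, label) pairs with labels in 1..10.
    unknown_samples: list of x values from unknown attack types.
    """
    rng = random.Random(seed)
    shuffled = list(known_samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    train = shuffled[:cut]
    test = shuffled[cut:] + [(x, UNKNOWN_LABEL) for x in unknown_samples]
    return train, test
```

Unknown attack samples never enter the training set, so the model only sees label 11 at test time.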

2. Training stage.

1) Training by taking a training data set as input to obtain parameters of a feature extraction module and a known type classification module;

2) training the parameters of the reconstruction module by taking the training data set as input;

3) taking the training data set as input, calculating the reconstruction errors $e = (e_1, e_2, \dots, e_N)$ of all training samples, sorting the reconstruction errors from small to large, and taking η reconstruction errors to fit the parameters of the extreme value module.

3. Testing stage.

1) For each flow to be tested, the known type classification module predicts the known type $y^*$ to which it belongs and outputs the probability that the prediction is correct.

2) The extreme value module outputs $P_{EVT}(e')$, from which the probability $p_u$ of belonging to an unknown attack is calculated. The hyper-parameter α is adjusted according to the test results.

3) $p_u$ is compared with the known-type probability from step 1). Depending on the result, the tested flow is either of the known type $y^*$ or of the unknown attack type.

Another embodiment of the present invention further provides a network attack detection method based on deep learning, which can be implemented based on the apparatus provided in the foregoing embodiment, as shown in fig. 4, and mainly includes the following steps:

extracting expression characteristics from the flow to be detected;

estimating the probability that the flow to be detected is of a certain known type according to the expression characteristics;

reconstructing the expression characteristics to obtain reconstructed characteristics;

estimating the probability of unknown attack of the flow to be detected according to the reconstruction error between the reconstruction characteristic and the expression characteristic;

and predicting the type of the flow to be detected according to the probability that the flow to be detected is of a certain known type and the probability that the flow to be detected is of unknown attack.

Further, the method further comprises:

acquiring original flow from a network;

dividing the original flow into different flows to be tested according to quintuple; the quintuple comprises: source IP, destination IP, source port, destination port and protocol number;

and taking the divided first d bytes of each flow to be tested as input to extract expression characteristics.

Further, extracting the expression features and estimating, according to the expression features, the probability that the traffic to be tested is of a certain known type are implemented by the convolutional layers and the fully connected layer of a convolutional neural network, respectively; the convolutional layers serve as the feature extraction module, and the fully connected layer serves as the known type classification module;

the training samples are $(x, y) = \{(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)\}$; the input of the feature extraction module, $x_i \in \mathbb{R}^d$, represents the first d bytes of the i-th flow, and $y_i \in \{1, 2, \dots, K\}$ is the label corresponding to $x_i$, where N is the number of training samples and K is the number of types;

the expression feature $z_i$ output by the feature extraction module is passed through the known type classification module to obtain a normalized output vector $\hat{y}_i$, where $\hat{y}_i(j)$ reflects the probability that $x_i$ belongs to known type j, $j = 1, 2, \dots, K$; the known types include the normal traffic type and the various known attack types;

$\tilde{y}_i$ is the one-hot encoding of the label $y_i$: a vector with exactly one element equal to 1 and all other elements equal to 0, in which the position of the 1 indicates the true type of the traffic; $(j)$ denotes an index, i.e., $\tilde{y}_i(j)$ is the j-th element of $\tilde{y}_i$.

Furthermore, the expression features are reconstructed by a reconstruction module, which is implemented by an autoencoder and uses the expression feature $z_i$ to generate the reconstruction feature $\hat{z}_i$.

The parameters of the reconstruction module are trained by stochastic gradient descent, using the squared error loss function:

$$L_{rec} = \frac{1}{M} \sum_{i=1}^{M} \lVert z_i - \hat{z}_i \rVert_2^2$$

in the above formula, M is the number of training samples in a batch;

the probability that the traffic to be tested is an unknown attack is estimated from the reconstruction error between the reconstruction features and the expression features by an extreme value fitting module based on the Weibull distribution, a probability distribution model from extreme value theory, whose probability distribution function is:

$$P_{EVT}(x) = 1 - \exp\left(-\left(\frac{x - \tau}{\sigma}\right)^{m}\right)$$

all reconstruction errors are sorted from small to large, and the η reconstruction errors at the top of the sequence are selected;

for the η reconstruction errors, the parameters (m, τ, σ) are obtained by the maximum likelihood method, where m, τ and σ are the shape parameter, the location parameter and the scale parameter, respectively;

for the traffic to be tested, the reconstruction error e' is substituted into the formula for $P_{EVT}(x)$; the result $P_{EVT}(e')$ is the probability that the traffic to be tested is an unknown attack, and the larger the reconstruction error e', the larger $P_{EVT}(e')$.

Further, predicting the type of the traffic to be measured according to the probability that the traffic to be measured is of a certain known type and the probability that the traffic to be measured is of an unknown attack includes:

the probability that the traffic to be tested is of a known type is recorded, and the probability that the traffic to be tested is an unknown attack is recorded as $P_{EVT}(e')$;

these probabilities are used to decide the type of the traffic to be tested, where $y^*$ denotes a known type and α is a hyper-parameter.

Through the above description of the embodiments, it is clear to those skilled in the art that the above embodiments can be implemented by software, and can also be implemented by software plus a necessary general hardware platform. With this understanding, the technical solutions of the embodiments can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a usb disk, a removable hard disk, etc.), and includes several instructions for enabling a computer device (which can be a personal computer, a server, or a network device, etc.) to execute the methods according to the embodiments of the present invention.

It will be clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be performed by different functional modules according to needs, that is, the internal structure of the system is divided into different functional modules to perform all or part of the above described functions.

The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A network attack detection device based on deep learning, characterized by comprising a deep learning network attack detection module, the deep learning network attack detection module comprising: a feature extraction module, a known type classification module, a reconstruction module, an extreme value analysis module and a decision module; wherein:

2. The deep learning-based network attack detection device according to claim 1, wherein the device further comprises: the flow extraction module and the flow preprocessing module;

the flow extraction module is used for acquiring original flow from a network;

and the first d bytes of each flow to be tested, which are divided by the flow preprocessing module, are used as the input of the deep learning network attack detection module.

3. The network attack detection device based on deep learning of claim 1, wherein the feature extraction module and the known type classification module are implemented by the convolutional layers and the fully connected layer of a convolutional neural network, respectively;

the training samples are $(x, y) = \{(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)\}$; the input of the feature extraction module, $x_i \in \mathbb{R}^d$, represents the first d bytes of the i-th flow, and $y_i \in \{1, 2, \dots, K\}$ is the label corresponding to $x_i$, where N is the number of training samples and K is the number of types;

wherein $\hat{y}_i(j)$ reflects the probability that $x_i$ belongs to known type j; $j = 1, 2, \dots, K$ is the sequence number of a known type; the known types include the normal traffic type and the various known attack types;

$\tilde{y}_i$ is the one-hot encoding of the label $y_i$, and $\tilde{y}_i(j)$ is the j-th element of $\tilde{y}_i$.

4. The deep learning-based network attack detection apparatus according to claim 1,

the reconstruction module is implemented by an autoencoder and uses the expression feature $z_i$ to generate the reconstruction feature $\hat{z}_i$;

in the above formula, M is the number of training samples in a batch;

when the parameters of the extreme value fitting module are trained, the reconstruction errors $e = (e_1, e_2, \dots, e_N)$ of all known traffic are first calculated according to the following formula:

5. The device according to claim 1, wherein the predicting the type of traffic to be tested according to the probability that the traffic to be tested is a known type and the probability that the traffic to be tested is an unknown attack comprises:

the probability that the flow to be measured is of a known type is recorded

Using probabilities

And P_EVT(e') calculating and updating the probability p that the traffic to be tested belongs to unknown attack_u：

Deciding the type of flow to be measured

Wherein, y^*Indicating a known type, α is a hyperparameter.

6. A network attack detection method based on deep learning is characterized by comprising the following steps:

extracting expression characteristics from the flow to be detected;

7. The method for detecting network attacks based on deep learning of claim 6, wherein the method further comprises:

acquiring original flow from a network;

8. The method according to claim 6, wherein the network attack detection method based on deep learning,

extracting the expression features and evaluating, according to the expression features, the probability that the traffic to be tested is of a certain known type are implemented by the convolutional layers and the fully connected layer of a convolutional neural network, respectively; the convolutional layers serve as the feature extraction module, and the fully connected layer serves as the known type classification module;

wherein $\tilde{y}_i$ is the one-hot encoding of the label $y_i$, and $\tilde{y}_i(j)$ is the j-th element of $\tilde{y}_i$.

9. The method according to claim 6, wherein the network attack detection method based on deep learning,

the expression features are reconstructed by a reconstruction module, which is implemented by an autoencoder and uses the expression feature $z_i$ to generate the reconstruction feature $\hat{z}_i$;

in the above formula, M is the number of training samples in a batch;

10. The method as claimed in claim 6, wherein predicting the type of traffic to be detected according to the probability that the traffic to be detected is of a known type and the probability that the traffic to be detected is of an unknown attack comprises:

the probability that the flow to be measured is of a known type is recorded

Using probabilities

Deciding the type of flow to be measured

Wherein, y^*Indicating a known type, α is a hyperparameter.