CN111147443A

CN111147443A - Unified quantification method for network threat attack characteristics based on style migration

Info

Publication number: CN111147443A
Application number: CN201911127899.2A
Authority: CN
Inventors: 杨进; 李涛; 梁刚; 赵辉; 高天予; 唐晔晨
Original assignee: Sichuan University
Current assignee: Sichuan University
Priority date: 2019-11-18
Filing date: 2019-11-18
Publication date: 2020-05-12

Abstract

The invention discloses a unified quantification method for network threat attack characteristics based on style migration, which uniformly processes native feature maps of different dimensions and comprises the following steps: 1) data acquisition: collecting network flow data in real time; 2) selecting basic characteristics of the network flow data; 3) converting the selected basic characteristics of the network flow into native feature maps of different dimensions by using the Minkowski Distance; 4) using a CNN to establish a generation network, inputting a 'native feature map' from the data set into the generation network, and generating a 'result map'; 5) extracting high-level features of the native feature map through a loss network, and performing loss calculation between the generated result and, respectively, the target 'style map' and the 'native feature map'; and 6) unified quantification of the features: adjusting the weights of the generation network according to the loss value calculated in step 5). The method automatically extracts the high-level characteristics of multi-dimensional network threat data and, at the same time, realizes unified quantification processing of multiple threat sources.

Description

Unified quantification method for network threat attack characteristics based on style migration

Technical Field

The invention relates to the fields of network security technology, network security threat technology and the like, and in particular to a unified quantification method for network threat attack characteristics based on style migration.

Background

China has become one of the countries most seriously affected by network attacks worldwide, and the current network security situation is very severe. Network threat perception systems have always been a key, core technology for guaranteeing the security of cyberspace. As the total amount of internet data grows, network activities diversify and network threat techniques become increasingly concealed, current network threat detection methods suffer from long model training time, low training efficiency and insufficient training. Conventional network threat detection techniques therefore face new challenges.

Machine learning is a discipline that studies how computers simulate or implement human learning behaviors to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve their performance.

The concept of deep learning stems from the study of artificial neural networks. A multi-layer perceptron with multiple hidden layers is a deep learning structure. Deep learning combines low-level features into more abstract high-level representations (attribute classes or features) in order to discover distributed feature representations of the data.

The concept of deep learning was proposed by Hinton et al. in 2006. They proposed an unsupervised greedy layer-wise training algorithm based on the Deep Belief Network (DBN), which offered hope for solving the optimization problems associated with deep structures; a multilayer autoencoder deep structure was proposed later. In addition, the convolutional neural network proposed by LeCun et al. is the first true multi-layer structure learning algorithm; it uses spatial relative relationships to reduce the number of parameters and so improve training performance.

Deep learning is a machine learning method based on representation learning of data. An observation (e.g., an image) can be represented in many ways, such as a vector of per-pixel intensity values, or more abstractly as a set of edges, regions of particular shapes, and so on. Some representations make it easier to learn tasks (e.g., face recognition or facial expression recognition) from examples. The benefit of deep learning is that it replaces manual feature engineering with efficient algorithms for unsupervised or semi-supervised feature learning and hierarchical feature extraction.

Deep learning is a new field of machine learning research. Its motivation is to build neural networks that simulate the way the human brain analyzes and learns, imitating the mechanisms by which the human brain interprets data such as images, sounds and text.

For example, Convolutional Neural Networks (CNNs) are machine learning models under deep supervised learning, and Deep Belief Networks (DBNs) are machine learning models under unsupervised learning.

The computation involved in producing an output from an input can be represented by a flow graph: a graph in which each node represents a basic computation and the value it computes, with the result of the computation being applied to the values of that node's children. Consider the set of computations allowed at each node and the possible graph structures; together they define a family of functions. Input nodes have no parents and output nodes have no children.

One particular attribute of such a flow graph is its depth: the length of the longest path from an input to an output.

A conventional feed-forward neural network can be viewed as having a depth equal to its number of layers (i.e., the number of hidden layers plus 1 for the output layer). SVMs have a depth of 2 (one level for the kernel outputs or feature space, and another for the linear combination that produces the output).
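The depth definition above can be illustrated with a short sketch. The graph encoding (a dictionary mapping each node to a list of its children) and the name `graph_depth` are assumptions made purely for illustration; the source does not prescribe any representation.

```python
from functools import lru_cache


def graph_depth(children):
    """Return the length, in edges, of the longest path in an acyclic flow graph.

    `children` maps each node name to the list of its child nodes. This
    encoding is hypothetical and chosen only for the example.
    """

    @lru_cache(maxsize=None)
    def longest_from(node):
        kids = children.get(node, [])
        if not kids:  # an output node has no children
            return 0
        return 1 + max(longest_from(k) for k in kids)

    return max(longest_from(n) for n in children)


# A feed-forward network with one hidden layer: input -> hidden -> output.
mlp = {"x": ["h"], "h": ["y"], "y": []}
depth = graph_depth(mlp)
```

For this graph the longest path has two edges, which matches the rule of thumb stated above: one hidden layer plus one output layer.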

Network security means that the hardware, software and data of a network system are protected from damage, alteration and leakage caused by accidental or malicious acts, so that the system operates continuously, reliably and normally and the network service is not interrupted.

Network security bears on the in-depth development of future network applications; it involves security policies, mobile code, instruction protection, cryptography, operating systems, software engineering and network security management. Isolation of a private intranet from the public internet typically relies on "firewall" technology.

Disclosure of Invention

The invention aims to provide a method for uniformly quantizing network threat attack characteristics based on style migration, which can automatically extract high-level characteristics of multi-dimensional network threat data and can simultaneously realize uniform quantization processing on multiple threat sources.

The invention is realized by the following technical scheme: a unified quantification method for network threat attack characteristics based on style migration uniformly processes native characteristic graphs of different dimensions, and comprises the following steps:

1) data acquisition: collecting network flow data in real time;

2) selecting basic characteristics of network flow data;

3) converting the selected basic characteristics of the network flow into native feature maps of different dimensions by using the Minkowski Distance;

4) using a CNN to establish a generation network, inputting a 'native feature map' from the data set into the generation network, and generating a 'result map';

5) extracting high-level features of the native feature map through a loss network, and performing loss calculation between the generated result and, respectively, the target 'style map' and the 'native feature map';

6) unified quantification of characteristics: adjusting the weights of the generation network according to the loss value calculated in step 5).

In order to further realize the invention, the following arrangement mode is adopted: the step 4) comprises the following specific steps:

4.1) building a convolutional layer, performing convolution operation on the trained convolution kernel and the feature map of the previous layer in the convolutional layer, and outputting the feature map of the current layer through a given activation function according to the operation result;

4.2) establishing a down-sampling layer, and performing sampling operation on the feature mapping chart output in the step 4.1) by using the down-sampling layer;

4.3) establishing a full connection layer, calculating the result of the data output in the step 4.2) and outputting a result graph.

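In order to further realize the invention, the following arrangement mode is adopted: the convolution function adopted by the convolution layer is as follows:

$$x_j^{l} = f\left(\sum_{i \in M_j} x_i^{l-1} * k_{ij}^{l} + b_j^{l}\right)$$

where l represents the layer index, k is the convolution kernel, M_j represents the selection of input feature maps, b is a bias term and f is the activation function;

the activation function adopts a tanh activation function;

the sampling function adopted by the down-sampling layer is as follows:

$$x_j^{l} = f\left(w_j^{l}\,\mathrm{down}\left(x_j^{l-1}\right) + b_j^{l}\right)$$

where down(·) is a down-sampling function, and w and b are the weight and bias of each output feature map.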

In order to further realize the invention, the following arrangement mode is adopted: the step 5) comprises the following specific steps:

5.1) adopting a convolutional neural network to establish a content loss function and a style loss function respectively;

5.2) adopting the VGG model to calculate the content loss value;

5.3) utilizing the style loss function to calculate the "style" loss value.

In order to further realize the invention, the following arrangement mode is adopted: the step 6) is specifically as follows:

passing the content loss value and the 'style' loss value through a weighted-sum function to obtain a new 'graph'.

In order to further realize the invention, the following arrangement mode is adopted: the network flow data comprises data from a firewall, an intrusion detection system, a vulnerability scanning system, an anti-virus system, a terminal security management system, a security management platform and a security operation center.

In order to further realize the invention, the following arrangement mode is adopted: in step 2), the selected basic characteristics of the network flow data include: a source IP address, a destination IP address, a source port number, a destination port number, a protocol type, a total number of data packets, a number of null data packets, a ratio of a number of ingress and egress data packets, a number of reconnections, a duration of a flow, a length of a first data packet, a total number of bytes, an average number of bytes per packet, a variance of a number of bytes per packet, an average packet length, a ratio of a number of packets of the same length to a total number of packets, a standard deviation of data packet lengths, an average number of bits per second, an average interval of arrival of data packets, an average number of packets per second.

In order to further realize the invention, the following arrangement mode is adopted: in the step 3), when the basic features of the network flow are converted, the Minkowski Distance calculation formula is adopted to perform correlation processing between the selected basic features.

Compared with the prior art, the invention has the following advantages and beneficial effects:

(1) The method uses deep learning to perform fast style migration on images: threat sources of different types, different origins and different dimensions are mapped to the same dimension and the same style according to a target image template, finally forming a data set of similar style.

(2) Through repeated multi-level learning, the invention simulates the way the human brain thinks in order to extract features automatically, and so meets the urgent need to process complex high-dimensional data. It establishes a unified quantification method for network threat attack features, overcomes the curse of dimensionality of traditional network threat perception algorithms, can extract intrusion attack features flexibly and quantitatively, removes the dependence of traditional methods on manually designed data representations and input threat features, improves the adaptive capability of threat perception, raises the detection rate and lowers the false alarm rate. The method can also improve network threat perception under today's increasingly complex and severe network security situation, and has important theoretical research value and practical application prospects.

Drawings

FIG. 1 is a schematic flow chart of the present invention.

Fig. 2 is a schematic diagram of a 5-layer CNN structure.

Fig. 3 is a schematic diagram of a loss network.

Detailed Description

The present invention will be described in further detail with reference to examples, but the embodiments of the present invention are not limited thereto.

It should be noted that a software program is inevitably used in the practical application of the present invention; however, the applicant states that the software program used in the embodiments of the present invention is prior art, and that the present application does not involve modification or protection of the software program, but only protection of the hardware architecture designed for the purpose of the invention.

In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments will be described clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are some, but not all, embodiments of the present invention. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention as claimed, but is merely representative of selected embodiments of the invention. All other embodiments that a person skilled in the art can obtain from the embodiments of the present invention without any inventive step are within the scope of the present invention.

In the description of the present invention, it is to be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", and the like, indicate orientations and positional relationships based on those shown in the drawings, and are used only for convenience of description and simplicity of description, and do not indicate or imply that the equipment or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, should not be considered as limiting the present invention.

Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.

In the present invention, unless otherwise expressly stated or limited, the terms "mounted," "connected," "secured," and the like are to be construed broadly and can, for example, be fixedly connected, detachably connected, or integrally formed; can be mechanically or electrically connected; either directly or indirectly through intervening media, either internally or in any other relationship. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.

In the present invention, unless otherwise expressly stated or limited, "above" or "below" a first feature means that the first and second features are in direct contact, or that the first and second features are not in direct contact but are in contact with each other via another feature therebetween. Also, the first feature being "on," "above" and "over" the second feature includes the first feature being directly on and obliquely above the second feature, or merely indicating that the first feature is at a higher level than the second feature. A first feature being "under," "below," and "beneath" a second feature includes the first feature being directly under and obliquely below the second feature, or simply meaning that the first feature is at a lesser elevation than the second feature.

Example 1:

The invention provides a unified quantification method for network threat attack characteristics based on style migration, which can automatically extract high-level characteristics of multi-dimensional network threat data and at the same time realize unified quantification processing of multiple threat sources. In particular, the following arrangement is adopted: the method uniformly processes native feature maps of different dimensions and comprises the following steps:

1) data acquisition: collecting network flow data in real time;

2) selecting basic characteristics of network flow data;

As a preferred arrangement, the "original graph" x (i.e., the native feature map) in the dataset is input into the "generating network" f_W to generate a "result map" ŷ.

The loss network extracts the high-level features of the native feature map; loss calculation is performed between the generated "result map" ŷ and, respectively, the target "style map" (target attribute matrix) y_s and the original graph (i.e., the native feature map) x. The weights of the generating network are adjusted according to the loss value, and the conversion is achieved by minimizing the loss value.

The activation function adopts a tanh activation function, and down(·) is the down-sampling function adopted by the down-sampling layer. The "style" loss value is calculated with the style loss function; the content loss value and the "style" loss value are weighted and summed to obtain a new "graph"; and correlation processing is performed between the selected basic features.

Example 2:

This embodiment is further optimized on the basis of the above embodiment. A unified quantification method for cyber threat attack characteristics based on style migration, as shown in Fig. 1, Fig. 2 and Fig. 3, includes the following steps:

S1, data acquisition:

Network flow data is acquired in real time from basic security detection units such as firewalls, intrusion detection systems, vulnerability scanning systems, antivirus systems and terminal security management systems, and from comprehensive sources such as security management platforms and security operation centers. The acquired content includes packet size, duration, packet timing sequence, flow duration, packet inter-arrival time, accessed domain name, port, IP, URL, DNS and other information.

S2, selecting basic characteristics of the network flow:

The selected basic characteristics of the network flow data include the source IP address, destination IP address, source port number, destination port number, protocol type, total number of data packets, number of null data packets, ratio of the numbers of incoming and outgoing data packets, number of reconnections, duration of the flow, length of the first data packet, total number of bytes, average number of bytes per packet, variance of the number of bytes per packet, average packet length, ratio of the number of packets of the same length to the total number of packets, standard deviation of the data packet length, average number of bits per second, average arrival interval of data packets, average number of packets per second, and so on; practical applications are not limited to these.
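Several of the per-flow statistics listed above can be derived from packet timestamps and sizes alone. The following is a minimal sketch under stated assumptions: the input format (a time-ordered list of `(timestamp_seconds, size_bytes)` tuples), the function name `basic_flow_features` and the dictionary keys are all hypothetical and are not taken from the source.

```python
from statistics import mean, pvariance


def basic_flow_features(packets):
    """Compute a few basic flow statistics.

    `packets` is a non-empty, time-ordered list of
    (timestamp_seconds, size_bytes) tuples (a hypothetical input format).
    """
    times = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    duration = times[-1] - times[0]
    # Gaps between consecutive packet arrivals.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "total_packets": len(packets),
        "total_bytes": sum(sizes),
        "duration": duration,
        "first_packet_length": sizes[0],
        "mean_bytes_per_packet": mean(sizes),
        "var_bytes_per_packet": pvariance(sizes),
        "mean_interarrival": mean(gaps) if gaps else 0.0,
        "packets_per_second": len(packets) / duration if duration > 0 else 0.0,
    }


# Three packets observed over two seconds.
flow = [(0.0, 100), (0.5, 300), (2.0, 200)]
features = basic_flow_features(flow)
```

Address, port and protocol fields would be read directly from packet headers and are omitted here.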

S3, establishing a model, including:

S3.1, data preprocessing based on the Minkowski Distance:

First, the n flow feature vectors are defined as X = [x₁, x₂, ..., x_j, ..., x_n]. Preferably, 142 attributes of the network flow data are used as basic features, i.e., n = 142; when different data is employed, the value of n varies with the data source.

Each x_j is then written out by its components, where the k-th component is the value of the k-th feature in the j-th network flow feature vector, j = 1, 2, …, m; k = 1, 2, …, n.

Next, x_j is converted into a matrix x'_j of m rows and m columns in order to exploit the correlation between different features, see equation (1).

The matrix x'_j has m-dimensional features as in formula (2), where each column of x'_j is represented as an m-dimensional vector.

To define (process) the correlation between different features in the feature vector of the basic features of the network flow data, the invention employs the Minkowski distance. For two m-dimensional vectors a and b and order p ≥ 1 it is

$$d_p(a, b) = \left(\sum_{i=1}^{m} \lvert a_i - b_i \rvert^{p}\right)^{1/p}$$

The advantage of the Minkowski Distance is that, when calculating the correlation between different features, it generalizes easily from 2 dimensions to n dimensions and quickly gives the difference between two categories. The j-th traffic record is transformed into a symmetric matrix E of m rows and m columns whose diagonal elements are all zero.

In practical application, once network flow data, whether normal or abnormal, is mapped into E using the Minkowski Distance, normal and abnormal flows show significant characteristic differences. The basic feature values of the network flow data are thus converted into a "graph" (i.e., a native feature map) suitable for deep learning analysis, which is used for further processing.
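The mapping from one feature vector to a symmetric, zero-diagonal distance matrix E can be sketched as follows. Because equations (1) and (2) are not reproduced in this text, the sketch assumes that x'_j is the diagonal matrix built from the feature vector, so that each column holds a single feature value; that construction, the order parameter `p` and the names `minkowski` and `native_feature_map` are assumptions, not the patent's stated method.

```python
def minkowski(a, b, p=2.0):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(u - v) ** p for u, v in zip(a, b)) ** (1.0 / p)


def native_feature_map(x, p=2.0):
    """Map a feature vector x of length m to an m x m distance matrix E.

    Assumption: x'_j = diag(x), so column k is x[k] times the k-th unit
    vector. E[a][b] is the Minkowski distance between columns a and b.
    """
    m = len(x)
    columns = [[x[k] if i == k else 0.0 for i in range(m)] for k in range(m)]
    return [[minkowski(columns[a], columns[b], p) for b in range(m)]
            for a in range(m)]


# A three-feature vector gives a 3 x 3 map; with p = 2, E[0][1] = sqrt(3^2 + 4^2).
E = native_feature_map([3.0, 4.0, 0.0])
```

Under this assumption the off-diagonal entry for features a and b reduces to (|x_a|^p + |x_b|^p)^(1/p), and the result is symmetric with a zero diagonal, as the text requires of E.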

S3.2, processing at the generation network stage:

In theory, an image style migration method can extract, through a multi-layer deep learning model, the relevant key high-level features that an image may contain. The work done here is essentially to use deep learning to find more complex, intrinsic features that reflect the higher-level characteristics of the cyber threat more fundamentally, and to quantify those features uniformly. That is, any given "original graph" (i.e., native feature map) is input into the trained generation network, and the result after matrix conversion is output.

Because the convolutional neural network reduces the number of parameters to be trained through receptive fields and weight sharing, the training performance of the BP algorithm is greatly improved, and the low-level and high-level features needed for classification can be extracted automatically; convolution can therefore efficiently and automatically learn high-dimensional attributes of an image, including spatial frequency, edges and colors. Given the need for more efficient and faster data processing in the network security field, the CNN algorithm is better suited to expressing network threat characteristics than DNN, RNN and DBN. In this embodiment, therefore, a CNN is used to build the generation network; that is, the generation network is implemented as an end-to-end deep convolutional neural network, which replaces ordinary matrix multiplication with convolution and comprises 2 convolutional layers, 2 max-pooling layers and 2 fully connected layers. The network is simple in structure and simple to implement. The specific steps are as follows:

S3.2.1: convolutional layer processing:

the convolutional layer is built according to formula (5),

$$x_j^{l} = f\left(\sum_{i \in M_j} x_i^{l-1} * k_{ij}^{l} + b_j^{l}\right) \tag{5}$$

where l represents the layer index, k is the convolution kernel, M_j represents the selection of input feature maps, b is a bias term and f is the activation function. In the convolutional layer, the trained convolution kernels are convolved with the feature maps of the previous layer, and the result is passed through the tanh activation function to output the feature map of the current layer; each feature map of the current layer is thus a combination of several feature maps of the previous layer.
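A forward pass through one convolutional layer with tanh activation can be sketched in plain Python. The sketch applies the kernel without flipping it (cross-correlation, as most CNN libraries do) and uses no padding and stride 1; those choices, the nested-list data layout and the names `conv2d_valid` and `conv_layer` are assumptions for illustration only.

```python
import math


def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists), no padding, stride 1.

    The kernel is applied unflipped (cross-correlation).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + u][c + v] * kernel[u][v]
                 for u in range(kh) for v in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def conv_layer(prev_maps, kernels, biases, selection):
    """Compute output map j as tanh(sum over i in M_j of x_i * k_ij + b_j).

    prev_maps : list of feature maps from the previous layer
    kernels   : dict mapping (i, j) to the kernel k_ij (hypothetical layout)
    biases    : list with one bias b_j per output map
    selection : selection[j] is M_j, the non-empty list of input map indices
    """
    outputs = []
    for j, chosen in enumerate(selection):
        acc = None
        for i in chosen:
            resp = conv2d_valid(prev_maps[i], kernels[(i, j)])
            if acc is None:
                acc = resp
            else:
                acc = [[a + b for a, b in zip(ra, rb)]
                       for ra, rb in zip(acc, resp)]
        outputs.append([[math.tanh(v + biases[j]) for v in row] for row in acc])
    return outputs


# Two 2x2 input maps combined into a single 1x1 output map.
prev = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 1.0], [1.0, 0.0]]]
kernels = {(0, 0): [[1.0, 0.0], [0.0, 1.0]], (1, 0): [[1.0, 1.0], [1.0, 1.0]]}
out = conv_layer(prev, kernels, biases=[-7.0], selection=[[0, 1]])
```

In the example the two kernel responses are 5 and 2, so with a bias of -7 the pre-activation is 0 and the single output value is tanh(0) = 0.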

S3.2.2: down-sampling layer processing:

the down-sampling layer is built using the sampling function of formula (6). The down-sampling layer performs a sampling operation on its input (the feature maps output by S3.2.1); it does not change the number of feature maps, only their size,

$$x_j^{l} = f\left(w_j^{l}\,\mathrm{down}\left(x_j^{l-1}\right) + b_j^{l}\right) \tag{6}$$

where down(·) represents the down-sampling function, and each output feature map has its own w and b.
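The down-sampling step can be sketched as max pooling over non-overlapping windows followed by a per-map scale, bias and tanh. The window size of 2, the use of tanh after pooling and the names `down` and `downsample_layer` are assumptions for illustration; the source states only that max-pooling layers are used and that each output map has its own w and b.

```python
import math


def down(feature_map, size=2):
    """Max-pool non-overlapping size x size windows of a list-of-lists map."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [[max(feature_map[r * size + u][c * size + v]
                 for u in range(size) for v in range(size))
             for c in range(cols)]
            for r in range(rows)]


def downsample_layer(maps, weights, biases, size=2):
    """Apply tanh(w_j * down(x_j) + b_j) to every input map x_j.

    The number of maps is unchanged; only their height and width shrink.
    """
    out = []
    for fmap, w, b in zip(maps, weights, biases):
        pooled = down(fmap, size)
        out.append([[math.tanh(w * v + b) for v in row] for row in pooled])
    return out


fmap = [[1.0, 2.0, 0.0, 1.0],
        [3.0, 4.0, 1.0, 0.0],
        [0.0, 0.0, 2.0, 2.0],
        [0.0, 1.0, 2.0, 5.0]]
pooled = down(fmap)
layer = downsample_layer([fmap], weights=[1.0], biases=[0.0])
```

Here the 4 x 4 map is reduced to 2 x 2 while the number of maps stays at one.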

S3.2.3: fully connected layer processing:

The output data of S3.2.2 is input to the fully connected layer. The fully connected layer is located at the last layer or layers of the CNN and outputs the calculation result.

S3.3, processing at the loss network stage:

Although step S3.2 completes a preliminary unified quantification, its result may not be optimal. Therefore a content loss function and a style loss function need to be established and iterated repeatedly so that the unified quantification result of the features in the generation network becomes optimal. The specific steps are as follows:

S3.3.1: establishing a content loss function and a style loss function:

The loss function comprises two definitions, a content loss and a style loss, which measure the gap in content and in "style" respectively. For each native feature map x there is a content target y_c and a style target y_s. For "style" conversion, the content target y_c is the input native feature map x, and the output matrix ŷ combines the "style" y_s with the content x. The system trains one network for each target "style". The loss network is also a convolutional neural network (CNN), but its parameters are not updated; it is used only to calculate the content loss and the style loss, while the weight parameters of the preceding generation network are the ones trained and updated.

S3.3.2: calculating the content loss:

A content loss function has been defined in step S3.3.1; in this step the content loss value is calculated. Preferably, the content loss value is calculated with a VGG model from deep learning, which extracts the higher-level features. Because the VGG model was originally designed for image classification, a trained VGG model can effectively extract the higher-level features of an image (the "native feature map"); these higher-level abstract features are the key factors reflecting the various threats. The content loss is calculated by formula (7):

$$\ell_{content}^{\phi,j}(\hat{y}, y_c) = \frac{1}{C_j H_j W_j}\left\lVert \phi_j(\hat{y}) - \phi_j(y_c) \right\rVert_2^2 \tag{7}$$

where φ_j(x) denotes the output of the j-th layer of the loss network φ for input x, a feature map of shape C_j × H_j × W_j.

Further, from the image processing point of view, when a new image ŷ (new feature map) is found that minimizes the feature loss, and higher layers rather than lower layers are used to reconstruct it, the content and global structure of the original map are preserved, but color, texture and exact shape are not, because higher-level image features are abstract. Likewise, the basic characteristics formerly used for network threat perception, such as IP address, source port number, destination port number and protocol type, no longer exist independently but are merged and abstracted into higher-level threat characteristics.
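A feature-based content loss can be sketched as the mean squared difference between two layer activations. This assumes the normalized squared-error form commonly used for perceptual losses; the nested-list layout `[C][H][W]` and the name `content_loss` are hypothetical, and in practice the activations would come from a pretrained VGG network rather than being written by hand.

```python
def content_loss(feat_hat, feat_target):
    """Squared difference between two [C][H][W] activations, divided by C*H*W.

    `feat_hat` stands for the loss-network activations of the generated
    result and `feat_target` for those of the content target.
    """
    c, h, w = len(feat_hat), len(feat_hat[0]), len(feat_hat[0][0])
    squared = sum((feat_hat[k][r][s] - feat_target[k][r][s]) ** 2
                  for k in range(c) for r in range(h) for s in range(w))
    return squared / (c * h * w)


# One channel, 2 x 2 spatial size; two of the four entries differ by 2.
a = [[[1.0, 2.0], [3.0, 4.0]]]
b = [[[1.0, 0.0], [3.0, 2.0]]]
loss_ab = content_loss(a, b)
```

The loss is zero when the two activations coincide and grows with the squared deviation at each position.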

S3.3.3: calculating the "style" loss:

The "style" loss function has been defined in step S3.3.1. When the "style" loss value is calculated, the analysis does not stay at the level of individual pixels; instead the "image" is abstracted in a higher dimension to form the "style" of the image. For network threat perception, the higher-level features of the network threat data are extracted automatically, and these extracted features are more beneficial to identifying the network threat. To achieve this, define φ_j(x) to represent the j-th layer of the network φ with input x. The shape of the feature map is C_j × H_j × W_j. Define the matrix C_j(x) as a C_j × C_j matrix (feature matrix) whose elements are given by formula (8):

$$C_j(x)_{c,c'} = \frac{1}{C_j H_j W_j}\sum_{h=1}^{H_j}\sum_{w=1}^{W_j}\phi_j(x)_{h,w,c}\,\phi_j(x)_{h,w,c'} \tag{8}$$

If φ_j(x) is understood as C_j-dimensional features, each of size H_j × W_j, then C_j(x) on the left side of the above formula is proportional to the uncentered covariance of the C_j-dimensional features. The Gram matrix can be calculated very efficiently: by reshaping φ_j(x) into a matrix ψ of shape C_j × H_jW_j, C_j(x) is

$$C_j(x) = \frac{\psi\,\psi^{T}}{C_j H_j W_j}$$

The style loss is calculated as equation (9):

$$\ell_{style}^{\phi,j}(\hat{y}, y_s) = \left\lVert C_j(\hat{y}) - C_j(y_s) \right\rVert_F^2 \tag{9}$$
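The Gram-matrix computation and a style loss built on it can be sketched as follows, assuming the squared Frobenius norm of the difference between the two Gram matrices as the loss. The nested-list layout `[C][H][W]` and the names `gram_matrix` and `style_loss` are hypothetical.

```python
def gram_matrix(feat):
    """Return the C x C Gram matrix of a [C][H][W] activation.

    Each channel is flattened to a row of length H*W; entry (a, b) is the
    dot product of rows a and b divided by C*H*W.
    """
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    psi = [[v for row in channel for v in row] for channel in feat]
    scale = c * h * w
    return [[sum(p * q for p, q in zip(psi[a], psi[b])) / scale
             for b in range(c)]
            for a in range(c)]


def style_loss(feat_hat, feat_style):
    """Squared Frobenius norm of the difference between two Gram matrices."""
    g_hat, g_sty = gram_matrix(feat_hat), gram_matrix(feat_style)
    return sum((u - v) ** 2
               for row_u, row_v in zip(g_hat, g_sty)
               for u, v in zip(row_u, row_v))


# Two channels on a 2 x 2 grid; the second channel is twice the first.
feat = [[[1.0, 0.0], [0.0, 1.0]],
        [[2.0, 0.0], [0.0, 2.0]]]
G = gram_matrix(feat)

# The same two channels on a 1 x 4 grid still give a 2 x 2 Gram matrix.
wide = [[[1.0, 0.0, 0.0, 1.0]], [[2.0, 0.0, 0.0, 2.0]]]
loss_wide = style_loss(feat, wide)
```

The Gram matrix depends only on the number of channels, not on the spatial size, so activations with different heights and widths can be compared directly.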

S3.4: model parameter adjustment and optimization

The content loss and the style loss are calculated in S3.3.2 and S3.3.3 respectively. In this step the two losses are weighted and summed to obtain a new "graph", which is the final unified quantification of the characteristics, as shown in formula (10), where α is a parameter tuned experimentally.

The style reconstruction process is adaptive: when the output and the target have different sizes, the Gram matrix still brings them to the same dimension, so that features of different dimensions n are finally quantized uniformly. Seen from the structure of the whole deep learning training network, the native feature map is input to the generation network to obtain a converted matrix, the corresponding loss is then calculated, and the training network continuously updates the weights of the preceding generation network by minimizing the loss. Finally the optimal generation network weight value is obtained, and this value is the final result of the unified quantification of the network attack characteristics.
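The idea of updating the generation network by minimizing a weighted sum of two losses can be shown with a toy scalar model. Formula (10) is not reproduced in this text, so the convex combination `alpha * content + (1 - alpha) * style` is an assumption, as are the one-parameter "generator" f_W(x) = W·x, the squared-error stand-ins for both losses, and the names `total_loss` and `train_scalar_generator`.

```python
def total_loss(content, style, alpha):
    """Assumed weighted sum: alpha * content + (1 - alpha) * style."""
    return alpha * content + (1.0 - alpha) * style


def train_scalar_generator(x, y_s, alpha, lr=0.05, steps=200):
    """Gradient descent on a toy generator f_W(x) = W * x.

    Content loss stand-in: (W*x - x)^2; style loss stand-in: (W*x - y_s)^2.
    Only W is updated, mirroring the fixed loss network described above.
    """
    w = 0.0
    for _ in range(steps):
        y_hat = w * x
        # Derivative of the weighted sum with respect to W.
        grad = 2.0 * x * (alpha * (y_hat - x) + (1.0 - alpha) * (y_hat - y_s))
        w -= lr * grad
    return w


# With x = 2, y_s = 4 and alpha = 0.5 the minimizer satisfies W * x = 3.
w_star = train_scalar_generator(x=2.0, y_s=4.0, alpha=0.5)
```

The learned output W·x settles between the content target and the style target, at a point set by alpha; a real generation network would be trained the same way, with backpropagation supplying the gradient.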

S4, unified quantification of network threat attack characteristics

The optimal weight value is obtained in step S3.4. In this step, any given "original graph" (i.e., native feature map) is input into the trained generation network, and the result after matrix conversion is output.

The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention in any way, and all simple modifications and equivalent variations of the above embodiments according to the technical spirit of the present invention are included in the scope of the present invention.

Claims

1. A unified quantification method for network threat attack characteristics based on style migration uniformly processes native characteristic graphs of different dimensions, and is characterized in that: the method comprises the following steps:

1) data acquisition: collecting network flow data in real time;

2) selecting basic characteristics of network flow data;

2. The unified quantification method of cyber threat attack characteristics based on style migration according to claim 1, characterized in that: the step 4) comprises the following specific steps:

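3. The unified quantification method of cyber threat attack characteristics based on style migration according to claim 2, characterized in that: the convolution function adopted by the convolution layer is as follows:

$$x_j^{l} = f\left(\sum_{i \in M_j} x_i^{l-1} * k_{ij}^{l} + b_j^{l}\right)$$

where l represents the layer index, k is the convolution kernel, M_j represents the selection of input feature maps, b is a bias term and f is the activation function;

the activation function adopts a tanh activation function;

the down-sampling layer adopts a sampling function of:

$$x_j^{l} = f\left(w_j^{l}\,\mathrm{down}\left(x_j^{l-1}\right) + b_j^{l}\right)$$

where down(·) is a down-sampling function, and w and b are the weight and bias of each output feature map.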

4. The unified quantification method of cyber threat attack characteristics based on style migration according to claim 1, characterized in that: the step 5) comprises the following specific steps:

5.3) utilizing the style loss function to calculate the "style" loss value.

5. The unified quantification method of cyber threat attack characteristics based on style migration according to claim 4, wherein: the step 6) is specifically as follows:

passing the content loss value and the 'style' loss value through a weighted-sum function to obtain a new 'graph'.

6. The unified quantification method for the cyber threat attack characteristics based on style migration according to any one of claims 1 to 5, wherein: the network flow data comprises data from a firewall, an intrusion detection system, a vulnerability scanning system, an anti-virus system, a terminal security management system, a security management platform and a security operation center.

7. The unified quantification method for the cyber threat attack characteristics based on style migration according to any one of claims 1 to 5, wherein: in step 2), the selected basic characteristics of the network flow data include: a source IP address, a destination IP address, a source port number, a destination port number, a protocol type, a total number of data packets, a number of null data packets, a ratio of a number of ingress and egress data packets, a number of reconnections, a duration of a flow, a length of a first data packet, a total number of bytes, an average number of bytes per packet, a variance of a number of bytes per packet, an average packet length, a ratio of a number of packets of the same length to a total number of packets, a standard deviation of data packet lengths, an average number of bits per second, an average interval of arrival of data packets, an average number of packets per second.

8. The unified quantification method for the cyber threat attack characteristics based on style migration according to any one of claims 1 to 5, wherein: in the step 3), when the basic features of the network flow are converted, the Minkowski Distance calculation formula is adopted to perform correlation processing between the selected basic features.