CN117318980A - Self-supervised learning method for malicious traffic detection in small-sample scenarios

Info

Publication number: CN117318980A
Application number: CN202310910097.9A
Authority: CN
Inventors: 沈蒙; 叶珂; 贾冀哲; 王伟; 岳光纯; 张大伟; 吴金贺; 欧嵬; 祝烈煌
Original assignee: Beijing Institute of Technology BIT
Current assignee: Beijing Institute of Technology BIT
Priority date: 2023-07-10
Filing date: 2023-07-24
Publication date: 2023-12-29

Abstract

The invention discloses a self-supervised-learning-based method for detecting malicious traffic under small-sample conditions, and belongs to the technical field of encrypted network traffic classification. Analysis of the traffic interaction process shows that three features of a flow (packet length, packet protocol, and packet inter-arrival time) effectively distinguish different types of traffic. A continuous bag-of-words (CBOW) model is used for feature embedding, and a traffic representation matrix is constructed. This matrix is combined with a self-supervised learning model to learn features from unlabeled traffic samples and to build a traffic feature encoder network. On this basis, a fully connected layer is trained with a small number of labeled traffic samples, and the encoder is connected to the fully connected layer to obtain the malicious traffic detection model. Because the learning process uses only a small amount of labeled data, the method addresses the problem that labeled malicious traffic samples are scarce, which makes supervised models that need many samples hard to build, and it achieves malicious traffic detection under small-sample conditions.

Description

Self-supervised learning method for malicious traffic detection in small-sample scenarios

Technical Field

The invention relates to a self-supervised-learning-based method for detecting malicious traffic under small-sample conditions, and belongs to the technical field of encrypted network traffic classification.

Background

With the rapid development of the Internet, network topologies have grown more complex and the number of devices has increased, and malicious network traffic has grown explosively. Conventional malicious traffic detection methods typically scan packet contents for predetermined fixed string patterns, such as signatures unique to malicious traffic that are unlikely to appear in any benign traffic. However, with the adoption of encryption protocols (e.g., SSL/TLS), payload-based anomaly detection is becoming less effective. A malicious traffic detection method suited to encrypted traffic is therefore needed.

To detect encrypted malicious traffic, most current methods are built on supervised classification models: they treat malicious traffic detection as a classification task and train a classifier on both normal and malicious traffic. This approach, however, requires a large amount of labeled traffic data. Traffic collected in the real world, for example at a gateway, is unlabeled and difficult to label manually. Building a labeled malicious traffic data set also requires an isolated cyber range environment and a long traffic collection process, which makes rapid online deployment of a detection model difficult. To detect malicious traffic effectively, a self-supervised-learning-based detection method for small-sample conditions is therefore needed.

Disclosure of Invention

The invention addresses the problem that labeled malicious traffic samples are scarce while supervised learning models require a large number of samples. Its main purpose is to provide a self-supervised-learning-based method for detecting malicious traffic under small-sample conditions, which detects malicious traffic in encrypted network traffic by using a large amount of unlabeled traffic data and a small amount of labeled traffic data. The method extracts the attributes of each packet, constructs the feature vector of each flow by feature embedding, trains a feature encoder on unlabeled data only, freezes the encoder network parameters, trains a fully connected layer on a small amount of labeled data, and connects the encoder to the fully connected layer to build the malicious traffic detection model, so that malicious traffic can be blocked in time and device resources are protected from compromise.

The aim of the invention is achieved by the following technical scheme.

The invention discloses a self-supervised-learning-based method for detecting malicious traffic under small-sample conditions. Analysis of the traffic interaction process shows that three features of a flow (packet length, packet protocol, and packet inter-arrival time) effectively distinguish different types of traffic. A continuous bag-of-words (CBOW) model is used for feature embedding, and a traffic representation matrix is constructed. The matrix is combined with a self-supervised learning model to learn features from unlabeled traffic samples and to build a traffic feature encoder network. On this basis, a fully connected layer is trained with a small number of labeled traffic samples, and the encoder is connected to the fully connected layer to obtain the malicious traffic detection model. Because the learning process uses only a small amount of labeled data, the method addresses the problem that labeled malicious traffic samples are scarce, which makes supervised models that need many samples hard to build, and it achieves malicious traffic detection under small-sample conditions.

The self-supervised-learning-based method for detecting malicious traffic under small-sample conditions disclosed by the invention comprises the following steps:

Step 1: For a flow, extract the packet length, packet protocol, and packet inter-arrival time of a specific number of leading packets, construct a feature matrix, and store the traffic sample for the traffic feature embedding in step 2.

For a flow $T = \{p_1, p_2, \ldots, p_n\}$ containing $n$ packets $p_i$ ($i \le n$), extract the packet length ($l_i$), packet protocol ($q_i$), and packet inter-arrival time ($t_i$) of the first $B$ ($B \le n$) packets, construct a $3 \times B$ traffic feature matrix, and store the traffic sample for the traffic feature embedding in step 2.
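As an illustration of step 1, the following minimal sketch builds the $3 \times B$ matrix from a list of packets. The packet record layout, the zero inter-arrival time for the first packet, and the zero padding of short flows are assumptions made for the example and are not specified in the text.

```python
from typing import List, Tuple

# Hypothetical packet record: (timestamp in seconds, length in bytes, protocol code).
Packet = Tuple[float, int, int]

def flow_feature_matrix(packets: List[Packet], B: int) -> List[List[float]]:
    """Return a 3 x B matrix: row 0 lengths, row 1 protocol codes, row 2 inter-arrival times."""
    head = packets[:B]
    lengths = [float(p[1]) for p in head]
    protocols = [float(p[2]) for p in head]
    # Assumption: the first packet has no predecessor, so its inter-arrival time is 0.
    gaps = [0.0 if i == 0 else head[i][0] - head[i - 1][0] for i in range(len(head))]
    # Assumption: flows shorter than B are zero-padded (the text requires B <= n).
    pad = [0.0] * (B - len(head))
    return [lengths + pad, protocols + pad, gaps + pad]

flow = [(0.00, 74, 2), (0.05, 74, 2), (0.07, 583, 5), (0.20, 1500, 5)]
matrix = flow_feature_matrix(flow, B=3)
```

Each column of `matrix` describes one packet, so the three rows can later be embedded attribute by attribute.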

Step 2: Taken in isolation, the length, protocol, and inter-arrival time of a packet lack contextual semantic information and cannot express traffic characteristics accurately. A CBOW model is therefore used to embed each packet attribute using a specific number of packets before and after the packet. Through this embedding, the three features (packet length, protocol, and inter-arrival time) are each expanded to a specific dimension, which enriches the traffic features. The CBOW model consists of an input layer, a hidden layer, and an output layer; training it yields a weight matrix that is used for feature embedding. All distinct attribute values of the packets are counted to construct an attribute value dictionary. The input layer of CBOW is the one-hot encoded representation of several features; the hidden layer and the output layer are obtained by matrix calculation, and the output layer gives the probability of each predicted packet attribute value. The attribute value with the maximum probability is taken as the output, its error against the true value is calculated, and the network parameters are updated by back-propagation. After multiple iterations, the network parameter matrix serves as the feature vector matrix for feature embedding: each row corresponds to one attribute value, and features are expanded into vectors of a specific dimension according to this matrix. Following these steps, CBOW models for embedding the packet length, the protocol, and the inter-arrival time are trained in turn, and the corresponding feature vector matrices are obtained. Each attribute value is matched to its feature vector, and any attribute value that cannot be matched is replaced by a zero vector. This completes the traffic feature embedding: one-dimensional features are expanded to high dimensions, contextual semantic information is incorporated, the traffic features are enriched, and the traffic representation is obtained.

All distinct attribute values of the packets are counted to construct an attribute value dictionary $D = \{S_1, S_2, \ldots, S_V\}$, where $V$ is the number of distinct attribute values. The input layer is the one-hot encoded representation of several features $\{x_1, x_2, \ldots, x_C\}$, where $x_i$ ($i \le C$) is the one-hot encoding of the $i$-th feature. The hidden layer is calculated as:

$$h = \frac{1}{C} W^T (x_1 + x_2 + \cdots + x_C)$$

where $W$ is a $V \times N$ matrix and $N$ is the dimension of the hidden layer. The hidden layer $h$ is passed through the matrix $W'$ to obtain the output layer $o$:

$$o = W'^T h$$

where $W'$ is an $N \times V$ matrix. From the output layer, the probability that the predicted packet attribute value is $S_k$ is:

$$P(S_k \mid x_1, \ldots, x_C) = \frac{\exp(o_k)}{\sum_{j=1}^{V} \exp(o_j)}$$

The attribute value with the maximum probability is taken as the output, its error against the true value is calculated, and the matrices $W$ and $W'$ are updated by back-propagation. After multiple iterations, $W$ is the feature vector matrix, and its $k$-th row $W_k$ is the feature vector of attribute value $S_k$.

Following these steps, the feature vectors of the packet length, the packet protocol, and the packet inter-arrival time are calculated in turn. Each attribute value is matched to its feature vector, and any attribute value that cannot be matched is replaced by a zero vector. This completes the traffic feature embedding: one-dimensional features are expanded to high dimensions, contextual semantic information is incorporated, the traffic features are enriched, and the traffic representation is obtained.
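The CBOW training and lookup of step 2 can be sketched in plain Python as follows. This is a minimal illustration rather than the patented implementation: the function names, the softmax cross-entropy gradient used as the error signal, the random initialization range, and the toy hyperparameters are all assumptions.

```python
import math
import random
from typing import Dict, List

def train_cbow(seqs: List[List[int]], window: int = 2, dim: int = 8,
               lr: float = 0.05, epochs: int = 50, seed: int = 0) -> Dict[int, List[float]]:
    """Train a tiny CBOW model and return {attribute value: its row of W}."""
    rng = random.Random(seed)
    vocab = sorted({v for s in seqs for v in s})        # dictionary D = {S_1, ..., S_V}
    idx = {v: k for k, v in enumerate(vocab)}
    V = len(vocab)
    W = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(V)]    # V x N
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(V)] for _ in range(dim)]   # N x V
    for _ in range(epochs):
        for s in seqs:
            for t, target in enumerate(s):
                lo, hi = max(0, t - window), min(len(s), t + window + 1)
                ctx = [idx[s[j]] for j in range(lo, hi) if j != t]
                if not ctx:
                    continue
                # Hidden layer: mean of the context rows of W (W^T times the mean one-hot).
                h = [sum(W[c][n] for c in ctx) / len(ctx) for n in range(dim)]
                o = [sum(W2[n][k] * h[n] for n in range(dim)) for k in range(V)]
                m = max(o)
                e = [math.exp(x - m) for x in o]
                z = sum(e)
                # Softmax probability minus one-hot target (assumed cross-entropy error).
                err = [e[k] / z - (1.0 if k == idx[target] else 0.0) for k in range(V)]
                grad_h = [sum(W2[n][k] * err[k] for k in range(V)) for n in range(dim)]
                for n in range(dim):
                    for k in range(V):
                        W2[n][k] -= lr * err[k] * h[n]
                for c in ctx:
                    for n in range(dim):
                        W[c][n] -= lr * grad_h[n] / len(ctx)
    return {v: W[idx[v]] for v in vocab}

def embed(values: List[int], table: Dict[int, List[float]], dim: int) -> List[List[float]]:
    """Look up each attribute value; values missing from the dictionary become the 0 vector."""
    return [table.get(v, [0.0] * dim) for v in values]

lengths = [[74, 74, 583, 1500, 66], [74, 66, 583, 1500, 66]]
table = train_cbow(lengths, dim=8)
vectors = embed([74, 9999, 583], table, dim=8)
```

The same routine would be run three times, once per attribute, and the value `9999`, which never occurs in training, maps to the zero vector as the text requires.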

Step 3: Traffic sample data augmentation. A malicious attacker may use traffic obfuscation to make traffic harder to detect. To account for this, part of the information in the original traffic is destroyed by adding random packets and random delays. The resulting samples are used for the encoder network training in step 4, where the correlation between the original traffic and the obfuscated traffic is learned from the retained features, so that malicious traffic whose concealment has been increased by obfuscation can be detected.

For a flow $T$, a redundant packet of length $l'_i$ ($0 \le l'_i \le 1500$) is added after each packet $p_i$ with a preset probability. At the same time, a preset proportion of the packets is selected by simple random sampling, and a random delay $\Delta t_i$ ($0 \le \Delta t_i \le 0.2$ s) is added to each selected packet so that its inter-arrival time becomes $t_i + \Delta t_i$. The features of the obfuscated traffic are extracted, and the data-augmented traffic sample is obtained through feature embedding. The data-augmented sample is used for the encoder network training in step 4, where the correlation between the original traffic and the obfuscated traffic is learned from the retained features, so that malicious traffic whose concealment has been increased by obfuscation can be detected.
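The obfuscation in step 3 can be sketched as follows. The function name `obfuscate`, the packet tuple layout, and the choice to give an inserted packet a zero inter-arrival time and the protocol of its predecessor are assumptions for the example; the text fixes only the length range, the delay range, and the use of simple random sampling.

```python
import random
from typing import List, Tuple

# Hypothetical packet record: (inter-arrival time in seconds, length in bytes, protocol code).
Packet = Tuple[float, int, int]

def obfuscate(flow: List[Packet], insert_prob: float, delay_ratio: float,
              rng: random.Random) -> List[Packet]:
    """Return an augmented copy of a flow with redundant packets and random delays."""
    # Simple random sampling (without replacement) of the packets that receive a delay.
    delayed = set(rng.sample(range(len(flow)), round(delay_ratio * len(flow))))
    out: List[Packet] = []
    for i, (gap, length, proto) in enumerate(flow):
        if i in delayed:
            gap += rng.uniform(0.0, 0.2)            # random delay in [0, 0.2] s
        out.append((gap, length, proto))
        if rng.random() < insert_prob:
            # Redundant packet after p_i with a random length in [0, 1500] bytes.
            out.append((0.0, rng.randint(0, 1500), proto))
    return out

flow = [(0.01, 100 + i, 2) for i in range(10)]
augmented = obfuscate(flow, insert_prob=0.1, delay_ratio=0.5, rng=random.Random(0))
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible, which helps when comparing the original and obfuscated views of the same flow.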

Step 4: Divide the whole data set into a training set and a test set according to a preset ratio, and use all training set data, without labels, to train the traffic feature encoder network. A convolutional neural network extracts deep vector representations from the original feature representation and the data-augmented feature representation of each flow. The similarity of all deep vector representations is calculated, a loss function is constructed from the similarity, and the encoder network parameters are calculated iteratively by back-propagation to obtain the traffic feature encoder network. Model pre-training uses unlabeled traffic samples only, which improves the efficiency of building the malicious traffic detection model.

The whole data set is divided into a training set and a test set according to a preset ratio, and all training set data, without labels, are used to train the self-supervised learning model. For the original feature representation and the data-augmented feature representation of each flow $T_i$, a convolutional neural network extracts the deep vector representations $s_i$ and $s'_i$. The similarity of all deep vector representations is calculated with a similarity function defined on pairs $(u, v)$, where $u$ and $v$ are the deep vector representations of two samples extracted by the convolutional neural network. A loss function is defined for each traffic sample from these similarities, where $Q$ is the total number of samples. The convolutional neural network iteratively updates its parameters according to the loss and thereby learns how to extract features from traffic samples automatically. The encoder network parameters are calculated iteratively by back-propagation of the loss function, finally yielding the traffic feature encoder network. Model pre-training uses unlabeled traffic samples only, which improves the efficiency of building the malicious traffic detection model.
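The exact similarity and loss formulas are not reproduced in this text. The sketch below therefore assumes a common choice that matches the description: cosine similarity between deep vector representations and a SimCLR-style normalized temperature-scaled cross-entropy (NT-Xent) loss. The temperature `tau` and both function names are assumptions and do not come from the source.

```python
import math
from typing import List, Sequence

def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two deep vector representations (assumed similarity function)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nt_xent(originals: List[Sequence[float]], augmented: List[Sequence[float]],
            tau: float = 0.5) -> float:
    """Mean contrastive loss over the 2Q views of Q flows (assumed NT-Xent form)."""
    views = list(originals) + list(augmented)
    Q = len(originals)
    total = 0.0
    for i in range(2 * Q):
        j = (i + Q) % (2 * Q)                       # the other view of the same flow
        positive = math.exp(cosine(views[i], views[j]) / tau)
        denom = sum(math.exp(cosine(views[i], views[k]) / tau)
                    for k in range(2 * Q) if k != i)
        total += -math.log(positive / denom)
    return total / (2 * Q)

s = [[1.0, 0.0], [0.0, 1.0]]                        # representations s_i of original flows
s_aug = [[0.9, 0.1], [0.1, 0.9]]                    # representations s'_i of obfuscated flows
aligned_loss = nt_xent(s, s_aug)
shuffled_loss = nt_xent(s, s_aug[::-1])
```

The loss is lower when each original representation is paired with its own obfuscated view than when the pairs are shuffled, which is the signal that drives the encoder to keep the features that survive obfuscation.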

To improve the effectiveness of the extracted features, the convolutional neural network is preferably ResNet-50.

Step 5: After the feature encoder network has been trained in step 4, connect a fully connected layer to it and keep the encoder network parameters fixed. Randomly select a specific proportion of the samples of each class in the training set and train the fully connected layer, with the loss function defined as cross entropy. The fully connected layer parameters are calculated by back-propagation of the loss function, and the trained malicious traffic detection model is obtained after a preset number of iterations.

A fully connected layer is added after the convolutional neural network layers, and n% of the samples of each class in the training set are randomly selected to train it. The loss function is the cross-entropy loss:

$$L = -\frac{1}{Q} \sum_{i=1}^{Q} \left[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \right]$$

where $Q$ is the total number of samples, $y_i$ indicates whether flow $T_i$ is malicious traffic ($y_i = 1$) or normal traffic ($y_i = 0$), and $p_i$ is the probability that flow $T_i$ is malicious traffic. The fully connected layer parameters are calculated by back-propagation of the loss function, and the trained malicious traffic detection model is obtained after the set number of iterations.
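Training the fully connected layer on frozen encoder outputs can be sketched as below. The sketch assumes a single sigmoid output unit, full-batch gradient descent, and a loss averaged over the Q samples; the toy feature vectors stand in for real encoder outputs, and all names are hypothetical.

```python
import math
from typing import List, Tuple

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def predict(x: List[float], w: List[float], b: float) -> float:
    """Probability p_i that a flow with encoder output x is malicious."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def bce(features: List[List[float]], labels: List[int], w: List[float], b: float) -> float:
    """Binary cross entropy averaged over the Q samples."""
    eps = 1e-12
    total = 0.0
    for x, y in zip(features, labels):
        p = predict(x, w, b)
        total += -(y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps))
    return total / len(features)

def train_head(features: List[List[float]], labels: List[int],
               lr: float = 0.5, epochs: int = 200) -> Tuple[List[float], float]:
    """Fit only the fully connected layer; the encoder outputs are treated as constants."""
    dim, Q = len(features[0]), len(features)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            diff = predict(x, w, b) - y             # gradient of the loss w.r.t. the logit
            for n in range(dim):
                gw[n] += diff * x[n] / Q
            gb += diff / Q
        w = [wi - lr * g for wi, g in zip(w, gw)]
        b -= lr * gb
    return w, b

features = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.8]]   # stand-in encoder outputs
labels = [1, 1, 0, 0]                                          # 1 = malicious, 0 = normal
w, b = train_head(features, labels)
loss = bce(features, labels, w, b)
```

Because the encoder is frozen, only the few parameters in `w` and `b` are fitted, which is why a small number of labeled samples can suffice.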

Step 6: The malicious traffic detection model is obtained by connecting the encoder network trained in step 4 to the fully connected layer trained in step 5. Training this model requires only a small amount of labeled data, and the model is used to detect malicious traffic in small-sample scenarios.
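The composition in step 6 can be illustrated with a small sketch. The column-mean encoder is only a stand-in for the trained convolutional encoder, and the 0.5 decision threshold, the example weights, and the function names are assumptions.

```python
import math
from typing import Callable, List, Tuple

Matrix = List[List[float]]

def make_detector(encoder: Callable[[Matrix], List[float]], weights: List[float],
                  bias: float, threshold: float = 0.5) -> Callable[[Matrix], Tuple[float, bool]]:
    """Chain a frozen encoder and a trained fully connected layer into one detector."""
    def detect(flow_repr: Matrix) -> Tuple[float, bool]:
        z = encoder(flow_repr)                                  # deep vector representation
        logit = sum(wi * zi for wi, zi in zip(weights, z)) + bias
        prob = 1.0 / (1.0 + math.exp(-logit))                   # probability of malicious traffic
        return prob, prob >= threshold
    return detect

def mean_encoder(flow_repr: Matrix) -> List[float]:
    """Stand-in encoder (hypothetical): column means of the flow representation."""
    rows = len(flow_repr)
    return [sum(col) / rows for col in zip(*flow_repr)]

detector = make_detector(mean_encoder, weights=[2.0, -2.0], bias=0.0)
prob, is_malicious = detector([[1.0, 0.0], [0.8, 0.2]])
```

At detection time no labels are needed: a flow is embedded, encoded, and scored in a single forward pass.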

Beneficial effects:

1. The method disclosed by the invention finds that three features of a flow (packet length, packet protocol, and packet inter-arrival time) effectively distinguish different types of traffic. It combines the extracted features with a self-supervised learning model to train a traffic feature encoder on unlabeled data, then trains a fully connected layer with a small number of labeled traffic samples, and connects the encoder to the fully connected layer to obtain the malicious traffic detection model. Because the learning process uses only a small amount of labeled data, the difficulty of building a model when labels are scarce is addressed, and malicious traffic detection in small-sample scenarios is achieved.

2. The method disclosed by the invention embeds each packet attribute using a specific number of packets before and after the packet. Through this embedding, the three features (packet length, protocol, and inter-arrival time) are each expanded to a specific dimension, which enriches the traffic features and improves the accuracy of malicious traffic detection.

3. The method disclosed by the invention obtains data-augmented obfuscated samples by randomly adding redundant packets and delays. The encoder network learns the correlation between the original traffic and the obfuscated traffic from the retained features, so that malicious traffic whose concealment has been increased by obfuscation can be detected using a large amount of unlabeled traffic data and a small amount of labeled traffic data.

Drawings

FIG. 1 is a flow chart of the self-supervised-learning-based method for detecting malicious traffic under small-sample conditions.

Detailed Description

For a better description of the objects and advantages of the present invention, the following description of the invention refers to the accompanying drawings and examples.

Example 1:

FIG. 1 is a flow chart of malicious traffic detection. This embodiment discloses a self-supervised-learning-based method for detecting malicious traffic under small-sample conditions, implemented in the following steps:

Step 1: Extract the packet length, packet protocol (IP, TCP, UDP, HTTP, TLS, DNS, and others, represented by the numbers 1 to 7 respectively), and packet inter-arrival time of the first 10 packets of each flow, construct a 3 × 10 traffic feature matrix, and store the traffic samples in CSV format.
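A minimal sketch of the protocol encoding and CSV storage in this step is shown below. The code assignment follows the order in which the protocols are listed; the row layout (10 lengths, then 10 protocol codes, then 10 inter-arrival times) and the function names are assumptions, since the text does not specify the CSV columns.

```python
import csv
import io
from typing import List

# Codes 1-6 follow the listed order; every other protocol maps to 7.
PROTOCOL_CODES = {"IP": 1, "TCP": 2, "UDP": 3, "HTTP": 4, "TLS": 5, "DNS": 6}
OTHER_PROTOCOL = 7

def encode_protocol(name: str) -> int:
    return PROTOCOL_CODES.get(name.upper(), OTHER_PROTOCOL)

def flow_to_csv_row(lengths: List[int], protocols: List[str], gaps: List[float]) -> list:
    """Flatten the 3 x 10 matrix row by row (hypothetical column layout)."""
    return list(lengths) + [encode_protocol(p) for p in protocols] + list(gaps)

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(flow_to_csv_row([74] * 10,
                                ["TCP"] * 3 + ["TLS"] * 6 + ["QUIC"],
                                [0.0] + [0.01] * 9))
line = buf.getvalue().strip()
```

Here `QUIC` is not in the list of named protocols, so it falls into the "others" category and is stored as 7.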

Step 2: For the length, protocol, and inter-arrival time of each packet, compute the embedded feature vector using the 2 packets before and the 2 packets after that packet. In this method, the learning rate of the CBOW model is set to 0.001 and the dimension of the hidden layer is set to 100, so each feature vector has dimension 1 × 100 and the feature representation of each flow has dimension 30 × 100.
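The 30 × 100 figure follows from 3 attributes × 10 packets, each embedded into 100 dimensions. The short sketch below checks that arithmetic; the stacking order of the three attribute blocks is an assumption, and zero vectors stand in for real embeddings.

```python
from typing import List

Matrix = List[List[float]]

def stack_embeddings(len_vecs: Matrix, proto_vecs: Matrix, gap_vecs: Matrix) -> Matrix:
    """Stack the per-packet embeddings of the three attributes into one flow matrix."""
    # Assumed order: all length vectors, then protocol vectors, then inter-arrival vectors.
    return len_vecs + proto_vecs + gap_vecs

B, N = 10, 100                                   # packets per flow, embedding dimension
placeholder = [[0.0] * N for _ in range(B)]      # stand-in for real CBOW embeddings
flow_repr = stack_embeddings(placeholder, placeholder, placeholder)
shape = (len(flow_repr), len(flow_repr[0]))
```

The resulting matrix is what the convolutional encoder consumes as a single-channel input.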

Step 3: Generate obfuscated traffic for each flow: add a random redundant packet before each packet with a probability of 10%, add a random delay to each packet with a probability of 50%, and extract the feature matrix of the obfuscated traffic to obtain the data-augmented sample.

Step 4: The public data set USTC-TFC2016 is used, which contains 10 classes of normal traffic and 10 classes of malicious traffic. The data are divided into a training set and a test set at a ratio of 8:2, the labels of the training set samples are removed, and the self-supervised learning model is trained on the unlabeled samples. The model builds its convolutional neural network following ResNet-50 to extract traffic features, with batch_size 20, learning rate 0.001, and 20 iterations.

Step 5: Freeze the convolutional neural network parameters and add a fully connected layer. Take 0.01%, 0.02%, 0.03%, 0.04%, 0.05%, 0.10%, 0.50%, and 1.00% of the training set samples with their labels (each class of traffic is sampled randomly at the given proportion) and train the fully connected layer, with batch_size 20, learning rate 0.001, and 20 iterations.

Step 6: To demonstrate the effectiveness of the self-supervised learning model, build a convolutional neural network and fully connected layer with the same structure as in step 5, initialize the network parameters randomly, and train the network using only 0.01%, 0.02%, 0.03%, 0.04%, 0.05%, 0.10%, 0.50%, and 1.00% of the training set samples with their labels (each class of traffic is sampled randomly at the given proportion), with batch_size 20, learning rate 0.001, and 20 iterations.

Step 7: The classifiers obtained in step 5 and step 6 are each evaluated on the test set for malicious traffic detection. The detection results are shown in Table 1 and indicate that the method achieves good detection performance with only a small number of labeled samples.

TABLE 1 Malicious traffic detection results

Proportion of labeled samples	Number of labeled samples	Accuracy of self-supervised learning model	Accuracy when training on labeled data only
0.01%	24	77.27%	55.27%
0.02%	62	90.73%	56.08%
0.03%	99	92.44%	71.33%
0.04%	133	95.25%	76.89%
0.05%	170	98.42%	81.95%
0.10%	350	98.47%	82.92%
0.50%	1794	98.03%	95.31%
1.00%	3596	98.75%	96.48%

* Because each class of traffic is sampled randomly at the given proportion, the number of samples to draw from some classes can be less than 1 when the proportion of labeled samples is very small; in that case 0 samples are drawn from the class. As a result, when the proportion of labeled samples increases, the number of labeled samples does not necessarily increase by the same factor.
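The effect described in the footnote can be reproduced with a small per-class sampling sketch. The function name and the toy class sizes are hypothetical; the only behavior taken from the text is that a fractional count below 1 yields 0 samples for that class.

```python
import random
from collections import defaultdict
from typing import Hashable, List, Tuple

def sample_labeled(samples: List[int], labels: List[Hashable], ratio: float,
                   rng: random.Random) -> List[Tuple[int, Hashable]]:
    """Draw the same proportion from every class, truncating fractional counts."""
    by_class = defaultdict(list)
    for s, y in zip(samples, labels):
        by_class[y].append(s)
    picked = []
    for y, items in by_class.items():
        k = int(len(items) * ratio)      # a count below 1 truncates to 0 samples
        picked += [(s, y) for s in rng.sample(items, k)]
    return picked

samples = list(range(30000))
labels = ["a"] * 25000 + ["b"] * 5000    # toy class sizes, not the real data set
small = sample_labeled(samples, labels, 0.0001, random.Random(0))
larger = sample_labeled(samples, labels, 0.0003, random.Random(0))
```

With these toy sizes, tripling the proportion from 0.01% to 0.03% raises the number of labeled samples from 2 to 8, because class `b` contributes nothing at the smaller proportion.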

While the foregoing is directed to embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.

Claims

1. A self-supervised-learning-based method for detecting malicious traffic under small-sample conditions, characterized by comprising the following steps:

step 1: for a flow, extracting the packet length, packet protocol, and packet inter-arrival time of a specific number of leading packets, constructing a feature matrix, and storing the traffic sample for the traffic feature embedding in step 2;

step 2: because the length, protocol, and inter-arrival time of a packet taken in isolation lack contextual semantic information and cannot express traffic characteristics accurately, using a CBOW model to embed each packet attribute using a specific number of packets before and after the packet, whereby the three features (packet length, protocol, and inter-arrival time) are each expanded to a specific dimension and the traffic features are enriched; the CBOW model consists of an input layer, a hidden layer, and an output layer, and training the CBOW model yields a weight matrix that is used for feature embedding; all distinct attribute values of the packets are counted to construct an attribute value dictionary; the input layer of CBOW is the one-hot encoded representation of several features; the hidden layer and the output layer are obtained by matrix calculation, and the output layer gives the probability of each predicted packet attribute value; the attribute value with the maximum probability is taken as the output, its error against the true value is calculated, and the network parameters are updated by back-propagation; after multiple iterations, the network parameter matrix serves as the feature vector matrix for feature embedding, each row corresponds to one attribute value, and features are expanded into vectors of a specific dimension according to this matrix; following these steps, CBOW models for embedding the packet length, the protocol, and the inter-arrival time are trained in turn and the corresponding feature vector matrices are obtained; each attribute value is matched to its feature vector, and any attribute value that cannot be matched is replaced by a zero vector, whereby traffic feature embedding is completed, one-dimensional features are expanded to high dimensions, contextual semantic information is incorporated, the traffic features are enriched, and the traffic representation is obtained;

step 3: because a malicious attacker may use traffic obfuscation to increase the concealment of traffic, destroying part of the information in the original traffic by adding random packets and random delays to it, and learning the correlation between the original traffic and the obfuscated traffic from the retained features;

step 4: dividing the whole data set into a training set and a test set according to a preset ratio, and using all training set data, without labels, to train the traffic feature encoder network; extracting deep vector representations from the original feature representation and the data-augmented feature representation of each flow with a convolutional neural network, calculating the similarity of all deep vector representations, constructing a loss function from the similarity, and calculating the encoder network parameters iteratively by back-propagation to obtain the traffic feature encoder network, wherein model pre-training uses unlabeled traffic samples only, which improves the efficiency of building the malicious traffic detection model;

step 5: after the feature encoder network has been trained in step 4, connecting a fully connected layer to it, keeping the encoder network parameters fixed, randomly selecting a specific proportion of the samples of each class in the training set, training the fully connected layer with the loss function defined as cross entropy, calculating the fully connected layer parameters by back-propagation of the loss function, and obtaining the trained malicious traffic detection model after a preset number of iterations.

2. The self-supervised-learning-based method for detecting malicious traffic under small-sample conditions according to claim 1, characterized in that step 1 is implemented as follows:

for a flow $T = \{p_1, p_2, \ldots, p_n\}$ containing $n$ packets $p_i$ ($i \le n$), the packet length ($l_i$), packet protocol ($q_i$), and packet inter-arrival time ($t_i$) of the first $B$ ($B \le n$) packets are extracted, a $3 \times B$ traffic feature matrix is constructed, and the traffic sample is stored for the traffic feature embedding in step 2.

3. The self-supervised-learning-based method for detecting malicious traffic under small-sample conditions according to claim 2, characterized in that step 2 is implemented as follows:

because the length, protocol, and inter-arrival time of a packet taken in isolation lack contextual semantic information and cannot express traffic characteristics accurately, a CBOW model is used to embed each packet attribute using a specific number of packets before and after the packet, whereby the three features (packet length, protocol, and inter-arrival time) are each expanded to a specific dimension and the traffic features are enriched; the CBOW model consists of an input layer, a hidden layer, and an output layer, and training the model yields a weight matrix that is used for feature embedding;

all distinct attribute values of the packets are counted to construct an attribute value dictionary $D = \{S_1, S_2, \ldots, S_V\}$, where $V$ is the number of distinct attribute values; the input layer is the one-hot encoded representation of several features $\{x_1, x_2, \ldots, x_C\}$, where $x_i$ ($i \le C$) is the one-hot encoding of the $i$-th feature; the hidden layer is calculated as:

$$h = \frac{1}{C} W^T (x_1 + x_2 + \cdots + x_C)$$

where $W$ is a $V \times N$ matrix and $N$ is the dimension of the hidden layer; the hidden layer $h$ is passed through the matrix $W'$ to obtain the output layer $o$:

$$o = W'^T h$$

where $W'$ is an $N \times V$ matrix; from the output layer, the probability that the predicted packet attribute value is $S_k$ is:

$$P(S_k \mid x_1, \ldots, x_C) = \frac{\exp(o_k)}{\sum_{j=1}^{V} \exp(o_j)}$$

the attribute value with the maximum probability is taken as the output, its error against the true value is calculated, and the matrices $W$ and $W'$ are updated by back-propagation; after multiple iterations, $W$ is the feature vector matrix, and its $k$-th row $W_k$ is the feature vector of attribute value $S_k$.

4. The self-supervised-learning-based method for detecting malicious traffic under small-sample conditions according to claim 3, characterized in that step 3 is implemented as follows:

for a flow $T$, a redundant packet of length $l'_i$ ($0 \le l'_i \le 1500$) is added after each packet $p_i$ with a preset probability; at the same time, a preset proportion of the packets is selected by simple random sampling, and a random delay $\Delta t_i$ ($0 \le \Delta t_i \le 0.2$ s) is added to each selected packet so that its inter-arrival time becomes $t_i + \Delta t_i$; the features of the obfuscated traffic are extracted, and the data-augmented traffic sample is obtained through feature embedding; the data-augmented sample is used for the encoder network training in step 4, where the correlation between the original traffic and the obfuscated traffic is learned from the retained features, so that malicious traffic whose concealment has been increased by obfuscation can be detected.

5. The self-supervised-learning-based method for detecting malicious traffic under small-sample conditions according to claim 4, characterized in that step 4 is implemented as follows:

the whole data set is divided into a training set and a test set according to a preset ratio, and all training set data, without labels, are used to train the self-supervised learning model; for the original feature representation and the data-augmented feature representation of each flow $T_i$, a convolutional neural network extracts the deep vector representations $s_i$ and $s'_i$; the similarity of all deep vector representations is calculated with a similarity function, and a loss function is defined for each traffic sample from these similarities, where $Q$ is the total number of samples; the convolutional neural network iteratively updates its parameters according to the loss and thereby learns how to extract features from traffic samples automatically; the encoder network parameters are calculated iteratively by back-propagation of the loss function, finally yielding the traffic feature encoder network; model pre-training uses unlabeled traffic samples only, which improves the efficiency of building the malicious traffic detection model.

6. The self-supervised-learning-based method for detecting malicious traffic under small-sample conditions according to claim 5, characterized in that step 5 is implemented as follows:

a fully connected layer is added after the convolutional neural network layers, n% of the samples of each class in the training set are randomly selected to train the fully connected layer, and the loss function is the cross-entropy loss:

$$L = -\frac{1}{Q} \sum_{i=1}^{Q} \left[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \right]$$

where $Q$ is the total number of samples, $y_i$ indicates whether flow $T_i$ is malicious traffic ($y_i = 1$) or normal traffic ($y_i = 0$), and $p_i$ is the probability that flow $T_i$ is malicious traffic; the fully connected layer parameters are calculated by back-propagation of the loss function, and the trained malicious traffic detection model is obtained after the set number of iterations.

7. The self-supervised-learning-based method for detecting malicious traffic under small-sample conditions according to claim 6, characterized in that the convolutional neural network is ResNet-50.