CN117034112A - Malicious network traffic classification method based on sample enhancement and contrast learning - Google Patents
- Publication number
- CN117034112A
- Authority
- CN
- China
- Prior art keywords
- network traffic
- sample
- malicious
- samples
- contrast learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention belongs to the technical field of network security and deep learning, and in particular relates to a malicious network traffic classification method based on sample enhancement and contrast learning. The method comprises: acquiring network traffic, extracting network traffic features, constructing a network traffic sample set, preprocessing the network traffic features, and constructing a pre-training set; training a base model, calculating a loss value, and updating the parameters of the base model according to the loss value to obtain a contrast learning malicious traffic classification model; and classifying malicious traffic in a new task with the contrast learning malicious traffic classification model. The base model uses a lightweight shallow neural network architecture, so it occupies few resources and runs efficiently; the heuristic of constructing the contrast task by randomly masking network traffic features provides data enhancement; and the contrast learning malicious traffic classification model distinguishes known network traffic classes well and can also classify unknown malicious network traffic for which only a small number of samples are available.
Description
Technical Field
The invention relates to network security and deep learning technology, in particular to a malicious network traffic classification method based on sample enhancement and contrast learning.
Background
As the Internet becomes ever more pervasive in modern life, more and more devices communicate over networks, and the security of cyberspace is receiving increasing attention. A traffic intrusion classification system, which classifies the various malicious attacks on a network, is one of the key systems for maintaining cyberspace security. From a machine learning perspective, an intrusion classification system can be defined as a system that classifies network traffic; in short, it distinguishes normal network traffic from malicious traffic. With the development of machine learning technology, machine-learning-based methods for classifying malicious network traffic have attracted wide attention.
With machine learning methods, an intrusion classification system can classify test samples accurately provided that it has been trained on enough samples. However, today's network environment changes constantly and new types of malicious traffic keep emerging, so it is difficult to obtain enough samples to train a model in a short time. An insufficient number of samples makes it difficult to train the machine learning model fully, which degrades malicious traffic classification.
In view of the above, how to use deep learning to classify new types of malicious network traffic when only a few network traffic samples have been collected is a problem that urgently needs to be solved.
Disclosure of Invention
The invention aims to solve the problem that an insufficient number of samples makes it difficult to train a machine learning model fully, which in turn degrades malicious traffic classification, and provides a malicious network traffic classification method based on sample enhancement and contrast learning.
In order to achieve the above object, the malicious network traffic classification method based on sample enhancement and contrast learning includes:
obtaining N types of network traffic, wherein each type of network traffic comprises a plurality of samples, extracting a plurality of network traffic characteristics of each sample to form a characteristic vector corresponding to the sample, and constructing a network traffic sample set, wherein the network traffic sample set comprises network traffic and corresponding network traffic characteristics;
preprocessing the network flow characteristics to obtain an enhancement set of each sample;
taking the union of the enhancement sets of all samples contained in each class of network traffic; for each sample in each union, traversing the other samples in the same union to form positive sample pairs and traversing the samples of another union to form negative sample pairs; setting a label for each sample pair; and taking the resulting set of positive and negative sample pairs as a pre-training set;
constructing a basic model, normalizing the feature vectors of two samples in a sample pair in a pre-training set, and then taking the normalized feature vectors as the input of the basic model to obtain two processed feature vectors of the sample pair;
calculating the similarity of the two processed feature vectors of the sample pair, identifying the label of the sample pair to obtain a label judgment value, calculating a loss value, calculating the parameters of a basic model according to the loss value, and updating to obtain a contrast learning malicious flow classification model;
and carrying out malicious traffic classification in the new task by adopting a contrast learning malicious traffic classification model.
Further, preprocessing the network traffic features to obtain an enhancement set of each sample includes:
training n network traffic classification models on the network traffic sample set, and inputting the feature vectors of the j-th class of network traffic into the i-th network traffic classification model to obtain the classification accuracy acc(i, j);
inputting the d-th network traffic feature into the i-th network traffic classification model to obtain the importance weight weight(i, d), wherein the d-th network traffic feature belongs to the j-th class of network traffic;
calculating the importance weight I(j, d) from the classification accuracy acc(i, j) and the importance weight weight(i, d), expressed by the following formula:
wherein I(j, d) represents the importance weight of the d-th network traffic feature for the j-th class of network traffic;
denoting the feature vector of the a-th sample $s_a$ in the network traffic sample set as $x_a$, the sample $s_a$ belonging to the j-th class of network traffic, and calculating the mask probability of each network traffic feature for the j-th class of network traffic, expressed by the following formula:
wherein P(j, d) represents the mask probability of the d-th network traffic feature for all samples of the j-th class of network traffic;
randomly selecting, according to the mask probability, L network traffic features in the feature vector $x_a$ and masking them to obtain an enhanced sample of sample $s_a$; repeating the random selection and masking m times to obtain m enhanced samples; and taking the set consisting of the feature vector $x_a$ and the m enhanced samples as the enhancement set of sample $s_a$.
Further, among all positive sample pairs formed from the samples of the same union, any two positive sample pairs differ in at least one sample.
Further, setting a label for each sample pair includes:
the label of the positive sample pair is set to 1 and the label of the negative sample pair is set to 0.
Further, constructing a base model, normalizing the feature vectors of the two samples of a sample pair in the pre-training set and using them as the input of the base model to obtain the two processed feature vectors of the sample pair, includes:
constructing the base model as a multi-layer perceptron, and taking the normalized feature vectors $x_p$ and $x_q$ of the sample pair as the input of the base model, wherein the base model propagates the input through each layer of the multi-layer perceptron according to the following formula:
$$X^{(l+1)} = \sigma\left(A^{(l)} X^{(l)} + b^{(l)}\right)$$
wherein $A^{(l)}$ is the trainable parameter matrix of the $l$-th layer of the multi-layer perceptron, $b^{(l)}$ is the parameter vector of the $l$-th layer, $X^{(l)}$ is the output of the $l$-th layer, $X^{(l+1)}$ is the output of the $(l+1)$-th layer, and $\sigma(\cdot)$ is an activation function;
obtaining the feature vectors $\hat{x}_p$ and $\hat{x}_q$ produced from the feature vectors $x_p$ and $x_q$ by the multi-layer perceptron.
Further, calculating the similarity of the two processed feature vectors of the sample pair includes:
comparing the two processed feature vectors of the sample pair using a cosine similarity function, expressed by the formula:
$$\mathrm{sim}(\hat{x}_p, \hat{x}_q) = \frac{\hat{x}_p \cdot \hat{x}_q}{\lVert \hat{x}_p \rVert \, \lVert \hat{x}_q \rVert}$$
wherein $\mathrm{sim}(\hat{x}_p, \hat{x}_q)$ represents the similarity value of the two feature vectors, ranging over $[-1, 1]$, $\hat{x}_p$ represents the feature vector obtained from $x_p$ after processing by the multi-layer perceptron, and $\hat{x}_q$ represents the feature vector obtained from $x_q$ after processing by the multi-layer perceptron;
scaling the similarity value to $[0, 1]$, expressed by the formula:
wherein the resulting quantity is the scaled value of the similarity value.
Further, calculating the loss value, and calculating and updating the parameters of the base model according to the loss value, includes:
calculating the loss value with a binary cross-entropy loss function, expressed by the formula:
wherein loss_L2($x_p$, $x_q$) represents the loss value for the feature vectors $x_p$ and $x_q$, α represents the regularization factor, W represents the sum of all weights of the base model, and y represents the label judgment value;
calculating the gradient by back propagation, and updating the parameters of the neural network by gradient descent.
Further, the step of classifying the malicious traffic in the new task by adopting the contrast learning malicious traffic classification model includes:
acquiring a trained contrast learning malicious flow classification model;
collecting training samples in a new task, wherein at least one malicious traffic type in the training samples is not repeated with the malicious traffic type in the network traffic sample set;
and carrying out malicious traffic classification in a new task by adopting a contrast learning malicious traffic classification model, carrying out contrast learning on a training sample and a sample to be detected, and selecting a label with highest output probability as a prediction label.
Compared with the prior art, the invention has the following notable advantages: 1. The base model uses a lightweight shallow neural network architecture, so it occupies few resources and runs efficiently. 2. The heuristic of constructing the contrast task by randomly masking network traffic features provides data enhancement while effectively preserving the structure of the feature vector before masking, which ensures the validity of the enhanced data and improves the training of the base model. 3. Owing to the contrast learning architecture, the contrast learning malicious traffic classification model distinguishes known network traffic classes well and can also classify unknown malicious network traffic for which only a small number of samples are available.
Drawings
FIG. 1 is a flow chart of a small sample malicious network traffic classification method based on sample enhancement and contrast learning according to the invention;
FIG. 2 is a flow chart of constructing the training set from network traffic according to the invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The invention provides a small sample malicious network traffic classification method based on sample enhancement and contrast learning, which is shown in fig. 1, and comprises the following steps:
(1) Capture network traffic data, extract network traffic features from it, and then enhance the network traffic features and construct a pre-training set.
(1-1) Network traffic feature extraction: first, a network traffic capture tool (e.g., tcpdump or Wireshark) is used to capture traffic and produce network traffic files (e.g., pcap files). Then, a network traffic analysis tool (e.g., CICFlowMeter) is used to parse and statistically analyze the network traffic files to produce network traffic features. A plurality of network traffic features is extracted from each network traffic file (i.e., sample); the types and the number of features extracted are the same for every network traffic file, and if no value can be extracted for a given feature type, that feature is set to 0. The number of network traffic features is denoted D. This yields the network traffic sample set S, which comprises N classes of network traffic (normal network traffic and N-1 classes of malicious network traffic) and the corresponding network traffic features.
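The zero-filling rule in step (1-1) can be illustrated with a minimal Python sketch. The feature names and the dictionary layout are hypothetical and chosen only for illustration; they are not the output format of any particular traffic analysis tool.

```python
from typing import Dict, List


def to_feature_vector(extracted: Dict[str, float], feature_names: List[str]) -> List[float]:
    """Build a fixed-length feature vector of length D.

    Every sample uses the same ordered list of feature types; a feature type
    for which nothing was extracted is set to 0, as described in step (1-1).
    """
    return [float(extracted.get(name, 0.0)) for name in feature_names]


# Hypothetical feature types (D = 3), for illustration only.
FEATURE_NAMES = ["flow_duration", "total_fwd_packets", "total_bwd_packets"]

# A sample for which "total_bwd_packets" could not be extracted.
sample = {"flow_duration": 1200.0, "total_fwd_packets": 8}
vector = to_feature_vector(sample, FEATURE_NAMES)
```

Keeping one shared, ordered list of feature types is what guarantees that every sample has the same D features in the same positions.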
(1-2) as shown in fig. 2, the network traffic characteristics are enhanced, and a pre-training set is constructed, which comprises the following specific steps:
(1-2-1) Calculating the importance weights of the network traffic features: first, n machine learning algorithms are selected and n network traffic classification models $C_1, C_2, \ldots, C_n$ are trained on S. The classification accuracy of $C_i$ on the j-th class of network traffic is denoted acc(i, j), where $i \in [1, n]$ and $j \in [1, N]$. The importance weight that $C_i$ computes for the d-th network traffic feature is denoted weight(i, d), where the d-th network traffic feature belongs to the j-th class of network traffic. Then the importance weight I(j, d) of the d-th network traffic feature for the j-th class of network traffic is calculated according to formula (1).
(1-2-2) Sample enhancement: first, for each sample $s_a$ of each class of network traffic, its feature vector is denoted $x_a$; assuming $s_a$ belongs to the j-th class of network traffic, the mask probability of each network traffic feature d is calculated according to formula (2). Then, according to the mask probability, L network traffic features (L defaults to 10) are randomly selected in $x_a$ and masked, i.e., their values are set to 0; the masked feature vector is an enhanced sample of $s_a$. Repeating this random selection and masking m times generates m enhanced samples for each network traffic sample $s_a$; the set consisting of the feature vector $x_a$ and the m enhanced samples is called the enhancement set of $s_a$, denoted $AS_a$.
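The masking step can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the mask probabilities P(j, d) are taken as a given input, since formula (2) is not reproduced here, and drawing the L features one at a time without replacement in proportion to those probabilities is only one plausible reading of "randomly selecting according to the mask probability". The function names are hypothetical.

```python
import random
from typing import List, Sequence


def pick_features(mask_prob: Sequence[float], num_masked: int, rng: random.Random) -> List[int]:
    """Draw up to num_masked distinct feature indices, weighted by mask probability."""
    candidates = list(range(len(mask_prob)))
    weights = [float(p) for p in mask_prob]
    chosen: List[int] = []
    for _ in range(min(num_masked, len(candidates))):
        if sum(weights) <= 0:
            break  # nothing left that can be masked
        pos = rng.choices(range(len(candidates)), weights=weights, k=1)[0]
        chosen.append(candidates.pop(pos))
        weights.pop(pos)
    return chosen


def build_enhancement_set(x: Sequence[float], mask_prob: Sequence[float],
                          num_masked: int = 10, m: int = 5, seed: int = 0) -> List[List[float]]:
    """Return the enhancement set: the original vector plus m masked copies."""
    rng = random.Random(seed)
    result = [list(x)]
    for _ in range(m):
        masked = list(x)
        for d in pick_features(mask_prob, num_masked, rng):
            masked[d] = 0.0  # masking sets the selected feature value to 0
        result.append(masked)
    return result


x_a = [float(d + 1) for d in range(20)]  # 20 non-zero feature values
p_j = [1.0 / 20] * 20                    # placeholder mask probabilities (assumption)
as_a = build_enhancement_set(x_a, p_j, num_masked=10, m=3)
```

With m = 3 the enhancement set holds four vectors: the original $x_a$ and three masked copies, each with exactly 10 features set to 0.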
(1-2-3) Constructing the pre-training set: 1) For each class of network traffic, take the union of the enhancement sets of all samples of that class; the union of the enhancement sets of the j-th class of network traffic is denoted $CAS_j$. 2) For a sample $s_{jb}$ in $CAS_j$, take any other sample $s_{jk}$ in $CAS_j$ and form with $s_{jb}$ a positive sample pair $(s_{jb}, s_{jk})$, with the label set to 1; take any sample $s_{ok}$ in the union $CAS_o$ ($o \neq j$) of the enhancement sets of another class of network traffic and form with $s_{jb}$ a negative sample pair $(s_{jb}, s_{ok})$, with the label set to 0. 3) Repeat step 2) for every sample in $CAS_j$, traversing the other samples in $CAS_j$ to form positive sample pairs and traversing all samples in $CAS_o$ to form negative sample pairs. 4) Repeat from step 1) until all samples in the unions of the enhancement sets of all network traffic classes have been traversed. When generating positive sample pairs from samples of the same union, a positive sample pair whose two samples both repeat those of an existing positive sample pair may be discarded directly during the traversal or deleted after the traversal (for example, the positive sample pairs $(s_{j1}, s_{j2})$ and $(s_{j2}, s_{j1})$ are repeated pairs, and only one is kept). The final set of positive and negative sample pairs is denoted PNS and serves as the pre-training set.
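The pair construction can be sketched as follows. This is an illustrative sketch, not a definitive implementation: the dictionary `cas` mapping a class index to its union of enhancement sets and the function name are hypothetical. Repeated positive pairs are avoided by enumerating unordered pairs, and negative pairs are generated from every class toward every other class, which is one reading of steps 3) and 4).

```python
from itertools import combinations
from typing import Dict, List, Tuple

Sample = Tuple[float, ...]
Pair = Tuple[Sample, Sample, int]


def build_pretraining_set(cas: Dict[int, List[Sample]]) -> List[Pair]:
    """Build the set PNS of labeled sample pairs from the per-class unions CAS_j."""
    pairs: List[Pair] = []
    for j, samples_j in cas.items():
        # Positive pairs: unordered pairs inside CAS_j, so (s1, s2) and (s2, s1)
        # are never both produced.
        for s_b, s_k in combinations(samples_j, 2):
            pairs.append((s_b, s_k, 1))
        # Negative pairs: each sample of CAS_j with every sample of CAS_o, o != j.
        for o, samples_o in cas.items():
            if o == j:
                continue
            for s_b in samples_j:
                for s_k in samples_o:
                    pairs.append((s_b, s_k, 0))
    return pairs


# Toy unions: class 0 has three samples, class 1 has two.
cas = {
    0: [(1.0, 0.0), (1.0, 0.5), (0.0, 0.5)],
    1: [(0.0, 2.0), (3.0, 2.0)],
}
pns = build_pretraining_set(cas)
num_positive = sum(1 for _, _, label in pns if label == 1)
num_negative = sum(1 for _, _, label in pns if label == 0)
```

In this toy case the positive pairs number 3 + 1 = 4 and the negative pairs 3×2 + 2×3 = 12, which shows how quickly negative pairs dominate as the number of classes grows.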
(2) Model training based on contrast learning: a base model is constructed with a shallow neural network and, following the contrast learning approach, pre-trained with techniques such as regularization to obtain a contrast learning model.
(2-1) Definition of the neural network base model: the base model is constructed as a multi-layer perceptron (MLP). Its input is the normalized feature vectors $x_p$ and $x_q$ of the two network traffic samples of a positive or negative sample pair. The base model propagates the input through each layer according to formula (3), finally yielding the feature vectors obtained from $x_p$ and $x_q$ after processing by the multi-layer perceptron.
$$X^{(l+1)} = \sigma\left(A^{(l)} X^{(l)} + b^{(l)}\right) \tag{3}$$
where $A^{(l)}$ is the trainable parameter matrix of the $l$-th layer of the multi-layer perceptron, $b^{(l)}$ is the parameter vector of the $l$-th layer, $X^{(l)}$ is the output of the $l$-th layer, $X^{(l+1)}$ is the output of the $(l+1)$-th layer, and $\sigma(\cdot)$ is the activation function.
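The layer rule in formula (3) can be illustrated with a small pure-Python forward pass. This is a sketch under assumptions: the description does not name the activation function, so ReLU is used here purely as an example, and the layer sizes and weights are arbitrary.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]


def relu(v: Sequence[float]) -> Vector:
    """Example activation; the choice of ReLU is an assumption."""
    return [max(0.0, t) for t in v]


def layer_forward(A: Matrix, b: Vector, x: Sequence[float],
                  activation: Callable[[Sequence[float]], Vector] = relu) -> Vector:
    """One layer: sigma(A x + b)."""
    z = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
         for row, b_i in zip(A, b)]
    return activation(z)


def mlp_forward(layers: Sequence[Tuple[Matrix, Vector]], x: Sequence[float]) -> Vector:
    """Apply the layers in order; the output of layer l is the input of layer l+1."""
    out = list(x)
    for A, b in layers:
        out = layer_forward(A, b, out)
    return out


# A toy two-layer perceptron: 2 inputs -> 2 hidden units -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 1.0]),
    ([[1.0, 1.0]], [-1.0]),
]
processed = mlp_forward(layers, [2.0, 1.0])
```

Both samples of a pair pass through the same layers with the same parameters, so $x_p$ and $x_q$ are mapped into a common feature space before being compared.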
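(2-2) The feature vectors of the two network traffic samples are compared using a cosine similarity function, see formula (4).
$$\mathrm{sim}(\hat{x}_p, \hat{x}_q) = \frac{\hat{x}_p \cdot \hat{x}_q}{\lVert \hat{x}_p \rVert \, \lVert \hat{x}_q \rVert} \tag{4}$$
where $\mathrm{sim}(\hat{x}_p, \hat{x}_q)$ represents the similarity value of the two feature vectors, ranging over $[-1, 1]$, $\hat{x}_p$ represents the feature vector obtained from $x_p$ after processing by the multi-layer perceptron, and $\hat{x}_q$ represents the feature vector obtained from $x_q$ after processing by the multi-layer perceptron.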
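Cosine similarity as used in step (2-2) can be written in a few lines of Python. The handling of an all-zero vector is an assumption added for robustness; the description does not discuss that case.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v, in [-1, 1]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: an all-zero vector is treated as dissimilar
    return dot / (norm_u * norm_v)


same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])   # parallel vectors
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])       # perpendicular vectors
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])        # opposite vectors
```

Because the measure depends only on direction, two processed feature vectors that differ only in magnitude are treated as maximally similar.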
(2-3) The similarity value, whose range is $[-1, 1]$, is scaled to $[0, 1]$ by formula (5).
where the resulting quantity is the scaled value of the similarity value.
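One common way to map a value from [-1, 1] onto [0, 1] is the linear rescaling (s + 1) / 2. The sketch below uses that mapping as an assumption; it is not confirmed to be the exact form of formula (5).

```python
def scale_similarity(sim: float) -> float:
    """Linearly map a similarity in [-1, 1] to [0, 1] (assumed form of the scaling)."""
    return (sim + 1.0) / 2.0


lowest = scale_similarity(-1.0)
middle = scale_similarity(0.0)
highest = scale_similarity(1.0)
```

After scaling, the value can be read as a probability-like score that the two samples of a pair belong to the same class, which is what a binary cross-entropy loss expects.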
(2-4) After the scaled value is obtained, the label of the input sample pair is identified to obtain the label judgment value y; in this embodiment, y is the label (0 or 1) of the sample pair. The loss value is calculated with a binary cross-entropy loss function with L2 regularization added, see formula (6). After the loss value is obtained, the gradient is calculated by back propagation, and the parameters of the multi-layer perceptron are updated by gradient descent.
where y represents the label judgment value, loss_L2($x_p$, $x_q$) represents the loss value, α represents the regularization factor, and W represents the sum of all weights of the base model.
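A binary cross-entropy loss with an L2 penalty can be sketched as below. Formula (6) is not reproduced here, so the exact regularization term is an assumption: this sketch uses α times the sum of squared weights, a common convention. The clipping constant `eps` and the default value of `alpha` are also assumptions.

```python
import math
from typing import Sequence


def loss_l2(scaled_sim: float, y: int, weights: Sequence[float],
            alpha: float = 1e-4, eps: float = 1e-12) -> float:
    """Binary cross-entropy on the scaled similarity plus an L2 penalty.

    scaled_sim: similarity of the pair after scaling to [0, 1]
    y:          label judgment value (1 for a positive pair, 0 for a negative pair)
    weights:    flattened weights of the base model
    """
    s = min(max(scaled_sim, eps), 1.0 - eps)  # avoid log(0)
    bce = -(y * math.log(s) + (1 - y) * math.log(1.0 - s))
    penalty = alpha * sum(w * w for w in weights)  # assumed form of the L2 term
    return bce + penalty


confident_right = loss_l2(0.9, 1, [], alpha=0.0)   # positive pair, high similarity
confident_wrong = loss_l2(0.1, 1, [], alpha=0.0)   # positive pair, low similarity
with_penalty = loss_l2(0.5, 1, [1.0, 2.0], alpha=0.1)
```

The loss is small when a positive pair scores close to 1 or a negative pair close to 0, and the penalty term grows with the magnitude of the weights.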
(2-5) Training on a large number of positive and negative sample pairs yields the contrast learning malicious traffic classification model.
(3) Small-sample classification of network traffic: for the target task, the contrast learning malicious traffic classification model is used to perform small-sample malicious traffic classification in the small-sample target task.
(3-1) Model initialization: obtain the trained contrast learning malicious traffic classification model.
(3-2) Data input: collect training samples in the new task (a small number of training samples containing new malicious traffic types; at least one malicious traffic type in the training samples does not appear among the malicious traffic types in the network traffic sample set). The training samples in the new task are collected following the same logic as steps (1-1) through (1-2-3).
(3-3) Model implementation: the contrast learning malicious traffic classification model is used to classify malicious traffic in the new task; contrast learning is performed on the training samples and the sample to be detected, and the label with the highest output probability is selected as the predicted label.
In the small-sample training process, the contrast learning malicious traffic classification model is used for malicious traffic classification in the new task following the same logic as steps (2-1) to (2-4), with the network traffic sample set replaced by the training samples.
In the small-sample classification process, when contrast learning is performed on the training samples and the samples to be detected, each sample to be detected is paired with the samples in the training samples, the feature vectors of each sample pair are input into the contrast learning malicious traffic classification model, the similarity of the two processed feature vectors of the pair is calculated, and the label (type) of the training sample in the pair with the highest similarity (highest output probability) is taken as the type of the sample to be detected.
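The small-sample classification rule can be sketched as a nearest-by-similarity lookup. This is a minimal sketch under assumptions: `encode` stands in for the trained base model and defaults to the identity function so the example is self-contained, and the labels and vectors are made up for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: all-zero vectors are treated as dissimilar
    return dot / (norm_u * norm_v)


def classify(query: Sequence[float],
             support: List[Tuple[Sequence[float], str]],
             encode: Callable[[Sequence[float]], Sequence[float]] = lambda v: v) -> str:
    """Return the label of the training sample most similar to the query.

    support: labeled training samples of the new task
    encode:  placeholder for the trained base model (identity by default)
    """
    q = encode(query)
    best_label, best_sim = "", -math.inf
    for vector, label in support:
        sim = cosine_similarity(q, encode(vector))
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label


# Hypothetical labeled training samples of the new task.
support = [
    ([1.0, 0.0], "normal"),
    ([0.0, 1.0], "new_attack"),
]
predicted = classify([0.1, 0.9], support)
```

Because classification only compares a sample against a few labeled examples, a new malicious traffic type can be recognized from its handful of training samples without retraining a full classifier over all classes.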
The above examples merely represent one or several embodiments of the present invention, which are described in more detail and are not to be construed as limiting the scope of the invention. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the invention, which are all within the scope of the invention. Accordingly, the scope of protection of the present invention is to be determined by the appended claims.
Claims (8)
1. The malicious network traffic classification method based on sample enhancement and contrast learning is characterized by comprising the following steps of:
obtaining N types of network traffic, wherein each type of network traffic comprises a plurality of samples, extracting a plurality of network traffic characteristics of each sample to form a characteristic vector corresponding to the sample, and constructing a network traffic sample set, wherein the network traffic sample set comprises network traffic and corresponding network traffic characteristics;
preprocessing the network flow characteristics to obtain an enhancement set of each sample;
taking the union of the enhancement sets of all samples contained in each class of network traffic; for each sample in each union, traversing the other samples in the same union to form positive sample pairs and traversing the samples of another union to form negative sample pairs; setting a label for each sample pair; and taking the resulting set of positive and negative sample pairs as a pre-training set;
constructing a basic model, normalizing the feature vectors of two samples in a sample pair in a pre-training set, and then taking the normalized feature vectors as the input of the basic model to obtain two processed feature vectors of the sample pair;
calculating the similarity of the two processed feature vectors of the sample pair, identifying the label of the sample pair to obtain a label judgment value, calculating a loss value, calculating the parameters of a basic model according to the loss value, and updating to obtain a contrast learning malicious flow classification model;
and carrying out malicious traffic classification in the new task by adopting a contrast learning malicious traffic classification model.
2. The malicious network traffic classification method based on sample enhancement and contrast learning according to claim 1, wherein the preprocessing of the network traffic features to obtain an enhancement set of each sample includes:
training n network traffic classification models on a network traffic sample set, and inputting feature vectors of the jth class of network traffic into the ith network traffic classification model to obtain classification accuracy acc (i, j);
inputting the d-th network traffic feature into the i-th network traffic classification model to obtain the importance weight weight(i, d), wherein the d-th network traffic feature belongs to the j-th class of network traffic;
calculating the importance weight I(j, d) from the classification accuracy acc(i, j) and the importance weight weight(i, d), expressed by the following formula:
wherein I(j, d) represents the importance weight of the d-th network traffic feature for the j-th class of network traffic;
denoting the feature vector of the a-th sample $s_a$ in the network traffic sample set as $x_a$, the sample $s_a$ belonging to the j-th class of network traffic, and calculating the mask probability of each network traffic feature for the j-th class of network traffic, expressed by the following formula:
wherein P(j, d) represents the mask probability of the d-th network traffic feature for all samples of the j-th class of network traffic;
randomly selecting, according to the mask probability, L network traffic features in the feature vector $x_a$ and masking them to obtain an enhanced sample of sample $s_a$; repeating the random selection and masking m times to obtain m enhanced samples; and taking the set consisting of the feature vector $x_a$ and the m enhanced samples as the enhancement set of sample $s_a$.
3. The malicious network traffic classification method based on sample enhancement and contrast learning according to claim 1, wherein, among all positive sample pairs formed from the samples of the same union, any two positive sample pairs differ in at least one sample.
4. The method for classifying malicious network traffic based on sample enhancement and contrast learning according to claim 1, wherein the step of labeling each sample pair comprises:
the label of the positive sample pair is set to 1 and the label of the negative sample pair is set to 0.
5. The malicious network traffic classification method based on sample enhancement and contrast learning according to claim 1, wherein constructing a basic model, normalizing feature vectors of two samples in a sample pair in a pre-training set, and then taking the normalized feature vectors as an input of the basic model, to obtain two processed feature vectors of the sample pair, comprises:
constructing the base model as a multi-layer perceptron, and taking the normalized feature vectors $x_p$ and $x_q$ of the sample pair as the input of the base model, wherein the base model propagates the input through each layer of the multi-layer perceptron according to the following formula:
$$X^{(l+1)} = \sigma\left(A^{(l)} X^{(l)} + b^{(l)}\right)$$
wherein $A^{(l)}$ is the trainable parameter matrix of the $l$-th layer of the multi-layer perceptron, $b^{(l)}$ is the parameter vector of the $l$-th layer, $X^{(l)}$ is the output of the $l$-th layer, $X^{(l+1)}$ is the output of the $(l+1)$-th layer, and $\sigma(\cdot)$ is an activation function;
obtaining the feature vectors $\hat{x}_p$ and $\hat{x}_q$ produced from the feature vectors $x_p$ and $x_q$ by the multi-layer perceptron.
6. The malicious network traffic classification method based on sample enhancement and contrast learning of claim 5, wherein the calculating the similarity of two processed feature vectors of a sample pair comprises:
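the two processed feature vectors of the sample pair are compared using a cosine similarity function, formulated as follows:
$$\mathrm{sim}(\hat{x}_p, \hat{x}_q) = \frac{\hat{x}_p \cdot \hat{x}_q}{\lVert \hat{x}_p \rVert \, \lVert \hat{x}_q \rVert}$$
wherein $\mathrm{sim}(\hat{x}_p, \hat{x}_q)$ represents the similarity value of the two feature vectors, ranging over $[-1, 1]$, $\hat{x}_p$ represents the feature vector $x_p$ after processing by the multi-layer perceptron, and $\hat{x}_q$ represents the feature vector $x_q$ after processing by the multi-layer perceptron;
the similarity value $\mathrm{sim}(\hat{x}_p, \hat{x}_q)$ is scaled to $[0, 1]$, expressed by the formula:
$$\mathrm{sim}'(\hat{x}_p, \hat{x}_q) = \frac{\mathrm{sim}(\hat{x}_p, \hat{x}_q) + 1}{2}$$
wherein $\mathrm{sim}'(\hat{x}_p, \hat{x}_q)$ represents the scaled value of the similarity value.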
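The similarity computation of claim 6 might be sketched as follows. The linear map from $[-1, 1]$ to $[0, 1]$ and the handling of zero-length vectors are assumptions, as are the function names `cosine_similarity` and `scale_to_unit_interval`.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between u and v, a value in [-1, 1]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: a zero vector is treated as neither similar nor dissimilar.
        return 0.0
    return dot / (norm_u * norm_v)

def scale_to_unit_interval(s):
    """Map a similarity in [-1, 1] linearly onto [0, 1] (assumed scaling)."""
    return (s + 1.0) / 2.0

# Identical directions score 1, opposite directions score 0 after scaling.
same = scale_to_unit_interval(cosine_similarity([1.0, 0.0], [2.0, 0.0]))
opposite = scale_to_unit_interval(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))
orthogonal = scale_to_unit_interval(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Scaling to $[0, 1]$ lets the similarity be read as a probability that the two samples form a positive pair, which is what a binary cross entropy loss expects.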
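7. The malicious network traffic classification method based on sample enhancement and contrast learning according to claim 6, wherein calculating the loss value, and calculating and updating the parameters of the basic model according to the loss value, comprises:
the loss value is calculated by a binary cross entropy loss function and formulated as follows:
$$\mathrm{loss\_L2}(x_p, x_q) = -\left[\, y \log \mathrm{sim}'(\hat{x}_p, \hat{x}_q) + (1 - y) \log\left(1 - \mathrm{sim}'(\hat{x}_p, \hat{x}_q)\right) \right] + \alpha W$$
wherein $\mathrm{loss\_L2}(x_p, x_q)$ represents the loss value for feature vector $x_p$ and feature vector $x_q$, $\mathrm{sim}'(\hat{x}_p, \hat{x}_q)$ represents the scaled similarity value of the two processed feature vectors $\hat{x}_p$ and $\hat{x}_q$ of the sample pair, $\alpha$ represents a regularization factor, $W$ represents the sum of all weights of the basic model, and $y$ represents the label of the sample pair;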
the gradient is calculated using back propagation and the parameters of the neural network are calculated and updated using gradient descent.
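As a hedged illustration of claim 7, the sketch below evaluates a regularized binary cross entropy loss on a scaled similarity value and applies one gradient descent update. Taking the regularization term as `alpha` times the sum of squared weights is an assumption about how the weight sum is formed; the clipping constant `eps`, the learning rate, and the names `bce_l2_loss` and `sgd_step` are likewise hypothetical. The gradients are passed in as plain numbers; in practice they would come from back propagation.

```python
import math

def bce_l2_loss(s_scaled, y, weights, alpha=1e-3, eps=1e-12):
    """Binary cross entropy on a scaled similarity plus an L2 penalty.

    s_scaled: similarity already scaled to [0, 1]
    y:        pair label, 1 for a positive pair and 0 for a negative pair
    weights:  flat list of model weights (assumed layout)
    """
    s = min(max(s_scaled, eps), 1.0 - eps)  # keep log() finite
    bce = -(y * math.log(s) + (1 - y) * math.log(1.0 - s))
    penalty = alpha * sum(w * w for w in weights)  # assumed form of the term
    return bce + penalty

def sgd_step(weights, grads, lr=0.1):
    """One plain gradient descent update: w <- w - lr * dL/dw."""
    return [w - lr * g for w, g in zip(weights, grads)]

# A positive pair (y = 1) is penalized more when its similarity is low.
loss_close = bce_l2_loss(0.9, 1, [0.5, -0.5])
loss_far = bce_l2_loss(0.1, 1, [0.5, -0.5])

# Gradients here are placeholder numbers, not the result of back propagation.
updated = sgd_step([1.0, 2.0], [0.5, -0.5], lr=0.1)
```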
8. The malicious network traffic classification method based on sample enhancement and contrast learning according to claim 1, wherein the malicious traffic classification in a new task using a contrast learning malicious traffic classification model comprises:
acquiring a trained contrast learning malicious traffic classification model;
collecting training samples in the new task, wherein at least one malicious traffic type in the training samples does not appear among the malicious traffic types in the network traffic sample set;
and performing malicious traffic classification in the new task using the contrast learning malicious traffic classification model, performing contrast learning on the training samples and a sample to be detected, and selecting the label with the highest output probability as the prediction label.
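One way to read the classification step of claim 8 is as a comparison of the sample to be detected against each labeled training sample of the new task. The sketch below assumes that reading. Keeping the best score per label, the `embed` callable standing in for the trained model, and the labels `"benign"` and `"ddos"` are all hypothetical choices not stated in the claim.

```python
import math

def scaled_cosine(u, v):
    """Cosine similarity mapped from [-1, 1] onto [0, 1] (assumed scaling)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        return 0.5  # assumption: a zero vector carries no evidence either way
    return (dot / norm + 1.0) / 2.0

def predict_label(embed, training_samples, query):
    """Return the label whose training sample is most similar to the query.

    embed:            callable mapping a feature vector to a processed vector
    training_samples: list of (feature_vector, label) pairs from the new task
    query:            feature vector of the sample to be detected
    """
    q = embed(query)
    best = {}
    for x, label in training_samples:
        score = scaled_cosine(embed(x), q)
        if score > best.get(label, -1.0):
            best[label] = score  # keep the highest score seen for each label
    predicted = max(best, key=best.get)
    return predicted, best

# Identity embedding used only so the example runs without a trained model.
training_samples = [([1.0, 0.0], "benign"), ([0.0, 1.0], "ddos")]
label, scores = predict_label(lambda v: v, training_samples, [0.1, 0.9])
```

Because only similarities are compared, a traffic type that was absent during pre-training can be recognized as long as the new task supplies at least one labeled sample of it.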
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311005429.5A CN117034112A (en) | 2023-08-10 | 2023-08-10 | Malicious network traffic classification method based on sample enhancement and contrast learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117034112A (en) | 2023-11-10 |
Family
ID=88634824
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311005429.5A Pending CN117034112A (en) | 2023-08-10 | 2023-08-10 | Malicious network traffic classification method based on sample enhancement and contrast learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117034112A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117614742A (en) * | 2024-01-22 | 2024-02-27 | 广州大学 | Malicious traffic detection method with enhanced honey point perception |
CN117614742B (en) * | 2024-01-22 | 2024-05-07 | 广州大学 | Malicious traffic detection method with enhanced honey point perception |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||