CN114884894B - Semi-supervised network traffic classification method based on transfer learning - Google Patents

Info

Publication number
CN114884894B
Authority
CN
China
Prior art keywords
model
network traffic
data
training
retraining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210415447.XA
Other languages
Chinese (zh)
Other versions
CN114884894A (en)
Inventor
李涛
周明睿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN202210415447.XA priority Critical patent/CN114884894B/en
Publication of CN114884894A publication Critical patent/CN114884894A/en
Application granted granted Critical
Publication of CN114884894B publication Critical patent/CN114884894B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00Traffic control in data switching networks
    • H04L47/10Flow control; Congestion control
    • H04L47/24Traffic characterised by specific attributes, e.g. priority or QoS
    • H04L47/2441Traffic characterised by specific attributes, e.g. priority or QoS relying on flow classification, e.g. using integrated services [IntServ]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides a semi-supervised network traffic classification method based on transfer learning, comprising the following steps. Step 1: extracting statistical features of network traffic data and screening them using the Pearson correlation coefficient. Step 2: inputting the statistical features into a pre-training model and saving the parameters learned by the pre-training model. Step 3: migrating the pre-training model into a retraining model with more linear layers, and retraining the retraining model to obtain a classifier for classifying network traffic. By extracting statistical features of the network traffic data, screening them with the Pearson correlation coefficient, training the pre-training model and saving its parameters, and finally retraining with the retraining model, the invention addresses three problems of existing network traffic classification methods: large labeled datasets are difficult to collect and annotate, outdated datasets tend to go to waste, and classification accuracy drops when the samples undergo concept drift.

Description

Semi-supervised network traffic classification method based on transfer learning
Technical Field
The invention relates to a semi-supervised network traffic classification method based on transfer learning, and belongs to the field of convolutional neural networks.
Background
With the rapid growth of network data and network complexity, the rise of network traffic diversity presents a great challenge to network traffic control and detection. The study of network traffic classification helps to schedule different types of network traffic and to protect against attacks by malicious traffic. How to automatically detect and correctly classify different types of network traffic categories has become a key issue in improving network quality of service and network security quality.
Traditional network traffic classification methods fall into the following three categories:
Port-number-based methods: in the 1990s, the Internet Assigned Numbers Authority (IANA) assigned port numbers to the different network protocols, which opened the way for network traffic classification. This approach determines the application class of unknown traffic by reading the port number in the packet header and looking it up in a port mapping table. Because early Internet applications were simple, port-based classification was both accurate and fast. However, with the spread of technologies such as network address translation, port forwarding, protocol encapsulation, and dynamic port allocation, the accuracy of port-based classification has dropped sharply.
Payload-based methods: researchers found that the packet payload carries a great deal of information useful for classification, and deep packet inspection (DPI) has attracted increasing attention. DPI classifies network traffic by analyzing the payload of each packet. It does not rely on the port number of the packet, so it is unaffected by dynamic ports. However, this method has the following problems: encrypted packets cannot be analyzed; performance degrades on obfuscated protocols and encapsulated traffic; and because it inspects the actual content that users transmit, it may infringe on user privacy.
Machine-learning-based methods: these classify network traffic using its statistical features, such as packet size, packet inter-arrival time, and packet rate. The statistical features that represent the traffic are fed into a machine learning model, which is trained to identify the traffic. However, this method has the following problems: a feature set that reflects network traffic must be designed, and mining such features takes expertise and a great deal of time; and training the classifier requires a large amount of labeled data, whose collection and labeling consume considerable manpower and resources, while datasets collected and labeled earlier are likely to become outdated as network technology develops.
In view of the foregoing, it is necessary to propose a new semi-supervised network traffic classification method based on transfer learning to solve the above-mentioned problems.
Disclosure of Invention
The invention aims to provide a semi-supervised network traffic classification method based on transfer learning that addresses three problems of existing network traffic classification: large labeled datasets are difficult to collect and annotate, outdated datasets tend to go to waste, and classification accuracy drops when the samples undergo concept drift.
In order to achieve the above purpose, the present invention provides a semi-supervised network traffic classification method based on transfer learning, which specifically comprises the following steps:
step 1: extracting statistical features of network traffic data and screening them using the Pearson correlation coefficient;
step 2: inputting the statistical features into a pre-training model and saving the parameters learned by the pre-training model;
step 3: migrating the pre-training model into a retraining model with more linear layers, and retraining the retraining model to obtain a classifier for classifying network traffic.
As a further improvement of the present invention, the step 1 specifically includes:
step 11: capturing network traffic data by software, the network traffic data comprising unlabeled data and labeled data, the number of unlabeled data being greater than the number of labeled data;
step 12: extracting key information of each data packet in the network flow data;
step 13: calculating to obtain statistical characteristics according to key information of the first 45 data packets of each flow in the network flow data;
step 14: selecting, by means of the Pearson correlation coefficient, the key statistical features that can effectively identify network traffic categories from the statistical features of each flow, and dividing the key statistical features into a labeled part and an unlabeled part.
As a further improvement of the present invention, the step 2 specifically includes:
step 21: converting the key statistical features of the unlabeled data into an unlabeled data vector matrix as input of the pre-training model;
step 22: building the pre-training model, setting the initial learning rate of the pre-training model to be 0.001, setting the batch size to be 32 and setting the iteration number to be 150;
step 23: inputting the unlabeled data vector matrix in the step 21 into the pre-training model for pre-training;
step 24: saving the parameters of each layer of the pre-training model after the pre-training is finished.
As a further improvement of the present invention, the step 3 specifically includes:
step 31: dividing the labeled data into a training set and a test set at a ratio of 7:3, and converting the key statistical features of the training set into a training set vector matrix that serves as input to the retraining model;
step 32: building the retraining model, and migrating the trained pre-training model in the step 24 to the retraining model with more linear layers;
step 33: setting the initial learning rate of the retraining model to be 0.001, the batch size to be 64 and the iteration number to be 100;
step 34: using the trained retraining model as a classifier for classifying network traffic.
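As a further improvement of the present invention, step 1 yields 116 statistical features in total.
As a further improvement of the invention, the Pearson correlation coefficient is used to select 90 statistical features whose correlation is less than 0.9.
As a further improvement of the present invention, the Pearson correlation coefficient is calculated as:
$$\rho_{X,Y} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$$
where X and Y denote the value sequences of two different statistical features across the samples, n is the number of samples, and $\bar{X}$ and $\bar{Y}$ are the means of the two sequences.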
As a further improvement of the present invention, the key information in step 12 includes packet arrival time, packet protocol, packet source IP address, packet destination IP address, packet size.
As a further improvement of the invention, the pre-trained model does not contain a Softmax layer for classification, and the final output of the pre-trained model is the statistical feature.
As a further improvement of the invention, two fully connected layers are added to the retraining model to reduce the dimensionality of the statistical features, and a Softmax layer is used to obtain the final classification result.
The beneficial effects of the invention are as follows: by extracting statistical features of network traffic data, screening them with the Pearson correlation coefficient, training a pre-training model and saving its parameters, and finally retraining with a retraining model, the semi-supervised network traffic classification method based on transfer learning addresses three problems of existing network traffic classification: large labeled datasets are difficult to collect and annotate, outdated datasets tend to go to waste, and classification accuracy drops when the samples undergo concept drift.
Drawings
Fig. 1 is a step diagram of a semi-supervised network traffic classification method based on transfer learning according to the present invention.
Fig. 2 is a flow chart of fig. 1.
Fig. 3 is a schematic diagram of a pre-training model structure and parameters of each layer according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of the retraining model structure and the parameters of each layer according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of traffic classification according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
As shown in fig. 1 and fig. 2, the present invention provides a semi-supervised network traffic classification method based on transfer learning, which specifically includes the following steps:
step 1: extracting statistical features of network traffic data and screening them using the Pearson correlation coefficient;
step 2: inputting the statistical features into a pre-training model and saving the parameters learned by the pre-training model;
step 3: migrating the pre-training model into a retraining model with more linear layers, and retraining the retraining model to obtain a classifier for classifying network traffic.
Step 1 collects the raw network traffic data and extracts features from it so that they can be used to train the models of the invention. Step 1 specifically comprises the following steps:
step 11: capturing network traffic data by software, wherein the network traffic data comprises unlabeled data and labeled data, and the quantity of the unlabeled data is larger than that of the labeled data;
step 12: extracting key information of each data packet in the network flow data;
step 13: calculating according to key information of the first 45 data packets of each flow in the network flow data to obtain statistical characteristics;
step 14: selecting, by means of the Pearson correlation coefficient, the key statistical features that can effectively identify network traffic categories from the statistical features of each flow, and dividing the key statistical features into a labeled part and an unlabeled part.
In step 11, network traffic data is captured by software, and such data is usually unlabeled. Traditional machine learning and deep learning methods for network traffic classification usually need a large amount of labeled data to train a classifier that performs well, but collecting and labeling so many samples consumes enormous manpower and resources, and with the rapid development of network technology, samples collected and labeled at great cost are likely to become outdated. The invention needs only a large amount of unlabeled data and a small amount of labeled data, which greatly reduces the difficulty of collecting the raw data.
In step 12, the key information of each packet comprises the packet arrival time, the packet protocol, the source IP address, the destination IP address, and the packet size; the key information of each packet is saved to a txt file.
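The text does not spell out how packets are grouped into flows. A minimal sketch, assuming each packet record carries the five key fields listed above and that a flow is identified by its protocol plus the unordered pair of IP addresses (the `Packet` type, the flow key, and the function names are illustrative assumptions, not taken from the patent):

```python
from collections import defaultdict
from typing import NamedTuple

class Packet(NamedTuple):
    # Hypothetical record holding the five key fields named in step 12.
    arrival_time: float  # seconds
    protocol: str
    src_ip: str
    dst_ip: str
    size: int            # bytes

def flow_key(pkt: Packet) -> tuple:
    # Assumption: both directions of a conversation belong to one flow,
    # so the two addresses are sorted before being used as a key.
    a, b = sorted((pkt.src_ip, pkt.dst_ip))
    return (pkt.protocol, a, b)

def first_packets_per_flow(packets, limit=45):
    """Group packets into flows and keep only the first `limit` of each flow."""
    flows = defaultdict(list)
    for pkt in sorted(packets, key=lambda p: p.arrival_time):
        bucket = flows[flow_key(pkt)]
        if len(bucket) < limit:
            bucket.append(pkt)
    return dict(flows)
```

Packets are ordered by arrival time before grouping, so each flow keeps its earliest 45 packets regardless of the order in which the records were read.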
In step 13, statistical features are calculated from the key information of the first 45 packets of each flow in the network traffic data; there are 116 statistical features in total.
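The exact list of statistics is given in the patent's tables, which are not reproduced here. The sketch below therefore uses a small illustrative subset (mean, standard deviation, minimum, maximum, median) over two sequences, packet sizes and inter-arrival gaps; the choice of statistics and the function names are assumptions made only to show the shape of the computation:

```python
import statistics

def sequence_statistics(values):
    """Summarise one per-flow sequence (e.g. packet sizes) with a few statistics.

    The statistics chosen here are illustrative only; they are not the
    patent's own list of 17 statistics.
    """
    return {
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),
        "min": min(values),
        "max": max(values),
        "median": statistics.median(values),
    }

def flow_features(arrival_times, sizes, limit=45):
    """Build a flat feature dict from the first `limit` packets of a flow.

    Assumes the flow has at least two packets, so that at least one
    inter-arrival gap exists.
    """
    times = arrival_times[:limit]
    sizes = sizes[:limit]
    # Inter-arrival gaps between consecutive packets (one fewer than packets).
    gaps = [b - a for a, b in zip(times, times[1:])]
    features = {}
    for name, seq in (("size", sizes), ("gap", gaps)):
        for stat, value in sequence_statistics(seq).items():
            features[f"{name}_{stat}"] = value
    return features
```

Applying the same set of statistics to every sequence is what lets the feature count grow as a simple product of sequences and statistics.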
Features are correlated with one another: the more strongly two features are correlated, the more similar they are, and the more likely categories are to be confused during classification. Step 14 therefore uses the Pearson correlation coefficient to compute the correlation between features and retains 90 statistical features whose correlation is less than 0.9. The purpose of this step is to reduce the feature dimension and the computational complexity while improving classification accuracy.
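The Pearson correlation coefficient is calculated as:
$$\rho_{X,Y} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$$
where X and Y denote the value sequences of two different statistical features across the samples, n is the number of samples, and $\bar{X}$ and $\bar{Y}$ are the means of the two sequences.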
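The text gives the 0.9 threshold but not the selection procedure itself. One plausible reading, sketched below, is a greedy filter that keeps a feature only if its absolute correlation with every feature already kept is below the threshold; the greedy order, the use of the absolute value, and the handling of constant features are all assumptions:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # Assumption: treat a constant feature as uncorrelated.
    return cov / math.sqrt(var_x * var_y)

def select_features(columns, threshold=0.9):
    """Greedily keep features whose |r| with every kept feature is below threshold.

    `columns` maps a feature name to its list of values across samples.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept
```

For example, with `{"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [1, -1, 1, -1]}` the feature `b` is perfectly correlated with `a` and is dropped, while `c` is kept.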
The step 2 specifically comprises the following steps:
step 21: converting key statistical features of the unlabeled data into an unlabeled data vector matrix as input of a pre-training model;
step 22: building a pre-training model, setting the initial learning rate of the pre-training model to be 0.001, setting the batch size to be 32 and setting the iteration times to be 150;
step 23: inputting the unlabeled data vector matrix in the step 21 into a pre-training model for pre-training;
step 24: saving the parameters of each layer of the pre-training model after the pre-training is finished.
Specifically, in step 21, the 90 key statistical features of each of the many unlabeled samples are converted into an unlabeled data vector matrix of size 2×45, which makes them convenient to input into the pre-training model.
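The conversion from 90 features to a 2×45 matrix can be sketched as a plain reshape. The row-major ordering (first 45 features in the first row) and the function name are assumptions, since the text states only the target size:

```python
def to_matrix(features, rows=2, cols=45):
    """Reshape a flat list of rows*cols features into a rows x cols matrix."""
    if len(features) != rows * cols:
        raise ValueError(f"expected {rows * cols} features, got {len(features)}")
    # Row r holds the features at positions r*cols .. (r+1)*cols - 1.
    return [features[r * cols:(r + 1) * cols] for r in range(rows)]
```

The length check makes a mismatch between the number of selected features and the expected input size fail early rather than produce a ragged matrix.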
Referring to fig. 3, the pre-trained model built in step 22 does not include a Softmax layer for classification.
In step 24, once the pre-training model has been trained, it can predict the feature distribution of the network traffic data.
The pre-training model in the invention is a one-dimensional convolutional neural network model.
The step 3 specifically comprises the following steps:
step 31: dividing the labeled data into a training set and a test set at a ratio of 7:3, and converting the key statistical features of the training set into a training set vector matrix that serves as input to the retraining model;
step 32: building a retraining model, and migrating the pre-training model trained in step 24 to the retraining model with more linear layers;
step 33: setting the initial learning rate of the retraining model to be 0.001, the batch size to be 64 and the iteration number to be 100;
step 34: the trained retraining model is used as a classifier for classifying network traffic.
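The hyperparameters of the two training stages can be collected in one place. A small sketch, assuming the "iteration number" in the text means the number of passes over the data (epochs); the `TrainingConfig` type and its method are hypothetical names introduced for illustration:

```python
from dataclasses import dataclass
import math

@dataclass(frozen=True)
class TrainingConfig:
    learning_rate: float
    batch_size: int
    epochs: int  # the text's "iteration number", read here as epochs (assumption)

    def steps_per_epoch(self, n_samples: int) -> int:
        # Number of mini-batches needed to cover the data once.
        return math.ceil(n_samples / self.batch_size)

# Values taken from steps 22 and 33 of the text.
PRETRAIN = TrainingConfig(learning_rate=0.001, batch_size=32, epochs=150)
RETRAIN = TrainingConfig(learning_rate=0.001, batch_size=64, epochs=100)
```

Both stages share the same initial learning rate; retraining uses a larger batch size and fewer passes than pre-training.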
Step 31 converts the 90 key statistical features of each labeled training sample into a labeled data vector matrix of size 2×45, which makes them convenient to input into the retraining model. Because the retraining model is a one-dimensional convolutional neural network that is tested as it is trained, the labeled data is divided into a training set and a test set at a ratio of 7:3. The training set is used as training input, and the test set is used to test the whole retraining model during retraining, which ensures the reliability and accuracy of the retraining model.
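The 7:3 division can be sketched as a shuffled split. Shuffling before splitting, the fixed seed, and the absence of per-class stratification are assumptions; the text specifies only the ratio:

```python
import random

def split_train_test(samples, train_fraction=0.7, seed=0):
    """Shuffle labeled samples and split them into training and test sets."""
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

With 100 labeled samples this yields 70 training samples and 30 test samples, and every sample lands in exactly one of the two sets.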
Referring to fig. 4, in step 32 two fully connected layers are added to the retraining model to reduce the dimensionality of the statistical features, and one Softmax layer is added to obtain the final classification result.
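The migration idea can be illustrated without any deep learning framework: parameters saved from the pre-training model are copied into the layers the retraining model shares with it, while the newly added fully connected layers start from fresh values, and a Softmax turns the final scores into class probabilities. The layer names and the dictionary representation below are hypothetical; the actual layer structure is shown only in the patent's figures:

```python
import math

def migrate_parameters(pretrained, retrain_init):
    """Copy every pre-trained layer into the retraining model's parameters.

    Layers that exist only in the retraining model (the added fully connected
    layers) keep their fresh initial values.
    """
    merged = dict(retrain_init)
    for layer, weights in pretrained.items():
        if layer in merged:
            merged[layer] = weights
    return merged

def softmax(logits):
    """Turn the final layer's scores into class probabilities."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

In this sketch, a pre-trained dictionary with keys `conv1` and `conv2` merged into an initial dictionary that also has `fc1` and `fc2` overwrites only the two convolutional entries.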
In order to illustrate the effects of the above-described aspects of the present invention, a description will be given below with reference to specific examples.
Fig. 5 shows the traffic classification procedure. First, network traffic is captured with the Wireshark tool and saved as a pcap file. To support real-time classification, only the key information of the first 45 packets in the network traffic data is extracted, namely packet size, packet arrival time, source IP address, destination IP address, and transport protocol; the statistical features computed from the key information of these first 45 packets are taken to represent the whole of the network traffic data. The 116 statistical features comprise two parts. The first part covers four sequences (the packet arrival time sequence, the packet size sequence, the packet difference sequence, and the timestamp sequence), with 17 statistical features computed for each sequence as listed in Table 1, giving 4 × 17 = 68 statistical features. The second part consists of 48 statistical features computed after dividing the traffic by packet direction into uplink traffic, downlink traffic, and all packets, as listed in Table 2. Features are correlated with one another: the more strongly two features are correlated, the more similar they are, and the more likely categories are to be confused during classification. The Pearson correlation coefficient is therefore used to compute the correlation between features, and 90 statistical features whose correlation is less than 0.9 are retained. The purpose of this step is to reduce the feature dimension and the computational complexity while improving classification accuracy.
Table 1. The 17 statistical features
Table 2. The 48 statistical features
Next, the statistical features representing a large number of unlabeled old samples are used as input to the pre-training model, which is a one-dimensional convolutional neural network. Once trained, the pre-training model can predict the feature distribution of the whole of the network traffic data. It is then migrated to a new model with more linear layers, namely the retraining model, which is quickly retrained using the statistical features of a small number of labeled new samples. After retraining, the new model can perform the network traffic classification task.
The invention was verified on the ISCX-nonVPN dataset, the 2013 Nanjing University of Posts and Telecommunications video dataset (Video13), and the 2019 Nanjing University of Posts and Telecommunications video dataset (Video19). The ISCX-nonVPN dataset is a public dataset containing traffic data of six application classes: email, file transfer, streaming, text chat, voice telephony, and P2P. The Video13 and Video19 datasets are network video datasets collected on the campus network of Nanjing University of Posts and Telecommunications in 2013 and 2019 respectively, and the feature distributions of the two datasets differ. The categories of the Video13 dataset are: VoD super definition, VoD high definition, VoD standard definition, live video, conversational video, and P2P video. The categories of the Video19 dataset are: on-demand 480P, on-demand 720P, live 480P, and live 720P. Because the two datasets were collected 6 years apart, the Video13 dataset can to some extent be regarded as an outdated old dataset. The verification results are as follows:
As shown in Table 3, without pre-training, the Video19 dataset reached an accuracy of 95.91% using a total of 16574 samples; with pre-training on the ISCX and Video13 datasets, Video19 reached accuracies of 95.6% and 98.31% respectively using a total of 8332 samples. Thus, when retraining with the invention, Video19 needs only about 50% of the samples for its overall accuracy to approach or exceed that of fully supervised training on all the samples, which verifies the effectiveness of the invention.
Table 3. Performance of the invention on different pre-training datasets
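As a quick check of the figures quoted for this comparison (the numbers are taken from the text; only the arithmetic is added here):

$$\frac{8332}{16574} \approx 0.5027, \qquad 98.31\% - 95.91\% = 2.40 \text{ percentage points}, \qquad 95.6\% - 95.91\% = -0.31 \text{ percentage points}$$

So retraining uses roughly half of the Video19 samples. Pre-training on Video13 raises accuracy by 2.40 points over training without pre-training, while pre-training on ISCX falls 0.31 points short of it.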
The Video19 dataset was mixed into Video13 at ratios of 20%, 40%, and 60% and then verified; the results are shown in Table 4. As the proportion of new data in the pre-training dataset increases, the model has more new data to learn from during pre-training, so the overall accuracy increases. Then, to simulate concept drift, non-video traffic from the ISCX dataset (the file transfer and email classes) was added to the Video19 dataset to be identified. The experiment shows that the invention makes better use of the pre-training model to improve classification when concept drift occurs, so the new model obtained by migrating the pre-training model classifies new data more accurately.
Table 4. Overall accuracy of the invention and of different methods on the mixed pre-training datasets
In summary, by extracting statistical features of network traffic data, screening them with the Pearson correlation coefficient, training a pre-training model and saving its parameters, and finally retraining with a retraining model, the invention addresses three problems of existing network traffic classification methods: large labeled datasets are difficult to collect and annotate, outdated datasets tend to go to waste, and classification accuracy drops when the samples undergo concept drift.
The above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the spirit and scope of the technical solution of the present invention.

Claims (7)

1. A semi-supervised network traffic classification method based on transfer learning is characterized by comprising the following steps:
step 1: extracting statistical features of network traffic data and screening them using the Pearson correlation coefficient; the method specifically comprises the following steps:
step 11: capturing network traffic data by software, the network traffic data comprising unlabeled data and labeled data, the number of unlabeled data being greater than the number of labeled data;
step 12: extracting key information of each data packet in the network flow data;
step 13: calculating to obtain statistical characteristics according to key information of the first 45 data packets of each flow in the network flow data;
step 14: selecting, by means of the Pearson correlation coefficient, the key statistical features that can effectively identify network traffic categories from the statistical features of each flow, and dividing the key statistical features into a labeled part and an unlabeled part;
step 2: inputting the statistical characteristics after screening into a pre-training model and storing the parameters learned by the pre-training model; the method specifically comprises the following steps:
step 21: converting the key statistical features of the unlabeled data into an unlabeled data vector matrix as input of the pre-training model;
step 22: building the pre-training model, setting the initial learning rate of the pre-training model to be 0.001, setting the batch size to be 32 and setting the iteration number to be 150;
step 23: inputting the unlabeled data vector matrix in the step 21 into the pre-training model for pre-training;
step 24: after the pre-training is finished, the parameters of each layer of the pre-training model are stored;
step 3: migrating the pre-training model to a retraining model with more linear layers, retraining the retraining model to obtain a classifier for classifying network traffic; the method specifically comprises the following steps:
step 31: dividing the labeled data into a training set and a test set at a ratio of 7:3, and converting the key statistical features of the training set into a training set vector matrix that serves as input to the retraining model;
step 32: building the retraining model, and migrating the trained pre-training model in the step 24 to the retraining model with more linear layers;
step 33: setting the initial learning rate of the retraining model to be 0.001, the batch size to be 64 and the iteration number to be 100;
step 34: using the trained retraining model as a classifier for classifying network traffic.
2. The semi-supervised network traffic classification method based on transfer learning as set forth in claim 1, wherein: there are 116 statistical features described in step 1.
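3. The semi-supervised network traffic classification method based on transfer learning as recited in claim 2, wherein: 90 statistical features whose correlation is less than 0.9 are selected by means of the Pearson correlation coefficient.
4. The semi-supervised network traffic classification method based on transfer learning as recited in claim 3, wherein: the Pearson correlation coefficient is calculated as follows:
$$\rho_{X,Y} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$$
where X and Y denote the value sequences of two different statistical features across the samples, n is the number of samples, and $\bar{X}$ and $\bar{Y}$ are the means of the two sequences.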
5. The semi-supervised network traffic classification method based on transfer learning as set forth in claim 1, wherein: the key information in step 12 includes packet arrival time, packet protocol, packet source IP address, packet destination IP address, and packet size.
6. The semi-supervised network traffic classification method based on transfer learning as set forth in claim 1, wherein: the pre-training model does not contain a Softmax layer for classification, and the final output of the pre-training model is the statistical features.
7. The semi-supervised network traffic classification method based on transfer learning as set forth in claim 1, wherein: two fully connected layers are added to the retraining model to reduce the dimensionality of the statistical features, together with a Softmax layer for obtaining the final classification result.
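The migration described in claims 1, 6, and 7 can be sketched in plain Python as below. The claims do not specify the pre-training model's architecture here, so the stack of linear layers, the ReLU activations, every layer width other than the 90 input features of claim 3, the class count, and all function names are assumptions chosen only to show the structure: pre-trained layers are copied unchanged, then two new fully connected layers and a Softmax are appended.

```python
import copy
import math
import random


def make_linear(n_in, n_out, rng):
    """Create a linear layer as a dict of small random weights and zero biases."""
    return {
        "w": [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)],
        "b": [0.0] * n_out,
    }


def linear(layer, x):
    """Apply y = W x + b for one input vector."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(layer["w"], layer["b"])]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(x):
    m = max(x)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def build_retraining_model(pretrained_layers, hidden, n_classes, rng):
    """Copy the pre-trained layers, then append two new fully connected layers."""
    layers = copy.deepcopy(pretrained_layers)  # migrated parameters
    feat_dim = len(layers[-1]["b"])
    layers.append(make_linear(feat_dim, hidden, rng))   # new layer 1
    layers.append(make_linear(hidden, n_classes, rng))  # new layer 2
    return layers


def classify(layers, x):
    """Forward pass: ReLU between layers, Softmax on the last layer's output."""
    for layer in layers[:-1]:
        x = relu(linear(layer, x))
    return softmax(linear(layers[-1], x))


rng = random.Random(0)
# Hypothetical pre-trained model: 90 features -> 64 -> 32, with no Softmax layer.
pretrained = [make_linear(90, 64, rng), make_linear(64, 32, rng)]
model = build_retraining_model(pretrained, hidden=16, n_classes=5, rng=rng)
probs = classify(model, [rng.random() for _ in range(90)])
```

Copying the stored parameters rather than sharing them keeps the saved pre-training model intact while the retraining model is updated on the labeled training set.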
CN202210415447.XA 2022-04-18 2022-04-18 Semi-supervised network traffic classification method based on transfer learning Active CN114884894B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210415447.XA CN114884894B (en) 2022-04-18 2022-04-18 Semi-supervised network traffic classification method based on transfer learning

Publications (2)

Publication Number Publication Date
CN114884894A (en) 2022-08-09
CN114884894B (en) 2023-10-20

Family

ID=82671571

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210415447.XA Active CN114884894B (en) 2022-04-18 2022-04-18 Semi-supervised network traffic classification method based on transfer learning

Country Status (1)

Country Link
CN (1) CN114884894B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115619192B (en) * 2022-11-10 2023-10-03 国网江苏省电力有限公司物资分公司 Mixed relation extraction method oriented to demand planning rules

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020119662A1 (en) * 2018-12-14 2020-06-18 深圳先进技术研究院 Network traffic classification method
CN112235264A (en) * 2020-09-28 2021-01-15 国家计算机网络与信息安全管理中心 Network traffic identification method and device based on deep transfer learning
CN113705712A (en) * 2021-09-02 2021-11-26 广州大学 Network traffic classification method and system based on federated semi-supervised learning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
IL260986B (en) * 2018-08-05 2021-09-30 Verint Systems Ltd System and method for using a user-action log to learn to classify encrypted traffic
US20200104710A1 (en) * 2018-09-27 2020-04-02 Google Llc Training machine learning models using adaptive transfer learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
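Semi-supervised network traffic classification method based on support vector machines; 李平红; 王勇; 陶晓玲; Journal of Computer Applications (Issue 06); 1515-1518 *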

Also Published As

Publication number Publication date
CN114884894A (en) 2022-08-09

Similar Documents

Publication Publication Date Title
CN110730140B (en) Deep learning flow classification method based on combination of space-time characteristics
CN111340191B (en) Bot network malicious traffic classification method and system based on ensemble learning
CN110012029B (en) Method and system for distinguishing encrypted and non-encrypted compressed flow
WO2021103135A1 (en) Deep neural network-based traffic classification method and system, and electronic device
WO2020119662A1 (en) Network traffic classification method
CN110796196B (en) Network traffic classification system and method based on depth discrimination characteristics
CN110808945B (en) Network intrusion detection method in small sample scene based on meta-learning
CN112564974B (en) Deep learning-based fingerprint identification method for Internet of things equipment
US20130013542A1 (en) Scalable traffic classifier and classifier training system
CN112671757A (en) Encrypted flow protocol identification method and device based on automatic machine learning
CN112511555A (en) Private encryption protocol message classification method based on sparse representation and convolutional neural network
CN110808971A (en) Deep embedding-based unknown malicious traffic active detection system and method
CN113989583A (en) Method and system for detecting malicious traffic of internet
CN109981474A (en) A kind of network flow fine grit classification system and method for application-oriented software
CN114884894B (en) Semi-supervised network traffic classification method based on transfer learning
CN110034966B (en) Data flow classification method and system based on machine learning
CN111611280A (en) Encrypted traffic identification method based on CNN and SAE
CN114826776B (en) Weak supervision detection method and system for encrypting malicious traffic
CN114915575B (en) Network flow detection device based on artificial intelligence
CN114567487A (en) DNS hidden tunnel detection method with multi-feature fusion
CN116405419A (en) Unknown network protocol classification method based on small sample learning
CN114650229A (en) Network encryption traffic classification method and system based on three-layer model SFTF-L
Zhou et al. Encrypted network traffic identification based on 2d-cnn model
CN114095447A (en) Communication network encrypted flow classification method based on knowledge distillation and self-distillation
CN117318980A (en) Small sample scene-oriented self-supervision learning malicious traffic detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant