CN116319583A - Encryption network traffic classification method based on GCNN and MoE - Google Patents
- Publication number
- CN116319583A (application number CN202310207576.4A)
- Authority
- CN
- China
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/24—Traffic characterised by specific attributes, e.g. priority or QoS
- H04L47/2441—Traffic characterised by specific attributes, e.g. priority or QoS relying on flow classification, e.g. using integrated services [IntServ]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1425—Traffic logging, e.g. anomaly detection
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Abstract
The invention belongs to the field of computer artificial intelligence, and in particular relates to an encrypted network traffic classification method based on GCNN and MoE, comprising the following steps: dividing the traffic data of a mobile application over a period of time into a plurality of traffic blocks of the same length; converting the traffic blocks into graph datasets with node features and edge weights; constructing an encrypted network traffic classification model for mobile applications based on a graph convolutional neural network (GCNN) and a mixture-of-experts (MoE) system, and training the model; and inputting the graph dataset of the data to be tested into the encrypted network traffic classification model to obtain a classification result. The invention achieves higher classification performance and solves the problems of low classification accuracy and poor performance in traditional machine learning methods and conventional neural network models such as CNNs and RNNs.
Description
Technical Field
The invention belongs to the field of computer artificial intelligence, and particularly relates to an encryption network traffic classification method based on GCNN and MoE.
Background
With the continuing development of Internet communication technology in recent years, the popularity of communication technologies including 5G has driven remarkable growth in intelligent and mobile devices. It is widely predicted that by 2023 the number of Internet of Things (IoT) devices, including smartphones, will reach tens of billions, and networks have become part of people's work and lives. In today's network management systems, network traffic classification is a critical task whose main objective is to predict the protocol and application type of network data flows.
In recent years, with rapidly growing requirements for protecting the privacy and security of transmitted data and users, more and more application protocols have begun to transmit data using encryption; the proportion of encrypted traffic in the network has risen sharply, and encryption technology has become increasingly complex. The classification of encrypted traffic has been one of the most important directions in network security since the advent of the Internet, but owing to the popularity of encryption and the high-speed growth of network throughput, fast and accurate classification of encrypted traffic is becoming increasingly difficult. On the other hand, encryption also increases the possibility of various kinds of malicious and abnormal network traffic, and hacking attacks use encryption to carry out a great deal of malicious activity. Therefore, when a large amount of encrypted traffic appears in the network, quickly classifying it and then performing refined traffic analysis is very important.
Most existing mobile application classification works attempt to overcome the challenges of encrypted traffic. For example, the AppScanner approach uses a flow-based detection method that extracts side-channel features from packet headers and computes statistical features to train a machine learning model for mobile application classification. Likewise, the FlowPrint method constructs a fingerprint of an application by considering the communication graph between a mobile device and other destinations (e.g., CDNs and third-party services) and related attributes (e.g., destination IP, destination port, and TLS certificate). In the inference phase, fingerprints collected in the past are compared with new fingerprints to determine the application. However, because building a communication graph of all possible behaviors of an application is challenging, only short communication windows are considered; thus, if the user changes usage behavior or uses a different function of the application, the method may not work properly.
Surveying the current state of deep-learning-based network traffic classification research, several challenges remain when classifying network traffic with deep learning methods:
1. Over 80% of mobile traffic is encrypted or uses Transport Layer Security (TLS), so traffic cannot be classified with payload-based methods that analyze particular fields of the application-layer protocol;
2. Port-based classification methods cannot classify mobile traffic, because applications mainly use HTTPS to transfer data and send data back and forth in text formats such as XML or JSON; some information needed for web-page-style classification (such as the number or size of files) is unavailable;
3. User behavior varies dynamically over time depending on the functions used; traffic captured within a short time (e.g., 5 minutes) of a mobile application's use may not represent its complete traffic behavior.
Disclosure of Invention
In order to solve the problems, the invention provides an encryption network traffic classification method based on GCNN and MoE, which specifically comprises the following steps:
S1, dividing the traffic data of a mobile application over a period of time into a plurality of traffic blocks of the same length;
S2, converting the traffic blocks into a graph dataset with node features and edge weights;
S3, constructing an encrypted network traffic classification model for mobile applications based on the graph convolutional neural network GCNN and a mixture-of-experts system, and training the model;
S4, inputting the graph dataset of the data to be tested into the encrypted network traffic classification model to obtain the classification result.
Further, when dividing the traffic data of a mobile application over a period of time into traffic blocks of the same length, a duration and an overlap time are set, and the traffic blocks are divided by the duration and the overlap time. Specifically: the length of each traffic block is set to the duration, and except for the first and last traffic blocks, each traffic block overlaps its previous traffic block by the overlap time and also overlaps its next traffic block by the overlap time.
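The duration-and-overlap division described above can be sketched as a sliding time window over captured packets; the helper below is a minimal illustration (the packet tuple format and the parameter names `t_duration`/`t_overlap` are assumptions, not the patent's implementation):

```python
def split_traffic_blocks(packets, t_duration, t_overlap):
    """Split a list of (timestamp, payload) packets into equal-length,
    overlapping traffic blocks. The window step is t_duration - t_overlap."""
    if not packets:
        return []
    step = t_duration - t_overlap
    start = min(ts for ts, _ in packets)
    end = max(ts for ts, _ in packets)
    blocks = []
    while start <= end:
        block = [p for p in packets if start <= p[0] < start + t_duration]
        if block:
            blocks.append(block)
        start += step
    return blocks

# toy trace: one packet per second for 10 seconds
trace = [(t, b"pkt") for t in range(10)]
blocks = split_traffic_blocks(trace, t_duration=4, t_overlap=2)
```

With a 4-second window and 2-second overlap over this 10-second toy trace, consecutive blocks share exactly two seconds of packets.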
Further, the process of converting traffic blocks into graph datasets with node features and edge weights comprises the following steps:
removing the DNS protocol traffic from the traffic block;
obtaining the IP addresses and port numbers in the mobile application and merging each IP address with its port number;
constructing the graph data of the mobile application: obtaining the maximum number of nodes N required for one MApp graph, and generating all graph data of each MApp according to the weights between node pairs;
storing all the graph data of each MApp in two CSV files: the node features in a features.csv file and the inter-node weights in a weights.csv file.
Further, the encrypted network traffic classification model for mobile applications based on the graph convolutional neural network GCNN and the mixture-of-experts system comprises several cascaded GCN layers, a SortPooling layer, an Expert network, and a softmax layer. The graph latent representations output by the four cascaded GCN layers are input to the SortPooling layer, which selects the largest K values of the graph latent representation; the Expert network comprises several Expert units, and the selected graph latent representation is input to each of the Expert units; the products of each Expert unit's output and its corresponding weight are accumulated and then input to the softmax layer, which produces the classification result.
Further, if there are L GCN layers, the output of the (l+1)-th GCN layer is expressed as:

Z^{l+1} = \sigma(\tilde{D}^{-1} \tilde{A} Z^{l} W^{l})

where Z^{l+1} \in R^{n \times c_{l+1}} is the output of the (l+1)-th graph convolutional layer; c_l is the number of features extracted per graph node at the l-th layer; n is the number of nodes; l = 0, ..., L-1; Z^{0} = X, with X \in R^{n \times c} the node feature matrix and c the number of node features; \tilde{D} is the diagonal degree matrix of the graph; \tilde{A} is the adjacency matrix with added self-loops; W^{l} \in R^{c_l \times c_{l+1}} is the trainable parameter of the l-th layer; and \sigma is a nonlinear activation function.
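A single propagation step of this form can be sketched in NumPy (tanh is used as the activation, matching the detailed description; the toy graph and feature values are illustrative):

```python
import numpy as np

def gcn_layer(A, Z, W):
    """One graph-convolution step: Z_next = tanh(D~^-1 A~ Z W),
    where A~ = A + I adds self-loops and D~ is the diagonal degree matrix."""
    n = A.shape[0]
    A_tilde = A + np.eye(n)
    D_tilde_inv = np.diag(1.0 / A_tilde.sum(axis=1))
    return np.tanh(D_tilde_inv @ A_tilde @ Z @ W)

# toy graph: 4 nodes, c = 3 input features, c_{l+1} = 2 output features
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
X = rng.normal(size=(4, 3))   # node feature matrix
W = rng.normal(size=(3, 2))   # trainable layer parameters
Z1 = gcn_layer(A, X, W)       # latent representation, shape (4, 2)
```

Each output row mixes a node's own features with those of its neighbors, which is the propagation behavior the equation above describes.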
Further, a graph is denoted by G = (V, E), where V is the set of nodes in the graph data and E is the set of edges. If A is the adjacency matrix of the graph data, the adjacency matrix with added self-loops is expressed as:

\tilde{A} = A + I

where I is the identity matrix.
Further, the edge relationship between nodes is established through the cross-correlation between nodes: if the cross-correlation between two nodes is non-zero, an edge exists between them, and the edge weight is their cross-correlation. The calculation of the cross-correlation between two nodes comprises the following steps:

generating graph nodes from the traffic captured in a given time window, and dividing the given time window into T slices of equal duration;

for each slice, recording whether the mobile application sends at least one traffic packet to, or receives one from, the service deployed on each node's destination IP address and port, and accumulating these joint activities as the cross-correlation between two nodes, expressed as:

C_{i,j} = \sum_{t=1}^{T} r_i(t) r_j(t)

where C_{i,j} is the cross-correlation between node i and node j, and r_i(t) is a binary variable indicating whether node i is active in time slice t: r_i(t) = 1 when node i is active, otherwise r_i(t) = 0.
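With the activity indicators arranged as a binary node-by-slice matrix R (where R[i, t] = r_i(t)), the entire cross-correlation matrix is simply R Rᵀ — a small sketch with made-up activity data:

```python
import numpy as np

# r_i(t): rows are nodes, columns are T = 4 time slices (toy values)
R = np.array([[1, 0, 1, 1],   # node 0 active in slices 0, 2, 3
              [1, 1, 0, 1],   # node 1 active in slices 0, 1, 3
              [0, 1, 0, 0]])  # node 2 active in slice 1 only

# C[i, j] = sum over t of r_i(t) * r_j(t)
C = R @ R.T
```

Nodes 0 and 1 are jointly active in two slices, so C[0, 1] = 2 and an edge of weight 2 would be created between them; nodes 0 and 2 are never jointly active, so no edge is created.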
Further, the Adam algorithm is used to optimize the encrypted network traffic classification model for mobile applications based on the graph convolutional neural network GCNN and the mixture-of-experts system. The optimization process comprises the following steps:
obtaining historical data as a training dataset and slicing it, i.e., dividing the training dataset into several mutually independent and orthogonal sub-datasets, each sub-dataset being one slice;
during training, inputting each slice into the classification model to obtain a prediction result, and training according to the prediction result and the label corresponding to the training data;
optimization is complete when the loss converges or the maximum number of training iterations is reached.
Further, in the process of optimizing with training data, a logistic loss function is used for backpropagation to optimize the model, expressed as:

\ell_n(\Theta, W) = \frac{1}{n} \sum_{i=1}^{n} \psi(y_i \cdot F(x_i; \Theta, W))

where \ell_n(\Theta, W) is the logistic loss of the network; \Theta = (\theta_1, ..., \theta_M) \in R^{d \times M} parameterizes the gating network, \theta_m being the gating parameter vector of the m-th Expert unit, d the input dimension of an Expert network, and M the number of Expert units in the Expert network; n is the number of training samples; y_i is the label of the i-th training sample; F(x_i; \Theta, W) is the output of the Expert network, with x_i the i-th training sample and W the set of Expert weights; and \psi is the logistic loss term, expressed as \psi(z) = \log(1 + \exp(-z)).
Further, the output F (x i The method comprises the steps of carrying out a first treatment on the surface of the Θ, W) is expressed as:
wherein ,is the set of selected indices, pi m (x; Θ) is the gating value of the mth Expert network, h m (x; Θ) is the output of the mth Expert network, [ P ]]For a sliced dataset, [ M ]]Gating a set of networks for the expert; f (f) m (x; W) is the output of the mth Expert network, [ J ]]Representing the set of filters, σ (·) is the activation function, P represents the number of training data in a sliced dataset, w m,j Weight vector, x representing the jth filter in the mth Expert network (p) Representing the input data as the p-th data in the sliced data set; x represents input data.
The invention takes into account the limitations of a single graph convolutional neural network GCNN model for encrypted traffic classification. To better identify and classify encrypted traffic, a MoE expert network is added to the graph convolutional network GCN structure, splitting the single GCNN model into multiple Expert networks that train and predict simultaneously and then judge jointly. Through the combined classification judgment of GCNN and MoE, the accuracy of classifying and identifying encrypted network traffic is improved, higher classification performance is achieved, and the problems of low classification accuracy and poor performance of traditional machine learning methods and conventional neural network models such as CNNs and RNNs are solved.
Drawings
FIG. 1 is an overall flow chart of the present invention;
FIG. 2 is a schematic diagram of the traffic block splitting of the present invention;
FIG. 3 is a model diagram of the encrypted traffic classification method based on DGCNN and MoE according to the present invention;
Fig. 4 is a structural diagram of the Expert network of the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention provides an encryption network traffic classification method based on GCNN and MoE, which specifically comprises the following steps:
s1, dividing flow data of a mobile application program (Mobile Application, MApp) within a period of time into flow blocks with the same length;
s2, converting the flow block into a graph dataset with node characteristics and edge weights;
s3, constructing an encrypted network traffic classification model of a mobile application program based on a graph rolling neural network GCNN and a hybrid expert system, and training the model;
s4, inputting the graph dataset of the data to be tested into the encrypted network flow classification model to obtain a classification result.
In this embodiment, for the traffic data of a given MApp, one long-duration traffic block of the MApp is split into several shorter-duration traffic blocks of the same MApp using the duration T_duration and the overlap time T_overlap. As shown in FIG. 2, the split MApp traffic blocks have the same length T_duration, and each traffic block overlaps its previous block by T_overlap and its next block by T_overlap. If T_duration is short enough, no T_overlap needs to be set; whether T_duration is short enough is judged empirically by those skilled in the art. Dividing the traffic blocks specifically comprises the following steps:
step 1.1, acquiring original flow data by using a flow acquisition tool Wireshark and the like, wherein a sample data set is encrypted flow data of MApp in an original form, and processing the data;
step 1.2, firstly splitting the original MApp flow data block according to the duration T duration And overlap time T overlap Splitting a flow block of MApp with long duration into a plurality of small blocks with the same length and shorter duration;
step 1.3, the MApp flow block data set is stored as a csv format file after being split.
In this embodiment, each (IP:Port) pair obtained by combining an IP address with a port number is defined as one node in the graph data; that is, a node is a combination of an IP address and a port number. The optimal number of nodes N per graph is 20. If N is set higher, all graphs with fewer nodes must zero-pad their feature vectors (in the case of MLP) or latent representation vectors (in the case of the present invention), and zero-valued features may mislead the model's learning; if fewer nodes are used, useful information may be lost from the discarded nodes, hurting model performance. Most graphs have about 10 nodes; 90% of graphs have fewer than 35 nodes and 86% fewer than 30; as the number of nodes used grows, performance increases up to the optimum N and then decreases again. It is worth noting that the optimal N differs between the MLP and the present invention: the invention requires more information about the graph topology to distinguish mobile applications, and shows better performance than the MLP in various experimental scenarios.
Generating graph data used as a deep neural network model from the acquired flow data blocks of each MApp, specifically comprising the following steps:
Step 2.1, remove the DNS protocol traffic from the data block;
Step 2.2, obtain the IP addresses and port numbers in the MApp and merge each IP address with its port number (the same tuple (IP, port number) corresponds to the same network destination);
Step 2.3, using a traffic block of a given MApp as a data frame, generate all graph data of each MApp by constructing the maximum number of nodes N required for one MApp graph and the weights between node pairs; all graph data of each MApp are stored in two CSV files: a features.csv file containing the node features and a weights.csv file containing the inter-node weights.
Step 2.4, edges between nodes are established by cross-correlation: given the traffic captured in a time window that yields the graph nodes described above, the window is further divided into slices of a predefined duration t_slice. Let T be the number of slices. During each slice, a node (a pair of destination IP address and port number) is considered active if the MApp sends or receives at least one traffic packet to the service deployed on that destination IP address and port number. Let r_i(t) be a binary variable indicating whether node i is active in time slice t: r_i(t) = 1 when node i is active, otherwise r_i(t) = 0. Over the T slices, the cross-correlation of two nodes i and j is defined as:

C_{i,j} = \sum_{t=1}^{T} r_i(t) r_j(t)
by adopting cross correlation, establishing the relationship of edges between nodes, and correspondingly setting the weight of the edges, specifically: if C i,j Not equal to 0, an edge is established between two nodes i and j, and the weight is C i,j 。
To avoid feature bias when the graph data are fed into the neural network classification model for training and prediction, min-max scaling is used to normalize C_{i,j} into the range [0, 1]. Min-max normalization is mathematically defined as:

x' = \frac{x - \min}{\max - \min}

where x' is the normalized value of a single datum and x is its value before normalization; min is the minimum of the column in which the datum lies, and max is the maximum of that column.
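Applied column-wise to the edge weights, the normalization can be sketched as follows (the guard against a constant column is an added assumption to avoid division by zero):

```python
import numpy as np

def min_max_normalize(col):
    """Scale a 1-D array into [0, 1] via (x - min) / (max - min)."""
    lo, hi = col.min(), col.max()
    if hi == lo:                 # constant column: avoid division by zero
        return np.zeros_like(col, dtype=float)
    return (col - lo) / (hi - lo)

weights = np.array([2.0, 5.0, 8.0])   # toy column of edge weights
normed = min_max_normalize(weights)
```

The smallest weight maps to 0, the largest to 1, and intermediate values scale linearly.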
Each node of the graph data needs a feature vector, i.e., node features. Since mobile applications connect to various services, each represented by a node of the graph as a tuple of IP address and port number, the traffic behavior from the mobile device to each service's server may differ in various traffic characteristics, such as packet size, packet count, and flow duration. To cover both encrypted and unencrypted traffic, information is extracted only from packet headers, without analyzing packet payloads. In addition to packet features, flow features such as the number of flows, the average number of packets per flow, and the average flow size in bytes are extracted; this embodiment considers only TCP flows and UDP flows, relying on the Wireshark tool to collect and analyze flow features. The feature vectors of all nodes in one graph form the node feature matrix X \in R^{n \times c}, where n is the number of nodes and c is the number of features per node.
An encrypted traffic classification model based on the graph convolutional neural network GCNN and the mixture-of-experts MoE is constructed: statistical and derived features of the data are extracted by the GCNN model, and the data processed by the SortPooling layer are fed by the MoE model into different MoE sub-networks for training and testing. This specifically comprises the following steps:
Step 3.1, construct the GCNN-and-MoE encrypted traffic classification framework; a cross-entropy loss function is adopted to judge how close the actual output (a probability) is to the expected output (a probability), and the Adam optimizer is adopted. Cross-entropy is mathematically computed as:

H(y, \hat{y}) = -\sum_{i} y_i \log(\hat{y}_i)

where y_i is the expected (label) probability of class i and \hat{y}_i is the predicted probability of class i.
The learning rate during training is adjusted at fixed epoch intervals: after every step_size epochs, the learning rate is adjusted to lr = lr × decay, with step_size = 10, initial lr = 0.0001, and decay = 0.9;
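This schedule is an ordinary step decay — the same behavior as PyTorch's `StepLR(step_size=10, gamma=0.9)`; a dependency-free sketch:

```python
def stepped_lr(epoch, initial_lr=0.0001, step_size=10, decay=0.9):
    """Learning rate after `epoch` epochs under step decay:
    lr is multiplied by `decay` once every `step_size` epochs."""
    return initial_lr * decay ** (epoch // step_size)

# lr at epochs 0, 9, 10, and 25
lrs = [stepped_lr(e) for e in (0, 9, 10, 25)]
```

The rate stays at 0.0001 for the first ten epochs, drops to 0.00009 at epoch 10, and to 0.000081 after the second step.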
Step 3.2, referring to FIG. 3, the GCNN network model part comprises a GCN structure formed by stacking four graph convolutional layers; the first three graph convolutional layers have size 1024, the last graph convolutional layer has size 512, and the activation function is tanh;
in the laminate part of the drawing, a drawing is givenLet a be the adjacency matrix of G, so that a is a symmetrical binary matrix, and let the graph have no self-loops. Node feature matrix is defined as-> wherein />The computing node is potentially denoted +.> wherein />Is an adjacency matrix added with self-loops, +.>Is a diagonal matrix of the graph, such that +.>Is a trainable graph convolution parameter matrix shared among nodes, sigma is a nonlinear activation function, ++>Is the output activation matrix.
Intuitively, before being fed to the GCN layers shown in FIG. 3, the graph data has nodes defined by their features and edges connecting the nodes, so a node's latent representation is affected by its neighbors. The graph convolutional layer, through its node features (XW), allows information to propagate between neighboring nodes through the product of the node features and the adjacency matrix (\tilde{A}XW). The latent representation at each layer is defined as

Z^{l+1} = \sigma(\tilde{D}^{-1} \tilde{A} Z^{l} W^{l})

where l = 0, ..., L-1; Z^{0} := X; Z^{l+1} is the output of the (l+1)-th graph convolutional layer; c_l is the number of output channels of the l-th layer (i.e., the number of features extracted per graph node at layer l); and W^{l} \in R^{c_l \times c_{l+1}} is the trainable parameter of the l-th layer. After the convolutions of all the layers described above, latent representations of the nodes of the whole graph are obtained.
Step 3.3, after the graph convolutional layers have processed the data, the resulting output is fed to a SortPooling layer, and the nodes are sorted by the sum of their node features at the L-th layer (the last layer of the graph convolution process);
if two nodes have the same value at the L-th layer, the sum of the node features at layer L-1 is used; i.e., if the feature sums of two nodes at the current layer are equal, the nodes are ordered by the feature sums of their previous layer, until the tie is broken. Since the number of nodes varies across graphs, the pooling layer also truncates or extends the graph latent representation to a predefined size: given a predefined size k of the graph latent representation, truncation is performed if the graph latent representation vector contains more than k values; otherwise, zero padding is performed. The value of k is defined heuristically from the input data; for example, k is chosen such that 90% of the graph nodes are used to construct the graph latent representation vector, to avoid losing node features in the final graph latent representation.
Referring to FIG. 3, after the MoE network model is constructed and the data have passed through the SortPooling layer, the Expert sub-network training and testing is performed, specifically comprising the following steps:
step 4.1, moE layer, consisting of a group of M "experert networks" f 1 ,...,f M And gating networks, which are typically set to be linear. Definition f m (x; W) is the output of the mth Expert network, the output of the MoE layer may beThe definition is as follows:
wherein Is the set of selected indices, pi m (x; Θ) is the gating value of the mth Expert network, its value being determined by the following definition:
for the mth expert network, this embodiment considers it as a convolutional neural network CNN structure, defined as follows:
wherein ,is the weight vector of the jth filter (i.e. neuron) in the mth Expert, j is the number of filters (i.e. neurons), d is the dimension of an Expert network; x is x (p) Representing the input data as the p-th data in the sliced data set; x represents input data; [ P ]]For a sliced dataset, [ M ]]Gating a set of networks for the expert; [ J]Representing the set of filters, σ (·) is the activation function, and P represents the number of training data in one sliced dataset.
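A gated combination of such experts can be sketched in NumPy; the top-K selection with softmax over the selected experts is one common gating choice and is stated here as an assumption, as is the use of tanh for σ:

```python
import numpy as np

def moe_forward(x, theta, W, k=2):
    """F(x) = sum over the top-K experts of pi_m(x) * f_m(x), where each
    expert is f_m(x) = (1/J) * sum_j tanh(<w_{m,j}, x>)."""
    scores = theta.T @ x                      # one gating score per expert
    top = np.argsort(-scores)[:k]             # indices of the K selected experts
    gates = np.exp(scores[top] - scores[top].max())
    gates /= gates.sum()                      # softmax over the selected experts
    out = 0.0
    for g, m in zip(gates, top):
        f_m = np.tanh(W[m] @ x).mean()        # expert output, averaged over J filters
        out += g * f_m
    return out

rng = np.random.default_rng(1)
d, M, J = 6, 4, 3
theta = rng.normal(size=(d, M))   # gating parameters, one column per expert
W = rng.normal(size=(M, J, d))    # J filters of dimension d per expert
y = moe_forward(rng.normal(size=d), theta, W)
```

Because the gates form a convex combination and each tanh-based expert outputs a value in (-1, 1), the MoE output stays in that range as well.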
In this embodiment, a Router sits at the SortPooling layer output and assigns the expert weights; it adds noise as a perturbation and uses the gradient of the perturbed empirical loss to update the weights. Denoting the random noise terms by δ and δ′ and the initialization of the weight matrix by W^(0), the perturbed empirical loss at step t is written L̃^(t), and the Router's weight update rule is defined as:

W_m^(t+1) = W_m^(t) − η · ∇_{W_m} L̃^(t) / ‖∇_{W_m} L̃^(t)‖_F

where η > 0 is the expert learning rate, ‖·‖_F denotes the Frobenius norm, and ∇_{W_m} denotes the weight update gradient at the m-th expert network.
Letting W_m denote the weight matrix of the m-th expert, we further write W = {W_m}_{m∈[M]} for the set of expert weight matrices.
Step 4.2: train the constructed MoE network model by minimizing the empirical logistic loss of the network:

L(Θ, W) = (1/n) Σ_{i∈[n]} ℓ(y_i · F(x_i; Θ, W))

where ℓ(z) is the logistic loss function, defined as ℓ(z) = log(1 + exp(−z)), and Θ^(0) is initialized to 0.
Step 4.3: the constructed multi-expert network, combining the graph convolutional neural network GCNN with the MoE structure, is jointly trained and then performs inference on the data processed by the SortPooling layer, and the encrypted network traffic data is classified according to the output probabilities of the Softmax function.
In this embodiment, a schematic diagram of an Expert unit is also provided, as shown in fig. 4: the Expert unit consists of a cascade of a Conv1D layer, a MaxPool1D layer, a second Conv1D layer, a Dense layer, a Dropout layer, and a final Dense layer.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
Claims (10)
1. The encryption network traffic classification method based on GCNN and MoE is characterized by comprising the following steps:
s1, dividing flow data of a mobile application program in a period of time into a plurality of flow blocks with the same length;
s2, converting the flow block into a graph dataset with node characteristics and edge weights;
s3, constructing an encrypted network traffic classification model of a mobile application program based on a graph rolling neural network GCNN and a hybrid expert system, and training the model;
s4, inputting the graph dataset of the data to be tested into the encrypted network flow classification model to obtain a classification result.
2. The method for classifying traffic of an encrypted network based on GCNN and MoE according to claim 1, wherein, when the traffic data of a mobile application program over a period of time is divided into traffic blocks of the same length, a duration and an overlap time are set and the traffic blocks are divided accordingly, specifically: the length of each flow block is set to the duration, and each flow block overlaps its previous flow block by the overlap time and overlaps its next flow block by the overlap time, except that the first flow block has no previous block and the last flow block has no next block.
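For illustration only — the 4-second duration and 2-second overlap below are made-up values — the sliding-window division of claim 2 could be sketched as:

```python
def split_blocks(packets, duration, overlap):
    """Split (timestamp, payload) pairs into fixed-length blocks where each
    block overlaps the previous one by `overlap` seconds (overlap < duration)."""
    if not packets:
        return []
    step = duration - overlap          # window advance between block starts
    t0 = packets[0][0]
    t_end = packets[-1][0]
    blocks = []
    start = t0
    while start <= t_end:
        block = [p for p in packets if start <= p[0] < start + duration]
        blocks.append(block)
        start += step
    return blocks

# Timestamps 0..9 s, 4 s blocks with 2 s overlap -> block starts at 0, 2, 4, ...
pkts = [(t, b"payload") for t in range(10)]
blocks = split_blocks(pkts, duration=4, overlap=2)
```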
3. The method of classifying traffic in an encrypted network based on GCNN and MoE as recited in claim 1, wherein the process of converting traffic blocks into graph datasets having node features and edge weights includes the steps of:
removing DNS-protocol traffic from the flow block;
acquiring the IP addresses and port numbers used by the mobile application program and concatenating each IP address with its port number;
in constructing the graph data of the mobile application program,
obtaining the maximum number of nodes N required by one MApp graph, and generating all graph data of each MApp according to the weights between pairs of nodes;
all the graph data of each MApp are stored in 2 CSV-format files, with the node features stored in the features file.
4. The encryption network traffic classification method based on GCNN and MoE according to claim 1, wherein the encrypted network traffic classification model of a mobile application program based on the graph convolutional neural network GCNN and the mixture-of-experts system comprises a plurality of cascaded GCN layers, a SortPooling layer, an Expert network and a softmax layer; the graph latent representations output by the four cascaded GCN layers are input to the SortPooling layer, which selects the largest K values of the graph latent representation; the Expert network comprises a plurality of Expert units, and the selected graph latent representation is input to each of the Expert units respectively; the products of each Expert unit's output and the unit's corresponding weight are accumulated and then input to the softmax layer, which produces the classification result.
5. The method of classifying traffic in an encrypted network based on GCNN and MoE as recited in claim 4, wherein, if there are L GCN layers, the output of the (l+1)-th GCN layer is expressed as:

Z^(l+1) = σ( D̃^(−1) Ã Z^(l) W^(l) )

where Z^(l+1) ∈ R^(n×c_{l+1}) is the output of the (l+1)-th graph convolution layer; c_l is the number of features of each graph node extracted at the l-th layer; n is the number of nodes; l = 0, ..., L−1; Z^(0) = X, where X ∈ R^(n×c) is the node feature matrix and c is the number of node features in the node feature matrix; D̃ is the diagonal degree matrix of the graph; Ã is the adjacency matrix with self-loops added; and W^(l) ∈ R^(c_l×c_{l+1}) is the trainable parameter matrix of the l-th layer.
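A minimal numerical sketch of this propagation rule (pure Python, with a hypothetical 3-node path graph, an identity weight matrix, and ReLU assumed as the nonlinearity):

```python
def matmul(A, B):
    # Plain list-of-lists matrix product.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def gcn_layer(A, X, W):
    """One GCN layer: Z = relu(D~^-1 A~ X W), where A~ = A + I and
    D~ is the diagonal degree matrix of A~."""
    n = len(A)
    A_tilde = [[A[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in A_tilde]
    norm = [[A_tilde[i][j] / deg[i] for j in range(n)] for i in range(n)]  # D~^-1 A~
    Z = matmul(matmul(norm, X), W)
    return [[max(0.0, z) for z in row] for row in Z]  # ReLU activation

# 3-node path graph, 1-dimensional node features, identity weight matrix.
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
X = [[1.0], [2.0], [3.0]]
W = [[1.0]]
Z1 = gcn_layer(A, X, W)  # each node takes the mean over itself and its neighbours
```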
6. The method of classifying traffic in an encrypted network based on GCNN and MoE as recited in claim 5, wherein G denotes graph data, expressed as G = (V, ε), where V is the node set in the graph data and ε is the edge set in the graph data; if A is the adjacency matrix of the graph data, the adjacency matrix with self-loops added is expressed as:

Ã = A + I

where I is the identity matrix.
7. A method for classifying traffic in an encrypted network based on GCNN and MoE according to claim 3, wherein the edge relationship between nodes is established through the cross-correlation between them: if the cross-correlation between two nodes is not 0, an edge exists between the two nodes, and the edge weight is that cross-correlation. The calculation of the cross-correlation between two nodes comprises the following steps:
generating graph nodes according to the traffic captured in a given time window, dividing the given time window into T slices with different durations;
counting, for each time slice, whether the mobile application sends a packet of traffic to or receives one from the service deployed on the destination IP address and port, and taking the count over all slices as the cross-correlation between the two nodes, expressed as:

C_{i,j} = Σ_{t∈[T]} r_i(t) · r_j(t)

where C_{i,j} is the cross-correlation between node i and node j, and r_i(t) is a binary variable indicating whether node i is active in time slice t: r_i(t) = 1 when node i is active, otherwise r_i(t) = 0.
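The slice-wise cross-correlation C_{i,j} = Σ_{t∈[T]} r_i(t)·r_j(t) can be checked with a small sketch (the activity patterns below are invented):

```python
def cross_correlation(r_i, r_j):
    """C_ij = number of time slices in which both nodes are active.

    r_i, r_j: binary activity sequences over the same T time slices."""
    return sum(a * b for a, b in zip(r_i, r_j))

# Two hypothetical nodes observed over T = 6 time slices.
r1 = [1, 0, 1, 1, 0, 1]
r2 = [1, 1, 0, 1, 0, 1]
w = cross_correlation(r1, r2)  # edge weight; an edge exists since w != 0
```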
8. The method for classifying encrypted network traffic based on GCNN and MoE according to claim 1, wherein the process of optimizing the classification model comprises the following steps:
acquiring historical data as a training data set and slicing it, namely dividing the training data set into a plurality of mutually independent and orthogonal sub-data sets, each sub-data set being one slice;
during network training, each slice is separately input into the classification model to obtain a prediction result, and training is performed according to the prediction result and the label corresponding to the training data;
and the optimization is complete when the loss converges or the maximum number of training iterations is reached.
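A minimal sketch of the shard split described in claim 8 (the round-robin assignment and the shard count are illustrative assumptions, not the claimed procedure):

```python
def make_shards(dataset, num_shards):
    """Partition a dataset into disjoint ('mutually independent') sub-data sets."""
    shards = [[] for _ in range(num_shards)]
    for idx, sample in enumerate(dataset):
        shards[idx % num_shards].append(sample)  # round-robin assignment
    return shards

data = list(range(10))
shards = make_shards(data, 3)  # 3 disjoint shards covering every sample once
```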
9. The method for classifying traffic in an encrypted network based on GCNN and MoE according to claim 8, wherein, in the process of optimizing with the training data, the model's logistic loss function is used for back-propagation to optimize the model, the logistic loss function being expressed as:

L(Θ, W) = (1/n) Σ_{i∈[n]} ℓ(y_i · F(x_i; Θ, W))

where L(Θ, W) is the logistic loss function of the network; Θ is the parameterization of the gating over the Expert units, denoted Θ = (θ_1, ..., θ_M) ∈ R^(d×M), where θ_m ∈ R^d corresponds to the m-th Expert unit, d is the dimension of an Expert network, and M is the number of Expert units in the Expert network; n is the number of training data; y_i is the label of the i-th training data; F(x_i; Θ, W) is the output of the Expert network for the i-th training datum x_i; W is the set of expert weight matrices; and ℓ(z) is the sigmoid-based logistic loss, expressed as ℓ(z) = log(1 + exp(−z)).
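The logistic loss ℓ(z) = log(1 + exp(−z)) and the averaged model loss of claim 9 can be evaluated numerically (the model outputs and labels below are invented):

```python
import math

def logistic_loss(z):
    # l(z) = log(1 + exp(-z)); small for confident correct predictions (large z).
    return math.log(1.0 + math.exp(-z))

def model_loss(outputs, labels):
    """(1/n) * sum_i l(y_i * F(x_i)), with labels y_i in {-1, +1}."""
    n = len(outputs)
    return sum(logistic_loss(y * f) for f, y in zip(outputs, labels)) / n

# Three invented model outputs F(x_i) with their labels.
loss = model_loss([2.0, -1.0, 0.0], [1, -1, 1])
```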
10. The method for classifying traffic in an encrypted network based on GCNN and MoE as recited in claim 8, wherein the output F (x i The method comprises the steps of carrying out a first treatment on the surface of the Θ, W) is expressed as:
wherein ,is the set of selected indices, pi m (x; Θ) is the gating value of the mth Expert network, h m (x; Θ) is the output of the mth Expert network, [ P ]]For a sliced dataset, [ M ]]Gating a set of networks for the expert; f (f) m (x; W) is the output of the mth Expert network, [ J ]]Representing the set of filters, σ (·) is the activation function, P represents the number of training data in a sliced dataset, w m,j Weight vector, x representing the jth filter in the mth Expert network (p) Representing the input data as the p-th data in the sliced data set; x represents input data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310207576.4A CN116319583A (en) | 2023-03-06 | 2023-03-06 | Encryption network traffic classification method based on GCNN and MoE |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116319583A true CN116319583A (en) | 2023-06-23 |
Family
ID=86812581
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117240615A * | 2023-11-13 | 2023-12-15 | 四川大学 | Migration learning network traffic correlation method based on time interval diagram watermark |
CN117240615B * | 2023-11-13 | 2024-01-30 | 四川大学 | Migration learning network traffic correlation method based on time interval diagram watermark |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||