CN116319583A - Encryption network traffic classification method based on GCNN and MoE - Google Patents

Encryption network traffic classification method based on GCNN and MoE

Info

Publication number
CN116319583A
Authority
CN
China
Prior art keywords
network
data
graph
traffic
gcnn
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310207576.4A
Other languages
Chinese (zh)
Inventor
段思睿
张弦
余翔
庞育才
肖云鹏
王蓉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications
Priority to CN202310207576.4A
Publication of CN116319583A
Legal status: Pending

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 - Traffic control in data switching networks
    • H04L47/10 - Flow control; Congestion control
    • H04L47/24 - Traffic characterised by specific attributes, e.g. priority or QoS
    • H04L47/2441 - Traffic characterised by specific attributes, e.g. priority or QoS relying on flow classification, e.g. using integrated services [IntServ]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 - Network architectures or network communication protocols for network security
    • H04L63/14 - Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 - Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 - Traffic logging, e.g. anomaly detection
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 - Reducing energy consumption in communication networks
    • Y02D30/70 - Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention belongs to the field of computer artificial intelligence and relates in particular to an encrypted network traffic classification method based on GCNN and MoE. The method comprises the following steps: dividing the traffic data of a mobile application captured over a period of time into a plurality of traffic blocks of the same length; converting the traffic blocks into a graph dataset with node features and edge weights; constructing an encrypted network traffic classification model for mobile applications based on a graph convolutional neural network (GCNN) and a mixture of experts (MoE), and training the model; and inputting the graph dataset of the data to be tested into the encrypted network traffic classification model to obtain a classification result. The invention achieves higher classification performance and addresses the low classification accuracy and poor performance of traditional machine learning methods and of conventional neural network models such as CNNs and RNNs.

Description

Encryption network traffic classification method based on GCNN and MoE
Technical Field
The invention belongs to the field of computer artificial intelligence and relates in particular to an encrypted network traffic classification method based on GCNN and MoE.
Background
With the continuing development of internet communication technology in recent years, and the spread of communication technologies including 5G, the number of smart and mobile devices has grown remarkably. It is widely predicted that by 2023 the number of Internet of Things (IoT) devices, including smartphones, will reach hundreds of billions, and networks have become part of people's work and daily lives. In today's network management systems, network traffic classification is a critical task whose main objective is to predict the protocol and application type of network data flows.
In recent years, as the demand for protecting the privacy and security of transmitted data and of users has grown rapidly, more and more application protocols have begun to transmit data using encryption; the proportion of encrypted traffic in the network has risen sharply, and the encryption techniques themselves have become increasingly complex. Classifying encrypted traffic has been one of the most important directions in network security since the advent of the internet. However, because of the popularity of encryption and the rapid growth of network throughput, fast and accurate classification of encrypted traffic is becoming increasingly difficult. In addition, encryption makes various kinds of malicious and anomalous traffic more likely, and attackers also use encryption to carry out large numbers of malicious activities. When a large volume of encrypted traffic appears in the network, it is therefore very important to classify it quickly and then perform fine-grained traffic analysis.
Most existing work on mobile application classification addresses the challenges posed by encrypted traffic. For example, the AppScanner approach uses a flow-based detection method that extracts side-channel features from packet headers and computes statistical features to train a machine learning model for mobile application classification. Similarly, the FlowPrint method constructs a fingerprint of an application by considering the communication graph between a mobile device and other destinations (e.g., CDNs and third-party services) and related attributes (e.g., destination IP, destination port, and TLS certificate). In the inference phase, previously collected fingerprints are compared with new fingerprints to identify the application. However, because it is difficult to build a communication graph covering all possible behaviors of an application, only short communication periods are considered. As a result, the method may not work properly if users change their usage behavior or use a different function of the application.
A survey of current deep-learning-based network traffic classification research shows that several challenges remain when deep learning methods are used to classify network traffic:
1. More than 80% of mobile traffic is encrypted or uses Transport Layer Security (TLS), so it cannot be classified with payload-based methods that analyze specific fields of the application-layer protocol.
2. Port-based classification methods cannot classify mobile traffic, because applications mainly use HTTPS to transfer data and exchange it in text formats such as XML or JSON. Some information used by web page classification (such as the number or size of files) is not available.
3. User behavior varies dynamically over time, depending on the functions used. Traffic captured from a mobile application within a short time (e.g., 5 minutes) may not represent its complete traffic behavior.
Disclosure of Invention
To solve the above problems, the invention provides an encrypted network traffic classification method based on GCNN and MoE, which specifically comprises the following steps:
S1, dividing the traffic data of a mobile application captured over a period of time into traffic blocks of the same length;
S2, converting the traffic blocks into a graph dataset with node features and edge weights;
S3, constructing an encrypted network traffic classification model for mobile applications based on a graph convolutional neural network (GCNN) and a mixture of experts (MoE), and training the model;
S4, inputting the graph dataset of the data to be tested into the encrypted network traffic classification model to obtain a classification result.
Further, when the traffic data of a mobile application captured over a period of time is divided into traffic blocks of the same length, a duration and an overlap time are set, and the traffic blocks are divided according to them. Specifically, the length of each traffic block is set to the duration; every traffic block overlaps its previous block by the overlap time and overlaps its next block by the overlap time, except that the first block has no previous block and the last block has no next block.
Further, the process of converting traffic blocks into a graph dataset with node features and edge weights includes the following steps:
removing DNS protocol traffic from the traffic block;
obtaining the IP addresses and port numbers in the mobile application traffic and combining each IP address with its port number;
when constructing the graph data of the mobile application (MApp), obtaining the maximum number of nodes N required by one MApp graph, and generating all graph data of each MApp according to the weights between pairs of nodes;
storing all graph data of each MApp in two CSV files: the node features are stored in the features.csv file and the weights between nodes in the weights.csv file.
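The node-construction steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the packet record layout, the function name `build_nodes`, and the rule of keeping the N busiest endpoints are all assumptions made for the example (the source only states that a maximum node count N is used).

```python
from typing import Dict, Iterable, Tuple

# Hypothetical packet record: (timestamp_s, protocol, remote_ip, remote_port, size_bytes)
Packet = Tuple[float, str, str, int, int]


def build_nodes(packets: Iterable[Packet], max_nodes: int) -> Dict[str, int]:
    """Drop DNS packets, then map each remote 'ip:port' pair to a node index."""
    counts: Dict[str, int] = {}
    for _, protocol, ip, port, _ in packets:
        if protocol.upper() == "DNS":
            continue  # DNS traffic is removed before the graph is built
        key = f"{ip}:{port}"  # one node per (IP address, port number) pair
        counts[key] = counts.get(key, 0) + 1
    # Assumption of this sketch: when there are more than max_nodes endpoints,
    # keep the ones with the most packets (ties broken alphabetically).
    ranked = sorted(counts, key=lambda k: (-counts[k], k))[:max_nodes]
    return {key: index for index, key in enumerate(ranked)}
```

Calling `build_nodes(packets, max_nodes=20)` on a traffic block returns a dictionary from `"ip:port"` strings to row indices that later steps can use for the feature and weight files.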
Further, the encrypted network traffic classification model for mobile applications based on the graph convolutional neural network (GCNN) and the mixture of experts comprises a plurality of cascaded GCN layers, a SortPooling layer, an Expert network, and a softmax layer. The graph latent representation output by the four cascaded GCN layers is input to the SortPooling layer, which selects the K largest values of the graph latent representation. The Expert network comprises a plurality of Expert units, and the selected graph latent representation is input to each Expert unit. The output of each Expert unit is multiplied by that unit's weight, the products are summed, and the sum is input to the softmax layer, which produces the classification result.
Further, if there are L GCN layers, the output of the (l+1)-th GCN layer is expressed as:
$$Z^{l+1} = \sigma\left(\tilde{D}^{-1} \tilde{A} Z^{l} W^{l}\right)$$
where $Z^{l+1} \in \mathbb{R}^{n \times c_{l+1}}$ is the output of the (l+1)-th graph convolution layer; $c_l$ is the number of features of each graph node extracted at the l-th layer; n is the number of nodes; $l = 0, \dots, L-1$; $Z^{0} = X$, where $X \in \mathbb{R}^{n \times c}$ is the node feature matrix and c is the number of features per node in the node feature matrix; $\tilde{D}$ is the diagonal degree matrix of the graph; $\tilde{A}$ is the adjacency matrix with self-loops added; $\sigma$ is the activation function; and $W^{l} \in \mathbb{R}^{c_l \times c_{l+1}}$ is the trainable parameter matrix of the l-th layer.
Further, a graph is denoted by $G = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ is the set of nodes and $\mathcal{E}$ is the set of edges in the graph data. If A is the adjacency matrix of the graph data, the adjacency matrix with self-loops added is expressed as:
$$\tilde{A} = A + I$$
where I is the identity matrix.
Further, edges between nodes are established through the cross-correlation between the nodes: if the cross-correlation between two nodes is not 0, an edge exists between them and the edge weight is their cross-correlation. The cross-correlation between two nodes is calculated as follows:
generating the graph nodes from the traffic captured in a given time window, and dividing the time window into T slices of a predefined slice duration;
in each slice, treating a node as active if the mobile application sends at least one traffic packet to, or receives at least one from, the service deployed on that node's destination IP address and port, and computing the cross-correlation between two nodes from these activity indicators, expressed as:
Figure BDA0004111504780000042
where $C_{i,j}$ is the cross-correlation between node i and node j, and $r_i(t)$ is a binary variable indicating whether node i is active in time slice t: $r_i(t) = 1$ when node i is active, and $r_i(t) = 0$ otherwise.
Further, the Adam algorithm is used to optimize the encrypted network traffic classification model for mobile applications based on the graph convolutional neural network (GCNN) and the mixture of experts, and the optimization process comprises the following steps:
obtaining historical data as the training dataset and slicing it, that is, dividing the training dataset into a plurality of mutually independent, non-overlapping sub-datasets, each of which is one slice;
during network training, inputting each slice into the classification model to obtain a prediction result, and training according to the prediction result and the labels of the training data;
completing the optimization when the loss converges or the maximum number of training iterations is reached.
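The slicing and stopping rule described above might look like the sketch below. The round-robin partition, the tolerance `tol`, and the `train_step` callback (a stand-in for one Adam update pass that returns the loss on a slice) are assumptions introduced for illustration; the source does not specify them.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def make_slices(samples: Sequence, num_slices: int) -> List[list]:
    """Partition samples into disjoint slices; every sample lands in exactly one slice."""
    if num_slices < 1:
        raise ValueError("num_slices must be at least 1")
    return [list(samples[i::num_slices]) for i in range(num_slices)]


def train_until_converged(
    slices: List[list],
    train_step: Callable[[list], float],  # hypothetical: trains on one slice, returns its loss
    max_epochs: int,
    tol: float = 1e-4,  # assumed convergence tolerance
) -> Tuple[int, Optional[float]]:
    """Stop when the mean loss changes by less than tol, or after max_epochs."""
    previous = None
    for epoch in range(1, max_epochs + 1):
        loss = sum(train_step(s) for s in slices) / len(slices)
        if previous is not None and abs(previous - loss) < tol:
            return epoch, loss  # loss has converged
        previous = loss
    return max_epochs, previous  # maximum number of training iterations reached
```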
Further, during optimization with the training data, the model is optimized by backpropagation using the model's logistic loss function, which is expressed as:
Figure BDA0004111504780000043
where
Figure BDA0004111504780000044
is the logistic loss function of the network; Θ is the parameterized Expert network, denoted as
Figure BDA0004111504780000051
$\theta_M$ represents the M-th Expert unit, d is the dimension of an Expert network, and M is the number of Expert units in the Expert network; n is the number of training samples; $y_i$ is the label of the i-th training sample; $F(x_i; \Theta, W)$ is the output of the Expert network, $x_i$ is the i-th training sample, and W is the weight of the MoE gating network;
Figure BDA0004111504780000056
is a sigmoid function, expressed as
Figure BDA0004111504780000057
Further, the output $F(x_i; \Theta, W)$ is expressed as:
Figure BDA0004111504780000052
Figure BDA0004111504780000053
Figure BDA0004111504780000054
where
Figure BDA0004111504780000055
is the set of selected indices; $\pi_m(x; \Theta)$ is the gating value of the m-th Expert network; $h_m(x; \Theta)$ is the output of the m-th Expert network; [P] is a sliced dataset; [M] is the set of expert gating networks; $f_m(x; W)$ is the output of the m-th Expert network; [J] represents the set of filters; σ(·) is the activation function; P represents the number of training samples in a sliced dataset; $w_{m,j}$ represents the weight vector of the j-th filter in the m-th Expert network; $x^{(p)}$ represents the p-th data item of the input in the sliced dataset; and x represents the input data.
The invention takes into account the limitations of a single graph convolutional neural network (GCNN) model for encrypted traffic classification. To identify and classify encrypted traffic better, a MoE expert network is added to the graph convolutional network (GCN) structure, so that the work of a single GCNN model is split across several Expert networks that are trained and make predictions simultaneously and whose outputs are then combined for the final decision. This joint classification by GCNN and MoE improves the accuracy of classifying and identifying encrypted network traffic, achieves higher classification performance, and addresses the low classification accuracy and poor performance of traditional machine learning methods and of conventional neural network models such as CNNs and RNNs.
Drawings
FIG. 1 is an overall flow chart of the present invention;
FIG. 2 is a schematic diagram of the splitting of a traffic block of the present invention;
FIG. 3 is a model diagram of the encrypted traffic classification method based on DGCNN and MoE according to the present invention;
FIG. 4 is a diagram of the structure of the Expert network of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention provides an encrypted network traffic classification method based on GCNN and MoE, which specifically comprises the following steps:
S1, dividing the traffic data of a mobile application (Mobile Application, MApp) captured over a period of time into traffic blocks of the same length;
S2, converting the traffic blocks into a graph dataset with node features and edge weights;
S3, constructing an encrypted network traffic classification model for mobile applications based on a graph convolutional neural network (GCNN) and a mixture of experts (MoE), and training the model;
S4, inputting the graph dataset of the data to be tested into the encrypted network traffic classification model to obtain a classification result.
In this embodiment, the traffic data of a given MApp is processed by splitting a long-duration traffic block of the MApp into several shorter-duration traffic blocks of the same MApp, using a duration $T_{duration}$ and an overlap time $T_{overlap}$. As shown in FIG. 2, the resulting MApp traffic blocks all have the same length $T_{duration}$, and each traffic block overlaps the previous block, and likewise the next block, by $T_{overlap}$. If $T_{duration}$ is short enough, $T_{overlap}$ need not be set; whether $T_{duration}$ is short enough is judged by a person skilled in the art based on experience. Dividing the traffic blocks specifically comprises the following steps:
Step 1.1, acquiring the raw traffic data with a traffic capture tool such as Wireshark, the sample dataset being the encrypted traffic data of the MApp in its original form, and processing the data;
Step 1.2, splitting the raw MApp traffic data block according to the duration $T_{duration}$ and the overlap time $T_{overlap}$, so that a long-duration MApp traffic block becomes several shorter blocks of the same length;
Step 1.3, storing the split MApp traffic block dataset as a CSV file.
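The block-splitting step can be illustrated with a sliding window whose stride is the duration minus the overlap time. This is a sketch under stated assumptions: packets are tuples whose first element is a timestamp in seconds, the list is sorted by time, and a trailing partial window is dropped so that all blocks span the same duration. The function name `split_into_blocks` is hypothetical.

```python
from typing import List, Tuple

Packet = Tuple[float, int]  # hypothetical record: (timestamp_s, size_bytes)


def split_into_blocks(
    packets: List[Packet], t_duration: float, t_overlap: float = 0.0
) -> List[List[Packet]]:
    """Split a time-ordered capture into equal-duration, possibly overlapping blocks."""
    if not 0 <= t_overlap < t_duration:
        raise ValueError("need 0 <= t_overlap < t_duration")
    if not packets:
        return []
    start, end = packets[0][0], packets[-1][0]
    stride = t_duration - t_overlap  # consecutive windows share t_overlap seconds
    blocks = []
    k = 0
    # Keep only full windows (a choice made in this sketch, not stated in the source).
    while start + k * stride + t_duration <= end:
        lo = start + k * stride
        hi = lo + t_duration
        blocks.append([p for p in packets if lo <= p[0] < hi])
        k += 1
    return blocks
```

With `t_overlap=0.0` the windows are back to back, which corresponds to the case where the duration is short enough that no overlap is needed.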
In this embodiment, each (IP:Port) pair obtained by combining an IP address with a port number is defined as a node in the graph data; that is, a node is the combination of an IP address and a port number. The optimal value of the number of nodes N in one graph is 20. If N is larger, every graph with fewer nodes must be zero-padded, either in its feature vectors (in the case of an MLP) or in its latent representation vectors (in the case of the present invention), and zero-valued features may mislead the learning of the model. If N is smaller, useful information from the discarded nodes may be lost, which degrades model performance. Most graphs have about 10 nodes; 90% of the graphs have fewer than 35 nodes and 86% have fewer than 30. As N increases, performance rises until N reaches its optimal value and then falls again. It is worth noting that the optimal N differs between the MLP and the present invention: the present invention needs more information about the graph topology to distinguish mobile applications, and it performs better than the MLP in various experimental scenarios.
Graph data to be used as input to the deep neural network model is generated from the acquired traffic blocks of each MApp, specifically as follows:
Step 2.1, removing DNS protocol traffic from the data block;
Step 2.2, obtaining the IP addresses and port numbers in the MApp traffic and combining each IP address with its port number (the same (IP, port number) tuple denotes the same network destination);
Step 2.3, treating a traffic block of an MApp as a data frame and generating all graph data of each MApp from the maximum number of nodes N required by one MApp graph and the weights between pairs of nodes; all graph data of each MApp is stored in two CSV files, a features.csv file holding the node features and a weights.csv file holding the weights between nodes;
Step 2.4, establishing the edges between nodes through cross-correlation: the traffic captured in a given time window yields the graph nodes described above, and the time window is further divided into slices of a predefined slice duration $t_{slice}$. Let T be the number of slices. During each slice, a node (a pair of destination IP address and port number) is considered active if the MApp sends at least one traffic packet to, or receives at least one from, the service deployed on that destination IP address and port number. Let $r_i(t)$ be a binary variable indicating whether node i is active in time slice t: $r_i(t) = 1$ when node i is active, and $r_i(t) = 0$ otherwise. Over the T slices, the cross-correlation of two nodes i and j is defined as:
Figure BDA0004111504780000081
Using the cross-correlation, the edges between nodes are established and the edge weights are set accordingly: if $C_{i,j} \neq 0$, an edge is established between nodes i and j with weight $C_{i,j}$.
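The edge-construction rule can be sketched as below. The cross-correlation formula itself is given only as a figure that is not reproduced in this text, so the sketch assumes the simplest form consistent with the description: the number of slices in which both nodes are active, i.e. the sum over t of r_i(t)·r_j(t). The function names and the event layout are hypothetical.

```python
from itertools import combinations
from typing import Dict, Hashable, Iterable, List, Tuple


def activity_vectors(
    events: Iterable[Tuple[float, Hashable]],  # hypothetical: (timestamp_s, node_id)
    window_start: float,
    t_slice: float,
    num_slices: int,
) -> Dict[Hashable, List[int]]:
    """Return node_id -> [r(0), ..., r(T-1)], where r(t) = 1 if the node is active in slice t."""
    r: Dict[Hashable, List[int]] = {}
    for timestamp, node in events:
        t = int((timestamp - window_start) // t_slice)
        if 0 <= t < num_slices:
            r.setdefault(node, [0] * num_slices)[t] = 1
    return r


def cross_correlation_edges(r: Dict[Hashable, List[int]]) -> Dict[tuple, int]:
    """Create an edge (i, j) with weight C_ij whenever C_ij != 0."""
    edges = {}
    for i, j in combinations(sorted(r), 2):
        # Assumed form of C_ij: count of slices where both nodes are active.
        c = sum(a * b for a, b in zip(r[i], r[j]))
        if c != 0:
            edges[(i, j)] = c
    return edges
```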
To avoid feature bias when the graph data is fed into the neural network classification model for training and prediction, min-max normalization is used to scale $C_{i,j}$ into the range [0, 1]. Min-max normalization is defined as:
$$x' = \frac{x - \min}{\max - \min}$$
where x' is the normalized value of a data item and x is its value before normalization; min is the minimum value of the column containing the data item, and max is the maximum value of that column.
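As a small worked illustration of min-max normalization applied to a column of edge weights, consider the sketch below. Returning 0.0 for a constant column (where max equals min and the formula would divide by zero) is an assumption of this example, not something the source specifies.

```python
from typing import List, Sequence


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Scale a column of values into [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumed handling of the degenerate case: a constant column maps to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

For example, the weights 2, 4, 6 map to 0.0, 0.5, 1.0.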
Each node in the graph data needs a feature vector, i.e., node features. A mobile application connects to various services, each represented by a node of the graph as a tuple of IP address and port number, and the traffic from the mobile device to the server of each service may differ in various traffic characteristics, such as packet size, number of packets, and flow duration. To cover both encrypted and unencrypted traffic, information is extracted only from packet headers, without analyzing packet payloads. In addition to packet features, flow features are extracted, such as the number of flows, the average number of packets per flow, and the average flow size in bytes. This embodiment considers only TCP and UDP flows and relies on the Wireshark tool to collect and analyze the flow features. The feature vectors of all nodes in one graph form the node feature matrix X,
$$X \in \mathbb{R}^{n \times c}$$
where n is the number of nodes and c is the number of features per node.
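To make the shape of the node feature matrix concrete, the sketch below builds an N × c matrix from per-node packet sizes. The three features used here (packet count, total bytes, mean packet size) are only a hypothetical subset chosen for illustration; the source lists packet and flow features without fixing the exact set. Rows are zero-padded up to N, as discussed for graphs with fewer nodes.

```python
from typing import Dict, List, Sequence


def node_feature_matrix(
    packet_sizes_by_node: Dict[str, Sequence[int]],  # hypothetical input: node -> packet sizes
    node_order: Sequence[str],
    max_nodes: int,
) -> List[List[float]]:
    """Build an N x 3 feature matrix: [packet count, total bytes, mean packet size]."""
    rows: List[List[float]] = []
    for node in node_order[:max_nodes]:  # discard nodes beyond N
        sizes = packet_sizes_by_node.get(node, [])
        count = len(sizes)
        total = float(sum(sizes))
        mean = total / count if count else 0.0
        rows.append([float(count), total, mean])
    while len(rows) < max_nodes:  # zero-pad graphs that have fewer than N nodes
        rows.append([0.0, 0.0, 0.0])
    return rows
```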
An encrypted traffic classification model based on the graph convolutional neural network (GCNN) and the mixture of experts (MoE) is constructed. The GCNN model extracts statistical and derived features from the data, and the data output by the SortPooling layer is fed into the different MoE sub-networks for training and testing. This specifically comprises the following steps:
Step 3.1, constructing the GCNN and MoE encrypted traffic classification framework, using the cross-entropy loss function to measure how close the actual output (a probability) is to the expected output (a probability), and using the Adam optimizer; the cross-entropy is calculated as follows:
Figure BDA0004111504780000091
The learning rate is adjusted at fixed intervals during training: after every step_size epochs, the learning rate is updated to lr = lr × decay, with step_size = 10, initial lr = 0.0001, and decay = 0.9;
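The step schedule and the loss can be illustrated numerically as follows. The schedule uses the values stated in the text (step_size = 10, initial lr = 0.0001, decay = 0.9). The cross-entropy formula appears only as a figure that is not reproduced here, so the function below assumes the standard definition, the negative sum of target probability times the log of the predicted probability; the clipping constant `eps` is also an assumption.

```python
import math
from typing import Sequence


def step_lr(epoch: int, initial_lr: float = 1e-4, decay: float = 0.9, step_size: int = 10) -> float:
    """Learning rate in force at a given epoch: multiplied by decay after every step_size epochs."""
    return initial_lr * decay ** (epoch // step_size)


def cross_entropy(target: Sequence[float], predicted: Sequence[float], eps: float = 1e-12) -> float:
    """Standard cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))
```

Under these settings the learning rate stays at 0.0001 for epochs 0 to 9, drops to 0.00009 for epochs 10 to 19, and so on.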
Step 3.2, referring to FIG. 3, the GCNN network model portion comprises a GCN structure formed by stacking 4 graph convolution layers; the first 3 graph convolution layers have size 1024, the last has size 512, and the activation function is tanh;
In the graph convolution layer part, a graph $G = (\mathcal{V}, \mathcal{E})$ is given. Let A be the adjacency matrix of G, so that A is a symmetric binary matrix, and let the graph have no self-loops. The node feature matrix is defined as $X \in \mathbb{R}^{n \times c}$, where $n = |\mathcal{V}|$ is the number of nodes. The node latent representation is computed as $Z = \sigma(\tilde{D}^{-1} \tilde{A} X W)$, where $\tilde{A} = A + I$ is the adjacency matrix with self-loops added; $\tilde{D}$ is the diagonal degree matrix of the graph, such that $\tilde{D}_{i,i} = \sum_j \tilde{A}_{i,j}$; $W \in \mathbb{R}^{c \times c'}$ is a trainable graph convolution parameter matrix shared among nodes; σ is a nonlinear activation function; and $Z \in \mathbb{R}^{n \times c'}$ is the output activation matrix.
Intuitively, the graph data fed to the GCN layers shown in FIG. 3 consists of nodes and of edges connecting each node to its neighbors, so the latent representation of a node is influenced by its neighbors. The graph convolution layer first applies a linear transformation to the node features (XW) and then lets information propagate between neighboring nodes through the product with the adjacency matrix (AXW). The final latent representation of the nodes is defined as
$$Z^{l+1} = \sigma\left(\tilde{D}^{-1} \tilde{A} Z^{l} W^{l}\right)$$
where $l = 0, \dots, L-1$; $Z^{0} := X$; $Z^{l} \in \mathbb{R}^{n \times c_l}$ is the output of the l-th graph convolution layer; $c_l$ is the number of output channels of the l-th layer (i.e., the number of features extracted per graph node at the l-th layer); and $W^{l} \in \mathbb{R}^{c_l \times c_{l+1}}$ is the trainable parameter matrix of the l-th layer. After the convolution process of all the graph convolution layers described above, the latent representation of the nodes of the whole graph is obtained.
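The propagation through stacked graph convolution layers can be sketched in plain Python as below. The sketch assumes the row-normalized form in which the self-loop adjacency matrix is divided by the node degree before multiplying the features and weights, with tanh as the activation (the activation named in step 3.2). The helper names and the tiny matrix sizes are illustrative only; a real model would use a tensor library and layer widths such as 1024 and 512.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj: Matrix, z: Matrix, w: Matrix) -> Matrix:
    """One layer: tanh(D~^-1 (A + I) Z W), with D~ the degree matrix of A + I."""
    n = len(adj)
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in a_tilde]  # always >= 1 because of the self-loop
    normalized = [[a_tilde[i][j] / degree[i] for j in range(n)] for i in range(n)]
    hidden = matmul(matmul(normalized, z), w)
    return [[math.tanh(v) for v in row] for row in hidden]


def gcn_stack(adj: Matrix, x: Matrix, weights: List[Matrix]) -> List[Matrix]:
    """Apply the layers in sequence (Z^0 = X) and return every layer's output."""
    outputs, z = [], x
    for w in weights:
        z = gcn_layer(adj, z, w)
        outputs.append(z)
    return outputs
```

Returning the output of every layer, rather than only the last, keeps the earlier layers available for the tie-breaking rule used when nodes are sorted.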
Step 3.3, after processing by the graph convolution layers, the resulting output is fed to the SortPooling layer, which sorts the nodes by the sum of their node features at the L-th layer (the last layer of the graph convolution process);
if two nodes have the same value at the L-th layer, the sum of their node features at the (L-1)-th layer is used; that is, whenever the feature sums of two nodes at the current layer are equal, the nodes are ordered by their feature sums at the previous layer, and so on until the tie is broken. Because the number of nodes differs from graph to graph, the pooling layer also truncates or extends the graph latent representation to a predefined size. Given a predefined size k of the graph latent representation, the representation is truncated if it contains more than k values; otherwise it is zero-padded. The value of k is chosen heuristically from the input data. For example, k can be defined so that 90% of the graph nodes are used to construct the graph latent representation vector, which avoids losing node features in the final graph latent representation.
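The sorting, tie-breaking, and truncate-or-pad behavior can be sketched as follows. Two details are assumptions of this sketch rather than statements of the source: nodes are sorted in descending order of feature sum, and the pooled rows carry only the last layer's features. The function name `sort_pooling` is hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def sort_pooling(layer_outputs: List[Matrix], k: int) -> Matrix:
    """Sort nodes by last-layer feature sum (ties: earlier layers), then keep exactly k rows.

    layer_outputs[l][i] is the feature vector of node i after layer l; at least one node is required.
    """
    last = layer_outputs[-1]
    n, width = len(last), len(last[0])

    def sort_key(i: int) -> tuple:
        # Feature sum at the last layer first, then at each previous layer for tie-breaking.
        return tuple(sum(layer[i]) for layer in reversed(layer_outputs))

    order = sorted(range(n), key=sort_key, reverse=True)  # assumed: descending order
    rows = [list(last[i]) for i in order[:k]]  # truncate when there are more than k nodes
    while len(rows) < k:  # zero-pad when there are fewer than k nodes
        rows.append([0.0] * width)
    return rows
```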
Referring to FIG. 3, the MoE network model is constructed, and after the SortPooling layer the expert sub-networks are trained and tested, which specifically comprises the following steps:
Step 4.1, the MoE layer consists of a group of M Expert networks $f_1, \dots, f_M$ and a gating network, which is typically set to be linear. Let $f_m(x; W)$ be the output of the m-th Expert network; the output of the MoE layer is then defined as:
Figure BDA0004111504780000101
where
Figure BDA0004111504780000102
is the set of selected indices and $\pi_m(x; \Theta)$ is the gating value of the m-th Expert network, whose value is determined by the following definition:
Figure BDA0004111504780000103
The m-th Expert network is treated in this embodiment as a convolutional neural network (CNN) structure, defined as follows:
Figure BDA0004111504780000104
where
Figure BDA0004111504780000105
is the weight vector of the j-th filter (i.e., neuron) in the m-th Expert; J is the number of filters (i.e., neurons); d is the dimension of an Expert network; $x^{(p)}$ represents the p-th data item of the input in the sliced dataset; x represents the input data; [P] is a sliced dataset; [M] is the set of expert gating networks; [J] represents the set of filters; σ(·) is the activation function; and P represents the number of training samples in one sliced dataset.
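The way a MoE layer combines its experts can be sketched as below. The gating and expert formulas are given only as figures that are not reproduced in this text, so the sketch makes explicit assumptions: the linear gating scores are turned into gating values with a softmax, the selected index set is the top_k experts by gating value, and each expert is passed in as an arbitrary callable rather than the CNN of the embodiment.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def softmax(scores: Sequence[float]) -> Vector:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(
    x: Vector,
    gate_weights: List[Vector],  # one row per expert: linear gating network (assumed form)
    experts: List[Callable[[Vector], Vector]],  # hypothetical stand-ins for the Expert units
    top_k: int = 1,
) -> Tuple[Vector, List[int]]:
    """Return the sum over selected experts of gate_m * expert_m(x), plus the selected indices."""
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    gates = softmax(scores)  # assumed: softmax over linear gating scores
    selected = sorted(range(len(experts)), key=lambda m: gates[m], reverse=True)[:top_k]
    output = None
    for m in selected:
        scaled = [gates[m] * v for v in experts[m](x)]
        output = scaled if output is None else [a + b for a, b in zip(output, scaled)]
    return output, selected
```

In the full model the combined output would then be passed to the softmax classification layer.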
In this embodiment, there is actually a Router at the SortPooling layer output to Weight, which adds noise as a disturbance and uses the experience-loss gradient of the disturbance to update the weights. The disturbance empirical loss at time t after the Router adds the disturbance is defined as:
Figure BDA0004111504780000111
wherein ,
Figure BDA0004111504780000112
and->
Figure BDA0004111504780000113
Is random noise; w (W) (0) Is the value initialized by the weight matrix, and the weight update rule of Router is defined as:
Figure BDA0004111504780000114
wherein η > 0 is the expert learning rate, ||·||_F denotes the Frobenius norm,
Figure BDA0004111504780000115
denotes the weight update gradient at the m-th expert network, and
Figure BDA0004111504780000116
is the weight matrix of the m-th expert; further, let W = {W_m}, m ∈ [M], denote the set of expert weight matrices.
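The update rule itself is given only as an image. One form consistent with the symbols named above (learning rate η, Frobenius norm, per-expert gradient) is a normalized gradient step, sketched below in Python. This reading is an assumption; the function name normalized_gradient_step and the handling of a zero gradient are illustrative choices, not part of the embodiment.

```python
import numpy as np

def normalized_gradient_step(w_m, grad_m, eta):
    """One assumed update of expert m: W_m <- W_m - eta * G_m / ||G_m||_F."""
    # For a 2-D array, np.linalg.norm defaults to the Frobenius norm.
    norm = np.linalg.norm(grad_m)
    if norm == 0.0:
        # Illustrative choice: leave the weights unchanged on a zero gradient.
        return w_m.copy()
    return w_m - eta * grad_m / norm
```

Dividing by the norm makes the step length equal to η regardless of the gradient magnitude, so noisy perturbed gradients cannot produce arbitrarily large updates.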
Step 4.2: the constructed MoE network model is trained, with the training objective given by:
Figure BDA0004111504780000117
wherein
Figure BDA0004111504780000118
denotes the logistic loss function of the network, defined as
Figure BDA0004111504780000119
and Θ^(0) is initialized to 0.
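As an illustration of the training objective, the Python sketch below evaluates an empirical logistic loss using the form stated in claim 9, l(z) = log(1 + exp(-z)), averaged over n training data. The averaging, the argument z = y_i * F(x_i; Θ, W), and the label convention y_i in {-1, +1} are assumptions, since the formula image is not reproduced here; the function name logistic_loss is hypothetical.

```python
import numpy as np

def logistic_loss(y, f):
    """Mean of log(1 + exp(-y_i * f_i)) over the training data."""
    y = np.asarray(y, dtype=float)   # assumed labels in {-1, +1}
    f = np.asarray(f, dtype=float)   # model outputs F(x_i; Theta, W)
    # logaddexp(0, -z) computes log(1 + exp(-z)) without overflow.
    return np.logaddexp(0.0, -y * f).mean()
```

A model output of 0 for every sample gives a loss of log 2, and the loss approaches 0 as the outputs agree with the labels with growing margin.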
Step 4.3: using the constructed multi-expert network with the graph convolutional neural network (GCNN) and MoE structures, the data processed by the SortPooling layer is jointly trained on and then classified, and the encrypted network traffic data is assigned a predicted class from the output probabilities of the Softmax function.
In this embodiment, a schematic diagram of an Expert unit is also provided, as shown in fig. 4, where the Expert unit includes a Conv1D layer, a MaxPool1D layer, a Conv1D layer, a Dense layer, a Dropout layer, and a Dense layer in cascade.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (10)

1. An encrypted network traffic classification method based on GCNN and MoE, characterized by comprising the following steps:
S1, dividing traffic data of a mobile application program over a period of time into a plurality of traffic blocks of the same length;
S2, converting the traffic blocks into a graph dataset with node features and edge weights;
S3, constructing an encrypted network traffic classification model for the mobile application program based on a graph convolutional neural network (GCNN) and a mixture of experts (MoE), and training the model;
S4, inputting the graph dataset of the data to be tested into the encrypted network traffic classification model to obtain a classification result.
2. The encrypted network traffic classification method based on GCNN and MoE according to claim 1, wherein, when the traffic data of a mobile application program over a period of time is divided into traffic blocks of the same length, a duration and an overlap time are set and the traffic blocks are divided according to them, specifically: the length of each traffic block is set to the duration, and, except for the first and the last traffic block, each traffic block overlaps its previous traffic block by the overlap time and overlaps its next traffic block by the overlap time.
3. The encrypted network traffic classification method based on GCNN and MoE according to claim 1, wherein the process of converting traffic blocks into a graph dataset having node features and edge weights comprises the following steps:
removing DNS protocol traffic from the traffic block;
acquiring the IP addresses and port numbers used by the mobile application program and combining each IP address with its port number;
when constructing the graph data of the mobile application (MApp), obtaining the maximum number of nodes N required by one MApp graph, and generating all graph data of each MApp according to the weight between two nodes;
storing all the graph data of each MApp in two CSV-format files, the node features being stored in the features.
4. The encrypted network traffic classification method based on GCNN and MoE according to claim 1, wherein the encrypted network traffic classification model of the mobile application program based on the graph convolutional neural network (GCNN) and the mixture of experts (MoE) comprises a plurality of cascaded GCN layers, a SortPooling layer, an Expert network and a softmax layer; the graph latent representation output by the four cascaded GCN layers is input to the SortPooling layer, which selects the K largest values of the graph latent representation; the Expert network comprises a plurality of Expert units, and the selected graph latent representation is input to each of the Expert units; the output of each Expert unit is multiplied by the weight corresponding to that Expert unit, and the products are summed and input to the softmax layer, which produces the classification result.
5. The encrypted network traffic classification method based on GCNN and MoE according to claim 4, wherein, if there are L GCN layers, the output of the (l+1)-th GCN layer is expressed as:
Figure FDA0004111504770000021
wherein
Figure FDA0004111504770000022
is the output of the (l+1)-th graph convolution layer, c_l is the number of features of each graph node extracted at the l-th layer, n is the number of nodes, l = 0, ..., L-1, and Z^0 = X;
Figure FDA0004111504770000023
denotes the node feature matrix, and c denotes the number of features per node in the node feature matrix;
Figure FDA0004111504770000024
is a diagonal matrix of the graph;
Figure FDA0004111504770000025
is the adjacency matrix with self-loops added;
Figure FDA0004111504770000026
is the trainable parameter of the l-th layer.
6. The encrypted network traffic classification method based on GCNN and MoE according to claim 5, wherein G denotes one item of graph data, expressed as
Figure FDA0004111504770000027
wherein
Figure FDA0004111504770000029
is the node set in the graph data and ε is the edge set in the graph data; if A is the adjacency matrix of the graph data, the adjacency matrix with self-loops added is expressed as:
Figure FDA0004111504770000028
wherein I is an identity matrix.
7. The encrypted network traffic classification method based on GCNN and MoE according to claim 3, wherein the edge relationship between nodes is established by the cross-correlation between the nodes, i.e. if the cross-correlation between two nodes is not 0, an edge exists between the two nodes and the edge weight is the cross-correlation between them, and the calculation of the cross-correlation between two nodes comprises the following steps:
generating graph nodes according to the traffic captured in a given time window, and dividing the given time window into T slices with different durations;
counting, within each slice, the traffic packets that the mobile application sends to or receives from the service deployed at the destination IP address and port, and using the result as the cross-correlation between two nodes, expressed as:
Figure FDA0004111504770000031
wherein C_{i,j} is the cross-correlation between node i and node j, and r_i(t) is a binary variable indicating whether node i is active in time slice t: r_i(t) = 1 when node i is active, otherwise r_i(t) = 0.
8. The encrypted network traffic classification method based on GCNN and MoE according to claim 1, wherein the process of training and optimizing the model comprises the following steps:
obtaining historical data as a training dataset and slicing the training dataset, i.e. dividing it into a plurality of mutually independent and orthogonal sub-datasets, each sub-dataset being one slice;
when training the network, inputting each slice into the classification model in turn to obtain a prediction result, and training according to the prediction result and the labels corresponding to the training data;
completing the optimization when the loss converges or the maximum number of training iterations is reached.
9. The encrypted network traffic classification method based on GCNN and MoE according to claim 8, wherein, in the process of optimizing with the training data, the model is optimized by back-propagation using a model logistic loss function, expressed as:
Figure FDA0004111504770000032
wherein
Figure FDA0004111504770000033
is the logistic loss function of the network; Θ is the parameterized Expert network, denoted as
Figure FDA0004111504770000034
θ_M denotes the M-th Expert unit, d is the dimension of an Expert network, and M is the number of Expert units in the Expert network; n is the number of training data; y_i is the label of the i-th training data item; F(x_i; Θ, W) is the output of the Expert network; x_i is the i-th training data item; W is the weight of the MoE gating network; l(z) is the logistic loss, expressed as l(z) = log(1 + exp(-z)).
10. The encrypted network traffic classification method based on GCNN and MoE according to claim 8, wherein the output F(x_i; Θ, W) of the Expert network is expressed as:
Figure FDA0004111504770000041
Figure FDA0004111504770000042
Figure FDA0004111504770000043
wherein
Figure FDA0004111504770000044
is the set of selected indices; π_m(x; Θ) is the gating value of the m-th Expert network; h_m(x; Θ) is the output of the m-th Expert network; [P] is the sliced dataset; [M] is the set of expert networks governed by the gating network; f_m(x; W) is the output of the m-th Expert network; [J] is the set of filters; σ(·) is the activation function; P is the number of training data in one sliced dataset; w_{m,j} is the weight vector of the j-th filter in the m-th Expert network; x^(p) denotes the p-th data item in the sliced dataset; and x denotes the input data.
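For illustration only, the block division recited in claim 2 (equal-length traffic blocks defined by a duration and an overlap time) can be sketched in Python as follows. The function name split_into_blocks, the use of integer time units, and the choice to drop a trailing partial block are assumptions and do not limit the claims.

```python
def split_into_blocks(start, end, duration, overlap):
    """Return (block_start, block_end) pairs of equal length `duration`.

    Consecutive blocks overlap by `overlap`, so the start of each block is
    `duration - overlap` later than the start of the previous one.
    """
    if duration <= 0 or not 0 <= overlap < duration:
        raise ValueError("require duration > 0 and 0 <= overlap < duration")
    step = duration - overlap
    blocks = []
    t = start
    # Assumption: a trailing block shorter than `duration` is discarded so
    # that all blocks have the same length.
    while t + duration <= end:
        blocks.append((t, t + duration))
        t += step
    return blocks
```

For example, a 10-unit capture with a duration of 4 and an overlap of 2 yields four blocks, each interior block sharing 2 units with both its predecessor and its successor.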
CN202310207576.4A 2023-03-06 2023-03-06 Encryption network traffic classification method based on GCNN and MoE Pending CN116319583A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310207576.4A CN116319583A (en) 2023-03-06 2023-03-06 Encryption network traffic classification method based on GCNN and MoE

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310207576.4A CN116319583A (en) 2023-03-06 2023-03-06 Encryption network traffic classification method based on GCNN and MoE

Publications (1)

Publication Number Publication Date
CN116319583A true CN116319583A (en) 2023-06-23

Family

ID=86812581

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310207576.4A Pending CN116319583A (en) 2023-03-06 2023-03-06 Encryption network traffic classification method based on GCNN and MoE

Country Status (1)

Country Link
CN (1) CN116319583A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117240615A (en) * 2023-11-13 2023-12-15 四川大学 Migration learning network traffic correlation method based on time interval diagram watermark
CN117240615B (en) * 2023-11-13 2024-01-30 四川大学 Migration learning network traffic correlation method based on time interval diagram watermark


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination