CN115277888A - Method and system for analyzing message type of mobile application encryption protocol - Google Patents
- Publication number
- CN115277888A (application number CN202211171000.9A)
- Authority
- CN
- China
- Prior art keywords
- message
- data
- feature
- mobile application
- representing
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/22—Parsing or analysis of headers
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Physics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Signal Processing (AREA)
- Computer Networks & Wireless Communication (AREA)
- Computer Security & Cryptography (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
The invention relates to the technical field of message analysis, and discloses a method and a system for analyzing the type of a mobile application encryption protocol message. The invention solves the problems of high resource consumption, poor universality, low accuracy, poor generalization capability and the like in the prior art.
Description
Technical Field
The invention relates to the technical field of message analysis, in particular to a method and a system for analyzing message types of a mobile application encryption protocol.
Background
Network traffic is moving steadily toward an era of comprehensive encryption. Encryption protects data transmission in network communication, but malicious behavior such as malware, illegal speech and network attacks also hides in encrypted mobile application traffic and poses a serious threat to Internet users. Parsing the message types of mobile application encryption protocols is an important precondition for information monitoring, security detection and electronic evidence collection, and is of great significance for maintaining a healthy network environment, national security and social stability.
Traditional port matching and deep packet inspection must first parse the message content and then identify the message type through regular-expression matching, so they fail on encrypted protocol messages. Machine learning methods require hand-designed features for the messages to be identified, which costs considerable time and effort. Given the many applications and differing encryption protocols, it is difficult to design a feature set that reflects traffic characteristics in general. This limits the universality of machine learning methods and makes it hard for them to achieve good results when analyzing and identifying encrypted network traffic.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a method and a system for analyzing the message type of a mobile application encryption protocol, which solve the problems of high resource consumption, poor universality, low accuracy, poor generalization capability and the like in the prior art.
The technical scheme adopted by the invention for solving the problems is as follows:
A method for analyzing the message type of a mobile application encryption protocol extracts and learns features of different modalities from mobile application encryption protocol messages, and parses the message type by fusing those features.
As a preferred technical solution, the method comprises the following steps:
S1, preprocessing message data: preprocessing the acquired raw mobile application network traffic data, and extracting structural feature data, timing feature data and interaction feature data of the message payloads in the raw data;
S2, feature learning, which specifically comprises the following steps:
S2A, learning message structure features: using the structural feature data, constructing a message structure feature learning model for mobile application private encryption protocols based on a dynamic pooling convolutional neural network, and learning a message payload structure feature vector;
S2B, learning message timing features: using the timing feature data, constructing a message timing feature learning model for mobile application private encryption protocols based on a long short-term memory (LSTM) network, and learning a message payload timing feature vector;
S2C, learning message interaction features: using the interaction feature data, constructing a message interaction feature learning model for mobile application private encryption protocols based on a graph convolutional neural network, and learning a message session interaction feature vector;
S3, message type analysis: fusing and concatenating the message payload structure feature vector, the timing feature vector and the interaction feature vector, and outputting the message type analysis result for the mobile application private encryption protocol using a maximum entropy classifier.
As a preferred technical solution, the step S1 includes the following steps:
S11, setting the length to which raw network data packets are truncated during preprocessing, segmenting continuous network traffic into session flows, and separating, for each data packet in a session flow, the network message payload data above the transport layer;
S12, distinguishing the uplink and downlink directions of the message payload data: the direction is defined by the data flow within the session; message payload data with the same source address, destination address and port numbers as the first data packet is taken as uplink message payload data, and the rest as downlink message payload data;
S13, calculating the payload data sizes in the uplink and downlink directions respectively, and constructing a payload data sequence in hexadecimal form;
S14, concatenating the uplink and downlink message payload data, uplink data first and downlink data after, to obtain the message payload structural feature data;
S15, arranging the data in packet time order to obtain the message payload timing feature data;
S16, constructing a feature representation model based on sequence-to-graph conversion, and converting the data packet sequence in the session flow into an undirected graph; for each data packet in the session flow, extracting the packet direction, the standard information entropy of the payload data and the payload length, and embedding them as graph node features to obtain the message payload interaction feature data.
As a preferred technical solution, in step S16, the calculation formula of the standard information entropy is:
$$H(X) = -\sum_{i=1}^{n} p(x_i) \log p(x_i)$$
wherein $H(X)$ represents the standard information entropy, $X$ represents a discrete random variable with an arbitrary distribution, $n$ represents the number of discrete values contained in $X$, $i$ represents the sequence number of a byte in the data packet, $x_i$ represents a byte in the data packet, and $p(x_i)$ represents the probability of byte $x_i$ occurring in $X$.
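As an illustration of the per-packet entropy feature, the following Python sketch computes the Shannon entropy of a payload's byte distribution. The function name `payload_entropy`, the base-2 logarithm and the optional division by 8 bits are assumptions made for this sketch; the patent text does not state the logarithm base or any normalization.

```python
import math
from collections import Counter


def payload_entropy(payload: bytes, normalize: bool = False) -> float:
    """Shannon entropy of the byte values in `payload`, in bits per byte.

    With normalize=True the value is divided by 8 (the maximum for bytes),
    which is an assumed convention, not one taken from the patent.
    """
    if not payload:
        return 0.0
    total = len(payload)
    counts = Counter(payload)  # occurrences of each byte value
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy / 8.0 if normalize else entropy


if __name__ == "__main__":
    # Encrypted payloads tend toward high entropy; repetitive ones toward low.
    print(payload_entropy(b"AAAAAAAA"))
    print(payload_entropy(bytes(range(256))))
```

Computing the entropy per packet rather than per session keeps it usable as a graph node feature alongside packet direction and payload length.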
As a preferred technical solution, the step S2A includes the following steps:
S2A1, inputting the message payload structural feature data into an autoencoder with a sparsity constraint and a noise robustness constraint for noise suppression and dimensionality reduction, and generating a denoised, reduced-dimension feature vector;
S2A2, constructing a message structure feature learning model for mobile application private encryption protocols based on a dynamic pooling convolutional neural network; inputting the denoised, reduced-dimension feature vector into the constructed model for learning, to obtain the feature sequences produced by the convolution kernel operation;
the message structure feature learning model is constructed as follows:
the model is formed by stacking three one-dimensional convolution layers; padding uses the "same" mode, and each convolution layer is accompanied by batch normalization; for each convolution layer, the hidden-layer output at each position is the weighted sum of the input values covered by the convolution kernel, and is defined over the following quantities: the row index and column index of the weight matrix of the one-dimensional convolution kernel, the weight at that row and column, the shape of the convolution kernel, the value of the input data at a given row and column, the total number of rows and the total number of columns of the input (the input shape), and the output value at each position;
after the convolution kernel operation, several feature sequences are obtained for each input; the feature vector output by the last convolution layer is then passed to the pooling operation of step S2A3.
S2A3, for the feature vector output by the last convolution layer, taking k-max pooling as the nonlinear down-sampling function, and extracting features with the dynamic pooling operation to obtain the message payload structure feature vector;
the dynamic pooling operation is as follows:
$$k_l = \max\left(k_{top},\ \left\lceil \frac{L - l}{L}\, s \right\rceil\right)$$
wherein the $k_l$ largest activations retained by the pooling represent the message structure features, $L$ represents the total number of convolution layers, $l$ represents the index of the current convolution layer, $s$ represents the length of the input sequence, and $k_{top}$ represents the fixed pooling layer parameter.
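The dynamic pooling of step S2A3 can be sketched in a few lines of Python. This is a minimal sketch that assumes the usual dynamic k-max pooling rule, in which the number of retained activations shrinks with layer depth but never falls below a fixed floor; the names `dynamic_k`, `k_max_pool` and `k_top` are illustrative and not taken from the patent.

```python
def dynamic_k(layer: int, total_layers: int, seq_len: int, k_top: int) -> int:
    """Number of activations kept at `layer` (1-based) out of `total_layers`.

    Assumed rule: max(k_top, ceil((L - l) / L * s)), computed with integers
    to avoid floating-point rounding in the ceiling.
    """
    remaining = (total_layers - layer) * seq_len
    scaled = (remaining + total_layers - 1) // total_layers  # integer ceiling
    return max(k_top, scaled)


def k_max_pool(sequence, k):
    """Keep the k largest values of `sequence`, preserving their original order."""
    k = min(k, len(sequence))
    top = sorted(range(len(sequence)), key=lambda i: sequence[i], reverse=True)[:k]
    return [sequence[i] for i in sorted(top)]


if __name__ == "__main__":
    features = [0.1, 0.9, 0.3, 0.7, 0.2, 0.8]
    k = dynamic_k(layer=2, total_layers=3, seq_len=len(features), k_top=2)
    print(k, k_max_pool(features, k))
```

Unlike plain max pooling, k-max pooling keeps the relative order of the selected activations, so positional structure in the payload survives the down-sampling.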
As a preferred technical solution, in step S2A1, the cost function of the sparsity constraint is defined over the following quantities: the input of the autoencoder, the sparsity constraint and its weight, the expectation of the total noise, the number of hidden layers in the autoencoder, Gaussian noise with mean 0 and variance 1, the input of each neural network layer, the index of a hidden-layer unit, the number of hidden-layer neurons, and the hidden-layer response;
the cost function of the noise robustness constraint is defined over the following quantities: the target output, the output of the autoencoder learning network, the activation factor, the indices of two input data items, and the connection weight from one input data item to the other.
As a preferred technical solution, the step S2B includes the following steps:
S2B1, constructing a message payload timing feature learning model for mobile application private encryption protocols based on a long short-term memory (LSTM) network, wherein the model comprises JI memory units, JI being an integer with 32 ≤ JI ≤ 256, and learning the message payload timing feature data with the constructed model, wherein the learning formula is as follows:
$$g_t = \sigma\left(W_g \cdot [h_{t-1}, x_t] + b_g\right), \quad g \in \{f, i, o\}$$
wherein $g_t$ represents the gate unit function, $f$, $i$ and $o$ respectively represent the forget gate, the input gate and the output gate, $\sigma$ represents the activation function, $W_g$ represents the parameters of the corresponding forget gate, input gate or output gate, $x_t$ represents the input at time $t$, $h_{t-1}$ represents the output at time $t-1$, and $b_g$ represents the bias value of the forget gate, input gate or output gate;
S2B2, obtaining the output as the message payload timing feature vector, wherein the output formula is as follows:
$$h_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \odot \tanh(C_t)$$
wherein $h_t$ represents the message payload timing feature vector, $\sigma$ represents the activation function, $C_t$ represents the cell state vector, $\tanh$ represents the tanh activation function, $W_o$ represents the parameters of the output gate, and $b_o$ represents the bias.
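To make the gate computation of steps S2B1 and S2B2 concrete, here is a dependency-free Python sketch of a single LSTM time step. It assumes the standard LSTM cell: the candidate-state parameters `W_c` and `b_c`, the parameter dictionary layout and the function names are assumptions for illustration, since the patent text only describes the three gates and the output.

```python
import math


def _sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def _affine(weights, bias, vec):
    """Compute W @ vec + b for list-based matrices."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def lstm_step(params, h_prev, c_prev, x_t):
    """One LSTM time step; returns (h_t, c_t).

    `params` holds W_f, W_i, W_o, W_c and b_f, b_i, b_o, b_c (hypothetical
    names); each W has one row per hidden unit and len(h_prev) + len(x_t)
    columns.
    """
    concat = list(h_prev) + list(x_t)  # [h_{t-1}, x_t]
    f = [_sigmoid(z) for z in _affine(params["W_f"], params["b_f"], concat)]
    i = [_sigmoid(z) for z in _affine(params["W_i"], params["b_i"], concat)]
    o = [_sigmoid(z) for z in _affine(params["W_o"], params["b_o"], concat)]
    c_hat = [math.tanh(z) for z in _affine(params["W_c"], params["b_c"], concat)]
    # Forget part of the old cell state and add the gated candidate state.
    c_t = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, c_hat)]
    # The output gate scales the squashed cell state.
    h_t = [ok * math.tanh(ck) for ok, ck in zip(o, c_t)]
    return h_t, c_t
```

Running `lstm_step` over the packets of a session in time order and keeping the final `h_t` gives one fixed-length timing feature vector per session, which is the role the timing branch plays in the fusion step.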
As a preferred technical solution, the step S2C includes the steps of:
S2C1, constructing a message session interaction feature learning model for mobile application private encryption protocols based on a graph convolutional neural network, wherein the model comprises two graph convolution layers connected in sequence; the number of channels of each of the two graph convolutions is set for the graph convolution operation, and ReLU is selected as the activation function;
inputting the message payload interaction feature data into the graph convolutional neural network model, the graph obtained by the sequence-to-graph method being $G = (X, A)$; wherein the number of network data packets (nodes) in the graph is $N$, the number of features of each node is $F$, the feature matrix is $X \in \mathbb{R}^{N \times F}$, and the adjacency matrix is $A \in \mathbb{R}^{N \times N}$;
S2C2, performing the graph convolution operation with the learning model constructed in step S2C1, as follows:
$$H^{(l+1)} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)} + b^{(l)}\right)$$
wherein $\tilde{A} = A + I$, $I$ represents the identity matrix, $\tilde{D}$ represents the degree matrix corresponding to $\tilde{A}$, $l$ represents the network layer index, $W^{(l)}$ represents the weight of layer $l$, the dimension of the weight is $F_l \times F_{l+1}$, $F_{l+1}$ represents the dimension of the graph node data after the convolution of layer $l$, $b^{(l)}$ represents the bias of layer $l$, $H^{(l)}$ represents the input of layer $l$, the input of the first layer being $H^{(0)} = X$, and $\sigma$ represents the nonlinear activation function ReLU;
S2C3, obtaining an $N \times F_2$ matrix after the two graph convolution layers, and using the Flatten operation to stretch the matrix into a one-dimensional feature vector:
$$Z = [z_1, z_2, \ldots, z_{N F_2}]$$
wherein $Z$ represents the message session interaction feature vector, $Z$ has dimension $N F_2$, and $z_k$ represents each element of the message session interaction feature vector;
S2C4, compressing $Z$ with one fully connected layer to reduce its dimensionality, and learning the message payload session feature vector:
$$V = \sigma\left(W Z + b\right)$$
wherein $V$ represents the message payload session feature vector, $W$ represents the weight matrix of the fully connected layer, $b$ represents the bias, and $\sigma$ represents the activation function; the ReLU function is used at the fully connected layer.
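Steps S2C2 and S2C3 can be illustrated with a small pure-Python sketch of one graph convolution layer followed by flattening. It assumes the widely used symmetric normalization with self-loops and an N×F node feature layout; the helper names (`matmul`, `normalized_adjacency`, `gcn_layer`, `flatten`) are hypothetical, and a real implementation would normally use a tensor library instead of nested lists.

```python
import math


def matmul(a, b):
    """Multiply two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a square adjacency matrix."""
    n = len(adj)
    a_tilde = [[adj[r][c] + (1.0 if r == c else 0.0) for c in range(n)] for r in range(n)]
    degree = [sum(row) for row in a_tilde]  # degrees including self-loops
    return [[a_tilde[r][c] / math.sqrt(degree[r] * degree[c]) for c in range(n)]
            for r in range(n)]


def gcn_layer(adj, feats, weight, bias):
    """One graph convolution layer with ReLU activation."""
    propagated = matmul(matmul(normalized_adjacency(adj), feats), weight)
    return [[max(0.0, v + b) for v, b in zip(row, bias)] for row in propagated]


def flatten(matrix):
    """Stretch an N x F matrix into a one-dimensional feature vector."""
    return [v for row in matrix for v in row]
```

Stacking two `gcn_layer` calls and passing the result through `flatten` mirrors the two-layer structure of the claim; the fully connected compression of step S2C4 would then be one more affine map with ReLU.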
As a preferred technical solution, the step S3 includes the following steps:
S31, performing ensemble learning and joint training of the message structure feature learning model, the message timing feature learning model and the message interaction feature learning model, and setting the hyperparameters for joint training; concatenating the obtained message payload structure feature vector, message payload timing feature vector and message session interaction feature vector to obtain the fused feature vector:
$$F = [F_s;\ F_t;\ F_g]$$
wherein $F_s$, $F_t$ and $F_g$ represent the structure, timing and session interaction feature vectors respectively;
S32, calculating through a second fully connected layer and its softmax activation function:
$$y = \mathrm{softmax}\left(W_2 F + b_2\right)$$
wherein $W_2$ represents the weight matrix of the second fully connected layer, $b_2$ represents the bias, $K$ represents the number of classes to be distinguished, and $y$ is a one-dimensional vector of length $K$;
S33, finally calculating and outputting the message type analysis result of the mobile application private encryption protocol:
$$\hat{c} = \arg\max_{k \in \{1, \ldots, K\}} y_k$$
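The fusion and classification of step S3 amount to concatenating the three feature vectors, applying one affine layer and taking a softmax. The Python sketch below assumes list-based vectors and picks the class with the highest probability as the result; the function names and the choice of argmax as the final decision rule are assumptions, not details confirmed by the patent.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def classify_message(struct_vec, timing_vec, inter_vec, weight, bias):
    """Fuse three feature vectors and return (predicted class index, probabilities).

    `weight` has one row per message type and one column per fused feature.
    """
    fused = list(struct_vec) + list(timing_vec) + list(inter_vec)  # concatenation
    logits = [sum(w * x for w, x in zip(row, fused)) + b for row, b in zip(weight, bias)]
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs
```

Because the three branches are trained jointly, the gradient of the classification loss flows back through the concatenation into each feature learner, so the fused layer needs no separate training stage.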
A mobile application encryption protocol message type analysis system, based on the above mobile application encryption protocol message type analysis method, comprises the following modules:
a message data preprocessing module, used to preprocess the acquired raw mobile application network traffic data and extract structural feature data, timing feature data and interaction feature data of the message payloads in the raw data;
a message structure feature learning module, used to construct, from the structural feature data, a message structure feature learning model for mobile application private encryption protocols based on a dynamic pooling convolutional neural network, and to learn a message payload structure feature vector;
a message timing feature learning module, used to construct, from the timing feature data, a message timing feature learning model for mobile application private encryption protocols based on a long short-term memory (LSTM) network, and to learn a message payload timing feature vector;
a message interaction feature learning module, used to construct, from the interaction feature data, a message interaction feature learning model for mobile application private encryption protocols based on a graph convolutional neural network, and to learn a message session interaction feature vector;
a message type analysis module, used to fuse and concatenate the message payload structure feature vector, the timing feature vector and the interaction feature vector, and to output the message type analysis result for the mobile application private encryption protocol using a maximum entropy classifier;
the input ends of the message structure feature learning module, the message timing feature learning module and the message interaction feature learning module are each electrically connected to the output end of the message data preprocessing module, and the output ends of these three modules are each electrically connected to the input end of the message type analysis module.
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention can accurately identify the message types of various network mobile application private encryption protocols, thereby improving the supervision efficiency and the supervision strength of network space safety;
(2) The invention learns and classifies from the payload data above the transport layer of network traffic and does not depend on the IP address or port number information in packet headers, so the classification model generalizes well;
(3) The invention is tested on data sets sampled in a complex network environment, so the detection results better match the requirements of a real network environment.
Drawings
FIG. 1 is a schematic diagram of the steps of the method for analyzing the message type of a mobile application encryption protocol according to the present invention;
FIG. 2 is a schematic structural diagram of the mobile application encryption protocol message type analysis system according to the present invention;
FIG. 3 is a schematic diagram of the mobile application encryption protocol message type analysis framework based on multimodal feature fusion learning according to the present invention;
FIG. 4 is a diagram of the sequence-to-graph conversion process for the session features of mobile application private encryption protocol messages;
FIG. 5 is one of exemplary graphs of a mobile application session data sequence to graph conversion result;
FIG. 6 is a second exemplary graph of a mobile application session data sequence to graph conversion result;
FIG. 7 is a third exemplary graph of a mobile application session data sequence to graph conversion result;
FIG. 8 is a fourth exemplary graph of a mobile application session data sequence to graph conversion result;
FIG. 9 is a fifth exemplary graph of a mobile application session data sequence to graph conversion result;
FIG. 10 is a sixth exemplary graph of a mobile application session data sequence to graph conversion result;
FIG. 11 is a seventh exemplary graph of a mobile application session data sequence to graph conversion result;
FIG. 12 is an eighth exemplary graph of a mobile application session data sequence to graph conversion result;
FIG. 13 is a schematic diagram comparing the accuracy of other classification algorithms and the present invention in analyzing 17 mobile application encryption protocol message types;
FIG. 14 is a schematic diagram comparing the precision of other classification algorithms and the present invention in analyzing 17 mobile application encryption protocol message types;
FIG. 15 is a schematic diagram comparing the recall of other classification algorithms and the present invention in analyzing 17 mobile application encryption protocol message types;
FIG. 16 is a schematic diagram comparing the F1 values of other classification algorithms and the present invention in analyzing 17 mobile application encryption protocol message types.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited to these examples.
Example 1
As shown in FIG. 1 to FIG. 16, the present invention provides a mobile application encryption protocol message type analysis method based on multimodal feature fusion learning, that is, a method for analyzing the message type of a mobile application encryption protocol, including the following steps:
(1) Preprocessing the acquired raw mobile application network traffic data, and extracting payload structure feature data, payload timing feature data and session interaction feature data of the mobile application encryption protocol messages;
(2) Constructing a message structure feature learning model for mobile application private encryption protocols based on an autoencoder and a dynamic pooling convolutional neural network, and learning a message payload structure feature vector;
(3) Constructing a message timing feature learning model for mobile application private encryption protocols based on a long short-term memory (LSTM) network, and learning a message payload timing feature vector;
(4) Constructing a message interaction feature learning model for mobile application private encryption protocols based on a graph convolutional neural network, and learning a message session interaction feature vector;
(5) Fusing and concatenating the payload structure feature vector, the payload timing feature vector and the session interaction feature vector of the mobile application private encryption protocol message, and outputting the message type analysis result using a maximum entropy classifier.
A more specific description of the invention follows.
Further, step (1) specifically comprises the following sub-steps:
(1.1) preprocessing the message payload data of the raw network data packets: setting the length to which data packets are truncated during preprocessing, segmenting continuous network traffic into session flows, and separating, for each data packet in a session flow, the network message payload data above the transport layer;
(1.2) distinguishing the uplink and downlink directions of the message payload data: data packets are separated by direction, the direction of the first data packet in the session being defined as uplink; message payload data with the same source address, destination address and port numbers as the first data packet is taken as uplink message payload data, and the rest as downlink message payload data;
(1.3) calculating the payload data sizes in the uplink and downlink directions respectively, and constructing payload data sequences in hexadecimal form. The format is as follows:
the uplink message payload data is represented as: 00 + hex(uplink payload data size);
the downlink message payload data is represented as: FF + hex(downlink payload data size);
(1.4) concatenating the uplink and downlink message payload data, uplink data before downlink data, to obtain the message payload structural feature data;
(1.5) arranging the data in packet time order to obtain the message payload timing feature data;
(1.6) constructing a feature representation model based on sequence-to-graph conversion, and converting the data packet sequence in the session flow into an undirected graph. For each data packet in the session flow, the packet direction, the information entropy of the payload data and the payload length are extracted and embedded as graph node features to obtain the message payload interaction feature data.
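Sub-steps (1.2) to (1.4) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the `Packet` tuple, the per-packet (rather than per-direction total) size encoding, and the fixed four-digit hexadecimal width are all illustrative choices that the patent text does not specify.

```python
from typing import List, NamedTuple, Tuple


class Packet(NamedTuple):
    """Hypothetical minimal packet record for one session flow."""
    src: str
    sport: int
    dst: str
    dport: int
    payload: bytes


def split_by_direction(session: List[Packet]) -> Tuple[List[bytes], List[bytes]]:
    """Payloads matching the first packet's addresses and ports are uplink."""
    if not session:
        return [], []
    first = session[0]
    flow = (first.src, first.sport, first.dst, first.dport)
    uplink, downlink = [], []
    for pkt in session:
        same = (pkt.src, pkt.sport, pkt.dst, pkt.dport) == flow
        (uplink if same else downlink).append(pkt.payload)
    return uplink, downlink


def encode_size(size: int, uplink: bool, width: int = 4) -> str:
    """Prefix 00 (uplink) or FF (downlink) to the zero-padded hex size."""
    return ("00" if uplink else "FF") + format(size, f"0{width}X")


def structure_sequence(session: List[Packet]) -> str:
    """Concatenate encoded sizes, all uplink entries before downlink entries."""
    uplink, downlink = split_by_direction(session)
    parts = [encode_size(len(p), True) for p in uplink]
    parts += [encode_size(len(p), False) for p in downlink]
    return "".join(parts)
```

The timing feature data of sub-step (1.5) would use the same encoding but keep the packets in their original time order instead of grouping them by direction.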
The feature representation model based on sequence-to-graph conversion is constructed by converting the data packet sequence of a session into a graph structure and representing the converted data with a graph neural network. The conversion process is shown in FIG. 4. First, the transmission direction of each data packet must be distinguished. To this end, the party that sends the first packet in the session is defined as C and the other party as S; the positive direction, a packet sent from C to S, is represented by 0, and the negative direction, a packet sent from S to C, is represented by 1. The packet transmission process between the two parties of the session can thus be represented by an array A whose elements are 0 or 1, with the order of the elements being the order of the data packets in the session. This one-dimensional array A representing the packet directions is converted into an adjacency matrix M of an undirected graph. The packets are connected in time order to form sequences, and the sequences are then connected end to end to form a graph structure.
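The conversion from a packet sequence to a graph can be illustrated with a short Python sketch. It builds the 0/1 direction array and, as the simplest reading of "connected in time order", a symmetric adjacency matrix that links each packet to its successor. The exact linking rule between same-direction runs is not spelled out in the text, so `chain_adjacency` is an assumed simplification rather than the patent's definitive construction.

```python
from typing import List


def direction_array(senders: List[str]) -> List[int]:
    """0 for packets sent by the party that sent the first packet (C), else 1."""
    if not senders:
        return []
    client = senders[0]
    return [0 if sender == client else 1 for sender in senders]


def chain_adjacency(n: int) -> List[List[int]]:
    """Symmetric adjacency matrix linking each packet to the next one in time."""
    matrix = [[0] * n for _ in range(n)]
    for k in range(n - 1):
        matrix[k][k + 1] = 1  # edge from packet k to packet k + 1
        matrix[k + 1][k] = 1  # mirrored entry, since the graph is undirected
    return matrix


if __name__ == "__main__":
    directions = direction_array(["C", "S", "S", "C"])
    print(directions)
    for row in chain_adjacency(len(directions)):
        print(row)
```

The direction array then doubles as one of the three node features, alongside payload length and entropy.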
With the data structure of the graph, a one-dimensional sequence of data packet transmission processes can be represented in a two-dimensional mesh form. The graphical structure of the encryption protocol messaging session interaction feature of several mobile applications is shown in fig. 5-12.
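The direction array A and a simplified adjacency matrix M can be sketched in Python as follows. This is only a minimal sketch under stated assumptions: the function names and the `(src, dst)` packet tuples are hypothetical, and the adjacency matrix links only packets that are adjacent in time, which may be simpler than the construction shown in FIG. 4.

```python
def direction_array(packets):
    """Return A: 0 for packets sent by C (sender of the first packet), 1 otherwise.

    `packets` is a time-ordered list of (src, dst) endpoint pairs (hypothetical format).
    """
    if not packets:
        return []
    client = packets[0][0]  # C is the party that sends the first packet
    return [0 if src == client else 1 for src, _dst in packets]


def chain_adjacency(n):
    """Return a symmetric n x n adjacency matrix M linking time-adjacent packets.

    Assumption: only consecutive packets are connected; the graph in the
    patent may contain additional edges between packet sequences.
    """
    m = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        m[i][i + 1] = 1
        m[i + 1][i] = 1  # undirected graph, so M is symmetric
    return m


session = [("10.0.0.2:5000", "1.2.3.4:443"),
           ("1.2.3.4:443", "10.0.0.2:5000"),
           ("10.0.0.2:5000", "1.2.3.4:443")]
A = direction_array(session)
M = chain_adjacency(len(A))
```

Here `A` marks the second packet as coming from S, and `M` is symmetric because the graph is undirected.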
Features extracted from each data packet are embedded in the graph nodes to express encrypted network traffic features. Calculating the length of the transmission layer load and a standard information entropy, wherein the calculation formula of the standard information entropy is as follows:
The transport layer load length and the standard information entropy are then embedded as graph node features and associated with the corresponding node. The three values of packet direction, load length and standard information entropy are combined into an array, so the sequence-to-graph feature representation of each session yields a 3*N matrix and a label.
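The node features can be illustrated with a short Python sketch. The entropy formula itself is not reproduced in this text, so the sketch assumes that the standard information entropy is the Shannon entropy of the byte distribution of a payload divided by its 8-bit maximum; the function names are hypothetical.

```python
import math
from collections import Counter


def normalized_entropy(payload: bytes) -> float:
    """Shannon entropy of the byte distribution, scaled to [0, 1].

    Assumption: "standard" entropy means dividing by the 8-bit maximum
    (log2 of 256 possible byte values); the patent's formula may differ.
    """
    if not payload:
        return 0.0
    total = len(payload)
    entropy = 0.0
    for count in Counter(payload).values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy / 8.0


def node_feature_matrix(directions, payloads):
    """Build the 3*N matrix: one row each for direction, length and entropy."""
    lengths = [len(p) for p in payloads]
    entropies = [normalized_entropy(p) for p in payloads]
    return [list(directions), lengths, entropies]


features = node_feature_matrix([0, 1], [b"aaaa", bytes(range(256))])
```

Under this assumption a constant payload scores 0 and a payload in which every byte value occurs equally often scores 1, so encrypted payloads tend towards the upper end of the range.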
Further, the step (2) specifically comprises the following sub-steps:
and (2.1) inputting the message load structural feature data into a self-encoder with sparsity constraint conditions and noise robustness constraint conditions for anti-noise and dimension reduction processing so as to improve the anti-interference capability of mobile application encryption protocol message type analysis under the network environment of background flow. The implementation of the step can not only reduce the training time of each round of the subsequent dynamic pooling convolutional neural network, but also extract the characteristics more accurately, and finally increase the accuracy of the type analysis of the mobile application encryption protocol message.
A sparsity constraint condition is set in the hidden layer of the self-encoder. Background traffic noise is taken into account at the input of the self-encoder. The cost function is defined in terms of the input of the self-encoder, the expectation of the input noise, the sparsity constraint and its weight, the number of hidden layers in the self-encoder, the index of the hidden layer unit, the number of hidden layer neurons and the hidden layer response. The sparsity constraint cost function of the self-encoder is:
and setting a noise robustness constraint condition in the self-encoder to constrain the connection weight matrix so as to strengthen a larger weight and weaken the disturbance of a small weight representing network background traffic noise. The cost function of the noise robustness constraint of the self-encoder is:
and inputting the message load structure feature data into a self-encoder with sparsity constraint conditions and noise robustness constraint conditions for unsupervised learning, and generating a feature vector after dimension reduction and noise resistance processing.
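The two ingredients of step (2.1) can be sketched in Python as follows. The cost functions themselves are not reproduced in this text, so the sketch makes two assumptions: the input is corrupted with zero-mean Gaussian noise (claim 6 mentions Gaussian noise with mean 0 and variance 1), and the sparsity cost is the KL-divergence penalty commonly used for sparse autoencoders. The names `corrupt`, `kl_sparsity_penalty`, `rho` and `beta` are hypothetical, and the noise robustness constraint on the connection weight matrix is not sketched.

```python
import math
import random


def corrupt(x, rng, sigma=1.0):
    """Add zero-mean Gaussian noise to an input vector (models background traffic)."""
    return [v + rng.gauss(0.0, sigma) for v in x]


def kl_sparsity_penalty(mean_activations, rho=0.05, beta=1.0):
    """beta * sum_j KL(rho || rho_hat_j) for hidden-unit mean activations in (0, 1)."""
    penalty = 0.0
    for rho_hat in mean_activations:
        penalty += rho * math.log(rho / rho_hat)
        penalty += (1 - rho) * math.log((1 - rho) / (1 - rho_hat))
    return beta * penalty


rng = random.Random(0)
noisy = corrupt([0.1, 0.2, 0.3], rng)
on_target = kl_sparsity_penalty([0.05, 0.05])   # units already at the target sparsity
too_active = kl_sparsity_penalty([0.5, 0.5])    # units that fire too often
```

The penalty vanishes when every hidden unit has the target mean activation `rho` and grows as units become more active, which pushes the hidden layer towards a sparse code.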
(2.2) The feature vector obtained after dimension reduction and anti-noise processing is input into the constructed dynamic pooling convolutional neural network for learning. A mobile application private encryption protocol message structure feature learning model is constructed based on the dynamic pooling convolutional neural network; the model is formed by stacking three one-dimensional convolution layers. The padding mode is "same", and each convolution layer is followed by batch normalization.
For each layer of convolution operation, setting the number of channels c of the convolution operation, and outputting the hidden layer after one-dimensional convolution as follows:
after the convolution kernel operation, a plurality of characteristic sequences can be obtained for each input data, and the characteristics of the last layer of convolution output are set as follows:
DropOut is added after the convolution operation to prevent overfitting.
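A one-dimensional convolution with "same" padding can be sketched in Python as follows. This is a single-channel illustration only: `conv1d_same` is a hypothetical helper, the kernel values are arbitrary, and batch normalization, multiple channels and DropOut are left out.

```python
def conv1d_same(x, kernel):
    """Slide `kernel` over `x` with zero padding so the output length equals len(x)."""
    k = len(kernel)
    pad_left = (k - 1) // 2
    pad_right = k - 1 - pad_left
    padded = [0.0] * pad_left + list(x) + [0.0] * pad_right
    return [sum(kernel[j] * padded[i + j] for j in range(k))
            for i in range(len(x))]


out = conv1d_same([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
```

Because of the zero padding, stacking three such layers keeps the sequence length unchanged, so only the pooling step reduces it.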
(2.3) for the feature vector output by the last layer of convolution, adopting k-max pooling as a nonlinear down-sampling function, and extracting features by utilizing nonlinear function dynamic pooling operation, wherein the dynamic pooling operation is as follows:
and after the pooling operation, obtaining the message load structure characteristic vector.
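The k-max pooling operation can be sketched in Python as follows: it keeps the k largest values of a feature sequence and preserves their original order. The sketch uses a fixed k and a hypothetical function name; how the patent chooses k dynamically is not reproduced in this text.

```python
def k_max_pooling(sequence, k):
    """Keep the k largest values of `sequence`, in their original order."""
    if k >= len(sequence):
        return list(sequence)
    # indices of the k largest values, then restored to positional order
    top = sorted(range(len(sequence)), key=lambda i: sequence[i], reverse=True)[:k]
    return [sequence[i] for i in sorted(top)]


pooled = k_max_pooling([1, 5, 2, 8, 3], k=3)
```

Unlike ordinary max pooling, the output always has length k regardless of the input length, and the relative order of the selected features is retained.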
Further, the step (3) specifically includes the following sub-steps:
(3.1) A mobile application private encryption protocol message load time sequence feature learning model is constructed based on a long short-term memory (LSTM) network; the model comprises 64 memory units and is used for learning the message load time sequence feature data.
The mobile application private encryption protocol message load time sequence characteristic learning model adopts a gate control mechanism to learn:
The activation function compresses the gate values into the interval [0,1].
DropOut is added to the learning model to prevent overfitting, with a threshold of 0.5.
(3.2) model outputs are:
and the unit state vector acts with an output gate after passing through the activation function to obtain a time sequence characteristic vector of the output message load.
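The gating mechanism and the output step can be illustrated with a scalar LSTM step in Python. This is a toy sketch of the standard LSTM equations, not the 64-unit model of the patent; the parameter layout `p` and the function names are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step for scalar input and state.

    `p` maps each of "f", "i", "o", "g" to a (w_x, w_h, b) tuple (hypothetical layout).
    """
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("f"))      # forget gate
    i = sigmoid(pre("i"))      # input gate
    o = sigmoid(pre("o"))      # output gate
    g = math.tanh(pre("g"))    # candidate cell state
    c = f * c_prev + i * g     # new cell state
    h = o * math.tanh(c)       # cell state through tanh, scaled by the output gate
    return h, c


zero = {name: (0.0, 0.0, 0.0) for name in "fiog"}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, p=zero)
```

With all parameters set to zero every gate evaluates to 0.5, so the cell state is halved and the output is half of its tanh, which makes the role of each gate easy to check by hand.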
Further, the step (4) specifically includes the following sub-steps:
(4.1) A mobile application private encryption protocol message session interaction feature learning model is constructed based on the graph convolutional neural network. The model structure comprises two graph convolution operations; the number of channels of the two graph convolutions is set, and the ReLU function is selected as the activation function.
The message load interaction feature data are input into the graph convolutional neural network model, the sequence-to-graph method having converted each session into a graph. The graph is described by the number of network data packets it contains (its nodes), the number of packet features carried by each node, the feature matrix and the adjacency matrix;
And (4.2) performing graph convolution operation by using the constructed learning model. In the model, for each layer map the convolution operations are:
(4.3) After the two-layer graph convolution operation, a matrix is obtained; the Flatten operation is used to stretch this matrix into a one-dimensional feature vector, giving:
(4.4) A fully connected layer is used to compress the vector and reduce its dimensionality, and the message load session feature vector is obtained by learning:
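Steps (4.2) and (4.3) can be sketched in Python as follows. The layer formula is not reproduced in this text; the sketch assumes the commonly used normalized propagation rule ReLU(D^(-1/2) (A + I) D^(-1/2) H W + b), which is consistent with the identity matrix, degree matrix, weight, bias and ReLU function named in claim 8. Node features are stored one node per row, that is, as the transpose of the 3*N matrix described earlier, and all function names are hypothetical.

```python
import math


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj, feats, weight, bias):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 H W + b)."""
    n = len(adj)
    # add self-loops so each node keeps its own features
    a_tilde = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_tilde]
    norm = [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    z = matmul(matmul(norm, feats), weight)
    return [[max(0.0, v + b) for v, b in zip(row, bias)] for row in z]


def flatten(matrix):
    """Stretch a matrix into a one-dimensional feature vector, row by row."""
    return [v for row in matrix for v in row]


adj = [[0, 1], [1, 0]]
hidden = gcn_layer(adj, [[1.0], [3.0]], weight=[[1.0]], bias=[0.0])
vector = flatten(hidden)
```

In this two-node example each node ends up with the average of its own and its neighbour's feature, which shows how a graph convolution mixes information along the edges before the Flatten step.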
further, the step (5) specifically comprises the following sub-steps:
(5.1) Ensemble learning and joint training are performed on the message structure feature learning model, the message time sequence feature learning model and the message interaction feature learning model, and the hyper-parameters for joint training are set. The obtained message load structure feature vector, message load time sequence feature vector and message session interaction feature vector are then fused by splicing and connected to obtain:
(5.2) The result is calculated through the second fully connected layer and its softmax activation function:
(5.3) finally, calculating and outputting the message type analysis result of the private encryption protocol of the mobile application:
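The fusion and classification of step (5) can be sketched in Python as follows. The sketch concatenates three feature vectors, applies a single dense layer followed by softmax, and returns the index of the most probable class. The patent refers to a second fully connected layer, so the single layer, the weights and the function names here are illustrative assumptions only.

```python
import math


def softmax(logits):
    """Numerically stable softmax."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(structure_vec, timing_vec, session_vec, weights, biases):
    """Concatenate the three feature vectors, apply one dense layer, then softmax."""
    fused = list(structure_vec) + list(timing_vec) + list(session_vec)
    logits = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)  # argmax
    return label, probs


label, probs = classify([1.0], [0.0], [2.0],
                        weights=[[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
                        biases=[0.0, 0.0])
```

The returned `label` plays the role of the sequence number of the predicted message type, and `probs` gives the class probabilities from which it is chosen.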
The method provided by the invention extracts and learns the mobile application encryption message protocol characteristics of different modes from multiple dimensions, integrates and learns the load structure characteristics, the load time sequence characteristics and the session interaction characteristics of the mobile application private encryption protocol message, constructs the mobile application encryption protocol message type analysis model, has strong generalization capability, and obtains a good classification effect on encryption network flow data sets of different environments.
Example 2
As shown in fig. 1 to fig. 16, as a further optimization of embodiment 1, on the basis of embodiment 1, the present embodiment further includes the following technical features:
In this embodiment, the model framework is shown in FIG. 3. First, the acquired raw mobile application network traffic data are preprocessed, and the structural feature data, time sequence feature data and interaction feature data of the message load are extracted. Then a mobile application private encryption protocol message structure feature learning model is constructed based on a dynamic pooling convolutional neural network, and the message load structure feature vector is obtained by learning; a message time sequence feature learning model is constructed based on a long short-term memory network, and the message load time sequence feature vector is obtained by learning; and a message interaction feature learning model is constructed based on a graph convolutional neural network, and the message session interaction feature vector is obtained by learning. Finally, the message load structure feature vector, the time sequence feature vector and the session interaction feature vector are fused by splicing, and a maximum entropy classifier outputs the analysis result of the mobile application private encryption protocol message type.
Specifically, the method for analyzing the message type of the mobile application encryption protocol based on the multi-mode feature fusion learning of the embodiment further includes the following technical features:
(1) Preprocessing the acquired mobile application network flow original data, and extracting the structural feature data, the time sequence feature data and the interactive feature data of the message load.
In (1.1) of this step: in the design process of the message type analysis model and the classifier of the mobile application encryption protocol, the effective input problem of the classifier needs to be considered so as to improve the efficiency of classification and identification. Whether an open network traffic data set or network service data traffic collected by researchers are adopted, the original traffic format is in the pcap format, and the pcap format cannot be directly used for inputting a mobile application encryption protocol message type analysis model, and data needs to be preprocessed.
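The session splitting and direction tagging that this preprocessing relies on can be sketched in Python as follows. The sketch assumes the pcap file has already been parsed into dictionaries with the hypothetical keys `src`, `sport`, `dst`, `dport`, `proto` and `payload`; no pcap library is used. Following claim 3, a packet counts as uplink when its source address, destination address and ports match those of the first packet of the session.

```python
from collections import defaultdict


def session_key(pkt):
    """Direction-independent key: protocol plus the sorted pair of endpoints."""
    a = (pkt["src"], pkt["sport"])
    b = (pkt["dst"], pkt["dport"])
    return (pkt["proto"],) + tuple(sorted((a, b)))


def split_sessions(packets):
    """Group time-ordered packets into sessions and tag each as uplink or downlink."""
    sessions = defaultdict(list)
    for pkt in packets:
        sessions[session_key(pkt)].append(pkt)
    tagged = {}
    for key, pkts in sessions.items():
        first = pkts[0]
        origin = (first["src"], first["sport"], first["dst"], first["dport"])
        tagged[key] = [
            ("up" if (p["src"], p["sport"], p["dst"], p["dport"]) == origin else "down",
             p["payload"])
            for p in pkts
        ]
    return tagged


def make(src, sport, dst, dport, payload):
    """Build one parsed TCP packet record (hypothetical format)."""
    return {"src": src, "sport": sport, "dst": dst, "dport": dport,
            "proto": "TCP", "payload": payload}


trace = [make("10.0.0.2", 5000, "1.2.3.4", 443, b"\x01"),
         make("1.2.3.4", 443, "10.0.0.2", 5000, b"\x02\x03"),
         make("10.0.0.2", 5000, "1.2.3.4", 443, b"\x04")]
result = split_sessions(trace)
```

Only the transport layer payload bytes are kept for each packet, which matches the statement that the method does not depend on IP address or port number information once the sessions have been separated.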
Network mobile applications of five categories with different purposes (audio-visual entertainment, news and information, lifestyle shopping, instant messaging and tools) are selected, comprising 17 different mobile applications in total. The private encryption protocol message types used by these mobile applications serve as label data, and the applications are run in a public network environment and a campus network environment to collect the corresponding network traffic data. The resulting data set is shown in Table 1.
Table 1 collected mobile application network traffic data set
And embedding the characteristics extracted from each data packet in the graph node to express the encrypted network traffic characteristics. Calculating the length of the transmission layer load and a standard information entropy, wherein the calculation formula of the standard information entropy is as follows:
The transport layer load length and the standard information entropy are then embedded as graph node features and associated with the corresponding node. The three values of packet direction, load length and standard information entropy are combined into an array, so the sequence-to-graph feature representation of each session yields a 3*N matrix and a label.
(2) Constructing a mobile application private encryption protocol message structure characteristic learning model based on a dynamic pooling convolutional neural network, and learning to obtain a message load structure characteristic vector
The specific process of the step is as follows:
and (2.1) inputting the message load structural feature data into a self-encoder with sparsity constraint conditions and noise robustness constraint conditions for anti-noise and dimension reduction processing so as to improve the anti-interference capability of mobile application encryption protocol message type analysis under the network environment of background flow. The implementation of the step can not only reduce the training time of each round of the subsequent dynamic pooling convolutional neural network, but also extract the characteristics more accurately, and finally increase the accuracy of the type analysis of the mobile application encryption protocol message.
Setting sparsity constraint conditions in a hidden layer of an autoencoder;
and setting a noise robustness constraint condition in the self-encoder to constrain the connection weight matrix so as to strengthen a larger weight and weaken the disturbance of a small weight representing network background traffic noise.
And inputting the message load structure characteristic data into a self-encoder with sparsity constraint conditions and noise robustness constraint conditions for unsupervised learning, and generating a characteristic vector after dimension reduction and noise resistance processing.
(2.2) The feature vector obtained after dimension reduction and anti-noise processing is input into the constructed dynamic pooling convolutional neural network for learning. A mobile application private encryption protocol message structure feature learning model is constructed based on the dynamic pooling convolutional neural network; the model is formed by stacking three one-dimensional convolution layers. The padding mode is "same", and each convolution layer is followed by batch normalization. The unit structure of the message load structure feature learning model is listed in Table 2.
Table 2 list of unit structures of message payload structure feature learning model
For each layer of convolution operation, the hidden layer output after one-dimensional convolution is:
after the convolution kernel operation, a plurality of characteristic sequences can be obtained for each input data, and the characteristics of the last layer of convolution output are set as follows:
after the convolution operation DropOut is added to prevent overfitting, with a threshold of 0.2.
(2.3) for the feature vector output by the last layer of convolution, taking k-max pooling as a nonlinear down-sampling function, and extracting the features by utilizing nonlinear function dynamic pooling operation, wherein the dynamic pooling operation is as follows:
and obtaining the characteristic vector of the message load structure after the pooling operation.
(3) And constructing a message time sequence characteristic learning model of the mobile application private encryption protocol based on the long-time memory network, and learning to obtain a message load time sequence characteristic vector.
The specific process of the step is as follows:
and (3.1) constructing a mobile application private encryption protocol message load time sequence characteristic learning model based on a long-time and short-time memory network, wherein the model comprises 64 memory units and is used for learning input traffic characteristics. A list of unit structures of the message payload timing characteristic learning model is shown in table 3.
Table 3 list of unit structures of message load timing characteristic learning model
The mobile application private encryption protocol message load time sequence characteristic learning model adopts a gate control mechanism to learn:
The activation function compresses the gate values into the interval [0,1].
DropOut is added to the learning model to prevent overfitting, with a threshold of 0.5.
(3.2) model outputs are:
and the unit state vector acts with an output gate after passing through the activation function to obtain a time sequence characteristic vector of the output message load.
(4) And constructing a mobile application private encryption protocol message interactive feature learning model based on the graph convolution neural network, and learning to obtain a message session interactive feature vector.
The specific process of the step is as follows:
and (4.1) constructing a mobile application private encryption protocol message session interactive feature learning model based on the graph convolution neural network, wherein the unit structure of the model is set as shown in the table 4.
Table 4 list of unit structures of interactive feature learning model for message sessions
The message load interaction feature data are input into the graph convolutional neural network model, the sequence-to-graph method having converted each session into a graph. The graph is described by the number of network data packets it contains (its nodes), the number of packet features carried by each node, the feature matrix and the adjacency matrix.
And (4.2) performing graph convolution operation by using the constructed learning model. In the model, for each layer map the convolution operations are:
(4.3) After the two-layer graph convolution operation, a matrix is obtained; the Flatten operation is used to stretch this matrix into a one-dimensional feature vector, giving:
(4.4) A fully connected layer is used to compress the vector and reduce its dimensionality, and the message load session feature vector is obtained by learning:
(5) And fusing and splicing the load structure characteristic vector, the time sequence characteristic vector and the session interaction characteristic vector of the mobile application private encryption protocol message, and outputting an analysis result of the type of the mobile application private encryption protocol message by using a maximum entropy classifier.
The specific process of the step is as follows:
(5.1) performing ensemble learning and combined training on the three models, wherein the hyper-parameter setting during the model combined training is shown in the table 5.
TABLE 5 parameter settings during training of three model combinations
And performing feature fusion splicing on the obtained feature vectors, wherein a list of unit structures of the feature fusion splicing is shown in table 6.
Table 6 list of unit structures for feature fusion splicing
(5.2) calculating through the second fully-connected layer and its softmax activation function:
(5.3) Finally, the analysis result of the mobile application private encryption protocol message type is calculated and output, namely the sequence number of the category to which the data belongs:
The experiment of this embodiment was performed on the collected data set of 17 network mobile applications. The experimental results are shown in Table 7, which gives the analysis result of the method of this embodiment for the encryption protocol message type of each application's traffic. The data in the table show the following. For the precision index, four applications exceed 99%: Jingdong (JD.com), Meituan, iQIYI and Pinduoduo. For the recall index, four applications exceed 98%: Microsoft-Launcher, Sogou Input Method, WeChat and Meituan. For the F1 value index, five applications exceed 98%: Sogou Input Method, Microsoft-Launcher, Jingdong (JD.com), Meituan and WeChat. The weighted averages of the precision, recall and F1 values are 97.29%, 97.26% and 97.27%, respectively, and the overall accuracy of the model on this data set reaches 97.26%.
Table 7 Message type analysis results of the method of the invention on the data set of 17 network mobile applications
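The weighted averages reported for the experiment can be computed as in the following Python sketch, which weights each class by its number of true samples. The function name and the toy labels are hypothetical and are not taken from Table 7.

```python
def weighted_scores(y_true, y_pred):
    """Return support-weighted precision, recall and F1, plus overall accuracy."""
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label in sorted(set(y_true)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        support = sum(1 for t in y_true if t == label)
        prec = tp / predicted if predicted else 0.0
        rec = tp / support
        f = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        weight = support / total  # share of samples that truly belong to this class
        precision += weight * prec
        recall += weight * rec
        f1 += weight * f
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / total
    return precision, recall, f1, accuracy


precision, recall, f1, accuracy = weighted_scores(
    ["a", "a", "b", "b"], ["a", "b", "b", "b"])
```

With this weighting the weighted recall always equals the overall accuracy, which is consistent with both values being reported as 97.26%.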
In the comparison experiment, the models 2D-CNN, LSTM, GCN and CNN+LSTM are selected for comparison, so as to verify the effectiveness of the mobile application encryption protocol message type analysis method based on multi-modal feature fusion learning. The overall results of the comparison experiment are shown in FIGS. 13 to 16.
It should be noted that, for the sake of simplicity, the present embodiment is described as a series of acts, but those skilled in the art should understand that the present application is not limited by the described order of acts, because some steps may be performed in other orders or simultaneously according to the present application. Further, those skilled in the art will recognize that the embodiments described in this specification are preferred embodiments and that acts or modules referred to are not necessarily required for this application.
The invention can accurately identify the types of the private encryption protocol messages of various network mobile applications, and improve the supervision efficiency and the supervision strength of network space safety;
the invention is based on the load data above the transmission layer in the network flow data to learn and classify, does not depend on the IP address and port number information of the head of the network flow data packet, and the generalization capability of the classification model is strong;
the invention carries out data set sampling test in a complex network environment, and the detection result more accords with the requirement under a real network environment.
It should be noted that, in the present invention, the execution sequence of the "S2A, the message structure feature learning", "S2B, the message timing feature learning", and "S2C, the message interaction feature learning" may be in various forms, and may even be performed simultaneously, so the order of the steps in the embodiments described in the present invention should not be considered as limiting the execution sequence of the three.
As described above, the present invention can be preferably realized.
All features disclosed in all embodiments in this specification, or all methods or process steps implicitly disclosed, may be combined and/or expanded, or substituted, in any way, except for mutually exclusive features and/or steps.
While the invention has been described in connection with what is presently considered to be the most practical and preferred embodiment, it is to be understood that the invention is not to be limited to the disclosed embodiment, but on the contrary, is intended to cover various modifications, equivalent arrangements, and alternatives falling within the spirit and scope of the invention as defined by the appended claims.
Claims (10)
1. A method for analyzing the type of a mobile application encryption protocol message is characterized in that different modal characteristics of the mobile application encryption protocol message are extracted and learned, and the type of the encryption protocol message is analyzed by fusing the different modal characteristics.
2. The method for parsing message type according to mobile application encryption protocol of claim 1, comprising the steps of:
s1, preprocessing message data: preprocessing the acquired mobile application network flow original data, and extracting structural feature data, time sequence feature data and interactive feature data of message loads in the original data;
s2, feature learning, which specifically comprises the following steps:
S2A, learning message structure characteristics: building a mobile application private encryption protocol message structure feature learning model based on a dynamic pooling convolutional neural network by using the structure feature data, and learning to obtain a message load structure feature vector;
S2B, learning message time sequence characteristics: constructing a mobile application private encryption protocol message time sequence feature learning model based on a long-time and short-time memory network by using the time sequence feature data, and learning to obtain a message load time sequence feature vector;
S2C, learning message interaction characteristics: constructing a mobile application private encryption protocol message interactive feature learning model based on a graph convolution neural network by using interactive feature data, and learning to obtain a message session interactive feature vector;
s3, message type analysis: and fusing and splicing the message load structure characteristic vector, the time sequence characteristic vector and the interactive characteristic vector, and outputting an analysis result of the message type of the mobile application private encryption protocol by using a maximum entropy classifier.
3. The method according to claim 2, wherein the step S1 comprises the following steps:
s11, setting the length of an original network data packet intercepted by preprocessing, segmenting continuous network flow by a session flow, and separating network message load data above a transmission layer of each data packet in the session flow;
s12, distinguishing the uplink and downlink directions of the message load data: defining the uplink direction and the downlink direction of the message load data in the data packets in the session according to the data flow direction, taking the message load data which has the same initial address, destination address and port number as the first data packet as the uplink message load data, and taking the rest as the downlink message load data;
s13, respectively calculating the sizes of the load data in the uplink direction and the downlink direction, and constructing a load data sequence in a hexadecimal form;
s14, splicing the uplink message load data and the downlink message load data to obtain message load structural characteristic data according to a splicing mode of the uplink data before and the downlink data after;
s15, arranging according to an organization mode of the data packet time sequence to obtain message load time sequence characteristic data;
s16, constructing a feature expression model based on a sequence-to-graph, and converting a data packet sequence in the session flow into an undirected graph; for each data packet in the session flow, extracting the packet direction of the data packet, the standard information entropy of the load data and the load length, and embedding the packet direction of the data packet, the standard information entropy of the load data and the load length as graph node characteristics to obtain message load interaction characteristic data.
4. The method according to claim 3, wherein in step S16, the standard entropy is calculated as:
wherein,the entropy of the standard information is represented,represent an arbitrary distributionDiscrete random variables of,To representThe number of discrete variables contained in (a),indicating the sequence number of the bytes in the data packet,which represents the bytes in the data packet,representing bytesIn thatThe probability of occurrence of (c).
5. The method for parsing message type of mobile application encryption protocol according to any one of claims 2 to 4, wherein the step S2A comprises the steps of:
S2A1, inputting the message load structure characteristic data into a self-encoder with sparsity constraint conditions and noise robustness constraint conditions for anti-noise and dimension-reduction processing, and generating a feature vector after dimension-reduction and anti-noise processing;
S2A2, constructing a mobile application private encryption protocol message structure characteristic learning model based on a dynamic pooling convolutional neural network; inputting the feature vector subjected to the dimension reduction and noise resistance processing into a constructed message structure feature learning model for learning to obtain a feature sequence subjected to convolution kernel operation;
the message structure characteristic learning model is constructed as follows:
constructing a mobile application private encryption protocol message structure characteristic learning model based on a dynamic pooling convolutional neural network, wherein the message structure characteristic learning model is formed by stacking three layers of one-dimensional convolutions; the filling mode adopts a same mode, and each layer of convolution is accompanied with batch normalization; for each layer of convolution operation, the hidden layer output after one-dimensional convolution is:
wherein,
representing one-dimensional convolution kernelsThe row numbers of the weight matrix are numbered,column labels of the weight matrix representing the one-dimensional convolution kernel,in the weight matrix representing the one-dimensional convolution kernelGo to the firstThe weight value of a column is determined,which represents the shape of the convolution kernel,representing input dataGo to the firstThe value of the column is such that,the total number of rows of data is represented,which represents the total number of columns entered,the shape of the input is represented by,to represent the outputA value of each position;
after the convolution kernel operation, a plurality of characteristic sequences can be obtained for each input data, and the characteristic vector output by the last layer of convolution is set as follows:
S2A3, for the feature vector output by the last layer of convolution, taking k-max pooling as a nonlinear down-sampling function, and extracting the feature vector by utilizing nonlinear function dynamic pooling operation to obtain a message load structure feature vector;
the dynamic pooling operation is as follows:
wherein,
6. The method for parsing message type of mobile application encryption protocol according to claim 5, wherein in step S2A1, the cost function of sparsity constraint condition is:
wherein,a cost function representing a sparsity constraint,representing the input from the encoder, and,the sparsity constraint is expressed in terms of,the weight representing the sparsity constraint is represented by,which represents the expectation of the total noise,representing the number of implicit layers in the self-encoder,,representing gaussian noise with a mean of 0 and a variance of 1,representing a neural networkThe layer is input into the device body,a number of a hidden layer unit is indicated,the number of the neurons in the hidden layer is represented,representing a hidden layer response;
the cost function of the noise robustness constraint is:
wherein,a cost function representing a noise robustness constraint,the target output is represented by a target output,representing the output from the encoder learning network,which is indicative of an activation factor,、a number representing two input data is shown,representing input data fromTo input dataThe connection weight of (2).
7. The method according to claim 6, wherein the step S2B comprises the following steps:
S2B1, constructing a mobile application private encryption protocol message load time sequence feature learning model based on a long short-term memory network, wherein the message load time sequence feature learning model comprises JI memory units, JI being an integer with 32 ≤ JI ≤ 256, and learning the message load time sequence feature data by using the constructed message load time sequence feature learning model, wherein the learning formula is as follows:
wherein,a function of a gate unit is represented,、、respectively representing a forgetting gate, an input gate or an output gate,it is shown that the activation function is,a parameter corresponding to a forgetting gate, an input gate or an output gate,indicating the time of dayThe input of (a) is performed,indicating the time of dayIs then outputted from the output of (a),a bias value representing a forgetting gate, an input gate, or an output gate;
S2B2, obtaining the output as the message payload time sequence feature vector, the output formula being:
$$h_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \odot \tanh(C_t)$$
wherein $h_t$ denotes the time sequence feature vector of the message payload; $\sigma$ denotes the activation function; $C_t$ denotes the cell state vector; $\tanh$ denotes the tanh activation function; $W_o$ denotes the parameter of the output gate; $b_o$ denotes the bias; and $x_t$ and $h_{t-1}$ denote the input at time $t$ and the output at time $t-1$.
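The gate computation and output step of claim 7 can be illustrated with a minimal pure-Python sketch. It assumes the textbook LSTM cell: the candidate cell state and the cell-state update are not spelled out in the claim text and are filled in from the standard formulation. All names (`gate`, `lstm_step`, `timing_feature`, `toy_params`) and the toy sizes are hypothetical; the claim itself requires between 32 and 256 memory units.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, v, b):
    """Return W @ v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]

def gate(W, b, h_prev, x_t):
    """Gate unit: sigmoid(W . [h_prev, x_t] + b), elementwise."""
    return [sigmoid(z) for z in affine(W, h_prev + x_t, b)]

def lstm_step(p, h_prev, c_prev, x_t):
    f = gate(p["W_f"], p["b_f"], h_prev, x_t)  # forget gate
    i = gate(p["W_i"], p["b_i"], h_prev, x_t)  # input gate
    o = gate(p["W_o"], p["b_o"], h_prev, x_t)  # output gate
    # Assumption: standard candidate state and cell-state update.
    cand = [math.tanh(z) for z in affine(p["W_c"], h_prev + x_t, p["b_c"])]
    c_t = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, cand)]
    # Output step: output gate times tanh of the cell state.
    h_t = [ok * math.tanh(ck) for ok, ck in zip(o, c_t)]
    return h_t, c_t

def timing_feature(p, sequence, units):
    """Run the cell over a sequence and return the last output vector."""
    h, c = [0.0] * units, [0.0] * units
    for x_t in sequence:
        h, c = lstm_step(p, h, c, x_t)
    return h

def toy_params(units, n_in, value=0.1):
    """Hypothetical constant weights, only to make the sketch runnable."""
    params = {}
    for name in ("f", "i", "o", "c"):
        params["W_" + name] = [[value] * (units + n_in) for _ in range(units)]
        params["b_" + name] = [0.0] * units
    return params

params = toy_params(units=2, n_in=3)
sequence = [[0.2, 0.0, 1.0], [0.5, 0.1, 0.3]]  # two time steps, three features each
feature = timing_feature(params, sequence, units=2)
```

Here `feature` plays the role of the message payload time sequence feature vector; a trained model would use learned weights rather than constants.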
8. The method according to claim 7, wherein the step S2C comprises the steps of:
S2C1, constructing a message session interaction feature learning model for the mobile application private encryption protocol based on a graph convolutional neural network, wherein the model comprises two sequentially connected graph convolution layers, the number of channels of each of the two graph convolutions is set for the graph convolution operation, and ReLU is selected as the activation function;
converting the message payload interaction feature data into a graph $G$ by a sequence-to-graph method and inputting it into the graph convolutional neural network model; wherein the number of network data packets in graph $G$ is $N$, the number of features of the data packet at each node is $F$, the feature matrix is $X \in \mathbb{R}^{N \times F}$, and the adjacency matrix is $A \in \mathbb{R}^{N \times N}$;
S2C2, performing the graph convolution operation with the learning model constructed in step S2C1, the graph convolution operation being:
$$H^{(l+1)} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)} + b^{(l)}\right)$$
wherein $\tilde{A} = A + I$; $I$ denotes the identity matrix; $\tilde{D}$ denotes the degree matrix corresponding to $\tilde{A}$; $l$ denotes the index of the network layer; $W^{(l)}$ denotes the weight of the $l$-th layer, of dimension $F_{l-1} \times F_l$; $F_l$ denotes the dimensionality of the graph node data after the $l$-th convolution layer; $b^{(l)}$ denotes the bias of the $l$-th layer; $H^{(l)}$ denotes the input of the $l$-th layer, the input of the first layer being $H^{(1)} = X$; and $\sigma$ denotes the nonlinear activation function ReLU;
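S2C3, obtaining, after the two-layer graph convolution operation, an $N \times F_2$ matrix, where $N$ is the number of graph nodes and $F_2$ is the dimensionality of the node data after the second graph convolution layer, and using a Flatten operation to stretch the matrix into a one-dimensional feature vector $v$, obtaining:
$$v = (v_1, v_2, \ldots, v_{N F_2})$$
wherein $v$ denotes the message session interaction feature vector, whose dimension is $1 \times N F_2$, and $v_i$ denotes each element of the message session interaction feature vector;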
S2C4, compressing the flattened feature vector with one fully connected layer to reduce its dimensionality, and learning to obtain the message payload session feature vector:
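Steps S2C1 to S2C4 can be sketched as follows in pure Python. The sketch assumes the widely used symmetric normalization of the adjacency matrix with self-loops, followed by ReLU, Flatten, and one fully connected layer. The function names, the three-node toy graph, and all weights are hypothetical and only make the sketch runnable.

```python
def matmul(A, B):
    """Plain matrix product for list-of-rows matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def normalized_adjacency(A):
    """Add self-loops, then scale by D^-1/2 on both sides (assumed normalization)."""
    n = len(A)
    A_tilde = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    d = [sum(row) ** -0.5 for row in A_tilde]
    return [[d[i] * A_tilde[i][j] * d[j] for j in range(n)] for i in range(n)]

def gcn_layer(A_hat, H, W, b):
    """One graph convolution layer: ReLU(A_hat @ H @ W + b)."""
    Z = matmul(matmul(A_hat, H), W)
    return [[max(0.0, z + bias) for z, bias in zip(row, b)] for row in Z]

def session_feature(A, X, layers, W_fc, b_fc):
    A_hat = normalized_adjacency(A)
    H = X
    for W, b in layers:                      # two graph convolution layers
        H = gcn_layer(A_hat, H, W, b)
    flat = [v for row in H for v in row]     # Flatten: N x F2 matrix -> vector
    # One fully connected layer compresses the flattened vector.
    return [sum(w * v for w, v in zip(row, flat)) + bias
            for row, bias in zip(W_fc, b_fc)]

# Toy session graph: three packets connected in a chain, two features per node.
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
layers = [
    ([[0.5, -0.5], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
]
W_fc = [[0.1] * 6, [-0.1] * 6]               # maps 3 * 2 = 6 values to 2
b_fc = [0.0, 0.0]
session_vec = session_feature(A, X, layers, W_fc, b_fc)
```

The resulting `session_vec` stands in for the compressed session feature vector; how the patent builds the adjacency matrix from a packet sequence is not shown here.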
9. The method according to claim 8, wherein the step S3 comprises the following steps:
S31, performing ensemble learning and joint training on the message structure feature learning model, the message time sequence feature learning model and the message interaction feature learning model, and setting the hyper-parameters for the joint training; and fusing and concatenating the obtained message payload structure feature vector, message payload time sequence feature vector and message session interaction feature vector to obtain:
S32, calculating through a second fully connected layer and its softmax activation function:
$$y = \mathrm{softmax}(W z + b)$$
wherein $z$ denotes the fused feature vector obtained in step S31; $W$ denotes the weight matrix of the second fully connected layer; $b$ denotes the bias; $K$ denotes the length of $y$, equal to the number of classes to be distinguished; and $y$ is a one-dimensional vector;
S33, finally calculating and outputting the analysis result of the mobile application private encryption protocol message type:
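Steps S31 to S33 amount to concatenating the three feature vectors, applying a fully connected layer with softmax, and taking the most probable class. The following is a minimal sketch under that reading; the function names, vector sizes and weights are hypothetical, and the joint training of the three models is not shown.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def classify(struct_vec, timing_vec, session_vec, W, b):
    # S31: feature fusion by concatenation.
    fused = struct_vec + timing_vec + session_vec
    # S32: second fully connected layer followed by softmax.
    logits = [sum(w * v for w, v in zip(row, fused)) + bias
              for row, bias in zip(W, b)]
    probs = softmax(logits)
    # S33: the predicted message type is the index of the largest probability.
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs

# Toy example with three message types and a five-dimensional fused vector.
W = [[0.0] * 5, [1.0] * 5, [-1.0] * 5]
b = [0.0, 0.0, 0.0]
label, probs = classify([1.0, 0.0], [0.0, 1.0], [1.0], W, b)
```

With these toy weights the second class receives the largest logit, so `label` is 1. A softmax layer over a linear map is the same model family as the maximum entropy classifier named in claim 10.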
10. A system for analyzing message types of a mobile application encryption protocol, based on the method for analyzing message types of a mobile application encryption protocol according to any one of claims 1 to 9, characterized in that the system comprises the following modules:
a message data preprocessing module, configured to preprocess the acquired raw mobile application network traffic data and to extract the structure feature data, time sequence feature data and interaction feature data of the message payloads in the raw data;
a message structure feature learning module, configured to construct, using the structure feature data, a message structure feature learning model for the mobile application private encryption protocol based on a dynamic pooling convolutional neural network, and to learn the message payload structure feature vector;
a message time sequence feature learning module, configured to construct, using the time sequence feature data, a message time sequence feature learning model for the mobile application private encryption protocol based on a long short-term memory network, and to learn the message payload time sequence feature vector;
a message interaction feature learning module, configured to construct, using the interaction feature data, a message interaction feature learning model for the mobile application private encryption protocol based on a graph convolutional neural network, and to learn the message session interaction feature vector;
a message type analysis module, configured to fuse and concatenate the message payload structure feature vector, the time sequence feature vector and the interaction feature vector, and to output the analysis result of the mobile application private encryption protocol message type using a maximum entropy classifier;
wherein the input ends of the message structure feature learning module, the message time sequence feature learning module and the message interaction feature learning module are each electrically connected to the output end of the message data preprocessing module, and the output ends of the message structure feature learning module, the message time sequence feature learning module and the message interaction feature learning module are each electrically connected to the input end of the message type analysis module.
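The module wiring of claim 10 can be pictured as a small pipeline in which the preprocessing output feeds three learners whose outputs feed the analysis module. The class name `MessageTypePipeline` and the stub callables below are hypothetical placeholders for the trained models; this is a sketch of the data flow only.

```python
class MessageTypePipeline:
    """Data flow only: preprocess -> three feature learners -> type analysis."""

    def __init__(self, preprocess, learn_structure, learn_timing,
                 learn_interaction, analyze):
        self.preprocess = preprocess
        self.learn_structure = learn_structure
        self.learn_timing = learn_timing
        self.learn_interaction = learn_interaction
        self.analyze = analyze

    def run(self, raw_traffic):
        # The preprocessing module yields three views of the message payloads.
        structure, timing, interaction = self.preprocess(raw_traffic)
        # Each learner consumes one view; the outputs are concatenated.
        fused = (self.learn_structure(structure)
                 + self.learn_timing(timing)
                 + self.learn_interaction(interaction))
        # The analysis module maps the fused vector to a message type.
        return self.analyze(fused)

# Stub modules standing in for the real models (assumptions for illustration).
pipeline = MessageTypePipeline(
    preprocess=lambda raw: (raw[:2], raw[2:4], raw[4:]),
    learn_structure=lambda data: [float(sum(data))],
    learn_timing=lambda data: [float(sum(data))],
    learn_interaction=lambda data: [float(sum(data))],
    analyze=lambda fused: max(range(len(fused)), key=fused.__getitem__),
)
message_type = pipeline.run([1, 2, 3, 4, 5, 6, 7])
```

Keeping the learners as independent callables mirrors the claim, in which each learning module connects only to the preprocessing module on its input side and to the analysis module on its output side.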
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211171000.9A CN115277888B (en) | 2022-09-26 | 2022-09-26 | Method and system for analyzing message type of mobile application encryption protocol |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115277888A (en) | 2022-11-01 |
CN115277888B (en) | 2023-01-31 |
Family
ID=83757417
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211171000.9A Active CN115277888B (en) | 2022-09-26 | 2022-09-26 | Method and system for analyzing message type of mobile application encryption protocol |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115277888B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115801897A (en) * | 2022-12-20 | 2023-03-14 | 南京工程学院 | Dynamic message processing method for edge proxy |
CN115883263A (en) * | 2023-03-02 | 2023-03-31 | 中国电子科技集团公司第三十研究所 | Encryption application protocol type identification method based on multi-scale load semantic mining |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105430021A (en) * | 2015-12-31 | 2016-03-23 | 中国人民解放军国防科学技术大学 | Encrypted traffic identification method based on load adjacent probability model |
US20190273509A1 (en) * | 2018-03-01 | 2019-09-05 | Crowdstrike, Inc. | Classification of source data by neural network processing |
CN111147394A (en) * | 2019-12-16 | 2020-05-12 | 南京理工大学 | Multi-stage classification detection method for remote desktop protocol traffic behavior |
CN112003870A (en) * | 2020-08-28 | 2020-11-27 | 国家计算机网络与信息安全管理中心 | Network encryption traffic identification method and device based on deep learning |
CN112511555A (en) * | 2020-12-15 | 2021-03-16 | 中国电子科技集团公司第三十研究所 | Private encryption protocol message classification method based on sparse representation and convolutional neural network |
WO2021103135A1 (en) * | 2019-11-25 | 2021-06-03 | 中国科学院深圳先进技术研究院 | Deep neural network-based traffic classification method and system, and electronic device |
CN113179223A (en) * | 2021-04-23 | 2021-07-27 | 中山大学 | Network application identification method and system based on deep learning and serialization features |
WO2022041394A1 (en) * | 2020-08-28 | 2022-03-03 | 南京邮电大学 | Method and apparatus for identifying network encrypted traffic |
CN114358177A (en) * | 2021-12-31 | 2022-04-15 | 北京工业大学 | Unknown network traffic classification method and system based on multidimensional feature compact decision boundary |
WO2022094926A1 (en) * | 2020-11-06 | 2022-05-12 | 中国科学院深圳先进技术研究院 | Encrypted traffic identification method, and system, terminal and storage medium |
CN114519390A (en) * | 2022-02-17 | 2022-05-20 | 北京邮电大学 | QUIC flow classification method based on multi-mode deep learning |
Non-Patent Citations (3)
Title |
---|
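ZIYI ZHAO ET AL.: "CL-ETC: A Contrastive Learning Method for Encrypted Traffic Classification", 2022 IFIP NETWORKING CONFERENCE (IFIP NETWORKING) * |
程永新 et al.: "Design and Research of an Encrypted Traffic Behavior Analysis System", 《通信技术》 * |
童博 et al.: "Research on Encrypted Traffic Identification Methods in Complex Network Environments", 《邮电设计技术》 * |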
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115801897A (en) * | 2022-12-20 | 2023-03-14 | 南京工程学院 | Dynamic message processing method for edge proxy |
CN115801897B (en) * | 2022-12-20 | 2024-05-24 | 南京工程学院 | Message dynamic processing method of edge proxy |
CN115883263A (en) * | 2023-03-02 | 2023-03-31 | 中国电子科技集团公司第三十研究所 | Encryption application protocol type identification method based on multi-scale load semantic mining |
CN115883263B (en) * | 2023-03-02 | 2023-05-09 | 中国电子科技集团公司第三十研究所 | Encryption application protocol type identification method based on multi-scale load semantic mining |
Also Published As
Publication number | Publication date |
---|---|
CN115277888B (en) | 2023-01-31 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||