CN114510615A - Fine-grained encrypted website fingerprint classification method and device based on graph attention pooling network - Google Patents
- Publication number
- CN114510615A (application CN202111191717.5A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/906—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention relates to a fine-grained encrypted website fingerprint classification method and device based on a graph attention pooling network. The method comprises: establishing a traffic trace graph that describes the network traffic pattern, wherein nodes in the traffic trace graph represent network flows and edges represent the contextual relations between the network flows; automatically learning the intra-flow features and inter-flow features in the traffic trace graph by using a graph neural network model to obtain an effective embedded representation of the traffic trace graph; and performing website fingerprint classification by using the effective embedded representation of the traffic trace graph. The invention provides a traffic trace graph that can reasonably describe the network traffic pattern. Because the method is based on a graph neural network algorithm, it needs no complex manual feature selection, can effectively learn the global and local features of network traffic at the same time, and can automatically learn and pay more attention to important flow nodes, reducing the negative effects of inter-class similar flow nodes and noise flow nodes. The method is suitable for website fingerprinting scenarios of various granularities, performs better than existing methods, and requires fewer training samples.
Description
Technical Field
The invention relates to a fine-grained encrypted website fingerprint classification method and device based on a graph attention pooling network, and belongs to the technical field of computer software.
Background
With growing awareness of network security, encryption protocols such as HTTPS are widely used by websites. While protecting data privacy, these encryption protocols also pose a significant challenge to network management (such as QoS and malicious behavior tracking). In recent years, with the revival and development of artificial intelligence, website fingerprinting, which uses machine learning algorithms to identify specific web pages in encrypted traffic, has become a research hotspot in the field of network security.
Early research mined effective statistical features from encrypted traffic from various angles, such as packet size and packet inter-arrival time, and used traditional machine learning algorithms, such as the k-nearest neighbors algorithm, support vector machines and random forests, as classifiers, achieving good performance. More recently, with the rapid development of deep learning, some studies have used models such as convolutional neural networks and recurrent neural networks to automatically extract effective features from encrypted traffic and achieve high-precision website fingerprint classification. These deep learning methods perform better and do not require complicated manual feature selection, and have therefore become the mainstream website fingerprinting methods.
However, existing research has been directed primarily at website home page classification scenarios. In practice, users do not access only the home page. A website typically contains many sub-pages, and these pages represent different services and network behaviors. The fine-grained web page classification problem is therefore important for network management tasks such as QoS. However, because different web pages of the same website often have similar layout and content, the traffic features extracted manually or automatically by conventional methods are no longer clearly distinguishable, which degrades the performance of existing methods.
A small number of studies have proposed methods that improve fine-grained web page classification performance by combining global and local traffic features. Examples include global features such as the total number of bytes and the total number of packets of the network traffic, time-slice features such as the number of bytes and the number of packets in each time slice, and local features such as the length sequences of the first and last n packets. Compared with the global features used in the traditional home page fingerprinting scenario, local features can better represent the small differences between the traffic of fine-grained web pages. However, these methods use machine learning algorithms and require complex manual feature selection and extraction. Although deep-learning-based website fingerprinting methods can automatically learn latent traffic patterns, they have difficulty finding fine-grained local feature differences, and their performance also drops significantly in fine-grained scenarios. Additional knowledge is therefore needed to help the deep learning model learn the slight feature differences between similar samples.
Disclosure of Invention
The invention aims to provide a method and a device that effectively solve the fine-grained encrypted website fingerprinting problem. The invention is a deep learning algorithm, needs no complex manual feature selection, and can learn the global and local features of network traffic at the same time.
The invention provides a fine-grained encrypted website fingerprint classification method based on a graph attention pooling network. In web page access traffic, different flows represent different types of resource requests; the method can automatically learn the importance of the different flows to the final classification and is interpretable. The method also achieves good classification performance with few training samples.
Specifically, the technical scheme adopted by the invention is as follows:
A fine-grained encrypted website fingerprint classification method based on a graph attention pooling network comprises the following steps:
establishing a traffic trace graph that describes the network traffic pattern, wherein nodes in the traffic trace graph represent network flows and edges represent the contextual relations between the network flows;
automatically learning the intra-flow features and inter-flow features in the traffic trace graph by using a graph neural network model to obtain an effective embedded representation of the traffic trace graph;
performing website fingerprint classification by using the effective embedded representation of the traffic trace graph.
Further, in the traffic trace graph, for two flows generated by the same client, whether the two corresponding nodes are connected by an edge is determined according to whether the interval between the start times of the two flows is smaller than an empirical threshold.
Further, automatically learning the intra-flow features and inter-flow features in the traffic trace graph by using the graph neural network model to obtain an effective embedded representation of the traffic trace graph comprises:
learning the attention weights of the nodes by using a multi-head graph attention layer, so that the model pays more attention to important nodes in the traffic trace graph and the negative effects of inter-class similar nodes and noise nodes are reduced;
further screening important nodes by using a self-attention pooling layer, which also reduces the number of model parameters.
Furthermore, in the multi-head graph attention layer, a shallow abstract representation of the traffic trace graph is first extracted by a single-layer fully connected network; the node attention weights are then learned by a K-head graph attention network to obtain K node representations, which are summed and fed into the self-attention pooling layer.
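As a rough illustration of the K-head attention step, the following pure-Python sketch applies the commonly used graph attention formulation (a shared linear map, LeakyReLU-scored pairs, and a softmax over each node's neighbours plus itself) and sums the K head outputs. The patent text does not give the exact formulas, so the scoring function, the self-loop, the function names `gat_head` and `multi_head_gat`, and the weight shapes are all assumptions; the preceding fully connected layer is omitted.

```python
import math

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def elu(x):
    return x if x > 0 else math.exp(x) - 1.0

def gat_head(X, adj, W, a):
    """One attention head (assumed standard GAT form).

    X: node features, adj: 0/1 adjacency matrix,
    W: out_dim x in_dim weight matrix, a: attention vector of length 2*out_dim.
    Returns (node outputs, per-node attention weights).
    """
    H = [matvec(W, x) for x in X]
    n = len(X)
    out, alphas = [], []
    for i in range(n):
        # Attend over neighbours plus the node itself (self-loop is an assumption).
        nbrs = [j for j in range(n) if adj[i][j] or j == i]
        scores = [leaky_relu(sum(p * q for p, q in zip(a, H[i] + H[j]))) for j in nbrs]
        m = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        alpha = {j: e / z for j, e in zip(nbrs, exps)}
        agg = [sum(alpha[j] * H[j][d] for j in nbrs) for d in range(len(H[i]))]
        out.append([elu(v) for v in agg])
        alphas.append(alpha)
    return out, alphas

def multi_head_gat(X, adj, heads):
    """Run K independent heads and sum their node representations."""
    total = None
    for W, a in heads:
        out, _ = gat_head(X, adj, W, a)
        if total is None:
            total = out
        else:
            total = [[u + v for u, v in zip(r, s)] for r, s in zip(total, out)]
    return total
```

The attention weights returned by `gat_head` are what would make the model inspectable: a flow node with a consistently high weight is one the classifier relies on.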
Further, the self-attention pooling layer calculates the importance of each node by using a graph convolution network and retains the top-K nodes, so as to further screen important nodes and reduce the number of model parameters.
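The pooling step can be sketched as follows, assuming the usual symmetrically normalised graph convolution with self-loops as the scoring function and a single projection vector `theta`. The patent does not state the normalisation, any score activation, or whether kept features are gated by their scores, so those details, along with the names `gcn_scores` and `top_k_pool`, are illustrative assumptions.

```python
import math

def gcn_scores(X, adj, theta):
    """Score each node with one graph convolution: D^-1/2 (A + I) D^-1/2 X theta."""
    n = len(X)
    a_hat = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    proj = [sum(t * v for t, v in zip(theta, x)) for x in X]  # X @ theta
    return [
        sum(a_hat[i][j] * proj[j] / math.sqrt(deg[i] * deg[j]) for j in range(n))
        for i in range(n)
    ]

def top_k_pool(X, adj, scores, k):
    """Keep the k highest-scoring nodes and the subgraph they induce."""
    ranked = sorted(range(len(X)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:k])  # restore original node order
    X_kept = [X[i] for i in keep]
    adj_kept = [[adj[i][j] for j in keep] for i in keep]
    return keep, X_kept, adj_kept
```

Because the score of a node mixes its own projected features with those of its neighbours, both node features and graph structure influence which flows survive pooling.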
Further, global max pooling and global average pooling are performed on the graph of top-K nodes, and the two pooling results are concatenated to obtain a global embedded representation of the graph, which serves as the output of one convolution block; finally, the outputs of the two convolution blocks are concatenated to obtain the final effective embedding.
Further, performing website fingerprint classification by using the effective embedded representation of the traffic trace graph comprises: obtaining the web page classification result by using a single-layer fully connected network combined with a Log Softmax function as the classifier, wherein Dropout is used to prevent overfitting and NLLLoss is used as the loss function.
A fine-grained encrypted website fingerprint classification device based on a graph attention pooling network comprises the following components:
a graph construction module, used for establishing a traffic trace graph that describes the network traffic pattern, wherein nodes in the traffic trace graph represent network flows and edges represent the contextual relations between the network flows;
a graph attention hierarchical pooling module, used for automatically learning the intra-flow features and inter-flow features in the traffic trace graph by using a graph neural network model to obtain an effective embedded representation of the traffic trace graph;
an output module, used for performing website fingerprint classification by using the effective embedded representation of the traffic trace graph.
The key points of the invention are as follows:
1. For the fine-grained website fingerprinting problem, an encrypted website fingerprint classification method based on a graph attention pooling network is provided. The method uses the traffic trace graph to represent the contextual relations between flows in web page access traffic, and can represent the global and local features of network traffic at the same time. A graph neural network model automatically learns the intra-flow and inter-flow features in the trace graph, finally yielding an effective embedded representation of the trace graph.
2. A multi-head graph attention mechanism is used to learn the attention weights of the nodes, so that the model pays more attention to important nodes in the traffic trace graph and the negative effects of inter-class similar nodes and noise nodes are reduced. A self-attention pooling module is used to further screen important nodes, which also reduces the number of model parameters.
3. The method adapts automatically to website fingerprinting scenarios of various granularities. The best classifier reaches the target metrics on data sets of various granularities. The method also achieves good classification performance with few training samples.
The invention has the following characteristics and beneficial effects in solving the fine-grained website fingerprinting problem:
1. The method provides a traffic trace graph that can reasonably describe the network traffic pattern, using nodes to represent network flows and edge information to represent the contextual relations between the network flows.
2. Being based on a graph neural network algorithm, the method needs no complex manual feature selection and can effectively learn the global and local features of network traffic at the same time.
3. The method can automatically learn and pay more attention to important flow nodes, reducing the negative effects of inter-class similar flow nodes and noise flow nodes.
4. The method is suitable for website fingerprinting scenarios of various granularities, performs better than existing methods, and requires fewer training samples.
Drawings
FIG. 1 is a basic block diagram of the method of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, the present invention shall be described in further detail with reference to the following detailed description and accompanying drawings.
FIG. 1 is a basic framework diagram of the method of the present invention, comprising 3 stages: the left side represents the graph construction stage, the middle represents the training stage, and the right side represents the classification stage. The training stage is the graph attention hierarchical pooling module, which comprises 2 convolution blocks, each containing 3 sub-layers. The graph construction module and the graph attention hierarchical pooling module are the key technologies of the present invention. The scheme of the invention comprises the following technical steps:
1. Graph construction stage:
(1) Preparing data: SSL/TLS-encrypted network traffic packets in the current task scenario are collected (SSL/TLS indicates that the website uses the HTTPS encryption protocol). After labeling, the data set is divided in a fixed proportion, for example training set : validation set : test set = 6:1:3. The training set is used to continuously optimize and adjust the model parameters in the training stage; the validation set is used to help observe whether the model is overfitting or has reached the expected requirement, so as to decide when to stop training. The test set is used to test the classification performance of the model on actual network traffic.
(2) Flow generation: the packets from step (1) are split into flows according to their five-tuple, and the start time, packet sizes (packet lengths) and packet inter-arrival times of each flow are extracted for constructing the node and edge information in step (3). The five-tuple consists of the source IP address, destination IP address, source port number, destination port number and protocol type.
(3) Constructing the traffic trace graph: each flow from step (2) is taken as a node in the graph, and the packet length sequence and packet inter-arrival time sequence of the flow are taken as the node features. For flows generated by the same client, whether two nodes are connected by an edge is determined according to whether the interval between the start times of the two flows is smaller than an empirical threshold, which is set to 2 s. As shown in FIG. 1, $ClientIP_1 == ClientIP_2$ indicates that the two flows are generated by the same client; when the start time interval $|t_1 - t_2|$ of the two flows is less than or equal to the empirical threshold $t_{threshold}$, an edge is added between the two nodes $v_1$ and $v_2$. In FIG. 1, A denotes the adjacency matrix of the graph, and X denotes the feature matrix.
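The graph construction stage can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the packet record layout, the field names, and the assumption that every record is already oriented client to server (so the source IP is the client IP) are hypothetical, while the five-tuple grouping and the 2 s threshold come from the steps above.

```python
from collections import defaultdict

def group_flows(packets):
    """Split packets into flows by five-tuple.

    Each packet is a hypothetical record:
    (timestamp_s, src_ip, dst_ip, src_port, dst_port, proto, length).
    """
    buckets = defaultdict(list)
    for ts, src, dst, sport, dport, proto, length in sorted(packets, key=lambda p: p[0]):
        buckets[(src, dst, sport, dport, proto)].append((ts, length))
    flows = []
    for key, pkts in buckets.items():
        times = [t for t, _ in pkts]
        flows.append({
            "five_tuple": key,
            "client_ip": key[0],  # assumption: records are oriented client -> server
            "start": times[0],
            "lengths": [n for _, n in pkts],                    # packet length sequence
            "iats": [b - a for a, b in zip(times, times[1:])],  # inter-arrival times
        })
    return flows

def build_trace_graph(flows, threshold_s=2.0):
    """Return adjacency matrix A: edge if same client and |t_i - t_j| <= threshold."""
    n = len(flows)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            same_client = flows[i]["client_ip"] == flows[j]["client_ip"]
            close = abs(flows[i]["start"] - flows[j]["start"]) <= threshold_s
            if same_client and close:
                adj[i][j] = adj[j][i] = 1
    return adj
```

The node feature matrix X would then be built from each flow's `lengths` and `iats` sequences, padded or truncated to a fixed length; that step is left out here because the text does not specify the sequence length.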
2. Training stage: this stage is mainly the graph attention hierarchical pooling module, each block of which contains 3 sub-layers:
(1) Multi-head graph attention layer: a shallow abstract representation of the traffic trace graph is first extracted by a single-layer fully connected network; the node attention weights are then learned by a K-head graph attention network (GAT) to obtain K node representations, which are summed and fed into sub-layer (2). As shown in FIG. 1, FC is the fully connected network, ELU denotes the exponential linear unit activation function, and the attention coefficient annotated in FIG. 1 is the learned attention weight. K-head means that K independent GAT operations are performed on the same traffic trace graph to obtain K different node representations. The purpose of computing K times is to learn more diverse weights and thereby improve the learning capability of the model.
(2) Self-attention pooling layer: this layer further screens important nodes by calculating the importance of each node and retaining the top-K nodes (the K nodes with the highest importance scores), which also reduces the number of model parameters. As shown in FIG. 1, ReLU denotes the rectified linear unit activation function, GCN denotes the graph convolution network, and $score_{v_i}$ denotes the learned importance score of node $v_i$. The node importance is calculated by a graph convolution network (GCN); the graph convolution operation attends to the feature information of each node while also making full use of the structural information of the graph.
(3) Readout layer: this layer performs global max pooling (MaxPool) and global average pooling (MeanPool) on the graph of top-K nodes from sub-layer (2), and concatenates the two pooling results to obtain the global embedded representation of the traffic trace graph at this layer, which is the output (Readout) of one convolution block of the graph attention hierarchical pooling module. Finally, the Readout results of the two convolution blocks of the module are concatenated to obtain the final effective embedding of the sample. As shown in FIG. 1, Jumping Knowledge denotes the skip-connection concatenation of the Readout outputs.
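The readout described in this stage is simple enough to show directly. The sketch below assumes each block hands over its pooled node features as a list of equal-length vectors; the function names `readout` and `final_embedding` are illustrative.

```python
def readout(X):
    """Concatenate global max pooling and global mean pooling over the nodes."""
    dims = range(len(X[0]))
    max_pool = [max(x[d] for x in X) for d in dims]
    mean_pool = [sum(x[d] for x in X) / len(X) for d in dims]
    return max_pool + mean_pool

def final_embedding(block_outputs):
    """Concatenate the readouts of all blocks (the skip-connection splice)."""
    emb = []
    for X in block_outputs:
        emb.extend(readout(X))
    return emb
```

With node features of width d and two blocks, the final embedding has length 4d regardless of how many nodes each graph contains, which is what lets a fixed-size classifier follow.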
3. Classification stage:
In this stage, the output module obtains the web page classification result by using a single-layer fully connected network (FC) combined with a Log Softmax function as the classifier. Dropout is used to prevent overfitting, and NLLLoss is used as the loss function.
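For reference, the classifier head amounts to the following computation for a single sample. This is a plain-Python sketch with hypothetical weights `W` and `b`; Dropout is omitted because it only acts during training. Log Softmax followed by the negative log-likelihood loss is numerically the same as cross-entropy on the raw logits.

```python
import math

def linear(x, W, b):
    """Single fully connected layer: one output logit per class."""
    return [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]

def log_softmax(logits):
    """Numerically stable log-softmax using the log-sum-exp trick."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]

def nll_loss(log_probs, target):
    """Negative log-likelihood of the true class index."""
    return -log_probs[target]

def predict(log_probs):
    """Predicted class is the index with the highest log-probability."""
    return max(range(len(log_probs)), key=lambda c: log_probs[c])
```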
Examples of the invention:
Example 1: Traditional website fingerprinting home page classification scenario
In October 2020, the raw access traffic of the HTTPS websites ranked in the Alexa China Top 100 was actively collected. Each web page was visited 100 times, giving 10,000 traffic samples in total. After sample labeling and feature extraction, the data set was divided into a training set, a validation set and a test set in a 6:1:3 ratio. After graph construction and training, the algorithm provided by the invention obtained an F1 score of 99.85% on the test set, more than 1% higher than the existing state-of-the-art work. It is the best-performing method in the traditional website home page classification scenario.
Example 2: Fine-grained web page fingerprinting scenario under a single website
In November 2020, the raw access traffic of a total of 90 popular web pages under 2 representative HTTPS websites was collected: 60 web pages of website A and 30 web pages of website B. Each web page was visited 100 times, giving 9,000 samples in total. The traffic samples of A and B were labeled, feature-extracted and divided into data sets separately, and then fed into the graph attention pooling network model provided by the invention for training. The test set results show F1 values of 96.72% and 91.45% on the two data sets respectively, an improvement of 3% to 23% over the existing state-of-the-art methods.
Example 3: Fine-grained web page fingerprinting scenario under multiple websites
In December 2020, the raw access traffic of a total of 100 popular web pages under 9 representative HTTPS websites was actively collected. Each web page was visited 100 times, giving 10,000 traffic samples in total. The test results show an F1 value of 93.37%, more than 14% higher than the prior art.
Overall, the graph attention pooling network model provided by the invention performs best in website fingerprinting scenarios of all the tested granularities. The experiments also show that the method reaches high accuracy with only a small number of training samples: for example, 99% accuracy is reached with only 25 samples in the home page classification scenario, and 90% accuracy with only 15 samples in the single-website fine-grained web page classification scenario.
Based on the same inventive concept, another embodiment of the present invention provides a fine-grained encrypted website fingerprint classification device based on a graph attention pooling network, including:
a graph construction module, used for establishing a traffic trace graph that describes the network traffic pattern, wherein nodes in the traffic trace graph represent network flows and edges represent the contextual relations between the network flows;
a graph attention hierarchical pooling module, used for automatically learning the intra-flow features and inter-flow features in the traffic trace graph by using a graph neural network model to obtain an effective embedded representation of the traffic trace graph;
an output module, used for performing website fingerprint classification by using the effective embedded representation of the traffic trace graph.
For the specific implementation of each module, refer to the description of the method of the invention.
Based on the same inventive concept, another embodiment of the present invention provides an electronic device (computer, server, smartphone, etc.) comprising a memory and a processor, the memory storing a computer program configured to be executed by the processor, the computer program comprising instructions for performing the steps of the inventive method.
Based on the same inventive concept, another embodiment of the present invention provides a computer-readable storage medium (e.g., ROM/RAM, magnetic disk, optical disk) storing a computer program, which when executed by a computer, performs the steps of the inventive method.
Other embodiments of the invention:
1. Multi-head attention layer: in a specific implementation, a graph convolution network (GCN) or the like may be used in place of the multi-head attention layer to reduce computational complexity and achieve higher accuracy.
2. Readout layer: in a specific implementation, the Readout outputs may be averaged or summed, for example, instead of being concatenated.
3. Output module: in a specific implementation, the output layer structure may be customized according to the classification task, for example by using a global average pooling layer in place of the fully connected layer.
The particular embodiments of the present invention disclosed above are illustrative only and are not intended to be limiting, since various alternatives, modifications, and variations will be apparent to those skilled in the art without departing from the spirit and scope of the invention. The invention should not be limited to the disclosure of the embodiments in the specification, but the scope of the invention is defined by the appended claims.
Claims (10)
1. A fine-grained encrypted website fingerprint classification method based on a graph attention pooling network is characterized by comprising the following steps:
establishing a traffic trace graph for describing a network traffic pattern, wherein nodes in the traffic trace graph represent network flows, and edges represent context relations of the network flows;
automatically learning the intra-flow characteristics and the inter-flow characteristics in the flow trace diagram by using the graph neural network model to obtain effective embedded representation of the flow trace diagram;
the effective embedded representation of the traffic trace graph is utilized for website fingerprint classification.
2. The method of claim 1, wherein for two flows generated by the same client, the traffic trace graph determines whether two nodes have an edge according to whether a starting time interval of the two flows is smaller than an empirical threshold.
3. The method of claim 1, wherein automatically learning intra-flow features and inter-flow features in a traffic trace graph using a graph neural network model to obtain an efficient embedded representation of the traffic trace graph comprises:
the attention weight of the nodes is learned by adopting the attention layer of the multi-head graph, so that the model focuses more on important nodes in the flow trace graph, and the negative effects of similar nodes and noise nodes among classes are reduced;
and further screening important nodes by adopting a self-attention pooling layer, and reducing the parameter quantity of the model.
4. The method according to claim 3, wherein, in the multi-head graph attention layer, a shallow abstract representation of the traffic trace graph is first extracted by a single-layer fully connected network, node attention weights are then learned by a K-head graph attention network to obtain K node representations, and the K node representations are then summed and fed into the self-attention pooling layer.
5. The method of claim 3, wherein the self-attention pooling layer calculates the importance of each node by using a graph convolutional network and retains the top-K nodes, so as to further screen important nodes while reducing the number of model parameters.
6. The method according to claim 5, wherein global max pooling and global average pooling operations are performed on the graph formed by the top-K nodes, the two pooling results are concatenated to obtain a global embedded representation of the graph, which serves as the output of one convolution block, and finally the outputs of the two convolution blocks are concatenated to obtain the final effective embedded representation.
7. The method of claim 1, wherein performing website fingerprint classification by using the effective embedded representation of the traffic trace graph comprises: obtaining a web page classification result by using a single-layer fully connected network and a Log Softmax function as a classifier, wherein Dropout is used to prevent overfitting during training, and NLLLoss is used as the loss function.
8. A fine-grained encrypted website fingerprint classification device based on a graph attention pooling network, characterized by comprising:
a graph construction module, used for establishing a traffic trace graph for describing a network traffic pattern, wherein nodes in the traffic trace graph represent network flows, and edges represent context relations between the network flows;
a graph attention level pooling module, used for automatically learning intra-flow features and inter-flow features in the traffic trace graph by using a graph neural network model, to obtain an effective embedded representation of the traffic trace graph;
and an output module, used for performing website fingerprint classification by using the effective embedded representation of the traffic trace graph.
9. An electronic apparatus, comprising a memory and a processor, the memory storing a computer program configured to be executed by the processor, the computer program comprising instructions for performing the method of any of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a computer, implements the method of any one of claims 1 to 7.
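By way of illustration only, the graph construction of claim 2 and the top-K screening and readout of claims 5 and 6 might be sketched as follows. The `Flow` record, its field names, the threshold value and the function names are hypothetical and do not appear in the claims. The node importance scores are taken as given here, whereas the claimed method computes them with a graph convolutional network. Any gating of the retained node features by their scores is omitted.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Flow:
    client: str      # hypothetical client identifier, e.g. a source address
    start: float     # flow start time in seconds
    features: tuple  # placeholder per-flow feature vector


def build_trace_graph(flows, threshold=1.0):
    """Return undirected edges (i, j) between flows of the same client
    whose start times differ by less than the threshold (claim 2)."""
    edges = []
    for i, j in combinations(range(len(flows)), 2):
        a, b = flows[i], flows[j]
        if a.client == b.client and abs(a.start - b.start) < threshold:
            edges.append((i, j))
    return edges


def topk_readout(node_repr, scores, k):
    """Keep the k highest-scoring nodes, then concatenate global max
    pooling and global average pooling over them (claims 5 and 6)."""
    keep = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    kept = [node_repr[i] for i in keep]
    dims = range(len(kept[0]))
    max_pool = [max(v[d] for v in kept) for d in dims]
    mean_pool = [sum(v[d] for v in kept) / len(kept) for d in dims]
    return keep, max_pool + mean_pool


flows = [
    Flow("10.0.0.1", 0.0, (1.0, 2.0)),
    Flow("10.0.0.1", 0.4, (3.0, 0.0)),
    Flow("10.0.0.1", 5.0, (0.0, 0.0)),
    Flow("10.0.0.2", 0.1, (9.0, 9.0)),
]
edges = build_trace_graph(flows, threshold=1.0)
kept, embedding = topk_readout(
    [f.features for f in flows], scores=[0.9, 0.2, 0.7, 0.1], k=2
)
```

In this toy input only the first two flows share a client and start within the threshold of each other, so they form the single edge. The readout has twice the dimension of a node representation because the max-pooled and average-pooled vectors are concatenated.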
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111191717.5A CN114510615A (en) | 2021-10-13 | 2021-10-13 | Fine-grained encrypted website fingerprint classification method and device based on graph attention pooling network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111191717.5A CN114510615A (en) | 2021-10-13 | 2021-10-13 | Fine-grained encrypted website fingerprint classification method and device based on graph attention pooling network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114510615A (en) | 2022-05-17 |
Family
ID=81547910
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111191717.5A Pending CN114510615A (en) | Fine-grained encrypted website fingerprint classification method and device based on graph attention pooling network | 2021-10-13 | 2021-10-13 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114510615A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115580547A (en) * | 2022-11-21 | 2023-01-06 | 中国科学技术大学 | Website fingerprint identification method and system based on time-space correlation between network data streams |
CN117648623A (en) * | 2023-11-24 | 2024-03-05 | 成都理工大学 | Network classification algorithm based on pooling comparison learning |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111382780A (en) * | 2020-02-13 | 2020-07-07 | 中国科学院信息工程研究所 | Encryption website fine-grained classification method and device based on HTTP different versions |
Non-Patent Citations (2)
Title |
---|
Jie Lu et al.: "GAP-WF: Graph Attention Pooling Network for Fine-grained SSL/TLS Website Fingerprinting", 2021 International Joint Conference on Neural Networks (IJCNN), 20 September 2021 (2021-09-20), pages 1 * |
Zhang Daowei; Duan Haixin: "Website fingerprinting technique based on image texture", Journal of Computer Applications (计算机应用), no. 06, 22 January 2020 (2020-01-22) * |
Similar Documents
Publication | Title |
---|---|
Yao et al. | Identification of encrypted traffic through attention mechanism based long short term memory |
Zeng et al. | DeepVCM: A deep learning based intrusion detection method in VANET |
US10176364B2 (en) | Media content enrichment using an adapted object detector |
CN114422211B (en) | HTTP malicious traffic detection method and device based on graph attention network |
CN114510615A (en) | Fine-grained encrypted website fingerprint classification method and device based on graph attention pooling network |
CN110855648B (en) | Early warning control method and device for network attack |
CN106230809B (en) | A kind of mobile Internet public sentiment monitoring method and system based on URL |
CN114389966A (en) | Network traffic identification method and system based on graph neural network and stream space-time correlation |
WO2017052953A1 (en) | Client-side web usage data collection |
Wang et al. | 2ch-TCN: a website fingerprinting attack over tor using 2-channel temporal convolutional networks |
Wu et al. | TDAE: Autoencoder-based automatic feature learning method for the detection of DNS tunnel |
CN115080756A (en) | Attack and defense behavior and space-time information extraction method oriented to threat information map |
CN114338064A (en) | Method, device, equipment and storage medium for identifying network traffic type |
CN110222795A (en) | The recognition methods of P2P flow based on convolutional neural networks and relevant apparatus |
Wang et al. | Identifying DApps and user behaviors on ethereum via encrypted traffic |
Lu et al. | GAP-WF: Graph attention pooling network for fine-grained SSL/TLS Website fingerprinting |
Wang et al. | An unknown protocol syntax analysis method based on convolutional neural network |
CN113938290A (en) | Website de-anonymization method and system for user side traffic data analysis |
CN111310796B (en) | Web user click recognition method oriented to encrypted network flow |
CN112235254A (en) | Rapid identification method for Tor network bridge in high-speed backbone network |
CN115604032B (en) | Method and system for detecting complex multi-step attack of power system |
Gu et al. | An online website fingerprinting defense based on the non-targeted adversarial patch |
CN115580547A (en) | Website fingerprint identification method and system based on time-space correlation between network data streams |
CN111835720B (en) | VPN flow WEB fingerprint identification method based on feature enhancement |
TWI591982B (en) | Network flow recognization method and recognization system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||