CN114792384A - Graph classification method and system integrating high-order structure embedding and composite pooling - Google Patents
- Publication number: CN114792384A
- Application number: CN202210486421.4A
- Authority: CN (China)
- Prior art keywords: graph, subgraph, layer, pooling, composite
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G06V10/765 — Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects, using rules for classification or partitioning the feature space
- G06N3/045 — Computing arrangements based on biological models; neural networks; combinations of networks
- G06N3/08 — Computing arrangements based on biological models; neural networks; learning methods
- G06V10/806 — Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level, of extracted features
- G06V10/82 — Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Abstract
The invention belongs to the technical field of artificial-intelligence graph classification and provides a graph classification method and system integrating high-order structure embedding and composite pooling. The method comprises the following steps: acquiring a graph to be classified; and inputting the graph to be classified into a graph neural network to obtain the class of the graph. For each subgraph set of the graph, each convolution layer computes the features of each subgraph based on the subgraph set output by the previous neural network layer; each composite pooling layer updates the subgraph set based on the subgraph features output by the convolution layer and, for each subgraph in the updated set, fuses the features of the subgraphs in its local neighborhood through an attention mechanism to update the subgraph's features. A readout layer then produces a graph representation vector, which is input into the classifier to obtain the class of the graph. By using high-order structures, messages are passed directly between subgraphs, structural information invisible at the node level is captured, and graph classification accuracy is improved.
Description
Technical Field
The invention belongs to the technical field of artificial intelligence graph classification, and particularly relates to a graph classification method and system integrating high-order structure embedding and composite pooling.
Background
The statements in this section merely provide background information related to the present disclosure and may not constitute prior art.
In real life, data in many real scenarios can be naturally represented by graphs: from biological and chemical informatics to social network analysis, graph-structured data is ubiquitous, and graph classification is one of its important applications. Briefly, given a dataset of graphs of the form (G, y), where G denotes a graph and y its class, the goal of the graph classification task is to predict the label associated with an entire graph from the given graph structure and node features. Many real-world graphs have typical local structures. For example, in a social network graph, the ego network around each node and the complex multivariate relationships of the whole community have hierarchical features; in a chemical molecular network graph, complex structures such as radicals and bonds, which represent functional units in a molecule, are encoded in order to label the graph structure of an organic molecule efficiently. Such complex multi-element structures are defined in the graph as high-order structures, and they play a crucial role in the feature representation of network subgraphs. To develop an effective graph classification model, the rich information inherent in the graph structure and the feature information contained in the graph's nodes and edges need to be fully exploited. In recent years, a large number of related models have been proposed to capture the complex structural relationships in graphs and improve performance on specific tasks. Although these models perform well on their respective problems, they have the following disadvantages:
(1) Existing models use only the information of vertices and edges in the graph for classification and pay little attention to high-order graph structure information. In practical applications, interactions may occur in groups of three or more nodes; these cannot simply be described as pairwise relationships between entities, but can be decomposed and expressed as high-order structures of different levels. For example, the high-order structures and node features in the actor social network graph of fig. 2 play an important role: the yellow, green, and blue dotted boxes represent a second-order subgraph, a third-order subgraph, and a fourth-order subgraph, respectively. Existing models cannot represent such complex relationships.
(2) Most existing models lack a hierarchical structure. A naturally formed graph structure arises from single nodes associating with one another and contains a large amount of structural semantics; learning the representation of a graph in a layered manner is very important for capturing the local structures present in the graph.
(3) To realize a hierarchical structure in the graph classification task, existing graph neural network approaches implement hierarchical learning with a pooling mechanism. However, the graph pooling mode in existing models is single: subgraphs that do not participate in pooling are directly discarded during topology generation, so graph feature information is lost; the models still focus only on the node level, without considering high-order structural information; and they do not explicitly consider both the graph topology and the node characterization when generating the pooled graph topology.
Disclosure of Invention
In order to solve the technical problems in the background art, the invention provides a graph classification method and system integrating high-order structure embedding and composite pooling.
In order to achieve the purpose, the invention adopts the following technical scheme:
a first aspect of the present invention provides a graph classification method that merges high-order structure embedding and composite pooling, comprising:
acquiring a graph to be classified;
inputting the graph to be classified into a graph neural network to obtain the class of the graph;
the graph neural network comprises a readout layer, a classifier, and a plurality of neural network layers connected in sequence, each neural network layer consisting of a convolution layer and a composite pooling layer connected in sequence. For each subgraph set of the graph, each convolution layer computes the features of each subgraph based on the subgraph set output by the previous neural network layer; each composite pooling layer updates the subgraph set based on the subgraph features output by the convolution layer and, for each subgraph in the updated subgraph set, fuses the features of the subgraphs in its local neighborhood through an attention mechanism to update the subgraph's features. The readout layer aggregates the features of all subgraphs in all subgraph sets output by the last neural network layer to obtain a graph representation vector, which is input into the classifier to obtain the class of the graph.
Furthermore, the graph corresponds to a plurality of subgraph sets, and all subgraphs in the same subgraph set are of the same order.
Further, in a subgraph set, if there is an edge between the non-common nodes of two subgraphs, the two subgraphs belong to each other's local neighborhoods.
Furthermore, the feature of each subgraph in the first convolution layer is obtained by aggregating the low-order features of each node in the subgraph and then combining them with the initial feature of the subgraph.
Further, each composite pooling layer calculates a score for each subgraph based on the subgraph features output by the convolution layer and selects several subgraphs according to the scores to update the subgraph set.
Further, the score of each sub-graph is a weighted sum of the structure information score and the feature score.
Furthermore, the readout layer first sums the features of all subgraphs in each subgraph set to obtain a vector representation of that set, and then concatenates the vector representations of all the subgraph sets to obtain the graph representation vector.
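This readout (summation within each subgraph set, then concatenation across sets) can be sketched as follows; the function name and array shapes are illustrative, not taken from the patent:

```python
import numpy as np

def readout(subgraph_feature_sets):
    """Sum subgraph features within each k-order set, then concatenate
    the per-set vectors into one graph representation vector."""
    per_set = [np.sum(feats, axis=0) for feats in subgraph_feature_sets]
    return np.concatenate(per_set)

# Two sets: three 2nd-order subgraphs and two 3rd-order subgraphs, feature dim 4
sets_ = [np.ones((3, 4)), 2 * np.ones((2, 4))]
g_vec = readout(sets_)  # dimension = 4 + 4 = 8
```

With per-set feature dimension F and m subgraph sets, the graph vector has fixed dimension m·F regardless of how many subgraphs survive pooling, which is what makes it usable as classifier input.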
A second aspect of the invention provides a graph classification system that fuses high-order structure embedding and composite pooling, comprising:
an acquisition module configured to: acquiring a graph to be classified;
a classification module configured to: inputting the graph to be classified into a graph neural network to obtain the class of the graph;
the graph neural network comprises a readout layer, a classifier, and a plurality of neural network layers connected in sequence, each neural network layer consisting of a convolution layer and a composite pooling layer connected in sequence. For each subgraph set of the graph, each convolution layer computes the features of each subgraph based on the subgraph set output by the previous neural network layer; each composite pooling layer updates the subgraph set based on the subgraph features output by the convolution layer and, for each subgraph in the updated subgraph set, fuses the features of the subgraphs in its local neighborhood through an attention mechanism to update the subgraph's features. The readout layer aggregates the features of all subgraphs in all subgraph sets output by the last neural network layer to obtain a graph representation vector, which is input into the classifier to obtain the class of the graph.
A third aspect of the present invention provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps in the graph classification method incorporating high-order structure embedding and composite pooling as described above.
A fourth aspect of the present invention provides a computer device, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the steps of the graph classification method for merging high-order structure embedding and composite pooling as described above.
Compared with the prior art, the invention has the beneficial effects that:
the invention provides a graph classification method integrating embedding of a high-order structure and composite pooling, which utilizes the high-order structure to directly transfer messages between subgraphs instead of single nodes, and the high-order form of message transfer can capture structural information which is not visible at a node level.
The invention provides a graph classification method integrating high-order structure embedding and composite pooling, which considers subgraph feature information and graph topology information simultaneously during pooling, retains the original features of the input graph to the maximum extent for feature generation after the pooled subgraphs are selected, and uses feature fusion to ensure that the feature representation of the pooled subgraphs contains sufficient and effective information from the graph.
The present invention provides a graph classification method integrating high-order structure embedding and composite pooling, which can efficiently utilize the structural information of a graph to process both its simple and complex multi-element structures and the relationships between them. Because the topological structure and the node features are considered together within one graph, a more objective and comprehensive graph embedding representation can be obtained.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments of the invention and, together with the description, serve to explain the invention without limiting it.
FIG. 1 is a flowchart of a graph classification method integrating embedding and composite pooling of high-level structures according to an embodiment of the present invention;
FIG. 2 is a high-order structure diagram of the actor social network diagram according to an embodiment of the present invention;
fig. 3 is a schematic diagram of incremental subgraph feature aggregation according to the first embodiment of the present invention.
Detailed Description
The invention is further described with reference to the following figures and examples.
It is to be understood that the following detailed description is exemplary and is intended to provide further explanation of the invention as claimed. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit exemplary embodiments according to the invention. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise; it should further be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.
Example one
The embodiment provides a graph classification method integrating high-order structure embedding and composite pooling, which specifically comprises the following steps:
Step 1: acquiring graph data with known class labels, and acquiring graph data to be classified;
specifically, one graph data is node data and node relation data (edges), and constitutes nodes and edges of the graph. For classification application, graph data of a known class label corresponds to a label graph (G, y), where graph G ═ V, E, V (G) represents a node set, a node is an entity in the graph, E (G) represents a set of edges, if there is an edge between two nodes, it indicates that two entities corresponding to the node have a relationship, and y is a label corresponding to the graph.
In particular, the graph is a network graph; for example, it may be a social network graph about movie collaboration (a movie social network graph). The nodes of the social network graph represent actors, and an edge between two nodes indicates that the two corresponding actors collaborated in the same movie; each graph structure represents an ego network of actors. The label of the graph represents the movie genre to which the ego network belongs, and the corresponding high-order structures in the graph represent node sets with specific semantics formed by different actors.
Step 2: training the neural network using the graph data with known class labels, wherein the neural network is a graph neural network.
Step 3: inputting the graph to be classified into the trained graph neural network to obtain the class of the graph. The graph neural network comprises a readout layer, a classifier, and a plurality of neural network layers connected in sequence, each neural network layer consisting of a convolution layer and a composite pooling layer connected in sequence. For each subgraph set of the graph, each convolution layer computes the features of each subgraph based on the subgraph set output by the previous neural network layer; each composite pooling layer updates the subgraph set based on the subgraph features output by the convolution layer and, for each subgraph in the updated subgraph set, fuses the features of the subgraphs in its local neighborhood through an attention mechanism to update the subgraph's features. The readout layer aggregates the features of all subgraphs in all subgraph sets output by the last neural network layer to obtain a graph representation vector, which is input into the classifier to obtain the class of the graph.
Specifically, as shown in fig. 1, the method includes the following steps:
Step 301: according to the acquired data information of the graph, the features of the k-order neighbors of the high-order structure are aggregated by stepwise incremental convolution layers to obtain the feature representation of the k-order structure.
In a specific implementation, each subgraph containing k nodes is initialized with the learned features of all its subgraphs containing k-1 nodes, and the feature of each node consists of its own feature together with the features of its first-order through k-order neighborhoods. Each subgraph obtains a new embedded representation through the convolution calculation of an incremental convolution layer over its local neighborhood, where incremental convolution means that the method carries the features of one order forward to the next order; for example, as shown in fig. 3, the third-order features include the aggregated first-order and second-order information.
As one or more embodiments, step 301 uses N(v) to denote the first-order neighborhood of a node v ∈ V(G) in a given input graph, i.e., N(v) = {u ∈ V(G) | (v, u) ∈ E(G)}. Let E_P = {(v, u) ∈ E(G) | u, v ∈ P}; then G[P] = (P, E_P) is defined as the subgraph induced by P, and G[P] is required to be a connected subgraph.
As shown in fig. 2, in the movie social network, a node set of size k indicates that k actors (or actors and producers, etc.) form a k-order subgraph; if k = 2, there are binary sets formed by a leading actor and an ordinary actor, by two leading actors, or by a leading actor and a producer, etc. If P is treated as a set, then [P]^k denotes the set of all subsets of P with cardinality k. Specifically, if a node set of size k is defined as a k-order node set (or k-order subgraph) s = (s_1, ..., s_k), where s_i denotes the i-th node in the k-order subgraph, then [V(G)]^k denotes the set of all such s in the graph, i.e., s ∈ [V(G)]^k, where [V(G)]^k is the set of all k-order subgraphs. That is, one graph corresponds to a plurality of subgraph sets, and all subgraphs in the same subgraph set [V(G)]^k are of the same order, i.e., s = (s_1, ..., s_k).
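For small graphs, a subgraph set of connected k-order subgraphs can be enumerated by brute force. The helper below is a hypothetical sketch for illustration only; the patent does not prescribe an enumeration algorithm:

```python
from itertools import combinations

def connected_k_subgraphs(nodes, edges, k):
    """Enumerate all k-node subsets that induce a connected subgraph
    (matching the connectedness requirement on G[P] above)."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def induces_connected(subset):
        subset = set(subset)
        start = next(iter(subset))
        seen, stack = {start}, [start]
        while stack:  # depth-first search restricted to the subset
            for nb in adj[stack.pop()] & subset:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        return seen == subset

    return [s for s in combinations(sorted(nodes), k) if induces_connected(s)]

# Path graph 1-2-3: (1, 3) induces no edge, so it is excluded
pairs = connected_k_subgraphs([1, 2, 3], [(1, 2), (2, 3)], 2)
triples = connected_k_subgraphs([1, 2, 3], [(1, 2), (2, 3)], 3)
```

Brute-force enumeration is exponential in k; practical implementations restrict k to small values (the figures show orders 2 through 4).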
The graph neural network is composed of many neural network layers; each layer aggregates the local neighbor information (the features of the neighbors) around each node and then passes this aggregated information to the next layer. When the high-order structure is embedded incrementally, each order of structure embedding requires multiple convolution layers to achieve feature aggregation. For the k-th order embedding, when the number of convolution layers t > 0, a new feature is computed from the feature matrix of node v:

H^(t)(v) = σ( H^(t-1)(v) · W_1^(t) + Σ_{v' ∈ N(v)} H^(t-1)(v') · W_2^(t) )

where H^(t)(v) denotes the feature representation of node v obtained by iteration at layer t, and t-1 denotes the layer above t; N(v) denotes the neighborhood of node v; v' ∈ N(v) denotes that node v' is one of the nodes in the neighborhood of v; W_1^(t) and W_2^(t) are parameter matrices; H is of size R^(N×F), where N is the number of nodes and F is the number of node features; and σ is an activation function, such as sigmoid or ReLU. During training, to adapt the parameters to a given data distribution, they are optimized together with the parameters of the neural network used for classification by stochastic gradient descent; this is called single-order GNN. Correspondingly, the multi-order aggregation method, i.e., the k-order features, aggregates incremental embeddings on a (k-1)-order basis.
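A minimal dense-matrix sketch of a single-order aggregation layer of this kind, with the two parameter matrices described above (one for the node itself, one for its neighbors) and ReLU as an assumed activation:

```python
import numpy as np

def gnn_layer(H, A, W1, W2, sigma=lambda x: np.maximum(x, 0.0)):
    """One single-order layer: H(t) = sigma(H(t-1) W1 + A H(t-1) W2),
    where A H sums the features of each node's neighbors."""
    return sigma(H @ W1 + A @ H @ W2)

H = np.array([[1.0], [2.0], [3.0]])                            # node features, F = 1
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)   # path graph 0-1-2
W1 = W2 = np.array([[1.0]])                                    # identity-like weights
H_next = gnn_layer(H, A, W1, W2)
```

With identity weights, each node's new feature is its own feature plus the sum of its neighbors' features: node 1 gets 2 + (1 + 3) = 6.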
The invention performs multi-order aggregation, with the order k determined by the node set in the subgraph; all k-element subgraphs in V(G) are regarded as a whole, so that V(G) in single-order embedding becomes the set [V(G)]^k in the multi-order mode. To describe the algorithm: the k-order node sets t ∈ [V(G)]^k that share k-1 common nodes with a node set s form the neighbor set of s, which can be expressed as N(s) = {t ∈ [V(G)]^k : |s ∩ t| = k-1}.
In a subgraph set, if an edge exists between the non-common nodes of two subgraphs, the two subgraphs belong to each other's local neighborhoods. Specifically, if there is an edge between the non-common nodes of subgraph s and subgraph s', then s' belongs to the local neighborhood N_L(s) of s, and vice versa; the relative difference of N(s) and N_L(s), i.e., N(s) \ N_L(s), is called the global neighborhood N_G(s).
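The neighbor test |s ∩ t| = k-1 and the local/global split by an edge between the non-common nodes can be sketched as follows (an illustrative helper, not the patent's implementation):

```python
def split_neighborhood(s, subgraphs, edges):
    """For subgraph s, return (N_L(s), N_G(s)): neighbors t with
    |s ∩ t| = k-1, split by whether an edge joins the two non-common nodes."""
    k = len(s)
    edge_set = {frozenset(e) for e in edges}
    local, global_ = [], []
    for t in subgraphs:
        if t == s or len(set(s) & set(t)) != k - 1:
            continue  # not a neighbor of s
        u = (set(s) - set(t)).pop()  # node unique to s
        v = (set(t) - set(s)).pop()  # node unique to t
        (local if frozenset((u, v)) in edge_set else global_).append(t)
    return local, global_

edges = [(1, 2), (2, 3), (3, 4), (1, 3)]
subs = [(1, 2), (2, 3), (3, 4), (1, 3)]
loc, glo = split_neighborhood((2, 3), subs, edges)
```

For s = (2, 3): neighbor (3, 4) is global because nodes 2 and 4 are not adjacent, while (1, 2) and (1, 3) are local.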
When t ≥ 0, each incremental convolution layer computes a feature vector H^(t)(s) for each subgraph s in [V(G)]^k. Based on the definitions of the local and global neighborhoods, the feature obtained by each k-order subgraph s of the k-order subgraph set in the layer-t iteration is expressed as:

H^(t)(s) = σ( H^(t-1)(s) · W_1^(t) + Σ_{u ∈ N_L(s)} H^(t-1)(u) · W_2^(t) + Σ_{u ∈ N_G(s)} H^(t-1)(u) · W_3^(t) )

where W_1^(t), W_2^(t), and W_3^(t) are the parameter matrices of layer t and σ is the activation function. The sum over the neighborhood is divided into the two ranges N_L(s) and N_G(s) with different parameter matrices, so that the model can compute features in the local and global neighborhoods separately. To extend HSECP to larger datasets and prevent overfitting, the model disregards the global neighborhood of subgraph s in the calculation and considers only the local neighborhood, obtaining the feature representation of each k-order subgraph s in the subgraph set [V(G)]^k. That is, each convolution layer computes, based on the subgraph set [V(G)]^k output by the previous neural network layer, the feature of each k-order subgraph s in the layer-t iteration:

H^(t)(s) = σ( H^(t-1)(s) · W_1^(t) + Σ_{u ∈ N_L(s)} H^(t-1)(u) · W_2^(t) )

where u ∈ N_L(s) denotes that subgraph u is one of the subgraphs in the local neighborhood of s.
To better utilize the structural information of the graph, and considering that using a one-hot vector alone makes the initial feature input of HSECP too simple, the invention uses, in addition to the marked isomorphism type, the features learned by the (k-1)-order convolution layers as part of the initial features, i.e., the first k-1 orders of aggregated information within the k-order features. That is, the feature of each subgraph in the first convolution layer is obtained by aggregating the low-order features H_{k-1}(v) of each node in the subgraph and then combining them with the initial feature H_k(s) of the subgraph:

H^(0)(s) = [ H_k(s), Σ_{v ∈ s} H_{k-1}(v) · W_{k-1} ]

where v ∈ s denotes that node v belongs to subgraph s, W_{k-1} is a parameter matrix, and the brackets denote matrix concatenation. In this way the features of the input graph under the 1st- through k-th-order multi-order high-order structures can be learned end to end.
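A sketch of this initialization for the k-order subgraph features, concatenating a one-hot isomorphism-type vector with the summed (k-1)-order node features; the learnable projection W_{k-1} is omitted here for simplicity, and all names are illustrative:

```python
import numpy as np

def init_k_features(subgraphs, iso_onehot, H_prev):
    """For each k-order subgraph s: concatenate its isomorphism-type
    one-hot with the sum of the (k-1)-order features of its nodes."""
    rows = []
    for s, iso in zip(subgraphs, iso_onehot):
        low = H_prev[list(s)].sum(axis=0)  # aggregate low-order node features
        rows.append(np.concatenate([iso, low]))
    return np.stack(rows)

H_prev = np.array([[1.0], [2.0], [3.0]])   # learned (k-1)-order features per node
iso = np.eye(2)                            # one-hot isomorphism types, 2 subgraphs
H0 = init_k_features([(0, 1), (1, 2)], iso, H_prev)
```

Subgraph (0, 1) gets [1, 0 | 1 + 2] and subgraph (1, 2) gets [0, 1 | 2 + 3], so structural type and accumulated low-order information both enter the first k-order layer.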
Step 302: taking the processed high-order embeddings as input, a composite pooling layer is added after each order's convolution layer and the top-K subgraphs by final pooling score are selected; the output of the composite pooling layer serves as the input of the next convolution layer, the final outputs are summarized in the readout layer, and the output of each readout layer is passed to a linear layer for classification.
In a specific implementation, a composite pooling layer is added after each order of convolution layer, and the subgraph composite pooling process after each high-order structure aggregation comprises two steps: selection of the pooled subgraphs and pooled-subgraph feature aggregation.
(1) Selection of the pooled subgraphs comprises topology-based TOP-K pooling, feature-based TOP-K pooling, and fusion of the pooling results. The composite pooling layer is used to extract more effective information from the graph and accelerate computation: the structural information of the graph is obtained through topology-based pooling, the topological pooling is reinforced with the feature information extracted through feature-based pooling, and the top-K subgraphs by final pooling score are selected;
as one or more embodiments, the (1) performing topology-based TOP-K pooling on the pooling of each-order subgraph, using GCNConv to evaluate the importance of the structure of the higher-order subgraph, and calculating the structure information score of the higher-order subgraph:
wherein A represents the adjacent matrix of the subgraph s, D represents the degree matrix of the subgraph s, I represents the unit matrix, and the adjacent matrix is normalized to obtain the productW denotes a parameter matrix.
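The GCNConv-style structure score can be sketched as follows, with symmetric normalization of A + I and sigmoid as an assumed activation (the dense-matrix form is for illustration):

```python
import numpy as np

def topology_score(A, H, w):
    """Structure score: sigmoid(D^{-1/2} (A + I) D^{-1/2} H w),
    one scalar score per subgraph (row of H)."""
    A_hat = A + np.eye(A.shape[0])                      # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    z = d_inv_sqrt @ A_hat @ d_inv_sqrt @ H @ w         # normalized propagation
    return 1.0 / (1.0 + np.exp(-z))                     # sigmoid

A = np.array([[0.0, 1.0], [1.0, 0.0]])   # two mutually adjacent subgraphs
H = np.array([[0.0], [0.0]])             # zero features -> neutral scores
w = np.array([0.0])
s1 = topology_score(A, H, w)
```

With zero features and weights, both scores are exactly 0.5, the sigmoid's neutral point; in training, w is learned so the scores rank subgraphs by structural importance.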
Feature-based TOP-K pooling adopts an adaptive feature-information aggregation method, an MLP, to obtain the subgraph features and compute the feature score of each high-order subgraph (i.e., S_2 is obtained from the MLP output):
The features of the more important subgraphs can thus be retained according to the aggregation result. GCNConv effectively refines the local structural information of the graph, while the MLP focuses on the feature information of the nodes; to make the subgraph evaluation more objective, both methods are applied, and the two scores are combined once obtained. The score of each subgraph is the weighted sum of its structure information score and its feature score, i.e., the final score is combined as:
S_final = α·S_1 + (1 − α)·S_2
where the weight α is a hyperparameter. The subgraphs are ranked by their final score S_final, and the top K subgraphs form the pooling result. That is, for each subgraph set [V(G)]_k of the graph, the subgraphs with the K highest scores S_final are selected from [V(G)]_k to form a new subgraph set that replaces [V(G)]_k.
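The score fusion S_final = α·S_1 + (1 − α)·S_2 and the TOP-K selection can be sketched as follows; the toy score values are illustrative only:

```python
import numpy as np

def composite_topk(S1, S2, alpha, K):
    # S_final = α·S1 + (1-α)·S2; keep the K subgraphs with the highest score.
    S_final = alpha * S1 + (1 - alpha) * S2
    kept = np.sort(np.argsort(-S_final)[:K])   # indices of the top-K subgraphs
    return kept, S_final

S1 = np.array([0.9, 0.2, 0.5, 0.7])            # structure scores (topology-based)
S2 = np.array([0.1, 0.8, 0.6, 0.3])            # feature scores (feature-based)
kept, S_final = composite_topk(S1, S2, alpha=0.7, K=2)
# kept -> [0, 3]: subgraphs 0 and 3 have the highest combined scores
```

With α close to 1 the selection is driven by topology, with α close to 0 by features, which is why α is treated as a hyperparameter above.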
(2) Pooled-subgraph feature aggregation. This step focuses on retaining the features of the unselected subgraphs: the feature information carried by the unselected subgraph structures is preserved through subgraph feature aggregation. The output of the composite pooling layer serves as the input of the next convolutional layer, the final outputs are summarized in the readout layers, and the output of each readout layer is passed to the linear layer for classification.
As one or more embodiments, in (2), the Top-K selection at each order keeps only part of the subgraphs as the pooling result. To exploit the useful information of the unselected subgraphs, their features must be aggregated before they are discarded; for example, the features of the one-hop or two-hop neighbors of a selected subgraph are aggregated into it, so that the subgraph feature information is used more fully and the final graph embedding vector is more representative. The subgraph features are fused through an attention mechanism; the attention aggregation proceeds as follows. First, the attention score e_su between a subgraph s and each of its neighbors u is computed and normalized using the softmax function:
where a^T denotes the attention vector, and W_1 and W_2 denote parameter matrices. After the attention scores are obtained, the subgraph embedding is generated using a nonlinear activation σ; that is, the features of the subgraphs in the local neighborhood are fused through the attention mechanism to obtain the subgraph feature:
for each updated subgraph set [ V (G)] k Combining the characteristics of the subgraph in the local neighborhood through attention mechanism to obtain subgraph characteristicsAs a new subpicture featureAnd inputting the data into the convolutional layer of the next neural network layer, wherein under the composite action of the methods, the selected pooled subgraph carries characteristic information from neighbors, and the pooled result can highly represent the whole graph.
Step 303: after the per-order output graphs of the graph data whose class label is to be predicted are obtained, the feature vectors obtained after aggregating the high-order structures are concatenated as the final feature representation of the graph, and a pre-trained neural network performs classification prediction to complete the classification task.
In a specific implementation, after the feature representation of the whole graph is obtained through the above steps, a multilayer perceptron (MLP) performs classification prediction and outputs the class label of the graph to be predicted.
As one or more embodiments, in step 303, after all high-order subgraph structures have been processed and their feature representations obtained, a classifier performs the classification prediction; the multilayer perceptron uses hidden-layer activation functions to add nonlinearity, and in the classification stage the learned high-order subgraph representations are used for classification.
The readout layer computes the vector representation of the whole graph by accumulating and summing the embeddings of all subgraphs of each subgraph set; that is, the features of all subgraphs of each subgraph set are summed to obtain the vector representation corresponding to that subgraph set:
where T denotes the index of the last convolutional layer and the summed term is the output of the k-th-order composite pooling layer at that layer. The graph representation vectors obtained from the output graphs of all orders are concatenated into the final representation vector; that is, the vector representations corresponding to all subgraph sets are concatenated to obtain the graph representation vector:
F(G) = concat{H_1(G), ..., H_k(G)}
The graph embedding obtained by the readout layer is then input to a classifier to predict the label, i.e.:
the classification result of the map corresponding to the social network of the movie corresponds to the type prediction of the actor's own network, such as romantic type or action type.
Based on domain knowledge of graph neural networks, and addressing the problem that high-order structural information is not effectively exploited in embedding representations for graph classification, the invention provides a novel hierarchical end-to-end graph classification method for multi-order high-order structures. Introducing high-order structures takes the role of subgraphs in feature aggregation into account, and a pooling strategy that fuses structural information with attribute information yields an embedded representation of the graph from which its complex hierarchical structure can be extracted.
Example two
The embodiment provides a graph classification system integrating high-order structure embedding and composite pooling, which specifically comprises the following modules:
an acquisition module configured to: obtaining a graph to be classified;
a classification module configured to: inputting the graph to be classified into a graph neural network to obtain the class of the graph;
The graph neural network comprises a readout layer, a classifier, and a plurality of sequentially connected neural network layers, each consisting of a convolutional layer followed by a composite pooling layer. For each subgraph set of the graph, each convolutional layer computes the features of each subgraph based on the subgraph sets output by the previous neural network layer; each composite pooling layer updates the subgraph sets based on the subgraph features output by its convolutional layer and, for each subgraph in the updated set, updates the subgraph's features by fusing the features of the subgraphs in its local neighborhood through an attention mechanism. The readout layer aggregates the features of all subgraphs in all subgraph sets output by the last neural network layer to obtain a graph representation vector, which is input to the classifier to obtain the class of the graph.
It should be noted that, each module in the present embodiment corresponds to each step in the first embodiment one to one, and the specific implementation process is the same, which is not described again here.
Example three
The present embodiment provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor, implements the steps in the graph classification method for merging high-order structure embedding and composite pooling as described in the first embodiment above.
Example four
This embodiment provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the steps in the graph classification method for merging high-order structure embedding and composite pooling as described in the first embodiment.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above may be implemented by a computer program, which may be stored in a computer readable storage medium and executed by a computer to implement the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. A graph classification method fusing high-order structure embedding and composite pooling, characterized by comprising the following steps:
acquiring a graph to be classified;
inputting the graph to be classified into a graph neural network to obtain the class of the graph;
the graph neural network comprises a readout layer, a classifier, and a plurality of sequentially connected neural network layers, each consisting of a convolutional layer followed by a composite pooling layer; for each subgraph set of the graph, each convolutional layer computes the features of each subgraph based on the subgraph sets output by the previous neural network layer, and each composite pooling layer updates the subgraph sets based on the subgraph features output by its convolutional layer while, for each subgraph in the updated set, fusing the features of the subgraphs in its local neighborhood through an attention mechanism to update the subgraph's features; and the readout layer aggregates the features of all subgraphs in all subgraph sets output by the last neural network layer to obtain a graph representation vector, which is input to the classifier to obtain the class of the graph.
2. The method for graph classification with merging of higher-order structure embedding and composite pooling of claim 1, wherein the graph corresponds to a number of subgraph sets, all subgraphs in a same subgraph set being of the same order.
3. The method for graph classification fusing high-order structure embedding and composite pooling according to claim 1, wherein, within a subgraph set, two subgraphs belong to each other's local neighborhoods if there is an edge connection between their non-common nodes.
4. The method for graph classification fusing higher-order structure embedding and composite pooling of claim 1, wherein the features of each subgraph in the first layer convolutional layer are obtained by combining subgraph initial features after aggregating the lower-order features of each node in the subgraph.
5. The graph classification method fusing embedding and composite pooling of higher-order structures according to claim 1, wherein each layer of composite pooling calculates a score of each subgraph based on the characteristics of each subgraph output by the convolutional layer, and selects several subgraphs according to the scores to update the subgraph set.
6. The method for graph classification fusing higher-order structure embedding and composite pooling of claim 5 wherein the score of each sub-graph is a weighted sum of a structure information score and a feature score.
7. The method for graph classification with fusion of high-order structure embedding and composite pooling of claim 1, wherein the readout layer first performs cumulative summation on the features of all subgraphs of each subgraph set to obtain a vector representation corresponding to each subgraph set; and then, splicing the vector representations corresponding to all the subgraph sets to obtain a representation vector.
8. A graph classification system integrating high-order structure embedding and composite pooling is characterized by comprising:
an acquisition module configured to: obtaining a graph to be classified;
a classification module configured to: inputting the graph to be classified into a graph neural network to obtain the class of the graph;
the graph neural network comprises a readout layer, a classifier, and a plurality of sequentially connected neural network layers, each consisting of a convolutional layer followed by a composite pooling layer; for each subgraph set of the graph, each convolutional layer computes the features of each subgraph based on the subgraph sets output by the previous neural network layer, and each composite pooling layer updates the subgraph sets based on the subgraph features output by its convolutional layer while, for each subgraph in the updated set, fusing the features of the subgraphs in its local neighborhood through an attention mechanism to update the subgraph's features; and the readout layer aggregates the features of all subgraphs in all subgraph sets output by the last neural network layer to obtain a graph representation vector, which is input to the classifier to obtain the class of the graph.
9. A computer readable storage medium having stored thereon a computer program, which when executed by a processor performs the steps in the method of graph classification fusing high order structure embedding and composite pooling according to any of the claims 1-7.
10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps in the graph classification method fusing higher order structure embedding and composite pooling of any of claims 1-7 when executing the program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210486421.4A CN114792384A (en) | 2022-05-06 | 2022-05-06 | Graph classification method and system integrating high-order structure embedding and composite pooling |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114792384A true CN114792384A (en) | 2022-07-26 |
Family
ID=82461784
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115984633A (en) * | 2023-03-20 | 2023-04-18 | 南昌大学 | Gate-level circuit component identification method, system, storage medium and equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||