CN112862003A - Method, device and equipment for enhancing graph neural network information - Google Patents

Method, device and equipment for enhancing graph neural network information

Info

Publication number
CN112862003A
Authority
CN
China
Prior art keywords
node
feature vector
motif
preset
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110297502.5A
Other languages
Chinese (zh)
Inventor
吴嘉婧
夏一钧
刘洁利
郑子彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University filed Critical Sun Yat Sen University
Priority to CN202110297502.5A priority Critical patent/CN112862003A/en
Publication of CN112862003A publication Critical patent/CN112862003A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a method, a device and equipment for enhancing graph neural network information, wherein the method comprises the following steps: identifying the network motifs in which different nodes participate according to preset node characteristic data and a preset node adjacency matrix; constructing, based on the network motifs, an initial motif feature vector corresponding to each node; inputting the initial motif feature vectors into a preset LSTM model for aggregation processing to obtain the self-motif feature vectors of different nodes, wherein the preset LSTM model comprises an attention mechanism; performing an aggregation operation on the neighboring node motif vectors according to the self-motif feature vectors to obtain the target motif feature vector of each node; and fusing the target motif feature vector and the initial node feature vector to obtain an enhanced feature vector. The method and the device can solve the technical problem in the prior art that the feature information learned by a graph neural network lacks representativeness because the rich structural information generated by different interaction modes among network nodes is ignored.

Description

Method, device and equipment for enhancing graph neural network information
Technical Field
The present application relates to the field of graph neural network technologies, and in particular, to a method, an apparatus, and a device for enhancing graph neural network information.
Background
Graph Neural Networks (GNNs) are a class of deep-learning-based methods for processing graph information, and are increasingly used for the analysis of graph-structured data because of their good performance and interpretability. Information is transferred between nodes of a graph neural network in different ways, and on this basis graph neural networks can be roughly divided into four categories: graph convolutional networks, graph attention networks, gated graph neural networks, and residual-connected graph neural networks. These graph neural networks typically consider only the binary interaction pattern between the current node and other nodes, i.e., connected or not connected. In practice, however, nodes may also form small recurring structures, i.e., sub-graph structures that occur at high frequency in the network, also referred to as network motifs.
The existing graph neural network models only learn low-dimensional feature vectors of the nodes and ignore the richer high-order information generated among nodes by their different interaction modes, so the graph neural network cannot learn more representative feature information.
Disclosure of Invention
The application provides a method, a device and equipment for enhancing graph neural network information, which are used for solving the technical problem in the prior art that the feature information learned by a graph neural network lacks representativeness because the rich information generated by different interaction modes among network nodes is ignored.
In view of the above, a first aspect of the present application provides a graph neural network information enhancement method, including:
identifying the network motifs in which different nodes participate according to preset node characteristic data and a preset node adjacency matrix;
constructing, based on the network motifs, an initial motif feature vector corresponding to each node;
inputting the initial motif feature vectors into a preset LSTM model for aggregation processing to obtain the self-motif feature vectors of different nodes, wherein the preset LSTM model comprises an attention mechanism;
performing an aggregation operation on the neighboring node motif vectors according to the self-motif feature vectors to obtain the target motif feature vector of each node;
and fusing the target motif feature vector and the initial node feature vector to obtain an enhanced feature vector.
Optionally, before the identifying, according to the preset node characteristic data and the preset node adjacency matrix, the network motifs in which different nodes participate, the method further includes:
and acquiring preset node characteristic data and preset node adjacency matrixes of all nodes in the preset neural network model.
Optionally, the initial node feature vector is obtained by performing feature learning on the preset neural network model.
Optionally, the inputting the initial motif feature vectors into a preset LSTM model for aggregation processing to obtain the self-motif feature vectors of different nodes includes:
inputting the initial motif feature vectors into an LSTM network of the preset LSTM model for feature learning to obtain hidden layer feature vectors;
aggregating the hidden layer feature vectors into the self-motif feature vector through a network layer based on an attention mechanism.
Optionally, the fusing the feature vector of the target motif and the feature vector of the initial node to obtain an enhanced feature vector includes:
fusing the target motif feature vector and the initial node feature vector based on a preset fusion formula to obtain an enhanced feature vector, wherein the preset fusion formula is as follows:
H_output = W_1 H_i ⊙ [1 − σ((W_2 H_i + b_f) ⊙ H′_i)] + H′_i
wherein H_i is the target motif feature vector, H′_i is the initial node feature vector, W_1 and W_2 are parameter matrices, b_f is the bias term, and σ is the activation function.
A second aspect of the present application provides a graph neural network information enhancing apparatus, including:
the motif identification module is used for identifying the network motifs in which different nodes participate according to preset node characteristic data and a preset node adjacency matrix;
the motif vector construction module is used for constructing the initial motif feature vector corresponding to each node based on the network motifs;
the first aggregation module is used for inputting the initial motif feature vectors into a preset LSTM model for aggregation processing to obtain the self-motif feature vectors of different nodes, wherein the preset LSTM model comprises an attention mechanism;
the second aggregation module is used for performing an aggregation operation on the neighboring node motif vectors according to the self-motif feature vectors to obtain the target motif feature vector of each node;
and the feature fusion module is used for fusing the target motif feature vector and the initial node feature vector to obtain an enhanced feature vector.
Optionally, the method further includes:
and the acquisition module is used for acquiring preset node characteristic data and preset node adjacency matrixes of all nodes in the preset neural network model.
Optionally, the first aggregation module is specifically configured to:
inputting the initial motif feature vectors into an LSTM network of the preset LSTM model for feature learning to obtain hidden layer feature vectors;
aggregating the hidden layer feature vectors into the self-motif feature vector through a network layer based on an attention mechanism.
Optionally, the feature fusion module is specifically configured to:
fusing the target motif feature vector and the initial node feature vector based on a preset fusion formula to obtain an enhanced feature vector, wherein the preset fusion formula is as follows:
H_output = W_1 H_i ⊙ [1 − σ((W_2 H_i + b_f) ⊙ H′_i)] + H′_i
wherein H_i is the target motif feature vector, H′_i is the initial node feature vector, W_1 and W_2 are parameter matrices, b_f is the bias term, and σ is the activation function.
A third aspect of the present application provides a graph neural network information enhancing apparatus, the apparatus comprising a processor and a memory;
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to execute the graph neural network information enhancement method according to any one of the first aspect according to the instructions in the program code.
According to the technical scheme, the embodiment of the application has the following advantages:
the application provides a graph neural network information enhancement method, which comprises the following steps: identifying network motifs participated by different nodes according to preset node characteristic data and a preset node adjacency matrix; based on the network motif, constructing an initial motif feature vector corresponding to each node; inputting the initial die body characteristic vector into a preset LSTM model for polymerization treatment to obtain self die body characteristic vectors of different nodes, wherein the preset LSTM model comprises an attention mechanism; performing aggregation operation on the neighboring node motif vectors according to the own motif feature vector to obtain a target motif feature vector of each node; and fusing the feature vector of the target motif and the feature vector of the initial node to obtain an enhanced feature vector.
The graph neural network information enhancement method mainly studies the interaction characteristics of each node of the network within different network motifs. Acquiring the self-motif feature vector of a node reflects the node's interaction characteristics, and the aggregation operation on the neighboring node motif vectors reflects the characteristics of the network motifs between adjacent nodes, so the method analyzes and learns node information of different orders from multiple levels: it considers the network motif structure information of first-order neighboring nodes as well as the feature information of the network node itself, namely the initial node feature vector. The two kinds of feature information are fused to achieve information enhancement, so that the enhanced feature vector is more representative and more reliable. Therefore, the method and the device can solve the technical problem in the prior art that the feature information learned by a graph neural network lacks representativeness because the rich information generated by different interaction modes among network nodes is ignored.
Drawings
Fig. 1 is a schematic flowchart of a graph neural network information enhancement method according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of a graph neural network information enhancement apparatus according to an embodiment of the present application;
fig. 3 is a schematic diagram of example network motifs according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of the preset LSTM model provided in the embodiment of the present application.
Detailed Description
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Interpretation of terms:
1. graph neural network: graph Neural Networks (GNNs) are a class of methods that apply neural networks in deep learning to the graph domain.
2. LSTM model: LSTM (Long Short-Term Memory) is a type of recurrent neural network (RNN) designed mainly to solve the vanishing-gradient and exploding-gradient problems encountered when training on long sequences.
3. Network motif: a node interaction pattern that occurs significantly more frequently in a complex network than in a random network.
4. Adjacency matrix: a matrix representing the adjacency relationships between vertices; it stores the topology of the network in the form of a two-dimensional array.
Network motifs often represent a certain interaction pattern between nodes, so the network motif structures in which the nodes participate contain rich information. Making full use of the network motif information of the nodes can therefore bring a certain improvement to graph neural network learning.
For ease of understanding, referring to fig. 1, the present application provides an embodiment of a graph neural network information enhancement method, including:
Step 101, identifying the network motifs in which different nodes participate according to preset node characteristic data and preset node adjacency matrixes.
Further, before step 101, the method further includes:
and acquiring preset node characteristic data and preset node adjacency matrixes of all nodes in the preset neural network model.
The preset node characteristic data is the feature information of the node itself learned by the network node, i.e., the feature data of each node at the current moment; the preset node adjacency matrix records the connection relationships between adjacent nodes, i.e., whether a connection exists or not, and thus describes the topological characteristics of the nodes.
The same node may exist in a plurality of different network motifs, and the interactions that the same node generates in different network motifs have different meanings, so the feature information generated by different network motifs is very rich.
The network motifs can be customized; based on the topological structure of the preset neural network model and several predefined network motif types, the network motifs of each node can be determined by a matching method.
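As a minimal illustration of such matching only (not the exact procedure of this embodiment, which uses directed motifs), the following Python sketch enumerates instances of a single undirected triangle motif in a dense adjacency matrix; the function name and the NumPy representation are illustrative assumptions.

```python
# Minimal sketch of motif matching, assuming one undirected triangle motif and
# a dense NumPy adjacency matrix; the embodiment itself uses directed motifs.
from itertools import combinations
import numpy as np

def find_triangle_motifs(adj: np.ndarray):
    """Return all node triples (i, j, k) that form a triangle in `adj`."""
    n = adj.shape[0]
    return [(i, j, k) for i, j, k in combinations(range(n), 3)
            if adj[i, j] and adj[j, k] and adj[i, k]]

adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 1],
                [1, 1, 0, 0],
                [0, 1, 0, 0]])
print(find_triangle_motifs(adj))   # [(0, 1, 2)]
```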
Step 102, constructing the initial motif feature vector corresponding to each node based on the network motifs.
An element of the initial motif feature vector represents the number of times the current node and another node connected with it appear together in network motifs. For example, take the current node i as the central node: if the central node i needs to send data to node j, and this connection relationship is used 4 times across all network motifs, i.e., the interaction in which the central node i sends data to node j exists in 4 different network motifs with node i as the central node, then the j-th element of the initial motif feature vector corresponding to the central node i is 4. It can also be understood as the number of times the current node interacts with the other nodes across all network motifs.
In this embodiment, the initial motif feature vector is denoted X; if there are K types of network motifs and N nodes, then K N-dimensional initial motif feature vectors X can be obtained for each node. It can be seen that the initial motif feature vectors record how many times the node is used in all network motif instances.
To facilitate analysis of the information generated by the network motifs, K motif-based adjacency matrices A_k (k = 1, 2, ..., K) may be constructed, and these network-motif-based adjacency matrices are then used to analyze the network motifs. First, it should be understood that in the above process each node obtains K N-dimensional row vectors, which form a matrix, so the N nodes yield N matrices of size K × N, where the first row of each matrix is the initial motif feature vector X_1 under the first network motif, the second row is the initial motif feature vector X_2 under the second network motif, and so on. Taking out and combining the first row vectors of all N matrices gives the adjacency matrix of the first network motif, taking out and combining all second row vectors gives the adjacency matrix of the second network motif, and finally K network-motif-based adjacency matrices are obtained for analyzing the interaction characteristics generated by the network motifs. It can be seen that, in effect, the initial motif feature vectors of the same node are merely split into different adjacency matrices according to the network motif types; the nature of the interactions between the nodes is not changed, so the implementation of the scheme is not affected.
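The following Python sketch illustrates this construction under stated assumptions: motif_instances[k] is assumed to hold the node tuples matched for the k-th motif type, every ordered pair of nodes co-occurring in a motif instance is counted, and the names and dense-array representation are illustrative rather than taken from the application.

```python
# Hedged sketch: build K motif-based adjacency matrices A_k and each node's
# K x N initial motif feature matrix from lists of matched motif instances.
# `motif_instances[k]` is assumed to be a list of node tuples for motif type k.
import numpy as np

def build_motif_features(motif_instances, num_nodes):
    K = len(motif_instances)
    A = np.zeros((K, num_nodes, num_nodes), dtype=np.int64)
    for k, instances in enumerate(motif_instances):
        for nodes in instances:
            for i in nodes:                 # count every ordered pair that
                for j in nodes:             # co-occurs in this motif instance
                    if i != j:
                        A[k, i, j] += 1
    # row i of A[k] is node i's N-dimensional initial motif vector under
    # motif type k, so node i's K x N feature matrix is A[:, i, :]
    return A, [A[:, i, :] for i in range(num_nodes)]

A, node_feats = build_motif_features([[(0, 1, 2)], [(1, 2, 3), (0, 1, 3)]], 4)
print(A.shape, node_feats[1].shape)         # (2, 4, 4) (2, 4)
```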
Referring to fig. 3, the embodiment of the present application gives 6 example network motifs. It can be seen that the numbers of nodes contained in different types of network motifs are not necessarily the same, and the same number of nodes may also form multiple different network motifs, that is, the same three nodes may have different interaction modes. Nodes joined by a connecting line are adjacent nodes with information interaction, and an arrow indicates the direction of the information interaction; that is, directed network motifs are used in the embodiment of the present application, which better retain and learn the direction information in the network.
Step 103, inputting the initial motif feature vectors into a preset LSTM model for aggregation processing to obtain the self-motif feature vectors of different nodes, wherein the preset LSTM model comprises an attention mechanism.
Further, step 103 includes:
inputting the initial motif feature vectors into the LSTM network of the preset LSTM model for feature learning to obtain hidden layer feature vectors;
and aggregating the hidden layer feature vectors into the self-motif feature vector through a network layer based on the attention mechanism.
The attention mechanism originated from the study of human vision. In cognitive science, because of bottlenecks in information processing, humans selectively attend to a part of all available information while ignoring the rest of the visible information; this is called the attention mechanism. A typical neural machine translation model performs sequence-to-sequence conversion in an "encoding-decoding" manner. This approach has two problems: the first is the capacity bottleneck of the encoding vector, i.e., all information of the source language needs to be stored in the encoding vector to be decoded effectively; the second is the long-distance dependency problem, i.e., the loss of information during long-distance transfer in the encoding and decoding process. By introducing an attention mechanism, the information at each position of the source language is preserved; when each target-language word is generated during decoding, the attention mechanism directly selects the relevant source-language information as an aid, which alleviates the above technical problems.
Referring to fig. 4, the vectors input into the preset LSTM model are the initial motif feature vectors of each node. Each node has K N-dimensional initial motif feature vectors, so after processing by the LSTM network in the preset LSTM model, K hidden layer feature vectors output by the LSTM units are obtained. Since there is no chronological order, the order of the vectors input into the model is arbitrary and not restricted. Specifically, the LSTM network is a bidirectional network, so each hidden layer feature vector is obtained by concatenating a forward vector and a backward vector:
H_ik = Concatenate(forward-LSTM_ik(X_ik), backward-LSTM_i(K+1−k)(X_ik));
wherein forward-LSTM_ik denotes the k-th forward unit of the bidirectional LSTM network into which node i is input; backward-LSTM_i(K+1−k) denotes the (K+1−k)-th backward unit of the bidirectional LSTM network into which node i is input; and X_ik is an initial motif feature vector of node i.
A network layer based on the attention mechanism performs an aggregation operation on all hidden layer feature vectors of node i to obtain its self-motif feature vector:
H_i = Attention(H_i1, H_i2, ..., H_iK);
in fig. 4, h_ik is the forward vector and H_ik is the backward vector.
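A minimal PyTorch sketch of this bidirectional-LSTM-plus-attention aggregation is given below; the layer sizes, the single-linear attention scoring and the module name MotifAggregator are assumptions, not details taken from the application.

```python
# Hedged PyTorch sketch: a node's K initial motif vectors are passed through a
# bidirectional LSTM and pooled into one self-motif vector by attention weights.
import torch
import torch.nn as nn

class MotifAggregator(nn.Module):
    def __init__(self, num_nodes: int, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(num_nodes, hidden,
                            batch_first=True, bidirectional=True)
        self.score = nn.Linear(2 * hidden, 1)        # attention score per step

    def forward(self, x):                            # x: (batch, K, N)
        h, _ = self.lstm(x)                          # (batch, K, 2*hidden)
        attn = torch.softmax(self.score(h), dim=1)   # weights over K motifs
        return (attn * h).sum(dim=1)                 # self-motif vector H_i

agg = MotifAggregator(num_nodes=100)
H = agg(torch.rand(32, 6, 100))   # 32 nodes, K = 6 motif types, N = 100
print(H.shape)                    # torch.Size([32, 128])
```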
Step 104, performing an aggregation operation on the neighboring node motif vectors according to the self-motif feature vectors to obtain the target motif feature vector of each node.
Through the above operation process, each node obtains its self-motif feature vector H_i; by further analyzing the information generated by the network motifs of adjacent nodes, the feature expression capability of the network information can be enhanced. The specific aggregation operation on the neighboring node motif vectors is as follows:
H_i ← H_i + α · Σ_{j∈N(i)} w_ij · H_j
wherein α is the learning rate controlling the degree of aggregation of the neighboring node motif vectors, N(i) denotes the set of neighboring nodes of node i, node j is a neighboring node of node i, and the corresponding H_j is the self-motif feature vector of node j; the weight w_ij is obtained by normalizing the product of the transpose of the self-motif feature vector of node i and the self-motif feature vector of node j, so the information weight for aggregating a neighboring node is determined by the similarity between the self-motif feature vectors of node i and node j. The updated H_i is the target motif feature vector of node i.
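A possible vectorized implementation of this aggregation step is sketched below; because the normalization is only described qualitatively above, a softmax over each node's neighbors is assumed, and the value of the learning rate alpha and the function name are illustrative.

```python
# Hedged sketch of the neighbour motif aggregation: neighbour self-motif
# vectors are added with weights given by a softmax-normalised dot-product
# similarity, scaled by a learning rate alpha (assumed form).
import torch

def aggregate_neighbor_motifs(H, adj, alpha: float = 0.5):
    """H: (N, d) self-motif vectors; adj: (N, N) 0/1 adjacency matrix."""
    sim = H @ H.t()                                   # H_i^T H_j for all pairs
    sim = sim.masked_fill(adj == 0, float("-inf"))    # restrict to neighbours
    w = torch.softmax(sim, dim=1)                     # normalise per node
    w = torch.nan_to_num(w)                           # isolated nodes get 0
    return H + alpha * w @ H                          # target motif vectors

H = torch.rand(4, 8)
adj = torch.tensor([[0, 1, 1, 0],
                    [1, 0, 1, 1],
                    [1, 1, 0, 0],
                    [0, 1, 0, 0]])
print(aggregate_neighbor_motifs(H, adj).shape)        # torch.Size([4, 8])
```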
Step 105, fusing the target motif feature vector and the initial node feature vector to obtain the enhanced feature vector.
Further, the initial node feature vector is obtained by performing feature learning with the preset neural network model; that is, it is the feature information of the node itself, describing the features extracted by the graph neural network.
Further, fusing the feature vector of the target motif and the feature vector of the initial node based on a preset fusion formula to obtain an enhanced feature vector, wherein the preset fusion formula is as follows:
H_output = W_1 H_i ⊙ [1 − σ((W_2 H_i + b_f) ⊙ H′_i)] + H′_i
wherein H_i is the target motif feature vector, H′_i is the initial node feature vector, W_1 and W_2 are parameter matrices, b_f is the bias term, and σ is the activation function.
After feature fusion, the enhanced feature vector not only contains the original node feature information but also incorporates the interaction information based on network motifs and the network motif information of adjacent nodes; the latter is high-order structural information that the original node feature vector cannot learn, so the node can perceive richer node interaction modes. When the output value of the activation function σ in the network model becomes smaller, the original node feature vector is fused with more of the network motif feature vectors of the nodes; conversely, the larger the output value of the activation function σ, the fewer features are fused. That is, the feature fusion process is a dynamic process, and the model can adapt itself according to the quality of the actual effect, which ensures the efficiency of the information enhancement. The activation function σ is expressed as:
σ(Z) = 1/(1 + e^(−Z))
wherein Z is an error function.
The obtained enhanced feature vector can improve the performance of the model, and the target effect can be improved to a certain extent whether the task is classification or prediction.
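For concreteness, the fusion formula can be transcribed as the following PyTorch module; it assumes that σ is the logistic sigmoid and that W_1, W_2 and b_f are learnable parameters of matching dimensions, which is one plausible reading rather than the only possible implementation.

```python
# Hedged transcription of the fusion formula
#   H_output = W1*H_i ⊙ [1 - σ((W2*H_i + b_f) ⊙ H'_i)] + H'_i
# assuming σ is the logistic sigmoid and W1, W2, b_f are learnable.
import torch
import torch.nn as nn

class MotifFusion(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.w1 = nn.Linear(dim, dim, bias=False)   # parameter matrix W1
        self.w2 = nn.Linear(dim, dim, bias=True)    # W2 together with b_f

    def forward(self, h_motif, h_node):
        # h_motif: target motif feature vector H_i; h_node: initial H'_i
        gate = 1.0 - torch.sigmoid(self.w2(h_motif) * h_node)
        return self.w1(h_motif) * gate + h_node

fuse = MotifFusion(dim=128)
out = fuse(torch.rand(32, 128), torch.rand(32, 128))
print(out.shape)                                    # torch.Size([32, 128])
```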
The graph neural network information enhancement method provided by the embodiment of the application mainly studies the interaction characteristics of each node of the network within different network motifs. Acquiring the self-motif feature vectors of different nodes reflects the nodes' interaction characteristics, and the aggregation operation on the neighboring node motif vectors reflects the characteristics of the network motifs between adjacent nodes, so the method analyzes and learns node information of different orders from multiple levels: it considers the network motif structure information of first-order neighboring nodes as well as the feature information of the network node itself, namely the initial node feature vector. The two kinds of feature information are fused to achieve information enhancement, so that the enhanced feature vector is more representative and more reliable. Therefore, the method and the device can solve the technical problem in the prior art that the feature information learned by a graph neural network lacks representativeness because the rich information generated by different interaction modes among network nodes is ignored.
The above is an embodiment of a method for enhancing information of a graph neural network provided by the present application, and the following is an embodiment of an apparatus for enhancing information of a graph neural network provided by the present application.
For ease of understanding, referring to fig. 2, the present application further provides an embodiment of a graph neural network information enhancing apparatus, including:
the motif identification module 201 is used for identifying the network motifs in which different nodes participate according to preset node characteristic data and a preset node adjacency matrix;
the motif vector construction module 202 is configured to construct an initial motif feature vector corresponding to each node based on a network motif;
the first aggregation module 203 is configured to input the initial motif feature vectors into a preset LSTM model for aggregation processing to obtain the self-motif feature vectors of different nodes, wherein the preset LSTM model comprises an attention mechanism;
the second aggregation module 204 is configured to perform an aggregation operation on the neighboring node motif vectors according to the self-motif feature vectors to obtain the target motif feature vector of each node;
and the feature fusion module 205 is configured to fuse the feature vector of the target motif and the feature vector of the initial node to obtain an enhanced feature vector.
Further, the apparatus further includes:
an obtaining module 206, configured to obtain preset node characteristic data and preset node adjacency matrixes of all nodes in the preset neural network model.
Further, the first aggregation module 203 is specifically configured to:
inputting the initial motif feature vectors into the LSTM network of the preset LSTM model for feature learning to obtain hidden layer feature vectors;
and aggregating the hidden layer feature vectors into self-motif feature vectors through the network layer based on the attention mechanism.
Further, the feature fusion module 205 is specifically configured to:
fusing the feature vector of the target motif and the feature vector of the initial node based on a preset fusion formula to obtain an enhanced feature vector, wherein the preset fusion formula is as follows:
H_output = W_1 H_i ⊙ [1 − σ((W_2 H_i + b_f) ⊙ H′_i)] + H′_i
wherein H_i is the target motif feature vector, H′_i is the initial node feature vector, W_1 and W_2 are parameter matrices, b_f is the bias term, and σ is the activation function.
The above is an embodiment of a graph neural network information enhancing apparatus provided by the present application, and the following is an embodiment of a graph neural network information enhancing apparatus provided by the present application.
The application provides a graph neural network information enhancement device, which comprises a processor and a memory;
the memory is used for storing the program codes and transmitting the program codes to the processor;
the processor is used for executing any graph neural network information enhancement method in the method embodiments according to instructions in the program code.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be substantially implemented or contributed to by the prior art, or all or part of the technical solution may be embodied in a software product, which is stored in a storage medium and includes instructions for executing all or part of the steps of the method described in the embodiments of the present application through a computer device (which may be a personal computer, a server, or a network device). And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (10)

1. A method for enhancing information of a graph neural network is characterized by comprising the following steps:
identifying the network motifs in which different nodes participate according to preset node characteristic data and a preset node adjacency matrix;
based on the network motif, constructing an initial motif feature vector corresponding to each node;
inputting the initial motif feature vectors into a preset LSTM model for aggregation processing to obtain the self-motif feature vectors of different nodes, wherein the preset LSTM model comprises an attention mechanism;
performing an aggregation operation on the neighboring node motif vectors according to the self-motif feature vectors to obtain the target motif feature vector of each node;
and fusing the target motif feature vector and the initial node feature vector to obtain an enhanced feature vector.
2. The graph neural network information enhancement method according to claim 1, wherein before the identifying, according to the preset node characteristic data and the preset node adjacency matrix, the network motifs in which different nodes participate, the method further comprises:
and acquiring preset node characteristic data and preset node adjacency matrixes of all nodes in the preset neural network model.
3. The graph neural network information enhancement method according to claim 2, wherein the initial node feature vector is obtained by performing feature learning with the preset neural network model.
4. The graph neural network information enhancement method according to claim 1, wherein the inputting the initial motif feature vectors into a preset LSTM model for aggregation processing to obtain the self-motif feature vectors of different nodes comprises:
inputting the initial motif feature vectors into an LSTM network of the preset LSTM model for feature learning to obtain hidden layer feature vectors;
aggregating the hidden layer feature vectors into self-motif feature vectors through a network layer based on an attention mechanism.
5. The graph neural network information enhancement method according to claim 1, wherein the fusing the target motif feature vector and the initial node feature vector to obtain an enhanced feature vector comprises:
fusing the target motif feature vector and the initial node feature vector based on a preset fusion formula to obtain an enhanced feature vector, wherein the preset fusion formula is as follows:
H_output = W_1 H_i ⊙ [1 − σ((W_2 H_i + b_f) ⊙ H′_i)] + H′_i
wherein H_i is the target motif feature vector, H′_i is the initial node feature vector, W_1 and W_2 are parameter matrices, b_f is the bias term, and σ is the activation function.
6. A graph neural network information enhancement apparatus, comprising:
the motif identification module is used for identifying the network motifs in which different nodes participate according to preset node characteristic data and a preset node adjacency matrix;
the motif vector construction module is used for constructing an initial motif feature vector corresponding to each node based on the network motif;
the first aggregation module is used for inputting the initial motif feature vectors into a preset LSTM model for aggregation processing to obtain the self-motif feature vectors of different nodes, wherein the preset LSTM model comprises an attention mechanism;
the second aggregation module is used for performing an aggregation operation on the neighboring node motif vectors according to the self-motif feature vectors to obtain the target motif feature vector of each node;
and the characteristic fusion module is used for fusing the target motif characteristic vector and the initial node characteristic vector to obtain an enhanced characteristic vector.
7. The graph neural network information enhancement apparatus according to claim 6, further comprising:
and the acquisition module is used for acquiring preset node characteristic data and preset node adjacency matrixes of all nodes in the preset neural network model.
8. The graph neural network information enhancement apparatus according to claim 6, wherein the first aggregation module is specifically configured to:
inputting the initial motif feature vectors into an LSTM network of the preset LSTM model for feature learning to obtain hidden layer feature vectors;
aggregating the hidden layer feature vectors into self-motif feature vectors through a network layer based on an attention mechanism.
9. The graph neural network information enhancement apparatus according to claim 6, wherein the feature fusion module is specifically configured to:
fusing the target motif feature vector and the initial node feature vector based on a preset fusion formula to obtain an enhanced feature vector, wherein the preset fusion formula is as follows:
H_output = W_1 H_i ⊙ [1 − σ((W_2 H_i + b_f) ⊙ H′_i)] + H′_i
wherein H_i is the target motif feature vector, H′_i is the initial node feature vector, W_1 and W_2 are parameter matrices, b_f is the bias term, and σ is the activation function.
10. A graph neural network information enhancing device, the device comprising a processor and a memory;
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to execute the graph neural network information enhancement method of any one of claims 1-5 according to instructions in the program code.
CN202110297502.5A 2021-03-19 2021-03-19 Method, device and equipment for enhancing graph neural network information Pending CN112862003A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110297502.5A CN112862003A (en) 2021-03-19 2021-03-19 Method, device and equipment for enhancing graph neural network information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110297502.5A CN112862003A (en) 2021-03-19 2021-03-19 Method, device and equipment for enhancing graph neural network information

Publications (1)

Publication Number Publication Date
CN112862003A true CN112862003A (en) 2021-05-28

Family

ID=75993666

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110297502.5A Pending CN112862003A (en) 2021-03-19 2021-03-19 Method, device and equipment for enhancing graph neural network information

Country Status (1)

Country Link
CN (1) CN112862003A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115564013A (en) * 2021-08-09 2023-01-03 中山大学 Method for improving network representation learning representation capability, model training method and system
CN115564013B (en) * 2021-08-09 2024-02-09 中山大学 Method for improving learning representation capability of network representation, model training method and system


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20210528