CN116340842A

CN116340842A - Common attention-based heterogeneous graph representation learning method

Info

Publication number: CN116340842A
Application number: CN202310309514.4A
Authority: CN
Inventors: 张政; 季颖生; 佴湲坤; 沈佳辰; 陶振宇; 蔡明�
Original assignee: Shanghai Zhongyi Turing Digital Technology Co ltd
Current assignee: Shanghai Zhongyi Turing Digital Technology Co ltd
Priority date: 2023-03-27
Filing date: 2023-03-27
Publication date: 2023-06-27

Abstract

The invention provides a heterogeneous diagram representation learning method based on common attention, which comprises the following steps: determining local attention scores of neighbor nodes of the target element path based on a preset attention parameter vector, a first hidden feature vector, a second hidden feature vector and a semantic fusion vector of the target element path; determining a global attention score of the target element path neighbor node based on the local attention score, the number of other element paths corresponding to the target element path neighbor node and a preset super parameter; determining an updated semantic representation vector of the target node based on the global attention score, the first hidden feature vector, the second hidden feature vector, and the semantic fusion vector; and determining the attention weight of each target element path based on the updated semantic representation vector of the node in each target element path in the heterogeneous graph, and determining the target semantic representation vector of the target node based on the attention weight and the updated semantic representation vector. By the method, high-quality node representation can be obtained.

Description

Common attention-based heterogeneous graph representation learning method

Technical Field

The invention relates to the technical field of heterogeneous diagram representation learning, in particular to a heterogeneous diagram representation learning method based on common attention.

Background

Graph representation learning is a method of mapping graph data to low-dimensional vectors that aims to capture the topology on a graph, learn the representation of nodes or edges in low-dimensional space. In recent years, heterogeneous charts represent learning that has received a great deal of attention.

The graph representation learning method comprises graph representation learning based on a homogeneous graph and graph representation learning based on a heterogeneous graph, wherein the homogeneous graph is composed of nodes and edges of the same type. In the real world, systems are often made up of complex interrelationships of various components with one another. The traditional graph based on the homogeneous graph shows that the learning method can cause a great deal of heterogeneous information loss, thereby affecting the effect of downstream tasks. Thus, more and more methods attempt to abstract a real system into a heterogram, i.e., a heterogram is composed of multiple types of nodes or edges; the advantage of heterogeneous maps is that more information can be integrated than homogeneous maps.

However, heterogeneous graphs have a complex topology, and nodes and edges have heterogeneity. In the related art, semantic information on a heterogeneous graph is generally not efficiently and sufficiently mined, thereby limiting the semantic expression capability of the heterogeneous graph.

Therefore, how to effectively mine the abundant semantic information in the heterogeneous graph is a problem to be solved at present.

Disclosure of Invention

Aiming at the problems existing in the prior art, the embodiment of the invention provides a heterogeneous diagram representation learning method based on common attention.

The invention provides a heterogeneous diagram representation learning method based on common attention, which comprises the following steps:

determining local attention scores of neighbor nodes of any target element path in a heterogeneous graph based on preset attention parameter vectors, first hidden feature vectors, second hidden feature vectors and semantic fusion vectors of all element path examples between the target node and the neighbor nodes of the target element path for at least one target node of the target element path in the heterogeneous graph; the local attention score is used for representing the importance degree of the target element path neighbor node to the target node based on the target element path; the first hidden feature vector is the hidden feature vector of the target node, and the second hidden feature vector is the hidden feature vector of the target element path neighbor node;

determining a global attention score of the target element path neighbor node based on the local attention score, the number of other element paths corresponding to the target element path neighbor node in the heterogeneous graph and a preset super parameter; the global attention score is used for representing the importance degree of all meta-path neighbor nodes in the heterogeneous graph to the target node based on all meta-paths in the heterogeneous graph, and the super-parameters are used for representing the importance degree of all meta-path examples between the target node and the target meta-path neighbor nodes to the target node;

Determining an updated semantic representation vector corresponding to the target node based on the global attention score, the first hidden feature vector, the second hidden feature vector, and the semantic fusion vector;

and determining the attention weight of each target element path based on the updated semantic representation vector of the node in each target element path in the heterogeneous graph, and determining the target semantic representation vector corresponding to the target node based on the attention weight of each target element path and the updated semantic representation vector of the node in each target element path.

Optionally, the determining, based on the global attention score, the first hidden feature vector, the second hidden feature vector, and the semantic fusion vector, an updated semantic representation vector corresponding to the target node includes:

normalizing the global attention score to generate a target attention score;

performing aggregation processing on the target attention score, the second hidden feature vector and the semantic fusion vector to obtain an aggregation message vector corresponding to the target node;

the update semantic representation vector is determined based on the aggregate message vector and the first hidden feature vector.

Optionally, the determining the global attention score of the target meta-path neighboring node based on the local attention score, the number of other meta-paths corresponding to the target meta-path neighboring node in the heterogeneous graph, and a preset super-parameter includes:

determining a global attention score of the target node by using a formula (1) based on the local attention score, the number of other element paths corresponding to the neighbor nodes of the target element path in the heterogeneous graph and a preset super parameter; the formula (1) is:

wherein, the liquid crystal display device comprises a liquid crystal display device,

representing the global attention score; n represents the number of other meta paths corresponding to the target meta path neighbor nodes in the heterogeneous graph; η represents the preset hyper-parameters; />

Representing the local attention score.

Optionally, before the determining the local attention score of the target meta-path neighboring node based on the preset attention parameter vector, the first hidden feature vector, the second hidden feature vector, and the semantic fusion vector of all meta-path instances between the target node and the target meta-path neighboring node, the method further includes:

acquiring at least one meta-path instance corresponding to the target meta-path, wherein the data structure of each meta-path instance is the same as the data structure of the target meta-path;

Encoding each meta-path instance to generate a semantic representation vector of each meta-path instance;

and carrying out aggregation processing on the semantic representation vectors of each meta-path instance to generate the semantic fusion vector.

Optionally, the meta-path instance includes l+1 child nodes and L relationship edges, where L is a positive integer;

encoding each meta-path instance to generate a semantic representation vector of each meta-path instance, including:

generating a child node sequence and a relation edge sequence corresponding to each meta-path instance aiming at each meta-path instance; the child node sequence comprises L+1 child nodes, and the relationship edge sequence comprises L relationship edges;

inputting a child node hidden characteristic vector sequence into a first model group, and updating the child node hidden characteristic vector sequence to obtain an updated hidden characteristic vector of the child node; the first model group comprises a first cyclic neural network model (RNN) and a second RNN, and the first RNN and the second RNN are connected in a jumping manner; the child node hidden feature vector sequence is generated based on the child node sequence;

inputting a relation edge hidden characteristic vector sequence into the first model group, and updating the relation edge hidden characteristic vector sequence to obtain an updated hidden characteristic vector of the relation edge; the relation edge hidden feature vector sequence is generated based on the child node sequence;

And generating a semantic representation vector of the target element path instance based on the updated hidden feature vector of the child node and the updated hidden feature vector of the relation edge.

Optionally, after the determining the target semantic representation vector corresponding to the target node, the method further includes:

inputting the target semantic representation vector into an initial target service classification model, and training the initial target service classification model by using a preset loss function until the initial target service classification model converges to obtain a trained target service classification model and a target semantic representation vector which is output by the target service classification model and is suitable for target service;

and inputting the target semantic representation vector suitable for the target service into the trained target service classification model to obtain a classification result aiming at the target service, which is output by the trained target service classification model.

The heterogeneous graph representation learning method based on common attention provided by the invention can determine the local attention score of the neighbor node of the target element path based on the preset attention parameter vector of the target element path, the hidden feature vector of the target node, the hidden feature vector of the neighbor node of the target element path and the semantic fusion vector of all element path examples between the target node and the neighbor node of the target element path aiming at least one target node of any target element path in the heterogeneous graph, wherein the local attention score is used for representing the importance degree of the neighbor node of the target element path to the target node based on the target element path; then, based on the local attention score, the number of other element paths corresponding to the target element path neighbor nodes in the heterogeneous graph and preset super parameters, the global attention score of the target element path neighbor nodes can be determined, wherein the global attention score is used for representing the importance degree of the target element path neighbor nodes to the target nodes based on all element paths in the heterogeneous graph; based on the global attention score, the hidden feature vector of the target node, the hidden feature vector of the neighbor node of the target node and the semantic fusion vector, the complex semantics can be adaptively and deeply mined, so that the semantic feature representation of the target node is enhanced, the updated semantic representation vector corresponding to the target node is obtained, and the defect that the conventional heterogeneous graph mining algorithm excessively relies on element paths to carry out semantic mining is overcome; based on the attention weight of each target element path and the updated semantic representation vector of the nodes in each target element path, the determined target semantic representation vector corresponding to the target node can be fused with the semantic importance on different target element paths, so that the target node can effectively express rich semantic information in the heterogeneous graph, and high-quality node representation is obtained.

Drawings

In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic flow chart of a common attention-based heterogeneous diagram representation learning method provided by the invention;

FIG. 2 is a schematic illustration of a co-attention based semantic feature computation provided by the present invention;

FIG. 3 is a schematic diagram of a meta-path instance encoding structure provided by an embodiment of the present invention;

FIG. 4 is a schematic diagram showing a learning method based on a common attention heterogram.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

In order to facilitate a clearer understanding of the various embodiments of the present application, some relevant background knowledge is first presented below.

Graph representation learning is a technical approach to map graph data to low-dimensional vectors, which aims to capture the topology on a graph, learn the representation of nodes or edges in low-dimensional space, also known as graph embedding. Early graphs represent that the learning algorithm is mainly based on homogeneous graphs, i.e. graphs are made up of nodes and edges of the same type. In the real world, systems are often made up of complex interrelationships of various components with one another. The traditional method based on the homogeneous map can cause a great deal of heterogeneous information loss, thereby affecting the effect of downstream tasks. Thus, more and more methods attempt to abstract a real-world system into a heterogeneous graph, i.e., a graph is composed of multiple types of nodes or edges. The advantage of heterogeneous maps is that more information can be integrated than homogeneous maps.

Heterograms represent a study that has received a great deal of attention in recent years, and this problem can be summarized as: through heterogeneous graph embedding, heterogeneous graph representation is learned in a low-dimensional space, and complex structure and semantic information on the heterogeneous graph are reserved as far as possible and used for downstream machine learning tasks. The biggest challenge in learning heterograms is how to efficiently mine complex structures and rich semantics on the graph. The concrete steps are as follows:

1) Complex topologies. Heterogeneous graphs are abstract representations of complex systems in reality, and contain rich semantics including node objects of different types and interrelationships between objects with different meanings. How to represent the fusion of such information becomes a great challenge;

2) Nodes and edges have heterogeneity. On heterogeneous graphs, different types of nodes and edges represent different meanings, with different characteristics, including dimensions, data types, and numerical ranges, etc. How to aggregate different types of neighbors and their heterogeneous information is another challenge.

The common heterogeneous graph representation learning method has the following technical means.

a) A divide-and-conquer based method.

The main idea of the method is based on a random walk method. The method mainly comprises the following steps: the method comprises the steps of presetting a walk mode (such as a meta path, a meta structure and the like), carrying out walk on a heterogeneous graph according to the mode, sampling to obtain a node sequence containing heterostructure and semantic information, and then carrying out representation learning on the node sequence. Typical algorithms are e.g. metaath 2vec, HIN2Vec, HINE, ESim etc.

b) An automatic encoder-based method.

The method mainly comprises the following steps: and constructing encoder learning node information and graph structure information by adopting a neural network model, and reconstructing original heterogeneous graph information by a decoder. First, the sub-graph structure is extracted by techniques such as meta-path. The design encoder encodes the sub-graph structure and then fuses. Finally, a reconstruction loss function is designed to train the whole network. Typical algorithms are BL-MNE, SHINE, etc.

c) Based on the method of generating the challenge network.

The method mainly comprises the following steps: a robust node representation is generated by a mutual game between the generator and the discriminator. Here the generator and discriminator need to integrate the relationships between the different nodes and capture heterogeneous semantic information. Typical algorithms are e.g. HeGAN, MV-ACM, etc.

Message-based methods. The method mainly comprises the following steps: and designing a message passing mechanism and constructing a graph neural network model aiming at the heterogeneous graph. And collecting node and neighborhood information thereof, then fusing, and delivering a fused result to the neighbor nodes so as to obtain higher-order knowledge. Finally, a node representation is obtained. Typical algorithms are HAN, hetGNN, MAGNN, etc.

However, in the related art, the heterogeneous graph representation learning has the following drawbacks:

a) The method based on division and treatment comprises the following steps:

the method has the main defects that the topological structure relationship of the heterogeneous diagram is complex, and the heterogeneous diagram is decomposed into a plurality of homogeneous diagrams, so that heterogeneous information is lost, and the heterogeneous structure and semantic information on the diagram cannot be effectively and fully mined.

b) A random walk-based method:

the method is mainly used for analyzing the graph structure, namely learning the relation between nodes on the graph, and cannot be used for mining heterogeneous attributes of different types of nodes and edges on the graph.

c) An automatic encoder-based method:

the method based on the automatic encoder mainly relies on an objective function for reconstruction, but the topological structure of the heterogeneous graph is complex, and a comprehensive reconstruction target is difficult to design, so that the heterogeneous structure and semantic information on the graph cannot be fully mined.

d) Based on the method of generating the countermeasure network:

the main defects are that: firstly, the method based on generating the countermeasure network is often high in training cost and is not suitable for a large-scale network; secondly, most of the existing methods generate samples from known nodes, and generalize the unknown nodes with poor capability; finally, existing methods still lack empirical knowledge of the combinations for the complex semantics contained in the heterograms.

e) Message-based methods.

Firstly, because the network structure is complex, the noise is easier to fit; secondly, the method has higher overall space-time complexity, which results in relatively higher training difficulty; finally, most of the existing methods adopt a meta path or a meta graph to mine complex semantics in heterogeneous graphs, which on one hand needs more accurate priori knowledge and is difficult to realize in some scenes; on the other hand, the expression capacity of the model is limited by doing so, and the complex semantics cannot be further deeply mined.

In summary, in order to effectively mine abundant semantic information in the heterogeneous graphs, the embodiment of the invention provides a heterogeneous graph representation learning method based on common attention.

The common-attention-based heterogeneous diagram representation learning method provided by the invention is specifically described below with reference to fig. 1 to 4. Fig. 1 is a schematic flow chart of a learning method represented by a hetero graph based on common attention, and referring to fig. 1, the method includes steps 101-104, in which:

step 101, determining a local attention score of a target element path neighbor node according to a preset attention parameter vector, a first hidden feature vector, a second hidden feature vector and a semantic fusion vector of all element path examples between the target node and the target element path neighbor node of any target element path in a heterogeneous graph; the local attention score is used for representing the importance degree of the target element path neighbor node to the target node based on the target element path; the first hidden feature vector is the hidden feature vector of the target node, and the second hidden feature vector is the hidden feature vector of the target element path neighbor node.

In this embodiment, the heterogeneous graph includes a plurality of target element paths, each of which includes a plurality of target nodes and a relationship edge connecting the target nodes.

Wherein a given heterogeneous information attribute network (i.e., heterogeneous graph) may be represented as

The node set comprises a plurality of target nodes, which can be expressed as +.>

The relation edge set including a plurality of relation edges connecting the target nodes can be expressed as +.>

Note that, node type set

Satisfying the node type mapping function phi:>

relation edge type set +.>

Satisfy the relation edge type mapping function psi:>

and satisfy->

Wherein the node attribute set comprises the attribute of each target node, and the node attribute set can be expressed as +.>

The attribute set of the relationship side comprises the attribute of each relationship side, and the attribute set of the relationship side can be expressed as

The node attribute matrix is composed of node attribute matrixes of different types and relation edge attribute matrixes of different types.

Wherein the method comprises the steps of

And->

Respectively representing the number of different types of target nodes and the number of different types of relation edges in the heterogeneous information attribute network,/->

And->

Respectively representing the attribute number of the corresponding node type and the attribute number of the corresponding relation edge type.

Further, in the heterogeneous information network, different nodes are generally associated by adopting meta paths, so that rich semantic information contained in the meta paths can be embodied, and the heterogeneous network is facilitated to be mined. The meta-path P is denoted here as

Is a path of (a) describing node type A ₁ And A _l+1 A complex relationship between->

Wherein->

Representing complex operators between relationships.

In particular, the meta-path instance P of P represents a sequence of nodes and relationship edges that satisfy the meta-path definition. Here, for the input heterogeneous diagram

There is a set of pre-set meta-paths +.>

In one implementation, the meta-path +.>

The head and tail target nodes of (a) are all target type nodes. Thus, for each target type node, a different meta-path sub-graph may be extracted from the set of meta-paths.

In this embodiment, neighbor message aggregation is first required. FIG. 2 is a schematic diagram of the calculation of semantic features based on common attention according to the present invention, as shown in FIG. 2, and in FIG. 2, a plurality of meta-path subgraphs G are included ₁ 、G ₂ 、...、G _k The method comprises the steps of carrying out a first treatment on the surface of the By introducing other semantic information and sharing its attention parameter a ₁ ，W ₁ ；A ₂ ，W ₂ ；...；A _k ，W _k Obtaining a semantic representation vector h of the local semantics ₁ 、h ₂ 、...、h _k 。

Specifically, a target node u and a set of element paths are given

For a particular target meta-path P _k It is required to base the target meta path P _k Is +.>

Hidden feature vector h of target node _u Hidden target element path neighbor node Feature vector h _v And semantic fusion vector +.A semantic fusion vector of all meta-path examples between the target node and the target meta-path neighbor node +.>

Determining the importance degree of the target element path neighbor node v to the target node u based on the target element path +.>

In particular, the local graph attention scoring function, i.e., the expression (2), may be used:

representing local attention scores of the target element path neighbor nodes; />

Is the target meta-path P _k Is a vector of attention parameters of (a); />

Representing a parameter matrix; />

A semantic fusion vector representing a meta-path; h is a _u A hidden feature vector representing the target node; h is a _v A hidden feature vector representing a target element path neighbor node; sigma (·) represents the activation function; the term "vector" means a vector concatenation.

In the embodiment, the importance degree of the target element path neighbor node to the target node based on the target element path can be determined by calculating the local attention score of the target element path neighbor node; namely, the importance of different neighbor nodes in the target element path can be learned through the local attention score, and the method has strong characterization capability.

102, determining a global attention score of the target element path neighbor node based on the local attention score, the number of other element paths corresponding to the target element path neighbor node in the heterogeneous diagram and a preset super parameter; the global attention score is used for representing the importance degree of all meta-path neighbor nodes in the heterogeneous graph to the target node based on all meta-paths in the heterogeneous graph, and the super-parameters are used for representing the importance degree of all meta-path examples between the target node and the target meta-path neighbor nodes to the target node.

In this embodiment, the local attention score is based on the local importance of a single semantic, where global information is lacking. Therefore, cross-semantic information needs to be introduced to calculate the global attention value of the target element path neighbor node, so that the importance degree of all element path examples between the target node and the target element path neighbor node to the target node can be calculated more accurately, and the noise is reduced.

Optionally, based on the local attention score of each target node, the number of other element paths corresponding to the target element path neighbor nodes in the heterogeneous graph and a preset super parameter, determining the global attention score of the target element path neighbor nodes by using a global graph attention scoring function, namely, a formula (1); the formula (1) is:

representing the global attention score; n represents the number of other element paths corresponding to the neighbor nodes of the target element path in the heterogeneous graph; η represents the preset hyper-parameters; />

Representing the local attention score.

It should be noted that the number of the substrates,

the importance of other semantics is expressed, and the larger the value is, the higher the importance of other semantics is. Especially when η=0, the global graph attention scoring function is degraded to a general local graph attention scoring function. Notably, the parameters of the global graph attention scoring function are shared across all meta-paths, which can introduce global information without increasing model complexity.

In the embodiment, a graph learning method based on multi-semantic common attention is adopted, and the deep mining is performed on local semantic features by introducing other semantic information and sharing attention parameters, redundant information is filtered, sparse information is enhanced, so that the difference of different semantics is eliminated, and the semantic feature mining is enhanced.

Step 103, determining an updated semantic representation vector corresponding to the target node based on the global attention score, the first hidden feature vector, the second hidden feature vector and the semantic fusion vector.

In this embodiment, the update semantic representation vector corresponding to the target node u may be represented as

Updating the semantic representation vector includes enriching semantic features in the heterograms.

Further, a single layer of updated semantic representation vectors may not be sufficient to obtain all of the semantic features in the heterogeneous map, so a multi-layer map neural network model (Graph Neural Network, GNN) may be stacked to capture high-order semantic information. For example, after stacking the L-layer GNN models, the update semantic representation vector corresponding to the target node u can be represented as

Step 104, determining the attention weight of each target element path based on the updated semantic representation vector of the node in each target element path in the heterogeneous graph, and determining the target semantic representation vector corresponding to the target node based on the attention weight of each target element path and the updated semantic representation vector of the node in each target element path.

In this embodiment, semantic feature fusion is required, where different semantic features are fused by adopting an attention mechanism, so as to obtain importance of different semantics.

For the target node u, the target semantic representation vector corresponding to the final target node may be determined based on the attention weights of other target element paths and the updated semantic representation vector after stacking the L-layer GNN model, and specifically may be represented by the following formula (3):

wherein q _u Representing a target semantic representation vector;

representing the attention weights of other target meta-paths.

Specifically, the method can be calculated by the following formula (4):

is a semantic attention parameter vector; />

Representing a parameter matrix.

Optionally, before determining the local attention score of the target element path neighbor node based on the preset attention parameter vector, the first hidden feature vector, the second hidden feature vector and the semantic fusion vector of all element path instances between the target node and the target element path neighbor node, encoding each target element path in the heterogeneous graph to generate the semantic fusion vector of all element path instances between the target node and the target element path neighbor node; the method can be realized by the following steps (1) -step (3):

step (1), at least one meta-path instance corresponding to the target meta-path is obtained, and the data structure of each meta-path instance is the same as the data structure of the target meta-path;

step (2), encoding each meta-path instance to generate a semantic representation vector of each meta-path instance;

and (3) carrying out aggregation processing on the semantic representation vectors of the meta-path examples to generate the semantic fusion vector.

One target meta-path in the heterogeneous graph corresponds to a plurality of meta-path instances, and in this embodiment, before encoding each meta-path instance, features of nodes and relationship edges of different types in the heterogeneous graph need to be mapped to the same hidden space through feature transformation.

A linear transformation based on node and relationship edge types is employed herein. Specifically, a given node type

Relationship edge type +.>

For any node V e V _A And an arbitrary relationship edge E _R Has h _v ＝W _A ·x _v H _e ＝W _R ·x _e 。

Wherein x is _v ∈X _A ,x _e ∈X _R Is the initial feature vector of the feature set,

hidden feature vectors of node v and relationship edge e, respectively, and +.>

The parameter matrices of node type a and relationship edge type R, respectively.

After the characteristics of nodes and relation edges of different types in the heterogeneous graph are mapped to the same hidden space, encoding each element path instance to generate a semantic representation vector of each element path instance; however, the method is thatThen, the semantic expression vector of each element path instance is aggregated, and then a target semantic fusion vector is generated

In the embodiment, the characteristics of the nodes and the relation edges of different types in the heterogeneous graph can be mapped to the same hidden space through the characteristic transformation, so that semantic information in the heterogeneous graph can be conveniently and fully mined.

the encoding of each meta-path instance to generate a semantic representation vector of each meta-path instance can be realized by the following steps [1] -step [4 ]:

Step [1], aiming at each meta-path instance, generating a sub-node sequence and a relation edge sequence corresponding to the meta-path instance; the child node sequence comprises L+1 child nodes, and the relationship edge sequence comprises L relationship edges.

Specifically, a meta-path is given

And path instance thereof

It can be split into node sequences +.>

And relation edge sequence->

Step [2], inputting a child node hidden feature vector sequence into a first model group, and updating the child node hidden feature vector sequence to obtain an updated hidden feature vector of the child node; the first model group comprises a first cyclic neural network model (RNN) and a second RNN, and the first RNN and the second RNN are connected in a jumping manner; the sequence of child node hidden feature vectors is generated based on the sequence of child nodes.

Specifically, a sequence of hidden feature vectors to child nodes

The updating can be represented by the following formula (5) and formula (6):

an output hidden vector representing the t-th step of the hidden characteristic vector sequence of the child node; />

Updating hidden feature vectors representing corresponding nodes or relationship edges; />

Representing a parameter matrix; />

Representing the parameter vector.

Then will

GRU unit input to RNN to generate output vector of t+1 step of node sequence +. >

(i.e., updated hidden feature vector of child node), specifically expressed by the following formula (6):

and step [3], inputting the hidden characteristic vector sequence of the relation edge into the first model group, and updating the hidden characteristic vector sequence of the relation edge to obtain the updated hidden characteristic vector of the relation edge.

Specifically, the feature vector sequence is hidden from the relation edge

The updating can be represented by the following formulas (7) and (8):

an output hidden vector representing the t-th step of the relation edge hidden feature vector sequence; />

Updating hidden feature vectors representing relationship edges; />

Representing a parameter matrix; />

Representing the parameter vector.

Then will

GRU unit input to RNN to generate output vector of t+1 th step of relation edge sequence>

(i.e., updated hidden feature vectors of the relationship edges), specifically expressed by the following equation (8):

and step [4] determining the semantic representation vector of the meta-path instance based on the updated hidden feature vector of the child node and the updated hidden feature vector of the relation edge.

Specifically, after updating the hidden feature vector of the child node and the hidden feature vector of the relationship edge, the output vector of the last child node (i.e. the (L+1) th child node in the meta-path example) in the meta-path example is used as the semantic representation vector of the meta-path example; wherein the semantic representation vector of the meta-path instance can be represented as:

h _p All semantic information on the meta-path instance is contained.

FIG. 3 is a schematic diagram of a meta-path instance code structure according to an embodiment of the present invention. As shown in fig. 3, the GRU models in fig. 3 are connected in a jumping manner, v represents a child node in the meta-path instance, e represents a relationship edge for connecting the child node in the meta-path instance, and the child node and the relationship edge are input into the GRU model to finally obtain a semantic representation vector of the meta-path instance.

In this embodiment, the semantic representation vectors of the multiple meta-path instances are aggregated by adopting an attention mechanism, which is specifically implemented by the following manner:

in practical application, there may be multiple meta-path instances between node u and node v, denoted as p ^′ _uv They are also of different importance.

Therefore, the semantic representation vector of each element path instance needs to be aggregated through learning the attention weight, and finally a semantic fusion vector is generated. Specifically, the expression can be represented by the following formula (5):

wherein mu _p The importance of an instance of a meta-path can be calculated by the following equation (6):

representing the meta-path instance attention parameter vector.

In the above embodiment, a meta path encoder is designed so as to uniformly mine semantic information in a meta path. The meta-path encoder comprises three modules, namely feature transformation, meta-path instance encoding and meta-path instance fusion. The feature transformation maps heterogeneous information to the same feature space, the meta-path instance coding fully utilizes the node and relation side information in the path to extract semantic features, and the meta-path instance fusion can identify the semantic importance of different instances.

Through the method, through the common attention of multiple semantics and fully utilizing the relationship side information, the rich semantic information in the heterogeneous graph is deeply mined, so that high-quality node representation can be obtained and the method is applied to downstream node classification tasks.

Optionally, the determining, based on the global attention score, the first hidden feature vector, the second hidden feature vector, and the semantic fusion vector, an updated semantic representation vector corresponding to the target node may be specifically implemented by the following steps [ a ] -step [ c ]:

step [ a ], carrying out normalization processing on the global attention score to generate a target attention score;

step [ b ], the target attention score, the second hidden feature vector and the semantic fusion vector are subjected to aggregation processing to obtain an aggregation message vector corresponding to the target node;

step [ c ], determining the updated semantic representation vector based on the aggregate message vector and the first hidden feature vector.

Updating semantic representation vector +.>

Including rich semantic features in the heterograms. For calculating- >

First, the software is required to pass through a softmax function pair

Normalization processing is carried out to generate a target attention score +.>

Specifically, the following formula (11) can be used to realize:

then will

Second hidden feature vector h _v Semantic fusion vector +.>

Aggregation processing is carried out to obtain an aggregation message vector +.>

Specifically, the following formula (12) may be used to realize:

representing a parameter matrix.

After obtaining the aggregate message vector

The semantic representation vector of the target node is updated by adopting a pre-activated residual error connection method to obtain an updated semantic representation vector corresponding to the target node>

Specifically, for the target meta-path P _k The updated semantic representation vector of the target node u can be represented using the following equation (13):

further, a single layer of updated semantic representation vectors may not be sufficient to obtain all of the semantic features in the heterogeneous map, so a multi-layer map neural network model (Graph Neural Network, GNN) may be stacked to capture high-order semantic information.

In particular, the update semantic representation vector of the target node u may be updated recursively herein. For example, after stacking L layers of GNN models, the L+1st update semantic representation vector of the target node u

Can be obtained by the following formula (14):

/>

representing a parameter matrix; />

Representing that target node u is in target element path P _k The lower layer L aggregates message vectors; />

An L-layer semantic representation vector representing the target node u, and a 0-layer semantic representation vector is +.>

In the embodiment, the structure and the semantic information of the heterogeneous graph can be mined by updating the semantic representation vector, so that the downstream node classification performance effect is improved. The method has the advantages that redundant information can be filtered and sparse information can be enhanced through a graph learning method based on common attention, the defect that the conventional heterogeneous graph mining algorithm excessively relies on element paths to carry out semantic mining is overcome, and complex semantics can be adaptively and deeply mined, so that semantic feature representation is enhanced.

Optionally, after the target semantic representation vector corresponding to the target node is determined, training the semantic classification model by using the target semantic representation vector, so as to obtain a trained target semantic classification model, where the target semantic classification model can be applied to a downstream semantic classification task. The method can be realized by the following steps:

and step [1], inputting the target semantic representation vector into an initial target service classification model, and training the initial target service classification model by using a preset loss function until the initial target service classification model converges to obtain a trained target service classification model and a target semantic representation vector which is output by the target service classification model and is suitable for target service.

And step 2, inputting the target semantic representation vector suitable for the target service into the trained target service classification model to obtain a classification result aiming at the target service, which is output by the trained target service classification model.

Specifically, in the training stage of the initial target service classification model, the target semantic representation vector is input into the initial target service classification model, and the prediction label is obtained through calculation.

And then, calculating a loss value between the real label and the predicted label according to the loss function, and iteratively updating parameters in the neural network through a back propagation algorithm until convergence. After training is completed, a trained target service classification model and a target semantic representation vector which is output by the target service classification model and is applicable to target service can be obtained.

In the model application stage, the target semantic representation vector suitable for the target service is input into a trained target service classification model, and a classification result aiming at the target service, which is output by the trained target service classification model, can be obtained.

Taking a target service as a node classification task as an example, firstly, inputting a target semantic representation vector into an initial semantic classification model, training the initial semantic classification model by utilizing a pre-designed loss function until the initial semantic classification model converges, obtaining a trained semantic classification model, and outputting the target semantic representation vector suitable for the node classification task by the trained semantic classification model;

And inputting the target semantic representation vector suitable for the node classification task into the trained semantic classification model to obtain a semantic classification result which is output by the trained semantic classification model and aims at any node in the heterogeneous graph.

FIG. 4 is a schematic diagram showing a learning method based on a common attention heterogram; referring to fig. 4, the method includes steps 401-413, wherein:

step 401, obtaining at least one meta-path instance corresponding to a target meta-path in a heterogeneous graph; wherein the data structure of each meta-path instance is the same as the data structure of the target meta-path.

Step 402, generating a sub-node sequence and a relation edge sequence corresponding to each meta-path instance; the child node sequence comprises L+1 child nodes, the relationship edge sequence comprises L relationship edges, and L is a positive integer.

Step 403, inputting the hidden characteristic vector sequence of the child node into a first model group, and updating the hidden characteristic vector sequence of the child node to obtain an updated hidden characteristic vector of the child node; the first model group comprises a first RNN and a second RNN, and the first RNN and the second RNN are connected in a jumping manner; the sequence of child node hidden feature vectors is generated based on the sequence of child nodes.

Step 404, inputting the hidden feature vector sequence of the relationship side into a first model group, and updating the hidden feature vector sequence of the relationship side to obtain an updated hidden feature vector of the relationship side; the sequence of relational edge hidden feature vectors is generated based on the sequence of child nodes.

Step 405, generating a semantic representation vector of the meta-path instance based on the updated hidden feature vector of the child node and the updated hidden feature vector of the relationship edge.

Step 406, aggregating the semantic representation vectors of each element path instance to generate a semantic fusion vector.

Step 407, determining, for at least one target node of any target element path in the heterogeneous graph, a local attention score of a target element path neighbor node based on a preset attention parameter vector, a first hidden feature vector, a second hidden feature vector and semantic fusion vectors of all element path instances between the target node and the target element path neighbor node of the target element path.

Specifically, the local attention score is used for representing the importance degree of the target element path neighbor node to the target node based on the target element path; the first hidden feature vector is the hidden feature vector of the target node, and the second hidden feature vector is the hidden feature vector of the target element path neighbor node.

Step 408, determining a global attention score of the target element path neighbor node based on the local attention score, the number of other element paths corresponding to the target element path neighbor node in the heterogeneous graph and a preset super parameter.

Specifically, the global attention score is used for representing the importance degree of all element paths in the heterogeneous graph to the target node, and the super parameter is used for representing the importance degree of all element path examples between the target node and the target element path neighbor node to the target node.

Step 409, normalizing the global attention score to generate a target attention score.

Step 410, aggregating the target attention score, the second hidden feature vector and the semantic fusion vector to obtain an aggregate message vector corresponding to the target node.

Step 411, determining an updated semantic representation vector based on the aggregate message vector and the first hidden feature vector.

Step 412, determining the attention weight of each target element path based on the updated semantic representation vector of the node in each target element path in the heterogeneous graph, and determining the target semantic representation vector corresponding to the target node based on the attention weight of each target element path and the updated semantic representation vector of the node in each target element path.

Step 413, inputting the target semantic representation vector into an initial target service classification model, and training the initial target service classification model by using a preset loss function until the initial target service classification model converges, so as to obtain a trained target service classification model and a target semantic representation vector which is output by the target service classification model and is suitable for the target service.

Step 414, inputting the target semantic representation vector suitable for the target service into the trained target service classification model to obtain the classification result for the target service output by the trained target service classification model.

The heterogeneous graph representation learning method based on common attention aims at mining the structure and semantic information of the heterogeneous graph, and further improves the downstream node classification performance effect. On the basis of a common meta-path method, redundant information is filtered and sparse information is enhanced by a graph learning method based on common attention, and the method overcomes the defect that the conventional heterogeneous graph mining algorithm excessively relies on meta-paths to carry out semantic mining, and can carry out deep mining on complex semantics in a self-adaptive manner, so that semantic feature representation is enhanced. In addition, for the problem that the prior heterogeneous graph representation learning method based on the meta-path generally lacks to use the relationship side information, thereby causing information loss, the invention provides a meta-path encoder so as to uniformly mine semantic information. Therefore, the method and the device have the advantages that through the common attention of multiple semantics and the full utilization of the relationship side information, the rich semantic information in the heterogeneous graph is deeply mined, so that high-quality node representation is obtained, and the method and the device are applied to downstream node classification tasks.

Claims

1. A method of learning a common-attention-based heterogram representation, comprising:

2. The method of claim 1, wherein determining the updated semantic representation vector corresponding to the target node based on the global attention score, the first hidden feature vector, the second hidden feature vector, and the semantic fusion vector comprises:

normalizing the global attention score to generate a target attention score;

3. The method for learning a heterogram representation based on common attention according to claim 1, wherein determining the global attention score of the target meta-path neighbor based on the local attention score, the number of other meta-paths corresponding to the target meta-path neighbor in the heterogram, and a preset hyper-parameter includes:

representing the global attention score; n represents the heterogeneous diagramThe number of other meta paths corresponding to the target meta path neighbor nodes; η represents the preset hyper-parameters; />

Representing the local attention score.

4. A co-attention based heterogram representation learning method according to any one of claims 1 to 3, wherein prior to said determining a local attention score for said target meta-path neighbor based on a pre-set attention parameter vector for said target meta-path, a first latent feature vector, a second latent feature vector, and a semantic fusion vector of all meta-path instances between said target node and said target meta-path neighbor, said method further comprises:

5. The co-attention based heterogram representation learning method of claim 4 wherein said meta-path instance comprises l+1 children nodes and L relationship edges, L being a positive integer;

6. The co-attention based heterograph representation learning method of claim 1, wherein after said determining a target semantic representation vector for the target node, the method further comprises: