CN114925243A

CN114925243A - Method and device for predicting node attribute in graph network

Info

Publication number: CN114925243A
Application number: CN202210485497.5A
Authority: CN
Inventors: 张丽娟; 王维强
Original assignee: Alipay Hangzhou Information Technology Co Ltd
Current assignee: Alipay Hangzhou Information Technology Co Ltd
Priority date: 2022-05-06
Filing date: 2022-05-06
Publication date: 2022-08-19
Anticipated expiration: 2042-05-06
Also published as: CN114925243B

Abstract

The embodiment of the specification describes a method and a device for predicting node attributes in a graph network. According to the method of the embodiment, a plurality of sub-networks are determined according to the node types contained in the graph network, then the time sequence representation of each sub-network is determined, and the fusion representation capable of predicting the attribute of the first node is obtained by fusing the time sequence representations corresponding to the sub-networks. Furthermore, the attribute of the first node can be predicted by utilizing the fusion representation. Because the types of the second nodes in different sub-networks are different, the different types of data in the sub-networks are fused for judging the attribute of the first node, so that the information amount for judging the attribute is increased, and the accuracy for judging the attribute of the first node can be improved.

Description

Method and device for predicting node attribute in graph network

Technical Field

One or more embodiments of the present disclosure relate to the field of computer technology, and in particular, to a method and apparatus for predicting node attributes in a graph network.

Background

Predicting the attributes of nodes based on a graph network is a common prediction approach. For example, in the financial field, as traditional financial fraud, false advertising, and similar activities gradually move from offline to online, well-concealed high-risk financial accounts have a significant impact on financial risk control. Predicting risk accounts from the network relationships among financial accounts therefore helps uncover such accounts.

However, the accuracy of attribute prediction based on graph networks is currently low.

Disclosure of Invention

One or more embodiments of the present specification describe a method and an apparatus for predicting node attributes in a graph network, which can improve the accuracy of a prediction result of a node attribute.

According to a first aspect, there is provided a method of predicting node attributes in a graph network, comprising:

determining at least two sub-networks from the graph network; each sub-network comprises a first node to be subjected to attribute prediction and a second node which is adjacent to the first node, and the types of the second nodes in any two sub-networks are different;

for each sub-network, determining a timing representation of the sub-network based on the association of the first node with the second node in the sub-network;

fusing the time sequence representations obtained by the sub-networks to obtain a fusion representation of the first node;

and predicting the attribute of the first node in the graph network by using the fusion characterization of the first node.

In one possible implementation, the determining at least two subnetworks according to the graph network includes:

splitting the graph network into at least two first networks according to the difference of the node types in the graph network; any one first network comprises first nodes, and the types of the nodes are not more than 2;

for each of the first networks, a second node of the first network associated with the attribute of the first node is determined, and a sub-network corresponding to the first network is generated using the first node and the second node of the first network.

In one possible implementation, the determining a second node in the first network that is associated with the attribute of the first node includes:

calculating the modularity of the first node and each third node in the first network; wherein the third node is used for characterizing nodes except the first node, and the modularity is used for characterizing the aggregation degree of the first node and the third node;

and determining each third node with the modularity larger than a preset threshold value as a second node associated with the attribute of the first node.

In a possible implementation manner, when the node types of the third node and the first node are all the same, the calculating the modularity of the first node and each third node in the first network includes:

calculating the modularity using the following calculation:

where $Q_i$ denotes the modularity of the first node and the i-th third node; $m$ denotes the total number of edges connecting the first node and the third nodes; $A_i$ denotes the adjacency matrix between the first node and the i-th third node; $k$ and $k_i$ denote the in-degree of the first node and the out-degree of the i-th third node, respectively; and $\delta(C_i)$ indicates whether a connecting edge exists between the first node and the i-th third node, taking the value 1 when such an edge exists and 0 when it does not.

In one possible implementation manner, when the third node includes a node of a different node type from the first node, the calculating the modularity of the first node and each third node in the first network includes:

calculating the modularity using the following calculation:

where $Q_j$ denotes the modularity of the first node and the j-th third node; $F$ denotes the total number of edges connecting the first node and the third nodes whose type differs from that of the first node; $B_j$ denotes the adjacency matrix between the first node and the j-th third node whose type differs from that of the first node; $q$ denotes the in-degree of the first node; $q_j$ denotes the out-degree of the j-th third node whose type differs from that of the first node; and $\delta(C_j)$ indicates whether a connecting edge exists between that j-th third node and the first node, taking the value 1 when such an edge exists and 0 when it does not.

In a possible implementation manner, the determining a network characterization of the sub-network according to an association between a second node and a first node in the sub-network includes:

acquiring a first time sequence representation of the first node and a second time sequence representation of a second node which is adjacent to the first node; the first time sequence characterization and the second time sequence characterization are both characteristics capable of influencing the attribute of the first node;

and aggregating the second time sequence representation of the second node to the first time sequence representation of the first node to obtain the fusion representation.

In a possible implementation manner, the aggregating the second timing representation of the second node into the first timing representation of the first node to obtain the fused representation includes:

and calculating the network characterization by using the following calculation formula:

$$f(H^{l}) = \delta\left(E \cdot H^{l-1} \cdot W^{l-1}\right)$$

where $f(H^{l})$ denotes the network characterization obtained after one layer of iteration; $E$ denotes the second timing characterization; $H^{l-1}$ denotes the state matrix of the hidden layer at layer $l-1$, with $H^{0}$ being the first timing characterization; $W^{l-1}$ denotes the weight parameter of the hidden layer at layer $l-1$; and $\delta$ is an activation function.
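As an illustration only, the layer update above can be sketched in a few lines of Python. The sketch assumes that $E$ is an n×n matrix, $H^{l-1}$ is n×d, and $W^{l-1}$ is d×d', and it uses ReLU as the activation $\delta$; the specification fixes neither the shapes nor the activation, so these choices and the helper names `matmul`, `relu`, and `propagate` are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def relu(z):
    """Assumed activation; the specification only says 'an activation function'."""
    return max(0.0, z)


def propagate(E, H_prev, W_prev, activation=relu):
    """One layer of f(H^l) = delta(E . H^(l-1) . W^(l-1))."""
    Z = matmul(matmul(E, H_prev), W_prev)
    return [[activation(z) for z in row] for row in Z]


# Toy inputs (hypothetical): two nodes, two features.
E = [[1, 1], [0, 1]]            # second timing characterization, assumed n x n
H0 = [[1, 0], [0, 1]]           # first timing characterization
W0 = [[0.5, 0.0], [0.0, 0.5]]   # layer weights
H1 = propagate(E, H0, W0)
```

Stacking several calls to `propagate`, each with its own weight matrix, would correspond to iterating the formula over several layers.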

In a possible implementation manner, the fusing the time series representations obtained by the sub-networks to obtain a fused representation of the first node includes:

for each sub-network, performing:

determining a weight matrix in an attention mechanism corresponding to the current sub-network by using the time sequence representation of the current sub-network;

determining the contribution representation of the current sub-network when the attribute of the first node is predicted according to the weight matrix in the attention mechanism corresponding to the current sub-network;

determining the contribution amount of the current sub-network in predicting the attribute of the first node according to the contribution characterization of the current sub-network;

and determining the fusion characterization according to the weight matrix and the contribution amount corresponding to each sub-network.

In one possible implementation, the weight matrix includes Q, K, V weight matrices;

the determining of the weight matrix in the attention mechanism corresponding to the current sub-network by using the time sequence characterization of the current sub-network includes:

calculating the weight matrix using the set of equations:

where Q, K, and V denote the Q, K, and V weight matrices in the attention mechanism corresponding to the current sub-network; $L_1$, $L_2$, and $L_3$ denote the matrices corresponding to the convolution kernels used when calculating the Q, K, and V weight matrices of the current sub-network, respectively; and $D$ denotes the timing characterization of the current sub-network.

In a possible implementation manner, the determining, according to a weight matrix in the attention mechanism corresponding to the current sub-network, a contribution characterization when the current sub-network predicts an attribute of the first node includes:

calculating a contribution characterization for the current sub-network using the following calculation:

$$g(Q, K) = Q^{T} K$$

where $g(Q, K)$ denotes the contribution characterization of the current sub-network, $Q^{T}$ denotes the transpose of the Q weight matrix among the weight matrices, and $K$ denotes the K weight matrix among the weight matrices.

In a possible implementation manner, the determining, according to the contribution characterization of the current sub-network, a contribution amount of the current sub-network in predicting the attribute of the first node includes:

calculating the contribution of the current sub-network using the following calculation formula:

where $a$ denotes the contribution amount of the current sub-network; $g(Q, K)$ denotes the contribution characterization of the current sub-network; $N$ denotes the set of K weight matrices among the weight matrices of the sub-networks; and $g(Q, K')$ denotes the contribution characterization computed over the K weight matrices in that set.

In a possible implementation manner, the determining the fusion characterization according to the weight matrix and the contribution amount corresponding to each sub-network includes:

for each sub-network, calculating the product of the K weight matrix in the attention mechanism corresponding to the sub-network and the contribution amount corresponding to the sub-network to obtain a first fusion characterization corresponding to the sub-network;

and summing the first fusion representations of the sub-networks to obtain the fusion representations.
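The fusion steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not the formulas of the specification: Q and K are taken as plain vectors (one pair per sub-network) so that $g(Q, K) = Q^{T} K$ is a scalar, and the contribution amount is assumed to be a softmax of $g(Q, K)$ over $g(Q, K')$ for every $K'$ in the set $N$, because the formula for $a$ is not reproduced in this text. The function names `dot` and `fuse` are hypothetical.

```python
import math


def dot(u, v):
    """g(Q, K) = Q^T K for vector-valued Q and K (an assumption of this sketch)."""
    return sum(x * y for x, y in zip(u, v))


def fuse(Qs, Ks):
    """Return (fused representation, per-sub-network contribution amounts)."""
    fused = [0.0] * len(Ks[0])
    contributions = []
    for q, k in zip(Qs, Ks):
        scores = [dot(q, k_other) for k_other in Ks]   # g(Q, K') for every K' in N
        shift = max(scores)                            # subtract max for stability
        denom = sum(math.exp(s - shift) for s in scores)
        a = math.exp(dot(q, k) - shift) / denom        # assumed softmax contribution
        contributions.append(a)
        fused = [f + a * x for f, x in zip(fused, k)]  # add a * K, then sum over sub-networks
    return fused, contributions


# Two hypothetical sub-networks with two-dimensional Q and K vectors.
Qs = [[1.0, 0.0], [0.0, 1.0]]
Ks = [[1.0, 0.0], [0.0, 1.0]]
fused, contributions = fuse(Qs, Ks)
```

Under this reading each sub-network's contribution is normalized against its own query, so the contribution amounts of different sub-networks need not sum to 1.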

In one possible implementation, the predicting the attribute of the first node in the graph network by using the fused representation of the first node includes:

inputting the fusion representation of the first node into a pre-trained attribute prediction model to obtain an attribute prediction value of the first node; the training method of the attribute prediction model comprises the following steps: training by utilizing at least one group of sample training set; each group of sample training set comprises a sample fusion characterization of the first node and a sample prediction value of the attribute of the first node.
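The specification does not name the model type. Purely as an illustration, the sketch below assumes a logistic-regression model trained by batch gradient descent on pairs of sample fusion characterizations and sample attribute values (1 for a risk account, 0 otherwise). The function names, learning rate, epoch count, and toy data are all hypothetical.

```python
import math


def predict(w, b, x):
    """Predicted probability that the first node has the attribute."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(samples, labels, lr=0.5, epochs=200):
    """Fit weights and bias by batch gradient descent on the log loss."""
    dim = len(samples[0])
    n = len(samples)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(samples, labels):
            err = predict(w, b, x) - y
            for j in range(dim):
                grad_w[j] += err * x[j] / n
            grad_b += err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Hypothetical sample training set: fusion characterizations and attribute labels.
samples = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
labels = [1, 1, 0, 0]
w, b = train_logistic(samples, labels)
risk_probability = predict(w, b, [0.9, 0.9])
```

Any other supervised model could take the place of the logistic regression here; the only requirement stated in the text is that the model is trained on sample fusion characterizations paired with sample attribute values.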

According to a second aspect, there is provided an apparatus for predicting node attributes in a graph network, comprising: a sub-network determining module, a timing representation determining module, a time sequence representation fusion module, and an attribute prediction module;

the sub-network determining module is configured to determine at least two sub-networks according to the graph network; each sub-network comprises a first node to be subjected to attribute prediction and a second node which is adjacent to the first node, and the types of the second nodes in any two sub-networks are different;

the timing representation determining module is configured to determine, for each sub-network determined by the sub-network determining module, a timing representation of the sub-network according to the association between the second node and the first node in the sub-network;

the time sequence representation fusion module is configured to fuse the time sequence representations obtained by the time sequence representation determination module for the sub-networks to obtain a fusion representation of the first node;

the attribute prediction module is configured to predict the attribute of the first node in the graph network by using the fusion characterization of the first node obtained by the time sequence characterization fusion module.

According to a third aspect, there is provided a computing device comprising: a memory having executable code stored therein, and a processor that when executing the executable code implements the method of any of the first aspects described above.

According to the method and the device provided by the embodiment of the specification, when the attribute of the node in the graph network is predicted, at least two sub-networks can be determined according to the node type in the graph network. And then, determining the time sequence representation of each sub-network, and fusing the time sequence representations obtained by the sub-networks to obtain a fusion representation of the first node. Furthermore, the attribute prediction of the first node can be realized by utilizing the fusion representation of the first node. In the embodiment of the present specification, the types of the second nodes in different sub-networks are different, and thus by fusing different types of data in each sub-network for determining the attribute of the first node, the amount of information for performing attribute determination is increased, and thus the accuracy of determining the attribute of the first node can be improved.

Drawings

In order to more clearly illustrate the embodiments of the present specification or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present specification, and other drawings can be obtained by those skilled in the art without creative efforts.

FIG. 1 is a diagram illustrating a network according to an embodiment of the present disclosure;

FIG. 2 is a flow chart of a method for predicting node attributes in a graph network according to an embodiment of the present disclosure;

FIG. 3 is a flow chart of a method for determining subnetworks provided by one embodiment of the present description;

FIG. 4 is a flow chart of a method for determining a second node provided by one embodiment of the present description;

FIG. 5 is a flow diagram of a method of determining a timing characterization provided by one embodiment of the present description;

FIG. 6 is a flow chart of a method for fusing time sequence representations provided by one embodiment of the present description;

FIG. 7 is a schematic diagram of an apparatus for predicting node attributes in a graph network according to an embodiment of the present disclosure.

Detailed Description

FIG. 1 is a schematic diagram of a graph network. No node in it exists in isolation; each is associated, to a greater or lesser degree, with other nodes. Therefore, when the attribute of a node is predicted based on the graph network, the association between the other nodes and the node to be predicted needs to be considered in order to obtain a more accurate prediction result.

However, when attribute prediction is performed based on a graph network, the graph network often contains more than one node type, and there are many heterogeneous nodes whose type differs from that of the node to be predicted. For example, in the field of financial risk control, financial transaction data may arise not only between financial accounts (see the edges between circles in FIG. 1), but also between a financial account and a financial device (see the edges between circles and triangles in FIG. 1), and possibly between a financial account and a bank card (see the edges between circles and rectangles in FIG. 1). Therefore, when a graph network is constructed, its nodes may include not only the nodes to be predicted, namely financial accounts, but also media of types other than financial account; that is, the graph network may contain nodes of several types. If only the associations between the node to be predicted and nodes of the same type are considered, a large amount of information is lost and the prediction accuracy is low. If only the associations between the node to be predicted and the heterogeneous nodes are considered, a large amount of information is likewise lost and the accuracy is again low. In other words, it is difficult to obtain an accurate prediction result when attribute prediction relies on node data of a single type.

Based on this, the present solution aggregates the features of nodes of different types when predicting the attribute of a node, and predicts the attribute from the aggregated features. In this way, the representations of the various node types are fully considered and the information carried by different nodes is not lost, so the accuracy of the prediction result can be improved.

As shown in FIG. 2, an embodiment of this specification provides a method for predicting node attributes in a graph network, which may include the following steps:

step 201: determining at least two sub-networks according to the graph network; each sub-network comprises a first node to be subjected to attribute prediction and a second node which is adjacent to the first node, and the types of the second nodes in any two sub-networks are different;

step 203: for each sub-network, determining a timing representation of the sub-network based on the association of the first node with the second node in the sub-network;

step 205: fusing the time sequence representations obtained by the sub-networks to obtain a fusion representation of the first node;

step 207: and predicting the attribute of the first node in the graph network by using the fusion characterization of the first node.

In this embodiment of the present disclosure, when predicting attributes of a node in a graph network, at least two sub-networks may be determined according to a node type in the graph network. And then, determining the time sequence representation of each sub-network, and fusing the time sequence representations obtained by the sub-networks to obtain a fused representation of the first node. Furthermore, the attribute prediction of the first node can be realized by utilizing the fusion representation of the first node. In the embodiment of the present specification, the types of the second nodes in different sub-networks are different, and thus by fusing different types of data in each sub-network for determining the attribute of the first node, the amount of information for performing attribute determination is increased, and thus the accuracy of determining the attribute of the first node can be improved.

The steps in fig. 2 are described below with reference to specific examples.

Firstly, in step 201, at least two sub-networks are determined according to a graph network; each sub-network comprises a first node to be subjected to attribute prediction and a second node which is adjacent to the first node, and the types of the second nodes in any two sub-networks are different.

For the graph network, the graph network can be constructed according to historical data for attribute prediction. For example, in determining whether an account is a risk account, the graph network may be a network constructed using historical transaction data associated with the account. Such as: transaction connections between the account and other accounts, devices associated with the account, bank cards associated with the account, and the like may be included in the graph network. The attribute to be predicted in the graph network is then the probability of a risk account, and each node in the graph network may include accounts, financial devices, bank cards, etc. involved in financial transactions.

A sub-network can be obtained by splitting the graph network constructed from historical transaction data. The first node in a sub-network is the node whose attribute is to be predicted; for example, the first node may be the account for which risk prediction is to be performed. A second node in a sub-network is a node connected to the first node. For example, if the account whose attribute is to be predicted is account A, the first node in the sub-network corresponds to account A; if account A has transacted with account B, account B is a second node connected to the first node. As another example, if account A is associated with a financial device UMID, that device is a second node connected to the first node.

As another example, the graph network may also be a network constructed from the business data of merchants. For example, when predicting whether a certain merchant is a premium merchant, the graph network may be a network constructed from the business data and credit evaluation information of the merchant. The nodes in the graph network may include the merchant currently to be predicted, other merchants, authorities, and so on, and the attribute to be predicted is the probability that the merchant is a premium merchant.

The sub-networks can be obtained by splitting the graph network constructed from the business data and credit evaluation information of merchants, and the first node in each sub-network may be the merchant for which it is to be predicted whether it is a premium merchant. The sub-networks differ in their second nodes. For example, the second nodes in sub-network A may be other merchants, in which case the sub-network is derived from the business data and credit evaluation data that other merchants hold about the merchant to be predicted. As another example, the second node in sub-network B may be the merchant to be predicted itself, in which case the sub-network is derived from that merchant's own business data and self-assessed credit data. As yet another example, the second node in sub-network C may be an authority, in which case the sub-network is derived from the credit assessment reports and business data available to the authority for review.

The risk account prediction is described below as an example.

Each determined sub-network includes the first node whose attribute is to be predicted and second nodes connected to the first node, and the types of the second nodes in any two sub-networks are different. That is, when the first node is the account to be risk-predicted and the candidate second nodes are other accounts (not themselves being risk-predicted), financial devices, bank cards, and so on, each sub-network includes the first node together with second nodes of one of these types, and the second node types differ between sub-networks. For example, the first node and the second node in sub-network 1 are the account to be risk-predicted and an account not being risk-predicted, respectively; in sub-network 2 they are the account to be risk-predicted and a financial device UMID, respectively; and in sub-network 3 they are the account to be risk-predicted and a bank card, respectively.

In one possible implementation, as shown in FIG. 3, step 201 may determine the at least two sub-networks from the graph network as follows:

step 301: splitting the graph network into at least two first networks according to the difference of the node types in the graph network; any one first network comprises first nodes, and the types of the nodes contained in the first network are not more than 2;

step 303: for each first network, a second node of the first network associated with the attribute of the first node is determined, and a sub-network corresponding to the first network is generated using the first node and the second node of the first network.

In the embodiment of the specification, when determining the sub-networks, the graph network is firstly split into a plurality of first networks according to the difference of the node types in the graph network. Then, for each first network, a second node of the first network associated with the attribute of the first node is determined, and a sub-network corresponding to the first network is further generated by using the first node and the second node of the first network. Therefore, when the sub-networks are split, the splitting is carried out according to the difference of the node types, and the obtained sub-networks also have different dimensions, such as account dimensions, dimensions of used equipment, and dimensions of related media such as bank cards. In this way, by performing the attribute prediction of the first node based on the sub-networks of different dimensions, the prediction can be performed based on more data information, and the accuracy of the prediction result can be improved.
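The type-based split of step 301 can be sketched as below. This is a simplified illustration, assuming the graph is given as an edge list plus a node-to-type mapping and keeping only edges incident to the first node; the function name `split_by_node_type`, the data layout, and the toy nodes are hypothetical.

```python
from collections import defaultdict


def split_by_node_type(edges, node_type, first_node):
    """Group the first node's edges by the type of the node at the other end.

    Each resulting group contains the first node plus neighbours of a single
    type, so it holds at most two node types.
    """
    first_networks = defaultdict(list)
    for u, v in edges:
        if first_node not in (u, v):
            continue  # simplification: ignore edges that do not touch the first node
        other = v if u == first_node else u
        first_networks[node_type[other]].append((u, v))
    return dict(first_networks)


# Toy graph: account A, another account B, one device and one bank card.
node_type = {"A": "account", "B": "account", "dev1": "device", "card1": "card"}
edges = [("A", "B"), ("dev1", "A"), ("A", "card1"), ("B", "card1")]
first_networks = split_by_node_type(edges, node_type, "A")
```

Here the account, device, and card groups play the role of the account dimension, the device dimension, and the bank card dimension mentioned above.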

It should be noted that, because the attribute of the first node is what needs to be predicted, every first network obtained in step 301 should include the first node. In addition, any one first network contains no more than two node types; that is, it contains at most one node type different from that of the first node. A first network may therefore consist of the first node and nodes of the same type as the first node. For example, if the first node is the account to be risk-predicted, the other nodes are also account-type nodes. Alternatively, a first network may consist of the first node and nodes of one type different from that of the first node. For example, if the first node is the account to be risk-predicted, the other nodes may be of a non-account type such as devices used or bank cards; in that case the first network contains only one such non-account type.

Step 303 may consider determining by calculating the degree of aggregation between nodes when determining a second node in the first network associated with the attribute of the first node. For example, as shown in fig. 4, the second node may be determined by:

step 401: calculating the modularity of the first node and each third node in the first network; the third node is used for representing nodes except the first node, and the modularity is used for representing the aggregation degree of the first node and the third node;

step 403: and determining each third node with the modularity larger than the preset threshold value as the second node associated with the attribute of the first node.

In this embodiment, when determining the second node associated with the attribute of the first node in the first network, the modularity of the first node and each of the third nodes other than the first node may be first calculated. Then, each third node with the modularity greater than the preset threshold may be determined as the second node associated with the attribute of the first node. That is, the degree of aggregation of the other nodes with the first node, that is, the degree of association of the other nodes with the first node can be sufficiently considered by the modularity. Furthermore, the node with the modularity larger than the preset threshold value is determined as the associated second node, so that the determined second node and the first node have a higher association degree in terms of attributes, and further the attribute of the first node can be predicted by using more effective data.
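Step 403 amounts to a threshold filter over precomputed modularity values, which can be sketched as follows. The scores, the threshold of 0.3, and the function name `select_second_nodes` are hypothetical; how the modularity itself is computed is left to the formulas of the specification.

```python
def select_second_nodes(modularity, threshold):
    """Return the third nodes whose modularity with the first node exceeds the threshold."""
    return [node for node, q in modularity.items() if q > threshold]


# Hypothetical modularity of the first node with third nodes B, C and D.
modularity = {"B": 0.42, "C": 0.05, "D": 0.31}
second_nodes = select_second_nodes(modularity, threshold=0.3)
```

The comparison is strict, matching the wording "larger than a preset threshold value", so a node whose modularity equals the threshold is not selected.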

As described above, the other nodes in the first network except the first node to be subjected to attribute prediction may be the same type as the first node or different types from the first node. Based on this, when the node types of the third nodes representing the nodes other than the first node and the first node are all the same, step 401 may calculate the modularity between the first node and each third node in the first network by using the following calculation formula:

where $Q_i$ denotes the modularity of the first node and the i-th third node; $m$ denotes the total number of edges connecting the first node and the third nodes; $A_i$ denotes the adjacency matrix between the first node and the i-th third node; $k$ and $k_i$ denote the in-degree of the first node and the out-degree of the i-th third node, respectively; and $\delta(C_i)$ indicates whether a connecting edge exists between the first node and the i-th third node, taking the value 1 when such an edge exists and 0 when it does not.

By using the calculation formula, the modularity of the first node and each third node can be accurately calculated, so that the association degree of each third node and the first node on the attribute can be determined. That is, by using the calculation formula of the modularity, the node information associated with the first node can be accurately mined, and the attribute of the first node can be predicted using more information.

It should be noted that $A_i$ denotes the adjacency matrix between the first node and the i-th third node; this matrix may be formed from the characterization of the first node and the characterization of the i-th third node, for example by concatenating a matrix containing the first node's characterization with a matrix containing the i-th third node's characterization. $k$ and $k_i$ denote the in-degree of the first node and the out-degree of the i-th third node, respectively. That is, when considering whether the i-th third node influences the first node, the edge in the sub-network points from the i-th third node to the first node. In this case the first node has an in-degree of 1 and an out-degree of 0, and the i-th third node has an in-degree of 0 and an out-degree of 1.

In yet another possible implementation manner, when the third nodes characterizing the nodes other than the first node include nodes of different node types from the first node, step 401 may be calculated by using the following calculation formula when calculating the modularity between the first node and each third node in the first network:

wherein Q_j characterizes the modularity of the first node and the j-th third node; F characterizes the total number of edges connecting the first node and the third nodes whose type differs from that of the first node; B_j characterizes the adjacency matrix between the first node and the j-th third node whose type differs from that of the first node; q characterizes the in-degree of the first node; q_j characterizes the out-degree of the j-th third node whose type differs from that of the first node; and δ(C_j) characterizes whether a connecting edge exists between that j-th third node and the first node, taking the value 1 when a connecting edge exists and 0 when it does not.

It should be noted that, in the preceding embodiment, every third node has the same node type as the first node, so only the connecting-edge relationships between the first node and other nodes of the same type are considered. When the third nodes include nodes whose type differs from that of the first node, the nodes of the same type as the first node should have no connecting-edge relationship with the first node in this network, so that attribute prediction information is obtained from the dimension of the different node type. The nodes of the same type as the first node are instead divided into a separate sub-network, whose modularity is calculated with the method of the preceding embodiment. Thus, in this case, F represents the total number of edges connecting the first node and the third nodes whose type differs from that of the first node, B_j characterizes the adjacency matrix between the first node and the j-th third node of a different type, q_j characterizes the out-degree of the j-th third node of a different type, and δ(C_j) characterizes whether a connecting edge exists between that j-th third node and the first node.

It can be seen that, when determining the second nodes associated with the attribute of the first node, all third nodes other than the first node are traversed by scanning the nodes in the data. The modularity gain brought by treating each third node as a neighbor of the first node is then measured, and the third nodes whose modularity gain exceeds a preset threshold are selected as the second nodes, that is, the neighbors of the first node. In this way, the second nodes whose attributes are relevant to the first node can be determined. Moreover, by adjusting the preset threshold, second nodes with different degrees of association can be selected to suit different application scenarios.
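The threshold-based selection described above can be sketched as follows. This is a minimal illustration rather than the calculation formula of the embodiment: the function names `modularity_gain` and `select_second_nodes`, the node names and the numeric values are hypothetical, and the gain is assumed to take the widely used Newman-style form (A_i − k·k_i/(2m))·δ(C_i)/(2m), built from the symbols defined above.

```python
from typing import Dict, List


def modularity_gain(a_i: float, k: float, k_i: float, m: float, has_edge: bool) -> float:
    """Assumed Newman-style gain for one (first node, third node) pair.

    a_i: adjacency entry, k: in-degree of the first node,
    k_i: out-degree of the i-th third node, m: total number of edges,
    has_edge: plays the role of delta(C_i), 1 if an edge exists, else 0.
    """
    if m <= 0:
        raise ValueError("m must be positive")
    delta = 1.0 if has_edge else 0.0
    return (a_i - k * k_i / (2.0 * m)) * delta / (2.0 * m)


def select_second_nodes(gains: Dict[str, float], threshold: float) -> List[str]:
    """Keep the third nodes whose gain exceeds the preset threshold."""
    return [node for node, gain in gains.items() if gain > threshold]


# Hypothetical data: the first node has in-degree 2 and there are 3 edges in total.
m, k_first = 3.0, 2.0
third_nodes = {
    "li_si": (1.0, 1.0, True),      # (a_i, k_i, has_edge)
    "wang_wu": (1.0, 2.0, True),
    "zhao_liu": (0.0, 1.0, False),
}
gains = {
    name: modularity_gain(a_i, k_first, k_i, m, has_edge)
    for name, (a_i, k_i, has_edge) in third_nodes.items()
}
second_nodes = select_second_nodes(gains, threshold=0.05)
stricter = select_second_nodes(gains, threshold=0.1)
```

With the lower threshold both connected candidates are kept; raising the threshold keeps only the more strongly associated one, which mirrors the adjustment of the preset threshold for different application scenarios.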

Then, in step 203, for each sub-network, a time-series characterization of the sub-network is determined according to the association of the first node with the second node in the sub-network.

After splitting the graph network into sub-networks of different dimensions, the timing characterization corresponding to each sub-network is determined according to the relevance of the second node to the first node in each sub-network. For example, as shown in fig. 5, step 203 may be implemented by the following steps:

step 501: acquiring a first time sequence representation of a first node and a second time sequence representation of a second node which is adjacent to the first node; the first time sequence representation and the second time sequence representation are both features capable of influencing the attribute of the first node;

step 503: and aggregating the second time sequence representation of the second node into the first time sequence representation of the first node to obtain a fusion representation.

In this embodiment, when determining the timing representations of the corresponding sub-networks according to the relevance between the second node and the first node, first, the first timing representation of the first node and the second timing representation of the second node that is a neighbor of the first node may be obtained. And then aggregating the second time sequence representation of the second node to the first time sequence representation of the first node to obtain a fusion representation. The first time sequence representation and the second time sequence representation are both features capable of influencing the attribute of the first node, so that the time sequence representations of the first node and the second node are fused and then used for attribute prediction of the first node, the information quantity of attribute prediction can be increased, and the accuracy of a prediction result is improved.

When obtaining the first timing representation of the first node and the second timing representation of the second node, step 501 may draw on historical data. For example, from a set of historical transaction data, the transaction data corresponding to "Zhang San" may be taken as the first timing representation. The second timing representation may be the transaction data corresponding to "Li Si" and "Wang Wu", who have transacted with "Zhang San"; it may also be transaction data on other media, for example the transaction data on a financial device used in the transactions of "Zhang San" and the transaction data on a bank card used in those transactions.

After the first timing representation and the second timing representation are obtained, the two are aggregated. For example, in one possible implementation, step 503 may aggregate the second timing representation of the second node into the first timing representation of the first node to obtain a fused representation by using the following calculation formula:

f(H^{l}) = δ(E · H^{l-1} · W^{l-1})

wherein f(H^{l}) characterizes the network representation obtained after the l-th layer of iteration, E characterizes the second timing representation, H^{l-1} characterizes the state matrix of the (l-1)-th hidden layer, with H^{0} being the first timing representation, W^{l-1} characterizes the weight parameter of the (l-1)-th hidden layer, and δ is the activation function.

In this embodiment, the number of hidden layers may be set in advance; mining more layers uncovers more data related to the attribute prediction of the first node. For example, the first layer mines the data of the second nodes that neighbor the first node and aggregates it into the data of the first node; the second layer further aggregates the data of the neighbors of the second nodes into the data of the first node; and the third layer mines the data of the neighbors of those neighbors and aggregates it into the data of the first node. By analogy, the number of hidden layers determines the mining depth, so that fusion representations of different depths can be obtained for attribute prediction.
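The layer-by-layer aggregation can be illustrated with a small standard-library sketch. It assumes that E acts as a square mixing matrix over the nodes of the sub-network and that the activation δ is ReLU; neither choice is stated in this specification, and the names `matmul`, `relu` and `aggregate` as well as the toy values are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x p) and b (p x q)."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def relu(x: Matrix) -> Matrix:
    """Element-wise ReLU, used here as the activation (an assumption)."""
    return [[max(0.0, v) for v in row] for row in x]


def aggregate(e: Matrix, h0: Matrix, weights: List[Matrix]) -> Matrix:
    """Apply H^{l} = relu(E . H^{l-1} . W^{l-1}) once per weight matrix."""
    h = h0
    for w in weights:
        h = relu(matmul(matmul(e, h), w))
    return h


# Hypothetical toy data: row 0 is the first node, row 1 its neighbor.
E = [[1.0, 1.0], [0.0, 1.0]]      # row 0 mixes in the neighbor's row
H0 = [[1.0, 2.0], [3.0, 4.0]]     # initial state, one row per node
I2 = [[1.0, 0.0], [0.0, 1.0]]     # identity weights keep the arithmetic visible

one_layer = aggregate(E, H0, [I2])
two_layers = aggregate(E, H0, [I2, I2])
```

Each additional entry in `weights` corresponds to one more hidden layer, so the length of that list plays the role of the mining depth.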

Further in step 205, the time sequence representations obtained by the sub-networks are fused to obtain a fused representation of the first node.

In step 203, the time sequence representations corresponding to the sub-networks are obtained, and in this step, the obtained time sequence representations corresponding to the sub-networks are considered to be fused, so as to obtain a fused representation of the first node. In this way, the fused representation may include the influence of different types of nodes on the attribute of the first node, that is, the attribute of the first node is predicted by using data of multiple dimensions. When merging of timing representations is implemented, as shown in fig. 6, step 205 may include the following steps:

for each subnetwork, step 601, step 603 and step 605 are performed:

step 601: determining a weight matrix in an attention mechanism corresponding to the current sub-network by using the time sequence representation of the current sub-network;

step 603: determining the contribution representation of the current sub-network when the attribute of the first node is predicted according to the weight matrix in the attention mechanism corresponding to the current sub-network;

step 605: determining the contribution amount of the current sub-network in predicting the attribute of the first node according to the contribution characterization of the current sub-network;

step 607: and determining the fusion characterization according to the weight matrix and the contribution amount corresponding to each sub-network.

In this embodiment, when the time sequence representations obtained by the sub-networks are fused to obtain the fused representation of the first node, first, for each sub-network, a weight matrix in the attention mechanism corresponding to the current sub-network may be determined by using the time sequence representation of the current sub-network. And then determining the contribution characterization of the current sub-network when the current sub-network predicts the attribute of the first node according to the weight matrix in the attention mechanism corresponding to the current sub-network. And further determining the contribution amount of the current sub-network in attribute prediction of the first node according to the contribution characterization of the current sub-network. Thus, when the weight matrix and the contribution amount corresponding to each sub-network are obtained, the fusion characterization of the first node can be determined.

When the time sequence representations of the sub-networks are fused, the contribution of each sub-network's time sequence representation to the attribute prediction of the first node is fully considered, and the fusion is weighted according to the differences between those contributions. The resulting fusion characterization is therefore more reasonable: information from a sub-network with a large contribution amount is not weakened, and information from a sub-network with a small contribution amount is not strengthened.

Step 601 will be explained.

When determining the weight matrices in the attention mechanism corresponding to the current sub-network by using the time sequence characterization of the current sub-network, step 601 may calculate the Q, K and V weight matrices in the attention mechanism according to the following set of equations:

wherein Q, K and V respectively characterize the Q, K and V weight matrices in the attention mechanism corresponding to the current sub-network; L_1, L_2 and L_3 respectively characterize the matrices corresponding to the convolution kernels used when calculating the Q, K and V weight matrices of the current sub-network; and D characterizes the time sequence characterization of the current sub-network.

In the attention mechanism, the Q weight matrix generally represents the query, the K weight matrix generally represents the keys against which the query is matched, and the V weight matrix carries the content that is actually extracted. Therefore, by applying the Q, K and V weight matrices of the attention mechanism to the time sequence characterization, the data in the sub-network that is more relevant to the attribute prediction of the first node can be extracted.
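A minimal sketch of step 601 follows. It assumes that each weight matrix is obtained as a plain matrix product of the corresponding kernel matrix with the time sequence characterization, that is Q = L_1·D, K = L_2·D and V = L_3·D; the actual operator (for example a convolution) is not reproduced in this text, and the helper names and toy values are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x p) and b (p x q)."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def qkv(l1: Matrix, l2: Matrix, l3: Matrix, d: Matrix) -> Tuple[Matrix, Matrix, Matrix]:
    """Assumed form: Q = L1 . D, K = L2 . D, V = L3 . D."""
    return matmul(l1, d), matmul(l2, d), matmul(l3, d)


# Hypothetical timing characterization: 3 time steps x 2 features.
D = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
L1 = [[1.0, 0.0, 0.0]]   # picks the first time step
L2 = [[0.0, 1.0, 0.0]]   # picks the second time step
L3 = [[1.0, 1.0, 1.0]]   # sums all time steps

Q, K, V = qkv(L1, L2, L3, D)
```

The three kernels share the same input D, so the sub-network's timing characterization is projected into a query view, a key view and a value view.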

Step 603 will be explained.

After Q, K, V weight matrix in the attention mechanism corresponding to the current sub-network is calculated in step 601, the contribution characterization of the current sub-network in predicting the attribute of the first node may be determined according to the weight matrix in the attention mechanism corresponding to the current sub-network. For example, in one possible implementation, step 603 may calculate the contribution characterization of the current sub-network using the following calculation:

g(Q, K) = Q^{T} K

wherein g(Q, K) characterizes the contribution characterization of the current sub-network, Q^{T} is the transpose of the Q weight matrix among the weight matrices, and K is the K weight matrix among the weight matrices.

For each sub-network, the contribution characterization of each sub-network in predicting the attribute of the first node can be accurately calculated by using the calculation formula. The feature that can provide a basis for predicting the attribute of the first node in each sub-network can be extracted by using the calculation formula, so that the prediction result can be more reliable when the attribute of the first node is predicted based on the contribution representation.
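The formula g(Q, K) = Q^{T} K can be checked on small matrices with the following sketch. Only the formula itself comes from the text; the helper names `transpose`, `matmul` and `contribution_characterization` and the numeric values are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def transpose(a: Matrix) -> Matrix:
    """Swap rows and columns."""
    return [list(col) for col in zip(*a)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x p) and b (p x q)."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def contribution_characterization(q: Matrix, k: Matrix) -> Matrix:
    """g(Q, K) = Q^T K."""
    return matmul(transpose(q), k)


# Square example: K is the identity, so the result is simply Q transposed.
g_square = contribution_characterization([[1.0, 2.0], [3.0, 4.0]],
                                         [[1.0, 0.0], [0.0, 1.0]])

# Column-vector example: Q^T K reduces to a 1 x 1 inner product.
g_vector = contribution_characterization([[1.0], [2.0]], [[3.0], [4.0]])
```

When Q and K are column vectors the result is a single number, which is the case in which g(Q, K) can be compared directly across sub-networks.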

Step 605 will be explained.

After step 603 determines the contribution characterization of each sub-network in predicting the attribute of the first node according to the weight matrices in the attention mechanism corresponding to that sub-network, the contribution amount of each sub-network in predicting the attribute of the first node is further calculated based on the contribution characterization. For example, in one possible implementation, step 605 may calculate the contribution amount of the sub-network by the following calculation formula:

wherein a characterizes the contribution amount of the current sub-network, g(Q, K) characterizes the contribution characterization of the current sub-network, N characterizes the set of K weight matrices among the weight matrices of the sub-networks, and g(Q, K') characterizes the contribution characterization computed with a K weight matrix K' from the set N.

In this embodiment, calculating the ratio of the contribution characterization of the current sub-network to the contribution characterizations of all sub-networks gives the share that the current sub-network contributes when predicting the attribute of the first node. Therefore, after the attribute of the first node is predicted, the contribution of each sub-network to that attribute, and which sub-network contributes the most, can be judged from the corresponding contribution amounts, so that different degrees of attention can subsequently be applied to each sub-network.

For example, in one example of risk account prediction, there are three sub-networks: an "account-account" sub-network, an "account-financial device" sub-network and an "account-bank card" sub-network. In one embodiment, the contribution amounts obtained for these sub-networks through the above calculation formula are 0.15, 0.8 and 0.05, respectively. That is, when determining that the account to be predicted is a risk account, the "account-account" sub-network contributes 0.15, the "account-financial device" sub-network contributes 0.8, and the "account-bank card" sub-network contributes 0.05. More attention may then be paid to the financial devices used by the risk account, so that more targeted risk control measures can be taken.
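The ratio-based contribution amount can be sketched as below. The sketch assumes that the contribution characterization of each sub-network has already been reduced to a single non-negative score and that the contribution amount is that score divided by the sum over all sub-networks; a softmax over the scores would be a common alternative. The function name and the raw scores are hypothetical and were chosen so that the result matches the amounts quoted in the example above.

```python
from typing import Dict


def contribution_amounts(scores: Dict[str, float]) -> Dict[str, float]:
    """Normalize non-negative per-sub-network scores so that they sum to 1."""
    if any(s < 0 for s in scores.values()):
        raise ValueError("scores must be non-negative")
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("at least one score must be positive")
    return {name: s / total for name, s in scores.items()}


# Hypothetical scalar scores for the three sub-networks of the example.
scores = {
    "account-account": 3.0,
    "account-financial device": 16.0,
    "account-bank card": 1.0,
}
amounts = contribution_amounts(scores)

# The sub-network that deserves the most attention afterwards.
top_sub_network = max(amounts, key=amounts.get)
```

Because the amounts sum to 1, they can be read directly as shares and compared across sub-networks.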

Step 607 is explained.

After the contribution amounts corresponding to the sub-networks are determined in step 605, the fusion representation of the first node can be determined by using the contribution amounts of the sub-networks and the weight matrix of the sub-networks. For example, in one possible implementation, first, for each sub-network, a product of the K weight matrix in the attention mechanism corresponding to the sub-network and the contribution amount corresponding to the sub-network is calculated, so as to obtain a first fused representation corresponding to the sub-network. And then summing the first fusion representations of the sub-networks to obtain the fusion representation of the first node.

For example, step 607 may be calculated by the following calculation formula:

Attention = Σ_{i=1}^{M} a_i · V_i

wherein Attention characterizes the fusion characterization, a_i characterizes the contribution amount of the i-th sub-network, V_i characterizes the V weight matrix corresponding to the i-th sub-network, and M characterizes the total number of sub-networks.

In this way, weighted fusion of the characterizations of multiple networks breaks down the data barriers between different kinds of information, provides more bases for predicting node attributes, and can improve the accuracy of the prediction result.
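The weighted fusion over sub-networks can be sketched as a contribution-weighted sum of per-sub-network matrices, following the symbols a_i, V_i and M defined above. The function name `fuse` and the toy matrices are hypothetical, and all matrices are assumed to share one shape.

```python
from typing import List

Matrix = List[List[float]]


def fuse(amounts: List[float], v_matrices: List[Matrix]) -> Matrix:
    """Return sum_i a_i * V_i for M sub-networks with same-shaped V_i."""
    if not v_matrices or len(amounts) != len(v_matrices):
        raise ValueError("need one contribution amount per sub-network")
    rows, cols = len(v_matrices[0]), len(v_matrices[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for a, v in zip(amounts, v_matrices):
        if len(v) != rows or any(len(row) != cols for row in v):
            raise ValueError("all V matrices must have the same shape")
        for r in range(rows):
            for c in range(cols):
                fused[r][c] += a * v[r][c]
    return fused


# Hypothetical example with M = 2 sub-networks.
V1 = [[1.0, 0.0], [0.0, 1.0]]
V2 = [[2.0, 2.0], [2.0, 2.0]]
fusion = fuse([0.25, 0.75], [V1, V2])
```

The sub-network with the larger contribution amount dominates the result, so its information is not diluted by the weaker one.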

Finally, in step 207, the attributes of the first node in the graph network are predicted using the fusion characterization of the first node.

In this step, once the fusion representation of the first node is obtained, the attribute of the first node may be predicted by using the fusion representation. For example, the fusion representation of the first node may be input into a pre-trained attribute prediction model to obtain an attribute prediction value of the first node. The attribute prediction model is trained using at least one sample training set, each of which comprises a sample fusion characterization of the first node and a sample prediction value of the attribute of the first node. The attribute prediction model can be trained, for example, using the mean square error as the loss function.
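Training with the mean square error as the loss function can be illustrated with a deliberately small model. The sketch below assumes a linear attribute prediction model fitted by batch gradient descent on flattened fusion characterizations; the model family, the learning rate, the number of epochs and the sample data are all hypothetical and are not specified in this text.

```python
from typing import List, Tuple

Vector = List[float]


def predict(w: Vector, b: float, x: Vector) -> float:
    """Linear prediction w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def mse(w: Vector, b: float, xs: List[Vector], ys: Vector) -> float:
    """Mean square error over the sample training set."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs: List[Vector], ys: Vector, lr: float = 0.1, epochs: int = 1000) -> Tuple[Vector, float]:
    """Minimize the MSE with plain batch gradient descent from a zero start."""
    n, d = len(xs), len(xs[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        errors = [predict(w, b, x) - y for x, y in zip(xs, ys)]
        grad_w = [2.0 / n * sum(e * x[j] for e, x in zip(errors, xs)) for j in range(d)]
        grad_b = 2.0 / n * sum(errors)
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Hypothetical samples: flattened fusion characterizations and attribute labels.
sample_fusions = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]]
sample_labels = [1.0, 0.0, 1.0, 0.0]

initial_loss = mse([0.0, 0.0], 0.0, sample_fusions, sample_labels)
w, b = train(sample_fusions, sample_labels)
final_loss = mse(w, b, sample_fusions, sample_labels)
prediction = predict(w, b, [0.0, 1.0])
```

A real attribute prediction model would typically be a neural network, but the training loop keeps the same shape: predict, measure the mean square error, and update the parameters against its gradient.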

As shown in fig. 7, an embodiment of the present specification provides an apparatus for predicting node attributes in a graph network, including: a sub-network determining module 701, a time sequence representation determining module 702, a time sequence representation fusing module 703 and an attribute predicting module 704;

a sub-network determining module 701 configured to determine at least two sub-networks according to the graph network; each sub-network comprises a first node to be subjected to attribute prediction and a second node which is adjacent to the first node, and the types of the second nodes in any two sub-networks are different;

a timing characterization determining module 702 configured to determine, for each sub-network determined by the sub-network determining module 701, a timing characterization of the sub-network according to an association between a second node and a first node in the sub-network;

the time sequence representation fusion module 703 is configured to fuse the time sequence representations obtained by the time sequence representation determination module 702 for the respective subnetworks to obtain a fusion representation of the first node;

the attribute predicting module 704 is configured to predict the attribute of the first node in the graph network by using the fusion representation of the first node obtained by the time sequence representation fusion module 703.

In one possible implementation, the sub-network determining module 701, when determining at least two sub-networks according to the graph network, is configured to:

splitting the graph network into at least two first networks according to the difference of the node types in the graph network; any one first network comprises first nodes, and the types of the nodes contained in the first network are not more than 2;

for each first network, a second node of the first network associated with the attribute of the first node is determined, and a sub-network corresponding to the first network is generated using the first node and the second node of the first network.
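The splitting by node type can be sketched as follows. The sketch assumes an undirected edge list and a mapping from node to node type, and it discards edges that do not touch a node of the first node's type; the function name, node identifiers and type labels are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def split_by_node_type(
    node_types: Dict[str, str], edges: List[Edge], first_type: str
) -> Dict[Tuple[str, str], List[Edge]]:
    """Group edges into networks that each contain at most two node types.

    Every network is keyed by (first_type, other_type) and stores its edges
    with the node of the first type listed first.
    """
    networks: Dict[Tuple[str, str], List[Edge]] = defaultdict(list)
    for u, v in edges:
        type_u, type_v = node_types[u], node_types[v]
        if type_u == first_type:
            networks[(first_type, type_v)].append((u, v))
        elif type_v == first_type:
            networks[(first_type, type_u)].append((v, u))
        # Edges between two nodes of other types are skipped (assumption).
    return dict(networks)


# Hypothetical graph with accounts, one financial device and one bank card.
node_types = {"a1": "account", "a2": "account", "d1": "device", "c1": "bank_card"}
edges = [("a1", "a2"), ("a1", "d1"), ("c1", "a1"), ("a2", "d1"), ("d1", "c1")]

first_networks = split_by_node_type(node_types, edges, first_type="account")
```

Each resulting network can then be filtered down to the second nodes that are associated with the attribute of the first node to form the corresponding sub-network.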

In one possible implementation, the sub-network determining module 701, when determining a second node of the first network that is associated with the attribute of the first node, is configured to:

calculating the modularity of the first node and each third node in the first network; the third node is used for representing nodes except the first node, and the modularity is used for representing the aggregation degree of the first node and the third node;

In one possible implementation manner, the sub-network determining module 701 is configured to, when the node types of the third nodes are the same as those of the first node and the modularity of the first node and each third node in the first network is calculated, perform the following operations:

the modularity is calculated using the following calculation:

In one possible implementation, the sub-network determining module 701 is configured to, when the third node includes a node of a different node type from the first node, calculate a modularity of the first node and each third node in the first network, perform the following operations:

the modularity is calculated using the following calculation:

wherein Q_j characterizes the modularity of the first node and the j-th third node; F characterizes the total number of edges connecting the first node and the third nodes whose type differs from that of the first node; B_j characterizes the adjacency matrix between the first node and the j-th third node whose type differs from that of the first node; q characterizes the in-degree of the first node; q_j characterizes the out-degree of the j-th third node whose type differs from that of the first node; and δ(C_j) characterizes whether a connecting edge exists between that j-th third node and the first node, taking the value 1 when a connecting edge exists and 0 when it does not.

In one possible implementation, the timing representation determining module 702, when determining the network representation of the sub-network according to the association between the second node and the first node in the sub-network, is configured to:

acquiring a first time sequence representation of a first node and a second time sequence representation of a second node which is adjacent to the first node; the first time sequence representation and the second time sequence representation are both features capable of influencing the attribute of the first node;

and aggregating the second time sequence representation of the second node into the first time sequence representation of the first node to obtain a fusion representation.

In one possible implementation, when aggregating the second timing representation of the second node into the first timing representation of the first node to obtain a fused representation, the timing representation determining module 702 is configured to:

and calculating to obtain the network characterization by using the following calculation formula:

f(H^{l}) = δ(E · H^{l-1} · W^{l-1})

In a possible implementation manner, when fusing the time sequence representations obtained by the sub-networks to obtain a fused representation of the first node, the time sequence representation fusing module 703 is configured to perform the following operations:

for each sub-network, performing:

In one possible implementation, the weight matrix includes Q, K, V weight matrices; the timing characterization fusion module 703, when determining the weight matrix in the attention mechanism corresponding to the current sub-network by using the timing characterization of the current sub-network, is configured to perform the following operations:

the weight matrix is calculated using the set of equations:

wherein Q, K and V respectively characterize the Q, K and V weight matrices in the attention mechanism corresponding to the current sub-network; L_1, L_2 and L_3 respectively characterize the matrices corresponding to the convolution kernels used when calculating the Q, K and V weight matrices of the current sub-network; and D characterizes the timing characterization of the current sub-network.

In a possible implementation manner, when determining, according to a weight matrix in the attention mechanism corresponding to the current sub-network, the contribution characterization of the current sub-network in predicting the attribute of the first node, the timing characterization fusion module 703 is configured to perform the following operations:

the contribution characterization of the current sub-network is calculated using the following calculation:

g(Q, K) = Q^{T} K

wherein g(Q, K) characterizes the contribution characterization of the current sub-network, Q^{T} is the transpose of the Q weight matrix among the weight matrices, and K is the K weight matrix among the weight matrices.

In a possible implementation manner, when determining the contribution amount of the current sub-network in predicting the attribute of the first node according to the contribution characterization of the current sub-network, the timing characterization fusion module 703 is configured to perform the following operations:

the contribution of the current sub-network is calculated using the following calculation:

In one possible implementation, when determining the fusion characterization according to the weight matrix and the contribution amount corresponding to each sub-network, the timing characterization fusion module 703 is configured to perform the following operations:

In one possible implementation, the attribute prediction module 704, when predicting the attribute of the first node in the graph network using the fused representation of the first node, is configured to perform the following operations:

inputting the fusion representation of the first node into a pre-trained attribute prediction model to obtain an attribute prediction value of the first node; the training method of the attribute prediction model comprises the following steps: training by utilizing at least one group of sample training set; each group of sample training sets comprises a sample fusion characterization of the first node and a sample prediction value of the attribute of the first node.

The present specification also provides a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of any of the embodiments of the specification.

The present specification also provides a computing device comprising a memory and a processor, the memory having stored therein executable code, the processor, when executing the executable code, implementing the method in any of the embodiments of the specification.

It should be understood that the schematic structure of the embodiment in this specification does not constitute a specific limitation to the prediction device for node attributes in the graph network. In other embodiments of the specification, the means for predicting the attributes of the nodes in the graph network may include more or fewer components than shown, or some components may be combined, some components may be split, or a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.

The information interaction between the units in the above apparatus, their execution processes and other related content are based on the same concept as the method embodiments of this specification. For details, reference may be made to the description of the method embodiments, which is not repeated here.

Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in this specification can be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on, or transmitted as one or more instructions or code over, a computer-readable medium.

The above embodiments further describe in detail the objects, technical solutions and advantages of this specification. It should be understood that the above are only embodiments of the present invention and are not intended to limit its scope; any modification, equivalent substitution, improvement and the like made on the basis of the technical solutions of the present invention shall fall within the scope of the present invention.

Claims

1. A method for predicting node attributes in a graph network, comprising:

for each sub-network, determining a time-series representation of the sub-network according to the relevance of a second node to a first node in the sub-network;

2. The method of claim 1, wherein the determining at least two sub-networks from the graph network comprises:

3. The method of claim 2, wherein the determining a second node in the first network associated with the attribute of the first node comprises:

4. The method of claim 3, wherein when the node types of the third node and the first node are all the same, the calculating the modularity of the first node and each third node in the first network comprises:

calculating the modularity using the following calculation:

5. The method of claim 3, wherein when the third node comprises a node of a different node type than the first node, the calculating the modularity of the first node and each third node in the first network comprises:

calculating the modularity using the following calculation:

wherein Q_j characterizes the modularity of the first node and the j-th third node; F characterizes the total number of edges connecting the first node and the third nodes whose type differs from that of the first node; B_j characterizes the adjacency matrix between the first node and the j-th third node whose type differs from that of the first node; q characterizes the in-degree of the first node; q_j characterizes the out-degree of the j-th third node whose type differs from that of the first node; and δ(C_j) characterizes whether a connecting edge exists between that j-th third node and the first node, taking the value 1 when a connecting edge exists and 0 when it does not.

6. The method of claim 1, wherein determining the network characterization of the sub-network based on the association of the first node with the second node in the sub-network comprises:

and aggregating the second time sequence representation of the second node into the first time sequence representation of the first node to obtain the fusion representation.

7. The method of claim 6, wherein said aggregating the second timing representation of the second node into the first timing representation of the first node to obtain the fused representation comprises:

calculating the network characterization by using the following calculation formula:

f(H^{l}) = δ(E · H^{l-1} · W^{l-1})

wherein f(H^{l}) characterizes the network characterization obtained after the l-th layer of iteration, E characterizes the second timing characterization, H^{l-1} characterizes the state matrix of the (l-1)-th hidden layer, with H^{0} being the first timing characterization, W^{l-1} characterizes the weight parameter of the (l-1)-th hidden layer, and δ is the activation function.

8. The method of claim 1, wherein the merging the time-series representations obtained from the respective subnetworks to obtain a merged representation of the first node comprises:

for each sub-network, performing:

9. The method of claim 8, wherein the weight matrix comprises Q, K and V weight matrices;

calculating the weight matrix using the set of equations:

wherein Q, K, and V characterize the Q, K, and V weight matrices, respectively, in the attention mechanism corresponding to the current sub-network; L_1, L_2, and L_3 characterize the matrices corresponding to the convolution kernels used when calculating the Q, K, and V weight matrices, respectively, of the current sub-network; and D characterizes the timing characterization of the current sub-network.

10. The method of claim 8, wherein the determining the contribution characterization of the current sub-network when predicting the attribute of the first node according to the weight matrix in the attention mechanism corresponding to the current sub-network comprises:

g(Q, K) = Q^{T} K

wherein g(Q, K) represents the contribution characterization of the current sub-network, Q^{T} represents the transpose of the Q weight matrix among the weight matrices, and K represents the K weight matrix among the weight matrices.
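
As an illustration only, the product g(Q, K) = Q^{T} K can be sketched as follows. The matrix values are made up for the example, and the helper names `transpose`, `matmul`, and `contribution_characterization` are hypothetical. Q and K are assumed to have the same number of rows so that the product is defined.

```python
def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def contribution_characterization(Q, K):
    """Compute g(Q, K) = Q^T K for one sub-network."""
    return matmul(transpose(Q), K)


# Toy Q and K weight matrices (assumed values).
Q = [[1.0, 2.0], [3.0, 4.0]]
K = [[0.0, 1.0], [1.0, 0.0]]

G = contribution_characterization(Q, K)
print(G)  # [[3.0, 1.0], [4.0, 2.0]]
```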

11. The method of claim 8, wherein said determining the amount of contribution of the current sub-network to predict the attribute of the first node based on the characterization of the contribution of the current sub-network comprises:

calculating the contribution of the current sub-network by using the following calculation formula:

12. The method of claim 8, wherein the determining the fused characterization from the weight matrix and the contribution corresponding to each sub-network comprises:

for each sub-network, calculating the product of a K weight matrix in an attention mechanism corresponding to the sub-network and a contribution amount corresponding to the sub-network to obtain a first fusion representation corresponding to the sub-network;
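
As an illustration only, the per-sub-network product described above can be sketched as follows. The claim does not state whether the contribution amount is a scalar or a matrix; this sketch assumes a scalar. The sub-network names, the values, and the helper `scale_matrix` are hypothetical.

```python
def scale_matrix(K, contribution):
    """Multiply every entry of the K weight matrix by a scalar contribution amount."""
    return [[contribution * v for v in row] for row in K]


# Hypothetical inputs per sub-network: (K weight matrix, scalar contribution amount).
subnetworks = {
    "subnet_a": ([[1.0, 2.0], [3.0, 4.0]], 0.25),
    "subnet_b": ([[2.0, 0.0], [0.0, 2.0]], 0.75),
}

# First fusion representation of each sub-network, under the scalar assumption.
first_fusion = {name: scale_matrix(K, c) for name, (K, c) in subnetworks.items()}
print(first_fusion["subnet_a"])  # [[0.25, 0.5], [0.75, 1.0]]
```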

13. The method of claim 1, wherein said predicting attributes of a first node in the graph network using the fused representation of the first node comprises:

14. An apparatus for predicting node attributes in a graph network, comprising: a sub-network determining module, a time sequence characterization fusion module, and an attribute prediction module;

15. A computing device comprising a memory having executable code stored therein and a processor that, when executing the executable code, implements the method of any of claims 1-13.