CN112580789B - Training graph coding network, and method and device for predicting interaction event - Google Patents
- Publication number: CN112580789B
- Application number: CN202110196034.2A
- Authority
- CN
- China
- Prior art keywords
- node
- graph
- vector
- nodes
- subgraph
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06N3/045 — Combinations of networks (G—Physics; G06—Computing; G06N—Computing arrangements based on specific computational models; G06N3/00—Computing arrangements based on biological models; G06N3/02—Neural networks; G06N3/04—Architecture, e.g. interconnection topology)
- G06N3/049 — Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
- G06N3/08 — Learning methods
Abstract
The embodiments of the present specification provide a method and apparatus for training and using a graph coding network. In the training method, a dynamic interaction graph is constructed based on an interaction event sequence, and a first node and a second node forming a node pair are selected from the dynamic interaction graph. The first node is encoded with the graph coding network to obtain its characterization vector. For the second node, a corresponding structural subgraph is determined, the associated nodes in that structural subgraph are encoded with the graph coding network, and a graph characterization vector of the structural subgraph is then determined based on the encoding vectors of the associated nodes. From these, a prediction loss can be determined that is negatively correlated with the similarity between the characterization vector of the first node and the above graph characterization vector. The graph coding network is then updated with the goal of reducing the prediction loss.
Description
Technical Field
One or more embodiments of the present specification relate to the field of machine learning, and more particularly to a method and apparatus for training a graph coding network and for predicting interaction events using the trained graph coding network.
Background
In many scenarios, user interaction events need to be analyzed and processed. An interaction event is one of the basic constituent elements of internet activity: for example, a click action when a user browses a page can be regarded as an interaction event between the user and a content block of the page, a purchase action on an e-commerce platform can be regarded as an interaction event between the user and a commodity, and an inter-account transfer is an interaction event between two users. A user's series of interaction events contains fine-grained characteristics such as the user's habits and preferences, as well as characteristics of the interaction objects, and these are an important feature source for machine learning models. Therefore, in many scenarios it is desirable to express and model the features of interaction participants according to interaction events, and further to analyze interaction objects and events, in particular the security of interaction events, so as to safeguard the security of the interaction platform.
However, an interaction event involves two interacting parties, and the state of each party may itself change dynamically, so it is very difficult to characterize the interacting parties accurately while comprehensively considering their many characteristics. Improved solutions for analyzing interaction objects and interaction events more effectively are therefore desired.
Disclosure of Invention
One or more embodiments of the present specification describe a method and apparatus for training a graph coding network, in which the graph coding network is trained on a dynamic interaction graph using graph-structure contrastive learning, so that nodes can be encoded and represented more accurately, and interaction prediction can then be performed more effectively.
According to a first aspect, there is provided a method of training a graph coding network, the method comprising:
acquiring a dynamic interaction graph, wherein the dynamic interaction graph comprises a plurality of node pairs, each node pair corresponds to an interaction event, its two nodes respectively represent the two objects participating in the interaction event, and any node points through connecting edges to the two nodes corresponding to the last interaction event in which the object represented by that node participated;
selecting a first node and a second node from the dynamic interaction graph, wherein the first node and the second node form a node pair;
encoding the first node by using the graph coding network to obtain a first node characterization vector;
determining a second structural subgraph corresponding to the second node, wherein the second structural subgraph comprises a plurality of second associated nodes;
encoding the plurality of second association nodes by using the graph encoding network to obtain a plurality of second association characterization vectors;
determining a second graph characterization vector of the second structural subgraph based on the plurality of second associated characterization vectors;
determining a first loss that is negatively correlated with the similarity between the first node characterization vector and the second graph characterization vector;
updating the graph coding network with a goal of reducing the first loss.
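The training objective of the first aspect can be illustrated with a brief sketch. The following Python fragment is not part of the claims: the dot-product similarity and the logistic form of the loss are assumptions for illustration, since the method only requires a loss negatively correlated with the similarity between the first node characterization vector and the second graph characterization vector.

```python
import math

def first_loss(h_node, h_graph):
    """Loss negatively correlated with the similarity between a node
    characterization vector and a graph characterization vector."""
    sim = sum(a * b for a, b in zip(h_node, h_graph))  # dot-product similarity
    return -math.log(1.0 / (1.0 + math.exp(-sim)))     # -log(sigmoid(sim))
```

Reducing this loss pushes the two vectors of an interacting pair toward higher similarity, which is the update direction the claim describes.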
According to an embodiment, encoding the first node using the graph coding network specifically includes: determining a first coding subgraph by taking the first node as a target node according to a first subgraph determination rule, where that rule determines, in the dynamic interaction graph, the subgraph formed by the nodes within a predetermined range reachable via connecting edges from the target node as root node; and inputting the first coding subgraph into the graph coding network, which outputs the hidden vector of the root node as the first node characterization vector according to the nodes in the first coding subgraph and the connection relations among them.
Further, in a specific embodiment, the nodes in the predetermined range may include: nodes within a preset number K of connecting edges; and/or nodes whose interaction time is within a preset time range.
In one embodiment, determining the second structural subgraph corresponding to the second node specifically includes: taking the second node as the target node, and determining the corresponding subgraph according to the first subgraph determination rule as the second structural subgraph.
In another embodiment, determining the second structural subgraph corresponding to the second node specifically includes: taking the second node as the target node, and determining the corresponding subgraph according to a second subgraph determination rule as the second structural subgraph, wherein the second subgraph determination rule is different from the first subgraph determination rule.
In an embodiment, the process of encoding the plurality of second associated nodes by using the graph coding network specifically includes: taking each of the plurality of second associated nodes in turn as the target node and determining, according to the first subgraph determination rule, the associated coding subgraph corresponding to each second associated node; and inputting each associated coding subgraph into the graph coding network, which outputs the hidden vector of the root node as the corresponding second associated characterization vector according to the nodes in the input associated coding subgraph and the connection relations among them.
Further, in an embodiment, the graph coding network includes an LSTM layer. For any subgraph input to it, the LSTM layer takes each node in the subgraph, from the leaf nodes to the root node, in turn as the current node and iteratively processes each node, where the iterative processing includes determining the hidden vector of the current node at least according to the node attribute features of the current node and the hidden vectors of the two nodes to which the current node points via connecting edges.
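The leaf-to-root iteration described in this embodiment can be sketched as follows. This is not the patent's LSTM layer: the gate computations are replaced by a simple mean-combine (an assumption for brevity), and `encode_subgraph`, `features`, and `children` are hypothetical names; only the bottom-up traversal over the two pointed-to nodes mirrors the described process.

```python
def encode_subgraph(features, children, root):
    """Bottom-up hidden vectors over a coding subgraph.

    features: node -> attribute feature vector (list of floats)
    children: node -> the two nodes the node points to via connecting edges
    Returns the hidden vector of the root node.
    """
    dim = len(features[root])
    hidden = {}

    def h(node):
        if node not in hidden:
            kids = children.get(node, ())
            if kids:
                # summarize the pointed-to nodes' hidden vectors
                # (a stand-in for the LSTM gate computations)
                combined = [sum(h(k)[i] for k in kids) / len(kids)
                            for i in range(dim)]
            else:
                combined = [0.0] * dim  # leaf node: no pointed-to nodes
            # current node's hidden vector from its own attribute features
            # plus the summary of the two nodes it points to
            hidden[node] = [f + c for f, c in zip(features[node], combined)]
        return hidden[node]

    return h(root)
```

The memoized recursion visits each node once, which corresponds to iterating from the leaves up to the root.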
According to an embodiment, determining the second graph characterization vector of the second structural subgraph specifically includes: performing a pooling operation on the plurality of second associated characterization vectors, and taking the result of the pooling operation as the second graph characterization vector, where the pooling operation includes maximum pooling, sum pooling, or average pooling.
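The three pooling variants named in this embodiment can be sketched directly. The helper below is illustrative only; `pool` is a hypothetical name, and the input would in practice be the second associated characterization vectors produced by the graph coding network.

```python
def pool(vectors, mode="mean"):
    """Element-wise pooling over a set of characterization vectors."""
    cols = list(zip(*vectors))              # one tuple per vector dimension
    if mode == "max":
        return [max(c) for c in cols]       # maximum pooling
    if mode == "sum":
        return [sum(c) for c in cols]       # sum pooling
    return [sum(c) / len(c) for c in cols]  # average pooling (default)
```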
According to another embodiment, determining the second graph characterization vector of the second structural subgraph specifically includes: determining an attention weight for each of the plurality of second associated characterization vectors based on a self-attention mechanism; and, based on the attention weights, performing weighted fusion of the plurality of second associated characterization vectors to obtain the second graph characterization vector.
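The attention-weighted fusion can be sketched as follows. Scoring each vector against the mean of the set is an assumed simplification of the self-attention mechanism (which in practice uses learned query/key projections); the softmax normalization and weighted sum follow the described fusion step.

```python
import math

def attention_fuse(vectors):
    """Weighted fusion of characterization vectors with softmax weights."""
    d = len(vectors[0])
    # score each vector against the mean of the set (simplified attention)
    mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(d)]
    scores = [sum(a * b for a, b in zip(v, mean)) for v in vectors]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # numerically stable softmax
    weights = [e / sum(exps) for e in exps]
    # weighted sum of the input vectors
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(d)]
```

Vectors with higher attention scores contribute more to the fused graph characterization vector.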
In one embodiment, determining the first loss specifically includes: calculating the dot product between the first node characterization vector and the second graph characterization vector as the similarity between them; or calculating the vector distance between the first node characterization vector and the second graph characterization vector, and determining the similarity between them according to that vector distance.
According to one embodiment, the method further comprises: sampling, from the dynamic interaction graph, a plurality of negative sample nodes different from the second node; determining the negative sample structural subgraph corresponding to each negative sample node; for each negative sample structural subgraph, encoding the nodes therein with the graph coding network to obtain their corresponding characterization vectors; and determining, based on those characterization vectors, the negative sample graph characterization vector corresponding to that negative sample structural subgraph. In such a case, determining the first loss includes computing the similarity between the first node characterization vector and each negative sample graph characterization vector, such that the first loss is positively correlated with the sum of these negative sample similarities.
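A minimal sketch of the first loss extended with negative samples, under the stated correlations only: negatively correlated with the positive-pair similarity and positively correlated with the sum of negative-sample similarities. The linear form below is an assumption; the patent does not fix a specific loss expression.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def first_loss_with_negatives(h_node, h_pos_graph, h_neg_graphs):
    # Negatively correlated with the positive similarity, positively
    # correlated with the sum of negative-sample similarities.
    pos_sim = dot(h_node, h_pos_graph)
    neg_sim_sum = sum(dot(h_node, g) for g in h_neg_graphs)
    return -pos_sim + neg_sim_sum
```

Minimizing this loss pulls the node toward its true counterpart's graph characterization while pushing it away from the negative samples' graph characterizations.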
In one embodiment, the method further comprises: encoding the second node with the graph coding network to obtain a second node characterization vector; determining a first structural subgraph corresponding to the first node, wherein the first structural subgraph comprises a plurality of first associated nodes; encoding the plurality of first associated nodes with the graph coding network to obtain a plurality of first associated characterization vectors; determining a first graph characterization vector of the first structural subgraph based on the plurality of first associated characterization vectors; determining a second loss negatively correlated with the similarity between the second node characterization vector and the first graph characterization vector; and updating the graph coding network with the goal of reducing the second loss.
According to a second aspect, there is provided a method of predicting an interaction event, comprising:
obtaining a graph coding network trained according to the method of the first aspect;
determining a first target node and a second target node respectively corresponding to a first object and a second object to be evaluated from the dynamic interaction graph;
respectively coding the first target node and the second target node by using the graph coding network to obtain a first target characterization vector and a second target characterization vector;
and predicting, according to the first target characterization vector and the second target characterization vector, the probability that an interaction event will occur between the first object and the second object.
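The final prediction step can be sketched as follows. Mapping the similarity of the two target characterization vectors through a sigmoid is an assumed concrete choice; the method only requires that the probability of an interaction event be predicted from the two vectors.

```python
import math

def predict_interaction_prob(h_target_1, h_target_2):
    """Probability that the two objects interact, from the similarity of
    their target characterization vectors."""
    sim = sum(a * b for a, b in zip(h_target_1, h_target_2))
    return 1.0 / (1.0 + math.exp(-sim))  # sigmoid of the dot product
```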
In one embodiment, the first target node is a node of the node pair corresponding, in the dynamic interaction graph, to the interaction event in which the first object most recently participated.
According to a third aspect, there is provided an apparatus for training a graph coding network, the apparatus comprising:
a dynamic graph acquisition unit configured to acquire a dynamic interaction graph, wherein the dynamic interaction graph comprises a plurality of node pairs, each node pair corresponds to one interaction event, its two nodes respectively represent the two objects participating in the interaction event, and any node points through connecting edges to the two nodes corresponding to the last interaction event in which the object represented by that node participated;
a node selection unit configured to select a first node and a second node constituting a node pair from the dynamic interaction graph;
a first encoding unit configured to encode the first node by using the graph coding network to obtain a first node characterization vector;
the structure subgraph determining unit is configured to determine a second structure subgraph corresponding to the second node, wherein the second structure subgraph comprises a plurality of second associated nodes;
a second encoding unit configured to encode the plurality of second association nodes by using the graph coding network to obtain a plurality of second association characterization vectors;
a graph vector determination unit configured to determine a second graph characterization vector of the second structural subgraph based on the plurality of second associated characterization vectors;
a first loss determination unit configured to determine a first loss that is negatively correlated with a similarity between the first node characterization vector and the second graph characterization vector;
an updating unit configured to update the graph coding network with a goal of reducing the first loss.
According to a fourth aspect, there is provided an apparatus for predicting an interactivity event, the apparatus comprising:
a model obtaining unit configured to obtain a graph coding network obtained by training the apparatus according to the third aspect;
the target node obtaining unit is configured to determine a first target node and a second target node which correspond to a first object to be evaluated and a second object to be evaluated respectively from the dynamic interaction graph;
the graph coding unit is configured to code the first target node and the second target node respectively by using the graph coding network to obtain a first target characterization vector and a second target characterization vector;
and a prediction unit configured to predict, according to the first target characterization vector and the second target characterization vector, the probability that an interaction event will occur between the first object and the second object.
According to a fifth aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of the first or second aspect.
According to a sixth aspect, there is provided a computing device comprising a memory and a processor, wherein the memory has stored therein executable code, and wherein the processor, when executing the executable code, implements the method of the first or second aspect.
According to the method and apparatus provided by the embodiments of the present specification, a dynamic interaction graph is constructed based on an interaction event sequence, and a graph coding network is trained based on that dynamic interaction graph. In the training process, using the idea of graph-structure contrastive learning, for two nodes that have interacted, the node characterization of one node is compared for similarity with the graph characterization related to the other node, so as to train the graph coding network. After training, the resulting graph coding network can be used to encode and represent the nodes in the dynamic interaction graph, and future interaction events can be predicted more effectively from these encoded representations.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. It is apparent that the drawings described below show only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 illustrates an implementation scenario diagram according to one embodiment;
FIG. 2 illustrates a flow diagram of a method of training a graph coding network, according to one embodiment;
FIG. 3 illustrates a dynamic interaction sequence and a dynamic interaction diagram constructed therefrom, in accordance with one embodiment;
FIG. 4 illustrates an example of a subgraph in a dynamic interaction graph;
FIG. 5 shows a working diagram of an iterative process in the LSTM layer;
FIG. 6 illustrates a flow of steps for training a graph coding network, according to one embodiment;
FIG. 7 illustrates a flow diagram of a method of predicting an interaction event, according to one embodiment;
FIG. 8 shows a schematic block diagram of a training apparatus of a graph coding network according to one embodiment;
FIG. 9 shows a schematic block diagram of an apparatus for predicting an interaction event, according to one embodiment.
Detailed Description
The scheme provided by the specification is described below with reference to the accompanying drawings.
As previously mentioned, it is desirable to be able to characterize and model interaction objects and interaction events based on the series of interaction events in which the interaction objects participate.
In one approach, a static interaction relationship network graph is constructed based on historical interaction events, and individual interaction objects and interaction events are analyzed based on that graph. Specifically, the participants of historical events can be taken as nodes, and connecting edges established between nodes with an interaction relationship, forming the interaction network graph. However, although this static network graph can show the interaction relationships between objects, it does not contain the timing information of those interaction events. Graph embedding performed simply on such an interaction relationship network graph yields feature vectors that do not express the influence of the interaction events' time information on the nodes. Moreover, such a static graph is not scalable enough, and is difficult to update flexibly when interaction events and nodes are newly added.
In view of the above, according to one or more embodiments of the present specification, a dynamically changing sequence of interaction events is organized into a dynamic interaction graph, in which each interaction object involved in each interaction event corresponds to a node. Such a dynamic interaction graph can reflect the timing information of the interaction events experienced by each interaction object. Further, in order to analyze and predict interaction events based on this dynamic interaction graph, in the embodiments of the present specification a graph coding network is trained in a graph-structure contrastive learning manner, so as to perform deep characterization learning on the nodes of the graph. Specifically, for the two nodes corresponding to two objects that have interacted, one node is directly encoded by the graph coding network to obtain a node characterization vector; on the other hand, the subgraph corresponding to the other node is characterized on the basis of the graph coding network, obtaining a graph characterization vector of that subgraph; and the graph coding network is trained based on similarity comparison between the node characterization vector and the graph characterization vector. After training, the graph coding network can better characterize the nodes in the dynamic interaction graph based on deep feature learning of the graph structure, and the node characterization vectors obtained in this way can predict future interaction events or interaction objects more accurately.
Fig. 1 shows a schematic illustration of an implementation scenario according to an embodiment. As shown in Fig. 1, multiple interaction events occurring in sequence may be organized chronologically into a dynamic interaction sequence <E1, E2, …, EN>, where each element Ei represents an interaction event and may be represented as an interaction feature group Ei = (ai, bi, ti), in which ai and bi are the two interacting objects of event Ei and ti is the interaction time.
According to an embodiment of the present specification, a dynamic interaction graph is constructed based on this dynamic interaction sequence. In the dynamic interaction graph, each interaction object ai, bi of each interaction event is represented by a node, and parent-child connecting edges are established between nodes of successive events that contain the same object. The structure of the dynamic interaction graph is described in more detail later.
In order to characterize and analyze the nodes more effectively, the graph coding network is trained using the idea of graph-structure contrastive learning. Specifically, in the contrastive learning process, a pair of nodes u and v corresponding to a certain interaction event is selected from the dynamic interaction graph as a sample node pair. One node u of the sample node pair is encoded by the graph coding network into a node characterization vector h_u. For the other node v, a subgraph Gv of node v in the dynamic interaction graph is obtained; each node in the subgraph Gv is then encoded with the graph coding network, and a graph characterization vector H_Gv of the subgraph Gv is obtained based on the characterization vectors of the encoded nodes. Then, a prediction loss is determined based on the similarity between the node characterization vector h_u and the graph characterization vector H_Gv, and the graph coding network is trained accordingly. In this way, the node characterization and graph characterization corresponding to the two interacting nodes are compared, realizing contrastive learning over the graph structure and thereby training the graph coding network. The trained graph coding network can accurately and deeply characterize the nodes of the dynamic interaction graph based on its graph structure, and the resulting node characterizations can be better used to predict and analyze future interaction events or interaction objects.
Specific implementations of the above concepts are described below.
Fig. 2 shows a flowchart of a method of training a graph coding network according to one embodiment. It is to be appreciated that the method can be performed by any apparatus, device, platform, or device cluster having computing and processing capabilities. Each step of the training method shown in Fig. 2 is described below with reference to specific embodiments.
First, in step 21, a dynamic interaction graph reflecting the associations between interaction events is obtained.
Generally, a plurality of interaction events occurring in sequence can be organized chronologically into an interaction event sequence as described above, and a dynamic interaction graph is constructed based on that sequence to reflect the association relationships of the interaction events. The interaction event sequence, for example expressed as <E1, E2, …, EN>, may comprise a plurality of interaction events arranged in chronological order, where each interaction event Ei can be represented as an interaction feature group Ei = (ai, bi, ti), in which ai and bi are the two interacting objects of event Ei and ti is the interaction time.
In one embodiment, the two interaction objects involved in each interaction event belong to two classes of objects, hereinafter referred to as first type objects and second type objects. For example, on an e-commerce platform, an interaction event may be a user's purchase behavior, whose two objects are a user object (first type object) and a commodity object (second type object). In another example, the interaction event may be a user's click on a page block, where the two objects are a user object (first type object) and a page block object (second type object). In yet another example, the interaction event may be a recommendation event, such as a user (first type object) accepting a pushed recommendation (second type object), where the recommendation may be any kind of pushable content, such as a movie, a product, or an article. In other business scenarios, an interaction event may also be some other interaction behavior occurring between two different classes of objects. It should be understood that although the user object is taken as the first type object in the above examples, this need not be the case: the assignment of first and second type objects may be determined according to the interaction events; some interaction events do not involve a user object, and in interaction events that do, the user object may instead be taken as the second type object.
For the dynamic interaction sequence described above, a dynamic interaction graph may be constructed. Specifically, a node pair (two nodes) represents the two objects involved in one interaction event, and each object of each interaction event in the dynamic interaction sequence is represented by its own node. In case an interaction event involves two types of objects, the two nodes of a node pair represent, respectively, a first type object and a second type object participating in the interaction event. It is to be understood that one node corresponds to one object in one interaction event, but the same physical object may correspond to multiple nodes. For example, if user U1 purchased commodity a1 at time t1 and commodity a2 at time t2, there are two interaction event feature groups (U1, a1, t1) and (U1, a2, t2), and two nodes U1(t1) and U1(t2) are created for user U1 based on these two interaction events, respectively. A node in the dynamic interaction graph can therefore be considered to correspond to the state of an interaction object in one interaction event.
For each node in the dynamic interaction graph, connecting edges are constructed in the following way. For any node i, assume it corresponds to interaction event i (with interaction time t). In the dynamic interaction sequence, tracing back from interaction event i, i.e. toward times earlier than the interaction time t, the first interaction event j (with interaction time t-, where t- is earlier than t) that also contains the object represented by node i is determined as the last interaction event in which that object participated. A connecting edge is then established pointing from node i to the two nodes of this last interaction event j. These two pointed-to nodes are also referred to as the associated nodes of node i.
The following description is made in conjunction with specific examples. FIG. 3 illustrates a dynamic interaction sequence and a dynamic interaction graph constructed therefrom, according to one embodiment. In particular, the left side of FIG. 3 shows a dynamic interaction sequence organized in time order, exemplarily illustrating interaction events E1, E2, …, E6 occurring at times t1, t2, …, t6, respectively. Each interaction event contains the two interaction objects involved in the interaction and the interaction time (for clarity of illustration, the event features are omitted), where the first column of objects are user class objects such as David, Lucy, etc., and the second column of objects are item class objects such as movie names, denoted M1 to M4 in the figure. The right side of FIG. 3 shows a dynamic interaction graph constructed according to the dynamic interaction sequence on the left, in which the two interaction objects in each interaction event are taken as nodes, and any node points, through connecting edges, to the two nodes corresponding to the last interaction event in which the object represented by that node participated. For example, node u(t6) represents the user u in interaction event E6; the last interaction event in which this user participated is E4, so node u(t6) points through connecting edges to the two nodes u(t4) and w(t4) corresponding to E4. Similarly, u(t4) points to the two nodes corresponding to E2, and so on. In this manner, connecting edges are constructed between nodes, thereby forming the dynamic interaction graph of FIG. 3.
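The construction described above can be made concrete with the following sketch (a minimal Python illustration, not part of the patent; the tuple-based node representation and function name are assumptions). Each object occurrence becomes a fresh node, and each new node points back to the node pair of the last event its object participated in:

```python
def build_dynamic_interaction_graph(events):
    """Build a dynamic interaction graph from a time-ordered event list.

    Each event is (obj_a, obj_b, t). Every object occurrence becomes a
    fresh node (obj, t); each new node points, via connecting edges, to
    the two nodes of the last event its object participated in, if any.
    """
    last_pair = {}   # object -> node pair of its most recent event
    edges = {}       # node -> list of the two pointed-to nodes
    nodes = []
    for a, b, t in events:
        pair = [(a, t), (b, t)]
        nodes.extend(pair)
        # build connecting edges BEFORE updating the latest-event record
        for node, obj in zip(pair, (a, b)):
            if obj in last_pair:
                edges[node] = list(last_pair[obj])
        for obj in (a, b):
            last_pair[obj] = pair
    return nodes, edges
```

As in FIG. 3, a user's later event node points back to both nodes of that user's previous event; objects appearing for the first time yield leaf nodes with no outgoing connecting edges.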
The above describes the manner and process of constructing a dynamic interaction graph based on a dynamic interaction sequence. For the training process shown in FIG. 2, the dynamic interaction graph may be constructed in advance or on the fly. Accordingly, in one embodiment, at step 21, the dynamic interaction graph is constructed on the fly from the dynamic interaction sequence, in the manner described above. In another embodiment, the dynamic interaction graph may be constructed in advance based on the dynamic interaction sequence, and in step 21 the already-formed dynamic interaction graph is read or received.
Upon obtaining the dynamic interaction graph, at step 22, a sample node pair corresponding to an interaction event is selected from the graph; for the purpose of training the model, this may be any pair of nodes known to interact. The two nodes in the sample node pair are referred to as a first node and a second node, respectively.
For example, in FIG. 3, assume that the node pair corresponding to interaction event E6 is selected as the sample node pair; then u(t6) and v(t6) serve as the first node and the second node, respectively. Hereinafter, for simplicity of description, the first node is simply denoted as node u, and the second node as node v.
Next, as shown in step 23 of FIG. 2, the first node u is encoded by using the graph coding network to obtain the corresponding first node characterization vector h_u.
It is to be understood that the graph coding network is used for coding any designated target node based on the graph structure of the dynamic interaction graph and the connection relation between the nodes. For this purpose, a subgraph containing a specified target node may be obtained first according to a certain subgraph determination rule (for convenience of description, it is referred to as a first subgraph determination rule); and then inputting the obtained subgraph into a graph coding network, and coding the target node into a node characterization vector based on the subgraph by the graph coding network. The subgraph described above is used to encode the target node and is therefore referred to hereinafter as the encoding subgraph.
Specifically, in an embodiment, the first subgraph determining rule may include determining, in the dynamic interaction graph, a target subgraph formed by nodes in a predetermined range starting from the target node and arriving via the connecting edge, with the target node as a root node. Therefore, the target subgraph can be input into a graph coding network, and the graph coding network outputs the hidden vector of the root node as a node characterization vector corresponding to the target node according to the node attribute characteristics of each node in the input target subgraph and the connection relation between the nodes.
In one embodiment, the nodes within the predetermined range may be nodes reachable through at most a preset number K of connecting edges. The number K is a preset hyper-parameter and can be selected according to the service situation. It will be appreciated that the preset number K represents the number of steps of the historical interaction events traced back from the root node onwards. The larger the number K, the longer the historical interaction information is considered.
In another embodiment, the nodes in the predetermined range may instead be nodes whose interaction time is within a predetermined time range. For example, tracing back a duration T (e.g., one day) from the interaction time of the root node, the nodes in the predetermined range are those reachable through connecting edges whose interaction time falls within that duration.
In yet another embodiment, the predetermined range takes into account both the number of connected sides and the time range. In other words, the nodes in the predetermined range are nodes that are reachable at most through a preset number K of connecting edges and have interaction time within a predetermined time range.
The process of encoding the first node u is described below in conjunction with the example of fig. 3.
In the example of FIG. 3, assume that u(t6) serves as the first node. To encode it, first, the corresponding first coding subgraph is determined according to the first subgraph determination rule. FIG. 4 illustrates an example of a subgraph in a dynamic interaction graph. Specifically, in FIG. 4, starting from node u(t6) as the root node, the graph is traversed along the direction of the connecting edges to determine the nodes within the predetermined range. In the example of FIG. 4, it is assumed that the predetermined range covers the nodes reachable via at most a preset number K = 2 of connecting edges. Then, starting from the current root node u(t6), the nodes reachable via 2 connecting edges are shown as the dashed triangular area in the figure. The nodes and connection relations in this area constitute the first coding subgraph corresponding to the first node u. It will be appreciated that in other examples, the predetermined range may be set in other manners, for example by tracing back a predetermined length of time, etc.
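The predetermined range of at most K connecting edges, optionally combined with a time window, can be collected as in the following sketch (a hypothetical helper, assuming nodes are (object, time) tuples and `edges` maps each node to the two nodes it points to):

```python
def coding_subgraph(edges, root, max_hops=2, t_min=None):
    """Collect the coding subgraph: nodes reachable from `root` via at
    most `max_hops` connecting edges, optionally keeping only nodes whose
    interaction time is within the predetermined time range (>= t_min)."""
    sub_nodes = {root}
    frontier = [root]
    for _ in range(max_hops):
        nxt = []
        for node in frontier:
            for child in edges.get(node, []):
                _, t = child
                if t_min is not None and t < t_min:
                    continue        # outside the predetermined time range
                if child not in sub_nodes:
                    sub_nodes.add(child)
                    nxt.append(child)
        frontier = nxt
    # keep only connecting edges that lie fully inside the subgraph
    sub_edges = {n: cs for n, cs in edges.items()
                 if n in sub_nodes and all(c in sub_nodes for c in cs)}
    return sub_nodes, sub_edges
```

With `max_hops=2` on the graph of FIG. 4 this would yield the dashed triangular area rooted at u(t6); combining `max_hops` with `t_min` corresponds to the embodiment in which both the number of connecting edges and the time range are considered.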
Upon obtaining the first coding subgraph for the first node u, the first coding subgraph may be input into the graph coding network. The graph coding network outputs the hidden vector of the root node, namely the first node u, as the corresponding first node characterization vector h_u, according to the node attribute features of each node in the input subgraph and the connection relations between the nodes.
In one example, the graph coding network may be a graph neural network (GNN) that performs graph processing by graph embedding, such as a graph convolutional network. In the graph embedding process, according to the connection relations among nodes, the nodes in the input coding subgraph are treated as neighbor nodes of the root node within K hops, neighbor aggregation is performed, and the hidden vector of the root node is obtained as its coding vector.
In a specific example, the graph coding network is used for determining attention weights of nodes relative to a root node (a first node in the above example) according to the connection relation among the nodes in the coding subgraph; and according to the attention weight, aggregating the node attribute characteristics of each node to obtain a hidden vector of the root node as a coding vector, namely a first node representation vector.
In another example, the graph coding network may include a temporal recursive layer, such as a recurrent neural network RNN layer, or a long-short term memory LSTM layer, so that the coding subgraph with the target node as the root node is graph-processed in a recursive iterative manner. Specifically, in one embodiment, the graph coding network includes an LSTM layer, where the LSTM layer uses each node from a leaf node to a root node in an input coding subgraph as a current node, and sequentially iterates and processes each node, where the iterative processing includes determining a hidden vector of the current node at least according to a node attribute feature of the current node and hidden vectors of two nodes pointed by the current node through a connecting edge. And finally obtaining a hidden vector of the root node as a node representation vector of the target node through iterative processing from the leaf node to the root node.
Fig. 5 shows a working diagram of the iterative process in the LSTM layer. According to the connection mode of the dynamic interaction graph, a node in the first coding subgraph points, through connecting edges, to the two nodes of the last interaction event in which the object represented by that node participated. Assume node z(t) in the first coding subgraph is the current node, pointing through connecting edges to node j1 and node j2. As shown in FIG. 5, at time T, the LSTM layer has processed node j1 and node j2 to obtain their representation vectors H1 and H2, each including an intermediate vector c and a hidden vector h. At the next time T+, the LSTM layer obtains the representation vector H_z(t) of node z(t) according to the node attribute feature X_z(t) of node z(t) and the representations H1 and H2 of j1 and j2 obtained by the previous processing. It is understood that the representation vector of node z(t) can in turn be used for processing at a subsequent time to obtain the representation vectors of nodes pointing to z(t), thereby implementing the iterative process.
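The leaf-to-root iteration can be sketched as follows (a simplified stand-in, not the patent's implementation: a single shared tanh layer replaces the LSTM gates, and a memoized recursion plays the role of processing nodes in leaf-to-root order; all names and dimensions are illustrative):

```python
import numpy as np

def make_encoder(sub_edges, features, dim=4, seed=0):
    """Return a function mapping a node to its hidden vector, computed
    from the node's attribute feature and the hidden vectors of the two
    nodes it points to (zero vectors for leaf nodes)."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((dim, 3 * dim)) * 0.1   # shared weights
    memo = {}

    def hidden(node):
        if node not in memo:
            kids = [hidden(c) for c in sub_edges.get(node, [])]
            while len(kids) < 2:          # leaves point to no further nodes
                kids.append(np.zeros(dim))
            x = features.get(node, np.zeros(dim))
            memo[node] = np.tanh(W @ np.concatenate([x, kids[0], kids[1]]))
        return memo[node]

    return hidden
```

Calling the returned function on the root node yields its hidden vector, i.e., the node characterization vector of the target node; intermediate results are cached so each node is processed once, mirroring the bottom-up iteration.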
The above-mentioned node attribute characteristics differ depending on the different classes of objects that the node represents. For example, where a node represents a user, the node attribute characteristics may include attribute characteristics of the user, such as age, occupation, education, location, registration duration, crowd labels, and so forth; where the nodes represent items, the node attribute characteristics may include attribute characteristics of the items, such as item category, time on shelf, sales volume, number of reviews, and so forth. And under the condition that the node represents other interactive objects, the node attribute characteristics can be obtained correspondingly based on the attributes of the interactive objects.
Thus, in the above various manners, the graph coding network performs coding representation on the first node u, encoding it into the first node characterization vector h_u.
For the other node in the sample node pair, i.e., the second node v, at step 24, a second structure subgraph Gv corresponding to the second node v is determined. It should be noted that the second structure subgraph Gv is a subgraph determined based on the second node v and used for characterizing the graph structure features related to the second node v; the graph characterization of this subgraph serves as the comparison object against which the node characterization of the first node is subsequently compared. Therefore, the manner of determining the second structure subgraph Gv may be the same as or different from the manner of determining the coding subgraph used for encoding nodes.
Specifically, in an embodiment, the second structure subgraph Gv corresponding to the second node v may be determined according to the first subgraph determination rule adopted for determining the coding subgraph. More specifically, for example, the second node v may be used as a root node, and a sub-graph formed by nodes in a predetermined range starting from the root node and reaching via a connecting edge may be determined in the dynamic interaction graph as the second structure sub-graph Gv.
In another embodiment, a second subgraph determination rule different from the first subgraph determination rule may be used to determine the second structure subgraph Gv. Specifically, in one example, the second subgraph determination rule may determine the range of the subgraph by taking the target node as a central node in the dynamic interaction graph. In another example, the second subgraph determination rule may still trace back a certain range in the history direction with the target node as the root node, but with a traced range different from that in the first subgraph determination rule, for example a different number of connecting edges, and so on.
FIG. 4 also shows an example of a second structure subgraph Gv, as indicated by the circular dashed area. In this example, the second structure subgraph Gv is the subgraph obtained by taking the second node v as the root node and tracing back through a range of 1 connecting edge.
For the second structure subgraph Gv obtained in the above various manners, the nodes it contains are all nodes associated (directly or indirectly) with the second node, and are therefore referred to as second associated nodes (including the second node itself).
For a plurality of second associated nodes included in the second structure sub-graph Gv, next, in step 25, each second associated node is encoded by using the graph encoding network, so as to obtain a plurality of second associated characterizing vectors.
The process of encoding each second associated node is similar to the process of encoding the first node in step 23. Specifically, the plurality of second associated nodes may be respectively used as target nodes, and each associated coding sub-graph corresponding to each second associated node is determined according to the first sub-graph determination rule; and then, respectively inputting the associated coding subgraphs into a graph coding network, wherein the graph coding network outputs the hidden vector of the root node as a corresponding second associated representation vector according to the nodes in the input associated coding subgraphs and the connection relation among the nodes. The specific processing procedure of the graph coding network is as described above and will not be described again.
On the basis of obtaining the plurality of second associated characterization vectors corresponding to the plurality of second associated nodes through the graph coding network, in step 26, a second graph characterization vector H_Gv of the second structure subgraph Gv is determined based on the plurality of second associated characterization vectors.
In one embodiment, the foregoing second associated characterization vectors may be subjected to a pooling operation, and the result of the pooling operation taken as the second graph characterization vector H_Gv; the pooling operation includes max pooling, sum pooling, or average pooling.
For example, the second graph characterization vector can be obtained by the following formula (1):

$$H_{Gv} = \sum_{w \in G_v} h_w \tag{1}$$

where w is any node in the second structure subgraph Gv, i.e., any second associated node, and h_w is the second associated characterization vector obtained by the graph coding network for the second associated node w. In formula (1), the graph characterization vector of the second structure subgraph is obtained by sum-pooling the second associated characterization vectors.
In another embodiment, the attention weight of each second associated characterization vector may instead be determined based on a self-attention mechanism; based on the attention weights, the plurality of second associated characterization vectors are weighted and fused to obtain the second graph characterization vector H_Gv.
Thus, the plurality of second associated characterization vectors h_w can be fused in a variety of ways, and the fusion result taken as the graph characterization vector H_Gv of the second structure subgraph. This graph characterization vector H_Gv reflects the overall characteristics of the plurality of nodes in the second structure subgraph Gv.
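Both fusion options, pooling and attention-weighted fusion, can be sketched as below (illustrative Python; the dot-product attention scoring is an assumption, not the patent's exact self-attention form):

```python
import numpy as np

def graph_vector(assoc_vectors, mode="sum"):
    """Fuse the associated characterization vectors of a structure
    subgraph into a single graph characterization vector."""
    H = np.stack(assoc_vectors)        # shape (n_nodes, dim)
    if mode == "sum":                  # sum pooling, as in formula (1)
        return H.sum(axis=0)
    if mode == "max":
        return H.max(axis=0)
    if mode == "mean":
        return H.mean(axis=0)
    if mode == "attention":            # weighted fusion via attention
        scores = H @ H.mean(axis=0)    # illustrative attention scores
        w = np.exp(scores - scores.max())
        w = w / w.sum()                # softmax attention weights
        return w @ H
    raise ValueError(f"unknown mode: {mode}")
```

Sum pooling preserves the magnitude contributed by every associated node, while the attention variant lets more relevant nodes dominate the graph characterization.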
Next, in step 27, a first loss is determined according to the first node characterization vector h_u and the second graph characterization vector H_Gv, the first loss being negatively correlated with the similarity between the first node characterization vector h_u and the second graph characterization vector H_Gv.
In particular, in one embodiment, the dot product between the first node characterization vector h_u and the second graph characterization vector H_Gv may be calculated as the similarity between them. In another example, a vector distance between the first node characterization vector h_u and the second graph characterization vector H_Gv, such as a cosine distance or a Euclidean distance, may be calculated, and the similarity determined based on the vector distance; the smaller the vector distance, the greater the similarity.
Further, the first loss is determined to be negatively correlated to the above-described similarity.
Specifically, in one example, the first loss L1 may be determined as:

$$L_1 = -\log \sigma(h_u \cdot H_{Gv}) \tag{2}$$

where the dot product of h_u and H_Gv gives the similarity S1, σ denotes the sigmoid function, and the first loss L1 is determined therefrom; L1 decreases as S1 increases.
In another example, the first loss may be determined by the following formula (3):

$$L_1 = -\log \frac{\exp(h_u \cdot H_{Gv})}{\exp(h_u \cdot H_{Gv}) + \sum_x \exp(h_u \cdot H_{Gx})} \tag{3}$$

where x ranges over nodes other than the second node v, Gx is the structure subgraph corresponding to node x, and H_Gx is the graph characterization vector obtained by processing the structure subgraph Gx in the same manner as the second structure subgraph.
Formula (3) means that a plurality of negative sample nodes x different from the second node v can be sampled from the dynamic interaction graph. For each negative sample node x, a corresponding negative sample structure subgraph Gx is determined similarly to step 24; similarly to step 25, the graph coding network is used to encode the nodes included in the negative sample structure subgraph Gx to obtain their corresponding characterization vectors; and similarly to step 26, a negative sample graph characterization vector H_Gx corresponding to the negative sample structure subgraph Gx is obtained based on those characterization vectors. When determining the first loss, the negative sample similarities between the first node characterization vector and the respective negative sample graph characterization vectors are calculated, and the first loss is positively correlated with the sum of these negative sample similarities.
In other examples, the first loss may also have other forms, not enumerated here.
Then, at step 28, the graph coding network is updated with the goal of reducing the first loss L1. It can be understood that since the first loss L1 is negatively correlated with the similarity S1 between the first node characterization vector h_u and the second graph characterization vector H_Gv, decreasing the first loss L1 means increasing the similarity S1. Adjusting the graph coding network toward this goal makes, for two nodes that interact, the node characterization of one node closer to the graph characterization related to the other node. A better way of characterizing nodes is thereby obtained through graph structure comparison, one that better reflects the actual interaction behavior of the nodes.
When the first loss L1 is determined using formula (3), decreasing the first loss L1 means not only increasing the above similarity S1 but also decreasing the respective negative sample similarities. That is, for negative sample nodes that do not interact with the first node, the node characterization of the first node should be as distant as possible from the graph characterizations related to those negative sample nodes. This is equivalent to training the graph coding network from both positive and negative examples.
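A contrastive loss in the spirit of formula (3) can be sketched as follows (an assumed softmax form: the loss falls as the positive similarity grows and rises with the negative-sample similarities):

```python
import numpy as np

def first_loss(h_u, H_pos, H_negs):
    """Contrastive first loss: decreases as the positive similarity
    h_u . H_pos grows, increases with the negative similarities h_u . H_x."""
    s_pos = h_u @ H_pos
    s_neg = np.array([h_u @ H_x for H_x in H_negs])
    logits = np.concatenate([[s_pos], s_neg])
    logits = logits - logits.max()     # shift for numerical stability
    return -np.log(np.exp(logits[0]) / np.exp(logits).sum())
```

With the positive graph characterization close to h_u and negatives far from it, the loss is small; swapping the roles of positive and negative samples drives it up, which is exactly the gradient signal used to update the graph coding network.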
The process of training the graph coding network by comparing the node characterization of the first node u with the graph characterization corresponding to the second node v is described above in connection with FIGS. 2 and 4. It will be appreciated that the roles of the first node and the second node may be interchanged. In one embodiment, after the training through steps 23-28 above, the positions or roles of the first node u and the second node v may be exchanged, and training of the graph coding network continued by comparing the node characterization of the second node with the graph characterization associated with the first node.
FIG. 6 shows a flow of steps of training the graph coding network according to one embodiment. As shown in FIG. 6, in step 61, the second node v is encoded by using the graph coding network to obtain a second node characterization vector h_v. The specific encoding process is similar to the encoding process for the first node u in step 23, and is not described in detail.
Then, in step 62, a first structural sub-graph Gu corresponding to the first node u is determined, wherein a plurality of first associated nodes are included. It is to be understood that the determination of the first structural sub-graph Gu is similar to the determination of the second structural sub-graph Gv in step 24, both using the same sub-graph determination rule.
Next, in step 63, the graph coding network is used to encode the plurality of first associated nodes in Gu to obtain a plurality of first associated characterization vectors; and in step 64, a first graph characterization vector H_Gu of the first structure subgraph Gu is determined based on the plurality of first associated characterization vectors. The specific implementation of steps 63 and 64 is similar to the processing of the second structure subgraph in steps 25 and 26, and is not described again.
Then, in step 65, a second loss L2 is determined, which is negatively correlated with the similarity S2 between the second node characterization vector h_v and the first graph characterization vector H_Gu. The specific form of the second loss L2 may correspond to that of the first loss L1, for example the form of formula (2) or formula (3).
Then, in step 66, the graph coding network is updated with the goal of reducing the second loss L2, so that the training of the graph coding network is continued.
Of course, in practical operation, after the first loss and the second loss are determined, the sum of the first loss and the second loss may be used as the total prediction loss, and the graph coding network may be updated with the total prediction loss reduced as a target.
Reviewing the above process, in the solutions of the above embodiments, by using the idea of graph structure comparison learning, for two nodes which interact with each other, the node characterization of one node is compared with the similarity of the graph characterization related to the other node, so as to train the graph coding network. After training, the graph coding network obtained by training can be used for carrying out coding representation on the nodes in the dynamic interaction graph, and future interaction is predicted according to the coding representation.
The following describes a process of interaction prediction using the graph coding network described above.
FIG. 7 illustrates a flow diagram of a method of predicting an interaction event, according to one embodiment. It is to be appreciated that the method can be performed by any apparatus, device, platform, cluster of devices having computing and processing capabilities. As shown in fig. 7, the method of predicting an interaction event may include the following steps.
At step 71, a graph coding network trained according to the method described above is obtained.
At step 72, a first target node and a second target node corresponding to the first object and the second object to be evaluated are determined from the dynamic interaction graph.
For example, assuming that it is desired to predict whether the object Lucy in FIG. 3 will interact with M2 in the future, the first object is Lucy and the second object is M2. As described above, in the dynamic interaction graph one physical object may correspond to a plurality of nodes; for example, Lucy corresponds to node p and node q. Therefore, when determining the first target node corresponding to the first object, in one example, any one of the plurality of nodes corresponding to the first object in the dynamic interaction graph may be selected as the first target node. More preferably, however, in an example, the node corresponding to the latest time among the plurality of nodes corresponding to the first object is taken as the first target node; that is, a node in the node pair corresponding to the interaction event in which the first object most recently participated is taken as the first target node. For example, when Lucy is the first object, the most recent interaction event in which she participated is E5, so node p may be taken as the first target node. The determination of the second target node is similar.
Next, in step 73, the trained graph coding network is used to code the first target node and the second target node respectively, so as to obtain a first target characterization vector and a second target characterization vector. The encoding process is the same as the above description of the training phase and is not repeated.
Then, in step 74, the probability of the first object and the second object to generate an interaction event is predicted according to the first target characterization vector and the second target characterization vector. In a specific embodiment, a simple operation may be performed on the first target token vector and the second target token vector, for example, similarity calculation, distance calculation, and the like, so as to determine the interaction probability. In another embodiment, the first target characterization vector and the second target characterization vector may be input into a trained classifier, and the classifier outputs a prediction probability of whether the first object and the second object will interact with each other.
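The similarity-based variant of this prediction step can be sketched as follows (an illustrative scoring only, assuming a sigmoid over the dot product; the trained-classifier variant is omitted):

```python
import numpy as np

def predict_interaction_prob(h1, h2):
    """Map the dot-product similarity of the two target characterization
    vectors to a probability of a future interaction event."""
    return 1.0 / (1.0 + np.exp(-(h1 @ h2)))
```

Two objects whose target characterization vectors are well aligned receive a probability above 0.5, while orthogonal characterizations score exactly 0.5, the neutral point of the sigmoid.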
In the process, through the trained graph coding network, coding representation can be carried out on the nodes corresponding to the objects to be evaluated, and whether an interaction event occurs between the two objects is predicted according to the coding representation. The prediction of the interaction event can be used for various scenes such as article recommendation, content recommendation and the like.
According to another aspect of the embodiment, an apparatus of a training graph encoding network is provided, which may be deployed in any device, platform or device cluster with computing and processing capabilities. FIG. 8 shows a schematic block diagram of a training apparatus of a graph coding network according to one embodiment. As shown in fig. 8, the training device 80 includes:
a dynamic graph obtaining unit 801 configured to obtain a dynamic interaction graph, where the dynamic interaction graph includes a plurality of node pairs, each node pair corresponds to an interaction event, two nodes of the dynamic interaction graph respectively represent two objects participating in the interaction event, and an arbitrary node points to two nodes corresponding to a previous interaction event where an object represented by the node participates through a connecting edge;
a node selecting unit 802 configured to select a first node and a second node constituting a node pair from the dynamic interaction graph;
a first encoding unit 803, configured to encode the first node by using the graph coding network to obtain a first node representation vector;
a structure sub-graph determining unit 804 configured to determine a second structure sub-graph corresponding to a second node, where the second structure sub-graph includes a plurality of second associated nodes;
a second encoding unit 805 configured to encode the plurality of second association nodes by using the graph coding network to obtain a plurality of second association characterization vectors;
a graph vector determination unit 806 configured to determine a second graph feature vector of the second structural subgraph based on the plurality of second associated feature vectors;
a penalty determination unit 807 configured to determine a first penalty negatively correlated with a similarity between the first node characterization vector and the second graph characterization vector;
an updating unit 808 configured to update the graph coding network with a goal of reducing the first loss.
According to an embodiment, the first encoding unit 803 is specifically configured to: determine a first coding subgraph by taking the first node as the target node according to a first subgraph determination rule, the rule including determining, in the dynamic interaction graph, a subgraph formed by nodes within a predetermined range reached via connecting edges starting from the target node as the root node; and input the first coding subgraph into the graph coding network, where the graph coding network outputs the hidden vector of the root node as the first node characterization vector according to the nodes in the first coding subgraph and the connection relations among the nodes.
Further, in a specific embodiment, the nodes in the predetermined range may include: nodes reachable through at most a preset number K of connecting edges; and/or nodes whose interaction time is within a predetermined time range.
In one embodiment, the structural subgraph determining unit 804 is specifically configured to: and taking the second node as a target node, determining a rule according to the first subgraph, and determining a corresponding subgraph as the second structural subgraph.
In another embodiment, the structural subgraph determining unit 804 is specifically configured to: and taking the second node as a target node, determining a rule according to a second subgraph, and determining a corresponding subgraph as the second structural subgraph, wherein the second subgraph determination rule is different from the first subgraph determination rule.
In one embodiment, the second encoding unit 805 is specifically configured to: respectively taking the plurality of second associated nodes as target nodes, determining rules according to the first subgraph, and determining each associated coding subgraph corresponding to each second associated node; and respectively inputting the associated coding subgraphs into the graph coding network, and outputting the hidden vector of the root node as a corresponding second associated representation vector by the graph coding network according to the nodes in the input associated coding subgraphs and the connection relation among the nodes.
Further, in an embodiment, the graph coding network includes an LSTM layer, and for any sub-graph input therein, the LSTM layer respectively uses each node from a leaf node to a root node in the sub-graph as a current node, and sequentially iterates each node, where the iterative processing includes determining a hidden vector of the current node at least according to a node attribute feature of the current node and a hidden vector of two nodes pointed by the current node through a connecting edge.
According to an embodiment, the map vector determination unit 806 is specifically configured to: performing pooling operation on the plurality of second associated characterization vectors, and taking the result of the pooling operation as the second graph characterization vector; the pooling operation includes maximum pooling, sum pooling, or average pooling.
According to another embodiment, the map vector determination unit 806 is specifically configured to: determining an attention weight for each of the plurality of second associated characterization vectors based on a self-attention mechanism; and based on the attention weight, performing weighted fusion on the plurality of second associated characterization vectors to obtain the second chart characterization vector.
In one embodiment, the loss determination unit 807 is specifically configured to: calculate the dot product between the first node characterization vector and the second graph characterization vector as the similarity between them; or calculate the vector distance between the first node characterization vector and the second graph characterization vector, and determine the similarity according to that vector distance.
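Both similarity options can be written in a few lines. The distance-to-similarity mapping below (1 / (1 + d)) is one possible monotone-decreasing choice, assumed here; the patent only requires that similarity be determined from the distance.

```python
import math

def dot_similarity(u, v):
    """Similarity as the dot product of the two characterization vectors."""
    return sum(a * b for a, b in zip(u, v))

def distance_similarity(u, v):
    """Similarity decreasing in Euclidean distance (mapping is an assumption)."""
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return 1.0 / (1.0 + d)

print(dot_similarity([1.0, 2.0], [3.0, 4.0]))       # → 11.0
print(distance_similarity([0.0, 0.0], [3.0, 4.0]))  # distance 5 → 1/6
```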
According to one embodiment, the node selection unit 802 is further configured to sample, from the dynamic interaction graph, a plurality of negative sample nodes different from the second node; the structure subgraph determining unit 804 is further configured to determine the negative sample structure subgraph corresponding to each negative sample node; the second encoding unit 805 is further configured to, for each negative sample structure subgraph, encode the child nodes in the subgraph by using the graph coding network to obtain the characterization vectors corresponding to the child nodes; the graph vector determination unit 806 is further configured to determine, based on the characterization vectors corresponding to the child nodes, the negative sample graph characterization vector corresponding to the negative sample structure subgraph; in this case, the loss determination unit 807 is further configured to calculate the negative sample similarity between the first node characterization vector and each negative sample graph characterization vector, such that the first loss is positively correlated with the sum of the negative sample similarities.
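A loss that decreases with the positive-pair similarity and increases with each negative-sample similarity can be sketched with a softmax-style contrastive form. This specific formula is an assumption for illustration; the patent only constrains the loss's correlation with the similarities, not its exact shape.

```python
import math

def first_loss(pos_sim, neg_sims):
    """Contrastive loss: negatively correlated with the positive similarity,
    positively correlated with each negative-sample similarity."""
    denom = math.exp(pos_sim) + sum(math.exp(s) for s in neg_sims)
    return -math.log(math.exp(pos_sim) / denom)

# Raising the positive similarity lowers the loss; adding negatives raises it.
print(first_loss(2.0, [0.0]) < first_loss(0.0, [0.0]))        # True
print(first_loss(1.0, [0.0, 0.0]) > first_loss(1.0, [0.0]))   # True
```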
In an embodiment, the first encoding unit 803 is further configured to encode the second node by using the graph coding network to obtain a second node characterization vector; the structural subgraph determining unit 804 is further configured to determine a first structural subgraph corresponding to the first node, which includes a plurality of first associated nodes; the second encoding unit 805 is further configured to encode the plurality of first associated nodes by using the graph coding network to obtain a plurality of first associated characterization vectors; the graph vector determination unit 806 is further configured to determine a first graph characterization vector of the first structural subgraph based on the plurality of first associated characterization vectors; the loss determination unit 807 is further configured to determine a second loss negatively correlated with the similarity between the second node characterization vector and the first graph characterization vector; and the updating unit 808 is configured to update the graph coding network with the goal of reducing the second loss.
Through the training apparatus of the above embodiments, the graph coding network is trained on the basis of the dynamic interaction graph using the idea of contrastive learning over graph structures.
According to an embodiment of yet another aspect, an apparatus for predicting an interaction event is provided, which may be deployed in any device, platform or cluster of devices having computing and processing capabilities. Fig. 9 shows a schematic block diagram of an apparatus for predicting an interaction event according to an embodiment. As shown in fig. 9, the prediction device 90 includes:
a model obtaining unit 901 configured to obtain the graph coding network obtained by the training;
a target node obtaining unit 902, configured to determine, from the dynamic interaction graph, a first target node and a second target node corresponding to a first object and a second object to be evaluated, respectively;
a graph encoding unit 903, configured to encode the first target node and the second target node respectively by using the graph encoding network to obtain a first target characterization vector and a second target characterization vector;
a predicting unit 904 configured to predict a probability of an interaction event occurring between the first object and the second object according to the first target characterization vector and the second target characterization vector.
With the prediction device 90, nodes are encoded using the trained graph coding network, and interaction events between objects are predicted based on the resulting characterization vectors.
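The final prediction step, turning two target characterization vectors into an interaction probability, can be sketched as below. The sigmoid-of-dot-product scoring is one common choice assumed here; the patent leaves the exact prediction function open.

```python
import math

def predict_interaction_prob(vec1, vec2):
    """Probability of an interaction event between the two encoded objects.
    Scoring by sigmoid of the dot product is an illustrative assumption."""
    s = sum(a * b for a, b in zip(vec1, vec2))
    return 1.0 / (1.0 + math.exp(-s))

# Orthogonal target vectors score 0, giving a neutral probability of 0.5.
print(predict_interaction_prob([0.0, 0.0], [1.0, 1.0]))  # → 0.5
```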
According to an embodiment of another aspect, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method described in connection with fig. 2.
According to an embodiment of yet another aspect, there is also provided a computing device comprising a memory and a processor, the memory having stored therein executable code, the processor, when executing the executable code, implementing the method described in connection with fig. 2.
Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in this invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
The above embodiments further describe the objects, technical solutions and advantages of the present invention in detail. It should be understood that the above are only exemplary embodiments of the present invention and are not intended to limit its scope; any modifications, equivalent substitutions, improvements and the like made on the basis of the technical solutions of the present invention shall be included in the scope of the present invention.
Claims (18)
1. A method of training a graph coding network, the method comprising:
acquiring a dynamic interaction graph, wherein the dynamic interaction graph comprises a plurality of node pairs, each node pair corresponds to an interaction event, two nodes respectively represent two objects participating in the interaction event, and any node points to two nodes corresponding to the last interaction event participated in by the object represented by the node through a connecting edge;
selecting a first node and a second node from the dynamic interaction graph, wherein the first node and the second node form a node pair;
coding the first node by using the graph coding network to obtain a first node representation vector;
determining a second structural subgraph corresponding to the second node, wherein the second structural subgraph comprises a plurality of second associated nodes;
encoding the plurality of second association nodes by using the graph encoding network to obtain a plurality of second association characterization vectors;
determining a second graph feature vector of the second structural sub-graph based on the plurality of second associated feature vectors;
determining a first loss that is inversely related to a similarity between the first node characterization vector and the second graph characterization vector;
updating the graph coding network with a goal of reducing the first loss.
2. The method of claim 1, wherein encoding the first node using the graph coding network to obtain a first node characterization vector comprises:
determining a first coding subgraph by taking the first node as a target node according to a first subgraph determination rule, wherein the first subgraph determination rule is: determining, in the dynamic interaction graph, the subgraph formed by the nodes within a preset range that are reachable via connecting edges from the target node serving as the root node;
and inputting the first coding subgraph into the graph coding network, wherein the graph coding network outputs the hidden vector of the root node as the first node characterization vector according to the nodes in the first coding subgraph and the connection relations among the nodes.
3. The method of claim 2, wherein the predetermined range of nodes comprises:
nodes reachable within a preset number K of connecting edges; and/or
nodes whose interaction time falls within a preset time range.
4. The method of claim 2, wherein determining a second structural subgraph corresponding to a second node comprises:
taking the second node as the target node, and determining the corresponding subgraph as the second structural subgraph according to the first subgraph determination rule.
5. The method of claim 2, wherein determining a second structural subgraph corresponding to a second node comprises:
taking the second node as the target node, and determining the corresponding subgraph as the second structural subgraph according to a second subgraph determination rule, wherein the second subgraph determination rule is different from the first subgraph determination rule.
6. The method of claim 2, wherein encoding the plurality of second associated nodes using the graph coding network to obtain a plurality of second associated token vectors comprises:
taking each of the plurality of second associated nodes as a target node, and determining, according to the first subgraph determination rule, the associated coding subgraph corresponding to each second associated node;
and inputting each associated coding subgraph into the graph coding network, wherein the graph coding network outputs the hidden vector of the root node as the corresponding second associated characterization vector according to the nodes in the input associated coding subgraph and the connection relations among the nodes.
7. The method according to claim 2 or 6, wherein the graph coding network comprises an LSTM layer; for any subgraph input thereto, the LSTM layer takes each node, from the leaf nodes to the root node, as a current node and iteratively processes the nodes in sequence, the iterative processing comprising determining the hidden vector of the current node at least according to the node attribute feature of the current node and the hidden vectors of the two nodes pointed to by the current node through connecting edges.
8. The method of claim 1, wherein determining a second graph feature vector of the second structural sub-graph based on the plurality of second associated feature vectors comprises:
performing pooling operation on the plurality of second associated characterization vectors, and taking the result of the pooling operation as the second graph characterization vector; the pooling operation includes maximum pooling, sum pooling, or average pooling.
9. The method of claim 1, wherein determining a second graph feature vector of the second structural sub-graph based on the plurality of second associated feature vectors comprises:
determining an attention weight for each of the plurality of second associated characterization vectors based on a self-attention mechanism;
and based on the attention weights, performing weighted fusion on the plurality of second associated characterization vectors to obtain the second graph characterization vector.
10. The method of claim 1, wherein determining a first loss comprises:
calculating a dot product between the first node characterization vector and the second graph characterization vector as a similarity between the first node characterization vector and the second graph characterization vector; or,
calculating the vector distance between the first node characterization vector and the second graph characterization vector, and determining the similarity according to the vector distance.
11. The method of claim 1, further comprising:
sampling a plurality of negative sample nodes different from the second node from the dynamic interaction graph;
determining a negative sample structure subgraph corresponding to each negative sample node;
for each negative sample structure subgraph, encoding the child nodes in the subgraph by using the graph coding network to obtain the characterization vectors corresponding to the child nodes; and determining, based on the characterization vectors corresponding to the child nodes, the negative sample graph characterization vector corresponding to the negative sample structure subgraph;
wherein determining the first loss comprises calculating the negative sample similarity between the first node characterization vector and each negative sample graph characterization vector, such that the first loss is positively correlated with the sum of the negative sample similarities.
12. The method of claim 1, further comprising,
coding the second node by using the graph coding network to obtain a second node representation vector;
determining a first structural subgraph corresponding to a first node, wherein the first structural subgraph comprises a plurality of first associated nodes;
encoding the plurality of first association nodes by using the graph encoding network to obtain a plurality of first association characterization vectors;
determining a first graph feature vector of the first structural sub-graph based on the plurality of first associated feature vectors;
determining a second loss negatively correlated with the similarity between the second node characterization vector and the first graph characterization vector;
updating the graph coding network with a goal of reducing the second loss.
13. A method of predicting an interaction event, the method comprising:
obtaining a graph coding network trained according to the method of claim 1;
determining a first target node and a second target node respectively corresponding to a first object and a second object to be evaluated from the dynamic interaction graph;
respectively coding the first target node and the second target node by using the graph coding network to obtain a first target characterization vector and a second target characterization vector;
and predicting the probability of the interaction event of the first object and the second object according to the first target characterization vector and the second target characterization vector.
14. The method of claim 13, wherein the first target node is a node in a corresponding pair of nodes in the dynamic interaction graph of interaction events that the first object recently participated in.
15. An apparatus for training a graph coding network, the apparatus comprising:
a dynamic graph acquiring unit configured to acquire a dynamic interaction graph, wherein the dynamic interaction graph comprises a plurality of node pairs, each node pair corresponds to one interaction event, the two nodes respectively represent the two objects participating in the interaction event, and any node points through connecting edges to the two nodes corresponding to the last interaction event participated in by the object represented by that node;
a node selection unit configured to select a first node and a second node constituting a node pair from the dynamic interaction graph;
the first coding unit is configured to code the first node by using the graph coding network to obtain a first node representation vector;
the structure subgraph determining unit is configured to determine a second structure subgraph corresponding to the second node, wherein the second structure subgraph comprises a plurality of second associated nodes;
a second encoding unit configured to encode the plurality of second association nodes by using the graph coding network to obtain a plurality of second association characterization vectors;
a graph vector determination unit configured to determine a second graph feature vector of the second structural subgraph based on the plurality of second associated feature vectors;
a loss determination unit configured to determine a first loss that is negatively correlated with a similarity between the first node characterization vector and the second graph characterization vector;
an updating unit configured to update the graph coding network with a goal of reducing the first loss.
16. An apparatus to predict an interaction event, the apparatus comprising:
a model acquisition unit configured to acquire a graph coding network trained by the apparatus according to claim 15;
the target node obtaining unit is configured to determine a first target node and a second target node which correspond to a first object to be evaluated and a second object to be evaluated respectively from the dynamic interaction graph;
the graph coding unit is configured to code the first target node and the second target node respectively by using the graph coding network to obtain a first target characterization vector and a second target characterization vector;
and a predicting unit configured to predict, according to the first target characterization vector and the second target characterization vector, the probability of an interaction event occurring between the first object and the second object.
17. A computer-readable storage medium, on which a computer program is stored which, when executed in a computer, causes the computer to carry out the method of any one of claims 1-14.
18. A computing device comprising a memory and a processor, wherein the memory has stored therein executable code that, when executed by the processor, performs the method of any of claims 1-14.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110196034.2A CN112580789B (en) | 2021-02-22 | 2021-02-22 | Training graph coding network, and method and device for predicting interaction event |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112580789A CN112580789A (en) | 2021-03-30 |
CN112580789B true CN112580789B (en) | 2021-06-25 |
Family
ID=75113935
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110196034.2A Active CN112580789B (en) | 2021-02-22 | 2021-02-22 | Training graph coding network, and method and device for predicting interaction event |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112580789B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113283589B (en) * | 2021-06-07 | 2022-07-19 | 支付宝(杭州)信息技术有限公司 | Updating method and device of event prediction system |
CN115496174B (en) * | 2021-06-18 | 2023-09-26 | 中山大学 | Method for optimizing network representation learning, model training method and system |
CN113987280B (en) * | 2021-10-27 | 2024-07-12 | 支付宝(杭州)信息技术有限公司 | Method and device for training graph model aiming at dynamic graph |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111160521A (en) * | 2019-12-09 | 2020-05-15 | 南京航空航天大学 | Urban human flow pattern detection method based on deep neural network image encoder |
CN111695702B (en) * | 2020-06-16 | 2023-11-03 | 腾讯科技(深圳)有限公司 | Training method, device, equipment and storage medium of molecular generation model |
CN111523682B (en) * | 2020-07-03 | 2020-10-23 | 支付宝(杭州)信息技术有限公司 | Method and device for training interactive prediction model and predicting interactive object |
CN112085279B (en) * | 2020-09-11 | 2022-09-06 | 支付宝(杭州)信息技术有限公司 | Method and device for training interactive prediction model and predicting interactive event |
CN112084427A (en) * | 2020-09-15 | 2020-12-15 | 辽宁工程技术大学 | Interest point recommendation method based on graph neural network |
CN112085293B (en) * | 2020-09-18 | 2022-09-09 | 支付宝(杭州)信息技术有限公司 | Method and device for training interactive prediction model and predicting interactive object |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||