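WO2021139525A1 - Method and apparatus for training an autoencoder for evaluating interaction events - Google Patents
Method and apparatus for training an autoencoder for evaluating interaction events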
- Publication number
- WO2021139525A1 (PCT/CN2020/138401, CN2020138401W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- node
- vector
- sample
- subgraph
- nodes
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
- G06N3/088—Non-supervised learning, e.g. competitive learning
Definitions
- One or more embodiments of the present specification relate to the field of machine learning, and in particular, to a method and apparatus for training an autoencoder for evaluating interactive events.
- Interaction events are one of the basic elements of Internet events.
- For example, a user's click while browsing a page can be regarded as an interaction event between the user and a page content block.
- A purchase in e-commerce can be regarded as an interaction event between a user and a product.
- A transfer between accounts is an interaction event between two users.
- A series of user interaction events contains the user's fine-grained habits and preferences, as well as the characteristics of the interaction objects, and is therefore an important source of features for machine learning models. In many scenarios it is thus desirable to express and model the characteristics of interaction participants based on interaction events, and then to analyze the interaction events, especially their security, so as to ensure the security of the interaction platform.
- However, an interaction event involves both parties to the interaction, and the state of each participant can itself change dynamically. It is therefore very difficult to express the characteristics of the participants accurately while taking all of these aspects into account. An improved scheme for analyzing and processing interaction events more effectively is hoped for.
- One or more embodiments of this specification describe a method and device for training an autoencoder for evaluating interaction events, wherein an autoencoder for a specific type of interaction event can be trained based on a dynamic interaction graph, so that the trained autoencoder can be used to evaluate and analyze interaction events.
- A method for training an autoencoder for evaluating interaction events, comprising:
- obtaining a dynamic interaction graph reflecting the association relationship of interaction events, the graph including multiple pairs of nodes, each pair of nodes representing the two objects in one interaction event, wherein any node i is connected through connecting edges to two child nodes, and the two child nodes are the two nodes corresponding to the last interaction event in which the object represented by node i participated;
- taking the first sample node and the second sample node corresponding to a sample interaction event of a predetermined category as root nodes, and determining in the dynamic interaction graph a first sample subgraph and a second sample subgraph, each formed by the nodes within a predetermined range reached from the respective root node via connecting edges;
- obtaining the autoencoder to be trained, wherein the autoencoder includes an LSTM layer, and the LSTM layer iteratively processes each node from the leaf nodes to the root node according to the parent-child relationship between the nodes in the input subgraph, the iterative processing including determining the hidden vector of the current processing node at least according to the node feature of the current processing node and the hidden vectors of its two child nodes;
- inputting the first sample subgraph and the second sample subgraph separately into the autoencoder to obtain the first sample vector corresponding to the first sample node and the second sample vector corresponding to the second sample node;
- reversing the parent-child relationship between the nodes in the first sample subgraph and the second sample subgraph, and merging the reversed subgraphs to form a reverse sample subgraph, the reverse sample subgraph including a sample node set formed by the union of the nodes in the first sample subgraph and the second sample subgraph;
- inputting the reverse sample subgraph into the autoencoder to obtain the hidden vector of each node in the sample node set, wherein the hidden vectors of the leaf nodes of the reverse sample subgraph are determined according to the first sample vector and the second sample vector;
- determining the prediction loss from the aggregate of the distances between the hidden vector of each node in the sample node set and its node feature, and updating the autoencoder in the direction that reduces the prediction loss.
- The above-mentioned nodes within the predetermined range may include: child nodes within K levels, reachable via at most a preset number K of connecting edges; and/or child nodes whose interaction time is within a preset time range.
- The node feature of the current processing node includes the attribute features of the object corresponding to the node.
- When the current processing node is a user node, the attribute features may include at least one of: age, occupation, education level, location, registration duration, and crowd tag; when the current processing node is an item node, the attribute features may include at least one of: item category, shelf time, number of reviews, and sales.
- The node feature of the current processing node may further include the event behavior features of the interaction event corresponding to the node.
- The LSTM layer determines the hidden vector of the current processing node in the following way: the node feature of the current processing node is combined with the hidden vector of each of the two child nodes, and each combination is input into a first transformation function and a second transformation function, yielding two first transformation vectors and two second transformation vectors; the intermediate vector (used for auxiliary operations) of the i-th of the two child nodes is combined with the corresponding i-th first transformation vector and i-th second transformation vector, yielding two operation results, which are summed to obtain a combined vector; the node feature of the current processing node, together with the hidden vectors of the two child nodes, is input into a third transformation function and a fourth transformation function to obtain a third transformation vector and a fourth transformation vector, respectively; the intermediate vector of the current processing node is determined based on the combined vector and the third transformation vector; and the hidden vector of the current processing node is determined based on its intermediate vector and the fourth transformation vector.
- The iterative processing performed by the LSTM layer includes determining the hidden vector of the current processing node according to the node feature of the current processing node, the hidden vectors of the two child nodes, and the time difference between a first interaction time, at which the interaction event of the current processing node occurred, and a second interaction time, at which the interaction event of the two child nodes occurred.
- In this case, the LSTM layer determines the hidden vector of the current processing node in the following way: the node feature of the current processing node and the time difference are combined with the hidden vector of each of the two child nodes and input into the first transformation function, yielding two first transformation vectors; the node feature is combined with the hidden vector of each of the two child nodes and input into the second transformation function, yielding two second transformation vectors; the intermediate vector (used for auxiliary operations) of the i-th of the two child nodes is combined with the corresponding i-th first transformation vector and i-th second transformation vector, yielding two operation results, which are summed to obtain the combined vector; the node feature of the current processing node, together with the hidden vectors of the two child nodes, is input into the third transformation function and the fourth transformation function to obtain the third transformation vector and the fourth transformation vector, respectively; the intermediate vector of the current processing node is determined based on the combined vector and the third transformation vector; and the hidden vector of the current processing node is determined based on its intermediate vector and the fourth transformation vector.
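The node update with a time difference can be illustrated with a minimal sketch. The text does not specify the transformation functions or the combining operations, so everything concrete below is an assumption: "combining" inputs is taken to mean concatenation, the first, second and fourth transformations are sigmoid gates, the third is a tanh candidate, the operation on a child's intermediate vector is an element-wise product, and the intermediate vector is the sum of the combined vector and the third transformation vector. The names `affine`, `init_params` and `node_update` are hypothetical.

```python
import math
import random

def affine(W, b, x):
    """Return W @ x + b for a weight matrix W (list of rows) and bias b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]

def init_params(x_dim, h_dim, scale=0.1, seed=0):
    """Random (weights, bias) for the four assumed transformation functions."""
    rng = random.Random(seed)
    in_dims = {"first": x_dim + 1 + h_dim,   # input [x, dt, h_i]
               "second": x_dim + h_dim,      # input [x, h_i]
               "third": x_dim + 2 * h_dim,   # input [x, h_1, h_2]
               "fourth": x_dim + 2 * h_dim}  # input [x, h_1, h_2]
    return {name: ([[rng.uniform(-scale, scale) for _ in range(n)]
                    for _ in range(h_dim)], [0.0] * h_dim)
            for name, n in in_dims.items()}

def node_update(p, x, dt, children):
    """children: two (hidden, intermediate) pairs. Returns (hidden, intermediate)."""
    (h1, _), (h2, _) = children
    combined = [0.0] * len(h1)
    for h_i, c_i in children:
        first = sigmoid(affine(*p["first"], x + [dt] + h_i))
        second = sigmoid(affine(*p["second"], x + h_i))
        # Assumed operation: element-wise product with the child's intermediate vector.
        combined = [acc + f * s * c for acc, f, s, c in zip(combined, first, second, c_i)]
    third = [math.tanh(a) for a in affine(*p["third"], x + h1 + h2)]
    fourth = sigmoid(affine(*p["fourth"], x + h1 + h2))
    c = [v + u for v, u in zip(combined, third)]          # intermediate vector
    h = [o * math.tanh(ci) for o, ci in zip(fourth, c)]   # hidden vector
    return h, c
```

Processing a subgraph then amounts to calling `node_update` on each node after its two children have been processed; using zero vectors for the missing children of leaf nodes is a further assumption of this sketch.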
- The step of forming a reverse sample subgraph specifically includes: reversing the parent-child relationship between the nodes in the first sample subgraph to form a first reverse subgraph with the first sample node as a leaf node; reversing the parent-child relationship between the nodes in the second sample subgraph to form a second reverse subgraph with the second sample node as a leaf node; merging the common nodes of the first reverse subgraph and the second reverse subgraph to form a merged subgraph; and, in the merged subgraph, adding a default child node to each node that has only one child node, thereby forming the reverse sample subgraph.
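The reverse-and-merge step can be sketched as follows. This is an illustration under assumptions, not the patent's implementation: each subgraph is represented as a dictionary mapping a node to the list of its child nodes, nodes are hypothetical `(object, time)` tuples, common nodes are merged simply by sharing the same key, and `DEFAULT` is a placeholder for the default child node.

```python
DEFAULT = ("<default>", None)  # hypothetical placeholder for the default child node

def reverse_and_merge(sub1, sub2):
    """Reverse parent-child edges of two subgraphs and merge them.

    sub1, sub2: dict mapping node -> list of child nodes.
    Returns a dict mapping node -> list of child nodes in the reverse subgraph.
    """
    reversed_children = {}
    for sub in (sub1, sub2):
        for parent, kids in sub.items():
            reversed_children.setdefault(parent, [])
            for kid in kids:
                # In the reversed graph the former child becomes the parent.
                new_kids = reversed_children.setdefault(kid, [])
                if parent not in new_kids:
                    new_kids.append(parent)
    # Pad nodes that have exactly one child with the default child node.
    for kids in reversed_children.values():
        if len(kids) == 1:
            kids.append(DEFAULT)
    return reversed_children

# Hypothetical sample event (u, w) at time 3, with two-level subgraphs.
sub1 = {("u", 3): [("u", 1), ("w", 1)], ("u", 1): [], ("w", 1): []}
sub2 = {("w", 3): [("w", 2), ("y", 2)], ("w", 2): [("u", 1), ("w", 1)],
        ("y", 2): [], ("u", 1): [], ("w", 1): []}
rev = reverse_and_merge(sub1, sub2)
leaves = [n for n, kids in rev.items() if not kids]
print(leaves)
```

In the result, the two original root nodes are the only nodes without children, which matches the description that the sample nodes become the leaf nodes of the reverse sample subgraph.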
- The leaf nodes of the reverse sample subgraph include a first leaf node corresponding to the first sample node and a second leaf node corresponding to the second sample node; in one embodiment, the hidden vector of the first leaf node is the first sample vector, and the hidden vector of the second leaf node is the second sample vector.
- In another embodiment, the hidden vector of the first leaf node is determined based on the node feature of the first leaf node, with the first sample vector and the second sample vector used as the hidden vectors of its two child nodes.
- Likewise, the hidden vector of the second leaf node is determined based on the node feature of the second leaf node, with the first sample vector and the second sample vector used as the hidden vectors of its two child nodes.
- The autoencoder may include multiple LSTM layers, wherein the hidden vector of the current processing node determined by one LSTM layer is input to the next LSTM layer as the node feature of the current processing node.
- In that case, the prediction loss can be determined in the following manner: for each node in the sample node set, determine the distance between the hidden vector of the node output by the last of the multiple LSTM layers and the node feature of the node input to the first LSTM layer; the prediction loss is then determined from the aggregate of the distances corresponding to the individual nodes.
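As a rough illustration of this reconstruction-style loss, the sketch below assumes that the distance is the squared Euclidean distance, that the aggregate is the mean over nodes, and that the hidden vectors have the same dimension as the node features. None of these choices is fixed by the text, and the function name `prediction_loss` is hypothetical.

```python
def prediction_loss(hidden, features):
    """Mean squared Euclidean distance between hidden vectors and node features.

    hidden:   dict node -> hidden vector output by the last LSTM layer
    features: dict node -> node feature input to the first LSTM layer
    """
    total = 0.0
    for node, h in hidden.items():
        x = features[node]
        if len(h) != len(x):
            raise ValueError("hidden vector and node feature must have equal length")
        total += sum((hi - xi) ** 2 for hi, xi in zip(h, x))
    return total / len(hidden)

hidden = {"a": [1.0, 2.0], "b": [0.0, 0.0]}
features = {"a": [1.0, 0.0], "b": [3.0, 4.0]}
print(prediction_loss(hidden, features))  # (4 + 25) / 2 = 14.5
```

During training, the parameters of the autoencoder would be updated, for example by gradient descent, so that this value decreases.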
- A method for evaluating interaction events using an autoencoder, comprising: obtaining a dynamic interaction graph reflecting the association relationship of interaction events, the graph including multiple pairs of nodes, each pair of nodes representing the two objects in one interaction event,
- wherein any node i is connected through connecting edges to two child nodes, and the two child nodes are the two nodes corresponding to the last interaction event in which the object represented by node i participated;
- taking the first target node and the second target node corresponding to the target event to be analyzed as root nodes, and determining in the dynamic interaction graph the first target subgraph and the second target subgraph formed by the nodes within a predetermined range reached from the root node via connecting edges;
- the reverse target subgraph includes a target node set formed by the union of
- In one case, the target event is a hypothetical event, and the events of the predetermined category are events confirmed to have occurred.
- In another case, the target event is an event that has occurred, and the events of the predetermined category are events confirmed to be safe.
- A device for training an autoencoder for evaluating interaction events, comprising:
- an interaction graph acquiring unit, configured to acquire a dynamic interaction graph reflecting the association relationship of interaction events, the graph including multiple pairs of nodes, each pair of nodes representing the two objects in one interaction event, wherein any node i is connected through connecting edges to two child nodes, and the two child nodes are the two nodes corresponding to the last interaction event in which the object represented by node i participated;
- a sample subgraph acquisition unit, configured to take the first sample node and the second sample node corresponding to a sample interaction event of a predetermined category as root nodes, and to determine in the dynamic interaction graph a first sample subgraph and a second sample subgraph formed by the nodes within a predetermined range reached from the root node via connecting edges;
- an encoder acquisition unit, configured to acquire the autoencoder to be trained, wherein the autoencoder includes an LSTM layer, and the LSTM layer iteratively processes each node from the leaf nodes to the root node according to the parent-child relationship between the nodes in the input subgraph, the iterative processing including determining the hidden vector of the current processing node at least according to the node feature of the current processing node and the hidden vectors of its two child nodes;
- a sample subgraph processing unit, configured to input the first sample subgraph and the second sample subgraph separately into the autoencoder to obtain the first sample vector corresponding to the first sample node and the second sample vector corresponding to the second sample node;
- a reverse subgraph forming unit, configured to reverse the parent-child relationship between the nodes in the first sample subgraph and the second sample subgraph, and to merge the reversed subgraphs to form a reverse sample subgraph, the reverse sample subgraph including a sample node set formed by the union of the nodes in the first sample subgraph and the second sample subgraph;
- a processing unit, configured to input the reverse sample subgraph into the autoencoder to obtain the hidden vector of each node in the sample node set, wherein the hidden vectors of the leaf nodes of the reverse sample subgraph are determined according to the first sample vector and the second sample vector;
- an update unit, configured to determine the prediction loss from the aggregate of the distances between the hidden vector of each node in the sample node set and its node feature, and to update the autoencoder in the direction that reduces the prediction loss.
- A device for evaluating interaction events using an autoencoder, comprising: an interaction graph acquiring unit, configured to acquire a dynamic interaction graph reflecting the association relationship of interaction events, the graph including multiple pairs of nodes, each pair of nodes representing the two objects in one interaction event, wherein any node i is connected through connecting edges to two child nodes, and the two child nodes are the two nodes corresponding to the last interaction event in which the object represented by node i participated; a target subgraph acquiring unit, configured to take the first target node and the second target node corresponding to the target event to be analyzed as root nodes, and to determine in the dynamic interaction graph the first target subgraph and the second target subgraph formed by the nodes within a predetermined range reached from the root node via connecting edges; an encoder acquisition unit, configured to acquire the autoencoder obtained by training using the apparatus of the third aspect; a target subgraph processing unit, configured to separately input the first target subgraph and the second target
- a computer-readable storage medium having a computer program stored thereon, and when the computer program is executed in a computer, the computer is caused to execute the method of the first aspect or the second aspect.
- A computing device including a memory and a processor, characterized in that executable code is stored in the memory, and when the processor executes the executable code, the method of the first aspect or the second aspect is implemented.
- a dynamic interaction graph is constructed based on the sequence of interaction events, and an autoencoder for evaluating interaction events is trained based on such a dynamic interaction graph.
- the autoencoder can characterize the nodes involved in the event as hidden vectors through the subgraphs and reverse subgraphs involved in the event, and make the hidden vectors fully fit the node characteristics. In this way, the trained autoencoder can be used to analyze and evaluate unknown interaction events.
- Fig. 1 shows a schematic diagram of an implementation scenario according to an embodiment;
- Fig. 2 shows a flowchart of a method for training an autoencoder according to an embodiment;
- Fig. 3 shows a sequence of interaction events according to an embodiment and a dynamic interaction graph constructed therefrom;
- Fig. 4 shows an example of a sample subgraph in one embodiment;
- Fig. 5 shows a working schematic diagram of the LSTM layer;
- Fig. 6 shows the structure of an LSTM layer according to an embodiment;
- Fig. 7 shows the structure of an LSTM layer according to another embodiment;
- Fig. 8 shows the structure of an LSTM layer according to yet another embodiment;
- Fig. 9 shows the steps of forming a reverse subgraph in one embodiment;
- Fig. 10 shows the first reverse subgraph in an example;
- Fig. 11 shows a merged subgraph corresponding to the sample subgraph of Fig. 4;
- Fig. 12 shows a reverse sample subgraph in an example;
- Fig. 13 shows a schematic diagram of processing reverse sample subgraphs by the autoencoder;
- Fig. 14 shows a schematic diagram of an autoencoder with multiple LSTM layers;
- Fig. 15 shows a flowchart of a method for evaluating interaction events using an autoencoder according to an embodiment;
- Fig. 16 shows a schematic block diagram of an apparatus for training an autoencoder according to an embodiment;
- Fig. 17 shows a schematic block diagram of an apparatus for evaluating interaction events according to an embodiment.
- As mentioned above, it is desirable to characterize and model interaction objects and interaction events based on the interaction events.
- In one approach, a static interaction relationship network graph is constructed based on historical interaction events, so that each interaction object and each interaction event can be analyzed based on that graph.
- Specifically, the participants of each historical event can be used as nodes, and connecting edges can be established between nodes that have an interaction relationship, thereby forming the interaction network graph.
- In one example, a bipartite graph can be formed from the interactions between users and products and used as the interaction relationship network graph.
- The bipartite graph contains user nodes and commodity nodes. If a user has purchased a certain commodity, a connecting edge is constructed between that user and that commodity.
- In another example, a user transfer relationship graph can be formed based on transfer records between users, where each node represents a user, and there is a connecting edge between any two users who have a transfer record.
- However, although the static network graphs in the above examples can show the interaction relationships between objects, they do not contain the timing information of these interaction events.
- If graph embedding is simply performed on such an interaction relationship network graph, the obtained feature vectors do not express the influence of the time information of the interaction events on the nodes.
- Moreover, such a static graph is not scalable enough, and it is difficult to handle newly added interaction events and newly added nodes flexibly.
- Taking the above into account, a dynamically changing sequence of interaction events is constructed into a dynamic interaction graph, in which each interaction object involved in each interaction event corresponds to a node in the graph.
- Such a dynamic interaction graph can reflect the timing information of the interaction events experienced by each interaction object.
- Further, an autoencoder is trained based on the dynamic interaction graph. The autoencoder encodes the nodes involved in an interaction event as characterization vectors, and through the characterization vector of each node, interaction events can be analyzed and evaluated.
- Fig. 1 shows a schematic diagram of an implementation scenario according to an embodiment.
- A dynamic interaction graph 100 is constructed based on the dynamically changing sequence of interaction events.
- In the dynamic interaction graph 100, the interaction objects $a_i$, $b_i$ of each interaction event are represented by nodes, and connecting edges with a parent-child relationship are established between the nodes of successive events that contain the same object.
- the structure of the dynamic interaction graph 100 will be described in more detail later.
- an autoencoder is an artificial neural network that can learn to efficiently represent input data through unsupervised learning. This efficient representation of the input data is also called encoding.
- the dimension of the encoding is generally much smaller than the dimension of the input data, so the autoencoder can be used for data dimensionality reduction.
- the autoencoder can also be used for feature extraction or feature encoding in the pre-training process of the deep neural network, and can also be used as a generative model to randomly generate data similar to the training data.
- An autoencoder based on a long short-term memory (LSTM) network is proposed. Specifically, for a sample interaction event, the two nodes corresponding to the interaction event in the dynamic interaction graph can be determined, and two subgraphs (subgraph 1 and subgraph 2) with these two nodes as root nodes can be obtained. In order to train the autoencoder, the parent-child relationships between the nodes in the two subgraphs are also reversed to form a reverse subgraph. Then, the two subgraphs and the reverse subgraph are successively input to the autoencoder.
- The autoencoder includes an LSTM layer, through which the nodes in the input subgraph are iteratively processed in sequence, finally yielding a characterization vector for each node in the reverse subgraph.
- The autoencoder is trained by fitting the results of the iterative processing to the original input features of each node, so that the autoencoder learns the characterization vector of each node.
- The autoencoder thus trained can be used for event analysis. Specifically, for a target interaction event to be evaluated, the two subgraphs and the reverse subgraph corresponding to the target interaction event can be obtained in the same way as in the training process. The trained autoencoder is then used to process the two subgraphs and the reverse subgraph in succession to obtain the characterization vector of each node in the reverse subgraph. The target interaction event is evaluated by comparing the characterization vectors of these nodes with their original input features. Such evaluation can be, for example, predicting whether the two objects involved in the target interaction event will interact (for example, whether a user will click on a page), or predicting the event category of the target interaction event (for example, whether it is an abnormal event), and so on.
- Fig. 2 shows a flowchart of a method of training an autoencoder according to an embodiment. It can be understood that the method can be executed by any apparatus, device, platform, or device cluster with computing and processing capabilities. The following describes each step of the training method shown in Fig. 2 in conjunction with specific embodiments.
- First, in step 21, a dynamic interaction graph reflecting the association relationship of interaction events is obtained.
- interaction events that occur sequentially can be organized into an interaction event sequence in chronological order as described above, and a dynamic interaction graph can be constructed based on such an interaction event sequence to reflect the association relationship of the interaction events.
- The interaction event sequence can be expressed, for example, as $\langle E_1, E_2, \ldots, E_N \rangle$.
- the interaction event can be a user's purchase behavior, and the two objects can be a certain user and a certain commodity.
- the interaction event may be a user's click behavior on a page block, and the two objects may be a certain user and a certain page block.
- the interaction event may be a transaction event, for example, a user transfers money to another user, and the two objects are two users at this time.
- interaction events can also be other interaction behaviors that occur between two objects.
- the event behavior feature f may include the background and context information of the occurrence of the interaction event, some attribute characteristics of the interaction behavior, and so on.
- In the case where the interaction event is a user click event, the event behavior feature f may include the type of terminal used by the user to click, the browser type, the app version, and so on; in the case where the interaction event is a transaction event, the event behavior feature f may include, for example, the transaction type (commodity purchase transaction, transfer transaction, etc.), the transaction amount, the transaction channel, and so on.
- A dynamic interaction graph can be constructed from the interaction event sequence. Specifically, a pair of nodes (two nodes) is used to represent the two objects involved in one interaction event, and each object in each interaction event in the sequence is represented by a node. In this way, one node corresponds to one object in one interaction event, but the same physical object may correspond to multiple nodes. For example, if user U1 purchased product M1 at time t1 and product M2 at time t2, there are two interaction feature groups (U1, M1, t1) and (U1, M2, t2), and two nodes U1(t1) and U1(t2) are created for user U1 according to these two interaction events. A node in the dynamic interaction graph can therefore be regarded as corresponding to the state of an interaction object in one interaction event.
- For each node in the dynamic interaction graph, the parent-child relationship and the corresponding connecting edges are constructed in the following way. For any node i, assume that it corresponds to interaction event i (with interaction time t). In the interaction event sequence, backtrack from interaction event i in the direction earlier than the interaction time t; the first interaction event j found (with interaction time t-, where t- is earlier than t) that also contains the object represented by node i is the last interaction event in which that object participated. The two nodes corresponding to this last interaction event are taken as the child nodes of node i, and correspondingly node i is the parent node of those two nodes. Connecting edges are established between parent and child nodes to show their parent-child dependency.
- Fig. 3 shows a sequence of interaction events according to an embodiment and a dynamic interaction graph constructed therefrom. Specifically, the left side of Fig. 3 shows a sequence of interaction events organized chronologically, exemplarily showing the interaction events $E_1, E_2, \ldots, E_6$ that occurred at times $t_1, t_2, \ldots, t_6$; each interaction event contains the two interaction objects involved and the interaction time (for clarity of illustration, the event behavior features are omitted). The right side of Fig. 3 shows the dynamic interaction graph constructed from the interaction event sequence on the left, in which the two interaction objects of each interaction event are each represented by a node. The following takes node $u(t_6)$ as an example to describe the construction of the parent-child relationship and connecting edges.
- The node $u(t_6)$ represents the interaction object u in the interaction event $E_6$. Backtracking from the interaction event $E_6$, the first interaction event found that also contains the interaction object u is $E_4$; that is, $E_4$ is the last interaction event in which the object u participated. Correspondingly, the two nodes $u(t_4)$ and $w(t_4)$, which correspond to the two interaction objects of $E_4$, are the two child nodes of node $u(t_6)$.
- Two connecting edges from the child nodes $u(t_4)$ and $w(t_4)$ to the parent node $u(t_6)$ are therefore established.
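The backtracking construction can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: the `Node` class and `build_dynamic_graph` function are hypothetical names, and the event list is made up except for the two facts taken from the text, namely that $E_4$ involves u and w and that $E_6$ involves u.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    obj: str     # interaction object, e.g. "u"
    time: float  # interaction time of the event this node belongs to
    children: List["Node"] = field(default_factory=list)

def build_dynamic_graph(events):
    """events: list of (a, b, t) tuples sorted by t.

    Returns one (node_a, node_b) pair per event, with child links set.
    """
    pairs = []
    for i, (a, b, t) in enumerate(events):
        pair = (Node(a, t), Node(b, t))
        for node in pair:
            # Backtrack towards earlier events until one contains the same object.
            for j in range(i - 1, -1, -1):
                if node.obj in (events[j][0], events[j][1]):
                    node.children = list(pairs[j])  # both nodes of that event
                    break
        pairs.append(pair)
    return pairs

# Hypothetical event sequence E_1..E_6.
events = [("u", "v", 1), ("x", "w", 2), ("y", "z", 3),
          ("u", "w", 4), ("x", "y", 5), ("u", "z", 6)]
pairs = build_dynamic_graph(events)
u6 = pairs[5][0]
print([(c.obj, c.time) for c in u6.children])  # [('u', 4), ('w', 4)]
```

The linear backtracking scan mirrors the description literally; for long sequences, keeping an index of the most recent event per object avoids the repeated scan.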
- the two objects involved in the interaction event can be divided into two types of objects: the first type of object and the second type of object.
- For example, in a page click event, the first type of object is the user object and the second type of object is the page block.
- The two types of objects can be distinguished by the relative positions of the nodes. For example, for the nodes of each interaction event, the first type of object is placed on the left side and the second type of object on the right side.
- The nodes are divided into left and right nodes accordingly. For example, in Fig. 3, the left node is the user node, and the right node is the item node.
- In other embodiments, the positions of the nodes need not be distinguished.
- The above describes the way and process of constructing a dynamic interaction graph based on a sequence of interaction events.
- The process of constructing the dynamic interaction graph can be performed in advance or on site.
- In one embodiment, in step 21, the dynamic interaction graph is constructed on site according to the sequence of interaction events,
- using the construction method described above.
- In another embodiment, the dynamic interaction graph is constructed in advance based on the sequence of interaction events, and in step 21 the already formed dynamic interaction graph is read or received.
- the dynamic interaction graph constructed in the above manner has strong scalability and can be dynamically updated according to newly added interaction events very easily.
- the two objects involved in the new interaction event can be used as two new nodes and added to the existing dynamic interaction graph. And, for each newly added node, determine whether it has child nodes. If there is a child node, add the connecting edge of the child node pointing to the newly added node, thus forming an updated dynamic interaction graph.
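A minimal sketch of such an incremental update follows. It assumes that events arrive in chronological order, that a node can be identified by a hypothetical `(object, time)` tuple, and that the graph keeps an index of the most recent node pair per object so that no backtracking scan is needed; the class name `DynamicGraph` and its attributes are illustrative only.

```python
from typing import Dict, List, Tuple

NodeKey = Tuple[str, float]  # hypothetical node identifier: (object, time)

class DynamicGraph:
    def __init__(self) -> None:
        # node -> its child nodes (edges point from child to parent)
        self.children: Dict[NodeKey, List[NodeKey]] = {}
        # object -> node pair of the last event the object took part in
        self.last_pair: Dict[str, Tuple[NodeKey, NodeKey]] = {}

    def add_event(self, a: str, b: str, t: float) -> Tuple[NodeKey, NodeKey]:
        """Add the two nodes of a new event and link them to their child nodes."""
        node_a, node_b = (a, t), (b, t)
        for node in (node_a, node_b):
            prev = self.last_pair.get(node[0])
            # A new node has children only if its object appeared before.
            self.children[node] = list(prev) if prev else []
        # Update the index only after both new nodes have been linked.
        self.last_pair[a] = (node_a, node_b)
        self.last_pair[b] = (node_a, node_b)
        return node_a, node_b

g = DynamicGraph()
g.add_event("u", "v", 1)
g.add_event("u", "w", 4)
g.add_event("u", "z", 6)
print(g.children[("u", 6)])  # [('u', 4), ('w', 4)]
```

Each update touches only the two new nodes and the index entries of the two objects, which is one way to read the statement that the graph can be updated easily as new events arrive.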
- Thus, in step 21, a dynamic interaction graph reflecting the association relationship of interaction events is obtained.
- Next, in step 22, the first sample node and the second sample node corresponding to the sample interaction event are respectively used as root nodes, and the corresponding first sample subgraph and second sample subgraph are determined in the dynamic interaction graph.
- the aforementioned sample interaction event is a selected sample event whose category is known to be a predetermined category.
- the predetermined category may be an event that is determined to have occurred, or an interaction event that is determined to be safe (for example, a non-hacking attack event, a non-fraud transaction event, etc.).
- the two nodes involved in the sample interaction event namely the first sample node and the second sample node, can be determined in the above-mentioned dynamic interaction graph.
- the first sample node and the second sample node are respectively used as the current root node.
- starting from the current root node, the graph is traversed along the parent-child relationships, that is, along the connecting edges.
- the nodes reached within a predetermined range form the corresponding subgraph, so that the first sample subgraph and the second sample subgraph are obtained respectively.
- the nodes within the foregoing predetermined range may be child nodes reachable through at most a preset number K of connection edges.
- the number K is a preset hyperparameter, which can be selected according to business conditions. It can be understood that the preset number K reflects the number of steps of historical interaction events backtracking from the root node, that is, the order of the child nodes. The larger the number K, the longer the historical interaction information is considered.
- the nodes within the foregoing predetermined range may also be child nodes whose interaction time is within a predetermined time range. For example, backtracking from the interaction time of the root node for a duration T (for example, one day), the range covers the child nodes that can be reached through connecting edges and whose interaction time falls within that duration.
- the aforementioned predetermined range considers both the number of connected edges and the time range.
- the nodes within the predetermined range refer to child nodes that are reachable at most through the preset number K of connection edges and whose interaction time is within the predetermined time range.
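The three range definitions above (at most K connecting edges, a time window T, or both) can be illustrated with a breadth-first traversal. This is a sketch under stated assumptions: the function name `extract_subgraph`, the dictionary layout and the toy graph `CHILDREN`/`TIMES` are hypothetical and only loosely modelled on the node names used in the figures.

```python
from collections import deque
from typing import Dict, List, Optional, Set

# Hypothetical toy graph: node id -> child node ids, and node id -> interaction time.
CHILDREN: Dict[str, List[str]] = {
    "u6": ["u4", "w4"],
    "u4": ["u2", "y2"],
    "w4": ["w3", "q3"],
}
TIMES: Dict[str, int] = {"u6": 6, "u4": 4, "w4": 4, "u2": 2, "y2": 2, "w3": 3, "q3": 3}


def extract_subgraph(
    root: str,
    children: Dict[str, List[str]],
    times: Dict[str, int],
    max_hops: Optional[int] = None,  # K: maximum number of connecting edges
    max_age: Optional[int] = None,   # T: maximum backtracking duration
) -> Set[str]:
    """Return the ids of the nodes within the predetermined range of `root`."""
    selected = {root}
    queue = deque([(root, 0)])
    while queue:
        node, hops = queue.popleft()
        if max_hops is not None and hops >= max_hops:
            continue  # do not expand beyond K edges
        for child in children.get(node, []):
            if child in selected:
                continue
            if max_age is not None and times[root] - times[child] > max_age:
                continue  # interaction time is outside the time window
            selected.add(child)
            queue.append((child, hops + 1))
    return selected
```

With this toy graph, `extract_subgraph("u6", CHILDREN, TIMES, max_hops=2)` selects all seven nodes, while `max_hops=1` keeps only the root and its two direct child nodes.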
- Figure 4 shows an example of a sample subgraph in one embodiment.
- u(t 6 ) is the first sample node
- this node u(t 6 ) is the root node to determine its corresponding first sample subgraph
- the nodes that can be reached via the two connecting edges are shown in the dotted area in the figure.
- the node and connection relationship in this area is the subgraph corresponding to the node u(t 6 ), that is, the first sample subgraph.
- the node v(t 6 ) can be used as the root node to traverse again to obtain the second sample subgraph.
- the first sample node is denoted as u(t)
- the second sample node is denoted as v(t) for description.
- in step 23, the autoencoder to be trained is obtained. The autoencoder includes an LSTM layer, which iteratively processes each node from the leaf nodes to the root node according to the parent-child relationship between the nodes in the input subgraph. The iterative processing includes determining the implicit vector of the currently processed node at least according to its node characteristics and the implicit vectors of its two child nodes.
- Figure 5 shows a schematic diagram of the operation of the LSTM layer. Assume that the two child nodes of node Q are node J 1 and node J 2 . As shown in Figure 5, at time T, the LSTM layer obtains the characterization vectors H1 and H2 of node J 1 and node J 2 respectively, where a characterization vector can include an implicit vector and an intermediate vector for auxiliary operations. At time T+, the LSTM layer obtains the characterization vector H Q of node Q according to the node characteristics of node Q and the previously obtained characterization vectors H1 and H2 of J 1 and J 2 .
- at a subsequent time, the characterization vector of node Q can be used, together with the characterization vector of the counterpart node of node Q (the other node in the same event), to obtain the characterization vector of the parent node of node Q, thereby realizing the iterative processing.
- Figure 6 shows the structure of the LSTM layer according to one embodiment.
- the currently processed node is denoted as z(t), where x z(t) represents the node characteristics of the node.
- in the case of a node representing a user, the node feature may include the user's attribute characteristics, such as age, occupation, education level, location, registration duration, crowd tag, etc.; in the case of a node representing an item, the node feature may include the item's attribute characteristics, such as item category, shelf time, sales volume, number of reviews, etc.
- the original node characteristics can be obtained accordingly.
- the feature group of the interaction event also includes the event behavior feature f
- the node feature may also include the event behavior feature f of the interaction event where the node is located.
- c j1 and h j1 represent the intermediate vector and the implicit vector of the first child node j 1 , respectively
- c j2 and h j2 represent the intermediate vector and the implicit vector of the second child node j 2 , respectively, where the intermediate vector is used for auxiliary operations and the implicit vector is used to represent the node.
- the LSTM layer performs the following operations on the input node features, intermediate vectors and hidden vectors.
- the first transformation function g and the second transformation function f are calculated using the following formula (1) and formula (2), respectively:
- other forms of transformation functions can also be used, for example by selecting different activation functions or by modifying the form and number of parameters in the above formulas.
- the intermediate vector c ji of the i-th of the two child nodes is combined with the corresponding i-th first transformation vector and i-th second transformation vector (written here as g i and f i ), which yields two combination results; the two results are summed to obtain the combined vector V.
- the above-mentioned combination operation can be a bitwise multiplication of the three vectors, namely g i ⊙ f i ⊙ c ji (where ⊙ denotes bitwise multiplication).
- the above-mentioned combination operation may also be another vector operation such as addition. Since i is 1 or 2, two combination results v1 and v2 are obtained, and v1 and v2 can be summed to obtain the combined vector V.
- when the combination operation is bitwise multiplication, the resulting combined vector V can be expressed as: $V = \sum_{i=1}^{2} g_i \odot f_i \odot c_{j_i}$
- the node feature x z(t) of the currently processed node z(t), together with the hidden vectors h j1 and h j2 of the two child nodes, is input into the third transformation function and the fourth transformation function respectively, to obtain the third transformation vector and the fourth transformation vector.
- the third transformation function p can work by first obtaining the vectors u z(t) and s z(t) and then multiplying u z(t) and s z(t) bitwise to obtain the third transformation vector p z(t) , namely: $p_{z(t)} = u_{z(t)} \odot s_{z(t)}$
- where ⊙ denotes bitwise multiplication.
- u z(t) and s z(t) can be calculated according to the following formula:
- W u , W s and Is the parameter matrix of the linear transformation
- b u and b s are the offset parameters.
- the fourth transformation function o can be obtained by obtaining the fourth transformation vector O z(t) through the following formula:
- W o and I are the parameter matrix of linear transformation
- b o is the offset parameter
- the intermediate vector c z(t) of the current processing node z(t) is determined.
- the combined vector V and the third transform vector p z(t) can be summed to obtain the intermediate vector c z(t) of z(t) .
- other combination methods, such as weighted summation or bitwise multiplication, can also be applied, with the combined result used as the intermediate vector c z(t) of z(t) .
- the implicit vector h z(t) of the node z(t) is determined.
- the tanh function can be applied to the intermediate vector c z(t) and the result combined with the fourth transformation vector O z(t) , for example by bitwise multiplication, to give the implicit vector h z(t) of node z(t), namely: $h_{z(t)} = O_{z(t)} \odot \tanh(c_{z(t)})$
- in this way, the LSTM layer determines the intermediate vector c z(t) and the implicit vector h z(t) of node z(t) according to the node characteristics of the currently processed node z(t) and the respective intermediate vectors and implicit vectors of its two child nodes j 1 and j 2 .
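The node update described above can be sketched in plain Python. The combination steps follow the text (V as the sum of bitwise products, p = u ⊙ s, c = V + p, h = o ⊙ tanh(c)). The formulas for the transformation functions themselves are not reproduced in this text, so the sigmoid and tanh activations over linear maps used here, the parameter layout, and the names `init_params`, `transform` and `cell_step` are all assumptions made for illustration.

```python
import math
import random
from typing import Dict, List, Tuple

Vector = List[float]
State = Tuple[Vector, Vector]  # (intermediate vector c, hidden vector h)


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def init_params(feat_dim: int, hidden_dim: int, seed: int = 0) -> Dict[str, dict]:
    """Random parameters; g and f see one child, while u, s and o see both."""
    rng = random.Random(seed)

    def mat(rows: int, cols: int) -> List[Vector]:
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    params = {}
    for name in ("g", "f", "u", "s", "o"):
        n_children = 1 if name in ("g", "f") else 2
        params[name] = {
            "W": mat(hidden_dim, feat_dim),
            "U": [mat(hidden_dim, hidden_dim) for _ in range(n_children)],
            "b": [0.0] * hidden_dim,
        }
    return params


def transform(p: dict, x: Vector, hs: List[Vector], act) -> Vector:
    """Assumed form: act(W x + sum_k U_k h_k + b), computed row by row."""
    out = []
    for r, bias in enumerate(p["b"]):
        total = bias + sum(w * xi for w, xi in zip(p["W"][r], x))
        for U, h in zip(p["U"], hs):
            total += sum(w * hi for w, hi in zip(U[r], h))
        out.append(act(total))
    return out


def cell_step(x: Vector, child1: State, child2: State, params: Dict[str, dict]) -> State:
    (c1, h1), (c2, h2) = child1, child2
    # Combined vector V: sum over both children of g_i * f_i * c_ji (bitwise).
    combined = [0.0] * len(c1)
    for c_j, h_j in ((c1, h1), (c2, h2)):
        g = transform(params["g"], x, [h_j], sigmoid)
        f = transform(params["f"], x, [h_j], sigmoid)
        combined = [v + gi * fi * ci for v, gi, fi, ci in zip(combined, g, f, c_j)]
    # Third transformation vector p = u * s, fourth transformation vector o.
    u = transform(params["u"], x, [h1, h2], math.tanh)
    s = transform(params["s"], x, [h1, h2], sigmoid)
    p_vec = [ui * si for ui, si in zip(u, s)]
    o = transform(params["o"], x, [h1, h2], sigmoid)
    c = [v + pi for v, pi in zip(combined, p_vec)]    # intermediate vector
    h = [oi * math.tanh(ci) for oi, ci in zip(o, c)]  # hidden vector
    return c, h
```

A call such as `cell_step(x, (c1, h1), (c2, h2), init_params(len(x), len(h1)))` returns the pair (c, h) for the current node, which can in turn serve as a child state when the parent node is processed.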
- the time difference Δ between the interaction time of the currently processed node z(t) and the interaction time of its child nodes is further introduced.
- processing a node by the LSTM layer then includes determining the hidden vector h z(t) of the currently processed node z(t) according to its node feature x z(t) , the implicit vectors of its two child nodes j 1 and j 2 , and the time difference Δ between the first interaction time (t) of the interaction event where node z(t) is located and the second interaction time (t-) of the interaction event where the two child nodes (j 1 and j 2 ) are located.
- the time difference Δ can be introduced as an additional factor to obtain the hidden vector and the intermediate vector of node z(t) in a similar way.
- the process of incorporating the time difference may include: combining the node feature x z(t) of the currently processed node z(t) and the time difference Δ with the implicit vectors h j1 and h j2 of the two child nodes respectively, and inputting the results into the first transformation function to obtain two first transformation vectors; combining the above-mentioned node feature with the implicit vectors of the two child nodes respectively, and inputting the results into the second transformation function to obtain two second transformation vectors; and combining the intermediate vector of the i-th of the two child nodes with the corresponding i-th first transformation vector and i-th second transformation vector to obtain two operation results, which are summed to obtain a combined vector;
- FIG. 7 shows the structure of an LSTM layer according to another embodiment. Comparing Fig. 7 with Fig. 6, the structure and the implemented algorithm of Fig. 7 are similar to those of Fig. 6, except that the time difference Δ is additionally introduced.
- the time difference Δ and the node feature x z(t) of node z(t) are combined with the hidden vectors of the child nodes, respectively, and input into the first transformation function g.
- the first transformation function g can be modified to:
- formula (9) further introduces a time term corresponding to the time difference Δ on the basis of formula (1); correspondingly, a parameter for the time term is added, which can be embodied as a vector.
- the above-mentioned time difference can be further input into the second transformation function. That is, the node characteristics and the time difference Δ of the currently processed node z(t) are respectively combined with the implicit vectors of the two child nodes and then input into the first transformation function g and the second transformation function f respectively, to obtain two first transformation vectors and two second transformation vectors.
- the subsequent processing is the same as described above.
- FIG. 8 shows the structure of an LSTM layer according to still another embodiment. It can be seen that the LSTM layer in Fig. 8 also introduces the time difference Δ and that, compared to Fig. 7, the time difference Δ in Fig. 8 is further input to the second transformation function f. More specifically, the first transformation function g in FIG. 8 can still take the form of formula (9), and the second transformation function f can take the following form:
- formula (10) further introduces a time term corresponding to the time difference Δ on the basis of formula (2); correspondingly, a parameter for the time term is added, which can be embodied as a vector.
- the above-mentioned time difference may be further input to the third transformation function p and/or the fourth transformation function o.
- part or all of the aforementioned formulas (5), (6), (7) can be modified, and the time term for the time difference can be similarly introduced on the original basis, which will not be detailed here.
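The idea of adding a time term to a transformation function can be sketched as follows. Formulas (9) and (10) are not reproduced in this text, so the additive form inside a sigmoid, the function name `time_aware_gate` and the parameter name `w_time` (the vector-valued parameter for the time term) are assumptions made for illustration.

```python
import math
from typing import List

Vector = List[float]


def time_aware_gate(
    x: Vector,            # node feature of the currently processed node
    h_child: Vector,      # hidden vector of one child node
    delta: float,         # time difference between the two interaction times
    W: List[Vector],      # feature weights, shape (hidden_dim, feat_dim)
    U: List[Vector],      # hidden-vector weights, shape (hidden_dim, hidden_dim)
    w_time: Vector,       # parameter vector for the time term
    b: Vector,            # offset parameters
) -> Vector:
    """Assumed form: sigmoid(W x + U h_child + w_time * delta + b)."""
    out = []
    for r in range(len(b)):
        total = b[r] + w_time[r] * delta
        total += sum(w * xi for w, xi in zip(W[r], x))
        total += sum(u * hi for u, hi in zip(U[r], h_child))
        out.append(1.0 / (1.0 + math.exp(-total)))
    return out
```

With a negative entry in `w_time`, the corresponding gate value shrinks as the time difference grows, so older child information contributes less; whether the trained parameters behave this way depends on the data.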
- the autoencoder can iteratively process the nodes in the input subgraph in turn to obtain the intermediate vector and implicit vector of each node.
- in step 24, the first sample subgraph and the second sample subgraph obtained in step 22 are respectively input into the aforementioned autoencoder, and the implicit vectors corresponding to the first sample node and the second sample node are obtained, which are called the first sample vector and the second sample vector, respectively.
- the LSTM layer takes the node u(t 2 ) as the currently processed node and, in any of the ways shown in Figure 6 to Figure 8, determines the intermediate vector c(u(t 2 )) and the hidden vector h(u(t 2 )) of the node u(t 2 ) based on the node characteristics of u(t 2 ) and on two intermediate vectors c and two hidden vectors h generated by default.
- the same is done for the lowest node y(t 2 ) to obtain the corresponding intermediate vector c(y(t 2 )) and hidden vector h(y(t 2 )).
- the parent node of the nodes u(t 2 ) and y(t 2 ) is u(t 4 ).
- the LSTM layer then takes u(t 4 ) as the currently processed node and determines its intermediate vector and hidden vector according to the node characteristics of u(t 4 ) itself and the respective intermediate vectors and hidden vectors of its two child nodes u(t 2 ) and y(t 2 ).
- the intermediate vector and the hidden vector of the first sample node u(t 6) can be obtained.
- the autoencoder performs similar processing to obtain the intermediate vector and the hidden vector of the second sample node v(t 6 ).
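The leaf-to-root processing order can be illustrated with a short recursive traversal. This is a sketch under stated assumptions: `encode_subgraph` is a hypothetical helper, and `toy_cell` is a deliberately simple stand-in for the LSTM node update so that the traversal can be followed by hand; it is not the update described in this document.

```python
import math
from typing import Callable, Dict, List, Tuple

Vector = List[float]
State = Tuple[Vector, Vector]  # (intermediate vector c, hidden vector h)
Cell = Callable[[Vector, State, State], State]


def encode_subgraph(
    root: str,
    children: Dict[str, List[str]],   # subgraph edges: node id -> child node ids
    features: Dict[str, Vector],      # node id -> node feature
    cell: Cell,
    default_state: State,
) -> Dict[str, State]:
    """Process every node after its children and return all node states."""
    states: Dict[str, State] = {}

    def visit(node: str) -> State:
        if node not in states:
            child_states = [visit(k) for k in children.get(node, [])]
            # The lowest nodes of the subgraph fall back to default child states.
            child_states += [default_state] * (2 - len(child_states))
            states[node] = cell(features[node], child_states[0], child_states[1])
        return states[node]

    visit(root)
    return states


def toy_cell(x: Vector, left: State, right: State) -> State:
    """Stand-in update: c = x + mean of the child c vectors, h = tanh(c)."""
    c = [xi + 0.5 * (a + b) for xi, a, b in zip(x, left[0], right[0])]
    return c, [math.tanh(v) for v in c]
```

The hidden vector stored for the root node after the traversal plays the role of the first (or second) sample vector.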
- in step 25, a reverse sample subgraph is formed by reversing the parent-child relationship between the nodes in the first sample subgraph and the second sample subgraph and merging the reversed subgraphs.
- the reverse sample subgraph includes the sample node set formed by the union of the nodes in the first sample subgraph and the second sample subgraph, where the first sample node and the second sample node are the root nodes of the first sample subgraph and the second sample subgraph, respectively, and are the bottom leaf nodes in the reverse sample subgraph.
- FIG. 9 shows the steps of forming the reverse subgraph in one embodiment, that is, the sub-steps of step 25 described above.
- in step 91, the parent-child relationship between nodes in the first sample subgraph is reversed to form a first reverse subgraph with the first sample node as the leaf node.
- the first sample node u(t 6 ) is the root node, and each connecting edge points from the child node to the parent node.
- the first reverse subgraph shown in FIG. 10 can be formed. It can be seen that the first reverse subgraph can be regarded as a mirror image of the first sample subgraph.
- the first sample node becomes a leaf node, which is denoted as u ⁇ (t 6 ) in the reverse subgraph.
- in step 92, the parent-child relationship between the nodes in the second sample subgraph is reversed to form a second reverse subgraph with the second sample node as the leaf node. It can be understood that steps 91 and 92 can be performed in parallel or in any order.
- in step 93, the common nodes in the first reverse subgraph and the second reverse subgraph are merged to form a merged subgraph.
- the first sample node is u(t 6 )
- the second sample node is v(t 6 )
- q(t 3 ) and w(t 3 ) are common nodes.
- the first reverse subgraph and the second reverse subgraph still contain these common nodes. Therefore, the common nodes in the two reverse subgraphs can be merged.
- FIG. 11 shows a merged subgraph corresponding to the sample subgraph of FIG.
- in step 94, in the above-mentioned merged subgraph, a default child node is added to each node that has only one child node, thereby forming the reverse sample subgraph.
- in the original sample subgraphs, each parent node has two child nodes. After the parent-child relationship in the sample subgraphs is reversed, two parent nodes may share one child node, and some nodes have only one child node.
- a default child node is added to the node that has only one child node in the merged subgraph.
- the node u ⁇ (t 4 ) has only one child node, and a default child node A can be added to it so that it has two child nodes.
- the node w ⁇ (t 4 ) also has only one child node, and a default child node B can be added to it so that it has two child nodes.
- the node w ⁇ (t 3 ) already has two child nodes, so there is no need to add a default child node to it.
- the reverse sample subgraph shown in FIG. 12 can be obtained, in which for the simplicity of the illustration, the default child nodes added for some nodes are omitted.
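Steps 91 to 94 can be sketched as a single pass over the two sample subgraphs. The function name `build_reverse_subgraph`, the dictionary representation and the node ids in the usage note are hypothetical; keying nodes by a shared id is what merges the common nodes, and a single placeholder id stands in for the default child nodes A and B.

```python
from typing import Dict, List


def build_reverse_subgraph(
    sub1: Dict[str, List[str]],   # first sample subgraph: node id -> child ids
    sub2: Dict[str, List[str]],   # second sample subgraph: node id -> child ids
    default_child: str = "DEFAULT",
) -> Dict[str, List[str]]:
    """Reverse both subgraphs, merge common nodes, pad single-child nodes."""
    reverse: Dict[str, List[str]] = {}
    for sub in (sub1, sub2):
        for parent, kids in sub.items():
            reverse.setdefault(parent, [])
            for kid in kids:
                # A former child becomes the parent of its former parent.
                reverse.setdefault(kid, [])
                if parent not in reverse[kid]:
                    reverse[kid].append(parent)
    for kids in reverse.values():
        if len(kids) == 1:
            kids.append(default_child)  # step 94: add a default child node
    return reverse
```

For example, with `sub1 = {"u6": ["u4", "w4"], "w4": ["w3", "q3"]}` and `sub2 = {"v6": ["v5", "q5"], "q5": ["w3", "q3"]}`, the former root nodes `u6` and `v6` end up with no child nodes (they are the leaf nodes), `u4` and `w4` each receive a default child node, and the common node `w3` keeps its two child nodes `w4` and `q5`.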
- the objects involved in the interaction event are divided into a first type of object and a second type of object, and correspondingly, the node is divided into a left node and a right node.
- in step 26 of FIG. 2, the reverse sample subgraph is input to the aforementioned autoencoder, and the implicit vector of each node in the sample node set is obtained through the iterative processing of the LSTM layer in the autoencoder.
- when the autoencoder performs the above-mentioned iterative processing, it starts from the leaf nodes of the input subgraph.
- when the input subgraph is a reverse sample subgraph, its leaf nodes are exactly the first sample node and the second sample node that were originally the root nodes; that is, the first leaf node is the first sample node and the second leaf node is the second sample node.
- the implicit vectors of these two leaf nodes can be determined from the first sample vector and the second sample vector, which the autoencoder obtained for the first sample node and the second sample node by processing the first sample subgraph and the second sample subgraph in step 24. The processing of the reverse sample subgraph by the autoencoder is therefore a continuation of the processing results of the first sample subgraph and the second sample subgraph.
- FIG. 13 shows a schematic diagram of the autoencoder processing the reverse sample subgraph; the diagram is based on the example sample subgraph in FIG. 4 and the example reverse sample subgraph in FIG. 12.
- the default child nodes added in the reverse sample subgraph are omitted. It can be seen that by processing the first sample subgraph and the second sample subgraph, the autoencoder obtains the first sample vector corresponding to the first sample node (denoted as h(u) for simplicity in the figure) and the second sample vector h(v) corresponding to the second sample node.
- the first sample vector and the second sample vector are used as intermediate results to determine the implicit vectors of the leaf nodes of the reverse sample subgraph, thereby connecting the processing of the forward sample subgraphs with the processing of the reverse sample subgraph. The implicit vectors of the leaf nodes are then used as the starting vectors for the autoencoder to process the reverse sample subgraph.
- the first sample vector is directly used as the implicit vector of the first leaf node
- the second sample vector is used as the implicit vector of the second leaf node.
- the implicit vector of the first leaf node in the reverse sample subgraph is determined based on the node characteristics of the first leaf node, with the above-mentioned first sample vector and second sample vector used as the hidden vectors of its two child nodes; likewise, the hidden vector of the second leaf node is determined based on the node characteristics of the second leaf node, with the first sample vector and the second sample vector used as the hidden vectors of its two child nodes.
- this is equivalent to treating the root nodes u(t 6 ) and v(t 6 ) of the forward first/second sample subgraphs as the two child nodes of each leaf node in the reverse sample subgraph
- the hidden vector of the leaf node is then determined according to its own node characteristics and the hidden vectors of these two child nodes, that is, the first sample vector and the second sample vector.
- the implicit vector of the leaf node in the reverse sample subgraph is obtained.
- the LSTM layer in the autoencoder can use the aforementioned iterative processing to continue processing other nodes in the reverse sample subgraph, so as to obtain the implicit vector of each node.
- the implicit vector of the node u ⁇ (t 4 ) is determined based on its own node characteristics and the implicit vectors of its two child nodes, the leaf node u ⁇ (t 6 ) and the default child node A;
- the implicit vector of the node w ⁇ (t 4 ) is determined based on its own node characteristics and the implicit vectors of its two child nodes, the leaf node u ⁇ (t 6 ) and the default child node B;
- the implicit vector of node w ⁇ (t 3 ) is determined based on its own node characteristics and the implicit vectors of its two child nodes w ⁇ (t 4 ) and q ⁇ (t 5 ).
- the implicit vector of the default child node can be preset as a default value. In one embodiment, different default implicit vectors are preset for the left default child node and the right default child node.
- the autoencoder can obtain the implicit vector of each node in the sample node set contained in the reverse sample subgraph.
- the autoencoder includes multiple LSTM layers.
- the implicit vector of a node determined by the previous LSTM layer is input to the next LSTM layer as the node feature of that node.
- Fig. 14 shows a schematic diagram of an autoencoder with multiple LSTM layers.
- each LSTM layer still processes each node iteratively, determining the hidden vector and the intermediate vector of node i according to the node characteristics of the currently processed node i and the respective intermediate vectors and hidden vectors of its two child nodes. The bottom LSTM layer uses the original input feature of node i as the node feature, while each subsequent LSTM layer uses the implicit vector of node i determined by the previous LSTM layer as the node feature.
- the hidden vector of each node obtained when the last LSTM layer processes the reverse sample subgraph is used as the final hidden vector of each node in the sample node set.
- in step 27, the prediction loss is determined based on a synthesis of the distances between the hidden vector and the node feature of each node in the sample node set, and the autoencoder is updated in the direction in which the prediction loss decreases.
- for each node i, the distance ‖h(i)-x(i)‖ between its hidden vector h(i) and its node feature x(i) can be determined; this is also called the predicted distance.
- when the autoencoder includes multiple LSTM layers, the above-mentioned node feature x(i) is the node feature of node i input to the first LSTM layer, and the hidden vector h(i) is the hidden vector of the node obtained when the last LSTM layer processes the reverse sample subgraph.
- the training goal of the autoencoder is set to make the hidden vector of each node fit its node characteristics.
- the prediction loss L can be expressed as: $L = \sum_{i \in S} \lVert h(i) - x(i) \rVert$, where S is the sample node set.
- the predicted distance ‖h(i)-x(i)‖ can be determined as a cosine distance, a Euclidean distance, etc.
- the prediction distance corresponding to each node is summed to obtain the prediction loss.
- the sum of the squares of the predicted distances of each node can also be used as the prediction loss.
- the number of nodes in the sample node set S can also be determined, so that the average value of the predicted distance of each node or the average value of the square of the predicted distance is used as the prediction loss L.
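The loss variants listed above (sum or average, plain or squared distance, Euclidean or cosine) can be sketched in one small function. The names `prediction_loss`, `euclidean` and `cosine_distance` are hypothetical, and the convention of returning 1.0 for a zero-length vector in the cosine distance is an assumption made only to keep the sketch total.

```python
import math
from typing import Callable, Dict, List

Vector = List[float]


def euclidean(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: Vector, b: Vector) -> float:
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 1.0  # assumed convention for zero-length vectors
    return 1.0 - sum(x * y for x, y in zip(a, b)) / norm


def prediction_loss(
    hidden: Dict[str, Vector],      # node id -> hidden vector h(i)
    features: Dict[str, Vector],    # node id -> node feature x(i)
    distance: Callable[[Vector, Vector], float] = euclidean,
    squared: bool = False,          # use squared predicted distances
    average: bool = False,          # divide by the number of nodes in S
) -> float:
    dists = [distance(hidden[n], features[n]) for n in hidden]
    if squared:
        dists = [d * d for d in dists]
    total = sum(dists)
    return total / len(dists) if average else total
```

Note that comparing h(i) with x(i) directly requires the hidden vector and the node feature to have the same dimension.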
- the model parameters in the autoencoder can be adjusted in the direction that makes L decrease, specifically including the transformation matrix parameters and offset parameters in the aforementioned formulas (1)-(10) of the LSTM layer, so as to update the autoencoder.
- methods such as gradient descent and back propagation can be used to tune the parameters of the autoencoder.
- the autoencoder can be continuously updated and optimized, and finally an autoencoder dedicated to evaluating interaction events based on the dynamic interaction graph is obtained.
- the hidden vector of each node involved can be obtained, and the prediction loss based on the distance between the hidden vector and the feature of the node can be minimized, for example, lower than a certain threshold.
- Fig. 15 shows a flow chart of a method for evaluating an interaction event using an autoencoder according to an embodiment. It can be understood that the method can be executed by any apparatus, device, platform, or device cluster with computing and processing capabilities. As shown in Fig. 15, the method of evaluating interaction events may include the following steps.
- in step 151, a dynamic interaction graph reflecting the association relationship of interaction events is obtained.
- the construction method and structural characteristics of the dynamic interaction graph are as described in step 21, and will not be repeated.
- in step 152, the first target node and the second target node corresponding to the target event to be analyzed are respectively taken as root nodes, and the first target subgraph and the second target subgraph, each formed by the nodes within a predetermined range reached from the root node through connecting edges, are determined in the dynamic interaction graph.
- the specific execution method of this step is similar to the aforementioned step 22, except that the target event in this step is an event of unknown category and to be analyzed.
- the two nodes involved in this event are called the first target node and the second target node.
- the process of determining the first target subgraph and the second target subgraph in the dynamic interaction graph based on the first target node and the second target node corresponds to the foregoing step 22, and will not be repeated.
- in step 153, an autoencoder trained according to the method of FIG. 2 is obtained.
- in step 154, the first target subgraph and the second target subgraph are respectively input to the autoencoder, and the implicit vectors corresponding to the first target node and the second target node are obtained, which are called the first target vector and the second target vector, respectively.
- for the process by which the autoencoder iteratively processes the first target subgraph and the second target subgraph input into it, refer to the description of step 24.
- in step 155, by reversing the parent-child relationship between the nodes in the first target subgraph and the second target subgraph, and merging the reversed subgraphs, a reverse target subgraph is formed; the reverse target subgraph includes a target node set formed by the union of the nodes in the first target subgraph and the second target subgraph.
- the execution method of forming the reverse target subgraph in this step is similar to the foregoing step 25, and will not be repeated.
- in step 156, the reverse target subgraph is input to the trained autoencoder to obtain the hidden vector of each node in the target node set, where the hidden vector of the leaf node of the reverse target subgraph is determined based on the first target vector and the second target vector.
- this step corresponds to the aforementioned step 26, and will not be repeated here.
- in step 157, the comprehensive result of the distances between the hidden vector and the node feature of each node in the target node set is determined.
- the method for determining the comprehensive result corresponds to the method for determining the prediction loss L when the autoencoder is trained.
- in step 158, according to the comparison of the comprehensive result with a predetermined threshold, it is evaluated whether the above-mentioned target event is an event of the predetermined category.
- for an event of the predetermined category, the hidden vectors obtained by the trained autoencoder fit the node features well, so that the comprehensive result of the distances between the hidden vectors and the node features is extremely small. Therefore, a threshold can be set, and based on the comparison of the comprehensive result corresponding to the currently analyzed target event with the threshold, it is determined whether the target event is an event of the aforementioned predetermined category. If the comprehensive result is less than the threshold, it can be determined that the currently analyzed target event is an event of the predetermined category; if the threshold is reached or exceeded, it is determined that the target event is not an event of the predetermined category.
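The threshold decision of steps 157 and 158 can be sketched as follows. The function names are hypothetical, and the mean Euclidean distance is used here only as one possible comprehensive result; in practice it should mirror whichever form of the prediction loss was used during training.

```python
import math
from typing import Dict, List

Vector = List[float]


def comprehensive_result(hidden: Dict[str, Vector], features: Dict[str, Vector]) -> float:
    """Mean Euclidean distance between hidden vectors and node features."""
    dists = [math.dist(hidden[n], features[n]) for n in hidden]
    return sum(dists) / len(dists)


def is_predetermined_category(
    hidden: Dict[str, Vector],
    features: Dict[str, Vector],
    threshold: float,
) -> bool:
    """True if the result is below the threshold; reaching it counts as False."""
    return comprehensive_result(hidden, features) < threshold
```

The threshold itself is a tuning parameter; one plausible way to choose it is to look at the comprehensive results of held-out events that are known to belong to the predetermined category.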
- the target event may be a hypothetical event assumed to occur, and correspondingly, an event of the predetermined category is an event in which the interaction is determined to occur. In this way, judging whether the target event is an event of the predetermined category through the autoencoder can be used to evaluate whether two nodes in the dynamic interaction graph will interact next, for example, whether a user will click on a certain page or page block.
- the target event is an event in which an interaction has occurred
- the event of the predetermined category is an event with a certain characteristic, for example, an event confirmed to be safe.
- judging whether the target event is an event of the predetermined category through the autoencoder can be used to evaluate whether an interaction event that has occurred is a safe event or carries a higher security risk. For example, when a user sends a payment request to transfer money to another user, the two interact.
- the above-mentioned autoencoder can be used to determine whether the interaction event is a normal transaction or a fraudulent transaction with security risks, such as a transaction involving a stolen account or a cash-out transaction.
- an interaction event occurs between the user and the website.
- the above-mentioned autoencoder can be used to determine whether the event is a normal login event or an abnormal event, such as a hacker's attack event, an attempt to log in with a stolen account, and so on.
- the interaction events can be analyzed and evaluated more accurately and effectively.
- a dynamic interaction graph is constructed based on a sequence of interaction events, and an autoencoder for evaluating interaction events is trained based on such a dynamic interaction graph.
- the autoencoder can characterize the nodes involved in the event as hidden vectors through the subgraphs and reverse subgraphs involved in the event, and make the hidden vectors fully fit the node characteristics. In this way, the trained autoencoder can be used to analyze and evaluate unknown interaction events.
- an apparatus for training an autoencoder for evaluating interaction events is provided.
- the apparatus can be deployed in any device, platform, or device cluster with computing and processing capabilities.
- Fig. 16 shows a schematic block diagram of an apparatus for training an autoencoder according to an embodiment.
- the training device 160 includes the following units.
- the interaction graph acquiring unit 161 is configured to acquire a dynamic interaction graph reflecting the association relationship of interaction events. The graph includes multiple pairs of nodes; each pair of nodes represents the two objects in an interaction event, and any node i is connected to two child nodes through connecting edges, the two child nodes being the two nodes corresponding to the last interaction event in which the object represented by node i participated.
- the sample subgraph acquisition unit 162 is configured to take the first sample node and the second sample node corresponding to a sample interaction event of a predetermined category as root nodes, respectively, and to determine in the dynamic interaction graph the first sample subgraph and the second sample subgraph formed by the nodes within a predetermined range reached from the root node through connecting edges.
- the encoder acquisition unit 163 is configured to acquire an autoencoder to be trained. The autoencoder includes an LSTM layer, which iteratively processes each node from the leaf nodes to the root node according to the parent-child relationship between the nodes in the input subgraph. The iterative processing includes determining the implicit vector of the currently processed node at least according to its node characteristics and the implicit vectors of its two child nodes.
- the sample subgraph processing unit 164 is configured to respectively input the first sample subgraph and the second sample subgraph to the autoencoder to obtain the first sample vector corresponding to the first sample node and the second sample vector corresponding to the second sample node.
- the reverse subgraph forming unit 165 is configured to reverse the parent-child relationship between the nodes in the first sample subgraph and the second sample subgraph, and merge the reversed subgraphs to form a reverse sample subgraph;
- the reverse sample subgraph includes a sample node set formed by a union of nodes in the first sample subgraph and the second sample subgraph.
- the reverse subgraph processing unit 166 is configured to input the reverse sample subgraph into the autoencoder to obtain the implicit vector of each node in the sample node set, wherein the implicit vectors of the leaf nodes of the reverse sample subgraph are determined according to the first sample vector and the second sample vector.
- the update unit 167 is configured to determine the prediction loss according to an aggregate of the distances between the implicit vector of each node in the sample node set and its node features, and update the autoencoder in the direction of reducing the prediction loss.
- the nodes in the predetermined range selected by the sample subgraph acquisition unit 162 include: child nodes of up to order K, reached within a preset number K of connecting edges; and/or child nodes whose interaction time is within a predetermined time range.
- the node characteristics of the current processing node based on the iterative processing of the LSTM layer may include the attribute characteristics of the object corresponding to the node.
- when the current processing node is a user node, the attribute characteristics may include at least one of the following: age, occupation, education level, location, registration duration, and crowd tag; when the current processing node is an item node, the attribute characteristics may include at least one of the following: item category, shelf time, number of comments, and sales volume.
- the node characteristic of the current processing node may also include the event behavior characteristic of the interaction event corresponding to the node.
- the LSTM layer in the autoencoder is configured to: combine the node features of the current processing node with the implicit vector of each of the two child nodes, and input the combinations into the first transformation function and the second transformation function respectively, to obtain two first transformation vectors and two second transformation vectors; combine the intermediate vector, used for auxiliary operations, of the i-th of the two child nodes with the corresponding i-th first transformation vector and i-th second transformation vector to obtain two operation results, and sum the two operation results to obtain a combined vector; input the node features of the current processing node, together with the implicit vectors of the two child nodes, into the third transformation function and the fourth transformation function respectively, to obtain the third transformation vector and the fourth transformation vector; determine the intermediate vector of the current processing node based on the combined vector and the third transformation vector; and determine the implicit vector of the current processing node based on the intermediate vector of the current processing node and the fourth transformation vector.
- the LSTM layer is configured to determine the implicit vector of the current processing node according to the node features of the current processing node, the implicit vectors of the two child nodes, and the time difference between the first interaction time of the interaction event where the current processing node is located and the second interaction time of the interaction event where the two child nodes are located.
- the LSTM layer is specifically configured to: combine the node features of the current processing node and the time difference with the implicit vector of each of the two child nodes, and input the combinations into the first transformation function to obtain two first transformation vectors; combine the node features with the implicit vector of each of the two child nodes, and input the combinations into the second transformation function to obtain two second transformation vectors; combine the intermediate vector, used for auxiliary operations, of the i-th of the two child nodes with the corresponding i-th first transformation vector and i-th second transformation vector to obtain two operation results, and sum the two operation results to obtain a combined vector; input the node features of the current processing node, together with the implicit vectors of the two child nodes, into the third transformation function and the fourth transformation function respectively, to obtain the third transformation vector and the fourth transformation vector; determine the intermediate vector of the current processing node based on the combined vector and the third transformation vector; and determine the implicit vector of the current processing node based on the intermediate vector of the current processing node and the fourth transformation vector.
- the reverse subgraph forming unit 165 further includes (not shown): a first reverse module configured to reverse the parent-child relationship between nodes in the first sample subgraph to form a first reverse subgraph in which the first sample node is a leaf node; a second reverse module configured to reverse the parent-child relationship between nodes in the second sample subgraph to form a second reverse subgraph in which the second sample node is a leaf node; a merging module configured to merge the common nodes of the first reverse subgraph and the second reverse subgraph to form a merged subgraph; and an adding module configured to add, in the merged subgraph, a default child node to each node that has only one child node, thereby forming the reverse sample subgraph.
- the leaf nodes of the reverse sample subgraph include a first leaf node corresponding to the first sample node and a second leaf node corresponding to the second sample node.
- according to one embodiment, the implicit vector of the first leaf node is the first sample vector, and the implicit vector of the second leaf node is the second sample vector.
- according to another embodiment, the reverse subgraph processing unit 166 is configured to determine the implicit vector of the first leaf node based on the node features of the first leaf node, with the first sample vector and the second sample vector used as the implicit vectors of its two child nodes, and to determine the implicit vector of the second leaf node based on the node features of the second leaf node, with the first sample vector and the second sample vector used as the implicit vectors of its two child nodes.
- the autoencoder includes multiple LSTM layers, wherein the implicit vector of the current processing node determined by the previous LSTM layer is input to the next LSTM layer as the node features of the current processing node.
- the update unit 167 may be configured to: for each node in the sample node set, determine the distance between the implicit vector of the node output by the last of the multiple LSTM layers and the node features of the node input to the first LSTM layer;
- and determine the prediction loss according to an aggregate of the distances corresponding to the nodes.
- an apparatus for evaluating interaction events using an autoencoder is provided.
- the apparatus can be deployed in any device, platform, or device cluster with computing and processing capabilities.
- Fig. 17 shows a schematic block diagram of an apparatus for evaluating interaction events according to an embodiment.
- the evaluation device 170 includes the following units.
- the interaction graph acquiring unit 171 is configured to acquire a dynamic interaction graph reflecting the association relationship of interaction events, which includes multiple pairs of nodes, each pair of nodes representing the two objects in an interaction event; any node i is connected to two child nodes through connecting edges, and the two child nodes are the two nodes corresponding to the last interaction event in which the object represented by the node i participated.
- the target subgraph acquiring unit 172 is configured to take the first target node and the second target node corresponding to the target event to be analyzed as root nodes, and determine in the dynamic interaction graph a first target subgraph and a second target subgraph formed by the nodes within a predetermined range that are reached from the root node via connecting edges.
- the encoder acquisition unit 173 is configured to acquire the autoencoder obtained by training with the aforementioned training device.
- the target subgraph processing unit 174 is configured to input the first target subgraph and the second target subgraph into the autoencoder respectively, to obtain a first target vector corresponding to the first target node and a second target vector corresponding to the second target node.
- the reverse subgraph forming unit 175 is configured to reverse the parent-child relationship between the nodes in the first target subgraph and the second target subgraph, and merge the reversed subgraphs to form a reverse target subgraph;
- the reverse target subgraph includes a target node set formed by a union of nodes in the first target subgraph and the second target subgraph.
- the reverse subgraph processing unit 176 is configured to input the reverse target subgraph into the autoencoder to obtain the implicit vector of each node in the target node set, wherein the implicit vectors of the leaf nodes of the reverse target subgraph are determined according to the first target vector and the second target vector.
- the synthesis unit 177 is configured to determine an aggregate result of the distances between the implicit vector of each node in the target node set and its node features.
- the evaluation unit 178 is configured to evaluate whether the target event is an event of a predetermined category according to the comparison between the comprehensive result and a predetermined threshold.
- according to one embodiment, the target event is a hypothetical event, and the event of the predetermined category is an event confirmed to have occurred.
- according to another embodiment, the target event is an event that has occurred, and the event of the predetermined category is an event confirmed to be safe.
- through the above device 160, the autoencoder is trained based on the dynamic interaction graph; through the above device 170, the trained autoencoder can be used to evaluate and analyze interaction events.
- a computer-readable storage medium is also provided, having a computer program stored thereon; when the computer program is executed in a computer, the computer is caused to execute the methods described in conjunction with FIG. 2 and FIG. 15.
- a computing device is also provided, including a memory and a processor; the memory stores executable code, and when the processor executes the executable code, the methods described in conjunction with FIG. 2 and FIG. 15 are implemented.
- the functions described in the present invention can be implemented by hardware, software, firmware, or any combination thereof.
- these functions can be stored in a computer-readable medium or transmitted as one or more instructions or codes on the computer-readable medium.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
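A method and apparatus for training and using an autoencoder for evaluating interaction events. In the method, a dynamic interaction graph is first constructed based on a sequence of interaction events, and an autoencoder is proposed based on the characteristics of the dynamic interaction graph. To train the autoencoder, for a sample interaction event, the two nodes corresponding to that event in the dynamic interaction graph are determined, and two subgraphs rooted at those two nodes are obtained. In addition, the parent-child relationships between the nodes in the two subgraphs are reversed to form a reverse subgraph. The two subgraphs and the reverse subgraph are then input into the autoencoder in succession. The autoencoder includes an LSTM layer, which iteratively processes the nodes of an input subgraph in turn and finally yields a representation vector for each node in the reverse subgraph. The autoencoder is trained by making the representation vector obtained for each node through the iterative processing fit that node's original input features.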
Description
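One or more embodiments of this specification relate to the field of machine learning, and in particular to a method and apparatus for training an autoencoder for evaluating interaction events.
In many scenarios, user interaction events need to be analyzed and processed. Interaction events are one of the basic building blocks of Internet events. For example, a user's click while browsing a page can be regarded as an interaction event between the user and a page content block, a purchase in e-commerce can be regarded as an interaction event between a user and a product, and a transfer between accounts is an interaction event between two users. A user's series of interaction events reflects the user's fine-grained habits and preferences as well as the characteristics of the interaction objects, and is an important source of features for machine learning models. Therefore, in many scenarios it is desirable to represent and model the interaction participants based on interaction events, and then to analyze the interaction events, in particular their security, so as to safeguard the security of the interaction platform.
However, an interaction event involves both interacting parties, and the state of each participant may itself change dynamically, so it is very difficult to represent the participants accurately while taking their many characteristics into account. An improved solution is therefore desired for analyzing and processing interaction events more effectively.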
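Summary of the Invention
One or more embodiments of this specification describe a method and apparatus for training an autoencoder for evaluating interaction events, in which an autoencoder for a specific type of interaction event is trained based on a dynamic interaction graph, so that the trained autoencoder can be used to evaluate and analyze interaction events.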
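According to a first aspect, a method for training an autoencoder for evaluating interaction events is provided, the method comprising: acquiring a dynamic interaction graph reflecting the association relationship of interaction events, the graph including multiple pairs of nodes, each pair of nodes representing the two objects in an interaction event, and any node i being connected through connecting edges to two child nodes, which are the two nodes corresponding to the last interaction event in which the object represented by node i participated; taking the first sample node and the second sample node corresponding to a sample interaction event of a predetermined category as root nodes respectively, and determining in the dynamic interaction graph a first sample subgraph and a second sample subgraph formed by the nodes within a predetermined range that are reached from the root node via connecting edges; acquiring an autoencoder to be trained, the autoencoder including an LSTM layer that, according to the parent-child relationship between nodes in the input subgraph, iteratively processes each node in turn from the leaf nodes to the root node, the iterative processing including determining the implicit vector of the current processing node at least according to the node features of the current processing node and the implicit vectors of its two child nodes; inputting the first sample subgraph and the second sample subgraph into the autoencoder respectively, to obtain a first sample vector corresponding to the first sample node and a second sample vector corresponding to the second sample node; forming a reverse sample subgraph by reversing the parent-child relationship between the nodes in the first sample subgraph and the second sample subgraph and merging the reversed subgraphs, the reverse sample subgraph including a sample node set formed by the union of the nodes in the first sample subgraph and the second sample subgraph; inputting the reverse sample subgraph into the autoencoder to obtain the implicit vector of each node in the sample node set, wherein the implicit vectors of the leaf nodes of the reverse sample subgraph are determined according to the first sample vector and the second sample vector; and determining a prediction loss according to an aggregate of the distances between the implicit vector of each node in the sample node set and its node features, and updating the autoencoder in the direction of reducing the prediction loss.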
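In different embodiments, the nodes within the predetermined range may include: child nodes of up to order K, reached within a preset number K of connecting edges; and/or child nodes whose interaction time is within a preset time range.
According to one implementation, the node features of the current processing node include the attribute features of the object corresponding to the node.
In different embodiments, when the current processing node is a user node, the attribute features may include at least one of the following: age, occupation, education level, region, registration duration, and crowd tag; when the current processing node is an item node, the attribute features may include at least one of the following: item category, shelf time, number of comments, and sales volume.
Further, in one embodiment, the node features of the current processing node also include the event behavior features of the interaction event corresponding to the node.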
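According to one implementation, the LSTM layer determines the implicit vector of the current processing node as follows: the node features of the current processing node are combined with the implicit vector of each of the two child nodes, and the combinations are input into a first transformation function and a second transformation function respectively, to obtain two first transformation vectors and two second transformation vectors; the intermediate vector, used for auxiliary operations, of the i-th of the two child nodes is combined with the corresponding i-th first transformation vector and i-th second transformation vector to obtain two operation results, and the two operation results are summed to obtain a combined vector; the node features of the current processing node, together with the implicit vectors of the two child nodes, are input into a third transformation function and a fourth transformation function respectively, to obtain a third transformation vector and a fourth transformation vector; the intermediate vector of the current processing node is determined based on the combined vector and the third transformation vector; and the implicit vector of the current processing node is determined based on the intermediate vector of the current processing node and the fourth transformation vector.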
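According to another implementation, the iterative processing performed by the LSTM layer includes determining the implicit vector of the current processing node according to the node features of the current processing node, the implicit vectors of the two child nodes, and the time difference between the first interaction time of the interaction event where the current processing node is located and the second interaction time of the interaction event where the two child nodes are located.
In a more specific embodiment of the above implementation, the LSTM layer determines the implicit vector of the current processing node as follows: the node features of the current processing node and the time difference are combined with the implicit vector of each of the two child nodes, and the combinations are input into the first transformation function to obtain two first transformation vectors; the node features are combined with the implicit vector of each of the two child nodes, and the combinations are input into the second transformation function to obtain two second transformation vectors; the intermediate vector, used for auxiliary operations, of the i-th of the two child nodes is combined with the corresponding i-th first transformation vector and i-th second transformation vector to obtain two operation results, and the two operation results are summed to obtain a combined vector; the node features of the current processing node, together with the implicit vectors of the two child nodes, are input into the third transformation function and the fourth transformation function respectively, to obtain a third transformation vector and a fourth transformation vector; the intermediate vector of the current processing node is determined based on the combined vector and the third transformation vector; and the implicit vector of the current processing node is determined based on the intermediate vector of the current processing node and the fourth transformation vector.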
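According to one embodiment, the step of forming the reverse sample subgraph specifically includes: reversing the parent-child relationship between nodes in the first sample subgraph to form a first reverse subgraph in which the first sample node is a leaf node; reversing the parent-child relationship between nodes in the second sample subgraph to form a second reverse subgraph in which the second sample node is a leaf node; merging the common nodes of the first reverse subgraph and the second reverse subgraph to form a merged subgraph; and, in the merged subgraph, adding a default child node to each node that has only one child node, thereby forming the reverse sample subgraph.
According to one implementation, the leaf nodes of the reverse sample subgraph include a first leaf node corresponding to the first sample node and a second leaf node corresponding to the second sample node, wherein the implicit vector of the first leaf node is the first sample vector and the implicit vector of the second leaf node is the second sample vector.
According to another implementation, the implicit vector of the first leaf node is determined based on the node features of the first leaf node, with the first sample vector and the second sample vector used as the implicit vectors of its two child nodes; the implicit vector of the second leaf node is determined based on the node features of the second leaf node, with the first sample vector and the second sample vector used as the implicit vectors of its two child nodes.
In one embodiment, the autoencoder includes multiple LSTM layers, wherein the implicit vector of the current processing node determined by the previous LSTM layer is input to the next LSTM layer as the node features of the current processing node.
In this case, the prediction loss can be determined as follows: for each node in the sample node set, the distance between the implicit vector of the node output by the last of the multiple LSTM layers and the node features of the node input to the first LSTM layer is determined; the prediction loss is then determined according to an aggregate of the distances corresponding to the nodes.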
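According to a second aspect, a method for evaluating interaction events using an autoencoder is provided, the method comprising: acquiring a dynamic interaction graph reflecting the association relationship of interaction events, the graph including multiple pairs of nodes, each pair of nodes representing the two objects in an interaction event, and any node i being connected through connecting edges to two child nodes, which are the two nodes corresponding to the last interaction event in which the object represented by node i participated; taking the first target node and the second target node corresponding to the target event to be analyzed as root nodes, and determining in the dynamic interaction graph a first target subgraph and a second target subgraph formed by the nodes within a predetermined range that are reached from the root node via connecting edges; acquiring an autoencoder trained according to the method of the first aspect; inputting the first target subgraph and the second target subgraph into the autoencoder respectively, to obtain a first target vector corresponding to the first target node and a second target vector corresponding to the second target node; forming a reverse target subgraph by reversing the parent-child relationship between the nodes in the first target subgraph and the second target subgraph and merging the reversed subgraphs, the reverse target subgraph including a target node set formed by the union of the nodes in the first target subgraph and the second target subgraph; inputting the reverse target subgraph into the autoencoder to obtain the implicit vector of each node in the target node set, wherein the implicit vectors of the leaf nodes of the reverse target subgraph are determined according to the first target vector and the second target vector; determining an aggregate result of the distances between the implicit vector of each node in the target node set and its node features; and evaluating whether the target event is an event of the predetermined category according to a comparison between the aggregate result and a predetermined threshold.
According to one implementation, the target event is a hypothetical event, and the event of the predetermined category is an event confirmed to have occurred.
According to another implementation, the target event is an event that has occurred, and the event of the predetermined category is an event confirmed to be safe.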
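According to a third aspect, an apparatus for training an autoencoder for evaluating interaction events is provided, the apparatus comprising: an interaction graph acquiring unit configured to acquire a dynamic interaction graph reflecting the association relationship of interaction events, the graph including multiple pairs of nodes, each pair of nodes representing the two objects in an interaction event, and any node i being connected through connecting edges to two child nodes, which are the two nodes corresponding to the last interaction event in which the object represented by node i participated; a sample subgraph acquisition unit configured to take the first sample node and the second sample node corresponding to a sample interaction event of a predetermined category as root nodes respectively, and determine in the dynamic interaction graph a first sample subgraph and a second sample subgraph formed by the nodes within a predetermined range that are reached from the root node via connecting edges; an encoder acquisition unit configured to acquire an autoencoder to be trained, the autoencoder including an LSTM layer that, according to the parent-child relationship between nodes in the input subgraph, iteratively processes each node in turn from the leaf nodes to the root node, the iterative processing including determining the implicit vector of the current processing node at least according to the node features of the current processing node and the implicit vectors of its two child nodes; a sample subgraph processing unit configured to input the first sample subgraph and the second sample subgraph into the autoencoder respectively, to obtain a first sample vector corresponding to the first sample node and a second sample vector corresponding to the second sample node; a reverse subgraph forming unit configured to form a reverse sample subgraph by reversing the parent-child relationship between the nodes in the first sample subgraph and the second sample subgraph and merging the reversed subgraphs, the reverse sample subgraph including a sample node set formed by the union of the nodes in the first sample subgraph and the second sample subgraph; a reverse subgraph processing unit configured to input the reverse sample subgraph into the autoencoder to obtain the implicit vector of each node in the sample node set, wherein the implicit vectors of the leaf nodes of the reverse sample subgraph are determined according to the first sample vector and the second sample vector; and an update unit configured to determine a prediction loss according to an aggregate of the distances between the implicit vector of each node in the sample node set and its node features, and update the autoencoder in the direction of reducing the prediction loss.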
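According to a fourth aspect, an apparatus for evaluating interaction events using an autoencoder is provided, the apparatus comprising: an interaction graph acquiring unit configured to acquire a dynamic interaction graph reflecting the association relationship of interaction events, the graph including multiple pairs of nodes, each pair of nodes representing the two objects in an interaction event, and any node i being connected through connecting edges to two child nodes, which are the two nodes corresponding to the last interaction event in which the object represented by node i participated; a target subgraph acquiring unit configured to take the first target node and the second target node corresponding to the target event to be analyzed as root nodes, and determine in the dynamic interaction graph a first target subgraph and a second target subgraph formed by the nodes within a predetermined range that are reached from the root node via connecting edges; an encoder acquisition unit configured to acquire an autoencoder trained by the apparatus of the third aspect; a target subgraph processing unit configured to input the first target subgraph and the second target subgraph into the autoencoder respectively, to obtain a first target vector corresponding to the first target node and a second target vector corresponding to the second target node; a reverse subgraph forming unit configured to form a reverse target subgraph by reversing the parent-child relationship between the nodes in the first target subgraph and the second target subgraph and merging the reversed subgraphs, the reverse target subgraph including a target node set formed by the union of the nodes in the first target subgraph and the second target subgraph; a reverse subgraph processing unit configured to input the reverse target subgraph into the autoencoder to obtain the implicit vector of each node in the target node set, wherein the implicit vectors of the leaf nodes of the reverse target subgraph are determined according to the first target vector and the second target vector; a synthesis unit configured to determine an aggregate result of the distances between the implicit vector of each node in the target node set and its node features; and an evaluation unit configured to evaluate whether the target event is an event of the predetermined category according to a comparison between the aggregate result and a predetermined threshold.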
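According to a fifth aspect, a computer-readable storage medium is provided, having a computer program stored thereon; when the computer program is executed in a computer, the computer is caused to execute the method of the first aspect or the second aspect.
According to a sixth aspect, a computing device is provided, including a memory and a processor, characterized in that the memory stores executable code, and when the processor executes the executable code, the method of the first aspect or the second aspect is implemented.
According to the method and apparatus provided by the embodiments of this specification, a dynamic interaction graph is constructed based on a sequence of interaction events, and an autoencoder for evaluating interaction events is trained based on such a dynamic interaction graph. For interaction events of a predetermined category, the autoencoder can characterize the nodes involved in the event as implicit vectors through the subgraphs and reverse subgraph involved in the event, and make the implicit vectors fully fit the node features. In this way, the trained autoencoder can be used to analyze and evaluate unknown interaction events.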
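To explain the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 shows a schematic diagram of an implementation scenario according to an embodiment;
Fig. 2 shows a flowchart of a method for training an autoencoder according to an embodiment;
Fig. 3 shows a sequence of interaction events and the dynamic interaction graph constructed from it according to an embodiment;
Fig. 4 shows an example of a sample subgraph in an embodiment;
Fig. 5 shows a schematic diagram of the operation of the LSTM layer;
Fig. 6 shows the structure of the LSTM layer according to an embodiment;
Fig. 7 shows the structure of the LSTM layer according to another embodiment;
Fig. 8 shows the structure of the LSTM layer according to yet another embodiment;
Fig. 9 shows the steps of forming a reverse subgraph in an embodiment;
Fig. 10 shows the first reverse subgraph in an example;
Fig. 11 shows the merged subgraph corresponding to the sample subgraphs of Fig. 4;
Fig. 12 shows the reverse sample subgraph in an example;
Fig. 13 shows a schematic diagram of the autoencoder processing the reverse sample subgraph;
Fig. 14 shows a schematic diagram of an autoencoder with multiple LSTM layers;
Fig. 15 shows a flowchart of a method for evaluating interaction events using an autoencoder according to an embodiment;
Fig. 16 shows a schematic block diagram of an apparatus for training an autoencoder according to an embodiment;
Fig. 17 shows a schematic block diagram of an apparatus for evaluating interaction events according to an embodiment.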
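The solution provided in this specification is described below with reference to the drawings.
As mentioned above, it is desirable to represent and model interaction objects and interaction events based on the series of interaction events in which the interaction objects take part.
In one approach, a static interaction relationship network graph is constructed from historical interaction events, and each interaction object and each interaction event is analyzed on the basis of that graph. Specifically, the participants of each historical event can be used as nodes, and connecting edges can be established between nodes that have an interaction relationship, forming an interaction network graph. In one example, a bipartite graph can be formed from the interactions between users and products. The bipartite graph contains user nodes and product nodes, and if a user has purchased a product, a connecting edge is built between that user and that product. In another example, a user transfer relationship graph can be formed from transfer records between users, in which each node represents a user and a connecting edge exists between any two users between whom a transfer has been recorded.
However, although the static network graphs in the above examples can show the interaction relationships between objects, they contain no timing information about the interaction events. Feature vectors obtained by simply performing graph embedding on such an interaction relationship network graph do not express the influence that the timing of interaction events has on the nodes. Moreover, such static graphs do not scale well, and it is difficult to handle newly added interaction events and newly added nodes flexibly.
In view of the above, according to one or more embodiments of this specification, a dynamically changing sequence of interaction events is constructed into a dynamic interaction graph, in which the interaction objects involved in each interaction event correspond to nodes of the graph. Such a dynamic interaction graph can reflect the timing information of the interaction events experienced by each interaction object. Further, in order to perform node analysis and event analysis on the dynamic interaction graph, in the embodiments of this specification an autoencoder is trained based on the dynamic interaction graph. The autoencoder encodes the nodes involved in an interaction event as representation vectors, and the interaction event can be analyzed and evaluated through the representation vectors of these nodes.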
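Fig. 1 shows a schematic diagram of an implementation scenario according to an embodiment. As shown in Fig. 1, multiple interaction events that occur in succession can be organized in chronological order into a dynamic interaction sequence $\langle E_1, E_2, \ldots, E_N \rangle$, where each element $E_i$ represents an interaction event and can be expressed as an interaction feature group $E_i = (a_i, b_i, t_i)$, in which $a_i$ and $b_i$ are the two interaction objects of event $E_i$ and $t_i$ is the interaction time.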
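According to the embodiments of this specification, a dynamic interaction graph 100 is constructed based on this dynamic interaction sequence. In graph 100, the interaction objects $a_i$ and $b_i$ of each interaction event are represented by nodes, and parent-child connecting edges are established between the nodes of consecutive events that contain the same object. The structure of the dynamic interaction graph 100 is described in more detail below.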
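To perform node analysis and event analysis more effectively, it is desirable to train an autoencoder based on the dynamic interaction graph, so that the nodes involved in an interaction event are encoded as representation vectors.
As known to those skilled in the art, an autoencoder is an artificial neural network that can learn an efficient representation of its input data through unsupervised learning. This efficient representation of the input data is also called the encoding. The dimension of the encoding is generally much smaller than that of the input data, so an autoencoder can be used for dimensionality reduction. In addition, an autoencoder can be used for feature extraction or feature encoding during the pre-training of a deep neural network, and can also serve as a generative model that randomly generates data similar to the training data.
Autoencoder structures and training methods have been proposed for input data in forms such as image data and speech data. However, faced with a completely new graph structure such as the dynamic interaction graph proposed above, existing autoencoders are not well suited to encoding its features.
For this reason, in the embodiments of this specification, an autoencoder based on a long short-term memory (LSTM) network is proposed, based on the characteristics of the dynamic interaction graph. Specifically, for a sample interaction event, the two nodes corresponding to that event in the dynamic interaction graph can be determined, and two subgraphs rooted at those two nodes (subgraph 1 and subgraph 2) are obtained. To train the autoencoder, the parent-child relationships between the nodes in these two subgraphs are also reversed to form a reverse subgraph. The two subgraphs and the reverse subgraph are then input into the autoencoder in succession. The autoencoder includes an LSTM layer, which iteratively processes the nodes of an input subgraph in turn and finally yields a representation vector for each node in the reverse subgraph. The autoencoder is trained by making the result of the iterative processing fit the original input features of each node, so that the autoencoder learns the representation vector of each node.
An autoencoder trained in this way can be used for event analysis. Specifically, for a target interaction event to be evaluated, the two subgraphs and the reverse subgraph corresponding to the target interaction event can be obtained in a way similar to the training process. The trained autoencoder then processes the two subgraphs and the reverse subgraph in succession to obtain the representation vector of each node in the reverse subgraph. The target interaction event is evaluated by comparing the representation vectors of these nodes with their original input features. Such an evaluation may be, for example, predicting whether the two objects involved in the target interaction event will interact (such as whether a user will click on a page), or predicting the category of the target interaction event (such as whether it is an abnormal event).
Specific implementations of the above concept are described below.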
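Fig. 2 shows a flowchart of a method for training an autoencoder according to an embodiment. It can be understood that the method can be executed by any apparatus, device, platform, or device cluster with computing and processing capabilities. The steps of the training method shown in Fig. 2 are described below with reference to specific embodiments.
First, in step 21, a dynamic interaction graph reflecting the association relationship of interaction events is acquired.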
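In general, as described above, multiple interaction events that occur in succession can be organized in chronological order into a sequence of interaction events, and a dynamic interaction graph can be constructed from this sequence to reflect the association relationship of the interaction events. The sequence of interaction events, denoted for example $\langle E_1, E_2, \ldots, E_N \rangle$, may include multiple interaction events arranged in chronological order, where each interaction event $E_i$ can be expressed as an interaction feature group $E_i = (a_i, b_i, t_i)$, in which $a_i$ and $b_i$ are the two interaction objects of event $E_i$ and $t_i$ is the interaction time.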
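For example, on an e-commerce platform, an interaction event may be a user's purchase, in which the two objects are a user and a product. In another example, an interaction event may be a user's click on a page block, in which the two objects are a user and a page block. In yet another example, an interaction event may be a transaction event, such as one user transferring money to another user, in which case the two objects are two users. In other business scenarios, an interaction event may be any other interaction that occurs between two objects.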
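In one embodiment, the interaction feature group corresponding to each interaction event may also include an event behavior feature f, so that each interaction feature group can be expressed as $X_i = (a_i, b_i, t_i, f)$. Specifically, the event behavior feature f may include the background and context information of the interaction event, some attribute features of the interaction behavior, and so on.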
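For example, when the interaction event is a user click event, the event behavior feature f may include the type of terminal used for the click, the browser type, the app version, and so on; when the interaction event is a transaction event, the event behavior feature f may include, for example, the transaction type (product purchase, transfer, and so on), the transaction amount, and the transaction channel.
A dynamic interaction graph can be constructed for the sequence of interaction events described above. Specifically, a pair of nodes (two nodes) is used to represent the two objects involved in one interaction event, so that every object in every interaction event of the sequence is represented by its own node. A node thus corresponds to one object in one interaction event, but the same physical object may correspond to multiple nodes. For example, if user U1 purchases product M1 at time t1 and product M2 at time t2, there are two interaction feature groups (U1, M1, t1) and (U1, M2, t2), and two nodes U1(t1) and U1(t2) are created for user U1 from these two interaction events. A node in the dynamic interaction graph can therefore be regarded as the state of an interaction object in one interaction event.
For each node in the dynamic interaction graph, the parent-child relationships and the corresponding connecting edges are built as follows. For any node i, assume it corresponds to interaction event i (with interaction time t). Then, in the sequence of interaction events, trace back from interaction event i, that is, in the direction earlier than interaction time t, and determine the first interaction event j (with interaction time t-, where t- is earlier than t) that also contains the object represented by node i as the last interaction event in which that object participated. The two nodes corresponding to that last interaction event are regarded as the child nodes of node i, and correspondingly node i is the parent node of those two nodes. Connecting edges can be established between parent and child nodes to show their parent-child dependency.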
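This is described below with a specific example. Fig. 3 shows a sequence of interaction events and the dynamic interaction graph constructed from it according to an embodiment. Specifically, the left side of Fig. 3 shows a sequence of interaction events organized in chronological order, illustrating interaction events $E_1, E_2, \ldots, E_6$ that occur at times $t_1, t_2, \ldots, t_6$ respectively; each interaction event contains the two interaction objects involved and the interaction time (the event behavior features are omitted for clarity of illustration). The right side of Fig. 3 shows the dynamic interaction graph constructed from the sequence on the left, in which the two interaction objects of each interaction event are each taken as a node. The construction of parent-child relationships and connecting edges is described below, taking node $u(t_6)$ as an example.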
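As shown in the figure, node $u(t_6)$ represents an interaction object u in interaction event $E_6$. Tracing back from $E_6$, the first interaction event found that also contains object u is $E_4$; that is, $E_4$ is the last interaction event in which u participated. Accordingly, the two nodes $u(t_4)$ and $w(t_4)$, corresponding to the two interaction objects of $E_4$, are the two child nodes of $u(t_6)$, and two connecting edges are established pointing from the child nodes $u(t_4)$ and $w(t_4)$ to the parent node $u(t_6)$. Similarly, tracing back further from $u(t_4)$ (corresponding to $E_4$), the last interaction event $E_2$ in which u participated can be found, so the two nodes $u(t_2)$ and $y(t_2)$ corresponding to $E_2$ are regarded as the child nodes of $u(t_4)$, and connecting edges pointing from these two nodes to $u(t_4)$ are established. On the other side, tracing back from $w(t_4)$, the last interaction event $E_3$ in which object w participated can be found, and connecting edges are established from the two nodes corresponding to $E_3$ to $w(t_4)$. In this way, connecting edges reflecting parent-child relationships are built between nodes, forming the dynamic interaction graph of Fig. 3.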
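In one embodiment, the two objects involved in an interaction event can be divided into two classes: first-class objects and second-class objects. For example, in a page click event the first-class object is the user and the second-class object is the page block. In this case, the two classes of objects can be distinguished in the dynamic interaction graph by the relative position of the nodes; for example, for the nodes of each interaction event, the first-class object is placed on the left and the second-class object on the right. In other words, the nodes are correspondingly divided into left nodes and right nodes. In Fig. 3, for example, the left nodes are user nodes and the right nodes are item nodes. In other embodiments, of course, the positions of the nodes need not be distinguished.
The above describes how a dynamic interaction graph is constructed from a sequence of interaction events. For the training process shown in Fig. 2, the dynamic interaction graph can be constructed either in advance or on the spot. Accordingly, in one embodiment, in step 21 the dynamic interaction graph is constructed on the spot from the sequence of interaction events, in the manner described above. In another embodiment, the dynamic interaction graph is constructed in advance from the sequence of interaction events, and in step 21 the already formed dynamic interaction graph is read or received.
It can be understood that a dynamic interaction graph constructed in the above manner is highly scalable and can easily be updated dynamically as new interaction events arrive. When a new interaction event occurs, the two objects involved in it can be added to the existing dynamic interaction graph as two new nodes. For each new node, it is determined whether it has child nodes; if so, connecting edges pointing from the child nodes to the new node are added, forming the updated dynamic interaction graph.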
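The construction and incremental update described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the class and field names (`Node`, `DynamicInteractionGraph`, `add_event`) are hypothetical and do not come from the specification, events are assumed to arrive in chronological order, and event behavior features are omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    obj: str        # identifier of the interaction object (hypothetical field name)
    time: float     # interaction time of the event this node belongs to
    children: List["Node"] = field(default_factory=list)  # empty, or the two child nodes


class DynamicInteractionGraph:
    """Illustrative container; not the data structure used in the specification."""

    def __init__(self) -> None:
        self.nodes: List[Node] = []
        # For each object, the pair of nodes of the last event it took part in.
        self._last_event: Dict[str, Tuple[Node, Node]] = {}

    def add_event(self, a: str, b: str, t: float) -> Tuple[Node, Node]:
        """Add the event (a, b, t); events are assumed to arrive in time order."""
        pair = (Node(a, t), Node(b, t))
        for node in pair:
            previous = self._last_event.get(node.obj)
            if previous is not None:
                # The children are both nodes of the object's previous event.
                node.children = list(previous)
        for node in pair:
            self._last_event[node.obj] = pair
        self.nodes.extend(pair)
        return pair


graph = DynamicInteractionGraph()
u1_t1, m1_t1 = graph.add_event("U1", "M1", 1.0)
u1_t2, m2_t2 = graph.add_event("U1", "M2", 2.0)
print([child.obj for child in u1_t2.children])
```

With the two purchase events of user U1 from the example above, node U1(t2) receives the two nodes of the earlier event as children, while M2(t2) has no children because M2 has no earlier event.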
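To sum up, in step 21 a dynamic interaction graph reflecting the association relationship of interaction events is acquired. Next, in step 22, the first sample node and the second sample node corresponding to the sample interaction event are taken as root nodes respectively, and the corresponding first sample subgraph and second sample subgraph are determined in the dynamic interaction graph.
Specifically, in order to train the autoencoder, the sample interaction event is a selected sample event whose category is known to be the predetermined category. For example, the predetermined category may be events in which interaction is confirmed to have occurred, or interaction events confirmed to be safe (for example, events that are not hacker attacks or fraudulent transactions). Once the sample interaction event has been selected, the two nodes involved in it, namely the first sample node and the second sample node, can be determined in the dynamic interaction graph. Then, with the first sample node and the second sample node each taken in turn as the current root node, the dynamic interaction graph is traversed from the current root node along the parent-child relationships, and the nodes within a predetermined range reached via connecting edges are taken as the corresponding subgraph, yielding the first sample subgraph and the second sample subgraph respectively.
In one embodiment, the nodes within the predetermined range may be the child nodes reachable via at most a preset number K of connecting edges. The number K is a preset hyperparameter that can be chosen according to the business situation. It can be understood that K reflects the number of steps of historical interaction events traced back from the root node, that is, the order of the child nodes. The larger K is, the further back the historical interaction information that is taken into account.
In another embodiment, the nodes within the predetermined range may be the child nodes whose interaction time is within a predetermined time range, for example the child nodes that lie within a duration T (such as one day) before the interaction time of the root node and can be reached via connecting edges.
In yet another embodiment, the predetermined range takes both the number of connecting edges and the time range into account. In other words, the nodes within the predetermined range are the child nodes that are reachable via at most the preset number K of connecting edges and whose interaction time is within the predetermined time range.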
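The above example is continued below with a specific illustration. Fig. 4 shows an example of a sample subgraph in an embodiment. In the example of Fig. 4, assume $u(t_6)$ is the first sample node; its first sample subgraph is determined with $u(t_6)$ as the root node, and the subgraph is assumed to consist of the child nodes reached via at most a preset number K=2 of connecting edges. Starting from the current root node $u(t_6)$ and traversing along the parent-child relationships, the nodes that can be reached via 2 connecting edges are shown in the dashed area of the figure. The nodes and connections in this area form the subgraph corresponding to node $u(t_6)$, that is, the first sample subgraph.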
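Similarly, if another node $v(t_6)$ is the second sample node, a traversal can be performed again with $v(t_6)$ as the root node to obtain the second sample subgraph.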
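The range selection described above (at most K connecting edges, a time window, or both) can be sketched as a breadth-first traversal. This is a minimal sketch under assumptions: the function name `sample_subgraph`, its parameters, and the node identifiers are hypothetical, and the small graph only loosely follows the part of the example described in the text (the nodes `u1` and `x1` are invented to show the cut-off).

```python
from collections import deque
from typing import Dict, Optional, Set, Tuple


def sample_subgraph(
    root: str,
    children: Dict[str, Tuple[str, ...]],
    times: Dict[str, float],
    max_hops: Optional[int] = None,
    max_age: Optional[float] = None,
) -> Set[str]:
    """Return the ids of the nodes reachable from root within the given range.

    max_hops: at most K connecting edges from the root (None disables the limit).
    max_age:  keep only nodes whose interaction time lies within max_age before
              the root's interaction time (None disables the limit).
    """
    selected = {root}
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        if max_hops is not None and depth >= max_hops:
            continue  # children of this node would exceed K edges
        for child in children.get(node, ()):
            if child in selected:
                continue
            if max_age is not None and times[root] - times[child] > max_age:
                continue  # too old; its descendants are older still
            selected.add(child)
            queue.append((child, depth + 1))
    return selected


# Illustrative graph: node id -> its two child nodes.
children = {
    "u6": ("u4", "w4"),
    "u4": ("u2", "y2"),
    "w4": ("q3", "w3"),
    "u2": ("u1", "x1"),
}
times = {"u6": 6, "u4": 4, "w4": 4, "u2": 2, "y2": 2,
         "q3": 3, "w3": 3, "u1": 1, "x1": 1}

by_hops = sample_subgraph("u6", children, times, max_hops=2)
by_time = sample_subgraph("u6", children, times, max_age=3)
both = sample_subgraph("u6", children, times, max_hops=1, max_age=3)
```

Passing both limits corresponds to the embodiment in which a node must satisfy the edge count and the time range at the same time.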
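In the following, for clarity and simplicity, the first sample node is denoted u(t) and the second sample node is denoted v(t). Thus, for the first sample node u(t) and the second sample node v(t) involved in a sample interaction event of known category, the corresponding first sample subgraph and second sample subgraph are obtained.
On the other hand, in step 23, an autoencoder to be trained is acquired. The autoencoder includes an LSTM layer which, according to the parent-child relationship between nodes in the input subgraph, iteratively processes each node in turn from the leaf nodes to the root node, the iterative processing including determining the implicit vector of the current processing node at least according to the node features of the current processing node and the implicit vectors of its two child nodes.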
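Fig. 5 shows a schematic diagram of the operation of the LSTM layer. Assume the two child nodes of node Q are node $J_1$ and node $J_2$. As shown in Fig. 5, at time T the LSTM layer processes nodes $J_1$ and $J_2$ and obtains their representation vectors H1 and H2 respectively, where a representation vector may include an implicit vector and an intermediate vector used for auxiliary operations; at the following time T+, the LSTM layer obtains the representation vector $H_Q$ of node Q from the node features of Q and the previously obtained representation vectors H1 and H2 of $J_1$ and $J_2$. It can be understood that the representation vector of node Q can be used at a later time, together with the representation vector of the counterpart node of Q (the other node in the same event), to obtain the representation vector of the parent node of Q, which is how the iterative processing is realized.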
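Fig. 6 shows the structure of the LSTM layer according to an embodiment. In the example of Fig. 6, the current processing node is denoted z(t), and $x_{z(t)}$ denotes the node features of that node. When the node represents a user, the node features may include attribute features of the user, such as age, occupation, education level, region, registration duration, and crowd tag; when the node represents an item, the node features may include attribute features of the item, such as item category, shelf time, sales volume, and number of comments. When the node represents another kind of interaction object, the original node features can be obtained correspondingly. When the interaction feature group also includes the event behavior feature f, the node features may also include the event behavior feature f of the interaction event in which the node is located.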
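Assume the two child nodes of node z(t) are the first child node $j_1$ and the second child node $j_2$. Then $c_{j_1}$ and $h_{j_1}$ denote the intermediate vector and the implicit vector of the first child node $j_1$, and $c_{j_2}$ and $h_{j_2}$ denote the intermediate vector and the implicit vector of the second child node $j_2$, where the intermediate vector is used for auxiliary operations and the implicit vector is used to represent the node.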
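The LSTM layer performs the following operations on the node features, intermediate vectors, and implicit vectors input to it.
The node features $x_{z(t)}$ of the current processing node z(t) are combined with the implicit vectors $h_{j_1}$ and $h_{j_2}$ of the two child nodes $j_1$ and $j_2$ respectively, and the combinations are input into the first transformation function g and the second transformation function f, yielding two first transformation vectors and two second transformation vectors.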
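More specifically, in one example, the first transformation function g and the second transformation function f are computed using formula (1) and formula (2) respectively: [formulas (1) and (2)]. In formulas (1) and (2), i is 1 or 2, corresponding to the two child nodes respectively; σ is an activation function, for example the sigmoid function; the remaining parameters are linear transformation matrices and offset parameters. It can be seen that formulas (1) and (2) use the same algorithm and differ only in their parameters. Through the above transformation functions, two first transformation vectors and two second transformation vectors are obtained, one of each for i equal to 1 and 2.
Of course, in other examples, similar but different transformation functions can be used, for example by choosing a different activation function or by modifying the form and number of the parameters in the above formulas.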
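Next, the intermediate vector $c_{j_i}$ of the i-th of the two child nodes is combined with the corresponding i-th first transformation vector and i-th second transformation vector. Specifically, in one example, this combination operation may be an element-wise multiplication of the three vectors, that is, $v_i = g_i \odot f_i \odot c_{j_i}$, where $g_i$ and $f_i$ here stand for the i-th first and second transformation vectors and ⊙ denotes element-wise multiplication. In other examples, the combination operation may also be another vector operation such as addition. Since i is 1 or 2, two combination results $v_1$ and $v_2$ are obtained. $v_1$ and $v_2$ can be summed to obtain the combined vector V. When the combination operation is element-wise multiplication, the combined vector V can be expressed as:
$V = v_1 + v_2 = \sum_{i=1}^{2} g_i \odot f_i \odot c_{j_i}$ (3)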
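In addition, the node features $x_{z(t)}$ of the current processing node z(t), together with the implicit vectors $h_{j_1}$ and $h_{j_2}$ of the two child nodes, are input into the third transformation function and the fourth transformation function respectively, yielding the third transformation vector and the fourth transformation vector.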
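Specifically, in the example shown in Fig. 6, the third transformation function p may first compute the vectors $u_{z(t)}$ and $s_{z(t)}$ and then multiply $u_{z(t)}$ and $s_{z(t)}$ element-wise to obtain the third transformation vector $p_{z(t)}$, that is:
$p_{z(t)} = u_{z(t)} \odot s_{z(t)}$ (4)
where ⊙ denotes element-wise multiplication.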
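More specifically, $u_{z(t)}$ and $s_{z(t)}$ can be computed according to the following formulas: [formulas (5) and (6)].
The fourth transformation function o may obtain the fourth transformation vector $o_{z(t)}$ by the following formula: [formula (7)].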
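Next, the intermediate vector $c_{z(t)}$ of the current processing node z(t) is determined based on the combined vector V and the third transformation vector $p_{z(t)}$. For example, the combined vector V and the third transformation vector $p_{z(t)}$ can be summed to obtain the intermediate vector $c_{z(t)}$ of z(t). In other examples, other ways of combining them can be used, such as weighted summation or element-wise multiplication, with the combined result taken as the intermediate vector $c_{z(t)}$ of z(t).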
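In addition, the implicit vector $h_{z(t)}$ of node z(t) is determined based on the intermediate vector $c_{z(t)}$ of node z(t) obtained in this way and the fourth transformation vector $o_{z(t)}$.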
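In the specific example shown in Fig. 6, the tanh function can be applied to the intermediate vector $c_{z(t)}$ and the result combined with the fourth transformation vector $o_{z(t)}$, for example by element-wise multiplication, to give the implicit vector $h_{z(t)}$ of node z(t), that is:
$h_{z(t)} = o_{z(t)} \odot \tanh(c_{z(t)})$ (8)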
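Thus, according to the structure and algorithm shown in Fig. 6, the LSTM layer determines the intermediate vector $c_{z(t)}$ and the implicit vector $h_{z(t)}$ of node z(t) from the node features of the current processing node z(t) and the respective intermediate vectors and implicit vectors of its two child nodes $j_1$ and $j_2$.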
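The node update can be illustrated with a small pure-Python sketch. The combination of the three vectors by element-wise multiplication, the sum of the two results, the sum of the combined vector with the third transformation vector, and the final `o * tanh(c)` step follow the text. Everything else is an assumption made only for illustration: the transformation functions g, f, u and o are taken to be sigmoid functions of an affine map of the concatenated inputs, s is taken to be a tanh of such a map, the features and the implicit vectors share one dimension, and all names (`init_params`, `node_update`, and so on) are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Vector = List[float]
Layer = Tuple[List[Vector], Vector]  # (rows of the weight matrix, bias)


def make_layer(rng: random.Random, n_in: int, n_out: int) -> Layer:
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def affine(layer: Layer, inputs: Sequence[float]) -> Vector:
    weights, bias = layer
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, bias)]


def sigmoid(v: Sequence[float]) -> Vector:
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def tanh(v: Sequence[float]) -> Vector:
    return [math.tanh(a) for a in v]


def hadamard(*vectors: Sequence[float]) -> Vector:
    """Element-wise product of any number of equally long vectors."""
    return [math.prod(items) for items in zip(*vectors)]


def init_params(dim: int, seed: int = 0) -> dict:
    rng = random.Random(seed)
    return {
        # One g and one f per child; each sees [x, h_child] (assumed form).
        "g": [make_layer(rng, 2 * dim, dim) for _ in range(2)],
        "f": [make_layer(rng, 2 * dim, dim) for _ in range(2)],
        # u, s and o see [x, h_child1, h_child2] (assumed form).
        "u": make_layer(rng, 3 * dim, dim),
        "s": make_layer(rng, 3 * dim, dim),
        "o": make_layer(rng, 3 * dim, dim),
    }


def node_update(params: dict, x: Vector,
                children: Sequence[Tuple[Vector, Vector]]) -> Tuple[Vector, Vector]:
    """children holds (intermediate vector c, implicit vector h) for both child nodes."""
    combined = [0.0] * len(x)
    for i, (c_child, h_child) in enumerate(children):
        g_i = sigmoid(affine(params["g"][i], list(x) + list(h_child)))
        f_i = sigmoid(affine(params["f"][i], list(x) + list(h_child)))
        v_i = hadamard(g_i, f_i, c_child)            # product of the three vectors
        combined = [a + b for a, b in zip(combined, v_i)]  # V = v_1 + v_2
    joint = list(x) + list(children[0][1]) + list(children[1][1])
    u = sigmoid(affine(params["u"], joint))
    s = tanh(affine(params["s"], joint))
    p = hadamard(u, s)                               # third transformation vector
    o = sigmoid(affine(params["o"], joint))          # fourth transformation vector
    c = [a + b for a, b in zip(combined, p)]         # intermediate vector
    h = hadamard(o, tanh(c))                         # implicit vector
    return c, h


params = init_params(dim=3)
zero = [0.0, 0.0, 0.0]
c_leaf, h_leaf = node_update(params, [0.2, -0.1, 0.4], [(zero, zero), (zero, zero)])
c_parent, h_parent = node_update(params, [0.5, 0.5, 0.5],
                                 [(c_leaf, h_leaf), (c_leaf, h_leaf)])
```

The first call treats both children as zero-valued defaults, so the combined vector vanishes and the intermediate vector equals the third transformation vector; the second call feeds the resulting state back in as the state of both children of a parent node.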
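In one embodiment, in the course of iteratively processing each node z(t) to determine its intermediate vector and implicit vector, the time difference Δ between the interaction time corresponding to the current processing node z(t) and the interaction time of the interaction event in which its child nodes are located is additionally introduced. That is, for any current processing node z(t), the processing of that node by the LSTM layer includes determining the implicit vector $h_{z(t)}$ of the current processing node z(t) according to the node features $x_{z(t)}$ of the current processing node, the implicit vectors of its two child nodes $j_1$ and $j_2$, and the time difference Δ between the first interaction time (t) of the interaction event in which node z(t) is located and the second interaction time (t-) of the interaction event in which its two child nodes ($j_1$ and $j_2$) are located.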
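More specifically, the time difference Δ can be introduced on top of the approach shown in Fig. 6, and the implicit vector and intermediate vector of node z(t) obtained in a similar way. Specifically, the processing that incorporates the time difference may include: combining the node features $x_{z(t)}$ of the current processing node z(t) and the time difference Δ with the implicit vectors $h_{j_1}$ and $h_{j_2}$ of the two child nodes respectively, and inputting the combinations into the first transformation function to obtain two first transformation vectors; combining the node features with the implicit vectors of the two child nodes respectively, and inputting the combinations into the second transformation function to obtain two second transformation vectors; combining the intermediate vector of the i-th of the two child nodes with the corresponding i-th first transformation vector and i-th second transformation vector to obtain two operation results, and summing the two operation results to obtain a combined vector; inputting the node features $x_{z(t)}$ of the current processing node z(t), together with the implicit vectors $h_{j_1}$ and $h_{j_2}$ of the two child nodes, into the third transformation function and the fourth transformation function respectively, to obtain the third transformation vector and the fourth transformation vector; determining the intermediate vector $c_{z(t)}$ of the current processing node based on the combined vector and the third transformation vector; and determining the implicit vector $h_{z(t)}$ of the current processing node based on the intermediate vector of the current processing node z(t) and the fourth transformation vector.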
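Fig. 7 shows the structure of the LSTM layer according to another embodiment. Comparing Fig. 7 with Fig. 6, it can be seen that the structure and the algorithm of Fig. 7 are similar to those of Fig. 6, except that the time difference Δ is additionally introduced. In the example of Fig. 7, the time difference Δ, together with the node features $x_{z(t)}$ of node z(t), is combined with the implicit vector of each child node and input into the first transformation function g. Correspondingly, the first transformation function g can be modified as: [formula (9)].
The other transformation functions in Fig. 7, and the operations between the functions, can be the same as in the example described with reference to Fig. 6.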
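According to another implementation, the time difference can additionally be input into the second transformation function. That is, the node features of the current processing node z(t) and the time difference Δ are combined with the implicit vector of each of the two child nodes, and the combinations are input into the first transformation function g and the second transformation function f respectively, yielding two first transformation vectors and two second transformation vectors. The subsequent processing is the same as described above.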
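For example, Fig. 8 shows the structure of the LSTM layer according to yet another embodiment. It can be seen that the LSTM layer of Fig. 8 also introduces the time difference Δ and, compared with Fig. 7, the time difference Δ in Fig. 8 is additionally input into the second transformation function f. More specifically, the first transformation function g in Fig. 8 can still take the form of formula (9), while the second transformation function f can take the following form: [formula (10)].
The other transformation functions in Fig. 8, and the operations between the functions, can be the same as in the example described with reference to Fig. 6.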
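In further embodiments, the time difference can additionally be input into the third transformation function p and/or the fourth transformation function o. In that case, some or all of the aforementioned formulas (5), (6), and (7) can be modified by similarly adding a time term for the time difference, which is not described in detail here.
Through the LSTM layer described above with reference to Figs. 6 to 8, the autoencoder can iteratively process the nodes in the input subgraph in turn and obtain the intermediate vector and implicit vector of each node.
Then, in step 24, the first sample subgraph and the second sample subgraph obtained in step 22 are input into the aforementioned autoencoder respectively, yielding the implicit vectors corresponding to the first sample node and the second sample node, which are called the first sample vector and the second sample vector.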
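This process is described with reference to the first sample subgraph of Fig. 4. For the node $u(t_2)$ at the lowest level of the figure, its child nodes are not considered in the input subgraph. In this case, the intermediate vector c and the implicit vector h of each of its two child nodes are generated by padding with a default value (for example 0). The LSTM layer then takes node $u(t_2)$ as the current processing node and, in any of the ways shown in Figs. 6 to 8, determines the intermediate vector $c(u(t_2))$ and the implicit vector $h(u(t_2))$ of node $u(t_2)$ based on the node features of $u(t_2)$ and the two default intermediate vectors c and two default implicit vectors h. The lowest-level node $y(t_2)$ is processed in the same way to obtain the corresponding intermediate vector $c(y(t_2))$ and implicit vector $h(y(t_2))$.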
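The parent node of nodes $u(t_2)$ and $y(t_2)$ is $u(t_4)$. Next, the LSTM layer takes $u(t_4)$ as the current processing node and, in any of the ways shown in Figs. 6 to 8, determines the intermediate vector $c(u(t_4))$ and the implicit vector $h(u(t_4))$ of node $u(t_4)$ from the node features of $u(t_4)$ itself and the respective intermediate vectors and implicit vectors of its two child nodes $u(t_2)$ and $y(t_2)$, namely $c(u(t_2))$, $h(u(t_2))$, $c(y(t_2))$, and $h(y(t_2))$.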
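By iterating in this way, level by level, the intermediate vector and implicit vector of the first sample node $u(t_6)$ can be obtained.
For the second sample subgraph, the autoencoder performs similar processing and obtains the intermediate vector and implicit vector of the second sample node $v(t_6)$.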
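The leaf-to-root iteration with default padding can be sketched as a recursive traversal. This is a minimal sketch under assumptions: `encode_subgraph` and its parameters are hypothetical names, and `toy_cell` is a deliberately simple stand-in for the LSTM node update (it is not the update of Figs. 6 to 8), used only so that the traversal order and the padding are easy to follow.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]
State = Tuple[Vector, Vector]  # (intermediate vector c, implicit vector h)


def encode_subgraph(
    root: str,
    children: Dict[str, Tuple[str, ...]],   # child ids inside the subgraph only
    features: Dict[str, Vector],
    cell: Callable[[Vector, Sequence[State]], State],
    dim: int,
) -> Dict[str, State]:
    """Process nodes from the leaves up to the root; return every node's state."""
    default: State = ([0.0] * dim, [0.0] * dim)  # zero padding for absent children
    states: Dict[str, State] = {}

    def visit(node: str) -> State:
        if node in states:
            return states[node]
        child_states = [visit(child) for child in children.get(node, ())]
        while len(child_states) < 2:
            child_states.append(default)
        states[node] = cell(features[node], child_states)
        return states[node]

    visit(root)
    return states


def toy_cell(x: Vector, kids: Sequence[State]) -> State:
    """Stand-in update: c = x + mean of the children's c, h = tanh(c)."""
    c = [xi + 0.5 * (a + b) for xi, a, b in zip(x, kids[0][0], kids[1][0])]
    return c, [math.tanh(v) for v in c]


features = {"u4": [0.0], "u2": [1.0], "y2": [3.0]}
children = {"u4": ("u2", "y2")}
states = encode_subgraph("u4", children, features, toy_cell, dim=1)
```

The two lowest-level nodes are processed first with zero-valued child states, and their results then feed the update of their parent, mirroring the level-by-level description above.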
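Then, in order to train the autoencoder, in step 25 a reverse sample subgraph is formed by reversing the parent-child relationship between the nodes in the first sample subgraph and the second sample subgraph. The reverse sample subgraph includes a sample node set formed by the union of the nodes in the first sample subgraph and the second sample subgraph. The first sample node and the second sample node are the root nodes of the first sample subgraph and the second sample subgraph respectively, whereas in the reverse sample subgraph they are the lowest-level leaf nodes.
Fig. 9 shows the steps of forming a reverse subgraph in an embodiment, that is, the sub-steps of step 25 above. As shown in Fig. 9, first, in step 91, the parent-child relationship between nodes in the first sample subgraph is reversed to form a first reverse subgraph in which the first sample node is a leaf node.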
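Take the first sample subgraph of Fig. 4 as an example. In Fig. 4, the first sample node $u(t_6)$ is the root node, and each connecting edge points from a child node to its parent node. By reversing the parent-child relationships in Fig. 4, that is, by reversing the direction of the connecting edges while still arranging child nodes below their parent nodes, the first reverse subgraph shown in Fig. 10 can be formed. It can be seen that the first reverse subgraph can be regarded as a mirror image of the first sample subgraph. In the first reverse subgraph the first sample node becomes a leaf node, and it is denoted $u'(t_6)$ in the reverse subgraph.
Similarly, in step 92, the parent-child relationship between nodes in the second sample subgraph is reversed to form a second reverse subgraph in which the second sample node is a leaf node. It can be understood that steps 91 and 92 can be executed in parallel or in any order.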
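Then, in step 93, the common nodes of the first reverse subgraph and the second reverse subgraph are merged to form a merged subgraph. It can be understood that the first sample subgraph and the second sample subgraph may contain common nodes; for example, in Fig. 4, when the first sample node is $u(t_6)$ and the second sample node is $v(t_6)$, $q(t_3)$ and $w(t_3)$ are common nodes. After the first sample subgraph and the second sample subgraph are reversed, the first reverse subgraph and the second reverse subgraph still contain these common nodes, so the common nodes of the two reverse subgraphs can be merged. Fig. 11 shows the merged subgraph corresponding to the sample subgraphs of Fig. 4, in which a prime mark is appended to the node labels of Fig. 4 to indicate the corresponding nodes in the reverse subgraph. If the first reverse subgraph and the second reverse subgraph have no common nodes, the two reverse subgraphs are simply put together to form the merged subgraph.
Then, in step 94, in the merged subgraph, a default child node is added to each node that has only one child node, thereby forming the reverse sample subgraph.
It should be understood that in the original sample subgraphs each parent node has two child nodes. After the parent-child relationships in the sample subgraphs are reversed, there will be cases where two parent nodes share one child node and where some nodes have only one child node. To keep the graph structure consistent with that of the original sample subgraphs, so that the autoencoder can perform the same iterative processing, a default child node is added to each node in the merged subgraph that has only one child node.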
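For example, in Fig. 11, node $u'(t_4)$ has only one child node, so a default child node A can be added to give it two child nodes. Node $w'(t_4)$ also has only one child node, so a default child node B can be added to give it two child nodes. Node $w'(t_3)$ already has two child nodes, so no default child node needs to be added. The reverse sample subgraph shown in Fig. 12 is thus obtained, in which the default child nodes added to some of the nodes are omitted for simplicity of illustration.
In one embodiment, the objects involved in interaction events are divided into first-class objects and second-class objects, and correspondingly the nodes are divided into left nodes and right nodes. In this case, the class of the missing child node also has to be considered when adding a default child node: when the node concerned has only a left child node, a right default child node is added; when it has only a right child node, a left default child node is added.
The above describes a specific process of forming a reverse sample subgraph with reference to the examples of Fig. 9 and Fig. 4. It can be understood that this process can be varied, and the reverse sample subgraph can be formed in other similar ways. For example, the first sample subgraph and the second sample subgraph can first be merged to form a merged subgraph, the parent-child relationships of the nodes in the merged subgraph can then be reversed, and default child nodes can finally be added, forming the reverse sample subgraph.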
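The reverse, merge, and pad steps can be sketched as operations on child maps. This is a minimal sketch under assumptions: the function names, the `DEFAULT` placeholder id, and the node ids are hypothetical; the two small subgraphs only loosely follow the example in the text; and the left/right distinction between default child nodes is not modelled.

```python
from typing import Dict, List, Sequence

DEFAULT = "<default>"  # hypothetical id for an added default child node


def reverse_edges(children: Dict[str, Sequence[str]]) -> Dict[str, List[str]]:
    """Turn every parent -> child relation into child -> parent."""
    reversed_children: Dict[str, List[str]] = {node: [] for node in children}
    for parent, kids in children.items():
        for kid in kids:
            # In the reverse subgraph the former parent is a child of the former child.
            reversed_children.setdefault(kid, []).append(parent)
    return reversed_children


def reverse_sample_subgraph(
    first: Dict[str, Sequence[str]],
    second: Dict[str, Sequence[str]],
) -> Dict[str, List[str]]:
    """Reverse both subgraphs, merge common nodes, and pad single-child nodes."""
    merged: Dict[str, List[str]] = {}
    for reversed_graph in (reverse_edges(first), reverse_edges(second)):
        for node, kids in reversed_graph.items():
            slot = merged.setdefault(node, [])
            for kid in kids:
                if kid not in slot:
                    slot.append(kid)
    for kids in merged.values():
        if len(kids) == 1:
            kids.append(DEFAULT)  # give the node a second, default child
    return merged


# Illustrative sample subgraphs: node id -> child ids (original direction).
first = {"u6": ["u4", "w4"], "u4": ["u2", "y2"], "w4": ["q3", "w3"]}
second = {"v6": ["q5", "v5"], "q5": ["q3", "w3"]}
merged = reverse_sample_subgraph(first, second)
```

In the result, the former root nodes `u6` and `v6` have no children and act as the leaf nodes, a shared node such as `w3` keeps its two children, and nodes with a single child receive the default placeholder.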
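Then, in step 26 of Fig. 2, the reverse sample subgraph is input into the aforementioned autoencoder, and the implicit vector of each node in the sample node set is obtained through the iterative processing of the LSTM layer in the autoencoder.
As described above, when performing this iterative processing, the autoencoder starts from the leaf nodes of the input subgraph. When the input subgraph is the reverse sample subgraph, its leaf nodes are precisely the first sample node and the second sample node, which were originally root nodes; that is, the first leaf node is the first sample node and the second leaf node is the second sample node. The implicit vectors of these two leaf nodes can be determined from the first sample vector and the second sample vector obtained for the first sample node and the second sample node when the autoencoder processed the first sample subgraph and the second sample subgraph in step 24. The autoencoder's processing of the reverse sample subgraph is therefore a continuation of the results of processing the first sample subgraph and the second sample subgraph.
Fig. 13 shows a schematic diagram of the autoencoder processing the reverse sample subgraph, based on the sample subgraph example of Fig. 4 and the reverse sample subgraph example of Fig. 12. The default child nodes added in the reverse sample subgraph are omitted from Fig. 13 for simplicity of illustration. It can be seen that, by processing the first sample subgraph and the second sample subgraph, the autoencoder obtains the first sample vector corresponding to the first sample node (denoted h(u) in the figure for simplicity) and the second sample vector h(v) corresponding to the second sample node. The first sample vector and the second sample vector serve as intermediate results for determining the implicit vectors of the leaf nodes of the reverse sample subgraph, thereby linking the processing of the forward sample subgraphs with the processing of the reverse sample subgraph; the implicit vectors of the leaf nodes are thus the starting vectors for the autoencoder's processing of the reverse sample subgraph.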
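In one embodiment, the first sample vector is used directly as the implicit vector of the first leaf node, and the second sample vector is used directly as the implicit vector of the second leaf node. With reference to Fig. 13, this is equivalent to merging, for processing purposes, the first sample node $u(t_6)$ of the forward first sample subgraph and the corresponding node $u'(t_6)$ of the reverse sample subgraph into one abstract node.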
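In another embodiment, the implicit vector of the first leaf node in the reverse sample subgraph is determined based on the node features of the first leaf node, with the first sample vector and the second sample vector used as the implicit vectors of its two child nodes; the implicit vector of the second leaf node is determined based on the node features of the second leaf node, with the first sample vector and the second sample vector used as the implicit vectors of its two child nodes. With reference to Fig. 13, this is equivalent to treating the root nodes $u(t_6)$ and $v(t_6)$ of the forward first and second sample subgraphs as the child nodes of the two leaf nodes $u'(t_6)$ and $v'(t_6)$ of the reverse sample subgraph. Following the processing logic of the autoencoder described above, the implicit vector of each leaf node is determined from its own node features and the implicit vectors of its two child nodes, namely the first sample vector and the second sample vector.
The implicit vectors of the leaf nodes of the reverse sample subgraph are obtained in the above ways. On this basis, the LSTM layer in the autoencoder can use the iterative processing described above to go on processing the other nodes of the reverse sample subgraph, thereby obtaining the implicit vector of each node.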
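For example, in the example of Fig. 12, the implicit vector of node $u'(t_4)$ is determined based on its own node features and the implicit vectors of its two child nodes, the leaf node $u'(t_6)$ and the default child node A; the implicit vector of node $w'(t_4)$ is determined based on its own node features and the implicit vectors of its two child nodes, the leaf node $u'(t_6)$ and the default child node B; and the implicit vector of node $w'(t_3)$ is determined based on its own node features and the implicit vectors of its two child nodes $w'(t_4)$ and $q'(t_5)$. The implicit vector of a default child node can be preset to a default value. In one embodiment, different default implicit vectors are preset for left default child nodes and right default child nodes.
By iterating in this way, the autoencoder can obtain the implicit vector of each node in the sample node set contained in the reverse sample subgraph.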
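In one embodiment, in order to extract higher-level features and further improve the quality of the representation, the autoencoder includes multiple LSTM layers. In this case, the implicit vector of a node determined by the previous LSTM layer is input to the next LSTM layer as the node features of that node.
Fig. 14 shows a schematic diagram of an autoencoder with multiple LSTM layers. As shown in Fig. 14, each LSTM layer still processes the nodes iteratively, determining the implicit vector and intermediate vector of the current processing node i from the node features of node i and the respective intermediate vectors and implicit vectors of its two child nodes; the only difference is that the lowest LSTM layer uses the original input features of node i as its node features, while each subsequent LSTM layer uses the implicit vector of node i determined by the previous LSTM layer as its node features. In this case, the implicit vectors of the nodes obtained by the last LSTM layer when processing the reverse sample subgraph are taken as the final implicit vectors of the nodes in the sample node set.
Next, in step 27, the prediction loss is determined according to an aggregate of the distances between the implicit vector of each node in the sample node set and its node features, and the autoencoder is updated in the direction of reducing the prediction loss.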
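Specifically, for any node i in the sample node set S, the distance $\lVert h(i) - x(i) \rVert$ between its implicit vector h(i) and its node features x(i) can be determined; this is also called the prediction distance. When the autoencoder includes multiple LSTM layers, the node features x(i) are the node features of node i input to the first LSTM layer, and the implicit vector h(i) is the implicit vector of that node obtained by the last LSTM layer when processing the reverse sample subgraph. As described above, the training objective of the autoencoder is to make the implicit vector of each node fit its node features. Thus, in one example, the prediction loss L can be expressed as:
$L = \sum_{i \in S} \lVert h(i) - x(i) \rVert$ (11)
where the prediction distance $\lVert h(i) - x(i) \rVert$ can be determined by cosine distance, Euclidean distance, or the like.
Formula (11) sums the prediction distances of the nodes to obtain the prediction loss. Alternatively, the sum of the squares of the prediction distances of the nodes can be used as the prediction loss. In other examples, the number of nodes in the sample node set S can also be determined, and the average of the prediction distances of the nodes, or the average of the squared prediction distances, can be used as the prediction loss L.
Once the prediction loss L has been determined, the model parameters of the autoencoder can be adjusted in the direction that reduces L, specifically including the transformation matrix parameters and offset parameters of formulas (1) to (10) in the LSTM layer, thereby updating the autoencoder. Specifically, the parameters of the autoencoder can be tuned by methods such as gradient descent and backpropagation.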
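The variants of the prediction loss mentioned above can be sketched with a short function. This is a minimal sketch under assumptions: the function and parameter names are hypothetical, Euclidean distance is chosen as one of the permitted distance measures, and the gradient-based parameter update itself is not shown.

```python
import math
from typing import Dict, List

Vector = List[float]


def euclidean(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def prediction_loss(
    hidden: Dict[str, Vector],     # implicit vector h(i) of each node in S
    features: Dict[str, Vector],   # node features x(i) of each node in S
    squared: bool = False,
    mean: bool = False,
) -> float:
    """Aggregate the prediction distances ||h(i) - x(i)|| over the node set."""
    distances = [euclidean(hidden[node], features[node]) for node in hidden]
    if squared:
        distances = [d * d for d in distances]
    total = sum(distances)
    return total / len(distances) if mean else total


hidden = {"a": [0.0, 0.0], "b": [1.0, 1.0]}
features = {"a": [3.0, 4.0], "b": [1.0, 1.0]}
loss_sum = prediction_loss(hidden, features)
loss_mean = prediction_loss(hidden, features, mean=True)
loss_squared = prediction_loss(hidden, features, squared=True)
```

The default call corresponds to the plain sum of formula (11); the `squared` and `mean` flags select the sum of squares and the averaged variants.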
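By repeatedly selecting sample events of the predetermined category and executing steps 21 to 27 above, the autoencoder can be continuously updated and optimized, finally yielding an autoencoder dedicated to evaluating interaction events based on the dynamic interaction graph. For sample events of the predetermined category, this autoencoder can obtain the implicit vector of each node involved, such that the prediction loss derived from the distances between the implicit vectors and the node features is minimized, for example falls below a certain threshold.
The trained autoencoder can then be used to analyze and evaluate unknown events.
Fig. 15 shows a flowchart of a method for evaluating interaction events using an autoencoder according to an embodiment. It can be understood that the method can be executed by any apparatus, device, platform, or device cluster with computing and processing capabilities. As shown in Fig. 15, the method for evaluating interaction events may include the following steps.
In step 151, a dynamic interaction graph reflecting the association relationship of interaction events is acquired. The way the dynamic interaction graph is constructed and its structural characteristics are as described above with reference to step 21 and are not repeated here.
In step 152, the first target node and the second target node corresponding to the target event to be analyzed are taken as root nodes, and a first target subgraph and a second target subgraph formed by the nodes within a predetermined range that are reached from the root node via connecting edges are determined in the dynamic interaction graph. This step is executed in a way similar to step 22, except that the target event in this step is an event of unknown category that is to be analyzed. The two nodes involved in the event are called the first target node and the second target node. The process of determining the first target subgraph and the second target subgraph in the dynamic interaction graph based on the first target node and the second target node corresponds to step 22 and is not repeated here.
In step 153, an autoencoder trained according to the method of Fig. 2 is acquired.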
在步骤154,分别将第一目标子图和第二目标子图输入到自编码器,得到第一目标节点和第二目标节点各自对应的隐含向量,分别称为第一目标向量和第二目标向量。自编码器对于输入其中的第一目标子图和第二目标子图分别进行迭代处理的过程,可以参照步骤24的描述。
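In addition, in step 155, a reverse target subgraph is formed by reversing the parent-child relationships between nodes in the first target subgraph and the second target subgraph and merging the reversed subgraphs; the reverse target subgraph includes a target node set formed by the union of the nodes in the first and second target subgraphs. The reverse target subgraph is formed in a manner similar to step 25 above, which is not repeated here.
Then, in step 156, the reverse target subgraph is input to the trained autoencoder to obtain the hidden vector of each node in the target node set, where the hidden vectors of the leaf nodes of the reverse target subgraph are determined from the first target vector and the second target vector. This step corresponds to step 26 above and is not repeated here.
Next, in step 157, an aggregate result of the distances between the hidden vector and the node feature of each node in the target node set is determined. The aggregate result is determined in a manner corresponding to how the prediction loss L is determined when training the autoencoder.
Further, in step 158, whether the target event is an event of the predetermined category is evaluated by comparing the aggregate result with a predetermined threshold.
As described above, after the training shown in FIG. 2, for the nodes involved in events of the predetermined category, the hidden vectors produced by the autoencoder fit the node features well, so that the aggregate distance between hidden vectors and node features is minimized. A threshold can therefore be set, and whether the target event is an event of the predetermined category is judged by comparing the aggregate result for the target event under analysis with that threshold. If the aggregate result is below the threshold, the target event is determined to belong to the predetermined category; if it reaches or exceeds the threshold, the target event is determined not to belong to the predetermined category.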
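The threshold comparison of steps 157 and 158 can be sketched as follows. The function name `evaluate_event` and the `average` flag are assumptions for the example; the per-node distances are taken as given, and the aggregation must match whichever form of the prediction loss was used in training.

```python
from typing import Iterable


def evaluate_event(
    node_distances: Iterable[float], threshold: float, average: bool = False
) -> bool:
    """Return True if the target event is judged to be of the predetermined category.

    The aggregate is the sum (or mean) of the per-node distances between
    hidden vectors and node features. Only a value strictly below the
    threshold counts as a match; reaching the threshold does not.
    """
    distances = list(node_distances)
    if not distances:
        raise ValueError("at least one node distance is required")
    aggregate = sum(distances)
    if average:
        aggregate /= len(distances)
    return aggregate < threshold
```

For instance, `evaluate_event([0.25, 0.25], 0.5)` evaluates to `False`, because an aggregate that reaches the threshold is treated as not belonging to the predetermined category.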
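In one embodiment, the target event may be a hypothetical event and, correspondingly, events of the predetermined category are events in which an interaction is confirmed to have occurred. Judging with the autoencoder whether the target event is an event of the predetermined category can then be used to evaluate whether two nodes in the dynamic interaction graph will interact next, for example whether a user will click a certain page or page block.
In another embodiment, the target event is an event in which an interaction has already occurred, and events of the predetermined category are events having a certain property, for example events confirmed to be safe. Judging with the autoencoder whether the target event is an event of the predetermined category can then be used to evaluate whether an interaction event that has occurred is a safe event or carries a high security risk. For example, when a user issues a payment request to transfer money to another user, the two interact. The autoencoder can be used to judge whether this interaction event is a normal transaction or a fraudulent transaction with a security risk, including transactions made with a stolen account, cash-out transactions, and so on. As another example, when a user issues a login request to a website, an interaction event occurs between the user and the website. The autoencoder can be used to judge whether this event is a normal login or an abnormal event, such as a hacker attack or a login attempt with a stolen account.
In this way, the trained autoencoder makes it possible to analyze and evaluate interaction events more accurately and effectively on the basis of the dynamic interaction graph.
In summary, in the solutions of the embodiments of this specification, a dynamic interaction graph is constructed from a sequence of interaction events, and an autoencoder for evaluating interaction events is trained on the basis of that graph. For interaction events of a predetermined category, the autoencoder can, through the subgraphs and the reverse subgraph involved in the event, represent the nodes involved in the event as hidden vectors such that the hidden vectors closely fit the node features. The trained autoencoder can thus be used to analyze and evaluate unknown interaction events.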
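According to an embodiment of another aspect, an apparatus for training an autoencoder for evaluating interaction events is provided. The apparatus can be deployed in any device, platform, or device cluster with computing and processing capabilities. FIG. 16 is a schematic block diagram of an apparatus for training an autoencoder according to one embodiment. As shown in FIG. 16, the training apparatus 160 includes the following units.
An interaction graph acquisition unit 161, configured to acquire a dynamic interaction graph reflecting the association relationships of interaction events, the graph including multiple pairs of nodes, each pair representing the two objects in one interaction event, where any node i is connected by connecting edges to two child nodes, the two child nodes being the two nodes corresponding to the previous interaction event in which the object represented by node i participated.
A sample subgraph acquisition unit 162, configured to take the first sample node and the second sample node corresponding to a sample interaction event of a predetermined category as root nodes, respectively, and to determine in the dynamic interaction graph a first sample subgraph and a second sample subgraph, each formed by the nodes within a predetermined range reachable from the respective root node via connecting edges.
An encoder acquisition unit 163, configured to acquire an autoencoder to be trained, the autoencoder including an LSTM layer that, according to the parent-child relationships between nodes in an input subgraph, iteratively processes the nodes in order from the leaf nodes to the root node, where the iterative processing includes determining the hidden vector of the currently processed node based at least on the node feature of the currently processed node and the hidden vectors of its two child nodes.
A sample subgraph processing unit 164, configured to input the first sample subgraph and the second sample subgraph separately to the autoencoder to obtain a first sample vector corresponding to the first sample node and a second sample vector corresponding to the second sample node.
A reverse subgraph forming unit 165, configured to form a reverse sample subgraph by reversing the parent-child relationships between nodes in the first sample subgraph and the second sample subgraph and merging the reversed subgraphs; the reverse sample subgraph includes a sample node set formed by the union of the nodes in the first sample subgraph and the second sample subgraph.
A reverse subgraph processing unit 166, configured to input the reverse sample subgraph to the autoencoder to obtain the hidden vector of each node in the sample node set, where the hidden vectors of the leaf nodes of the reverse sample subgraph are determined from the first sample vector and the second sample vector.
An updating unit 167, configured to determine a prediction loss from the aggregate of the distances between the hidden vector and the node feature of each node in the sample node set, and to update the autoencoder in the direction that reduces the prediction loss.
In different embodiments, the nodes within the predetermined range selected by the sample subgraph acquisition unit 162 include: child nodes up to order K, reachable within a preset number K of connecting edges; and/or child nodes whose interaction time falls within a preset time range.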
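The two range criteria (at most K connecting edges, and/or an interaction time within a preset range) can be sketched as a breadth-first traversal. The graph encoding (`children` mapping a node to its child nodes), the `event_time` lookup, the parameter names, and the reading of the time range as a lower bound `earliest_time` are assumptions made for the example.

```python
from collections import deque
from typing import Dict, Optional, Sequence, Set


def collect_subgraph(
    root: str,
    children: Dict[str, Sequence[str]],
    max_hops: Optional[int] = None,
    event_time: Optional[Dict[str, float]] = None,
    earliest_time: Optional[float] = None,
) -> Set[str]:
    """Collect the nodes reachable from root within the predetermined range.

    max_hops limits the number of connecting edges (K); earliest_time, if
    given together with event_time, excludes nodes whose interaction time
    is older than the preset time range.
    """
    selected = {root}
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        if max_hops is not None and depth >= max_hops:
            continue  # do not expand beyond K edges
        for child in children.get(node, ()):
            if child in selected:
                continue
            if (
                earliest_time is not None
                and event_time is not None
                and event_time[child] < earliest_time
            ):
                continue  # outside the preset time range
            selected.add(child)
            queue.append((child, depth + 1))
    return selected


# Hypothetical graph: each node points to the two nodes of an earlier event.
toy_children = {"r": ("a", "b"), "a": ("c", "d"), "b": ("d", "e")}
toy_times = {"r": 6.0, "a": 5.0, "b": 4.0, "c": 3.0, "d": 2.0, "e": 1.0}
within_one_hop = collect_subgraph("r", toy_children, max_hops=1)
within_time = collect_subgraph(
    "r", toy_children, event_time=toy_times, earliest_time=4.0
)
```

Both criteria can be supplied at once, in which case a node must satisfy both to be included.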
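In one implementation, the node feature of the currently processed node on which the iterative processing of the LSTM layer is based may include attribute features of the object corresponding to that node.
In a more specific embodiment, when the currently processed node is a user node, the attribute features may include at least one of the following: age, occupation, education level, region, registration duration, and population label; or, when the currently processed node is an item node, the attribute features include at least one of the following: item category, listing time, number of reviews, and sales volume.
Further, in one embodiment, the node feature of the currently processed node may also include event behavior features of the interaction event corresponding to that node.
According to one implementation, the LSTM layer of the autoencoder is configured to: combine the node feature of the currently processed node with the hidden vector of each of the two child nodes, and input each combination to a first transformation function and a second transformation function to obtain two first transformation vectors and two second transformation vectors; perform a combination operation on the intermediate vector, used for auxiliary computation, of the i-th of the two child nodes together with the corresponding i-th first transformation vector and i-th second transformation vector to obtain two operation results, and sum the two operation results to obtain a combination vector; input the node feature of the currently processed node together with the hidden vectors of the two child nodes to a third transformation function and a fourth transformation function to obtain a third transformation vector and a fourth transformation vector, respectively; determine the intermediate vector of the currently processed node based on the combination vector and the third transformation vector; and determine the hidden vector of the currently processed node based on the intermediate vector of the currently processed node and the fourth transformation vector.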
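The node update described above can be sketched in plain Python. Because formulas (1)-(10) are outside this passage, several details are assumptions rather than the specification's definitions: the first, second, and fourth transformations are taken to be sigmoid-activated affine maps, the third a tanh-activated affine map, the combination operation an element-wise product of the three vectors, the intermediate vector the sum of the combination vector and the third transformation vector, and the hidden vector the fourth transformation vector multiplied element-wise by the tanh of the intermediate vector. The parameter names `W1`..`W4` and `b1`..`b4` are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]


def affine(weight: Matrix, inputs: Sequence[float], bias: Sequence[float]) -> Vector:
    return [sum(w * v for w, v in zip(row, inputs)) + b for row, b in zip(weight, bias)]


def sigmoid(values: Sequence[float]) -> Vector:
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def node_update(
    x: Vector,
    children: Sequence[Tuple[Vector, Vector]],
    params: Dict[str, list],
) -> Tuple[Vector, Vector]:
    """One node update; children holds two (hidden, intermediate) pairs.

    Returns (hidden, intermediate) of the currently processed node.
    """
    combination = [0.0] * len(children[0][0])
    for h_k, c_k in children:
        pair_input = list(x) + list(h_k)  # node feature combined with one child
        first = sigmoid(affine(params["W1"], pair_input, params["b1"]))
        second = sigmoid(affine(params["W2"], pair_input, params["b2"]))
        # Assumed combination operation: element-wise product, summed over children.
        combination = [
            acc + f * s * c for acc, f, s, c in zip(combination, first, second, c_k)
        ]
    joint_input = list(x) + list(children[0][0]) + list(children[1][0])
    third = [math.tanh(v) for v in affine(params["W3"], joint_input, params["b3"])]
    fourth = sigmoid(affine(params["W4"], joint_input, params["b4"]))
    intermediate = [a + t for a, t in zip(combination, third)]
    hidden = [o * math.tanh(c) for o, c in zip(fourth, intermediate)]
    return hidden, intermediate


# Usage with all-zero parameters (feature dimension 1, hidden dimension 1).
zero_params = {
    "W1": [[0.0, 0.0]], "b1": [0.0],
    "W2": [[0.0, 0.0]], "b2": [0.0],
    "W3": [[0.0, 0.0, 0.0]], "b3": [0.0],
    "W4": [[0.0, 0.0, 0.0]], "b4": [0.0],
}
h_out, c_out = node_update([1.0], [([0.0], [2.0]), ([0.0], [2.0])], zero_params)
```

With zero parameters every sigmoid gate equals 0.5 and the tanh term is 0, so each child contributes 0.5 × 0.5 × 2.0 to the combination vector; the example is useful mainly for checking the data flow between the four transformations.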
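According to another implementation, the LSTM layer is configured to determine the hidden vector of the currently processed node from the node feature of the currently processed node, the hidden vectors of the two child nodes, and the time difference between a first interaction time of the interaction event in which the currently processed node is located and a second interaction time of the interaction event in which the two child nodes are located.
In one embodiment of this implementation, the LSTM layer is specifically configured to: combine the node feature of the currently processed node and the time difference with the hidden vector of each of the two child nodes, and input each combination to a first transformation function to obtain two first transformation vectors; combine the node feature with the hidden vector of each of the two child nodes, and input each combination to a second transformation function to obtain two second transformation vectors; perform a combination operation on the intermediate vector, used for auxiliary computation, of the i-th of the two child nodes together with the corresponding i-th first transformation vector and i-th second transformation vector to obtain two operation results, and sum the two operation results to obtain a combination vector; input the node feature of the currently processed node together with the hidden vectors of the two child nodes to a third transformation function and a fourth transformation function to obtain a third transformation vector and a fourth transformation vector, respectively; determine the intermediate vector of the currently processed node based on the combination vector and the third transformation vector; and determine the hidden vector of the currently processed node based on the intermediate vector of the currently processed node and the fourth transformation vector.
In one embodiment, the reverse subgraph forming unit 165 further includes (not shown): a first reversal module, configured to reverse the parent-child relationships between nodes in the first sample subgraph to form a first reverse subgraph with the first sample node as a leaf node; a second reversal module, configured to reverse the parent-child relationships between nodes in the second sample subgraph to form a second reverse subgraph with the second sample node as a leaf node; a merging module, configured to merge the nodes common to the first reverse subgraph and the second reverse subgraph to form a merged subgraph; and an adding module, configured to add, in the merged subgraph, a default child node to each node that has only one child node, thereby forming the reverse sample subgraph.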
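The reversal, merging, and default-child steps can be sketched as follows. The subgraph encoding (a dictionary from a node to its list of child nodes), the node names, and the single `"<default>"` marker are assumptions for the example; merging of common nodes happens implicitly because nodes are keyed by identifier.

```python
from typing import Dict, List


def build_reverse_subgraph(
    first_children: Dict[str, List[str]],
    second_children: Dict[str, List[str]],
    default: str = "<default>",
) -> Dict[str, List[str]]:
    """Reverse both subgraphs, merge shared nodes, and pad single-child nodes.

    In the result, the children of a node are the nodes that were its
    parents in the forward subgraphs; the forward roots become leaves.
    """
    reversed_children: Dict[str, List[str]] = {}
    for subgraph in (first_children, second_children):
        for parent, kids in subgraph.items():
            reversed_children.setdefault(parent, [])
            for kid in kids:
                bucket = reversed_children.setdefault(kid, [])
                if parent not in bucket:  # shared nodes are merged by key
                    bucket.append(parent)
    for kids in reversed_children.values():
        if len(kids) == 1:
            kids.append(default)  # add a default child node
    return reversed_children


# Hypothetical forward subgraphs rooted at "u6" and "v6", sharing "w3" and "q3".
first = {"u6": ["u4", "w4"], "w4": ["w3", "q3"]}
second = {"v6": ["v5", "q5"], "q5": ["w3", "q3"]}
reverse = build_reverse_subgraph(first, second)
```

A variant with distinct left-side and right-side default markers would follow the same pattern, but it requires a convention for which side the existing child occupies, which this sketch does not model.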
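In a specific embodiment, the leaf nodes of the reverse sample subgraph include a first leaf node corresponding to the first sample node and a second leaf node corresponding to the second sample node. According to one implementation, the hidden vector of the first leaf node is the first sample vector, and the hidden vector of the second leaf node is the second sample vector.
According to another implementation, the reverse subgraph processing unit 166 is configured to determine the hidden vector of the first leaf node based on the node feature of the first leaf node, with the first sample vector and the second sample vector taken as the hidden vectors of its two child nodes; and to determine the hidden vector of the second leaf node based on the node feature of the second leaf node, with the first sample vector and the second sample vector taken as the hidden vectors of its two child nodes.
In one embodiment, the autoencoder includes multiple LSTM layers, where the hidden vector of the currently processed node determined by one LSTM layer is input to the next LSTM layer as the node feature of that node.
In this case, the updating unit 167 may be configured to: for each node in the sample node set, determine the distance between the hidden vector of that node output by the last of the multiple LSTM layers and the node feature of that node input to the first LSTM layer; and determine the prediction loss from the aggregate result of the distances corresponding to the individual nodes.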
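According to an embodiment of yet another aspect, an apparatus for evaluating interaction events using an autoencoder is provided. The apparatus can be deployed in any device, platform, or device cluster with computing and processing capabilities. FIG. 17 is a schematic block diagram of an apparatus for evaluating interaction events according to one embodiment. As shown in FIG. 17, the evaluation apparatus 170 includes the following units.
An interaction graph acquisition unit 171, configured to acquire a dynamic interaction graph reflecting the association relationships of interaction events, the graph including multiple pairs of nodes, each pair representing the two objects in one interaction event, where any node i is connected by connecting edges to two child nodes, the two child nodes being the two nodes corresponding to the previous interaction event in which the object represented by node i participated.
A target subgraph acquisition unit 172, configured to take the first target node and the second target node corresponding to the target event to be analyzed as root nodes, and to determine in the dynamic interaction graph a first target subgraph and a second target subgraph, each formed by the nodes within a predetermined range reachable from the respective root node via connecting edges.
An encoder acquisition unit 173, configured to acquire an autoencoder trained by the training apparatus described above.
A target subgraph processing unit 174, configured to input the first target subgraph and the second target subgraph separately to the autoencoder to obtain a first target vector corresponding to the first target node and a second target vector corresponding to the second target node.
A reverse subgraph forming unit 175, configured to form a reverse target subgraph by reversing the parent-child relationships between nodes in the first target subgraph and the second target subgraph and merging the reversed subgraphs; the reverse target subgraph includes a target node set formed by the union of the nodes in the first target subgraph and the second target subgraph.
A reverse subgraph processing unit 176, configured to input the reverse target subgraph to the autoencoder to obtain the hidden vector of each node in the target node set, where the hidden vectors of the leaf nodes of the reverse target subgraph are determined from the first target vector and the second target vector.
An aggregation unit 177, configured to determine an aggregate result of the distances between the hidden vector and the node feature of each node in the target node set.
An evaluation unit 178, configured to evaluate whether the target event is an event of the predetermined category by comparing the aggregate result with a predetermined threshold.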
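In one embodiment, the target event is a hypothetical event, and events of the predetermined category are events confirmed to have occurred.
In another embodiment, the target event is an event that has occurred, and events of the predetermined category are events confirmed to be safe.
With the apparatus 160 above, an autoencoder is trained on the basis of the dynamic interaction graph; with the apparatus 170 above, the trained autoencoder can be used to evaluate and analyze interaction events.
According to an embodiment of another aspect, a computer-readable storage medium is also provided, on which a computer program is stored; when the computer program is executed in a computer, it causes the computer to perform the methods described in connection with FIG. 2 and FIG. 15.
According to an embodiment of yet another aspect, a computing device is also provided, including a memory and a processor, where the memory stores executable code and the processor, when executing the executable code, implements the methods described in connection with FIG. 2 and FIG. 15.
Those skilled in the art should appreciate that, in one or more of the above examples, the functions described in the present invention can be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions can be stored in a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium.
The specific embodiments described above explain the objectives, technical solutions, and beneficial effects of the present invention in further detail. It should be understood that the above are merely specific embodiments of the present invention and are not intended to limit its scope of protection; any modification, equivalent replacement, improvement, or the like made on the basis of the technical solutions of the present invention shall fall within the scope of protection of the present invention.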
Claims (34)
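- A method for training an autoencoder for evaluating interaction events, the method comprising: acquiring a dynamic interaction graph reflecting the association relationships of interaction events, the graph including multiple pairs of nodes, each pair representing the two objects in one interaction event, wherein any node i is connected by connecting edges to two child nodes, the two child nodes being the two nodes corresponding to the previous interaction event in which the object represented by node i participated; taking the first sample node and the second sample node corresponding to a sample interaction event of a predetermined category as root nodes, respectively, and determining in the dynamic interaction graph a first sample subgraph and a second sample subgraph, each formed by the nodes within a predetermined range reachable from the respective root node via connecting edges; acquiring an autoencoder to be trained, the autoencoder including an LSTM layer that, according to the parent-child relationships between nodes in an input subgraph, iteratively processes the nodes in order from the leaf nodes to the root node, wherein the iterative processing includes determining the hidden vector of the currently processed node based at least on the node feature of the currently processed node and the hidden vectors of its two child nodes; inputting the first sample subgraph and the second sample subgraph separately to the autoencoder to obtain a first sample vector corresponding to the first sample node and a second sample vector corresponding to the second sample node; forming a reverse sample subgraph by reversing the parent-child relationships between nodes in the first sample subgraph and the second sample subgraph and merging the reversed subgraphs, the reverse sample subgraph including a sample node set formed by the union of the nodes in the first sample subgraph and the second sample subgraph; inputting the reverse sample subgraph to the autoencoder to obtain the hidden vector of each node in the sample node set, wherein the hidden vectors of the leaf nodes of the reverse sample subgraph are determined from the first sample vector and the second sample vector; and determining a prediction loss from the aggregate of the distances between the hidden vector and the node feature of each node in the sample node set, and updating the autoencoder in the direction that reduces the prediction loss.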
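- The method according to claim 1, wherein the nodes within the predetermined range include: child nodes up to order K, reachable within a preset number K of connecting edges; and/or child nodes whose interaction time falls within a preset time range.
- The method according to claim 1, wherein the node feature of the currently processed node includes attribute features of the object corresponding to the currently processed node.
- The method according to claim 3, wherein the currently processed node is a user node and the attribute features include at least one of the following: age, occupation, education level, region, registration duration, and population label; or the currently processed node is an item node and the attribute features include at least one of the following: item category, listing time, number of reviews, and sales volume.
- The method according to claim 3, wherein the node feature of the currently processed node further includes event behavior features of the interaction event corresponding to the currently processed node.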
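- The method according to claim 1, wherein determining the hidden vector of the currently processed node comprises: combining the node feature of the currently processed node with the hidden vector of each of the two child nodes, and inputting each combination to a first transformation function and a second transformation function to obtain two first transformation vectors and two second transformation vectors; performing a combination operation on the intermediate vector, used for auxiliary computation, of the i-th of the two child nodes together with the corresponding i-th first transformation vector and i-th second transformation vector to obtain two operation results, and summing the two operation results to obtain a combination vector; inputting the node feature of the currently processed node together with the hidden vectors of the two child nodes to a third transformation function and a fourth transformation function to obtain a third transformation vector and a fourth transformation vector, respectively; determining the intermediate vector of the currently processed node based on the combination vector and the third transformation vector; and determining the hidden vector of the currently processed node based on the intermediate vector of the currently processed node and the fourth transformation vector.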
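- The method according to claim 1, wherein the iterative processing includes determining the hidden vector of the currently processed node from the node feature of the currently processed node, the hidden vectors of the two child nodes, and the time difference between a first interaction time of the interaction event in which the currently processed node is located and a second interaction time of the interaction event in which the two child nodes are located.
- The method according to claim 7, wherein determining the hidden vector of the currently processed node comprises: combining the node feature of the currently processed node and the time difference with the hidden vector of each of the two child nodes, and inputting each combination to a first transformation function to obtain two first transformation vectors; combining the node feature with the hidden vector of each of the two child nodes, and inputting each combination to a second transformation function to obtain two second transformation vectors; performing a combination operation on the intermediate vector, used for auxiliary computation, of the i-th of the two child nodes together with the corresponding i-th first transformation vector and i-th second transformation vector to obtain two operation results, and summing the two operation results to obtain a combination vector; inputting the node feature of the currently processed node together with the hidden vectors of the two child nodes to a third transformation function and a fourth transformation function to obtain a third transformation vector and a fourth transformation vector, respectively; determining the intermediate vector of the currently processed node based on the combination vector and the third transformation vector; and determining the hidden vector of the currently processed node based on the intermediate vector of the currently processed node and the fourth transformation vector.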
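- The method according to claim 1, wherein forming the reverse sample subgraph comprises: reversing the parent-child relationships between nodes in the first sample subgraph to form a first reverse subgraph with the first sample node as a leaf node; reversing the parent-child relationships between nodes in the second sample subgraph to form a second reverse subgraph with the second sample node as a leaf node; merging the nodes common to the first reverse subgraph and the second reverse subgraph to form a merged subgraph; and, in the merged subgraph, adding a default child node to each node that has only one child node, thereby forming the reverse sample subgraph.
- The method according to claim 1, wherein the leaf nodes of the reverse sample subgraph include a first leaf node corresponding to the first sample node and a second leaf node corresponding to the second sample node; the hidden vector of the first leaf node is the first sample vector, and the hidden vector of the second leaf node is the second sample vector.
- The method according to claim 1, wherein the leaf nodes of the reverse sample subgraph include a first leaf node corresponding to the first sample node and a second leaf node corresponding to the second sample node; the hidden vector of the first leaf node is determined based on the node feature of the first leaf node, with the first sample vector and the second sample vector taken as the hidden vectors of its two child nodes; and the hidden vector of the second leaf node is determined based on the node feature of the second leaf node, with the first sample vector and the second sample vector taken as the hidden vectors of its two child nodes.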
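- The method according to claim 1, wherein the autoencoder includes multiple LSTM layers, and the hidden vector of the currently processed node determined by one LSTM layer is input to the next LSTM layer as the node feature of the currently processed node.
- The method according to claim 12, wherein determining the prediction loss from the aggregate of the distances between the hidden vector and the node feature of each node in the sample node set comprises: for each node in the sample node set, determining the distance between the hidden vector of that node output by the last of the multiple LSTM layers and the node feature of that node input to the first LSTM layer; and determining the prediction loss from the aggregate result of the distances corresponding to the individual nodes.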
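- A method for evaluating interaction events using an autoencoder, the method comprising: acquiring a dynamic interaction graph reflecting the association relationships of interaction events, the graph including multiple pairs of nodes, each pair representing the two objects in one interaction event, wherein any node i is connected by connecting edges to two child nodes, the two child nodes being the two nodes corresponding to the previous interaction event in which the object represented by node i participated; taking the first target node and the second target node corresponding to the target event to be analyzed as root nodes, and determining in the dynamic interaction graph a first target subgraph and a second target subgraph, each formed by the nodes within a predetermined range reachable from the respective root node via connecting edges; acquiring an autoencoder trained according to the method of claim 1; inputting the first target subgraph and the second target subgraph separately to the autoencoder to obtain a first target vector corresponding to the first target node and a second target vector corresponding to the second target node; forming a reverse target subgraph by reversing the parent-child relationships between nodes in the first target subgraph and the second target subgraph and merging the reversed subgraphs, the reverse target subgraph including a target node set formed by the union of the nodes in the first target subgraph and the second target subgraph; inputting the reverse target subgraph to the autoencoder to obtain the hidden vector of each node in the target node set, wherein the hidden vectors of the leaf nodes of the reverse target subgraph are determined from the first target vector and the second target vector; determining an aggregate result of the distances between the hidden vector and the node feature of each node in the target node set; and evaluating whether the target event is an event of the predetermined category by comparing the aggregate result with a predetermined threshold.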
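- The method according to claim 14, wherein the target event is a hypothetical event, and events of the predetermined category are events confirmed to have occurred.
- The method according to claim 14, wherein the target event is an event that has occurred, and events of the predetermined category are events confirmed to be safe.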
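- An apparatus for training an autoencoder for evaluating interaction events, the apparatus comprising: an interaction graph acquisition unit, configured to acquire a dynamic interaction graph reflecting the association relationships of interaction events, the graph including multiple pairs of nodes, each pair representing the two objects in one interaction event, wherein any node i is connected by connecting edges to two child nodes, the two child nodes being the two nodes corresponding to the previous interaction event in which the object represented by node i participated; a sample subgraph acquisition unit, configured to take the first sample node and the second sample node corresponding to a sample interaction event of a predetermined category as root nodes, respectively, and to determine in the dynamic interaction graph a first sample subgraph and a second sample subgraph, each formed by the nodes within a predetermined range reachable from the respective root node via connecting edges; an encoder acquisition unit, configured to acquire an autoencoder to be trained, the autoencoder including an LSTM layer that, according to the parent-child relationships between nodes in an input subgraph, iteratively processes the nodes in order from the leaf nodes to the root node, wherein the iterative processing includes determining the hidden vector of the currently processed node based at least on the node feature of the currently processed node and the hidden vectors of its two child nodes; a sample subgraph processing unit, configured to input the first sample subgraph and the second sample subgraph separately to the autoencoder to obtain a first sample vector corresponding to the first sample node and a second sample vector corresponding to the second sample node; a reverse subgraph forming unit, configured to form a reverse sample subgraph by reversing the parent-child relationships between nodes in the first sample subgraph and the second sample subgraph and merging the reversed subgraphs, the reverse sample subgraph including a sample node set formed by the union of the nodes in the first sample subgraph and the second sample subgraph; a reverse subgraph processing unit, configured to input the reverse sample subgraph to the autoencoder to obtain the hidden vector of each node in the sample node set, wherein the hidden vectors of the leaf nodes of the reverse sample subgraph are determined from the first sample vector and the second sample vector; and an updating unit, configured to determine a prediction loss from the aggregate of the distances between the hidden vector and the node feature of each node in the sample node set, and to update the autoencoder in the direction that reduces the prediction loss.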
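- The apparatus according to claim 17, wherein the nodes within the predetermined range include: child nodes up to order K, reachable within a preset number K of connecting edges; and/or child nodes whose interaction time falls within a preset time range.
- The apparatus according to claim 17, wherein the node feature of the currently processed node includes attribute features of the object corresponding to the currently processed node.
- The apparatus according to claim 19, wherein the currently processed node is a user node and the attribute features include at least one of the following: age, occupation, education level, region, registration duration, and population label; or the currently processed node is an item node and the attribute features include at least one of the following: item category, listing time, number of reviews, and sales volume.
- The apparatus according to claim 19, wherein the node feature of the currently processed node further includes event behavior features of the interaction event corresponding to the currently processed node.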
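- The apparatus according to claim 17, wherein the LSTM layer is configured to: combine the node feature of the currently processed node with the hidden vector of each of the two child nodes, and input each combination to a first transformation function and a second transformation function to obtain two first transformation vectors and two second transformation vectors; perform a combination operation on the intermediate vector, used for auxiliary computation, of the i-th of the two child nodes together with the corresponding i-th first transformation vector and i-th second transformation vector to obtain two operation results, and sum the two operation results to obtain a combination vector; input the node feature of the currently processed node together with the hidden vectors of the two child nodes to a third transformation function and a fourth transformation function to obtain a third transformation vector and a fourth transformation vector, respectively; determine the intermediate vector of the currently processed node based on the combination vector and the third transformation vector; and determine the hidden vector of the currently processed node based on the intermediate vector of the currently processed node and the fourth transformation vector.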
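- The apparatus according to claim 17, wherein the LSTM layer is configured to determine the hidden vector of the currently processed node from the node feature of the currently processed node, the hidden vectors of the two child nodes, and the time difference between a first interaction time of the interaction event in which the currently processed node is located and a second interaction time of the interaction event in which the two child nodes are located.
- The apparatus according to claim 23, wherein the LSTM layer is configured to: combine the node feature of the currently processed node and the time difference with the hidden vector of each of the two child nodes, and input each combination to a first transformation function to obtain two first transformation vectors; combine the node feature with the hidden vector of each of the two child nodes, and input each combination to a second transformation function to obtain two second transformation vectors; perform a combination operation on the intermediate vector, used for auxiliary computation, of the i-th of the two child nodes together with the corresponding i-th first transformation vector and i-th second transformation vector to obtain two operation results, and sum the two operation results to obtain a combination vector; input the node feature of the currently processed node together with the hidden vectors of the two child nodes to a third transformation function and a fourth transformation function to obtain a third transformation vector and a fourth transformation vector, respectively; determine the intermediate vector of the currently processed node based on the combination vector and the third transformation vector; and determine the hidden vector of the currently processed node based on the intermediate vector of the currently processed node and the fourth transformation vector.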
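- The apparatus according to claim 17, wherein the reverse subgraph forming unit includes: a first reversal module, configured to reverse the parent-child relationships between nodes in the first sample subgraph to form a first reverse subgraph with the first sample node as a leaf node; a second reversal module, configured to reverse the parent-child relationships between nodes in the second sample subgraph to form a second reverse subgraph with the second sample node as a leaf node; a merging module, configured to merge the nodes common to the first reverse subgraph and the second reverse subgraph to form a merged subgraph; and an adding module, configured to add, in the merged subgraph, a default child node to each node that has only one child node, thereby forming the reverse sample subgraph.
- The apparatus according to claim 17, wherein the leaf nodes of the reverse sample subgraph include a first leaf node corresponding to the first sample node and a second leaf node corresponding to the second sample node; the hidden vector of the first leaf node is the first sample vector, and the hidden vector of the second leaf node is the second sample vector.
- The apparatus according to claim 17, wherein the leaf nodes of the reverse sample subgraph include a first leaf node corresponding to the first sample node and a second leaf node corresponding to the second sample node; the reverse subgraph processing unit is configured to determine the hidden vector of the first leaf node based on the node feature of the first leaf node, with the first sample vector and the second sample vector taken as the hidden vectors of its two child nodes; and to determine the hidden vector of the second leaf node based on the node feature of the second leaf node, with the first sample vector and the second sample vector taken as the hidden vectors of its two child nodes.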
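- The apparatus according to claim 17, wherein the autoencoder includes multiple LSTM layers, and the hidden vector of the currently processed node determined by one LSTM layer is input to the next LSTM layer as the node feature of the currently processed node.
- The apparatus according to claim 28, wherein the updating unit is configured to: for each node in the sample node set, determine the distance between the hidden vector of that node output by the last of the multiple LSTM layers and the node feature of that node input to the first LSTM layer; and determine the prediction loss from the aggregate result of the distances corresponding to the individual nodes.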
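- An apparatus for evaluating interaction events using an autoencoder, the apparatus comprising: an interaction graph acquisition unit, configured to acquire a dynamic interaction graph reflecting the association relationships of interaction events, the graph including multiple pairs of nodes, each pair representing the two objects in one interaction event, wherein any node i is connected by connecting edges to two child nodes, the two child nodes being the two nodes corresponding to the previous interaction event in which the object represented by node i participated; a target subgraph acquisition unit, configured to take the first target node and the second target node corresponding to the target event to be analyzed as root nodes, and to determine in the dynamic interaction graph a first target subgraph and a second target subgraph, each formed by the nodes within a predetermined range reachable from the respective root node via connecting edges; an encoder acquisition unit, configured to acquire an autoencoder trained by the apparatus of claim 17; a target subgraph processing unit, configured to input the first target subgraph and the second target subgraph separately to the autoencoder to obtain a first target vector corresponding to the first target node and a second target vector corresponding to the second target node; a reverse subgraph forming unit, configured to form a reverse target subgraph by reversing the parent-child relationships between nodes in the first target subgraph and the second target subgraph and merging the reversed subgraphs, the reverse target subgraph including a target node set formed by the union of the nodes in the first target subgraph and the second target subgraph; a reverse subgraph processing unit, configured to input the reverse target subgraph to the autoencoder to obtain the hidden vector of each node in the target node set, wherein the hidden vectors of the leaf nodes of the reverse target subgraph are determined from the first target vector and the second target vector; an aggregation unit, configured to determine an aggregate result of the distances between the hidden vector and the node feature of each node in the target node set; and an evaluation unit, configured to evaluate whether the target event is an event of the predetermined category by comparing the aggregate result with a predetermined threshold.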
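- The apparatus according to claim 30, wherein the target event is a hypothetical event, and events of the predetermined category are events confirmed to have occurred.
- The apparatus according to claim 30, wherein the target event is an event that has occurred, and events of the predetermined category are events confirmed to be safe.
- A computer-readable storage medium on which a computer program is stored, wherein, when the computer program is executed in a computer, it causes the computer to perform the method of any one of claims 1-16.
- A computing device, including a memory and a processor, characterized in that the memory stores executable code and the processor, when executing the executable code, implements the method of any one of claims 1-16.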
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010021764.4A CN111242283B (zh) | 2020-01-09 | 2020-01-09 | 评估交互事件的自编码器的训练方法及装置 |
CN202010021764.4 | 2020-01-09 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021139525A1 (zh) | 2021-07-15 |
Family
ID=70872562
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/138401 WO2021139525A1 (zh) | 2020-01-09 | 2020-12-22 | 评估交互事件的自编码器的训练方法及装置 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111242283B (zh) |
WO (1) | WO2021139525A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114596120A (zh) * | 2022-03-15 | 2022-06-07 | 江苏衫数科技集团有限公司 | 一种商品销量预测方法、系统、设备及存储介质 |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111242283B (zh) * | 2020-01-09 | 2021-06-25 | 支付宝(杭州)信息技术有限公司 | 评估交互事件的自编码器的训练方法及装置 |
CN111476223B (zh) * | 2020-06-24 | 2020-09-22 | 支付宝(杭州)信息技术有限公司 | 评估交互事件的方法及装置 |
CN112085293B (zh) * | 2020-09-18 | 2022-09-09 | 支付宝(杭州)信息技术有限公司 | 训练交互预测模型、预测交互对象的方法及装置 |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170154282A1 (en) * | 2015-12-01 | 2017-06-01 | Palo Alto Research Center Incorporated | Computer-Implemented System And Method For Relational Time Series Learning |
CN107145977A (zh) * | 2017-04-28 | 2017-09-08 | 电子科技大学 | 一种对在线社交网络用户进行结构化属性推断的方法 |
CN109635204A (zh) * | 2018-12-21 | 2019-04-16 | 上海交通大学 | 基于协同过滤和长短记忆网络的在线推荐系统 |
CN109919316A (zh) * | 2019-03-04 | 2019-06-21 | 腾讯科技(深圳)有限公司 | 获取网络表示学习向量的方法、装置和设备及存储介质 |
CN110490274A (zh) * | 2019-10-17 | 2019-11-22 | 支付宝(杭州)信息技术有限公司 | 评估交互事件的方法及装置 |
CN110543935A (zh) * | 2019-08-15 | 2019-12-06 | 阿里巴巴集团控股有限公司 | 处理交互序列数据的方法及装置 |
CN111242283A (zh) * | 2020-01-09 | 2020-06-05 | 支付宝(杭州)信息技术有限公司 | 评估交互事件的自编码器的训练方法及装置 |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8630966B2 (en) * | 2009-01-27 | 2014-01-14 | Salk Institute For Biological Studies | Temporally dynamic artificial neural networks |
CN109960759B (zh) * | 2019-03-22 | 2022-07-12 | 中山大学 | 基于深度神经网络的推荐系统点击率预测方法 |
- 2020-01-09 CN CN202010021764.4A patent/CN111242283B/zh active Active
- 2020-12-22 WO PCT/CN2020/138401 patent/WO2021139525A1/zh active Application Filing
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114596120A (zh) * | 2022-03-15 | 2022-06-07 | 江苏衫数科技集团有限公司 | 一种商品销量预测方法、系统、设备及存储介质 |
CN114596120B (zh) * | 2022-03-15 | 2024-01-05 | 江苏衫数科技集团有限公司 | 一种商品销量预测方法、系统、设备及存储介质 |
Also Published As
Publication number | Publication date |
---|---|
CN111242283B (zh) | 2021-06-25 |
CN111242283A (zh) | 2020-06-05 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20911674; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | Ep: pct application non-entry in european phase | Ref document number: 20911674; Country of ref document: EP; Kind code of ref document: A1 |