CN111210008A - Method and device for processing interactive data by using LSTM neural network model - Google Patents

Method and device for processing interactive data by using LSTM neural network model

Info

Publication number
CN111210008A
Authority
CN
China
Prior art keywords
node
vector
interaction
implicit
transformation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010022183.2A
Other languages
Chinese (zh)
Other versions
CN111210008B (en)
Inventor
常晓夫
文剑烽
刘旭钦
宋乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202010022183.2A priority Critical patent/CN111210008B/en
Priority to CN202210602804.3A priority patent/CN115081589A/en
Publication of CN111210008A publication Critical patent/CN111210008A/en
Priority to PCT/CN2020/138398 priority patent/WO2021139524A1/en
Application granted granted Critical
Publication of CN111210008B publication Critical patent/CN111210008B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the specification provide a method and a device for processing interaction data. In the method, a dynamic interaction graph constructed according to an interaction event set is obtained; any node i in the graph points, via connecting edges, to M associated nodes corresponding to the N associated events in which the object represented by node i most recently participated, where an object is allowed to participate in a plurality of associated events simultaneously, so that a node may be connected to more than 2 associated nodes. Then, in the dynamic interaction graph, a current subgraph corresponding to the current node to be analyzed is determined and input into the neural network model for processing. The neural network model comprises an LSTM layer, which iteratively processes each node in turn according to the direction of the connecting edges between the nodes in the current subgraph, thereby obtaining the implicit vector of the current node.

Description

Method and device for processing interactive data by using LSTM neural network model
Technical Field
One or more embodiments of the present specification relate to the field of machine learning, and more particularly, to a method and apparatus for processing interactive data using machine learning.
Background
In many scenarios, user interaction events need to be analyzed and processed. An interaction event is one of the basic constituent elements of internet activity: for example, a click action when a user browses a page can be regarded as an interaction event between the user and a content block of the page, a purchase action on an e-commerce platform can be regarded as an interaction event between the user and a commodity, and an inter-account transfer is an interaction event between two users. A user's series of interaction events contains fine-grained characteristics such as the user's habits and preferences, as well as characteristics of the interaction objects, and these are important feature sources for machine learning models. Therefore, in many scenarios, it is desirable to characterize and model interaction participants based on interaction events.
However, an interaction event involves two interacting parties, and the state of each party may itself change dynamically, so it is very difficult to accurately characterize the interacting parties while comprehensively considering their many-sided characteristics. An improved scheme is therefore desired for more effectively analyzing and processing the interaction objects in interaction events to obtain feature vectors suitable for subsequent business analysis.
Disclosure of Invention
One or more embodiments of the present specification describe methods and apparatus for processing interaction data, in which an LSTM neural network model processes interaction objects into implicit features that take into account the interaction events the objects participate in and the influence of other objects in those events, for use in subsequent business analysis.
According to a first aspect, there is provided a method of processing interaction data, the method comprising:
acquiring a dynamic interaction graph constructed according to an interaction event set, wherein the interaction event set comprises a plurality of interaction events, and each interaction event at least comprises two objects with interaction behaviors and an interaction time; the dynamic interaction graph comprises an arbitrary first node, the first node corresponds to a first object in an interaction event occurring at a first time, and the first node points through connecting edges to M associated nodes corresponding to N associated events, wherein the N associated events all occur at a second time and all include the first object as one of their interaction objects, the second time being the most recent time, tracing back from the first time, at which the first object engaged in interaction behavior; the dynamic interaction graph comprises at least one multi-element node whose number of associated nodes is greater than 2;
determining a current sub-graph corresponding to a current node to be analyzed in the dynamic interaction graph, wherein the current sub-graph comprises nodes which start from the current node and reach a predetermined range through a connecting edge;
inputting the current subgraph into a neural network model, wherein the neural network model comprises an LSTM layer, and the LSTM layer iteratively processes each node in turn according to the direction of the connecting edges between the nodes in the current subgraph, thereby obtaining an implicit vector of the current node; wherein the nodes include a second node, and the sequential iterative processing comprises determining an implicit vector and an intermediate vector of the second node at least according to the node characteristics of the second node and the intermediate vectors and implicit vectors of the k associated nodes pointed to by the second node;
and carrying out service processing related to the current node according to the implicit vector of the current node.
According to one embodiment, the object includes a user, and the interaction event includes at least one of: click events, social events, transaction events.
In one embodiment, the M associated nodes are 2N nodes, respectively corresponding to the two objects included in each of the N associated events; or, in another embodiment, the M associated nodes are N+1 nodes, respectively corresponding to the N other objects interacting with the first object in the N associated events, plus the first object itself.
In different embodiments, the nodes in the predetermined range may include: nodes reachable within a preset number K of connecting edges (order K); and/or nodes whose interaction time is within a preset time range.
In one embodiment, each interaction event further comprises a behavior characteristic of the interaction behavior; in this case, the node characteristics of the second node may include attribute characteristics of an object corresponding to the second node and behavior characteristics of an interaction event in which the second node participates in the corresponding interaction time.
In one embodiment, the implicit vector and the intermediate vector of the second node may be determined by: combining the node characteristics of the second node with the k implicit vectors corresponding to the k associated nodes respectively, and inputting the results into a first transformation function and a second transformation function which have the same algorithm but different parameters, to obtain k first transformation vectors and k second transformation vectors respectively; combining the intermediate vector of the i-th associated node among the k associated nodes with the corresponding i-th first transformation vector and i-th second transformation vector to obtain k operation results, and summing the k operation results to obtain a combined vector; inputting the node characteristics of the second node and the k implicit vectors into a third transformation function and a fourth transformation function, to obtain a third transformation vector and a fourth transformation vector respectively; determining the intermediate vector of the second node based on the combined vector and the third transformation vector; and determining the implicit vector of the second node based on the intermediate vector of the second node and the fourth transformation vector.
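The transformation pipeline described above can be sketched in plain Python. This is a minimal sketch, not the patented implementation: the text does not fix the gate nonlinearities, the combination operator, or exactly how the k implicit vectors enter the third and fourth transformation functions, so sigmoid/tanh gates, elementwise products, and a sum over the k implicit vectors are assumed here.

```python
import math

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

def vtanh(v):
    return [math.tanh(x) for x in v]

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def vadd(a, b):
    return [x + y for x, y in zip(a, b)]

def vmul(a, b):
    return [x * y for x, y in zip(a, b)]

def node_step(x, assoc, W1, W2, W3, W4):
    """One iteration of the LSTM layer for a node (the "second node").

    x      -- node feature vector of the node
    assoc  -- list of (c_j, h_j): intermediate and implicit vectors of the
              k associated nodes the node points to
    W1..W4 -- the four transformation matrices; W1 and W2 have the same
              shape ("same algorithm, different parameters")
    """
    dim = len(W1)
    combined = [0.0] * dim
    h_sum = [0.0] * len(assoc[0][1])
    for c_j, h_j in assoc:
        xh = x + h_j                      # concatenation [x ; h_j]
        r_j = sigmoid(matvec(W1, xh))     # j-th first transformation vector
        z_j = sigmoid(matvec(W2, xh))     # j-th second transformation vector
        # j-th operation result, summed into the combined vector
        combined = vadd(combined, vmul(c_j, vmul(r_j, z_j)))
        h_sum = vadd(h_sum, h_j)
    xh_all = x + h_sum                    # x together with the k implicit vectors
    g = vtanh(matvec(W3, xh_all))         # third transformation vector
    o = sigmoid(matvec(W4, xh_all))       # fourth transformation vector
    c = vadd(combined, g)                 # intermediate vector of the node
    h = vmul(o, vtanh(c))                 # implicit vector of the node
    return c, h
```

A node with no associated nodes (a leaf of the subgraph) would instead be initialized with default intermediate and implicit vectors, e.g. zeros.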
According to an embodiment, sequentially and iteratively processing each node may include determining the implicit vector and the intermediate vector of the second node according to the node characteristics of the second node, the intermediate vector and implicit vector of each of the k associated nodes pointed to by the second node, and the time difference between the interaction time corresponding to the second node and the interaction times corresponding to the k associated nodes.
In one embodiment of the above, the implicit vector and the intermediate vector of the second node may be determined by: combining the node characteristics of the second node and the time difference with the k implicit vectors corresponding to the k associated nodes respectively, and inputting the results into a first transformation function to obtain k first transformation vectors; combining the node characteristics of the second node with the k implicit vectors corresponding to the k associated nodes respectively, and inputting the results into a second transformation function to obtain k second transformation vectors; combining the intermediate vector of the i-th associated node among the k associated nodes with the corresponding i-th first transformation vector and i-th second transformation vector to obtain k operation results, and summing the k operation results to obtain a combined vector; inputting the node characteristics of the second node and the k implicit vectors into a third transformation function and a fourth transformation function, to obtain a third transformation vector and a fourth transformation vector respectively; determining the intermediate vector of the second node based on the combined vector and the third transformation vector; and determining the implicit vector of the second node based on the intermediate vector of the second node and the fourth transformation vector.
In another embodiment of the above, the implicit vector and the intermediate vector of the second node may be determined by: combining the node characteristics of the second node and the time difference with the k implicit vectors corresponding to the k associated nodes respectively, and inputting the results into a first transformation function and a second transformation function which have the same algorithm but different parameters, to obtain k first transformation vectors and k second transformation vectors respectively; combining the intermediate vector of the i-th associated node among the k associated nodes with the corresponding i-th first transformation vector and i-th second transformation vector to obtain k operation results, and summing the k operation results to obtain a combined vector; inputting the node characteristics of the second node and the k implicit vectors into a third transformation function and a fourth transformation function, to obtain a third transformation vector and a fourth transformation vector respectively; determining the intermediate vector of the second node based on the combined vector and the third transformation vector; and determining the implicit vector of the second node based on the intermediate vector of the second node and the fourth transformation vector.
According to one embodiment, the neural network model may include a plurality of LSTM layers, wherein the implicit vector of the second node determined by the previous LSTM layer is input to the next LSTM layer as the node feature of the second node.
In an embodiment of the foregoing embodiment, the neural network model integrates implicit vectors of the current node output by each of the plurality of LSTM layers to obtain a final implicit vector of the current node.
In another embodiment of the above embodiment, the neural network model takes the implicit vector of the current node output by the last LSTM layer of the plurality of LSTM layers as the final implicit vector of the current node.
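The two multi-layer variants can be condensed into one small helper. A minimal sketch: the concrete integration operator for the first variant is not fixed by the text, so a (weighted) sum of the per-layer outputs is assumed here.

```python
def final_implicit_vector(layer_outputs, mode="integrate", weights=None):
    """Combine the current node's implicit vectors from several LSTM layers.

    mode="integrate" sums the per-layer vectors (optionally weighted),
    matching the first embodiment; mode="last" returns the top layer's
    output, matching the second embodiment.
    """
    if mode == "last":
        return layer_outputs[-1]
    if weights is None:
        weights = [1.0] * len(layer_outputs)
    dim = len(layer_outputs[0])
    return [sum(w * h[i] for w, h in zip(weights, layer_outputs))
            for i in range(dim)]
```

Between layers, the implicit vector a layer produces for the second node is fed to the next layer as that node's feature, as stated above.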
According to one embodiment, the neural network model is trained by: acquiring a historical interaction event that comprises a first sample object and a second sample object; determining, in the dynamic interaction graph, a first subgraph corresponding to the first sample object and a second subgraph corresponding to the second sample object; inputting the first subgraph and the second subgraph into the neural network model to obtain an implicit vector of the first sample object and an implicit vector of the second sample object respectively; predicting, from the two implicit vectors, whether the first sample object and the second sample object will interact, to obtain a prediction result; determining a prediction loss according to the prediction result; and updating the neural network model according to the prediction loss.
According to another embodiment, the neural network model is trained by: selecting a sample object from a plurality of sample objects related to the interaction event set and obtaining a classification label of the sample object; determining, in the dynamic interaction graph, a sample subgraph corresponding to the sample object; inputting the sample subgraph into the neural network model to obtain an implicit vector of the sample object; predicting the classification of the sample object from its implicit vector to obtain a prediction result; determining a prediction loss according to the prediction result and the classification label; and updating the neural network model according to the prediction loss.
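The first training scheme can be sketched as a loss computation over the two implicit vectors. The prediction head is not specified in the text; a hypothetical dot-product scorer with a sigmoid and a cross-entropy loss is assumed, and the actual parameter update (e.g. by backpropagation) is omitted.

```python
import math

def predict_interaction(h1, h2):
    """Score whether the two sample objects will interact, from their
    implicit vectors (assumed dot-product + sigmoid head)."""
    return 1.0 / (1.0 + math.exp(-sum(a * b for a, b in zip(h1, h2))))

def prediction_loss(h1, h2, label, eps=1e-9):
    """Cross-entropy between the prediction and the 0/1 label derived
    from the historical interaction event; this loss would then drive
    the model update."""
    p = predict_interaction(h1, h2)
    return -(label * math.log(p + eps) + (1 - label) * math.log(1.0 - p + eps))
```

The second (classification) scheme differs only in the head: a classifier over a single implicit vector, with the loss computed against the sample object's classification label.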
According to a second aspect, there is provided an apparatus for processing interaction data, the apparatus comprising:
the interaction graph acquisition unit, configured to acquire a dynamic interaction graph constructed according to an interaction event set, wherein the interaction event set comprises a plurality of interaction events, and each interaction event at least comprises two objects with interaction behaviors and an interaction time; the dynamic interaction graph comprises an arbitrary first node, the first node corresponds to a first object in an interaction event occurring at a first time, and the first node points through connecting edges to M associated nodes corresponding to N associated events, wherein the N associated events all occur at a second time and all include the first object as one of their interaction objects, the second time being the most recent time, tracing back from the first time, at which the first object engaged in interaction behavior; the dynamic interaction graph comprises at least one multi-element node whose number of associated nodes is greater than 2;
a subgraph determining unit configured to determine a current subgraph corresponding to a current node to be analyzed in the dynamic interaction graph, wherein the current subgraph comprises nodes which start from the current node and reach a predetermined range through a connecting edge;
the subgraph processing unit, configured to input the current subgraph into a neural network model, wherein the neural network model comprises an LSTM layer, and the LSTM layer iteratively processes each node in turn according to the direction of the connecting edges between the nodes in the current subgraph, thereby obtaining an implicit vector of the current node; wherein the nodes include a second node, and the sequential iterative processing comprises determining an implicit vector and an intermediate vector of the second node at least according to the node characteristics of the second node and the intermediate vectors and implicit vectors of the k associated nodes pointed to by the second node;
and the service processing unit is configured to perform service processing related to the current node according to the implicit vector of the current node.
According to a third aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of the first aspect.
According to a fourth aspect, there is provided a computing device comprising a memory and a processor, wherein the memory has stored therein executable code, and wherein the processor, when executing the executable code, implements the method of the first aspect.
According to the method and device provided by the embodiments of this specification, a dynamic interaction graph is constructed on the basis of an interaction event set; the graph reflects the temporal relationships of the interaction events and the influence that each event transmits between interaction objects. Considering that interaction events may occur simultaneously in practice, the dynamic interaction graph allows a node to be connected to an unrestricted number of associated nodes, forming a multivariate interaction graph. Based on the subgraph related to the interaction object to be analyzed in the dynamic interaction graph, a trained LSTM neural network model extracts an implicit vector of that interaction object. The implicit vector incorporates the influence of the other interaction objects in each interaction event, and can therefore comprehensively express the deep features of the interaction object for business processing.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1A illustrates an interaction relationship bipartite graph in one example;
FIG. 1B illustrates an interaction relationship network diagram in another example;
FIG. 2 illustrates an implementation scenario diagram according to one embodiment;
FIG. 3 illustrates a flow diagram of a method of processing interaction data, according to one embodiment;
FIG. 4 illustrates a dynamic interaction diagram constructed in accordance with one embodiment;
FIG. 5 illustrates a dynamic interaction diagram constructed in accordance with another embodiment;
FIG. 6 shows an example of a current subgraph in one embodiment;
FIG. 7 shows an example of a current sub-graph in another embodiment;
FIG. 8 shows a schematic diagram of the operation of the LSTM layer;
FIG. 9 illustrates the structure of an LSTM layer according to one embodiment;
FIG. 10 shows the structure of an LSTM layer according to another embodiment;
FIG. 11 shows the structure of an LSTM layer according to yet another embodiment;
FIG. 12 illustrates a flow diagram for training a neural network model in one embodiment;
FIG. 13 shows a flow diagram for training a neural network model in another embodiment;
FIG. 14 shows a schematic block diagram of an apparatus for processing interaction data according to one embodiment.
Detailed Description
The scheme provided by the specification is described below with reference to the accompanying drawings.
As previously mentioned, it is desirable to be able to characterize and model the participants of an interaction event, i.e., the interaction objects, based on the interaction event.
In one approach, a static interaction relationship network graph is constructed based on historical interaction events, such that individual interaction objects are analyzed based on the interaction relationship network graph. Specifically, the participants of the historical events may be used as nodes, and connection edges may be established between nodes having an interaction relationship, so as to form the interaction network graph.
Figs. 1A and 1B each show an interaction relationship network graph in a specific example. More specifically, FIG. 1A shows a bipartite graph comprising user nodes (U1-U4) and commodity nodes (V1-V3), where a connecting edge is constructed between a user and a commodity if the user purchased that commodity. FIG. 1B shows a user transfer relationship graph, where each node represents a user and a connecting edge exists between two users who have transfer records between them.
However, it can be seen that although Figs. 1A and 1B show the interaction relationships between objects, they contain no timing information about the interaction events. If graph embedding is simply performed on such an interaction relationship network graph, the obtained feature vectors do not express the influence of the timing of the interaction events on the nodes. Moreover, such static graphs are not scalable enough, and are difficult to adapt flexibly to newly added interaction events and newly added nodes.
In another scheme, for each interactive object to be analyzed, a behavior sequence of the object is constructed, and the feature expression of the object is extracted based on the behavior sequence. However, such a behavior sequence merely characterizes the behavior of the object to be analyzed itself, whereas an interaction event is an event involving multiple parties, and influences are indirectly transmitted between the participants through the interaction event. Thus, such an approach does not express the impact between the participating objects in the interaction event.
Taking the above factors into consideration, according to one or more embodiments of the present specification, a dynamically changing interaction event set is organized as a dynamic interaction graph, where each interaction object involved in each interaction event corresponds to a node in the graph. Any node is connected to a plurality of associated nodes, namely the nodes corresponding to the events in which the object of that node most recently participated. For an interaction object to be analyzed, the subgraph related to its corresponding node is obtained from the dynamic interaction graph and input into an LSTM-based neural network model, yielding the feature-vector expression of the interaction object.
Fig. 2 shows a schematic illustration of an implementation scenario according to an embodiment. As shown in Fig. 2, an interaction event set consisting of a plurality of interaction events that have occurred may be obtained. More specifically, the interaction event set may be a sequence <E_1, E_2, …, E_N> that organizes multiple interaction events in chronological order, where each element E_i represents an interaction event and may be represented as an interaction feature group E_i = (a_i, b_i, t_i), where a_i and b_i are the two interaction objects of event E_i and t_i is the interaction time. Due to factors such as the precision of time measurement, multiple interaction events are allowed to occur at the same time.
According to embodiments of the present description, a dynamic interaction graph 200 is constructed based on the interaction event set. In graph 200, each interaction object a_i, b_i of each interaction event is represented by a node, and connecting edges are established between events containing the same object. Since multiple interaction events are allowed to occur simultaneously, the dynamic interaction graph 200 includes at least one multivariate node that can be connected to 3 or more associated nodes. The structure of the dynamic interaction graph 200 will be described in more detail later.
For a certain interactive object to be analyzed, the corresponding current node in the dynamic interactive graph can be determined, and the current sub-graph related to the current node in the dynamic interactive graph is obtained. In general, a current subgraph includes a range of nodes that can be reached through a connecting edge from a current node. The current subgraph reflects the impact on the current node by other objects in the interaction event directly or indirectly associated with the current interaction object.
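A minimal sketch of such subgraph extraction, assuming the dynamic interaction graph is stored as a mapping from each node, written as an (object, time) pair, to the list of associated nodes it points to:

```python
def current_subgraph(graph, current_node, k_max):
    """Collect the nodes reachable from current_node via at most k_max
    connecting edges (a preset order K); a time-window cutoff, as
    mentioned elsewhere in the text, could be added analogously."""
    seen = {current_node}
    frontier = {current_node}
    for _ in range(k_max):
        frontier = {n for cur in frontier for n in graph.get(cur, ())} - seen
        if not frontier:
            break
        seen |= frontier
    return seen
```

For example, with a current node ("u", 3) pointing to three associated nodes at time 2, order K = 1 yields those four nodes, and K = 2 additionally pulls in their own associated nodes.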
Then, the current subgraph is input into a neural network model based on long short-term memory (LSTM), and the model outputs a feature vector of the current interaction object. The obtained feature vector captures the timing information of the associated interaction events and the influence between the interaction objects in each event, thereby more accurately expressing the deep features of the current interaction object. Such feature vectors can subsequently be applied in various machine learning models and business scenarios. For example, reinforcement learning may be performed based on the feature vector so obtained, or cluster analysis may be performed, for example clustering users into populations. Classification predictions may also be made based on such feature vectors, for example predicting whether an interaction will occur between two objects (e.g., whether a user will purchase a commodity) or predicting the business type of an object (e.g., a user's risk level), and so on.
Specific implementations of the above concepts are described below.
FIG. 3 illustrates a flow diagram of a method of processing interaction data, according to one embodiment. It is to be appreciated that the method can be performed by any apparatus, device, platform, cluster of devices having computing and processing capabilities. The following describes, with reference to specific embodiments, each step in the method for processing interactive data shown in fig. 3.
First, in step 31, a dynamic interaction graph constructed from a set of interaction events is obtained.
As previously described, an interaction event set may be obtained, composed of a plurality of interaction events, each having two interaction objects and an interaction time. Thus, any interaction event E_i can be represented as an interaction feature group E_i = (a_i, b_i, t_i), where a_i and b_i are the two interaction objects of event E_i, referred to as the first object and the second object, and t_i is the interaction time.
For example, in an e-commerce platform, an interaction event may be a purchase by a user, where the first object may be a user and the second object may be a commodity. In another example, the interaction event may be a click action of a user on a page block, where the first object may be a certain user and the second object may be a certain page block. In yet another example, the interaction event may be a transaction event, such as a transfer of money from one user to another, or a payment from a user to a store or platform. In another example, the interaction event may be a social event occurring between users through a social platform, such as a chat, a conversation, or sending a red packet (monetary gift). In other business scenarios, an interaction event may also be some other interaction behavior occurring between two objects.
In one embodiment, the two objects that interact may be different types of objects, referred to as a first type of object and a second type of object, for example, depending on the nature of the interaction event. For example, when the interaction event is a purchase in an e-commerce platform, the first type of object may be a user and the second type of object may be a commodity. In other embodiments, the two objects involved in the interaction event may be homogeneous objects. For example, in an instant messaging scenario, an interactive event may be an instant message between two users. At this time, the first object and the second object are both users and belong to the same kind of object. In still other embodiments, whether to distinguish the types of the two interactive objects may be set according to the needs of the service. For example, for a transfer interaction event, in the foregoing example, two users may be considered to belong to the same class of object. In other examples, the money transfer user may be considered as the first type object and the receiving user may be considered as the second type object according to the business requirement.
Further, in one embodiment, the interaction feature group corresponding to each interaction event may further include an event feature or behavior feature f; thus, each interaction feature group may be represented as X_i = (a_i, b_i, t_i, f). In particular, the event or behavior feature f may include the context and background information of the interaction event's occurrence, some attribute characteristics of the interaction behavior, and so on.
For example, in the case that the interaction event is a user click event, the event feature f may include a type of a terminal used by the user for clicking, a browser type, an app version, and the like; in the case where the interactive event is a transaction event, the event characteristics f may include, for example, a transaction type (commodity purchase transaction, transfer transaction, etc.), a transaction amount, a transaction channel, and the like.
The above gives examples of interaction events and of the interaction objects ai, bi in those events.
As for the interaction time ti, it should be understood that, in practice, time is always measured and recorded in units of some appropriate duration. For example, in some service platforms the unit duration for recording interaction time may be hours (h) or minutes (min), so multiple interaction events are likely to occur within one unit duration. Even with very short unit durations such as seconds (s) or milliseconds (ms), for service platforms with very frequent interactions, such as Alipay, multiple interaction events within one unit duration still inevitably occur.
In addition, there are cases of batch interactions. For example, a user edits a message in advance, selects a group of friends, and sends the message to all of them in bulk; this amounts to initiating interaction events with multiple friends simultaneously. As another example, a user adds multiple items to a shopping cart and then checks out in bulk, which is equivalent to initiating interaction events with multiple commodities at the same time.
For at least the above two reasons, it often happens that multiple interaction events are recorded with the same interaction time. Such cases are referred to herein as multiple interaction events occurring at the same time, without distinguishing their precise times and order.
In a specific example, assume an interaction event set S is obtained, whose interaction events are arranged in time order and represented in the form of interaction feature groups, recorded as follows:
S = {E1(a,b,t1), E2(a,f,t2), E3(·,·,t2), E4(·,·,t3), E5(·,·,t3), E6(·,·,t3), E7(a,u,t4), E8(b,u,t4), E9(v,c,t5), E10(u,v,t6)}
wherein a, b, c, d, e, f, u and v are interaction objects; interaction events E2 and E3 both occur at time t2, interaction events E4, E5 and E6 all occur at time t3, and interaction events E7 and E8 both occur at time t4.
For the above interaction event set, a dynamic interaction graph can be constructed to depict the association relationships between the interaction events and the interaction objects. Specifically, the objects included in the interaction events occurring at the various times may be used as nodes of the dynamic interaction graph. Thus a node corresponds to an object at one interaction time, and the same physical object may correspond to multiple nodes. For example, entity object v interacts with object u at time t6, for which a node v(t6) may be constructed, and interacts with object c at time t5, for which a node v(t5) may be constructed. Thus a node in the dynamic interaction graph corresponds to an interaction object at a certain interaction time, or, in other words, to the state of that interaction object at that interaction time.
For each node in the dynamic interaction graph, connecting edges are constructed as follows. Consider any node i, called the first node for simplicity, corresponding to a first object at a first interaction time t. In the interaction event sequence, trace back from the first interaction time t, that is, toward times earlier than t, and determine the last time at which the first object participated in an interaction as a second time t-. Take the N interaction events occurring at the second time in which the first object participates as the N associated events of the first node, take the M nodes corresponding to these N associated events as associated nodes, and establish connecting edges pointing from the first node i to the M associated nodes. Since multiple interaction events may occur at the same time, N may be greater than 1. As a result, the dynamic interaction graph may contain multivariate nodes, that is, nodes whose number of connected associated nodes is greater than 2.
In one embodiment, when constructing the dynamic interaction graph, corresponding nodes are established for each of the two objects of every interaction event. In that case, the aforementioned N associated events correspond to 2N nodes, and these 2N nodes serve as the aforementioned M associated nodes (M = 2N).
FIG. 4 illustrates a dynamic interaction graph constructed in accordance with one embodiment. Specifically, the left side of FIG. 4 shows an interaction sequence diagram organizing the aforementioned interaction event set S in time order, and the right side shows the dynamic interaction graph, in which the two interaction objects of each interaction event serve as separate nodes. The construction of connecting edges is described below taking nodes u(t6) and v(t6) as examples.
As can be appreciated, node u(t6) represents the object u at time t6. Tracing back from time t6, the last time the object u participated in an interaction is determined to be t4, at which it participates in 2 associated events E7 and E8; that is, interaction events E7 and E8 each contain object u as one of their interaction objects. Thus the 4 nodes corresponding to the associated events E7 and E8 are the associated nodes of node u(t6). In FIG. 4, to distinguish the object node u in events E7 and E8, they are denoted u1(t4) and u2(t4). Accordingly, connecting edges are established pointing from node u(t6) to its 4 associated nodes.
Node v(t6) represents the object v at time t6. Tracing back from time t6, the last time the object v participated in an interaction is determined to be t5, at which it participates in 1 associated event E9. Thus the 2 nodes v(t5) and c(t5) corresponding to the associated event E9 are the associated nodes of node v(t6), and connecting edges are established pointing from node v(t6) to these 2 associated nodes. For every other node, the associated events and associated nodes can be determined in the same manner, and connecting edges pointing to the associated nodes established accordingly. In the dynamic interaction graph shown in FIG. 4, nodes u(t6) and c(t5) are both multivariate nodes.
In another embodiment, when the dynamic interaction graph is constructed, for multiple interaction events occurring at the same time, the distinct interaction objects involved in those events are determined, and a corresponding node is established for each distinct object. That is, if the same object appears in multiple interaction events occurring at the same time, only one node is established for it. Thus, when establishing connecting edges, if the first node of the first object has N associated events, those N associated events correspond to N+1 associated nodes, corresponding respectively to the first object itself and the N other objects interacting with the first object in the N associated events.
FIG. 5 illustrates a dynamic interaction graph constructed in accordance with another embodiment. Specifically, the left side of FIG. 5 shows the aforementioned interaction event set S, and the right side shows the dynamic interaction graph, in which corresponding nodes are established for the distinct interaction objects of simultaneously occurring interaction events. The dynamic interaction graph of FIG. 5 differs from that of FIG. 4 in that nodes of the same object appearing in multiple simultaneous interaction events in FIG. 4 are merged into one node. For example, for the two interaction events E7 and E8 that both occur at time t4 and involve 3 distinct interaction objects a, b and u, 3 nodes a(t4), b(t4) and u(t4) are established. This corresponds to merging u1(t4) and u2(t4) of FIG. 4 into a single node u(t4). In such a case, in one example, the interactions that occur may be illustrated by dashed double-headed arrows between nodes; for example, the dashed double-headed arrows in FIG. 5 show that at time t4 there is interaction behavior between objects a and u and between objects b and u, but none between objects a and b.
The construction of connecting edges is again described below taking nodes u(t6) and v(t6) as examples.
As previously described, node u(t6) represents the object u at time t6. Tracing back from time t6, the last time the object u participated in an interaction is determined to be t4, at which it participates in the 2 associated events E7 and E8, each containing object u as one of its interaction objects. The 3 nodes a(t4), b(t4) and u(t4) corresponding to the associated events E7 and E8 are the associated nodes of node u(t6), and connecting edges are established pointing from node u(t6) to these 3 associated nodes.
Connecting edges pointing from node v(t6) to the 2 nodes v(t5) and c(t5) corresponding to the associated event E9 can be established in the same way as described in conjunction with FIG. 4, and are not described again. For each of the other nodes in FIG. 5, the associated events and associated nodes can likewise be determined, and connecting edges pointing to the associated nodes established. In the dynamic interaction graph shown in FIG. 5, nodes u(t6) and c(t5) are both multivariate nodes.
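By way of illustration, the edge-construction rule described above (in the FIG. 5 style, where simultaneous events share one node per object) can be sketched as follows; the function and variable names are illustrative, and two-object events are assumed to be given as (a, b, t) tuples:

```python
from collections import defaultdict

def build_dynamic_graph(events):
    """Build a dynamic interaction graph from a time-ordered event list.

    events: iterable of (obj_a, obj_b, t) tuples.
    Returns edges: {(obj, t): [associated nodes]}, where each node (obj, t)
    points at the node of obj itself and of its peers at the previous time
    obj interacted (the FIG. 5 convention: one node per object per time).
    """
    last_seen = {}               # object -> time of its previous interaction
    peers_at = defaultdict(set)  # (obj, t) -> objects it interacted with at t
    edges = {}

    # group events by time so simultaneous events share node instances
    by_time = defaultdict(list)
    for a, b, t in events:
        by_time[t].append((a, b))

    for t in sorted(by_time):
        involved = set()
        for a, b in by_time[t]:
            involved.update((a, b))
        for obj in involved:
            t_prev = last_seen.get(obj)
            if t_prev is not None:
                # associated nodes: obj itself plus its peers at t_prev
                edges[(obj, t)] = [(obj, t_prev)] + [
                    (p, t_prev) for p in sorted(peers_at[(obj, t_prev)])
                ]
        for a, b in by_time[t]:
            peers_at[(a, t)].add(b)
            peers_at[(b, t)].add(a)
        for obj in involved:
            last_seen[obj] = t
    return edges
```

Run on an event list containing (a,u,t4), (b,u,t4) and (u,v,t6), for instance, this yields edges from node (u, t6) to (u, t4), (a, t4) and (b, t4), matching the connecting edges described above; lowest-level nodes get no outgoing edges.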
The above describes the manner and process of building a dynamic interaction graph based on a set of interaction events. For the method of processing interaction data shown in FIG. 3, the construction of the dynamic interaction graph may be performed in advance or on the fly. Accordingly, in one embodiment, in step 31 the dynamic interaction graph is constructed on the fly from the interaction event set, in the manner described above. In another embodiment, the dynamic interaction graph may be constructed in advance based on the interaction event set, and in step 31 the formed dynamic interaction graph is read or received.
It can be understood that the dynamic interaction graph constructed in the above manner has strong extensibility, and can be very easily updated dynamically according to the newly added interaction events. Accordingly, step 31 may also include a process of updating the dynamic interaction graph.
Specifically, an existing dynamic interaction graph constructed based on an existing interaction event set may be obtained; then, as time advances, newly occurring interaction events are continuously detected, and the existing dynamic interaction graph is updated according to these new events.
In one embodiment, the existing dynamic interaction graph takes the form of FIG. 4, with two nodes per interaction event. In such a case, assuming P newly added interaction events occurring at a first update time are obtained, 2P new nodes are added to the existing dynamic interaction graph, corresponding respectively to the two objects included in each of the P new interaction events. Then, for each new node, the associated events and associated nodes are sought in the manner described above; if a new node has associated nodes, connecting edges pointing from it to those associated nodes are added.
In another embodiment, the existing dynamic interaction graph takes the form of FIG. 5, in which the distinct objects of simultaneous interaction events correspond to distinct nodes. In this case, after P newly added interaction events occurring at the first update time are acquired, the Q distinct objects involved in the P new events are first determined: if no object appears in more than one of the P new events, Q = 2P; if the same object appears in several of them, Q < 2P. Q new nodes, corresponding to the Q distinct objects, are then added to the existing dynamic interaction graph. Then, for each new node, the associated events and associated nodes are sought in the manner described above; if a new node has associated nodes, connecting edges pointing from it to those associated nodes are added.
In step 31, a dynamic interaction graph constructed based on the interaction event set is obtained.
Next, in step 32, a current subgraph corresponding to the current node to be analyzed is determined in the obtained dynamic interaction graph, the current subgraph comprising the nodes within a predetermined range that are reached from the current node via connecting edges.
The current node is the node corresponding to the interaction object to be analyzed. As mentioned above, however, one entity object may correspond to multiple nodes expressing its state at different times. To express the latest state of the interaction object to be analyzed, in one embodiment, a node with no connecting edge pointing to it in the dynamic interaction graph is selected as the current node; that is, the node corresponding to the most recent time at which the object to be analyzed interacted is selected. For example, in the dynamic interaction graphs shown in FIGS. 4 and 5, when the interaction object u is to be analyzed, node u(t6) may be selected as the current node. This is not essential, however; in other embodiments, for example for training purposes, other nodes may be selected as the current node, e.g., node u(t4) may also be selected for analyzing object u.
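As a small illustrative sketch, the candidate current nodes (nodes with no incoming connecting edge, i.e. the latest state of each object) can be found as follows; names are illustrative:

```python
def latest_nodes(edges):
    """Return the nodes with no connecting edge pointing at them.

    edges: {node: [associated nodes]}, where a node is an (obj, t) pair.
    These nodes express the most recent state of their objects, so they
    are the natural candidates for the 'current node'.
    """
    nodes = set(edges)
    for targets in edges.values():
        nodes.update(targets)
    pointed_to = {m for targets in edges.values() for m in targets}
    return nodes - pointed_to
```

For training, any other node may of course still be chosen explicitly instead.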
Starting from the current node, the nodes within a predetermined range reached via connecting edges form the current subgraph corresponding to the current node. In one embodiment, the nodes within the predetermined range may be the nodes reachable through connecting edges of at most a preset order K. Here, the order K is a preset hyper-parameter that can be chosen according to the service situation. It can be understood that the preset order K represents the number of steps of historical interaction events traced back when expressing the information of the current node; the larger K is, the more orders of historical interaction information are considered.
In another embodiment, the nodes within the predetermined range may instead be the nodes whose interaction time falls within a predetermined time range. For example, a duration T (e.g., one day) is traced back from the interaction time of the current node, and the nodes within that duration range that are reachable via connecting edges are taken.
In yet another embodiment, the predetermined range takes into account both the number of connecting edges and the time range; in other words, the nodes within the predetermined range are the nodes reachable through connecting edges of at most a preset order K whose interaction time is also within the predetermined time range.
For simplicity, the following examples are described in terms of connecting edges of at most a preset order K.
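The determination of the current subgraph by order K, optionally intersected with a time window, can be sketched as a breadth-first traversal; this is an illustrative sketch, not the claimed implementation, and all names are assumptions:

```python
from collections import deque

def current_subgraph(edges, start, k_hops, t_min=None):
    """Collect the nodes reachable from `start` through at most `k_hops`
    connecting edges, optionally keeping only nodes whose interaction
    time is >= t_min (the predetermined time range).

    edges: {node: [associated nodes]}, where a node is an (obj, t) pair.
    Returns the set of nodes in the current subgraph, `start` included.
    """
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == k_hops:
            continue                       # preset order K reached
        for nxt in edges.get(node, []):
            if t_min is not None and nxt[1] < t_min:
                continue                   # outside the time window
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return seen
```

Passing only `k_hops` gives the order-K variant; passing both arguments gives the combined variant described above.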
FIG. 6 shows an example of the current subgraph in one embodiment. In the example of FIG. 6, assume u(t6) of FIG. 4 is the current node and the preset order K = 2. Then, starting from u(t6), traversal is performed along the direction of the connecting edges, and the nodes reachable through connecting edges of at most 2 orders are shown as the gray nodes in the figure. These gray nodes and the connection relationships between them form the current subgraph corresponding to the current node u(t6).
FIG. 7 shows an example of the current subgraph in another embodiment. In the example of FIG. 7, assume u(t6) of FIG. 5 is the current node and the preset order K = 2. Then, starting from u(t6), traversal is performed along the direction of the connecting edges, and the nodes reachable through connecting edges of at most 2 orders are shown as the gray nodes in the figure. These gray nodes and the connection relationships between them form the current subgraph corresponding to the current node u(t6).
Next, in step 33, the current subgraph is input into a neural network model, which includes an LSTM layer. For any node in the current subgraph, called the second node for ease of description, the LSTM layer performs the following: it determines the implicit vector and intermediate vector of the second node based at least on the node characteristics of the second node and the intermediate and implicit vectors of each of the k associated nodes the second node points to. In this way, the LSTM layer iterates over the nodes in order, following the direction of the connecting edges between the nodes of the current subgraph, and thereby obtains the implicit vector of the current node.
FIG. 8 shows a working diagram of the LSTM layer. Assume node Q points to k associated nodes, node J1 to node Jk. As shown in FIG. 8, at time T the LSTM layer processes node J1 through node Jk to obtain their representation vectors H1 through Hk, each comprising an intermediate vector and an implicit vector; at the next time T+, the LSTM layer obtains the representation vector HQ of node Q from the node characteristics of node Q and the previously obtained representation vectors H1 through Hk of J1 through Jk. The representation vector of node Q can in turn be used, at a subsequent time, to obtain the representation vectors of nodes pointing to node Q, thus implementing an iterative process.
This process is described in connection with the current subgraph of FIG. 7. For each lowest-level node in the graph, e.g., node a(t2), the nodes it points to are not considered within the current subgraph; that is, a(t2) is treated as having no associated nodes. In such a case, the intermediate vector c and implicit vector h of each associated node the node would point to are generated by padding with a default value (e.g., 0). The LSTM layer then determines the intermediate vector c(a(t2)) and implicit vector h(a(t2)) of node a(t2) based on the node characteristics of a(t2) and the intermediate vectors c and implicit vectors h of the default associated nodes. The other lowest-level nodes are processed in the same way to obtain their intermediate and implicit vectors.
The intermediate-level node a(t4) points to the 2 associated nodes a(t2) and f(t2). Therefore, the LSTM layer determines the intermediate vector c(a(t4)) and implicit vector h(a(t4)) of node a(t4) based on the node characteristics of a(t4) itself and the intermediate and implicit vectors of the 2 associated nodes it points to, namely c(a(t2)), h(a(t2)), c(f(t2)) and h(f(t2)). The other intermediate-level nodes are processed in the same way to obtain their intermediate and implicit vectors.
Node u(t6) points to the 3 associated nodes a(t4), u(t4) and b(t4). Therefore, the LSTM layer determines the intermediate vector c(u(t6)) and implicit vector h(u(t6)) of node u(t6) based on the node characteristics of u(t6) itself and the intermediate and implicit vectors of the 3 associated nodes it points to, namely c(a(t4)), h(a(t4)), c(u(t4)), h(u(t4)), c(b(t4)) and h(b(t4)).
Thus, through layer-by-layer iterative processing, the intermediate vector and implicit vector of the current node u(t6) are obtained.
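The layer-by-layer iteration just described can be sketched as a memoized recursion over the current subgraph; here `cell` stands in for whatever LSTM computation maps a node and the (c, h) states of its associated nodes to the node's own (c, h), and leaf nodes are padded with a default zero state as described above (all names illustrative):

```python
def encode(node, edges, cell, zero_state):
    """Compute (c, h) for `node` by first encoding the associated nodes
    it points to, then applying `cell`; nodes with no outgoing connecting
    edges in the subgraph receive `zero_state` as default padding."""
    cache = {}
    def rec(n):
        if n in cache:
            return cache[n]
        assoc = edges.get(n, [])
        states = [rec(m) for m in assoc] or [zero_state]
        cache[n] = cell(n, states)
        return cache[n]
    return rec(node)
```

The cache ensures each node of the subgraph is processed only once even when several nodes point at it.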
The internal structure and algorithms of the LSTM layer to achieve the above iterative process are described below.
FIG. 9 illustrates the structure of an LSTM layer according to one embodiment. In the example of FIG. 9, the currently processed node is denoted u(t), and x_{u(t)} denotes the node characteristics of node u(t). In the case where the node represents a user, the node characteristics may include attribute features of the user, such as age, occupation, education level and region of residence; in the case where the node represents a commodity, the node characteristics may include attribute features of the commodity, such as commodity category, time of listing and sales volume. Where the node represents another kind of interaction object, the corresponding original node characteristics can be acquired accordingly. When the feature group of the interaction event further includes an event feature or behavior feature f, the node characteristics may also include the behavior features f of the interaction events the node participates in at the corresponding interaction time.
Suppose node u(t) points to k associated nodes, denoted u1(t), u2(t), …, uk(t), each associated node corresponding to an intermediate vector c and an implicit vector h. The case k = 3 is shown schematically in FIG. 9; accordingly, c_{ui(t)} denotes the intermediate vector of associated node ui(t) and h_{ui(t)} its implicit vector, with i being 1, 2, 3 in FIG. 9. It is understood that the computational relationships shown in FIG. 9 also apply when k takes other values. For example, if node u(t) has no real associated node, k = 0, and a default value such as a zero vector may be used as the intermediate vector and implicit vector of the associated nodes; if node u(t) is a binary node, k = 2, and a default value such as a zero vector may be used as the intermediate vector and implicit vector corresponding to the third associated node u3(t); if node u(t) has more than 3 associated nodes, intermediate and implicit vectors corresponding to the additional associated nodes are further added as inputs on the basis of FIG. 9.
The LSTM layer performs the following operations on the node features, intermediate vectors, and implicit vectors input thereto.
The node characteristics x_{u(t)} are combined with each of the k implicit vectors h_{ui(t)} (i from 1 to k) corresponding to the k associated nodes, and input into a first transformation function g and a second transformation function f, which have the same algorithm but different parameters, to obtain k first transformation vectors and k second transformation vectors, respectively.
More specifically, in one example, the first transformation function g and the second transformation function f are calculated using the following equations (1) and (2), respectively:

g_i = σ(W_g x_{u(t)} + R_{gi} h_{ui(t)} + b_g)    (1)

f_i = σ(W_f x_{u(t)} + R_{fi} h_{ui(t)} + b_f)    (2)

In the above equations (1) and (2), σ is an activation function, for example a sigmoid function; W_g, R_{gi}, W_f and R_{fi} are parameter matrices of the linear transformations, and b_g and b_f are offset parameters. It can be seen that the algorithms of equations (1) and (2) are the same, only their parameters differing. With the above transformation functions, k first transformation vectors g_i and k second transformation vectors f_i can be obtained.
Of course, in other examples, similar but different transformation function forms may be used, such as selecting different activation functions, modifying the form and number of parameters in the above formula, and so forth.
Then, the intermediate vector c_{ui(t)} of the i-th associated node ui(t) among the k associated nodes is combined with the corresponding i-th first transformation vector g_i and i-th second transformation vector f_i to obtain k operation results, and the k operation results are summed to obtain a combined vector V.

Specifically, in one example, the combining operation may be a bitwise multiplication of the three vectors, i.e., c_{ui(t)} ⊙ g_i ⊙ f_i, where ⊙ denotes bitwise (element-wise) multiplication. In other examples, the combining operation may be another vector operation, such as addition. In the case where the combining operation is bitwise multiplication, the resulting combined vector V may be expressed as:

V = Σ_{i=1}^{k} c_{ui(t)} ⊙ g_i ⊙ f_i    (3)
in addition, node characteristics of the nodes are also determined
Figure BDA0002361208390000196
Together with k implicit vectors
Figure BDA0002361208390000197
(where i is from 1 to k), respectively inputting a third transformation function p and a fourth transformation function o to respectively obtain a third transformation vector pu(t)And a fourth transform vector ou(t)
Specifically, in the example shown in FIG. 9, the third transformation function p may first obtain the vectors z_{u(t)} and s_{u(t)}, and then combine z_{u(t)} and s_{u(t)} to obtain the third transformation vector p_{u(t)}. For example, in one specific example:

p_{u(t)} = z_{u(t)} ⊙ s_{u(t)}    (4)

where ⊙ denotes bitwise multiplication.
More particularly, z_{u(t)} and s_{u(t)} can be calculated according to the following formulas:

z_{u(t)} = σ(W_z x_{u(t)} + Σ_{i=1}^{k} R_{zi} h_{ui(t)} + b_z)    (5)

s_{u(t)} = σ(W_s x_{u(t)} + Σ_{i=1}^{k} R_{si} h_{ui(t)} + b_s)    (6)

where W_z, R_{zi}, W_s and R_{si} are parameter matrices of the linear transformations, and b_z and b_s are offset parameters.
The fourth transformation function o may obtain the fourth transformation vector o_{u(t)} by the following formula:

o_{u(t)} = σ(W_o x_{u(t)} + Σ_{i=1}^{k} R_{oi} h_{ui(t)} + b_o)    (7)

where W_o and R_{oi} are parameter matrices of the linear transformation, and b_o is an offset parameter.
Then, based on the combined vector V and the third transformation vector p_{u(t)}, the intermediate vector c_{u(t)} of node u(t) is determined. For example, the combined vector V and the third transformation vector p_{u(t)} may be summed to obtain the intermediate vector c_{u(t)} of u(t). In one specific example, the intermediate vector c_{u(t)} can be expressed as:

c_{u(t)} = V + p_{u(t)}    (8)

In other examples, the combined vector V and the third transformation vector may be combined in other ways, such as weighted summation or bitwise multiplication, and the intermediate vector c_{u(t)} obtained from the result.
Furthermore, based on the intermediate vector c_{u(t)} of node u(t) thus obtained and the fourth transformation vector o_{u(t)}, the implicit vector h_{u(t)} of node u(t) is determined. In the specific example shown in FIG. 9, the intermediate vector c_{u(t)} may be passed through a tanh function and then combined with the fourth transformation vector o_{u(t)}, for example by bitwise multiplication, the result serving as the implicit vector h_{u(t)} of node u(t), namely:

h_{u(t)} = o_{u(t)} ⊙ tanh(c_{u(t)})    (9)
Thus, according to the structure and algorithm shown in FIG. 9, the LSTM layer determines the intermediate vector c_{u(t)} and implicit vector h_{u(t)} of node u(t) from the node characteristics of the currently processed node u(t) and the intermediate and implicit vectors of each of the k associated nodes node u(t) points to.
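The full computation of equations (1) through (9) can be sketched in NumPy as follows. This is an illustrative simplification, not the claimed implementation: a single parameter matrix R_* is shared across associated-node slots, whereas the formulas above index them per slot (R_{gi}, R_{zi}, etc.), and all names are assumptions of this sketch:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_node_step(x, assoc_states, params):
    """One LSTM-layer step for node u(t), following equations (1)-(9).

    x            : node characteristics x_{u(t)}, shape (d_in,)
    assoc_states : list of (c_i, h_i) pairs for the k associated nodes;
                   pad with zero vectors when a node has fewer neighbors
    params       : dict of parameter matrices W_*, R_* and offsets b_*
    Returns the intermediate vector c_{u(t)} and implicit vector h_{u(t)}.
    """
    hs = [h for _, h in assoc_states]

    # equations (1)-(3): per-associated-node gates and combined vector V
    V = np.zeros_like(params["b_g"])
    for c_i, h_i in assoc_states:
        g_i = sigmoid(params["W_g"] @ x + params["R_g"] @ h_i + params["b_g"])  # (1)
        f_i = sigmoid(params["W_f"] @ x + params["R_f"] @ h_i + params["b_f"])  # (2)
        V += c_i * g_i * f_i                                                    # (3)

    h_agg = lambda R: sum((R @ h for h in hs), np.zeros_like(params["b_g"]))
    z = sigmoid(params["W_z"] @ x + h_agg(params["R_z"]) + params["b_z"])       # (5)
    s = sigmoid(params["W_s"] @ x + h_agg(params["R_s"]) + params["b_s"])       # (6)
    p = z * s                                                                   # (4)
    o = sigmoid(params["W_o"] @ x + h_agg(params["R_o"]) + params["b_o"])       # (7)

    c = V + p                                                                   # (8)
    h = o * np.tanh(c)                                                          # (9)
    return c, h
```

Padding the associated-state list with zero (c, h) pairs reproduces the default-value behavior for lowest-level and binary nodes described earlier.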
In one embodiment, in iteratively processing each node u(t) to determine its intermediate and implicit vectors, a time difference Δ between the interaction time corresponding to the currently processed node u(t) and the interaction time corresponding to the k associated nodes it points to is further introduced. Specifically, assume the currently processed node u(t) corresponds to a first interaction time t1. According to the previous description of the dynamic interaction graph, the k associated nodes connected to it are the nodes corresponding to the interaction events that occurred the last time the object corresponding to node u(t) participated in an interaction, and the time of those simultaneously occurring interaction events is denoted a second interaction time t2. The time difference Δ is then the difference between the first interaction time t1 and the second interaction time t2. Thus, the LSTM layer may determine the implicit vector h_{u(t)} and intermediate vector c_{u(t)} of node u(t) from the node characteristics of the currently processed node u(t), the intermediate and implicit vectors of each of the k associated nodes it points to, and the time difference Δ.
More specifically, a factor of the time difference Δ may be introduced on the basis of the scheme shown in FIG. 9 to obtain the implicit and intermediate vectors of node u(t) similarly. Specifically, one process combining the time difference may include:

combining the node characteristics of the second node u(t) and the time difference Δ with each of the k implicit vectors corresponding to the k associated nodes, and inputting them into the first transformation function g to obtain k first transformation vectors;

combining the node characteristics of the second node with each of the k implicit vectors corresponding to the k associated nodes, and inputting them into the second transformation function f to obtain k second transformation vectors;

combining the intermediate vector of the i-th associated node among the k associated nodes with the corresponding i-th first transformation vector and i-th second transformation vector to obtain k operation results, and summing the k operation results to obtain a combined vector;

inputting the node characteristics of the second node and the k implicit vectors into the third transformation function and the fourth transformation function, respectively, to obtain a third transformation vector and a fourth transformation vector;

determining the intermediate vector c_{u(t)} of the second node u(t) based on the combined vector and the third transformation vector;

determining the implicit vector h_{u(t)} of the second node u(t) based on the intermediate vector c_{u(t)} of the second node and the fourth transformation vector.
FIG. 10 shows the structure of an LSTM layer according to another embodiment. Comparing FIG. 10 and FIG. 9, it can be seen that the structure and algorithm of FIG. 10 are similar to those of FIG. 9, except that the time difference Δ(u,t) is further introduced. In the example of FIG. 10, the time difference Δ(u,t) and the node characteristics x_{u(t)} of node u(t), combined with the implicit vectors of the respective associated nodes, are input into the first transformation function g. Accordingly, the first transformation function g may be modified to:

g_i = σ(W_g x_{u(t)} + M_{gi} Δ(u,t) + R_{gi} h_{ui(t)} + b_g)    (10)

Equation (10) further introduces, on the basis of equation (1), a time term corresponding to the time difference Δ(u,t); accordingly, the parameter M_{gi} for the time term may be embodied as a vector.
The other transformation functions in fig. 10, and the operation procedure between the functions, may be the same as the example described in connection with fig. 9.
According to another embodiment, the process of combining time differences may comprise the steps of:
respectively combining the node characteristics of a second node u (t) and the time difference delta with k implicit vectors corresponding to k associated nodes, and inputting a first transformation function g and a second transformation function f with the same algorithm and different parameters to respectively obtain k first transformation vectors and k second transformation vectors;
combining the intermediate vector of the ith associated node in the k associated nodes with the corresponding ith first transformation vector and the ith second transformation vector to obtain k operation results, and summing the k operation results to obtain a combined vector;
respectively inputting the node characteristics of the second node and the k implicit vectors into a third transformation function and a fourth transformation function to respectively obtain a third transformation vector and a fourth transformation vector;
determining an intermediate vector c_u(t) of the second node u(t) based on the combined vector and the third transformation vector;
determining an implicit vector h_u(t) of the second node u(t) based on the intermediate vector c_u(t) of the second node and the fourth transformation vector.
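The steps above can be sketched as one iteration of the dynamic-graph LSTM cell in NumPy. The exact forms of the transformation functions g, f, p and o are given only as formula images in the original, so standard LSTM-style gates (sigmoid/tanh activations, elementwise products, summation over the k associated nodes) are assumed here; all function and parameter names are illustrative. Concatenating the time difference Δ(u, t) into the input of g and f means the corresponding weight column plays the role of the vector parameter for the time term (in the spirit of M_gi and M_fi).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dynamic_graph_lstm_cell(x_u, delta, H_assoc, C_assoc, params):
    """One iteration of the dynamic-graph LSTM cell (sketch, forms assumed).

    x_u:     node characteristics of node u(t), shape (d,)
    delta:   scalar time difference Δ(u, t)
    H_assoc: implicit vectors of the k associated nodes, shape (k, d)
    C_assoc: intermediate vectors of the k associated nodes, shape (k, d)
    """
    Wg, bg = params["g"]   # first transformation function g
    Wf, bf = params["f"]   # second transformation function f (same algorithm, own parameters)
    Wp, bp = params["p"]   # third transformation function p
    Wo, bo = params["o"]   # fourth transformation function o

    k, d = H_assoc.shape
    # Steps 1-2: combine [x_u, Δ] with each implicit vector, apply g and f,
    # then combine with the i-th intermediate vector and sum the k results.
    results = []
    for i in range(k):
        z = np.concatenate([x_u, [delta], H_assoc[i]])
        g_i = sigmoid(Wg @ z + bg)          # i-th first transformation vector
        f_i = sigmoid(Wf @ z + bf)          # i-th second transformation vector
        results.append(g_i * C_assoc[i] * f_i)
    combined = np.sum(results, axis=0)      # combination vector

    # Step 3: third and fourth transformation vectors from x_u and the k implicit vectors.
    z2 = np.concatenate([x_u, H_assoc.reshape(-1)])
    p = np.tanh(Wp @ z2 + bp)               # third transformation vector
    o = sigmoid(Wo @ z2 + bo)               # fourth transformation vector

    c_u = combined + p                      # intermediate vector of node u(t)
    h_u = o * np.tanh(c_u)                  # implicit vector of node u(t)
    return c_u, h_u
```

The sketch covers the Fig. 11 variant, where Δ(u, t) enters both g and f; dropping `[delta]` from `z` in the f branch recovers the Fig. 10 variant.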
Fig. 11 shows the structure of an LSTM layer according to yet another embodiment. It can be seen that the LSTM layer of Fig. 11 also introduces the time difference Δ(u, t); in contrast to Fig. 10, the time difference Δ(u, t) in Fig. 11 is further input to the second transformation function f. That is, the time difference Δ(u, t) and the node characteristics x_u(t) of node u(t), together with the implicit vectors of the respective associated nodes, are input into the first transformation function g and the second transformation function f, respectively.
More specifically, the first transformation function g in fig. 11 may still take the form of equation (10). Further, the second transformation function f may take the form:
[Formula (11), shown as an image in the original: the second transformation function f of formula (2), augmented with a time term for the time difference Δ(u, t).]
Formula (11) introduces, on the basis of formula (2), a time term corresponding to the time difference Δ(u, t); accordingly, M_fi, the parameter acting on the time term, may be embodied as a vector.
The other transformation functions in fig. 11, and the operation procedure between the functions, may be the same as the example described in connection with fig. 9.
In further embodiments, the time difference may be further input to the third transformation function p and/or the fourth transformation function o. In such a case, some or all of the foregoing equations (5), (6) and (7) may be modified, and the time term for the time difference is introduced similarly on the original basis, which is not described in detail herein.
Through the LSTM layer described above in conjunction with fig. 9-11, each node in the current subgraph is processed iteratively in turn, and the intermediate vector and the implicit vector of the current node can be obtained. In one embodiment, the implicit vector thus obtained can be used as an output of the neural network model to characterize the current node.
It can be seen that, unlike a conventional LSTM network, the LSTM-based neural network model above is modified and optimized in both function and structure for processing multivariate dynamic interaction graphs, and may therefore be referred to as a dynamic-graph LSTM neural network model.
According to one embodiment, to further improve the effect, the dynamic-graph LSTM neural network model may include a plurality of LSTM layers, where the implicit vector of a node determined by one LSTM layer is input to the next LSTM layer as that node's feature. That is, each LSTM layer still iteratively processes each node, determining the implicit vector and the intermediate vector of the currently processed node i according to the node feature of node i and the respective intermediate vectors and implicit vectors of the k associated nodes pointed to by node i. Only the bottommost LSTM layer uses the original feature of node i as the node feature; each subsequent LSTM layer uses the implicit vector h_i of node i determined by the previous LSTM layer as the node feature. In one embodiment, the LSTM layers are stacked in the manner of a residual network to form the neural network model.
In the case of a neural network model having multiple LSTM layers, it will be appreciated that each LSTM layer may determine the implicit vector for the current node. In one embodiment, the neural network model integrates implicit vectors of current nodes output by each of the plurality of LSTM layers to obtain a final implicit vector of the current node. More specifically, the implicit vectors output by the LSTM layers may be weighted and combined to obtain the final implicit vector. The weight of the weighted combination can be simply set to a weight factor corresponding to each layer, and the magnitude of the weight factor is adjusted through training. Alternatively, the weighting factor may be determined by a more complex attention (attention) mechanism.
In another embodiment, the neural network model may further use the implicit vector of the current node output by the last LSTM layer of the plurality of LSTM layers as the final implicit vector of the current node.
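The multi-layer scheme can be sketched as follows: each layer consumes the previous layer's implicit vector as the node feature, and the final implicit vector is a weighted combination of the per-layer outputs (uniform by default; a trained weight factor per layer, an attention mechanism, or taking only the last layer are the variants described above). The per-layer subgraph processing is elided behind simple callables; all names are illustrative.

```python
import numpy as np

def multilayer_implicit_vector(x0, layers, weights=None):
    """Stack LSTM layers and combine per-layer implicit vectors (sketch).

    x0:      original node feature of the current node
    layers:  list of callables; each maps a node feature to that layer's
             implicit vector for the current node (subgraph handling elided)
    weights: optional per-layer weighting factors; defaults to uniform.
             Using weights like [0, ..., 0, 1] reproduces the
             "last layer only" variant.
    """
    h, per_layer = x0, []
    for layer in layers:
        # The previous layer's implicit vector becomes the next layer's node feature.
        h = layer(h)
        per_layer.append(h)
    if weights is None:
        weights = np.full(len(layers), 1.0 / len(layers))
    return sum(w * h for w, h in zip(weights, per_layer))
```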
Thus, through various modes, the LSTM-based neural network model obtains the implicit vector of the current node as the characteristic vector of the current node based on the current sub-graph corresponding to the current node to be analyzed. Because the current sub-graph reflects the time-sequence interaction history information related to the interaction object corresponding to the current node, the obtained feature vector of the current node not only expresses the features of the interaction object, but also expresses the influence of the interaction object in the past interaction events, thereby comprehensively representing the characteristics of the interaction object.
Then, in step 34, business processing related to the current node is performed according to the implicit vector of the current node.
In one embodiment, the business process may be to predict a classification category of an object corresponding to the current node according to the implicit vector obtained above.
For example, in the case that the object corresponding to the current node is a user, the user category of the user, such as the belonging crowd category, the risk level category, and the like, can be predicted based on the implicit vector. In the case that the object corresponding to the current node is an item, the category of the item, such as belonging business category, suitable crowd category, purchased scene category, etc., can be predicted based on the implicit vector.
In one embodiment, the business processing may further include analyzing and predicting an interaction event associated with the current node. Since an interaction event typically involves two objects, the feature vector of the other node also needs to be analyzed.
Specifically, another node different from the aforementioned current node may be selected in the dynamic interaction graph, for example node t(t6) in Figs. 4 and 5. In a manner similar to steps 32 and 33 of Fig. 3, an implicit vector corresponding to that node is determined. In one embodiment, whether the objects represented by the two nodes will interact can be predicted based on the implicit vectors corresponding to the current node and the other node, respectively. In another embodiment, the current node and the other node are the two nodes corresponding to a first interaction event that has already occurred; the event category of the first interaction event can then be predicted according to the implicit vectors corresponding to the two nodes.
For example, in one case, the user represented by the current node has confirmed the purchase of the item represented by the other node, thereby generating a first interaction event. When the user requests payment, whether the first interaction event is a fraudulent interaction suspected of account theft can be predicted from the implicit vectors of the two nodes, so as to decide whether to allow the payment. In yet another example, the user represented by the current node has performed a comment operation, such as a like or a text comment, on the item (e.g., a movie) represented by the other node, thereby generating a first interaction event. Whether the first interaction event is a genuine operation can then be predicted from the implicit vectors of the two nodes.
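A minimal sketch of such event prediction: a classifier scores the concatenated implicit vectors of the two nodes. The linear-plus-sigmoid head is an assumption; the specification only states that a two-class classifier operates on the two implicit vectors.

```python
import numpy as np

def predict_interaction(h_a, h_b, W, b):
    """Probability that the objects behind two nodes interact (or that an
    occurred event belongs to a target category, e.g. fraud), from their
    implicit vectors. W (shape (2d,)) and b are classifier parameters."""
    z = np.concatenate([h_a, h_b])
    logit = float(W @ z + b)
    return 1.0 / (1.0 + np.exp(-logit))
```

With all-zero parameters the head is uninformative and outputs probability 0.5; the parameters are learned as described in the training section below in the original.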
It will be appreciated that the business processing above is based on the implicit vectors of nodes determined by the LSTM neural network model from the dynamic interaction graph. As previously mentioned, the calculation process by which the LSTM neural network model determines the implicit vectors depends on a large number of parameters, such as those in the various transformation functions described above. These parameters need to be determined by training the neural network model. In different embodiments, the neural network model may be trained through different tasks.
In one embodiment, the neural network model is trained by predicting interactive behavior. Fig. 12 shows a flowchart for training the neural network model in this embodiment. As shown in FIG. 12, in step 121, a historical interaction event is obtained, which is an interaction event that has actually occurred. In one particular example, historical interactivity events may be obtained from the set of interactivity events described above. The two objects included in the historical interaction event are referred to as a first sample object and a second sample object.
In step 122, a first sub-graph corresponding to the first sample object and a second sub-graph corresponding to the second sample object are determined, respectively, in the dynamic interaction graph. Specifically, a first sample node corresponding to the first sample object and a second sample node corresponding to the second sample object are respectively determined in the dynamic interaction graph, the first sample node and the second sample node are respectively used as current nodes, and the corresponding first sub-graph and the corresponding second sub-graph are determined in a similar manner as in step 32 of fig. 3.
Then, in step 123, the first sub-graph and the second sub-graph are respectively input into the neural network model, and an implicit vector of the first sample object and an implicit vector of the second sample object are respectively obtained. The specific process of determining the implicit vector of the corresponding sample object by the neural network model based on the pointing relationship of the nodes in the subgraph is as described above with reference to step 33, and is not described again.
Then, in step 124, it is predicted whether the first sample object and the second sample object will interact with each other according to the implicit vector of the first sample object and the implicit vector of the second sample object, so as to obtain a prediction result. Generally, a two-class classifier can be used to predict whether two sample objects will interact with each other, and the obtained prediction result generally represents the probability of the interaction between the two sample objects.
Then, at step 125, a prediction loss is determined based on the prediction result. It will be appreciated that the first sample object and the second sample object come from a historical interaction event and therefore did in fact interact, which is equivalent to knowing the relationship label between the two sample objects. The loss of the current prediction can then be determined from the prediction result using a loss function such as cross entropy.
The neural network model is then updated at step 126 based on the predicted loss. Specifically, the parameters in the neural network can be adjusted by adopting gradient descent, back propagation and other modes to update the neural network model until the prediction accuracy of the neural network model meets certain requirements.
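Steps 124-126 (predict, compute cross-entropy loss, update by gradient descent) can be sketched for the prediction head alone; in the full scheme the gradient would also propagate back through the LSTM layers. The logistic head, learning rate, and sample layout are assumptions for illustration.

```python
import numpy as np

def train_step(model_params, h_pairs, labels, lr=0.1):
    """One gradient-descent update of the binary prediction head on a batch of
    (implicit-vector pair, label) samples. Positive samples come from historical
    interaction events; negatives are pairs with no interaction relation."""
    W, b = model_params
    Z = np.array([np.concatenate(pair) for pair in h_pairs])   # (n, 2d)
    y = np.asarray(labels, dtype=float)
    p = 1.0 / (1.0 + np.exp(-(Z @ W + b)))                     # step 124: predicted probability
    # Step 125: cross-entropy prediction loss.
    loss = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    # Step 126: gradient descent on the head parameters.
    grad_logit = (p - y) / len(y)
    W -= lr * (Z.T @ grad_logit)
    b -= lr * grad_logit.sum()
    return (W, b), loss
```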
In the above, the prediction of the object relationship by using two sample objects in the historical interaction event is equivalent to training by using a positive sample. In one embodiment, two sample objects without interaction relation can be found in the dynamic interaction diagram and used as negative samples for further training, so that a better training effect is achieved.
According to another embodiment, the neural network model is trained by predicting a classification of the interactive object. Fig. 13 shows a flowchart of training a neural network model in this embodiment. As shown in fig. 13, in step 131, a sample object is selected from the objects involved in the interaction event set, and the classification label of the sample object is obtained. The sample object may be any interaction object in any event contained in the set of interaction events, and the class label for the sample object may be a label associated with a business scenario. For example, in the case that the sample object is a user, the classification label may be a label of a preset crowd classification, or a label of a user risk degree classification; in the case where the sample object is a commodity, the classification tag may be a tag of a commodity classification. Such tags may be generated by manual labeling or by other business related processes.
In step 132, a sample sub-graph corresponding to the sample object is determined in the dynamic interaction graph. Specifically, a node corresponding to the sample object may be determined in the dynamic interaction graph, and the node is used as the current node to determine a corresponding sample sub-graph in a manner similar to that in step 32 of fig. 3.
Then, in step 133, the sample subgraph is input into the neural network model to obtain an implicit vector of the sample object. This process is as described above in connection with step 33 and will not be described in detail.
Then, in step 134, the classification of the sample object is predicted according to the implicit vector of the sample object, and a prediction result is obtained. A classifier may be employed to predict respective probabilities that a sample object belongs to respective classes as a prediction result.
Then, at step 135, a prediction loss is determined based on the prediction result and the classification label. Specifically, for example, a cross-entropy calculation may be adopted to determine the loss of this prediction from the predicted class probabilities and the classification label.
At step 136, the neural network model is updated based on the predicted loss. In this manner, the neural network model is trained by the task of predicting the classification of sample objects.
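The classifier of step 134 can be sketched as a softmax head over the sample object's implicit vector, yielding the per-class probabilities that are compared against the classification label in the cross-entropy loss of step 135. The linear-softmax form is an assumption.

```python
import numpy as np

def classify_object(h, W, b):
    """Per-class probabilities for a sample object from its implicit vector h.
    W has shape (num_classes, d); b has shape (num_classes,)."""
    logits = W @ h + b
    e = np.exp(logits - logits.max())   # subtract max for numerical stability
    return e / e.sum()
```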
In summary, in the solution of the embodiment of the present specification, a dynamic interaction graph is constructed based on an interaction event set, and the dynamic interaction graph may reflect a time sequence relationship of each interaction event and an interaction effect between interaction objects transmitted through each interaction event. The dynamic interaction graph allows nodes to connect to an unlimited number of associated nodes, taking into account the possibility of interaction events occurring simultaneously, thereby forming a mixed-multivariate interaction graph. And extracting to obtain an implicit vector of the interactive object based on a subgraph related to the interactive object to be analyzed in the dynamic interactive graph by using the trained LSTM neural network model. The influence of other interaction objects in each interaction event on the implicit vector is introduced into the implicit vector, so that the deep characteristics of the interaction objects can be comprehensively expressed for service processing.
According to an embodiment of another aspect, an apparatus for processing interactive data is provided, which may be deployed in any device, platform or device cluster having computing and processing capabilities. FIG. 14 shows a schematic block diagram of an apparatus for processing interaction data according to one embodiment. As shown in fig. 14, the processing device 140 includes:
the interaction graph acquiring unit 141 is configured to acquire a dynamic interaction graph constructed according to an interaction event set, where the interaction event set includes a plurality of interaction events, and each interaction event includes at least two objects where an interaction behavior occurs and an interaction time; the dynamic interaction graph comprises any first node, the first node corresponds to a first object in an interaction event occurring at a first time, the first node points to M associated nodes corresponding to N associated events through a connecting edge, the N associated events all occur at a second time and all comprise the first object as one of the interaction objects, and the second time is the previous time when the interaction behavior occurs in the first object after the first time is traced forwards; the dynamic interaction graph comprises at least one multi-element node with the number of associated nodes larger than 2;
a sub-graph determining unit 142 configured to determine, in the dynamic interaction graph, a current sub-graph corresponding to a current node to be analyzed, where the current sub-graph includes nodes within a predetermined range that are reached via a connecting edge from the current node;
the sub-graph processing unit 143 is configured to input the current sub-graph into a neural network model, where the neural network model includes an LSTM layer, and the LSTM layer sequentially iterates and processes each node according to a directional relationship of a connection edge between each node in the current sub-graph, so as to obtain an implicit vector of the current node; wherein each node comprises a second node, and the sequentially iterative processing of each node comprises determining an implicit vector and an intermediate vector of the second node at least according to the node characteristics of the second node and the intermediate vector and implicit vectors of the k associated nodes pointed by the second node;
the service processing unit 144 is configured to perform service processing related to the current node according to the implicit vector of the current node.
In one embodiment, the object comprises a user, and the interaction event comprises at least one of: click events, social events, transaction events.
In different embodiments, the M associated nodes may be 2N nodes, and respectively correspond to two objects included in each associated event of the N associated events; alternatively, there may be N +1 nodes, corresponding to N other objects of the N associated events that interact with the first object, respectively, and the first object itself.
In various embodiments, the nodes within the predetermined range may include: nodes within the connecting edge of the preset order K; and/or nodes with interaction time within a preset time range.
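The two range criteria above, a preset order K of connecting edges and a preset time window, can be sketched as a breadth-first traversal over the dynamic interaction graph; the adjacency and time representation is illustrative.

```python
from collections import deque

def current_subgraph(adj, times, root, max_hops=None, t_min=None):
    """Nodes reachable from `root` via connecting edges, limited by a preset
    order of hops (max_hops = K) and/or a time window (interaction time >= t_min).

    adj:   node -> list of associated nodes it points to (earlier events)
    times: node -> interaction time of that node
    """
    seen, queue = {root}, deque([(root, 0)])
    while queue:
        node, hops = queue.popleft()
        if max_hops is not None and hops >= max_hops:
            continue  # do not expand beyond K-th order connecting edges
        for nxt in adj.get(node, []):
            if t_min is not None and times[nxt] < t_min:
                continue  # outside the preset time range
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return seen
```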
According to one embodiment, the aforementioned current node is a node that: in the dynamic interaction graph, there is no connecting edge pointing to the node.
In one embodiment, each interaction event further comprises a behavior characteristic of the interaction behavior; in this case, the node characteristics of the second node may include attribute characteristics of an object corresponding to the second node and behavior characteristics of an interaction event in which the second node participates in the corresponding interaction time.
In one embodiment, the LSTM layer in the neural network model utilized by subgraph processing unit 143 is specifically used to:
combining the node characteristics of the second node with k implicit vectors corresponding to the k associated nodes respectively, inputting a first transformation function and a second transformation function which have the same algorithm and different parameters, and obtaining k first transformation vectors and k second transformation vectors respectively;
combining the intermediate vector of the ith associated node in the k associated nodes with the corresponding ith first transformation vector and the ith second transformation vector to obtain k operation results, and summing the k operation results to obtain a combined vector;
respectively inputting the node characteristics of the second node and the k implicit vectors into a third transformation function and a fourth transformation function to respectively obtain a third transformation vector and a fourth transformation vector;
determining an intermediate vector for the second node based on the combined vector and a third transformed vector;
determining an implicit vector for the second node based on the intermediate vector and a fourth transform vector for the second node.
According to one embodiment, the LSTM layer in the neural network model utilized by the sub-graph processing unit 143 is used to: determine the implicit vector and the intermediate vector of the second node according to the node characteristics of the second node, the respective intermediate vectors and implicit vectors of the k associated nodes pointed to by the second node, and the time difference between the interaction time corresponding to the second node and the interaction times corresponding to the k associated nodes.
More specifically, in one embodiment, the LSTM layer is specifically configured to:
respectively combining the node characteristics of the second node and the time difference with k implicit vectors corresponding to the k associated nodes, and inputting a first transformation function to obtain k first transformation vectors;
respectively combining the node characteristics of the second node with k implicit vectors corresponding to the k associated nodes, and inputting a second transformation function to obtain k second transformation vectors;
combining the intermediate vector of the ith associated node in the k associated nodes with the corresponding ith first transformation vector and ith second transformation vector to obtain k operation results, and summing the k operation results to obtain a combined vector;
respectively inputting the node characteristics of the second node and the k implicit vectors into a third transformation function and a fourth transformation function to respectively obtain a third transformation vector and a fourth transformation vector;
determining an intermediate vector for the second node based on the combined vector and a third transformed vector;
determining an implicit vector for the second node based on the intermediate vector and a fourth transform vector for the second node.
In another embodiment, the LSTM layer is specifically configured to:
respectively combining the node characteristics of the second node and the time difference with k implicit vectors corresponding to the k associated nodes, and then inputting a first transformation function and a second transformation function which have the same algorithm and different parameters to respectively obtain k first transformation vectors and k second transformation vectors;
combining the intermediate vector of the ith associated node in the k associated nodes with the corresponding ith first transformation vector and ith second transformation vector to obtain k operation results, and summing the k operation results to obtain a combined vector;
respectively inputting the node characteristics of the second node and the k implicit vectors into a third transformation function and a fourth transformation function to respectively obtain a third transformation vector and a fourth transformation vector;
determining an intermediate vector for the second node based on the combined vector and a third transformed vector;
determining an implicit vector for the second node based on the intermediate vector and a fourth transform vector for the second node.
According to one embodiment, the neural network model comprises a plurality of LSTM layers, wherein the implicit vector of the second node determined by the last LSTM layer is input to the next LSTM layer as the node feature of the second node.
Under the condition, in one embodiment, the neural network model synthesizes implicit vectors of the current node output by each of the plurality of LSTM layers to obtain a final implicit vector of the current node.
In another embodiment, the neural network model takes the implicit vector of the current node output by the last LSTM layer of the plurality of LSTM layers as the final implicit vector of the current node.
According to one embodiment, the neural network model is trained by the model training unit 145. The model training unit 145 may be included in the apparatus 140 or may be external thereto. The model training unit 145 may include (not shown):
the sample acquisition module is configured to acquire historical interaction events, wherein the historical interaction events comprise a first sample object and a second sample object;
a sub-graph determining module configured to determine a first sub-graph corresponding to the first sample object and a second sub-graph corresponding to the second sample object, respectively, in the dynamic interaction graph;
the vector acquisition module is configured to input the first sub-graph and the second sub-graph into the neural network model respectively to obtain an implicit vector of the first sample object and an implicit vector of the second sample object respectively;
the prediction module is configured to predict whether the first sample object and the second sample object are interacted or not according to the implicit vector of the first sample object and the implicit vector of the second sample object to obtain a prediction result;
a loss determination module configured to determine a predicted loss based on the prediction result;
an update module configured to update the neural network model based on the predicted loss.
In another embodiment, the model training unit 145 may include (not shown):
a sample acquisition module configured to select a sample object from a plurality of sample objects related to the set of interaction events and acquire a class label of the sample object;
a subgraph determination module configured to determine a sample subgraph corresponding to the sample object in the dynamic interaction graph;
the vector acquisition module is configured to input the sample subgraph into the neural network model to obtain an implicit vector of the sample object;
the prediction module is configured to predict the classification of the sample object according to the implicit vector of the sample object to obtain a prediction result;
a loss determination module configured to determine a predicted loss based on the prediction result and the classification label;
an update module configured to update the neural network model based on the predicted loss.
With the above apparatus, the interaction objects are processed by the neural network model on the basis of the dynamic interaction graph, and feature vectors suitable for subsequent analysis are obtained.
According to an embodiment of another aspect, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method described in connection with fig. 3.
According to an embodiment of yet another aspect, there is also provided a computing device comprising a memory and a processor, the memory having stored therein executable code, the processor, when executing the executable code, implementing the method described in connection with fig. 3.
Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in this invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
The above-mentioned embodiments, objects, technical solutions and advantages of the present invention are further described in detail, it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made on the basis of the technical solutions of the present invention should be included in the scope of the present invention.

Claims (30)

1. A method of processing interaction data, the method comprising:
acquiring a dynamic interaction graph constructed according to an interaction event set, wherein the interaction event set comprises a plurality of interaction events, and each interaction event at least comprises two objects with interaction behaviors and interaction time; the dynamic interaction graph comprises any first node, the first node corresponds to a first object in an interaction event occurring at a first time, the first node points to M associated nodes corresponding to N associated events through a connecting edge, the N associated events all occur at a second time and all comprise the first object as one of the interaction objects, and the second time is the previous time when the interaction behavior occurs in the first object after the first time is traced forwards; the dynamic interaction graph comprises at least one multi-element node with the number of associated nodes larger than 2;
determining a current sub-graph corresponding to a current node to be analyzed in the dynamic interaction graph, wherein the current sub-graph comprises nodes which start from the current node and reach a predetermined range through a connecting edge;
inputting the current sub-graph into a neural network model, wherein the neural network model comprises an LSTM layer, and the LSTM layer sequentially iterates and processes each node according to the directional relation of a connecting edge between each node in the current sub-graph, so as to obtain an implicit vector of the current node; wherein each node comprises a second node, and the sequentially iterative processing of each node comprises determining an implicit vector and an intermediate vector of the second node at least according to the node characteristics of the second node and the intermediate vector and implicit vectors of the k associated nodes pointed by the second node;
and carrying out service processing related to the current node according to the implicit vector of the current node.
2. The method of claim 1, wherein the object comprises a user, the interaction event comprising at least one of: click events, social events, transaction events.
3. The method of claim 1, wherein,
the M associated nodes are 2N nodes and respectively correspond to two objects included by each associated event in the N associated events; alternatively, the first and second electrodes may be,
the M associated nodes are N +1 nodes, and respectively correspond to N other objects that interact with the first object in the N associated events, and the first object itself.
4. The method of claim 1, wherein the nodes within the predetermined range comprise:
nodes within the connecting edge of the preset order K; and/or
And nodes with the interaction time within a preset time range.
5. The method of claim 1, wherein each of the interaction events further comprises a behavior characteristic of an interaction behavior;
the node characteristics of the second node include attribute characteristics of an object corresponding to the second node and behavior characteristics of an interaction event participated in by the second node in the corresponding interaction time.
6. The method of claim 1, wherein the determining implicit and intermediate vectors for the second node comprises:
combining the node characteristics of the second node respectively with the k implicit vectors corresponding to the k associated nodes, and inputting the combinations into a first transformation function and a second transformation function that share the same algorithm but use different parameters, to obtain k first transformation vectors and k second transformation vectors respectively;
combining the intermediate vector of the i-th associated node among the k associated nodes with the corresponding i-th first transformation vector and i-th second transformation vector to obtain k operation results, and summing the k operation results to obtain a combined vector;
inputting the node characteristics of the second node and the k implicit vectors into a third transformation function and a fourth transformation function respectively, to obtain a third transformation vector and a fourth transformation vector;
determining the intermediate vector of the second node based on the combined vector and the third transformation vector;
determining the implicit vector of the second node based on the intermediate vector of the second node and the fourth transformation vector.
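Claim 6's update resembles a Tree-LSTM cell. The numpy sketch below is not the patentee's implementation; the concrete gate formulas (sigmoid and tanh nonlinearities, elementwise products, and sum-pooling of the k implicit vectors) are our assumptions, since the claim fixes only which inputs each of the four transformation functions receives:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 4                                   # feature/state dimension (assumed)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Parameters of the four transformation functions; random for the sketch.
W1, W2, W3, W4 = (rng.normal(size=(D, 2 * D)) for _ in range(4))

def node_update(x, hs, cs):
    """x: node characteristics (D,); hs/cs: implicit and intermediate vectors
    of the k associated nodes the second node points to. Returns (c, h)."""
    # First/second transformation functions: same algorithm, different params.
    firsts  = [sigmoid(W1 @ np.concatenate([x, h])) for h in hs]
    seconds = [sigmoid(W2 @ np.concatenate([x, h])) for h in hs]
    # Combine each c_i with its first/second transformation vectors and sum
    # (elementwise products; one plausible reading of "combining").
    combined = sum(f * c + s * c for f, s, c in zip(firsts, seconds, cs))
    # Third/fourth transformation functions take x and all k implicit
    # vectors; we pool the hs by summation (an assumption).
    pooled = np.sum(hs, axis=0)
    third  = np.tanh(W3 @ np.concatenate([x, pooled]))
    fourth = sigmoid(W4 @ np.concatenate([x, pooled]))
    c = combined + third                # intermediate vector of the node
    h = fourth * np.tanh(c)             # implicit vector of the node
    return c, h

x  = rng.normal(size=D)
hs = [rng.normal(size=D) for _ in range(2)]   # k = 2 associated nodes
cs = [rng.normal(size=D) for _ in range(2)]
c, h = node_update(x, hs, cs)
print(c.shape, h.shape)                 # (4,) (4,)
```

Note how the structure generalizes an ordinary LSTM step: instead of one predecessor state, the cell aggregates over k predecessor states, one per associated node.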
7. The method according to claim 1, wherein the sequential iterative processing of each node comprises: determining the implicit vector and the intermediate vector of the second node according to the node characteristics of the second node, the respective intermediate vectors and implicit vectors of the k associated nodes pointed to by the second node, and the time differences between the interaction time corresponding to the second node and the interaction times corresponding to the k associated nodes.
8. The method of claim 7, wherein the determining implicit and intermediate vectors for the second node comprises:
combining the node characteristics of the second node and the time differences respectively with the k implicit vectors corresponding to the k associated nodes, and inputting the combinations into a first transformation function to obtain k first transformation vectors;
combining the node characteristics of the second node respectively with the k implicit vectors corresponding to the k associated nodes, and inputting the combinations into a second transformation function to obtain k second transformation vectors;
combining the intermediate vector of the i-th associated node among the k associated nodes with the corresponding i-th first transformation vector and i-th second transformation vector to obtain k operation results, and summing the k operation results to obtain a combined vector;
inputting the node characteristics of the second node and the k implicit vectors into a third transformation function and a fourth transformation function respectively, to obtain a third transformation vector and a fourth transformation vector;
determining the intermediate vector of the second node based on the combined vector and the third transformation vector;
determining the implicit vector of the second node based on the intermediate vector of the second node and the fourth transformation vector.
9. The method of claim 7, wherein the determining implicit and intermediate vectors for the second node comprises:
combining the node characteristics of the second node and the time differences respectively with the k implicit vectors corresponding to the k associated nodes, and inputting the combinations into a first transformation function and a second transformation function that share the same algorithm but use different parameters, to obtain k first transformation vectors and k second transformation vectors respectively;
combining the intermediate vector of the i-th associated node among the k associated nodes with the corresponding i-th first transformation vector and i-th second transformation vector to obtain k operation results, and summing the k operation results to obtain a combined vector;
inputting the node characteristics of the second node and the k implicit vectors into a third transformation function and a fourth transformation function respectively, to obtain a third transformation vector and a fourth transformation vector;
determining the intermediate vector of the second node based on the combined vector and the third transformation vector;
determining the implicit vector of the second node based on the intermediate vector of the second node and the fourth transformation vector.
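Claims 8 and 9 differ from claim 6 only in where the time difference enters: in claim 8 the first transformation function also receives it, and in claim 9 the second one does as well. The sketch below shows only the gate computation for the claim-8 form; the parameter shapes and the sigmoid nonlinearity are our assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
D = 4                                    # state dimension (assumed)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# In the claim-8 variant only the first transformation function sees the
# time difference; in claim 9 the second one would take it as well.
W1 = rng.normal(size=(D, 2 * D + 1))     # input: [x, h_i, dt_i]
W2 = rng.normal(size=(D, 2 * D))         # input: [x, h_i]

def time_aware_gates(x, hs, dts):
    """Return the k first and k second transformation vectors for a node with
    characteristics x, associated implicit vectors hs and time differences dts."""
    firsts = [sigmoid(W1 @ np.concatenate([x, h, [dt]]))
              for h, dt in zip(hs, dts)]
    seconds = [sigmoid(W2 @ np.concatenate([x, h])) for h in hs]
    return firsts, seconds

x = rng.normal(size=D)
hs = [rng.normal(size=D) for _ in range(2)]   # k = 2 associated nodes
dts = [3.0, 7.0]                              # elapsed time to each associated event
firsts, seconds = time_aware_gates(x, hs, dts)
print(len(firsts), firsts[0].shape)           # 2 (4,)
```

Feeding the time difference into a gate lets the cell attenuate stale states: the longer ago an associated interaction happened, the more its intermediate vector can be discounted.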
10. The method of claim 1, wherein the neural network model comprises a plurality of LSTM layers, and the implicit vector of the second node determined by a previous LSTM layer is input to the next LSTM layer as the node characteristics of the second node.
11. The method of claim 10, wherein the neural network model integrates the implicit vectors of the current node output by each of the plurality of LSTM layers to obtain a final implicit vector of the current node.
12. The method of claim 10, wherein the neural network model takes an implicit vector of a current node output by a last LSTM layer of the plurality of LSTM layers as a final implicit vector of the current node.
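Claims 10 to 12 describe how multiple LSTM layers chain and how their outputs are read out. The sketch below abstracts each layer to a callable so the chaining and the two readout options are visible; the names and the averaging used to "integrate" the layer outputs are illustrative assumptions:

```python
# Claim 10: the implicit vectors a layer produces become the next layer's
# node characteristics. Each layer_fn maps a feature dict to an implicit dict.
def run_layers(features, layer_fns):
    outputs = []
    for fn in layer_fns:
        features = fn(features)          # implicit vectors feed the next layer
        outputs.append(features)
    return outputs                       # one dict per layer

# Toy layers that just scale every feature, to show the chaining.
double = lambda feats: {n: 2 * v for n, v in feats.items()}
outs = run_layers({"u": 1.0, "v": 3.0}, [double, double, double])

print(outs[-1]["u"])                     # 8.0 (claim 12: take the last layer's output)
final = sum(layer["u"] for layer in outs) / len(outs)  # claim 11: integrate all layers
print(final)                             # (2 + 4 + 8) / 3 ≈ 4.667
```

Averaging is only one way to "integrate" per claim 11; a weighted sum or concatenation would satisfy the same claim language.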
13. The method of claim 1, wherein the neural network model is trained by:
acquiring a historical interaction event, wherein the historical interaction event comprises a first sample object and a second sample object;
respectively determining a first sub-graph corresponding to the first sample object and a second sub-graph corresponding to the second sample object in the dynamic interaction graph;
inputting the first sub-graph and the second sub-graph into the neural network model respectively to obtain an implicit vector of the first sample object and an implicit vector of the second sample object respectively;
predicting whether the first sample object and the second sample object will interact according to the implicit vector of the first sample object and the implicit vector of the second sample object, to obtain a prediction result;
determining a prediction loss according to the prediction result;
updating the neural network model according to the prediction loss.
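The claim-13 training loop can be sketched end to end. Here the "neural network model" is stubbed by a tiny encoder (a sub-graph is reduced to a feature vector), and the interaction prediction is a sigmoid over the dot product of the two implicit vectors; both choices, and all names, are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
D = 4
w = rng.normal(size=D)                  # parameters of a toy stand-in model

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def encode(sub):
    """Stub for 'inputting the sub-graph into the neural network model':
    here a sub-graph is reduced to a feature vector and embedded by tanh."""
    return np.tanh(w * sub)

def train_step(sub1, sub2, lr=0.1):
    """One step of the claim-13 loop for a positive pair (label = 1)."""
    global w
    h1, h2 = encode(sub1), encode(sub2)
    p = sigmoid(h1 @ h2)                # predicted P(the two objects interact)
    loss = -np.log(p)                   # prediction loss for label 1
    # Exact gradient of the loss w.r.t. w, chained through both encodings.
    dz = sub1 * (1 - h1**2) * h2 + h1 * sub2 * (1 - h2**2)
    w -= lr * (-(1.0 - p)) * dz         # update the model on the loss
    return float(loss)

sub1, sub2 = rng.normal(size=D), rng.normal(size=D)
losses = [train_step(sub1, sub2) for _ in range(30)]
print(round(losses[0], 3), round(losses[-1], 3))
```

In the real method the encoder is the stacked LSTM layers of claims 1 and 10, and gradients would flow back through the whole sub-graph iteration.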
14. The method of claim 1, wherein the neural network model is trained by:
selecting a sample object from a plurality of sample objects related to the interaction event set, and obtaining a classification label of the sample object;
determining a sample sub-graph corresponding to the sample object in the dynamic interaction graph;
inputting the sample subgraph into the neural network model to obtain an implicit vector of the sample object;
predicting the classification of the sample object according to the implicit vector of the sample object to obtain a prediction result;
determining a prediction loss according to the prediction result and the classification label;
updating the neural network model according to the prediction loss.
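Claim 14's alternative supervises the implicit vector with a classification label instead of a link prediction. A minimal sketch with a linear softmax head over the sample object's implicit vector; the head, the learning rate, and the cross-entropy loss are assumptions, as the claim does not fix them:

```python
import numpy as np

rng = np.random.default_rng(3)
D, C = 4, 3                              # embedding dim, number of classes
W = rng.normal(size=(C, D)) * 0.1        # classification head (assumed linear)

def predict_class(h):
    """Softmax class probabilities from an implicit vector h."""
    logits = W @ h
    e = np.exp(logits - logits.max())
    return e / e.sum()

def train_step(h, label, lr=0.5):
    global W
    probs = predict_class(h)
    loss = -np.log(probs[label])         # cross-entropy with the class label
    grad = np.outer(probs - np.eye(C)[label], h)
    W -= lr * grad                       # update the head on the loss
    return float(loss)

h = rng.normal(size=D)                   # implicit vector of the sample object
losses = [train_step(h, label=2) for _ in range(20)]
print(round(losses[0], 3), round(losses[-1], 3))
```

In practice h would come from running the sample sub-graph through the model, and the gradient would also update the LSTM layers, not just the head.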
15. An apparatus for processing interaction data, the apparatus comprising:
the interaction graph acquisition unit is configured to acquire a dynamic interaction graph constructed according to an interaction event set, wherein the interaction event set comprises a plurality of interaction events, and each interaction event comprises at least two objects with interaction behavior and an interaction time; the dynamic interaction graph comprises any first node, the first node corresponds to a first object in an interaction event occurring at a first time, and the first node points, through connecting edges, to M associated nodes corresponding to N associated events, wherein the N associated events all occur at a second time and all comprise the first object as one of the interaction objects, and the second time is the most recent time, tracing back from the first time, at which the first object engaged in interaction behavior; the dynamic interaction graph comprises at least one multi-element node whose number of associated nodes is greater than 2;
a subgraph determining unit configured to determine a current subgraph corresponding to a current node to be analyzed in the dynamic interaction graph, wherein the current subgraph comprises nodes which start from the current node and reach a predetermined range through a connecting edge;
the sub-graph processing unit is configured to input the current sub-graph into a neural network model, wherein the neural network model comprises an LSTM layer, and the LSTM layer iteratively processes each node in turn according to the direction of the connecting edges between the nodes in the current sub-graph, thereby obtaining an implicit vector of the current node; wherein each node comprises a second node, and the sequential iterative processing of each node comprises determining an implicit vector and an intermediate vector of the second node at least according to the node characteristics of the second node and the respective intermediate vectors and implicit vectors of the k associated nodes pointed to by the second node;
and the service processing unit is configured to perform service processing related to the current node according to the implicit vector of the current node.
16. The apparatus of claim 15, wherein the object comprises a user, the interaction event comprising at least one of: click events, social events, transaction events.
17. The apparatus according to claim 15, wherein the M associated nodes are 2N nodes, respectively corresponding to the two objects included in each of the N associated events; alternatively,
the M associated nodes are N+1 nodes, respectively corresponding to the N other objects that interact with the first object in the N associated events, and to the first object itself.
18. The apparatus of claim 15, wherein the nodes within the predetermined range comprise:
nodes reachable within connecting edges of a preset order K; and/or
nodes whose interaction time is within a preset time range.
19. The apparatus of claim 15, wherein each of the interaction events further comprises a behavior characteristic of an interaction behavior;
the node characteristics of the second node include attribute characteristics of the object corresponding to the second node, and behavior characteristics of the interaction event in which the second node participates at the corresponding interaction time.
20. The apparatus of claim 15, wherein the LSTM layer is configured to:
combine the node characteristics of the second node respectively with the k implicit vectors corresponding to the k associated nodes, and input the combinations into a first transformation function and a second transformation function that share the same algorithm but use different parameters, to obtain k first transformation vectors and k second transformation vectors respectively;
combine the intermediate vector of the i-th associated node among the k associated nodes with the corresponding i-th first transformation vector and i-th second transformation vector to obtain k operation results, and sum the k operation results to obtain a combined vector;
input the node characteristics of the second node and the k implicit vectors into a third transformation function and a fourth transformation function respectively, to obtain a third transformation vector and a fourth transformation vector;
determine the intermediate vector of the second node based on the combined vector and the third transformation vector;
determine the implicit vector of the second node based on the intermediate vector of the second node and the fourth transformation vector.
21. The apparatus of claim 15, wherein the LSTM layer is configured to: determine the implicit vector and the intermediate vector of the second node according to the node characteristics of the second node, the respective intermediate vectors and implicit vectors of the k associated nodes pointed to by the second node, and the time differences between the interaction time corresponding to the second node and the interaction times corresponding to the k associated nodes.
22. The apparatus of claim 21, wherein the LSTM layer is specifically configured to:
combine the node characteristics of the second node and the time differences respectively with the k implicit vectors corresponding to the k associated nodes, and input the combinations into a first transformation function to obtain k first transformation vectors;
combine the node characteristics of the second node respectively with the k implicit vectors corresponding to the k associated nodes, and input the combinations into a second transformation function to obtain k second transformation vectors;
combine the intermediate vector of the i-th associated node among the k associated nodes with the corresponding i-th first transformation vector and i-th second transformation vector to obtain k operation results, and sum the k operation results to obtain a combined vector;
input the node characteristics of the second node and the k implicit vectors into a third transformation function and a fourth transformation function respectively, to obtain a third transformation vector and a fourth transformation vector;
determine the intermediate vector of the second node based on the combined vector and the third transformation vector;
determine the implicit vector of the second node based on the intermediate vector of the second node and the fourth transformation vector.
23. The apparatus of claim 21, wherein the LSTM layer is specifically configured to:
combine the node characteristics of the second node and the time differences respectively with the k implicit vectors corresponding to the k associated nodes, and input the combinations into a first transformation function and a second transformation function that share the same algorithm but use different parameters, to obtain k first transformation vectors and k second transformation vectors respectively;
combine the intermediate vector of the i-th associated node among the k associated nodes with the corresponding i-th first transformation vector and i-th second transformation vector to obtain k operation results, and sum the k operation results to obtain a combined vector;
input the node characteristics of the second node and the k implicit vectors into a third transformation function and a fourth transformation function respectively, to obtain a third transformation vector and a fourth transformation vector;
determine the intermediate vector of the second node based on the combined vector and the third transformation vector;
determine the implicit vector of the second node based on the intermediate vector of the second node and the fourth transformation vector.
24. The apparatus of claim 15, wherein the neural network model comprises a plurality of LSTM layers, and the implicit vector of the second node determined by a previous LSTM layer is input to the next LSTM layer as the node characteristics of the second node.
25. The apparatus of claim 24, wherein the neural network model integrates the implicit vectors of the current node output by each of the plurality of LSTM layers to obtain a final implicit vector of the current node.
26. The apparatus of claim 24, wherein the neural network model takes an implicit vector of a current node output by a last LSTM layer of the plurality of LSTM layers as a final implicit vector of the current node.
27. The apparatus of claim 15, wherein the neural network model is trained by a model training unit comprising:
the sample acquisition module is configured to acquire historical interaction events, wherein the historical interaction events comprise a first sample object and a second sample object;
a sub-graph determining module configured to determine a first sub-graph corresponding to the first sample object and a second sub-graph corresponding to the second sample object, respectively, in the dynamic interaction graph;
the vector acquisition module is configured to input the first sub-graph and the second sub-graph into the neural network model respectively to obtain an implicit vector of the first sample object and an implicit vector of the second sample object respectively;
the prediction module is configured to predict whether the first sample object and the second sample object will interact according to the implicit vector of the first sample object and the implicit vector of the second sample object, to obtain a prediction result;
a loss determination module configured to determine a predicted loss based on the prediction result;
an update module configured to update the neural network model based on the predicted loss.
28. The apparatus of claim 15, wherein the neural network model is trained by a model training unit comprising:
a sample acquisition module configured to select a sample object from a plurality of sample objects related to the interaction event set, and to obtain a classification label of the sample object;
a subgraph determination module configured to determine a sample subgraph corresponding to the sample object in the dynamic interaction graph;
the vector acquisition module is configured to input the sample subgraph into the neural network model to obtain an implicit vector of the sample object;
the prediction module is configured to predict the classification of the sample object according to the implicit vector of the sample object to obtain a prediction result;
a loss determination module configured to determine a predicted loss based on the prediction result and the classification label;
an update module configured to update the neural network model based on the predicted loss.
29. A computer-readable storage medium, on which a computer program is stored which, when executed in a computer, causes the computer to carry out the method of any one of claims 1-14.
30. A computing device comprising a memory and a processor, wherein the memory has stored therein executable code that, when executed by the processor, performs the method of any of claims 1-14.
CN202010022183.2A 2020-01-09 2020-01-09 Method and device for processing interactive data by using LSTM neural network model Active CN111210008B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202010022183.2A CN111210008B (en) 2020-01-09 2020-01-09 Method and device for processing interactive data by using LSTM neural network model
CN202210602804.3A CN115081589A (en) 2020-01-09 2020-01-09 Method and device for processing interactive data by using LSTM neural network model
PCT/CN2020/138398 WO2021139524A1 (en) 2020-01-09 2020-12-22 Method and apparatus for processing interaction data by using lstm neural network model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010022183.2A CN111210008B (en) 2020-01-09 2020-01-09 Method and device for processing interactive data by using LSTM neural network model

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202210602804.3A Division CN115081589A (en) 2020-01-09 2020-01-09 Method and device for processing interactive data by using LSTM neural network model

Publications (2)

Publication Number Publication Date
CN111210008A true CN111210008A (en) 2020-05-29
CN111210008B CN111210008B (en) 2022-05-24

Family

ID=70786026

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202010022183.2A Active CN111210008B (en) 2020-01-09 2020-01-09 Method and device for processing interactive data by using LSTM neural network model
CN202210602804.3A Pending CN115081589A (en) 2020-01-09 2020-01-09 Method and device for processing interactive data by using LSTM neural network model

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202210602804.3A Pending CN115081589A (en) 2020-01-09 2020-01-09 Method and device for processing interactive data by using LSTM neural network model

Country Status (2)

Country Link
CN (2) CN111210008B (en)
WO (1) WO2021139524A1 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111522866A (en) * 2020-07-03 2020-08-11 支付宝(杭州)信息技术有限公司 Credible subgraph mining method, device and equipment
CN111523682A (en) * 2020-07-03 2020-08-11 支付宝(杭州)信息技术有限公司 Method and device for training interactive prediction model and predicting interactive object
CN112085293A (en) * 2020-09-18 2020-12-15 支付宝(杭州)信息技术有限公司 Method and device for training interactive prediction model and predicting interactive object
CN112307256A (en) * 2020-10-28 2021-02-02 有半岛(北京)信息科技有限公司 Cross-domain recommendation and model training method and device
CN112529115A (en) * 2021-02-05 2021-03-19 支付宝(杭州)信息技术有限公司 Object clustering method and system
WO2021139524A1 (en) * 2020-01-09 2021-07-15 支付宝(杭州)信息技术有限公司 Method and apparatus for processing interaction data by using lstm neural network model
CN113987280A (en) * 2021-10-27 2022-01-28 支付宝(杭州)信息技术有限公司 Method and device for training graph model aiming at dynamic graph
CN116777567A (en) * 2023-08-17 2023-09-19 山东恒诺尚诚信息科技有限公司 Order generation method and system based on artificial intelligence

Families Citing this family (1)

Publication number Priority date Publication date Assignee Title
CN116303996B (en) * 2023-05-25 2023-08-04 江西财经大学 Theme event extraction method based on multifocal graph neural network

Citations (8)

Publication number Priority date Publication date Assignee Title
US20110313736A1 (en) * 2010-06-18 2011-12-22 Bioproduction Group, a California Corporation Method and Algorithm for Modeling and Simulating A Discrete-Event Dynamic System
US8296434B1 (en) * 2009-05-28 2012-10-23 Amazon Technologies, Inc. Providing dynamically scaling computing load balancing
CN109918454A (en) * 2019-02-22 2019-06-21 阿里巴巴集团控股有限公司 The method and device of node insertion is carried out to relational network figure
CN110009093A (en) * 2018-12-07 2019-07-12 阿里巴巴集团控股有限公司 For analyzing the nerve network system and method for relational network figure
CN110490274A (en) * 2019-10-17 2019-11-22 支付宝(杭州)信息技术有限公司 Assess the method and device of alternative events
CN110543935A (en) * 2019-08-15 2019-12-06 阿里巴巴集团控股有限公司 Method and device for processing interactive sequence data
CN110555469A (en) * 2019-08-15 2019-12-10 阿里巴巴集团控股有限公司 Method and device for processing interactive sequence data
CN110598847A (en) * 2019-08-15 2019-12-20 阿里巴巴集团控股有限公司 Method and device for processing interactive sequence data

Family Cites Families (14)

Publication number Priority date Publication date Assignee Title
CN101017508A (en) * 2006-12-21 2007-08-15 四川大学 SoC software-hardware partition method based on discrete Hopfield neural network
US9659265B2 (en) * 2009-10-12 2017-05-23 Oracle International Corporation Methods and systems for collecting and analyzing enterprise activities
US20130232433A1 (en) * 2013-02-01 2013-09-05 Concurix Corporation Controlling Application Tracing using Dynamic Visualization
US9794359B1 (en) * 2014-03-31 2017-10-17 Facebook, Inc. Implicit contacts in an online social network
JP5901712B2 (en) * 2014-08-29 2016-04-13 株式会社日立製作所 Semiconductor device and information processing apparatus
CN106021364B (en) * 2016-05-10 2017-12-12 百度在线网络技术(北京)有限公司 Foundation, image searching method and the device of picture searching dependency prediction model
CN109934706B (en) * 2017-12-15 2021-10-29 创新先进技术有限公司 Transaction risk control method, device and equipment based on graph structure model
CN108446978A (en) * 2018-02-12 2018-08-24 阿里巴巴集团控股有限公司 Handle the method and device of transaction data
US11537719B2 (en) * 2018-05-18 2022-12-27 Deepmind Technologies Limited Deep neural network system for similarity-based graph representations
CN109284864B (en) * 2018-09-04 2021-08-24 广州视源电子科技股份有限公司 Behavior sequence obtaining method and device and user conversion rate prediction method and device
CN109583475B (en) * 2018-11-02 2023-06-30 创新先进技术有限公司 Abnormal information monitoring method and device
CN110659799A (en) * 2019-08-14 2020-01-07 深圳壹账通智能科技有限公司 Attribute information processing method and device based on relational network, computer equipment and storage medium
CN111258469B (en) * 2020-01-09 2021-05-14 支付宝(杭州)信息技术有限公司 Method and device for processing interactive sequence data
CN111210008B (en) * 2020-01-09 2022-05-24 支付宝(杭州)信息技术有限公司 Method and device for processing interactive data by using LSTM neural network model

Patent Citations (8)

Publication number Priority date Publication date Assignee Title
US8296434B1 (en) * 2009-05-28 2012-10-23 Amazon Technologies, Inc. Providing dynamically scaling computing load balancing
US20110313736A1 (en) * 2010-06-18 2011-12-22 Bioproduction Group, a California Corporation Method and Algorithm for Modeling and Simulating A Discrete-Event Dynamic System
CN110009093A (en) * 2018-12-07 2019-07-12 阿里巴巴集团控股有限公司 For analyzing the nerve network system and method for relational network figure
CN109918454A (en) * 2019-02-22 2019-06-21 阿里巴巴集团控股有限公司 The method and device of node insertion is carried out to relational network figure
CN110543935A (en) * 2019-08-15 2019-12-06 阿里巴巴集团控股有限公司 Method and device for processing interactive sequence data
CN110555469A (en) * 2019-08-15 2019-12-10 阿里巴巴集团控股有限公司 Method and device for processing interactive sequence data
CN110598847A (en) * 2019-08-15 2019-12-20 阿里巴巴集团控股有限公司 Method and device for processing interactive sequence data
CN110490274A (en) * 2019-10-17 2019-11-22 支付宝(杭州)信息技术有限公司 Assess the method and device of alternative events

Cited By (9)

Publication number Priority date Publication date Assignee Title
WO2021139524A1 (en) * 2020-01-09 2021-07-15 支付宝(杭州)信息技术有限公司 Method and apparatus for processing interaction data by using lstm neural network model
CN111522866A (en) * 2020-07-03 2020-08-11 支付宝(杭州)信息技术有限公司 Credible subgraph mining method, device and equipment
CN111523682A (en) * 2020-07-03 2020-08-11 支付宝(杭州)信息技术有限公司 Method and device for training interactive prediction model and predicting interactive object
CN112085293A (en) * 2020-09-18 2020-12-15 支付宝(杭州)信息技术有限公司 Method and device for training interactive prediction model and predicting interactive object
CN112085293B (en) * 2020-09-18 2022-09-09 支付宝(杭州)信息技术有限公司 Method and device for training interactive prediction model and predicting interactive object
CN112307256A (en) * 2020-10-28 2021-02-02 有半岛(北京)信息科技有限公司 Cross-domain recommendation and model training method and device
CN112529115A (en) * 2021-02-05 2021-03-19 支付宝(杭州)信息技术有限公司 Object clustering method and system
CN113987280A (en) * 2021-10-27 2022-01-28 支付宝(杭州)信息技术有限公司 Method and device for training graph model aiming at dynamic graph
CN116777567A (en) * 2023-08-17 2023-09-19 山东恒诺尚诚信息科技有限公司 Order generation method and system based on artificial intelligence

Also Published As

Publication number Publication date
WO2021139524A1 (en) 2021-07-15
CN115081589A (en) 2022-09-20
CN111210008B (en) 2022-05-24

Similar Documents

Publication Publication Date Title
CN111210008B (en) Method and device for processing interactive data by using LSTM neural network model
CN110598847B (en) Method and device for processing interactive sequence data
CN110555469B (en) Method and device for processing interactive sequence data
CN110543935B (en) Method and device for processing interactive sequence data
CN111814977B (en) Method and device for training event prediction model
US11250088B2 (en) Method and apparatus for processing user interaction sequence data
CN111737546B (en) Method and device for determining entity service attribute
CN110490274B (en) Method and device for evaluating interaction event
CN110689110B (en) Method and device for processing interaction event
CN111242283B (en) Training method and device for evaluating self-encoder of interaction event
CN112085293B (en) Method and device for training interactive prediction model and predicting interactive object
TW202008264A (en) Method and apparatus for recommendation marketing via deep reinforcement learning
CN111523682B (en) Method and device for training interactive prediction model and predicting interactive object
CN111476223B (en) Method and device for evaluating interaction event
CN112580789B (en) Training graph coding network, and method and device for predicting interaction event
CN111258469B (en) Method and device for processing interactive sequence data
CN112989146A (en) Method, apparatus, device, medium, and program product for recommending resources to a target user
CN113656699B (en) User feature vector determining method, related equipment and medium
CN113610610B (en) Session recommendation method and system based on graph neural network and comment similarity
CN113449176A (en) Recommendation method and device based on knowledge graph
CN112085279B (en) Method and device for training interactive prediction model and predicting interactive event
CN113450167A (en) Commodity recommendation method and device
CN115204931A (en) User service policy determination method and device and electronic equipment
CN113538069A (en) Method and device for predicting macroscopic state of user group
CN116992292A (en) Click rate estimation model training method and device and click rate estimation method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant