CN113283589A - Updating method and device of event prediction system - Google Patents

Updating method and device of event prediction system

Info

Publication number
CN113283589A
Authority
CN
China
Prior art keywords: event, vector, node, network, intensity
Legal status: Granted
Application number: CN202110631255.8A
Other languages: Chinese (zh)
Other versions: CN113283589B (en)
Inventors: Siqiao Xue (薛思乔), Xiaoming Shi (师晓明), Lintao Ma (马琳涛), Chen Pan (潘晨), Shijun Wang (王世军), James Zhang (詹姆士·张), Hongyan Hao (郝鸿延)
Current Assignee: Alipay Hangzhou Information Technology Co Ltd
Original Assignee: Alipay Hangzhou Information Technology Co Ltd
Application filed by Alipay Hangzhou Information Technology Co Ltd
Priority to CN202110631255.8A (granted as CN113283589B)
Publication of CN113283589A
Application granted; publication of CN113283589B
Legal status: Active

Classifications

  • G06N 3/045 Combinations of networks
  • G06F 18/2321 Non-hierarchical clustering techniques using statistics or function optimisation, e.g. modelling of probability density functions
  • G06F 18/2415 Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
  • G06N 3/044 Recurrent networks, e.g. Hopfield networks
  • G06N 3/08 Learning methods
  • G06Q 10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"


Abstract

An embodiment of the present specification provides an updating method for an event prediction system. A sample acquired from an event sample sequence is input into the event prediction system for event processing, which comprises: determining, through a sequence coding network, a sequence coding vector of the subsequence up to the sample's occurrence time, where each sample in the subsequence corresponds to a first user; updating, through a graph propagation network, the node characterization vectors related to the first user's node in a user relationship network graph according to the sequence coding vector; fitting, through an intensity fitting network, an event occurrence intensity function for the first user according to the updated node characterization vector; and mapping, through an intensity mapping network, the event occurrence intensity function into the event type space to obtain several intensity functions of the first user under several event types. The network parameters of the event prediction system are then updated based on the intensity functions obtained through event processing and the label sample corresponding to the first user.

Description

Updating method and device of event prediction system
Technical Field
One or more embodiments of the present disclosure relate to the field of machine learning technologies, and in particular, to an update method and apparatus for an event prediction system.
Background
With economic development and technological progress, users frequently use the various services provided by service platforms to meet needs in work and life. While using these services, users generate a large amount of online and offline data. Such behavior data (also called event data or operation data) reflects a user's personal interests and behavior preferences and, if deeply mined and reasonably utilized, can effectively guide service optimization. One data processing approach models a user's behavior sequence to predict which behavior will occur next and at what time; this helps provide personalized services that meet the user's needs and thereby improves user experience. The point process is a sequence modeling technique: user behaviors are abstracted as points in a space, the user behavior sequence is abstracted as a point process sequence, and the intensity function of subsequent behavior events is fitted, enabling prediction of the next event.
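Concretely, this abstraction treats the behavior sequence as a marked point process: each event becomes a (timestamp, event type) point, and the sequence becomes a time-ordered list of such points. The sketch below uses the meal/sleep/play example from the end of this document; the variable names are illustrative, not from the patent:

```python
# A user behavior sequence abstracted as a marked point process:
# each event is a point (occurrence time in hours, event type).
sequence = [
    (12.0, "meal"),   # having a meal at 12 noon
    (13.0, "sleep"),  # sleeping at 1 p.m.
    (15.0, "play"),   # playing at 3 p.m.
]

times = [t for t, _ in sequence]      # the point process's timestamps
types = sorted({k for _, k in sequence})  # the finite set of event types
```

A point process model then fits, from `times` and the marks, an intensity function per event type, from which the next (timestamp, type) point is predicted.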
However, existing point process algorithms struggle to meet the high demands of event prediction in practical applications. A solution is therefore needed to effectively improve the performance of point process algorithms and thereby optimize the accuracy and usability of event prediction results.
Disclosure of Invention
The updating method and apparatus for an event prediction system described in one or more embodiments of this specification optimize the point process algorithm, so that the intensity function obtained through the optimized algorithm is more accurate, thereby effectively improving the accuracy and usability of event prediction results.
According to a first aspect, an updating method of an event prediction system is provided, comprising: sequentially acquiring event samples, as first event samples, from an event sample sequence arranged in time order, where the sample attributes of a first event sample include a first occurrence time and a first user identifier; and inputting the first event sample into an event prediction system for event processing, the system comprising a sequence coding network, a graph propagation network, an intensity fitting network and an intensity mapping network. The event processing comprises: determining, through the sequence coding network, a sequence coding vector of the subsequence up to the first occurrence time, where each event sample in the subsequence corresponds to the first user identifier; updating, through the graph propagation network, the node characterization vectors of the first user node and its neighbor nodes in the user relationship network graph according to the sequence coding vector; determining, through the intensity fitting network, parameter values of the event occurrence intensity function corresponding to the first user identifier according to the updated first node characterization vector of the first user node; and mapping, through the intensity mapping network, the event occurrence intensity function into the event type space to obtain several intensity functions of the first user identifier under several event types. The network parameters of the event prediction system are then updated based on these intensity functions and a second event sample corresponding to the first user identifier, where the second occurrence time of the second event sample is later than the first occurrence time.
In one embodiment, the sequence coding network includes a linear embedding sub-network and a temporal sub-network. Determining the sequence coding vector of the subsequence up to the first occurrence time comprises: determining, through the linear embedding sub-network, type embedding vectors of the event types corresponding to the event samples in the subsequence; and outputting the sequence coding vector through the temporal sub-network based on the sequentially input type embedding vectors.
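As a rough sketch of this embodiment, the linear embedding sub-network can be a type-embedding lookup table and the temporal sub-network a recurrent cell consuming the embeddings in time order. The plain tanh RNN cell and all dimensions below are illustrative assumptions, not the patent's concrete architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_TYPES, EMB_DIM, HID_DIM = 5, 8, 8

# Linear embedding sub-network: one embedding row per event type.
E = rng.normal(0, 0.1, size=(NUM_TYPES, EMB_DIM))
# Temporal sub-network: a plain tanh RNN cell stands in for whatever
# recurrent unit the patent's temporal sub-network actually uses.
W_x = rng.normal(0, 0.1, size=(HID_DIM, EMB_DIM))
W_h = rng.normal(0, 0.1, size=(HID_DIM, HID_DIM))

def encode_subsequence(event_types):
    """Return the sequence coding vector for one user's subsequence."""
    h = np.zeros(HID_DIM)
    for k in event_types:                  # event samples in time order
        x = E[k]                           # type embedding vector
        h = np.tanh(W_x @ x + W_h @ h)     # recurrent update
    return h                               # final state = sequence coding vector

enc = encode_subsequence([0, 2, 1])
print(enc.shape)  # (8,)
```

The final hidden state summarizes the whole subsequence up to the first occurrence time, which is what the downstream graph propagation network consumes.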
In one embodiment, updating the node characterization vectors of the first user node and its neighbor nodes in the user relationship network graph comprises: taking the first user node as the target node and performing an updating operation for the target node, where the updating operation comprises determining the updated characterization vector of the target node according to the target sequence coding vector corresponding to the target node, the current characterization vectors of the target node's neighbor nodes, and the current characterization vector of the target node itself.
In a specific embodiment, after performing the updating operation with the first user node as the target node, the method further includes performing the updating operation with each neighbor node of the first user node as the target node.
In a more specific embodiment, the graph propagation network includes a local propagation layer, a self-propagation layer, an exogenous propagation layer and a fusion layer. The updating operation specifically includes: processing the current characterization vectors of the target neighbor nodes through the local propagation layer to obtain a local propagation vector; linearly transforming the current characterization vector of the target node with a first parameter matrix through the self-propagation layer to obtain a self-propagation vector; linearly transforming the target sequence coding vector with a second parameter matrix through the exogenous propagation layer to obtain an exogenous propagation vector; and fusing the local propagation vector, the self-propagation vector and the exogenous propagation vector through the fusion layer to obtain the updated characterization vector of the target node.
In one example, the target neighbor nodes are several first-order neighbor nodes of the target node, and the local propagation vector is obtained by weighted summation of their current characterization vectors using the weight parameter vector in the local propagation layer.
In another example, the target neighbor nodes include several first-order and second-order neighbor nodes of the target node. For each first-order neighbor node, attention weights over the current characterization vectors of its second-order neighbor nodes are determined from the first-order neighbor node's current characterization vector, and those characterization vectors are weighted and summed with the attention weights to obtain a neighbor aggregation vector; the neighbor aggregation vectors of the first-order neighbor nodes are then weighted and summed, using the weight parameter vector in the local propagation layer, to obtain the local propagation vector.
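A minimal sketch of the updating operation with first-order neighbors only: local propagation as a weighted neighbor sum, self and exogenous propagation as linear maps, and a simple additive fusion. The tanh fusion and all shapes are assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
D = 4  # characterization vector dimension (illustrative)

W_self = rng.normal(0, 0.1, size=(D, D))   # first parameter matrix (self-propagation)
W_exo  = rng.normal(0, 0.1, size=(D, D))   # second parameter matrix (exogenous propagation)
w_loc  = rng.random(3)                     # weight parameter vector, one weight per neighbor

def update_node(h_target, h_neighbors, seq_enc):
    """One updating operation for a target node."""
    # Local propagation: weighted sum of first-order neighbor vectors.
    local = (w_loc[:, None] * h_neighbors).sum(axis=0)
    # Self propagation: linear transform of the node's current vector.
    self_p = W_self @ h_target
    # Exogenous propagation: linear transform of the sequence coding vector.
    exo = W_exo @ seq_enc
    # Fusion layer: a plain sum + tanh stands in for the patent's fusion.
    return np.tanh(local + self_p + exo)

h_new = update_node(rng.random(D), rng.random((3, D)), rng.random(D))
```

Running the same operation again with each neighbor as the target node propagates the sequence information one hop further, as the specific embodiment above describes.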
In one embodiment, the event occurrence intensity function includes a reference intensity, and the intensity fitting network includes a reference intensity determination layer; determining the parameter values of the event occurrence intensity function corresponding to the first user comprises performing linear transformation and activation on the first node characterization vector through the reference intensity determination layer to obtain the reference intensity.
In a specific embodiment, the event occurrence intensity function further includes historical stimulation coefficients and time decay coefficients, and the intensity fitting network further includes a stimulation coefficient determination layer and a decay coefficient determination layer. Determining the parameter values further comprises: determining, through the stimulation coefficient determination layer, attention weights of several historical characterization vectors of the first user node with respect to the first node characterization vector, the historical characterization vectors being derived from the other event samples in the subsequence; for each other event sample, taking the product of its attention weight and historical characterization vector as the corresponding historical stimulation coefficient; and, through the decay coefficient determination layer, fusing the first node characterization vector with each historical characterization vector to obtain fusion vectors, then applying linear transformation and activation to each fusion vector to obtain the corresponding time decay coefficient.
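With these parameters, the fitted event occurrence intensity function can take a Hawkes-like form: a reference intensity plus one exponentially decaying term per historical event. The sketch below keeps the stimulation coefficients scalar (the patent makes them products of attention weights and historical characterization vectors) and uses softplus as the activation; both choices are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
D = 4

h_node = rng.random(D)                 # updated first node characterization vector
hist_h = rng.random((2, D))            # historical characterization vectors
hist_t = np.array([0.5, 1.2])          # occurrence times of the earlier events

def softplus(x):
    return np.log1p(np.exp(x))

# Reference intensity: linear transform + activation of the node vector.
w_mu = rng.normal(0, 0.5, size=D)
mu = softplus(w_mu @ h_node)

# Historical stimulation coefficients: attention of the node vector over
# the historical vectors (reduced to scalars for this sketch).
scores = hist_h @ h_node
alpha = np.exp(scores) / np.exp(scores).sum()

# Time decay coefficients: fuse node and historical vectors (additively,
# as an assumption), then linear transform + activation.
w_beta = rng.normal(0, 0.5, size=D)
beta = softplus((hist_h + h_node) @ w_beta)

def intensity(t):
    """Event occurrence intensity lambda(t) for this user."""
    decay = np.exp(-beta * np.clip(t - hist_t, 0.0, None))
    return mu + (alpha * decay).sum()

lam = intensity(2.0)
```

The history terms decay toward zero, so the intensity relaxes back to the reference intensity `mu` long after the last event, which is the usual Hawkes behavior.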
In one embodiment, updating the network parameters based on the several intensity functions and the second event sample corresponding to the first user identifier comprises: determining, from the several intensity functions, the intensity function whose event type matches that of the second event sample; and updating the network parameters based on that intensity function and the occurrence time of the second event sample.
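A plausible concrete form of this update, assumed here rather than quoted from the patent, is the standard point-process negative log-likelihood on the second event sample: reward the matching type's intensity at the observed occurrence time, and penalize the integrated total intensity over the elapsed interval:

```python
import numpy as np

def nll_term(lam_funcs, k_next, t_prev, t_next, grid=201):
    """Point-process negative log-likelihood for one labeled next event.

    lam_funcs: one intensity function per event type, lambda_k(t).
    k_next, t_next: event type and occurrence time of the second event sample.
    """
    ts = np.linspace(t_prev, t_next, grid)
    total = sum(f(ts) for f in lam_funcs)          # lambda(t) = sum_k lambda_k(t)
    dt = ts[1] - ts[0]
    # Trapezoid rule for the compensator integral of lambda over (t_prev, t_next].
    compensator = ((total[:-1] + total[1:]) / 2 * dt).sum()
    event_ll = np.log(lam_funcs[k_next](t_next))   # log-intensity of the observed type
    return -(event_ll - compensator)

# Toy check with two constant per-type intensities.
lams = [lambda t: 0.3 + 0.0 * np.asarray(t),
        lambda t: 0.7 + 0.0 * np.asarray(t)]
loss = nll_term(lams, k_next=1, t_prev=0.0, t_next=2.0)
```

For the constant toy intensities the loss reduces to the closed form `2.0 - log(0.7)`, which makes the numerical integration easy to verify.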
In one embodiment, the event prediction system further comprises an adjacency matrix prediction layer. Before updating the network parameters, the method further comprises determining, through the adjacency matrix prediction layer, a predicted adjacency matrix of a virtual event relationship network graph constructed from the several event types, according to the characterization vectors of the nodes in the user relationship network graph. Updating the network parameters then comprises: determining a first loss term based on the several intensity functions and the second event sample; acquiring a real event relationship network graph, which includes type nodes corresponding to the several event types and directed edges formed by causal relationships between them; determining a second loss term based on the true adjacency matrix of the real event relationship network graph and the predicted adjacency matrix; and updating the network parameters based on the first loss term and the second loss term.
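One way to realize the adjacency matrix prediction layer, assumed here for illustration, is a bilinear score over the type characterization matrix, with the second loss term as the squared error against the true adjacency matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
K, D = 3, 4                                  # number of event types, embedding dim

H = rng.random((K, D))                       # type characterization matrix
W = rng.normal(0, 0.1, size=(D, D))          # learnable parameter matrix of the layer
A_true = np.array([[0, 1, 0],                # true adjacency matrix: edge weights
                   [0, 0, 1],                # derived from counted causal
                   [0, 0, 0]], float)        # relationships between event types

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

A_pred = sigmoid(H @ W @ H.T)                # predicted adjacency matrix
graph_loss = ((A_pred - A_true) ** 2).sum()  # second (graph regularization) loss term
```

The total training objective then combines this graph loss with the first loss term on the intensity functions, which is the graph regularization the disclosure refers to.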
In a specific embodiment, acquiring the real event relationship network graph comprises: acquiring several user event sequences, each containing events performed by the corresponding user arranged in time order, with a causal relationship between the event types of any two adjacent events; and constructing the real event relationship network graph from these user event sequences.
In a more specific embodiment, the true adjacency matrix includes the weights of the directed edges, each determined from the statistical count of the corresponding causal relationship.
Further, in one example, determining the predicted adjacency matrix of the virtual event relationship network graph constructed from the several event types comprises: determining a type characterization vector for each of the several event types, based on the node characterization vectors, to form a type characterization matrix; and determining the predicted adjacency matrix from the type characterization matrix and a learnable parameter matrix in the adjacency matrix prediction layer.
In a more specific example, prior to updating the network parameters in the event prediction system, the method further comprises: and updating the type characterization vector corresponding to the event type of the first event sample into the first node characterization vector.
According to a second aspect, an updating apparatus of an event prediction system is provided, comprising: a sequence acquiring unit configured to sequentially acquire event samples, as first event samples, from an event sample sequence arranged in time order, the sample attributes of a first event sample including a first occurrence time and a first user identifier; and an event processing unit configured to input the first event sample into an event prediction system for event processing, the system comprising a sequence coding network, a graph propagation network, an intensity fitting network and an intensity mapping network. The event processing unit comprises: a coding module configured to determine, through the sequence coding network, the sequence coding vector of the subsequence up to the first occurrence time, each event sample in the subsequence corresponding to the first user identifier; a graph propagation module configured to update, through the graph propagation network, the node characterization vectors of the first user node and its neighbor nodes in the user relationship network graph according to the sequence coding vector; an intensity fitting module configured to determine, through the intensity fitting network, parameter values of the event occurrence intensity function corresponding to the first user identifier according to the updated first node characterization vector of the first user node; and an intensity mapping module configured to map, through the intensity mapping network, the event occurrence intensity function into the event type space to obtain several intensity functions of the first user identifier under several event types.
The apparatus further comprises a parameter updating unit configured to update the network parameters of the event prediction system based on the several intensity functions and a second event sample corresponding to the first user identifier, where the second occurrence time of the second event sample is later than the first occurrence time.
In a specific embodiment, the graph propagation module is specifically configured to take the first user node as the target node and perform an updating operation for the target node, where the updating operation comprises determining the updated characterization vector of the target node according to the target sequence coding vector corresponding to the target node, the current characterization vectors of the target node's neighbor nodes, and the current characterization vector of the target node itself.
In a particular embodiment, the graph propagation network includes a local propagation layer, a self-propagation layer, an exogenous propagation layer and a fusion layer. The updating operation performed by the graph propagation module specifically includes: processing the current characterization vectors of the target neighbor nodes through the local propagation layer to obtain a local propagation vector; linearly transforming the current characterization vector of the target node with a first parameter matrix through the self-propagation layer to obtain a self-propagation vector; linearly transforming the target sequence coding vector with a second parameter matrix through the exogenous propagation layer to obtain an exogenous propagation vector; and fusing the local propagation vector, the self-propagation vector and the exogenous propagation vector through the fusion layer to obtain the updated characterization vector of the target node.
In a more particular embodiment, the target neighbor nodes include several first-order and second-order neighbor nodes of the target node. To obtain the local propagation vector, the graph propagation module, for each first-order neighbor node, determines attention weights over the current characterization vectors of its second-order neighbor nodes from the first-order neighbor node's current characterization vector, weights and sums those characterization vectors to obtain a neighbor aggregation vector, and then weights and sums the neighbor aggregation vectors of the first-order neighbor nodes, using the weight parameter vector in the local propagation layer, to obtain the local propagation vector.
In one embodiment, the event occurrence intensity function includes a reference intensity and the intensity fitting network includes a reference intensity determination layer; the intensity fitting module is specifically configured to perform linear transformation and activation on the first node characterization vector through the reference intensity determination layer to obtain the reference intensity.
In a more specific embodiment, the event occurrence intensity function further includes historical stimulation coefficients and time decay coefficients, and the intensity fitting network further includes a stimulation coefficient determination layer and a decay coefficient determination layer. The intensity fitting module is further configured to: determine, through the stimulation coefficient determination layer, attention weights of several historical characterization vectors of the first user node with respect to the first node characterization vector, the historical characterization vectors being derived from the other event samples in the subsequence; for each other event sample, take the product of its attention weight and historical characterization vector as the corresponding historical stimulation coefficient; and, through the decay coefficient determination layer, fuse the first node characterization vector with each historical characterization vector to obtain fusion vectors, then apply linear transformation and activation to each fusion vector to obtain the corresponding time decay coefficient.
In one embodiment, the event prediction system further comprises an adjacency matrix prediction layer, and the apparatus further comprises an adjacency matrix prediction unit configured to determine, through the adjacency matrix prediction layer, a predicted adjacency matrix of a virtual event relationship network graph constructed from the several event types, according to the characterization vectors of the nodes in the user relationship network graph. The parameter updating unit is specifically configured to: determine a first loss term based on the several intensity functions and the second event sample; acquire a real event relationship network graph, which includes type nodes corresponding to the several event types and directed edges formed by causal relationships between them; determine a second loss term based on the true adjacency matrix of the real event relationship network graph and the predicted adjacency matrix; and update the network parameters based on the first loss term and the second loss term.
According to a third aspect, an event prediction system is provided, comprising: an input layer for sequentially acquiring event samples, as first event samples, from an event sample sequence arranged in time order, the sample attributes of a first event sample including a first occurrence time and a first user identifier; a sequence coding network for determining the sequence coding vector of the subsequence up to the first occurrence time, each event sample in the subsequence corresponding to the first user identifier; a graph propagation network for updating the node characterization vectors of the first user node and its neighbor nodes in the user relationship network graph according to the sequence coding vector; an intensity fitting network for determining parameter values of the event occurrence intensity function corresponding to the first user identifier according to the updated first node characterization vector of the first user node; an intensity mapping network for mapping the event occurrence intensity function into the event type space to obtain several intensity functions of the first user identifier under several event types; and an output layer for outputting an event prediction result for the first user identifier based on these intensity functions, the result including a predicted event type and a predicted occurrence time.
According to a fourth aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of the first aspect.
According to a fifth aspect, there is provided a computing device comprising a memory having stored therein executable code and a processor that, when executing the executable code, implements the method of the first aspect.
In the method and apparatus provided by the embodiments of this specification, an event prediction system framework built on neural networks is proposed. During training, a deep neural network extracts features from event samples, the parameters of the intensity function are fitted in a latent (hidden) space, and the intensity function is mapped back into the event type space, yielding intensity functions for the various event types; the event prediction system is then updated using label samples. Furthermore, graph regularization can be introduced during updating to obtain a better training effect. Through repeated iterative training, a trained event prediction system is obtained that can model the intensity function of a target user's event sequence and, using this more accurate and flexible intensity function, accurately predict the next event following that sequence.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 illustrates an implementation architecture diagram of an update event prediction system according to one embodiment;
FIG. 2 illustrates a flow diagram of an update method of an event prediction system, according to one embodiment;
FIG. 3 illustrates a sequence of event samples according to one example;
FIG. 4 illustrates a schematic structural diagram of an event prediction system according to one embodiment;
FIG. 5 illustrates an update apparatus architecture diagram of an event prediction system, according to one embodiment.
Detailed Description
The solution provided in this specification is described below with reference to the drawings.
As previously described, a point process may be used to model a sequence of events, thereby enabling prediction of events. The key to a point process algorithm is to determine a conditional intensity function, which can be defined as the following equation:

$$\lambda_t := \lim_{\Delta \to 0} \frac{1}{\Delta}\, \mathbb{E}\big[\, N\big((t, t+\Delta]\big) \,\big|\, \mathcal{H}_t \,\big] \tag{1}$$

where \(\lambda_t\) represents the intensity at time t; the symbol := means "is defined as"; \(\mathcal{H}_t\) represents the historical events occurring by time t; and \(\mathbb{E}[N((t, t+\Delta]) \mid \mathcal{H}_t]\) represents the expected number of events occurring in the time interval \((t, t+\Delta]\), given \(\mathcal{H}_t\). It is to be understood that formula (1) shows the definition of the intensity function; the actually used intensity function is a function with the time t as its independent variable and the intensity as its dependent variable.
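As a quick illustration of definition (1): for a process with constant intensity (a homogeneous Poisson process), the expected event count in \((t, t+\Delta]\) divided by \(\Delta\) recovers the intensity for any \(\Delta\). A minimal simulation sketch, with all numbers chosen purely for illustration:

```python
import random

def count_events(rate, t0, t1, rng):
    """Count events of a constant-rate (homogeneous Poisson) process on (t0, t1]."""
    n, t = 0, t0
    while True:
        t += rng.expovariate(rate)  # i.i.d. exponential inter-event gaps
        if t > t1:
            return n
        n += 1

rng = random.Random(0)
rate, delta, trials = 2.0, 0.5, 20000
avg_count = sum(count_events(rate, 0.0, delta, rng) for _ in range(trials)) / trials
# Definition (1): intensity = expected count in (t, t+delta] divided by delta.
estimated_intensity = avg_count / delta
```

For this constant-rate case the ratio is exact for any interval length, so the simulated estimate should hover around 2.0.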
After modeling the intensity function from historical events, it can be used to predict the time of occurrence and event type of the next event. It is to be understood that the intensity function can be derived for each of a limited number of event types by modeling.
In one use of the intensity function, the probability density of the (i+1)-th event occurring at time t may first be calculated from the intensity function:

$$p\big(t \,\big|\, \mathcal{H}_{t_i}\big) = \lambda(t)\, \exp\Big(-\int_{t_i}^{t} \lambda(s)\, ds\Big) \tag{2}$$

where \(\mathcal{H}_{t_i}\) represents the given historical events before the (i+1)-th event that needs to be predicted; the function value of \(p(t \mid \mathcal{H}_{t_i})\) represents the probability density of the (i+1)-th event occurring at time t; \(\lambda(t) = \sum_k \lambda_k(t)\), where \(\lambda_k(t)\) represents the intensity function of the k-th event type; and \(t_i\) indicates the occurrence time of the i-th event. It is to be understood that \(p(t \mid \mathcal{H}_{t_i})\) can also be abbreviated as \(p_{i+1}(t)\).
After obtaining the probability density function \(p_{i+1}(t)\), the occurrence time and the event type of the next event may be calculated according to the following equations (3) and (4), respectively:

$$\hat{t}_{i+1} = \int_{t_i}^{\infty} t \cdot p_{i+1}(t)\, dt \tag{3}$$

$$\hat{c}_{i+1} = \arg\max_k \frac{\lambda_k(\hat{t}_{i+1})}{\lambda(\hat{t}_{i+1})} \tag{4}$$

where \(\hat{t}_{i+1}\) indicates the predicted occurrence time of the (i+1)-th event, and \(\hat{c}_{i+1}\) indicates the predicted event type of the (i+1)-th event.
For example, suppose the historical event sequence of a given user is: eating at 12 noon → sleeping at 1 pm → playing at 3 pm. After modeling the intensity function, the next event can be predicted: eating at 6 o'clock in the evening. In this way, an intensity function can be modeled based on the historical event sequence, thereby predicting the occurrence time and the event type of the next event.
From the above, the modeling effect of the intensity function determines the accuracy of the event prediction result and is therefore crucial. Accordingly, in the event prediction system, a deep neural network is used to perform feature extraction on the events in a historical event sequence, the parameters of an intensity function are fitted in a hidden space (or latent space), and the intensity function is mapped back to the event type space, so that a highly usable intensity function corresponding to each event type is obtained.
FIG. 1 illustrates an implementation architecture diagram of updating an event prediction system, according to one embodiment. As shown in FIG. 1, a historical event sample (or simply an event sample) is obtained from a behavior record of an event of type c made by a user u at a time t; accordingly, a plurality of event samples arranged in time order may be obtained to form an event sample sequence, where the i-th event sample x_i includes the occurrence time t_i of the i-th event, the event type c_i and the user identifier u_i; in FIG. 1, t_{i-1} < t_i < t_{i+1}. Further, based on the obtained event sample sequence, each event sample is sequentially input into the event prediction system for event processing, where the event processing includes processing sequentially with the sequence coding network 101, the graph propagation network 102, the intensity fitting network 103 and the intensity mapping network 104, correspondingly obtaining a sequence coding vector, updated characterization vectors for nodes in the user relationship network graph, an intensity function in the hidden space, and intensity functions in the event type space; the network parameters in the event prediction system are then updated using the intensity functions in the event type space and a label event sample. In this way, the update of the event prediction system can be realized, and the updated event prediction system can be used to model the intensity function of a target user's target event sequence, so that a more accurate and more flexible intensity function can be obtained, realizing accurate prediction of the next event following the target event sequence.
The implementation steps of the above inventive concept are described below with reference to FIG. 1, FIG. 2 and specific embodiments. FIG. 2 illustrates a flow diagram of an update method of an event prediction system, according to one embodiment. It is understood that the executing subject of the update method may be any platform, apparatus or device cluster with computing and processing capabilities. As shown in FIG. 2, the method includes the following steps:
Step S210: sequentially acquire event samples, as a first event sample, from an event sample sequence formed by arranging event samples in time order, where the sample attributes of the first event sample include a first occurrence time and a first user identifier. Step S220: input the first event sample into an event prediction system for event processing, the event prediction system including a sequence coding network 101, a graph propagation network 102, an intensity fitting network 103 and an intensity mapping network 104; the event processing includes: Step S221: determine, through the sequence coding network 101, a sequence coding vector of the subsequence up to the first occurrence time, where each event sample in the subsequence corresponds to the first user identifier; Step S222: update, through the graph propagation network 102 and according to the sequence coding vector, the node characterization vectors of the first user node and its neighbor nodes in the user relationship network graph; Step S223: determine, through the intensity fitting network 103 and according to the updated first node characterization vector of the first user node, the parameter values in the event occurrence intensity function corresponding to the first user identifier; Step S224: map, through the intensity mapping network 104, the event occurrence intensity function to the event type space, so as to obtain a plurality of intensity functions of the first user identifier under a plurality of event types. Step S230: update the network parameters in the event prediction system based on the plurality of intensity functions and a second event sample corresponding to the first user identifier, where the second occurrence time corresponding to the second event sample is later than the first occurrence time.
In the above steps, it should be noted that the terms "first" and "second" in "first event sample", "first occurrence time", "second event sample" and similar terms elsewhere are used only to distinguish similar things, and have no other limiting function such as ordering.
The above steps are developed as follows:
First, in step S210, event samples are sequentially acquired, as the first event sample, from an event sample sequence formed in chronological order. The first event sample may also be referred to as the current sample to be processed, or simply the current sample.
The sample attributes of the first event sample include the first occurrence time of the corresponding first event; the precision of the first occurrence time can be set according to requirements, for example to minutes (min), seconds (s), days or months. The sample attributes of the first event sample further include a subject identifier of the implementing subject of the first event, that is, the first user identifier, where the subject identifier may be a numeric number, a serial number composed of numbers and letters, or the like. The first event has a first event type, which, accordingly, may also be included in the sample attributes of the first event sample.
For the acquisition of the event sample sequence: in one embodiment, a discrete event type space may first be defined, including a limited number of event types; then, user behavior events are collected through buried-point tracking according to these event types, a large number of user events are acquired, and the acquired user events are sorted by occurrence time. Further, in a specific embodiment, the total event sequence obtained by sorting may be directly used as the above event sample sequence. In another specific embodiment, considering that the number of collected user events is huge while the computing power in practical engineering applications is often limited, sliding-window sampling may be performed on the sorted total event sequence to obtain a plurality of window sequences; in any one of a plurality of rounds of iterative training, one window sequence is extracted from the plurality of window sequences as the event sample sequence used by that training round.
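The sliding-window sampling described above can be sketched as follows; field names and window parameters are illustrative assumptions, not prescribed by the specification:

```python
import random

def make_window_sequences(events, window_size, stride):
    """Slide a fixed-size window over the time-ordered total event sequence."""
    events = sorted(events, key=lambda e: e["t"])  # ensure chronological order
    return [events[s:s + window_size]
            for s in range(0, len(events) - window_size + 1, stride)]

# Toy buried-point records: occurrence time t, event type c, user identifier u.
total_sequence = [{"t": t, "c": t % 3, "u": t % 2} for t in range(10)]
windows = make_window_sequences(total_sequence, window_size=4, stride=2)

# In each training round, one window sequence is drawn as that round's
# event sample sequence.
rng = random.Random(0)
round_sequence = rng.choice(windows)
```

Windowing caps the per-round sequence length, which is what keeps the per-round compute bounded when the total event log is huge.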
Thus, after the event sample sequence is acquired, the event samples therein are sequentially acquired as the first event sample. In step S220, the first event sample is input into the event prediction system for event processing. For clarity of description, the first event sample is denoted as event sample x_i, including the occurrence time t_i of the i-th event, the event type c_i and the user identifier u_i.

The event processing includes the following steps:

In step S221, the sequence coding vector h_i of the subsequence up to the occurrence time t_i is determined through the sequence coding network 101, where each event sample in the subsequence corresponds to the user identifier u_i.
It is to be understood that the subsequence is a subsequence of the above event sample sequence, in which the event samples are also arranged in time order; the subsequence includes the event sample x_i occurring at time t_i, and the event samples occurring before time t_i that correspond to the same user identifier u_i. In one possible case, the subsequence includes only event sample x_i. In one example, FIG. 3 shows an event sample sequence x_0 → x_1 → x_2 → x_3 → x_4 → x_5 → .... Assume event sample x_i is x_4; then the subsequence of the corresponding user up to February 10 can be obtained as x_0 → x_2 → x_4.
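The subsequence extraction just described — all samples of the same user up to the current occurrence time — can be sketched as below, with toy field names assumed:

```python
def subsequence_up_to(samples, user_id, t_now):
    """Return the event samples of the given user occurring no later than t_now,
    in time order (the subsequence fed to the sequence coding network)."""
    return [s for s in samples if s["u"] == user_id and s["t"] <= t_now]

# Toy sequence mirroring FIG. 3: x0..x5, alternating between two users.
seq = [{"i": i, "t": i, "u": i % 2} for i in range(6)]
sub = subsequence_up_to(seq, user_id=0, t_now=4)   # x0 -> x2 -> x4
```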
In one embodiment, as shown in FIG. 4, the sequence coding network 101 includes a linear embedding subnetwork 1011 and a timing subnetwork 1012. Based on this, in a specific embodiment, this step may include: determining, through the linear embedding subnetwork 1011, a type embedding vector of the event type corresponding to each sample in the subsequence; and then outputting, through the timing subnetwork 1012, the sequence coding vector h_i based on the sequentially input type embedding vectors corresponding to the samples. In one example, the timing subnetwork 1012 may be implemented as a recurrent neural network (RNN), a long short-term memory network (LSTM), or the like.

In another specific embodiment, the sample sequence formed by the event samples in the subsequence other than event sample x_i corresponds to a previous sequence coding vector h_{i-1}. Accordingly, this step may include: determining, through the linear embedding subnetwork 1011, the type embedding vector e_{c_i} of the event type c_i of event sample x_i; and then outputting, through the timing subnetwork 1012, the sequence coding vector h_i based on the previous sequence coding vector h_{i-1} and the type embedding vector e_{c_i}. In one example, assuming the timing subnetwork 1012 is implemented as an LSTM network, the sequence coding vector h_i may accordingly be represented as:

$$h_i = \mathrm{LSTM}\big(h_{i-1},\, e_{c_i}\big)$$

From the above, the sequence coding vector h_i can be obtained.
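The recurrent encoding step can be sketched with a simplified RNN cell standing in for the LSTM; the embedding table and weights below are toy values, and a real implementation would use a proper LSTM cell:

```python
import math

def recurrent_step(h_prev, e_c, W_h, W_e):
    """h_i = tanh(W_h h_{i-1} + W_e e_{c_i}): a simplified stand-in for the
    LSTM update of the timing subnetwork."""
    d = len(h_prev)
    pre = [sum(W_h[r][c] * h_prev[c] for c in range(d)) +
           sum(W_e[r][c] * e_c[c] for c in range(len(e_c)))
           for r in range(d)]
    return [math.tanh(x) for x in pre]

# Hypothetical type embedding table (the linear embedding subnetwork).
embed = {0: [1.0, 0.0], 1: [0.0, 1.0]}
W_h = [[0.5, 0.0], [0.0, 0.5]]
W_e = [[0.1, 0.2], [0.3, 0.4]]

h = [0.0, 0.0]                      # initial sequence coding vector
for c in [0, 1, 0]:                 # event types of the subsequence samples
    h = recurrent_step(h, embed[c], W_h, W_e)
```

Each step consumes only the new type embedding and the previous code, which is why the previous sequence coding vector is all that must be carried between event samples.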
In step S222, the node characterization vectors of the first user node (denoted Node_u) and its neighbor nodes in the user relationship network graph are updated through the graph propagation network 102 according to the sequence coding vector. It is understood that the first user node Node_u corresponds to the first user identifier u_i; in other words, the first user node Node_u and the first user identifier u_i both uniquely correspond to the same user.
It should be noted that the user relationship network graph is used to represent the relationships among multiple users, and includes multiple user nodes corresponding to the multiple users, and connection edges formed by the associations between user nodes. In one example, the association between user nodes may include a social relationship, e.g., two users are friends on a social platform, or their frequency of communication (e.g., the number of messages exchanged, or the cumulative number of communication days) exceeds a preset threshold. In another example, the association between user nodes may include a kinship relationship, e.g., mother, grandmother, etc.
In one embodiment, this step may include: taking the first user node Node_u as the target node (denoted Node_o), and executing a characterization vector update operation for the target node. This update operation includes: determining the updated characterization vector z'_o of the target node according to the target sequence coding vector h_o corresponding to the target node Node_o, the current characterization vectors z_r of the target neighbor nodes Node_r (with Node_r ∈ N(Node_o), where N(Node_o) denotes the set of neighbor nodes of the target node), and the current characterization vector z_o of the target node itself.
in one particular embodiment, as shown in FIG. 4, graph propagation network 102 includes a local propagation layer 1021, a self-propagation layer 1022, an exogenous propagation layer 1023, and a fusion layer 1024. Based on this, the update operation specifically includes: processing the current characterization vector of the target neighbor node through a local propagation layer 1021
Figure BDA00031038264700000921
Obtaining local propagation vectors
Figure BDA00031038264700000925
Through the self-propagating layer 1022, a first parameter matrix W is utilized1The current characterization vector of the target node
Figure BDA00031038264700000922
Performing linear transformation to obtain self-propagation vector
Figure BDA0003103826470000101
Using a second parameter matrix W through an exogenous propagation layer 10232Encoding the target sequence with the vector
Figure BDA0003103826470000102
Linear transformation is carried out to obtain exogenous propagation vectors
Figure BDA0003103826470000103
By fusing the layers 1024, local propagation vectors are aligned
Figure BDA0003103826470000104
Self-propagating vector
Figure BDA0003103826470000105
And exogenous propagation vectors
Figure BDA0003103826470000106
Performing fusion processing to obtain the updated characterization vector of the target node
Figure BDA0003103826470000107
For the above determination of the local propagation vector m_loc through the local propagation layer 1021: in a more specific embodiment, the target neighbor nodes are a plurality of first-order neighbor nodes; accordingly, the current characterization vectors of the plurality of first-order neighbor nodes may be weighted and summed using the weight parameter vector w in the local propagation layer 1021, which may be expressed as the calculation formula:

$$m_{loc} = Z_r\, w \tag{5}$$

where Z_r represents the characterization matrix obtained by stacking the current characterization vectors z_r of the plurality of first-order neighbor nodes as its columns.
In another more specific embodiment, the target neighbor nodes include a plurality of first-order neighbor nodes and a plurality of second-order neighbor nodes, the first-order neighbor nodes denoted Node_{r1} and the second-order neighbor nodes denoted Node_{r2}. Correspondingly, the determination of the local propagation vector m_loc may include: for each first-order neighbor node, first determining, using the current characterization vector z_{r1} of that first-order neighbor node, several attention weights (denoted a_{r1,r2}) corresponding to the current characterization vectors z_{r2} of several second-order neighbor nodes; it is to be understood that these several second-order neighbor nodes are first-order neighbors of that first-order neighbor node, and "several" here refers to one or more. Then, the several current characterization vectors are weighted and summed using the several attention weights, to obtain a neighbor aggregation vector g_{r1} of that first-order neighbor node. Then, the neighbor aggregation vectors of the plurality of first-order neighbor nodes are weighted and summed using the weight parameter vector w in the local propagation layer 1021, to obtain the local propagation vector m_loc.
For the above attention weights a_{r1,r2}: in one example, an attention mechanism may be introduced, for example, calculating the attention weight using the following equation (6):

$$a_{r1,r2} = \mathrm{softmax}\big(\mathrm{score}(z_{r1},\, z_{r2})\big) \tag{6}$$
In equation (6), score represents the attention score, which can be computed in various ways. In a specific example, the local propagation layer 1021 includes an attention scoring sublayer; accordingly, the two corresponding vectors may be concatenated and input into the attention scoring sublayer to obtain the corresponding attention score. In another specific example, the similarity between the two vectors may be calculated as the corresponding attention score.
In another example, the several attention weights a_{r1,r2} are set so that they are all equal and sum to 1. In yet another example, the attention weights are parameters to be learned in the local propagation layer 1021.

In this way, the several attention weights a_{r1,r2} can be obtained and used for the aggregation of the several second-order neighbor nodes, so as to obtain the aggregation vector of each first-order neighbor node, and thereby the local propagation vector m_loc for the target node.
For the above-described fusion process by the fusion layer 1024, in one embodiment, this fusion process may include an addition process, an averaging process, or a weighted summation process.
According to a specific example, the above update operation can be performed using the following formula (7); for the mathematical symbols therein, reference may be made to the descriptions above:

$$z'_o = Z_r\, w + W_1\, z_o + W_2\, h_o \tag{7}$$
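Taking additive fusion as a concrete choice, one node update can be sketched as follows; dimensions and parameter values are toy choices, and a nonlinearity could optionally follow the fusion:

```python
def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def update_node(neighbors, w, W1, z_o, W2, h_o):
    """Fuse local, self and exogenous propagation by addition:
    z_o' = sum_r w_r * z_r + W1 z_o + W2 h_o."""
    d = len(z_o)
    local = [sum(wr * z[k] for wr, z in zip(w, neighbors)) for k in range(d)]
    self_p = matvec(W1, z_o)   # self-propagation of the target node
    exo = matvec(W2, h_o)      # exogenous propagation of the sequence code
    return [a + b + c for a, b, c in zip(local, self_p, exo)]

neighbors = [[1.0, 0.0], [0.0, 1.0]]   # current first-order neighbor vectors
w = [0.5, 0.5]                          # weight parameter vector of layer 1021
W1 = [[0.2, 0.0], [0.0, 0.2]]           # first parameter matrix
W2 = [[0.1, 0.0], [0.0, 0.1]]           # second parameter matrix
z_new = update_node(neighbors, w, W1, [1.0, 1.0], W2, [2.0, 0.0])
```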
In another embodiment, the update operation may specifically include: concatenating the target sequence coding vector h_o, the current characterization vector z_o of the target node, and the current characterization vectors z_r of the target neighbor nodes, and inputting the concatenated vector into the graph propagation network 102 to obtain the updated characterization vector z'_o of the target node.
In the above, the update operation for the target node characterization vector has been described. After the first user node is taken as the target node and the update operation for the target node is executed, the updated node characterization vector z_u of the first user node can be obtained. Further, in one embodiment, after the node characterization vector z_u is obtained, the method may further include: updating the characterization vectors of the neighbor nodes of the first user node based on the node characterization vector z_u. In this way, the event information in the above event sample x_i can be propagated in the user relationship network graph. It should be understood that the order of the neighbor nodes covered by this update may be set according to actual requirements, for example set to order 1, or to within order 2. In addition, the characterization vector update method adopted for the first user node may be the same as, or different from, that adopted for the neighbor nodes. In one example, each neighbor node of the first user node may in turn be taken as the target node and the above update operation executed, to obtain the updated node characterization vector of that neighbor node. In another example, each neighbor node of the first user node is taken as a central node, and a multi-order neighbor aggregation operation is performed to obtain the updated node characterization vector of each neighbor node; the multi-order neighbor aggregation operation can refer to the neighbor aggregation operations commonly used in graph neural networks.

Thus, the update of the node characterization vectors of the first user node Node_u and its neighbor nodes in the user relationship network graph can be realized.
Thereafter, in step S223, the parameter values in the event occurrence intensity function corresponding to the first user identifier are determined through the intensity fitting network 103 according to the updated first node characterization vector z_u of the first user node. It is to be understood that the mathematical form of the event occurrence intensity function may be predetermined, including the independent variable time t and parameter terms whose values may be determined based on the intensity fitting network 103 and its inputs.

In one embodiment, the event occurrence intensity function may be represented by the following equation:

$$\lambda^h_{u_i}(t) = \mu_i + \sum_{j=1}^{i-1} \alpha_{j,i} \odot \exp\big( -\delta_{j,i}\,(t - t_j) \big), \qquad t \in [t_i, t_{i+1}) \tag{8}$$

In equation (8), \(\lambda^h_{u_i}(t)\) represents the event intensity function of user identifier u_i in the hidden space (labeled h); t_j represents the event occurrence time contained in the j-th event sample of the subsequence corresponding to user identifier u_i; \(\mu_i\) represents the reference intensity; \(\alpha_{j,i}\) and \(\delta_{j,i}\) respectively represent the historical stimulation coefficient and the time attenuation coefficient of the j-th event sample on the i-th event sample in the subsequence; and \(\odot\) indicates a bit-wise multiplication operation between vectors.
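The Hawkes-style form just described — a reference intensity plus exponentially decaying contributions from earlier events — can be evaluated as follows; the per-event coefficients are toy values, whereas in the system they are produced by the intensity fitting network:

```python
import math

def hidden_intensity(t, mu, history):
    """lambda^h(t) = mu + sum_j alpha_j * exp(-delta_j * (t - t_j)),
    evaluated element-wise over the hidden dimension."""
    lam = list(mu)
    for t_j, alpha_j, delta_j in history:   # one triple per earlier sample x_j
        for k in range(len(mu)):
            lam[k] += alpha_j[k] * math.exp(-delta_j[k] * (t - t_j))
    return lam

mu = [0.2, 0.1]                              # reference intensity
history = [(0.0, [0.5, 0.3], [1.0, 2.0])]    # (t_j, alpha_{j,i}, delta_{j,i})
lam_h = hidden_intensity(1.0, mu, history)   # intensity one time unit later
```

A negative entry in alpha_{j,i} would lower the intensity, which is how inhibition between events is captured.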
Further, in a specific embodiment, as shown in FIG. 4, the intensity fitting network 103 includes a reference intensity determination layer 1031, a stimulation coefficient determination layer 1032 and an attenuation coefficient determination layer 1033, which are respectively used for determining the reference intensity \(\mu_i\), the historical stimulation coefficient \(\alpha_{j,i}\) and the time attenuation coefficient \(\delta_{j,i}\) in equation (8).

In a more specific embodiment, the reference intensity determination layer 1031 performs linear transformation and/or activation processing on the first node characterization vector z_u to obtain the reference intensity \(\mu_i\). In one specific example, the first node characterization vector z_u is first linearly transformed using a weight matrix, and the result of the linear transformation is then activated; this calculation process can be expressed as the following formula:

$$\mu_i = \sigma\big( W_\mu\, z_u + b_\mu \big) \tag{9}$$

where \(\sigma(\cdot)\) represents an activation function in machine learning; \(W_\mu\) and \(b_\mu\) respectively represent the weight matrix and the bias vector in the reference intensity determination layer 1031, which are training parameters to be learned.

In this way, the reference intensity \(\mu_i\) may be determined using the reference intensity determination layer 1031.
In a more specific embodiment, the stimulation coefficient determination layer 1032 determines, according to the first node characterization vector z_u, several attention weights (denoted \(\{\beta_{j,i},\, j \in [1, i-1]\}\)) corresponding to several historical characterization vectors (denoted z_j) of the first user node Node_u, where the historical characterization vectors are obtained based on several other event samples in the subsequence (denoted \(\{x_j,\, j \in [1, i-1]\}\)); and, for each other event sample x_j, the result \(\beta_{j,i}\, z_j\) of multiplying its corresponding attention weight \(\beta_{j,i}\) and its historical characterization vector z_j is determined as the corresponding historical stimulation coefficient \(\alpha_{j,i}\). It is to be understood that, for the determination of the historical characterization vectors z_j, reference may be made to the determination of the first node characterization vector z_u, which is not repeated here.
For the above attention weight \(\beta_{j,i}\): in one example, for any historical characterization vector z_j, the attention score \(\omega_{j,i}\) of that historical characterization vector z_j may first be determined according to the first node characterization vector z_u, and then the several attention scores corresponding to the several historical characterization vectors are normalized to obtain the several attention weights. For the determination of the attention score \(\omega_{j,i}\): in one specific example, the vector similarity between z_u and z_j may be calculated as the attention score \(\omega_{j,i}\). In another specific example, the attention score \(\omega_{j,i}\) may be calculated based on the following equation (10):

$$\omega_{j,i} = v^{\top} \tanh\big( W_\omega\, [\,z_u\, ;\, z_j\,] \big) \tag{10}$$

In equation (10), \([\cdot\,;\,\cdot]\) represents a concatenation between vectors; v and \(W_\omega\) are parameters to be learned in the stimulation coefficient determination layer 1032.
For the above normalization processing: in a specific example, it may be implemented using a softmax function; in another specific example, the normalization of the attention scores may be implemented by taking ratios.
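As a sketch of this score-then-normalize flow: the score below uses an additive form with assumed parameters v and W_ω (the exact scoring form can vary, as noted above), followed by softmax normalization into attention weights β_{j,i}; all values are illustrative:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_score(z_u, z_j, W, v):
    """omega_{j,i} = v^T tanh(W [z_u ; z_j]) -- one additive-attention choice."""
    concat = z_u + z_j                       # vector concatenation [. ; .]
    hidden = [math.tanh(sum(W[r][c] * concat[c] for c in range(len(concat))))
              for r in range(len(W))]
    return sum(vr * h for vr, h in zip(v, hidden))

z_u = [1.0, 0.0]                             # first node characterization vector
history_vecs = [[1.0, 0.0], [0.0, 1.0]]      # historical characterization vectors
W_omega = [[0.5, 0.0, 0.5, 0.0], [0.0, 0.5, 0.0, 0.5]]
v = [1.0, 1.0]
scores = [attention_score(z_u, z_j, W_omega, v) for z_j in history_vecs]
betas = softmax(scores)                      # attention weights beta_{j,i}
```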
In this way, the historical stimulation coefficient \(\alpha_{j,i}\) may be determined using the stimulation coefficient determination layer 1032. Note that the historical stimulation coefficient \(\alpha_{j,i}\) is used for capturing the long-term dependencies of stimulation between events based on the subsequence. In the learning process, the value of the historical stimulation coefficient \(\alpha_{j,i}\) can be negative, thereby also capturing inhibition between events.
In a more specific embodiment, the attenuation coefficient determination layer 1033 performs fusion processing on the first node characterization vector z_u with each of the several historical characterization vectors respectively, to obtain several fusion vectors, and sequentially performs linear transformation and activation processing on each fusion vector to obtain the corresponding time attenuation coefficient \(\delta_{j,i}\). In one example, the fusion processing may include concatenation, addition, or bit-wise multiplication, among others. In a specific example, the time attenuation coefficient \(\delta_{j,i}\) may be calculated in the attenuation coefficient determination layer 1033 by the following equation (11):

$$\delta_{j,i} = \sigma\big( W_\delta\, [\,z_u\, ;\, z_j\,] + b_\delta \big) \tag{11}$$

where \([\cdot\,;\,\cdot]\) represents a concatenation between vectors; \(W_\delta\) and \(b_\delta\) respectively represent the parameter matrix and the bias vector to be learned in the attenuation coefficient determination layer 1033.

Thus, the time attenuation coefficient \(\delta_{j,i}\) can be determined using the attenuation coefficient determination layer 1033.
In the above, the reference intensity \(\mu_i\), the historical stimulation coefficient \(\alpha_{j,i}\) and the time attenuation coefficient \(\delta_{j,i}\) may be determined by the reference intensity determination layer 1031, the stimulation coefficient determination layer 1032 and the attenuation coefficient determination layer 1033 in the intensity fitting network 103, so as to fit the event occurrence intensity function \(\lambda^h_{u_i}(t)\) in the hidden space, whose functional form is shown in equation (8). It will be appreciated that, where only event sample x_i is included in the subsequence, the historical stimulation coefficients \(\alpha_{j,i}\) and the time attenuation coefficients \(\delta_{j,i}\) are all 0, so only the reference intensity \(\mu_i\) needs to be determined.
In another embodiment, the form of the event occurrence intensity function may also be represented as follows:

$$\lambda^h_{u_i}(t) = b + \eta_t \odot \exp\big( -w \odot (t - t_i) \big), \qquad t \in [t_i, t_{i+1}) \tag{12}$$

In equation (12), \(\eta_t\), w and b are parameter vectors to be learned; for the remaining symbols, reference may be made to the descriptions of the symbols in equation (8).

Further, when each sample in the subsequence is used as an input sample of the event prediction system, the resulting updated node characterization vectors of the first user node are concatenated, and the concatenated vector is input into the intensity fitting network 103 to obtain the parameters \(\eta_t\), w and b in equation (12).

In the above, \(\eta_t\), w and b in equation (12) can be determined through the intensity fitting network 103, so as to obtain the event occurrence intensity function \(\lambda^h_{u_i}(t)\) fitted in the hidden space.
From the above, the event occurrence intensity function \(\lambda^h_{u_i}(t)\) in the hidden space can be fitted through the intensity fitting network 103. Then, in step S224, the event occurrence intensity function \(\lambda^h_{u_i}(t)\) is mapped, through the intensity mapping network 104, to the event type space with the event types as its space dimensions, so as to obtain the mapped function \(\lambda_{u_i}(t) \in \mathbb{R}^K\), i.e., a plurality of intensity functions under a plurality of event types, where K in \(\mathbb{R}^K\) is equal to the total number of the plurality of event types.
In a particular embodiment, the intensity mapping network 104 may be implemented as a fully connected network with the softplus function as the activation function. In another particular embodiment, the intensity mapping network 104 may be implemented as a multi-layer fully connected network.
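A sketch of this mapping step: a single fully connected layer with softplus activation takes the hidden-space intensity vector to K per-type intensities, and softplus keeps each mapped intensity positive, as an intensity must be. All parameter values below are illustrative:

```python
import math

def softplus(x):
    return math.log(1.0 + math.exp(x))

def map_to_event_types(lam_h, W, b):
    """Fully connected layer + softplus: hidden-space intensity -> K per-type
    intensities (K = number of rows of W)."""
    return [softplus(sum(w * x for w, x in zip(row, lam_h)) + bk)
            for row, bk in zip(W, b)]

lam_h = [0.4, 0.1]                            # hidden-space intensity at some t
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]    # K = 3 event types
b = [0.0, 0.0, 0.0]
lam_types = map_to_event_types(lam_h, W, b)
```

Stacking several such layers gives the multi-layer fully connected variant mentioned above.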
Thus, the first user identification u can be obtainediA plurality of intensity functions under a plurality of event types. Next, in step S230, based on the plurality of intensity functions and the corresponding first user ID uiSecond event sample xi+1And updating the network parameters in the event prediction system. Wherein the second event sample xi+1Is taken as a label sample corresponding to the second occurrence time ti+1Sample x later than the first eventiCorresponding first occurrence time ti. It is to be understood that historically, the first subscriber identity uiThe identified first user, after making a first event in the first event sample, then makes a second event sample xi+1A second event in (1); in addition, a second event sample xi+1May or may not be included in the sequence of event samples described above.
In one embodiment, this step may include: determining, from the plurality of intensity functions corresponding to the plurality of event types, the intensity function corresponding to the event type of the second event sample x_{i+1}, denoted λ_{c_{i+1}}(t); and updating the network parameters in the event prediction system based on this intensity function λ_{c_{i+1}}(t) and the occurrence time t_{i+1} corresponding to the second event sample x_{i+1}. In a particular embodiment, the training loss may be determined based on the intensity function λ_{c_{i+1}}(t) and the occurrence time t_{i+1}, and the network parameters updated using the training loss. In another particular embodiment, the training loss may be determined based on the intensity function λ_{c_{i+1}}(t), the occurrence time t_{i+1}, and the intensity functions corresponding to the other K-1 event types, and the network parameters updated using the training loss. Further, in one example, the training loss is calculated based on a negative log-likelihood function, as shown in equation (13) below.
L_nll = −log λ_{c_{i+1}}(t_{i+1}) + ∫_{t_i}^{t_{i+1}} Σ_{k=1}^{K} λ_k(τ) dτ (13)

In the formula (13), λ_k denotes the intensity function under the k-th event type, and k represents the k-th event type of the K event types. Thus, the network parameters in the event prediction system can be updated with the goal of reducing L_nll.
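The negative log-likelihood loss for one labeled event can be sketched as follows, assuming the standard temporal point-process form (log-intensity of the observed event minus the survival integral of the total intensity over the inter-event interval); the function names and the Monte Carlo approximation of the integral are illustrative assumptions, not the patent's implementation.

```python
import numpy as np

def nll_loss(intensities, label_type, t_prev, t_next, n_mc=100):
    """Negative log-likelihood of the next event under K intensity functions.

    intensities: list of K callables, intensities[k](t) -> positive float
    label_type: index of the observed (label) event type
    t_prev, t_next: previous and observed occurrence times
    The integral term (no event of any type occurring in (t_prev, t_next])
    is approximated by averaging the intensities over sampled time points.
    """
    taus = np.linspace(t_prev, t_next, n_mc)
    total = sum(np.mean([lam(t) for t in taus]) for lam in intensities)
    integral = total * (t_next - t_prev)
    return -np.log(intensities[label_type](t_next)) + integral

# constant intensities: closed-form NLL = -log(lam_k) + sum(lams) * dt
lams = [lambda t, v=v: v for v in (0.5, 1.5)]
loss = nll_loss(lams, 0, 0.0, 1.0)
```

For constant intensities the Monte Carlo estimate matches the closed-form value exactly, which makes the sketch easy to check by hand.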
In one embodiment, propagation between event types is modeled through a graph regularization (Graph Regularization) process, thereby further improving the training effect of the event prediction system described above. Specifically, as shown in FIG. 4, the event prediction system also includes an adjacency matrix prediction layer 105. Prior to this step, the method further comprises: determining, through the adjacency matrix prediction layer 105, a predicted adjacency matrix of a virtual event relationship network graph constructed based on the plurality of event types, according to the characterization vectors of the nodes in the user relationship network graph. Based on this, this step can comprise the following: on one hand, determining a first loss term based on the plurality of intensity functions and the second event sample x_{i+1}; on the other hand, acquiring a real event relationship network graph, wherein the real event relationship network graph comprises a plurality of type nodes corresponding to the plurality of event types and directed connection edges formed by causal relationships among the type nodes; determining a second loss term based on the true adjacency matrix of the real event relationship network graph and the predicted adjacency matrix; and updating the network parameters of the event prediction system based on the first loss term and the second loss term.
In a specific embodiment, the construction of the virtual event relationship network graph includes: designing K nodes corresponding to the K event types, and then establishing directed connection edges between every two nodes, correspondingly obtaining an edge set ε = {e_pq}_{K×K}, wherein e_pq indicates that node p is a parent of node q, i.e., an event of type p may cause an event of type q to occur, the causal relationship between them being that event type p is the cause and event type q is the effect. It should be noted that the adjacency matrix is used to record the connection relationships between nodes in a relational network graph; for example, if there is a directed edge pointing from node p to node q in the relational network graph, the element b_pq in the adjacency matrix B is 1, and otherwise b_pq is 0. Predicting the adjacency matrix means that the values of the matrix elements are obtained through prediction; correspondingly, a predicted value close to 0 indicates that the connection strength of the corresponding directed edge is low, which is equivalent to absence, while a larger predicted value indicates that the connection strength of the corresponding directed edge is higher. Alternatively, the predicted values of the matrix elements may be regarded as the connection weights of the corresponding connection edges.
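The binary adjacency matrix described above can be sketched in a few lines; the function name and the edge-list encoding are illustrative.

```python
import numpy as np

def build_adjacency(K, edges):
    """Binary adjacency matrix B for a directed graph over K type nodes.

    edges: iterable of (p, q) pairs meaning "node p is a parent of node q"
    (an event of type p may cause an event of type q). b_pq is 1 when the
    directed edge p -> q exists, otherwise 0.
    """
    B = np.zeros((K, K), dtype=int)
    for p, q in edges:
        B[p, q] = 1
    return B

B = build_adjacency(3, [(0, 1), (1, 2)])
```

Note the asymmetry: B[0, 1] = 1 does not imply B[1, 0] = 1, since causal edges are directed.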
In a specific embodiment, the determination of the predicted adjacency matrix may include: determining a type characterization vector of each event type in the plurality of event types based on the characterization vectors of the nodes in the user relationship network graph, so as to form a type characterization matrix H; and then determining the predicted adjacency matrix A based on the type characterization matrix H and the learning parameter matrix in the adjacency matrix prediction layer 105.
In a more specific embodiment, for the determination of the type characterization vector of each event type: in one example, after the characterization vector of the first user node is updated through the graph propagation network 102 in step S222 above to an updated vector (denoted h_i here), the type characterization vector corresponding to the event type c_i in the event sample x_i may be updated to h_i. In another example, after the characterization vector of the first user node is updated to h_i through the graph propagation network 102 in step S222 above, the current type characterization vector corresponding to the event type c_i in the event sample x_i may be updated to the average of itself and h_i. In this way, by sequentially using the event samples in the event sequence as the first event sample, a plurality of type characterization vectors corresponding to the plurality of event types can be obtained.
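The two type-vector update schemes just described (overwrite, or average with the current value) can be sketched as follows; the function and argument names are illustrative.

```python
import numpy as np

def update_type_vector(type_vecs, event_type, node_vec, average=True):
    """Update the characterization vector of one event type in place.

    type_vecs: (K, d) matrix of current type characterization vectors
    event_type: index c_i of the event type in the current sample
    node_vec: updated characterization vector of the first user node
    With average=True the type vector becomes the mean of its current
    value and the node vector; otherwise it is simply overwritten.
    """
    if average:
        type_vecs[event_type] = (type_vecs[event_type] + node_vec) / 2.0
    else:
        type_vecs[event_type] = node_vec
    return type_vecs

H = np.zeros((2, 3))
H = update_type_vector(H, 0, np.ones(3), average=True)
```

Iterating this over the event sequence accumulates one row per event type, which later forms the type characterization matrix H.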
In a more specific embodiment, the predicted adjacency matrix A may be calculated by the following equation (14):

A = HΩH^T (14)

In the formula (14), H denotes the above-described type characterization matrix, Ω denotes the learning parameter matrix in the adjacency matrix prediction layer 105, and T denotes the transposition operation of a matrix.
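Equation (14) is a single bilinear form; a minimal sketch, with a toy identity-based H and Ω standing in for learned values:

```python
import numpy as np

def predicted_adjacency(H, Omega):
    """Predicted adjacency matrix A = H @ Omega @ H.T, as in equation (14).

    H: (K, d) type characterization matrix (one row per event type)
    Omega: (d, d) learnable parameter matrix of the prediction layer
    Entry A[p, q] scores the connection strength of the edge p -> q.
    """
    return H @ Omega @ H.T

K, d = 4, 3
H = np.eye(K, d)                  # toy type characterizations
A = predicted_adjacency(H, np.eye(d))
```

With a non-symmetric learned Ω the score for p → q can differ from q → p, which is what lets the bilinear form express directed causal edges.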
In a specific embodiment, obtaining the real event relationship network graph may include: obtaining a plurality of user event sequences, wherein each user event sequence comprises a plurality of events made by a corresponding user and arranged in time order, and a causal relationship exists between the event types corresponding to any two adjacent events; and constructing the real event relationship network graph based on the plurality of user event sequences. It should be understood that the true adjacency matrix records the connection relationships between nodes in the real event relationship network graph. In a more specific embodiment, the true adjacency matrix also records the weights of the directed connection edges, and each weight is determined based on the statistical count of the corresponding causal relationship. In one example, the number of propagations between any two event types can be counted according to the user event sequences, and then the connection edge weight between the nodes corresponding to any two event types is calculated through the following formula:

e_pq = N_pq / N_max (15)

In the formula (15), N_pq represents the statistical number of times events of type p propagate to events of type q, N_max represents the maximum among all the statistical counts, and e_pq represents the weight of the connecting edge pointing from node p to node q.
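The counting scheme of formula (15) can be sketched directly from type sequences; treating each adjacent pair in a user's sequence as one propagation is the interpretation assumed here.

```python
import numpy as np
from collections import Counter

def edge_weights(sequences, K):
    """Edge weights e_pq = N_pq / N_max from user event-type sequences.

    sequences: list of per-user event-type sequences (lists of type indices);
    each adjacent pair (p, q) counts as one propagation from type p to q.
    N_max is the maximum count over all ordered type pairs, so the
    resulting weights lie in [0, 1].
    """
    counts = Counter()
    for seq in sequences:
        for p, q in zip(seq, seq[1:]):
            counts[(p, q)] += 1
    E = np.zeros((K, K))
    n_max = max(counts.values())
    for (p, q), n in counts.items():
        E[p, q] = n / n_max
    return E

# two users: 0->1 observed three times, 1->0 once, so N_max = 3
E = edge_weights([[0, 1, 0, 1], [0, 1]], K=2)
```

Normalizing by N_max rather than by row sums keeps the most frequent causal transition at weight 1 while preserving relative frequencies elsewhere.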
In a specific embodiment, determining the second loss term based on the true adjacency matrix and the predicted adjacency matrix may be implemented as: determining the second loss term L_graph by using the KL divergence (KL-divergence) or another measure capable of quantifying the distance between matrices.
It should be noted that, for the determination of the first loss term, reference may be made to the relevant description in the foregoing embodiments. After the first loss term and the second loss term are determined, the network parameters of the event prediction system are updated with the goal of reducing the combined loss of the first loss term and the second loss term. In one example, the combined loss is calculated as:

min L_nll + γL_graph (16)

In the formula (16), the first loss term is implemented as L_nll, whose meaning can be found in equation (13); L_graph represents the second loss term; γ denotes the weighting coefficient of L_graph, which is a hyperparameter and may be set, for example, to 0.02.
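The combined loss of equation (16) can be sketched as follows. The patent leaves the KL-based distance unspecified, so flattening both matrices and normalizing them into distributions is one illustrative choice; the function names and the eps smoothing are assumptions.

```python
import numpy as np

def kl_graph_loss(B_true, A_pred, eps=1e-8):
    """Graph-regularization term: KL divergence between the normalized
    true adjacency matrix and the normalized predicted adjacency matrix.
    Both matrices are flattened and smoothed into probability
    distributions; this is one way to quantify their distance.
    """
    p = B_true.flatten() + eps
    q = np.abs(A_pred).flatten() + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def combined_loss(l_nll, B_true, A_pred, gamma=0.02):
    # equation (16): combined loss = L_nll + gamma * L_graph
    return l_nll + gamma * kl_graph_loss(B_true, A_pred)

B = np.array([[0.0, 1.0], [0.0, 0.0]])
```

When the predicted adjacency matrix matches the true one, the second term vanishes and the combined loss reduces to L_nll alone.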
In the above, by introducing graph regularization processing, more effective training of the event prediction system can be realized.
In summary, the embodiments of the present specification innovatively provide an event prediction system framework built on neural networks. In the training process of the event prediction system, a deep neural network is used to perform feature extraction on event samples, the parameters of the intensity function are fitted in a hidden space (also called latent space), and the intensity function is mapped back to the event type space, so that intensity functions corresponding to the various event types are obtained; the update of the event prediction system is then realized in combination with label samples. Furthermore, graph regularization processing can be introduced into the updating process, so that a better training effect is obtained. Thus, through repeated iterative training, a trained event prediction system can be obtained, so that the intensity function of a target event sequence of a target user can be modeled, and accurate prediction of the next event after the target event sequence can be realized by adopting the resulting more accurate and more flexible intensity function.
Corresponding to the above updating method, the embodiment of the present specification further discloses an updating apparatus. FIG. 5 illustrates an update apparatus architecture diagram of an event prediction system, according to one embodiment. As shown in fig. 5, the illustrated apparatus 500 includes:
The sequence acquiring unit 510 is configured to sequentially acquire event samples, as first event samples, from an event sample sequence formed by arranging in time order, where the sample attributes include a first occurrence time and a first user identifier.

The event processing unit 520 is configured to input the first event sample into an event prediction system for event processing, where the event prediction system includes a sequence coding network, a graph propagation network, an intensity fitting network, and an intensity mapping network. The event processing unit 520 includes the following modules:

The encoding module 521 is configured to determine, through the sequence coding network, a sequence coding vector of a subsequence up to the first occurrence time, where each event sample in the subsequence corresponds to the first user identifier.

The graph propagation module 522 is configured to update node characterization vectors of the first user node and its neighbor nodes in the user relationship network graph according to the sequence coding vector, through the graph propagation network.

The intensity fitting module 523 is configured to determine, through the intensity fitting network, parameter values in an event occurrence intensity function corresponding to the first user identifier, according to the updated first node characterization vector of the first user node.

The intensity mapping module 524 is configured to map the event occurrence intensity function to an event type space through the intensity mapping network, to obtain a plurality of intensity functions of the first user identifier under a plurality of event types.

The parameter updating unit 530 is configured to update network parameters in the event prediction system based on the plurality of intensity functions and a second event sample corresponding to the first user identifier, where the second occurrence time corresponding to the second event sample is later than the first occurrence time.
In one embodiment, the sequence encoding network includes a linear embedding sub-network and a timing sub-network; the encoding module 521 is specifically configured to: determining type embedding vectors of event types corresponding to the event samples in the subsequence through the linear embedding sub-network; and outputting the sequence coding vector based on the sequentially input type embedding vector corresponding to each event sample through the time sequence sub-network.
In one embodiment, the graph propagation module 522 is specifically configured to: taking the first user node as a target node, and executing an updating operation aiming at the target node; wherein the update operation comprises: and determining the updated representation vector of the target node according to the target sequence coding vector corresponding to the target node, the current representation vector of the target neighbor node of the target node and the current representation vector of the target node.
In a specific embodiment, the graph propagation module 522 is further configured to: and taking the neighbor node of the first user node as a target node, and executing the updating operation.
In a particular embodiment, the graph propagation network includes a local propagation layer, a self-propagation layer, an exogenous propagation layer, and a fusion layer; wherein the update operation specifically includes: processing the current characterization vector of the target neighbor node through the local propagation layer to obtain a local propagation vector; performing linear transformation on the current characterization vector of the target node by using a first parameter matrix through the self-propagation layer to obtain a self-propagation vector; performing linear transformation on the target sequence coding vector by using a second parameter matrix through the exogenous propagation layer to obtain an exogenous propagation vector; and performing fusion processing on the local propagation vector, the self-propagation vector and the exogenous propagation vector through the fusion layer to obtain the updated representation vector of the target node.
In a more specific embodiment, the target neighbor node is a plurality of first order neighbor nodes of the target node; the graph propagation module 522 obtains a local propagation vector by performing the update operation, and specifically includes: and carrying out weighted summation on the current characterization vectors of the first-order neighbor nodes by using the weight parameter vector in the local propagation layer to obtain the local propagation vector.
In another more particular embodiment, the target neighbor node includes a plurality of first order neighbor nodes and a plurality of second order neighbor nodes of the target node; the graph propagation module 522 obtains a local propagation vector by performing the update operation, and specifically includes: aiming at each first-order neighbor node, determining a plurality of attention weights corresponding to a plurality of current characterization vectors of a plurality of second-order neighbor nodes by using the current characterization vector of the first-order neighbor node; weighting and summing the current characterization vectors by using the attention weights to obtain a neighbor aggregation vector of the first-order neighbor node; and carrying out weighted summation on a plurality of neighbor aggregation vectors of the first-order neighbor nodes by using the weight parameter vector in the local propagation layer to obtain the local propagation vector.
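The two-hop aggregation described by this module can be sketched as follows. The patent does not fix the attention computation, so the dot-product scoring and softmax normalization below are assumptions, as are all names.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def local_propagation(first_order, second_order, weights):
    """Two-hop local propagation sketch with dot-product attention.

    first_order: (n1, d) current vectors of the first-order neighbors
    second_order: list of n1 arrays, each (n2_i, d), the second-order
        neighbors attached to the corresponding first-order neighbor
    weights: (n1,) weight parameter vector of the local propagation layer
    For each first-order neighbor, attention weights over its second-order
    neighbors are computed from dot products with its own vector, giving a
    neighbor aggregation vector; these are then weight-summed.
    """
    aggs = []
    for h1, H2 in zip(first_order, second_order):
        att = softmax(H2 @ h1)        # attention over second-order neighbors
        aggs.append(att @ H2)         # neighbor aggregation vector
    return weights @ np.stack(aggs)   # weighted sum -> local propagation vector

rng = np.random.default_rng(1)
d, n1 = 4, 3
fo = rng.normal(size=(n1, d))
so = [rng.normal(size=(2, d)) for _ in range(n1)]
v = local_propagation(fo, so, np.ones(n1) / n1)
```

The outer weighted sum uses the learned weight parameter vector of the local propagation layer, while the inner attention is recomputed per first-order neighbor, matching the two-level structure described above.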
In one embodiment, the event occurrence intensity function includes a reference intensity, and the intensity fitting network includes a reference intensity determination layer; wherein the intensity fitting module 523 is specifically configured to: and performing linear transformation processing and activation processing on the first node characterization vector through the reference strength determination layer to obtain the reference strength.
In a specific embodiment, the event occurrence intensity function further includes historical stimulation coefficients and time attenuation coefficients, and the intensity fitting network further includes a stimulation coefficient determination layer and an attenuation coefficient determination layer; the intensity fitting module 523 is further configured to: determine, through the stimulation coefficient determination layer, a plurality of attention weights for a plurality of historical characterization vectors of the first user node according to the first node characterization vector, the historical characterization vectors being obtained based on other event samples in the subsequence; for each other event sample, determine the product of the corresponding attention weight and historical characterization vector as the corresponding historical stimulation coefficient; and, through the attenuation coefficient determination layer, respectively fuse the first node characterization vector with the plurality of historical characterization vectors to obtain a plurality of fusion vectors, and sequentially perform linear transformation and activation processing on each fusion vector to obtain the corresponding time attenuation coefficients.
In an embodiment, the parameter updating unit 530 is specifically configured to: determining an intensity function corresponding to the same event type as the second event sample from the plurality of intensity functions; and updating the network parameters based on the intensity function and the occurrence time corresponding to the second event sample.
In one embodiment, the event prediction system further comprises an adjacency matrix prediction layer; the apparatus 500 further comprises: an adjacency matrix prediction unit 540, configured to determine, through the adjacency matrix prediction layer, a predicted adjacency matrix of a virtual event relationship network graph constructed based on the plurality of event types, according to the characterization vectors of the nodes in the user relationship network graph. The parameter updating unit 530 is specifically configured to: determine a first loss term based on the plurality of intensity functions and the second event sample; acquire a real event relationship network graph, wherein the real event relationship network graph comprises a plurality of type nodes corresponding to the plurality of event types and directed connection edges formed by causal relationships among the type nodes; determine a second loss term based on the true adjacency matrix of the real event relationship network graph and the predicted adjacency matrix; and update the network parameters based on the first loss term and the second loss term.
In a specific embodiment, the updating unit 530 is configured to obtain a real event relationship network graph, including: acquiring a plurality of user event sequences, wherein each user event sequence comprises a plurality of events which are made by a corresponding user and are arranged according to a time sequence, and the causal relationship exists between event types corresponding to any two adjacent events; and constructing the real event relation network graph based on the plurality of user event sequences.
In a more specific embodiment, the truth adjacency matrix includes a weight of the directed connecting edge, which is determined based on a statistical number of the causal relationship.
On the other hand, in a specific embodiment, the adjacency matrix prediction unit 540 is specifically configured to: determine a type characterization vector of each event type in the plurality of event types based on the characterization vectors of the nodes, so as to form a type characterization matrix; and determine the predicted adjacency matrix based on the type characterization matrix and the learning parameter matrix in the adjacency matrix prediction layer.
In a more specific embodiment, the apparatus further comprises: the vector updating unit 550 is configured to update the type characterization vector corresponding to the event type of the first event sample to the first node characterization vector.
In summary, the embodiments of the present specification innovatively provide an event prediction system framework built on neural networks. In the training process of the event prediction system, a deep neural network is used to perform feature extraction on event samples, the parameters of the intensity function are fitted in a hidden space (also called latent space), and the intensity function is mapped back to the event type space, so that intensity functions corresponding to the various event types are obtained; the update of the event prediction system is then realized in combination with label samples. Furthermore, graph regularization processing can be introduced into the updating process, so that a better training effect is obtained. Thus, through repeated iterative training, a trained event prediction system can be obtained, so that the intensity function of a target event sequence of a target user can be modeled, and accurate prediction of the next event after the target event sequence can be realized by adopting the resulting more accurate and more flexible intensity function.
According to an embodiment of a further aspect, the present specification further discloses an event prediction system. The event prediction system comprises: an input layer, configured to sequentially acquire event samples, as first event samples, from an event sample sequence formed by arranging in time order, where the sample attributes of the first event sample include a first occurrence time and a first user identifier; a sequence coding network, configured to determine a sequence coding vector of a subsequence up to the first occurrence time, where each event sample in the subsequence corresponds to the first user identifier; a graph propagation network, configured to update the node characterization vectors of the first user node and its neighbor nodes in the user relationship network graph according to the sequence coding vector; an intensity fitting network, configured to determine parameter values in an event occurrence intensity function corresponding to the first user identifier according to the updated first node characterization vector of the first user node; an intensity mapping network, configured to map the event occurrence intensity function to an event type space, to obtain a plurality of intensity functions of the first user identifier under a plurality of event types; and an output layer, configured to output an event prediction result corresponding to the first user identifier based on the plurality of intensity functions, where the event prediction result includes a predicted event type and a predicted occurrence time. It should be noted that, for the description of the event prediction system, reference can be made to the related description in the foregoing embodiments.
In addition, in the case where the event prediction system includes the adjacency matrix prediction layer 105 during training, the trained adjacency matrix prediction layer 105 can be removed from the event prediction system when the trained system is used, and the modeling of the intensity function and the prediction of future events can be realized using the remaining network part.
According to an embodiment of another aspect, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method described in connection with fig. 2.
According to an embodiment of yet another aspect, there is also provided a computing device comprising a memory having stored therein executable code, and a processor that, when executing the executable code, implements the method described in connection with fig. 2.
Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in this invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
The above-mentioned embodiments, objects, technical solutions and advantages of the present invention are further described in detail, it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made on the basis of the technical solutions of the present invention should be included in the scope of the present invention.

Claims (25)

1. An updating method of an event prediction system comprises the following steps:
the method comprises the steps that event samples are sequentially obtained from an event sample sequence formed by arranging according to a time sequence and serve as first event samples, and the sample attributes of the first event samples comprise first occurrence time and first user identification;
inputting the first event sample into an event prediction system for event processing, wherein the event prediction system comprises a sequence coding network, a graph propagation network, an intensity fitting network and an intensity mapping network; the event processing comprises the following steps:
determining a sequence coding vector of a subsequence up to the first occurrence moment through the sequence coding network, wherein each event sample in the subsequence corresponds to the first user identifier;
updating node characterization vectors of a first user node and neighbor nodes thereof in the user relationship network graph through the graph propagation network according to the sequence coding vectors;
determining parameter values in an event occurrence intensity function corresponding to the first user identification according to the updated first node characterization vector of the first user node through the intensity fitting network;
mapping the event occurrence intensity function to an event type space through the intensity mapping network to obtain a plurality of intensity functions of the first user identifier under a plurality of event types;
updating network parameters in the event prediction system based on the plurality of intensity functions and a second event sample corresponding to the first subscriber identity; and the second occurrence time corresponding to the second event sample is later than the first occurrence time.
2. The method of claim 1, wherein the sequence encoding network comprises a linear embedding sub-network and a timing sub-network; wherein determining the sequence-encoded vector of the sub-sequence up to the first occurrence time comprises:
determining type embedding vectors of event types corresponding to the event samples in the subsequence through the linear embedding sub-network;
and outputting the sequence coding vector based on the sequentially input type embedding vector corresponding to each event sample through the time sequence sub-network.
3. The method of claim 1, wherein updating the node characterization vectors for the first user node and its neighbor nodes in the user relationship network graph comprises:
taking the first user node as a target node, and executing an updating operation aiming at the target node;
wherein the update operation comprises: and determining the updated representation vector of the target node according to the target sequence coding vector corresponding to the target node, the current representation vector of the target neighbor node of the target node and the current representation vector of the target node.
4. The method of claim 3, wherein after performing the update operation with the first user node as the target node, the method further comprises:
and taking the neighbor node of the first user node as a target node, and executing the updating operation.
5. The method of claim 3 or 4, wherein the graph propagation network comprises a local propagation layer, a self propagation layer, an exogenous propagation layer, and a fusion layer; wherein the update operation specifically includes:
processing the current characterization vector of the target neighbor node through the local propagation layer to obtain a local propagation vector;
performing linear transformation on the current characterization vector of the target node by using a first parameter matrix through the self-propagation layer to obtain a self-propagation vector;
performing linear transformation on the target sequence coding vector by using a second parameter matrix through the exogenous propagation layer to obtain an exogenous propagation vector;
and performing fusion processing on the local propagation vector, the self-propagation vector and the exogenous propagation vector through the fusion layer to obtain the updated representation vector of the target node.
6. The method of claim 5, wherein the target neighbor node is a plurality of first order neighbor nodes of the target node; wherein, processing the current characterization vector of the target neighbor node to obtain a local propagation vector, includes:
and carrying out weighted summation on the current characterization vectors of the first-order neighbor nodes by using the weight parameter vector in the local propagation layer to obtain the local propagation vector.
7. The method of claim 5, wherein the target neighbor node comprises a plurality of first order neighbor nodes and a plurality of second order neighbor nodes of the target node; wherein, processing the current characterization vector of the target neighbor node to obtain a local propagation vector, includes:
aiming at each first-order neighbor node, determining a plurality of attention weights corresponding to a plurality of current characterization vectors of a plurality of second-order neighbor nodes by using the current characterization vector of the first-order neighbor node;
weighting and summing the current characterization vectors by using the attention weights to obtain a neighbor aggregation vector of the first-order neighbor node;
and carrying out weighted summation on a plurality of neighbor aggregation vectors of the first-order neighbor nodes by using the weight parameter vector in the local propagation layer to obtain the local propagation vector.
8. The method of claim 1, wherein the event occurrence intensity function includes a reference intensity, and the intensity fitting network includes a reference intensity determination layer; wherein determining parameter values in an event occurrence strength function corresponding to the first user comprises:
and performing linear transformation processing and activation processing on the first node characterization vector through the reference strength determination layer to obtain the reference strength.
9. The method according to claim 8, wherein the event occurrence intensity function further comprises historical stimulation coefficients and time attenuation coefficients, and the intensity fitting network further comprises a stimulation coefficient determination layer and an attenuation coefficient determination layer; wherein determining parameter values in the event occurrence intensity function corresponding to the first user identifier further comprises:
determining, through the stimulation coefficient determination layer and based on the first node characterization vector, a plurality of attention weights for a plurality of historical characterization vectors of the first user node, wherein the historical characterization vectors are derived based on a plurality of other event samples in the subsequence; and, for each other event sample, determining the product of the corresponding attention weight and the corresponding historical characterization vector as the corresponding historical stimulation coefficient;
and fusing, through the attenuation coefficient determination layer, the first node characterization vector with each of the plurality of historical characterization vectors to obtain a plurality of fusion vectors, and sequentially performing linear transformation and activation processing on each fusion vector to obtain the corresponding time attenuation coefficient.
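A reference intensity, per-event stimulation coefficients, and time attenuation coefficients together describe a Hawkes-style conditional intensity. A minimal sketch, with the exponential-decay kernel as an assumed concrete form and the stimulation coefficients treated as scalars for illustration:

```python
import numpy as np

def intensity(t, mu, alphas, deltas, times):
    """Conditional intensity at time t: reference intensity mu plus the
    excitation of each past event, decaying exponentially with its own
    time attenuation coefficient."""
    past = [(a, d, ti) for a, d, ti in zip(alphas, deltas, times) if ti < t]
    return mu + sum(a * np.exp(-d * (t - ti)) for a, d, ti in past)
```

Events after `t` contribute nothing, so the function is piecewise smooth and jumps upward at each event time.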
10. The method of claim 1, wherein updating network parameters in the event prediction system based on the plurality of intensity functions and a second event sample corresponding to the first user identifier comprises:
determining, from the plurality of intensity functions, the intensity function corresponding to the same event type as the second event sample;
and updating the network parameters based on that intensity function and the occurrence time corresponding to the second event sample.
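A standard training signal for this update step (an assumption here; the claim only requires using the intensity function and the occurrence time) is the temporal point process negative log-likelihood of the second event sample:

```python
import numpy as np

def nll_loss(intensities, event_type, t_prev, t_next, n_grid=200):
    """Point-process negative log-likelihood of the next observed event:
    -log lambda_k(t_next) for the matching event type k, plus the total
    intensity integrated over (t_prev, t_next], approximated here by the
    trapezoidal rule on a fixed grid."""
    lam_k = intensities[event_type](t_next)
    ts = np.linspace(t_prev, t_next, n_grid)
    total = np.array([sum(lam(t) for lam in intensities) for t in ts])
    integral = np.sum((total[:-1] + total[1:]) / 2.0 * np.diff(ts))
    return -np.log(lam_k) + integral
```

Minimizing this loss in `t_next` and `event_type` pushes the matching intensity up at the observed time while penalizing total intensity everywhere else in the interval.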
11. The method of claim 1, wherein the event prediction system further comprises an adjacency matrix prediction layer; wherein, prior to updating the network parameters in the event prediction system, the method further comprises:
determining, through the adjacency matrix prediction layer and according to the characterization vectors of the nodes in the user relationship network graph, a predicted adjacency matrix of a virtual event relationship network graph constructed based on the plurality of event types;
wherein updating the network parameters in the event prediction system comprises:
determining a first loss term based on the plurality of intensity functions and the second event sample;
acquiring a real event relationship network graph, wherein the real event relationship network graph comprises a plurality of type nodes corresponding to the plurality of event types and directed connection edges formed by causal relationships between the type nodes;
determining a second loss term based on a true adjacency matrix of the real event relationship network graph and the predicted adjacency matrix;
and updating the network parameters based on the first loss term and the second loss term.
12. The method of claim 11, wherein acquiring the real event relationship network graph comprises:
acquiring a plurality of user event sequences, wherein each user event sequence comprises a plurality of events performed by a corresponding user and arranged in chronological order, and a causal relationship exists between the event types corresponding to any two adjacent events;
and constructing the real event relationship network graph based on the plurality of user event sequences.
13. The method according to claim 11 or 12, wherein the true adjacency matrix includes weights of the directed connection edges, each weight being determined based on a statistical count of the corresponding causal relationship.
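Claims 12 and 13 together amount to counting adjacent event-type transitions across user sequences. A minimal sketch, where using the raw transition counts as edge weights is an assumption consistent with claim 13:

```python
from collections import Counter
import numpy as np

def build_true_adjacency(user_sequences, num_types):
    """Treat each pair of adjacent events in a user's chronologically
    ordered sequence as a directed edge between their event types;
    the edge weight is the number of observed transitions."""
    counts = Counter()
    for seq in user_sequences:            # seq: event-type ids in time order
        for src, dst in zip(seq, seq[1:]):
            counts[(src, dst)] += 1
    adj = np.zeros((num_types, num_types))
    for (src, dst), c in counts.items():
        adj[src, dst] = c
    return adj
```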
14. The method of claim 11, wherein determining the predicted adjacency matrix of the virtual event relationship network graph constructed based on the plurality of event types according to the characterization vectors of the nodes in the user relationship network graph comprises:
determining a type characterization vector of each of the plurality of event types based on the characterization vectors of the nodes, so as to form a type characterization matrix;
and determining the predicted adjacency matrix based on the type characterization matrix and a learnable parameter matrix in the adjacency matrix prediction layer.
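One concrete way to combine a type characterization matrix with a learnable parameter matrix is a bilinear score with a sigmoid; this particular form is an assumption, as the claim only requires that the two matrices jointly determine the predicted adjacency matrix:

```python
import numpy as np

def predict_adjacency(type_matrix, w):
    """Predicted adjacency A = sigmoid(E @ W @ E.T), where E stacks the
    type characterization vectors and W is the learnable parameter
    matrix of the adjacency matrix prediction layer."""
    scores = type_matrix @ w @ type_matrix.T
    return 1.0 / (1.0 + np.exp(-scores))
```

Each entry then lies in (0, 1) and can be compared against the (suitably normalized) true adjacency matrix to form the second loss term.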
15. The method of claim 14, wherein prior to updating the network parameters in the event prediction system, the method further comprises:
updating the type characterization vector corresponding to the event type of the first event sample to the first node characterization vector.
16. An updating apparatus for an event prediction system, comprising:
an event sample acquisition unit configured to sequentially acquire event samples, as first event samples, from an event sample sequence arranged in chronological order, wherein the sample attributes of each event sample comprise a first occurrence time and a first user identifier;
an event processing unit configured to input the first event sample into an event prediction system for event processing, wherein the event prediction system comprises a sequence coding network, a graph propagation network, an intensity fitting network and an intensity mapping network; the event processing unit comprises the following modules:
an encoding module configured to determine, through the sequence coding network, a sequence coding vector of a subsequence up to the first occurrence time, wherein each event sample in the subsequence corresponds to the first user identifier;
a graph propagation module configured to update, through the graph propagation network and according to the sequence coding vector, the node characterization vectors of the first user node and its neighbor nodes in the user relationship network graph;
an intensity fitting module configured to determine, through the intensity fitting network and according to the updated first node characterization vector of the first user node, parameter values in an event occurrence intensity function corresponding to the first user identifier;
an intensity mapping module configured to map, through the intensity mapping network, the event occurrence intensity function to an event type space to obtain a plurality of intensity functions of the first user identifier under a plurality of event types;
and a parameter updating unit configured to update network parameters in the event prediction system based on the plurality of intensity functions and a second event sample corresponding to the first user identifier, wherein a second occurrence time corresponding to the second event sample is later than the first occurrence time.
17. The apparatus of claim 16, wherein the graph propagation module is specifically configured to:
take the first user node as a target node, and perform an update operation for the target node;
wherein the update operation comprises: determining an updated characterization vector of the target node according to a target sequence coding vector corresponding to the target node, current characterization vectors of target neighbor nodes of the target node, and the current characterization vector of the target node.
18. The apparatus of claim 17, wherein the graph propagation network comprises a local propagation layer, a self-propagation layer, an exogenous propagation layer, and a fusion layer; wherein the update operation performed by the graph propagation module specifically comprises:
processing the current characterization vectors of the target neighbor nodes through the local propagation layer to obtain a local propagation vector;
performing linear transformation on the current characterization vector of the target node using a first parameter matrix through the self-propagation layer to obtain a self-propagation vector;
performing linear transformation on the target sequence coding vector using a second parameter matrix through the exogenous propagation layer to obtain an exogenous propagation vector;
and fusing the local propagation vector, the self-propagation vector and the exogenous propagation vector through the fusion layer to obtain the updated characterization vector of the target node.
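The three propagation paths and the fusion layer can be sketched as follows. Element-wise summation followed by a tanh activation is an assumed fusion rule; the claim only specifies that the three vectors are fused:

```python
import numpy as np

def update_node(h_node, h_local, s_seq, w_self, w_exo):
    """Fuse the local, self and exogenous propagation vectors into the
    updated characterization vector of the target node."""
    self_vec = w_self @ h_node   # self-propagation: first parameter matrix
    exo_vec = w_exo @ s_seq      # exogenous propagation: second parameter matrix
    return np.tanh(h_local + self_vec + exo_vec)
```

`h_local` is the output of the local propagation layer, and `s_seq` is the target sequence coding vector; the two parameter matrices project both inputs into the node-characterization dimension before fusion.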
19. The apparatus of claim 18, wherein the target neighbor nodes comprise a plurality of first-order neighbor nodes and a plurality of second-order neighbor nodes of the target node; wherein the graph propagation module obtains the local propagation vector in the update operation by:
for each first-order neighbor node, determining, using the current characterization vector of the first-order neighbor node, a plurality of attention weights corresponding to a plurality of current characterization vectors of a plurality of second-order neighbor nodes;
performing weighted summation on the plurality of current characterization vectors using the plurality of attention weights to obtain a neighbor aggregation vector of the first-order neighbor node;
and performing weighted summation on the plurality of neighbor aggregation vectors of the plurality of first-order neighbor nodes using a weight parameter vector in the local propagation layer to obtain the local propagation vector.
20. The apparatus of claim 16, wherein the event occurrence intensity function includes a reference intensity, and the intensity fitting network includes a reference intensity determination layer; wherein the intensity fitting module is specifically configured to:
perform linear transformation processing and activation processing on the first node characterization vector through the reference intensity determination layer to obtain the reference intensity.
21. The apparatus according to claim 20, wherein the event occurrence intensity function further comprises historical stimulation coefficients and time attenuation coefficients, and the intensity fitting network further comprises a stimulation coefficient determination layer and an attenuation coefficient determination layer; wherein the intensity fitting module is further configured to:
determine, through the stimulation coefficient determination layer and based on the first node characterization vector, a plurality of attention weights for a plurality of historical characterization vectors of the first user node, wherein the historical characterization vectors are derived based on a plurality of other event samples in the subsequence; and, for each other event sample, determine the product of the corresponding attention weight and the corresponding historical characterization vector as the corresponding historical stimulation coefficient;
and fuse, through the attenuation coefficient determination layer, the first node characterization vector with each of the plurality of historical characterization vectors to obtain a plurality of fusion vectors, and sequentially perform linear transformation and activation processing on each fusion vector to obtain the corresponding time attenuation coefficient.
22. The apparatus of claim 16, wherein the event prediction system further comprises an adjacency matrix prediction layer; the apparatus further comprises an adjacency matrix prediction unit configured to:
determine, through the adjacency matrix prediction layer and according to the characterization vectors of the nodes in the user relationship network graph, a predicted adjacency matrix of a virtual event relationship network graph constructed based on the plurality of event types;
wherein the parameter updating unit is specifically configured to:
determine a first loss term based on the plurality of intensity functions and the second event sample;
acquire a real event relationship network graph, wherein the real event relationship network graph comprises a plurality of type nodes corresponding to the plurality of event types and directed connection edges formed by causal relationships between the type nodes;
determine a second loss term based on a true adjacency matrix of the real event relationship network graph and the predicted adjacency matrix;
and update the network parameters based on the first loss term and the second loss term.
23. An event prediction system comprising:
an input layer for sequentially acquiring event samples, as first event samples, from an event sample sequence arranged in chronological order, wherein the sample attributes of each first event sample comprise a first occurrence time and a first user identifier;
a sequence coding network for determining a sequence coding vector of a subsequence up to the first occurrence time, wherein each event sample in the subsequence corresponds to the first user identifier;
a graph propagation network for updating, according to the sequence coding vector, the node characterization vectors of the first user node and its neighbor nodes in the user relationship network graph;
an intensity fitting network for determining, according to the updated first node characterization vector of the first user node, parameter values in an event occurrence intensity function corresponding to the first user identifier;
an intensity mapping network for mapping the event occurrence intensity function to an event type space to obtain a plurality of intensity functions of the first user identifier under a plurality of event types;
and an output layer for outputting an event prediction result corresponding to the first user identifier based on the plurality of intensity functions, wherein the event prediction result comprises a predicted event type and a predicted occurrence time.
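One way an output layer can read a predicted event type and occurrence time out of a set of per-type intensity functions is discretized simulation of the point process; this readout is an assumption, since the claim does not fix how the output layer uses the intensities:

```python
import numpy as np

def predict_next_event(intensities, t_now, horizon, step=0.01, seed=0):
    """Step forward in time; in each small interval, an event occurs with
    probability 1 - exp(-total_intensity * step), and its type is drawn
    in proportion to the per-type intensities at that time."""
    rng = np.random.default_rng(seed)
    t = t_now
    while t < horizon:
        t += step
        lams = np.array([lam(t) for lam in intensities])
        if rng.random() < 1.0 - np.exp(-lams.sum() * step):
            k = rng.choice(len(lams), p=lams / lams.sum())
            return int(k), t
    return None, horizon
```

Averaging the returned times over many seeds approximates the expected next occurrence time; alternatively, the most likely next type at a candidate time is simply the argmax over the per-type intensities.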
24. A computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of any one of claims 1-15.
25. A computing device comprising a memory and a processor, wherein the memory stores executable code, and the processor, when executing the executable code, implements the method of any one of claims 1-15.
CN202110631255.8A 2021-06-07 2021-06-07 Updating method and device of event prediction system Active CN113283589B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110631255.8A CN113283589B (en) 2021-06-07 2021-06-07 Updating method and device of event prediction system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110631255.8A CN113283589B (en) 2021-06-07 2021-06-07 Updating method and device of event prediction system

Publications (2)

Publication Number Publication Date
CN113283589A true CN113283589A (en) 2021-08-20
CN113283589B CN113283589B (en) 2022-07-19

Family

ID=77283515

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110631255.8A Active CN113283589B (en) 2021-06-07 2021-06-07 Updating method and device of event prediction system

Country Status (1)

Country Link
CN (1) CN113283589B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116821374A (en) * 2023-07-27 2023-09-29 中国人民解放军陆军工程大学 Event prediction method based on information

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107944610A (en) * 2017-11-17 2018-04-20 平安科技(深圳)有限公司 Predicted events measure of popularity, server and computer-readable recording medium
CN109659033A (en) * 2018-12-18 2019-04-19 浙江大学 A kind of chronic disease change of illness state event prediction device based on Recognition with Recurrent Neural Network
CN109961192A (en) * 2019-04-03 2019-07-02 南京中科九章信息技术有限公司 Object event prediction technique and device
EP3564889A1 (en) * 2018-05-04 2019-11-06 The Boston Consulting Group, Inc. Systems and methods for learning and predicting events
CN112183881A (en) * 2020-10-19 2021-01-05 中国人民解放军国防科技大学 Public opinion event prediction method and device based on social network and storage medium
CN112580789A (en) * 2021-02-22 2021-03-30 支付宝(杭州)信息技术有限公司 Training graph coding network, and method and device for predicting interaction event
CN112905801A (en) * 2021-02-08 2021-06-04 携程旅游信息技术(上海)有限公司 Event map-based travel prediction method, system, device and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
RAKSHIT TRIVEDI 等: "DYREP: LEARNING REPRESENTATIONS OVER DYNAMIC GRAPHS", 《ICLR2019》 *
WEICHANG WU 等: "Modeling Event Propagation via Graph Biased Temporal Point Process", 《ARXIV:1908.01623V2 [CS.SI]》 *

Also Published As

Publication number Publication date
CN113283589B (en) 2022-07-19

Similar Documents

Publication Publication Date Title
CN111814977B (en) Method and device for training event prediction model
CN107391542B (en) Open source software community expert recommendation method based on file knowledge graph
Yuan et al. Importance sampling algorithms for Bayesian networks: Principles and performance
CN111651671B (en) User object recommendation method, device, computer equipment and storage medium
CN111079931A (en) State space probabilistic multi-time-series prediction method based on graph neural network
CN111708876B (en) Method and device for generating information
CN112910710B (en) Network flow space-time prediction method and device, computer equipment and storage medium
CN112214499B (en) Graph data processing method and device, computer equipment and storage medium
CN112085615A (en) Method and device for training graph neural network
CN112529071B (en) Text classification method, system, computer equipment and storage medium
CN113407784A (en) Social network-based community dividing method, system and storage medium
US10878334B2 (en) Performing regression analysis on personal data records
CN111428866A (en) Incremental learning method and device, storage medium and electronic equipment
CN110335160B (en) Medical care migration behavior prediction method and system based on grouping and attention improvement Bi-GRU
CN113283589B (en) Updating method and device of event prediction system
CN113610610B (en) Session recommendation method and system based on graph neural network and comment similarity
Li et al. Dynamic multi-view group preference learning for group behavior prediction in social networks
CN113592593A (en) Training and application method, device, equipment and storage medium of sequence recommendation model
Cao et al. Implicit user relationships across sessions enhanced graph for session-based recommendation
CN111957053A (en) Game player matching method and device, storage medium and electronic equipment
CN115829110A (en) Method and device for predicting user behavior based on Markov logic network
CN113256024B (en) User behavior prediction method fusing group behaviors
CN111935259B (en) Method and device for determining target account set, storage medium and electronic equipment
CN108876031B (en) Software developer contribution value prediction method
CN110727705A (en) Information recommendation method and device, electronic equipment and computer-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant