WO2024057414A1

WO2024057414A1 - Information processing device, information processing method, and program

Info

Publication number: WO2024057414A1
Application number: PCT/JP2022/034271
Authority: WO
Inventors: 祥章瀧本; 真耶大川; 具治岩田; 佑典田中; 秀明金
Original assignee: 日本電信電話株式会社
Priority date: 2022-09-13
Filing date: 2022-09-13
Publication date: 2024-03-21

Abstract

An information processing device according to one embodiment of the present invention comprises: a first monotonically-increasing neural network; a second monotonically-increasing neural network; a first calculation unit that calculates a first cumulative function on the basis of a parameter and an output from the first monotonically-increasing neural network; and a second calculation unit that calculates a second cumulative function on the basis of a cycle, a parameter, and an output from the second monotonically-increasing neural network.

Description

Information processing device, information processing method, and program

Embodiments of the present invention relate to an information processing device, an information processing method, and a program.

A method using a point process is known as one of the methods for predicting the occurrence of various events such as equipment failure, human behavior, crime, earthquakes, and infectious diseases. A point process is a probabilistic model that describes the timing of events.

Neural networks (NN) are known as a technology that can model point processes at high speed and with high accuracy. As one type of neural network, a monotonic neural network (MNN) has been proposed.

However, monotonically increasing neural networks may be inferior to ordinary neural networks in terms of expressiveness. Furthermore, monotonically increasing neural networks may lack stability in learning processing due to disappearance or divergence of the gradient of the activation function. The above-mentioned challenges of monotonically increasing neural networks become especially pronounced when predicting events over time.
Furthermore, it is difficult for monotonically increasing neural networks to incorporate human knowledge, such as knowledge that the intensity function changes periodically, such as on the day of the week.

The present invention has been made in view of the above circumstances, and its purpose is to provide a means that enables long-term prediction of events.

In one embodiment, the information processing device generates a first cumulative function based on a first monotonically increasing neural network, a second monotonically increasing neural network, an output from the first monotonically increasing neural network, and a parameter. and a second calculation unit that calculates a second cumulative function based on the output from the second monotonically increasing neural network, a parameter, and a period.

An information processing method of one aspect is a method performed by an information processing device, wherein a first output unit of the information processing device outputs a scalar value according to a monotonically increasing function from a first monotonically increasing neural network. a second output unit of the information processing device outputs a scalar value according to a monotonically increasing function from the second monotonically increasing neural network; and a first calculation unit of the information processing device outputs a scalar value according to a monotonically increasing function; calculating a first cumulative function based on a scalar value output from the monotonically increasing neural network and a parameter; and outputting from the second monotonically increasing neural network by a second calculation unit of the information processing device. calculating a second cumulative function based on the calculated scalar value, the parameter, and the period.

According to the embodiment, it is possible to provide a means that enables long-term prediction of events.

FIG. 1 is a block diagram showing an example of the hardware configuration of an event prediction device according to the first embodiment. FIG. 2 is a block diagram illustrating an example of the configuration of a learning function of the event prediction device according to the first embodiment. FIG. 3 is a diagram illustrating an example of the structure of a sequence in a learning data set of the event prediction device according to the first embodiment. FIG. 4 is a block diagram illustrating an example of the configuration of a prediction function of the event prediction device according to the first embodiment. FIG. 5 is a diagram illustrating an example of the configuration of prediction data of the event prediction device according to the first embodiment. FIG. 6 is a flowchart illustrating an example of a learning operation in the event prediction device according to the first embodiment. FIG. 7 is a flowchart illustrating an example of a prediction operation in the event prediction device according to the first embodiment. FIG. 8 is a block diagram illustrating an example of the configuration of a learning function of the event prediction device according to the second embodiment. FIG. 9 is a block diagram illustrating an example of a configuration of a prediction function of an event prediction device according to the second embodiment. FIG. 10 is a flowchart illustrating an example of an overview of learning operations in the event prediction device according to the second embodiment. FIG. 11 is a flowchart illustrating an example of the first update process in the event prediction device according to the second embodiment. FIG. 12 is a flowchart illustrating an example of the second update process in the event prediction device according to the second embodiment. FIG. 13 is a flowchart illustrating an example of a prediction operation in the event prediction device according to the second embodiment. FIG. 14 is a block diagram illustrating an example of a configuration of a latent expression calculation unit of an event prediction device according to a first modification. FIG. 15 is a block diagram illustrating an example of the configuration of the intensity function calculation unit of the event prediction device according to the second modification. FIG. 16 is a block diagram illustrating an example of the configuration of the first intensity function calculation unit of the event prediction device according to the third modification. FIG. 17 is a block diagram illustrating an example of the configuration of the second intensity function calculation unit of the event prediction device according to the third modification.

Hereinafter, some embodiments will be described with reference to the drawings. In the following description, common reference numerals are given to components having the same function and configuration. Furthermore, when distinguishing a plurality of components having a common reference numeral, they are distinguished by a further reference numeral (for example, a hyphen and a number such as "-1") appended to the common reference numeral.

1. First Embodiment An information processing apparatus according to a first embodiment will be described. Below, an event prediction device will be described as an example of the information processing device according to the first embodiment.

The event prediction device includes a learning function and a prediction function. The learning function is a function for meta-learning a point process. The prediction function is a function that predicts the occurrence of an event based on the point process learned by the learning function. An event is a phenomenon that occurs discretely over continuous time. Specifically, for example, the event is a user's purchasing behavior on an EC (Electronic Commerce) site. Meta-learning is a method using MAML (Model-Agnostic Meta-Learning), for example, and is described in the document “Chelsea Finn, et al., “Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks,” Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, PMLR 70:1126-1135, 2017.,<
Disclosed at https://arxiv.org/abs/1703.03400>.

1.1 Configuration The configuration of the event prediction device according to the first embodiment will be described.

1.1.1 Hardware Configuration FIG. 1 is a block diagram showing an example of the hardware configuration of the event prediction device according to the first embodiment. As shown in FIG. 1, the event prediction device 1 includes a control circuit 10, a memory 11, a communication module 12, a user interface 13, and a drive 14.

The control circuit 10 is a circuit that controls each component of the event prediction device 1 as a whole. The control circuit 10 includes a CPU (Central Processing Unit), a RAM (Random Access Memory), a ROM (Read Only Memory), and the like.

The memory 11 is a storage device of the event prediction device 1. The memory 11 includes, for example, an HDD (Hard Disk Drive), an SSD (Solid State Drive), a memory card, and the like. The memory 11 stores information used for learning operations and prediction operations in the event prediction device 1. The memory 11 also stores a learning program for causing the control circuit 10 to perform a learning operation and a prediction program for causing the control circuit 10 to perform a predictive operation.

The communication module 12 is a circuit used for transmitting and receiving data to and from the outside of the event prediction device 1 via a network.

The user interface 13 is a circuit for communicating information between the user and the control circuit 10. User interface 13 includes input equipment and output equipment. Input devices include, for example, a touch panel and operation buttons. The output device includes, for example, an LCD (Liquid Crystal Display), an EL (Electroluminescence) display, and a printer. The user interface 13 outputs, for example, the execution results of various programs received from the control circuit 10 to the user.

The drive 14 is a device for reading programs stored in the storage medium 15. The drive 14 includes, for example, a CD (Compact Disk) drive, a DVD (Digital Versatile Disk) drive, and the like.

The storage medium 15 is a medium that stores information such as programs through electrical, magnetic, optical, mechanical, or chemical action. The storage medium 15 may store a learning program and a prediction program.

1.1.2 Learning Function Configuration FIG. 2 is a block diagram showing an example of the learning function configuration of the event prediction device according to the first embodiment.

The CPU of the control circuit 10 loads the learning program stored in the memory 11 or the storage medium 15 into the RAM. The CPU of the control circuit 10 controls the memory 11, the communication module 12, the user interface 13, the drive 14, and the storage medium 15 by interpreting and executing the learning program developed in the RAM. As a result, as shown in FIG. 2, the event prediction device 1 includes a computer including a data extraction unit 21, an initialization unit 22, a latent expression calculation unit 23, an intensity function calculation unit 24, an update unit 25, and a determination unit 26. functions as The memory 11 of the event prediction device 1 also stores a learning data set 20 and learned parameters 27 as information used for learning operations.

The learning dataset 20 is, for example, a collection of event sequences of multiple users at a certain EC site. Alternatively, the learning dataset 20 is a collection of event sequences of a certain user at multiple EC sites. The learning dataset 20 has multiple sequences Ev. When the learning dataset 20 is a collection of event sequences of multiple users at a certain EC site, each sequence Ev corresponds to, for example, a user.
When the learning dataset 20 is a collection of event sequences of a certain user at multiple EC sites, each sequence Ev corresponds to, for example, an EC site. Each sequence Ev is information including occurrence times t _i (1≦i≦I) of I events that occurred during a period [0, t ^e ] (I is an integer equal to or greater than 1). The number of events I in each sequence Ev may be different from each other. In other words, the data length of each sequence Ev may be any length.

The data extraction unit 21 extracts the series Ev from the learning data set 20. The data extraction unit 21 further extracts a support sequence Es and a query sequence Eq from the extracted sequence Ev. The data extraction unit 21 transmits the support sequence Es and the query sequence Eq to the latent expression calculation unit 23 and the update unit 25, respectively.

FIG. 3 is a diagram illustrating an example of the configuration of a series of learning data sets of the event prediction device according to the first embodiment. As shown in FIG. 3, the support sequence Es and the query sequence Eq are partial sequences of the sequence Ev.

The support sequence Es is a partial sequence corresponding to the period [0, t _s ] of the sequence Ev (Es={t _i |0≦t _i ≦t _s }). The time _ts is arbitrarily determined within the range from time 0 to less than time ^te .

The query sequence Eq is a partial sequence corresponding to the period [t _s , t _q ] of the sequence Ev (Eq={t _i |t _s <t _i ≦t _q }). The time _tq is arbitrarily determined within a range greater than the time _ts and less than or equal to the time ^te .

Referring again to FIG. 2, the configuration of the learning function of the event prediction device 1 will be described.

The initialization unit 22 initializes multiple parameters p1, p2a, and p2b based on rule X. The initialization unit 22 transmits the plurality of initialized parameters p1 to the latent expression calculation unit 23. The initialization unit 22 transmits the plurality of initialized parameters p2a and p2b to the intensity function calculation unit 24. The plurality of parameters p1, p2a, and p2b will be described later.

Rule X includes applying to the parameters random numbers generated according to a distribution whose average is less than or equal to 0. For example, examples of applying rule X to a neural network having multiple layers include initialization of Xavier and initialization of He. When initializing Xavier, when the number of nodes in the previous layer is n, parameters are initialized according to a normal distribution with an average of 0 and a standard deviation of 1/√n. In initializing He, when the number of nodes in the previous layer is n, parameters are initialized according to a normal distribution with an average of 0 and a standard deviation of √(2/n).

The latent expression calculation unit 23 calculates the latent expression z based on the support series Es. The latent expression z is data representing the characteristics of the event occurrence timing in the series Ev. The latent expression calculation unit 23 transmits the calculated latent expression z to the strength function calculation unit 24.

Specifically, the latent expression calculation unit 23 includes a neural network 23-1. The neural network 23-1 is a mathematical model modeled to input a sequence and output a latent representation. The neural network 23-1 is configured so that variable length data can be input. A plurality of parameters p1 are applied to the neural network 23-1 as weights and bias terms. The neural network 23-1 to which the plurality of parameters p1 are applied receives the support sequence Es as an input and outputs a latent expression z. The neural network 23-1 transmits the output latent expression z to the strength function calculation unit 24.

The strength function calculation unit 24 calculates the strength function λ(t) based on the latent expression z and time t. The intensity function λ(t) is a time function that indicates how likely (for example, the probability of occurrence) an event will occur in a future time period. The intensity function calculation unit 24 transmits the calculated intensity function λ(t) to the update unit 25.

Specifically, the intensity function calculation unit 24 includes a first monotonically increasing neural network 24-1a, a second monotonically increasing neural network 24-1b, a cumulative intensity function calculation unit 24-2, and an automatic differentiation unit 24-3. .

The first monotonically increasing neural network 24-1a is a mathematical model modeled to calculate as an output a scalar value according to a monotonically increasing function defined by a latent expression and time. The second monotonically increasing neural network 24-1b is a mathematical model modeled to calculate as an output a scalar value according to a monotonically increasing function defined by a latent representation, a period, and time.
A plurality of weights and bias terms based on a plurality of parameters p2a are applied to the first monotonically increasing neural network 24-1a. When a negative value is included in the weight of the plurality of parameters p2a, the negative value is converted into a non-negative value by an operation such as taking an absolute value.
When the weights among the plurality of parameters p2a are non-negative values, the plurality of parameters p2a may be directly applied as weights and bias terms to the first monotonically increasing neural network 24-1a. That is, each weight applied to the first monotonically increasing neural network 24-1a is a non-negative value.
The first monotonically increasing neural network 24-1a to which the plurality of parameters p2a is applied calculates an output f(z, t), etc. as a scalar value according to a monotonically increasing function defined by the latent expression z and time t. The first monotonically increasing neural network 24-1a transmits the output f(z, t) and the like to the cumulative intensity function calculation unit 24-2.

A plurality of weights and bias terms based on a plurality of parameters p2b are applied to the second monotonically increasing neural network 24-1b. When a negative value is included in the weight of the plurality of parameters p2b, the negative value is converted into a non-negative value by an operation such as taking an absolute value.
When the weights among the plurality of parameters p2b are non-negative values, the plurality of parameters p2b may be directly applied as weights and bias terms to the second monotonically increasing neural network 24-1b. That is, each weight applied to the second monotonically increasing neural network 24-1b is a non-negative value.
The second monotonically increasing neural network 24-1b to which the plurality of parameters p2b is applied outputs g(z, t′) and g(z, τ), etc.
t' is calculated according to the following equation (1). τ is a cycle of, for example, one day or one week, and is set in advance.

The second monotonically increasing neural network 24-1b transmits outputs g(z, t'), g(z, τ), etc. to the cumulative intensity function calculation unit 24-2.

The cumulative intensity function calculating unit 24-2 calculates the period τ, the outputs f(z, t), g(z, t'), and g( A cumulative intensity function Λ(t) is calculated based on z, τ), etc.
λ ₁ (u) in equation (3) etc. is a non-periodic part in the intensity function, and λ ₂ (u) in equation (4) etc. is a periodic part in the intensity function. f(z, t) and f(z, 0) on the right side of equation (3) regarding λ ₁ (u) are calculated by the first monotonically increasing neural network 24-1a. g(z, t'), g(z, τ), and g(z, 0) on the right side of equation (4) regarding λ 2 ₍ u) are calculated by the second monotonically increasing neural network 24-1b. .
Equation (4) makes it possible to define an arbitrarily shaped intensity function with a period τ. Therefore, while human knowledge that there is a period is incorporated into the model, there is no need for assumptions that constrain the shape of the model, that is, that limit the expressive power required of each monotonically increasing neural network.

As shown in equations (2) to (4), the cumulative intensity function Λ(t) is expressed as The output g(z, t') from the second monotonically increasing neural network 24-1b is taken into consideration. The cumulative intensity function calculation section 24-2 transmits the calculated cumulative intensity function Λ(t) to the automatic differentiation section 24-3.

The automatic differentiation section 24-3 calculates the intensity function λ(t) by automatically differentiating the cumulative intensity function Λ(t). The automatic differentiator 24-3 transmits the calculated intensity function λ(t) to the updater 25.

The updating unit 25 updates the plurality of parameters p1, p2a, and p2b based on the intensity function λ(t) and the query sequence Eq. The updated plurality of parameters p1, p2a, and p2b are applied one-to-one to the neural network 23-1, the first monotonically increasing neural network 24-1a, and the second monotonically increasing neural network 24-1b, respectively. . Furthermore, the updating unit 25 transmits the updated parameters p1, p2a, and p2b to the determining unit 26.

Specifically, the update unit 25 includes an evaluation function calculation unit 25-1 and an optimization unit 25-2.

The evaluation function calculation unit 25-1 calculates the evaluation function L(Eq) based on the intensity function λ(t) and the query sequence Eq. The evaluation function L(Eq) is, for example, a negative log likelihood. The evaluation function calculation unit 25-1 transmits the calculated evaluation function L(Eq) to the optimization unit 25-2.

The optimization unit 25-2 optimizes the plurality of parameters p1, p2a, and p2b based on the evaluation function L(Eq). For example, an error backpropagation method is used for the optimization. The optimization unit 25-2 applies the optimized parameters p1, p2a, and p2b to the neural network 23-1, the first monotonically increasing neural network 24-1a, and the second monotonically increasing neural network 24-1b. A plurality of parameters p1, p2a, and p2b that are applied on a one-to-one basis are updated. Furthermore, the optimization unit 25-2 may optimize the above parameters based on a negative log likelihood in which events in the support sequence Es are taken into consideration.

The determination unit 26 determines whether or not a condition is satisfied based on the updated parameters p1, p2a, and p2b. The condition may be, for example, that the number of times the parameters p1, p2a, and p2b are transmitted to the determination unit 26 (i.e., the number of parameter update loops) is equal to or greater than a threshold. The condition may be, for example, that the amount of change in the values of the parameters p1, p2a, and p2b before and after the update is equal to or less than a threshold. If the condition is not satisfied, the determination unit 26 causes the data extraction unit 21, the latent expression calculation unit 23, the intensity function calculation unit 24, and the update unit 25 to repeatedly execute a parameter update loop. If the condition is satisfied, the determination unit 26 ends the parameter update loop and stores the last updated parameters p1, p2a, and p2b in the memory 11 as the learned parameters 27. In the following description, the parameters in the learned parameters 27 are described as p1 ^* , p2a ^* , and p2b ^* to distinguish them from the parameters before learning.

With the above configuration, the event prediction device 1 has a function of generating learned parameters 27 based on the learning data set 20.

1.1.3 Prediction Function Configuration FIG. 4 is a block diagram showing an example of a prediction function configuration of the event prediction device according to the first embodiment.

The CPU of the control circuit 10 loads the prediction program stored in the memory 11 or the storage medium 15 into the RAM. The CPU of the control circuit 10 controls the memory 11, the communication module 12, the user interface 13, the drive 14, and the storage medium 15 by interpreting and executing the prediction program developed in the RAM. Thereby, as shown in FIG. 4, the event prediction device 1 further functions as a computer including a latent expression calculation section 23, an intensity function calculation section 24, and a prediction sequence generation section 29. Furthermore, the memory 11 of the event prediction device 1 further stores prediction data 28 as information used for prediction operations. In FIG. 4, the neural network 23-1, the first monotonically increasing neural network 24-1a, and the second monotonically increasing neural network 24-1b each have a plurality of parameters p1 ^* , p2a ^* , and A case is shown where p2b ^* is applied on a one-to-one basis.

If the learning data set 20 is a collection of event series of multiple users on a certain EC site, the prediction data 28 corresponds to, for example, the new user's event series for the next one week. When the learning data set 20 is a collection of event sequences of a certain user on a plurality of EC sites, the prediction data 28 corresponds to, for example, the user's event sequence for the next one week on another EC site.

FIG. 5 is a diagram illustrating an example of the configuration of prediction data of the event prediction device according to the first embodiment. As shown in FIG. 5, the prediction data 28 has a prediction sequence Es ^* . The prediction sequence Es ^* is information including the occurrence time of an event that occurred before the period to be predicted. Specifically, the prediction sequence Es ^* includes the occurrence times _ti (1≦i≦I ^* ) of I ^* events occurring during the period Ts ^* =[0, ts ^* ] (I ^* is an integer greater than or equal to 1).

In other words, the period Tq ^* ⁼ (ts ^* , tq ^* ) following the period Ts * is the period in which event occurrence is predicted in the prediction operation. Below, information including the occurrence time of the event predicted in the period Tq ^* will be used. Let be the predicted series Eq ^* .

Referring again to FIG. 4, the configuration of the prediction function of the event prediction device 1 will be described.

The latent expression calculation unit 23 inputs the prediction sequence Es ^* in the prediction data 28 to the neural network 23-1. The neural network 23-1 to which a plurality of parameters p1 ^* are applied receives the prediction sequence Es ^* as an input and outputs a latent expression z ^* . The neural network 23-1 transmits the output latent expression z ^* to the first monotonically increasing neural network 24-1a and the second monotonically increasing neural network 24-1b within the intensity function calculation unit 24.

The first monotonically increasing neural network 24-1a to which a plurality of parameters p2a ^* are applied outputs ^f* ⁽ z ^* ,t) and f ^* (z ^* ,0) is calculated. The first monotonically increasing neural network 24-1a transmits the outputs f ^* (z ^* , t) and f ^* (z ^* , 0) to the cumulative intensity function calculation unit 24-2.

The second monotonically increasing neural network 24-1b to which a plurality of parameters p2b ^* are applied outputs ^g* ⁽ z ^* , t'), g ^* (z ^* , τ), and g ^* (z ^* , 0). The second monotonically increasing neural network 24-1b transmits the calculated value to the cumulative intensity function calculation unit 24-2.

The cumulative intensity function calculation unit 24-2 calculates the above equations (2) to (4) (where z, f, g, λ, and Λ are replaced by z ^* , f ^* , g ^* , λ ^* , and Λ ^*) . The cumulative intensity function Λ * (t) is calculated based on the period τ and the outputs f ^* (z ^* , t), g ^* ⁽ z ^* , t'), g ^* (z ^* , τ), etc. according to calculate. The cumulative intensity function calculation unit 24-2 transmits the calculated cumulative intensity function Λ ^* (t) to the automatic differentiation unit 24-3.

The automatic differentiator 24-3 calculates the intensity function λ ^* (t) by automatically differentiating the cumulative intensity function Λ ^* (t). The automatic differentiator 24-3 transmits the calculated intensity function λ ^* (t) to the predicted sequence generator 29.

The predicted sequence generation unit 29 generates the predicted sequence Eq ^* based on the intensity function λ ^* (t). The predicted sequence generation unit 29 outputs the generated predicted sequence Eq ^* to the user. The predicted sequence generation unit 29 may output the intensity function λ ^* (t) to the user. Note that to generate the predicted sequence Eq ^* , a simulation using the Lewis method or the like is performed, for example. Information on the Lewis method can be found, for example, in the document “Yoshihiko Ogata, “On Lewis' Simulation Method for Point Processes,” IEEE Transactions on Information Theory, Vol.27, Issue.1, January 1981, <https://ieeexplore.ieee.org /abstract/document/1056305＞”
With the above configuration, the event prediction device 1 has a function of predicting the prediction sequence Eq ^* following the prediction sequence Es ^* based on the learned parameters 27 .

1.2. Operation Next, the operation of the event prediction device according to the first embodiment will be described.

1.2.1 Learning Operation FIG. 6 is a flowchart showing an example of the learning operation in the event prediction device according to the first embodiment. In the example of FIG. 6, it is assumed that the learning data set 20 is stored in the memory 11 in advance.

As shown in FIG. 6, in response to a user's instruction to start a learning operation (start), the initialization unit 22 initializes a plurality of parameters p1, p2a, and p2b based on rule X (S10). . For example, the initialization unit 22 initializes the plurality of parameters p1, p2a, and p2b based on Xavier initialization or He initialization. The plurality of parameters p1, p2a, and p2b initialized by the process of S10 are applied to the neural network 23-1, the first monotonically increasing neural network 24-1a, and the second monotonically increasing neural network 24-1b, respectively. .

The data extraction unit 21 extracts the series Ev from the learning data set 20. Subsequently, the data extraction unit 21 further extracts the support sequence Es and the query sequence Eq from the extracted sequence Ev (S11).

The neural network 23-1 to which the plurality of parameters p1 initialized in the process of S10 is applied calculates a latent expression z by inputting the support series Es extracted in the process of S11 (S12).

The first monotonically increasing neural network 24-1a to which the plurality of parameters p2a initialized in the process of S10 is applied follows the monotonically increasing function defined by the latent expression z calculated in the process of S12 and the time t. Outputs f(z, t) and f(z, 0) are calculated (S13).

The second monotonically increasing neural network 24-1b to which the plurality of parameters p2b initialized in the process of S10 is applied is defined by the latent expression z, time t, time t', and period τ calculated in the process of S12. Outputs g(z, t'), g(z, τ), and g(z, 0) are calculated according to the monotonically increasing function (S14).

The cumulative intensity function calculation unit 24-2 calculates the cumulative intensity function Λ( t) is calculated (S15).

The automatic differentiation unit 24-3 calculates the intensity function λ(t) based on the cumulative intensity function Λ(t) calculated in the process of S15 (S16).

The updating unit 25 updates the plurality of parameters p1, p2a, and p2b based on the intensity function λ(t) calculated in S16 and the query sequence Eq extracted in the process of S11 (S17). Specifically, the evaluation function calculation unit 25-1 calculates the evaluation function L(Eq) based on the intensity function λ(t) and the query sequence Eq. The optimization unit 25-2 uses the error backpropagation method to calculate a plurality of optimized parameters p1, p2a, and p2b based on the evaluation function L(Eq). The optimization unit 25-2 applies the optimized parameters p1, p2a, and p2b to a neural network 23-1, a first monotonically increasing neural network 24-1a, and a second monotonically increasing neural network 24-1b, respectively. applied on a one-to-one basis.

The determination unit 26 determines whether the condition is satisfied based on the plurality of parameters p1, p2a, and p2b (S18).

If the conditions are not met (S18; no), the data extraction unit 21 extracts a new support sequence Es and query sequence Eq from the learning dataset 20 (S11). Then, based on the extracted new support series Es and query series Eq, and the plurality of parameters p1, p2a, and p2b updated in the process of S17, the processes of S12 to S18 are executed. As a result, the process of updating the plurality of parameters p1, p2a, and p2b is repeated until it is determined in the process of S18 that the condition is satisfied.

If the condition is satisfied (S18; yes), the determination unit 26 sets the plurality of parameters p1, p2a, and p2b that were last updated in the process of S17 as learned parameters p1 ^* , p2a ^* , and p2b ^*. 27 (S19).

When the process of S19 ends, the learning operation in the event prediction device 1 ends (end).

1.2.2 Prediction Operation FIG. 7 is a flowchart showing an example of prediction operation in the event prediction device according to the first embodiment. In the example of FIG. 7, a plurality of parameters p1 ^* , p2a ^* , and p2b ^* in the learned parameters 27 are set to the neural network 23-1 and the first monotonically increasing neural network 24-1a, respectively, by the learning operation executed in advance. , and the second monotonically increasing neural network 24-1b on a one-to-one basis. Further, in the example of FIG. 7, it is assumed that the prediction data 28 is stored in the memory 11.

As shown in FIG. 7, in response to a user's instruction to start a prediction operation (start), the neural network 23-1 to which a plurality of parameters p1 ^* are applied inputs the prediction sequence Es ^* and generates a latent expression z. ^* is calculated (S20).

The first monotonically increasing neural network 24-1a to which a plurality of parameters ^p2a* ^are applied outputs an output f ^* (z ^* , t) and f ^* (z ^* , 0) are calculated (S21).

The second monotonically increasing neural network 24-1b to which a plurality of parameters p2b ^* are applied is a monotonically increasing function defined by the latent expression z ^* calculated in the process of S20, time t, time t', and period τ. Accordingly, outputs g ^* (z ^* , t'), g ^* (z ^* , τ), and g ^* (z ^* , 0) are calculated (S22).

The cumulative intensity function calculation unit 24-2 calculates the outputs f ^* (z ^* , t) and f ^* (z ^* , 0) calculated in the process of S21 and the output g ^* (z ^* ) calculated in the process of S22. , t'), g ^* (z ^* , τ), and g ^* (z ^* , 0), the cumulative intensity function Λ ^* (t) is calculated (S23).

The automatic differentiator 24-3 calculates the intensity function λ ^* (t) based on the cumulative intensity function Λ ^* (t) calculated in the process of S23 (S24).

The predicted sequence generation unit 29 generates the predicted sequence Eq ^* based on the intensity function λ ^* (t) calculated in S24 (S25). Then, the predicted sequence generation unit 29 outputs the predicted sequence Eq ^* generated in the process of S25 to the user.

When the process of S25 ends, the prediction operation in the event prediction device 1 ends (end).

1.3 Effects of First Embodiment According to the first embodiment, the first monotonically increasing neural network 24-1a outputs f according to the monotonically increasing function defined by the latent representation z of the support sequence Es and the time t. (z, t) and f(z, 0).
The second monotonically increasing neural network 24-1b outputs g(z, t'), g(z , τ), and g(z,0).
The cumulative intensity function calculation unit 24-2 calculates the Then, the cumulative intensity function Λ(t) is calculated.
This eliminates the need for the first monotonically increasing neural network 24-1a to represent periodic changes. Therefore, the requirement for expressiveness required for the output of the first monotonically increasing neural network 24-1a can be relaxed.

Furthermore, the automatic differentiator 24-3 calculates the intensity function λ(t) regarding the point process based on the cumulative intensity function Λ(t). Thereby, the first monotonically increasing neural network 24-1a and the second monotonically increasing neural network 24-1b can be used for modeling a point process. Therefore, long-term prediction of events can be performed using the first monotonically increasing neural network 24-1a and the second monotonically increasing neural network 24-1b.

2. Second Embodiment Next, an information processing apparatus according to a second embodiment will be described.

In the first embodiment described above, when modeling the intensity function λ(t), a case has been described in which a neural network that inputs the support sequence Es and outputs the latent expression z is used, but the present invention is not limited to this. For example, modeling of the intensity function λ(t) may be realized by combining with a meta-learning method such as MAML (Model-Agnostic Meta-Learning). Below, the configuration and operation that are different from the first embodiment will be mainly explained. Further, descriptions of configurations and operations equivalent to those of the first embodiment will be omitted as appropriate.

2.1 Learning Function Configuration FIG. 8 is a block diagram showing an example of the learning function configuration of the event prediction device according to the second embodiment.

As shown in FIG. 8, the event prediction device 1 includes a data extraction section 31, an initialization section 32, a first intensity function calculation section 33A, a second intensity function calculation section 33B, a first update section 34A, and a second update section. 34B, a first determination section 35A, and a second determination section 35B. The memory 11 of the event prediction device 1 also stores a learning data set 30 and learned parameters 36 as information used for learning operations.

The learning data set 30 and the data extraction unit 31 are equivalent to the learning data set 20 and the data extraction unit 21 in the first embodiment. That is, the data extraction unit 31 extracts the support sequence Es and the query sequence Eq from the learning data set 30.

The initialization unit 32 initializes multiple parameters p2a and p2b based on rule X. The initialization unit 22 transmits the plurality of initialized parameters p2a and p2b to the first intensity function calculation unit 33A. Note that hereinafter, the set of the plurality of parameters p2a and p2b is also referred to as a parameter set θ{p2a, p2b}. Further, the plurality of parameters p2a and p2b in the parameter set θ{p2a, p2b} are also referred to as the plurality of parameters θ{p2a} and θ{p2b}, respectively.

The first intensity function calculation unit 33A calculates the intensity function λ _a (t) based on time t. The first intensity function calculation unit 33A transmits the calculated intensity function λ _a (t) to the first update unit 34A.

Specifically, the first intensity function calculating section 33A includes a first monotonically increasing neural network 33A-1a, a second monotonically increasing neural network 33A-1b, a cumulative intensity function calculating section 33A-2, and an automatic differentiation section 33A-3. including.

The first monotonically increasing neural network 33A-1a is a mathematical model modeled to calculate as an output a scalar value according to a monotonically increasing function defined by time. A plurality of weights and bias terms based on a plurality of parameters θ{p2a} are applied to the first monotonically increasing neural network 33A-1a. Each weight applied to the first monotonically increasing neural network 33A-1a is a non-negative value. The first monotonically increasing neural network 33A-1a to which a plurality of parameters θ{p2a} are applied calculates outputs f _a (t) and f _a (0) according to a monotonically increasing function defined by time t. The first monotonically increasing neural network 33A-1a transmits the calculated outputs f _a (t) and f _a (0) to the cumulative intensity function calculation unit 33A-2.
Further, the second monotonically increasing neural network 33A-1b is a mathematical model modeled to calculate as an output a scalar value according to a monotonically increasing function defined by period and time. A plurality of weights and bias terms based on a plurality of parameters θ{p2b} are applied to the second monotonically increasing neural network 33A-1b. Each weight applied to the second monotonically increasing neural network 33A-1b is a non-negative value. The second monotonically increasing neural network 33A-1b to which a plurality of parameters θ{p2b} are applied outputs g _a (t'), g _a (τ) and g _a (0) are calculated. The second monotonically increasing neural network 33A-1b transmits the calculated outputs g _a (t'), g _a (τ), and g _a (0) to the cumulative intensity function calculation unit 33A-2.

The cumulative intensity function calculation unit 33A-2 calculates the period τ, the outputs f _a (t), f _a (0), g _a (t'), according to equations (5), (6), and (7) shown below. A cumulative intensity function Λ _a (t) is calculated based on g _a (τ) and g _a (0).

The cumulative intensity function calculation unit 33A-2 transmits the calculated cumulative intensity function Λ _a (t) to the automatic differentiation unit 33A-3.

The automatic differentiator 33A-3 calculates the intensity function λ _a (t) by automatically differentiating the cumulative intensity function Λ _a (t). The automatic differentiator 33A-3 transmits the calculated intensity function λ _a (t) to the first updater 34A.

The first updating unit 34A updates the parameter set θ{p2a, p2b} based on the intensity function λ _a (t) and the support sequence Es. The updated plurality of parameters θ{p2a} and θ{p2a} are respectively applied to the first monotonically increasing neural network 33A-1a and the second monotonically increasing neural network 33A-1b. Further, the first updating unit 34A transmits the updated parameter set θ{p2a, p2b} to the first determining unit 35A.

Specifically, the first update section 34A includes an evaluation function calculation section 34A-1 and an optimization section 34A-2.

The evaluation function calculation unit 34A-1 calculates the evaluation function L _a (Es) based on the intensity function λ _a (t) and the support series Es. The evaluation function L _a (Es) is, for example, a negative log likelihood. The evaluation function calculation unit 34A-1 transmits the calculated evaluation function L _a (Es) to the optimization unit 34A-2.

The optimization unit 34A-2 optimizes the parameter set θ{p2a, p2b} based on the evaluation function L _a (Es). For example, an error backpropagation method is used for the optimization. The optimization unit 34A-2 is the optimized parameter set θ{p2a, p2b}, and the parameter set θ{ applied to the first monotonically increasing neural network 33A-1a and the second monotonically increasing neural network 33A-1b. p2a, p2b}.

The first determination unit 35A determines whether the first condition is satisfied based on the updated parameter set θ{p2a, p2b}. The first condition is, for example, the number of times the parameter set θ{p2a, p2b} is transmitted to the first determination unit 35A (that is, the number of update loops of the parameter set in the first intensity function calculation unit 33A and the first update unit 34A). may be greater than or equal to a threshold value. The first condition may be, for example, that the amount of change in the value of the parameter set θ{p2a, p2b} before and after updating is equal to or less than a threshold value. Hereinafter, the parameter set update loop in the first intensity function calculation unit 33A and the first update unit 34A is also referred to as an inner loop.

If the first condition is not satisfied, the first determination unit 35A causes the update to be repeatedly executed using the inner loop. If the first condition is satisfied, the first determination unit 35A ends the update using the inner loop, and transmits the last updated parameter set θ{p2a, p2b} to the second intensity function calculation unit 33B. In the following description, the parameter set sent to the second intensity function calculation unit 33B in the learning function will be described as θ'{p2a, p2b} in order to distinguish it from the parameter set before learning.

The second intensity function calculation unit 33B calculates the intensity function λ _b (t) based on the time t, the time t′, and the period τ. The second intensity function calculation unit 33B transmits the calculated intensity function λ _b (t) to the second update unit 34B.

Specifically, the second intensity function calculating section 33B includes a first monotonically increasing neural network 33B-1a, a second monotonically increasing neural network 33B-1b, a cumulative intensity function calculating section 33B-2, and an automatic differentiation section 33B-3. including.

The first monotonically increasing neural network 33B-1a is a mathematical model modeled to calculate as an output a scalar value according to a monotonically increasing function defined by time. A plurality of parameters θ'{p2a} are applied to the first monotonically increasing neural network 33B-1a as weights and bias terms. The first monotonically increasing neural network 33B-1a to which a plurality of parameters θ'{p2a} are applied calculates outputs f _b (t) and f _b (0) according to a monotonically increasing function defined by time t. The first monotonically increasing neural network 33B-1a transmits the calculated outputs f _b (t) and f _b (0) to the cumulative intensity function calculation unit 33B-2.
The second monotonically increasing neural network 33B-1b is a mathematical model modeled to calculate as an output a scalar value according to a monotonically increasing function defined by time and period. A plurality of parameters θ'{p2b} are applied as weights and bias terms to the second monotonically increasing neural network 33B-1b. The second monotonically increasing neural network 33B-1b to which a plurality of parameters θ'{p2b} are applied outputs g _b (t'), g according to a monotonically increasing function defined by time t, time t', and period τ. Calculate _b (τ) and g _b (0). The second monotonically increasing neural network 33B-1b transmits the calculated outputs g _b (t'), g _b (τ), and g _b (0) to the cumulative intensity function calculation unit 33B-2.

The cumulative intensity function calculation unit 33B-2 calculates the period according to the above equations (5), (6), and (7) (where Λ _a , f _a , and g _a are replaced with Λ _b , f _b , and g _b ). A cumulative intensity function Λ _{b (t) is calculated based on τ and the outputs f b} (t), f _b (0), g _b (t'), g _b (τ), and _g _b (0). The cumulative intensity function calculation unit 33B-2 transmits the calculated cumulative intensity function Λ _b (t) to the automatic differentiation unit 33B-3.

The automatic differentiation section 33B-3 calculates the intensity function λ _b (t) by automatically differentiating the cumulative intensity function Λ _b (t). The automatic differentiator 33B-3 transmits the calculated intensity function λ _b (t) to the second updater 34B.

The second updating unit 34B updates the parameter set θ{p2a, p2b} based on the intensity function λ _b (t) and the query sequence Eq. The updated plurality of parameters θ{p2a} and θ{p2b} are respectively applied to the first monotonically increasing neural network 33A-1a and the second monotonically increasing neural network 33A-1b. Further, the second updating section 34B transmits the updated parameter set θ{p2a, p2b} to the second determining section 35B.

Specifically, the second update unit 34B includes an evaluation function calculation unit 34B-1 and an optimization unit 34B-2.

The evaluation function calculation unit 34B-1 calculates the evaluation function L _b (Eq) based on the intensity function λ _b (t) and the query sequence Eq. The evaluation function L _b (Eq) is, for example, a negative log likelihood. The evaluation function calculation unit 34B-1 transmits the calculated evaluation function L _b (Eq) to the optimization unit 34B-2.

The optimization unit 34B-2 optimizes the parameter set θ{p2a, p2b} based on the evaluation function L _b (Eq). For example, an error backpropagation method is used to optimize the parameter set θ{p2a, p2b}. More specifically, the optimization unit 34B-2 calculates the second derivative of the evaluation function L _b (Eq) with respect to the parameter set θ{p2a, p2b} using the parameter set θ′{p2a, p2b}, and Optimize the set θ{p2a, p2b}. The optimization unit 34B-2 then sets the parameter set θ{p2a, p2b} to be applied to the first monotonically increasing neural network 33A-1a and the second monotonically increasing neural network 33A-1b. Update θ{p2a, p2b}.

The second determination unit 35B determines whether the second condition is satisfied based on the updated parameter set θ{p2a, p2b}. The second condition is, for example, the number of times the parameter set θ{p2a, p2b} is transmitted to the second determination unit 35B (that is, the number of update loops of the parameter set in the second intensity function calculation unit 33B and the second update unit 34B). may be greater than or equal to a threshold value. The second condition may be, for example, that the amount of change in the value of the parameter set θ{p2a, p2b} before and after updating is equal to or less than a threshold value. Hereinafter, the parameter set update loop in the second intensity function calculation unit 33B and the second update unit 34B will also be referred to as an outer loop.

If the second condition is not satisfied, the second determination unit 35B causes the parameter set to be updated repeatedly using the outer loop. If the second condition is satisfied, the second determination unit 35B ends the updating of the parameter set by the outer loop, and stores the last updated parameter set θ{p2a, p2b} in the memory 11 as the learned parameters 36. Make me remember. In the following description, the parameter set in the learned parameters 36 will be described as θ{p2a ^* , p2b ^* } in order to distinguish it from the parameter set before learning by the outer loop.

With the above configuration, the event prediction device 1 has a function of generating learned parameters 36 based on the learning data set 30.

2.2 Prediction Functional Configuration FIG. 9 is a block diagram showing an example of the configuration of the prediction function of the event prediction device according to the second embodiment.

As shown in FIG. 9, the event prediction device 1 includes a first intensity function calculation section 33A, a first update section 34A, a first determination section 35A, a second intensity function calculation section 33B, and a prediction sequence generation section 38. It also functions as a computer. Furthermore, the memory 11 of the event prediction device 1 further stores prediction data 37 as information used for prediction operations. The configuration of the prediction data 37 is equivalent to the prediction data 28 in the first embodiment.

Note that FIG. 9 shows a case where the parameter set θ{p2a ^* , p2b ^* } is applied from the learned parameters 36 to the first monotonically increasing neural network 33A-1a and the second monotonically increasing neural network 33A-1b. It will be done.

The first monotonically increasing neural network 33A-1a to which a plurality of parameters θ{p2a ^* } is applied calculates outputs f _a ^* (t) and f _a ^* (0) according to a monotonically increasing function defined by time t. do. The first monotonically increasing neural network 33A-1a transmits the calculated outputs f _a ^* (t) and f _a ^* (0) to the cumulative intensity function calculation unit 33A-2.
Further, the second monotonically increasing neural network 33A-1b to which a plurality of parameters θ{p2b ^* } is applied outputs g _a ^* (t' ), g _a ^* (τ) and g _a ^* (0) are calculated. The second monotonically increasing neural network 33A-1b transmits the calculated outputs g _a ^* (t'), g _a ^* (τ), and g _a ^* (0) to the cumulative intensity function calculation unit 33 A-2.

The cumulative intensity function calculation unit 33A-2 calculates the above equations (5), (6), and (7) (where f _a , g _a , λ _a , and Λ _a are replaced by f _a ^* , g _a ^* , λ _a ^* , and Λ _a ^* ) and based on the outputs f _a ^* (t), f _a ^* (0), g _a ^* (t'), g _a ^* (τ) and g _a ^* (0) Then, the cumulative intensity function Λ _a ^* (t) is calculated. The cumulative intensity function calculation unit 33A-2 transmits the calculated cumulative intensity function Λ _a ^* (t) to the automatic differentiation unit 33A-3.

The automatic differentiation section 33A-3 calculates the intensity function λ _a ^* (t) by automatically differentiating the cumulative intensity function Λ _a ^* (t). The automatic differentiation section 33A-3 transmits the calculated intensity function λ _a ^* (t) to the first determination section 35A.

The evaluation function calculation unit 34A-1 calculates the evaluation function L _a (Es ^* ) based on the intensity function λ _a ^* (t) and the prediction sequence Es ^* . The evaluation function L _a (Es ^* ) is, for example, a negative log likelihood. The evaluation function calculation unit 34A-1 transmits the calculated evaluation function L _a (Es ^* ) to the optimization unit 34A-2.

The optimization unit 34A-2 optimizes the parameter set θ{p2a ^* , p2b ^* } based on the evaluation function L _a (Es ^* ). For example, an error backpropagation method is used for the optimization. The optimization unit 34A-2 uses the optimized parameter set θ{p2a ^* , p2b ^* } as a parameter set to be applied to the first monotonically increasing neural network 33A-1a and the second monotonically increasing neural network 33A-1b. Update θ{p2a ^* , p2b ^* }.

The first judgment unit 35A judges whether the third condition is satisfied based on the updated parameter set θ{p2a ^* , p2b ^* }. The third condition may be, for example, that the number of inner loops for updating the parameter set θ{p2a ^* , p2b ^* } is equal to or greater than a threshold. The third condition may be, for example, that the amount of change in the value of the parameter set θ{p2a ^* , p2b ^* } before and after the update is equal to or less than a threshold.

If the third condition is not satisfied, the first determination unit 35A causes the inner loop to repeatedly update the parameter set. If the third condition is satisfied, the first determination unit 35A ends the update of the parameter set by the inner loop, and sends the last updated parameter set θ{p2a ^* , p2b ^* } to the second intensity function calculation unit. 33B. In the following description, in order to distinguish it from the parameter set before inner loop learning, the parameter set sent to the second strength function calculation unit 33B in the prediction function will be described as θ'{p2a ^* , p2b ^* }.

The first monotonically increasing neural network 33B-1a to which the parameter θ'{p2a ^* } is applied calculates the outputs f _b ^* (t) and f _b ^* (0) according to the monotonically increasing function defined by the time t. . The first monotonically increasing neural network 33B-1a transmits the calculated outputs f _b ^* (t) and f _b ^* (0) to the cumulative intensity function calculation unit 33B-2.
The second monotonically increasing neural network 33B-1b to which the parameter θ'{p2b ^* } is applied outputs g _b ^* (t'), g according to a monotonically increasing function defined by time t, time t' and period τ. Calculate _b ^* (τ) and g _b ^* (0). The second monotonically increasing neural network 33B-1b outputs the calculated outputs f _b ^* (t), f _b ^* (0), g _b ^* (t'), g _b ^* (τ), and g _b ^* (0). is transmitted to the cumulative intensity function calculation unit 33B-2.

The cumulative intensity function calculation unit 33B-2 calculates the above equations (5), (6), and (7) (where f _a , g _a , λ _a , and Λ _a are replaced by f _b ^* , g _b ^* , λ _b ^* , and Λ _b ^* ), the period τ, the output f _b ^* (t), f _b ^* (0), g b ^* (t'), ^{g b *} ₍ _τ ), and g _b ^* (0) The cumulative intensity function Λ _b ^* (t) is calculated based on . The cumulative intensity function calculation unit 33B-2 transmits the calculated cumulative intensity function Λ _b ^* (t) to the automatic differentiation unit 33B-3.

The automatic differentiation section 33B-3 calculates the intensity function λ _b ^* (t) by automatically differentiating the cumulative intensity function Λ _b ^* (t). The automatic differentiation section 33B-3 transmits the calculated intensity function λ _b ^* (t) to the prediction sequence generation section 38.

The predicted sequence generation unit 38 generates the predicted sequence Eq ^* based on the intensity function λ _b ^* (t). The predicted sequence generation unit 38 outputs the generated predicted sequence Eq ^* to the user. Note that to generate the predicted sequence Eq ^* , a simulation using the Lewis method or the like is performed, for example.

With the above configuration, the event prediction device 1 has a function of predicting the prediction sequence Eq ^* following the prediction sequence Es ^* based on the learned parameters 36 .

2.3 Learning Operation FIG. 10 is a flowchart showing an example of an overview of the learning operation in the event prediction device according to the second embodiment. In the example of FIG. 10, it is assumed that the learning data set 30 is stored in the memory 11 in advance.

As shown in FIG. 10, in response to the user's instruction to start the learning operation (start), the initialization unit 32 initializes the parameter set θ{p2a, p2b} based on rule X (S50). The parameter set θ{p2a, p2b} initialized by the process of S50 is applied to the first intensity function calculation unit 33A.

The data extraction unit 31 extracts the series Ev from the learning data set 30. Subsequently, the data extraction unit 31 further extracts the support sequence Es and the query sequence Eq from the extracted sequence Ev (S51).

The first intensity function calculation unit 33A and the first update unit 34A to which the parameter set θ{p2a, p2b} initialized in the process of S50 is applied perform the first update process of the parameter set θ{p2a, p2b}. Execute (S52). Details of the first update process will be described later.

After the process of S52, the first determination unit 35A determines whether the first condition is satisfied based on the parameter set θ{p2a, p2b} updated in the process of S52 (S53).

If the first condition is not satisfied (S53; no), the first intensity function calculation unit 33A and the first update unit 34A to which the parameter set θ{p2a, p2b} updated in the process of S52 is applied, The first update process is executed again (S52). In this way, the first update process is repeated (inner loop) until it is determined in the process of S53 that the first condition is satisfied.

If the first condition is satisfied (S53; yes), the first determination unit 35A sets the parameter set θ{p2a, p2b} that was last updated in the process of S52 as the parameter set θ'{p2a, p2b}. It is applied to the second intensity function calculation unit 33B (S54).

The second intensity function calculation unit 33B and the second update unit 34B to which the parameter set θ'{p2a, p2b} is applied execute a second update process for the parameter set θ{p2a, p2b} (S55). Details of the second update process will be described later.

After the process of S55, the second determination unit 35B determines whether the second condition is satisfied based on the parameter set θ{p2a, p2b} updated in the process of S55 (S56).

If the second condition is not satisfied (S56; no), the data extraction unit 31 extracts a new support sequence Es and a query sequence Eq (S51). Then, the inner loop and the second update process are repeated (outer loop) until it is determined in the process of S56 that the second condition is satisfied.

If the second condition is satisfied (S56; yes), the second determination unit 35B converts the parameter set θ{p2a, p2b} that was last updated in the process of S55 into the parameter set θ{p2a ^* , p2b ^* } The learned parameter 36 is stored as the learned parameter 36 (S57).

When the process of S57 ends, the learning operation in the event prediction device 1 ends (ends).

FIG. 11 is a flowchart illustrating an example of the first update process in the event prediction device according to the second embodiment. The processing of S52-1a to S52-4 shown in FIG. 11 corresponds to the processing of S52 in FIG. 10.

After the process of S51 (start), the first monotonically increasing neural network 33A-1a to which the plurality of parameters θ{p2a} initialized in the process of S50 are applied follows the monotonically increasing function defined by the time t. Outputs f _a (t) and f _a (0) are calculated (S52-1a).
Further, the second monotonically increasing neural network 33A-1b to which the plurality of parameters θ{p2b} initialized in the process of S50 is applied follows a monotonically increasing function defined by time t, time t', and period τ. Outputs g _a (t'), g _a (τ), and g _a (0) are calculated (S52-1b).

The cumulative intensity function calculation unit 33A-2 outputs f _a (t), f _a (0) calculated in the process of S52-1a, and g _a (t'), g _a calculated in the process of S52-1b. The cumulative intensity function Λ _a (t) is calculated based on (τ) and g _a (0) (S52-2).

The automatic differentiator 33A-3 calculates the intensity function λ _a (t) based on the cumulative intensity function Λ _a (t) calculated in the process of S52-2 (S52-3).

The first updating unit 34A updates the parameter set θ{p2a, p2b} based on the intensity function λ _a (t) calculated in S52-3 and the support sequence Es extracted in the process of S51 (S52- 4). Specifically, the evaluation function calculation unit 34A-1 calculates the evaluation function L _a (Es) based on the intensity function λ _a (t) and the support series Es. The optimization unit 34A-2 uses the error backpropagation method to calculate an optimized parameter set θ{p2a, p2b} based on the evaluation function L _a (Es). The optimization unit 34A-2 applies the optimized parameter set θ{p2a, p2b} to the first monotonically increasing neural network 33A-1a and the second monotonically increasing neural network 33A-1b.

When the process of S52-4 ends, the first update process ends (end).

FIG. 12 is a flowchart illustrating an example of the second update process in the event prediction device according to the second embodiment. The processing of S55-1a to S55-4 shown in FIG. 12 corresponds to the processing of S55 in FIG.

After the process of S54 (start), the first monotonically increasing neural network 33B-1a to which the plurality of parameters θ'{p2a} is applied outputs f _b (t) and f _b (0) is calculated (S55-1a).
Further, the second monotonically increasing neural network 33B-1b to which a plurality of parameters θ'{p2b} is applied outputs g _b (t') according to a monotonically increasing function defined by time t, time t', and period τ. , g _b (τ) and g _b (0) are calculated (S55-1b).

The cumulative intensity function calculation unit 33B-2 outputs f _b (t) and f _b (0) calculated in the process of S55-1a, and g _b (t') and g _b calculated in the process of S55-1b. (τ) and g _b (0), the cumulative intensity function Λ _b (t) is calculated (S55-2).

The automatic differentiator 33B-3 calculates the intensity function λ _b (t) based on the cumulative intensity function Λ _b (t) calculated in the process of S55-2 (S55-3).

The second updating unit 34B updates the parameter set θ{p2a, p2b} based on the intensity function λ _b (t) calculated in S55-3 and the query sequence Eq extracted in the process of S51 (S55- 4). Specifically, the evaluation function calculation unit 34B-1 calculates the evaluation function L _b (Eq) based on the intensity function λ _b (t) and the query sequence Eq. The optimization unit 34B-2 uses the error backpropagation method to calculate an optimized parameter set θ{p2a, p2b} based on the evaluation function L _b (Eq). The optimization unit 34B-2 applies the optimized parameter set θ{p2a, p2b} to the first monotonically increasing neural network 33A-1a and the second monotonically increasing neural network 33A-1b.

When the process of S55-4 ends, the second update process ends (end).

2.4 Prediction Operation FIG. 13 is a flowchart showing an example of prediction operation in the event prediction device according to the second embodiment. In the example of FIG. 13, it is assumed that the parameter set θ{p2a ^* , p2b ^* } in the learned parameters 36 has been applied to the first intensity function calculation unit 33A by a learning operation performed in advance. Further, in the example of FIG. 13, it is assumed that the prediction data 37 is stored in the memory 11.

As shown in FIG. 13, in response to a user's instruction to start a predicted operation (start), the first monotonically increasing neural network 33A-1a to which a plurality of parameters θ{p2a ^* } is applied is defined by time t. The outputs f _a ^* (t) and f _a ^* (0) are calculated according to a monotonically increasing function (S60a).
Further, the second monotonically increasing neural network 33A-1b to which a plurality of parameters θ{p2b ^* } is applied outputs g _a ^* (t' ), g _a ^* (τ) and g _a ^* (0) are calculated (S60b).

The cumulative intensity function calculation unit 33A-2 calculates the outputs f _a ^* (t) and f _a ^* (0) calculated in the process of S60a, and the outputs g _a ^* (t'), g _a calculated in the process of S60b. A cumulative intensity function Λ _a ^* (t) is calculated based on ^* (τ) and g _a ^* (0) (S61).

The automatic differentiation unit 33A-3 calculates the intensity function λ _a ^* (t) based on the cumulative intensity function Λ _a ^* (t) calculated in the process of S61 (S62).

The first updating unit 34A updates the parameter set θ{p2a ^* , p2b ^* } based on the intensity function λ _a ^* (t) and the prediction sequence Es ^* calculated in S62 (S63). Specifically, the evaluation function calculation unit 34A-1 calculates the evaluation function _L a (Es ^* ) based on the intensity function λ _a ^* (t) and the prediction sequence Es ^* . The optimization unit 34A-2 uses the error backpropagation method to calculate an optimized parameter set θ{p2a ^* , p2b ^* } based on the evaluation function L _a (Es ^* ). The optimization unit 34A-2 applies the optimized parameter set θ{p2a ^* , p2b ^* } to the first monotonically increasing neural network 33A-1a and the second monotonically increasing neural network 33A-1b.

The first determination unit 35A determines whether the third condition is satisfied based on the parameter set θ{p2a ^* , p2b ^* } updated in the process of S63 (S64).

If the third condition is not satisfied (S64; no), the first intensity function calculation unit 33A and the first update unit 34A to which the parameter set θ{p2a ^* , p2b ^* } updated in the process of S63 are applied. further executes the processes of S60a to S64. In this way, the process of updating the parameter set θ{p2a ^* , p2b ^* } is repeated (inner loop) until it is determined in the process of S64 that the third condition is satisfied.

If the third condition is satisfied (S64; yes), the first determination unit 35A converts the parameter set θ{p2a ^* , p2b ^* } that was last updated in the process of S63 into θ'{p2a ^* , p2b ^* } is applied to the second intensity function calculation unit 33B (S65).

The first monotonically increasing neural network 33B-1a to which a plurality of parameters θ'{p2a ^* } are applied outputs f _b ^* (t) and f _b ^* (0) according to a monotonically increasing function defined by time t. Calculate (S66a).
Further, the second monotonically increasing neural network 33B-1b to which a plurality of parameters θ'{p2b ^* } is applied outputs g _b ^* (t '), g _b ^* (τ) and g _b ^* (0) are calculated (S66b).

The cumulative intensity function calculation unit 33B-2 calculates the outputs f _b ^* (t) and f _b ^* (0) calculated in the process of S66a, and the outputs g _b ^* (t'), g _b calculated in the process of S66b. A cumulative intensity function Λ _b ^* (t) is calculated based on ^* (τ) and g _b ^* (0) (S67).

The automatic differentiator 33B-3 calculates the intensity function λ _b ^* (t) based on the cumulative intensity function Λ _b ^* (t) calculated in the process of S67 (S68).

The predicted sequence generation unit 38 generates the predicted sequence Eq ^* based on the intensity function λ _b ^* (t) calculated in S68 (S69). Then, the predicted sequence generation unit 38 outputs the predicted sequence Eq ^* generated in the process of S69 to the user.

When the process of S69 ends, the prediction operation in the event prediction device 1 ends (end).

2.5 Effects of Second Embodiment According to the second embodiment, the first intensity function calculation unit 33A to which the parameter set θ{p2a, p2b} is applied inputs the time t, the time t', and the period τ. The intensity function λ _a (t) is calculated as follows. The first updating unit 34A updates the parameter set θ{p2a, p2b} to the parameter set θ'{p2a, p2b} based on the intensity function λ _a (t) and the support sequence Es. The second intensity function calculation unit 33B to which the parameter set θ'{p2a, p2b} is applied calculates the intensity function λ _b (t) by inputting the time t, the time t', and the period τ. The second updating unit 34B updates the parameter set θ{p2a, p2b} based on λ _b (t) and the query sequence Eq. This allows point processes to be modeled even when a meta-learning method such as MAML is used.

In this case, the cumulative intensity function calculation unit 33A-2 calculates the cumulative intensity function based on the outputs f _a (t), f _a (0), g _a (t'), g _a (τ), and g _a (0). Calculate Λ _a (t). The cumulative intensity function calculation unit 33B-2 calculates the cumulative intensity function Λ b ₍ t) based on the outputs f _b (t) f _b (0), g _b (t'), g _b (τ), and g _b ) is calculated. Thereby, the requirement for expressiveness required for the outputs of the first monotonically increasing neural networks 33A-1a and 33B-1 can be relaxed. Therefore, effects similar to those of the first embodiment can be achieved.

3. Third Embodiment Next, an information processing apparatus according to a third embodiment will be described.

In the third embodiment, in the first embodiment, a plurality of types of periods τ, for example, τ _i (i=1 to n), that is, τ ₁ , τ ₂ , . . . τ _n, are prepared.

The configuration of the event prediction device according to the third embodiment is the same as that of the first embodiment.
On the other hand, in the third embodiment, the cumulative intensity function calculation unit 24-2 calculates the period τ _i and the output f _i (z, t), g _i (z, t _{' i} ), g _i (z, τ _i ), etc., the cumulative intensity function Λ(t) is calculated.
f(z, t) and f(z, 0) on the right side of equation (3) regarding λ ₁ (u) are calculated by the first monotonically increasing neural network 24-1a. g _i (z, t' _i ), g _i (z, τ _i ), and g _i (z, 0) on the right side of equation (8) regarding λ ₂ (u) are the i-th second monotonically increasing Calculated by the neural network 24-1b.

As shown in equations (2), (3), and (8), the cumulative intensity function Λ(t) is the output f(z, t) and f(z , 0), the output g _i (z, t' _i ) from the i-th second monotonically increasing neural network 24-1b, etc. are taken into consideration. The cumulative intensity function calculation unit 24-2 transmits the calculated cumulative intensity function Λ(t) to the automatic differentiation unit 24-3.

4. Fourth Embodiment Next, an information processing apparatus according to a fourth embodiment will be described.

In the fourth embodiment, a plurality of types of periods τ, for example, τ ₁ , τ ₂ , . . . τ _n , are prepared in the second embodiment.

The configuration of the event prediction device according to the fourth embodiment is similar to that of the second embodiment.
On the other hand, in the fourth embodiment, the cumulative intensity function calculation unit 33A-2 of the first intensity function calculation unit 33A calculates the period τ according to the above equations (5) and (6) and the following equation (9). A cumulative intensity function Λ _a (t) is calculated based on _i , outputs f _a (t), g _a (t'), g _a (τ _i ), and the like.
f _a (t) and f _a (0) on the right side of equation (6) regarding λ _a1 (u) are calculated by the first monotonically increasing neural network 33A-1a. The right side of equation (9) regarding λ _a2 (u)

is calculated by the second monotonically increasing neural network 33A-1b.

In the fourth embodiment, as shown in equations (5), (6), and (9), the cumulative intensity function Λ _a (t) is the output f _a (t) from the first monotonically increasing neural network 33A-1a. t) and f _a (0), the output from the second monotonically increasing neural network 33A-1b

will be taken into consideration. The cumulative intensity function calculation unit 33A-2 transmits the calculated cumulative intensity function Λ _a (t) to the automatic differentiation unit 33A-3.

Further, the cumulative intensity function calculation unit 33B-2 of the second intensity function calculation unit 33B calculates the above equations (5), (6), and equation (9) (where λ _a , f _a , and g _a _b , f _b , and g _b ), the cumulative intensity function Λ b (t) is calculated based on the period τ _i , the outputs f _b (t), g _b (t'), _g _b (τ _i ), etc. Calculate.
f _b (t) and f _b (0) are calculated by the first monotonically increasing neural network 33B-1a.

is calculated by the second monotonically increasing neural network 33B-1b.

In the fourth embodiment, the cumulative intensity function Λ _b (t) is applied to the outputs f _b (t) and f b (0) from the first monotonically increasing neural network 33B-1a and the outputs f b (t) and f _b (0) from the first monotonically increasing neural network 33B-1b. output from

will be taken into consideration. The cumulative intensity function calculation unit 33B-2 transmits the calculated cumulative intensity function Λ _b (t) to the automatic differentiation unit 33B-3.

5. Fifth Embodiment Next, an information processing apparatus according to a fifth embodiment will be described.

In the fifth embodiment, in the first embodiment, the period τ is a learnable parameter.
During learning, if a learnable parameter is included in the floor function, the gradient becomes 0.
In the fifth embodiment, for example, the document "Edward Wilson, et al., "Backpropagation Learning for Systems with Discrete-Valued Functions"," Proceedings of the World Congress on Neural Networks, San Diego, California, June 1994.
Learning is performed using a known method disclosed in http://www.intellization.com/files/NN_noisy_backprop_paper_WCNN_94.pdf>.
A plurality of types of period τ may be prepared as in the third embodiment, and the plurality of types of period τ may include both a learned period and an arbitrarily given period τ.

6. Sixth Embodiment Next, an information processing apparatus according to a sixth embodiment will be described.

In the sixth embodiment, in the second embodiment, the period τ is a learnable parameter.
During learning, if a learnable parameter is included in the floor function, the gradient becomes 0.
Similar to the fifth embodiment, the sixth embodiment includes “Edward Wilson, “Backpropagation Learning for Systems with Discrete-Valued Functions”,” Proceedings of the World Congress on Neural Networks, San Diego, California, June 1994.” Learning is performed using a known method disclosed in .
A plurality of types of period τ may be prepared as in the fourth embodiment, and the plurality of types of τ may include both a learned period and an arbitrarily given period τ.

7. Other Modifications, etc. Various modifications may be applied to the embodiments described above. Regarding the following modified example, differences from the first embodiment will be explained.

7.1 First Modification In the first embodiment described above, each event is described as having no mark or additional information attached thereto, but the present invention is not limited to this. For example, each event may be marked or provided with additional information. The marks or additional information attached to each event include, for example, what the user has purchased and the payment method. Hereinafter, for the sake of simplicity, the mark or additional information will be simply referred to as a "mark."

FIG. 14 is a block diagram illustrating an example of a configuration of a latent expression calculation unit of an event prediction device according to a first modification. The latent expression calculation unit 23 further includes a neural network 23-2. Further, in the example of FIG. 14, the support series Es is a series of a pair of event occurrence time t _i and mark mi (Es={(t _i , mi)}).

The neural network 23-2 is a mathematical model modeled to input the mark m _i and output a parameter NN2(m _i ) that takes the mark m _i into consideration. Then, the neural network 23-2 generates the sequence Es′={[t _i NN2(m _i )]} by combining the output NN2(m _i ) with the event occurrence time t _i in the support sequence Es. do. Neural network 23-2 transmits the generated sequence Es' to neural network 23-1.

The neural network 23-1 receives the sequence Es' as input and outputs the latent expression z. The neural network 23-1 transmits the output latent expression z to the strength function calculation unit 24.

Although not shown in FIG. 14, a plurality of parameters are applied to the neural network 23-2. The plurality of parameters applied to the neural network 23-2 are initialized by the initialization unit 22 and updated by the update unit 25, similarly to the plurality of parameters p1, p2a, and p2b.

With the above configuration, the latent expression calculation unit 23 can calculate the latent expression z while taking the mark m _i into consideration. This makes it possible to improve event prediction accuracy.

7.2 Second Modification In the first embodiment described above, the case where additional information is not attached to the series has been described, but the present invention is not limited to this. For example, additional information may be attached to the series. The additional information attached to the series is, for example, user attribute information such as the user's gender and age.

FIG. 15 is a block diagram illustrating an example of the configuration of the intensity function calculation unit of the event prediction device according to the second modification. The intensity function calculation unit 24 further includes neural networks 24-5 and 24-6.

The neural network 24-5 is a mathematical model modeled to input the additional information a and output a parameter NN3(a) that takes the additional information a into consideration. Neural network 24-5 transmits the output parameter NN3(a) to neural network 24-6.

The neural network 24-6 inputs the latent expression z and the parameter NN3(a) and outputs a latent expression z'=NN4([z, NN3(a)]) that takes into account the additional information a. The neural network 24-6 transmits the output latent representation z' to the first monotonically increasing neural network 24-1a and the second monotonically increasing neural network 24-1b.

The first monotonically increasing neural network 24-1a calculates outputs f(z', t) and f(z', 0) according to a monotonically increasing function defined by the latent expression z' and time t. The first monotonically increasing neural network 24-1a transmits the calculated outputs f(z', t) and f(z', 0) to the cumulative intensity function calculation unit 24-2.
The second monotonically increasing neural network 24-1b outputs g(z', t'), g(z', τ) and g(z', 0). The second monotonically increasing neural network 24-1b sends the calculated outputs g(z', t'), g(z', τ), and g(z', 0) to the cumulative intensity function calculation unit 24-2. Send.

The configurations of the cumulative intensity function calculation unit 24-2 and the automatic differentiation unit 24-3 are the same as those in the first embodiment, so their description will be omitted. Note that the above equations (2) to (4) (where z is read as z') can be used to calculate the cumulative intensity function by the cumulative intensity function calculation unit 24-2.

Although not shown in FIG. 15, a plurality of parameters are applied to each of the neural networks 24-5 and 24-6. The plurality of parameters applied to the neural networks 24-5 and 24-6 are initialized by the initialization unit 22 and updated by the updater 25, similarly to the plurality of parameters p1, p2a, and p2b.

By configuring as described above, the intensity function calculation unit 24 can calculate the output f(z', t) while taking the additional information a into consideration. This makes it possible to improve event prediction accuracy.

7.3 Third Modification In the second embodiment described above, a case has been described in which additional information is not attached to the series Es, but the present invention is not limited to this. For example, additional information may be attached to the series.

FIG. 16 is a block diagram illustrating an example of the configuration of the first intensity function calculation unit of the event prediction device according to the third modification. FIG. 17 is a block diagram illustrating an example of the configuration of the second intensity function calculation unit of the event prediction device according to the third modification. The first intensity function calculation section 33A and the second intensity function calculation section 33B further include neural networks 33A-4 and 33B-4, respectively.

The neural networks 33A-4 and 33B-4 are mathematical models modeled to input additional information a and output a parameter NN5(a) that takes additional information a into consideration. Neural networks 33A-4 and 33B-4 transmit the output parameter NN5(a) to first monotonically increasing neural networks 33A-1a and 33B-1a and second monotonically increasing neural networks 33A-1b and 33B-1b, respectively. do.

The first monotonically increasing neural network 33A-1a calculates outputs f _a (t) and f _a (0) according to a monotonically increasing function defined by parameter NN5(a) and time t. The first monotonically increasing neural network 33B-1a calculates outputs f _b (t) and f _b (0) according to a monotonically increasing function defined by parameter NN5(a) and time t. Here, both the outputs f _a (t) and f _b (t) are expressed as MNN1 ([t, NN5(a)]). The first monotonically increasing neural network 33A-1a transmits the calculated outputs f _a (t) and f _a (0) to the cumulative intensity function calculation unit 33A-2. The first monotonically increasing neural network 33B-1a transmits the calculated outputs f _b (t) and f _b (0) to the cumulative intensity function calculation unit 33B-2.

The second monotonically increasing neural network 33A-1b outputs g _a (t'), g _a (τ), and Calculate g _a (0). The second monotonically increasing neural network 33B-1b outputs g _b (t'), g _b (τ), and Calculate g _b (0). Here, both the outputs g _a (t') and g _b (t') are expressed as MNN2 ([t', NN5(a)]). The second monotonically increasing neural network 33A-1b transmits the calculated outputs g _a (t'), g _a (0), and g _a (0) to the cumulative intensity function calculation unit 33A-2. The second monotonically increasing neural network 33B-1b transmits the calculated outputs g _b (t'), g _b (τ), and g _b (0) to the cumulative intensity function calculation unit 33B-2.

The configurations of the cumulative intensity function calculation units 33A-2 and 33B-2 and the automatic differentiation units 33A-3 and 33B-3 are the same as those in the second modification, and therefore their description will be omitted.

Although descriptions are omitted in FIGS. 16 and 17, a plurality of parameters are applied to each of the neural networks 33A-4 and 33B-4. The plurality of parameters applied to the neural network 33A-4 are initialized by the initialization unit 32 and updated by the first update unit 34A, similarly to the parameter set θ{p2a, p2b}. The plurality of parameters applied to the neural network 33B-4 are used for updating by the second updating unit 34B, similar to the parameter set θ'{p2a, p2b}.

With the above configuration, the first intensity function calculation unit 33A can calculate the outputs f _a (t) and g _a (t') while taking the additional information a into consideration, and the second intensity function calculation The unit 33B can calculate the outputs f _b (t) and g _b (t') while considering the additional information a. This makes it possible to improve event prediction accuracy.

7.4 Others In the first to sixth embodiments and the first to third modified examples described above, the dimension of the event is described as one dimension of time, but it is not limited to this. For example, the dimensionality of an event may be extended to any number of dimensions greater than or equal to two (eg, three dimensions in space and time).

In the first to sixth embodiments and the first to third variations described above, the learning operation and the prediction operation are executed by a program stored in the event prediction device 1. , but not limited to this. For example, learning operations and prediction operations may be performed on computational resources on the cloud.

The information processing apparatus according to each of the embodiments described above is not limited to a configuration that meta-learns a point process, but can also be applied to a configuration that learns a point process without using meta-learning. Further, the information processing apparatus according to each embodiment can be applied to, for example, a configuration for solving a regression problem in which monotonically increasing property is desired to be guaranteed. An example of a regression problem for which monotonically increasing property is desired is the problem of estimating credit risk from loan usage amount. Furthermore, the information processing apparatus according to each embodiment can also be applied to a configuration that solves a problem using a neural network that guarantees reversible transformation. Examples of problems that use neural networks that guarantee reversible transformations include density estimation of empirical distributions, VAE (Variational Auto-Encoders), speech synthesis, likelihood-free inference, and probabilistic programming. ), image generation, etc. Furthermore, the information processing apparatus according to each embodiment can also be applied to a configuration that solves a problem in which a survival analysis hazard function is used.

In addition, the method described in each embodiment can be applied to a magnetic disk (floppy (registered trademark) disk, hard disk) as a program (software means) that can be executed by a computer (computer). etc.), optical discs (CD-ROM, DVD, MO, etc.), semiconductor memories (ROM, RAM, Flash memory, etc.), and are stored in recording media, or transmitted and distributed via communication media. can be done. Note that the programs stored on the medium side also include a setting program for configuring software means (including not only execution programs but also tables and data structures) in the computer to be executed by the computer. A computer that realizes this device reads a program recorded on a recording medium, and if necessary, constructs software means using a setting program, and executes the above-described processing by controlling the operation of the software means. Note that the recording medium referred to in this specification is not limited to one for distribution, and includes storage media such as a magnetic disk and a semiconductor memory provided inside a computer or in a device connected via a network.

Note that the present invention is not limited to the above-described embodiments, and can be variously modified at the implementation stage without departing from the gist thereof. Moreover, each embodiment may be implemented in combination as appropriate, and in that case, the combined effect can be obtained. Furthermore, the embodiments described above include various inventions, and various inventions can be extracted by combinations selected from the plurality of constituent features disclosed. For example, if a problem can be solved and an effect can be obtained even if some constituent features are deleted from all the constituent features shown in the embodiment, the configuration from which these constituent features are deleted can be extracted as an invention.

1...Event prediction device 10...Control circuit 11...Memory 12...Communication module 13...User interface 14...Drive 15...

Storage medium

20, 30, 40, 50...Learning data set 21, 31...

Data extraction unit

22, 32... Initialization unit 23...Latent expression calculation unit 23-1, 23-2, 24-4, 24-5, 24-6, 33A-4, 33B-4...Neural network 24...Intensity function calculation unit 24-1a, 33A -1a, 33B-1a...First monotonically increasing neural network 24-1b, 33A-1b, 33B-1b...Second monotonically increasing neural network 24-2, 33A-2, 33B-2...Cumulative intensity function calculation unit 24- 3, 33A-3, 33B-3... Automatic differentiation section 25... Update section 25-1, 34A-1, 34B-1... Evaluation function calculation section 25-2, 34A-2, 34B-2... Optimization section 26...

Judgment unit

27, 36...Learned

parameters

28, 37...Prediction data 29, 38...Prediction sequence generation unit 33A...First intensity function calculation unit 33B...Second intensity function calculation unit 34A...First update unit 34B...Second Update section 35A...First judgment section 35B...Second judgment section

Claims

a first monotonically increasing neural network;
a second monotonically increasing neural network;
a first calculation unit that calculates a first cumulative function based on an output from the first monotonically increasing neural network and a parameter;
a second calculation unit that calculates a second cumulative function based on the output from the second monotonically increasing neural network, the parameter, and the period;
An information processing device equipped with
The first calculation unit is
Calculating a first cumulative intensity function based on the output from the first monotonically increasing neural network and the parameter;
The second calculation unit is
Calculating a second cumulative intensity function based on the output from the second monotonically increasing neural network, the parameter, and the period;
further comprising a third calculation unit that calculates an intensity function related to a point process based on the calculated first and second cumulative intensity functions;
The information processing device according to claim 1.
further comprising an updating unit that updates the parameter based on the calculated intensity function;
The information processing device according to claim 2.
Further comprising a neural network that receives as input all events included in a series including a plurality of events arranged discretely in continuous time, or the number of the plurality of events included in the series, and outputs the parameter.
The information processing device according to claim 1.
further comprising an initialization unit that initializes a plurality of weights applied to the first and second monotonically increasing neural networks based on a distribution with a positive average;
The information processing device according to claim 1.
A method performed by an information processing device, the method comprising:
outputting a scalar value according to a monotonically increasing function from a first monotonically increasing neural network by a first output unit of the information processing device;
outputting a scalar value according to a monotonically increasing function from a second monotonically increasing neural network by a second output unit of the information processing device;
A first calculation unit of the information processing device calculates a first cumulative function based on the scalar value output from the first monotonically increasing neural network and the parameter;
Calculating a second cumulative function by a second calculation unit of the information processing device based on the scalar value output from the second monotonically increasing neural network, the parameter, and the period;
An information processing method with
The first calculation unit is
Calculating a first cumulative intensity function based on the output from the first monotonically increasing neural network and the parameter;
The second calculation unit is
Calculating a second cumulative intensity function based on the output from the second monotonically increasing neural network, the parameter, and the period;
further comprising calculating, by a third calculation unit of the information processing device, an intensity function related to a point process based on the calculated first and second cumulative intensity functions;
The information processing method according to claim 6.
A program for causing a computer to function as each section included in the information processing apparatus according to any one of claims 1 to 5.