WO2015163322A1

WO2015163322A1 - Data analysis device, data analysis method, and program

Info

Publication number: WO2015163322A1
Application number: PCT/JP2015/062123
Authority: WO
Inventors: 勇気小阪; 虎王
Original assignee: 日本電気株式会社
Priority date: 2014-04-24
Filing date: 2015-04-21
Publication date: 2015-10-29
Also published as: CN105095616A; JPWO2015163322A1

Abstract

The present invention reduces the number of prediction rules while preventing a reduction in prediction accuracy in multitask-type analysis, in which relations between a plurality of objective variables and a plurality of explanatory variables are analyzed simultaneously. The multitask-type data analysis device is provided with a storage unit and a prediction rule learning unit. The storage unit holds first measured values, which are the measured values of the plurality of objective variables; second measured values, which are the measured values of the plurality of explanatory variables corresponding to the plurality of objective variables; and third measured values, which are the measured values of explanatory variables corresponding to objective variables to be predicted. The prediction rule learning unit uses the first and second measured values to calculate a common prediction rule, which is the prediction rule expressed by an explanatory variable relating in common to the plurality of objective variables; individual prediction rules, which comprise by-objective-variable prediction rules expressed by an explanatory variable relating to each objective variable; and by-group prediction rules, which are formed from prediction rules for each group obtained when the prediction rules included in the individual prediction rules are grouped.

Description

Data analysis apparatus, data analysis method and program

[Description of related applications]
The present invention is based on a patent application of the People's Republic of China: application number 20141167977.2 (filed on April 24, 2014), and the entire description of the application is incorporated herein by reference.
The present invention relates to a data analysis device, a data analysis method, and a program, and more particularly, to a data analysis device, a data analysis method, and a program that simultaneously analyze relationships between a plurality of objective variables and a plurality of explanatory variables.

The results of future predictions derived by analyzing large amounts of accumulated data are beginning to be used for corporate decision making. For example, in stores such as supermarkets and convenience stores, the quantity of each product that the store purchases is adjusted based on the demand prediction result for that product. To predict the demand for each product, the relationship between product and customer information (such as the attributes of each product and the attributes of the customers to whom each product was sold) and the sales performance of each product is analyzed. In this analysis, the sales performance value of each product is used as the objective variable, while the attributes of each product (price, manufacturer), the attributes of the customers to whom each product was sold (age, gender), and so on are used as the explanatory variables.

In the above data analysis, techniques have been developed that improve prediction accuracy by learning prediction rules representing the relationship between each objective variable and a plurality of explanatory variables while also taking into account the relationships among the plurality of objective variables, instead of handling the objective variables separately. Such an approach is called “multitask analysis”. In other words, in multitask analysis, prediction rules that express the relationship between each objective variable and a plurality of explanatory variables are learned first, and the values of the explanatory variables are then input to the learned prediction rules to calculate the predicted values.

An example of multitask analysis technology is described in Non-Patent Document 1. In the technique described in Non-Patent Document 1, a prediction rule expressed by explanatory variables related in common to all objective variables (hereinafter referred to as “common prediction rule”) and a prediction rule for each objective variable expressed by explanatory variables related to that objective variable (hereinafter referred to as “individual prediction rule”) are learned based on measured values of a plurality of objective variables and measured values of a plurality of explanatory variables. Next, actual values of each explanatory variable are input to the learned common prediction rule and individual prediction rule, and a predicted value is calculated for each objective variable.

Also, as related technology, Non-Patent Document 2 describes a convex optimization method for minimizing an objective function.

The entire contents disclosed in Non-Patent Documents 1 and 2 above are incorporated herein by reference. The following analysis was made by the present inventors.

In multitask type data analysis, it is required in practice to present not only the result predicted by the machine but also how the machine derived that result. This is because, when making a decision, it is important not only to check the prediction result but also to confirm the validity of the prediction rule that led to it.

In order to inform the user how the machine has derived the prediction result, it is necessary to display and provide the prediction rules to the user. However, if the number of objective variables ranges from thousands to tens of thousands, the number of prediction rules also ranges from thousands to tens of thousands, making it difficult for the user to check whether each prediction rule is valid.

Therefore, even when the number of objective variables is enormous, it is important to reduce the number of prediction rules learned without greatly reducing the prediction accuracy, but at present, such a technique has not been established.

Therefore, in multi-task type data analysis, it is desired to reduce the number of prediction rules while preventing a decrease in prediction accuracy. An object of the present invention is to provide a data analysis apparatus, a data analysis method, and a program that contribute to such a demand.

According to the first aspect of the present invention, a multitask type data analysis apparatus is provided. The data analysis apparatus includes a storage unit that holds a first actual measurement value that is an actual measurement value of a plurality of objective variables, a second actual measurement value that is an actual measurement value of a plurality of explanatory variables corresponding to the plurality of objective variables, and a third actual measurement value that is an actual measurement value of the explanatory variable corresponding to an objective variable to be predicted. In addition, the data analysis apparatus includes a prediction rule learning unit that uses the first actual measurement value and the second actual measurement value to learn a common prediction rule that is a prediction rule represented by explanatory variables related in common to the plurality of objective variables, an individual prediction rule consisting of prediction rules for each objective variable represented by explanatory variables related to each objective variable, and a group-specific prediction rule consisting of prediction rules for each group obtained when the prediction rules included in the individual prediction rule are grouped.

According to the second aspect of the present invention, there is provided a data analysis method in which a computer performs multitask type data analysis. The data analysis method includes a step in which the computer holds, in a storage unit, a first actual measurement value that is an actual measurement value of a plurality of objective variables, a second actual measurement value that is an actual measurement value of a plurality of explanatory variables corresponding to the plurality of objective variables, and a third actual measurement value that is an actual measurement value of the explanatory variable corresponding to an objective variable to be predicted. The data analysis method further includes a step in which the computer uses the first actual measurement value and the second actual measurement value read from the storage unit to learn a common prediction rule that is a prediction rule represented by explanatory variables related in common to the plurality of objective variables, an individual prediction rule consisting of prediction rules for each objective variable represented by explanatory variables related to each objective variable, and a group-specific prediction rule consisting of prediction rules for each group obtained when the prediction rules included in the individual prediction rule are grouped, and records the learned rules in the storage unit.

According to the third aspect of the present invention, there is provided a program for causing a computer to execute multitask type data analysis. The program causes the computer to execute a process of holding, in a storage unit, a first actual measurement value that is an actual measurement value of a plurality of objective variables, a second actual measurement value that is an actual measurement value of a plurality of explanatory variables corresponding to the plurality of objective variables, and a third actual measurement value that is an actual measurement value of the explanatory variable corresponding to an objective variable to be predicted. In addition, the program causes the computer to execute a process of using the first actual measurement value and the second actual measurement value read from the storage unit to learn a common prediction rule that is a prediction rule represented by explanatory variables related in common to the plurality of objective variables, an individual prediction rule consisting of prediction rules for each objective variable represented by explanatory variables related to each objective variable, and a group-specific prediction rule consisting of prediction rules for each group obtained when the prediction rules included in the individual prediction rule are grouped, and recording the learned rules in the storage unit. The program can be provided as a program product recorded in a non-transitory computer-readable storage medium.

The data analysis apparatus, data analysis method, and program according to the present invention can reduce the number of prediction rules while preventing a decrease in prediction accuracy in multi-task type data analysis.

FIG. 1 is a block diagram showing, as an example, the configuration of a data analysis apparatus according to one embodiment.
FIG. 2 is a block diagram showing, as an example, the configuration of the data analysis apparatus according to the first embodiment.
FIG. 3 is a flowchart showing, as an example, the operation of the data analysis apparatus according to the first embodiment.

First, an outline of one embodiment will be described. Note that the reference numerals of the drawings attached to this outline are merely examples for facilitating understanding, and are not intended to limit the present invention to the illustrated embodiment.

FIG. 1 is a block diagram illustrating the configuration of a data analysis apparatus 10 according to an embodiment. Referring to FIG. 1, the data analysis device 10 is a multitask type data analysis device, and includes a storage unit 14, a prediction rule learning unit 15B, and a prediction value calculation unit 15C.

The storage unit 14 holds a first actual measurement value 14A that is an actual measurement value of a plurality of objective variables, a second actual measurement value 14B that is an actual measurement value of a plurality of explanatory variables corresponding to the plurality of objective variables, and a third actual measurement value 14C that is an actual measurement value of the explanatory variable corresponding to the objective variable to be predicted.

The prediction rule learning unit 15B uses the first actual measurement value 14A and the second actual measurement value 14B to calculate a common prediction rule 14D that is a prediction rule expressed by explanatory variables related in common to the plurality of objective variables, an individual prediction rule 14E consisting of prediction rules for each objective variable represented by explanatory variables related to each objective variable, and a group-specific prediction rule 14F consisting of prediction rules for each group obtained when the prediction rules included in the individual prediction rule 14E are grouped. Here, the prediction rule learning unit 15B preferably groups the plurality of prediction rules so that prediction rules similar to each other among the plurality of prediction rules included in the individual prediction rule 14E belong to the same group.

The prediction value calculation unit 15C calculates the prediction value 14G of the objective variable to be predicted using the common prediction rule 14D and the group-specific prediction rule 14F calculated by the prediction rule learning unit 15B, together with the third actual measurement value 14C.

According to the data analysis apparatus 10, it is possible to reduce the number of prediction rules while preventing a decrease in prediction accuracy in multitask type data analysis. This is because the data analysis apparatus 10 can calculate the prediction value 14G of the objective variable to be predicted using the common prediction rule 14D together with the group-specific prediction rule 14F, which consists of prediction rules for each group obtained when the prediction rules included in the individual prediction rule 14E are grouped, instead of the individual prediction rule 14E, which consists of prediction rules for each objective variable. The number of prediction rules included in the group-specific prediction rule 14F can be made much smaller than the number of prediction rules included in the individual prediction rule 14E.

Therefore, according to the data analysis apparatus 10, the user can judge the validity of the prediction rules used to derive the prediction result based on a relatively small number of prediction rules (the common prediction rule 14D and the group-specific prediction rule 14F).

<Embodiment 1>
Next, the data analysis apparatus according to the first embodiment will be described in detail with reference to the drawings. FIG. 2 is a block diagram illustrating an example of the configuration of the data analysis apparatus 20 according to the present embodiment.

The data analysis apparatus 20 shown in FIG. 2 performs multitask analysis. That is, the data analysis apparatus 20 receives the actual measurement values 24A of the plurality of objective variables and the actual measurement values 24B of the plurality of explanatory variables as input, and learns prediction rules (24D to 24F) representing the relationship between the objective variables and the explanatory variables. Then, when the actual measured value 24C of the explanatory variable corresponding to the target variable to be predicted is input, it calculates and outputs the predicted value 24G for each target variable to be predicted.

In particular, the data analysis apparatus 20 of the present embodiment learns a prediction rule represented by explanatory variables that are related in common to all objective variables (referred to as “common prediction rule 24D”) and a prediction rule for each objective variable represented by the explanatory variables related to that objective variable (referred to as “individual prediction rule 24E”), groups similar individual prediction rules, and calculates a group-specific prediction rule 24F by recalculating the prediction rule for each group. Then, when the measured values of the explanatory variables are input, it calculates and outputs the predicted value 24G for each target variable to be predicted based on the common prediction rule 24D and the group-specific prediction rule 24F.

Referring to FIG. 2, the data analysis apparatus 20 includes a communication interface (I/F) unit 21, an operation input unit 22, a screen display unit 23, a storage unit 24, and a processor 25 as hardware.

The communication I/F unit 21 has a dedicated data communication circuit and performs data communication with various devices (not shown) connected via a communication line (not shown). The operation input unit 22 includes an operation input device such as a keyboard and a mouse, detects an operator's operation, and outputs it to the processor 25. The screen display unit 23 includes a screen display device such as an LCD (Liquid Crystal Display) or a PDP (Plasma Display Panel), and displays various information such as an operation menu and a selection result on the screen in response to an instruction from the processor 25.

The storage unit 24 has a storage device such as a hard disk or a semiconductor memory, and stores processing information and programs required for various processes in the processor 25. The program realizes various processing units (25A to 25C) by being read and executed by the processor 25. The program may be read in advance from an external device (not shown) or a computer-readable storage medium (not shown) via a data input/output function such as the communication I/F unit 21 and stored in the storage unit 24.

The main processing information recorded in the storage unit 24 includes the measured values 24A of a plurality of objective variables, the measured values 24B of a plurality of explanatory variables, the measured values 24C of the explanatory variables corresponding to the target variable to be predicted, the common prediction rule 24D, the individual prediction rule 24E, the group-specific prediction rule 24F, and the predicted value 24G.

The measured values 24A of the plurality of objective variables and the measured values 24B of the plurality of explanatory variables are classified according to the type of the objective variable. The data divided according to the type of the objective variable may be a list in which the actual measured value of the objective variable and the actual measured value of the corresponding explanatory variable are paired.

The actual measured value 24C of the explanatory variable corresponding to the target variable to be predicted is the actual measured value of the explanatory variable corresponding to the target variable to be predicted.

The common prediction rule 24D is a prediction rule represented by explanatory variables related to all the objective variables in common. The common prediction rule 24D may be a list in which an explanatory variable name commonly related to all objective variables and a value representing the influence exerted by the explanatory variable on the objective variable are paired.

The individual prediction rule 24E is a prediction rule for each objective variable represented by an explanatory variable related to each objective variable. The individual prediction rule 24E may be a list composed of triples of an objective variable name, an explanatory variable name related to the objective variable, and a value representing the influence of the explanatory variable on the objective variable.

The group-specific prediction rule 24F is a group-specific prediction rule when similar individual prediction rules are grouped. The group-specific prediction rule 24F may be configured by information in which a group ID and a group-specific prediction rule are paired, and information indicating the individual prediction rule 24E belonging to each group ID.

The predicted value 24G may be a list in which the target variable to be predicted and the predicted result are paired.
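
The list layouts described above for the items 24D to 24G can be sketched as plain records. This is only an illustrative sketch: the class names, field names, and sample values below are assumptions introduced for the example and do not appear in the original description.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CommonRuleEntry:
    """One pair in the common prediction rule 24D (hypothetical layout)."""
    explanatory_variable: str
    influence: float  # value representing the influence on the objective variable


@dataclass
class IndividualRuleEntry:
    """One triple in the individual prediction rule 24E (hypothetical layout)."""
    objective_variable: str
    explanatory_variable: str
    influence: float


@dataclass
class GroupRule:
    """One group of the group-specific prediction rule 24F (hypothetical layout)."""
    group_id: int
    influences: Dict[str, float]  # explanatory variable name -> influence value
    # Objective variables whose individual prediction rules belong to this group.
    members: List[str] = field(default_factory=list)


@dataclass
class PredictedValue:
    """One pair in the predicted value 24G (hypothetical layout)."""
    objective_variable: str
    value: float


# Hypothetical sample data, for illustration only.
group = GroupRule(group_id=0, influences={"price": -0.8},
                  members=["product_A", "product_B"])
print(group)
```

Keeping the member list next to each group rule mirrors the statement that 24F holds both the rule for each group ID and the information indicating which individual prediction rules belong to that group.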

The processor 25 includes a microprocessor such as a CPU (Central Processing Unit) and its peripheral circuits. The processor 25 reads the program from the storage unit 24 and executes it, thereby realizing various processing units through cooperation between the hardware and the program. The main processing units realized by the processor 25 include an input unit 25A, a prediction rule learning unit 25B, and a predicted value calculation unit 25C.

The input unit 25A inputs, from the communication I/F unit 21 or the operation input unit 22, the measured values 24A of a plurality of objective variables, the measured values 24B of a plurality of explanatory variables, and the measured values 24C of explanatory variables corresponding to the target variable to be predicted, and stores them in the storage unit 24.

The prediction rule learning unit 25B uses the measured values 24A of the plurality of objective variables and the measured values 24B of the plurality of explanatory variables to learn the common prediction rule 24D represented by the explanatory variables related in common to all the objective variables and the individual prediction rule 24E represented by the explanatory variables related to each objective variable. It further learns the group-specific prediction rule 24F calculated by grouping similar individual prediction rules, and stores the learned rules in the storage unit 24.

The predicted value calculation unit 25C reads the common prediction rule 24D, the group-specific prediction rule 24F, and the actual measured value 24C of the explanatory variable corresponding to the target variable to be predicted from the storage unit 24. It then inputs the measured value 24C to the common prediction rule 24D and the group-specific prediction rule 24F, calculates the predicted value 24G for each target variable to be predicted, and stores it in the storage unit 24.

The predicted value calculation unit 25C also reads the predicted value 24G from the storage unit 24 and outputs it to the screen display unit 23 or to the outside through the communication I/F unit 21. Further, the predicted value calculation unit 25C reads the common prediction rule 24D, the individual prediction rule 24E, and the group-specific prediction rule 24F from the storage unit 24, and outputs them to the screen display unit 23 or to the outside through the communication I/F unit 21.

Next, the operation of the data analysis apparatus 20 according to the present embodiment will be described with reference to the drawings. FIG. 3 is a flowchart showing an operation of the data analysis apparatus 20 as an example.

Referring to FIG. 3, the operation of the data analysis apparatus 20 according to the present embodiment includes two phases, a learning phase and a prediction phase.

First, in the learning phase, the data analysis apparatus 20 performs the following operations. The input unit 25A inputs the measured values 24A of a plurality of objective variables and the measured values 24B of a plurality of explanatory variables corresponding to the measured values 24A from the communication I/F unit 21 or the operation input unit 22, and stores them in the storage unit 24 (step S11).

Next, the prediction rule learning unit 25B reads the measured values 24A of the plurality of objective variables and the measured values 24B of the plurality of explanatory variables from the storage unit 24, and learns the common prediction rule 24D, the individual prediction rule 24E, and the group-specific prediction rule 24F all at the same time (step S12).

On the other hand, in the prediction phase, the data analysis apparatus 20 performs the following operations. First, the input unit 25A inputs the actual measured value 24C of the explanatory variable corresponding to the target variable to be predicted from the communication I/F unit 21 or the operation input unit 22, and stores it in the storage unit 24 (step S21).

Next, the predicted value calculation unit 25C reads the common prediction rule 24D and the group-specific prediction rule 24F from the storage unit 24, inputs the actual measured value 24C of the explanatory variable corresponding to the target variable to be predicted to these rules, and calculates a predicted value for each target variable to be predicted (step S22).

Next, the predicted value calculation unit 25C outputs the items selected by the user from among the predicted value 24G, the common prediction rule 24D, the individual prediction rule 24E, and the group-specific prediction rule 24F to the screen display unit 23, or outputs them to the outside through the communication I/F unit 21 (step S23).

According to the data analysis apparatus 20 of the present embodiment, even when there are many objective variables, the number of prediction rules can be reduced without greatly reducing the prediction accuracy, by obtaining the group-specific prediction rule 24F calculated by grouping the prediction rules learned for each objective variable.

Next, operations in the learning phase and the prediction phase of the data analysis apparatus 20 will be described in more detail based on specific examples. In the following, a subscript is written with an underscore. For example, A with subscript B is written as A_B. A superscript is written with a caret. For example, A with superscript B is written as A^B.

(1) Details of the learning phase [Step S11]
The input unit 25A receives the measured values 24A of a plurality of objective variables and the measured values 24B of a plurality of explanatory variables as inputs. The input measured values 24A of the objective variables and measured values 24B of the explanatory variables are denoted Y_nt and X_nt (n = 1, 2, ..., N_t; t = 1, ..., T), respectively.

Here, the vector X_nt is an M-dimensional column vector representing the n-th observation vector of the target variable type t. On the other hand, Y_nt is the nth actually measured value of the target variable type t. N_t represents the number of actually measured values of the target variable type t. Further, T represents the number of types of objective variables. X_ntm (m = 1,..., M) represents an actual measurement value of the explanatory variable m of the n-th observation vector of the target variable type t. M represents the number of explanatory variables. The matrix X_t represents an N_t × M size matrix in which row vectors X_nt ^ {T} (n = 1, 2,..., N_t) are aligned for each row. Here, {T} represents transposition. The vector Y_t represents an N_t × 1 size column vector in which Y_nt (n = 1, 2,..., N_t) is aligned for each row.
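
As a rough illustration of this layout, the following NumPy sketch builds X_t and Y_t for each objective variable type t. The sizes and the random values are placeholders chosen for the example and are not data from the original description.

```python
import numpy as np

rng = np.random.default_rng(0)

M = 4              # number of explanatory variables
N = [5, 3, 6]      # N_t: number of measured values for each objective variable type t
T = len(N)         # number of objective variable types

# X[t] is the N_t x M matrix X_t whose n-th row is X_nt^{T}.
X = [rng.normal(size=(N_t, M)) for N_t in N]
# Y[t] is the N_t-dimensional vector Y_t whose n-th entry is Y_nt.
Y = [rng.normal(size=N_t) for N_t in N]

for t in range(T):
    print(t, X[t].shape, Y[t].shape)
```

A list of per-type arrays is used because N_t may differ from one objective variable type to another, so the data do not generally fit a single rectangular array.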

[Step S12]
The common prediction rule 24D, the individual prediction rule 24E, and the group-specific prediction rule 24F are represented by a column vector p_t, a matrix Q, and a matrix F, respectively. Here, the M-dimensional column vector p_t represents a common prediction rule for the task t. A method for calculating the respective prediction rules 24D to 24F is as follows.

The matrix P is a matrix indicating the common prediction rule, and is a T × M matrix given by P = [p_1^{T}; p_2^{T}; ...; p_T^{T}]. The common prediction rule indicates an explanatory variable that is commonly related to the objective variables of all tasks, and the degree of influence of each explanatory variable on the objective variable differs for each task. Therefore, the common prediction rule is defined for each task.

The matrix Q is a matrix indicating the individual prediction rule 24E. The matrix Q represents an M × T size matrix [q_1 q_2 ... q_t ... q_T]. Here, the vector q_t is an M-dimensional column vector and represents an individual prediction rule for the task t.

The matrix F is a matrix indicating the group-specific prediction rule 24F. The matrix F represents an M × K size matrix [f_1 f_2 ... f_k ... f_K]. Here, the vector f_k is an M-dimensional column vector and represents the k-th group-specific prediction rule. K represents the number of groups when the individual prediction rule 24E is divided into groups.

The matrix G represents a T × K size matrix [g_1^{T}; g_2^{T}; ...; g_t^{T}; ...; g_T^{T}]. The vector g_t is a K-dimensional column vector. The vector g_t represents to which group the individual prediction rule of the objective variable type t belongs.
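
The shapes of P, Q, F, and G can be checked with a short NumPy sketch. The description only states that g_t indicates the group to which task t belongs, so the one-hot encoding of g_t used below is an assumption made for the example, as are the sizes and random values.

```python
import numpy as np

T, M, K = 6, 4, 2    # tasks, explanatory variables, groups (placeholder sizes)
rng = np.random.default_rng(0)

P = rng.normal(size=(T, M))   # row t is p_t^{T}, the common prediction rule for task t
Q = rng.normal(size=(M, T))   # column t is q_t, the individual prediction rule for task t
F = rng.normal(size=(M, K))   # column k is f_k, the k-th group-specific prediction rule

# Assumed one-hot membership: task t belongs to group assignment[t].
assignment = np.array([0, 0, 1, 0, 1, 1])
G = np.eye(K)[assignment]     # T x K matrix, row t is g_t^{T}

# F @ G.T is M x T: column t is F g_t, the group rule used for task t in place of q_t.
group_rule_per_task = F @ G.T
print(P.shape, Q.shape, F.shape, G.shape, group_rule_per_task.shape)
```

With this encoding, T individual rules (the columns of Q) are summarized by only K group rules (the columns of F), which is the reduction in the number of rules that the embodiment aims for.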

The prediction rule learning unit 25B learns the vector p_t and the matrices Q, F, and G at the same time. Specifically, the vector p_t and the matrices Q, F, and G may be learned by minimizing a predetermined objective function.

The prediction rule learning unit 25B can use an objective function represented by the following formula (1) as an example.

Σ_t ||X_t (p_t + F g_t) − Y_t||^2
 + ρ_1 ||P||_(1,∞)
 + ρ_2 ||F||_1
 + ρ_3 tr(PQ)
 + ρ_4 tr(Q^{T} Q − 2 Q^{T} F G^{T} + G F^{T} F G^{T})
... (1)

In formula (1), ρ_1, ρ_2, ρ_3, and ρ_4 are parameters that adjust the degree of influence of each term. Σ_t represents the sum over t.
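
To make the five terms of formula (1) concrete, the following sketch evaluates the objective for given P, Q, F, and G. It is a minimal illustration under stated assumptions, not the learning procedure itself: the function name `objective` is hypothetical, ||F||_1 is taken as the sum of the absolute values of all entries of F, and the (1,∞) norm is applied row by row to P following the row-wise definition given in the text for formula (1).

```python
import numpy as np


def objective(X, Y, P, Q, F, G, rho):
    """Evaluate formula (1).

    X[t] is N_t x M, Y[t] has length N_t, P is T x M, Q is M x T,
    F is M x K, G is T x K, and rho = (rho_1, rho_2, rho_3, rho_4).
    """
    rho1, rho2, rho3, rho4 = rho
    # First term: squared prediction error summed over tasks.
    fit = sum(np.sum((X[t] @ (P[t] + F @ G[t]) - Y[t]) ** 2)
              for t in range(len(X)))
    # Second term: (1, inf) norm of P, the sum over rows of the largest absolute entry.
    common = rho1 * np.abs(P).max(axis=1).sum()
    # Third term: assumed entrywise 1-norm of F.
    group = rho2 * np.abs(F).sum()
    # Fourth term: tr(PQ).
    overlap = rho3 * np.trace(P @ Q)
    # Fifth term: tr(Q^T Q - 2 Q^T F G^T + G F^T F G^T).
    FG = F @ G.T
    cluster = rho4 * np.trace(Q.T @ Q - 2 * Q.T @ FG + FG.T @ FG)
    return fit + common + group + overlap + cluster


# Tiny hand-checkable case with T = M = K = 1 (placeholder values).
X = [np.array([[2.0]])]
Y = [np.array([1.0])]
P = np.array([[1.0]])
Q = np.array([[3.0]])
F = np.array([[0.5]])
G = np.array([[1.0]])
print(objective(X, Y, P, Q, F, G, (1.0, 1.0, 1.0, 1.0)))  # 14.75
```

In the hand-checkable case the five terms are 4, 1, 0.5, 3, and 6.25, which sum to 14.75.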

The purpose of introducing each term of formula (1) is as follows. The first term aims to reduce the error between the prediction result obtained using the prediction rules and the actual measurement value. The second term aims to reduce, for the common prediction rule, the number of types of explanatory variables that are effective for prediction in common across tasks. The third term aims to reduce, for the group-specific prediction rule, the number of types of explanatory variables that are effective for prediction. The fourth term aims to make the types of explanatory variables that are effective for prediction differ between the group-specific prediction rule and the common prediction rule. The fifth term aims to group the individual prediction rules so that prediction rules similar to each other belong to the same group-specific prediction rule. Strictly speaking, the fourth term acts directly to make the types of explanatory variables that are effective for prediction differ between the individual prediction rule and the common prediction rule. However, because the group-specific prediction rule is obtained by grouping the individual prediction rules, it is considered that if the types of explanatory variables effective for prediction differ between the individual prediction rule and the common prediction rule, they also differ between the group-specific prediction rule and the common prediction rule.
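
One way to read the fifth term is that its trace expression equals the squared Frobenius norm of Q − F G^{T}, so it penalizes the distance between each individual rule q_t and the group rule F g_t assigned to it. The following sketch checks this identity numerically; the matrix sizes and random values are placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)
M, T, K = 4, 6, 2
Q = rng.normal(size=(M, T))
F = rng.normal(size=(M, K))
G = rng.normal(size=(T, K))

# Trace form as written in the fifth term of formula (1).
trace_form = np.trace(Q.T @ Q - 2 * Q.T @ F @ G.T + G @ F.T @ F @ G.T)
# Equivalent squared Frobenius norm of the difference Q - F G^T.
frobenius_form = np.linalg.norm(Q - F @ G.T, "fro") ** 2

print(np.isclose(trace_form, frobenius_form))
```

The identity holds because tr(G F^{T} Q) = tr(Q^{T} F G^{T}), so the two cross terms of the expanded square combine into the single factor of 2.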

The norms in formula (1) are defined as follows. If W is a d-dimensional column vector, ||W||_1 = |w_1| + |w_2| + ... + |w_d|. Here, w_i represents the i-th component of the vector W, and |·| represents an absolute value. Further, ||W||_∞ = max(|w_1|, |w_2|, ..., |w_d|). Further, if A is a d × T matrix and a^{i} is its i-th row vector, ||A||_(1,∞) = Σ_{i=1}^{d} ||a^{i}||_∞. Here, ||a^{i}||_∞ = max(|a^{i}_1|, |a^{i}_2|, ..., |a^{i}_T|).
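
These three norms can be written down directly. The sketch below follows the definitions above; the function names and the sample matrix are illustrative choices, not part of the original description.

```python
import numpy as np


def norm_1(w):
    """||W||_1: sum of the absolute values of the components."""
    return np.abs(w).sum()


def norm_inf(w):
    """||W||_inf: largest absolute value among the components."""
    return np.abs(w).max()


def norm_1_inf(A):
    """||A||_(1,inf): sum over the rows of A of the infinity norm of each row."""
    return sum(norm_inf(row) for row in np.asarray(A))


A = np.array([[1.0, -3.0, 2.0],
              [0.0, 0.5, -0.25]])
print(norm_1(A[0]), norm_inf(A[0]), norm_1_inf(A))  # 6.0 3.0 3.5
```

Because the (1,∞) norm charges each row only for its largest entry, a row contributes nothing only when all of its entries are zero, which is why this kind of penalty tends to zero out entire rows.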

The prediction rule learning unit 25B calculates matrices P, Q, F, and G that minimize the objective function given by Expression (1). As an example, the prediction rule learning unit 25B can calculate the matrices P, Q, F, and G that minimize Equation (1) by using the convex optimization method described in Non-Patent Document 2.

(2) Details of the prediction phase [Step S21]
The input unit 25A inputs the nth actually measured value X′_nt of the explanatory variable corresponding to the target variable t to be predicted. The vector X′_nt is an M-dimensional vector.

[Step S22]
The predicted value calculation unit 25C calculates the predicted value Y′_nt of the objective variable of X′_nt using the following equation (2).

Y′_nt = (p_t + F g_t)^{T} X′_nt
... (2)
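
Formula (2) amounts to a single inner product per observation. The sketch below illustrates it with placeholder values; the function name `predict` and the one-hot form of g_t are assumptions made for the example.

```python
import numpy as np


def predict(x_new, p_t, F, g_t):
    """Formula (2): Y' = (p_t + F g_t)^T X' for one observation x_new of task t."""
    return (p_t + F @ g_t) @ x_new


# Placeholder values: M = 3 explanatory variables, K = 2 groups.
p_t = np.array([0.5, 0.0, -1.0])    # common prediction rule for task t
F = np.array([[0.0, 2.0],
              [1.0, 0.0],
              [0.0, 0.0]])          # columns are the group-specific rules f_1 and f_2
g_t = np.array([0.0, 1.0])          # assumed one-hot: task t belongs to the second group
x_new = np.array([1.0, 2.0, 3.0])   # measured explanatory variables X'_nt

print(predict(x_new, p_t, F, g_t))  # -0.5
```

Only p_t, F, and g_t are needed at prediction time; the individual prediction rule q_t does not appear in formula (2), which matches the prediction phase reading only the common and group-specific prediction rules.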

[Step S23]
Then, the predicted value calculation unit 25C outputs the predicted value Y′_nt calculated based on formula (2).

In the present invention, the following forms are possible.
[Form 1]
The data analysis device according to the first aspect is as described above.
[Form 2]
The data analysis device according to Form 1, further comprising a predicted value calculation unit that calculates a predicted value of the objective variable to be predicted, using the common prediction rule and the group-specific prediction rule learned by the prediction rule learning unit and the third actual measurement value.
[Form 3]
The data analysis device according to Form 2, wherein the prediction rule learning unit further learns a grouping rule for grouping the plurality of prediction rules included in the individual prediction rule so that mutually similar prediction rules belong to the same group.
[Form 4]
The data analysis device according to Form 3, wherein the prediction rule learning unit learns the common prediction rule, the individual prediction rule, the group-specific prediction rule, and the grouping rule based on a predetermined objective function including the common prediction rule, the individual prediction rule, the group-specific prediction rule, and the grouping rule.
[Form 5]
The data analysis device according to Form 4, wherein the prediction rule learning unit learns the common prediction rule, the individual prediction rule, the group-specific prediction rule, and the grouping rule by minimizing the predetermined objective function using a convex optimization method.
[Form 6]
The data analysis device according to Form 4 or 5, wherein the predetermined objective function includes at least one of: a first term for reducing the error between the first actual measurement value and a predicted value based on the common prediction rule, the individual prediction rule, the group-specific prediction rule, and the grouping rule; a second term for learning the common prediction rule; a third term for learning the group-specific prediction rule; a fourth term for making the types of explanatory variables effective for prediction differ between the group-specific prediction rule and the common prediction rule; and a fifth term for making mutually similar prediction rules among the plurality of prediction rules included in the individual prediction rule belong to the same group.
[Form 7]
The data analysis device according to Form 6, wherein the predetermined objective function is a weighted sum of a plurality of terms among the first to fifth terms.
[Form 8]
The data analysis device according to any one of Forms 3 to 7, wherein the predicted value calculation unit calculates the predicted value of the objective variable to be predicted based on the third actual measurement value and on the common prediction rule, the group-specific prediction rule, and the grouping rule learned by the prediction rule learning unit.
[Form 9]
The data analysis method according to the second aspect is as described above.
[Form 10]
The data analysis method according to Form 9, including a predicted value calculation unit that calculates a predicted value of the objective variable to be predicted, using the common prediction rule and the group-specific prediction rule learned by the prediction rule learning unit and the third actual measurement value.
[Form 11]
The data analysis method according to Form 10, wherein the prediction rule learning unit further learns a grouping rule for grouping the plurality of prediction rules included in the individual prediction rule so that mutually similar prediction rules belong to the same group.
[Form 12]
The data analysis method according to Form 11, wherein the prediction rule learning unit learns the common prediction rule, the individual prediction rule, the group-specific prediction rule, and the grouping rule based on a predetermined objective function including the common prediction rule, the individual prediction rule, the group-specific prediction rule, and the grouping rule.
[Form 13]
The data analysis method according to Form 12, wherein the prediction rule learning unit learns the common prediction rule, the individual prediction rule, the group-specific prediction rule, and the grouping rule by minimizing the predetermined objective function using a convex optimization method.
[Form 14]
The data analysis method according to Form 12 or 13, wherein the predetermined objective function includes at least one of: a first term for reducing the error between the first actual measurement value and a predicted value based on the common prediction rule, the individual prediction rule, the group-specific prediction rule, and the grouping rule; a second term for learning the common prediction rule; a third term for learning the group-specific prediction rule; a fourth term for making the types of explanatory variables effective for prediction differ between the group-specific prediction rule and the common prediction rule; and a fifth term for making mutually similar prediction rules among the plurality of prediction rules included in the individual prediction rule belong to the same group.
[Form 15]
The data analysis method according to Form 14, wherein the predetermined objective function is a weighted sum of a plurality of terms among the first to fifth terms.
[Form 16]
The data analysis method according to any one of Forms 11 to 15, wherein the predicted value calculation unit calculates the predicted value of the objective variable to be predicted based on the third actual measurement value and on the common prediction rule, the group-specific prediction rule, and the grouping rule learned by the prediction rule learning unit.
[Form 17]
The program according to the third aspect is as described above.
[Form 18]
The program according to Form 17, including a predicted value calculation unit that calculates a predicted value of the objective variable to be predicted, using the common prediction rule and the group-specific prediction rule learned by the prediction rule learning unit and the third actual measurement value.
[Form 19]
The program according to Form 18, wherein the prediction rule learning unit further learns a grouping rule for grouping the plurality of prediction rules included in the individual prediction rule so that mutually similar prediction rules belong to the same group.
[Form 20]
The program according to Form 19, wherein the prediction rule learning unit learns the common prediction rule, the individual prediction rule, the group-specific prediction rule, and the grouping rule based on a predetermined objective function including the common prediction rule, the individual prediction rule, the group-specific prediction rule, and the grouping rule.
[Form 21]
The program according to Form 20, wherein the prediction rule learning unit learns the common prediction rule, the individual prediction rule, the group-specific prediction rule, and the grouping rule by minimizing the predetermined objective function using a convex optimization method.
[Form 22]
The program according to Form 20 or 21, wherein the predetermined objective function includes at least one of: a first term for reducing the error between the first actual measurement value and a predicted value based on the common prediction rule, the individual prediction rule, the group-specific prediction rule, and the grouping rule; a second term for learning the common prediction rule; a third term for learning the group-specific prediction rule; a fourth term for making the types of explanatory variables effective for prediction differ between the group-specific prediction rule and the common prediction rule; and a fifth term for making mutually similar prediction rules among the plurality of prediction rules included in the individual prediction rule belong to the same group.
[Form 23]
The program according to Form 22, wherein the predetermined objective function is a weighted sum of a plurality of terms among the first to fifth terms.
[Form 24]
The program according to any one of Forms 19 to 23, wherein the predicted value calculation unit calculates the predicted value of the objective variable to be predicted based on the third actual measurement value and on the common prediction rule, the group-specific prediction rule, and the grouping rule learned by the prediction rule learning unit.

It should be noted that the entire disclosures of Non-Patent Documents 1 and 2 are incorporated herein by reference. Within the scope of the entire disclosure (including the claims) of the present invention, the embodiments can be changed and adjusted based on its basic technical concept. Further, various combinations or selections of the various disclosed elements (including each element of each claim, each element of each embodiment, each element of each drawing, etc.) are possible within the scope of the claims of the present invention. That is, the present invention of course includes various variations and modifications that could be made by those skilled in the art according to the entire disclosure including the claims and the technical idea. In particular, with respect to the numerical ranges described in this document, any numerical value or sub-range included in a range should be construed as being specifically described even if there is no specific description.

10, 20 Data analysis device
14, 24 Storage unit
14A First measured value
14B Second measured value
14C Third measured value
14D, 24D Common prediction rule
14E, 24E Individual prediction rule
14F, 24F Group-specific prediction rule
14G Predicted value
15B, 25B Prediction rule learning unit
15C, 25C Predicted value calculation unit
21 Communication I/F unit
22 Operation input unit
23 Screen display unit
24A Measured value of objective variable
24B Measured value of explanatory variable
24C Measured value of explanatory variable corresponding to objective variable to be predicted
24G Predicted value
25 Processor
25A Input unit

Claims

A multitask-type data analysis device, comprising:
a storage unit that holds a first actual measurement value that is an actual measurement value of a plurality of objective variables, a second actual measurement value that is an actual measurement value of a plurality of explanatory variables corresponding to the plurality of objective variables, and a third actual measurement value that is an actual measurement value of an explanatory variable corresponding to an objective variable to be predicted; and
a prediction rule learning unit that uses the first actual measurement value and the second actual measurement value to learn a common prediction rule that is a prediction rule represented by explanatory variables commonly related to the plurality of objective variables, an individual prediction rule composed of prediction rules for each objective variable represented by explanatory variables related to each objective variable, and a group-specific prediction rule composed of prediction rules for each group when the prediction rules included in the individual prediction rule are grouped.
The data analysis device according to claim 1, further comprising a predicted value calculation unit that calculates a predicted value of the objective variable to be predicted, using the common prediction rule and the group-specific prediction rule learned by the prediction rule learning unit and the third actual measurement value.
The data analysis device according to claim 2, wherein the prediction rule learning unit further learns a grouping rule for grouping the plurality of prediction rules included in the individual prediction rule so that mutually similar prediction rules belong to the same group.
The data analysis device according to claim 3, wherein the prediction rule learning unit learns the common prediction rule, the individual prediction rule, the group-specific prediction rule, and the grouping rule based on a predetermined objective function including the common prediction rule, the individual prediction rule, the group-specific prediction rule, and the grouping rule.
The data analysis device according to claim 4, wherein the prediction rule learning unit learns the common prediction rule, the individual prediction rule, the group-specific prediction rule, and the grouping rule by minimizing the predetermined objective function using a convex optimization method.
The data analysis device according to claim 4, wherein the predetermined objective function includes at least one of: a first term for reducing the error between the first actual measurement value and a predicted value based on the common prediction rule, the individual prediction rule, the group-specific prediction rule, and the grouping rule; a second term for learning the common prediction rule; a third term for learning the group-specific prediction rule; a fourth term for making the types of explanatory variables effective for prediction differ between the group-specific prediction rule and the common prediction rule; and a fifth term for making mutually similar prediction rules among the plurality of prediction rules included in the individual prediction rule belong to the same group.
The data analysis device according to claim 6, wherein the predetermined objective function is a weighted sum of a plurality of terms among the first to fifth terms.
The data analysis device according to claim 3, wherein the predicted value calculation unit calculates the predicted value of the objective variable to be predicted based on the third actual measurement value and on the common prediction rule, the group-specific prediction rule, and the grouping rule learned by the prediction rule learning unit.
A data analysis method in which a computer performs multitask-type data analysis, the method comprising:
holding, by the computer, in a storage unit, a first actual measurement value that is an actual measurement value of a plurality of objective variables, a second actual measurement value that is an actual measurement value of a plurality of explanatory variables corresponding to the plurality of objective variables, and a third actual measurement value that is an actual measurement value of an explanatory variable corresponding to an objective variable to be predicted; and
learning, using the first actual measurement value and the second actual measurement value read from the storage unit, a common prediction rule that is a prediction rule represented by explanatory variables commonly related to the plurality of objective variables, an individual prediction rule composed of prediction rules for each objective variable represented by explanatory variables related to each objective variable, and a group-specific prediction rule composed of prediction rules for each group when the prediction rules included in the individual prediction rule are grouped, and recording the learned rules in the storage unit.
A program that causes a computer to perform multitask-type data analysis, the program causing the computer to execute:
a process of holding, in a storage unit, a first actual measurement value that is an actual measurement value of a plurality of objective variables, a second actual measurement value that is an actual measurement value of a plurality of explanatory variables corresponding to the plurality of objective variables, and a third actual measurement value that is an actual measurement value of an explanatory variable corresponding to an objective variable to be predicted; and
a process of learning, using the first actual measurement value and the second actual measurement value read from the storage unit, a common prediction rule that is a prediction rule represented by explanatory variables commonly related to the plurality of objective variables, an individual prediction rule composed of prediction rules for each objective variable represented by explanatory variables related to each objective variable, and a group-specific prediction rule composed of prediction rules for each group when the prediction rules included in the individual prediction rule are grouped, and recording the learned rules in the storage unit.