WO2023149138A1 - Estimator learning device - Google Patents

Estimator learning device

Info

Publication number
WO2023149138A1
Authority
WO
WIPO (PCT)
Prior art keywords
condition
qubo
estimator
learning device
unit
Prior art date
Application number
PCT/JP2022/048176
Other languages
French (fr)
Japanese (ja)
Inventor
晃一郎 八幡
彰規 淺原
好弘 刑部
秀和 森田
Original Assignee
Hitachi, Ltd. (株式会社日立製作所)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi, Ltd. (株式会社日立製作所)
Publication of WO2023149138A1 publication Critical patent/WO2023149138A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00 Subject matter not provided for in other groups of this subclass

Definitions

  • The present invention relates to an estimator learning device.
  • The technology of estimating objective variables from explanatory-variable data is among the most basic technologies of machine learning and artificial intelligence.
  • Such estimation techniques are utilized in many situations. For example, in the field of material development, experimenting with every combination (condition) of a plurality of materials in order to develop a material with a high value of a specific material property takes an enormous amount of time and money. If the material property values can be estimated in advance from the experimental conditions, experiments with low prospects can be omitted, enabling efficient material development. Here, it is desirable that the material property values be estimated with high accuracy. Decision trees and their derived algorithms are used for estimating objective variables from explanatory-variable data because of their high accuracy.
  • Ising machines are machines that can solve QUBO (Quadratic Unconstrained Binary Optimization) problems, that is, optimization problems over binary variables with a quadratic objective, and are used to solve combinatorial optimization problems. Therefore, if the problem of searching for a decision tree that minimizes the estimation error can be converted into a QUBO problem, the strengths of the Ising machine can be brought to bear on learning decision trees.
  • Patent Document 1 (Japanese Patent Application Laid-Open No. 2021-2322) discloses an Ising machine data input device and a method of inputting data to an Ising machine.
  • The Ising machine data input device includes a conversion unit that performs a conversion process to convert an input expression in a format not suitable for input to the Ising machine into a suitable format.
  • The conversion unit derives a mathematical expression and evaluates whether the derived expression satisfies a preset quality metric.
  • The derived expression is input to the Ising machine when it is evaluated as satisfying the metric. When the derived expression is evaluated as not satisfying the metric, the conversion unit repeats the conversion process using a different input expression.
  • The technique of Patent Document 1 converts the input problem into a QUBO problem by repeating a conversion process that turns the input problem into a mathematically equivalent problem, and solves the converted QUBO problem with an Ising machine.
  • However, the problem of searching for a decision tree that minimizes the estimation error cannot be converted into a QUBO problem by mathematically equivalent transformations alone.
  • The present invention has been made in view of the above problem, and its purpose is to provide a technique for increasing the estimation accuracy of decision trees.
  • To achieve this purpose, the present invention provides an estimator learning device for training an estimator that searches for branch conditions of a decision tree for estimating an objective variable from explanatory-variable data. The estimator comprises: a QUBO conversion unit that converts the prediction-error minimization problem in the branch-condition search into a first problem that is a QUBO problem or a problem equivalent to the QUBO problem; a QUBO calculation unit that computes the first problem converted by the QUBO conversion unit; and a branch condition generation unit that generates the branch condition based on the calculation result of the QUBO calculation unit.
  • FIG. 1 is a functional block diagram of the estimator learning device according to the first embodiment.
  • FIG. 2 is a decision tree according to the first embodiment.
  • FIG. 3 is a functional block diagram of the QUBO problem conversion unit according to the first embodiment.
  • FIG. 4 is a diagram showing an example data structure of the explanatory variable DB according to the first embodiment.
  • FIG. 5 is a diagram showing an example data structure of the objective variable DB according to the first embodiment.
  • FIG. 6 is a diagram showing an example data structure of the condition DB according to the first embodiment.
  • FIG. 7 is a diagram showing an example data structure of the conditioned explanatory variable DB according to the first embodiment.
  • FIG. 8 is a diagram showing an example data structure of the decision tree DB according to the first embodiment.
  • FIG. 9 is a diagram showing an example data structure of the learning parameter DB according to the first embodiment.
  • FIG. 10 is a processing flow of the estimator learning device according to the first embodiment.
  • FIG. 11 is a decision tree including conditions that can be expressed by a logical product, according to the second embodiment.
  • FIG. 12 is a diagram explaining a method of expressing a logical product condition according to the second embodiment.
  • A first embodiment of the present invention will be described using FIG. 1.
  • FIG. 1 is a functional block diagram of the estimator learning device according to the first embodiment.
  • The estimator learning device 100 of the present invention comprises an interface 10, a database (DB) 11, and an estimator 12.
  • The interface 10 includes an input unit 101 and an output unit 102 as an example of a "display unit". Data including explanatory variables and objective variables are input to the input unit 101.
  • The explanatory variables are stored in the explanatory variable DB 110 (FIG. 4), and the objective variables are stored in the objective variable DB 111 (FIG. 5).
  • The output unit 102 outputs data to the outside.
  • The estimator 12 includes a condition generation unit 103, a conditioned explanatory variable generation unit 104, a QUBO problem conversion unit 105, a QUBO problem calculation unit 106, a branch condition generation unit 107, a condition determination unit 108, and an objective variable estimation unit 109.
  • FIG. 2 is a decision tree according to the first embodiment.
  • The condition generation unit 103 generates branch conditions.
  • A branch condition is a condition used for branching the decision tree.
  • A decision tree is a machine learning algorithm that estimates objective variables based on given explanatory variables, as shown in FIG. 2. In a decision tree, branches are created sequentially using explanatory variables so that the prediction error becomes small.
  • Generally, a branch condition of a decision tree is a numerical condition on a single explanatory variable, such as "the temperature is higher than 30 degrees (temperature > 30)".
  • However, the branch conditions of the present invention are not limited to such conditions; any condition that can return true or false from the explanatory variables of each sample may be used. For example, "the sum of temperature and humidity is 100 or more" and "a person appears in the image" are conceivable.
  • The condition generation unit 103 may let the user create branch conditions manually, or may create branch conditions automatically from the explanatory variables. For example, if an explanatory variable is a continuous quantity, conditions such as "temperature > the 1/5 quantile of temperature", "temperature > the 2/5 quantile of temperature", or "temperature > the 3/5 quantile of temperature" can be determined based on statistics of the explanatory variable. Parameters related to these statistics, such as the quantile granularity, may be selected by the user or determined automatically based on the sample size. If an explanatory variable is label data, conditions such as "the day of the week is Monday" or "the day of the week is not Monday" can be generated automatically.
  • Conditions such as "the temperature data is missing" or "the number of missing explanatory variables is 5 or more" can also be generated automatically. If an explanatory variable is missing and a condition is difficult to evaluate, the sample may, for example, be uniformly judged as not satisfying the condition. The generated conditions are stored in the condition DB 112.
  • The conditioned explanatory variable generation unit 104 generates conditioned explanatory variables from the explanatory variables, and stores the generated conditioned explanatory variables in the conditioned explanatory variable DB 113.
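  • As a concrete illustration of condition generation and conditioning, the following is a minimal Python sketch. It is not code from the patent: the function names, the use of pandas, and the quantile choices are all assumptions made for illustration. It generates quantile conditions for continuous variables, equality conditions for label variables, and missingness conditions, and then binarizes them into the 0/1 conditioned-explanatory-variable matrix X[i][j].

```python
# A minimal sketch, assuming pandas tables; names are illustrative only.
import pandas as pd

def generate_conditions(df: pd.DataFrame, quantiles=(0.2, 0.4, 0.6)):
    """Generate candidate branch conditions as (label, predicate) pairs."""
    conditions = []
    for col in df.columns:
        if pd.api.types.is_numeric_dtype(df[col]):
            # Continuous variable: thresholds at chosen quantiles
            # (e.g. the 1/5, 2/5, 3/5 quantiles mentioned in the text).
            for q in quantiles:
                t = df[col].quantile(q)
                conditions.append((f"{col} > {t:.3g}",
                                   lambda s, c=col, t=t: s[c] > t))
        else:
            # Label variable: one equality condition per observed value.
            for v in df[col].dropna().unique():
                conditions.append((f"{col} == {v!r}",
                                   lambda s, c=col, v=v: s[c] == v))
        # Missingness itself can be a condition.
        conditions.append((f"{col} is missing",
                           lambda s, c=col: pd.isna(s[c])))
    return conditions

def conditioned_variables(df: pd.DataFrame, conditions):
    """Build the 0/1 matrix X[i][j]: 1 if sample i satisfies condition j.
    Comparisons against a missing value evaluate to False, so a sample whose
    value is missing is uniformly judged as not satisfying the condition."""
    X = pd.DataFrame(index=df.index)
    for label, pred in conditions:
        X[label] = [int(bool(pred(row))) for _, row in df.iterrows()]
    return X
```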
  • The QUBO problem conversion unit 105 converts the problem of searching for a branch condition that reduces the prediction error (the prediction-error minimization problem) into a QUBO (Quadratic Unconstrained Binary Optimization) problem as an example of the "first problem".
  • Note that the QUBO problem conversion unit 105 may instead convert the prediction-error minimization problem into a problem equivalent to a QUBO problem.
  • FIG. 3 is a functional block diagram of the QUBO problem conversion unit.
  • The QUBO problem conversion unit 105 includes an error function generation unit 301 and a QUBO problem generation unit 302.
  • The error function generation unit 301 will be explained.
  • As an error function serving as an index of the prediction error, there is the residual sum of squares, a residual being the error between the predicted value and the actual value. For a decision tree, it is represented by Equation 1 below.
  • J is the residual sum of squares, y[i] is the objective variable of sample i, S1 is the set of samples satisfying the condition, S0 is the set of samples not satisfying the condition, pred1 is the predicted value for the samples satisfying the condition, and pred0 is the predicted value for the samples not satisfying it.
  • The pred1 and pred0 that minimize J are, respectively, the mean of y over the samples satisfying the condition and the mean of y over the samples not satisfying it. Therefore, the residual sum of squares J is represented by Equation 2 below.
  • Var(S) represents the variance of the set S, and N(S) represents the number of elements of the set S.
  • In other words, the residual sum of squares is a value obtained by weighting the variance of each sample group divided by the condition by the number of samples in that group. Transforming Equation 2 yields Equation 3 below.
  • However, since N(S1) and N(S0) appear in the denominators of Equation 3, it cannot be converted into a QUBO problem as it stands.
  • Therefore, the QUBO problem conversion unit 105 converts the residual sum of squares J into a QUBO problem by adjusting the weight applied to the variance of each sample group. For example, weighting is performed not by the number of samples but by the square of the number of samples, as in Equation 4 below. However, as long as N(S1) and N(S0) can be eliminated from the denominators, the weight need not be the square of the number of samples; it may be, for example, the third or fourth power of the number of samples, or the square of the ratio of the number of samples.
  • Transforming Equation 4 yields Equation 5 below, in which N(S1) and N(S0) disappear from the denominators. The error function H is the residual sum of squares J with its weighting changed; it correlates strongly with J and has a form that can be converted into a QUBO problem. Therefore, a branch condition that reduces the error function H is a branch condition that reduces the residual sum of squares J.
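  • The effect of the changed weighting can be checked numerically. The toy Python sketch below (an illustration under the assumption of a single candidate split, not code from the patent) computes the residual sum of squares J in its weighted-variance form (Equation 2) and the modified error function H in the division-free form (Equation 5): H contains no division by N(S1) or N(S0), which is what makes the QUBO conversion possible.

```python
# A toy comparison of J (Equation 2) and H (Equations 4/5); illustrative only.
import numpy as np

y = np.array([20.0, 22.0, 33.0, 10.0, 12.0, 35.0])  # objective variable
x = np.array([1, 1, 1, 0, 0, 1])                    # 1 if the condition is satisfied

def residual_sum_of_squares(y, x):
    s1, s0 = y[x == 1], y[x == 0]
    # J = N(S1)*Var(S1) + N(S0)*Var(S0), population variance
    return len(s1) * s1.var() + len(s0) * s0.var()

def weighted_error(y, x):
    s1, s0 = y[x == 1], y[x == 0]
    # H = N(S1)^2*Var(S1) + N(S0)^2*Var(S0)
    #   = N(S1)*sum(y^2) - (sum y)^2 + (same for S0): no N in any denominator
    return (len(s1) * (s1 ** 2).sum() - s1.sum() ** 2
            + len(s0) * (s0 ** 2).sum() - s0.sum() ** 2)

print(residual_sum_of_squares(y, x), weighted_error(y, x))
```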
  • The QUBO problem generation unit 302 will be explained.
  • The QUBO problem generation unit 302 determines the conditions to be searched and the data to be input to the QUBO problem calculation unit 106.
  • As the conditions to be searched, for example, the conditions (temperature > 20, and so on) corresponding to the columns of the conditioned explanatory variables (FIG. 7), described later, are conceivable.
  • The generated QUBO problem will now be described.
  • A QUBO problem is expressed by an error function to be minimized, written in QUBO variables taking the value 0 or 1, and by one or more constraints that the QUBO variables must satisfy.
  • The error function is represented by Equation 6 below.
  • S is the set of all samples, X[i][j] is the conditioned explanatory variable of condition j for sample i, C is the set of conditions, and c is a vector of QUBO variables expressing whether each condition is used; c[j] = 1 indicates that condition j is used in the branch.
  • The condition to be used must be narrowed down to exactly one, which is represented by the constraint of Equation 7 below.
  • The QUBO problem conversion unit 105 outputs the error function and the constraint calculated as described above.
  • The QUBO problem calculation unit 106 computes the QUBO problem.
  • The QUBO problem calculation unit 106 (also called an annealing machine) may be, for example, a quantum annealing machine that exploits quantum-mechanical properties, a coherent Ising machine that uses the characteristics of light, or a digital annealer built from digital circuits using CMOS or FPGA.
  • The QUBO problem calculation unit 106 outputs the QUBO variables c as an example of the "calculation result".
  • The branch condition generation unit 107 generates the condition j based on the QUBO variables c. If the QUBO variables c do not satisfy the constraint, the branch condition generation unit 107 changes parameters related to the learning of the annealing machine and searches for the condition j again; after repeating the search a certain number of times without the constraint being satisfied, it may proceed to the next process without a condition j. The condition j for which c[j] = 1 is adopted as the condition used for the branch.
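  • To make the converted problem concrete, the following Python sketch evaluates the error function of Equation 5 over QUBO variables c, with the one-hot constraint of Equation 7 folded in as a quadratic penalty, and enumerates all assignments exhaustively in place of an annealing machine. This is an illustration under the assumption of a small condition set; a real conversion unit would instead expand Equation 6 into an explicit quadratic coefficient matrix for the QUBO solver.

```python
# A minimal stand-in for the QUBO calculation, assuming few candidate
# conditions so exhaustive enumeration is feasible; illustrative only.
import itertools
import numpy as np

def qubo_objective(c, X, y, penalty):
    # x[i] = 1 if sample i satisfies the selected condition(s)
    x = (X @ c).clip(0, 1)
    s1, s0 = x == 1, x == 0
    h = (s1.sum() * (y[s1] ** 2).sum() - y[s1].sum() ** 2
         + s0.sum() * (y[s0] ** 2).sum() - y[s0].sum() ** 2)  # Equation 5 form
    return h + penalty * (c.sum() - 1) ** 2                   # Equation 7 as penalty

def solve_by_enumeration(X, y, penalty=1e6):
    m = X.shape[1]
    candidates = (np.array(bits) for bits in itertools.product([0, 1], repeat=m))
    return min(candidates, key=lambda c: qubo_objective(c, X, y, penalty))

X = np.array([[1, 1, 0],   # conditioned explanatory variables X[i][j]
              [1, 0, 1],
              [0, 0, 1],
              [1, 1, 0]])
y = np.array([20.0, 22.0, 33.0, 10.0])
print(solve_by_enumeration(X, y))   # c with exactly one 1: the selected condition
```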
  • The condition determination unit 108 determines whether the condition j output from the QUBO problem calculation unit 106 is to be used in the estimator 12. First, the condition determination unit 108 uses the output condition j to calculate how the samples are divided and the prediction error at that time. The condition determination unit 108 then stores this information in the decision tree DB 114. When the condition determination unit 108 decides to use the condition j, the decision of whether to divide further may be repeated for each divided sample group.
  • The objective variable estimation unit 109 estimates the objective variable from explanatory-variable data using the trained estimator 12.
  • The database (DB) 11 comprises an explanatory variable database (DB) 110, an objective variable DB 111, a condition DB 112, a conditioned explanatory variable DB 113, a decision tree DB 114, and a learning parameter DB 115.
  • By inputting data including explanatory variables and objective variables at the input unit, the user can obtain an estimator that estimates the objective variable, or the estimation result of that estimator applied to the explanatory variables of new data.
  • FIG. 4 is a diagram showing an example data structure of the explanatory variable DB according to the first embodiment.
  • FIG. 5 is a diagram showing an example data structure of the objective variable DB according to the first embodiment.
  • The case of learning an estimator that estimates the daily juice sales at a certain store will be described as an example.
  • The explanatory variable DB 110 is a table that stores, as item values (column values), an ID 401 and, as examples of the "explanatory variables" of each sample, the temperature 402, the humidity 403, the day of the week 404, and the photo 405 of the front of the store on the previous day.
  • The ID 401 is an identifier that identifies the explanatory variables.
  • The temperature 402 is the Celsius temperature (degrees) around the store on that day.
  • The humidity 403 is the humidity (%) around the store on that day.
  • The day of the week 404 is the day of the week of that day at the store.
  • The photo 405 of the front of the store on the previous day is an image of the front of the store captured on the previous day.
  • Each row of the explanatory variable DB 110 and the objective variable DB 111 corresponds to a sample, and the two DBs are linked by the IDs 401 and 501.
  • The IDs 401 and 501 may be character strings as well as numbers. For example, for juice sales, the IDs 401 and 501 may be dates.
  • In the explanatory variable DB 110, each ID 401 is associated with the explanatory variables of one sample.
  • An explanatory variable may be a continuous numerical value such as the temperature 402 or the humidity 403, class information such as the day of the week 404, or image information such as the photo 405 of the front of the store on the previous day; the data format is not limited as long as it is associated with an ID 401.
  • Explanatory variables may also include speech, sentences, chemical formulas, and the like. Some of the explanatory variables may also be missing.
  • The objective variable DB 111 is a table that stores, as item values (column values), an ID 501 and the juice sales 502 as an example of the "objective variable" to be estimated.
  • The ID 501 is an identifier that identifies the objective variable.
  • The juice sales 502 is the number of bottles of juice sold at the store on that day. As an example, the juice sales 502 are "20 (bottles)", "22 (bottles)", and "33 (bottles)".
  • An objective variable is stored in the objective variable DB 111 for each ID 501.
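  • A small sketch of the two linked tables, assuming pandas; apart from the juice-sales figures, the values and column names below are illustrative stand-ins for FIGS. 4 and 5, not data from the patent.

```python
# Illustrative reconstruction of the linked tables of FIGS. 4 and 5.
import pandas as pd

explanatory = pd.DataFrame({
    "ID": [0, 1, 2],
    "temperature": [25.0, 21.5, 30.2],     # Celsius, that day (made-up values)
    "humidity": [40, 55, 60],              # percent, that day (made-up values)
    "day_of_week": ["Sunday", "Monday", "Sunday"],
})
objective = pd.DataFrame({"ID": [0, 1, 2], "juice_sales": [20, 22, 33]})

# One row per sample; IDs (dates would also work) join the two DBs.
dataset = explanatory.merge(objective, on="ID")
```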
  • FIG. 6 is a diagram showing an example data structure of the condition DB according to the first embodiment.
  • The condition DB 112 is a table that stores, as item values (column values), a condition ID 601 and a condition 602 as an example of the "branch conditions".
  • The condition ID 601 is an identifier that identifies a branch condition.
  • A condition 602 is a branch condition in the decision tree for estimating the objective variable from the explanatory variables. As an example, the conditions 602 are "temperature > 20 (degrees)", "temperature > 22 (degrees)", and "the day of the week is Sunday".
  • FIG. 7 is a diagram showing an example data structure of the conditional explanatory variable DB according to the first embodiment.
  • The conditioned explanatory variable DB 113 is a table that stores, as item values (column values), an ID 701, "temperature > 20 (condition 0)" 702, "temperature > 22 (condition 1)" 703, "the day of the week is Sunday (condition 2)" 704, and "a person exists in the image (condition 3)" 705.
  • The ID 701 is an identifier that identifies the conditioned explanatory variables.
  • "Temperature > 20 (condition 0)" 702 is a branch condition that the temperature around the store on that day is higher than 20 degrees.
  • "Temperature > 22 (condition 1)" 703 is a branch condition that the temperature around the store on that day is higher than 22 degrees.
  • "The day of the week is Sunday (condition 2)" 704 is a branch condition that the day of the week of that day at the store is Sunday.
  • "A person exists in the image (condition 3)" 705 is a branch condition that a person exists in the image captured in front of the store on the previous day.
  • Each column in FIG. 7 indicates with 0 and 1 whether each sample satisfies the condition: "1" is stored if satisfied and "0" if not.
  • The stored values need not be "1, 0" as long as it is known whether the condition is satisfied; for example, values such as "True" and "False" may be used.
  • FIG. 8 is a diagram showing an example data structure of the decision tree DB according to the first embodiment.
  • The decision tree DB 114 shown in the upper part of FIG. 8 stores the characteristics of the decision tree as it is being created.
  • The decision tree DB 114 is a table that stores, as item values (column values), a node ID 801, a parent node 802, the truth value 803 of the parent node's condition, a condition 804, a predicted value 805 for the true case, and a predicted value 806 for the false case.
  • Each condition 804 is managed as a node: a row records the ID 801 of the parent node, the truth value 803 of the parent node's condition under which this node is reached, the node's own condition, and the predicted value for each truth value of the node's condition. However, for the condition 804 used first, no parent node 802 or parent-node truth value 803 is stored; and when a condition 804 branches further, no predicted value is stored for that truth value.
  • The predicted value is the average value of the objective variable over each divided sample group.
  • As a condition for determining whether a split is used in the estimator, for example, the case where the number of samples in a group divided by the condition is equal to or less than a threshold is conceivable; alternatively, the case where the decrease in the prediction error is small, or where the depth of the decision tree exceeds a threshold.
  • These thresholds are stored in the learning parameter DB 115.
  • A decision tree based on the data stored in the decision tree DB 114 is shown in the lower part of FIG. 8.
  • In this decision tree, if the condition 804 "temperature > 22 (degrees)" at node ID 0 is true (YES), the tree proceeds to the condition 804 "the day of the week is Sunday" at node ID 1. If the condition 804 "temperature > 22 (degrees)" at node ID 0 is false (NO), the predicted value 806 for the false case is "10 (bottles)". If the condition 804 "the day of the week is Sunday" at node ID 1 is true (YES), the predicted value 805 for the true case is "120 (bottles)". If the condition 804 "the day of the week is Sunday" at node ID 1 is false (NO), the predicted value 806 for the false case is "90 (bottles)".
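  • The walkthrough above can be expressed as a small Python sketch. It is illustrative only: the node fields mirror the columns of FIG. 8 under assumed names, and the traversal follows child nodes by (parent ID, parent truth value) until it reaches a side with a stored predicted value.

```python
# Illustrative evaluation of the FIG. 8 decision tree; field names are assumed.
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Node:
    node_id: int
    parent: Optional[int]           # parent node ID (None for the root)
    parent_truth: Optional[bool]    # truth of the parent's condition on this path
    condition: Callable[[dict], bool]
    pred_true: Optional[float]      # predicted value 805 (true case, if a leaf side)
    pred_false: Optional[float]     # predicted value 806 (false case, if a leaf side)

tree = [
    Node(0, None, None, lambda s: s["temperature"] > 22, None, 10.0),
    Node(1, 0, True, lambda s: s["day_of_week"] == "Sunday", 120.0, 90.0),
]

def predict(tree, sample):
    node = tree[0]
    while True:
        truth = node.condition(sample)
        child = next((n for n in tree
                      if n.parent == node.node_id and n.parent_truth == truth), None)
        if child is None:   # no further branch on this side: return the stored value
            return node.pred_true if truth else node.pred_false
        node = child

print(predict(tree, {"temperature": 25, "day_of_week": "Sunday"}))  # 120.0 (bottles)
```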
  • FIG. 9 is a diagram showing an example data structure of a learning parameter DB according to the first embodiment.
  • The learning parameter DB 115 is a table that stores, as item values (column values), a minimum division parameter 901, a minimum prediction error decrease width 902, and a maximum decision tree depth 903.
  • As an example, the minimum division parameter 901 is "10", the minimum prediction error decrease width 902 is "0.01", and the maximum decision tree depth 903 is "5".
  • The parameters may be set by the user or may be fixed values; multiple parameter settings may also be tried.
  • FIG. 10 is a diagram showing the processing flow of the estimator learning device according to the first embodiment. The operation will be explained following the order of the processing flow.
  • Data including explanatory variables and objective variables are input to the input unit 101 (S1).
  • The explanatory variables input to the input unit 101 are stored in the explanatory variable DB 110, and the objective variables are stored in the objective variable DB 111.
  • The condition generation unit 103 generates the conditions used for branching the decision tree (S2).
  • The conditioned explanatory variable generation unit 104 generates the conditioned explanatory variables from the explanatory variables and stores them in the conditioned explanatory variable DB 113 (S3).
  • The QUBO problem conversion unit 105 converts the problem of searching for a branch condition that reduces the prediction error into a QUBO problem (S4).
  • The QUBO problem calculation unit 106 computes the QUBO problem converted by the QUBO problem conversion unit 105, and the branch condition generation unit 107 generates the condition used for branching (S5).
  • The condition determination unit 108 divides the data samples using the branch condition generated by the branch condition generation unit 107, determines whether the division is used in the estimator 12, and stores the determination result in the decision tree DB 114 (S6). If the determination result is true (S6: YES), the processing flow returns to S5 in order to further divide each divided sample group. If the determination result is false (when there is no sample group left to divide) (S6: NO), the condition determination unit 108 proceeds to the next step S7.
  • The output unit 102 outputs the characteristics of the decision tree stored in the decision tree DB 114; that is, the output unit 102 outputs the parameters obtained by learning (S7).
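  • Putting steps S2 through S7 together, the following Python sketch grows a tree recursively, reusing the hypothetical helpers from the earlier sketches (generate_conditions, conditioned_variables, solve_by_enumeration). The three stopping thresholds play the roles of the minimum division parameter, the minimum prediction error decrease width, and the maximum decision tree depth of FIG. 9; none of this is code from the patent.

```python
# Illustrative S2-S7 loop, assuming the helper sketches defined earlier.
import numpy as np

def grow(df, y, depth=0, min_leaf=10, min_gain=0.01, max_depth=5):
    conditions = generate_conditions(df)                   # S2: candidate conditions
    X = conditioned_variables(df, conditions).to_numpy()   # S3: 0/1 matrix X[i][j]
    c = solve_by_enumeration(X, y)                         # S4-S5: QUBO search
    j = int(np.argmax(c))                                  # selected condition
    mask = X[:, j] == 1
    n1, n0 = int(mask.sum()), int((~mask).sum())
    if min(n1, n0) < min_leaf or depth >= max_depth:       # S6: reject the split
        return {"leaf": float(y.mean())}
    gain = len(y) * y.var() - (n1 * y[mask].var() + n0 * y[~mask].var())
    if gain < min_gain:
        return {"leaf": float(y.mean())}
    return {                                               # S6: YES, recurse per group
        "condition": conditions[j][0],
        "true": grow(df[mask], y[mask], depth + 1, min_leaf, min_gain, max_depth),
        "false": grow(df[~mask], y[~mask], depth + 1, min_leaf, min_gain, max_depth),
    }
# S7 corresponds to outputting the nested dictionary built above.
```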
  • As described above, the estimator learning device trains the estimator 12, which searches for the branch conditions of a decision tree for estimating an objective variable from explanatory-variable data.
  • The estimator 12 includes a QUBO problem conversion unit 105, a QUBO problem calculation unit 106, and a branch condition generation unit 107.
  • The QUBO problem conversion unit 105 converts the prediction-error minimization problem in the branch-condition search into a QUBO problem.
  • The QUBO problem calculation unit 106 computes the QUBO problem converted by the QUBO problem conversion unit 105.
  • The branch condition generation unit 107 generates a branch condition based on the calculation result of the QUBO problem calculation unit 106. As a result, the estimation accuracy of the branch conditions of the decision tree can be improved.
  • FIG. 11 is a decision tree including conditions that can be expressed by a logical product, according to the second embodiment.
  • FIG. 12 is a diagram for explaining a method of expressing a logical product condition according to the second embodiment.
  • Embodiment 2 discloses an example applying a QUBO problem conversion unit 1005 different from that of Embodiment 1, in which the search covers not only single conditions but also conditions that can be expressed as a logical AND of conditions.
  • A branch using a condition expressible by a logical product is, as shown in FIG. 11, a branch such as "temperature > 30 and Sunday" whose condition is satisfied only when all of its constituent conditions, here "temperature > 30" and "the day of the week is Sunday", are satisfied.
  • The constituent conditions are not limited to two; any number of the conditions described in the condition DB 112 can be used.
  • A condition that can be expressed as a logical product of conditions in this way is called a logical product condition.
  • As shown in FIG. 12, a logical product condition is expressed by a vector indicating whether each condition is used. For example, the vector shown in FIG. 12 represents the condition "temperature > 30 and humidity > 50". The QUBO problem conversion unit 1005 therefore searches for such a vector.
  • The error function H in this search problem is represented by Equation 8 below.
  • KX is a matrix of QUBO variables that indicates, for sample i, the number of unsatisfied conditions among the conditions constituting the logical product condition.
  • K is the maximum number of conditions that may constitute the logical product condition.
  • The QUBO problem conversion unit 1005 generates a QUBO problem expressed by the error function and the three types of constraints described above.
  • The branch condition generation unit 107 sets the branch-condition search range to conditions generated from the table-format data divided at each branch, or to conditions expressed by a logical product of conditions. This makes it possible to widen the search range of branch conditions.
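  • To illustrate the representation, the short Python sketch below evaluates a logical product condition from its use-vector over the condition DB. It is an illustration only: it shows the indicator arithmetic that the count of unsatisfied conditions is built from, and does not reproduce the full Equation 8 QUBO with its auxiliary variables and three constraints.

```python
# Illustrative evaluation of a logical product condition from its use-vector.
import numpy as np

X = np.array([[1, 1, 0],   # X[i][j] = 1 if sample i satisfies condition j
              [1, 0, 1],
              [0, 1, 1]])
u = np.array([1, 0, 1])    # use conditions 0 and 2: "condition 0 AND condition 2"

# For each sample, count the selected-but-unsatisfied conditions
# (the quantity that the auxiliary QUBO variables track).
unsatisfied = (u * (1 - X)).sum(axis=1)
satisfies_conjunction = (unsatisfied == 0).astype(int)
print(satisfies_conjunction)   # [0 1 0]: only sample 1 satisfies both conditions
```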
  • The present invention is not limited to the above-described embodiments, and includes various modifications.
  • The above embodiments have been described in detail in order to explain the present invention in an easy-to-understand manner, and the invention is not necessarily limited to embodiments having all the described configurations.
  • It is possible to replace part of the configuration of one embodiment with the configuration of another embodiment, and it is also possible to add the configuration of another embodiment to the configuration of one embodiment.
  • Each of the above configurations may be partially or wholly implemented in hardware, or may be realized by executing a program on a processor.
  • Control lines and information lines are shown where considered necessary for explanation; not all control lines and information lines of a product are necessarily shown. In practice, almost all configurations may be considered to be interconnected.
  • The QUBO problem conversion units 105 and 1005 may convert the error minimization problem into a QUBO problem by weighting the error of each sample group, within the error to be minimized expressed as the sum of the errors of the sample groups of the table-format data divided at each branch, by the number of samples, by a value proportional to the number of samples, or by the output value of a function of a sample coefficient, or of the number of samples and a sample coefficient.
  • The branch condition generation unit 107 may create a new branch condition based on the decision tree obtained by the branch-condition search. This makes it possible to create deep decision trees.
  • The branch condition generation unit 107 may create a plurality of decision trees and combine the created decision trees to create a new decision tree. This makes it possible to further improve the estimation accuracy of the branch conditions of the decision tree.
  • An importance calculation unit that calculates the importance of a branch condition based on the calculation result of the QUBO problem calculation unit 106, and a display unit 102 that displays the importance calculated by the importance calculation unit, may be provided. This allows the user to determine the branch condition while checking the importance.
  • A display unit 102 that displays the importance of the conditions generated by the branch condition generation unit 107 may also be provided. This likewise allows the user to determine the branch condition while checking the importance.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

According to the present invention, the estimation accuracy of a branch condition of a decision tree is heightened. This estimator learning device 100 is for training an estimator 12 that searches for a branch condition of a decision tree for estimating a target variable from data of an explanatory variable. The estimator 12 comprises: a QUBO problem conversion unit 105 that converts a prediction error minimization problem in the search for the branch condition into a first problem that is a QUBO problem or is equivalent to a QUBO problem; a QUBO problem computation unit 106 that computes the first problem converted by the QUBO problem conversion unit 105; and a branch condition generation unit 107 that generates the branch condition on the basis of the computation result of the QUBO problem computation unit 106.

Description

Estimator learning device

The present invention relates to an estimator learning device.

The technology of estimating objective variables from explanatory-variable data is among the most basic technologies of machine learning and artificial intelligence. Such estimation techniques are utilized in many situations. For example, in the field of material development, experimenting with every combination (condition) of a plurality of materials in order to develop a material with a high value of a specific material property takes an enormous amount of time and money. If the material property values can be estimated in advance from the experimental conditions, experiments with low prospects can be omitted, enabling efficient material development. Here, it is desirable that the material property values be estimated with high accuracy. Decision trees and their derived algorithms are used for estimating objective variables from explanatory-variable data because of their high accuracy.

Ising machines are machines that can solve QUBO (Quadratic Unconstrained Binary Optimization) problems, that is, optimization problems over binary variables with a quadratic objective, and are used to solve combinatorial optimization problems. Therefore, if the problem of searching for a decision tree that minimizes the estimation error can be converted into a QUBO problem, the strengths of the Ising machine can be brought to bear on learning decision trees.

In this connection, Patent Document 1 discloses an Ising machine data input device and a method of inputting data to an Ising machine. The Ising machine data input device includes a conversion unit that performs a conversion process to convert an input expression in a format not suitable for input to the Ising machine into a suitable format. The conversion unit derives a mathematical expression and evaluates whether the derived expression satisfies a preset quality metric. The derived expression is input to the Ising machine when it is evaluated as satisfying the metric. When the derived expression is evaluated as not satisfying the metric, the conversion unit repeats the conversion process using a different input expression.

Patent Document 1: Japanese Patent Application Laid-Open No. 2021-2322

The technique of Patent Document 1 converts the input problem into a QUBO problem by repeating a conversion process that turns the input problem into a mathematically equivalent problem, and solves the converted QUBO problem with an Ising machine. However, the problem of searching for a decision tree that minimizes the estimation error cannot be converted into a QUBO problem by mathematically equivalent transformations alone.

The present invention has been made in view of the above problem, and its purpose is to provide a technique for increasing the estimation accuracy of decision trees.

To achieve this purpose, the present invention provides an estimator learning device for training an estimator that searches for branch conditions of a decision tree for estimating an objective variable from explanatory-variable data. The estimator comprises: a QUBO conversion unit that converts the prediction-error minimization problem in the branch-condition search into a first problem that is a QUBO problem or a problem equivalent to the QUBO problem; a QUBO calculation unit that computes the first problem converted by the QUBO conversion unit; and a branch condition generation unit that generates the branch condition based on the calculation result of the QUBO calculation unit.

According to the present invention, the estimation accuracy of the branch conditions of a decision tree can be improved.
Brief description of the drawings: FIG. 1 is a functional block diagram of the estimator learning device according to the first embodiment. FIG. 2 is a decision tree according to the first embodiment. FIG. 3 is a functional block diagram of the QUBO problem conversion unit according to the first embodiment. FIG. 4 is a diagram showing an example data structure of the explanatory variable DB according to the first embodiment. FIG. 5 is a diagram showing an example data structure of the objective variable DB according to the first embodiment. FIG. 6 is a diagram showing an example data structure of the condition DB according to the first embodiment. FIG. 7 is a diagram showing an example data structure of the conditioned explanatory variable DB according to the first embodiment. FIG. 8 is a diagram showing an example data structure of the decision tree DB according to the first embodiment. FIG. 9 is a diagram showing an example data structure of the learning parameter DB according to the first embodiment. FIG. 10 is a processing flow of the estimator learning device according to the first embodiment. FIG. 11 is a decision tree including conditions that can be expressed by a logical product, according to the second embodiment. FIG. 12 is a diagram explaining a method of expressing a logical product condition according to the second embodiment.
A specific example of the estimator learning device according to an embodiment of the present invention will be described below with reference to the drawings. Note that the present invention is not limited by the examples but is defined by the scope of the claims.

A first embodiment of the present invention will be described using FIG. 1.

FIG. 1 is a functional block diagram of the estimator learning device according to the first embodiment.

The estimator learning device 100 of the present invention comprises an interface 10, a database (DB) 11, and an estimator 12.

The interface 10 includes an input unit 101 and an output unit 102 as an example of a "display unit". Data including explanatory variables and objective variables are input to the input unit 101. The explanatory variables are stored in the explanatory variable DB 110 (FIG. 4), and the objective variables are stored in the objective variable DB 111 (FIG. 5). The output unit 102 outputs data to the outside.

The estimator 12 includes a condition generation unit 103, a conditioned explanatory variable generation unit 104, a QUBO problem conversion unit 105, a QUBO problem calculation unit 106, a branch condition generation unit 107, a condition determination unit 108, and an objective variable estimation unit 109.

FIG. 2 is a decision tree according to the first embodiment.

The condition generation unit 103 generates branch conditions. A branch condition is a condition used for branching the decision tree. A decision tree is a machine learning algorithm that estimates objective variables based on given explanatory variables, as shown in FIG. 2. In a decision tree, branches are created sequentially using explanatory variables so that the prediction error becomes small. Generally, a branch condition of a decision tree is a numerical condition on a single explanatory variable, such as "the temperature is higher than 30 degrees (temperature > 30)". However, the branch conditions of the present invention are not limited to such conditions; any condition that can return true or false from the explanatory variables of each sample may be used. For example, "the sum of temperature and humidity is 100 or more" and "a person appears in the image" are conceivable.

The condition generation unit 103 may let the user create branch conditions manually, or may create branch conditions automatically from the explanatory variables. For example, if an explanatory variable is a continuous quantity, conditions such as "temperature > the 1/5 quantile of temperature", "temperature > the 2/5 quantile of temperature", or "temperature > the 3/5 quantile of temperature" can be determined based on statistics of the explanatory variable. Parameters related to these statistics, such as the quantile granularity, may be selected by the user or determined automatically based on the sample size. If an explanatory variable is label data, conditions such as "the day of the week is Monday" or "the day of the week is not Monday" can be generated automatically. Conditions such as "the temperature data is missing" or "the number of missing explanatory variables is 5 or more" can also be generated automatically. If an explanatory variable is missing and a condition is difficult to evaluate, the sample may, for example, be uniformly judged as not satisfying the condition. The generated conditions are stored in the condition DB 112.

The conditioned explanatory variable generation unit 104 generates conditioned explanatory variables from the explanatory variables and stores them in the conditioned explanatory variable DB 113.

The QUBO problem conversion unit 105 converts the problem of searching for a branch condition that reduces the prediction error (the prediction-error minimization problem) into a QUBO (Quadratic Unconstrained Binary Optimization) problem as an example of the "first problem". Note that the QUBO problem conversion unit 105 may instead convert the prediction-error minimization problem into a problem equivalent to a QUBO problem.

FIG. 3 is a functional block diagram of the QUBO problem conversion unit.

The QUBO problem conversion unit 105 includes an error function generation unit 301 and a QUBO problem generation unit 302.
The error function generation unit 301 will be explained. As an error function serving as an index of the prediction error, there is the residual sum of squares, a residual being the error between the predicted value and the actual value. For a decision tree, it is represented by Equation 1 below.

$$J = \sum_{i \in S_1} \left(y[i] - \mathrm{pred}_1\right)^2 + \sum_{i \in S_0} \left(y[i] - \mathrm{pred}_0\right)^2 \tag{1}$$
J is the residual sum of squares, y[i] is the objective variable of sample i, S1 is the set of samples satisfying the condition, S0 is the set of samples not satisfying the condition, pred1 is the predicted value for the samples satisfying the condition, and pred0 is the predicted value for the samples not satisfying it. The pred1 and pred0 that minimize J are, respectively, the mean of y over the samples satisfying the condition and the mean of y over the samples not satisfying it. Therefore, the residual sum of squares J is represented by Equation 2 below.

$$J = N(S_1)\,\mathrm{Var}(S_1) + N(S_0)\,\mathrm{Var}(S_0) \tag{2}$$
Var(S) represents the variance of the set S, and N(S) represents the number of elements of the set S. In other words, the residual sum of squares is a value obtained by weighting the variance of each sample group divided by the condition by the number of samples in that group. Transforming Equation 2 yields Equation 3 below.

$$J = \sum_{i \in S} y[i]^2 - \frac{\left(\sum_{i \in S_1} y[i]\right)^2}{N(S_1)} - \frac{\left(\sum_{i \in S_0} y[i]\right)^2}{N(S_0)} \tag{3}$$
However, since N(S1) and N(S0) appear in the denominators of Equation 3, it cannot be converted into a QUBO problem as it stands.

Therefore, the QUBO problem conversion unit 105 converts the residual sum of squares J into a QUBO problem by adjusting the weight applied to the variance of each sample group. For example, weighting is performed not by the number of samples but by the square of the number of samples, as in Equation 4 below. However, as long as N(S1) and N(S0) can be eliminated from the denominators, the weight need not be the square of the number of samples; it may be, for example, the third or fourth power of the number of samples, or the square of the ratio of the number of samples.

$$H = N(S_1)^2\,\mathrm{Var}(S_1) + N(S_0)^2\,\mathrm{Var}(S_0) \tag{4}$$
Transforming Equation 4 yields Equation 5 below, in which N(S1) and N(S0) disappear from the denominators.

$$H = N(S_1)\sum_{i \in S_1} y[i]^2 - \left(\sum_{i \in S_1} y[i]\right)^2 + N(S_0)\sum_{i \in S_0} y[i]^2 - \left(\sum_{i \in S_0} y[i]\right)^2 \tag{5}$$
The error function H is the residual sum of squares J with its weighting changed; it correlates strongly with J and has a form that can be converted into a QUBO problem. Therefore, a branch condition that reduces the error function H is a branch condition that reduces the residual sum of squares J.

The QUBO problem generation unit 302 will be explained. The QUBO problem generation unit 302 determines the conditions to be searched and the data to be input to the QUBO problem calculation unit 106. As the conditions to be searched, for example, the conditions (temperature > 20, and so on) corresponding to the columns of the conditioned explanatory variables (FIG. 7), described later, are conceivable.
The generated QUBO problem will now be described. A QUBO problem is expressed by an error function to be minimized, written in QUBO variables taking the value 0 or 1, and by one or more constraints that the QUBO variables must satisfy. Writing the indicator of sample i as $x[i] = \sum_{j \in C} c[j]\,X[i][j]$ and substituting it into Equation 5, the error function is represented by Equation 6 below.

$$H = \Bigl(\sum_{i \in S} x[i]\Bigr)\Bigl(\sum_{i \in S} x[i]\,y[i]^2\Bigr) - \Bigl(\sum_{i \in S} x[i]\,y[i]\Bigr)^2 + \Bigl(\sum_{i \in S} (1 - x[i])\Bigr)\Bigl(\sum_{i \in S} (1 - x[i])\,y[i]^2\Bigr) - \Bigl(\sum_{i \in S} (1 - x[i])\,y[i]\Bigr)^2 \tag{6}$$
S is the set of all samples, X[i][j] is the conditioned explanatory variable of condition j for sample i, C is the set of conditions, and c is a vector of QUBO variables expressing whether each condition is used; c[j] = 1 indicates that condition j is used in the branch. The condition to be used must be narrowed down to exactly one, which is represented by the constraint of Equation 7 below.

$$\sum_{j \in C} c[j] = 1 \tag{7}$$
The QUBO problem conversion unit 105 outputs the error function and the constraint calculated as described above.

The QUBO problem calculation unit 106 computes the QUBO problem. The QUBO problem calculation unit 106 (also called an annealing machine) may be, for example, a quantum annealing machine that exploits quantum-mechanical properties, a coherent Ising machine that uses the characteristics of light, or a digital annealer built from digital circuits using CMOS or FPGA. The QUBO problem calculation unit 106 outputs the QUBO variables c as an example of the "calculation result".

The branch condition generation unit 107 generates the condition j based on the QUBO variables c. If the QUBO variables c do not satisfy the constraint, the branch condition generation unit 107 changes parameters related to the learning of the annealing machine and searches for the condition j again. After repeating the search for the condition j a certain number of times without the constraint being satisfied, it may proceed to the next process without a condition j. Then, the condition j for which c[j] = 1 is adopted as the condition used for the branch.

The condition determination unit 108 determines whether the condition j output from the QUBO problem calculation unit 106 is to be used in the estimator 12. First, the condition determination unit 108 uses the output condition j to calculate how the samples are divided and the prediction error at that time. The condition determination unit 108 then stores this information in the decision tree DB 114. When the condition determination unit 108 decides to use the condition j, the decision of whether to divide further may be repeated for each divided sample group.

The objective variable estimation unit 109 estimates the objective variable from explanatory-variable data using the trained estimator 12.

The database (DB) 11 comprises an explanatory variable database (DB) 110, an objective variable DB 111, a condition DB 112, a conditioned explanatory variable DB 113, a decision tree DB 114, and a learning parameter DB 115. By inputting data including explanatory variables and objective variables at the input unit, the user can obtain an estimator that estimates the objective variable, or the estimation result of that estimator applied to the explanatory variables of new data.
FIG. 4 is a diagram showing an example data structure of the explanatory variable DB according to the first embodiment. FIG. 5 is a diagram showing an example data structure of the objective variable DB according to the first embodiment. Here, the case of learning an estimator that estimates the daily juice sales at a certain store will be described as an example.

The explanatory variable DB 110 is a table that stores, as item values (column values), an ID 401 and, as examples of the "explanatory variables" of each sample, the temperature 402, the humidity 403, the day of the week 404, and the photo 405 of the front of the store on the previous day. The ID 401 is an identifier that identifies the explanatory variables. The temperature 402 is the Celsius temperature (degrees) around the store on that day. The humidity 403 is the humidity (%) around the store on that day. The day of the week 404 is the day of the week of that day at the store. The photo 405 of the front of the store on the previous day is an image of the front of the store captured on the previous day.

Each row of the explanatory variable DB 110 and the objective variable DB 111 corresponds to a sample, and the two DBs are linked by the IDs 401 and 501. The IDs 401 and 501 may be character strings as well as numbers. For example, for juice sales, the IDs 401 and 501 may be dates.

In the explanatory variable DB 110, each ID 401 is associated with the explanatory variables of one sample. An explanatory variable may be a continuous numerical value such as the temperature 402 or the humidity 403, class information such as the day of the week 404, or image information such as the photo 405 of the front of the store on the previous day; the data format is not limited as long as it is associated with an ID 401. Explanatory variables may also include speech, sentences, chemical formulas, and the like. Some of the explanatory variables may also be missing.

The objective variable DB 111 is a table that stores, as item values (column values), an ID 501 and the juice sales 502 as an example of the "objective variable" to be estimated. The ID 501 is an identifier that identifies the objective variable. The juice sales 502 is the number of bottles of juice sold at the store on that day. As an example, the juice sales 502 are "20 (bottles)", "22 (bottles)", and "33 (bottles)". An objective variable is stored in the objective variable DB 111 for each ID 501.
 図6は、実施形態1に係る条件DBのデータ構造例を示す図である。 FIG. 6 is a diagram showing an example data structure of the condition DB according to the first embodiment.
 条件DB112は、項目値(カラム値)として、条件ID601と、「分岐条件」の一例としての条件602とを格納するテーブルである。条件ID601は、分岐条件を特定する識別子である。条件602は、説明変数から目的変数を推定する決定木における分岐条件である。一例として、条件602は、「気温>20(度)」、「気温>22(度)」、「曜日が日曜日」である。 The condition DB 112 is a table that stores condition IDs 601 as item values (column values) and conditions 602 as an example of "branch conditions". A condition ID 601 is an identifier that identifies a branch condition. A condition 602 is a branching condition in the decision tree for estimating the objective variable from the explanatory variables. As an example, the condition 602 is "Temperature>20 (degrees)", "Temperature>22 (degrees)", and "Day of the week is Sunday".
 図7は、実施形態1に係る条件化説明変数DBのデータ構造例を示す図である。 FIG. 7 is a diagram showing an example data structure of the conditional explanatory variable DB according to the first embodiment.
 条件化説明変数DB113は、項目値(カラム値)として、ID701と、「気温>20(条件0)」702と、「気温>22(条件1)」703と、「曜日が日曜(条件2)」704と、「画像に人が存在(条件3)」705とを格納するテーブルである。ID501は、条件化説明変数を特定する識別子である。「気温>20(条件0)」502は、ある店舗の周囲における当日の気温が20度よりも高い分岐条件である。「気温>22(条件1)」503は、ある店舗の周囲における当日の気温が22度よりも高い分岐条件である。「曜日が日曜(条件2)」504は、ある店舗における当日の曜日が日曜日である分岐条件である。「画像に人が存在(条件3)」505は、ある店舗における前日の店舗の前を撮像した画像に人が存在する分岐条件である。 The conditional explanatory variable DB 113 has, as item values (column values), an ID 701, "Temperature > 20 (Condition 0)" 702, "Temperature > 22 (Condition 1)" 703, and "Day of the week is Sunday (Condition 2) 704 and 'A person exists in the image (Condition 3)' 705. The ID 501 is an identifier that identifies a conditional explanatory variable. “Temperature>20 (condition 0)” 502 is a branching condition that the temperature around a certain store on the day is higher than 20 degrees. "Temperature>22 (Condition 1)" 503 is a branching condition that the temperature around a certain store on the day is higher than 22 degrees. “Day of the week is Sunday (Condition 2)” 504 is a branching condition that the current day of the week is Sunday at a store. “People exist in image (Condition 3)” 505 is a branching condition that a person exists in an image captured in front of a shop on the previous day.
 Each column in FIG. 7 indicates with 0 or 1 whether each sample satisfies the condition: "1" is stored if the condition is satisfied and "0" if it is not. The stored values need not be 1 and 0, however, as long as they indicate whether the condition is satisfied; "true/false" or "True/False" may be used instead.
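 For illustration only, the table in FIG. 7 can be viewed as a binary matrix computed from the raw explanatory variables. The sketch below assumes hypothetical pandas column names (temperature, weekday, person_in_image) that are not part of the publication:

```python
import pandas as pd

# Hypothetical raw explanatory variables, one row per sample ID.
X = pd.DataFrame({
    "temperature": [21.0, 23.5, 19.0],
    "weekday": ["Sun", "Mon", "Sun"],
    "person_in_image": [True, False, True],
})

# Conditionalized explanatory variables: one 0/1 column per branch condition,
# mirroring the layout of the conditional explanatory variable DB 113.
cond = pd.DataFrame({
    "cond0_temp_gt_20": (X["temperature"] > 20).astype(int),
    "cond1_temp_gt_22": (X["temperature"] > 22).astype(int),
    "cond2_is_sunday":  (X["weekday"] == "Sun").astype(int),
    "cond3_person":     X["person_in_image"].astype(int),
})
print(cond)
```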
 FIG. 8 is a diagram showing an example of the data structure of the decision tree DB according to the first embodiment.
 The decision tree DB 114, shown in the upper part of FIG. 8, stores the characteristics of the decision tree as it is built. It is a table that stores, as column values, a node ID 801, a parent node 802, the truth value of the parent node's condition 803, a condition 804, a predicted value when true 805, and a predicted value when false 806.
 Each condition 804 is managed as a node: the table records the ID of the node's parent node 802, whether the node hangs off the true or the false branch of the parent node's condition 803, the node's own condition 804, and a predicted value for each truth value of that condition 805, 806. The condition 804 applied first (the root) has no parent node 802 and no parent-condition truth value 803. Likewise, when a branch of a condition 804 is split further, no predicted value is stored for that branch.
 A predicted value is the average of the objective variable over the samples routed to that branch. Conditions for deciding whether to adopt a split in the estimator include, for example, the number of samples on either side of the split falling to a threshold or below, the reduction in prediction error being too small, or the depth of the decision tree exceeding a threshold. These thresholds are stored in the learning parameter DB 115.
 The lower part of FIG. 8 shows the decision tree built from the data stored in the decision tree DB 114. In this tree, if the condition 804 "temperature > 22 (degrees)" at node ID 0 is true (YES), evaluation proceeds to the condition 804 "the day of the week is Sunday" at node ID 1; if it is false (NO), the predicted value when false 806 is 10 (bottles). If the condition 804 at node ID 1 is true (YES), the predicted value when true 805 is 120 (bottles); if it is false (NO), the predicted value when false 806 is 90 (bottles).
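 As a minimal sketch of how the node records in FIG. 8 could be evaluated (the dictionary layout and field names below are our assumptions, not columns defined by the publication):

```python
# Hypothetical in-memory rendering of the decision tree DB 114 in FIG. 8.
# The false branch of node 0 predicts directly; its true branch continues to node 1.
NODES = {
    0: {"cond": lambda s: s["temperature"] > 22, "true_node": 1,
        "pred_true": None, "pred_false": 10},
    1: {"cond": lambda s: s["weekday"] == "Sun", "true_node": None,
        "pred_true": 120, "pred_false": 90},
}

def predict(sample, node_id=0):
    """Follow branch conditions from the root until a predicted value is reached."""
    node = NODES[node_id]
    if node["cond"](sample):
        if node["pred_true"] is not None:
            return node["pred_true"]                # leaf on the true branch
        return predict(sample, node["true_node"])   # true branch splits further
    return node["pred_false"]                       # false branches are leaves here

print(predict({"temperature": 25, "weekday": "Sun"}))  # -> 120
```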
 FIG. 9 is a diagram showing an example of the data structure of the learning parameter DB according to the first embodiment.
 The learning parameter DB 115 is a table that stores, as column values, a minimum split parameter 901, a minimum prediction error reduction 902, and a maximum decision tree depth 903. As an example, the minimum split parameter 901 is 10, the minimum prediction error reduction 902 is 0.01, and the maximum decision tree depth 903 is 5. These parameters may be set by the user or fixed in advance; alternatively, several parameter settings may be tried.
 FIG. 10 is a diagram showing the processing flow of the estimator learning device according to the first embodiment. The system configuration is described below in the order of the processing flow.
 Data including explanatory variables and objective variables are input to the input unit 101 (S1). The explanatory variables input to the input unit 101 are stored in the explanatory variable DB 110, and the objective variables are stored in the objective variable DB 111. Next, the condition generation unit 103 generates the conditions used for the branches of the decision tree (S2).
 Next, the conditional explanatory variable generation unit 104 generates conditional explanatory variables from the explanatory variables and stores them in the conditional explanatory variable DB 113 (S3).
 Next, the QUBO problem conversion unit 105 converts the problem of searching for branch conditions that reduce the prediction error into a QUBO problem (S4).
 Next, the QUBO problem calculation unit 106 solves the QUBO problem produced by the QUBO problem conversion unit 105, and the branch condition generation unit 107 generates the condition used for the branch (S5).
 Next, the condition determination unit 108 splits the data samples using the branch condition generated by the branch condition generation unit 107, determines whether the split is to be used in the estimator 12, and stores the result in the decision tree DB 114 (S6). If the determination is true (S6: YES), the flow returns to S5 so that each resulting sample group can be split further. When the determination is false, that is, once no sample group remains to be split (S6: NO), the condition determination unit 108 proceeds to the next step S7.
 Finally, the output unit 102 outputs the characteristics of the decision tree stored in the decision tree DB 114; that is, it outputs the parameters obtained by learning (S7).
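 To make the S1-S7 loop concrete, here is a runnable toy version in which the QUBO conversion and Ising-machine step (S4-S5) is replaced by brute-force enumeration of single conditions; this is a sketch of the control flow only, not the claimed method:

```python
import numpy as np

def group_error(y):
    # Squared error of predicting the group mean (the objective variable average).
    return float(((y - y.mean()) ** 2).sum()) if len(y) else 0.0

def best_condition(C, y, idx):
    # Stand-in for S4-S5: brute force here; cast as a QUBO problem in the embodiment.
    errs = [group_error(y[idx][C[idx, j] == 1]) + group_error(y[idx][C[idx, j] == 0])
            for j in range(C.shape[1])]
    return int(np.argmin(errs))

def train(C, y, idx=None, depth=0, max_depth=2):
    idx = np.arange(len(y)) if idx is None else idx
    if depth >= max_depth or len(idx) <= 1:          # S6: stop splitting
        return {"predict": float(y[idx].mean())}
    j = best_condition(C, y, idx)                    # S5: branch condition
    t, f = idx[C[idx, j] == 1], idx[C[idx, j] == 0]
    if len(t) == 0 or len(f) == 0:
        return {"predict": float(y[idx].mean())}
    return {"cond": j,                               # stored in decision tree DB 114
            "true": train(C, y, t, depth + 1, max_depth),
            "false": train(C, y, f, depth + 1, max_depth)}

# Toy data shaped like the conditional explanatory variable DB 113 (S1-S3).
C = np.array([[1, 0, 1, 1], [1, 1, 0, 0], [0, 0, 1, 1]])
y = np.array([20.0, 22.0, 33.0])
print(train(C, y))                                   # S7: learned tree parameters
```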
 According to this configuration, the estimator learning device trains an estimator 12 that searches for the branch conditions of a decision tree for estimating an objective variable from explanatory variable data, and the estimator 12 comprises a QUBO problem conversion unit 105, a QUBO problem calculation unit 106, and a branch condition generation unit 107. The QUBO problem conversion unit 105 converts the prediction-error minimization problem arising in the branch condition search into a QUBO problem, the QUBO problem calculation unit 106 solves the converted QUBO problem, and the branch condition generation unit 107 generates a branch condition based on the calculation result. This improves the accuracy with which the branch conditions of the decision tree are estimated.
 A specific example of the estimator learning device according to Embodiment 2 of the present invention is now described with reference to the drawings. The present invention is not limited by these examples but is defined by the scope of the claims.
 FIG. 11 shows a decision tree containing a condition that can be expressed as a logical product, according to the first embodiment. FIG. 12 is a diagram explaining how logical product conditions are expressed in the second embodiment.
 Embodiment 2 discloses an example that applies a QUBO problem conversion unit 1005 different from that of Embodiment 1, so that the search covers not only single conditions but also conditions expressible as a logical product of conditions.
 A branch using a condition expressible as a logical product, such as "temperature > 30 and Sunday" in FIG. 11, tests whether all of its constituent conditions are satisfied, in this case "temperature > 30" and "the day of the week is Sunday". The number of constituent conditions is not limited to two; any number of the conditions stored in the condition DB 112 may be combined. A condition expressible as such a logical product of conditions is called a logical product condition.
 A logical product condition is therefore represented not by a condition ID but, as shown in FIG. 12, by a vector indicating whether each condition is used; in the case of FIG. 12, the resulting condition is "temperature > 30 and humidity > 50". The QUBO problem conversion unit accordingly searches for this vector.
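 As a small illustration (the variable names are our assumptions), a logical product condition over the 0/1 condition matrix can be represented by such a vector and evaluated as follows:

```python
import numpy as np

# 0/1 condition matrix: rows are samples, columns are conditions (DB 113).
C = np.array([[1, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 0, 1, 1]])

# sc[j] = 1 means condition j participates in the logical product condition,
# here "condition 0 AND condition 2".
sc = np.array([1, 0, 1, 0])

# A sample satisfies the conjunction iff none of the selected conditions fail,
# i.e. the count of unsatisfied selected conditions is zero (cf. KX[i][0] = 1).
unsatisfied = (sc * (1 - C)).sum(axis=1)
print((unsatisfied == 0).astype(int))  # -> [1 0 0]
```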
 The error function H in this search problem is represented by Equation 8 below.
 [Equation 8]
 KX is a matrix of QUBO variables that indicates, for each sample i, how many of the conditions making up the logical product condition are not satisfied. KX[i][k] = 1 indicates that exactly k of those conditions are unsatisfied for sample i. In particular, KX[i][0] = 1 indicates that sample i satisfies every condition making up the logical product condition, i.e., the logical product condition is true for sample i; conversely, KX[i][0] = 0 indicates that the logical product condition is false for sample i.
 There are multiple constraints in this QUBO problem. First, the following two constraints (1) and (2) must hold for every sample i.
 [Equation 9: constraints (1) and (2)]
 sc is the QUBO variable that represents the logical product condition; sc[j] = 1 indicates that condition j is used in the logical product condition. K is the maximum number of conditions that may make up a logical product condition.
 In addition, the following constraint (3) must also hold.
 [Equation 10: constraint (3)]
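 Equations 8 to 10 appear only as images in the publication, so their exact form is not reproduced here. A plausible reconstruction of the three constraints, consistent with the definitions of KX, sc, and K above but offered only as an assumption, is:

```latex
% (1) One-hot encoding: sample i fails exactly one count k of selected conditions.
\sum_{k=0}^{K} KX[i][k] = 1
% (2) Consistency with the condition matrix, where c_{ij} \in \{0,1\} records
%     whether sample i satisfies condition j.
\sum_{k=0}^{K} k \, KX[i][k] = \sum_{j} sc[j] \, (1 - c_{ij})
% (3) At most K conditions participate in the logical product condition.
\sum_{j} sc[j] \le K
```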
 The QUBO problem conversion unit 1005 generates the QUBO problem expressed by the error function and the three types of constraints described above.
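 Equality constraints of this kind are commonly folded into a QUBO objective with quadratic penalty terms, H_total = H + λ Σ (violation)²; this is a general technique, not a step stated in the publication. A minimal check of the one-hot penalty corresponding to constraint (1):

```python
import itertools
import numpy as np

def one_hot_penalty(x):
    """Quadratic penalty (sum_k x_k - 1)^2: zero exactly when one bit is set."""
    return (np.sum(x) - 1) ** 2

# Enumerate all 3-bit assignments: the penalty vanishes only on one-hot vectors,
# which is what a QUBO constraint term must do.
for bits in itertools.product([0, 1], repeat=3):
    print(bits, one_hot_penalty(np.array(bits)))
```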
 According to this configuration, the branch condition generation unit 107 sets the search range of a branch condition to the conditions generated from the table-format data divided at each branch, or to conditions expressed as a logical product of those conditions. This widens the search range for branch conditions.
 The present invention is not limited to the embodiments described above and includes various modifications. For example, the embodiments have been described in detail to explain the invention clearly, and the invention is not necessarily limited to configurations having all of the described elements. Part of the configuration of one embodiment can be replaced with that of another, and the configuration of one embodiment can be added to that of another. For part of the configuration of each embodiment, other configurations can be added, removed, or substituted.
 Each of the configurations above may be implemented partly or wholly in hardware, or realized by a processor executing a program. The control and information lines shown are those considered necessary for the explanation; not all control and information lines of a product are necessarily shown. In practice, almost all components may be considered interconnected.
 For example, the QUBO problem conversion units 105 and 1005 may convert the error minimization problem into a QUBO problem by weighting the error of each sample group within the total error to be minimized, which is expressed as the sum of the errors of the sample groups of the table-format data divided at each branch. The weight may be the number of samples in the group, a value proportional to that number, or the output value of a function expressed by a sample coefficient or by the number of samples and a sample coefficient. Because the minimized error becomes smaller, the accuracy of estimating the branch conditions of the decision tree can be further improved.
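 Written out under one reading of the options above (the notation is ours, not the publication's), the weighted objective could take the form:

```latex
H \;=\; \sum_{g} w_g \, E_g,
\qquad w_g \in \{\, n_g,\;\; \alpha\, n_g,\;\; f(n_g, \alpha) \,\}
% E_g: error of sample group g;  n_g: its sample count;  \alpha: a sample coefficient.
```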
 The branch condition generation unit 107 may create new branch conditions based on the decision tree obtained by the branch condition search. This makes it possible to build deeper decision trees.
 The branch condition generation unit 107 may also create a plurality of decision trees and combine them into a new decision tree, which can further improve the accuracy of estimating the branch conditions of the decision tree.
 An importance calculation unit that calculates the importance of a branch condition based on the calculation result of the QUBO problem calculation unit 106, and a display unit 102 that displays the importance calculated by the importance calculation unit, may also be provided. This allows the user to decide on branch conditions while checking their importance.
 A display unit 102 that displays the importance of the conditions generated by the branch condition generation unit 107 may also be provided. This likewise allows the user to decide on branch conditions while checking their importance.
 12…Estimator, 100…Estimator learning device, 102…Output unit, 105…QUBO problem conversion unit, 106…QUBO problem calculation unit, 107…Branch condition generation unit, 109…Objective variable estimation unit

Claims (9)

  1.  An estimator learning device that trains an estimator that searches for branch conditions of a decision tree for estimating an objective variable from explanatory variable data, wherein the estimator comprises:
     a QUBO problem conversion unit that converts the prediction error minimization problem in the branch condition search into a QUBO problem or into a first problem equivalent to the QUBO problem;
     a QUBO problem calculation unit that solves the first problem produced by the QUBO problem conversion unit; and
     a branch condition generation unit that generates the branch condition based on the calculation result of the QUBO problem calculation unit.
  2.  The estimator learning device according to claim 1, wherein the QUBO problem conversion unit converts the error minimization problem into the first problem by weighting the error of each sample group, within the error to be minimized that is expressed as the sum of the errors of the sample groups of the table-format data divided at each branch, by the number of samples in the sample group, by a value proportional to the number of samples, or by the output value of a function expressed by a sample coefficient or by the number of samples and the sample coefficient.
  3.  The estimator learning device according to claim 1, wherein the branch condition generation unit sets the search range of the branch condition to conditions generated from the table-format data divided at each branch, or to conditions expressed as a logical product of those conditions.
  4.  The estimator learning device according to claim 1, wherein the branch condition generation unit creates a new branch condition based on the decision tree obtained by searching for the branch condition.
  5.  The estimator learning device according to claim 4, wherein the branch condition generation unit creates a plurality of the decision trees and combines the created decision trees to create a new decision tree.
  6.  The estimator learning device according to claim 1, comprising an objective variable estimation unit that estimates the objective variable using the trained estimator.
  7.  The estimator learning device according to claim 1, wherein the QUBO problem calculation unit is an Ising machine.
  8.  The estimator learning device according to claim 1, comprising: an importance calculation unit that calculates the importance of the branch condition based on the calculation result of the QUBO problem calculation unit; and a display unit that displays the importance calculated by the importance calculation unit.
  9.  The estimator learning device according to claim 2, comprising a display unit that displays the importance of the conditions generated by the branch condition generation unit.
PCT/JP2022/048176 2022-02-03 2022-12-27 Estimator learning device WO2023149138A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2022015734A JP2023113393A (en) 2022-02-03 2022-02-03 estimator learning device
JP2022-015734 2022-02-03

Publications (1)

Publication Number Publication Date
WO2023149138A1 true WO2023149138A1 (en) 2023-08-10

Family

ID=87552279

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/048176 WO2023149138A1 (en) 2022-02-03 2022-12-27 Estimator learning device

Country Status (2)

Country Link
JP (1) JP2023113393A (en)
WO (1) WO2023149138A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10222370A (en) * 1997-02-06 1998-08-21 Kokusai Denshin Denwa Co Ltd <Kdd> System for generating decision tree in database
JP2004157814A (en) * 2002-11-07 2004-06-03 Fuji Electric Holdings Co Ltd Decision tree generating method and model structure generating device
WO2019189249A1 (en) * 2018-03-29 2019-10-03 日本電気株式会社 Learning device, learning method, and computer-readable recording medium
US20190392332A1 (en) * 2018-06-25 2019-12-26 Tmaxsoft Co., Ltd Computer Program Stored in Computer Readable Medium and Database Server Transforming Decision Table Into Decision Tree
JP2020030699A (en) * 2018-08-23 2020-02-27 株式会社リコー Leaning device and leaning method


Also Published As

Publication number Publication date
JP2023113393A (en) 2023-08-16


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22925040

Country of ref document: EP

Kind code of ref document: A1