JP2015060237A

JP2015060237A - Prediction model learning device, prediction model learning method, and computer program

Info

Publication number: JP2015060237A
Application number: JP2013191271A
Authority: JP
Inventors: 優輔村岡; Yusuke Muraoka; 幸貴楠村; Yukitaka Kusumura; 弘紀水口; Hiroki Mizuguchi
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2013-09-17
Filing date: 2013-09-17
Publication date: 2015-03-30
Anticipated expiration: 2033-09-17
Also published as: JP6201556B2

Abstract

PROBLEM TO BE SOLVED: To provide technology for generating a model capable of highly accurate prediction even when some of the explanatory variables included in training data are missing. SOLUTION: A model is machine-learned on the basis of training data in which samples, each a pair of an objective variable and an explanatory variable vector, are collected, the model using a plurality of prediction models set respectively for the groups into which the samples of the training data are divided. In this learning, a use rate calculation unit 14 calculates, using estimated parameters, the use rate of each prediction model constituting the model to be output for a missing pattern, which indicates which components of an explanatory variable vector are missing. An estimation unit 13 estimates the parameters of each prediction model by using the use rate of each prediction model for the missing pattern. An instruction unit 15 causes the process performed by the use rate calculation unit 14, and the process performed by the estimation unit 13 using the use rates so calculated, to be repeated alternately.

Description

The present invention relates to a technique for predicting prediction target data on the basis of available data.

Predicting the future on the basis of available data is useful for improving business operations. For example, if a store can predict the sales of a product from sales data for the most recent two weeks, the store can manage the inventory of that product appropriately. Likewise, if a sales office can predict which sales methods raise the likelihood of winning an order, by analyzing the relationship between orders and the sales methods recorded in business records such as daily sales reports, the sales office can improve its order rate.

Here, data that serves as a clue for prediction (for example, actual sales data or the sales methods that were carried out) is referred to as an explanatory variable. Data to be predicted (for example, the sales of a product or the order status to be predicted) is referred to as an objective variable. A function that yields an objective variable (predicted value) when an explanatory variable (data) is substituted (input) is referred to as a model or a prediction function. A set of combinations of an explanatory variable and an objective variable, each combination being past data (a sample), is referred to as training data. Machine learning is used as a technique for creating a model (a function that outputs an objective variable from explanatory variables) on the basis of this training data.

In such machine learning, some of the explanatory variables in the training data may be missing. Specifically, for example, if product A was not on display in the store during a certain time period, the sales of product A for that period are missing. Similarly, if there is a day on which the daily sales report was not written, the data for that day is missing. When a model (prediction function) is machine-learned from training data in which some explanatory variables are missing, one method that may be adopted is to use the average value of an explanatory variable in place of its missing values. Another method machine-learns the model (prediction function) by using, in place of a missing explanatory variable, a value predicted from the other explanatory variables.

With such methods, however, the assumed value (substitute value) used for a missing explanatory variable may deviate greatly from the true value, so a highly accurate model may not be obtained. Using an inaccurate model lowers the accuracy of prediction.

Non-Patent Document 1 discloses a method of machine-learning a model (prediction function) when some of the explanatory variables in the training data are missing. In the method described in Non-Patent Document 1, a machine-learning device (computer) detects which explanatory variables are missing in the training data and gives the same label to samples (past data, each a combination of explanatory variables and an objective variable) that have the same missing explanatory variables. The device then outputs (generates) a model by performing machine learning that uses, as training data, only the set of samples having the same label.

Maytal Saar-Tsechansky and Foster Provost, "Handling Missing Values when Applying Classification Models," Journal of Machine Learning Research, 8 (2007), 1625-1657

However, when the missing explanatory variable contributes little to the objective variable, a model generated by the method of Non-Patent Document 1 may have poor accuracy. This is because the method of Non-Patent Document 1 divides the training data on the basis of the missing state of the explanatory variables and does not take into account the degree to which each explanatory variable contributes to the objective variable.

The present invention has been made to solve the above problem. That is, the main object of the present invention is to provide a machine learning technique that can generate a model (prediction function) enabling highly accurate prediction even when some of the explanatory variables included in the training data (past data) are missing.

In order to achieve the above object, a prediction model learning device of the present invention includes:
a usage rate calculation unit that, when machine-learning an output target model that uses a plurality of prediction models set respectively for the groups into which the samples of training data are divided, the training data being a collection of samples each of which is a pair of an objective variable and an explanatory variable vector, calculates, using estimated parameters of the prediction models, the usage rate of each prediction model constituting the output target model for a missing pattern indicating the missing state of components in the explanatory variable vector;
an estimation unit that estimates the parameters of each prediction model using the usage rate of each prediction model for the missing pattern; and
a command unit that controls a process of alternately repeating a process in which the estimation unit estimates the parameters of each prediction model using the usage rate of each prediction model for the missing pattern calculated by the usage rate calculation unit, and a process in which the usage rate calculation unit calculates the usage rate of each prediction model for the missing pattern using the parameters of each prediction model estimated by the estimation unit.

A prediction model learning method of the present invention includes:
calculating, by a computer, when machine-learning an output target model that uses a plurality of prediction models set respectively for the groups into which the samples of training data are divided, the training data being a collection of samples each of which is a pair of an objective variable and an explanatory variable vector, the usage rate of each prediction model constituting the output target model for a missing pattern indicating the missing state of components in the explanatory variable vector, using estimated parameters of the prediction models;
estimating, by the computer, the parameters of each prediction model using the usage rate of each prediction model for the missing pattern; and
alternately repeating, by the computer, the process of estimating the parameters of each prediction model and the process of calculating, using the estimated parameters of each prediction model, the usage rate of each prediction model for the missing pattern.

Furthermore, a computer program of the present invention describes a processing procedure that causes a computer to execute:
a process of calculating, when machine-learning an output target model that uses a plurality of prediction models set respectively for the groups into which the samples of training data are divided, the training data being a collection of samples each of which is a pair of an objective variable and an explanatory variable vector, the usage rate of each prediction model constituting the output target model for a missing pattern indicating the missing state of components in the explanatory variable vector, using estimated parameters of the prediction models; and
a process of estimating the parameters of each prediction model using the usage rate of each prediction model for the missing pattern.
The computer program further describes a processing procedure that causes the computer to execute a process of alternately repeating the process of estimating the parameters of each prediction model and the process of calculating, using the estimated parameters of each prediction model, the usage rate of each prediction model for the missing pattern.

The object of the present invention is also achieved by the prediction model learning method of the present invention corresponding to the prediction model learning device of the present invention having the above configuration. The object of the present invention is also achieved by a computer program that implements the prediction model learning device and the prediction model learning method of the present invention on a computer, and by a computer program storage medium that stores the computer program.

According to the present invention, it is possible to generate a model (prediction function) that enables highly accurate prediction even when some of the explanatory variables included in the training data (past data) are missing.

FIG. 1 is a block diagram showing, in simplified form, the configuration of the prediction model learning device of the first embodiment according to the present invention.
FIG. 2 is a block diagram showing, in simplified form, the configuration of the prediction model learning device of the second embodiment according to the present invention.
FIG. 3 is a table showing a specific example of missing patterns in the training patterns.
FIG. 4 is a table used to explain the process of clustering the training patterns.
FIG. 5 is a flowchart showing an example of the machine learning operation in the prediction model learning device of the second embodiment.

Embodiments according to the present invention are described below with reference to the drawings.

(First embodiment)
FIG. 1 is a block diagram showing, in simplified form, the configuration of the prediction model learning device of the first embodiment according to the present invention. The prediction model learning device 10 of the first embodiment machine-learns the following model on the basis of training data in which samples, each a pair of an objective variable and an explanatory variable vector, are collected. The model is a model (prediction function) constituted by using a plurality of prediction models set respectively for the groups into which the samples of the training data are divided.

The prediction model learning device 10 of the first embodiment includes a control device 11 and a storage device 12. The storage device 12 stores a computer program (hereinafter also referred to as a program) 16 that describes a control procedure for controlling the operation of the control device 11.

The control device 11 has, for example, a CPU (Central Processing Unit). By executing the program 16 read from the storage device 12, the control device (computer) 11 can provide the following functions. That is, the control device 11 has, as functional units, an estimation unit 13, a usage rate calculation unit 14, and a command unit 15.

The usage rate calculation unit 14 has a function of calculating, using the estimated parameters of the prediction models, the usage rate of each prediction model constituting the output target model for a missing pattern indicating the missing state of components in the explanatory variable vector.

The estimation unit 13 has a function of estimating the parameters of each prediction model using the usage rate of each prediction model for the missing pattern.

The command unit 15 has a function of controlling the estimation unit 13 and the usage rate calculation unit 14. For example, when the usage rate calculation unit 14 has calculated the usage rate of each prediction model for the missing pattern, the command unit 15 outputs the calculation result to the estimation unit 13. The estimation unit 13 then estimates the parameters of each prediction model using that result, namely the usage rate of each prediction model for the missing pattern. The command unit 15 outputs the parameters of each prediction model estimated by the estimation unit 13 to the usage rate calculation unit 14. The usage rate calculation unit 14 then calculates, in the same way as before, the usage rate of each prediction model for the missing pattern using the estimated parameters. In this way, the command unit 15 has a function of controlling a process that alternately repeats the processing by the estimation unit 13 and the processing by the usage rate calculation unit 14.

Because the prediction model learning device 10 of the first embodiment is configured to machine-learn a model in consideration of missing patterns, it can generate a model that enables highly accurate prediction even when some components of the explanatory variable vectors included in the training data (past data) are missing.

(Second embodiment)
A second embodiment according to the present invention is described below.

FIG. 2 is a block diagram showing, in simplified form, the configuration of the prediction model learning device of the second embodiment. Broadly, the prediction model learning device 20 includes a control device 21 and a storage device 22. The storage device 22 has a storage medium (not shown), which stores a computer program (program) 30 and various data. The program 30 describes a processing procedure for controlling the operation of the prediction model learning device 20.

The control device (computer) 21 includes, for example, a CPU (Central Processing Unit). By operating in accordance with the program 30 read from the storage device 22, the control device 21 (CPU) can provide the following functions. That is, in the second embodiment, the control device 21 has, as functional units, a clustering unit 23, a complementing unit 24, a command unit 25, a usage rate calculation unit 26, an estimation unit 27, and a setting unit 28.

The clustering unit 23 has a function of clustering given training data (past data) by analyzing the data. The training data is a collection of samples, each of which is a combination of an objective variable and an explanatory variable vector based on past data (actual data). For example, the training data is given (input) to the prediction model learning device 20 from outside. Here, the explanatory variable vector is denoted by x and the objective variable by y. A sample (a combination of an objective variable and an explanatory variable vector) is denoted by (x_i, y_i), where i is a positive integer from 1 to N. The training data D can then be expressed as

$D = \{(x_i, y_i)\}_{i=1}^{N}$

In the second embodiment, the clustering unit 23 has a function of detecting, for each sample (x_i, y_i) of the given training data D, whether some components of the explanatory variable vector x are missing. When some components of the explanatory variable vector x are missing, the clustering unit 23 also has a function of detecting (identifying) the missing pattern that represents the missing state. Furthermore, on the basis of the detected missing patterns, the clustering unit 23 has a function of giving the same label to samples whose explanatory variable vectors x have the same or similar missing patterns. Clustering here refers to the processing from classifying the samples in this way through assigning the labels.

There are various clustering methods. Any method may be adopted here as long as it can classify samples on the basis of the missing pattern when some components of the explanatory variable vector x are missing. One specific example of clustering by the clustering unit 23 is described next.

In this specific example, the training data contains 40 samples, and the explanatory variable vector x of each of the samples S1-S40 has 10 components X1-X10. FIG. 3 is a table showing the missing state of the components X1-X10 of the explanatory variable vector x for the samples S1-S40. In FIG. 3, "NA" is shown at the positions corresponding to missing components, and the numerical values of the other components are omitted. According to FIG. 3, none of the components X1-X10 is missing in the samples S1-S5. In the samples S6-S10, the components X1-X5 are missing. In the samples S11-S20, the components X1-X6 are missing, and in the samples S21-S40, the components X7-X10 are missing.
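As an illustration, the missing states described for FIG. 3 can be rebuilt and inspected with a short Python sketch. Representing a missing component as None, the helper names missing_pattern and build_fig3_samples, and the dummy value 1.0 used for observed components are assumptions made for this example; the source does not prescribe a data representation.

```python
from typing import Dict, List, Optional, Tuple

Vector = List[Optional[float]]


def missing_pattern(x: Vector) -> Tuple[bool, ...]:
    """Return True at each position where a component of x is missing (None)."""
    return tuple(v is None for v in x)


def build_fig3_samples() -> Dict[str, Vector]:
    """Rebuild the missing states of FIG. 3; observed values are dummies (1.0)."""
    samples: Dict[str, Vector] = {}
    for s in range(1, 41):
        if s <= 5:
            missing = set()                # S1-S5: nothing missing
        elif s <= 10:
            missing = set(range(1, 6))     # S6-S10: X1-X5 missing
        elif s <= 20:
            missing = set(range(1, 7))     # S11-S20: X1-X6 missing
        else:
            missing = set(range(7, 11))    # S21-S40: X7-X10 missing
        samples[f"S{s}"] = [None if j in missing else 1.0 for j in range(1, 11)]
    return samples


samples = build_fig3_samples()
patterns = {name: missing_pattern(x) for name, x in samples.items()}
print(len(set(patterns.values())))  # number of distinct missing patterns
```

A tuple of booleans is used for the pattern so that patterns are hashable and can be compared or counted directly.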

For training data containing such explanatory variable vectors x, the clustering unit 23 compares the explanatory variable vector x of each of the samples S1-S40 with those of the other samples and calculates the similarity of the explanatory variable vectors x. Here, for the explanatory variable vectors x of the two samples being compared, let M be the number of components that are missing in both, and let N be the number of missing components (the missing count) of whichever of the two vectors has more missing components. The clustering unit 23 calculates the similarity R, for example, as R = M ÷ N. When both M and N are zero, the similarity R is defined as 1.

FIG. 4 is a table showing the similarity R calculated by the above method. For example, since no component of the explanatory variable vector x is missing in the samples S1-S5, comparing any of the samples S1-S5 with another of the samples S1-S5 gives M = 0 and N = 0, so the similarity R is 1 by the rule above. Comparing each of the samples S1-S5 with the samples S6-S10 gives R = M ÷ N = 0 ÷ 5 = 0, and comparing them with the samples S21-S40, which have four missing components, gives R = M ÷ N = 0 ÷ 4 = 0. Comparing each of the samples S1-S5 with the samples S11-S20 gives R = M ÷ N = 0 ÷ 6 = 0.
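The similarity calculation described above can be sketched in Python as follows. The function names similarity and pattern, and the encoding of a missing pattern as a list of booleans (True meaning missing), are assumptions made for illustration only.

```python
from typing import List, Sequence


def similarity(pattern_a: Sequence[bool], pattern_b: Sequence[bool]) -> float:
    """Similarity R = M / N between two missing patterns (True = missing)."""
    m = sum(1 for a, b in zip(pattern_a, pattern_b) if a and b)  # missing in both
    n = max(sum(pattern_a), sum(pattern_b))                      # larger missing count
    if n == 0:
        return 1.0  # rule from the text: R = 1 when M = N = 0
    return m / n


def pattern(missing_from: int, missing_to: int, size: int = 10) -> List[bool]:
    """Pattern with components X{missing_from}..X{missing_to} missing (1-based)."""
    return [missing_from <= j <= missing_to for j in range(1, size + 1)]


complete = [False] * 10        # S1-S5
x1_to_x5 = pattern(1, 5)       # S6-S10
x1_to_x6 = pattern(1, 6)       # S11-S20
x7_to_x10 = pattern(7, 10)     # S21-S40

print(similarity(complete, complete))
print(similarity(complete, x1_to_x5))
print(similarity(x1_to_x5, x1_to_x6))
print(similarity(x1_to_x6, x7_to_x10))
```

Because N is the larger of the two missing counts, R never exceeds 1, and it equals 1 only when the two patterns are identical.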

The clustering unit 23 sets (assigns) the same label to sets of samples whose similarity R calculated in this way is 0.8 or more. For example, on the basis of the similarity R shown in FIG. 4, the clustering unit 23 sets the label C1 for each of the samples S1-S5, the label C2 for each of the samples S6-S20, and the label C3 for each of the samples S21-S40. The samples S6-S10 and S11-S20 receive the same label because they share the five missing components X1-X5 and the larger missing count is six, so R = 5 ÷ 6 ≈ 0.83.
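The text does not state how pairwise similarities are turned into labels, so the following Python sketch assumes one simple possibility: each sample joins the first existing cluster whose representative pattern has R of 0.8 or more, and otherwise starts a new cluster. The greedy rule and the names assign_labels and fig3_pattern are hypothetical choices for illustration, not the method of the source.

```python
from typing import Dict, List, Tuple

Pattern = Tuple[bool, ...]


def similarity(a: Pattern, b: Pattern) -> float:
    """R = M / N, with R = 1 when neither pattern has a missing component."""
    m = sum(1 for p, q in zip(a, b) if p and q)
    n = max(sum(a), sum(b))
    return 1.0 if n == 0 else m / n


def assign_labels(patterns: Dict[str, Pattern], threshold: float = 0.8) -> Dict[str, str]:
    """Greedy labeling: join the first cluster whose representative is similar enough."""
    representatives: List[Pattern] = []   # one representative pattern per label
    labels: Dict[str, str] = {}
    for name, pat in patterns.items():
        for idx, rep in enumerate(representatives):
            if similarity(pat, rep) >= threshold:
                labels[name] = f"C{idx + 1}"
                break
        else:
            # No existing cluster is similar enough: open a new one.
            representatives.append(pat)
            labels[name] = f"C{len(representatives)}"
    return labels


def fig3_pattern(sample_no: int) -> Pattern:
    """Missing pattern of sample S{sample_no} as described for FIG. 3."""
    if sample_no <= 5:
        lo, hi = 0, 0        # nothing missing
    elif sample_no <= 10:
        lo, hi = 1, 5        # X1-X5 missing
    elif sample_no <= 20:
        lo, hi = 1, 6        # X1-X6 missing
    else:
        lo, hi = 7, 10       # X7-X10 missing
    return tuple(lo <= j <= hi for j in range(1, 11))


patterns = {f"S{s}": fig3_pattern(s) for s in range(1, 41)}
labels = assign_labels(patterns)
print(labels["S1"], labels["S6"], labels["S11"], labels["S21"])
```

On the FIG. 3 data this greedy rule reproduces the labels C1, C2, and C3 given in the text; on other data the result can depend on the order in which samples are visited.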

As described above, the clustering unit 23 has a function of clustering a plurality of samples by focusing on the missing patterns of the explanatory variable vectors x.

The complementing unit 24 has a function of filling in data (numerical values) in place of the missing components of the explanatory variable vector x. For example, for each of the samples S6-S40, the complementing unit 24 substitutes (fills in) the average value of the non-missing components of the explanatory variable vector x for each missing component. More specifically, for the samples S6-S10, the complementing unit 24 substitutes the average value of the components X6-X10 for the missing components X1-X5. For the samples S11-S20, the complementing unit 24 substitutes the average value of the components X7-X10 for the missing components X1-X6. For the samples S21-S40, the complementing unit 24 substitutes the average value of the components X1-X6 for the missing components X7-X10.
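A minimal Python sketch of this complementing step is shown below, assuming that missing components are represented as None. The function name impute_with_observed_mean and the sample values are hypothetical.

```python
from typing import List, Optional


def impute_with_observed_mean(x: List[Optional[float]]) -> List[float]:
    """Replace each missing component (None) with the mean of the observed ones."""
    observed = [v for v in x if v is not None]
    if not observed:
        raise ValueError("cannot impute: every component is missing")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in x]


# Hypothetical sample shaped like S6-S10: X1-X5 missing, X6-X10 observed.
sample = [None, None, None, None, None, 2.0, 4.0, 6.0, 8.0, 10.0]
filled = impute_with_observed_mean(sample)
print(filled)
```

Note that the mean is taken within a single sample across its observed components, as in the example of the text, rather than across samples for a single component.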

The setting unit 28 has a function of setting prediction models on the basis of the training data. For example, the setting unit 28 classifies the samples S1-S40 of the training data into the following four groups on the basis of the combination (pattern) of non-missing components of the explanatory variable vector x. That is, when the training data is a set of samples having the missing patterns shown in the table of FIG. 3, the samples S1-S5 form a group in which none of the components X1-X10 of the explanatory variable vector x is missing (group G1). The samples S6-S10 form a group in which the components X6-X10 are not missing (group G2). The samples S11-S20 form a group in which the components X7-X10 are not missing (group G3). The samples S21-S40 form a group in which the components X1-X6 are not missing (group G4). The setting unit 28 sets a prediction model corresponding to each of the groups of samples formed in this way.
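The grouping performed by the setting unit can be sketched in Python as follows, assuming that samples are grouped when they share exactly the same set of non-missing components. The helper names observed_components and group_by_observed are hypothetical, and the FIG. 3 patterns are hard-coded from the description above.

```python
from typing import Dict, List, Tuple


def observed_components(sample_no: int) -> Tuple[int, ...]:
    """1-based indices of the non-missing components of S{sample_no} (per FIG. 3)."""
    if sample_no <= 5:
        return tuple(range(1, 11))   # X1-X10
    if sample_no <= 10:
        return tuple(range(6, 11))   # X6-X10
    if sample_no <= 20:
        return tuple(range(7, 11))   # X7-X10
    return tuple(range(1, 7))        # X1-X6


def group_by_observed(sample_nos: List[int]) -> Dict[str, List[int]]:
    """Group samples that share exactly the same set of non-missing components."""
    by_pattern: Dict[Tuple[int, ...], List[int]] = {}
    for s in sample_nos:
        by_pattern.setdefault(observed_components(s), []).append(s)
    # Name the groups G1, G2, ... in order of first appearance.
    return {f"G{k}": members for k, members in enumerate(by_pattern.values(), start=1)}


groups = group_by_observed(list(range(1, 41)))
for name, members in groups.items():
    print(name, f"S{members[0]}-S{members[-1]}", len(members))
```

Unlike the labels C1-C3, which merge similar missing patterns, this grouping keeps S6-S10 and S11-S20 apart, which is why four groups (and four prediction models) result from three labels.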

Here, the prediction model (function) associated with each of the groups G1-G4 is assumed to be expressed by Expression (1).

In Expression (1), x is the explanatory variable vector and y is the objective variable. k is an index identifying the prediction model and is an integer of 1 or more (k = 1, 2, ..., K). Here, k of each prediction model is set to the number corresponding to the respective group G1-G4 formed as described above. That is, k = 1 for the prediction model corresponding to the group G1, k = 2 for the prediction model corresponding to the group G2, k = 3 for the prediction model corresponding to the group G3, and k = 4 for the prediction model corresponding to the group G4. In this case, K = 4.

θ^(k) represents the parameters of the prediction model f_k.

Here, let c_{x_i} denote the label assigned to a sample by the clustering process of the clustering unit 23, and let Z_{c_{x_i},k} denote the usage rate of each prediction model for each label (a model assignment latent variable). The prediction model that takes this usage rate into account is then expressed by Expression (2).

As a more specific example, suppose that the family of probability density functions expressed by Expression (3) is set (defined) as the prediction model.

In Expression (3), θ := (β, σ²), where β represents the mean (also called the weight when the mean is expressed as a linear function of the explanatory variables) and σ² represents the variance. Also, τ ∈ {1, 2, ...}.

Based on Expression (3), the prediction models corresponding to the groups G1-G4 are expressed (defined) as Expressions (4) to (7).

In the second embodiment, machine learning of the model means machine learning of the parameters θ^(k) and the usage rates Z_{c_{x_i},k}. For this machine learning, the command unit 25 has a function of controlling the operations of the usage rate calculation unit 26 and the estimation unit 27. For example, on receiving the training data, the command unit 25 determines whether information on the usage rates Z_{c_{x_i},k} of the prediction models is stored, for example, in a storage unit 33 provided in the control device 21. If it determines that the information is not stored, it sets (generates) initial values of the usage rates Z_{c_{x_i},k}. As a specific example, when the prediction models f_1-f_4 for the groups G1-G4 described above have been set, the command unit 25 sets the same constant as the usage rate Z_{c,k} for all of the prediction models f_1-f_4. That is, each usage rate Z_{c,k} is set to 0.25. In this case, c = 1, 2, 3 and k = 1, 2, 3, 4.
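The initialization described above can be sketched in Python as follows. The text says only that the same constant, 0.25, is set for four models; reading that constant as 1/K for K prediction models is an assumption of this sketch, as are the function name init_usage_rates and the dictionary layout.

```python
from typing import Dict, List


def init_usage_rates(labels: List[str], num_models: int) -> Dict[str, List[float]]:
    """Give every label the same usage rate 1 / K for each of the K prediction models."""
    if num_models < 1:
        raise ValueError("num_models must be at least 1")
    return {label: [1.0 / num_models] * num_models for label in labels}


# Three labels (c = 1, 2, 3) and four prediction models (k = 1, 2, 3, 4).
Z = init_usage_rates(["C1", "C2", "C3"], num_models=4)
print(Z["C1"])
```

With this choice the usage rates for each label sum to 1, so they can be read as a uniform starting weighting over the prediction models.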

When the command unit 25 has obtained the usage rates Z_{c,k}, it outputs the usage rates Z_{c,k} and the training data to the estimation unit 27. This activates the estimation unit 27, which estimates the parameter θ^(k) of each prediction model as described later. On receiving the estimated (calculated) parameters θ^(k) from the estimation unit 27, the command unit 25 outputs the parameters θ^(k) and the training data to the usage rate calculation unit 26. This activates the usage rate calculation unit 26, which calculates the usage rates Z_{c,k} as described later. On receiving the usage rates Z_{c,k} calculated by the usage rate calculation unit 26, the command unit 25 outputs the usage rates Z_{c,k} and the training data to the estimation unit 27.

In this way, the command unit 25 advances the machine learning of the parameters θ^(k) and the usage rates Z_{c(xi),k} by controlling the estimation unit 27 and the usage rate calculation unit 26 so that they operate alternately and repeatedly. The command unit 25 continues this machine learning until a predetermined stop condition is satisfied. One example of the stop condition is that the sum of squares of the component-wise differences between the newly calculated parameter θ^(k) and the parameter θ^(k) calculated in the immediately preceding iteration is 10^-5 or less.
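The alternating control and the example stop condition can be sketched as below. This is only a sketch under stated assumptions: `estimate` and `compute_rates` are hypothetical callables standing in for the estimation unit 27 and the usage rate calculation unit 26, the parameters are assumed to be a flat list of numbers, and the `max_iter` safeguard is not part of the text.

```python
def squared_change(theta_new, theta_old):
    """Sum of squared component-wise differences between two parameter lists."""
    return sum((a - b) ** 2 for a, b in zip(theta_new, theta_old))


def alternate(estimate, compute_rates, rates, data, tol=1e-5, max_iter=1000):
    """Alternate parameter estimation and usage rate calculation.

    estimate(data, rates) -> theta and compute_rates(data, theta) -> rates
    are hypothetical stand-ins; max_iter is an added safety limit.
    """
    theta_old = None
    theta = None
    for _ in range(max_iter):
        theta = estimate(data, rates)          # estimate parameters from rates
        rates = compute_rates(data, theta)     # recompute rates from parameters
        if theta_old is not None and squared_change(theta, theta_old) <= tol:
            break                              # stop condition satisfied
        theta_old = theta
    return theta, rates


# Toy stand-ins whose fixed point is 0, used only to exercise the loop.
theta, rates = alternate(
    estimate=lambda data, r: [r[0]],
    compute_rates=lambda data, t: [t[0] / 2.0],
    rates=[1.0],
    data=None,
)
```

Starting from initial parameters instead of initial usage rates, as the text also allows, would only swap the order of the two calls inside the loop.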

In the above example, the command unit 25 controls the repeated operation of the estimation unit 27 and the usage rate calculation unit 26 after setting the initial values of the usage rates Z_{c(xi),k}. Alternatively, the command unit 25 may set (generate) initial values of the parameters θ^(k) and start the repeated operation of the estimation unit 27 and the usage rate calculation unit 26 by outputting these initial values and the training data to the usage rate calculation unit 26.

The estimation unit 27 has a function of estimating the parameters θ based on the training data and the usage rates Z_{c(xi),k} of the prediction models f₁ to f₄ for each label, using the information on the prediction models set by the setting unit 28 as appropriate. For example, based on the training data and the usage rates Z_{c(xi),k} output from the command unit 25, the estimation unit 27 calculates the parameters θ^(1) to θ^(4) of the prediction models f₁ to f₄ so that the log likelihood represented by Expression (8) increases. When some components of the explanatory variable vector x in the training data are missing, the data (numerical values) supplied by the complement unit 24 are used.

There are various methods for calculating the parameters θ so that the log likelihood increases, and the estimation unit 27 may adopt any suitable one of them. For example, the estimation unit 27 may use a regularization method to keep the calculation from becoming complicated. When the simultaneous equations represented by Expression (9) can be solved analytically, the estimation unit 27 can calculate (estimate) the parameters θ by substituting the result into Expression (8). When the simultaneous equations of Expression (9) cannot be solved analytically, the estimation unit 27 may calculate (estimate) the parameters θ using a numerical method such as Newton's method.
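As an illustration of the analytic case, the sketch below maximizes a usage-rate-weighted Gaussian log likelihood in closed form for one prediction model. It rests on assumptions that are not taken from the text: Expressions (8) and (9) are not reproduced here, so the objective is assumed to be the sum over samples of w_i · log N(y_i | β·x_i, σ²), with a single explanatory variable, no intercept, and w_i the usage rate attached to sample i. The function name is hypothetical.

```python
def fit_weighted_gaussian(xs, ys, weights):
    """Closed-form maximizer of sum_i w_i * log N(y_i | beta * x_i, sigma2).

    Setting the gradient with respect to beta and sigma2 to zero gives
    beta   = sum(w x y) / sum(w x^2)
    sigma2 = sum(w (y - beta x)^2) / sum(w)
    """
    sxx = sum(w * x * x for w, x in zip(weights, xs))
    sxy = sum(w * x * y for w, x, y in zip(weights, xs, ys))
    beta = sxy / sxx
    total_w = sum(weights)
    sigma2 = sum(
        w * (y - beta * x) ** 2 for w, x, y in zip(weights, xs, ys)
    ) / total_w
    return beta, sigma2


# Samples weighted by (hypothetical) usage rates of this model.
beta, sigma2 = fit_weighted_gaussian(
    xs=[1.0, 1.0], ys=[1.0, 3.0], weights=[0.5, 0.5]
)
```

A sample whose usage rate for this model is close to zero contributes almost nothing to either estimate, which is how the usage rates steer each model toward the samples it explains well.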

The operator symbol ∇ in Expression (9) denotes nabla, the vector differential operator.

The result of the estimation unit 27 estimating the parameter θ (θ = (β, σ²)) using Expression (9) is as follows.

Here, X^(k) and Y^(k) are defined as follows.

Here, d(k) appearing in the above column vector denotes the total number of components of the column vector corresponding to the prediction model f_k.

The estimation unit 27 has a function of outputting information on the estimated parameters θ to the command unit 25 and a function of registering that information in, for example, the storage unit 33 provided in the control device 21.

The usage rate calculation unit 26 has a function of calculating the usage rates Z_{c(xi),k} of the prediction models based on the training data and the information on the parameters θ output from the command unit 25, using the information on the prediction models set by the setting unit 28 as appropriate. For example, the usage rate calculation unit 26 calculates the usage rates so that, for the samples of each label assigned by the clustering process of the clustering unit 23, a prediction model that gives those samples a high probability receives a large usage rate for that label. For example, the usage rate calculation unit 26 calculates the usage rates Z_{c(xi),k} based on the likelihood ratio of the prediction models represented by Expression (10).
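One way to read "usage rates based on a likelihood ratio" is sketched below. It is not Expression (10), which is not reproduced here, and it does not use p_k(c) or τ(c). It assumes Gaussian prediction models with mean β·x for a single explanatory variable, scores each model by the total log likelihood of the samples sharing one label, and normalizes the scores so that the rates sum to 1. The function names are hypothetical.

```python
import math


def gaussian_logpdf(y, mean, sigma2):
    """Log density of a normal distribution with the given mean and variance."""
    return -0.5 * math.log(2.0 * math.pi * sigma2) - (y - mean) ** 2 / (2.0 * sigma2)


def usage_rates(samples, models):
    """Usage rate of each model for the samples of one label.

    samples: list of (x, y) pairs sharing the same label.
    models:  list of (beta, sigma2) pairs, one per prediction model.
    """
    scores = [
        sum(gaussian_logpdf(y, beta * x, sigma2) for x, y in samples)
        for beta, sigma2 in models
    ]
    top = max(scores)                      # subtract the maximum for stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# The first model (beta = 2) fits these samples exactly; the second does not.
rates_for_label = usage_rates(
    samples=[(1.0, 2.0), (2.0, 4.0)],
    models=[(2.0, 1.0), (0.0, 1.0)],
)
```

Under this reading the model with the highest likelihood for a label always receives the highest usage rate for that label.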

Here, p_k(c) and τ(c) are defined as follows.

The usage rate calculation unit 26 has a function of outputting information on the calculated usage rates Z_{c(xi),k} of the prediction models to the command unit 25 and a function of registering that information in, for example, the storage unit 33 provided in the control device 21.

An example of the operation related to prediction model learning in the prediction model learning device 20 of the second embodiment is described below with reference to the flowchart of FIG. 5. FIG. 5 is a flowchart of the operation related to prediction model learning executed by the prediction model learning device 20 of the second embodiment, and it represents the processing procedure of the computer program 30 executed by the control device 21 (CPU) of the prediction model learning device 20.

For example, on receiving training data from outside the control device 21, the control device 21 (clustering unit 23) identifies, for each sample in the training data, the missing pattern representing the missing state of the explanatory variable vector x (step S101). The clustering unit 23 then classifies the samples of the training data based on the missing pattern and assigns the same label to samples in the same class. In other words, the control device 21 (clustering unit 23) clusters the training data based on the missing pattern (step S102).
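Steps S101 and S102 can be illustrated with the short sketch below. The representation is an assumption made for the example: a missing component is marked with `None`, a missing pattern is a tuple of booleans, and labels are integers assigned in order of first appearance. The function names are hypothetical.

```python
def missing_pattern(x):
    """Missing pattern of one explanatory variable vector (True = missing)."""
    return tuple(v is None for v in x)


def cluster_by_missing_pattern(samples):
    """Assign the same label to samples that share a missing pattern.

    samples: list of (x, y) pairs, where x is a sequence that may contain None.
    Returns the per-sample labels and the pattern-to-label mapping.
    """
    label_of = {}
    labels = []
    for x, _y in samples:
        pattern = missing_pattern(x)
        if pattern not in label_of:
            label_of[pattern] = len(label_of) + 1   # next unused label
        labels.append(label_of[pattern])
    return labels, label_of


labels, label_of = cluster_by_missing_pattern([
    ((1.0, None), 0.5),   # second component missing
    ((2.0, 3.0), 1.0),    # nothing missing
    ((4.0, None), 2.0),   # same pattern as the first sample
])
```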

Thereafter, the control device 21 (command unit 25) sets (generates) initial values of the usage rates Z_{c(xi),k} of the prediction models (step S103). The prediction models are the models set (defined) by the setting unit 28 based on the training data given to the control device 21 as described above.

The control device 21 then determines whether some components of the explanatory variable vector x in the training data are missing (step S104). If it determines that components are missing, the control device 21 (complement unit 24) complements the missing components (step S105).
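Steps S104 and S105 can be sketched as follows. The complementing rule used here, replacing each missing component with the mean of the observed values in the same column, is an assumption chosen for illustration and may differ from the method used by the complement unit 24. Missing components are again marked with `None`, and the function name is hypothetical.

```python
def complement_missing(rows):
    """Fill missing components (None) with the column mean of observed values.

    Mean imputation is an illustrative assumption. A column with no observed
    value at all is filled with 0.0, which is also an arbitrary choice.
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [
        [means[j] if r[j] is None else r[j] for j in range(n_cols)]
        for r in rows
    ]


completed = complement_missing([[1.0, None], [3.0, 4.0], [None, 6.0]])
```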

After the complementing process, or when no component of the explanatory variable vector x in the training data is missing, the control device 21 (command unit 25) determines whether the stop condition is satisfied (step S106). If the command unit 25 determines that the stop condition is not satisfied, it outputs the initial values of the usage rates Z_{c(xi),k} and the training data (complemented data) to the estimation unit 27. The estimation unit 27 then starts operating and estimates the parameters θ of the prediction models (step S107). Information on the estimated parameters θ is output from the estimation unit 27 to the command unit 25 and registered in the storage unit 33.

On receiving the information on the parameters θ and the training data from the estimation unit 27, the command unit 25 outputs this information to the usage rate calculation unit 26. The usage rate calculation unit 26 then calculates the usage rates Z_{c(xi),k} based on the received information (step S108). Information on the calculated usage rates Z_{c(xi),k} is output to the command unit 25 and registered in the storage unit 33.

The command unit 25 then determines whether the stop condition is satisfied (step S106). If it determines that the stop condition is not satisfied, the operations from step S107 onward are repeated. If it determines that the stop condition is satisfied, the command unit 25 ends the machine learning of the model.

Through the above operation, the control device 21 machine-learns the model based on the training data.

As described above, the prediction model learning device 20 of the second embodiment machine-learns the usage rates of the prediction models (model allocation latent variables) for the missing patterns of the explanatory variable vector x included in the training data. The prediction model learning device 20 then machine-learns a model that takes these usage rates into account. That is, the prediction model learning device 20 can perform machine learning that considers how strongly the missing components of the explanatory variable vector x are involved in the objective variable. As a result, the prediction model learning device 20 can generate a model (prediction function) that enables highly accurate prediction even when some components of the explanatory variable vector x included in the training data (past data) are missing.

(Other embodiments)
The present invention is not limited to the first and second embodiments and can take various forms. For example, in the second embodiment, the setting unit 28 sets (defines) prediction models each corresponding to a group of samples, where the samples are grouped by focusing on the pattern of the non-missing components of the explanatory variable vector x in the training data. Instead, the setting unit 28 may, for example, set (define) prediction models each corresponding to a group of samples grouped by focusing on the pattern (missing pattern) of the explanatory variable vector x in the training data. Alternatively, the setting unit 28 may set (define) prediction models each corresponding to a group of samples divided according to a criterion other than the missing pattern of each sample in the training data. There are thus various methods for setting (defining) the prediction models, and any of them may be used here.

10, 20 Prediction model learning device
13, 27 Estimation unit
14, 26 Usage rate calculation unit
15, 25 Command unit
23 Clustering unit
24 Complement unit

Claims

A prediction model learning device comprising:
a usage rate calculation unit that, when an output target model is machine-learned based on training data in which samples, each being a pair of an objective variable and an explanatory variable vector, are collected, the output target model using a plurality of prediction models respectively set for groups into which the samples of the training data are divided, calculates, using estimated parameters of the prediction models, a usage rate of each prediction model constituting the output target model for a missing pattern representing a missing state of components in the explanatory variable vector;
an estimation unit that estimates a parameter of each prediction model using the usage rate of each prediction model for the missing pattern; and
a command unit that performs control to alternately repeat a process in which the estimation unit estimates the parameter of each prediction model using the usage rate of each prediction model for the missing pattern calculated by the usage rate calculation unit, and a process in which the usage rate calculation unit calculates the usage rate of each prediction model for the missing pattern using the parameter of each prediction model estimated by the estimation unit.

The prediction model learning device according to claim 1, wherein the usage rate calculation unit calculates the usage rate of each prediction model so that, among the usage rates of the prediction models for the missing pattern, the usage rate of the prediction model having the highest likelihood for the samples having that missing pattern is the highest.

The prediction model learning device according to claim 2, wherein the estimation unit estimates the parameters of the prediction models so that machine learning proceeds in the direction in which the log likelihood of the function obtained by multiplying each prediction model by its usage rate increases.

The prediction model learning device according to claim 1, further comprising a complement unit that complements missing components in the explanatory variable vector,
wherein the usage rate calculation unit and the estimation unit use the training data in which the missing components in the explanatory variable vector have been complemented by the complement unit.

The prediction model learning device according to any one of claims 1 to 4, further comprising a clustering unit that classifies the samples of the training data based on the missing pattern and assigns a label to each class,
wherein the usage rate calculation unit calculates, for each label, the usage rate of each prediction model for the missing pattern.

A prediction model learning method, wherein, when an output target model is machine-learned based on training data in which samples, each being a pair of an objective variable and an explanatory variable vector, are collected, the output target model using a plurality of prediction models respectively set for groups into which the samples of the training data are divided, a computer calculates, using estimated parameters of the prediction models, a usage rate of each prediction model constituting the output target model for a missing pattern representing a missing state of components in the explanatory variable vector;
the computer estimates a parameter of each prediction model using the usage rate of each prediction model for the missing pattern; and
the computer alternately repeats the process of estimating the parameter of each prediction model and the process of calculating the usage rate of each prediction model for the missing pattern using the estimated parameter of each prediction model.

A computer program representing a processing procedure that causes a computer to execute:
a process of calculating, when an output target model is machine-learned based on training data in which samples, each being a pair of an objective variable and an explanatory variable vector, are collected, the output target model using a plurality of prediction models respectively set for groups into which the samples of the training data are divided, a usage rate of each prediction model constituting the output target model for a missing pattern representing a missing state of components in the explanatory variable vector, using estimated parameters of the prediction models;
a process of estimating a parameter of each prediction model using the usage rate of each prediction model for the missing pattern; and
a process of alternately repeating the process of estimating the parameter of each prediction model and the process of calculating the usage rate of each prediction model for the missing pattern using the estimated parameter of each prediction model.