JP2018169999A

JP2018169999A - Model construction system and model construction method

Info

Publication number: JP2018169999A
Application number: JP2017249728A
Authority: JP
Inventors: 幸仁西田; Yukito Nishida; 英己安井; Hideki Yasui
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2017-03-29
Filing date: 2017-12-26
Publication date: 2018-11-01
Anticipated expiration: 2037-12-26
Also published as: JP6767355B2; TW201837761A; TWI677799B

Abstract

PROBLEM TO BE SOLVED: To provide a model construction system and a model construction method capable of constructing a model with high generalization capability while suppressing deterioration in precision. SOLUTION: A model construction system according to an embodiment includes a base model construction part, a similarity calculation part, a modified model construction part, and a generalization capability calculation part. The base model construction part constructs a base model indicating the relationship between an output variable and selected input variables selected from a plurality of input variables. The similarity calculation part calculates the similarity between the selected input variables and the non-selected input variables, that is, the input variables among the plurality of input variables other than the selected input variables. Based on the similarity, the modified model construction part replaces at least part of the selected input variables with non-selected input variables and constructs a modified model indicating the relationship between the input variables obtained through the replacement and the output variable. The generalization capability calculation part calculates the generalization capabilities of the base model and the modified model. SELECTED DRAWING: Figure 1

Description

Embodiments of the present invention relate to a model construction system and a model construction method.

To predict an output variable (objective variable) from a plurality of input variables (explanatory variables), it is common to construct a model that represents the relationship between the input variables and the output variable. When constructing a model, some input variables are selected from a large number of input variables, and the model is constructed using the selected input variables and the output variable. For example, the input variables are selected so that the prediction error for the output variable is small and the output variable can be predicted with higher accuracy.

In addition to accuracy, a model is required to have high generalization ability. That is, a model constructed from one range of data (existing data) is required to have good accuracy on another range of data (unknown data) as well. However, a model that is highly accurate on existing data does not necessarily have high generalization ability. Moreover, a model with somewhat lower accuracy may generalize better than the model that is most accurate on the existing data. For this reason, there has been a demand for a technique capable of constructing a model with high generalization ability while suppressing a decrease in accuracy.

JP 2010-282547 A

The problem to be solved by the present invention is to provide a model construction system and a model construction method capable of constructing a model with high generalization ability while suppressing a decrease in accuracy.

A model construction system according to an embodiment includes a base model construction unit, a similarity calculation unit, a modified model construction unit, and a generalization ability calculation unit. The base model construction unit constructs a base model representing the relationship between an output variable and selected input variables selected from a plurality of input variables. The similarity calculation unit calculates the similarity between the selected input variables and the non-selected input variables, that is, the input variables among the plurality of input variables other than the selected input variables. Based on the similarity, the modified model construction unit replaces at least some of the selected input variables with non-selected input variables and constructs a modified model representing the relationship between the input variables after the replacement and the output variable. The generalization ability calculation unit calculates the generalization ability of the base model and of the modified model.

A block diagram showing the configuration of the model construction system according to the embodiment.
A diagram explaining an example of processing by the model construction system according to the embodiment.
A diagram explaining an example of processing by the model construction system according to the embodiment.
A flowchart showing an example of the model construction method according to the embodiment.
A flowchart showing another example of the model construction method according to the embodiment.
A block diagram illustrating the configuration of a model construction device for realizing the model construction system according to the embodiment.
A graph illustrating characteristics of a model constructed using the model construction system according to the embodiment.
A graph illustrating characteristics of a model constructed using the model construction system according to the embodiment.

Embodiments of the present invention will be described below with reference to the drawings.
In this specification and in each drawing, elements similar to those already described are denoted by the same reference numerals, and detailed description of them is omitted as appropriate.

FIG. 1 is a block diagram showing the configuration of a model construction system 1 according to the embodiment.
FIGS. 2 and 3 are diagrams explaining an example of processing by the model construction system 1 according to the embodiment.

As shown in FIG. 1, the model construction system 1 includes an acquisition unit 100, a base model construction unit 102, a model information storage unit 104, a similarity calculation unit 106, a similarity information storage unit 108, a modified model construction unit 110, a generalization ability calculation unit 112, an external output unit 114, a specified number database 120, and a variable database 122.

The specified number database 120 stores a specified number. The specified number represents the number of models to be constructed in the model construction system 1. The specified number is, for example, input by the user in advance. The variable database 122 stores variable data, which are the measured values of each of the input variables and the output variable.

The acquisition unit 100 acquires the specified number from the specified number database 120 and the variable data from the variable database 122. The acquisition unit 100 outputs the acquired information to the base model construction unit 102.

The base model construction unit 102 selects some input variables from the plurality of input variables output from the acquisition unit 100. Using the variable data acquired by the acquisition unit 100, the base model construction unit 102 constructs a model that represents the relationship between the selected input variables and the output variable. The selection of input variables and the construction of the model can be performed using, for example, Least Absolute Shrinkage and Selection Operator (Lasso), Elastic Net, Ridge, Least Angle Regression (LARS), Non-Negative Garrote, or Smoothly Clipped Absolute Deviation (SCAD). Alternatively, the input variables may be selected using any of stepwise selection, Variable Importance in Projection (VIP), a genetic algorithm, and the Nearest Correlation Louvain Method (NCLM), and the model may then be constructed using multiple regression or Partial Least Squares (PLS).

Hereinafter, an input variable selected by the base model construction unit 102 when constructing the model is referred to as a "selected input variable", and an input variable that was not selected is referred to as a "non-selected input variable". The selected input variables are some of the plurality of input variables acquired by the acquisition unit 100. The non-selected input variables are another part of the plurality of input variables and are different from the selected input variables. The model constructed by the base model construction unit 102 using the selected input variables is referred to as the "base model". The base model represents the relationship between an input variable group containing the plurality of selected input variables and the output variable.

The base model construction unit 102 outputs the constructed base model to the model information storage unit 104, where it is stored as model information. The base model construction unit 102 also outputs the base model to the similarity calculation unit 106 and the modified model construction unit 110.

The similarity calculation unit 106 calculates a plurality of similarities, one between each of the selected input variables included in the base model and each of the non-selected input variables. For example, a correlation coefficient, a partial correlation coefficient, a canonical correlation, or a Ridge coefficient of determination can be used as the similarity. The similarity calculation unit 106 outputs the calculated similarities to the similarity information storage unit 108.
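The pairwise similarity calculation can be sketched as follows, assuming the correlation coefficient is chosen as the similarity. The function names, the toy data, and the use of the absolute value (so that strong negative correlation also counts as similar) are assumptions made for illustration, not details taken from the embodiment.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: a constant variable is treated as uncorrelated
    return cov / math.sqrt(var_a * var_b)


def similarity_table(data, selected, non_selected):
    """Map each (selected, non-selected) name pair to |correlation|."""
    return {
        (s, u): abs(pearson(data[s], data[u]))
        for s in selected
        for u in non_selected
    }


# Hypothetical measurements: X4 tracks X1 closely, X5 does not.
data = {
    "X1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "X4": [2.1, 3.9, 6.2, 8.0, 9.9],
    "X5": [5.0, 1.0, 4.0, 2.0, 3.0],
}
sims = similarity_table(data, ["X1"], ["X4", "X5"])
```

The resulting table plays the role of the similarity information held in the similarity information storage unit 108; any of the other similarity measures named above could be substituted for `pearson`.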

The modified model construction unit 110 acquires the similarity information for the input variables from the similarity information storage unit 108. Based on this similarity information, the modified model construction unit 110 replaces at least some of the selected input variables with at least some of the non-selected input variables. This generates another input variable group. In doing so, the modified model construction unit 110 may replace all of the selected input variables included in the base model, or only some of them, with at least some of the non-selected input variables. The modified model construction unit 110 then constructs a model that represents the relationship between this other input variable group and the output variable. Hereinafter, a model constructed in this way by the modified model construction unit 110 is referred to as a "modified model".

The model information of the modified model constructed by the modified model construction unit 110 is stored in the model information storage unit 104. The modified model construction unit 110 also determines whether the total number of base and modified models constructed by the model construction system 1 has reached the specified number. If the total has not reached the specified number, the modified model construction unit 110 repeatedly constructs further modified models while changing which variables are swapped in.

When the total number of base and modified models reaches the specified number, the generalization ability calculation unit 112 calculates the generalization ability of each constructed model. The generalization ability calculation unit 112 acquires the model information (base model and modified models) stored in the model information storage unit 104 and acquires variable data from the variable database 122. Here, the generalization ability calculation unit 112 acquires variable data from a range different from that used when the base model and modified models were constructed (unknown data). For example, the generalization ability calculation unit 112 applies the base model and the modified models to the input variables of the unknown data. It then compares each model's predicted values with the measured values of the output variable and calculates the prediction accuracy as the generalization ability of that model.

As an example, the base model and the modified models are constructed using various data obtained from one manufacturing apparatus (such as temperature, pressure, and product quality) as the input variables and the output variable. In this case, each model is applied to variable data obtained from a different manufacturing apparatus, and the resulting accuracy is calculated as the generalization ability of that model.
Alternatively, the base model and the modified models are constructed based on variable data obtained from a manufacturing apparatus during a given period. In this case, each model may be applied to data obtained from the same apparatus during a different period, and the resulting accuracy may be calculated as the generalization ability of that model.
The generalization ability is calculated using, for example, Mean Square Error (MSE), Root Mean Square Error (RMSE), the coefficient of determination ($R^2$), a correlation coefficient, Akaike's Information Criterion (AIC), or the Bayesian Information Criterion (BIC). The generalization ability calculation unit 112 outputs the calculated generalization ability of each model to the external output unit 114.
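As a minimal sketch of this evaluation step, the following computes MSE, RMSE, and the coefficient of determination for a linear model applied to data that was not used to build it. The helper names, the coefficients, and the held-out values are hypothetical and serve only to show the arithmetic.

```python
import math


def mse(actual, predicted):
    """Mean square error between measured and predicted output values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(mse(actual, predicted))


def r_squared(actual, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def predict(coeffs, intercept, rows):
    """Apply a linear model to rows of input-variable values."""
    return [intercept + sum(c * x for c, x in zip(coeffs, row)) for row in rows]


# Hypothetical held-out data (for example, from another apparatus or period).
holdout_rows = [[1.0, 2.0], [2.0, 0.0], [3.0, 1.0], [4.0, 3.0]]
holdout_y = [5.0, 4.0, 7.0, 11.0]

# Hypothetical model built beforehand on other data: Y = 2*Xa + 1*Xb + 1.
predicted = predict([2.0, 1.0], 1.0, holdout_rows)
scores = {
    "MSE": mse(holdout_y, predicted),
    "RMSE": rmse(holdout_y, predicted),
    "R2": r_squared(holdout_y, predicted),
}
```

For MSE and RMSE a smaller value indicates higher generalization ability, whereas for the coefficient of determination a larger value does, so the comparison direction depends on which measure is chosen.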

The external output unit 114 displays the one model with the highest generalization ability among the base model and the modified models to the user on a display, or outputs it in a predetermined file format. The external output unit 114 may also output a plurality of models that includes the model with the highest generalization ability.

Here, several specific examples will be described with reference to FIGS. 2 and 3.
For example, variable data for 12 input variables $X_i$ ($i$ = 1 to 12) and an output variable $Y$ are stored in the variable database 122. In this case, the base model construction unit 102 selects some of the 12 input variables. The base model construction unit 102 creates a base model between the selected input variables and the output variable $Y$, expressed for example by the following equation (1). The base model construction unit 102 stores this base model in the model information storage unit 104.
$$Y = b_1 X_1 + b_2 X_2 + b_3 X_3 + b_0 \quad (1)$$

Next, the similarity calculation unit 106 calculates the similarity between each of the selected input variables $X_1$, $X_2$, and $X_3$ and each of the non-selected input variables $X_4$ to $X_{12}$, as shown in FIG. 2(a). FIG. 2(a) illustrates the case where the correlation coefficient is used as the similarity.

As a first method, the modified model construction unit 110 uses, for example, a preset threshold. For each selected input variable, the modified model construction unit 110 extracts at least one non-selected input variable whose similarity is equal to or greater than the threshold.
In the example shown in FIG. 2(b), the threshold is set to 80%, and the non-selected input variables with high similarity to each selected input variable are extracted. That is, in this example, variables $X_4$, $X_5$, and $X_6$ are extracted for variable $X_1$; variables $X_7$, $X_8$, and $X_9$ are extracted for variable $X_2$; and variables $X_{10}$, $X_{11}$, and $X_{12}$ are extracted for variable $X_3$. This creates a plurality of sets, each consisting of one selected input variable and the non-selected input variables that are highly similar to it. In the example shown in FIG. 2(a), the similarity of the non-selected input variable $X_{12}$ is 80% or more with respect to both selected input variables $X_1$ and $X_3$. In this case, the non-selected input variable $X_{12}$ is assigned, for example, to $X_3$, the selected input variable with which it has the higher similarity.
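The threshold-based grouping, including the rule that a non-selected variable exceeding the threshold for several selected variables goes to the most similar one, might be sketched as below. The function name and the similarity values are hypothetical; the values are not those of FIG. 2(a), which is not reproduced in the text.

```python
def group_by_threshold(similarity, threshold):
    """Group non-selected variables under their most similar selected variable.

    similarity: {selected_name: {non_selected_name: similarity value}}
    Only pairs with similarity >= threshold are kept, and each non-selected
    variable is assigned to exactly one selected variable.
    """
    best = {}  # non-selected name -> (selected name, similarity)
    for sel, row in similarity.items():
        for non_sel, value in row.items():
            if value < threshold:
                continue
            if non_sel not in best or value > best[non_sel][1]:
                best[non_sel] = (sel, value)

    groups = {sel: [] for sel in similarity}
    for non_sel, (sel, _) in best.items():
        groups[sel].append(non_sel)
    return groups


# Hypothetical similarities; X12 exceeds the threshold for both X1 and X3.
similarity = {
    "X1": {"X4": 0.91, "X7": 0.30, "X12": 0.82},
    "X2": {"X4": 0.10, "X7": 0.88, "X12": 0.20},
    "X3": {"X4": 0.25, "X7": 0.15, "X12": 0.93},
}
groups = group_by_threshold(similarity, 0.80)
```

With these values, `X12` ends up only in the group of `X3`, mirroring the assignment rule described for the variable that is similar to two selected variables.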

For each set, the modified model construction unit 110 swaps the selected input variable with a non-selected input variable, for example with uniform probability. The modified model construction unit 110 constructs a modified model based on the resulting group of selected and non-selected input variables, and stores this modified model in the model information storage unit 104. For example, suppose that in the example shown in FIGS. 2(a) and 2(b), variables $X_1$ and $X_3$ are left unchanged and variable $X_2$ is replaced with variable $X_7$. In this case, the modified model construction unit 110 constructs the modified model expressed by the following equation (2) from these input variables and stores it in the model information storage unit 104.
$$Y = b_5 X_1 + b_6 X_7 + b_7 X_3 + b_4 \quad (2)$$
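One way to read the uniform swap is sketched below: for each set, the selected variable and its similar non-selected variables are treated as equally likely candidates. Treating "keep the original variable" as one of the equally likely outcomes is an assumption of this sketch, as are the function and variable names; the grouping itself follows the FIG. 2(b) example.

```python
import random


def sample_variable_set(groups, rng):
    """Draw one input-variable group for a modified model.

    groups: {selected_name: [similar non-selected names]}
    For every set, the selected variable and its alternatives are drawn
    with equal probability (assumption: no swap is one of the outcomes).
    """
    return [rng.choice([sel] + alternatives) for sel, alternatives in groups.items()]


# Sets from the FIG. 2(b) example.
groups = {
    "X1": ["X4", "X5", "X6"],
    "X2": ["X7", "X8", "X9"],
    "X3": ["X10", "X11", "X12"],
}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
variable_sets = [sample_variable_set(groups, rng) for _ in range(5)]
```

Each drawn list, such as one containing `X1`, `X7`, and `X3`, would then be fitted against the output variable to obtain a modified model of the same form as equation (2).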

As a second method, the modified model construction unit 110 sets probabilities based on the similarities of the non-selected input variables. According to these probabilities, the modified model construction unit 110 swaps at least one selected input variable with at least one non-selected input variable. FIG. 2(c) shows the similarity results of FIG. 2(a) with the non-selected input variables arranged, for each selected input variable, in descending order of similarity. Using the similarity of each non-selected input variable, the probability $P_{jk}$ of swapping the selected input variable $X_j$ ($j$ = 1, 2, 3) with the non-selected input variable $X_k$ ($k$ = 4 to 12) is set, for example, as in equation (3). Here, $\alpha$ is a value set to provide for the probability that no swap is made.
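Equation (3) is not reproduced in this text, so the following sketch does not implement it. It only illustrates the general idea under an explicitly assumed form: the probability of swapping in each non-selected variable is proportional to its similarity, scaled so that a mass of `alpha` remains for making no swap. The function names and similarity values are likewise hypothetical.

```python
import random


def swap_probabilities(similarities, alpha):
    """Assumed form, not equation (3): P_k = (1 - alpha) * s_k / sum(s).

    similarities: {non_selected_name: similarity to one selected variable}
    alpha: probability mass reserved for keeping the selected variable.
    """
    total = sum(similarities.values())
    if total == 0:
        return {name: 0.0 for name in similarities}
    return {name: (1.0 - alpha) * s / total for name, s in similarities.items()}


def draw_replacement(selected, similarities, alpha, rng):
    """Return the variable to use in place of `selected` (possibly itself)."""
    probs = swap_probabilities(similarities, alpha)
    r = rng.random()
    cumulative = 0.0
    for name, p in probs.items():
        cumulative += p
        if r < cumulative:
            return name
    return selected  # the remaining probability mass means no swap


# Hypothetical similarities of three non-selected variables to X1.
similarities_to_x1 = {"X4": 0.9, "X5": 0.6, "X6": 0.3}
probs = swap_probabilities(similarities_to_x1, alpha=0.25)
choice = draw_replacement("X1", similarities_to_x1, 0.25, random.Random(1))
```

Under this assumed form, more similar variables are swapped in more often, which is the qualitative behaviour the second method is described as having.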

The modified model construction unit 110 swaps selected and non-selected input variables according to the probabilities given by equation (3). The modified model construction unit 110 then constructs a modified model in the same manner as equation (2) and stores it in the model information storage unit 104. With this method, the similarities are reflected more faithfully in the modified models than with the first method. Consequently, modified models are more likely to be constructed from combinations of input variables $X$ that give a smaller prediction error for the output variable.

Alternatively, as a third method, the modified model construction unit 110 may construct modified models using design of experiments. Specifically, the modified model construction unit 110 first extracts, for each selected input variable, the non-selected input variables with high similarity, as shown in FIG. 2(b). Next, using design of experiments, the modified model construction unit 110 creates an orthogonal array as shown in FIG. 3(a) and constructs modified models one after another based on this orthogonal array. The generalization ability calculation unit 112 calculates the generalization ability (MSE) of each modified model constructed from the orthogonal array, as shown in FIG. 3(b). The modified model construction unit 110 refers to the calculated generalization abilities and calculates the main effect of each variable swap. Then, so that the generalization ability becomes highest, the modified model construction unit 110 replaces at least some of the selected input variables with at least one non-selected input variable having the largest main effect and constructs a modified model. The modified model construction unit 110 outputs this modified model to the outside.

In this method, when the orthogonal array has been created, the modified model construction unit 110 may determine whether the number of modified models to be constructed from the orthogonal array is equal to or less than the specified number. If it is, the modified models are constructed and the main effects are calculated as described above. If the number of modified models to be constructed exceeds the specified number, the model construction system 1, for example, outputs an error from the external output unit 114 or switches to the first or second method to construct the modified models.
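The main-effect calculation can be illustrated with a deliberately small case. The sketch below assumes three selected variables with one alternative each, so that the standard two-level L4 orthogonal array (four runs) applies; the example in FIG. 2(b) has more alternatives per variable and would need a larger array. The MSE values, the factor table, and the function names are hypothetical.

```python
# Standard L4(2^3) orthogonal array: 4 runs, 3 two-level factors.
# Level 0 = keep the selected variable, level 1 = swap in the alternative.
L4 = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]

# Hypothetical factors: (selected variable, similar non-selected variable).
factors = [("X1", "X4"), ("X2", "X7"), ("X3", "X12")]

# Hypothetical MSE on unknown data for the model built in each run.
mse_per_run = [0.40, 0.30, 0.44, 0.36]


def level_means(array, responses, n_levels=2):
    """Mean response at each level of each factor (basis of the main effect)."""
    means = []
    for col in range(len(array[0])):
        per_level = []
        for level in range(n_levels):
            values = [r for row, r in zip(array, responses) if row[col] == level]
            per_level.append(sum(values) / len(values))
        means.append(per_level)
    return means


def best_variables(factors, means):
    """For each factor, pick the level with the lowest mean MSE."""
    chosen = []
    for names, per_level in zip(factors, means):
        best_level = min(range(len(per_level)), key=lambda lv: per_level[lv])
        chosen.append(names[best_level])
    return chosen


means = level_means(L4, mse_per_run)
final_variables = best_variables(factors, means)

# Check against the specified number before building the models.
specified_number = 10
within_limit = len(L4) <= specified_number
```

Because the array is orthogonal, each level of each factor appears equally often, so the level means can be compared directly without fitting every possible combination of variables.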

FIG. 4 is a flowchart showing an example of the model construction method according to the embodiment.
FIG. 5 is a flowchart showing another example of the model construction method according to the embodiment.
The flowchart in FIG. 4 corresponds to the first and second methods described with reference to FIGS. 2(a) to 2(c). The flowchart in FIG. 5 corresponds to the third method described with reference to FIG. 3.

First, the flowchart in FIG. 4 will be described.
The acquisition unit 100 acquires the specified number and the variable data from the specified number database 120 and the variable database 122 (step S1). The base model construction unit 102 selects some of the plurality of input variables and constructs the base model (step S2). The base model construction unit 102 stores the model information of the constructed base model in the model information storage unit 104 (step S3).

The similarity calculation unit 106 calculates the similarity between each of the selected input variables chosen for constructing the base model and each of the non-selected input variables that were not chosen (step S4). The similarity calculation unit 106 stores the calculated similarities between these variables in the similarity information storage unit 108 (step S5). The modified model construction unit 110 swaps at least one selected input variable with a non-selected input variable that is highly similar to it, and constructs a modified model based on the input variable group after the swap (step S6).

The modified model construction unit 110 stores the model information of the constructed modified model in the model information storage unit 104 (step S7). The modified model construction unit 110 determines whether the number of constructed models has reached the specified number acquired in step S1 (step S8). If not, steps S6 and S7 are repeated until the specified number is reached.

When the number of constructed models reaches the specified number, the generalization ability calculation unit 112 acquires, from the variable database 122, the variable data for calculating the generalization ability of the constructed models (step S9). The generalization ability calculation unit 112 also acquires the model information of the base model and the modified models from the model information storage unit 104 and calculates the generalization ability of each model (step S10). The external output unit 114 selects a model with high generalization ability and outputs it to the outside (step S11).
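The flow of FIG. 4 can be sketched end to end as follows. This is an illustration under several assumptions: models are fitted by ordinary least squares through the normal equations, variables are swapped with uniform probability, MSE on held-out data stands in for the generalization ability, and the synthetic data (in which `X1` drifts between the two data sets while `X4` does not) is invented for the example. All names are hypothetical.

```python
import random


def fit_linear(rows, y):
    """Least-squares fit via the normal equations; returns [b0, b1, ...]."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    # Augmented matrix [A^T A | A^T y].
    m = [[sum(d[i] * d[j] for d in design) for j in range(p)]
         + [sum(d[i] * t for d, t in zip(design, y))] for i in range(p)]
    for col in range(p):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, p), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(p):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [m[i][p] / m[i][i] for i in range(p)]


def predict(coeffs, rows):
    return [coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], r)) for r in rows]


def mse(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def columns(data, names):
    """Rows of values for the named variables."""
    return [list(row) for row in zip(*(data[n] for n in names))]


def build_and_score(train, holdout, target, groups, specified_number, rng):
    selected = list(groups)
    candidates = [selected]  # base model (steps S2-S3)
    while len(candidates) < specified_number:  # steps S6-S8
        candidates.append([rng.choice([s] + groups[s]) for s in selected])
    scored = []
    for names in candidates:  # steps S9-S10
        coeffs = fit_linear(columns(train, names), train[target])
        err = mse(holdout[target], predict(coeffs, columns(holdout, names)))
        scored.append((err, names, coeffs))
    return scored


def make_data(n, rng, drift):
    """Synthetic data; X1 is offset by `drift`, X4 measures the same quantity."""
    z1 = [rng.uniform(0, 10) for _ in range(n)]
    z2 = [rng.uniform(0, 10) for _ in range(n)]
    return {
        "X1": [v + rng.gauss(0, 0.1) + drift for v in z1],
        "X2": [v + rng.gauss(0, 0.1) for v in z2],
        "X4": [v + rng.gauss(0, 0.2) for v in z1],
        "Y": [2.0 * a + 3.0 * b + rng.gauss(0, 0.1) for a, b in zip(z1, z2)],
    }


rng = random.Random(0)
train = make_data(40, rng, drift=0.0)
holdout = make_data(40, rng, drift=2.0)
groups = {"X1": ["X4"], "X2": []}  # hypothetical similarity groups
scored = build_and_score(train, holdout, "Y", groups, 6, rng)
best = min(scored, key=lambda s: s[0])  # step S11: lowest held-out MSE
```

In this toy setting, a model that uses `X4` instead of the drifting `X1` would be expected to score better on the held-out data, which is the situation the embodiment aims to detect; the random draw decides which candidate models are actually tried.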

Next, the flowchart in FIG. 5 will be described.
Steps S1 to S5 are executed in the same manner as steps S1 to S5 of the flowchart in FIG. 4. The modified model construction unit 110 creates an orthogonal array based on the similarities stored in the similarity information storage unit 108 (step S6). The modified model construction unit 110 determines whether the number of modified models to be created from the orthogonal array is equal to or less than the specified number (step S7). If the number of modified models exceeds the specified number, construction of modified models using design of experiments is terminated. If the number of modified models is equal to or less than the specified number, the modified model construction unit 110 constructs the modified models based on the orthogonal array (step S8).

The modified model construction unit 110 stores the model information of the constructed modified models in the model information storage unit 104 (step S9). The generalization ability calculation unit 112 acquires, from the variable database 122, the variable data for calculating the generalization ability of the constructed models (step S10). The generalization ability calculation unit 112 also acquires the model information of the base model and the modified models from the model information storage unit 104 and calculates the generalization ability of each model (step S11). The generalization ability calculation unit 112 refers to the calculated generalization abilities and calculates the main effect of each variable swap (step S12). The modified model construction unit 110 replaces at least some of the selected input variables with at least one non-selected input variable having the largest main effect and constructs another modified model (step S13). The external output unit 114 outputs the modified model constructed in step S13 to the outside as the model with the highest generalization ability (step S14).

FIG. 6 is a block diagram illustrating the configuration of a model construction device 2 for realizing the model construction system 1 according to the embodiment.
The model construction device 2 includes, for example, an input device 200, an output device 202, and a computer 204. The computer 204 includes, for example, a ROM (Read Only Memory) 206, a RAM (Random Access Memory) 208, a CPU (Central Processing Unit) 210, and an HDD (Hard Disk Drive) 212 serving as a storage device.

The input device 200 allows a user to input information to the model construction device 2. The input device 200 is, for example, a keyboard or a touch panel.
The output device 202 outputs the results obtained by the model construction system 1 to the user. The output device 202 is, for example, a display or a printer.

The ROM 206 stores a program that controls the operation of the model construction device 2. Specifically, the ROM 206 stores the program required for the computer 204 to function as the acquisition unit 100, the base model construction unit 102, the similarity calculation unit 106, the deformation model construction unit 110, the generalization ability calculation unit 112, and the external output unit 114 illustrated in FIG. 1.

The RAM 208 functions as a storage area into which the program stored in the ROM 206 is loaded. The CPU 210 reads the control program stored in the ROM 206 and controls the operation of the computer 204 according to that program. The CPU 210 also loads various data obtained through the operation of the computer 204 into the RAM 208.

The HDD 212 stores the specified number database 120 and the variable database 122 illustrated in FIG. 1. The HDD 212 also functions as the model information storage unit 104 and the similarity information storage unit 108, in which the constructed models and the calculated similarities are stored.

The effects of the embodiment described above are as follows.
In the model construction system 1 according to the present embodiment, the base model construction unit 102 first constructs, using an input variable group that includes a plurality of selected input variables, a base model that can predict the output variable with high accuracy. The deformation model construction unit 110 then replaces at least a part of the selected input variables with at least a part of the non-selected input variables, based on the similarity between each selected input variable and each non-selected input variable. This generates another input variable group, which is used to construct a deformation model. Because the replacement is guided by the similarity, the deformation model constructed from the other input variable group can also predict the output variable with relatively high accuracy. The generalization ability calculation unit 112 then calculates the generalization ability of the base model and of each deformation model. The model for which the highest generalization ability is calculated can, as described above, predict the output variable with relatively high accuracy.
That is, according to the present embodiment, a model with high generalization ability can be constructed while suppressing a decrease in accuracy.
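
As a non-limiting illustration, the overall flow summarized above (generate deformation variable groups up to the specified number, then keep the group with the highest generalization ability) might be sketched in Python as follows. The names `build_and_select`, `propose`, `score`, and `max_attempts`, as well as the toy scores, are hypothetical and are not taken from the embodiment; model fitting itself is abstracted behind the `score` callback.

```python
import itertools


def build_and_select(base_group, propose, score, specified_number, max_attempts=1000):
    """Return the variable group with the highest generalization score.

    propose(base_group) -> a candidate variable group (assumed helper)
    score(group)        -> generalization ability of the model built on it (assumed helper)
    """
    groups = [tuple(base_group)]  # index 0 holds the base model's variable group
    for _ in range(max_attempts):
        if len(groups) - 1 >= specified_number:  # enough deformation models built
            break
        candidate = tuple(propose(base_group))
        if candidate not in groups:  # ignore groups that were already generated
            groups.append(candidate)
    return max(groups, key=score)


# Toy usage with made-up generalization scores.
proposals = itertools.cycle([("x4", "x2"), ("x1", "x5")])
toy_scores = {("x1", "x2"): 0.50, ("x4", "x2"): 0.70, ("x1", "x5"): 0.60}
best = build_and_select(
    ("x1", "x2"), lambda base: next(proposals), toy_scores.get, specified_number=2
)
print(best)  # ('x4', 'x2')
```

The base group is kept in the candidate list so that the base model itself is returned when no deformation model scores higher.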

When the selected input variables are replaced with non-selected input variables, the non-selected input variables whose similarity is equal to or greater than a predetermined threshold are extracted, for example as illustrated in FIGS. 2A and 2B. The extracted non-selected input variables are then swapped in for the selected input variables with a certain probability. With this method, only non-selected input variables that are highly similar to the selected input variables are used to construct the deformation models, so a decrease in the accuracy of the deformation models can be suppressed.
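
The threshold-based replacement may be pictured with the following minimal Python sketch. It assumes the similarities are held in a nested dictionary and that each selected input variable is swapped with a fixed probability `swap_prob`; the function names, the dictionary layout, and `swap_prob` are illustrative assumptions, not details given in the embodiment.

```python
import random


def extract_candidates(similarity, threshold=0.5):
    """similarity[s][u] is the similarity between selected s and non-selected u."""
    return {
        s: [u for u, value in row.items() if value >= threshold]
        for s, row in similarity.items()
    }


def swap_uniform(selected, candidates, swap_prob=0.5, rng=None):
    """Swap each selected variable for one of its candidates, chosen uniformly."""
    rng = rng or random.Random()
    new_group = []
    for s in selected:
        # Never use the same non-selected variable twice in one group.
        pool = [u for u in candidates.get(s, []) if u not in new_group]
        if pool and rng.random() < swap_prob:
            new_group.append(rng.choice(pool))
        else:
            new_group.append(s)  # keep the original selected variable
    return new_group


# Made-up similarities between selected (x1, x2) and non-selected (x4, x5) variables.
similarity = {"x1": {"x4": 0.9, "x5": 0.3}, "x2": {"x4": 0.6, "x5": 0.7}}
candidates = extract_candidates(similarity)
print(candidates)  # {'x1': ['x4'], 'x2': ['x4', 'x5']}
print(swap_uniform(["x1", "x2"], candidates, rng=random.Random(0)))
```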

Alternatively, as illustrated in FIGS. 2A and 2C, a replacement probability based on the similarity may be set for every non-selected input variable, and at least a part of the selected input variables may be replaced with at least a part of the non-selected input variables according to these probabilities. The lower the similarity of a non-selected input variable, the lower the probability that a selected input variable is replaced with it. This method therefore also suppresses a decrease in the accuracy of the deformation models. In addition, because this method produces a wider variety of deformation models, a model with higher generalization ability can be constructed.
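
One possible way to realize such similarity-based probabilities is sketched below in Python. The embodiment does not state how a similarity is converted into a probability; here it is simply assumed that the probability is proportional to the (non-negative) similarity, with a hypothetical `keep_weight` reserving some probability for leaving the selected input variable unchanged.

```python
import random


def replacement_probabilities(row, keep_weight=1.0):
    """Map one selected variable's similarities to replacement probabilities.

    The key None stands for "keep the selected variable" (an assumed convention).
    """
    weights = {u: max(value, 0.0) for u, value in row.items()}
    total = keep_weight + sum(weights.values())
    probs = {u: w / total for u, w in weights.items()}
    probs[None] = keep_weight / total
    return probs


def sample_variable_group(similarity, rng=None):
    """Draw one input variable group according to the probabilities."""
    rng = rng or random.Random()
    group = []
    for s, row in similarity.items():
        probs = replacement_probabilities(row)
        options = list(probs)
        pick = rng.choices(options, weights=[probs[o] for o in options])[0]
        # Keep s when "no swap" is drawn or the pick is already in the group.
        group.append(s if pick is None or pick in group else pick)
    return group


# Made-up similarities between selected (x1, x2) and non-selected (x4, x5) variables.
similarity = {"x1": {"x4": 0.9, "x5": 0.3}, "x2": {"x4": 0.6, "x5": 0.7}}
probs_x1 = replacement_probabilities(similarity["x1"])
group = sample_variable_group(similarity, random.Random(0))
print(probs_x1, group)
```

Under this assumption a non-selected input variable with lower similarity always receives a lower replacement probability, which matches the behavior described above.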

Alternatively, as illustrated in FIGS. 3A and 3B, the variables may be replaced according to an orthogonal table and the generalization ability calculated for each resulting model. A deformation model is then constructed by replacing a part of the selected input variables with at least a part of the non-selected input variables so that the main effect is maximized. This method makes it possible to construct a model with even higher generalization ability while suppressing a decrease in accuracy. Moreover, deformation models need not be constructed for every combination of selected input variables and extracted non-selected input variables, so a deformation model with high generalization ability can be constructed efficiently and in a shorter time.
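
The main-effect calculation can be illustrated with a small Python sketch. It assumes a two-level L4 orthogonal table in which each column corresponds to one selected input variable (level 0: keep it, level 1: swap in its extracted non-selected input variable) and takes the main effect of a column as the difference between the mean generalization scores at level 1 and level 0. The table size, this definition of the main effect, and the scores are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations

# L4 orthogonal table: 4 runs, 3 two-level factors.
L4 = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]


def is_orthogonal(table):
    """Every pair of columns contains each level combination equally often."""
    for i, j in combinations(range(len(table[0])), 2):
        counts = Counter((row[i], row[j]) for row in table)
        if len(counts) != 4 or len(set(counts.values())) != 1:
            return False
    return True


def main_effects(table, scores):
    """Mean score with the swap applied minus mean score without it, per factor."""
    effects = []
    for j in range(len(table[0])):
        on = [s for row, s in zip(table, scores) if row[j] == 1]
        off = [s for row, s in zip(table, scores) if row[j] == 0]
        effects.append(sum(on) / len(on) - sum(off) / len(off))
    return effects


# Made-up generalization scores of the four deformation models (one per run).
scores = [0.60, 0.70, 0.66, 0.72]
effects = main_effects(L4, scores)
best_factor = max(range(len(effects)), key=effects.__getitem__)
print(is_orthogonal(L4), best_factor)  # True 1
```

With these numbers, only four models are evaluated instead of all eight swap combinations, and the second swap shows the largest main effect.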

Specific examples are described below.

(First Example)
In the first example, the output variable is the quality of a workpiece after processing in a manufacturing apparatus for electronic devices. The input variables are the data (temperature, pressure, and the like) from various sensors provided in the manufacturing apparatus. The specified number was set to 100. Adaptive Lasso was used to select the input variables and to construct the base model. The correlation coefficient between a selected input variable and a non-selected input variable was used as the similarity. Non-selected input variables with a correlation coefficient of 0.5 or more were extracted and swapped in for selected input variables with uniform probability. Multiple regression was used to construct the models after the swap. Each model was constructed from the variable data of a predetermined construction period T0. The generalization ability was calculated using the variable data of each of the test periods T1 to T5, which follow the construction period T0, for the same manufacturing apparatus.
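
For reference, a correlation-based similarity of the kind used in this example could be computed along the following lines. This is a minimal Python sketch using the Pearson correlation coefficient; the sensor names and readings are invented, and whether the example uses the signed coefficient or its absolute value is not stated, so the signed value is assumed here.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def similarity_table(data, selected):
    """similarity[s][u] for every selected s and every non-selected u."""
    non_selected = [name for name in data if name not in selected]
    return {
        s: {u: pearson(data[s], data[u]) for u in non_selected}
        for s in selected
    }


# Invented sensor readings over the construction period.
data = {
    "temp_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "temp_b": [2.0, 4.0, 6.0, 8.0, 10.0],
    "pressure": [5.0, 1.0, 4.0, 2.0, 3.0],
}
sim = similarity_table(data, selected=["temp_a"])
print(sim)
```

Here `temp_b` would pass the 0.5 threshold as a replacement candidate for `temp_a`, whereas `pressure` would not.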

FIG. 7 shows graphs illustrating the characteristics of models constructed using the model construction system 1 according to the embodiment.
FIG. 7A shows R² in each period. FIG. 7B shows the MSE in each period.
FIGS. 7A and 7B show only the base model and the deformation model with the highest generalization ability. The results for the base model are plotted as ○ (white circles), and the results for the deformation model with the highest generalization ability are plotted as ● (black circles).
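
The two metrics plotted in FIG. 7 can be computed as in the following Python sketch, which uses the standard definitions of the coefficient of determination (R²) and the mean squared error (MSE). The measured and predicted values are invented for illustration, and how the per-period values are combined into a single generalization ability is not specified here.

```python
def mse(y_true, y_pred):
    """Mean squared error between measured and predicted output values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Invented measured and predicted output values for one test period.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
print(round(r_squared(y_true, y_pred), 3), round(mse(y_true, y_pred), 3))  # 0.98 0.025
```

Evaluating both functions on each test period for the base model and for each deformation model yields the kind of per-period comparison shown in the figure.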

The results in FIGS. 7A and 7B show that the deformation model, like the base model, attains a high R² and a small MSE and therefore has good accuracy. The accuracy of both the base model and the deformation model decreases as the test period moves further into the future. In test periods T4 and T5, however, the accuracy of the deformation model declines more gradually than that of the base model, so the deformation model is the more accurate of the two. These results show that the deformation model obtained by the present embodiment has almost the same accuracy as the base model. They also show that the deformation model is more accurate than the base model on variable data covering a longer period and thus has high generalization ability.

(Second Example)
In the second example, the output variable is the quality of a workpiece after processing in a manufacturing apparatus for electronic devices. The input variables are the data (temperature during processing, pressure, and the like) from various sensors provided in the manufacturing apparatus. The quality is based on at least one of the dimensions of the processed workpiece and the processing rate of the workpiece. The specified number was set to 1000. Adaptive Lasso was used to select the input variables and to construct the base model. The correlation coefficient between a selected input variable and a non-selected input variable was used as the similarity. Non-selected input variables with a correlation coefficient of 0.5 or more were extracted and swapped in for selected input variables with uniform probability. Multiple regression was used to construct the models after the swap. Each model was constructed from the variable data of a predetermined construction period T10. The generalization ability was calculated using the variable data of each of the test periods T11 to T13, which follow the construction period T10, for the same manufacturing apparatus.

FIG. 8 shows graphs illustrating the characteristics of models constructed using the model construction system 1 according to the embodiment.
FIG. 8A shows the R² of each model in each period. FIG. 8B shows the MSE of each model in each period.
FIGS. 8A and 8B show only the base model and the deformation model with the highest generalization ability. The results for the base model are plotted as ○ (white circles), and the results for the deformation model with the highest generalization ability are plotted as ● (black circles).

The results in FIGS. 8A and 8B show that, at the time of construction, the R² and MSE of the base model are almost the same as the R² and MSE of the deformation model, respectively. That is, the accuracy of the deformation model is equivalent to that of the base model.
For the base model, R² decreases and the MSE increases as time elapses. For the deformation model, in contrast, the decrease in R² stops between periods T12 and T13, and the MSE decreases between periods T12 and T13. These results indicate that the deformation model has high accuracy and that its generalization ability is higher than that of the base model.

Several embodiments of the present invention have been described above. These embodiments are presented as examples and are not intended to limit the scope of the invention. These novel embodiments can be implemented in various other forms, and various omissions, replacements, and changes can be made without departing from the gist of the invention. These embodiments and their modifications are included in the scope and gist of the invention, and are included in the invention described in the claims and its equivalents. The embodiments described above can also be implemented in combination with one another.

DESCRIPTION OF SYMBOLS: 1 model construction system, 2 model construction device, 100 acquisition unit, 102 base model construction unit, 104 model information storage unit, 106 similarity calculation unit, 108 similarity information storage unit, 110 deformation model construction unit, 112 generalization ability calculation unit, 114 external output unit, 120 specified number database, 122 variable database

Claims

A model construction system comprising:
a base model construction unit that constructs a base model representing a relationship between selected input variables, selected from a plurality of input variables, and an output variable;
a similarity calculation unit that calculates a similarity between the selected input variables and non-selected input variables, which are the input variables other than the selected input variables among the plurality of input variables;
a deformation model construction unit that, based on the similarity, replaces at least a part of the selected input variables with the non-selected input variables and constructs a deformation model representing a relationship between the input variables after the replacement and the output variable; and
a generalization ability calculation unit that calculates a generalization ability of the base model and the deformation model.

The model construction system according to claim 1, wherein the deformation model construction unit extracts, for each of the selected input variables, the non-selected input variables having a similarity equal to or higher than a predetermined threshold, and constructs the deformation model by replacing at least a part of the selected input variables with the extracted non-selected input variables.

The model construction system according to claim 2, wherein the deformation model construction unit creates an orthogonal table by design of experiments from the selected input variables and the extracted non-selected input variables, and constructs a plurality of deformation models based on the orthogonal table,
the generalization ability calculation unit calculates the generalization ability of each of the deformation models, and
the deformation model construction unit calculates, from the calculated generalization abilities, a main effect of replacing each variable, and constructs a deformation model by replacing at least a part of the selected input variables with the non-selected input variable having the largest main effect.

The model construction system according to claim 1, wherein the deformation model construction unit
sets, for each of the selected input variables, a probability of replacement with each of the non-selected input variables based on the similarity, and
constructs the deformation model by replacing at least a part of the selected input variables with the non-selected input variables according to the probability.

5. The model construction system according to claim 1, further comprising an external output unit that outputs, to the outside, whichever of the base model and the deformation model has the highest calculated generalization ability.

A model construction method comprising:
constructing a base model representing a relationship between selected input variables, selected from a plurality of input variables, and an output variable;
calculating a similarity between the selected input variables and non-selected input variables, which are the input variables other than the selected input variables among the plurality of input variables;
based on the similarity, replacing at least a part of the selected input variables with the non-selected input variables and constructing a deformation model representing a relationship between the input variables after the replacement and the output variable; and
calculating a generalization ability of the base model and the deformation model.