JP7232122B2

JP7232122B2 - Physical property prediction device and physical property prediction method

Info

Publication number: JP7232122B2
Application number: JP2019089910A
Authority: JP
Inventors: 慶行但馬; 綱雄奥村; 智子大嶺; 晃弘近藤
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2019-05-10
Filing date: 2019-05-10
Publication date: 2023-03-02
Anticipated expiration: 2039-05-10
Also published as: JP2020187417A

Description

The present invention relates to a physical property prediction device and a physical property prediction method, and is suitably applied to devices and methods that predict physical properties in order to support design in fields such as materials informatics and the optimization of manufacturing conditions.

In recent years, the use of information technology (IT) and artificial intelligence (AI) in materials informatics, the optimization of manufacturing conditions, and similar fields has been expected to shorten design periods and lead to the discovery of new materials and new formulations. Among such uses of IT and AI, machine-learning-based prediction of physical properties has attracted high expectations.

For example, Patent Literature 1 discloses a physical property prediction device that predicts the physical properties of compounds. In that device, from learning samples for which parameter values relating to chemical structure and values for the prediction items have been registered in advance, the learning samples whose similarity is equal to or higher than a predetermined threshold are extracted to construct a sub-sample set. Data analysis is performed on the sub-sample set to create a prediction model, and the prediction model is applied to an unknown sample to calculate the value of the prediction item (the predicted value). Patent Literature 1 further discloses that, when the number of learning samples in the sub-sample set is small compared with the number of parameters in the final parameter set, the similarity threshold used to construct the sub-sample set is changed so as to increase the number of learning samples in the sub-sample set and thereby improve the reliability of the prediction model.

Japanese Patent No. 5083320

However, in the physical property prediction device disclosed in Patent Literature 1, information from learning samples (training data) whose similarity is below the threshold is not used in the sub-sample set, so the generalization error of the physical property prediction may become large. In addition, because the similarity must be calculated every time, the increased processing load may reduce processing capacity. Furthermore, because the degree of statistical risk (the degree of variation) in a prediction is not known, it is difficult to judge the reliability of the calculated predicted value.

The present invention has been made in view of the above points, and proposes a physical property prediction device and a physical property prediction method that, in physical property prediction, can suppress the generalization error of the prediction and allow the risk of the prediction to be managed.

To solve this problem, the present invention provides a physical property prediction device comprising: a model learning unit that learns models used for physical property prediction; a physical property prediction unit that performs the physical property prediction, using the models learned by the model learning unit, on an unknown input vector input as data of an unknown sample; and a display unit that displays the result of the physical property prediction. In this device, the model learning unit has: a representative vector selection unit that clusters the input set of the training data used to learn the models and selects a representative vector in each cluster; a base model learning unit that learns a base model for predicting physical property values using a first predetermined number of training data items in the neighborhood of each representative vector; and a correction model learning unit that, using a second predetermined number of training data items in the neighborhood of each representative vector, learns for each base model a correction model that predicts the additive inverse (the sign-reversed value) of the residual of that base model.
The physical property prediction unit has: a model search unit that searches, for the unknown input vector, for the base model and the correction model associated with a representative vector close to the unknown input vector; a base model prediction unit that calculates a base model predicted value as the value predicted by the base model for the unknown input vector; a correction model prediction unit that calculates a correction model predicted value as the value predicted by the correction model for the unknown input vector; and a prediction result determination unit that calculates a physical property predicted value for each physical property as the sum of the base model predicted value and the correction model predicted value multiplied by a predetermined constant, and that calculates, based on the correction model predicted value, a correction degree indicating the risk of that physical property predicted value. The display unit provides a prediction result display screen that displays at least the physical property predicted value and the correction degree calculated by the prediction result determination unit.

To solve the same problem, the present invention also provides a physical property prediction method comprising: a model learning step of learning models used for physical property prediction; a physical property prediction step of performing the physical property prediction, using the models learned in the model learning step, on an unknown input vector input as data of an unknown sample; and a display step of displaying the result of the physical property prediction. In this method, the model learning step has: a representative vector selection step of clustering the input set of the training data used to learn the models and selecting a representative vector in each cluster; a base model learning step of learning a base model for predicting physical property values using a first predetermined number of training data items in the neighborhood of each representative vector; and a correction model learning step of learning, for each base model, a correction model that predicts the additive inverse (the sign-reversed value) of the residual of that base model, using a second predetermined number of training data items in the neighborhood of each representative vector.
The physical property prediction step has: a model search step of searching, for the unknown input vector, for the base model and the correction model associated with a representative vector close to the unknown input vector; a base model prediction step of calculating a base model predicted value as the value predicted by the base model for the unknown input vector; a correction model prediction step of calculating a correction model predicted value as the value predicted by the correction model for the unknown input vector; and a prediction result determination step of calculating a physical property predicted value for each physical property by summing the base model predicted value calculated in the base model prediction step and the correction model predicted value calculated in the correction model prediction step multiplied by a predetermined constant, and of calculating, based on the correction model predicted value, a correction degree indicating the risk of that physical property predicted value. In the display step, a prediction result display screen is provided that displays at least the physical property predicted value and the correction degree calculated in the prediction result determination step.

According to the present invention, in physical property prediction, the generalization error of the prediction can be suppressed and the risk of the prediction can be managed.

The drawings show, in order:
A configuration diagram showing the system configuration and functional configuration of a physical property prediction device according to an embodiment of the present invention.
A configuration diagram showing the hardware configuration of the physical property prediction device shown in FIG. 1.
A flowchart outlining the procedure of the model learning processing.
A diagram showing an example of the data structure of the input set of experimental data.
A diagram showing an example of the data structure of the output set of experimental data.
A diagram showing a specific example of the correction degree adjustment screen.
A flowchart showing an example of the detailed procedure of the prediction model learning processing.
A flowchart showing an example of the procedure of the neighborhood definition processing.
A flowchart showing an example of the procedure of the physical property prediction processing.
A diagram showing an example of the data structure of the input set of unknown input vectors.
A diagram showing an example of the data structure of the prediction result data.
A diagram showing a specific example of the prediction result display screen.
A block diagram showing the process of calculating the physical property predicted values and the risk value in the basic form.
A block diagram showing the process of calculating the physical property predicted values and the risk value in a modified example.

Embodiments of the present invention are described in detail below with reference to the drawings.

(1) Configuration of the physical property prediction device
FIG. 1 is a configuration diagram showing the system configuration and functional configuration of a physical property prediction device according to an embodiment of the present invention. The physical property prediction device 1 according to this embodiment learns prediction models (base models and correction models) in a learning phase. In an operation phase, it performs physical property prediction using a suitable prediction model retrieved on the basis of input data of an unknown sample (an unknown input vector), and outputs the prediction result. The device is realized by, for example, a general-purpose computation server (and its peripheral devices).

As shown in FIG. 1, the physical property prediction device 1 is communicably connected via a network 3 to a terminal 2 used by an engineer. Here, "engineer" means a user who uses the physical property prediction device 1 by operating the terminal 2, and this term is used consistently in the following description. The network 3 is, for example, a LAN (Local Area Network), but is not limited to any particular network and may be wired or wireless. For example, the physical property prediction device 1 and the terminal 2 may be connected via the WWW (World Wide Web).

As described in detail later, the physical property prediction device 1 includes, as its functional configuration, a model learning unit 11, a physical property prediction unit 12, a display unit 13, and a data management unit 14. The model learning unit 11 has a representative vector selection unit 15, a base model learning unit 16, and a correction model learning unit 17. The physical property prediction unit 12 has a model search unit 18, a base model prediction unit 19, a correction model prediction unit 20, and a prediction result determination unit 21. These units may further have functional units that are not shown.

FIG. 2 is a configuration diagram showing the hardware configuration of the physical property prediction device shown in FIG. 1. As shown in FIG. 2, the physical property prediction device 1 includes a CPU (Central Processing Unit) 101, a ROM (Read Only Memory) 102, a RAM (Random Access Memory) 103, an external storage device 104, a communication I/F 105, an external input device 106, and an external output device 107. The external storage device 104 is a storage device such as an HDD (Hard Disk Drive) or SSD (Solid State Drive); the external input device 106 is an input device such as a keyboard or mouse; and the external output device 107 is an output device such as a display or printer.

Regarding the relationship between the hardware configuration shown in FIG. 2 and the functional configuration shown in FIG. 1, the data management unit 14 is realized by the CPU 101 controlling the reading and writing of data from and to the RAM 103 or the external storage device 104.

The other functional components of the physical property prediction device 1 are realized by the CPU 101 loading a program stored in the ROM 102 or the external storage device 104 into the RAM 103, executing the program while referring as necessary to data stored in the ROM 102, the RAM 103, or the external storage device 104, and controlling the communication I/F 105, the external input device 106, or the external output device 107.

In particular, the display unit 13 can, under the control of the CPU 101, produce output such as a screen display on the external output device 107, and can also provide output information to an external destination (for example, the terminal 2) via the network 3 in a predetermined output format (for example, a GUI: Graphical User Interface). In that case, the engineer can operate the terminal 2 to access the output information and perform various operations on the GUI, and the CPU 101 of the physical property prediction device 1 can accept the results of those operations and execute further processing. Although the following description refers to image display, this embodiment may instead use other forms of output (for example, printing or data output).

Note that part of the hardware configuration shown in FIG. 2 (specifically, the external storage device 104, the external input device 106, the external output device 107, and the like) need not be located inside the computation server of the physical property prediction device 1 and may be replaced by peripheral equipment or devices externally connected to the computation server. The hardware configuration of the terminal 2 can be regarded as the same as that of the physical property prediction device 1, so a detailed description is omitted.

Returning to FIG. 1, each functional component of the physical property prediction device 1 is described below.

In the physical property prediction device 1, the model learning unit 11 has a model learning function: it learns the prediction models (base models and correction models) by executing the processing of the learning phase (model learning processing).

Among the functional components of the model learning unit 11, the representative vector selection unit 15 has a function of clustering the input set of the training data to be learned in the model learning processing and selecting a representative vector for each cluster. The training data learned in the model learning processing is selected from past experimental data accumulated in the data management unit 14.

The base model learning unit 16 has a function of learning base models based on the representative vectors. Specifically, for each representative vector selected by the representative vector selection unit 15, the base model learning unit 16 learns a base model that predicts physical property values, using a predetermined number (K in this embodiment) of training data items in the neighborhood of that representative vector (base model training data). The base models learned by the base model learning unit 16 are registered in the data management unit 14.

The correction model learning unit 17 has a function of learning correction models. Specifically, for each representative vector selected by the representative vector selection unit 15, the correction model learning unit 17 uses a predetermined number (L in this embodiment) of training data items in the neighborhood of that representative vector (correction model training data) to learn, for each base model, a correction model that predicts the additive inverse (the sign-reversed value) of the residual of that base model. The correction models learned by the correction model learning unit 17 are registered in the data management unit 14.
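The three learning functions described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the embodiment's implementation: plain k-means with centroids as representative vectors, Euclidean distance, constant (mean-value) base and correction models, and the sign convention "residual = base prediction minus measured value" are all choices made here for brevity. The names `select_representatives`, `nearest_indices`, and `train_models` are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two input vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_vec(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return tuple(sum(col) / len(vectors) for col in zip(*vectors))

def select_representatives(xs, n_clusters, n_iter=20):
    """Plain k-means (an assumed clustering choice); centroids act as representative vectors."""
    reps = list(xs[:n_clusters])  # deterministic initialisation from the first points
    for _ in range(n_iter):
        groups = [[] for _ in reps]
        for x in xs:
            nearest = min(range(len(reps)), key=lambda i: dist(x, reps[i]))
            groups[nearest].append(x)
        # Keep the old representative if a cluster happens to be empty.
        reps = [mean_vec(g) if g else r for g, r in zip(groups, reps)]
    return reps

def nearest_indices(rep, xs, count):
    """Indices of the `count` training inputs closest to a representative vector."""
    return sorted(range(len(xs)), key=lambda i: dist(rep, xs[i]))[:count]

def train_models(xs, ys, reps, k, l):
    """Per representative: base model from K neighbours, correction model from L neighbours."""
    models = []
    for rep in reps:
        # Constant base model: mean measured value of the K nearest training items.
        base = sum(ys[i] for i in nearest_indices(rep, xs, k)) / k
        # Assumed sign convention: residual = base prediction - measured value,
        # so the correction target (its additive inverse) is measured - base.
        l_idx = nearest_indices(rep, xs, l)
        correction = sum(ys[i] - base for i in l_idx) / l
        models.append({"rep": rep, "base": base, "correction": correction})
    return models

xs = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
ys = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
reps = select_representatives(xs, n_clusters=2)
models = train_models(xs, ys, reps, k=2, l=3)
```

Using a different neighbour count for the correction model (L) than for the base model (K) lets the correction model see training items the base model did not, which is what allows it to estimate how far the base model is off in that neighborhood.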

The functional units of the model learning unit 11 described above are one example of a functional configuration for realizing a representative part of the model learning function. To realize other functions (processing) included in the model learning function, the model learning unit 11 may have further functional units that are not shown (or any of the functional units described above may perform them instead). Specifically, this other processing includes, for example, processing for receiving experimental data (step S11 in FIG. 3), processing for selecting training data from the experimental data (step S12 in FIG. 3), processing for preprocessing the training data (step S13 in FIG. 3), and processing for searching for a constant that minimizes the generalization error of the physical property predicted value (the generalization error minimization constant) (step S24 in FIG. 7); each is described in detail later. In the following description, processing performed by such unshown functional units is described with the model learning unit 11 as the processing entity.

In the physical property prediction device 1, the physical property prediction unit 12 has a physical property prediction function: by executing the processing of the operation phase (physical property prediction processing), it performs physical property prediction using a suitable prediction model retrieved on the basis of the unknown input vector, and outputs the prediction result to the display unit 13.

Among the functional components of the physical property prediction unit 12, the model search unit 18 has a function of searching for the models to be used for physical property prediction for the unknown input vector. Specifically, from among the base models and correction models registered in the data management unit 14, the model search unit 18 retrieves the base model (corresponding base model) and the correction model (corresponding correction model) associated with a representative vector close to the unknown input vector. Here, a model "associated with a representative vector" means a model that was learned in the learning phase on the basis of that representative vector (that is, using training data in the neighborhood of that representative vector).
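The model search can be sketched as a nearest-representative lookup. The sketch assumes Euclidean distance and returns only the single closest entry; the registry layout (a list of dictionaries with `rep`, `base`, and `correction` keys) and the function name `find_models` are hypothetical.

```python
import math

def find_models(unknown, registry):
    """Return the registered entry whose representative vector is closest to `unknown`."""
    return min(registry, key=lambda entry: math.dist(unknown, entry["rep"]))

# Hypothetical registry: one entry per representative vector, as stored after learning.
registry = [
    {"rep": (1.0,), "base": 1.5, "correction": 0.5},
    {"rep": (11.0,), "base": 10.5, "correction": 0.5},
]
entry = find_models((9.0,), registry)
```

Because only the representative vectors are compared with the unknown input vector, the lookup cost depends on the number of clusters rather than on the number of training items.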

The base model prediction unit 19 has a function of calculating, using the corresponding base model retrieved by the model search unit 18, the value predicted by the corresponding base model for the unknown input vector (base model predicted value).

The correction model prediction unit 20 has a function of calculating, using the corresponding correction model retrieved by the model search unit 18, the value predicted by the corresponding correction model for the unknown input vector (correction model predicted value).

The prediction result determination unit 21 has a function of determining the prediction result of the physical property prediction for the unknown input vector, based on the base model predicted value calculated by the base model prediction unit 19 and the correction model predicted value calculated by the correction model prediction unit 20. Specifically, the prediction result determination unit 21 first calculates a predicted value for each physical property (physical property predicted value) for the unknown input vector as the sum of the base model predicted value and the correction model predicted value multiplied by the generalization error minimization constant. The prediction result determination unit 21 also calculates the correction degree of each physical property predicted value from the absolute value of the correction model predicted value. The correction degree of a physical property predicted value indicates the degree of statistical risk (degree of variation) of the predicted value for that physical property. Furthermore, by performing predetermined arithmetic processing on the correction degrees calculated for the physical property predicted values, the prediction result determination unit 21 calculates, as a "risk value", the degree of risk (degree of variation) of the predicted values for the current physical property prediction as a whole.
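The calculation performed by the prediction result determination unit can be sketched as below. The per-property formula (base model predicted value plus the constant times the correction model predicted value) and the use of the absolute value for the correction degree follow the description above. The aggregation into a single risk value is only described as "predetermined arithmetic processing", so taking the maximum is an assumption of this sketch; the function and property names are hypothetical.

```python
def decide_prediction(base_preds, corr_preds, alpha):
    """Combine per-property model outputs into predictions, correction degrees, and a risk value.

    base_preds, corr_preds: dicts mapping a property name to the model output.
    alpha: the generalization error minimization constant.
    """
    predicted = {p: base_preds[p] + alpha * corr_preds[p] for p in base_preds}
    correction_degree = {p: abs(corr_preds[p]) for p in base_preds}
    # The aggregation is unspecified in the text; the maximum is an assumed placeholder.
    risk_value = max(correction_degree.values())
    return predicted, correction_degree, risk_value

predicted, degree, risk = decide_prediction(
    {"property_1": 10.0, "property_2": 2.0},
    {"property_1": -0.5, "property_2": 0.25},
    alpha=0.5,
)
```

A large correction degree means the correction model expects the base model to be far off for this input, which is why it can serve as a per-property indicator of prediction risk.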

The functional units of the physical property prediction unit 12 described above are one example of a functional configuration for realizing a representative part of the physical property prediction function. To realize other functions (processing) included in the physical property prediction function, the physical property prediction unit 12 may have further functional units that are not shown (or any of the functional units described above may perform them instead). Specifically, this other processing includes, for example, processing for receiving an unknown input vector (step S41 in FIG. 9). In the following description, processing performed by such unshown functional units is described with the physical property prediction unit 12 as the processing entity.

The display unit 13 has a function of displaying specified information in accordance with output instructions from the model learning unit 11 or the physical property prediction unit 12. Specifically, for example, in the physical property prediction processing the display unit 13 displays a prediction result display screen (see FIG. 13) showing the results of the physical property prediction, and in the model learning processing it displays a correction degree adjustment screen (see FIG. 6) on which settings such as the classification criteria for the correction degree levels on the prediction result display screen can be adjusted.

The data management unit 14 has a function of storing and managing the various data used in the physical property prediction device 1. FIGS. 4, 5, 10, and 11 show examples of data stored and managed by the data management unit 14.

(2) Learning phase (model learning processing)
The processing procedure in the learning phase of the physical property prediction device 1 is described with reference to FIG. 3 and other figures. FIG. 3 is a flowchart outlining the procedure of the model learning processing. As described above, the model learning processing is executed mainly by the units of the model learning unit 11.

According to FIG. 3, when experimental data is input by the engineer from the terminal 2, the model learning unit 11 first receives it and stores it in the data management unit 14 (step S11). Here, experimental data is data relating to experiments performed in the past, and can be divided into input information, which represents data such as the raw materials and conditions used in an experiment, and output information, which represents data on the physical properties measured in the experiment. In this embodiment, the data management unit 14 manages the input information and output information of the experiments collectively as an input set and an output set, respectively.

FIG. 4 shows an example of the data structure of the input set of experimental data, and FIG. 5 shows an example of the data structure of the output set of experimental data. The input set 110 shown in FIG. 4 comprises an experiment number 111 indicating the identifier of an experiment, raw materials 112 and 113 indicating the amount used (or, alternatively, the usage ratio or the like) of each raw material in that experiment, and conditions 114 and 115 indicating each condition of that experiment. The output set 120 shown in FIG. 5 comprises an experiment number 121, which like the experiment number 111 in FIG. 4 indicates the identifier of an experiment, and physical properties 122 to 126 indicating the value of each physical property measured in that experiment. The data structures shown in FIGS. 4 and 5 are examples, and the storage format of experimental data in this embodiment is not limited to them.
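One way to picture the input set and output set is as two record types joined on the experiment number. The sketch below is illustrative only: the class and field names (`InputRecord`, `OutputRecord`, `raw_materials`, `conditions`, `properties`) and all values are hypothetical placeholders, not data from the figures.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class InputRecord:
    """One row of the input set (hypothetical field names)."""
    experiment_no: int
    raw_materials: Tuple[float, ...]  # amount (or ratio) of each raw material
    conditions: Tuple[float, ...]     # each experimental condition

@dataclass(frozen=True)
class OutputRecord:
    """One row of the output set."""
    experiment_no: int
    properties: Tuple[float, ...]     # measured value of each physical property

def join_by_experiment(inputs, outputs):
    """Pair each input vector with its measured properties via the shared experiment number."""
    measured = {o.experiment_no: o.properties for o in outputs}
    return [
        (r.raw_materials + r.conditions, measured[r.experiment_no])
        for r in inputs
        if r.experiment_no in measured
    ]

inputs = [InputRecord(1, (0.6, 0.4), (180.0, 2.0)),
          InputRecord(2, (0.5, 0.5), (200.0, 1.5))]
outputs = [OutputRecord(1, (1.2, 3.4)), OutputRecord(2, (1.1, 3.9))]
pairs = join_by_experiment(inputs, outputs)
```

Keeping the two sets separate and joining on the experiment number mirrors the description, where the same identifier appears in both the input set and the output set.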

After step S11, the model learning unit 11 selects, from the experimental data accumulated in the data management unit 14 in step S11, the training data to be learned in the model learning processing (step S12). Which training data is selected in step S12 is arbitrary; for example, the selection may be made by the engineer. Because the training data is selected from the experimental data, the data structure of the input set of the training data can be regarded as the same as that of the input set 110 of experimental data shown in FIG. 4, and the data structure of the output set of the training data can be regarded as the same as that of the output set 120 of experimental data shown in FIG. 5.

Next, the model learning unit 11 preprocesses the training data selected in step S12 (step S13). The type of preprocessing executed in step S13 is not limited; examples include one-hot encoding or normalizing the categorical data (qualitative data) included in the training data.
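
As an illustration of the kind of preprocessing named in step S13, the following is a minimal Python sketch of one-hot encoding and of normalization. The function names, the choice of min-max scaling, and the application of normalization to a numeric column are assumptions made for this example; the embodiment does not prescribe a particular encoding or scaling.

```python
def one_hot(values):
    """Encode categorical values as one-hot vectors.

    Categories are sorted so that the column order is deterministic.
    """
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    return [
        [1.0 if index[v] == i else 0.0 for i in range(len(categories))]
        for v in values
    ]


def min_max_normalize(values):
    """Scale numeric values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

For example, a hypothetical categorical column such as a catalyst type would pass through `one_hot`, while a usage amount could pass through `min_max_normalize`.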

Next, the model learning unit 11 executes a prediction model learning process that trains the prediction models (base models and correction models) using the training data preprocessed in step S13 (step S14). Details are described later with reference to FIG. 7 and other figures; in the prediction model learning process, representative vectors are selected, the base models are trained, the correction models are trained, and a linear search is performed for the generalization error minimization constant (referred to as "constant α" in this example).

Next, based on the result of the prediction model learning process in step S14, the model learning unit 11 registers the representative vectors, the trained base models and correction models, and the generalization error minimization constants (constant α) in the data management unit 14 in association with one another (step S15).

Finally, the model learning unit 11 instructs the display unit 13 to display a correction degree adjustment screen, on which the classification criteria and display mode of the correction degree levels used on the prediction result display screen can be adjusted. If the engineer operates this screen, the model learning unit 11 makes the corresponding adjustments (step S16) and then ends the model learning process.

The correction degree adjustment screen is explained further here. A specific example is shown in FIG. 13. When a physical property prediction is performed in the operation phase, the physical property prediction device 1 according to the present embodiment can display the predicted values of the physical properties on the prediction result display screen in a manner that makes recognizable the degree of the correction degree (the correction degree level), which indicates the risk of the prediction. The correction degree adjustment screen is displayed in step S16 of the model learning process so that the engineer can set in advance how the "correction degree" is displayed on the prediction result display screen. In this example, displaying the predicted values of the physical properties on the prediction result display screen in a manner that makes the correction degree level recognizable is described in detail later; however, the present embodiment is not limited to this, and the correction degree itself may be displayed in a recognizable manner.

FIG. 6 is a diagram showing a specific example of the correction degree adjustment screen. As shown in FIG. 6, the correction degree adjustment screen 210 displays a distribution chart whose two axes are the average correction degree and the absolute value of the residual. As indicated by the legend 211, the display mode of the data in the distribution chart differs according to the correction degree level ("small correction degree", "medium correction degree", "large correction degree"). The correction degree adjustment screen 210 also provides slide bars 212 and 213 with which the classification criteria for the correction degree levels can be moved. By operating the slide bars 212 and 213, the engineer can freely adjust these classification criteria. In this example, a different pattern is displayed for each correction degree level, but the present embodiment is not limited to this; any display that allows the correction degree levels to be distinguished may be used. For example, the color or size may be varied, or these may be combined. The display mode may also be made selectable. The setting information adjusted on the correction degree adjustment screen 210 is registered in the data management unit 14 and reflected in the display settings of the prediction result display screen (see the prediction result display screen 220 in FIG. 12).

(2-1) Prediction model learning process
The prediction model learning process in step S14 of FIG. 3 is described in detail below. FIG. 7 is a flowchart showing an example of the detailed procedure of the prediction model learning process.

According to FIG. 7, the representative vector selection unit 15 of the model learning unit 11 first clusters the input set of the training data preprocessed in step S13 of FIG. 3 and selects a representative vector for each cluster (step S21). The selection of the representative vector is not limited to a specific method; for example, the k-means method may be used to select the vector indicating the centroid or center of a cluster as the representative vector of that cluster. Alternatively, methods such as hierarchical clustering or DBSCAN (Density-based Spatial Clustering of Applications with Noise) may be used to select the representative vectors.
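
The centroid-based selection mentioned for step S21 can be sketched as follows. This minimal Python example assumes that cluster labels have already been produced by k-means or another clustering method; the function name `cluster_centroids` and the list-based data layout are hypothetical and not taken from the embodiment.

```python
from collections import defaultdict


def cluster_centroids(vectors, labels):
    """Return {cluster label: centroid} for already-clustered input vectors.

    The centroid (component-wise mean) serves as the representative vector.
    """
    groups = defaultdict(list)
    for vector, label in zip(vectors, labels):
        groups[label].append(vector)
    return {
        label: [sum(column) / len(members) for column in zip(*members)]
        for label, members in groups.items()
    }
```

Note that a centroid is generally not one of the training points; a variant that instead picks the training point closest to the centroid would fit the same description.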

Next, for each of the representative vectors selected in step S21, the base model learning unit 16 of the model learning unit 11 trains a base model that predicts physical property values, using a predetermined number (K) of training data located in the neighborhood of that representative vector (the base model training data) (step S22).

An example of the detailed procedure of step S22 is as follows. In the first procedure, the base model learning unit 16 initializes the base model training data to an empty set. In the second procedure, the base model learning unit 16 adds training data belonging to the cluster to which the representative vector belongs to the base model training data. The training data belonging to that cluster are added in order of proximity to the representative vector, and when the predetermined number K is reached, the second procedure ends and processing proceeds to the fourth procedure. If, on the other hand, fewer than K training data have been added by the time the second procedure completes, processing proceeds to the third procedure to add further training data. In the third procedure, the base model learning unit 16 adds training data in the neighborhood of the representative vector that have not yet been added, regardless of whether they belong to the cluster of the representative vector, until the total reaches K, and then proceeds to the fourth procedure. In the fourth procedure, the base model is trained using the K base model training data gathered through the preceding procedures. By repeating this processing for each representative vector, the base model learning unit 16 can train the base model corresponding to each representative vector.
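
The first to third procedures of step S22 can be sketched as follows. This is a minimal Python example under stated assumptions: Euclidean distance stands in for the "neighborhood" metric, and the function name and the `(vector, cluster_id)` data layout are hypothetical.

```python
import math


def select_base_training_data(rep, rep_cluster, samples, k):
    """Pick up to k training samples for the base model of one representative.

    samples: list of (vector, cluster_id) pairs.
    Returns the indices of the selected samples.
    """
    def distance(i):
        return math.dist(samples[i][0], rep)

    # Procedure 1 and 2: start empty, then add same-cluster samples,
    # nearest first, stopping at k.
    in_cluster = sorted(
        (i for i, (_, c) in enumerate(samples) if c == rep_cluster),
        key=distance,
    )
    selected = in_cluster[:k]

    # Procedure 3: if still short of k, fill up with the nearest samples
    # not yet added, regardless of their cluster.
    if len(selected) < k:
        chosen = set(selected)
        others = sorted(
            (i for i in range(len(samples)) if i not in chosen),
            key=distance,
        )
        selected += others[: k - len(selected)]
    return selected
```

The fourth procedure would then fit any regression model to the selected samples; the same selection with L in place of K could supply the correction model training data.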

Next, for each of the representative vectors selected in step S21, the correction model learning unit 17 of the model learning unit 11 trains a correction model that predicts the additive inverse (the sign-reversed value) of the residual of the base model for that representative vector, using a predetermined number (L) of training data located in the neighborhood of that representative vector (the correction model training data) (step S23). A correction model that predicts the additive inverse of the base model's residual is, in other words, a correction model that cancels the deviation of the base model from the true value. To train such a correction model, the number L of training data included in the correction model training data is preferably larger than the number K of training data included in the base model training data.

Next, for each representative vector, the model learning unit 11 performs a linear search for the optimal generalization error minimization constant (constant α) that minimizes the generalization error of the predicted physical property values (step S24).

Step S24 is explained further here. In the present embodiment, the predicted physical property value is calculated from the predictions of the prediction models trained with the training data in the neighborhood of the representative vector, that is, from the predicted value of the base model and the predicted value of the correction model. The predicted physical property value is the sum of the predicted value of the base model and the predicted value of the correction model multiplied by the constant α. In step S24, the value of the constant α is searched for between 0 and 1 by a linear search so that the generalization error estimated with this predicted physical property value becomes as small as possible. Specifically, for example, the linear search is performed while referring to the output information, within the output set 120 (see FIG. 5) of measured physical property data, that corresponds to input information close to the representative vector of the base model and correction model; in this way the optimal constant α that minimizes the estimated generalization error can be found. The constant α found in this way is called the generalization error minimization constant.
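
The search in step S24 can be sketched as a one-dimensional grid search over α for the combined prediction (base prediction) + α × (correction prediction). The following minimal Python example makes several assumptions that the embodiment does not specify: the estimated generalization error is taken to be the mean squared error on reference points near the representative vector, α is sampled on an evenly spaced grid, and the function name `search_alpha` and the parameter `steps` are hypothetical.

```python
def search_alpha(base_preds, corr_preds, targets, steps=100):
    """Search alpha in [0, 1] minimizing the mean squared error of
    base + alpha * correction against measured target values.

    Returns (best_alpha, best_error).
    """
    best_alpha, best_error = 0.0, float("inf")
    for i in range(steps + 1):
        alpha = i / steps
        error = sum(
            (b + alpha * c - t) ** 2
            for b, c, t in zip(base_preds, corr_preds, targets)
        ) / len(targets)
        # Strict comparison keeps the smallest alpha among ties.
        if error < best_error:
            best_alpha, best_error = alpha, error
    return best_alpha, best_error
```

With α = 0 the correction model is ignored and with α = 1 it is applied in full, so the search selects how far the correction is trusted for each representative vector.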

When the processing of step S24 is complete, the model learning unit 11 ends the prediction model learning process.

(2-2) Definition of the neighborhood
In the prediction model learning process described above, the training data used for model training are selected in steps S22 and S23 on the basis of being in the "neighborhood" of the representative vector. In the physical property prediction device 1 according to the present embodiment, how this "neighborhood" is judged can be defined arbitrarily, for example by using known metric learning.

As a specific method, the neighborhood can be defined using, for example, a distance metric such as the Euclidean distance or the Mahalanobis distance, a metric defined over one or more variables (dimensions) that contribute strongly (to a predetermined degree or more) to the physical property (output), or a metric defined by a kernel function or the like.

As another specific method, the neighborhood can be defined using a metric obtained by distance metric learning, of which MLKR (Metric Learning for Kernel Regression) is a representative example. With this method, the distance metric is treated as variable, and the distance metric itself can be learned.

FIG. 8 is a flowchart showing an example of the procedure of the neighborhood definition process. The neighborhood definition process shown in FIG. 8 is an example of a metric that defines the neighborhood by means of distance metric learning. The process shown in FIG. 8 is executed by the model learning unit 11 (for example, by the base model learning unit 16 or the correction model learning unit 17), for example between step S13 and step S14 of FIG. 3. The model learning unit 11 may also be regarded as having an additional functional unit (a distance metric learning unit) capable of executing distance metric learning.

According to FIG. 8, the model learning unit 11 first reduces the dimensions of the data set for which the neighborhood is judged, based on the variable importance of a decision tree (step S31). Specifically, for example, the dimensions (raw materials and ingredients) of each training datum in the input set of the training data are reduced. The variable importance of a decision tree is given by, for example, the extent to which impurity decreases in the decision tree.

Next, the model learning unit 11 performs data transformation by MLKR as distance metric learning (step S32). MLKR yields the covariance matrix of a Mahalanobis distance that minimizes the generalization error from the standpoint of kernel linear regression. In step S32, this covariance matrix is decomposed into two matrices and the data are transformed using one of them, so that the Euclidean distance in the transformed space corresponds to the Mahalanobis distance on the original data defined by that covariance matrix. Accordingly, by using the metric obtained by the distance metric learning in step S32, that is, by judging the neighborhood based on the Euclidean distance in the transformed space, the model learning unit 11 can judge the neighborhood based on the Mahalanobis distance.
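
The equivalence used in step S32 can be checked numerically. In the following minimal Python sketch, the matrix `A` plays the role of one factor of the learned matrix, assuming a factorization of the form M = AᵀA; the values of `A`, `x`, and `y` are arbitrary illustration data, and no actual MLKR training is performed.

```python
import math


def matvec(a, x):
    """Multiply matrix a (list of rows) by vector x."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def gram(a):
    """Return M = A^T A for matrix a given as a list of rows."""
    cols = list(zip(*a))
    return [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]


def mahalanobis(x, y, m):
    """Distance sqrt((x - y)^T M (x - y)) for a given matrix M."""
    d = [xi - yi for xi, yi in zip(x, y)]
    return math.sqrt(sum(di * mdi for di, mdi in zip(d, matvec(m, d))))


A = [[2.0, 0.0], [1.0, 1.0]]   # hypothetical factor of the learned matrix
M = gram(A)                    # M = A^T A
x, y = [1.0, 2.0], [3.0, -1.0]

d_mahalanobis = mahalanobis(x, y, M)
d_euclidean = math.dist(matvec(A, x), matvec(A, y))
```

Both distances agree, which is why nearest-neighbor selection can run with the plain Euclidean distance once the data have been transformed by `A`.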

When the dimensions are reduced in step S31, principal component analysis (PCA) may be used. Alternatively, the dimensions may be reduced by MLKR.

As another example, only one of steps S31 and S32 shown in FIG. 8 may be performed.

In addition to the above, another metric or kernel function may be defined, and the neighborhood may be judged using the resulting distance or inner product.

(3) Operation phase (physical property prediction process)
The processing procedure in the operation phase of the physical property prediction device 1 is described with reference to FIG. 9 and other figures. FIG. 9 is a flowchart showing an example of the procedure of the physical property prediction process. As described above, the physical property prediction process is executed mainly by the components of the physical property prediction unit 12.

According to FIG. 9, when an engineer inputs an unknown input vector from the terminal 2 as the data of an unknown sample, the physical property prediction unit 12 first accepts it (step S41). In the present embodiment, the data management unit 14 manages the unknown input vectors as an input set in which each is associated with an experiment number. An "experiment" in the operation phase means an experiment in which physical properties are predicted for an unknown input vector, and differs from the experiments behind the experimental data input in the learning phase.

FIG. 10 is a diagram showing an example data structure of the input set of unknown input vectors. The input set 130 shown in FIG. 10 comprises an experiment number 131 indicating the identifier of an experiment, and the raw materials 132 and 133 and conditions 134 and 135 of the unknown sample input in that experiment.

After step S41, the model search unit 18 of the physical property prediction unit 12 searches for the models to be used for physical property prediction for the unknown input vector accepted in step S41 (hereinafter simply called the unknown input vector) (step S42). Specifically, the model search unit 18 refers to the data management unit 14, finds one representative vector close to the unknown input vector, and retrieves the base model (the corresponding base model) and the correction model (the corresponding correction model) associated with that representative vector.
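
The lookup in step S42 amounts to a nearest-neighbor search over the registered representative vectors. The following minimal Python sketch assumes Euclidean distance and a hypothetical registry layout (a list of dictionaries with the keys `rep`, `base`, `corr`, and `alpha`); neither is prescribed by the embodiment.

```python
import math


def find_models(unknown, registry):
    """Return the registry entry whose representative vector is closest
    to the unknown input vector.

    registry: list of dicts with keys 'rep', 'base', 'corr', 'alpha'.
    """
    return min(registry, key=lambda entry: math.dist(unknown, entry["rep"]))
```

If a learned metric is used to define the neighborhood, the unknown input vector would need to be transformed in the same way as the training data before this comparison.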

Next, the base model prediction unit 19 of the physical property prediction unit 12 uses the corresponding base model retrieved in step S42 to calculate the predicted value of the base model for the unknown input vector (the base model predicted value) (step S43).

Next, the correction model prediction unit 20 of the physical property prediction unit 12 uses the corresponding correction model retrieved in step S42 to calculate the predicted value of the correction model for the unknown input vector (the correction model predicted value) (step S44).

Next, the prediction result determination unit 21 of the physical property prediction unit 12 calculates the predicted value of each physical property for the unknown input vector (the predicted physical property value) based on the base model predicted value calculated in step S43 and the correction model predicted value calculated in step S44 (step S45). The predicted physical property value is the sum of the base model predicted value and the correction model predicted value multiplied by the generalization error minimization constant (constant α). The constant α has been found in the learning phase and registered in the data management unit 14 (see step S14 in FIG. 3 and step S24 in FIG. 7).

Next, the prediction result determination unit 21 calculates the correction degree for the predicted physical property value of each physical property calculated in step S43 (step S46). Specifically, the absolute value of each correction model predicted value calculated in step S44 is computed and used as the correction degree of the corresponding predicted physical property value. Through the processing of step S46, a predicted physical property value and its correction degree are thus obtained for each physical property, that is, in multiple dimensions.

Next, the prediction result determination unit 21 calculates the risk value for the current physical property prediction experiment as a whole, based on the elements of each dimension of the correction degree calculated in step S46 (that is, the correction degree of each physical property) (step S47). Specifically, for example, each element of the correction degree is multiplied by a constant and the sum of the resulting values is used as the risk value. The constant by which each element is multiplied can be set to, for example, the reciprocal of the standard deviation of the correction degree of that dimension over the training data. Alternatively, the engineer may be allowed to set it from the standpoint of importance, and the multiplier need not be limited to a constant. Through the processing of step S47, the degree of risk of the predicted values is calculated per experiment as the risk value.
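
Steps S45 to S47 can be sketched together as follows. This minimal Python example takes per-property lists of base model and correction model predictions; the function names are hypothetical, and the use of the population standard deviation for the weights is an assumption, since the embodiment does not say which estimator is meant.

```python
import statistics


def inverse_std_weights(training_degrees):
    """One weight per property: the reciprocal of the standard deviation
    of that property's correction degrees over the training data.

    training_degrees: one list of correction degrees per property.
    """
    return [1.0 / statistics.pstdev(column) for column in training_degrees]


def predict_with_risk(base_pred, corr_pred, alpha, weights):
    """Return (predicted values, correction degrees, risk value).

    base_pred, corr_pred, weights: one element per physical property.
    """
    # Step S45: base prediction plus alpha times the correction prediction.
    predictions = [b + alpha * c for b, c in zip(base_pred, corr_pred)]
    # Step S46: correction degree = absolute correction model prediction.
    degrees = [abs(c) for c in corr_pred]
    # Step S47: risk value = weighted sum of the correction degrees.
    risk = sum(w * d for w, d in zip(weights, degrees))
    return predictions, degrees, risk
```

Scaling by the inverse standard deviation puts properties with different units on a comparable footing before they are summed into a single risk value.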

The prediction result determination unit 21 then registers the result of the current physical property prediction in the data management unit 14, where the prediction result data are accumulated. Finally, the physical property prediction unit 12 instructs the display unit 13 to display the prediction result display screen based on the prediction results and other information registered in the data management unit 14 (step S48), and ends the physical property prediction process.

FIG. 11 is a diagram showing an example data structure of the prediction result data. The prediction result data 140 shown in FIG. 11 comprise an experiment number 141, which corresponds to the experiment number 131 in FIG. 10 and indicates the identifier of the physical property prediction experiment, the predicted physical property value and correction degree for each physical property calculated in that experiment (predicted values 142 and 144, correction degrees 143 and 145), and a risk value 146 per experiment. The data structure in FIG. 11 is an example, and the present embodiment is not limited to it.

FIG. 12 is a diagram showing a specific example of the prediction result display screen. On the prediction result display screen 220 shown in FIG. 12, the distribution chart in the area 223 displays, among the predicted physical property values of each dimension calculated in the physical property prediction process, the data for the predicted values of the physical properties (dimensions) specified in the combo boxes 221 and 222. The engineer can change the specified physical properties by operating the combo boxes 221 and 222. The points plotted on the distribution chart are displayed in different modes according to their correction degree level, following the legend 224, which makes the risk of each predicted value easy to identify. For the point selected on the distribution chart (shown enclosed by a broken line in the figure), the prediction result table 225 displays the predicted value and correction degree of each physical property, and the risk value table 226 displays the risk value.

Although not shown in FIG. 12, the unknown input vector used for the physical property prediction may be displayed, for example in a pop-up, when the cursor is placed on the selected point.

(4) Summary
As described above, the physical property prediction device 1 according to the present embodiment performs physical property prediction using a base model trained with a predetermined number (K) of training data in the neighborhood of the unknown input vector and a correction model that corrects the deviation of the base model using a predetermined number (L) of training data in the neighborhood of the unknown input vector. This makes it possible not only to calculate predicted physical property values with a suppressed generalization error but also to predict the risk of those values by means of the "correction degree" and the "risk value".

In detail, when calculating the predicted physical property value, the physical property prediction device 1 uses the optimal constant that minimizes the generalization error (the generalization error minimization constant), found in the learning phase by referring to actual data (training data), and calculates the predicted physical property value as the sum of the output of the base model (the base model predicted value) and the output of the correction model (the correction model predicted value) multiplied by the generalization error minimization constant. This can greatly reduce the generalization error of the predicted physical property value.

Using training data selected on the basis of the neighborhood when training each prediction model also contributes to reducing the generalization error of the predicted values.

More specifically, when selecting the base model training data and the correction model training data in the model learning process, the physical property prediction device 1 according to the present embodiment defines the "neighborhood" using one or more variables (dimensions) that contribute strongly to the physical property (output), a metric defined on a space obtained by compressing the input with principal component analysis or the like, a metric defined by a kernel function or the like, and so on. Because the training data are selected according to this definition of the "neighborhood" and the prediction models (base models and correction models) are trained on them, the generalization error of the predicted values output by these prediction models in the operation phase can be reduced. Furthermore, as described with reference to FIG. 8, the physical property prediction device 1 according to the present embodiment can also adopt a metric that defines the "neighborhood" from the result of learning the distance metric itself, by means of distance metric learning as typified by MLKR. As a result, the prediction models can be built (trained) with a more accurate distance metric, and a further reduction of the generalization error of the predicted physical property values can be expected.

Moreover, the physical property prediction device 1 quantifies and outputs (displays) the degree of risk (the degree of variation) of the predicted physical property values as a "correction degree" per physical property or a "risk value" per physical property prediction. Because this risk is made visible, the engineer can manage the risk of the physical property prediction.

Regarding the display of the prediction results, displaying the predicted physical property values in association with the correction degree, as in the distribution chart shown in the area 223 of the prediction result display screen 220 in FIG. 12, makes it easier for the engineer to identify the prediction results visually and enhances the support for risk management. In addition, by providing the correction degree adjustment screen 210 of FIG. 6, the display settings for this association can be adjusted to the engineer's needs, so improved serviceability can be expected.

In the physical property prediction device 1 according to the present embodiment, the learning phase, in which the prediction models are trained, is separate from the operation phase, in which physical properties are predicted, and the learning phase need only be executed when the experimental data (training data) are updated. That is, the prediction models do not have to be trained every time a physical property prediction is performed, so the overall processing load can be reduced and the processing speed improved.

Furthermore, in the physical property prediction device 1 according to the present embodiment, the number of models learned in the learning phase can be made smaller than the total number of experimental data (training data) used for model learning, so the model learning load can be expected to be lower than with the conventional method. Specifically, let N be the total number of experimental data (training data) used for model learning and M be the total number of clusters generated by clustering the training data. With one type of base model and one type of correction model, the number of models learned in the present embodiment is at most 2M, and the relationship N > 2M almost certainly holds. Moreover, the more training data each cluster groups together, the fewer models need to be learned.

(5) Modified example (multistage correction model)
A modified example of the physical property prediction device 1 according to the present embodiment will now be described. This modified example differs from the basic form of the present embodiment described above in that the correction model used in the physical property prediction process has a multistage configuration. The features of the modified example are described below in comparison with the basic form.

FIG. 13 is a block diagram showing the process of calculating the predicted physical property value and the risk value in the basic form. As shown in FIG. 13, in the physical property prediction process for the unknown input vector 310, the base model (base model 320) and the correction model (correction model 330) corresponding to the representative vector close to the unknown input vector 310 are retrieved, and these two models are used to calculate the predicted physical property value, the correction degree, and the risk value. In the basic form, the correction model 330 consists of a single stage.

The specific calculations in the configuration of FIG. 13 are as follows. The predicted physical property value is the sum of two terms: the base model prediction value, which the base model 320 outputs for the unknown input vector 310, and the product of the generalization error minimization constant (constant α) and the correction model prediction value, which the correction model 330 outputs for the same unknown input vector 310 (see step S45 in FIG. 9). The correction degree is the absolute value of the correction model prediction value (see step S46 in FIG. 9). The risk value is obtained by weighting the element in each dimension of the correction model prediction value and summing the weighted elements (see step S47 in FIG. 9).
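The three basic-form calculations can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch, not the implementation of the device: the function name `predict_basic`, the argument names, and the example numbers are assumptions. The text does not state whether the weights of the risk value are applied to signed values or to magnitudes, so the sketch weights the correction model prediction values as they are.

```python
import numpy as np


def predict_basic(base_pred, corr_pred, alpha, weights):
    """Combine one base model output and one correction model output.

    base_pred, corr_pred: per-property prediction values (1-D, same length).
    alpha: generalization error minimization constant.
    weights: hypothetical per-property weights for the risk value.
    """
    base_pred = np.asarray(base_pred, dtype=float)
    corr_pred = np.asarray(corr_pred, dtype=float)
    weights = np.asarray(weights, dtype=float)

    predicted = base_pred + alpha * corr_pred      # cf. step S45
    correction_degree = np.abs(corr_pred)          # cf. step S46
    risk_value = float(weights @ corr_pred)        # cf. step S47 (weighted sum)
    return predicted, correction_degree, risk_value


# Illustrative numbers only: two physical properties.
predicted, degree, risk = predict_basic(
    base_pred=[10.0, 2.0], corr_pred=[0.5, -1.0], alpha=0.5, weights=[1.0, 2.0]
)
```

A large correction degree for a property means the base model needed a large correction near the unknown sample, which is why the value can serve as a per-property risk indicator.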

In the modified example of the present embodiment, by contrast, the correction model 330 may have a multistage configuration. The stages can be built using, for example, boosting, which is known as a meta-algorithm in machine learning.

FIG. 14 is a block diagram showing the process of calculating the predicted physical property value and the risk value in the modified example. In FIG. 14, the correction model 330 consists of a plurality of correction models arranged in stages (first correction model 331, second correction model 332, and third correction model 333).

The method of creating the multistage correction models that make up the correction model 330 is not limited to any particular one. For example, in the learning phase, the correction model of each stage may be created in sequence in the manner of boosting: the second correction model 332 from the first correction model 331, the third correction model 333 from the second correction model 332, and so on.

Furthermore, in the learning phase, the number of training data used to learn the correction models may be increased stage by stage. For example, in the basic form, K training data are used to learn the base model 320 and L training data, where L is greater than K, are used to learn the correction model. In this modified example, when L training data are likewise used to learn the first correction model 331, more than L training data are used to learn the second correction model 332, and still more are used to learn the third correction model 333. More specifically, a correction model that predicts the additive inverse (negation) of the residual of the base model 320 is created as the first correction model 331; a correction model that predicts the additive inverse of the residual between the output of the base model 320 and a constant multiple of the prediction value of the first correction model 331 is created as the second correction model 332; and a correction model that predicts the additive inverse of the residual between the output of the base model 320 and a constant multiple of the prediction value of the second correction model 332 is created as the third correction model 333. In general, this creation process can be repeated up to an N-th correction model (N: a natural number of 2 or more). However, because, as is generally known, overfitting becomes more likely as the amount of data used for learning increases, the prediction value output by each correction model in the multistage configuration (the correction model prediction value) is preferably multiplied by a learning coefficient predetermined for that correction model.
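The stage-by-stage creation described above can be sketched as a boosting-style loop. The sketch rests on several assumptions that are not in the text: the residual is taken as (prediction − measured value), each stage is fitted to the negated residual of the running sum of the base model and all earlier scaled correction models, every model is an ordinary least-squares linear model, and "vicinity" means Euclidean distance to the representative vector. The names `fit_linear`, `fit_multistage`, `k_base`, and `stage_sizes` are hypothetical.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares linear model with intercept; returns a predict function."""
    A = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Z: np.hstack([Z, np.ones((len(Z), 1))]) @ coef


def fit_multistage(X, y, rep, k_base, stage_sizes, alphas):
    """Fit a base model and staged correction models around one representative vector.

    k_base: number of nearest training data for the base model (K).
    stage_sizes: increasing numbers of nearest training data per stage (L, ...).
    alphas: constant applied to each stage's prediction value.
    """
    order = np.argsort(np.linalg.norm(X - rep, axis=1))  # nearest first
    Xs, ys = X[order], y[order]
    base = fit_linear(Xs[:k_base], ys[:k_base])
    stages = []

    def ensemble(Z):
        # Base output plus the scaled outputs of the stages fitted so far.
        out = base(Z)
        for a, model in zip(alphas, stages):
            out = out + a * model(Z)
        return out

    for n in stage_sizes:
        Xn, yn = Xs[:n], ys[:n]
        target = -(ensemble(Xn) - yn)  # additive inverse of the residual
        stages.append(fit_linear(Xn, target))
    return base, stages, ensemble


# Illustrative synthetic data only.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(40, 2))
y = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2
base, stages, ensemble = fit_multistage(
    X, y, rep=X.mean(axis=0), k_base=10, stage_sizes=[20, 30, 40],
    alphas=[0.5, 0.5, 0.5],
)
```

Scaling each stage by a constant below 1 plays the same role as shrinkage in common boosting libraries: each stage removes only part of the remaining residual.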

The method of calculating the predicted physical property value and related values in the modified example shown in FIG. 14 is described in detail below.

The predicted physical property value is the sum of four terms, each obtained with the unknown input vector 310 as input: the base model prediction value output by the base model 320; the correction model prediction value output by the first correction model 331 multiplied by the constant α1; the correction model prediction value output by the second correction model 332 multiplied by the constant α2; and the correction model prediction value output by the third correction model 333 multiplied by the constant α3.

Here, the constants α1, α2, and α3 are each determined as "generalization error minimization constant (constant α) × learning coefficient" and take values between 0 and 1. The learning coefficient for each correction model is set to a predetermined value between 0 and 1 so as to prevent overfitting of the prediction model as a whole.

The correction degree is the sum, over the correction models of all stages (first correction model 331 to third correction model 333), of the absolute value of each correction model prediction value multiplied by the corresponding learning coefficient.

The risk value is calculated by first, for the correction model of each stage (first correction model 331 to third correction model 333), weighting the element in each dimension of the correction model prediction value and summing the weighted elements, and then taking the total of these per-stage sums.
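The multistage calculations can be sketched in the same way as the basic form. Again this is an illustrative sketch under assumptions: the function name `predict_multistage`, the argument layout (one row of `stage_preds` per stage, one column per physical property), and the example numbers are hypothetical, and the risk value weights the correction model prediction values as they are.

```python
import numpy as np


def predict_multistage(base_pred, stage_preds, general_alpha, learning_rates, weights):
    """Combine a base model output with the outputs of staged correction models.

    stage_preds: array of shape (stages, properties).
    general_alpha: generalization error minimization constant (constant alpha).
    learning_rates: learning coefficient of each stage, each between 0 and 1.
    weights: hypothetical per-property weights for the risk value.
    """
    base_pred = np.asarray(base_pred, dtype=float)
    stage_preds = np.asarray(stage_preds, dtype=float)
    rates = np.asarray(learning_rates, dtype=float)
    weights = np.asarray(weights, dtype=float)

    alphas = general_alpha * rates                     # alpha_k = alpha x learning coefficient
    predicted = base_pred + alphas @ stage_preds       # base + sum_k alpha_k * correction_k
    correction_degree = rates @ np.abs(stage_preds)    # sum_k rate_k * |correction_k|
    risk_value = float(np.sum(stage_preds @ weights))  # total of per-stage weighted sums
    return predicted, correction_degree, risk_value


# Illustrative numbers only: three stages, two physical properties.
predicted, degree, risk = predict_multistage(
    base_pred=[10.0, 2.0],
    stage_preds=[[0.4, -1.0], [0.2, 0.5], [-0.1, 0.25]],
    general_alpha=0.5,
    learning_rates=[1.0, 0.5, 0.25],
    weights=[1.0, 1.0],
)
```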

As in the basic form, the engineer can check the predicted physical property value, correction degree, and risk value calculated in this way on a prediction result display screen such as that in FIG. 12.

As a further variation of this modified example, when a plurality of multistage correction models is available for the physical property prediction process, the correction models used for the prediction may be varied.
Specifically, for example, in FIG. 14, the predicted physical property value, the correction degree, and the risk value may be calculated and displayed for each of three cases: using only the first correction model 331; using the first correction model 331 and the second correction model 332 in combination; and using the first correction model 331 through the third correction model 333 in combination. Displaying the results for different combinations of correction models shows how the results change with the range of data used for the correction models, so the engineer can consider how local the prediction models used for the physical property prediction should be.
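Computing the results for each combination of stages amounts to taking cumulative sums over the scaled stage outputs. The following sketch shows this for the predicted physical property value only; the correction degree and risk value could be accumulated in the same manner. The function name `results_by_stage_count` and the example numbers are assumptions for illustration.

```python
import numpy as np


def results_by_stage_count(base_pred, stage_preds, alphas):
    """Return predicted values when using only the first k correction models.

    stage_preds: array of shape (stages, properties).
    alphas: constant applied to each stage's prediction value.
    The result maps k (1-based stage count) to the predicted values.
    """
    base_pred = np.asarray(base_pred, dtype=float)
    scaled = np.asarray(alphas, dtype=float)[:, None] * np.asarray(stage_preds, dtype=float)
    cumulative = np.cumsum(scaled, axis=0)  # row k-1 holds the sum of stages 1..k
    return {k + 1: base_pred + cumulative[k] for k in range(len(cumulative))}


# Illustrative numbers only: three stages, two physical properties.
by_k = results_by_stage_count(
    base_pred=[10.0, 2.0],
    stage_preds=[[0.4, -1.0], [0.2, 0.5], [-0.1, 0.25]],
    alphas=[0.5, 0.25, 0.125],
)
```

Showing the entries of `by_k` side by side lets the engineer see how much each additional stage, trained on a wider neighborhood, shifts the prediction.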

As described above, in this modified example, the correction model that corrects the deviation of the base model has a multistage configuration, and the predicted physical property value and related values are calculated from the prediction values of the multistage correction models. Because this allows the range of data used for the correction models to be varied, physical property prediction with a stronger learning model can be expected in addition to the effects obtained in the basic form. Multiplying by the learning coefficients when summing the correction values of the multistage correction models prevents overfitting. When the combination of correction models can be changed, the risk of the prediction (correction degree) can be adjusted and selected. As a result, this modified example can be expected to further suppress the generalization error of the prediction.

The present invention is not limited to the embodiment and modified example described above and includes various other modifications. For example, the embodiment has been described in detail to explain the present invention clearly, and the invention is not necessarily limited to configurations that include every element described. Part of the configuration of the embodiment may also have other elements added, deleted, or substituted.

Some or all of the configurations, functions, processing units, processing means, and the like described above may be implemented in hardware, for example by designing them as integrated circuits. They may also be implemented in software, with a processor interpreting and executing programs that realize the respective functions. Information such as the programs, tables, and files that realize each function can be stored in a memory, in a recording device such as a hard disk or an SSD, or on a recording medium such as an IC card, an SD card, or a DVD.

The control lines and information lines shown are those considered necessary for the explanation, and not all the control lines and information lines of a product are necessarily shown. In practice, almost all of the components may be considered to be interconnected.

1 physical property prediction device
2 terminal
3 network
11 model learning unit
12 physical property prediction unit
13 display unit
14 data management unit
15 representative vector selection unit
16 base model learning unit
17 correction model learning unit
18 model search unit
19 base model prediction unit
20 correction model prediction unit
21 prediction result determination unit
101 CPU
102 ROM
103 RAM
104 external storage device
105 communication I/F
106 external input device
107 external output device
210 correction degree adjustment screen
220 prediction result display screen
310 unknown input vector
320 base model
330 correction model
331 first correction model
332 second correction model
333 third correction model

Claims

A physical property prediction device comprising:
a model learning unit that learns models used for physical property prediction;
a physical property prediction unit that, for an unknown input vector input as data of an unknown sample, performs the physical property prediction using the models learned by the model learning unit; and
a display unit that displays a result of the physical property prediction,
wherein the model learning unit includes:
a representative vector selection unit that clusters an input set of training data used for learning the models and selects a representative vector in each cluster;
a base model learning unit that learns, for each representative vector, a base model that predicts physical property values, using a first predetermined number of the training data in the vicinity of the representative vector; and
a correction model learning unit that learns, for each base model, a correction model that predicts the additive inverse of the residual of the base model, using a second predetermined number of the training data in the vicinity of the representative vector,
wherein the physical property prediction unit includes:
a model search unit that searches for the base model and the correction model for the representative vector close to the unknown input vector;
a base model prediction unit that calculates a base model prediction value as the prediction value of the base model for the unknown input vector;
a correction model prediction unit that calculates a correction model prediction value as the prediction value of the correction model for the unknown input vector; and
a prediction result determination unit that calculates, for each physical property, a predicted physical property value as the sum of the base model prediction value and the correction model prediction value multiplied by a predetermined constant, and calculates, based on the correction model prediction value, a correction degree indicating the risk of the predicted physical property value,
and wherein the display unit provides a prediction result display screen that displays at least the predicted physical property value and the correction degree calculated by the prediction result determination unit.

The physical property prediction device according to claim 1, wherein the display unit displays, on the prediction result display screen, the predicted physical property value calculated by the prediction result determination unit in a display mode associated with the correction degree corresponding to that predicted physical property value.

The physical property prediction device according to claim 2, wherein, on the prediction result display screen, the predicted physical property value is displayed in a different display mode according to a correction degree level indicating the magnitude of the corresponding correction degree, and
the display unit further provides a correction degree adjustment screen on which classification criteria for the correction degree levels, or the display mode for each correction degree level on the prediction result display screen, can be set.

The physical property prediction device according to claim 1, wherein the prediction result determination unit further calculates, based on the correction model prediction values corresponding to the predicted physical property values of the individual physical properties, a risk value indicating the risk of the predicted physical property values for the current physical property prediction as a whole.

The physical property prediction device according to claim 1, wherein the model learning unit further includes a distance metric learning unit that learns a distance metric of the data space by MLKR (Metric Learning for Kernel Regression), and
in the learning of the base model by the base model learning unit and the learning of the correction model by the correction model learning unit, both performed using the training data in the vicinity of the representative vector, the vicinity is determined by the distance metric learned by the distance metric learning unit.

The physical property prediction device according to claim 1, wherein the correction model learning unit, when learning the correction model, creates a first correction model that predicts the additive inverse of the residual of the base model, creates a second correction model that predicts the additive inverse of the residual between the output of the base model and a constant multiple of the prediction value of the first correction model, and repeats this creation procedure to create correction models up to an N-th correction model (N is a natural number of 2 or more), and
the physical property prediction unit performs the physical property prediction using the created first to N-th correction models as the correction model.

The physical property prediction device according to claim 1, wherein the model learning unit determines in advance the predetermined constant by which the correction model prediction value is multiplied in the calculation of the predicted physical property value by the prediction result determination unit, by performing a linear search for the constant that minimizes the generalization error estimated for the input set and the output set of the training data.

The physical property prediction device according to claim 1, wherein, in the learning of the base model by the base model learning unit and the learning of the correction model by the correction model learning unit, both performed using the training data in the vicinity of the representative vector,
the vicinity is determined using any one of: a given distance metric; a metric defined on one or more dimensions whose contribution to the output is at or above a given degree; a metric defined on a space obtained by compressing the input into a representation space by principal component analysis; or a metric defined by a kernel function.

A physical property prediction method comprising:
a model learning step of learning models used for physical property prediction;
a physical property prediction step of performing, for an unknown input vector input as data of an unknown sample, the physical property prediction using the models learned in the model learning step; and
a display step of displaying a result of the physical property prediction,
wherein the model learning step includes:
a representative vector selection step of clustering an input set of training data used for learning the models and selecting a representative vector in each cluster;
a base model learning step of learning, for each representative vector, a base model that predicts physical property values, using a first predetermined number of the training data in the vicinity of the representative vector; and
a correction model learning step of learning, for each base model, a correction model that predicts the additive inverse of the residual of the base model, using a second predetermined number of the training data in the vicinity of the representative vector,
wherein the physical property prediction step includes:
a model search step of searching for the base model and the correction model for the representative vector close to the unknown input vector;
a base model prediction step of calculating a base model prediction value as the prediction value of the base model for the unknown input vector;
a correction model prediction step of calculating a correction model prediction value as the prediction value of the correction model for the unknown input vector; and
a prediction result determination step of calculating a predicted physical property value as the sum of the base model prediction value calculated in the base model prediction step and the correction model prediction value calculated in the correction model prediction step multiplied by a predetermined constant, and calculating, based on the correction model prediction value, a correction degree indicating the risk of the predicted physical property value,
and wherein the display step provides a prediction result display screen that displays at least the predicted physical property value and the correction degree calculated in the prediction result determination step.

The physical property prediction method according to claim 9, wherein, in the display step, the predicted physical property value calculated in the prediction result determination step is displayed on the prediction result display screen in a display mode associated with the correction degree corresponding to that predicted physical property value.

The physical property prediction method according to claim 10, wherein, in the display step, the predicted physical property value is displayed on the prediction result display screen in a different display mode according to a correction degree level indicating the magnitude of the corresponding correction degree, and
the model learning step further includes a second display step of displaying a correction degree adjustment screen on which classification criteria for the correction degree levels, or the display mode for each correction degree level on the prediction result display screen, can be set.

The physical property prediction method according to claim 9, wherein, in the prediction result determination step, a risk value indicating the risk of the predicted physical property values for the current physical property prediction as a whole is further calculated based on the correction model prediction values corresponding to the predicted physical property values of the individual physical properties.

The physical property prediction method according to claim 9, wherein the model learning step further includes a distance metric learning step of learning a distance metric of the data space by MLKR (Metric Learning for Kernel Regression), and
in the learning of the base model in the base model learning step and the learning of the correction model in the correction model learning step, both performed using the training data in the vicinity of the representative vector, the vicinity is determined by the distance metric learned in the distance metric learning step.

The physical property prediction method according to claim 9, wherein, when the correction model is learned in the correction model learning step, a first correction model that predicts the additive inverse of the residual of the base model is created, a second correction model that predicts the additive inverse of the residual between the output of the base model and a constant multiple of the prediction value of the first correction model is created, and this creation procedure is repeated to create correction models up to an N-th correction model (N is a natural number of 2 or more), and
in the physical property prediction step, the physical property prediction is performed using the created first to N-th correction models as the correction model.

The physical property prediction method according to claim 9, wherein, in the model learning step, the predetermined constant by which the correction model prediction value is multiplied in the calculation of the predicted physical property value in the prediction result determination step is determined in advance by performing a linear search for the constant that minimizes the generalization error estimated for the input set and the output set of the training data.