JP2005157788A

JP2005157788A - Device, program and method for identifying model

Info

Publication number: JP2005157788A
Application number: JP2003396019A
Authority: JP
Inventors: Miyako Nishino; 都西野; Yoshiharu Nishida; 吉晴西田; Toshihiko Watanabe; 俊彦渡邊
Original assignee: Kobe Steel Ltd
Current assignee: Kobe Steel Ltd
Priority date: 2003-11-26
Filing date: 2003-11-26
Publication date: 2005-06-16
Anticipated expiration: 2023-11-26
Also published as: JP4230890B2

Abstract

PROBLEM TO BE SOLVED: To identify the parameters of a mathematical model that represents the characteristics of a model object, making effective use of a large amount of performance data to increase the predictive accuracy of the model.

SOLUTION: Among classification patterns that each classify the performance data into two groups, the pattern for which the difference between the two groups in the statistical information of the model's prediction errors is large is determined (S3). Processing (S4) that identifies the mathematical model for each group under the determined classification pattern is repeated, with the groups subdivided further each time, until the predictive accuracy of the model meets a preset condition (S6, S8). The parameters obtained when the preset condition is met are set as the parameters of the mathematical model for that group. The classification pattern is determined by first selecting the data item used for grouping based on mean values and standard deviations, and then choosing the pattern to adopt from among the candidates according to an information criterion (S3).

COPYRIGHT: (C)2005, JPO&NCIPI

Description

The present invention relates to a model identification device, a model identification program, and a model identification method that identify a mathematical model of a model object based on performance data.

A mathematical model used for plant control, diagnosis, and the like is created based on the physical characteristics, mechanisms, and chemical characteristics of the target (the model object). Its parameters, such as coefficients, are usually identified by a system identification method based on identification experiments or operational performance data. In general, a certain degree of accuracy can often be secured for the mathematical model of a model object that has an abundant operating record.
However, for the mathematical model of an object with little or no operating record, as when manufacturing a new product or manufacturing with an added or modified process, model accuracy deteriorates, which has sometimes led to product defects and process defects. In such cases, engineers and operators have adjusted the parameters of the mathematical model based on their operational and product knowledge, improving model accuracy and thereby stabilizing operation and restoring the quality level.
In addition, Patent Document 1 discloses that, for objects that are difficult to adjust or about which little operational knowledge is available, data mining techniques such as regression trees and data analysis techniques are used to extract from a large-scale database, as knowledge, the causal relationships between quality and the attributes presumed to be its causes, and that adjustment work is performed on that basis.
Patent Document 1: JP 2002-324206 A
Non-patent document: Nishino, Maeda, Watanabe, Kitamura, Morimoto, Oe: Modeling of rolling load by function synthesis using genetic programming; Transactions of the Institute of Systems, Control and Information Engineers, Vol. 14, No. 3, pp. 138-145 (2001)

However, for model objects with little or no operating record, the experience-based knowledge of engineers and operators often cannot be used as is, and analyzing operational data to improve model accuracy often takes a long time. Furthermore, because data-processing capacity is relatively low when humans are involved, a large amount of performance data cannot be fully exploited to identify the cause of degraded model accuracy.
On the other hand, the data mining and data analysis techniques of Patent Document 1, such as regression trees, process only the obtained data without relying on a mathematical model (such as a physical model) of the object, so no model-based extrapolation or interpolation is performed. Consequently, when operating conditions differ greatly between the model object and the performance data, or when few performance data exist for conditions close to those of the model object, knowledge cannot be extracted, or the accuracy of the extracted knowledge becomes markedly poor.
The present invention has been made in view of the above circumstances. Its object is to provide a model identification device, a model identification program, and a model identification method that use a mathematical model, such as a physical model, that represents the characteristics of the model object correctly to some extent, and that can identify the model parameters so as to further raise the prediction accuracy of that model by making effective use of a large amount of performance data.

To achieve the above object, the present invention is configured as a model identification device comprising control means that reads performance data from performance data storage means storing a performance data group, which consists of performance data for a plurality of data items relating to a model object under each of a plurality of conditions of that model object; identifies, based on the performance data, the parameters of a mathematical model that obtains prediction data for an objective variable, which is one of the other data items, from data for a plurality of explanatory variables, which are some of the data items; and stores the identified parameters in parameter storage means. The control means comprises: performance data classification means for classifying the performance data group into a plurality of groups under each of a plurality of classification patterns; performance data reading means for reading the performance data from the performance data storage means; prediction data calculation means for obtaining, for each of the groups under each classification pattern, the prediction data by applying the performance data belonging to that group to the mathematical model; classification pattern determination means for determining the classification pattern to adopt based on the magnitude of the difference, between the groups of each classification pattern, in the statistical information of the data group of errors between the prediction data and the corresponding performance data; per-group identification means for identifying the parameters of the mathematical model for each group classified according to the determination result of the classification pattern determination means, based on the performance data belonging to that group; and identification parameter setting means that, until the prediction accuracy of the mathematical model to which the parameters identified by the per-group identification means are applied satisfies a set condition, causes the performance data group belonging to each group classified according to the determination result to be successively subdivided by the performance data classification means, the prediction data calculation means, and the classification pattern determination means, causes the per-group identification means to identify the parameters of the mathematical model, and stores the parameters obtained when the set condition is satisfied in the parameter storage means as the parameters of the mathematical model for that group.
This configuration provides the following operational effects.
A large difference between the classified groups in the statistical information of the model prediction error data means a large difference in the mathematical model parameters that should be set for each of those groups in order to reduce the prediction error. Therefore, if the performance data are divided into the groups of that classification pattern and the mathematical model is identified for each group, a significant reduction in model prediction error can be expected.
With the above configuration, the group classification pattern expected to yield the greatest improvement in the prediction accuracy of the mathematical model for a large amount of performance data is determined by exhaustive search, and the processing that identifies the mathematical model for each group of the determined classification is repeated while the groups are successively subdivided (classified) further.
This makes it possible to identify a highly accurate mathematical model efficiently while making effective use of a large amount of performance data.
The set condition for terminating the identification of the mathematical model may be, for example, that the prediction accuracy falls within a predetermined allowable accuracy, or that the prediction accuracy of the previous identification is stored and the degree of improvement of the current identification result relative to it falls within a predetermined range.
The classification pattern determination means may select the data item used for group classification based on the difference in mean value and the difference in standard deviation of the prediction errors between the groups, and then determine, from among the classification patterns that use the selected data item, the classification pattern to adopt based on the difference in information criterion between the groups.
In this way, the data item used for group classification is first selected based on statistical indices such as the mean and standard deviation, which can be obtained by relatively simple calculations with a light processing load, and the candidate classification patterns are narrowed down. Then, from among the narrowed-down patterns, the pattern with a large difference in statistical information is determined by calculating the information criterion, which expresses statistical characteristics more distinctly but has a relatively heavy processing load. Classification pattern determination can therefore balance a reduced processing load against appropriate pattern selection.
The present invention may also be regarded as a model identification program that causes a computer to execute the processing performed by each means of the model identification device.
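The two example termination conditions mentioned above (prediction accuracy within an allowable limit, or improvement over the previous identification falling within a predetermined range) can be sketched as a single check. This is a minimal illustration only: the function name, the use of a scalar error measure, and the reading of "within a predetermined range" as "improvement no larger than `min_improvement`" are assumptions, not details taken from the patent.

```python
def meets_set_condition(error, allowed_error, previous_error=None, min_improvement=0.0):
    """Return True when subdivision of a group may stop.

    error           -- prediction error of the model identified for the group
                       (for example an RMS error; the measure is an assumption)
    allowed_error   -- predetermined allowable accuracy
    previous_error  -- error stored at the previous identification, if any
    min_improvement -- improvement at or below this value counts as "no longer improving"
    """
    # Condition 1: the prediction error is within the allowable accuracy.
    if error <= allowed_error:
        return True
    # Condition 2: the improvement over the previous identification is small.
    if previous_error is not None:
        return (previous_error - error) <= min_improvement
    return False
```

A caller would evaluate this after each per-group identification and subdivide the group again only while it returns False.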

Similarly, the present invention may be regarded as a model identification method corresponding to the processing performed by each means of the model identification device.
That is, it is a model identification method for identifying a predetermined mathematical model of a model object based on a performance data group consisting of performance data for a plurality of data items relating to the model object under each of a plurality of conditions of that model object, the method comprising: a prediction data calculation step of using the mathematical model to obtain prediction data for an objective variable, which is one of the other data items, from data for a plurality of explanatory variables, which are some of the data items; a performance data classification step of classifying the performance data group into a plurality of groups under each of a plurality of classification patterns; a classification pattern determination step of determining the classification pattern to adopt based on the magnitude of the difference, between the groups of each classification pattern, in the statistical information of the data group of errors between the prediction data and the corresponding performance data; a per-group identification step of identifying the parameters of the mathematical model for each group classified according to the determination result of the classification pattern determination step, based on the performance data belonging to that group; and a parameter setting step of, until the prediction accuracy of the mathematical model to which the parameters identified in the per-group identification step are applied satisfies a set condition, successively subdividing the performance data group belonging to each group classified according to the determination result by executing the performance data classification step, the prediction data calculation step, and the classification pattern determination step, executing the identification of the parameters of the mathematical model in the per-group identification step, and setting the parameters obtained when the set condition is satisfied as the parameters of the mathematical model for that group.

According to the present invention, the group classification pattern expected to yield the greatest improvement in the prediction accuracy of the mathematical model for a large amount of performance data is determined by exhaustive search, and the processing that identifies the mathematical model for each group of the determined classification is repeated while the groups are successively subdivided (classified) further. A highly accurate mathematical model can therefore be identified efficiently while making effective use of a large amount of performance data.
In addition, automating the group classification (condition classification) makes it easy to set the parameters of the mathematical model for each finely subdivided group. As a result, even if the mathematical model itself is a relatively simple one such as a linear model, a nonlinear model object can be modeled with high accuracy.
Furthermore, the data item used for group classification is first selected based on statistical indices such as the mean and standard deviation, which can be obtained by relatively simple calculations with a light processing load, so that the candidate classification patterns are narrowed down. The pattern to adopt is then determined from among them based on the information criterion, which expresses statistical characteristics more distinctly but has a relatively heavy processing load. This balances a reduced processing load against appropriate classification pattern determination.

Embodiments of the present invention are described below with reference to the accompanying drawings to aid understanding of the invention. The following embodiment is one example embodying the present invention and does not limit its technical scope.
FIG. 1 is a block diagram showing the configuration of the main part of an information processing apparatus X, which is an example of a model identification device according to an embodiment of the present invention. FIG. 2 shows an example of the schematic structure of the performance data. FIG. 3 is a flowchart showing the model identification procedure performed by the information processing apparatus X. FIG. 4 schematically shows, as a binary tree, the process of classification pattern determination by the information processing apparatus X. FIG. 5 shows an example of the data structure of the Abalone data. FIG. 6 shows an example of the decision trees created when a conventional CART system is applied to the Abalone data and when model identification by the information processing apparatus X is applied. FIG. 7 shows an example of the decision tree generated when mathematical model identification by the information processing apparatus X is applied to a rolling load prediction model for a steel rolling process.

First, the configuration of the main part of the information processing apparatus X, which is an example of a model identification device according to an embodiment of the present invention, is described using the block diagram of FIG. 1.
In its hardware and basic software (OS), the information processing apparatus X is a computer such as a general-purpose personal computer. That is, it comprises a control unit 1 composed of a CPU and peripheral devices such as ROM and RAM; storage means 2, such as a hard disk, that stores the various programs executed by the control unit 1 and the various data used in their processing; input means 3 such as a keyboard and a mouse; and display means 4 such as a liquid crystal panel or a CRT display. The information processing apparatus X also includes the other devices found in a general computer, but their description is omitted here.
The storage means 2 is divided into the following areas: a program storage unit 21 that stores various programs, including an identification program that identifies the mathematical model of a model object such as a steel rolling process; a performance data storage unit 22 (an example of the performance data storage means) that stores a performance data group dj consisting of performance data for a plurality of data items relating to the model object under each of a plurality of operating conditions of that model object; an intermediate data storage unit 23 that stores the various intermediate data generated and accessed during the processing of the identification program; and a parameter storage unit 24 that stores the parameters of the mathematical model.
When an operation input to start processing is made from the input means 3, the control unit 1 (an example of the control means) loads the identification program from the program storage unit 21 into main memory (RAM) and executes it. Through that execution, it identifies the parameters of the mathematical model of the model object and stores the identified parameters in the parameter storage unit 24.
Broadly, the identification program has: a performance data classification processing unit 11 that classifies the performance data group dj into a plurality of groups under each of a plurality of classification patterns; a prediction data calculation unit 12 that obtains prediction data for the objective variable of the model object by applying the performance data to a mathematical model given in advance; a classification pattern determination processing unit 13 that determines the classification pattern best suited to mathematical model identification, based on the magnitude of the difference between the groups in the statistical information of the data group of errors between the prediction data and the corresponding performance data; a per-group identification processing unit 14 that, for each group classified according to the determined classification pattern, identifies the mathematical model from the performance data belonging to that group; and an identification parameter setting processing unit 15 that sets the parameters of the identified mathematical model and stores them in the parameter storage unit 24.

The identification of a mathematical model by the information processing apparatus X is described below, taking as a concrete example the case in which the model object is a steel rolling process and the mathematical model is a rolling load model.
FIG. 2 shows an example of the schematic structure of the performance data group dj stored in advance in the performance data storage unit 22.
As shown in FIG. 2, the performance data group dj consists of performance data for a plurality of data items relating to the model object (the steel rolling process) in each of a plurality of operation cases (various conditions) of that model object. In the example of FIG. 2, the operation cases (conditions) are denoted "Case 1", "Case 2", and so on; plate thickness (H), time between passes (TOM), plate temperature (T), and steel type (S) are shown as examples of the data items; and the performance data values are shown as "xxx". Plate thickness (H), time between passes (TOM), and plate temperature (T) are numerical data, whereas steel type (S) is nominal attribute data (non-numerical data) expressed by a steel type identification symbol or the like. The performance data group dj can thus contain both numerical and non-numerical data. The performance data storage unit 22 stores such performance data for each of a large number of operation cases.
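As a rough illustration of how such a data group might be held in memory, the sketch below models one operation case as a record with numerical items (H, TOM, T) and a nominal item (S). The class name, the `load` field standing in for the objective variable, and every value are hypothetical placeholders; FIG. 2 itself shows the values only as "xxx".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationCase:
    case_id: int   # identification number i of the operation case
    H: float       # plate thickness (numerical)
    TOM: float     # time between passes (numerical)
    T: float       # plate temperature (numerical)
    S: str         # steel type (nominal attribute, non-numerical)
    load: float    # hypothetical objective variable (rolling load)


# Placeholder values only; they are not taken from the patent.
performance_data_group = [
    OperationCase(1, 12.0, 8.5, 950.0, "a", 1510.0),
    OperationCase(2, 10.5, 7.0, 930.0, "b", 1620.0),
    OperationCase(3, 9.0, 9.2, 910.0, "a", 1705.0),
]


def is_numeric_item(cases, item):
    """True when every case holds a number (not a symbol) for the given data item."""
    return all(isinstance(getattr(case, item), (int, float)) for case in cases)
```

Distinguishing numerical from nominal items matters because the two kinds are split into groups in different ways during classification.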

Next, the procedure for identifying the mathematical model by the information processing apparatus X is described using the flowchart of FIG. 3. In the following, S1, S2, ... denote the numbers of the processing steps. As described above, each process (S1 to S8) is executed by the control unit 1 in accordance with the identification program.
<Step S1>
First, the performance data classification processing unit 11 (an example of the performance data classification means) classifies the performance data group into two groups (a plurality of groups) under each of a plurality of classification patterns. A classification pattern is defined by the kind of data item and the values its performance data can take.
This classification is performed, for example, by the following procedure.
First, the data items to be used as the classification key are selected one after another from all of the data items or from a predetermined subset of them.
Next, if the selected data item holds numerical data, the cases are classified into two groups according to whether the value is at or above, or below, a threshold set arbitrarily within the range of values the data can take.
For example, when the minimum value of the selected data item (the performance data value) is 0.1 and the maximum value is 0.3, setting nine thresholds at intervals of 0.02 (0.12, 0.14, 0.16, ..., 0.28) yields nine classification patterns, each of which splits the data into a group at or above the threshold and a group below it.
Such thresholds may be defined in advance for each data item, or set each time based on, for example, the distribution of the performance data values for that data item in the performance data group, for instance so that the numbers of data are equalized.
If the selected data item holds nominal attribute data (non-numerical data), like the steel type, every combination that divides its values into two classes may be used as a classification pattern. For example, when the item can take five values a to e, classification is performed under up to 15 classification patterns, such as {(a), (b to e)}, {(b), (a, c to e)}, ..., {(a, b), (c, d, e)}, and so on.
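The enumeration of candidate classification patterns in step S1 can be sketched as follows. The function names are hypothetical, and the numeric case assumes evenly spaced thresholds 0.02 apart strictly between a minimum of 0.1 and a maximum of 0.3; the patent also allows predefined or distribution-based thresholds, which are not shown here.

```python
from itertools import combinations


def numeric_thresholds(lo, hi, step):
    """Evenly spaced thresholds strictly between lo and hi."""
    n = round((hi - lo) / step)  # number of steps spanning the range
    # Rounding removes floating-point noise such as 0.12000000000000001.
    return [round(lo + k * step, 10) for k in range(1, n)]


def nominal_bipartitions(values):
    """All ways to split a set of nominal values into two non-empty groups."""
    values = sorted(values)
    first, rest = values[0], values[1:]
    patterns = []
    # Keeping `first` in the left group produces each split exactly once.
    # r is the number of other values that join `first`; r == len(rest) is
    # excluded because it would leave the right group empty.
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            left = {first, *extra}
            right = set(values) - left
            patterns.append((left, right))
    return patterns
```

With these assumptions, `numeric_thresholds(0.1, 0.3, 0.02)` gives nine thresholds and `nominal_bipartitions("abcde")` gives 15 two-way splits, matching the counts in the text (for n nominal values the count is 2^(n-1) - 1).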

＜ステップＳ２＞
次に，前記予測データ算出部１２（前記予測データ算出手段の一例）により，前記複数の分類パターン各々における前記２グループ各々について，各グループに属する実績データを前記実績データ記憶部２１から読み出し（前記実績データ読み出し手段の処理の一例），これを予め与えられた（定められた）モデル対象の数式モデルに適用して目的変数の予測データを求め，これを前記中間データ記憶部２３へ記憶させる。
ここで，数式モデルは，前記データ項目の一部である複数の説明変数についてのデータから，他の前記データ項目の一つである目的変数についての予測データを求める数式モデルであり，初期状態では，そのパラメータ（係数等）に予め初期値が設定されている。この数式モデルは，モデル対象の物理的な特性等を表し，初期状態では高い精度は望めないが，ある程度モデル対象の特性を表すものである。
数式モデルは，例えば，ごく簡易な線形モデルで表すとすると，説明変数のデータをｘ１，ｘ２，目的変数ｙの予測データをｙｓとしたとき，次の（１）式等で表すことができる。
ｙｓ＝ａｘ１＋ｂｘ２＋ｃ …（１）
但し，ａ，ｂ，ｃはモデルのパラメータ（係数）であり，初期状態では予め定められた初期値が設定されている。
本処理では，説明変数ｘ１，ｘ２に，分類された各グループに属する実績データｘｒ１（ｉ），ｘｒ２（ｉ）（ｉは運転ケースの識別番号を表す）を適用し，予測データｙｓ（ｉ）を算出する。 <Step S2>
Next, for each of the two groups in each of the plurality of classification patterns, the performance data belonging to each group is read from the performance data storage section 21 by the prediction data calculation section 12 (an example of the prediction data calculation means) An example of the processing of the result data reading means), applying this to a predetermined (predetermined) model object mathematical model to obtain the prediction data of the objective variable, and storing it in the intermediate data storage unit 23.
Here, the mathematical model is a mathematical model that obtains prediction data for a target variable that is one of the other data items from data about a plurality of explanatory variables that are part of the data item. , Initial values are preset for the parameters (coefficients, etc.). This mathematical model represents the physical characteristics of the model object, and high accuracy cannot be expected in the initial state, but it represents the characteristics of the model object to some extent.
For example, if the mathematical model is represented by a very simple linear model, it can be represented by the following equation (1) when the data of the explanatory variable is x1, x2 and the prediction data of the objective variable y is ys.
ys = ax1 + bx2 + c (1)
Here, a, b, and c are model parameters (coefficients), and predetermined initial values are set in the initial state.
In this process, the performance data xr1(i), xr2(i) belonging to each classified group (i being the identification number of the operation case) are applied to the explanatory variables x1, x2 to calculate the prediction data ys(i).
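As a minimal sketch of step S2, equation (1) can be applied to the records of one group as follows. The parameter values and the data are hypothetical and chosen only for illustration.

```python
def predict(params, x1, x2):
    """Equation (1): ys = a*x1 + b*x2 + c."""
    a, b, c = params
    return a * x1 + b * x2 + c

# Hypothetical initial parameters (a, b, c) and performance data of one group.
# Each record is (xr1(i), xr2(i), yr(i)) for operation case i.
initial_params = (2.0, -1.0, 0.5)
group = [(1.0, 2.0, 0.7), (3.0, 1.0, 5.0), (0.0, 4.0, -3.0)]

ys = [predict(initial_params, x1, x2) for x1, x2, _ in group]
print(ys)  # [0.5, 5.5, -3.5]
```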

＜ステップＳ３＞
次に，前記分類パターン決定処理部１３（分類パターン決定手段の一例）により，前記分類パターン各々の前記２グループ相互間における，予測データとこれに対応する実績データとの誤差（予測誤差）のデータ群が有する統計情報の差異の大きさに基づいて採用する前記分類パターンを決定し，決定された前記分類パターンを表す識別情報を前記パラメータ記憶部２４へ記憶させる。以下，本処理で決定される分類パターンを決定分類パターンという。
ここで，前記統計情報の差異の大きさとは，前記２グループ相互間の統計的性質の差異の大きさのことであり，前記２グループ相互間の統計的性質の分離度合いを表すものである。
ここでの分類パターン決定処理（Ｓ３）は，以下のように，前記２グループ相互間における予測誤差の平均値の差及び標準偏差の差に基づいてグループ分類に用いる前記データ項目を選択（決定）し，該選択した前記データ項目を用いた分類パターンの中から，前記２グループ相互間の情報量規範の差に基づいて採用する分類パターン（前記決定分類パターン）を決定する。
まず，前記予測データｙｓ（ｉ）と，これに対応する（運転ケースｉが共通する）実績データｙｒ（ｉ）との誤差ｅｒｒ（ｉ）（＝ｙｒ（ｉ）−ｙｓ（ｉ））を求め，前記分類パターン各々における前記２グループの各々について，前記誤差ｅｒｒ（ｉ）の平均値ａｖ１，ａｖ２，及び標準偏差σ１，σ２を求める。
次に，前記平均値ａｖ１，ａｖ２の差の絶対値ｄav（＝｜ａｖ１−ａｖ２｜）と，前記標準偏差σ１，σ２の差の絶対値ｄσ（＝｜σ１−σ２｜）との加重平均値Ｐj（ｊは，前記分類パターンの識別番号を表す）を，次の（２）式によって求める。
Ｐj＝Wav×ｄav＋Ｗs×ｄσ …（２）
但し，Ｗav，Ｗsは，加重平均計算に用いる重み係数である。この重み係数は，予め設定されるものである。
このＰjが，前記統計情報の差異の大きさを表す指標の一例である。以下，この指標Ｐjを，誤差差異度という。この誤差差異度Ｐjは，前記２グループ相互間における，予測データと実績データとの誤差の平均値及び標準偏差（ばらつき）（各々，統計情報）の差異の大きさを表す指標であり，この値が大きいということは，そのときのグループ各々に設定されるべき（予測誤差を小さくする）数式モデルのパラメータの差異が大きいことを意味する。従って，そのときの分類パターンで分類したグループごとに実績データを分けて数式モデルの同定を行えば，モデル予測誤差の大幅な改善が期待できる。 <Step S3>
Next, the classification pattern determination processing unit 13 (an example of the classification pattern determination means) determines the classification pattern to be adopted, based on the magnitude of the difference, between the two groups of each classification pattern, in the statistical information of the data group of errors (prediction errors) between the prediction data and the corresponding performance data, and stores identification information representing the determined classification pattern in the parameter storage unit 24. Hereinafter, the classification pattern determined in this process is referred to as the determined classification pattern.
Here, the magnitude of the difference in the statistical information means the magnitude of the difference in statistical properties between the two groups, and represents the degree of separation of the statistical properties between the two groups.
In this classification pattern determination process (S3), the data item used for group classification is first selected (determined) based on the difference in the average values and the difference in the standard deviations of the prediction errors between the two groups, as described below. Then, from among the classification patterns that use the selected data item, the classification pattern to be adopted (the determined classification pattern) is determined based on the difference in the information criterion between the two groups.
First, the error err(i) (= yr(i) − ys(i)) between the prediction data ys(i) and the corresponding performance data yr(i) (for the same operation case i) is obtained, and the average values av1, av2 and the standard deviations σ1, σ2 of the error err(i) are obtained for each of the two groups in each classification pattern.
Next, the weighted average value Pj (j being the identification number of the classification pattern) of the absolute value dav (= |av1 − av2|) of the difference between the average values av1 and av2 and the absolute value dσ (= |σ1 − σ2|) of the difference between the standard deviations σ1 and σ2 is obtained by the following equation (2).
Pj = Wav × dav + Ws × dσ (2)
Here, Wav and Ws are weighting coefficients used in the weighted average calculation, and are set in advance.
This Pj is an example of an index indicating the magnitude of the difference in the statistical information. Hereinafter, this index Pj is referred to as the error difference degree. The error difference degree Pj indicates how much the average value and the standard deviation (variation) of the error between the prediction data and the performance data (each being statistical information) differ between the two groups. A large value means that the parameters of the mathematical model that should be set for each group (to reduce the prediction error) differ greatly. Therefore, if the performance data is divided into the groups given by that classification pattern and the mathematical model is identified for each group, a significant improvement in model prediction error can be expected.
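The error difference degree of equation (2) can be sketched as below. This is an illustration under stated assumptions: the population standard deviation (`pstdev`) is used because the text does not say which estimator is intended, the weights default to 1.0, and the candidate patterns and error values are hypothetical.

```python
from statistics import mean, pstdev

def error_difference_degree(errors_1, errors_2, w_av=1.0, w_s=1.0):
    """Equation (2): Pj = Wav * |av1 - av2| + Ws * |sigma1 - sigma2|."""
    d_av = abs(mean(errors_1) - mean(errors_2))
    d_sigma = abs(pstdev(errors_1) - pstdev(errors_2))  # assumption: population std
    return w_av * d_av + w_s * d_sigma

# Hypothetical prediction errors err(i) of the two groups of two candidate patterns,
# keyed by (data item, threshold).
candidates = {
    ("x1", 0.2): ([1.0, 3.0], [0.0, 0.0, 0.0]),
    ("x2", 5.0): ([0.5, 1.5], [1.0, 1.0]),
}
scores = {key: error_difference_degree(e1, e2) for key, (e1, e2) in candidates.items()}

# The data item of the pattern with the largest Pj is the one carried forward.
best_item = max(scores, key=scores.get)[0]
print(scores[("x1", 0.2)], best_item)  # 3.0 x1
```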

ここで，前記誤差差異度Ｐjが最も大きくなるときの前記分類パターンを前記決定分類パターンとすることも考えられる。
しかし，本実施の形態では，前記誤差の差異度Ｐjは，グループ分類に用いる前記データ項目の絞り込み（決定）にのみ，この誤差差異度Ｐjを用いる。即ち，前記誤差差異度Ｐjが最も大きくなるときの前記分類パターン（分類）に用いられている前記データ項目を選択し，該データ項目を用いた分類パターンの中から１つの分類パターンを決定する。そして，選択された前記データ項目について，数値データの場合はいずれの閾値で分類するか，非数値データの場合はいずれの組合せで分類するかは，以下のようにして決定する。
ここでは，前記誤差差異度Ｐj基づいて選択された前記データ項目を用いた分類パターン各々について，分類された前記２グループ相互間のいわゆる情報量規範（統計情報の差異の大きさの一例）に基づいて前記決定分類パターンを決定する。
選択された前記データ項目が数値データである場合，モデル化誤差ｅｒｒ（ｉ）を，その最小値から最大値までの範囲を複数の階級に分割してヒストグラムにより量子化し，次の（３）式を最小化する閾値θで分類する分類パターンを前記決定分類パターンとする。

ここで，Ｐは分割対象とする属性値がθ以下のグループに属する確率であり，ｐは数式モデルによる予測誤差が，前記ヒストグラムの各階級に属する確率である。同様に，Qは分割対象とする属性値がθより大きいグループに属する確率であり，ｑは数式モデルによる予測誤差が，前記ヒストグラムの各階級に属する確率である。この（３）式右辺の第１項及び第２項の各々は，前記２グループ各々の統計情報の大きさの指標となる情報量規範（情報量）である。
（２）式では，前記２グループ間で，統計的性質（統計情報）の差異が大きい閾値θで分類した場合，そうでない場合と比較して情報量が小さくなるという性質を利用している。ある閾値で２グループに分類した場合，いずれかのグループに統計的性質が大きく変化する点が内包されれば，そちらのグループの情報量は大きくなる。閾値θと統計的性質の変化点とが一致した場合，前記２グループの情報量の和は最小となる。
また，選択された前記データ項目が名義属性データ（非数値データ）である場合であっても，分類の組合せを変化させることにより全く同様に処理できる。
以上示したように，前記２グループ相互間の統計情報の差異が大きくなるときの分類パターンで分類したグループごとに，実績データを分けて数式モデルの同定を行えば，モデル予測誤差の大幅な改善が期待できる。
また，本処理では，まず平均値や標準偏差といった比較的簡易な（処理負荷の軽い）計算で求まる統計情報の指標に基づいてグループ分類に用いる前記データ項目が選択され，採用する分類パターンの候補が絞り込まれる。さらに，絞り込んだ分類パターンの中から，統計情報の大きさ（統計的特性）をより顕著に表すが比較的処理負荷の大きい情報量規範の計算によって統計情報の差異が大きくなる分類パターンが決定される。従って，処理負荷の低減と分類パターン決定の適正化とを両立できる分類パターン決定を行うことができる。
なお，分布を特徴づける階級数ｃは，例えば，次の（４）式で表されるスタージェスの式で決定することが考えられる。

但し，ｍは，分類の対象となる全実績データの数である。 Here, the classification pattern for which the error difference degree Pj is the largest could simply be taken as the determined classification pattern.
However, in the present embodiment, the error difference degree Pj is used only for narrowing down (determining) the data item used for group classification. That is, the data item used in the classification pattern for which the error difference degree Pj is the largest is selected, and one classification pattern is determined from among the classification patterns that use that data item. For the selected data item, which threshold to classify by (for numerical data) or which combination to classify by (for non-numeric data) is determined as follows.
Here, for each classification pattern that uses the data item selected based on the error difference degree Pj, the determined classification pattern is determined based on a so-called information criterion between the two classified groups (an example of the magnitude of the difference in the statistical information).
When the selected data item is numerical data, the modeling error err(i) is quantized with a histogram by dividing the range from its minimum value to its maximum value into a plurality of classes, and the classification pattern that classifies by the threshold θ minimizing the following equation (3) is taken as the determined classification pattern.

Here, P is the probability that the attribute value to be divided belongs to the group of θ or less, and p is the probability that the prediction error of the mathematical model belongs to each class of the histogram. Similarly, Q is the probability that the attribute value to be divided belongs to the group greater than θ, and q is the probability that the prediction error of the mathematical model belongs to each class of the histogram. Each of the first and second terms on the right side of equation (3) is an information criterion (amount of information) serving as an index of the magnitude of the statistical information of the respective group.
Equation (3) exploits the property that the amount of information is smaller when the data is classified by a threshold θ at which the statistical properties (statistical information) of the two groups differ greatly than when it is not. When the data is classified into two groups by some threshold, if either group contains a point at which the statistical properties change greatly, the amount of information of that group becomes large. When the threshold θ coincides with the change point of the statistical properties, the sum of the amounts of information of the two groups is minimized.
Even when the selected data item is nominal attribute data (non-numeric data), it can be processed in exactly the same way by varying the classification combination.
As described above, if the performance data is divided into the groups given by a classification pattern for which the difference in statistical information between the two groups is large, and the mathematical model is identified for each group, a significant improvement in model prediction error can be expected.
In this process, the data item used for group classification is first selected based on indices of statistical information that are obtained by relatively simple (low processing load) calculations, such as the average value and the standard deviation, which narrows down the candidate classification patterns. Then, from among the narrowed-down classification patterns, the classification pattern with a large difference in statistical information is determined by calculating the information criterion, which expresses the magnitude of the statistical information (statistical characteristics) more distinctly but has a relatively large processing load. This achieves both a reduction in processing load and an appropriate determination of the classification pattern.
Note that the number of classes c that characterizes the distribution may be determined by, for example, Sturges' formula, expressed by the following equation (4).
c = 1 + log2(m) (4)
Here, m is the total number of performance data to be classified.
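The threshold selection by information criterion can be sketched as follows. Equation (3) itself is not reproduced in this text, so the form used here, P·H(p) + Q·H(q) with H the Shannon information of the error histogram, is an assumption inferred from the definitions of P, p, Q, and q. Rounding Sturges' formula up to an integer is also an assumption, and all function names and data are hypothetical.

```python
import math

def sturges_classes(m):
    """Number of histogram classes for m samples: 1 + log2(m), rounded up."""
    return math.ceil(1 + math.log2(m))

def class_probabilities(errors, lo, hi, n_classes):
    """Share of `errors` in each of n_classes equal-width classes on [lo, hi]."""
    width = (hi - lo) / n_classes
    counts = [0] * n_classes
    for e in errors:
        k = int((e - lo) / width) if width > 0 else 0
        counts[min(k, n_classes - 1)] += 1  # the maximum falls in the last class
    return [c / len(errors) for c in counts]

def information(probs):
    """Shannon information (entropy) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def split_information(records, theta, n_classes):
    """ASSUMED form of equation (3): P * H(p) + Q * H(q)."""
    errs = [e for _, e in records]
    lo, hi = min(errs), max(errs)
    low = [e for x, e in records if x <= theta]
    high = [e for x, e in records if x > theta]
    P, Q = len(low) / len(records), len(high) / len(records)
    return (P * information(class_probabilities(low, lo, hi, n_classes))
            + Q * information(class_probabilities(high, lo, hi, n_classes)))

def best_threshold(records):
    """Threshold on the attribute value that minimizes the summed information."""
    n_classes = sturges_classes(len(records))
    candidates = sorted({x for x, _ in records})[:-1]  # keep both groups non-empty
    return min(candidates, key=lambda t: split_information(records, t, n_classes))

# Hypothetical (attribute value, prediction error) pairs with a change point after 0.4.
records = [(0.1, 0.0), (0.2, 0.1), (0.3, 0.0), (0.4, 0.1),
           (0.5, 5.0), (0.6, 5.1), (0.7, 5.0), (0.8, 5.1)]
print(best_threshold(records))  # 0.4
```

In this toy data each side of θ = 0.4 occupies a single histogram class, so both information terms vanish there, while any other threshold mixes classes on one side.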

また，選択された前記データ項目について，前記決定分類パターンを決定する別の規範として，より高次の統計量で前記２グループ相互間の統計情報の差異（グループ間の分離度合い）を評価する次の（５）式を用いることも考えられる。

ここで，Ａは閾値により分類された一方のグループ（集合）に関する値であること表し，Ｂは他方のグループ（集合）に関する値であることを表す。また，Ａ∪Ｂは，分類前のデータ集合を表している。さらに，μは誤差平均，σは誤差の標準偏差を表し，Ｗは評価値の重み係数を表す。この重み係数は，予め設定されるものである。 Further, as another criterion for determining the determined classification pattern for the selected data item, the following equation (5), which evaluates the difference in statistical information between the two groups (the degree of separation between the groups) with higher-order statistics, may also be used.

Here, A indicates that a value relates to one of the groups (sets) classified by the threshold, and B indicates that a value relates to the other group (set). A∪B represents the data set before classification. Further, μ represents the average of the error, σ represents the standard deviation of the error, and W represents a weighting coefficient of the evaluation value, which is set in advance.

＜ステップＳ４＞
次に，前記グループ毎同定処理部１４（前記グループ毎同定手段の一例）により，前記決定分類パターンに従って分類されたグループごとに，各グループに属する実績データを前記実績データ記憶部２２から読み出し（前記実績データ読み出し手段の処理の一例），その実績データに基づいて数式モデルのパラメータを同定し，そのグループにおける数式モデルのパラメータとして前記パラメータ記憶部２４に記憶させる。具体的には，そのグループの識別情報と同定後のパラメータとを関連付けて前記パラメータ記憶部２４に記憶させる。
パラメータの同定手法としては，最小自乗法等の周知の同定手法が各種考えられる。例えば，最小自乗法を用いて，前記予測データとこれに対応する実績データとの誤差が最小になるように，前記パラメータａ，ｂ，ｃを設定する。これにより，前記決定分類パターンで分類されたグループごとに，そのグループに属する運転ケース各々に適した数式モデルのパラメータが設定（記憶）される。
＜ステップＳ５＞
次に，前記同定パラメータ設定処理部１５の一部の処理として，Ｓ４（グループ毎同定処理部）により同定された数式モデルのパラメータ各々を適用した数式モデルの予測精度の評価指標を求め，その評価指標を前記中間データ記憶部２３に記憶させる予測精度の評価処理を行う。
予測精度の評価指標としては，各種考えられるが，例えば，グループ毎に同定後のパラメータを適用した数式モデルにより前記予測データｙｓ（ｉ）を算出し，これに対応する（運転ケースｉが共通する）実績データｙｒ（ｉ）との前記誤差ｅｒｒ（ｉ）（＝ｙｒ（ｉ）−ｙｓ（ｉ））を求め，その平均値（前記平均値ａｖ１又はａｖ２）の大きさ，或いは該平均値ａｖ１，ａｖ２と前記標準偏差σ１，σ２との加重平均値の大きさ等を評価指標とすることが考えられる。この予測精度の評価指標は，グループごとに求められるものであるため，前記中断データ記憶部２３には，そのグループの識別情報と関連づけて記憶させる。 <Step S4>
Next, for each group classified according to the determined classification pattern, the per-group identification processing unit 14 (an example of the per-group identification means) reads the performance data belonging to the group from the performance data storage unit 22 (an example of the processing of the performance data reading means), identifies the parameters of the mathematical model based on that performance data, and stores them in the parameter storage unit 24 as the parameters of the mathematical model for that group. Specifically, the identification information of the group and the identified parameters are stored in the parameter storage unit 24 in association with each other.
Various well-known identification methods, such as the least squares method, can be used to identify the parameters. For example, the least squares method is used to set the parameters a, b, and c so that the error between the prediction data and the corresponding performance data is minimized. As a result, for each group classified by the determined classification pattern, parameters of the mathematical model suited to the operation cases belonging to that group are set (stored).
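One way to carry out the least squares identification of a, b, and c for a single group is to solve the normal equations. The sketch below uses only the standard library; the helper names and the data are hypothetical, and a group whose records do not determine all three parameters (a singular system) is not handled.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def identify_parameters(group):
    """Least-squares fit of ys = a*x1 + b*x2 + c to (x1, x2, yr) records."""
    rows = [(x1, x2, 1.0) for x1, x2, _ in group]
    targets = [yr for _, _, yr in group]
    # Normal equations: (X^T X) params = X^T y
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    Xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(3)]
    return tuple(solve_linear_system(XtX, Xty))

# Hypothetical group generated from a = 2, b = -1, c = 0.5 without noise.
group = [(0.0, 0.0, 0.5), (1.0, 0.0, 2.5), (0.0, 1.0, -0.5),
         (1.0, 1.0, 1.5), (2.0, 1.0, 3.5)]
a, b, c = identify_parameters(group)
print(round(a, 6), round(b, 6), round(c, 6))
```

Running `identify_parameters` once per group and storing the result under the group's identifier mirrors the association of group identification information with identified parameters described above.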
<Step S5>
Next, as part of the processing of the identification parameter setting processing unit 15, a prediction accuracy evaluation process is performed in which an evaluation index of the prediction accuracy is obtained for the mathematical model to which each set of parameters identified in S4 (per-group identification process) is applied, and the evaluation index is stored in the intermediate data storage unit 23.
Various evaluation indices of prediction accuracy are conceivable. For example, the prediction data ys(i) is calculated for each group by the mathematical model to which the identified parameters are applied, the error err(i) (= yr(i) − ys(i)) from the corresponding performance data yr(i) (for the same operation case i) is obtained, and the magnitude of its average value (the average value av1 or av2), or the magnitude of a weighted average of the average values av1, av2 and the standard deviations σ1, σ2, is used as the evaluation index. Since this evaluation index is obtained for each group, it is stored in the intermediate data storage unit 23 in association with the identification information of the group.

＜ステップＳ６＞
さらに，前記同定パラメータ設定処理部１５の一部の処理として，Ｓ５での数式モデルの予測精度の評価結果（予測精度の評価指標）が，予め定められた設定条件を満たすか否か（設定条件を満たす数式モデル（のパラメータ）が存在するか否か）を判別し，その設定条件を満たすまで，前記決定分類パターンに従って分類された各グループに属する各実績データ群それぞれについて，前述したＳ１〜Ｓ５の処理を繰り返すよう制御する。即ち，Ｓ１（実績データ分類処理），Ｓ２（予測データ算出）及びＳ３（分類パターン決定処理）により，その時点で分類されている各グループを順次さらに細分類させるとともに，Ｓ４（グループ毎同定処理）による前記数式モデルのパラメータ同定処理及びＳ５（モデル精度評価処理）による同定結果の評価処理を実行させる。
各グループごとの数式モデルの同定を終了させる前記設定条件としては，例えば，前記予測精度の評価指標の値が，予め定められた目標範囲（目標精度）以内に収まることを条件とすること等が考えられる。
ここで，実績データの数が比較的少ない場合等には，必ずしも目標精度範囲に収束しないことも考えられる。このため，何らかの停止条件を前記設定条件に加えることが望ましい。
例えば，前記中間データ記憶部２３に記憶された前回の同定の際の評価指標と今回の評価指標とを比較し，今回の評価指標の改善度合い（差や比等）が予め定められた範囲に収まる（即ち，改善度が収束した）ことを停止条件とすることが考えられる。
また，評価用の実績データと同定用の実績データとを別々に用意しておき，同定用の実績データで同定したパラメータを用いて評価用の実績データで予測精度を評価すれば，精度評価の信頼性（汎用性）が高まる。この場合，N-Fold Cross Validationにより，評価用実績データと同定用実績データの分類を，精度評価の都度変更することも考えられる。
また，できるだけ深い木（２分木）を作成しておき，これに対して枝刈りを行う方法も考えられる。 <Step S6>
Further, as part of the processing of the identification parameter setting processing unit 15, it is determined whether the evaluation result of the prediction accuracy of the mathematical model in S5 (the evaluation index of the prediction accuracy) satisfies a predetermined setting condition (that is, whether a mathematical model (set of parameters) satisfying the setting condition exists), and control is performed so that the processes of S1 to S5 described above are repeated for each performance data group belonging to each group classified according to the determined classification pattern until the setting condition is satisfied. That is, S1 (performance data classification process), S2 (prediction data calculation), and S3 (classification pattern determination process) successively subdivide each group classified at that point, and the parameter identification process of the mathematical model by S4 (per-group identification process) and the evaluation of the identification result by S5 (model accuracy evaluation process) are executed.
The setting condition for ending the identification of the mathematical model for each group may be, for example, that the value of the evaluation index of the prediction accuracy falls within a predetermined target range (target accuracy).
Here, when the number of performance data is relatively small, for example, the evaluation index may not necessarily converge to the target accuracy range. For this reason, it is desirable to add some stop condition to the setting condition.
For example, the evaluation index from the previous identification stored in the intermediate data storage unit 23 may be compared with the current evaluation index, and the stop condition may be that the degree of improvement of the current evaluation index (a difference, ratio, or the like) falls within a predetermined range (that is, the degree of improvement has converged).
Moreover, if performance data for evaluation and performance data for identification are prepared separately, and the prediction accuracy is evaluated on the evaluation data using the parameters identified from the identification data, the reliability (generality) of the accuracy evaluation increases. In this case, the division into evaluation data and identification data may be changed at each accuracy evaluation by N-fold cross-validation.
Another conceivable method is to create a tree (binary tree) that is as deep as possible and then prune it.
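The setting condition combined with a convergence-based stop condition might be checked as in the sketch below. It assumes an evaluation index for which smaller is better and measures improvement as a difference; the function name and the numeric defaults are hypothetical.

```python
def should_stop(current_index, previous_index=None, target=0.1, min_improvement=0.01):
    """Return True when a group needs no further subdivision.

    Stops when the index is within the target accuracy, or when the
    improvement over the previous identification has converged.
    """
    if current_index <= target:
        return True
    if previous_index is not None:
        return previous_index - current_index < min_improvement
    return False

print(should_stop(0.05))        # True: target accuracy reached
print(should_stop(0.5, 0.9))    # False: still improving
print(should_stop(0.5, 0.505))  # True: improvement has converged
```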

＜ステップＳ７＞
次に，Ｓ６において前記設定条件を満たす数式モデル（のパラメータ）が存在すると判別した場合は，前記同定パラメータ設定処理部１５の一部の処理として，そのときの前記パラメータをそのグループにおける数式モデルの最終的な確定パラメータとして前記パラメータ記憶部２４に記憶させる。例えば，そのグループの識別情報と同定後のパラメータとを関連付けて，前記パラメータ記憶部２４の確定パラメータを格納する予め定められた領域（フォルダ等）に記憶させる。
本実施の形態では，Ｓ１〜Ｓ６の処理により，実績データ群ｄｊを，２グループに分類し，分類された各グループをさらに２分類する処理が前記設定条件を満たすまで順次繰り返される。このような分類の情報は，２分木情報として表すことができる。
＜ステップＳ８＞
次に，前記確定パラメータの設定（記憶）を終了すると，前記同定パラメータ設定処理部１５の一部の処理として，その時点で分類されている全てのグループについて，前記設定条件を満足する数式モデル（即ち，同定後のパラメータ）となっているか否かを判別し，全てのグループについて前記設定条件を満たすまで，前記決定分類パターンに従って分類された各グループに属する各実績データ群それぞれについて，前述したＳ１〜Ｓ７の処理を繰り返すよう制御する。そして，全てが前記設定条件を満足した場合に，本同定処理を終了させる。 <Step S7>
Next, when it is determined in S6 that a mathematical model (set of parameters) satisfying the setting condition exists, the parameters at that time are stored in the parameter storage unit 24 as the final confirmed parameters of the mathematical model for that group, as part of the processing of the identification parameter setting processing unit 15. For example, the identification information of the group and the identified parameters are associated with each other and stored in a predetermined area (a folder or the like) of the parameter storage unit 24 that holds confirmed parameters.
In the present embodiment, through the processes of S1 to S6, the performance data group dj is classified into two groups, and the process of further classifying each classified group into two is repeated in sequence until the setting condition is satisfied. Such classification information can be represented as binary tree information.
<Step S8>
Next, when the setting (storing) of the confirmed parameters is finished, it is determined, as part of the processing of the identification parameter setting processing unit 15, whether every group classified at that point has a mathematical model (that is, identified parameters) satisfying the setting condition, and control is performed so that the processes of S1 to S7 described above are repeated for each performance data group belonging to each group classified according to the determined classification pattern until the setting condition is satisfied for all groups. When all groups satisfy the setting condition, the identification process ends.
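The control flow of the repeated subdivision can be sketched as a recursion. This is a deliberately simplified illustration and not the procedure of the embodiment: the split rule is a trivial stand-in (the median of a single explanatory variable x) rather than the Pj and information criterion selection of S3, the model is y = a*x + c, the accuracy index is the root-mean-square error, and each group, including the whole data set, is fitted before it is split. All names and data are hypothetical.

```python
from statistics import mean

def fit(records):
    """S4: least-squares fit of y = a*x + c to (x, y) records."""
    xs = [x for x, _ in records]
    ys = [y for _, y in records]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in records) / sxx if sxx else 0.0
    return a, my - a * mx

def rms_error(records, params):
    """S5: accuracy index, here the root-mean-square prediction error."""
    a, c = params
    return mean((y - (a * x + c)) ** 2 for x, y in records) ** 0.5

def median_split(records):
    """Stand-in for S1 to S3: split at the median x (NOT the rule of the text)."""
    xs = sorted(x for x, _ in records)
    theta = xs[len(xs) // 2]
    low = [r for r in records if r[0] < theta]
    high = [r for r in records if r[0] >= theta]
    return theta, low, high

def identify(records, target=1e-9, min_size=3):
    """Subdivide until every group meets the target or is too small to split."""
    params = fit(records)
    if rms_error(records, params) <= target or len(records) < 2 * min_size:
        return {"params": params}                   # S7: terminal group
    theta, low, high = median_split(records)
    if len(low) < min_size or len(high) < min_size:
        return {"params": params}
    return {"threshold": theta,                     # S8: keep subdividing
            "low": identify(low, target, min_size),
            "high": identify(high, target, min_size)}

# Hypothetical data: y = 2x for x < 5 and y = 20 - x for x >= 5.
records = [(float(x), 2.0 * x) for x in range(5)] + \
          [(float(x), 20.0 - x) for x in range(5, 10)]
tree = identify(records)
print(tree)
```

The result is a nested dictionary, which is one possible in-memory form of the binary tree information mentioned above.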

図４は，情報処理装置Ｘによる分類パターン決定の過程を２分木で模式的に表したものである。
例えば，第１回目の分類パターン決定処理（Ｓ３）により，前記データ項目ｘ１について，閾値を０．２として分類（０．２以上と０．２未満）する分類パターンが決定された場合，その分類パターンは，図４（ａ）に示すように，ルートの節から２つに枝分かれした２分木として表すことができる。その一方の枝は，「ｘ１＜０．２」のグループを表し，他の一方の枝は残り（「ｘ≧０．２」）のグループを表す。図４では，枝先の終端部は，「ｔk」（ｋは各終端部の識別番号）で表す。
図４（ａ）の２分木を表す２分木情報は，例えば，「ｘ１＝０．２（ｔ１，ｔ２）」等と表すことができる。これは，ルートの節では，前記データ項目ｘ１について，閾値０．２で２分岐（２分類）し，その分岐した枝先各々が終端部ｔ１，ｔ２であることを表す。 FIG. 4 schematically shows a process of determining a classification pattern by the information processing apparatus X using a binary tree.
For example, when the first classification pattern determination process (S3) determines a classification pattern that classifies the data item x1 with a threshold of 0.2 (0.2 or more, and less than 0.2), the classification pattern can be represented, as shown in FIG. 4A, as a binary tree with two branches from the root node. One branch represents the group “x1 < 0.2”, and the other branch represents the remaining group (“x1 ≥ 0.2”). In FIG. 4, the terminal end of each branch is denoted “tk” (k being the identification number of each terminal end).
The binary tree information representing the binary tree of FIG. 4A can be expressed as, for example, “x1 = 0.2 (t1, t2)”. This indicates that, at the root node, the data item x1 is split in two (classified into two) with a threshold of 0.2, and that the resulting branches end at the terminal ends t1 and t2.

また，図４（ｂ），（ｃ）は，Ｓ６又はＳ８でのループ処理により，第２回目の分類パターン決定処理（Ｓ３）がなされた結果の一例を表す。
図４（ｂ）は，第２回目の分類パターン決定処理で，前記データ項目ｘ１が０．２以上のグループおいて，さらに，前記データ項目ｘ２について，閾値を５として分類（５以上と５未満）する分類パターンが決定された場合の例である。
この図４（ｂ）の２分木を表す２分木情報は，「ｘ１＝０．２（ｘ２＝５（ｔ１１，ｔ１２），ｔ２）」等と表すことができる。これは，前記データ項目ｘ１について，閾値＝０．２で２分岐（２分類）した一方（ｘ１≧０．２）のグループについて，さらに，前記データ項目ｘ２について，閾値＝５として分岐した枝先各々が終端部ｔ１１，ｔ１２であることを表す。
一方，図４（ｃ）は，２回目の分類パターン決定処理で，前記データ項目ｘ１が０．２未満のグループおいて，さらに，前記データ項目ｘ２について，閾値を５として分類（５以上と５未満）する分類パターンが決定された場合の例である。さらにグループが細分化されても同様に表すことができる。
この図４（ｃ）の２分木を表す２分木情報は，同様に，「ｘ１＝０．２（ｔ１，ｘ２＝５（ｔ２１，ｔ２２））」等と表すことができる。 FIGS. 4B and 4C show an example of the result of the second classification pattern determination process (S3) performed by the loop process in S6 or S8.
FIG. 4B shows an example in which the second classification pattern determination process determines a classification pattern that, within the group where the data item x1 is 0.2 or more, further classifies the data item x2 with a threshold of 5 (5 or more, and less than 5).
The binary tree information representing the binary tree of FIG. 4B can be expressed as “x1 = 0.2 (x2 = 5 (t11, t12), t2)” or the like. This indicates that, for the one group (x1 ≥ 0.2) obtained by splitting the data item x1 in two with a threshold of 0.2, the data item x2 is further split with a threshold of 5, and the resulting branches end at the terminal ends t11 and t12.
On the other hand, FIG. 4C shows an example in which the second classification pattern determination process determines a classification pattern that, within the group where the data item x1 is less than 0.2, further classifies the data item x2 with a threshold of 5 (5 or more, and less than 5). Further subdivided groups can be represented in the same manner.
Similarly, the binary tree information representing the binary tree of FIG. 4C can be expressed as “x1 = 0.2 (t1, x2 = 5 (t21, t22))” or the like.
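The textual binary tree information can be produced from a nested structure with a few lines of code. The tuple layout `(item, threshold, first, second)` is a hypothetical representation chosen for this sketch; following the FIG. 4B and FIG. 4C examples, the first slot is taken to hold the group at or above the threshold.

```python
def tree_to_text(node):
    """Render a nested split as e.g. 'x1=0.2(x2=5(t11,t12),t2)'."""
    if isinstance(node, str):  # terminal end label such as 't1'
        return node
    item, threshold, first, second = node
    return f"{item}={threshold}({tree_to_text(first)},{tree_to_text(second)})"

fig_4a = ("x1", 0.2, "t1", "t2")
fig_4b = ("x1", 0.2, ("x2", 5, "t11", "t12"), "t2")
fig_4c = ("x1", 0.2, "t1", ("x2", 5, "t21", "t22"))

print(tree_to_text(fig_4a))  # x1=0.2(t1,t2)
print(tree_to_text(fig_4b))  # x1=0.2(x2=5(t11,t12),t2)
print(tree_to_text(fig_4c))  # x1=0.2(t1,x2=5(t21,t22))
```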

以上示した２分木情報を，前記決定分類パターンを表す識別情報として前記パラメータ記憶部２４に記憶させておけば，その情報により，どのような分類パターンが決定されたかを把握することができる。
ここで，分類に用いる前記データ項目の値が，名義属性データ（非数値データ）である場合は，前記２グループに分類する分類パターンごとに，その２グループの一方を表す識別記号を予め定めておき，その識別情報を閾値の代わりに用いればよい。
また，前記終端部の識別情報ｔkに関連付けて，前記同定パラメータ設定処理（Ｓ７）で設定される同定後の数式モデルのパラメータの組（ａ，ｂ，ｃ）を前記パラメータ記憶部２４に記憶させておけば，その識別情報ｔkと前記２分木情報とによって，グループを特定できる。
この場合，前記パラメータ記憶部２４に記憶される前記２分木情報，及びこれと前記終端部の識別情報ｔkによって関連付けられるグループごとのパラメータ（ａ，ｂ，ｃ）と，予め与えられた数式モデル（１）式とが，モデル対象の予測モデルを表すことになる。 If the binary tree information shown above is stored in the parameter storage unit 24 as identification information representing the determined classification pattern, it is possible to grasp what classification pattern has been determined based on the information.
Here, when the values of the data item used for classification are nominal attribute data (non-numeric data), an identification symbol representing one of the two groups may be defined in advance for each classification pattern that classifies into two groups, and that identification symbol may be used in place of the threshold.
Further, if the set of parameters (a, b, c) of the identified mathematical model set in the identification parameter setting process (S7) is stored in the parameter storage unit 24 in association with the identification information tk of the terminal end, the group can be specified from the identification information tk and the binary tree information.
In this case, the binary tree information stored in the parameter storage unit 24, the per-group parameters (a, b, c) associated with it through the identification information tk of the terminal ends, and the mathematical model of equation (1) given in advance together represent the prediction model of the model object.

以上のようにして求めた予測モデルを用いて，運転データから前記予測データを求める場合，まず，入力された運転データと前記２分木情報とに基づいて，２分木のいずれの終端部に属するかを判別し，その終端部の識別情報ｔkに対応したパラメータを適用した数式モデルにより，前記予測データを算出すればよい。
以上示したように，技術者や操業オペレータ等の人間を介在せずに数式モデルの同定を行えるので，人間のデータ処理能力に制限されず，大量の実績データを十分に活用してモデル精度の向上を図ることができる。
また，数式モデルの予測精度の改善が最も大きくなると考えられるグループ分類のパターンから順に，網羅的な探索によって決定され，さらに，決定されたグループ分類でのグループ毎に数式モデルを同定する処理が順次グループをさらに細分化（分類）しながら繰り返されるので，大量の実績データを有効活用しつつ効率的に高精度の数式モデルの同定を行うことができる。
さらに，データマイニング手法等のように，対象から得られたデータのみに基づいて処理を行うのではなく，物理モデル等に基づく数式モデルのパラメータを同定するという形で実績データの解析が行われるので，数式モデルによる外挿或いは内挿の機能により，モデル対象と実績データとの間の運転条件の違いが大きい場合や，モデル対象と近似した条件の実績データの数が少ない場合であっても，高い精度での知識抽出（グループ化とモデル同定）を行うことができる。
また，グループ分類（条件分類）の自動化により，より細分化したグループ毎に数式モデルのパラメータを設定することが容易となるので，数式モデル自体が比較的簡易な線形モデル等であっても，高い精度で非線形のモデル対象をモデル化することが可能となる。
以上示した実施の形態では，グループ分類は，２グループへの分類を行うものであるが，３グループ以上の多グループへの分類を行うものであってもよい。この場合，グループ相互間の統計情報の差異の大きさの評価は，例えば，多グループから任意に選択した２グループ相互間の差異の評価結果（評価指標）の平均値や，平均値と最大値と最小値との加重平均値等を総合指標として評価することが考えられる。 When prediction data is obtained from operation data using the prediction model obtained as described above, it is first determined, based on the input operation data and the binary tree information, which terminal end of the binary tree the data belongs to, and the prediction data is then calculated by the mathematical model to which the parameters corresponding to the identification information tk of that terminal end are applied.
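Prediction with the identified model, as just described, amounts to walking the binary tree to a terminal end and applying that terminal's parameters to equation (1). The sketch below assumes a hypothetical tuple representation `(item, threshold, at_or_above, below)` and hypothetical parameter values; which of t11 and t12 holds the values at or above the threshold is likewise an assumption.

```python
def find_terminal(node, record):
    """Walk the binary tree to the terminal end label for one operation record."""
    while not isinstance(node, str):
        item, threshold, at_or_above, below = node
        node = at_or_above if record[item] >= threshold else below
    return node

def predict(tree, parameters, record):
    """Apply equation (1) with the parameters of the record's terminal end."""
    a, b, c = parameters[find_terminal(tree, record)]
    return a * record["x1"] + b * record["x2"] + c

# Tree shaped like FIG. 4B, with hypothetical parameters (a, b, c) per terminal end.
tree = ("x1", 0.2, ("x2", 5, "t11", "t12"), "t2")
parameters = {"t11": (1.0, 0.5, 0.0), "t12": (2.0, 0.0, 1.0), "t2": (0.0, 1.0, -1.0)}

print(find_terminal(tree, {"x1": 0.5, "x2": 3.0}))        # t12
print(predict(tree, parameters, {"x1": 0.5, "x2": 3.0}))  # 2.0
```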
As described above, the mathematical model can be identified without the intervention of humans such as engineers and operators, so the identification is not limited by human data processing capacity, and a large amount of performance data can be fully utilized to improve model accuracy.
In addition, group classification patterns are determined by exhaustive search, in order from the pattern considered to give the greatest improvement in the prediction accuracy of the mathematical model, and the process of identifying the mathematical model for each group of the determined classification is repeated while the groups are successively subdivided (classified). A highly accurate mathematical model can therefore be identified efficiently while making effective use of a large amount of performance data.
Furthermore, unlike data mining methods and the like, processing is not based solely on the data obtained from the object; instead, the performance data is analyzed by identifying the parameters of a mathematical model based on a physical model or the like. Thanks to the extrapolation or interpolation capability of the mathematical model, knowledge extraction (grouping and model identification) can be performed with high accuracy even when the operating conditions of the model object differ greatly from those of the performance data, or when there is little performance data for conditions close to those of the model object.
In addition, automating the group classification (condition classification) makes it easy to set the parameters of the mathematical model for each more finely subdivided group, so a nonlinear model object can be modeled with high accuracy even if the mathematical model itself is a relatively simple linear model or the like.
In the embodiment described above, the group classification classifies into two groups, but it may classify into three or more groups. In this case, the magnitude of the difference in statistical information between groups may be evaluated using, as a comprehensive index, for example, the average of the evaluation results (evaluation indices) of the differences between pairs of groups arbitrarily selected from the multiple groups, or a weighted average of the average, maximum, and minimum of those results.

（abaloneデータに対する適用例）
前記情報処理装置Ｘによる数式モデル同定を，一般に公開されているabalone（アワビ）の年齢推定問題に適用した例について説明する。
図５に，Abaloneデータのデータ項目の構成例を示す。図５に示すように，９つの前記データ項目があり，そのうち，１つの項目である「Ｒｉｎｇｓ（アワビの年齢）」が，数式モデルの目的変数（出力），であり，残りの８項目が，数式モデルの説明変数（入力）である。
実績データの総ケース数は，4177件である。即ち，Sex, Length, Diameter, Height, Whole weight, Shucked weight, Viscera weight, Shell weightを入力としてRingsを出力とする数式モデルを同定する問題を考える。
一般に公開されているAbaloneデータはShell weightも連続値属性(continuous)であるが，説明のため，値の大きさでL,M.Hの名義属性データに変換して用いている。
ここで，数式モデルとして，連続数値属性の線形重回帰モデルを用いる。即ち，Length, Diameter, Height, Whole weight, Shucked weight, Viscera weightの６つの変数を入力とし，Ringsを出力する線形式がモデル対象（アワビ）の特性を表す数式モデルであると仮定する（プラントモデルの場合の物理モデルに対応）。
上記条件下で，本発明を適用して数式モデルの同定を行った結果と，特許文献１等に示される周知のＣＡＲＴシステムによるモデリング結果とを比較した。
ＣＡＲＴ(Classification And Regression Trees：回帰木)システムは，決定木に基づいたモデリング手法であり，説明変数を決定木により閾値分割し，階段状の関数で対象を表す。階段状の不連続関数で対象を記述するため，モデル精度は従来のモデリング手法と比較して高くなるわけではないが，モデリングの結果によってモデルの構造が明確になり，モデリング結果を知識として活用して対象を説明することが可能となる。
ここで，モデル精度の評価は，次の（６）式による自乗誤差Ｒｄにより行った。

ここで，Ｎは実績データのデータ数（ケース数），ｆ（ｘ）は予測データ，ｙ_nはその予測データに対応する実績データを表す。その結果，モデルの予測精度は，ＣＡＲＴシステムでは，自乗誤差Ｒｄ=６．９７であったのに対して，本発明によれば，Ｒｄ＝４．８２とより高い予測精度が得られた。
図６に，このときのＣＡＲＴシステムにより得られた決定木（２分木）（図６（ａ））と，本発明の適用により得られた決定木（図６（ｂ））とを示す。本発明によれば，モデル精度劣化の原因を的確に捉えたグループ分類により，モデル精度を高くできている。 (Application example for abalone data)
An example in which the mathematical model identification by the information processing apparatus X is applied to an abalone age estimation problem that is publicly available will be described.
FIG. 5 shows a configuration example of data items of Abalone data. As shown in FIG. 5, there are nine data items, of which “Rings (age of abalone)” is an objective variable (output) of the mathematical model, and the remaining eight items are This is an explanatory variable (input) of the mathematical model.
The total number of cases in actual data is 4177. That is, consider the problem of identifying a mathematical model that takes Sex, Length, Diameter, Height, Whole weight, Shucked weight, Viscera weight, and Shell weight as inputs and outputs Rings.
In the publicly available Abalone data, Shell weight is also a continuous value attribute (continuous), but for the purpose of explanation, it is converted to nominal attribute data of L and MH by the size of the value.
Here, a linear multiple regression model having continuous numerical attributes is used as the mathematical model. That is, it is assumed that the linear form that outputs the Rings is a mathematical model that expresses the characteristics of the model object (abalone), with six variables of Length, Diameter, Height, Whole weight, Shucked weight, and Viscera weight as inputs (plant model) Corresponds to the physical model in the case of.
Under the above-mentioned conditions, the result of identifying the mathematical model by applying the present invention was compared with the modeling result by the well-known CART system disclosed in Patent Document 1 and the like.
A CART (Classification And Regression Trees) system is a modeling technique based on a decision tree, in which explanatory variables are threshold-divided by a decision tree and an object is represented by a step-like function. Since the target is described by a step-like discontinuous function, the model accuracy is not higher than that of the conventional modeling method, but the model structure is clarified by the modeling result, and the modeling result is used as knowledge. Can explain the subject.
Here, the model accuracy was evaluated by the squared error Rd given by the following equation (6).

Here, N is the number of actual data (number of cases), f(x) is the prediction data, and y_n is the actual data corresponding to the prediction data. As a result, the squared error was Rd = 6.97 with the CART system, whereas the present invention achieved a higher prediction accuracy of Rd = 4.82.
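Equation (6) itself is not reproduced in this text. The following minimal sketch assumes that Rd is the mean of the squared differences between the actual data y_n and the prediction data over the N cases; the division by N and the function name `squared_error_rd` are assumptions, not taken from the original equation.

```python
def squared_error_rd(actual, predicted):
    """Assumed form of Rd: (1/N) * sum over n of (y_n - f(x_n))**2."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("need two non-empty sequences of equal length")
    return sum((y - f) ** 2 for y, f in zip(actual, predicted)) / len(actual)

# Made-up values: two exact predictions and one that is off by 2.
rd = squared_error_rd([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(rd)
```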
FIG. 6 shows the decision tree (binary tree) obtained by the CART system (FIG. 6(a)) and the decision tree obtained by applying the present invention (FIG. 6(b)). According to the present invention, model accuracy can be increased by a group classification that accurately captures the cause of the degradation in model accuracy.
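The effect of such a group classification can be sketched with a toy case: one linear model is fitted to all cases, the prediction errors are compared between two groups defined by a nominal attribute, and the model is then re-identified within each group. The data, the single input variable, and all function names are hypothetical; the labels L and MH merely echo the nominal Shell weight attribute, and the mean squared error is an assumed accuracy measure.

```python
from statistics import mean, pstdev

def fit_line(x, y):
    """Least-squares fit of y ~ a + b*x (one input only, for brevity)."""
    mx, my = mean(x), mean(y)
    b = sum((u - mx) * (v - my) for u, v in zip(x, y)) / sum((u - mx) ** 2 for u in x)
    return my - b * mx, b

def rd(model, x, y):
    """Mean squared prediction error of a fitted line (assumed accuracy measure)."""
    a, b = model
    return mean([(v - (a + b * u)) ** 2 for u, v in zip(x, y)])

# Made-up data: the two nominal classes follow different linear laws.
x = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
label = ["L"] * 4 + ["MH"] * 4
y = [2.0, 4.0, 6.0, 8.0, 9.0, 8.0, 7.0, 6.0]

# One model for all cases, and its prediction errors.
a, b = fit_line(x, y)
errors = [v - (a + b * u) for u, v in zip(x, y)]

# Error statistics per group: a large difference suggests a useful grouping.
index = {g: [i for i, l in enumerate(label) if l == g] for g in ("L", "MH")}
error_stats = {g: (mean([errors[i] for i in idx]), pstdev([errors[i] for i in idx]))
               for g, idx in index.items()}

# Re-identify the model separately within each group.
models = {g: fit_line([x[i] for i in idx], [y[i] for i in idx])
          for g, idx in index.items()}
rd_global = rd((a, b), x, y)
rd_grouped = mean([(y[i] - (models[g][0] + models[g][1] * x[i])) ** 2
                   for g, idx in index.items() for i in idx])
print(rd_global, rd_grouped, error_stats)
```

In this toy case the mean prediction error has opposite signs in the two groups, which signals that a single model is hiding two different behaviours; after grouping, each linear model fits its own cases exactly.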

(Application example to rolling load prediction model in steel rolling process)
The following describes an example in which mathematical model identification by the information processing apparatus X is applied to the identification of a mathematical model for predicting the rolling load of thick steel plate.
In this example, as shown in FIG. 2, the actual data group dj covers 25 data items in total, including data items for continuous numerical data such as the plate thickness (H), the time between passes (TOM), and the plate temperature (T), and data items for nominal attribute data such as the steel type (S).
As the mathematical model, the model obtained from equation (6) and equations (14) to (17) of Non-Patent Document 1 was used. This is a mathematical model that predicts the rolling load (objective variable) from five explanatory variables.
FIG. 7 shows an example of a decision tree (binary tree) obtained by applying the present invention to this case. The numerical values in the figure are normalized using the average value and standard deviation of the actual data of each data item. FIG. 7 shows a result close to an engineer's intuition, namely a group classification (branching of the binary tree) in which the characteristics change due to transformation or recovery.
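Normalization by average value and standard deviation can be sketched as a standard score. The text does not state which standard deviation estimator is used, so the population standard deviation is an assumption here; the function name and the sample values are hypothetical.

```python
from statistics import mean, pstdev

def normalize(values):
    """Subtract the average and divide by the (population) standard deviation."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        raise ValueError("all values are identical; cannot normalize")
    return [(v - m) / s for v in values]

# Made-up values for one data item.
z = normalize([10.0, 20.0, 30.0])
print(z)
```

After normalization each data item has zero average and unit standard deviation, so thresholds shown in a tree become comparable across items with different units.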

The present invention can be used for a mathematical model identification device or the like.

A block diagram showing the configuration of the main part of the information processing apparatus X, which is an example of the model identification apparatus according to an embodiment of the present invention.
A diagram showing an example of the schematic structure of the actual data.
A flowchart showing the model identification processing procedure performed by the information processing apparatus X.
A diagram schematically showing, as a binary tree, the process of classification pattern determination by the information processing apparatus X.
A diagram schematically showing, as a binary tree, the process of classification pattern determination by the information processing apparatus X.
An example of the decision trees created when the conventional CART system is applied to the Abalone data and when model identification by the information processing apparatus X is applied.
An example of the decision tree generated when mathematical model identification by the information processing apparatus X is applied to the rolling load prediction model in the steel rolling process.

Explanation of symbols

1 ... Control unit
2 ... Storage means
3 ... Input means
4 ... Display means
11 ... Actual data classification processing unit
12 ... Prediction data calculation unit
13 ... Classification pattern determination processing unit
14 ... Group identification processing unit
15 ... Identification parameter setting processing unit

Claims

A model identification apparatus comprising control means for reading an actual data group from actual data storage means that stores the actual data group, which consists of actual data for a plurality of data items related to a model object under each of a plurality of conditions of the model object, for identifying, based on the actual data group, parameters of a mathematical model that obtains prediction data for an objective variable, which is another one of the data items, from data on a plurality of explanatory variables that are some of the data items, and for storing the identified parameters in parameter storage means,
wherein the control means comprises:
actual data classification means for classifying the actual data group into a plurality of groups under each of a plurality of classification patterns;
actual data reading means for reading the actual data from the actual data storage means;
prediction data calculation means for obtaining, for each of the plurality of groups under each of the plurality of classification patterns, the prediction data by applying the actual data belonging to the group to the mathematical model;
classification pattern determining means for determining the classification pattern to be adopted based on the difference, between the plurality of groups of each classification pattern, in statistical information of the prediction error data group between the prediction data and the corresponding actual data;
group identification means for identifying, for each group classified according to the determination result of the classification pattern determining means, parameters of the mathematical model based on the actual data belonging to that group; and
identification parameter setting means for causing, until the prediction accuracy of the mathematical model to which the parameters identified by the group identification means are applied satisfies a set condition, the actual data classification means, the prediction data calculation means, and the classification pattern determining means to successively subdivide each actual data group belonging to each group classified according to the determination result of the classification pattern determining means, and the group identification means to identify the parameters of the mathematical model, and for storing the parameters obtained when the set condition is satisfied in the parameter storage means as the parameters of the mathematical model for that group.

The model identification apparatus according to claim 1, wherein the classification pattern determining means selects the data items to be used for group classification based on the differences, between the plurality of groups, in the average value and standard deviation of the prediction error, and determines the classification pattern to be adopted, from among the classification patterns that use the selected data items, based on the difference in an information criterion between the plurality of groups.

A model identification program for causing a computer to execute processing of reading an actual data group from actual data storage means that stores the actual data group, which consists of actual data for a plurality of data items related to a model object under each of a plurality of conditions of the model object, identifying, based on the actual data group, parameters of a mathematical model that obtains prediction data for an objective variable, which is another one of the data items, from data on a plurality of explanatory variables that are some of the data items, and storing the identified parameters in parameter storage means, the program causing the computer to execute:
an actual data classification process of classifying the actual data group into a plurality of groups under each of a plurality of classification patterns;
an actual data reading process of reading the actual data from the actual data storage means;
a prediction data calculation process of obtaining, for each of the plurality of groups under each of the plurality of classification patterns, the prediction data by applying the actual data belonging to the group to the mathematical model;
a classification pattern determination process of determining the classification pattern to be adopted based on the difference, between the plurality of groups of each classification pattern, in statistical information of the prediction error data group between the prediction data and the corresponding actual data;
a group identification process of identifying, for each group classified according to the determination result of the classification pattern determination process, parameters of the mathematical model based on the actual data belonging to that group; and
an identification parameter setting process of successively subdividing, until the prediction accuracy of the mathematical model to which the parameters identified by the group identification process are applied satisfies a set condition, each actual data group belonging to each group classified according to the determination result of the classification pattern determination process by executing the actual data classification process, the prediction data calculation process, and the classification pattern determination process, executing the identification of the parameters of the mathematical model by the group identification process, and storing the parameters obtained when the set condition is satisfied in the parameter storage means as the parameters of the mathematical model for that group.

A model identification method for identifying a predetermined mathematical model of a model object based on an actual data group consisting of actual data for a plurality of data items related to the model object under each of a plurality of conditions of the model object, the method comprising:
a prediction data calculation step of obtaining, using the mathematical model, prediction data for an objective variable, which is another one of the data items, from data on a plurality of explanatory variables that are some of the data items;
an actual data classification step of classifying the actual data group into a plurality of groups under each of a plurality of classification patterns;
a classification pattern determination step of determining the classification pattern to be adopted based on the difference, between the plurality of groups of each classification pattern, in statistical information of the prediction error data group between the prediction data and the corresponding actual data;
a group identification step of identifying, for each group classified according to the determination result of the classification pattern determination step, parameters of the mathematical model based on the actual data belonging to that group; and
a parameter setting step of successively subdividing, until the prediction accuracy of the mathematical model to which the parameters identified by the group identification step are applied satisfies a set condition, each actual data group belonging to each group classified according to the determination result of the classification pattern determination step by executing the actual data classification step, the prediction data calculation step, and the classification pattern determination step, executing the identification of the parameters of the mathematical model by the group identification step, and setting the parameters obtained when the set condition is satisfied as the parameters of the mathematical model for that group.
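The repeated subdivision recited in the method can be sketched as a recursive procedure: identify the model on a group, stop if the accuracy condition is met, otherwise adopt the two-way split whose groups differ most in prediction error statistics and repeat on each part. This is a minimal sketch under stated assumptions and not the claimed implementation: the score (gap in mean plus gap in standard deviation of the error), the threshold-only splits, the one-input linear model, the mean squared error as accuracy measure, the `min_size` limit, and every name in the code are hypothetical.

```python
from statistics import mean, pstdev

def fit_line(rows):
    """Least-squares fit of y ~ a + b*x over the rows of one group."""
    mx, my = mean([r["x"] for r in rows]), mean([r["y"] for r in rows])
    sxx = sum((r["x"] - mx) ** 2 for r in rows)
    b = sum((r["x"] - mx) * (r["y"] - my) for r in rows) / sxx if sxx else 0.0
    return my - b * mx, b

def identify(rows, items, target_rd, min_size=3):
    """Fit a group, then split on the item whose two groups differ most in error statistics."""
    a, b = fit_line(rows)
    err = [r["y"] - (a + b * r["x"]) for r in rows]
    rd = mean([e * e for e in err])
    leaf = {"model": (a, b), "rd": rd}
    if rd <= target_rd:                      # set condition satisfied: keep these parameters
        return leaf
    best = None
    for item in items:
        values = sorted({r[item] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2                # candidate classification pattern
            g1 = [e for r, e in zip(rows, err) if r[item] <= t]
            g2 = [e for r, e in zip(rows, err) if r[item] > t]
            if min(len(g1), len(g2)) < min_size:
                continue
            # Assumed score: gap in mean plus gap in standard deviation of the error.
            score = abs(mean(g1) - mean(g2)) + abs(pstdev(g1) - pstdev(g2))
            if best is None or score > best[0]:
                best = (score, item, t)
    if best is None:                         # no admissible split left
        return leaf
    _, item, t = best
    low = [r for r in rows if r[item] <= t]
    high = [r for r in rows if r[item] > t]
    return {"item": item, "threshold": t,
            "low": identify(low, items, target_rd, min_size),
            "high": identify(high, items, target_rd, min_size)}

# Made-up data: the law relating x to y changes with item "T", not with item "H".
rows = [{"x": float(i % 4 + 1), "T": i // 4, "H": i % 2} for i in range(8)]
for r in rows:
    r["y"] = 2 * r["x"] if r["T"] == 0 else 10 - r["x"]
tree = identify(rows, ["T", "H"], target_rd=1e-9)
print(tree)
```

On this toy data the procedure splits once on item "T" and then stops, because each resulting group is fitted exactly by its own line. Nominal data items would need subset-based splits rather than thresholds, which the sketch omits.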