JP2020177542A

JP2020177542A - State variation detection device and program for state variation detection

Info

Publication number: JP2020177542A
Application number: JP2019080556A
Authority: JP
Inventors: Toshikazu Suzuki (鈴木 俊和); Shinichi Makino (牧野 真一)
Original assignee: Toshiba Information Systems Japan Corp
Current assignee: Toshiba Information Systems Japan Corp
Priority date: 2019-04-19
Filing date: 2019-04-19
Publication date: 2020-10-29
Anticipated expiration: 2039-04-19
Also published as: JP7012928B2

Abstract

PROBLEM TO BE SOLVED: To enable verification, correction, or a change of result for predictions made in state variation detection that uses a prediction model created from teaching data.

SOLUTION: A state variation detection device comprises: prediction model creation means 31 that creates a prediction model based on teaching data; predetermined range importance explanatory variable searching means 32 that determines, in the prediction model, a predetermined range importance explanatory variable, which is an explanatory variable whose importance is lower than a preset threshold; teaching data geometric distance calculation means 33 that determines the geometric distance of the teaching data by using the predetermined range importance explanatory variable; analysis object data geometric distance calculation means 34 that determines the geometric distance of analysis object data; and state variation detection means 35 that detects a variation in state of the analysis object data based on the distribution of the teaching data geometric distance and the distribution of the analysis object data geometric distance.

SELECTED DRAWING: Figure 4

Description

The present invention relates to a state change detection device and a state change detection program.

For example, when a state change such as an abnormality is to be detected by measuring the current, voltage, and vibration of a machine, the abnormality is generally judged by a person on the basis of past experience and the like. That is, the threshold for a state change is set by a person, and it has been almost impossible for a machine that holds no abnormality data to detect a state change on its own.

Patent Document 1 discloses a system that predicts the arrival status of a radio wave when abnormal propagation of that radio wave occurs. The system includes weather information acquisition means for acquiring weather information, and weather data extraction means for extracting, as weather data, at least one of information on temperature at each altitude, information on atmospheric pressure at each altitude, and information on humidity at each altitude from the acquired weather information. The system further stores a prediction function for predicting abnormal propagation of the radio wave, calculated from measurement data accumulated in advance regarding radio wave arrival on the days for which the weather data was extracted. The weather data extracted by the weather data extraction means is input to the stored prediction function to determine a predicted value, which is the probability that abnormal propagation of the radio wave will occur.

Patent Document 2 discloses a quality abnormality detection method comprising: an acquisition step of acquiring, from machine equipment capable of producing a product, time-series data of a sensor of the machine equipment for each predetermined production unit of the product; a division step of generating a divided data group by dividing the time-series data into a plurality of pieces of divided data; and a setting step of associating first additional information, which indicates that there is no failure, with a predetermined normal-quality portion included in the divided data group, and unconditionally associating second additional information, which indicates that there is a failure, with the most recent production portion included in the divided data group.

This quality abnormality detection method further comprises: a grouping step of forming two groups in each of which the normal-quality portion and the most recent production portion are mixed; a generation step of generating, by learning with one of the two groups, a prediction model that predicts the additional information corresponding to input data; and a calculation step of calculating, from the prediction result obtained by using the other of the two groups as the input data, the probability that the additional information is predicted to be the second additional information. A quality abnormality of the product is judged on the basis of this probability.

Patent Document 3 discloses a data prediction method that predicts the future on the basis of past results. In this method, the prediction model consists of a first prediction model that outputs a predicted value for at least one feature quantity on the prediction target day, and a second prediction model that takes the predicted value output from the first prediction model as one of its input factors and outputs a predicted value for each predetermined time interval of the prediction target day.

The data prediction method of Patent Document 3 uses prediction model construction means for constructing a prediction model from the collected most-recent result data and past result data, prediction execution means for inputting prediction input data to the constructed prediction model and obtaining a predicted value, and prediction error calculation means for calculating a prediction error or model error from the collected most-recent result data and the predicted value. A correction coefficient or correction amount for correcting the predicted value is calculated from the prediction error or model error, and an axial-direction corrected predicted value is obtained.

Meanwhile, a state change detection device based on a random forest is known in which, as shown in FIG. 1(A), region boundaries separate a normal case, indicated by "1", from an abnormal case, indicated by "2". In this device, the whole region where the vertical-axis value is 160 or more and the horizontal-axis value is smaller than 45 is normal "1". In the region where the vertical-axis value is smaller than 160, a point is normal "1" if the horizontal-axis value is smaller than 30 and abnormal "2" if it is 30 or more. In the region where the vertical-axis value is 160 or more and the horizontal-axis value is larger than 45, a point is abnormal "2" if the vertical-axis value is smaller than 260 and normal "1" if it is 260 or more. In FIG. 1(A), the plain white regions are normal "1" and the satin-patterned regions are abnormal "2".

When abnormal and normal are judged by dividing the space into regions in this way, a machine-learning state change detection device can be realized in which a classifier makes predictions by branching (for example, in a tree structure). Suppose that analysis target data D as shown in FIG. 1(B) arises in such a device. According to the regions of FIG. 1(A), D falls in the region where the vertical-axis value is 160 or more and the horizontal-axis value is smaller than 45, and is therefore classified as normal "1", as illustrated. However, its vertical-axis value is roughly 10000, which is far from ordinary vertical-axis values such as 400, as the arrow indicates. D is clearly analysis target data that should be classified as abnormal "2", so a misclassification has occurred.
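The misclassification described above can be reproduced with a few lines of code. The following is a minimal sketch, assuming the region boundaries quoted for FIG. 1(A); the function name `classify_fig1a` and the horizontal-axis value chosen for data D are hypothetical, since the text only gives D's vertical-axis value (roughly 10000).

```python
def classify_fig1a(horizontal: float, vertical: float) -> int:
    """Return 1 (normal) or 2 (abnormal) from the FIG. 1(A) region boundaries."""
    if vertical < 160:
        # Lower band: split on the horizontal axis at 30.
        return 1 if horizontal < 30 else 2
    if horizontal < 45:
        # Upper-left region: always normal, however large the vertical value is.
        return 1
    # Upper-right region (the boundary value 45 itself is assigned here by assumption).
    return 2 if vertical < 260 else 1


# Hypothetical stand-in for analysis target data D: the horizontal value 20.0 is assumed.
label_for_d = classify_fig1a(horizontal=20.0, vertical=10000.0)
print(label_for_d)
```

Because the upper-left region has no upper bound on the vertical axis, a point far outside the range of ordinary data still receives the label "1".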

In a random forest state change detection device that uses classifiers with a decision tree structure as shown in FIG. 2, branching takes place at the points marked by circles. In FIG. 2, the explanatory variables used at the branches are denoted A, B, C, D, and E. For example, explanatory variable A is age, B is body weight, C is height, D is sex, and so on. In such a device, explanatory variables of low importance do not affect the result. Specifically, if the device outputs "healthy" or "unhealthy", the explanatory variable D (sex) can be said to have little effect on the result. Among all the explanatory variables A to E, a variable that does not affect the result is treated as a low-importance explanatory variable, meaning that the result does not change even if its value fluctuates greatly. Here, the importance is determined by randomly permuting the values of each explanatory variable A, B, C, D, and E in turn and calculating how much error results.
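The permutation idea in the last sentence can be illustrated as follows. This is a minimal sketch, not the procedure of any particular library: the helper names, the toy `predict` rule standing in for a trained model, and the data are all hypothetical.

```python
import random


def error_rate(predict, rows, labels):
    """Fraction of rows whose predicted label differs from the true label."""
    wrong = sum(1 for row, label in zip(rows, labels) if predict(row) != label)
    return wrong / len(rows)


def permutation_importance(predict, rows, labels, n_features, seed=0):
    """Importance of each variable = increase in error after shuffling that variable."""
    rng = random.Random(seed)
    baseline = error_rate(predict, rows, labels)
    importances = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        rng.shuffle(column)  # randomly permute the values of variable j only
        shuffled = [row[:j] + (value,) + row[j + 1:] for row, value in zip(rows, column)]
        importances.append(error_rate(predict, shuffled, labels) - baseline)
    return importances


# Toy stand-in for a trained model: only variable 0 influences the result.
def predict(row):
    return 1 if row[0] < 50 else 2


rows = [(float(a), float(b)) for a, b in zip(range(0, 100, 5), range(100, 0, -5))]
labels = [predict(row) for row in rows]
importances = permutation_importance(predict, rows, labels, n_features=2)
print(importances)
```

Shuffling variable 1 cannot change any prediction of this toy model, so its importance is exactly zero, which matches the statement that a low-importance variable can fluctuate greatly without changing the result.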

The example of FIG. 1 shows that appropriate classification is sometimes impossible. In the example of FIG. 2 as well, explanatory variables of low importance have little influence on the result, so although they are unlikely to cause errors individually, they can still lower the accuracy of the determination device as a whole.

Patent Document 4 considers a two-dimensional plane as the space in which signals are acquired. The data used for machine learning and identification is so-called multidimensional information, which has a three-dimensional structure combining the in-plane distribution information with spectral information; the machine learning and identification procedure with this multidimensional information is essentially the same as with spectral data. The document further states that a plurality of feature quantities suitable for describing the data pattern can be acquired and used as a feature vector for machine learning and identification, and that typical feature quantities include volume, curvature, spatial gradient, and HLAC (higher-order local autocorrelation).

It also states that the feature quantities used for machine learning can be selected in advance, for example by calculating the Mahalanobis distance and selecting the feature quantities used for identification on that basis. Because identification becomes easier as the Mahalanobis distance grows, feature quantities that enlarge the Mahalanobis distance between the groups of interest can be selected preferentially.

In other words, Patent Document 4 describes choosing feature quantities that enlarge the Mahalanobis distance when selecting feature quantities that make machine learning and identification easier.

Furthermore, paragraph 0117 of Patent Document 5 states: "When a decision tree is used as the predetermined algorithm, the degree of relevance (which may also be called importance or contribution) of each feature quantity (explanatory variable) can be calculated by the random forest method or the like. Specifically, when a decision tree is created, nodes (branches) are selected from the feature quantities (explanatory variables) so that the impurity expressed by the Gini coefficient, cross entropy, or the like becomes small. The decrease in impurity obtained when a feature quantity (explanatory variable) is selected for a node can therefore be used as the importance of that feature quantity (explanatory variable). In step S102, this importance may be used to select, from among the plural types of feature quantities, those that are effective for predicting the occurrence of an abnormality." This is basically the importance-based selection described with reference to FIG. 2.
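The impurity-decrease notion of importance quoted above can be sketched as follows, assuming Gini impurity and a single binary split. The function names and the label lists are hypothetical illustrations, not taken from Patent Document 5.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two child nodes."""
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted


parent = [1, 1, 1, 2, 2, 2]
# A split that separates the classes perfectly removes all impurity.
separating = impurity_decrease(parent, [1, 1, 1], [2, 2, 2])
# A split that leaves both children mixed removes very little.
mixed = impurity_decrease(parent, [1, 2, 1], [2, 1, 2])
print(separating, mixed)
```

A variable whose splits resemble the second case accumulates a small total decrease across the trees and therefore ends up with low importance.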

Patent Document 1: Japanese Unexamined Patent Publication No. 2005-315753
Patent Document 2: Japanese Unexamined Patent Publication No. 2018-147406
Patent Document 3: Japanese Unexamined Patent Publication No. 2004-94437
Patent Document 4: Japanese Unexamined Patent Publication No. 2015-52581
Patent Document 5: Japanese Unexamined Patent Publication No. 2018-116545

The present invention provides a state change detection device and a state change detection program that can be used to verify, correct, or change the result of a prediction made in state change detection using a prediction model created from teacher data, that can reduce the degree to which misclassification occurs in the prediction model, and that can compensate for a decrease in accuracy.

The state change detection device according to the present embodiment comprises: prediction model creating means for creating a prediction model for state change detection based on teacher data; predetermined range importance explanatory variable search means for obtaining, in the prediction model, a predetermined range importance explanatory variable, which is an explanatory variable whose importance is lower than a preset threshold; teacher data geometric distance calculating means for obtaining the geometric distance of the teacher data using the predetermined range importance explanatory variable; analysis target data geometric distance calculating means for obtaining the geometric distance of analysis target data using the mean value of each explanatory variable and the inverse of the variance-covariance matrix that were used to calculate the geometric distance of the teacher data; and state change detecting means for detecting a state change of the analysis target data based on the distribution of the teacher data geometric distances and the distribution of the analysis target data geometric distances.

FIG. 1: Explanatory diagram showing the method used when state change detection is performed by a random forest.
FIG. 2: Diagram showing an example of a decision tree used when state change detection is performed by a random forest.
FIG. 3: Configuration diagram of a computer system that realizes the state change detection device according to the embodiment of the present invention.
FIG. 4: Functional block diagram of the state change detection device according to the embodiment of the present invention.
FIG. 5: Diagram showing an example of the teacher data used in the state change detection device according to the embodiment of the present invention.
FIG. 6: Diagram showing the Mahalanobis distances of the teacher data of FIG. 5 and their histogram.
FIG. 7: Diagram showing the Mahalanobis distances of the measurement target data and their histogram.
FIG. 8: Flowchart showing the operation of the state change detection device according to the embodiment of the present invention.
FIG. 9: Diagram showing an example of the importance of the explanatory variables obtained by the operation of the state change detection device according to the embodiment of the present invention, together with the importance ratios and their cumulative values.
FIG. 10: Explanatory diagram of the state change detection method used by the state change detection device according to the embodiment of the present invention, in which the distribution chart of the Mahalanobis distances of the teacher data and the distribution chart AC of the Mahalanobis distances of the analysis target data are stacked vertically with their horizontal axes aligned.

Embodiments of the state change detection device and the state change detection program according to the present invention are described below with reference to the accompanying drawings. In each figure, the same components are given the same reference numerals and duplicate description is omitted. The state change detection device according to the embodiment of the present invention can be configured by, for example, a personal computer, a workstation, or another computer system as shown in FIG. 3. In this computer system, the CPU 10 controls each part based on programs and data that are stored in, or have been read into, the main memory 11, and the system operates as a state change detection device by executing the necessary processing.

An external storage interface 13, an input interface 14, a display interface 15, and a data input interface 16 are connected to the CPU 10 via a bus 12. An external storage device 23, which stores programs such as the state change detection program together with the necessary data, is connected to the external storage interface 13. An input device 24 such as a keyboard for entering commands and data, and a mouse 22 serving as a pointing device, are connected to the input interface 14.

A display device 25 having a display screen such as an LED or LCD screen is connected to the display interface 15. Sensors 26-1, 26-2, ..., 26-m for obtaining measurement data are connected to the data input interface 16. The computer system may include other components; the configuration of FIG. 3 is only an example.

In the above configuration, the state change detection program in the external storage device 23 causes the CPU 10 to realize the means shown in FIG. 4, namely the prediction model creating means 31, the predetermined range importance explanatory variable search means 32, the teacher data geometric distance calculating means 33, the analysis target data geometric distance calculating means 34, and the state change detecting means 35. Teacher data T is also stored in the external storage device 23. The prediction model creating means 31 creates the prediction model 30 using the teacher data.

The prediction model 30 predicts an objective variable from explanatory variables by machine learning. A random forest, a pattern-mining technique, can be cited as the machine learning algorithm, but any other machine learning algorithm in which a classifier makes predictions by branching (for example, in a tree structure), such as a classification tree or a regression tree, may also be adopted.

The predetermined range importance explanatory variable search means 32 obtains, in the prediction model 30, a predetermined range importance explanatory variable, which is an explanatory variable whose importance is lower than a preset threshold. It obtains the importance for each acquisition source of the explanatory variables of the teacher data, and takes the explanatory variables of those acquisition sources whose importance is lower than the predetermined threshold as the predetermined range importance explanatory variables. Here, when the explanatory variables are acquired and collected with several sensors, each of those sensors is called an acquisition source. Likewise, when the explanatory variables are acquired and collected with several devices, each of those devices is called an acquisition source. In other words, "acquisition source" means the origin from which an objective variable or explanatory variable is obtained.

For example, as shown in FIG. 5, the teacher data consists of the data of the objective variable to be obtained by sensor A, the data of the first explanatory variable obtained by sensor B, the data of the second explanatory variable obtained by sensor C, the data of the third explanatory variable obtained by sensor D, the data of the fourth explanatory variable obtained by sensor E, and the data of the fifth explanatory variable obtained by sensor F. The number of data acquisitions (the number of rows in FIG. 5) is arbitrary. In the present embodiment, the acquisition sources of the explanatory variables whose importance is lower than the predetermined threshold are assumed to be sensor E and sensor F, so the predetermined range importance explanatory variables are the values listed in the sensor E and sensor F columns of FIG. 5.

The teacher data geometric distance calculating means 33 obtains the geometric distance of the teacher data using the predetermined range importance explanatory variables obtained above. The Mahalanobis distance can be used as the geometric distance of the teacher data.
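For reference, the Mahalanobis distance is conventionally defined as below. The symbols are notation introduced here for illustration and do not appear in the original text: x is the vector of predetermined range importance explanatory variables for one observation, μ is the vector of their mean values over the teacher data, and Σ is their variance-covariance matrix.

```latex
D_M(\mathbf{x}) = \sqrt{(\mathbf{x} - \boldsymbol{\mu})^{\top} \, \Sigma^{-1} \, (\mathbf{x} - \boldsymbol{\mu})}
```

The distance is zero at the mean and grows as an observation moves away from it, measured in units that account for the spread and correlation of the variables.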

The analysis target data geometric distance calculating means 34 obtains the geometric distance of the analysis target data using the mean value of each explanatory variable and the inverse of the variance-covariance matrix that were used to calculate the geometric distance of the teacher data. The Mahalanobis distance can be used as the geometric distance of the analysis target data.

Because the geometric distance of the analysis target data is calculated with the mean value of each explanatory variable and the inverse of the variance-covariance matrix that were used for the teacher data, the result is the geometric distance from the mean of the teacher data to the analysis target data.
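The reuse of the teacher data statistics can be sketched as follows. This is a minimal illustration restricted to two explanatory variables (as with sensors E and F in the embodiment); the function names, the sample values, and the choice of the sample covariance with divisor n - 1 are assumptions, not details given in the text.

```python
import math


def fit_reference(rows):
    """Mean vector and inverse 2x2 variance-covariance matrix of the teacher data."""
    n = len(rows)
    mean_e = sum(e for e, _ in rows) / n
    mean_f = sum(f for _, f in rows) / n
    var_e = sum((e - mean_e) ** 2 for e, _ in rows) / (n - 1)
    var_f = sum((f - mean_f) ** 2 for _, f in rows) / (n - 1)
    cov_ef = sum((e - mean_e) * (f - mean_f) for e, f in rows) / (n - 1)
    det = var_e * var_f - cov_ef ** 2
    if det == 0:
        raise ValueError("variance-covariance matrix is singular")
    inverse = ((var_f / det, -cov_ef / det), (-cov_ef / det, var_e / det))
    return (mean_e, mean_f), inverse


def mahalanobis(row, mean, inverse):
    """Mahalanobis distance of one observation from the given mean."""
    de, df = row[0] - mean[0], row[1] - mean[1]
    q = de * (inverse[0][0] * de + inverse[0][1] * df) \
        + df * (inverse[1][0] * de + inverse[1][1] * df)
    return math.sqrt(max(q, 0.0))


# Hypothetical sensor E / sensor F values.
teacher = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 5.0)]
analysis = [(3.0, 3.5), (50.0, -40.0)]

mean, inverse = fit_reference(teacher)  # computed once, from teacher data only
teacher_md = [mahalanobis(row, mean, inverse) for row in teacher]
analysis_md = [mahalanobis(row, mean, inverse) for row in analysis]  # same mean and inverse
print(teacher_md, analysis_md)
```

Since `mean` and `inverse` are never recomputed from the analysis target data, every value in `analysis_md` is a distance from the center of the teacher data, which is what makes the two distributions directly comparable.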

The state change detecting means 35 detects a state change of the analysis target data based on the distribution of the teacher data geometric distances and the distribution of the analysis target data geometric distances. Specifically, suppose that the Mahalanobis distances (MD) of the teacher data are obtained as shown in FIG. 6(A) and that their histogram is as shown in FIG. 6(B). Suppose also that the Mahalanobis distances (MD) of the analysis target data are obtained as shown in FIG. 7(A) and that their histogram is as shown in FIG. 7(B). The analysis target data is data collected, as the target of state change detection, by sensors B to F, which are the acquisition sources of the explanatory variables of the teacher data shown in FIG. 5. Although not illustrated, it has values of the kind shown in FIG. 5 (being actual measurements, they are of course not normally identical to those of FIG. 5).

In the distribution of the Mahalanobis distances (MD) of the teacher data, the maximum value is 4.00 and no larger value appears. It can therefore be concluded that analysis target data showing a Mahalanobis distance (MD) larger than the maximum Mahalanobis distance of the teacher data has undergone a state change. The further the Mahalanobis distance (MD) lies beyond the maximum of the teacher data, the greater the degree of state change. Accordingly, if the distribution range of the Mahalanobis distances of the teacher data represents the normal state, the degree of abnormality can be concluded to be large at positions far beyond the maximum Mahalanobis distance of the teacher data.

In other words, the state change detecting means 35 can obtain the distribution range of the teacher data geometric distances and detect a state change of the analysis target data based on the proportion of the analysis target data geometric distances that fall outside this range. The operation described above also shows that the state change detecting means 35 can obtain the distribution range of the teacher data geometric distances and detect a state change of the analysis target data based on how far the distribution of the analysis target data geometric distances lies from this range.
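Both criteria, the proportion of distances beyond the teacher range and how far beyond it they lie, can be sketched as follows. The function name, the returned fields, and the distance values are hypothetical; only the teacher maximum of 4.0 echoes the figure quoted in the text.

```python
def detect_state_change(teacher_md, analysis_md):
    """Compare analysis target distances with the range observed in the teacher data."""
    upper = max(teacher_md)  # upper end of the teacher distribution range
    beyond = [d for d in analysis_md if d > upper]
    return {
        "upper": upper,
        # Proportion of analysis target distances that leave the teacher range.
        "ratio_beyond": len(beyond) / len(analysis_md),
        # How far the most distant observation lies beyond the range.
        "largest_excess": max((d - upper for d in beyond), default=0.0),
        "state_change": bool(beyond),
    }


teacher_md = [0.5, 1.2, 2.8, 4.0]   # hypothetical values with a maximum of 4.0
analysis_md = [0.7, 3.9, 5.5, 9.0]  # hypothetical values
result = detect_state_change(teacher_md, analysis_md)
print(result)
```

How large the ratio or the excess must be before a state change is reported is left open here; a practical device would need an application-specific rule for that decision.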

The state change detection device configured from the above means executes its processing according to the flowchart shown in FIG. 8, so its operation is described below with reference to that flowchart.

First, the prediction model 30 is created using the teacher data (STEP 1). The teacher data is the data already illustrated and consists of the data of the objective variable to be obtained by sensor A and the data of the explanatory variables obtained by sensors B to F. Thus, the prediction model creating means 31 creates the prediction model 30 from the teacher data T of FIG. 5. The prediction model 30 created here is assumed to make predictions with a random forest, but a method other than a random forest may be used as long as it makes machine-learning predictions in which a classifier branches (for example, in a tree structure).

Next, the importance of each explanatory variable in the prediction model 30 created above is obtained (STEP 2). Because the explanatory variables come from sensors B through F, an importance value is obtained for each of these acquisition sources. The importance can be obtained by a known method, such as the method used when building a random forest prediction model to evaluate importance and construct appropriate decision trees.

FIG. 9(A) shows the importance value obtained for each acquisition source in STEP 2. In the present embodiment, following STEP 2, the predetermined range importance explanatory variables, that is, the explanatory variables whose importance falls within a preset range, are determined in the prediction model (STEP 3). To do this, the share of each acquisition source is computed from its importance value, and the shares are accumulated starting from the lowest importance; FIG. 9(B) shows the resulting cumulative values. The present embodiment adopts a cumulative value of "1.0" as the threshold and selects the explanatory variables of the acquisition sources whose cumulative value is below "1.0". The cumulative value "1.0" is only an example of a threshold and is to be changed as appropriate for the system and the situation in which it is used. In the present embodiment, FIG. 9(B) shows that the explanatory variables obtained from sensor E and sensor F are the ones selected.
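The cumulative selection of STEP 3 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `select_low_importance`, the importance values, and the reading of the threshold "1.0" as a percentage are all assumptions, since the values of FIG. 9 are not reproduced in the text.

```python
def select_low_importance(importances, threshold_percent=1.0):
    """Return the sources whose cumulative share, summed from the
    least important source upward, stays below the threshold.

    Assumption: shares are expressed in percent of the total importance.
    """
    total = sum(importances.values())
    selected = []
    cumulative = 0.0
    # Walk the sources from lowest to highest importance.
    for name, value in sorted(importances.items(), key=lambda kv: kv[1]):
        cumulative += 100.0 * value / total
        if cumulative >= threshold_percent:
            break
        selected.append(name)
    return selected


# Hypothetical importance values for sensors B to F (not the values of FIG. 9).
importances = {"B": 0.55, "C": 0.30, "D": 0.142, "E": 0.005, "F": 0.003}
low = select_low_importance(importances)
print(low)
```

With these made-up values, only the two least important sources fall below the threshold, which mirrors the outcome described for sensors E and F.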

After STEP 3, the Mahalanobis distance, which is the geometric distance of the teacher data, is calculated using the predetermined range importance explanatory variables determined above (STEP 4). The Mahalanobis distance is calculated over the explanatory variables in the teacher data, excluding the objective variable. The Mahalanobis distances obtained for the teacher data are shown as MD in FIG. 6(A), and plotting them as a frequency distribution gives FIG. 6(B).
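For reference, the usual textbook definition of the Mahalanobis distance is written out here. The symbols are notation introduced for this note and do not appear in the patent: x is the vector of selected explanatory variables for one record, μ is the mean vector of the teacher data, and Σ is the variance-covariance matrix of the teacher data. Some formulations of the MT method additionally divide the squared distance by the number of variables; the text does not state which convention is used.

```latex
D_M(\mathbf{x}) = \sqrt{(\mathbf{x}-\boldsymbol{\mu})^{\top}\,\Sigma^{-1}\,(\mathbf{x}-\boldsymbol{\mu})}
```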

Following STEP 4, the Mahalanobis distance, which is the geometric distance of the analysis target data, is calculated (STEP 5). Here, the Mahalanobis distance of the analysis target data is calculated using the mean of each explanatory variable and the inverse of the variance-covariance matrix that were used to calculate the geometric distance of the teacher data. As a result, the geometric distance from the mean of the teacher data to the analysis target data is obtained.
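STEP 4 and STEP 5 can be sketched together as below. This is an illustrative sketch in plain Python under stated assumptions: the helper names (`mean_and_cov`, `invert`, `mahalanobis`), the sample readings, and the use of the sample (n - 1) covariance are not taken from the patent. The point it shows is that the mean and inverse covariance are computed once from the teacher data and then reused unchanged for the analysis target data.

```python
import math


def mean_and_cov(rows):
    """Column means and sample variance-covariance matrix of equal-length rows."""
    n, k = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(k)]
    cov = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
            for j in range(k)] for i in range(k)]
    return mean, cov


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def mahalanobis(x, mean, inv_cov):
    """Mahalanobis distance of one record x from the given mean."""
    d = [a - m for a, m in zip(x, mean)]
    k = len(d)
    q = sum(d[i] * inv_cov[i][j] * d[j] for i in range(k) for j in range(k))
    return math.sqrt(max(q, 0.0))


# Hypothetical readings of the two selected sensors (E, F).
teacher = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]]
analysis = [[3.0, 3.0], [9.0, 1.0]]

# STEP 4: statistics come from the teacher data only.
mean, cov = mean_and_cov(teacher)
inv_cov = invert(cov)
teacher_md = [mahalanobis(x, mean, inv_cov) for x in teacher]

# STEP 5: the same mean and inverse covariance are reused for the analysis data.
analysis_md = [mahalanobis(x, mean, inv_cov) for x in analysis]
```

In practice a numerical library would normally replace the hand-written inversion; it is written out here only to keep the sketch free of dependencies.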

As noted above, the analysis target data is not illustrated, but it is assumed to have values similar to those shown in FIG. 5. The Mahalanobis distances for the analysis target data are shown as MD in FIG. 7(A), and plotting them as a frequency distribution gives FIG. 7(B). The Mahalanobis distance and its distribution for the analysis target data are obtained for each acquisition source.

Following STEP 5, a state change of the analysis target data is detected based on the distribution of the Mahalanobis distance, which is the teacher data geometric distance, and the distribution of the Mahalanobis distance, which is the analysis target data geometric distance (STEP 6). The distribution for the teacher data is shown in FIG. 6(B), and the distribution for the analysis target data is shown in FIG. 7(B).

When the Mahalanobis distance distribution TC of the teacher data shown in FIG. 6(B) and the Mahalanobis distance distribution AC of the analysis target data shown in FIG. 7(B) are stacked vertically with their horizontal axes aligned, the result is FIG. 10. As the vertical line L drawn through the maximum Mahalanobis distance T-MAX of the teacher data makes clear, some Mahalanobis distances of the analysis target data also lie in the region of values larger than T-MAX.

If the Mahalanobis distance distribution TC of the teacher data is taken as the range of no state change (or normal), then the part of the analysis target data distribution that lies beyond the vertical line L (to the right of L in the figure) represents a state change (or abnormality).

A state change (or abnormality) can be determined, for example, by whether the difference between the maximum Mahalanobis distance T-MAX of the teacher data and the maximum Mahalanobis distance A-MAX of the analysis target data is equal to or greater than a predetermined threshold. A state change (or the occurrence of an abnormality) can also be determined by whether the proportion of the region beyond the vertical line L (to the right of L in the figure) relative to the whole region, which may be called the abnormality rate, exceeds a predetermined threshold. In this way, the state change detection means 35 obtains the distribution range of the teacher data geometric distance and detects a state change of the analysis data based on the proportion of the analysis target data geometric distance distribution that lies beyond this range.
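The two criteria in this paragraph can be sketched as follows. The function names, the sample distances, and the threshold values are hypothetical. Two interpretive assumptions are made: the difference is taken as A-MAX minus T-MAX, and the "proportion of the region" is approximated by the fraction of analysis samples whose distance exceeds T-MAX.

```python
def max_gap_exceeds(teacher_md, analysis_md, gap_threshold):
    """True when A-MAX minus T-MAX is at or above the threshold."""
    return max(analysis_md) - max(teacher_md) >= gap_threshold


def abnormality_rate(teacher_md, analysis_md):
    """Fraction of analysis distances lying beyond T-MAX (right of line L)."""
    t_max = max(teacher_md)
    return sum(1 for d in analysis_md if d > t_max) / len(analysis_md)


# Hypothetical Mahalanobis distances and thresholds.
teacher_md = [0.4, 0.9, 1.2, 1.5, 2.0]
analysis_md = [0.5, 1.1, 2.4, 3.0, 1.9, 2.6, 0.8, 1.0]

rate = abnormality_rate(teacher_md, analysis_md)
changed = rate > 0.25 or max_gap_exceeds(teacher_md, analysis_md, 0.5)
print(rate, changed)
```

Either criterion can be used alone; combining them with `or` here is only one possible design choice.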

A state change (or the occurrence of an abnormality) can also be determined by whether the difference between the mean of the teacher data Mahalanobis distance distribution TC and the mean of the analysis target data Mahalanobis distance distribution AC, or the ratio of these two means, exceeds a predetermined threshold. Because these determinations are made for each acquisition source, statistical processing such as a majority vote over the per-source results can be applied to decide whether the analysis target data shows a state change (or an abnormality).
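A minimal sketch of the mean-based criterion and the majority vote is given below. The function names, the per-source distances, and the ratio threshold of 1.5 are hypothetical; three sources are used only so that the vote cannot tie.

```python
from statistics import mean


def mean_shift_detected(teacher_md, analysis_md,
                        diff_threshold=None, ratio_threshold=None):
    """True when the mean difference or the mean ratio exceeds its threshold."""
    t_mean, a_mean = mean(teacher_md), mean(analysis_md)
    if diff_threshold is not None and a_mean - t_mean > diff_threshold:
        return True
    if ratio_threshold is not None and a_mean / t_mean > ratio_threshold:
        return True
    return False


def majority_vote(flags):
    """True when more than half of the per-source judgments report a change."""
    flags = list(flags)
    return sum(flags) * 2 > len(flags)


# Hypothetical (teacher, analysis) distances per acquisition source.
per_source = {
    "D": ([0.5, 0.7, 0.6], [1.2, 1.4, 1.0]),
    "E": ([1.0, 1.2, 0.8], [2.0, 2.4, 2.2]),
    "F": ([1.0, 1.0, 1.0], [1.1, 0.9, 1.0]),
}
flags = {name: mean_shift_detected(t, a, ratio_threshold=1.5)
         for name, (t, a) in per_source.items()}
overall = majority_vote(flags.values())
print(flags, overall)
```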

The description above makes a binary determination between a state change (or abnormality) and no state change (or normal). However, several thresholds may be set so that the possibility of a state change (or abnormality) is judged in multiple levels, such as three levels (high, medium, low), four levels (high, upper-medium, lower-medium, low), or even more.
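Multi-level grading can be sketched with a sorted list of thresholds. The function name `grade`, the threshold values, and the labels are hypothetical, and the graded quantity is assumed here to be an abnormality rate between 0 and 1.

```python
from bisect import bisect_right


def grade(rate, thresholds=(0.1, 0.3), labels=("low", "medium", "high")):
    """Map a rate to a label; thresholds must be sorted in ascending order."""
    if len(labels) != len(thresholds) + 1:
        raise ValueError("need exactly one more label than thresholds")
    # bisect_right counts how many thresholds are less than or equal to the rate.
    return labels[bisect_right(thresholds, rate)]


three_level = grade(0.375)
four_level = grade(0.25,
                   thresholds=(0.1, 0.2, 0.4),
                   labels=("low", "lower-medium", "upper-medium", "high"))
print(three_level, four_level)
```

Adding a level only requires one more threshold and one more label, so the same function covers the three-level and four-level cases.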

In the above embodiment, the predetermined range importance explanatory variable search means 32 selects explanatory variables whose importance is lower than a preset threshold. The embodiment can therefore be expected to reach conclusions that differ from those of a state change detection device, such as one based on a random forest, that does not use low-importance explanatory variables. The results of this embodiment can be used, for example, to correct results obtained by a random forest or the like, which improves the accuracy of state change detection. For example, the decision threshold of this embodiment may be set high so that a state change (or abnormality) is unlikely to be reported; if this embodiment nevertheless reports a state change (or abnormality), its judgment can be given priority. Examples of random forest based state change detection devices whose results may be corrected in this way include the devices described in Japanese Patent Application Nos. 2018-40531, 2019-55609, and 2019-55615, filed by the present applicant.

In the above embodiment, the predetermined range importance explanatory variable search means 32 selects explanatory variables whose importance is lower than a preset threshold, but the device can also be used to confirm the results of a random forest or the like. In that case, the predetermined range importance explanatory variable search means 32 may select the explanatory variables that remain after excluding those whose importance is lower than the preset threshold, and the geometric distance may be calculated from these. Alternatively, the predetermined range importance explanatory variable search means 32 may select explanatory variables whose importance lies between a preset first threshold and a second threshold larger than the first threshold (that is, explanatory variables of intermediate importance), and the geometric distance may be calculated from these.
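The three selection variants (low importance, everything except low importance, intermediate importance) can be expressed with one band filter, as sketched below. The function name `select_by_band`, the importance values, and the thresholds are hypothetical. For simplicity the thresholds here apply to the raw importance values, not to the cumulative share used in STEP 3.

```python
def select_by_band(importances, lower=None, upper=None):
    """Return sources with lower <= importance < upper; None means unbounded."""
    return [name for name, value in importances.items()
            if (lower is None or value >= lower)
            and (upper is None or value < upper)]


# Hypothetical importance values for sensors B to F.
importances = {"B": 0.55, "C": 0.30, "D": 0.142, "E": 0.005, "F": 0.003}

low_only = select_by_band(importances, upper=0.01)             # embodiment as described
without_low = select_by_band(importances, lower=0.01)          # confirmation variant
middle = select_by_band(importances, lower=0.01, upper=0.5)    # intermediate variant
print(low_only, without_low, middle)
```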

In the above embodiment, the well-known technique called the MT method, which calculates the Mahalanobis distance, is used to obtain the geometric distance. Recently, however, the MTA method, the T(3) method (RT method), and others have been proposed to solve problems of the MT method. These proposed methods may also be adopted in the present embodiment.

10 CPU
11 Main memory
12 Bus
13 External storage interface
14 Input interface
15 Display interface
16 Data input interface
22 Mouse
23 External storage device
24 Input device
25 Display device
26-1 to 26-m Sensors
30 Prediction model
31 Prediction model creation means
32 Predetermined range importance explanatory variable search means
33 Teacher data geometric distance calculation means
34 Analysis target data geometric distance calculation means
35 State change detection means

Claims

A state change detection device comprising:
a prediction model creation means for creating, based on teacher data, a prediction model that predicts an objective variable from explanatory variables;
a predetermined range importance explanatory variable search means for determining, in the prediction model, a predetermined range importance explanatory variable whose importance falls within a preset range;
a teacher data geometric distance calculation means for obtaining the geometric distance of the teacher data using the predetermined range importance explanatory variable;
an analysis target data geometric distance calculation means for obtaining the geometric distance of analysis target data using the mean of each explanatory variable and the inverse of the variance-covariance matrix that were used to calculate the geometric distance of the teacher data; and
a state change detection means for detecting a state change of the analysis data based on the distribution of the teacher data geometric distance and the distribution of the analysis target data geometric distance.

The state change detection device according to claim 1, wherein the state change detection means obtains the distribution range of the teacher data geometric distance and detects the state change of the analysis data based on the proportion of the distribution of the analysis target data geometric distance that lies beyond this distribution range.

The state change detection device according to claim 1 or 2, wherein the state change detection means obtains the distribution range of the teacher data geometric distance and detects the state change of the analysis data based on the degree to which the distribution of the analysis target data geometric distance departs from this distribution range.

The state change detection device according to any one of claims 1 to 3, wherein the prediction model created by the prediction model creation means performs machine-learning prediction by branching with a classifier.

The state change detection device according to any one of claims 1 to 4, wherein the prediction model created by the prediction model creation means predicts by a random forest.

The state change detection device according to any one of claims 1 to 5, wherein the teacher data geometric distance calculation means obtains the Mahalanobis distance of the teacher data as the geometric distance of the teacher data, and
the analysis target data geometric distance calculation means obtains the Mahalanobis distance of the analysis target data as the geometric distance of the analysis target data.

The state change detection device according to any one of claims 1 to 6, wherein the predetermined range importance explanatory variable search means obtains an explanatory variable having an importance lower than a preset threshold value.

A state change detection program that causes a computer to function as:
a prediction model creation means for creating, based on teacher data, a prediction model that predicts an objective variable from explanatory variables;
a predetermined range importance explanatory variable search means for determining, in the prediction model, a predetermined range importance explanatory variable, which is an explanatory variable whose importance is lower than a preset threshold;
a teacher data geometric distance calculation means for obtaining the geometric distance of the teacher data using the predetermined range importance explanatory variable;
an analysis target data geometric distance calculation means for obtaining the geometric distance of analysis target data using the mean of each explanatory variable and the inverse of the variance-covariance matrix that were used to calculate the geometric distance of the teacher data; and
a state change detection means for detecting a state change of the analysis data based on the distribution of the teacher data geometric distance and the distribution of the analysis target data geometric distance.

The state change detection program according to claim 8, wherein the program causes the computer, as the state change detection means, to obtain the distribution range of the teacher data geometric distance and to detect the state change of the analysis data based on the proportion of the distribution of the analysis target data geometric distance that lies beyond this distribution range.

The state change detection program according to claim 8 or 9, wherein the program causes the computer, as the state change detection means, to obtain the distribution range of the teacher data geometric distance and to detect the state change of the analysis data based on the degree to which the distribution of the analysis target data geometric distance departs from this distribution range.

The state change detection program according to any one of claims 8 to 10, wherein the prediction model created by the computer functioning as the prediction model creation means performs machine-learning prediction by branching with a classifier.

The state change detection program according to any one of claims 8 to 11, wherein the prediction model created by the computer as the prediction model creation means predicts by a random forest.

The state change detection program according to any one of claims 8 to 12, wherein the program causes the computer, as the teacher data geometric distance calculation means, to obtain the Mahalanobis distance of the teacher data as the geometric distance of the teacher data, and
causes the computer, as the analysis target data geometric distance calculation means, to obtain the Mahalanobis distance of the analysis target data as the geometric distance of the analysis target data.

The state change detection program according to any one of claims 8 to 13, wherein the program causes the computer, as the predetermined range importance explanatory variable search means, to determine in the prediction model an explanatory variable whose importance is lower than a preset threshold.