JP7086497B2

JP7086497B2 - Abnormality / error effect explanation variable detection device and abnormality / error effect explanation variable detection program

Info

Publication number: JP7086497B2
Application number: JP2020164127A
Authority: JP
Inventors: 俊和鈴木; 真一牧野
Original assignee: 東芝デジタルエンジニアリング株式会社
Priority date: 2020-09-29
Filing date: 2020-09-29
Publication date: 2022-06-20
Anticipated expiration: 2040-09-29
Also published as: JP2022056227A

Description

The present invention relates to an abnormality/error influence explanatory variable detection device and an abnormality/error influence explanatory variable detection program.

The present inventors have filed applications for inventions of a device that creates a prediction model from teacher data consisting of objective variables and explanatory variables, applies to this model the explanatory variables measured on the target equipment in which abnormalities are actually to be caught, and detects an abnormality of the target equipment from the resulting predicted objective variable and the actually measured objective variable (Japanese Unexamined Patent Publication No. 2019-159365, and Japanese Patent Application Nos. 2019-055609, 2019-055615, 2019-160751, and 2019-080556). Those inventions can detect (or estimate) that the target device is abnormal, but they do not consider which explanatory variable strongly influences the abnormality.

FIG. 1 is a plan view of three types of flowers that provide the objective variable and explanatory variables of a prediction model. As an example of distinguishing these three types, a prediction model can be created with the lengths and widths of the petals and sepals of a flower as the explanatory variables (petal length, petal width, sepal length, sepal width) and the three flower types A, B, and C as the objective variable.

FIG. 2 shows a prediction model that takes the lengths and widths of the petals and sepals as four measured values (explanatory variables) and yields one of the three flower types A, B, and C as the objective variable. When the measured values (explanatory variables) are K1, K2, K2, and K3, the prediction model predicts flower type A (objective variable) (FIG. 2(a)). When the measured values are K2, K1, K3, and K1, the prediction model predicts flower type B (FIG. 2(b)). When the measured values are K3, K2, K1, and K1, the prediction model predicts flower type C (FIG. 2(c)).

FIG. 3 shows the formulas of the prediction model. Denoting the prediction model described above by f0, the prediction of flower type A (objective variable) from the measured values (explanatory variables) K1, K2, K2, and K3 can be written as A = f0(K1, K2, K2, K3), as in FIG. 3(a). Likewise, the prediction of flower type B from the measured values K2, K1, K3, and K1 can be written as B = f0(K2, K1, K3, K1), as in FIG. 3(b), and the prediction of flower type C from the measured values K3, K2, K1, and K1 can be written as C = f0(K3, K2, K1, K1), as in FIG. 3(c).

FIG. 4 illustrates that a conventional prediction model can detect that an abnormality exists but cannot tell which explanatory variable is responsible. Suppose that, as shown in FIG. 4(a), the measured values (explanatory variables) K1, K2, K1, and K3 and the measured value (objective variable) A have been obtained, and that the prediction model f0 yields B = f0(K1, K2, K1, K3), as shown in FIG. 4(b), so that the measurement target is judged to be abnormal. The occurrence of an abnormality can thus be detected, but it is not possible to identify which one or more explanatory variables were abnormal and caused that judgment.
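The situation of FIG. 4 can be sketched in a few lines of Python. This is only an illustration: `toy_f0` is a hypothetical lookup table standing in for the trained model f0, and `is_abnormal` is a name introduced here. The table entries reuse the symbolic values K1 to K3 from FIGS. 2 and 4.

```python
from typing import Callable, Sequence

# Hypothetical stand-in for the prediction model f0: a lookup table that
# reproduces the symbolic examples of FIG. 2 and FIG. 4.
_TABLE = {
    ("K1", "K2", "K2", "K3"): "A",
    ("K2", "K1", "K3", "K1"): "B",
    ("K3", "K2", "K1", "K1"): "C",
    ("K1", "K2", "K1", "K3"): "B",  # the FIG. 4 case
}

def toy_f0(values: Sequence[str]) -> str:
    """Return the predicted flower type for four measured values."""
    return _TABLE[tuple(values)]

def is_abnormal(f0: Callable[[Sequence[str]], str],
                explanatory: Sequence[str],
                measured_target: str) -> bool:
    """Flag an abnormality when the prediction and the measured target differ."""
    return f0(explanatory) != measured_target

# FIG. 4: the measured target is A, but the model predicts B.
print(is_abnormal(toy_f0, ("K1", "K2", "K1", "K3"), "A"))
```

The check returns a single yes/no answer and carries no information about which of the four inputs caused the mismatch, which is the limitation described above.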

In response to the above, research has recently been conducted on using a quantity called the Shapley value to interpret machine learning models. The Shapley value computes, for example, from the rewards obtained when three people Aa, Bb, and Cc work, the share of the reward that corresponds to the contribution of each of them. The given conditions are the reward of each person when Aa, Bb, or Cc works alone, the reward of each pair when any two of them work together, and the reward when all three work together. From these, the reward added as Aa, Bb, and Cc join the work one after another is computed for every joining order, and finally the distributed reward corresponding to the contribution of each of Aa, Bb, and Cc is calculated.
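The reward-sharing calculation described above can be sketched as follows. The reward figures in `REWARD` are made up for illustration and do not come from the source; `shapley_values` is a name introduced here. The function averages each person's marginal contribution over every joining order.

```python
from fractions import Fraction
from itertools import permutations

def shapley_values(players, reward):
    """Average each player's marginal reward over all joining orders.

    `reward` maps a frozenset of players to the reward that coalition earns.
    """
    totals = {p: Fraction(0) for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            joined = coalition | {p}
            # Reward added when p joins the people already working.
            totals[p] += Fraction(reward[joined]) - Fraction(reward[coalition])
            coalition = joined
    return {p: totals[p] / len(orders) for p in players}

# Hypothetical rewards: alone, in pairs, and all three together.
REWARD = {
    frozenset(): 0,
    frozenset({"Aa"}): 6,
    frozenset({"Bb"}): 4,
    frozenset({"Cc"}): 2,
    frozenset({"Aa", "Bb"}): 20,
    frozenset({"Aa", "Cc"}): 15,
    frozenset({"Bb", "Cc"}): 10,
    frozenset({"Aa", "Bb", "Cc"}): 24,
}

shares = shapley_values(("Aa", "Bb", "Cc"), REWARD)
for name, share in shares.items():
    print(name, float(share))
```

The shares always add up to the reward of the full three-person team, so the calculation distributes the whole reward and nothing more.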

In applications to machine learning models, the Shapley value is used, for example, to obtain the contribution of the features X = (X1, X2, X3) to a predicted value. Let the model be f(·) and the average predicted value be E[f(X)]. Suppose that one instance takes the feature values (x1, x2, x3) = x, and let the predicted value for it be f(x). The aim is to determine how much each feature contributes to the gap between the average predicted value E[f(X)] and the instance's predicted value f(x).

The predicted value f(x) of each instance is

\[ E[f(X) \mid X_1 = x_1, X_2 = x_2, X_3 = x_3] = f(x_1, x_2, x_3) = f(x) \]

Therefore, by conditioning on X1, X2, and X3 one at a time, starting from the average predicted value E[f(X)], one determines how knowing each feature value affects the prediction for the instance. Here, Φj (j = 1, 2, 3, ...) denotes the marginal effect of each feature on the predicted value, and Φ0 denotes the marginal effect corresponding to the gap between 0 and the average predicted value E[f(X)].

Starting from the state Φ0, obtaining the information X1 = x1 increases the predicted value by Φ1, and further obtaining X2 = x2 increases it by Φ2. Finally, obtaining X3 = x3 decreases the predicted value by Φ3, which gives the final predicted value of the instance. The conditioning above is done in the order X1, X2, X3, but, as in the reward example, the conditioning is performed in every possible order, and the marginal effects that each feature has on the predicted value in the individual orders are averaged. This average is the Shapley value.
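The order-averaging just described can be written compactly for the three-feature case. This is a sketch in standard Shapley-value notation rather than a formula taken from the source: the symbols v, π, and P_j^π are introduced here, and each Φj is treated as a signed quantity, so a feature that lowers the prediction has a negative Φj.

```latex
% v(S): expected prediction when only the features in S are known
v(S) = E\bigl[f(X) \mid X_S = x_S\bigr], \qquad
v(\emptyset) = E[f(X)] = \Phi_0, \qquad
v(\{1,2,3\}) = f(x)

% \pi ranges over the 3! orderings of the features;
% P_j^{\pi} is the set of features that precede j in \pi
\Phi_j = \frac{1}{3!} \sum_{\pi}
  \Bigl( v\bigl(P_j^{\pi} \cup \{j\}\bigr) - v\bigl(P_j^{\pi}\bigr) \Bigr),
  \qquad j = 1, 2, 3

% Within each ordering the marginal effects telescope, hence
f(x) = \Phi_0 + \Phi_1 + \Phi_2 + \Phi_3
```

The last line states that the marginal effects account for the entire gap between the average prediction E[f(X)] and the instance's prediction f(x).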

FIG. 5 shows a prediction model that predicts the flower type "Setosa", as predicted value "1", from four measured values. The four measured values are the lengths and widths of the petals and sepals of a flower, used as explanatory variables (petal length (= x3), petal width (= x4), sepal length (= x5), sepal width (= x6)), and a prediction model can be created with "1" as the objective variable. FIG. 6 is a bar graph of the Shapley values of x3 to x6 for "Setosa". FIG. 6 indicates that the predicted value (Actual prediction) for "Setosa" is 1 and that the average predicted value (Average prediction) is 2. Because the model in this example predicts three flower types, the average predicted value is the mean of the three predicted values.

FIG. 7 shows a prediction model that predicts the flower type "Versicolor", as predicted value "2", from four measured values. The four measured values (explanatory variables) are the lengths and widths of the petals and sepals of a flower (petal length (= x3), petal width (= x4), sepal length (= x5), sepal width (= x6)), and a prediction model can be created with "2" as the objective variable. FIG. 8 is a bar graph of the Shapley values of x3 to x6 for "Versicolor". FIG. 8 indicates that the predicted value (Actual prediction) for "Versicolor" is 2 and that the average predicted value (Average prediction) is 2.

FIG. 9 shows a prediction model that predicts the flower type "Virginica", as predicted value "3", from four measured values. The four measured values (explanatory variables) are the lengths and widths of the petals and sepals of a flower (petal length (= x3), petal width (= x4), sepal length (= x5), sepal width (= x6)), and a prediction model can be created with "3" as the objective variable. FIG. 10 is a bar graph of the Shapley values of x3 to x6 for "Virginica". FIG. 10 indicates that the predicted value (Actual prediction) for "Virginica" is 3 and that the average predicted value (Average prediction) is 2.

As is clear from FIGS. 6, 8, and 10, although the explanatory variable x6 takes a distinctively large value for the flower type "Versicolor", the Shapley values as a whole do not single out any one explanatory variable as having a large influence. The Shapley value in its current form therefore could not be used to detect which explanatory variable has a large influence.

Patent Document 1 describes adding, to the explanatory variables of data, an additional character string that identifies the category of the data item, and performing data cleansing in which a data cleansing/characterizing means 32 replaces abnormal data values with specific values or deletes them. Here, the abnormality criterion consists of a definition of the abnormal value and its replacement value, and abnormal values are processed according to these settings. It does not identify which of the several explanatory variables used for prediction is exerting an influence when the objective variable becomes abnormal.

Patent Document 2 discloses a technique that uses a trained abnormality detection data model to detect abnormalities step by step by processing a calculated deviation data signal and a process type index indicating the type of process step, calculates an abnormality probability p for each time step t or path length step l of the process step, and classifies workpieces and production process steps as abnormal or normal based on this probability p.

The technique of cited document 2 likewise cannot determine which of the several explanatory variables used for prediction is exerting an influence when the objective variable becomes abnormal.

Japanese Unexamined Patent Publication No. 2004-29971
Japanese Unexamined Patent Publication No. 2019-135638

The present invention has been made to solve the above problems in the field of abnormality detection by machine learning. Its object is to provide an influence explanatory variable detection device capable of determining which of the several explanatory variables used for prediction is exerting an influence when the objective variable becomes abnormal or erroneous.

The influence explanatory variable detection device of the present embodiment comprises:
a prediction model created by machine learning, so as to obtain an objective variable from explanatory variables, using teacher data consisting of a plurality of sets, each set consisting of teacher measurement data of a plurality of items, which are the explanatory variables, and one piece of teacher identification data, which identifies the teacher measurement data of the plurality of items and is the objective variable;
error calculation means for performing, on all of the teacher data, a process of giving the explanatory variables of the teacher data to the prediction model to obtain an objective variable and obtaining the error between it and the objective variable of the teacher data corresponding to the given explanatory variables;
extraction means for obtaining the distribution of the errors and extracting the explanatory variables of the teacher data whose errors lie within a predetermined range of the error distribution;
central value calculation means for obtaining a central value of the teacher measurement data of the plurality of items of the extracted explanatory variables;
reference value calculation means for calculating a reference Shapley value, which is a Shapley value, based on the central value;
analysis target value calculation means for calculating an analysis target Shapley value, which is a Shapley value, based on analysis measurement data of the plurality of items, which are newly measured by the same measurement process as the teacher measurement data and are to be analyzed for whether they influence an error or abnormality; and
influence explanatory variable detection means for obtaining, for each identical item, a comparison value between the reference Shapley value and the analysis target Shapley value and detecting, based on the magnitude of the comparison value, which item's analysis measurement data, as an explanatory variable, influences the error or the abnormality.
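The final detection step can be illustrated with a short sketch. The passage above does not say how the comparison value is formed, so the absolute difference used here is an assumption, as are the function name `detect_influential_item` and all numeric Shapley values, which are invented for the example.

```python
def detect_influential_item(reference, analysis):
    """Compare per-item Shapley values and return the item that deviates most.

    ASSUMPTION: the comparison value is the absolute difference between the
    analysis target Shapley value and the reference Shapley value.
    """
    comparison = {
        item: abs(analysis[item] - reference[item]) for item in reference
    }
    most_influential = max(comparison, key=comparison.get)
    return most_influential, comparison

# Hypothetical reference Shapley values (from the central values of the
# extracted teacher data) and analysis target Shapley values (new measurement).
reference = {
    "sepal_length": 0.10, "sepal_width": -0.05,
    "petal_length": 0.40, "petal_width": 0.30,
}
analysis = {
    "sepal_length": 0.12, "sepal_width": -0.04,
    "petal_length": -0.35, "petal_width": 0.28,
}

item, comparison = detect_influential_item(reference, analysis)
print(item, comparison)
```

Comparing against a reference computed from well-predicted teacher data, rather than reading the raw Shapley values directly, is what lets a single item stand out.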

Plan view of three types of flowers that provide the objective variable and explanatory variables of a prediction model.
Diagram showing measured petal and sepal lengths and widths (explanatory variables) and the prediction model.
Diagram showing the formulas of the prediction model.
Diagram showing prediction results of a prediction model operating according to the formulas of FIG. 3.
Diagram showing a prediction model that predicts the flower type "Setosa", as predicted value "1", from four measured values.
Bar graph of the Shapley values of x3 to x6 for "Setosa".
Diagram showing a prediction model that predicts the flower type "Versicolor", as predicted value "2", from four measured values.
Bar graph of the Shapley values of x3 to x6 for "Versicolor".
Diagram showing a prediction model that predicts the flower type "Virginica", as predicted value "3", from four measured values.
Bar graph of the Shapley values of x3 to x6 for "Virginica".
Configuration diagram of a computer system that realizes the influence explanatory variable detection device 100 according to an embodiment of the present invention.
Functional block diagram of the influence explanatory variable detection device 100 according to the first embodiment of the present invention.
Diagram showing the contents of the teacher data T.
Distribution chart of the errors obtained by the influence explanatory variable detection device 100 according to the embodiment of the present invention.
Diagram explaining the step processing mode.
Flowchart showing the operation procedure of the step processing mode of the abnormality/error influence explanatory variable detection device 100 according to the present embodiment.
Diagram showing, with a focus on how the data contents change, the processing from the teacher data T until the errors are obtained in the step processing mode of the abnormality/error influence explanatory variable detection device 100 according to the present embodiment.
Diagram showing the processing, in the step processing mode of the abnormality/error influence explanatory variable detection device 100 according to the present embodiment, of obtaining the mean and standard deviation, extracting for each step the explanatory variables of the teacher data whose errors lie within the predetermined range of the error distribution, and obtaining the median.
Diagram showing the result of the extraction processing in the step processing mode of the abnormality/error influence explanatory variable detection device 100 according to the present embodiment.
Diagram showing the process of calculating, for each step, the reference Shapley value based on the median of each step in the step processing mode of the abnormality/error influence explanatory variable detection device 100 according to the present embodiment.
Diagram showing the analysis target Shapley values calculated from the analysis measurement data and the comparison values obtained by the influence explanatory variable detection means 37 in the step processing mode of the abnormality/error influence explanatory variable detection device 100 according to the present embodiment.
Diagram showing a measurement data format suited to the non-step processing mode of the abnormality/error influence explanatory variable detection device 100 according to the present embodiment, and the method of abnormal/normal judgment.
Flowchart showing the operation procedure of the non-step processing mode of the abnormality/error influence explanatory variable detection device 100 according to the present embodiment.
Diagram showing, with a focus on how the data contents change, the processing from the teacher data T until the errors are obtained in the non-step processing mode of the abnormality/error influence explanatory variable detection device 100 according to the present embodiment.
Diagram showing the processing, in the non-step processing mode of the abnormality/error influence explanatory variable detection device 100 according to the present embodiment, of obtaining the mean and standard deviation, extracting for each step the explanatory variables of the teacher data whose errors lie within the predetermined range of the error distribution, and obtaining the median.
Diagram showing the result of the extraction processing in the non-step processing mode of the abnormality/error influence explanatory variable detection device 100 according to the present embodiment.
Diagram showing the process of calculating, for each step, the reference Shapley value based on the median of each step in the non-step processing mode of the abnormality/error influence explanatory variable detection device 100 according to the present embodiment.
Diagram showing the analysis target Shapley values calculated from the analysis measurement data and the comparison values obtained by the influence explanatory variable detection means 37 in the non-step processing mode of the abnormality/error influence explanatory variable detection device 100 according to the present embodiment.

Hereinafter, the influence explanatory variable detection device and the influence explanatory variable detection program according to an embodiment of the present invention will be described with reference to the accompanying drawings. In the drawings, identical components are given identical reference numerals, and duplicate description is omitted. FIG. 11 is a configuration diagram of a computer system that realizes the influence explanatory variable detection device 100 according to the embodiment of the present invention. The influence explanatory variable detection device 100 can be configured, for example, by a personal computer, a workstation, or another computer system as shown in FIG. 11. In this computer system, the CPU 10 controls each part based on programs and data stored in, or read into, the main memory 11 and executes the necessary processing, whereby the system operates as the influence explanatory variable detection device 100.

An external storage interface 13, an input interface 14, a display interface 15, and a data input interface 16 are connected to the CPU 10 via a bus 12. An external storage device 23, which stores programs such as the state change detection program and the necessary data, is connected to the external storage interface 13. An input device 24 such as a keyboard, for entering commands and data, and a mouse 22 serving as a pointing device are connected to the input interface 14.

A display device 25 having a display screen such as an LED or LCD screen is connected to the display interface 15. Sensors 26-1, 26-2, ..., 26-m for obtaining measurement data are connected to the data input interface 16. The sensors 26-1, 26-2, ..., 26-m are components for obtaining measurement data and may instead be a storage medium or an input device used for data input. The computer system may also include other components, and the configuration of FIG. 11 is only an example.

FIG. 12 is a functional block diagram of the influence explanatory variable detection device 100 according to the first embodiment of the present invention. In the CPU 10, each of the means shown in FIG. 12 is realized by the influence explanatory variable detection program in the external storage device 23. That is, the prediction model creation means 30, the prediction model 31, the error calculation means 32, the extraction means 33, the central value calculation means 34, the reference value calculation means 35, the analysis target value calculation means 36, the influence explanatory variable detection means 37, the division means 38, the influence degree acquisition means 39, and the teacher data T are stored.

FIG. 13 shows the contents of the teacher data T. The teacher data T includes teacher measurement data of a plurality of items, which are the explanatory variables. Here the explanatory-variable items are the four items "sepal length", "sepal width", "petal length", and "petal width". The teacher data T further includes one piece of teacher identification data, which identifies the teacher measurement data of the plurality of items and is the objective variable. Specifically, the flower type written at the left of the explanatory-variable items in FIG. 13 (sepal length, sepal width, petal length, petal width) is encoded as "1" for "Setosa", "2" for "Versicolor", and "3" for "Virginica"; these codes are the teacher identification data, and "1", "2", and "3" are the values of the objective variable. One row of FIG. 13 is one set of data consisting of the objective variable and the explanatory variables, and a plurality of sets are arranged in the vertical direction of the figure.

The prediction model 31 is created by the prediction model creation means 30 using the teacher data T. In the present embodiment, the prediction model creation means 30 is provided in this computer system in order to create the prediction model 31. Alternatively, a prediction model 31 created by another device or program may be stored in the external storage device 23 and used, in which case the prediction model creation means 30 need not be provided.

The prediction model 31 predicts the objective variable from the explanatory variables by machine learning. One example of the machine learning algorithm is the random forest, a pattern mining method; alternatively, any machine learning algorithm that predicts by branching with a classifier (for example, in a tree structure), such as a classification tree or a regression tree, can be adopted. The prediction model 31 may also perform machine learning by multiple regression analysis.

The error calculation means 32 supplies the explanatory variables of the multiple sets of training data T to the prediction model 31 to obtain an objective variable, and computes the error between that value and the objective variable of the training data corresponding to the supplied explanatory variables. It does this for all of the training data. For the training data of FIG. 13, therefore, as many errors are obtained as there are rows.
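The error calculation can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: `toy_model` is a hypothetical stand-in for the prediction model 31, the four sample rows are invented, and taking the error as the predicted value minus the recorded value is an assumption, since the text does not fix a sign convention.

```python
def prediction_errors(model, rows):
    """Return one error per training row: predicted target minus recorded target."""
    return [model(features) - target for features, target in rows]


# Hypothetical stand-in for prediction model 31: classifies by petal length only.
def toy_model(features):
    sepal_length, sepal_width, petal_length, petal_width = features
    if petal_length < 2.5:
        return 1  # Setosa
    return 2 if petal_length < 4.9 else 3  # Versicolor or Virginica


# Invented rows: (explanatory variables, objective variable).
rows = [
    ((5.1, 3.5, 1.4, 0.2), 1),
    ((7.0, 3.2, 4.7, 1.4), 2),
    ((6.3, 3.3, 6.0, 2.5), 3),
    ((6.0, 2.7, 5.1, 1.6), 2),  # the toy model predicts 3 for this row
]

errors = prediction_errors(toy_model, rows)  # one error per row
```

One error is produced per row, which matches the statement that the number of errors equals the number of rows.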

The extraction means 33 obtains the distribution of the computed errors and extracts the explanatory variables of the training data whose errors fall within a predetermined range of that distribution. In this embodiment it obtains the mean and standard deviation of the error distribution and extracts the explanatory variables of the training data whose errors lie within a predetermined multiple of the standard deviation from the mean. The predetermined multiple is 1 in this embodiment but may be, for example, 1.5 or 0.5. Because, as noted above, one error is generated per row of the training data T of FIG. 13, the mean is the mean of those errors, and a single mean is obtained.

Letting σ denote the standard deviation, σ is obtained by the following equation (1).

In the above, n is the number of values for which the standard deviation is computed, that is, the number of errors, and x_i is each value.
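The image of equation (1) is not reproduced in this text. Assuming the population form of the standard deviation, which the text does not confirm (a sample form would divide by n − 1 instead), an expression consistent with the symbols n and x_i defined here and with the mean μ used in the following paragraphs is:

```latex
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \mu\right)^2},
\qquad
\mu = \frac{1}{n}\sum_{i=1}^{n} x_i
```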

FIG. 14 is a distribution diagram of the errors. Denoting the mean by μ, the explanatory variables of the training data whose errors lie within one standard deviation σ of the mean in the distribution of FIG. 14 (that is, in the range from (μ−σ) to (μ+σ)) are extracted. As a result, the explanatory variables of some number of rows are extracted.
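The extraction by error band can be sketched as below. The function name, the sample data, the inclusive bounds, and the use of the population standard deviation are all assumptions made for illustration; the parameter `k` corresponds to the "predetermined multiple" (1 in the embodiment).

```python
from statistics import fmean, pstdev


def extract_within_band(features, errors, k=1.0):
    """Keep the feature rows whose error lies in [mu - k*sigma, mu + k*sigma]."""
    mu = fmean(errors)
    sigma = pstdev(errors, mu)  # population standard deviation (assumed form)
    low, high = mu - k * sigma, mu + k * sigma
    return [row for row, err in zip(features, errors) if low <= err <= high]


# Invented data: eight rows, two of which have a large error.
features = [(i,) for i in range(8)]
errors = [0.0, 0.0, 0.0, 1.0, 0.0, -1.0, 0.0, 0.0]

kept = extract_within_band(features, errors)  # rows 3 and 5 are dropped
```

Here μ is 0 and σ is 0.5, so only the rows with an error of 0 remain.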

In this embodiment, using the range from (μ−σ) to (μ+σ), the central part of the error distribution, extracts errors of an ordinary magnitude, and processing then proceeds to computing the central value and the reference Shapley value. The reference Shapley value is thus computed from training data, across the explanatory variable items, whose degree of abnormality is not particularly large. Consequently, when the reference Shapley value is compared with the analysis-target Shapley value, a large deviation of the analysis-target Shapley value from the reference Shapley value supports the conclusion that the explanatory variable in question contributes strongly to the abnormality or error. This embodiment rests on that reasoning.

Conversely, suppose that instead of the center of the distribution used in the above embodiment, the explanatory variables of the training data are extracted for errors in the edge portions of the distribution, namely the ranges from (μ+2σ) to (μ+3σ) and from (μ−2σ) to (μ−3σ). The reference Shapley value is then computed from training data with large errors (a high degree of abnormality). In that case, an analysis-target Shapley value close to the reference Shapley value supports the conclusion that the explanatory variable contributes strongly to the abnormality or error. In other words, as described later, this embodiment judges that an explanatory variable contributes strongly to the abnormality or error when the comparison value is larger than a predetermined threshold; when the edge portions of the error distribution are used as above, a method that makes that judgment when the comparison value is smaller than a predetermined threshold can be adopted instead.
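The edge-band variant can be sketched in the same way. As before, the function name, the sample data, the inclusive bounds, and the population standard deviation are assumptions; `inner` and `outer` stand for the multiples 2 and 3 given in the text.

```python
from statistics import fmean, pstdev


def extract_edge_band(features, errors, inner=2.0, outer=3.0):
    """Keep rows whose error is between inner*sigma and outer*sigma away from mu."""
    mu = fmean(errors)
    sigma = pstdev(errors, mu)  # population standard deviation (assumed form)
    return [
        row
        for row, err in zip(features, errors)
        if inner * sigma <= abs(err - mu) <= outer * sigma
    ]


# Invented data: six ordinary rows and one row with a large error.
features = ["r0", "r1", "r2", "r3", "r4", "r5", "r6"]
errors = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 7.0]

edge_rows = extract_edge_band(features, errors)  # only the outlying row remains
```

With this selection the judgment is inverted, as the paragraph explains: a small comparison value, rather than a large one, would indicate an influencing explanatory variable.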

The central value calculation means 34 obtains a central value for the training measurement data of each item of the extracted explanatory variables. The central value may be any value near the median, such as the median, the mean, or a value midway between the median and the mean. Obtaining a central value yields a numerically central value among the explanatory variables extracted by the extraction means 33, which makes the extraction more meaningful. Although the various central values above are all possible in theory, the median is preferable from the standpoint of computation, so in this embodiment the central value is the median. As described above, the extracted explanatory variables span some number of rows, and the items are the four explanatory variables "petal length", "petal width", "sepal length", and "sepal width". The data of those rows are therefore collected for each of the four items, and the median of each collection is obtained.
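A minimal sketch of the per-item median follows. The item names (with "sepal" standing for ガク) and the three sample rows are illustrative assumptions.

```python
from statistics import median

ITEMS = ("sepal_length", "sepal_width", "petal_length", "petal_width")


def item_medians(extracted_rows):
    """Return the median of each item (column) over the extracted rows."""
    columns = zip(*extracted_rows)  # transpose rows into per-item columns
    return {name: median(column) for name, column in zip(ITEMS, columns)}


# Invented rows that survived the extraction.
extracted_rows = [
    (5.1, 3.5, 1.4, 0.2),
    (4.9, 3.0, 1.4, 0.2),
    (5.0, 3.6, 1.5, 0.3),
]

medians = item_medians(extracted_rows)
```

Each item gets exactly one median, so four values are passed on to the Shapley value calculation.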

The reference value calculation means 35 computes a reference Shapley value, which is a Shapley value, from the medians. A median has been obtained for each of the four items. These four medians, that is, the medians of "petal length", "petal width", "sepal length", and "sepal width", are fed into a software library for computing Shapley values to obtain the Shapley values. The library named "iml (Interpretable Machine Learning)", created by Christoph Molnar, can be used.
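To make concrete what a Shapley value per item means, the sketch below computes exact Shapley values in plain Python. It does not call the iml library and is not claimed to reproduce its output: the value function (features outside the coalition are filled in from each background row and the predictions averaged), the linear `toy_model`, and all data are assumptions chosen so that the result can be checked by hand.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, background):
    """Exact Shapley value of each feature of `instance` against `background` rows."""
    n = len(instance)

    def value(coalition):
        # Mean prediction with coalition features fixed to the instance's values
        # and the remaining features taken from each background row in turn.
        total = 0.0
        for row in background:
            mixed = [instance[i] if i in coalition else row[i] for i in range(n)]
            total += model(mixed)
        return total / len(background)

    result = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi += weight * (value(s | {i}) - value(s))
        result.append(phi)
    return result


# Hypothetical linear stand-in for prediction model 31.
def toy_model(x):
    return 1.0 + 0.5 * x[2] + 1.0 * x[3]


background = [(5.0, 3.4, 1.0, 0.0), (6.0, 3.0, 3.0, 1.0), (7.0, 3.2, 5.0, 2.0)]
medians = (6.0, 3.2, 4.0, 1.5)  # invented per-item medians

reference_shapley = shapley_values(toy_model, medians, background)
```

For a linear model each value reduces to the weight times the difference between the instance value and the background mean, which is why the two petal items each receive 0.5 here and the sepal items receive 0.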

The analysis target value calculation means 36 computes an analysis-target Shapley value, which is a Shapley value, from analysis measurement data for the plurality of items. This data is newly measured by the same measurement process as the training measurement data and is the subject of analysis as to whether it influences an error or abnormality. Because the analysis measurement data is newly measured by the same process as the training measurement data, it consists of measurements of the four items "petal length", "petal width", "sepal length", and "sepal width". The same library as above can be used for this data.

The effect explanatory variable detection means 37 obtains, for each identical item among the plurality of items, a comparison value between the reference Shapley value and the analysis-target Shapley value and, based on the magnitude of the comparison value, detects which explanatory variable item's analysis measurement data influences the error or abnormality. The comparison value is the error obtained by subtracting the analysis-target Shapley value from the reference Shapley value, or the proportion that this error accounts for. When the comparison value exceeds a predetermined threshold, the means outputs a result indicating that the abnormality or error arose under the influence of the explanatory variable of that item.
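The comparison can be sketched as follows. The text defines the error as reference minus analysis target and the ratio as an item's share of the total; using absolute values for the ratio, applying the threshold to the ratio, the threshold of 0.5, and all Shapley values shown are assumptions for illustration.

```python
def compare(reference, target, threshold=0.5):
    """Return per-item errors, per-item ratios, and the items exceeding the threshold."""
    errors = {item: reference[item] - target[item] for item in reference}
    total = sum(abs(e) for e in errors.values())
    # Share of the total error held by each item (absolute values are an assumption).
    ratios = {item: (abs(e) / total if total else 0.0) for item, e in errors.items()}
    flagged = [item for item, ratio in ratios.items() if ratio > threshold]
    return errors, ratios, flagged


# Invented Shapley values per item.
reference = {"sepal_length": 0.10, "sepal_width": 0.05, "petal_length": 0.40, "petal_width": 0.20}
target = {"sepal_length": 0.08, "sepal_width": 0.05, "petal_length": -0.30, "petal_width": 0.15}

errors, ratios, flagged = compare(reference, target)
```

In this invented case "petal_length" holds roughly 90% of the total error and is the only item flagged.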

The abnormality/error effect explanatory variable detection device 100 and the abnormality/error effect explanatory variable detection program according to the embodiment of the present invention operate in either a step processing mode or a non-step processing mode. FIG. 15 illustrates the step processing mode. The device 100 and the program are applicable to a target device, monitored for abnormalities and errors, that has sensors A1, B1, C1, and D1 installed at predetermined positions as shown in FIG. 15 and that repeats a cycle of obtaining data at the processing time of the first process, then of the second process, then of the third process, then again of the first process, of the second process, and so on. The sensors A1, B1, C1, and D1 can measure, for example, temperature, humidity, or vibration; all of the sensors may measure the same physical quantity, or they may measure different ones.

For such a target device, the prediction model 31 predicts from the values of the sensors A1, B1, C1, and D1 which processing time applies: that of the first process (step 1), the second process (step 2), or the third process (step 3). An abnormality or error is detected from the error that arises when the predicted value (objective variable) deviates from the actual time.

When the training measurement data and the analysis measurement data are obtained repeatedly over N steps (N being a positive integer), the extraction means 33 has a step processing mode in which it obtains the distribution of the computed errors for each step and extracts, for each step, the explanatory variables of the training data whose errors fall within the predetermined range. The central value calculation means 34, the reference value calculation means 35, the analysis target value calculation means 36, and the effect explanatory variable detection means 37 likewise have a step processing mode in which they process each step separately.

FIG. 16 is a flowchart of the operating procedure of the abnormality/error effect explanatory variable detection device 100 according to this embodiment in the step processing mode. In this description, the values measured by the sensors A1, B1, C1, and D1 are taken to be the four items shown in FIG. 2 and elsewhere, "petal length", "petal width", "sepal length", and "sepal width", and the predicted value is the flower type, with "Setosa" as "1", "Versicolor" as "2", and "Virginica" as "3". The training data T shown in FIG. 13 is assumed to have been prepared already.

The CPU 10, acting as the error calculation means 32, supplies the explanatory variables of the training data T to the prediction model 31 to obtain the objective variable and computes the error from the objective variable of the training data corresponding to the supplied explanatory variables (S11). This is done for all of the training data T. FIG. 17 shows the processing from the training data T up to the computed errors, focusing on how the data contents change.

Next, the CPU 10, acting as the extraction means 33, obtains the mean and standard deviation of the error distribution for each step and extracts, for each step, the explanatory variables of the training data whose errors fall within the predetermined range of the distribution (S12). Here the predetermined range is one standard deviation σ around the mean, that is, from (μ−σ) to (μ+σ). FIG. 18 shows the processing of obtaining the mean and standard deviation of the error distribution, extracting for each step the explanatory variables of the training data whose errors fall within the predetermined range, and obtaining the medians. FIG. 19 shows the training data after extraction. In the training data of FIG. 19, the explanatory variables of rows left blank across the figure have been excluded, and those of rows in which values remain have been extracted. FIGS. 18 and 19 show the errors for step 1, but in this embodiment the errors for steps 2 and 3 within the range from (μ−σ) to (μ+σ) are extracted in the same way. Because FIG. 18 illustrates the processing while FIG. 19 illustrates what the result looks like, the numerical values in the two figures do not match.

The CPU 10, acting as the central value calculation means 34, then obtains for each step the central value of the training measurement data of each item of the extracted explanatory variables (S13). Here the central value is the median. In this embodiment the explanatory variables are divided among three steps, step 1, step 2, and step 3, and for each step a median is computed for the four items "petal length", "petal width", "sepal length", and "sepal width". FIG. 18 shows the medians for step 1; the medians for steps 2 and 3 are obtained in the same way.
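Steps S12 and S13 of the step processing mode can be sketched together as below. The record layout `(step, features, error)`, the sample values, the inclusive bounds, and the population standard deviation are assumptions for illustration.

```python
from collections import defaultdict
from statistics import fmean, median, pstdev


def per_step_medians(records, k=1.0):
    """For each step: keep rows within mu +/- k*sigma of that step's errors, then take item medians."""
    by_step = defaultdict(list)
    for step, features, error in records:
        by_step[step].append((features, error))

    result = {}
    for step, items in by_step.items():
        errs = [error for _, error in items]
        mu = fmean(errs)
        sigma = pstdev(errs, mu)  # population standard deviation (assumed form)
        kept = [f for f, e in items if mu - k * sigma <= e <= mu + k * sigma]
        result[step] = tuple(median(column) for column in zip(*kept))
    return result


# Invented records; the last row of each step has a large error.
records = [
    (1, (5.0, 3.0, 1.0, 0.1), 0.0),
    (1, (5.2, 3.2, 1.2, 0.2), 0.0),
    (1, (5.4, 3.4, 1.4, 0.3), 0.0),
    (1, (9.0, 9.0, 9.0, 9.0), 1.0),
    (2, (6.0, 2.8, 4.0, 1.2), 0.0),
    (2, (6.2, 2.9, 4.5, 1.4), 0.0),
    (2, (6.4, 3.0, 5.0, 1.6), 0.0),
    (2, (1.0, 1.0, 1.0, 1.0), -1.0),
]

step_medians = per_step_medians(records)
```

Because the mean and standard deviation are computed separately per step, each step gets its own band and its own set of four medians.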

Next, the CPU 10, acting as the reference value calculation means 35, computes for each step a reference Shapley value from that step's medians (S14). FIG. 20 shows this processing by the reference value calculation means 35. The figure gives the reference Shapley values of the explanatory variables for steps 1, 2, and 3, each computed from the medians of the four items "petal length", "petal width", "sepal length", and "sepal width".

Further, the CPU 10, acting as the analysis target value calculation means 36, performs measurement in the same way as for the training measurement data to obtain analysis measurement data of the plurality of items for each step, and computes for each step an analysis-target Shapley value from that data (S15). FIG. 21 shows the analysis-target Shapley values computed from the analysis measurement data and the comparison values obtained by the effect explanatory variable detection means 37. The analysis-target Shapley values are computed for the four items "petal length", "petal width", "sepal length", and "sepal width" at each of steps 1, 2, and 3.

Next, the CPU 10, acting as the effect explanatory variable detection means 37, obtains for each step and for each identical item the comparison value between the reference Shapley value and the analysis-target Shapley value and, based on its magnitude, detects for each step which explanatory variable item's analysis measurement data influences the error or abnormality (S16).

FIG. 21 shows, for operation of the abnormality/error effect explanatory variable detection device 100 in the step processing mode, the analysis-target Shapley values computed from the analysis measurement data and the comparison values obtained by the effect explanatory variable detection means 37. As shown in FIG. 21, two comparison values are obtained: the error, which is the reference Shapley value minus the analysis-target Shapley value, and the ratio, which is the proportion that this error accounts for. The ratio is the share of the step's total error taken by the error of each of the four items "petal length", "petal width", "sepal length", and "sepal width", and is obtained by dividing the item's error by the total error for the step. In this example, "petal length" in step 2 and "petal width" in step 3 are markedly larger than the other comparison values in the same step, for instance exceeding the predetermined threshold, so these explanatory variable items are concluded to influence the error or abnormality and are enclosed in a frame in FIG. 21. When the device 100 actually reports the explanatory variable items that influence the error or abnormality, it may display text such as "In step 2 the petal length, and in step 3 the petal width, is influencing the abnormality." No influencing explanatory variable is obtained for step 1 in FIG. 21 because, as the prediction result F of the prediction model 31 in FIG. 21 shows, the predicted values for steps 2 and 3 deviate greatly from 2 and 3 respectively, indicating an abnormality or error, whereas the predicted value for step 1 is 1, which indicates neither.

FIG. 22 shows a measurement data format suited to the non-step processing mode and a method of judging abnormal versus normal. The non-step processing mode suits, for example, the production of different products identified by a product No. as shown in FIG. 22, where each product has a size 1 of a first part, a size 2 of a second part, and a size 3 of a third part, and sensors 1, 2, and 3 can each measure some value. With the value of size 1 as the objective variable and the values of sizes 2 and 3 and the measurements of sensors 1, 2, and 3 as explanatory variables, the mode can be applied to a prediction model that predicts the value of size 1. The training data contains measured values of size 1, and the normal range G, outside of which a product is treated as abnormal, is determined from the error between those measured values and the size 1 values that the prediction model predicts from the explanatory variables of the training data.

As shown in FIG. 22, a product whose measured and predicted values fall within the normal range G is judged normal, and a product falling outside the range is judged abnormal. The non-step processing mode can thus be applied to systems that predict abnormalities or errors from measurement data that does not depend on the passage of time.
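The normal/abnormal judgment can be sketched as follows. The text does not state how the normal range G is derived from the training errors, so the rule used here (mean plus or minus three population standard deviations) is purely an assumption, as are the function names and the sample values.

```python
from statistics import fmean, pstdev


def normal_range(training_errors, k=3.0):
    """Derive a normal range G from training errors (mu +/- k*sigma is an assumed rule)."""
    mu = fmean(training_errors)
    sigma = pstdev(training_errors, mu)
    return mu - k * sigma, mu + k * sigma


def judge(measured_size1, predicted_size1, g):
    """Judge a product normal when its prediction error falls inside G."""
    low, high = g
    error = predicted_size1 - measured_size1
    return "normal" if low <= error <= high else "abnormal"


# Invented training errors (predicted size 1 minus measured size 1).
training_errors = [-0.1, 0.0, 0.1, 0.0, 0.0]
G = normal_range(training_errors)

result_a = judge(10.0, 10.05, G)  # small error
result_b = judge(10.0, 10.5, G)   # large error
```

Any other rule for G can be substituted in `normal_range` without changing `judge`.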

As with the processing described with the flowchart of FIG. 16, the abnormality/error effect explanatory variable detection device 100 and program of the embodiment having this non-step processing mode are configured as follows for the case where the training measurement data and the analysis measurement data are obtained repeatedly over N steps (N being a positive integer). The training measurement data and analysis measurement data used are the same as those used in FIG. 16. The extraction means 33 has a non-step mode in which it obtains a single distribution of the computed errors over all of the measurement data and extracts, from all of the explanatory variables of the training data, those whose errors fall within the predetermined range. In the non-step mode the central value calculation means 34 and the reference value calculation means 35 process the data without regard to step, while the analysis target value calculation means 36 and the effect explanatory variable detection means 37 process each step separately.

FIG. 23 is a flowchart of the operating procedure of the abnormality/error effect explanatory variable detection device 100 according to this embodiment in the non-step processing mode. In this description, the values measured by the sensors A1, B1, C1, and D1 are taken to be the four items shown in FIG. 2 and elsewhere, "petal length", "petal width", "sepal length", and "sepal width", and the predicted value is the flower type, with "Setosa" as "1", "Versicolor" as "2", and "Virginica" as "3". The training data T shown in FIG. 13 is assumed to have been prepared already. To make the difference between step-mode and non-step-mode processing clear, the non-step-mode processing is described using the same data as in the description of the step mode.

The CPU 10, acting as the error calculation means 32, supplies the explanatory variables of the training data T to the prediction model 31 to obtain the objective variable and computes the error from the objective variable of the training data corresponding to the supplied explanatory variables (S21). This is done for all of the training data T. FIG. 24 shows the processing from the training data T up to the computed errors, focusing on how the data contents change.

Next, the CPU 10, acting as the extraction means 33, obtains the mean and standard deviation of the errors over the training data of all steps and extracts, across all steps, the explanatory variables of the training data whose errors fall within the predetermined range of the distribution (S22). Here the predetermined range is one standard deviation σ around the mean, that is, from (μ−σ) to (μ+σ). FIG. 25 shows the processing of obtaining the mean and standard deviation, extracting the explanatory variables of the training data whose errors fall within the predetermined range, and obtaining the medians. FIG. 26 shows the training data after extraction. Because FIG. 25 illustrates the processing while FIG. 26 illustrates what the result looks like, the numerical values in the two figures do not match. In the training data shown in FIG. 25, the explanatory variables of rows left blank across the figure have been excluded, and those of rows in which data values remain have been extracted.

The CPU 10, acting as the central value calculation means 34, then obtains the central value over all steps of the training measurement data of each item of the extracted explanatory variables (S23). Here the central value is the median. In this embodiment the explanatory variables are divided among three steps, step 1, step 2, and step 3, but the medians of the four items "petal length", "petal width", "sepal length", and "sepal width" are computed over all steps together. Medians for the individual steps are not computed here.
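For contrast with the step processing mode, S22 and S23 of the non-step mode can be sketched as below: the step label is ignored, one error band is computed over all rows, and one median per item is taken over everything that survives. The record layout and the sample values are assumptions for illustration.

```python
from statistics import fmean, median, pstdev


def pooled_medians(records, k=1.0):
    """Ignore the step label: one error band over all rows, then one median per item."""
    errs = [error for _, _, error in records]
    mu = fmean(errs)
    sigma = pstdev(errs, mu)  # population standard deviation (assumed form)
    kept = [f for _, f, e in records if mu - k * sigma <= e <= mu + k * sigma]
    return tuple(median(column) for column in zip(*kept))


# Invented records: (step, features, error).
records = [
    (1, (5.0, 3.0, 1.0, 0.1), 0.0),
    (1, (5.2, 3.2, 1.2, 0.2), 0.0),
    (1, (5.4, 3.4, 1.4, 0.3), 0.0),
    (1, (9.0, 9.0, 9.0, 9.0), 1.0),
    (2, (6.0, 2.8, 4.0, 1.2), 0.0),
    (2, (6.2, 2.9, 4.5, 1.4), 0.0),
    (2, (6.4, 3.0, 5.0, 1.6), 0.0),
    (2, (1.0, 1.0, 1.0, 1.0), -1.0),
]

all_step_medians = pooled_medians(records)
```

The pooled medians fall between the values typical of the individual steps, which illustrates why a single reference can fit poorly when the data composition differs from step to step.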

Next, the CPU 10, acting as the reference value calculation means 35, computes a reference Shapley value from the medians obtained over the data of all steps together (S24). FIG. 27 shows this processing by the reference value calculation means 35. The reference Shapley values are obtained from the medians of the explanatory variables with steps 1, 2, and 3 pooled: for each of the four items "petal length", "petal width", "sepal length", and "sepal width", a reference Shapley value is computed from a single median.

Further, the CPU 10, acting as the analysis target value calculation means 36, performs measurement in the same way as for the training measurement data to obtain analysis measurement data of the plurality of items for each step, and computes for each step an analysis-target Shapley value from that data (S25). FIG. 28 shows the analysis-target Shapley values computed from the analysis measurement data and the comparison values obtained by the effect explanatory variable detection means 37. The analysis-target Shapley values are computed for the four items "petal length", "petal width", "sepal length", and "sepal width" at each of steps 1, 2, and 3.

Next, the CPU 10, acting as the effect explanatory variable detection means 37, obtains for each step and for each identical item the comparison value between the reference Shapley value and the analysis-target Shapley value and, based on its magnitude, detects for each step which explanatory variable item's analysis measurement data influences the error or abnormality (S26).

Here, as shown in FIG. 28, two comparison values are obtained: the error, which is the reference Shapley value minus the analysis target Shapley value, and the ratio of the measured error. In this example, "petal length" in step 2 and "petal length" in step 3 have comparison values that are markedly larger than the other comparison values in the same step, for example exceeding a predetermined threshold. It is therefore concluded that this explanatory variable item is influencing the error or the abnormality, and the item is enclosed in a frame in FIG. 28. When the abnormality/error effect explanatory variable detection device 100 of this embodiment actually outputs a notification of the explanatory variable items that are influencing the error or the abnormality, it may display text such as "In step 2 the petal length, and in step 3 the petal length, is influencing the abnormality."
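The comparison in S26 can be sketched as follows. The Shapley values are invented for illustration and are not the values of FIG. 28. The text does not state what the error ratio is taken relative to, so the definition used here (absolute error divided by the absolute reference Shapley value) and the value of `RATIO_THRESHOLD` are assumptions.

```python
FEATURES = ["petal_length", "petal_width", "sepal_length", "sepal_width"]

RATIO_THRESHOLD = 0.5  # hypothetical "predetermined threshold"


def compare(reference, analysis):
    """Return {item: (error, ratio)} with error = reference - analysis.

    The ratio definition (|error| / |reference|) is an assumption.
    """
    result = {}
    for name in FEATURES:
        error = reference[name] - analysis[name]
        if reference[name] != 0:
            ratio = abs(error) / abs(reference[name])
        else:
            ratio = float("inf")
        result[name] = (error, ratio)
    return result


def flag_influential(comparison, ratio_threshold):
    """Items whose ratio exceeds the threshold, in FEATURES order."""
    return [name for name in FEATURES if comparison[name][1] > ratio_threshold]


# Single reference shared by all steps; made-up values.
reference = {"petal_length": 2.40, "petal_width": 0.60,
             "sepal_length": 0.60, "sepal_width": -0.30}

# Analysis target Shapley values per step; made-up values.
analysis_by_step = {
    2: {"petal_length": 0.90, "petal_width": 0.57,
        "sepal_length": 0.58, "sepal_width": -0.29},
    3: {"petal_length": 4.10, "petal_width": 0.62,
        "sepal_length": 0.61, "sepal_width": -0.31},
}

flagged_by_step = {}
for step, analysis in analysis_by_step.items():
    flagged_by_step[step] = flag_influential(compare(reference, analysis),
                                             RATIO_THRESHOLD)
    for name in flagged_by_step[step]:
        print(f"In step {step}, {name} is influencing the abnormality.")
```

A fixed threshold is only one way to express "markedly larger than the other comparison values in the same step"; ranking the ratios within a step would be an equally plausible reading.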

Note that in FIG. 28 influence explanatory variables are also obtained for step 1, but the result is not appropriate. Influence explanatory variables are detected based on, for example, whether the comparison value exceeds a predetermined threshold, and both the analysis measurement data and the teacher measurement data used in this embodiment have a different data composition in each step, so appropriate detection is not achieved. Specifically, in step 1 the error and the ratio for "petal length" and "petal width" are markedly larger than the other comparison values in the same step, for example exceeding a predetermined threshold. It is therefore concluded that these explanatory variable items are influencing the error or the abnormality, and the rows for "petal length" and "petal width" are enclosed in frames.

However, as the prediction result F of the prediction model 31 in FIG. 28 shows, the predicted values for steps 2 and 3 deviate greatly from 2 and 3, respectively, indicating an abnormality or an error, whereas the predicted value for step 1 is 1, which indicates neither. Because influence explanatory variables have been detected in step 1, where no abnormality or error was detected, this detection can be identified as erroneous.
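The cross-check described above can be sketched as a filter that accepts flagged items only in steps whose prediction is itself abnormal. All values are invented. Treating the step number as the expected predicted value follows the example in the text (steps 1, 2, and 3 are expected to yield 1, 2, and 3), while the `tolerance` and the function names are assumptions.

```python
def is_anomalous(predicted, expected, tolerance):
    """True when the prediction deviates from the expected value."""
    return abs(predicted - expected) > tolerance


def validate_flags(flagged_by_step, predicted_by_step, tolerance=0.25):
    """Split flagged items into confirmed and rejected detections.

    A detection is rejected when its step shows no abnormality or error
    in the prediction result. The expected value for a step is assumed
    to equal the step number, as in the example of FIG. 28.
    """
    confirmed, rejected = {}, {}
    for step, names in flagged_by_step.items():
        if not names:
            continue
        if is_anomalous(predicted_by_step[step], step, tolerance):
            confirmed[step] = names
        else:
            rejected[step] = names
    return confirmed, rejected


# Made-up detection results and prediction results F.
flagged_by_step = {
    1: ["petal_length", "petal_width"],
    2: ["petal_length"],
    3: ["petal_length"],
}
predicted_by_step = {1: 1.0, 2: 2.8, 3: 1.9}

confirmed, rejected = validate_flags(flagged_by_step, predicted_by_step)
print("confirmed:", confirmed)
print("rejected as erroneous detections:", rejected)
```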

Although a plurality of embodiments according to the present invention have been described, these embodiments are presented as examples and are not intended to limit the scope of the invention. These novel embodiments can be implemented in various other forms, and various omissions, substitutions, and changes can be made without departing from the gist of the invention. These embodiments and their modifications are included in the scope and gist of the invention, and in the invention described in the claims and its equivalents.

10 ... CPU, 11 ... Main memory, 12 ... Bus, 13 ... External storage interface, 14 ... Input interface, 15 ... Display interface, 16 ... Data input interface, 22 ... Mouse, 23 ... External storage device, 24 ... Input device, 25 ... Display device, 26-1 to 26-m ... Sensor, 30 ... Prediction model creation means, 31 ... Prediction model, 32 ... Error calculation means, 33 ... Extraction means, 34 ... Central value calculation means, 35 ... Reference value calculation means, 36 ... Analysis target value calculation means, 37 ... Influence explanatory variable detection means, 100 ... Influence explanatory variable detection device

Claims

An abnormality/error effect explanatory variable detection device comprising:
a prediction model created by machine learning, using a plurality of prepared sets of teacher data each consisting of teacher measurement data of a plurality of items, which are explanatory variables, and teacher identification data, which is a single datum identifying the teacher measurement data of the plurality of items and is an objective variable, so as to obtain the objective variable from the explanatory variables;
an error calculation means for giving the explanatory variables of the plurality of prepared sets of teacher data to the prediction model to obtain the objective variable, and for obtaining, for all of the teacher data, the error with respect to the objective variable of the teacher data corresponding to the given explanatory variables;
an extraction means for obtaining the distribution of the obtained errors and extracting the explanatory variables of the teacher data corresponding to errors within a predetermined range of that distribution;
a central value calculation means for obtaining a central value of the teacher measurement data for each of the plurality of items of the extracted explanatory variables;
a reference value calculation means for calculating a reference Shapley value, which is a Shapley value, based on the central value;
an analysis target value calculation means for calculating an analysis target Shapley value, which is a Shapley value, based on analysis measurement data of a plurality of items that are newly measured by the same measurement process as the teacher measurement data and are to be analyzed as to whether they are influencing an error or an abnormality; and
an influence explanatory variable detection means for obtaining, for each identical item among the plurality of items, a comparison value between the reference Shapley value and the analysis target Shapley value, and for detecting, based on the magnitude of the comparison value, which explanatory variable item's analysis measurement data is influencing the error or the abnormality.

The abnormality/error effect explanatory variable detection device according to claim 1, wherein, when the teacher measurement data and the analysis measurement data are data obtained repeatedly over N steps (N being a positive integer),
the extraction means has a step processing mode in which the distribution of the obtained errors is obtained for each step and the explanatory variables of the teacher data corresponding to errors within the predetermined range are extracted for each step, and
the central value calculation means, the reference value calculation means, the analysis target value calculation means, and the influence explanatory variable detection means each have a step processing mode in which processing is performed for each step.

The abnormality/error effect explanatory variable detection device according to claim 2, wherein, when the teacher measurement data and the analysis measurement data are data obtained repeatedly over N steps (N being a positive integer),
the extraction means has a non-step mode in which a single distribution of the obtained errors is obtained over all the measurement data and extraction is performed from all the explanatory variables of the teacher data corresponding to errors within the predetermined range, and
in the non-step mode, the central value calculation means and the reference value calculation means perform processing irrespective of step, while the analysis target value calculation means and the influence explanatory variable detection means perform processing for each step.

The abnormality/error effect explanatory variable detection device according to any one of claims 1 to 3, wherein the extraction means extracts the explanatory variables of the teacher data corresponding to errors within one standard deviation of the mean.

An abnormality/error effect explanatory variable detection program that causes a computer to function as:
a prediction model created by machine learning, using a plurality of prepared sets of teacher data each consisting of teacher measurement data of a plurality of items, which are explanatory variables, and teacher identification data, which is a single datum identifying the teacher measurement data of the plurality of items and is an objective variable, so as to obtain the objective variable from the explanatory variables;
an error calculation means for giving the explanatory variables of the plurality of prepared sets of teacher data to the prediction model to obtain the objective variable, and for obtaining, for all of the teacher data, the error with respect to the objective variable of the teacher data corresponding to the given explanatory variables;
an extraction means for obtaining the distribution of the obtained errors and extracting the explanatory variables of the teacher data corresponding to errors within a predetermined range of that distribution;
a central value calculation means for obtaining a central value of the teacher measurement data for each of the plurality of items of the extracted explanatory variables;
a reference value calculation means for calculating a reference Shapley value, which is a Shapley value, based on the central value;
an analysis target value calculation means for calculating an analysis target Shapley value, which is a Shapley value, based on analysis measurement data of a plurality of items that are newly measured by the same measurement process as the teacher measurement data and are to be analyzed as to whether they are influencing an error or an abnormality; and
an influence explanatory variable detection means for obtaining, for each identical item among the plurality of items, a comparison value between the reference Shapley value and the analysis target Shapley value, and for detecting, based on the magnitude of the comparison value, which explanatory variable item's analysis measurement data is influencing the error or the abnormality.

The abnormality/error effect explanatory variable detection program according to claim 5, wherein, when the teacher measurement data and the analysis measurement data are data obtained repeatedly over N steps (N being a positive integer),
the program causes the computer to function as the extraction means so as to process in a step mode in which the distribution of the obtained errors is obtained for each step and the explanatory variables of the teacher data corresponding to errors within the predetermined range are extracted for each step, and
the program further causes the computer to function as the central value calculation means, the reference value calculation means, the analysis target value calculation means, and the influence explanatory variable detection means in a step mode in which processing is performed for each step.

The abnormality/error effect explanatory variable detection program according to claim 6, wherein, when the teacher measurement data and the analysis measurement data are data obtained repeatedly over N steps (N being a positive integer),
the program causes the computer to function as the extraction means so as to process in a non-step mode in which a single distribution of the obtained errors is obtained over all the measurement data and extraction is performed from all the explanatory variables of the teacher data corresponding to errors within the predetermined range, and
in the non-step mode, the program causes the computer to function as the central value calculation means and the reference value calculation means so as to process irrespective of step, and as the analysis target value calculation means and the influence explanatory variable detection means so as to process for each step.