JP2017151614A

JP2017151614A - Computing machine and analytical index calculation method

Info

Publication number: JP2017151614A
Application number: JP2016032067A
Authority: JP
Inventors: ヨウショウ; Yo Shaw; 信二垂水; Shinji Tarumi
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2016-02-23
Filing date: 2016-02-23
Publication date: 2017-08-31
Anticipated expiration: 2036-02-23
Also published as: JP6568488B2

Abstract

PROBLEM TO BE SOLVED: To calculate an index that properly analyzes the prediction accuracy of a predictive model.

SOLUTION: A computing machine analyzes the prediction accuracy of a predictive model for calculating an estimated value of a prediction item for an object to be observed. The computing machine manages a database which stores a plurality of records comprising estimated values of prediction items of an object to be observed and actual values of the prediction items of the object, sorts the records stored in the database on the basis of the estimated values, calculates predictive errors of the records on the basis of the estimated values and actual values of the records stored in the database, selects a plurality of records on the basis of the sorting result, and calculates an analytical index on the basis of a statistic calculated from the predictive errors of the selected records.

SELECTED DRAWING: Figure 1

Description

The present invention relates to a technique for analyzing prediction accuracy.

In recent years, events have been predicted in various fields using prediction models generated by machine learning from past data. Here, past data refers to a data group, such as records, composed of a group of items observed, aggregated, or recorded at an arbitrary time.

When a prediction model is used to predict an arbitrary event for an observation target, the accuracy of that model must be assured to be high. A technique for analyzing the accuracy of a prediction model is therefore required. One known method for analyzing the accuracy of a prediction model is the technique described in Patent Document 1.

Patent Document 1 describes the following: "A simple regression equation calculation unit of a prediction index calculation device calculates a simple regression equation for a plurality of actual values relating to product shipments recorded as past data and the predicted values corresponding to the actual values. A prediction accuracy calculation unit of the prediction index calculation device calculates the prediction accuracy based on the difference between the slope a of the simple regression line y = ax represented by this equation and the slope of each straight line connecting the origin to a point (X, Y) obtained by plotting the past data on a graph whose two axes are the actual values and the corresponding predicted values. Specifically, for the angle θ formed by the simple regression line y = ax and the straight line connecting each point (X, Y) to the origin, cos²θ is calculated for each point (X, Y), and the average value of cos²θ is calculated as the prediction accuracy."

The terms used in this specification are explained here.

An item corresponding to an event to be predicted for an observation target is referred to as a prediction item. When the prediction item can take only two values, such as "0" and "1", it is also referred to as a prediction event. For example, the incidence rate of a disease and the failure rate of a device correspond to prediction events. An item correlated with the prediction item is referred to as a correlation item. Prediction based on past data means extracting a rule (a prediction model) that indicates the relationship between the prediction item and the observation items from a group of records for an arbitrary period, and using that rule to calculate the value of the prediction item from the values of the observation items at a certain time. The value of the prediction item calculated based on the prediction model is referred to as a predicted value. The value of the prediction item that is actually observed is referred to as an actual value. The error between the actual value and the predicted value is referred to as a prediction error. This concludes the explanation of terms.

Patent Document 1: JP 2009-245032 A

Conventionally, an index indicating how close the predicted value is to the actual value has been used to analyze prediction accuracy. In the following description, an index for analyzing prediction accuracy is also referred to as an analysis index. A conventional analysis index is given as, for example, the mean of the squared prediction errors of the records, or the square root of that mean. For these indices, a smaller value indicates higher prediction accuracy of the prediction model.
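The conventional indices described above can be illustrated with a minimal sketch. The function names and the sample values are hypothetical and are not taken from the specification.

```python
import math


def mean_squared_error(predicted, actual):
    """Mean of the squared per-record prediction errors."""
    errors = [p - a for p, a in zip(predicted, actual)]
    return sum(e * e for e in errors) / len(errors)


def root_mean_squared_error(predicted, actual):
    """Square root of the mean squared prediction error."""
    return math.sqrt(mean_squared_error(predicted, actual))


# Hypothetical sample: every record is predicted at 0.5 and the event
# occurs in exactly half of the records.
predicted = [0.5, 0.5, 0.5, 0.5]
actual = [1, 0, 1, 0]

mse = mean_squared_error(predicted, actual)
rmse = root_mean_squared_error(predicted, actual)
print(mse, rmse)
```

In this sample the predicted value equals the observed event frequency, yet the root mean squared error is 0.5. This is the kind of situation in which a per-record index becomes large even though the predicted values match the expected value of the actual values.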

When the correlation between the observation item group and the prediction item is low and the prediction item is strongly affected by factors not reflected in the observation item group, the difference between the predicted value and the actual value becomes large. In other words, the difference between the actual value and the expected value of the actual value becomes large, because the prediction model is generated so that its output approaches the expected value of the actual value.

In such a case, the conventional analysis index takes a very large value and is therefore unsuitable for analyzing prediction accuracy.

The prediction accuracy of the prediction model could be analyzed by comparing the expected value of the actual value with the predicted value for each record. However, the expected value of the actual value cannot be calculated from the value of a single observation target, so this approach is difficult to apply.

The present invention provides an apparatus and a method for calculating an index with which the prediction accuracy of a prediction model can be analyzed appropriately.

A representative example of the invention disclosed in the present application is as follows. A computer analyzes the prediction accuracy of a prediction model for calculating a predicted value of a prediction item of an observation target. The computer includes an arithmetic device, a memory connected to the arithmetic device, and an interface connected to the arithmetic device, and manages a database that stores a plurality of records each composed of the predicted value and the actual value of the prediction item of the observation target. The arithmetic device sorts the plurality of records stored in the database based on the magnitude of the predicted value and stores the sorting result in the memory. The arithmetic device calculates a prediction error for each of the plurality of records based on the predicted values and the actual values of the records stored in the database, and stores the prediction error of each record in the memory. The arithmetic device selects a plurality of target records based on the sorting result, calculates an analysis index for analyzing the prediction accuracy of the prediction model based on a statistic calculated from the prediction errors of the selected records, and stores the analysis index in the memory.

According to the present invention, an index with which the prediction accuracy of a prediction model can be analyzed can be calculated even when the correlation between the observation item group and the prediction item is low. Problems, configurations, and effects other than those described above will become apparent from the following description of the embodiments.

FIG. 1 is a flowchart illustrating an example of processing executed by the computer according to the first embodiment.

FIG. 2 is an explanatory diagram illustrating a cumulative graph according to the first embodiment.

FIG. 3 is a block diagram illustrating an example of the configuration of the computer system according to the first embodiment.

FIG. 4 is an explanatory diagram illustrating an example of individual information stored in the individual information storage unit according to the first embodiment.

FIG. 5 is an explanatory diagram illustrating an example of group information stored in the group information storage unit according to the first embodiment.

FIG. 6 is an explanatory diagram illustrating an example of group statistical information stored in the group information storage unit according to the first embodiment.

FIG. 7 is an explanatory diagram illustrating an example of a prediction accuracy analysis graph according to the first embodiment.

FIG. 8 is an explanatory diagram illustrating an example of analysis index information according to the first embodiment.

FIG. 9 is an explanatory diagram illustrating an example of a cumulative graph according to the first embodiment.

First, an outline of the present invention is described. A database that stores records containing the predicted value and the actual value of a prediction item is assumed. The actual value is assumed to be a binary value indicating whether a certain event occurs: "1" indicates that the event occurred and "0" indicates that it did not. In this case, the predicted value is a value between "0" and "1".

The present invention analyzes the prediction accuracy of a prediction model by analyzing how close the actual values are to the predicted values. That is, the predicted values based on the prediction model are assumed to be correct, and the error between the predicted values and the actual values is analyzed. However, calculating the error between the predicted value and the actual value record by record does not yield a meaningful analysis. The computer 300 of the present invention (see FIG. 3) therefore executes the following processing.

The computer 300 sorts the records based on the magnitude of the predicted value. Specifically, the computer 300 sorts the records in descending order of predicted value. The computer 300 calculates a prediction error using the predicted value and the actual value of each record. The computer 300 then selects a target record group based on the sorting result and calculates an analysis index based on the prediction errors of the selected record group.
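The processing outlined above can be sketched as follows. This is only an illustration under stated assumptions: the specification does not fix the selection rule, the sign of the prediction error, or the statistic, so the threshold-based selection, the "predicted minus actual" error, and the absolute mean used here are hypothetical choices, as are all names and sample values.

```python
def analysis_index(records, threshold):
    """Return an illustrative analysis index.

    records: list of (predicted, actual) pairs.
    threshold: hypothetical cut-off on the predicted value used for selection.
    """
    # Sort the records in descending order of predicted value.
    ordered = sorted(records, key=lambda r: r[0], reverse=True)

    # Per-record prediction error (assumed sign: predicted minus actual).
    errors = [predicted - actual for predicted, actual in ordered]

    # Select the records whose predicted value exceeds the threshold.
    selected = [e for (predicted, _), e in zip(ordered, errors)
                if predicted > threshold]
    if not selected:
        raise ValueError("no record exceeds the threshold")

    # Assumed statistic: absolute value of the mean error of the selection.
    return abs(sum(selected) / len(selected))


# Hypothetical sample data.
sample = [(0.2, 0), (0.9, 1), (0.7, 1), (0.8, 0), (0.1, 0)]
index = analysis_index(sample, threshold=0.5)
print(index)
```

Because the errors of the selected records are averaged before the absolute value is taken, positive and negative per-record errors offset each other, which is what distinguishes this kind of index from a per-record squared error.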

After selecting the record group, the computer 300 may instead calculate a statistic of the predicted values and a statistic of the actual values of the selected record group, and calculate the prediction error based on those statistics.

Various record selection methods are possible depending on the purpose. For example, to analyze the prediction accuracy of the prediction model, the prediction accuracy may be analyzed for a record group in which the event is likely to occur. To this end, the computer 300 generates, based on the sorting result, a cumulative graph (cumulative information) indicating the cumulative value of the predicted values and the cumulative value of the actual values.

The cumulative graph is described here. FIG. 2 is an explanatory diagram illustrating a cumulative graph 200 according to the first embodiment.

The horizontal axis indicates the sort number assigned to each record, and the vertical axis indicates the cumulative value. The cumulative value of the predicted values can be calculated, for example, by summing the predicted values of the records. Likewise, the cumulative value of the actual values can be calculated by summing the actual values of the records. For example, the cumulative predicted value at sort number "100" is calculated by summing the predicted values of the records with sort numbers "1" through "100".
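The cumulative values described above can be computed with a running sum over the sorted records. The following is a minimal sketch; the function name and the sample data are hypothetical.

```python
from itertools import accumulate


def cumulative_curves(records):
    """Return (cumulative predicted values, cumulative actual values).

    records: list of (predicted, actual) pairs. Position i of each returned
    list corresponds to sort number i + 1.
    """
    # Sort in descending order of predicted value.
    ordered = sorted(records, key=lambda r: r[0], reverse=True)
    cum_predicted = list(accumulate(p for p, _ in ordered))
    cum_actual = list(accumulate(a for _, a in ordered))
    return cum_predicted, cum_actual


# Hypothetical sample data.
cum_predicted, cum_actual = cumulative_curves([(0.25, 0), (0.75, 1), (0.5, 1)])
print(cum_predicted)
print(cum_actual)
```

Plotting both lists against the sort number gives the two curves of a cumulative graph in the sense used here.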

A record with a large predicted value has a high probability that its actual value is "1", so the cumulative value increases rapidly. Conversely, a record with a small predicted value has a low probability that its actual value is "1", so the cumulative value increases gradually. The cumulative graph 200 therefore takes the form of the curve shown in FIG. 2.

When the records are sorted in ascending order of predicted value, the cumulative graph has a different shape from the graph shown in FIG. 2, but it still exhibits the characteristics of the increase in the cumulative value described above.

For a prediction model with high prediction accuracy, the difference between the cumulative predicted value and the cumulative actual value is small for the record group whose predicted values exceed a predetermined threshold. That is, the actual values are close to the predicted values. On the other hand, for the record group whose predicted values are equal to or less than the predetermined threshold, the difference between the cumulative predicted value and the cumulative actual value increases monotonically, because the difference between the predicted value and the actual value becomes large.

As described above, the cumulative predicted value and the cumulative actual value based on the sorting result are numerical values related to the prediction accuracy of the prediction model. It can also be seen that the statistics of the record group whose predicted values are equal to or less than the predetermined threshold are not suitable for analyzing prediction accuracy. The prediction accuracy should therefore be analyzed for the record group whose predicted values exceed the predetermined threshold. This makes it possible to exclude random factors from the analysis of prediction accuracy.

The computer 300 therefore selects a record group to be analyzed and calculates, as the analysis index, a quantity such as the difference between a statistic of the predicted values and a statistic of the actual values of that record group.

This is because analyzing the prediction accuracy of the prediction model more accurately requires analyzing, for example, the difference between the predicted value and the expected value of the actual value. In general, however, the expected value of the actual value is unknown, and it cannot be calculated from the values of an individual record.

Records with similar predicted values are presumed to have similar expected values of the actual value. After sorting the records, the computer 300 therefore groups a plurality of records with similar predicted values. This allows the computer 300 to calculate a statistic corresponding to the expected value of the actual value from the distribution of the actual values of the records in a group. For each group, the computer 300 calculates the statistic corresponding to the expected value of the actual value and a statistic of the predicted values, calculates a prediction error based on these two statistics, and calculates an analysis index based on the prediction error.

Because the conventional analysis index is calculated from quantities such as the prediction errors of all records, it can be controlled by raising or lowering the predicted values of all records. In this embodiment, the prediction accuracy can be calculated for the groups that are suited to the analysis of prediction accuracy.

In the processing described above, the analysis index is calculated for a record group in which the event is likely to occur, but the present invention is not limited to this. For example, the analysis index may be calculated for a record group in which the event is unlikely to occur. Such a record group can be selected using the graph of FIG. 2. The analysis index may also be calculated for a specific group designated by the user. The prediction accuracy can thus be analyzed according to the purpose.

In the first embodiment, the computer 300 sorts records composed of the predicted value and the actual value of the prediction item based on the predicted value, and generates groups each composed of a predetermined number of records based on the sorting result. For each group, the computer 300 calculates the average of the predicted values and the average of the actual values, and calculates the difference between the two averages as the prediction error of the group. Based on the calculation results, the computer 300 generates a prediction accuracy analysis graph and calculates an analysis index.
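The group-wise calculation of the first embodiment can be sketched as below. The function name, the sample data, the sign of the difference (predicted average minus actual average), and the handling of a final group smaller than the predetermined size are assumptions made for illustration.

```python
def group_prediction_errors(records, group_size):
    """Return one (mean predicted, mean actual, group error) tuple per group.

    records: list of (predicted, actual) pairs.
    group_size: the predetermined number of records per group.
    """
    # Sort in descending order of predicted value.
    ordered = sorted(records, key=lambda r: r[0], reverse=True)

    results = []
    for start in range(0, len(ordered), group_size):
        # Assumption: a trailing group may hold fewer than group_size records.
        group = ordered[start:start + group_size]
        mean_predicted = sum(p for p, _ in group) / len(group)
        mean_actual = sum(a for _, a in group) / len(group)
        # Group prediction error (assumed sign: predicted minus actual).
        results.append((mean_predicted, mean_actual,
                        mean_predicted - mean_actual))
    return results


# Hypothetical sample data.
groups = group_prediction_errors(
    [(0.75, 1), (0.25, 0), (0.5, 0), (1.0, 1)], group_size=2)
for row in groups:
    print(row)
```

The average of the binary actual values in a group plays the role of the statistic corresponding to the expected value of the actual value, since it is the observed event frequency among records with similar predicted values.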

FIG. 3 is a block diagram illustrating an example of the configuration of the computer system according to the first embodiment.

The computer system includes a computer 300 and a storage device 301.

The computer 300 analyzes the prediction accuracy of a prediction model for an arbitrary prediction item. The computer 300 of this embodiment includes an arithmetic device 310, a memory 311, an input device 312, an output device 313, and a storage medium 314. These components are connected to one another via an internal bus or the like.

The arithmetic device 310 executes programs stored in the memory 311 and is, for example, a CPU or a GPU. In the following description, when processing or a function is described with a functional unit as the subject, this means that the arithmetic device 310 is executing the program that implements that functional unit. The memory 311 stores the programs executed by the arithmetic device 310 and the information used by those programs. The memory 311 may be either volatile or nonvolatile.

The input device 312 is a device for inputting various kinds of information to the computer 300 and includes, for example, a keyboard, a mouse, and a touch panel. The output device 313 is a device that outputs the results of processing executed by the computer 300 and includes, for example, a display.

The storage medium 314 stores programs and the like that implement the various functions of the computer 300. In this embodiment, the arithmetic device 310 reads a program from the storage medium 314, loads it into the memory 311, and executes the loaded program. The programs stored in the storage medium 314 of this embodiment are described later.

The programs stored in the storage medium 314 may be obtained from removable media such as a CD-ROM or flash memory, or from a distribution server connected via a network. When a program is obtained from removable media, the computer 300 includes an interface for connecting to the removable media.

The storage device 301 stores various data managed by the computer 300. The storage device 301 may be, for example, a general computer or a storage system. A storage system includes a controller, an external interface, and a plurality of storage media, and can configure a RAID using the plurality of storage media. The storage system can also provide a plurality of logical storage areas using RAID volumes.

A database is built on the storage device 301. The database includes an individual information storage unit 351, a group information storage unit 352, and an analysis result storage unit 353.

The individual information storage unit 351 stores information such as records that hold various kinds of information about observation targets (people, devices, and so on). Each record contains the values of observation items. Details of the information stored in the individual information storage unit 351 are described with reference to FIG. 4.

The group information storage unit 352 stores information about the groups generated from the information on a plurality of observation targets. Details of the information stored in the group information storage unit 352 are described with reference to FIG. 5.

The analysis result storage unit 353 stores information about the results of analyzing the prediction accuracy of the prediction model. Details of the information stored in the analysis result storage unit 353 are described with reference to FIGS. 6, 7, and 8.

The programs stored in the storage medium 314 are described here.

The storage medium 314 stores programs that implement a record management unit 320, a statistic calculation unit 330, and a prediction accuracy analysis unit 340.

The record management unit 320 manages the records stored in the individual information storage unit 351. It also sorts and groups those records. The record management unit 320 is composed of a plurality of modules. In this embodiment, it includes a record sort unit 321 and a group generation unit 322.

The record sort unit 321 sorts a plurality of records based on the magnitude of the predicted value and assigns a sort number to each record when sorting. The record sort unit 321 stores the result of the processing in the individual information storage unit 351. The group generation unit 322 generates a plurality of groups, each containing a predetermined number of records, based on the sorting result. The group generation unit 322 stores information on the generated groups in the group information storage unit 352.

The statistic calculation unit 330 calculates various statistics for each group. The statistics may be, for example, the average of the predicted values and the average of the actual values, or the cumulative value of the predicted values and the cumulative value of the actual values. The present invention is not limited by the type of statistic calculated. The statistic calculation unit 330 is composed of a plurality of modules. In this embodiment, it includes a group statistic calculation unit 331 and a prediction error calculation unit 332.

The group statistic calculation unit 331 calculates, for each group, a statistic of the predicted values and a statistic of the actual values based on the information stored in the group information storage unit 352, and stores the calculated statistics in the group information storage unit 352. The prediction error calculation unit 332 calculates the prediction error of each group based on the statistic of the predicted values and the statistic of the actual values, and stores the prediction error of each group in the group information storage unit 352.

The prediction accuracy analysis unit 340 analyzes the prediction accuracy of the prediction model based on the statistics of each group calculated by the statistic calculation unit 330. The prediction accuracy analysis unit 340 is composed of a plurality of modules. In this embodiment, it includes a prediction accuracy analysis graph generation unit 341 and a prediction accuracy analysis index calculation unit 342.

The prediction accuracy analysis graph generation unit 341 generates a prediction accuracy analysis graph 700 (see FIG. 7) based on the statistics stored in the group information storage unit 352, and stores the prediction accuracy analysis graph in the analysis result storage unit 353. The prediction accuracy analysis index calculation unit 342 calculates an analysis index based on the statistics stored in the group information storage unit 352 and the prediction accuracy analysis graph 700, and stores the analysis index in the analysis result storage unit 353.

FIG. 4 is an explanatory diagram illustrating an example of the individual information 400 stored in the individual information storage unit 351 according to the first embodiment.

The individual information 400 includes a plurality of records, each corresponding to an observation target. Each record is composed of an ID 401, a predicted value 402, and an actual value 403. In the following description, a record included in the individual information 400 is also referred to as an individual record.

The ID 401 is identification information for uniquely identifying an observation target, that is, a record. In this embodiment, an identification number is stored in the ID 401. The predicted value 402 is the value of the prediction item calculated based on the prediction model. The actual value 403 is a value indicating the observed result of the prediction event.

ここで、糖尿病の発症を管理する情報を例に、予測値４０２及び実値４０３について説明する。 Here, the predicted value 402 and the actual value 403 will be described using information for managing the onset of diabetes as an example.

In this case, one record corresponds to the data of one health checkup examinee. In the record whose ID 401 is “00001”, the predicted value 402 is “0.32” and the actual value 403 is “0”. This indicates that the predicted diabetes onset rate of the examinee is “0.32” and that the observed onset result is “0”. In this embodiment, “1” is stored in the actual value 403 when diabetes has developed, and “0” is stored when it has not. That is, the examinee whose ID 401 is “00001” has not developed diabetes.
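
As a minimal sketch of the record layout described above, the individual record could be modeled as follows. The class name, field names, and helper function are illustrative assumptions and do not appear in the original text; only the three columns (ID 401, predicted value 402, actual value 403) and the example values come from the description.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class IndividualRecord:
    """One row of the individual information 400 (names are hypothetical)."""

    record_id: str    # ID 401
    predicted: float  # predicted value 402, e.g. predicted onset rate
    actual: int       # actual value 403: 1 = onset observed, 0 = no onset


def has_onset(record: IndividualRecord) -> bool:
    """Return True when the observed result is an onset (actual value 1)."""
    return record.actual == 1


# The example record from the description: ID "00001", 0.32, no onset.
example = IndividualRecord(record_id="00001", predicted=0.32, actual=0)
```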

図５は、実施例１のグループ情報記憶部３５２に格納されるグループ情報５００の一例を示す説明図である。 FIG. 5 is an explanatory diagram illustrating an example of the group information 500 stored in the group information storage unit 352 according to the first embodiment.

グループ情報５００は、観測対象に対応するレコードを複数含む。レコードは、ＩＤ５０１、グループ番号５０２、及びソート番号５０３から構成される。 Group information 500 includes a plurality of records corresponding to observation targets. The record includes an ID 501, a group number 502, and a sort number 503.

ＩＤ５０１は、ＩＤ４０１と同一のものである。グループ番号５０２は、グループ生成部３２２によって付与されたグループの識別番号である。ソート番号５０３は、レコードソート部３２１によって付与されたソート順を示す番号である。 ID 501 is the same as ID 401. The group number 502 is a group identification number assigned by the group generation unit 322. The sort number 503 is a number indicating the sort order assigned by the record sort unit 321.

In this embodiment, sort numbers are assigned to the records starting from “1” in descending order of the predicted value. The records are then sorted in ascending order of the sort number 503.

In the record whose ID 501 is “00001”, the sort number 503 is “2”. This indicates that, among the records included in the individual information 400, this record has the second-largest predicted value 402. The group number 502 of the record is “1”. This indicates that the record belongs to the group whose group number is “1”.

図６は、実施例１のグループ情報記憶部３５２に格納されるグループ統計情報６００の一例を示す説明図である。 FIG. 6 is an explanatory diagram illustrating an example of the group statistical information 600 stored in the group information storage unit 352 according to the first embodiment.

グループ統計情報６００は、グループに対応するレコードを複数含む。レコードは、グループ番号６０１、予測値の平均値６０２、実値の平均値６０３、及び予測誤差６０４から構成される。 Group statistical information 600 includes a plurality of records corresponding to groups. The record includes a group number 601, a predicted value average value 602, a real value average value 603, and a prediction error 604.

グループ番号６０１は、グループ番号５０２と同一のものである。予測値の平均値６０２は、グループに含まれる個別レコードの予測値４０２の平均値である。実値の平均値６０３は、グループに含まれる個別レコードの実値４０３の平均値である。予測誤差６０４は、グループ毎の、予測値の平均値と実値の平均値との間の誤差を示す値である。 The group number 601 is the same as the group number 502. The average value 602 of predicted values is an average value of the predicted values 402 of individual records included in the group. The actual value average value 603 is an average value of the actual values 403 of the individual records included in the group. The prediction error 604 is a value indicating an error between the average value of predicted values and the average value of actual values for each group.

実値の平均値６０３は、グループに含まれる個別レコードの実値の分布から算出される値であり、実値の期待値に相当する統計量である。本実施例では、計算機３００は、実値の平均値６０３及び予測値の平均値６０２を用いて予測誤差を算出する。 The average value 603 of the actual values is a value calculated from the distribution of the actual values of the individual records included in the group, and is a statistic corresponding to the expected actual value. In the present embodiment, the computer 300 calculates a prediction error using the average value 603 of actual values and the average value 602 of predicted values.

本実施例では、統計量算出部３３０は、予測値の平均値６０２から実値の平均値６０３を減算することによって第１の値を算出する。さらに、統計量算出部３３０は、第１の値を予測値の平均値６０２で除算することによって第２の値を算出する。統計量算出部３３０は、第２の値に１００を乗算することによって第３の値を算出する。統計量算出部３３０は、第３の値を予測誤差６０４に格納する。 In the present embodiment, the statistic calculation unit 330 calculates the first value by subtracting the average value 603 of the actual value from the average value 602 of the predicted value. Further, the statistic calculation unit 330 calculates the second value by dividing the first value by the average value 602 of the predicted values. The statistic calculation unit 330 calculates the third value by multiplying the second value by 100. The statistic calculator 330 stores the third value in the prediction error 604.
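
The three-step calculation above can be sketched as follows. The function and variable names are assumptions made for illustration, and the guard against a zero mean is an addition not discussed in the original text.

```python
def percent_prediction_error(mean_predicted: float, mean_actual: float) -> float:
    """Signed prediction error of one group, expressed as a percentage."""
    if mean_predicted == 0:
        # Assumption: the description does not say how a zero mean is handled.
        raise ValueError("mean predicted value must be non-zero")
    first = mean_predicted - mean_actual  # first value: difference of the means
    second = first / mean_predicted       # second value: relative to the prediction
    third = second * 100                  # third value: stored as prediction error 604
    return third
```

Note that this variant keeps the sign of the difference, so a negative result means the group's predicted mean is below its actual mean.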

Note that the computer 300 may hold the information shown in FIGS. 4, 5, and 6 as a single piece of information. That is, any format may be used as long as the columns shown in FIGS. 4, 5, and 6 can be managed.

図７は、実施例１の予測精度分析グラフ７００の一例を示す説明図である。図８は、実施例１の分析指標情報８００の一例を示す説明図である。 FIG. 7 is an explanatory diagram illustrating an example of the prediction accuracy analysis graph 700 according to the first embodiment. FIG. 8 is an explanatory diagram illustrating an example of the analysis index information 800 according to the first embodiment.

後述するように、予測精度分析グラフ７００及び分析指標情報８００は、予測精度分析部３４０によって生成される。 As will be described later, the prediction accuracy analysis graph 700 and the analysis index information 800 are generated by the prediction accuracy analysis unit 340.

The prediction accuracy analysis graph 700 is a graph used when calculating the analysis index of the prediction accuracy of the prediction model. In the prediction accuracy analysis graph 700 of FIG. 7, the vertical axis indicates the prediction error of each group, and the horizontal axis indicates the group number. The average value 602 of predicted values or the average value 603 of actual values may instead be used as the horizontal axis. The prediction accuracy analysis graph 700 can be changed as appropriate according to the purpose of the analysis.

図７では、予測精度分析グラフ７００は、各グループの予測誤差をプロットしたグラフであるが、各グループの予測誤差の変化傾向を示す曲線であってもよい。 In FIG. 7, the prediction accuracy analysis graph 700 is a graph in which the prediction error of each group is plotted, but may be a curve indicating a change tendency of the prediction error of each group.

分析指標情報８００は、予測誤差の平均値８０１及び予測誤差の分散８０２を含む。予測誤差の平均値８０１は、各グループの予測誤差から算出された予測誤差の平均値である。予測誤差の分散８０２は、各グループの予測誤差から算出された予測誤差の分散である。 The analysis index information 800 includes an average value 801 of prediction errors and a variance 802 of prediction errors. The average value 801 of prediction errors is an average value of prediction errors calculated from the prediction error of each group. The prediction error variance 802 is a variance of the prediction error calculated from the prediction error of each group.

なお、分析指標情報８００には、予測誤差の平均値及び予測誤差の分散以外の統計量が含まれてもよい。 Note that the analysis index information 800 may include statistics other than the average value of prediction errors and the variance of prediction errors.

次に、計算機３００が実行する処理の詳細について説明する。 Next, details of processing executed by the computer 300 will be described.

図１は、実施例１の計算機３００が実行する処理の一例を説明するフローチャートである。 FIG. 1 is a flowchart illustrating an example of processing executed by the computer 300 according to the first embodiment.

計算機３００は、予測値の大きさに基づいて、個別情報４００の個別レコードのソート処理を実行する（ステップＳ１０１）。 The computer 300 executes the sorting process of the individual records of the individual information 400 based on the size of the predicted value (Step S101).

具体的には、レコード管理部３２０のレコードソート部３２１が、予測値４０２の値が大きい順に個別情報４００の個別レコードにソート番号を付与し、当該ソート番号に基づいて個別レコードをソートする。レコードソート部３２１は、ソート結果に基づいて、グループ情報５００のソート番号５０３にソート番号を格納する。なお、個別レコードをソートする規則は、前述したものに限定されない。例えば、予測値４０２の値が小さい順に個別情報４００のレコードをソートする方法が考えられる。 Specifically, the record sorting unit 321 of the record management unit 320 assigns a sort number to the individual record of the individual information 400 in descending order of the predicted value 402, and sorts the individual record based on the sort number. The record sort unit 321 stores the sort number in the sort number 503 of the group information 500 based on the sorting result. Note that the rules for sorting the individual records are not limited to those described above. For example, a method of sorting the records of the individual information 400 in ascending order of the predicted value 402 can be considered.
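
Step S101 can be sketched as below, assuming the predicted values are available as a mapping from ID to value. The function name, the mapping, and the tie-breaking behavior (equal values keep their input order) are assumptions, not details given in the original text.

```python
from __future__ import annotations


def assign_sort_numbers(
    predicted_by_id: dict[str, float], descending: bool = True
) -> dict[str, int]:
    """Map each record ID to a sort number starting at 1."""
    # sorted() is stable, so records with equal predicted values keep input order.
    ordered = sorted(
        predicted_by_id, key=lambda rid: predicted_by_id[rid], reverse=descending
    )
    return {rid: number for number, rid in enumerate(ordered, start=1)}
```

Passing `descending=False` gives the alternative ascending rule mentioned at the end of the paragraph.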

次に、計算機３００は、個別情報４００の個別レコードのソート結果に基づいて、複数のレコードから複数のグループを生成する（ステップＳ１０２）。 Next, the computer 300 generates a plurality of groups from the plurality of records based on the sorting result of the individual records of the individual information 400 (step S102).

具体的には、レコード管理部３２０のグループ生成部３２２が、個別情報４００の予測値４０２及びグループ情報５００のソート番号５０３の少なくとも何れかに基づいて、グループを生成する。このとき、グループ生成部３２２は、生成された各グループにグループ番号を付与する。また、グループ生成部３２２は、各レコードのグループ番号５０２に、各レコードが所属するグループのグループ番号を設定する。グループの生成方法は、例えば、以下のような方法が考えられる。 Specifically, the group generation unit 322 of the record management unit 320 generates a group based on at least one of the predicted value 402 of the individual information 400 and the sort number 503 of the group information 500. At this time, the group generation unit 322 gives a group number to each generated group. Further, the group generation unit 322 sets the group number of the group to which each record belongs in the group number 502 of each record. As a group generation method, for example, the following method can be considered.

(Generation Method 1) The group generation unit 322 selects 100 records at a time in descending or ascending order of the sort number 503, and generates one group from those 100 records.

In generation method 1, every group contains the same number of records. If the expected values of the actual values of all the records in a group coincide, the difference between the average of the actual values and the expected value of the actual values can be made no greater than an arbitrary threshold by the law of large numbers. However, this condition is unlikely to hold, so an error arises between the average of the actual values and the expected value of the actual values.
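
Generation method 1 can be sketched as a fixed-size chunking of the sorted IDs. The function name and the handling of a final, smaller group are assumptions; the original text only states that 100 records form one group.

```python
from __future__ import annotations


def group_by_fixed_size(sorted_ids: list[str], group_size: int = 100) -> dict[str, int]:
    """Assign group numbers 1, 2, ... to consecutive runs of `group_size` records."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    # Integer division of the 0-based position yields the 0-based group index.
    return {rid: index // group_size + 1 for index, rid in enumerate(sorted_ids)}
```

If the number of records is not a multiple of `group_size`, the last group in this sketch simply holds the remainder.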

（生成方法２）グループ生成部３２２は、予測値の範囲を決定し、当該予測値の範囲にグループ番号を付与する。グループ生成部３２２は、各個別レコードの予測値がどの範囲に含まれるか分類することによって、各グループに含める個別レコードを決定する。 (Generation Method 2) The group generation unit 322 determines a range of predicted values and assigns a group number to the range of predicted values. The group generation unit 322 determines the individual records to be included in each group by classifying the range in which the predicted value of each individual record is included.

For example, the predicted values of the records are divided into five ranges: 0.5 to 0.4, 0.4 to 0.3, 0.3 to 0.2, 0.2 to 0.1, and 0.1 to 0.0. The group generation unit 322 classifies each individual record into the range that contains its predicted value.

生成方法２の場合、実値の平均値と実値の期待値との間の誤差を抑止できる。ただし、グループに所属するレコードの数は、グループ毎に異なる。 In the generation method 2, an error between the average value of the actual values and the expected value of the actual values can be suppressed. However, the number of records belonging to a group varies from group to group.
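
Generation method 2 can be sketched as a lookup over range boundaries. The function name is hypothetical, and the boundary convention (upper bound inclusive, lower bound exclusive, except that the last range also includes its lower bound) is an assumption, since the original text does not say which group a boundary value belongs to.

```python
from __future__ import annotations


def group_by_range(predicted: float, edges: list[float]) -> int:
    """Return the 1-based group number of the range containing `predicted`.

    `edges` lists the range boundaries in descending order,
    e.g. [0.5, 0.4, 0.3, 0.2, 0.1, 0.0] for five ranges.
    """
    last = len(edges) - 1
    for number in range(1, len(edges)):
        upper, lower = edges[number - 1], edges[number]
        in_range = lower < predicted <= upper
        at_floor = number == last and predicted == lower
        if in_range or at_floor:
            return number
    raise ValueError(f"predicted value {predicted} is outside all ranges")
```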

なお、生成方法１及び生成方法２を組み合わせてもよい。以上がステップＳ１０２の処理の説明である。 Note that the generation method 1 and the generation method 2 may be combined. The above is the description of the processing in step S102.

次に、計算機３００は、各グループの予測値の統計量及び実値の統計量を算出する（ステップＳ１０３）。 Next, the computer 300 calculates the statistic of the predicted value and the statistic of the actual value for each group (step S103).

具体的には、統計量算出部３３０のグループ統計量算出部３３１は、グループを選択する。グループ統計量算出部３３１は、グループ情報５００及び個別情報４００を参照し、選択されたグループに含まれる個別レコードの予測値４０２及び実値４０３に基づいて、予測値の統計量及び実値の統計量を算出する。グループ統計量算出部３３１は、グループ統計情報６００の該当するレコードに算出された統計量を設定する。 Specifically, the group statistic calculation unit 331 of the statistic calculation unit 330 selects a group. The group statistic calculation unit 331 refers to the group information 500 and the individual information 400, and based on the predicted value 402 and the actual value 403 of the individual record included in the selected group, the predicted value statistic and the actual value statistic Calculate the amount. The group statistic calculation unit 331 sets the calculated statistic in the corresponding record of the group statistic information 600.

本実施例では、グループ統計量算出部３３１は、予測値４０２の平均値及び実値４０３の平均値をグループの統計量として算出するものとする。算出された値は、予測値の平均値６０２及び実値の平均値６０３に格納される。 In this embodiment, the group statistic calculation unit 331 calculates the average value of the predicted values 402 and the average value of the actual values 403 as the group statistic. The calculated values are stored in an average value 602 of predicted values and an average value 603 of actual values.

なお、グループ統計量算出部３３１は、予測値４０２の累積値及び実値４０３の累積値、又は、予測値４０２の分散及び実値４０３の分散をグループの統計量として算出してもよい。 The group statistic calculation unit 331 may calculate the cumulative value of the predicted value 402 and the cumulative value of the actual value 403 or the variance of the predicted value 402 and the variance of the actual value 403 as the group statistic.
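
Step S103 can be sketched as a per-group mean, assuming each record is given as an (ID, predicted value, actual value) tuple and the group assignment as a mapping from ID to group number. These data shapes and the function name are illustrative assumptions.

```python
from __future__ import annotations

from collections import defaultdict
from statistics import fmean


def group_means(
    records: list[tuple[str, float, float]], group_of: dict[str, int]
) -> dict[int, tuple[float, float]]:
    """Return {group number: (mean predicted value, mean actual value)}."""
    predicted = defaultdict(list)
    actual = defaultdict(list)
    for record_id, pred, act in records:
        group = group_of[record_id]
        predicted[group].append(pred)
        actual[group].append(act)
    # The two means correspond to columns 602 and 603 of the group statistics.
    return {g: (fmean(predicted[g]), fmean(actual[g])) for g in sorted(predicted)}
```

Swapping `fmean` for a sum or a variance function would give the alternative statistics mentioned above.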

次に、計算機３００は、各グループの予測値の統計量及び実値の統計量に基づいて、各グループの予測誤差を算出する（ステップＳ１０４）。 Next, the computer 300 calculates a prediction error of each group based on the statistical value of the predicted value and the statistical value of the actual value of each group (step S104).

具体的には、統計量算出部３３０の予測誤差算出部３３２は、グループを選択する。予測誤差算出部３３２は、グループ統計情報６００を参照し、選択されたグループに対応するレコードの予測値の平均値６０２及び実値の平均値６０３に基づいて、当該グループの予測誤差を算出する。 Specifically, the prediction error calculation unit 332 of the statistic calculation unit 330 selects a group. The prediction error calculation unit 332 refers to the group statistical information 600 and calculates the prediction error of the group based on the average value 602 of the predicted values and the average value 603 of the actual values corresponding to the selected group.

The prediction error can be calculated, for example, by the following methods.

（算出方法１）予測誤差算出部３３２は、予測値の平均値６０２と実値の平均値６０３との間の差を算出する。なお、平均値の差は、予測値の平均値６０２から実値の平均値６０３が減算された値の絶対値から求めることができる。予測誤差算出部３３２は、算出された平均値の差を予測値の平均値６０２で除算した値を予測誤差とする。 (Calculation Method 1) The prediction error calculation unit 332 calculates the difference between the average value 602 of predicted values and the average value 603 of actual values. The difference between the average values can be obtained from the absolute value of the value obtained by subtracting the average value 603 of the actual value from the average value 602 of the predicted value. The prediction error calculation unit 332 sets a value obtained by dividing the difference between the calculated average values by the average value 602 of the prediction values as a prediction error.

The prediction error calculated by calculation method 1 indicates how close the actual value is to the predicted value when the predicted value is given. When the predicted value is relatively more stable than the actual value, the prediction error calculated by calculation method 1 is also relatively stable. Therefore, it can be used as a value suitable for analyzing the prediction accuracy of the prediction model.

（算出方法２）予測誤差算出部３３２は、予測値の平均値６０２と実値の平均値６０３との間の差を算出する。予測誤差算出部３３２は、算出された平均値の差を実値の平均値６０３で除算した値を予測誤差とする。 (Calculation Method 2) The prediction error calculation unit 332 calculates the difference between the average value 602 of predicted values and the average value 603 of actual values. The prediction error calculation unit 332 sets a value obtained by dividing the difference between the calculated average values by the average value 603 of the actual values as a prediction error.

The prediction error calculated by calculation method 2 indicates how close the predicted value is to the actual value when the actual value is given. When the actual value is relatively more stable than the predicted value, the prediction error calculated by calculation method 2 is also relatively stable. Therefore, it can be used as a value suitable for analyzing the prediction accuracy of the prediction model.

なお、実値はランダムな場合が多いため、実値が予測値より相対的に安定している場合は少ない。以上がステップＳ１０４の処理の説明である。 Since the actual value is often random, there are few cases where the actual value is relatively more stable than the predicted value. The above is the description of the processing in step S104.
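
Calculation methods 1 and 2 differ only in the denominator, which the following sketch makes explicit. The function name and the `denominator` parameter are assumptions, and taking the absolute difference in method 2 as well is an assumption carried over from the definition given for method 1.

```python
def relative_error(
    mean_predicted: float, mean_actual: float, denominator: str = "predicted"
) -> float:
    """Relative prediction error of one group.

    denominator="predicted" follows calculation method 1,
    denominator="actual" follows calculation method 2.
    """
    difference = abs(mean_predicted - mean_actual)
    if denominator == "predicted":
        base = mean_predicted
    elif denominator == "actual":
        base = mean_actual
    else:
        raise ValueError("denominator must be 'predicted' or 'actual'")
    if base == 0:
        raise ZeroDivisionError("the chosen mean is zero; the error is undefined")
    return difference / base
```

With binary actual values, method 2 is undefined for any group in which no event was observed, because the mean actual value is then zero.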

次に、計算機３００は、グループ統計情報６００に基づいて、予測精度分析グラフ７００を生成する（ステップＳ１０５）。 Next, the computer 300 generates a prediction accuracy analysis graph 700 based on the group statistical information 600 (step S105).

Specifically, the prediction accuracy analysis graph generation unit 341 of the prediction accuracy analysis unit 340 generates the prediction accuracy analysis graph 700 shown in FIG. 7 by setting the prediction error 604 on the vertical axis and the group number 601 on the horizontal axis. The average value 602 of predicted values or the average value 603 of actual values may instead be set on the horizontal axis. Which column is set on the horizontal axis can be changed as appropriate according to the purpose of the analysis.

次に、計算機３００は、グループ統計情報６００に基づいて、予測精度の分析に用いるグループを選択する（ステップＳ１０６）。 Next, the computer 300 selects a group used for prediction accuracy analysis based on the group statistical information 600 (step S106).

具体的には、予測精度分析部３４０の予測精度分析指標算出部３４２は、グループ統計情報６００及び分析目的に基づいて、グループを選択する。グループの選択方法は、例えば以下のような方法が考えられる。 Specifically, the prediction accuracy analysis index calculation unit 342 of the prediction accuracy analysis unit 340 selects a group based on the group statistical information 600 and the analysis purpose. As a group selection method, for example, the following method can be considered.

（選択方法１）予測精度分析指標算出部３４２は、グループ番号６０１に基づいて、所定の数のグループを選択する。例えば、予測精度分析指標算出部３４２は、グループ番号６０１が閾値より大きいグループを５つ選択し、又は、グループ番号６０１が閾値より小さいグループを３つ選択する。 (Selection Method 1) The prediction accuracy analysis index calculation unit 342 selects a predetermined number of groups based on the group number 601. For example, the prediction accuracy analysis index calculation unit 342 selects five groups with the group number 601 larger than the threshold, or selects three groups with the group number 601 smaller than the threshold.

(Selection Method 2) The prediction accuracy analysis index calculation unit 342 selects a predetermined number of groups based on the average value 602 of predicted values. For example, the prediction accuracy analysis index calculation unit 342 selects five groups whose average value 602 of predicted values is greater than 0.7, or selects two groups whose average value 602 of predicted values is smaller than 0.2.

(Selection Method 3) The prediction accuracy analysis index calculation unit 342 selects a predetermined number of groups based on the average value 603 of actual values. For example, the prediction accuracy analysis index calculation unit 342 selects five groups whose average value 603 of actual values is greater than 0.7, or selects two groups whose average value 603 of actual values is smaller than 0.2.

(Selection Method 4) The prediction accuracy analysis index calculation unit 342 selects a predetermined number of groups based on the prediction error 604. For example, the prediction accuracy analysis index calculation unit 342 selects seven groups whose prediction error 604 is greater than 20%, or selects five groups whose prediction error 604 is smaller than 10%.

（選択方法５）予測精度分析指標算出部３４２は、個別情報４００及びグループ情報５００を参照して、縦軸が予測値４０２、横軸がグループ番号５０２である累積グラフ９００を生成する。予測精度分析指標算出部３４２は、予測値の累積値の増加量又は実値の累積値の増加量が所定の閾値より大きいグループを特定する。予測精度分析指標算出部３４２は、特定されたグループを予測精度の分析に用いるグループとして選択する。 (Selection Method 5) The prediction accuracy analysis index calculation unit 342 refers to the individual information 400 and the group information 500, and generates a cumulative graph 900 in which the vertical axis is the predicted value 402 and the horizontal axis is the group number 502. The prediction accuracy analysis index calculation unit 342 specifies a group in which the increase amount of the predicted value accumulation value or the increase amount of the actual value accumulation value is larger than a predetermined threshold. The prediction accuracy analysis index calculation unit 342 selects the identified group as a group used for analysis of prediction accuracy.

図９は、実施例１の累積グラフ９００の一例を示す説明図である。 FIG. 9 is an explanatory diagram illustrating an example of the cumulative graph 900 according to the first embodiment.

The cumulative value for group number “5” is the sum of the average values 602 of predicted values of the groups with group numbers “1” to “5”. The vertical axis may be a statistic other than the predicted value 402.

本実施例では、予測値４０２が大きい順に個別情報４００のレコードがソートされ、また、当該ソートの結果に基づいてグループが生成される。したがって、グループ番号が小さいグループほど、予測値４０２の平均値が大きい。これは、予測値４０２の平均値が大きいグループの実値の分布は非一様な分布であることを示す。すなわち、実値の真の値（期待値）が予測できることを示す。一方、実値の分布が一様な分布の場合、実値はランダムであり、実値の真の値を予測できない。 In this embodiment, the records of the individual information 400 are sorted in descending order of the predicted value 402, and a group is generated based on the sorting result. Therefore, the smaller the group number, the larger the average value of the predicted values 402. This indicates that the distribution of the actual values of the group having a large average value of the predicted values 402 is a non-uniform distribution. That is, it indicates that the true value (expected value) of the actual value can be predicted. On the other hand, when the distribution of the actual values is uniform, the actual values are random and the true value of the actual value cannot be predicted.

The non-uniformity of the distribution of the actual values can be captured as the amount of change in the cumulative value of the cumulative graph 900. When the actual values are non-uniformly distributed, the cumulative value increases sharply; that is, the amount of change in the cumulative value is large. When the actual values are uniformly distributed, the cumulative value increases gradually; that is, the amount of change in the cumulative value is constant.

These characteristics can also be seen in the cumulative graph 900. For groups with small group numbers 502, the cumulative value traces a curve. For groups with large group numbers 502, the cumulative value traces a straight line.

選択方法５では、前述したような特徴に基づいてグループが選択される。具体的には、予測精度分析指標算出部３４２は、累積グラフ９００を用いて累積値の変化量が一定値となるグループを特定する。さらに、予測精度分析指標算出部３４２は、最初のグループから特定されたグループまでを選択する。図９に示す例では、グループ番号が「１」から「１０」までのグループが選択される。 In the selection method 5, a group is selected based on the characteristics as described above. Specifically, the prediction accuracy analysis index calculation unit 342 uses the cumulative graph 900 to identify a group in which the change amount of the cumulative value is a constant value. Furthermore, the prediction accuracy analysis index calculation unit 342 selects from the first group to the specified group. In the example shown in FIG. 9, groups with group numbers “1” to “10” are selected.

選択方法５では、予測精度が高い部分を適切に抽出することができる。すなわち、事象が発生する可能性がある部分を特定できる。なお、選択方法５の変形例として、計算機３００は、予測精度が低い部分も同様に抽出することができる。 In the selection method 5, it is possible to appropriately extract a portion with high prediction accuracy. That is, it is possible to identify a portion where an event may occur. As a modification of the selection method 5, the computer 300 can similarly extract a portion with low prediction accuracy.
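
One possible reading of selection method 5 is sketched below: build the cumulative values, then keep the leading groups for as long as each step of the cumulative value exceeds a threshold. The function names, the threshold rule, and the stop-at-first-flat-step behavior are assumptions; the original text does not fix how a constant amount of change is detected.

```python
from __future__ import annotations

from itertools import accumulate


def cumulative_values(group_values: list[float]) -> list[float]:
    """Running total of a per-group statistic, in group-number order."""
    return list(accumulate(group_values))


def select_leading_groups(group_values: list[float], threshold: float) -> list[int]:
    """Return group numbers 1..k whose cumulative increase exceeds `threshold`."""
    selected = []
    previous = 0.0
    for number, total in enumerate(cumulative_values(group_values), start=1):
        if total - previous <= threshold:
            break  # the curve has flattened; stop at the first such group
        selected.append(number)
        previous = total
    return selected
```

The input values in any usage of this sketch would be the per-group statistic plotted in the cumulative graph, for example the average predicted value of each group.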

従来の予測精度の指標は、観測対象全体の予測値を上下することによって、コントロールできる。そのため、特定の観測対象の予測精度を分析できない。しかし、本発明では、選択方法に応じて特定の観測対象の予測精度を分析できる。 A conventional index of prediction accuracy can be controlled by raising or lowering the predicted value of the entire observation target. Therefore, the prediction accuracy of a specific observation target cannot be analyzed. However, in the present invention, the prediction accuracy of a specific observation target can be analyzed according to the selection method.

前述した選択方法は一例であって、本発明はこれに限定されない。目的に応じてグループを選択すればよい。 The selection method described above is an example, and the present invention is not limited to this. What is necessary is just to select a group according to the objective.
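
Selection methods 1 to 4 share one pattern: compare a per-group column with a threshold and keep a fixed number of matching groups. A minimal sketch under that reading follows. The function name and parameters are hypothetical, and preferring the smallest group numbers when more groups match than requested is an assumption.

```python
from __future__ import annotations

import operator


def select_groups(
    column: dict[int, float], threshold: float, count: int, above: bool = True
) -> list[int]:
    """Pick up to `count` group numbers whose column value passes the threshold.

    `column` maps group number to the value being tested, for example the
    group number itself, the average 602, the average 603, or the error 604.
    """
    compare = operator.gt if above else operator.lt
    matching = [g for g in sorted(column) if compare(column[g], threshold)]
    return matching[:count]
```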

次に、計算機３００は、グループ統計情報６００に基づいて、分析指標を算出する（ステップＳ１０７）。 Next, the computer 300 calculates an analysis index based on the group statistical information 600 (step S107).

Specifically, the prediction accuracy analysis index calculation unit 342 of the prediction accuracy analysis unit 340 refers to the prediction accuracy analysis graph 700 and calculates the average value and the variance of the prediction errors of the selected groups as the analysis index. The prediction accuracy analysis index calculation unit 342 stores the calculated values in the average value 801 of prediction errors and the variance 802 of prediction errors of the analysis index information 800, respectively.
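
Step S107 can be sketched as a mean and variance over the selected groups. The function name and data shapes are assumptions. The use of the population variance is also an assumption, because the original text does not say whether a population or sample variance is intended.

```python
from __future__ import annotations

from statistics import fmean, pvariance


def analysis_index(
    error_by_group: dict[int, float], selected: list[int]
) -> tuple[float, float]:
    """Return (mean, variance) of the prediction errors of the selected groups."""
    errors = [error_by_group[g] for g in selected]
    # The two results correspond to fields 801 and 802 of the analysis index.
    return fmean(errors), pvariance(errors)
```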

In the first embodiment, the statistic of the predicted values and the statistic of the actual values are calculated for each group, but the present invention is not limited to this. For example, the same processing may be executed for each record. In this case, the processing of step S102 is omitted. After the records are sorted, the computer 300 calculates a prediction error for each record based on its predicted value and actual value, and generates the prediction accuracy analysis graph 700. The computer 300 then selects a target set of records based on the sorting result, calculates a statistic of the prediction errors of the selected records, and calculates an analysis index based on that statistic. For example, the statistic of the prediction errors may be used as the analysis index as it is.
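
The record-level variant can be sketched end to end as follows. The function name, the use of the absolute difference as the per-record error, the choice of the top `top_k` records as the target set, and the mean as the statistic are all assumptions made for illustration.

```python
from __future__ import annotations

from statistics import fmean


def record_level_index(records: list[tuple[float, float]], top_k: int) -> float:
    """Mean per-record error over the `top_k` records with the largest predictions.

    Each record is a (predicted value, actual value) pair.
    """
    # Sort by predicted value, largest first (step S101 without grouping).
    ordered = sorted(records, key=lambda record: record[0], reverse=True)
    # Per-record prediction error; the absolute difference is an assumption.
    errors = [abs(predicted - actual) for predicted, actual in ordered]
    # Select the target records from the sort result and summarize their errors.
    return fmean(errors[:top_k])
```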

As described above, according to the first embodiment, the computer 300 sorts the records based on the magnitude of the predicted value and selects records or groups based on the sorting result, and can thereby calculate an analysis index suited to a specific analysis purpose. In addition, based on the statistic of the predicted values and the statistic of the actual values of a group, the computer 300 can calculate a value corresponding to the difference between the predicted value and the expected value of the actual value. This makes it possible to calculate an analysis index for accurately analyzing the prediction accuracy.

The present invention is not limited to the embodiments described above and includes various modifications. For example, the above embodiments are described in detail in order to explain the present invention in an easily understandable manner, and the invention is not necessarily limited to configurations having all of the described elements. In addition, part of the configuration of an embodiment can be added to, deleted from, or replaced with another configuration.

Each of the above-described configurations, functions, processing units, processing means, and the like may be realized in hardware, in part or in whole, for example by designing them as an integrated circuit. The present invention can also be realized by program code of software that implements the functions of the embodiments. In this case, a storage medium on which the program code is recorded is provided to a computer, and a CPU of the computer reads the program code stored in the storage medium. The program code read from the storage medium then itself realizes the functions of the above-described embodiments, and the program code and the storage medium storing it constitute the present invention. Storage media for supplying such program code include, for example, a flexible disk, a CD-ROM, a DVD-ROM, a hard disk, an SSD (Solid State Drive), an optical disk, a magneto-optical disk, a CD-R, a magnetic tape, a non-volatile memory card, and a ROM.

The program code that realizes the functions described in this embodiment can be implemented in a wide range of programming or scripting languages, such as assembler, C/C++, Perl, Shell, PHP, and Java (Java is a registered trademark).

さらに、実施例の機能を実現するソフトウェアのプログラムコードを、ネットワークを介して配信することによって、それをコンピュータのハードディスクやメモリ等の記憶手段又はＣＤ−ＲＷ、ＣＤ−Ｒ等の記憶媒体に格納し、コンピュータが備えるＣＰＵが当該記憶手段や当該記憶媒体に格納されたプログラムコードを読み出して実行するようにしてもよい。 Furthermore, by distributing the program code of the software that realizes the functions of the embodiments via a network, the program code is stored in a storage means such as a hard disk or a memory of a computer or a storage medium such as a CD-RW or CD-R. The CPU included in the computer may read and execute the program code stored in the storage unit or the storage medium.

上述の実施例において、制御線や情報線は、説明上必要と考えられるものを示しており、製品上必ずしも全ての制御線や情報線を示しているとは限らない。全ての構成が相互に接続されていてもよい。 In the above-described embodiments, the control lines and information lines indicate what is considered necessary for the explanation, and not all control lines and information lines on the product are necessarily shown. All the components may be connected to each other.

300 Computer
301 Storage device
310 Arithmetic device
311 Memory
312 Input device
313 Output device
314 Storage medium
320 Record management unit
321 Record sort unit
322 Group generation unit
330 Statistic calculation unit
331 Group statistic calculation unit
332 Prediction error calculation unit
340 Prediction accuracy analysis unit
341 Prediction accuracy analysis graph generation unit
342 Prediction accuracy analysis index calculation unit
351 Individual information storage unit
352 Group information storage unit
353 Analysis result storage unit
400 Individual information
500 Group information
600 Group statistical information
700 Prediction accuracy analysis graph
800 Analysis index information
900 Cumulative graph

Claims

A computer that analyzes the prediction accuracy of a prediction model for calculating a predicted value of a prediction item of an observation target, wherein
the computer includes an arithmetic device, a memory connected to the arithmetic device, and an interface connected to the arithmetic device,
the computer manages a database storing a plurality of records each composed of the predicted value of the prediction item of the observation target and the actual value of the prediction item of the observation target,
the arithmetic device sorts the plurality of records stored in the database based on the magnitude of the predicted value, and stores the sorting result in the memory,
the arithmetic device calculates a prediction error of each of the plurality of records based on the predicted value and the actual value of the plurality of records stored in the database, and stores the prediction error of each of the plurality of records in the memory, and
the arithmetic device selects a plurality of target records based on the sorting result, calculates an analysis index for analyzing the prediction accuracy of the prediction model based on a statistic calculated from the prediction error of each of the selected records, and stores the analysis index in the memory.
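
For illustration only, the sort, error calculation, and index calculation recited above can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the function name `analysis_index`, the parameter `top_n`, the use of the absolute difference as the prediction error, and the use of the mean as the statistic are all hypothetical choices that the claim leaves open.

```python
from statistics import mean


def analysis_index(records, top_n):
    """Return a simple analysis index for (predicted, actual) pairs.

    Assumptions (not taken from the claim): the prediction error is the
    absolute difference, the target records are the top_n records with the
    largest predicted values, and the statistic is the mean.
    """
    # Sort the records by the magnitude of the predicted value, largest first.
    ranked = sorted(records, key=lambda r: r[0], reverse=True)
    # Calculate the prediction error of every record.
    errors = [abs(predicted - actual) for predicted, actual in ranked]
    # Select target records based on the sorting result and summarize them.
    return mean(errors[:top_n])


# Hypothetical sample data: (predicted value, actual value).
records = [(0.9, 1), (0.2, 0), (0.7, 0), (0.4, 1)]
index = analysis_index(records, top_n=2)
```

Restricting the statistic to a slice of the sorted records is what lets the index describe accuracy in one part of the prediction range, for example among the highest-scored records, rather than over the whole data set.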

The computer according to claim 1, wherein
the actual value of the prediction item of the observation target is a value indicating whether or not an arbitrary event occurs,
the arithmetic device sorts the plurality of records stored in the database in descending order or ascending order of the predicted value,
the arithmetic device generates, based on the sorting result, a plurality of groups each including an arbitrary number of the records, and stores group information for managing the groups in the memory,
the arithmetic device calculates, for each group, a statistic of the actual value and a statistic of the predicted value based on the actual values and the predicted values of the records included in the group, and stores statistical information including the statistic of the actual value and the statistic of the predicted value of each of the plurality of groups in the memory,
the arithmetic device calculates the prediction error of each of the plurality of groups based on the statistical information, and
the arithmetic device selects a plurality of target groups based on the sorting result, and calculates the analysis index based on a statistic calculated from the prediction error of each of the selected groups.
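
The group-based variant can be sketched in the same spirit. The following is an illustrative sketch only: the fixed `group_size`, the choice of the group mean as the statistic of both the predicted and actual values, and the signed difference as the group prediction error are assumptions, and the function name `group_errors` is hypothetical.

```python
from statistics import mean, pvariance


def group_errors(records, group_size):
    """Return one prediction error per group of (predicted, actual) pairs.

    Assumptions (not taken from the claim): groups hold group_size
    consecutive records of the descending sort, and the group error is the
    mean predicted value minus the observed event rate.
    """
    ranked = sorted(records, key=lambda r: r[0], reverse=True)
    groups = [ranked[i:i + group_size]
              for i in range(0, len(ranked), group_size)]
    errors = []
    for group in groups:
        mean_predicted = sum(p for p, _ in group) / len(group)
        event_rate = sum(a for _, a in group) / len(group)
        errors.append(mean_predicted - event_rate)
    return errors


# Hypothetical sample data: actual is 1 if the event occurred, else 0.
records = [(0.9, 1), (0.8, 1), (0.6, 0), (0.5, 1), (0.3, 0), (0.1, 0)]
errors = group_errors(records, group_size=2)

# Select the two highest-scored groups and summarize their errors.
selected = errors[:2]
index_mean = mean(selected)
index_variance = pvariance(selected)
```

Averaging within a group turns binary actual values into an observed event rate, which can be compared directly with the group's mean predicted value.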

The computer according to claim 2, wherein
the arithmetic device calculates, as the analysis index, at least one of an average value of the prediction errors and a variance of the prediction errors, based on the prediction error of each of the selected plurality of groups.

The computer according to claim 2, wherein
the arithmetic device generates, based on the sorting result, cumulative information on the cumulative value of the predicted value and the cumulative value of the actual value, and stores the generated cumulative information in the memory,
the arithmetic device refers to the cumulative information and identifies a group in which either the increase in the cumulative value of the predicted value or the increase in the cumulative value of the actual value is greater than a predetermined threshold, and
the arithmetic device calculates the analysis index based on the prediction error of the identified group.
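
One possible reading of the cumulative selection is sketched below. It is only an illustration: the per-group sums are taken as given inputs, a single threshold is applied to both cumulative series, and the mean is used as the index; all of these, along with the function name `select_by_cumulative_increase`, are assumptions rather than details stated in the claim.

```python
from itertools import accumulate
from statistics import mean


def select_by_cumulative_increase(pred_sums, actual_sums, threshold):
    """Return indices of groups whose cumulative increase exceeds threshold.

    pred_sums and actual_sums hold the per-group sums of predicted and
    actual values, listed in sorted group order (an assumed input format).
    """
    cum_pred = list(accumulate(pred_sums))
    cum_actual = list(accumulate(actual_sums))
    selected = []
    prev_pred = prev_actual = 0.0
    for i, (cp, ca) in enumerate(zip(cum_pred, cum_actual)):
        # Increase of each cumulative value contributed by group i.
        if cp - prev_pred > threshold or ca - prev_actual > threshold:
            selected.append(i)
        prev_pred, prev_actual = cp, ca
    return selected


# Hypothetical per-group sums and group prediction errors.
pred_sums = [1.7, 1.1, 0.4]
actual_sums = [2, 1, 0]
group_error = [-0.15, 0.05, 0.2]

chosen = select_by_cumulative_increase(pred_sums, actual_sums, threshold=1.0)
index = mean(group_error[i] for i in chosen)
```

Groups that barely move either cumulative curve contribute few predicted or observed events, so leaving them out keeps the index focused on the groups that carry most of the events.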

The computer according to claim 2, wherein
the arithmetic device determines ranges of the predicted value, and
the arithmetic device generates the plurality of groups by classifying the records based on the ranges of the predicted value.
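
Range-based grouping can be illustrated with equal-width ranges. This sketch assumes that predicted values are probabilities in the interval [0, 1] and that the ranges are `n_bins` equal-width intervals; neither assumption comes from the claim, and the function name `group_by_range` is hypothetical.

```python
def group_by_range(records, n_bins):
    """Classify (predicted, actual) pairs into equal-width ranges.

    Assumes predicted values lie in [0, 1]. Group k holds the records with
    k / n_bins <= predicted < (k + 1) / n_bins; a predicted value of
    exactly 1.0 is placed in the last group.
    """
    groups = [[] for _ in range(n_bins)]
    for predicted, actual in records:
        k = min(int(predicted * n_bins), n_bins - 1)
        groups[k].append((predicted, actual))
    return groups


# Hypothetical sample data: (predicted value, actual value).
records = [(0.9, 1), (0.8, 1), (0.6, 0), (0.55, 1), (0.3, 0), (0.1, 0)]
groups = group_by_range(records, n_bins=4)
group_sizes = [len(g) for g in groups]
```

Unlike fixed-size grouping, range-based grouping can produce groups of unequal size, so a very small group yields a noisier group statistic than a large one.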

An analysis index calculation method executed by a computer that analyzes the prediction accuracy of a prediction model for calculating a predicted value of a prediction item of an observation target, wherein
the computer includes an arithmetic device, a memory connected to the arithmetic device, and an interface connected to the arithmetic device, and
the computer manages a database storing a plurality of records each composed of the predicted value of the prediction item of the observation target and the actual value of the prediction item of the observation target, the method comprising:
a first step in which the arithmetic device sorts the plurality of records stored in the database based on the magnitude of the predicted value, and stores the sorting result in the memory;
a second step in which the arithmetic device calculates a prediction error of each of the plurality of records based on the predicted value and the actual value of the plurality of records stored in the database, and stores the prediction error of each of the plurality of records in the memory; and
a third step in which the arithmetic device selects a plurality of target records based on the sorting result, calculates an analysis index for analyzing the prediction accuracy of the prediction model based on a statistic calculated from the prediction error of each of the selected records, and stores the analysis index in the memory.

The analysis index calculation method according to claim 6, wherein
the actual value of the prediction item of the observation target is a value indicating whether or not an arbitrary event occurs,
the first step includes:
a fourth step in which the arithmetic device sorts the plurality of records stored in the database in descending order or ascending order of the predicted value; and
a fifth step in which the arithmetic device generates, based on the sorting result, a plurality of groups each including an arbitrary number of the records, and stores group information for managing the groups in the memory,
the second step includes:
a sixth step in which the arithmetic device calculates, for each group, a statistic of the actual value and a statistic of the predicted value based on the actual values and the predicted values of the records included in the group, and stores statistical information including the statistic of the actual value and the statistic of the predicted value of each of the plurality of groups in the memory; and
a seventh step in which the arithmetic device calculates the prediction error of each of the plurality of groups based on the statistical information, and
the third step includes:
an eighth step in which the arithmetic device selects a plurality of target groups based on the sorting result; and
a ninth step in which the arithmetic device calculates the analysis index based on a statistic calculated from the prediction error of each of the selected plurality of groups.

The analysis index calculation method according to claim 7, wherein
the ninth step includes a step in which the arithmetic device calculates, as the analysis index, at least one of an average value of the prediction errors and a variance of the prediction errors, based on the prediction error of each of the selected plurality of groups.

The analysis index calculation method according to claim 7, wherein
the eighth step includes:
a step in which the arithmetic device generates, based on the sorting result, cumulative information on the cumulative value of the predicted value and the cumulative value of the actual value, and stores the generated cumulative information in the memory; and
a step in which the arithmetic device refers to the cumulative information and identifies a group in which either the increase in the cumulative value of the predicted value or the increase in the cumulative value of the actual value is greater than a predetermined threshold, and
the ninth step includes a step in which the arithmetic device calculates the analysis index based on the prediction error of the identified group.

The analysis index calculation method according to claim 7, wherein
the fifth step includes:
a step in which the arithmetic device determines ranges of the predicted value; and
a step in which the arithmetic device generates the plurality of groups by classifying the records based on the ranges of the predicted value.