JPWO2020188696A1

JPWO2020188696A1 - Abnormality detection device and abnormality detection method

Info

Publication number: JPWO2020188696A1
Application number: JP2019555983A
Authority: JP
Inventors: 宜史上田; 淳岡嶋
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2019-03-18
Filing date: 2019-03-18
Publication date: 2021-04-01
Anticipated expiration: 2039-03-18
Also published as: KR102408756B1; JP6647473B1; KR20210114070A; CN113574358B; WO2020188696A1; CN113574358A

Abstract

The abnormality detection device (100) according to the present invention includes: a data division unit (102) that divides time-series data into a learning section and a test section; a subsequence generation unit (103) that generates, as learning data, a subsequence of the learning section of the time-series data; a predictive distribution calculation unit (104) that uses the learning data to obtain a probability distribution corresponding to a data point in the test section; and an abnormality detection unit (107) that detects an abnormality using the probability distribution.

Description

The present invention relates to an abnormality detection device and an abnormality detection method for determining an abnormality in an object of abnormality detection, such as the equipment of a factory, a chemical plant, or a steel plant.

Facilities such as factories and buildings use control systems to control equipment within the facility, such as air conditioning and electric lighting. Power plants (thermal, hydroelectric, nuclear, and others), chemical plants, steel plants, and similar facilities also use control systems to control their processes. In addition, factory equipment, automobiles, railroad vehicles, and the like are often equipped with a logging system that records the state of the equipment. The state of a facility includes the state of the equipment it contains and states indicating the environment inside or outside the facility. Logging systems and control systems generally accumulate time-series data, measured by sensors, that indicates the state of the facility over time.

Conventionally, changes in such time-series data have been analyzed to detect an abnormality in an object of abnormality detection such as the facilities described above. For example, Patent Document 1 discloses an abnormality detection method that extracts features from time-series data and determines that an abnormality exists when the distance between the extracted features and features extracted from training data containing no abnormality exceeds a threshold value.

Japanese Unexamined Patent Application Publication No. 2015-11027 (JP-A-2015-11027)

However, the tendency of time-series data may differ from one piece of equipment to another within a facility, or from one sensor to another. Consequently, when a determination is made using a threshold value, as in the method described in Patent Document 1, the threshold value must be evaluated and verified for each piece of equipment and each sensor. Moreover, evaluating and verifying the threshold value requires external information such as the knowledge of skilled operators and equipment designers, which places a heavy load on operators and designers and takes time. It is therefore desirable to reduce the workload of setting the threshold value.

The present invention has been made in view of the above, and its object is to obtain an abnormality detection device that can detect an abnormality in an object of abnormality detection while reducing the workload of setting a threshold value.

To solve the above problem and achieve the object, the abnormality detection device according to the present invention includes a data division unit that divides time-series data into a learning section and a test section, and a subsequence generation unit that generates, as learning data, a subsequence of the learning section of the time-series data. The abnormality detection device further includes a predictive distribution calculation unit that uses the learning data to obtain a probability distribution corresponding to a data point in the test section, and an abnormality detection unit that detects an abnormality using the probability distribution.

The abnormality detection device according to the present invention has the effect of being able to detect an abnormality in an object of abnormality detection while reducing the workload of setting a threshold value.

A diagram showing a functional configuration example of the abnormality detection device according to the embodiment of the present invention.
A diagram showing a configuration example of a computer system that implements the abnormality detection device.
A diagram showing an example of time-series data.
A diagram showing an example of time-series data.
A diagram showing an example of time-series data.
A flowchart showing an example of the abnormality detection processing procedure in the abnormality detection device.
A diagram showing an example of a Gaussian distribution.
A diagram showing how the learning section is updated.
A diagram showing an example of the credible interval and the anomaly score at each time point in the test section.

The abnormality detection device and the abnormality detection method according to the embodiment of the present invention are described in detail below with reference to the drawings. The present invention is not limited by this embodiment.

Embodiment.
FIG. 1 is a diagram showing a functional configuration example of the abnormality detection device according to the embodiment of the present invention. As shown in FIG. 1, the abnormality detection device 100 of the present embodiment includes a data acquisition unit 101, a data division unit 102, a subsequence generation unit 103, a predictive distribution calculation unit 104, a credible interval calculation unit 105, an anomaly score calculation unit 106, and an abnormality detection unit 107.

The abnormality detection device 100 of the present embodiment acquires time-series data indicating the state of an object of abnormality detection and detects an abnormality in the object based on the acquired time-series data. Examples of objects of abnormality detection include facilities such as factories, chemical plants, steel plants, and water and sewage plants, as well as automobiles, railroad vehicles, and data relating to economics or business management. Time-series data is a data sequence containing data corresponding to each of a plurality of different times, from which the change in the data over time can be grasped. Any kind of time-series data may be used. For example, it may be a data sequence containing data observed at each of a plurality of different times, or a data sequence containing the results of processing such observed data. The time-series data may also be, for example, feedback data used for control. In other words, time-series data contains a plurality of data points corresponding to different times. In the following, a data point corresponds to one point when time information indicating a time and a value corresponding to that time, such as a sensor value, are represented in a two-dimensional coordinate system. For example, time-series data is data in which sensor values measured by a sensor at regular time intervals are arranged together with their acquisition times. Examples of sensors include a temperature sensor that measures the temperature of a facility or piece of equipment, a sensor that detects the rotational position of a motor in a factory machine, a force sensor that measures the acceleration or the like of a factory machine, a current sensor, and a voltage sensor. Examples of time-series data relating to economics or business management include time series of exchange rates, stock prices, and futures prices. An example of an abnormality in such data is a sharp drop in price.

The time-series data may be accumulated, for example, in equipment such as manufacturing apparatuses on a factory line (processing machines, robot pumps, and the like), automobiles, and railroad vehicles, or it may be data accumulated in a control system for the air conditioning, electricity, lighting, water supply and drainage, or the like of a factory or building. The time-series data may also be data accumulated in a control system for controlling the processes of a power plant (thermal, hydroelectric, nuclear, or the like), a chemical plant, a steel plant, a water and sewage plant, or the like. Furthermore, the time-series data may be data accumulated in an information system relating to economics or business management.

Returning to the description of FIG. 1, the data acquisition unit 101 of the abnormality detection device 100 accepts input of data such as settings used in the abnormality detection process. The data acquisition unit 101 may also accept input of time-series data. The data division unit 102 divides the time-series data into a learning section and a test section, both described later. The subsequence generation unit 103 generates learning data, which is a subsequence of the learning section of the time-series data.
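The division and subsequence generation described above can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the function names `split_series` and `subsequences`, the ratio-based split, and the use of overlapping fixed-width windows slid one point at a time are all assumptions made for illustration, and the sample values are made up.

```python
def split_series(values, learning_ratio=0.7):
    """Split a series into (learning section, test section).

    learning_ratio is a hypothetical stand-in for the data division
    ratio that the device receives as a setting.
    """
    cut = round(len(values) * learning_ratio)
    return values[:cut], values[cut:]


def subsequences(values, width):
    """Return every contiguous window of `width` points (assumed scheme)."""
    return [values[i:i + width] for i in range(len(values) - width + 1)]


# Made-up sensor values; the last point is deliberately unusual.
series = [1.0, 1.2, 0.9, 1.1, 1.0, 1.3, 0.8, 1.1, 1.2, 5.0]

learning, test = split_series(series, learning_ratio=0.7)
learning_data = subsequences(learning, width=3)

print(len(learning), len(test), len(learning_data))
```

With ten points and a ratio of 0.7, the first seven points form the learning section and the remaining three form the test section; a window width of 3 then yields five overlapping subsequences of the learning section.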

The predictive distribution calculation unit 104 obtains, based on the learning data, a probability distribution corresponding to each data point in the test section. The credible interval calculation unit 105 calculates, based on the probability distribution, a credible interval corresponding to each data point in the test section. The anomaly score calculation unit 106 calculates an anomaly score indicating the degree to which the time-series data of the test section deviates from the credible interval. The abnormality detection unit 107 detects an abnormality using the probability distribution calculated by the predictive distribution calculation unit 104, for example based on the anomaly score. The operation of each unit of the abnormality detection device 100 is described in detail later.
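As a rough illustration of how a probability distribution, a credible interval, and an anomaly score relate to one another, the following Python sketch may help. It rests on strong simplifying assumptions that are not taken from the source: the same Gaussian, fitted to the learning values, is used as the predictive distribution for every test point; the interval is the central 95% of that Gaussian; and the score is simply the distance by which a test value falls outside the interval. The function names and sample values are hypothetical, and the actual calculation of the embodiment is described later in the specification.

```python
from statistics import NormalDist, mean, stdev


def credible_interval(learning_values, level=0.95):
    """Central interval of a Gaussian fitted to the learning values.

    Assumption for illustration only: one Gaussian serves as the
    predictive distribution for all test points.
    """
    dist = NormalDist(mean(learning_values), stdev(learning_values))
    tail = (1.0 - level) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


def anomaly_score(value, lower, upper):
    """Distance by which `value` lies outside [lower, upper]; 0.0 if inside."""
    if value < lower:
        return lower - value
    if value > upper:
        return value - upper
    return 0.0


# Made-up learning and test values.
learning = [10.1, 9.8, 10.0, 10.3, 9.9, 10.2, 9.7, 10.0]
test = [10.1, 9.9, 12.5]

lower, upper = credible_interval(learning)
scores = [anomaly_score(v, lower, upper) for v in test]
# Hypothetical decision rule: flag any point outside the interval.
flags = [s > 0.0 for s in scores]

print(lower, upper, scores, flags)
```

Because the interval is derived from the learning data itself, no per-sensor threshold on the raw values has to be tuned by hand in this sketch; only the interval level is chosen.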

The hardware configuration of the abnormality detection device 100 is now described. The abnormality detection device 100 is implemented by a computer system. FIG. 2 is a diagram showing a configuration example of a computer system that implements the abnormality detection device 100. This computer system includes a computer 20, and an input device 209 and a display 210 connected to the computer 20.

The computer 20 includes a processor 201, an auxiliary storage device 202, a memory 203, an input interface (hereinafter abbreviated as I/F) 204, a display I/F 205, an alarm output device 206, and a network I/F 207. The processor 201 is connected via a signal line 208 to the auxiliary storage device 202, the memory 203, the input I/F 204, the display I/F 205, the alarm output device 206, and the network I/F 207. The processor 201 is, for example, a CPU (Central Processing Unit) or an MPU (Micro Processing Unit). The auxiliary storage device 202 and the memory 203 are, for example, a RAM (Random Access Memory), a ROM (Read Only Memory), or an HDD (Hard Disk Drive).

The input I/F 204 is connected to the input device 209 via a cable 211. The input I/F 204 is a circuit for exchanging data with the input device 209. The input device 209 is a device that accepts input from the user and includes a keyboard, a mouse, and the like.

The display I/F 205 is connected to the display 210 via a cable 212. The display I/F 205 is a circuit for exchanging data with the display 210. The input device 209 and the display 210 may be integrated and implemented as a touch panel. The display 210 is one example of an output device; in addition to the display 210, another output device such as a printer may be connected via an I/F for that output device.

The alarm output device 206 is, for example, an indicator light such as an LED (Light Emitting Diode) pilot lamp, or a speaker. Although FIG. 2 shows an example in which the alarm output device 206 is provided inside the computer 20, this is not a limitation: like the display 210, the alarm output device 206 may be provided outside the computer 20 and connected to the computer 20 via a cable.

The network I/F 207 is a communication circuit for communicating with the outside and is connected to a network (not shown) via a wired or wireless line. Other devices, such as a computer (not shown) and a database server having a database, are connected to the network. The network I/F 207 sends and receives e-mail to and from other devices, receives data stored in the databases of other devices, and transmits data to other devices for storage in their databases.

The functions of the functional units of the abnormality detection device 100 shown in FIG. 1 are implemented by software, firmware, or a combination of software and firmware. The software, the firmware, or the software and firmware that implement these functions are written as a program. This program is stored in the auxiliary storage device 202 and causes the computer 20 to execute the procedure or method of each functional unit. Specifically, each functional unit of the abnormality detection device 100 shown in FIG. 1 is implemented by the processor 201 executing the program. Of these functional units, the data acquisition unit 101 also uses the input device 209 to implement its function, and the abnormality detection unit 107 uses at least one of the display 210 and the alarm output device 206 to implement its function. The program may be provided on a recording medium or via a communication medium and then stored in the auxiliary storage device 202.

The time-series data described above is stored in the auxiliary storage device 202. For example, the time-series data is transmitted from another device and stored in the auxiliary storage device 202 via the network I/F 207. Alternatively, the time-series data may be recorded on a recording medium and stored in the auxiliary storage device 202 by being read from that medium, or it may be entered by the user via the input device 209.

The program stored in the auxiliary storage device 202 is loaded from the auxiliary storage device 202 into the memory 203 and executed by being read by the processor 201. Executing the program implements the functions of the functional units shown in FIG. 1. When the program is executed, data used in its execution, such as the time-series data, is also loaded from the auxiliary storage device 202 into the memory 203. The execution result of the program is written to the memory 203 and, depending on what the program specifies, is stored in the auxiliary storage device 202, displayed on the display 210 via the display I/F 205, or transmitted to another device on the network via the network I/F 207.

The input device 209 accepts from the user setting information used in the processing of the abnormality detection device 100, such as the data division ratio described later. The input device 209 also accepts from the user processing instructions such as requests to start and end time-series data processing. Setting information accepted by the input device 209 is stored in the auxiliary storage device 202 via the input I/F 204. Instructions accepted by the input device 209 are input to the processor 201 via the input I/F 204.

Next, the abnormality detection method of the present embodiment is described. The following description uses, as an example of time-series data, data measured by a plurality of types of sensors installed in a manufacturing apparatus that operates continuously on a factory line. That is, an example in which the object of abnormality detection is a manufacturing apparatus is described. As noted above, time-series data is not limited to data measured by sensors.

FIGS. 3 to 5 are diagrams showing examples of time-series data. The sensor values 303 shown in FIGS. 3 to 5 are data measured at a fixed period by a plurality of types of sensors installed in a manufacturing apparatus that operates continuously on a factory line. Each sensor value 303 is associated with time information 301 indicating the time at which the data was acquired. In the examples shown in FIGS. 3 to 5, the pairs of time information 301 and sensor value 303 constitute the time-series data. In these examples, the plurality of types of sensors include an acceleration sensor A. The sensor values are not limited to values measured by the acceleration sensor A; other examples include measured values of the current, voltage, vibration, acceleration, and pressure of the manufacturing apparatus.

In FIGS. 3 to 5, time information 301 indicating the time at which each sensor value 303 was acquired and control information 302 indicating the control conditions of the manufacturing apparatus are shown together with the sensor values 303. The control information 302 is, for example, the product manufacturing count (the number of products manufactured) or recipe information (command values relating to manufacturing conditions). A command value is, for example, a motor speed command value when the object of abnormality detection is a rotating mechanism, a welding temperature command value when it is a welding apparatus, or a laser output voltage command value when it is a laser processing machine. Recipe information is explained as follows. Depending on the product, a command value may be changed in several stages. Here, a pattern of command value changes, a set of processing conditions, or the like is called a recipe. A vacuum pump used in semiconductor manufacturing is an example of a rotating mechanism. A vacuum pump creates a vacuum by rotating a motor to discharge air. During semiconductor manufacturing, chemicals, gases, and the like are applied to a wafer. The types of chemicals and gases differ depending on the product type. The timing at which chemicals and gases are applied differs from product to product, as does the rotation speed of the motor. For example, the rotation speed of the motor is set to A before gas is introduced, to B while gas is being introduced, and to C after gas has been introduced. Such a procedure is called a recipe, and recipe information is information indicating such a procedure. In the examples shown in FIGS. 3 to 5, the control information 302 includes command value 1. Here, it is assumed that the control information 302 is recorded as state information together with the time-series data, that is, the time information 301 and the sensor values 303. The state information is recorded, for example, by a control device that controls the manufacturing apparatus, and the abnormality detection device 100 acquires it from this control device via the network.

In the examples shown in FIGS. 3 to 5, the time information 301 is expressed as a time of day. However, the time information is not limited to the time itself; it may be a mechanically assigned sequential number or a numerical value such as the row number of a matrix. Furthermore, when the time-series data is acquired periodically and is known to have no missing values, the time information need not be attached to each sensor value as long as the data is arranged in order of acquisition time. In this case, if the start time of the time-series data is managed separately (for example, by being written in the file name of the data file containing the time-series data) and information indicating the acquisition interval of the sensor values is also managed, the acquisition time of each data item can be determined from the start time and the position of the item in the time-series data. In this way, the time information may be given indirectly rather than attached to each data point.
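The indirect form of time information described above amounts to a one-line calculation, sketched here in Python. The function name `acquisition_time`, the start time, the one-hour interval, and the sensor values are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta


def acquisition_time(start, interval, index):
    """Acquisition time of the index-th data item (0-based).

    Valid only when data is acquired periodically with no missing items.
    """
    return start + index * interval


# Hypothetical start time (e.g. parsed from a file name) and interval.
start = datetime(2018, 12, 1, 12, 0, 0)
interval = timedelta(hours=1)
values = [0.51, 0.49, 0.50, 0.52]  # made-up sensor values, in acquisition order

points = [(acquisition_time(start, interval, i), v) for i, v in enumerate(values)]
for t, v in points:
    print(t.isoformat(sep=" "), v)
```

The sketch relies on the condition stated in the text: if any item were missing, every later item would be assigned a time that is too early.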

Although FIG. 3 shows the state information as a single table, the format of the state information is not limited to the example shown in FIG. 3. For example, the time information and the control information may be stored in one table, and the time-series data, that is, the pairs of time information 301 and sensor value 303, in another. The time-series data may also be stored in a separate table for each sensor type. In short, the state information may be divided into multiple parts as long as the pieces of information can be associated with one another.

The time-series data may also consist of summary values rather than the values measured by the sensors themselves. Depending on the factory, line, manufacturing apparatus, or the like, values that summarize the sensor data according to a fixed rule may be what is recorded. Summarizing here means processing the original data to generate data with a smaller volume than the original. The specific processing used for summarization is not particularly restricted; it may be, for example, statistical processing or Fourier transform processing. For example, a sensor acquires a measured value every second, and the control device of the manufacturing apparatus generates one representative value per hour from these measured values. The representative value may be the mean, the median, or the mode of one hour of measured values. Alternatively, the abnormality detection device 100 may acquire the values measured by the sensors and summarize them itself to generate the time-series data.
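The summarization into representative values can be sketched in Python as follows. The function name `summarize` and the sample values are hypothetical, and a block of four measurements stands in for the 3600 per-second measurements that would make up one hour, purely to keep the example readable.

```python
from statistics import mean, median, mode


def summarize(values, block_size, reducer=mean):
    """Reduce each consecutive block of `block_size` values to one value.

    A final block shorter than `block_size` is still summarized.
    """
    return [
        reducer(values[i:i + block_size])
        for i in range(0, len(values), block_size)
    ]


# Made-up raw measurements: two blocks of four values each.
raw = [2, 2, 4, 8, 5, 5, 5, 9]

by_mean = summarize(raw, block_size=4, reducer=mean)
by_median = summarize(raw, block_size=4, reducer=median)
by_mode = summarize(raw, block_size=4, reducer=mode)

print(by_mean, by_median, by_mode)
```

The three reducers can give noticeably different representative values for the same block, so the choice of summary rule affects what the downstream abnormality detection sees.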

In the example shown in FIG. 3, the time-series data is recorded every second, and command value 1 does not change. In the examples shown in FIGS. 4 and 5, the time-series data is recorded every hour. In the example shown in FIG. 4, command value 1 is changed from 20 to 40 at 2018/12/01 14:00:00, from 40 to 80 at 2018/12/01 16:00:00, and from 80 to 20 at 2018/12/01 17:00:00. As this shows, a command value may be changed according to production conditions and the like. In the abnormality detection process described later, data can be extracted according to the command value, and the process can be performed on the time-series data for each command value, so that the tendency of the time-series data is easier to predict. In that case, extracting the data for a single operating condition, that is, the data for which command value 1 has the same value, leaves gaps in the extracted data. For example, extracting the data for which command value 1 is 20 in the example shown in FIG. 4 leaves the data for the three time points from 2018/12/01 14:00:00 to 2018/12/01 16:00:00 missing.
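The gap left by extraction can be reproduced with a small Python sketch. The command values follow the changes described for FIG. 4; the sensor values, the rows before 14:00:00 and after 17:00:00, and the function names `extract_by_command` and `missing_times` are hypothetical.

```python
from datetime import datetime, timedelta

# (time, command value 1, sensor value); sensor values are made up.
rows = [
    (datetime(2018, 12, 1, 12), 20, 0.50),
    (datetime(2018, 12, 1, 13), 20, 0.52),
    (datetime(2018, 12, 1, 14), 40, 0.91),
    (datetime(2018, 12, 1, 15), 40, 0.93),
    (datetime(2018, 12, 1, 16), 80, 1.75),
    (datetime(2018, 12, 1, 17), 20, 0.51),
    (datetime(2018, 12, 1, 18), 20, 0.49),
]


def extract_by_command(rows, command):
    """Keep only the (time, sensor value) pairs recorded under `command`."""
    return [(t, v) for t, c, v in rows if c == command]


def missing_times(points, interval):
    """List the expected time points absent between the first and last point."""
    present = {t for t, _ in points}
    t, last = min(present), max(present)
    missing = []
    while t <= last:
        if t not in present:
            missing.append(t)
        t += interval
    return missing


extracted = extract_by_command(rows, command=20)
gaps = missing_times(extracted, timedelta(hours=1))

print([t.strftime("%H:%M") for t in gaps])
```

Extracting the rows for command value 20 leaves four points, and the three hourly time points from 14:00:00 to 16:00:00 are reported as missing, matching the description.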

In addition, depending on the operating state or energization state of the equipment, data may be missing because a measured value that should have been acquired at a fixed period was not acquired, or because the measurement itself was not performed owing to equipment maintenance or the like. FIG. 5 shows an example in which data is missing from the time-series data. In the example shown in FIG. 5, the data corresponding to the two time points 2018/12/01 14:00:00 and 2018/12/01 15:00:00 is missing.

When the time-series data has missing points, for example when data is extracted for each command value as in the example of FIG. 4 or when the original data itself has gaps as shown in FIG. 5, the abnormality detection device 100 may fill in the missing data by interpolation processing, as described later.
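The text does not say which interpolation method is used. As one possible reading, the sketch below fills gaps by linear interpolation between the nearest known neighbors; the function name `fill_gaps_linear` and the sample values are hypothetical.

```python
def fill_gaps_linear(values):
    """Fill None entries by linear interpolation between known neighbors.

    Linear interpolation is an assumption; the source only says
    "interpolation processing". Leading or trailing gaps stay None
    because there is nothing to interpolate between.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            w = (i - left) / span  # 0 at the left neighbor, 1 at the right
            filled[i] = filled[left] * (1 - w) + filled[right] * w
    return filled


# Two consecutive missing points, as in the FIG. 5 situation (values invented).
series = [10.0, 12.0, None, None, 18.0, 19.0]
filled = fill_gaps_linear(series)
print(filled)
```

Interpolated points carry less information than measured ones, which is why a later step may want to treat them with lower weight.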

FIG. 6 is a flowchart showing an example of the abnormality detection processing procedure in the abnormality detection device 100. First, the data acquisition unit 101 accepts the selection of the time-series data to be processed (step S1). As described above, when the measurement values of a plurality of types of sensors are used as time-series data, time-series data is generated for each sensor. In step S1, the data acquisition unit 101 accepts from the user a selection of which of these time-series data is to be processed. At this time, the data acquisition unit 101 may display on the display 210 information identifying the selectable time-series data, for example the names of the sensors corresponding to the time-series data, and accept the user's selection from among the displayed names. In addition to the type of sensor, the selection of the period to be processed may also be accepted. The user selects the name corresponding to the time-series data to be processed from among the displayed names by operating the input device 209. The data acquisition unit 101 may also accept the input of processing conditions in step S1. One example of a processing condition is whether to perform processing on data extracted for each command value, as described above. When such processing is specified, which command value's data is to be processed is also a processing condition.

After step S1, the data acquisition unit 101 performs preprocessing according to the processing conditions (step S2). When no processing conditions are set, the data acquisition unit 101 performs, as preprocessing, a process of extracting from the state information the time-series data to be processed that was specified in step S1. When processing on data extracted for each command value was specified as a processing condition in step S1, the data acquisition unit 101 extracts, as preprocessing, the data corresponding to the specified command value from the time-series data to be processed. When the time-series data has missing points, the data acquisition unit 101 may also fill in the missing data by interpolation processing as preprocessing.

The data acquisition unit 101 also accepts the ratio between the learning section and the test section (step S3). In the present embodiment, as described later, the time-series data is divided into a learning section and a test section, and the data of the test section is predicted using the time-series data of the learning section. In step S3, the data acquisition unit 101 accepts from the user the input of the ratio between the learning section and the test section used for this division. The ratio may be a ratio of the time lengths corresponding to the data or a ratio of the numbers of data points. Here, the ratio of the numbers of data points is used, to allow for the case where the time-series data has missing points as described above.

Next, the data division unit 102 divides the time-series data into the learning section and the test section based on the ratio between them (step S4). Specifically, the data division unit 102 calculates, from this ratio, the division position at which the time-series data is divided into the learning section and the test section. For example, suppose that the number of data points of the time-series data to be processed is N_all and that the ratio of the learning section to the test section is R_t : R_d. The data division unit 102 then takes the first N_all × (R_t / (R_t + R_d)) data points of the N_all data points as the learning section and the time-series data after the learning section as the test section. When N_all × (R_t / (R_t + R_d)) is not an integer, the number of data points in the learning section is determined by rounding it to the nearest integer, rounding it down, rounding it up, or the like. Let n be the data length of the learning section obtained in this way, that is, its number of data points, and let m be the data length of the test section, so that n + m = N_all. The division position between the learning section and the test section lies between the n-th and (n+1)-th points of the time-series data. The learning section is thus a section whose times precede those of the test section. The numbers of data points in the learning section and in the test section are hereinafter also called the learning data length and the test data length, respectively. The data division unit 102 notifies the substring generation unit 103 of the learning data length and the test data length.
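The division rule in step S4 can be sketched as follows. The function names `split_lengths` and `split_series` and the `mode` parameter are hypothetical; the three rounding modes mirror the options named in the text (round half up, round down, round up).

```python
import math


def split_lengths(n_all, r_t, r_d, mode="half_up"):
    """Return (n, m): learning and test data lengths for ratio r_t : r_d."""
    raw = n_all * r_t / (r_t + r_d)
    if mode == "half_up":
        n = math.floor(raw + 0.5)
    elif mode == "floor":
        n = math.floor(raw)
    elif mode == "ceil":
        n = math.ceil(raw)
    else:
        raise ValueError(f"unknown rounding mode: {mode}")
    return n, n_all - n  # n + m = N_all by construction


def split_series(series, r_t, r_d, mode="half_up"):
    """Split at the position between the n-th and (n+1)-th points."""
    n, _ = split_lengths(len(series), r_t, r_d, mode)
    return series[:n], series[n:]


# The FIG. 8 setting: N_all = 20 and R_t : R_d = 7 : 3.
n, m = split_lengths(20, 7, 3)
learning, test = split_series(list(range(20)), 7, 3)
print(n, m)  # 14 6
```

Deriving m as N_all minus n, rather than rounding it separately, guarantees that no data point is dropped or counted twice.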

Next, the substring generation unit 103 generates learning data, which is the substring of the learning section, based on the division result of step S4, that is, the division position calculated in step S4 (step S5). That is, the substring generation unit 103 generates the substring of the learning section by extracting the first n points of the time-series data, and generates the substring of the test section by extracting the remaining m points. The substring generation unit 103 outputs the generated substring of the learning section to the prediction distribution calculation unit 104. As described later, the learning section is updated in the processing of step S9. Hereinafter, the learning section obtained by the division in step S5 is also called the initial learning section.

Next, the prediction distribution calculation unit 104 obtains the probability distribution and the predicted value at time point j of the test section, based on the learning data, which is the substring of the learning section (step S6). Here, j is a natural number indicating the index of a data point within the substring of the initial test section, and its initial value is 1. Time point j is the time point corresponding to the j-th data point in the substring of the test section, that is, the j-th time. Specifically, in the first execution of step S6, the prediction distribution calculation unit 104 calculates, based on the learning data that is the substring of the initial learning section, the probability distribution of the data at the j-th time point of the test section, which is the data point following the learning data, that is, the (n+1)-th point from the beginning. Accordingly, j is 1 in the first execution of step S6. The prediction distribution calculation unit 104 calculates the conditional distribution at the point following the learning data based on the learning data, for example using a model based on Gaussian process regression (GPR).

A Gaussian process is one in which, for a set of n data (x_1, x_2, …, x_n), the joint distribution p(Y) of the corresponding Y = (y_1, y_2, …, y_n) follows a Gaussian distribution. Applying a Gaussian process to a regression problem, that is, fitting a Gaussian process to the above data set, is Gaussian process regression. In Gaussian process regression, therefore, when n data points (X, Y) = (x_1, y_1), (x_2, y_2), …, (x_n, y_n) are given, the conditional distribution p(x_{n+1} | Y) is obtained as the predictive distribution of Y at the point x_{n+1}. Here, for i = 1, 2, …, n, (x_i, y_i) denotes the i-th data point of the learning section, x_i is the time information, and y_i is a value such as the sensor value corresponding to x_i.

Calculating the conditional distribution p(x_{n+1} | Y) requires the joint distribution p(Y_{n+1}) given by equation (1). C_{n+1} in equation (1) is an (n+1) × (n+1) covariance matrix and can be expressed in the form shown in equation (2).

Here, C_n is an n × n covariance matrix and can be expressed using a kernel function k(x_i, x_j). Note that j is n+1. A kernel function is a function that represents the degree of similarity, that is, the correlation, between the two variables x_i and x_j. K is a vector whose elements have the form k(x_i, x_{n+1}) for i = 1, …, n. c is a scalar, as shown in equation (3). β is a constant. δ_ij is a variable that is 1 when i = j and 0 otherwise. It is assumed that Y contains errors such as measurement errors and that these errors follow a Gaussian distribution. This error corresponds to the product of the constant β^{-1} and the variable δ_ij in equation (3).

Here, the Gaussian kernel shown in equation (4) is used as the kernel function. The kernel function is not limited to the Gaussian kernel; an exponential kernel or a linear kernel may also be used.

Using equations (1) to (4) above with the time-series data of the learning section as (x_1, y_1), (x_2, y_2), …, (x_n, y_n), the prediction distribution calculation unit 104 can obtain the mean μ and the variance σ^2 of the conditional distribution p(x_{n+1} | Y), which is the Gaussian distribution of y_{n+1}, from equations (5) and (6). The conditional distribution p(x_{n+1} | Y) can be expressed by equation (7).
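The equations referred to as (1) to (7) are not reproduced in this text. Under the textbook formulation of Gaussian process regression with a zero-mean prior, they would take roughly the following form. This is a reconstruction under that assumption, not the patent's own typesetting: the kernel length scale \ell and the exact kernel parameterization are assumptions, δ_ij is taken to be the Kronecker delta, and the predictive distribution written in the text as p(x_{n+1} | Y) appears here as p(y_{n+1} | Y).

```latex
\begin{align}
p(Y_{n+1}) &= \mathcal{N}\!\left(Y_{n+1} \mid \mathbf{0},\, C_{n+1}\right) \tag{1} \\
C_{n+1} &= \begin{pmatrix} C_n & K \\ K^{\top} & c \end{pmatrix} \tag{2} \\
C(x_i, x_j) &= k(x_i, x_j) + \beta^{-1}\delta_{ij},
  \qquad c = k(x_{n+1}, x_{n+1}) + \beta^{-1} \tag{3} \\
k(x_i, x_j) &= \exp\!\left(-\frac{(x_i - x_j)^2}{2\ell^2}\right) \tag{4} \\
\mu &= K^{\top} C_n^{-1} Y \tag{5} \\
\sigma^2 &= c - K^{\top} C_n^{-1} K \tag{6} \\
p(y_{n+1} \mid Y) &= \mathcal{N}\!\left(y_{n+1} \mid \mu,\, \sigma^2\right) \tag{7}
\end{align}
```

In this form, β^{-1} is the variance of the Gaussian measurement error added to the diagonal of the covariance matrix, which matches the role the text gives to the product of β^{-1} and δ_ij.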

The probability distribution at the time point corresponding to x_{n+1}, that is, at time point j, is a Gaussian distribution with mean μ and variance σ^2. FIG. 7 is a diagram showing an example of a Gaussian distribution. The predicted value of the time-series data at time point j can be taken to be the mean of this Gaussian distribution. As for the credible interval, when a 95% credible interval is used, for example, the range of the Gaussian distribution excluding 2.5% on each of the left and right sides is the credible interval at time point j. The 95% credible interval is the interval in which the true value lies with a probability of 95%.
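A minimal pure-Python sketch of the prediction and credible interval described above follows. It assumes the textbook GPR formulas (mean K^T C_n^{-1} Y and variance c − K^T C_n^{-1} K) and a Gaussian kernel with a length scale. The function names, the values `beta=100.0` and `length_scale=1.0`, and the sine test signal are illustrative assumptions, not values from the source.

```python
import math
from statistics import NormalDist


def gaussian_kernel(a, b, length_scale=1.0):
    return math.exp(-((a - b) ** 2) / (2.0 * length_scale ** 2))


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def gpr_predict(xs, ys, x_next, beta=100.0, length_scale=1.0):
    """Return (mean, variance) of the predictive distribution at x_next."""
    n = len(xs)
    # C_n: kernel matrix with noise variance 1/beta on the diagonal.
    c_n = [[gaussian_kernel(xs[i], xs[j], length_scale)
            + (1.0 / beta if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    k_vec = [gaussian_kernel(x, x_next, length_scale) for x in xs]
    c = gaussian_kernel(x_next, x_next, length_scale) + 1.0 / beta
    mean = sum(k * a for k, a in zip(k_vec, solve(c_n, ys)))
    var = c - sum(k * b for k, b in zip(k_vec, solve(c_n, k_vec)))
    return mean, var


def credible_interval(mean, var, level=0.95):
    """Central interval of the Gaussian: cut (1 - level) / 2 from each tail."""
    dist = NormalDist(mu=mean, sigma=math.sqrt(var))
    tail = (1.0 - level) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


xs = [float(i) for i in range(8)]        # time indices of the learning data
ys = [math.sin(0.5 * x) for x in xs]     # invented sensor values
mu, var = gpr_predict(xs, ys, 8.0)       # predict the next time point
low, high = credible_interval(mu, var)
print(mu, var, low, high)
```

For a Gaussian distribution the 95% interval is the mean plus or minus about 1.96 standard deviations, so a larger predictive variance directly widens the interval.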

Returning to the description of FIG. 6, after calculating the probability distribution, the prediction distribution calculation unit 104 calculates the predicted value, that is, the mean of the Gaussian distribution, based on the probability distribution. The prediction distribution calculation unit 104 also passes the calculated probability distribution to the credible interval calculation unit 105. The credible interval calculation unit 105 calculates the credible interval at time point j based on the probability distribution (step S7) and stores the calculated credible interval in the auxiliary storage device 202.

The credible interval calculation unit 105 determines whether the credible intervals of all points in the test section have been calculated (step S8). When there is a time point in the test section for which the credible interval has not been calculated (No in step S8), the credible interval calculation unit 105 specifies the learning section to the substring generation unit 103, and the substring generation unit 103 updates the learning section (step S9). Specifically, in step S9, the substring generation unit 103 updates the learning section by sliding it backward, that is, toward the test section, by one data point, generates the substring of the updated learning section, and outputs it to the prediction distribution calculation unit 104. After step S9, the substring corresponding to the updated learning section is used as the learning data, and the processing from step S6 is repeated. Because the learning section has been updated in step S9, the second and subsequent executions of step S6 perform the processing for the data point following the updated learning section. The value of j for time point j in step S6 is therefore incremented by one each time the learning section is updated.

FIG. 8 is a diagram showing how the learning section is updated. FIG. 8 shows an example in which the number of data points N_all of the time-series data is 20 and the ratio of the learning section to the test section is R_t : R_d = 7 : 3. That is, FIG. 8 shows an example in which the time-series data is divided so that the learning section is 70% and the test section is 30%. In this example, the learning section has 14 data points and the test section has 6 data points. In step S5 described above, as shown in the top row of the figure, the 14 points from the left of the input time-series data form the substring of the learning section, and the 6 points from the right form the substring of the test section. In FIG. 8, the rightmost point is the most recent data. FIG. 8 uses sensor values as an example of time-series data.

Prediction 1 in the second row of FIG. 8 shows how the predicted value is calculated in the first execution of step S6, that is, in the first iteration of the loop. In FIG. 8, circles with dark hatching indicate measured values in the learning section, and circles with light hatching indicate measured values in the test section. The measured values are the data input as time-series data. As described above, the time-series data may be summary values or the like rather than actually measured values, but because sensor values are used as the example here, they are called measured values. When the time-series data consists of summary values, the measured values in FIG. 8 are summary values. In prediction 1, the predicted value for the time point following the initial learning section, that is, the first time point of the test section, shown by the square mark, is calculated based on the initial learning section, which consists of the 14 points from the left of the time-series data.

Prediction 2 in the third row of FIG. 8 shows how the predicted value is calculated in step S6 of the second loop iteration, after the learning section has been updated in step S9 of the first iteration. In step S9 of the first iteration, the learning section is updated by sliding it one point to the right, toward the test section. That is, the substring generation unit 103 updates the learning section so that the corresponding times shift to later times, and generates the substring corresponding to the updated learning section as the updated learning data. For the time point at which the shifted learning section enters the test section, the predicted value is used instead of the measured value. That is, in the third row of FIG. 8, the updated learning section contains the 13 measured values from the 2nd to the 14th point from the left of the time-series data and one predicted value of the test section. The updated learning section thus includes a predicted value of the test section calculated according to the probability distribution. In prediction 2, the substring of this updated learning section is used to calculate the probability distribution of the update point, which is the data point following the updated learning section, and the predicted value based on this probability distribution is calculated. In other words, the predicted value for the second data point of the test section, shown by the square mark, is calculated based on the 13 measured values from the 2nd to the 14th point from the left of the time-series data and the predicted value obtained in prediction 1 for the first data point of the test section.

After prediction 2, the update of the learning section in step S9 and steps S6 to S8 are repeated in the same way, as predictions 3 to m, until the credible intervals of all points in the test section have been calculated, that is, until prediction m has been performed, where m is the number of data points in the test section. With each update of the learning section in step S9, the learning section shifts one point further toward the test section, and one more predicted value is added to the learning section.
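The loop over steps S6 to S9 can be sketched as follows. The function `rolling_forecast` and its `predict` argument are hypothetical names, and the linear-extrapolation predictor used in the demonstration is only a stand-in so that the example stays short; in the described device this role is played by the Gaussian process regression of step S6.

```python
def rolling_forecast(series, n, predict):
    """Predict every test point from a fixed-length sliding learning window.

    `predict(window)` must return (mean, variance) for the point that
    follows `window`. As the window slides into the test section, the
    predicted means, not the measured values, are appended to it.
    """
    window = list(series[:n])           # initial learning section
    results = []
    for _ in range(len(series) - n):    # one pass per test point (m passes)
        mean, var = predict(window)     # step S6
        results.append((mean, var))     # input to step S7
        window = window[1:] + [mean]    # step S9: slide by one data point
    return results


def extrapolate(window):
    """Stand-in predictor (NOT GPR): linear extrapolation, fixed variance."""
    return 2 * window[-1] - window[-2], 1.0


# FIG. 8 setting: 20 points, first 14 are the learning section.
series = [float(i) for i in range(20)]
results = rolling_forecast(series, 14, extrapolate)
print([mean for mean, _ in results])
```

Because the window length stays fixed, the share of predicted values in the learning data grows with each pass, which is one reason the predictive variance is worth tracking alongside the mean.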

Returning to the description of FIG. 6, when the determination in step S8 is Yes, the credible interval calculation unit 105 passes the credible interval data for each point of the test section to the abnormality score calculation unit 106. The abnormality score calculation unit 106 then calculates the abnormality score of the test section (step S10). The abnormality score is a value indicating the degree of deviation between the learning data and the time-series data of the test section. In other words, the abnormality score indicates the relative degree of divergence between the behavior of the time-series data in the learning section and its behavior in the test section. The abnormality score is expressed, for example, as a number from 0.0 to 1.0 and approaches 1.0 as the degree of divergence increases. Accordingly, when the behavior of the time-series data in the learning section and its behavior in the test section are similar, the abnormality score is low. The definition of the abnormality score is not limited to this, and any definition may be used as long as it can express the degree of divergence between the behavior of the time-series data in the learning section and its behavior in the test section.

Here, as a specific method of calculating the abnormality score, the abnormality score calculation unit 106 determines at each point whether the measured value of the test section lies within the credible interval, and calculates as the abnormality score the number of data points whose measured value lies outside the credible interval divided by the total number of data points in the test section. That is, the abnormality score calculation unit 106 calculates the abnormality score based on the credible intervals corresponding to the plurality of data points of the test section and the time-series data of the test section.

FIG. 9 is a diagram showing an example of the credible interval at each time point of the test section and the abnormality score. In FIG. 9, five measured values in the test section fall outside the credible interval indicated by the broken lines, and the test section has 6 data points. The abnormality score is therefore 5/6 = 0.833…. In FIG. 9, the abnormality score is rounded to two decimal places and shown as 0.83. When the substring used as the learning data has missing points, the variance σ^2 becomes larger and the tails of the probability distribution spread, so the credible interval widens and the accuracy of the prediction decreases. For points with such low prediction accuracy, the influence of low-accuracy data on the abnormality determination can be reduced by, for example, applying weights when the abnormality score is calculated. One possible weighting method is to count a point as 0.5 instead of 1 in the calculation of the abnormality score when its variance σ^2 is equal to or greater than a specified value. In this way, the abnormality score calculation unit 106 may calculate the abnormality score based on the variance of the probability distribution calculated by the prediction distribution calculation unit 104.
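The score calculation, including the optional variance-based weighting, can be sketched as below. The function name `abnormality_score`, its parameters, and all numeric values are hypothetical. Applying the 0.5 weight only to out-of-interval points is one reading of the text, stated here as an assumption.

```python
def abnormality_score(measured, intervals, variances=None, var_limit=None,
                      low_weight=0.5):
    """Share of test points whose measured value is outside its interval.

    If `variances` and `var_limit` are both given, an out-of-interval
    point whose predictive variance is at or above `var_limit` counts
    as `low_weight` instead of 1 (assumed interpretation of the text).
    """
    total = 0.0
    for i, (y, (low, high)) in enumerate(zip(measured, intervals)):
        if low <= y <= high:
            continue  # inside the credible interval: contributes nothing
        uncertain = (variances is not None and var_limit is not None
                     and variances[i] >= var_limit)
        total += low_weight if uncertain else 1.0
    return total / len(measured)


# Six test points, five of them outside the interval, as in FIG. 9.
# The measured values and interval bounds are invented for illustration.
measured = [1.0, 5.0, 6.0, 7.0, 8.0, 9.0]
intervals = [(0.0, 2.0)] * 6
score = abnormality_score(measured, intervals)
print(round(score, 2))  # 0.83

# Same data, but the last prediction has a large variance.
variances = [0.1, 0.1, 0.1, 0.1, 0.1, 9.0]
weighted = abnormality_score(measured, intervals, variances, var_limit=1.0)
print(weighted)  # 0.75
```

Dividing by the total number of test points keeps the score in the range 0.0 to 1.0 with or without weighting.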

Returning to the description of FIG. 6, after step S10, the abnormality detection unit 107 outputs the abnormality determination result according to the abnormality score (step S11). For example, the abnormality detection unit 107 determines the state to be normal when the abnormality score is 0.0 or more and less than 0.5, an abnormality requiring caution when the abnormality score is 0.5 or more and less than 0.7, and an abnormality requiring a warning when the abnormality score is 0.7 or more. Although the caution level is treated here as a kind of abnormality, only abnormalities requiring a warning may be defined as abnormal. The abnormality detection unit 107 transmits the abnormality determination result to another device by e-mail via the network I/F 207, or displays it on the display 210 via the display I/F 205. When the determination result is an abnormality requiring a warning, the abnormality detection unit 107 may issue an alarm through the alarm output device 206. The abnormality detection unit 107 may also treat the transition of the abnormality score as time-series data and display it as a trend graph on the display 210 via the display I/F 205.
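The three-level determination in step S11 maps directly onto a small function. The thresholds 0.5 and 0.7 are the example values given in the text; the function name `classify` and the label strings are hypothetical.

```python
def classify(score, caution=0.5, warning=0.7):
    """Map an abnormality score in [0.0, 1.0] to a determination result."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("abnormality score must be between 0.0 and 1.0")
    if score < caution:
        return "normal"
    if score < warning:
        return "caution"
    return "warning"


print(classify(0.2))    # normal
print(classify(0.5))    # caution
print(classify(5 / 6))  # warning
```

Keeping the thresholds as parameters makes it easy to switch to the alternative mentioned in the text, where only the warning level is treated as abnormal.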

In the example described above, the abnormality detection unit 107 determines an abnormality using the abnormality score, but the determination method is not limited to this example. Any method that uses the calculated credible interval or predicted value, in other words any method that determines an abnormality using the probability distribution, may be used. For example, the abnormality detection unit 107 may determine that there is an abnormality if even one measured value in the test section falls outside the credible interval. That is, the abnormality detection unit 107 only needs to determine an abnormality based on the probability distribution calculated by the prediction distribution calculation unit 104.

Because the test section has a plurality of points in the above example, predicted values were obtained as many times as there are points in the test section. When the test section has only one point, however, the determination in the first execution of step S8 is Yes, so the learning section is not updated. In other words, updating the learning section is not essential, and the substring generation unit 103 only needs to generate the substring of the learning section of the time-series data as the learning data. When the test section has a plurality of points, the substring generation unit 103 updates the learning section as described above.

The abnormality detection unit 107 may also display information including the credible intervals and the abnormality score shown in FIG. 9 on the display 210 via the display I/F 205. Data such as the credible intervals and the abnormality score may also be transmitted to an external display device via the network I/F 207 and displayed there. By keeping this information displayed at all times on the display 210 or an external display device, workers on a factory line or the like can check in real time for abnormalities and signs of abnormalities.

As described above, when there are a plurality of time-series data, the processing shown in FIG. 6 may be performed for each time-series data or only for specific time-series data. Likewise, when time-series data is extracted for each control condition, such as the value of a command value, and the processing shown in FIG. 6 is performed, the processing may be performed for each value of the command value or only for a specific command value.

When real-time operation is not required, the information described above may be recorded and displayed periodically as a graph. In the present embodiment, gaps in the time-series data can be handled, for example by weighting when the abnormality score is calculated. Therefore, even when a single device switches among multiple types of instruction values according to the production plan, the processing shown in FIG. 6 can be performed for each command value.
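How gaps might be absorbed when the abnormality score is calculated can be illustrated with the sketch below. The weighting scheme is not specified in this passage, so the sketch assumes the simplest option: missing observations (`None`) are skipped and the count of points outside the credible interval is divided by the number of points actually observed. It also assumes a Gaussian predictive distribution and a 95% credible level. The function names are hypothetical.

```python
from statistics import NormalDist


def credible_interval(mu, sigma, level=0.95):
    """Central credible interval of a Gaussian N(mu, sigma^2) (assumed form)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return mu - z * sigma, mu + z * sigma


def abnormality_score(observed, predictions, level=0.95):
    """Fraction of observed test points outside their credible interval.

    `observed` may contain None for missing samples; those points are
    skipped and the result is normalised by the number of observed points.
    """
    outside = 0
    counted = 0
    for value, (mu, sigma) in zip(observed, predictions):
        if value is None:
            continue  # gap in the time-series data
        lower, upper = credible_interval(mu, sigma, level)
        counted += 1
        if not (lower <= value <= upper):
            outside += 1
    return outside / counted if counted else 0.0
```

Because the score is normalised by the number of observed points, series extracted for different command values can be compared even when they contain different numbers of gaps.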

The abnormality detection method of the present embodiment does not depend on the object of abnormality detection, such as factory equipment, on the type of sensor, or on the trend of the time-series data. Consequently, no evaluation is needed to set a threshold for determining abnormality for each object of abnormality detection, which reduces the workload of threshold setting. In addition, the present embodiment detects abnormalities on the basis of how the data changes, for example gradual changes in the time-series data or abrupt changes in its trend, and can therefore handle a wider variety of abnormalities than methods that detect abnormalities by simply comparing the time-series data with a threshold. Furthermore, by associating detailed information such as the type and cause of an abnormality with the abnormality scores before and after its occurrence, the method can also be used to diagnose the cause of abnormalities. This improves abnormality detection accuracy and reduces the effort required to investigate the causes of abnormalities.

The configurations described in the above embodiment are examples of the content of the present invention. They may be combined with other known techniques, and parts of them may be omitted or modified without departing from the gist of the present invention.

100 abnormality detection device, 101 data acquisition unit, 102 data division unit, 103 subsequence generation unit, 104 prediction distribution calculation unit, 105 credible interval calculation unit, 106 abnormality score calculation unit, 107 abnormality detection unit, 201 processor, 202 auxiliary storage device, 203 memory, 204 input I/F, 205 display I/F, 206 alarm output device, 207 network I/F, 209 input device, 210 display.

To solve the above problem and achieve the object, an abnormality detection device according to the present invention includes a data division unit that divides time-series data into a learning section and a test section, and a subsequence generation unit that generates, as learning data, the subsequence of the time-series data in the learning section. The abnormality detection device further includes a prediction distribution calculation unit that uses the learning data to obtain a probability distribution corresponding to a data point in the test section, and an abnormality detection unit that detects an abnormality using the probability distribution. The learning section is a section whose corresponding times in the time-series data precede those of the test section. The prediction distribution calculation unit obtains, as the probability distribution, the probability distribution corresponding to the data point following the learning section. The subsequence generation unit updates the learning section so that its corresponding times shift later, and generates the subsequence corresponding to the updated learning section as updated learning data. The prediction distribution calculation unit uses the updated learning data to obtain the probability distribution of an update point, which is the data point following the updated learning section. The updated learning section includes a predicted value of the test section calculated according to the probability distribution.

Claims

An abnormality detection device comprising:
a data division unit that divides time-series data into a learning section and a test section;
a subsequence generation unit that generates, as learning data, a subsequence of the time-series data in the learning section;
a prediction distribution calculation unit that uses the learning data to obtain a probability distribution corresponding to a data point in the test section; and
an abnormality detection unit that detects an abnormality using the probability distribution.

The abnormality detection device according to claim 1, further comprising:
a credible interval calculation unit that calculates, based on the probability distribution, a credible interval corresponding to a data point in the test section; and
an abnormality score calculation unit that uses the credible interval to calculate an abnormality score indicating the degree of deviation between the learning data and the time-series data of the test section,
wherein the abnormality detection unit detects an abnormality based on the abnormality score.

The abnormality detection device according to claim 2, wherein
the learning section is a section whose corresponding times in the time-series data precede those of the test section,
the prediction distribution calculation unit obtains, as the probability distribution, a probability distribution corresponding to the data point following the learning section,
the subsequence generation unit updates the learning section so that its corresponding times shift later, and generates the subsequence corresponding to the updated learning section as updated learning data,
the prediction distribution calculation unit uses the updated learning data to obtain the probability distribution of an update point, which is the data point following the updated learning section,
the credible interval calculation unit calculates the credible interval of the update point based on the probability distribution of the update point, and
the abnormality score calculation unit calculates the abnormality score based on the credible intervals corresponding to a plurality of data points in the test section and on the time-series data of the test section.

The abnormality detection device according to claim 3, wherein the updated learning section includes a predicted value of the test section calculated according to the probability distribution.

The abnormality detection device according to claim 3 or 4, wherein the abnormality score calculation unit calculates the abnormality score based on the number of data points, among the time-series data of the test section, that do not fall within the corresponding credible interval.

The abnormality detection device according to any one of claims 2 to 5, wherein the abnormality score calculation unit calculates the abnormality score based on the variance of the probability distribution calculated by the prediction distribution calculation unit.

An abnormality detection method performed by an abnormality detection device, the method comprising:
a first step of dividing time-series data into a learning section and a test section;
a second step of generating, as learning data, a subsequence of the time-series data in the learning section;
a third step of using the learning data to obtain a probability distribution corresponding to a data point in the test section; and
a fourth step of detecting an abnormality using the probability distribution.