JP2022169290A

JP2022169290A - Method for generating learning model, program, information processor, and method for generating learning data

Info

Publication number: JP2022169290A
Application number: JP2021075232A
Authority: JP
Inventors: 大輝岡田; Daiki Okada
Original assignee: Tokyo Electron Device Ltd
Current assignee: Tokyo Electron Device Ltd
Priority date: 2021-04-27
Filing date: 2021-04-27
Publication date: 2022-11-09
Anticipated expiration: 2041-04-27
Also published as: JP7015405B1

Abstract

To provide a method for generating a learning model that can reduce the load of preparing abnormal training data. SOLUTION: A method for generating a learning model according to one aspect includes the steps of: acquiring time-series data determined to be normal; generating abnormal data on the basis of the acquired time-series data; and executing processing of generating a learning model that outputs information about abnormalities when time-series data is input, on the basis of the time-series data determined to be normal, a normality label attached to that time-series data, the generated abnormal data, and an abnormality label attached to the abnormal data. SELECTED DRAWING: Figure 7

Description

The present invention relates to a method for generating a learning model, a program, an information processing apparatus, and a method for generating learning data.

In recent years, machine learning algorithms have been used for anomaly detection. For example, Patent Literature 1 discloses an anomaly detection device that can accurately detect anomalies from a plurality of data.

Patent Literature 1: JP 2021-038946 A

However, improving the accuracy of a learning model requires a large amount of training data for both normal and abnormal conditions, which is a problem.

In one aspect, an object is to provide a method for generating a learning model, and the like, that can reduce the burden of preparing abnormal training data.

A method for generating a learning model according to one aspect is characterized by causing the following processing to be executed: acquiring time-series data determined to be normal; generating abnormal data based on the acquired time-series data; and generating a learning model that outputs information about anomalies when time-series data is input, based on the time-series data determined to be normal and a normal label attached to that time-series data, and on the generated abnormal data and an abnormal label attached to that abnormal data.

In one aspect, the burden of preparing abnormal training data can be reduced.

FIG. 1 is a block diagram showing a configuration example of a server.
FIG. 2 is an explanatory diagram showing an example of the record layouts of a training data management DB and a learning model management DB.
FIG. 3 is an explanatory diagram illustrating the process of obtaining the average of a plurality of time-series data.
FIG. 4 is an explanatory diagram illustrating methods of generating abnormal data.
FIG. 5 is an explanatory diagram illustrating the process of generating multiple patterns of abnormal data.
FIG. 6 is an explanatory diagram showing an example of a screen for accepting a method of generating abnormal data.
FIG. 7 is a flowchart showing the processing procedure for generating an anomaly detection model.
FIG. 8 is a flowchart showing the processing procedure of a subroutine for generating abnormal data.
FIG. 9 is an explanatory diagram illustrating correction processing.
FIG. 10 is an explanatory diagram illustrating the calculation of the ratio between a maximum effective value and an average effective value.
FIG. 11 is a flowchart showing the processing procedure of a subroutine for generating abnormal data according to Embodiment 2.
FIG. 12 is a block diagram showing a configuration example of a server according to Embodiment 3.
FIG. 13 is an explanatory diagram of the learning processing of a generative model.
FIG. 14 is a flowchart showing the processing procedure for generating a generative model.
FIG. 15 is a flowchart showing the processing procedure for generating an anomaly detection model according to Embodiment 3.

Hereinafter, the present invention will be described in detail with reference to the drawings showing its embodiments.

(Embodiment 1)
Embodiment 1 relates to a mode in which a learning model that outputs information about anomalies when time-series data is input is generated based on time-series data determined to be normal and on abnormal data generated from that time-series data. Time-series data is data in which measured values at each of a plurality of consecutive times are arranged in chronological order, or a set of such data.

The system of this embodiment includes an information processing device 1. The information processing device 1 processes, stores, transmits, and receives various kinds of information, and is, for example, a server device, a personal computer, or a general-purpose tablet PC. In this embodiment, the information processing device 1 is assumed to be a personal computer that detects anomalies based on time-series data, and is hereinafter referred to as the computer 1 for brevity.

The computer 1 according to this embodiment acquires time-series data determined to be normal and generates abnormal data based on the acquired time-series data. The computer 1 assigns a normal label to the time-series data determined to be normal and an abnormal label to the abnormal data. Based on the time-series data determined to be normal and its normal label, and on the abnormal data and its abnormal label, the computer 1 generates a learning model that outputs information about anomalies when time-series data is input. The generated learning model is deployed to the computer 1, or to an information terminal device or controller (not shown) attached to a machine tool or the like.

FIG. 1 is a block diagram showing a configuration example of the computer 1. The computer 1 includes a control unit 11, a storage unit 12, an input unit 13, a display unit 14, a reading unit 15, and a large-capacity storage unit 16. These components are connected by a bus B.

The control unit 11 includes arithmetic processing units such as a CPU (Central Processing Unit), an MPU (Micro-Processing Unit), and a GPU (Graphics Processing Unit), and performs various information processing, control processing, and the like for the computer 1 by reading and executing the control program 1P stored in the storage unit 12. Although FIG. 1 shows the control unit 11 as a single processor, it may be a multiprocessor.

The storage unit 12 includes memory elements such as RAM (Random Access Memory) and ROM (Read Only Memory), and stores the control program 1P, data, and the like that the control unit 11 needs to execute processing. The storage unit 12 also temporarily stores data and the like that the control unit 11 needs to perform arithmetic processing.

The input unit 13 is an input device such as a mouse, keyboard, touch panel, or buttons, and outputs the received operation information to the control unit 11. The display unit 14 is a liquid crystal display, an organic EL (electroluminescence) display, or the like, and displays various kinds of information according to instructions from the control unit 11.

The reading unit 15 reads a portable storage medium 1a, including a CD (Compact Disc)-ROM or a DVD (Digital Versatile Disc)-ROM. The control unit 11 may read the control program 1P from the portable storage medium 1a via the reading unit 15 and store it in the large-capacity storage unit 16. Alternatively, the control unit 11 may download the control program 1P from another computer via a network or the like and store it in the large-capacity storage unit 16. The control unit 11 may also read the control program 1P from a semiconductor memory 1b.

The large-capacity storage unit 16 includes a recording medium such as an HDD (hard disk drive) or an SSD (solid state drive). The large-capacity storage unit 16 holds an anomaly detection model 161, a training data management DB (database) 162, a learning model management DB 163, and a training data file 164.

The anomaly detection model 161 is an anomaly detector that outputs information about anomalies based on time-series data, and is a trained model generated by machine learning. The training data management DB 162 stores management information for the training data (learning data) used to build the anomaly detection model 161. The learning model management DB 163 stores information about trained anomaly detection models 161. The training data file 164 stores the training data.

In this embodiment, the storage unit 12 and the large-capacity storage unit 16 may be configured as a single storage device. The large-capacity storage unit 16 may also consist of a plurality of storage devices, or may be an external storage device connected to the computer 1.

The computer 1 may execute the various information processing, control processing, and the like on a single computer, distribute them across a plurality of computers, or distribute them across virtual machines. The various information processing and control processing for the computer 1 may also be executed by a server device or the like that has a communication environment.

FIG. 2 is an explanatory diagram showing an example of the record layouts of the training data management DB 162 and the learning model management DB 163.
The training data management DB 162 includes a training ID column, a file name column, and a registration date/time column. The training ID column stores a unique ID that identifies each piece of training data. The file name column stores the name of the file containing the time-series data. The file name may be stored together with the file path. The registration date/time column stores the date and time at which the training data was registered.

The learning model management DB 163 includes a model ID column, a learning model column, and a generation date/time column. The model ID column stores a unique ID that identifies each trained anomaly detection model 161. The learning model column stores the file of the trained anomaly detection model 161. The generation date/time column stores the date and time at which the anomaly detection model 161 was generated.

The storage form of each DB described above is only an example; other storage forms may be used as long as the relationships between the data are maintained.
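The two record layouts can be sketched as plain data classes. This is only an illustration: the class names, field names, field types, and sample values below are assumptions, since the description specifies the columns but not their names or types.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TrainingDataRecord:
    """One row of the training data management DB 162 (illustrative names)."""
    training_id: str         # unique ID of the training data
    file_name: str           # file name, optionally with its path
    registered_at: datetime  # registration date and time


@dataclass
class LearningModelRecord:
    """One row of the learning model management DB 163 (illustrative names)."""
    model_id: str            # unique ID of the trained anomaly detection model
    model_file: bytes        # serialized trained model file
    generated_at: datetime   # generation date and time


# Hypothetical sample row.
row = TrainingDataRecord("T0001", "data/normal_001.xlsx", datetime(2021, 4, 27, 9, 0))
```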

The training data file 164 stores the training data used to build the anomaly detection model 161. Specifically, the training data file 164 stores time-series data together with the type of label attached to that time-series data. The label types include, for example, "normal" and "abnormal". The label types are not limited to "normal" and "abnormal" and may be subdivided according to the types of anomaly that actually occur. The training data file 164 may be, for example, an EXCEL (registered trademark) file with the extension XLS or XLSX, or a file in a user-defined format.

Next, the process of outputting information about anomalies based on time-series data is described. In this embodiment, information about anomalies is output using the anomaly detection model 161. The anomaly detection model 161 is generated using a machine learning algorithm such as OneClassSVM (One Class Support Vector Machine). OneClassSVM is an algorithm that detects outliers relative to values learned, without supervision, from non-defective samples; it is a model that maps normal data to non-negative values and abnormal data to negative values.

In OneClassSVM, all training data are assigned to cluster 1 and only the origin to cluster -1, and a technique called the kernel trick is used to map the data into a high-dimensional feature space. Because the training data are mapped far from the origin, time-series data that do not resemble the original training data gather near the origin. This property makes it possible to distinguish normal data from abnormal data.

The computer 1 performs machine learning using the training data to generate the anomaly detection model 161. Using OneClassSVM as the machine learning algorithm, the computer 1 trains on normal data (time-series data from normal operation) and abnormal data (time-series data from abnormal operation) to determine the decision boundary that separates out abnormal values. The computer 1 then generates the anomaly detection model 161, which detects anomalies with reference to that decision boundary.

Specifically, the computer 1 uses OneClassSVM to learn, without supervision, a separating hyperplane between the two classes, normal and abnormal. The machine learning algorithm has parameters called hyperparameters, which include nu (ν) and gamma (γ). The parameter ν relates to the proportion of abnormal data contained in the training data. The parameter γ determines the complexity of the decision boundary; the larger γ is, the more complex the boundary becomes.

In OneClassSVM, the parameter ν specifies the proportion of outliers in the training data, and the algorithm learns the separating hyperplane that maximizes the margin between the normal data and the origin in the feature space. Mapping the feature space with an RBF kernel (radial basis function kernel), which takes the parameter γ, yields a nonlinear separating hyperplane.
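As a minimal sketch of how ν and γ appear in practice, the following uses scikit-learn's `OneClassSVM`. The choice of scikit-learn, the synthetic three-dimensional feature vectors, and the specific values `nu=0.05` and `gamma=0.1` are all assumptions for illustration; the description does not name a library or parameter values, and here the model is fitted on normal data only.

```python
import random

from sklearn.svm import OneClassSVM

random.seed(0)
# Hypothetical 3-dimensional feature vectors taken from normal time slots.
normal_features = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(200)]

# nu: upper bound on the fraction of training points treated as outliers.
# gamma: RBF kernel coefficient; larger values give a more complex boundary.
model = OneClassSVM(kernel="rbf", nu=0.05, gamma=0.1)
model.fit(normal_features)

# A point far from every training sample should fall outside the boundary.
far_point = [[10.0, 10.0, 10.0]]
score = model.decision_function(far_point)[0]  # signed value; negative = outside
label = model.predict(far_point)[0]            # +1 = normal, -1 = abnormal
```

Because the RBF kernel value between the far point and every training sample is close to zero, the decision value reduces to roughly the negative offset of the hyperplane, so the point is classified as abnormal.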

When an outlier detection algorithm such as OneClassSVM is used, the generated abnormal data may be included during training in the proportion specified by a parameter such as ν. Alternatively, the separating hyperplane may be learned from normal data alone, on the assumption that naturally acquired data already contains outliers in that proportion. Similarly, on the assumption that natural data contains a certain proportion of outliers, the proportion of outliers in the sample data may be adjusted by repeated bootstrap sampling.

Based on the acquired training data, which contains both normal data and abnormal data, the computer 1 optimizes the hyperparameters of the anomaly detection model 161 by search. Optimizing the hyperparameters allows the computer 1 to further improve its anomaly detection accuracy.

The computer 1 receives a feature vector for each time slot and outputs a normal/abnormal detection result for each. Specifically, the computer 1 outputs 1 for a normal detection result and -1 for an abnormal detection result. The computer 1 may output 1 when the real value obtained while computing the detection result is 0 or more, and -1 when that value is less than 0.
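The thresholding rule described above can be written in a few lines. The function name and the sample decision values are hypothetical.

```python
def to_detection_result(decision_value: float) -> int:
    """Return 1 (normal) for a value of 0 or more, and -1 (abnormal) otherwise."""
    return 1 if decision_value >= 0 else -1


# Hypothetical real values computed for three time slots.
results = [to_detection_result(v) for v in (0.8, 0.0, -0.3)]
```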

Although this embodiment describes the anomaly detection model 161 as a OneClassSVM, the anomaly detection model 161 is not limited to OneClassSVM. It may be a trained model built with any learning algorithm, such as Isolation Forest, LOF (Local Outlier Factor), CNN (Convolutional Neural Network), RNN (Recurrent Neural Network), a Bayesian network, or a regression tree. A neural network based on LSTM (Long Short-Term Memory), a Transformer, or the like may also be used.

Next, the process of creating the training data is described. In this embodiment, the computer 1 acquires time-series data determined to be normal and generates abnormal data based on the acquired time-series data. The process of generating the abnormal data is described later. The computer 1 assigns a normal label to the time-series data determined to be normal and an abnormal label to the abnormal data.

The normal label may be, for example, "normal", indicating that there is no anomaly. The abnormal label may be, for example, "abnormal", indicating that there is an anomaly, or it may be an anomaly name classified according to the type of anomaly. By labeling each of the plurality of time-series data, the computer 1 creates the training data used to train the anomaly detection model 161.
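The labeling step can be sketched as pairing each series with a label. The function name, the dictionary layout for abnormal data, and the anomaly names "spike" and "trend" are assumptions made for this example.

```python
def build_training_data(normal_series, abnormal_series):
    """Pair every series with its label.

    abnormal_series maps an anomaly name to a list of generated series.
    """
    data = [(series, "normal") for series in normal_series]
    for anomaly_name, series_list in abnormal_series.items():
        data.extend((series, anomaly_name) for series in series_list)
    return data


# Hypothetical short series standing in for real measurements.
normal = [[0.0, 0.1, 0.0], [0.1, 0.0, 0.1]]
abnormal = {"spike": [[0.0, 5.0, 0.0]], "trend": [[0.0, 0.5, 1.0]]}
training_data = build_training_data(normal, abnormal)
labels = [label for _, label in training_data]
```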

Next, the process of generating abnormal data based on time-series data is described in detail. First, the computer 1 obtains representative data for a plurality of time-series data. The representative data is, for example, the mean of the plurality of time-series data, their median, or their mode (the value that occurs most frequently in the time-series data). Data that the user designates as the most normal among the plurality of time-series data may also be used as the representative data. Although this embodiment describes an example in which the representative data is the average of a plurality of time-series data, the same approach applies to other kinds of representative data and to cases where there are multiple representative data.

FIG. 3 is an explanatory diagram illustrating the process of obtaining the average of a plurality of time-series data. FIG. 3A illustrates the plurality of time-series data used as the original data. The computer 1 acquires the plurality of time-series data serving as the original data. The time-series data is, for example, data from a three-axis acceleration sensor attached to a machine tool. The time-series data may instead be data such as temperature, pressure, speed, blood pressure, or sales. The three-axis acceleration sensor detects acceleration in the X-axis, Y-axis, and Z-axis directions and outputs acceleration signals. As illustrated, graph 11a shows the acquired acceleration sensor data for the X-axis, Y-axis, and Z-axis. The horizontal axis of graph 11a represents time, which elapses toward the tip of the arrow. The vertical axis of graph 11a represents the magnitude of acceleration. The data for each of the three axes is acquired multiple times in different time periods (for example, t0 to t1, t2 to t3, and t4 to t5).

The computer 1 calculates the average of the acquired acceleration sensor data for each axis. Specifically, the computer 1 averages the X-axis acceleration sensor data from the respective time periods, and does the same for the Y-axis and the Z-axis. As illustrated, graph 11b shows the resulting averages of the three-axis acceleration sensor data. The horizontal and vertical axes of graph 11b are the same as those of graph 11a, so their description is omitted.
Through the above process, the average of a plurality of time-series data can be obtained.
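The per-axis averaging can be sketched as an element-wise mean over recordings of equal length. The function name and sample values are hypothetical, and the sketch assumes the recordings from the different time periods are already aligned sample by sample.

```python
from statistics import mean


def average_series(series_list):
    """Element-wise mean of equally long time series."""
    lengths = {len(s) for s in series_list}
    if len(lengths) != 1:
        raise ValueError("all series must have the same length")
    return [mean(values) for values in zip(*series_list)]


# Hypothetical X-axis recordings from three time periods.
x_windows = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
x_average = average_series(x_windows)
```

Swapping `mean` for `statistics.median` would give the median variant of the representative data; the same function would be applied separately to the Y-axis and Z-axis recordings.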

Next, the process by which the computer 1 generates abnormal data based on the average of a plurality of time-series data is described. Although this embodiment is described using the average of a plurality of time-series data (hereinafter, average data), the same process can be applied to generating abnormal data from the time-series data itself.

The methods of generating abnormal data include a first generation method, a second generation method, a third generation method, and a fourth generation method. The first generation method generates abnormal data by adding periodic fluctuations to the average data. The second generation method generates abnormal data by making the average data increase or decrease continuously over time. The third generation method generates abnormal data by adding to the average data a spike, that is, a value that suddenly exceeds a threshold within a unit time. The fourth generation method generates abnormal data by shifting the phase of the waveform of the average data.

Abnormal data can be generated from the average data using at least one of the first, second, third, and fourth generation methods, or a combination of them.

FIG. 4 is an explanatory diagram illustrating the methods of generating abnormal data. FIG. 4A illustrates the generation of abnormal data by the first generation method. Abnormal data can be generated by adding periodic fluctuations to the average data at a predetermined timing (point in time). The periodic fluctuation may recur continuously at equal intervals, or it may be produced by lengthening or shortening the period of a periodic fluctuation. The periodic fluctuation may also be produced by increasing or decreasing the amplitude of a periodic fluctuation.

For example, the computer 1 may repeat the fluctuation at a predetermined period (for example, 20 ms) from the anomaly start timing (for example, 550 ms) to the anomaly end timing (for example, 1000 ms). Alternatively, the computer 1 may reduce the amplitude of the normal fluctuation to 0.5 times its original value between the anomaly start timing and the anomaly end timing.
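A minimal sketch of the first generation method follows. It assumes one sample per millisecond, a sine wave as the periodic fluctuation, and a flat stand-in for the average data; none of these are specified in the description, and the function name is hypothetical.

```python
import math


def add_periodic_fluctuation(average, start, end, period, amplitude):
    """Add a sine wave with the given period (in samples) to average[start:end]."""
    out = list(average)
    for i in range(start, min(end, len(out))):
        out[i] += amplitude * math.sin(2.0 * math.pi * (i - start) / period)
    return out


average = [0.0] * 1000  # stand-in for the average data, 1 sample per ms (assumed)
abnormal = add_periodic_fluctuation(average, start=550, end=1000, period=20, amplitude=1.0)
```

Varying `period` or `amplitude` gives the longer/shorter-period and larger/smaller-amplitude variants mentioned above.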

FIG. 4B illustrates the generation of abnormal data by the second generation method. Abnormal data can be generated by making the average data increase (rise) or decrease (fall) continuously over time. As illustrated, the computer 1 generates abnormal data by adding a decreasing trend with a predetermined slope coefficient to the average data from the anomaly start timing (for example, 600 ms) to the anomaly end timing (for example, 1000 ms). Abnormal data may also be generated from a combination of increasing and decreasing trends.
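The second generation method can be sketched as adding a linear ramp inside the anomaly window. The linear shape, the one-sample-per-millisecond assumption, the slope value, and the function name are illustrative choices, not taken from the description.

```python
def add_trend(average, start, end, slope):
    """Add slope * (elapsed samples) to average[start:end]; negative slope = decrease."""
    out = list(average)
    for i in range(start, min(end, len(out))):
        out[i] += slope * (i - start)
    return out


average = [1.0] * 1000  # stand-in for the average data, 1 sample per ms (assumed)
abnormal = add_trend(average, start=600, end=1000, slope=-0.01)
```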

FIG. 4C illustrates the generation of abnormal data by the third generation method. Abnormal data can be generated by adding to the average data a spike, that is, a value that suddenly exceeds a threshold within a unit time. As illustrated, the computer 1 adds to the average data a spike (a protruding shape) on the positive side with a predetermined first spike width at anomaly timing t1, and another spike on the positive side with a predetermined second spike width at anomaly timing t2. Adding spikes at different timings produces spike-shaped distortion and thereby generates abnormal data.

The spikes need not be added on the positive side. For example, the computer 1 may generate abnormal data by adding a plurality of spikes on the negative side, or by adding a plurality of spikes on both the positive and negative sides.

In this way, multiple patterns of abnormal data can be generated by the third generation method by varying at least one of the anomaly timing, the spike width, and the number of spikes, or a combination of them.
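The third generation method can be sketched as adding rectangular spikes described by timing, width, and height. The rectangular shape, the tuple layout, and the sample values are assumptions for illustration.

```python
def add_spikes(average, spikes):
    """Add spikes given as (start_index, width, height) tuples.

    A negative height produces a spike on the negative side.
    """
    out = list(average)
    for start, width, height in spikes:
        for i in range(start, min(start + width, len(out))):
            out[i] += height
    return out


average = [0.0] * 100  # stand-in for the average data
# One positive spike of width 2 and one negative spike of width 4.
abnormal = add_spikes(average, [(20, 2, 5.0), (60, 4, -3.0)])
```

Changing the list of tuples changes the anomaly timing, spike width, and number of spikes, which yields the different patterns described above.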

FIG. 4D illustrates the generation of abnormal data by the fourth generation method. Abnormal data can be generated by shifting the phase of the waveform of the time-series data. The phase is the position information of a periodically varying wave. As illustrated, the computer 1 generates abnormal data by shifting the phase of the waveform of the average data by a predetermined shift amount (for example, 500 ms) along the horizontal (time) axis.
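A minimal sketch of the fourth generation method follows. It assumes a circular shift, so samples pushed past the end wrap around to the start; the description does not say how the vacated samples are filled. The one-sample-per-millisecond rate and the sine waveform are also assumptions.

```python
import math


def shift_phase(average, shift):
    """Circularly delay the waveform by `shift` samples."""
    n = len(average)
    if n == 0:
        return []
    shift %= n
    if shift == 0:
        return list(average)
    return list(average[-shift:]) + list(average[:-shift])


# One full sine cycle over 1000 samples (1 sample per ms, assumed).
wave = [math.sin(2.0 * math.pi * i / 1000) for i in range(1000)]
shifted = shift_phase(wave, 500)  # 500 ms shift
```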

Note that the computer 1 may accept parameter settings from the user for executing each abnormal data generation method. For example, the computer 1 accepts the user's settings for the parameters of the first generation method (time point, period, amplitude, and the like), and generates abnormal data based on the accepted parameters.

Abnormal data can also be generated based on any combination of the first, second, third, and fourth generation methods. For example, the computer 1 may generate abnormal data based on a combination of the first and second generation methods. Specifically, based on the first generation method, the computer 1 adds to the average data a variation that repeats at a predetermined period (for example, 20 ms) from abnormality start timing t1 to abnormality end timing t2, thereby generating intermediate abnormal data. Then, based on the second generation method, the computer 1 adds an increasing or decreasing trend with a predetermined slope coefficient to the intermediate abnormal data from abnormality start timing t3 to abnormality end timing t4, thereby generating the final abnormal data.
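The two-stage combination above can be sketched as follows. The sinusoidal form of the periodic variation, the linear ramp confined to the window t3 to t4, and the helper names `add_periodic` and `add_trend` are assumptions for illustration; timings and the period are expressed in sample indices rather than milliseconds.

```python
import math


def add_periodic(data, start, end, period, amplitude):
    """First method (sketch): add a sine of the given period on [start, end)."""
    out = list(data)
    for i in range(start, min(end, len(out))):
        out[i] += amplitude * math.sin(2 * math.pi * (i - start) / period)
    return out


def add_trend(data, start, end, slope):
    """Second method (sketch): add a linear ramp with the given slope on [start, end)."""
    out = list(data)
    for i in range(start, min(end, len(out))):
        out[i] += slope * (i - start)
    return out


average = [1.0] * 12
# t1=2, t2=6: periodic variation; t3=8, t4=12: increasing trend.
intermediate = add_periodic(average, 2, 6, period=4, amplitude=0.5)
final = add_trend(intermediate, 8, 12, slope=0.1)
print(final)
```

A negative `slope` would give a decreasing trend instead.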

Further, by adding colored noise (colors of noise) to generated abnormal data, abnormal data with a pattern different from that abnormal data (second abnormal data) can be generated. Colored noise is noise whose power spectral density is not flat. For example, the second abnormal data is generated by adding colored noise such as pink noise, whose power spectrum magnitude is inversely proportional to frequency, or brown noise, whose power spectrum magnitude is inversely proportional to the square of frequency. The noise is not limited to colored noise; the second abnormal data may instead be generated by adding white noise, which has a flat power spectrum, to the abnormal data.
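As a rough sketch of this noise step, brown noise can be approximated as the running sum of white Gaussian noise, which gives a power spectrum that falls off roughly as the inverse square of frequency. The function names, the noise scale `sigma`, and the seeded generator are assumptions for the example; pink noise needs a more involved filter and is not shown.

```python
import random


def white_noise(n, sigma, rng):
    """n samples of zero-mean Gaussian white noise."""
    return [rng.gauss(0.0, sigma) for _ in range(n)]


def brown_noise(n, sigma, rng):
    """Running sum of white noise (power falls off roughly as 1/f^2)."""
    out, level = [], 0.0
    for step in white_noise(n, sigma, rng):
        level += step
        out.append(level)
    return out


def add_noise(abnormal, noise):
    """Second abnormal data: abnormal data plus a noise series of equal length."""
    return [a + e for a, e in zip(abnormal, noise)]


rng = random.Random(0)  # fixed seed so the example is reproducible
abnormal = [0.0, 0.0, 5.0, 0.0]
second_abnormal = add_noise(abnormal, brown_noise(len(abnormal), 0.1, rng))
print(second_abnormal)
```

Passing `white_noise(...)` instead of `brown_noise(...)` to `add_noise` corresponds to the white-noise variant mentioned above.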

Furthermore, abnormal data can be generated by applying a plurality of types of abnormal data generation methods to the average data at different timings.

FIG. 5 is an explanatory diagram of the process of generating a plurality of patterns of abnormal data. Although FIG. 5 describes an example in which abnormal data is generated based on the third generation method, the same approach applies to the other generation methods.

The third generation method generates abnormal data by adding to the average data a spike whose value suddenly exceeds a threshold within a unit time. For example, the computer 1 adds spike 12a and spike 12b to part of the time range of the average data to generate a first pattern of abnormal data. The computer 1 then shifts the time ranges in which spike 12a and spike 12b were added in the first pattern relative to each other, thereby generating a second pattern and a third pattern of abnormal data.
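Generating several patterns by moving the spike positions can be sketched as below. The per-spike offsets, the rectangular spike shape, and the name `spike_patterns` are assumptions for illustration; the source only states that the time ranges of the two spikes are shifted.

```python
def spike_patterns(average, spikes, offsets_per_pattern):
    """Return one abnormal series per entry of `offsets_per_pattern`.

    `spikes` is a list of (start_index, width, height) tuples and each entry
    of `offsets_per_pattern` holds one index offset per spike (hypothetical
    parameterization).
    """
    patterns = []
    for offsets in offsets_per_pattern:
        out = list(average)
        for (start, width, height), offset in zip(spikes, offsets):
            first = max(start + offset, 0)
            last = min(start + offset + width, len(out))
            for i in range(first, last):
                out[i] += height
        patterns.append(out)
    return patterns


average = [0.0] * 8
spikes = [(1, 1, 4.0), (4, 1, 2.0)]  # stand-ins for spikes 12a and 12b
# First pattern: no shift; second and third patterns: shifted time ranges.
patterns = spike_patterns(average, spikes, [(0, 0), (1, 2), (2, 1)])
for p in patterns:
    print(p)
```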

FIG. 6 is an explanatory diagram showing an example of a screen for accepting the abnormal data generation method. The screen includes a generation method selection field 13a and a setting button 13b. The generation method selection field 13a accepts the selection of one or more abnormal data generation methods. The setting button 13b is used to set the abnormal data generation method.

The computer 1 generates an acceptance screen on which the first generation method, the second generation method, the third generation method, the fourth generation method, and a combination generation method that combines these methods can be selected. The computer 1 displays the generated acceptance screen. As illustrated, the first generation method (frequency change), the second generation method (trend), the third generation method (spike), the fourth generation method (phase change), and the combination generation method (compound change) are displayed in the generation method selection field 13a.

When the computer 1 accepts a selection operation in the generation method selection field 13a, it accepts the selection of an abnormal data generation method. When the computer 1 accepts a touch (click) operation on the setting button 13b, it stores the generation method selected in the generation method selection field 13a in the storage unit 12 or the large-capacity storage unit 16. The stored generation method is then used during the abnormal data generation process.

FIG. 7 is a flowchart showing the processing procedure for generating the anomaly detection model 161. The control unit 11 of the computer 1 acquires, via the input unit 13, a plurality of time-series data determined to be normal (step S101). The control unit 11 accepts, via the input unit 13, the selection of an abnormal data generation method (step S102). The control unit 11 executes a subroutine that generates abnormal data based on the acquired time-series data and the accepted generation method (step S103). The abnormal data generation subroutine is described later.

The control unit 11 creates training data based on the plurality of time-series data and the generated abnormal data (step S104). Specifically, the control unit 11 assigns a normal label (for example, "normal") to the time-series data and an abnormal label (for example, "abnormal") to the abnormal data.
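The labeling in step S104 amounts to pairing each series with a label, as in the sketch below. The function name `make_training_data` and the (series, label) tuple layout are assumptions for illustration; the label strings follow the examples given in the text.

```python
def make_training_data(normal_series, abnormal_series):
    """Pair every series with its label: 'normal' or 'abnormal'."""
    labeled = [(series, "normal") for series in normal_series]
    labeled += [(series, "abnormal") for series in abnormal_series]
    return labeled


normal = [[0.0, 0.1, 0.0], [0.1, 0.0, 0.1]]
generated_abnormal = [[0.0, 5.0, 0.0]]
training_data = make_training_data(normal, generated_abnormal)
for series, label in training_data:
    print(label, series)
```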

The control unit 11 stores the plurality of time-series data and the generated abnormal data as training data (learning data) in the training data file 164 (step S105). Specifically, the control unit 11 creates the training data file 164. The control unit 11 assigns a label name (for example, "normal") to each time-series data, and writes the time-series data and the label name to the training data file 164 in association with each other. The control unit 11 likewise assigns a label name (for example, "abnormal") to each abnormal data, and writes that data and its label name to the training data file 164 in association with each other. The control unit 11 stores the file containing the associated data and label names in the large-capacity storage unit 16.

The control unit 11 stores management information for managing the training data file 164 in the training data management DB 162 of the large-capacity storage unit 16 (step S106). Specifically, the control unit 11 assigns a training ID and stores the name of the file containing the training data and the registration date and time as one record in the training data management DB 162.

The control unit 11 generates the anomaly detection model 161 using the created training data (step S107). Specifically, the control unit 11 uses OneClassSVM to perform machine learning on the plurality of time-series data serving as training data, thereby determining a decision boundary that separates them from abnormal values, and generates an anomaly detection model 161 capable of detecting abnormalities with reference to that decision boundary.

The control unit 11 stores the generated anomaly detection model 161 in the learning model management DB 163 of the large-capacity storage unit 16 (step S108), and ends the series of processes. Specifically, the control unit 11 assigns a model ID to the generated anomaly detection model 161 and, in association with that model ID, stores the file of the anomaly detection model 161 and its generation date and time as one record in the learning model management DB 163.

FIG. 8 is a flowchart showing the processing procedure of the subroutine for generating abnormal data. The control unit 11 of the computer 1 calculates the average of the plurality of time-series data (step S11). For example, when the time-series data is data from a three-axis acceleration sensor, the control unit 11 separately calculates the average of the X-axis acceleration sensor data, the average of the Y-axis acceleration sensor data, and the average of the Z-axis acceleration sensor data.
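Step S11 can be sketched as a pointwise average taken separately for each axis. The details of the averaging are described with FIG. 3, which is outside this section, so the assumption here is that all recordings are time-aligned and of equal length; the dictionary layout and function names are hypothetical.

```python
def average_series(series_list):
    """Pointwise mean of equally long, time-aligned series."""
    n = len(series_list)
    return [sum(values) / n for values in zip(*series_list)]


def average_per_axis(recordings):
    """Average each axis separately.

    `recordings` is a list of dicts of the form
    {"x": [...], "y": [...], "z": [...]} (hypothetical layout).
    """
    return {
        axis: average_series([rec[axis] for rec in recordings])
        for axis in ("x", "y", "z")
    }


recordings = [
    {"x": [1.0, 3.0], "y": [0.0, 2.0], "z": [9.0, 9.0]},
    {"x": [3.0, 5.0], "y": [2.0, 4.0], "z": [11.0, 9.0]},
]
average = average_per_axis(recordings)
print(average)
```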

The control unit 11 acquires the accepted abnormal data generation method (step S12). Using the acquired generation method, the control unit 11 generates abnormal data based on the calculated average of the plurality of time-series data (step S13). For example, when the generation method is the first generation method, the control unit 11 uses it to add a periodic variation at a predetermined timing to the average of the plurality of time-series data, thereby generating abnormal data. The control unit 11 then ends the abnormal data generation subroutine and returns.

This embodiment has described an example of generating the anomaly detection model 161 using normal data and abnormal data generated from that normal data, but the invention is not limited to this. For example, when the computer 1 acquires a plurality of abnormal data directly from the user, it may create training data based on the acquired abnormal data and the normal data. Alternatively, the computer 1 may use both the abnormal data acquired from the user and the abnormal data generated from the normal data, and create training data by combining them with the normal data. The created training data is used when generating or training the anomaly detection model 161.

According to this embodiment, it is possible to generate an anomaly detection model 161 that outputs information about abnormalities when time-series data is input.

According to this embodiment, it is possible to generate abnormal data based on time-series data determined to be normal.

According to this embodiment, a wide variety of abnormal data can be generated by using a plurality of abnormal data generation methods, or a combination method that combines them.

According to this embodiment, when only normal data can be obtained or abnormal data is difficult to obtain, automatically generating abnormal data helps in generating the anomaly detection model 161.

(Embodiment 2)
Embodiment 2 relates to a mode in which correction processing is performed on representative data of a plurality of time-series data. Description of content that overlaps with Embodiment 1 is omitted. This embodiment describes an example in which the representative data is the average of a plurality of time-series data, but it applies equally to other types of representative data and to cases where there are a plurality of representative data.

FIG. 9 is an explanatory diagram of the correction processing. FIG. 9A is an explanatory diagram of the average of a plurality of time-series data. The time-series data is, for example, data from a three-axis acceleration sensor. Based on a plurality of sets of three-axis acceleration sensor data, the computer 1 obtains the average of the acceleration sensor data for each axis. The process of obtaining this average is the same as in FIG. 3, so its description is omitted. As illustrated, the average of the X-axis acceleration sensor data, the average of the Y-axis acceleration sensor data, and the average of the Z-axis acceleration sensor data are shown in graph 14a. The horizontal axis of graph 14a represents time, and the vertical axis represents the magnitude of acceleration.

FIG. 9B is an explanatory diagram of the acceleration sensor data after the first correction. The computer 1 performs correction processing on the per-axis average of the plurality of acceleration sensor data. Specifically, the computer 1 first shifts the per-axis average so that its mean is 0, generating first-corrected acceleration sensor data for each axis whose mean is 0. As illustrated, the generated first-corrected acceleration sensor data for the X axis, the Y axis, and the Z axis are shown in graph 14b. The horizontal and vertical axes of graph 14b are the same as those of graph 14a, so their description is omitted.

FIG. 9C is an explanatory diagram of the acceleration sensor data after the second correction. From the first-corrected acceleration sensor data for each axis, the computer 1 generates second-corrected acceleration sensor data for that axis based on effective values.

First, the computer 1 calculates, for each axis, the ratio between the maximum effective value and the average effective value. FIG. 10 is an explanatory diagram of calculating this ratio. The effective value is the root mean square (RMS), a value indicating the effective magnitude of a signal that varies over time. As illustrated, for data x_i (i = 1, 2, ..., n) of a variable x, the root mean square RMS(x) of x is defined by formula 15a. The variable x is, for example, the X-axis acceleration sensor data.
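Formula 15a itself appears only in the drawing. For reference, the standard definition of the root mean square, which matches the description of the data x_i (i = 1, 2, ..., n) above, is given below; it is assumed, not confirmed by this section, that formula 15a takes this form.

```latex
\mathrm{RMS}(x) = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^{2}}
```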

The computer 1 acquires a plurality of sets of X-axis acceleration sensor data from different time periods (for example, t0 to t1, t2 to t3, and t4 to t5). As illustrated, the acquired acceleration sensor data for each time period is shown in graph 15b. The horizontal axis of graph 15b represents time, and the vertical axis represents the magnitude of X-axis acceleration. The computer 1 calculates the effective value of the acquired acceleration sensor data for each time period. Specifically, it calculates the effective value of the data for the period t0 to t1, for the period t2 to t3, and for the period t4 to t5. The computer 1 then takes the largest of the calculated effective values as the maximum effective value.

The computer 1 calculates the average of the acquired acceleration sensor data for the respective time periods. As illustrated, the calculated average is shown in graph 15c. The horizontal and vertical axes of graph 15c are the same as those of graph 15b, so their description is omitted. The computer 1 then calculates the effective value of this average (the average effective value).

The computer 1 calculates the ratio between the obtained maximum effective value of the X-axis acceleration sensor data and the calculated average effective value of the X-axis acceleration sensor data. For example, the ratio of the maximum effective value to the average effective value is 1:0.978. Although FIG. 10 describes an example using X-axis acceleration sensor data, the same applies to the Y-axis and Z-axis acceleration sensor data.
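The ratio calculation can be sketched as follows. Two points are assumptions: the ratio is expressed as the quotient of the maximum effective value over the average effective value (so the example ratio 1:0.978 would correspond to roughly 1.022), and the data from the different time periods are equally long so that they can be averaged pointwise. The function names are hypothetical.

```python
import math


def rms(values):
    """Root mean square (effective value) of a series."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def max_to_average_rms_ratio(segments):
    """Maximum per-segment RMS divided by the RMS of the pointwise average."""
    max_rms = max(rms(segment) for segment in segments)
    n = len(segments)
    averaged = [sum(values) / n for values in zip(*segments)]
    return max_rms / rms(averaged)


# Two hypothetical time periods of X-axis data.
segments = [[3.0, -3.0, 3.0, -3.0], [1.0, -1.0, 1.0, -1.0]]
ratio = max_to_average_rms_ratio(segments)
print(ratio)
```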

Next, returning to FIG. 9C, the computer 1 multiplies the first-corrected acceleration sensor data for each axis (FIG. 9B) by the calculated ratio of the maximum effective value to the average effective value for that axis. Through this multiplication, the computer 1 returns the first-corrected data for each axis (FIG. 9B), whose mean was set to 0, to the position of the original per-axis average of the plurality of acceleration sensor data (FIG. 9A), and thereby generates the second-corrected acceleration sensor data. As illustrated, the generated second-corrected acceleration sensor data for the X axis, the Y axis, and the Z axis are shown in graph 14c. The horizontal and vertical axes of graph 14c are the same as those of graph 14a, so their description is omitted.

FIG. 11 is a flowchart showing the processing procedure of the subroutine for generating abnormal data in Embodiment 2. The control unit 11 of the computer 1 calculates the average of the plurality of time-series data (step S21). This calculation is the same as the processing in step S11 of FIG. 8, so its description is omitted.

The control unit 11 shifts the calculated average of the plurality of time-series data so that its mean is 0, and generates first-corrected time-series data (step S22). The control unit 11 calculates the effective value of each time-series data (step S23) and obtains the maximum of the calculated effective values (step S24). The control unit 11 calculates the effective value of the average of the plurality of time-series data (step S25).

The control unit 11 calculates the ratio between the obtained maximum effective value and the calculated average effective value (step S26). The control unit 11 multiplies the first-corrected time-series data by the calculated ratio of the maximum effective value to the average effective value, thereby generating second-corrected time-series data (step S27).
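The two correction steps (S22 and S27) can be sketched as below, taking the ratio from step S26 as an input. The function name `correct` and the representation of the ratio as a single multiplier are assumptions for illustration.

```python
def correct(average, ratio):
    """Apply the first and second corrections to averaged time-series data.

    `ratio` is assumed to be the maximum effective value divided by the
    average effective value (step S26), passed in as a plain number.
    """
    mean = sum(average) / len(average)
    first = [v - mean for v in average]   # step S22: align the mean to 0
    second = [v * ratio for v in first]   # step S27: scale by the ratio
    return first, second


average = [1.0, 2.0, 3.0]
first_corrected, second_corrected = correct(average, ratio=2.0)
print(first_corrected)
print(second_corrected)
```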

The control unit 11 acquires the accepted abnormal data generation method (step S28). Using the acquired generation method, the control unit 11 generates abnormal data based on the generated second-corrected time-series data (step S29). The control unit 11 then ends the abnormal data generation subroutine and returns.

According to this embodiment, correction processing can be performed, using effective values, on representative data of a plurality of time-series data or on a set of representative data.

(Embodiment 3)
Embodiment 3 relates to a mode in which artificial intelligence is used to generate abnormal time-series data (hereinafter, abnormal data) from time-series data determined to be normal (hereinafter, normal data). Description of content that overlaps with Embodiments 1 and 2 is omitted.

FIG. 12 is a block diagram showing a configuration example of the computer 1 of Embodiment 3. Content that overlaps with FIG. 1 is given the same reference numerals, and its description is omitted. The large-capacity storage unit 16 stores a generative model 165. The generative model 165 is a generator that generates abnormal data based on normal data, and is a trained model generated by machine learning.

When the computer 1 acquires a plurality of normal data and a plurality of abnormal data, it generates the generative model 165 using the acquired normal data and abnormal data.

FIG. 13 is an explanatory diagram of the learning process for the generative model 165. In this embodiment, the computer 1 learns normal data and abnormal data using the GAN (Generative Adversarial Network) technique to generate the generative model 165. FIG. 13 conceptually illustrates the configuration of the GAN.

The GAN consists of a generator 16a that generates output data from input data, and a discriminator 16b that determines whether data generated by the generator 16a is real or fake. The generator 16a accepts random noise (a latent variable) as input and generates output data. The discriminator 16b learns to judge whether input data is real or fake, using real data provided for learning and data provided by the generator 16a. In the GAN, the generator 16a and the discriminator 16b learn in competition with each other, and the network is ultimately built so that the loss function of the generator 16a is minimized and the loss function of the discriminator 16b is maximized.
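The source does not state the loss functions it uses. For orientation only, the standard minimax objective from the GAN literature is shown below, where G is the generator, D is the discriminator, x is real data, and z is the random noise input; whether the embodiment uses exactly this form is an assumption.

```latex
\min_{G}\,\max_{D}\; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

In this formulation the generator tries to minimize the value function V while the discriminator tries to maximize it, which corresponds to the competitive learning described above.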

Although this embodiment uses a GAN as the technique for generating (training) the generative model 165, the generative model 165 is not limited to a trained model based on a GAN. It may be a trained model based on another learning technique, such as deep learning with U-Net (a U-shaped neural network) or a decision tree.

The computer 1 generates (builds) the generative model 165 based on training data that includes normal data and abnormal data. Specifically, the computer 1 inputs time-series data determined to be normal, or random noise data, to the generator 16a. The generator 16a generates fake abnormal data. In addition, the computer 1 inputs either true abnormal data obtained in advance or the fake abnormal data to the discriminator 16b. In this way, the generator 16a and the discriminator 16b are made to perform deep learning.

The computer 1 trains the generator 16a and the discriminator 16b alternately. First, the parameters of the two networks are initialized with random numbers. The computer 1 then causes the generator 16a to generate fake abnormal data from the time-series data or random noise data input to it. The computer 1 inputs either true abnormal data or the fake abnormal data generated by the generator 16a to the discriminator 16b, which performs discrimination.

In the initial stage of learning, the discriminator 16b is trained using backpropagation, and its parameters are adjusted. Backpropagation is a supervised learning algorithm that, for a network consisting of an input layer, intermediate layers, and an output layer, updates parameters such as the weight filters and biases of each layer by propagating the gradient of the error backward from the output layer to the input layer.

Once the discrimination error of the discriminator 16b becomes small, the computer 1 trains the generator 16a, likewise using backpropagation. By repeating the training of the generator 16a and the training of the discriminator 16b, the computer 1 improves the discriminating power of the discriminator 16b while also improving the ability of the generator 16a to generate fake abnormal data. In this way, a generative model 165 is generated that can output abnormal data when time-series data determined to be normal is input.

In the initial stage of learning, a plurality of true abnormal data obtained in advance, abnormal data generated from time-series data determined to be normal (Embodiment 1), or a combination of the two may be used. As the ability of the generator 16a to generate fake abnormal data improves, only the true abnormal data obtained in advance may be used.

FIG. 14 is a flowchart showing the processing procedure for generating the generative model 165. The control unit 11 of the computer 1 acquires, via the input unit 13, a group of training data for learning consisting of a plurality of normal data and a plurality of abnormal data (step S111). Using the acquired normal data and abnormal data, the control unit 11 generates a generative model 165 that outputs abnormal data when normal data is input (step S112). Specifically, as described above, the control unit 11 generates the generative model 165 using the GAN technique. The control unit 11 stores the generated generative model 165 in the learning model management DB 163 of the large-capacity storage unit 16 (step S113), and ends the series of processes.

図１５は、実施形態３の異常検知モデル１６１を生成する際の処理手順を示すフローチャートである。コンピュータ１の制御部１１は、複数の正常データを入力部１３により取得する（ステップＳ１２１）。制御部１１は、正常データを入力した場合に異常データを出力する生成モデル１６５を用いて、取得した各正常データに対して異常データを生成する（ステップＳ１２２）。制御部１１は、ステップＳ１２３～Ｓ１２７の処理を実行する。なお、ステップＳ１２３～Ｓ１２７の処理に関しては、図７のステップＳ１０４～１０８の処理と同様であるため、詳細な説明を省略する。 FIG. 15 is a flowchart showing the processing procedure for generating the anomaly detection model 161 of Embodiment 3. The control unit 11 of the computer 1 acquires a plurality of normal data through the input unit 13 (step S121). Using the generative model 165, which outputs abnormal data when normal data is input, the control unit 11 generates abnormal data for each acquired normal data (step S122). The control unit 11 then executes the processes of steps S123 to S127. These are the same as the processes of steps S104 to S108 in FIG. 7, so a detailed description is omitted.

本実施形態によると、生成モデル１６５を用いて、正常と判断された時系列データから異常データを生成することが可能となる。 According to this embodiment, using the generative model 165, it is possible to generate abnormal data from time-series data determined to be normal.

本実施形態によると、生成モデル１６５を用いて生成された異常データを用いて、異常検知モデル１６１を生成することが可能となる。なお、本実施形態における異常データの生成方法も実施形態１で述べた複数の異常データの生成方法の選択肢の一つとして、利用することができる。 According to this embodiment, the anomaly detection model 161 can be generated using the abnormal data generated by the generative model 165. The abnormal data generation method of this embodiment can also be used as one of the options among the plurality of abnormal data generation methods described in Embodiment 1.

以上の実施の形態１乃至３を含む実施形態に関し、さらに以下の付記を開示する。 Regarding the embodiments including Embodiments 1 to 3 above, the following appendices are further disclosed.

（付記１） (Appendix 1)
データを入力した場合に、異常データを生成するよう学習された生成モデルに、データを入力して異常データを生成する。 Abnormal data is generated by inputting data into a generative model trained to generate abnormal data when data is input.

（付記２） (Appendix 2)
前記生成モデルは、前記データ及び予め入手した真の異常データを用いて学習されている。 The generative model is trained using the data and true abnormal data obtained in advance.

今回開示された実施形態はすべての点で例示であって、制限的なものではないと考えられるべきである。本発明の範囲は、上記した意味ではなく、特許請求の範囲によって示され、特許請求の範囲と均等の意味及び範囲内でのすべての変更が含まれることが意図される。 The embodiments disclosed this time are illustrative in all respects and should be considered not restrictive. The scope of the present invention is indicated by the scope of the claims rather than the above-described meaning, and is intended to include all modifications within the scope and meaning equivalent to the scope of the claims.

REFERENCE SIGNS LIST
１ 情報処理装置（コンピュータ） 1 Information processing device (computer)
１１ 制御部 11 Control unit
１２ 記憶部 12 Storage unit
１３ 入力部 13 Input unit
１４ 表示部 14 Display unit
１５ 読取部 15 Reading unit
１６ 大容量記憶部 16 Large-capacity storage unit
１６１ 異常検知モデル 161 Anomaly detection model
１６２ 訓練データ管理ＤＢ 162 Training data management DB
１６３ 学習モデル管理ＤＢ 163 Learning model management DB
１６４ 訓練データファイル 164 Training data file
１６５ 生成モデル 165 Generative model
１ａ 可搬型記憶媒体 1a Portable storage medium
１ｂ 半導体メモリ 1b Semiconductor memory
１Ｐ 制御プログラム 1P Control program

一つの側面に係る学習モデルの生成方法は、正常と判断された複数の時系列データを取得し、取得した複数の時系列データの各時間のデータに対する平均化処理、中央値の選択処理、または、最頻値の選択処理により代表時系列データを生成し、生成した代表時系列データに基づいて異常データを生成し、正常と判断された時系列データ及び前記時系列データに対しラベル付けされた正常ラベルと、生成した異常データ及び前記異常データに対しラベル付けされた異常ラベルとに基づき、時系列データを入力した場合に、異常に関する情報を出力する学習モデルを生成する処理を実行させることを特徴とする。 A learning model generation method according to one aspect is characterized by causing execution of processing that: acquires a plurality of time-series data determined to be normal; generates representative time-series data by applying averaging, median selection, or mode selection to the data at each time point of the acquired time-series data; generates abnormal data based on the generated representative time-series data; and generates, based on the time-series data determined to be normal together with the normal label attached to it and on the generated abnormal data together with the abnormal label attached to it, a learning model that outputs information about an abnormality when time-series data is input.
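The generation of representative time-series data from several normal series can be illustrated with the short sketch below. It assumes that all series have the same length and are aligned in time, which the text does not state explicitly; the function name `representative_series` and the `method` keyword are hypothetical.

```python
import statistics
from typing import List, Sequence


def representative_series(
    series_list: Sequence[Sequence[float]], method: str = "mean"
) -> List[float]:
    """Reduce several aligned series to one series, time point by time point."""
    reducers = {
        "mean": statistics.fmean,     # averaging
        "median": statistics.median,  # median selection
        "mode": statistics.mode,      # mode (most frequent value) selection
    }
    if method not in reducers:
        raise ValueError(f"unknown method: {method}")
    if not series_list:
        raise ValueError("at least one series is required")
    length = len(series_list[0])
    if any(len(s) != length for s in series_list):
        raise ValueError("all series must have the same length")
    reduce = reducers[method]
    # Collect the values observed at each time point and reduce them.
    return [reduce([s[t] for s in series_list]) for t in range(length)]


normal = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [4.0, 8.0, 9.0]]
mean_rep = representative_series(normal, "mean")
median_rep = representative_series(normal, "median")
```

Mode selection is mainly meaningful for quantized or discrete sensor values; for continuous measurements, where values rarely repeat, the mean or median is likely the more useful choice.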

一つの側面に係る学習モデルの生成方法は、コンピュータが、正常と判断された複数の時系列データを取得し、取得した複数の時系列データの実効値をそれぞれ算出し、算出したそれぞれの実効値から最大の実効値を取得し、前記複数の時系列データにおける代表時系列データの実効値を算出し、取得した最大実効値と、算出した代表時系列データの実効値との比を算出し、算出した実効値の比に基づいて、前記複数の時系列データにおける代表時系列データを補正し、補正した代表時系列データに基づいて異常データを生成し、正常と判断された前記時系列データ及び前記時系列データに対しラベル付けされた正常ラベルと、生成した異常データ及び前記異常データに対しラベル付けされた異常ラベルとに基づき、時系列データを入力した場合に、異常に関する情報を出力する学習モデルを生成する処理を実行することを特徴とする。 A learning model generation method according to one aspect is characterized in that a computer executes processing that: acquires a plurality of time-series data determined to be normal; calculates the effective (RMS) value of each of the acquired time-series data; obtains the maximum effective value from the calculated effective values; calculates the effective value of representative time-series data of the plurality of time-series data; calculates the ratio between the obtained maximum effective value and the calculated effective value of the representative time-series data; corrects the representative time-series data based on the calculated ratio; generates abnormal data based on the corrected representative time-series data; and generates, based on the time-series data determined to be normal together with the normal label attached to it and on the generated abnormal data together with the abnormal label attached to it, a learning model that outputs information about an abnormality when time-series data is input.
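The effective-value correction can be sketched as follows. The text only says that the representative data is corrected "based on" the ratio; this sketch assumes the correction is a simple multiplication by (maximum effective value) / (effective value of the representative data), so that the corrected representative has the largest effective value seen among the normal series. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def rms(series: Sequence[float]) -> float:
    """Effective (root-mean-square) value of one series."""
    return math.sqrt(sum(x * x for x in series) / len(series))


def correct_representative(
    series_list: Sequence[Sequence[float]], representative: Sequence[float]
) -> Tuple[List[float], float]:
    """Scale the representative series by max RMS / representative RMS.

    Assumption: the correction is multiplicative scaling by the ratio.
    """
    max_rms = max(rms(s) for s in series_list)
    rep_rms = rms(representative)
    if rep_rms == 0.0:
        raise ValueError("representative series has zero effective value")
    ratio = max_rms / rep_rms
    return [x * ratio for x in representative], ratio


series_list = [[3.0, 4.0], [6.0, 8.0]]
corrected, ratio = correct_representative(series_list, [3.0, 4.0])
```

Under this assumption, the scaling compensates for the amplitude lost when several normal series are averaged into one representative.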

一つの側面に係る学習モデルの生成方法は、コンピュータが、正常と判断された複数の時系列データを取得し、取得した複数の時系列データの実効値をそれぞれ算出し、算出したそれぞれの実効値から最大の実効値を取得し、前記複数の時系列データにおける代表時系列データの実効値を算出し、取得した最大実効値と、算出した代表時系列データの実効値との比を算出し、算出した実効値の比に基づいて、前記複数の時系列データにおける代表時系列データを補正し、補正した代表時系列データに基づいて異常データを生成し、正常と判断された前記時系列データ及び前記時系列データに対しラベル付けされた正常ラベルと、生成した異常データ及び前記異常データに対しラベル付けされた異常ラベルとに基づき、時系列データを入力した場合に、異常に関する情報を出力する学習モデルを生成する処理を実行させることを特徴とする。 A learning model generation method according to one aspect is characterized by causing a computer to execute processing that: acquires a plurality of time-series data determined to be normal; calculates the effective (RMS) value of each of the acquired time-series data; obtains the maximum effective value from the calculated effective values; calculates the effective value of representative time-series data of the plurality of time-series data; calculates the ratio between the obtained maximum effective value and the calculated effective value of the representative time-series data; corrects the representative time-series data based on the calculated ratio; generates abnormal data based on the corrected representative time-series data; and generates, based on the time-series data determined to be normal together with the normal label attached to it and on the generated abnormal data together with the abnormal label attached to it, a learning model that outputs information about an abnormality when time-series data is input.

Claims

1. A learning model generation method comprising:
acquiring time-series data determined to be normal;
generating abnormal data based on the acquired time-series data; and
generating, based on the time-series data determined to be normal, a normal label attached to the time-series data, the generated abnormal data, and an abnormal label attached to the abnormal data, a learning model that outputs information about an abnormality when time-series data is input.

2. The learning model generation method according to claim 1, wherein the abnormal data is generated based on representative data of a plurality of time-series data or a plurality of representative data.

3. The method of generating a learning model according to claim 1, wherein the abnormal data is generated by adding periodic fluctuations to the time-series data.

4. The method of generating a learning model according to any one of claims 1 to 3, wherein the abnormal data is generated by continuously increasing or decreasing the time-series data as time changes.

5. The learning model generation method according to any one of claims 1 to 4, wherein the abnormal data is generated by adding, to the time-series data, a spike whose value abruptly exceeds a threshold within a unit time.

6. The learning model generation method according to any one of claims 1 to 5, wherein the abnormal data is generated by shifting the phase of the waveform of the time-series data.

7. The learning model generation method according to any one of claims 1 to 6, comprising:
calculating the effective (RMS) value of each of a plurality of time-series data;
obtaining the maximum effective value from the calculated effective values;
calculating the effective value of representative data of the plurality of time-series data;
calculating the ratio between the obtained maximum effective value and the calculated effective value of the representative data; and
correcting the representative data of the plurality of time-series data based on the calculated ratio of effective values.

8. The learning model generation method according to any one of claims 1 to 7, comprising:
outputting, in a selectable manner, a plurality of generation methods each indicating how to generate abnormal data;
accepting a selection from among the output generation methods; and
generating the abnormal data based on the accepted generation method.

9. The learning model generation method according to any one of claims 1 to 7, comprising:
outputting, in a selectable manner, a plurality of generation methods each indicating how to generate abnormal data and a combination generation method that combines the generation methods;
accepting a selection of one of the output generation methods or of the combination generation method; and
generating the abnormal data based on the accepted generation method or combination generation method.

10. The method of generating a learning model according to any one of claims 1 to 9, wherein second abnormal data is generated by adding colored noise to the abnormal data.

11. The learning model generation method according to any one of claims 1 to 10, wherein abnormal data is generated by applying a plurality of types of abnormal data generation methods to the time-series data at different timings.

12. A program that causes a computer to execute processing comprising:
acquiring time-series data determined to be normal;
generating abnormal data based on the acquired time-series data; and
generating, based on the time-series data determined to be normal, a normal label attached to the time-series data, the generated abnormal data, and an abnormal label attached to the abnormal data, a learning model that outputs information about an abnormality when time-series data is input.

13. An information processing apparatus comprising:
an acquisition unit that acquires time-series data determined to be normal;
a first generation unit that generates abnormal data based on the acquired time-series data; and
a second generation unit that generates, based on the time-series data determined to be normal, a normal label attached to the time-series data, the generated abnormal data, and an abnormal label attached to the abnormal data, a learning model that outputs information about an abnormality when time-series data is input.

14. A learning data generation method comprising:
acquiring time-series data determined to be normal;
generating abnormal data based on the acquired time-series data; and
associating an abnormal label with the generated abnormal data, associating a normal label with the time-series data, and storing the abnormal data and the time-series data as learning data.
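As an illustration of the abnormal data generation methods recited in the claims (periodic fluctuation, continuous increase or decrease, spike, phase shift, colored noise) and of the labeling and storing of learning data, a minimal sketch follows. All function names, parameter values, the record layout, and the use of JSON as the storage format are assumptions made for this example. Brown noise (a running sum of white noise) is used as one possible kind of colored noise, and the phase shift is modeled as a circular shift of the samples; the claims do not prescribe either choice.

```python
import json
import math
import random
from typing import Callable, Dict, List, Sequence

Series = List[float]


def add_periodic(series: Sequence[float], amplitude: float, period: float) -> Series:
    """Add a sinusoidal fluctuation to the series."""
    return [x + amplitude * math.sin(2 * math.pi * t / period)
            for t, x in enumerate(series)]


def add_trend(series: Sequence[float], slope: float) -> Series:
    """Increase (slope > 0) or decrease (slope < 0) the series over time."""
    return [x + slope * t for t, x in enumerate(series)]


def add_spike(series: Sequence[float], index: int, threshold: float,
              margin: float = 1.0) -> Series:
    """Set one sample to a value above the threshold."""
    out = list(series)
    out[index] = threshold + margin
    return out


def shift_phase(series: Sequence[float], shift: int) -> Series:
    """Circularly shift the waveform by `shift` samples."""
    k = shift % len(series)
    return list(series[-k:]) + list(series[:-k]) if k else list(series)


def add_brown_noise(series: Sequence[float], scale: float, seed: int = 0) -> Series:
    """Add brown noise, i.e. a running sum of Gaussian white noise."""
    rng = random.Random(seed)
    level = 0.0
    out = []
    for x in series:
        level += rng.gauss(0.0, scale)
        out.append(x + level)
    return out


def build_learning_data(
    normal: Sequence[float], generators: Dict[str, Callable[[Sequence[float]], Series]]
) -> List[dict]:
    """Associate labels with the normal series and each generated abnormal series."""
    records = [{"label": "normal", "values": list(normal)}]
    for name, generate in generators.items():
        records.append({"label": "abnormal", "method": name,
                        "values": generate(normal)})
    return records


normal = [math.sin(2 * math.pi * t / 8) for t in range(16)]
generators = {
    "periodic": lambda s: add_periodic(s, amplitude=0.5, period=4),
    "trend": lambda s: add_trend(s, slope=0.1),
    "spike": lambda s: add_spike(s, index=5, threshold=2.0),
    "phase": lambda s: shift_phase(s, shift=2),
    "noise": lambda s: add_brown_noise(add_trend(s, slope=0.1), scale=0.05),
}
learning_data = build_learning_data(normal, generators)
serialized = json.dumps(learning_data)  # stored form of the learning data
```

Keeping each generation method as a separate function makes it straightforward to offer them as selectable options or to chain them, as the "noise" entry does by adding noise to already generated abnormal data.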