JP7272575B2

JP7272575B2 - Data processing device, data processing system and program

Info

Publication number: JP7272575B2
Application number: JP2018227556A
Authority: JP
Inventors: 宏紀松谷; 峰登塚田; 正章近藤
Original assignee: Keio University; University of Tokyo NUC
Current assignee: Keio University; University of Tokyo NUC
Priority date: 2018-06-29
Filing date: 2018-12-04
Publication date: 2023-05-12
Anticipated expiration: 2038-12-04
Also published as: JP2020009400A

Description

The present invention relates to a data processing device, a data processing system, and a program.

In recent years, systems using machine learning have been developed as anomaly detection systems applicable to various settings, such as product manufacturing sites.

For example, Patent Document 1 discloses a device that records in advance log data collectable at a site and, referring to that log data, performs learning processing using the presence or absence of an abnormality at each recording time point of the log data as teacher data, and then performs abnormality determination.

Patent Document 1: JP 2018-73258 A

In practice, however, at a product manufacturing site for example, the data output by a sensor that collects information such as vibration differs from site to site depending on the surrounding noise conditions. Specifically, the output differs greatly between a site where an adjacent manufacturing machine that generates a separate vibration is operating and a site where no such machine is operating. The output also differs depending on whether such a machine is stopped or running, so even at the same site the data underlying an abnormality determination generally varies considerably with factors such as the time of day.

Because of this, a device that collects log data in advance and learns from it must collect log data for each site, which complicates the device configuration and requires many preparatory steps before operation. Moreover, even after such learning, appropriate determination was not always possible, because the environment may differ with the time of day as described above.

The present invention has been made in view of the above circumstances, and one of its objects is to provide a data processing device, a data processing system, and a program that have a relatively simple configuration, simplify the preparatory steps, and make determinations adapted to the environment.

One aspect of the present invention that solves the problems of the above conventional example is a data processing device that performs a predetermined determination process based on repeatedly input data, the device including: inference means that accepts input data and teacher data and can be machine-learned by a reciprocal operation; learning processing means that machine-learns the inference means using the received data as both input data and teacher data; and output means that performs the predetermined determination process based on a comparison between the output of the inference means and the received data, and outputs the result of the determination process.

According to the present invention, determination is performed while machine learning is carried out on the input data, and inference means that can be machine-learned by a reciprocal operation is used. This allows a relatively simple configuration, simplifies the preparatory steps, and enables determinations adapted to the environment.

FIG. 1 is a configuration block diagram showing an example of a data processing device according to an embodiment of the present invention.
FIG. 2 is a functional block diagram of one example of the data processing device according to the embodiment of the present invention.
FIG. 3 is a configuration block diagram showing an example of an estimator used by the data processing device according to the embodiment of the present invention.
FIG. 4 is an explanatory diagram showing an example of parameter storage by the data processing device according to the embodiment of the present invention.
FIG. 5 is a functional block diagram of another example of the data processing device according to the embodiment of the present invention.
FIG. 6 is a functional block diagram of yet another example of the data processing device according to the embodiment of the present invention.

An embodiment of the present invention will be described with reference to the drawings. As shown in FIG. 1, an example of a data processing device 1 according to the embodiment includes a control unit 11, a storage unit 12, an input unit 13, and an output unit 14.

The control unit 11 is a program-controlled device such as a CPU, a logic device such as an FPGA (Field Programmable Gate Array), or an ASIC (Application Specific Integrated Circuit), and implements the inference means, the learning processing means, and the output means of the present invention. When a program-controlled device such as a CPU is used as the control unit 11, the control unit 11 realizes the operation of each of these means by executing a program stored in the storage unit 12.

When a logic device such as an FPGA is used as the control unit 11, it operates according to its programmed logic to realize the operation of each of these means.

That is, in the present embodiment the control unit 11 accepts repeatedly input data and, using the received data as both input data and teacher data, executes machine learning processing of a machine learning model that can be machine-learned by a reciprocal operation. The control unit 11 also obtains the output of the estimator represented by the machine learning model when the received data is fed to it as input data. The control unit 11 then performs a predetermined determination process based on a comparison between the obtained estimator output and the received data, and outputs the result of the determination process. A detailed example of the operation of the control unit 11 is described later.

The storage unit 12 includes a memory device, a disk device, or the like. When the control unit 11 is a program-controlled device such as a CPU, the storage unit 12 holds the program executed by the control unit 11. The program may be provided stored on a computer-readable, non-transitory recording medium and then stored in the storage unit 12.

The storage unit 12 also operates as a work memory that holds information required for the processing of the control unit 11, such as the model parameters of the machine learning model used when machine-learning the estimator.

The input unit 13 converts data input from an external sensor or the like into digital data, further converts it into vector information of a predetermined dimension, and outputs it to the control unit 11. The output unit 14 outputs information according to instructions from the control unit 11. The output unit 14 is, for example, a display, and displays the information input from the control unit 11.

In the present embodiment, as illustrated in FIG. 2, the control unit 11 functionally includes a data receiving unit 21, an estimator 22, a learning processing unit 23, a determination processing unit 24, and an output processing unit 25.

The data receiving unit 21 accepts data from the input unit 13. In the example of the present embodiment, the data processing device 1 is placed at, for example, a product manufacturing site and is connected to sensors that are attached to equipment used for manufacturing the product and that detect and output various information such as the vibration and temperature of the equipment. The input unit 13 repeatedly converts the outputs of these sensors into digital values at predetermined timings (for example, at regular intervals) and outputs them to the control unit 11. The input unit 13 outputs, as the data, a vector value in which the digital values obtained from the outputs of the plurality of sensors are arranged in a predetermined order. The data receiving unit 21 accepts the data that the input unit 13 outputs at each predetermined timing and passes it to the learning processing unit 23.

The estimator 22 implements the inference means. Functionally, as illustrated in FIG. 3, it is a three-layer fully connected neural network including an input layer 31, an intermediate layer (hidden layer) 32, and an output layer 33. In the present embodiment, the number of nodes in the input layer 31 (the dimension of the input data vector) equals the number of nodes in the output layer 33 (the dimension of the output data vector).

That is, in the estimator 22, let W (w_1, w_2, ..., w_L) be the connection weights between the nodes 31a, 31b, ..., 31n of the input layer 31, which correspond to the components of the vector value given as input data, and the nodes 32a, 32b, ..., 32L of the intermediate layer 32 (where each w_i is a vector whose dimension equals the number of nodes n of the input layer 31), and let b (b_1, b_2, ..., b_L) be the biases. Further, let V (v_1, v_2, ..., v_L) be the connection weights between the nodes 32a, 32b, ..., 32L of the intermediate layer 32 and the nodes 33a, 33b, ..., 33n of the output layer 33 (where each v_i is a vector whose dimension equals the number of nodes n of the output layer 33). The connection weight values W and V and the bias b are then the parameters of the machine learning model of the estimator 22.

When input data is input, the estimator 22 feeds it to the input layer 31 and obtains the value of each node of the intermediate layer 32 by a predetermined operation based on the components of the input data and the connection weights W, for example by multiplying each component by the corresponding connection weight and summing the products. It then applies a predetermined nonlinear function f to the values h_1, h_2, ..., h_L of the nodes of the intermediate layer 32 and multiplies the resulting values f(h_i) (i = 1, 2, ..., L) by the corresponding connection weights V to obtain the value of each node of the output layer 33. This computation is the same as in a general neural network, so a detailed description is omitted here.

The nonlinear function may be a widely known function such as the sigmoid function or ReLU. The estimator 22 outputs, as output data, a vector whose components are the values of the nodes of the output layer 33.
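The forward computation of the three-layer estimator described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the implementation of the embodiment: the function names, the uniform range used for the random values, and the choice of the sigmoid as f are all assumptions made for the example.

```python
import math
import random


def make_estimator(n, L, seed=0):
    """Return (W, b, V): L weight vectors of dimension n, L biases, L output weight vectors.

    The uniform range [-1, 1] and the seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    W = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(L)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(L)]
    V = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(L)]
    return W, b, V


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hidden(x, W, b):
    """f(h_i) for each intermediate node, with h_i = w_i . x + b_i."""
    return [sigmoid(sum(wk * xk for wk, xk in zip(w, x)) + bi) for w, bi in zip(W, b)]


def forward(x, W, b, V):
    """Output-layer values: each f(h_i) weighted by v_i and summed per output node."""
    fh = hidden(x, W, b)
    n_out = len(V[0])
    return [sum(fh[i] * V[i][k] for i in range(len(V))) for k in range(n_out)]


W, b, V = make_estimator(n=4, L=3, seed=1)
y = forward([0.1, 0.2, 0.3, 0.4], W, b, V)  # same dimension as the input vector
```

The output vector has the same dimension as the input vector, matching the requirement that the input and output layers have the same number of nodes.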

In the present embodiment, the estimator 22 is one whose machine learning model parameters can be machine-learned by a reciprocal operation, at least under certain conditions of the learning process. An example of such an estimator 22 is described in detail later.

The learning processing unit 23 performs machine learning processing of the estimator 22 using the data received by the data receiving unit 21 as both input data and teacher data. Specifically, the learning processing unit 23 feeds the estimator 22, as input data, the most recent predetermined number (the batch size) of data items received by the data receiving unit 21. When the batch size is 1, the learning processing unit 23 feeds each data item received by the data receiving unit 21 to the estimator 22 as input data as it is.

The learning processing unit 23 compares the output data that the estimator 22 produces for that input data with the data received by the data receiving unit 21 (the teacher data), and computes their difference (for example, the absolute error or the mean squared error) as the loss. The learning processing unit 23 then updates the parameters (connection weight values) of the estimator 22 so that the loss decreases.

In other words, in the example of the present embodiment, the learning processing unit 23 machine-learns the estimator 22 as an autoencoder.
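Because the received data serves as its own teacher data, the loss is simply a reconstruction error. The sketch below shows the two error measures named in the text; the function names and the sample vectors are hypothetical, and the estimator output `y` is made up for illustration rather than computed by a model.

```python
def mean_absolute_error(output, teacher):
    """Average absolute difference between estimator output and teacher data."""
    return sum(abs(o - t) for o, t in zip(output, teacher)) / len(teacher)


def mean_squared_error(output, teacher):
    """Average squared difference between estimator output and teacher data."""
    return sum((o - t) ** 2 for o, t in zip(output, teacher)) / len(teacher)


x = [0.5, 0.2, 0.1]  # received data: both the input data and the teacher data
y = [0.4, 0.2, 0.3]  # hypothetical estimator output for x
loss_abs = mean_absolute_error(y, x)
loss_sq = mean_squared_error(y, x)
```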

The determination processing unit 24 executes a predetermined determination process based on the output data that the estimator 22 produces when the learning processing unit 23 feeds input data to it.

As a specific example, the determination processing unit 24 refers to the loss computed by the learning processing unit 23 and, when the magnitude of the loss exceeds a predetermined threshold, causes the output processing unit 25 to output a determination result indicating that the input data is abnormal (that is, that an abnormality has occurred in the manufacturing equipment or the like). When the magnitude of the loss does not exceed the threshold, the determination processing unit 24 may cause the output processing unit 25 to output a determination result indicating that the input data is normal (that is, that there is no abnormality in the manufacturing equipment or the like).

The determination processing unit 24 may be controlled so as not to perform the determination process until the parameters of the estimator 22 have been sufficiently learned by the learning processing unit 23. Specifically, the determination processing unit 24 may check whether the number of times input data has been fed to the estimator 22 since its unlearned (reset) state, that is, the number of parameter updates performed by the learning processing unit 23, exceeds a predetermined initialization threshold, and may judge that the parameters of the estimator 22 have been sufficiently learned when it does. The initialization threshold is set in advance to, for example, a number equal to or greater than the number L of nodes in the intermediate layer 32.

In this case, the determination processing unit 24 does not perform the determination process until the learning process has been executed the number of times set as the initialization threshold (that is, until the estimator 22 has accepted input data and had its parameters updated that many times).
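The threshold comparison and the warm-up gating described above might be sketched as follows. The class name, the return values (`None`, `"normal"`, `"abnormal"`), and the choice to count the current update before comparing against the initialization threshold are assumptions made for this illustration.

```python
class Determiner:
    """Loss-threshold determination that stays silent during warm-up (illustrative)."""

    def __init__(self, loss_threshold, init_threshold):
        self.loss_threshold = loss_threshold  # predetermined loss threshold
        self.init_threshold = init_threshold  # e.g. a value >= number of hidden nodes L
        self.updates = 0                      # parameter updates since reset

    def judge(self, loss):
        """Return None while warming up, otherwise "abnormal" or "normal"."""
        self.updates += 1
        if self.updates <= self.init_threshold:
            return None  # parameters not yet considered sufficiently learned
        return "abnormal" if loss > self.loss_threshold else "normal"


d = Determiner(loss_threshold=0.1, init_threshold=3)
results = [d.judge(loss) for loss in [5.0, 5.0, 5.0, 0.01, 0.5]]
# The first three calls fall within warm-up, so their large losses are not reported.
```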

The output processing unit 25 outputs the determination result to the output unit 14 according to instructions from the determination processing unit 24.

A specific machine learning model for the estimator 22 in the example of the present embodiment is now described. In the present embodiment, the estimator 22 is a neural network in which, through the operation of the learning processing unit 23, the connection weights W between the input layer 31 and the intermediate layer 32 and the bias b are determined randomly, and the connection weights V between the intermediate layer 32 and the output layer 33 are machine-learned. Specifically, the estimator 22 and the learning processing unit 23 together implement OS-ELM (Online Sequential Extreme Learning Machine). OS-ELM is disclosed in, for example, N. Y. Liang, G. B. Huang, P. Saratchandran, and N. Sundararajan, "A Fast and Accurate Online Sequential Learning Algorithm for Feedforward Networks," IEEE Transactions on Neural Networks, Vol. 17, No. 6, pp. 1411-1423, Nov. 2006, and is widely known, so a detailed description is omitted here.

When OS-ELM is used, the learning processing unit 23 first resets the estimator 22 by randomly determining the connection weights W between the input layer 31 and the intermediate layer 32 and the bias b (the connection weights V may also be set randomly at this time). The learning processing unit 23 then feeds the data received by the data receiving unit 21 to the estimator 22 as input data. It compares the output data that the estimator 22 produces for that input data with the data received by the data receiving unit 21 (the teacher data) and computes their difference (for example, the mean squared error) as the loss. The learning processing unit 23 then updates the connection weights V between the intermediate layer 32 and the output layer 33 of the estimator 22 so that the loss decreases.

In OS-ELM, the learning process is performed as follows. The learning processing unit 23 divides the input data into batches of n_i items each (i = 1, 2, ...) and uses each batch in turn as training data. Let the input data contained in the i-th batch be x_i (i = 1, 2, ...), the connection weights between the input layer 31 and the intermediate layer 32 be W_k (W_k = w_1, w_2, ..., w_L), and the biases be b_k (b_k = b_1, b_2, ..., b_L). The output for the i-th batch is then represented by the following matrix H_i (the output of the intermediate layer 32).

$$
H_i = \begin{bmatrix}
G(w_1, b_1, x_i(1)) & \cdots & G(w_L, b_L, x_i(1)) \\
\vdots & \ddots & \vdots \\
G(w_1, b_1, x_i(n_i)) & \cdots & G(w_L, b_L, x_i(n_i))
\end{bmatrix}
$$

Here G denotes the output of a node, x_i(j) denotes the j-th input data item contained in the i-th batch, and n_i is the batch size of the i-th batch.

Since the teacher data here is the same as the input data, the teacher data T_i is

$$
T_i = \begin{bmatrix} x_i(1) & \cdots & x_i(n_i) \end{bmatrix}^{T}
$$

where the superscript T denotes transposition (the same applies below).

When the i-th batch has been input, the learning processing unit 23 optimizes the estimator 22 by finding the connection weights β that minimize the magnitude of the loss

$$
\left\| \begin{bmatrix} H_1 \\ \vdots \\ H_i \end{bmatrix} \beta - \begin{bmatrix} T_1 \\ \vdots \\ T_i \end{bmatrix} \right\| \tag{1}
$$

and using them as the connection weights V between the intermediate layer 32 and the output layer 33 of the estimator 22. Using

$$
K_2 = \begin{bmatrix} H_1 \\ H_2 \end{bmatrix}^{T} \begin{bmatrix} H_1 \\ H_2 \end{bmatrix}
$$

and considering the case i = 2, the minimizer of expression (1) can be written as

$$
\beta_2 = K_2^{-1} \begin{bmatrix} H_1 \\ H_2 \end{bmatrix}^{T} \begin{bmatrix} T_1 \\ T_2 \end{bmatrix}
$$

where A^{-1} denotes the pseudo-inverse of A.

Generalizing this, suppose that learning up to the i-th batch is complete and that the connection weights V between the intermediate layer 32 and the output layer 33 of the estimator 22 at that point are V = β_i. The connection weights β_{i+1} resulting from machine learning on the (i+1)-th batch can then be written as

$$
\beta_{i+1} = \beta_i + K_{i+1}^{-1} H_{i+1}^{T} \left( T_{i+1} - H_{i+1} \beta_i \right)
$$

(sequential update formula), where

$$
K_{i+1} = K_i + H_{i+1}^{T} H_{i+1}
$$

Here, letting

$$
P_i = K_i^{-1}
$$

these sequential update formulas can be expressed as

$$
P_{i+1} = P_i - P_i H_{i+1}^{T} \left( I + H_{i+1} P_i H_{i+1}^{T} \right)^{-1} H_{i+1} P_i \tag{2}
$$

(where I is the identity matrix) and

$$
\beta_{i+1} = \beta_i + P_{i+1} H_{i+1}^{T} \left( T_{i+1} - H_{i+1} \beta_i \right) \tag{3}
$$

Therefore, when the i-th batch is input, the learning processing unit 23 obtains the output H_i of the intermediate layer 32 and keeps both the connection weights V = β_i between the intermediate layer 32 and the output layer 33 optimized at that point and the intermediate result P_i. When the (i+1)-th batch is input, it obtains the next intermediate result P_{i+1} by equation (2) and updates the connection weights V between the intermediate layer 32 and the output layer 33 to β_{i+1} computed by equation (3). Singular value decomposition is generally used to compute the inverse matrix. However, if the batch size n_i of each batch is set to n_i = 1 (a batch size of 1), the matrix in equation (2) whose pseudo-inverse must be computed,

$$
I + H_{i+1} P_i H_{i+1}^{T} \tag{4}
$$

becomes a scalar value (because expression (4) is an n_i × n_i matrix), and so this pseudo-inverse is obtained by a simple reciprocal operation.
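To make the batch-size-1 case concrete, the sketch below performs one sequential update in plain Python. It assumes the standard OS-ELM recursion from the cited Liang et al. paper, P' = P - P h^T (1 + h P h^T)^{-1} h P followed by β' = β + P' h^T (t - h β), where h is the single row of intermediate-layer outputs and t is the single teacher row. The function name `oselm_step`, the list-of-lists matrix layout, and the initial values in the usage lines are illustrative assumptions.

```python
def oselm_step(P, beta, h, t):
    """One OS-ELM update for a single sample (batch size 1).

    P    : L x L matrix (list of rows), the running intermediate result
    beta : L x n matrix, weights between intermediate and output layers
    h    : length-L intermediate-layer output for this sample
    t    : length-n teacher vector (equal to the input for an autoencoder)
    """
    L, n = len(h), len(t)
    Ph = [sum(P[r][c] * h[c] for c in range(L)) for r in range(L)]  # P h^T
    hP = [sum(h[r] * P[r][c] for r in range(L)) for c in range(L)]  # h P
    # With one sample, I + h P h^T is a 1x1 matrix, so its inverse is a reciprocal.
    inv = 1.0 / (1.0 + sum(h[r] * Ph[r] for r in range(L)))
    P_new = [[P[r][c] - Ph[r] * inv * hP[c] for c in range(L)] for r in range(L)]
    # Residual of the current weights on this sample: t - h beta.
    err = [t[k] - sum(h[r] * beta[r][k] for r in range(L)) for k in range(n)]
    Pnh = [sum(P_new[r][c] * h[c] for c in range(L)) for r in range(L)]  # P_new h^T
    beta_new = [[beta[r][k] + Pnh[r] * err[k] for k in range(n)] for r in range(L)]
    return P_new, beta_new


# Usage with assumed initial values: a large diagonal P and a zero beta.
P = [[1e6, 0.0], [0.0, 1e6]]
beta = [[0.0], [0.0]]
P, beta = oselm_step(P, beta, [1.0, 0.0], [2.0])
P, beta = oselm_step(P, beta, [0.0, 1.0], [3.0])
```

No matrix inversion or singular value decomposition appears anywhere in the step; the only division is the scalar reciprocal `inv`, which is the property the text relies on for a simple hardware implementation.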

That is, in the present embodiment, the inference means that can be machine-learned by a reciprocal operation is realized by using, as the estimator 22 and the learning processing unit 23, a neural network whose connection weights between the intermediate layer and the output layer are machine-learned by a pseudo-inverse matrix operation, and by setting the machine learning batch size to 1 (that is, by performing machine learning each time data is received). Specific examples of such neural networks are ELM (Extreme Learning Machine), which randomly determines the connection weights between the input layer and the intermediate layer, and neural networks derived from it, such as FP (Forgetting Parameters)-ELM, OS (On-Line Sequential)-ELM, EOS (Ensemble of OS)-ELM, and FOS-ELM (OS-ELM with forgetting mechanism). However, the neural networks with which the estimator 22 and the learning processing unit 23 realize inference means that can be machine-learned by a reciprocal operation are not limited to these examples.

[Operation]
The data processing device 1 of the present embodiment has the above configuration and operates as follows. In the following example, the control unit 11 is implemented using an FPGA, and OS-ELM with a batch size of 1 is used as the estimator 22.

Here, the data processing device 1 accepts signals from a plurality of vibration sensors placed near equipment that manufactures a product. Each vibration sensor outputs an analog electrical signal representing the magnitude of vibration at the location where it is attached.

The data processing device 1 first randomly determines the connection weights W between the input layer 31 and the intermediate layer 32 of the OS-ELM serving as the estimator 22, and the bias b. The data processing device 1 then converts the electrical signal output by each sensor into a digital value and inputs it to the control unit 11. The control unit 11, functioning as the data receiving unit 21, outputs to the learning processing unit 23 a vector value in which the digital values obtained from the outputs of the plurality of sensors are arranged in a predetermined order.

Each time it accepts data, that is, each time it accepts input data with a batch size of 1, the learning processing unit 23 feeds that input data to the estimator 22. The learning processing unit 23 obtains the output data H of the estimator 22, the input data T serving as teacher data, the connection weights V = β_i between the intermediate layer 32 and the output layer 33 of the estimator 22 at that stage, and the intermediate result P_i computed the previous time (the initial values of both β and P are set in advance).

The learning processing unit 23 then updates the connection weights V between the intermediate layer 32 and the output layer 33 of the estimator 22 by equations (2) and (3). The learning processing unit 23 also computes, as the loss, the mean squared error between the output of the estimator 22 and the input data (teacher data), and outputs it to the determination processing unit 24.

The determination processing unit 24 judges whether the parameters of the estimator 22 have been sufficiently machine-learned. This is judged by whether the number of times input data has been input since the data processing device 1 initialized the estimator 22 exceeds the predetermined initialization threshold. If the determination processing unit 24 judges that the parameters of the estimator 22 have not yet been sufficiently machine-learned, it does not perform the determination process.

If, on the other hand, it judges that the parameters of the estimator 22 have been sufficiently machine-learned, the determination processing unit 24 checks whether the magnitude of the loss value input from the learning processing unit 23 exceeds the predetermined threshold. If it does, the determination processing unit 24 performs processing such as causing the display serving as the output unit 14 to show a determination result indicating that the input data is abnormal and that an abnormality is presumed to have occurred in the manufacturing equipment or the like.
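The operation walked through above (random W and b, one update per received vector, loss computed from the estimator output before the update, warm-up gating, then threshold comparison) can be put together in one small sketch. It is a software illustration only, not the FPGA implementation of the embodiment; the class name, the sigmoid activation, the initial values of β and P, and the thresholds are all assumptions.

```python
import math
import random


class OnlineDetector:
    """Batch-size-1 OS-ELM autoencoder with warm-up gating (illustrative only)."""

    def __init__(self, n, L, loss_threshold, init_threshold, p0=1e3, seed=0):
        rng = random.Random(seed)
        # W and b are fixed at random; only beta (the weights V) is learned.
        self.W = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(L)]
        self.b = [rng.uniform(-1.0, 1.0) for _ in range(L)]
        self.beta = [[0.0] * n for _ in range(L)]  # assumed initial beta
        self.P = [[p0 if r == c else 0.0 for c in range(L)] for r in range(L)]  # assumed initial P
        self.loss_threshold = loss_threshold
        self.init_threshold = init_threshold
        self.count = 0

    def step(self, x):
        """Learn from one sensor vector x; return None, "normal" or "abnormal"."""
        L, n = len(self.b), len(x)
        # Intermediate-layer output h (sigmoid activation).
        h = [1.0 / (1.0 + math.exp(-(sum(wk * xk for wk, xk in zip(w, x)) + bi)))
             for w, bi in zip(self.W, self.b)]
        # Reconstruction with the current weights and its mean squared error.
        y = [sum(h[r] * self.beta[r][k] for r in range(L)) for k in range(n)]
        err = [xk - yk for xk, yk in zip(x, y)]
        loss = sum(e * e for e in err) / n
        # Update P: the matrix to invert is 1x1, so a reciprocal suffices.
        Ph = [sum(self.P[r][c] * h[c] for c in range(L)) for r in range(L)]
        hP = [sum(h[r] * self.P[r][c] for r in range(L)) for c in range(L)]
        inv = 1.0 / (1.0 + sum(h[r] * Ph[r] for r in range(L)))
        self.P = [[self.P[r][c] - Ph[r] * inv * hP[c] for c in range(L)] for r in range(L)]
        # Update beta using the new P.
        Pnh = [sum(self.P[r][c] * h[c] for c in range(L)) for r in range(L)]
        self.beta = [[self.beta[r][k] + Pnh[r] * err[k] for k in range(n)] for r in range(L)]
        # Warm-up gating, then threshold comparison.
        self.count += 1
        if self.count <= self.init_threshold:
            return None
        return "abnormal" if loss > self.loss_threshold else "normal"


det = OnlineDetector(n=3, L=4, loss_threshold=0.1, init_threshold=4)
history = [det.step([0.5, 0.2, 0.1]) for _ in range(20)]  # steady readings
verdict = det.step([5.0, -5.0, 5.0])                      # a very different reading
```

In this toy run the detector is fed the same reading repeatedly and then one reading far outside that pattern; the threshold values were picked by hand for the example and would need tuning for real sensor data.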

Thus, the data processing device 1 of the present embodiment performs sequential machine learning of the estimator as an autoencoder (that is, without separately preparing teacher data) after being installed at the site where anomaly detection is actually performed, and detects anomalies based on the result of that machine learning. This simplifies the preparatory steps and enables determinations adapted to the environment.

In addition, using an estimator whose machine learning computations can be performed by a relatively simple reciprocal operation allows a relatively simple configuration.

Although an example of detecting anomalies in equipment that manufactures a product has been described here, the present embodiment is not limited to this example. It is applicable to various cases, such as detecting heat-related anomalies using temperature data of heat-generating equipment as input data, detecting anomalies in current using the current on wiring as input data, detecting anomalies in heat distribution using heat distribution (thermography) data of a product as input data, detecting anomalies using human behavior or equipment operation history as input data, and detecting anomalies in the operation of an unmanned aerial vehicle (such as a drone).

［忘却］ [Forgetting]
また、本実施の形態の例において、忘却処理を含めるべき場合は、忘却率をαとして、（３）式のＰ_i+1に１／α²を乗じることとすればよい。これによって簡易な方法で忘却効果を得ることが可能となる。 In addition, in the example of the present embodiment, if forgetting processing is to be included, P_{i+1} in equation (3) may be multiplied by 1/α², where α is the forgetting rate. This makes it possible to obtain a forgetting effect by a simple method.
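The forgetting step can be sketched as follows. This is only an illustration: equation (3) is not reproduced in this part of the text, so the sketch assumes it has the usual batch-size-1 recursive form P_{i+1} = P_i − (P_i hᵀ h P_i) / (1 + h P_i hᵀ), where h is the intermediate-layer output. The function name and argument layout are hypothetical.

```python
def update_p_with_forgetting(P, h, alpha):
    """Return the next P after one batch-size-1 update plus forgetting.

    P     : square matrix as a list of lists (assumed form of the state in eq. (3))
    h     : intermediate-layer output for the current input, as a list
    alpha : forgetting rate (assumed to satisfy 0 < alpha <= 1)
    """
    n = len(h)
    Ph = [sum(P[r][k] * h[k] for k in range(n)) for r in range(n)]  # P h^T
    hP = [sum(h[k] * P[k][c] for k in range(n)) for c in range(n)]  # h P
    # With batch size 1 the term to invert is a scalar, so a reciprocal suffices.
    denom = 1.0 + sum(h[k] * Ph[k] for k in range(n))
    scale = 1.0 / (alpha ** 2)  # forgetting: multiply P_{i+1} by 1/alpha^2
    return [[(P[r][c] - Ph[r] * hP[c] / denom) * scale for c in range(n)]
            for r in range(n)]
```

With alpha = 1 the function reduces to the plain update; smaller values inflate P and so give newer inputs more weight.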

［学習結果の部分破棄］ [Partial discard of learning results]
また本実施の形態において、バッチサイズが１であるなど、比較的少数のバッチサイズの入力データ群により学習処理を行う場合は、異常なデータが連続することで生じる、異常なデータへの適合を防止するため、次のような学習結果の部分破棄の処理を行ってもよい。 Further, in the present embodiment, when learning processing is performed on groups of input data with a relatively small batch size, such as a batch size of 1, the following process of partially discarding learning results may be performed in order to prevent the estimator from adapting to abnormal data when abnormal data arrives in succession.

本実施の形態の一例では、学習処理部２３はデータ受入部２１が受け入れたデータを、推定器２２に対してそのまま入力データとして入力する（バッチサイズを１として逐次的な学習処理を実行する）。具体的にこの推定器２２としては、OS-ELM のニューラルネットワークを用いることとすればよい。この場合、推定器２２と学習処理部２３とにより、バッチサイズを「１」とした、逐次的な学習処理を行うOS-ELMが実現される。 In one example of the present embodiment, the learning processing unit 23 directly inputs the data received by the data receiving unit 21 to the estimator 22 as input data (the batch size is set to 1 and sequential learning processing is performed). . Specifically, an OS-ELM neural network may be used as the estimator 22 . In this case, the estimator 22 and the learning processing unit 23 realize an OS-ELM that performs sequential learning processing with a batch size of "1".

この例において学習処理部２３は、入力データＸを推定器２２に入力し、当該推定器２２が出力する出力データと入力データＸとを用いて機械学習処理を行うごとに、つまり推定器２２の中間層３２と出力層３３との結合重みＶを更新するごとに、推定器２２に入力した入力データＸと、更新のために求めた結合重みβ、及び機械学習処理で必要となるデータ（例えば上述のＰ）を互いに関連付けて記憶部１２に格納する。 In this example, the learning processing unit 23 inputs the input data X to the estimator 22. Each time it performs machine learning processing using the output data of the estimator 22 and the input data X, that is, each time it updates the connection weight V between the intermediate layer 32 and the output layer 33 of the estimator 22, it stores in the storage unit 12, in association with one another, the input data X input to the estimator 22, the connection weight β obtained for the update, and the data required for machine learning processing (for example, P described above).

またこのとき学習処理部２３は、過去Ｍ回より前に格納した、入力データＸ，結合重みβ及び中間結果Ｐを互いに関連付けた情報が記憶部１２に格納されていれば、当該情報を削除してもよい。 At this time, if the storage unit 12 holds information associating input data X, connection weight β, and intermediate result P that was stored more than M updates ago, the learning processing unit 23 may delete that information.

これにより記憶部１２には、最大で直近のＭ回分の推定器２２の機械学習結果である中間層３２と出力層３３との結合重みβ_jと、中間結果としてのＰ_jと、それぞれの結合重みを求めたときの入力データＸ_jとを互いに関連付けた情報Ｒ_j（ｊ＝１，２，…）が記憶されている状態となる（図４）。 As a result, the storage unit 12 holds information R_j (j = 1, 2, ...) for at most the most recent M updates, where each R_j associates the connection weight β_j between the intermediate layer 32 and the output layer 33 (the machine learning result of the estimator 22), the intermediate result P_j, and the input data X_j used when that connection weight was obtained (FIG. 4).

学習処理部２３は、予め定めた方法で決定した複数回（ここではＭ回とする）の機械学習処理を行うごとに、記憶部１２に記憶したＭ回前の中間層３２と出力層３３との結合重みβ、中間結果Ｐ、及び入力データＸを読み出す。学習処理部２３は、ここで読み出した入力データＸを、推定器２２に入力する。 Each time the learning processing unit 23 has performed machine learning processing a number of times determined by a predetermined method (here, M times), it reads from the storage unit 12 the connection weight β between the intermediate layer 32 and the output layer 33, the intermediate result P, and the input data X that were stored M updates earlier. The learning processing unit 23 then inputs the input data X read here to the estimator 22.

学習処理部２３は、推定器２２の出力を参照し、当該出力が予め定めた条件を満足するか否かを判断する。ここでは学習処理部２３は、当該推定器２２の出力と、入力データＸ（教師データに相当する）との差に基づく値（例えば二乗平均誤差）を損失として演算し、この損失の大きさが予め定めたしきい値を超えるか否かを調べる。ここで演算される損失の大きさが予め定めたしきい値を超えるとの条件が、例えば上記の予め定めた条件の一例に相当する。 The learning processing unit 23 refers to the output of the estimator 22 and determines whether or not the output satisfies a predetermined condition. Here, the learning processing unit 23 calculates a value (for example, the mean square error) based on the difference between the output of the estimator 22 and the input data X (corresponding to teacher data) as a loss, and the magnitude of this loss is Check whether or not a predetermined threshold is exceeded. A condition that the magnitude of the loss calculated here exceeds a predetermined threshold value corresponds to, for example, an example of the above-described predetermined condition.

学習処理部２３は、ここで損失の大きさが予め定めたしきい値を超えていたとき、つまり、上記予め定めた条件が満足されるときには、ここで入力した入力データに関連付けて記憶されている機械学習結果であるＭ回前の中間層３２と出力層３３との結合重みβを用いて、推定器２２の機械学習状態を補正する。具体的には、推定器２２の中間層３２と出力層３３との結合重みＶを、ここで読み出した結合重みβに設定する（この動作は、直近の機械学習の結果を部分的に破棄することに相当する）。 When the magnitude of the loss exceeds the predetermined threshold, that is, when the predetermined condition is satisfied, the learning processing unit 23 corrects the machine learning state of the estimator 22 using the connection weight β between the intermediate layer 32 and the output layer 33 from M updates earlier, which is the machine learning result stored in association with the input data input here. Specifically, it sets the connection weight V between the intermediate layer 32 and the output layer 33 of the estimator 22 to the connection weight β read here (this operation corresponds to partially discarding the most recent machine learning results).

またこのときには、学習処理部２３は、この時点で記憶している中間結果を、ここで読み出した中間結果Ｐに設定し直す。 Also, at this time, the learning processing unit 23 resets the intermediate result stored at this time to the intermediate result P read here.

なお、学習処理部２３は、損失の大きさが予め定めたしきい値を超えていなかった場合は、推定器２２の機械学習状態を補正することなく、機械学習処理を続ける。 Note that the learning processing unit 23 continues the machine learning processing without correcting the machine learning state of the estimator 22 when the magnitude of the loss does not exceed the predetermined threshold value.

本実施の形態のこの例によると、予め定めた方法で決定した回数（例えば実験的に定めてもよい）だけの機械学習を行うごとに、過去の所定の時点での入力データを現在の推定器２２に入力して損失の大きさが大きくなっていないかを確認する。そして損失が大きくなっていれば、直近の機械学習の内容を破棄して、推定器２２のパラメータを、上記過去の所定の時点での推定器２２のパラメータに戻すこととなる。 According to this example of the present embodiment, each time machine learning has been performed a number of times determined by a predetermined method (for example, the number may be determined experimentally), the input data from a predetermined past point in time is input to the current estimator 22 to check whether the loss has grown. If the loss has grown, the most recent machine learning is discarded and the parameters of the estimator 22 are returned to the parameters the estimator 22 had at that past point in time.

これにより、バッチサイズが比較的大きい値となっている場合と同様に、時間的に平均化した機械学習が行われることとなる。 As a result, time-averaged machine learning is performed as in the case where the batch size is a relatively large value.
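The partial-discard procedure can be sketched as follows. This is a minimal illustration rather than the patented implementation: the class name `RollbackTrainer` and the estimator interface (`train(x)`, `loss(x)`, and attributes `beta` and `P`) are hypothetical, and the record "M updates earlier" is taken to be the oldest of the M stored records.

```python
from collections import deque
from copy import deepcopy


class RollbackTrainer:
    """Train on every input; every m updates, re-check an old input and roll back."""

    def __init__(self, estimator, m, threshold):
        self.est = estimator          # assumed to expose train(), loss(), beta, P
        self.m = m
        self.threshold = threshold
        # Keeps at most the latest m records R_j = (X_j, beta_j, P_j);
        # older records drop out automatically.
        self.history = deque(maxlen=m)
        self.count = 0

    def step(self, x):
        """Process one input; return True if the learning state was rolled back."""
        self.est.train(x)
        self.history.append((x, deepcopy(self.est.beta), deepcopy(self.est.P)))
        self.count += 1
        if self.count % self.m != 0:
            return False
        x_old, beta_old, p_old = self.history[0]   # oldest stored record
        # Feed the old input to the *current* estimator and inspect the loss.
        if self.est.loss(x_old) > self.threshold:
            self.est.beta = deepcopy(beta_old)     # discard recent learning
            self.est.P = deepcopy(p_old)
            return True
        return False
```

Storing copies of `beta` and `P` for every update costs memory proportional to M, which is one reason the text allows older records to be deleted.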

［遅延学習］ [Delayed learning]
また本実施の形態の一例では、学習データを部分破棄する代わりに、学習を遅延して行ってもよい。この例では、学習処理部２３はデータ受入部２１が受け入れたデータを、推定器２２に対してそのまま入力データとして入力する（バッチサイズを１として逐次的な学習処理を実行する）。この推定器２２は、既に述べた例と同様にOS-ELM のニューラルネットワークを用いることとすればよい。すなわち、ここでも推定器２２と学習処理部２３とにより、バッチサイズを「１」とした、逐次的な学習処理を行うOS-ELMが実現されるものとする。 Further, in one example of the present embodiment, learning may be delayed instead of partially discarding the learning data. In this example, the learning processing unit 23 inputs the data received by the data receiving unit 21 to the estimator 22 as input data without modification (performing sequential learning processing with a batch size of 1). As in the example already described, an OS-ELM neural network may be used as the estimator 22. That is, here too, the estimator 22 and the learning processing unit 23 together implement an OS-ELM that performs sequential learning processing with a batch size of "1".

学習処理部２３は、入力データＸを受け入れて推定器２２に入力し、当該推定器２２が出力する出力データを得て、当該出力データと入力データＸとの差に基づく値（例えば二乗平均誤差）を損失として演算し、入力データＸと演算した損失とを関連付けて記憶部１２に格納する。この段階では学習処理部２３は、推定器２２の機械学習処理を行わない。 The learning processing unit 23 receives the input data X and inputs it to the estimator 22, obtains the output data output by the estimator 22, and calculates a value based on the difference between the output data and the input data X (for example, the mean square error ) is calculated as a loss, and the input data X and the calculated loss are stored in the storage unit 12 in association with each other. At this stage, the learning processing unit 23 does not perform machine learning processing for the estimator 22 .

学習処理部２３は、予め定められた回数Ｍ（Ｍは２以上の自然数とする）だけ、上記の処理を繰り返して、過去Ｍ回分の入力データＸとそれに基づく推定結果の損失の値とを、記憶部１２に保持している状態となると、Ｍ回前の入力データＸを用いた機械学習処理を行うか否かを判断する。 The learning processing unit 23 repeats the above process a predetermined number of times M (where M is a natural number of 2 or more), and obtains the input data X for the past M times and the loss value of the estimation result based thereon, When the data is held in the storage unit 12, it is determined whether or not to perform the machine learning process using the input data X M times before.

具体的にこの判断は、次のようにして行うことができる。すなわち学習処理部２３は、保持している情報を参照して、直近Ｍ回分の損失の値に基づく統計値（例えばここでは平均Ｅavとする）を求める。そして学習処理部２３は、Ｍ回前の（機械学習処理を行うか否かを判断する対象となったＭ回前の入力データＸに対する）損失の値Ｅを参照し、この値Ｅが、上記統計値である平均Ｅavを用いた条件
Ｅ＜ａ・Ｅav
を満足するか否かを判断する。このａは予め定めた定数であり、例えばａ＝３．０などとしておく。なお、ここでは統計値として平均を用いたが、平均だけでなく、中間値としてもよい。またａは定数ではなく直近Ｍ回分の損失の値の分散や標準偏差に基づいて定められてもよい。
Specifically, this determination can be made as follows. The learning processing unit 23 refers to the held information and obtains a statistic based on the loss values of the most recent M inputs (here, for example, the average Eav). The learning processing unit 23 then refers to the loss value E from M inputs earlier (the loss for the input data X from M inputs earlier, which is the subject of the decision on whether to perform machine learning processing) and determines whether E satisfies the following condition, which uses the average Eav:
E < a·Eav
Here a is a predetermined constant, for example a = 3.0. Although the average is used as the statistic here, an intermediate value may be used instead. Alternatively, a need not be a constant and may be determined from the variance or standard deviation of the loss values of the most recent M inputs.

学習処理部２３は、上記の値Ｅが、Ｅ＜ａ・Ｅavを満足するときには、Ｍ回前の入力データＸを用いて機械学習処理を実行する。つまり、当該Ｍ回前の入力データＸと、対応する損失Ｅとを用いて、推定器２２の中間層３２と出力層３３との結合重みＶを更新する。 When the above value E satisfies E<a·Eav, the learning processing unit 23 executes machine learning processing using the input data X M times before. That is, using the input data X of the M times before and the corresponding loss E, the connection weight V between the intermediate layer 32 and the output layer 33 of the estimator 22 is updated.

学習処理部２３は、そして当該Ｍ回前に格納した、入力データＸと損失の情報とを削除してもよい。 The learning processing unit 23 may then delete the input data X and the loss information stored M times before.

なお、学習処理部２３は、上記の判断において、値Ｅが、Ｅ＜ａ・Ｅavを満足しない場合は、Ｍ回前の入力データＸを用いた機械学習処理を実行することなく、当該Ｍ回前に格納した入力データＸと損失の情報とを削除する。 In the above determination, if the value E does not satisfy E<a·Eav, the learning processing unit 23 does not execute the machine learning process using the input data X M times before. Delete the previously stored input data X and loss information.

本実施の形態のこの例によると、予め定めた方法で決定した回数（例えば実験的に定めてもよい）だけ遅延して機械学習を行うか否かを判断し、大きく外れた入力データに基づく機械学習を行わないよう制御するので、条件に応じて直近の学習内容を破棄する上述の例と同様の効果を得ることができ、時間的に平均化した機械学習が行われることとなる。 According to this example of the present embodiment, it is determined whether or not to perform machine learning by delaying the number of times determined by a predetermined method (for example, it may be determined experimentally), and based on input data that greatly deviates Since control is performed so that machine learning is not performed, it is possible to obtain the same effect as the above-described example in which the most recent learning content is discarded according to conditions, and time-averaged machine learning is performed.
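The delayed-learning rule can be sketched as follows. The class name `DelayedTrainer` and the estimator interface (`loss(x)` to evaluate without training, `train(x)` to update) are hypothetical, and the average is used as the statistic.

```python
from collections import deque
from statistics import mean


class DelayedTrainer:
    """Hold each input for m steps, then train on it only if E < a * Eav."""

    def __init__(self, estimator, m, a=3.0):
        self.est = estimator      # assumed to expose loss(x) and train(x)
        self.m = m
        self.a = a
        self.buffer = deque()     # (input, loss) pairs awaiting a decision

    def step(self, x):
        """Return None while filling, True if the old input was learned, else False."""
        loss = self.est.loss(x)   # evaluate only; no learning at this stage
        self.buffer.append((x, loss))
        if len(self.buffer) < self.m:
            return None
        e_av = mean(e for _, e in self.buffer)   # statistic over the latest m losses
        x_old, e_old = self.buffer.popleft()     # the entry is removed either way
        if e_old < self.a * e_av:
            self.est.train(x_old)
            return True
        return False
```

One consequence of using the average with non-negative losses is that E can never exceed M·Eav, so with a = 3.0 the filter only rejects inputs when M is larger than 3.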

［並列化］ [Parallelization]
さらに本実施の形態によると、推定器２２は複数あっても構わない。この例に係るデータ処理装置１の制御部１１は、機能的には、図５に例示するように、データ受入部２１と、複数の推定器４２-1，４２-2…と、各推定器４２に対応して設けられる複数の学習処理部４３-1，４３-2…と、判定処理部４４と、出力処理部２５とを含んで構成される。なお、図２に例示したものと同様の構成となるものについては、同じ符号を付して繰り返しての説明を省略する。 Furthermore, according to the present embodiment, a plurality of estimators 22 may be provided. As illustrated in FIG. 5, the control unit 11 of the data processing device 1 according to this example functionally includes a data receiving unit 21, a plurality of estimators 42-1, 42-2, ..., a plurality of learning processing units 43-1, 43-2, ... provided in correspondence with the respective estimators 42, a determination processing unit 44, and an output processing unit 25. Components having the same configuration as those illustrated in FIG. 2 are denoted by the same reference numerals, and repeated descriptions are omitted.

本実施の形態のこの例に係る推定器４２（ここで各推定器を区別する必要がない場合は、それぞれの推定器をまとめて推定器４２と表記する。また学習処理部についても同様とする）のそれぞれは、図２に例示した推定器２２と同じもので構わない。つまり、各推定器４２は、それぞれOS-ELMに対応するニューラルネットワークでよい。 The estimator 42 according to this example of the present embodiment (if there is no need to distinguish between the estimators, the estimators are collectively referred to as the estimator 42. The same applies to the learning processing unit). ) may be the same as the estimator 22 illustrated in FIG. That is, each estimator 42 may be a neural network corresponding to OS-ELM.

また学習処理部４３は、対応する推定器４２のパラメータを、既に説明した学習処理部２３と同様にして機械学習処理により逐次的に更新する。学習処理部４３は、また推定器４２の出力データと、入力データ（教師データに相当する）との差に係る値（二乗平均誤差等）を損失として演算して出力する。さらに本実施の形態のこの例では、学習処理部４３は、損失の値を出力するとともに、対応する推定器４２をリセットしてからのパラメータの更新回数（入力データを入力した回数）を、学習状況情報として出力する。なお、ここでの例でも、それぞれの学習処理部４３は、学習結果の部分破棄の処理を実行してもよい。 The learning processing unit 43 sequentially updates the parameters of the corresponding estimator 42 by machine learning processing in the same manner as the learning processing unit 23 already described. The learning processing unit 43 also calculates, as a loss, a value related to the difference between the output data of the estimator 42 and the input data (corresponding to teacher data), such as the mean square error, and outputs it. Furthermore, in this example of the present embodiment, the learning processing unit 43 outputs, together with the loss value, the number of parameter updates since the corresponding estimator 42 was reset (the number of times input data has been input) as learning status information. In this example as well, each learning processing unit 43 may execute the process of partially discarding learning results.

本実施の形態のこの例において学習処理部４３-i（ｉ＝１，２，…）は、対応する推定器４２-i（ｉ＝１，２，…）を、学習処理部４３-iごとに定められる所定のタイミングＴ_iごとにリセットする。すなわち学習処理部４３-iは、前回推定器４２-iをリセット（入力層３１と中間層３２との結合重みＷ及びバイアスｂをランダムに決定）してから、タイミングＴ_iだけの時間が経過するごとに、推定器４２-iを再度リセットする。 In this example of the present embodiment, each learning processing unit 43-i (i = 1, 2, ...) resets the corresponding estimator 42-i (i = 1, 2, ...) at a predetermined timing T_i defined for each learning processing unit 43-i. That is, each time the period T_i elapses after the previous reset of the estimator 42-i (a reset randomly determines the connection weights W and the bias b between the input layer 31 and the intermediate layer 32), the learning processing unit 43-i resets the estimator 42-i again.

ここでタイミングＴ_iは、入力データの入力回数により定めてもよい（例えばタイミングＴ_iは入力データがｑ_i回入力されるごととしてもよい）し、図示しない時計部（現在時刻を計時ないし取得する回路部）から時刻の情報を取得し、当該時刻の情報に基づいて判断される、実際の時間経過により定めてもよい。 Here, the timing T_i may be defined by the number of times input data has been input (for example, T_i may be every q_i inputs), or it may be defined by the actual passage of time, judged from time information acquired from a clock unit (not shown; a circuit unit that measures or acquires the current time).

またこのタイミングＴ_iは、すべての推定器４２が（少なくとも異常検知の処理に必要な時間の間は）一斉にリセットされないタイミングとしておくこととしてもよい。例えばタイミングＴ_iを入力データの入力回数により定める場合は、各タイミングＴ_iに係る入力回数を素数とする。一例として推定器４２を２つ用いる場合に、それぞれのリセットのタイミングを、Ｔ₁＝２７４３７，Ｔ₂＝２７４４９（いずれも素数）としておくと、７．５億回までは同じタイミングでリセットすることがなくなる。 This timing T_i may also be chosen so that the estimators 42 are not all reset simultaneously (at least during the time required for anomaly detection processing). For example, when the timing T_i is defined by the number of inputs, the input count for each T_i is set to a prime number. As an example, when two estimators 42 are used and their reset timings are set to T₁ = 27437 and T₂ = 27449 (both prime), the two are not reset at the same time until about 750 million inputs.
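The "750 million" figure can be checked with a short calculation. Assuming each estimator resets every T_i inputs counted from a common start, the first input count at which all of them reset together is the least common multiple of the periods. The function name below is hypothetical.

```python
from math import gcd


def first_simultaneous_reset(periods):
    """Smallest input count at which every estimator resets at once (the LCM)."""
    result = 1
    for t in periods:
        result = result * t // gcd(result, t)
    return result


# The two periods given in the text share no common factor,
# so the LCM is simply their product.
print(first_simultaneous_reset([27437, 27449]))  # 753118213
```

Any pairwise coprime periods give the same guarantee; choosing distinct primes is a simple way to ensure coprimality.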

またタイミングＴ_iを現実の時刻により定める場合は、Ｔ₁を毎日午前０時０分０秒に、Ｔ₂を毎週月曜日の午前１時０分０秒に…というように定めれば、同じタイミングでリセットされることがなくなる。 When the timing T_i is defined by the actual time, setting T₁ to 0:00:00 a.m. every day, T₂ to 1:00:00 a.m. every Monday, and so on ensures that the estimators are never reset at the same time.

このように本実施の形態の一例では、学習処理部４３は、各推定器４２を、例えばそれぞれ互いに異なるタイミングでリセットすることで、すべての推定器４２が（少なくとも異常検知の処理に必要な時間の間は）一斉にリセットされないようにしておく。 As described above, in one example of the present embodiment, the learning processing unit 43 resets each estimator 42 at different timings, for example, so that all the estimators 42 (at least the time required for abnormality detection processing ), so that they are not reset all at once.

また、この例の判定処理部４４は、複数の学習処理部４３から、それぞれ対応する推定器４２が出力した出力データに係る損失の演算結果と、学習状況情報（ここでは対応する推定器４２をリセットしてからのパラメータの更新回数）とを受け入れる。 In addition, the determination processing unit 44 in this example receives the calculation result of the loss related to the output data output by the corresponding estimator 42 from the plurality of learning processing units 43, and the learning status information (here, the corresponding estimator 42 number of parameter updates since reset).

そして判定処理部４４は、学習状況情報を参照して、パラメータが十分に学習されている推定器４２に対応する学習処理部４３が出力した損失を参照し、当該損失の大きさが予め定めたしきい値を超えるか否かを調べる。ここでパラメータが十分に学習されているか否かは、例えば学習状況情報が表すパラメータの更新回数が予め定めた初期化しきい値を超えているか否かにより判断すればよい。つまり判定処理部４４は、学習状況情報が表すパラメータの更新回数が予め定めた初期化しきい値を超えていればパラメータが十分に学習されていると判断する。 Then, the determination processing unit 44 refers to the learning status information, refers to the loss output by the learning processing unit 43 corresponding to the estimator 42 whose parameters have been sufficiently learned, and determines that the magnitude of the loss is predetermined. Check if the threshold is exceeded. Here, whether or not the parameters have been sufficiently learned may be determined by, for example, whether or not the number of times the parameters represented by the learning status information have been updated exceeds a predetermined initialization threshold value. In other words, the determination processing unit 44 determines that the parameter has been sufficiently learned if the update count of the parameter represented by the learning status information exceeds a predetermined initialization threshold value.

判定処理部４４は、パラメータが十分に学習されていると判断された推定器４２の数Ｑにより、パラメータが十分に学習されていると判断された推定器４２に対応する学習処理部４３のうち、出力した損失の大きさが予め定めたしきい値を超えるものの数ｑを除して、この値ｑ／Ｑが所定の値、例えば１／２を超えるか否かを調べる。そしてこの値ｑ／Ｑが例えば１／２を超える場合（この所定の値が１／２であるときには、過半数の推定器４２が異常を検知したと判断される場合）に、入力データが異常である旨（つまり、製造装置等に異常が生じている旨）を表す判定の結果を出力処理部２５に出力させる。 The determination processing unit 44 counts the number q of learning processing units 43 that correspond to estimators 42 judged to have sufficiently learned parameters and whose output loss exceeds the predetermined threshold, divides q by the number Q of estimators 42 judged to have sufficiently learned parameters, and checks whether the resulting value q/Q exceeds a predetermined value, for example 1/2. If q/Q exceeds this value (when the predetermined value is 1/2, this means a majority of the estimators 42 are judged to have detected an abnormality), it causes the output processing unit 25 to output a determination result indicating that the input data is abnormal (that is, that an abnormality has occurred in the manufacturing equipment or the like).
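The voting rule can be sketched as follows. The function name and parameter names are hypothetical, and returning `None` when no estimator is yet sufficiently trained is an assumption of this sketch; the text does not say what happens when Q is zero.

```python
def is_anomalous(losses, update_counts, loss_threshold, init_threshold, ratio=0.5):
    """Vote among sufficiently trained estimators.

    losses         : loss reported for each estimator on the current input
    update_counts  : parameter updates since each estimator's last reset
    loss_threshold : an estimator votes "abnormal" when its loss exceeds this
    init_threshold : an estimator counts as trained when its update count exceeds this
    ratio          : the input is abnormal when q/Q exceeds this value
    """
    trained = [loss for loss, n in zip(losses, update_counts) if n > init_threshold]
    if not trained:
        return None  # assumption: withhold judgement while no estimator is ready
    q = sum(1 for loss in trained if loss > loss_threshold)
    return q / len(trained) > ratio
```

Because a freshly reset estimator is excluded from both q and Q, staggered resets never leave the vote empty as long as at least one estimator has passed the initialization threshold.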

［クラスタリング］ [Clustering]
また、このように推定器２２を複数設けて並列化するときには、次のように機械学習処理を行ってもよい。本実施の形態のここでの例に係るデータ処理装置１の制御部１１は、機能的には、図６に例示するように、図５に示した例と同様、データ受入部２１と、複数の推定器４２-1，４２-2…と、学習処理部４３′と、判定処理部４４′と、出力処理部２５と、第２学習処理部４５とを含んで構成される。なお、図５に例示したものと同様の構成となるものについては、同じ符号を付して繰り返しての説明を省略する。 Further, when a plurality of estimators 22 are provided and parallelized in this manner, machine learning processing may be performed as follows. As illustrated in FIG. 6, and as in the example shown in FIG. 5, the control unit 11 of the data processing device 1 according to this example functionally includes a data receiving unit 21, a plurality of estimators 42-1, 42-2, ..., a learning processing unit 43', a determination processing unit 44', an output processing unit 25, and a second learning processing unit 45. Components having the same configuration as those illustrated in FIG. 5 are denoted by the same reference numerals, and repeated descriptions are omitted.

本実施の形態のこの例に係る推定器４２（ここでも各推定器を区別する必要がない場合は、それぞれの推定器をまとめて推定器４２と表記する）のそれぞれは、図２に例示した推定器２２と同じもので構わない。つまり各推定器４２は、それぞれOS-ELMに対応するニューラルネットワークでよい。 Each of the estimators 42 according to this example of the present embodiment (also collectively referred to as the estimators 42 when there is no need to distinguish between the estimators) is illustrated in FIG. The same one as the estimator 22 may be used. That is, each estimator 42 may be a neural network corresponding to OS-ELM.

学習処理部４３′は、データ受入部２１が受け入れた入力データＸを各推定器４２に入力する。そして学習処理部４３′は、各推定器４２の出力データと、入力データＸとの二乗平均誤差を損失として演算し、判定処理部４４′に出力する。学習処理部４３′は、入力データＸを記憶部１２に蓄積して保持する。 The learning processing unit 43 ′ inputs the input data X received by the data receiving unit 21 to each estimator 42 . Then, the learning processing unit 43' calculates the mean square error between the output data of each estimator 42 and the input data X as a loss, and outputs it to the determination processing unit 44'. The learning processing unit 43 ′ accumulates and holds the input data X in the storage unit 12 .

また学習処理部４３′は、推定器４２のうち、その出力データに係る損失に基づく判断の結果が、入力データＸが「正常」であることを表すものとなっている推定器４２を特定する情報を判定処理部４４′から受け入れ、当該情報で特定される推定器４２のパラメータを、既に説明した学習処理部２３と同様にして機械学習処理により更新する。 Further, the learning processing unit 43' specifies the estimator 42 whose judgment result based on the loss related to the output data indicates that the input data X is "normal" among the estimators 42. Information is received from the determination processing unit 44', and parameters of the estimator 42 specified by the information are updated by machine learning processing in the same manner as the learning processing unit 23 already described.

判定処理部４４′は、学習処理部４３′から各推定器４２の損失に係る情報の入力を受けて、当該損失に基づいて、各推定器４２による入力データＸの正常／異常判定の結果を出力する。具体的に、判定処理部４４′は、対応する損失の大きさが予め定めたしきい値を超える推定器４２については、当該推定器４２が入力データＸを異常と判定したとする。また、対応する損失の大きさが予め定めたしきい値を超えていない推定器４２については、当該推定器４２が入力データＸを正常と判定したものとする。判定処理部４４′は、すべての推定器４２が入力データＸを異常と判定した場合に、入力データＸが異常であったと判定して、その旨を出力するよう、出力処理部２５に指示する。 The determination processing unit 44' receives information related to the loss of each estimator 42 from the learning processing unit 43', and determines the normal/abnormality determination result of the input data X by each estimator 42 based on the loss. Output. Specifically, the determination processing unit 44' determines that the input data X is abnormal for the estimator 42 whose corresponding loss magnitude exceeds a predetermined threshold value. Also, it is assumed that the estimator 42 for which the magnitude of the corresponding loss does not exceed the predetermined threshold determines that the input data X is normal. If all the estimators 42 determine that the input data X is abnormal, the determination processing unit 44′ determines that the input data X is abnormal, and instructs the output processing unit 25 to output that effect. .

また判定処理部４４′は、少なくとも一つの推定器４２が入力データを正常と判定しているときには、入力データＸは正常であるとして、その旨を出力するよう、出力処理部２５に指示する。 Further, when at least one estimator 42 determines that the input data is normal, the determination processing unit 44' determines that the input data X is normal and instructs the output processing unit 25 to output that fact.

さらに判定処理部４４′は、入力データＸを正常であると判断した推定器４２を特定する情報を、学習処理部４３′に出力する。 Further, the determination processing unit 44' outputs information specifying the estimator 42 that has determined that the input data X is normal to the learning processing unit 43'.
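The decision rule of the determination processing unit 44' can be sketched as follows. The function name is hypothetical; the returned index list plays the role of the information that identifies which estimators the learning processing unit 43' should go on to update.

```python
def judge_with_clusters(losses, threshold):
    """Return (is_abnormal, normal_ids) for one input.

    An estimator judges the input abnormal when its loss exceeds the threshold.
    The input as a whole is abnormal only when every estimator judges it abnormal.
    """
    normal_ids = [i for i, loss in enumerate(losses) if loss <= threshold]
    is_abnormal = len(normal_ids) == 0
    return is_abnormal, normal_ids
```

Only the estimators listed in `normal_ids` are then trained on the input, so an estimator keeps learning the kind of data it already reconstructs well.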

第２学習処理部４５は、予め定められたタイミングで起動し、推定器４２の機械学習処理を実行する。ここで予め定められたタイミングは、データ処理装置１の起動時点、あるいは所定の時間に１度、入力データＸが所定の回数だけ入力されるごと、異常と判断された入力データが所定の回数を超えて入力された時点、入力データの傾向に基づいて定めたクラスタ数ｍの値が妥当でないと判断した時点、利用者の指示による時点など、として定めておけばよい。クラスタ数の値が妥当であるか否かの判断は、クラスタ数を異ならせて試験的なクラスタリングを行い、その結果を参照して、クラスタ数を変更するか否かを判断する処理を予め定めた判断のタイミングごとに繰り返して実行することによって行う。ここでクラスタ数を変更するか否かの具体的な判断は、上記判断の処理の時点におけるクラスタ数でのクラスタリングの結果と、上記の試験的なクラスタリングの結果とにおける、同じクラスタに属する入力データ同士の距離（凝集性）や、互いに異なるクラスタに属する入力データ同士の距離（クラスタ間離散性）とに基づいて行うことができる。 The second learning processing unit 45 starts at a predetermined timing and executes machine learning processing of the estimators 42. The predetermined timing may be defined as, for example, the time the data processing device 1 starts up, once every predetermined period, every time the input data X has been input a predetermined number of times, the time at which the number of inputs judged abnormal exceeds a predetermined count, the time at which the value of the cluster number m determined from the tendency of the input data is judged inappropriate, or a time instructed by the user. Whether the cluster number is appropriate is determined by repeating, at each predetermined judgment timing, a process that performs trial clustering with different cluster numbers and refers to the results to decide whether to change the cluster number. The specific decision on whether to change the cluster number can be based on the distances between input data belonging to the same cluster (cohesion) and the distances between input data belonging to different clusters (inter-cluster separation), compared between the clustering result for the cluster number in use at the time of the judgment and the trial clustering results.

この第２学習処理部４５による機械学習処理は、次のようにして行われる。第２学習処理部４５は、学習処理部４３′が記憶部１２に蓄積して記録した入力データをクラスタリングする。ここでのクラスタリングの方法は、教師なしのクラスタリングであれば、どのような方法であってもよく、例えばDBScan、SUBCLU、k-meansなど、種々の処理のいずれかを採用できる。 Machine learning processing by the second learning processing unit 45 is performed as follows. The second learning processing unit 45 clusters the input data accumulated and recorded in the storage unit 12 by the learning processing unit 43'. The clustering method here may be any method as long as it is unsupervised clustering, and any one of various processes such as DBScan, SUBCLU, and k-means can be employed.

第２学習処理部４５は、推定器４２の数と同じ、またはより少ない数のクラスタに分類を行う。つまり推定器４２の数がｎであるとすると、クラスタの数をｍ≦ｎなるｍとする。あるいは、ここで得られるクラスタの数より多い推定器４２を予め用意しておくこととしてもよい（その場合は、入力データとなるべきデータを予め蓄積し、クラスタリング処理を行ってクラスタ数を決定する）。 The second learning processing unit 45 classifies into the same or smaller number of clusters than the number of estimators 42 . In other words, if the number of estimators 42 is n, the number of clusters is m where m≤n. Alternatively, more estimators 42 than the number of clusters obtained here may be prepared in advance (in that case, the data to be input data are accumulated in advance, clustering is performed, and the number of clusters is determined). ).

第２学習処理部４５は、各推定器４２を、いずれかのクラスタに係る入力データを機械学習する推定器であるとして、各推定器４２をいずれかのクラスタに割り当てる。なお、以前に割り当てを行っている場合には、第２学習処理部４５は改めて割り当てを変更することなく、以前の割り当ての結果をそのまま利用する。 The second learning processing unit 45 assigns each estimator 42 to one of the clusters as an estimator that performs machine learning on the input data related to one of the clusters. In addition, when the allocation has been performed before, the second learning processing unit 45 uses the result of the previous allocation as it is without changing the allocation again.

そして第２学習処理部４５は、記憶部１２に蓄積されている入力データを順次取り出し、取り出した入力データのクラスタリングの結果（属するクラスタを表す情報）を参照する。第２学習処理部４５は、当該情報で表されるクラスタに割り当てられている推定器４２に対して、当該入力データを入力し、その出力データを得て、既に述べた方法と同様の方法で、当該推定器４２を機械学習処理する。このとき、参照した情報で表されるクラスタに割り当てられていない推定器４２については、機械学習処理を行わない。 Then, the second learning processing unit 45 sequentially retrieves the input data accumulated in the storage unit 12, and refers to the clustering result of the retrieved input data (information representing the cluster to which it belongs). The second learning processing unit 45 inputs the input data to the estimator 42 assigned to the cluster represented by the information, obtains the output data, and uses the same method as described above. , the estimator 42 is machine-learned. At this time, machine learning processing is not performed for the estimators 42 that are not assigned to the cluster represented by the referenced information.

第２学習処理部４５は、記憶部１２に蓄積されている入力データのすべてについて上記の処理を終了すると、記憶部１２に蓄積している入力データを削除する（次回の学習には利用しないよう制御する）こととしてもよい。 After completing the above processing for all the input data stored in the storage unit 12, the second learning processing unit 45 deletes the input data stored in the storage unit 12 (does not use it for the next learning). control).

そして第２学習処理部４５による機械学習処理後、学習処理部４３′と、判定処理部４４′と出力処理部２５とによる動作を継続する。 After the machine learning processing by the second learning processing unit 45, the operations of the learning processing unit 43', the determination processing unit 44', and the output processing unit 25 are continued.

つまりこの例では、入力データをクラスタに分類し、分類されたクラスタごとに、対応する推定器４２を用意して、当該推定器４２を、対応するクラスタに属する入力データで機械学習する。 That is, in this example, input data is classified into clusters, an estimator 42 corresponding to each classified cluster is prepared, and machine learning is performed on the estimator 42 using input data belonging to the corresponding cluster.

本実施の形態のこの例によると、各推定器４２が第２学習処理部４５の動作により、入力データの種類に特化した機械学習を行うこととなる。これにより、例えば製造装置の異常判定を行う例においては、製造装置が一時停止しているときの振動、動作中の振動、…といったように、複数の異なるタイプの振動をそれぞれ機械学習するようになる。 According to this example of the present embodiment, the operation of the second learning processing unit 45 causes each estimator 42 to perform machine learning specialized for one type of input data. As a result, in the example of determining abnormalities in a manufacturing apparatus, the estimators machine-learn a plurality of different types of vibration separately, such as the vibration while the manufacturing apparatus is temporarily stopped, the vibration during operation, and so on.

なお、本実施の形態のこの例においては、各クラスタに対して利用者が任意にラベルを付してもよい。この例では制御部１１は当該ラベルの情報を記憶する。 In this example of the present embodiment, the user may arbitrarily label each cluster. In this example, the control unit 11 stores the label information.

そして判定処理部４４′は、少なくとも一つの推定器４２が入力データを正常と判定しているときには、当該推定器４２が割り当てられているクラスタに係るラベルの情報を、出力処理部２５に出力し、当該ラベルの情報を出力させる。これによると、利用者は、入力データの異常・正常の判断に加え、当該入力データがどのような状態に対応するものであるか（例えば上述の例であれば、「装置停止中」などといった状態）を識別可能となる。 When at least one estimator 42 determines that the input data is normal, the determination processing unit 44' passes the label information for the cluster to which that estimator 42 is assigned to the output processing unit 25 and causes it to output the label information. This allows the user not only to learn whether the input data is abnormal or normal but also to identify what state the input data corresponds to (in the above example, a state such as "device stopped").

また、判定処理部４４′は、すべての推定器４２の判定結果を総合的に判断することで、当該入力データに対応するラベル（当該入力データがどのような状態に対応するものであるか）の確信度を出力してもよい。例えば、一つの推定器４２が正常と判定し、それ以外の推定器４２がすべて異常と判定したときは正常と判断した推定器４２が割り当てられているクラスタに係るラベルの情報に対する確信度（当該クラスタ分類に対する信頼度）は高くなる。一方、一つの入力データに対して、複数の推定器４２が一斉に正常と判定したときには、当該複数の推定器４２のそれぞれが割り当てられているクラスタに係るラベルの情報に対する確信度は低くなる。 In addition, the judgment processing unit 44' comprehensively judges the judgment results of all the estimators 42 to determine the label corresponding to the input data (what state the input data corresponds to). may output the confidence of For example, when one estimator 42 is determined to be normal and all the other estimators 42 are determined to be abnormal, the certainty (the Confidence in cluster classification) is higher. On the other hand, when a plurality of estimators 42 simultaneously determine that one input data is normal, the reliability of the label information related to the cluster to which each of the plurality of estimators 42 is assigned becomes low.

そこで判定処理部４４′は、複数の入力データのそれぞれに対して推定器４２ごとに、単独で正常と判断した（他の推定器４２が異常と判断しているときに、当該推定器４２のみが正常と判断した）割合、つまり単独で正常と判断した回数を、正常と判断した回数（他の推定器４２も正常と判断したときを含む回数）で除した値を、当該推定器４２に割り当てられているクラスタに係るラベルの確信度として出力してもよい。 Therefore, the determination processing unit 44' independently determined that each estimator 42 was normal for each of the plurality of input data (when the other estimators 42 were determined to be abnormal, only the relevant estimator 42 determined to be normal), that is, the number of times it was determined to be normal by itself divided by the number of times it was determined to be normal (including the number of times when other estimators 42 were also determined to be normal). It may be output as the confidence of the label associated with the assigned cluster.
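The confidence calculation can be sketched as follows. The function name and the input layout (one row of booleans per input, one column per estimator) are hypothetical, and returning `None` for an estimator that never judged any input normal is an assumption, since the ratio is undefined in that case.

```python
def label_confidence(judgements):
    """Per-estimator ratio of solo-normal judgements to all normal judgements.

    judgements: list of rows, one per input; row[i] is True when estimator i
    judged that input normal.
    """
    n = len(judgements[0])
    solo = [0] * n      # times estimator i was the only one judging "normal"
    normal = [0] * n    # times estimator i judged "normal" at all
    for row in judgements:
        votes = sum(row)
        for i, ok in enumerate(row):
            if ok:
                normal[i] += 1
                if votes == 1:
                    solo[i] += 1
    return [s / t if t else None for s, t in zip(solo, normal)]
```

A value near 1 means the estimator's cluster is well separated from the others; a value near 0 means its "normal" judgements usually overlap with other estimators, so its label is less informative.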

[Detection of behavioral anomalies]
As already described, the data processing device 1 of the present embodiment can also be used to determine whether the behavior of a subject is normal or abnormal.

To detect behavioral anomalies, a code (for example, a single letter of the alphabet) is defined in advance for each action of the subject. For a subject driving a vehicle, the actions are, for example, turning the steering wheel left, turning it right, or pressing the accelerator; for a subject operating a computer, they are, for example, the types of commands entered. A series of actions by the subject is then expressed as a code string (a sequence such as A, C, E, B, ...).

The data receiving unit 21 of the data processing device 1 accepts from the input unit 13, multiple times, a code string representing the subject's series of actions in each fixed period. The data receiving unit 21 then generates a state transition table for each period from the code string for that period. Specifically, from a code string of N codes, the data receiving unit 21 takes the i-th code C_i (i being each integer with 0 < i < N) and the (i+1)-th code C_{i+1}, forms the ordered pair (permutation) (C_i, C_{i+1}), and computes the probability of occurrence of each such ordered pair.

The data receiving unit 21 generates vector information that associates every possible ordered pair with its probability of occurrence (the probability for a pair that was not extracted from the code string is 0), and uses this vector as the state transition table. The data receiving unit 21 then outputs the vector information of the state transition table as input data to the learning processing unit 23 (or the learning processing unit 43 or 43′), yielding an estimator that discriminates between normal and abnormal.
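A minimal sketch of building such a state transition vector from one period's code string is shown below. The function name `transition_vector` is hypothetical. Two points are assumptions rather than statements from the source: the probability of a pair is taken as its relative frequency among all consecutive pairs in the string, and "all possible ordered pairs" is taken to include pairs of identical codes such as (A, A).

```python
from collections import Counter
from itertools import product


def transition_vector(codes, alphabet):
    """Return occurrence probabilities of consecutive code pairs.

    codes: sequence of action codes for one period, e.g. "ACEB".
    alphabet: all defined codes, which fixes the order of vector elements.
    Pairs that never occur in `codes` get probability 0.
    """
    pairs = list(zip(codes, codes[1:]))  # (C_i, C_{i+1}) for each i
    counts = Counter(pairs)
    total = len(pairs)
    return [
        counts[pair] / total if total else 0.0
        for pair in product(alphabet, repeat=2)
    ]


# Hypothetical period: actions A, C, A, B over the alphabet {A, B, C}.
vector = transition_vector("ACAB", "ABC")
```

With an alphabet of K codes the vector has K² elements, most of which are 0 for a short code string.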

A state transition table generated in this way, however, is expected to be sparse vector information (most of its elements are 0).

In this example of the present embodiment, therefore, the data receiving unit 21 may apply a widely known compression process (one that merges multiple elements to reduce the number of elements) to the state transition table obtained for each period, and use the compressed vector information as the input data. Widely known methods, such as one based on Candes-Tao theory, can be used for this process, so a detailed description is omitted here.

Alternatively, when obtaining the state transition table V_j for the j-th period, if a state transition table V_{j-1} for the (j-1)-th period exists, the data receiving unit 21 may take the state transition table V′_j obtained for the j-th period by the method above and compute the desired table V_j as
V_j = V′_j + α·V_{j-1}
Here α is an arbitrary constant, for example α = 0.8.
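The recurrence V_j = V′_j + α·V_{j-1} can be sketched as below. The function name `accumulate` is hypothetical, and using the raw table unchanged for the first period (when no previous table exists) follows from the "if ... exists" condition in the text. No renormalization is applied, since the source does not mention one.

```python
def accumulate(raw_tables, alpha=0.8):
    """Apply V_j = V'_j + alpha * V_{j-1} across successive periods.

    raw_tables: list of per-period vectors V'_j of equal length.
    Returns the list of accumulated vectors V_j.
    """
    result = []
    previous = None
    for raw in raw_tables:
        if previous is None:
            current = list(raw)  # first period: no earlier table to add
        else:
            current = [r + alpha * p for r, p in zip(raw, previous)]
        result.append(current)
        previous = current
    return result


# Two hypothetical periods with two-element tables.
tables = accumulate([[1.0, 0.0], [0.0, 1.0]], alpha=0.5)
```

Because each V_j carries a decayed copy of earlier periods, elements that were 0 in the current period can still be non-zero, which reduces sparsity.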

The data processing device 1 of this example may have the configuration illustrated in FIG. 6. With that configuration, when detecting abnormal or normal behavior of a subject driving a vehicle, the estimators 42 are expected to specialize through machine learning, one corresponding to driving, another to being stopped, and so on. In this case, the data processing device 1 outputs information indicating that a behavioral abnormality has been detected when every estimator 42 judges the input abnormal, that is, when no estimator 42 judges it normal.

[Preprocessing]
The data receiving unit 21 of the data processing device 1 of the present embodiment may further preprocess the data output by the input unit 13. This preprocessing applies a predetermined transformation to the vector data x to be processed, with elements (x_0, x_1, x_2, ..., x_n), and obtains the transformed vector y, with elements (y_0, y_1, y_2, ..., y_n), for example as
y_i = Σ_j α_j·x_j
(where Σ_j denotes the sum over j). Here α is a filter function (kernel), which may be, for example,
α_j = 0 (when j < i-1 or j > i+1)
α_j = 1/3 (when i-1 ≤ j ≤ i+1)

Alternatively, the filter function may be
α_j = 0 (when j < i-1 or j > i+1)
α_j = 1/4 (when j = i-1 or j = i+1)
α_j = 1/2 (when j = i)
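The two three-tap kernels above can be applied as in the following sketch. The names `smooth`, `BOX` and `TRIANGLE` are hypothetical. The source does not say how the ends of the vector are handled; here, as an assumption, terms whose index falls outside the vector are simply dropped (treated as 0).

```python
def smooth(x, kernel):
    """Compute y_i = sum_j alpha_j * x_j for a three-tap kernel.

    kernel: weights (for x[i-1], x[i], x[i+1]).
    Out-of-range neighbours at both ends are skipped (assumed convention).
    """
    n = len(x)
    y = []
    for i in range(n):
        total = 0.0
        for offset, weight in zip((-1, 0, 1), kernel):
            j = i + offset
            if 0 <= j < n:
                total += weight * x[j]
        y.append(total)
    return y


BOX = (1 / 3, 1 / 3, 1 / 3)       # first kernel: equal weights
TRIANGLE = (1 / 4, 1 / 2, 1 / 4)  # second kernel: centre-weighted

# A single spike is spread over its neighbours.
spread = smooth([0.0, 4.0, 0.0], TRIANGLE)
```

Both kernels sum to 1, so a constant signal is left unchanged away from the ends of the vector.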

The data receiving unit 21 may accept the data output by the input unit 13, transform it with one of the filter functions described above, and output the transformed data as input data to the learning processing unit 23 (or the learning processing unit 43 or 43′).

In this way, when the input data is, for example, vector data whose values are arranged in time series, the determination becomes robust to fluctuations along the time axis.

As noted above, there is more than one way to define the filter function, so the configuration illustrated in FIG. 5 may also be used. In that case, an estimator 42 is designated for each filter function. The data receiving unit 21 applies each of the filter functions to obtain a plurality of transformed data sets, and outputs each transformed data set to the corresponding learning processing unit 43 so that the estimator 42 for each filter function is machine-learned with the data transformed by that filter function.

A method using HOG features may also be adopted as the transformation. Specifically, the elements of the vector data x are arranged in a matrix of a predetermined size (for example, w×h), a region of a predetermined window size Ww×Hw (Ww < w, Hw < h) is set within that matrix, and the gradient direction and gradient magnitude of the region (which constitutes local data) are computed. Their histogram is taken as the transformed vector y, and the transformed data may be output as input data to the learning processing unit 23 (or the learning processing unit 43 or 43′).

This processing is effective when the vector data x is originally image data of the predetermined size (w×h). In that case, each component of the vector data x is the luminance value of a pixel of the image data. The method of expressing HOG features as a vector y is widely known, so a detailed description is omitted here.
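As a rough illustration of the idea, the sketch below computes a magnitude-weighted histogram of gradient orientations for one window. It is a simplified stand-in, not a full HOG implementation: it omits cell and block normalization, and the function name `orientation_histogram`, the central-difference gradients, the 9 unsigned-orientation bins, and the skipping of the window's border pixels are all assumptions made for the example.

```python
import math


def orientation_histogram(image, top, left, win_h, win_w, bins=9):
    """Histogram of gradient orientations inside one window.

    image: 2-D list of luminance values (rows of equal length).
    (top, left): upper-left corner of the window; win_h x win_w its size.
    Border pixels of the window are skipped so neighbours always exist.
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for r in range(top + 1, top + win_h - 1):
        for c in range(left + 1, left + win_w - 1):
            gx = image[r][c + 1] - image[r][c - 1]  # horizontal gradient
            gy = image[r + 1][c] - image[r - 1][c]  # vertical gradient
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned
            hist[int(angle / bin_width) % bins] += magnitude
    return hist


# Hypothetical 4x4 image whose luminance increases from left to right.
ramp = [[float(c) for c in range(4)] for _ in range(4)]
ramp_hist = orientation_histogram(ramp, 0, 0, 4, 4)
```

Sliding the window over the matrix and concatenating the histograms would give one possible form of the transformed vector y.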

[Multilayering]
A plurality of data processing devices 1 of the present embodiment may also be used, connected to one another in a tree-structured network such as a fat tree.

In this case, a data processing device 1 is placed at each node of the tree-structured network, and the data processing device 1 corresponding to the node with no parent (the root node) is the topmost device. A data processing device 1p corresponding to a node that has children accepts, through its input unit 13, the determination results output by the data processing devices 1f corresponding to its child nodes (information indicating whether the input data supplied to each data processing device 1f is normal). Using these determination results as the target of machine learning (as both input data and teacher data), it trains an estimator configured with OS-ELM or the like as an autoencoder, and determines whether the input data is normal or abnormal.

In this example, the lower-level data processing devices 1 (those that take information such as the vibration of a product manufacturing machine as input) detect abnormalities in the details of the monitored system. Meanwhile, a data processing device 1 corresponding to a parent node, which accepts input from, for example, the several data processing devices 1 that detect vibration abnormalities of the individual manufacturing machines in one workroom, performs aggregate anomaly detection for that workroom as a whole (a wider part of the system).

Connecting the data processing devices 1 of the present embodiment in multiple layers in this way makes it possible to detect abnormalities at various scales of the system.

[Factor estimation]
The data processing device 1 of the present embodiment may also be connected to a factor estimation device that uses a deep-learning neural network. This network is machine-learned with, as input, the input data judged abnormal together with a plurality of input data items supplied to the data processing device 1 before and after it, and, as the correct answer, information representing the cause of the abnormality.

In this case, the data processing device 1 accumulates and holds at least the N most recently supplied input data items. When an input data item judged abnormal arrives, the device waits until m further input data items (m < N, with m = N - n) have been supplied. Once those m items have arrived, it sends the N held input data items to the factor estimation device: n - 1 items from before the abnormality, the 1 item judged abnormal, and the m items supplied after it, N items in total.
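The buffering described above can be sketched with a fixed-length queue. The class name `AnomalyWindow` and its `push` method are hypothetical. Two behaviors are assumptions not covered by the source: abnormal inputs that arrive while the device is already waiting are stored but do not restart the wait, and nothing is sent if fewer than N items have been seen by the time the wait ends.

```python
from collections import deque


class AnomalyWindow:
    """Hold the N latest inputs; emit them m = N - n inputs after an anomaly."""

    def __init__(self, N, n):
        if not 1 <= n <= N:
            raise ValueError("need 1 <= n <= N")
        self.N = N
        self.m = N - n
        self.buffer = deque(maxlen=N)  # oldest item is dropped automatically
        self.remaining = None          # inputs still awaited, or None if idle

    def push(self, data, is_abnormal):
        """Store one input; return the N-item window when it is ready."""
        self.buffer.append(data)
        if self.remaining is None:
            if is_abnormal:
                self.remaining = self.m
        else:
            self.remaining -= 1
        if self.remaining == 0:
            self.remaining = None
            if len(self.buffer) == self.N:
                return list(self.buffer)
        return None


# Hypothetical stream 0..6 in which the value 3 is judged abnormal.
window = AnomalyWindow(N=5, n=3)
outputs = [window.push(v, v == 3) for v in range(7)]
```

In this run the window is emitted two inputs after the anomaly and contains two items from before it, the abnormal item itself, and the two items that followed.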

A widely known configuration can be adopted for the factor estimation device described above, so a detailed description is omitted here.

1 data processing device, 11 control unit, 12 storage unit, 13 input unit, 14 output unit, 21 data receiving unit, 22 estimator, 23 learning processing unit, 24 determination processing unit, 25 output processing unit, 31 input layer, 32 intermediate layer, 33 output layer, 42 estimator, 43, 43′ learning processing unit, 44, 44′ determination processing unit, 45 second learning processing unit.

Claims

A data processing device that performs a predetermined determination process based on repeatedly input data, comprising:
estimation means that accepts input data and teacher data and is machine-learnable by reciprocal arithmetic;
learning processing means for performing machine learning on the estimation means using the input data as both input data and teacher data; and
output means for performing the predetermined determination process based on a comparison between the output of the estimation means and the input data, and outputting a result of the determination process,
wherein a plurality of the estimation means are provided, each of the estimation means including a neural network in which connection weights between the input layer and the intermediate layer are determined randomly and connection weights between the intermediate layer and the output layer are machine-learned,
the learning processing means resets the connection weights and biases between the input layer and the intermediate layer of each of the plurality of estimation means at timings such that not all of the estimation means are reset at the same time, and
the output means performs the predetermined determination process based on a comparison between the output of each of the plurality of estimation means and the input data, and outputs a result of the determination process.

A data processing device that performs a predetermined determination process based on repeatedly input data, comprising:
estimation means that accepts input data and teacher data and is machine-learnable by reciprocal arithmetic;
learning processing means for performing machine learning on the estimation means using the input data as both input data and teacher data; and
output means for performing the predetermined determination process based on a comparison between the output of the estimation means and the input data, and outputting a result of the determination process,
wherein the learning processing means performs machine learning processing of the estimation means each time input of the data is accepted,
each time the machine learning processing is performed, records the machine learning result of the estimation means in association with the input data, and
each time the machine learning processing has been performed a number of times determined by a predetermined method, refers to the output obtained when the recorded input data is input to the estimation means at that time and, when a condition that the loss based on that output exceeds a predetermined threshold is satisfied, corrects the machine learning state of the estimation means by setting it to the machine learning result recorded in association with that input data.

The data processing device according to claim 2, wherein
the estimation means includes a neural network in which connection weights between the input layer and the intermediate layer are determined randomly and connection weights between the intermediate layer and the output layer are machine-learned,
the learning processing means, each time the machine learning processing is performed, records the machine learning result, including the connection weights between the intermediate layer and the output layer of the estimation means, in association with the input data, and
each time the machine learning processing has been performed a number of times determined by a predetermined method, refers to the output obtained when the recorded input data is input to the estimation means at that time and, when a condition that the loss based on that output exceeds a predetermined threshold is satisfied, corrects the machine learning state, including the connection weights between the intermediate layer and the output layer of the estimation means, by setting it to the machine learning result, including those connection weights, recorded in association with that input data.

A data processing device that performs a predetermined determination process based on repeatedly input data, comprising:
estimation means that accepts input data and teacher data and is machine-learnable by reciprocal arithmetic;
learning processing means for performing machine learning on the estimation means using the input data as both input data and teacher data; and
output means for performing the predetermined determination process based on a comparison between the output of the estimation means and the input data, and outputting a result of the determination process,
wherein the learning processing means performs machine learning processing of the estimation means each time input of the data is accepted,
each time the machine learning processing is performed, the machine learning result of the estimation means and the input data are recorded in association with each other,
the estimation means includes a neural network that is machine-learned using loss information based on the difference between the output obtained when the input data is input and the teacher data, and
the learning processing means records the losses for the most recent predetermined M inputs (M is a natural number), determines, based on a comparison between a statistic computed from the recorded losses and the loss obtained when the input data from M inputs earlier is input to the estimation means, whether to perform machine learning processing based on the input data from M inputs earlier, and, when it determines to do so, machine-learns the estimation means using the input data from M inputs earlier as both input data and teacher data.

The data processing device according to any one of claims 1 to 4, wherein
a plurality of the estimation means are provided, and the device further comprises means for classifying the input data into clusters, and
the learning processing means associates each of the plurality of estimation means with one of the classified clusters, determines the cluster to which each item of input data belongs, and performs machine learning on the estimation means corresponding to the determined cluster using that input data.

A data processing system including a plurality of data processing devices connected to one another in a tree-structured network,
each of the data processing devices being
a data processing device that performs a predetermined determination process based on repeatedly input data, comprising:
estimation means that accepts input data and teacher data and is machine-learnable by reciprocal arithmetic;
learning processing means for performing machine learning on the estimation means using the input data as both input data and teacher data; and
output means for performing the predetermined determination process based on a comparison between the output of the estimation means and the input data, and outputting a result of the determination process,
wherein a plurality of the estimation means are provided, each of the estimation means including a neural network in which connection weights between the input layer and the intermediate layer are determined randomly and connection weights between the intermediate layer and the output layer are machine-learned,
the learning processing means resets the connection weights and biases between the input layer and the intermediate layer of each of the plurality of estimation means at timings such that not all of the estimation means are reset at the same time, and
the output means performs the predetermined determination process based on a comparison between the output of each of the plurality of estimation means and the input data, and outputs a result of the determination process.

A data processing system including a plurality of data processing devices connected to one another in a tree-structured network,
each of the data processing devices being
a data processing device that performs a predetermined determination process based on repeatedly input data, comprising:
estimation means that accepts input data and teacher data and is machine-learnable by reciprocal arithmetic;
learning processing means for performing machine learning on the estimation means using the input data as both input data and teacher data; and
output means for performing the predetermined determination process based on a comparison between the output of the estimation means and the input data, and outputting a result of the determination process,
wherein the learning processing means performs machine learning processing of the estimation means each time input of the data is accepted,
each time the machine learning processing is performed, records the machine learning result of the estimation means in association with the input data, and
each time the machine learning processing has been performed a number of times determined by a predetermined method, refers to the output obtained when the recorded input data is input to the estimation means at that time and, when a condition that the loss based on that output exceeds a predetermined threshold is satisfied, corrects the machine learning state of the estimation means by setting it to the machine learning result recorded in association with that input data.

A program for causing a computer to function as a data processing device that performs a predetermined determination process based on repeatedly input data, the program causing the computer to function as:
a plurality of estimation means, each of which accepts input data and teacher data, is machine-learnable by reciprocal arithmetic, and includes a neural network in which connection weights between the input layer and the intermediate layer are determined randomly and connection weights between the intermediate layer and the output layer are machine-learned;
learning processing means for performing machine learning on the estimation means using the input data as both input data and teacher data; and
output means for performing the predetermined determination process based on a comparison between the output of the estimation means and the input data, and outputting a result of the determination process,
wherein, when functioning as the learning processing means, the computer resets the connection weights and biases between the input layer and the intermediate layer of each of the plurality of estimation means at timings such that not all of the estimation means are reset at the same time, and
when functioning as the output means, the computer performs the predetermined determination process based on a comparison between the output of each of the plurality of estimation means and the input data, and outputs a result of the determination process.

A program for causing a computer to function as a data processing device that performs a predetermined determination process based on repeatedly input data, the program causing the computer to function as:
estimation means that accepts input data and teacher data and is machine-learnable by reciprocal arithmetic;
learning processing means for performing machine learning on the estimation means using the input data as both input data and teacher data; and
output means for performing the predetermined determination process based on a comparison between the output of the estimation means and the input data, and outputting a result of the determination process,
wherein, when functioning as the learning processing means, the computer performs machine learning processing of the estimation means each time input of the data is accepted; each time the machine learning processing is performed, records the machine learning result of the estimation means in association with the input data; and, each time the machine learning processing has been performed a number of times determined by a predetermined method, refers to the output obtained when the recorded input data is input to the estimation means at that time and, when a condition that the loss based on that output exceeds a predetermined threshold is satisfied, corrects the machine learning state of the estimation means by setting it to the machine learning result recorded in association with that input data.