JP2023112782A

JP2023112782A - State detection device, state detection method, and state detection program

Info

Publication number: JP2023112782A
Application number: JP2022014701A
Authority: JP
Inventors: 健生山本; Tatsuo Yamamoto
Original assignee: Omron Corp; Omron Tateisi Electronics Co
Current assignee: Omron Corp
Priority date: 2022-02-02
Filing date: 2022-02-02
Publication date: 2023-08-15

Abstract

To easily reduce the frequency with which the state of an observation target is erroneously recognized due to differences between a learning environment and an operating environment.

SOLUTION: A storage unit stores, in association with each other, observation data input to an input unit and a recognition result of the state of the observation target, the recognition result being obtained by a recognition model generated by machine learning on the basis of a feature amount of the observation data. When a recognition result is designated as an erroneous recognition, an extraction unit extracts, on the basis of the feature amount of the observation data from which the designated recognition result was obtained, other observation data whose recognition results may also be erroneous. Upon receiving permission to update the recognition model, an update unit updates a parameter that the recognition model uses to determine the state of the observation target.

SELECTED DRAWING: Figure 4

Description

The present invention relates to a technique for recognizing the state of an observation target based on observation data of the observation target.

Conventionally, there has been a device that recognizes whether an object to be detected is identical to a known object (see Patent Document 1).

The device of Patent Document 1 stores in a storage unit, for each type of known object, information (spatial information) constituting a reference space (Mahalanobis space) generated by machine learning from the feature amounts of that known object. The device acquires the feature amount of an input object to be detected. For each type of known object, it calculates the distance (Mahalanobis distance) between the reference space of that type and the acquired feature amount. Based on the distances calculated for the respective types, the device recognizes which of the known objects the object to be detected is.

The device of Patent Document 1 also accumulates the feature amounts acquired from objects to be detected that could not be recognized as any of the known objects. At an appropriate timing, the device performs re-learning using these stored feature amounts in order to improve the recognition accuracy for objects to be detected.

Patent Document 1: JP 2005-214682 A

However, the device of Patent Document 1 does not assume that erroneous recognition of the object to be detected may arise from a difference between the environment in which the learning data used for machine learning was acquired (learning environment) and the environment in which observation data of the object to be recognized is acquired (operating environment). That is, the device of Patent Document 1 cannot suppress erroneous recognition caused by the difference between the learning environment and the operating environment.

In addition, the operating environment may change over time. Even if sufficient recognition accuracy is obtained at the start of operation, the recognition accuracy can decline with the passage of time (that is, the frequency of erroneous recognition increases).

An object of the present invention is to provide a technique that can easily reduce the frequency with which the state of an observation target is erroneously recognized due to a difference between the learning environment and the operating environment.

To achieve the above object, the state detection device of the present invention is configured as follows.

Observation data of an observation target is input to the input unit. The observation target is the target whose state is to be recognized, for example a space, a person, an object, or a substance. The observation data is sensing data obtained by sensing the observation target with a sensor. The observation data may be, for example, a frame image of the observation target captured by a camera, audio data around the observation target collected by a microphone, or the temperature of the observation target measured by a temperature sensor. The observation data need not be a single item and may comprise a plurality of items.

The storage unit stores, in association with each other, the observation data input to the input unit and the recognition result of the state of the observation target that a recognition model generated by machine learning obtained based on the feature amount of that observation data. The recognition model may include, for example, a variational autoencoder (VAE) trained using learning data that includes observation data of the observation target in its normal state, or it may be another type of recognition model. The recognition model may recognize, for example, whether the state of the observation target is normal or abnormal, or which of three or more predetermined states applies.

When a recognition result is designated as an erroneous recognition, the extraction unit extracts from the storage unit, based on the feature amount of the observation data from which the designated recognition result was obtained, other observation data whose recognition results may also have been erroneous. For example, the extraction unit extracts observation data whose feature amount is similar to that of the observation data whose recognition result was pointed out as erroneous.

When permission to update the recognition model is received after the observation data extracted by the extraction unit has been output, the update unit updates the parameter that the recognition model uses to determine the state of the observation target. For example, the observation data extracted by the extraction unit may be shown on a display so that the user can view it.

With this configuration, the user can check other observation data whose recognition results may have been erroneous. The user reviews the observation data extracted as possibly misrecognized and, if the recognition results for all of the output observation data are indeed erroneous, performs an input operation permitting the update of the recognition model. The update unit then updates the parameter that the recognition model uses to determine the state of the observation target.

Therefore, the user can reduce the frequency of erroneous recognition of the observation target caused by, for example, a difference between the learning environment and the operating environment, through the simple operation of selecting and designating an erroneous recognition result.

Further, the device may include an instruction unit. When, after the observation data extracted by the extraction unit has been output, a designation of observation data whose recognition result is not to be corrected is received, the instruction unit instructs the extraction unit to narrow down the previously extracted observation data (the other observation data whose recognition results may have been erroneous) based on the feature amount of the designated observation data.

With this configuration, if the observation data extracted by the extraction unit includes observation data whose recognition result is not erroneous, that observation data can be excluded before the parameter used by the recognition model to determine the state of the observation target is updated. This prevents the parameter from being updated to an inappropriate value.

When the recognition model is, for example, a VAE, the VAE may acquire, as the feature amount, the distribution of the latent variable of the observation data input to the input unit as a judgment distribution, and recognize the state of the observation target based on the distance between the judgment distribution and a reference distribution.

The Mahalanobis distance, for example, may be used as the distance between the judgment distribution and the reference distribution.

The observation data may be a difference image between a first average image of n frame images captured consecutively in time and a second average image of m frame images captured consecutively in time. Here, n is greater than m, and m is 1 or more. The frame image with the latest capture time among those used to generate the first average image is the same frame image as the one with the latest capture time among those used to generate the second average image.

With this configuration, an image (difference image) of an object that appeared in the imaging area immediately before the capture time of the latest frame image used to generate the first and second average images can be acquired as observation data. Therefore, when the recognition model recognizes the state of the observation target from the type, size, position, and the like of objects located within the observation target area, observation data suited to recognizing that state can be used.
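As a non-limiting illustration, the difference image described above might be computed as in the following Python sketch. The function names `average_image` and `difference_image`, the representation of a grayscale frame as a list of pixel rows, and the use of the absolute difference are assumptions made for this sketch and are not specified in this description.

```python
def average_image(frames):
    """Pixel-wise mean of equally sized grayscale frames (each a list of rows)."""
    count = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    return [[sum(f[y][x] for f in frames) / count for x in range(width)]
            for y in range(height)]


def difference_image(frames, n, m):
    """Difference between the n-frame and m-frame average images.

    `frames` is ordered from oldest to newest. Both averages end at the
    newest frame, so they share the frame with the latest capture time.
    """
    if not (n > m >= 1):
        raise ValueError("requires n > m >= 1")
    if len(frames) < n:
        raise ValueError("need at least n frames")
    first = average_image(frames[-n:])   # first average image (n frames)
    second = average_image(frames[-m:])  # second average image (m frames)
    # Assumption: the absolute difference is used as the difference image.
    return [[abs(s - f) for s, f in zip(row_s, row_f)]
            for row_s, row_f in zip(second, first)]
```

Because the first average image covers a longer period, an object that appeared only in the most recent frames contributes little to it but strongly to the second average image, and therefore remains visible in the difference.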

According to the present invention, the frequency with which the state of an observation target is erroneously recognized due to a difference between the learning environment and the operating environment can be easily reduced.

FIG. 1 is a diagram showing an abnormality recognition system to which the state detection device of this example is applied.
FIG. 2 is an example of a frame image in which a video camera has captured the premises of a station as the observation target.
FIG. 3 is an example of a frame image in which a video camera has captured a road as the observation target.
FIG. 4 is a diagram showing the configuration of the main part of the state detection device of this example.
FIG. 5 is a block diagram showing the functional configuration of the recognition unit of the state detection device of this example.
FIG. 6 is a flowchart showing the state recognition processing of the state detection device of this example.
FIG. 7 is a flowchart showing the update processing of the state detection device of this example.

Embodiments of the present invention will be described below.

<1. Application example>
FIG. 1 is a diagram showing an abnormality recognition system to which the state detection device of this example is applied. The abnormality recognition system 100 of this example includes a state detection device 1, a video camera 2, a display 3, and an input device 4.

The state detection device 1 of this example recognizes whether the state of the observation target is normal or abnormal and outputs the recognition result. The observation target may be the premises of a railway station or a road on which vehicles travel. When the observation target is the premises of a railway station, the state detection device 1 recognizes an abnormal state if an abandoned suspicious object 105 is present. When the observation target is a road on which vehicles travel, the state detection device 1 recognizes an abnormal state if an obstacle that hinders the travel of a vehicle 110 is on the road. The obstacle is, for example, an object (cargo) that has fallen onto the road from the loading platform of a vehicle 110, or a vehicle 110 that has stopped on the road due to a failure such as engine trouble.

The video camera 2 captures images of the observation target and outputs the captured frame images to the state detection device 1. The video camera 2 is mounted at an angle at which the observation target fits within the imaging area. In other words, all or part of the imaging area of the video camera 2 is the observation target. FIGS. 2 and 3 are schematic diagrams showing frame images captured by the video camera. FIG. 2 is an example of a frame image in which the video camera 2 has captured the premises of a station as the observation target, and FIG. 3 is an example of a frame image in which the video camera 2 has captured a road as the observation target. The frame rate of the video camera 2 is, for example, several tens of frames/sec (for example, 10 to 30 frames/sec). The observation data referred to in the present invention is obtained from the frame images of the observation target captured by the video camera 2.

Note that the video camera 2 may be a digital still camera that captures a still image in response to a release signal input from an external device.

The display 3 displays, among other things, the recognition results of the state of the observation target obtained by the state detection device 1. The display 3 displays a screen based on screen display data input from the state detection device 1. That is, the state detection device 1 controls the screen shown on the display 3. The display 3 also displays screens for a GUI (Graphical User Interface).

The input device 4 is a keyboard, a mouse, or the like with which the user performs input operations on the state detection device 1.

The state detection device 1 uses a recognition model generated by machine learning to recognize whether the state of the observation target is normal. The recognition model makes this recognition based on the feature amount of observation data obtained from the frame images of the observation target captured by the video camera 2. The state detection device 1 stores the observation data and the recognition result in association with each other.

When the state detection device 1 receives an input designating a recognition result of the recognition model as an erroneous recognition, it extracts and outputs, based on the feature amount of the observation data from which the designated recognition result was obtained, other observation data whose recognition results may also have been erroneous. For example, the state detection device 1 displays some or all of the observation data extracted as possibly misrecognized on the screen of the display 3.

When the user has confirmed that all of the observation data extracted as possibly misrecognized was indeed misrecognized, the user operates the input device 4 to input permission to update the recognition model to the state detection device 1. Upon receiving the update permission, the state detection device 1 updates the value of the parameter used to determine the state of the observation target to a value at which the corrected recognition result is obtained for all of the extracted observation data.

When the user finds, among the observation data extracted as possibly misrecognized, observation data whose recognition result is correct (not an erroneous recognition), the user operates the input device 4 to input to the state detection device 1 that updating the recognition model is not permitted. At this time, the user performs an input operation designating one or more items of the extracted observation data whose recognition results are correct. Upon receiving the input that the update is not permitted, the state detection device 1 narrows down the previously extracted observation data by excluding, based on the feature amounts of the observation data designated this time as correctly recognized, observation data whose recognition results are estimated to be correct.

The state detection device 1 displays some or all of the narrowed-down observation data on the screen of the display 3 and waits for the user to input either update permission or update non-permission.

The user can thus update the value of the parameter used to determine the state of the observation target by performing an input operation that designates a recognition result confirmed to be erroneous. Therefore, the state detection device 1 of this example can easily reduce the frequency of erroneous recognition of the observation target caused by a difference between the learning environment and the operating environment.

<2. Configuration example>
FIG. 4 is a diagram showing the configuration of the main part of the state detection device of this example. The state detection device 1 of this example includes a control unit 11, an input unit 12, an output unit 13, a storage unit 14, a display output unit 15, and an input reception unit 16.

The control unit 11 controls the operation of each part of the main body of the state detection device 1. The control unit 11 includes a recognition unit 11a, an extraction unit 11b, an update unit 11c, an instruction unit 11d, and a display control unit 11e, which are described later.

The video camera 2, mounted at an angle for capturing the observation target, is connected to the input unit 12. Frame images of the observation target captured by the video camera 2 are input to the input unit 12.

The output unit 13 outputs the recognition result of the state of the observation target to a host device.

The storage unit 14 stores the observation data of the observation target in association with the recognition result of the state of the observation target obtained for that observation data. The storage unit 14 is a storage medium such as an HDD (Hard Disk Drive), an SSD (Solid State Drive), or an SD memory card.

The display 3 is connected to the display output unit 15. The display output unit 15 outputs screen display data to the display 3, and the display 3 displays a screen based on this data. The display output unit 15 is an interface circuit for the display 3.

The input device 4 is connected to the input reception unit 16. Input operation data corresponding to the user's operation of the input device 4 is input to the input reception unit 16. The input reception unit 16 is an interface circuit for the input device 4.

Next, the recognition unit 11a, the extraction unit 11b, the update unit 11c, the instruction unit 11d, and the display control unit 11e of the control unit 11 will be described.

The recognition unit 11a recognizes the state of the observation target (normal or abnormal in this example). FIG. 5 is a block diagram showing the functional configuration of the recognition unit of the state detection device of this example. In this example, the recognition unit 11a includes a variational autoencoder 20 (hereinafter referred to as the VAE 20), a feature amount calculation unit 23, a distance calculation unit 24, a distribution storage unit 25, and a determination unit 26.

The VAE 20 is a recognition model generated by deep learning using learning data that includes observation data of the observation target in its normal state. The VAE 20 has an encoder 21 and a decoder 22, both of which are neural networks. The encoder 21 extracts a latent variable z from the input observation data. The decoder 22 takes the latent variable z as input and restores the observation data. Here, the observation data that the decoder 22 restores from the latent variable z is called restored data.

The feature amount calculation unit 23 calculates the feature amount of the latent variable z. The distribution storage unit 25 stores the distribution of the feature amount of the latent variable z obtained for observation data acquired when the state of the observation target is normal (hereinafter referred to as the normal distribution). The distribution storage unit 25 stores this distribution in a reference space. The reference space is described as a Mahalanobis space in this example, but it may be a space generated by another statistical method.

Note that the distribution storage unit 25 may instead store the distribution of the feature amount of the latent variable z obtained for observation data acquired when the state of the observation target is abnormal (hereinafter referred to as the abnormal distribution), or it may store both the normal distribution and the abnormal distribution.

The distance calculation unit 24 applies the feature amount of the latent variable z that the feature amount calculation unit 23 calculated for the observation data to the Mahalanobis space, which is the reference space, and calculates the distance (Mahalanobis distance) to the normal distribution.

The determination unit 26 compares the Mahalanobis distance calculated by the distance calculation unit 24 with a predetermined judgment distance and determines whether the state of the observation target is normal or abnormal. For example, the determination unit 26 determines that the state of the observation target is normal if the Mahalanobis distance to the normal distribution is equal to or less than the judgment distance, and abnormal if it exceeds the judgment distance.
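The determination described above can be sketched in Python as follows. This is only an illustrative sketch: the function names `fit_distribution`, `mahalanobis`, and `judge` are hypothetical, the normal distribution (that is, the distribution of normal-state feature amounts) is assumed to be summarized by a mean vector and an invertible sample covariance matrix, and the latent feature amounts are assumed to be already available as numeric vectors.

```python
import math


def fit_distribution(features):
    """Return (mean, covariance) of a list of equal-length feature vectors."""
    n, dim = len(features), len(features[0])
    mean = [sum(f[i] for f in features) / n for i in range(dim)]
    cov = [[sum((f[i] - mean[i]) * (f[j] - mean[j]) for f in features) / (n - 1)
            for j in range(dim)] for i in range(dim)]
    return mean, cov


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def mahalanobis(feature, mean, cov):
    """Mahalanobis distance: sqrt(diff^T * cov^-1 * diff)."""
    diff = [f - mu for f, mu in zip(feature, mean)]
    return math.sqrt(sum(d * s for d, s in zip(diff, solve(cov, diff))))


def judge(feature, mean, cov, judgment_distance):
    """Normal if the distance to the normal distribution is within the judgment distance."""
    within = mahalanobis(feature, mean, cov) <= judgment_distance
    return "normal" if within else "abnormal"
```

For example, `judge(z_feature, mean, cov, 2.0)` would return `"normal"` for a feature amount lying within a Mahalanobis distance of 2.0 of the stored distribution, where `z_feature` and the threshold 2.0 are placeholder values.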

Note that when the distribution storage unit 25 stores the abnormal distribution of the feature amount of the latent variable z, the distance calculation unit 24 applies the feature amount of the latent variable z that the feature amount calculation unit 23 calculated for the observation data to the Mahalanobis space, which is the reference space, and calculates the Mahalanobis distance to the abnormal distribution. In this case, the determination unit 26 may determine that the state of the observation target is normal if the Mahalanobis distance to the abnormal distribution is equal to or greater than the judgment distance, and abnormal if it is less than the judgment distance.

Furthermore, when the distribution storage unit 25 stores both the normal distribution and the abnormal distribution of the feature amount of the latent variable z, the distance calculation unit 24 may apply the feature amount of the latent variable z that the feature amount calculation unit 23 calculated for the observation data to the Mahalanobis space, which is the reference space, and calculate both the Mahalanobis distance to the normal distribution and the Mahalanobis distance to the abnormal distribution. In this case, the determination unit 26 may determine the state of the observation target by comparing the Mahalanobis distance to the normal distribution (first distance) with the Mahalanobis distance to the abnormal distribution (second distance). Specifically, the determination unit 26 determines that the state is normal if the first distance is shorter than the second distance, and abnormal if the first distance is longer than the second distance.

When a recognition result of the state of the observation target obtained by the recognition unit 11a is designated as an erroneous recognition, the extraction unit 11b extracts from the storage unit 14, based on the feature amount of the observation data from which the designated recognition result was obtained, other observation data whose recognition results by the recognition unit 11a may also have been erroneous. The extraction unit 11b may extract any number of such other observation data, as long as it extracts at least one.
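One conceivable way to perform such an extraction is sketched below in Python. The `Record` structure, the function name `extract_candidates`, the use of the Euclidean distance as the similarity measure, the `radius` threshold, and the restriction to records with the same recognition result as the designated record are all assumptions made for this sketch; the description only requires that the extraction be based on the feature amount of the designated observation data.

```python
import math
from dataclasses import dataclass


@dataclass
class Record:
    """Observation data stored in association with its recognition result."""
    record_id: int
    feature: tuple   # feature amount of the observation data
    result: str      # recognition result, e.g. "normal" or "abnormal"


def extract_candidates(records, designated_id, radius):
    """Return other records that may share the designated misrecognition.

    A record is a candidate if it has the same recognition result as the
    designated record and its feature lies within `radius` of the
    designated record's feature.
    """
    designated = next(r for r in records if r.record_id == designated_id)
    return [r for r in records
            if r.record_id != designated_id
            and r.result == designated.result
            and math.dist(r.feature, designated.feature) <= radius]
```

The extracted records would then be presented to the user, who either permits the update or designates records whose recognition results were in fact correct.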

When the other observation data extracted by the extraction unit 11b (other observation data whose recognition results by the recognition unit 11a may have been erroneous) are also confirmed to be erroneous recognitions and an update of the parameters used for state recognition is instructed, the update unit 11c updates those parameters. In this example, the determination distance that the determination unit 26 uses to determine the state of the observation target is updated.

Note that the update unit 11c may update the distribution of the feature amount of the latent variable z in the reference space, which is stored in the distribution storage unit 25.

When the instruction unit 11d receives a designation that any of the other observation data extracted by the extraction unit 11b (other observation data whose recognition results by the recognition unit 11a may have been erroneous) is not an erroneous recognition (that is, its recognition result is correct), it instructs the extraction unit 11b to exclude the observation data designated as not erroneous and to narrow down the other observation data whose recognition results may have been erroneous. Following the instruction from the instruction unit 11d, the extraction unit 11b narrows down those other observation data and outputs the result (the narrowed-down set of other observation data whose recognition results may have been erroneous).

The display control unit 11e generates screen display data to be displayed on the screen of the display device 3.

When the hardware CPU constituting the control unit 11 of the state detection device 1 executes the state detection program according to the present invention, it operates as the recognition unit 11a, the extraction unit 11b, the update unit 11c, the instruction unit 11d, and the display control unit 11e. The memory has an area into which the state detection program according to the present invention is loaded and an area for temporarily storing data generated while the program runs. The control unit 11 may be an LSI that integrates the hardware CPU, the memory, and the like. The hardware CPU is the computer that executes the state detection method according to the present invention.

<3. Operation example>
The state recognition processing in which the state detection device 1 of this example recognizes the state of the observation target is described first. FIG. 6 is a flowchart showing the state recognition processing of the state detection device of this example.

The state detection device 1 has an observation data storage unit (not shown) that temporarily stores the observation data of the observation target input to the input unit 12. In this example, the observation data are frame images of the observation target captured by the video camera 2. The state detection device 1 selects the observation data to be processed from the observation data temporarily stored in the observation data storage unit (s1). For example, the frame images of the moving image captured by the video camera 2 may be selected in order as the observation data to be processed, or the frame image m frames after the previously selected frame image may be selected.

Note that the method described above for selecting the observation data to be processed is only an example, and another method may be used.

The state detection device 1 encodes the observation data selected in s1 with the encoder 21 of the VAE 20 and extracts the latent variable z of that observation data (s2). At this point, the VAE 20 may or may not use the decoder 22 to reconstruct the observation data from the extracted latent variable z.

The feature amount calculation unit 23 calculates the feature amount of the latent variable z extracted from the observation data (s3). The feature amount of the latent variable z calculated in s3 may be calculated for each pixel of the observation data (frame image) to be processed, or for each of a plurality of predetermined regions into which the frame image is divided. The frame image may be divided according to, for example, the surroundings of the observation target: it may be divided into n × m rectangular regions by dividing it into n parts vertically and m parts horizontally, into regions bounded by curves, or in some other way.
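As one illustration of the rectangular division, the following sketch computes the pixel bounds of n × m regions (n parts vertically, m parts horizontally). The function name `split_regions` and the use of integer division to place the boundaries are assumptions; the description does not specify how boundaries are chosen when the frame size is not an exact multiple of n or m.

```python
def split_regions(height, width, n, m):
    """Return (top, bottom, left, right) pixel bounds of n x m rectangular regions.

    n is the number of divisions in the vertical direction and m the number in
    the horizontal direction. Bounds are half-open: rows top..bottom-1 and
    columns left..right-1 belong to the region.
    """
    regions = []
    for i in range(n):
        top, bottom = i * height // n, (i + 1) * height // n
        for j in range(m):
            left, right = j * width // m, (j + 1) * width // m
            regions.append((top, bottom, left, right))
    return regions
```

A feature amount could then be calculated per region by iterating over the returned bounds; the regions cover the frame without gaps or overlap.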

Note that the feature amount of the latent variable z calculated in s3 is of the type whose normal distribution is stored in the distribution storage unit 25.

The distance calculation unit 24 applies the feature amount calculated in s3 to the Mahalanobis space, which is the reference space, and calculates the distance (Mahalanobis distance) to the normal distribution (s4). In this example the Mahalanobis distance to the normal distribution is calculated, but when the distribution storage unit 25 stores the abnormal distribution of the feature amount of the latent variable z, the Mahalanobis distance to the abnormal distribution is calculated. When the distribution storage unit 25 stores both the normal distribution and the abnormal distribution of the feature amount of the latent variable z, both the Mahalanobis distance to the normal distribution and the Mahalanobis distance to the abnormal distribution are calculated.

Based on the Mahalanobis distance calculated in s4, the determination unit 26 performs determination processing to determine whether the state of the observation target is normal or abnormal (s5). The determination result of the determination unit 26 is the recognition result of the recognition unit 11a.

The state detection device 1 outputs the determination result of the determination unit 26, stores the current observation data, the feature amount calculated in s3, and the recognition result in the storage unit 14 in association with one another (s6, s7), and returns to s1. In s6, the result may be output to a host device only when, for example, the state of the observation target has been determined to be abnormal. Also in s6, the display control unit 11e may generate screen display data for displaying a screen showing the current observation data and the recognition result on the display device 3, and output it to the display device 3. By looking at the screen of the display device 3, the user can check the recognition result together with the frame image of the observation target.
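Steps s5 to s7 can be sketched as follows. This is only an illustration under stated assumptions: the state is taken to be normal when the Mahalanobis distance to the normal distribution does not exceed the determination distance, and the names `Record`, `judge`, and `recognize_and_store` are hypothetical. Keeping the distance in the stored record is a convenience added here, not something the description requires.

```python
from dataclasses import dataclass

@dataclass
class Record:
    frame_id: int      # stands in for the observation data (frame image)
    feature: list      # feature amount of the latent variable z (s3)
    distance: float    # Mahalanobis distance from s4 (extra field, assumption)
    result: str        # recognition result: "normal" or "abnormal"

def judge(distance, determination_distance):
    """s5: assumed rule, normal when the distance is within the determination distance."""
    return "normal" if distance <= determination_distance else "abnormal"

def recognize_and_store(frame_id, feature, distance, determination_distance, storage):
    """s5 to s7: judge the state, then store data, feature, and result together."""
    record = Record(frame_id, list(feature), distance,
                    judge(distance, determination_distance))
    storage.append(record)  # association is kept by storing one record per frame
    return record
```

Storing the three items in one record is what later allows other observation data to be looked up by feature amount when a result is designated as erroneous.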

Next, the update processing in which the state detection device 1 of this example updates the parameters used to recognize the state of the observation target is described. FIG. 7 is a flowchart showing the update processing of the state detection device of this example. The state detection device 1 executes this update processing when the user performs an operation on the input device 4 indicating that a recognition result of the state of the observation target is an erroneous recognition.

The state detection device 1 waits for the input reception unit 16 to receive an input indicating that a recognition result of the state of the observation target is an erroneous recognition (s11).

When the input reception unit 16 receives an input indicating that a recognition result of the state of the observation target is an erroneous recognition, the extraction unit 11b of the state detection device 1 extracts, based on the observation data just designated as erroneously recognized, other observation data whose recognition results may also have been erroneous from among the recognition results stored in the storage unit 14 (s12). In s12, the extraction is based on, for example, the feature amount of the latent variable z extracted from the designated observation data. Specifically, observation data whose latent variable z feature amount is similar to that of the designated observation data (the similarity exceeds a predetermined threshold) are extracted as other observation data whose recognition results may have been erroneous. Alternatively, for example, observation data whose Mahalanobis distance to the normal distribution differs from that calculated for the designated observation data by an amount within a predetermined range are extracted as other observation data whose recognition results may have been erroneous.
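The two extraction criteria of s12 can be sketched as below. The description does not name a similarity measure, so cosine similarity is used purely as an assumption; the record layout (dictionaries with `feature` and `distance` keys) and the function names are likewise hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two feature vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def extract_by_similarity(target, stored, threshold):
    """Records whose feature similarity to the designated record exceeds the threshold."""
    return [r for r in stored
            if r is not target
            and cosine_similarity(r["feature"], target["feature"]) > threshold]

def extract_by_distance(target, stored, tolerance):
    """Records whose Mahalanobis distance differs from the target's by at most tolerance."""
    return [r for r in stored
            if r is not target
            and abs(r["distance"] - target["distance"]) <= tolerance]
```

Either function returns the candidate set that is then shown to the user; the threshold and the tolerance correspond to the predetermined threshold and predetermined range in the text, and their values are left to the implementer.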

The display control unit 11e generates screen display data for displaying on the screen of the display device 3 the other observation data extracted in s12 whose recognition results may have been erroneous, and outputs it to the display device 3 (s13). The display device 3 thereby displays on its screen the other observation data extracted this time as possibly erroneous.

The user checks the screen of the display device 3 and confirms whether all of the other observation data extracted this time as possibly erroneous are in fact erroneous recognitions. In other words, the user checks whether the extracted observation data include any observation data whose recognition result is not an erroneous recognition.

If all of the other observation data extracted this time as possibly erroneous are in fact erroneous recognitions, the user performs an input operation on the input device 4 to permit the update. If the extracted observation data include observation data whose recognition result is not an erroneous recognition, the user designates that observation data (the observation data whose recognition result was not erroneous) and performs an input operation to disallow the update.

The state detection device 1 waits for an input permitting or disallowing the update (s14, s15). When the user performs the input operation permitting the update on the input device 4, the update unit 11c instructs the recognition unit 11a to update the parameters used to determine the state of the observation target. Following this instruction, the recognition unit 11a updates the parameters (s16), and processing returns to s11. In s16, for example, the determination distance may be updated, or the normal distribution, the abnormal distribution, or both the normal and abnormal distributions of the feature amount of the latent variable z stored in the distribution storage unit 25 may be updated. In s16, the update unit 11c updates the parameters so that the recognition results of the observation data that the user has confirmed to be erroneously recognized change.
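The description states only that the parameters are updated so that the confirmed erroneous results change; it does not give an update rule. The sketch below shows one possible rule for the determination distance, assuming the state is judged normal when the Mahalanobis distance to the normal distribution is less than or equal to the determination distance. The function name, the `misjudged_as` argument, and the `margin` value are all hypothetical.

```python
def update_determination_distance(current, confirmed_distances, misjudged_as,
                                  margin=1e-6):
    """Return a determination distance that flips the confirmed misrecognitions.

    confirmed_distances: Mahalanobis distances of the observation data that the
    user confirmed to be erroneously recognized.
    misjudged_as: the (wrong) result those data received, "abnormal" or "normal".
    """
    if misjudged_as == "abnormal":
        # False alarms: raise the threshold so every confirmed distance is within it.
        return max(current, max(confirmed_distances))
    if misjudged_as == "normal":
        # Missed detections: lower the threshold just below the smallest distance.
        return min(current, min(confirmed_distances) - margin)
    raise ValueError("misjudged_as must be 'normal' or 'abnormal'")
```

Updating the stored normal or abnormal distribution instead, as the text also allows, would require re-estimating the distribution and is not covered by this sketch.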

When the user performs the input operation disallowing the update on the input device 4, the state detection device 1 narrows down the observation data extracted in s12 (the other observation data whose recognition results may have been erroneous) (s17). In s17, based on the observation data that was designated together with the update-disallow input as not erroneously recognized, the instruction unit 11d instructs the extraction unit 11b to divide the observation data extracted in s12 into two groups: an erroneous recognition group, whose recognition results are judged to be erroneous, and a proper recognition group, whose recognition results are judged to be correct. For example, the observation data extracted in s12 are divided into the two groups based on the latent variable z feature amount of the observation data designated as erroneously recognized and that of the observation data designated as not erroneously recognized. For example, for each observation data extracted in s12 (excluding the observation data designated as not erroneously recognized), the extraction unit 11b calculates the distance (first feature distance) between the latent variable z feature amount of that observation data and that of the observation data designated as erroneously recognized in s11. Likewise, for each observation data extracted in s12 (excluding the observation data designated as not erroneously recognized), the extraction unit 11b calculates the distance (second feature distance) between the latent variable z feature amount of that observation data and that of the observation data designated as not erroneously recognized in s15. The extraction unit 11b decides whether the observation data belongs to the erroneous recognition group or the proper recognition group by comparing the first feature distance with the second feature distance. Specifically, the extraction unit 11b assigns observation data whose first feature distance is shorter than its second feature distance to the erroneous recognition group. Conversely, it assigns observation data whose first feature distance is longer than its second feature distance to the proper recognition group.
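The grouping in s17 amounts to a nearest-reference assignment, which can be sketched as follows. Euclidean distance between feature vectors is an assumption (the description says only "distance"), as are the function name `split_groups`, the dictionary layout of the candidates, and the choice to place ties in the proper recognition group.

```python
import math

def split_groups(candidates, misrecognized_feature, proper_feature):
    """Divide candidates into (erroneous recognition group, proper recognition group).

    candidates: dict mapping an observation id to its latent-variable feature vector,
    excluding the observation data the user designated as not erroneously recognized.
    """
    erroneous, proper = [], []
    for obs_id, feature in candidates.items():
        first = math.dist(feature, misrecognized_feature)   # first feature distance
        second = math.dist(feature, proper_feature)         # second feature distance
        # Assumption: equal distances fall into the proper recognition group.
        (erroneous if first < second else proper).append(obs_id)
    return erroneous, proper
```

The two returned lists correspond to the groups that are displayed separately in s18.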

As the narrowing result, the display control unit 11e generates screen display data for displaying the observation data on the screen of the display device 3 by the groups formed in s17, and outputs it to the display device 3 (s18). The display device 3 thereby displays a screen that distinguishes, among the observation data extracted in s12, the observation data whose recognition results are judged to be erroneous from the observation data whose recognition results are judged to be proper. The state detection device 1 can therefore let the user confirm which observation data it judges to be erroneous recognitions and which it judges to be proper recognitions when the recognition result of the observation data designated this time is treated as correct.

After completing the processing of s18, the state detection device 1 returns to s14 and repeats the processing described above.

In this way, through the simple operation of designating observation data whose recognition result the user judges to be erroneous, the user can update the parameters that the recognition unit 11a (recognition model) uses to recognize the state of the observation target. The state detection device 1 of this example can therefore easily reduce the frequency of erroneous recognition of the state of the observation target caused by differences between the learning environment and the operating environment.

As noted above, the parameters referred to here are not limited to the determination distance used to judge normal or abnormal; they may also be the normal distribution of the feature amount of the observation data when the state of the observation target is normal, the abnormal distribution of the feature amount of the observation data when the state of the observation target is abnormal, and so on.

<4. Modifications>
In the above example, the recognition unit 11a uses the frame images captured by the video camera 2 as the observation data of the observation target. Alternatively, the recognition unit 11a may generate a difference image between a first average image of n temporally consecutive frame images captured by the video camera 2 and a second average image of m temporally consecutive frame images (n > m) captured by the video camera 2, and use this difference image as the observation data.

In this case, n is 2 or more and m is 1 or more. The n frame images used to generate the first average image may or may not include the m frame images used to generate the second average image.

For example, the first average image may be generated from n temporally consecutive frame images captured by the video camera 2, and the second average image may be generated from the m most recently captured frame images among those n frame images. In this case, the m frame images used to generate the second average image are included in the n frame images used to generate the first average image.

As another example, among p temporally consecutive frame images (p < n + m) captured by the video camera 2, the first average image may be generated from the n earliest frame images and the second average image from the m latest frame images. In this case, some of the frame images used to generate the second average image are included in the n frame images used to generate the first average image.

As another example, among n + m temporally consecutive frame images captured by the video camera 2, the first average image may be generated from the n earliest frame images and the second average image from the m latest frame images. In this case, the second average image is generated from m frame images whose capture times immediately follow those of the n frame images used to generate the first average image.

As yet another example, among q temporally consecutive frame images (q > n + m) captured by the video camera 2, the first average image may be generated from the n earliest frame images and the second average image from the m latest frame images. In this case, the q frames include frame images that are used for neither the first average image nor the second average image. The number of such unused frame images among the q frames may be about several to several tens of frames.
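The variants above differ only in the length of the window of consecutive frames, so they can be covered by one sketch that averages the n earliest and the m latest frames of a window. Frames are represented here as 2-D lists of grayscale values, the function names are hypothetical, and taking the absolute per-pixel difference is an assumption (the description says only "difference image").

```python
def average_image(frames):
    """Per-pixel mean of a list of equally sized 2-D grayscale frames."""
    h, w = len(frames[0]), len(frames[0][0])
    return [[sum(f[y][x] for f in frames) / len(frames) for x in range(w)]
            for y in range(h)]

def difference_image(frames, n, m):
    """Difference between the average of the n earliest and the m latest frames.

    A window of exactly n frames reproduces the case where the m frames are
    contained in the n frames; fewer than n + m frames gives a partial overlap;
    exactly n + m gives adjacent groups; more than n + m leaves unused frames.
    """
    if not (n > m >= 1):
        raise ValueError("requires n > m >= 1")
    if len(frames) < n:
        raise ValueError("window must contain at least n frames")
    first = average_image(frames[:n])     # first average image (earliest n frames)
    second = average_image(frames[-m:])   # second average image (latest m frames)
    # Assumption: absolute per-pixel difference.
    return [[abs(a - b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(first, second)]
```

For instance, with four 1 × 1 frames whose pixel values are 0, 0, 0, and 9, `difference_image(frames, 3, 1)` compares the average of the first three frames with the last frame.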

With this configuration, the recognition unit 11a uses as observation data (the difference image between the first average image and the second average image) an image of an object that appeared in the imaging area of the video camera 2 (the observation target area) immediately before the second time (for example, a suspicious object 105 or an object that has fallen from a vehicle 110), so observation data suited to recognizing the state of the observation target can be obtained.

In the above example the recognition unit 11a uses the VAE 20, but as long as it has a recognition model generated by machine learning, the model is not limited to the VAE 20 and may be another type of recognition model.

In the above example the observation data is a frame image of the observation target or a difference image, but it may instead be audio data around the observation target collected by a microphone, the temperature of the observation target or its surroundings measured by a temperature sensor, or sensing data obtained by measuring the observation target or its surroundings with a sensor other than a temperature sensor. The observation data may also be of multiple types rather than one.

In the above example, the recognition unit 11a is configured to recognize whether the state of the observation target is normal or abnormal, but it may instead recognize which of three or more states the observation target is in (for example, a first state, a second state, and an intermediate state between the first state and the second state).

The present invention is not limited to the above embodiment as it is, and at the implementation stage the constituent elements can be modified and embodied without departing from the gist of the invention. Various inventions can also be formed by appropriately combining the constituent elements disclosed in the above embodiment. For example, some constituent elements may be removed from all of the constituent elements shown in the embodiment. Constituent elements from different embodiments may also be combined as appropriate. In addition, the order of the processes shown in FIGS. 6 and 7 is not limited to the illustrated order and may be changed as appropriate.

Furthermore, the correspondence between the configuration according to the present invention and the configuration according to the embodiment described above can be stated as in the following appendix.
<Appendix>
A state detection device (1) comprising:
an input unit (12) to which observation data of an observation target is input;
a storage unit (14) that stores, in association with each other, the observation data input to the input unit (12) and a recognition result of the state of the observation target recognized, based on a feature amount of the observation data, by a recognition model (20) generated by machine learning;
an extraction unit (11b) that, when a recognition result is designated as an erroneous recognition, extracts from the storage unit (14), based on the feature amount of the observation data from which the designated recognition result was obtained, other observation data whose recognition results may have been erroneous; and
an update unit (11c) that, when permission to update the recognition model is received after the observation data extracted by the extraction unit (11b) has been output, updates a parameter that the recognition model uses to determine the state of the observation target.

Reference Signs List
1 ... State detection device
2 ... Video camera
3 ... Display device
4 ... Input device
11 ... Control unit
11a ... Recognition unit
11b ... Extraction unit
11c ... Update unit
11d ... Instruction unit
11e ... Display control unit
12 ... Input unit
13 ... Output unit
14 ... Storage unit
15 ... Display output unit
16 ... Input reception unit
20 ... Variational autoencoder (VAE)
21 ... Encoder
22 ... Decoder
23 ... Feature amount calculation unit
24 ... Distance calculation unit
25 ... Distribution storage unit
26 ... Determination unit

Claims

A state detection device comprising:
an input unit to which observation data of an observation target is input;
a storage unit that stores, in association with each other, the observation data input to the input unit and a recognition result of the state of the observation target recognized, based on a feature amount of the observation data, by a recognition model generated by machine learning;
an extraction unit that, when a recognition result is designated as an erroneous recognition, extracts from the storage unit, based on the feature amount of the observation data from which the designated recognition result was obtained, other observation data whose recognition results may have been erroneous; and
an update unit that, when permission to update the recognition model is received after the observation data extracted by the extraction unit has been output, updates a parameter that the recognition model uses to determine the state of the observation target.

The state detection device according to claim 1, further comprising an instruction unit that, when a designation of observation data whose recognition result is not an erroneous recognition is received after the observation data extracted by the extraction unit has been output, instructs the extraction unit to narrow down, based on the feature amount of the designated observation data, the previously extracted other observation data whose recognition results may have been erroneous.

The state detection device according to claim 1 or 2, wherein
the recognition model includes a variational autoencoder that, through machine learning using learning data including observation data of the observation target in a normal state, acquires as a reference distribution the distribution of latent variables extracted from the observation data in the normal state,
the variational autoencoder acquires, as the feature amount, the distribution of latent variables of the observation data input to the input unit as a determination distribution, and
the recognition model recognizes the state of the observation target based on the distance between the determination distribution and the reference distribution.

4. The state detection device according to claim 3, wherein the distance between said distribution for judgment and said reference distribution is a Mahalanobis distance.
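
For a reference distribution with mean mu and covariance matrix S, the Mahalanobis distance of a point x is sqrt((x - mu)^T S^-1 (x - mu)). The Python sketch below evaluates this under two simplifying assumptions that the claim does not state: the covariance is diagonal, and the distribution for judgment is represented by its mean. The function name, the numeric values, and the decision threshold are illustrative only.

```python
import math


def mahalanobis_diag(point, ref_mean, ref_var):
    """Mahalanobis distance of `point` from a reference distribution whose
    covariance is diagonal with per-dimension variances `ref_var`."""
    return math.sqrt(sum(
        (x - m) ** 2 / v for x, m, v in zip(point, ref_mean, ref_var)
    ))


# Reference distribution learned from normal-state data (illustrative values).
ref_mean = (0.0, 0.0)
ref_var = (1.0, 4.0)

# Mean of the distribution for judgment obtained for a new observation.
judgment_mean = (3.0, 8.0)

distance = mahalanobis_diag(judgment_mean, ref_mean, ref_var)
is_abnormal = distance > 3.0  # hypothetical decision threshold
```

With these values the squared terms are 9/1 and 64/4, so the distance is 5.0 and the observation would be judged abnormal under the assumed threshold.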

5. The state detection device according to any one of claims 1 to 4, wherein the observation data is a frame image of the observation target.

6. The state detection device according to any one of claims 1 to 4, wherein
the observation data is a difference image between a first average image of n temporally consecutive frame images and a second average image of m temporally consecutive frame images,
n is greater than m, and
m is 1 or more.
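
A small Python sketch may help picture the difference image of this claim. It assumes grayscale frames held as nested lists, that both averaging windows end at the most recent frame, and that the absolute pixel-wise difference is used; none of these details are fixed by the claim, and the function names are hypothetical.

```python
def average_image(frames):
    """Pixel-wise mean of equally sized grayscale frames (lists of rows)."""
    count = len(frames)
    return [
        [sum(pixels) / count for pixels in zip(*rows)]
        for rows in zip(*frames)
    ]


def difference_image(frames, n, m):
    """Absolute difference between the mean of the last n frames and the
    mean of the last m frames, with n > m >= 1."""
    if not (n > m >= 1) or len(frames) < n:
        raise ValueError("need n > m >= 1 and at least n frames")
    long_avg = average_image(frames[-n:])
    short_avg = average_image(frames[-m:])
    return [
        [abs(a - b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(long_avg, short_avg)
    ]


# Four 1x2 frames; the second pixel changes abruptly in the latest frame.
frames = [[[10, 10]], [[10, 10]], [[10, 10]], [[10, 50]]]
diff = difference_image(frames, n=4, m=1)
```

In this example the unchanged pixel yields a difference of 0, while the pixel that jumps in the latest frame yields 30, because the long-window average (20) lags behind the short-window value (50).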

7. The state detection device according to any one of claims 1 to 6, further comprising a display control unit that causes a display to show the observation data extracted by the extraction unit.

8. The state detection device according to any one of claims 1 to 7, wherein the recognition model recognizes whether the state of the observation target is normal or abnormal.

9. A state detection method comprising:
a recognition result storing step of storing, in a storage unit in association with each other, observation data of an observation target input to an input unit and a recognition result of the state of the observation target, the recognition result being obtained by a recognition model generated by machine learning on the basis of a feature amount of the observation data;
an extraction step of extracting from the storage unit, when a recognition result is designated as an erroneous recognition, other observation data whose recognition result may be an erroneous recognition, on the basis of the feature amount of the observation data for which the designated recognition result was obtained; and
an update step of updating a parameter that the recognition model uses to determine the state of the observation target, when permission to update the recognition model is received after the observation data extracted in the extraction step has been output.

10. A state detection program causing a computer to execute:
a recognition result storing step of storing, in a storage unit in association with each other, observation data of an observation target input to an input unit and a recognition result of the state of the observation target, the recognition result being obtained by a recognition model generated by machine learning on the basis of a feature amount of the observation data;
an extraction step of extracting from the storage unit, when a recognition result is designated as an erroneous recognition, other observation data whose recognition result may be an erroneous recognition, on the basis of the feature amount of the observation data for which the designated recognition result was obtained; and
an update step of updating a parameter that the recognition model uses to determine the state of the observation target, when permission to update the recognition model is received after the observation data extracted in the extraction step has been output.