JP2022158518A - Learning data selection program, learning data selection method, and information processing device

Info

Publication number: JP2022158518A
Application number: JP2021063486A
Authority: JP
Inventors: 英司長谷川; Eiji Hasegawa
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2021-04-02
Filing date: 2021-04-02
Publication date: 2022-10-17

Abstract

To select appropriate learning data to be used in machine learning. SOLUTION: An information processing device 101 calculates a feature quantity representing a feature of a frame image included in a moving image of an abnormality detection object. The information processing device 101 calculates a period of the moving image based on a temporal change of the calculated feature quantity of the frame image. The information processing device 101 calculates, for each of a plurality of sections into which the moving image is divided according to the calculated period, a difference regarding the feature quantity of the frame image between each section and another section. The information processing device 101 identifies, based on the difference calculated for each section, a section including a frame image in an abnormal state among the plurality of sections. The information processing device 101 determines frame images corresponding to other sections, which are different from the identified section among the plurality of sections, as learning data to be used for learning of a model M that detects an abnormality of the abnormality detection object. SELECTED DRAWING: Figure 1

Description

The present invention relates to a learning data selection program, a learning data selection method, and an information processing apparatus.

In recent years, anomaly detection technology based on machine learning using images has increasingly been introduced to monitor for abnormalities in automated production lines in factories. Anomaly detection using images often employs so-called semi-supervised learning (for example, an AutoEncoder), in which only images of a normal state (normal images) are used as learning data.

As prior art, for example, there is a technique that determines whether a package is abnormal by analyzing a still image or moving image output from a pre-inspection camera that photographs packages carried by a pre-inspection conveyor. There is also a technique that images folded signatures of printed matter, obtains the autocorrelation function of each image captured at a fixed cycle within the imaging area while shifting the imaging position, and uses the image at maximum correlation, based on the obtained autocorrelation function, as a reference image for collation-error inspection.

Japanese Laid-open Patent Publication No. 2020-46270
Japanese Laid-open Patent Publication No. 11-53553

However, with the conventional techniques, it is difficult to select appropriate learning data for machine learning from data in which images of a normal state and images of an abnormal state are mixed.

In one aspect, an object of the present invention is to select appropriate learning data to be used for machine learning.

In one embodiment, a learning data selection program is provided that calculates a feature amount representing a feature of a frame image included in a moving image of an anomaly detection target; calculates a period of the moving image based on a temporal change of the calculated feature amount of the frame image; calculates, for each of a plurality of sections into which the moving image is divided according to the calculated period, a difference regarding the feature amount of the frame image between that section and another section; identifies, based on the difference calculated for each section, a section including a frame image in an abnormal state among the plurality of sections; and determines frame images corresponding to the other sections, which differ from the identified section among the plurality of sections, as learning data to be used for learning of a model that detects an abnormality of the anomaly detection target.

According to one aspect of the present invention, appropriate learning data to be used for machine learning can be selected.

FIG. 1 is an explanatory diagram showing an example of a learning data selection method according to an embodiment.
FIG. 2 is an explanatory diagram showing a system configuration example of the information processing system 200.
FIG. 3 is a block diagram showing a hardware configuration example of the learning data selection device 201.
FIG. 4 is an explanatory diagram showing a specific example of an application scene for detecting an abnormality.
FIG. 5 is an explanatory diagram showing an example of the contents stored in the frame image DB 220.
FIG. 6 is a block diagram showing a functional configuration example of the learning data selection device 201.
FIG. 7 is an explanatory diagram showing a learning example of an encoder.
FIG. 8 is an explanatory diagram showing an example of the contents stored in the feature amount signal table 230.
FIG. 9 is an explanatory diagram showing an example of updating the feature amount signal table 230.
FIG. 10 is an explanatory diagram showing an example of processing for calculating the period T of the moving image Vd.
FIG. 11 is an explanatory diagram showing an example of section division processing.
FIG. 12 is an explanatory diagram showing an example of determining learning data.
FIG. 13 is a flowchart (part 1) showing an example of the data selection processing procedure of the learning data selection device 201.
FIG. 14 is a flowchart (part 2) showing an example of the data selection processing procedure of the learning data selection device 201.
FIG. 15 is an explanatory diagram showing an example of sorting learning data.

Embodiments of a learning data selection program, a learning data selection method, and an information processing apparatus according to the present invention will be described in detail below with reference to the drawings.

(Embodiment)
FIG. 1 is an explanatory diagram showing an example of a learning data selection method according to an embodiment. In FIG. 1, the information processing apparatus 101 is a computer that determines the learning data used for learning of the model M. The model M is a learning model for detecting an abnormality of an anomaly detection target and is generated, for example, by machine learning using a neural network.

The anomaly detection target is, for example, an automated production line in a factory or a robot that operates periodically. An automated production line is, for example, an assembly process based on flow production, built to mass-produce identical or similar products. On an automated production line, products flow past periodically. Examples of products include drinking water, electronic equipment, and automobile parts.

Here, there are cases where a camera continuously photographs the same position on an automated production line and the captured images are used to detect abnormalities in products on the line. An abnormal state is, for example, a state in which a product's label has peeled off or a product has fallen over. In anomaly detection using images, for example, a method is used in which a group of images containing only normal-state images (normal images) is learned by semi-supervised learning, and the difference between a captured image and a reconstructed image, obtained by reconstructing that captured image with the learning model, is evaluated. With this method, an abnormal state is not reconstructed, so the abnormal portion appears as a difference.
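The evaluation of the difference between a captured image and its reconstruction can be sketched as follows. This is a minimal illustration, not the method of the embodiment: the mean squared error used as the score, the function name `reconstruction_error`, and the representation of a grayscale image as a list of pixel rows are all assumptions, and the reconstructed image is given directly instead of being produced by a trained model.

```python
def reconstruction_error(image, reconstructed):
    """Mean squared pixel difference between an image and its reconstruction.

    Both arguments are grayscale images given as lists of rows (assumed format).
    """
    total, count = 0.0, 0
    for row_a, row_b in zip(image, reconstructed):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    return total / count


# Hypothetical 2x2 captured image with one bright (abnormal) pixel, and a
# reconstruction in which that pixel was not reproduced.
captured = [[10, 10], [10, 200]]
reconstructed = [[10, 10], [10, 12]]
score = reconstruction_error(captured, reconstructed)
```

A model trained only on normal images tends to reconstruct the normal appearance, so a large score suggests that the captured image contains something the model has not learned.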

To detect anomalies accurately with semi-supervised learning, only normal-state images must be extracted from the captured images as learning data. However, manually extracting only the normal-state images from data in which normal-state and abnormal-state images are mixed takes considerable time and effort.

On the other hand, if data in which normal-state images and abnormal-state images are mixed is simply used as learning data for semi-supervised learning, abnormalities are also reconstructed, for example, and the anomaly detection accuracy decreases. It is therefore desirable to be able to generate a learning model capable of detecting abnormal-state images without manually extracting only the normal-state images.

Here, a technique called Isolation Forest is known as a conventional technique for narrowing down normal data, without supervision, from data in which normal-state images and abnormal-state images are mixed. Isolation Forest (prior art 1) regards portions of the distribution in the feature space that lie apart from the rest as abnormal.

However, prior art 1 has the problem that abnormal data is difficult to separate when normal data and abnormal data are close to each other in the feature space. Prior art 1 treats portions that are outliers in the distribution as abnormal value candidates, and cannot remove abnormalities unless the difference between normal data and abnormal data in the latent space is large.

For example, in a moving image of an automated production line, the difference between two pieces of normal data captured at different timings may be larger than the difference between normal data and abnormal data captured at the same timing. In such a case, prior art 1 may judge abnormal data to be normal and normal data to be abnormal.

Another conventional technique for narrowing down normal data, without supervision, from data in which normal-state images and abnormal-state images are mixed is called Robust AutoEncoder. Robust AutoEncoder (prior art 2) decomposes the data into a background part with little movement and a changing part, and learns so that the reconstruction error of the background part decreases by removing the changing part.

However, although prior art 2 can regard an image whose features differ greatly from the common part as abnormal and remove it from the learning data, it has the problem that an image whose features differ only slightly is difficult to regard as data with a large noise component (abnormal data). That is, prior art 2 also cannot remove abnormal data whose difference from normal data is smaller than the differences among normal data.

Thus, with the conventional techniques, it is difficult to remove abnormal-state images and learn a model from data in which normal-state images and abnormal-state images are mixed and the difference in features between them is not large. Unless abnormal-state images can be removed before learning, a learning model capable of accurately detecting abnormal-state images cannot be generated.

Therefore, the present embodiment describes a learning data selection method that selects appropriate learning data for machine learning from a periodic moving image, even when normal-state images and abnormal-state images are mixed without being sorted and the difference in features between the two is small. A processing example of the information processing apparatus 101 is described below.

(1) The information processing apparatus 101 calculates a feature amount representing a feature of each frame image included in a moving image of an anomaly detection target. Here, the moving image of the anomaly detection target is a periodic moving image in which similar images repeat periodically. The feature amount of a frame image is information representing the features of the frame image and may be, for example, a single value or a feature vector containing a plurality of components (elements). The values to be extracted as the feature amount can be set arbitrarily.
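As a rough illustration of step (1), the sketch below reduces each frame image to a single value and collects the values in time order. The mean pixel intensity is only a stand-in chosen for brevity, since the text leaves the choice of feature amount open; the names `frame_feature` and `feature_signal` and the list-of-rows image format are assumptions.

```python
def frame_feature(frame):
    """Return the mean pixel intensity of one grayscale frame (list of rows)."""
    pixels = [p for row in frame for p in row]
    return sum(pixels) / len(pixels)


def feature_signal(frames):
    """Map frames, given in time order, to a list of scalar feature amounts."""
    return [frame_feature(f) for f in frames]


# Three hypothetical 2x2 frames: empty scene, product in view, empty scene.
frames = [
    [[0, 0], [0, 0]],
    [[10, 20], [30, 40]],
    [[0, 0], [0, 0]],
]
signal = feature_signal(frames)
```

A feature vector per frame would work the same way, with each frame mapped to a list of components instead of one number.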

In the example of FIG. 1, the anomaly detection target is "a product on an automated production line in a factory (for example, drinking water in a PET bottle)", and the moving image of the anomaly detection target is "moving image 110". Because products flow past periodically on an automated production line, similar images repeat periodically in the moving image 110 of the line. In this case, the information processing apparatus 101 calculates, for example, feature amounts representing the features of the frame images (for example, frame images 111 to 113) included in the moving image 110.

(2) The information processing apparatus 101 calculates the period of the moving image based on the temporal change of the calculated feature amounts of the frame images. Here, when similar images repeat periodically in a moving image of an anomaly detection target, the temporal change of the feature amounts of the frame images included in the moving image can be said to be periodic.

For this reason, the information processing apparatus 101 determines the periodicity from the temporal change of the feature amounts of the frame images included in the moving image. Specifically, for example, the information processing apparatus 101 calculates the period of the moving image based on the result of frequency analysis of a feature amount signal (signal waveform) 120 that indicates the temporal change of the calculated feature amounts of the frame images.

In the example of FIG. 1, the information processing apparatus 101 calculates the period T1 of the moving image 110 from, for example, the frequency obtained by frequency analysis of the feature amount signal 120. The feature amount signal 120 indicates the temporal change of the feature amounts representing the features of the frame images (for example, frame images 111 to 113) included in the moving image 110 (vertical axis: feature amount, horizontal axis: frame number). A frame number is an identifier assigned to each frame image in chronological order.
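The frequency analysis in step (2) can be sketched as follows, under the assumption that the period is taken from the strongest non-zero frequency bin of a discrete Fourier transform. The text only states that the period is obtained from a frequency found by frequency analysis, so the naive DFT, the peak-picking rule, and the name `estimate_period` are illustrative choices.

```python
import cmath
import math


def estimate_period(signal):
    """Return the dominant period, in frames, of a 1-D feature amount signal."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the constant (DC) component
    best_k, best_power = 1, -1.0
    # Naive DFT over the positive frequency bins 1 .. n // 2.
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k  # bin k corresponds to a period of n / k frames


# Synthetic feature amount signal: 200 frames repeating every 25 frames.
signal = [math.sin(2 * math.pi * t / 25) for t in range(200)]
period = estimate_period(signal)
```

For long videos a fast Fourier transform would normally replace the quadratic-time loop; the naive form is used here only to keep the sketch dependency-free.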

(3) The information processing apparatus 101 calculates, for each of a plurality of sections into which the moving image is divided according to the calculated period, a difference regarding the feature amounts of the frame images between that section and another section. The other section is, for example, a section adjacent to the section in question. A section is designated, for example, by frame numbers.

The difference regarding the feature amounts of the frame images is the discrepancy obtained by comparing the feature amounts of the frame images of two sections. An abnormal state of the anomaly detection target often remains in view for a certain time within a section. For this reason, the section containing frame images in an abnormal state tends to differ in feature amount from other sections across the whole section.

In the example of FIG. 1, the information processing apparatus 101 calculates, for each of the plurality of sections (for example, sections 110-1 to 110-3) into which the moving image 110 is divided according to the period T1, the difference regarding the feature amounts of the frame images between that section and other sections. For example, for the section 110-2, the information processing apparatus 101 calculates the difference regarding the feature amounts of the frame images between the section 110-2 and the other sections 110-1 and 110-3.

The difference regarding the feature amounts of the frame images between the section 110-2 and the other sections 110-1 and 110-3 may be, for example, the maximum of the difference between the sections 110-1 and 110-2 and the difference between the sections 110-2 and 110-3, or may be the average of the two differences. Note that the feature amount signal 130 indicates the temporal change of the feature amounts representing the features of the frame images included in the moving image 110 (for example, an enlarged view of a portion of the feature amount signal 120).
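The per-section comparison in step (3) might look like the sketch below. How two sections are compared is not specified at this point in the text, so the mean absolute difference between feature amounts at the same position within each period is an assumption, as are the function names. The `reduce` argument selects between the maximum and the average of the differences to the adjacent sections, the two options mentioned above.

```python
from statistics import mean


def split_sections(features, period):
    """Split a feature sequence into consecutive full sections of `period` frames."""
    count = len(features) // period
    return [features[i * period:(i + 1) * period] for i in range(count)]


def section_distance(a, b):
    """Mean absolute difference between feature amounts at the same phase."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def neighbor_differences(sections, reduce=max):
    """Score each section against its adjacent sections (max or average)."""
    scores = []
    for i, sec in enumerate(sections):
        neighbors = [sections[j] for j in (i - 1, i + 1)
                     if 0 <= j < len(sections)]
        scores.append(reduce(section_distance(sec, nb) for nb in neighbors))
    return scores


# Five periods of four frames each; the third period deviates at one phase.
features = [0, 1, 2, 1,  0, 1, 2, 1,  0, 1, 5, 1,  0, 1, 2, 1,  0, 1, 2, 1]
sections = split_sections(features, 4)
scores = neighbor_differences(sections, reduce=mean)
```

In this toy data the deviating section scores 0.75 while its neighbors score 0.375 when the average is used. With `reduce=max` the neighbors would also score 0.75, so the choice of reduction interacts with the threshold used afterwards.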

(4) The information processing apparatus 101 identifies, based on the difference calculated for each section, a section including a frame image in an abnormal state among the plurality of sections. Specifically, for example, the information processing apparatus 101 may identify a section whose calculated difference is equal to or greater than a threshold as a section including a frame image in an abnormal state. The threshold can be set arbitrarily.

In the example of FIG. 1, assume that the section 110-2 has been identified as a section including a frame image in an abnormal state.

(5) The information processing apparatus 101 determines the frame images corresponding to the other sections, which differ from the identified section among the plurality of sections, as learning data to be used for learning of the model M, which detects an abnormality of the anomaly detection target. The frame images corresponding to a section may be, for example, frame images taken at fixed time intervals within the section, or may be all frame images within the section.

In the example of FIG. 1, among the frame images included in the moving image 110, the frame images corresponding to the sections other than the section 110-2 (for example, the sections 110-1 and 110-3) are determined as the learning data to be used for learning of the model M.
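Steps (4) and (5) can be sketched together as follows: sections whose difference is at or above a threshold are skipped, and frame numbers from the remaining sections become the learning data. The function name, the 1-based frame numbering, and the `step` parameter (which models taking frames at fixed intervals instead of every frame) are assumptions made for illustration.

```python
def select_training_frames(num_frames, period, scores, threshold, step=1):
    """Return frame numbers (1-based) from sections scoring below `threshold`.

    scores[i] is the difference calculated for the i-th section of length
    `period`; sections at or above the threshold are treated as containing
    frame images in an abnormal state and are excluded.
    """
    selected = []
    for idx, score in enumerate(scores):
        if score >= threshold:
            continue  # section identified as abnormal: exclude all its frames
        start = idx * period + 1
        end = min(start + period - 1, num_frames)
        selected.extend(range(start, end + 1, step))
    return selected


# Three sections of four frames; the middle section exceeds the threshold.
frames = select_training_frames(num_frames=12, period=4,
                                scores=[0.1, 0.9, 0.2], threshold=0.5, step=2)
```

Here `frames` holds every second frame of the first and third sections, which mirrors the FIG. 1 example in which the section 110-2 is excluded and the sections 110-1 and 110-3 are kept.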

In this way, the information processing apparatus 101 can use the periodicity of the moving image to compare the feature amounts of frame images period by period (section by section), and can remove from the learning data, as possibly abnormal, the frame images included in a section in which the periodicity is disturbed. This makes it possible to select appropriate learning data for machine learning from a moving image of an anomaly detection target.

For example, even when normal-state images and abnormal-state images are mixed without being sorted in a periodic moving image and the difference in features between the two is small, appropriate learning data can be selected for generating a model M capable of accurately detecting abnormalities. Moreover, preparing the learning data no longer requires manually sorting normal-state images from abnormal-state images, which reduces labor and man-hours.

In the example of FIG. 1, the information processing apparatus 101 can remove from the learning data, as possibly abnormal, the frame images included in the section 110-2, in which the periodicity is disturbed. In addition, by learning the model M using the learning data 140, obtained by removing from the moving image 110 the frame images included in sections with disturbed periodicity (for example, the section 110-2), a model M capable of accurately detecting abnormalities on the automated production line can be generated.

(System configuration example of information processing system 200)
Next, a system configuration example of an information processing system 200 that includes the information processing apparatus 101 shown in FIG. 1 will be described. The following description takes as an example a case where the information processing apparatus 101 shown in FIG. 1 is applied to the learning data selection device 201 in the information processing system 200. The information processing system 200 is applied, for example, to a computer system that detects anomalies using moving images on an automated production line in a factory.

FIG. 2 is an explanatory diagram showing a system configuration example of the information processing system 200. In FIG. 2, the information processing system 200 includes a learning data selection device 201 and a client device 202. In the information processing system 200, the learning data selection device 201 and the client device 202 are connected via a wired or wireless network 210. The network 210 is, for example, the Internet, a LAN, or a WAN (Wide Area Network).

Here, the learning data selection device 201 has a frame image DB (Database) 220, a feature amount signal table 230, and a learning data DB 240, and determines the learning data used for learning of the model M. The model M is a learning model for detecting an abnormality of the anomaly detection target. The learning data selection device 201 is, for example, a server.

The frame image DB 220 stores the frame images included in a moving image of the anomaly detection target. The feature amount signal table 230 stores the feature amounts (feature vectors) of the frame images included in the moving image of the anomaly detection target. The learning data DB 240 stores the learning data used for learning of the model M.

Note that the contents stored in the frame image DB 220 and the feature amount signal table 230 will be described later with reference to FIGS. 5 and 8.

The client device 202 is a computer used by a user of the information processing system 200. The user is, for example, an administrator who manages an automated production line in a factory. The client device 202 is, for example, a PC (Personal Computer) or a tablet PC.

Although the learning data selection device 201 and the client device 202 are provided as separate devices here, the configuration is not limited to this. For example, the learning data selection device 201 may be implemented by the client device 202. The information processing system 200 may also include a plurality of client devices 202.

(Hardware configuration example of learning data selection device 201)
FIG. 3 is a block diagram showing a hardware configuration example of the learning data selection device 201. In FIG. 3, the learning data selection device 201 includes a CPU (Central Processing Unit) 301, a memory 302, a disk drive 303, a disk 304, a communication I/F (Interface) 305, a portable recording medium I/F 306, and a portable recording medium 307. These components are connected to one another by a bus 300.

Here, the CPU 301 controls the learning data selection device 201 as a whole. The CPU 301 may have a plurality of cores. The memory 302 includes, for example, a ROM (Read Only Memory), a RAM (Random Access Memory), and a flash ROM. Specifically, for example, the flash ROM stores the OS (Operating System) program, the ROM stores application programs, and the RAM is used as a work area for the CPU 301. A program stored in the memory 302 is loaded into the CPU 301 and causes the CPU 301 to execute the coded processing.

The disk drive 303 controls reading and writing of data on the disk 304 under the control of the CPU 301. The disk 304 stores data written under the control of the disk drive 303. Examples of the disk 304 include a magnetic disk and an optical disk.

The communication I/F 305 is connected to the network 210 through a communication line, and is connected to an external computer (for example, the client device 202 shown in FIG. 2) via the network 210. The communication I/F 305 serves as the interface between the network 210 and the inside of the device, and controls the input and output of data to and from the external computer. A modem or a LAN adapter, for example, can be used as the communication I/F 305.

The portable recording medium I/F 306 controls reading and writing of data on the portable recording medium 307 under the control of the CPU 301. The portable recording medium 307 stores data written under the control of the portable recording medium I/F 306. Examples of the portable recording medium 307 include a CD (Compact Disc)-ROM, a DVD (Digital Versatile Disk), and a USB (Universal Serial Bus) memory.

In addition to the components described above, the learning data selection device 201 may have, for example, a GPU (Graphics Processing Unit), an input device, and a display. The client device 202 shown in FIG. 2 can also be implemented with a hardware configuration similar to that of the learning data selection device 201. However, the client device 202 has, for example, an input device and a display in addition to the components described above.

(Specific example of application scene)
Next, a specific example of an application scene in which an abnormality is detected using a moving image of an anomaly detection target will be described.

図４は、異常を検知する適用シーンの具体例を示す説明図である。図４において、カメラ４００は、動画を撮影する機能を有する撮像装置である。カメラ４００は、例えば、工場内に設置され、異常検知対象である自動生産ラインの同じ位置を撮影する。これにより、周期的にカメラ４００の前を通る製品が撮影される。 FIG. 4 is an explanatory diagram showing a specific example of an application scene for detecting an abnormality. In FIG. 4, a camera 400 is an imaging device having a function of shooting moving images. The camera 400 is installed in a factory, for example, and photographs the same position of the automatic production line that is the object of abnormality detection. As a result, products passing in front of the camera 400 are periodically photographed.

動画４１０は、自動生産ラインの製品を撮影した動画像であり、例えば、フレーム画像４１１～４１４を含む。なお、カメラ４００は、通信機能を有していてもよい。この場合、カメラ４００は、図２に示したネットワーク２１０を介して、学習データ選択装置２０１およびクライアント装置２０２と接続可能であってもよい。 A moving image 410 is a moving image of a product on an automatic production line, and includes frame images 411 to 414, for example. Camera 400 may have a communication function. In this case, camera 400 may be connectable to learning data selection device 201 and client device 202 via network 210 shown in FIG.

(Stored contents of the frame image DB 220)
Next, the stored contents of the frame image DB 220 of the learning data selection device 201 will be described with reference to FIG. 5. The frame image DB 220, the feature amount signal table 230, and the learning data DB 240 are realized by, for example, storage devices such as the memory 302 and the disk 304 shown in FIG. 3.

FIG. 5 is an explanatory diagram showing an example of the stored contents of the frame image DB 220. In FIG. 5, the frame image DB 220 has fields for a frame number and a frame image. Setting information in each field stores time-series frame image data (for example, time-series frame image data 500-1 to 500-3) as records.

Here, the frame number is an identifier that identifies a frame image included in the moving image. Frame numbers are assigned to frame images in chronological order. The frame image is the image data of a frame included in the moving image. For example, the time-series frame image data 500-1 indicates the frame image with frame number "1".

(Example of the functional configuration of the learning data selection device 201)
FIG. 6 is a block diagram showing an example of the functional configuration of the learning data selection device 201. In FIG. 6, the learning data selection device 201 includes an acquisition unit 601, a first calculation unit 602, a second calculation unit 603, a specifying unit 604, a determination unit 605, and an output unit 606. The acquisition unit 601 through the output unit 606 function as a control unit. Specifically, for example, their functions are realized by causing the CPU 301 to execute a program stored in a storage device such as the memory 302, the disk 304, or the portable recording medium 307 shown in FIG. 3, or by the communication I/F 305. The processing results of each functional unit are stored, for example, in a storage device such as the memory 302 or the disk 304.

The acquisition unit 601 acquires a moving image Vd in which the abnormality detection target is captured. The moving image Vd is, for example, a periodic moving image of the abnormality detection target shot over a predetermined period. A periodic moving image is one in which similar images repeat periodically. The moving image 410 shown in FIG. 4 is an example of the moving image Vd.

The predetermined period is chosen, for example, according to the abnormality detection target. For example, if the abnormality detection target is an automatic production line on which products arrive at intervals of several seconds, the predetermined period is on the order of several minutes. Specifically, for example, the acquisition unit 601 acquires the moving image Vd by receiving it from the client device 202 shown in FIG. 2.

The acquisition unit 601 may also accept a designation of the moving image Vd from the client device 202 and acquire the designated moving image Vd from a moving image DB (not shown). The acquisition unit 601 may also acquire a moving image Vd input through a user operation on an input device (not shown). The acquisition unit 601 may also acquire, from the camera 400 shown in FIG. 4, a moving image Vd shot by the camera 400.

The first calculation unit 602 calculates a feature amount for each frame image included in the acquired moving image Vd. The feature amount of a frame image is, for example, a feature amount vector representing the features of the frame image. A feature amount vector includes a plurality of components (elements), which together form a combination of values representing the features of the frame image. Any values that can be extracted by existing feature extraction techniques (brightness, contrast, product-likeness, and so on) may be used as the components.

Here, the first calculation unit 602 includes, for example, an encoder learning unit 611 and a vector calculation unit 612.

The encoder learning unit 611 trains an encoder ec (converter) that converts a frame image into a feature amount vector. Specifically, for example, the encoder learning unit 611 extracts frame images from the moving image Vd at fixed time intervals. The interval can be set arbitrarily and is set, for example, so that frame images are extracted from the moving image Vd every few frames to every few tens of frames.

Next, the encoder learning unit 611 assigns frame numbers to the extracted frame images in chronological order. The numbered frame images are stored, for example, in the frame image DB 220 shown in FIG. 5. The encoder learning unit 611 then trains the encoder ec using the extracted frame images as learning data.
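The extraction and numbering step can be illustrated with a minimal sketch. The names `FrameRecord`, `sample_frames`, and `step` are hypothetical and do not appear in the specification, and the frames are represented by placeholder strings rather than decoded image data.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class FrameRecord:
    """One record of a frame image store (hypothetical layout)."""
    frame_number: int  # 1-based, assigned in chronological order
    image: object      # image data of the extracted frame


def sample_frames(video_frames: Sequence[object], step: int) -> List[FrameRecord]:
    """Keep every `step`-th frame and number the kept frames from 1."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    kept = video_frames[::step]
    return [FrameRecord(n, img) for n, img in enumerate(kept, start=1)]


# Stand-in for a decoded moving image: 100 frames, sampled every 10 frames.
records = sample_frames([f"frame{i}" for i in range(100)], step=10)
```

Numbering the sampled frames, rather than reusing their positions in the original moving image, matches the description that frame numbers are assigned to the extracted frame images in chronological order.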

Here, an example of training the encoder ec will be described with reference to FIG. 7. The case described is one in which the encoder ec (a neural network) is trained as part of an autoencoder (AutoEncoder).

FIG. 7 is an explanatory diagram showing an example of training the encoder. Using an autoencoder, the encoder learning unit 611 takes all the frame images stored in the frame image DB 220 as input and trains the encoder ec (neural network) so that reconstructed images close to the inputs are output.

Specifically, for example, the encoder learning unit 611 trains the encoder ec (neural network) so that the error between the frame images (for example, frame images 701 to 703) and the reconstructed images (for example, reconstructed images 711 to 713) becomes small.

The latent space of an autoencoder is a dimensionally compressed representation that preserves the overall features of the image. Therefore, using the feature amount vector in the feature space (latent space) compressed by the encoder ec of the autoencoder yields a feature amount that reflects the overall features of the image.

Although the encoder ec is trained here using the frame images included in the moving image Vd, the present invention is not limited to this. For example, the encoder learning unit 611 may acquire an already trained encoder ec. Specifically, for example, the encoder learning unit 611 may acquire, as the trained encoder ec, an encoder trained using frame images included in another moving image that captures the abnormality detection target from the same position at a different time. The trained encoder ec may be acquired, for example, through a user operation on an input device (not shown) or from another computer (for example, the client device 202).

The vector calculation unit 612 calculates the feature amount vector of each frame image included in the moving image Vd using the trained encoder ec. Specifically, for example, the vector calculation unit 612 inputs each frame image stored in the frame image DB 220 to the trained encoder ec and outputs the feature amounts at the intermediate layer as the feature amount vector.

The output feature amount vector of each frame image is stored, for example, in the feature amount signal table 230. The stored contents of the feature amount signal table 230 will now be described with reference to FIG. 8.

FIG. 8 is an explanatory diagram showing an example of the stored contents of the feature amount signal table 230. In FIG. 8, the feature amount signal table 230 has fields for a frame number, a feature amount vector, and a window number. Setting information in each field stores feature amount signal information (for example, feature amount signal information 800-1 to 800-3) as records.

Here, the frame number is an identifier that identifies a frame image extracted from the moving image Vd. The feature amount vector represents the features of the frame image and is calculated using the trained encoder ec. The window number is an identifier that identifies the section to which the frame image belongs. In the initial state, the window number is "- (Null)".

Returning to the description of FIG. 6, the second calculation unit 603 calculates the period T of the moving image Vd based on the temporal change in the calculated feature amounts of the frame images. Specifically, for example, the second calculation unit 603 calculates the period T based on the result of a frequency analysis of the signal waveform (principal component signal waveform) that represents the temporal change of the principal component of the calculated feature amount vectors.

Here, the second calculation unit 603 includes, for example, a principal component analysis unit 613 and a frequency analysis unit 614.

The principal component analysis unit 613 identifies the principal component of the calculated feature amount vectors of the frame images. Specifically, for example, the principal component analysis unit 613 refers to the feature amount signal table 230, arranges the calculated feature amount vectors in chronological order, performs principal component analysis on them, and identifies as the principal component the component with the greatest variation across the feature amount vectors.
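As an illustration of this step, the following sketch projects time-ordered feature amount vectors onto their first principal axis to obtain a one-dimensional principal component signal. It uses power iteration on the covariance matrix, which is one common way to find the direction of greatest variance; the specification does not state which algorithm is used, and the function name and the `iters` parameter are hypothetical.

```python
import math
from typing import List


def principal_component_signal(vectors: List[List[float]], iters: int = 200) -> List[float]:
    """Project time-ordered feature vectors onto their first principal axis."""
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(d)]
    centered = [[v[j] - mean[j] for j in range(d)] for v in vectors]
    # Covariance matrix (d x d) of the centered vectors.
    cov = [[sum(c[a] * c[b] for c in centered) / n for b in range(d)] for a in range(d)]
    # Power iteration: repeated multiplication converges to the axis of largest variance.
    axis = [1.0 / (j + 1) for j in range(d)]  # arbitrary non-zero starting direction
    for _ in range(iters):
        nxt = [sum(cov[a][b] * axis[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:  # all vectors identical: no variance to find
            break
        axis = [x / norm for x in nxt]
    # One scalar per frame, in the same chronological order as the input.
    return [sum(c[j] * axis[j] for j in range(d)) for c in centered]


# Five two-component vectors whose variation lies entirely in the first component.
signal = principal_component_signal([[float(t), 0.0] for t in range(5)])
```

The sign of the resulting signal depends on the starting direction, which does not affect the later frequency analysis or the squared differences between sections.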

The frequency analysis unit 614 performs a frequency analysis of the feature amount signal (principal component signal waveform) that represents the temporal change of the identified principal component. Specifically, for example, the frequency analysis unit 614 refers to the feature amount signal table 230 and identifies the principal component signal waveform of the feature amount vectors. The frequency analysis unit 614 then frequency-analyzes the identified principal component signal waveform and identifies the frequency with the greatest power. In this case, the second calculation unit 603 takes the identified frequency as the period T of the moving image Vd.

An example of the processing for calculating the period T of the moving image Vd will be described later with reference to FIG. 10.

Although the period T of the moving image Vd is calculated here based on the temporal change in the feature amounts of the frame images included in the moving image Vd, the present invention is not limited to this. For example, the second calculation unit 603 may acquire a period specified in advance as the period T of the moving image Vd. Specifically, for example, the second calculation unit 603 may acquire, as the period T of the moving image Vd, a period calculated based on the temporal change in the feature amounts of frame images included in another moving image that captures the abnormality detection target from the same position at a different time. The second calculation unit 603 may also acquire, as the period T of the moving image Vd, a period specified through a user operation on an input device (not shown) or by another computer (for example, the client device 202).

For each of a plurality of sections obtained by dividing the moving image Vd according to the calculated period T, the specifying unit 604 calculates a difference df in the feature amounts of the frame images between that section and other sections. Here, the difference df is, for example, a difference between principal component signal waveforms. However, the difference df may instead be expressed as, for example, the sum or the average of the differences between the signal waveforms of the individual components of the feature amount vector.

Specifically, for example, the specifying unit 604 refers to the feature amount signal table 230 and, using the calculated period T as the window width, divides the group of frame images (group of feature amount vectors) extracted from the moving image Vd into a plurality of sections (windows). The window number identifying the section to which each frame image (feature amount vector) belongs is set, for example, in the feature amount signal table 230.

FIG. 9 is an explanatory diagram showing an example of updating the feature amount signal table 230. In FIG. 9, the window number field of each piece of feature amount signal information in the feature amount signal table 230 is set to the window number identifying the section to which the frame image (feature amount vector) of the corresponding frame number belongs.

For example, the window number "1" is set in each of the feature amount signal information 800-1 to 800-3.
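The assignment of window numbers can be sketched as follows. The function name is hypothetical, and the sketch assumes 1-based frame numbers and window numbers and a window width expressed as a number of extracted frames.

```python
from typing import List


def assign_window_numbers(num_frames: int, window_width: int) -> List[int]:
    """Return the 1-based window number for frames numbered 1..num_frames."""
    if window_width < 1:
        raise ValueError("window_width must be a positive integer")
    return [(frame_no - 1) // window_width + 1 for frame_no in range(1, num_frames + 1)]


# With a window width of 3 frames, frames 1 to 3 fall in window 1, and so on.
window_numbers = assign_window_numbers(num_frames=9, window_width=3)
```

A trailing section shorter than the window width still receives its own window number in this sketch; whether such a partial section should be compared with full sections is left open here.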

An example of the section division processing will be described later with reference to FIG. 11.

The specifying unit 604 refers to the feature amount signal table 230 and calculates the difference df by comparing the feature amount vectors (for example, the principal component signal waveform) of each section (window) with those of its k nearest sections. For example, let the principal component signal values of the feature amount vectors of the frame images included in a section (window) be "v(t) = [x(t), x(t+1), x(t+2), ...]".

In this case, the specifying unit 604 calculates, for example, "|v(t) - v(t')|^2" for every t' ≠ t. The specifying unit 604 then takes "P(t) = min(|v(t) - v(t')|^2)" as the difference df (where k = 1). The difference df corresponds to a value (degree of abnormality) indicating the degree to which the section contains frame images in an abnormal state.

When k > 2, the specifying unit 604, for example, takes the k smallest values of "|v(t) - v(t')|^2" from the results computed between sections and uses "P(t) = avg(|v(t) - v(t1')|^2, |v(t) - v(t2')|^2, ...)" as the difference df. A calculation example of the difference df (degree of abnormality) using the k-nearest neighbor method will be described later.
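The calculation of P(t) can be sketched as below, assuming each section is already available as a list of principal component signal values of equal length. The function names and the sample values are hypothetical; with k = 1 the average of the single smallest distance equals the minimum, so one function covers both cases.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """|a - b|^2 for two windows of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def window_differences(windows: List[Sequence[float]], k: int = 1) -> List[float]:
    """For each window, average its k smallest squared distances to the other windows."""
    if not 1 <= k < len(windows):
        raise ValueError("k must satisfy 1 <= k < number of windows")
    scores = []
    for t, v in enumerate(windows):
        dists = sorted(squared_distance(v, w) for u, w in enumerate(windows) if u != t)
        scores.append(sum(dists[:k]) / k)
    return scores


# Three similar windows and one (index 2) whose middle value deviates strongly.
windows = [[0.0, 1.0, 0.0], [0.0, 1.1, 0.0], [0.0, 5.0, 0.0], [0.0, 0.9, 0.0]]
df = window_differences(windows, k=1)
```

Because a normal section has at least one close neighbor among the other periods, its score stays small, while a section that resembles no other section keeps a large score even against its nearest neighbor.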

The specifying unit 604 then identifies, based on the difference df calculated for each section, the sections that contain frame images in an abnormal state among the plurality of sections. Specifically, for example, the specifying unit 604 identifies a section whose calculated difference df is greater than or equal to a threshold Th as a section containing frame images in an abnormal state. The threshold Th can be set arbitrarily and is set, for example, according to the abnormality detection target.

In the following description, a section containing frame images in an abnormal state may be referred to as an "abnormal data section".

The specifying unit 604 may also identify, as abnormal data sections, the sections that remain after excluding a predetermined number of sections whose calculated difference df is relatively small. In other words, the specifying unit 604 may identify, for example, a predetermined number of sections in descending order of the difference df as abnormal data sections.

The predetermined number can be set arbitrarily and is set, for example, according to the number of learning data items needed to train the model M (the required number of samples). For example, suppose that one section is "60 frames" and the required number of samples is "600 frames". In this case, the specifying unit 604 identifies, as abnormal data sections, the sections that remain after excluding the top 10 sections with the smallest difference df.
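Both selection rules can be sketched as follows. The function names are hypothetical, window numbers are assumed to be 1-based, and the number of sections to keep is assumed to be the required number of samples divided by the frames per section, rounded up (600 / 60 = 10 in the example above).

```python
from typing import List, Set


def abnormal_by_threshold(df: List[float], th: float) -> Set[int]:
    """Window numbers (1-based) whose difference is at or above the threshold."""
    return {i + 1 for i, d in enumerate(df) if d >= th}


def abnormal_by_quota(df: List[float], frames_per_window: int, required_frames: int) -> Set[int]:
    """Keep the lowest-difference windows needed to reach the quota; flag the rest."""
    keep = -(-required_frames // frames_per_window)  # ceiling division
    order = sorted(range(len(df)), key=lambda i: df[i])  # ascending difference
    return {i + 1 for i in order[keep:]}


# Twelve windows with increasing differences; 600 frames at 60 per window keeps 10.
flagged = abnormal_by_quota([0.1 * i for i in range(12)], frames_per_window=60, required_frames=600)
```

The threshold rule needs a value of Th suited to the abnormality detection target, whereas the quota rule needs no threshold but always flags every section beyond the quota, whether or not it is actually abnormal.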

The determination unit 605 determines, as the learning data used to train the model M, the frame images corresponding to the sections other than the identified sections (abnormal data sections) among the plurality of sections. That is, among the frame images included in the moving image Vd, the determination unit 605 excludes the frame images corresponding to the abnormal data sections from the learning data used to train the model M. The model M is a learning model that detects abnormalities in the abnormality detection target.

Specifically, for example, the determination unit 605 refers to the frame image DB 220 and determines the frame images corresponding to the sections other than the abnormal data sections as learning data. The frame images determined as learning data are stored, for example, in the learning data DB 240 (see FIG. 2).

An example of determining the learning data will be described later with reference to FIG. 12.

The determination unit 605 may also determine as learning data, for example, all the frame images included in the moving image Vd other than those belonging to the abnormal data sections. That is, the determination unit 605 may also determine frame images that are not stored in the frame image DB 220 as learning data.

The output unit 606 outputs the determined learning data. Specifically, for example, the output unit 606 outputs the stored frame images as learning data used to train the model M that detects abnormalities in the abnormality detection target. Output forms of the output unit 606 include, for example, storage in a storage device such as the memory 302 or the disk 304 and transmission to another computer (for example, the client device 202) via the communication I/F 305.

The learning data selection device 201 may also train the model M using the determined learning data. The model M is generated, for example, by machine learning using a neural network. This makes it possible to generate a model M capable of accurately detecting abnormalities in the abnormality detection target (for example, an automatic production line).

The functional units (the acquisition unit 601 through the output unit 606) of the learning data selection device 201 described above may be realized, for example, by a plurality of computers (for example, the learning data selection device 201 and the client device 202) operating in cooperation.

(Example of processing for calculating the period T of the moving image Vd)
Next, an example of the processing for calculating the period T of the moving image Vd will be described with reference to FIG. 10.

FIG. 10 is an explanatory diagram showing an example of the processing for calculating the period T of the moving image Vd. In FIG. 10, the graph 1000 is the principal component signal waveform obtained by plotting the principal component of the feature amount vector of each frame image stored in the feature amount signal table 230 in frame number order (time series).

The frequency analysis unit 614 performs a frequency analysis of the principal component signal waveform (graph 1000). The graph 1010 shows the result of the frequency analysis of the principal component signal waveform (graph 1000) (vertical axis: power, horizontal axis: frequency component). Power corresponds, for example, to the amplitude of the waveform.

Here, letting F(t) denote the principal component signal of the feature amount vectors, F(t) can be expressed by the following formula (1) using the discrete Fourier transform, where f(x) is the intensity of frequency x, N is a natural number, i is the imaginary unit, and π is the circle constant (pi).
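Formula (1) itself is not reproduced in this text. One standard form of the discrete Fourier expansion that is consistent with the symbols defined above is shown below; the normalization (here without a factor of 1/N) and the sign of the exponent are assumptions and may differ from the original formula.

```latex
F(t) = \sum_{x=0}^{N-1} f(x)\, e^{\, i \, 2\pi x t / N}
```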

The frequency analysis unit 614, for example, applies a discrete Fourier transform to F(t) into the form of formula (1) above using an FFT (Fast Fourier Transform) or the like, and obtains the power of each frequency x for "0 ≤ x ≤ N-1" using the following formula (2).

Power of frequency x = |f(x)|^2 ... (2)

The frequency analysis unit 614 refers to the frequency analysis result (graph 1010) and identifies the frequency x_max with the maximum power. The second calculation unit 603 takes the identified frequency x_max as the period T (window width) of the moving image Vd. In this way, the strongest frequency component in the feature amount signal along the principal component axis can be used as the period T (window width) of the moving image Vd.
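The following sketch computes the power of each frequency as in formula (2) and derives a window width in frames from the strongest one. It rests on several assumptions not stated in the text: a direct DFT is used in place of an FFT for brevity, the mean is removed and x = 0 is skipped so that a constant offset does not dominate, only x = 1 to N/2 is searched because the upper half of the spectrum mirrors the lower half for a real signal, and the frequency index x_max is converted to a length in frames as N / x_max. The function names are hypothetical.

```python
import cmath
import math
from typing import List


def power_spectrum(signal: List[float]) -> List[float]:
    """Return |f(x)|^2 for x = 0..N-1 using a direct O(N^2) DFT."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the constant (x = 0) offset
    spectrum = []
    for x in range(n):
        fx = sum(c * cmath.exp(-2j * math.pi * x * t / n) for t, c in enumerate(centered))
        spectrum.append(abs(fx) ** 2)
    return spectrum


def dominant_period(signal: List[float]) -> int:
    """Window width in frames for the strongest non-zero frequency."""
    n = len(signal)
    power = power_spectrum(signal)
    # Search only 1..N/2: the upper half mirrors the lower half for real input.
    x_max = max(range(1, n // 2 + 1), key=lambda x: power[x])
    return round(n / x_max)


# A synthetic principal component signal that repeats every 20 frames.
signal = [math.sin(2 * math.pi * t / 20) for t in range(200)]
period = dominant_period(signal)
```

For longer signals, replacing the direct DFT with an FFT routine gives the same power values at lower cost.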

(Example of section division processing)
Next, an example of the division processing used when the moving image Vd is divided into a plurality of sections will be described with reference to FIG. 11.

FIG. 11 is an explanatory diagram showing an example of the section division processing. In FIG. 11, the graph 1000 is the principal component signal waveform obtained by plotting the principal component of the feature amount vector of each frame image stored in the feature amount signal table 230 in frame number order (time series). The specifying unit 604, for example, divides the principal component signal waveform (graph 1000) into a plurality of sections at the calculated period T (window width).

In the example of FIG. 11, the waveform is divided into sections S1 to S3. Each of the sections S1 to S3 is designated by frame numbers. Taking the section S2 as the section of interest, the specifying unit 604 calculates the difference df (degree of abnormality) by, for example, using the sections S1 and S3 as comparison sections and comparing the principal component signal waveform of the section S2 with those of the sections S1 and S3.

(Calculation example of the difference df (degree of abnormality))
Here, an example of calculating the difference df (degree of abnormality) using the k-nearest neighbor method will be described. The case described is one in which a sliding window is used to take out data of the window width and compute the principal component signal values of the feature amount vectors.

For example, given a 500-frame moving image and a window width of "10", the specifying unit 604 calculates vector values v(1) to v(491) as follows, where xn denotes the value of the principal component signal of the frame image at time n.

v(1) = (x1, x2, x3, ..., x10)
v(2) = (x2, x3, x4, ..., x11)
...
v(491) = (x491, x492, ..., x500)

Next, for each v(t), the specifying unit 604 calculates the difference from every other v(t'). For example, the difference between v(1) and v(2) is "(x1-x2)^2 + (x2-x3)^2 + ...", and the difference between v(1) and v(3) is "(x1-x3)^2 + (x2-x4)^2 + ...". The specifying unit 604 then calculates the average of the k smallest of the calculated differences as the difference df (degree of abnormality) for the section at time t.
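The sliding-window extraction and the pairwise difference can be sketched as follows. The function names are hypothetical and the signal values are stand-ins (xn = n) chosen so that the arithmetic is easy to follow. With the indexing v(t) = (xt, ..., x(t+9)), a 500-value signal and a width of 10 yield 500 - 10 + 1 = 491 windows.

```python
from typing import List


def sliding_windows(x: List[float], width: int) -> List[List[float]]:
    """Return every contiguous run of `width` values, shifted one frame at a time."""
    return [x[t:t + width] for t in range(len(x) - width + 1)]


def squared_difference(a: List[float], b: List[float]) -> float:
    """(a1-b1)^2 + (a2-b2)^2 + ... for two windows of equal length."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


x = [float(n) for n in range(1, 501)]  # stand-in values x1..x500
v = sliding_windows(x, width=10)       # v[0] corresponds to v(1) in the text
d12 = squared_difference(v[0], v[1])   # (x1-x2)^2 + (x2-x3)^2 + ...
```

With these stand-in values each term of d12 equals 1, so d12 is 10.0. Averaging the k smallest such differences per window then gives the degree of abnormality described above.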

(Example of determining learning data)
Next, an example of determining the learning data will be described with reference to FIG. 12.

FIG. 12 is an explanatory diagram showing an example of determining the learning data. In FIG. 12, suppose that, of the sections S1 to S3, the section S2 has been identified as an abnormal data section. In this case, the determination unit 605 refers to the frame image DB 220 and determines the frame images corresponding to the sections S1 and S3, which differ from the abnormal data section S2, as learning data.

For example, let the section S1 be the section with window number "1", the section S2 the section with window number "2", and the section S3 the section with window number "3". The determination unit 605 determines the frame images corresponding to the window numbers "1" and "3" as learning data. The determination unit 605 then extracts the frame images corresponding to the window numbers "1" and "3" from the frame image DB 220 and stores the extracted frame images in the learning data DB 240.
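The selection by window number can be sketched as a simple filter. The mapping from frame number to window number stands in for the feature amount signal table 230; the function name and the dictionary layout are hypothetical.

```python
from typing import Dict, List, Set


def select_learning_frames(frame_to_window: Dict[int, int], abnormal_windows: Set[int]) -> List[int]:
    """Return frame numbers whose window is not flagged as abnormal, in ascending order."""
    return sorted(f for f, w in frame_to_window.items() if w not in abnormal_windows)


# Six frames in three windows; window 2 is the abnormal data section.
frame_to_window = {1: 1, 2: 1, 3: 2, 4: 2, 5: 3, 6: 3}
learning_frames = select_learning_frames(frame_to_window, abnormal_windows={2})
```

The returned frame numbers can then be used to look up the corresponding frame images and copy them into the learning data store.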

In this way, the learning data used to train the model M that detects abnormalities in the abnormality detection target (the automatic production line) can be accumulated in the learning data DB 240.

(Data selection processing procedure of the learning data selection device 201)
Next, the data selection processing procedure of the learning data selection device 201 will be described.

FIGS. 13 and 14 are flowcharts showing an example of the data selection processing procedure of the learning data selection device 201. In the flowchart of FIG. 13, the learning data selection device 201 first determines whether a moving image Vd capturing the anomaly detection target has been acquired (step S1301).

Here, the learning data selection device 201 waits until it acquires the moving image Vd (step S1301: No). When the learning data selection device 201 acquires the moving image Vd (step S1301: Yes), it extracts frame images from the acquired moving image Vd at regular time intervals (step S1302).

Next, the learning data selection device 201 assigns frame numbers to the extracted frame images in chronological order and stores the frame images in the frame image DB 220 (step S1303). The learning data selection device 201 then refers to the frame image DB 220 and trains the encoder ec using the extracted frame images as learning data (step S1304).

Next, the learning data selection device 201 refers to the frame image DB 220, inputs each extracted frame image to the trained encoder ec, and calculates the feature vector of each frame image (step S1305). The learning data selection device 201 then arranges the calculated feature vectors in chronological order, performs principal component analysis on them, and identifies the principal component of the feature vectors (step S1306).

Next, the learning data selection device 201 performs frequency analysis on the feature signal of the identified principal component (principal component signal waveform) and identifies the frequency with the maximum power (step S1307). The learning data selection device 201 then sets the identified maximum-power frequency as the period T of the moving image Vd (step S1308) and proceeds to step S1401 shown in FIG. 14.
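Steps S1307 and S1308 can be sketched with a naive discrete Fourier transform. This is only an illustrative sketch: the function name `dominant_period` is hypothetical, the signal is assumed to be a list of at least two samples taken at a constant frame interval, and the conversion of the maximum-power frequency bin k into a period of n / k frames is an assumption added here, since the text states only that the maximum-power frequency is taken as the period T.

```python
import cmath
import math
from typing import List


def dominant_period(signal: List[float]) -> float:
    """Return the period (in frames) of the strongest non-DC frequency in the signal."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the DC offset
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):  # positive frequency bins only
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k  # bin k completes k cycles in n frames


# A synthetic feature signal that repeats every 8 frames.
signal = [math.sin(2 * math.pi * i / 8) for i in range(64)]
period = dominant_period(signal)
```

The naive transform costs O(n^2) operations; a fast Fourier transform would be the usual choice for long signals, but the standard-library version keeps the sketch self-contained.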

In the flowchart of FIG. 14, the learning data selection device 201 first uses the period T of the moving image Vd as the window width and divides the feature signal of the principal component into a plurality of sections of that width (step S1401). This processing corresponds to dividing the moving image Vd into a plurality of sections according to its period T.
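The division in step S1401 can be sketched as follows, assuming that the period T has been converted to a whole number of frames. The name `split_into_sections` is hypothetical, and discarding an incomplete final section is a choice made for this sketch; the text does not say how a remainder is handled.

```python
from typing import List


def split_into_sections(signal: List[float], width: int) -> List[List[float]]:
    """Split the signal into consecutive sections of `width` samples, dropping a short tail."""
    if width <= 0:
        raise ValueError("width must be positive")
    count = len(signal) // width
    return [signal[i * width:(i + 1) * width] for i in range(count)]


# Ten samples with a window width of 4 yield two full sections.
sections = split_into_sections(list(range(10)), width=4)
```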

Next, the learning data selection device 201 selects a section that has not yet been selected from the plurality of sections (step S1402). The learning data selection device 201 then compares the feature signal of the selected section (hereinafter, the "section of interest") with those of k neighboring sections and calculates the difference df for the section of interest (step S1403).

Next, the learning data selection device 201 determines whether the calculated difference df is equal to or greater than the threshold Th (step S1404). If the difference df is equal to or greater than the threshold Th (step S1404: Yes), the learning data selection device 201 proceeds to step S1406. If the difference df is less than the threshold Th (step S1404: No), the learning data selection device 201 determines the frame images corresponding to the section of interest as learning data (step S1405).

The learning data selection device 201 then determines whether any of the plurality of sections remains unselected (step S1406). If an unselected section remains (step S1406: Yes), the learning data selection device 201 returns to step S1402.

If no unselected section remains (step S1406: No), the learning data selection device 201 outputs the frame images determined as learning data (step S1407) and ends the series of processes in this flowchart.

As a result, by using the periodicity of the moving image Vd capturing the anomaly detection target, frame images included in sections of the moving image Vd where the periodicity is disturbed can be treated as possibly abnormal and excluded from the learning data used to train the model M.
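The loop of steps S1402 to S1406 can be sketched as follows. The sketch assumes that the difference df of every section has already been calculated and is held in a mapping from window number to df; the name `select_by_threshold` and the sample values are hypothetical.

```python
from typing import Dict, List


def select_by_threshold(df_by_section: Dict[int, float], threshold: float) -> List[int]:
    """Return the window numbers whose difference df is below the threshold."""
    selected = []
    for window, df in sorted(df_by_section.items()):  # S1402: take the next section
        if df >= threshold:
            continue  # S1404: Yes, treated as an abnormal data section
        selected.append(window)  # S1405: adopt the section as learning data
    return selected  # S1407: output once no section remains


normal_windows = select_by_threshold({1: 0.01, 2: 11.6, 3: 0.02}, threshold=1.0)
```

The comparison uses "equal to or greater than", matching the condition of step S1404, so a section whose df equals the threshold Th is excluded.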

Although the encoder ec is trained in step S1304, the processing is not limited to this. For example, if a trained encoder ec already exists, the learning data selection device 201 may omit step S1304.

Similarly, although the period T of the moving image Vd is calculated in step S1308, the processing is not limited to this. For example, if a period T has been specified in advance, the learning data selection device 201 may omit steps S1307 and S1308.

Also, although a section is determined as learning data in step S1404 when the difference df is less than the threshold Th, the processing is not limited to this. For example, the learning data selection device 201 may select a predetermined number of sections in ascending order of the difference df and use them as learning data. In this case, steps S1404 and S1405 are omitted, and a process that selects the predetermined number of sections in ascending order of the difference df and determines them as learning data is added before step S1407.
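The variation that selects a predetermined number of sections can be sketched as follows. As before, the mapping from window number to df and the name `select_smallest` are assumptions made for illustration.

```python
from typing import Dict, List


def select_smallest(df_by_section: Dict[int, float], count: int) -> List[int]:
    """Return the window numbers of the `count` sections with the smallest df."""
    ranked = sorted(df_by_section, key=lambda window: df_by_section[window])
    return sorted(ranked[:count])  # restore chronological order for output


chosen = select_smallest({1: 0.01, 2: 11.6, 3: 0.02, 4: 0.5}, count=2)
```

Unlike the threshold Th, this variation always returns the requested number of sections as long as enough sections exist, which suits the case where a required number of samples must be secured.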

As described above, the learning data selection device 201 according to the embodiment can calculate the feature amounts of the frame images included in the moving image Vd capturing the anomaly detection target and calculate the period T of the moving image Vd based on the temporal change of the calculated feature amounts. The moving image Vd is, for example, a periodic moving image.

As a result, the period T of the moving image Vd can be obtained from the periodicity of the temporal change in the feature amounts of its frame images, exploiting the fact that similar images repeat periodically in a moving image Vd of an anomaly detection target such as an automatic production line.

The learning data selection device 201 can also calculate, for each of a plurality of sections obtained by dividing the moving image Vd according to the calculated period T, the difference df in frame image feature amounts between that section and the other sections and, based on the difference df calculated for each section, identify a section that includes a frame image in an abnormal state (abnormal data section). The learning data selection device 201 can then determine the frame images corresponding to the sections other than the identified section as learning data used to train the model M.

The learning data selection device 201 can also identify a section whose calculated difference df is equal to or greater than the threshold Th as a section that includes a frame image in an abnormal state.

As a result, sections of the moving image Vd in which the periodicity of the temporal change of the frame image feature amounts is disturbed can be identified accurately.

The learning data selection device 201 can also identify, among the plurality of sections, the sections other than a predetermined number of sections with relatively small calculated differences df as sections that include a frame image in an abnormal state.

As a result, the predetermined amount of learning data (the required number of samples) needed to train the model M can be secured while images in an abnormal state are removed from the learning data.

The learning data selection device 201 can also calculate the feature vectors of the frame images included in the moving image Vd and calculate the period T of the moving image Vd based on the result of frequency analysis of a feature signal that indicates the temporal change of the principal component of the calculated feature vectors.

As a result, the strongest frequency component in the feature signal that varies greatly along the period of the moving image can be taken as the period T of the moving image Vd.

The learning data selection device 201 can also identify, as the principal component, the component with the largest variation among the feature vectors of the frame images.

As a result, the feature component that varies greatly along the period T of the moving image Vd can be identified.
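Read literally, identifying the component with the largest variation amounts to choosing the feature dimension with the highest variance across frames. The following is a minimal sketch under that reading; a conventional principal component analysis would instead project the vectors onto the leading eigenvector of their covariance matrix, and the text does not say which is intended. The name `most_variable_component` and the sample vectors are hypothetical.

```python
from statistics import pvariance
from typing import List


def most_variable_component(vectors: List[List[float]]) -> int:
    """Return the index of the feature component with the largest variance across frames."""
    dims = len(vectors[0])
    variances = [pvariance([v[d] for v in vectors]) for d in range(dims)]
    return max(range(dims), key=lambda d: variances[d])


# Four frames with three feature components each; the second component varies most.
features = [[0.0, 1.0, 5.0], [0.1, 3.0, 5.0], [0.0, 1.0, 5.1], [0.1, 3.0, 5.0]]
index = most_variable_component(features)
signal = [v[index] for v in features]  # feature signal in chronological order
```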

The learning data selection device 201 can also extract frame images from the moving image Vd at regular time intervals, train the encoder ec using the extracted frame images as learning data, and calculate the feature vectors of the frame images included in the moving image Vd using the trained encoder ec.

As a result, feature vectors can be obtained in a feature space suitable for representing the features of the frame images included in the moving image Vd. For example, by having an autoencoder learn which elements (components) allow an image to be reproduced accurately, feature vectors in a feature space suitable for representing the features of the frame images can be obtained.

The learning data selection device 201 can also output the determined learning data.

As a result, it is possible to output learning data that can be used to generate a model M capable of accurately detecting anomalies in the anomaly detection target.

For these reasons, even for data (moving image Vd) in which normal images and abnormal images are mixed without being sorted and the difference between their features is small, the learning data selection device 201 can accurately extract only the normal images for training by detecting disturbances in the periodicity of the change in image feature amounts. This makes it possible to generate a model M that can accurately detect anomalies in the anomaly detection target. In addition, preparing the learning data no longer requires manually sorting images in a normal state from images in an abnormal state, which reduces labor and man-hours.

Here, an example of selecting learning data by this learning data selection method will be described.

FIG. 15 is an explanatory diagram showing an example of selecting learning data. In FIG. 15, frame images 1501 to 1506 are frame images included in a moving image Vd capturing the anomaly detection target (automatic production line), arranged in chronological order. Here, a change in the shooting environment (a change in illumination) causes the anomaly detection target to appear differently.

In this case, according to this learning data selection method, the frame images in the section where the appearance changed because of the illumination change are removed from the learning data. On the other hand, if the same cycle repeats after the illumination change, the frame images of those sections are selected as learning data. The model M can therefore be trained using frame images from both before and after the illumination change as learning data. This model M can avoid erroneously determining that the anomaly detection target is abnormal when the illumination changes even though the target is in a normal state.

Instead of removing the frame images corresponding to a portion (section) with disturbed periodicity from the learning data, as in this learning data selection method, one could also directly determine that a portion with disturbed periodicity is an anomaly of the anomaly detection target. However, that approach has the problem that, as shown in FIG. 15 for example, a disturbance of the periodicity caused by a change in the shooting environment, such as an illumination change, is erroneously determined to be an anomaly.

The learning data selection method described in the present embodiment can be realized by executing a program prepared in advance on a computer such as a personal computer or a workstation. The learning data selection program is recorded on a computer-readable recording medium such as a hard disk, flexible disk, CD-ROM, DVD, or USB memory, and is executed by being read from the recording medium by the computer. The learning data selection program may also be distributed via a network such as the Internet.

The learning data selection device 201 (information processing device 101) described in the present embodiment can also be realized by an application-specific IC such as a standard cell or structured ASIC (Application Specific Integrated Circuit), or by a PLD (Programmable Logic Device) such as an FPGA.

The following appendices are further disclosed with respect to the embodiment described above.

(Appendix 1) A learning data selection program that causes a computer to execute a process comprising:
calculating a feature amount representing a feature of a frame image included in a moving image capturing an anomaly detection target;
calculating a period of the moving image based on a temporal change of the calculated feature amount of the frame image;
calculating, for each of a plurality of sections obtained by dividing the moving image according to the calculated period, a difference in frame image feature amounts between that section and another section;
identifying, based on the difference calculated for each section, a section that includes a frame image in an abnormal state among the plurality of sections; and
determining frame images corresponding to other sections, which differ from the identified section among the plurality of sections, as learning data used to train a model that detects an anomaly of the anomaly detection target.

(Appendix 2) The learning data selection program according to appendix 1, wherein the identifying identifies a section whose calculated difference is equal to or greater than a threshold as a section that includes a frame image in an abnormal state.

(Appendix 3) The learning data selection program according to appendix 1 or 2, wherein the identifying identifies, among the plurality of sections, the sections other than a predetermined number of sections whose calculated differences are relatively small as sections that include a frame image in an abnormal state.

(Appendix 4) The learning data selection program according to any one of appendices 1 to 3, wherein
the calculating of the feature amount calculates a feature vector representing a feature of a frame image included in the moving image, and
the calculating of the period calculates the period based on a result obtained by performing frequency analysis on a feature signal indicating a temporal change of a principal component of the calculated feature vectors of the frame images.

(Appendix 5) The learning data selection program according to appendix 4, further causing the computer to execute a process of identifying, as the principal component, the component with the largest variation among the feature vectors of the frame images.

(Appendix 6) The learning data selection program according to any one of appendices 1 to 5, further causing the computer to execute a process of:
extracting frame images from the moving image at regular time intervals; and
training, using the extracted frame images as learning data, an encoder that converts a frame image into a feature vector,
wherein the calculating of the feature amount calculates the feature vectors of the frame images included in the moving image using the trained encoder.

(Appendix 7) The learning data selection program according to any one of appendices 1 to 6, further causing the computer to execute a process of outputting the determined learning data.

(Appendix 8) The learning data selection program according to any one of appendices 1 to 7, wherein the moving image is a periodic moving image capturing the anomaly detection target.

(Appendix 9) The learning data selection program according to appendix 1, wherein the identifying identifies a predetermined number of sections in descending order of the calculated difference as sections that include a frame image in an abnormal state.

(Appendix 10) A learning data selection method in which a computer executes a process comprising:
calculating a feature amount representing a feature of a frame image included in a moving image capturing an anomaly detection target;
calculating a period of the moving image based on a temporal change of the calculated feature amount of the frame image;
calculating, for each of a plurality of sections obtained by dividing the moving image according to the calculated period, a difference in frame image feature amounts between that section and another section;
identifying, based on the difference calculated for each section, a section that includes a frame image in an abnormal state among the plurality of sections; and
determining frame images corresponding to other sections, which differ from the identified section among the plurality of sections, as learning data used to train a model that detects an anomaly of the anomaly detection target.

(Appendix 11) An information processing device comprising a control unit configured to:
calculate a feature amount representing a feature of a frame image included in a moving image capturing an anomaly detection target;
calculate a period of the moving image based on a temporal change of the calculated feature amount of the frame image;
calculate, for each of a plurality of sections obtained by dividing the moving image according to the calculated period, a difference in frame image feature amounts between that section and another section;
identify, based on the difference calculated for each section, a section that includes a frame image in an abnormal state among the plurality of sections; and
determine frame images corresponding to other sections, which differ from the identified section among the plurality of sections, as learning data used to train a model that detects an anomaly of the anomaly detection target.

101 information processing device
110, 410, Vd moving image
120, 130 feature signal
140 learning data
200 information processing system
201 learning data selection device
202 client device
210 network
220 frame image DB
230 feature signal table
240 learning data DB
300 bus
301 CPU
302 memory
303 disk drive
304 disk
305 communication I/F
306 portable recording medium I/F
307 portable recording medium
400 camera
601 acquisition unit
602 first calculation unit
603 second calculation unit
604 identification unit
605 determination unit
606 output unit
611 encoder learning unit
612 vector calculation unit
613 principal component analysis unit
614 frequency analysis unit
ec encoder

Claims

1. A learning data selection program that causes a computer to execute a process comprising:
calculating a feature amount representing a feature of a frame image included in a moving image capturing an anomaly detection target;
calculating a period of the moving image based on a temporal change of the calculated feature amount of the frame image;
calculating, for each of a plurality of sections obtained by dividing the moving image according to the calculated period, a difference in frame image feature amounts between that section and another section;
identifying, based on the difference calculated for each section, a section that includes a frame image in an abnormal state among the plurality of sections; and
determining frame images corresponding to other sections, which differ from the identified section among the plurality of sections, as learning data used to train a model that detects an anomaly of the anomaly detection target.

2. The learning data selection program according to claim 1, wherein the identifying identifies a section whose calculated difference is equal to or greater than a threshold as a section that includes a frame image in an abnormal state.

3. The learning data selection program according to claim 1, wherein the identifying identifies a predetermined number of sections in descending order of the calculated difference as sections that include a frame image in an abnormal state.

4. The learning data selection program according to any one of claims 1 to 3, wherein
the calculating of the feature amount calculates a feature vector representing a feature of a frame image included in the moving image, and
the calculating of the period calculates the period based on a result obtained by performing frequency analysis on a feature signal indicating a temporal change of a principal component of the calculated feature vectors of the frame images.

5. The learning data selection program according to any one of claims 1 to 4, further causing the computer to execute a process of:
extracting frame images from the moving image at regular time intervals; and
training, using the extracted frame images as learning data, an encoder that converts a frame image into a feature vector,
wherein the calculating of the feature amount calculates the feature vectors of the frame images included in the moving image using the trained encoder.

6. The learning data selection program according to claim 1, causing the computer to execute a process of outputting the determined learning data.

7. The learning data selection program according to any one of claims 1 to 6, wherein the moving image is a periodic moving image of an anomaly detection target.

8. A learning data selection method in which a computer executes a process comprising:
calculating a feature amount representing a feature of a frame image included in a moving image capturing an anomaly detection target;
calculating a period of the moving image based on a temporal change of the calculated feature amount of the frame image;
calculating, for each of a plurality of sections obtained by dividing the moving image according to the calculated period, a difference in frame image feature amounts between that section and another section;
identifying, based on the difference calculated for each section, a section that includes a frame image in an abnormal state among the plurality of sections; and
determining frame images corresponding to other sections, which differ from the identified section among the plurality of sections, as learning data used to train a model that detects an anomaly of the anomaly detection target.

9. An information processing device comprising a control unit configured to:
calculate a feature amount representing a feature of a frame image included in a moving image capturing an anomaly detection target;
calculate a period of the moving image based on a temporal change of the calculated feature amount of the frame image;
calculate, for each of a plurality of sections obtained by dividing the moving image according to the calculated period, a difference in frame image feature amounts between that section and another section;
identify, based on the difference calculated for each section, a section that includes a frame image in an abnormal state among the plurality of sections; and
determine frame images corresponding to other sections, which differ from the identified section among the plurality of sections, as learning data used to train a model that detects an anomaly of the anomaly detection target.