JP2010128947A

JP2010128947A - Abnormality estimation apparatus, abnormality estimation method and abnormality estimation program

Info

Publication number: JP2010128947A
Application number: JP2008304946A
Authority: JP
Inventors: Kyoko Sudo; 恭子数藤; Tatsuya Osawa; 達哉大澤; Yoshiori Wakabayashi; 佳織若林; Hideki Koike; 秀樹小池
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2008-11-28
Filing date: 2008-11-28
Publication date: 2010-06-10
Anticipated expiration: 2028-11-28
Also published as: JP5002575B2

Abstract

PROBLEM TO BE SOLVED: To stably classify scenes as normal or abnormal from the motion features of moving objects.

SOLUTION: Video captured by a camera is input to a video input part 2 of an abnormality estimation apparatus 1. A motion feature extraction part 3 generates a plurality of time-series motion features from the input video, selects among them according to a selection criterion, and outputs a time series of the selected feature combination to an identification part 4. The identification part 4 classifies the input feature combination as normal or abnormal using an SVM technique, and computes N logical values indicating whether each feature element should be used in the final identification. A display part 5 displays the time series of normal/abnormal flags as the identification result.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to a technique for identifying the degree of non-stationarity of a scene from feature quantities based on the motion of moving objects contained in video.

In recent years, video surveillance systems equipped with many cameras have become widespread, and surveillance work needs to be made more efficient. For this reason, video surveillance systems now incorporate not only video storage but also functions that extract information from video, such as moving object detection and person tracking. One such function is non-stationary detection, which automatically detects scenes that differ from the usual, and there is a growing need to incorporate it into existing video surveillance systems as a video module.

As a clue for detecting non-stationary scenes, changes in the position and motion direction of moving objects in the video can be captured by using, as a feature quantity, the pair consisting of the position coordinates at each time and the motion vector at those coordinates. Non-stationary scenes can then be detected from differences in this feature quantity. However, the detection of non-stationary scenes raises the following problems.

The first problem is that motion vector extraction is susceptible to noise, and the resulting feature quantities vary widely, which makes it difficult to separate stationary from non-stationary scenes stably.

The second problem is how to input the temporal change of motion vectors to the classifier. If a motion vector is stored for every pixel of the image, the amount of information is the pixel count, on the order of thousands to hundreds of thousands, multiplied by the number of frames. Memory usage and the time required for statistical processing then become enormous, which is unsuitable as input to real-time detection. Memory can be reduced by keeping information only for regions where moving objects exist, but if the number of such regions changes over time, the dimensionality of the feature quantity changes as well, which is awkward for statistical processing. A method is therefore needed that extracts changes in the position and motion direction of moving objects in the video as a stable, low-dimensional feature quantity.

The third problem is that what should be used as the feature quantity separating stationary from non-stationary is thought to depend on where the camera is installed and on the video observed there, yet it is difficult to determine in advance which feature quantity is optimal, and no method for selecting feature quantities automatically has been shown. Using every feature is not necessarily appropriate: features that do not contribute to separating stationary from non-stationary may act as noise in the identification computation and waste computational cost.

Regarding the first problem, Non-Patent Document 1 proposes a method that computes spatial and temporal averages of motion within a moving object region. The method was proposed as a motion-based feature quantity for identifying whether a moving region is a person, and it is effective for region-by-region comparison. However, it does not show how to obtain a feature quantity for the entire image rather than for individual regions. Extracting a feature quantity for the entire image requires a way to reflect the motion features of the main moving objects contained in it.

Non-Patent Document 1: Tetsuji Haga, Kazuhiko Sumi, Yasushi Yagi, "Detection of Pedestrians in Outdoor Scenes Focusing on Spatio-Temporal Features of Motion in Change Regions," IEICE Transactions D-II, vol. J87-D-II, no. 5, pp. 1104-1111, 2004.
Non-Patent Document 2: Takashi Matsuyama, Yoshinori Kuno, Atsushi Imiya (eds.), "Computer Vision," Shin-Gijutsu Communications, Section 3.3.4, p. 172.
Non-Patent Document 3: Shin, Watanabe, Sugawara, "Video Analysis Method Using Temporal Templates," IEICE Technical Report, PRMU2002.
Non-Patent Document 4: Baba, "Statistics of Angular Data," Proceedings of the Institute of Statistical Mathematics, vol. 28, no. 1, pp. 41-45, 1981.
Non-Patent Document 5: Zeng Wenhua, Ma Jian, "A Novel Incremental SVM Learning Algorithm," The 8th International Conference on Computer Supported Cooperative Work in Design Proceedings, pp. 658-662, 2003.

In the conventional technique, motion vector extraction is susceptible to noise, and feature quantities based on motion vectors vary widely, so stable stationary/non-stationary classification is difficult. Even if the existing method of computing spatial and temporal motion averages within a moving object region is used, no method has been shown for obtaining a feature quantity of the entire image rather than a region-by-region comparison.

In addition, storing a motion vector for every pixel of the image yields feature quantities on the order of thousands to hundreds of thousands of dimensions, so memory usage and the time required for statistical processing become enormous, which is unsuitable for real-time processing. If memory is reduced by keeping information only for regions where moving objects exist, the dimensionality of the feature quantity changes whenever the number of regions changes over time, which makes the data poorly suited to statistical processing.

Furthermore, the motion features that can be extracted as information for separating stationary from non-stationary include the position of a moving object and the direction, magnitude, and duration of its motion. Using elements that do not contribute to the stationary/non-stationary difference may introduce noise into the identification computation and waste computational cost. A means is therefore needed for selecting which combination of elements (position, motion direction, magnitude, continuous duration of motion, and so on) to reflect in the feature quantity.

The present invention has been made in view of these circumstances, and its object is to classify stationary and non-stationary scenes stably from the motion features of moving objects.

To solve the above problem, the invention according to claim 1 is an apparatus that extracts feature quantities based on the motion of moving objects contained in video captured by an imaging device and identifies the non-stationarity of a scene by statistical processing, the apparatus comprising: motion feature extraction means for extracting feature quantities based on the motion of moving objects in the input video from the imaging device; and identification means for identifying the non-stationarity of the input video from the feature quantities extracted by the motion feature extraction means.

In the invention according to claim 2, the motion feature extraction unit comprises: a plurality of feature quantity extraction modules that generate time-series feature quantities from the input video; and feature quantity selection means that selects among the feature quantities generated by the feature quantity extraction modules according to a feature quantity selection criterion and outputs a time series of the combination of selected feature quantities to the identification unit.

In the invention according to claim 3, each feature quantity extraction module comprises: change region extraction means that extracts moving object regions in the input video and obtains a binary image of the moving object regions and the background region; motion vector calculation means that obtains, for each time-series image in the input video, a motion vector array (x, y, u, v) pairing the position (x, y) of each pixel with a two-dimensional vector (u, v) indicating the motion direction of that pixel; and integrated vector calculation means that integrates the moving object region information with the motion vector array information and obtains the integrated vector pair as a feature quantity.

In the invention according to claim 4, the integrated vector calculation means comprises: motion history generation means that generates a motion history image, in which the time-series binary images are superimposed over time, by assigning numerical values to the moving object regions and zero to the background region; and mask processing means that masks the motion vector array using the motion history image.

In the invention according to claim 5, each feature quantity extraction module further comprises continuous time counting means that counts the time during which the integrated vector pair appears continuously, and outputs to the feature quantity selection unit a feature quantity in which the counted time is appended to the integrated vector pair.

In the invention according to claim 6, the identification means comprises: SVM identification means that classifies an N-dimensional identification feature quantity, formed by combining the feature quantities of the feature quantity extraction modules, as stationary or non-stationary using an SVM (support vector machine) technique; and feature quantity evaluation means that, from the support vectors generated by the SVM identification means and the stationary/non-stationary identification results, feeds back to the identification device N logical values indicating, for each element of the N-dimensional feature quantity, whether that element is ultimately used for identification. The time series of stationary/non-stationary flags is displayed as the identification result.

The invention according to claim 7 is a method that extracts feature quantities based on the motion of moving objects contained in video captured by an imaging device and identifies the non-stationarity of a scene by statistical processing, the method comprising: a motion feature extraction step in which motion feature extraction means extracts feature quantities based on the motion of moving objects in the input video from the imaging device; and an identification step in which identification means identifies the non-stationarity of the input video from the feature quantities extracted in the motion feature extraction step.

In the invention according to claim 8, the motion feature extraction step comprises: a feature quantity generation step in which each of a plurality of feature quantity extraction modules generates a time-series feature quantity from the input video; and a feature quantity selection step in which feature quantity selection means selects among the feature quantities generated in the feature quantity generation step according to a feature quantity selection criterion and outputs a time series of the combination of selected feature quantities to the identification unit.

In the invention according to claim 9, the feature quantity generation step comprises: a change region extraction step of extracting moving object regions in the input video and obtaining a binary image of the moving object regions and the background region; a motion vector calculation step of obtaining, for each time-series image in the input video, a motion vector array (x, y, u, v) pairing the position (x, y) of each pixel with a two-dimensional vector (u, v) indicating the motion direction of that pixel; and an integrated vector calculation step of integrating the moving object region information with the motion vector array information and obtaining the integrated vector pair as a feature quantity.

The invention according to claim 10 relates to a non-stationarity estimation program that causes a computer to function as the non-stationarity estimation apparatus according to any one of claims 1 to 6.

According to the inventions of claims 1 to 10, stationary and non-stationary scenes are classified stably from the motion features of moving objects, and non-stationary detection can be performed at high speed and with high accuracy without being affected by noise.

FIG. 1 shows a configuration example of a non-stationarity estimation apparatus according to an embodiment of the present invention. Video data is input to the estimation apparatus 1 over a network from an imaging device of a video surveillance system (for example, a digital camera or a digital video camera).

The non-stationarity estimation apparatus 1 is implemented on a computer and comprises a video input unit 2 that receives the video data, a motion feature extraction unit 3 that extracts motion features of moving objects from the input video in time series, an identification unit 4 that obtains a numerical value indicating the degree of non-stationarity from the time-series motion features extracted by the motion feature extraction unit 3, and a display unit 5 that displays the value obtained by the identification unit 4.

The processing functions of units 2 to 5 are realized by the cooperation of computer hardware and software. The identification unit 4 can also apply a threshold to the non-stationarity value and output to the display unit 5 a logical value indicating whether the scene is non-stationary. The display unit 5 may be, for example, a monitor.

The non-stationarity estimation apparatus 1 also has the usual components of a computer: arithmetic means that controls the processing of units 2 to 5 (for example, a CPU: Central Processing Unit), memory (RAM) that can hold the processing data of units 2 to 5, a hard disk drive that can store that data, a communication device for network connection, and input means such as a pointing device and a keyboard. The processing of units 2 to 5 is described in detail below.

(1) Video input unit 2
The video data input to the video input unit 2 may be, for example, an AVI file or a JPEG image sequence, and is input to the video input unit 2 one frame at a time. Here, the video input unit 2 is implemented by a communication device.

(2) Motion feature extraction unit 3
The motion feature extraction unit 3 extracts the motion features of objects using several frames of the input video data at a time. The motion features extracted here are numerical values carrying information on the regions of moving objects in the video, their motion directions, and changes in them.

Specifically, as shown in FIG. 2, the motion feature extraction unit 3 is broadly divided into a feature quantity extraction module group 6 and a feature quantity selection unit 7. The module group 6 consists of k types of feature quantity extraction modules, shown in FIG. 2 as the feature quantity 1 extraction module through the feature quantity k extraction module. Using the time-series image frames input from the video input unit 2, each module extracts its own feature quantity (feature quantity 1 through feature quantity k) and outputs its time series to the feature quantity selection unit 7.

Feature quantities 1 through k are features that can be extracted from a time series of image frames, that is, features of the image changes caused by the motion of moving objects in the frames. They are described here as motion vectors. However, spatio-temporal features (a feature quantity of dimension pixel count × frame count, obtained by flattening binary images in which moving regions are 1 and the background is 0 into single rows and concatenating them over several frames) or inter-frame difference features (a feature quantity of dimension pixel count, consisting of the pixel sequence of the binary image generated from the difference between adjacent frames) may also be used.
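The two alternative features mentioned above can be sketched as follows. This is only an illustration under stated assumptions: images are nested lists, the function names are hypothetical, and the binarization threshold for the frame difference is an assumed parameter that the text does not specify.

```python
from typing import List

Image = List[List[int]]

def spatiotemporal_feature(binaries: List[Image]) -> List[int]:
    """Flatten each binary image (1 = moving, 0 = background) into one row
    and concatenate the rows over all frames (pixel count x frame count)."""
    return [pixel for binary in binaries for row in binary for pixel in row]

def frame_difference_feature(prev: Image, curr: Image, threshold: int = 0) -> List[int]:
    """Binary image of the difference between adjacent frames, flattened.

    Assumption: a pixel counts as changed when the absolute grey-level
    difference exceeds `threshold`.
    """
    return [
        1 if abs(c - p) > threshold else 0
        for prev_row, curr_row in zip(prev, curr)
        for p, c in zip(prev_row, curr_row)
    ]

if __name__ == "__main__":
    previous = [[10, 10], [10, 10]]
    current = [[10, 50], [12, 10]]
    print(frame_difference_feature(previous, current, threshold=5))
```

Both features have a fixed dimensionality for a fixed image size, which is the property the description asks of inputs to the statistical processing.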

Based on the feature quantity selection criterion, the feature quantity selection unit 7 outputs, in time series, an N-dimensional identification feature quantity that combines some of feature quantities 1 through k. The selection criterion indicates which feature quantities to use, that is, it is a set of numerical values corresponding to feature quantities 1 through k, and it can be entered externally through a GUI (Graphical User Interface) or a command prompt using the input means described above. The selection criterion may also be a threshold defined in advance, for example in a program.
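A minimal sketch of this selection step, assuming the selection criterion is a list of k values in which a non-zero entry means "use this feature quantity". The function name and the data layout are hypothetical, not taken from the patent.

```python
from typing import List, Sequence

def select_features(features: Sequence[Sequence[float]],
                    criterion: Sequence[int]) -> List[float]:
    """Concatenate the feature quantities whose criterion entry is non-zero.

    `features` holds feature quantities 1..k for one time step;
    the result is the N-dimensional identification feature quantity.
    """
    if len(features) != len(criterion):
        raise ValueError("one criterion value is required per feature quantity")
    selected: List[float] = []
    for feature, use in zip(features, criterion):
        if use:
            selected.extend(feature)
    return selected

if __name__ == "__main__":
    # k = 3 feature quantities for one time step; the second one is switched off.
    step = [[10.0, 20.0, 12.0, 24.0], [3.0], [0.5, 0.7]]
    print(select_features(step, [1, 0, 1]))
```

Applying the same criterion at every time step keeps N constant across the time series.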

FIG. 3 shows a configuration example of each feature quantity extraction module. Each module comprises a change region extraction unit 8, a motion vector calculation unit 9, an integrated vector calculation unit 10, a continuous time counting unit 13, and an array storage unit 14.

In terms of processing, the motion vector calculation unit 9 first obtains a motion vector array. This is an array of tuples (x, y, u, v), each associating the position (x, y) of a pixel in the image with the two-dimensional vector (u, v) indicating the motion direction of that pixel at the time the position was detected. The motion vectors (x, y, u, v) obtained here are output to the integrated vector calculation unit 10.

In this embodiment, the term motion vector covers both what is generally called optical flow and vectors derived from it by processing such as spatial or temporal averaging. The optical flow may be obtained by a method based on luminance gradients or by the region-matching method described in Non-Patent Document 2. Next, the change region extraction unit 8 outputs a binary image separating the moving object regions from the background region to the integrated vector calculation unit 10.

From the binary image output by the change region extraction unit 8 and the motion vectors output by the motion vector calculation unit 9, the integrated vector calculation unit 10 obtains a vector integrated as a feature quantity of the entire image and outputs it to the continuous time counting unit 13. The output of the motion vector calculation unit 9 is a time series of sets of numerical values consisting of the motion vectors for each pixel (four-dimensional information determined by start-point and end-point coordinates).

For example, it is output as a time series of tuples (start-point x coordinate, start-point y coordinate, end-point x coordinate, end-point y coordinate) of the representative vector. The integrated vector calculation unit 10 integrates the change region information (binary image) with the motion vector information and outputs a time series of integrated motion vectors representing the motion features of the image as a whole.

FIG. 4 shows a configuration example of the integrated vector calculation unit 10, which comprises a motion history image generation unit 11 and a mask processing unit 12. The motion history image generation unit 11 generates an image in which the temporal changes of the change regions are superimposed (a motion history image). The motion history image is used to determine the magnitude and direction of moving object motion over a given period; the method is described in Non-Patent Document 3.

Specifically, the motion history image generation unit 11 receives the time-series binary images from the change region extraction unit 8 and generates a mask image in which pixels of moving regions are assigned a non-zero value and all other regions are zero. This mask image is input to the mask processing unit 12, which also receives the motion vector array (x, y, u, v) from the motion vector calculation unit 9. The mask processing unit 12 obtains the integrated vector by averaging the vectors inside the mask. Here the motion vectors are integrated using spatial and temporal averages.
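The mask generation and the averaging inside the mask might be sketched as below. The sketch makes several assumptions that the text leaves open: images are nested lists, the non-zero value stored for a moving pixel is the 1-based index of the latest frame in which it moved, and only the spatial average over a single flow field is shown. The function names are hypothetical.

```python
from typing import List, Optional, Tuple

Binary = List[List[int]]
Flow = List[List[Tuple[float, float]]]  # (u, v) per pixel

def motion_history_image(binaries: List[Binary]) -> List[List[int]]:
    """Superimpose time-series binary images into one mask image.

    Assumption: a moving pixel stores the 1-based index of the latest
    frame in which it moved; background pixels stay 0.
    """
    height, width = len(binaries[0]), len(binaries[0][0])
    history = [[0] * width for _ in range(height)]
    for index, binary in enumerate(binaries, start=1):
        for y in range(height):
            for x in range(width):
                if binary[y][x]:
                    history[y][x] = index
    return history

def masked_mean_vector(mask: List[List[int]],
                       flow: Flow) -> Optional[Tuple[float, float, float, float]]:
    """Average (x, y, u, v) over the pixels where the mask is non-zero."""
    count = 0
    sum_x = sum_y = sum_u = sum_v = 0.0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                u, v = flow[y][x]
                sum_x += x
                sum_y += y
                sum_u += u
                sum_v += v
                count += 1
    if count == 0:
        return None  # no moving pixels, so no integrated vector
    return (sum_x / count, sum_y / count, sum_u / count, sum_v / count)

if __name__ == "__main__":
    frames = [[[1, 0, 0], [0, 0, 0]], [[0, 1, 0], [0, 0, 0]]]
    flow = [[(1.0, 0.0), (3.0, 2.0), (9.0, 9.0)], [(9.0, 9.0)] * 3]
    mask = motion_history_image(frames)
    print(mask, masked_mean_vector(mask, flow))
```

Whatever the image size or the number of moving regions, the result is a single four-element vector, which keeps the feature dimensionality fixed.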

An example of the integration processing is described with reference to FIG. 5. FIG. 5(a) shows the change regions at time t, with arrows indicating the direction and magnitude of motion of each region. FIG. 5(b) likewise shows the state at time t+k. FIG. 5(c) shows the change regions at time t and time t+k superimposed; this state corresponds to the motion history image.

That is, the processing of the motion history image generation unit 11 corresponds to the processing from FIG. 5(a) to FIG. 5(c). Here only the two times t and t+k are superimposed, but the interval k and the number of times superimposed are variable parameters and can be changed freely. For the superimposed change regions in FIG. 5(c), the height and width of the rectangle inscribed in each region are obtained, and either the regions whose rectangle perimeter or rectangle area is at or above a given threshold, or a fixed number of regions taken from the largest, are selected. The threshold and the variable parameters may be defined, for example, in a program.
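The region selection could be sketched as follows. Note the assumptions: regions are found as 4-connected components, each region is represented by its axis-aligned bounding rectangle (the text speaks of a rectangle fitted to each region without fixing how it is computed), and only the area criterion is shown. All names are hypothetical.

```python
from collections import deque
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive

def region_boxes(mask: List[List[int]]) -> List[Box]:
    """Bounding rectangle of every 4-connected non-zero region."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    boxes: List[Box] = []
    for y0 in range(height):
        for x0 in range(width):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            x_min = x_max = x0
            y_min = y_max = y0
            while queue:
                x, y = queue.popleft()
                x_min, x_max = min(x_min, x), max(x_max, x)
                y_min, y_max = min(y_min, y), max(y_max, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x_min, y_min, x_max, y_max))
    return boxes

def box_area(box: Box) -> int:
    return (box[2] - box[0] + 1) * (box[3] - box[1] + 1)

def select_regions(boxes: List[Box], min_area: Optional[int] = None,
                   top_n: Optional[int] = None) -> List[Box]:
    """Keep boxes with area >= min_area and/or the top_n largest boxes."""
    kept = [b for b in boxes if min_area is None or box_area(b) >= min_area]
    kept.sort(key=box_area, reverse=True)
    return kept if top_n is None else kept[:top_n]

if __name__ == "__main__":
    mask = [[1, 1, 0, 0], [1, 1, 0, 1], [0, 0, 0, 0]]
    print(select_regions(region_boxes(mask), min_area=2))
```

Small isolated regions, which are often noise, are dropped before the vectors are averaged.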

The motion history image generation unit 11 then outputs a mask image in which the selected regions are 1 and all other regions are 0. In FIG. 5(d), vectors are integrated only for the selected regions: the average of the vectors inside the mask is computed and taken as the integrated vector. FIG. 5(d) thus shows the processing in the integrated vector calculation unit, and the arrow P in FIG. 5(d) corresponds to the representative vector. FIGS. 5(e) and 5(f) show concrete processed images corresponding to FIGS. 5(c) and 5(d). Two examples of the integration method follow.

In the first method, every pixel in change regions $A_1^{(t)}$ and $A_2^{(t)}$ at time t in FIG. 5(a) is first assigned the numerical values corresponding to the vectors $v_1^{(t)}$ and $v_2^{(t)}$ that indicate the direction and magnitude of motion of the respective region. Likewise, every pixel in change regions $A_1^{(t+k)}$ and $A_2^{(t+k)}$ at time t+k in FIG. 5(b) is assigned the values corresponding to the vectors $v_1^{(t+k)}$ and $v_2^{(t+k)}$.

Next, in FIG. 5(c), with the pixels in each rectangle holding their respective values, the portion where rectangles overlap is given a value that integrates the values of both rectangles. When the value corresponding to a vector is the pair of its (start x, start y) and (end x, end y), the average is computed separately for each of these four numbers.
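The component-wise average for an overlapping portion can be illustrated with a short sketch. The tuple layout (start x, start y, end x, end y) follows the text; the function name is hypothetical.

```python
from typing import Sequence, Tuple

Vector4 = Tuple[float, float, float, float]  # (start x, start y, end x, end y)

def average_vectors(vectors: Sequence[Vector4]) -> Vector4:
    """Average each of the four numbers separately over the given vectors."""
    if not vectors:
        raise ValueError("at least one vector is required")
    count = len(vectors)
    return (
        sum(v[0] for v in vectors) / count,
        sum(v[1] for v in vectors) / count,
        sum(v[2] for v in vectors) / count,
        sum(v[3] for v in vectors) / count,
    )

if __name__ == "__main__":
    # Two rectangles overlap; their vectors are merged for the overlap.
    print(average_vectors([(0.0, 0.0, 2.0, 0.0), (2.0, 2.0, 4.0, 6.0)]))
```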

Alternatively, the numerical value corresponding to a vector may be obtained by dividing the vector direction (360 degrees) into N sectors and assigning the values 1 to N to them. In this case, the portion where the rectangles overlap is given the average of those values. The direction may be divided, for example, into four sectors (N = 4): 0 to 45 degrees together with 315 to 360 degrees, 45 to 135 degrees, 135 to 225 degrees, and 225 to 315 degrees, with the values 1 to 4 assigned to the respective sectors.

In this case, the assigned values are discontinuous at the 315-degree boundary, so care is required: if vectors in that direction occur frequently, the error becomes large. The sectors should therefore be arranged so that the discontinuous boundary falls on a direction that occurs infrequently. If the 45-degree direction occurs infrequently, the four-sector division described above is used; if the 0-degree direction occurs infrequently, the division 0 to 90 degrees, 90 to 180 degrees, 180 to 270 degrees, and 270 to 360 degrees is used. When a more accurate treatment is needed, or when no infrequent direction is known, the correlation coefficient for angular statistics described in Non-Patent Document 4 can be applied.
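The sector assignment with a movable boundary can be sketched as below. The function name `direction_bin` and the parameter `boundary_deg` (the angle at which sector 1 begins, which is also where the values wrap from N back to 1) are hypothetical, and the angle is measured with `atan2(v, u)` in the mathematical convention, which is an assumption about the coordinate system.

```python
import math


def direction_bin(u, v, n_bins=4, boundary_deg=315.0):
    """Return a sector label in 1..n_bins for the direction of (u, v).

    Sector 1 starts at boundary_deg and the sectors proceed
    counter-clockwise, so the jump from n_bins back to 1 sits at
    boundary_deg. Choose boundary_deg on a rarely occurring direction.
    """
    angle = math.degrees(math.atan2(v, u)) % 360.0
    width = 360.0 / n_bins
    # The trailing modulo guards against floating-point results of exactly 360.
    return int(((angle - boundary_deg) % 360.0) // width) % n_bins + 1


# Default boundary at 315 degrees reproduces the four sectors in the text.
print([direction_bin(1, 0), direction_bin(0, 1),
       direction_bin(-1, 0), direction_bin(0, -1)])  # [1, 2, 3, 4]
```

With `boundary_deg=0.0` the same function yields the alternative division into 0 to 90, 90 to 180, 180 to 270, and 270 to 360 degrees.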

The second method takes as the representative vector the pair (start x, start y) and (end x, end y) of the vector belonging to the change region with the largest area.
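A minimal sketch of the second method follows. The dictionary keys `area` and `vector` and the function name `representative_vector` are illustrative assumptions; the source does not specify a data layout.

```python
def representative_vector(regions):
    """Return the (sx, sy, ex, ey) vector of the largest-area change region.

    regions: list of dicts with keys "area" (pixel count) and "vector".
    Returns None when there is no change region (assumed behavior).
    """
    if not regions:
        return None
    return max(regions, key=lambda r: r["area"])["vector"]


regions = [
    {"area": 120, "vector": (10, 10, 14, 12)},
    {"area": 450, "vector": (40, 30, 35, 30)},
]
print(representative_vector(regions))  # (40, 30, 35, 30)
```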

An example of the processing in the continuous time counting unit 13 is described next. The unit receives the vectors integrated by the integrated vector calculation unit 10 (a time series of numerical values corresponding to the time series of images). The continuous time counting unit 13 counts the continuous time during which the integrated vector is non-zero, that is, it counts the frames in which the time average of the motion vector is non-zero. The counted continuous time is stored as a feature quantity in a data array in the array storage unit 14 (for example, a memory), together with the integrated vector, and is output to the feature quantity selection unit 7.

The continuous time counted here indicates how long a moving object has been present without interruption. Adding it to the feature quantities therefore allows the identification unit 4 to detect a moving object that keeps moving for a long time as non-stationary.

FIG. 6 shows the processing flow of each feature quantity extraction module described above. The input video frames are denoted $I(x, y, t-s), I(x, y, t-s+1), \ldots, I(x, y, t)$, where $(x, y)$ are image coordinates, $t$ is the current time, and $s$ is the number of past frames needed to compute the current feature quantity. Processing starts at $t = 0$.

S01: First, the optical flow is computed. The image for which the optical flow has been obtained is denoted $I_o(x, y, t) = (u(x, y, t), v(x, y, t))$.

S02, S03: In parallel with S01, the moving object regions are obtained by background subtraction or a similar technique (S02), and the regions are labeled and the number of pixels in each region is computed (S03).

After labeling, the areas (pixel counts) of the n moving object regions with the most pixels, taken in descending order of pixel count, are denoted $A_0(x, y, t), \ldots, A_n(x, y, t)$.

S04: The spatial average (center coordinates, direction vector) is computed for the image processed in S03. When the pixel at coordinates $(x, y)$ belongs to the k-th moving region at time $t$ and that region contains $M$ pixels, $A_k(x, y, t) = M$. The output of the spatial average (center coordinates, direction vector) calculation, $(X_s(t), Y_s(t), u_s(t), v_s(t))$, is computed as follows.

$$X_s(t) = \text{mean of } x \text{ such that } A_k(x, y, t) \neq 0$$

$$Y_s(t) = \text{mean of } y \text{ such that } A_k(x, y, t) \neq 0$$

$$u_s(t) = \frac{1}{\sum_{k=1}^{n} A_k(x, y, t)} \sum_{k=1}^{n} u(x, y, t)\, A_k(x, y, t)$$

$$v_s(t) = \frac{1}{\sum_{k=1}^{n} A_k(x, y, t)} \sum_{k=1}^{n} v(x, y, t)\, A_k(x, y, t)$$

Then $t$ is set to $t + 1$, and processing returns to S01 and S02 to continue with the next frame.
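One possible reading of the S04 formulas is sketched below: each pixel carries the pixel count of the region it belongs to (zero for background), the center is the plain mean of the coordinates of non-background pixels, and the direction vector is the mean of the flow weighted by that per-pixel area value. Summing over pixels is an interpretation of the formulas, not something the text states explicitly, and the function name `spatial_average` and the nested-list layout are assumptions.

```python
def spatial_average(flow, area):
    """Compute (X_s, Y_s, u_s, v_s) for one frame.

    flow: 2-D list of (u, v) optical-flow tuples.
    area: 2-D list where area[y][x] is the pixel count of the moving
          region containing (x, y), or 0 for background.
    """
    n = 0
    sum_x = sum_y = 0.0
    weighted_u = weighted_v = weight_total = 0.0
    for y, (flow_row, area_row) in enumerate(zip(flow, area)):
        for x, ((u, v), a) in enumerate(zip(flow_row, area_row)):
            if a != 0:
                n += 1
                sum_x += x
                sum_y += y
                weighted_u += u * a
                weighted_v += v * a
                weight_total += a
    if n == 0:
        # Assumption: no moving region yields an all-zero feature.
        return (0.0, 0.0, 0.0, 0.0)
    return (sum_x / n, sum_y / n,
            weighted_u / weight_total, weighted_v / weight_total)


flow = [[(2.0, 0.0), (4.0, 2.0), (9.0, 9.0)]]
area = [[2, 2, 0]]  # one two-pixel region, one background pixel
print(spatial_average(flow, area))  # (0.5, 0.0, 3.0, 1.0)
```

Because larger regions contribute larger weights, the flow of a large moving object dominates the averaged direction vector.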

S05: The time average (center coordinates, direction vector) is computed. The time average $(X_{st}(t), Y_{st}(t), u_{st}(t), v_{st}(t))$ of $(X_s(t-s), Y_s(t-s), u_s(t-s), v_s(t-s)), \ldots, (X_s(t), Y_s(t), u_s(t), v_s(t))$ is obtained as the component-wise average.
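The component-wise time average of S05 over the current frame and the s preceding frames can be sketched with a fixed-length window. The class name `TimeAverager` is hypothetical, and averaging over fewer frames while the window is still filling is an assumption; the source does not say how the first s frames are handled.

```python
from collections import deque


class TimeAverager:
    """Sliding-window mean over the last s + 1 spatial-average tuples."""

    def __init__(self, s):
        # deque drops the oldest entry automatically once s + 1 are stored.
        self.window = deque(maxlen=s + 1)

    def update(self, feature):
        """Add the newest (X_s, Y_s, u_s, v_s) and return the window mean."""
        self.window.append(feature)
        n = len(self.window)
        return tuple(
            sum(f[i] for f in self.window) / n for i in range(len(feature))
        )


averager = TimeAverager(s=1)
first = averager.update((0.0, 0.0, 2.0, 0.0))
second = averager.update((2.0, 2.0, 4.0, 2.0))
third = averager.update((4.0, 4.0, 0.0, 0.0))
print(third)  # (3.0, 3.0, 2.0, 1.0)
```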

S06 to S08: It is checked whether the time-averaged motion vector $(u_{st}(t), v_{st}(t))$ obtained in S05 is the zero vector $(0, 0)$ (S06).

If it is the zero vector $(0, 0)$, the frame is excluded from the count by setting $C(t) = 0$ (S07). If it is not the zero vector, the frame is counted by setting $C(t) = C(t-1) + 1$ (S08).

S09: The frame count $C(t)$ obtained in S08 is appended to the time average obtained in S05, and the result $(X_{st}(t), Y_{st}(t), u_{st}(t), v_{st}(t), C(t))$ is stored in the data array as the feature quantity.
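Steps S06 to S09 amount to a run-length counter attached to each time-averaged feature. The sketch below assumes an exact comparison with zero by default; the tolerance parameter `eps` and the function names `update_count` and `build_features` are additions for illustration and do not appear in the source.

```python
def update_count(prev_count, u_st, v_st, eps=0.0):
    """Return C(t): 0 for a zero motion vector, else C(t-1) + 1."""
    if abs(u_st) <= eps and abs(v_st) <= eps:
        return 0            # S07: zero vector resets the count
    return prev_count + 1   # S08: non-zero vector extends the run


def build_features(time_averages):
    """Append the running count C(t) to each (X_st, Y_st, u_st, v_st)."""
    count = 0
    features = []
    for x, y, u, v in time_averages:
        count = update_count(count, u, v)
        features.append((x, y, u, v, count))  # S09: five-element feature
    return features


sequence = [
    (1.0, 1.0, 0.0, 0.0),
    (1.0, 1.0, 1.0, 0.0),
    (2.0, 1.0, 1.0, 0.0),
    (2.0, 1.0, 0.0, 0.0),
    (2.0, 1.0, 0.0, 2.0),
]
print([f[4] for f in build_features(sequence)])  # [0, 1, 2, 0, 1]
```

A long uninterrupted run of non-zero vectors produces a large count, which is what lets the classifier flag an object that keeps moving for an unusually long time.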

(3) Identification unit 4
FIG. 7 shows a configuration example of the identification unit 4. The identification unit 4 receives time-series data of N-dimensional identification feature quantities formed from combinations of feature quantity 1 through feature quantity k. The SVM classifier 15 in FIG. 7 statistically classifies the identification feature quantities in the input time-series data as stationary or non-stationary using the SVM (support vector machine) method, and outputs the result to the feature quantity evaluation unit 16. As the SVM algorithm for time-series input, unsupervised online identification with an incremental one-class SVM, as described in Non-Patent Document 5, can be used.

The feature quantity evaluation unit 16 uses the group of support vectors generated during identification by the SVM classifier 15 to decide, for each element of the N-dimensional feature quantity, whether that element is ultimately used for identification. N logical values indicating ON/OFF are fed back to the SVM classifier 15. The support vectors are a subset of the input feature quantities and are the ones located at the decision boundary between stationary and non-stationary. Examining the distributions of the support vectors, the stationary samples, and the non-stationary samples therefore makes it possible to estimate which dimensions of the input feature quantities contribute to separating stationary from non-stationary.

FIG. 8 shows the result of identification using feature quantities obtained from actual images, plotting the distributions of stationary samples, non-stationary samples, and support vectors on two of the feature axes. In FIG. 8(a), the support vectors are concentrated around the edge of the stationary island (regarded as the stationary class), and the non-stationary samples lie outside the stationary distribution, so stationary and non-stationary are separated comparatively well. In FIG. 8(b), by contrast, the support vectors overlap the stationary samples, and the non-stationary samples also overlap the stationary samples to a large extent.

The feature quantity evaluation unit 16 can therefore evaluate how the support vectors are distributed and use the result as an index of whether a feature dimension contributes to stationary/non-stationary identification. For each dimension of the feature quantity, it computes the ratio of the variance of the support vectors to the variance of the stationary samples, or the ratio of the variance of the non-stationary samples to the variance of the stationary samples. A larger ratio is judged to indicate better separation in that feature dimension. N logical values, ON for dimensions with high separation and OFF for dimensions with low separation, are fed back to the SVM classifier 15. The time series of stationary/non-stationary flags is output to the display unit 5 and shown on its screen.
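The per-dimension variance ratio can be sketched as follows. The source does not state how "high" and "low" separation are distinguished, so the fixed `threshold` used here is purely an assumption, as are the function name `dimension_flags`, the use of population variance, and the handling of a zero stationary variance.

```python
from statistics import pvariance


def dimension_flags(stationary, other, threshold=1.0):
    """Return one ON/OFF (True/False) flag per feature dimension.

    stationary: list of N-dimensional tuples classified as stationary.
    other: list of N-dimensional tuples, either the support vectors or
           the samples classified as non-stationary.
    threshold: assumed cut-off on the variance ratio (not from the source).
    """
    n_dims = len(stationary[0])
    flags = []
    for d in range(n_dims):
        var_stationary = pvariance([p[d] for p in stationary])
        var_other = pvariance([p[d] for p in other])
        if var_stationary == 0:
            # Assumption: treat a degenerate stationary spread as maximal
            # separation only if the other set actually varies.
            ratio = float("inf") if var_other > 0 else 0.0
        else:
            ratio = var_other / var_stationary
        flags.append(ratio > threshold)
    return flags


stationary = [(0, 0), (1, 10), (2, 20)]
non_stationary = [(-5, 9), (1, 10), (7, 11)]
print(dimension_flags(stationary, non_stationary))  # [True, False]
```

In this toy data the non-stationary samples spread far beyond the stationary ones along dimension 0 but sit inside the stationary spread along dimension 1, so only dimension 0 is switched ON.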

As described above, in the non-stationary degree estimation apparatus 1 the motion feature extraction unit 3 stably extracts information based on the motion vectors of moving objects in the video as low-dimensional feature quantities, which can be input to the statistical classifier 4, so that non-stationary detection can be performed quickly and accurately. In particular, the motion feature extraction unit 3 adds the continuity of motion to the feature quantities and averages the motion vectors spatially and temporally in a way that reflects the magnitude of the motion of each moving object region and the size of the region itself. This stabilizes the feature quantities and reduces the amount of computation that depends on the number of feature dimensions, which enables fast, low-cost non-stationary detection. In addition, the classifier 4 uses the feature quantity evaluation unit 16 to evaluate the feature quantities from the support vectors and the partial distributions of stationary and non-stationary samples, and selects the feature elements that contribute effectively to identification. This suppresses the influence of noise that does not contribute to stationary/non-stationary identification and also reduces the computational cost of identification.

The present invention is not limited to the embodiment described above, and the apparatus configuration and other details can be modified as appropriate within the scope of the claims. For example, as shown in FIG. 9, the continuous time counting unit 13 of each feature quantity extraction module can be omitted. In this case, S06 to S09 in FIG. 6 are omitted, and the vector integrated by each integrated vector calculation unit 10 (a time series of numerical values corresponding to the time series of images), that is, the time average $(X_{st}(t), Y_{st}(t), u_{st}(t), v_{st}(t))$ in FIG. 9, is used as the feature quantity. Even so, an ordinary moving object can still be suitably detected as non-stationary.

The present invention can also be implemented as a program that causes a computer to function as the non-stationary degree estimation apparatus 1. The program may cause the computer to execute all of the processing of the units 1 to 5, or only part of the processing, such as that of the motion feature extraction unit 3 and the identification unit 4.

The program can be provided to a computer by download from a website or the like. It may also be provided to a computer stored on a recording medium such as a CD-ROM, DVD-ROM, CD-R, CD-RW, DVD-R, DVD-RW, MO, HDD, or Blu-ray Disc (registered trademark). Because the program code read from the recording medium itself implements the processing of the embodiment, the recording medium also constitutes the present invention.

FIG. 1 is a diagram showing a configuration example of the non-stationary degree estimation apparatus according to an embodiment of the present invention.
FIG. 2 is a diagram showing a configuration example of the motion feature extraction unit.
FIG. 3 is a diagram showing a configuration example of each feature quantity extraction module.
FIG. 4 is a diagram showing a configuration example of the integrated vector calculation unit.
FIG. 5: (a) is a schematic diagram showing the change regions at time t and the direction and magnitude of their motion; (b) is a schematic diagram showing the state after the transition to time t+1; (c) is a schematic diagram in which the change regions of (a) to (c) are superimposed; (d) is a schematic diagram showing vector integration over the selected region; (e) and (f) are diagrams showing concrete processing examples.
FIG. 6 is a processing flow diagram of each feature quantity extraction module.
FIG. 7 is a diagram showing a configuration example of the identification unit.
FIG. 8: (a) and (b) are diagrams comparing examples of the bias of the support vectors.
FIG. 9 is a configuration diagram showing another example of each feature quantity extraction module.

Explanation of symbols

1 ... Non-stationary degree estimation apparatus
2 ... Video input unit
3 ... Motion feature extraction unit (motion feature extraction means)
4 ... Identification unit (identification means)
5 ... Display unit
6 ... Feature quantity extraction module group
7 ... Feature quantity selection unit (feature quantity selection means)
8 ... Change region selection unit (change region extraction means)
9 ... Motion vector calculation unit (motion vector calculation means)
10 ... Integrated vector calculation unit (integrated vector calculation means)
11 ... Motion history image generation unit (motion image generation means)
12 ... Mask processing unit (mask processing means)
13 ... Continuous time counting unit (continuous counting means)
14 ... Array storage unit
15 ... SVM classifier (SVM identification means)
16 ... Feature quantity evaluation unit (feature quantity evaluation means)

Claims

A non-stationary degree estimation apparatus that extracts feature quantities based on the movement of moving objects contained in video captured by an imaging device and identifies the non-stationarity of a scene by statistical processing, the apparatus comprising:
motion feature extraction means for extracting feature quantities based on the movement of the moving objects in the input video from the imaging device; and
identification means for identifying the non-stationarity of the input video from the feature quantities extracted by the motion feature extraction means.

The non-stationary degree estimation apparatus according to claim 1, wherein the motion feature extraction means comprises:
a plurality of feature quantity extraction modules that generate time-series feature quantities from the input video; and
feature quantity selection means for selecting from the feature quantities generated by the feature quantity extraction modules based on a feature quantity selection criterion, and outputting a time series of combinations of the selected feature quantities to the identification means.

The non-stationary degree estimation apparatus according to claim 2, wherein each feature quantity extraction module comprises:
change region extraction means for extracting a moving object region in the input video and obtaining a binary image of the moving object region and a background region;
motion vector calculation means for obtaining, for each time-series image in the input video, a motion vector array (x, y, u, v) that pairs the position (x, y) of each pixel with a two-dimensional vector (u, v) indicating the direction of movement of that pixel; and
integrated vector calculation means for integrating the information of the moving object region with the information of the motion vector array and obtaining the integrated vector pair as a feature quantity.

The non-stationary degree estimation apparatus according to claim 3, wherein the integrated vector calculation means comprises:
motion history generation means for generating a motion history image by superimposing the binary images in time series, assigning a numerical value to the moving object region and setting the background region to zero; and
mask processing means for masking the motion vector array with the motion history image.

The non-stationary degree estimation apparatus according to any one of claims 2 to 4, wherein each feature quantity extraction module further comprises continuous time counting means for counting the time during which the integrated vector pair appears continuously, and
a feature quantity obtained by adding the counted time to the integrated vector pair is output to the feature quantity selection means.

The non-stationary degree estimation apparatus according to any one of claims 2 to 5, wherein the identification means comprises:
SVM identification means for classifying an N-dimensional identification feature quantity, which combines the feature quantities of the feature quantity extraction modules, as stationary or non-stationary using an SVM (support vector machine) method; and
feature quantity evaluation means for computing, from the group of support vectors generated by the SVM identification means and the stationary/non-stationary identification results, N logical values indicating for each element of the N-dimensional feature quantity whether that element is ultimately used for identification, and feeding the logical values back to the SVM identification means,
and wherein a time series of stationary/non-stationary flags is displayed as the identification result.

A non-stationary degree estimation method for extracting feature quantities based on the movement of moving objects contained in video captured by an imaging device and identifying the non-stationarity of a scene by statistical processing, the method comprising:
a motion feature extraction step in which motion feature extraction means extracts feature quantities based on the movement of the moving objects in the input video from the imaging device; and
an identification step of identifying the non-stationarity of the input video from the feature quantities extracted in the motion feature extraction step.

The non-stationary degree estimation method according to claim 7, wherein the motion feature extraction step comprises:
a feature quantity generation step in which each of a plurality of feature quantity extraction modules generates time-series feature quantities from the input video; and
a feature quantity selection step in which feature quantity selection means selects from the feature quantities generated in the feature quantity generation step based on a feature quantity selection criterion, and outputs a time series of combinations of the selected feature quantities to the identification means.

The non-stationary degree estimation method according to claim 8, wherein the feature quantity generation step comprises:
a change region extraction step of extracting a moving object region in the input video and obtaining a binary image of the moving object region and a background region;
a motion vector calculation step of obtaining, for each time-series image in the input video, a motion vector array (x, y, u, v) that pairs the position (x, y) of each pixel with a two-dimensional vector (u, v) indicating the direction of movement of that pixel; and
an integrated vector calculation step of integrating the information of the moving object region with the information of the motion vector array and obtaining the integrated vector pair as a feature quantity.

A non-stationary degree estimation program that causes a computer to function as the non-stationary degree estimation device according to claim 1.