JP2012205097A

JP2012205097A - Video processing device

Info

Publication number: JP2012205097A
Application number: JP2011068103A
Authority: JP
Inventors: Hidenori Ujiie; 秀紀氏家
Original assignee: Secom Co Ltd
Current assignee: Secom Co Ltd
Priority date: 2011-03-25
Filing date: 2011-03-25
Publication date: 2012-10-22
Anticipated expiration: 2031-03-25
Also published as: JP5738028B2

Abstract

PROBLEM TO BE SOLVED: To accurately assign to each frame of a monitoring video a degree to which the frame should be watched.
SOLUTION: An object tracking unit 33 tracks the moving objects appearing in the frames of a monitoring video 22 and detects the object image of each moving object in its tracking frames. An object feature extraction unit 34 extracts a feature amount from each object image. For the purpose of monitoring the singularity of moving objects, an importance calculation unit 35 statistically analyzes, for each moving object, the feature amounts obtained from its tracking frames to calculate an individual representative amount representing those feature amounts, and calculates for each tracking frame an importance corresponding to the distance between the moving object's feature amount in that frame and its individual representative amount.

Description

The present invention relates to a video processing apparatus that assigns to each frame of a monitoring video a degree to which the frame should be watched.

Conventionally, suspicious behavior of a person has been detected in monitoring video by applying a comparison reference common to all persons.

The intruding object monitoring device described in Patent Document 1 detects an intruding object in a monitoring video, calculates its amount of movement, and records the input images in which the amount of movement exceeds a predetermined amount as a digest video.

The abnormal behavior detection device described in Patent Document 2 creates a comparison reference by learning the normal behavior patterns of a plurality of moving objects.

Patent Document 1: JP 2004-254141 A
Patent Document 2: JP 2010-72782 A

However, in the related art, a suspicious movement made temporarily by a specific person sometimes could not be determined to be an important scene when it resembled the movement of the majority of persons. For example, when a specific person who had been jogging while most people were walking temporarily slowed down, that scene could not be determined to be an important scene.

In such a case, it is desirable to treat the temporary deceleration of the specific person as a suspicious movement, that is, as an important scene to be watched most closely, while also determining that the person's running, a movement different from that of the majority, is an important scene to be watched with the next-highest priority.

The present invention has been made in view of the above problems, and its object is to provide a video processing apparatus that can suitably identify the important scenes to be watched in monitoring. A further object is to make it possible, on the basis of that identification, to create a summary video shorter than the original monitoring video.

A video processing apparatus according to the present invention generates information on the frames to be watched in a monitoring video composed of a plurality of frames, and comprises: a storage unit that stores the monitoring video; an object tracking unit that tracks the object images of moving objects in the monitoring video by image analysis and generates correspondence information between the tracking frames in which each moving object was tracked and the object images of that moving object; an object feature extraction unit that extracts, from the object images, feature amounts of the motion and/or appearance of the moving object; and a monitoring value calculation unit that statistically analyzes the feature amounts for each moving object to obtain an individual representative amount representing the distribution of the feature amounts, calculates a first distance, which is the distance between the individual representative amount of each moving object and the feature amount of that moving object in each tracking frame, and calculates a higher monitoring value for a tracking frame with a larger first distance.

In another video processing apparatus according to the present invention, the monitoring value calculation unit statistically analyzes the feature amounts of the plurality of moving objects tracked by the object tracking unit to obtain a group representative amount representing those feature amounts, calculates a second distance, which is the distance between the group representative amount and the individual representative amount of each moving object, and calculates the monitoring value according to the first distance of the moving object in each tracking frame and the second distance of the moving object tracked in that tracking frame.

Still another video processing apparatus according to the present invention further comprises a summary video creation unit that thins out the monitoring video by preferentially selecting tracking frames with higher monitoring values, thereby creating a summary video whose set number of frames is smaller than the total number of frames of the monitoring video.

In another video processing apparatus according to the present invention, the summary video creation unit selects at least one, as determined in advance, of the start frame and the end frame of each moving object's series of tracking frames, regardless of the monitoring values of those frames.

In still another video processing apparatus according to the present invention, the summary video creation unit uses dynamic programming to maximize the sum of the monitoring values of the tracking frames constituting the summary video.
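The passage above does not give the dynamic programming formulation. As a rough illustration only, the sketch below assumes that exactly `k` frames are chosen in time order, that at most `max_skip` frames may be skipped between two consecutive choices (the figure list mentions a maximum skip count), and that the objective is the plain sum of per-frame importance. The function name and all parameters are hypothetical.

```python
from typing import List, Optional


def select_summary_frames(importance: List[float], k: int,
                          max_skip: int) -> Optional[List[int]]:
    """Choose k frame indices (ascending) that maximize total importance.

    Assumption: at most max_skip frames lie between consecutive choices.
    Returns None when the arguments are out of range.
    """
    n = len(importance)
    if k < 1 or k > n or max_skip < 0:
        return None
    neg_inf = float("-inf")
    # best[j][i]: best total when the j-th choice (0-based) is frame i.
    best = [[neg_inf] * n for _ in range(k)]
    back = [[-1] * n for _ in range(k)]
    for i in range(n):
        best[0][i] = importance[i]
    for j in range(1, k):
        for i in range(j, n):
            # The previous choice h must satisfy i - h - 1 <= max_skip.
            for h in range(max(j - 1, i - max_skip - 1), i):
                cand = best[j - 1][h] + importance[i]
                if cand > best[j][i]:
                    best[j][i] = cand
                    back[j][i] = h
    # Trace back from the best final choice.
    end = max(range(n), key=lambda i: best[k - 1][i])
    picks = [end]
    for j in range(k - 1, 0, -1):
        picks.append(back[j][picks[-1]])
    return picks[::-1]
```

Without the skip limit the problem reduces to taking the `k` most important frames; the limit is what makes a dynamic program useful, because it forces the summary to keep some temporal continuity.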

According to the present invention, the important scenes to be watched in monitoring can be suitably identified. The present invention also makes it possible to preferentially select important scenes from a monitoring video and create a summary video shorter than the original monitoring video.

FIG. 1 is a block diagram of a video processing apparatus according to an embodiment of the present invention.
FIG. 2 is a schematic diagram showing the structure of the object information on moving objects tracked in the monitoring video.
FIG. 3 is a table showing the structure of the object feature amount.
FIG. 4 is a schematic diagram showing an example of the process of calculating the in-object deviation degree.
FIG. 5 is an explanatory diagram showing an example of the in-object deviation degree, the inter-object deviation degree, and the importance.
FIG. 6 is a schematic diagram showing the slope constraint when the maximum skip count is set to two frames.
FIG. 7 is a graph showing an example of the lattice points and local paths that dynamic programming can select.
FIG. 8 is a schematic flowchart explaining the operation of the video summarizing device.
FIG. 9 is a schematic flowchart of the processing by the importance calculation unit.

Hereinafter, an embodiment of the present invention will be described with reference to the drawings. The present embodiment is a video processing apparatus with which the important scenes of a monitoring video, obtained by imaging a monitoring space over a long time, can be checked efficiently. For example, the monitoring video captures a monitoring space such as a store, in which persons move about as monitoring targets. So that the long monitoring video can be reviewed quickly, the video processing apparatus 1 plays back important scenes at the imaging speed or at low speed and plays back everything else at high speed. In particular, the video processing apparatus 1 detects, as important scenes, scenes showing suspicious human behavior that deviates from normal behavior, and makes them available for quick review.

FIG. 1 is a block diagram of the video processing apparatus 1 according to the embodiment. The video processing apparatus 1 includes a video recording device 2, a video summarizing device 3, and a display device 4. The video recording device 2 and the display device 4 are connected to the video summarizing device 3.

The video recording device 2 includes an imaging unit 20 and a recording unit 21.

The imaging unit 20 is a so-called surveillance camera and is connected to the recording unit 21. The imaging unit 20 images the monitoring space at a predetermined time interval and sequentially outputs the captured images to the recording unit 21. The imaging time interval is, for example, 1/5 second. Hereinafter, each image captured at this interval is called a frame, and the interval is called the frame interval.

The recording unit 21 is, for example, a digital video recorder (DVR) equipped with a large-capacity recording medium such as a hard disk drive (HDD). The recording unit 21 receives from the imaging unit 20 time-series images captured over a long period of days, weeks, or months, and can record them on the large-capacity recording medium as the monitoring video 22. The monitoring video 22 is data in which a plurality of frames are arranged in imaging order. The recording unit 21 is also connected to the video summarizing device 3 and outputs the monitoring video 22 to it.

The video summarizing device 3 includes a setting input unit 30, a storage unit 31, and a signal processing unit 32; the setting input unit 30 and the storage unit 31 are connected to the signal processing unit 32. The video summarizing device 3 is connected to the recording unit 21 and the display device 4. It detects important scenes in the monitoring video 22, creates a summary video by thinning out the frames outside the important scenes, and outputs the created summary video to the display device 4. The video summarizing device 3 can be implemented using, for example, a personal computer (PC).

The setting input unit 30 is a user interface device such as a keyboard or mouse. It is operated by a user, such as a store manager or a guard who watches the summary video to check for suspicious behavior in the monitoring space, and is used to enter various settings. These settings include an arbitrary section of the monitoring video 22 that the user designates as the source of the summary video, and the number of frames of the summary video designated by the user (the designated number of frames). Hereinafter, the source video designated as the target of the summarization is simply called the monitoring video 22. The designated number of frames is at least smaller than the total number of frames of the monitoring video 22. Instead of having the user enter the designated number of frames directly, the playback time of the summary video may be entered from the setting input unit 30, and the signal processing unit 32 may convert it to a number of frames by dividing the playback time by the frame interval.
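The conversion from playback time to a designated number of frames can be sketched as follows. The function name is hypothetical, and rounding to the nearest integer is an assumption; the text only says that the playback time is divided by the frame interval.

```python
def frames_from_playback_time(playback_seconds: float,
                              frame_interval_seconds: float,
                              total_frames: int) -> int:
    """Convert a requested playback time into a designated frame count."""
    # Rounding is an assumption; it also absorbs floating-point error.
    frames = round(playback_seconds / frame_interval_seconds)
    # The designated number of frames must be smaller than the total.
    if not 0 < frames < total_frames:
        raise ValueError("designated frame count must be in 1..total_frames-1")
    return frames
```

With the 1/5 second frame interval given as an example in the text, a one-minute summary corresponds to 300 frames.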

The storage unit 31 is a storage device such as a ROM (Read Only Memory) or RAM (Random Access Memory). The storage unit 31 stores various programs and data and exchanges this information with the signal processing unit 32. The data include object information 310 and importance data 311.

The object information 310 associates the region information of each object image in the monitoring video 22, and the feature amount relating to at least one of motion and appearance extracted from each object image, with the identifier of the moving object and the frame number. The frame number is a number assigned in imaging order to each frame constituting the monitoring video 22. FIG. 2 is a schematic diagram showing the structure of the object information 310. The “object ID” is the identifier of a moving object, a number that uniquely identifies each moving object. The “number of tracking frames” is the total number of frames (tracking frames) in which the object image of each moving object was detected and tracked. The “tracking frame number” is the frame number of a tracking frame. The span from the first to the last tracking frame number associated with a moving object represents the appearance section of that moving object in the monitoring video 22. The “object feature amount” is either or both of the motion feature amount and the appearance feature amount extracted from the object image of each moving object in its tracking frames; the feature amount of moving object u in tracking frame t is written f_{t,u}. The “object region” is the region in which the object image of each moving object was detected in its tracking frames, and the object image can be identified from that region. The object region of moving object u in tracking frame t is written I_{t,u}.
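One way to hold the fields of the object information 310 in memory is sketched below. The class and attribute names are hypothetical, and representing the object region as a bounding box `(x, y, width, height)` is an assumption; the text only says that the region identifies the object image.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TrackedObject:
    """Hypothetical record for one moving object u."""
    object_id: int  # "object ID": unique per moving object
    # Both maps are keyed by tracking frame number t.
    features: Dict[int, List[float]] = field(default_factory=dict)  # f_{t,u}
    # Object region I_{t,u}; a bounding box is an assumption.
    regions: Dict[int, Tuple[int, int, int, int]] = field(default_factory=dict)

    def add_observation(self, frame_no: int, feature: List[float],
                        region: Tuple[int, int, int, int]) -> None:
        self.features[frame_no] = feature
        self.regions[frame_no] = region

    @property
    def tracking_frame_numbers(self) -> List[int]:
        return sorted(self.features)

    @property
    def tracking_frame_count(self) -> int:
        return len(self.features)

    @property
    def appearance_section(self) -> Tuple[int, int]:
        """First and last tracking frame numbers of this object."""
        frames = self.tracking_frame_numbers
        return frames[0], frames[-1]
```

Deriving the number of tracking frames and the appearance section from the stored frame numbers keeps them consistent with the per-frame data.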

FIG. 3 is a table showing the structure of the object feature amount. The object feature amount of the present embodiment is a 27-dimensional vector formed by concatenating the vectors of the appearance feature amounts and the motion feature amounts. The “whole-body edge feature”, “upper-half edge feature”, and “lower-half edge feature” are appearance feature amounts whose extraction ranges are, respectively, the whole object image, the upper half, and the lower half obtained by bisecting the object image in the Y direction. Each is a four-dimensional vector obtained by quantizing the edges extracted from all pixels in the extraction range into four directions and taking a histogram of edge strength for each edge direction.
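A four-direction edge histogram of this kind could be computed roughly as below. The text does not name the edge operator or the bin boundaries, so the central-difference gradient, the bins centred on 0, 45, 90, and 135 degrees, and the function name are all assumptions. The direction used is that of the intensity gradient, which is perpendicular to the visible edge.

```python
import math
from typing import List


def edge_direction_histogram(gray: List[List[float]]) -> List[float]:
    """Sum gradient magnitude into 4 direction bins over a grayscale patch."""
    hist = [0.0, 0.0, 0.0, 0.0]
    height, width = len(gray), len(gray[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences (assumed edge operator).
            gx = gray[y][x + 1] - gray[y][x - 1]
            gy = gray[y + 1][x] - gray[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            # Fold the direction into [0, 180) and quantize to 4 bins.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            bin_index = int((angle + 22.5) // 45) % 4
            hist[bin_index] += magnitude
    return hist
```

Calling the function on the whole object region, its upper half, and its lower half would give the three four-dimensional edge features described above.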

The “whole-body color feature”, “upper-half color feature”, and “lower-half color feature” are likewise appearance feature amounts whose extraction ranges are the whole object image, its upper half, and its lower half, respectively. The present embodiment adopts the Luv color system, and each color feature is a three-dimensional vector consisting of the L component average, U component average, and V component average calculated from all pixels in its extraction range.

The “object size” is also an appearance feature amount; it is a two-dimensional vector consisting of the width and height of the circumscribed rectangle of the object image.

The “object position” is a motion feature amount; it is a two-dimensional vector consisting of the X and Y coordinates of the centroid of the object image. The “object velocity” is also a motion feature amount; it is a two-dimensional vector consisting of the change in the X coordinate and the change in the Y coordinate of the object position between consecutive tracking frames.
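The dimensions listed above add up to 27, which the following sketch makes explicit. The dimensions are taken from the text, and the description later states that components 10 and 11 are the object velocity and components 12 to 27 are the size and the half-body edge and color features. The exact order inside components 1 to 9 and 12 to 27 is an assumption, since FIG. 3 is not reproduced here, and all names are hypothetical.

```python
from typing import Dict, List, Tuple

# (name, dimension) pairs; the order within each range is an assumption.
FEATURE_LAYOUT: List[Tuple[str, int]] = [
    ("whole_body_edge", 4),
    ("whole_body_color", 3),
    ("object_position", 2),
    ("object_velocity", 2),   # components 10-11 (1-based)
    ("object_size", 2),       # components 12-27 start here
    ("upper_half_edge", 4),
    ("lower_half_edge", 4),
    ("upper_half_color", 3),
    ("lower_half_color", 3),
]


def feature_slices() -> Dict[str, slice]:
    """Map each feature name to its 0-based slice of the 27-dim vector."""
    slices, start = {}, 0
    for name, dim in FEATURE_LAYOUT:
        slices[name] = slice(start, start + dim)
        start += dim
    return slices


def concatenate(parts: Dict[str, List[float]]) -> List[float]:
    """Build the full vector from named parts, checking each dimension."""
    vector: List[float] = []
    for name, dim in FEATURE_LAYOUT:
        if len(parts[name]) != dim:
            raise ValueError(f"{name} must have {dim} components")
        vector.extend(parts[name])
    return vector
```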

The object feature amount f_{t,u} of each moving object u in tracking frame t can simply be the feature amount of that single frame, but it can also be a moving average. That is, f_{t,u} can be a 27-dimensional vector obtained by averaging, component by component, the feature amounts of the (2p + 1) frames consisting of tracking frame t and the p frames before and after it. A natural number smaller than 1/2 of the shortest appearance section length, estimated in advance, is set for p. With a moving average, smoothing reduces the influence of region extraction errors, so false detections of suspicious behavior decrease.
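A minimal sketch of this moving average follows. The function name is hypothetical, `track[i]` is taken to be the single-frame feature vector at the i-th tracking frame of one object, and clipping the window at the ends of the appearance section is an assumption, since the text does not say how the boundaries are handled.

```python
from typing import List, Sequence


def moving_average_feature(track: Sequence[Sequence[float]],
                           t: int, p: int) -> List[float]:
    """Average each component over frame t and up to p frames on each side."""
    # Clip the window to the appearance section (assumption).
    lo, hi = max(0, t - p), min(len(track), t + p + 1)
    window = track[lo:hi]
    dim = len(track[t])
    return [sum(f[d] for f in window) / len(window) for d in range(dim)]
```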

The object feature amount f_{t,u} can also be the distribution over a small section of several frames that includes tracking frame t. Specifically, the (2p + 1) frames centered on the frame of interest t are set as a small section, and the distribution of the feature amounts in that small section is modeled as a normal distribution. The object feature amount f_{t,u} then consists of the 27-dimensional vector obtained by averaging the feature amounts of the small section component by component, and the 27 × 27 covariance matrix computed using that mean vector. Here too, p is set to a natural number smaller than 1/2 of the shortest appearance section length estimated in advance. Using the small-section distribution not only reduces false detections of suspicious behavior by mitigating region extraction errors, but also lets f_{t,u} express momentary changes in the appearance feature amounts, so more suspicious behavior can be detected. In the present embodiment, this small-section distribution is extracted as the object feature amount f_{t,u} and used to calculate the importance of each tracking frame, described later.
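The small-section distribution can be sketched as a mean vector and covariance matrix computed over the window. The function names are hypothetical; dividing by the number of frames rather than by one less, and clipping the window at the ends of the track, are assumptions.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def mean_and_covariance(
        vectors: Sequence[Sequence[float]]) -> Tuple[List[float], Matrix]:
    """Sample mean and covariance (normalized by the sample count)."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dim)]
    cov = [[sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in vectors) / n
            for j in range(dim)] for i in range(dim)]
    return mean, cov


def small_section_distribution(track: Sequence[Sequence[float]],
                               t: int, p: int) -> Tuple[List[float], Matrix]:
    """Normal-distribution parameters of the (2p + 1)-frame section at t."""
    lo, hi = max(0, t - p), min(len(track), t + p + 1)
    return mean_and_covariance(track[lo:hi])
```

Note that a covariance matrix estimated from only 2p + 1 frames has rank at most 2p, so for a feature vector of higher dimension it is singular; any distance that needs its inverse or determinant would require some form of regularization, which this passage does not discuss.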

The importance data 311 is information that associates each tracking frame with its importance, which is the value of that tracking frame for monitoring (its monitoring value). For the frames of the monitoring video 22 other than tracking frames, the importance data 311 may explicitly include information associating those frames with the minimum importance (for example, 0).

The signal processing unit 32 is built around an arithmetic device such as a CPU (Central Processing Unit), DSP (Digital Signal Processor), or MCU (Micro Control Unit). It reads from the storage unit 31 and executes a program describing the operations of an object tracking unit 33, an object feature extraction unit 34, an importance calculation unit 35, a summary video creation unit 36, and so on, and thereby operates as each of these units. The signal processing unit 32 creates a summary video from the monitoring video 22 input from the recording unit 21 and outputs the created summary video to the display device 4.

The object tracking unit 33 compares successive frames to track, for each moving object, the object image of the moving object captured in the monitoring video 22, and stores the tracking result in the object information 310 of the storage unit 31. The tracking result in the object information 310 is referred to by the object feature extraction unit 34. Tracking is performed by a known method: for example, object regions are extracted from each frame by background subtraction, and among the object regions extracted in successive frames, those belonging to the same object are associated by template matching or a particle filter. The tracking result consists of the object ID, the number of tracking frames, the tracking frame numbers, the object regions, and, among the object feature amounts, the object position.

The object feature extraction unit 34 extracts the object feature amounts by referring to the tracking result and the monitoring video 22. Specifically, it extracts the whole-body edge feature, upper-half edge feature, lower-half edge feature, whole-body color feature, upper-half color feature, lower-half color feature, object velocity, and object size.

The importance calculation unit 35 generates the importance data 311 from the monitoring video 22 and stores it in the storage unit 31. The importance calculation unit 35 is a monitoring value calculation unit that calculates the monitoring value of each tracking frame in accordance with the purpose of this apparatus, in which a user monitors peculiarities in the movement and appearance of moving objects, such as suspicious behavior in the monitoring space.

Specifically, the importance calculation unit 35 statistically analyzes the object feature amounts for each moving object to obtain an individual representative amount representing the object feature amounts of that moving object, calculates the distance between the individual representative amount and the object feature amount in each tracking frame of the moving object (the in-object deviation degree), and calculates a higher importance for a tracking frame with a larger distance. The importance calculation unit 35 further statistically analyzes the object feature amounts of a plurality of moving objects to obtain a group representative amount representing them, calculates the distance between the group representative amount and the individual representative amount of each moving object (the inter-object deviation degree), and calculates a higher importance for the tracking frames of a moving object with a larger distance. It then stores the calculated importance in the storage unit 31 in association with the tracking frame number. The individual representative amount and the group representative amount include, for example, the mean vector of the group of feature amounts being analyzed. A feature vector corresponding to the median or the mode can also be used as a representative amount. The individual and group representative amounts can also include information on variance, and a distance definition that takes the variance into account can be adopted for the multidimensional feature space on which the deviation degrees and importance are based.

The importance of each frame calculated in this way is higher for frames in which the movement or appearance of a monitored moving object shows a peculiar change or is itself peculiar, and it can be used as a measure of the degree to which the frame should be watched in monitoring.
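The two-level scheme described above can be illustrated with a deliberately simplified sketch. It uses mean vectors as the representative amounts and Euclidean distance, and it combines the two deviation degrees by a weighted sum; the weighted sum is a placeholder assumption, since the combination rule is not given in this passage. All names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    n, dim = len(vectors), len(vectors[0])
    return [sum(v[d] for v in vectors) / n for d in range(dim)]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def importance_scores(tracks: Dict[int, List[List[float]]],
                      weight: float = 1.0) -> Dict[int, List[float]]:
    """tracks maps object ID -> per-frame feature vectors of that object."""
    # Individual representative amount: mean over each object's own frames.
    individual = {u: mean_vector(fs) for u, fs in tracks.items()}
    # Group representative amount: mean over all frames of all objects.
    group = mean_vector([f for fs in tracks.values() for f in fs])
    scores = {}
    for u, fs in tracks.items():
        inter = euclidean(individual[u], group)  # same for every frame of u
        # In-object deviation per frame plus weighted inter-object deviation
        # (the weighted sum is an assumed combination rule).
        scores[u] = [euclidean(f, individual[u]) + weight * inter for f in fs]
    return scores
```

In a toy run with one-dimensional features, `importance_scores({1: [[1.0], [1.0], [1.0]], 2: [[3.0], [3.0], [9.0]]})` gives object 1 a constant score of 2.0 and object 2 scores of 4.0, 4.0, and 6.0, so the frame where object 2 departs from its own usual behavior ranks highest.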

The importance calculation unit 35 includes an in-object deviation degree calculation unit 350 that calculates the in-object deviation degree, an inter-object deviation degree calculation unit 351 that calculates the inter-object deviation degree, and a deviation degree combining unit 352 that combines the in-object and inter-object deviation degrees to calculate the importance. Each of these units is described below.

The in-object deviation degree calculation unit 350 statistically analyzes, for each tracked object u, the object feature amounts f_{t,u} in its appearance section to calculate the individual representative amount A_u, and calculates, as the in-object deviation degree, the distance a_{t,u} of the object feature amount f_{t,u} of object u in each frame t from the individual representative amount A_u of that object. The 27-dimensional object feature amount can be used as is to calculate the in-object deviation degree, but the present embodiment uses the tenth to twenty-seventh components shown in FIG. 3. The twelfth to twenty-seventh components, namely the object size, upper-half edge feature, lower-half edge feature, upper-half color feature, and lower-half color feature, change with human actions that involve a change of posture, such as raising a hand, crouching from a standing position, or standing up from a crouch, and with human actions that involve an article, such as leaving behind or carrying away luggage or a cart. They are therefore suitable as feature amounts for detecting suspicious human behavior frame by frame. The tenth and eleventh components, the object velocity, change with actions of the moving object such as pausing, accelerating, and decelerating, and are therefore also suitable as feature amounts for detecting suspicious behavior of a moving object frame by frame.

The individual representative amount A_u of object u can be a distribution function approximating the distribution of the object feature amounts f_{t,u} in the appearance section of object u. Specifically, a normal distribution function is used. The mean vector μ_u and covariance matrix Σ_u of the object feature amounts f_{t,u} in the appearance section of object u are calculated by the following equations (1) and (2) and applied to the normal distribution function to obtain the individual representative amount A_u = N(μ_u, Σ_u). In equations (1) and (2), T_u is the number of tracking frames of object u, and t is a relative frame number that takes the value 1 at the start frame and T_u at the end frame of the appearance section of object u.
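Equations (1) and (2) are not reproduced in this text. Given the definitions above, they are presumably the usual sample mean and sample covariance over the appearance section. The normalization by 1/T_u rather than 1/(T_u - 1) is an assumption, and f_{t,u} is read here as the feature vector of object u at relative frame t:

```latex
\mu_u = \frac{1}{T_u} \sum_{t=1}^{T_u} f_{t,u} \tag{1}

\Sigma_u = \frac{1}{T_u} \sum_{t=1}^{T_u}
           \left( f_{t,u} - \mu_u \right) \left( f_{t,u} - \mu_u \right)^{\top} \tag{2}
```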

In this embodiment, in which the object feature amount f_{t,u} is the distribution over a small section consisting of frame t and the p frames before and after it, the in-object deviation a_{t,u} can be calculated by equation (3) as the Bhattacharyya distance between the individual representative amount A_u and the object feature amount f_{t,u}.

In equation (3), μ_{t,u} is the mean of the feature amounts of object u in the small section corresponding to frame t (hereinafter, small section t), and Σ_{t,u} is the covariance of the feature amounts of object u in small section t.
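The image for equation (3) is not reproduced in this text. The standard closed form of the Bhattacharyya distance between the two normal distributions N(μ_u, Σ_u) and N(μ_{t,u}, Σ_{t,u}) is shown below as a likely reading; the auxiliary symbol \bar{\Sigma}_{t,u} is introduced here for brevity and does not appear in the source.

```latex
a_{t,u} = \frac{1}{8}\,(\mu_{t,u} - \mu_u)^{\top}\,
          \bar{\Sigma}_{t,u}^{-1}\,(\mu_{t,u} - \mu_u)
        + \frac{1}{2}\,\ln
          \frac{\left|\bar{\Sigma}_{t,u}\right|}
               {\sqrt{\left|\Sigma_{t,u}\right|\,\left|\Sigma_u\right|}},
\qquad
\bar{\Sigma}_{t,u} = \frac{\Sigma_{t,u} + \Sigma_u}{2} \qquad (3)
```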

Any of various known distance measures between distributions, such as the Kullback-Leibler distance or the Hellinger distance, can be used in place of the Bhattacharyya distance. In another embodiment, in which the object feature amount f_{t,u} is a moving average or the feature amount of a single frame, the in-object deviation a_{t,u} can be calculated by equation (4) as the Mahalanobis distance between the individual representative amount A_u and the object feature amount f_{t,u}. The distance of equation (4) is useful when the speed of the importance calculation is to be prioritized.
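The image for equation (4) is likewise not reproduced. The usual definition of the Mahalanobis distance of a vector f_{t,u} from N(μ_u, Σ_u) is given below as a likely reading; whether the source takes the square root or uses the squared form is not stated, so the square root here is an assumption.

```latex
a_{t,u} = \sqrt{\left(f_{t,u} - \mu_u\right)^{\top}\,
                \Sigma_u^{-1}\,
                \left(f_{t,u} - \mu_u\right)} \qquad (4)
```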

FIG. 4 is a schematic diagram showing an example of the in-object deviation calculation; it illustrates the calculation of the in-object deviation a_{58,1} of object #1 in the 58th frame. The feature space is simplified to two dimensions. In the figure, each white circle is the feature amount of one frame, the black triangle is the mean vector μ_1 of the individual representative amount A_1 of object #1, and the dotted line marks the 3-sigma range around μ_1 given by the covariance matrix Σ_1 of A_1. The white arrow corresponds to the in-object deviation a_{58,1}. The white triangle is the mean vector μ_{58,1} of the feature amounts of object #1 in the 58th small section with p = 1, and the dash-dot line marks the 3-sigma range around μ_{58,1} given by the covariance matrix Σ_{58,1} of the feature amounts of object #1 in that small section.

FIG. 4(a) is an example in which the object feature amount f_{t,u} is the feature amount of one frame. The Mahalanobis distance between N(μ_1, Σ_1) and the white circle (in practice normalized by the covariance matrix Σ_1) is the in-object deviation a_{58,1}.

FIG. 4(b) is an example in which the object feature amount f_{t,u} is a moving average, so that f_{58,1} = μ_{58,1}. The Mahalanobis distance between N(μ_1, Σ_1) and μ_{58,1} (in practice normalized by the covariance matrix Σ_1) is the in-object deviation a_{58,1}.

FIG. 4(c) is an example in which the object feature amount f_{t,u} is the distribution over a small section, so that f_{58,1} = N(μ_{58,1}, Σ_{58,1}). The Bhattacharyya distance between N(μ_1, Σ_1) and N(μ_{58,1}, Σ_{58,1}) (in practice normalized by the covariance matrices Σ_1 and Σ_{58,1}) is the in-object deviation a_{58,1}.

In yet another embodiment that gives still more priority to processing speed, the normalization by variance can be omitted. In this case, the individual representative amount A_u of object u is the mean of the object feature amounts f_{t,u} over its appearance section, and the Euclidean distance between each object feature amount f_{t,u} of the object and that mean is calculated as the in-object deviation a_{t,u}. As noted above, the mode or the median can be used instead of the mean.
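The unnormalized variant described above can be sketched in a few lines. This is only an illustration under assumptions: the function name and the 2-D toy features are hypothetical, and the source does not prescribe any particular implementation.

```python
import math
from typing import List, Sequence


def in_object_deviation_euclid(features: Sequence[Sequence[float]]) -> List[float]:
    """Per-frame Euclidean distance of one object's feature vector from its own mean.

    `features` holds one feature vector per tracking frame of a single object
    (hypothetical layout; the source only describes the quantities).
    """
    n = len(features)
    dim = len(features[0])
    # Individual representative amount A_u: mean over the appearance section.
    mean = [sum(f[d] for f in features) / n for d in range(dim)]
    # In-object deviation a_{t,u}: distance of each frame's feature from A_u.
    return [math.dist(f, mean) for f in features]


# Toy 2-D features of one object over three tracking frames (made-up values).
deviations = in_object_deviation_euclid([[0.0, 0.0], [2.0, 0.0], [4.0, 0.0]])
print(deviations)  # [2.0, 0.0, 2.0]
```

Replacing the mean with a median or mode, as the text allows, would only change the line that computes `mean`.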

In still another embodiment, the object feature amount f_{t,u} is the variance of the feature amounts in the small section and the individual representative amount A_u is the variance of the feature amounts over the appearance section, and the distance between these variances is calculated as the in-object deviation a_{t,u}.

Because the in-object deviation calculated in this way is referenced to a representative feature amount calculated for each object, a high in-object deviation is calculated for frames in which the object's motion or appearance changed temporarily, and a low in-object deviation for frames showing the motion or appearance that the object exhibited most of the time. For example, in a monitoring video 22 in which most people are walking and one particular person who had been jogging temporarily slows down, a high in-object deviation is calculated for the frames in which that person slowed down and a low in-object deviation for the frames in which that person was jogging.

The inter-object deviation calculation unit 351 statistically analyzes the object feature amounts f_{t,u} of the plurality of moving objects tracked in the monitoring video 22 to calculate a collective representative amount B, and calculates the distance b_u of each object's individual representative amount A_u from the collective representative amount B as the inter-object deviation. The 27-dimensional object feature amount could be used as is for this calculation, but this embodiment uses the 1st to 11th components shown in FIG. 3. The whole-body edge feature and whole-body color feature, which analyze edge and color features globally, are relatively insensitive to fine frame-to-frame changes while making differences between persons, such as unusual clothing, easy to compare. They are therefore suitable as source data for the inter-object deviation. The object position likewise makes differences between persons easy to compare, such as entering or leaving from an unusual position or moving into an unusual area, and is therefore also suitable as source data for the inter-object deviation.

The collective representative amount B can be a distribution function that approximates the distribution of the object feature amounts f_{t,u} of all tracked objects. Specifically, a normal distribution function is used. The mean vector μ and covariance matrix Σ of the object feature amounts f_{t,u} of all objects are calculated and applied to the normal distribution function to define the collective representative amount as B = N(μ, Σ). The individual representative amount A_u used for the inter-object deviation is calculated in the same way as in the in-object deviation calculation unit 350, except that the calculation covers the 1st to 11th components.

The inter-object deviation b_u of object u can be calculated as the Bhattacharyya distance between the collective representative amount B and the individual representative amount A_u of the object. Various known distance measures between distributions, such as the Kullback-Leibler distance or the Hellinger distance, can also be used in place of the Bhattacharyya distance.

In another embodiment that prioritizes the speed of the importance calculation, the individual representative amount A_u is represented by the mean vector μ_u, and the inter-object deviation b_u of object u is calculated as the Mahalanobis distance between the collective representative amount B and the individual representative amount A_u of the object.

In yet another embodiment that gives still more priority to processing speed, the normalization by variance can be omitted. In this case, the collective representative amount B is the mean vector μ of all object feature amounts f_{t,u}, the individual representative amount A_u of object u is the mean vector μ_u of the object feature amounts f_{t,u} over its appearance section, and the Euclidean distance between μ and μ_u is calculated as the inter-object deviation b_u of object u. The mode or the median can be used instead of the mean.
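A minimal sketch of this unnormalized inter-object variant follows. The function names, the dictionary layout keyed by object label, and the toy features are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def inter_object_deviation_euclid(
    tracks: Dict[str, Sequence[Sequence[float]]]
) -> Dict[str, float]:
    """Euclidean distance of each object's mean feature from the mean over all objects."""
    # Collective representative amount B: mean over every feature of every object.
    all_features = [f for feats in tracks.values() for f in feats]
    collective_mean = mean_vector(all_features)
    # Inter-object deviation b_u: distance between mu_u and the collective mean.
    return {u: math.dist(mean_vector(feats), collective_mean)
            for u, feats in tracks.items()}


# Made-up 2-D features: object "#1" is tracked for 3 frames, object "#2" for 1.
tracks = {"#1": [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]], "#2": [[4.0, 0.0]]}
b = inter_object_deviation_euclid(tracks)
print(b)  # {'#1': 1.0, '#2': 3.0}
```

Because the collective mean is taken over all feature vectors, objects tracked for more frames pull it toward themselves; this follows the wording "all object feature amounts" in the text.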

In still another embodiment, the collective representative amount B is the variance Σ and the individual representative amount A_u is the variance Σ_u, and the distance between these variances is calculated as the inter-object deviation b_u.

Because the inter-object deviation calculated in this way is referenced to a collective representative amount calculated from the whole of the monitoring video 22, a high inter-object deviation is calculated for an object whose motion or appearance differs from that of most objects, and a low inter-object deviation for an object whose motion and appearance resemble those of most objects. In the above example, in which most people are walking and one particular person who had been jogging temporarily slows down, a high inter-object deviation is calculated for the jogging person and a low inter-object deviation for the walking majority.

The deviation combining unit 352 calculates the importance of each frame according to the in-object deviation and the inter-object deviation. Specifically, for each frame t, it calculates the importance c_t of the frame as a weighted sum of the in-object deviation a_{t,u} and the inter-object deviation b_u based on equation (5). The summation in equation (5) adds up the deviations of all moving objects u belonging to the set U(t) of moving objects in frame t.

Here λ is a parameter that adjusts the weighting between the in-object deviation and the inter-object deviation. It is set in the range 0 ≤ λ ≤ 1, for example to 0.5.
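The image for equation (5) is not reproduced in this text. The form below is reconstructed from the description of the weighted sum over U(t) and agrees term by term with the worked value c_58 = 5.5 given for the 58th frame in the FIG. 5 example; it should still be read as a reconstruction rather than the published equation.

```latex
c_t = \sum_{u \in U(t)} \left\{ (1 - \lambda)\, a_{t,u} + \lambda\, b_u \right\} \qquad (5)
```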

In another embodiment, the deviation combining unit 352 may calculate, for each frame t, the maximum of the in-object deviations a_{t,u} and inter-object deviations b_u of all moving objects in the frame as the importance c_t of the frame. The importance c_t in this case is given by equation (6). Unlike equation (5), the importance of equation (6) is not affected by the number of moving objects in the frame, so even in scenes with few people, the portions where a tracked object's per-frame deviation is high are selected preferentially.
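The image for equation (6) is not reproduced either. A form consistent with the description, and with the worked value c_58 = max{5.6, 2.0, 1.2, 2.2} = 5.6 in the FIG. 5 example, would be the following reconstruction.

```latex
c_t = \max_{u \in U(t)} \, \max\left( a_{t,u},\; b_u \right) \qquad (6)
```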

FIG. 5 is an explanatory diagram showing an example of the in-object deviation, the inter-object deviation, and the importance. It corresponds to the example shown in FIG. 2, in which tracking frames exist for three objects #1 to #3 in the monitoring video 22. Data 601 is an example of the in-object deviations calculated for objects #1 to #3. The in-object deviation of object #1 is calculated for the 56th to 61st frames, its appearance section, and its values are 1.0, 1.5, 5.6, 2.0, 1.8, and 1.0. The in-object deviation of object #2 is calculated for the 58th to 65th frames, its appearance section, and its values are 1.2, 3.4, 3.0, 4.5, 5.2, 2.6, 2.0, and 1.7. The in-object deviation of object #3 is calculated for the 112th to 116th frames, its appearance section, and its values are 1.1, 1.5, 1.7, 1.5, and 1.2. No in-object deviation is calculated for the 66th to 111th frames, in which no object was detected.

Data 602 is an example of the inter-object deviations calculated for objects #1 to #3: 2.0 for object #1, 2.2 for object #2, and 4.9 for object #3.

Data 603 is an example of the importance obtained by combining the in-object deviations of data 601 and the inter-object deviations of data 602 by equation (5). The importance values for the 56th to 65th frames are 1.5, 1.75, 5.5, 4.8, 4.5, 4.85, 3.7, 2.4, 2.1, and 1.95, and those for the 112th to 116th frames are 3.0, 3.2, 3.3, 3.2, and 3.05.

For the 58th frame, for example, a_{58,1} = 5.6, b_1 = 2.0, a_{58,2} = 1.2, and b_2 = 2.2. Applying these to equation (5) with λ = 0.5 gives c_58 = {(1 - 0.5) × 5.6 + 0.5 × 2.0 + (1 - 0.5) × 1.2 + 0.5 × 2.2} = 5.5. If equation (6) is used instead of equation (5), c_58 = max{5.6, 2.0, 1.2, 2.2} = 5.6.
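The values of data 603 can be recomputed from data 601 and 602 with a short script. The sketch below assumes the weighted-sum reading of equation (5) and the maximum reading of equation (6) described in the text; the variable and function names are illustrative only.

```python
LAMBDA = 0.5  # weight between in-object and inter-object deviation

# Data 601: in-object deviation a[u][t], keyed by object number and frame number.
a = {
    1: dict(zip(range(56, 62), [1.0, 1.5, 5.6, 2.0, 1.8, 1.0])),
    2: dict(zip(range(58, 66), [1.2, 3.4, 3.0, 4.5, 5.2, 2.6, 2.0, 1.7])),
    3: dict(zip(range(112, 117), [1.1, 1.5, 1.7, 1.5, 1.2])),
}
# Data 602: inter-object deviation b[u].
b = {1: 2.0, 2: 2.2, 3: 4.9}


def importance_sum(t: int, lam: float = LAMBDA) -> float:
    """Weighted sum over all objects present in frame t (reading of equation (5))."""
    return sum((1 - lam) * a[u][t] + lam * b[u] for u in a if t in a[u])


def importance_max(t: int) -> float:
    """Largest deviation among the objects present in frame t (reading of equation (6))."""
    return max(max(a[u][t], b[u]) for u in a if t in a[u])


frames = sorted({t for per_object in a.values() for t in per_object})
c_sum = {t: round(importance_sum(t), 2) for t in frames}
print(c_sum[58], importance_max(58))  # 5.5 5.6
```

Running it over all tracking frames gives the same fifteen importance values as listed for data 603.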

In the example shown in FIG. 5, the inter-object deviations of objects #1 and #2 are on the low side, so the whole appearance sections of these objects cannot be judged to be unusual scenes. However, the in-object deviation of object #1 is high in the 58th frame, and that of object #2 is high in the 59th to 62nd frames. This shows that unusual scenes can be identified for some of the frames making up each object's appearance section. In other words, for objects #1 and #2 the in-object deviation is effective for identifying important scenes.

By contrast, the in-object deviation of object #3 is on the low side in every one of the 112th to 116th frames, but its inter-object deviation is high. This shows that, when compared against the appearance sections of the other objects, the whole appearance section of object #3 can be judged to be an unusual scene. In other words, for object #3 the inter-object deviation is effective for identifying important scenes.

In the importance obtained by combining the in-object and inter-object deviations, relatively high values are calculated for the 58th frame, where object #1 is unusual, for the 59th to 62nd frames, where object #2 is unusual, and for the 112th to 116th frames, where object #3 is unusual. Combining the global viewpoint of the inter-object deviation with the local viewpoint of the in-object deviation thus makes it possible to identify important scenes with few omissions.

Data 603, which is generated by the deviation combining unit 352 and associates tracking frame numbers with importance values, is stored in the storage unit 31 as importance data 311.

The summary video creation unit 36 refers to the importance data 311 and thins out the monitoring video 22 by selecting a designated number of frames from it, giving priority to frames of high importance. It thereby creates a summary video whose frame count is set smaller than the total frame count of the monitoring video, and outputs the summary video to the display device 4. Because the summary video is output sequentially at the same frame interval as at capture, important scenes with high importance are played back at the capture frame interval, while the other scenes, having been thinned out, are played back at high speed.

Specifically, the summary video creation unit 36 uses dynamic programming (DP) to select from the monitoring video 22, with skip counts within a predetermined range, the frames that maximize the total importance. It excludes the frames of the monitoring video 22 in which no object is detected and applies dynamic programming only to the portion consisting of tracking frames. For example, if the 56th to 116th frames illustrated in FIG. 5 are the monitoring video to be summarized, dynamic programming is applied to the 15 frames that remain after excluding, from these 61 frames, the 66th to 111th frames in which no object is detected.

In this application, the minimum skip count can be set to 0 and the maximum skip count to (total number of tracking frames ÷ designated number of frames × α - 1). Here α is an adjustment value, with α > 1, that keeps every selection from being made at the maximum skip count; for example, α = 2. In the above example with 15 tracking frames in total and a designated number of 10 frames, the maximum skip count is 15 ÷ 10 × 2 - 1 = 2 frames.

The selectable range of skip counts can be specified as a slope constraint in dynamic programming. FIG. 6 is a schematic diagram of the slope constraint when the maximum skip count is set to 2 frames as above; the horizontal axis corresponds to the frame number of the summary video and the vertical axis to the frame number of the monitoring video. In FIG. 6, the white circles are lattice points of summary-video frame and monitoring-video frame, and each solid line connecting two lattice points represents a local path. Corresponding to the selectable skip counts 0, 1, and 2, local paths are set along which the frame number of the monitoring video 22 increases by 1, 2, or 3 when the frame number of the summary video increases by 1.

The importance of the frame of the monitoring video 22 corresponding to a lattice point connected by a local path is the cost of selecting that local path. The summary video creation unit 36 accumulates the costs of the local paths selected by dynamic programming. The cost accumulated along a path connecting lattice points for the designated number of frames is the total importance, and the summary video creation unit 36 finds, among the multiple paths each formed by combining local paths, the path that maximizes this accumulated cost.

If moving objects appear and disappear at arbitrary positions in the image in the summary video, the observer following them by eye is put under considerable stress. The start frame and end frame of each object's appearance section are therefore forcibly selected, regardless of their importance, when creating the summary video. With dynamic programming, this control can be realized by prohibiting the selection of any local path that straddles the start frame or end frame of an object's appearance section. In the summary video, each moving object then appears at its actual entry position and disappears at its actual exit position, which reduces the stress on the observer.
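The skip range and the forced selection described above can be sketched as a small dynamic program. This is a sketch under stated assumptions, not the source's implementation: the function names are hypothetical, the maximum skip count is rounded down, the first and last tracking frames are treated as forced (each is the start or end of some appearance section), and the importance values in the example are made up.

```python
from typing import Iterable, List, Optional, Sequence


def max_skip(total_tracked: int, target: int, alpha: float = 2.0) -> int:
    """Maximum skip count: total / target * alpha - 1 (rounding down is an assumption)."""
    return int(total_tracked / target * alpha - 1)


def select_frames(importance: Sequence[float], forced: Iterable[int],
                  target: int, skip_max: int) -> Optional[List[int]]:
    """Pick `target` indices maximizing total importance.

    Consecutive picks are at most skip_max + 1 apart, and no forced index
    (start or end of an appearance section) may be jumped over.
    """
    n = len(importance)
    forced_set = set(forced) | {0, n - 1}
    neg = float("-inf")
    best = [[neg] * n for _ in range(target + 1)]  # best[j][i]: j picks, last one at i
    prev = [[-1] * n for _ in range(target + 1)]   # back-pointers for the path
    best[1][0] = importance[0]
    for j in range(2, target + 1):
        for i in range(1, n):
            # Candidate predecessors p = i-1, ..., i-(skip_max+1), nearest first.
            for p in range(i - 1, max(i - skip_max - 2, -1), -1):
                cand = best[j - 1][p] + importance[i]
                if cand > best[j][i]:
                    best[j][i], prev[j][i] = cand, p
                if p in forced_set:
                    break  # an earlier predecessor would jump over forced frame p
    if best[target][n - 1] == neg:
        return None  # no path satisfies the constraints
    path, i = [], n - 1
    for j in range(target, 0, -1):
        path.append(i)
        i = prev[j][i]
    return path[::-1]


# Made-up importance values for six tracking frames; pick four of them.
print(max_skip(15, 10))                                      # 2
print(select_frames([1, 5, 1, 1, 9, 1], {0, 5}, 4, 1))       # [0, 2, 4, 5]
```

The `break` after a forced predecessor is what implements the prohibition on local paths that straddle a start or end frame: once a forced frame has been considered as predecessor, no earlier frame may be used.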

FIG. 7 is a graph showing the lattice points and local paths that dynamic programming can select under the above example of the monitoring video 22 and the above dynamic programming settings. The vertical axis is the tracking frame number in the monitoring video 22 and the horizontal axis is the frame number of the summary video to be created. White and black circles are selectable lattice points; the black circles are lattice points that are forcibly selected. The lines connecting circles are the local paths selectable under the slope constraint of FIG. 6 and the forced-selection constraint. With the further condition that the cost be maximized, the path drawn with a thick line in FIG. 7 is selected in this example, and the 56th, 58th, 59th, 60th, 61st, 63rd, 65th, 112th, 114th, and 116th frames of the monitoring video 22 are selected as the 1st to 10th frames of the summary video, respectively.

Dynamic programming yields a summary video that is continuous within the maximum frame interval, and it selects not only the frames of high importance but also a suitable number of "bridging" frames between them. The result is a summary video that reduces the stress on an observer following the moving objects that appear one after another.

The display device 4 is a liquid crystal display, a CRT (cathode ray tube) display, or the like that displays the summary video input from the signal processing unit 32. By watching the summary video played back on the display device 4, the user can review the monitoring video 22 quickly.

Next, the operation of the video processing device 1 is described, focusing on the operation of the video summarizing device 3. The imaging unit 20 images the monitored space at a preset frame interval and outputs the monitoring video 22 to the recording unit 21, which records it. FIG. 8 is a schematic flowchart of the operation of the video summarizing device 3. Using the setting input unit 30, the user designates the monitoring video 22 to be summarized and enters the designated number of frames (S1).

The signal processing unit 32 reads the monitoring video 22 designated in step S1 from the recording unit 21 (S2). The object tracking unit 33 of the signal processing unit 32 processes the read monitoring video 22, tracks each object captured in it individually, and stores the tracking results in the object information 310 of the storage unit 31 (S3). The object feature extraction unit 34 of the signal processing unit 32 extracts the feature amount of each object from each of its tracking frames and stores the extracted feature amounts in the object information 310 of the storage unit 31 (S4). The importance calculation unit 35 of the signal processing unit 32 then refers to the object information 310 and calculates, for each tracking frame, an importance representing how unusual the behavior of the objects captured in that frame is (S5).

FIG. 9 is a schematic flowchart of the processing by the importance calculation unit 35. The in-object deviation calculation unit 350 and the inter-object deviation calculation unit 351 calculate the individual representative amount of each object by statistically analyzing the object feature amounts of all of its tracking frames, and store it in the storage unit 31 (S50). For each tracking frame of each object, the in-object deviation calculation unit 350 calculates the distance between the object feature amount of the object in that tracking frame and the individual representative amount of the object as the in-object deviation, and stores it in the storage unit 31 (S51). The inter-object deviation calculation unit 351 calculates the collective representative amount by statistically analyzing all the object feature amounts extracted in step S4 of FIG. 8 (S52). It then calculates the distance between the individual representative amount of each object calculated in step S50 and the collective representative amount as the inter-object deviation of that object, and stores it in the storage unit 31 (S53).

The deviation combining unit 352 combines, for each tracking frame, the in-object deviation calculated in step S51 and the inter-object deviation calculated in step S53 to calculate the importance. The importance calculation unit 35 associates the calculated importance with the tracking frame number and stores it in the storage unit 31 as importance data 311. At this point the importance calculation unit 35 may also add to the importance data 311 entries that associate the frame numbers of frames other than tracking frames with the minimum importance value 0.

When the processing by the importance calculation unit 35 shown in FIG. 9 is completed, the processing returns to FIG. 8, and the summary video creation unit 36 of the signal processing unit 32 uses dynamic programming to create a summary video with the designated number of frames that maximizes the importance (S6). Specifically, the summary video creation unit 36 first calculates the total number of tracking frames, obtains the maximum number of skips from the total number of tracking frames and the designated number of frames, and sets the slope restriction for the dynamic programming. Next, the summary video creation unit 36 creates a frame number sequence of the tracking frames based on the importance data 311 and, using dynamic programming, forms subsequences containing the designated number of frames from the frames in that sequence within the range allowed by the slope restriction, and selects the subsequence for which the sum of the importance of its constituent tracking frames is largest. In forming the subsequences, however, the start frame and end frame of each object's appearance section are forcibly included. The selected subsequence, which maximizes the sum of importance, expresses the frames constituting the summary video as frame numbers of the monitoring video 22. The summary video creation unit 36 creates the summary video by extracting the frames indicated by the selected subsequence from the monitoring video 22 and arranging them in chronological order, and stores the summary video in the storage unit 31.
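The selection in step S6 can be sketched as a small dynamic program. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function name `select_summary_frames` is hypothetical, the slope restriction is modeled as a hypothetical `max_step` parameter (the largest allowed index gap between consecutive selected frames), and how the maximum number of skips is derived from the frame counts is left to the caller because this passage does not give the formula.

```python
import math


def select_summary_frames(importance, num_select, max_step, forced=()):
    """Pick num_select indices that maximize the total importance.

    importance: importance of each tracking frame, in time order.
    max_step:   largest allowed index gap between consecutive picks
                (stands in for the slope restriction; assumed parameter).
    forced:     indices that must be picked, e.g. the start and end
                frames of each object's appearance section.
    """
    if num_select < 1:
        raise ValueError("num_select must be at least 1")
    n = len(importance)
    forced_set = set(forced)
    # forced_before[i] = number of forced indices smaller than i
    forced_before = [0] * (n + 1)
    for i in range(n):
        forced_before[i + 1] = forced_before[i] + (i in forced_set)

    def skips_forced(j, i):
        # True if a forced index lies strictly between picks j and i.
        return forced_before[i] - forced_before[j + 1] > 0

    neg = -math.inf
    # best[k][i]: best total when the (k+1)-th pick is index i
    best = [[neg] * n for _ in range(num_select)]
    prev = [[-1] * n for _ in range(num_select)]
    for i in range(n):
        if forced_before[i] == 0:  # no forced index was skipped before i
            best[0][i] = importance[i]
    for k in range(1, num_select):
        for i in range(k, n):
            for j in range(max(k - 1, i - max_step), i):
                if best[k - 1][j] == neg or skips_forced(j, i):
                    continue
                total = best[k - 1][j] + importance[i]
                if total > best[k][i]:
                    best[k][i], prev[k][i] = total, j

    # The last pick must not leave any forced index after it.
    last = num_select - 1
    candidates = [i for i in range(n)
                  if forced_before[n] == forced_before[i + 1] and best[last][i] > neg]
    if not candidates:
        raise ValueError("no selection satisfies the constraints")
    i = max(candidates, key=lambda c: best[last][c])
    picks = []
    for k in range(last, -1, -1):  # walk the back-pointers to recover the picks
        picks.append(i)
        i = prev[k][i]
    return picks[::-1]


scores = [1, 5, 1, 1, 9, 1, 1]
loose = select_summary_frames(scores, 3, max_step=6, forced=(0, 6))
tight = select_summary_frames(scores, 3, max_step=3, forced=(0, 6))
```

The returned values are indices into the tracking-frame sequence; mapping them through the frame number sequence gives the frame numbers of the monitoring video. With the loose restriction the high-importance frame at index 4 is chosen, whereas the tight restriction forces the middle pick to index 3 so that neither gap exceeds three frames.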

The summary video creation unit 36 plays back the summary video by sequentially outputting the summary video stored in the storage unit 31 to the display device 4 at the frame interval used at the time of imaging (S7).

In the above-described embodiment, the importance calculation unit 35 determines the importance from the in-object deviation and the inter-object deviation, but the importance may be calculated from the in-object deviation alone. Alternatively, the user may be allowed to select either the importance obtained by combining the in-object deviation and the inter-object deviation or the importance calculated from the in-object deviation alone, and the importance may be calculated by the selected method.

The summary video creation unit 36 may instead create the summary video as follows: it accumulates the importance sequentially from the first frame of the monitoring video 22 and selects the frame at which the cumulative value reaches a threshold T2 as a frame of the summary video; once a frame is selected, it either subtracts the threshold T2 from the cumulative value or resets the cumulative value to 0, then resumes the same accumulation to select the next frame, repeating this until the end frame of the monitoring video 22. The threshold T2 can be set to (the sum of the importance over all frames of the monitoring video 22 ÷ the designated number of frames). This processing method requires less computation than the method using dynamic programming.
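The accumulation procedure can be sketched in a few lines of Python. The function name `select_by_accumulation` and the `reset_to_zero` flag are hypothetical names chosen for the example; the threshold follows the setting given above (total importance divided by the designated number of frames).

```python
def select_by_accumulation(importance, num_select, reset_to_zero=False):
    """Select frames whenever the running importance reaches threshold T2."""
    total = sum(importance)
    if num_select <= 0 or total <= 0:
        return []
    threshold = total / num_select  # T2 = total importance / designated frames
    selected = []
    accumulated = 0.0
    for frame, value in enumerate(importance):
        accumulated += value
        if accumulated >= threshold:
            selected.append(frame)
            # Either discard the excess or carry it over to the next frame.
            accumulated = 0.0 if reset_to_zero else accumulated - threshold
    return selected


scores = [0, 2, 0, 1, 1, 0, 4, 0]
carried = select_by_accumulation(scores, 4)
reset = select_by_accumulation(scores, 4, reset_to_zero=True)
```

In this example the subtracting variant carries the excess of frame 6 forward and so reaches the designated four frames, while the resetting variant discards that excess and returns only three. A single pass over the frames suffices, which is consistent with the reduced computation noted above.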

Alternatively, only one of the start frame and the end frame of each object's appearance section may be forcibly included in the summary video.

The summary video creation unit 36 may also create the summary video by selecting as many frames as the designated number of frames in descending order of importance and arranging the selected frames in frame number order. Alternatively, a summary video can be created by applying a uniform threshold T3 to the importance of each frame and arranging the frames whose importance exceeds the threshold T3. In this case, the threshold T3 can be increased or decreased step by step so that the summary video has the designated number of frames. In either of these configurations, no "connecting" frames enter the summary video, so the density of important scenes in the summary video is correspondingly higher. In this respect, these configurations can be said to suit experienced users.
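Both alternatives can be illustrated with a short Python sketch. The function names, the starting threshold of 0, and the fixed `step` are assumptions for the example; the text only states that T3 is adjusted step by step, and the sketch shows the increasing direction only.

```python
def select_top_n(importance, num_select):
    """Take the num_select most important frames, returned in frame order."""
    ranked = sorted(range(len(importance)), key=lambda f: importance[f], reverse=True)
    return sorted(ranked[:num_select])


def select_by_threshold(importance, num_select, step):
    """Raise threshold T3 from 0 in fixed steps until at most num_select frames exceed it."""
    threshold = 0.0
    selected = [f for f, v in enumerate(importance) if v > threshold]
    while len(selected) > num_select:
        threshold += step
        selected = [f for f, v in enumerate(importance) if v > threshold]
    return threshold, selected


scores = [0.1, 0.9, 0.3, 0.7, 0.2]
top = select_top_n(scores, 2)
t3, above = select_by_threshold(scores, 2, step=0.25)
```

With a coarse step or tied importance values, the threshold variant can return fewer frames than designated, so a finer step may be needed in practice.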

The video processing apparatus 1 can also replace the monitoring video stored in the recording unit 21 with the summary video created from the monitoring video 22. Because frames with high monitoring value are retained and frames with low monitoring value are deleted, the data amount in the recording unit 21 can be reduced while maintaining the monitoring value of the monitoring video 22.

In the above embodiment, the importance calculation unit 35 calculates the individual feature amount by statistically analyzing the entire appearance section of each moving object, but it may instead statistically analyze only part of the appearance section. For example, only the section in which the object was tracked within an area set in advance in the monitoring video 22 can be made the target of the statistical analysis.

In the above embodiment, the importance calculation unit 35 calculates the group feature amount by statistically analyzing all moving objects, but it may instead statistically analyze only some of the moving objects. For example, the imaging time of the monitoring video 22 can be divided into time zones, and only the moving objects tracked in each time zone can be made the target of the statistical analysis for that time zone.
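A per-time-zone analysis of this kind might look like the following Python sketch. It assumes, purely for illustration, that each moving object is summarized by a single scalar value and that the group feature amount is the mean of those values; the function name and the `band_hours` parameter are likewise hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def group_features_by_time_zone(objects, band_hours=6):
    """Average per-object values separately for each time zone.

    objects: iterable of (hour_of_day, value) pairs, one per moving object.
    Returns a dict mapping a time-zone index to the mean value in that zone.
    """
    zones = defaultdict(list)
    for hour, value in objects:
        zones[hour // band_hours].append(value)
    return {zone: fmean(values) for zone, values in zones.items()}


per_zone = group_features_by_time_zone([(1, 1.0), (2, 3.0), (13, 10.0)])
```

Here the two objects tracked in the early-morning zone are averaged together, and the object tracked in the afternoon is analyzed on its own.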

DESCRIPTION OF SYMBOLS: 1 video processing device, 2 video recording device, 3 video summarization device, 4 display device, 20 imaging unit, 21 recording unit, 30 setting input unit, 31 storage unit, 32 signal processing unit, 33 object tracking unit, 34 object feature extraction unit, 35 importance calculation unit, 36 summary video creation unit, 310 object information, 311 importance data, 350 in-object deviation calculation unit, 351 inter-object deviation calculation unit, 352 deviation combining unit.

Claims

A video processing apparatus that generates information on frames to be watched in a monitoring video composed of a plurality of frames, comprising:
a storage unit that stores the monitoring video;
an object tracking unit that tracks an object image of a moving object in the monitoring video by image analysis and generates correspondence information between the tracking frames in which the moving object is tracked and the object images of the moving object;
an object feature extraction unit that extracts a feature amount of the movement and/or appearance of the moving object from the object image; and
a monitoring value calculation unit that statistically analyzes the feature amounts for each moving object to obtain an individual representative amount representing the distribution of the feature amounts, calculates a first distance that is the distance between the individual representative amount of each moving object and the feature amount for each tracking frame of that moving object, and calculates a higher monitoring value for a tracking frame having a larger first distance.

The video processing apparatus according to claim 1, wherein
the monitoring value calculation unit statistically analyzes the feature amounts of the plurality of moving objects tracked by the object tracking unit to obtain a group representative amount representing those feature amounts, calculates a second distance that is the distance between the group representative amount and the individual representative amount of each moving object, and calculates the monitoring value for each tracking frame according to the first distance of the moving object in that tracking frame and the second distance of the moving object tracked in that tracking frame.

The video processing apparatus according to claim 1, further comprising:
a summary video creation unit that creates a summary video having a set number of frames smaller than the total number of frames of the monitoring video by thinning out the monitoring video so that tracking frames having higher monitoring values are preferentially selected.

The video processing apparatus according to claim 3, wherein
the summary video creation unit selects at least one of the start frame and the end frame in the series of tracking frames of each moving object regardless of its monitoring value.

The video processing apparatus according to claim 3 or 4, wherein
the summary video creation unit uses dynamic programming to maximize the sum of the monitoring values of the tracking frames constituting the summary video.