JP2002063577A

JP2002063577A - System and method for analyzing image, and image analysis program recording medium

Info

Publication number: JP2002063577A
Application number: JP2000246385A
Authority: JP
Inventors: Kazuhiro Otsuka; 和弘大塚; Ryuji Yamamoto; 隆二山本; Masashi Morimoto; 正志森本; Haruhiko Kojima; 治彦児島
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2000-08-15
Filing date: 2000-08-15
Publication date: 2002-02-28

Abstract

PROBLEM TO BE SOLVED: To enable stable object tracking and the output of information on the merging and separation of multiple objects by introducing diversity, such as one-to-many and many-to-one, into the correspondence between objects across image frames. SOLUTION: The position and size of each object are obtained from the image region it occupies in each image frame of the video. For the objects detected in two adjacent frames, the transition cost of an object in the earlier frame moving to an object in the later frame is calculated for the pairs of objects on the two frames. The correspondence between the objects on the two frames is classified from the transition cost, and the state of each object is determined. Using the correspondences calculated for all adjacent frame pairs within a section of the video composed of multiple image frames, the movement trajectory of each individual object is traced over those frames.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to an image analysis system that tracks many objects appearing in video. In particular, it relates to image analysis processing techniques that use video obtained by a video camera for remote monitoring of traffic flow and human behavior, for indexing the actions of people in video when building video archives, and for analyzing the movement of players and team strategy from sports video.

[0002]

[Description of the Related Art] Image processing techniques already exist that take as input video of moving objects such as people or vehicles, detect the positions of the objects in the video, and track each of them over a plurality of frames, thereby recording the behavior of the moving objects.

[0003] A typical technique registers the region, shape, color information, and the like of the target object as a template when tracking starts, performs template matching in subsequent image frames, and takes the best-matching position as the position of the object. The moving object is tracked by repeating this operation over a plurality of frames.

[0004] The technique described in Japanese Patent No. 2697072, "Object tracking device for moving images", determines the correspondence by estimating the position that an object in one image frame will occupy in the next image frame. The "automatic tracking device" of Japanese Patent No. 3026850 computes color histograms of the object and of its frame, determines a representative feature by comparing them, and uses that feature as a clue to search for the position of the object in the next frame.

[0005] These methods can deliver good tracking performance when the object exists alone, its shape and color change little, and its motion is simple and easy to predict.

[0006]

[Problems to Be Solved by the Invention] However, when multiple people are densely packed or in contact with one another, as in a soccer match, regions arise in the video where multiple players overlap. One cause is that people who are apart in three-dimensional space appear to overlap once they are projected onto the two-dimensional image plane. Another is that multiple people are actually in contact in three-dimensional space, for example when a player marks an opponent.

[0007] With conventional methods, template matching can no longer be performed stably and accurately on such video once people overlap, and the tracking information that is output becomes extremely inaccurate. To cope with this problem, conventional methods either terminate tracking when a condition such as overlapping people is detected, or switch to a state that waits for manual correction. Conventional methods therefore cannot track stably in video where people crowd together, nor can they recognize states such as people coming together or separating.

[0008] The object of the present invention is therefore to provide a technique that introduces diversity, such as one-to-many and many-to-one, into the correspondence between objects across image frames and determines the states of movement, merging, splitting, and disappearance of objects in the image, thereby enabling stable object tracking and the output of information on the merging and separation of multiple objects.

[0009]

[Means for Solving the Problems] The present invention is a multi-object tracking image analysis system that takes as input video of one or more objects, tracks the objects over the plurality of image frames constituting the video, and outputs information including the image position of the objects in each image frame. The system is characterized by comprising: means for extracting the image region occupied by objects in each image frame constituting the video; means for obtaining the center coordinates and size of each object from the image region it occupies; means for calculating, for the objects detected in two adjacent frames, the transition cost of an object in the earlier frame moving to an object in the later frame, for the pairs of objects on the two frames; means for classifying the correspondence between the objects on the two frames from the transition cost and determining states such as the appearance and disappearance of objects and the merging and splitting of objects; and means for tracking the movement trajectory of each individual object over a plurality of image frames, using the correspondences between objects calculated for all adjacent frame pairs within a section of the video composed of a plurality of image frames, and outputting, in addition to the position and size of the object in each image frame, information on its appearance and disappearance and on the merging and splitting of objects.

[0010] According to an embodiment of the present invention, the means for extracting the image region occupied by objects in an image frame comprises: means for computing a histogram for each color component of the pixel values constituting the image; means for computing the range of color components of the background region of the objects by finding the color component value at which the histogram is largest and the color component values of the valleys adjacent to the mountain whose peak is that maximum; means for computing, for a neighborhood centered on any pixel in the image, the proportion of pixel values in that neighborhood that fall within the color component range of the background region; means for computing the ratio between the variance of the color components within the neighborhood and the variance of the color components of the entire background region; and means for judging that the pixel in question belongs to an object region when the computed variance ratio is at or below a fixed value and the proportion of neighborhood pixel values within the color component range of the background region is at or above a fixed value, and for extracting the object region, which is the set of such pixels, as a binary image.

[0011] According to an embodiment of the present invention, the means for obtaining the center coordinates and size of an object from the image region it occupies comprises: means for applying a distance transform to the image that holds the region occupied by objects as a binary image, thereby computing a distance image; means for detecting the coordinates at which the distance value has a local peak in the distance image; means for computing a fitted circumscribed rectangle for the object region centered on each such peak coordinate; and means for outputting the center of the circumscribed rectangle as the center coordinates of the object, and computing and outputting the width and height of the circumscribed rectangle as the size of the object.
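The distance-image and peak-detection steps of paragraph [0011] can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions: the function names `distance_transform` and `local_peaks`, the city-block metric, and the 8-neighbor peak test are choices made for this example and are not specified in the text. The rectangle-fitting step is not shown.

```python
def distance_transform(mask):
    """City-block distance from each foreground pixel (1) to the nearest
    background pixel (0), computed with the classic two-pass scan.
    The metric is an assumption; the text only says 'distance image'."""
    h, w = len(mask), len(mask[0])
    inf = h + w  # larger than any possible city-block distance in the image
    d = [[inf if mask[y][x] else 0 for x in range(w)] for y in range(h)]
    # Forward pass: propagate distances from the top and left neighbors.
    for y in range(h):
        for x in range(w):
            if d[y][x]:
                if y > 0:
                    d[y][x] = min(d[y][x], d[y - 1][x] + 1)
                if x > 0:
                    d[y][x] = min(d[y][x], d[y][x - 1] + 1)
    # Backward pass: propagate distances from the bottom and right neighbors.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if d[y][x]:
                if y < h - 1:
                    d[y][x] = min(d[y][x], d[y + 1][x] + 1)
                if x < w - 1:
                    d[y][x] = min(d[y][x], d[y][x + 1] + 1)
    return d


def local_peaks(dist):
    """Return (x, y) of foreground pixels whose distance value is not
    exceeded by any of their 8 neighbors. A flat plateau yields several
    peaks; merging them is left out of this sketch."""
    h, w = len(dist), len(dist[0])
    peaks = []
    for y in range(h):
        for x in range(w):
            v = dist[y][x]
            if v == 0:
                continue
            neighbors = [dist[ny][nx]
                         for ny in range(max(0, y - 1), min(h, y + 2))
                         for nx in range(max(0, x - 1), min(w, x + 2))
                         if (ny, nx) != (y, x)]
            if all(v >= n for n in neighbors):
                peaks.append((x, y))
    return peaks
```

For a solid 3x3 blob in a 5x5 image, the distance image has a single peak at the blob center, which would then serve as the seed for fitting the circumscribed rectangle.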

[0012] According to an embodiment of the present invention, the means for calculating the transition cost of an object in the earlier frame moving to an object in the later frame, for the objects detected in the two adjacent frames, comprises two parts. The first is means for calculating a transition cost relating to the magnitude of the object's movement: the distance between the objects on the two frames is computed separately in the horizontal and vertical directions, after correcting for the shift of the whole image between the frames caused by movement of the camera viewpoint, and the horizontal distance is normalized by the width of the object and the vertical distance by the height of the object. The second is means for calculating a transition cost relating to changes in the shape and color of the object: the spatial distribution of pixel values of the object on one frame is superimposed on the object on the other frame, and the value at which the error between the pixel value distributions is smallest is computed.
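The movement-related cost in paragraph [0012] can be sketched in Python as follows. This is a minimal sketch, not the method of the patent: the dictionary keys `x`, `y`, `w`, `h`, the use of the earlier-frame object's width and height for normalization, and the combination of the two normalized components into a single Euclidean value are all assumptions made for illustration. The text specifies only the camera-shift correction and the per-axis normalization.

```python
import math


def position_cost(prev, curr, camera_shift=(0.0, 0.0)):
    """Movement cost between an object in the earlier frame (prev) and
    one in the later frame (curr).

    prev, curr: dicts with center 'x', 'y' and size 'w', 'h' (hypothetical
    layout). camera_shift: global (dx, dy) image shift between the frames,
    assumed to be estimated elsewhere.
    """
    # Remove the apparent motion caused by the camera itself.
    dx = (curr["x"] - prev["x"]) - camera_shift[0]
    dy = (curr["y"] - prev["y"]) - camera_shift[1]
    # Normalize horizontally by width and vertically by height
    # (using the earlier-frame object's size is an assumption).
    nx = dx / prev["w"]
    ny = dy / prev["h"]
    # Combining the two components with a Euclidean norm is an assumption.
    return math.hypot(nx, ny)
```

Normalizing by object size makes the cost roughly scale-independent: a displacement of one body width costs the same for a player near the camera as for one far away.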

[0013] According to an embodiment of the present invention, the means for classifying the correspondence between the objects on the two frames using the transition cost, and for determining states such as the appearance and disappearance of objects and the merging and splitting of objects, comprises the following. First, means for judging, for the individual objects on the two frames, that a correspondence exists between any pair of objects whose transition cost is at or below a fixed value. Second, means for classifying the correspondence between the objects on the two frames into five states: a single object in the earlier frame corresponds to a single object in the later frame; multiple objects in the earlier frame correspond to a single object in the later frame; a single object in the earlier frame corresponds to multiple objects in the later frame; an object exists in the earlier frame but has no corresponding object in the later frame; and an object exists in the later frame but has no corresponding object in the earlier frame. These five states are judged to be, respectively, a simple movement, a merging of objects, a splitting of an object, the disappearance of an object, and the appearance of an object.
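The thresholding and five-way classification of paragraph [0013] can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name `classify_links`, the matrix layout (`cost[i][j]` is the cost from earlier-frame object `i` to later-frame object `j`), and the tie-breaking rule for a link that is part of both a merge and a split are hypothetical and not taken from the text.

```python
def classify_links(cost, n_curr, threshold):
    """Classify correspondences between two adjacent frames.

    cost: list of rows, one per earlier-frame object, each with n_curr
    entries. Returns (states, disappeared, appeared), where states maps
    each accepted link (i, j) to 'move', 'merge', or 'split'.
    """
    n_prev = len(cost)
    # A correspondence exists wherever the transition cost is small enough.
    links = [(i, j) for i in range(n_prev) for j in range(n_curr)
             if cost[i][j] <= threshold]
    out_deg = [0] * n_prev   # links leaving each earlier-frame object
    in_deg = [0] * n_curr    # links entering each later-frame object
    for i, j in links:
        out_deg[i] += 1
        in_deg[j] += 1
    states = {}
    for i, j in links:
        if in_deg[j] > 1:        # many-to-one
            states[(i, j)] = "merge"
        elif out_deg[i] > 1:     # one-to-many
            states[(i, j)] = "split"
        else:                    # one-to-one
            states[(i, j)] = "move"
    # Checking merge before split is an arbitrary choice of this sketch.
    disappeared = [i for i in range(n_prev) if out_deg[i] == 0]
    appeared = [j for j in range(n_curr) if in_deg[j] == 0]
    return states, disappeared, appeared
```

With four objects in each frame, where objects 0 and 1 both link to later object 0, object 2 links to later objects 1 and 2, and object 3 links to nothing, the function reports two merge links, two split links, one disappearance (earlier object 3), and one appearance (later object 3).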

[0014] According to an embodiment of the present invention, the means for tracking the movement trajectory of each individual object over a plurality of image frames, from the correspondences between objects calculated for all adjacent frame pairs within a section of the video composed of a plurality of image frames, and for outputting it together with information on appearance, disappearance, merging, and splitting, comprises: means for taking as input the data on the existence of objects over a plurality of frames and on the correspondence of objects between frames, and storing it as a graph-like data structure in which the objects detected in the individual image frames are the vertices and the correspondences between objects in adjacent frames are the edges; means for searching for an object in the state immediately after appearance and, with it as the starting point, searching the graph until the state immediately before disappearance or a merged state is detected, thereby obtaining the movement trajectory of the object; and means for searching for an object in the state immediately after a split and, with it as the starting point, searching the graph until the state immediately before disappearance or a merged state is detected, thereby obtaining the movement trajectory of the object.
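The graph search of paragraph [0014] can be sketched in Python as follows. This is a minimal sketch under stated assumptions: vertices are represented as hypothetical `(frame, object_id)` tuples, the function name `trajectories` is invented for this example, a path is assumed to include the merged vertex at which it stops, and a split is assumed to end a path as well (its children become new starting points). None of these details is fixed by the text.

```python
from collections import defaultdict


def trajectories(vertices, edges):
    """Follow each object from its start vertex to its end vertex.

    vertices: iterable of (frame, object_id) tuples.
    edges: iterable of (u, v) pairs linking a vertex in one frame to a
    corresponding vertex in the next frame.
    """
    succ = defaultdict(list)
    pred = defaultdict(list)
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)

    def is_start(v):
        if not pred[v]:
            return True  # state immediately after appearance
        # State immediately after a split: the only parent has several children.
        return len(pred[v]) == 1 and len(succ[pred[v][0]]) > 1

    paths = []
    for v in sorted(vertices):
        if not is_start(v):
            continue
        path = [v]
        # Walk forward while there is exactly one successor. No successor
        # means disappearance; several successors mean a split.
        while len(succ[path[-1]]) == 1:
            nxt = succ[path[-1]][0]
            path.append(nxt)
            if len(pred[nxt]) > 1:
                break  # merged state detected: stop at the merged vertex
        paths.append(path)
    return paths
```

Because only appearance and post-split vertices are used as starting points, as in the text, the frames during which objects remain merged are not covered by any path in this sketch beyond the first merged vertex.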

[0015] When the input video is, for example, footage of a ball game played by multiple people, the background is the field area of the stadium, and the objects are people such as the players and the referees.

[0016] When filming people who move vigorously, as in a soccer match, the shape of each person's image in the video changes markedly from frame to frame. Conventional tracking methods based on template matching have difficulty tracking accurately when the shape of the object changes greatly between frames. In the present invention, the possibility of movement is calculated as a transition cost for every combination of objects on two adjacent frames, so accurate tracking is possible even for objects whose shape changes greatly.

[0017] Likewise, in soccer match footage and the like, multiple people collide violently or come into close contact, so their images in the video appear densely packed. It is extremely difficult to separate and cut out the regions of individual people from such a dense region. Moreover, when one player is hidden behind another person, it is difficult to detect the hidden player from an image taken from a single viewpoint.

[0018] For these reasons, when multiple people are densely packed or overlap, the present invention extracts their entire region as a single person region and determines the merging and separation of multiple players from the correspondence with the preceding and following frames. This allows it to handle situations such as overlapping people. With the conventional template matching approach, by contrast, normal operation cannot be expected once people crowd together or overlap.

[0019] Furthermore, applying conventional methods to such complex targets required a human to judge when people overlapped and to control the tracking process. The present invention can determine the merging and splitting of multiple people automatically, and in this respect too it is superior to conventional methods.

[0020] Each of the processing means described above can be realized in hardware, or by a software program executed by a computer. The software program can be provided recorded on a compact disc, a floppy (registered trademark) disk, or another recording medium.

[0021]

[Embodiments of the Invention] Embodiments of the present invention are described in detail below with reference to the drawings. FIG. 1 illustrates an example embodiment of the present invention. The system consists of an input unit 100, a processing unit 200, a file system 300, a memory 400, and a display unit 1000.

[0022] The input unit 100 consists of an imaging device 101, a video storage and playback device 102, and a video capture device 103. The imaging device 101 is a camera that shoots the match video. The video storage and playback device 102 records and plays back the video shot by the imaging device 101. The imaging device 101 and the video storage and playback device 102 need not be at the same location: video shot by the imaging device 101 can be transmitted over a communication line or the like and received and recorded by the video storage and playback device 102. Moreover, the video storage and playback device 102 need not be a single unit; recording and playback can be performed by separate devices.

[0023] The video capture device 103 loads the video played back by the video storage and playback device 102 onto a computer and stores it as digital data in the video storage file system 301. When the output of the video storage and playback device 102 is an analog signal, the video capture device 103 performs analog-to-digital (A/D) conversion. When the output of the imaging device 101 or the video storage and playback device 102 is a digital signal, it also converts the format of the video data as necessary.

[0024] The processing unit 200 implements the core technology of the present invention and consists of a histogram calculation unit 201, a field region extraction unit 202, a person region extraction unit 203, a person detection unit 204, a transition cost calculation unit 205, an inter-frame correspondence determination unit 206, and a person transition search unit 207. The processing unit 200 takes video as input, detects people in each image frame of the video, tracks them, and outputs results such as the position of each person in each frame. Details are given later.

[0025] The file system 300 consists of a video storage file system 301, a person color model file system 302, a person detection result file system 303, and a tracking output file system 304.

[0026] The video storage file system 301 stores the video data output by the video capture device 103 and can output any image frame in the video in response to a request from the processing unit 200. The person color model file system 302 stores the color information of model persons, which is used when the person detection unit 204 determines the attributes of a detected person.

[0027] The person detection result file system 303 stores the data output by the processing unit 200, consisting of the position and size of each person in each image frame of the input video, the correspondence with the people in the preceding and following frames, and so on. The tracking output file system 304 stores the movement trajectory information of each person in the video, as output by the person transition search unit 207, and constitutes the final output of the system. These file systems can be implemented on the same fixed disk device or on different ones.

[0028] The memory 400 consists of an image frame memory 401, a histogram memory 402, a field region image frame memory 403, a person region image frame memory 404, a person information memory 405, a transition cost matrix memory 406, a person state table memory 407, and a person transition structure memory 408. These memories temporarily hold the information input and output at each stage of processing in the processing unit 200, and can be implemented with semiconductor memory.

[0029] The display unit 1000 displays the tracking results as images.

[0030] FIG. 2 is a flowchart showing an example of the processing performed by the system of FIG. 1. First, an image frame is read from the video (step 501), a histogram is calculated for each color component (step 502), the field region is extracted from the image frame (step 503), and the regions of the people on the field are extracted (step 504). People are then detected from the person regions, and their position and size information is obtained (step 505). If the current image frame is not the start frame, the transition costs to the people detected in the previous frame are calculated (step 506), the people in the current frame are matched to those in the previous frame (step 507), and the result is written to a file. After processing has finished through to the final frame, the person information and correspondence information obtained for each frame are read in (step 508), the people are searched for using this information, and the result is output (step 509).

[0031] The operation of the processing unit 200 shown in FIG. 1 is described concretely below. The processing unit 200 first accesses the video data stored in the video storage file system 301, reads image frames at the frame interval specified by the user within the frame section specified by the user, and stores them in the image frame memory 401.

[0032] The histogram calculation unit 201 calculates a histogram (frequency distribution) for each color component over all the pixels of the image frame held in the image frame memory 401, and stores it in the histogram memory 402.

[0033] The video handled by this system is color video, and each pixel holds multiple values to represent its color. For example, computer video display devices use the RGB color system, a set of three numerical values (red, green, blue). The histogram calculation unit 201 can use the HSV color system (hue, saturation, luminance) as one such means of color representation.
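As a small illustration of moving from RGB to HSV as mentioned in paragraph [0033], the following Python sketch uses the standard-library `colorsys` module. The helper name `rgb8_to_hsv8` and the choice to rescale all three HSV components to the 0 to 255 range are assumptions made for this example; the text does not specify a numeric range for the components.

```python
import colorsys


def rgb8_to_hsv8(r, g, b):
    """Convert 8-bit RGB to HSV, with each output component rescaled to
    0..255 (the rescaling is an assumption of this sketch)."""
    # colorsys works on floats in [0, 1] and returns floats in [0, 1].
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return round(h * 255), round(s * 255), round(v * 255)
```

For example, pure green `(0, 255, 0)` maps to a hue one third of the way around the color circle with full saturation, while a gray pixel has zero saturation, which is one reason a grass-colored field tends to form a compact peak in the hue histogram.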

[0034] Before the histogram distribution is calculated, a division width is set in advance for the values of each color component, the range between the minimum and maximum values of each color component is divided into a number of intervals, a memory area capable of holding a count for each interval is reserved in the histogram memory 402, and the value of every interval is reset to 0.

[0035] Next, every pixel is visited in turn, and the value of the histogram interval corresponding to the pixel's color component value is incremented by 1. An independent histogram is built for each color component. Once all pixels have been visited, the values in the histogram memory give the histogram distribution. For example, a distribution like that of FIG. 3(a) is obtained for each color component, and the highest mountain in the histogram forms at the part corresponding to the color of the stadium field.
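The binning and counting procedure of paragraphs [0034] and [0035] can be sketched in Python as follows. This is a minimal sketch: the function name `channel_histograms`, the representation of pixels as integer `(h, s, v)` tuples, and the default value range of 0 to 255 are assumptions made for illustration.

```python
def channel_histograms(pixels, bin_width, max_value=256):
    """Build one independent histogram per color component.

    pixels: iterable of (h, s, v) integer tuples in [0, max_value).
    bin_width: the division width of each interval.
    """
    n_bins = (max_value + bin_width - 1) // bin_width
    # Reserve one counter per interval and reset all of them to 0.
    hists = [[0] * n_bins for _ in range(3)]
    # Visit every pixel and increment the interval its value falls into.
    for px in pixels:
        for c, value in enumerate(px):
            hists[c][value // bin_width] += 1
    return hists
```

A wider `bin_width` smooths the distribution, which is how the division width can be tuned to suit the stadium and lighting conditions.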

[0036] The properties of the image change markedly with the stadium and the lighting conditions at the time of shooting, but the present invention can handle video shot under a variety of conditions by adjusting the division width of the histogram.

[0037] The HSV color system is used as the example implementation in the histogram calculation unit 201 and in the subsequent processing, but the present invention can be carried out with any other color system.

[0038] The field region extraction unit 202 takes as input the image frame stored in the image frame memory 401 and the histogram distribution stored in the histogram memory 402, extracts the field region in the image frame, and outputs the extracted field region as an image to the field region image frame memory 403. Here "field" does not mean a field in the interlaced video sense; it means the area of the filmed stadium in which the match is played.

[0039] The images handled by this system generally have the composition shown in FIG. 4. 601 is the image region outside the field region, corresponding to stadium facilities such as the spectator seats; 602 is the field region in which the game is played; and 603 is an image region occupied by a person (player) on the field.

【００４０】フィールド領域抽出部２０２では，まず，
ヒストグラム分布を利用して，フィールドに対応する画
素の色の範囲を計算する。そのため，始めに，各色成分
のヒストグラム分布について，それぞれ最大の度数を示
す値を中心色成分（Ｈ_C，Ｓ _C，Ｖ_C）として取得す
る。The field region extraction unit 202 first uses the histogram distribution to calculate the color range of the pixels corresponding to the field. To do so, it first takes, for the histogram distribution of each color component, the value with the highest frequency as the center color component (H_C, S_C, V_C).

【００４１】次に，フィールドの色範囲の下限と上限を
各成分について求める。本発明では，ヒストグラムのピ
ークを挟んだ山をフィールドの色領域と定義し，ピーク
を挟む谷の色成分値を探索により求める（図３
（ｂ））。例えば，下限の探索は，ピーク値より色成分
値の低い方向に進みながら，分布の傾きを計算し，傾き
がゼロになった色成分値を下限値として求めることがで
きる。また，上限値も，探索方向を逆にすることにより
求められる。ここで得られた色範囲を成分毎に
（Ｈ_min，Ｈ_max），（Ｓ_min，Ｓ_max），（Ｖ_min，
Ｖ_max）と表記する。ある画素の色成分がフィールド色
領域に含まれるとは，色相Ｈ，彩度Ｓ，輝度Ｖのそれぞ
れの値が全て対応する色範囲に含まれていることを指
す。また，この条件を緩和し，例えば，色相Ｈ，彩度Ｓ
成分についてのみ条件を課すというような様々な条件の
組み合わせを用いることも可能である。Next, the lower and upper limits of the field color range are determined for each component. In the present invention, the mountain surrounding the histogram peak is defined as the color region of the field, and the color component values of the valleys on either side of the peak are found by searching (FIG. 3(b)). For example, to find the lower limit, the slope of the distribution is calculated while moving from the peak toward lower color component values, and the color component value at which the slope becomes zero is taken as the lower limit. The upper limit is found in the same way by reversing the search direction. The color ranges obtained here are written, per component, as (H_min, H_max), (S_min, S_max), and (V_min, V_max). The color of a pixel is said to be within the field color region when its hue H, saturation S, and luminance V all fall within the corresponding ranges. This condition can also be relaxed, and various combinations of conditions can be used, for example imposing the condition only on the hue H and saturation S components.
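
The peak and valley search of paragraphs 0040 and 0041 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function name `field_color_range` and the list-based histogram are assumptions, and the stop rule used here (stop as soon as the next bin is no longer lower) is only one way to read "the slope becomes zero".

```python
from typing import List, Tuple


def field_color_range(hist: List[int]) -> Tuple[int, int, int]:
    """Return (lower, center, upper) bin indices for one color component.

    hist[k] is the number of pixels whose component value falls in bin k.
    """
    # Center color: the bin with the highest frequency (first one on ties).
    center = max(range(len(hist)), key=lambda k: hist[k])

    # Walk toward lower values while the distribution keeps descending.
    lower = center
    while lower > 0 and hist[lower - 1] < hist[lower]:
        lower -= 1

    # Same search in the opposite direction for the upper limit.
    upper = center
    while upper < len(hist) - 1 and hist[upper + 1] < hist[upper]:
        upper += 1

    return lower, center, upper
```

Running the function once per component (H, S, V) would give the three ranges; a real histogram would probably need smoothing first, since small ripples stop this search early.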

【００４２】次に，画像フレームの各画素を巡回し，各
々の画素がフィールド領域に属するか否かを判定する。
フィールド領域は，他の領域と比較してなめらかで色の
ばらつきが小さいので，局所的な色分布のばらつきと，
フィールド領域全体の色のばらつきを比較して，局所的
に一様な色分布を持ち，かつ，検出したフィールドの色
範囲へ適合する画素をフィールド領域に含まれるとして
判定する。Next, each pixel of the image frame is visited to determine whether it belongs to the field region. Because the field region is smoother and varies less in color than other regions, the local variation of the color distribution is compared with the color variation of the field region as a whole, and a pixel that has a locally uniform color distribution and also matches the detected field color range is judged to be part of the field region.

【００４３】そのための具体的な方法として，画像フレ
ーム中の各画素（ｉ，ｊ）を巡回し，着目する画素を含
むＮ×Ｎ画素の近傍画素の集合について，フィールド色
範囲に含まれる画素の割合ｒと，近傍の色分布とフィー
ルド上の色分布の分散の大きさの比Ｒ（分散比）を計算
する。そして，割合ｒが一定値以上であり，かつ，分散
比Ｒが一定値以下である場合に，その画素はフィールド
上に属すると判断する。その場合，フィールド領域画像
フレームメモリ４０３上の対応する座標値（ｉ，ｊ）が
１となり，フィールド外と判定された場合０と設定され
る。As a specific method, each pixel (i, j) in the image frame is visited, and for the set of N × N neighboring pixels that contains the pixel of interest, two quantities are calculated: the fraction r of pixels within the field color range, and the ratio R (variance ratio) of the variance of the neighborhood color distribution to the variance of the color distribution on the field. If r is at or above a fixed value and R is at or below a fixed value, the pixel is judged to be on the field. In that case, the value at the corresponding coordinates (i, j) in the field region image frame memory 403 is set to 1; if the pixel is judged to be outside the field, it is set to 0.
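
The test on r and R described in paragraph 0043 might look like the sketch below. Several things here are assumptions made for illustration: only one color component is examined, `i` is treated as the row index, the window is clipped at the image border, the local variance is taken about the local mean, and the names `r_min` and `R_max` stand in for the unspecified fixed values.

```python
from typing import List, Tuple


def is_field_pixel(
    values: List[List[float]],        # one color component per pixel
    i: int,
    j: int,
    n: int,                           # neighborhood size (n x n, n odd)
    color_range: Tuple[float, float], # field color range for this component
    field_variance: float,            # variance of the field color distribution (> 0)
    r_min: float,                     # hypothetical threshold on r
    R_max: float,                     # hypothetical threshold on R
) -> bool:
    half = n // 2
    rows, cols = len(values), len(values[0])
    # Collect the n x n neighborhood, clipped at the image border.
    window = [
        values[y][x]
        for y in range(max(0, i - half), min(rows, i + half + 1))
        for x in range(max(0, j - half), min(cols, j + half + 1))
    ]
    lo, hi = color_range
    # r: fraction of neighborhood pixels inside the field color range.
    r = sum(lo <= v <= hi for v in window) / len(window)
    # R: local variance divided by the field variance.
    mean = sum(window) / len(window)
    local_variance = sum((v - mean) ** 2 for v in window) / len(window)
    R = local_variance / field_variance
    return r >= r_min and R <= R_max
```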

【００４４】以上で，判定されたフィールド領域には，
フィールド上に存在する人物（選手）の領域が含まれ
ず，フィールド領域中の穴として残ってしまう場合があ
る。そのため，フィールド領域画像フレームメモリ４０
３に記憶されているフィールド領域画像について，整形
処理を行う。整形処理は，まず，モルフォロジフィルタ
による縮小処理により，孤立した微少領域をゴミとして
除去する。続いて，人物領域の穴を埋めるために，フィ
ールド領域画像上の各画素を巡回し，ある画素が０であ
る場合（フィールド外と設定されている）に，図５のよ
うに，上下左右のある距離離れた画素の値を取得し，そ
れらが全て１である（つまり，フィールド上である）場
合に，現在の巡回中の画素の値を１に変化させるという
処理を繰り返し行う。図５の（ａ），（ｂ），（ｃ）は
その際にチェックする近傍の画素のパターンであり，一
つまたは複数のパターンについて順次実行する。本処理
により，従来のモルフォロジフィルタの膨張処理と比較
して，領域中の大規模な穴を，高速に埋めることが可能
となる。The field region determined above may not include the regions of persons (players) on the field, which can remain as holes in the field region. The field region image stored in the field region image frame memory 403 is therefore given a shaping process. The shaping process first removes isolated minute regions as noise by erosion with a morphological filter. Then, to fill the holes left by person regions, each pixel of the field region image is visited; when a pixel is 0 (set as outside the field), the values of the pixels a certain distance away above, below, left, and right are read as shown in FIG. 5, and if all of them are 1 (that is, on the field), the value of the pixel being visited is changed to 1. This process is repeated. FIGS. 5(a), 5(b), and 5(c) show the patterns of neighboring pixels checked at that time, and the process is run in turn for one or more of these patterns. Compared with the dilation of a conventional morphological filter, this process can fill large holes in the region quickly.
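
A rough sketch of the hole-filling step in paragraph 0044 follows. The exact probe patterns of FIG. 5 are not reproduced in this text, so the sketch assumes that one pattern means four probes at a single distance; `fill_holes`, `distance`, and `max_passes` are hypothetical names. The erosion step that precedes hole filling is not shown.

```python
from typing import List


def fill_holes(mask: List[List[int]], distance: int, max_passes: int = 10) -> List[List[int]]:
    """Set a 0 pixel to 1 when the pixels `distance` away above, below,
    left, and right are all 1. Repeats until nothing changes."""
    rows, cols = len(mask), len(mask[0])
    out = [row[:] for row in mask]  # work on a copy
    for _ in range(max_passes):
        changed = False
        for i in range(rows):
            for j in range(cols):
                if out[i][j] != 0:
                    continue
                probes = [(i - distance, j), (i + distance, j),
                          (i, j - distance), (i, j + distance)]
                # A probe outside the image counts as "not field".
                if all(0 <= y < rows and 0 <= x < cols and out[y][x] == 1
                       for y, x in probes):
                    out[i][j] = 1
                    changed = True
        if not changed:
            break
    return out
```

Calling the function several times with different `distance` values would correspond to running more than one pattern in turn.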

【００４５】以上の処理により，図４のような画像か
ら，図６のようなフィールド領域がフィールド領域画像
フレームメモリ４０３上の二値画像として得られる。By the above processing, a field region like that in FIG. 6 is obtained from an image like that in FIG. 4 as a binary image in the field region image frame memory 403.

【００４６】人物領域抽出部２０３は，画像フレームメ
モリ４０１中の画像と，フィールド領域画像フレームメ
モリ４０３中のフィールド領域画像を入力とし，画像中
から人物領域を抽出し，抽出結果を画像として人物領域
フレームメモリ４０４に記録する。The person region extraction unit 203 takes as input the image in the image frame memory 401 and the field region image in the field region image frame memory 403, extracts the person regions from the image, and records the extraction result as an image in the person region image frame memory 404.

【００４７】そのための具体的な方法として，以下の方
法が利用できる。フィールド領域画像上において，フィ
ールド領域に含まれる画素を巡回し，その各画素の色成
分がフィールド色範囲に含まれない場合，または，色相
成分Ｈの値のフィールド中心色からのずれ（フィールド
色分布の標準偏差で正規化）が一定値以上の場合，また
は，着目画素の近傍の画素の輝度成分Ｖのフィールド中
心色まわりの分散とフィールド色分布の分散の比が，一
定値以上の場合のいずれかの場合に，着目画素は選手領
域に属すると判断し，人物領域画像フレームメモリ４０
４中の該当座標の画素の値を１に設定する。また，そう
でない場合には，値を０に設定する。As a specific method, the following can be used. The pixels included in the field region on the field region image are visited, and the pixel of interest is judged to belong to a player region in any of the following cases: the color components of the pixel are not within the field color range; the deviation of its hue component H from the field center color (normalized by the standard deviation of the field color distribution) is at or above a fixed value; or the ratio of the variance, about the field center color, of the luminance component V of the pixels near the pixel of interest to the variance of the field color distribution is at or above a fixed value. In that case, the value of the pixel at the corresponding coordinates in the person region image frame memory 404 is set to 1. Otherwise, the value is set to 0.
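
The three alternative conditions of paragraph 0047 can be written as a single predicate, as in the sketch below. The parameter names (`hue_dev_max`, `v_ratio_max`, and so on) are hypothetical, and the sketch ignores the circular nature of hue, which a real implementation would probably need to handle.

```python
from typing import List


def is_person_pixel(
    in_field_color_range: bool,   # result of the field color range test for this pixel
    hue: float,
    hue_center: float,            # H_C, the field center hue
    hue_std: float,               # standard deviation of the field hue distribution (> 0)
    neighborhood_v: List[float],  # V values of the pixels near the pixel of interest
    v_center: float,              # V_C, the field center luminance
    v_field_variance: float,      # variance of the field V distribution (> 0)
    hue_dev_max: float,           # hypothetical threshold
    v_ratio_max: float,           # hypothetical threshold
) -> bool:
    # Condition 2: normalized deviation of hue from the field center color.
    hue_deviation = abs(hue - hue_center) / hue_std
    # Condition 3: variance of V about the field center color, relative to the field variance.
    v_variance = sum((v - v_center) ** 2 for v in neighborhood_v) / len(neighborhood_v)
    v_ratio = v_variance / v_field_variance
    # Any one of the three conditions marks the pixel as a person pixel.
    return (not in_field_color_range) or hue_deviation >= hue_dev_max or v_ratio >= v_ratio_max
```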

【００４８】このような処理を全画素について実行する
と，図７のような人物領域を示す二値画像が，人物領域
画像フレームメモリ４０４に得られる。When such processing is executed for all pixels, a binary image showing a person region as shown in FIG. 7 is obtained in the person region image frame memory 404.

【００４９】人物検出部２０４は，画像フレームメモリ
４０１中の画像と，人物領域画像フレームメモリ４０４
中の選手領域画像，人物色モデルファイルシステム３０
２中の人物の色モデル情報を入力とし，画像中より人物
を検出し，その位置，サイズ，属性を取得し，人物情報
メモリ４０５に格納する。The person detection unit 204 takes as input the image in the image frame memory 401, the player region image in the person region image frame memory 404, and the person color model information in the person color model file system 302. It detects persons in the image, acquires their positions, sizes, and attributes, and stores them in the person information memory 405.

【００５０】そのための具体的な方法として，以下の方
法が利用できる。まず，人物領域画像フレームメモリ４
０４中より読み込んだ選手領域画像について，距離変換
を施し，その結果を人物領域画像フレームメモリ４０４
に書き込む。距離変換後の画像を距離画像と呼ぶ。距離
画像中の各画素値は，その画素から背景（つまり，非人
物領域）への最短距離の値を持つ。その方法には，例え
ば，「画像解析ハンドブック」（監修高木幹雄，下田陽
久，東京大学出版会）の５７６ページから５７７ページ
に記載の方法が利用できる。As a specific method, the following can be used. First, a distance transform is applied to the player region image read from the person region image frame memory 404, and the result is written back to the person region image frame memory 404. The image after the distance transform is called a distance image. Each pixel value in the distance image is the shortest distance from that pixel to the background (that is, the non-person region). For this, the method described on pages 576 to 577 of "Image Analysis Handbook" (supervised by Mikio Takagi and Haruhisa Shimoda, University of Tokyo Press) can be used, for example.
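
For readers without the cited handbook at hand, the sketch below shows a common two-pass city-block (4-neighbor) distance transform. It is a stand-in chosen for brevity and is not claimed to be the handbook's method; `distance_transform` is a hypothetical name.

```python
from typing import List


def distance_transform(mask: List[List[int]]) -> List[List[int]]:
    """City-block distance from each person pixel (1) to the nearest
    background pixel (0). Background pixels get distance 0."""
    rows, cols = len(mask), len(mask[0])
    far = rows + cols  # larger than any reachable city-block distance
    dist = [[far if mask[i][j] else 0 for j in range(cols)] for i in range(rows)]
    # Forward pass: propagate distances from the top and left.
    for i in range(rows):
        for j in range(cols):
            if i > 0:
                dist[i][j] = min(dist[i][j], dist[i - 1][j] + 1)
            if j > 0:
                dist[i][j] = min(dist[i][j], dist[i][j - 1] + 1)
    # Backward pass: propagate distances from the bottom and right.
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            if i < rows - 1:
                dist[i][j] = min(dist[i][j], dist[i + 1][j] + 1)
            if j < cols - 1:
                dist[i][j] = min(dist[i][j], dist[i][j + 1] + 1)
    return dist
```

If the mask contains no background pixel at all, every entry stays at the placeholder value `rows + cols`.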

【００５１】次に，距離画像について，距離値が局所的
にピークを持つような画素の集合を検出する。その方法
としては，各画素を巡回し，今着目する画素と，近傍の
画素の距離値を比較して，着目画素の距離値がいずれよ
りも大きい場合に，その画素をピークとして検出する方
法が利用できる。続いて，検出される各ピークについ
て，そのピーク画素を取り囲む人物領域に適合する外接
四角形を計算する。例えば，図８（ａ）の人物領域７０
２について，点７０３のようにピーク画素が得られ，人
物領域の外接四角形が７０１のように求められる。Next, the sets of pixels at which the distance value has a local peak are detected in the distance image. One usable method is to visit each pixel, compare the distance value of the pixel of interest with those of its neighboring pixels, and detect the pixel as a peak when its distance value is larger than all of them. Then, for each detected peak, a circumscribed rectangle that fits the person region surrounding the peak pixel is calculated. For example, for the person region 702 in FIG. 8(a), a peak pixel is obtained at point 703, and the circumscribed rectangle of the person region is obtained as 701.
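
The peak test of paragraph 0051 could be sketched as below, assuming an 8-neighborhood (the text only says "neighboring pixels") and a strict comparison. The function name `local_peaks` is hypothetical.

```python
from typing import List, Tuple


def local_peaks(dist: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (row, col) of pixels whose distance value is strictly larger
    than that of every pixel in their 8-neighborhood."""
    rows, cols = len(dist), len(dist[0])
    peaks = []
    for i in range(rows):
        for j in range(cols):
            d = dist[i][j]
            if d == 0:
                continue  # background pixels are never peaks
            neighbors = [
                dist[y][x]
                for y in range(max(0, i - 1), min(rows, i + 2))
                for x in range(max(0, j - 1), min(cols, j + 2))
                if (y, x) != (i, j)
            ]
            if all(d > v for v in neighbors):
                peaks.append((i, j))
    return peaks
```

One consequence of the strict comparison is that a plateau of equal distance values yields no peak at all, so an elongated region may need a tie-breaking rule that the text does not specify.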

【００５２】その方法の一つとして，図８（ｂ）のよう
に，ピーク画素を中心とし，その距離値の２倍の幅，高
さをもつ四角形を考え，この四角形の上，下の辺をそれ
ぞれ上方，下方に移動させ，返上の画素が人物領域に含
まれる割合を計算し，ある割合を下回る時点で，辺の移
動を停止する。また，左右の辺についても，同様の操作
を行う。その結果として得られる辺の位置を，含む四角
形を外接四角形として求め，その中心位置の座標，幅，
高さを人物情報として人物情報メモリ４０５中に格納す
る。As one such method, as shown in FIG. 8(b), consider a rectangle centered on the peak pixel whose width and height are twice the distance value. The top and bottom sides of this rectangle are moved upward and downward, respectively, while the fraction of pixels on the side that belong to the person region is calculated, and a side stops moving when that fraction falls below a given value. The same operation is applied to the left and right sides. The rectangle bounded by the resulting side positions is taken as the circumscribed rectangle, and the coordinates of its center, its width, and its height are stored as person information in the person information memory 405.
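
The side-growing procedure of paragraph 0052 might be sketched as follows. The pixel discretization is an assumption: the initial box spans the peak plus or minus `distance - 1` pixels so that it does not overshoot the region, and a side stops just before the first line whose person fraction is below `min_fraction` (a hypothetical name for the unspecified value).

```python
from typing import List, Tuple


def circumscribed_rectangle(
    mask: List[List[int]],   # binary person region image (1 = person)
    peak: Tuple[int, int],   # (row, col) of the peak pixel
    distance: int,           # distance value at the peak
    min_fraction: float,     # stop when the fraction on a side drops below this
) -> Tuple[int, int, int, int]:
    """Return (top, bottom, left, right), all inclusive pixel indices."""
    rows, cols = len(mask), len(mask[0])
    pi, pj = peak
    half = max(distance - 1, 0)
    top, bottom = max(0, pi - half), min(rows - 1, pi + half)
    left, right = max(0, pj - half), min(cols - 1, pj + half)

    def row_fraction(i: int) -> float:
        # Fraction of person pixels on row i between the current left and right sides.
        return sum(mask[i][left:right + 1]) / (right - left + 1)

    def col_fraction(j: int) -> float:
        # Fraction of person pixels on column j between the current top and bottom sides.
        return sum(mask[i][j] for i in range(top, bottom + 1)) / (bottom - top + 1)

    # Move the top and bottom sides outward first, then the left and right sides.
    while top > 0 and row_fraction(top - 1) >= min_fraction:
        top -= 1
    while bottom < rows - 1 and row_fraction(bottom + 1) >= min_fraction:
        bottom += 1
    while left > 0 and col_fraction(left - 1) >= min_fraction:
        left -= 1
    while right < cols - 1 and col_fraction(right + 1) >= min_fraction:
        right += 1
    return top, bottom, left, right
```

The center, width, and height stored as person information follow directly from the four returned indices.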

【００５３】さらに，人物情報メモリ４０５中に格納さ
れた人物中から，相異なる人物の外接四角形の重なりの
面積を調べ，重なりが顕著である場合，より面積の小さ
い方の人物情報を人物情報メモリ４０５から削除する。
その方法としては，重なりの面積と，大きい方の外接四
角形の面積との比が一定値以下のときに，面積の小さい
方を消去する方法が利用できる。また，重なりがある場
合に，面積の小さい方の外接四角形を消去することで，
失われる面積が他方の外接四角形の面積に占める割合を
計算し，それが一定値以下の場合に，面積の小さい方を
消去するという方法が利用できる。なお，上記の方法以
外の方法も利用可能である。Further, among the persons stored in the person information memory 405, the overlap area between the circumscribed rectangles of different persons is examined, and when the overlap is significant, the person information with the smaller area is deleted from the person information memory 405. One usable method is to delete the smaller rectangle when the ratio of the overlap area to the area of the larger circumscribed rectangle is at or below a fixed value. Another is, when there is an overlap, to calculate the ratio of the area that would be lost by deleting the smaller circumscribed rectangle to the area of the other circumscribed rectangle, and to delete the smaller one when that ratio is at or below a fixed value. Methods other than these can also be used.
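
The second method in paragraph 0053 can be sketched as below. The reading of "lost area" as the part of the smaller rectangle not covered by the larger one is an interpretation, and `suppress_overlaps` and `max_lost_ratio` are hypothetical names.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (center_x, center_y, width, height)


def overlap_area(a: Rect, b: Rect) -> float:
    """Area of the intersection of two axis-aligned rectangles."""
    w = min(a[0] + a[2] / 2, b[0] + b[2] / 2) - max(a[0] - a[2] / 2, b[0] - b[2] / 2)
    h = min(a[1] + a[3] / 2, b[1] + b[3] / 2) - max(a[1] - a[3] / 2, b[1] - b[3] / 2)
    return max(w, 0.0) * max(h, 0.0)


def suppress_overlaps(rects: List[Rect], max_lost_ratio: float) -> List[Rect]:
    """Drop a smaller rectangle when deleting it loses little area
    relative to a larger rectangle it overlaps."""
    kept: List[Rect] = []
    # Visit larger rectangles first so each one is compared with bigger survivors.
    for r in sorted(rects, key=lambda q: q[2] * q[3], reverse=True):
        area = r[2] * r[3]
        drop = False
        for big in kept:
            ov = overlap_area(r, big)
            if ov > 0:
                lost = area - ov  # area that disappears if r is deleted
                if lost / (big[2] * big[3]) <= max_lost_ratio:
                    drop = True
                    break
        if not drop:
            kept.append(r)
    return kept
```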

【００５４】次に，以上の処理の結果，人物情報メモリ
４０５中に存在する人物について，その属性を判定し，
属性情報も人物情報メモリ４０５に付加する。そのため
に，人物色モデルファイルシステム３０２より，予め登
録してあるモデルとなる人物の色モデルを読み込み，検
出した人物の色との比較を行い，最も適合するモデルの
属性を人物の属性として判断する。Next, the attribute of each person remaining in the person information memory 405 after the above processing is determined, and the attribute information is also added to the person information memory 405. To do this, the color models of the persons registered in advance as models are read from the person color model file system 302 and compared with the colors of the detected person, and the attribute of the best-matching model is taken as the person's attribute.

【００５５】その一つの実現方法として，以下のような
色情報を用いる。本例では，図９のように人物領域を垂
直方向に分割し，分割された各領域中の人物領域につい
て，各色成分の平均値を計算する。こうした色の組と，
同様にして予め利用者によって属性が指定されている各
モデルの人物の色の組との平均二乗誤差を計算し，最も
この誤差の小さいモデルの属性を検出した人物の属性と
する。モデルとしては，両チームの選手，審判に加え，
フィールド上の白線や芝のハゲの領域を用意して，人物
色モデルファイルシステム３０２に登録しておくこと
で，白線やハゲのモデルに最も近いと判断された場合に
は，その人物の情報を人物情報メモリ４０５から削除す
ることができ，そうすることにより，白線やハゲなどの
領域を人物として誤検出してしまうことが減少する。As one implementation, the following color information is used. In this example, the person region is divided in the vertical direction as shown in FIG. 9, and the average value of each color component is calculated over the person region within each division. The mean squared error between this set of colors and the corresponding set of colors of each model person, whose attribute has been specified in advance by the user, is calculated, and the attribute of the model with the smallest error is assigned to the detected person. As models, in addition to the players of both teams and the referee, white lines on the field and bare patches of turf can be prepared and registered in the person color model file system 302. When a detection is judged closest to the white-line or bare-patch model, its information can be deleted from the person information memory 405, which reduces false detections of white lines, bare patches, and similar regions as persons.
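
The nearest-model rule of paragraph 0055 might be sketched as follows. The labels, the number of bands, and the function name `classify_person` are illustrative assumptions, and hue is again treated as an ordinary number rather than an angle.

```python
from typing import Dict, List, Tuple

Color = Tuple[float, float, float]  # mean (H, S, V) of one vertical division


def classify_person(colors: List[Color], models: Dict[str, List[Color]]) -> str:
    """Return the label of the model whose per-division mean colors have the
    smallest mean squared error against the detected person's colors."""

    def mse(a: List[Color], b: List[Color]) -> float:
        diffs = [(x - y) ** 2
                 for band_a, band_b in zip(a, b)
                 for x, y in zip(band_a, band_b)]
        return sum(diffs) / len(diffs)

    return min(models, key=lambda label: mse(colors, models[label]))
```

A caller could then delete any detection whose returned label is a non-person model such as a white line.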

【００５６】遷移コスト計算部２０５は，現在のフレー
ム，および，前フレームについて検出され，人物情報メ
モリ４０５に記憶されている人物情報を入力として，フ
レーム間における人物の遷移コストを計算し，遷移コス
トマトリクスメモリ４０６に格納する。The transition cost calculation unit 205 takes as input the person information detected for the current frame and the previous frame and stored in the person information memory 405, calculates the transition costs of persons between the frames, and stores them in the transition cost matrix memory 406.

【００５７】遷移コスト計算部２０５は，処理対象の映
像のフレーム範囲において，第２フレーム以上のフレー
ムについて実施するものである。遷移コストとは，前フ
レームのある人物が，現フレーム上のある人物に移動し
たと考える場合のコストであり，前フレーム上の人物数
と現フレームの人物数をそれぞれ行の数，列の数とする
マトリクス形式として遷移コストマトリクスメモリ４０
６に格納される。The transition cost calculation unit 205 operates on the second and subsequent frames within the frame range of the video being processed. The transition cost is the cost of assuming that a given person in the previous frame has moved to a given person in the current frame. It is stored in the transition cost matrix memory 406 in matrix form, with the number of persons in the previous frame as the number of rows and the number of persons in the current frame as the number of columns.

【００５８】このコストとして，本例では，移動コスト
と変化コストという２種類のコストを計算し，その結果
をそれぞれ遷移コストマトリクスメモリ４０６に格納す
る。移動コストとは，フレーム間の人物の距離に関する
コストであり，変化コストとは，フレーム間における人
物の形状や色分布の変化に関するコストである。これら
のコストが小さいもの同士が実際に移動したと考える。In this example, two kinds of cost are calculated, a movement cost and a change cost, and each result is stored in the transition cost matrix memory 406. The movement cost relates to the distance between persons across frames, and the change cost relates to the change in a person's shape and color distribution across frames. Pairs for which these costs are small are regarded as actual movements.

【００５９】移動コストの一計算方法として，以下の方
法が利用できる。前フレーム上のある人物ｉと，現フレ
ームｊとの間の移動コストＤ（ｉ，ｊ）は，人物ｉの座
標を（ｘ_i，ｙ_i），人物ｊの座標を（ｘ_j，ｙ_j），
人物ｉの幅，高さを（ｗ_i，ｈ_i），人物ｊの幅，高さ
を（ｗ_j，ｈ_j）とすると，The following method can be used as one way of calculating the movement cost. Let (x_i, y_i) be the coordinates of person i in the previous frame and (x_j, y_j) those of person j in the current frame, and let (w_i, h_i) and (w_j, h_j) be the width and height of person i and person j, respectively. The movement cost D(i, j) between person i and person j is then given by Equation 1:

【００６０】[0060]

【数１】 (Equation 1)

【００６１】と計算できる。ただし，Ｖ_x，Ｖ_yは，カ
メラの視点移動によりフレーム間で生じた画像のずれ幅
である。固定視点カメラにより撮影された映像の場合，
これらを０とし，移動視点カメラにより撮影された映像
を処理対象とする場合には，平均的な移動量（単位は画
素）としてＶ_x，Ｖ_yをフレーム間のマッチング等の方
法により計算する。Here, V_x and V_y are the image shifts between frames caused by movement of the camera viewpoint. For video shot with a fixed-viewpoint camera, they are set to 0; when video shot with a moving-viewpoint camera is processed, V_x and V_y are calculated as the average displacement (in pixels) by a method such as matching between frames.

【００６２】映像の撮影は，通常，フィールドの真上か
らではなく，側面から行われるため，画像上では，奥行
き方向の距離が左右方向の距離と比較して縮まって表現
される。そのため，水平方向の移動成分を人物の高さで
正規化し，垂直方向の移動成分を人物の幅を正規化する
ことによる補正によって，移動コストの精度を向上させ
ることができる。Video is normally shot from the side of the field rather than from directly above, so on the image, distances in the depth direction appear compressed compared with distances in the left-right direction. The accuracy of the movement cost can therefore be improved by a correction that normalizes the horizontal movement component by the person's height and the vertical movement component by the person's width.
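
Equation 1 itself is not reproduced in this text, so the sketch below is not the patent's formula. It is one possible form that is merely consistent with the surrounding description: the displacement is compensated by (V_x, V_y), the horizontal component is divided by a person height, the vertical component by a person width, and the two are combined as a Euclidean norm. The averaging of the two persons' sizes, the sign of the compensation, and the name `movement_cost` are all assumptions.

```python
import math
from typing import Tuple

Person = Tuple[float, float, float, float]  # (x, y, width, height)


def movement_cost(prev: Person, cur: Person, v_x: float = 0.0, v_y: float = 0.0) -> float:
    """Illustrative movement cost between a previous-frame person and a
    current-frame person. NOT the patent's Equation 1."""
    x_i, y_i, w_i, h_i = prev
    x_j, y_j, w_j, h_j = cur
    h = (h_i + h_j) / 2  # assumed: mean height of the two detections
    w = (w_i + w_j) / 2  # assumed: mean width of the two detections
    dx = (x_j - x_i - v_x) / h  # horizontal component, normalized by height
    dy = (y_j - y_i - v_y) / w  # vertical component, normalized by width
    return math.hypot(dx, dy)
```

Because widths are normally smaller than heights for a standing person, dividing the vertical component by the width weights depth-direction motion more heavily, which matches the stated purpose of the correction.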

【００６３】移動コストの計算後，以後の計算の簡略化
のために，一定値以上の移動コストの場合には，移動コ
スト，変化コストともに無限大に設定する。また，移動
コストが一定値以下の場合においても，現フレームの人
物と前フレームの人物の外接四角形の面積比を計算し，
著しく面積比が大きいか，小さい場合には，変化コスト
を無限大に設定する。これは，多くの場合，人物以外の
ゴミ領域は小さな面積として得られるため，これを識別
するための効果を有する。After the movement cost is calculated, to simplify later calculations, both the movement cost and the change cost are set to infinity when the movement cost is at or above a fixed value. Even when the movement cost is below that value, the area ratio of the circumscribed rectangles of the current-frame person and the previous-frame person is calculated, and the change cost is set to infinity if the ratio is markedly large or small. Because noise regions other than persons are in most cases obtained as small areas, this helps identify them.

【００６４】これら条件を満たした現フレームと前フレ
ームの各人物について，変化コストをテンプレートマッ
チングを用いて計算する。これは現フレームをテンプレ
ートとして，このテンプレートの中心を前フレームの人
物の領域中に当てはめたときの画像間の色成分の二乗誤
差を計算し，テンプレートの中心位置を前フレームの人
物の領域中でずらしながら，二乗誤差が最小になる位置
を探索し，最も小さい二乗誤差を変化コストとして設定
する。なお，二乗誤差以外の尺度も利用可能である。For each pair of persons in the current and previous frames that satisfies these conditions, the change cost is calculated by template matching. With the current-frame person image as the template, the squared error of the color components between the images is calculated when the center of the template is placed within the region of the previous-frame person; the template center is shifted within that region to search for the position where the squared error is smallest, and that smallest squared error is set as the change cost. Measures other than the squared error can also be used.
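
The template search of paragraph 0064 can be illustrated with a plain sum-of-squared-differences scan. This sketch assumes a single color channel and a template that must fit entirely inside the search region; the text works with color components and shifts the template center within the previous person's region, so the geometry here is simplified. `change_cost` is a hypothetical name.

```python
from typing import List


def change_cost(template: List[List[float]], region: List[List[float]]) -> float:
    """Smallest sum of squared differences between the template (current-frame
    person) and any same-sized window of the region (previous-frame person)."""
    th, tw = len(template), len(template[0])
    rh, rw = len(region), len(region[0])
    best = float("inf")  # stays infinite if the template does not fit
    for oy in range(rh - th + 1):
        for ox in range(rw - tw + 1):
            err = sum(
                (template[y][x] - region[oy + y][ox + x]) ** 2
                for y in range(th)
                for x in range(tw)
            )
            best = min(best, err)
    return best
```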

【００６５】フレーム間対応決定部２０６では，遷移コ
ストマトリクスメモリ４０６中に記憶されている遷移コ
ストを入力とし，フレーム間の人物の対応関係を決定
し，決定された人物の対応関係を人物状態テーブルメモ
リ４０７へ格納する。また，人物状態テーブルメモリ４
０７の内容を人物検出結果ファイルシステム３０３へと
出力する。The inter-frame correspondence determination unit 206 takes as input the transition costs stored in the transition cost matrix memory 406, determines the correspondence of persons between frames, and stores the determined correspondence in the person state table memory 407. It also outputs the contents of the person state table memory 407 to the person detection result file system 303.

【００６６】本発明では，現在フレームと前フレームと
の間の人物の対応関係，および，現在フレームと次フレ
ームとの間の人物の対応関係を各々，図１０のような４
つの状態に区分して認識する。現在フレームと前後のフ
レームを含めると１６通りの状態が存在することにな
る。それぞれの対応関係を識別する記号として，前フレ
ームとの関係，および，後フレームとの関係について，
それぞれ接続状態番号を用いる。In the present invention, the correspondence of a person between the current frame and the previous frame, and between the current frame and the next frame, is each recognized as one of the four states shown in FIG. 10. Taking the current frame together with the frames before and after it, there are therefore 16 possible states. A connection state number is used as the symbol identifying each correspondence, one for the relation to the previous frame and one for the relation to the next frame.

【００６７】前フレームとの対応関係において，前フレ
ームには存在しない人物の出現の場合には，接続状態番
号を１とし，また，単純な人物の移動の場合には，接続
状態番号を２と表す。前フレームで一つの人物領域が，
現フレームでは分裂するような場合，現フレームにおけ
る分裂した片割れを接続状態番号３で表す。また，前フ
レームで複数の人物領域が，現フレームにおいて一つに
合体する場合には，接続状態番号４を付与する。In the correspondence with the previous frame, the connection state number is 1 when a person who did not exist in the previous frame appears, and 2 for a simple movement of a person. When one person region in the previous frame splits in the current frame, each of the split pieces in the current frame is given connection state number 3. When multiple person regions in the previous frame merge into one in the current frame, connection state number 4 is assigned.

【００６８】次フレームとの関係も，同様の番号を用
い，現フレームの人物が次フレームにおいては，存在し
ないような，人物が消滅する場合（フィールド領域の範
囲外への移動）には接続状態番号１を与え，現フレーム
の人物が単純に移動する場合には，２を与え，現フレー
ムの人物が分裂する場合には３を与え，現フレームの人
物が次フレームにおいて他の人物と合体して一つの領域
をなす場合には，接続状態番号４を与える。The same numbers are used for the relation to the next frame. Connection state number 1 is given when the person disappears, that is, when the person in the current frame does not exist in the next frame (movement out of the field region); 2 when the person in the current frame simply moves; 3 when the person in the current frame splits; and 4 when the person in the current frame merges with another person in the next frame to form a single region.

【００６９】フレーム間対応決定部２０６では，以上の
ようなフレーム間の対応関係が，逐次決定されていく。
決定される人物の状態は，人物状態テーブルメモリ４０
７にテーブル形式で格納される。これを「人物状態テー
ブル」と呼ぶ。その情報は，図１１に示すように一つの
画像フレーム上で検出された個々の人物について，ＩＤ
番号，前フレームとの接続状態番号（前方接続状態番
号），次フレームとの接続状態番号（後方接続状態番
号），前フレームで対応する人物番号（前方対応人物番
号），次フレームで対応する人物番号（後方対応人物番
号）を含むものとする。The inter-frame correspondence determination unit 206 determines the inter-frame correspondences described above one after another. The determined person states are stored in table form in the person state table memory 407. This is called the "person state table". As shown in FIG. 11, for each person detected in one image frame it holds the ID number, the connection state number with the previous frame (forward connection state number), the connection state number with the next frame (backward connection state number), the number of the corresponding person in the previous frame (forward corresponding person number), and the number of the corresponding person in the next frame (backward corresponding person number).

【００７０】ＩＤ番号は，人物に固有の番号であり，同
一の人物が複数のフレームに渡って存在する場合，それ
らのフレーム中の該当人物には同一の番号が付与される
とする。また，人物番号とは，一つのフレーム上で検出
された人物に対して付与される通し番号であり，人物状
態テーブルメモリ４０７中のテーブルにおいては，行番
号を指定するものである。前フレームで対応する人物番
号および次フレームで対応する人物番号とは，現フレー
ムのある人物が前フレーム，および，次フレームでどの
人物と対応しているかを示す番号である。The ID number is unique to a person; when the same person is present over multiple frames, that person is given the same number in all of those frames. The person number is a serial number given to each person detected in a single frame, and it specifies the row number in the table in the person state table memory 407. The corresponding person numbers for the previous and next frames indicate which person in the previous frame and in the next frame a given person in the current frame corresponds to.
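
One row of the person state table, together with the connection state numbers of paragraphs 0067 and 0068, could be modeled as in the sketch below. The class and field names are hypothetical; the use of -1 for "no counterpart" follows the convention the text applies at the start and end frames, and `None` is used here to mark a field that is still undetermined.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class ForwardState(IntEnum):
    """Connection state with the previous frame."""
    APPEAR = 1
    MOVE = 2
    SPLIT = 3
    MERGE = 4


class BackwardState(IntEnum):
    """Connection state with the next frame."""
    DISAPPEAR = 1
    MOVE = 2
    SPLIT = 3
    MERGE = 4


@dataclass
class PersonState:
    """One row of the person state table; the row index is the person number."""
    id_number: int
    forward_state: Optional[ForwardState] = None    # None while undetermined
    backward_state: Optional[BackwardState] = None  # None while undetermined
    forward_person: int = -1    # person number in the previous frame, -1 if none
    backward_person: int = -1   # person number in the next frame, -1 if none
```

A frame's table is then simply a list of `PersonState` rows, and holding two such lists corresponds to the two-table scheme used by the algorithm.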

【００７１】なお，本発明において，図１２に示すよう
な対応関係は禁止するとする。この禁止対応関係には
（ａ）Ｚ型，（ｂ）Ｘ型が存在し，これは，二つのフレ
ームにおいて，二人の人物が並走するような場合，人物
の移動の他に，一つのフレームで隣合う二人の人物間
で，それぞれ隠し持っている別の人物（後ろに隠れてい
る）の交換が行われることはないと規定するものであ
る。図１１に示す人物状態テーブルのようなテーブル構
造を用いると，このような禁止対応関係は表現できな
い。In the present invention, correspondences like those shown in FIG. 12 are prohibited. The prohibited correspondences are of (a) Z type and (b) X type. The rule states that, when two persons run side by side across two frames, no exchange of another person that each is concealing (hidden behind them) takes place between the two adjacent persons within a single frame in addition to the movement of the persons themselves. With a table structure like the person state table shown in FIG. 11, such prohibited correspondences cannot be represented.

【００７２】映像蓄積ファイルシステム３０１より与え
られる映像データの各フレームについて，前後のフレー
ム間での人物の対応関係を決定し，人物状態テーブルメ
モリ４０７にその結果を格納するアルゴリズムとして，
本発明では以下の手法を用いることができる。In the present invention, the following method can be used as the algorithm that, for each frame of the video data supplied from the video storage file system 301, determines the correspondence of persons between the preceding and following frames and stores the result in the person state table memory 407.

【００７３】まず，映像データについて，処理を開始す
るフレーム番号，終了するフレーム番号，および，処理
を行う間隔を示すステップ数が利用者より与えられる。
本手法では，人物状態テーブルメモリ４０７に，２フレ
ーム分の人物状態テーブルを保持する。それぞれ，第１
の人物状態テーブル，第２の人物状態テーブルと呼ぶ。First, for the video data, the user specifies the frame number at which processing starts, the frame number at which it ends, and the number of steps giving the interval at which processing is performed. In this method, the person state table memory 407 holds person state tables for two frames, called the first person state table and the second person state table.

【００７４】図１３および図１４に，トラッキング処理
が進行して，人物の状態が決定していく様子を便宜的に
示す。図１３（ａ）には，あるフレーム上の一人の人物
について，人物状態テーブルの内容がどのように説明さ
れるかを示したものであり，円形の上部にＩＤ番号，円
形の左下部には前フレームとの接続状態番号（前方接続
状態番号），円形の右下部には次フレームとの接続状態
番号（後方接続状態番号）が表現される。また，未決定
の場合には，それぞれ空白となる。斜体の番号は，該当
のフレームおいて付与された番号を示し，ローマン体の
番号は，既に決定されている番号を示す。また，円形が
点線で構成される場合には，既に状態等が決定し，その
内容が出力され，人物状態テーブルからは消去されてい
ることを意味する。また，前フレームでの対応人物，お
よび，次フレームでの対応人物は，円形の左側に接続さ
れる矢印の元，および，円形の右側から伸る矢印の先と
して示される。ここで人物間の対応（接続）には，優勢
対応（強接続）と，弱接続があり，強接続は，同一人物
の移動を表し，太線で記される。これにより示される人
物が人物状態テーブルの該当欄に記憶される人物であ
る。また，弱接続にある人物は，別の人物が合体し，背
後に隠れる場合や，背後に隠れていた人物が次のフレー
ムでは出てくるようなケースを表す。FIGS. 13 and 14 illustrate, for convenience, how the state of each person is determined as the tracking process proceeds. FIG. 13(a) shows how the contents of the person state table are depicted for one person in a frame: the ID number appears in the upper part of the circle, the connection state number with the previous frame (forward connection state number) in the lower left, and the connection state number with the next frame (backward connection state number) in the lower right. Each is left blank while undetermined. Numbers in italics are numbers assigned in the frame in question, and numbers in roman type are numbers that have already been determined. A circle drawn with a dotted line means that the state has already been determined, its contents have been output, and it has been erased from the person state table. The corresponding person in the previous frame and the corresponding person in the next frame are shown as the origin of the arrow connected to the left side of the circle and the tip of the arrow extending from the right side of the circle, respectively. The correspondence (connection) between persons is either a dominant correspondence (strong connection) or a weak connection. A strong connection represents the movement of the same person and is drawn with a thick line; the person it points to is the one recorded in the corresponding field of the person state table. A weak connection represents cases such as another person merging and hiding behind, or a person who was hidden behind emerging in the next frame.

【００７５】まず，開始フレーム（フレーム１）の段階
においては，全ての人物が新たに出現したとして，第１
の人物状態テーブルの前方接続状態番号欄に１を書き込
む。また，ＩＤ番号欄には，新しいＩＤ番号を付与す
る。また，前方対応人物番号欄には，対応先がないこと
を示す−１を書き込む。この様子を図１３（ｂ）に示
す。First, at the start frame (frame 1), all persons are treated as newly appeared, and 1 is written in the forward connection state number field of the first person state table. A new ID number is assigned in the ID number field, and -1, indicating that there is no counterpart, is written in the forward corresponding person number field. This is shown in FIG. 13(b).

【００７６】次に，フレーム２の処理に移行する。その
際には，人物状態テーブルメモリ４０７に第２の人物状
態テーブルを確保する。まず，現フレーム（フレーム
２）の各人物について，前フレーム（フレーム１）の各
人物の中で最もトータルの遷移コストの小さい人物を選
択する。ここで，トータル遷移コストは，遷移コストマ
トリクスメモリ４０６中に格納されている移動コストと
変化コストを用いて，トータル遷移コスト＝移動コスト
＋ｗ×変化コストのように計算される。ここで，ｗは重
み付けの係数である。なおその際には，許容する移動コ
スト，および，変化コストに上限Ｅ１，Ｅ２を設け，こ
の条件を満たさない場合には，対応関係は存在しないも
のとする。Next, processing moves to frame 2. At this point, the second person state table is allocated in the person state table memory 407. First, for each person in the current frame (frame 2), the person with the smallest total transition cost is selected from the persons in the previous frame (frame 1). The total transition cost is calculated from the movement cost and the change cost stored in the transition cost matrix memory 406 as total transition cost = movement cost + w × change cost, where w is a weighting coefficient. Upper limits E1 and E2 are placed on the allowed movement cost and change cost, respectively, and when these conditions are not met, no correspondence is considered to exist.
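
The selection step of paragraph 0076 might be sketched as follows, with rows indexing previous-frame persons and columns indexing current-frame persons as in the transition cost matrix. The function name `best_previous` and the return convention (-1 when no candidate passes the limits) are assumptions.

```python
import math
from typing import List


def best_previous(
    move_cost: List[List[float]],    # move_cost[i][j]: previous person i -> current person j
    change_cost: List[List[float]],  # same shape as move_cost
    w: float,                        # weighting coefficient
    e1: float,                       # upper limit on the movement cost
    e2: float,                       # upper limit on the change cost
) -> List[int]:
    """For each current-frame person j, return the previous-frame person i with
    the smallest total transition cost, or -1 if no pair passes the limits."""
    n_prev = len(move_cost)
    n_cur = len(move_cost[0]) if n_prev else 0
    result = []
    for j in range(n_cur):
        best_i, best_total = -1, math.inf
        for i in range(n_prev):
            m, c = move_cost[i][j], change_cost[i][j]
            if m > e1 or c > e2:
                continue  # outside the allowed limits: no correspondence
            total = m + w * c
            if total < best_total:
                best_i, best_total = i, total
        result.append(best_i)
    return result
```

Costs that were set to infinity during gating fail the limit checks automatically, so they never produce a correspondence.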

【００７７】この処理が終了した段階において，図１４
（ａ）のように，前フレームの一人の人物から，現フレ
ームの複数の人物へ対応関係が存在する場合がある。こ
の場合，トータルの遷移コストが最も小さい対応を優勢
対応として，前フレームの人物のＩＤ番号を現フレーム
の人物に継承し，第２の人物状態テーブルの該当人物の
欄にこのＩＤ番号を記録する。また，同時に，前フレー
ムの後方対応人物番号欄に，対応する現フレームの人物
番号を記入し，現フレームの前方対応人物番号欄に，対
応する前フレームの人物番号を記入する。また，優勢対
応ではない弱接続している現在フレームの人物について
も，前方対応人物番号欄に，対応している前フレームの
人物番号を記入する。When this process is complete, there may be correspondences from one person in the previous frame to multiple persons in the current frame, as in FIG. 14(a). In that case, the correspondence with the smallest total transition cost is taken as the dominant correspondence, the ID number of the previous-frame person is inherited by the current-frame person, and this ID number is recorded in the field for that person in the second person state table. At the same time, the number of the corresponding current-frame person is entered in the backward corresponding person number field of the previous frame, and the number of the corresponding previous-frame person is entered in the forward corresponding person number field of the current frame. For current-frame persons that are weakly connected rather than dominant, the number of the corresponding previous-frame person is likewise entered in the forward corresponding person number field.

【００７８】次に，前フレームの人物において，現フレ
ームの人物と対応関係が確立されていない人物が存在す
る場合，その人物それぞれについて，現フレーム上の人
物の中から，遷移コストの中でも，移動コストの上限Ｅ
１の条件を満たし，かつ，最も小さい人物が存在する場
合，前フレームの後方対応人物番号欄に，現フレームの
人物番号を記入する。Next, if any person in the previous frame has no established correspondence with a person in the current frame, then for each such person, if there is a person in the current frame whose movement cost satisfies the upper limit E1 and is the smallest, the number of that current-frame person is entered in the backward corresponding person number field of the previous frame.

【００７９】次に，これまでに現在フレームの人物でＩ
Ｄ番号が付与されていない人物について，新規のＩＤ番
号を付与し，対応する第２の人物状態テーブルの欄に記
録する。以上で確立されたフレーム間の人物の対応関係
を元に，前フレームの人物の後方接続状態番号を判定
し，第１の人物状態テーブルの該当欄に記入し，現フレ
ームの人物の前方接続状態番号を判定し，第２の人物状
態テーブルの該当欄に記入する。Next, a new ID number is assigned to each person in the current frame who has not yet been given an ID number, and it is recorded in the corresponding field of the second person state table. Based on the inter-frame correspondences established so far, the backward connection state number of each previous-frame person is determined and entered in the corresponding field of the first person state table, and the forward connection state number of each current-frame person is determined and entered in the corresponding field of the second person state table.
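
The ID inheritance of paragraph 0077 and the assignment of new IDs in paragraph 0079 could be combined as in the sketch below. It covers only the ID numbers; the corresponding person numbers and connection state numbers are omitted. The function name `assign_ids` and its argument layout are hypothetical.

```python
from typing import Dict, List


def assign_ids(
    best_prev: List[int],      # best_prev[j]: chosen previous person for current person j, -1 if none
    total_cost: List[float],   # total_cost[j]: total transition cost of that choice
    prev_ids: List[int],       # ID numbers of the previous-frame persons
    next_id: int,              # first unused ID number
) -> List[int]:
    """Return the ID number of every current-frame person."""
    # For each previous person, find the current person with the smallest cost
    # (the dominant correspondence).
    winner: Dict[int, int] = {}
    for j, i in enumerate(best_prev):
        if i >= 0 and (i not in winner or total_cost[j] < total_cost[winner[i]]):
            winner[i] = j
    ids = []
    for j, i in enumerate(best_prev):
        if i >= 0 and winner[i] == j:
            ids.append(prev_ids[i])  # dominant: inherit the previous ID
        else:
            ids.append(next_id)      # weakly connected or unmatched: new ID
            next_id += 1
    return ids
```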

[0080] Finally, the state information of the persons in the previous frame (frame 1) recorded in the first person state table is output to the person detection result file system 303, and the contents of the first person state table are erased.

[0081] Next, processing moves to frame 3, which is handled in the same way as frame 2, as shown in FIG. 14(b). Here, the states of the persons in frame 3 are stored in the first person state table. As a result of the processing at frame 3, the state information of the persons in frame 2 held in the second person state table is output to the person detection result file system 303, appended to the information output at the frame 2 stage.
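The alternating use of the two person state tables can be pictured with the following sketch. It is only an illustration under assumptions: `run_tracking`, the `link` callback, and the list standing in for the person detection result file system 303 are all hypothetical names.

```python
def run_tracking(frames, link):
    """Alternate two state tables while stepping through the frames.

    frames: list of per-frame person records (list of dicts).
    link:   callback link(prev_table, cur_table) that fills in the
            correspondence fields of both tables in place (hypothetical).
    Returns the records in the order they would be appended to the
    person detection result file.
    """
    tables = [None, None]   # index 0: first table, index 1: second table
    written = []            # stands in for file system 303
    tables[0] = [dict(p) for p in frames[0]]
    for i in range(1, len(frames)):
        cur, prev = i % 2, (i - 1) % 2
        tables[cur] = [dict(p) for p in frames[i]]
        link(tables[prev], tables[cur])
        written.extend(tables[prev])   # append the previous frame's states
        tables[prev] = None            # erase the table so it can be reused
    # The table holding the end frame is written out last.
    written.extend(tables[(len(frames) - 1) % 2])
    return written


calls = []
out = run_tracking(
    [[{"frame": 1}], [{"frame": 2}], [{"frame": 3}]],
    lambda prev, cur: calls.append((prev[0]["frame"], cur[0]["frame"])),
)
```

With three frames, frame 1 sits in the first table, frame 2 in the second, and frame 3 again in the first, which matches the order described above.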

[0082] The above processing is repeated at the specified frame step until the end frame. In the processing at the end frame, every person is assumed to disappear: the backward corresponding person number is set to -1 and the backward connection state number is set to 1, and the result is output to the person detection result file system 303. The above method is only one example; other methods can also be used.

[0083] The person transition search unit 207 reads the person detection results from the person detection result file system 303 and stores them in the person transition structure memory 408. Using those contents as input, it searches for persons that continue across multiple frames and records the result in the tracking output file system 304.

[0084] The goal of the person transition search unit 207 is to produce the final output of the system: the position and size of each individual person in each frame, obtained by tracking that person on the image.

[0085] One way to achieve this is as follows. First, the person detection results are read from the person detection result file system 303 and stored in the person transition structure memory 408. The person detection result file system 303 holds the data output by the inter-frame correspondence determination unit 206. This data describes the persons detected at a fixed frame step from the start frame to the end frame of the video data in the video storage file system 301, together with the correspondences between persons across frames.

[0086] The data in the person transition structure memory 408 has a graph structure such as that shown in FIG. 15. FIG. 15 plots the frame number on the horizontal axis and the ID number on the vertical axis; each person is represented as a vertex, and each correspondence between persons in different frames is represented as an edge. Using this graph data, the person transition search unit 207 follows each individual person across multiple frames and outputs movement trajectory data for each person.
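One possible in-memory form of this graph is sketched below. The `Vertex` class, the `build_graph` function, and the record layout are assumptions made for illustration, as is a frame step of 1; the value -1 for "no backward correspondent" follows the end-frame convention in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    frame: int        # frame number (horizontal axis in FIG. 15)
    number: int       # person number within the frame
    id_number: int    # ID number (vertical axis in FIG. 15)
    successors: list = field(default_factory=list)  # keys in the next frame


def build_graph(records):
    """Build a {(frame, person_number): Vertex} map from detection records.

    records: iterable of (frame, number, id_number, backward_numbers),
             where backward_numbers lists the corresponding person numbers
             in the next frame (hypothetical layout).
    """
    graph = {}
    for frame, number, id_number, backward in records:
        edges = [(frame + 1, b) for b in backward if b >= 0]
        graph[(frame, number)] = Vertex(frame, number, id_number, edges)
    return graph


graph = build_graph([
    (1, 0, 1, [0, 1]),   # person splits into two in frame 2
    (2, 0, 1, [-1]),
    (2, 1, 2, [-1]),
])
```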

[0087] One concrete method is the following. First, every vertex of the graph data in the person transition structure memory 408 is visited to find those whose forward connection state number is 1. These correspond to newly appearing persons. Starting from each such person, the following conditions are tested.

[0088]
Condition 1: the backward connection state number is 1 (the person disappears in the next frame).
Condition 2: the backward connection state number is 4 (the person merges in the next frame).
Condition 3: the current ID number differs from the ID number of the corresponding person in the next frame (the correspondence is not dominant).
Condition 4: (Condition 1) or ((Condition 2) and (Condition 3)).
Condition 5: Condition 4 does not hold.

If Condition 5 holds, the search advances to the backward corresponding person and the same conditions are tested again. The persons passed during the search are stored as a list. If Condition 4 holds, the search from that starting person stops. At that point, a list of persons recording the subsequent trajectory of the starting person has been generated, and the coordinates and size on the image are output to the tracking output file system 304 for each frame.
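The search governed by Conditions 1 to 5 can be sketched as a simple loop. The state numbers 1 and 4 (backward) and 1 (forward) come from the text; everything else is assumed for illustration, including the dictionary layout, the helper names `trace` and `person`, the placeholder state `OTHER`, and the guard for a missing successor.

```python
DISAPPEAR, MERGE = 1, 4   # backward connection state numbers from the text
NEW = 1                   # forward connection state number from the text
OTHER = 0                 # placeholder for any other state (assumption)


def trace(graph, start):
    """Follow one person from `start` until Condition 4 holds."""
    path = [start]
    key = start
    while True:
        v = graph[key]
        nxt = v["next"]
        cond1 = v["backward_state"] == DISAPPEAR
        cond2 = v["backward_state"] == MERGE
        cond3 = nxt is not None and graph[nxt]["id"] != v["id"]
        cond4 = cond1 or (cond2 and cond3)
        if cond4 or nxt is None:   # the None guard is an added safety check
            return path
        key = nxt                  # Condition 5: advance to the successor
        path.append(key)


def person(pid, fwd, bwd, nxt):
    return {"id": pid, "forward_state": fwd, "backward_state": bwd, "next": nxt}


graph = {
    (1, 0): person(1, NEW, OTHER, (2, 0)),
    (2, 0): person(1, OTHER, OTHER, (3, 0)),
    (3, 0): person(1, OTHER, MERGE, (4, 0)),   # merges, not dominant
    (1, 1): person(2, NEW, OTHER, (2, 1)),
    (2, 1): person(2, OTHER, OTHER, (3, 1)),
    (3, 1): person(2, OTHER, MERGE, (4, 0)),   # merges, dominant
    (4, 0): person(2, OTHER, DISAPPEAR, None),
}
starts = [k for k, v in graph.items() if v["forward_state"] == NEW]
tracks = {s: trace(graph, s) for s in starts}
```

In this example the trajectory of ID 1 ends at frame 3, where it merges into a person with a different ID, while the trajectory of ID 2 continues through the merge and ends when the person disappears at frame 4.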

[0089] Next, the graph data in the person transition structure memory 408 is searched for persons that have split off from another person and whose correspondence with the person in the previous frame is not dominant, that is, persons satisfying the following Condition 6.

Condition 6: (the forward connection state number is 3; the person branched in the previous frame) and (the current ID number differs from the ID number of the corresponding person in the previous frame; the correspondence is not dominant).

Starting from each matching person, Conditions 4 and 5 above are tested in the same way as in the processing described above. The length of each trajectory list obtained from these searches is then checked, and trajectories that are extremely short are not output. This reduces the number of false detections of non-person objects such as debris on the field.
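A sketch of selecting the Condition 6 starting points and discarding very short trajectories is given below. The forward connection state number 3 is taken from the text; the function names, the dictionary layout, and the `min_length` parameter are assumptions, since the text does not say how short "extremely short" is.

```python
SPLIT = 3   # forward connection state number from the text


def split_start_points(graph):
    """Return keys of persons that satisfy Condition 6."""
    starts = []
    for key, v in graph.items():
        prev = v["prev"]
        branched = v["forward_state"] == SPLIT
        not_dominant = prev is not None and graph[prev]["id"] != v["id"]
        if branched and not_dominant:
            starts.append(key)
    return starts


def drop_short_tracks(tracks, min_length):
    """Keep only trajectories with at least min_length entries."""
    return [t for t in tracks if len(t) >= min_length]


graph = {
    (1, 0): {"id": 1, "forward_state": 1, "prev": None},
    (2, 0): {"id": 1, "forward_state": SPLIT, "prev": (1, 0)},  # dominant
    (2, 1): {"id": 2, "forward_state": SPLIT, "prev": (1, 0)},  # split off
}
starts = split_start_points(graph)
kept = drop_short_tracks([[(5, 0)], [(1, 0), (2, 0), (3, 0)]], min_length=2)
```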

[0090] When the trajectory of each person is output, the forward connection state number at its start point and the backward connection state number at its end point are output with it.

[0091] Through the search processing described above, the movement trajectory of each person, as in FIG. 16, is obtained from a person state transition structure such as that in FIG. 15, and the correspondences between persons can be described explicitly.

[0092] In the method above, the person detection and inter-frame correspondence results are read through the person detection result file system 303. Alternatively, the data can be written directly from the person state table memory 407 to the person transition structure memory 408 without going through the person detection result file system 303.

[0093] The display unit 1000 reads the person movement trajectory data recorded in the tracking output file system 304 and outputs it to a display device. One possible display form, shown in FIG. 16, is a graph structure that shows the time interval during which each person is present in the image, the points of appearance and disappearance, and relations such as merging and splitting among multiple persons.

[0094] In addition, as shown in FIG. 17, the circumscribed rectangle of each detected person can be drawn on the video image (801 in FIG. 17), the image brightness inside the circumscribed rectangle can be increased (802 in FIG. 17), and the trajectory along which a person has moved over several past frames can be displayed by overlaying circumscribed rectangles or arrows (803 in FIG. 17).

[0095] If a name or uniform number is assigned manually to each person, the person can also be displayed with that number or name attached (804).

[0096] Furthermore, changing the color of the circumscribed rectangle at the moments when appearance, disappearance, merging, or splitting occurs makes the state of each person easy for the user to recognize.

[0097] The processing of the processing unit 200 shown in FIG. 1 can be implemented, for example, by a CPU (not shown) and a software program, and the software program can be installed from various recording media or over a communication line. The CPU loads the installed software program into the memory 400 and executes it, thereby carrying out the series of methods described above.

[0098]

[Effects of the Invention] As described above, the present invention detects objects in each image frame of a video, calculates the transition cost between objects in different image frames, performs an association process that allows varied correspondences such as one-to-many and many-to-one, and uses the result to determine the state of each object: movement, merging, splitting, or disappearance. This makes it possible to track objects stably and to output information on the merging and separation of multiple objects, even in video where several persons are crowded together, as in a soccer match.

[0099] Large changes in the shape of an object between frames are a weakness of conventional tracking methods based on template matching. Even in that case, the present invention examines the possibility of movement for every combination of objects between frames and calculates a transition cost for each, so accurate tracking is possible even for objects whose shape changes greatly.

[Brief description of the drawings]

FIG. 1 is a diagram showing a system configuration example of the present invention.

FIG. 2 is a flowchart showing a processing example of the system shown in FIG. 1.

FIG. 3 is a diagram illustrating the histogram and the method of determining the field color range.

FIG. 4 is a diagram showing objects on an image frame.

FIG. 5 is a diagram illustrating the field region shaping process.

FIG. 6 is a diagram showing an example of an image of an extracted field region.

FIG. 7 is a diagram showing an example of an image of extracted person regions.

FIG. 8 is a diagram showing a person region and its circumscribed rectangle.

FIG. 9 is a diagram illustrating the acquisition of color information for determining a person's attribute.

FIG. 10 is a diagram illustrating the connection relationships between a person and the preceding and following frames.

FIG. 11 is a diagram showing an example of data stored in the person state table.

FIG. 12 is a diagram showing forbidden correspondences.

FIG. 13 is a diagram showing the progress of the tracking process.

FIG. 14 is a diagram showing the progress of the tracking process.

FIG. 15 is a diagram showing a person state transition structure.

FIG. 16 is a diagram showing an example of the person transition structure obtained by the person transition search.

FIG. 17 is a diagram showing an example of the display of person tracking results.

[Explanation of symbols]

100 input unit
101 imaging device
102 video storage and playback device
103 video capture device
200 processing unit
201 histogram calculation unit
202 field region extraction unit
203 person region extraction unit
204 person detection unit
205 transition cost calculation unit
206 inter-frame correspondence determination unit
207 person transition search unit
300 file system
301 video storage file system
302 person color model file system
303 person detection result file system
304 tracking output file system
400 memory
401 image frame memory
402 histogram memory
403 field region image frame memory
404 person region image frame memory
405 person information memory
406 transition cost matrix memory
407 person state table memory
408 person transition structure memory
1000 display unit

Continued from the front page

(51) Int. Cl.7      Identification symbol   FI                 Theme code (reference)
G06T 7/60           150                     G06T 7/60 150J
H04N 7/18                                   H04N 7/18 K

(72) Inventor: Masashi Morimoto, c/o Nippon Telegraph and Telephone Corporation, 2-3-1 Otemachi, Chiyoda-ku, Tokyo
(72) Inventor: Haruhiko Kojima, c/o Nippon Telegraph and Telephone Corporation, 2-3-1 Otemachi, Chiyoda-ku, Tokyo

F-terms (reference):
5B057 BA02 CA01 CA08 CA12 CA16 CB01 CB06 CB08 CB12 CB16 CC03 CD01 CE09 CE11 CE12 CH11 CH12 DA07 DA08 DB02 DB06 DB08 DB09 DC03 DC04 DC23
5C054 FC04 FC08 FC12 FC13 FC16 GB12 GB14 GB17 HA05
5L096 AA02 AA06 CA02 EA14 EA17 EA35 EA37 EA43 FA14 FA15 FA18 FA33 FA34 FA37 FA42 FA59 FA66 FA69 GA41 HA03 HA05 LA05

Claims

[Claims]

1. An image analysis system that tracks an object over a plurality of image frames constituting a video and outputs information including the position of the object on the image in each image frame, the system comprising: means for extracting the image area occupied by the object in each image frame constituting the video; means for obtaining the position and size of the object from the image area occupied by the object; means for calculating, for the objects detected in two adjacent frames, a transition cost for each pair of objects on the two frames, the transition cost being that of the object in the earlier frame moving to the object in the later frame; means for classifying the correspondence between the objects on the two frames based on the transition cost and determining the state of each object; and means for tracking the movement trajectory of each individual object over a plurality of image frames by using the object correspondences calculated between all adjacent frames in a video section consisting of a plurality of image frames.

2. The image analysis system according to claim 1, wherein the means for extracting the image area occupied by the object in the image frame comprises: means for calculating a histogram for each color component of the pixel values constituting the image; means for calculating the range of color components of the background region of the object by finding the color component value at which the histogram takes its maximum and the color component values at the valleys adjacent to the peak having that maximum; means for calculating the proportion of pixel values in a neighboring region that fall within the range of color components of the background region; means for calculating the ratio between the variance of the color components in the neighboring region and the variance of the color components in the entire background region; and means for judging that the target pixel is included in the target area when the calculated variance ratio is equal to or less than a predetermined value and the proportion of pixel values in the neighboring region that fall within the color component range of the background region is equal to or greater than a predetermined value, and for extracting the target area, which is the set of such pixels, as a binary image.

3. The image analysis system according to claim 1, wherein the means for acquiring the position and size of the object from the image area occupied by the object comprises: means for applying a distance transform to the image that holds the area occupied by the object as a binary image and calculating a distance image; means for detecting the coordinates of local peaks of the distance value in the distance image; means for calculating a circumscribed rectangle fitted to the object region around the coordinates at which the distance value peaks, and outputting the center position of the circumscribed rectangle as the center coordinates of the object; and means for calculating and outputting the size of the object as a height.

4. The image analysis system according to claim 1, claim 2, or claim 3, wherein the means for calculating the transition cost comprises: means for calculating, as the transition cost related to the magnitude of the movement of the object, the distance between the objects on the two frames in both the horizontal and vertical directions after correcting the shift of the entire image between the frames caused by movement of the camera viewpoint, with the horizontal distance normalized by the width of the object and the vertical distance normalized by the height of the object; and means for calculating, as the transition cost related to changes in the shape and color of the object, the value that minimizes the error between the pixel value distributions when the spatial distribution of pixel values of the object on one frame is superimposed on the object on the other frame.

5. The image analysis system as described in the preceding claims, wherein the means for classifying the correspondence between the objects on the two frames and determining the state of each object comprises: means for determining, for the objects on the two frames, that a correspondence exists between each pair of objects whose transition cost is equal to or less than a predetermined value; means for classifying the correspondence between the objects on the two frames as one of the following: a single object in the earlier frame corresponds to a single object in the later frame; multiple objects in the earlier frame correspond to a single object in the later frame; a single object in the earlier frame corresponds to multiple objects in the later frame; an object exists in the earlier frame but no corresponding object exists in the later frame; or an object exists in the later frame but no corresponding object exists in the earlier frame; and means for judging the state in which a single object in the earlier frame moves to a single object in the later frame as simple movement, the state in which multiple objects in the earlier frame move to a single object in the later frame as merging of objects, the state in which a single object in the earlier frame moves to multiple objects in the later frame as splitting of an object, the state in which an object exists in the earlier frame but has no corresponding object in the later frame as disappearance of the object, and the state in which an object exists in the later frame but has no corresponding object in the earlier frame as appearance of the object.
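The threshold-based classification recited above can be illustrated with the sketch below. It is not the claimed means itself: the function name `classify_states`, the cost matrix layout, the string labels, and the handling of mixed many-to-many cases are assumptions made for illustration.

```python
def classify_states(cost, threshold):
    """Classify object states from a transition cost matrix.

    cost[i][j]: transition cost from earlier-frame object i to
                later-frame object j (hypothetical layout).
    A pair corresponds when its cost is at or below `threshold`.
    Returns (states of earlier-frame objects, states of later-frame objects).
    """
    n_prev = len(cost)
    n_next = len(cost[0]) if cost else 0
    link = [[cost[i][j] <= threshold for j in range(n_next)]
            for i in range(n_prev)]
    out_deg = [sum(link[i]) for i in range(n_prev)]
    in_deg = [sum(link[i][j] for i in range(n_prev)) for j in range(n_next)]

    prev_states = []
    for i in range(n_prev):
        if out_deg[i] == 0:
            prev_states.append("disappear")
        elif out_deg[i] > 1:
            prev_states.append("split")
        else:
            j = link[i].index(True)
            prev_states.append("merge" if in_deg[j] > 1 else "move")

    next_states = []
    for j in range(n_next):
        if in_deg[j] == 0:
            next_states.append("appear")
        elif in_deg[j] > 1:
            next_states.append("merge")
        else:
            i = [link[r][j] for r in range(n_prev)].index(True)
            next_states.append("split" if out_deg[i] > 1 else "move")
    return prev_states, next_states


prev_states, next_states = classify_states(
    [[0.5, 0.7, 9.0],
     [9.0, 9.0, 9.0]],
    threshold=1.0,
)
```

Here the first earlier-frame object splits into two later-frame objects, the second disappears, and the third later-frame object has no predecessor and is treated as an appearance.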

6. The image analysis system according to any one of claims 1 to 5, wherein the means for tracking the movement trajectory of each individual object over a plurality of image frames comprises: means for receiving data on the existence of objects over a plurality of frames and on the correspondences of objects between frames, and storing the data as a graph-structured data structure in which the objects detected in each image frame are vertices and the correspondences between objects in adjacent frames are edges; and means for acquiring the movement trajectory of each object by searching the graph, the search being performed from a starting point until a state immediately before disappearance or a merged state is detected, and, with each object in the state immediately after splitting taken as a starting point, from that starting point until a state immediately before disappearance or a merged state is detected.

7. An image analysis method for tracking an object over a plurality of image frames constituting a video and outputting information including the position of the object on the image in each image frame, the method comprising: a step of extracting the image area occupied by the object in each image frame constituting the video; a step of obtaining the position and size of the object from the image area occupied by the object; a step of calculating, for the objects detected in two adjacent frames, a transition cost for each pair of objects on the two frames, the transition cost being that of the object in the earlier frame moving to the object in the later frame; a step of classifying the correspondence between the objects on the two frames based on the transition cost and determining the state of each object; and a step of tracking the movement trajectory of each individual object over a plurality of image frames by using the object correspondences calculated between all adjacent frames in a video section consisting of a plurality of image frames.

8. A recording medium on which is recorded an image analysis program for tracking an object over a plurality of image frames constituting a video and outputting information including the position of the object on the image in each image frame, the program causing a computer to execute: a process of extracting the image area occupied by the object in each image frame constituting the video; a process of obtaining the position and size of the object from the image area occupied by the object; a process of calculating, for the objects detected in two adjacent frames, a transition cost for each pair of objects on the two frames, the transition cost being that of the object in the earlier frame moving to the object in the later frame; a process of classifying the correspondence between the objects on the two frames based on the transition cost and determining the state of each object; and a process of tracking the movement trajectory of each individual object over a plurality of image frames by using the object correspondences calculated between all adjacent frames in a video section consisting of a plurality of image frames.