JP3520050B2

JP3520050B2 - Moving object tracking device

Info

Publication number: JP3520050B2
Application number: JP2001008682A
Authority: JP
Inventors: 章内海
Original assignee: ATR Advanced Telecommunications Research Institute International
Current assignee: ATR Advanced Telecommunications Research Institute International
Priority date: 2001-01-17
Filing date: 2001-01-17
Publication date: 2004-04-19
Anticipated expiration: 2021-01-17
Also published as: JP2002218449A

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a moving object tracking device, and more particularly to a moving object tracking device that tracks moving objects using multi-view images obtained asynchronously.

[0002]

[Prior Art] Stereo measurement, which obtains depth information about a scene from multiple images taken from different viewpoints, occupies an important position in computer vision research, where recovering three-dimensional information from images is one of the main problems.

[0003] Particularly important in stereo measurement is the correspondence between images. To reduce false correspondences, methods have been developed that select features appropriately (Marr D. and Poggio T. A theory of human stereo vision. In Roy. Soc. London, pp. 301-328, 1979) or that make effective use of more viewpoints (Ramakant Nevatia. Depth measurement by motion stereo. Computer Graphics and Image Processing, vol. 5, pp. 203-214, 1976). These methods mainly target static scenes, and many reports already exist on the error of the stereo method in that case (Jeffrey J. Rodriguez and J. K. Aggarwal. Stochastic analysis of stereo quantization error. IEEE Pattern Anal. Machine Intell., Vol. 12, No. 5, pp. 467-470, 5 1992).

[0004] Research targeting scenes with motion has proposed robot navigation based on stereo observation (Larry Matthies and Steven A. Shafer. Error modeling in stereo navigation. IEEE Robotics and Automation, vol. RA-3, No. 3, pp. 239-248, 1987), tracking of known shapes with a single camera (Akio Kosaka and Goichi Nakazawa. Vision-based motion tracking of rigid objects using prediction of uncertainties. In Proc. of International Conference on Robotics and Automation, pp. 2637-2644, 1995), tracking of known shapes by stereo measurement (Gem-Sun J. Young and Rama Chellappa. 3-D motion estimation using a sequence of noisy stereo images: Models, estimation, and uniqueness results. IEEE Pattern Anal. Machine Intell., Vol. 12, No. 8, pp. 735-759, 1990), and tracking of multiple objects based on stereo observation (Jae-Woong Yi and Jun-Ho Oh. Recursive resolving algorithm for multiple stereo and motion matches. Image and Vision Computing, Vol. 15, pp. 181-196, 1997). These problem settings, in which multiple objects are tracked while image features are associated with the object models of interest, are known as "motion correspondence" (Ingemar J. Cox. A review of statistical data association techniques for motion correspondence. International Journal of Computer Vision, Vol. 10:1, pp. 53-66, 1993).

[0005]

[Problems to Be Solved by the Invention] All of the conventional methods described above assume either that observations from multiple viewpoints are made simultaneously at regular intervals or that the scene is static (which is equivalent to simultaneous observation). When observations are made simultaneously, the same physical features are observed from every viewpoint, which makes motion correspondence easy.

[0006] However, even conventional methods that observe simultaneously still require "motion correspondence" between different times in order to track an object over time.

[0007] Furthermore, many conventional systems that use multi-view images assume that observations are made simultaneously across viewpoints in order to acquire position information during tracking. Consequently, when the images obtained at each viewpoint are processed, the overall processing speed is limited by the slowest process. Such systems also require a separate means of synchronizing the viewpoints in addition to the exchange of image information. These problems of conventional synchronization-based systems become more pronounced as the number of viewpoints increases.

[0008] The inventor of the present application therefore proposed, in Japanese Patent Application No. 11-308187, a "moving object tracking device" that performs feature extraction on images captured by a plurality of cameras. In that device, however, a viewpoint (camera position) that is advantageous for acquiring one type of information about the tracking target from an image is not necessarily advantageous for acquiring other types of information.

[0009] For example, when a person MAN is tracked by a camera 2 installed on the ceiling as shown in FIG. 25, the device detected both position information and object-specific information such as clothing color and height. It therefore could not handle camera arrangements in which only one of the two kinds of information can be obtained, and tracking stability was sometimes reduced.

[0010] That is, arrow a in FIG. 25 indicates that when the person MAN is directly below camera 2, the position can be detected with high accuracy and, even if two persons are present, they can be separated and detected with high accuracy. Arrow b indicates that the farther the person MAN is from camera 2, the better the information specific to that person can be detected and the more accurately the person can be identified.

[0011] A main object of the present invention is therefore to provide a moving object tracking device that can stably acquire position information and identify an object regardless of where the object is located.

[0012]

[Means for Solving the Problems] The present invention is a moving object tracking device that tracks a moving object moving in a scene from multiple viewpoints. The device comprises a plurality of imaging means for imaging downward from a plurality of viewpoints located at a constant height and at constant intervals in the scene, and a plurality of observation means provided in correspondence with the respective imaging means and operating independently of one another. Each of the observation means includes: position information estimation means for estimating position information of the moving object from an image signal of the moving object captured by the corresponding imaging means; weight information estimation means for estimating, based on the position information, the distance between the corresponding imaging means and the moving object and for estimating, according to the estimated distance, weight information indicating whether the moving object is closer to the area below or to the side of the corresponding imaging means; object identification information calculation means for performing feature point extraction on the image signal and calculating object identification information that identifies the moving object; and observation information transmission means for transmitting the position information, the object identification information, and the weight information as observation information. The moving object tracking device further comprises tracking means for integrating the position information, the object identification information, and the weight information transmitted from the observation information transmission means and predicting the state of the moving object.

[0013] Each of the observation means also includes association means for associating the moving object captured by the corresponding imaging means with each tracking target, based on a predicted observation position transmitted from the tracking means. The moving object tracking device further comprises discovery means for detecting a new moving object. The observation information transmission means transmits the observation information of an associated moving object to the tracking means and transmits the observation information of an unassociated moving object to the discovery means. The tracking means updates the information on each moving object based on the new moving objects that the discovery means has discovered from the unassociated observation information.

[0014] Each of the observation means further includes region segmentation means for dividing the image captured by the corresponding imaging means into a background image and a moving object region. The object identification information calculation means includes means for obtaining, as a feature point, the centroid point of the moving object region extracted by the region segmentation means.

[0015] Furthermore, the tracking means is implemented as a computer, each of the observation means is implemented as a computer, and each of the observation means and the tracking means communicate with each other.

[0016]

[Embodiments of the Invention] FIG. 1 is a layout diagram of the cameras in one embodiment of the present invention. In FIG. 1, a plurality of cameras 2#1, 2#2, and 2#3 are embedded at regular intervals in the ceiling of a room of constant height, with their imaging directions pointing downward. When a person is tracked with these ceiling-embedded cameras 2#1, 2#2, and 2#3, the information needed for tracking consists of position information and object-specific information such as clothing color and height.

[0017] To acquire position information, it is advantageous to use the camera directly above a person, because this avoids occlusion by other objects and allows multiple persons to be observed separately. For example, at a position below camera 2#2, position detection (separation) by camera 2#2 is easy, as explained with reference to FIG. 25, whereas the other cameras 2#1 and 2#3 observe from the side, which makes position detection difficult. Arrow a indicates that each of cameras 2#1, 2#2, and 2#3 detects the position of an object directly below it with high accuracy.

[0018] For object-specific information such as clothing color, height, and face image, on the other hand, cameras 2#1 and 2#3 are more advantageous than camera 2#2 because they observe from the side. Arrow b indicates that each of cameras 2#1, 2#2, and 2#3 acquires object-specific information well when it observes from the side.

[0019] As shown in FIG. 1, the present invention places cameras 2#1, 2#2, and 2#3 on the ceiling and switches which type of information each camera is used to acquire according to its distance from the person. This reduces the effect of occlusion over a wide area and makes it possible to perform both object identification and position detection stably.

[0020] FIG. 2 is a block diagram showing the overall configuration of a moving object tracking device 1 according to one embodiment of the present invention. Referring to FIG. 2, the moving object tracking device 1 includes cameras 2#1, 2#2, ..., 2#n arranged on the ceiling as shown in FIG. 1, observation units 4#1, 4#2, ..., 4#n, a discovery unit 6, and a tracking unit 8.

[0021] Observation units 4#1, 4#2, ..., 4#n are provided in correspondence with cameras 2#1, 2#2, ..., 2#n, respectively (hereinafter referred to generically as camera 2 and observation unit 4). Each observation unit 4#1, 4#2, ..., 4#n receives the imaging output of the corresponding camera 2#1, 2#2, ..., 2#n. The observation units 4#1, 4#2, ..., 4#n, the discovery unit 6, and the tracking unit 8 can each operate independently of one another. For example, they are implemented on different computers that are connected by a local area network (LAN).

[0022] In the figure, symbol A0 denotes the observation information of corresponded points (tracking targets) sent from the observation unit 4 to the tracking unit 8; symbol A1 denotes the observation information of uncorresponded points (points that could not be matched to a tracking target) sent from the observation unit 4 to the discovery unit 6; symbol A2 denotes the predicted position information sent from the tracking unit 8 to the observation unit 4; symbol A3 denotes the position information (initial value) of a new person sent from the discovery unit 6 to the tracking unit 8; and symbol A4 denotes the position information (after update) sent from the tracking unit 8 to the discovery unit 6.

[0023] When the corresponding camera images a person directly below it, the observation unit 4 acquires position information for that person; when another camera images that person from the side, information for identifying at least the face and clothing is acquired. Feature extraction is then performed on the acquired results.

[0024] The cameras are not limited to imaging a person directly below or a person from the side. It suffices that they can image, from above and from the side, for example a robot running out of control or a moving animal such as a dog or cat. Each observation unit 4 operates independently. The feature values obtained by the observation unit 4 (the centroid point and the distance transform value) are associated with tracking targets based on the predicted position information A2 sent from the tracking unit 8, described later, and are then sent to the tracking unit 8 together with the observation time. Feature values that could not be associated are sent to the discovery unit 6 as observation information A1 of uncorresponded points.

[0025] The discovery unit 6 uses the received observation information A1 of uncorresponded points to detect a person who has newly appeared in the scene (a new person). The detection result, namely the position information A3 of the new person, is sent to the tracking unit 8. The new person is thereby added as a new tracking target, and the tracking unit 8 starts tracking.

[0026] The tracking unit 8 takes the position information A3 of a new person as the initial value and the observation information A0 as input, updates the person's position information using a Kalman filter, and then predicts the position based on an observation model. The predicted position information A2 is sent to the observation unit 4. The updated position information A4 is sent to the discovery unit 6, as described later.
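The update-and-predict cycle of the tracking unit can be illustrated with a minimal sketch. It assumes constant-velocity motion, treats each ground-plane axis independently, and uses a white-acceleration process noise; the class name `AxisKalman` and the parameters `accel_var` and `meas_var` are hypothetical and do not come from the patent, which does not give its noise model at this point. Because observations arrive asynchronously, the time step is computed from the observation time rather than fixed.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class AxisKalman:
    """Constant-velocity Kalman filter for one ground-plane axis (illustrative only)."""
    pos: float
    vel: float = 0.0
    # Entries of the 2x2 state covariance [[p_pp, p_pv], [p_pv, p_vv]].
    p_pp: float = 1.0
    p_pv: float = 0.0
    p_vv: float = 1.0
    accel_var: float = 0.5  # assumed process noise (random acceleration variance)
    t: float = 0.0          # time of the last update

    def predict(self, t_new: float) -> Tuple[float, float]:
        """Return the predicted (mean, variance) of the position at t_new; state is unchanged."""
        dt = t_new - self.t
        mean = self.pos + self.vel * dt
        var = (self.p_pp + 2 * dt * self.p_pv + dt * dt * self.p_vv
               + 0.25 * dt ** 4 * self.accel_var)
        return mean, var

    def update(self, z: float, t_obs: float, meas_var: float) -> None:
        """Fold in one position observation z taken at time t_obs (t_obs >= self.t)."""
        dt = t_obs - self.t
        q = self.accel_var
        # Time update: x <- F x, P <- F P F^T + Q, with F = [[1, dt], [0, 1]].
        pos = self.pos + self.vel * dt
        p_pp = self.p_pp + 2 * dt * self.p_pv + dt * dt * self.p_vv + 0.25 * dt ** 4 * q
        p_pv = self.p_pv + dt * self.p_vv + 0.5 * dt ** 3 * q
        p_vv = self.p_vv + dt * dt * q
        # Measurement update with H = [1, 0] (only the position is observed).
        s = p_pp + meas_var
        k_pos, k_vel = p_pp / s, p_pv / s
        resid = z - pos
        self.pos = pos + k_pos * resid
        self.vel = self.vel + k_vel * resid
        self.p_pp = (1 - k_pos) * p_pp
        self.p_pv = (1 - k_pos) * p_pv
        self.p_vv = p_vv - k_vel * p_pv
        self.t = t_obs


if __name__ == "__main__":
    kf = AxisKalman(pos=0.0)
    for t in (0.3, 0.5, 1.1, 1.2, 2.0):   # irregular observation times
        kf.update(z=t, t_obs=t, meas_var=0.01)
    print(kf.predict(2.5))                 # predicted mean and variance at t = 2.5
```

One possible way to use the reliability attached to each observation would be to pass a larger `meas_var` for low-reliability observations, but that mapping is an assumption of this sketch.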

[0027] FIG. 3 is a flowchart for explaining the operation of one embodiment of the present invention. FIG. 4 is a flowchart showing the association of each person region with a tracking model and the position and posture estimation operation of FIG. 3. FIG. 5 shows an example of how a person's image changes with distance from a camera installed on the ceiling.

[0028] In step SP1 shown in FIG. 3 (steps are abbreviated SP in the figure), the observation unit 4 receives the corresponding image signal from camera 2 and, in step SP2, detects person regions. In step SP3 it determines whether a person has been detected, and steps SP1 to SP3 are repeated until a person is detected. When a person is detected, the observation unit 4 sends the time at which the person was observed to the tracking unit 8 in step SP4. In response, the observation unit 4 receives from the tracking unit 8 the predicted values of position and posture and of person features such as face image and clothing, calculated from the tracking model and the observation time.

[0029] In step SP6, each person region is associated with a tracking model by executing the process shown in FIG. 4(a). That is, each person region is specified in step SP21, and position and posture are estimated in step SP22 by the process shown in FIG. 4(b). The position is estimated in step SP31, and the distance d between each person region and camera 2 is calculated in step SP32. The posture is then estimated under the constraint of distance d, and a reliability (weight) is calculated. As shown in FIG. 5, the posture is estimated based on silhouette information for the rotations (posture angles) that a person at distance d from the camera can take.
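The text does not give a formula for the reliability (weight) at this point, so the following is only an illustrative sketch of how a distance-dependent weight could express "closer to below" versus "closer to the side" of a camera. The function name `observation_weights`, the use of the viewing angle, and the cos²/sin² split are all assumptions introduced for illustration; the person's own height is ignored.

```python
import math
from typing import Tuple


def observation_weights(camera_xy: Tuple[float, float],
                        camera_height: float,
                        person_xy: Tuple[float, float]) -> Tuple[float, float, float]:
    """Return (d, w_position, w_identity) for one camera/person pair.

    d           distance from the camera to the person's floor position
    w_position  near 1 when the person is directly below the camera
    w_identity  near 1 when the person is seen from the side
    """
    horizontal = math.hypot(person_xy[0] - camera_xy[0], person_xy[1] - camera_xy[1])
    d = math.hypot(horizontal, camera_height)
    # Angle between the downward optical axis and the line of sight to the person.
    theta = math.atan2(horizontal, camera_height)
    w_position = math.cos(theta) ** 2   # assumed form, not taken from the patent
    w_identity = 1.0 - w_position       # equals sin(theta) ** 2
    return d, w_position, w_identity


if __name__ == "__main__":
    for x in (0.0, 1.5, 3.0, 6.0):
        print(x, observation_weights((0.0, 0.0), 3.0, (x, 0.0)))
```

With this choice the two weights always sum to one, so a camera contributes mostly position information for a person beneath it and mostly identity information for a person far to its side.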

[0030] Then, in step SP23 shown in FIG. 4(a), person features such as the head region image and clothing color are extracted and a reliability (weight) is calculated. In step SP24, association is performed using the position and posture information and the person feature information. In step SP25 it is determined whether all person regions have been processed; if not, the process returns to step SP21 to handle the next person region. If all person regions have been processed, the process proceeds to step SP7 shown in FIG. 3.

[0031] In step SP7 each person region is specified, and in step SP8 it is determined whether the region has a corresponding tracking model. If it does, observation information such as position, posture, and person features is sent to the tracking node in step SP9, with a reliability according to the distance from camera 2 attached. If it does not, the observation information is sent to the discovery node in step SP10, again with a reliability according to the distance from camera 2 attached. In step SP11 it is determined whether all person regions have been processed; if not, the process returns to step SP7, and if so, it returns to step SP1.
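The routing in steps SP7 to SP11 can be sketched as a simple loop. The `Observation` record and the two callback parameters are hypothetical stand-ins for the messages sent to the tracking node and the discovery node; the field names are assumptions, not the patent's data format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional, Tuple


@dataclass
class Observation:
    camera_id: int
    time: float                      # observation time
    position: Tuple[float, float]    # estimated position
    features: Dict[str, object]      # person features (clothing color, etc.)
    reliability: float               # weight attached according to camera distance
    track_id: Optional[int] = None   # None means no tracking model matched


def route_observations(observations: Iterable[Observation],
                       send_to_tracker: Callable[[Observation], None],
                       send_to_discovery: Callable[[Observation], None]) -> None:
    """Send matched observations to the tracker and unmatched ones to discovery."""
    for obs in observations:           # SP7 / SP11: iterate over person regions
        if obs.track_id is not None:   # SP8: does a tracking model correspond?
            send_to_tracker(obs)       # SP9: corresponded point (A0)
        else:
            send_to_discovery(obs)     # SP10: uncorresponded point (A1)


if __name__ == "__main__":
    tracked, unmatched = [], []
    route_observations(
        [Observation(1, 0.5, (1.0, 2.0), {"color": "red"}, 0.8, track_id=7),
         Observation(1, 0.5, (4.0, 0.0), {"color": "blue"}, 0.3)],
        tracked.append, unmatched.append)
    print(len(tracked), len(unmatched))
```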

[0032] The human body model used in one embodiment of the present invention is now described with reference to FIG. 6, which is a diagram for explaining the model. In the figure, X, Y, and Z denote the three axes of the world coordinate system. The person MAN is modeled as an elliptic cylinder h. The central axis Xh of the person model h is taken as the rotation axis of the person, and the angle r between the normal axis (the axis in the minor-axis direction, perpendicular to the rotation axis Xh) and the X axis is taken as the posture angle of the person. The rotation axis Xh of the person MAN is assumed to be perpendicular to the floor (the plane formed by the X and Y axes). Camera 2 is arranged perpendicular to the rotation axis Xh of the person, that is, to the Z axis.

[0033] Next, the observation unit 4 shown in FIG. 2 is described. FIG. 7 is a diagram outlining its configuration. Referring to FIG. 7, the observation unit 4 includes a region segmentation circuit 10, a pixel value calculation circuit 12, a centroid point selection circuit 14, and a feature point association circuit 16.

[0034] The region segmentation circuit 10, the pixel value calculation circuit 12, and the centroid point selection circuit 14 perform feature extraction on the input person image. The feature point association circuit 16 associates the extracted feature points with tracking targets (models).

[0035] The region segmentation circuit 10 divides the input image into a person region and a background region. A region segmentation method is described, for example, in "Image segmentation for person tracking by hierarchical adaptation based on continuous images," Computer Vision and Pattern Recognition (CVPR '98), IEEE Computer Society, pp. 911-916.

[0036] Specifically, the difference between the current frame captured by the camera and the preceding and following frames is binarized, and several frames of these difference images are superimposed to generate a mask image (that is, a rough moving-object region is extracted by coarse segmentation). When the size of the mask image is within a certain range, the distribution of parameters is estimated from luminance information, color information, pixel information, texture information, or any combination of these, and the estimated distribution is used to calculate, for each pixel of a newly obtained image, the probability that the moving-object region is present there (that is, fine segmentation is performed). The person region is thereby cut out of the input image captured by the camera.
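The coarse segmentation step can be sketched as follows. This is a simplified illustration that assumes grayscale frames stored as nested lists, a fixed threshold, and a logical OR as the way of superimposing the binarized difference images; the function name `motion_mask` and the `threshold` value are hypothetical, and the fine, probability-based segmentation is not covered.

```python
from typing import List

Frame = List[List[int]]  # grayscale image as rows of pixel intensities


def motion_mask(prev: Frame, curr: Frame, nxt: Frame, threshold: int = 20) -> Frame:
    """Rough moving-object mask from the current frame and its two neighbours.

    Each difference image is binarized with `threshold`, and the two binary
    images are superimposed with a logical OR (an assumption of this sketch).
    """
    height, width = len(curr), len(curr[0])
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            diff_prev = abs(curr[y][x] - prev[y][x]) > threshold
            diff_next = abs(curr[y][x] - nxt[y][x]) > threshold
            mask[y][x] = 1 if (diff_prev or diff_next) else 0
    return mask


if __name__ == "__main__":
    prev = [[0, 0, 0, 0]]
    curr = [[0, 100, 0, 0]]   # a bright object appears in column 1
    nxt = [[0, 0, 100, 0]]    # and has moved to column 2 in the next frame
    print(motion_mask(prev, curr, nxt))
```

Because the OR also marks where the object appears in the neighbouring frame, the result is deliberately generous, which matches the role of a rough mask that is refined afterwards.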

[0037] Next, the pixel value calculation circuit 12 applies a distance transform to the obtained person image. Specifically, for each pixel in the person region it calculates a pixel value indicating the shortest distance from that pixel to the boundary of the person region. The centroid point selection circuit 14 selects the point in the person region with the maximum pixel value as the centroid point (that is, the feature point) of that region. The pixel value at the centroid point is taken as the distance transform value. The centroid point and its distance transform value (the centroid-point distance transform value) are used as feature values.

[0038] Specifically, the region segmentation circuit 10 outputs a binarized image such as that shown in FIG. 8(a). From this binarized image, the pixel value calculation circuit 12 generates the distance transform image shown in FIG. 8(b). In FIG. 8(b), pixels with larger distance transform values are drawn darker and pixels with smaller values lighter, so pixels become darker the farther they are from the outline of the human body. The centroid point selection circuit 14 selects the pixel marked "X" in FIG. 8(b) as the centroid point. Centroid detection by the distance transform has the advantage of being relatively insensitive to changes in a person's pose.
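The distance transform and the selection of the centroid (center-of-gravity) point can be sketched as below. The sketch assumes a binary mask stored as nested lists and uses a city-block (4-neighbour) distance computed by breadth-first search from the background; the patent does not state which distance metric is used, and the function names are hypothetical.

```python
from collections import deque
from typing import List, Tuple

Mask = List[List[int]]  # 1 = person region, 0 = background


def distance_transform(mask: Mask) -> List[List[int]]:
    """City-block distance from every pixel to the nearest background pixel."""
    height, width = len(mask), len(mask[0])
    dist = [[0 if mask[y][x] == 0 else -1 for x in range(width)] for y in range(height)]
    queue = deque((y, x) for y in range(height) for x in range(width) if mask[y][x] == 0)
    if not queue:
        raise ValueError("mask must contain at least one background pixel")
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < height and 0 <= nx < width and dist[ny][nx] == -1:
                dist[ny][nx] = dist[y][x] + 1
                queue.append((ny, nx))
    return dist


def centroid_feature(mask: Mask) -> Tuple[Tuple[int, int], int]:
    """Return ((row, col), value) of the pixel with the largest distance value."""
    dist = distance_transform(mask)
    value, row, col = max(
        ((dist[y][x], y, x) for y in range(len(dist)) for x in range(len(dist[0]))),
        key=lambda item: item[0],
    )
    return (row, col), value


if __name__ == "__main__":
    blob = [[0, 0, 0, 0, 0],
            [0, 1, 1, 1, 0],
            [0, 1, 1, 1, 0],
            [0, 1, 1, 1, 0],
            [0, 0, 0, 0, 0]]
    print(centroid_feature(blob))
```

When several pixels share the maximum value, this sketch returns the first one in row-major order; how ties are resolved in the actual device is not stated.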

[0039] Next, the feature point association circuit 16, which associates the extracted centroid points with tracking targets that have already been discovered, is described. FIG. 9 is a diagram for explaining the feature point association process. In the figure, symbols 20#1, 20#2, ..., 20#k, ..., 20#m, ..., 20#n denote the image planes obtained by cameras 2#1, 2#2, ..., 2#k, ..., 2#m, ..., 2#n, respectively (hereinafter referred to generically as image plane 20).

[0040] Suppose that observations are made by camera 2#k at time ta and by camera 2#m at time tb (where ta < tb). Suppose further that two persons h0 and h1 are present at time ta and that a person h2 appears in the scene for the first time at time tb.

[0041] The symbol × on each image plane 20 indicates a centroid point detected in that image. In this case, two centroid points (feature points) are detected in image plane 20#k at time ta, and three centroid points (feature points) are detected in image plane 20#m at time tb.

【００４２】 For each of the persons h0 and h1, the predicted position at time tb, based on the observations up to time ta, is obtained by querying the tracking unit 8. As described later, the tracking unit 8 assumes that a person moves at constant velocity, and the predicted position of person hj at a time t is represented by a two-dimensional Gaussian distribution. Let X_{hj,t} denote the position of person hj at time t in the world coordinate system, and let /X_{hj,t} and /S_{hj,t} denote the mean and the covariance matrix of that two-dimensional Gaussian distribution. The mean and covariance matrix are transmitted from the tracking unit 8 as predicted position information A0.

【００４３】 In the equations, /X_{hj,t} is written as the symbol X_{hj,t} with a bar ("−") over it, and /S_{hj,t} as the symbol S_{hj,t} with a bar over it.

【００４４】 When the predicted position distribution N(/X_{hj,t}, /S_{hj,t}) is projected onto image 20#i (i = 1, 2, ..., n) by weak perspective projection, a one-dimensional Gaussian distribution n(/x_{hj,t,i}, /s_{hj,t,i}) with the probability given by Equation (1) is obtained. This distribution represents the probability that the person is present in image 20#i. In Equation (1), x is the projection onto the image plane of the person position X in the world coordinate system, /x is the projection onto the image plane of the mean /X in the world coordinate system, and /s is the projection onto the image plane of the covariance matrix /S in the world coordinate system. In the equations, /x is written as the symbol x with a bar over it, and /s as the symbol s with a bar over it.
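
The projection of the two-dimensional prediction onto a one-dimensional image axis can be sketched as follows. This is an illustration under assumptions that the text does not spell out: the camera is described by a ground-plane position and a yaw angle of its optical axis, the weak perspective scale is the focal length divided by the depth of the predicted mean, and the function names are hypothetical.

```python
import math


def project_gaussian(mean, cov, cam_pos, cam_yaw, focal_len):
    """Project a 2-D Gaussian (mean, 2x2 covariance) to a 1-D (mean, std) on the image axis."""
    axis = (math.cos(cam_yaw), math.sin(cam_yaw))  # optical axis direction
    u = (-axis[1], axis[0])                        # image axis, perpendicular to the optical axis
    dx, dy = mean[0] - cam_pos[0], mean[1] - cam_pos[1]
    depth = dx * axis[0] + dy * axis[1]
    if depth <= 0:
        raise ValueError("predicted position is behind the camera")
    scale = focal_len / depth                      # weak perspective: one scale for the whole distribution
    x_bar = scale * (dx * u[0] + dy * u[1])
    var = scale ** 2 * (u[0] * (cov[0][0] * u[0] + cov[0][1] * u[1])
                        + u[1] * (cov[1][0] * u[0] + cov[1][1] * u[1]))
    return x_bar, math.sqrt(var)


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian, used as the presence probability along the image axis."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))
```

Only the covariance component along the image axis survives the projection, which is why uncertainty along the viewing direction does not widen the projected distribution.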

【００４５】[0045]

【数１】 [Equation 1]

【００４６】 The feature point that maximizes the probability given by Equation (1) is taken as the observation corresponding to person hj at the observation time, and that feature point is labeled hj. The observation information A0 of each labeled feature point is transmitted to the tracking unit 8. However, for a feature point that has been associated with more than one person, occlusion is judged to have occurred at the time of observation, and transmission is suppressed.

【００４７】 After this processing, any feature point that remains unassociated is regarded as belonging to an unknown person, that is, a new person, and its position and time are transmitted to the discovery unit 6 as unassociated observation information A1. In FIG. 9, the observation information A1 for person h2 is transmitted to the discovery unit 6.
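
The labeling, occlusion check, and hand-off of unassociated points described above can be sketched as follows. This is a minimal sketch, not the circuit itself: the function `associate`, its return layout, and in particular the `min_prob` gate (which lets a person claim no feature point at all) are assumptions introduced for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def associate(feature_xs, predictions, min_prob=1e-3):
    """predictions maps a person label to the projected (mean, std) on this image.

    Returns (labeled, occluded, unmatched):
      labeled   - {feature index: person label}, sent to the tracking unit
      occluded  - feature indices claimed by several persons, not sent
      unmatched - feature indices claimed by nobody, sent to the discovery unit
    """
    claims = {}
    for label, (x_bar, s_bar) in predictions.items():
        if not feature_xs:
            break
        # Each tracked person claims the feature point with the highest probability.
        best = max(range(len(feature_xs)),
                   key=lambda k: gaussian_pdf(feature_xs[k], x_bar, s_bar))
        if gaussian_pdf(feature_xs[best], x_bar, s_bar) >= min_prob:  # assumed gate
            claims.setdefault(best, []).append(label)
    labeled = {k: labels[0] for k, labels in claims.items() if len(labels) == 1}
    occluded = sorted(k for k, labels in claims.items() if len(labels) > 1)
    unmatched = [k for k in range(len(feature_xs)) if k not in claims]
    return labeled, occluded, unmatched
```

With two tracked persons and three detected points, as in the situation of FIG. 9 at time tb, the third point falls into `unmatched` and would be forwarded as observation information A1.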

【００４８】 Next, the tracking unit 8 shown in FIG. 2 will be described. The tracking unit 8 updates the position and direction of each tracked person based on the observation information A0 sent from each of the observation units 4.

【００４９】 FIG. 10 is a diagram outlining the configuration of the tracking unit 8 shown in FIG. 2. Referring to FIG. 10, the tracking unit 8 includes a position estimation circuit 22 and a direction angle estimation circuit 24. The position estimation circuit 22 estimates the predicted position based on the center-of-gravity point positions obtained from the observation units 4 and outputs the estimation result to the direction angle estimation circuit 24. The direction angle estimation circuit 24 estimates the direction angle from the distance transform values at the center-of-gravity points obtained from the observation units 4, in accordance with the estimation result output from the position estimation circuit 22.

【００５０】 First, the position estimation process in the position estimation circuit 22 will be described in detail. The position of a person being tracked is updated using the feature information (center-of-gravity point position) associated in each observation unit 4. FIG. 11 is a diagram for explaining the position estimation process. Referring to FIG. 11, the distance between camera 2#i and image plane 20#i is denoted l_i, and the distance between camera 2#i and person hj is denoted L_{hj,i}. The angle between the epipolar line and the Y axis is denoted w_{hj,i}.

【００５１】 In the position estimation process, the person is assumed to move at constant velocity. The state of person hj at time t is expressed in world coordinates (X, Y) by Equations (2) and (3). The initial state is determined by the information on the new person (model) transmitted from the discovery unit 6. The prime symbol (′) attached to a matrix denotes transposition.

【００５２】 Let ∧X_{hj,t-1} be the estimate of the person position X_{hj} at time (t−1), and let ∧S_{hj,t-1} be the covariance matrix of the estimate ∧X_{hj,t-1}. The state at time t is then given by Equations (4) and (5). In the equations, ∧X_{hj,t-1} is written as the symbol X_{hj,t-1} with a hat ("∧") over it, and ∧S_{hj,t-1} as the symbol S_{hj,t-1} with a hat over it.

【００５３】 The transition matrix F is given by Equation (6), where Δt is the time elapsed from t−1 to t. The symbol Q denotes the covariance matrix of the transition.

【００５４】[0054]

【数２】 [Equation 2]
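
The constant-velocity prediction step can be sketched as follows. Equations (2) to (6) are not reproduced in this text, so the sketch assumes the usual constant-velocity layout: a state vector ordered as [X, Y, X velocity, Y velocity] and the standard Kalman prediction of mean and covariance. The function names are hypothetical.

```python
def transition_matrix(dt):
    """Constant-velocity transition for the assumed state [X, Y, Vx, Vy]."""
    return [[1.0, 0.0, dt, 0.0],
            [0.0, 1.0, 0.0, dt],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def predict(x_hat, s_hat, dt, q):
    """Predicted mean F x and covariance F S F' + Q after an interval dt."""
    f = transition_matrix(dt)
    x_bar = [sum(f[i][k] * x_hat[k] for k in range(4)) for i in range(4)]
    fsf = mat_mul(mat_mul(f, s_hat), transpose(f))
    s_bar = [[fsf[i][j] + q[i][j] for j in range(4)] for i in range(4)]
    return x_bar, s_bar
```

Because dt is an argument rather than a fixed frame period, the same step serves observations that arrive from different cameras at irregular times.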

【００５５】 Suppose now that a first observation is made by observation unit 4#i. Using the position information sent from observation unit 4#i, this observation can be expressed by Equations (7) to (9).

【００５６】 H = [1 0 0 0] … (7)

【００５７】[0057]

【数３】 [Equation 3]

【００５８】 The symbol C_i denotes the camera position, and the symbol R_{hj,t,i} denotes a clockwise rotation by the angle w_{hj,t,i} between the epipolar line and the Y axis. The symbol e denotes the observation error, which has mean 0 and standard deviation σ_{hj,t,i}. The standard deviation σ_{hj,t,i} is considered to increase with the distance from the camera and is expressed as in Equation (9).

【００５９】 Because the distance L_{hj,t,i} between the camera position C_i and the person (X_{hj,t}) is unknown, the distance /L_{hj,t,i} computed from the predicted position /X_{hj,t} of X_{hj,t} is used as an approximation (in the equations, /L_{hj,t,i} is written as the symbol L_{hj,t,i} with a bar over it).

【００６０】 In the observation equation (8), the left side represents the observation information, and the right side represents the result of projecting the person position onto the image.

【００６１】 The position estimation circuit 22 constructs a Kalman filter from the above observation model and updates the state of person hj.

【００６２】[0062]

【数４】 [Equation 4]
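
A single-camera Kalman update of this kind can be sketched as follows. Equations (8) to (11) are not reproduced in this text, so the sketch rests on assumptions: the observation is treated as the constraint that the person lies on the epipolar line through the camera, the angle w is taken from the Y axis toward the X axis (the sign convention is a guess), the state is ordered [X, Y, Vx, Vy], and sigma is supplied by the caller, for example as a value proportional to the approximate distance /L.

```python
import math


def update(x_bar, s_bar, cam_pos, w, sigma):
    """Scalar Kalman update for one epipolar-line observation.

    The line passes through cam_pos with direction (sin w, cos w); its normal is
    n = (cos w, -sin w). The observation model is n . C = n . X + e.
    s_bar is assumed to be symmetric.
    """
    h = [math.cos(w), -math.sin(w), 0.0, 0.0]        # observation row acting on the state
    z = h[0] * cam_pos[0] + h[1] * cam_pos[1]        # observed value, computed from the camera position
    s_h = [sum(s_bar[i][k] * h[k] for k in range(4)) for i in range(4)]  # S h'
    innovation_var = sum(h[i] * s_h[i] for i in range(4)) + sigma ** 2
    gain = [v / innovation_var for v in s_h]
    residual = z - sum(h[i] * x_bar[i] for i in range(4))
    x_hat = [x_bar[i] + gain[i] * residual for i in range(4)]
    s_hat = [[s_bar[i][j] - gain[i] * s_h[j] for j in range(4)] for i in range(4)]
    return x_hat, s_hat
```

Each observation corrects the state only across the viewing ray, so successive observations from cameras with different viewing directions are what pin down the full ground-plane position.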

【００６３】 The update of Equations (10) and (11) is performed independently for each camera, and the state is then predicted. The predicted state of person hj at time (t+1) is given by a Gaussian distribution with mean /X_{hj,t+1} and covariance matrix /S_{hj,t+1}. The state prediction is computed and transmitted in response to requests from the observation units 4 and, as described above, is used for associating feature points. The model of a person who has moved outside the range detectable by the cameras is deleted, and tracking of that person is stopped.

【００６４】 Next, the processing of the direction angle estimation circuit 24 will be described. The direction angle estimation circuit 24 estimates the posture angle r of the person (see FIG. 6) using the distance transform values at center-of-gravity points for which no occlusion has occurred. This estimation is based on the observation model of the human body (ellipsoid model) shown in FIG. 12. Referring to FIG. 12, the distance transform value at the center-of-gravity point is used as the width s of the ellipsoid in the image captured by camera 2. Assuming weak perspective projection, and letting θ be the angle between the optical axis and the normal to the rotation axis of the ellipsoid model (person), the observed distance transform value s follows Equation (12). Here, the symbol L in Equation (12) denotes the distance from the rotation axis of the ellipsoid model to the camera, and the symbols A and B are constants. Assuming that the observation is subject to Gaussian error, the probability P(s|θ) of observing the distance transform value s is given by Equation (13). The constants A and B and the error variance σ_s are determined in advance from training data.

【００６５】[0065]

【数５】 [Equation 5]

【００６６】 The posture of the human body can be expressed by the Euler angles (a, e, r). Here, r denotes the rotation angle about the body's rotation axis (the posture angle), a denotes the azimuth angle, and e denotes the elevation angle. Since the azimuth angle a and the elevation angle e are determined by the direction of the rotation axis of the human body, only the rotation angle r about the rotation axis is unknown.

【００６７】 The normal vector N of the human body is given by Equation (14). Accordingly, the angle θ_i(r) between the optical axis vector C of camera 2#i and the normal vector N of the human body is given by Equation (15).

【００６８】
N = R_Z(a) R_Y(e) R_X(r) e_Z … (14)
θ_ci(r) = cos⁻¹(N^T · C) … (15)
In Equation (14), R_Z, R_Y, and R_X denote the rotation matrices about the Z, Y, and X axes, respectively, and e_Z denotes the unit vector in the Z-axis direction. In Equation (15), the symbol N^T denotes the transpose of the normal vector N.
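
Equations (14) and (15) can be evaluated numerically as in the following sketch. It assumes standard right-handed rotation matrices and normalizes both vectors before taking the inverse cosine; the text does not state either convention, and the function names are hypothetical.

```python
import math


def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def mat_vec(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def body_normal(a, e, r):
    """Equation (14): N = R_Z(a) R_Y(e) R_X(r) e_Z."""
    return mat_vec(rot_z(a), mat_vec(rot_y(e), mat_vec(rot_x(r), [0.0, 0.0, 1.0])))


def angle_to_optical_axis(normal, optical_axis):
    """Equation (15): angle between the body normal N and the optical axis C."""
    dot = sum(n * c for n, c in zip(normal, optical_axis))
    norm = math.sqrt(sum(n * n for n in normal)) * math.sqrt(sum(c * c for c in optical_axis))
    return math.acos(max(-1.0, min(1.0, dot / norm)))  # clamp against rounding error
```

With a and e fixed by the body's rotation axis, the angle becomes a function of r alone, which is the quantity the estimator searches over.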

【００６９】 The probability P that the set W = (s_1, s_2, ..., s_n) of distance transform values at the center-of-gravity points is observed by the n cameras 2#1 to 2#n is given by Equation (16), which follows from Equation (13). The value of r that maximizes the probability P(W|r) in Equation (16) is taken as the estimate of the posture angle.

【００７０】[0070]

【数６】 [Equation 6]
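
The maximization over r can be sketched as a grid search, as below. Equations (12), (13), and (16) are not reproduced in this text, so the sketch makes several assumptions: the observations are independent with a common Gaussian error, so maximizing the product of probabilities equals maximizing a sum of log terms; the width model is passed in by the caller; and `toy_width` is a purely hypothetical stand-in, not the patent's Equation (12).

```python
import math


def estimate_posture_angle(observations, width_model, sigma_s, steps=360):
    """Grid search for the r that maximizes the joint likelihood of all cameras.

    observations: list of (s, distance, theta_of_r), where theta_of_r(r) returns
    the angle between that camera's optical axis and the body normal.
    """
    def log_likelihood(r):
        return sum(-0.5 * ((s - width_model(theta_of_r(r), distance)) / sigma_s) ** 2
                   for s, distance, theta_of_r in observations)

    candidates = [2.0 * math.pi * k / steps for k in range(steps)]
    return max(candidates, key=log_likelihood)


def toy_width(theta, distance, a=0.2, b=0.3):
    """Hypothetical width model: widest when the body faces the camera."""
    return (a + b * abs(math.cos(theta))) / distance
```

With a width model that is symmetric front to back, such as `toy_width`, the estimate is only determined up to a half turn, so any front/back disambiguation would have to come from elsewhere.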

【００７１】 Like the position estimate described above, the direction angle estimate is updated each time an observation is made. The observation model shown in FIG. 12 takes as input the feature information (the distance transform value s at the center-of-gravity point) transmitted from the observation units 4 and has the state r_{hj,t} given by Equation (17). The presence or absence of occlusion is determined when feature points are associated, and an observation in which occlusion has occurred is not used as observation information.

【００７２】 Next, the discovery unit 6 shown in FIG. 2 will be outlined. The discovery unit 6 detects a person who has newly appeared in the scene (a new person) and adds a corresponding model to the tracking unit 8.

【００７３】 Because the observation information is acquired asynchronously, ordinary stereo matching cannot be applied as it is. The following matching (discovery) method based on time-series information is therefore used.

【００７４】 First, from the observation information on unassociated points sent from the observation units 4, one point is selected at each of four different observation times (the selected combination is denoted γ). Let the observation viewpoints at observation times t1, t2, t3, and t4 be C_1, C_2, C_3, and C_4, respectively. With this observation information as input, the Kalman filter update described above (Equation (8)) is performed, with the initial variance set to /S_{hj,t} = 0.

【００７５】 This operation yields a predicted trajectory from the four observations. Using this predicted trajectory, the position is estimated with Equation (4) (where Δt = t4 − ti, with i being 1, 2, or 3). The set of position estimates is denoted [∧X_{t1}, ∧X_{t2}, ∧X_{t3}, ∧X_{t4}].

【００７６】 As described above, in the observation equation (8), the left side represents the observation information and the right side represents the result of projecting the person position onto the image. An error evaluation function f(γ) based on the Mahalanobis distance, shown in Equation (17), is defined as the difference between the observation information and the result of projecting the person position onto the image.

【００７７】[0077]

【数７】 [Equation 7]

【００７８】 A combination for which the value of the evaluation function in Equation (17) is within a given threshold is regarded as a set of feature points belonging to a new person, and the estimated position at the latest observation time (here, t4) is transmitted to the tracking unit 8 as the initial discovery position.
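
The enumeration of candidate combinations and the threshold test can be sketched as follows. The exact form of the evaluation function is not reproduced in this text, so the sketch assumes a sum of squared residuals normalized by their standard deviations. The trajectory fit itself is left to a caller-supplied function; `fit_residuals`, `discover`, and `mahalanobis_error` are hypothetical names.

```python
from itertools import product


def mahalanobis_error(residuals, sigmas):
    """Assumed form of f(gamma): squared residuals scaled by their standard deviations."""
    return sum((r / s) ** 2 for r, s in zip(residuals, sigmas))


def discover(unmatched_by_time, fit_residuals, threshold):
    """Try every combination with one unassociated point per observation time.

    unmatched_by_time: one list of candidate observations per observation time.
    fit_residuals(gamma): fits a trajectory to the combination gamma and returns
    (residuals, sigmas) between the observations and the projected estimates.
    """
    accepted = []
    for gamma in product(*unmatched_by_time):
        residuals, sigmas = fit_residuals(gamma)
        error = mahalanobis_error(residuals, sigmas)
        if error <= threshold:
            accepted.append((gamma, error))
    return accepted
```

The number of combinations grows with the product of the candidate counts per time, so in practice the candidate lists would need to stay short.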

【００７９】 Next, the following two simulation experiments were conducted to demonstrate the effectiveness of the present invention. FIG. 13 is a diagram for explaining a simulation experiment for verifying the accuracy of position tracking. In the simulation experiment shown in FIG. 13, two cameras (2#1 and 2#2) were used to observe a tracked object h0 moving in uniform circular motion, with period 10 × t_f, along the circular trajectory indicated by the dotted line. Here, t_f is the imaging interval of the cameras. The Kalman filter was given a model of uniform linear motion. Parallel projection was assumed for the observations, and the angle θ between the optical axes of the two cameras 2#1 and 2#2 was set to four values (0°, 30°, 45°, and 90°). Under each condition, the offset Δt between the imaging times of camera 2#1 and camera 2#2 was varied from 0 to t_f/2, and the position prediction error (in the direction parallel to the imaging plane of camera 2#1) at time t_f after an observation by camera 2#1 was recorded.

【００８０】 FIG. 14 is a diagram for explaining the results of the experiment of FIG. 13. In FIG. 14, the horizontal axis represents the imaging time offset Δt, and the vertical axis represents the prediction error. As shown in FIG. 14, particularly when the angle θ between the optical axes was 0°, 30°, or 45°, the prediction error decreased as the offset Δt increased and reached its minimum at Δt = t_f/2. This shows that, particularly when the motion of the tracked object deviates from the motion model assumed by the Kalman filter, asynchronous observation provides better tracking performance than synchronous observation.

【００８１】 FIG. 15 is a diagram for explaining a simulation experiment for demonstrating the ability to estimate the movements of multiple persons. In the simulation experiment shown in FIG. 15, five cameras 2#1 to 2#5 are arranged. Each camera is connected to one computer that constitutes the corresponding observation unit, and image processing is performed on these computers. The processing speed is approximately 1 to 2 frames/sec.

【００８２】 Each computer is connected to a local area network (LAN), and the internal clocks of the computers are synchronized with one another. Two computers (not shown), which constitute the discovery unit 6 and the tracking unit 8, respectively, are also connected to the LAN.

【００８３】 All cameras are calibrated in advance, and the calibration information of each observation unit 4 is transmitted to the discovery unit 6 and the tracking unit 8 together with the observation information (observation time and feature points). In the experiment, the positions of two persons h0 and h1 were tracked. As shown in FIG. 15, the two persons h0 and h1 appear in the scene 55 one after the other.

【００８４】 FIGS. 16 to 20 are diagrams for explaining the results of the experiment of FIG. 15. They show how the feature points obtained by each camera changed over time in the simulation experiment shown in FIG. 15. FIG. 16 corresponds to camera 2#1, FIG. 17 to camera 2#2, FIG. 18 to camera 2#3, FIG. 19 to camera 2#4, and FIG. 20 to camera 2#5. In FIGS. 16 to 20, the horizontal axis represents time and the vertical axis represents the tracked position.

【００８５】 Referring to FIGS. 16 to 20, at the start of tracking almost no feature points are observed except by camera 2#1, so no person is discovered. As the first person approaches the center of the experimental environment (scene 55), feature points are also obtained from other viewpoints (cameras 2#3, 2#4, and 2#5), and tracking starts (a little after about 20 seconds from the start). Similarly, as the second person moves, tracking of the second person starts a little after about 37 seconds. In FIGS. 16 to 20, ○ indicates the tracking result for the first person and × indicates the tracking result for the second person. These experimental results show that the present invention is capable of tracking multiple persons.

【００８６】 In the embodiment described above, the movement of a person is tracked based on feature points in images captured asynchronously from multiple viewpoints. Next, an embodiment will be described in which an attribute value specific to the person is extracted and output as observation information, the state of the corresponding person is detected from the extracted attribute value, and the tracking model for the person is switched according to the detected state. For example, a person is very unlikely to move while seated, so the confidence of the position estimate is increased.

【００８７】 FIG. 21 is a diagram outlining the observation unit and the tracking unit of a second embodiment of the present invention, and corresponds to FIGS. 7 and 10 of the first embodiment. In FIG. 21, the area dividing circuit 10, the pixel value calculation circuit 12, and the center-of-gravity point selection circuit 14 are the same as in FIG. 7, but the observation unit 4 of this embodiment is additionally provided with a silhouette height detection circuit 17. The silhouette height detection circuit 17 detects the height of the person from the binarized image produced by the area dividing circuit 10.

【００８８】 FIG. 22 shows binarized images of a person standing and of a person seated. The silhouette height detection circuit 17 detects the height T1 of the binarized image shown in FIG. 22(a) and the height T2 of the binarized image shown in FIG. 22(b). For example, T1 is detected as 180 cm when the person is standing, and T2 as 140 cm when the person is seated. The detection output of the silhouette height detection circuit 17 is supplied to the feature point association circuit 16.

【００８９】 As described in the first embodiment, the feature point association circuit 16 associates the center-of-gravity points extracted by the center-of-gravity point selection circuit 14 with the tracking targets already discovered. In addition, based on the detection output of the silhouette height detection circuit 17, it includes the measured height of the associated person in the observation information and outputs it to the tracking unit 8.

【００９０】 The tracking unit 8, in addition to the configuration shown in FIG. 10, includes a height information extraction circuit 30, a body height conversion circuit 32, and a state estimation circuit 34, as shown in FIG. 21. The height information extraction circuit 30 extracts the height information from the observation information sent from the observation unit 4 to the tracking unit 8 and supplies it to the body height conversion circuit 32. The body height conversion circuit 32 converts the height information extracted from the observation information into a body height.

【００９１】 If the body height changes, for example, from 180 cm to 140 cm, the state estimation circuit 34 estimates that the person has changed from standing to seated. Conversely, if the body height changes from 140 cm to 180 cm, it estimates that the person has stood up from a seated state.
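
The state estimate based on the change in body height can be sketched as follows. The text gives only the example values 180 cm and 140 cm; the single fixed threshold of 160 cm and the function name `estimate_state` are assumptions made for illustration.

```python
def estimate_state(previous_height_cm, current_height_cm, threshold_cm=160.0):
    """Classify the posture transition from two successive body height measurements."""
    was_standing = previous_height_cm >= threshold_cm
    is_standing = current_height_cm >= threshold_cm
    if was_standing and not is_standing:
        return "sat down"
    if not was_standing and is_standing:
        return "stood up"
    return "standing" if is_standing else "seated"
```

A fixed threshold would misclassify short standing persons, so a practical version would more plausibly compare each person's height against that person's own earlier measurements.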

【００９２】 The observation unit 4 and the tracking unit 8 operate in the same manner as described above with reference to FIG. 10: the tracking unit 8 estimates the predicted position based on the center-of-gravity point positions contained in the observation information obtained from the observation unit 4, estimates the direction angle based on that estimation result, and tracks a specific person among multiple persons.

【００９３】 Of the observation information sent from the observation unit 4 to the tracking unit 8, the height information is extracted by the height information extraction circuit 30 and converted into a body height by the body height conversion circuit 32, and the state estimation circuit 34 estimates whether the person is standing or seated. The tracking unit 8 switches the person to be tracked according to the estimated state. That is, if the person being tracked is seated, that person will no longer move, so another person is tracked.

【００９４】 Although the above description estimates whether a person is standing or seated, the invention is not limited to this; behavior information such as a transition from a stopped state to a moving state, or from a moving state to a stopped state, may also be estimated. In that case, the person may be estimated to be moving when the person's speed reaches or exceeds a certain value, and to be stopped when the speed becomes zero. In this case, Equations (4) to (6) above are replaced by the following Equations (18) to (20). These equations show the case where the state of person h_1 at time t_b is predicted from the information up to the previous observation made at time t_a.

【００９５】[0095]

【数８】 [Equation 8]

【００９６】 In the above equations, Q_{x,Δt} represents the fluctuation of the motion during Δt, and its magnitude is considered to vary with the current motion state. In the embodiment described above, this value may be varied so that it is large while walking, medium while stopped, and small while seated. By varying the motion model in this way, the reliability of the estimation result can be evaluated according to the motion state of the object.
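
Switching the motion fluctuation according to the estimated state can be sketched as follows. Equations (18) to (20) are not reproduced in this text, so the sketch is an illustration under assumptions: the numeric values are hypothetical, the fluctuation is modeled as an isotropic position covariance that grows linearly with Δt, and an intermediate speed keeps the previous state.

```python
# Hypothetical standard deviations of the motion fluctuation per unit time.
FLUCTUATION_STD = {"walking": 0.5, "stopped": 0.1, "seated": 0.01}


def process_noise(state, dt):
    """2x2 position covariance for the interval dt, selected by the motion state."""
    q = FLUCTUATION_STD[state] ** 2 * dt
    return [[q, 0.0], [0.0, q]]


def update_motion_state(previous_state, speed, moving_speed):
    """Moving at or above moving_speed, stopped at zero speed, otherwise unchanged."""
    if speed >= moving_speed:
        return "walking"
    if speed == 0.0:
        return "stopped"
    return previous_state
```

A smaller fluctuation makes the filter trust its own prediction more, which corresponds to raising the confidence of the position estimate for a seated person.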

【００９７】 In the second embodiment shown in FIG. 21, the state of a person is estimated, namely whether the person is standing or seated and whether the person is stopped or has started moving. However, in associating the features of each observed image with the tracking models, an attribute value specific to the target object and observable from the image may be used instead of position information. Such attribute values are not limited to the body height and behavior information of the target object described above; the color of clothing, for example, may also be used.

【００９８】 FIG. 23 is a block diagram showing such an embodiment. In the observation unit 4 shown in FIG. 23, a silhouette height and color detection circuit 18 is provided in place of the silhouette height detection circuit 17 shown in FIG. 21. In the tracking unit 8, a height and color information extraction circuit 36 and a body height and color conversion circuit 38 are provided in place of the height information extraction circuit 30 and the body height conversion circuit 32 of FIG. 21. The silhouette height and color detection circuit 18 of the observation unit 4 detects the height of the person as described with reference to FIG. 21 and, as shown in FIG. 20, also detects the average color of the chest portion (the hatched portion in FIG. 24) of the region-segmented image, and supplies both to the feature point association circuit 16.

【００９９】 In the tracking unit 8, the height and color information extraction circuit 36 extracts height information and color information from the observation information, the body height and color conversion circuit 38 converts the height information and color information into a body height and a color, and the state estimation circuit 34 estimates the state of the person. In this case, the color information S_{tn} obtained in the previous observation and the color information S_{tn+1} obtained in the current observation are expressed by the following equation.

[0100]

[Equation 9]

[0101] As described above, using attribute values specific to the target object, such as stature and color information, as the observation information has the advantage that target objects are not confused with one another even when they are close together.
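One way such attribute-based association could work is sketched below in Python: each observation is matched to the tracking model whose stature and color are nearest, and observations with no sufficiently close model are left unassociated (in the device, such observations go to the discovery unit). The distance measure, the scale constants, the threshold, and the greedy matching order are all assumptions chosen for illustration; the description does not give the matching rule.

```python
import math

def attribute_distance(a, b, height_scale=0.3, color_scale=255.0):
    """Hypothetical normalized distance between two {height, color} records."""
    dh = (a["height"] - b["height"]) / height_scale      # stature difference
    dc = math.dist(a["color"], b["color"]) / color_scale  # RGB difference
    return math.hypot(dh, dc)

def associate(observations, models, max_distance=1.0):
    """Greedily pair observations with tracking models by attribute similarity.

    Returns (matched, unmatched): matched maps observation index to model index,
    unmatched lists observation indices with no model within max_distance.
    """
    # All candidate pairs, closest first.
    pairs = sorted(
        (attribute_distance(o, m), oi, mi)
        for oi, o in enumerate(observations)
        for mi, m in enumerate(models)
    )
    matched, used_models = {}, set()
    for d, oi, mi in pairs:
        if d > max_distance:
            break  # remaining pairs are even farther apart
        if oi in matched or mi in used_models:
            continue  # each observation and each model is used at most once
        matched[oi] = mi
        used_models.add(mi)
    unmatched = [oi for oi in range(len(observations)) if oi not in matched]
    return matched, unmatched
```

Because the comparison uses stature and color rather than position, two people standing next to each other remain distinguishable as long as their attributes differ.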

[0102] The embodiments disclosed herein are to be considered in all respects illustrative and not restrictive. The scope of the present invention is defined not by the above description but by the claims, and is intended to include all modifications within the meaning and scope equivalent to the claims.

[0103]

[Effects of the Invention] As described above, according to the present invention, a plurality of imaging means image the downward direction from a plurality of viewpoints located at a constant height and at constant intervals in the scene. When an imaging means images a moving object below it, position information of the moving object is acquired; when another imaging means images the moving object from the side, information for identifying the moving object is acquired. Feature points are extracted from the acquired results and output as observation information, and, in response to requests from the observation means that extracted the feature points, the observation information of the plurality of observation means is integrated to predict the state of the moving object. Consequently, position information can be acquired and the object can be identified stably wherever the object is located, and the moving object can be tracked with high accuracy.

[Brief description of drawings]

FIG. 1 is a layout diagram of the cameras according to one embodiment of the present invention.

FIG. 2 is a block diagram showing the overall configuration of a moving object tracking device 1 according to an embodiment of the present invention.

FIG. 3 is a flowchart for explaining the operation of one embodiment of the present invention.

FIG. 4 is a flowchart showing the association of each person region with the tracking model and the position and orientation estimation operation in FIG. 3.

FIG. 5 is a diagram showing an example of changes in a person image captured by a camera installed on the ceiling.

FIG. 6 is a diagram for explaining the human body model used in one embodiment of the present invention.

FIG. 7 is a diagram for explaining an outline of the configuration of the observation unit.

FIG. 8 is a diagram for explaining the feature extraction processing in the observation unit.

FIG. 9 is a diagram for explaining the feature point association processing.

FIG. 10 is a diagram for explaining an outline of the configuration of the tracking unit shown in FIG. 2.

FIG. 11 is a diagram for explaining the position estimation processing.

FIG. 12 is a diagram for explaining the processing in the direction angle estimation circuit.

FIG. 13 is a diagram for explaining a simulation experiment for confirming the accuracy of position tracking.

FIG. 14 is a diagram for explaining experimental results for FIG. 13.

FIG. 15 is a diagram for explaining a simulation experiment for clarifying the ability to estimate the movements of a plurality of persons.

FIG. 16 is a diagram for explaining experimental results for FIG. 15.

FIG. 17 is a diagram for explaining experimental results for FIG. 15.

FIG. 18 is a diagram for explaining experimental results for FIG. 15.

FIG. 19 is a diagram for explaining experimental results for FIG. 15.

FIG. 20 is a diagram for explaining experimental results for FIG. 15.

FIG. 21 is a diagram showing the observation unit and the tracking unit according to the second embodiment of the present invention.

FIG. 22 is a diagram showing binarized images of a person who is standing and of a person who is seated.

FIG. 23 is a diagram showing the observation unit and the tracking unit according to the third embodiment of the present invention.

FIG. 24 is a diagram showing how the average color of the chest portion of a region-divided image is detected.

FIG. 25 is a diagram showing an example of tracking a person with a camera installed on the ceiling.

[Explanation of symbols]

1 moving object tracking device, 2#1 to 2#n cameras, 4#1 to 4#n observation units, 6 discovery unit, 8 tracking unit, 10 region division circuit, 12 pixel value calculation circuit, 14 centroid point selection circuit, 16 feature point association circuit, 17 silhouette height detection circuit, 18 silhouette height and color detection circuit, 22 position estimation circuit, 24 direction angle estimation circuit, 30 height information extraction circuit, 32 stature conversion circuit, 34 state estimation circuit, 36 height and color information extraction circuit, 38 stature and color conversion circuit.

Continuation of the front page: (58) Fields searched (Int. Cl.7, DB name): H04N 7/18, G06T 1/00, G06T 7/00

Claims

(57) [Claims]

1. A moving object tracking device for tracking, from multiple viewpoints, a moving object moving in a scene, comprising: a plurality of imaging means for imaging the downward direction from a plurality of viewpoints located at a constant height and at constant intervals in the scene; and a plurality of observation means provided corresponding to the respective imaging means and operating independently of one another, wherein each of the plurality of observation means includes: position information estimating means for estimating position information of the moving object from an image signal of the moving object captured by the corresponding imaging means; weight information estimating means for estimating, based on the position information, the distance between the corresponding imaging means and the moving object and for estimating, according to the estimated distance, weight information indicating whether the moving object is closer to being below or to the side of the corresponding imaging means; object identification information calculating means for calculating object identification information of the moving object by feature point extraction processing on the image signal; and observation information transmitting means for transmitting the position information, the object identification information, and the weight information as observation information, and wherein the moving object tracking device further comprises tracking means for integrating the position information, the object identification information, and the weight information transmitted from the observation information transmitting means to predict the state of the moving object.

2. The moving object tracking device according to claim 1, wherein each of the plurality of observation means includes association means for associating the moving object captured by the corresponding imaging means with a tracking target, based on a predicted observation position transmitted from the tracking means; the moving object tracking device further comprises discovery means for detecting a new moving object; the observation information transmitting means transmits the observation information of a moving object that has been associated to the tracking means and transmits the observation information of a moving object that has not been associated to the discovery means; and the tracking means updates the information on each moving object based on a new moving object found by the discovery means from the observation information that was not associated.

3. The moving object tracking device according to claim 1, wherein each of the plurality of observation means further includes region dividing means for dividing an image captured by the corresponding imaging means into a background image and a moving object region, and the object identification information calculating means includes means for calculating, as the feature point, the centroid of the moving object region extracted by the region dividing means.

4. The moving object tracking device according to claim 1, wherein the tracking means is constituted by a computer, each of the plurality of observation means is constituted by a computer, and each of the plurality of observation means and the tracking means can communicate with each other.