JP2008134939A

JP2008134939A - Moving object tracking apparatus, moving object tracking method, moving object tracking program with the method described therein, and recording medium with the program stored therein

Info

Publication number: JP2008134939A
Application number: JP2006322048A
Authority: JP
Inventors: Tatsuya Osawa; 達哉大澤; Xiaojun Wu; 小軍ウ; Kyoko Sudo; 恭子数藤; Hiroyuki Arai; 啓之新井; Takayuki Yasuno; 貴之安野
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2006-11-29
Filing date: 2006-11-29
Publication date: 2008-06-12

Abstract

PROBLEM TO BE SOLVED: To stably track individual moving objects by integrating images sent from one or more imaging apparatuses that image a plurality of moving objects.

SOLUTION: A plurality of moving objects are tracked by using: a means 11 for predicting the target states of the plurality of moving objects at each time on the basis of a plurality of image data obtained by photographing the moving objects with one or more imaging apparatuses; a means 12 for acquiring the image data; a means 13 for generating silhouette images by extracting the areas that contain the respective moving objects; a means 14 for estimating a target state distribution on the basis of the silhouette images, the three-dimensional structure of the real world, the internal and external parameters of the imaging apparatus, a change type, and a probability distribution update count; a means 15 for calculating change vectors based on the target state having the maximum probability at the present time and the target state having the maximum probability at the previous time; a means 21 for storing the three-dimensional structure of the real world and the internal and external parameters; a means 22 for storing the probability distribution of the target states; and a means 23 for storing the change vectors.

COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to a technique for tracking the state, such as the three-dimensional position and size, of a moving object (including objects, animals, and humans) using one or more image input devices (for example, cameras).

In the field of computer vision, much research has been done on tracking moving targets such as people using information from multiple cameras. For example, the following results have been achieved.

The target is observed from a plurality of viewpoints, and silhouette images are prepared by extracting the region of the target (that is, its silhouette) in each image. In addition, the three-dimensional structure of the space in which tracking is performed (hereinafter, three-dimensional environment information) and the internal and external parameters (also called camera parameters) of all cameras used for tracking are measured in advance.

A silhouette image is a binary image obtained by applying a background difference method or an inter-frame difference method to a captured image, in which the luminance value of the region where the target appears is "1" and the luminance value of all other regions is "0" (that is, an actual difference image). Camera calibration methods for calibrating the internal parameters of a camera and obtaining its three-dimensional position and orientation are also widely known (for example, see Non-Patent Document 1).
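The background difference step described above can be sketched as follows. This is a minimal illustration, not the method of the cited work: the function name `make_silhouette`, the grayscale list-of-lists image format, and the threshold value are all assumptions made for the example.

```python
def make_silhouette(frame, background, threshold=30):
    """Return a binary image: 1 where the frame differs from the background.

    `frame` and `background` are grayscale images given as lists of rows.
    The threshold of 30 is an arbitrary illustrative value.
    """
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


# A 2x3 background and a frame in which the middle column changed.
background = [[10, 10, 10], [10, 10, 10]]
frame = [[10, 200, 10], [12, 180, 10]]
silhouette = make_silhouette(frame, background)
```

Small luminance changes (such as 10 to 12 in the example) stay below the threshold and are treated as background.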

A person model is then placed in the three-dimensional environment, and the scene containing the model is captured by a virtual camera (for example, a virtual imaging device built in software) that has the previously calculated internal and external camera parameters. This makes it possible to simulate a silhouette image.

In addition, a method has been proposed that tracks a person even in a general environment by using an ellipsoid as the person model and comparing the generated simulation image with the silhouette image (hereinafter, the ellipsoid model tracking method) (for example, see Non-Patent Document 2).

Non-Patent Document 1: Z. Zhang, "A flexible new technique for camera calibration", IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000 (Heisei 12), vol. 22, no. 11, p. 1330-1334.
Non-Patent Document 2: Hirokazu Kato, Atsushi Nakazawa, Seiji Inokuchi, "Real-time human tracking using an ellipsoid model", Journal of Information Processing Society of Japan, November 1999 (Heisei 11), vol. 40, no. 11, p. 4087-4096.

As described above, in methods that track targets based on information acquired by cameras, a known problem when tracking a plurality of persons is occlusion: persons overlap one another in the captured image when, for example, they approach each other.

The occlusion problem can be reduced by integrating information obtained from multiple viewpoints. In the ellipsoid model tracking method, however, each camera tracks independently, so information from the cameras is not integrated, and tracking becomes impossible when persons approach one another.

The ellipsoid model tracking method also fixes the number of tracked persons from the start, so it cannot handle persons leaving or entering the tracking space.

Furthermore, because the required processing is proportional to the number of tracked persons, the processing cost grows as more persons are tracked at once, and the tracking loses its real-time performance.

Finally, because the ellipsoid model used to generate the simulation image has a fixed height and a fixed radius, it cannot accommodate individual differences in body size. When tracking a person whose body size differs markedly from that of the ellipsoid model, large measurement errors occur and tracking fails as a result.

The present invention has been made in view of the above problems. Its object is to provide a moving object tracking apparatus, a moving object tracking method, a moving object tracking program describing the method, and a recording medium storing the program, which sequentially estimate, from images sent from one or more imaging devices that image a plurality of moving objects, a target state representing the position and size at each time of the plurality of moving objects existing in three-dimensional space, and which stably track the movement of each moving object.

To solve the above problems, the invention according to claim 1 is a moving object tracking apparatus that regards the three-dimensional position and size of a moving object as the target state of the moving object, images the moving object with one or more imaging devices at times separated by a specific time interval, and tracks the moving object based on the plurality of image data obtained by the imaging, the apparatus comprising:
target state prediction means for predicting the target states of a plurality of the moving objects at each time;
image acquisition means for acquiring the image data;
silhouette image creation means for creating a silhouette image by extracting, from the image data, the region in which the moving object appears;
target state distribution estimation means for regarding the target state obtained by the target state prediction means as an initial state, estimating a target state distribution using the silhouette image created by the silhouette image creation means, the three-dimensional structure of the real world stored in the three-dimensional environment information management means, and the internal and external parameters of the imaging devices stored in the three-dimensional environment information management means, and updating the target state distribution at the previous time stored in the target state distribution storage means to the target state distribution at the current time;
target state calculation means for calculating the target state with the maximum probability based on the current-time target state distribution stored in the target state distribution storage means, calculating a change vector based on the target state with the maximum probability at the current time and the target state with the maximum probability at the previous time, and storing the calculated change vector in the target state storage means;
three-dimensional environment information management means for storing the three-dimensional structure of the real world measured in advance and the internal and external parameters of the imaging devices;
target state distribution storage means for storing the probabilistic distribution of the target state estimated at each time; and
target state storage means for storing the change vector at each time,
wherein the target state distribution estimation means comprises:
means for comparing the initial probability distribution update count B for the target state distribution storage means with the final probability distribution update count N for the target state and, based on the comparison result, selecting a change type from change types prepared in advance according to an arbitrary probability;
means for selecting the moving object to which the change type is applied;
means for changing the target state of the selected moving object based on the selected change type and the result of comparing the probability distribution update count B with the probability distribution update count N;
means for regarding the image created by projecting the changed target state with virtual imaging means having the same external and internal parameters as the imaging device as a simulation image that simulates the silhouette image;
means for calculating a probabilistic first conditional likelihood of the target state based on a prior-knowledge condition on the three-dimensional structure, a probabilistic second conditional likelihood of the target state based on a prior-knowledge condition on overlap between moving objects, and a third conditional likelihood based on a comparison of the silhouette image with the simulation image, and regarding the product of the first to third conditional likelihoods as the likelihood of the changed provisional target state;
means for determining the target state predicted at the n-th update according to whether the provisional target state was adopted;
means for calculating a state distribution based on the target states obtained in the B updates when the update count equals the probability distribution calculation count B;
means for determining, based on the state distribution calculated from the target states obtained in the B updates, whether the number of moving objects currently tracked equals 0; and
means for, when the number of moving objects does not equal 0 and the update count exceeds the sum of the probability distribution update count B and the probability distribution update count N, considering all N target states stored in the last N updates to be equally probable and regarding the target state predicted at the n-th update as the target state distribution.

The invention according to claim 2 is the invention according to claim 1, wherein the target state of the moving objects is regarded as a set of ellipsoid models, the position of each moving object is represented by a position (x, y) on an arbitrary plane, the size of each moving object is represented by the radius r and height h of its ellipsoid model, and a plurality of combinations of values of the parameters x, y, r, and h are stored.
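One way to picture the representation in claim 2 is a variable-length list of (x, y, r, h) records, one per tracked moving object. The sketch below is only an illustration under that assumption; the names `Ellipsoid` and `TargetState` and the use of metres as units are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Ellipsoid:
    """One moving object: floor-plane position and ellipsoid size."""
    x: float  # position on the plane
    y: float  # position on the plane
    r: float  # radius of the ellipsoid model
    h: float  # height of the ellipsoid model


# A target state is a set of ellipsoids; its length is the number of
# objects currently tracked, so it can grow or shrink over time.
TargetState = List[Ellipsoid]

state: TargetState = [
    Ellipsoid(x=1.0, y=2.0, r=0.3, h=1.7),
    Ellipsoid(x=4.5, y=0.5, r=0.25, h=1.2),
]
number_of_objects = len(state)
```

Because each object carries its own r and h, objects of different sizes can share one state, and adding or removing an entry changes the tracked count.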

The invention according to claim 3 is a moving object tracking method used in an apparatus that comprises three-dimensional environment information management means for storing the three-dimensional structure of the real world measured in advance and the internal and external parameters of the imaging devices, target state distribution storage means for storing the probabilistic distribution of the target state estimated at each time, and target state storage means for storing the change vector at each time, and that regards the three-dimensional position and size of a moving object as the target state of the moving object, images the moving object with one or more imaging devices at times separated by a specific time interval, and tracks the moving object based on the plurality of image data obtained by the imaging, the method comprising:
a target state prediction step of predicting the target states of a plurality of the moving objects at each time;
an image acquisition step of acquiring the image data;
a silhouette image creation step of creating a silhouette image by extracting, from the image data, the region in which the moving object appears;
a target state distribution estimation step of regarding the target state obtained by the target state prediction means as an initial state, estimating a target state distribution using the silhouette image created by the silhouette image creation means, the three-dimensional structure of the real world stored in the three-dimensional environment information management means, and the internal and external parameters of the imaging devices stored in the three-dimensional environment information management means, and updating the target state distribution at the previous time stored in the target state distribution storage means to the target state distribution at the current time; and
a target state calculation step of calculating the target state with the maximum probability based on the current-time target state distribution stored in the target state distribution storage means, calculating a change vector based on the target state with the maximum probability at the current time and the target state with the maximum probability at the previous time, and storing the calculated change vector in the target state storage means,
wherein the target state distribution estimation step comprises:
a step of comparing the initial probability distribution update count B for the target state distribution storage means with the final probability distribution update count N for the target state and, based on the comparison result, selecting a change type from change types prepared in advance according to an arbitrary probability;
a step of selecting the moving object to which the change type is applied;
a step of changing the target state of the selected moving object based on the selected change type and the result of comparing the probability distribution update count B with the probability distribution update count N;
a step of regarding the image created by projecting the changed target state with virtual imaging means having the same external and internal parameters as the imaging device as a simulation image that simulates the silhouette image;
a step of calculating a probabilistic first conditional likelihood of the target state based on a prior-knowledge condition on the three-dimensional structure, a probabilistic second conditional likelihood of the target state based on a prior-knowledge condition on overlap between moving objects, and a third conditional likelihood based on a comparison of the silhouette image with the simulation image, and regarding the product of the first to third conditional likelihoods as the likelihood of the changed provisional target state;
a step of determining the target state predicted at the n-th update according to whether the provisional target state was adopted;
a step of calculating a state distribution based on the target states obtained in the B updates when the update count equals the probability distribution calculation count B;
a step of determining, based on the state distribution calculated from the target states obtained in the B updates, whether the number of moving objects currently tracked equals 0; and
a step of, when the number of moving objects does not equal 0 and the update count exceeds the sum of the probability distribution update count B and the probability distribution update count N, considering all N target states stored in the last N updates to be equally probable and regarding the target state predicted at the n-th update as the target state distribution.

The invention according to claim 4 is the invention according to claim 3, wherein the target state of the moving objects is regarded as a set of ellipsoid models, the position of each moving object is represented by a position (x, y) on an arbitrary plane, the size of each moving object is represented by the radius r and height h of its ellipsoid model, and a plurality of combinations of values of the parameters x, y, r, and h are stored.

The invention according to claim 5 is a moving object tracking program in which the moving object tracking method according to claim 3 or 4 is described as a computer program executable by a computer.

The invention according to claim 6 is a recording medium characterized in that the moving object tracking method is described as a computer program executable by a computer and the computer program is recorded on the medium.

According to the inventions of claims 1 and 3, the target state distributions of a plurality of moving objects can be obtained, and the change vectors of the moving objects can be obtained from those distributions. The probabilistic likelihood of the target state based on prior knowledge of the three-dimensional structure, the probabilistic likelihood of the target state based on prior knowledge of overlap between moving objects, and the likelihood from comparing the silhouette image with the simulation image can also be obtained.

According to the inventions of claims 2 and 4, the position (x, y) of each moving object on an arbitrary plane can be stored as its position, and the radius r and height h of the ellipsoid model can be stored as its size.

According to the invention of claim 5, the moving object tracking method according to claim 3 or 4 can be described as a computer program.

According to the invention of claim 6, a computer program implementing the moving object tracking method according to claim 3 or 4 can be recorded on a recording medium.

As shown above, according to the inventions of claims 1 and 3, the change vectors of moving objects are calculated from images acquired from one or more imaging devices, and the movements of a plurality of moving objects can be tracked stably based on those change vectors. By using the probabilistic likelihoods together with the likelihood from comparing the silhouette images with the simulation images (that is, the result of integrating image information from a plurality of imaging devices), moving objects can be tracked stably even when they come very close to one another.
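The idea of a single likelihood that integrates the silhouette comparison over all cameras can be sketched as below. The text above does not state how the silhouette image and the simulation image are compared, so the pixel agreement ratio and the product over cameras used here are assumptions chosen for illustration; `agreement` and `image_likelihood` are hypothetical names.

```python
def agreement(silhouette, simulated):
    """Fraction of pixels on which two binary images agree (assumed metric)."""
    total = 0
    matches = 0
    for sil_row, sim_row in zip(silhouette, simulated):
        for s, m in zip(sil_row, sim_row):
            total += 1
            if s == m:
                matches += 1
    return matches / total


def image_likelihood(silhouettes, simulations):
    """Combine the per-camera agreement into one value (assumed: product)."""
    result = 1.0
    for silhouette, simulated in zip(silhouettes, simulations):
        result *= agreement(silhouette, simulated)
    return result


# Two cameras: the first simulation matches exactly, the second misses one pixel.
silhouettes = [[[0, 1], [0, 1]], [[1, 0], [0, 0]]]
simulations = [[[0, 1], [0, 1]], [[1, 1], [0, 0]]]
likelihood = image_likelihood(silhouettes, simulations)
```

With a product, a hypothesis must explain every view at once, which is one plausible way for a second camera to resolve an overlap that is ambiguous in the first.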

According to the inventions of claims 2 and 4, because x, y, r, and h can be stored, tracking can adapt to a plurality of moving objects of different sizes and to dynamic changes in the number of tracked moving objects. Moreover, even when tracking adapts in this way, the processing time can be kept constant and real-time performance can be ensured.

According to the invention of claim 5, a computer program implementing the moving object tracking method can be provided.

According to the invention of claim 6, a recording medium on which a computer program implementing the moving object tracking method is recorded can be provided.

These contribute to the field of computer vision.

The moving object tracking apparatus of this embodiment is described with reference to FIG. 1.

The moving object tracking apparatus comprises target state prediction means 11, image acquisition means 12, silhouette image creation means 13, target state distribution estimation means 14, target state calculation means 15, three-dimensional environment information management means 21 (for example, a database that manages three-dimensional environment information), target state distribution storage means 22, and target state storage means 23.

The target state prediction means 11 predicts the target state at the current time using the state of the tracking target obtained at the previous time (hereinafter simply the target state: the three-dimensional position and size of the tracking target) and the change vector of the target state.
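A minimal sketch of this prediction follows, assuming the change vector is simply added to the previous state (a constant-change model). The text does not give the exact formula, so the additive rule, the tuple layout (x, y, r, h), and the name `predict_state` are assumptions.

```python
def predict_state(previous_state, change_vectors):
    """Predict the current state by adding each object's change vector.

    Both arguments are lists with one (x, y, r, h) tuple per moving object.
    """
    return [
        tuple(value + delta for value, delta in zip(obj, change))
        for obj, change in zip(previous_state, change_vectors)
    ]


previous_state = [(1.0, 2.0, 0.3, 1.7)]
change_vectors = [(0.5, -0.25, 0.0, 0.0)]  # moved on the plane, size unchanged
predicted = predict_state(previous_state, change_vectors)
```

The predicted state then serves as the initial state for the distribution estimation described next.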

The image acquisition means 12 acquires the image data captured at each time and includes, for example, an imaging device such as a digital camera or a video camera. The moving object tracking apparatus of this embodiment may include one or more imaging devices. The image data may also be acquired from information management means (for example, a database) that stores the image data captured at each time.

The silhouette image creation means 13 creates a silhouette image by extracting, from the acquired image data, only the region in which the target appears.

The target state distribution estimation means 14 regards the target state obtained by the target state prediction means 11 as the initial state, estimates the probabilistic distribution of the target state using the silhouette image created by the silhouette image creation means 13 and the three-dimensional environment information management means 21, and updates the target state distribution at the previous time stored in the target state distribution storage means 22 to the target state distribution at the current time.
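The update procedure recited in claim 1 (change a state, score it with a product of likelihoods, adopt it or not, discard the first B updates, and keep the last N as equally probable) resembles a sampling loop. The sketch below is a hedged illustration only: the Metropolis-style acceptance rule is an assumption, since the text only says the provisional state is either adopted or not, and the one-dimensional toy state, `total_likelihood`, `estimate_distribution`, and `propose` are hypothetical.

```python
import math
import random


def total_likelihood(state, terms):
    """Product of the conditional likelihood terms for a provisional state."""
    result = 1.0
    for term in terms:
        result *= term(state)
    return result


def estimate_distribution(initial_state, terms, propose, B, N, rng):
    """Run B + N updates and return the states kept in the last N updates."""
    state = initial_state
    score = total_likelihood(state, terms)
    samples = []
    for n in range(1, B + N + 1):
        candidate = propose(state, rng)
        candidate_score = total_likelihood(candidate, terms)
        # Assumed adoption rule: accept with probability min(1, ratio).
        if score == 0.0 or rng.random() < candidate_score / score:
            state, score = candidate, candidate_score
        if n > B:  # the first B updates are not kept
            samples.append(state)
    return samples  # treated as N equally probable states


# Toy stand-ins for the three likelihood terms, on a 1-D position x.
terms = [
    lambda x: 1.0 if 0.0 <= x <= 10.0 else 0.0,  # stays inside the room
    lambda x: 1.0,                               # no other object to overlap
    lambda x: math.exp(-(x - 5.0) ** 2),         # image evidence near x = 5
]
propose = lambda x, rng: x + rng.uniform(-1.0, 1.0)
samples = estimate_distribution(2.0, terms, propose, B=50, N=100, rng=random.Random(0))
```

Because the number of updates is fixed at B + N regardless of how many objects are tracked, a loop of this shape keeps the per-frame cost bounded, which is consistent with the real-time claim made above.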

The target state calculation means 15 uses the target state distribution storage means 22, which holds the target state distribution at the current time, to calculate the target state with the maximum probability, calculates the change vector from the target state at the previous time, and updates the change vector stored in the target state storage means 23.
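As a rough sketch of this step, assume the distribution is held as a list of equally probable sampled states and take the most frequent sample as the maximum-probability state. The text above does not say how the maximum is found, so that choice is an assumption, as are the names `most_probable_state` and `change_vector` and the single-object (x, y, r, h) tuples.

```python
from collections import Counter


def most_probable_state(samples):
    """Return the state that occurs most often among the samples (assumed rule)."""
    return Counter(samples).most_common(1)[0][0]


def change_vector(current, previous):
    """Component-wise difference between two (x, y, r, h) states."""
    return tuple(c - p for c, p in zip(current, previous))


# Three samples agree on one state, one sample differs.
samples = [(1.5, 2.0, 0.3, 1.7)] * 3 + [(1.0, 2.0, 0.3, 1.7)]
best_now = most_probable_state(samples)
best_before = (1.0, 2.0, 0.3, 1.7)
delta = change_vector(best_now, best_before)
```

The resulting `delta` is what would be stored and reused by the prediction at the next time.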

The three-dimensional environment information management means 21 stores the real-world three-dimensional structure information measured in advance, the internal parameters (for example, focal length and image center) of the installed image acquisition means 12 (for example, a digital camera; hereinafter simply a camera), and the external parameters of the image acquisition means 12 (for example, the three-dimensional position and orientation of the camera itself). For example, the three-dimensional environment information management means 21 may include a database that stores the three-dimensional structure information, the internal parameters, and the external parameters. It may also store them in a storage unit (for example, a hard disk device or nonvolatile memory).

The target state distribution storage means 22 stores the probabilistic distribution of the target state estimated by the target state distribution estimation means 14. For example, the estimated distribution may be stored in a memory (or storage area) included in the target state distribution storage means 22 or in a hard disk device.

The target state storage means 23 stores the maximum-probability target state calculated by the target state calculation means 15 and the change vector relative to the target state at the previous time. For example, the change vector may be stored in a memory (or storage area) included in the target state storage means 23 or in a hard disk device.

The conditions for the moving object tracking apparatus of this embodiment are described with reference to FIG. 2. In the following description, explanations of items with the same reference numerals as in FIG. 1 are omitted.

FIG. 2 shows an example in which a plurality of persons walking on a plane (persons H1, H2, ..., Hk in FIG. 2) are the tracking targets (that is, the moving objects) and tracking is performed using p video cameras whose positions and orientations are fixed (cameras C1, C2, ..., Cp). The cameras C1 to Cp acquire images in synchronization at the same capture interval. To do so, a synchronization device may, for example, be connected to or mounted on the cameras C1 to Cp.

In general, the space contains non-target objects that are not tracked (non-target object B in FIG. 2). Three-dimensional structure data for these objects can be measured and acquired in advance, for example by creating CAD (Computer-Aided Design) data by hand, by using a range finder, or by three-dimensional reconstruction from camera images. This three-dimensional structure data is converted into a coordinate system in which the XY plane coincides with the floor and the Z axis coincides with the height direction.

In addition, the internal parameters of all p installed cameras are calibrated, and each camera's three-dimensional position (x, y, z) and orientation (φ, θ, γ) are determined in the same coordinate system as the acquired three-dimensional structure data (see, for example, Non-Patent Document 1). The three-dimensional structure data and the three-dimensional positions and orientations of all cameras are stored in advance in the three-dimensional environment information management means 21.

The shape model of a person (moving object) in this embodiment is described with reference to FIG. 3.

In the following description, a time denotes the number of samples taken (that is, an ordinal index) at a fixed sampling interval Δt. The initial time t₀ is the time at which tracking starts (that is, t = 0). The previous time lies between the initial time t₀ and the current time; specifically, it is the time at which tracking was last performed immediately before the current time being processed (that is, t − 1 when the current time is t).

In FIG. 3, an ellipsoid e is taken as a model approximating a person's shape. The person's position is given by two-dimensional coordinates (x, y) on the floor plane, and the person's size by the radius r and height h of the ellipsoid e. The target state distribution estimation means 14 uses the MCMC (Markov Chain Monte Carlo) method, a kind of probabilistic state distribution estimation method.

The target state S, which represents the states of multiple persons, is formed by concatenating the vectors M (person models) that represent each person's position and size. The dimensionality of S is determined by the maximum number K of persons that can be tracked: because each M is a four-dimensional vector, S is a 4K-dimensional vector. For example, the state S_t at time t is expressed by the following equation.

$$S_t = (M_1, M_2, \dots, M_K)$$

The person model M_i, which represents the position, size, and shape of person i within the target state S, is the four-dimensional vector given by the following expression.

$$M_i = (x_i, y_i, r_i, h_i)$$

Here, x_i and y_i are the two-dimensional coordinates (x, y) of person i on the floor plane, and r_i and h_i are the radius r and height h of the ellipsoid that approximates the person. The values x_i, y_i, r_i, and h_i may, for example, be stored in and accessed from a storage unit (not shown; for example, a memory or an external storage device) provided in the moving object tracking apparatus.
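The state representation described above can be sketched as a small data structure. This is a minimal illustration only: the names `PersonModel`, `is_tracked`, and `flatten`, and the sample values, are assumptions made for this sketch and do not appear in the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PersonModel:
    """One person model M_i = (x_i, y_i, r_i, h_i); all zeros means 'not tracked'."""
    x: float = 0.0  # floor-plane x coordinate
    y: float = 0.0  # floor-plane y coordinate
    r: float = 0.0  # ellipsoid radius
    h: float = 0.0  # ellipsoid height

    def is_tracked(self) -> bool:
        # A zero vector marks an unused slot.
        return any(v != 0.0 for v in (self.x, self.y, self.r, self.h))


def flatten(state: List[PersonModel]) -> List[float]:
    """Concatenate K person models into the 4K-dimensional target state S."""
    out: List[float] = []
    for m in state:
        out.extend([m.x, m.y, m.r, m.h])
    return out


K = 3  # maximum number of trackable persons (illustrative value)
state = [PersonModel() for _ in range(K)]
state[0] = PersonModel(x=1.5, y=2.0, r=0.25, h=1.7)
```

Keeping unused slots as zero vectors lets the state have a fixed length of 4K regardless of how many persons are currently tracked.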

As described above, this embodiment estimates the target state S_t at each time t (the state representing the positions and sizes of multiple objects or persons in three-dimensional space at time t), and thereby sequentially estimates and tracks the states of multiple objects or persons.

The processing procedure of the moving object tracking method of this embodiment is described with reference to the flowchart in FIG. 4. In the following description, elements bearing the same reference numerals as in FIG. 1 are not described again.

Processing starts at the initial time t₀ (t = 0). At the start, the target state S and the change vector V of the target state are initialized to zero vectors (S300).

Next, the target state prediction means 11 predicts the target state at the current time (S301). This is a linear prediction that uses the target state and the change vector calculated at the previous time. Letting S_{t-1} be the target state obtained at the previous time and V_{t-1} the 4K-dimensional change vector, the predicted target state S'_{t,0} at the current time is given by the following equation.

$$S'_{t,0} = S_{t-1} + V_{t-1}$$

Here, the index t of S' denotes time t, and the index 0 indicates that the predicted value has been updated zero times. In the estimation of the target state distribution (S304) described later, the predicted value S'_{t,0} is treated as the initial value and is then updated multiple times to obtain the distribution of the target state. In other words, the index 0 marks the initial value, whose update count is 0.

Immediately after tracking starts (initial time t₀), the target state and change vector of the previous time are unknown, so S_0 is initialized by setting all elements of every M_i to 0.
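The linear prediction of step S301 amounts to adding the previous change vector to the previous state element by element. A minimal sketch, assuming the state and change vector are held as flat Python lists of equal length (the function name `predict` is hypothetical):

```python
from typing import List


def predict(prev_state: List[float], prev_change: List[float]) -> List[float]:
    """Return S'_{t,0} = S_{t-1} + V_{t-1}, computed element by element."""
    if len(prev_state) != len(prev_change):
        raise ValueError("state and change vector must have the same length")
    return [s + v for s, v in zip(prev_state, prev_change)]


# At the initial time both vectors are zero, so the prediction is also zero.
initial_prediction = predict([0.0] * 8, [0.0] * 8)
```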

Next, the image acquisition means 12 acquires the synchronously captured image data Im_1, Im_2, …, Im_p from the cameras C1, C2, …, Cp (S302).

Next, the silhouette image creation means 13 creates silhouette images Sil_1, Sil_2, …, Sil_p from the image data Im_1, Im_2, …, Im_p acquired from the cameras C1, C2, …, Cp (S303). A silhouette image is a binary image in which pixels in the region showing a target have the value 1 and all other pixels have the value 0. In FIG. 5, image 501 is an input image and image 502 is an example of the resulting silhouette image. Such images can be generated easily with well-known methods such as background subtraction or inter-frame differencing.
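As an illustration of the background subtraction mentioned above, the following sketch thresholds the absolute difference between a grayscale frame and a reference background image. The function name, the list-of-lists image layout, and the threshold value are assumptions for this example; the source does not specify them.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel intensities


def silhouette(frame: Image, background: Image, threshold: int = 30) -> Image:
    """Binary silhouette: 1 where the frame differs from the background, else 0."""
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


background = [[10, 10], [10, 10]]
frame = [[10, 200], [12, 11]]
sil = silhouette(frame, background)
```

A practical system would normally add noise filtering and background model updates, which are outside the scope of this sketch.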

Next, the target state S'_{t,0} predicted in step S301 is taken as the initial value of the target state, and S' is updated iteratively to estimate the probability distribution of the target state (S304).

The procedure for estimating the probability distribution of the target state is described with reference to FIG. 6.

The probability distribution of the target state is estimated through (B + N) state update steps (updates applied to the target state distribution storage means). The first B updates yield a coarse probability distribution that covers persons entering or leaving the monitored area and the approximate position of each person. The last N updates start from that coarse state and, without adding or removing persons, yield a more detailed probability distribution over each person's position and size. This two-stage estimation makes it possible to estimate the states of multiple persons robustly.

The states obtained in the last N updates are stored as the state distribution, and all N states are regarded as equally probable. Because the states obtained in the first B updates are thought to depend on the initial value, only the states from the last N updates are kept (a procedure generally called burn-in). The ratio of B to N can be set freely on an empirical basis.

First, the update count n is initialized to 0 and processing begins.

In step S401, a change type is selected. A change type is the kind of operation used to change the state, and there are four: type 1 adds a person, type 2 deletes a person, type 3 changes a person's position, and type 4 changes a person's size. How a change type is selected depends on the update count, as follows.

When the update count satisfies n ≤ B:
First, if no one is currently being tracked (all M are zero vectors), type 1 (adding a person) is always selected. Otherwise (at least one person is being tracked), the probability of each type is set arbitrarily in advance, and one of the four types is selected at random according to those probabilities.

When the update count satisfies n > B:
As described later, if it was determined during the first B updates that no one is being tracked, the update process has already ended. When the number of tracked persons is not 0, either type 3 (changing a person's position) or type 4 (changing a person's size) is selected with a predetermined probability.

Next, in step S402, the target to be changed is selected. If the current update is the n-th, a provisional changed state S^* is calculated from the target state S'_{t,n-1} produced by the (n−1)-th update.

A single update does not change the states of all targets; each update changes the state of only one target. If the change type is adding a person, one M_i is chosen at random from those that are currently zero vectors (that is, untracked M_i). Otherwise, one M_i is chosen at random from those that are not zero vectors (that is, tracked M_i).
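Steps S401 and S402 can be sketched as two small functions. The type constants, function names, and default selection probabilities below are assumptions for illustration; the source only says the probabilities are set in advance. The source also does not say what happens when type 1 is chosen while every slot is occupied, so this sketch simply returns `None` in that case.

```python
import random
from typing import List, Optional, Sequence, Tuple

ADD, DELETE, MOVE, RESIZE = 1, 2, 3, 4
Model = Tuple[float, float, float, float]  # (x, y, r, h); all zeros = untracked


def select_change_type(n: int, B: int, num_tracked: int, rng: random.Random,
                       coarse_weights: Sequence[float] = (0.25, 0.25, 0.25, 0.25),
                       fine_weights: Sequence[float] = (0.5, 0.5)) -> Optional[int]:
    """S401: choose the change type for the n-th update."""
    if n <= B:
        if num_tracked == 0:
            return ADD  # nobody tracked: adding a person is the only option
        return rng.choices([ADD, DELETE, MOVE, RESIZE], weights=coarse_weights)[0]
    if num_tracked == 0:
        return None  # the update process would already have ended (see S409)
    return rng.choices([MOVE, RESIZE], weights=fine_weights)[0]


def select_target(change_type: int, state: List[Model],
                  rng: random.Random) -> Optional[int]:
    """S402: pick one slot; an empty slot for ADD, a tracked slot otherwise."""
    want_empty = change_type == ADD
    candidates = [i for i, m in enumerate(state) if (not any(m)) == want_empty]
    return rng.choice(candidates) if candidates else None
```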

Next, in step S403, a provisional person model M'_i is calculated from the selected person model M_i according to the selected change type (type 1 to type 4). The processing for each type is described below.

Type 1 (adding a person) is processed as follows.

Because the person model M_i is a zero vector, new values must be assigned to it. For example, the values can be set as follows.

Assume that the range of positions a person can occupy in the space is known from the three-dimensional environment information management means 21, and define this range as follows.

For the parameters r and h, which describe the size of the ellipsoid approximating a person, the range of possible values can likewise be restricted using empirical prior knowledge about human size (for example, no one is taller than 3 m). Here, this range is defined as follows.

Under these conditions, M'_i can be generated by drawing a uniform random number for each of the ellipsoid parameters x, y, r, and h.

Type 1 (adding a person) represents a person newly entering the tracking range. If the locations where people enter and leave the tracking range (for example, the position of a door) are known in advance, it is also possible to treat only the vicinity of those locations as the entry/exit area.

Type 2 (deleting a person) is processed as follows.

All elements are set to 0, so that M'_i is a zero vector. Type 2 (deleting a person) represents a tracked person leaving the tracking range.

Type 3 (changing a person's position) is processed as follows.

From the position (x_i, y_i) of the person on the two-dimensional plane, taken from the elements of the selected M_i, a provisional position (x'_i, y'_i) is calculated according to Equation 4 when n ≤ B and according to Equation 5 when n > B.

Here, W_x and W_y are each one-dimensional white noise, whose parameters can be set to arbitrary values experimentally.

Here, δ_x and δ_y are each one-dimensional Gaussian noise, whose parameters can be set to arbitrary values experimentally. Using the provisional position (x'_i, y'_i), M'_i is defined by the following equation.

$$M'_i = (x'_i, y'_i, r_i, h_i)$$

Type 4 (changing a person's size) is processed as follows.

From (r_i, h_i), the radius r and height h of the ellipsoid approximating the person, taken from the elements of the selected M_i, provisional shape and size parameters (r'_i, h'_i) are calculated according to Equation 7 when n ≤ B and according to Equation 8 when n > B.

Here, W_r and W_h are each one-dimensional white noise, whose parameters can be set to arbitrary values experimentally.

Here, δ_r and δ_h are each one-dimensional Gaussian noise, whose parameters can be set to arbitrary values experimentally. Using the provisional shape and size parameters (r'_i, h'_i) of the ellipsoid, M'_i is defined by the following equation.

$$M'_i = (x_i, y_i, r'_i, h'_i)$$

Using the M'_i generated by one of the above types, the provisionally changed target state S^* is generated. Specifically, if the current time is t and the current update is the n-th, S^* is the state S'_{t,n-1} obtained by the (n−1)-th update with the selected person model M_i replaced by the newly generated M'_i.
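The four change types of step S403 and the construction of S^* can be sketched as below. Several details are assumptions, because the corresponding equations are not reproduced in this text: the noise is added to the current value, the "white noise" of the coarse phase (n ≤ B) is modelled as uniform noise, and the widths `coarse_width` and `fine_sigma` are placeholder values. The function names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Model = Tuple[float, float, float, float]  # (x, y, r, h)


def propose(model: Model, change_type: int, n: int, B: int, rng: random.Random,
            bounds: Sequence[Tuple[float, float]],
            coarse_width: float = 0.5, fine_sigma: float = 0.05) -> Model:
    """Return a provisional person model M'_i for change types 1 to 4."""
    x, y, r, h = model

    def noise() -> float:
        # Assumption: broad uniform noise while n <= B, narrow Gaussian afterwards.
        if n <= B:
            return rng.uniform(-coarse_width, coarse_width)
        return rng.gauss(0.0, fine_sigma)

    if change_type == 1:  # add: draw x, y, r, h uniformly within their ranges
        vx, vy, vr, vh = (rng.uniform(lo, hi) for lo, hi in bounds)
        return (vx, vy, vr, vh)
    if change_type == 2:  # delete: zero vector
        return (0.0, 0.0, 0.0, 0.0)
    if change_type == 3:  # move: perturb the position only
        return (x + noise(), y + noise(), r, h)
    if change_type == 4:  # resize: perturb radius and height only (not clamped here)
        return (x, y, r + noise(), h + noise())
    raise ValueError(f"unknown change type: {change_type}")


def provisional_state(state: List[Model], i: int, new_model: Model) -> List[Model]:
    """Build S^* by replacing only slot i of the previous state S'_{t,n-1}."""
    out = list(state)
    out[i] = new_model
    return out
```

Because only one slot changes per update, the cost of generating a proposal does not grow with the number of tracked persons.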

Next, using the provisionally changed target state S^*, images that simulate the silhouette images Sil_1, Sil_2, …, Sil_p created in step S303 (that is, simulation images Sim_1, Sim_2, …, Sim_p) are created (S404). The simulation images are created as follows.

First, a virtual scene simulating the real world is constructed by combining the information held by the three-dimensional environment information management means 21 with the target state. The three-dimensional environment information management means 21 is assumed to hold the real-world three-dimensional structure as a three-dimensional point cloud (or polygon data) in a coordinate system whose XY plane is the floor and whose Z axis is the height direction. Multiple three-dimensional ellipsoid models based on the provisional target state S^* calculated in step S403 can be placed within this three-dimensional structure.

Next, using the internal parameters (for example, focal length and image center) and external parameters (three-dimensional position and orientation) of the cameras installed in the real world, the virtual scene is projected through virtual cameras to create the simulation images. The internal and external parameters are assumed to be stored in the three-dimensional environment information management means 21.

The projection can be performed uniquely using, for example, the method described in Non-Patent Document 2. By rendering a binary image in which the ellipsoid regions have the value 1 and all other regions have the value 0, simulation images Sim_1, Sim_2, …, Sim_p that simulate the silhouette images Sil_1, Sil_2, …, Sil_p can be created.
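One possible way to render such a binary simulation image is to cast a ray through each pixel of a pinhole camera and test it against each ellipsoid. The sketch below is not the method of Non-Patent Document 2; it is a simplified stand-in under stated assumptions: each ellipsoid is centred at height h/2 with semi-axes (r, r, h/2), `R` is a world-to-camera rotation matrix, a single focal length `f` in pixels is used, and occlusion by the static scene is ignored. All names are hypothetical.

```python
from typing import List, Sequence, Tuple

Model = Tuple[float, float, float, float]  # (x, y, r, h)


def render_silhouette(people: Sequence[Model], cam_pos: Sequence[float],
                      R: Sequence[Sequence[float]], f: float, cx: float, cy: float,
                      width: int, height: int) -> List[List[int]]:
    """Binary image: 1 where a pixel's viewing ray hits any ellipsoid, else 0."""
    image = [[0] * width for _ in range(height)]
    for v in range(height):
        for u in range(width):
            dc = ((u - cx) / f, (v - cy) / f, 1.0)  # ray direction, camera frame
            # Rotate into the world frame: d_w = R^T d_c.
            dw = [sum(R[k][j] * dc[k] for k in range(3)) for j in range(3)]
            for (x, y, r, h) in people:
                if r <= 0 or h <= 0:
                    continue  # untracked slot
                axes = (r, r, h / 2.0)
                centre = (x, y, h / 2.0)
                # Scale space so the ellipsoid becomes a unit sphere at the origin.
                o = [(cam_pos[j] - centre[j]) / axes[j] for j in range(3)]
                d = [dw[j] / axes[j] for j in range(3)]
                a = sum(di * di for di in d)
                b = sum(oi * di for oi, di in zip(o, d))
                c = sum(oi * oi for oi in o) - 1.0
                disc = b * b - a * c
                # Hit if a real intersection exists in front of the camera.
                if disc >= 0 and (-b + disc ** 0.5) > 0:
                    image[v][u] = 1
                    break
    return image


# Camera at (0, -5, 1) looking along +Y; image rows point down (world -Z).
R_example = [[1, 0, 0], [0, 0, -1], [0, 1, 0]]
sim = render_silhouette([(0.0, 0.0, 0.3, 1.8)], (0.0, -5.0, 1.0),
                        R_example, 20.0, 10.0, 10.0, 21, 21)
```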

This completes the description of how the simulation images are created.

Next, a likelihood is calculated to judge how plausible the provisional target state S^* is (S405). The likelihood is calculated from two sources: an assessment, based on prior knowledge, of the plausibility of the target state itself, and a comparison between the silhouette images Sil_1, Sil_2, …, Sil_p created in step S303 and the simulation images Sim_1, Sim_2, …, Sim_p generated in step S404.

Two kinds of assessment are considered for judging the plausibility of the target state itself under prior-knowledge conditions (that is, for calculating a probabilistic likelihood based on those conditions, referred to as the conditional likelihood).

The first assessment uses restrictions derived from the three-dimensional environment. Using the three-dimensional environment information management means 21, the existence probability can be weighted according to a person's position (x, y). This embodiment assumes that persons are on the floor plane and do not climb onto stationary objects other than the targets in the space. Under this assumption, the existence probability can be weighted by, for example, a penalty function P(S^*) given by the following expression, which is inversely proportional to the height of the three-dimensional structure at the person's position (x, y).

Here, h(M_i) is the Z value of the three-dimensional structure at the position (x_i, y_i) given by the elements of M_i, and α is a constant that can be set arbitrarily by experiment.

The second assessment is based on the condition that tracked persons do not overlap one another in three-dimensional space. Because persons are approximated by ellipsoids, the ellipsoids must satisfy a non-overlap condition: the distance between the centers of two different ellipsoids must be greater than the sum of their radii.

To keep two persons from overlapping in three-dimensional space, a penalty function E(S^*) is defined in terms of the distance R given by the following expression, that is, the distance between the centers of the ellipsoids representing the two persons.

Here, x_n and y_n denote the x and y coordinates of the position of person n.

The function E, which imposes a penalty as the ellipsoids come closer together, can be set, for example, as in the following expression.

Here, β is a constant that can be set arbitrarily by experiment.
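The two prior-knowledge penalties can be illustrated as follows. The exact expressions of the original are not reproduced in this text, so the functional forms here are assumptions chosen only to match the description: a factor 1 / (1 + α·height) that shrinks as the structure under a person gets taller, and a factor exp(β·gap) applied when the floor-plane distance between two centers is smaller than the sum of the radii. The callback `height_at` and both function names are hypothetical.

```python
import math
from itertools import combinations
from typing import Callable, Sequence, Tuple

Model = Tuple[float, float, float, float]  # (x, y, r, h)


def environment_penalty(state: Sequence[Model],
                        height_at: Callable[[float, float], float],
                        alpha: float = 1.0) -> float:
    """Illustrative P(S*): equals 1 on the bare floor, smaller on raised structure."""
    p = 1.0
    for m in state:
        if not any(m):
            continue  # untracked slot
        x, y, _, _ = m
        p *= 1.0 / (1.0 + alpha * height_at(x, y))
    return p


def overlap_penalty(state: Sequence[Model], beta: float = 5.0) -> float:
    """Illustrative E(S*): equals 1 when no pair overlaps, smaller otherwise."""
    tracked = [m for m in state if any(m)]
    e = 1.0
    for (x1, y1, r1, _), (x2, y2, r2, _) in combinations(tracked, 2):
        dist = math.hypot(x1 - x2, y1 - y2)  # distance R between the two centers
        gap = dist - (r1 + r2)
        if gap < 0:  # ellipsoids overlap in the floor plane
            e *= math.exp(beta * gap)
    return e
```

Both functions return values in (0, 1], so they can be multiplied directly into a likelihood.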

Next, the comparison between the silhouette images Sil_1, Sil_2, …, Sil_p created in step S303 and the simulation images Sim_1, Sim_2, …, Sim_p is described. The comparison can be calculated, for example, with the following evaluation expression (that is, a conditional likelihood is calculated).

Here, (k, l) denotes the element (pixel) at row k and column l of an image.

The above expression integrates the information from all p cameras, so the evaluation uses information from multiple viewpoints.
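As an illustration of a multi-camera comparison, the sketch below scores a set of simulation images against the silhouette images by the fraction of pixels that agree, pooled over all cameras. This particular measure is an assumption standing in for the evaluation expression, which is not reproduced in this text; the function name is hypothetical.

```python
from typing import List, Sequence

Image = List[List[int]]  # binary image as rows of 0/1 pixels


def silhouette_score(silhouettes: Sequence[Image], simulations: Sequence[Image]) -> float:
    """Fraction of pixels (k, l), over all cameras, where Sil and Sim agree."""
    agree = 0
    total = 0
    for sil, sim in zip(silhouettes, simulations):
        for sil_row, sim_row in zip(sil, sim):
            for a, b in zip(sil_row, sim_row):
                agree += 1 if a == b else 0
                total += 1
    return agree / total if total else 0.0
```

Pooling pixels across cameras means a state that explains one view well but contradicts another receives a lower score than one that is consistent with all views.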

Finally, the likelihood L of the provisional state S^* is calculated as a product, as in the following expression.

This completes the likelihood calculation (S405).

Next, in step S406, the likelihood calculated in step S405 is used to decide whether to accept the provisional state S^* and update, or to reject it. The decision is based on an acceptance/rejection probability A, which can be calculated, for example, by the following equation.

$$A = \min\left(1, \frac{L}{L'}\right)$$

Here, if the current update is the n-th, L' denotes the likelihood of S'_{t,n-1}, the state from the (n−1)-th update. In other words, if the current state has a higher likelihood than the previous state, it is always accepted; otherwise, the provisional state S^* is adopted with probability L/L' (that is, the acceptance/rejection probability A).

If the provisional state S^* is adopted, S'_{t,n} is set equal to S^* (S'_{t,n} = S^*); if it is rejected, S'_{t,n} is set equal to S'_{t,n-1} (S'_{t,n} = S'_{t,n-1}). Processing then proceeds to the condition check in step S407.
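The accept/reject decision of step S406 follows the usual Metropolis rule: accept with probability min(1, L/L'). A minimal sketch, with a hypothetical function name; treating a previous likelihood of zero as "always accept" is an assumption added here to avoid dividing by zero.

```python
import random


def accept(likelihood_new: float, likelihood_prev: float, rng: random.Random) -> bool:
    """Return True if the provisional state S* should replace the previous state."""
    if likelihood_prev <= 0.0:
        return True  # assumption: nothing to lose when the previous state is impossible
    a = min(1.0, likelihood_new / likelihood_prev)
    # rng.random() is in [0, 1), so a == 1 always accepts and a == 0 never does.
    return rng.random() < a


rng = random.Random(0)
always = accept(0.9, 0.5, rng)   # higher likelihood: always accepted
never = accept(0.0, 0.5, rng)    # zero likelihood: never accepted
```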

In step S407, it is determined whether the update count n equals B. If n does not equal B, processing proceeds to the condition check in step S410. If n equals B, processing proceeds to the state distribution calculation in step S408.

In step S408, a state distribution is calculated as a coarse estimate from the states obtained in the B updates so far. The B states may differ from one another in information such as the number of persons being tracked and whether a person tracked up to the previous time is still being tracked.

The number of tracked persons and the tracking-continuation information can be represented by the following K-dimensional vector k.

$$k = (I_1, I_2, \dots, I_K)$$

Each I_i corresponds to the person model state M_i and takes one of the values 0, 1, or 2. The values of I_i are defined as follows.

When I_i is 0, the corresponding M is a zero vector and no tracking is being performed for it.

When I_i is 1, a person tracked at the previous time continues to be tracked.

When I_i is 2, a person was newly added to the corresponding M during the update process at the current time.

States whose k vectors are identical are counted together. The k vector that receives the most votes is taken to represent the current number of tracked persons and the tracking continuation, and a coarse state distribution is calculated from the states that have that k vector.

For example, let C be the maximum number of votes and let S'_{t,i} (i = 1, 2, …, C) be the states that received it; the state distribution S_t' is then calculated by the following expression. The index i is assigned to the states holding the winning k vector in ascending order of update count.
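The voting step can be sketched with a counter over k vectors. Two points are assumptions of this sketch rather than statements of the source: a slot counts as "continued" (I_i = 1) when the same slot was non-zero at the previous time, and the coarse state is taken as the element-wise mean of the C winning states, since the original expression is not reproduced in this text. Function names are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Model = Tuple[float, float, float, float]  # (x, y, r, h)
State = Sequence[Model]


def k_vector(state: State, prev_state: State) -> Tuple[int, ...]:
    """I_i = 0: untracked, 1: continued from the previous time, 2: newly added."""
    k = []
    for m, pm in zip(state, prev_state):
        if not any(m):
            k.append(0)
        elif any(pm):
            k.append(1)
        else:
            k.append(2)
    return tuple(k)


def coarse_state(samples: Sequence[State], prev_state: State):
    """Vote on k vectors, then average the states that share the winning vector."""
    keys = [k_vector(s, prev_state) for s in samples]
    winner, votes = Counter(keys).most_common(1)[0]
    chosen = [s for s, key in zip(samples, keys) if key == winner]
    mean: List[Model] = [
        tuple(sum(s[i][j] for s in chosen) / votes for j in range(4))
        for i in range(len(prev_state))
    ]
    return winner, mean
```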

Next, in the condition check of step S409, it is determined whether the number of persons currently being tracked is 0 in the state distribution obtained in step S408. If the number of tracked persons is 0, the state update process ends. If at least one person is being tracked (the number of tracked persons is not 0), the state of the n-th update is set to the state distribution S_t', and processing proceeds to the condition check in step S410.

In step S410, it is determined whether the prescribed number of updates has been performed. If the update count n exceeds B + N, the sum of the freely chosen constants B and N, the process ends. In addition, the state S'_{t,n} is saved in the target state distribution storage means 22. All N states saved during the last N updates are regarded as equally probable, and together they form the target state distribution. If n does not exceed B + N, n is incremented by 1 and processing returns to step S401.

The processing described above constitutes step S304.
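Taken together, step S304 is a Metropolis-style sampling loop with a burn-in of B updates followed by N retained samples. The skeleton below is a generic sketch, not the patented implementation: the proposal, likelihood, and coarse-estimate steps are passed in as callables, the loop counts updates from 1, and the toy usage at the end samples a one-dimensional state purely to show that the loop runs. All names are hypothetical.

```python
import math
import random
from typing import Callable, List, Optional


def estimate_distribution(initial, propose: Callable, likelihood: Callable,
                          B: int, N: int, rng: random.Random,
                          coarse: Optional[Callable] = None) -> List:
    """Run B burn-in updates, then keep the states of the last N updates."""
    state, l_prev = initial, likelihood(initial)
    history, kept = [], []
    for n in range(1, B + N + 1):
        candidate = propose(state, n)                 # S401 to S403
        l_new = likelihood(candidate)                 # S404 to S405
        if l_prev <= 0 or rng.random() < min(1.0, l_new / l_prev):  # S406
            state, l_prev = candidate, l_new
        history.append(state)
        if n == B and coarse is not None:             # S407 to S408
            state = coarse(history)
            if state is None:                         # S409: nobody is tracked
                return []
            l_prev = likelihood(state)
        if n > B:                                     # S410: retain post-burn-in states
            kept.append(state)
    return kept


# Toy usage: sample a scalar whose likelihood peaks at 0.
rng = random.Random(0)
samples = estimate_distribution(
    initial=0.0,
    propose=lambda s, n: s + rng.uniform(-1.0, 1.0),
    likelihood=lambda s: math.exp(-s * s),
    B=50, N=100, rng=rng,
)
```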

Next, in step S305, the target state with the maximum probability is calculated from the target state distribution obtained in step S304. If the number of tracked persons was determined to be 0 in step S304, the target state is a zero vector and this calculation is skipped. Because the N target states making up the distribution are all assumed to be equally probable, the maximum-probability target state S_t can be calculated by the following expression.

In addition, so that the target state prediction means 11 can predict the target state at the next time, a 4K-dimensional change vector V is obtained by the following expression.

$$V_t = S_t - S_{t-1}$$

At the initial time t₀, however, the above expression cannot be evaluated, so all elements of V are set to 0. The S_t and V calculated in this way are saved in the target state storage means 23.
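Step S305 can be sketched as follows. Because the N retained states are treated as equally probable, this sketch takes their element-wise mean as the representative state S_t; that choice is an assumption, since the original expression is not reproduced in this text. The change vector is the difference between the current and previous representative states. Function names are hypothetical, and states are held as flat lists.

```python
from typing import List, Sequence


def mean_state(samples: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of N equally weighted 4K-dimensional states."""
    n = len(samples)
    if n == 0:
        raise ValueError("at least one sample is required")
    return [sum(column) / n for column in zip(*samples)]


def change_vector(current: Sequence[float], previous: Sequence[float]) -> List[float]:
    """V_t = S_t - S_{t-1}, used by the linear prediction at the next time."""
    return [c - p for c, p in zip(current, previous)]


s_t = mean_state([[1.0, 2.0], [3.0, 4.0]])
v_t = change_vector(s_t, [1.0, 1.0])
```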

In step S306, it is checked whether tracking should end. If so, processing ends; if tracking is to continue, the state at the next time is predicted (S301). For example, it is checked whether the time t has exceeded a preset value. If it has, tracking ends; if not, processing returns to step S301 to continue tracking. Alternatively, a tracking stop instruction means (for example, a keyboard provided in the moving object tracking apparatus) may be provided, and tracking may be stopped when a stop instruction is received from it.

As described above, by repeating the processing of this embodiment, the states of a plurality of persons whose number changes dynamically can be estimated sequentially.

That is, this embodiment has been made to solve the problems of the prior art described above; by integrating information from a plurality of cameras, it can track stably even when targets approach each other.

In addition, by performing two-stage probability distribution estimation to handle dynamic increases and decreases in the number of tracked persons, persons who newly appear in the tracking space and persons who leave it can be tracked stably.

In addition, even if the number of persons tracked at one time increases, the overall processing load can be kept constant, so the real-time performance of the tracking process can be maintained.

In addition, by estimating the size of a target together with its position, tracking that accommodates individual differences can be performed.

In this embodiment, Expression 11 is used in the processing of step S405 as an example for comparing the silhouette image with the simulation image, but the same result can be achieved with another evaluation function that measures the similarity between images (for example, a simple product operation).
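As an illustration of the "simple product operation" mentioned above, the following Python sketch scores the similarity of two binary images as the sum of their pixel-wise products, which counts the pixels set in both. The representation (nested lists of 0 and 1) and the function name `product_similarity` are assumptions of this sketch and are not taken from the embodiment.

```python
from typing import List

Image = List[List[int]]  # assumed: binary image, 1 = foreground, 0 = background


def product_similarity(silhouette: Image, simulation: Image) -> int:
    """Sum of pixel-wise products of two binary images of the same size."""
    if len(silhouette) != len(simulation):
        raise ValueError("images must have the same number of rows")
    score = 0
    for row_a, row_b in zip(silhouette, simulation):
        if len(row_a) != len(row_b):
            raise ValueError("images must have the same number of columns")
        # A product is 1 only where both images mark the pixel as foreground.
        score += sum(a * b for a, b in zip(row_a, row_b))
    return score
```

A larger score means the simulated silhouette covers more of the observed silhouette; a practical evaluation function would probably also penalise simulated foreground that falls outside the observed silhouette.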

Also, in this embodiment, Expression 10, which is weighted in the height direction, is used in the processing of step S405 as an example of the environmental penalty function, but naturally any other function that can express a weighting of the positions where a person may exist can be used instead.

Furthermore, in this embodiment, Expression 11 is used in the processing of step S405 as an example of the penalty function relating to the proximity of persons, but naturally any function that imposes a penalty when persons overlap can be used.
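The following Python sketch shows one hypothetical function that penalises overlapping persons, together with the product of three terms that the claims describe as the likelihood of a provisional target state. The exponential form, the use of the ellipsoid radius r as a footprint radius on the (x, y) plane, and the names `overlap_penalty` and `combined_likelihood` are all assumptions made for illustration; they are not Expression 10 or Expression 11 of the embodiment.

```python
import math
from typing import Tuple

Person = Tuple[float, float, float, float]  # (x, y, r, h) of one ellipsoid model


def overlap_penalty(a: Person, b: Person) -> float:
    """Return 1.0 when the footprints are disjoint and less than 1.0 when they overlap."""
    distance = math.hypot(a[0] - b[0], a[1] - b[1])
    # Depth by which the two circular footprints intersect (0 if they do not).
    depth = max(0.0, a[2] + b[2] - distance)
    return math.exp(-depth)


def combined_likelihood(environment_term: float,
                        overlap_term: float,
                        image_term: float) -> float:
    """Product of the three conditional likelihood terms."""
    return environment_term * overlap_term * image_term
```

Because the terms are multiplied, any single term close to zero, such as a heavy overlap penalty, suppresses the likelihood of the whole provisional state.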

Needless to say, some or all of the functions of the means in the apparatus shown in FIG. 1 can be implemented as a computer program and the present invention realized by executing that program on a computer, and the processing procedures shown in FIG. 4 and FIG. 6 can likewise be implemented as a computer program and executed by a computer. A program for realizing these functions on a computer can be recorded on a computer-readable recording medium, for example an FD (Floppy (registered trademark) Disk), MO (Magneto-Optical disk), ROM (Read Only Memory), memory card, CD (Compact Disk), DVD (Digital Versatile Disk), or removable disk, and stored or distributed in that form. The program can also be provided over a network, such as the Internet or electronic mail.

Furthermore, a computer program describing the method of the above moving object tracking device may be implemented so as to access a memory, an external storage device, or the like that stores the input and output data required by the method.

[Example]
In this example, the moving object tracking device described above processes video obtained from two video cameras (camera C1 and camera C2) to track two persons who enter the tracking range one at a time. The results are described following the processing flow shown in FIG. 4.

Next, the target state is linearly predicted (S301).
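A minimal Python sketch of the linear prediction in S301 follows. It assumes that the prediction adds the stored change vector V to the previous maximum-probability state, with one time step between frames; the exact prediction equation is not given in this part of the text, and the name `predict_state` is hypothetical.

```python
from typing import List

State = List[float]  # assumed layout: (x, y, r, h) repeated for each person


def predict_state(previous: State, change: State) -> State:
    """Linear prediction for the next time: previous state plus change vector V."""
    if len(previous) != len(change):
        raise ValueError("state and change vector must have the same dimension")
    return [s + v for s, v in zip(previous, change)]
```

At the initial time V is all zeros, so under this assumption the predicted state equals the previous state.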

Next, images are acquired from the cameras (S302). FIG. 7 shows part of the image sequences acquired from the two cameras: the upper row shows images 701 to 706 acquired from camera C1, and the lower row shows images 707 to 712 acquired from camera C2.

Next, silhouette images are created from the acquired images (S303). FIG. 8 shows silhouette images created using the background subtraction method. As in FIG. 7, the upper row shows silhouette images 801 to 806 created from the images acquired from camera C1, and the lower row shows silhouette images 807 to 812 created from the images acquired from camera C2. Each silhouette image in FIG. 8 corresponds to the image at the same position in FIG. 7.
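The background subtraction step can be sketched in Python as follows. This is a simplified illustration, assuming greyscale images stored as nested lists of 0 to 255 intensities and a fixed threshold; the threshold value and the name `make_silhouette` are hypothetical, and the embodiment does not state which variant of background subtraction it uses.

```python
from typing import List

Gray = List[List[int]]  # assumed: greyscale image with intensities 0..255


def make_silhouette(frame: Gray, background: Gray, threshold: int = 30) -> List[List[int]]:
    """Mark a pixel as foreground (1) when it differs from the background by more than threshold."""
    if len(frame) != len(background):
        raise ValueError("frame and background must have the same number of rows")
    silhouette = []
    for frame_row, bg_row in zip(frame, background):
        if len(frame_row) != len(bg_row):
            raise ValueError("frame and background must have the same number of columns")
        silhouette.append(
            [1 if abs(p - b) > threshold else 0 for p, b in zip(frame_row, bg_row)]
        )
    return silhouette
```

The background image here would be a frame of the empty scene captured in advance by the same camera.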

Next, the target state distribution is estimated in step S304. That is, in step S304, the state is updated successively while simulation images simulating each state are created and evaluated.

In FIG. 9, the upper row shows a silhouette image 901, and the lower row shows some of the simulation images that simulate it (simulation images 902 to 904). Simulation images are created in this way while the state is varied, and the target state distribution is built through likelihood calculation and accept/reject operations.
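The loop of state change, likelihood calculation, and accept/reject decision can be sketched in Python as below. The sketch assumes a Metropolis-style acceptance rule, discards the first B updates, and keeps the states of the last N updates as the distribution, following the roles of B and N in the claims. The acceptance rule itself, the callables `propose` and `likelihood`, and the name `sample_state_distribution` are assumptions; the check on the number of tracked persons after the B-th update is omitted.

```python
import random
from typing import Callable, List, Optional

State = List[float]


def sample_state_distribution(
    initial: State,
    propose: Callable[[State, random.Random], State],
    likelihood: Callable[[State], float],
    burn_in: int,   # B: first updates, not kept
    keep: int,      # N: last updates, kept as the distribution
    rng: Optional[random.Random] = None,
) -> List[State]:
    """Return the states held during the last `keep` updates."""
    rng = rng or random.Random()
    current = list(initial)
    current_like = likelihood(current)
    kept: List[State] = []
    for n in range(1, burn_in + keep + 1):
        candidate = propose(current, rng)      # provisional target state
        candidate_like = likelihood(candidate)
        # Assumed Metropolis-style rule: always accept an improvement,
        # otherwise accept with probability equal to the likelihood ratio.
        if current_like <= 0.0 or rng.random() < min(1.0, candidate_like / current_like):
            current, current_like = candidate, candidate_like
        if n > burn_in:
            kept.append(list(current))
    return kept
```

In the tracking setting, `propose` would apply one of the prepared change types to a selected person, and `likelihood` would render the simulation image and compare it with the silhouette image.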

In step S305, the target state that currently has the maximum probability is obtained from the target state distribution calculated in step S304 and is output as the target state at the current time. At the same time, the change vector of the target state between the previous time and the current time is obtained.

FIG. 10 shows images representing the calculated target states having the maximum probability. In FIG. 10, the upper row shows images 1001 to 1006 from the camera, the middle row shows images 1007 to 1012 simulating the target state having the maximum probability, and the lower row shows images 1013 to 1018, which simulate the same target state as an overhead view from a camera assumed to be installed on the ceiling.

The results show that persons H1 and H2 enter the tracking range a short time apart, pass each other, and finally leave the tracking range, and that this sequence is tracked clearly. They also confirm that, although the number of tracked persons changes dynamically (0 → 1 → 2 → 0), tracking remains stable as persons appear and disappear.

In step S306, if tracking is to end at the current time, the process terminates; otherwise, the process returns to the target state prediction in step S301.

By repeating the processing in the above example, stable tracking can be performed even in a scene where two persons approach and pass each other. This indicates that the information from the plurality of cameras is being integrated.

In addition, although the number of tracked persons changes dynamically, tracking continues through the appearance and disappearance of persons.

Furthermore, even when the number of tracked persons changes dynamically in this way, the overall processing load does not increase, so real-time performance is ensured.

Stable tracking is also achieved for two persons of different body shapes without supplying parameters in advance.

These characteristics result from integrating the information of a plurality of cameras and estimating even the body size.

Although the example uses information acquired from a plurality of cameras, the same results can be obtained using information acquired from a single camera.

Although an embodiment of the present invention has been described above, the present invention is not limited to the described embodiment, and various modifications can be made within the scope set out in each claim.

For example, as a modification of this embodiment, input means may be provided for entering values that can be set arbitrarily by experiment (for example, the values of B, N, W_x, W_y, W_r, W_h, δ_x, δ_y, δ_r, δ_h, α, and β described above), and the moving object tracking processing described above may be performed based on the values entered. That is, values are entered into the moving object tracking device through input means such as a keyboard device, and the processing is performed based on those values.

FIG. 1 is a block diagram of the moving object tracking device in this embodiment.
FIG. 2 is a diagram showing the conditions relating to the moving object tracking device in this embodiment.
FIG. 3 is a diagram of the three-dimensional model used by the moving object tracking device in this embodiment.
FIG. 4 is a flowchart showing the processing procedure of the moving object tracking method in this embodiment.
FIG. 5 is a diagram showing a silhouette image corresponding to an input image in this embodiment.
FIG. 6 is a flowchart showing the processing procedure for estimating the target state distribution in this embodiment.
FIG. 7 is a diagram showing part of the image sequences acquired from the cameras in this example.
FIG. 8 is a diagram showing silhouette images created from part of the image sequences acquired from the cameras in this example.
FIG. 9 is a diagram showing a silhouette image and some of the images that simulate it based on the prediction results in this example.
FIG. 10 is a diagram showing part of the input images, images simulating the target states calculated from the input images, and overhead views of those states in this example.

Explanation of symbols

11 ... Target state prediction means
12 ... Image acquisition means
13 ... Silhouette image creation means
14 ... Target state distribution estimation means
15 ... Target state calculation means
21 ... Three-dimensional environment information management means
22 ... Target state distribution storage means
23 ... Target state storage means
501 ... Input image
502, 801 to 812, 901 ... Silhouette images
701 to 712, 1001 to 1006 ... Images from the cameras
902 to 904 ... Simulation images
1007 to 1018 ... Images simulating the target state
B ... Non-target object
C1, C2, Cp ... Cameras
H1, H2, Hk ... Persons
e ... Ellipsoid

Claims

A moving object tracking device that regards the three-dimensional position and size of a moving object as the target state of the moving object, images the moving object at specific time intervals using one or a plurality of imaging devices, and tracks the moving object based on a plurality of image data obtained by the imaging, the device comprising:
Target state prediction means for predicting the target states of a plurality of moving objects at each time;
Image acquisition means for acquiring the image data;
Silhouette image creation means for creating a silhouette image by extracting, from the image data, the area in which the moving object appears;
Target state distribution estimation means for regarding the target state obtained by the target state prediction means as an initial state, estimating a target state distribution using the silhouette image created by the silhouette image creation means, the three-dimensional structure of the real world stored in the three-dimensional environment information management means, and the internal parameters and external parameters of the imaging device stored in the three-dimensional environment information management means, and updating the target state distribution at the previous time stored in the target state distribution storage means to the target state distribution at the current time;
Target state calculation means for calculating the target state having the maximum probability based on the target state distribution at the current time stored in the target state distribution storage means, calculating a change vector based on the target state having the maximum probability at the current time and the target state having the maximum probability at the previous time, and storing the calculated change vector in the target state storage means;
Three-dimensional environment information management means for storing the three-dimensional structure of the real world measured in advance and the internal parameters and external parameters of the imaging device;
Target state distribution storage means for storing the probabilistic distribution of the target state estimated at each time; and
Target state storage means for storing the change vector at each time,
wherein the target state distribution estimation means has:
Means for comparing the first probability distribution update count B for the target state distribution storage means with the last probability distribution update count N for the target state and, based on the comparison result, selecting a change type with an arbitrary probability from change types prepared in advance;
Means for selecting the moving object to be changed by applying the change type;
Means for changing the target state of the selected moving object based on the selected change type and the result of comparing the probability distribution update count B with the probability distribution update count N;
Means for regarding, as a simulation image simulating the silhouette image, an image created by projecting the changed target state using virtual imaging means having the same external parameters and internal parameters as the imaging device;
Means for calculating a probabilistic first conditional likelihood of the target state based on a prior knowledge condition relating to the three-dimensional structure, a probabilistic second conditional likelihood of the target state based on a prior knowledge condition relating to the overlap of moving objects, and a third conditional likelihood based on a comparison between the silhouette image and the simulation image, and regarding the product of the first to third conditional likelihoods as the likelihood of the changed provisional target state;
Means for determining the target state of the n-th prediction according to whether or not the provisional target state is adopted;
Means for calculating, when the update count equals the probability distribution update count B, the state distribution based on the target states obtained by the B updates;
Means for determining, based on the state distribution obtained from the B updates, whether or not the number of moving objects currently being tracked is equal to 0; and
Means for regarding, when the number of moving objects is not equal to 0 and the update count exceeds the sum of the probability distribution update count B and the probability distribution update count N, all N target states stored in the last N updates as occurring with equal probability, and regarding the target states of the n-th predictions as the target state distribution.

The moving object tracking device according to claim 1, wherein
the target state of the moving objects is regarded as a set of ellipsoid models, the position of each moving object is represented by a position (x, y) on an arbitrary plane, and the size of each moving object is represented by the radius r and the height h of the ellipsoid model, and
a plurality of combinations of values of these parameters x, y, r, h are stored.

A moving object tracking method used in an apparatus that comprises
three-dimensional environment information management means for storing the three-dimensional structure of the real world measured in advance and the internal parameters and external parameters of an imaging device,
target state distribution storage means for storing the probabilistic distribution of the target state estimated at each time, and
target state storage means for storing the change vector at each time,
and that regards the three-dimensional position and size of a moving object as the target state of the moving object, images the moving object at specific time intervals using one or a plurality of imaging devices, and tracks the moving object based on a plurality of image data obtained by the imaging, the method comprising:
A target state prediction step of predicting the target states of a plurality of moving objects at each time;
An image acquisition step of acquiring the image data;
A silhouette image creation step of creating a silhouette image by extracting, from the image data, the area in which the moving object appears;
A target state distribution estimation step of regarding the target state obtained by the target state prediction means as an initial state, estimating a target state distribution using the silhouette image created by the silhouette image creation means, the three-dimensional structure of the real world stored in the three-dimensional environment information management means, and the internal parameters and external parameters of the imaging device stored in the three-dimensional environment information management means, and updating the target state distribution at the previous time stored in the target state distribution storage means to the target state distribution at the current time; and
A target state calculation step of calculating the target state having the maximum probability based on the target state distribution at the current time stored in the target state distribution storage means, calculating a change vector based on the target state having the maximum probability at the current time and the target state having the maximum probability at the previous time, and storing the calculated change vector in the target state storage means,
wherein the target state distribution estimation step has:
A step of comparing the first probability distribution update count B for the target state distribution storage means with the last probability distribution update count N for the target state and, based on the comparison result, selecting a change type with an arbitrary probability from change types prepared in advance;
A step of selecting the moving object to be changed by applying the change type;
A step of changing the target state of the selected moving object based on the selected change type and the result of comparing the probability distribution update count B with the probability distribution update count N;
A step of regarding, as a simulation image simulating the silhouette image, an image created by projecting the changed target state using virtual imaging means having the same external parameters and internal parameters as the imaging device;
A step of calculating a probabilistic first conditional likelihood of the target state based on a prior knowledge condition relating to the three-dimensional structure, a probabilistic second conditional likelihood of the target state based on a prior knowledge condition relating to the overlap of moving objects, and a third conditional likelihood based on a comparison between the silhouette image and the simulation image, and regarding the product of the first to third conditional likelihoods as the likelihood of the changed provisional target state;
A step of determining the target state of the n-th prediction according to whether or not the provisional target state is adopted;
A step of calculating, when the update count equals the probability distribution update count B, the state distribution based on the target states obtained by the B updates;
A step of determining, based on the state distribution obtained from the B updates, whether or not the number of moving objects currently being tracked is equal to 0; and
A step of regarding, when the number of moving objects is not equal to 0 and the update count exceeds the sum of the probability distribution update count B and the probability distribution update count N, all N target states stored in the last N updates as occurring with equal probability, and regarding the target states of the n-th predictions as the target state distribution.

The moving object tracking method according to claim 3, wherein
the target state of the moving objects is regarded as a set of ellipsoid models, the position of each moving object is represented by a position (x, y) on an arbitrary plane, and the size of each moving object is represented by the radius r and the height h of the ellipsoid model, and
a plurality of combinations of values of these parameters x, y, r, h are stored.

5. A moving body tracking program according to claim 3, wherein the moving body tracking method is described as a computer program executable by a computer.

5. A recording medium in which the moving body tracking method according to claim 3 is described as a computer program executable by a computer and the computer program is recorded.