JP7187655B1 - OBJECT TRACKING METHOD, PROGRAM, SYSTEM AND RECORDING MEDIUM

Info

Publication number: JP7187655B1
Application number: JP2021212552A
Authority: JP
Inventors: 敏博法原; ▲邦▼明秋間
Original assignee: A Link
Current assignee: A Link
Priority date: 2021-12-27
Filing date: 2021-12-27
Publication date: 2022-12-12
Anticipated expiration: 2041-12-27
Also published as: JP2023096653A

Abstract

[Problem] To strengthen the identification function by supplementing identification processing with feature information of the tracked object when identification based on position information is difficult. [Solution] The method includes a step of imaging a tracked object (P) to acquire image information (If) and depth information (Di), calculating from the image information and the depth information position information including three-dimensional coordinate values representing the tracked object, and acquiring that position information in time series; and a step of identifying the tracked object using the position information. While the tracked object is being tracked, the current position information is compared in time series with the most recent position information to calculate the movement distance of the tracked object, and if the movement distance is within a threshold, the current tracked object is identified with the most recent tracked object. [Selected drawing] FIG. 5

Description

The present disclosure relates to a tracking technique that uses, for example, a portable multifunction device to track an object, such as a person, that appears as a tracked object in camera images acquired by the device.

Detecting an object to be tracked in a camera image and tracking that object across images is already known. Such object tracking requires identifying the object between consecutive images.

Regarding such object tracking, a configuration is known that includes a deep learning discriminator and a particle filter function unit that tracks an object using multi-channel features, including features from the deep learning discriminator, by applying them to the likelihood evaluation of a particle filter according to the distance between the position information of the multi-channel features and the position information of the particles (for example, Patent Document 1).

Regarding the tracking of people, it is known to select feature information for extracting a recognition target from image information acquired by a plurality of cameras, choosing from among multiple pieces of feature information those that are effective for image recognition processing on the basis of importance information and reliability information (for example, Patent Document 2).

Regarding the identification of a person, it is known to acquire the relative orientation relationship between input region images captured from different directions, and to compare the features of registered region images, contained in a group of registered region images captured from at least three directions, with the features of the input region images to determine whether they show the same person (for example, Patent Document 3).

Also regarding the identification of a person, it is known to perform face angle range determination processing that associates face image data with face angle range data representing its face angle range; to perform best shot determination processing that judges the face image data with a high degree of face recognition to be the best shot for each face angle range and deletes the other face image data; and then to aggregate the face image data judged to be best shots into person management data and identify the person (for example, Patent Document 4).

Patent Document 1: JP 2019-153112 A
Patent Document 2: JP 2011-60024 A
Patent Document 3: JP 2016-1447 A
Patent Document 4: JP 2016-157165 A

To track a person in camera images, the tracked object must be recognized in every frame. The current frame must be compared with the immediately preceding frame to identify the tracked object. In other words, identifying the tracked object between frames is indispensable.

When multiple people are present in a frame, comparison based only on information such as a person's skeleton makes it difficult to identify the tracked object, for example when people overlap or touch in the image, or when the tracked person leaves the camera's angle of view and later re-enters it. In other words, if the information needed for identification is insufficient when comparing images between frames, the tracked object cannot be identified and is lost.

If the tracked object is a person, face recognition can be used. Face recognition requires comparing the registered face information of the subject with the acquired face information and comparing their feature values. This in turn requires registering in advance the information needed for comparison, such as several frontal images, as registration information for the tracked object.

However, acquiring and registering face images from a person who is moving naturally, and continuously identifying the tracked object by comparing that face information, is troublesome. Indeed, registration information cannot be acquired in advance for an unspecified tracked object, and when tracked objects move differently, as pedestrians, athletes, and care recipients do, acquiring the information needed for tracking becomes difficult.

The inventors of the present disclosure found that objects such as people making diverse movements can be tracked with high accuracy by acquiring image information and depth information in time series to obtain position information including coordinate values in world space (three-dimensional space), performing identification with this position information, and, depending on the state of the tracked object, performing identification with feature information.

Accordingly, in view of the above problems or findings, an object of the present disclosure is to acquire, in time series, position information of a tracked object including coordinate values using at least two-dimensional image information and depth information, to identify the moving object being tracked, and thereby to achieve faster and more accurate object tracking.

Another object of the present disclosure is to strengthen the identification function by supplementing the identification processing with feature information of the tracked object when identification based on position information is difficult.

To achieve the above object, one aspect of the object tracking method of the present disclosure includes: a step of acquiring image information and depth information in time series by imaging a tracked object, and acquiring, in time series, position information including three-dimensional coordinate values from the image information and the depth information; a step of acquiring, in time series from the image information, orientation information of the tracked object and feature information of the face region of the tracked object, classifying the acquired feature information by the orientation information, and storing it in a database; a step of identifying the tracked object using the position information; and a step of, when the tracked object cannot be identified with the position information based on the acquired image information and depth information, acquiring orientation information of the tracked object and feature information of the face region from the image information, and comparing the acquired feature information with the feature information in the database classified by the orientation information to identify the tracked object.

This object tracking method may further include a step of comparing, in time series during tracking, the current position information of the tracked object with its most recent position information to calculate the movement distance of the tracked object, and a step of identifying the tracked object if the movement distance is within a threshold.

This object tracking method may further include a step of, when an overlap occurs between the image of the tracked object being tracked and that of another tracked object, or between the bounding box of the tracked object and that of another tracked object, identifying the tracked object by comparing the feature information from immediately before the overlap with the feature information at the time the overlap is resolved.

This object tracking method may further include: a step of acquiring face region information in time series from the image information and acquiring part information on at least the eyes or ears from the face region information; a step of acquiring the orientation information of the tracked object using the part information, classifying the feature information by the orientation information, and storing it in the database; and a step of, when the tracked object cannot be identified with the position information, acquiring the orientation information of the tracked object using the part information acquired from the image information, acquiring the feature information of the face region from the image information, and comparing the acquired feature information with the feature information in the database classified by the orientation information to identify the tracked object.

This object tracking method may further include a step of dividing the bounding box to form a grid of multiple rows and multiple columns, and acquiring the position information specified by the coordinate values of the grid.
This object tracking method may further include a step of dividing the bounding box to form a grid of multiple rows and multiple columns, and acquiring the part information specified by the coordinate values of the grid.

To achieve the above object, one aspect of the program of the present disclosure is a program executed by a computer that causes the computer to execute: a function of acquiring image information and depth information in time series by imaging a tracked object, and acquiring, in time series, position information including three-dimensional coordinate values from the image information and the depth information; a function of acquiring, in time series from the image information, orientation information of the tracked object and feature information of the face region of the tracked object, classifying the acquired feature information by the orientation information, and storing it in a database; a function of identifying the tracked object using the position information; and a function of, when the tracked object cannot be identified with the position information based on the acquired image information and depth information, acquiring orientation information of the tracked object and feature information of the face region from the image information, and comparing the acquired feature information with the feature information in the database classified by the orientation information to identify the tracked object.

This program may further cause the computer to execute a function of comparing, in time series during tracking, the current position information of the tracked object with its most recent position information to calculate the movement distance of the tracked object, and a function of identifying the tracked object if the movement distance is within a threshold.

This program may further cause the computer to execute a function of, when an overlap occurs between the image of the tracked object being tracked and that of another tracked object, or between the bounding box of the tracked object and that of another tracked object, identifying the tracked object by comparing the feature information from immediately before the overlap with the feature information at the time the overlap is resolved.

This program may further cause the computer to execute: a function of acquiring face region information in time series from the image information and acquiring part information on at least the eyes or ears from the face region information; a function of acquiring the orientation information of the tracked object using the part information, classifying the feature information by the orientation information, and storing it in the database; and a function of, when the tracked object cannot be identified with the position information, acquiring the orientation information of the tracked object using the part information acquired from the image information, acquiring the feature information of the face region from the image information, and comparing the acquired feature information with the feature information in the database classified by the orientation information to identify the tracked object.

This program may further cause the computer to execute a function of dividing the bounding box to form a grid of multiple rows and multiple columns, and acquiring the position information specified by the coordinate values of the grid.
This program may further cause the computer to execute a function of dividing the bounding box to form a grid of multiple rows and multiple columns, and acquiring the part information specified by the coordinate values of the grid.

To achieve the above object, one aspect of the object tracking system of the present disclosure includes: an image information acquisition unit that acquires, in time series, at least two-dimensional image information representing a tracked object; a depth information acquisition unit that acquires depth information of the tracked object in time series; a feature information acquisition unit that acquires orientation information and feature information of the tracked object from the image information; a database that stores the acquired feature information classified by the orientation information; an identification processing unit that acquires position information of the tracked object using the image information and the depth information, identifies the tracked object using the position information, and, when identification with the position information is not possible, acquires orientation information of the tracked object and feature information of the face region from the image information and compares the acquired feature information with the feature information in the database classified by the orientation information to identify the tracked object; and an information presentation unit that presents tracking information together with image information representing the tracked object.

To achieve the above object, one aspect of the object tracking system of the present disclosure includes: two or more devices, each provided with at least the image information acquisition unit, the depth information acquisition unit, the feature information acquisition unit, and the identification processing unit, that output tracking information on a tracked object; and a server that acquires the tracking information from each device and tracks the tracked object with an extended tracking angle or tracking range.

To achieve the above object, one aspect of the recording medium of the present disclosure is a recording medium that stores the program or stores the database.

According to the present disclosure, any of the following effects can be obtained.
(1) Because three-dimensional position information is acquired using two-dimensional image information obtained from the tracked object together with depth information, this position information improves the recognition rate and recognition accuracy for tracked objects such as people, allows the tracked object to be identified quickly and accurately, increases tracking reliability, and strengthens the tracking function.

(2) A tracked object such as a moving person can be tracked with a portable device equipped with a camera and a light detection/ranging unit; for example, the tracked object being tracked can be displayed as an image on the information presentation unit of the portable device.

(3) The disclosure can be used to track various objects, for example players in motion, care recipients, people entering and leaving a facility, and passers-by.

FIG. 1 is a diagram showing an example of the object tracking system according to the first embodiment.
FIG. 2A is a diagram showing the back of the portable multifunction device, and FIG. 2B is a diagram showing its front.
FIG. 3 is a diagram showing the hardware of the portable multifunction device.
FIG. 4 is a diagram showing an example of the tracking information database.
FIG. 5 is a flowchart showing the object tracking procedure.
FIG. 6 is a flowchart showing the processing for acquiring position information of the tracked object.
FIG. 7 is a flowchart showing the bounding box processing.
FIG. 8A is a diagram showing the grid points derived from the bounding box, and FIG. 8B is a diagram showing the bounding box and depth information.
FIG. 9A is a diagram showing image information, FIG. 9B is a diagram showing depth information, and FIG. 9C is a diagram showing a combined image.
FIG. 10 is a flowchart showing the orientation determination procedure.
FIG. 11 is a diagram showing the relationship between acquired parts and orientation.
FIG. 12 is a flowchart showing identification processing I.
FIG. 13A is a diagram showing image information during tracking, FIG. 13B is a diagram showing image information in which bounding boxes overlap, and FIG. 13C is a diagram showing image information in which the bounding boxes have separated.
FIG. 14A is a diagram showing a case where the movement distance of the bounding box is at or below the threshold, and FIG. 14B is a diagram showing a case where it exceeds the threshold.
FIG. 15A is a diagram showing the tracked object entering the frame, FIG. 15B is a diagram showing it leaving the frame, and FIG. 15C is a diagram showing it re-entering the frame.
FIG. 16 is a flowchart showing identification processing II.
FIG. 17 is a diagram showing the object tracking system according to the second embodiment.
FIG. 18 is a diagram showing expansion of the tracking range with two devices.
FIG. 19 is a diagram showing widening of the tracking range and combined tracking with four devices.

[First embodiment]
FIG. 1 shows an object tracking system according to the first embodiment. The configuration shown in FIG. 1 is an example, and the present disclosure is not limited to this configuration.

This object tracking system 2 identifies the tracked object by identification processing I, which uses position information obtained from the image information and depth information of the tracked object; when identification processing I cannot identify it, the system identifies the tracked object by identification processing II, which uses feature information obtained from the image information. The object tracking system 2 can be implemented on a portable multifunction device such as a product of Apple Inc., but is not limited to the portable multifunction devices given as examples.

The object tracking system 2 includes a processing unit 4, a camera 6, a light detection/ranging unit 8, an information presentation unit 10, and the like. The processing unit 4 performs the information processing for tracking the tracked object and controls the various functional units such as the camera 6 and the light detection/ranging unit 8.

The camera 6 is an example of the imaging unit of the present disclosure. Under the control of the processing unit 4, the camera 6 images an area containing the tracked object and outputs continuous image information, for example two-dimensional image information, in time series.

The light detection/ranging unit 8 is an example of a light detection and ranging function unit mounted on, for example, a portable multifunction device. Under the control of the processing unit 4, it outputs depth information acquired in synchronization with imaging by the camera 6. The light detection/ranging unit 8 consists of a light detection and ranging unit such as a LiDAR (Light Detection and Ranging) scanner. The LiDAR scanner unit scans laser light over the tracked object, receives the light reflected from it, and can thereby acquire, in time series, depth information representing the distance between the tracked object and the light emission point, that is, the depth.

The information presentation unit 10 consists of an image presentation unit such as an LCD (Liquid Crystal Display), and presents images containing the tracked object, tracking information, and the like.

<Processing unit 4>
The processing unit 4 includes functional units such as a tracking control unit 12, an image information acquisition unit 14, a depth information acquisition unit 16, a bounding box processing unit 18, a tracking information database generation unit 20, a position information processing unit 22, a feature information processing unit 24, a state information processing unit 26, an identification processing unit 28, an identification information presentation unit 30, and a link processing unit 32.

To track a moving tracked object, for example, the tracking control unit 12 controls the acquisition of position information and feature information from the tracked object, and governs identification of the tracked object by position information and, when identification by position information is not possible, identification by feature information.

The image information acquisition unit 14 acquires image information containing the tracked object from the camera 6. Under the control of the tracking control unit 12, it acquires two-dimensional image information from the camera 6 in time series and provides this image information to the tracking information database generation unit 20.

The depth information acquisition unit 16 acquires, in time series, the depth information that the light detection/ranging unit 8 acquires in synchronization with imaging by the camera 6.

The bounding box processing unit 18 performs processing such as detecting the tracked object, acquiring the bounding box, dividing it, generating grid points, and acquiring coordinate values. It detects the skeleton or the like of the tracked object from the image information and specifies the tracked object. API (Application Programming Interface) functions included in the existing OS (Operating System) installed on the device generate a bounding box for the image information of the tracked object, so this bounding box can simply be acquired. The bounding box is a partition frame, for example a rectangle, representing a region that contains the upper body of the tracked object. The division function divides the bounding box into multiple sections with multiple row lines and column lines, and generates grid points g at the intersections of the row lines and column lines. The coordinate values of the grid points g are acquired, and the depth information is referenced to select position information that is effective for specifying the tracked object.
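
The grid point generation described for the bounding box processing unit 18 can be illustrated with a minimal Python sketch. The function name `grid_points`, the (x, y, width, height) box representation, and the choice to count only interior intersections as grid points g are assumptions made for illustration; the disclosure does not specify them.

```python
from typing import List, Tuple


def grid_points(x: float, y: float, width: float, height: float,
                rows: int, cols: int) -> List[Tuple[float, float]]:
    """Return the intersections of the row and column lines that divide a box.

    The box is cut into `rows` x `cols` sections, which gives rows - 1
    interior row lines and cols - 1 interior column lines. Assumption: only
    these interior intersections are treated as grid points g.
    """
    if rows < 2 or cols < 2:
        raise ValueError("need at least 2 rows and 2 columns for an interior grid point")
    xs = [x + width * c / cols for c in range(1, cols)]   # column line positions
    ys = [y + height * r / rows for r in range(1, rows)]  # row line positions
    return [(gx, gy) for gy in ys for gx in xs]           # row-major order
```

For example, a box divided into 4 rows and 3 columns yields 3 x 2 = 6 interior grid points, each of which can then be looked up in the depth information.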

The tracking information database generation unit 20 generates a tracking information database (DB) 66 (FIG. 4) and stores in it, for example frame by frame, the information needed for tracking that the processing unit 4 acquires or generates, such as image information, depth information, bounding box information, grid point information, position information, and feature information. The tracking information DB 66 is used to identify the tracked object.

The position information processing unit 22 combines the two-dimensional image information provided by the image information acquisition unit 14 with the depth information provided by the depth information acquisition unit 16, and uses the combined image and the grid points g described above to generate position information including coordinate values for specifying the tracked object.
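
One way to picture how grid points and depth could yield a three-dimensional coordinate value is the following Python sketch. It is not the method of the disclosure: the pinhole camera back-projection, the intrinsic parameters `fx`, `fy`, `cx`, `cy`, the rule that a depth of 0 means "no measurement", and the choice of the sample closest to the median depth are all assumptions introduced for illustration.

```python
from statistics import median
from typing import Optional, Sequence, Tuple


def position_from_grid(grid: Sequence[Tuple[int, int]],
                       depth_map: Sequence[Sequence[float]],
                       fx: float, fy: float, cx: float, cy: float
                       ) -> Optional[Tuple[float, float, float]]:
    """Pick one grid point with usable depth and back-project it to (X, Y, Z).

    grid      -- pixel coordinates (u, v) of the grid points g
    depth_map -- depth in metres, indexed as depth_map[v][u]; 0 is assumed
                 to mean that no depth was measured at that pixel
    fx, fy, cx, cy -- assumed pinhole camera intrinsics
    """
    samples = [(depth_map[v][u], u, v) for u, v in grid if depth_map[v][u] > 0.0]
    if not samples:
        return None  # no grid point carries usable depth
    z_med = median(z for z, _, _ in samples)
    # Use the sample closest to the median depth to reduce the influence of
    # background pixels that fall inside the bounding box.
    z, u, v = min(samples, key=lambda s: abs(s[0] - z_med))
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)
```

Returning `None` when no grid point has depth gives the caller an explicit signal that position-based identification is unavailable for that frame.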

The feature information processing unit 24 acquires, from the image information, the feature information of the tracked object and orientation information, which is an example of classification information for the feature information; it determines the orientation of the tracked object's face and categorizes the feature information by face orientation. Under the control of the tracking control unit 12, face region information of the tracked object is acquired from the image information, and the feature information is acquired from this face region. The feature information is, for example, feature information obtainable from the image information of the face excluding parts such as the eyes and ears. The orientation information is classification information such as "front", "facing left", and "facing right", determined from the acquired parts such as the eyes and ears. Identification processing II, which is based on face recognition, identifies the tracked object according to the amount of feature values contained in the feature information acquired from the tracked object.
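
Categorizing feature information by orientation and comparing only within the same category might look like the Python sketch below. The class `OrientationFeatureStore`, the representation of feature information as a numeric vector, the cosine similarity measure, and the 0.8 acceptance threshold are hypothetical choices for illustration; the disclosure does not name a similarity measure or a data layout.

```python
import math
from collections import defaultdict
from typing import DefaultDict, List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class OrientationFeatureStore:
    """Feature vectors grouped by orientation label ("front", "left", "right")."""

    def __init__(self) -> None:
        self._features: DefaultDict[str, List[Tuple[str, List[float]]]] = defaultdict(list)

    def add(self, track_id: str, orientation: str, feature: Sequence[float]) -> None:
        """Store a feature vector under the orientation it was observed in."""
        self._features[orientation].append((track_id, list(feature)))

    def best_match(self, orientation: str, feature: Sequence[float],
                   min_similarity: float = 0.8) -> Optional[str]:
        """Return the track id whose stored feature, for the same orientation,
        is most similar to `feature`, or None if nothing reaches the threshold."""
        best_id, best_score = None, min_similarity
        for track_id, stored in self._features.get(orientation, []):
            score = cosine_similarity(feature, stored)
            if score >= best_score:
                best_id, best_score = track_id, score
        return best_id
```

Restricting the comparison to features stored under the same orientation avoids comparing, say, a left-facing face against frontal features, which is the motivation the text gives for classifying by orientation.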

The state information processing unit 26 monitors the bounding boxes of the tracked objects and acquires state information such as overlap between bounding boxes, the movement distance of a bounding box, and re-entry into the frame after a bounding box has left it. With this state information, when a tracked object cannot be identified from position information, the unit acquires the feature information of the tracked object in the frame immediately before the indeterminate state and in the frame immediately after that state is resolved.

The identification processing unit 28 executes identification processing I, based on position information, when the tracked object can be identified from position information, and executes identification processing II, based on the feature information of the tracked object, when it cannot. Specifically, under the control of the tracking control unit 12, the identification processing unit 28 acquires the coordinate values in time series for each frame of the image information and identifies the tracked object frame by frame. That is, if the movement distance of the tracked object is at or below a predetermined threshold, for example 0.5 m, the tracked object is identified and recognized as the same object. If instead the movement distance is at or above the threshold, identification processing I based on position information including coordinate values is bypassed and identification is left to identification processing II based on face recognition. The movement distance threshold may be set according to the accuracy of the depth information obtainable by the depth information acquisition unit 16.
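The threshold test described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function names are hypothetical, the Euclidean metric is an assumption (the text only speaks of a "movement distance"), and because the text describes both sides of the threshold inclusively, the sketch assumes that a distance exactly equal to the threshold is still handled by identification processing I.

```python
import math

# Example threshold from the text (0.5 m); in practice it would be tuned
# to the accuracy of the depth information.
MOVE_THRESHOLD_M = 0.5


def movement_distance(prev_xyz, curr_xyz):
    """Distance in metres between two 3D positions.

    Assumption: Euclidean distance; the source does not name the metric.
    """
    return math.dist(prev_xyz, curr_xyz)


def choose_identification(prev_xyz, curr_xyz, threshold=MOVE_THRESHOLD_M):
    """Return "I" when position alone is used, otherwise "II"."""
    # Assumption: equality with the threshold is treated as "I".
    if movement_distance(prev_xyz, curr_xyz) <= threshold:
        return "I"
    return "II"
```

For example, a move from (0, 0, 2) to (0.1, 0, 2.1) stays with identification processing I, while a move of 1 m is handed over to identification processing II.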

The identification information presentation unit 30 generates a tracking display for each tracked object being tracked and presents it on the information presentation unit 10. The tracking display is an example of tracking information representing the tracking state, such as tracking and identification of the tracked object. It is presented on the image of the identified tracked object, for example in a single consistent color, so that the object can be clearly distinguished from other tracked objects. Whereas the bounding box is an information-processing concept for locating the tracked object, the tracking display, unlike the bounding box, is shown on the tracked object itself, for example on its head.

The link processing unit 32 cooperates with an external server 36 via the network 34 to supplement the tracking processing of the tracked object.

In this first embodiment, the object tracking system 2 includes the link processing unit 32, the network 34, and the server 36. However, the object tracking system 2 may instead consist solely of the portable multifunction device (hereinafter simply the "device") 38, that is, the system without the link processing unit 32, the network 34, and the server 36.

<Device 38>
FIG. 2A shows the back of the device 38. This device 38 is an example of the object tracking device of the present disclosure.

An information acquisition section 41 is provided on the back of the main body 40 of the device 38, and the camera 6 and the light detection/ranging unit 8 are installed in this information acquisition section 41.

FIG. 2B shows the front of the device 38. The display screen section 46 of a display 45 is installed on the front of the main body 40 of the device 38, and a touch panel 48 is installed on the display screen section 46. The touch panel 48 is an example of the operation input unit 56. The operation input unit 56 also includes interface devices such as a keyboard and a mouse (not shown).

As an example, image information If is displayed on the display screen section 46. This image information If is video of the objects being tracked and contains a plurality of tracked objects P-1 and P-2 and tracking displays T-1 and T-2. The tracking display T-1 is superimposed on the head of the tracked object P-1 to indicate its identified state, and the tracking display T-2 is superimposed on the head of the tracked object P-2 to indicate its identified state. The tracking displays T-1 and T-2 are presented in different colors, so the tracked objects P-1 and P-2 can also be told apart by their tracking displays. For information processing, a bounding box B-1 is formed for the tracked object P-1 and a bounding box B-2 for the tracked object P-2, but these are not shown on the image. Hereinafter, when no particular one is meant, the tracked objects P-1 and P-2 are written simply as tracked object P, the tracking displays as tracking display T, and the bounding boxes B-1 and B-2 as bounding box B.

<Hardware of Device 38>
FIG. 3 shows an example of the hardware of the device 38. The device 38 includes the processing unit 4, the camera 6, the light detection/ranging unit 8, the display 45, the touch panel 48, and the like.

The processing unit 4 includes a processor 58, a storage unit 60, an input/output unit (I/O) 62, and a communication unit 64. The processor 58 executes the various programs held in the storage unit 60, such as the OS and the object tracking program, and controls the functional units described above.

The storage unit 60 is an example of the recording medium of the present disclosure and stores various information such as the OS, the object tracking program, and the tracking information DB 66. Storage elements such as ROM (Read-Only Memory) and RAM (Random-Access Memory) are used for the storage unit 60. The RAM provides the work area, framework, and the like for information processing.

The I/O 62 inputs and outputs information under the control of the processor 58. The camera 6, the light detection/ranging unit 8, the display 45, the touch panel 48, and the like are connected to the I/O 62 as information input means.

A display device other than the display 45 may be used for the information presentation unit 10. An operation input device other than the touch panel 48 may be used for the operation input unit 56.

The communication unit 64 communicates with the server 36 via the network 34 under the control of the processor 58 and, through this linkage, exchanges the information needed for object tracking.

<Tracking Information DB 66>
FIG. 4 shows the tracking information DB 66, which stores the tracking information used in the object tracking system 2. The tracking information DB 66 contains tracked-object files 67-1, 67-2, ..., 67-n, one generated for each detected tracked object.

Each tracked-object file 67-1, 67-2, ..., 67-n includes an image information section 68, a depth information section 70, a bounding box section 72, a grid point section 74, a position information section 76, a feature information section 78, a classification information section 80, an orientation information section 82, a state information section 84, an identification information section 86, and a history information section 88.

The image information section 68 stores, in time series, the two-dimensional image information of the tracked object P acquired by the camera 6. The depth information section 70 stores, in time series, the depth information of the tracked object P acquired by the light detection/ranging unit 8. The bounding box section 72 stores the bounding box information obtained from the image information. The grid point section 74 stores the position information of the grid points generated in the bounding box. The position information section 76 stores the position information calculated from the image information and depth information acquired from the tracked object. The feature information section 78 stores the feature information of the tracked object obtained from the image information. The classification information section 80 stores the classification information of the image information used for face recognition. This classification information section 80 includes the position information of the left eye, right eye, left ear, and right ear.

The orientation information section 82 stores feature information classified by orientation information representing the orientation of the face. The orientation information section 82 has a front section 82-1, a left-facing section 82-2, a right-facing section 82-3, and an undefined section 82-4. The front section 82-1 stores feature information classified as front. The left-facing section 82-2 stores feature information classified as facing left. The right-facing section 82-3 stores feature information classified as facing right. The undefined section 82-4 stores feature information that falls into none of these categories.
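The four orientation buckets can be pictured with a small sketch. The mapping from detected parts to an orientation below is purely hypothetical (the text only says that orientation is determined from the detected parts such as eyes and ears), and the part names, class name, and feature values are illustrative assumptions.

```python
from collections import defaultdict

ORIENTATIONS = ("front", "left", "right", "undefined")


def classify_orientation(parts):
    """Map a set of detected part names to an orientation bucket.

    Hypothetical rule, not taken from the source:
    both eyes -> front; only left-side parts -> left;
    only right-side parts -> right; anything else -> undefined.
    """
    left = {"left_eye", "left_ear"} & parts
    right = {"right_eye", "right_ear"} & parts
    if {"left_eye", "right_eye"} <= parts:
        return "front"
    if left and not right:
        return "left"
    if right and not left:
        return "right"
    return "undefined"


class OrientationStore:
    """Feature information grouped by face orientation (cf. sections 82-1 to 82-4)."""

    def __init__(self):
        self.buckets = defaultdict(list)

    def add(self, parts, feature):
        """Classify by detected parts, store the feature, return the bucket."""
        key = classify_orientation(set(parts))
        self.buckets[key].append(feature)
        return key
```

Grouping features this way means a later comparison only has to consider features captured at the same face orientation.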

The state information section 84 stores the state information of the bounding box. It has a movement distance information section 84-1 and an overlap information section 84-2. The movement distance information section 84-1 stores the movement distance between bounding boxes, frame-out information, and the like.

The overlap information section 84-2 stores bounding box overlap information and the like. The identification information section 86 stores which processing identified the tracked object (identification processing I or identification processing II), the color of the tracking display that represents the identification result, identification category information, and the like. The history information section 88 stores history information such as the identification history and record of the tracked object.
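As an aid to reading FIG. 4, the per-object file layout can be sketched as a pair of data classes. This is a simplified illustration under stated assumptions: the class and field names are hypothetical, only some of the sections are modelled, and the real storage format is not specified in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Vec3 = Tuple[float, float, float]


def _empty_orientation_buckets() -> Dict[str, list]:
    return {"front": [], "left": [], "right": [], "undefined": []}


@dataclass
class TrackedObjectFile:
    """One file 67-n; field names are illustrative assumptions."""
    object_id: int
    positions: List[Vec3] = field(default_factory=list)   # position information section 76
    features: Dict[str, list] = field(
        default_factory=_empty_orientation_buckets)       # orientation information section 82
    movement_distance: Optional[float] = None             # movement distance section 84-1
    overlapped: bool = False                               # overlap section 84-2
    identified_by: Optional[str] = None                    # "I" or "II" (section 86)
    display_color: Optional[str] = None                    # tracking display color (section 86)
    history: List[str] = field(default_factory=list)       # history section 88


@dataclass
class TrackingDB:
    """Tracking information DB 66: one file per detected tracked object."""
    files: Dict[int, TrackedObjectFile] = field(default_factory=dict)

    def file_for(self, object_id: int) -> TrackedObjectFile:
        """Return the file for an object, creating it on first use."""
        if object_id not in self.files:
            self.files[object_id] = TrackedObjectFile(object_id)
        return self.files[object_id]
```

A caller would append one position per frame to `positions`, which gives the time series that the position-based identification compares.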

<Object Tracking Procedure>
FIG. 5 shows the procedure for object tracking using the device 38. This procedure is an example of an object tracking method or object tracking program using the object tracking system of the present disclosure. In FIG. 5, S denotes a function or processing step, and the number attached to S indicates the step order (the same applies to the flowcharts shown in FIGS. 6, 7, 10, 12, and 16).

As shown in FIG. 5, this procedure includes acquisition of image information and depth information (S101), determination of whether to start the tracking mode (S102), acquisition and storage of the position information of the tracked object (S103), acquisition and storage of the orientation information and feature information of the tracked object (S104), identification determination (S105, S106, S107, S108, S109), presentation of tracking information (S110), and the like.

Acquisition of image information and depth information (S101): When the device 38 is started, an image acquisition mode for acquiring image information starts up. In this image acquisition mode, the camera 6 and the light detection/ranging unit 8 are activated. The camera 6 acquires image information in time series under the control of the tracking control unit 12, and the light detection/ranging unit 8 acquires the depth information of the tracked object in time series.

The image information is, for example, two-dimensional image information captured by the camera 6 when a user points it at the tracked object. This image information therefore contains the background and the like in addition to one or more tracked objects.

The depth information represents the depth acquired by the light detection/ranging unit 8, which the user points at the tracked object together with the camera 6. If the image information is taken as two-dimensional, depth is the distance in the direction perpendicular to the image.

Determination of whether to start the tracking mode (S102): In this embodiment, image acquisition and the tracking mode for tracking the tracked object are set up separately. Starting the tracking mode requires that a start condition be satisfied. The start condition is the acquisition of start information while in the image acquisition mode. For example, a touch on the bounding box B displayed on the tracked object is sensed and used as the start information to start the tracking mode.

Acquisition and storage of the position information of the tracked object (S103): In the object tracking of the present disclosure, the tracked object is located and position information including the coordinate values of the tracked object within the bounding box B is acquired. To acquire this position information, the tracked object is detected from the image information acquired by the camera 6, the two-dimensional information representing the tracked object is combined with the depth information acquired from the tracked object, and position information including coordinate values, which is three-dimensional information representing the tracked object, is acquired in time series. This position information is recorded in the tracking information DB 66 together with the image information and depth information.

Acquisition and storage of the orientation information and feature information of the tracked object (S104): Object tracking in the present disclosure uses the orientation information and feature information of the tracked object in addition to its position information. If the tracked object is a person, the person's orientation information and feature information are obtained from the face image. The orientation information represents the orientation of the person's face and is used to categorize the feature information. Three face orientations are defined, for example: front, facing left, and facing right. Categorization defines the unit within which feature information is compared: feature information from a face image is classified by face orientation and recorded in the tracking information DB 66.

Identification determination (S105, S106, S107, S108, S109): The identification determination decides whether a moving tracked object is the same object as before. It comprises identification processing I as the first processing and identification processing II as the second. Identification processing I identifies the tracked object from its position information, and identification processing II identifies it from its feature information. In other words, when the determination based on three-dimensional coordinate values, that is, on the position information of the tracked object, cannot identify the object because of indeterminate frames or the like, identification by face recognition through identification processing II, which complements identification processing I, is used.

Cases in which identification by coordinate values is not possible, even when image information and depth information have been acquired, include, for example: (1) the bounding box B-1 overlaps another bounding box B-2; (2) the movement distance between the bounding box in the current frame and the bounding box in the previous frame is at or above the threshold; and (3) the tracked object leaves the angle of view of the camera and then returns to it.

Because the tracked object is located on the basis of its bounding box, the tracked object is lost when bounding boxes overlap.

Even when the bounding boxes do not overlap, if the bounding box in the current frame has moved from the bounding box in the previous frame by a distance at or above the threshold, the coordinate values can no longer be relied on, so the object is not identified.

In addition, when the tracked object leaves the imaging range of the camera 6, image information and depth information cannot be acquired. Consequently, even though position information was acquired while the tracked object was within the imaging range, the tracked object cannot be identified by comparing it with position information obtained from the image information and depth information of a tracked object that has re-entered the imaging range.

When the tracked object cannot be identified from position information in this way, it is identified using feature information. Conversely, when the tracked object can be identified from coordinate values, identification using feature information is skipped.
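The three cases in which position-based identification is unavailable can be checked as in the following sketch. It is an illustration only: boxes are assumed to be axis-aligned tuples (x, y, w, h), the function names and return labels are hypothetical, and the re-entry flag is assumed to be supplied by whatever logic tracks frame-out events.

```python
def boxes_overlap(a, b):
    """True when two axis-aligned boxes (x, y, w, h) intersect."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def position_id_blocked(box, other_boxes, moved_m, reentered, threshold_m=0.5):
    """Return why identification processing I cannot be used, or None.

    moved_m is the movement distance since the previous frame in metres
    (None when there is no previous position); reentered marks an object
    that left the angle of view and came back.
    """
    if any(boxes_overlap(box, other) for other in other_boxes):
        return "overlap"            # case (1)
    if reentered or moved_m is None:
        return "reentry"            # case (3)
    if moved_m >= threshold_m:
        return "moved_too_far"      # case (2)
    return None
```

Whenever this returns a reason rather than None, identification falls through to the feature-based processing.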

Accordingly, this identification determination (S105) includes the identification determination by identification processing I based on position information including coordinate values (S106), the identification determination by identification processing II based on feature information (S107), identification of the tracked object (S108), and determination as undefined (S109). The identification determination by identification processing I (S106) decides whether identification by identification processing I is possible, and if it is, the tracked object is identified from position information alone (S108).

If the object cannot be identified by identification processing I (NO in S106), it is determined whether it can be identified by identification processing II (S107). If it can be identified by identification processing II (YES in S107), the tracked object is identified (S108). If it cannot (NO in S107), the tracked object is determined to be undefined (S109), and the process returns to S103. That is, if the object cannot be identified even from feature information, the process returns to S103.
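The S105 to S109 decision flow amounts to a two-stage fallback, sketched below. The two checks are passed in as callables because their internals are described elsewhere; the function name and the returned labels are hypothetical.

```python
def identify(track, by_position, by_features):
    """One pass of the identification determination for a single tracked object.

    by_position and by_features are callables returning True when the
    respective processing can identify the object.
    """
    if by_position(track):            # S106: identification processing I
        return ("identified", "I")    # S108
    if by_features(track):            # S107: identification processing II
        return ("identified", "II")   # S108
    return ("undefined", None)        # S109, after which the flow returns to S103
```

Because the feature check is only reached when the position check fails, feature comparison is skipped entirely for objects that position information already identifies.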

Presentation of tracking information (S110): Once the tracked object is identified (S108), the tracking information of the tracked object, that is, the tracking display together with the image information of the object being tracked, is presented (S110), and the process returns to S103. The tracking display of each identified tracked object is shown in a color different from those of the others. That is, while a tracked object remains identified its tracking display keeps the same color, which differs from the colors of the tracking displays of other tracked objects. The tracking display of a tracked object whose identification has been lost may be given a color different from the one used while it was identified, to indicate that state.
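One way to keep a stable color per identified object, with a separate color for lost identification, is sketched below. The palette, the "lost" color, and the class name are all assumptions made for illustration; the text does not prescribe particular colors.

```python
import itertools

PALETTE = ["red", "blue", "green", "orange", "purple"]  # illustrative choice
LOST_COLOR = "gray"                                     # illustrative choice


class TrackingDisplayColors:
    """Assign each identified tracked object a stable display color."""

    def __init__(self):
        self._colors = {}
        self._next = itertools.cycle(PALETTE)

    def color_for(self, object_id, identified=True):
        """Color of the tracking display for this object in the current frame."""
        if not identified:
            return LOST_COLOR
        if object_id not in self._colors:
            self._colors[object_id] = next(self._next)
        return self._colors[object_id]
```

With this sketch, colors repeat once more objects are tracked than the palette holds, so a real palette would need to be at least as large as the expected number of simultaneous tracked objects.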

<Acquisition of Position Information>
FIG. 6 shows the procedure for acquiring position information. This procedure is a subroutine of the acquisition and storage of the position information of the tracked object (S103 in FIG. 5).

This procedure includes acquisition of image information and depth information (S201), detection of the tracked object and acquisition of the bounding box (S202), acquisition of the coordinate values of the tracked object (S203), storage of the coordinate values (S204), determination of whether coordinate value acquisition has finished (S205), and processing of the next frame (S206).

Acquisition of image information and depth information (S201): This is the acquisition of image information and depth information after the tracking mode has started (S102).

Detection of the tracked object P and acquisition of the bounding box (S202): Under the control of the tracking control unit 12, the tracked object is detected from the image information obtained from the camera 6, and its coordinate values are acquired. If the tracked object is a person, the person is detected in the image information using a human detection function provided by the OS (for example, Vision Framework - Request Human Detection), and a bounding box, which is two-dimensional information, is obtained.

Acquisition of the coordinate values of the tracked object P (S203): Position information is acquired, per bounding box B, as three-dimensional information including coordinate values representing the tracked object P. To acquire this position information, for example as shown in FIG. 9, the image information, which is two-dimensional, is combined with the depth information, and coordinate values are acquired in time series from the resulting combined image (three-dimensional coordinate information). This processing thus yields position information that represents the tracked object P within the bounding box with high accuracy.
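The text does not state how the two-dimensional image position and the depth value are combined into a three-dimensional coordinate. A common approach, shown here purely as an assumption, is pinhole back-projection using the camera intrinsics; the function name and the parameters fx, fy, cx, and cy (focal lengths and principal point in pixels) are not from the source.

```python
def pixel_to_camera_xyz(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth in metres to camera coordinates.

    Assumption: ideal pinhole camera without lens distortion, with the
    depth map aligned to the image.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)
```

A pixel at the principal point maps to a point straight ahead of the camera at the measured depth, and pixels farther from the principal point map to proportionally larger lateral offsets.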

Storage of the coordinate values (S204): The acquired coordinate values are stored in the tracking information DB 66 under the control of the tracking control unit 12. The coordinate values may be stored, for example, per frame of the image information.

Determination of whether coordinate value acquisition has finished (S205): Under the control of the tracking control unit 12, it is determined whether acquisition of the coordinate values in the frame has finished (S205), and the processing of S203 and S204 is repeated until it has (NO in S205).

Processing of the next frame (S206): When the tracking control unit 12 determines that acquisition of the coordinate values in the frame has finished (YES in S205), processing moves to the next frame (S206), and the processing of S201 to S206 is executed in time series.

<Processing of Bounding Box B>
FIG. 7 shows the processing of the bounding box B. This procedure includes dividing the bounding box B displayed on the tracked object and acquiring the position information of the tracked object, and is a subroutine of S202 in the position information acquisition processing (FIG. 6).

This procedure includes acquisition of the width w and height h of the bounding box B (S301), division of the bounding box (S302), generation of grid points g (S303), position determination of the grid points g (S304, S305), acquisition of the coordinate values of the grid points g (S306), a check of the reliability of the coordinate values (S307), reliability determination (S308), storage of the coordinate values of the grid points g (S309), determination of whether processing has finished (S310), determination of the number of coordinate values (S311), acquisition of the coordinate values of the center Bn (S312), setting of the position information (S313), determination that the position information is unknown (S314), and the like.

Acquisition of the width w and height h of the bounding box B (S301): Under the control of the tracking control unit 12, the width w and height h are obtained from the bounding box B acquired by the bounding box processing unit 18.

Division of the bounding box B (S302): Under the control of the tracking control unit 12, the bounding box processing unit 18 divides the bounding box B with a plurality of row lines and column lines, as shown in FIG. 8A, for example into 8 parts in the width direction and 8 parts in the height direction, giving 64 cells. This number of divisions is an example, and the present disclosure is not limited to it. With the width of the bounding box B in the X-axis direction written as w = x and its height in the Y-axis direction as h = y, the size of each cell, expressed as a step (Step), is given by Formulas 1 and 2.

xStep = w / 8 (Formula 1)
yStep = h / 8 (Formula 2)

Generation of grid points g (S303): Dividing the bounding box B in this way creates grid points g within the bounding box B through the 8×8 division, as shown in FIG. 8A. That is, the grid points g are the intersections of the row lines and column lines.
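Formulas 1 and 2 and the grid point generation can be sketched as follows. The sketch assumes the bounding box is given by its top-left corner (x0, y0) plus width and height, and that only the interior intersections count as grid points (7 × 7 = 49 points for 8 divisions); whether points on the box border are included is not stated in this passage. The function name is hypothetical.

```python
def grid_points(x0, y0, w, h, divisions=8):
    """Intersections of the row and column lines that divide a bounding box.

    Assumption: interior intersections only, listed row by row.
    """
    x_step = w / divisions   # Formula 1 for divisions == 8
    y_step = h / divisions   # Formula 2 for divisions == 8
    return [
        (x0 + i * x_step, y0 + j * y_step)
        for j in range(1, divisions)
        for i in range(1, divisions)
    ]
```

For a box at (10, 20) that is 80 wide and 160 high, the steps are 10 and 20, and the first grid point lies at (20, 40).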

Position determination of the grid point g (S304): The position of each generated grid point g is determined. The determination is whether the grid point g lies below the threshold, which is the position offset by +yStep from the center Bn of the bounding box B.

Position determination of the grid point g (S305): If the grid point g is at or below the threshold, that is, below the position offset by +yStep from the center Bn of the bounding box B (YES in S305), the process proceeds to S306. If the grid point g is above that position (NO in S305), the process returns to S304.

Acquisition of the coordinate values of the grid point g (S306): Three-dimensional coordinate values are acquired for each grid point g that lies below the position offset by +yStep from the center Bn of the bounding box B. The bounding box B is a region that surrounds the upper body of the tracked object, including the face region. A grid point g above the center Bn of the bounding box B therefore lies above the shoulders and may fall on something other than the body, so coordinate values that do not belong to the tracked object may be acquired. In contrast, a grid point g below the center Bn of the bounding box B falls on the central part of the tracked object, that is, the central part of the body, so valid three-dimensional coordinate values are likely to be obtained. For this reason, the grid points g below the center Bn of the bounding box B are acquired and used as the position information of the tracked object.

Reliability check of the coordinate values (S307): The reliability of the coordinate values of the grid point g is checked. As shown in FIG. 8B, this check refers to the depth information Di of the tracked object P that overlaps the grid point g. The depth information Di includes reliability information at three levels: low ("Depth-value accuracy in which the framework is less confident."), medium ("Depth-value accuracy in which the framework is moderately confident."), and high ("Depth-value accuracy in which the framework is fairly confident."). Each grid point g can therefore be checked using the reliability of the depth information.

Determination of the reliability of the grid point g (S308): Whether the reliability of the grid point g is high is used as the threshold for this determination. If the reliability of the grid point g is high (YES in S308), the process proceeds to S309. If the reliability is low or medium (NO in S308), the process returns to S305.

Saving the coordinate values (S309): If the reliability of the grid point g is high (YES in S308), the acquired three-dimensional coordinate values of the grid point g are registered and saved in the tracking information DB 66 as position information.

Determination of the end of processing (S310): The bounding box processing unit 18 determines whether all acquired grid points g have been processed. If not (NO in S310), the processing of S305 to S310 continues until every grid point g has been processed. When all grid points g have been processed (YES in S310), the process proceeds to S311.

Determination of the number of coordinate values (S311): The bounding box processing unit 18 determines whether the number of saved coordinate values of grid points g is equal to or greater than a threshold. The threshold is a constant, for example 2×2. In this case, if the number of saved coordinate values is 2×2 or more (YES in S311), the process proceeds to S312. If it is less than 2×2 (NO in S311), the process proceeds to S314.

Acquisition of the coordinate values of the center (S312): If the number of saved coordinate values of grid points g is 2×2 or more (YES in S311), the bounding box processing unit 18 recognizes the tracked object as one whose position information is determined. As a result of this recognition, the bounding box processing unit 18 acquires the coordinate values of the center Bn of the bounding box B.

In this case, the value of the center Bn of the bounding box B is obtained, for example, as follows, where MidX and MidY are the midpoints of the X and Y edges of the screen bounds:
MidX = screenBoundsd.MidX
MidY = screenBoundsd.MidY
Center = (MidX, MidY + yStep)

Setting of the position information (S313): The bounding box processing unit 18 sets the acquired coordinate values of the center Bn as the position information of the tracked object and ends this process.

Determination of unknown position information (S314): If the number of saved coordinate values of grid points g is less than 2×2 (NO in S311), the coordinate values of the grid points g have not been saved in sufficient number, so the position information of the tracked object is treated as unknown. The object is then a tracked object whose position information is unknown.
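The filtering and counting in S304 to S314 can be summarized in a short sketch. It is illustrative only: the names `GridSample` and `locate_target` are hypothetical, screen Y is assumed to grow downward (so "below" means a larger Y value), the 2×2 threshold is taken as a count of 4, and the result is the Center point of the text rather than a full three-dimensional position.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class GridSample:
    sx: float                          # screen X of the grid point
    sy: float                          # screen Y, assumed to grow downward
    world: Tuple[float, float, float]  # three-dimensional coordinate values
    confidence: str                    # "low", "medium" or "high"


def locate_target(
    samples: List[GridSample],
    mid_x: float,
    mid_y: float,
    y_step: float,
    min_count: int = 4,                # assumed reading of the 2x2 threshold
) -> Optional[Tuple[float, float]]:
    """Return Center = (MidX, MidY + yStep), or None when the position is unknown."""
    threshold_y = mid_y + y_step
    saved = []
    for s in samples:                  # loop closed by S310
        if s.sy < threshold_y:         # S305 NO: above the threshold
            continue
        if s.confidence != "high":     # S308 NO: low or medium reliability
            continue
        saved.append(s.world)          # S309: keep the coordinate values
    if len(saved) < min_count:         # S311 NO -> S314: position unknown
        return None
    return (mid_x, mid_y + y_step)     # S312, S313
```

Returning `None` for the unknown case keeps the caller's fallback to feature-based identification explicit.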

<Synthesis of image information and depth information, acquisition of position information>
FIG. 9A shows two-dimensional image information. When the tracked object P is imaged by the camera 6, image information If representing the tracked object P is obtained, as shown in FIG. 9A. This image information If is recorded in the tracking information DB 66 under the control of the tracking control unit 12.

FIG. 9B shows the depth information Di of the tracked object P. When the tracked object P is measured under the same conditions by the light detection and ranging unit 8, depth information Di representing the imaged object is obtained. In the depth information Di, the distance (a floating-point value) from the light detection and ranging unit 8 to the tracked object P and its reliability are stored as pairs, in order from the upper left to the lower right of the screen (frame). Visualizing the distance yields a grayscale image (binarized information) in which nearer points appear darker and farther points appear brighter. In other words, the depth information Di is binarized information that represents depth as contrast, and the tracked object P is obtained as shading information in the same way as in the image information If.

FIG. 9C shows the composite image Cm. For example, when the depth information Di is superimposed on the image information If to generate the composite image Cm, the depth information Di is added to the image information If, and the result is a composite image Cm that visually has depth. Because the bounding box B of the image information If is presented in this composite image Cm, three-dimensional coordinate values are acquired for each bounding box as the position information representing the tracked object P.

<Conversion from two-dimensional image information to three-dimensional coordinate values>
This conversion from two-dimensional image information to three-dimensional coordinate values is called coordinate value conversion. The coordinate value conversion includes the following processes:
A) Acquisition of the minimum and maximum screen coordinates (two-dimensional) of the grid points g
B) Acquisition of the world coordinates (three-dimensional)
C) Acquisition of the width w and height h of the bounding box B whose vertices are the valid grid points g
D) Acquisition of the world coordinates (wx, wy, wz) from the screen coordinates (sx, sy)
These processes are described below.

In A), the minimum and maximum screen coordinates (two-dimensional) of the acquired grid points g are obtained.
screen.min = minimum (X, Y) of the grid points g
screen.max = maximum (X, Y) of the grid points g

In B), the world coordinates (three-dimensional) of the grid points g are acquired, that is, the world coordinates to which the bounding box B corresponds.
world.min = minimum (X, Y, Z) of the grid points
world.max = maximum (X, Y, Z) of the grid points

In C), the width w and height h of the bounding box B whose vertices are the valid grid points g are acquired. The world coordinate values are used to obtain the width and height of the rectangle whose vertices are the valid grid points g.
screen.w = screen.max.x - screen.min.x
screen.h = screen.max.y - screen.min.y
world.w = world.max.x - world.min.x
world.h = world.max.y - world.min.y

In D), the world coordinates (wx, wy, wz) of an arbitrary point are obtained from its screen coordinates (sx, sy) using the following formulas.
wx = sx * (world.w / screen.w)
wy = sy * (world.h / screen.h)
wz = DepthBuffer[sx + sy * screenwidth]
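Processes A) to D) can be combined into one small function. The following is a sketch under stated assumptions: `screen_to_world` and its parameter names are hypothetical, the depth buffer is modeled as a flat row-major sequence of distances, and the guard against a zero-sized screen rectangle is an addition that the text does not mention.

```python
from typing import Sequence, Tuple


def screen_to_world(
    sx: int,
    sy: int,
    screen_min: Tuple[float, float],
    screen_max: Tuple[float, float],
    world_min: Tuple[float, float, float],
    world_max: Tuple[float, float, float],
    depth_buffer: Sequence[float],
    screen_width: int,
) -> Tuple[float, float, float]:
    """Convert screen coordinates (sx, sy) to world coordinates (wx, wy, wz)."""
    # C) width and height of the rectangle in screen and world coordinates
    screen_w = screen_max[0] - screen_min[0]
    screen_h = screen_max[1] - screen_min[1]
    world_w = world_max[0] - world_min[0]
    world_h = world_max[1] - world_min[1]
    if screen_w == 0 or screen_h == 0:
        raise ValueError("screen rectangle has zero width or height")
    # D) scale X and Y, and read Z from the row-major depth buffer
    wx = sx * (world_w / screen_w)
    wy = sy * (world_h / screen_h)
    wz = depth_buffer[sx + sy * screen_width]
    return wx, wy, wz
```

The index `sx + sy * screen_width` matches the upper-left to lower-right storage order described for the depth information Di.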

<Feature information of the tracked object P>
Feature information can be acquired from the image information of the tracked object P. If the tracked object is a person, the feature information is acquired from face information, and the tracked object can be identified using the feature amounts contained in this feature information.

The face information includes orientation information representing the orientation of the face. Classifying the feature information acquired from the face information by orientation makes the identification processing faster and more accurate. The orientation of the face can be determined from part information of the head, for example the position information of the left eye, right eye, left ear, and right ear.

<Acquisition of part information and classification of feature information>
FIG. 10 shows the face orientation determination logic and the categorization of feature information. This processing procedure is a subroutine of S104 (FIG. 5). It acquires feature information and part information from the acquired face information, determines the orientation, and classifies the feature information.

In this processing procedure, the left eye is the eye that appears on the left when facing the display screen, the left ear is the ear that appears on the left, the right eye is the eye that appears on the right, and the right ear is the ear that appears on the right. These differ from the left and right of the tracked object P itself. The same applies to the orientation information table 100 shown in FIG. 11.

In this processing procedure, processing is executed to acquire the position information of the left eye, right eye, left ear, and right ear as part information from the face image of the tracked object P (S401). If the position information of both the left ear and the right ear is acquired (S402), the face orientation of the tracked object P is determined to be "front" (S403), and the feature information is categorized as front (S404).

After S402, if the position information of both the left eye and the right eye is acquired (S405), and the position information of the left ear can be acquired but that of the right ear cannot (S406), the face orientation of the tracked object P is determined to be "left" (S407), and the feature information is categorized as left (S408).

After S405, if the position information of the right ear can be acquired but that of the left ear cannot (S409), the face orientation of the tracked object P is determined to be "right" (S410), and the feature information is categorized as right (S411).

After S405, if the position information of neither the right ear nor the left ear can be acquired (S412), the face orientation of the tracked object P is determined to be "front" (S403), and the feature information is categorized as front (S404).

After S405, if the position information of the left ear can be acquired but that of the right ear and the right eye cannot (S413), the face orientation of the tracked object P is determined to be "left" (S407), and the feature information is categorized as left (S408).

After S405, if the position information of the right ear can be acquired but that of the left ear and the left eye cannot (S414), the face orientation of the tracked object P is determined to be "right" (S410), and the feature information is categorized as right (S411).

After S405, if none of the position information of the left eye, right eye, left ear, and right ear can be acquired (S415), the face orientation of the tracked object P is determined to be "unknown" (S416), and the feature information is not categorized.
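The branches S402 to S416 can be written as a single decision function. This is a sketch, not the logic of FIG. 10 itself: the function name `classify_orientation` and the boolean inputs (whether each part's position information was acquired) are assumptions, and combinations that the text does not enumerate, such as one eye only with no ears, are assumed to fall through to "unknown".

```python
def classify_orientation(
    left_eye: bool, right_eye: bool, left_ear: bool, right_ear: bool
) -> str:
    """Return "front", "left", "right" or "unknown" from the acquired parts.

    Left and right are as seen when facing the display screen.
    """
    if left_ear and right_ear:          # S402 -> S403
        return "front"
    if left_eye and right_eye:          # S405
        if left_ear:                    # S406 -> S407
            return "left"
        if right_ear:                   # S409 -> S410
            return "right"
        return "front"                  # S412 -> S403
    if left_ear and not right_eye:      # S413 -> S407
        return "left"
    if right_ear and not left_eye:      # S414 -> S410
        return "right"
    return "unknown"                    # S415 -> S416 (and unlisted cases)
```

A caller would use the returned label as the category under which the feature information is stored, and skip categorization when the label is "unknown".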

<Determination of the orientation of the tracked object P>
FIG. 11 shows an orientation information table 100 that holds the acquired parts and the orientation information.

The orientation information table 100 contains an acquired part information section 102 and an orientation information section 104. For the moving tracked object P, the acquired part information section 102 stores the position information of each acquired part measured from the center position. The acquired part information section 102 contains a left eye section 102-1, a right eye section 102-2, a left ear section 102-3, and a right ear section 102-4.

The left eye section 102-1 stores the position information of the left eye of the tracked object P. The right eye section 102-2 stores the position information of the right eye. The left ear section 102-3 stores the position information of the left ear. The right ear section 102-4 stores the position information of the right ear.

The orientation information section 104 stores orientation information representing the orientation of the tracked object P, determined from the combination of acquired parts.

a) Face orientation = front
If the position information of both the left ear and the right ear can be acquired, the orientation of the tracked object P is determined to be "front". Likewise, if the position information of both the left eye and the right eye can be acquired, the orientation is determined to be "front" even when position information cannot be acquired for either ear.

b) Face orientation = left
If the position information of both the left eye and the right eye can be acquired and the position information of the left ear can be acquired, the orientation of the tracked object P is determined to be "left". Likewise, even when the position information of the left eye and the right eye cannot be acquired, the orientation is determined to be "left" if the position information of the left ear can be acquired.

c) Face orientation = right
If the position information of both the left eye and the right eye can be acquired and the position information of the right ear can be acquired, the orientation of the tracked object P is determined to be "right". Likewise, even when the position information of the left eye and the right eye cannot be acquired, the orientation is determined to be "right" if the position information of the right ear can be acquired.

d) Orientation = unknown
If position information cannot be acquired for any of the left eye, right eye, left ear, and right ear, the orientation of the tracked object P is determined to be "unknown".

<Identification processing of the tracked object P>
Tracking the tracked object P requires identification processing of the moving tracked object P. This identification processing includes identification processing I, which uses the position information acquired from the tracked object P, and identification processing II, which uses the feature information acquired from the tracked object P.

In this embodiment, identification processing I using position information is performed first, and identification processing II using feature information is performed only when identification processing I fails to identify the object. This makes the identification processing faster and more accurate.

When a moving tracked object P is tracked, its state changes from moment to moment. One such change is an overlap between the bounding boxes B-1 and B-2. When the bounding box B-1 overlaps another bounding box B-2, the tracked object P is lost.

While the tracked object P is being tracked, the movement distance M of the bounding box B varies with the object's movement speed. When the movement distance M becomes large, the position information changes accordingly, and the tracked object P is lost in this case as well.

The tracking range for the tracked object P is limited. When the tracked object P leaves this tracking range and later returns to it, the object cannot be identified from its position information alone. In other words, the tracked object P is also lost when it goes out of frame.

When the tracked object P cannot be identified from position information alone, identification processing II is executed using feature information of the tracked object P acquired in advance. In this embodiment, identification processing II thus complements identification processing I, which makes the identification processing both faster and more accurate.

<Identification processing I based on position information>
FIG. 12 shows identification processing I based on position information. This processing procedure is a subroutine of S105 in the processing procedure shown in FIG. 5.

This processing procedure includes acquiring the position information of the tracked object P (S501), checking the bounding boxes B for overlap (S502), determining overlap (S503), determining the indeterminate state (S504), calculating the movement distance M (S505), evaluating the movement distance M (S506), identifying the tracked object P (S507), and determining whether the check is complete (S508).

Acquisition of the position information of the tracked object P (S501): In identification processing I, the tracking control unit 12 performs identification using the position information acquired from the tracked object P.

Overlap check of the bounding boxes B (S502): During this identification, the bounding boxes B attached to the respective tracked objects P are checked for overlap. An overlap of bounding boxes B is a state in which two or more bounding boxes B touch or overlap each other on the image.

Overlap determination (S503): If an overlap has occurred between bounding boxes B (YES in S503), the process proceeds to S504. If no overlap has occurred (NO in S503), the process proceeds to S505.

Indeterminate determination (S504): If the bounding boxes B overlap (YES in S503), the tracking control unit 12 determines each of those bounding boxes B to be indeterminate, cancels the identification of the corresponding tracked object P by identification processing I, and ends this process. In this case, the process moves on to identification processing II based on feature information.

Calculation of the movement distance M (S505): If no overlap has occurred between bounding boxes B (NO in S503), the movement distance M between the bounding box B of the current frame and the bounding box B of the immediately preceding frame is calculated.

Evaluation of the movement distance M (S506): A threshold Mth, for example 0.5 m, is set for the calculated movement distance M. If M is 0.5 m or less (YES in S506), the process proceeds to S507. If the movement distance M exceeds the threshold Mth (NO in S506), the process proceeds to S504, and the tracking control unit 12 treats the bounding box B whose movement distance M exceeded the threshold Mth as indeterminate. That is, the identification of the corresponding tracked object P by identification processing I is cancelled and this process ends. In this case, identification is decided by identification processing II based on feature information.

Identification of the tracked object P (S507): If the calculated movement distance M is equal to or less than the threshold Mth (YES in S506), the tracked object P is identified by its position information. When M is equal to or less than Mth, the tracked object P in the current frame is judged to be the same as the one in the immediately preceding frame and is identified as such.

Determination of the end of the check (S508): The tracking control unit 12 determines whether the processing of S501 to S507 has been completed for all bounding boxes B. If not (NO in S508), the processing of S501 to S507 continues and is repeated once per bounding box. When all bounding boxes B have been processed (YES in S508), the process moves on to the identification determination (S105 in FIG. 5) and this process ends.
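Identification processing I (S501 to S508) can be sketched as a loop over the bounding boxes of the current frame. The sketch rests on several assumptions that are not in the text: the names `Box`, `boxes_overlap`, and `identify_by_position` are hypothetical, each current box is assumed to carry the ID of the candidate track it is compared with, the movement distance M is taken as the Euclidean distance between three-dimensional positions in metres, and a box with no counterpart in the previous frame is treated as indeterminate.

```python
import math
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Box:
    x: float
    y: float
    w: float
    h: float                               # rectangle on the image
    position: Tuple[float, float, float]   # three-dimensional position (m)


def boxes_overlap(a: Box, b: Box) -> bool:
    """True when the rectangles touch or overlap (contact counts, per S502)."""
    return (a.x <= b.x + b.w and b.x <= a.x + a.w
            and a.y <= b.y + b.h and b.y <= a.y + a.h)


def identify_by_position(
    current: Dict[str, Box], previous: Dict[str, Box], m_th: float = 0.5
) -> Dict[str, bool]:
    """Map each track ID to True (identified) or False (indeterminate)."""
    result: Dict[str, bool] = {}
    for track_id, box in current.items():          # loop closed by S508
        others = (b for k, b in current.items() if k != track_id)
        if any(boxes_overlap(box, o) for o in others):
            result[track_id] = False               # S503 YES -> S504
            continue
        prev = previous.get(track_id)
        if prev is None:                           # assumption: no history
            result[track_id] = False
            continue
        m = math.dist(box.position, prev.position)  # S505
        result[track_id] = m <= m_th               # S506 -> S507 or S504
    return result
```

Every ID mapped to `False` is a candidate for identification processing II based on feature information.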

<Overlap determination for bounding boxes B and acquisition of feature information>
FIG. 13A shows image information during tracking. The image contains three tracked objects P-1, P-2, and P-3. The tracked object P-1 has a bounding box B-1 and a tracking display T-1, and the tracked object P-2 has a bounding box B-2 and a tracking display T-2.

FIG. 13B shows the image information during tracking that follows FIG. 13A. This image information shows the bounding boxes B-1 and B-2 in an overlapping state. The tracked object P-1 overlaps the tracked object P-2, and the bounding boxes B-1 and B-2 overlap and merge into a single bounding box Bx. Similarly, the tracking displays T-1 and T-2 overlap and merge into a single tracking display Tx. In this case, both bounding boxes B-1 and B-2 are in the indeterminate state described above. Feature information is therefore acquired from the image information of each of the tracked objects P-1 and P-2 in the frame immediately before the overlap.

FIG. 13C shows the image information during tracking that follows FIG. 13B. This image information shows the bounding boxes B-1 and B-2 and the tracking displays T-1 and T-2 separated again after the overlap. The tracked object P-1 has moved away from the tracked object P-2, the bounding boxes B-1 and B-2 have separated, and the tracking displays T-1 and T-2 have separated likewise. Because the overlap has been resolved, the position information of both tracked objects P-1 and P-2 can be acquired. For identification by feature information, however, feature information is acquired from the image information of each of the tracked objects P-1 and P-2 in the frame immediately after the overlap is resolved.

<Evaluation of the movement distance M of the bounding box B and acquisition of feature information>
FIG. 14A shows the case where the movement distance M of the bounding box B does not exceed the threshold Mth (M ≤ Mth).

追跡対象Ｐの追跡中、追跡制御部１２は追跡対象Ｐ－１の移動距離Ｍを監視する。この移動距離ＭがＭ≦Ｍｔｈであれば、追跡対象Ｐ－１を同一と判断し、同定する。 During tracking of the tracked object P, the tracking control unit 12 monitors the moving distance M of the tracked object P-1. If the moving distance M satisfies M≤Mth, the tracked object P-1 is determined to be the same and identified.

図１４のＢは、バウンディングボックスＢ－１の移動距離Ｍが閾値Ｍｔｈより長い場合（Ｍ＞Ｍｔｈ）を示している。 B of FIG. 14 shows a case where the moving distance M of the bounding box B-1 is longer than the threshold Mth (M>Mth).

During tracking of the tracked object P-1, when the moving distance M of the tracked object P-1 exceeds the threshold Mth (M>Mth), the accuracy of identifying the tracked object P from the position information decreases, so the tracking control unit 12 sets the state to indeterminate and stops identification based on position information. In this case, for identification by feature information, the feature information of the tracked object P-1 is acquired in advance from the image information of the frame immediately before the transition to the indeterminate state.
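The threshold judgment of FIG. 14 can be illustrated with a short Python sketch. It assumes that the moving distance M is the Euclidean distance between the most recent and the current three-dimensional position; the disclosure states only that the two positions are compared in time series. The function names, the state strings, and the sample threshold are hypothetical.

```python
import math


def moving_distance(previous, current):
    """Euclidean distance between two (x, y, z) positions (assumed metric)."""
    return math.dist(previous, current)


def judge_by_position(previous, current, m_th):
    """Return "identified" when M <= Mth, otherwise "indeterminate"."""
    m = moving_distance(previous, current)
    return "identified" if m <= m_th else "indeterminate"


# Example: a small step between frames stays within the threshold,
# while a large jump is treated as indeterminate.
state_small = judge_by_position((0.0, 0.0, 2.0), (0.3, 0.0, 2.4), m_th=0.6)
state_large = judge_by_position((0.0, 0.0, 2.0), (3.0, 4.0, 2.0), m_th=0.6)
```

In the indeterminate case, position-based identification would stop and the feature information from the preceding frame would be used instead.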

<Frame-Out Determination and Acquisition of Feature Information>
FIG. 15A shows image information during tracking. This image information shows the movement of a single tracked object P-1 in frame F1.

図１５のＢは、追跡中、追跡対象Ｐ－１がフレームアウトした場合を示している。このフレームアウトとは矢印で示すように、追跡対象Ｐ－１が移動して追跡範囲を表すフレームＦ２から脱することである。この場合も不定状態とし、位置情報による同定を停止する。そこで、特徴情報による同定のために、不定状態に移行直前のフレームの画像情報から追跡対象Ｐ－１の特徴情報を取得しておく。 FIG. 15B shows the case where the tracked object P-1 is out of frame during tracking. This frame-out means that the tracked object P-1 moves out of the frame F2 representing the tracking range as indicated by the arrow. In this case as well, the state is set to an indefinite state, and the identification based on the position information is stopped. Therefore, for identification based on feature information, the feature information of the tracking target P-1 is obtained from the image information of the frame immediately before the transition to the undefined state.

FIG. 15C shows the case where the tracked object P-1 re-enters the frame. As indicated by the arrow, re-entry means that the tracked object P-1 moves from outside the frame F3 and enters the frame F3 again. In this case as well, for identification by feature information, the feature information of the tracked object P-1 is acquired from the image information of the frame immediately after re-entry.
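In the overlap, moving-distance, and frame-out cases alike, the feature information from the frame immediately before the indeterminate state has to be retained. One way to keep it is sketched below in Python. The class `PreTransitionCache` and its methods are hypothetical, and the disclosure does not describe how the retained information is stored beyond the tracking information database.

```python
class PreTransitionCache:
    """Retains, per track ID, the features of the last determinate frame."""

    def __init__(self):
        self._last = {}    # track ID -> features of the latest determinate frame
        self._frozen = {}  # track ID -> features frozen at the transition

    def update(self, track_id, features, indeterminate):
        """Record one frame for `track_id`."""
        if indeterminate:
            # Freeze only once, on the first indeterminate frame.
            if track_id not in self._frozen and track_id in self._last:
                self._frozen[track_id] = self._last[track_id]
        else:
            self._last[track_id] = features
            self._frozen.pop(track_id, None)

    def frozen(self, track_id):
        """Features from just before the transition, or None if not frozen."""
        return self._frozen.get(track_id)
```

The caller would read `frozen()` when the object reappears, compare it with the newly acquired features, and only then call `update()` with `indeterminate=False`.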

<Identification Processing II Based on Feature Information>
FIG. 16 shows identification processing II based on feature information. This processing procedure is a subroutine of S105 in the processing procedure shown in FIG. 5.

This processing procedure includes determination of an indeterminate bounding box B (S601), judgment of the previous frame (S602), specification of a new bounding box B (S603), acquisition of image information of the face region (S604), acquisition of orientation information and feature information (S605), comparison of feature information (S606), match judgment of feature information (S607), and identification processing of the bounding box B (S608).

不定バウンディングボックスＢの判定（Ｓ６０１）：特徴情報による同定処理IIは、不定バウンディングボックスの存在が前提である。したがって、同定処理部２８は、現フレームで不定とされたバウンディングボックスＢが存在するかを判定する。不定とされたバウンディングボックスＢがなければ（Ｓ６０１のＮＯ）、この処理を終了して同定判断１（Ｓ１０５：図５）にリターンする。不定とされたバウンディングボックスＢがあれば（Ｓ６０１のＹＥＳ）、Ｓ６０２に遷移する。 Determination of Indefinite Bounding Box B (S601): Identification processing II based on feature information is premised on the existence of an indefinite bounding box. Therefore, the identification processing unit 28 determines whether or not there is a bounding box B that is indefinite in the current frame. If there is no indefinite bounding box B (NO in S601), this process is terminated and the process returns to identification judgment 1 (S105: FIG. 5). If there is a bounding box B that is indefinite (YES in S601), the process proceeds to S602.

前フレームの判断（Ｓ６０２）：前フレームで不定とされたバウンディングボックスＢが存在するかを判定する。前フレームで不定とされたバウンディングボックスＢが存在しなければ（Ｓ６０２のＮＯ）、Ｓ６０３に遷移する。 Judgment of previous frame (S602): It is judged whether or not there is a bounding box B that is indeterminate in the previous frame. If there is no bounding box B which was made indefinite in the previous frame (NO in S602), the process proceeds to S603.

Specification of a new bounding box (S603): If no bounding box B was indeterminate in the previous frame (NO in S602), the bounding box B in the current frame is treated as a new bounding box B (S603), this processing ends, and the procedure returns to the identification judgment (S105: FIG. 5). If a bounding box B was indeterminate in the previous frame (YES in S602), the procedure proceeds to S604.

Acquisition of image information of the face region (S604): If a bounding box B that was made indeterminate exists (YES in S602), image information of the face region is acquired from the image information within the bounding box B.

向き情報および特徴情報の取得（Ｓ６０５）：取得した画像情報（顔画像）から向き情報および特徴情報を取得する。 Acquisition of Orientation Information and Feature Information (S605): Orientation information and feature information are acquired from the acquired image information (face image).

Comparison of feature information (S606): The acquired feature information is compared with the feature information of the bounding box B that was indeterminate in the previous frame, with items categorized under the same orientation information compared against each other.

Match judgment of feature information (S607): It is judged whether the feature information matches. If it matches (YES in S607), the procedure proceeds to S608; if it does not match (NO in S607), this processing ends and the procedure returns to the identification judgment (S105: FIG. 5). The match may be judged, for example, by the number of matching feature quantities.

Identification processing of bounding box B (S608): The identification processing unit 28 identifies the tracked object P by identifying the bounding box B of the current frame with the bounding box B of the previous frame, ends this processing, and returns to the identification judgment (S105: FIG. 5).
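Steps S605 to S608 can be sketched in Python as a comparison restricted to one orientation category followed by a match-count test. Several points are assumptions made for illustration only: feature quantities are modelled as sets of hashable values, the candidate with the most matches is chosen, and `min_matches` is a hypothetical threshold. The text says only that the match may be judged by the number of matching feature quantities.

```python
def identify_by_features(orientation, features, stored, min_matches):
    """Return the track ID whose stored features best match, or None.

    orientation: category of the current face image, e.g. "front" (S605)
    features:    set of feature quantities from the current frame (S605)
    stored:      track ID -> {orientation: set of feature quantities},
                 taken from the boxes that were indeterminate earlier
    min_matches: number of matching quantities required for a match (S607)
    """
    best_id, best_count = None, 0
    for track_id, by_orientation in stored.items():
        candidate = by_orientation.get(orientation)
        if candidate is None:
            continue  # S606: compare only within the same orientation category
        count = len(features & candidate)
        if count > best_count:
            best_id, best_count = track_id, count
    # S607: a match requires enough matching quantities; S608 would then
    # identify the current bounding box with the returned track ID.
    return best_id if best_count >= min_matches else None
```

Returning None corresponds to the NO branch of S607, where the subroutine ends without an identification.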

<Effects of the First Embodiment>
According to the first embodiment, any of the following effects can be obtained.
(1) The tracked object P is identified using position information that includes coordinate values derived from the two-dimensional image information and the depth information acquired from the tracked object P, so it can be identified with high accuracy.

(2) 位置情報を用いた同定処理Ｉで追跡対象Ｐの同定ができない場合には、追跡対象Ｐの画像情報から取得した特徴情報を以て同定処理IIを行うので、同定処理IIを以て同定機能を補完することができ、追跡対象Ｐの同定精度を高めることができる。 (2) If the tracking target P cannot be identified by the identification processing I using the position information, the identification processing II is performed using the feature information acquired from the image information of the tracking target P, so the identification processing II complements the identification function. and the identification accuracy of the tracked object P can be improved.

(3) バウンディングボックスを分割して位置情報を間引き、処理情報の軽量化を図ることができ、同定処理の高速化とともに同定のための処理負荷を軽減できる。 (3) By dividing the bounding box and thinning out the position information, the processing information can be lightened, and the identification processing can be speeded up and the processing load for identification can be reduced.

(4) 追跡対象の追跡状態を追跡表示によって提示できる。また、追跡表示Ｔは追跡対象の追跡中の同定状態を着色によって表すことができる。追跡表示Ｔの着色の変化を確認すれば、同定状態か同定不良か、追跡中か追跡失敗かを容易に認識できる。 (4) The tracking status of the tracked object can be presented by the tracking display. Further, the tracking display T can express the identification state of the tracked object during tracking by coloring. By confirming the change in coloration of the tracking display T, it is possible to easily recognize whether the identification is in the identification state or the identification failure, and whether the tracking is in progress or the tracking has failed.
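Effect (3) above relies on dividing the bounding box into a grid and keeping only the grid coordinates. A minimal Python sketch of that thinning is given here. It assumes that one sample is taken at the centre of each grid cell; the disclosure refers to grid points but this passage does not say where in each cell they lie, and the function name `grid_points` is hypothetical.

```python
def grid_points(x_min, y_min, x_max, y_max, rows, cols):
    """Return one (x, y) sample per cell of a rows x cols grid over the box.

    Sampling cell centres is an assumption made for this sketch.
    """
    cell_w = (x_max - x_min) / cols
    cell_h = (y_max - y_min) / rows
    return [
        (x_min + (c + 0.5) * cell_w, y_min + (r + 0.5) * cell_h)
        for r in range(rows)
        for c in range(cols)
    ]


# A 100 x 200 pixel box holds 20,000 pixels; a 4 x 3 grid keeps 12 samples.
samples = grid_points(0, 0, 100, 200, rows=4, cols=3)
```

Depth values would then be looked up only at these samples, which is what reduces the amount of position information to process.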

[Second Embodiment]
FIG. 17 shows an object tracking system 2 according to the second embodiment. In FIG. 17, parts identical to those in FIG. 3 are denoted by the same reference numerals.

As shown in FIG. 17, this object tracking system 2 includes a plurality of devices 38-1, 38-2, ..., 38-n, and each of the devices 38-1, 38-2, ..., 38-n is connected to a server 36 through a network 34 by wire or wirelessly.

The tracking information obtained by the devices 38-1, 38-2, ..., 38-n is gathered at the server 36, combined as necessary, and used to track the tracked object P or a plurality of tracked objects P-1, P-2, ..., P-n.

<Effects of the Second Embodiment>
According to the second embodiment, any of the following effects can be obtained.
(1) The tracking range can be expanded.

FIG. 18 shows the expansion of the tracking range by two devices 38-1 and 38-2. Let the angle of view of the device 38-1 be A-1 (= tracking range) and the angle of view of the device 38-2 be A-2 (= tracking range). Of the tracked objects P-1, P-2, ..., P-7, the tracked objects P-1 and P-2 can be tracked within the angle of view A-1, and the tracked objects P-3 and P-4, which lie outside the angle of view A-1, can be tracked within the angle of view A-2. The tracked objects P-5, P-6, and P-7 can also be tracked within the angle of view A-2. The angles of view A-1 and A-2 together therefore enlarge the overall angle of view, so that the tracked objects P-1, P-2, ..., P-7 can all be tracked. Because this tracking information is gathered at the server 36, the tracking range can be expanded and tracking can be complemented.

(2) The tracking range can be widened and an area can be enclosed as the tracking range.
FIG. 19 shows the widening of the tracking range and combined tracking by four devices 38-1, 38-2, 38-3, and 38-4. In this case, a large number of tracked objects, for example P-11, P-12, ..., P-14, P-21, P-22, ..., P-24, ..., P-51, P-52, ..., P-54, are present in a tracking area 106 such as a hospital waiting room.

Four devices 38-1, 38-2, 38-3, and 38-4 are arranged around this tracking area 106 so as to surround it. Let the angles of view of the devices 38-1, 38-2, 38-3, and 38-4 be A-11, A-12, A-13, and A-14, respectively. The tracking area 106 is enclosed by the angles of view A-11, A-12, A-13, and A-14 and is thereby set as the tracking range.

In this way, all of the tracked objects P-11, P-12, ..., P-14, P-21, P-22, ..., P-24, ..., P-51, P-52, ..., P-54 that stay in, enter, or leave the tracking area 106 can be tracked.

In addition, indeterminate tracking caused by overlaps among the tracked objects P-11, P-12, ..., P-14, P-21, P-22, ..., P-24, ..., P-51, P-52, ..., P-54 as they move within the tracking area 106 can be resolved among the devices 38-1, 38-2, 38-3, and 38-4, and the server 36 can quickly perform processing to avoid indeterminate tracking.
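How the server 36 might combine reports so that one device resolves another device's indeterminate state can be sketched in Python. The sketch assumes that all devices report under shared track IDs and in a common coordinate system, which the disclosure does not state; the report layout and the function `merge_reports` are hypothetical.

```python
def merge_reports(reports):
    """Merge per-device reports into one view of all tracked objects.

    reports: list of dicts, one per device, mapping a track ID to a
             (state, position) pair where state is "identified" or
             "indeterminate". An identified report from any device
             takes precedence over an indeterminate one.
    """
    merged = {}
    for report in reports:
        for track_id, (state, position) in report.items():
            known = merged.get(track_id)
            if known is None or (known[0] != "identified" and state == "identified"):
                merged[track_id] = (state, position)
    return merged


# Device 38-1 has lost P-11 in an overlap, but device 38-2 still sees it.
device_1 = {"P-11": ("indeterminate", None), "P-12": ("identified", (1.0, 2.0, 3.0))}
device_2 = {"P-11": ("identified", (4.0, 5.0, 6.0)), "P-21": ("identified", (7.0, 8.0, 9.0))}
combined = merge_reports([device_1, device_2])
```

The merged view covers the union of the objects seen by the devices, which corresponds to the expanded tracking range, and prefers a determinate observation where one exists.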

[Other Embodiments]
(1) The above embodiment includes identification processing I and identification processing II; identification processing II is executed when identification processing I cannot identify the tracked object, so that identification processing II complements identification processing I. Alternatively, identification processing I and identification processing II may be executed simultaneously and, provided that either one can make the identification, identification processing II may be given precedence. Identification processing I and identification processing II may also be executed simultaneously, with the tracked object identified on the basis of both identifications.

(2) 上記実施の形態では、追跡対象のバウンディングボックスと他の追跡対象のバウンディングボックスとの間に重なりを生じた場合、該重なりの直前の前記特徴情報と、前記重なりの解消時点の前記特徴情報とを対比して追跡対象を同定する処理を行っているが、追跡中の追跡対象と他の追跡対象の画像間に重なりを生じた場合にも同様の処理を行ってもよい。 (2) In the above embodiment, when an overlap occurs between the bounding box of the tracked object and the bounding box of another tracked object, the feature information immediately before the overlap and the feature information at the time of cancellation of the overlap Although the process of identifying the tracked target by comparing it with the information is performed, the same process may be performed when an overlap occurs between the images of the tracked target being tracked and another tracked target.

(3) In the above embodiment, a person is given as an example of the tracked object P, but the technique can also be used to track objects other than people. It is applicable, for example, to robots, moving bodies, and tree felling.

(4) デバイス３８には他の情報処理端末を用いてもよく、本開示の追跡プログラムは既存の特定のプログラムに限定されない。 (4) Other information processing terminals may be used for the device 38, and the tracking program of the present disclosure is not limited to a specific existing program.

(5) 追跡中、追跡対象Ｐや追跡表示Ｔを背景色と異なる着色を以て追跡情報を提示してもよい。 (5) During tracking, the tracking information may be presented with the tracking target P and the tracking display T colored differently from the background color.

(6) The object tracking system 2 shown in FIG. 1 is illustrated as a system that includes the portable multifunction device 38, but the portable multifunction device 38 need not be provided as a single unit. The system may instead provide functional units such as the camera 6 and the light detection and ranging unit 8 separately from the processing unit 4, and these may be installed at different locations.

As described above, the present disclosure has been described with reference to its most preferred embodiments. The present disclosure is not limited to the above description. Those skilled in the art can make various modifications and changes based on the gist described in the claims or disclosed in the description of embodiments. It goes without saying that such modifications and changes fall within the scope of the present disclosure.

According to the present disclosure, a tracked object is identified by position information including coordinate values, derived from two-dimensional image information and depth information acquired from the tracked object, and is also identified by feature information of the tracked object acquired from the image information. Identification accuracy can therefore be increased, and dividing the bounding box lightens the tracking information, which speeds up identification and reduces the processing load.

P: tracked object
B: bounding box
T: tracking display
2: object tracking system
4: processing unit
6: camera
8: light detection and ranging unit
10: information presentation unit
12: tracking control unit
14: image information acquisition unit
16: depth information acquisition unit
18: bounding box processing unit
20: tracking information database generation unit
22: position information processing unit
24: feature information processing unit
26: state information processing unit
28: identification processing unit
30: identification information presentation unit
32: coordination processing unit
34: network
36: server
38: device
40: apparatus body
41: information acquisition unit
45: display
46: display screen section
48: touch panel
56: operation input unit
58: processor
60: storage unit
62: input/output unit (I/O)
64: communication unit
66: tracking information database
67-1, 67-2, ..., 67-n: tracked object file
68: image information section
70: depth information section
72: bounding box section
74: grid point section
76: position information section
78: feature information section
80: classification information section
82: orientation information section
82-1: front section
82-2: left-facing section
82-3: right-facing section
82-4: indeterminate section
84: state information section
84-1: moving distance information section
84-2: overlap information section
86: identification information section
88: history information section
100: orientation information table
102: acquired part information section
102-1: left eye section
102-2: right eye section
102-3: left ear section
102-4: right ear section
104: orientation information section
106: tracking area

Claims

1. An object tracking method, comprising:
a step of acquiring image information and depth information in time series by imaging a tracked object, and acquiring position information including three-dimensional coordinate values in time series from the image information and the depth information;
a step of acquiring, in time series from the image information, orientation information of the tracked object and feature information of a face region of the tracked object, classifying the acquired feature information according to the orientation information, and storing it in a database;
a step of identifying the tracked object using the position information; and
a step of, if the tracked object cannot be identified by the position information based on the acquired image information and depth information, acquiring the orientation information of the tracked object and the feature information of the face region from the image information, and comparing the acquired feature information with the feature information classified by the orientation information in the database to identify the tracked object.

2. The object tracking method according to claim 1, further comprising:
a step of comparing, during tracking, the current position information and the most recent position information of the tracked object in time series to calculate a moving distance of the tracked object; and
a step of identifying the tracked object if the moving distance is within a threshold.

3. The object tracking method according to claim 1, further comprising a step of, if an overlap occurs between the image of the tracked object being tracked and the image of another tracked object, or between the bounding box of the tracked object and the bounding box of the other tracked object, identifying the tracked object by comparing the feature information immediately before the overlap with the feature information at the time the overlap is resolved.

4. The object tracking method according to any one of claims 1 to 3, further comprising:
a step of acquiring face region information in time series from the image information, and acquiring part information of at least an eye or an ear from the face region information;
a step of acquiring the orientation information of the tracked object using the part information, classifying the feature information according to the orientation information, and storing it in the database; and
a step of, when the tracked object cannot be identified from the position information, acquiring the orientation information of the tracked object using the part information acquired from the image information, acquiring the feature information of the face region from the image information, and comparing the acquired feature information with the feature information classified by the orientation information in the database to identify the tracked object.

5. The object tracking method according to claim 3, further comprising a step of dividing the bounding box to form a grid of multiple rows and multiple columns, and acquiring the position information specified by coordinate values of the grid.

6. The object tracking method according to claim 4, further comprising a step of dividing the bounding box to form a grid of multiple rows and multiple columns, and acquiring the part information specified by coordinate values of the grid.

7. A program to be executed by a computer, the program causing the computer to execute:
a function of acquiring image information and depth information in time series by imaging a tracked object, and acquiring position information including three-dimensional coordinate values in time series from the image information and the depth information;
a function of acquiring, in time series from the image information, orientation information of the tracked object and feature information of a face region of the tracked object, classifying the acquired feature information according to the orientation information, and storing it in a database;
a function of identifying the tracked object using the position information; and
a function of, if the tracked object cannot be identified by the position information based on the acquired image information and depth information, acquiring the orientation information of the tracked object and the feature information of the face region from the image information, and comparing the acquired feature information with the feature information classified by the orientation information in the database to identify the tracked object.

8. The program according to claim 7, further causing the computer to execute:
a function of comparing, during tracking, the current position information and the most recent position information of the tracked object in time series to calculate a moving distance of the tracked object; and
a function of identifying the tracked object if the moving distance is within a threshold.

9. The program according to claim 7 or 8, further causing the computer to execute a function of, if an overlap occurs between the image of the tracked object being tracked and the image of another tracked object, or between the bounding box of the tracked object and the bounding box of the other tracked object, identifying the tracked object by comparing the feature information immediately before the overlap with the feature information at the time the overlap is resolved.

10. The program according to any one of claims 7 to 9, further causing the computer to execute:
a function of acquiring face region information in time series from the image information, and acquiring part information of at least an eye or an ear from the face region information;
a function of acquiring the orientation information of the tracked object using the part information, classifying the feature information according to the orientation information, and storing it in the database; and
a function of, when the tracked object cannot be identified from the position information, acquiring the orientation information of the tracked object using the part information acquired from the image information, acquiring the feature information of the face region from the image information, and comparing the acquired feature information with the feature information classified by the orientation information in the database to identify the tracked object.

11. The program according to claim 9, further causing the computer to execute a function of dividing the bounding box to form a grid of multiple rows and multiple columns, and acquiring the position information specified by coordinate values of the grid.

12. The program according to claim 10, further causing the computer to execute a function of dividing the bounding box to form a grid of multiple rows and multiple columns, and acquiring the part information specified by coordinate values of the grid.

13. An object tracking system, comprising:
an image information acquisition unit that acquires at least two-dimensional image information representing a tracked object in time series;
a depth information acquisition unit that acquires depth information of the tracked object in time series;
a feature information acquisition unit that acquires orientation information and feature information of the tracked object from the image information;
a database that classifies and stores the acquired feature information according to the orientation information;
an identification processing unit that acquires position information of the tracked object using the image information and the depth information, identifies the tracked object using the position information, and, when the tracked object cannot be identified by the position information, acquires the orientation information of the tracked object and the feature information of the face region from the image information and compares the acquired feature information with the feature information classified by the orientation information in the database to identify the tracked object; and
an information presentation unit that presents tracking information together with image information representing the tracked object.

14. The object tracking system according to claim 13, comprising:
two or more devices each including at least the image information acquisition unit, the depth information acquisition unit, the feature information acquisition unit, and the identification processing unit, and each outputting tracking information of a tracked object; and
a server that acquires the tracking information from each of the devices and tracks the tracked object with an expanded tracking angle or tracking range.

15. A recording medium storing the program according to any one of claims 7 to 12.