JP2020107349A - Object tracking device, object tracking system, and program

Info

Publication number: JP2020107349A
Application number: JP2020036328A
Authority: JP
Inventors: Ryoma Oami (大網 亮磨); Hiroyoshi Miyano (宮野 博義)
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2014-09-26
Filing date: 2020-03-04
Publication date: 2020-07-09
Anticipated expiration: 2034-11-04
Also published as: JP6488647B2; JP2016071830A; JP7004017B2; JP7272417B2; JP2022036143A; US20230419671A1; JP2023083565A; JP6673508B2; JP2019114280A

Abstract

To provide a technique capable of tracking an object with higher precision.
SOLUTION: An object tracking system comprises a plurality of detection means and integrated tracking means. The integrated tracking means outputs position information of an object calculated in a common coordinate system. Each detection means is configured to: convert the position information of the object in the common coordinate system into position information expressed in an individual coordinate system specific to the camera that outputs the image in which the object is to be detected; track the object in the individual coordinate system; detect the object based on the position information expressed in the individual coordinate system; and convert the position information of the detected object into position information expressed in the common coordinate system.
SELECTED DRAWING: Figure 1

Description

The present invention relates to an object tracking device, an object tracking system, an object tracking method, a display control device, an object detection device, a program, and a recording medium.

In recent years, systems that track a person using a plurality of cameras or the like have been developed. For example, the moving body tracking system described in Patent Document 1 uses a plurality of in-camera tracking means, each of which tracks persons independently for its own camera. The system then tracks a moving body through cooperation among the in-camera tracking means. Patent Document 2 describes a method of tracking the same object captured by a plurality of imaging units based on the tracking results of the individual objects.

As a related technique, Patent Document 3 describes a method of removing an object that does not need to be tracked from the tracking targets at an early stage.

Further, Patent Document 4 describes a device that detects a moving body in an image captured by a single imaging means.

Patent Document 1: JP 2004-72628 A
Patent Document 2: JP 2009-510541 A (Japanese translation of a PCT application)
Patent Document 3: International Publication No. WO 2013/012091
Patent Document 4: JP 2006-202047 A

However, with the techniques described in Patent Documents 1 and 2, when, for example, a camera is located far from an object, the accuracy of tracking the object (moving body) in the image captured by that camera may be low. In that case, because they are affected by that camera's tracking accuracy, the techniques of Patent Documents 1 and 2 may fail to integrate the tracking results for the object, or, even when integration succeeds, the position of the object obtained at integration may be detected with low accuracy.

The present invention has been made in view of the above problem, and an object thereof is to provide a technique capable of tracking an object with higher accuracy.

An object tracking device according to one aspect of the present invention includes: a plurality of detection means, each of which detects an object from output information of a sensor and outputs a detection result; and integrated tracking means that tracks the object based on the plurality of detection results output by the respective detection means and generates tracking information of the object expressed in a common coordinate system. The integrated tracking means outputs the generated tracking information to each of the plurality of detection means, and each detection means detects the object based on the tracking information.

An object tracking system according to one aspect of the present invention includes a sensor and an object tracking device that receives output information consisting of information acquired by the sensor. The object tracking device includes: a plurality of detection means, each of which detects the object from the output information and outputs a detection result; and integrated tracking means that tracks the object based on the plurality of detection results output by the respective detection means and generates tracking information of the object expressed in a common coordinate system. The integrated tracking means outputs the generated tracking information to each of the plurality of detection means, and each detection means detects the object based on the tracking information.

An object tracking method according to one aspect of the present invention includes: detecting an object from output information of a sensor and outputting a detection result; tracking the object based on a plurality of the output detection results and generating tracking information of the object expressed in a common coordinate system; and outputting the generated tracking information. In the detecting, the object is detected based on the tracking information.

A display control device according to one aspect of the present invention causes a display device to display display data. The display data indicates a search range, within the output information of a sensor, in which an object is searched for based on tracking information of the object, the search range being expressed in an individual coordinate system specific to the sensor that outputs the output information. The tracking information of the object is information indicating the result of tracking the object based on detection results of the object detected within the search range in the output information of each of a plurality of sensors.

An object detection device according to one aspect of the present invention detects an object from output information of a sensor based on tracking information expressed in a common coordinate system. The tracking information indicates the result of tracking the object, the tracking having been performed based on a plurality of detection results output from a plurality of object detection devices.

A computer program that causes a computer to implement any of the above devices, the object tracking system, or the object tracking method, and a computer-readable storage medium storing that computer program, are also included in the scope of the present invention.

According to the present invention, an object can be tracked with higher accuracy.

A functional block diagram showing an example of the functional configuration of the object tracking device according to the first embodiment of the present invention.
A diagram showing an example of the schematic overall configuration of the object tracking system according to the first embodiment of the present invention.
A diagram for explaining the process of associating targets with trackers.
A functional block diagram showing an example of the functional configuration of the detection unit of the object tracking device according to the first embodiment of the present invention.
A functional block diagram showing an example of the functional configuration of the object detection unit in the detection unit of the object tracking device according to the first embodiment of the present invention.
A functional block diagram showing an example of the functional configuration of the integrated tracking unit of the object tracking device according to the first embodiment of the present invention.
A diagram for explaining the sequential object tracking process performed by the integrated tracking unit according to the first embodiment of the present invention.
A flowchart showing an example of the flow of the object tracking process of the object tracking device according to the first embodiment of the present invention.
A functional block diagram showing an example of the functional configuration of the object tracking device according to the second embodiment of the present invention.
A functional block diagram showing an example of the functional configuration of the detection unit of the object tracking device according to the second embodiment of the present invention.
A diagram for explaining an application example of the object tracking device according to the second embodiment of the present invention.
A diagram for explaining an application example of the object tracking device according to the second embodiment of the present invention.
A diagram for explaining an application example of the object tracking device according to the second embodiment of the present invention.
A diagram for explaining an application example of the object tracking device according to the second embodiment of the present invention.
A functional block diagram showing an example of the functional configuration of the detection unit of the object tracking device according to the third embodiment of the present invention.
A functional block diagram showing an example of the functional configuration of the object detection unit in the detection unit of the object tracking device according to the third embodiment of the present invention.
A functional block diagram showing an example of the functional configuration of the integrated tracking unit of the object tracking device according to the fourth embodiment of the present invention.
A diagram for explaining the batch object tracking process performed by the integrated tracking unit according to the fourth embodiment of the present invention.
A diagram illustrating the hardware configuration of a computer (information processing device) capable of implementing each embodiment of the present invention.

<First Embodiment>
A first embodiment of the present invention will be described in detail with reference to the drawings. First, the overall configuration of the object tracking system (also simply called the system) of the present invention will be described with reference to FIG. 2. FIG. 2 is a diagram showing an example of the schematic overall configuration of the object tracking system 1 according to the present embodiment. As shown in FIG. 2, the object tracking system 1 according to the present embodiment includes an object tracking device 10, a plurality of cameras (20-1 to 20-N, where N is a natural number), and one or more display devices 30. In the present embodiment, when the cameras (20-1 to 20-N) are not distinguished from one another, or are referred to collectively, they are called the camera 20.

The object tracking device 10, the camera 20, and the display device 30 are communicably connected to one another via a network 40. The display device 30 need not be included in the object tracking system 1. The display device 30 may also be connected directly to the object tracking device 10 without going through the network 40.

The camera 20 functions as a sensor that detects an object. The present embodiment is described using the camera 20 as the sensor that detects an object, but the present invention is not limited to this. The sensor is not limited to a camera and may be any sensor capable of position measurement, such as a radio wave sensor. A device that combines several sensors, such as a sensor integrating a radio wave sensor and a camera, may also be used. Using the camera 20 as the sensor allows the object tracking device 10 to acquire visual information such as color more suitably.

In the present embodiment, the information acquired by the sensor is described as being video captured by a camera. When the sensor is a radio wave sensor, the information acquired by the sensor is the radio waves acquired by that radio wave sensor.

The object tracking device 10 is a device that tracks objects contained in the video captured by each of the plurality of cameras 20. The functional configuration of the object tracking device 10 will be described with reference to a separate drawing.

The display device 30 displays the results of object tracking by the object tracking device 10. The display device 30 may also display the video captured by the camera 20, and may display other information such as flow line information.

(Object tracking device 10)
Next, the functions of the object tracking device 10 will be described. FIG. 1 is a functional block diagram showing an example of the functional configuration of the object tracking device 10 according to the present embodiment. As shown in FIG. 1, the object tracking device 10 includes a plurality of detection units (100-1 to 100-N) and an integrated tracking unit 200. In the present embodiment, when the detection units (100-1 to 100-N) are not distinguished from one another, or are referred to collectively, they are called the detection unit 100.

(Detection unit 100)
The detection unit 100 detects an object from the output information of the camera 20 based on tracking information output from the integrated tracking unit 200 (described later), specifically the tracking information of the object in a frame preceding the frame in which the object is to be detected. In the present embodiment, the output information of the camera 20 is video data representing the video captured by the camera 20.

In the present embodiment, the detection units 100 and the cameras 20 are assumed to be associated one-to-one. For example, the detection unit 100-1 detects objects from the video captured by the camera 20-1, and the detection unit 100-2 detects objects from the video captured by the camera 20-2. The present embodiment is not limited to this pairing; for example, the detection unit 100-1 may detect objects from the video captured by the camera 20-N.

The cameras 20 and the detection units 100 also need not be associated one-to-one. For example, the detection unit 100-1 may detect objects from the video captured by each of a plurality of cameras 20.

The operation of the detection unit 100 is described below. The detection unit 100 receives, from the camera 20, video data representing the video captured by the camera 20 (hereinafter called camera video). In FIG. 1, the video data representing the video captured by the camera 20-n (n is 1 to N) is labeled camera video (n). The camera video may be video captured by the camera 20, such as a surveillance camera, and acquired in real time, or it may be video captured by the camera 20 that was first stored in a storage unit or the like (not shown) and later decoded (or played back). The video data includes time information indicating the time of capture.

The detection unit 100 also receives the tracking information of the object in the previous frame from the integrated tracking unit 200. The previous frame may be the frame immediately before the frame in which the object is to be detected (the current frame), or a frame a predetermined number of frames before the current frame. There may be one previous frame or several. When the detection unit 100 performs object detection on the first frame, no tracking information of the object in a previous frame exists, so the detection unit 100 does not receive (does not use) such tracking information.

Using the received camera video and the tracking information of the object in the previous frame, the detection unit 100 detects objects in the camera video (this is called object detection). As noted above, when performing object detection on the first frame, the detection unit 100 performs detection without using tracking information from a previous frame. In the following, a detected object is called a target. That is, the result of object detection (also called the object detection result, or simply the detection result) is a set of targets.

The object detection result includes, for each target, information such as the position of the target and the size of the target. Specifically, the object detection result includes, for example, the circumscribed rectangle of the region occupied by the target (the target region) in the video frame in which the object was detected, the coordinate values of the centroid of the target region, the width of the target, and the height of the target. The object detection result is not limited to these items. For example, instead of or in addition to the coordinate values of the centroid of the target region, it may include the coordinate values of the uppermost or lowermost end of the target region. It is sufficient for the object detection result to include, for each target, information representing the position, size, and so on of the target.

The present embodiment is described using the example in which the object detection result includes, for each target, the coordinate values of the lowermost end of the target and information indicating the circumscribed rectangle of the target. The coordinate values of the lowermost end of the target are the coordinate values of the point where the object touches the floor (ground) and/or the coordinate values of the midpoint of the lower side of the object's circumscribed rectangle. When the object is a person, the coordinate values of the lowermost end of the target may be the coordinate values of the person's feet.
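As a rough illustration of the detection result described above, one target could be represented as follows. This is a minimal sketch, not the patent's data format: the class name `Target`, the field names, and the `(x_min, y_min, x_max, y_max)` pixel convention with the y axis pointing down are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Target:
    """One detected object in a single frame (hypothetical layout)."""

    # Circumscribed rectangle as (x_min, y_min, x_max, y_max) in pixels.
    # Assumes image coordinates with the y axis pointing down.
    bbox: Tuple[float, float, float, float]
    # Optional likelihood of the detection, if the detector provides one.
    likelihood: Optional[float] = None

    @property
    def bottom_point(self) -> Tuple[float, float]:
        """Midpoint of the lower side of the circumscribed rectangle."""
        x_min, _, x_max, y_max = self.bbox
        return ((x_min + x_max) / 2.0, y_max)

    @property
    def size(self) -> Tuple[float, float]:
        """Width and height of the circumscribed rectangle."""
        x_min, y_min, x_max, y_max = self.bbox
        return (x_max - x_min, y_max - y_min)
```

For a person, `bottom_point` approximates the position of the feet, which is the point that is later expressed in the common coordinate system.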

The detection unit 100 then converts the coordinate values included in the object detection result into coordinate values in a common coordinate system defined within the space captured by the plurality of cameras 20 (the capture space), and uses the converted coordinate values as the object detection result.
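The text does not state here how the conversion to the common coordinate system is computed. One common way to do it, shown only as a sketch under that assumption, is a planar homography that maps a point on the floor in the image to floor coordinates in the capture space. The function name `image_to_common` and the 3x3 matrix `H` (which would come from camera calibration) are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]
Matrix3x3 = Sequence[Sequence[float]]


def image_to_common(point: Point, H: Matrix3x3) -> Point:
    """Map an image point (u, v) to common-coordinate (x, y).

    Assumes the point lies on the floor plane and that H is a
    calibrated image-to-floor homography for this camera.
    """
    u, v = point
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to get Cartesian values.
    return (x / w, y / w)
```

Each camera would have its own matrix, so that the bottom points detected in different camera images all land in the same floor coordinates.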

In addition to the information described above, the object detection result may include information representing the shape of the target. That is, the object detection result may include silhouette information or the like representing the target region. Silhouette information is information that distinguishes pixels inside the target region from pixels outside it: for example, image information in which inside pixels are set to 255 and outside pixels to 0, or a value obtained by extracting from the silhouette shape a shape descriptor (shape feature) such as those standardized in MPEG-7. The object detection result may also include features of the object's appearance, for example features such as the color, pattern, and shape of the object.

Furthermore, the object detection result may include, for each target, information describing a likelihood that represents the certainty (confidence) of the object detection (target likelihood information). Target likelihood information is the information needed to calculate the likelihood of the target, and relates to the confidence of the object detection: for example, the score value at the time of detection, and the distance from the camera and the size of the detected object. Alternatively, the detection unit 100 may calculate the target's likelihood itself and use the calculated likelihood as the target likelihood information.

The detection unit 100 then outputs the object detection result to the integrated tracking unit 200.

(Integrated tracking unit 200)
The integrated tracking unit 200 receives the detection results output from each of the detection units 100 and tracks objects based on these detection results. Specifically, the integrated tracking unit 200 tracks objects (performs object tracking) using the object detection results for one or more objects that each detection unit 100 detected from the video captured by the camera 20 associated with that detection unit 100. The integrated tracking unit 200 then generates object tracking results expressed in the common coordinate system. In this way, the integrated tracking unit 200 integrates the object detection results that the detection units 100 obtained from the video of their associated cameras 20 and performs object tracking. For this reason, the object tracking performed by the integrated tracking unit 200 is also called integrated object tracking.

Hereinafter, the per-object information generated as an object tracking result is called a tracker. That is, a tracker is assumed to include, as information on the tracked object (the object tracking result), information indicating the position of the tracked object, a motion model of the object, and so on, although the present invention is not limited to this. Because the position of a tracked object is a position the object occupied before the present time, it is also called the past position of the object.

In other words, object tracking can be regarded as a process of associating objects across frames by associating each target found by object detection with a tracker generated before that target was detected. This object tracking is described with reference to FIG. 3. FIG. 3 is a diagram for explaining the process by which the integrated tracking unit 200 associates targets with trackers. As shown in FIG. 3, let the number of targets be M and the number of trackers be K (M and K are integers of 0 or more). The integrated tracking unit 200 performs association between these M targets and K trackers. To do so, the integrated tracking unit 200 first predicts the current position of each object from its past position, as indicated by the information in the tracker, and then performs the association using an index that represents the relatedness between a target and a tracker.

That is, the integrated tracking unit 200 predicts the position of an object in the current frame based on the position of the object detected in the previous frame and the motion model of the object that is calculated and held for each tracker. Various existing methods can be used for this prediction, such as a method using a Kalman filter or a method using a particle filter.
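The prediction step can be illustrated with a much simpler stand-in than the Kalman or particle filters named above: a constant-velocity motion model. The sketch below is an assumption-laden example, not the patent's method. The class `Tracker`, its fields, and `predict_position` are hypothetical names, and the motion model is reduced to a single velocity vector in the common coordinate system.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float]


@dataclass
class Tracker:
    """Per-object tracking state (hypothetical, simplified)."""

    tracker_id: int
    position: Point  # latest known position, common coordinate system
    velocity: Point  # motion model: displacement per second


def predict_position(tracker: Tracker, dt: float) -> Point:
    """Predict where the object is dt seconds after its last position.

    Constant-velocity extrapolation; a Kalman filter would also
    propagate the uncertainty of this estimate.
    """
    x, y = tracker.position
    vx, vy = tracker.velocity
    return (x + vx * dt, y + vy * dt)
```

A filter-based implementation would replace `predict_position` with its own predict step while keeping the same role in the pipeline: producing an expected current position to compare against the targets.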

The integrated tracking unit 200 then associates the object tracking results (trackers) from the previous frame with the objects (targets) included in the detection results, based on, for example, the information listed in (1) to (3) below.
(1) The closeness between the position of the object in the current frame, as predicted using the tracker, and the position of the target
(2) The similarity of appearance features between the target and the object whose tracking result is indicated by the tracker
(3) The likelihoods of the target and of the tracker
This association process can be reduced to a cost minimization problem on a bipartite graph such as the one shown in FIG. 3. The integrated tracking unit 200 can therefore solve the problem with an algorithm such as the Hungarian method.
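To make the cost-minimization view concrete, the sketch below pairs predicted tracker positions with target positions so that the total cost is as small as possible. It is only an illustration under several assumptions: the cost uses distance alone (criterion (1)), ignoring appearance similarity and likelihood; an unmatched tracker is charged a fixed penalty `max_dist`; and the optimum is found by exhaustive search rather than the Hungarian method, which is practical only for a handful of objects. The function name `associate` is hypothetical.

```python
import math
from typing import Dict, FrozenSet, List, Sequence, Tuple

Point = Tuple[float, float]


def associate(predicted: Sequence[Point], targets: Sequence[Point],
              max_dist: float = 1.0) -> Dict[int, int]:
    """Return {tracker_index: target_index} with minimum total cost.

    Pairs farther apart than max_dist are never matched. Exhaustive
    search stands in for the Hungarian method in this small sketch.
    """
    best_cost = math.inf
    best: Dict[int, int] = {}

    def search(k: int, used: FrozenSet[int], cost: float,
               pairs: List[Tuple[int, int]]) -> None:
        nonlocal best_cost, best
        if cost >= best_cost:
            return  # cannot improve on the best assignment found so far
        if k == len(predicted):
            best_cost, best = cost, dict(pairs)
            return
        # Option A: leave tracker k unmatched and pay the penalty.
        search(k + 1, used, cost + max_dist, pairs)
        # Option B: match tracker k to any free target within range.
        for m, target in enumerate(targets):
            d = math.dist(predicted[k], target)
            if m not in used and d <= max_dist:
                search(k + 1, used | {m}, cost + d, pairs + [(k, m)])

    search(0, frozenset(), 0.0, [])
    return best
```

Tracker indices missing from the returned dictionary correspond to trackers with no target, and target indices that never appear as values correspond to targets with no tracker. For realistic numbers of objects, a polynomial-time solver such as the Hungarian method would replace the exhaustive search.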

In FIG. 3, an arrow indicates that a target and a tracker have been associated. For example, the topmost target is associated with the topmost tracker.

When there is a target that is not associated with any tracker, the integrated tracking unit 200 determines whether that target can be regarded as a newly appeared object. If the integrated tracking unit 200 determines that the target is likely to have newly appeared, it adds a new tracker for that target. In FIG. 3, suppose the target labeled m (called target m) is not associated with any tracker. In that case, the integrated tracking unit 200 determines whether target m can be regarded as a newly appeared object and, if so, creates a new tracker for target m.

On the other hand, when there is a tracker that is not associated with any target, the integrated tracking unit 200 determines whether the tracker is information regarding an object that has disappeared from the shooting space. When this is likely, the integrated tracking unit 200 deletes the tracker. In FIG. 3, the tracker indicated by reference sign k (referred to as tracker k) is a tracker that is not associated with any target. In this case, the integrated tracking unit 200 determines whether the tracker k is information regarding a disappeared object and, if so, deletes the tracker k.

統合追跡部２００は、これらの処理をフレーム単位で繰り返すことにより、オブジェクト追跡を行っていく。なお、統合追跡部２００は、トラッカーに全カメラ２０で共通の一意のＩＤ（ｉｄｅｎｔｉｆｉｅｒ）を与え、このＩＤによって追跡結果（トラッカー）を管理する。また、統合追跡部２００は、追跡結果の確からしさを評価した値（以後、トラッカーの尤度（重み）と呼ぶ）をトラッカーのパラメータとして、該トラッカーに含める。なお、トラッカーに含まれる、追跡されたオブジェクトの位置を示す情報によって示される位置であって、最も新しいオブジェクトの位置を、トラッカーの位置と呼ぶ。また、このときのオブジェクトの大きさをトラッカーの大きさとも呼ぶ。 The integrated tracking unit 200 performs object tracking by repeating these processes on a frame-by-frame basis. The integrated tracking unit 200 gives the tracker a unique ID (identifier) common to all the cameras 20, and manages the tracking result (tracker) by this ID. Further, the integrated tracking unit 200 includes a value (hereinafter referred to as a tracker's likelihood (weight)) that evaluates the certainty of the tracking result in the tracker as a parameter of the tracker. The position of the newest object, which is the position indicated by the information indicating the position of the tracked object included in the tracker, is called the position of the tracker. The size of the object at this time is also called the size of the tracker.

Further, the integrated tracking unit 200 updates the information indicating the position of each tracker, the information on the likelihood of the tracker, and the like based on the result of the association. The information on the position of the tracker is expressed in a common coordinate system defined in the shooting space captured by the plurality of cameras 20, that is, as coordinate values in the common coordinate system. For example, when the plurality of cameras 20 are installed in a store, the common coordinate system is a coordinate system that indicates positions on the floor in the real world. In contrast, the coordinate system unique to each camera 20 is called the individual coordinate system of the camera 20. This individual coordinate system is a coordinate system on the image captured by the camera 20. Hereinafter, position information expressed in the common coordinate system is described as coordinate values of the common coordinate system, and position information expressed in the individual coordinate system of the camera 20 is described as coordinate values of the individual coordinate system of the camera 20.

Then, the integrated tracking unit 200 generates the tracker whose information has been updated based on the result of the association as a new object tracking result. From the generated object tracking result (tracker), the integrated tracking unit 200 outputs the information indicating the position and/or size of the tracker, the information on the likelihood of the tracker, and the like as information indicating the tracking result of the object tracking (tracking information).

この追跡情報は、各トラッカーの位置を示す情報として、各トラッカーの共通座標系の座標値を含んでいる。この追跡情報は、検出部１００にフィードバックされる。つまり、検出部１００は、この追跡情報を受信し、該追跡情報を、以降のフレームに対するオブジェクト検出に用いる。 The tracking information includes coordinate values of the common coordinate system of each tracker as information indicating the position of each tracker. This tracking information is fed back to the detection unit 100. That is, the detection unit 100 receives the tracking information and uses the tracking information for object detection for the subsequent frames.

なお、統合追跡部２００は、上記追跡情報と、トラッカーのその他の情報を含むオブジェクト追跡結果を検出部１００に出力する構成であってもよい。 The integrated tracking unit 200 may be configured to output the object tracking result including the tracking information and other information of the tracker to the detection unit 100.

In this way, the object tracking device 10 according to the present embodiment integrates the object detection results for the images captured by each of the plurality of cameras 20 and performs object tracking. Then, the object tracking device 10 feeds back the obtained tracking information to the object detection in the next frame.

In this way, the object tracking device 10 detects objects from the video using the tracking result for the previous frame. For example, when there is an object that cannot be seen from one camera 20 but can be seen from another camera 20, the object tracking device 10 also uses the tracking result for that object in the object detection on the video of the camera 20 from which it cannot be seen. As a result, when the same object appears in the range visible from that camera 20, the detection unit 100 can suitably detect it. Therefore, the object tracking device 10 can track this object accurately.

したがって、物体追跡装置１０は、前のフレームに対する追跡結果を用いない場合に比べ、オブジェクトの検出精度を高めることができる。また、物体追跡装置１０は、検出精度が高い検出結果の全てを用いてオブジェクト追跡を行うため、全体として得られる追跡結果の精度も向上する。 Therefore, the object tracking device 10 can improve the detection accuracy of the object as compared with the case where the tracking result for the previous frame is not used. Further, since the object tracking device 10 performs object tracking using all the detection results with high detection accuracy, the accuracy of the tracking result obtained as a whole is also improved.

(Details of the detection unit 100)
Next, the functions of the respective units of the object tracking device 10 will be described in more detail with reference to FIGS. 4 to 8. FIG. 4 is a functional block diagram showing an example of a more detailed functional configuration of the detection unit 100 of the object tracking device 10 according to this embodiment. As shown in FIG. 4, the detection unit 100 includes an object detection unit 110, a common coordinate conversion unit (second conversion unit) 120, and an individual coordinate conversion unit (first conversion unit) 130. In FIG. 4, the camera image (n) (n is 1 to N) received by the detection unit 100 is simply referred to as a camera image.

The individual coordinate conversion unit 130 receives the tracking information output from the integrated tracking unit 200. Then, the individual coordinate conversion unit 130 converts the coordinate values of the common coordinate system of each tracker included in this tracking information into coordinate values on the frame captured by each camera 20 (that is, coordinate values expressed in the individual coordinate system unique to each camera 20). When the coordinate values of the common coordinate system are denoted (X, Y, Z) and the coordinate values of the individual coordinate system of the camera 20 are denoted (x, y), the individual coordinate conversion unit 130 obtains, from the coordinate values (X, Y, Z) of the tracker in the common coordinate system, the coordinate values (x, y) in the individual coordinate system of the camera 20 associated with the detection unit 100 that includes the individual coordinate conversion unit 130. For this purpose, it is preferable that the individual coordinate conversion unit 130 obtains in advance, by calibration, at least the camera parameters representing the camera position, orientation, and the like of the camera 20 associated with the detection unit 100. The individual coordinate conversion unit 130 then uses the obtained camera parameters to convert the coordinate values of the common coordinate system into the coordinate values of the individual coordinate system of the camera 20.

なお、このカメラパラメータは、検出部１００内の図示しない記憶部等に格納されるものであってもよいし、個別座標変換部１３０内の記憶領域に格納されるものであってもよい。後者の場合、個別座標変換部１３０は、共通座標変換部１２０にカメラパラメータを供給する構成であってもよい。 The camera parameters may be stored in a storage unit (not shown) or the like in the detection unit 100, or may be stored in a storage area in the individual coordinate conversion unit 130. In the latter case, the individual coordinate conversion unit 130 may be configured to supply the camera parameters to the common coordinate conversion unit 120.

例えば、オブジェクトが人物であり、トラッカーの位置を示す情報が、該人物の足元位置を示す座標値と頭頂位置を示す座標値であるとする。そして、この足元位置の座標値と、頭頂位置の座標値とを、それぞれ（Ｘ０，Ｙ０，０）、（Ｘ０，Ｙ０，Ｈ）（Ｈは人物の高さを表す）とする。また、個別座標変換部１３０を備える検出部１００に紐付けられたカメラ２０が、カメラ２０−１であるとする。 For example, it is assumed that the object is a person and the information indicating the position of the tracker is the coordinate value indicating the foot position and the coordinate value indicating the crown position of the person. Then, the coordinate value of the foot position and the coordinate value of the crown position are respectively (X0, Y0, 0) and (X0, Y0, H) (H represents the height of the person). The camera 20 associated with the detection unit 100 including the individual coordinate conversion unit 130 is assumed to be the camera 20-1.

At this time, the individual coordinate conversion unit 130 uses the camera parameters of the camera 20-1 to obtain the foot position (x0, y0) and the crown position (x1, y1) on the frame captured by the camera 20-1. If the information indicating the position of the tracker includes information indicating a circumscribed rectangle, the value indicating the width of that rectangle was obtained earlier by conversion into a coordinate value of the common coordinate system using the camera parameters. Therefore, the individual coordinate conversion unit 130 may use, as the width of the circumscribed rectangle, the value obtained by converting it back using the same camera parameters.
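The conversion from the common coordinate system to the individual coordinate system can be sketched with a pinhole camera model. The description does not specify the camera model, so the 3x4 projection matrix `P_CAM1`, the function name `project`, and the numeric values (a camera 4 m above the floor looking straight down, focal length 800 pixels, image center (320, 240)) are assumptions made for illustration.

```python
def project(P, X, Y, Z):
    """Map a common-coordinate point to (x, y) on the image of one camera.

    P is a 3x4 projection matrix given as three rows of four floats.
    Returns None when the point is not in front of the camera, because
    no image position exists in that case.
    """
    u, v, w = (r[0] * X + r[1] * Y + r[2] * Z + r[3] for r in P)
    if w <= 0:
        return None
    return (u / w, v / w)


# Hypothetical camera 20-1: mounted at (0, 0, 4) and looking straight down.
P_CAM1 = [
    [800.0, 0.0, -320.0, 1280.0],
    [0.0, -800.0, -240.0, 960.0],
    [0.0, 0.0, -1.0, 4.0],
]

foot = project(P_CAM1, 0.5, 0.25, 0.0)  # (X0, Y0, 0) -> (x0, y0)
head = project(P_CAM1, 0.5, 0.25, 1.7)  # (X0, Y0, H) -> (x1, y1), H = 1.7
```

With a calibrated camera, the matrix would come from the camera parameters obtained by calibration rather than being written by hand.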

In addition, not all of the objects whose tracking results are indicated by trackers are visible from a single camera 20; some may lie outside the field of view of that camera 20. Therefore, when the information indicating the position of a tracker included in the tracking information relates to an object that cannot be seen from the camera 20 associated with the detection unit 100, the individual coordinate conversion unit 130 cannot obtain coordinate values of the object in the individual coordinate system of that camera 20. The individual coordinate conversion unit 130 may therefore skip the conversion for trackers of objects that lie outside the angle of view of the camera 20 and cannot be seen. To do so, the individual coordinate conversion unit 130 may register in advance, for each camera 20, the range of coordinate values of the common coordinate system visible to that camera 20 in a storage unit (not shown) or the like, and determine whether each object falls within this range. Alternatively, the individual coordinate conversion unit 130 may actually perform the conversion into coordinate values of the individual coordinate system and, when the result is an abnormal value indicating a position outside the area monitored by the associated camera 20 or when no value can be obtained, determine that the object cannot be seen from the camera 20.
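The two visibility checks described above can be sketched as follows. This is an illustration under assumptions: the pinhole projection matrix `P_CAM1`, the 640x480 image size, the registered floor range, and all function names are hypothetical and not taken from the description.

```python
def project(P, X, Y, Z):
    """Pinhole projection with a 3x4 matrix; None if the point is not in front."""
    u, v, w = (r[0] * X + r[1] * Y + r[2] * Z + r[3] for r in P)
    if w <= 0:
        return None
    return (u / w, v / w)


def visible_by_range(X, Y, floor_range):
    """Check against a pre-registered floor range (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = floor_range
    return x_min <= X <= x_max and y_min <= Y <= y_max


def visible_by_projection(P, X, Y, Z, width, height):
    """Convert first, then reject missing or out-of-frame image positions."""
    p = project(P, X, Y, Z)
    if p is None:
        return False  # no value could be obtained
    return 0 <= p[0] < width and 0 <= p[1] < height


# Hypothetical camera 4 m above the floor, looking straight down.
P_CAM1 = [
    [800.0, 0.0, -320.0, 1280.0],
    [0.0, -800.0, -240.0, 960.0],
    [0.0, 0.0, -1.0, 4.0],
]
FLOOR_RANGE_CAM1 = (-1.6, -1.2, 1.6, 1.2)  # floor area covered by the image

inside = visible_by_projection(P_CAM1, 0.5, 0.25, 0.0, 640, 480)
outside = visible_by_projection(P_CAM1, 5.0, 0.0, 0.0, 640, 480)
```

The range check avoids the conversion entirely, while the projection check needs no extra registration step; which is preferable depends on how the visible area of each camera is known.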

Then, the individual coordinate conversion unit 130 outputs to the object detection unit 110 the result (tracking information) of converting the coordinate values of the common coordinate system of the trackers, included in the tracking information output from the integrated tracking unit 200, into coordinate values of the individual coordinate system of the camera 20 associated with the detection unit 100. That is, the individual coordinate conversion unit 130 converts the tracking information expressed in the common coordinate system into tracking information expressed in the individual coordinate system, and outputs the converted tracking information to the object detection unit 110. Hereinafter, the simple expression "coordinate values of the individual coordinate system" refers to coordinate values of the individual coordinate system of the camera 20 associated with the detection unit 100.

オブジェクト検出部１１０は、オブジェクト検出部１１０を備える検出部１００に紐付けられたカメラ２０からのカメラ映像を受信する。また、オブジェクト検出部１１０は、個別座標変換部１３０から個別座標系の座標値に変換された追跡情報を受信する。そして、オブジェクト検出部１１０は、上記追跡情報に基づいて、受信したカメラ映像からオブジェクトを検出する。 The object detection unit 110 receives a camera image from the camera 20 associated with the detection unit 100 including the object detection unit 110. Further, the object detection unit 110 receives the tracking information converted into the coordinate value of the individual coordinate system from the individual coordinate conversion unit 130. Then, the object detection unit 110 detects an object from the received camera image based on the tracking information.

そして、オブジェクト検出部１１０は、検出結果を生成する。オブジェクト検出部１１０は、生成した検出結果を共通座標変換部１２０に出力する。なお、この検出結果は、個別座標系で表現された検出結果である。 Then, the object detection unit 110 generates a detection result. The object detection unit 110 outputs the generated detection result to the common coordinate conversion unit 120. The detection result is a detection result expressed in the individual coordinate system.

図５を参照して、オブジェクト検出部１１０の構成についてより詳細に説明する。図５は、本実施の形態に係るオブジェクト検出部１１０の機能構成の一例を示す機能ブロック図である。図５に示す通り、オブジェクト検出部１１０は、認識型オブジェクト検出部（第１の物体検出部）１１１と、探索範囲設定部１１２とを備えている。 The configuration of the object detection unit 110 will be described in more detail with reference to FIG. FIG. 5 is a functional block diagram showing an example of the functional configuration of the object detection unit 110 according to this embodiment. As shown in FIG. 5, the object detection unit 110 includes a recognition-type object detection unit (first object detection unit) 111 and a search range setting unit 112.

探索範囲設定部１１２は、個別座標変換部１３０から、個別座標系の座標値に変換された追跡情報を受信する。そして、探索範囲設定部１１２は、この個別座標系の座標値に変換された追跡情報を用いて、現フレームに対するオブジェクトの検出を行う対象となるエリア（探索範囲）を求める。つまり、探索範囲設定部１１２は、個別座標系の座標値に変換された、前フレームの追跡結果からなる追跡情報に基づいて、現フレームのオブジェクトの位置を予測する。そして、探索範囲設定部１１２は、予測した位置から、オブジェクトを検索する検出範囲を求める。なお、この探索範囲をオブジェクトの検出範囲とも呼ぶ。 The search range setting unit 112 receives the tracking information converted into the coordinate values of the individual coordinate system from the individual coordinate conversion unit 130. Then, the search range setting unit 112 uses the tracking information converted into the coordinate values of the individual coordinate system to obtain the area (search range) to be the target of detecting the object in the current frame. That is, the search range setting unit 112 predicts the position of the object in the current frame based on the tracking information that is converted into the coordinate values of the individual coordinate system and that is the tracking result of the previous frame. Then, the search range setting unit 112 obtains the detection range for searching the object from the predicted position. The search range is also called an object detection range.

ここで、探索範囲設定部１１２が受信する追跡情報は、現在処理を行おうとするフレームの時間から見ると、過去のフレームにおけるオブジェクト追跡の結果（過去の追跡結果とも呼ぶ）になる。そこで、探索範囲設定部１１２は、各オブジェクトの動きを予測し、現フレームにおける各オブジェクトの位置を予測する。以降、この予測したオブジェクトの位置を予測位置と呼ぶ。そして、探索範囲設定部１１２は、この予測位置の近傍を該オブジェクトの探索範囲として設定する。 Here, the tracking information received by the search range setting unit 112 is a result of object tracking in a past frame (also referred to as a past tracking result) when viewed from the time of a frame to be currently processed. Therefore, the search range setting unit 112 predicts the movement of each object and predicts the position of each object in the current frame. Hereinafter, the predicted position of the object will be referred to as a predicted position. Then, the search range setting unit 112 sets the vicinity of this predicted position as the search range of the object.

The search range setting unit 112 preferably predicts the movement of each object by using a motion model of each object calculated from the past tracking results. For example, when the position of an object has not changed in the tracking results of the past several frames (two frames may suffice), the search range setting unit 112 determines that the object is stationary and uses the object position obtained from the tracking result as the predicted position as is. When the object is moving in the tracking results of the past several frames, the search range setting unit 112 may assume that the object moves at a constant velocity and obtain the predicted position in consideration of the time difference from the past frame.
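The stationary and constant-velocity cases can be sketched as follows. The function names, the stillness threshold `still_eps`, and the fixed half-width and half-height of the search window are assumptions made for illustration. The description only states that the vicinity of the predicted position is used as the search range.

```python
import math


def predict_position(history, t_now, still_eps=1.0):
    """Predict the image position at time t_now from the last two track points.

    history is a list of (t, x, y) tuples in the individual coordinate
    system, oldest first. An object that moved no more than still_eps
    pixels is treated as stationary.
    """
    (t0, x0, y0), (t1, x1, y1) = history[-2], history[-1]
    dx, dy = x1 - x0, y1 - y0
    if t1 == t0 or math.hypot(dx, dy) <= still_eps:
        return (x1, y1)  # stationary: reuse the last tracked position
    vx, vy = dx / (t1 - t0), dy / (t1 - t0)  # constant-velocity assumption
    dt = t_now - t1
    return (x1 + vx * dt, y1 + vy * dt)


def search_range(center, half_w, half_h, frame_w, frame_h):
    """Return (left, top, right, bottom) around center, clipped to the frame."""
    cx, cy = center
    return (
        max(0.0, cx - half_w),
        max(0.0, cy - half_h),
        min(float(frame_w), cx + half_w),
        min(float(frame_h), cy + half_h),
    )


moving = predict_position([(0.0, 100.0, 200.0), (0.5, 104.0, 203.0)], t_now=1.0)
still = predict_position([(0.0, 50.0, 60.0), (0.5, 50.0, 60.0)], t_now=1.0)
window = search_range(moving, 40.0, 80.0, 640, 480)
```

Because the elapsed time `dt` is taken from the timestamps rather than assumed to be one frame, the same sketch also covers cameras whose frame intervals are uneven.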

探索範囲設定部１１２がオブジェクト毎の動きを予測する際に使用する過去数フレームの追跡結果は、追跡情報に含まれるものであってもよい。また、上記過去数フレームの追跡結果から得られるオブジェクトの運動モデルも、上記追跡情報に含まれるものであってもよい。 The tracking result of the past several frames used when the search range setting unit 112 predicts the movement of each object may be included in the tracking information. Further, the motion model of the object obtained from the tracking result of the past several frames may be included in the tracking information.

また、この予測位置は、追跡情報に含まれていてもよい。つまり、統合追跡部２００が、オブジェクト追跡時にカルマンフィルタまたはパーティクルフィルタで求まった値を予測位置として、追跡情報に含めてもよい。 In addition, this predicted position may be included in the tracking information. That is, the integrated tracking unit 200 may include the value obtained by the Kalman filter or the particle filter at the time of object tracking as the predicted position in the tracking information.

Further, for example, a new object may enter the angle of view of the camera 20 at the outer edge of the range that the camera 20 can capture. A new object may also appear in the angle of view of the camera 20 when the place captured by the camera 20 includes a location such as a doorway. Therefore, it is preferable that the search range setting unit 112 also includes these areas (the outer edge portion and/or the doorway portion of the frame) on the frames of the video captured by the camera 20 in the object search range.

探索範囲設定部１１２は、設定したオブジェクトの探索範囲を示す情報（探索範囲情報）を認識型オブジェクト検出部１１１に出力する。 The search range setting unit 112 outputs information indicating the set search range of the object (search range information) to the recognition-type object detection unit 111.

認識型オブジェクト検出部１１１は、探索範囲設定部１１２から、探索範囲情報を受信する。認識型オブジェクト検出部１１１は、受信した探索範囲情報に基づいて、認識型オブジェクト検出部１１１に入力されるカメラ映像からオブジェクトを検出する。認識型オブジェクト検出部１１１は、入力されたカメラ映像のフレームを、一旦、認識型オブジェクト検出部１１１内のバッファ等の記憶手段に蓄えておき、探索範囲情報を受信すると、この情報を適用してオブジェクト検出処理を実行する。具体的には、認識型オブジェクト検出部１１１は、探索範囲情報によって示される領域（探索範囲）に対して、オブジェクトの画像特徴を学習させた識別器を用いて、オブジェクト検出を行う。 The recognition-type object detection unit 111 receives the search range information from the search range setting unit 112. The recognition-type object detection unit 111 detects an object from the camera image input to the recognition-type object detection unit 111 based on the received search range information. The recognition-type object detection unit 111 temporarily stores the input camera video frame in a storage unit such as a buffer in the recognition-type object detection unit 111, and when receiving the search range information, applies this information. Perform object detection processing. Specifically, the recognition-type object detection unit 111 performs object detection on an area (search range) indicated by the search range information using a classifier that has learned the image characteristics of the object.

For example, when the object is a person, the recognition-type object detection unit 111 detects the person by applying a classifier trained on a characteristic part of a person (for example, the head or the upper body). The recognition-type object detection unit 111 may also use a classifier trained on the whole person. Various classifiers can be used for this purpose. For example, the recognition-type object detection unit 111 can use a classifier obtained by training a CNN (Convolutional Neural Network) on images of the head, the upper body, the whole body of a person, and the like. Alternatively, the recognition-type object detection unit 111 may extract features such as HOG (Histograms of Oriented Gradients) and use a classifier such as an SVM (Support Vector Machine) or GLVQ (Generalized Learning Vector Quantization). Besides the above, the recognition-type object detection unit 111 can use various existing recognition-based detection methods.

このように、本実施の形態に係る認識型オブジェクト検出部１１１は、探索範囲設定部１１２によって設定された探索範囲内において、物体を検出する。つまり、認識型オブジェクト検出部１１１は、探索範囲設定部１１２が前フレームにおける追跡結果を用いて絞り込んだ探索範囲内で、オブジェクト検出を行う。そのため、認識型オブジェクト検出部１１１は、フレームにおける、オブジェクトが存在する可能性が低い範囲でのオブジェクト検出を行わないため、余分な誤検知を低減できる。また、認識型オブジェクト検出部１１１は、オブジェクト検出の処理の高速化を図ることができる。 In this way, the recognition-type object detection unit 111 according to the present embodiment detects an object within the search range set by the search range setting unit 112. That is, the recognition type object detection unit 111 performs object detection within the search range narrowed down by the search range setting unit 112 using the tracking result in the previous frame. Therefore, the recognition-type object detection unit 111 does not detect an object in a range in which an object is unlikely to exist in a frame, and thus it is possible to reduce extra erroneous detection. Further, the recognition-type object detection unit 111 can speed up the process of object detection.

また、認識型オブジェクト検出部１１１がオブジェクト検出を行う探索範囲は、探索範囲情報で示される領域だけでなく、背景差分等によって求まるシルエット情報で定まる領域も含まれてもよい。また、認識型オブジェクト検出部１１１は、シルエット情報で定まる領域とオブジェクト探索範囲情報で指定される領域の共通部分を、オブジェクト検出を実行する領域（探索範囲）としてもよい。 In addition, the search range in which the recognition-type object detection unit 111 performs object detection may include not only the area indicated by the search range information but also the area determined by the silhouette information obtained by the background difference or the like. Further, the recognition-type object detection unit 111 may set a common part of the area defined by the silhouette information and the area specified by the object search range information as an area (search range) in which the object detection is executed.
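The two ways of combining the search range with silhouette regions can be sketched as follows, under the simplifying assumption that every region is an axis-aligned rectangle (left, top, right, bottom). Real silhouette information from background subtraction would usually be a pixel mask, so the rectangles and function names here are hypothetical.

```python
def intersect(a, b):
    """Return the overlap of two rectangles, or None if they do not overlap."""
    left, top = max(a[0], b[0]), max(a[1], b[1])
    right, bottom = min(a[2], b[2]), min(a[3], b[3])
    if left >= right or top >= bottom:
        return None
    return (left, top, right, bottom)


def union_regions(search_rects, silhouette_rects):
    """Detect in the search ranges and, additionally, in the silhouette regions."""
    return list(search_rects) + list(silhouette_rects)


def common_regions(search_rects, silhouette_rects):
    """Detect only where a search range and a silhouette region overlap."""
    overlaps = []
    for s in search_rects:
        for m in silhouette_rects:
            r = intersect(s, m)
            if r is not None:
                overlaps.append(r)
    return overlaps


search_rects = [(68, 126, 148, 286)]
silhouette_rects = [(100, 150, 200, 300), (400, 100, 450, 200)]
narrowed = common_regions(search_rects, silhouette_rects)
widened = union_regions(search_rects, silhouette_rects)
```

The union widens coverage so that objects missed by the tracker can still be found, while the common part narrows the area further and reduces processing.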

そして、認識型オブジェクト検出部１１１は、オブジェクト検出を行った結果（検出結果）を生成し、該検出結果を共通座標変換部１２０に出力する。このとき、検出結果に含まれるオブジェクトの座標値は、個別座標系の座標値である。 Then, the recognition-type object detection unit 111 generates a result (detection result) of object detection, and outputs the detection result to the common coordinate conversion unit 120. At this time, the coordinate value of the object included in the detection result is the coordinate value of the individual coordinate system.

Returning to FIG. 4, the function of the common coordinate conversion unit 120 of the detection unit 100 will be described. The common coordinate conversion unit 120 receives the object detection result expressed in the individual coordinate system from the object detection unit 110. Then, the common coordinate conversion unit 120 converts the coordinate values of the individual coordinate system included in the received detection result into coordinate values of the common coordinate system. In this way, the common coordinate conversion unit 120 can generate information for integrating the detected positions of objects across the individual cameras 20.

Specifically, the common coordinate conversion unit 120 uses the camera parameters of the camera 20 associated with the detection unit 100 that includes the common coordinate conversion unit 120 to convert the coordinate values of the individual coordinate system included in the object detection result into coordinate values of the common coordinate system. For example, when the coordinates of the lower end of an object on the frame captured by the camera 20 are (x0, y0), the common coordinate conversion unit 120 converts them into the coordinates (X0, Y0, 0) of the common coordinate system. Because the ground is taken as the plane Z = 0, the component in the Z-axis direction is 0. Further, let the coordinates of the upper end of the object be (x1, y1), where the upper end is assumed to be directly above the lower end in the vertical direction, and let the height of the object be H. The common coordinate conversion unit 120 then converts the coordinates (x1, y1) of the upper end into the coordinates (X0, Y0, H) of the common coordinate system. The common coordinate conversion unit 120 obtains the height of the object by searching for the value of H that satisfies this relationship. In this way, the common coordinate conversion unit 120 obtains coordinate values (X, Y, Z) in the common coordinate system for each detected object.
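The conversion of the lower end onto the plane Z = 0 and the search for H can be sketched with a pinhole model. The projection matrix `P_CAM1`, the function names, and the grid search over H in 1 cm steps up to 2.5 m are assumptions made for illustration. The description states only that H is found by search.

```python
def project(P, X, Y, Z):
    """Pinhole projection with a 3x4 matrix; None if the point is not in front."""
    u, v, w = (r[0] * X + r[1] * Y + r[2] * Z + r[3] for r in P)
    if w <= 0:
        return None
    return (u / w, v / w)


def image_to_ground(P, x, y):
    """Find (X, Y) on the plane Z = 0 whose projection is the image point (x, y)."""
    # Setting Z = 0 in the projection equations gives a 2x2 linear system.
    a11, a12 = P[0][0] - x * P[2][0], P[0][1] - x * P[2][1]
    a21, a22 = P[1][0] - y * P[2][0], P[1][1] - y * P[2][1]
    b1, b2 = x * P[2][3] - P[0][3], y * P[2][3] - P[1][3]
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("image point does not map to a unique ground point")
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)


def estimate_height(P, X0, Y0, x1, y1, h_max=2.5, step=0.01):
    """Grid-search the H for which (X0, Y0, H) projects closest to (x1, y1)."""
    best_h, best_err = 0.0, float("inf")
    for k in range(int(round(h_max / step)) + 1):
        h = k * step
        p = project(P, X0, Y0, h)
        if p is None:
            continue
        err = (p[0] - x1) ** 2 + (p[1] - y1) ** 2
        if err < best_err:
            best_h, best_err = h, err
    return best_h


# Hypothetical camera 4 m above the floor, looking straight down.
P_CAM1 = [
    [800.0, 0.0, -320.0, 1280.0],
    [0.0, -800.0, -240.0, 960.0],
    [0.0, 0.0, -1.0, 4.0],
]
x1, y1 = project(P_CAM1, 0.5, 0.25, 1.7)      # synthetic upper-end detection
X0, Y0 = image_to_ground(P_CAM1, 420.0, 190.0)  # lower end (x0, y0)
H = estimate_height(P_CAM1, X0, Y0, x1, y1)
```

For an object directly below this overhead camera, every H projects to the same image point, so the height cannot be recovered there. In such a case a known value of H would be needed.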

なお、Ｈが既知の場合、共通座標変換部１２０は、その既知の値をそのまま用いてもよい。 When H is known, the common coordinate conversion unit 120 may use the known value as it is.

共通座標変換部１２０は、座標変換後の座標値（共通座標系の座標値）を含む検出結果を、統合追跡部２００に出力する。つまり、共通座標変換部１２０は、共通座標系で表現された検出結果を統合追跡部２００に出力する。なお、共通座標変換部１２０は、検出結果に含まれる１または複数のターゲット（オブジェクト）の夫々に対し、シルエット情報および該オブジェクトの外見特徴（色、模様、形状など）の特徴量等の情報を該オブジェクトに関する情報として含めてもよい。そして、共通座標変換部１２０は、これらの情報を含んだ検出結果を統合追跡部２００に出力してもよい。 The common coordinate conversion unit 120 outputs the detection result including the coordinate values after coordinate conversion (coordinate values of the common coordinate system) to the integrated tracking unit 200. That is, the common coordinate conversion unit 120 outputs the detection result expressed in the common coordinate system to the integrated tracking unit 200. The common coordinate conversion unit 120 provides silhouette information and information such as the feature amount of the appearance feature (color, pattern, shape, etc.) of the object for each of the one or more targets (objects) included in the detection result. It may be included as information about the object. Then, the common coordinate conversion unit 120 may output the detection result including these pieces of information to the integrated tracking unit 200.

(Details of the integrated tracking unit 200)
Next, the functional configuration of the integrated tracking unit 200 will be described in more detail with reference to FIG. 6. FIG. 6 is a functional block diagram showing an example of a more detailed functional configuration of the integrated tracking unit 200 of the object tracking device 10 according to this embodiment. As illustrated in FIG. 6, the integrated tracking unit 200 includes a prediction unit 210, a storage unit 220, an associating unit 230, and an updating unit 240. Because the integrated tracking unit 200 in the present embodiment sequentially tracks the video from each camera 20 on a camera-by-camera basis, it is also called a sequential tracking unit.

この統合追跡部２００が行う、逐次のオブジェクト追跡（オブジェクトの逐次追跡、逐次統合追跡とも呼ぶ。）について、図７を用いて説明する。図７は、本実施の形態に係る統合追跡部２００が行うオブジェクトの逐次追跡処理を説明するための図である。図７には、カメラ数が３つの場合に、カメラＡ、カメラＢ、カメラＣの夫々で画像を取得するタイミングの一例を示している。図７において、横軸は、時間軸を示しており、右側にいくほど、時間的に後であることを示している。図７に示す通り、カメラＡは時間ｔ１、ｔ５およびｔ８で画像を取得している。同様に、カメラＢは、時間ｔ２、ｔ４、ｔ６およびｔ９で画像を取得し、カメラＣは時間ｔ３およびｔ７で画像を取得している。 Sequential object tracking (also called object sequential tracking and sequential integrated tracking) performed by the integrated tracking unit 200 will be described with reference to FIG. 7. FIG. 7 is a diagram for explaining an object sequential tracking process performed by the integrated tracking unit 200 according to the present embodiment. FIG. 7 shows an example of timings at which images are acquired by each of the cameras A, B, and C when the number of cameras is three. In FIG. 7, the horizontal axis indicates the time axis, and the right side indicates that the time is later. As shown in FIG. 7, camera A acquires images at times t1, t5, and t8. Similarly, camera B acquires images at times t2, t4, t6, and t9, and camera C acquires images at times t3 and t7.

図７に示す通り、各カメラ２０で取得されるフレーム（画像）の時間（タイムスタンプ）は、すべてのカメラ２０で一致しているとは限らず、通常ばらばらであることが多い。また、フレーム間隔も、カメラ２０ごとに異なる場合もあり、また、同じカメラ２０でも不均一であることもある。 As shown in FIG. 7, the times (time stamps) of the frames (images) acquired by the cameras 20 do not necessarily coincide across all the cameras 20 and usually differ. The frame interval may also differ from camera 20 to camera 20, and may be non-uniform even for a single camera 20.

そして、検出部１００がこのカメラ２０間で非同期に出力されたカメラ映像から、時間順に検出を行い、検出結果を統合追跡部２００に出力する。 Then, the detection unit 100 performs detection in time order from the camera images output asynchronously between the cameras 20, and outputs the detection result to the integrated tracking unit 200.

本実施の形態に係る統合追跡部２００は、時間的に早い時間の画像から順に、逐次追跡処理を実行する。即ち、図７の場合、統合追跡部２００は、まず、カメラＡの時間ｔ１の画像に対するオブジェクト検出結果を用いて、複数カメラ間の統合を行い、オブジェクト追跡を行う。それが終わると、統合追跡部２００は、続いて、カメラＢの時間ｔ２、カメラＣの時間ｔ３、カメラＢの時間ｔ４、・・・の順に、オブジェクト検出結果を用いて、複数カメラ間の統合を行い、オブジェクト追跡を行う。この際、オブジェクトの全てがどのカメラ２０からも見えているわけではないため、統合追跡部２００は、カメラ２０毎に見えている可能性が高いオブジェクトに対してオブジェクト追跡を行う。 The integrated tracking unit 200 according to the present embodiment executes the sequential tracking process on images in order of time, starting from the earliest. That is, in the case of FIG. 7, the integrated tracking unit 200 first uses the object detection result for the image of camera A at time t1 to perform integration across the plurality of cameras and perform object tracking. When that is finished, the integrated tracking unit 200 then uses the object detection results in the order of camera B at time t2, camera C at time t3, camera B at time t4, and so on, to perform integration across the plurality of cameras and perform object tracking. At this time, since not every object is visible from every camera 20, the integrated tracking unit 200 performs object tracking, for each camera 20, on the objects that are likely to be visible from that camera 20.
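
The time-ordered processing described above can be sketched as follows. This is only an illustrative sketch, not the implementation of the embodiment: the capture times are the t1 to t9 values of the FIG. 7 example written as integers, and the names `frames_by_camera` and `frames_in_time_order` are hypothetical.

```python
import heapq

# Capture times per camera, taken from the FIG. 7 example (t1..t9 -> 1..9).
frames_by_camera = {
    "A": [1, 5, 8],
    "B": [2, 4, 6, 9],
    "C": [3, 7],
}


def frames_in_time_order(frames_by_camera):
    """Yield (timestamp, camera) pairs across all cameras, earliest first."""
    streams = (
        [(t, cam) for t in sorted(times)]
        for cam, times in frames_by_camera.items()
    )
    # heapq.merge lazily merges streams that are each already sorted.
    yield from heapq.merge(*streams)


# Each entry is the next (time, camera) whose detection result would be
# handed to the sequential tracking step.
processing_order = list(frames_in_time_order(frames_by_camera))
```

With the FIG. 7 timings, the cameras are visited in the order A, B, C, B, A, B, C, A, B, which matches the order given in the text for t1 through t4.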

図６に戻り、統合追跡部２００の各部について説明する。 Returning to FIG. 6, each unit of the integrated tracking unit 200 will be described.

記憶部２２０には、統合追跡部２００が受信した検出結果に含まれるオブジェクト（ターゲット）と対応付ける、トラッカーの情報が格納されている。この記憶部２２０に格納されている、トラッカーの情報は、トラッカーのＩＤを用いて、更新部２４０によって管理されている。このトラッカーの情報とは、例えば、トラッカーに追跡結果が含まれるオブジェクトに関する情報、トラッカーの尤度等を含むパラメータ等であるが、本発明はこれに限定されるものではない。オブジェクトに関する情報には、オブジェクトの過去の位置を示す情報、該オブジェクトの運動モデル等が含まれるが本発明はこれに限定されるものではない、オブジェクトに関する情報には、上述したオブジェクト検出結果に含まれる情報が含まれてもよい。 The storage unit 220 stores tracker information to be associated with the objects (targets) included in the detection results received by the integrated tracking unit 200. The tracker information stored in the storage unit 220 is managed by the updating unit 240 using tracker IDs. The tracker information is, for example, information about the object whose tracking result is held by the tracker and parameters including the likelihood of the tracker, but the present invention is not limited to this. The information about the object includes, but is not limited to, information indicating past positions of the object and a motion model of the object. The information about the object may also include the information contained in the object detection result described above.

なお、図６では、記憶部２２０が統合追跡部２００内に内蔵されることを例に説明を行うが、本発明はこれに限定されるものではない。記憶部２２０は、統合追跡部２００とは、別に、物体追跡装置１０内に設けられるものであってもよい。また、記憶部２２０は、物体追跡装置１０とは別個の記憶装置等で実現されるものであってもよい。 In FIG. 6, the storage unit 220 is described as an example incorporated in the integrated tracking unit 200, but the present invention is not limited to this. The storage unit 220 may be provided in the object tracking device 10 separately from the integrated tracking unit 200. The storage unit 220 may be realized by a storage device or the like that is separate from the object tracking device 10.

記憶部２２０が統合追跡部２００内に内蔵されない場合、記憶部２２０は、物体追跡装置１０内で使用するデータ等を格納する構成であってもよい。例えば、記憶部２２０には、例えば、カメラ２０で撮影したカメラ映像、各カメラ２０のカメラパラメータ、各カメラ２０で見える共通座標系の座標値の範囲等が格納されていてもよい。 When the storage unit 220 is not built in the integrated tracking unit 200, the storage unit 220 may be configured to store data or the like used in the object tracking device 10. For example, the storage unit 220 may store, for example, camera images captured by the cameras 20, camera parameters of the cameras 20, ranges of coordinate values of the common coordinate system visible by the cameras 20, and the like.

予測部２１０は、記憶部２２０を参照し、現フレーム上のオブジェクトの位置を予測する。具体的には、予測部２１０は、前フレームにおけるオブジェクトの追跡結果（トラッカー）を用いて、該オブジェクトの運動モデルに基づいて、該オブジェクトの現在の位置を予測する。ここで、オブジェクトの位置を示す情報は共通座標系で表現されている。 The prediction unit 210 refers to the storage unit 220 and predicts the position of the object on the current frame. Specifically, the prediction unit 210 predicts the current position of the object based on the motion model of the object using the tracking result (tracker) of the object in the previous frame. Here, the information indicating the position of the object is expressed in the common coordinate system.

また、予測部２１０が位置の予測に使用するオブジェクトの運動モデルは、記憶部２２０に格納されているものであってもよいし、予測部２１０が追跡結果を用いて、オブジェクトの位置の予測を行う前に算出したものであってもよい。 The motion model of the object used by the prediction unit 210 to predict the position may be stored in the storage unit 220, or the prediction unit 210 may use the tracking result to predict the position of the object. It may be calculated before the execution.

予測部２１０によるオブジェクトの位置の予測には、例えば、カルマンフィルタまたはパーティクルフィルタ等の予測処理を適用することができる。また、予測部２１０は、単純に、過去数回分の追跡結果からオブジェクトの速度を算出し、等速直線運動を仮定して、前フレームにおける位置からの移動量を速度から予測して該前フレームにおける位置に加算することにより、現在の位置を予測してもよい。 For the prediction of the position of the object by the prediction unit 210, a prediction process such as a Kalman filter or a particle filter can be applied, for example. Alternatively, the prediction unit 210 may simply calculate the velocity of the object from the past several tracking results and, assuming uniform linear motion, predict the current position by estimating the displacement from the position in the previous frame based on that velocity and adding it to the position in the previous frame.
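
The simpler constant-velocity alternative mentioned above can be sketched as follows. This is a minimal illustration under assumptions: the function name `predict_constant_velocity`, the `(t, x, y)` history layout, and the choice to estimate velocity from the oldest and newest samples are all hypothetical and not taken from the embodiment.

```python
def predict_constant_velocity(history, t_now):
    """Predict (x, y) at t_now from past tracking results.

    history: list of (t, x, y) tuples in the common coordinate system,
             oldest first (hypothetical layout).
    """
    (t0, x0, y0), (t1, x1, y1) = history[0], history[-1]
    if t1 == t0:
        # Not enough history to estimate a velocity: keep the last position.
        return (x1, y1)
    # Average velocity over the stored history (uniform linear motion).
    vx = (x1 - x0) / (t1 - t0)
    vy = (y1 - y0) / (t1 - t0)
    # Displacement since the previous result, added to the previous position.
    dt = t_now - t1
    return (x1 + vx * dt, y1 + vy * dt)


predicted = predict_constant_velocity(
    [(0.0, 0.0, 0.0), (1.0, 1.0, 0.5), (2.0, 2.0, 1.0)], t_now=3.5
)
```

Because the time difference `dt` is explicit, the same sketch works when the frames arrive at irregular intervals from different cameras.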

そして、予測部２１０は、予測結果を対応付け部２３０に出力する。 Then, the prediction unit 210 outputs the prediction result to the association unit 230.

対応付け部２３０は、検出部１００の夫々から出力される検出結果を受信する。なお、図６において、検出結果（ｎ）（ｎは、１〜Ｎ）は、検出部１００−ｎから出力された検出結果を示している。また、対応付け部２３０は、予測部２１０から、予測結果を受信する。そして、対応付け部２３０は、記憶部２２０を参照し、上記予測結果を用いて、検出結果に含まれるターゲットと、トラッカーとの対応付けを行う。 The associating unit 230 receives the detection result output from each of the detecting units 100. In addition, in FIG. 6, the detection result (n) (n is 1 to N) indicates the detection result output from the detection unit 100-n. The associating unit 230 also receives the prediction result from the prediction unit 210. Then, the associating unit 230 refers to the storage unit 220 and associates the target included in the detection result with the tracker using the prediction result.

対応付け部２３０は、対応付け全体として最も確度が高くなる組み合わせを求める。あるターゲットｍと、あるトラッカーｋが対応付く尤度は、ターゲットｍおよびトラッカーｋのそれぞれの尤度Ｐｍおよびηｋと、両者が同一のオブジェクトである可能性を表す尤度ｑｋｍとを掛け合わせたものになる。よって、対応付け部２３０は、ターゲットおよびトラッカーの各ペアに対してこの値を算出し、全体として最大となる組み合わせを求める。 The associating unit 230 obtains the combination that gives the highest accuracy for the association as a whole. The likelihood that a certain target m is associated with a certain tracker k is the product of the respective likelihoods Pm and ηk of the target m and the tracker k and the likelihood qkm representing the possibility that the two are the same object. The associating unit 230 therefore calculates this value for each target-tracker pair and obtains the combination that is maximal overall.
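
As a small illustration of the product described above, the per-pair value Pm × ηk × qkm can be tabulated as a matrix. The function name `association_scores` and the sample numbers are hypothetical.

```python
def association_scores(target_likelihoods, tracker_likelihoods, same_object):
    """Return scores[k][m] = P_m * eta_k * q_km for every tracker/target pair.

    target_likelihoods:  P_m for each target m
    tracker_likelihoods: eta_k for each tracker k
    same_object:         q_km, indexed as same_object[k][m]
    """
    return [
        [p_m * eta_k * same_object[k][m]
         for m, p_m in enumerate(target_likelihoods)]
        for k, eta_k in enumerate(tracker_likelihoods)
    ]


scores = association_scores(
    target_likelihoods=[0.9, 0.5],
    tracker_likelihoods=[0.8, 0.6],
    same_object=[[0.7, 0.1], [0.2, 0.9]],
)
```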

ここで、ターゲットの尤度（第１の尤度）は、オブジェクト検出の確からしさ（確度）を表す値である。オブジェクト検出の確度は、検出対象のオブジェクト（検出オブジェクトと呼ぶ）の画面上（フレーム上）における大きさ、カメラ２０から、オブジェクトの検出位置までの距離、カメラ２０からのオブジェクトの見え方等に依存する。 Here, the likelihood of the target (first likelihood) is a value representing the certainty (accuracy) of object detection. The accuracy of object detection depends on the size on the screen (in the frame) of the object to be detected (referred to as the detected object), the distance from the camera 20 to the detected position of the object, how the object appears from the camera 20, and the like.

例えば、検出オブジェクトが小さく、該検出オブジェクトのサイズが検出できるサイズの限界に近い場合には、オブジェクト検出の確度は低くなる。また、検出オブジェクトの大きさが、カメラパラメータによって想定されるオブジェクトの見かけの大きさからずれている場合、オブジェクト検出の確度は低くなる。また、オブジェクトの検出位置がカメラ２０から離れていたり、オブジェクトの存在する領域に対する照明条件が悪く、オブジェクトが検出されにくい場所であったりする場合には、オブジェクト検出の確度は低くなる。また、識別器の学習に用いたデータと、実際の見え方が大きく異なる場合（例えば、角度が異なるなど）にも、オブジェクト検出の確度は低くなる。 For example, when the detected object is small and the size of the detected object is close to the limit of the size that can be detected, the accuracy of object detection is low. Further, when the size of the detected object deviates from the apparent size of the object assumed by the camera parameter, the accuracy of the object detection becomes low. In addition, if the detection position of the object is far from the camera 20 or if the illumination condition for the area where the object is present is poor and the object is difficult to detect, the accuracy of the object detection is low. In addition, the accuracy of object detection is low even when the data used for learning of the discriminator is significantly different from the actual appearance (for example, the angle is different).

対応付け部２３０は、このような特性を反映させて、ターゲットの尤度を算出する。具体的には、対応付け部２３０は、検出部１００から受信した検出結果に含まれる、ターゲットの尤度情報を用いて、ターゲットの尤度を算出する。なお、検出部１００がターゲットの尤度を算出し、算出した尤度をターゲットの尤度情報として検出結果に含めている場合、対応付け部２３０は、このターゲットの尤度情報に含まれる尤度をそのまま用いてもよい。なお、ターゲットの尤度はここに記載したすべての項目を反映させる必要はなく、主要な要因のみを反映させるようにしてもよい。 The associating unit 230 calculates the likelihood of the target so as to reflect these characteristics. Specifically, the associating unit 230 calculates the likelihood of the target using the target likelihood information included in the detection result received from the detection unit 100. When the detection unit 100 calculates the likelihood of the target and includes the calculated likelihood in the detection result as the target likelihood information, the associating unit 230 may use the likelihood contained in that information as is. Note that the likelihood of the target does not have to reflect all the items described here, and may reflect only the major factors.

トラッカーの尤度（第２の尤度）は、オブジェクト追跡の確からしさ（確度）を表す値である。オブジェクト追跡の確度は、前フレームにおけるオブジェクト追跡の追跡結果に依存して変化する。例えば、現フレームの前の（過去の）フレームまでにおける追跡結果で、ターゲットと確実に対応付いているトラッカーは、オブジェクト追跡の確度が高いと言え、あまり対応づいていないトラッカーは、オブジェクト追跡の確度が低いと言える。よって、対応付け部２３０は、各フレームにおいて、ターゲットとトラッカーとが対応付いたかどうかの結果に基づいて、尤度を変化させていけばよく、対応付いた場合にトラッカーの尤度を上げ、対応付かなかった場合にトラッカーの尤度を下げるようにすればよい。 The likelihood of the tracker (second likelihood) is a value representing the certainty (accuracy) of object tracking. The accuracy of object tracking changes depending on the tracking results in the preceding frames. For example, a tracker that has been reliably associated with targets in the tracking results up to the frames before the current frame can be said to have high object tracking accuracy, whereas a tracker that has rarely been associated can be said to have low object tracking accuracy. The associating unit 230 may therefore change the likelihood in each frame based on whether or not the tracker was associated with a target, raising the likelihood of the tracker when it was associated and lowering it when it was not.

また、この際、トラッカーの位置が、カメラ２０から遠い位置にある場合には、このトラッカーの位置の誤差が大きくなると考えられる。その結果、このようなトラッカーは、検出結果に含まれるオブジェクト（ターゲット）と対応付きにくくなる。このため、対応付け部２３０は、トラッカーの位置とカメラ２０との距離に応じて、該トラッカーの尤度を変化させる比率を変更させてもよい。更に対応付け部２３０は、カメラ２０が、トラッカーによって追跡結果が示されるオブジェクトを見たときの、該カメラ２０を含む水平面と視線方向とがなす角（俯角または仰角）に応じて、該トラッカーの尤度を変化させる比率を変えてもよい。 In addition, when the position of the tracker is far from the camera 20, the error in the position of the tracker is considered to be large. As a result, such a tracker is less likely to be associated with an object (target) included in the detection result. For this reason, the associating unit 230 may change the rate at which the likelihood of the tracker is changed according to the distance between the position of the tracker and the camera 20. The associating unit 230 may further change the rate at which the likelihood of the tracker is changed according to the angle (depression angle or elevation angle) between the horizontal plane containing the camera 20 and the line-of-sight direction when the camera 20 views the object whose tracking result is indicated by the tracker.

例えば、トラッカーによって追跡結果が示されるオブジェクト（以降、トラッカーのオブジェクトと呼ぶ）がカメラ２０から近く、該オブジェクトに対するカメラ２０の俯角が所定の角度より大きい場合には、オブジェクトの位置の精度は高い。よって、該トラッカーと、ターゲットとが対応付きやすい。そのため、このような場合、対応付け部２３０は、トラッカーの尤度を変化させる比率をより大きくする。 For example, when an object whose tracking result is shown by a tracker (hereinafter referred to as a tracker object) is close to the camera 20 and the depression angle of the camera 20 with respect to the object is larger than a predetermined angle, the accuracy of the position of the object is high. Therefore, the tracker and the target are easily associated with each other. Therefore, in such a case, the associating unit 230 increases the ratio of changing the likelihood of the tracker.

また、例えば、トラッカーのオブジェクトがカメラ２０から遠く該オブジェクトに対するカメラ２０の俯角が所定の角度より浅い場合、カメラ２０が撮影したフレーム上における該オブジェクトのサイズは小さくなる。また、画像上での少しの位置のずれが、実空間上では大きなずれになる。よって、該オブジェクトの検出位置の精度は低くなる可能性が高い。そのため、このような場合、対応付け部２３０は、トラッカーの尤度を変化させる比率をより小さくする。このようにして、対応付け部２３０は、トラッカーの尤度を算出する。 Further, for example, when the object of the tracker is far from the camera 20 and the depression angle of the camera 20 with respect to the object is shallower than a predetermined angle, the size of the object on the frame captured by the camera 20 becomes small. In addition, a slight positional shift on the image causes a large shift in the real space. Therefore, the accuracy of the detection position of the object is likely to be low. Therefore, in such a case, the association unit 230 reduces the ratio of changing the likelihood of the tracker. In this way, the association unit 230 calculates the likelihood of the tracker.

以上のように、対応付け部２３０は、トラッカーのオブジェクトに近いカメラ２０で検出された検出結果を優先的にトラッカーの尤度に反映できるため、全体として追跡の精度を上げることができる。なお、トラッカーの尤度にはここに記載したすべての項目を反映させる必要はなく、主要な要因のみを反映させるようにしてもよい。 As described above, since the associating unit 230 can preferentially reflect the detection result detected by the camera 20 close to the object of the tracker to the likelihood of the tracker, it is possible to improve the tracking accuracy as a whole. Note that it is not necessary to reflect all the items described here in the likelihood of the tracker, and only the main factors may be reflected.
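
The text does not give a formula for the rate of change, so the following is only one possible sketch of the idea: the likelihood of a tracker moves up when it is matched and down when it is not, and the step size is larger for near, steeply viewed objects than for far, shallowly viewed ones. The exponential distance falloff, the linear angle factor, and every parameter value (`base_rate`, `distance_scale_m`, `angle_threshold_deg`) are assumptions made for illustration.

```python
import math


def update_rate(distance_m, depression_deg, base_rate=0.3,
                distance_scale_m=10.0, angle_threshold_deg=30.0):
    """Hypothetical rate: large for near/steep views, small for far/shallow."""
    distance_factor = math.exp(-distance_m / distance_scale_m)
    if depression_deg >= angle_threshold_deg:
        angle_factor = 1.0
    else:
        angle_factor = max(depression_deg, 0.0) / angle_threshold_deg
    return base_rate * distance_factor * angle_factor


def update_tracker_likelihood(eta, matched, rate):
    """Move eta toward 1 when matched with a target, toward 0 otherwise."""
    goal = 1.0 if matched else 0.0
    return eta + rate * (goal - eta)


near_rate = update_rate(distance_m=2.0, depression_deg=45.0)
far_rate = update_rate(distance_m=30.0, depression_deg=5.0)
```

With these assumed parameters, a match seen by a nearby camera raises the likelihood noticeably, while a match or miss from a distant camera barely changes it, which is the prioritization the paragraph above describes.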

また、ターゲットｍとトラッカーｋと間の同一性を表す尤度ｑｋｍは、両者が同一である確度を表している。ターゲットが示すオブジェクトと、トラッカーのオブジェクトとが同一のオブジェクトの場合には、ターゲットとトラッカーのオブジェクトとの位置は近くなる可能性が高い。そのため、対応付け部２３０は、ターゲットとトラッカーのオブジェクトとの間の距離に応じて尤度を変化させる。つまり、対応付け部２３０は、ターゲットｍとトラッカーｋのオブジェクトとの距離が近い場合に、尤度ｑｋｍの値をより大きく、距離が離れている場合に尤度ｑｋｍの値をより小さくすればよい。 The likelihood qkm, which represents the identity between the target m and the tracker k, indicates the certainty that the two are the same. When the object indicated by the target and the tracker's object are the same object, the positions of the target and the tracker's object are likely to be close to each other. The associating unit 230 therefore changes the likelihood according to the distance between the target and the tracker's object. That is, the associating unit 230 may make the value of the likelihood qkm larger when the distance between the target m and the object of the tracker k is short, and smaller when the distance is long.

この際、ターゲットｍがカメラ２０から離れていたり、カメラ２０の該ターゲットｍに対する俯角が所定の角度より浅かったりする場合には、ターゲットｍの位置の精度が低くなる可能性が高い。よって、対応付け部２３０は、単純なユークリッド距離を用いた距離の計算ではなく、ターゲットの検出位置に含まれる誤差（曖昧さ）を考慮した、マハラノビス距離を用いて、ターゲットとトラッカーのオブジェクトとの間の距離を求めてもよい。また、対応付け部２３０は、上記方法の他に、曖昧さを考慮して、距離に応じた尤度ｑｋｍの変化の度合いを制御するようにしてもよい。即ち、対応付け部２３０は、上記曖昧さがより大きい場合には、ターゲットｍとトラッカーｋのオブジェクトとの間の距離に応じて、尤度ｑｋｍの変化をより小さくする。これにより、対応付け部２３０は、ターゲットの位置ずれが対応付けに与える影響を軽減させる。 At this time, when the target m is far from the camera 20 or the depression angle of the camera 20 with respect to the target m is shallower than a predetermined angle, the accuracy of the position of the target m is likely to be low. Therefore, the associating unit 230 does not calculate the distance using the simple Euclidean distance, but uses the Mahalanobis distance in consideration of the error (ambiguity) included in the detected position of the target, and the target and the object of the tracker. The distance between them may be obtained. In addition to the above method, the associating unit 230 may control the degree of change in the likelihood qkm depending on the distance, in consideration of ambiguity. That is, when the ambiguity is larger, the associating unit 230 reduces the change in the likelihood qkm according to the distance between the target m and the object of the tracker k. As a result, the associating unit 230 reduces the influence of the displacement of the target on the association.
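
A minimal sketch of a distance-based qkm using the Mahalanobis distance is shown below. It assumes a 2-D position in the common coordinate system and a 2×2 covariance matrix describing the ambiguity of the detected target position; the Gaussian-shaped mapping `exp(-d²/2)` from distance to likelihood is an assumption, since the text only requires qkm to decrease with distance.

```python
import math


def mahalanobis_sq(dx, dy, cov):
    """Squared Mahalanobis distance for a symmetric 2x2 covariance.

    cov = ((a, b), (b, c)); it must have a positive determinant.
    """
    (a, b), (_, c) = cov
    det = a * c - b * b
    if det <= 0:
        raise ValueError("covariance must have a positive determinant")
    # [dx dy] * inverse(cov) * [dx dy]^T written out for the 2x2 case.
    return (c * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det


def identity_likelihood(target_xy, tracker_xy, cov):
    """Hypothetical q_km: 1 at zero distance, decreasing with distance."""
    dx = target_xy[0] - tracker_xy[0]
    dy = target_xy[1] - tracker_xy[1]
    return math.exp(-0.5 * mahalanobis_sq(dx, dy, cov))


precise = ((0.25, 0.0), (0.0, 0.25))  # target close to the camera
vague = ((4.0, 0.0), (0.0, 4.0))      # target far away, shallow angle
q_precise = identity_likelihood((1.0, 0.0), (0.0, 0.0), precise)
q_vague = identity_likelihood((1.0, 0.0), (0.0, 0.0), vague)
```

For the same 1-unit offset, the vague target keeps a higher qkm than the precise one, so a positional error in an ambiguous detection penalizes the association less.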

さらに、対応付け部２３０は、ターゲットおよびトラッカーの外見の類似性も考慮してもよい。即ち、対応付け部２３０は、ターゲットおよびトラッカーのオブジェクトの色、模様、形状といった特徴を抽出しておき、これらの類似性を評価して、尤度ｑｋｍを求めるようにしてもよい。 Further, the associating unit 230 may also consider the similarity in appearance of the target and the tracker. That is, the associating unit 230 may extract features such as the color, pattern, and shape of the target and tracker objects, evaluate the similarity between them, and obtain the likelihood qkm.

例えば、対応付け部２３０は、オブジェクトの色ヒストグラムを、ターゲットおよびトラッカーのオブジェクトの両方に対して算出し、これらの類似度を色ヒストグラムの重なり等によって評価し、尤度ｑｋｍに反映させてもよい。なお、ターゲットおよびトラッカーの同一性を表す尤度ｑｋｍも、トラッカーの尤度ηｋおよびターゲットの尤度Ｐｍと同様に、上述した全ての項目を反映させる必要はなく、主要な要因のみを反映させるようにしてもよい。 For example, the associating unit 230 may calculate a color histogram of the object for both the target and the tracker's object, evaluate their similarity by the overlap of the color histograms or the like, and reflect it in the likelihood qkm. Like the tracker likelihood ηk and the target likelihood Pm, the likelihood qkm representing the identity of the target and the tracker does not have to reflect all of the items described above, and may reflect only the major factors.
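
One common way to measure the overlap of two color histograms is histogram intersection, sketched below. The text only says "overlap of the color histograms or the like", so the choice of this particular measure, the normalization step, and the name `histogram_intersection` are assumptions.

```python
def histogram_intersection(hist_a, hist_b):
    """Overlap of two histograms with the same bins, in the range [0, 1].

    Each histogram is normalized to sum to 1 first, so objects of
    different pixel counts can be compared.
    """
    total_a, total_b = sum(hist_a), sum(hist_b)
    if total_a == 0 or total_b == 0:
        return 0.0
    return sum(min(a / total_a, b / total_b) for a, b in zip(hist_a, hist_b))


same_colors = histogram_intersection([3, 1], [6, 2])
different_colors = histogram_intersection([1, 0], [0, 1])
partial = histogram_intersection([2, 2, 0], [1, 1, 2])
```

The resulting value could be multiplied into qkm alongside the distance-based term, so that targets and trackers with similar appearance are favored.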

また、対応付け部２３０は、オブジェクトがカメラ２０の画角外に出たり、他のオブジェクトに遮蔽されたりして、検出されない場合も考慮して、上記の各尤度を算出してもよい。これにより、統合追跡部２００は、オブジェクトが未検出であったり、画角外に出てしまったりする場合であっても、高精度にオブジェクトを追跡することができる。 Further, the associating unit 230 may calculate each of the above likelihoods in consideration of the case where the object is not detected due to being out of the angle of view of the camera 20 or occluded by another object. Thereby, the integrated tracking unit 200 can track the object with high accuracy even when the object is not detected or goes out of the angle of view.

以上のように、各尤度を算出し、全体として各尤度が最大となるターゲットとトラッカーとの対応付けを求める問題は、各尤度を単調非増加関数によってコストに変換して用いることにより、コストが最小となる割当問題（どのターゲットをどのトラッカーに対応付けるか）に帰着できる。この割当問題は、例えば、ハンガリアン法等の手法により、効率的に算出することが可能である。 As described above, the problem of calculating each likelihood and finding the association between targets and trackers that maximizes the likelihoods as a whole can be reduced, by converting each likelihood into a cost with a monotonically non-increasing function, to an assignment problem (which target is associated with which tracker) that minimizes the cost. This assignment problem can be solved efficiently by, for example, a technique such as the Hungarian method.
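
The reduction to a minimum-cost assignment can be sketched as follows. Two points are assumptions: the cost function is taken to be the negative logarithm (one monotonically decreasing choice, under which minimizing the summed cost maximizes the product of the pair likelihoods), and the search is a brute-force enumeration of assignments rather than the Hungarian method, which is only practical for a handful of trackers and targets. Unmatched trackers and targets are not modeled here.

```python
import itertools
import math


def to_cost(likelihood, eps=1e-12):
    """Monotonically decreasing map from likelihood to cost (assumed: -log)."""
    return -math.log(max(likelihood, eps))


def best_assignment(scores):
    """Exhaustively find the minimum-cost tracker-to-target assignment.

    scores[k][m] is the likelihood that tracker k matches target m.
    Requires len(scores) <= len(scores[0]) (no more trackers than targets).
    Returns ([(tracker, target), ...], total_cost).
    """
    n_trackers, n_targets = len(scores), len(scores[0])
    best_cost, best_perm = math.inf, None
    for perm in itertools.permutations(range(n_targets), n_trackers):
        cost = sum(to_cost(scores[k][m]) for k, m in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return list(enumerate(best_perm)), best_cost


pairs, total_cost = best_assignment([[0.504, 0.04], [0.108, 0.27]])
```

For larger problems, a polynomial-time solver such as the Hungarian method named in the text would replace the enumeration while using the same cost matrix.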

そして、対応付け部２３０は対応付けの結果を更新部２４０に出力する。この対応付けの結果には、どのターゲットとトラッカーとが対応付くかを示す情報と、少なくともトラッカーの尤度を含む上記各尤度とが含まれる。 Then, the associating unit 230 outputs the result of the association to the updating unit 240. The result of this association includes information indicating which target is associated with the tracker and each of the above-mentioned likelihoods including at least the likelihood of the tracker.

なお、本実施の形態において、対応付け部２３０は、ターゲットの尤度と、トラッカーの尤度との両方の尤度を用いて、対応付けを行ったが、どちらか一方の尤度を用いて対応付けを行ってもよい。 In addition, in the present embodiment, the association unit 230 performs association using both the likelihood of the target and the likelihood of the tracker, but uses either one of the likelihoods. Correlation may be performed.

更新部２４０は、トラッカーの情報の更新を行う。そして、更新部２４０は、このトラッカーを新たなオブジェクト追跡結果として生成する。具体的には、更新部２４０は、対応付け部２３０から、対応付けの結果を受信する。そして、更新部２４０は、この結果に基づいて、トラッカーのオブジェクトの現在位置を算出する。そして、更新部２４０は、記憶部２２０に格納された該トラッカー情報を更新する。更新を行う情報は、例えば、トラッカーに追跡結果が含まれるオブジェクトの位置および／またはサイズ、該オブジェクトの運動モデル、および、トラッカーの尤度等のパラメータであるが、本発明はこれに限定されるものではない。更新部２４０は、記憶部２２０に格納された情報のうち、更新があった情報を更新すればよい。 The updating unit 240 updates the tracker information and generates the updated tracker as a new object tracking result. Specifically, the updating unit 240 receives the association result from the associating unit 230. Based on this result, the updating unit 240 calculates the current position of the tracker's object and then updates the tracker information stored in the storage unit 220. The information to be updated is, for example, the position and/or size of the object whose tracking result is held by the tracker, the motion model of the object, and parameters such as the likelihood of the tracker, but the present invention is not limited to this. Of the information stored in the storage unit 220, the updating unit 240 need only update the information that has changed.

まず、更新部２４０による、トラッカーのオブジェクトの現在位置の算出について説明する。更新部２４０は、トラッカーのオブジェクトの現在位置を、ターゲットの位置の精度を考慮して算出する。例えば、更新部２４０がトラッカーを用いてオブジェクトの位置を予測した予測位置と、該トラッカーに対応付いたターゲットの検出位置とに対し、重みづけを行い、該トラッカーのオブジェクトの現在位置を算出するとする。この場合、更新部２４０は、ターゲットの位置の確度によって、該重みを制御してもよい。 First, the calculation of the current position of the tracker's object by the updating unit 240 will be described. The updating unit 240 calculates the current position of the tracker's object in consideration of the accuracy of the target position. For example, suppose that the updating unit 240 calculates the current position of the tracker's object by weighting the predicted position, obtained by predicting the position of the object using the tracker, and the detected position of the target associated with the tracker. In this case, the updating unit 240 may control the weights according to the accuracy of the target position.

例えば、ターゲットがカメラ２０から離れた位置にあり、該カメラ２０のターゲットに対する俯角が浅い場合には、このターゲットの位置の精度は低い可能性が高い。このような場合、更新部２４０は、このターゲットの位置に対する重みをより小さくする。 For example, when the target is located away from the camera 20 and the depression angle of the camera 20 with respect to the target is shallow, the accuracy of the position of this target is likely to be low. In such a case, the updating unit 240 reduces the weight for the position of this target.

一方、ターゲットがカメラ２０に近い位置にあり、該カメラ２０ターゲットに対する俯角が所定の値より大きい場合には、このターゲットの位置の精度は高いと想定される。このような場合、更新部２４０は、このターゲットの位置に対する重みをより大きくする。 On the other hand, when the target is close to the camera 20 and the depression angle of the camera 20 with respect to the target is larger than a predetermined value, the accuracy of the position of this target is assumed to be high. In such a case, the updating unit 240 increases the weight for the position of this target.

更新部２４０は、予測位置と、重みを設定した位置とを用いて、トラッカーのオブジェクトの現在位置を算出する。 The update unit 240 calculates the current position of the tracker object using the predicted position and the position for which the weight is set.

このように、更新部２４０がターゲットの位置に対する重みを決定することで、ターゲットに近いカメラ２０による、オブジェクトの検出位置の予測結果がより強く反映されるようになる。したがって、物体追跡装置１０は、オブジェクトの位置の予測精度を向上させることができる。 In this way, the updating unit 240 determines the weight for the position of the target, so that the prediction result of the detected position of the object by the camera 20 close to the target is more strongly reflected. Therefore, the object tracking device 10 can improve the prediction accuracy of the position of the object.
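
The weighted combination of the predicted position and the detected position can be sketched as a convex combination. The name `fuse_position` and the use of a single scalar weight in [0, 1] are assumptions; how the weight is derived from distance and depression angle is left open, as in the text.

```python
def fuse_position(predicted_xy, detected_xy, detection_weight):
    """Blend the predicted and detected positions.

    detection_weight is in [0, 1]: close to 1 trusts the detected target
    position (near camera, steep angle), close to 0 trusts the prediction.
    """
    if not 0.0 <= detection_weight <= 1.0:
        raise ValueError("detection_weight must be between 0 and 1")
    w = detection_weight
    return tuple((1.0 - w) * p + w * d
                 for p, d in zip(predicted_xy, detected_xy))


# Accurate detection: the result stays close to the detected position.
near_camera = fuse_position((0.0, 0.0), (10.0, 0.0), detection_weight=0.8)
# Inaccurate detection: the result stays close to the prediction.
far_camera = fuse_position((0.0, 0.0), (10.0, 0.0), detection_weight=0.2)
```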

そして、更新部２４０は、記憶部２２０に格納されたオブジェクトに関する情報に含まれる、オブジェクトの最も新しい位置を、算出したオブジェクトの現在位置に更新する。 Then, the updating unit 240 updates the latest position of the object included in the information about the object stored in the storage unit 220 to the calculated current position of the object.

次に、更新部２４０が行うトラッカーの尤度の更新について説明する。 Next, the updating of the likelihood of the tracker performed by the updating unit 240 will be described.

トラッカーのオブジェクトがカメラ２０から離れた位置にある場合、該オブジェクトの大きさは小さくなる。そのため、検出部１００は、このようなオブジェクトを検出し辛くなる。 When a tracker object is located far from the camera 20, the size of the object is reduced. Therefore, the detection unit 100 has difficulty detecting such an object.

また、フレームに含まれるオブジェクトが、学習に用いたオブジェクトの見え方と異なる見え方である場合に、検出部１００の認識型オブジェクト検出部１１１が認識型のオブジェクト検出を行う場合について説明する。フレームに含まれるオブジェクトが、学習に用いたオブジェクトの見え方と異なる見え方である場合とは、例えば、トラッカーのオブジェクトの位置から想定される、該オブジェクトに対するカメラ２０の俯角と、学習に用いたオブジェクトに対するカメラ２０の俯角と、が大きく異なる場合である。このような場合、検出部１００の認識型オブジェクト検出部１１１は、フレームに含まれるオブジェクトを検出し辛くなる。 Next, consider the case where the recognition-type object detection unit 111 of the detection unit 100 performs recognition-type object detection when an object included in the frame appears differently from the objects used for learning. An object included in the frame appears differently from the objects used for learning when, for example, the depression angle of the camera 20 with respect to the object, as estimated from the position of the tracker's object, differs greatly from the depression angle of the camera 20 with respect to the objects used for learning. In such a case, it becomes difficult for the recognition-type object detection unit 111 of the detection unit 100 to detect the object included in the frame.

このようなオブジェクトを検出し辛い状況の場合、フレームに含まれるオブジェクトは、未検出となってしまう可能性がある。この場合、このオブジェクトに関連するトラッカーに対応付くターゲットが存在しない可能性がある。 In a situation where it is difficult to detect such an object, the object included in the frame may be undetected. In this case, there may not be a target associated with the tracker associated with this object.

したがって、更新部２４０は、ターゲットに関連付いていないトラッカーのうち、オブジェクトが検出されづらい状況にあるオブジェクトのトラッカーの尤度の変化を小さく抑える。 Therefore, the update unit 240 suppresses the change in the likelihood of the tracker of the object in which the object is difficult to be detected among the trackers not associated with the target.

このようにして、更新部２４０は、オブジェクトが検出されにくい場合の追跡への影響を抑え、検出されやすいカメラでの検出の結果をトラッカーの尤度に大きく反映させることができる。そして、次のフレームに対するオブジェクト追跡の際に、対応付け部２３０は、このトラッカーの尤度に基づいて、対応付けを行うため、統合追跡部２００は、よりオブジェクト追跡の精度を向上させることができる。 In this way, the updating unit 240 can suppress the effect on tracking when an object is difficult to detect, and can strongly reflect in the likelihood of the tracker the detection results from cameras in which the object is easy to detect. Because the associating unit 230 performs the association based on this tracker likelihood when tracking objects in the next frame, the integrated tracking unit 200 can further improve the accuracy of object tracking.
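
The idea of keeping the likelihood change small for trackers that were not matched because their objects are hard to detect can be sketched as below. The `detectability` score in [0, 1], the multiplicative decay, and the `base_decay` value are all hypothetical; the text does not specify how detectability is quantified.

```python
def decay_unmatched(eta, detectability, base_decay=0.3):
    """Lower the likelihood of an unmatched tracker.

    detectability: 1.0 means the object should have been easy to detect
    from this camera, values near 0.0 mean it was hard to detect (far
    away, or viewed at an angle unlike the training data). The harder
    the detection, the smaller the reduction.
    """
    if not 0.0 <= detectability <= 1.0:
        raise ValueError("detectability must be between 0 and 1")
    return eta * (1.0 - base_decay * detectability)


easy_case = decay_unmatched(0.8, detectability=1.0)
hard_case = decay_unmatched(0.8, detectability=0.1)
```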

そして、記憶部２２０に格納されたトラッカーの尤度のうち、対応付け部２３０が算出したトラッカーの尤度と、上記変化を小さく抑えたトラッカーの尤度とを、更新する。 Then, among the tracker likelihoods stored in the storage unit 220, the updating unit 240 updates the tracker likelihoods calculated by the associating unit 230 and the tracker likelihoods whose change has been kept small as described above.

次に、更新部２４０による、トラッカーのオブジェクトの運動モデルの更新について説明する。例えば、統合追跡部２００が、カルマンフィルタを用いてオブジェクトの位置を予測することによって、オブジェクト追跡を行う場合について説明する。この場合、更新部２４０は、ターゲットと対応付いたトラッカーのオブジェクトの位置座標を、検出された位置座標として、記憶部２２０に格納された、カルマンフィルタの状態変数の更新式に代入し、カルマンフィルタの状態を更新する。 Next, the updating of the motion model of the tracker's object by the updating unit 240 will be described, taking as an example the case where the integrated tracking unit 200 performs object tracking by predicting the position of the object using a Kalman filter. In this case, the updating unit 240 substitutes the position coordinates of the object of the tracker associated with a target, as the detected position coordinates, into the update equation for the Kalman filter state variables stored in the storage unit 220, and thereby updates the state of the Kalman filter.
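
For reference, the measurement-update step of a Kalman filter in its simplest scalar form is sketched below. This is a deliberately reduced illustration (one coordinate, position-only state); the filter of the embodiment would typically carry a multi-dimensional state such as position and velocity. The function and parameter names are hypothetical.

```python
def kalman_update_1d(x_pred, p_pred, z, r):
    """One measurement update of a scalar Kalman filter.

    x_pred: predicted state (e.g. one position coordinate)
    p_pred: variance of the prediction
    z:      detected position substituted as the measurement
    r:      measurement noise variance (larger for less accurate detections)
    Returns the updated state and its variance.
    """
    gain = p_pred / (p_pred + r)           # Kalman gain
    x_new = x_pred + gain * (z - x_pred)   # correct the prediction
    p_new = (1.0 - gain) * p_pred          # uncertainty shrinks
    return x_new, p_new


x_acc, p_acc = kalman_update_1d(x_pred=0.0, p_pred=1.0, z=2.0, r=1.0)
x_noisy, p_noisy = kalman_update_1d(x_pred=0.0, p_pred=1.0, z=2.0, r=9.0)
```

A larger measurement variance `r` produces a smaller gain, so a detection from a distant camera moves the state less than one from a nearby camera.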

また、上記以外にも更新部２４０は、記憶部２２０に格納された、トラッカーのその他のパラメータ等を更新する。 In addition to the above, the update unit 240 updates the other parameters of the tracker stored in the storage unit 220.

例えば、オブジェクト自体が姿勢を変更する場合がある。例えば、オブジェクトが人物の場合には、該人物がしゃがんだり、屈んだりすることにより、該オブジェクトの見かけの高さが変わる。このように、運動モデル以外に、オブジェクトのサイズ等も変更がある場合、更新部２４０は、記憶部２２０に格納された情報のうち、この変更があった情報の更新を行う。 For example, the object itself may change its posture. For example, when the object is a person, the apparent height of the object changes as the person crouches or bends. As described above, when the size of the object or the like is changed in addition to the motion model, the updating unit 240 updates the changed information in the information stored in the storage unit 220.

さらに、トラッカーのパラメータとして、トラッカーのオブジェクトの存在する確率、追跡結果の信頼度を表す重み等が含まれている場合、この重みは対応付け部２３０による対応付けの結果に応じて変化する。したがって、更新部２４０は、この重み等のパラメータを更新する。 Further, when the tracker parameters include the probability of existence of the tracker object, the weight indicating the reliability of the tracking result, and the like, the weight changes according to the result of the association by the associating unit 230. Therefore, the updating unit 240 updates the parameters such as the weight.

また、更新部２４０は、トラッカーの更新として、トラッカーの生成、削除を行う。まず、更新部２４０は、対応付け部２３０による対応付けの処理後、トラッカーと対応付かないターゲットが存在するか否かを判定する。このトラッカーと対応付かないターゲットは、カメラ２０で撮影される範囲内に新たに現れたオブジェクトである可能性がある。そのため、更新部２４０は、トラッカーと対応付かないターゲットが存在する場合、このターゲットが上記範囲内に新たに表れたオブジェクトとみなせるか否かの判定を行う。 The updating unit 240 also generates and deletes a tracker as a tracker update. First, the updating unit 240 determines whether or not there is a target that is not associated with the tracker after the association process by the association unit 230. The target that does not correspond to this tracker may be an object newly appearing within the range captured by the camera 20. Therefore, when there is a target that does not correspond to the tracker, the update unit 240 determines whether this target can be regarded as an object newly appearing within the range.

つまり、更新部２４０は、トラッカーと対応付かないターゲットが存在する場合、このターゲットの存在する確率を評価する。そして、更新部２４０は、この確率が所定の値以上か否かを判定する。そして、この確率が所定の値以上の場合、更新部２４０は、このターゲットが示すオブジェクトが、上記範囲内に新たに表れたオブジェクトであると判定する。そして、更新部２４０は、上記範囲内に新たに表れたオブジェクトと判定したオブジェクト（ターゲット）に関連するトラッカーを新規に作成する。 That is, when there is a target that does not correspond to the tracker, the update unit 240 evaluates the probability that this target exists. Then, the updating unit 240 determines whether or not this probability is equal to or higher than a predetermined value. Then, when this probability is equal to or higher than a predetermined value, the updating unit 240 determines that the object indicated by this target is an object newly appearing within the range. Then, the updating unit 240 newly creates a tracker related to the object (target) that is determined to be an object newly appearing within the range.

また、更新部２４０は、ターゲットと対応付かないトラッカーが存在するか否かを判定する。このターゲットと対応付かないトラッカーは、カメラ２０で撮影される範囲内から消えた（範囲内から範囲外に移動した）オブジェクトに関するトラッカーである可能性がある。そのため、更新部２４０は、ターゲットと対応付かないトラッカーが存在する場合、このターゲットが上記範囲内から消えたオブジェクトに関するトラッカーとみなせるか否かを判定する。 The updating unit 240 also determines whether or not there is a tracker that is not associated with any target. A tracker that is not associated with any target may be a tracker for an object that has disappeared from the range captured by the camera 20 (that is, has moved from inside the range to outside it). Therefore, when there is a tracker that is not associated with any target, the updating unit 240 determines whether or not this tracker can be regarded as a tracker for an object that has disappeared from the range.

つまり、更新部２４０は、ターゲットと対応付かないトラッカーが存在する場合、このトラッカーに関するオブジェクトの存在する確率を評価する。そして、更新部２４０は、この確率が所定の値を下回るかを判定する。そして、この確率が所定の値を下回った場合、更新部２４０は、このトラッカーに関するオブジェクトが、上記範囲内から消えたオブジェクトであると判定する。そして、更新部２４０は、上記範囲内から消えたオブジェクトと判定したオブジェクトに関するトラッカーを削除する。 That is, when there is a tracker that does not correspond to the target, the update unit 240 evaluates the probability that an object related to this tracker exists. Then, the updating unit 240 determines whether this probability is lower than a predetermined value. Then, when this probability falls below a predetermined value, the update unit 240 determines that the object related to this tracker is an object that has disappeared from the range. Then, the updating unit 240 deletes the tracker related to the object determined to be the object that has disappeared from the range.

オブジェクトの存在する確率とは、トラッカーの尤度によって求められる。つまり、更新部２４０は、ターゲットと対応付かないトラッカーが存在する場合、このトラッカーの尤度を減じていく。そして、トラッカーの尤度の値が所定の閾値を下回った場合、更新部２４０は、該トラッカーを削除するようにする。 The probability that an object exists is determined by the likelihood of the tracker. That is, when there is a tracker that does not correspond to the target, the update unit 240 reduces the likelihood of this tracker. Then, when the likelihood value of the tracker falls below a predetermined threshold value, the updating unit 240 deletes the tracker.
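
The creation and deletion rules above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Tracker` class, the function name `update_trackers`, the multiplicative decay, and all numeric thresholds are assumptions chosen for the example. The text only states that the likelihood is reduced and compared against predetermined values.

```python
from dataclasses import dataclass


@dataclass
class Tracker:
    tracker_id: int
    x: float
    y: float
    likelihood: float = 1.0


def update_trackers(trackers, matched_ids, unmatched_targets,
                    decay=0.8, delete_below=0.3, create_at=0.6):
    """Return the trackers that remain after one frame.

    trackers          -- existing Tracker objects
    matched_ids       -- ids of trackers associated with a target this frame
    unmatched_targets -- (x, y, existence_probability) of targets with no tracker
    decay, delete_below, create_at are hypothetical tuning values.
    """
    survivors = []
    for tracker in trackers:
        if tracker.tracker_id not in matched_ids:
            # No target was associated: lower the likelihood (assumed decay rule).
            tracker.likelihood *= decay
            if tracker.likelihood < delete_below:
                continue  # treated as having left the captured range
        survivors.append(tracker)

    next_id = max((t.tracker_id for t in trackers), default=-1) + 1
    for x, y, probability in unmatched_targets:
        if probability >= create_at:
            # Treated as an object that has newly appeared in the range.
            survivors.append(Tracker(next_id, x, y, probability))
            next_id += 1
    return survivors
```

With these assumed values, an unmatched tracker whose likelihood is 0.35 drops to about 0.28 and is deleted, while an unmatched target with probability 0.9 gets a new tracker.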

The updating unit 240 then takes the trackers that finally remain as the object tracking result for this frame. Of this result, the updating unit 240 outputs information indicating the position of each tracker, information on the size of each tracker's object, and the like as information indicating the result of object tracking (tracking information).

As described above, in the object tracking device 10 according to the present embodiment, object tracking is performed on the camera images output from the cameras 20 in order of the time information they contain, oldest first. Objects are detected from the video data of each of the plurality of cameras 20 using a tracking result that integrates the high-accuracy detection results of the individual cameras 20 and reflects both those detection results and past tracking results. This improves the accuracy of the object tracking performed by the integrated tracking unit 200 of the object tracking device 10.

Next, the flow of the object tracking process of the object tracking device 10 according to the present embodiment will be described with reference to FIG. 8. FIG. 8 is a flowchart showing an example of the flow of the object tracking process of the object tracking device 10 according to the present embodiment.

As shown in FIG. 8, first, the recognition-type object detection unit 111 in the object detection unit 110 of a detection unit 100 receives a camera image from the camera 20 associated with that detection unit 100 (step S81).

The detection unit 100 checks whether the frame of the received camera image is the first frame output from the camera 20 (step S82). If it is the first frame (YES in step S82), the process proceeds to step S85.

If the frame of the received camera image is not the first frame (NO in step S82), the individual coordinate conversion unit 130 converts the tracking information for the previous frame, output from the integrated tracking unit 200, into tracking information expressed in the individual coordinate system (step S83).

The search range setting unit 112 of the object detection unit 110 then uses the tracking information converted in step S83 to set the object search range for the current frame (step S84).

The recognition-type object detection unit 111 then detects objects from the received camera image (step S85).

Next, the individual coordinate conversion unit 130 converts the detection result of the recognition-type object detection unit 111 into a detection result expressed in the common coordinate system (step S86).

Next, the prediction unit 210 of the integrated tracking unit 200 uses the tracker information to predict the positions of the objects in the current frame (step S87).

The association unit 230 then associates the objects (targets) included in the detection result with the trackers (step S88).

Next, the updating unit 240 updates the tracker information, such as the position of each tracker's object and the motion model of the object (step S89).

Further, the updating unit 240 creates and/or deletes trackers (step S90). The object tracking device 10 repeats this process until frames are no longer input to the detection unit 100.
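
The control flow of steps S81 to S90 can be outlined as below. This is only a sketch of the ordering and of the first-frame branch: the function `run_tracking_loop` and the five callables passed to it are hypothetical stand-ins for the units described above, and steps S87 to S90 are collapsed into a single `track` callable.

```python
def run_tracking_loop(frames, to_individual, set_search_range,
                      detect, to_common, track):
    """Process frames in order and return the tracking info for each frame.

    All callables are hypothetical placeholders:
      to_individual(tracking_info)  -> tracking info in the camera's coordinates (S83)
      set_search_range(local_info)  -> search range for the current frame (S84)
      detect(frame, search_range)   -> detections in camera coordinates (S85)
      to_common(detections)         -> detections in common coordinates (S86)
      track(common_detections)      -> updated tracking info (S87 to S90)
    """
    tracking_info = None  # nothing can be fed back before the first frame
    results = []
    for index, frame in enumerate(frames):            # S81: receive a frame
        search_range = None
        if index > 0:                                  # S82: not the first frame
            local_info = to_individual(tracking_info)  # S83
            search_range = set_search_range(local_info)  # S84
        detections = detect(frame, search_range)       # S85
        common_detections = to_common(detections)      # S86
        tracking_info = track(common_detections)       # S87 to S90
        results.append(tracking_info)
    return results
```

On the first frame no search range is available, so `detect` receives `None` and is expected to scan the frame without a restriction.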

(Effect)
As described above, the object tracking device 10 according to the present embodiment can track an object with higher accuracy. This is because each detection unit 100 detects objects in the output information (a video frame) of its camera 20 based on the tracking information for the preceding output information, and because the integrated tracking unit 200 tracks the objects based on the plurality of detection results output by the detection units 100 and generates tracking information expressed in the common coordinate system.

For example, when an object cannot be seen from one camera 20 but can be seen from another camera 20, the object tracking device 10 also uses the tracking result for the object that the first camera 20 cannot see when detecting objects in the video of that first camera 20. As a result, when the same object comes into the range visible from that camera 20, the detection unit 100 can detect it suitably. The object tracking device 10 can therefore track this object accurately.

In this way, the object tracking device 10 according to the present embodiment can extract the flow line of a person or the like who moves across the areas captured by the respective cameras 20. The tracking result can therefore be used, for example, to analyze the behavior of customers moving around a store and serve as basic information for marketing or for changing the store layout. For security purposes, the tracking result can also be used to detect a person loitering between areas.

In addition, because the search range setting unit 112 uses the tracking result to set the search range for object detection in the camera image, the recognition-type object detection unit 111 can reduce unnecessary false detections and can also speed up object detection processing.

Furthermore, because the integrated tracking unit 200 performs object tracking using the likelihood of the targets and/or the likelihood of the trackers, the object tracking device 10 can obtain a more reliable object tracking result. Because the object tracking device 10 then searches for objects using the tracking result obtained in this way, the accuracy of object tracking can be improved further. As a whole, the object tracking device 10 can thus improve the accuracy of object tracking.

<Second Embodiment>
Next, a second embodiment of the present invention will be described with reference to the drawings. For convenience of description, members having the same functions as members included in the drawings described in the first embodiment are given the same reference numerals, and their description is omitted.

The object tracking system 2 according to the present embodiment includes an object tracking device 50 in place of the object tracking device 10 of the object tracking system 1 according to the first embodiment described with reference to FIG. 2. The rest of the system configuration of the object tracking system 2 is the same as that of the object tracking system 1 shown in FIG. 2, and its description is therefore omitted.

(Object tracking device 50)
The functions of the object tracking device 50 will be described with reference to FIG. 9. FIG. 9 is a functional block diagram showing an example of the functional configuration of the object tracking device 50 according to the present embodiment. As shown in FIG. 9, the object tracking device 50 includes a plurality of detection units (100-1 to 100-N), an integrated tracking unit 200, and a display control unit 300. In the present embodiment, as in the first embodiment described above, the plurality of detection units (100-1 to 100-N) are referred to as the detection units 100 when they need not be distinguished from one another or when they are referred to collectively.

The display control unit 300 controls the images (video) displayed on the display device 30. Specifically, the display control unit 300 generates display data by converting the tracking information output by the integrated tracking unit 200 into data that the display device 30 can display, and transmits the display data to the display device 30. The tracking information output by the integrated tracking unit 200 is expressed in the common coordinate system. The display control unit 300 therefore generates display data that can be displayed on the display device 30 in the common coordinate system.

At this time, the integrated tracking unit 200 preferably outputs to the display control unit 300, as the tracking information, the information needed to display the flow line of an object on the display device 30 (for example, information indicating the past positions of a tracker's object). This tracking information may be the same as, or different from, the information fed back to the detection units 100.

The display device 30 then displays the received display data on its screen. In this way, the object tracking device 50 can present the tracking result to the user.

In addition, the detection unit 100 of the object tracking device 50 according to the present embodiment generates display data by converting the search range set by the search range setting unit 112 of the object detection unit 110 into data that the display device 30 can display, and transmits the display data to the display device 30. The search range information output by the search range setting unit 112 is expressed in an individual coordinate system. The display control unit 300 therefore uses the camera parameters of the camera 20 associated with the detection unit 100 that output the search range information to generate display data that can be displayed on the display device 30 in the individual coordinate system.

The display device 30 then displays the received display data on its screen. In this way, the object tracking device 50 can present the search range to the user using the object search range information output from each detection unit 100.

There may be a plurality of display devices 30. For example, display data expressed in the common coordinate system and display data expressed in an individual coordinate system may be received by different display devices 30, each of which displays the display data it received on its screen. Alternatively, a display device 30 may divide the display area of a single screen and display a plurality of pieces of display data on that screen. Thus, the method of displaying display data on the display device 30 according to the present embodiment is not particularly limited.

The display control unit 300 may instead be provided in each of the detection units 100. FIG. 10 is a functional block diagram showing an example of the functional configuration of the detection unit 100 of the object tracking device 50 according to the present embodiment. As shown in FIG. 10, the detection unit 100 includes an object detection unit 110, a common coordinate conversion unit 120, an individual coordinate conversion unit 130, and a display control unit 150. Like the object detection unit 110 shown in FIG. 5, the object detection unit 110 includes a recognition-type object detection unit 111 and a search range setting unit 112.

The search range setting unit 112 shown in FIG. 10 outputs search range information indicating the search range it has set to the display control unit 150. The display control unit 150 receives the search range information output from the search range setting unit 112 and, like the display control unit 300, generates display data by converting it into data that the display device 30 can display. The search range information output by the search range setting unit 112 is expressed in an individual coordinate system. The display control unit 150 therefore uses the camera parameters of the camera 20 associated with the detection unit 100 that includes the display control unit 150 to generate display data that can be displayed on the display device 30 in the individual coordinate system.

The display control unit 150 then transmits the generated display data to the display device 30, and the display device 30 displays the received display data on its screen.

(Application example)
An application example of the object tracking device 50 according to the present embodiment will be described with reference to FIGS. 11 to 14. FIGS. 11 to 14 are diagrams for explaining the application example of the object tracking device 50 according to the present embodiment.

First, FIG. 11 shows an example of a room in which shelves R1 and R2 and a plurality of cameras (A to F) are installed, viewed from the direction opposite to the direction of gravity. In FIG. 11, the horizontal direction is the X axis of the common coordinate system and the vertical direction is the Y axis. The shelves R1 and R2 are arranged side by side along the X axis with their longitudinal directions parallel to the Y axis.

Camera A is installed close to the doorway of this room. In the present embodiment, the whole of the room is assumed to be captured by cameras A to F. That is, as shown in FIG. 11, the indoor space in which the cameras (A to F) are installed is the imaging space. Cameras A to F also capture areas in common with one another.

FIG. 12 shows an example of the video captured by each of camera A and camera B. The upper part of FIG. 12 shows a frame of the video captured by camera A, and the lower part shows a frame of the video captured by camera B. Coordinate values in these frames are expressed in the individual coordinate system of each camera.

The object tracking device 50 according to the present embodiment may be configured to display the video captured by the cameras 20 on the display device 30.

As shown in FIG. 12, the video captured by camera A includes a person C1. Because camera A is installed near the doorway, this video also includes the doorway. The video captured by camera B includes the person C1 and a person C2.

Seen from camera A, the person C2 is hidden behind the shelf R1. At the time of the frames in FIG. 12, the person C2 is therefore an object that camera A cannot see. If these frames are the first frames of the video captured by camera A and camera B, there is no tracking information for a previous frame, so the object tracking device 50 detects objects from these frames and generates tracking information.

The search range setting unit 112 then uses the tracking information expressed in the individual coordinate system to set the object search range for the next frame of the video captured by camera A. Similarly, the search range setting unit 112 uses the tracking information expressed in the individual coordinate system to set the object search range for the next frame of the video captured by camera B.

FIG. 13 shows an example of the search ranges displayed on the display device 30. The upper part of FIG. 13 shows an example of the object search ranges for a frame output from camera A, and the lower part shows an example of the object search ranges for a frame output from camera B.

As shown in the upper part of FIG. 13, the search range setting unit 112 obtains a search range A1 from the position of the person C1 in the upper part of FIG. 12. The search range setting unit 112 also obtains the area of the doorway to the room as a search range N1, and obtains the outer edge portions of the frame as search ranges N2 and N3. The search range setting unit 112 then outputs information combining the obtained search ranges A1 and N1 to N3, as search range information, to the recognition-type object detection unit 111 and to the display control unit 150 or the display control unit 300. The display control unit 150 or the display control unit 300 converts the search ranges indicated by this search range information into display data that can be displayed on the screen of the display device 30, and transmits the display data to the display device 30.

On receiving the display data from the display control unit 150 or the display control unit 300, the display device 30 displays the search ranges on its screen as shown in the upper part of FIG. 13.

Next, the lower part of FIG. 13 will be described. As shown in the lower part of FIG. 13, the search range setting unit 112 obtains a search range B1 and a search range B2 from the positions of the person C1 and the person C2, respectively, in the lower part of FIG. 12. The search range setting unit 112 also obtains the outer edge portions of the frame as search ranges N4 to N7. The search range setting unit 112 then outputs information combining the obtained search ranges B1, B2, and N4 to N7, as search range information, to the recognition-type object detection unit 111 and to the display control unit 150 or the display control unit 300. The display control unit 150 or the display control unit 300 converts the search ranges indicated by this search range information into display data that can be displayed on the screen of the display device 30, and transmits the display data to the display device 30.

On receiving the display data from the display control unit 150 or the display control unit 300, the display device 30 displays the search ranges on its screen as shown in the lower part of FIG. 13.

The display device 30 may display the search ranges in a different manner for each type of area. For example, the display device 30 may display the search ranges for already detected objects and the search ranges for the outer edge portions of the frame in different colors.
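
The way the search ranges in FIG. 13 are assembled can be illustrated with a short sketch. It assumes axis-aligned boxes given as (left, top, right, bottom) in pixel coordinates of one camera. The function name `build_search_ranges`, the margin around tracked objects, and the width of the edge strips are hypothetical, since the text does not state how large each range is.

```python
def build_search_ranges(frame_size, tracked_boxes, entrance_boxes=(),
                        edge_sides=("left", "right"),
                        margin=20, edge_width=30):
    """Return a list of (kind, box) search ranges for one camera frame.

    frame_size     -- (width, height) of the frame in pixels
    tracked_boxes  -- boxes of objects known from the previous tracking result
    entrance_boxes -- fixed boxes such as a doorway visible in the frame
    edge_sides     -- which frame edges objects may enter from
    margin, edge_width are assumed values for illustration only.
    """
    width, height = frame_size
    ranges = []

    # Ranges around already tracked objects (A1, B1, B2 in FIG. 13).
    for left, top, right, bottom in tracked_boxes:
        ranges.append(("object", (max(0, left - margin),
                                  max(0, top - margin),
                                  min(width, right + margin),
                                  min(height, bottom + margin))))

    # Ranges where new objects can appear inside the frame (N1 in FIG. 13).
    for box in entrance_boxes:
        ranges.append(("entrance", tuple(box)))

    # Strips along the frame border (N2 to N7 in FIG. 13).
    strips = {
        "left": (0, 0, edge_width, height),
        "right": (width - edge_width, 0, width, height),
        "top": (0, 0, width, edge_width),
        "bottom": (0, height - edge_width, width, height),
    }
    for side in edge_sides:
        ranges.append(("edge", strips[side]))
    return ranges
```

The `kind` label on each range is what a display step could use to pick a color per type of area, as described above.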

The integrated tracking unit 200 then generates trackers for the person C1 and the person C2 detected in the subsequent frames. The integrated tracking unit 200 outputs the information needed to display the flow line of each of the person C1 and the person C2 to the display control unit 300 as tracking information.

On receiving the tracking information from the integrated tracking unit 200, the display control unit 300 converts the tracking information into display data that the display device 30 can display, and transmits the display data to the display device 30.

The display device 30 then displays the display data received from the display control unit 300 on its screen. FIG. 14 shows an example in which the display device 30 displays, on its screen (display screen), flow lines indicating the tracking results of the person C1 and the person C2. As shown in FIG. 14, in this application example the object tracking results are displayed on the display screen in the XY plane of the common coordinate system. In FIG. 14, the flow line of the person C1 is shown as a solid line and the flow line of the person C2 as a dash-dot line. In this way, the display device 30 can display the object tracking results on its screen.
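
A flow line such as those in FIG. 14 is simply the sequence of past positions of one tracker in the common coordinate system. The sketch below shows one way to accumulate them. The function name `accumulate_flow_lines` and the per-frame dictionary format are assumptions made for this example.

```python
from collections import defaultdict


def accumulate_flow_lines(frames_of_tracks):
    """Collect per-tracker position histories in common (X, Y) coordinates.

    frames_of_tracks -- iterable of dicts, one per frame, mapping a
                        tracker id to its (x, y) position in that frame
                        (hypothetical input format).
    Returns a dict mapping each tracker id to its ordered list of positions.
    """
    flow_lines = defaultdict(list)
    for tracks in frames_of_tracks:
        for tracker_id, position in tracks.items():
            flow_lines[tracker_id].append(position)
    return dict(flow_lines)
```

Each resulting list can then be drawn as a polyline on the XY plane, with a different line style per tracker id.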

<Third Embodiment>
Next, a third embodiment of the present invention will be described with reference to the drawings. For convenience of description, members having the same functions as members included in the drawings described in the first and second embodiments are given the same reference numerals, and their description is omitted.

The object tracking device 10 according to the present embodiment includes a detection unit 400 in place of the detection unit 100 of the object tracking device 10 shown in FIG. 1. The configuration of the detection unit 400 will be described with reference to FIG. 15. FIG. 15 is a functional block diagram showing an example of the functional configuration of the detection unit 400 of the object tracking device 10 according to the present embodiment.

The detection unit 400 includes an object detection unit 140 in place of the object detection unit 110 of the detection unit 100 shown in FIGS. 4 and 5. The detection unit 400 further includes a storage unit 160. That is, as shown in FIG. 15, the detection unit 400 according to the present embodiment includes the object detection unit 140, a common coordinate conversion unit 120, an individual coordinate conversion unit 130, and the storage unit 160.

The present embodiment is described using, as an example, a configuration that includes the object detection unit 140 in place of the object detection unit 110 of the detection unit 100 of the object tracking device 10 according to the first embodiment. The present invention is not limited to this, however; the object detection unit 140 may instead replace the object detection unit 110 of the detection unit 100 of the object tracking device 50 according to the second embodiment. That is, the detection unit 100 according to the present embodiment may be configured to output data to be displayed to the display control unit 150 or the display control unit 300.

The storage unit 160 stores the camera parameters of each camera 20, which are used when converting between coordinate systems. The storage unit 160 also stores information indicating the range of common-coordinate-system coordinate values that the individual coordinate conversion unit 130 uses to check whether a coordinate value of the common coordinate system included in the tracking information falls within the range captured by the associated camera 20. The storage unit 160 may also store video captured by the camera 20; this video may be stored only temporarily.
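
The range check and the conversion into a camera's individual coordinate system might look like the sketch below. It rests on two assumptions that are not stated in this passage: that the stored range is an axis-aligned rectangle in the common coordinate system, and that the camera parameters can be reduced to a 3x3 planar homography from floor coordinates to image coordinates. All function names are hypothetical.

```python
def in_camera_range(point, bounds):
    """Check whether a common-coordinate point lies in the stored range.

    bounds -- (x_min, y_min, x_max, y_max), assumed rectangular.
    """
    x, y = point
    x_min, y_min, x_max, y_max = bounds
    return x_min <= x <= x_max and y_min <= y <= y_max


def apply_homography(h, point):
    """Map a common-coordinate (x, y) to image coordinates with a 3x3 matrix."""
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return (u / w, v / w)


def to_individual_coordinates(tracked_points, bounds, h):
    """Keep only points this camera can capture and convert them."""
    return [apply_homography(h, p)
            for p in tracked_points
            if in_camera_range(p, bounds)]
```

Points outside the stored range are dropped before conversion, so a camera only receives tracking information for positions it can actually capture.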

なお、図１５では、記憶部１６０が検出部４００内に内蔵されることを例に説明を行うが、本発明はこれに限定されるものではない。記憶部１６０は、検出部４００とは、別に、物体追跡装置１０内に設けられるものであってもよい。また、記憶部１６０は、物体追跡装置１０とは別個の記憶装置等で実現されるものであってもよい。 In addition, in FIG. 15, the storage unit 160 is described as an example that is incorporated in the detection unit 400, but the present invention is not limited to this. The storage unit 160 may be provided in the object tracking device 10 separately from the detection unit 400. Further, the storage unit 160 may be realized by a storage device or the like separate from the object tracking device 10.

次に、検出部４００のオブジェクト検出部１４０の詳細な機能構成について、図１６を参照して説明する。図１６は、本実施の形態に係る検出部４００のオブジェクト検出部１４０の機能構成の一例を示す機能ブロック図である。図１６に示す通り、オブジェクト検出部１４０は、認識型オブジェクト検出部（第１の物体検出手段）１４１と、非認識型オブジェクト検出部（第２の物体検出手段）１４２と、検出パラメータ更新部１４３と、検出結果統合部１４４とを備えている。 Next, a detailed functional configuration of the object detection unit 140 of the detection unit 400 will be described with reference to FIG. FIG. 16 is a functional block diagram showing an example of the functional configuration of the object detection unit 140 of the detection unit 400 according to this embodiment. As shown in FIG. 16, the object detection unit 140 includes a recognition-type object detection unit (first object detection unit) 141, a non-recognition-type object detection unit (second object detection unit) 142, and a detection parameter update unit 143. And a detection result integration unit 144.

本実施の形態では、辞書（識別器）等を用いたオブジェクト検出を「認識型オブジェクト検出」と呼ぶ。一方、識別器等を用いないオブジェクト検出を、「非認識型オブジェクト検出」と呼ぶ。 In the present embodiment, object detection using a dictionary (identifier) or the like is called “recognition-type object detection”. On the other hand, object detection that does not use a discriminator or the like is called “non-recognition type object detection”.

認識型オブジェクト検出部１４１は、認識型オブジェクト検出部１４１に入力されるカメラ映像からオブジェクトを検出する。認識型オブジェクト検出部１４１は、フレーム全体に対してオブジェクトの検出を行う。なお、認識型オブジェクト検出部１４１は、図１６に破線で示す通り、後述する検出パラメータ更新部１４３から出力される探索範囲情報に基づいて、オブジェクト検出を行ってもよい。このとき、認識型オブジェクト検出部１４１は、第１の実施の形態において説明した認識型オブジェクト検出部１１１と同様の方法で、オブジェクト検出を行う。 The recognition-type object detection unit 141 detects objects from the camera video input to it. The recognition-type object detection unit 141 performs object detection on the entire frame. Note that, as indicated by the broken line in FIG. 16, the recognition-type object detection unit 141 may instead perform object detection based on search range information output from the detection parameter update unit 143 described below. In that case, the recognition-type object detection unit 141 performs object detection in the same manner as the recognition-type object detection unit 111 described in the first embodiment.

また、認識型オブジェクト検出部１４１は、探索範囲情報が検出パラメータ更新部１４３から出力されないとき、フレーム全体に対してオブジェクト検出を行うのではなく、別の基準を用いて、オブジェクト検出を行ってもよい。例えば、認識型オブジェクト検出部１４１は、シルエット情報を利用して、シルエットがある領域とその周囲の領域に対してのみオブジェクトの検出を行ってもよい。 Further, when no search range information is output from the detection parameter update unit 143, the recognition-type object detection unit 141 may perform object detection using another criterion instead of performing it on the entire frame. For example, the recognition-type object detection unit 141 may use silhouette information and detect objects only in regions that contain a silhouette and in the regions surrounding them.

認識型オブジェクト検出部１４１は、オブジェクト検出の検出結果を第１の検出結果として、検出結果統合部１４４に出力する。 The recognition-type object detection unit 141 outputs the detection result of object detection as the first detection result to the detection result integration unit 144.

また、認識型オブジェクト検出部１４１は、後述する非認識型オブジェクト検出部１４２によるオブジェクト検出に備えて、この時点で、オブジェクトの外見特徴を抽出してもよい。オブジェクトの外見特徴としては、オブジェクトの色、模様、形状などの情報が挙げられるが本発明はこれに限定されるものではない。認識型オブジェクト検出部１４１は、オブジェクトの外見特徴としてこれらの特徴量を抽出する。この際、認識型オブジェクト検出部１４１によるオブジェクト検出で用いる領域と、非認識型オブジェクト検出部１４２によるオブジェクト検出で用いる領域とは同一でなくてもよい。例えば、オブジェクトが人物の場合、認識型オブジェクト検出部１４１によるオブジェクト検出では、頭部を検出し、非認識型オブジェクト検出部１４２によるオブジェクト検出では、服の領域までを検出するとする。このとき認識型オブジェクト検出部１４１は、該服の領域を含むように、オブジェクトの外見特徴の特徴量を抽出する。そして、認識型オブジェクト検出部１４１は、抽出された特徴量をテンプレート情報として、抽出に用いた領域を示す情報（抽出領域情報）とともに出力してもよい。また、認識型オブジェクト検出部１４１は、特徴量自体を認識型オブジェクト検出部１４１内部で保持しておき、その特徴量を識別するための情報のみを出力してもよい。 In addition, the recognition-type object detection unit 141 may extract the appearance features of the object at this point in preparation for the object detection by the non-recognition-type object detection unit 142 described later. The appearance features of an object include information such as its color, pattern, and shape, but the present invention is not limited to these. The recognition-type object detection unit 141 extracts these feature quantities as the appearance features of the object. The region used for object detection by the recognition-type object detection unit 141 and the region used for object detection by the non-recognition-type object detection unit 142 do not have to be the same. For example, suppose that the object is a person, that the recognition-type object detection unit 141 detects the head, and that the non-recognition-type object detection unit 142 detects a region extending to the clothing. In this case, the recognition-type object detection unit 141 extracts the feature quantities of the appearance features so that the clothing region is included. The recognition-type object detection unit 141 may then output the extracted feature quantities as template information, together with information indicating the region used for the extraction (extraction region information). Alternatively, the recognition-type object detection unit 141 may hold the feature quantities themselves internally and output only information for identifying them.

検出パラメータ更新部１４３は、個別座標変換部１３０から、個別座標系で表現された追跡情報を受信する。そして、検出パラメータ更新部１４３は、この追跡情報を用いて、オブジェクト検出に必要なパラメータ（検出パラメータと呼ぶ）を求める。この検出パラメータは、オブジェクト検出処理で必要となるパラメータ類である。検出パラメータには、例えば、オブジェクトの現フレームにおける予測位置（予測領域）、オブジェクト検出を適用する探索範囲、テンプレートマッチングに用いるテンプレートのサイズ、以前にトラッカーに対応付いたターゲットのテンプレートの特徴量（テンプレート情報）等が含まれる。なお、検出パラメータには、これら全ての情報が含まれていなくてもよく、非認識型オブジェクト検出部１４２によるオブジェクト検出に必要なパラメータが含まれていればよい。また、検出パラメータには、前フレームにおけるオブジェクトの追跡結果でトラッカーと対応づいたターゲットの情報が含まれてもよい。 The detection parameter update unit 143 receives tracking information expressed in the individual coordinate system from the individual coordinate conversion unit 130. The detection parameter update unit 143 then uses this tracking information to obtain the parameters necessary for object detection (referred to as detection parameters). The detection parameters are the parameters required by the object detection processing. They include, for example, the predicted position (predicted region) of the object in the current frame, the search range to which object detection is applied, the size of the template used for template matching, and the feature quantities of the template of a target previously associated with a tracker (template information). The detection parameters need not include all of this information, as long as they include the parameters that the non-recognition-type object detection unit 142 needs to detect an object. The detection parameters may also include information on the target that was associated with a tracker in the tracking result for the previous frame.

例えば、検出パラメータ更新部１４３は、前フレームで検出され、トラッカーに対応付けられたターゲット（オブジェクト）に対して、該オブジェクトの追跡情報に基づいて、現フレームにおけるオブジェクトが存在する位置を予測位置として求める。この予測処理は、第１の実施の形態に係る探索範囲設定部１１２における予測位置の予測処理と同様である。なお、検出パラメータ更新部１４３は、この予測位置を含む領域を予測領域として求めてもよい。 For example, for a target (object) that was detected in the previous frame and associated with a tracker, the detection parameter update unit 143 obtains, as the predicted position, the position at which the object exists in the current frame, based on the tracking information of the object. This prediction process is the same as the prediction of the predicted position by the search range setting unit 112 according to the first embodiment. The detection parameter update unit 143 may also obtain a region containing this predicted position as the predicted region.

また、例えば、検出パラメータ更新部１４３は、上記予測領域を中心として、テンプレートマッチングによるオブジェクト検出を適用する範囲を求め、この範囲を検出パラメータに含まれるオブジェクトの探索範囲として含めてもよい。 Further, for example, the detection parameter updating unit 143 may obtain a range to which the object detection by the template matching is applied, centered on the prediction region, and include this range as the search range of the objects included in the detection parameter.
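The two steps above (predicting the position in the current frame, then placing a search range around it) can be sketched as follows. The prediction method of the first embodiment is not reproduced in this section, so a constant-velocity model is assumed here purely for illustration; the function names and the `margin` parameter are likewise hypothetical.

```python
def predict_position(last_pos, velocity, dt):
    """Predict the position in the current frame.

    Assumes constant velocity (an assumption of this sketch, not a method
    stated in the text). Positions are in the camera's individual
    coordinate system, e.g. pixels.
    """
    return (last_pos[0] + velocity[0] * dt, last_pos[1] + velocity[1] * dt)


def search_range(predicted, box_size, margin):
    """Return (left, top, right, bottom) centred on the predicted position.

    box_size is the (width, height) of the predicted region and margin is
    the extra distance searched on every side.
    """
    cx, cy = predicted
    half_w = box_size[0] / 2 + margin
    half_h = box_size[1] / 2 + margin
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


predicted = predict_position(last_pos=(100.0, 50.0), velocity=(4.0, -2.0), dt=0.5)
print(predicted)
print(search_range(predicted, box_size=(20.0, 40.0), margin=10.0))
```

A larger `margin` tolerates larger prediction errors at the price of more template comparisons.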

検出パラメータ更新部１４３は、上記検出パラメータを、追跡情報に含まれる各オブジェクトに対して求める。そして、検出パラメータ更新部１４３は、求めた検出パラメータを、オブジェクト検出処理に用いる検出パラメータとして更新する。そして、検出パラメータ更新部１４３は、この検出パラメータを非認識型オブジェクト検出部１４２に出力する。 The detection parameter updating unit 143 obtains the detection parameter for each object included in the tracking information. Then, the detection parameter updating unit 143 updates the obtained detection parameter as a detection parameter used in the object detection process. Then, the detection parameter update unit 143 outputs this detection parameter to the non-recognition type object detection unit 142.

なお、検出パラメータ更新部１４３は、上述した第１の実施の形態に係る検出部１００の探索範囲設定部１１２と同様に、個別座標系の座標値に変換された追跡情報を用いて、オブジェクトの探索範囲を求めてもよい。そして、検出パラメータ更新部１４３は、求めたオブジェクトの探索範囲を示す探索範囲情報を、認識型オブジェクト検出部１４１に出力してもよい。 Note that, like the search range setting unit 112 of the detection unit 100 according to the first embodiment described above, the detection parameter update unit 143 may obtain the search range of an object using the tracking information converted into coordinate values of the individual coordinate system. The detection parameter update unit 143 may then output search range information indicating the obtained search range to the recognition-type object detection unit 141.

非認識型オブジェクト検出部１４２は、検出パラメータ更新部１４３から、検出パラメータを受信する。そして、非認識型オブジェクト検出部１４２は、受信した検出パラメータに基づいて、非認識型オブジェクト検出部１４２に入力されるカメラ映像からオブジェクトを検出する。この非認識型オブジェクト検出部１４２は、認識型オブジェクト検出部１４１とは異なり、前のフレームにおいて検出されたオブジェクトの外見の類似性に基づいてオブジェクトの検出を行う。 The non-recognition type object detection unit 142 receives the detection parameter from the detection parameter update unit 143. Then, the non-recognition type object detection unit 142 detects an object from the camera image input to the non-recognition type object detection unit 142 based on the received detection parameter. Unlike the recognition-type object detection unit 141, the non-recognition-type object detection unit 142 detects an object based on the appearance similarity of the object detected in the previous frame.

即ち、非認識型オブジェクト検出部１４２は、前のフレームにおいて、オブジェクトが検出された際、その領域の画像特徴（または検出領域の部分画像そのものでもよい）をテンプレートとして記憶しておく。そして、非認識型オブジェクト検出部１４２は、この記憶したテンプレートと類似する領域が現フレームに存在するかどうかをテンプレートマッチングにより調べることによって、オブジェクト検出を行う。この際に用いる画像の特徴としては、例えば、色のパターンおよび分布に関する情報、エッジおよび輝度勾配の分布情報、あるいは、これらを組み合わせてできる特徴等を用いることができる。 That is, when the object is detected in the previous frame, the non-recognition type object detection unit 142 stores the image feature of the area (or the partial image of the detection area itself) as a template. Then, the non-recognition type object detection unit 142 detects an object by checking by template matching whether or not an area similar to the stored template exists in the current frame. As the characteristics of the image used at this time, for example, information on color patterns and distributions, distribution information on edges and luminance gradients, or characteristics formed by combining these can be used.

非認識型オブジェクト検出部１４２におけるオブジェクト検出を行う際に使用する検出パラメータは、検出パラメータ更新部１４３から出力される検出パラメータによって制御される。具体的には、非認識型オブジェクト検出部１４２は、検出パラメータ更新部１４３によって予測された、オブジェクトの予測位置（予測領域）およびその近辺に対してテンプレートマッチングによるオブジェクト検出を行う。即ち、非認識型オブジェクト検出部１４２は、予測されるオブジェクト存在範囲（予測領域）を中心として、テンプレートマッチングの探索範囲を設定し、その周辺に対してテンプレートマッチングを行う。また、この際、非認識型オブジェクト検出部１４２は、オブジェクトの位置の移動によってオブジェクトの見かけの大きさが変化することも考慮してもよい。この変化は、カメラパラメータを用いることによって算出可能である。そのため、非認識型オブジェクト検出部１４２は、オブジェクトの大きさの変化を計算し、テンプレートに反映させてからテンプレートマッチングを行うようにしてもよい。 The detection parameter used when the non-recognition type object detection unit 142 detects an object is controlled by the detection parameter output from the detection parameter update unit 143. Specifically, the non-recognition type object detection unit 142 performs object detection by template matching on the predicted position (predicted region) of the object predicted by the detection parameter updating unit 143 and its vicinity. That is, the non-recognition type object detection unit 142 sets the search range of template matching centering on the predicted object existence range (prediction area), and performs template matching on the periphery thereof. At this time, the non-recognition type object detection unit 142 may also consider that the apparent size of the object changes due to the movement of the position of the object. This change can be calculated by using camera parameters. Therefore, the non-recognition type object detection unit 142 may perform template matching after calculating the change in the size of the object and reflecting it in the template.
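Template matching restricted to a search range can be sketched as below. The text does not fix a similarity measure, so the sum of absolute differences (SAD) on grayscale values is used here as an assumed stand-in; images are plain nested lists to keep the example self-contained, and all names are hypothetical.

```python
def sad(image, template, top, left):
    """Sum of absolute differences between the template and an image patch."""
    total = 0
    for r, row in enumerate(template):
        for c, value in enumerate(row):
            total += abs(image[top + r][left + c] - value)
    return total


def match_template(image, template, search):
    """Find the best template placement inside a search range.

    search = (top, left, bottom, right) gives inclusive bounds for the
    template's top-left corner. Returns (score, y, x) with the lowest SAD,
    or None when no placement fits inside the image.
    """
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    top, left, bottom, right = search
    best = None
    for y in range(max(0, top), min(bottom, ih - th) + 1):
        for x in range(max(0, left), min(right, iw - tw) + 1):
            score = sad(image, template, y, x)
            if best is None or score < best[0]:
                best = (score, y, x)
    return best


image = [
    [0, 0, 0, 0],
    [0, 9, 8, 0],
    [0, 7, 6, 0],
    [0, 0, 0, 0],
]
template = [[9, 8], [7, 6]]
print(match_template(image, template, search=(0, 0, 2, 2)))
```

To account for the change in apparent size mentioned above, the template would be rescaled before calling `match_template`; that step depends on the camera parameters and is omitted here.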

また、非認識型オブジェクト検出部１４２がテンプレートマッチングを行うテンプレートの情報は、前のフレームにおけるオブジェクト検出処理において、認識型オブジェクト検出部１４１が抽出した特徴量であってもよい。 The template information that the non-recognition-type object detection unit 142 uses for template matching may be the feature quantities extracted by the recognition-type object detection unit 141 during the object detection process for the previous frame.

このように、非認識型オブジェクト検出部１４２は、統合追跡部２００によって追跡された物体に対する追跡結果に基づいて、オブジェクト検出を行うため、上記追跡結果を用いない場合に比べ、検出の精度を向上させることができる。 In this way, because the non-recognition-type object detection unit 142 performs object detection based on the tracking result for the object tracked by the integrated tracking unit 200, it can improve the detection accuracy compared with the case where the tracking result is not used.

そして、非認識型オブジェクト検出部１４２は、オブジェクト検出の検出結果を第２の検出結果として、検出結果統合部１４４に出力する。 Then, the non-recognition type object detection unit 142 outputs the detection result of the object detection as the second detection result to the detection result integration unit 144.

検出結果統合部１４４は、認識型オブジェクト検出部１４１から第１の検出結果を受信する。また、検出結果統合部１４４は、非認識型オブジェクト検出部１４２から第２の検出結果を受信する。そして、検出結果統合部１４４は、第１の検出結果と、第２の検出結果とを統合する。そして、検出結果統合部１４４は、統合した結果をオブジェクト検出部１４０におけるオブジェクト検出の検出結果として、共通座標変換部１２０に出力する。 The detection result integration unit 144 receives the first detection result from the recognition-type object detection unit 141. Further, the detection result integration unit 144 receives the second detection result from the non-recognition type object detection unit 142. Then, the detection result integration unit 144 integrates the first detection result and the second detection result. Then, the detection result integration unit 144 outputs the integrated result to the common coordinate conversion unit 120 as a detection result of object detection by the object detection unit 140.

第１の検出結果および第２の検出結果の両方に含まれているオブジェクトと、どちらか一方のみに含まれているオブジェクトとが存在する場合がある。そのため、検出結果統合部１４４は、第１の検出結果と第２の検出結果とのそれぞれに含まれるオブジェクト同士の対応付けを行い、統合する。この対応付けには、例えば、オブジェクト領域の重なりの度合いを用いることができる。 There may be an object included in both the first detection result and the second detection result, and an object included in only one of them. Therefore, the detection result integration unit 144 associates and integrates the objects included in each of the first detection result and the second detection result. For this association, for example, the degree of overlapping of object areas can be used.

即ち、検出結果統合部１４４は、オブジェクト領域同士の重なり比率（例えば、オブジェクト外接矩形の重なり比率）を算出し、これが所定の値より大きくなる場合に第１の検出結果に含まれるオブジェクトと、第２の検出結果に含まれるオブジェクトとを対応付ける。 That is, the detection result integration unit 144 calculates the overlap ratio between object regions (for example, the overlap ratio of the rectangles circumscribing the objects), and when this ratio is larger than a predetermined value, associates the object included in the first detection result with the object included in the second detection result.
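A threshold-based association of this kind can be sketched as follows. The text does not define the overlap ratio precisely; intersection over union (IoU) of the bounding rectangles is assumed here as one common choice, and the function names are hypothetical.

```python
def overlap_ratio(a, b):
    """Intersection over union of two boxes given as (left, top, right, bottom)."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # the boxes do not overlap
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def associate_by_overlap(first, second, threshold):
    """Return index pairs (i, j) whose overlap ratio exceeds the threshold."""
    return [
        (i, j)
        for i, box_a in enumerate(first)
        for j, box_b in enumerate(second)
        if overlap_ratio(box_a, box_b) > threshold
    ]


first = [(0, 0, 2, 2)]                       # boxes from the first detection result
second = [(1, 0, 3, 2), (10, 10, 12, 12)]    # boxes from the second detection result
print(associate_by_overlap(first, second, threshold=0.3))
```

With plain thresholding one object may be paired with several candidates, which is one reason to consider a one-to-one assignment instead.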

また、検出結果統合部１４４は、オブジェクト間の領域の重なり比率を重みとするグラフ問題として定式化し、オブジェクト間の対応付けを行ってもよい。例えば、検出結果統合部１４４は、重なり比率を単調非増加関数によってコストに変換したのち、ハンガリアン法等を用いて、最適な対応付けを計算することにより、オブジェクト間の対応付けを行う。 Alternatively, the detection result integration unit 144 may formulate the association as a graph problem in which the overlap ratios between object regions are the weights. For example, the detection result integration unit 144 converts each overlap ratio into a cost using a monotonically non-increasing function and then associates the objects by computing the optimal assignment with the Hungarian method or the like.
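The cost-based formulation can be illustrated with the sketch below. Two things are assumptions of this example: the cost function `1 - overlap` (one possible monotonically non-increasing function) and the solver, which enumerates all assignments by brute force instead of implementing the Hungarian method. For small inputs the brute-force search returns the same minimum-cost assignment that the Hungarian method would find, but it does not scale.

```python
from itertools import permutations


def overlap_to_cost(ratio):
    """Convert an overlap ratio into a cost; larger overlap gives lower cost."""
    return 1.0 - ratio


def optimal_assignment(cost):
    """Return the minimum-total-cost one-to-one assignment as (row, col) pairs.

    cost[i][j] is the cost of pairing row i with column j. The number of
    rows must not exceed the number of columns. Exhaustive search is used
    as a simple stand-in for the Hungarian method.
    """
    n_rows, n_cols = len(cost), len(cost[0])
    best_total, best_cols = None, None
    for cols in permutations(range(n_cols), n_rows):
        total = sum(cost[i][j] for i, j in enumerate(cols))
        if best_total is None or total < best_total:
            best_total, best_cols = total, cols
    return list(enumerate(best_cols))


# Overlap ratios between two objects of the first result (rows)
# and two objects of the second result (columns).
overlap = [[0.9, 0.8], [0.85, 0.1]]
cost = [[overlap_to_cost(r) for r in row] for row in overlap]
print(optimal_assignment(cost))
```

In this example a greedy choice would pair row 0 with column 0 because 0.9 is the largest overlap, leaving row 1 with a poor match; the optimal assignment avoids that.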

検出結果統合部１４４は、対応付けを行った結果、対応付けの際に用いた値（例えば、重なり比率またはコスト）が、所定の値より大きいものはこの時点でマージしてもよい。また、検出結果統合部１４４は、この時点ではマージせず、対応づくという情報を生成してもよい。そして、検出結果統合部１４４は、第１の検出結果と第２の検出結果とを合わせた検出結果に、該対応付くという情報を付随させた結果をオブジェクト検出部１４０の検出結果として出力し、統合追跡時に対応付けの情報を用いて追跡を行うようにしてもよい。 After the association, the detection result integration unit 144 may merge, at this point, those pairs for which the value used in the association (for example, the overlap ratio or the cost) is larger than a predetermined value. Alternatively, the detection result integration unit 144 may refrain from merging at this point and instead generate information indicating that the objects correspond to each other. In that case, the detection result integration unit 144 may output, as the detection result of the object detection unit 140, the combination of the first and second detection results accompanied by this correspondence information, so that the correspondence information can be used during integrated tracking.

また、非認識型オブジェクト検出部１４２は、前フレームに対するオブジェクトの追跡結果に基づいて、第２の検出結果を出力する。このため、この第２の検出結果の方が、第１の検出結果よりも遅れて生成される場合がある。このような場合には、検出結果統合部１４４は、第１の検出結果を、一旦、検出結果統合部１４４内のバッファ等の記憶手段または記憶部１６０に蓄えておく。そして、検出結果統合部１４４は、該第１の検出結果を生成する対象となるフレームに対応するフレームに対する第２の検出結果を受信した時点で、両結果を統合してもよい。 In addition, the non-recognition type object detection unit 142 outputs the second detection result based on the tracking result of the object with respect to the previous frame. Therefore, the second detection result may be generated later than the first detection result. In such a case, the detection result integration unit 144 temporarily stores the first detection result in the storage unit such as a buffer in the detection result integration unit 144 or the storage unit 160. Then, the detection result integration unit 144 may integrate both results at the time when the second detection result for the frame corresponding to the frame for which the first detection result is generated is received.
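The buffering behaviour described above can be sketched as follows. The class `ResultPairer`, the use of a frame identifier as the key, and the representation of a detection result as an opaque value are assumptions made for this illustration.

```python
class ResultPairer:
    """Hold first detection results until the matching second result arrives."""

    def __init__(self):
        # frame_id -> first detection result waiting for its counterpart
        self._pending_first = {}

    def add_first(self, frame_id, result):
        """Store the first (recognition-type) result for a frame."""
        self._pending_first[frame_id] = result

    def add_second(self, frame_id, result):
        """Pair the second (non-recognition-type) result with the stored first one.

        Returns (first, second) when both exist for the frame, otherwise None.
        """
        first = self._pending_first.pop(frame_id, None)
        if first is None:
            return None
        return (first, result)


pairer = ResultPairer()
pairer.add_first(frame_id=7, result=["head@(10,20)"])
print(pairer.add_second(frame_id=7, result=["body@(11,22)"]))
```

The returned pair is what the integration step would then associate and merge.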

以上のように、本実施の形態に係る物体追跡装置１０の検出部４００は、認識型オブジェクト検出部１４１によるオブジェクト検出の結果（第１の検出結果）と、非認識型オブジェクト検出部１４２によるオブジェクト検出の結果（第２の検出結果）とを統合した結果を、検出結果として出力する。このとき、非認識型オブジェクト検出部１４２は、統合追跡部２００によって追跡された物体に対する追跡結果に基づいて、テンプレートマッチングを行うことにより、オブジェクトを検出する。これにより、検出部４００は、オブジェクトを識別することによるオブジェクト検出（認識型オブジェクト検出）のみを行う場合に比べ、よりオブジェクト検出の精度をより向上させることができる。 As described above, the detection unit 400 of the object tracking device 10 according to the present embodiment uses the result (first detection result) of object detection by the recognition-type object detection unit 141 and the object by the non-recognition-type object detection unit 142. The result obtained by integrating the detection result (second detection result) is output as the detection result. At this time, the non-recognition type object detection unit 142 detects the object by performing template matching based on the tracking result for the object tracked by the integrated tracking unit 200. As a result, the detection unit 400 can further improve the accuracy of object detection as compared with the case where only object detection by identifying an object (recognition-type object detection) is performed.

したがって、物体追跡装置１０は、より高精度にオブジェクトの追跡を行うことができる。 Therefore, the object tracking device 10 can track the object with higher accuracy.

＜第４の実施の形態＞ <Fourth Embodiment>
次に、本発明の第４の実施の形態について、図面を参照して説明する。なお、説明の便宜上、前述した各実施の形態で説明した図面に含まれる部材と同じ機能を有する部材については、同じ符号を付し、その説明を省略する。 Next, a fourth embodiment of the present invention will be described with reference to the drawings. For convenience of description, members having the same functions as the members included in the drawings described in the above-described embodiments are designated by the same reference numerals, and the description thereof will be omitted.

本実施の形態に係る物体追跡装置１０は、図１に示した物体追跡装置１０の統合追跡部２００代わりに、統合追跡部５００を備える構成である。この統合追跡部５００の構成について、図１７を参照して説明を行う。図１７は、本実施の形態に係る物体追跡装置１０の統合追跡部５００の機能構成の一例を示す機能ブロック図である。図１７に示す通り、統合追跡部５００は、バッファ部５１０と、予測部２１０と、記憶部２２０と、対応付け部５３０と、更新部２４０と、を備えている。 The object tracking device 10 according to the present embodiment includes an integrated tracking unit 500 instead of the integrated tracking unit 200 of the object tracking device 10 shown in FIG. 1. The configuration of the integrated tracking unit 500 will be described with reference to FIG. 17. FIG. 17 is a functional block diagram showing an example of the functional configuration of the integrated tracking unit 500 of the object tracking device 10 according to this embodiment. As shown in FIG. 17, the integrated tracking unit 500 includes a buffer unit 510, a prediction unit 210, a storage unit 220, an associating unit 530, and an updating unit 240.

本実施の形態では、第１の実施の形態に係る物体追跡装置１０の統合追跡部２００の代わりに、統合追跡部５００を備える構成を例に説明を行う。なお、本発明はこれに限定さえるものではなく、第２の実施の形態に係る物体追跡装置５０の統合追跡部２００の代わりに統合追跡部５００を備える構成であってもよい。つまり、本実施の形態に係る統合追跡部５００は、表示制御部３００に表示対象となるデータを出力する構成であってもよい。 In the present embodiment, a configuration including an integrated tracking unit 500 instead of the integrated tracking unit 200 of the object tracking device 10 according to the first embodiment will be described as an example. It should be noted that the present invention is not limited to this, and an integrated tracking unit 500 may be provided instead of the integrated tracking unit 200 of the object tracking device 50 according to the second embodiment. That is, the integrated tracking unit 500 according to the present embodiment may be configured to output the data to be displayed to the display control unit 300.

また、本実施の形態に係る統合追跡部５００に検出結果を出力する検出部は、第３の実施の形態において説明した検出部４００であってもよい。 Further, the detection unit that outputs the detection result to the integrated tracking unit 500 according to the present embodiment may be the detection unit 400 described in the third embodiment.

バッファ部５１０は、検出部１００から出力される共通座標系で表現された検出結果を一時的に格納する手段である。そして、バッファ部５１０にバッファリングされたデータ（検出結果）のうち、検出が行われたカメラ映像に含まれる時間情報が所定期間内であるデータは、対応付け部５３０によって取得される。この所定期間は周期的な期間である。そして、対応付け部５３０は、ある周期で取得した１または複数の検出結果を用いて、オブジェクト追跡を行う。このように、本実施の形態における統合追跡部５００は、各カメラ２０からの映像のうち、複数のカメラの映像を用いてオブジェクト追跡を行うため、一括追跡部とも呼ぶ。 The buffer unit 510 is a unit that temporarily stores the detection result expressed by the common coordinate system output from the detection unit 100. Then, among the data (detection result) buffered in the buffer unit 510, the data in which the time information included in the detected camera image is within the predetermined period is acquired by the associating unit 530. This predetermined period is a periodic period. Then, the associating unit 530 performs object tracking using one or a plurality of detection results acquired in a certain cycle. As described above, the integrated tracking unit 500 according to the present embodiment performs object tracking using images from a plurality of cameras among images from each camera 20, and is therefore also referred to as a collective tracking unit.

この統合追跡部５００が行う、オブジェクト追跡（一括統合追跡とも呼ぶ。）について、図１８を用いて説明する。図１８は、本実施の形態に係る統合追跡部５００が行うオブジェクトの一括追跡処理を説明するための図である。図１８には、図７と同様に、カメラ数が３つの場合に、カメラＡ、カメラＢ、カメラＣの夫々で画像を取得するタイミングの一例を示している。図１８において、横軸は、時間軸を示しており、右側にいくほど、時間的に後であることを示している。図１８に示す通り、カメラＡは時間ｔ１、ｔ５およびｔ８で画像を取得している。同様に、カメラＢは、時間ｔ２、ｔ４、ｔ６およびｔ９で画像を取得し、カメラＣは時間ｔ３およびｔ７で画像を取得している。 The object tracking performed by the integrated tracking unit 500 (also called collective integrated tracking) will be described with reference to FIG. 18. FIG. 18 is a diagram for explaining the collective tracking process of objects performed by the integrated tracking unit 500 according to the present embodiment. As in FIG. 7, FIG. 18 shows an example of the timings at which camera A, camera B, and camera C each acquire images when there are three cameras. In FIG. 18, the horizontal axis is the time axis, and positions further to the right are later in time. As shown in FIG. 18, camera A acquires images at times t1, t5, and t8. Similarly, camera B acquires images at times t2, t4, t6, and t9, and camera C acquires images at times t3 and t7.

そして、これらの各タイミングで取得された画像は、順にオブジェクト検出が行われる。以下では、説明の便宜上、図１８に示す時間は、オブジェクトの検出結果が出力された時間とほぼ同じであるとみなして説明を行う。つまり、時間ｔ１は、カメラＡによって撮影された映像のフレームに対する検出結果が検出部１００から出力され、バッファ部５１０にバッファリングされた時間であるとする。 Then, objects are sequentially detected from the images acquired at each of these timings. In the following, for convenience of description, it is assumed that the time shown in FIG. 18 is substantially the same as the time when the detection result of the object is output. That is, it is assumed that the time t1 is the time when the detection result for the frame of the video image captured by the camera A is output from the detection unit 100 and buffered in the buffer unit 510.

また、図１８の最下部の時間軸は、周期的な期間の一例を示している。 The lowermost time axis in FIG. 18 shows an example of a periodic period.

対応付け部５３０は、バッファ部５１０にバッファリングされた１または複数の検出結果のうち、物体の検出の対象となったカメラ映像に含まれる時間情報が所定期間内である検出結果を取得する。なお、上述したとおり、この所定期間は周期的な期間である。また、本実施の形態では、バッファリングされた時間と、カメラ映像の時間とは同じであるとみなしている。そのため、対応付け部５３０は、バッファ部５１０にバッファリングされた１または複数の検出結果を、所定の周期で取得するともいえる。 The associating unit 530 acquires, from the one or more detection results buffered in the buffer unit 510, a detection result in which the time information included in the camera image that is the object detection target is within a predetermined period. As described above, this predetermined period is a periodic period. Further, in the present embodiment, it is assumed that the buffered time and the camera image time are the same. Therefore, it can be said that the associating unit 530 acquires one or a plurality of detection results buffered in the buffer unit 510 at a predetermined cycle.

具体的には、対応付け部５３０は、まず、最初の期間Ｔ１でバッファリングされた検出結果を取得する。つまり、対応付け部５３０は、時間ｔ１、ｔ２、ｔ３でバッファリングされた検出結果を取得する。時間ｔ１でバッファリングされた検出結果は、カメラＡによって撮影された映像のフレームに対する検出結果である。また、時間ｔ２でバッファリングされた検出結果は、カメラＢによって撮影された映像のフレームに対する検出結果であり、時間ｔ３でバッファリングされた検出結果は、カメラＣによって撮影された映像のフレームに対する検出結果である。したがって、対応付け部５３０は、バッファ部５１０に所定期間（この場合Ｔ１）内でバッファリングされた検出結果であって、複数のカメラ２０の夫々で撮影された映像のフレームに対する、複数の検出結果を、バッファ部５１０から取得する。そして、対応付け部５３０は、取得した検出結果を用いて、オブジェクト追跡を行う。 Specifically, the associating unit 530 first acquires the detection results buffered in the first period T1. That is, the associating unit 530 acquires the detection results buffered at times t1, t2, and t3. The detection result buffered at time t1 is the detection result for a frame of the video captured by camera A. The detection result buffered at time t2 is the detection result for a frame of the video captured by camera B, and the detection result buffered at time t3 is the detection result for a frame of the video captured by camera C. The associating unit 530 therefore acquires, from the buffer unit 510, a plurality of detection results that were buffered within the predetermined period (T1 in this case) and that correspond to frames of the videos captured by the respective cameras 20. The associating unit 530 then performs object tracking using the acquired detection results.

また、対応付け部５３０は、同様に、期間Ｔ２、Ｔ３、Ｔ４においても、この周期的な期間内でバッファリングされた検出結果を取得し、オブジェクト追跡を行う。 Similarly, the associating unit 530 also obtains the buffered detection results within this periodic period in the periods T2, T3, and T4, and performs object tracking.
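Grouping buffered detection results into periodic windows such as T1, T2, ... can be sketched as follows. The numeric timestamps, the `period` and `start` values, and the dictionary layout of a detection result are hypothetical; they are chosen only so that the first window contains one frame from each of cameras A, B, and C, as T1 does in the description.

```python
from collections import defaultdict


def group_by_period(detections, period, start=0.0):
    """Group detection results by the periodic window their timestamp falls in.

    Window k covers the half-open interval
    [start + k * period, start + (k + 1) * period).
    """
    groups = defaultdict(list)
    for det in detections:
        index = int((det["time"] - start) // period)
        groups[index].append(det)
    return dict(groups)


detections = [
    {"camera": "A", "time": 1.0},
    {"camera": "B", "time": 2.0},
    {"camera": "C", "time": 3.0},
    {"camera": "B", "time": 4.0},
]
groups = group_by_period(detections, period=3.0, start=0.5)
print([d["camera"] for d in groups[0]])  # contents of the first window
```

Each group is then processed in one collective tracking step, regardless of which camera produced each result.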

なお、本実施の形態では、対応付け部５３０がバッファ部５１０にバッファリングされたデータ（複数の検出結果）を、所定の周期でバッファ部５１０から取得する構成について説明するが、対応付け部５３０は、所定の周期でバッファ部５１０からこのデータを受信する構成であってもよい。つまり、バッファ部５１０は、このデータを、所定の周期で対応付け部５３０に出力する機能を有してもよい。 In the present embodiment, a configuration is described in which the associating unit 530 acquires the data (a plurality of detection results) buffered in the buffer unit 510 from the buffer unit 510 at a predetermined cycle. However, the associating unit 530 may instead be configured to receive this data from the buffer unit 510 at a predetermined cycle. That is, the buffer unit 510 may have a function of outputting this data to the associating unit 530 at a predetermined cycle.

図１７に戻り、統合追跡部５００の対応付け部５３０について更に説明する。 Returning to FIG. 17, the associating unit 530 of the integrated tracking unit 500 will be further described.

対応付け部５３０は、取得した各検出結果に含まれるターゲットの位置から、ターゲット間の距離を求め、該距離が近いターゲット同士を互いに対応付ける。このとき、対応付け部５３０は、ターゲット間の距離を用いて、ハンガリアン法等の手法によって、対応付けを行う。また、対応付け部５３０は、ターゲット間の距離に加え、ターゲットの外見特徴の類似性も同時に用いてもよい。例えば、位置が近く、近似した色を有するターゲット同士は同一のオブジェクトである可能性が高い。よって、対応付け部５３０は、このような特徴を用いて対応付けを行ってもよい。なお、外見特徴の類似性を判定するための特徴は、色に限定されず、例えば、ターゲットの模様等であってもよい。 The associating unit 530 obtains the distance between the targets from the positions of the targets included in the acquired detection results, and associates the targets having the short distances with each other. At this time, the associating unit 530 uses the distance between the targets and makes the association by a method such as the Hungarian method. In addition to the distance between the targets, the associating unit 530 may also use the similarity of the appearance features of the targets at the same time. For example, targets that are close in position and have similar colors are likely to be the same object. Therefore, the associating unit 530 may make the association using such a feature. The feature for determining the similarity of the appearance features is not limited to the color, and may be, for example, the pattern of the target.
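An association that combines position distance with appearance similarity can be sketched as below. Several simplifications are assumed: a greedy nearest-pair strategy is used in place of the Hungarian method named in the text, appearance is reduced to a single color triple per target, and the cost formula, the `color_weight` parameter, and the `max_cost` gate are hypothetical.

```python
import math


def pair_cost(a, b, color_weight=1.0):
    """Cost of treating two targets as the same object.

    Combines the distance between their positions in the common coordinate
    system with the distance between their (normalised) colors.
    """
    distance = math.dist(a["pos"], b["pos"])
    color_gap = math.dist(a["color"], b["color"])
    return distance + color_weight * color_gap


def associate_targets(targets_a, targets_b, max_cost):
    """Greedily pair targets from two detection results, cheapest pairs first."""
    candidates = sorted(
        (pair_cost(a, b), i, j)
        for i, a in enumerate(targets_a)
        for j, b in enumerate(targets_b)
    )
    used_a, used_b, pairs = set(), set(), []
    for cost, i, j in candidates:
        if cost > max_cost:
            break  # remaining candidates are even more expensive
        if i in used_a or j in used_b:
            continue  # each target is paired at most once
        used_a.add(i)
        used_b.add(j)
        pairs.append((i, j))
    return pairs


camera_a = [
    {"pos": (0.0, 0.0), "color": (1.0, 0.0, 0.0)},
    {"pos": (5.0, 5.0), "color": (0.0, 0.0, 1.0)},
]
camera_b = [
    {"pos": (5.2, 5.0), "color": (0.0, 0.0, 1.0)},
    {"pos": (0.1, 0.0), "color": (1.0, 0.0, 0.0)},
    {"pos": (20.0, 20.0), "color": (0.0, 1.0, 0.0)},
]
print(associate_targets(camera_a, camera_b, max_cost=1.0))
```

Targets left unpaired, such as the third target of `camera_b`, are treated as being seen by only one camera.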

そして、対応付け部５３０は、互いに対応付けされたターゲットに対する検出結果同士を統合する。つまり、対応付け部５３０は、ターゲット間の対応付けを行った後、対応付いたターゲットの夫々の検出結果を用いて、オブジェクトの位置を求める。この際、対応付け部５３０は、各ターゲットの尤度および／または予測位置の確度を評価し、この確度が最大となる位置を、オブジェクトの位置としてもよい。 Then, the associating unit 530 integrates the detection results of the targets associated with each other. That is, the association unit 530 obtains the position of the object by using the detection result of each of the associated targets after performing the association between the targets. At this time, the associating unit 530 may evaluate the likelihood of each target and/or the accuracy of the predicted position, and use the position with the highest accuracy as the position of the object.

また、対応付け部５３０は、各ターゲットに対するカメラ２０の角度（俯角または仰角）および該カメラ２０からターゲットまでの距離等によって定まる予測位置の確度に基づいて、各ターゲットの位置に対し重みづけをしてもよい。そして、対応付け部５３０は、重みづけした位置から、例えば、平均値などの統計量を算出し、該算出した統計量によって示される位置を、オブジェクトの位置としてもよい。 The associating unit 530 also weights the position of each target based on the accuracy of the predicted position determined by the angle (depression angle or elevation angle) of the camera 20 with respect to each target and the distance from the camera 20 to the target. May be. Then, the associating unit 530 may calculate a statistic such as an average value from the weighted positions, and use the position indicated by the calculated statistic as the position of the object.
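The weighted statistic mentioned above can be sketched as a weighted mean. How a weight is derived from the camera angle and distance is not specified in this section, so the weights are simply given as inputs here; the function name is hypothetical.

```python
def weighted_position(observations):
    """Weighted mean of target positions.

    observations is a list of ((x, y), weight) pairs, where the weight
    reflects how reliable the position estimate from that camera is.
    """
    total = sum(weight for _, weight in observations)
    if total <= 0:
        raise ValueError("at least one observation needs a positive weight")
    x = sum(pos[0] * weight for pos, weight in observations) / total
    y = sum(pos[1] * weight for pos, weight in observations) / total
    return (x, y)


# The second camera is assumed to see the target three times more reliably.
print(weighted_position([((0.0, 0.0), 1.0), ((4.0, 2.0), 3.0)]))
```

The result lies closer to the position reported by the more reliable camera.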

そして、対応付け部５３０は、求めたオブジェクトの位置を、検出結果を取得した周期に対するターゲットの位置とする。対応付け部５３０は、このターゲットの位置を用いて、第１の実施の形態に係る対応付け部２３０と同様に、対応付けを行う。また、統合追跡部５００による、対応付けの処理およびその後の処理については、第１の実施の形態において説明した統合追跡部２００における処理と同様であるため、説明を省略する。 Then, the associating unit 530 sets the obtained position of the object as the position of the target with respect to the cycle in which the detection result is acquired. The associating unit 530 uses the position of this target to perform the association in the same manner as the associating unit 230 according to the first embodiment. Further, the association processing and the subsequent processing by the integrated tracking unit 500 are the same as the processing by the integrated tracking unit 200 described in the first embodiment, and thus the description thereof will be omitted.

また、対応付け部５３０は、ターゲット間の対応付けを行う前に、各ターゲットとトラッカーとの対応付けを行い、統合してもよい。つまり、対応付け部５３０は、同じトラッカーに対応付けされたターゲットが複数ある場合、これらのターゲットの間でマージ処理を行う。この場合、対応付け部５３０は、各ターゲットの尤度および／または予測位置の確度がより高いものを優先して、マージを行ってもよい。このように、対応付け部５３０は、これらの情報に基づいて、検出結果を評価し、同じトラッカーに対応付いたターゲットを統合してもよい。 Alternatively, before associating targets with each other, the associating unit 530 may associate each target with a tracker and then integrate the targets. That is, when a plurality of targets are associated with the same tracker, the associating unit 530 merges these targets. In this case, the associating unit 530 may give priority to targets with a higher likelihood and/or a higher accuracy of the predicted position when merging. In this way, the associating unit 530 may evaluate the detection results based on this information and integrate the targets associated with the same tracker.
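One simple reading of this per-tracker merge is to keep, for each tracker, the target with the highest likelihood. The sketch below assumes exactly that; the text leaves the merge rule open (it could, for instance, blend positions instead), and the data layout and function name are hypothetical.

```python
def merge_per_tracker(assignments):
    """Keep the most likely target for each tracker.

    assignments is a list of (tracker_id, target) pairs, where each target
    is a dict with at least a "likelihood" entry.
    """
    best = {}
    for tracker_id, target in assignments:
        current = best.get(tracker_id)
        if current is None or target["likelihood"] > current["likelihood"]:
            best[tracker_id] = target
    return best


assignments = [
    ("tracker-1", {"camera": "A", "likelihood": 0.6}),
    ("tracker-1", {"camera": "B", "likelihood": 0.9}),
    ("tracker-2", {"camera": "C", "likelihood": 0.7}),
]
print(merge_per_tracker(assignments))
```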

以上のように、本実施の形態に係る物体追跡装置１０は、所定の期間内に、各カメラ２０で撮影されたカメラ映像に対する検出結果の全てを用いてオブジェクト追跡を行う。これにより、物体追跡装置１０は、複数のカメラ２０間で、オブジェクトの検索結果の優先付けを行い、オブジェクト追跡を行う処理を適用しやすくなる。 As described above, the object tracking device 10 according to the present embodiment performs object tracking using all the detection results for the camera images captured by each camera 20 within a predetermined period. As a result, the object tracking device 10 can easily apply a process of performing object tracking by prioritizing search results of objects among a plurality of cameras 20.

また、例えば、全てのカメラ２０のフレームレートが安定して同じである場合、フレーム間隔に従って、所定期間を設定することにより、全カメラ２０のフレームは、この所定期間に含まれる。したがって、本実施の形態に係る物体追跡装置１０は、全カメラ２０に対するフレームに対し、オブジェクト追跡を行うことができる。これにより、物体追跡装置１０は、複数のカメラ２０から同時に見えているオブジェクトに対し、同時に検出結果を評価できるようになるため、検出結果の信頼性を、追跡によりダイレクトに反映できるようになる。 Further, for example, when the frame rates of all the cameras 20 are stable and the same, the frames of all the cameras 20 are included in the predetermined period by setting the predetermined period according to the frame interval. Therefore, the object tracking device 10 according to the present embodiment can perform object tracking on the frames for all the cameras 20. As a result, the object tracking device 10 can simultaneously evaluate the detection results for the objects that are simultaneously seen by the plurality of cameras 20, and thus the reliability of the detection results can be directly reflected in the tracking.

＜第５の実施の形態＞ <Fifth Embodiment>
本発明の第５の実施の形態について説明する。本実施の形態では、本発明の課題を解決する最小の構成について説明を行う。 A fifth embodiment of the present invention will be described. In this embodiment, the minimum configuration that solves the problems of the present invention will be described.

本実施の形態に係る物体追跡装置１０は、第１の実施の形態において説明した図１に示す物体追跡装置１０と同様の構成であるため、図１を参照して説明を行う。 The object tracking device 10 according to the present embodiment has the same configuration as the object tracking device 10 shown in FIG. 1 described in the first embodiment, and therefore will be described with reference to FIG. 1.

図１に示す通り、本実施の形態に係る物体追跡装置１０は、複数の検出部（１００−１〜１００−Ｎ）（Ｎは自然数）と、統合追跡部２００とを備えている。なお、本実施の形態では、複数の検出部（１００−１〜１００−Ｎ）の夫々を区別しない場合、または、総称する場合には、これらを検出部１００と呼ぶ。 As shown in FIG. 1, the object tracking device 10 according to the present embodiment includes a plurality of detection units (100-1 to 100-N) (N is a natural number) and an integrated tracking unit 200. In the present embodiment, when the detection units (100-1 to 100-N) are not distinguished from one another, or when they are referred to collectively, they are called the detection unit 100.

複数の検出部１００の夫々は、センサから出力される出力情報から物体を検出する。なお、図１においては、センサをカメラとし、センサの出力情報をカメラ映像（映像データ）として記載しているが、センサはカメラに限定されるものではない。具体的には、検出部１００は、統合追跡部２００から出力される追跡情報に基づいて、物体を検出する。検出部１００は、検出結果を統合追跡部２００に出力する。 Each of the plurality of detection units 100 detects an object from the output information output from the sensor. Although the sensor is a camera and the output information of the sensor is a camera image (image data) in FIG. 1, the sensor is not limited to the camera. Specifically, the detection unit 100 detects an object based on the tracking information output from the integrated tracking unit 200. The detection unit 100 outputs the detection result to the integrated tracking unit 200.

統合追跡部２００は、複数の検出部（１００−１〜１００−Ｎ）の夫々によって出力された、複数の検出結果に基づいて、該検出結果によって示される１または複数の物体の夫々を追跡する。そして、統合追跡部２００は、追跡結果として、共通座標系で表現された物体の追跡情報を生成する。そして、統合追跡部２００は、複数の検出部（１００−１〜１００−Ｎ）の夫々に出力する。 Based on the plurality of detection results output by each of the plurality of detection units (100-1 to 100-N), the integrated tracking unit 200 tracks each of the one or more objects indicated by the detection results. Then, the integrated tracking unit 200 generates, as the tracking result, tracking information of the objects expressed in the common coordinate system. The integrated tracking unit 200 then outputs the tracking information to each of the plurality of detection units (100-1 to 100-N).

このように、本実施の形態に係る物体追跡装置１０の検出部１００は、統合追跡部２００によって追跡された物体に対する追跡結果に基づいて、センサから出力された出力情報から物体を検出する。 As described above, the detection unit 100 of the object tracking device 10 according to the present embodiment detects an object from the output information output from the sensor based on the tracking result for the object tracked by the integrated tracking unit 200.

このように、物体追跡装置１０は、前のフレームに対する追跡結果を用いて、映像からオブジェクトを検出するため、該追跡結果を用いない場合に比べ、オブジェクトの検出精度を高めることができる。また、カメラ２０の夫々で撮影された映像に対するオブジェクトの検出結果全てを用いてオブジェクト追跡を行うため、物体追跡装置１０は、カメラ毎に独立にオブジェクト追跡を行う場合に比べ、追跡精度を向上できる。また、物体追跡装置１０は、検出精度が高い検出結果の全てを用いてオブジェクト追跡を行うため、より高精度に物体を追跡することができる。 In this way, the object tracking device 10 detects objects from the video by using the tracking result for the previous frame, and can therefore improve the object detection accuracy compared with the case where the tracking result is not used. Further, since object tracking is performed using all the object detection results for the video captured by each of the cameras 20, the object tracking device 10 can improve the tracking accuracy compared with the case where object tracking is performed independently for each camera. Further, since the object tracking device 10 performs object tracking using all of these highly accurate detection results, it can track objects with higher accuracy.

なお、上述した各実施の形態では、物体追跡装置１０は、検出部（１００、４００）と統合追跡部（２００、５００）とを含むことを例に説明したが、この検出部と統合追跡部とは夫々別個の装置で実現されるものであってもよい。つまり、検出部（１００、４００）は、物体検出装置として、統合追跡部（２００、５００）は、統合追跡装置として、夫々、別個の装置で実現されるものであってもよい。また、表示制御部３００も、物体追跡装置５０とは別個の表示制御装置で実現されるものであってもよい。この表示制御装置は、表示装置３０内に内蔵されるものであってもよい。 In each of the above-described embodiments, the object tracking device 10 has been described as including the detection unit (100, 400) and the integrated tracking unit (200, 500), but the detection unit and the integrated tracking unit may each be realized by a separate device. That is, the detection unit (100, 400) may be realized as an object detection device and the integrated tracking unit (200, 500) as an integrated tracking device, each being a separate device. The display control unit 300 may also be realized by a display control device separate from the object tracking device 50. This display control device may be built into the display device 30.

＜第６の実施の形態＞ <Sixth Embodiment>
本発明の第６の実施の形態について説明する。本実施の形態に係る物体追跡装置１０は、第１の実施の形態において説明した図１に示す物体追跡装置１０と同様の構成であるため、図１を参照して説明を行う。なお、本実施の形態に係る物体追跡装置１０は、第１の実施の形態に係る物体追跡装置１０に、更に以下に説明する機能を追加した構成であるとするが、本発明はこれに限定されるものではない。本実施の形態に係る物体追跡装置１０は、上述した第２から第５の実施の形態に係る物体追跡装置にも適用可能である。 A sixth embodiment of the present invention will be described. The object tracking device 10 according to the present embodiment has the same configuration as the object tracking device 10 shown in FIG. 1 described in the first embodiment, and therefore will be described with reference to FIG. 1. Note that the object tracking device 10 according to the present embodiment is assumed to have a configuration in which the functions described below are added to the object tracking device 10 according to the first embodiment, but the present invention is not limited to this. The object tracking device 10 according to the present embodiment is also applicable to the object tracking devices according to the second to fifth embodiments described above.

本実施の形態では、統合追跡部２００が、更に、オブジェクトの見え方に関する情報を取得し、取得した情報を追跡情報に含める。そして、検出部１００が、統合追跡部２００から出力された追跡情報に含まれる各オブジェクトの見え方に関する情報を用いて、オブジェクトの検出を制御する。 In the present embodiment, the integrated tracking unit 200 further acquires information regarding the appearance of the object and includes the acquired information in the tracking information. Then, the detection unit 100 controls the detection of the object by using the information regarding the appearance of each object included in the tracking information output from the integrated tracking unit 200.

このオブジェクトの見え方に関する情報（以降、見え方情報）とは、各カメラ２０の位置からオブジェクトを見たときに、各オブジェクトがどのように見えるかに関する情報であり、各トラッカーのオブジェクトの位置によって定まるものである。 The information regarding the appearance of the object (hereinafter, appearance information) is information on how each object looks when viewed from the position of each camera 20, and is determined by the position of the object of each tracker.

例えば、あるカメラ２０からあるオブジェクトと他のオブジェクトとを見たときに、あるオブジェクトが他のオブジェクトの前（カメラ２０側）にある場合を考える。この場合、後ろ側のオブジェクト（他のオブジェクト）は、前側のオブジェクト（あるオブジェクト）に隠れてしまい、カメラ２０から見えなくなる可能性が高くなる。このとき、統合追跡部２００は、このようなオブジェクト同士の重なりを表す情報を、見え方情報として、他のオブジェクトに関するトラッカー（追跡結果）に含め、該追跡結果を出力する。 For example, consider a case where, when a certain object and another object are viewed from a certain camera 20, the certain object is in front of the other object (on the camera 20 side). In this case, the object at the back (the other object) is hidden by the object at the front (the certain object) and is likely to be invisible from the camera 20. At this time, the integrated tracking unit 200 includes information indicating such overlap between objects, as appearance information, in the tracker (tracking result) for the other object, and outputs the tracking result.

次に、本実施の形態に係る物体追跡装置１０の各部の具体的な動作について説明する。 Next, a specific operation of each unit of the object tracking device 10 according to the present embodiment will be described.

統合追跡部２００は、例えば、図６に示した記憶部２２０などに、各カメラ２０の配置に関する情報を格納している。カメラ２０の配置に関する情報とは、例えば、各カメラ２０がどの位置に配置されているのか、どの方向を撮影しているのか等を示す情報である。また、統合追跡部２００は、カメラ２０の配置に関する情報として、撮影空間の照明の位置や向き、照明の特性に関する情報、撮影空間のどの位置が明るいまたは暗いのかといった照明条件に関する情報を含んでもよい。また、統合追跡部２００は、カメラ２０の配置に関する情報として、撮影空間の方角情報も保持していてもよい。 The integrated tracking unit 200 stores information about the arrangement of the cameras 20 in, for example, the storage unit 220 illustrated in FIG. 6. The information regarding the arrangement of the cameras 20 is, for example, information indicating at which position each camera 20 is arranged, in which direction it is shooting, and the like. Further, the integrated tracking unit 200 may include, as the information on the arrangement of the cameras 20, information on the position and direction of the illumination in the shooting space, information on the characteristics of the illumination, and information on lighting conditions such as which positions in the shooting space are bright or dark. Further, the integrated tracking unit 200 may also hold compass direction information of the shooting space as information on the arrangement of the cameras 20.

統合追跡部２００は、前述した各実施の形態に係る統合追跡部２００と同様に、各トラッカーによって示されるオブジェクトの現フレーム上の動きを予測し、ターゲットとトラッカーとを対応付けることにより、トラッカーの位置を求める。 Similar to the integrated tracking unit 200 according to each of the above-described embodiments, the integrated tracking unit 200 predicts the movement on the current frame of the object indicated by each tracker, and obtains the position of each tracker by associating the targets with the trackers.

そして、統合追跡部２００は、求めたトラッカーの位置と、各トラッカーの動きの情報とから、各カメラ２０によって撮影された撮影画像上での各トラッカーによって示されるオブジェクトの位置を予測する。 Then, the integrated tracking unit 200 predicts the position of the object indicated by each tracker on the captured image captured by each camera 20, based on the obtained position of the tracker and the information on the movement of each tracker.

そして、統合追跡部２００は、各カメラ２０の配置に関する情報を参照し、予測した位置の各オブジェクトが、各カメラ２０から見たときに、どのように見えるか（見え方）を予測する。つまり、統合追跡部２００は、複数のカメラ２０の夫々に対し、次に撮影したタイミングにおいて、上記各トラッカーによって示されるオブジェクト同士が、重なり合うかどうかを予測する。 Then, the integrated tracking unit 200 refers to the information on the arrangement of the cameras 20 and predicts how each object at the predicted position will look (appearance) when viewed from each camera 20. That is, the integrated tracking unit 200 predicts whether or not the objects indicated by the trackers overlap each other at each of the plurality of cameras 20 at the next shooting timing.

そして、統合追跡部２００は、予測した見え方に基づいて、あるカメラ２０の撮影画像上において、オブジェクト同士が重なっていると判定した場合には、オブジェクトが重なって見えない可能性があることを示す情報（見え方情報）を生成する。 Then, when the integrated tracking unit 200 determines, based on the predicted appearance, that objects overlap one another on the image captured by a certain camera 20, it generates information (appearance information) indicating that an object may be hidden by the overlap and may not be visible.

例えば、統合追跡部２００は、あるカメラ２０と予測したあるオブジェクトとを結ぶ線分の間に、予測した他のオブジェクトがあるかどうかを判定する。予測した他のオブジェクトが上記線分の間にある場合には、このあるオブジェクトが、他のオブジェクトと重なる可能性が高い。そのため、統合追跡部２００は、隠される（重なる）オブジェクトの情報、および重なり合う度合（尤度）を、上記あるオブジェクトに対する見え方情報として求める。 For example, the integrated tracking unit 200 determines whether another predicted object lies on the line segment connecting a certain camera 20 and a certain predicted object. When another predicted object lies on this line segment, the certain object is likely to overlap with the other object. Therefore, the integrated tracking unit 200 obtains information on the hidden (overlapping) object and the degree of overlap (likelihood) as appearance information for the certain object.
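The line-segment test described above can be illustrated on the ground plane as follows. This is a minimal sketch under stated assumptions: positions are 2D points in the common coordinate system, each object is approximated by a circle of a hypothetical `radius`, and the linear fall-off used for the degree of overlap is an illustrative choice, not something given in the specification.

```python
import math


def occlusion_likelihood(camera, target, other, radius=0.3):
    """Return a value in [0, 1] for how strongly `other` blocks the view of `target`."""
    cx, cy = camera
    tx, ty = target
    ox, oy = other
    dx, dy = tx - cx, ty - cy
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return 0.0  # camera and target coincide; no meaningful line of sight

    # Parameter of the projection of `other` onto the camera-to-target segment.
    t = ((ox - cx) * dx + (oy - cy) * dy) / seg_len_sq
    if not 0.0 < t < 1.0:
        return 0.0  # `other` is behind the camera or farther away than the target

    # Distance from `other` to the line of sight.
    px, py = cx + t * dx, cy + t * dy
    dist = math.hypot(ox - px, oy - py)
    # Two circles of the same radius overlap in the view when dist < 2 * radius.
    return max(0.0, 1.0 - dist / (2.0 * radius))


if __name__ == "__main__":
    print(occlusion_likelihood((0.0, 0.0), (10.0, 0.0), (5.0, 0.0)))
    print(occlusion_likelihood((0.0, 0.0), (10.0, 0.0), (5.0, 1.0)))
```

The returned value could be stored in the tracking information of the hidden object together with the identifier of the camera concerned.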

そして、統合追跡部２００は、このあるオブジェクトの追跡結果に、この判定結果を見え方情報として含めてもよい。 Then, the integrated tracking unit 200 may include the determination result as the appearance information in the tracking result of the certain object.

そして、統合追跡部２００は、あるカメラ２０によって撮影された撮影画像上から見えなくなる可能性が高いオブジェクトの追跡情報に、生成した見え方情報を含める。このとき、見え方情報には、オブジェクトが見えなくなる可能性が高いカメラ２０を示す情報を含むことが好ましい。 Then, the integrated tracking unit 200 includes the generated appearance information in the tracking information of the object that is likely to be invisible in the captured image captured by the certain camera 20. At this time, it is preferable that the appearance information includes information indicating the camera 20 in which the object is likely to be invisible.

そして、統合追跡部２００は、見え方情報を含んだ追跡情報を、各検出部１００に出力する。なお、統合追跡部２００は、見え方情報を含んだ追跡情報を、オブジェクトが重なって見えない可能性が高いカメラ２０（あるカメラ２０）に関連付けられた検出部１００に出力してもよい。そして、統合追跡部２００は、見え方情報を含まない追跡情報を他のカメラ２０に関連付けられた物体追跡装置１０に出力してもよい。 Then, the integrated tracking unit 200 outputs the tracking information including the appearance information to each detection unit 100. The integrated tracking unit 200 may output the tracking information including the appearance information to the detection unit 100 associated with the camera 20 (a certain camera 20) in which the objects are highly likely to be invisible. Then, the integrated tracking unit 200 may output tracking information not including the appearance information to the object tracking device 10 associated with another camera 20.

また、オブジェクトの位置に応じて照明の当たり方が変わり、該オブジェクトの色合いや明るさが変化することがわかっている場合には、統合追跡部２００は、オブジェクトの位置に応じた見え方の変化を記述した情報を追跡情報に含めてもよい。 Further, when it is known that the way the illumination falls on an object changes with the position of the object, so that the hue and brightness of the object change, the integrated tracking unit 200 may include in the tracking information a description of how the appearance changes with the position of the object.

例えば、照明の当たり方がオブジェクトの位置によって定まる場合には、統合追跡部２００は、その位置から照明の当たり方を予測し、明るくなる、暗くなる、色味が変化するといった情報を、トラッカーごとに追跡情報に含めてもよい。 For example, when the way the illumination falls is determined by the position of the object, the integrated tracking unit 200 may predict the way the illumination falls from that position and include, for each tracker, information such as "becomes brighter", "becomes darker", or "tint changes" in the tracking information.

また、オブジェクトが配置された空間における照明の位置がわかっている場合には、統合追跡部２００は、照明およびオブジェクトの関係から、オブジェクトまたはこの環境に配置されている他の物体の影が、他のオブジェクトに重なるか否か判定する。そして、影が重なる可能性がある場合には、統合追跡部２００は、影が重なるオブジェクトに対して、影が重なる可能性（尤度）を算出し、追跡情報に含めるようにする。 Further, when the position of the illumination in the space where the objects are located is known, the integrated tracking unit 200 determines, from the relationship between the illumination and the objects, whether the shadow of an object or of another body placed in this environment falls on another object. Then, when a shadow may fall on an object, the integrated tracking unit 200 calculates, for that object, the possibility (likelihood) that the shadow falls on it, and includes this in the tracking information.

また、太陽のように、移動する場合であっても、統合追跡部２００は、時刻と現場の方角の情報とから太陽の位置を求め、影のできる方向を予測し、オブジェクトの見え方に与える影響を考慮するようにしてもよい。例えば、統合追跡部２００は、時刻情報から太陽の現在位置を求め、方角情報と合わせて、どちらの方向に影ができるかを予測する。そして、統合追跡部２００は、他のオブジェクトの影がかかる可能性がある場合に影が重なる可能性（尤度）を算出し、追跡情報に含めるようにすればよい。 Further, even when the light source moves, as the sun does, the integrated tracking unit 200 may obtain the position of the sun from the time and the compass direction information of the site, predict the direction in which shadows are cast, and take into account the effect on the appearance of the objects. For example, the integrated tracking unit 200 obtains the current position of the sun from the time information and, combining it with the direction information, predicts in which direction shadows are cast. Then, when the shadow of another object may fall on an object, the integrated tracking unit 200 may calculate the possibility (likelihood) that the shadow falls on it and include this in the tracking information.
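The shadow-direction prediction can be sketched with elementary geometry. The sketch below assumes that the sun's azimuth and elevation have already been obtained from the time and site information (that computation is not shown), that the ground is flat, that the x axis points east and the y axis north, and that azimuth is measured clockwise from north. The function name `shadow_tip` is hypothetical.

```python
import math


def shadow_tip(base_x, base_y, height, sun_azimuth_deg, sun_elevation_deg):
    """Ground position of the shadow tip of a vertical object of the given height."""
    if sun_elevation_deg <= 0.0:
        return None  # sun at or below the horizon: no shadow is predicted

    # A lower sun produces a longer shadow.
    length = height / math.tan(math.radians(sun_elevation_deg))
    azimuth = math.radians(sun_azimuth_deg)
    # The shadow points away from the sun, hence the minus signs.
    return (base_x - length * math.sin(azimuth),
            base_y - length * math.cos(azimuth))


if __name__ == "__main__":
    # Sun due south at 45 degrees elevation: a 2 m object casts a 2 m shadow to the north.
    print(shadow_tip(0.0, 0.0, 2.0, 180.0, 45.0))
```

Whether the shadow reaches another object could then be checked with a point-to-segment distance between that object and the segment from the base to the shadow tip.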

次に、検出部１００の動作について説明する。検出部１００は、上述した各実施の形態と同様に、追跡情報に基づいて、オブジェクトを検出する。このとき、本実施の形態に係る物体追跡装置１０の検出部１００は、追跡情報に含まれる各オブジェクトの見え方に関する情報を用いて、オブジェクトの検出を制御する。具体的には、検出部１００は、他のオブジェクトと重なって見えない可能性が高いオブジェクトに関しては、検出を行わないようにする。例えば、検出部１００は、この見えない可能性が高いオブジェクトに対して、探索範囲を設定しないようにする。 Next, the operation of the detection unit 100 will be described. The detection unit 100 detects an object based on the tracking information, as in each of the above-described embodiments. At this time, the detection unit 100 of the object tracking device 10 according to the present embodiment controls the detection of the object using the information regarding the appearance of each object included in the tracking information. Specifically, the detection unit 100 does not detect an object that is likely to be invisible because it overlaps with another object. For example, the detection unit 100 does not set the search range for the object that is likely to be invisible.
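One way to picture this control is a filter that decides, per camera, for which trackers a search range is set at all. The dictionary layout of the tracking information, the key names `id` and `occlusion`, and the threshold value are assumptions made for this sketch.

```python
def trackers_to_search(trackers, camera_id, threshold=0.5):
    """Return the ids of trackers for which a search range should be set for camera_id."""
    selected = []
    for tracker in trackers:
        # Appearance information: per-camera likelihood that the object is hidden.
        hidden = tracker.get("occlusion", {}).get(camera_id, 0.0)
        if hidden < threshold:
            selected.append(tracker["id"])
    return selected


if __name__ == "__main__":
    tracking_info = [
        {"id": 1, "occlusion": {"cam1": 0.9}},  # likely hidden from cam1
        {"id": 2, "occlusion": {"cam1": 0.1}},
        {"id": 3},                              # no appearance information
    ]
    print(trackers_to_search(tracking_info, "cam1"))
    print(trackers_to_search(tracking_info, "cam2"))
```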

なお、明るさや色合いが変化する情報が追跡情報に含まれている場合には、検出部１００は、探索する際にその照明の効果を補正して検出をかけるようにしてもよい。たとえば、暗い領域では、検出部１００は、その領域の画素値を明るめに補正してから、検出をかけるようにしてもよい。 When the tracking information includes information indicating that the brightness or hue changes, the detection unit 100 may compensate for the effect of the illumination when searching and then perform detection. For example, in a dark area, the detection unit 100 may correct the pixel values of that area to be brighter before performing detection.
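As a simple illustration of brightening a dark area before detection, the sketch below scales the pixel values of a rectangular region by a gain. The image representation (a list of rows of 8-bit grayscale values), the region layout `(top, left, bottom, right)`, and the use of a constant gain are assumptions; other corrections such as gamma correction would serve equally well.

```python
def brighten_region(image, region, gain):
    """Return a copy of a grayscale image with the given region scaled by `gain`."""
    top, left, bottom, right = region  # bottom and right are exclusive
    corrected = [row[:] for row in image]  # leave the input image untouched
    for r in range(top, bottom):
        for c in range(left, right):
            # Clip to the 8-bit range after scaling.
            corrected[r][c] = min(255, round(corrected[r][c] * gain))
    return corrected


if __name__ == "__main__":
    frame = [[10, 20], [200, 40]]
    print(brighten_region(frame, (0, 0, 2, 1), 2.0))
```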

また、色合いが変化する場合には、検出部１００は、テンプレートマッチングで用いるマッチングのパラメータ（つまり、上述した検出パラメータ）を更新する際に、その色合いの変化を考慮して、該パラメータの色の情報を補正してもよい。また、色合いの変化が大きい場合には、検出部１００は、色の情報を用いないようにしてもよい。また、検出部１００は、テンプレートの特徴の中で、色の情報の重みを下げ、エッジ等の他の特徴の重みを高めてマッチングを行うようにしてもよい。 Further, when the hue changes, the detection unit 100 may take the change in hue into account when updating the matching parameters used in template matching (that is, the above-described detection parameters) and correct the color information of those parameters. When the change in hue is large, the detection unit 100 may refrain from using the color information. Alternatively, the detection unit 100 may perform matching with a lower weight on the color information and a higher weight on other features such as edges among the features of the template.
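The reweighting of template features can be sketched as a weighted sum of per-feature similarities. The similarity inputs, the `hue_change` scale, and the linear weighting rule below are illustrative assumptions rather than the matching method of the specification.

```python
def matching_score(color_similarity, edge_similarity, hue_change, max_change=1.0):
    """Combine color and edge similarity, trusting color less as the hue change grows."""
    # Normalize the expected hue change to [0, 1].
    ratio = min(max(hue_change / max_change, 0.0), 1.0)
    # With no hue change both features count equally; with a large change color is ignored.
    color_weight = 0.5 * (1.0 - ratio)
    edge_weight = 1.0 - color_weight
    return color_weight * color_similarity + edge_weight * edge_similarity


if __name__ == "__main__":
    print(matching_score(0.8, 0.4, hue_change=0.0))
    print(matching_score(0.8, 0.4, hue_change=1.0))
```

Setting the color weight to zero at the largest hue change corresponds to not using the color information at all.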

また、検出部１００で、オブジェクト検出処理に用いる検出パラメータを更新する際に、該オブジェクトが重なっている可能性が高いと判断される場合には、検出パラメータの更新を行わないようにしてもよい。 Further, when the detection unit 100 updates the detection parameters used in the object detection process, the detection parameters may be left un-updated if it is determined that the object is likely to be overlapped.

以上のように、検出部１００は、追跡情報に含まれる見え方情報に基づいて、物体の検出を制御する。これにより、検出部１００は、見えないオブジェクトを検出する処理や、テンプレートマッチング等のパラメータを更新する処理を削減することができる。これにより、検出部１００は、誤検出や誤ったパラメータの更新の可能性を低減することができる。 As described above, the detection unit 100 controls the detection of the object based on the appearance information included in the tracking information. As a result, the detection unit 100 can reduce the processing of detecting an invisible object and the processing of updating parameters such as template matching. Thereby, the detection unit 100 can reduce the possibility of false detection and incorrect parameter update.

同様に、検出部１００は、照明条件が変わった可能性が高い場合には、その効果を補正してパラメータを更新するように制御してもよいし、パラメータの更新を行わないように制御してもよい。 Similarly, when the lighting conditions are likely to have changed, the detection unit 100 may perform control so as to compensate for the effect and then update the parameters, or may perform control so that the parameters are not updated.

また、複数の検出アルゴリズムが切り替えられるようになっている場合には、物体追跡装置１０は、検出部１００として、より重なりに強い検出器を用いるようにしてもよい。具体的には、物体追跡装置１０は、通常は頭部全体を検知する検出器を用いるが、重なっている場合には、頭部全体ではなく、頭部の一部分のみを検知する検出器を用いるようにしてもよい。これにより、物体追跡装置１０は、通常時は、シンプルな検出器を用い、重なっている可能性がある場合には、より詳細な検出器を用いることができる。これにより、物体追跡装置１０は、効率性を維持したうえで、高精度な検出が可能になる。同様に、物体追跡装置１０は、照明条件が変化した場合には、その照明条件に対して頑健性が高い検出器（特徴量）を用いて、検出を制御してもよい。 In addition, when a plurality of detection algorithms can be switched, the object tracking device 10 may use, as the detection unit 100, a detector that is more robust to overlap. Specifically, the object tracking device 10 normally uses a detector that detects the entire head, but when objects overlap, it may use a detector that detects only a part of the head instead of the entire head. Thereby, the object tracking device 10 can use a simple detector at normal times and a more detailed detector when objects may overlap. As a result, the object tracking device 10 can perform highly accurate detection while maintaining efficiency. Similarly, when the lighting conditions change, the object tracking device 10 may control detection by using a detector (feature amount) that is highly robust to those lighting conditions.
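The switching between detection algorithms can be pictured as a small selection function. The detector names, the threshold, and the order in which overlap and lighting are checked are hypothetical choices for this sketch.

```python
def choose_detector(overlap_likelihood, lighting_changed, threshold=0.5):
    """Pick the name of the detector to run for one tracked object."""
    if overlap_likelihood >= threshold:
        return "partial_head"         # more detailed detector, robust to overlap
    if lighting_changed:
        return "illumination_robust"  # features less sensitive to lighting
    return "whole_head"               # simple default detector


if __name__ == "__main__":
    print(choose_detector(0.1, False))
    print(choose_detector(0.8, False))
    print(choose_detector(0.1, True))
```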

以上のように本実施の形態に係る物体追跡装置１０は、統合追跡部２００が、他のカメラの情報も使ってオブジェクトの見え方の判定をする。そのため、統合追跡部２００は、あるカメラ２０からは、オブジェクト同士が重なって映る場合など、そのカメラ２０だけでは判定が難しい場合でも高精度に見え方を判定することができる。そして、統合追跡部２００はこの結果を、検出部１００にフィードバックすることができる。これにより、検出部１００は、オブジェクトの誤検出を低減できる。そして、統合追跡部２００は、この検出結果を用いてオブジェクト追跡を行うため、オブジェクトの追跡精度を向上させることができる。 As described above, in the object tracking device 10 according to the present embodiment, the integrated tracking unit 200 also uses information from other cameras to determine the appearance of an object. Therefore, the integrated tracking unit 200 can determine the appearance with high accuracy even in cases where the determination is difficult with a certain camera 20 alone, such as when objects appear overlapped from that camera 20. The integrated tracking unit 200 can then feed this result back to the detection unit 100. Thereby, the detection unit 100 can reduce erroneous detection of objects. Since the integrated tracking unit 200 performs object tracking using these detection results, the object tracking accuracy can be improved.

＜ハードウェアの構成例＞ <Example of hardware configuration>
ここで、上述した各実施の形態に係る物体追跡装置（１０、５０）を実現可能なハードウェアの構成例について説明する。上述した物体追跡装置（１０、５０）は、専用の装置として実現してもよいが、コンピュータ（情報処理装置）を用いて実現してもよい。 Here, a configuration example of hardware capable of realizing the object tracking device (10, 50) according to each of the above-described embodiments will be described. The object tracking device (10, 50) described above may be realized as a dedicated device, or may be realized using a computer (information processing device).

図１９は、本発明の各実施の形態を実現可能なコンピュータ（情報処理装置）のハードウェア構成を例示する図である。 FIG. 19 is a diagram illustrating a hardware configuration of a computer (information processing device) that can implement each embodiment of the present invention.

図１９に示した情報処理装置（コンピュータ）７００のハードウェアは、ＣＰＵ（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）１１、通信インタフェース（Ｉ／Ｆ）１２、入出力ユーザインタフェース１３、ＲＯＭ（ＲｅａｄＯｎｌｙＭｅｍｏｒｙ）１４、ＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）１５、記憶装置１７、及びコンピュータ読み取り可能な記憶媒体１９のドライブ装置１８を備え、これらがバス１６を介して接続された構成を有する。入出力ユーザインタフェース１３は、入力デバイスの一例であるキーボードや、出力デバイスとしてのディスプレイ等のマンマシンインタフェースである。通信インタフェース１２は、上述した各実施の形態に係る装置（図１、図９）が、外部装置と、通信ネットワーク６００を介して通信するための一般的な通信手段である。係るハードウェア構成において、ＣＰＵ１１は、各実施の形態に係る物体追跡装置（１０、５０）を実現する情報処理装置７００について、全体の動作を司る。 The hardware of the information processing device (computer) 700 shown in FIG. 19 includes a CPU (Central Processing Unit) 11, a communication interface (I/F) 12, an input/output user interface 13, a ROM (Read Only Memory) 14, a RAM (Random Access Memory) 15, a storage device 17, and a drive device 18 for a computer-readable storage medium 19, and these are connected via a bus 16. The input/output user interface 13 is a man-machine interface such as a keyboard, which is an example of an input device, and a display serving as an output device. The communication interface 12 is a general communication means by which the devices (FIGS. 1 and 9) according to the above-described embodiments communicate with external devices via the communication network 600. In this hardware configuration, the CPU 11 controls the overall operation of the information processing device 700 that realizes the object tracking device (10, 50) according to each embodiment.

上述した各実施の形態を例に説明した本発明は、例えば、上記各実施の形態において説明した処理を実現可能なプログラム（コンピュータ・プログラム）を、図１９に示す情報処理装置７００に対して供給した後、そのプログラムを、ＣＰＵ１１に読み出して実行することによって達成される。なお、係るプログラムは、例えば、上記各実施の形態の説明において参照したフローチャート（図８）に記載した各種処理や、或いは、図１、図４〜図６、図９、図１５〜図１７に示したブロック図において当該装置内に示した各部（各ブロック）を実現可能なプログラムであってもよい。 The present invention, described using the above embodiments as examples, is achieved, for example, by supplying the information processing device 700 shown in FIG. 19 with a program (computer program) capable of implementing the processing described in the above embodiments, and then having the CPU 11 read out and execute that program. The program may be, for example, a program capable of realizing the various processes described in the flowchart (FIG. 8) referred to in the description of the above embodiments, or the units (blocks) shown within the device in the block diagrams of FIG. 1, FIG. 4 to FIG. 6, FIG. 9, and FIG. 15 to FIG. 17.

また、情報処理装置７００内に供給されたプログラムは、読み書き可能な一時記憶メモリ（１５）またはハードディスクドライブ等の不揮発性の記憶装置（１７）に格納されてもよい。即ち、記憶装置１７において、プログラム群１７Ａは、例えば、上述した各実施の形態における物体追跡装置（１０、５０）内に示した各部の機能を実現可能なプログラムである。また、各種の記憶情報１７Ｂは、例えば、上述した各実施の形態におけるオブジェクト追跡結果、カメラ映像、カメラパラメータ、各カメラ２０で見える共通座標系の座標値の範囲等である。ただし、情報処理装置７００へのプログラムの実装に際して、個々のプログラム・モジュールの構成単位は、ブロック図（図１、図４〜図６、図９、図１５〜図１７）に示した各ブロックの区分けには限定されず、当業者が実装に際して適宜選択してよい。 Further, the program supplied to the information processing device 700 may be stored in a readable/writable temporary storage memory (15) or in a non-volatile storage device (17) such as a hard disk drive. That is, in the storage device 17, the program group 17A is, for example, a set of programs capable of realizing the functions of the units shown in the object tracking device (10, 50) in each of the above-described embodiments. The various stored information 17B is, for example, the object tracking results, camera video, camera parameters, and the range of coordinate values of the common coordinate system visible from each camera 20 in each of the above-described embodiments. However, when the program is implemented on the information processing device 700, the structural units of the individual program modules are not limited to the division into blocks shown in the block diagrams (FIG. 1, FIG. 4 to FIG. 6, FIG. 9, and FIG. 15 to FIG. 17), and those skilled in the art may select them as appropriate at implementation time.

また、前記の場合において、当該装置内へのプログラムの供給方法は、ＣＤ（ＣｏｍｐａｃｔＤｉｓｋ）−ＲＯＭ、フラッシュメモリ等のコンピュータ読み取り可能な各種の記録媒体（１９）を介して当該装置内にインストールする方法や、インターネット等の通信回線（６００）を介して外部よりダウンロードする方法等のように、現在では一般的な手順を採用することができる。そして、このような場合において、本発明は、係るコンピュータプログラムを構成するコード（プログラム群１７Ａ）或いは係るコードが格納された記憶媒体（１９）によって構成されると捉えることができる。 In the above case, the method of supplying the program into the device is installed in the device via various computer-readable recording media (19) such as a CD (Compact Disk)-ROM and a flash memory. A general procedure can be adopted at present, such as a method or a method of downloading from the outside through a communication line (600) such as the Internet. In such a case, the present invention can be considered to be configured by the code (program group 17A) configuring the computer program or the storage medium (19) storing the code.

以上、本発明を、上述した模範的な実施の形態に適用した例として説明した。しかしながら、本発明の技術的範囲は、上述した各実施の形態に記載した範囲には限定されない。当業者には、係る実施の形態に対して多様な変更または改良を加えることが可能であることは明らかである。そのような場合、係る変更または改良を加えた新たな実施の形態も、本発明の技術的範囲に含まれ得る。そしてこのことは、特許請求の範囲に記載した事項から明らかである。 The present invention has been described above as an example applied to the exemplary embodiment described above. However, the technical scope of the present invention is not limited to the scope described in each of the above-described embodiments. It is obvious to those skilled in the art that various modifications and improvements can be added to the embodiment. In such a case, a new embodiment with such changes or improvements may be included in the technical scope of the present invention. And this is clear from the matters described in the claims.

DESCRIPTION OF SYMBOLS
１物体追跡システム 1 Object tracking system
２物体追跡システム 2 Object tracking system
１０物体追跡装置 10 Object tracking device
１００検出部 100 Detection unit
１１０オブジェクト検出部 110 Object detection unit
１１１認識型オブジェクト検出部 111 Recognition type object detection unit
１１２探索範囲設定部 112 Search range setting unit
１２０共通座標変換部 120 Common coordinate conversion unit
１３０個別座標変換部 130 Individual coordinate conversion unit
１４０オブジェクト検出部 140 Object detection unit
１４１認識型オブジェクト検出部 141 Recognition type object detection unit
１４２非認識型オブジェクト検出部 142 Non-recognition type object detection unit
１４３検出パラメータ更新部 143 Detection parameter update unit
１４４検出結果統合部 144 Detection result integration unit
１５０表示制御部 150 Display control unit
１６０記憶部 160 Storage unit
２００統合追跡部 200 Integrated tracking unit
２１０予測部 210 Prediction unit
２２０記憶部 220 Storage unit
２３０対応付け部 230 Associating unit
２４０更新部 240 Update unit
３００表示制御部 300 Display control unit
４００検出部 400 Detection unit
５００統合追跡部 500 Integrated tracking unit
５１０バッファ部 510 Buffer unit
５３０対応付け部 530 Associating unit
２０カメラ 20 Camera
３０表示装置 30 Display device
４０ネットワーク 40 Network
５０物体追跡装置 50 Object tracking device

Claims

A plurality of detecting means for detecting an object from the captured image and outputting the detection result;
Integrated tracking means for calculating the position information of the object expressed in a common coordinate system, based on the plurality of detection results output by each of the plurality of detecting means,
The integrated tracking means outputs position information of the object in the calculated common coordinate system,
The detection means converts the position information of the object in the common coordinate system into position information expressed in an individual coordinate system specific to a camera that outputs an image that is a detection target of the object, tracks the object in the individual coordinate system,
Detects the object based on the position information expressed in the individual coordinate system, and
Converts the position information of the object detected based on the position information expressed in the individual coordinate system into position information expressed in the common coordinate system,
Object tracking system.

In the detection of the object based on the position information expressed in the individual coordinate system, the detection means
Detects the object within a search range for searching for the object in the image that is the detection target of the object, the search range being specified based on the position information expressed in the individual coordinate system and being expressed in the individual coordinate system specific to the camera that outputs the image,
The object tracking system according to claim 1.

The object tracking system according to claim 1, wherein the detection of the object is performed by using a discriminator that has learned image characteristics of the object.

The detection result of the detection of the object is a result of integrating the detection result by the discriminator and the detection result by template matching with the object region detected in the previous frame,
The object tracking system according to claim 3.

A plurality of detection means for detecting an object from the output information of the sensor and outputting the detection result,
An integrated tracking unit that outputs the tracking information of the object represented by a common coordinate system, tracking the object based on the plurality of detection results output by each of the plurality of detection units,
The integrated tracking means outputs the generated tracking information to each of the plurality of detection means,
The detection means generates, based on the tracking information, a first detection result, which is a result of object identification, and a second detection result, which is obtained by detecting the object by template matching with a previously detected object region, and detects the object by integrating the first detection result and the second detection result,
Object tracking system.

A plurality of detection means for detecting an object from the output information of the sensor and outputting the detection result,
An integrated tracking unit that outputs the tracking information of the object represented by a common coordinate system, tracking the object based on the plurality of detection results output by each of the plurality of detection units,
The integrated tracking means outputs the generated tracking information to each of the plurality of detection means,
The detection means detects the object based on the tracking information,
An object tracking system in which the integrated tracking unit gathers together the detection results output from the plurality of detection units within a predetermined period, calculates the position, and generates the tracking information.

Detecting an object from a captured image and outputting a detection result,
Calculating position information of the object expressed in a common coordinate system, based on a plurality of the output detection results,
Outputting position information of the object in the calculated common coordinate system,
Converting the position information of the object in the common coordinate system into position information expressed in an individual coordinate system specific to a camera that outputs an image that is a detection target of the object, and tracking the object in the individual coordinate system,
Detecting the object based on the position information expressed in the individual coordinate system,
Converting the position information of the object detected based on the position information expressed in the individual coordinate system into position information expressed in the common coordinate system,
Object tracking method.
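
As an illustration only, the two conversions between the common coordinate system and a camera's individual coordinate system could be sketched with a planar homography. The use of a 3x3 homography and the function names `common_to_individual` and `individual_to_common` are assumptions made for this sketch; the claim does not specify the form of the conversion.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Point = Tuple[float, float]


def apply_homography(h: Matrix, point: Point) -> Point:
    """Map a 2D point through a 3x3 homography with perspective division."""
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0.0:
        raise ValueError("point maps to infinity")
    return (u / w, v / w)


def invert_3x3(m: Matrix) -> Matrix:
    """Invert a 3x3 matrix using the adjugate formula."""
    a, b, c = m[0]
    d, e, f = m[1]
    g, h, i = m[2]
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0.0:
        raise ValueError("matrix is singular")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def common_to_individual(h: Matrix, common_point: Point) -> Point:
    """Convert a common-coordinate position to this camera's coordinates.

    `h` is an assumed per-camera homography (common -> individual).
    """
    return apply_homography(h, common_point)


def individual_to_common(h: Matrix, image_point: Point) -> Point:
    """Convert a camera-coordinate position back to the common system."""
    return apply_homography(invert_3x3(h), image_point)
```

Each camera would hold its own matrix, so the same common-coordinate position maps to a different individual-coordinate position for each camera.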

The object tracking method according to claim 7, wherein, in detecting the object based on the position information expressed in the individual coordinate system, the object is detected within a search range for searching for the object in the image in which the object is to be detected, the search range being specified based on the position information expressed in the individual coordinate system and being expressed in the individual coordinate system specific to the camera that outputs the image.
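
As an illustration only, specifying a search range from a position expressed in the individual coordinate system could be sketched as a square window clamped to the image bounds. The square shape, the name `search_range`, and the parameter `half_size` are assumptions made for this sketch.

```python
from typing import Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def search_range(
    center: Tuple[int, int],
    half_size: int,
    image_size: Tuple[int, int],
) -> Optional[Box]:
    """Return a square window around `center`, clamped to the image.

    `center` is the predicted object position in pixel coordinates and
    `image_size` is (width, height). Returns None when the window lies
    entirely outside the image.
    """
    cx, cy = center
    width, height = image_size
    left = max(0, cx - half_size)
    top = max(0, cy - half_size)
    right = min(width, cx + half_size + 1)
    bottom = min(height, cy + half_size + 1)
    if left >= right or top >= bottom:
        return None
    return (left, top, right, bottom)
```

Restricting detection to such a window limits the pixels the detector has to examine in each frame.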

The object tracking method according to claim 7, wherein the detection of the object is performed using a classifier that has learned the image feature of the object.

The object tracking method according to claim 9, wherein the detection result of the object is obtained by integrating the detection result from the classifier with a detection result from template matching against the object region detected in the previous frame.

An object tracking method comprising:
detecting an object from output information of a sensor and outputting a detection result;
tracking the object based on a plurality of the output detection results, and generating tracking information of the object expressed in a common coordinate system;
outputting the generated tracking information; and
generating, based on the tracking information, a first detection result that is a result of object identification and a second detection result obtained by detecting the object through template matching with a previously detected object region, and detecting the object by integrating the first and second detection results.
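
As an illustration only, template matching against a previously detected object region could be sketched as an exhaustive search that minimizes the sum of absolute differences (SAD). SAD is a stand-in chosen for this sketch; the claim does not name a matching criterion, and the function name `match_template` is hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale image as rows of pixel values


def match_template(image: Image, template: Image) -> Tuple[Tuple[int, int], int]:
    """Return ((row, col), cost) of the best template placement.

    (row, col) is the top-left corner of the placement with the smallest
    sum of absolute differences. The first minimum in row-major order
    wins when several placements tie.
    """
    image_h, image_w = len(image), len(image[0])
    templ_h, templ_w = len(template), len(template[0])
    if templ_h > image_h or templ_w > image_w:
        raise ValueError("template is larger than the image")
    best_pos = (0, 0)
    best_cost = None
    for r in range(image_h - templ_h + 1):
        for c in range(image_w - templ_w + 1):
            cost = sum(
                abs(image[r + dr][c + dc] - template[dr][dc])
                for dr in range(templ_h)
                for dc in range(templ_w)
            )
            if best_cost is None or cost < best_cost:
                best_pos, best_cost = (r, c), cost
    return best_pos, best_cost
```

In a tracking loop the template would be the object region cut from the previous frame, and `image` would typically be a limited region of the current frame rather than the whole frame.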

An object tracking method comprising:
detecting an object from output information of a sensor and outputting a detection result;
tracking the object based on a plurality of the output detection results, and generating tracking information of the object expressed in a common coordinate system;
outputting the generated tracking information; and
detecting the object based on the tracking information,
wherein a position is calculated by combining a plurality of detection results output within a predetermined period, and the tracking information is generated.

A program including:
a process of detecting an object from a captured image;
a process of outputting a detection result;
a process of calculating, based on a plurality of the output detection results, position information of the object expressed in a common coordinate system;
a process of outputting the calculated position information of the object in the common coordinate system;
a process of converting the position information of the object in the common coordinate system into position information expressed in an individual coordinate system specific to the camera that outputs the image in which the object is to be detected, and tracking the object in the individual coordinate system;
a process of detecting the object based on the position information expressed in the individual coordinate system; and
a process of converting the position information of the object, detected based on the position information expressed in the individual coordinate system, into position information expressed in the common coordinate system.

The program according to claim 13, wherein the process of detecting the object based on the position information expressed in the individual coordinate system includes a process of detecting the object within a search range for searching for the object in the image in which the object is to be detected, the search range being specified based on the position information expressed in the individual coordinate system and being expressed in the individual coordinate system specific to the camera that outputs the image.

The program according to claim 13 or 14, wherein the processing of detecting the object is performed using a classifier that has learned the image characteristics of the object.

The program according to claim 15, wherein the detection result of the process of detecting the object is obtained by integrating the detection result from the classifier with a detection result from template matching against the object region detected in the previous frame.

A program including:
a process of detecting an object from output information of a sensor;
a process of outputting a detection result;
a process of tracking the object based on a plurality of the output detection results, and generating tracking information of the object expressed in a common coordinate system;
a process of outputting the generated tracking information; and
a process of generating, based on the tracking information, a first detection result that is a result of object identification and a second detection result obtained by detecting the object through template matching with a previously detected object region, and detecting the object by integrating the first and second detection results.

A program including:
a process of detecting an object from output information of a sensor;
a process of outputting a detection result;
a process of tracking the object based on a plurality of the output detection results, and generating tracking information of the object expressed in a common coordinate system;
a process of outputting the generated tracking information;
a process of detecting the object based on the tracking information; and
a process of calculating a position by collectively processing a plurality of detection results output within a predetermined period, and generating the tracking information.