JP2005099953A

JP2005099953A - Image processor, object tracing system therewith, image processing method and image processing program

Info

Publication number: JP2005099953A
Application number: JP2003330635A
Authority: JP
Inventors: Kazumasa Murai; 和昌村井
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2003-09-22
Filing date: 2003-09-22
Publication date: 2005-04-14

Abstract

PROBLEM TO BE SOLVED: To more accurately perform a process for tracking an object such as a face in moving image data. SOLUTION: A face is tracked in moving image data composed of a sequence of still image data. When a search part detects the face by searching the whole range of a piece of still image data, a tracking part tracks the face. The tracking part tracks the face in the tracking-target still image data on the basis of characteristics of the face already detected in one or more pieces of still image data later in time than the tracking-target still image data, thereby tracking the face in the direction going back along the time axis of the still image data sequence. COPYRIGHT: (C)2005,JPO&NCIPI

Description

The present invention relates to an image processing apparatus that tracks an object in moving image data composed of a sequence of still image data, each being data at a different time, and to an object tracking system including the apparatus, an image processing method, and an image processing program.

Detecting an object such as a face from image data by image processing requires complex computation, so detection takes a long time. In moving images, however, the correlation between frames is very high. When detecting an object, the detection time can therefore be shortened by using characteristics of the object already detected at times before the detection-target frame, such as its "size", "position", "angle", "shape", "brightness", "color", and their "trajectories". In object detection on moving image data, searching for an object in the forward direction of the time axis using the characteristics of the object detected in earlier frames is a widely used technique.

As other background art, Non-Patent Document 1 discloses an example of a technique for detecting a human face from image data, and Non-Patent Document 2 discloses a technique that detects a face in video images and senses mouth movement to estimate the speech interval, thereby improving speech recognition results.

Non-Patent Document 1: R. Stiefelhagen et al., "Simultaneous tracking of head pose in a panoramic view", Proc. International Conference on Pattern Recognition, Vol. 3, pp. 726-729, 2000

Non-Patent Document 2: Kazumasa Murai, Kenichi Kumatani, Satoshi Nakamura, "A Robust End Point Detection by Speaker's Facial Motion", HSC2001 (International Workshop on Hands-Free Speech), 2001/4/9

However, the forward search along the time axis described above can obtain the characteristics of an object only for times after the object was first detected. Moreover, when only the characteristics from after the first detection are available, the accuracy of predicting the object's future state also decreases. The forward-search technique therefore still leaves room for improvement in both the accuracy of object tracking and the accuracy of predicting the object's future state.

An object of the present invention is to provide an image processing apparatus that can track an object in moving image data more accurately, together with an object tracking system including the apparatus, an image processing method, and an image processing program.

An image processing apparatus according to the present invention tracks an object in moving image data composed of a sequence of still image data, each being data at a different time. The apparatus comprises search means for searching still image data for an object, and tracking means for tracking, when the search means has detected one or more objects, those objects in other still image data. The tracking means tracks the object in tracking-target still image data on the basis of characteristics of the object already detected in one or more items of still image data at times later than the tracking-target still image data, thereby tracking the object in the direction going back along the time axis of the still image data sequence.

In the image processing apparatus according to the present invention, the one or more items of still image data at times later than the tracking-target still image data may include the still image data at the time immediately after the tracking-target still image data. The characteristics of the object may include at least one of the position, size, angle, shape, brightness, and color of the object, and their trajectories.

In the image processing apparatus according to the present invention, the tracking means may also track the object in the tracking-target still image data on the basis of characteristics, including position and size, of the object already detected in one or more items of still image data at times earlier than the tracking-target still image data, thereby tracking the object in the direction along the time axis of the still image data sequence. In this way, means for tracking the object backward along the time axis and means for tracking it forward along the time axis can be used together. It is also possible to track the object with a motion detection algorithm, typified by the optical flow method, using the image of the detected object and the image in which detection is to be performed. The one or more items of still image data at times earlier than the tracking-target still image data may include the still image data at the time immediately before the tracking-target still image data.

In the image processing apparatus according to the present invention, the object is preferably a face. The apparatus is also preferably one that tracks an object in real time in real-time moving image data. Because detecting a face image in real time requires complex computation, applying the present invention is particularly effective.

An object tracking system including the image processing apparatus according to the present invention may comprise: an image-processing imaging device that images an object to obtain real-time moving image data containing the object; characteristic estimation means for estimating future characteristics of the object on the basis of the characteristics of the object in still image data already detected by the tracking means; a tracking device capable of following the object; and control command calculation means for calculating a control command value on the basis of the future characteristics estimated by the characteristic estimation means and outputting it to the tracking device, so that the tracking device is controlled to follow the object. In this object tracking system, the tracking device is preferably a tracking imaging device that can follow and image the object, and the control command calculation means preferably calculates a control command value for controlling at least one of the imaging direction and the zoom of the tracking imaging device.

An image processing method according to the present invention tracks an object in moving image data composed of a sequence of still image data, each being data at a different time. The method includes a search step of searching still image data for an object, and a tracking step of tracking, when one or more objects have been detected in the search step, those objects in other still image data. The tracking step repeatedly tracks the object in tracking-target still image data on the basis of characteristics of the object already detected in one or more items of still image data at times later than the tracking-target still image data, thereby tracking the object in the direction going back along the time axis of the still image data sequence.

An image processing program according to the present invention causes a computer to track an object in moving image data composed of a sequence of still image data, each being data at a different time. The program causes the computer to execute search processing for searching still image data for an object, and tracking processing for tracking, when one or more objects have been detected by the search processing, those objects in other still image data. The tracking processing tracks the object in tracking-target still image data on the basis of characteristics of the object already detected in one or more items of still image data at times later than the tracking-target still image data, thereby tracking the object in the direction going back along the time axis of the still image data sequence.

As described above, according to the present invention, tracking an object in the direction going back along the time axis of the still image data sequence makes it possible to track the object in moving image data more accurately.

Hereinafter, a mode for carrying out the present invention (hereinafter referred to as the embodiment) will be described with reference to the drawings.

FIG. 1 is a block diagram showing an outline of the configuration of an object tracking system including an image processing apparatus according to an embodiment of the present invention. The object tracking system according to this embodiment tracks a human face using image processing, and it is broadly divided into a camera 10, an image processing apparatus 12, and a control device 14.

The camera 10, serving as the image-processing imaging device, images a human face, which is the object in this embodiment. The moving image signal captured by the camera 10 is input to the image processing apparatus 12. The camera 10 can change its imaging direction and zoom, and, as described later, these can be controlled by the control device 14.

The image processing apparatus 12 tracks a human face in real time in the moving image captured by the camera 10. It includes an image preprocessing unit 28, an image storage unit 29, a search unit 16, and a tracking unit 18.

The moving image signal from the camera 10 is input to the image preprocessing unit 28, where it undergoes processing such as A/D conversion and quantization and is converted into real-time moving image data composed of a sequence of still image data, each being data at a different time. Each item of the still image data sequence is temporarily stored in the image storage unit 29. Each item stored in the image storage unit 29 is input to the search unit 16 or the tracking unit 18, and processing for tracking a human face in real time is performed. FIG. 2 shows an example of a still image data sequence stored in the image storage unit 29.

Of the latest still image data obtained by the image preprocessing unit 28, the search unit 16 searches the still image data at every predetermined frame interval for a human face. The search unit 16 searches for a human face using, for example, the entire range of the still image data as the search range. Alternatively, the face search range can be set on the basis of the difference value at each coordinate obtained from a background difference or an inter-frame difference, which limits the search range and shortens the search time.
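The inter-frame difference option mentioned above can be sketched as follows. This is a minimal illustration, not the method of the embodiment: the function name, the grayscale list-of-rows frame representation, and the threshold value are all assumptions introduced here.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]          # grayscale image as rows of pixel intensities (assumed format)
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), inclusive


def search_range_from_difference(
    previous: Frame, current: Frame, threshold: int = 20
) -> Optional[Box]:
    """Return the bounding box of pixels whose inter-frame difference exceeds
    the threshold, or None when nothing changed enough.

    The threshold of 20 is an arbitrary illustrative value.
    """
    changed = [
        (x, y)
        for y, (row_prev, row_curr) in enumerate(zip(previous, current))
        for x, (p, c) in enumerate(zip(row_prev, row_curr))
        if abs(c - p) > threshold
    ]
    if not changed:
        return None
    xs = [x for x, _ in changed]
    ys = [y for _, y in changed]
    return (min(xs), min(ys), max(xs), max(ys))
```

A face detector would then be run only inside the returned box. When the function returns None, a caller could fall back to the full-range search.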

When the search unit 16 has detected one or more human faces, the tracking unit 18 tracks each face by continuing to search for it in the other still image data stored in the image storage unit 29. The tracking unit 18 includes a forward search range setting unit 20, a forward search unit 22, a reverse search range setting unit 24, and a reverse search unit 26.

The forward search range setting unit 20 sets the search range in the tracking-target still image data on the basis of the characteristics of the human face already detected in one or more items of still image data at times earlier than the tracking-target still image data. This sets the search range in still image data lying in the direction along the time axis, that is, the forward direction of the time axis, from the detected still image data in which the human face has already been detected. The one or more items of still image data at earlier times include the still image data at the time immediately before the tracking-target still image data. The forward search unit 22 then searches for the human face within the search range set by the forward search range setting unit 20, thereby tracking the human face in the tracking-target still image data. The face characteristics here include at least one of the position, size, angle, shape, brightness, and color of the face, and their trajectories.

The reverse search range setting unit 24, on the other hand, sets the search range in the tracking-target still image data on the basis of the characteristics of the human face already detected in one or more items of still image data at times later than the tracking-target still image data. This sets the search range in still image data lying in the direction going back along the time axis, that is, the reverse direction of the time axis, from the detected still image data in which the human face has already been detected. The one or more items of still image data at later times include the still image data at the time immediately after the tracking-target still image data. The reverse search unit 26 then searches for the human face within the search range set by the reverse search range setting unit 24, thereby tracking the human face in the tracking-target still image data.

Thus, in this embodiment, when the tracking unit 18 tracks a human face, it not only tracks the faces 34-1 and 34-2 in still image data in the direction along the time axis (forward direction) from the detected still image data, but also tracks the faces 34-1 and 34-2 in still image data in the direction going back along the time axis (reverse direction) from the detected still image data, as shown in FIG. 2.

The control device 14 controls at least one of the imaging direction and the zoom of the camera 10 on the basis of the face characteristic tracking results obtained by the image processing apparatus 12. The control device 14 includes a characteristic estimation unit 30 and a control command calculation unit 32.

The characteristic estimation unit 30 estimates the future characteristics of a human face on the basis of the characteristics of that face in the still image data in which it has already been detected by the search unit 16 and the tracking unit 18. The future characteristics of the face here include at least one of the future position and the future size of the face. The control command calculation unit 32 then calculates, on the basis of the future characteristics estimated by the characteristic estimation unit 30, a control command value for controlling at least one of the imaging direction and the zoom of the camera 10, and outputs it to the camera 10. As a result, at least one of the imaging direction and the zoom of the camera 10 is controlled so as to follow the human face. The camera 10 here also serves as the tracking imaging device that follows and images the human face.
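The relationship between the estimated future characteristics and the control command value could look like the sketch below. The source does not specify the estimation model or the form of the command, so everything here is an assumption made for illustration: constant-velocity extrapolation from the two most recent detections, a pan/tilt command expressed as the pixel offset from the image center, and a zoom command expressed as the ratio of a desired face size to the predicted one.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class FaceState:
    x: float     # horizontal position of the face in pixels
    y: float     # vertical position of the face in pixels
    size: float  # face size in pixels


def estimate_future(previous: FaceState, current: FaceState, steps: int = 1) -> FaceState:
    """Extrapolate position and size `steps` frames ahead, assuming that the
    change between the last two detections continues unchanged (an assumption)."""
    return FaceState(
        x=current.x + steps * (current.x - previous.x),
        y=current.y + steps * (current.y - previous.y),
        size=current.size + steps * (current.size - previous.size),
    )


def control_command(
    future: FaceState, image_width: int, image_height: int, target_size: float
) -> Tuple[float, float, float]:
    """Return a hypothetical (pan, tilt, zoom) command that would center the
    predicted face and scale it to target_size."""
    if future.size <= 0:
        raise ValueError("predicted face size must be positive")
    pan = future.x - image_width / 2    # positive: face is right of center
    tilt = future.y - image_height / 2  # positive: face is below center
    zoom = target_size / future.size    # above 1: zoom in
    return pan, tilt, zoom
```

A real camera would need these pixel offsets converted into its own pan, tilt, and zoom units, which depends on the device and is outside what the source describes.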

The image processing apparatus 12 and the control device 14 can be implemented by, for example, a computer in which a CPU, ROM, RAM, a hard disk, a communication I/F, and the like are connected by a bus. The processing by the search unit 16 and the tracking unit 18 and the processing by the characteristic estimation unit 30 and the control command calculation unit 32, described later, can be implemented in software by a program executed on the computer.

Next, the processing in the system of this embodiment will be described using the flowcharts shown in FIGS. 3 to 6. FIG. 3 is a flowchart of the processing by the search unit 16. FIG. 4 is a flowchart of the processing by which the tracking unit 18 tracks a face in the direction going back along the time axis, and FIG. 5 is a flowchart of the processing by which the tracking unit 18 tracks a face in the direction along the time axis. FIG. 6 is a flowchart of the processing by the control device 14. The following description assumes that a plurality of different faces are tracked in the moving image data.

In step (hereinafter S) 1 of the flowchart of FIG. 3, it is determined whether the search unit 16 should search the latest still image data obtained by the image preprocessing unit 28 for a face. The image preprocessing unit 28 obtains the latest still image data at predetermined time intervals. In the image processing apparatus 12 of this embodiment, the search condition is set so that the search unit 16 searches for a face at every predetermined frame interval. If the result of S1 is NO, the determination of S1 is repeated until the predetermined number of frames of still image data has been acquired since the still image data that was last searched. If the result of S1 is YES, the process proceeds to S2.

In S2, the search unit 16 searches the still image data for a human face. A human face can be detected in image data by using, for example, the skin-color detection technique disclosed in Non-Patent Document 1. As for the face search range, the entire range of the still image data is searched, for example. Searching the entire range takes a long time, however, typically longer than the frame interval period. As described above, the search range can therefore be narrowed by using a background difference or an inter-frame difference. When the search unit 16 finishes the face search, the process proceeds to S3.

In S3, it is determined whether the search unit 16 has detected a new face. If the result of S3 is YES, that is, a new face has been detected, the process proceeds to S4. If the result of S3 is NO, that is, no face has been detected, or a face that was already detected by the search unit 16 and is being tracked by the tracking unit 18 has been detected again, the process returns to S1 and the processing from S1 onward is repeated.

In S4, the tracking unit 18 starts tracking the face newly detected in S3. Specifically, the processing shown in the flowcharts of FIGS. 4 and 5 is started. The process then returns to S1, and the processing from S1 onward is repeated. The search unit 16 keeps repeating the face search even while a detected face is being tracked by the tracking unit 18 because, as shown in FIG. 2, a different face 34-2 may newly appear while one face 34-1 is being tracked. Repeating the face search by the search unit 16 at every predetermined frame interval in this way makes it possible to detect a plurality of different faces. Furthermore, even if the tracking unit 18 once fails to track a face, the search unit 16 can detect that face again, so the tracking unit 18 can resume tracking it.
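The S1 to S4 loop can be sketched as follows, under simplifying assumptions. The `detect` callable stands in for the full-range face search and is hypothetical, and faces are identified by plain labels so that "already being tracked" reduces to a set lookup; a real system would have to match detections to tracked faces by position or appearance.

```python
from typing import Callable, Hashable, Iterable, List, Sequence, Tuple


def run_search_loop(
    frames: Sequence[object],
    detect: Callable[[object], Iterable[Hashable]],
    interval: int,
) -> List[Tuple[int, Hashable]]:
    """Run the periodic full search and report where tracking would start.

    Returns (frame_index, face_label) pairs, one per newly detected face.
    """
    tracked = set()  # labels of faces that are already being tracked
    started: List[Tuple[int, Hashable]] = []
    for index, frame in enumerate(frames):
        # S1: search only at every `interval`-th frame.
        if index % interval != 0:
            continue
        # S2: full-range search of this frame.
        for face in detect(frame):
            # S3: a face that is already tracked is not new.
            if face in tracked:
                continue
            # S4: start backward and forward tracking for the new face.
            tracked.add(face)
            started.append((index, face))
    return started
```

In this sketch a face that was lost by the tracker would have to be removed from `tracked` for the periodic search to pick it up again, which corresponds to the re-detection case described above.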

Next, the processing shown in FIGS. 4 and 5 will be described. This embodiment describes the case where the processing of FIG. 4, which tracks a face in the direction going back along the time axis, and the processing of FIG. 5, which tracks a face in the direction along the time axis, are performed in parallel. These processes need not be performed in parallel, however. For example, the backward tracking of FIG. 4 may be performed first, followed by the forward tracking of FIG. 5. When the search unit 16 has detected a plurality of different faces, the processing of FIGS. 4 and 5 is performed for each face.

In S101 of the flowchart of FIG. 4, the reverse search range setting unit 24 sets, on the basis of the characteristics of the face in the detected still image data, the search range in the still image data lying in the direction going back along the time axis from the detected still image data. The process then proceeds to S102. Specific examples of setting the search range are described below.

As shown in FIG. 7, when the only detected still image data is, for example, frame f detected by the search unit 16, the search range for frame f-1, which is one frame back along the time axis from frame f, can be set from the center position between the eyes in frame f and the predicted amount of variation (vertical, horizontal, and depth) of that center position over one frame interval period.

Next, as shown in FIG. 8, consider the case where the detected still image data consists of, for example, frame f detected by the search unit 16 and frame f-1 detected by the reverse search unit 26. In this case, the search range for frame f-2, which is one frame back along the time axis from frame f-1, can be set from the center position between the eyes in frame f-2 obtained by regression from the center positions in frames f and f-1, and the predicted amount of variation of that center position over one frame interval period.

Next, as shown in FIG. 9, consider the case where the detected still image data consists of, for example, frame f detected by the search unit 16 and frames f-1 and f-2 detected by the reverse search unit 26. In this case, the search range for frame f-3, which is one frame back along the time axis from frame f-2, can be set from the center position between the eyes in frame f-3 obtained by regression from the center positions in frames f, f-1, and f-2, and the predicted amount of variation of that center position over one frame interval period.
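The regression step in the examples of FIGS. 7 to 9 can be illustrated with a short sketch. The source says only that the position is "calculated by regression"; the ordinary least-squares line fit used here, the function name, and the one-coordinate-at-a-time treatment are assumptions.

```python
from typing import Sequence


def predict_by_regression(
    frame_indices: Sequence[float],
    positions: Sequence[float],
    target_index: float,
) -> float:
    """Predict one coordinate of the eye-center position at target_index by
    fitting a least-squares line to the positions in the detected frames.

    With a single detected frame there is no trend to fit, so the detected
    position itself is returned (the FIG. 7 case).
    """
    if not frame_indices or len(frame_indices) != len(positions):
        raise ValueError("need one position per detected frame")
    n = len(frame_indices)
    if n == 1:
        return float(positions[0])
    mean_t = sum(frame_indices) / n
    mean_p = sum(positions) / n
    variance = sum((t - mean_t) ** 2 for t in frame_indices)
    if variance == 0:
        raise ValueError("frame indices must not all be equal")
    covariance = sum(
        (t - mean_t) * (p - mean_p) for t, p in zip(frame_indices, positions)
    )
    slope = covariance / variance
    return mean_p + slope * (target_index - mean_t)
```

For example, with frame f taken as index 0, horizontal positions 100, 97, and 94 in frames f, f-1, and f-2 extrapolate to 91 in frame f-3. The search range would then be this predicted position plus or minus the predicted variation, applied per coordinate.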

The predicted amount of position variation over one frame interval period can be set experimentally, for example. As an example, FIG. 10 shows an experimentally obtained probability distribution of the horizontal displacement of the center position between the eyes over one frame interval period. As shown in FIG. 10, the probability distribution obtained from two frames shows less spread than the one obtained from one frame. Similar experiments allow the predicted amount of position variation over one frame interval period to be set smaller as the number of detected frames increases, so the search range can be narrowed as the number of detected frames increases, which shortens the search time. Although the description above uses the center position between the eyes to set the search range, the search range can also be set using other positions on the face.
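One way to turn an experimentally measured displacement distribution into a search range is sketched below. The source does not say how the predicted variation is derived from the distribution, so the rule used here (a multiple k of the standard deviation of the measured prediction errors) is an assumption, as are the function names. The numbers in the usage note are made up for illustration and are not taken from FIG. 10.

```python
from statistics import pstdev
from typing import Sequence, Tuple


def margin_from_errors(errors: Sequence[float], k: float = 3.0) -> float:
    """Half-width of the search range, taken as k standard deviations of the
    experimentally measured prediction errors (assumed rule)."""
    return k * pstdev(errors)


def search_range(predicted: float, margin: float, image_size: int) -> Tuple[float, float]:
    """Clamp [predicted - margin, predicted + margin] to valid pixel coordinates."""
    low = max(0.0, predicted - margin)
    high = min(float(image_size - 1), predicted + margin)
    return low, high
```

With illustrative error samples of [-8, -4, 0, 4, 8] pixels for prediction from one frame and [-2, -1, 0, 1, 2] pixels for prediction from two frames, the second margin comes out four times smaller, which mirrors the narrowing of the search range as more frames are detected.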

In S102, the reverse search unit 26 searches for the face within the search range set by the reverse search range setting unit 24. As described above, the face can be detected using, for example, the method disclosed in Non-Patent Document 1. Because the search range is limited, the face search time of the reverse search unit 26 can be greatly shortened; typically, the search completes in a time far shorter than the frame interval. When the face search by the reverse search unit 26 ends, the process proceeds to S103.

In S103, it is determined whether the reverse search unit 26 detected the face being tracked. If the result of S103 is YES, the process returns to S101, and S101 and S102 are repeated for the still image data one frame earlier on the time axis. If the result of S103 is NO, the face being tracked is judged to have left the imaging range of the camera 10, and tracking of that face backward along the time axis ends. The determination in S103 may instead judge that the face has left the imaging range of the camera 10 only when the face could not be detected in a plurality of consecutive frames.
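The loop formed by S101 to S103 can be sketched as follows. This is a simplified illustration, not the claimed implementation: the window in S101 is centered on the most recently detected position rather than on a regression estimate, `detect` is a hypothetical callable standing in for the face detector, and `max_misses` models the optional rule that tracking ends only after several consecutive frames without a detection.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

Point = Tuple[float, float]
Window = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def track_backward(frames: Sequence[object], start_index: int, start_pos: Point,
                   detect: Callable[[object, Window], Optional[Point]],
                   margin: float, max_misses: int = 1) -> Dict[int, Point]:
    """Track a face from start_index toward earlier frames; return found positions."""
    found = {start_index: start_pos}
    last = start_pos
    misses = 0
    for i in range(start_index - 1, -1, -1):
        # S101: set a limited search range around the last known position.
        window = (last[0] - margin, last[1] - margin,
                  last[0] + margin, last[1] + margin)
        # S102: search only inside that range.
        pos = detect(frames[i], window)
        # S103: decide whether the tracked face was detected.
        if pos is None:
            misses += 1
            if misses >= max_misses:
                break  # face judged to have left the imaging range
            continue
        misses = 0
        found[i] = pos
        last = pos
    return found
```

Setting `max_misses` to a value greater than 1 corresponds to the variant in which a single missed frame does not end the tracking.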

Meanwhile, in S201 of the flowchart of FIG. 5, the forward search range setting unit 20 sets, based on the characteristics of the face in the detected still image data, the search range in the still image data that follows the detected still image data along the time axis. The process then proceeds to S202. This search range can be set by the same method as in the backward direction. Because the reverse search unit 26 searches for the face in parallel, the face has already been detected back along the time axis when the forward search range setting unit 20 sets the search range, so the detected face characteristics used for the setting include those detected by the reverse search unit 26. In the example of FIG. 9, the face characteristics in frames f-2 and f-1 detected by the reverse search unit 26 are used to set the search range in frame f+1. Using the face characteristics detected by the reverse search unit 26 in this way narrows the search range further, which further shortens the search time of the forward search unit 22.

In S202, the forward search unit 22 searches for the face within the search range set by the forward search range setting unit 20. As described above, the face can be detected using, for example, the method disclosed in Non-Patent Document 1. Because the search range is limited, the face search time of the forward search unit 22 can be greatly shortened; typically, the search completes in a time far shorter than the frame interval. When the face search by the forward search unit 22 ends, the process proceeds to S203.

In S203, it is determined whether the forward search unit 22 detected the face being tracked. If the result of S203 is YES, the process returns to S201, and S201 and S202 are repeated for the still image data one frame later on the time axis. If the result of S203 is NO, the face being tracked is judged to have left the imaging range of the camera 10, and tracking of that face forward along the time axis ends. The determination in S203 may instead judge that the face has left the imaging range of the camera 10 only when the face could not be detected in a plurality of consecutive frames.

Next, control of the camera 10 is described. In S301 of the flowchart of FIG. 6, it is determined whether to control the camera 10. If the search unit 16 and the tracking unit 18 have not yet produced enough face tracking results to estimate the future characteristics of the face, the result of S301 is NO, and the determination of S301 is repeated until enough tracking results are obtained. If the search unit 16 and the tracking unit 18 have produced enough tracking results to estimate the future characteristics of the face, the result of S301 is YES, and the process proceeds to S302.

In S302, the characteristic estimation unit 30 estimates the future characteristics of the face after a predetermined time has elapsed. These can be estimated, for example, by using a regression equation obtained from the center positions of both eyes in a plurality of detected still image data. The center positions of both eyes in the detected still image data include those detected by the reverse search unit 26, and using these as well allows the future characteristics of the face to be estimated accurately. The process then proceeds to S303.
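The decision in S301 and the estimation in S302 can be illustrated together with the following sketch. It assumes a straight-line regression of each coordinate against time and a hypothetical threshold `min_samples` for "enough tracking results"; neither the model nor the threshold is specified in the text, and `estimate_future_center` is an illustrative name.

```python
from typing import Optional, Sequence, Tuple

Sample = Tuple[float, float, float]  # (time, x, y) of the eye-center position


def estimate_future_center(samples: Sequence[Sample], horizon: float,
                           min_samples: int = 3) -> Optional[Tuple[float, float]]:
    """Return the extrapolated (x, y) at latest time + horizon, or None (S301: NO)."""
    if len(samples) < min_samples:
        return None  # not enough tracking results yet
    ts = [s[0] for s in samples]
    mean_t = sum(ts) / len(ts)
    var_t = sum((t - mean_t) ** 2 for t in ts)
    if var_t == 0:
        return None  # all samples share one timestamp; no trend can be fitted
    t_future = max(ts) + horizon
    estimate = []
    for k in (1, 2):  # fit x, then y, against time
        vs = [s[k] for s in samples]
        mean_v = sum(vs) / len(vs)
        slope = sum((t - mean_t) * (v - mean_v) for t, v in zip(ts, vs)) / var_t
        estimate.append(mean_v + slope * (t_future - mean_t))
    return (estimate[0], estimate[1])
```

The samples may mix positions found by forward and backward tracking; the order of the list does not matter because each sample carries its own time.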

In S303, the control command calculation unit 32 calculates a control command value for controlling the imaging direction of the camera 10. The control command value is calculated so that the imaging range of the camera 10 includes the future face estimated by the characteristic estimation unit 30. When tracking results have been obtained for a plurality of faces and the future characteristics of those faces have been estimated, the control command value is calculated so that the imaging range of the camera 10 includes all of those future faces. The control command value may be calculated, for example, by storing in advance a characteristic map relating the control command value to the imaging direction of the camera 10 and using that map. The process then proceeds to S304.
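A characteristic-map lookup of this kind might look like the sketch below. It assumes, for illustration only, a one-dimensional map of (imaging direction in degrees, command value) pairs with linear interpolation between entries, and it aims the camera at the midpoint of the estimated face directions; the map layout, the units, and the names `command_for_direction` and `pan_command_for_faces` are hypothetical.

```python
from bisect import bisect_left
from typing import List, Sequence, Tuple

# Each entry is (imaging direction in degrees, command value), sorted by direction.
CharMap = List[Tuple[float, float]]


def command_for_direction(char_map: CharMap, target_deg: float) -> float:
    """Look up the command value for a direction, interpolating linearly."""
    dirs = [d for d, _ in char_map]
    if target_deg <= dirs[0]:
        return char_map[0][1]  # clamp to the lowest mapped direction
    if target_deg >= dirs[-1]:
        return char_map[-1][1]  # clamp to the highest mapped direction
    i = bisect_left(dirs, target_deg)
    d0, c0 = char_map[i - 1]
    d1, c1 = char_map[i]
    w = (target_deg - d0) / (d1 - d0)
    return c0 + w * (c1 - c0)


def pan_command_for_faces(char_map: CharMap, face_dirs: Sequence[float]) -> float:
    """Aim at the midpoint of the estimated future face directions."""
    center = (min(face_dirs) + max(face_dirs)) / 2.0
    return command_for_direction(char_map, center)
```

Aiming at the midpoint keeps several faces in view only if their angular spread fits within the field of view; a fuller sketch would also adjust zoom, which the text mentions as an alternative control target.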

In S304, the control command value calculated by the control command calculation unit 32 is output to the camera 10, and the imaging direction of the camera 10 is controlled. The process then returns to S301, and the processing from S301 onward is repeated. For control of the camera 10, a control command value for controlling the zoom may be calculated and output instead of one for the imaging direction. A control command value for controlling both the imaging direction and the zoom may also be calculated and output.

As described above, according to the present embodiment, when a face is tracked in moving image data, tracking the face backward along the time axis yields the characteristics of the face further in the past. These past characteristics can then be reflected in the prediction of the future characteristics of the face. For example, consider predicting the moment a person starts to speak from face tracking results in moving image data, as in Non-Patent Document 2. Because there is a time lag between the start of lip movement and the actual start of speech, it is desirable to capture lip movement further in the past from the moving image data. Tracking the face backward along the time axis to capture that earlier lip movement allows the start of speech to be predicted earlier. Thus, in the present embodiment, tracking the face backward along the time axis to obtain tracking results further in the past allows the face to be tracked more accurately in the moving image data and allows its future characteristics to be predicted more accurately.

The result of tracking the face backward along the time axis, obtained by the reverse search unit 26, can also be used for tracking the face forward along the time axis. Specifically, the forward search range setting unit 20 sets the search range based on the face characteristics in the detected still image data, including those tracked backward along the time axis. This limits the search range further and shortens the face search time of the forward search unit 22.

In addition, the future characteristics of the face are estimated based on face characteristics that include the results of backward tracking, and at least one of the imaging direction and the zoom of the camera 10 is controlled so that the imaging range of the camera 10 includes the estimated future face. This allows the camera 10 to follow and image the face accurately.

In the present embodiment, the camera 10 serves as both the image processing imaging device used for the image processing of face detection and the tracking imaging device that follows and images the face, but separate cameras may be used for the two. Furthermore, the tracking device of the present invention is not limited to a camera. For example, the directivity of a directional microphone can be controlled based on the future characteristics of the face so that it follows a person's face.

Although the present embodiment describes the case in which the object to be tracked is a human face, the present invention is also applicable to tracking other objects.

Also, although the present embodiment describes the case in which the image processing apparatus tracks an object in real time in real-time moving image data, the image processing apparatus of the present invention is also applicable to post-processing that tracks an object in moving image data stored in advance.

Although an embodiment of the present invention has been described above, the present invention is in no way limited to this embodiment and can, of course, be carried out in various forms without departing from the gist of the invention.

FIG. 1 is a block diagram showing an outline of the configuration of an object tracking system including an image processing apparatus according to an embodiment of the present invention.
FIG. 2 is a diagram explaining face tracking processing by the image processing apparatus according to the embodiment of the present invention.
FIG. 3 is a flowchart explaining the processing executed by the search unit.
FIG. 4 is a flowchart explaining the processing executed by the tracking unit.
FIG. 5 is a flowchart explaining the processing executed by the tracking unit.
FIG. 6 is a flowchart explaining the processing executed by the control device.
FIG. 7 is a diagram explaining an example of setting the search range when tracking a face.
FIG. 8 is a diagram explaining an example of setting the search range when tracking a face.
FIG. 9 is a diagram explaining an example of setting the search range when tracking a face.
FIG. 10 is a diagram showing the probability distribution of the lateral displacement of the center position of both eyes over one frame interval.

Explanation of symbols

10 camera, 12 image processing apparatus, 14 control device, 16 search unit, 18 tracking unit, 20 forward search range setting unit, 22 forward search unit, 24 reverse search range setting unit, 26 reverse search unit, 28 image preprocessing unit, 29 image storage unit, 30 characteristic estimation unit, 32 control command calculation unit.

Claims

An image processing apparatus that performs processing for tracking an object in moving image data composed of a sequence of still image data, each being data at a different time, the apparatus comprising:
search means for searching for an object in still image data; and
tracking means for tracking, when the search means has detected one or more objects, the one or more objects in other still image data,
wherein the tracking means tracks the object backward along the time axis of the still image data sequence by tracking the object in the still image data to be tracked based on characteristics of the object already detected in one or more still image data later in time than the still image data to be tracked.

The image processing apparatus according to claim 1,
wherein the one or more still image data later in time than the still image data to be tracked include the still image data immediately after the still image data to be tracked.

The image processing apparatus according to claim 1,
wherein the characteristics of the object include at least one of a position, a size, an angle, a shape, a brightness, a color, and a trajectory of the object.

The image processing apparatus according to any one of claims 1 to 3,
wherein the tracking means tracks the object forward along the time axis of the still image data sequence by tracking the object in the still image data to be tracked based on characteristics, including the position and size, of the object already detected in one or more still image data earlier in time than the still image data to be tracked.

The image processing apparatus according to claim 4,
wherein the one or more still image data earlier in time than the still image data to be tracked include the still image data immediately before the still image data to be tracked.

The image processing apparatus according to any one of claims 1 to 5,
wherein the object is a face.

The image processing apparatus according to any one of claims 1 to 6,
wherein the image processing apparatus performs processing for tracking an object in real time in real-time moving image data.

An object tracking system comprising:
the image processing apparatus according to claim 7;
an image processing imaging device that images an object in order to obtain real-time moving image data including the object;
characteristic estimation means for estimating a future characteristic of the object based on the characteristics of the object in the still image data already detected by the tracking means;
a tracking device capable of following the object; and
control command calculation means for controlling the tracking device so that it follows the object, by calculating a control command value based on the future characteristic of the object estimated by the characteristic estimation means and outputting the control command value to the tracking device.

The object tracking system according to claim 8,
wherein the tracking device is a tracking imaging device capable of following and imaging the object, and
the control command calculation means calculates a control command value for controlling at least one of an imaging direction and a zoom of the tracking imaging device.

An image processing method for performing processing for tracking an object in moving image data composed of a sequence of still image data, each being data at a different time, the method comprising:
a search step of searching for an object in still image data; and
a tracking step of tracking, when one or more objects have been detected in the search step, the one or more objects in other still image data,
wherein the tracking step tracks the object backward along the time axis of the still image data sequence by repeatedly tracking the object in the still image data to be tracked based on characteristics of the object already detected in one or more still image data later in time than the still image data to be tracked.

An image processing program for causing a computer to execute processing for tracking an object in moving image data composed of a sequence of still image data, each being data at a different time, the program causing the computer to execute:
search processing for searching for an object in still image data; and
tracking processing for tracking, when one or more objects have been detected by the search processing, the one or more objects in other still image data,
wherein the tracking processing tracks the object backward along the time axis of the still image data sequence by tracking the object in the still image data to be tracked based on characteristics of the object already detected in one or more still image data later in time than the still image data to be tracked.