JP2013003860A

JP2013003860A - Object detection device and object detection program

Info

Publication number: JP2013003860A
Application number: JP2011134662A
Authority: JP
Inventors: Takuya Akashi; 卓也明石; Daijiro Hoshi; 大二郎星
Original assignee: Iwate University
Current assignee: Iwate University
Priority date: 2011-06-16
Filing date: 2011-06-16
Publication date: 2013-01-07

Abstract

PROBLEM TO BE SOLVED: To provide an object detection device for tracking a front face region of an object regardless of the posture of the object. SOLUTION: An object detection device 1, which has image acquisition parts 2 and 5 capable of acquiring a static image and a dynamic image composed of multiple frame images, detects an object whose posture has changed. The object detection device 1 comprises: a reference information generation part 4 for generating a histogram as a template from a static image of the object taken by the image acquisition part 2; and a detection processing part 6 for detecting an object region in a frame image to be searched, on the basis of that frame image, which is one of the multiple frame images composing the dynamic image taken by the image acquisition part 5, and the template generated by the reference information generation part 4.

Description

The present invention relates to a device for detecting an object such as a face or a sign in a moving image, and more particularly to an object detection device and an object detection program capable of detecting the object even when its posture changes.

In recent years, many consumer digital cameras have been equipped with a face detection function. As a conventional face detection technique, Non-Patent Document 1 discloses a method based on AdaBoost using Haar-like features. This method uses classifiers arranged in a cascade. Specifically, it builds on the AdaBoost method (Non-Patent Document 2), which combines arbitrary weak classifiers into a strong classifier with higher performance, and connects a plurality of such classifiers in series.

Patent Document 1 discloses a method for extracting a specific region in an image, namely a specific region of a face. This method uses a genetic algorithm to match against a specific template, and extracts the specific region from search images taken from a moving image.

Patent Document 1: JP 2006-12093 A

Non-Patent Document 1: P. Viola and M. Jones, “Rapid Object Detection using a Boosted Cascade of Simple Features”, Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 511-518 (2001).
Non-Patent Document 2: Y. Freund and R. E. Schapire, “A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting”, Journal of Computer and System Sciences, Vol. 55, pp. 119-139 (1997).

The AdaBoost method can detect objects with high accuracy even against a complex background, but it is generally limited to the specific postures of the object that were learned. To address this problem, methods have been proposed that learn a plurality of postures of the target and use the resulting plurality of classifiers. However, learning every posture is difficult. Moreover, although increasing the number of classifiers makes it possible to detect faces in all directions, training takes time and the computational cost is high.

In the method of Patent Document 1, suitability is evaluated by comparing the pixels contained in the specific region used as a template one-to-one with the pixels in a region of the same size in the image to be searched. When the subject under observation turns sideways, the size of the face region changes, and so does the number of pixels constituting it. Consequently, the face region of a sideways-facing subject could not be extracted.

Furthermore, although many consumer digital cameras have recently been equipped with a face detection function, as far as the present inventors know, no product copes robustly with the orientation of the face, that is, with a posture in which the face is turned away from the optical axis of the camera. Face detection is also an important elemental technology in the entertainment and security fields. Practical face detection requires a face search that is robust to the orientation of the face.

To solve these problems, an object of the present invention is to provide an object detection device and an object detection program that detect an object region regardless of the posture of the object.

To achieve the above object, a first configuration of the present invention is an object detection device that includes an image acquisition unit capable of acquiring a still image and a moving image composed of a plurality of frame images, and that detects an object whose posture has changed. The device comprises: a reference information creation unit that creates a histogram as a template from a still image of the object captured by the image acquisition unit; and a detection processing unit that detects an object region in a frame image to be searched, based on that frame image, which is one of the plurality of frame images constituting the moving image captured by the image acquisition unit, and on the template created by the reference information creation unit. The detection processing unit performs the following processes (α1) to (α4) based on a genetic algorithm.
(α1) Generate N individuals, each containing parameters that specify an object region in the image to be searched.
(α2) Create a histogram of the object region specified by the parameters of each individual's chromosome, and evaluate the degree of coincidence between each histogram and the template created by the reference information creation unit using a fitness function.
(α3) Generate N new individuals by genetic operations on the N individuals based on selection, crossover, and mutation.
(α4) Repeat (α2) and (α3) until the generation limit is reached, take the parameters of the individual with the highest fitness in the final generation as the solution, and judge the region specified by the solution to be the object region.
Here, the object is not limited to a human face or a part of the body. It includes living things such as animals, insects, and fish, immovable objects such as signboards and signs fixed to the land, and movable objects that can themselves be carried or moved, such as cars and televisions.
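
As a minimal sketch only, the processes (α1) to (α4) could be organized as the loop below. The names `run_ga`, `fitness`, and `breed` are hypothetical and do not come from the source; the bit-list chromosome is likewise an assumed encoding. The initial population passed in corresponds to (α1).

```python
import random
from typing import Callable, List, Sequence

Chromosome = List[int]  # assumed encoding: a list of bits


def run_ga(
    fitness: Callable[[Chromosome], float],
    population: Sequence[Chromosome],
    breed: Callable[[Sequence[Chromosome], Sequence[float], random.Random], List[Chromosome]],
    generations: int,
    rng: random.Random,
) -> Chromosome:
    """Run (a2) to (a4) on an initial population produced by (a1)."""
    pop = [list(c) for c in population]
    for _ in range(generations):
        scores = [fitness(c) for c in pop]   # (a2) evaluate every individual
        pop = breed(pop, scores, rng)        # (a3) produce N new individuals
    # (a4) the fittest individual of the final generation is the solution
    scores = [fitness(c) for c in pop]
    best_index = max(range(len(pop)), key=scores.__getitem__)
    return pop[best_index]
```

Passing the genetic operations in as `breed` keeps the loop independent of the particular selection, crossover, and mutation scheme.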

In the object detection device of the present invention, when the detection processing unit starts the genetic algorithm processing for the next frame image, it uses the chromosomes of the N individuals obtained by the genetic algorithm processing for the previous frame image.

Preferably, the object detection device of the present invention further includes a reference information update unit that replaces the template with another template. For example, when the histogram of the object region is the same, or differs only slightly, over a plurality of frame images, the reference information update unit sets that histogram as the template.
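
The update condition can be illustrated with a short sketch. The source does not state how the difference between histograms is measured or how small it must be, so the L1 distance, the threshold `max_diff`, and the function names below are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence

Histogram = Sequence[float]


def l1_distance(a: Histogram, b: Histogram) -> float:
    """Sum of absolute bin differences (an assumed difference measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def stable_histogram(recent: List[Histogram], max_diff: float) -> Optional[Histogram]:
    """Return the latest histogram if every pair of consecutive histograms
    in `recent` differs by at most `max_diff`; otherwise return None."""
    if len(recent) < 2:
        return None
    for prev, cur in zip(recent, recent[1:]):
        if l1_distance(prev, cur) > max_diff:
            return None
    return recent[-1]
```

A caller would keep the object-region histograms of the last few frames in `recent` and replace the template only when the function returns a histogram.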

To achieve the above object, a second configuration of the present invention is an object detection program for detecting an object whose posture has changed. The program causes a computer to function as: a reference information creation unit that creates a histogram as a template from a still image of the object captured by an image acquisition unit; and a detection processing unit that detects an object region in a frame image to be searched, based on that frame image, which is one of a plurality of frame images constituting a moving image captured by the image acquisition unit, and on the template created by the reference information creation unit. The detection processing unit performs the following processes (α1) to (α4) based on a genetic algorithm.
(α1) Generate N individuals, each containing parameters that specify an object region in the image to be searched.
(α2) Create a histogram of the object region specified by the parameters of each individual's chromosome, and evaluate the degree of coincidence between each histogram and the template created by the reference information creation unit using a fitness function.
(α3) Generate N new individuals by genetic operations on the N individuals based on selection, crossover, and mutation.
(α4) Repeat (α2) and (α3) until the generation limit is reached, take the parameters of the individual with the highest fitness in the final generation as the solution, and judge the region specified by the solution to be the object region.

In the object detection program of the present invention, when the detection processing unit starts the genetic algorithm processing for the next frame image, it uses the chromosomes of the N individuals obtained by the genetic algorithm processing for the previous frame image.

Preferably, the object detection program of the present invention further causes the computer to function as a reference information update unit that replaces the template with another template. For example, when the histogram of the object region is the same, or differs only slightly, over a plurality of frame images, the reference information update unit sets that histogram as the template.

According to the present invention, a face can be tracked by matching against a histogram template. In particular, because the histogram of a face remains constant regardless of the posture of the face, the face region can be tracked in real time.

FIG. 1 is a block diagram of a face detection device according to a first embodiment of the present invention.
FIG. 2(A) shows the face region specified by the reference information creation unit according to the first embodiment of the present invention, and FIGS. 2(B) to 2(D) show histogram templates.
FIG. 3 is a diagram showing the moving image information generated by the image acquisition unit according to the first embodiment of the present invention.
FIG. 4 is a diagram showing the structure of the chromosome of an individual according to the first embodiment of the present invention.
FIG. 5 is a block diagram showing the detection processing unit according to the first embodiment of the present invention.
FIG. 6 is a diagram explaining the operation of the face detection device according to the first embodiment of the present invention.
FIG. 7 is a diagram explaining the operation of the face detection device according to the first embodiment of the present invention.
FIGS. 8(A) to 8(D) are diagrams for explaining the operation of the template image acquisition unit and the reference information creation unit according to the first embodiment of the present invention.
FIG. 9 is a diagram explaining the operation of the face detection device according to the first embodiment of the present invention.
FIG. 10 is a diagram showing an example of a display screen on which the face region specified by the face detection device according to the first embodiment of the present invention is indicated by a frame.
FIG. 11(A) shows a template image, and FIGS. 11(B) to 11(K) show the region specified by each individual and the histogram of that region.
FIG. 12(A) shows a template image, FIG. 12(B) the histogram of the rectangular region in (A), FIG. 12(C) the rectangular region specified by the detection processing, and FIG. 12(D) the histogram of the rectangular region in (C).
FIGS. 13(a) to 13(i) are diagrams showing image tracking results obtained using the Cr component.
FIG. 14 is a block diagram showing a face detection device according to a second embodiment of the present invention.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[A. First Embodiment]
FIG. 1 is a block diagram of a face detection device 1 according to the first embodiment of the present invention. The face detection device 1 includes a first image acquisition unit 2, a storage unit 3, a reference information creation unit 4, a second image acquisition unit 5, and a detection processing unit 6.

The first image acquisition unit 2 acquires the image information needed for face tracking. Specifically, the first image acquisition unit 2 (also referred to as the template image acquisition unit) is configured as an imaging device such as a CCD camera or a CMOS image sensor, and acquires, for example, one still image of the face to be tracked. The image captured by the first image acquisition unit 2 is stored in the storage unit 3 as a reference image for creating the histogram template TP described later.

The reference information creation unit 4 reads the reference image from the storage unit 3 and processes it to detect a frontal face. For this face detection, a cascade classifier based on the AdaBoost method using Haar-like features is used. After a rectangular region is extracted from the image as the frontal face, a narrower rectangular face region AR (hereinafter sometimes called a rectangular region), shown in FIG. 2(A), is extracted from it.

The reference information creation unit 4 obtains histograms from the face region AR obtained in this way. In this embodiment, the image data is expressed in the YCrCb color system rather than the RGB color system. Here, Y is the luminance, Cr is the red color difference, and Cb is the blue color difference. Unlike the RGB color system, the YCrCb color system separates luminance from color and can therefore cope with changes in luminance, which is why it is used.

FIG. 2(B) shows the histogram of the Y component of the face region AR, FIG. 2(C) the histogram of the Cb component, and FIG. 2(D) the histogram of the Cr component. These histograms are hereinafter referred to as the histogram template TP.
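
The construction of such a template can be sketched as follows. The source does not give the RGB-to-YCrCb conversion or the number of bins, so the full-range BT.601 constants, the 16-bin default, and the names `rgb_to_ycrcb` and `histogram_template` are assumptions for illustration.

```python
from typing import Dict, Iterable, List, Tuple

RGB = Tuple[int, int, int]


def _clip(value: float) -> int:
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, int(round(value))))


def rgb_to_ycrcb(r: int, g: int, b: int) -> Tuple[int, int, int]:
    """Full-range BT.601 conversion (assumed; not stated in the source)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = 128 + 0.713 * (r - y)
    cb = 128 + 0.564 * (b - y)
    return _clip(y), _clip(cr), _clip(cb)


def histogram_template(pixels: Iterable[RGB], bins: int = 16) -> Dict[str, List[int]]:
    """Count the pixels of a face region into one histogram per component."""
    hist = {"Y": [0] * bins, "Cb": [0] * bins, "Cr": [0] * bins}
    for r, g, b in pixels:
        y, cr, cb = rgb_to_ycrcb(r, g, b)
        hist["Y"][y * bins // 256] += 1    # map 0..255 onto 0..bins-1
        hist["Cb"][cb * bins // 256] += 1
        hist["Cr"][cr * bins // 256] += 1
    return hist
```

Here `pixels` would be the RGB values inside the rectangular region AR, and the returned dictionary plays the role of the histogram template TP.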

The second image acquisition unit 5 captures images of the tracking target in real time. Specifically, the second image acquisition unit 5 is configured as an imaging device such as a CCD camera or a CMOS image sensor, and acquires a moving image of the face to be tracked, that is, a plurality of frame images. As shown in FIG. 3, moving image information consisting of a plurality of frame images F is hereinafter called a video sequence VS. In this embodiment, a single imaging device may serve as both the second image acquisition unit 5 and the first image acquisition unit 2.

The detection processing unit 6 specifies the position of the face to be tracked in each frame image F constituting the video sequence VS created by the second image acquisition unit 5, that is, in a target image in which the imaged subject is stationary. Specifically, it specifies the position of the face region in the target image (see AR′ in FIG. 10). To distinguish it from the rectangular region AR used when creating the histogram template TP (see FIG. 2(A)), the face region in the target image (hereinafter sometimes called a rectangular region) is given the different reference sign AR′ in the following description.

The detection processing unit 6 performs face detection based on the frame image F created by the second image acquisition unit 5 and the histogram template TP stored in the storage unit 3. As described in detail later, the detection processing unit 6 performs the detection based on a genetic algorithm (hereinafter sometimes called GA).

In the detection processing of this embodiment, the face region on the target image, that is, the face region AR′ to be tracked in the image, is represented by parameters, and template matching against the face region AR′ specified by these parameters is solved as an optimization problem. Specifically, the match between the histogram of the face region AR′ specified by the parameters and the histogram template TP is evaluated.

The parameters are the coordinates of the center of the rectangular region AR′ that is the search target, its size, and its rotation angle. In the detection processing, the original rectangular region AR, or a rectangular region AR′ derived from it, is geometrically transformed on the target image using the parameters to specify a new rectangular region AR′. In this embodiment, a genetic algorithm is used to optimize the match between the histogram of the pixels in the transformed rectangular region AR′ and the histogram template TP.

As the chromosome structure of each individual in the genetic algorithm, the chromosome CH of this embodiment holds, as shown in FIG. 4, the center coordinates (c_x, c_y) of the original rectangular region AR or of the rectangular region AR′ before transformation, the scaling factors m_x and m_y of the rectangular frame in the x-axis and y-axis directions, and the rotation angle angle of the rectangular frame. These are the solution ultimately sought: parameters representing the position, size, and rotation angle of the rectangular region AR′ specified as the face region on the target image to be searched. In this embodiment, the possible ranges of these parameters are set by expressions (1) to (3).

The scaling factors were determined in consideration of the size of the template, and the rotation angle in consideration of how far a face can tilt in daily life. In this embodiment, each of the parameters c_x, c_y, m_x, m_y, and angle is represented by 8 bits, so the chromosome CH of one individual is represented by 40 bits in total.
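
The 40-bit layout can be illustrated by a decoder that maps each 8-bit gene onto its parameter range. Expressions (1) to (3) are not reproduced in this text, so the numeric ranges in `RANGES` are placeholder assumptions. The gene order and the linear mapping are also assumed rather than taken from the source.

```python
from typing import Dict, List, Tuple

GENES = ("c_x", "c_y", "m_x", "m_y", "angle")  # assumed gene order
BITS_PER_GENE = 8

# Placeholder ranges only: the source's expressions (1) to (3) are not available here.
RANGES: Dict[str, Tuple[float, float]] = {
    "c_x": (0.0, 319.0),
    "c_y": (0.0, 239.0),
    "m_x": (0.5, 2.0),
    "m_y": (0.5, 2.0),
    "angle": (-45.0, 45.0),
}


def decode(bits: List[int]) -> Dict[str, float]:
    """Decode a 40-bit chromosome into the five region parameters."""
    if len(bits) != len(GENES) * BITS_PER_GENE:
        raise ValueError("chromosome must contain 40 bits")
    params: Dict[str, float] = {}
    for k, name in enumerate(GENES):
        gene = bits[k * BITS_PER_GENE:(k + 1) * BITS_PER_GENE]
        raw = int("".join(str(bit) for bit in gene), 2)  # integer in 0..255
        low, high = RANGES[name]
        params[name] = low + (high - low) * raw / 255    # linear map onto the range
    return params
```

With this mapping, an all-zero chromosome decodes to the lower bound of every range and an all-one chromosome to the upper bound.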

The detection processing unit 6 treats each of the parameters c_x, c_y, m_x, m_y, and angle as genetic information (hereinafter sometimes called a gene). To solve the optimization problem, the detection processing unit 6 uses the genetic algorithm to generate next-generation individuals that inherit these parameters as genetic information. It then compares the histogram of the rectangular region AR′ specified by each generated individual, that is, by the genes (parameters) c_x, c_y, m_x, m_y, and angle constituting the individual's chromosome CH, with the histogram template TP, and evaluates the suitability of those parameters as a solution.

For this purpose, as shown in FIG. 5, the detection processing unit 6 includes a preprocessing unit 61, a genetic operation unit 62, a coordinate conversion unit 63, a fitness calculation unit 64, and a determination unit 65.

The preprocessing unit 61 generates N individuals, that is, N chromosomes CH (see FIG. 4). The value of each of the parameters c_x, c_y, m_x, m_y, and angle constituting each chromosome CH is selected at random within the ranges (1) to (3) above. In this embodiment the number of individuals N is set to 10, but N is not limited to this value.

The genetic operation unit 62 generates N new individuals by selecting one of the N individuals and operating on it, or by selecting two individuals and operating on them. Specifically, the genetic operation unit 62 performs three genetic operations on individuals: selection (culling and reproduction), crossover, and mutation.

Here, crossover is an operation that exchanges the genetic information constituting the chromosomes of individuals (see FIG. 4), that is, the parameters c_x, c_y, m_x, m_y, and angle. Specifically, it exchanges the genetic information of two selected parent individuals with each other. Mutation is an operation that changes part of the genetic information constituting a chromosome, that is, part of the parameters c_x, c_y, m_x, m_y, and angle, to a different value. The value assigned to a parameter by mutation is selected at random.

In this embodiment, the crossover rate is set to 0.7 and the mutation rate to 0.05, but they are not limited to these values. The genetic algorithm uses uniform crossover as the crossover method and roulette selection as the selection method, and is based on an elite preservation strategy. However, the crossover method, the selection method, and so on are not limited to these. As the elite preservation strategy in this embodiment, the single individual with the highest fitness, described later (hereinafter called the elite individual), is carried over to the next generation.
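
One possible reading of these settings is sketched below on bit-list chromosomes. The source does not say whether crossover and mutation act on bits or on whole parameters, nor whether the mutation rate applies per bit, so both are assumptions. The function names are hypothetical, and roulette selection here assumes non-negative fitness values.

```python
import random
from typing import List, Sequence, Tuple

Chromosome = List[int]


def roulette_select(pop: Sequence[Chromosome], scores: Sequence[float],
                    rng: random.Random) -> Chromosome:
    """Pick one individual with probability proportional to its fitness."""
    total = sum(scores)
    if total <= 0:
        return rng.choice(list(pop))
    pick = rng.uniform(0, total)
    acc = 0.0
    for individual, score in zip(pop, scores):
        acc += score
        if pick <= acc:
            return individual
    return pop[-1]


def uniform_crossover(a: Chromosome, b: Chromosome,
                      rng: random.Random) -> Tuple[Chromosome, Chromosome]:
    """Swap each position between the two parents with probability 0.5."""
    c1, c2 = list(a), list(b)
    for i in range(len(c1)):
        if rng.random() < 0.5:
            c1[i], c2[i] = c2[i], c1[i]
    return c1, c2


def mutate(c: Chromosome, rate: float, rng: random.Random) -> Chromosome:
    """Flip each bit independently with probability `rate` (assumed per bit)."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in c]


def next_generation(pop: Sequence[Chromosome], scores: Sequence[float],
                    rng: random.Random, crossover_rate: float = 0.7,
                    mutation_rate: float = 0.05) -> List[Chromosome]:
    """Build N children; the elite individual is copied unchanged."""
    elite = pop[max(range(len(pop)), key=scores.__getitem__)]
    children = [list(elite)]
    while len(children) < len(pop):
        a = roulette_select(pop, scores, rng)
        b = roulette_select(pop, scores, rng)
        if rng.random() < crossover_rate:
            a, b = uniform_crossover(a, b, rng)
        children.append(mutate(a, mutation_rate, rng))
        if len(children) < len(pop):
            children.append(mutate(b, mutation_rate, rng))
    return children
```

Copying the elite individual before any crossover or mutation guarantees that the best fitness never decreases from one generation to the next.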

The genetic operations are performed up to 40 times per target image captured by the second image acquisition unit 5, that is, per frame image F. However, the number of generations G is not limited to 40.

Here, the coordinates of the rectangular region AR′ specified by an individual generated by the genetic operation unit 62, that is, by the parameters c_x, c_y, m_x, m_y, and angle constituting that individual's chromosome CH, are expressed by the following coordinate transformation. Homogeneous coordinates are used for all geometric transformations to keep the transformation matrices simple. Let P be the coordinates of a point of the face region on the image before transformation, and let P* be the coordinates of that point after the transformation specified by the generated individual. In homogeneous coordinates, these points are expressed by equations (4) and (5).

Furthermore, the point P* is expressed by equation (6).

The coordinate conversion unit 63 transforms the pre-transformation coordinate point P according to equation (6) to specify the position of the search-target rectangular region AR′ in the target image. For the first frame, the point P is a coordinate of the rectangular region AR (see FIG. 2(A)) extracted from the reference image acquired by the first image acquisition unit 2 in FIG. 1. From the next frame onward, it is a coordinate of the rectangular region AR′ specified in the previous frame image.
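
Equation (6) is not reproduced in this text, so the following is only a sketch of one common way to compose such a transformation in homogeneous coordinates. It assumes that a point given relative to the region center is first scaled by (m_x, m_y), then rotated by angle, then translated to (c_x, c_y). The order of composition and the function names are assumptions.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transform_matrix(c_x: float, c_y: float, m_x: float, m_y: float,
                     angle_deg: float) -> Matrix:
    """Assumed composition: translate * rotate * scale."""
    t = math.radians(angle_deg)
    scale = [[m_x, 0.0, 0.0], [0.0, m_y, 0.0], [0.0, 0.0, 1.0]]
    rotate = [[math.cos(t), -math.sin(t), 0.0],
              [math.sin(t), math.cos(t), 0.0],
              [0.0, 0.0, 1.0]]
    translate = [[1.0, 0.0, c_x], [0.0, 1.0, c_y], [0.0, 0.0, 1.0]]
    return matmul(translate, matmul(rotate, scale))


def apply(m: Matrix, x: float, y: float) -> Tuple[float, float]:
    """Apply the matrix to the homogeneous point (x, y, 1)."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])
```

Because every step is a 3x3 matrix, the whole transformation of a region reduces to one matrix applied to each of its points.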

The fitness calculation unit 64 matches the histogram of the pixels inside the rectangular region AR′ positioned by the coordinate conversion unit 63 against the histogram template TP. Specifically, it calculates the histogram of the rectangular region AR′ on the target image and determines the fitness between that histogram and the histogram of the template TP, that is, the degree of similarity between the two. When the scaling factor is changed, the rectangular region AR′ differs in size from the face region AR on which the histogram template TP is based. However, the number of pixels of the rectangular region AR′ used to create its histogram is matched to the number of pixels contained in the face region AR. For example, if the face region AR is 10 × 10 pixels and the rectangular region AR′ is 20 × 20 pixels, the histogram of the rectangular region AR′ is created not from all 400 pixels but from 100 pixels, the same number as in the face region AR, taken for example from every other pixel.
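
The pixel-count matching in the example above can be sketched as a regular subsampling of the candidate region. The function name `subsample` and the nearest-index rule are assumptions. The sketch also assumes that the region has already been cut out as an upright grid of pixels.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def subsample(pixels: Sequence[Sequence[T]], target_rows: int,
              target_cols: int) -> List[T]:
    """Pick target_rows x target_cols pixels at regular intervals from a
    rows x cols grid, so the sample count matches the template region."""
    rows, cols = len(pixels), len(pixels[0])
    return [
        pixels[r * rows // target_rows][c * cols // target_cols]
        for r in range(target_rows)
        for c in range(target_cols)
    ]
```

For a 20 × 20 region and a 10 × 10 target, the indices are 0, 2, 4, and so on, that is, every other pixel in each direction, giving 100 samples.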
The fitness between the histogram of the rectangular region AR′ and the histogram of the template TP is expressed by the fitness functions (7) and (8).

ρ_i is the histogram similarity for each component of the color system (i = 1 for the Y component, i = 2 for the Cb component, and i = 3 for the Cr component), m is the number of histogram bins, p is the histogram of the histogram template TP, q is the histogram of the rectangular region AR′ on the target image (the frame image under observation), and N is the number of pixels in the rectangular region AR from which the histogram template TP was created (not to be confused with the number of individuals N). The higher the fitness, the more similar the histogram is to that of the template TP. The histogram similarity in equation (7) is calculated using the Bhattacharyya coefficient.
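
Equations (7) and (8) are not reproduced in this text. As a sketch, the standard Bhattacharyya coefficient between two histograms normalized to sum to 1 is the sum over the m bins of sqrt(p_u * q_u), which equals 1 for identical histograms and 0 for histograms with no overlap. How the source combines ρ_1, ρ_2, and ρ_3 into a single fitness is not known here, so the plain average used below is an assumption, as are the function names.

```python
import math
from typing import Dict, Sequence


def bhattacharyya(p: Sequence[float], q: Sequence[float]) -> float:
    """Bhattacharyya coefficient of two histograms given as bin counts."""
    n_p, n_q = sum(p), sum(q)
    if n_p == 0 or n_q == 0:
        return 0.0
    # Normalize each histogram by its pixel count before taking the root.
    return sum(math.sqrt((a / n_p) * (b / n_q)) for a, b in zip(p, q))


def fitness(template: Dict[str, Sequence[float]],
            candidate: Dict[str, Sequence[float]]) -> float:
    """Average the per-component similarities (assumed combination rule)."""
    channels = ("Y", "Cb", "Cr")
    total = sum(bhattacharyya(template[ch], candidate[ch]) for ch in channels)
    return total / len(channels)
```

Because the coefficient depends only on the shape of each histogram, it stays high when the same face is seen from a different direction, provided the colors inside the region are largely the same.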

From the results calculated by the fitness calculation unit 64 for each of the N individuals, the determination unit 65 determines which individual best identifies the face area being tracked. Specifically, the determination unit 65 determines which individual has the highest fitness. Under the elite preservation strategy, the individual with the highest fitness is passed on to the next generation with its genetic information unchanged. When the number of generation changes reaches the maximum G, for example the 40th generation in this embodiment, the individual with the highest fitness is judged to identify the face area being tracked. The target face area AR′ is then specified from the parameters c_x, c_y, m_x, m_y, and angle that make up the chromosome CH of the individual finally selected.

The face detection device 1 described above is implemented, for example, by a computer. By executing a face tracking program installed in advance as software, the computer carries out the method described above, that is, the face detection process. Specifically, when the computer executes the detection processing program, it functions as the reference information creation unit 4 and the detection processing unit 6, in particular the preprocessing unit 61, the genetic operation unit 62, the coordinate conversion unit 63, the fitness calculation unit 64, and the determination unit 65.

A plurality of computers may also be connected to one another via a LAN, the Internet, a public network, or the like, so that the operations of the reference information creation unit 4 and the detection processing unit 6, in particular the preprocessing unit 61, the genetic operation unit 62, the coordinate conversion unit 63, the fitness calculation unit 64, and the determination unit 65, are distributed across a plurality of personal computers. A computer of conventionally known configuration can be used. It comprises storage devices such as RAM, ROM, and a hard disk; operation devices such as a keyboard and a pointing device; a central processing unit (CPU) that processes the data and software stored in the storage devices in response to instructions from the operation devices and the like; and a display that shows processing results and the like. The computer may be a general-purpose device or a device configured for this dedicated purpose.

Next, the operation of the face detection device 1 according to this embodiment will be described.
As preprocessing for the detection process, the face detection device 1 acquires the histogram template TP as reference information in step S1 shown in FIG. 6. Specifically, as shown in FIG. 7, the first image acquisition unit 2 acquires in step S11 the input image F1 (see FIG. 8(A)) needed for face tracking. In step S12, the reference information creation unit 4 processes this input image F1, that is, the reference image, and detects the front face R (see FIG. 8(B)) by the AdaBoost method using Haar-like features. In step S13, the reference information creation unit 4 then uses the AdaBoost method to extract a rectangular face area AR (see FIG. 8(C)) narrower than the rectangular region of the front face R, calculates the histogram of that face area AR (see FIG. 8(D)), and creates the histogram template TP (step S14).

After the above preprocessing is complete, the face detection device 1 executes the detection process. First, as shown in FIG. 6, the preprocessing unit 61 initializes the genetic algorithm in step S2. This initialization is performed only when the face detection device 1 starts the detection process. It generates N individuals (for example, N = 10) whose genetic information consists of the parameters c_x, c_y, m_x, m_y, and angle that specify the rectangular area AR′. The value of each parameter making up the chromosome of each individual is selected at random within the ranges (1) to (3) given above.
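The initialization step can be sketched as follows. This is a hedged illustration, not the patent's encoding: it assumes a 40-bit chromosome split evenly into five 8-bit genes, and the parameter ranges in `RANGES` are hypothetical placeholders, since the ranges (1) to (3) are not reproduced in this section.

```python
import numpy as np

GENES = ("c_x", "c_y", "m_x", "m_y", "angle")
BITS_PER_GENE = 8  # assumption: 40 bits divided evenly among five parameters

# Hypothetical ranges for illustration only; not the patent's ranges (1) to (3).
RANGES = {
    "c_x": (0.0, 319.0),
    "c_y": (0.0, 239.0),
    "m_x": (0.5, 2.0),
    "m_y": (0.5, 2.0),
    "angle": (-45.0, 45.0),
}

def init_population(n, rng):
    """Create n random bit-string chromosomes, one row per individual."""
    return rng.integers(0, 2, size=(n, len(GENES) * BITS_PER_GENE), dtype=np.uint8)

def decode(chromosome):
    """Map each 8-bit gene linearly onto its parameter range."""
    params = {}
    for k, name in enumerate(GENES):
        bits = chromosome[k * BITS_PER_GENE:(k + 1) * BITS_PER_GENE]
        value = int("".join(str(int(b)) for b in bits), 2)
        lo, hi = RANGES[name]
        params[name] = lo + (hi - lo) * value / (2 ** BITS_PER_GENE - 1)
    return params
```

For example, `init_population(10, np.random.default_rng(0))` yields the N = 10 individuals mentioned above, and `decode` turns any one of them into the five parameters that place the rectangular area.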

Next, in step S3, the second image acquisition unit 5 acquires the image information F to be searched (see FIG. 3), and in step S4 the detection processing unit 6 matches the histogram template TP using the GA.

In matching the histogram template TP by the GA, each of the N individuals generated by the preprocessing unit 61 is first evaluated in step S41, as shown in FIG. 9. Specifically, the histogram within the rectangular area AR′ specified by the genetic information of the chromosome CH of each individual generated by the preprocessing unit 61, that is, by the parameters c_x, c_y, m_x, m_y, and angle, is calculated, and the degree of matching between this histogram and the histogram template TP is calculated. The fitness is calculated here from the fitness functions given in equations (7) and (8).

Next, in step S42, the determination unit 65 of the detection processing unit 6 determines whether the individuals satisfy the termination condition. Specifically, it determines whether the number of genetic operations applied to the individuals, that is, the number of generation changes, has reached the maximum G. If the maximum number of generation changes G has been reached, the individual with the highest fitness among the N individuals of the final generation is selected, and its genetic information, that is, the parameters c_x, c_y, m_x, m_y, and angle, is treated as the solution sought.

If the termination condition is not satisfied, the genetic operation unit 62 performs genetic operations on the individuals in step S43, in other words, exchanges parameters. The genetic operation unit 62 applies selection (culling and reproduction), crossover, and mutation to the N individuals with given probabilities to generate N new individuals. Under the elite preservation strategy, the elite individual is carried over unchanged to the next generation.
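One generation change of the kind described above might look like the following sketch. It uses the operators and rates reported in the experiments of this document (roulette selection, uniform crossover at rate 0.7, mutation rate 0.05, one elite individual), but the details are assumptions: the chromosomes are taken to be 0/1 bit arrays, mutation is applied per bit, and the function name `next_generation` is hypothetical.

```python
import numpy as np

def next_generation(pop, fitness, rng, crossover_rate=0.7, mutation_rate=0.05):
    """Produce a new population of the same size from bit-string individuals."""
    pop = np.asarray(pop)
    fitness = np.asarray(fitness, dtype=float)
    n, length = pop.shape

    # Elite preservation: the best individual is copied unchanged.
    elite = pop[np.argmax(fitness)].copy()

    # Roulette selection: pick parents with probability proportional to fitness.
    total = fitness.sum()
    probs = fitness / total if total > 0 else np.full(n, 1.0 / n)

    children = []
    while len(children) < n - 1:
        i, j = rng.choice(n, size=2, p=probs)
        a, b = pop[i].copy(), pop[j].copy()
        if rng.random() < crossover_rate:
            # Uniform crossover: swap each bit independently with probability 0.5.
            mask = rng.random(length) < 0.5
            a[mask], b[mask] = pop[j][mask], pop[i][mask]
        children.extend([a, b])
    children = np.array(children[: n - 1]).reshape(n - 1, length)

    # Mutation: flip each bit of the non-elite children with a small probability.
    flip = rng.random(children.shape) < mutation_rate
    children = np.where(flip, 1 - children, children)
    return np.vstack([elite, children])
```

Keeping the elite in row 0 makes it easy to check that the best solution found so far is never lost between generations. The sketch assumes non-negative fitness values, which holds for similarity scores such as the Bhattacharyya coefficient.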

In this way, the genetic operation unit 62 generates N new individuals. It is then determined whether the newly generated individuals satisfy the termination condition (from step S43 to step S42). For the new individuals as well, the histogram within the rectangular area AR′ is calculated, and the degree of matching between this histogram and the histogram template TP is calculated from the fitness functions given in equations (7) and (8).

The genetic operations of step S43 and the fitness evaluation are repeated until the termination condition is satisfied in step S42. In this embodiment, the number of generation changes G per frame of image information is set to 40.

When the termination condition is satisfied after repeated generation changes by genetic operations, the parameters c_x, c_y, m_x, m_y, and angle of the individual with the highest fitness in the final-generation population are treated as the solution sought, and the rectangular area AR′ specified by these parameters is shown on the display, for example as in FIG. 10.

The information for each of the N individuals of the final generation is saved in the storage unit 3 so that it can be used as is when processing of the next frame image F begins.

When face tracking for one frame of image information is thus complete, that is, when the region estimated to be the face area has been enclosed in a rectangular frame, the next frame image F (see FIG. 3) is read as image information from the video sequence stored as a moving image in the storage unit 3, and the detection process is performed on this next frame image F (from step S4 to step S3 in FIG. 6). When the detection process starts for the next frame image F, the individuals are not initialized, that is, the N individuals are not regenerated. Instead, the N individuals produced as the final generation in the detection process for the previous frame image are used as the first-generation population for the next frame image.
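The frame-to-frame inheritance of the population described above can be sketched as a simple loop. This is an illustrative outline only: `evaluate` and `evolve` are hypothetical callables standing in for the fitness calculation and the genetic operations, and the function name `track_sequence` is not from the patent.

```python
def track_sequence(frames, population, evaluate, evolve, generations=40):
    """Run the GA on each frame, reusing the final population for the next frame.

    evaluate(frame, individual) -> fitness score (higher is better)
    evolve(population, scores)  -> new population of the same size
    """
    results = []
    for frame in frames:
        scores = [evaluate(frame, ind) for ind in population]
        for _ in range(generations):
            population = evolve(population, scores)
            scores = [evaluate(frame, ind) for ind in population]
        best = max(range(len(population)), key=scores.__getitem__)
        results.append(population[best])
        # No re-initialisation here: the final generation of this frame
        # becomes the first generation of the next frame.
    return results, population
```

Because the object usually moves little between consecutive frames, starting from the previous solution tends to place the first generation close to the new optimum, which is the rationale the text gives for needing few individuals and generations.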

Thus, the face detection device 1 according to this embodiment can track a face by matching against the histogram template TP. In particular, because the histogram of a face remains constant regardless of the face's posture, the face detection device 1 can track the face area AR′ in real time. Furthermore, the individuals are initialized only once, at the initial frame, and the individuals evolved in the previous frame, that is, the parameters constituting the genetic information that determines fitness, are inherited by the next frame. This makes it possible to reduce the number of individuals and the number of generation changes, lowering the computational cost and improving accuracy.

[B. Experimental Example]
The effectiveness of the histogram and of each color system in the embodiment of the present invention is described below.

B1: Experiment 1. Effectiveness of the histogram
[B1-1. Experiment contents]
The similarity between the histogram of the rectangular area AR′ of the target image extracted by the face detection process and the histogram template TP is checked, and the relationship between the degree of similarity and face tracking is investigated.

[B1-2. System settings]
The GA parameters used in the experiment were set as follows. The population size was 10, the crossover rate was 0.7 with uniform crossover, the mutation rate was 0.05, selection was by roulette selection, and the elite preservation strategy was used. The number of generation changes per frame image was 40. A computer with a 3.2 GHz CPU was used for the experiment.

[B1-3. Evaluation method]
For the rectangular areas AR′ obtained by the face detection device 1 of this system for one frame, the similarity between the histogram of each rectangular area AR′ and the histogram template TP is checked to evaluate whether the histogram patterns resemble each other. The relationship between the degree of histogram matching and the position of the face area specified by the system is also evaluated from the images produced by the system.

[B1-4. Experimental results]
FIG. 11(A) shows the histogram template TP. FIGS. 11(B) to 11(K) show the results of the system processing a frame, that is, the 10 individuals (the first to tenth individuals) that remained as the final generation in the detection process for the frame image. In each figure, the right side shows the rectangular area AR′ specified by the system and the left side shows the histogram of that rectangular area AR′.

Comparing the histogram template TP in FIG. 11(A) with the histogram in FIG. 11(C) confirms that the histogram patterns are largely similar. As shown on the right side of FIG. 11(C), the rectangular area AR′ is placed almost exactly on the subject's face.
In contrast, comparing the histogram of the first individual in FIG. 11(B) with the histogram template TP in FIG. 11(A) shows that the two patterns differ in shape, with no similar portions. When the shapes differ in this way, the rectangular area AR′ is displaced from the subject's face, as shown on the right side of FIG. 11(B). Furthermore, for the sixth individual, whose histogram differs greatly in shape from the pattern of the template TP, the rectangular area AR′ set by the system lies far from the subject's face, as shown in FIG. 11(G).
Table 1 shows the distance used to evaluate the degree of similarity between the histogram of each individual CH obtained by the system and the histogram template TP.

This distance was calculated from equation (9), which expresses the similarity between two histograms as a distance d. The shorter the distance, the more similar the histograms; the longer the distance, the less similar.

Here, H₁ is the histogram of the individual, H₂ is the histogram of the template TP, and I is the number of bins.
Table 1 confirms that the distance d is longest for individual number 6 (see FIG. 11(G)), whose histogram pattern bears almost no resemblance to the template. Conversely, the distance is shortest for individual number 2 (see FIG. 11(C)), whose histogram pattern closely resembles the template.
These results confirm that face recognition by matching against the histogram template TP is effective, that is, that the histogram template TP is effective.

B2: Effectiveness of each color system
[B2-1. Experiment contents]
The target images are a moving image sequence captured with a web camera while the subject turned his or her face up, down, left, and right relative to the camera. Each target image is 320 × 240 pixels, and the sequence contains 180 frames. The front face used to obtain the histogram template TP was a 16 × 21 pixel image of the same subject.

Because this embodiment uses histograms, the results are likely to depend strongly on the combination of color system components. The effectiveness of detection for each color system component is therefore checked by experiments on 14 combinations of components: YCbCr, YCr, YCb, CbCr, Y, Cr, Cb, HSV, HS, HV, SV, H, S, and V.

As the evaluation criterion, the center coordinates of the face on each target image are determined visually in advance as the correct coordinates. Each result is judged correct or incorrect from the distance between the center coordinates of the detected rectangular area AR′ and the correct coordinates.

[B2-2. System settings]
The GA parameters used in the experiment were set as follows. The population size was 10, the crossover rate was 0.7 with uniform crossover, the mutation rate was 0.05, selection was by roulette selection, and the elite preservation strategy was used. The number of generation changes per frame image was 40. A computer with a 3.2 GHz CPU was used for the experiment.

[B2-3. Evaluation method]
The coordinates judged to be the center of the face, that is, the correct coordinates, are determined visually for each target image in advance. The center coordinates of the rectangular area AR′ detected in the experiment, that is, the result coordinates, are then compared with the correct coordinates to judge whether the detection is correct.

Because the distance between the correct coordinates and the result coordinates is affected by the size of the face, the distance was not used as is but was normalized by the size of the image of the face area AR from which the histogram template TP was created. The calculation uses equation (10).

Here, A denotes the correct coordinates, R the result coordinates, and width and height the width and height of the image of the face area AR from which the histogram template TP was created. In the experiment, the height of the face area in the target images (the distance from chin to eyebrows) was about 50 pixels. In addition, so that a detection extending down to the skin-colored neck counts as a failure, a detection was judged correct if the distance between the result coordinates and the correct coordinates was within 10 pixels.

[B2-4. Experimental results]
Comparing the histogram (FIG. 12(B)) of the rectangular area AR, which is the template image in FIG. 12(A), with the histogram (FIG. 12(D)) of the rectangular area AR′ specified by the face detection process in FIG. 12(C) shows that the histogram shapes of the three components, namely the Y, Cr, and Cb components, are similar. This also shows that the GA proposed in this embodiment can evaluate histograms and perform matching.

All 14 patterns were first tested using the same random seed. The top six patterns, which achieved an accuracy of 70% or more, were then tested with four additional random seeds. The results are shown in Table 2.

The top four patterns are Cr, YCr, CrCb, and YCrCb; these are the patterns among the 14 that include the Cr component. This shows that the Cr component is important. The highest accuracy was obtained using the Cr component alone.
Result images obtained with the Cr component are shown in FIGS. 13(a) to 13(i). They confirm that the method is robust to large changes in face orientation, such as when the face turns fully sideways.

[C. Second Embodiment]
FIG. 14 is a block diagram showing a face detection device 1A according to the second embodiment of the present invention. As shown in FIG. 14, the face detection device 1A includes a reference information update unit 7 in addition to the configuration of the face detection device 1 according to the first embodiment. Components identical to those of the first embodiment are given the same reference numerals, and their detailed description is omitted.

This embodiment is characterized in that the same histogram template TP is not necessarily used throughout tracking from the first frame image to the last; instead, the histogram template TP may be updated, that is, replaced with another histogram template TP, partway through.

For this purpose, this embodiment includes the reference information update unit 7.
When the reference information update unit 7 confirms that the subject being imaged has held the same posture over several frame images, it sets the histogram extracted in that posture as the histogram template TP for subsequent tracking.
The reference information update unit 7 replaces the histogram template TP when the histogram has remained the same over 10 frames. Histograms are regarded as the same not only when they match but also when their difference is small, as described below.

The reference information update unit 7 evaluates the similarity of the histograms of consecutive frame images using the distance d of equation (9). Specifically, it calculates the distance d between the histograms of two consecutive frame images and then judges whether the distance d between the next pair of consecutive frame images is the same as the distance d for the preceding pair. In this embodiment, distances that agree within an error of ±0.05, for example, are regarded as the same. The error range is not limited to this value; for example, a number of significant digits may be fixed and only exact agreement of the values regarded as the same. When the reference information update unit 7 judges that the histogram has remained the same over 10 frames, the histogram of the rectangular area AR′ specified by the individual selected in the final generation of the GA process for the frame image 10 frames earlier is thereafter used as the histogram template TP.
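The update rule above leaves some room for interpretation, so the sketch below shows only one possible reading: the distances between consecutive frames' histograms are compared, and when they all agree with the first of them within the tolerance over a 10-frame window, the histogram from 10 frames earlier is returned as the new template. Equation (9) is not reproduced here, so the distance is passed in as a callable; the class name `TemplateUpdater` and its interface are hypothetical.

```python
from collections import deque

class TemplateUpdater:
    """Propose a new histogram template when recent frames look stable."""

    def __init__(self, distance, window=10, tolerance=0.05):
        self.distance = distance      # callable(hist_a, hist_b) -> float
        self.window = window          # number of frames that must agree
        self.tolerance = tolerance    # allowed deviation between distances
        self.hists = deque(maxlen=window + 1)

    def observe(self, hist):
        """Record the best region's histogram for one frame.

        Returns the histogram from `window` frames earlier when the
        consecutive-frame distances are all the same within the tolerance,
        otherwise None.
        """
        self.hists.append(hist)
        if len(self.hists) <= self.window:
            return None
        hs = list(self.hists)
        dists = [self.distance(a, b) for a, b in zip(hs, hs[1:])]
        if all(abs(d - dists[0]) <= self.tolerance for d in dists):
            new_template = hs[0]
            self.hists.clear()  # start a fresh window after an update
            return new_template
        return None
```

A caller would feed `observe` the histogram of the winning rectangular area after each frame and swap the template whenever a non-`None` value comes back. A stricter variant could additionally require the distances themselves to be small, in line with the remark that histograms count as the same when their difference is small.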

After the reference information update unit 7 has replaced the histogram template TP, the detection processing unit 6 performs histogram matching against the newly set histogram template TP.

The face detection device 1A described above is implemented, for example, by a computer. By executing a face tracking program installed in advance as software, the computer carries out the method described above, that is, the face detection process. Specifically, when the computer executes the face detection processing program, it functions as the reference information creation unit 4, the detection processing unit 6, in particular the preprocessing unit 61, the genetic operation unit 62, the coordinate conversion unit 63, the fitness calculation unit 64, and the determination unit 65, and the reference information update unit 7.

Thus, in the face detection device 1A according to the second embodiment of the present invention, the reference information update unit 7 can replace the histogram template TP with a template suited to the operating environment. For example, when the surroundings of the subject being imaged become darker or brighter, the histogram of the subject's face area changes with the brightness, so the histogram template TP serving as reference information can be adjusted to suit the operating environment. This improves face tracking accuracy.

［Ｄ．その他の実施形態］
以上詳述したが、本発明は発明の趣旨を逸脱しない範囲において様々な形態で実施をすることができる。
上記実施形態では、参照情報更新部がHaar−like特徴を用いたAdaBoost法に基づくカスケード型識別器を利用して、自動で顔領域ARのヒストグラムを作成し、このヒストグラムをテンプレートＴＰとして利用しているが、テンプレートのヒストグラムは必ずしも、被験者の正面顔に限定されるものではない。例えば、初期設定時のテンプレートＴＰを横顔、上向きの顔、下向きの顔などから自動で作成してもよいことは勿論である。この場合、それらの顔の向き、つまり被験者の姿勢に応じて、ヒストグラムを作成するよう、参照情報更新部を構成するカスケード型識別器を構築する。
染色体を構成する上記パラメータの範囲（１）〜（３）を規定する数値は例示である。
上記説明では、検出処理装置が計算結果として顔領域を枠でディスプレイ上に表示するが、このようなディスプレイ上の表示を省略してもよい。
また、染色体を構成する遺伝情報としてのパラメータは上記に限定されるものではなく、それらの一部を省略し、さらに、三次元的な回転と言った情報を遺伝情報として活用してもよい。
また遺伝的操作において、選択はルーレット選択に限らず、ランキング選択、トーナメント選択を利用し、交叉方法は一様交叉に限らず、一点交叉、二点交叉、多点交叉を利用してもよい。
染色体を表すビット数は４０ビットに限定されるものではない。
上記実施形態では、ヒストグラムとして３要素、つまりＹ成分、Ｃｒ成分、Ｃｂ成分を活用したが、Ｃｒ成分だけのヒストグラムを利用して、顔追跡におけるヒストグラムのマッチングを行ってもよい。
第２実施形態の参照情報更新部でのヒスグラムの同一性の判断において、ヒストグラムテンプレートＴＰの交換条件として、同じヒストグラムが連続するフレーム数は１０フレームに限定されるものではない。また、同一性の判断は、時間で処理してもよい。例えば数秒間ヒストグラムが同じ場合に交換してもよい。
前述の第１実施形態及び第２実施形態では、追跡対象を撮像対象である被験者の顔と特定したが、撮像対象は人だけでなく、動物などの生物の他、土地に定着した看板や標識などの不動産、その物自体可搬自在な車やテレビなどの動産であってもよく、何れも物としての正面や側面が存在する物を対象とすることができる。このように観察対象が変われば、画像特徴量としての色成分が変わるため、物の正面の色に応じて表色系を変える。
前述の実施形態では、終了条件を満たすまで遺伝的操作を繰り返し行う構成を説明したが、終了条件として、ある個体の適応度が所定の値を超えた場合を条件としてもよい。例えば、最大世代交代数Ｇに至る前の第５世代で、個体群の中に適応度がある閾値を超える個体がある場合、世代交代、つまり以後の遺伝的操作を行わず、第５世代の個体の内、適応度が最も高い個体の遺伝情報を解として取り扱い、第５世代のＮ個の個体を次フレーム画像の第１世代の個体として取り扱う。
本発明は動画像から顔などの物体の正面領域を抽出する処理を行うが、動画像を構成するフレームを順次、処理する場合に限らず、例えば数フレーム置きに検出処理を行うように構成してもよい。 [D. Other Embodiments]
As described above in detail, the present invention can be implemented in various forms without departing from the spirit of the invention.
In the above embodiment, the reference information update unit automatically creates a histogram of the face area AR using a cascade classifier based on the AdaBoost method using Haar-like features, and uses this histogram as a template TP. However, the histogram of the template is not necessarily limited to the front face of the subject. For example, the template TP at the time of initial setting may be automatically created from a profile, an upward face, a downward face, and the like. In this case, a cascade classifier constituting the reference information update unit is constructed so as to create a histogram according to the orientation of those faces, that is, the posture of the subject.
The numerical values defining the ranges (1) to (3) of the parameters constituting the chromosome are examples.
In the above description, the detection processing device displays the face area on the display as a calculation result on the display, but such display on the display may be omitted.
Moreover, the parameters as genetic information constituting the chromosome are not limited to the above, and some of them may be omitted, and information such as three-dimensional rotation may be used as genetic information.
In genetic operation, selection is not limited to roulette selection, ranking selection or tournament selection is used, and the crossover method is not limited to uniform crossover, and one-point crossover, two-point crossover, or multipoint crossover may be used.
The number of bits representing a chromosome is not limited to 40 bits.
In the above-described embodiment, three elements, that is, the Y component, the Cr component, and the Cb component are used as the histogram. However, histogram matching in face tracking may be performed using a histogram of only the Cr component.
In the determination of the identity of the histogram in the reference information update unit of the second embodiment, the number of frames in which the same histogram continues as the replacement condition of the histogram template TP is not limited to ten frames. The determination of identity may be processed by time. For example, it may be exchanged when the histograms are the same for several seconds.
In the first embodiment and the second embodiment described above, the tracking target is the face of the person being imaged. However, the imaging target is not limited to a person; it may be another living thing such as an animal, or real property such as a signboard or sign fixed to the land. It may also be movable property such as a car or a portable television. Any object that has a front or a side may serve as the target. When the observed object changes in this way, the color component used as the image feature amount also changes, so the color system may be changed according to the color of the front of the object.
In the above-described embodiment, the genetic operation is repeated until the end condition is satisfied. The end condition may instead be that the fitness of some individual exceeds a predetermined value. For example, if at the fifth generation, before the maximum number of generation changes G is reached, the population contains an individual whose fitness exceeds a certain threshold, no further generation change (that is, no subsequent genetic operation) is performed. The genetic information of the individual having the highest fitness in that population is treated as the solution, and the N individuals of the fifth generation are used as the first-generation individuals for the next frame image.
The present invention extracts the front region of an object such as a face from a moving image, but it is not limited to processing every frame of the moving image in sequence; for example, the detection process may be performed once every few frames.
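The early end condition can be sketched as follows. The function name `evolve_frame` and the callables `fitness_fn` and `next_generation` are hypothetical placeholders for the fitness calculation and the genetic operation; the sketch assumes the maximum number of generations is at least one.

```python
def evolve_frame(population, fitness_fn, next_generation, max_generations, threshold):
    """Run the GA on one frame, stopping early once fitness exceeds `threshold`.

    Returns (best_individual, final_population, generations_used).
    """
    if max_generations < 1:
        raise ValueError("max_generations must be at least 1")
    for generations_used in range(1, max_generations + 1):
        scores = [fitness_fn(ind) for ind in population]
        # Stop at the generation limit, or as soon as one individual is good enough.
        if max(scores) > threshold or generations_used == max_generations:
            break
        population = next_generation(population, scores)
    best = population[scores.index(max(scores))]
    return best, population, generations_used
```

The returned `final_population` can be passed as the starting population for the next frame image, which corresponds to reusing the N individuals of the generation at which the search stopped.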
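Detection at intervals can be sketched as a generator. Here `detect` is a hypothetical callable standing in for the detection process, and reusing the most recent result for the skipped frames is an assumption of this example.

```python
def track(frames, detect, step=3):
    """Yield (frame_index, region), running `detect` on every `step`-th frame.

    Frames in between reuse the most recent detection result.
    """
    region = None
    for i, frame in enumerate(frames):
        if i % step == 0:
            region = detect(frame)
        yield i, region
```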

1, 1A Face detection device
2 First image acquisition unit
3 Storage unit
4 Reference information creation unit
5 Second image acquisition unit
6 Detection processing unit
7 Reference information update unit
61 Preprocessing unit
62 Genetic operation unit
63 Coordinate conversion unit
64 Fitness calculation unit
65 Judgment unit
AR, AR′ Face region
CH Chromosome
c_x X component of the center coordinate
c_y Y component of the center coordinate
m_x Scaling factor of the frame in the x-axis direction
m_y Scaling factor of the frame in the y-axis direction
Angle Rotation angle of the frame that defines the rectangle
F Frame
P Coordinates before conversion
P^* Coordinates after conversion
TP Histogram template
VS Video sequence

Claims

An object detection device that has an image acquisition unit capable of acquiring a still image and a moving image composed of a plurality of frame images, and that detects an object whose posture has changed, the device comprising:
a reference information creation unit that creates a histogram as a template from a still image of the object captured by the image acquisition unit; and
a detection processing unit that detects an object region from a frame image to be searched, based on that frame image, among the plurality of frame images constituting the moving image captured by the image acquisition unit, and on the template created by the reference information creation unit,
wherein the detection processing unit performs the following processes (α1) to (α4) based on a genetic algorithm:
(α1) N individuals, each including parameters for specifying the object region, are generated in the search target image.
(α2) A histogram is created for the object region specified by the chromosome parameters of each individual, and the degree of coincidence between each histogram and the template created by the reference information creation unit is evaluated by a fitness function.
(α3) N new individuals are generated from the N individuals by genetic operations based on selection, crossover, and mutation.
(α4) The above (α2) and (α3) are repeated until the generation change limit is reached, the parameters of the individual having the highest fitness among the individuals of the last generation are taken as the solution, and the region specified by the solution is determined to be the object region.

The object detection device according to claim 1, wherein the detection processing unit, when starting the genetic algorithm processing for the next frame image, uses the chromosomes of the N individuals obtained by the genetic algorithm processing for the previous frame image.

The object detection device according to claim 1, further comprising a reference information update unit that updates the template to another template.

The object detection device according to claim 3, wherein the reference information update unit sets the histogram of the object region as the template when that histogram is the same, or its difference is the same, across a plurality of frame images.

The object detection device according to claim 1, wherein the histogram is created based on a Cr color system representing a red color difference.

An object detection program for detecting an object whose posture has changed, the program causing a computer to function as:
a reference information creation unit that creates a histogram as a template from a still image of the object captured by an image acquisition unit; and
a detection processing unit that detects an object region from a frame image to be searched, based on that frame image, among a plurality of frame images constituting a moving image captured by the image acquisition unit, and on the template created by the reference information creation unit,
wherein the detection processing unit performs the following processes (α1) to (α4) based on a genetic algorithm:
(α1) N individuals, each including parameters for specifying the object region, are generated in the search target image.
(α2) A histogram is created for the object region specified by the chromosome parameters of each individual, and the degree of coincidence between each histogram and the template created by the reference information creation unit is evaluated by a fitness function.
(α3) N new individuals are generated from the N individuals by genetic operations based on selection, crossover, and mutation.
(α4) The above (α2) and (α3) are repeated until the generation change limit is reached, the parameters of the individual having the highest fitness among the individuals of the last generation are taken as the solution, and the region specified by the solution is determined to be the object region.

The object detection program according to claim 6, wherein the detection processing unit, when starting the genetic algorithm processing for the next frame image, uses the chromosomes of the N individuals obtained by the genetic algorithm processing for the previous frame image.

The object detection program according to claim 6, wherein the program further causes the computer to function as a reference information update unit that updates the template to another template.

The object detection program according to claim 8, wherein the reference information update unit sets the histogram of the object region as the template when that histogram is the same, or its difference is the same, across a plurality of frame images.

The object detection program according to claim 6, wherein the histogram is created based on a Cr color system representing a red color difference.