JPH1153525A

JPH1153525A - Facial organ detector and medium

Info

Publication number: JPH1153525A
Application number: JP9225699A
Authority: JP
Inventors: Tatsumi Watanabe; 辰巳渡辺
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1997-08-06
Filing date: 1997-08-06
Publication date: 1999-02-26

Abstract

PROBLEM TO BE SOLVED: To provide a facial organ detector, and a medium therefor, that can robustly segment facial organs from an input image by using color information such as hue and saturation, without being affected by the image input environment, such as the background, illumination conditions, or object size. SOLUTION: A person area limiting means 104 extracts a person area based on the color information calculated by a color information calculating means 102 and on the smoothing filter processing performed by a smoothing processing means 103. An initial extraction vector setting means 107 selects a plurality of rectangular areas from the person area image, and an evaluation vector generating means 108 generates an evaluation vector for each extraction vector based on the distribution of color information in each rectangular area. An extracted area evaluating means 110 evaluates the suitability of each extraction vector by comparing its evaluation vector with the evaluation template vectors in an evaluation template dictionary 109, and a recombining operating means 111 recombines the rectangular areas by a genetic recombination operation.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001] [Technical field of the invention] The present invention relates to a facial organ detection device, and a medium, usable as a device for detecting facial organ regions such as the eyes, nose, mouth, and eyebrows in a natural image.

[0002] [Description of the related art] As opportunities to handle multimedia information increase, there is growing demand for extracting arbitrary objects, particularly human faces and their organs, from natural images. Applications of such technology include personal identification using face images and improving speech recognition accuracy by using the lips.

[0003] Conventional techniques for detecting facial organs in an image include the following:
[1] A method that binarizes the input image by gray level and then detects facial organs by detecting connected regions.
[2] A method that binarizes the input image by gray level, computes projections in the vertical and horizontal directions, and detects facial organs by segmenting the projection patterns through thresholding.
[3] A method that extracts an edge image from the input image and cuts out facial organs based on template matching of the target organ shape and on the mutual positional relationships of the organs.
[4] A method that represents the gray-level information of the input image in a compressed form, such as a mosaic representation, and detects facial organs by matching against templates.

[0004] [Problems to be solved by the invention] However, methods [1] and [2] require the optimal binarization threshold to be changed dynamically according to the lighting conditions and the like; otherwise they either pick up a large amount of image noise or, conversely, lose parts of the facial organs. In method [3], edge enhancement is applied to an image containing facial organs in order to extract the face contour, the contour shapes of the facial organs (eyes, nose, mouth, etc.), and their mutual positional relationships, but it is often difficult to extract the edges themselves stably because of the lighting conditions at the time of image capture. Moreover, even when the edges are extracted well and linear shape features of the face are obtained, it is not yet clear how far such features can express the subtle differences between faces or facial organs and other objects. Method [4], which compresses the gray-level pattern into a mosaic pattern, has the merit of requiring only simple processing, but it has been reported that stable extraction cannot be achieved unless certain constraints are satisfied.

[0005] In view of these conventional problems, an object of the present invention is to provide a facial organ detection device and a medium that can robustly cut out facial organs from an input image by using color information such as hue and saturation, without being affected by the image input environment, such as the background, the lighting conditions, and the size of the object.

[0006] [Means for solving the problems] To solve the above problems, the first facial organ detection device of the present invention, for example, first extracts a person candidate region by using color information to find unsuitable pixel groups and by smoothing the luminance values. It then performs pattern matching between the evaluation vectors generated from the color information distribution in each extraction candidate region selected from within the person region and a set of evaluation template vectors generated from a set of sample images prepared in advance for extraction, and, based on the result, adjusts the extraction candidate regions with a genetic algorithm.

[0007] The second facial organ detection device of the present invention, for example, first extracts a person candidate region by using color information to find unsuitable pixel groups and by smoothing the luminance values. It then performs pattern matching between the evaluation vectors generated from the color information distribution in each extraction candidate region selected from within the person region and a set of evaluation template vectors generated from a set of sample images prepared in advance for extraction, and, based on the result, adjusts the extraction candidate regions with a genetic algorithm. In doing so, each extraction vector corresponding to an extraction candidate region is first adjusted within its neighborhood and replaced with the best-fitting vector found there, after which the extraction vectors are recombined using the genetic algorithm.

[0008] The third facial organ detection device of the present invention, for example, first extracts a person candidate region by using color information to find unsuitable pixel groups and by smoothing the luminance values. It performs pattern matching between the evaluation vectors generated from the color information distribution in each extraction candidate region and a set of evaluation template vectors generated from a set of sample images prepared in advance for extraction, and, based on the result, adjusts the extraction candidate regions with a genetic algorithm. In doing so, the extraction candidate regions are divided into a plurality of groups based on the similarity between the evaluation vectors representing them, and crossover in the genetic algorithm is prohibited within each group, so that the target facial organ is detected simultaneously for a plurality of people.

[0009] The fourth facial organ detection device of the present invention, for example, first extracts a person candidate region by using color information to find unsuitable pixel groups and by smoothing the luminance values. Based on pattern matching between the evaluation vectors generated from the color information distribution in each extraction candidate region and a set of multi-organ evaluation template vectors generated from sample image sets prepared in advance for detecting a plurality of organs, it adjusts the extraction candidate regions with a genetic algorithm and extracts a plurality of organs simultaneously. In doing so, each extraction vector corresponding to an extraction candidate region is first adjusted within its neighborhood and replaced with the best-fitting vector found there. The extraction candidate regions are then divided into a plurality of groups based on the similarity between the evaluation vectors representing them, and crossover in the genetic algorithm is prohibited within each group, so that the plurality of facial organs present in the input image (eyes, nose, mouth, ears, etc.) are detected simultaneously.

[0010] [Embodiments of the invention] Embodiments of the present invention are described below with reference to the drawings.

[0011] FIG. 1 is a block diagram of a facial organ detection device according to a first embodiment of the present invention. FIG. 2 is a block diagram of the recombination operation means of the facial organ detection device according to the first embodiment. FIG. 9 is a block diagram of a facial organ detection device according to a second embodiment, FIG. 11 is a block diagram of a facial organ detection device according to a third embodiment, and FIG. 12 is a block diagram of a facial organ detection device according to a fourth embodiment. In the block diagrams, identical parts are denoted by the same reference numerals.

[0012] (First embodiment) First, the facial organ detection device according to the first embodiment of the present invention is described.

[0013] In FIG. 1, 101 denotes image input means that captures the image data to be identified by using a CCD camera or the like; 102 denotes color information calculation means that calculates the color information (saturation S, lightness V, hue H) at each pixel of the image input by the image input means 101; 103 denotes smoothing processing means that uses the color information obtained by 102 to find pixel groups unsuitable for a person region, sets the luminance value at those pixels to a value indicating that they are excluded, and then applies a smoothing filter to the luminance; 104 denotes person region extraction means that extracts the region corresponding to a person from the input image based on the output of 103; 105 denotes genetic algorithm processing means that uses a genetic algorithm to extract the target facial organ region from the person region image of 104; and 106 denotes target region recording means that records the region judged by the genetic algorithm processing means 105 to be the target facial organ region.

[0014] The genetic algorithm processing means 105 comprises the following: initial extraction vector setting means 107, which randomly sets the parameter sequences (extraction vectors) v_c[i] = (x0(i), y0(i), width(i), height(i)) (i = 1, ..., K), each consisting of the center coordinates (x0(i), y0(i)), the horizontal pixel count width(i), and the vertical pixel count height(i) of a rectangular region needed to extract the target facial organ from the person region image of 104; evaluation vector generation means 108, which uses the color information distribution values within the K rectangular extraction regions set by 107 to create K M-dimensional evaluation vectors v_e[i] = (e(1,i), e(2,i), ..., e(M,i)) (i = 1, 2, ..., K); an evaluation template dictionary 109, which stores N evaluation template vectors T_e[j] = (te(1,j), te(2,j), ..., te(M,j)) (j = 1, 2, ..., N) generated from a plurality of sample images prepared in advance for facial organ extraction, for comparison with the K evaluation vectors generated by 108; extraction region evaluation means 110, which compares the evaluation vector v_e[i] of each extraction region generated by 108 with the template vectors T_e[j] in the evaluation template dictionary 109 to evaluate the fitness fitness(i) of the extraction vector v_c[i]; and recombination operation means 111, which recombines the extraction vectors by a genetic algorithm based on the fitness of each extraction vector obtained by 110 and searches for rectangular extraction regions with higher fitness.

As shown in FIG. 2, the recombination operation means 111 comprises candidate selection means 201, which performs selection on the current set C of extraction vectors v_c[i] based on the fitness fitness(i) of each extraction vector; crossover processing means 202, which performs crossover on the extraction vector set C obtained by the candidate selection means 201; and mutation processing means 203, which performs mutation on the extraction vector set C obtained by the crossover processing means 202. The candidate selection means 201 in turn comprises selection range derivation means 204, which derives the selection probability h(i) with which an extraction vector v_c[i] is selected from the set C, together with its selection range I(i); random number generation means 205, which generates a set G = (g(1), g(2), ..., g(K)) of uniform random numbers g(i) in [0, 1); and extraction vector selection means 206, which picks the extraction vectors to be selected from the set C based on the result of the random number generation means 205.
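As a rough illustration of how the selection range derivation means 204, the random number generation means 205, and the extraction vector selection means 206 could work together, the following Python sketch implements roulette-wheel selection. It rests on assumptions not stated in the passage above: h(i) is taken to be fitness(i) divided by the total fitness, and I(i) is taken to be the half-open interval of cumulative probability belonging to vector i. The function names are hypothetical.

```python
import random
from itertools import accumulate


def selection_ranges(fitness):
    """Return (h, upper): selection probabilities h(i) and the upper bound of each range I(i)."""
    total = sum(fitness)
    if total <= 0:
        raise ValueError("fitness values must sum to a positive number")
    h = [f / total for f in fitness]   # assumption: h(i) proportional to fitness(i)
    upper = list(accumulate(h))        # I(i) = [upper[i-1], upper[i]), with upper[-1] = 0
    upper[-1] = 1.0                    # guard against floating-point drift
    return h, upper


def select(population, fitness, rng=random):
    """Draw len(population) vectors, one per uniform random number g(i) in [0, 1)."""
    _, upper = selection_ranges(fitness)
    chosen = []
    for _ in population:
        g = rng.random()
        # pick the vector whose selection range I(i) contains g
        idx = next(i for i, u in enumerate(upper) if g < u)
        chosen.append(population[idx])
    return chosen
```

Under this reading, a vector with zero fitness has an empty selection range and is never drawn, while high-fitness vectors may be drawn several times.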

[0015] The operation of the facial organ detection device of the first embodiment configured as described above is now explained. In the image input means 101, an input image of 256 horizontal pixels by 256 vertical pixels is captured with a CCD camera or the like as a color signal with 256 gradations for each of R, G, and B. The color information calculation means 102 converts the color signal at each pixel of the input image captured by 101 into the color information called saturation, lightness, and hue, based on the HSV color system. In general, the hue component H is a value representing the kind of color; it is relatively insensitive to reflections and shading caused by lighting, and is said to be effective for extracting regions whose hue is nearly constant, such as a face. The lightness V represents the brightness of a color and the saturation S represents its vividness. The human face region is an object of relatively high saturation and lightness, whereas relatively low-saturation colors are used in the interior environments of buildings such as offices, where the present invention is expected to be used mainly. Focusing on these two points, the smoothing processing means 103 judges that pixel groups that do not meet them are not part of the person region and resets the luminance value at those pixels to an exceptional value (for example, a large negative value). Although the HSV color system shown in (Equation 1) is used here, any color system with a hue component and a saturation component can be applied in the same way. In (Equation 1), r_xy, g_xy, and b_xy denote the color signals at pixel coordinates (x, y), H_xy the hue, V_xy the lightness, S_xy the saturation, and Y_xy the luminance value.
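The per-pixel conversion performed by the color information calculation means 102 can be sketched as follows. (Equation 1) is not reproduced in this text, so the sketch assumes the common hexcone HSV model as implemented by Python's standard `colorsys` module, with hue scaled to degrees. The function name `image_to_hsv` and the list-of-rows image layout are illustrative assumptions.

```python
import colorsys


def image_to_hsv(rgb_rows):
    """Convert rows of 8-bit (r, g, b) tuples to rows of (H in degrees, S, V) tuples.

    Assumes the hexcone HSV model of colorsys; S and V lie in [0, 1].
    """
    hsv_rows = []
    for row in rgb_rows:
        hsv_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            hsv_row.append((h * 360.0, s, v))  # colorsys returns hue in [0, 1)
        hsv_rows.append(hsv_row)
    return hsv_rows
```

If (Equation 1) scales saturation or lightness differently (for example to 0 to 255), the thresholds used in later steps would need to be scaled to match.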

[0016] (Equation 1) Specifically, the above processing resets the luminance value Y_xy to -10000.0 at every pixel coordinate (x, y) that does not satisfy all of the following conditions:
(i) saturation S_xy ≥ saturation threshold th_s
(ii) lightness V_xy ≥ lightness threshold th_v
(iii) 0.0 ≤ hue H_xy ≤ 60.0, or 300.0 ≤ H_xy ≤ 360.0
At coordinates (x, y) that do satisfy them, the luminance Y_xy calculated from (Equation 2) is used as is. I_xy and Q_xy are explained later.
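A minimal sketch of this exclusion step for a single pixel is given below. Several details are assumptions rather than values from the text: the threshold values `th_s` and `th_v` are placeholders, HSV is computed with `colorsys` because (Equation 1) is not reproduced here, and the luminance uses the standard NTSC weighting Y = 0.299R + 0.587G + 0.114B because (Equation 2) is not reproduced either.

```python
import colorsys

EXCLUDED = -10000.0  # exceptional luminance value used for unsuitable pixels


def masked_luminance(r, g, b, th_s=0.2, th_v=0.3):
    """Return the luminance of an 8-bit RGB pixel, or EXCLUDED if it fails the conditions.

    th_s and th_v are hypothetical thresholds on S and V in [0, 1].
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    hue = h * 360.0
    in_skin_hue = hue <= 60.0 or hue >= 300.0       # condition (iii)
    if s >= th_s and v >= th_v and in_skin_hue:      # conditions (i) and (ii)
        return 0.299 * r + 0.587 * g + 0.114 * b     # assumed NTSC luminance
    return EXCLUDED
```

For example, a saturated blue pixel fails the hue condition and a mid-gray pixel fails the saturation condition, so both are marked as excluded, while a reddish skin-like tone keeps its computed luminance.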

[0017] (Equation 2) Based on this result, the luminance values are then smoothed using a smoothing filter such as the one shown in FIG. 3. Because pixels that fail the lightness, saturation, and hue conditions may also exist inside the facial organs, this smoothing is performed to reduce their influence.

[0018] Next, the person region extraction means 104 performs the following processing. First, as shown in FIG. 4, with the vertical pixel position y held fixed, a histogram Count(x) of the pixels whose luminance value at pixel coordinates (x, y) satisfies Y_xy ≥ 0.0 is obtained along the horizontal pixel direction x (0 ≤ x ≤ 255), and its average value C_ave_x over the x direction is computed. Then x = x_s, the first position at which Count(x) exceeds C_ave_x when scanning from x = 0, and x = x_e, the first position at which it exceeds C_ave_x when scanning from x = 255, are found. For a given vertical pixel position y, the range from x_s to x_e is regarded as the person region in the horizontal pixel direction x. This operation is performed for all vertical pixel positions y (0 ≤ y ≤ 255) to determine the person region in the horizontal direction. Similarly, for the vertical pixel direction y (0 ≤ y ≤ 255), a histogram Count(y) of the pixels satisfying Y_xy ≥ 0.0 is obtained, and its average value C_ave_y over the y direction is computed. Then y = y_s, the first position at which Count(y) exceeds C_ave_y when scanning from y = 0, and y = y_e, the first position at which it exceeds C_ave_y when scanning from y = 255, are found. For a given horizontal pixel position x, the range from y_s to y_e is regarded as the person region in the vertical pixel direction y, and this operation is performed for all horizontal pixel positions x (0 ≤ x ≤ 255) to determine the person region in the vertical direction. The rectangular region spanned by x_s to x_e in the horizontal direction and by y_s to y_e in the vertical direction is then extracted as the person region.
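One possible reading of this step is a projection-histogram bounding box: Count(x) is the number of valid pixels (Y_xy ≥ 0.0) in column x, Count(y) is the number in row y, and the box edges are the first positions, scanning inward from each side, where the count exceeds its mean. The sketch below implements that reading; the function names and the list-of-rows layout are assumptions, and the per-row variant suggested by parts of the description would differ.

```python
def first_above(counts, mean, indices):
    """Return the first index in `indices` whose count exceeds `mean`, else None."""
    for i in indices:
        if counts[i] > mean:
            return i
    return None


def person_bounding_box(lum):
    """lum: list of rows of luminance values. Returns (x_s, x_e, y_s, y_e) or None."""
    height, width = len(lum), len(lum[0])
    # Count(x): valid pixels per column; Count(y): valid pixels per row
    count_x = [sum(1 for y in range(height) if lum[y][x] >= 0.0) for x in range(width)]
    count_y = [sum(1 for x in range(width) if lum[y][x] >= 0.0) for y in range(height)]
    ave_x = sum(count_x) / width
    ave_y = sum(count_y) / height
    x_s = first_above(count_x, ave_x, range(width))
    x_e = first_above(count_x, ave_x, range(width - 1, -1, -1))
    y_s = first_above(count_y, ave_y, range(height))
    y_e = first_above(count_y, ave_y, range(height - 1, -1, -1))
    if x_s is None or y_s is None:
        return None  # no column or row stands out above the mean
    return x_s, x_e, y_s, y_e
```

Because the comparison is strict, an image in which every column has the same count (for example, all pixels valid) yields no box under this reading, which is a limitation of the sketch rather than something stated in the text.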

[0019] Further, to detect only the target facial organ within the person region, the genetic algorithm processing means 105 performs region extraction using a genetic algorithm as described below. Pattern matching is performed between the vectors T_e[j] (j = 1, 2, ..., N) in the evaluation template vector set G_T, generated from a sample image set prepared in advance for extracting the target facial organ, and the evaluation vectors v_e[i] (i = 1, 2, ..., K) obtained from K rectangular regions selected from the person region extracted by the person region extraction means 104, and the optimal rectangular region is regarded as the target facial organ and cut out. The optimal values of the center coordinates (x0(i), y0(i)), the horizontal pixel count width(i), and the vertical pixel count height(i) of the rectangular region are estimated with a genetic algorithm (GA), which is highly effective for problems involving combinations of many parameters.

[0020] A genetic algorithm is an optimization technique that imitates the way the chromosomes of a biological species evolve by adapting to their environment. Details are given, for example, in "Genetic Algorithms in Search, Optimization and Machine Learning" (David E. Goldberg, Addison-Wesley). First, the parameter vector consisting of the parameters to be estimated is regarded as a chromosome q_k (k = 1, 2, ..., K), and the problem to be optimized is regarded as the environment. In nature, individuals whose chromosomes are not suited to the environment die out, while suited individuals mate with other chromosomes and produce offspring. An offspring combines the genes of its parents and has a chromosome that did not previously exist in the population. Even offspring of the same parents may turn out better or worse depending on how the genes combine. Offspring with high fitness multiply, and offspring with low fitness die out. Occasionally a chromosome mutates. As this process repeats, the population gradually becomes homogeneous, consisting of chromosomes with high fitness.

In GA, the population of parameter vectors being sought is regarded as a population of chromosomes. First, a population of parameter vectors is generated at random. Each solution is evaluated by substituting it into the function to be optimized (called the evaluation function); solutions close to the optimum are multiplied, and unsuitable ones are weeded out. Parts of the solutions are then exchanged with other chromosomes; this is called crossover. In addition, mutation occurs with a certain probability and changes part of a parameter vector (see FIG. 5). This generational process of reproduction and selection, crossover, and mutation is applied repeatedly, and after a fixed number of generations the best parameter vector in the population is taken as the optimal vector. One feature of GA is that it uses a population of parameter vectors to search multiple points of the search space simultaneously, which reduces the likelihood of being trapped in a local minimum. Another is that it can search efficiently by combining the good points of multiple parameter vectors to create new ones. GA optimizes the parameter vectors by repeating the above processing.
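The generational cycle described above (selection, crossover, mutation, repeated for a fixed number of generations) can be sketched in a few lines of Python. This is a generic illustration, not the procedure of the device: the one-point crossover, the per-gene mutation rate, the population size, and the use of `random.choices` for fitness-proportional selection are all assumptions, and the fitness function is supplied by the caller and must return positive values.

```python
import random


def crossover(a, b, rng):
    """One-point crossover: swap the tails of two parameter vectors."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(vec, rate, low, high, rng):
    """Replace each gene with a random value in [low, high] with probability `rate`."""
    return [rng.randint(low, high) if rng.random() < rate else v for v in vec]


def evolve(fitness, dim=4, pop_size=20, generations=50,
           rate=0.05, low=0, high=255, seed=0):
    """Run a simple GA and return the best vector of the final population."""
    rng = random.Random(seed)
    pop = [[rng.randint(low, high) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        weights = [fitness(v) for v in pop]
        # reproduction and selection: fitness-proportional sampling with replacement
        parents = rng.choices(pop, weights=weights, k=pop_size)
        nxt = []
        for i in range(0, pop_size - 1, 2):
            c1, c2 = crossover(parents[i], parents[i + 1], rng)
            nxt.append(mutate(c1, rate, low, high, rng))
            nxt.append(mutate(c2, rate, low, high, rng))
        if len(nxt) < pop_size:          # odd population size: carry one parent over
            nxt.append(parents[-1])
        pop = nxt
    return max(pop, key=fitness)
```

A toy call such as `evolve(lambda v: 1.0 / (1.0 + sum(abs(x - t) for x, t in zip(v, [128, 128, 40, 20]))))` searches for a four-element vector close to a fixed target, standing in for the rectangle parameters (x0, y0, width, height).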

[0021] In the present invention, facial organs are cut out by applying GA to the person region cut out by 104 using color information. In the following description, the lips are taken as the target facial organ. As shown in FIG. 6, the chromosome is represented by an extraction vector consisting of the center coordinates (x0(i), y0(i)), the horizontal pixel count width(i), and the vertical pixel count height(i) (i = 1, 2, ..., K) of the rectangular region needed to extract the facial organ. First, the initial extraction vector setting means 107 prepares a chromosome set D with K randomly set elements. Here the chromosome is generated by using the coordinates, horizontal pixel count, and vertical pixel count of the rectangular region directly as its elements, but it is also conceivable to convert them into binary strings of 0s and 1s and concatenate these into a single bit string.

To evaluate the fitness of each chromosome q_i (i = 1, 2, ..., K), the evaluation vector generation means 108 generates an evaluation vector representing the features of the image within the rectangular region that the chromosome represents. Various methods of evaluating the chromosome q_i are conceivable; the present invention uses the distributions, within the rectangular region, of the hue H in the HSV color system and of the I signal in the YIQ color system expressed by (Equation 2). The reasons are that (1) the hue H is little affected by variations in the lighting conditions, and (2) the I and Q signals, which correspond to the color-difference signals in the YIQ color system used in ordinary NTSC color television broadcasting, take larger values at the lips than in the surrounding areas. It is of course also possible to use the distributions of the hue H of the HSV color system and the Q signal of the YIQ color system. Many other combinations are conceivable as features for evaluating the candidate rectangular regions; one example is to use the distributions of the hue H and the lightness V in the HSV color system. This is because, in the HSV color system, whitish and blackish colors cannot be judged from the hue H alone and the lightness V must be used; it is a device for finding skin color reliably.
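The alternative bit-string chromosome mentioned above can be illustrated as follows. The text does not specify a bit width; the sketch assumes 8 bits per parameter, which covers the 0 to 255 range of a 256 x 256 image, and the function names are hypothetical.

```python
BITS = 8  # assumed bits per parameter (covers 0..255)


def encode(vec):
    """Concatenate each parameter of (x0, y0, width, height) as a fixed-width binary string."""
    for v in vec:
        if not 0 <= v < 2 ** BITS:
            raise ValueError(f"parameter {v} does not fit in {BITS} bits")
    return "".join(f"{v:0{BITS}b}" for v in vec)


def decode(bits):
    """Split a bit string into BITS-wide fields and convert each back to an integer."""
    return [int(bits[i:i + BITS], 2) for i in range(0, len(bits), BITS)]
```

With this encoding, `encode([128, 64, 40, 20])` gives a 32-character string, and crossover or mutation can operate on individual bits rather than on whole parameters.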

[0022] First, for the hue H, the ranges in which skin color lies, namely 0.0 to 60.0 degrees and 300.0 to 360.0 degrees, are divided into Hnum equal intervals. Similarly, for the I signal, the range from the minimum value -160.0 to the maximum value 160.0 is divided into Inum equal intervals. With the interval width of H given by Hstep = 120/Hnum and that of I by Istep = 320/Inum, the jj-th range of H, I_h(jj), is given by (Equation 3), and the kk-th range of I, I_i(kk), is given by (Equation 4). (Equation 4) also gives the kk-th range of Q, I_q(kk); this is the interval setting used to build the histogram when the distribution ratio of the Q signal is used as the evaluation feature instead of that of the I signal, in which case the same processing as described below for the I signal is performed.

[0023]

(Equation 3)

[0024]

(Equation 4)

The number of pixels C_h(jj) whose hue falls in the range I_h(jj) is counted, and its ratio ratio_h(jj) (jj = 1, 2, ..., Hnum) to Total, the number of pixels in the entire region obtained by the person region limiting means 104, is computed. Similarly, the number of pixels C_i(kk) falling in the range I_i(kk) of I is counted, and its ratio to Total, ratio_i(kk) (kk = 1, 2, ..., Inum), is computed. Arranging these in order yields the evaluation vector corresponding to the extraction vector v_c[i]: v_e[i] = (e(1,i), e(2,i), ..., e(M,i)) = (ratio_h(1), ..., ratio_h(Hnum), ratio_i(1), ..., ratio_i(Inum)), where M = Hnum + Inum. For the evaluation template dictionary 109, an M-dimensional vector T_ed[nn] = (ted(1,nn), ted(2,nn), ..., ted(M,nn)) (nn = 1, 2, ..., N_sample), whose elements are the distribution ratios of the hue H and the Q signal in image nn, is first created for each image in a set of N_sample sample images prepared in advance for extracting the target facial organ region. Vector quantization (VQ) is then applied to the N_sample vectors T_ed[nn] to create an evaluation template vector set with N elements. VQ is a clustering method that partitions the original vector space into a number of subspaces (clusters) by placing vectors called reference vectors, each representing one subspace, according to the data distribution of the many prepared vectors; details are given in Linde, Y., Buzo, A. and Gray, R. M., "An algorithm for vector quantizer design" (IEEE Transactions on Communications, COM-28, No. 1, pp. 84-95, 1980). In the present invention, this VQ is used to cluster the evaluation vectors generated from the N_sample sample images into N subsets, and the resulting N reference vectors T_e[j] (j = 1, ..., N) form the evaluation template vector set G_te. This G_te corresponds to the evaluation template dictionary 109.
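The construction of the evaluation vector described above can be sketched as follows. This is only an illustration under stated assumptions: (Equation 3) and (Equation 4) are not reproduced in this text, so the way the two hue ranges are laid out across the Hnum bins (the 300 to 360 degree range is placed before the 0 to 60 degree range) is an assumption, and the function names hue_bin, i_bin and evaluation_vector are hypothetical.

```python
def hue_bin(h, hnum):
    """Return the 0-based hue bin for h (degrees), or None outside the skin ranges.

    ASSUMPTION: 300..360 degrees is mapped to 0..60 and 0..60 degrees to 60..120,
    so that both skin-color ranges form one contiguous 120-degree span.
    """
    hstep = 120.0 / hnum                      # Hstep = 120 / Hnum
    if 300.0 <= h <= 360.0:
        shifted = h - 300.0
    elif 0.0 <= h <= 60.0:
        shifted = h + 60.0
    else:
        return None
    return min(int(shifted // hstep), hnum - 1)


def i_bin(i_val, inum):
    """Return the 0-based bin of an I-signal value in [-160, 160], else None."""
    istep = 320.0 / inum                      # Istep = 320 / Inum
    if not -160.0 <= i_val <= 160.0:
        return None
    return min(int((i_val + 160.0) // istep), inum - 1)


def evaluation_vector(hues, i_values, total, hnum, inum):
    """Build v_e = (ratio_h(1..Hnum), ratio_i(1..Inum)) for one rectangle.

    hues, i_values: per-pixel H and I values inside the candidate rectangle.
    total: pixel count of the whole person region (Total in the text).
    """
    counts_h = [0] * hnum
    counts_i = [0] * inum
    for h in hues:
        b = hue_bin(h, hnum)
        if b is not None:
            counts_h[b] += 1
    for v in i_values:
        b = i_bin(v, inum)
        if b is not None:
            counts_i[b] += 1
    return [c / total for c in counts_h + counts_i]
```

The resulting list has M = Hnum + Inum elements, matching the dimension of the template vectors it is later compared against.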

[0025] Next, the extraction area evaluation means 110 computes the fitness of each chromosome using the evaluation vector that the means 108 generated from the rectangular area corresponding to each chromosome q_i (i = 1, 2, ..., K). Various evaluation functions fitness(i) can be considered for expressing the fitness of a chromosome. In the present invention, the fitness fitness(i) of q_i is the maximum of the correlation coefficients Sim(i,j) between the evaluation vector v_e[i] (i = 1, ..., K) and the N vectors T_e[j] (j = 1, ..., N) in the evaluation template dictionary. Alternatively, the maximum of the reciprocals of the squared distances D(i,j) between the evaluation vector v_e[i] and the N template vectors T_e[j] may be taken as fitness(i). The GA estimates the optimal chromosome q_best that makes this value large.
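As a rough illustration of this fitness computation, the sketch below takes the maximum correlation coefficient between one evaluation vector and the template vectors. It assumes that the correlation coefficient Sim(i,j) is the ordinary Pearson coefficient, which the text does not state explicitly; the function names are hypothetical.

```python
import math


def correlation(a, b):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def fitness(v_e, templates):
    """fitness(i): best correlation between v_e[i] and any template T_e[j]."""
    return max(correlation(v_e, t) for t in templates)
```

With this choice the fitness lies in [-1, 1]; a selection scheme that needs non-negative weights would have to shift or clip it first.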

[0026] The means 110 further determines whether the extraction criterion is satisfied and whether the number of iterations of the genetic algorithm processing has exceeded a preset upper limit. Various extraction criteria can be considered; here a minimum fitness f_th is used as the extraction criterion. The determination is as follows. [i] If the maximum fitness f_min is greater than the extraction criterion f_th, region extraction ends and the optimal rectangular area is recorded by the extraction area recording means 106.

[0027] [ii] If the maximum fitness f_min is smaller than the extraction criterion f_th and the chromosome set update count g_num is no greater than the preset upper limit g_num_max, processing moves to the recombination operation means 111.

[0028] [iii] If the maximum fitness f_min is smaller than the extraction criterion f_th and the chromosome set update count g_num has exceeded the preset upper limit g_num_max, it is judged that the input image does not contain the target facial organ, and the facial organ detection processing ends. The determination is carried out according to these three cases.
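The three cases [i] to [iii] amount to a small decision function, sketched below. The return labels are hypothetical, and the case in which the maximum fitness exactly equals f_th is not covered by the text; here it is assumed to be treated like "smaller".

```python
def decide(f_min, f_th, g_num, g_num_max):
    """Return the next action of the extraction area evaluation means 110.

    f_min is the maximum fitness in the current population (the source's name).
    ASSUMPTION: f_min == f_th is handled like f_min < f_th.
    """
    if f_min > f_th:
        return "record"        # case [i]: record the best rectangle (means 106)
    if g_num <= g_num_max:
        return "recombine"     # case [ii]: go on to recombination (means 111)
    return "not_found"         # case [iii]: no target organ in the image
```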

[0029] The recombination operation means 111 operates as follows. First, the candidate selection means 201 performs selection of the chromosomes. Here, as shown in FIG. 7, roulette wheel selection is used, in which a chromosome is selected with a probability proportional to its fitness.

[0030] (Roulette wheel selection) (1) The fitness fitness(i) of each chromosome q_i (i = 1, ..., K) belonging to the chromosome set Pd and the sum f_total of the fitness values of all chromosomes are computed.

[0031] (2) The selection probability h(i) with which q_i is chosen as a parent producing the next-generation chromosomes is obtained as in (Equation 5).

[0032]

(Equation 5)

One way of assigning this probability to the chromosomes is the following. (3) The selection range L(i) of each chromosome is assigned to a subinterval of [0,1) using (Equation 6) and (Equation 7). That is, with

[0033]

(Equation 6)

the selection range L(i) of q_i is defined as

[0034]

(Equation 7)

Then a set G = (g(1), g(2), ..., g(K)) of uniform random numbers g(i) in [0,1) is generated. For each g(i), the index num(i) = j satisfying g(i) ∈ I(j) (i, j = 1, 2, ..., K) is found, and the resulting set Num = (num(1), num(2), ..., num(K)) designates the K chromosomes that are selected.

[0035] With this roulette wheel selection, chromosomes q_i are selected from the current chromosome population P. First, the selection range deriving means 204 computes, according to (Equation 5) to (Equation 7), the probability h(i) with which each chromosome is selected and its selection range L(i). The random number generation means 205 then generates K uniform random numbers g between 0 and 1. The set G of random numbers obtained by the random number generation means 205 and the selection ranges L(i) obtained by the selection range deriving means 204 are sent to the chromosome selection means 206, which finds the set Num with num(i) = j satisfying g(i) ∈ I(j). The chromosome selection means 206 then outputs a new chromosome population Pd made up of the chromosomes designated by Num. The crossover processing means 202 applies crossover to the new chromosome population Pd obtained by the candidate selection means 201. There are various crossover methods; this embodiment uses one-point or two-point crossover as shown in FIG. 5. The mutation means 203 then processes the new chromosome population obtained through the crossover processing means 202: mutation is realized by adding a random number drawn from a given range to each gene (an element of the coordinate vector in the multidimensional space) selected with a certain low probability. The mutation probability is made to differ between one half of the chromosome population and the other half, in an effort to better maintain the diversity of the chromosomes. Here the real-valued elements of the coordinate vector in the multidimensional space are arranged as they are and treated as a chromosome, but they may also be converted into a bit string. FIG. 8 shows an example of extracting the lips, a facial organ, from an input image. FIG. 8(a) shows the input image, FIG. 8(b) the person extraction result, and FIG. 8(c) an example of the facial organ extraction result obtained by the GA (genetic algorithm). FIG. 8(c) overlays a rectangle on the lip region detected in the input image of FIG. 8(a) and indicates that the lips were detected inside this rectangle. In the person region extraction, the saturation threshold was th_s = 25 and the lightness threshold was th_v = 18. For the facial organ extraction by GA, 100 chromosomes q_i were prepared per generation, and the result shown was obtained after 60 generations of iterative estimation. As these results show, the person's facial organ is extracted well.
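The crossover and mutation steps can be sketched as follows for real-valued chromosomes such as (x, y, width, height). FIG. 5 is not reproduced here, so the standard forms of one-point and two-point crossover are assumed. The parameters p_low, p_high and spread, and all function names, are hypothetical; the text only says that the mutation probability differs between the two halves of the population and that the added random number comes from a given range.

```python
import random


def one_point_crossover(a, b, rng=random):
    """Swap the tails of two chromosomes after a random cut point."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def two_point_crossover(a, b, rng=random):
    """Swap the middle segment between two random cut points (needs len >= 3)."""
    i, j = sorted(rng.sample(range(1, len(a)), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]


def mutate_population(pop, p_low, p_high, spread, rng=random):
    """Add uniform noise in [-spread, spread] to genes chosen with a low probability.

    ASSUMPTION: the first half of the population uses p_low and the
    second half uses p_high as the per-gene mutation probability.
    """
    half = len(pop) // 2
    mutated = []
    for idx, chrom in enumerate(pop):
        p = p_low if idx < half else p_high
        mutated.append([g + rng.uniform(-spread, spread) if rng.random() < p else g
                        for g in chrom])
    return mutated
```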

[0036] As described above, according to the present embodiment, after the person region is extracted from the input image using color information, a genetic algorithm searches for the region that best matches the template vector dictionary prepared in advance, and the facial organ is cut out. Facial organs can thus be extracted automatically without depending on the background or the size of the face.

[0037] (Second Embodiment) Next, a facial organ detection device according to a second embodiment of the present invention is described. FIG. 9 is a configuration diagram of the facial organ detection device according to the second embodiment. In FIG. 9, 901 is extraction vector neighborhood selection means that takes a plurality of vectors from the neighborhood of each extraction vector, and 902 is extraction vector neighborhood adjustment means that replaces the original extraction vector with the vector having the highest fitness among the vectors obtained by the extraction vector neighborhood selection means 901. The operation of the facial organ detection device of the second embodiment configured as above is described. As in the facial organ detection device of the first embodiment, for the input image supplied from the image input means 101, the luminance of unsuitable pixels is set to an out-of-range value based on the hue, saturation, and lightness obtained by the color information calculation means 102, and the person region is roughly cut out after luminance smoothing. Then, in the genetic algorithm processing means 105, an evaluation vector is first obtained from the ratios, to the entire person region, of the distribution values of the hue H and of the Q signal of the YIQ color system within a randomly cut-out rectangular area. The fitness of each extraction vector is determined by pattern matching between this evaluation vector and the template vectors in the evaluation template dictionary 109 prepared in advance for facial organ extraction. Genetic recombination operations based on this fitness repeatedly adjust the extraction vectors (the rectangular areas that are extraction candidates), so that the optimal facial organ region is estimated and recorded by the extraction area recording means 106. In this embodiment, local adjustment processing is added to the above in order to improve the efficiency of optimizing the extraction vectors. Before processing moves to the recombination operation means 111, mm vectors v_cd[j] (j = 1, ..., mm) are chosen arbitrarily inside the sphere in the multidimensional space that is centered on each current extraction vector v_c[i] (i = 1, 2, ..., K) and lies within the vector norm len. An evaluation vector v_ed[j] is obtained in the same way for the rectangular area represented by each of these mm vectors, and its fitness fitness_d(j) is obtained by evaluation against the evaluation template dictionary 109. Among the total of (mm+1) fitness values, namely the fitness_d(j) and the fitness fitness(i) of the original extraction vector v_c[i], the vector with the highest fitness replaces v_c[i] as the extraction vector on which the recombination operation means 111 performs the genetic recombination operation. This compensates for the lack of local search capability that has been regarded as a problem of genetic algorithm search, and allows the target facial organ to be detected quickly.
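The local adjustment of the second embodiment can be sketched as a small neighborhood search around one extraction vector. The sketch assumes the Euclidean norm for the sphere of radius len and uses simple rejection sampling to pick points inside it; fitness_fn stands in for the whole chain of evaluation vector generation and template matching, and all names are hypothetical.

```python
import math
import random


def random_in_ball(center, radius, rng=random):
    """Sample a point within Euclidean distance `radius` of `center`."""
    while True:
        offset = [rng.uniform(-radius, radius) for _ in center]
        if math.sqrt(sum(o * o for o in offset)) <= radius:
            return [c + o for c, o in zip(center, offset)]


def refine(v_c, fitness_fn, length, mm, rng=random):
    """Return the best of v_c and mm random neighbors, with its fitness.

    `length` plays the role of the norm bound len in the text.
    """
    best, best_fit = v_c, fitness_fn(v_c)
    for _ in range(mm):
        cand = random_in_ball(v_c, length, rng)
        cand_fit = fitness_fn(cand)
        if cand_fit > best_fit:
            best, best_fit = cand, cand_fit
    return best, best_fit
```

Because the original vector is always among the (mm+1) candidates, the returned fitness is never lower than the one the vector started with.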

[0038] As described above, this embodiment makes it possible to detect the target facial organ in the input image quickly and accurately, which is a significant benefit.

[0039] (Third Embodiment) Next, a facial organ detection device according to a third embodiment of the present invention is described. FIG. 10 shows the configuration of the facial organ detection device according to the third embodiment. 1001 is vector similarity deriving means that derives the similarity between the evaluation vectors for the extraction vectors obtained in 108; 1002 is extraction vector set dividing means that divides the current extraction vector set into a plurality of groups based on the similarities between vectors obtained by the vector similarity deriving means 1001; 1003 is group-specific recombination operation means that performs the genetic-algorithm recombination operation on extraction vectors of low similarity; and 1004 is multiple target recording means that ends the region extraction processing when the number of repetitions of the recombination operation on the extraction vectors exceeds the preset upper limit, and records the target regions represented by the extraction vectors that satisfy the preset extraction criterion in each group.

[0040] The operation of the facial organ detection device of the third embodiment configured as above is described. As in the facial organ detection device of the first embodiment, for the input image supplied from the image input means 101, the luminance of unsuitable pixels is set to an out-of-range value based on the hue, saturation, and lightness obtained by the color information calculation means 102, and the person region is roughly cut out after luminance smoothing. Then, in the genetic algorithm processing means 105, the fitness of each current extraction vector is evaluated by pattern matching between the template vectors in the evaluation template dictionary 109 prepared in advance for facial organ extraction and the evaluation vector obtained from the distribution ratios, relative to the entire person region, of the hue and of the Q signal of the YIQ color system within a randomly cut-out rectangular area. The optimal facial organ region is estimated by genetic-algorithm recombination operations based on this fitness. It is already known, however, that in genetic algorithms generally, as diversity within the set of search candidate vectors is lost, the search comes to depend more heavily on mutation to find the optimal solution and its search capability declines. To maintain diversity within the search candidate vector set, recombination between similar vectors must be avoided. Therefore the similarity between extraction vectors is computed, and crossover, the recombination operation of the genetic algorithm, is prohibited for any pair whose similarity exceeds a preset similarity threshold. First, the extraction vector similarity deriving means 1001 computes the similarity sim(i,j) between the vectors v_c[i] and v_c[j] in the extraction vector set. Many definitions of this similarity are possible, for example the norm between the two vectors, the angle between them, or the sum of the absolute element-wise differences. In this embodiment, however, in order to avoid searching the same region as far as possible, the similarity is the number of pixels in the overlap of the rectangular areas represented by the two extraction vectors divided by the pixel count of the smaller of the two rectangular areas. With this definition, the larger the overlap, the larger the similarity between the two vectors (FIG. 11). When, as in the present invention, the distribution ratios of the color information within a candidate region are used as its evaluation features, a partial region of the target facial organ often shows a fairly high fitness. As the region search by the genetic algorithm progresses, it therefore becomes increasingly likely that an extraction vector representing a region close to the target facial organ region and an extraction vector representing only part of that organ coexist in the same extraction vector set. In that case, if for example the norm between the two extraction vectors were used as the similarity, the two would clearly fall into different groups, and as a result several facial organs might be detected in the same region as if more than one were present. By contrast, with the similarity used in the present invention, when one region is contained in the other the similarity between the two extraction vectors is 1.0, so they are always classified into the same group and the above problem is avoided. In the early stage of region adjustment by the genetic algorithm, however, two regions would be regarded as belonging to the same group whenever one is contained in the other, no matter how different their pixel counts, and this would actually harm the optimization of the extraction regions. Therefore, when the ratio rrr between the number of pixels in the smaller extraction region and that in the larger one is below the area difference threshold th_check, the similarity of the two extraction vectors is multiplied by rrr. The extraction vectors whose similarity obtained in this way exceeds the prepared similarity threshold th_sim are classified by the extraction vector set dividing means 1002 into the same group Gs_kk (kk = 1, ..., gnum). When crossover is to be performed in 202, the group-specific recombination operation means 1003 checks the two vectors chosen by the candidate selection means 201: if both come from the same group as classified in 1002, the crossover of 202 is skipped and only the mutation of 203 is applied; if they come from different groups, the mutation of 203 is applied after the crossover of 202. This series of processing is repeated until the recombination operation in the genetic algorithm processing means 105 reaches the preset iteration threshold. At that point, for each group of extraction vectors, the rectangular areas represented by extraction vectors satisfying the fitness threshold, which is the prepared extraction criterion, are recorded in the multiple extraction recording means 1004, and the detection of the facial organs of multiple persons ends.
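The overlap-based similarity and the crossover restriction can be sketched as follows. Several points are assumptions: rectangles are given as (x0, y0, width, height) with (x0, y0) the center, areas are computed geometrically rather than by counting pixels, and rrr is taken as the smaller area divided by the larger one so that it lies in (0, 1] and lowers the similarity (the wording of the text leaves open which count is the numerator). The grouping into Gs_kk is reduced here to a pairwise check, and the function names are hypothetical.

```python
def overlap_similarity(r1, r2, th_check=None):
    """Overlap area divided by the smaller rectangle's area.

    r = (x0, y0, width, height), with (x0, y0) the rectangle center.
    If th_check is given, apply the area-difference correction with
    ASSUMPTION rrr = smaller area / larger area.
    """
    def edges(r):
        x0, y0, w, h = r
        return x0 - w / 2, x0 + w / 2, y0 - h / 2, y0 + h / 2

    left1, right1, top1, bottom1 = edges(r1)
    left2, right2, top2, bottom2 = edges(r2)
    overlap_w = max(0.0, min(right1, right2) - max(left1, left2))
    overlap_h = max(0.0, min(bottom1, bottom2) - max(top1, top2))
    area1, area2 = r1[2] * r1[3], r2[2] * r2[3]
    sim = (overlap_w * overlap_h) / min(area1, area2)
    if th_check is not None:
        rrr = min(area1, area2) / max(area1, area2)
        if rrr < th_check:
            sim *= rrr
    return sim


def crossover_allowed(sim, th_sim):
    """Crossover is skipped (mutation only) when the pair is too similar."""
    return sim <= th_sim
```

For a small rectangle fully inside a large one, the uncorrected similarity is 1.0, while the corrected value drops to rrr, so the two very different regions are no longer forced into the same group.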

[0041] As described above, this embodiment can detect the target facial organs of all the persons present in the input image efficiently and simultaneously, without repeating a procedure in which, after one person's facial organ is detected, that region is masked and the genetic algorithm is run again to detect the next person's facial organ. This is a significant benefit.

[0042] (Fourth Embodiment) Finally, a facial organ detection device according to a fourth embodiment of the present invention is described. FIG. 12 shows the configuration of the facial organ detection device according to the fourth embodiment. 1201 is a multiple organ evaluation template dictionary that stores evaluation template vectors generated from sets of sample images prepared in advance for each organ so that a plurality of facial organs can be extracted, and 1202 is multiple organ target recording means that records the target regions represented by the extraction vectors satisfying the preset extraction criterion in each group.

[0043] The operation of the facial organ detection device of the fourth embodiment configured as above is described. The processing from the image input means 101 through the group-specific recombination operation means 113 is the same as in the facial organ detection device of the third embodiment. The purpose of this embodiment is not to extract the facial organs of multiple persons in the input image but to detect multiple organs of one person's face, such as the eyes, nose, mouth, and eyebrows. For this reason, the sample image sets used to build the template dictionary 1201 for evaluating each extraction vector are prepared separately for each organ. The multiple organ evaluation template dictionary 1201 is created by applying learning vector quantization (LVQ) to the evaluation vectors obtained from these sample images. Learning vector quantization creates a set G_Tr of vectors T_r[k] = (tr(1,k), tr(2,k), ..., tr(M,k)) (k = 1, 2, ..., P) representing the P categories, one for each target organ. To create this vector set G_Tr, the present invention applies LVQ, which builds a set of vectors representing each category from the given vectors v by the following procedure. [i] For a vector v, the nearest representative vector v_c(n) is found by (Equation 8). Here n is the iteration count in LVQ and v_p(n) is a representative vector at iteration n.

[0044]

(Equation 8)

[ii] Let G_v be the category to which v belongs and G_vc the category to which v_c(n) belongs. If G_v and G_vc are the same category, v_c(n) is moved toward v as in (Equation 9); if they are different categories, v_c(n) is moved away from v as in (Equation 10). Representative vectors other than v_c are not updated.

[0045]

(Equation 9)

[0046]

(Equation 10)

alpha(n) is a monotonically decreasing function (0 < alpha < 1). LVQ is a self-organizing neural network method that extends VQ: the reference vectors representing the clusters in VQ are made to correspond to the connection weights between neurons, and appropriate reference vectors are found by supervised learning. LVQ brings vectors belonging to the same category closer together and pushes vectors belonging to different categories apart, thereby optimizing the boundaries between categories. In the present invention, LVQ is applied to the evaluation vectors obtained from the sample image sets given in advance for detecting the multiple organs, and the resulting set of reference vectors representing each target organ is prepared as the evaluation template dictionary 1201. The means 1202 records, for each group of the extraction vector set at the point when the iteration count in the genetic algorithm processing means 105 reaches the preset iteration threshold, the rectangular areas indicated by the extraction vectors that satisfy the extraction criterion. A minimum fitness can serve as the extraction criterion; a single uniform value may be used, but it is wiser to set it individually according to how difficult each facial organ is to detect.
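One LVQ update, steps [i] and [ii], can be sketched as below. Since (Equation 8) to (Equation 10) are not reproduced in this text, the sketch assumes the standard LVQ1 forms: the winner is the representative vector with the smallest Euclidean distance to v, and it is updated as v_c(n+1) = v_c(n) + alpha(n)(v - v_c(n)) for the same category or v_c(n+1) = v_c(n) - alpha(n)(v - v_c(n)) for a different one. The function name and the label values are hypothetical.

```python
def lvq_step(prototypes, labels, v, v_label, alpha):
    """Apply one LVQ1 update in place and return the winner's index.

    prototypes: list of representative vectors v_p(n)
    labels:     category of each representative vector
    alpha:      learning rate alpha(n), with 0 < alpha < 1
    """
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # [i] nearest representative vector (assumed form of Equation 8)
    c = min(range(len(prototypes)), key=lambda p: dist2(v, prototypes[p]))
    # [ii] attract for the same category, repel for a different one
    sign = 1.0 if labels[c] == v_label else -1.0
    prototypes[c] = [w + sign * alpha * (x - w) for w, x in zip(prototypes[c], v)]
    return c
```

Only the winning vector changes in each step, in line with the statement that representative vectors other than v_c are not updated.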

【００４７】[0047] As described above, the present embodiment aims to detect a plurality of facial organs of a person in an input image simultaneously and efficiently, and its effect is significant in many fields, including image-based personal verification. In the genetic algorithm processing means 105 of the facial organ detection devices according to the first to fourth embodiments of the present invention, the chromosome is a parameter vector consisting of the center coordinates (x0(i), y0(i)), the horizontal pixel count width(i), and the vertical pixel count height(i) (i = 1, 2, ..., K) of the rectangular area required for extraction. If an axis of symmetry along the vertical pixel direction of the rectangular area is considered, a tilt angle α is defined with counterclockwise rotation of that axis taken as positive, and α is added to the parameter vector, the method can also be applied to extracting tilted face regions. In that case, when the fitness of each extracted rectangular area is calculated, the image inside the rectangular area is rotated by the angle -α about the centroid of the rectangular area so that it can be matched against the evaluation template dictionary. The crossover processing means 202 of the facial organ detection devices according to the first to fourth embodiments has been described using the one-point or two-point crossover shown in FIG. 5, but a method that arbitrarily chooses two points on the straight line connecting the two vectors selected by the candidate selection means 201 is also conceivable.
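The line-based crossover mentioned at the end of this paragraph can be sketched as follows. This is a minimal illustration rather than the implementation of the crossover processing means 202: it assumes that each chromosome describes one rectangle as (x0, y0, width, height, tilt) and that the two children are drawn uniformly from the segment between the two parents. The text only says the two points are chosen arbitrarily on the connecting line, so the uniform draw and the restriction to the segment are assumptions, and the name `line_crossover` is hypothetical.

```python
import numpy as np


def line_crossover(parent_a, parent_b, rng):
    """Return two children lying on the segment between two parent chromosomes.

    Each parent is (x0, y0, width, height, tilt): the parameter vector from
    the text, extended by the tilt angle.
    """
    a = np.asarray(parent_a, dtype=float)
    b = np.asarray(parent_b, dtype=float)
    t = rng.uniform(0.0, 1.0, size=2)    # one position along the segment per child
    children = a + t[:, None] * (b - a)  # shape (2, 5)
    return children[0], children[1]


rng = np.random.default_rng(0)
child1, child2 = line_crossover((40, 60, 30, 12, 0.0), (50, 64, 36, 16, 0.2), rng)
```

Because the children are real-valued, the coordinates and pixel counts would need to be rounded before they are used to cut a rectangle out of the image.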

【００４８】[0048] Further, in creating the evaluation template dictionary in the facial organ detection devices according to the first to third embodiments of the present invention, a vector quantization method is applied to the evaluation vectors obtained from a previously prepared sample image set. It is also possible to use instead a learning vector quantization method, which adds a neural network learning function to vector quantization. In this way, rather than creating facial organ templates mechanically by looking only at the distribution of vectors in the evaluation vector space, the template vectors can be created while learning so that the facial organs of the same person always belong to the same category. This is expected to capture the characteristics of the color information of each individual's facial organs better and to further improve the efficiency of detecting the target facial organ in the input image.

【００４９】[0049] As described above, the first to fourth facial organ detection devices of the present invention relate to a technique that makes it possible to extract only a facial organ such as the mouth without being affected by the environment at the time of image input. They achieve stable region extraction with a simple structure, unaffected by the background, the size of the facial organ in the image, the lighting conditions, and the like.

【００５０】[0050] The same effects as described above are also obtained by using a medium on which is recorded a program for causing a computer to execute all or some of the functions of the means described in the above embodiments.

【００５１】[0051] The processing operations of the means in the above embodiments may be implemented in software, by a program running on a computer, or in hardware, by dedicated circuitry without using a computer.

【００５２】[0052]

[Effects of the Invention] As is apparent from the above description, the present invention has the advantage that facial organs can be detected more stably than before.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of the facial organ detection device according to the first embodiment of the present invention.

FIG. 2 is a block diagram showing the configuration of the recombination operation means of the facial organ detection device according to the first embodiment of the present invention.

FIG. 3 is a diagram showing the filter used for the luminance value smoothing processing in the present embodiment.

FIG. 4 is a conceptual diagram of the method for extracting a person area from an input image in the present embodiment.

FIG. 5 is a conceptual diagram of the crossover and mutation processing performed by the recombination operation means in the present embodiment.

FIG. 6 is a conceptual diagram showing the relationship between an extracted rectangular area and the chromosome structure in the genetic algorithm in the present embodiment.

FIG. 7 is a diagram for explaining the roulette selection method used for selection.

FIG. 8 shows an example of lip extraction according to the present embodiment. Halftone images shown on a display were output from a printer, and the results are photographs in place of drawings. (a) is a photograph in place of a drawing showing the input image, (b) is a photograph in place of a drawing showing the person image extracted by saturation and hue, and (c) is a photograph in place of a drawing showing the lip region extracted by the genetic algorithm.

FIG. 9 is a block diagram showing the configuration of the facial organ detection device according to the second embodiment of the present invention.

FIG. 10 is a block diagram showing the configuration of the facial organ detection device according to the third embodiment of the present invention.

FIG. 11 is a conceptual diagram of the group classification of the set of extraction candidate areas.

FIG. 12 is a block diagram showing the configuration of the facial organ detection device according to the fourth embodiment of the present invention.
[Explanation of symbols]

101 Image input means
102 Color information calculation means
103 Smoothing processing means
104 Person area limiting means
105 Genetic algorithm processing means
106 Extraction area recording means
107 Initial extraction vector setting means
108 Evaluation vector generation means
109 Evaluation template dictionary
110 Extraction area evaluation means
111 Recombination operation means
201 Candidate selection means
202 Crossover processing means
203 Mutation processing means
204 Selection range derivation means
205 Random number generation means
206 Extraction vector selection means
901 Extraction vector neighborhood selection means
902 Extraction vector neighborhood adjustment means
1001 Extraction vector similarity derivation means
1002 Extraction vector set division means
1003 Group-specific recombination operation means
1004 Multiple object recording means
1201 Multiple organ evaluation template dictionary
1202 Multiple organ object recording means

Claims

[Claims]

1. A facial organ detection device comprising: image input means for inputting an image; color information calculation means for calculating color information of the input image; smoothing processing means for extracting, from the color information obtained by the color information calculation means, pixels that do not belong to a person area and for performing smoothing processing on luminance; person area limiting means for limiting the person area based on the smoothing processing; initial extraction vector setting means for setting a group of extraction vectors that specify initial extraction areas within the person area; evaluation vector generation means for generating, from the distribution of color information in the extraction area specified by each set extraction vector, an evaluation vector used to evaluate that area; an evaluation template dictionary storing evaluation template vectors generated from a set of sample images for extraction prepared in advance; extraction area evaluation means for comparing each obtained evaluation vector with the evaluation template vectors in the evaluation template dictionary and evaluating its fitness; recombination operation means for performing a symbolic recombination operation on the extraction vectors based on the fitness when the fitness obtained by the extraction area evaluation means does not satisfy a preset extraction criterion; and target area recording means for recording the facial organ.

2. A facial organ detection device comprising: image input means for inputting an image; color information calculation means for calculating color information of the input image; smoothing processing means for extracting, from the color information obtained by the color information calculation means, pixels that do not belong to a person area and for performing smoothing processing on luminance; person area limiting means for limiting the person area based on the smoothing processing; initial extraction vector setting means for setting a group of extraction vectors that specify initial extraction areas within the person area; evaluation vector generation means for generating, from the distribution of color information in the extraction area specified by each set extraction vector, an evaluation vector used to evaluate that area; an evaluation template dictionary storing evaluation template vectors generated from a set of sample images for extraction prepared in advance; extraction area evaluation means for comparing each obtained evaluation vector with the evaluation template vectors in the evaluation template dictionary and evaluating its fitness; extraction vector neighborhood selection means for extracting a group of vectors from the neighborhood of each extraction vector when the fitness obtained by the extraction area evaluation means does not satisfy a preset extraction criterion; extraction vector neighborhood adjustment means for replacing the original extraction vector with the vector having the highest fitness in the group of vectors obtained by the extraction vector neighborhood selection means; recombination operation means for performing, based on the fitness, a symbolic recombination operation on the extraction vectors replaced by the extraction vector neighborhood adjustment means; and target area recording means for ending the area extraction processing when the extraction criterion is satisfied and recording the optimal extraction area as a facial organ.

3. A facial organ detection device comprising: image input means for performing image input processing; color information calculation means for calculating color information of the input image; smoothing processing means for extracting, from the color information obtained by the color information calculation means, pixels that do not belong to a person area and for performing smoothing processing on luminance; person area limiting means for limiting the person area based on the smoothing processing; initial extraction vector setting means for setting a group of extraction vectors that specify initial extraction areas within the person area; evaluation vector generation means for generating, from the distribution of color information in the extraction area specified by each set extraction vector, an evaluation vector used to evaluate that area; extraction vector similarity derivation means for deriving the similarity between extraction vectors; an evaluation template dictionary storing evaluation template vectors generated from a set of sample images prepared in advance for extracting a facial organ; extraction area evaluation means for comparing the evaluation vector corresponding to each extraction area obtained by the evaluation vector generation means with the evaluation template vectors in the evaluation template dictionary and evaluating its fitness; extraction vector set division means for dividing the extraction vectors into a plurality of groups based on the extraction vector similarity; group-specific recombination operation means for performing, based on the fitness, a recombination operation on a plurality of extraction vectors having a low similarity; and multiple object recording means for ending the area extraction processing when the number of repetitions of the extraction vector recombination operation exceeds a preset upper limit, and recording the target areas represented by the extraction vectors that satisfy a preset extraction criterion in each group.

4. A facial organ detection device comprising: image input means for performing image input processing; color information calculation means for calculating color information of the input image; smoothing processing means for extracting, from the color information obtained by the color information calculation means, pixels that do not belong to a person area and for performing smoothing processing on luminance; person area limiting means for limiting the person area based on the smoothing processing; initial extraction vector setting means for setting a group of extraction vectors that specify initial extraction areas within the person area; evaluation vector generation means for generating, from the distribution of color information in the extraction area specified by each set extraction vector, an evaluation vector used to evaluate that area; extraction vector similarity derivation means for deriving the similarity between extraction vectors; a multiple organ evaluation template dictionary storing evaluation template vectors generated for each organ from sets of sample images prepared in advance for extracting a plurality of facial organs; extraction area evaluation means for comparing the evaluation vector corresponding to each extraction area obtained by the evaluation vector generation means with the evaluation template vectors in the multiple organ evaluation template dictionary and evaluating its fitness; extraction vector neighborhood selection means for selecting a group of vectors from the neighborhood of each extraction vector; extraction vector neighborhood adjustment means for replacing the original extraction vector with the vector having the highest fitness in the selected group of vectors; extraction vector set division means for dividing the extraction vectors into a plurality of groups based on the extraction vector similarity; group-specific recombination operation means for performing, based on the fitness, a recombination operation on a plurality of extraction vectors having a low similarity; and multiple organ object recording means for ending the area extraction processing when the number of repetitions of the extraction vector recombination operation exceeds a preset upper limit, and recording the target areas represented by the extraction vectors that satisfy a preset extraction criterion in each group.

5. The facial organ detection device according to claim 1, wherein the recombination operation means uses a genetic algorithm that performs genetic operations by crossover, mutation, and selection.

6. The facial organ detection device according to claim 3, wherein the group-specific recombination operation means uses a genetic algorithm that performs genetic operations by crossover, mutation, and selection.

7. The facial organ detection device according to claim 1, wherein the evaluation vector is created from a distribution histogram of hue and lightness in each area.

8. The facial organ detection device according to claim 1, wherein the evaluation vector is created from a distribution histogram of hue and color difference in each area.

9. A medium on which is recorded a program for causing a computer to execute all or some of the functions of the means recited in any one of claims 1 to 8.