JP2000311248A

JP2000311248A - Image processor

Info

Publication number: JP2000311248A
Application number: JP11121097A
Authority: JP
Inventors: Atsuo Matsuoka; 篤郎松岡; Ryushi Funayama; 竜士船山; So Takezawa; 創竹澤; Yoshinori Nagai; 義典長井; Ai Ito; 愛伊藤; Minehiro Konya; 峰弘紺矢
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 1999-04-28
Filing date: 1999-04-28
Publication date: 2000-11-07

Abstract

PROBLEM TO BE SOLVED: To provide an image processor which can fast extract an optional feature value from an image with high strength and accuracy and also can easily synthesize a portrait of high picture quality against a problem that much time is required for performing the image processing to extract the feature value of a specific object included in an image, the wrong feature value is extracted by mistake or many positions must be designated when a processing range is directly designated by means of a position designation means, etc. SOLUTION: This image processor is provided with an input means 11 which inputs the images, a storage means 12 which stores the inputted images, an arithmetic means 13 which performs an optional operation, a position designation means 14 which can designate an optional position in an image, a recognition means which recognizes the position and size of an object included in an image and a feature extraction means 10 which extracts an optional feature from a relevant image by inputting one or more positions of a relevant object included in the image as long as the relation set between the position and size of the object satisfies a fixed constraint condition.

Description

DETAILED DESCRIPTION OF THE INVENTION

【０００１】[0001]

【発明の属する技術分野】本発明は、パソコン、ワープ
ロ、ワークステーション、携帯型情報ツール、コピー
機、スキャナ装置、ファクシミリ、テレビ、ビデオ、ビ
デオカメラなどの情報機器や電子機器に用いられ、入力
した画像に関する特定の特徴量、例えば人物画像におけ
る目や口などの位置や大きさ、形状を抽出するととも
に、この抽出された情報に基づき、入力した画像を操作
者の所望する状態、例えば、漫画風の趣をもつ画像を生
成することのできる画像処理装置に関するものである。BACKGROUND OF THE INVENTION 1. Field of the Invention The present invention is used for information devices and electronic devices such as personal computers, word processors, workstations, portable information tools, copiers, scanners, facsimiles, televisions, videos, and video cameras. Specific features related to the image, such as the position, size, and shape of the eyes and mouth in a human image, are extracted, and based on the extracted information, the input image can be converted to a state desired by the operator, such as a cartoon style. The present invention relates to an image processing apparatus capable of generating an image having a taste.

【０００２】[0002]

【従来の技術】以下に、本願発明の画像処理に関する従
来からの技術を記載します。尚、記載に際しては、下記
の文献を引用する場合には、該当する文献名を「文
献［］」という形式で表現します。・文献［１］：Ｒ．ＢｒｕｎｅｌｌｉａｎｄＴ．
Ｐｏｇｇｉｏ， “ＦａｃｅＲｅｃｏｇｎｉｔｉｏ
ｎ：ＦｅａｔｕｒｅｓｖｅｒｓｕｓＴｅｍｐｌａ
ｔｅｓ”，ＩＥＥＥＴｒａｎｓａｃｔｉｏｎｓｏ
ｎＰａｔｔｅｒｎＡｎａｌｙｓｉｓａｎｄＭａ
ｃｈｉｎｅＩｎｔｅｌｌｉｇｅｎｃｅ，Ｖｏｌ．１
５，Ｎｏ．１０，ｐｐ．１０４２−１０５２，１
９９３．・文献［２］：Ｍ．ＴｕｒｋａｎｄＡ．Ｐｅｎ
ｔｌａｎｄ， “ＦａｃｅＲｅｃｏｇｎｉｔｉｏｎ
ＵｓｉｎｇＥｉｇｅｎｆａｃｅｓ”，Ｐｒｏｃｅｅ
ｄｉｎｇｓｏｆＩＥＥＥＣｏｎｆｅｒｅｎｃｅ
ｏｎＣｏｍｐｕｔｅｒＶｉｓｉｏｎａｎｄＰａ
ｔｔｅｒｎＲｅｃｏｇｎｉｔｉｏｎ，ｐｐ．５８６−
５９１，１９９１．・文献［３］：船山竜士，横矢直和，岩佐英彦，竹村治
雄， “複数の動的な網のモデルの協調とその顔部品抽
出への応用”，電子情報通信学会技術報告，ＰＲＵ９
５−１７９，ｐｐ．１５−２２，１９９５．・文献［４］：船山竜士，竹澤創，紺矢峰弘，斗谷充
宏， “ユーザーの簡易指定を伴う顔領域の適応的セグ
メンテーション”，情報処理学会第５５回全国大会，
６ＡＢ−３，１９９７．・文献［５］：高木幹雄，下田陽久監修， “画像解
析ハンドブック”，東京大学出版会，１９９１．・文献［６］：Ｍ．Ｋａｓｓ．：Ｓｎａｋｅｓ：Ａｃ
ｔｉｖｅＣｏｎｔｏｕｒＭｏｄｅｌｓ，Ｉｎｔ．
Ｊ．Ｃｏｍｐｕｔ．Ｖｉｓｉｏｎ，ｐ．３２１，１
９８８．・文献［７］：細井聖他，髪型の認識と合成，電
子情報通信学会技術研究報告，ＰＲＭＵ９７−１５
５，ｐｐ．２５−３２（１９９７）．・文献［８］：特開平１０−３２０５４３号公報・文献［９］：特開平１０−２５５０１７号公報・文献［１０］：特開平０８−３０５８８０号公報・文献［１１］：特開平１０−２４０９２１号公報・文献［１２］：特開平１１−０１５９４７号公報（１）［請求項１，２，６に関する従来技術］一般に、画像中の任意の特徴量を抽出するには次のよう
な手法がある。尚、ここでいう任意の特徴量とは、画像
中に含まれる特定物体の任意の座標、あるいは座標の組
み合わせで表現することのできる形状や大きさ、あるい
は特定の種別や属性を表現することのできる符号等であ
る。例えば、画像中に人物顔が含まれており、ここでい
う特定物体をその人物の右目であるとすると、任意の特
徴量の一つとして、例えば当該画像中での当該右目の目
頭の座標とすることができるし、例えば当該画像中での
当該右目に外接する矩形をその左上の座標と右下の座標
の組み合わせで表現したものとすることができるし、例
えば当該右目がタレ目であるという種別を表現する符号
とすること等ができる。2. Description of the Related Art Conventional techniques relating to image processing of the present invention are described below. In addition, when citing the following documents, the relevant document name will be expressed in the format of "Document []". Reference [1]: R. Brunelli and T.M.
Poggio, “Face Recognition
n: Features versus Templa
tes ”, IEEE Transactions o
n Pattern Analysis and Ma
chine Intelligence, Vol. 1
5, No. 10, pp. 1042-1052, 1
993. Reference [2]: M.A. Turk and A. Pen
land, “Face Recognition”
Using Eigenfaces ", Processe
dings of IEEE Conference
on Computer Vision and Pa
ttern Recognition, p. 586-
591, 1991.・ Document [3]: Ryushi Funayama, Naokazu Yokoya, Hidehiko Iwasa, Haruo Takemura, “Cooperation of multiple dynamic network models and its application to facial parts extraction”, IEICE technical report, PRU9
5-179, p. 15-22, 1995.・ Reference [4]: Tatsushi Funayama, Hajime Takezawa, Minehiro Konya, Mitsuhiro Tootani, “Adaptive Segmentation of Face Region with Simple User Specification”, IPSJ 55th Annual Convention,
6AB-3, 1997.・ Reference [5]: Mikio Takagi, supervised by Hirohisa Shimoda, “Image Analysis Handbook”, University of Tokyo Press, 1991. Reference [6]: M.A. Kass. : Snakes: Ac
five Contour Models, Int.
J. Comput. Vision, p. 321, 1
988.・ Reference [7]: Satoshi Hosoi et al., Recognition and Synthesis of Hairstyles, IEICE Technical Report, PRMU97-15
5, pp. 25-32 (1997). -Document [8]: JP-A-10-320543-Document [9]: JP-A-10-255017-Document [10]: JP-A-08-305880-Document [11]: JP-A-10-240921 [Patent Document 1]-Document [12]: Japanese Patent Laying-Open No. 11-015947 (1) [Prior Art Related to Claims 1, 2, and 6] Generally, the following method is used to extract an arbitrary feature amount in an image. is there. Here, the arbitrary feature amount refers to an arbitrary coordinate of a specific object included in an image, a shape or size that can be expressed by a combination of coordinates, or a specific type or attribute. It is a code that can be used. For example, if a human face is included in the image, and the specific object referred to here is the right eye of the person, as one of the arbitrary feature amounts, for example, the coordinates of the inner corner of the right eye in the image and For example, a rectangle circumscribing the right eye in the image can be represented by a combination of upper left coordinates and lower right coordinates, and for example, the right eye is a sagging eye. It can be a code representing the type.

【０００３】原始的な手法は、表示装置に表示された画
像に対して、操作者が目視により任意の特徴量を、位置
指定手段等を用いて入力することが考えられる。これ
は、当然のことながら作業の煩雑さや、同一画像かつ同
一物体かつ同一特徴量でも作業者によって異なる特徴量
が入力され得る非客観性、同一画像かつ同一物体かつ同
一特徴量でも作業する環境や時間等の状態によって異な
る特徴量が入力され得る非恒常性等が問題となる。した
がって、これは一部の高品質で少量の映像制作の場、あ
るいは非客観性等が作業者の感性として尊重されるよう
な芸術作品制作などの特殊な場面でしか適用されないこ
とが多い。[0003] As a primitive method, it is conceivable that an operator visually inputs an arbitrary feature amount to an image displayed on a display device by using a position specifying means or the like. This is, of course, the complexity of work, the non-objectivity in which different features can be input by the operator even for the same image and the same object and the same feature, the environment for working with the same image and the same object and the same feature, There is a problem of non-constancy and the like in which different feature amounts can be input depending on states such as time. Therefore, this is often applied only in some places of high quality and small amount of video production, or in special situations such as art work production where non-objectivity etc. is respected as the sensitivity of workers.

【０００４】近年、パソコンやワープロ、デジタルカメ
ラ等の普及に伴い、手軽に画像を取得し、取得画像に対
して様々な処理を施すことで実用性、娯楽性を高めるこ
とが行われるようになってきている。その画像処理のひ
とつとして、上述したような任意の特徴量を抽出し、そ
れに従って種々の処理を行うといったことがある。例え
ば取得画像中の自動車のボディ色を任意の色に変換する
ような処理を施そうとした場合、画像中のどの領域が当
該自動車のボディであるかを示す特徴量を抽出する必要
があるし、例えば取得画像中に人物顔が含まれている場
合に、その人物が誰であるか判定（以下、個人同定と記
す）するには、その顔に関する様々な特徴量（目、鼻、
口の位置や形状等）を抽出する必要がある。In recent years, with the spread of personal computers, word processors, digital cameras, and the like, it has become possible to easily acquire images and perform various processes on the acquired images to enhance practicality and entertainment. Is coming. As one of the image processes, there is a case where an arbitrary feature amount as described above is extracted and various processes are performed according to the extracted feature amount. For example, if an attempt is made to convert the body color of a car in an acquired image into an arbitrary color, it is necessary to extract a feature amount indicating which region in the image is the body of the car. For example, when a person's face is included in an acquired image, in order to determine who the person is (hereinafter, referred to as individual identification), various feature amounts (eyes, nose,
It is necessary to extract the position and shape of the mouth).

【０００５】上記の他にも様々な応用が考えられ、画像
中の任意の特徴量を頑健かつ高精度かつ高速に抽出する
ことは、非常に有用なことであると考えられる。[0005] In addition to the above, various applications are conceivable, and it is considered that it is very useful to extract an arbitrary feature amount in an image robustly, with high accuracy, and at high speed.

【０００６】画像中に存在する任意の特徴量を抽出する
基本的な手法としては、テンプレートマッチング、投影
法、固有空間法、色情報を利用した手法等様々な手法が
提案されている。Various methods have been proposed as a basic method for extracting an arbitrary feature quantity existing in an image, such as a template matching method, a projection method, an eigenspace method, and a method using color information.

【０００７】テンプレートマッチングは、例えば上記文
献［１］に記載されており、人物顔を含んだ画像中の顔
が誰のものであるか判定するような場合に用いられる。
判定すべきパターン、すなわち、ここでは人物顔の濃淡
画像を複数記憶しておき、入力画像と各記憶パターンと
の類似度を特徴量とすることで、個人同定を行うことが
できる。[0007] Template matching is described, for example, in the above-mentioned document [1], and is used to determine who owns a face in an image including a human face.
A plurality of patterns to be determined, that is, a plurality of grayscale images of a person's face are stored here, and individual identification can be performed by using the similarity between the input image and each storage pattern as a feature amount.

【０００８】投影法は、同様に上記文献［１］に記載さ
れており、入力画像の輝度値や微分値を水平、あるいは
垂直方向に合計しヒストグラムを作成することで、目に
外接する矩形の座標等を特徴量として抽出することがで
きる。The projection method is also described in the above-mentioned document [1]. The luminance value and the differential value of the input image are summed in the horizontal or vertical direction to form a histogram, thereby forming a rectangle circumscribing the eye. Coordinates and the like can be extracted as feature amounts.

【０００９】固有空間法は、上記文献［２］に記載され
ており、テンプレートマッチングのようにパターンは記
憶するが、そのパターンを単一の画像とするのではなく
複数の画像の確率分布として記憶するものである。その
ため、テンプレートマッチングに比べて対象パターンの
ばらつき、例えば、同一人物の無表情顔でも時間や環境
によって変化すること等の影響がより小さくなるという
利点を持った手法である。The eigenspace method is described in the above reference [2], and stores a pattern as in template matching, but stores the pattern as a probability distribution of a plurality of images instead of a single image. Is what you do. Therefore, compared to the template matching, this method has an advantage in that the influence of variations in the target pattern, for example, even a non-expression face of the same person changes with time or environment is reduced.

【００１０】また、上記文献［３］に記載されているよ
うに抽出したい物体の色分布をあらかじめ統計的に求め
ておき、入力画像中の当該分布に当てはまる色をもった
画素の座標を特徴量として抽出する手法がある。これ
は、例えば肌色の画素のみを抽出してそれを白色方向の
色に近づけることで、他の色はそのままに肌を色白に見
えるように変換するような応用に適用可能である。（２）［請求項３に関する従来技術］認識対象の領域を得るために画像の２値化を行なうに
は、通常次のような手法がとられる。まず必要に応じ画
像を変換する。この変換には、輝度への変換、微分オペ
レータによる変換等の方法が使われる。次に閾値を決定
する。閾値の決定には固定値を使用する方法や、画像中
の画素値に対し判別分析、算術平均、メディアン等によ
る演算処理を行ない決定する方法等がある。次にその閾
値と全ての画素値を比較し、画素値が閾値よりも大きい
場合は１、画素値が閾値よりも小さい場合は０を新たな
画素値とする画像を生成する。（３）［請求項４，５，８に関する従来技術］目や口などの顔部品の位置、大きさ検出を行なうには、
通常次のような手法がとられる。まずユーザーによる指
定は何もなく、テンプレートマッチング、投影法、固有
空間法、色情報を利用した手法等の様々な手法により画
像中の顔や顔部品を検出する方法がある。その他に、ユ
ーザーは顔の大きさは指定せず概略位置のみを指定、も
しくはユーザーは顔の位置及び概略位置及びおおよその
大きさを指定しテンプレートマッチング、投影法、固有
空間法、色情報を利用した手法等の様々な手法を、ユー
ザーが指定した情報をもとに顔もしくは顔部品があると
思われる画像中の一部分を推定し、その部分にのみ上記
手法を適用する方法がある。Further, as described in the above document [3], the color distribution of the object to be extracted is statistically obtained in advance, and the coordinates of pixels having a color applicable to the distribution in the input image are calculated as feature quantities. There is a method of extracting as This can be applied to, for example, an application in which only skin color pixels are extracted and made closer to the color in the white direction, and the other colors are converted to make the skin look fair. (2) [Prior Art Related to Claim 3] To binarize an image in order to obtain a region to be recognized, the following method is usually employed. First, convert the image as needed. For this conversion, a method of conversion into luminance, conversion by a differential operator, or the like is used. Next, a threshold is determined. There are a method of determining the threshold value using a fixed value, a method of performing a discriminant analysis, arithmetic averaging, a median or the like on pixel values in an image, and the like. Next, the threshold is compared with all pixel values, and an image is generated with a new pixel value of 1 if the pixel value is larger than the threshold and 0 if the pixel value is smaller than the threshold. (3) [Prior Art Related to Claims 4, 5, and 8] To detect the position and size of face parts such as eyes and mouth,
Usually, the following method is used. First, there is no user specification, and there is a method of detecting a face or a face part in an image by various methods such as a template matching method, a projection method, an eigenspace method, and a method using color information. In addition, the user specifies only the approximate position without specifying the face size, or the user specifies the face position, approximate position and approximate size, and uses template matching, projection method, eigenspace method, color information There is a method of estimating a part of an image in which a face or a face part is considered to be present based on information specified by a user, and applying the above method only to that part.

【００１１】また、撮影時にカメラと顔の位置関係を一
定に保った状態で撮影を行なうことなどにより入力画像
の条件を一定に保ち、画像中の顔や顔部品の位置が一定
の位置／大きさの範囲内に入るようにした上で上記様々
な手法により画像中の顔や顔部品を検出する方法があ
る。（４）［請求項７に関する従来技術］目の形状を判定するには、通常次のような手法がとられ
る。目の位置及び大きさがすでにわかっているものとし
て、上記テンプレートマッチングや固有空間法のような
手法を用い、前もって目の形状を複数のカテゴリーに分
類しておき、各カテゴリーについて、当該カテゴリーを
表す画像特徴を備えるパターンを作成し、そのパターン
を記憶しておき、それと入力画像との間との類似度を計
算し、最も近いものを入力画像中の目の形状とする。（５）［請求項９に関する従来技術］眉毛の位置及び大きさを検出するには、通常次のような
手法がとられる。上記テンプレートマッチングや固有空
間法のような手法を用い、前もって平均的な眉の形状を
表す画像特徴を備えるパターンを作成し、そのパターン
を記憶しておき、それを、入力画像上を移動しながら対
応する入力画像の部分画像との間の類似度を計算し、類
似度が最も高い時の入力画像上でのパターンの位置を眉
の位置とする。その時のパターンの大きさから眉の大き
さが決定される。あるいは、眉毛はその周辺画素より暗
い画素で構成されているという特徴を利用し、入力画像
を２値化し、同様に２値化した画像での平均的な眉の形
状を表す画像特徴を備えるパターンを作成し、同様にテ
ンプレートマッチングや固有空間法のような手法を用い
ることで眉の位置及び大きさを検出する。（６）［請求項１０に関する従来技術］眉毛の形状を判定するには、通常次のような手法がとら
れる。眉の位置及び大きさがすでにわかっているものと
して、上記テンプレートマッチングや固有空間法のよう
な手法を用い、前もって眉毛の形状を複数のカテゴリー
に分類しておき、各カテゴリーについて、当該カテゴリ
ーを表す画像特徴を備えるパターンを作成し、そのパタ
ーンを記憶しておき、それと入力画像との間との類似度
を計算し、最も近いものを入力画像中の眉毛の形状とす
る。あるいは、眉毛はその周辺画素より暗い画素で構成
されているという特徴を利用し、入力画像を２値化し、
同様に２値化した画像での各カテゴリーを表す画像特徴
を備えるパターンを作成し、同様にテンプレートマッチ
ングや固有空間法のような手法を用いることで眉毛の形
状を判定する。（７）［請求項１１に関する従来技術］顎の形状を判定するには、通常次のような方法がとられ
る。まず顔輪郭線の抽出を行なった後に、その形状とあ
らかじめ用意した基準形状とのテンプレートマッチング
を行なうことにより、もっとも近い基準輪郭線形状を選
択する。ここで、顔輪郭線を抽出する方法としては、例
えば、動的輪郭モデルを用いる手法が考えられる。これ
は抽出対象の近傍に初期輪郭と呼ばれる仮の輪郭線を仮
定し、この輪郭線上の点を真の輪郭線上で最小となるよ
うに設定されたエネルギー関数に基づき移動させること
により、輪郭線を求める手法である。この際、エネルギ
ー関数として、輪郭線のなめらかさを表すエネルギーや
輪郭線を収縮させるエネルギーの他に、物体の輪郭を特
徴づけるエネルギー（以下、画像エネルギーと呼ぶ）を
用いる。画像エネルギーとして、例えば、原画像のエッ
ジを抽出して得られるエッジ画像を利用する。上記のよ
うに動的輪郭モデルを用いて輪郭抽出を行なう手法とし
て上記文献［６］に記載されているＳｎａｋｅ法などが
ある。（８）［請求項１２，１３に関する従来技術］髪形状を分類するには、例えば上記文献［７］に記載さ
れているように、髪色を抽出し、髪色またはそれにに近
い画素を抽出することにより髪領域を切り出し、その形
状を分類することが行われる。また、髪色を抽出するに
は、例えば上記文献［７］に記載されているように、ま
ず、肌色及び背景色を抽出し、次に、肌色またはそれに
近い画素を抽出することにより顔の肌色部分を抽出し、
及び、背景色またはそれに近い画素を抽出することによ
り背景部分を抽出し、さらに、顔の位置及び幅などから
髪画素の存在領域を推定し、この領域内の髪画素から髪
色を決定していた。（９）［請求項１４，１５に関する従来技術］髪形状を分類するには、上記髪色を用いて、髪色または
それに近い画素を抽出することにより髪領域を切り出し
た後、例えば上記文献［７］に記載されているように、
解像度を落とした上で、テンプレートマッチングの手法
により、予め用意されたテンプレートパターンのうち最
も類似度の高いテンプレートパターンの属するカテゴリ
ーに分類することが行われていた。（１０）［請求項１６，１７に関する従来技術］部品選
択により似顔絵を生成する装置は、髪部品の決定を上記
髪形状に基づき、１つの髪部品を選択することで行われていた。また、
上記文献［８］に記載されているように、髪領域を２値
化したものと、前髪部分の髪を表現する小部品とを組み
合わせる手法もあった。（１１）［請求項１８，１９に関する従来技術］上記文献［９］には、似顔絵を画像から抽出した特徴量
に基づいて合成する技術が開示されている。この技術
は、性別または成人か否か及びそれぞれにおける平均的
な配置との比較に基づいて、抽出された位置情報を基準
とし、性別または成人か否かによって配置特徴量の修正
量を調整し、顔部品の配置情報を決定している。しかし
ながら、この技術において顔部品は顔輪郭とは別に配置
情報が決定され、最終的に顔輪郭と合成されているにす
ぎない。（１２）［請求項２０，２１に関する従来技術］従来の似顔絵合成技術においては、顔の対称性などにも
とづく編集操作の簡易化技術が多く報告されている。例
えば、上記文献［１０］に記載されているような右目と
左目が同じ編集操作を受けるという技術である。しかし
ながら、顔の部品の同じ色の部品あるいは同じ特性の部
品は、同じ編集操作を受けるといった点に注目した編集
操作の簡易化技術はあまり知られていない。In addition, the conditions of the input image are kept constant by taking a picture while keeping the positional relationship between the camera and the face constant at the time of taking the picture, and the position of the face or face parts in the image is kept at a fixed position / size. There is a method of detecting a face or a face part in an image by using the above-mentioned various methods after setting the value within the range. (4) [Prior Art Related to Claim 7] To determine the shape of an eye, the following method is usually used. Assuming that the position and size of the eyes are already known, the shapes of the eyes are classified in advance into a plurality of categories using a method such as the template matching or the eigenspace method described above, and for each category, the category is expressed. A pattern having image features is created, the pattern is stored, the similarity between the pattern and the input image is calculated, and the closest one is set as the eye shape in the input image. (5) [Prior Art Related to Claim 9] In order to detect the position and size of the eyebrows, the following method is usually employed. Using a method such as the template matching or the eigenspace method described above, a pattern having an image characteristic representing an average eyebrow shape is created in advance, the pattern is stored, and the pattern is moved while moving on the input image. The similarity between the corresponding partial image of the input image and the partial image is calculated, and the position of the pattern on the input image when the similarity is the highest is defined as the position of the eyebrow. The size of the eyebrows is determined from the size of the pattern at that time. Alternatively, utilizing the feature that the eyebrows are composed of pixels darker than the surrounding pixels, the input image is binarized, and a pattern having image characteristics representing an average eyebrow shape in the binarized image in the same manner Is created, and the position and size of the eyebrows are detected by using a method such as template matching or the eigenspace method. (6) [Prior Art Related to Claim 10] In order to determine the shape of eyebrows, the following method is usually used. Assuming that the position and size of the eyebrows are already known, the shapes of the eyebrows are classified in advance into a plurality of categories using a method such as the template matching or the eigenspace method, and for each category, the category is expressed. A pattern having image features is created, the pattern is stored, the similarity between the pattern and the input image is calculated, and the closest one is used as the shape of the eyebrows in the input image. Alternatively, the input image is binarized by utilizing the feature that the eyebrows are composed of pixels darker than the surrounding pixels,
Similarly, a pattern having image features representing each category in the binarized image is created, and the shape of the eyebrows is similarly determined by using a method such as template matching or the eigenspace method. (7) [Prior Art Related to Claim 11] To determine the shape of the jaw, the following method is usually used. First, after extracting the face outline, the closest reference outline shape is selected by performing template matching between the shape and a reference shape prepared in advance. Here, as a method of extracting the face outline, for example, a method using an active outline model can be considered. This assumes a temporary outline called an initial outline in the vicinity of the extraction target, and moves the points on this outline based on the energy function set to be the minimum on the true outline, thereby shifting the outline. This is the method to be found. At this time, in addition to the energy representing the smoothness of the contour and the energy for contracting the contour, an energy characterizing the contour of the object (hereinafter referred to as image energy) is used as the energy function. As the image energy, for example, an edge image obtained by extracting an edge of the original image is used. As a method of extracting a contour using the active contour model as described above, there is the Snake method described in the above-mentioned reference [6]. (8) [Prior Art Related to Claims 12 and 13] To classify a hair shape, for example, as described in the above-mentioned reference [7], a hair color is extracted and a pixel close to the hair color is extracted. By doing so, a hair region is cut out and its shape is classified. To extract the hair color, for example, as described in the above reference [7], first, the skin color and the background color are extracted, and then the skin color of the face is extracted by extracting the skin color or a pixel similar thereto. Extract the parts,
In addition, a background portion is extracted by extracting a background color or a pixel close to the background color, and further, an area where hair pixels are present is estimated from the position and width of the face, and the hair color is determined from the hair pixels in this area. Was. (9) [Prior Art Related to Claims 14 and 15] To classify a hair shape, a hair region is cut out by extracting a hair color or a pixel close to the hair color using the hair color. 7],
After reducing the resolution, a template matching method has been used to classify the template pattern prepared in advance into a category to which the template pattern having the highest similarity belongs. (10) [Prior Art Related to Claims 16 and 17] In a device for generating a portrait by selecting parts, a hair part is determined by selecting one hair part based on the hair shape. Also,
As described in the above reference [8], there is also a method of combining a binarized hair region with a small part that expresses the hair of the bangs. (11) [Prior Art Regarding Claims 18 and 19] The above-mentioned document [9] discloses a technique of synthesizing a portrait based on a feature amount extracted from an image. This technology adjusts the correction amount of the placement feature based on gender or adult, based on the extracted location information based on gender or adult and comparison with the average arrangement in each, The arrangement information of the face parts is determined. However, in this technique, the arrangement information of the face part is determined separately from the face contour, and the face part is merely synthesized with the face contour. (12) [Prior Art Regarding Claims 20 and 21] In the conventional portrait synthesizing techniques, many techniques for simplifying an editing operation based on symmetry of a face have been reported. For example, there is a technique described in the above document [10] in which the right eye and the left eye receive the same editing operation. However, there is not much known a technique for simplifying an editing operation that focuses on the fact that a part of the same color or a part having the same characteristics as a face part receives the same editing operation.

【００１２】[0012]

【発明が解決しようとする課題】しかしながら、前記記
載の技術においては、なお以下のような課題を有してい
る。（１）［請求項１，２の発明］テンプレートマッチングで特徴量を抽出する手法は、
処理が単純であり実現が容易であるという特徴を持つ
が、入力画像と比較するパターンを画像として記憶して
いるという特性上、次のような問題点がある。However, the technique described above still has the following problems. (1) [Inventions of Claims 1 and 2] A method of extracting a feature amount by template matching is as follows.
Although the feature is that the processing is simple and easy to realize, there is the following problem due to the characteristic that the pattern to be compared with the input image is stored as an image.

【００１３】記憶しているパターン画像（以下、テンプ
レート画像と記す）の画素数が大きい、すなわちより細
かい特徴まで含んだテンプレート画像の場合、入力画像
中の対象としていない物体の特徴量が、対象としている
物体の特徴量と同じになる、あるいは近い値になるとい
う危険性は小さくなる一方で、入力画像中の対象となる
物体の見え方が、テンプレート画像と少し異なっただけ
で、本来の特徴量から大きく離れた特徴量となる危険性
が生じる。逆もまた同様であり、テンプレート画像の画
素数が小さいと、対象としていない物体の特徴量が対象
としている物体の特徴量と同じになる、あるいは近い値
になり、その代わりに対象となる物体の見え方が少し変
わっても、特徴量があまり変化しない利点がある。In the case of a stored pattern image (hereinafter, referred to as a template image) having a large number of pixels, that is, a template image including finer features, the feature amount of an object which is not an object in the input image is set as an object. While the risk of becoming the same as or close to the feature value of the existing object is reduced, the appearance of the target object in the input image is slightly different from the template image, There is a risk that the feature value will be far away from the feature value. The converse is also true: if the number of pixels of the template image is small, the feature amount of the non-target object becomes the same as or close to the feature amount of the target object, and instead, the feature amount of the target object is changed. There is an advantage that the feature amount does not change much even if the appearance changes slightly.

【００１４】したがって、例えば、人物顔画像を含んだ
入力画像中より、テンプレートマッチングを用いて右目
の位置を検出するという問題を考えた場合、ある画像で
正しく目の位置を検出できるようなテンプレート画像を
用意できたとしても、別の画像では誤った位置、例え
ば、背景の中で濃淡パターンが目のそれと類似している
ような場所の位置を検出してしまうことがある。Therefore, for example, in consideration of the problem of detecting the position of the right eye using template matching from an input image including a human face image, a template image that can correctly detect the position of the eye in a certain image is considered. May be prepared, an erroneous position may be detected in another image, for example, a position in the background where the grayscale pattern is similar to that of the eyes.

【００１５】従来のように投影法を用いて特徴量を抽
出する手法も、処理が単純であり実現が容易であるとい
う特徴を持つが、これは入力画像の輝度値や微分値を水
平あるいは垂直方向に合計してヒストグラムを作成し、
当該ヒストグラムを調べることで特徴量を抽出するもの
であり、ヒストグラムを作成する際に、その合計する方
向に、対象となる物体以外のものが存在すると、ヒスト
グラムはそれによって変化し、正しい特徴量を抽出する
のは困難である。The conventional method of extracting a feature using a projection method also has a feature that processing is simple and easy to realize. However, this method is used to extract the luminance value and differential value of an input image horizontally or vertically. Sum up the directions to create a histogram,
The feature amount is extracted by examining the histogram, and when a histogram is created, if there is something other than the target object in the direction of summing, the histogram changes accordingly, and the correct feature amount is determined. It is difficult to extract.

【００１６】従来のように固有空間法を用いて特徴量
を抽出する手法は、テンプレートマッチングを用いる手
法とほぼ同じ制約をもつ。しかし、固有空間法は、記憶
するパターンを単一の画像とするのではなく、複数の画
像の確率分布として表現したパターン（以下、辞書画像
と記す）を用いるため、対象となる物体の変形に対して
より頑健であるという特徴をもつ。ただし、辞書画像を
作成するには、膨大な作業量が必要である。A method of extracting a feature using the eigenspace method as in the related art has almost the same restrictions as a method using template matching. However, the eigenspace method uses a pattern (hereinafter, referred to as a dictionary image) expressed as a probability distribution of a plurality of images instead of a single image as a stored pattern. It has the feature that it is more robust. However, creating a dictionary image requires an enormous amount of work.

【００１７】従来のように色情報を用いて特徴量を抽
出する手法は、対象となる物体を構成する画素の色値と
同様の色値をもつ画素が、他にも存在していた場合、誤
った特徴量を抽出してしまう。A method of extracting a feature amount using color information as in the prior art is based on the following method. An incorrect feature value is extracted.

【００１８】すなわち、上記のいずれの手法も、入力画
像全体を対象として処理を行うと、誤った特徴量を抽出
してしまうことがある。これを回避するために、通常は
画像の取得を制御された環境下で行う、すなわち、例え
ば人物を撮影する場合、人物の背面に均一色のカーテン
等を設置し背景に余計なものが写らないようにする、あ
るいは、撮像系と被写体の距離を常に一定に保つ、ある
いは、照明を一定方向から一定光量照射するようにす
る、あるいは、必ず正面を向き顔が回転しないようにす
る、等である。しかしながら、以上のような制限を設け
ることは、これらの手法を採用した装置の利便性を大き
く損なうものである。That is, in any of the above methods, when processing is performed on the entire input image, an erroneous feature amount may be extracted. In order to avoid this, the acquisition of the image is usually performed in a controlled environment, that is, for example, when photographing a person, a uniform color curtain or the like is installed on the back of the person and no extraneous matter is reflected in the background Or always keep the distance between the imaging system and the subject constant, or irradiate a certain amount of light from a certain direction, or make sure that the face faces forward and the face does not rotate. . However, providing the above-mentioned restrictions greatly impairs the convenience of an apparatus employing these methods.

【００１９】そこで、通常は処理をするための範囲を何
らかの方法を用いて限定することで、上記の制限を緩和
するという下記の方法が用いられる。Therefore, usually, the following method is used in which the range for processing is limited by some method to relax the above-mentioned limitation.

【００２０】位置指定手段を用いて、上記処理をするた
めの範囲を限定するには、通常、次のような方法がとら
れる。人物顔が含まれている画像中より、当該人物の右
目、左目、口の位置を特徴量として抽出する場合、ま
ず、右目に対して処理するための探索範囲を設定する必
要があり、通常この探索範囲は矩形で表され、矩形を表
現するためには少なくとも２点を指定する必要がある。
同様に、左目、口に対しても、それぞれ少なくとも２点
を指定する必要があり、合計で少なくとも６点を指定す
る必要がある。結果的には多くの点を指示しなければな
らないといった問題が発生する。（２）［請求項３，５，８の発明］従来、認識対象の領域を得るために２値化を行なうため
の方法として、輝度の２値化、微分した後に２値化、色
情報に基づく２値化などが行われており、また各々の２
値化の際に用いられる閾値の決定にあたっても判別分
析、平均値など様々な方法が用いられてきた。しかしど
の方式にも一長一短があり、例えばある画像では方式Ａ
で望む領域が得られるが方式Ｂでは望む領域が得られ
ず、別の画像では方式Ｂで望む領域が得られるが方式Ａ
では望む領域が得られない、ということがあり、方式Ａ
を採用しても方式Ｂを採用しても全ての画像で所望の結
果を得ることが不可能であった。（３）［請求項４，５，８の発明］従来、目や口等の顔部品の位置、大きさ検出を行なうに
あたり、ユーザーによる指定は何もなしで検出する方法
や、ユーザーは顔の大きさは指定せず概略位置のみを指
定しその情報から検出する方法、が行なわれてきた。し
かし、これらの方法では、顔部品の大きさを予測するこ
とは不可能であり、不必要に大きな範囲で検出を行なう
必要があるため無駄な計算が発生し、また検出結果が大
きく誤ることがある。またユーザーは顔の位置及び概略
位置及びおおよその大きさを指定しその情報から検出す
る方法では顔部品の大きさを予測することは可能であ
る。しかし、顔のおおよその大きさの指定は髪による顔
の隠蔽等の問題からユーザーが正しく顔のおおよその大
きさを指定しないことがあり、従って十分な精度が得ら
れず、そこから予測される顔部品の大きさも十分な精度
が得られない。そのため、やはり不必要に大きな範囲で
検出を行なう必要があるため無駄な計算が発生し、また
検出結果が大きく誤ることがある。また予め顔画像中の
顔や顔部品の位置や大きさが一定の範囲内に納まるよう
撮影するという方法では、不必要に大きな範囲で検出を
行なう必要はなく、検出結果が大きく誤ることも少ない
が、任意の入力画像に対し顔や顔部品の検出を行なうこ
とが不可能であるという問題がある。（４）［請求項６，８の発明］従来、顔部品の位置、大きさ検出を行なうにあたり行な
われてきた入力画像に対しテンプレートマッチングや投
影等の画像処理手法を適用し、顔部品の位置、大きさ検
出する方法では、適用される手法が顔部品の傾きに大き
く影響されるため、入力された画像中の顔は水平である
ことが求められており、画像中の顔が傾いていた場合、
顔部品の位置、大きさ検出の精度が著しく低下するとい
う問題がある。（５）［請求項７の発明］テンプレートマッチングや固有空間法で目の形状を判定
する手法を用いた場合、まず目の位置と大きさが正しく
検出されている必要がある。上記２手法は、位置ずれに
対して非常に脆弱であり、テンプレート画像、あるいは
辞書画像と、それに対応する入力画像中の部分画像とが
わずかにずれているだけで、算出される特徴量は大きく
異なる場合がある。また、形状を判定しようとするカテ
ゴリーごとに対応するテンプレート画像や辞書画像を予
め準備しておかなければならず、対象物体が、予め準備
しておいたカテゴリーに含まれない形状の場合は、正し
い特徴量を算出することができないし、テンプレート画
像や辞書画像を準備するには多くの作業量を要する。（６）［請求項９の発明］テンプレートマッチングや固有空間法で眉毛の位置及び
大きさを検出する手法を用いた場合、まずテンプレート
画像あるいは辞書画像を予め準備しておかなければなら
ないが、一般に、眉毛は個人差により様々な形状をとり
うるため、一つのテンプレート画像あるいは辞書画像で
全ての眉毛の位置及び大きさを検出するのは困難であ
る。そこで、予め設定した各カテゴリーごとにテンプレ
ート画像あるいは辞書画像を作成し、それぞれについて
テンプレートマッチング、固有空間法を適用して最も適
当なテンプレート画像あるいは辞書画像での最も適当な
類似度をもって、眉毛の位置及び大きさを検出しなけれ
ばならない。In order to limit the range for performing the above processing by using the position specifying means, the following method is usually employed. When extracting the positions of the right eye, left eye, and mouth of a person as features from an image including a person face, it is necessary to first set a search range for processing the right eye. The search range is represented by a rectangle, and at least two points must be specified to represent the rectangle.
Similarly, it is necessary to specify at least two points for the left eye and the mouth, respectively, and it is necessary to specify at least six points in total. As a result, there arises a problem that many points must be specified. (2) [Inventions of Claims 3, 5, and 8] Conventionally, as a method of performing binarization to obtain a region to be recognized, binarization of luminance, binarization after differentiation, and color information Binarization, etc. based on
Various methods, such as discriminant analysis and average values, have been used in determining the threshold value used in valuation. However, each system has advantages and disadvantages.
Can obtain the desired area, but the desired area cannot be obtained by the method B, and the desired area can be obtained by the method B in another image, but the method A
May not be able to obtain the desired area.
It is impossible to obtain a desired result with all images, even if the method is adopted or the method B is adopted. (3) [Inventions of Claims 4, 5 and 8] Conventionally, in detecting the position and size of face parts such as eyes and mouth, a method of detecting without specifying anything by the user, A method has been performed in which only the approximate position is specified without specifying the size and the information is detected from the information. However, with these methods, it is impossible to predict the size of the face part, and it is necessary to perform detection in an unnecessarily large range, so that useless calculation occurs and the detection result may be erroneously wrong. is there. In addition, the user can predict the size of the face part by a method in which the user specifies the position, approximate position, and approximate size of the face and detects from the information. However, specifying the approximate size of the face may cause the user to incorrectly specify the approximate size of the face due to problems such as concealment of the face by the hair. Sufficient accuracy cannot be obtained for the size of the face parts. Therefore, since it is necessary to perform detection in an unnecessarily large range, useless calculation occurs and the detection result may be erroneously large. In addition, in a method in which the position and size of the face and face parts in the face image are captured in advance within a certain range, it is not necessary to perform detection in an unnecessarily large range, and the detection result is unlikely to be significantly erroneous. However, there is a problem that it is impossible to detect a face or a face part from an arbitrary input image. (4) [Inventions of Claims 6 and 8] An image processing method such as template matching or projection is applied to an input image that has been conventionally used to detect the position and size of a face part, and the position of the face part is determined. In the method of detecting the size, since the applied method is greatly affected by the inclination of the face part, the face in the input image is required to be horizontal, and the face in the image is inclined. If
There is a problem that the accuracy of detecting the position and size of the face part is significantly reduced. (5) [Invention of Claim 7] When a method of determining the shape of an eye by template matching or the eigenspace method is used, first, the position and size of the eye need to be correctly detected. The above two methods are very vulnerable to misregistration, and the calculated feature amount is large if the template image or dictionary image is slightly displaced from the corresponding partial image in the input image. May be different. In addition, a template image or a dictionary image corresponding to each category whose shape is to be determined must be prepared in advance, and if the target object has a shape that is not included in the prepared category, a correct The feature amount cannot be calculated, and a large amount of work is required to prepare a template image and a dictionary image. (6) [Invention of Claim 9] When a technique of detecting the position and size of eyebrows by template matching or the eigenspace method is used, a template image or a dictionary image must be prepared in advance, but generally, it is generally used. Since eyebrows can take various shapes depending on individual differences, it is difficult to detect the positions and sizes of all eyebrows with one template image or dictionary image. Therefore, a template image or a dictionary image is created for each category set in advance, and the template matching and the eigenspace method are applied to each of the categories to obtain the position of the eyebrows with the most appropriate similarity in the most appropriate template image or dictionary image. And size must be detected.

【００２１】しかしながら、既に述べたように、テンプ
レート画像や辞書画像を準備するには多くの作業量を要
するし、一つのテンプレート画像あるいは辞書画像を用
いた場合でも誤った位置及び大きさを検出する可能性が
あるのに、複数のテンプレート画像あるいは辞書画像を
用いると、さらに誤った位置及び大きさを検出する危険
性は大きくなる。However, as described above, preparing a template image or a dictionary image requires a large amount of work, and an erroneous position and size are detected even when one template image or a dictionary image is used. If there is a possibility that a plurality of template images or dictionary images are used, the risk of detecting an erroneous position and size increases.

【００２２】さらに、テンプレートマッチングや固有空
間法等の手法を用いる場合、テンプレート画像や辞書画
像の大きさと、入力画像中に存在する、対象となる物体
の大きさが同一でなければならないが、それらが既知で
ない場合、複数の大きさのテンプレート画像あるいは辞
書画像を用意しておき、それら全てと入力画像との間で
処理を行うか、もしくは、入力画像を拡大あるいは縮小
し、それら全ての拡縮画像とテンプレート画像あるいは
辞書画像との間で処理を行わなければならない。これに
よる処理量は膨大である。Further, when a technique such as template matching or the eigenspace method is used, the size of the template image or dictionary image must be the same as the size of the target object existing in the input image. If is not known, prepare template images or dictionary images of a plurality of sizes, perform processing between all of them and the input image, or enlarge or reduce the input image, and enlarge or reduce all of them. The processing must be performed between the template image and the dictionary image. This results in a huge amount of processing.

【００２３】また、入力画像を２値化する場合は、その
閾値を適当に設定しなければならないが、画像により照
明条件等が異なる等の原因により、一般に用いられてい
る２値化の手法を用いるだけでは、眉毛と眉毛以外を明
確に分離するための閾値を求めるのは困難である。（７）［請求項１０の発明］テンプレートマッチングや固有空間法で眉毛の形状を判
定する手法を用いた場合、まず眉毛の位置と大きさが正
しく検出されている必要がある。上記２手法は、位置ず
れに対して非常に脆弱であり、テンプレート画像、ある
いは辞書画像と、それに対応する入力画像中の部分画像
とがわずかにずれているだけで、算出される特徴量は大
きく異なる場合がある。また、形状を判定しようとする
カテゴリーごとに対応するテンプレート画像や辞書画像
を予め準備しておかなければならず、対象物体が、予め
準備しておいたカテゴリーに含まれない形状の場合は、
正しい特徴量を算出することができないし、テンプレー
ト画像や辞書画像を準備するには多くの作業量を要す
る。（８）［請求項１１の発明］通常、形状判定を行なう前に顎を含む顔の輪郭線を抽出
しておく必要があるが、一般に照明条件や顔の向き、表
情の変化などにより、鮮明な輪郭線がいつも得られると
は限らない。特にエッジ画像を利用する場合、輪郭がと
ぎれたり、逆に偽輪郭がでるなどの問題がおこりやす
い。このため、一定の照明条件で、ほぼ均一の背景下の
もとで被写体を撮影しておくなど、対象画像の条件がき
びしくなる場合が多い。When the input image is binarized, the threshold value must be appropriately set. However, a binarization method generally used is used due to factors such as different lighting conditions depending on the image. It is difficult to obtain a threshold value for clearly separating the eyebrows from those other than the eyebrows simply by using the eyebrows. (7) [Invention of Claim 10] When a technique of determining the shape of eyebrows by template matching or the eigenspace method is used, first, the position and size of eyebrows need to be correctly detected. The above two methods are very vulnerable to misregistration, and the calculated feature amount is large if the template image or dictionary image is slightly displaced from the corresponding partial image in the input image. May be different. In addition, a template image or a dictionary image corresponding to each category for which a shape is to be determined must be prepared in advance, and when the target object has a shape that is not included in the prepared category,
Correct feature amounts cannot be calculated, and a large amount of work is required to prepare template images and dictionary images. (8) [Invention of Claim 11] Normally, it is necessary to extract the contour of a face including a chin before performing shape determination. However, in general, sharpness is determined by lighting conditions, face orientation, changes in facial expressions, and the like. Contours are not always obtained. In particular, when an edge image is used, problems such as a break in the contour and a false contour appear easily. For this reason, the conditions of the target image often become severe, such as shooting the subject under a substantially uniform background under a constant lighting condition.

【００２４】次いで、得られた輪郭線をテンプレートマ
ッチングにより形状を判定する手法を適用する場合、上
記顔輪郭抽出で正しい輪郭線が抽出出来たとしても、顎
の輪郭線には個人差が大きいため、基準となるテンプレ
ートの数を多くする、すなわち辞書画像が多数必要にな
る。また、辞書画像が少なくし、分類数を減らすことも
可能だが、個人差を加味した共通の基準辞書画像を調整
するのは困難である。（９）［請求項１２，１３の発明］背景色を抽出して画像全体から背景領域を抽出し、髪領
域がこれに入り込まないようにするため、背景色が一様
であるかまたはそれに近い必要があり、通常のスナップ
写真などから似顔絵を作成することは難しかった。（１０）［請求項１４の発明］髪形状をテンプレートマッチングにより分類する方法で
は、いわゆる「七三分け」，「真中分け」などの呼び方
でいう「分け目」を精度よく検出して分類することや、
髪の生え際線の形状の丸みを判定して「四角型」，「丸
型」に分類することなど、きめ細かい形状分類を行うこ
とは難しかった。（１１）［請求項１５の発明］髪形状は抽出された髪領域及び髪特徴のみに基づいて決
定されるので、例えば、白髪であるなど髪領域と肌領域
の区別が難しい場合に、誤って「髪が薄い」などの誤判
断を起こす場合があった。（１２）［請求項１６，１７の発明］部品選択により似顔絵を生成する装置においては、分類
された髪形状のカテゴリ１つに対し１種類の髪部品が対
応するため、前髪と後髪との調和の取れたリアルな髪形
状を生成することは難しかった。髪領域を２値化したも
のと、前髪部分の髪を表現する小部品とを組み合わせる
手法で形成した髪画像は、髪部品全体を予め作成する場
合と比較すると、美しさにおいて劣る場合が多く、ま
た、顔の他の部分との調和をとるのが難しかった。さら
に、髪領域の２値化画像を用いるため、２値化処理が不
完全な部分が少しでも存在すると、それが出力としてそ
のままユーザーに見えて違和感を与えてしまう、という
問題点があった。（１３）［請求項１８，１９の発明］従来の似顔絵などの画像を自動作成する技術では、目／
鼻／口／眉／耳などの顔部品を配置するにあたって、固
定的な位置に顔部品を配置したり、あるいは目／鼻／口
／眉／耳の実画像上での相対的な位置関係を基準に配置
を行っていた。このため、顔幅の広い顔では、目が中心
に寄ってしまったり、逆に顔幅が狭い顔では目や眉が顔
からはみ出したりしていた。またオデコの広い顔でも目
が上の方に来たりして不自然な画像となっていた。これ
は、顔の部品の配置が、顔輪郭の形状によって決定され
るということを無視していたからである。年齢や性別に
よって部品の配置を変更するという方法も考えられる
が、これは顔輪郭の一般的な傾向を反映してはいるが、
顔の幅の違いや、目の位置の違いなどの細かい特性を直
接的には反映していなかった。（１４）［請求項２０の発明］人間の顔は複数の顔部品より構成されているが、１人の
人間においては、顔部品間で極めて強い相関を持ってい
るのが普通である。似顔絵の顔部品としては、１つの髪
型に対して、黒／茶／白などの色が存在しており、人に
よってバリエーションは大きいが、１人の人間において
前髪と後髪が異なる色であることは極めてまれである。
ところが顔部品としては、前髪と後髪が別部品であるた
め、前髪の色を認識または編集操作によって、ある色に
変更した場合、後髪が異なる色のままであると違和感が
生じる。また逆に後髪の色を変更した場合、前髪が異な
る色であると同じく違和感が生じる。同様に前髪がパー
マで、後髪がストレートといった、ありえない組み合わ
せに関しても違和感を生じることがある。（１５）［請求項２１の発明］似顔絵に用いる顔部品での１つの部品データは、その部
品そのものと、その部品が発生する影の部分から構成さ
れており、この両者を一組として顔部品データ記憶手段
に記憶している。従来の技術では、顔部品データ抽出手
段が抽出した部品をそのまま合成していた。このため、
例えば目の部品を描画した上に、前髪の部品を描画する
と、目の上に前髪の影を描画して、目が潰れてしまうこ
とがあった。これは、前髪の影は、前髪の形で形状が決
まるが、影自体は影が投影された先である顔輪郭の一部
であることを無視しているからである。また顔輪郭など
影が投影される面が存在しなくても、影だけが書かれて
しまう欠点もあった。また、上記文献［１１］のよう
に、部品を階層によって分割して記憶し、階層ごとに合
成する手法もあるが、１つのデータを複数の部品に分割
するため、データの管理が複雑になったり、必要以上に
データ容量を要してしまう欠点があった。Next, when a method of determining the shape of the obtained contour line by template matching is applied, even if the correct contour line can be extracted by the face contour extraction, there is a large individual difference in the jaw contour line. The number of reference templates is increased, that is, a large number of dictionary images are required. Although it is possible to reduce the number of dictionary images and reduce the number of classifications, it is difficult to adjust a common reference dictionary image taking individual differences into account. (9) [Inventions of Claims 12 and 13] The background color is uniform or close to it so as to extract the background color and extract the background region from the entire image and prevent the hair region from entering the background region. It was necessary, and it was difficult to create a portrait from ordinary snapshots. (10) [Invention of claim 14] In the method of classifying the hair shape by template matching, it is necessary to accurately detect and classify the "separation" referred to as a so-called "seven-three division" or "center division". And
It has been difficult to perform detailed shape classification, such as determining the roundness of the hairline shape and classifying it into "square" or "round". (11) [Invention of claim 15] Since the hair shape is determined based only on the extracted hair region and hair characteristics, for example, when it is difficult to distinguish the hair region from the skin region such as gray hair, the hair shape is erroneously determined. Misjudgments such as "thin hair" were sometimes caused. (12) [Inventions of Claims 16 and 17] In an apparatus for generating a portrait by selecting a part, one type of hair part corresponds to one of the classified categories of hair shape, so that the front hair and the back hair are separated. It was difficult to generate harmonious and realistic hair shapes. A hair image formed by a method of combining a binarized hair region with a small part representing the hair of the bangs is often inferior in beauty as compared with a case where the entire hair part is created in advance, It was also difficult to harmonize with the rest of the face. Furthermore, since the binarized image of the hair region is used, there is a problem that if there is any part where the binarization processing is incomplete, the part is directly output to the user and gives a sense of incongruity. (13) [Inventions of Claims 18 and 19] In the conventional technology for automatically creating images such as portraits,
When arranging face parts such as nose / mouth / eyebrows / ears, place face parts in fixed positions, or determine the relative positional relationship of eyes / nose / mouth / eyebrows / ears on a real image. The arrangement was based on the reference. For this reason, in a face with a wide face, the eyes are shifted toward the center, and in a face with a narrow face, the eyes and eyebrows protrude from the face. Also, even with a wide face of Odeco, the eyes came to the top and the image was unnatural. This is because the fact that the arrangement of the face parts is determined by the shape of the face outline was ignored. You can also change the placement of parts according to age and gender, although this reflects the general tendency of face contours,
It did not directly reflect fine characteristics such as differences in face width and differences in eye position. (14) [Invention of Claim 20] Although a human face is composed of a plurality of face parts, it is normal for one person to have a very strong correlation between the face parts. As facial parts of portraits, there are colors such as black / brown / white for one hairstyle, and the variation is large depending on the person, but the color of the bangs and back hair of one person is different Is extremely rare.
However, since the forehead and the back hair are separate parts as the face parts, when the color of the forehead is changed to a certain color by the recognition or editing operation, if the back hair remains in a different color, an uncomfortable feeling occurs. Conversely, when the color of the back hair is changed, if the bangs have a different color, the same discomfort occurs. Similarly, an uncomfortable combination, such as a case where the forelock is perm and the back hair is straight, may cause discomfort. (15) [Invention of claim 21] One part data of a face part used for a portrait is composed of the part itself and a shadow part generated by the part. It is stored in data storage means. In the prior art, the components extracted by the face component data extraction means are synthesized as they are. For this reason,
For example, if a part of the bangs is drawn after drawing the parts of the eyes, the shadow of the bangs may be drawn on the eyes and the eyes may be crushed. This is because the shape of the shadow of the bangs is determined by the shape of the bangs, but ignores that the shadow itself is part of the face contour to which the shadow is projected. There is also a drawback that only the shadow is written even if there is no face on which the shadow is projected, such as a face outline. Further, as in the above document [11], there is a method in which parts are divided into layers and stored, and are synthesized for each layer. However, since one piece of data is divided into a plurality of parts, data management becomes complicated. And there is a disadvantage that the data capacity is required more than necessary.

【００２５】本発明は、上記問題点を解決するためにな
されたもので、その目的とするところは、画像中の特定
物体の特徴量を抽出する画像処理を行う場合、処理に時
間がかかったり、誤った特徴量を抽出してしまうことな
く、また処理の範囲を位置指定手段等を用いて直接指定
する場合に多くの位置を指定することなく、画像中の任
意の特徴量を頑健、高精度かつ高速に抽出できるととも
に、簡易に高品質な似顔絵を合成できる画像処理装置を
提供することである。SUMMARY OF THE INVENTION The present invention has been made to solve the above-described problems. It is an object of the present invention to take a long time to perform image processing for extracting a feature amount of a specific object in an image. Any feature amount in an image can be robust and high without extracting erroneous feature amounts, and without specifying many positions when directly specifying the range of processing using position designation means. It is an object of the present invention to provide an image processing apparatus that can accurately and quickly extract and easily synthesize a high-quality portrait.

【００２６】[0026]

【課題を解決するための手段】［請求項１］本発明の請
求項１に係る画像処理装置は、画像を入力する入力手段
と、前記入力した画像を記憶する記憶手段と、任意の演
算を行う演算手段と、当該画像中の任意の位置を指定す
ることのできる位置指定手段と、当該画像中に配置され
た物体の位置及び大きさを認識し、前記物体の位置及び
大きさの関係が一定の拘束条件を満たす場合に、画像中
の一つ以上の当該物体の位置を入力することで、当該画
像中の任意の特徴を抽出する特徴抽出量手段とを備えて
なることを特徴とする。According to a first aspect of the present invention, there is provided an image processing apparatus comprising: an input unit for inputting an image; a storage unit for storing the input image; Calculating means for performing, position specifying means for specifying an arbitrary position in the image, and recognizing the position and size of the object arranged in the image, and determining the relationship between the position and size of the object. When a certain constraint condition is satisfied, a feature extraction amount means for extracting an arbitrary feature in the image by inputting a position of one or more of the objects in the image is provided. .

【００２７】上記構成によれば、画像中に配置された物
体の位置及び大きさの関係が一定の拘束条件を満たす場
合に、画像中の一つ以上の位置を入力することで、当該
入力位置より、対象となる物体の特徴量を抽出するため
の、画像処理を行うための適当な探索範囲を設定するこ
とが可能となる。According to the above arrangement, when the relationship between the position and the size of the object arranged in the image satisfies a certain constraint, one or more positions in the image are input, and the input position As a result, it is possible to set an appropriate search range for performing image processing for extracting a feature amount of a target object.

【００２８】すなわち、従来の手法では探索範囲を設定
するために、例えば３つの物体であれば、それぞれにつ
いて探索範囲を設定するために少なくとも６点を指定し
なければならなかったが、本発明の画像処理装置を用い
ることで、より少ない指定点で適当な探索範囲を設定す
ることが可能となる。［請求項２］本発明の請求項２に係る画像処理装置は、
請求項１記載の画像処理装置において、前記特徴抽出手
段は、入力される画像に顔を含む場合に、顔を構成する
目、鼻、口、眉、耳、輪郭、髪を顔部品とし、前記顔部
品の内の少なくとも１つの該当顔部品の位置、大きさ、
形状を特徴量として抽出することを特徴とする。That is, in the conventional method, in order to set the search range, for example, if there are three objects, at least six points have to be specified to set the search range for each object. By using the image processing device, it is possible to set an appropriate search range with fewer designated points. [Claim 2] An image processing apparatus according to claim 2 of the present invention is
2. The image processing apparatus according to claim 1, wherein, when the input image includes a face, an eye, a nose, a mouth, an eyebrow, an ear, an outline, and a hair constituting the face are face parts. A position, a size, and a position of at least one corresponding face part among the face parts;
It is characterized in that a shape is extracted as a feature amount.

【００２９】上記構成によれば、請求項１の作用に加え
て、目、鼻、口、眉、耳、輪郭、髪等の顔部品の位置、
大きさ、形状等の特徴量を、頑健かつ高精度に抽出する
ことができる。According to the above construction, in addition to the operation of the first aspect, the positions of face parts such as eyes, nose, mouth, eyebrows, ears, contours, hair, etc.
Features such as size and shape can be extracted robustly and with high accuracy.

【００３０】本画像処理装置においては、例えば、入力
画像中に含まれている人物の右目、左目、口の位置を特
徴量として抽出する場合、上記位置指定手段を用いて右
目及び左目の２点を指定する。ここで、右目、左目、口
の位置及び大きさについては、一定の拘束条件にしたが
っている、すなわち、左右の目の大きさはほぼ同一であ
り、それは、両目間の距離に一定の係数を乗じた値から
大きく離れた値ではなく、口は、両目を結ぶ線分の中央
から垂直下方に位置し、その距離と口の大きさは、両目
間の距離に一定の係数を乗じた値から大きく離れた値で
はない、とすることができる。In the present image processing apparatus, for example, when the positions of the right eye, left eye, and mouth of a person included in the input image are extracted as feature amounts, two points of the right eye and the left eye are extracted by using the above-mentioned position specifying means. Is specified. Here, the position and size of the right eye, the left eye, and the mouth follow a certain constraint, that is, the size of the left and right eyes is almost the same, which is obtained by multiplying the distance between the eyes by a certain coefficient. The mouth is located vertically below the center of the line connecting the eyes, and the distance and the size of the mouth are larger than the value obtained by multiplying the distance between the eyes by a certain coefficient. Values that are not far apart.

【００３１】すなわち、従来技術では、特徴抽出の処理
において誤った値を抽出することがないよう、探索範囲
を限定するためのに、位置指定手段にて６点を指定しな
ければならかなったのを、２点指定するだけで同様の効
果が得られるようになることを特徴としている。［請求項３］本発明の請求項３に係る画像処理装置は、
請求項２記載の画像処理装置において、前記特徴量抽出
手段は、入力された画像を複数の方式及び複数の閾値で
２値化し、それらの画像中の領域の位置や大きさや形状
を判定し、最も信頼度の高い画像を選択することで、認
識対象の領域を検出する領域検出手段を備えてなること
を特徴とする。That is, in the prior art, six points have to be designated by the position designation means in order to limit the search range so that an erroneous value is not extracted in the feature extraction processing. The same effect can be obtained only by designating two points. [Claim 3] An image processing apparatus according to claim 3 of the present invention is
3. The image processing apparatus according to claim 2, wherein the feature amount extracting unit binarizes the input image using a plurality of methods and a plurality of thresholds, and determines a position, a size, and a shape of a region in the images, It is characterized by comprising an area detecting means for detecting an area to be recognized by selecting an image having the highest reliability.

【００３２】上記構成では、認識対象の領域を得るため
に２値化を行なうにあたっての閾値の決定に際し、複数
の方式による複数の閾値を用いて２値化を行ない、その
結果得られた複数の領域と、予め概略のわかっている認
識対象の位置、形状、大きさ等と比較し最も認識対象の
位置、形状、大きさに近い領域を認識対象とすること
で、認識対象の領域を頑健かつ高精度に抽出することが
できる。［請求項４，５，８］本発明の請求項４に係る画像処理
装置は、請求項２記載の画像処理装置において、前記特
徴量抽出手段は、前記位置指定手段で指定された２つ以
上の顔部品の位置間の距離関係から顔部品の大きさを予
測することにより、顔部品の位置、大きさの検出を行な
う顔部品認識手段を備えてなることを特徴とする。In the above configuration, when determining a threshold for performing binarization to obtain a region to be recognized, binarization is performed using a plurality of thresholds by a plurality of methods, and a plurality of thresholds obtained as a result are obtained. By comparing the area with the position, shape, size, etc. of the recognition target whose outline is known in advance, and making the area closest to the position, shape, size of the recognition target the recognition target, the recognition target area is robust and It can be extracted with high accuracy. According to a fourth aspect of the present invention, there is provided an image processing apparatus according to the second aspect, wherein the feature amount extracting means is provided with two or more specified by the position specifying means. A face part recognizing means for detecting the position and the size of the face part by estimating the size of the face part from the distance relationship between the positions of the face parts.

【００３３】本発明の請求項５に係る画像処理装置は、
請求項４記載の画像処理装置において、前記顔部品認識
手段は、検出する位置や大きさの対象が顔部品の目であ
ることを特徴とする。An image processing apparatus according to a fifth aspect of the present invention comprises:
5. The image processing apparatus according to claim 4, wherein the face part recognition unit detects the target of the position and the size to be detected is an eye of the face part.

【００３４】本発明の請求項８に係る画像処理装置は、
請求項４記載の画像処理装置において、前記顔部品認識
手段は、検出する位置や大きさが顔部品の口であること
を特徴とする。An image processing apparatus according to claim 8 of the present invention provides:
5. The image processing apparatus according to claim 4, wherein the face part recognizing means detects the position or size of the face part by the mouth of the face part.

【００３５】上記構成により、高精度に顔部品の大きさ
を予測し、画像中の必要十分な範囲内でのみ検出処理を
行ない、少ない計算量で高精度に顔部品の位置、大きさ
検出を行なうことができる。［請求項６］本発明の請求項６に係る画像処理装置は、
請求項４または５記載の画像処理装置において、前記顔
部品認識手段は、検出した両目の位置に関して、左右の
目を結ぶ線が水平になるように、顔画像を回転させる手
段を備えてなることを特徴とする。With the above configuration, the size of the face part is predicted with high accuracy, the detection process is performed only within a necessary and sufficient range in the image, and the position and size of the face part are detected with high accuracy with a small amount of calculation. Can do it. [Claim 6] An image processing apparatus according to claim 6 of the present invention comprises:
6. The image processing apparatus according to claim 4, wherein said face part recognizing means includes means for rotating a face image such that a line connecting left and right eyes is horizontal with respect to the detected positions of both eyes. It is characterized by.

【００３６】上記構成により、顔が傾いた画像を入力と
して与えられても高精度に顔部品の位置、大きさ検出を
行なうことができる。［請求項７］本発明の請求項７に係る画像処理装置は、
請求項４乃至６のいずれか記載の画像処理装置におい
て、前記顔部品認識手段は、検出した目の位置や大きさ
に基づいて探索範囲を設定し、その範囲で目の傾き及び
厚みをあらわす画像特徴を検出し、目の形状を判定する
手段を備えてなることを特徴とする。According to the above configuration, the position and size of a face part can be detected with high accuracy even when an image with a tilted face is given as an input. [Claim 7] An image processing apparatus according to claim 7 of the present invention is
The image processing apparatus according to claim 4, wherein the face part recognizing unit sets a search range based on the detected position and size of the eyes, and displays an inclination and a thickness of the eyes in the range. It is characterized by comprising means for detecting a feature and determining an eye shape.

【００３７】上記構成によれば、設定された探索範囲内
で目の傾き及び目の厚みをあらわす画像特徴を検出し、
目の形状を判定するので、テンプレート画像、あるいは
辞書画像と、それに対応する入力画像中の部分画像とが
ずれていることにより、誤った特徴量が抽出されるとい
う危険を回避することができる。さらに、対象とする目
の形状が、予め準備しておいたカテゴリーに含まれない
形状の場合に、正しい特徴量が算出できないといった危
険を回避することができる。さらに、テンプレート画像
や辞書画像を準備する必要がなり、作業量を大幅に少な
くすることができる。［請求項９］本発明の請求項９に係る画像処理装置は、
請求項４記載の画像処理装置において、前記顔部品認識
手段は、検出する位置や大きさが顔部品の眉であること
を特徴とする。According to the above arrangement, an image feature representing the inclination and the thickness of the eyes is detected within the set search range,
Since the shape of the eyes is determined, it is possible to avoid a risk that an erroneous feature amount is extracted due to a shift between the template image or the dictionary image and the corresponding partial image in the input image. Furthermore, when the target eye shape is a shape that is not included in the category prepared in advance, it is possible to avoid a danger that a correct feature amount cannot be calculated. Furthermore, it is necessary to prepare a template image and a dictionary image, and the amount of work can be significantly reduced. [Claim 9] An image processing apparatus according to claim 9 of the present invention is characterized in that:
5. The image processing apparatus according to claim 4, wherein the position and size of the face part recognition means are eyebrows of the face part.

【００３８】上記構成によれば、２つ以上の顔部品の位
置を位置指定手段により指定し、それら指定位置間の距
離関係から眉毛の大きさを予測することで、特徴量を抽
出する際の処理を行うべき範囲を適当な大きさに制限す
ることができる。According to the above arrangement, the positions of two or more face parts are specified by the position specifying means, and the size of the eyebrows is predicted from the distance relationship between the specified positions, thereby extracting a feature amount. The range in which processing is to be performed can be limited to an appropriate size.

【００３９】その上で、当該処理範囲に対して２値化を
行うが、本画像処理装置は、上記問題を解決するため
に、認識対象の領域を得るために２値化を行なうにあた
り、複数の方式、複数の閾値で２値化し、それらの画像
中の領域の位置、大きさ、形状等を判定し、最も信頼で
きる画像を選択することにより、認識対象を高精度に検
出する方法を備えることを特徴としている。Then, binarization is performed on the processing range. In order to solve the above-described problem, the image processing apparatus performs a plurality of binarizations to obtain a region to be recognized. Method, binarizing with a plurality of thresholds, determining the position, size, shape, and the like of the region in those images, and selecting the most reliable image to detect the recognition target with high accuracy. It is characterized by:

【００４０】この構成によれば、２つ以上の顔部品の位
置を位置指定手段により指定し、それら指定位置間の距
離関係から眉毛の大きさを予測し、特徴量を抽出する際
の処理を行うべき範囲を適当な大きさに制限することに
加え、眉毛の大きさを推定することができる。すなわ
ち、例えば、位置指定手段によって位置指定される顔部
品が右目と左目である場合、眉毛の大きさは、その両目
間の距離に一定の係数を乗じた値から大きく離れた値で
はない、とすることができる。According to this configuration, the position of two or more face parts is designated by the position designation means, the size of the eyebrows is predicted from the distance relationship between the designated positions, and the processing for extracting the characteristic amount is performed. In addition to limiting the range to be performed to an appropriate size, the size of the eyebrows can be estimated. That is, for example, when the face parts whose positions are specified by the position specifying unit are the right eye and the left eye, the size of the eyebrows is not a value greatly separated from a value obtained by multiplying the distance between the eyes by a constant coefficient. can do.

【００４１】したがって、２値化を行った際に分離され
る領域の大きさと、推定される眉毛の大きさを比較し、
それらの大きさがあまり離れていないような２値化の閾
値を求めることで、眉毛をあらわす領域を高精度に検出
することができる。［請求項１０］本発明の請求項１０に係る画像処理装置
は、請求項４または９に記載の画像処理装置において、
前記顔部品認識手段は、検出した眉の位置や大きさに基
づいて探索範囲を設定し、その範囲で眉の太さ及び折れ
曲がり方をあらわす画像特徴を検出し、眉の形状を判定
する手段を備えてなることを特徴とする。Therefore, the size of the region separated when binarization is performed is compared with the estimated size of the eyebrows.
By obtaining a binarization threshold value such that their sizes are not so far apart, an area representing an eyebrow can be detected with high accuracy. [Claim 10] An image processing apparatus according to claim 10 of the present invention is the image processing apparatus according to claim 4 or 9, wherein
The face part recognizing means sets a search range based on the detected position and size of the eyebrows, detects image characteristics representing the thickness and how to bend the eyebrows in the range, and determines the shape of the eyebrows. It is characterized by comprising.

【００４２】上記構成によれば、設定された探索範囲内
で眉毛の太さ及び折れ曲がりかたをあらわす画像特徴を
検出し、眉毛の形状を判定するので、テンプレート画
像、あるいは辞書画像と、それに対応する入力画像中の
部分画像とがずれていることにより、誤った特徴量が抽
出されるという危険を回避することができる。さらに、
対象とする眉毛の形状が、予め準備しておいたカテゴリ
ーに含まれない形状の場合に、正しい特徴量が算出でき
ないといった危険を回避することができる。さらに、テ
ンプレート画像や辞書画像を準備する必要がなり、作業
量を大幅に少なくすることができる。［請求項１１］本発明の請求項１１に係る画像処理装置
は、請求項２記載の画像処理装置において、前記特徴抽
出手段は、前記位置指定手段で指定された一つ以上の位
置情報に基づき、顎の輪郭特徴を検出し、その形状を判
定する輪郭認識手段を備えてなることを特徴とする。つ
まり、位置指定手段から得られた１つ以上の顔を特徴付
ける情報から、顎輪郭の特徴をより顕著に表す特徴画像
を作成し、その画像を画像エネルギーとして利用する動
的輪郭モデルにより輪郭を検出することを特徴とし、ま
た、その検出輪郭線を顔の該値の部分からの距離と方向
からなる距離関数として表現し、その距離関数の特徴を
求めて基準特徴を比較することにより、顎輪郭形状を判
定することを特徴とする。According to the above arrangement, the image feature representing the thickness and the way of bending of the eyebrows is detected within the set search range, and the shape of the eyebrows is determined. It is possible to avoid the danger that an erroneous feature amount is extracted due to a deviation from the partial image in the input image to be performed. further,
In the case where the target eyebrow shape is a shape that is not included in the category prepared in advance, it is possible to avoid a risk that a correct feature amount cannot be calculated. Furthermore, it is necessary to prepare a template image and a dictionary image, and the amount of work can be significantly reduced. [11] The image processing apparatus according to claim 11 of the present invention, in the image processing apparatus according to claim 2, wherein the feature extracting means is based on one or more position information specified by the position specifying means. And a contour recognizing means for detecting a contour feature of a jaw and determining a shape thereof. In other words, a feature image representing the features of the chin contour is created more prominently from information characterizing one or more faces obtained from the position specifying means, and the contour is detected by a dynamic contour model using the image as image energy. And expressing the detected contour as a distance function consisting of a distance and a direction from the value portion of the face, calculating the feature of the distance function, and comparing the reference feature to obtain a jaw contour. The shape is determined.

【００４３】そのため、画像処理装置の操作者は、最初
に位置指定手段により、入力画像中に含まれている人物
の顔の中心を指定する。この顔中心は、直接指定しても
よいし、他の顔特徴の指定、例えば、両目、口の座標か
ら推定してもよい。次に人物の顔を含むような初期輪郭
座標列を求める。次いで、顔中心座標と初期輪郭上の各
座標を結ぶ直線上の隣り合う画素間の色差を算出し、対
象画素間の座標中点を座標値とし、算出した色差を画素
値にもつ画像（以降、色差マップ画像と呼ぶ）を作成す
る。次いで、この色差マップ画像を、画像エネルギーと
する動的輪郭モデルを用いて顎輪郭線を検出する。次い
で、得られた輪郭線を顔内部の該値の座標、例えば、顔
中心からの距離と方向（角度）からなる関数（以降距離
関数と呼ぶ）として表現する。次いで、この距離関数の
特徴を、基準となる輪郭形状の距離関数の特徴と比較
し、最も特徴が近い距離関数をもつ輪郭形状を、入力画
像の顎形状として判定する。Therefore, the operator of the image processing apparatus first specifies the center of the face of the person included in the input image by the position specifying means. The face center may be specified directly, or may be estimated from the specification of other facial features, for example, the coordinates of both eyes and mouth. Next, an initial outline coordinate sequence including the face of the person is obtained. Next, a color difference between adjacent pixels on a straight line connecting the face center coordinates and each coordinate on the initial contour is calculated, a coordinate middle point between the target pixels is set as a coordinate value, and an image having the calculated color difference as a pixel value (hereinafter, referred to as an image) , A color difference map image). Next, a jaw contour line is detected using an active contour model using this color difference map image as image energy. Next, the obtained contour is expressed as a function (hereinafter, referred to as a distance function) composed of the coordinates of the value inside the face, for example, the distance from the face center and the direction (angle). Next, the feature of the distance function is compared with the feature of the distance function of the reference contour shape, and the contour shape having the closest distance function is determined as the jaw shape of the input image.

【００４４】ここで、色差マップ画像作成には、対象画
像人物顔であることを利用して精度を高めてもよい。例
えば、色差を求める際には、肌色とそれ以外の色を区別
して求めてもよい。すなわち、肌色に分類される画素同
士の色差には、色差の検出精度を低くすることにより、
ノイズやしわの影響が色差マップ画像に反映されにくく
することができる。逆に、首と顎の境目は同じ肌色であ
ることが多く、色差が出にくいため、中心から首方向へ
の直線上の色差検出時には、検出精度を上げるようにし
てもよい。尚、首の位置は、例えば口の座標が該値であ
るならば、方向を推定することが出来る。The accuracy of the color difference map image creation may be enhanced by utilizing the fact that the image is a human face of the target image. For example, when obtaining the color difference, the skin color and the other colors may be determined separately. In other words, the color difference between pixels classified as skin color is reduced by lowering the color difference detection accuracy.
The effects of noise and wrinkles can be made less likely to be reflected in the color difference map image. Conversely, the border between the neck and the chin often has the same flesh color, and it is difficult to produce a color difference. Therefore, when detecting a color difference on a straight line from the center to the neck, the detection accuracy may be increased. Note that the direction of the neck position can be estimated if the coordinates of the mouth are the values.

【００４５】また、上記により色差マップ画像を作成し
た後に、例えば顔輪郭として楕円を仮定することによ
り、顔中心座標を中心とする楕円座標上にある画素値
（＝色差）とその両隣の画素値を平均化し、その画素値
とする。あるいは、顔輪郭以外の他の特徴が別途判明し
ている場合、例えば、口の中心座標が該値であるなら
ば、口の中心座標と顔中心を結ぶ直線を対称軸にもつ２
画素の画素値を平均化して、その画素値としてもよい。
これにより、顎形状の特徴を加味したエネルギー画像を
作成することができ、鮮明な輪郭線が現れていない入力
画像やノイズの多い画像に対しても、より安定な顎検出
を行なうことができる。After the color difference map image is created as described above, for example, by assuming an ellipse as the face outline, the pixel value (= color difference) on the ellipse coordinate centered on the face center coordinate and the pixel values on both sides thereof Are averaged to obtain the pixel value. Alternatively, when other features other than the face outline are known separately, for example, if the center coordinates of the mouth are the values, a line having the straight line connecting the center coordinates of the mouth and the center of the face on the symmetry axis is used.
The pixel values of the pixels may be averaged and used as the pixel value.
This makes it possible to create an energy image in which the features of the jaw shape are taken into account, and it is possible to perform more stable jaw detection even on an input image in which a clear outline does not appear or an image with much noise.

【００４６】また、輪郭線から距離関数を作成する際に
も、人物顔輪郭独自の特徴性を利用することにより、ノ
イズや照明による影響を出来るだけ排除し、顎の特徴を
より顕著に表すように距離関数を修正することができ
る。例えば色差マップ作成時と同じように、楕円や対称
性などの顔の形状に基づき平均化等の距離関数の修正を
行なうことができる。Also, when a distance function is created from a contour line, the influence of noise and illumination is eliminated as much as possible by using the unique characteristics of the contour of the human face, so that the jaw features are more remarkably represented. To modify the distance function. For example, a distance function such as averaging can be corrected based on the shape of a face such as an ellipse or symmetry as in the case of creating a color difference map.

【００４７】次に、距離関数の比較は、距離関数の変曲
点の位置、変曲点数、変曲点間の傾きなどその距離関数
のもつ特徴と位置づけ、基準となる輪郭形状の距離関数
の特徴とそれぞれ比較することにより行なう。そして、
最も類似している基準距離関数を有する基準形状を該当
する輪郭形状として判定する。Next, the distance function is compared with features of the distance function such as the position of the inflection point of the distance function, the number of inflection points, and the slope between the inflection points. This is done by comparing each with the feature. And
The reference shape having the most similar reference distance function is determined as the corresponding contour shape.

【００４８】また、平面上の曲線を周波数領域で記述す
る手法、例えば、フーリエ記述子を用いて距離関数を表
現すれば、これにより算出されるフーリエ係数をその距
離関数のもつ特徴として位置付けることができ、基準と
なる輪郭形状の距離関数の係数と比較することにより、
上記と同様に形状判定を行なうことができる。Further, if a distance function is represented by a method of describing a curve on a plane in the frequency domain, for example, by using a Fourier descriptor, the Fourier coefficient calculated thereby can be positioned as a feature of the distance function. By comparing with the coefficient of the distance function of the reference contour shape,
Shape determination can be performed in the same manner as described above.

【００４９】比較対象となる基準距離関数の特徴は、距
離関数を予め正規化して表としてメモリに格納しておい
てもよいし、予め必要となる正規化した変曲点の位置等
の情報だけを格納しておいてもよい。フーリエ記述子を
用いる場合は、必要な次数の係数を格納しておけばよ
い。これらの手法では、テンプレートマッチングに比べ
て、比較対象となる基準形状を辞書画像としてもつ必要
がなく、メモリコストや処理速度の面で有利となる。The feature of the reference distance function to be compared is that the distance function may be normalized beforehand and stored in a memory as a table, or only the information such as the required positions of the inflection points which are required in advance may be used. May be stored. When a Fourier descriptor is used, a coefficient of a required order may be stored. These methods do not need to have a reference shape to be compared as a dictionary image as compared with template matching, which is advantageous in terms of memory cost and processing speed.

【００５０】また、フーリエ記述子を用いる場合、フー
リエ係数の低次の項にはおおまかな曲線形状、高次の項
にはより詳細な曲線形状が反映されていることを利用
し、まず低次の項の比較を行なうことにより、ノイズや
個人差などの影響をなるべく排除した判定結果を得るこ
とが可能である。［請求項１２，１３］本発明の請求項１２に係る画像処
理装置は、請求項２記載の画像処理装置において、前記
特徴抽出手段は、前記位置指定手段で指定された一つ以
上の位置情報に基づき、頭頂高さと髪生え際高さとを推
定し、髪領域を認識する髪認識手段を備えてなることを
特徴とする。When the Fourier descriptor is used, the fact that the lower-order terms of the Fourier coefficients reflect a rough curve shape and the higher-order terms reflect a more detailed curve shape is used. By comparing the terms (1) and (2), it is possible to obtain a determination result in which effects such as noise and individual differences are eliminated as much as possible. According to a twelfth aspect of the present invention, in the image processing apparatus according to the second aspect, the feature extracting means includes at least one position information designated by the position designation means. And a hair recognizing means for estimating the height of the crown and the height of the hairline on the basis of the above, and recognizing the hair region.

【００５１】本発明の請求項１３に係る画像処理装置
は、請求項１２記載の画像処理装置において、前記髪認
識手段は、髪色を抽出する髪色抽出手段を備えてなるこ
とを特徴とする。According to a thirteenth aspect of the present invention, in the image processing apparatus according to the twelfth aspect, the hair recognizing means includes a hair color extracting means for extracting a hair color. .

【００５２】上記構成によれば、画像全体から背景領域
を抽出する必要がないため、背景色が一様またはそれに
近い必要はなく、通常のスナップ写真などからでも髪色
を抽出し、あるいは、似顔絵を作成することができる。［請求項１４］本発明の請求項１４に係る画像処理装置
は、請求項１２記載の画像処理装置において、前記髪認
識手段は、前記位置指定手段で指定された一つ以上の位
置情報に基づき、髪部分の特徴を抽出する髪特徴抽出手
段と、該髪部分の特徴を用いて髪輪郭を抽出する髪輪郭
抽出手段と、該髪輪郭を用いて髪を分類する髪分類手段
と、をさらに備えてなることを特徴とする。According to the above configuration, it is not necessary to extract the background region from the entire image, so that the background color does not need to be uniform or close to it, and the hair color can be extracted even from a normal snapshot, etc. Can be created. [14] The image processing apparatus according to claim 14 of the present invention, in the image processing apparatus according to claim 12, wherein the hair recognizing means is based on one or more position information specified by the position specifying means. A hair feature extracting means for extracting a feature of the hair part, a hair contour extracting means for extracting a hair contour using the feature of the hair part, and a hair classifying means for classifying the hair using the hair contour. It is characterized by comprising.

【００５３】上記構成によれば、テンプレートマッチン
グによるのではなく、髪の輪郭線を抽出するため、いわ
ゆる「七三分け」，「真中分け」などの呼び方でいう
「分け目」を精度よく検出して分類することや、髪生え
際線の形状の丸みを判定して「四角型」，「丸型」に分
類することなど、きめ細かい形状分類を行うことができ
る。［請求項１５］本発明の請求項１５に係る画像処理装置
は、請求項１２記載の画像処理装置において、前記髪認
識手段は、顔輪郭の特徴を抽出する顔輪郭特徴抽出手段
と、髪特徴及び顔輪郭特徴を用いて髪を分類する髪分類
手段と、を備えてなることを特徴とする。According to the above-described configuration, instead of using template matching, the hair contours are extracted, so that the "separation" referred to by the so-called "seven-three division" or "center division" is accurately detected. Fine shape classification can be performed, for example, by determining the roundness of the shape of the hairline, and by classifying it into “square” or “round”. [15] The image processing apparatus according to claim 15 of the present invention, in the image processing apparatus according to claim 12, wherein the hair recognizing means extracts a face contour feature, and a hair feature. And a hair classification means for classifying the hair using the facial contour feature.

【００５４】上記構成によれば、例えば、顔輪郭線の最
上部の高さが頭頂高さと比較してある閾値以上低い場合
は髪が相当量あると判断することにより、白髪であるな
ど髪領域と肌領域の区別が難しい場合にも、「髪が薄
い」などの誤判断をなくす、あるいは減らすことができ
る。［請求項１６］本発明の請求項１６に係る画像処理装置
は、請求項１２記載の画像処理装置において、前記髪認
識手段は、前髪部分の特徴を抽出する前髪特徴抽出手段
と、後髪部分の特徴を抽出する後髪特徴抽出手段を備
え、髪部分を含む画像を入力した際、前記前髪特徴抽出
手段にて抽出された前髪特徴と前記後髪特徴抽出手段に
て抽出された後髪特徴とを用いて前髪部品を決定するこ
とを特徴とする。According to the above arrangement, for example, when the height of the top of the face contour line is lower than the head height by a certain threshold or more, it is determined that there is a considerable amount of hair, and the hair area such as white hair is determined. Even when it is difficult to distinguish between a skin region and a skin region, it is possible to eliminate or reduce erroneous judgments such as "thin hair". [16] The image processing apparatus according to claim 16 of the present invention, in the image processing apparatus according to claim 12, wherein the hair recognizing means is a bangs feature extracting means for extracting features of a bangs part, and a back hair part. And a back hair feature extracted by the bangs feature extraction unit and a back hair feature extracted by the back hair feature extraction unit when an image including a hair portion is input. Are used to determine bangs parts.

【００５５】上記構成によれば、前髪、後髪の両方の特
徴を用いて前髪部品を決定するため、例えば、髪の上部
で左側の方に分け目があれば、前髪部品でも左の方から
流れているような、「左分け」にマッチしたものを選択
することにより、よりリアルな、違和感の少ない似顔絵
を作成することができる。また、予め用意された髪部品
を用いているので、例えば、髪領域を２値化したもの
と、前髪部分の髪を表現する小部品とを組み合わせる手
法のように、髪領域が２値化画像の一部あるいは全部
が、髪画像の一部あるいは全部としてそのまま出力され
る手法と比較して、より美しい似顔絵を作成できる場合
が多く、また、処理が不完全な部分が存在しても、それ
が出力としてそのままユーザーに見えるわけではないの
で、違和感を与えにくい。［請求項１７］本発明の請求項１７に係る画像処理装置
は、請求項１２記載の画像処理装置において、前記髪認
識手段は、前髪部分の特徴を抽出する前髪特徴抽出手段
と、後髪部分の特徴を抽出する後髪特徴抽出手段を備
え、髪部分を含む画像を入力した際、前記前髪特徴抽出
手段にて抽出された前髪特徴と前記後髪特徴抽出手段に
て抽出された後髪特徴とを用いて後髪部品を決定するこ
とを特徴とする。According to the above configuration, the bangs part is determined by using the characteristics of both the bangs and the back hair. For example, if there is a division on the left side in the upper part of the hair, even the bangs part flows from the left side. By selecting an item that matches the "left division" as described above, a more realistic portrait with less discomfort can be created. Further, since a hair part prepared in advance is used, the hair area is converted to a binarized image, for example, by a method of combining a binarized hair area with a small part representing the hair of the bangs. In many cases, a more beautiful portrait can be created as compared with a method in which part or all of the hair image is output as it is as part or all of the hair image. Is not directly visible to the user as output, so it is difficult to give a sense of incongruity. [17] The image processing apparatus according to claim 17 of the present invention, in the image processing apparatus according to claim 12, wherein the hair recognizing means is a bangs feature extracting means for extracting features of a bangs part, and a back hair part. And a back hair feature extracted by the bangs feature extraction unit and a back hair feature extracted by the back hair feature extraction unit when an image including a hair portion is input. Is used to determine the back hair part.

【００５６】上記構成によれば、前髪、後髪の両方の特
徴を用いて後髪部品を決定するため、例えば、額の前髪
部分で左の方に分け目があれば、後髪部品でも、左側の
方に分け目があるような、「左分け」にマッチしたもの
を選択することにより、よりリアルな、または違和感の
少ない似顔絵を作成することができる。また、予め用意
された髪部品を用いているので、例えば、髪領域を２値
化したものと、前髪部分の髪を表現する小部品とを組み
合わせる手法のように、髪領域が２値化画像の一部ある
いは全部が、髪画像の一部あるいは全部としてそのまま
出力される手法と比較して、より美しい似顔絵を作成で
きる場合が多く、また、処理が不完全な部分が存在して
も、それが出力としてそのままユーザーに見えるわけで
はないので、違和感を与えにくい。［請求項１８］本発明の請求項１８に係る画像処理装置
は、画像中の顔部品の大きさや形状の顔部品特徴情報を
得る手段と、その特徴情報に対応する複数の顔部品種類
を持ち、各顔部品種ごとに複数の部品データを記憶して
いる顔部品データ記憶手段と、前記顔部品特徴情報をも
とに前記顔部品データ記憶部から適当な部品データを抽
出する顔部品データ抽出手段と、前記抽出された各顔部
品データを顔部品データ記憶部に記憶してある顔輪郭部
品種類ごとに部品の配置位置を定めることにより、輪郭
に適した位置に他の顔部品を配置する手段を備えてなる
ことを特徴とする。According to the above configuration, the back hair part is determined by using both characteristics of the bangs and the back hair. For example, if the front hair part of the forehead has a division on the left side, the back hair part is also determined on the left side. By selecting an item that matches the "left division" in which there is a division, a more realistic or less unnatural portrait can be created. Further, since a hair part prepared in advance is used, the hair area is converted to a binarized image, for example, by a method of combining a binarized hair area with a small part representing the hair of the bangs. In many cases, a more beautiful portrait can be created as compared with a method in which part or all of the hair image is output as it is as part or all of the hair image. Is not directly visible to the user as output, so it is difficult to give a sense of incongruity. [Claim 18] An image processing apparatus according to claim 18 of the present invention has means for obtaining face part feature information of the size and shape of a face part in an image and a plurality of face part types corresponding to the feature information. A face part data storage unit storing a plurality of part data for each face part type; and a face part data extraction unit for extracting appropriate part data from the face part data storage unit based on the face part characteristic information. Means for arranging another face part at a position suitable for the contour by determining the arrangement position of the part for each face outline part type stored in the face part data storage unit with each of the extracted face part data. It is characterized by comprising means.

【００５７】上記構成により、画像中の顔部品の大き
さ、形状などの、どの顔部品を使用するか決定した後、
顔部品の中の顔輪郭部品ごとに部品配置位置と部品サイ
ズなどの部品配置方法を決定し、各顔部品データを配置
する。部品を顔輪郭に基いて配置することによって、単
に部品が顔からはみ出したりしないだけでなく、その顔
輪郭の形状に最も適した位置と大きさに顔部品を配置す
ることができる。顔輪郭ごとに顔部品配置情報を決定す
ることにより、劇画調とコミック調など似顔絵のコンセ
プトによって顔のバランスがまったく異なる似顔絵につ
いても、顔部品を配置した場合に破綻することなく似顔
絵を生成することが可能である。［請求項１９］本発明の請求項１９に係る画像処理装置
は、請求項１８記載の画像処理装置において、画像中の
顔部品の位置の情報を得る手段と、得られた顔部品の位
置情報に基づき、顔輪郭に対応して決定した他の顔部品
の配置位置を補正し、顔部品の配置位置を定める手段を
備えてなることを特徴とする。With the above configuration, after determining which face part to use, such as the size and shape of the face part in the image,
A part arrangement method such as a part arrangement position and a part size is determined for each face contour part in the face parts, and each piece of face part data is arranged. By arranging the parts based on the face outline, not only the parts do not protrude from the face but also the face parts can be arranged at the position and size most suitable for the shape of the face outline. Determining facial part placement information for each facial contour to generate a facial caricature without disintegration when facial parts are placed, even if the facial balance is completely different due to the concept of portraits such as dramatic and comic styles Is possible. [19] The image processing apparatus according to claim 19 of the present invention, in the image processing apparatus according to claim 18, means for obtaining information of the position of the face part in the image, and the obtained position information of the face part. A means for correcting the arrangement position of another face part determined in accordance with the face outline based on the face contour to determine the arrangement position of the face part.

【００５８】上記構成により、同じ顔輪郭を持つが、微
妙に顔部品の位置が異なる顔のバリエーション全て対応
するため、画像中の顔部品の位置情報に基づき、上記請
求項１８によって決定した他の顔部品の配置位置を、そ
の顔輪郭の許容範囲の中で修正し、より適切な部品の配
置位置を決定する。また、入力画像から抽出した顔部品
の位置と、統計的なその顔部品の顔輪郭における標準位
置とを比較し、その差異と、顔輪郭ごとに決定される修
正許容範囲によって、顔部品の配置を補正する。これに
より、顔輪郭から顔がはみ出すような破綻を避けなが
ら、入力画像の微妙な顔部品の位置の特徴を、似顔絵に
反映させることが可能となる。［請求項２０］本発明の請求項２０に係る画像処理装置
は、請求項１８記載の画像処理装置において、似顔絵合
成時に配置する顔部品データにおいて、関連のある一組
の部品の特定の色に関する部品の変更指定が行われたと
き、関連する他の部品についても自動的に部品を変更す
ることにより、常に矛盾を生じないデータを生成する手
段を備えてなることを特徴とする。According to the above configuration, all variations of the face having the same face contour but having slightly different positions of the face parts are supported. Therefore, another face determined based on the position information of the face parts in the image is used. The arrangement position of the face part is corrected within the allowable range of the face outline, and a more appropriate part arrangement position is determined. Further, the position of the face part extracted from the input image is compared with a statistical standard position in the face contour of the face part, and the difference and the correction allowable range determined for each face contour determine the arrangement of the face part. Is corrected. This makes it possible to reflect the delicate feature of the position of the face part of the input image on the portrait while avoiding a failure such that the face protrudes from the face contour. [Claim 20] The image processing apparatus according to claim 20 of the present invention is the image processing apparatus according to claim 18, wherein the face part data arranged at the time of caricature composition relates to a specific color of a set of related parts. It is characterized in that, when a change of a part is designated, there is provided means for automatically generating a data which does not cause inconsistency by automatically changing the part with respect to other related parts.

【００５９】上記構成により、前髪と後髪、あるいは前
髪とヒゲなど、顔部品で強い相関を持つ部品間で、一方
の部品の色を、例えば黒／茶／白のどれかに決定した場
合、他方の部品の色も自動的に同じ色に変更することに
より、他方の部品の色の変更を陽に指定することなく、
違和感のない似顔絵を自動生成する。同様に、髪がパー
マであるかなど、前髪と後髪などで強い相関のある形状
についても、片方に対しての指示を他方に反映させるこ
とで、違和感のない編集操作が可能となる。［請求項２１］本発明の請求項２１に係る画像処理装置
は、請求項１８記載の画像処理装置において、顔部品デ
ータ記憶部に記憶してある顔部品データの中で、一つの
顔部品データが二つ以上の層構造を持っている場合に、
他の顔部品データと組み合わされたとき、部品中の色情
報に基づいて層の並び順を変更し、適切な順番で部品と
部品を構成する層の配置順序を定める手段を備えてなる
ことを特徴とする。According to the above configuration, when the color of one part is determined to be any of black / brown / white, for example, between parts having a strong correlation between face parts such as bangs and back hair or bangs and mustache. By automatically changing the color of the other part to the same color, without explicitly specifying the color change of the other part,
Automatically generate a portrait with no discomfort. Similarly, for a shape that has a strong correlation between the front hair and the back hair, such as whether the hair is perm, an editing operation without a sense of incongruity can be performed by reflecting an instruction for one on the other. [21] The image processing apparatus according to claim 21 of the present invention, in the image processing apparatus according to claim 18, wherein one of the face part data stored in the face part data storage unit is one of the face part data. Has more than one layer structure,
When combined with other facial part data, means for changing the order of layers based on the color information in the part and determining the arrangement order of the part and the layers constituting the part in an appropriate order is provided. Features.

【００６０】上記構成により、顔部品あるいは、帽子な
どの似顔絵を構成する部品の各画素または領域におい
て、その色情報から、顔輪郭上に投影された影を構成し
ていると判断される画素または領域を抽出し、その画素
または領域を予め描画してから、個々の顔部品を描画す
ることにより、ある部品の上に、他の部品の影が載ると
いうことを防ぐ。さらに影が投影される部品が存在しな
い場合はその画素または領域の描画そのものを停止する
ことにより、顔輪郭が存在しない場所に対する描画は禁
止できる。According to the above configuration, in each pixel or region of a face part or a part forming a portrait such as a hat, a pixel or a pixel determined to constitute a shadow projected on a face contour from its color information. By extracting a region, drawing pixels or regions in advance, and then drawing individual face parts, it is possible to prevent a shadow of another part from being placed on a certain part. Further, when there is no component to which a shadow is projected, by stopping the drawing of the pixel or the region itself, drawing at a place where no face outline exists can be prohibited.

【００６１】[0061]

【発明の実施の形態】以下に、本発明における画像処理
装置の実施形態に関して図面を用いて説明する。〈実施形態１〉［第１の実施例］（請求項１，２の発明）図１は、本実施例の画像処理装置の機能ブロックを示し
た図である。DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS An embodiment of an image processing apparatus according to the present invention will be described below with reference to the drawings. First Embodiment [First Embodiment] (Inventions of Claims 1 and 2) FIG. 1 is a diagram showing functional blocks of an image processing apparatus according to the present embodiment.

【００６２】本実施例の画像処理装置は、外部から入力
された画像、すなわち装置が外部から取得した画像（以
下、入力画像と記す）に含まれる物体の特徴量、例え
ば、右目の位置等を、該入力画像から抽出処理する装置
である。すなわち、図１に示すように、電子画像を取得
する画像取得手段としての入力装置１１と、入力装置１
１によって取得された入力画像等を記憶する記憶装置１
２と、入力画像から特徴量を抽出するための処理を行う
演算装置１３と、入力画像の任意の位置を指定する手段
としての位置指定装置１４と、抽出した特徴量を外部に
出力する出力装置１５が設けられている。尚、上記物体
の特徴量を抽出する特徴量抽出部１０は、記憶装置１２
と演算装置１３とから構成されている。The image processing apparatus according to the present embodiment determines the feature amount of an object included in an image input from the outside, that is, an image acquired from the outside of the apparatus (hereinafter referred to as an input image), for example, the position of the right eye. , A device that performs extraction processing from the input image. That is, as shown in FIG. 1, an input device 11 as an image acquisition unit for acquiring an electronic image, and an input device 1
Storage device 1 for storing the input image and the like acquired by
2, an arithmetic unit 13 for performing a process for extracting a feature from the input image, a position specifying device 14 as a unit for specifying an arbitrary position of the input image, and an output device for outputting the extracted feature to the outside 15 are provided. Note that the feature amount extraction unit 10 for extracting the feature amount of the object includes a storage device 12
And an arithmetic unit 13.

【００６３】本実施例では、上記構成の画像処理装置で
実行される画像処理について説明する。尚、ここでは、
入力画像に顔を含み、抽出される特徴量が、目、鼻、
口、眉、耳、輪郭、髪等の顔部品の位置、大きさ、形状
等である場合について説明する。In this embodiment, image processing executed by the image processing apparatus having the above configuration will be described. Here,
The input image contains a face, and the extracted features are eyes, nose,
The case where the position, size, shape, and the like of face parts such as the mouth, eyebrows, ears, contours, and hair are described.

【００６４】まず、入力装置１１より画像を入力する。
操作者は、ディスプレイ装置などの出力装置１５に表示
される入力画像を見ながら、位置指定装置１４を用い、
入力画像上の任意の位置を指定する。First, an image is input from the input device 11.
The operator uses the position specifying device 14 while viewing the input image displayed on the output device 15 such as a display device,
Specify an arbitrary position on the input image.

【００６５】図２は、位置指定の例及び位置指定によっ
て決定される探索範囲の一例を示した図である。これ
は、入力された人物画像２１の、右目位置２２、左目位
置２３、口位置２４を指定したことを示した画像であ
る。右目位置をＰｒｅ、左目位置をＰｌｅ、口位置をＰ
ｍとすると、右目に関する特徴量を抽出するために画像
処理をする範囲２５（以下、右目の探索範囲と記す）
は、右目位置Ｐｒｅを中心として、幅Ｗｒｅ、高さＨｒ
ｅの矩形であらわされる。FIG. 2 is a diagram showing an example of position designation and an example of a search range determined by the position designation. This is an image showing that the right eye position 22, the left eye position 23, and the mouth position 24 of the input human image 21 are specified. Right eye position Pre, left eye position Ple, mouth position P
Assuming that m, a range 25 in which image processing is performed to extract a feature amount related to the right eye (hereinafter, referred to as a right eye search range)
Is a width Wre and a height Hr around the right eye position Pre.
This is represented by a rectangle e.

【００６６】ここで、幅Ｗｒｅ、高さＨｒｅは次のよう
に決定される。すなわち、人物顔において各顔部品の位
置関係及び大きさは一定の拘束条件にしたがっている。
つまり、両目間の距離と、目の大きさには、一定の比率
が存在すると考えられる。目の大きさを、目に外接する
矩形の幅Ｗｅｙｅ及び高さＨｅｙｅで表現するとし、両
目間の距離をＬとした場合、Ｗｅｙｅ及びＨｅｙｅは、
それぞれ次のようにあらわされる。Here, the width Wre and the height Hre are determined as follows. In other words, the positional relationship and the size of each face part in the human face follow a certain constraint.
That is, it is considered that a certain ratio exists between the distance between the eyes and the size of the eyes. Assume that the size of the eyes is expressed by a width Weye and a height Heye of a rectangle circumscribing the eyes.
Each is represented as follows.

【００６７】Ｗｅｙｅ＝Ｃｅｗ×ＬＨｅｙｅ＝Ｃｅｈ×Ｌここで、Ｃｅｗは両目間の距離Ｌに対する目の幅Ｗｅｙ
ｅの比率の平均値で、Ｃｅｈは両目間の距離Ｌに対する
目の高さＨｅｙｅの比率の平均値である。これらの係数
Ｃｅｗ，Ｃｅｈは、予め、複数の人物顔について、両目
間の距離と目の幅及び高さについて計測し、平均値を求
めておく。個人差により、目の幅及び高さはばらつきを
生じるから、これにさらに適当な係数を乗じることで、
ほとんど全ての人物について、右目がその中に含まれる
ような十分な大きさの探索範囲をあらわす矩形の幅Ｗｒ
ｅ及び高さＨｒｅを以下のように求めることができる。Weye = Cew × L Heye = Ceh × L where Cew is the eye width Wey with respect to the distance L between the eyes.
In the average value of the ratio e, Ceh is the average value of the ratio of the eye height Heye to the distance L between the eyes. These coefficients Cew and Ceh are measured in advance for a plurality of human faces with respect to the distance between the eyes and the width and height of the eyes, and the average value is obtained. Due to individual differences, the width and height of the eyes will vary, so by multiplying this by an appropriate coefficient,
For almost all persons, a rectangular width Wr representing a sufficiently large search range such that the right eye is included therein
e and the height Hre can be determined as follows.

【００６８】Ｗｒｅ＝Ｍｅｗ×ＷｅｙｅＨｒｅ＝Ｍｅｈ×Ｈｅｙｅただし、Ｍｅｗ及びＭｅｈは、予め、複数の人物につい
て、両目間の距離と目の幅及び高さについて計測した分
散値などから求めておく。Wre = Mew × Weye Hre = Meh × Heye Here, Mew and Meh are obtained in advance for a plurality of persons from the variance values measured for the distance between the eyes and the width and height of the eyes.

【００６９】上記のプロセスを左目、口等にも同様に適
用することができる。すなわち、左目の探索範囲、口の
探索範囲を同様に求めることができる。尚、ここでは、
両目間の距離Ｌを基準に探索範囲を求めているが、別の
基準を設けてもよい。例えば、両目を結ぶ線分の中央と
口位置を結ぶ線分の距離、すなわち、目と口の高さ、を
基準として探索範囲を求めてもよい。The above process can be similarly applied to the left eye, mouth, etc. That is, the search range of the left eye and the search range of the mouth can be similarly obtained. Here,
Although the search range is determined based on the distance L between the eyes, another reference may be provided. For example, the search range may be determined based on the distance between the center of the line connecting both eyes and the line connecting the position of the mouth, that is, the height of the eye and the mouth.

【００７０】以上のように、位置指定装置を用いて、位
置及び大きさが一定の拘束条件にしたがう物体の位置を
入力することで、少ない入力位置で、その物体の特徴量
を抽出するための探索範囲を適当に設定することが可能
となる。As described above, by using the position specifying device to input the position of an object whose position and size comply with a fixed constraint, the feature amount of the object can be extracted with a small number of input positions. The search range can be set appropriately.

【００７１】したがって、本実施例の特徴をまとめると
次のようになる。（１）請求項１の画像処理装置は、画像を入力する入力
手段と、前記入力した画像を記憶する記憶手段と、任意
の演算を行う演算手段と、当該画像中の任意の位置を指
定することのできる位置指定手段と、当該画像中に配置
された物体の位置及び大きさを認識し、前記物体の位置
及び大きさの関係が一定の拘束条件を満たす場合に、画
像中の一つ以上の当該物体の位置を入力することで、当
該画像中の任意の特徴を抽出する特徴抽出量手段とを備
えてなることを特徴とする。Therefore, the features of the present embodiment are summarized as follows. (1) An image processing apparatus according to claim 1, wherein input means for inputting an image, storage means for storing the input image, operation means for performing an arbitrary operation, and an arbitrary position in the image are designated. Position specifying means capable of recognizing the position and size of an object placed in the image, and when the relationship between the position and size of the object satisfies a certain constraint, one or more of And a feature extraction amount means for extracting an arbitrary feature in the image by inputting the position of the object.

【００７２】上記（１）の構成によれば、画像中に配置
された物体の位置及び大きさの関係が一定の拘束条件を
満たす場合に、画像中の一つ以上の位置を入力すること
で、当該入力位置より、対象となる物体の特徴量を抽出
するための、画像処理を行うための適当な探索範囲を設
定することが可能となる。According to the above configuration (1), when the relationship between the position and the size of the object arranged in the image satisfies a certain constraint, one or more positions in the image can be input. It is possible to set an appropriate search range for performing image processing for extracting a feature amount of a target object from the input position.

【００７３】すなわち、従来の手法では探索範囲を設定
するために、例えば３つの物体であれば、それぞれにつ
いて探索範囲を設定するために少なくとも６点を指定し
なければならなかったが、本発明の画像処理装置を用い
ることで、より少ない指定点で適当な探索範囲を設定す
ることが可能となる。（２）請求項２の画像処理装置は、特徴抽出手段が入力
される画像に顔を含む場合に、顔を構成する目、鼻、
口、眉、耳、輪郭、髪を顔部品とし、前記顔部品の内の
少なくとも１つの該当顔部品の位置、大きさ、形状を特
徴量として抽出することを特徴とする。That is, in the conventional method, in order to set a search range, for example, in the case of three objects, at least six points had to be specified in order to set a search range for each of the three objects. By using the image processing device, it is possible to set an appropriate search range with fewer designated points. (2) In the image processing apparatus according to the second aspect, when the feature extraction unit includes a face in the input image, eyes, nose,
The mouth, eyebrows, ears, contour, and hair are face parts, and the position, size, and shape of at least one of the face parts are extracted as feature amounts.

【００７４】上記（２）の構成によれば、上記（１）の
作用に加えて、目、鼻、口、眉、耳、輪郭、髪等の顔部
品の位置、大きさ、形状等の特徴量を、頑健かつ高精度
に抽出することができる。例えば、入力画像中に含まれ
ている人物の右目、左目、口の位置を特徴量として抽出
する場合、上記位置指定手段を用いて右目及び左目の２
点を指定する。ここで、右目、左目、口の位置及び大き
さについては、一定の拘束条件にしたがっている、すな
わち、左右の目の大きさはほぼ同一であり、それは、両
目間の距離に一定の係数を乗じた値から大きく離れた値
ではなく、口は、両目を結ぶ線分の中央から垂直下方に
位置し、その距離と口の大きさは、両目間の距離に一定
の係数を乗じた値から大きく離れた値ではない、とする
ことができる。According to the above configuration (2), in addition to the operation of the above (1), features such as the position, size and shape of facial parts such as eyes, nose, mouth, eyebrows, ears, contours, hair, etc. The quantity can be extracted robustly and with high precision. For example, when the positions of the right eye, left eye, and mouth of a person included in the input image are extracted as the feature amounts, two positions of the right eye and the left eye are extracted by using the position designating means.
Specify a point. Here, the position and size of the right eye, the left eye, and the mouth follow a certain constraint, that is, the size of the left and right eyes is almost the same, which is obtained by multiplying the distance between the eyes by a certain coefficient. The mouth is located vertically below the center of the line connecting the eyes, and the distance and the size of the mouth are larger than the value obtained by multiplying the distance between the eyes by a certain coefficient. Values that are not far apart.

【００７５】すなわち、従来技術では、特徴抽出の処理
において誤った値を抽出することがないよう、探索範囲
を限定するためのに、位置指定手段にて６点を指定しな
ければならかなったのを、２点指定するだけで同様の効
果が得られるようになることを特徴としている。［第２の実施例］（請求項３，４，５の発明）本実施例では、図１７乃至図１８を用いて説明する。図
１７は目の検出動作をあらわすフローチャートであり、
図１８は目の検出処理（検出、変換、抽出等の結果）に
使用される各種の画像の例である。That is, in the prior art, six points must be designated by the position designation means in order to limit the search range so that an erroneous value is not extracted in the feature extraction processing. The same effect can be obtained only by designating two points. [Second Embodiment] (Inventions of Claims 3, 4, and 5) This embodiment will be described with reference to FIGS. FIG. 17 is a flowchart showing the eye detection operation.
FIG. 18 is an example of various images used for eye detection processing (results of detection, conversion, extraction, and the like).

【００７６】まず、ユーザーが画像中の両目及び口のお
およその位置をペン、マウス等のポインティングデバイ
ス（図１の位置指定装置）により指定する。次に、ユー
ザーにより指定された個所付近の画像を切り出す。First, the user specifies the approximate positions of both eyes and the mouth in the image with a pointing device (position specifying device in FIG. 1) such as a pen or a mouse. Next, an image near a location specified by the user is cut out.

【００７７】切り出す範囲の算出方法は以下の通りであ
る。The calculation method of the range to be cut out is as follows.

【００７８】まずユーザーが指定した両目のおおよその
位置間の距離ｅｌを算出する。その求めた距離ｅｌに対
し、予め定めた定数ＥＷ，ＥＨを乗算し、切り出す領域
の幅ｅｗと高さｅｈを決定する（１００）。次にユーザ
ーが指定した両目のおおよその位置それぞれを中心と
し、求めた幅と高さで画像を切り出す（１０１）。尚、
この切り出した画像をこれ以降、目周辺画像と呼ぶ。First, the distance el between the approximate positions of both eyes specified by the user is calculated. The obtained distance el is multiplied by predetermined constants EW and EH to determine the width ew and height eh of the region to be cut out (100). Next, an image is cut out at the determined width and height with the approximate positions of both eyes specified by the user as the centers (101). still,
This cut-out image is hereinafter referred to as an eye periphery image.

【００７９】続いて目周辺画像内の目を認識対象としそ
の領域を得る方法、すなわち入力画像を２値化すること
により認識対象の領域を得る方法について説明する。Next, a method of obtaining an area of an eye in an image around the eye as a recognition target, that is, a method of obtaining a recognition target area by binarizing an input image will be described.

【００８０】まず目周辺画像を図１８に示すような輝度
画像（１１１）に変換する（１０２）。続いて輝度画像
に対し判別分析法を適用し、閾値ｔｈ１を決定する。次
に閾値ｔｈ１に対しそれぞれ予め定めた値を加減算し、
下記の式に基づいて閾値ｔｈ２〜ｔｈ５を決定する（１
０３）。First, the image around the eyes is converted into a luminance image (111) as shown in FIG. 18 (102). Subsequently, the threshold value th1 is determined by applying the discriminant analysis method to the luminance image. Next, a predetermined value is added to or subtracted from the threshold value th1,
The thresholds th2 to th5 are determined based on the following equation (1
03).

【００８１】ｔｈ２＝ｔｈ１−２０ｔｈ３＝ｔｈ１−１０ｔｈ４＝ｔｈ１＋１０ｔｈ５＝ｔｈ１＋２０本実施例においては上記の方法で閾値の決定を行ってい
るが、他の閾値の決定方法を使用もしくは併用すること
は容易である。次に上記によって決定した閾値ｔｈ１〜
ｔｈ５を使用し、２値化画像Ｉｍｇ１〜Ｉｍｇ５を得る
（１０４）。図１８中の１４２は上記方法によって得ら
れた２値化画像の例である。この例では認識対象である
目の部分が黒いが左上にも黒い箇所があり、認識対象の
領域だけを分離できていないことがわかる。無論上記の
方式で認識対象の領域を正しく分離できることもある。Th2 = th1-20 th3 = th1-10 th4 = th1 + 10 th5 = th1 + 20 In this embodiment, the threshold value is determined by the above method. However, it is not possible to use or use another threshold value determination method. Easy. Next, the threshold values th1 to
Using th5, binarized images Img1 to Img5 are obtained (104). Reference numeral 142 in FIG. 18 is an example of a binarized image obtained by the above method. In this example, the eye portion to be recognized is black, but there is also a black portion at the upper left, which indicates that only the region to be recognized cannot be separated. Of course, there are cases where the recognition target area can be correctly separated by the above method.

【００８２】次に、入力画像を微分画像に変換する（１
０５）。微分画像を生成する方法に関してはＳｏｂｅｌ
オペレータを使用する方法などの様々な方法が知られて
おり、そのいずれかを使用してもよい（尚、これらの方
法は、当該技術分野に従事する技術者にとっては容易に
実現できる手法であるためここでの詳細な説明は行なわ
ない）。続いて微分画像に対し輝度画像に対して行った
ものと同じ方法で閾値ｔｈ６〜ｔｈ１０を決定し（１０
６）、２値化を行い２値化画像Ｉｍｇ６〜Ｉｍｇ１０を
得る（１０７）。Next, the input image is converted into a differential image (1
05). For how to generate differential images, see Sobel
Various methods are known, such as a method using an operator, and any of them may be used. (These methods are methods that can be easily realized by a person skilled in the art.) Therefore, a detailed description will not be given here.) Subsequently, thresholds th6 to th10 are determined in the same manner as that performed for the luminance image for the differential image (10
6) Binarization is performed to obtain binarized images Img6 to Img10 (107).

【００８３】図１８中の１１３は上記方法によって得ら
れた２値化画像の例である。この例では認識対象である
目の部分が白くなっており、他に小さな白い部分がある
が、ほぼ認識対象の領域の分離ができている。無論、上
記の方式で認識対象の領域を正しく分離できないことも
ある。また、上記の方法の他にも、入力画像を２値化す
る方法には色情報を利用したもの等の様々な方法があ
り、本発明にそれを適用することは容易である。また本
実施例では各方式における閾値のバリエーションとそれ
により生成される２値化画像の数を“５”としている
が、これは任意の数でよく、また方式によって異なる数
とすることや入力画像により変化するよう実施すること
も容易である。FIG. 18 shows an example of a binarized image 113 obtained by the above method. In this example, the eye portion to be recognized is white, and there is another small white portion. However, the region to be recognized is almost separated. Of course, there are cases where the recognition target area cannot be correctly separated by the above method. In addition to the above method, there are various methods for binarizing an input image, such as a method using color information, and it is easy to apply the method to the present invention. In this embodiment, the variation of the threshold value in each system and the number of binarized images generated thereby are set to "5". However, this may be an arbitrary number. It is also easy to carry out the change.

【００８４】続いて、各２値化画像における認識対象と
思わしき領域を得る。本実施例においては認識対象は顔
画像中の目領域の検出であるため、輝度画像中の暗い領
域もしくは微分値の高い領域である。従って２値化画像
Ｉｍｇ１〜Ｉｍｇ５においては画素値が０すなわち黒い
画素が、２値化画像Ｉｍｇ６〜Ｉｍｇ１０においては画
素値が１すなわち白い画素が認識対象と思わしき画素で
ある。上記認識対象と思わしき画素で、各２値化画像Ｉ
ｍｇ１〜Ｉｍｇ１０の中で最大の面積を持つ連続領域の
みを取り出す処理を行い、残った領域をそれぞれ認識対
象候補Ａｒｅａ１〜Ａｒｅａ１０とする（１０８）。Subsequently, a region which is considered to be a recognition target in each binarized image is obtained. In the present embodiment, since the recognition target is the detection of the eye region in the face image, it is a dark region or a region with a high differential value in the luminance image. Therefore, in the binarized images Img1 to Img5, the pixel value is 0, that is, a black pixel, and in the binarized images Img6 to Img10, the pixel value is 1, that is, a white pixel is a pixel considered to be a recognition target. Each of the binarized images I
A process of extracting only a continuous region having the largest area from mg1 to Img10 is performed, and the remaining regions are set as recognition target candidates Area1 to Area10, respectively (108).

【００８５】図１８中の１１４は同図中の１１３に対し
最大面積をもつ連続領域のみを取り出す処理をした結果
であり、１１３中の１１５は最大面積をもつ連続領域で
あり、認識対象候補の例である。１１４では１１３にあ
った目領域ではない白い小さな領域がなくなっており、
認識対象領域を正しく分離できている。ここでは認識対
象に関して予め分かっている情報を使用し、認識対象候
補Ａｒｅａ１〜Ａｒｅａ１０の中から最もよく認識対象
をあらわしていると思われる領域を選択する。In FIG. 18, reference numeral 114 denotes a result obtained by extracting only the continuous area having the maximum area from 113 in FIG. 18. Reference numeral 115 in 113 denotes a continuous area having the maximum area. It is an example. In 114, a small white area that was not the eye area that was in 113 has disappeared,
The recognition target area has been correctly separated. Here, information known in advance regarding the recognition target is used, and an area that seems to represent the recognition target best is selected from the recognition target candidates Area1 to Area10.

【００８６】顔における両目間の距離と目の大きさの統
計的性質を利用し、ユーザーにより指定された両目のお
およその位置から、予想される目の大きさを算出する。
具体的にはユーザーにより指定された両目のおおよその
位置の間の距離を算出し、予め求めてある両目間の距離
と目の大きさの比率の平均値を乗じ、それを予想される
目の大きさとする（１０９）。Using the statistical properties of the distance between the eyes and the size of the eyes in the face, the expected eye size is calculated from the approximate position of the eyes specified by the user.
Specifically, the distance between the approximate positions of the eyes specified by the user is calculated, and the distance between the eyes is multiplied by the average value of the ratio of the eye size, which is obtained in advance. The size is determined (109).

【００８７】次に認識対象候補Ａｒｅａ１〜Ａｒｅａ１
０の大きさを求める。本実施例においては外接矩形の大
きさを認識対象候補Ａｒｅａ１〜Ａｒｅａ１０の大きさ
とする。図１８中の１４４，１４５は認識対象候補の例
１１５の大きさである。次に各認識対象候補Ａｒｅａ１
〜１０の大きさと予想される目の大きさを比較し、最も
近いものを求め、この最も近い大きさを持つ認識対象候
補を認識対象の検出結果とする（１１０）。認識対象候
補の例１１５は認識対象を正しく分離しており、認識対
象の大きさに等しい大きさを持っている。従って認識対
象項の例１１５の大きさ１４４，１４５は予測された目
の大きさに近い値である。一方認識対象候補の例１４３
は認識対象を正しく分離していないため、その大きさは
予測された目の大きさに近い値ではない。よって、この
例では正しく認識対象を分離している認識対象候補の例
１１５が選択され、安定かつ高精度に目の認識が行なわ
れる。Next, recognition target candidates Area 1 to Area 1
Find the magnitude of zero. In this embodiment, the size of the circumscribed rectangle is set as the size of the recognition target candidates Area1 to Area10. Reference numerals 144 and 145 in FIG. 18 indicate the sizes of examples 115 of recognition target candidates. Next, each recognition target candidate Area1
The size of the eye is compared with the expected eye size to find the closest one, and the recognition target candidate having the closest size is used as the recognition target detection result (110). In the example 115 of the recognition target candidate, the recognition target is correctly separated, and has a size equal to the size of the recognition target. Therefore, the sizes 144 and 145 of the example 115 of the recognition target item are values close to the predicted eye size. On the other hand, examples of recognition target candidates 143
Does not correctly separate the recognition target, so its size is not close to the predicted eye size. Therefore, in this example, the example 115 of the recognition target candidate that correctly separates the recognition target is selected, and the eyes are recognized stably and with high accuracy.

【００８８】したがって、本実施例の特徴をまとめると
次のようになる。（３）請求項３の画像処理装置は、特徴量抽出手段が入
力された画像を複数の方式及び複数の閾値で２値化し、
それらの画像中の領域の位置や大きさや形状を判定し、
最も信頼度の高い画像を選択することで、認識対象の領
域を検出する領域検出手段を備えてなることを特徴とす
る。尚、領域検出手段は図示していないが、演算装置１
３の中もしくは特徴量抽出部１０の中でその機能が実現
されている。Therefore, the characteristics of this embodiment are summarized as follows. (3) In the image processing apparatus of the third aspect, the feature amount extraction unit binarizes the input image with a plurality of methods and a plurality of thresholds,
Determine the position, size and shape of the area in those images,
It is characterized by comprising an area detecting means for detecting an area to be recognized by selecting an image having the highest reliability. Although the area detecting means is not shown, the arithmetic unit 1
3 or the feature extraction unit 10 realizes the function.

【００８９】上記（３）の構成によれば、認識対象の領
域を得るために２値化を行なうにあたっての閾値の決定
に際し、複数の方式による複数の閾値を用いて２値化を
行ない、その結果得られた複数の領域と、予め概略のわ
かっている認識対象の位置、形状、大きさ等と比較し最
も認識対象の位置、形状、大きさに近い領域を認識対象
とすることで、認識対象の領域を頑健かつ高精度に抽出
することができる。（４）請求項４の画像処理装置は、特徴量抽出手段が位
置指定手段で指定された２つ以上の顔部品の位置間の距
離関係から顔部品の大きさを予測することにより、顔部
品の位置、大きさの検出を行なう顔部品認識手段を備え
てなることを特徴とする。尚、顔部品認識手段は図示し
ていないが、演算装置１３の中もしくは特徴量抽出部１
０の中でその機能が実現されている。（５）請求項５の画像処理装置は、顔部品認識手段が検
出する位置や大きさの対象が顔部品の目であることを特
徴とする。According to the above configuration (3), when determining a threshold value for performing binarization in order to obtain an area to be recognized, binarization is performed using a plurality of threshold values by a plurality of methods. By comparing the obtained multiple regions with the position, shape, size, etc. of the recognition target whose outline is known in advance, the region closest to the position, shape, and size of the recognition target is set as the recognition target. The target region can be extracted robustly and with high accuracy. (4) The image processing apparatus according to (4), wherein the feature amount extracting means predicts the size of the face part from the distance relationship between the positions of the two or more face parts specified by the position specifying means, thereby obtaining the face part. And a face part recognizing means for detecting the position and size of the face part. Although the face part recognizing means is not shown, it is included in the arithmetic unit 13 or the feature amount extracting unit 1.
0 implements that function. (5) An image processing apparatus according to a fifth aspect is characterized in that the target of the position and the size detected by the face part recognizing means are eyes of the face part.

【００９０】上記（４），（５）の構成によれば、高精
度に顔部品の大きさを予測し、画像中の必要十分な範囲
内でのみ検出処理を行ない、少ない計算量で高精度に顔
部品の位置、大きさ検出を行なうことができる。これが
目に対して効果的に適用できる。［第３の実施例］（請求項６の発明）本実施例では、図１９乃至図２０を用いて説明する。図
１９は目の検出結果を利用して顔の傾きを補正のフロー
チャートであり、図２０は目の検出結果による顔の傾き
を補正した画面の例である。According to the above configurations (4) and (5), the size of the face part is predicted with high accuracy, the detection process is performed only within a necessary and sufficient range in the image, and high accuracy is achieved with a small amount of calculation. Then, the position and size of the face part can be detected. This can be effectively applied to the eyes. [Third Embodiment] (Invention of Claim 6) This embodiment will be described with reference to FIGS. FIG. 19 is a flowchart of correcting the face inclination using the eye detection result, and FIG. 20 is an example of a screen in which the face inclination is corrected based on the eye detection result.

【００９１】まず両目の検出が行われる（１１６）。次
いで検出された両目の各々の中心を次式を用いて算出す
る。両目の中心は、検出された両目の外接矩形の上下端
の位置座標ｅｐｕ，ｅｐｄの平均を上下方向の中心位置
ｅｐｙ、左右端の位置座標ｅｐｌ，ｅｐｒの平均を左右
方向の中心位置ｅｐｘとする（１１７）。First, both eyes are detected (116). Next, the center of each of the detected eyes is calculated using the following equation. For the center of both eyes, the average of the position coordinates epu, epd of the upper and lower ends of the detected circumscribed rectangle of both eyes is the vertical center position epy, and the average of the left and right end position coordinates epl, epr is the horizontal center position epx. (117).

【００９２】ｅｐｙ＝（ｅｐｕ＋ｅｐｄ）／２ｅｐｘ＝（ｅｐｌ＋ｅｐｒ）／２図２０中の１２１，１２２は顔が傾いている顔画像の例
であり、１２４は傾いている顔画像において上記手法に
て目を検出し、両目の中心位置を算出し、両目の中心位
置同士を結んだ線である。続いて両目の各々の中心同士
を結ぶ線が水平となるよう画像を回転する。具体的に
は、まず両目の各々の中心を結ぶ線の角度を算出するた
めに、両目の中心位置同士を結んだ線のベクトルを求め
る（１１８）。このベクトルの大きさを（ｘ，ｙ）とす
ると、ベクトルの角度は次式により求められる（１１
９）。Epy = (epu + epd) / 2 epx = (epl + epr) / 2 Reference numerals 121 and 122 in FIG. 20 denote examples of a face image with a tilted face, and reference numeral 124 denotes an eye image of the face image with a tilt in the above method. Is detected, the center position of both eyes is calculated, and the center position of both eyes is connected. Subsequently, the image is rotated so that the line connecting the centers of both eyes is horizontal. Specifically, in order to calculate the angle of a line connecting the centers of both eyes, a vector of a line connecting the center positions of both eyes is obtained (118). Assuming that the magnitude of this vector is (x, y), the angle of the vector is obtained by the following equation (11)
9).

【００９３】ｋ＝ａｔａｎ（ｙ／ｘ）；このｋが両目の中心位置同士を結んだ線の傾きであり、
すなわち画像中の顔全体の傾きでもある。次に画像を−
ｋ度回転する（１２０）。画像を回転する方法は、当分
野の技術者は容易にこれを実現できると考えられるの
で、特に具体的な説明はしない。この回転により両目の
各々の中心を結ぶ線は傾きが補正され水平となり、画像
中の顔及び顔部品も傾きが補正される。図２０中の１２
３は上記回転後の画像の例であり、１４８は回転された
画像中における両目の中心位置同士を結んだ線である。
これによれば、両目の中心位置同士を結んだ線は水平と
なり、画像中の顔も傾きが補正され正しく直立している
ことがわかる。上記手法により傾きが補正されるため、
傾いた顔が入力された場合においても以降の処理では顔
の傾きが補正された画像に対し認識等の処理を行なえば
よいので、従来の方法より安定かつ高精度な認識が行な
える。K = atan (y / x); k is the inclination of a line connecting the center positions of both eyes,
That is, it is also the inclination of the entire face in the image. Next,
Rotate k degrees (120). The method of rotating the image is not specifically described since it is considered that a person skilled in the art can easily realize this. By this rotation, the line connecting the centers of both eyes is corrected to be horizontal, and the face and facial parts in the image are also corrected for tilt. 12 in FIG.
Reference numeral 3 denotes an example of the image after the rotation, and reference numeral 148 denotes a line connecting the center positions of both eyes in the rotated image.
According to this, it is understood that the line connecting the center positions of both eyes is horizontal, and that the face in the image is also corrected in inclination and stands upright. Since the inclination is corrected by the above method,
Even when an inclined face is input, in the subsequent processing, processing such as recognition may be performed on the image whose face inclination has been corrected, so that stable and highly accurate recognition can be performed as compared with the conventional method.

【００９４】したがって、本実施例の特徴をまとめると
次のようになる。（６）請求項６の画像処理装置は、顔部品認識手段が検
出した両目の位置に関して、左右の目を結ぶ線が水平に
なるように、顔画像を回転させる手段を備えてなること
を特徴とする。尚、この手段は図示されていない顔部品
認識手段に備えられているため、顔部品認識手段と同様
に、演算装置１３の中もしくは特徴量抽出部１０の中で
その機能が実現されている。Therefore, the features of this embodiment are summarized as follows. (6) The image processing apparatus according to claim 6, further comprising means for rotating the face image such that a line connecting the left and right eyes is horizontal with respect to the positions of both eyes detected by the face part recognizing means. And Since this means is provided in a face part recognizing means (not shown), its function is realized in the arithmetic unit 13 or in the feature quantity extracting unit 10 like the face part recognizing means.

【００９５】上記（６）の構成によれば、顔が傾いた画
像を入力として与えられても高精度に顔部品の位置、大
きさ検出を行なうことができる。［第４の実施例］（請求項７の発明）本実施例では、図３乃至図６を用いて説明する。図３は
目の形状を認識するための目の探索範囲、ヒストグラ
ム、検出された目頭及び目尻の一例を示した図であり、
図４は目頭を検出するためのテンプレートの一例をあら
わした図であり、図５は目の厚みを検出するための目の
探索範囲、肌いろをサンプリングするための領域、肌・
非肌領域の一例をあらわす図であり、図６は目頭探索範
囲及び目尻探索範囲を求めるための動作をあらわすフロ
ーチャートである。According to the above configuration (6), the position and size of a face part can be detected with high accuracy even when an image with a tilted face is given as an input. [Fourth Embodiment] (Invention of Claim 7) This embodiment will be described with reference to FIGS. FIG. 3 is a diagram illustrating an example of an eye search range for recognizing an eye shape, a histogram, and detected inner and outer corners of the eye.
FIG. 4 is a diagram showing an example of a template for detecting the inner corner of the eye. FIG.
FIG. 6 is a flowchart illustrating an example of a non-skin region, and FIG. 6 is a flowchart illustrating an operation for obtaining an eye opening search range and an eye corner search range.

【００９６】図３乃至図６を用いて、左目の形状を判定
する方法を説明する。A method for determining the shape of the left eye will be described with reference to FIGS.

【００９７】左目をその中に含むように探索範囲３１を
設定し、目頭と目尻の位置を検出する。これは、例え
ば、次のような方法で実現できる。人間の目の場合、上
瞼と下瞼が合わさる形で目が構成されている。したがっ
て、これら両瞼の境界が出会う点が目頭、あるいは目尻
とすることができる。The search range 31 is set so as to include the left eye therein, and the positions of the inner and outer corners of the eye are detected. This can be achieved, for example, by the following method. In the case of the human eye, the eye is configured such that the upper and lower eyelids meet. Therefore, the point where the boundary between these two eyelids meets can be the inner or outer corner of the eye.

【００９８】探索範囲内の画像を垂直方向に微分し（Ｓ
６１）、その微分値を垂直方向に投影しヒストグラムを
作成する（Ｓ６２）。３２は、画像３１を垂直方向に微
分し、その微分値を垂直方向に投影して作成したヒスト
グラムを模式的にあらわしたものである。そして、ヒス
トグラムを左から右に走査する（Ｓ６３）。垂直方向へ
の微分は、両瞼の境界線の水平方向の成分を抽出するか
ら、それを垂直方向へ投影すると、次のような特徴をあ
らわすことになる。すなわち、左から右方向へ走査する
と、まず探索範囲内において左端には目が存在せず、肌
部分であるため、ヒストグラムは平坦である。右方向に
見ていき、目頭に達すると、そこから上瞼及び下瞼が始
まるので、ヒストグラムは急激に立ち上がる。そして、
目尻付近に達すると再びヒストグラムは下降に転じる。The image within the search range is differentiated in the vertical direction (S
61), and the histogram is created by projecting the differential value in the vertical direction (S62). Reference numeral 32 schematically represents a histogram created by differentiating the image 31 in the vertical direction and projecting the differential value in the vertical direction. Then, the histogram is scanned from left to right (S63). Since the differentiation in the vertical direction extracts the horizontal component of the boundary line between the two eyelids, when it is projected in the vertical direction, the following features will be exhibited. That is, when scanning from left to right, the histogram is flat because there is no eye at the left end in the search range and it is a skin portion. Looking to the right, when reaching the inner corner of the eye, the upper and lower eyelids start from there, so the histogram rises sharply. And
When reaching near the outer corner of the eye, the histogram starts to fall again.

【００９９】したがって、目の探索範囲内において、該
ヒストグラムを左から右に走査した際に、急激に立ち上
がる点は、その付近に目頭が存在する可能性が高い。こ
こで、適当な閾値を設け、ヒストグラム上昇の変化量が
その閾値を超えた場合（Ｓ６４）、その点の前後の適当
な範囲を、次に目頭を検出するための探索範囲３３とす
る（Ｓ６５）。同様に、適当な閾値を設け、ヒストグラ
ム下降の変化量がそのしき値を超えた場合（Ｓ６６）、
その前後の適当な範囲を、次に目尻を検出するための探
索範囲３４とする（Ｓ６７）。ただし、目尻の場合は目
頭に比べ、はっきりとした境界がない場合が多く、探索
範囲としては目頭の場合より広く設定する方が望まし
い。Therefore, when the histogram is scanned from left to right within the eye search range, the point where the histogram rises sharply is likely to have the inner corner of the eye. Here, an appropriate threshold value is provided, and when the amount of change in the histogram rise exceeds the threshold value (S64), an appropriate range before and after that point is set as a search range 33 for detecting the next inner corner (S65). ). Similarly, if an appropriate threshold value is set and the amount of change in histogram descending exceeds the threshold value (S66),
An appropriate range before and after that is set as a search range 34 for next detecting the outer corner of the eye (S67). However, in the case of the outer corner of the eye, there are many cases where there is no clear boundary as compared with the inner corner of the eye, and it is desirable to set the search range wider than that of the inner corner of the eye.

【０１００】目頭の探索範囲が設定されたら、次にその
探索範囲内において、目頭位置を検出する。これには、
例えば、次のような方法で実現できる。目頭を検出する
ための小テンプレート（以下、目頭テンプレートと記
す）を設定する。図４の例では、目頭テンプレートのサ
イズを４ピクセル×４ピクセルとし、目頭の形状に合わ
せて３種類を用意している。目頭が下がっている４４の
ような目の場合、４１のような目頭テンプレートを用
い、目頭が上がっている４６のような目の場合、４３の
ような目頭テンプレートを用い、それ以外の４５のよう
な目の場合、４２のような目頭テンプレートを用いる。After the search range of the inner corner is set, the position of the inner corner is detected within the search range. This includes
For example, it can be realized by the following method. A small template for detecting the inner corner (hereinafter, referred to as the inner corner template) is set. In the example of FIG. 4, the size of the inner corner template is 4 pixels × 4 pixels, and three types are prepared according to the shape of the inner corner. In the case of an eye like 44 in which the inner corner is lowered, an inner eye template such as 41 is used, and in the case of an eye such as 46 whose inner corner is raised, an inner eye template such as 43 is used. In the case of a simple eye, an inner corner template such as 42 is used.

【０１０１】目頭探索範囲内において、上記目頭テンプ
レートを移動させ（Ｓ７１）、対応する画像との間で類
似度を計算する（Ｓ７２）。類似度Ｓは、例えば、次式
のように定義される。The above-mentioned inner-eye template is moved within the inner-eye search range (S71), and the similarity between the image and the corresponding image is calculated (S72). The similarity S is defined, for example, by the following equation.

【０１０２】ただし、Ｗは目頭テンプレートの白であらわされている
領域、Ｂは目頭テンプレートの黒であらわされている領
域であり、Ｉ（ｐ）は、画素ｐでの輝度値、Ｎ（Ｗ）、
Ｎ（Ｂ）はそれぞれ、目頭テンプレートの白、黒であら
わされている領域の画素数である。目頭探索範囲内すべ
てにおいて類似度Ｓを計算し（Ｓ７３）、類似度Ｓが最
も大きい点を、目頭位置３５とする（Ｓ７４）。[0102] Here, W is a white area of the inner template, B is a black area of the inner template, I (p) is a luminance value at the pixel p, N (W),
N (B) is the number of pixels in the white and black regions of the inner corner template, respectively. The similarity S is calculated in the entire inner-corner search range (S73), and the point having the highest similarity S is set as the inner-corner position 35 (S74).

【０１０３】目尻の探索範囲が設定されたら、次にその
探索範囲内において、目尻位置を検出する。目頭に比べ
て目尻の境界はあいまいであることが多いため、目頭の
ようにテンプレートを用いた手法ではうまく働かないこ
とがある。これには、例えば、目尻の探索範囲内で重心
を求める（Ｓ８１）、という手法を用いる。重心（ｇ
ｘ，ｇｙ）は、例えば、次式のように定義される。After the search range of the outer corner of the eye is set, the position of the outer corner of the eye is detected within the search range. Since the border of the outer corner of the eye is often more ambiguous than the inner eye, a template-based method such as the inner eye may not work well. For this, for example, a method of finding the center of gravity within the search range of the outer corner of the eye (S81) is used. Center of gravity (g
x, gy) is defined, for example, by the following equation.

【０１０４】ここで、Ｒは目尻の探索範囲をあらわす領域、Ｘ（ｐ）
は点ｐのＸ座標、Ｙ（ｐ）は点ｐのＹ座標である。この
重心（ｇｘ，ｇｙ）を目尻位置３６とする（Ｓ８２）。[0104] Here, R is a region representing the search range of the outer corner of the eye, X (p)
Is the X coordinate of point p, and Y (p) is the Y coordinate of point p. The center of gravity (gx, gy) is set as the outer corner of the eye 36 (S82).

【０１０５】目頭位置３５及び目尻位置３６が検出され
ると、目の傾き３７を求めることができる。When the inner and outer eye positions 35 and 36 are detected, the inclination 37 of the eyes can be obtained.

【０１０６】次に、目の厚みを求める。目の厚みを求め
るには、例えば次のような手法がとられる。目の探索範
囲内で、肌及び非肌領域を分離する。そのためにはま
ず、目の探索範囲内での肌の色を解析する。図５の目の
探索範囲画像５０において、明らかに肌であると思われ
る領域、例えば、当該探索範囲画像の外辺部５１の色分
布を調べる。具体的には、人間の肌を構成する画素の色
は、ある色を平均として正規分布にしたがうと仮定し、
該外辺部５１内の画素の色の平均と分散を求める。この
平均と分散で、肌の色を表す確率密度関数を求め、該関
数を目の探索範囲画像に適用することで、肌及び非肌領
域に分離することができる。尚、本手法は、上記文献
［４］あるいは文献［１２］に記載された技術を用い
る。Next, the thickness of the eyes is determined. In order to determine the eye thickness, for example, the following method is used. Separate the skin and non-skin regions within the eye search range. First, the skin color within the eye search range is analyzed. In the search range image 50 of the eye in FIG. 5, the color distribution of an area apparently considered to be skin, for example, the outer edge 51 of the search range image is examined. Specifically, it is assumed that the colors of the pixels constituting the human skin follow a normal distribution with a certain color as an average,
The average and variance of the colors of the pixels in the outer edge 51 are obtained. A probability density function representing the color of the skin is obtained from the average and the variance, and the function is applied to the eye search range image, whereby the skin and non-skin regions can be separated. This technique uses the technique described in the above-mentioned reference [4] or reference [12].

【０１０７】ここで模式図５５において黒く示されてい
る領域５２は、非肌領域として分離された領域を模式的
に示したものである。これは、目を構成する画素の大部
分を含んでいる。この領域について、水平方向及びその
前後に回転させた方向について投影した時に、該領域が
存在する範囲をそれぞれ求める（５３，５４，５５）。
その範囲の長さが最小となるものを、該左目の厚みとす
る。Here, the area 52 shown in black in the schematic diagram 55 schematically shows an area separated as a non-skin area. This includes most of the pixels that make up the eye. When this area is projected in the horizontal direction and in the direction rotated forward and backward, the range in which the area exists is determined (53, 54, 55).
The minimum length of the range is defined as the thickness of the left eye.

【０１０８】このように、目の傾き及び厚みを計測し、
この２つのパラメータをもとに、予め設定しておいたカ
テゴリーに対応する目の形状コードを求めることができ
る。Thus, the inclination and the thickness of the eyes are measured,
Based on these two parameters, an eye shape code corresponding to a preset category can be obtained.

【０１０９】以上は左目の場合の処理を説明したが、右
目の場合は、探索範囲画像を左右反転させてまったく同
じ処理を適用することができる。Although the processing for the left eye has been described above, for the right eye, exactly the same processing can be applied by inverting the search range image horizontally.

【０１１０】したがって、本実施例の特徴をまとめると
次のようになる。（７）請求項７の画像処理装置は、顔部品認識手段が検
出した目の位置や大きさに基づいて探索範囲を設定し、
その範囲で目の傾き及び厚みをあらわす画像特徴を検出
し、目の形状を判定する手段を備えてなることを特徴と
する。尚、この手段は図示されていない顔部品認識手段
に備えられているため、顔部品認識手段と同様に、演算
装置１３の中もしくは特徴量抽出部１０の中でその機能
が実現されている。Therefore, the characteristics of this embodiment are summarized as follows. (7) The image processing apparatus according to claim 7 sets a search range based on the position and size of the eyes detected by the face part recognition means,
It is characterized by comprising means for detecting an image feature representing the inclination and thickness of the eye in that range and determining the shape of the eye. Since this means is provided in a face part recognizing means (not shown), its function is realized in the arithmetic unit 13 or in the feature quantity extracting unit 10 like the face part recognizing means.

【０１１１】上記（７）の構成によれば、設定された探
索範囲内で目の傾き及び目の厚みをあらわす画像特徴を
検出し、目の形状を判定するので、テンプレート画像、
あるいは辞書画像と、それに対応する入力画像中の部分
画像とがずれていることにより、誤った特徴量が抽出さ
れるという危険を回避することができる。さらに、対象
とする目の形状が、予め準備しておいたカテゴリーに含
まれない形状の場合に、正しい特徴量が算出できないと
いった危険を回避することができる。さらに、テンプレ
ート画像や辞書画像を準備する必要がなり、作業量を大
幅に少なくすることができる。［第５の実施例］（請求項８の発明）本実施例では、図１５及び図１６を用いて、口の検出に
ついて説明する。図１５は口の検出動作をあらわすフロ
ーチャートであり、図１６は口の検出の画像及び投影結
果の画像の一例をあらわした図である。According to the above configuration (7), an image feature representing the inclination and thickness of the eyes is detected within the set search range, and the shape of the eyes is determined.
Alternatively, it is possible to avoid a danger that an erroneous feature amount is extracted due to a difference between the dictionary image and the corresponding partial image in the input image. Furthermore, when the target eye shape is a shape that is not included in the category prepared in advance, it is possible to avoid a danger that a correct feature amount cannot be calculated. Furthermore, it is necessary to prepare a template image and a dictionary image, and the amount of work can be significantly reduced. Fifth Embodiment (Invention of Claim 8) In this embodiment, detection of a mouth will be described with reference to FIGS. FIG. 15 is a flowchart showing a mouth detection operation, and FIG. 16 is a diagram showing an example of a mouth detection image and an image of a projection result.

【０１１２】口の検出は回転により傾きの補正された画
像に対し行なう。まず検出された両目の各々の中心の間
の距離ｅｌ２を求める。次に検出された両目の各々の中
心の間の距離ｅｌ２から、十分に口が含まれるであろう
大きさとなるよう予め定めた定数ＭＷ，ＭＨを両目の各
々の中心間の距離に乗算して範囲の幅ｍｗ及び高さｍｈ
を算出する。続いて算出された範囲の幅ｍｗ及び高さｍ
ｈに従い、ユーザーにより指定された口のおおよその位
置を中心として画像を切り出す（１２５）。図１６中の
１３２は切り出された口周辺の画像の例である。The mouth is detected for an image whose inclination has been corrected by rotation. First, a distance el2 between the centers of the detected eyes is determined. Next, from the detected distance el2 between the centers of both eyes, constants MW and MH, which are predetermined so as to have a size enough to include the mouth, are multiplied by the distance between the centers of both eyes. Range width mw and height mh
Is calculated. Subsequently, the calculated width mw and height m of the range
According to h, an image is cut out about the approximate position of the mouth designated by the user (125). Reference numeral 132 in FIG. 16 is an example of an image around the mouth that has been cut out.

【０１１３】次に切り出された口周辺の画像の横方向で
の中央付近の領域を決定し（１２６）、横に投影する
（１２７）。図１６中の１３３は切り出された口周辺の
画像の例であり、１３６は１３３における横方向の投影
を行なう領域であり、１３９は投影結果を模式的に表し
たものである。この投影を行なう領域は予め定めた定数
ＭＨＷを切り出した口付近の画像の幅ｍｈに乗算した値
を投影する範囲の幅ｍｈｗに、切り出した口付近の画像
の高さを投影する範囲の高さとして算出している。Next, a region near the center in the horizontal direction of the image of the cut out mouth is determined (126), and the image is projected horizontally (127). Reference numeral 133 in FIG. 16 is an example of an image around the mouth that has been cut out. Reference numeral 136 denotes a region in which projection in the horizontal direction in 133 is performed, and reference numeral 139 schematically shows a projection result. The area where this projection is performed is a height m of a range where a value obtained by multiplying a predetermined constant MHW by the width mh of the image near the clipped mouth is projected, and a height of a range where the height of the image near the clipped mouth is projected. It is calculated as

【０１１４】次に投影結果１３９から値の最も低い箇所
を探索する。この値の最も低い箇所が唇裂傷の縦方向の
位置である（１２８）。この時、口付近の画像が傾いて
おり補正もされておらず、唇裂が水平でない場合を仮定
する。図１６中の１３４はそのような口周辺画像の例で
あり、１３７は１３４における横方向の投影を行なう領
域であり、１３９は投影結果を模式的に表したものであ
る。１３４においては唇裂は水平になっておらず、従っ
て高さ方向の位置だけでは唇裂の位置を正確に表すこと
ができない。また投影結果１３９では投影結果の鋭さが
失われ、唇裂の位置が検出不可能になっていることがわ
かる。言い替えると本発明においては、目の検出結果に
従い顔全体が正しく直立し唇裂が水平となるよう予め画
像を回転した後に口の検出を行なうことにより、従来の
方式に比べ安定かつ高精度に唇裂の検出を行なうことを
可能としている。Next, a portion having the lowest value is searched from the projection result 139. The lowest point of this value is the vertical position of the lip tear (128). At this time, it is assumed that the image near the mouth is tilted and not corrected, and the lip cleft is not horizontal. In FIG. 16, reference numeral 134 denotes an example of such an image around the mouth. Reference numeral 137 denotes a region in the horizontal projection in 134, and reference numeral 139 schematically shows a projection result. In 134, the cleft is not horizontal, so the position in the height direction alone cannot accurately represent the position of the cleft. In the projection result 139, it can be seen that the sharpness of the projection result is lost, and the position of the lip cleft cannot be detected. In other words, in the present invention, the mouth is detected after rotating the image in advance so that the entire face is correctly erect and the lip cleft is horizontal according to the result of the eye detection, so that the cleft of the lip is more stably and more accurately than the conventional method. It is possible to perform detection.

【０１１５】続いて検出した唇裂の付近で上下に細く横
に長い領域を決定し（１２９）、縦に投影する（１３
０）。図１６中の１３５がその場合の口周辺画像の例で
あり、１３８が縦方向の投影を行なう領域である。この
縦方向の投影を行なう領域は、切り出した口付近の画像
の高さｍｈに予め定めた定数ＭＶＨを乗算した値を投影
する範囲の高さｍｖｈに、切り出した口付近の画像の幅
を投影する範囲の幅とし、検出した唇裂の高さ方向の位
置を投影する範囲の高さ方向の中心として算出する。Subsequently, a vertically and horizontally long region is determined in the vicinity of the detected cleft lip (129) and projected vertically (13).
0). Reference numeral 135 in FIG. 16 is an example of the image around the mouth in that case, and reference numeral 138 is an area for performing vertical projection. This vertical projection area is obtained by projecting the width of the image near the cut mouth onto the height mvh of a range in which a value obtained by multiplying the height mh of the image near the cut mouth by a predetermined constant MVH is projected. And the calculated position of the detected lip cleft in the height direction is calculated as the center of the projected range in the height direction.

【０１１６】図１６中の１４１が同図中の１３８を縦に
投影した結果である。１３８の値の低い部分の幅が唇裂
の幅である。１３８の値の低い部分の判定は以下の方法
で行なう。まず、投影結果の値に対し判別分析を行な
い、閾値ｔｈｍを決定する。次に求めた閾値ｔｈｍと、
投影結果の各値を比較、ｔｈｍより値が低い箇所が唇裂
のある箇所であり、最も左にあるｔｈｍより値が低い箇
所が唇裂の左端の箇所、最も右にあるｔｈｍより値が低
い箇所が唇裂の右端の箇所である。また唇裂の右端と左
端の間隔が唇裂の幅であり、以上により唇裂の位置と幅
が認識される（１３１）。A result 141 of FIG. 16 is obtained by vertically projecting 138 in FIG. The width of the part with a low value of 138 is the width of the cleft lip. The determination of the portion having a low value of 138 is performed by the following method. First, discriminant analysis is performed on the value of the projection result to determine a threshold thm. Next, the threshold thm obtained,
The values of the projection results are compared, the portion having a value lower than thm is a portion having a cleft, the portion having a value lower than thm at the leftmost portion is at the left end portion of the cleft, and the portion having a value lower than thm at the rightmost portion is a portion having a cleft. This is the right end of the cleft lip. The interval between the right end and the left end of the cleft is the width of the cleft, and the position and width of the cleft are recognized as described above (131).

【０１１７】特に図示することはしないが、画像中の唇
裂が傾いていれば、縦方向の投影を行なう領域から唇裂
がはみ出てしまうか、縦方向の投影を行なう領域の高さ
を大きくしなければならず、どちらの場合も唇裂幅の検
出精度が低下することは明らかである。言い替えると、
本発明においては、顔の傾きを補正することにより安定
かつ高精度に唇裂の位置、幅を検出することを可能にし
ている。Although not particularly shown, if the lip in the image is tilted, the lip may protrude from the region where the vertical projection is performed, or the height of the region where the vertical projection is performed must be increased. In both cases, it is clear that the detection accuracy of the lip cleft width decreases. In other words,
According to the present invention, the position and width of the lip cleft can be detected stably and with high accuracy by correcting the inclination of the face.

【０１１８】したがって、本実施例の特徴をまとめると
次のようになる。（８）請求項８の画像処理装置は、顔部品認識手段が検
出する位置や大きさが顔部品の口であることを特徴とす
る。尚、顔部品認識手段は図示されていないが演算装置
１３の中もしくは特徴量抽出部１０の中でその機能が実
現されている。Therefore, the characteristics of this embodiment are summarized as follows. (8) The image processing apparatus according to claim 8 is characterized in that the position and the size detected by the face part recognition means are the mouth of the face part. Although the face part recognition means is not shown, its function is realized in the arithmetic unit 13 or the feature quantity extraction unit 10.

【０１１９】上記（８）の構成によれば、高精度に顔部
品の口の大きさを予測し、画像中の必要十分な範囲内で
のみ検出処理を行ない、少ない計算量で高精度に顔部品
の位置、大きさ検出を行なうことができる。［第６の実施例］（請求項９の発明）本実施例では、図９乃至図１０を用いて、左眉の位置及
び大きさを抽出する方法を説明する。図９は眉毛の位置
及び大きさを検出するための眉毛の探索範囲、２値化画
像の一例をあらわした図であり、図１０は眉毛の位置及
び大きさを検出するための動作をあらわすフローチャー
トである。According to the configuration (8), the size of the mouth of the face part is predicted with high accuracy, the detection process is performed only within a necessary and sufficient range in the image, and the face is accurately calculated with a small amount of calculation. The position and size of a part can be detected. [Sixth Embodiment] (Invention of Claim 9) In this embodiment, a method for extracting the position and size of the left eyebrow will be described with reference to FIGS. FIG. 9 is a diagram illustrating an example of an eyebrow search range and a binarized image for detecting the position and size of eyebrows, and FIG. 10 is a flowchart illustrating an operation for detecting the position and size of eyebrows. It is.

【０１２０】まず、左眉をその中に含むように十分な大
きさの探索範囲を設定する（Ｓ１０１）。探索範囲は、
目の探索範囲を求めるときと同様に位置指定装置で指定
された両目の位置等から求めてもよいし、前記第２の実
施例（請求項５記載の画像処理装置を使用して）で求め
た左目の位置から求めてもよい。いずれの場合も、予
め、複数の人物顔について、目の位置と両目間の距離、
乃至は、左目の位置と、左眉がその中に含まれるような
探索範囲の関係を計測し、求めておく必要がある。図９
に左眉の探索範囲画像９１と、それを複数の閾値で２値
化した画像９２乃至９６を示す。ここでは、画像９４が
眉毛を分離するための最適な閾値で２値化ｓされた画像
であるとする。First, a search range large enough to include the left eyebrow therein is set (S101). The search range is
It may be obtained from the positions of both eyes specified by the position specifying device in the same manner as when obtaining the eye search range, or may be obtained from the second embodiment (using the image processing device according to claim 5). Alternatively, it may be obtained from the position of the left eye. In any case, in advance, for a plurality of human faces, the position of the eyes and the distance between the eyes,
Alternatively, it is necessary to measure and obtain the relationship between the position of the left eye and the search range in which the left eyebrow is included. FIG.
15 shows a search range image 91 of the left eyebrow and images 92 to 96 obtained by binarizing the image 91 with a plurality of thresholds. Here, it is assumed that the image 94 is an image binarized with an optimum threshold value for separating eyebrows.

【０１２１】一般に、入力される画像のコントラスト
や、髪の毛、影など様々な要因があり、画像を２値化し
て眉毛を分離するための最適な閾値を求めることは非常
に困難である。そこで、本画像処理装置では、複数のし
きいで２値化した画像を比較し、最も眉をよく分離して
いると思われる画像の閾値を求めることで、上記の問題
を解決している。In general, there are various factors such as contrast of an input image, hair, shadow, and the like, and it is very difficult to binarize an image to obtain an optimal threshold for separating eyebrows. Therefore, the image processing apparatus solves the above-described problem by comparing binarized images with a plurality of thresholds and determining a threshold value of an image that is considered to have the best eyebrow separation.

【０１２２】すなわち、画像９２ないし９６は、閾値の
下限と上限を求め（Ｓ１０２）、その間で等間隔に閾値
を設定してそれぞれ２値化（Ｓ１０３）した画像であ
る。閾値の下限及び上限は、例えば、Ｐ−ＴＩＬＥ法等
を用い、下限の場合は暗い画素が全体の５％、上限の場
合は暗い画素が全体の５０％というように決定すること
ができるし、他の適当な手法を用いてもよい。尚、Ｐ−
ＴＩＬＥ法については、文献［５］に記載されている技
術を用いてもよい。That is, the images 92 to 96 are binarized images (S103) in which the lower and upper limits of the threshold are determined (S102), and the thresholds are set at regular intervals therebetween. For example, the lower limit and the upper limit of the threshold value can be determined by using the P-TILE method or the like. In the case of the lower limit, the dark pixels are 5% of the whole, and in the case of the upper limit, the dark pixels are 50% of the whole. Other suitable techniques may be used. In addition, P-
For the TILE method, a technique described in reference [5] may be used.

【０１２３】それぞれの閾値で２値化した画像に対し、
次の処理を行う。まず、ここでの２値化は、暗い画素を
白い画素に、明るい画素を黒い画素で表現する方法を用
いており、隣接する白画素の領域をラベルと呼ぶ。２値
化画像９２は、９２ａないし９２ｃの３つのラベルで構
成されていることを示している。With respect to the image binarized by each threshold value,
The following processing is performed. First, the binarization here uses a method of expressing dark pixels as white pixels and bright pixels as black pixels, and an area of adjacent white pixels is called a label. This shows that the binarized image 92 is composed of three labels 92a to 92c.

【０１２４】眉の探索範囲は、その中に眉を含むように
十分な大きさをもって設定されるので、２値化した際
に、探索範囲画像の各辺に接するラベルは、眉以外の領
域が分離されたものであるとみなすことができる。すな
わち、２値画像９２では、ラベル９２ｃは眉毛ではなく
髪の毛が分離されたものであり、眉毛を構成するラベル
ではないことがわかる。したがって、上記探索範囲画像
の各辺に接するラベルを取り除いた後に残ったラベルが
眉毛を構成するラベルの候補として残る（Ｓ１０４）。The search range of the eyebrow is set with a sufficient size so as to include the eyebrow in the search range. Therefore, when binarized, the label in contact with each side of the search range image has an area other than the eyebrow. It can be considered as separated. That is, in the binary image 92, it can be seen that the label 92c is not the eyebrow but the hair separated, and is not a label constituting the eyebrow. Therefore, the labels remaining after removing the labels in contact with the respective sides of the search range image remain as label candidates constituting eyebrows (S104).

【０１２５】上記候補として残ったラベルのうち、面積
の小さいラベル、例えば眉探索範囲画像の１％以下の面
積しかもたないようなラベルについては、画像中のノイ
ズやその他の要因で、眉毛以外の暗い画素が分離された
と考えることができる。したがって、上記小面積ラベル
を取り除いた後に残ったラベルが眉毛を構成する画素と
する（Ｓ１０５）。Of the labels remaining as candidates, labels having a small area, for example, labels having an area of 1% or less of the eyebrow search range image, may be caused by noise in the image or other factors. It can be considered that dark pixels have been separated. Therefore, the label remaining after removing the small area label is used as a pixel constituting the eyebrows (S105).

【０１２６】以上のようにして残ったラベルに外接する
矩形を求め、この幅及び高さを眉毛候補の幅及び高さと
する。これらの幅及び高さを、両目間の距離などから推
定される眉毛の平均的な幅及び高さと比較し、その差が
最も小さい眉毛候補が含まれる２値化画像を、眉毛を最
もよく分離している２値化画像であるとみなす（Ｓ１０
６）。この２値化画像における、眉毛を分離していると
みなされるラベルの位置及び大きさを、該入力画像での
左眉の位置及び大きさとする（Ｓ１０７）。A rectangle circumscribing the remaining label is obtained as described above, and this width and height are used as the width and height of the eyebrow candidate. The width and height are compared with the average width and height of eyebrows estimated from the distance between the eyes, and the binarized image including the eyebrow candidate with the smallest difference is separated from the eyebrows best. (S10)
6). In this binarized image, the position and size of the label that is considered to separate the eyebrows are set as the position and size of the left eyebrow in the input image (S107).

【０１２７】以上は左眉の場合の処理を説明したが、右
眉の場合は、探索範囲画像を左右反転させてまったく同
じ処理を適用することができる。The processing for the left eyebrow has been described above. However, for the right eyebrow, the same processing can be applied by inverting the search range image horizontally.

【０１２８】したがって、本実施例の特徴をまとめると
次のようになる。（９）請求項９の画像処理装置は、顔部品認識手段が検
出する位置や大きさが顔部品の眉であることを特徴とす
る。尚、顔部品認識手段は図示されていないが演算装置
１３の中もしくは特徴量抽出部１０の中でその機能が実
現されている。Therefore, the characteristics of this embodiment are summarized as follows. (9) The image processing apparatus according to claim 9 is characterized in that the position and size detected by the face part recognizing means are eyebrows of the face part. Although the face part recognition means is not shown, its function is realized in the arithmetic unit 13 or the feature quantity extraction unit 10.

【０１２９】上記（９）の構成によれば、２つ以上の顔
部品の位置を位置指定手段により指定し、それら指定位
置間の距離関係から眉毛の大きさを予測することで、特
徴量を抽出する際の処理を行うべき範囲を適当な大きさ
に制限することができる。その上で、当該処理範囲に対
して２値化を行うが、認識対象の領域を得るために２値
化を行なうにあたり、複数の方式、複数の閾値で２値化
し、それらの画像中の領域の位置、大きさ、形状等を判
定し、最も信頼できる画像を選択することにより、認識
対象を高精度に検出する方法を備えることを特徴として
いる。そのため、２つ以上の顔部品の位置を位置指定手
段により指定し、それら指定位置間の距離関係から眉毛
の大きさを予測し、特徴量を抽出する際の処理を行うべ
き範囲を適当な大きさに制限することに加え、眉毛の大
きさを推定することができる。すなわち、例えば、位置
指定手段によって位置指定される顔部品が右目と左目で
ある場合、眉毛の大きさは、その両目間の距離に一定の
係数を乗じた値から大きく離れた値ではない、とするこ
とができる。According to the configuration of the above (9), the position of two or more face parts is specified by the position specifying means, and the size of the eyebrows is predicted from the distance relationship between the specified positions. The range in which the processing at the time of extraction should be performed can be limited to an appropriate size. Then, binarization is performed on the processing range. In order to obtain a region to be recognized, binarization is performed using a plurality of methods and a plurality of thresholds. The method is characterized in that a method is provided for detecting the recognition target with high accuracy by determining the position, size, shape, etc. of the image and selecting the most reliable image. Therefore, the positions of two or more face parts are specified by the position specifying means, the size of the eyebrows is predicted from the distance relationship between the specified positions, and the range in which the processing for extracting the feature amount is to be performed is appropriately large. In addition to the restriction, the size of the eyebrows can be estimated. That is, for example, when the face parts whose positions are specified by the position specifying unit are the right eye and the left eye, the size of the eyebrows is not a value greatly separated from a value obtained by multiplying the distance between the eyes by a constant coefficient. can do.

【０１３０】したがって、２値化を行った際に分離され
る領域の大きさと、推定される眉毛の大きさを比較し、
それらの大きさがあまり離れていないような２値化の閾
値を求めることで、眉毛をあらわす領域を高精度に検出
することができる。［第７の実施例］（請求項１０の発明）本実施例では、図１１乃至図１４を用いて、左眉の形状
を判定する方法を説明する。Therefore, the size of the region separated when binarization is performed is compared with the estimated size of the eyebrows.
By obtaining a binarization threshold value such that their sizes are not so far apart, an area representing an eyebrow can be detected with high accuracy. Seventh Embodiment In the present embodiment, a method of determining the shape of the left eyebrow will be described with reference to FIGS.

【０１３１】図１１は眉毛に外接する矩形の一例をあら
わした図であり、図１２は眉毛の形状を認識するための
量子化の一例をあらわす図であり、図１３は眉毛の折れ
曲がりかたを検出するための動作をあらわすフローチャ
ートであり、図１４は眉毛の厚みを検出するための動作
をあらわすフローチャートである。FIG. 11 is a diagram showing an example of a rectangle circumscribing the eyebrows. FIG. 12 is a diagram showing an example of quantization for recognizing the shape of the eyebrows. FIG. 13 is a diagram showing how the eyebrows bend. FIG. 14 is a flowchart showing the operation for detecting, and FIG. 14 is a flowchart showing the operation for detecting the thickness of the eyebrows.

【０１３２】まず、左眉をその中に含むように探索範囲
１１１を設定する（Ｓ１３１）。図１１には、前記第６
の実施例（請求項９の画像処理装置）を用いて眉の位置
及び大きさを抽出するために最適な閾値で２値化した画
像１１２において、眉の位置及び大きさが、眉を分離し
たラベルの外接矩形１１３として示されている。上記矩
形内のラベル画像を、量子化する（Ｓ１３２）。図１２
では、３×２のサイズに量子化する例を示している。
尚、ここでは、地の色が白、眉毛の色が黒で表されてい
る。ここで、上がり眉の例１２１及び下がり眉の例１２
２が示されているが、これを３×２のサイズに量子化す
ると、模式図１２４及び１２５のようになる。ここで、
例えば、３×２のブロックのうち、模式図１２３で示さ
れるＡ及びＢの部分を観測し、Ａ白かつＢ黒の場合は上
がり眉、逆に、Ａ黒かつＢ白の場合は下がり眉とするこ
とができる。同様に、その他の折れ曲がり形状も同様の
量子化パターンを調べることで検出（Ｓ１３３）するこ
とができ、これをもって眉の折れ曲がりかたを検出する
ことが可能となる。First, the search range 111 is set so as to include the left eyebrow therein (S131). FIG.
In the image 112 binarized with the optimum threshold value to extract the position and size of the eyebrows using the embodiment (the image processing apparatus of claim 9), the position and size of the eyebrows are separated from the eyebrows. This is shown as the bounding rectangle 113 of the label. The label image within the rectangle is quantized (S132). FIG.
Shows an example in which quantization is performed to a size of 3 × 2.
Here, the color of the ground is represented by white, and the color of the eyebrows is represented by black. Here, example 121 of rising eyebrow and example 12 of falling eyebrow
Although 2 is shown, when this is quantized to a size of 3 × 2, it becomes as shown in schematic diagrams 124 and 125. here,
For example, among the 3 × 2 blocks, the portions A and B shown in the schematic diagram 123 are observed, and when A white and B black, the rising eyebrow, and conversely, when A black and B white, the falling eyebrow can do. Similarly, other bent shapes can also be detected by examining the same quantization pattern (S133), and thus, it is possible to detect how the eyebrows are bent.

【０１３３】次に、眉毛の太さを検出する。眉毛の太さ
を検出するには、例えば次のような処理を行う。まず、
前記第６の実施例（請求項９の画像処理装置）を用いて
眉の位置及び大きさを抽出するために最適な閾値で２値
化した画像における、眉を分離したラベル（以下、眉毛
ラベル画像と記す）に対して、収縮処理を施す。この収
縮処理については、文献［５］の技術を用いてもよい。
収縮処理を施し、眉毛ラベルが消滅するまでの回数を計
測し、その回数により眉の太さを判定する。すなわち、
細い眉の場合は、太い眉に比べて、より少ない回数の収
縮処理で眉毛ラベルが消滅するので、眉毛ラベルが消滅
するまでの回数を計測することで、眉毛の太さを検出す
ることができる。Next, the thickness of the eyebrows is detected. To detect the thickness of the eyebrows, for example, the following processing is performed. First,
A label (hereinafter referred to as an eyebrow label) in which an eyebrow is separated in an image binarized with an optimum threshold value for extracting the position and size of an eyebrow using the sixth embodiment (the image processing apparatus of claim 9) (Hereinafter referred to as an image). For this contraction processing, the technique of Reference [5] may be used.
A contraction process is performed, the number of times until the eyebrow label disappears is measured, and the thickness of the eyebrow is determined based on the number of times. That is,
In the case of a thin eyebrow, the eyebrow label disappears with a smaller number of contraction processes than in the case of a thick eyebrow, so by measuring the number of times until the eyebrow label disappears, the thickness of the eyebrow can be detected. .

【０１３４】以上、眉毛の折れ曲がりかた及び太さを計
測し、この２つのパラメータをもとに、予め設定してお
いたカテゴリーに対応する眉毛の形状コードを求めるこ
とができる。As described above, the bending and thickness of the eyebrows are measured, and based on these two parameters, the eyebrow shape code corresponding to the preset category can be obtained.

【０１３５】以上は左眉の場合の処理を説明したが、右
眉の場合は、探索範囲画像を左右反転させてまったく同
じ処理を適用することができる。The processing for the left eyebrow has been described above. However, for the right eyebrow, the same processing can be applied by inverting the search range image horizontally.

【０１３６】したがって、本実施例の特徴をまとめると
次のようになる。（１０）請求項１０の画像処理装置は、顔部品認識手段
が検出した眉の位置や大きさに基づいて探索範囲を設定
し、その範囲で眉の太さ及び折れ曲がり方をあらわす画
像特徴を検出し、眉の形状を判定する手段を備えてなる
ことを特徴とする。尚、この手段は図示されていない顔
部品認識手段に備えられているため、顔部品認識手段と
同様に、演算装置１３の中もしくは特徴量抽出部１０の
中でその機能が実現されている。Therefore, the characteristics of this embodiment are summarized as follows. (10) The image processing device according to claim 10 sets a search range based on the position and size of the eyebrows detected by the face part recognizing means, and detects an image feature representing the thickness of the eyebrows and how to bend in the range. And a means for determining the shape of the eyebrows. Since this means is provided in a face part recognizing means (not shown), its function is realized in the arithmetic unit 13 or in the feature quantity extracting unit 10 like the face part recognizing means.

【０１３７】上記（１０）の構成によれば、設定された
探索範囲内で眉毛の太さ及び折れ曲がりかたをあらわす
画像特徴を検出し、眉毛の形状を判定するので、テンプ
レート画像、あるいは辞書画像と、それに対応する入力
画像中の部分画像とがずれていることにより、誤った特
徴量が抽出されるという危険を回避することができる。
さらに、対象とする眉毛の形状が、予め準備しておいた
カテゴリーに含まれない形状の場合に、正しい特徴量が
算出できないといった危険を回避することができる。さ
らに、テンプレート画像や辞書画像を準備する必要がな
り、作業量を大幅に少なくすることができる。［第８の実施例］（請求項１１の発明）本実施例では、図２３乃至図３０を用いて、図２に示す
原画像内の人物顎輪郭形状を判定する画像処理動作を説
明する。According to the configuration of the above (10), the image characteristics representing the thickness and the manner of bending of the eyebrows are detected within the set search range, and the shape of the eyebrows is determined, so that the template image or the dictionary image And the corresponding partial image in the input image is shifted, thereby avoiding the risk that an erroneous feature amount is extracted.
Further, when the shape of the target eyebrow is not included in the category prepared in advance, it is possible to avoid a risk that a correct feature amount cannot be calculated. Furthermore, it is necessary to prepare a template image and a dictionary image, and the amount of work can be significantly reduced. [Eighth Embodiment] (Embodiment 11) In this embodiment, an image processing operation for judging a human jaw contour shape in an original image shown in FIG. 2 will be described with reference to FIGS.

【０１３８】図２３は図１に示す画像処理装置によって
実施される画像処理動作を説明するためのフローチャー
トであり、図２４は入力画像中の中心座標および初期輪
郭の配置を説明するための図であり、図２５は初期輪郭
上の一点と中心座標を結ぶ直線上の色差算出を行なう方
法を説明するための図であり、図２６は色差の算出例を
模式的に示した図であり、図２７は顔輪郭形状に特化し
た色差算出をおこなう手法として顔が楕円形状であるこ
とを利用する場合について説明するための図であり、図
２８は顔輪郭形状に特化した色差算出をおこなう手法と
して顔が中心軸に対して左右対称であることを利用する
場合について説明するための図であり、図２９は抽出し
た顔輪郭線から距離関数を算出する手法を説明するため
の図であり、図３０は入力画像から得られた距離関数と
基準距離関数を比較する手法を説明するための図であ
る。FIG. 23 is a flowchart for explaining an image processing operation performed by the image processing apparatus shown in FIG. 1, and FIG. 24 is a diagram for explaining the arrangement of the center coordinates and the initial contour in the input image. FIG. 25 is a diagram for explaining a method of calculating a color difference on a straight line connecting a point on the initial contour and the center coordinates, and FIG. 26 is a diagram schematically illustrating an example of calculating a color difference. FIG. 27 is a diagram for explaining a case where a face is elliptical as a method for performing color difference calculation specialized for a face contour shape, and FIG. 28 is a method for performing color difference computation specialized for a face contour shape. FIG. 29 is a diagram for describing a case in which the fact that the face is symmetrical about the center axis is used, and FIG. 29 is a diagram for explaining a method of calculating a distance function from the extracted face contour. FIG. Is a diagram for explaining a method of comparing the distance function and the reference distance function obtained from the input image.

【０１３９】ここでは図２３のフローチャートを用い
て、図２に示す原画像２１内の人物顎輪郭形状を判定す
る画像処理動作を説明する。Here, an image processing operation for judging the shape of the person jaw contour in the original image 21 shown in FIG. 2 will be described with reference to the flowchart of FIG.

【０１４０】まず前提として入力装置１１は、対象とな
る原画像２１を記憶装置１１に格納してあるとする。最
初に操作者は、顔の中心を特定するための位置情報を位
置指定装置１４により指定し、原画像上の顔の中心位置
を確定する（ステップＳ２０１）。この顔の中心位置
は、直接操作者が指定してもよいし、図２中の２２〜２
４に示すような、右目、左目、口の中心座標を指定し、
その中心を顔の中心位置として算出してもよい。First, it is assumed that the input device 11 stores the target original image 21 in the storage device 11. First, the operator specifies position information for specifying the center of the face using the position specifying device 14 and determines the center position of the face on the original image (step S201). The center position of the face may be directly designated by the operator, or may be designated by 22 to 2 in FIG.
Specify the center coordinates of the right eye, left eye, and mouth as shown in 4.
The center may be calculated as the center position of the face.

【０１４１】次に、ステップＳ２０２では、顔輪郭の近
傍に初期輪郭の配置を行なう。初期輪郭の配置は、操作
者が位置指定装置１４により直接指定するか、上述のよ
うに、目や口などの顔の他の部分の配置が分かっている
場合は、それらの情報をもとに自動的に適当な位置に配
置してもよい。例えば、目と口の領域を囲むような領域
を初期輪郭とする。目や口の相対距離を統計的に予め調
べておいて、適当なマージンをつけて目と口を囲むよう
に配置してもよい。図２４は中心位置２１０、及び初期
輪郭２１１を確定した画像を説明する例を示したもので
ある。Next, in step S202, an initial contour is arranged near the face contour. The arrangement of the initial contour is directly designated by the operator using the position designation device 14 or, as described above, when the arrangement of other parts of the face such as eyes and mouth is known, the information is used based on such information. It may be automatically arranged at an appropriate position. For example, a region surrounding the eye and mouth regions is set as the initial contour. The relative distance between the eyes and the mouth may be statistically checked in advance, and may be arranged so as to surround the eyes and the mouth with an appropriate margin. FIG. 24 shows an example for explaining an image in which the center position 210 and the initial contour 211 are determined.

【０１４２】次に、ステップＳ２０３では、原画像と中
心位置及び初期輪郭から、顔中心座標と初期輪郭上の各
座標を結ぶ直線上の隣り合う画素間の色差を算出し、対
象画素間の座標中点を座標値とし、算出した色差を画素
値にも画像（色差マップ画像）を作成する。Next, in step S203, a color difference between adjacent pixels on a straight line connecting the face center coordinates and each coordinate on the initial contour is calculated from the original image, the center position, and the initial contour, and the coordinates between the target pixels are calculated. An image (color difference map image) is created with the calculated color difference as the pixel value, with the middle point as the coordinate value.

【０１４３】ここで、前記色差を算出する方法として
は、例えば、画素データの各単色光ごとの輝度値を画素
間でそれぞれ減算処理することにより差分値を算出し、
各単色光ごとの差分値の合計値を色差として算出する。
色差算出手法は、この他の別の手法を用いてもよい。例
えば、画素データを各単色光の輝度値から、色相（Ｈ）
／彩度（Ｓ）／輝度（Ｖ）で表現されるＨＳＶ値に変換
し、色差を求める２画素のＨＳＶ空間上での位置を求
め、その間の距離値を色差としてもよい。また隣り合う
画素間ではなく、例えば連続する５画素単位ごとに平均
色を求め、その平均色同士の色差を求めてもよい。ま
た、色差の算出時には、対象が人物顔であることを利用
して色差の検出精度を変更してもよい。例えば、色差を
算出する際に比較する２画素の画素値が肌色を表す画素
値に近い値をもつ場合、２点は顔輪郭内の画素である可
能性が高いとみなし、色差の検出精度を低くし、ノイズ
等の影響を軽減することができる。一方、顎と首はどち
らも肌色を表す画素値をもつ可能性が高く、その境目で
ある顎境界を検出する際には、検出精度を上げたほうが
よい。したがって、中心から首方向への直線上の色差検
出時には、色差の検出精度を高めるようにし、顎境界を
検出しやすくする。尚、首の位置は、例えば口の座標が
該値であるならば、その座標方向を推定することが可能
である。Here, as a method of calculating the color difference, for example, a difference value is calculated by subtracting a luminance value for each single color light of pixel data between pixels.
The sum of the difference values for each single color light is calculated as a color difference.
The color difference calculation method may use another method. For example, the pixel data is calculated based on the hue (H) from the luminance value of each monochromatic light.
It may be converted into an HSV value expressed by / saturation (S) / luminance (V), the position of the two pixels for which the color difference is to be determined in the HSV space, and the distance value between them may be used as the color difference. Alternatively, an average color may be calculated not for adjacent pixels but for every five consecutive pixels, for example, and a color difference between the average colors may be calculated. When calculating the color difference, the detection accuracy of the color difference may be changed by utilizing the fact that the target is a human face. For example, if the pixel values of the two pixels to be compared when calculating the color difference have a value close to the pixel value representing the skin color, it is considered that the two points are likely to be pixels within the face outline, and the detection accuracy of the color difference is determined. Lowering the influence of noise and the like. On the other hand, it is highly possible that both the chin and the neck have pixel values representing the skin color, and it is better to increase the detection accuracy when detecting the chin boundary which is the boundary. Therefore, at the time of detecting a color difference on a straight line from the center to the neck, the detection accuracy of the color difference is increased, and the jaw boundary is easily detected. Note that the position of the neck can be estimated in the coordinate direction if the coordinates of the mouth are the values.

【０１４４】例として図２５に示すような顔中心２１２
と初期輪郭の上の座標点２１３を結ぶ直線２１４上の色
差を求める場合の模式図を図２６に示す。２１５が直線
上の画素値の並びを示し、２１６が隣合う２点間の画素
値の差分を示している。すなわち、この例では２１６が
色差の並びを示す。For example, the face center 212 as shown in FIG.
FIG. 26 is a schematic diagram showing a case where a color difference on a straight line 214 connecting the coordinate point 213 on the initial contour with the color difference 213 is obtained. Reference numeral 215 denotes an arrangement of pixel values on a straight line, and reference numeral 216 denotes a difference in pixel value between two adjacent points. That is, in this example, 216 indicates the arrangement of the color differences.

【０１４５】また、色差を検出後に、さらに人物顔輪郭
独自の特徴性を利用し、より顔輪郭形状に特化した色差
マップ画像を作成してもよい。例えば、顔を楕円の相似
形であると仮定し、図２７に示すように顔中心を中心と
する任意の大きさの楕円曲線上の１点と隣り合う２点の
計３点の色差を平均化して、その座標の色差として再格
納することによりノイズの影響を抑制する。または、顔
輪郭が左右対称性をもつと仮定して、図２８に示すよう
に顔中心と口座標を結ぶ直線を対称軸とする２座標の色
差を平均化し、それぞれの色差としてもよい。Further, after detecting the color difference, a color difference map image more specialized for the face outline shape may be created by further utilizing the unique characteristic of the human face outline. For example, assuming that the face has a similar shape to an ellipse, as shown in FIG. 27, the color difference of a total of three points of two points adjacent to one point on an elliptic curve of an arbitrary size centered on the center of the face is averaged. Then, the color difference of the coordinates is stored again to suppress the influence of noise. Alternatively, assuming that the face outline has left-right symmetry, as shown in FIG. 28, the color differences of two coordinates with the straight line connecting the face center and the mouth coordinates as the axes of symmetry may be averaged to obtain the respective color differences.

【０１４６】上記のように、人物顔であることを制約条
件に用いることにより、より顎形状の特徴を表すことに
特化したエネルギー画像を作成することができ、鮮明な
輪郭線が現れていない入力画像やノイズの多い画像に対
しても、より安定な顎検出を行なうことができる。As described above, by using a person's face as a constraint, an energy image specializing in expressing jaw-shaped features can be created, and a clear contour line does not appear. More stable jaw detection can be performed on an input image or an image with much noise.

【０１４７】次に、ステップＳ２０４では、初期輪郭を
動的輪郭モデルにしたがって移動させ、輪郭線を抽出す
る。エネルギー関数Ｅとして、例えば、輪郭線のなめら
かさを表す内部エネルギーＥ１、輪郭を収縮しようとす
るエネルギーＥ２、物体輪郭を特徴づける画像エネルギ
ーＥ３の和Ｅ＝Ｅ１＋Ｅ２＋Ｅ３求め、このＥを最小化
するように輪郭を移動させる。Next, in step S204, the initial contour is moved according to the active contour model to extract the contour. As the energy function E, for example, the sum E = E1 + E2 + E3 of the internal energy E1 representing the smoothness of the contour, the energy E2 for shrinking the contour, and the image energy E3 characterizing the object contour is obtained, and this E is minimized. Move the contour.

【０１４８】ここで、画像エネルギーＥ３にはステップ
Ｓ２０３で作成した色差マップ画像を利用する。画像上
の任意の点Ｐ（ｘ，ｙ）における画像エネルギーＥ３
（Ｐ）は、Ｐに対応する色差マップ画像上の色差値をＤ
（Ｐ）としたとき式２０１から求める。Here, the color difference map image created in step S203 is used for the image energy E3. Image energy E3 at an arbitrary point P (x, y) on the image
(P) represents the color difference value on the color difference map image corresponding to P by D
When (P) is set, it is obtained from Expression 201.

【０１４９】Ｅ３（Ｐ）＝α×（ＭＡＸ（Ｄ）−Ｄ（Ｐ）） …（式２０１）ただし、ＭＡＸ（Ｄ）は色差マップ画像中の色差の最大
値で、係数αはエネルギー関数Ｅにおける画像エネルギ
ーの貢献度を意味する。式２０１に従えば、色差が小さ
いところほど画像エネルギーは大きくなり、輪郭は移動
しやすくなる。逆に色差が大きいところほど画像エネル
ギーは小さくなり輪郭は移動しにくくなる。E3 (P) = α × (MAX (D) −D (P)) (Equation 201) where MAX (D) is the maximum value of the color difference in the color difference map image, and the coefficient α is the energy function E Means the degree of contribution of image energy. According to Equation 201, the smaller the color difference is, the larger the image energy is, and the easier the contour is to move. Conversely, the larger the color difference, the smaller the image energy and the more difficult it is for the outline to move.

【０１５０】次に、ステップＳ２０５では、ステップＳ
２０４で求めた輪郭線をもとに距離関数を算出する。す
なわち、輪郭線を顔内部の該値の座標、例えば、顔中心
からの距離ｒと方向（角度）θからなる関数ｒ＝Ｌ
（θ）として表現する。この様子を示す模式図を図２９
に示す。Next, in step S205, step S205
A distance function is calculated based on the contour obtained in step 204. That is, a contour r is a function r = L consisting of the coordinates of the value inside the face, for example, the distance r from the face center and the direction (angle) θ.
(Θ). FIG. 29 is a schematic diagram showing this state.
Shown in

【０１５１】Ｌ（θ）は、θの値を単位角度ずつ変えた
ときのｒを求めてもよいし、例えば顎形状をより顕著に
表す範囲（顔中心からみて首のある方向）は、単位角度
を狭くし、他の方向に比べてより情報量を多くしてもよ
い。また、距離関数を例えば式２０２によって表される
フーリエ記述子として表現してもよい。For L (θ), r may be obtained when the value of θ is changed by a unit angle. For example, the range (the direction in which the neck is viewed from the center of the face) that more clearly expresses the jaw shape is expressed in units of The angle may be narrowed to increase the amount of information as compared to other directions. Further, the distance function may be represented as a Fourier descriptor represented by, for example, Expression 202.

【０１５２】ここで、Ａ（ｎ）が曲線形状を表す係数ｅｘｐ（）は自
然対数の底のベキ乗を表し、ｓは曲線上の距離を、Ｌは
閉曲線の全長を意味する。フーリエ記述子に関する詳細
は、例えば文献［５］に開示されている。[0152] Here, the coefficient exp () in which A (n) represents the curve shape represents the power of the base of natural logarithm, s represents the distance on the curve, and L represents the total length of the closed curve. Details regarding the Fourier descriptor are disclosed, for example, in reference [5].

【０１５３】次に、ステップＳ２０６では、ステップＳ
２０５で求めた距離関数の特徴を基準距離関数と比較す
ることにより、形状を判定する。ここで、基準距離関数
とは、基準となる顎形状の輪郭線から予め作られた距離
関数のことである。基準となる顎形状の輪郭線は、輪郭
線が予め手動で検出されている画像を、類似の顎形状、
例えば、ベース型、丸型等に分類し、分類ごとに手動検
出の輪郭線を平均化したものを利用すればよい。距離関
数の比較は、例えば、距離関数上の変曲点の位置、変曲
点数、変曲点間の傾きなどを、その距離関数のもつ特徴
と位置づけ、基準となる輪郭形状の距離関数の特徴とそ
れぞれ比較することにより行なう。比較を行う際には、
予め基準距離関数と位置が整合するように正規化を行っ
ておく必要がある。尚変曲点の位置や数、変曲点間の傾
きは、基準形状の場合は予め求めておき、その情報をメ
モリに格納しておき、ステップＳ２０５で求めた距離関
数の変曲点の情報と適宜比較すればよい。そして、比較
結果が最も近い基準距離関数をもつ形状を判定結果とし
て決定する。尚、距離関数の比較は、より単純に基準距
離関数との差分和を比較により行うこともできる。Next, in step S206, step S
The shape is determined by comparing the features of the distance function obtained in 205 with the reference distance function. Here, the reference distance function is a distance function created in advance from a jaw-shaped contour serving as a reference. The contour of the jaw shape serving as a reference is an image in which the contour is manually detected in advance, a similar jaw shape,
For example, what is necessary is to classify into a base type, a round type, and the like, and use an average of contours detected manually for each classification. The comparison of the distance functions includes, for example, positioning the position of the inflection point on the distance function, the number of inflection points, the slope between the inflection points, and the like as the characteristics of the distance function, and the characteristics of the distance function of the reference contour shape. By comparing with When making a comparison,
It is necessary to normalize in advance so that the position matches the reference distance function. The position and number of inflection points and the inclination between inflection points are obtained in advance for the reference shape, the information is stored in a memory, and the information of the inflection point of the distance function obtained in step S205 is obtained. May be compared as appropriate. Then, the shape having the reference distance function closest to the comparison result is determined as the determination result. Note that the comparison of the distance functions can be performed simply by comparing the sum of the differences with the reference distance function.

【０１５４】図３０はこの様子を模式的に表した図であ
る。図３０中のｚは基準関距離関数との差を示してい
る。基準距離関数をＢ（θ）とした時、差分和Ｚ１は式
２０３により与えられる。FIG. 30 is a diagram schematically showing this state. Z in FIG. 30 indicates a difference from the reference function. When the reference distance function is B (θ), the difference sum Z1 is given by Expression 203.

【０１５５】すなわち、Ｚ１が最も最小となるＢ（θ）をもつ形状を
類似形状として決定すればよい。この方法の場合は、θ
の範囲分のＢ（θ）を基準形状分メモリに用意しておく
必要があるが、より詳細な形状の分類・判定を簡単に行
うことができる。[0155] That is, the shape having B (θ) that minimizes Z1 may be determined as a similar shape. In this method, θ
It is necessary to prepare B (θ) for the reference shape in the memory for the reference shape, but it is possible to easily perform more detailed shape classification and judgment.

【０１５６】また、平面上の曲線を周波数領域で記述す
る手法、例えば、フーリエ記述子を用いて距離関数を表
現すれば、これにより算出されるフーリエ係数をその距
離関数のもつ特徴として位置付けることができ、基準と
なる輪郭形状の距離関数の係数と比較することにより、
上記と同様に形状判定を行なうことができる。If a distance function is expressed using a method of describing a curve on a plane in the frequency domain, for example, using a Fourier descriptor, the Fourier coefficient calculated by this method can be positioned as a feature of the distance function. By comparing with the distance function coefficient of the reference contour shape,
Shape determination can be performed in the same manner as described above.

【０１５７】基準形状のフーリエ記述子の係数をＡｂ
（ｎ）とした時、対象距離関数との差分Ｚ２を次式２０
４により求め、Ｚ２が最も最小となるＡｂ（ｎ）をもつ
形状を類似形状として決定する。Let Ab be the coefficient of the Fourier descriptor of the reference shape.
When (n) is set, the difference Z2 from the target distance function is expressed by the following equation (20).
4, the shape having Ab (n) in which Z2 is the minimum is determined as a similar shape.

【０１５８】一般にフーリエ係数の低次の項にはおおまかな曲線形
状、高次の項にはより詳細な曲線形状が反映されてい
る。したがって、低次の項の比較、すなわち式２０４に
おけるｎの範囲を小さくしてＺ２を求めることにより、
ノイズや個人差などの影響をなるべく排除した判定結果
を得ることが可能である。[0158] Generally, a low-order term of the Fourier coefficient reflects a rough curve shape, and a high-order term reflects a more detailed curve shape. Therefore, by comparing low-order terms, that is, by finding Z2 by reducing the range of n in Equation 204,
It is possible to obtain a determination result in which influences such as noise and individual differences are eliminated as much as possible.

【０１５９】以上の動作により、撮影条件が悪く、あま
り明確な輪郭線が現れていない画像やノイズの多い画像
に対しても、より安定な顎検出を行ない、ノイズや個人
差などの影響を出来るだけ排除した顎形状判定を行なう
ことが可能である。By the above operation, more stable jaw detection can be performed even on an image in which the photographing conditions are poor, and an image in which a clear contour line does not appear or an image with much noise can be obtained, and the influence of noise and individual differences can be obtained. It is possible to judge the jaw shape which is excluded.

【０１６０】したがって、本実施例の特徴をまとめると
次のようになる。（１１）請求項１１の画像処理装置は、特徴抽出手段が
位置指定手段で指定された一つ以上の位置情報に基づ
き、顎の輪郭特徴を検出し、その形状を判定する輪郭認
識手段を備えてなることを特徴とする。つまり、位置指
定手段から得られた１つ以上の顔を特徴付ける情報か
ら、顎輪郭の特徴をより顕著に表す特徴画像を作成し、
その画像を画像エネルギーとして利用する動的輪郭モデ
ルにより輪郭を検出することを特徴とし、また、その検
出輪郭線を顔の該値の部分からの距離と方向からなる距
離関数として表現し、その距離関数の特徴を求めて基準
特徴を比較することにより、顎輪郭形状を判定すること
を特徴とする。尚、この輪郭認識手段は図示されていな
い顔部品認識手段に備えられているため、顔部品認識手
段と同様に、演算装置１３の中もしくは特徴量抽出部１
０の中でその機能が実現されている。Therefore, the features of the present embodiment are summarized as follows. (11) An image processing apparatus according to claim 11, further comprising a contour recognizing means for detecting a jaw contour feature based on one or more pieces of position information designated by the position designating means and judging its shape. It is characterized by becoming. In other words, a feature image that represents the features of the jaw contour more prominently is created from the information that characterizes one or more faces obtained from the position specifying unit,
The method is characterized in that a contour is detected by a dynamic contour model using the image as image energy, and the detected contour is expressed as a distance function consisting of a distance and a direction from the value portion of the face. It is characterized in that the jaw contour shape is determined by calculating the feature of the function and comparing the reference feature. This contour recognition means is provided in a face part recognition means (not shown).
0 implements that function.

【０１６１】上記（１１）の構成によれば、画像処理装
置の操作者は、最初に位置指定手段により、入力画像中
に含まれている人物の顔の中心を指定する。この顔中心
は、直接指定してもよいし、他の顔特徴の指定、例え
ば、両目、口の座標から推定してもよい。次に人物の顔
を含むような初期輪郭座標列を求める。次いで、顔中心
座標と初期輪郭上の各座標を結ぶ直線上の隣り合う画素
間の色差を算出し、対象画素間の座標中点を座標値と
し、算出した色差を画素値にもつ画像（以降、色差マッ
プ画像と呼ぶ）を作成する。次いで、この色差マップ画
像を、画像エネルギーとする動的輪郭モデルを用いて顎
輪郭線を検出する。次いで、得られた輪郭線を顔内部の
該値の座標、例えば、顔中心からの距離と方向（角度）
からなる関数（以降距離関数と呼ぶ）として表現する。
次いで、この距離関数の特徴を、基準となる輪郭形状の
距離関数の特徴と比較し、最も特徴が近い距離関数をも
つ輪郭形状を、入力画像の顎形状として判定する。According to the configuration (11), the operator of the image processing apparatus first specifies the center of the face of the person included in the input image by the position specifying means. The face center may be specified directly, or may be estimated from the specification of other facial features, for example, the coordinates of both eyes and mouth. Next, an initial outline coordinate sequence including the face of the person is obtained. Next, a color difference between adjacent pixels on a straight line connecting the face center coordinates and each coordinate on the initial contour is calculated, a coordinate middle point between the target pixels is set as a coordinate value, and an image having the calculated color difference as a pixel value (hereinafter, referred to as an image) , A color difference map image). Next, a jaw contour line is detected using an active contour model using this color difference map image as image energy. Next, the obtained contour is converted into the coordinates of the value inside the face, for example, the distance and direction (angle) from the face center.
(Hereinafter referred to as a distance function).
Next, the feature of the distance function is compared with the feature of the distance function of the reference contour shape, and the contour shape having the closest distance function is determined as the jaw shape of the input image.

【０１６２】ここで、色差マップ画像作成には、対象画
像人物顔であることを利用して精度を高めてもよい。例
えば、色差を求める際には、肌色とそれ以外の色を区別
して求めてもよい。すなわち、肌色に分類される画素同
士の色差には、色差の検出精度を低くすることにより、
ノイズやしわの影響が色差マップ画像に反映されにくく
することができる。逆に、首と顎の境目は同じ肌色であ
ることが多く、色差が出にくいため、中心から首方向へ
の直線上の色差検出時には、検出精度を上げるようにし
てもよい。尚、首の位置は、例えば口の座標が該値であ
るならば、方向を推定することが出来る。Here, the accuracy of the color difference map image creation may be enhanced by utilizing the fact that the image is a human face of the target image. For example, when obtaining the color difference, the skin color and the other colors may be determined separately. In other words, the color difference between pixels classified as skin color is reduced by lowering the color difference detection accuracy.
The effects of noise and wrinkles can be made less likely to be reflected in the color difference map image. Conversely, the border between the neck and the chin often has the same flesh color, and it is difficult to produce a color difference. Therefore, when detecting a color difference on a straight line from the center to the neck, the detection accuracy may be increased. Note that the direction of the neck position can be estimated if the coordinates of the mouth are the values.

【０１６３】また、上記により色差マップ画像を作成し
た後に、例えば顔輪郭として楕円を仮定することによ
り、顔中心座標を中心とする楕円座標上にある画素値
（＝色差）とその両隣の画素値を平均化し、その画素値
とする。あるいは、顔輪郭以外の他の特徴が別途判明し
ている場合、例えば、口の中心座標が該値であるなら
ば、口の中心座標と顔中心を結ぶ直線を対称軸にもつ２
画素の画素値を平均化して、その画素値としてもよい。
これにより、顎形状の特徴を加味したエネルギー画像を
作成することができ、鮮明な輪郭線が現れていない入力
画像やノイズの多い画像に対しても、より安定な顎検出
を行なうことができる。After the color difference map image is created as described above, for example, by assuming an ellipse as the face outline, the pixel value (= color difference) on the ellipse coordinate centered on the face center coordinate and the pixel values on both sides thereof Are averaged to obtain the pixel value. Alternatively, when other features other than the face outline are known separately, for example, if the center coordinates of the mouth are the values, a line having the straight line connecting the center coordinates of the mouth and the center of the face on the symmetry axis is used.
The pixel values of the pixels may be averaged and used as the pixel value.
This makes it possible to create an energy image in which the features of the jaw shape are taken into account, and it is possible to perform more stable jaw detection even on an input image in which a clear outline does not appear or an image with much noise.

【０１６４】また、輪郭線から距離関数を作成する際に
も、人物顔輪郭独自の特徴性を利用することにより、ノ
イズや照明による影響を出来るだけ排除し、顎の特徴を
より顕著に表すように距離関数を修正することができ
る。例えば色差マップ作成時と同じように、楕円や対称
性などの顔の形状に基づき平均化等の距離関数の修正を
行なうことができる。Also, when creating a distance function from a contour line, the characteristic of the jaw can be represented more prominently by using the unique characteristics of the human face contour to minimize the influence of noise and lighting. To modify the distance function. For example, a distance function such as averaging can be corrected based on the shape of a face such as an ellipse or symmetry as in the case of creating a color difference map.

【０１６５】次に、距離関数の比較は、距離関数の変曲
点の位置、変曲点数、変曲点間の傾きなどその距離関数
のもつ特徴と位置づけ、基準となる輪郭形状の距離関数
の特徴とそれぞれ比較することにより行なう。そして、
最も類似している基準距離関数を有する基準形状を該当
する輪郭形状として判定する。Next, the distance function is compared with those of the distance function, such as the position of the inflection point of the distance function, the number of inflection points, and the slope between the inflection points. This is done by comparing each with the feature. And
The reference shape having the most similar reference distance function is determined as the corresponding contour shape.

【０１６６】また、平面上の曲線を周波数領域で記述す
る手法、例えば、フーリエ記述子を用いて距離関数を表
現すれば、これにより算出されるフーリエ係数をその距
離関数のもつ特徴として位置付けることができ、基準と
なる輪郭形状の距離関数の係数と比較することにより、
上記と同様に形状判定を行なうことができる。If a distance function is expressed using a method of describing a curve on a plane in the frequency domain, for example, using a Fourier descriptor, the Fourier coefficient calculated by this method can be positioned as a feature of the distance function. By comparing with the distance function coefficient of the reference contour shape,
Shape determination can be performed in the same manner as described above.

【０１６７】比較対象となる基準距離関数の特徴は、距
離関数を予め正規化して表としてメモリに格納しておい
てもよいし、予め必要となる正規化した変曲点の位置等
の情報だけを格納しておいてもよい。フーリエ記述子を
用いる場合は、必要な次数の係数を格納しておけばよ
い。これらの手法では、テンプレートマッチングに比べ
て、比較対象となる基準形状を辞書画像としてもつ必要
がなく、メモリコストや処理速度の面で有利となる。The feature of the reference distance function to be compared may be that the distance function may be normalized in advance and stored in a memory as a table, or only the information such as the position of the inflection point that is required in advance is normalized. May be stored. When a Fourier descriptor is used, a coefficient of a required order may be stored. These methods do not need to have a reference shape to be compared as a dictionary image as compared with template matching, which is advantageous in terms of memory cost and processing speed.

【０１６８】また、フーリエ記述子を用いる場合、フー
リエ係数の低次の項にはおおまかな曲線形状、高次の項
にはより詳細な曲線形状が反映されていることを利用
し、まず低次の項の比較を行なうことにより、ノイズや
個人差などの影響をなるべく排除した判定結果を得るこ
とが可能である。［第９の実施例］（請求項１２から１７の発明）本実施例では、図３１乃至図３５を用いて説明する。図
３１は本実施例の画像処理装置の構成を示すブロック図
であり、図３２は図３１の画像合成装置の処理を示すフ
ローチャートであり、図３３は髪色抽出に関する説明図
であり、図３４は前髪分類に関する説明図であり、図３
５は後髪分類に関する説明図である。以下に、図３２の
処理フローに従って、図３３，図３４，図３５の説明図
を参照しながら、各ステップについて詳細に説明する。When the Fourier descriptor is used, the fact that the low-order term of the Fourier coefficient reflects a rough curve shape and the high-order term reflects a more detailed curve shape is used. By comparing the terms (1) and (2), it is possible to obtain a determination result in which influences such as noise and individual differences are eliminated as much as possible. Ninth Embodiment (Inventions of Claims 12 to 17) This embodiment will be described with reference to FIGS. FIG. 31 is a block diagram showing the configuration of the image processing apparatus of the present embodiment, FIG. 32 is a flowchart showing the processing of the image synthesizing apparatus of FIG. 31, FIG. 33 is an explanatory diagram relating to hair color extraction, and FIG. FIG. 3 is an explanatory diagram related to bangs classification, and FIG.
FIG. 5 is an explanatory diagram relating to back hair classification. Hereinafter, each step will be described in detail according to the processing flow of FIG. 32 and with reference to the explanatory diagrams of FIGS. 33, 34, and 35.

【０１６９】まず、入力手段３２１により、顔画像を入
力し、記憶手段３２２に記憶する（ステップＳ３４
１）。次に、位置指定手段３２４により、右目、左目、
口のおおまかな位置、及び、顔輪郭を入力し、記憶手段
３２２に格納する（ステップＳ３４２）。各手段は、記
憶手段３２２に格納された画像及び右目、左目、口、顔
輪郭の情報を参照し、演算手段３２３を用いて動作し、
前髪及び後髪の分類を行い、あるいは、髪部品を決定す
る。尚、ここでは、右目、左目、口のおおまかな位置を
入力するとしているが、必ずしもこれら３点が必要なわ
けではなく、例えば、これら３点の代わりに鼻と口の２
点のおおまかな位置を入力することも考えられる。First, a face image is input by the input means 321 and stored in the storage means 322 (step S34).
1). Next, the right eye, the left eye,
The approximate position of the mouth and the face outline are input and stored in the storage unit 322 (step S342). Each unit operates by using the arithmetic unit 323 with reference to the image stored in the storage unit 322 and information on the right eye, left eye, mouth, and face outline,
Classify bangs and back hair, or determine hair parts. Although the approximate positions of the right eye, the left eye, and the mouth are input here, these three points are not necessarily required. For example, instead of these three points, the nose and the mouth are replaced.
It is also conceivable to enter a rough position of the point.

【０１７０】上記右目、左目、口、顔輪郭の情報は、指
定されたものをそのまま用いてもよいが、上記第２の実
施例（請求項５）または第５の実施例（請求項８）で述
べたような方法により検出された位置を用いれば、より
精度を向上することができる。また、画像は入力された
ものをそのまま用いてもよいが、予め、上記検出された
右目、左目、口位置に基づいて、上記第３の実施例（請
求項６）で述べたような方法で、右目と左目とが水平ま
たはそれに近くなるように回転処理を行うことや、ロー
パスフィルタなどの画像処理を行うことなどにより、精
度を向上することができる。画像の回転処理を行う場合
は、上記右目、左目、口、顔輪郭についても、これと同
じ角度分の回転処理を行う。As the information on the right eye, left eye, mouth, and face outline, the designated information may be used as it is, but the second embodiment (claim 5) or the fifth embodiment (claim 8) The accuracy can be further improved by using the position detected by the method described above. Although the input image may be used as it is, the image may be used as it is, based on the detected right eye, left eye, and mouth position in advance by the method described in the third embodiment (claim 6). The accuracy can be improved by performing a rotation process so that the right and left eyes are horizontal or close to it, or performing image processing such as a low-pass filter. When performing image rotation processing, rotation processing for the same angle is performed on the right eye, left eye, mouth, and face contour.

【０１７１】髪色抽出手段３２６は、以下のようにして
髪色を抽出する（ステップＳ３４３）。この抽出方法
を、図３３を用いて説明する。尚、図３３に示すよう
に、以下の説明では、ｙ座標は、上方から下方に行くに
従って値が大きくなる向きに取っている。The hair color extracting means 326 extracts the hair color as follows (step S343). This extraction method will be described with reference to FIG. Note that, as shown in FIG. 33, in the following description, the y-coordinate is set in such a direction that the value increases from the upper side to the lower side.

【０１７２】まず、肌色抽出手段３２５により、右目、
左目、口の座標に基づいて、鼻付近の領域内の画素値を
用いて肌色を抽出する。これは、単純に平均値を求めて
もよいが、例えば、一旦平均値及び分散を求め、平均か
ら大きく外れている画素を除いて、再び平均及び分散を
求め直してもよい。尚、ここで、肌色抽出を行うこと
は、後述のように、髪色抽出及び髪特徴抽出に役立つ
が、必ずしも必要ではなく、肌色抽出手段は省略するこ
ともできる。First, the right eye,
Based on the coordinates of the left eye and the mouth, a skin color is extracted using pixel values in a region near the nose. In this case, the average value may be simply calculated. For example, the average value and the variance may be calculated once, and the average and the variance may be calculated again except for the pixels that are largely deviated from the average. Here, performing skin color extraction is useful for hair color extraction and hair feature extraction as described later, but is not always necessary, and the skin color extraction means can be omitted.

【０１７３】次に、上記右目、左目、口の座標を用い
て、頭頂高さｆｔ及び髪生え際高さｆｈの初期推定値ｆ
ｔ０，ｆｈ０を決定する。これは、例えば、右目と左目
とのｙ座標の平均値をｙ＿ｅｙｅ，口のｙ座標をｙ＿ｍ
ｏｕｔｈとすると、適当に定める係数ｋ＿ｆｔ，ｋ＿ｆ
ｈに対し、ｆｔ０＝ｙ＿ｅｙｅ−ｋ＿ｆｔ×（ｙ＿ｍｏｕｔｈ
−ｙ＿ｅｙｅ）ｆｈ０＝ｙ＿ｅｙｅ−ｋ＿ｆｈ×（ｙ＿ｍｏｕｔｈ
−ｙ＿ｅｙｅ）とすればよい。Next, using the coordinates of the right eye, left eye, and mouth, the initial estimated value f of the crown height ft and the hairline height fh is calculated.
t0 and fh0 are determined. For example, the average value of the y coordinate of the right eye and the left eye is y_eye, and the y coordinate of the mouth is y_m.
If out, coefficients k_ft, k_f appropriately determined
h, ft0 = y_eye-k_ft × (y_mouth
−y_eye) fh0 = y_eye−k_fh × (y_mouth
−y_eye).

【０１７４】次に、上記ｆｔ０，ｆｈ０に基づき、サン
プリング矩形ＡＢＦＥ及びＥＦＤＣを設定する。ここ
に、Ｅ，Ｆのｙ座標はｆｔ０，Ｃ，Ｄのｙ座標はｆｈ０
に等しく取り、Ａ，Ｂのｙ座標はｆｔ０−（ｆｈ０−ｆ
ｔ０）に等しく取る（ＡＥ＝ＥＣとなる）。また、Ａ，
Ｅ，Ｃのｘ座標は、右目（画像上では左側に来る）付近
か、または、少し右目より（画像上で）左に、Ｅ，Ｆ，
Ｄのｘ座標は、左目付近か、または、左目より少し（画
像上で）右に取るとよい。Next, sampling rectangles ABFE and EFDC are set based on the above ft0 and fh0. Here, the y coordinate of E and F is ft0, and the y coordinate of C and D is fh0.
And the y coordinate of A and B is ft0− (fh0−f
t0) (AE = EC). Also, A,
The x-coordinates of E and C are near the right eye (coming to the left on the image) or slightly to the left of the right eye (on the image).
The x coordinate of D may be near the left eye or slightly to the right (on the image) of the left eye.

【０１７５】次に、ＥＦの高さｆｔを適当な閾値ｆｔ＿
ｕｐ，ｆｔ＿ｄｏｗｎに対して、ｆｔ０−ｆｔ＿ｕｐ＜＝ｆｔ＜＝ｆｔ０＋ｆｔ
＿ｄｏｗｎの範囲内で上下に動かして探索を行い、矩形ＡＢＦＥ内
の画素値と矩形ＥＦＤＣ内の画素値との分離度が最大に
なる所を頭頂高さｆｔの推定値とする。この分離度は、
矩形ＡＢＦＥ内の画素値の平均値をＡ１，分散をＶ１，
矩形ＥＦＤＣ内の画素値の平均値をＡ２，分散をＶ２，
ＡＢＤＣ内の画素値の平均値をＡ３，分散をＶ３、矩形
ＡＢＦＥと矩形ＥＦＤＣとの面積比をＳ１：Ｓ２とした
とき、｛Ｓ１×（Ａ１−Ａ３）×（Ａ１−Ａ３）＋Ｓ２×（Ａ
２−Ａ３）×（Ａ２−Ａ３）｝／Ｖ３で計算される。尚、画像がカラー画像の場合は、画素値
を３次元のベクトルとみなして同様に計算すればよい。Next, the height ft of the EF is set to an appropriate threshold value ft_
For up, ft_down, ft0−ft_up << = ft << = ft0 + ft
The search is performed by moving up and down within the range of _down, and a point at which the degree of separation between the pixel value in the rectangle ABFE and the pixel value in the rectangle EFDC is maximized is set as the estimated value of the crown height ft. This degree of separation is
The average value of the pixel values in the rectangle ABFE is A1, the variance is V1,
The average value of the pixel values in the rectangular EFDC is A2, the variance is V2,
If the average value of the pixel values in the ABDC is A3, the variance is V3, and the area ratio between the rectangle ABFE and the rectangle EFDC is S1: S2, then: S1 × (A1-A3) × (A1-A3) + S2 × (A
2-A3) × (A2-A3)｝ / V3. If the image is a color image, the pixel value may be calculated assuming that the pixel value is a three-dimensional vector.

【０１７６】次に、矩形ＡＢＦＥ内で、背景色の抽出を
を行う。このとき、下辺ＥＦは、上記の探索により動か
した後の高さｆｔにある。これは、一旦平均値及び分散
を求め、平均から大きく外れている画素を除いて再び平
均及び分散を求め直すとよい。尚、このように背景色の
抽出を行うことは、後述のように、髪色抽出に役立つ
が、必ずしも必要ではない。Next, a background color is extracted in the rectangle ABFE. At this time, the lower side EF is at the height ft after being moved by the above search. In this case, the average value and the variance may be obtained once, and the average value and the variance may be obtained again except for the pixels that are largely deviated from the average. Note that extracting the background color in this way is useful for hair color extraction as described below, but is not always necessary.

【０１７７】さらに、矩形ＥＦＤＣ内で、髪色の抽出を
行う。このとき、上辺ＥＦは、上記の探索により動かし
た後の高さｆｔにある。これは、単純に平均値をとるこ
とも考えられるが、そうすると、髪画素以外の画素値が
平均計算に含まれてしまい、精度が低下すると考えられ
るので、例えば、以下の様に行うとよい。Further, hair color is extracted in the rectangular EFDC. At this time, the upper side EF is at the height ft after being moved by the above search. Although it is conceivable that the average value is simply calculated, pixel values other than the hair pixels are included in the average calculation, and it is considered that the accuracy is reduced. Therefore, for example, the average value may be performed as follows.

【０１７８】上記肌色の平均値及び分散、及び、上記背
景色の平均値及び分散を用いて、肌色に近い画素、及
び、背景色に近い画素を除いて平均及び分散を計算す
る。さらに、この髪色の平均及び分散を用いて、上記計
算で既に除いた画素、及び、平均から大きく外れている
画素を除いて再び平均及び分散を求め直す。このとき、
除かれた画素の数が多く、髪色として計算に使われる画
素（以下、「髪色画素」と呼ぶ）の数が、ある閾値ｎ＿
ｓｈより少ない場合は、髪が薄いため、髪色が安定に抽
出できていないものと考えられるので、ステップＳ３４
５の髪特徴抽出はスキップして、ステップＳ３４６の髪
分類へジャンプする（ステップＳ３４４）。この場合
は、髪分類手段３３５は、髪分類を「髪が薄い」とす
る。尚、肌色または背景色の一方または両方を抽出しな
い場合は、抽出を行わない方に関しては、これに近い画
素を除く処理を省略することにより、髪色の抽出は可能
ではあるが、精度が低下すると考えられる。Using the average and variance of the flesh color and the average and variance of the background color, the average and variance are calculated excluding the pixels close to the flesh color and the pixels close to the background color. Further, by using the average and the variance of the hair color, the average and the variance are calculated again except for the pixels which have already been removed in the above calculation and the pixels which are largely deviated from the average. At this time,
The number of removed pixels is large, and the number of pixels used for calculation as hair color (hereinafter referred to as “hair color pixel”) is equal to a certain threshold n_
If the value is less than sh, it is considered that the hair color has not been stably extracted because the hair is thin, so that step S34 is performed.
The hair characteristic extraction of No. 5 is skipped, and the process jumps to the hair classification of step S346 (step S344). In this case, the hair classification means 335 sets the hair classification to “light hair”. When one or both of the skin color and the background color are not extracted, the hair color can be extracted by omitting the process of removing the pixels that are not extracted, but the accuracy is reduced. It is thought that.

【０１７９】髪特徴抽出手段３２９は、前髪特徴抽出手
段３２７及び後髪特徴抽出手段３２８のうち一方または
両方から構成され、髪特徴を抽出する（ステップＳ３４
５）。The hair characteristic extracting means 329 comprises one or both of the front hair characteristic extracting means 327 and the back hair characteristic extracting means 328, and extracts hair characteristics (step S34).
5).

【０１８０】前髪特徴抽出手段３２７の動作例を以下に
説明する。An operation example of the bangs feature extracting means 327 will be described below.

【０１８１】上記髪色の平均値及び分散、及び、上記肌
色の平均値及び分散を用いて、画像内の各画素に関し、
髪色よりも肌色に近く、かつ、肌色の平均値から大きく
外れてはいない場合は、非髪画素、そうでない場合は、
髪画素というラベルを付ける。これによって、髪領域を
抽出することができる。尚、ステップＳ３４３において
肌色を抽出しない場合は、画像内の各画素に関し、髪色
の平均値から大きく外れてはいない場合は、髪画素、そ
うでない場合は、非髪画素とラベル付けすればよい。上
記髪領域は、それ自身、１つの前髪特徴であるとも考え
られるが、さらに、これを用いて、前髪を含むと思われ
る適当な位置に横１１メッシュ×縦７メッシュ程度のメ
ッシュを設定し、各メッシュ内の髪画素数を前髪特徴と
する（以下、この特徴を「前髪メッシュ特徴」と呼ぶ）
など、前髪特徴を抽出する。Using the average value and the variance of the hair color and the average value and the variance of the skin color, for each pixel in the image,
If the skin color is closer to the skin color than the hair color and it does not greatly deviate from the average value of the skin color, it is a non-hair pixel, otherwise,
Label it as a hair pixel. Thereby, a hair region can be extracted. If the skin color is not extracted in step S343, the respective pixels in the image may be labeled as hair pixels if they do not greatly deviate from the average value of the hair color, and otherwise may be labeled as non-hair pixels. . The hair region itself is considered to be one bangs feature itself, and further, by using this, a mesh of about 11 mesh width × 7 mesh length is set at an appropriate position that is considered to include the bangs, The number of hair pixels in each mesh is defined as a bangs feature (hereinafter, this feature is referred to as a "bangs mesh feature").
For example, bangs characteristics are extracted.

【０１８２】後髪特徴抽出手段３２８の動作例を以下に
説明する。An example of the operation of the back hair characteristic extracting means 328 will be described below.

【０１８３】上記髪色の平均値及び分散、及び、上記肌
色の平均値及び分散を用いて、画像内の各画素に関し、
肌色よりも髪色に近く、かつ、髪色の平均値から大きく
外れてはいない場合は、髪画素、そうでない場合は、非
髪画素というラベルを付ける。これによって、髪領域を
抽出することができる。尚、ステップＳ３４３において
肌色特徴を抽出しない場合は、画像内の各画素に関し、
髪色の平均値から大きく外れてはいない場合は、髪画
素、そうでない場合は、非髪画素とラベル付けすればよ
い。上記髪領域は、それ自身、１つの後髪特徴であると
も考えられるが、さらに、これを用いて、例えば、いわ
ゆる「セミロング」を含む長髪系の髪の場合には髪が相
当量あり、そうでなく短髪系の髪の場合には髪があまり
ない、と思われるような矩形領域を、顔の左右両側に設
定し、それらの矩形内の髪画素数を取る（以下、この特
徴を「後髪矩形特徴」と呼ぶ）など、後髪特徴を抽出す
る。Using the average value and the variance of the hair color and the average value and the variance of the skin color, for each pixel in the image,
If the color is closer to the hair color than the skin color and does not greatly deviate from the average value of the hair color, the hair pixel is labeled, and if not, the label is a non-hair pixel. Thereby, a hair region can be extracted. When the skin color feature is not extracted in step S343, for each pixel in the image,
If it does not deviate significantly from the average value of the hair color, it may be labeled as a hair pixel, otherwise, it may be labeled as a non-hair pixel. The hair area is itself considered to be a back hair feature, but it is further used to provide a considerable amount of hair in the case of long hair, including, for example, so-called "semi-long" hair. In the case of short-haired hair, instead, a rectangular area that seems to have little hair is set on both the left and right sides of the face, and the number of hair pixels in those rectangles is taken (hereinafter, this feature is referred to as A back hair feature is extracted.

【０１８４】尚、上記では、前髪特徴と後髪特徴とを別
々のものとして各々の抽出方法を述べたが、両者を一体
のものと考えて、髪特徴を抽出してもよい。例えば、髪
画素領域は、前髪、後髪共通として、例えば、画像内の
各画素に関し、髪色の平均値から大きく外れてはいない
場合は、髪画素、そうでない場合は、非髪画素とラベル
付けすることにより作成することも考えられる。In the above description, the extraction method has been described assuming that the forehead feature and the back hair feature are separate, but the hair feature may be extracted by considering both as one. For example, the hair pixel area is common to the bangs and the back hair, and for example, for each pixel in the image, if it does not greatly deviate from the average value of the hair color, it is labeled as a hair pixel, otherwise, it is labeled as a non-hair pixel. It can also be created by attaching.

【０１８５】請求項１４の発明では、さらに、髪輪郭抽
出手段３３２を用いて、前髪輪郭または後髪輪郭のうち
一方または両方を抽出する。髪輪郭抽出手段３３２は、
前髪輪郭抽出手段３３０及び後髪輪郭抽出手段３３１の
うち一方または両方から構成される。In the fourteenth aspect, one or both of the front hair contour and the rear hair contour are extracted by using the hair contour extracting means 332. The hair contour extracting means 332
It comprises one or both of the forehead contour extraction means 330 and the back hair contour extraction means 331.

【０１８６】前髪輪郭抽出手段３３０は、上記前髪特徴
抽出手段３２７により抽出された髪領域を用いて、以下
の様に動作する。The forelock contour extracting means 330 operates as follows using the hair region extracted by the forehead characteristic extracting means 327.

【０１８７】右目、左目の中点から、画像上を真上方向
に画像端まで走査し、最も長い髪画素のランを検出す
る。このランの最下点を始点とし、（画像上の）左方向
に輪郭を追跡し、右目及び左目の座標に基づいて定め
る、あるｙ座標の閾値より下方（ｙ座標値は大）、か
つ、同様に定める、あるｘ座標の閾値より左方に来た時
に、追跡を終える。次に、上記ランの最下点を始点と
し、（画像上の）右方向に輪郭を追跡し、右目及び左目
の座標に基づいて定める、あるｙ座標の閾値より下方
（ｙ座標値は大）、かつ、同様に定める、あるｘ座標の
閾値より右方に来た時に、追跡を終える。さらに、上記
左側の輪郭線と右側の輪郭線とをつなぎ合わせ、前髪輪
郭とする。From the middle point of the right and left eyes, the image is scanned right above the image to the end of the image, and the longest hair pixel run is detected. With the lowest point of this run as the starting point, the contour is traced in the left direction (on the image), below a threshold value of a certain y coordinate (the y coordinate value is large) determined based on the coordinates of the right and left eyes, and The tracking is terminated when it comes to the left of a threshold value of a certain x coordinate similarly determined. Next, with the lowest point of the run as a starting point, the contour is traced in the right direction (on the image), and is lower than a certain y-coordinate threshold value determined based on the coordinates of the right and left eyes (the y-coordinate value is large). And, when it comes to the right of a threshold value of a certain x coordinate similarly determined, the tracking is completed. Further, the left contour and the right contour are joined to form a bangs contour.

【０１８８】後髪輪郭抽出手段３３１は、上記後髪特徴
抽出手段３２８により抽出された髪領域を用いて、以下
の様に動作する。The back hair contour extracting means 331 operates as follows using the hair region extracted by the back hair feature extracting means 328.

【０１８９】右目、左目の中点から、画像上を真上方向
に画像端まで走査し、最も長い髪画素のランを検出す
る。このランの最上点を始点とし、（画像上の）左方向
に輪郭を追跡し、右目及び左目の座標に基づいて定め
る、あるｙ座標の閾値より下方（ｙ座標値は大）、か
つ、同様に定める、あるｘ座標の閾値より左方に来た時
に、追跡を終える。次に、上記ランの最上点を始点と
し、（画像上の）右方向に輪郭を追跡し、右目及び左目
座標に基づいて定める、あるｙ座標の閾値より下方（ｙ
座標値は大）、かつ、同様に定める、あるｘ座標の閾値
より右方に来た時に、追跡を終える。さらに、上記左側
の輪郭線と右側の輪郭線とをつなぎ合わせ、後髪輪郭と
する。From the middle point of the right and left eyes, the image is scanned right above the image to the end of the image, and the longest hair pixel run is detected. With the top point of this run as the starting point, the contour is traced to the left (on the image), below a certain y-coordinate threshold value determined based on the coordinates of the right and left eyes (the y-coordinate value is large), and similarly When it comes to the left of a threshold value of a certain x coordinate, the tracking ends. Next, with the top point of the run as a starting point, the contour is traced in the right direction (on the image), and a threshold value (y) below a certain y coordinate determined based on the right eye and left eye coordinates (y
The tracking is finished when the coordinate value is large), and when it comes to the right of a threshold value of a certain x coordinate similarly determined. Further, the left contour and the right contour are joined to form a back hair contour.

【０１９０】尚、上記では前髪輪郭と後髪輪郭を別々の
ものとして各々の抽出方法を述べたが、両者を一体のも
のとみなし、髪領域を用いてその輪郭を追跡することに
より、髪輪郭を抽出してもよい。In the above, each extraction method has been described assuming that the forehead outline and the back hair outline are separate. However, the hair outline is tracked by assuming the two as one and tracking the outline using the hair region. May be extracted.

【０１９１】請求項１４の発明では、髪特徴抽出手段３
２９は、上記抽出された髪輪郭を用いて、さらに別の髪
特徴を抽出してもよい。例えば、前髪輪郭の最上点を検
出し、前髪特徴とすることや、後髪輪郭を走査し、髪の
内側へのへこみが最大になる点を検出し、後髪特徴とす
ることなどが考えられる。According to the fourteenth aspect, the hair characteristic extracting means 3
29 may use the extracted hair contour to extract another hair feature. For example, it is conceivable to detect the top point of the forehead contour and use it as the forehead feature, or to scan the back hair contour and detect the point where the dent inside the hair becomes the maximum, and use it as the back hair feature. .

【０１９２】髪分類手段３３５は、前髪分類手段３３３
及び後髪分類手段３３４のうち一方または両方から構成
され、上記髪特徴抽出手段３２９で求められた髪特徴、
及び、請求項１４の発明では、髪輪郭抽出手段３３２で
求められた髪輪郭を用いて、髪型を分類する（ステップ
Ｓ３４６）。尚、前髪と後髪とを区別あるいは分離せ
ず、一体とみなして髪分類を行うことも考えられる。[0192] The hair classification means 335 is a bangs classification means 333.
And one or both of the back hair classification means 334, and the hair features obtained by the hair feature extraction means 329,
According to the fourteenth aspect, the hair style is classified using the hair contour obtained by the hair contour extracting means 332 (step S346). It is also conceivable to classify the hair by considering the bangs and back hair as one, without distinguishing or separating them.

【０１９３】前髪分類手段３３３の動作例を以下に説明
する。An example of the operation of the bangs classification means 333 will be described below.

【０１９４】上記抽出された前髪メッシュ特徴を用い
て、髪画素数がある閾値ｃ２以上のメッシュの数が、あ
る閾値ｍ＿ｆｃ以上ある場合は、「おかっぱ」へ分類す
る。ここでいう「おかっぱ」とは、額部分が大部分髪の
毛に覆われたような髪型のことである。また、髪画素数
がｃ２未満で、かつ、ｃ１＜ｃ２なる別の閾値ｃ１以上
であるメッシュの数が、ある閾値ｍ＿ｆｍ以上ある場合
は、「すだれ」へ分類する。ここでいう「すだれ」と
は、額部分に相当量の髪の毛がかぶさっているが、髪の
毛のすきまから相当量の肌が見えているような髪型のこ
とである。Using the extracted bangs mesh feature, if the number of meshes whose number of hair pixels is equal to or greater than a certain threshold value c2 is equal to or greater than a certain threshold value m_fc, the hair is classified as “Oppappa”. Here, “okap” is a hairstyle in which the forehead is mostly covered with hair. If the number of meshes whose number of hair pixels is less than c2 and is equal to or greater than another threshold value c1 satisfying c1 <c2 is equal to or greater than a certain threshold value m_fm, the mesh is classified as "blind." The term “blind” as used herein refers to a hairstyle in which a considerable amount of hair is covered on the forehead, but a considerable amount of skin is visible from the gap in the hair.

【０１９５】請求項１４の発明では、さらに、上記髪輪
郭を特徴を用いて、例えば、以下の様に、髪型を分類す
る（図３４参照）。According to the fourteenth aspect of the present invention, the hair style is further classified using the features of the hair contour as follows (see FIG. 34).

【０１９６】上記で「おかっぱ」へも「すだれ」へも分
類されなかった場合に、まず、前髪輪郭の上部で、輪郭
が髪領域の側（上方）へどの程度へこんでいるかを調べ
て、へこみがあまりない場合は「分け目なし」へ、そう
でない場合は「分け目あり」へ大分類する。In the case where neither “okappa” nor “blind” is described above, first, at the upper part of the bangs contour, it is checked how much the contour is depressed to the side (upward) of the hair area. If there is not much, it is roughly classified into "without division", and if not, it is classified into "with division".

【０１９７】次に、「分け目なし」に関しては、前髪輪
郭の上部の直線度を調べて、直線度が大きい（直線に近
い）場合は「四角型」（図３４（ａ）参照）、そうでな
い場合は「丸型」（図３４（ｂ）参照）へ分類する。Next, as for “no division”, the linearity at the top of the bangs contour is examined. If the linearity is large (close to a straight line), it is “square” (see FIG. 34 (a)). The case is classified into "round" (see FIG. 34B).

【０１９８】さらに、「分け目あり」に関しては、上記
検出された前髪輪郭の最上点のｘ座標（以下ｘ＿ｄｆと
する）を用いて、適当に定めたｄｆ１＜ｄｆ２＜ｄｆ３
＜ｄｆ４なる閾値ｄｆ１，ｄｆ２，ｄｆ３，ｄｆ４に対
し、ｘ＿ｄｆ＜ｄｆ１の場合は「一九分け」（図３４
（ｃ）参照）、ｄｆ１＜＝ｘ＿ｄｆ＜ｄｆ２の場合は
「三七分け」、ｄｆ２＜＝ｘ＿ｄｆ＜＝ｄｆ３の場合は
「真中分け」（図３４（ｄ）参照）、ｄｆ３＜ｘ＿ｄｆ
＜＝ｄｆ４の場合は「七三分け」、ｄｆ４＜ｘ＿ｄｆの
場合は「九一分け」へ分類する。Further, with respect to “with a split”, an appropriately determined df1 <df2 <df3 is determined by using the x-coordinate (hereinafter referred to as x_df) of the uppermost point of the detected bangs contour.
In contrast to thresholds df1, df2, df3, and df4 of <df4, when x_df <df1, "one ninety division" (FIG. 34)
(See FIG. 34 (c)), df1 <= x_df <df2, "37 divisions", df2 <= x_df <= df3, "center division" (see FIG. 34 (d)), df3 <x_df
In the case of <= df4, it is classified as "divided into seven," and in the case of df4 <x_df, it is classified as "divided into nineteen."

【０１９９】後髪分類手段３３４の動作例を以下に説明
する。An example of the operation of the back hair classification means 334 will be described below.

【０２００】上記抽出された後髪矩形特徴を用いて、髪
画素数がある閾値ｎ＿ｂ以上ある場合は「長髪系」、そ
うでない場合は「短髪系」へ分類する。Using the extracted back hair rectangle features, if the number of hair pixels is equal to or greater than a certain threshold value n_b, the hair is classified as “long hair”, otherwise, it is classified as “short hair”.

【０２０１】請求項１４の発明では、さらに、上記髪輪
郭を特徴を用いて、例えば、以下の様に髪型を分類する
（図３５参照）。According to the fourteenth aspect of the present invention, the hair style is further classified as follows using the features of the hair contour (see FIG. 35).

【０２０２】上記「長髪系」「短髪系」を大分類とし、
さらに、上記検出された後髪輪郭の髪領域の内側へのへ
こみが最大になる点のｘ座標（以下ｘ＿ｄｂとする）を
用いて、適当に定めたｄｂ１＜ｄｂ２＜ｄｂ３＜ｄｂ４
なる閾値ｄｂ１，ｄｂ２，ｄｂ３，ｄｂ４に対し、ｘ＿
ｄｂ＜ｄｂ１の場合は「一九分け」、ｄｂ１＜＝ｘ＿ｄ
ｂ＜ｄｂ２の場合は「三七分け」、ｄｂ２＜＝ｘ＿ｄｂ
＜＝ｄｂ３の場合は「真中分け」（図３５（ａ）参
照）、ｄｂ３＜ｘ＿ｄｂ＜＝ｄｂ４の場合は「七三分
け」、ｄｂ４＜ｘ＿ｄｂの場合は「九一分け」へ小分類
する。ただし、上記検出された後髪輪郭の髪の内側への
へこみが、最大になる点においてもそれほど大きくない
場合は、「分け目検出されず」（図３５（ｂ）参照）へ
小分類する。The above “long hair type” and “short hair type” are broadly classified.
Furthermore, using the x-coordinate (hereinafter, referred to as x_db) of the point where the inward indentation of the detected back hair contour inside the hair region becomes maximum, db1 <db2 <db3 <db4 appropriately determined.
For the thresholds db1, db2, db3, db4, x_
In the case of db <db1, "one ninety", db1 <= x_d
In the case of b <db2, “divide by three”, db2 <= x_db
In the case of <= db3, it is subdivided into "center division" (see FIG. 35 (a)), in the case of db3 <x_db <= db4, it is subdivided into "73 divisions", and in the case of db4 <x_db, it is subdivided into "91 divisions". However, if the detected indentation of the back hair contour into the inside of the hair is not so large even at the point where it becomes the maximum, it is subdivided into "No split detected" (see FIG. 35 (b)).

【０２０３】請求項１５の発明では、顔輪郭特徴抽出手
段３３６を備え、髪分類手段３３５は、髪特徴に加えて
この顔輪郭特徴を用いて、以下のように髪を分類する。[0203] According to the fifteenth aspect, the face contour feature extracting means 336 is provided, and the hair classifying means 335 classifies the hair as follows using the face contour features in addition to the hair features.

【０２０４】まず、顔輪郭特徴抽出手段３３６は、顔輪
郭点のｙ座標の最小値を求め、これを顔輪郭線の最上部
の高さとする。First, the face outline feature extracting means 336 obtains the minimum value of the y-coordinate of the face outline point, and sets this as the height of the top of the face outline.

【０２０５】次に、髪分類手段３３５は、例えば、上記
顔輪郭線の最上部の高さ（以下、ｙ＿ｆｔとする）と、
右目と左目とのｙ座標の平均値ｙ＿ｅｙｅとを用いて、
ある閾値ｈｆに対し、ｙ＿ｅｙｅ−ｙ＿ｆｔ＜ｈｆであ
る場合は、ステップＳ３４４において、髪色画素数が少
ない場合でも、「髪が薄い」への分類は行わず、ステッ
プＳ３４５へ進み、上記髪特徴を抽出した上、ステップ
Ｓ３４６でこれらを用いて「髪が薄い」以外の適当なカ
テゴリへの分類を行う。Next, the hair classifying means 335 determines, for example, the height of the uppermost part of the face contour (hereinafter referred to as y_ft),
Using the average value y_eye of the y coordinate of the right eye and the left eye,
If y_eye-y_ft <hf with respect to a certain threshold value hf, in step S344, even if the number of hair color pixels is small, the classification is not performed as “light hair”, and the process proceeds to step S345, where the hair characteristics are determined. After the extraction, in step S346, these are used to perform classification into an appropriate category other than "thin hair".

【０２０６】前髪部品決定手段３３７及び後髪部品決定
手段３３８は、以下の様に動作し、髪部品を決定する
（ステップＳ３４７）。尚、前髪部品決定手段３３７と
後髪部品決定手段３３８とについては、必ずしもその両
方が存在する必要はなく、前髪部品あるいは後髪部品の
一方はユーザ指定により決定することや、前髪と後髪と
を区別あるいは分離せず、一体とみなした髪部品を用い
ることも考えられる。The fore-hair part determining means 337 and the back-hair part determining means 338 operate as follows to determine a hair part (step S347). It is not always necessary to have both of the bangs part deciding means 337 and the back hair parts deciding means 338. One of the bangs part and the back hair part can be determined by the user's specification, and the bangs and the back hair can be determined. It is also conceivable to use a hair part that is regarded as one without distinguishing or separating the hair parts.

【０２０７】前髪部品決定手段３３７は、上記前髪分類
手段３３３の分類結果、例えば、「おかっぱ」，「すだ
れ」，「四角型」，「丸型」，「一九分け」，「七三分
け」，「真中分け」，「七三分け」，「九一分け」に従
って、これらに対応して予め作成しておいた部品のうち
一つを決定し、出力する。このとき、部品の色を、ステ
ップＳ３４３で抽出した髪色に従って決定することも考
えられる。The bangs part deciding means 337 outputs the classification result of the bangs categorizing means 333, for example, "Okappa", "Blind", "Square type", "Round type", "Nineteenth part", "Seventh part". One of the parts created in advance corresponding to these is determined and output in accordance with ", middle division", "73 divisions", and "91 divisions". At this time, the color of the component may be determined according to the hair color extracted in step S343.

【０２０８】請求項１６の発明では、前髪分類手段３３
３の分類結果に加えて、後髪分類手段３３４の分類結果
も参照し、例えば、前髪分類が「四角型」、後髪分類が
「短髪系・九一分け」であれば、「四角型」ではなく、
「九一分け」に対応する前髪パターンを出力する。ある
いは、前髪部分ではそれほど明瞭な分けがないものの、
左側（画像上右側）から右側（画像上左側）に向かって
髪の毛が流れているような「四角型」とも「九一分け」
とも異なるパターンを用意して、これを出力することも
考えられる。In the sixteenth aspect, the bangs classification means 33
In addition to the classification result of No. 3, the classification result of the back hair classification means 334 is also referred to. For example, if the front hair classification is “square type” and the rear hair classification is “short hair / 91 class”, “square type” not,
The bangs pattern corresponding to "Nine parts" is output. Or, although there is not so clear division in the bangs,
"Square type" with hair flowing from the left side (right side on the image) to the right side (left side on the image)
It is also conceivable to prepare a different pattern and output this.

【０２０９】後髪部品決定手段３３８は、上記後髪分類
手段３３４の分類結果、例えば「短髪系・九一分け」，
「長髪系・真中分け」に従って、これらに対応して予め
作成しておいた部品の一つを決定し、出力する。[0209] The back hair part determination means 338 outputs the classification result of the back hair classification means 334, for example, "short hair type / one-part",
In accordance with the “long hair system / middle classification”, one of the parts created in advance corresponding to these is determined and output.

【０２１０】請求項１７の発明では、後髪分類手段３３
４の分類結果に加えて、前髪分類手段３３３の分類結果
も参照し、例えば、後髪分類が「長髪系・分け目検出さ
れず」、前髪分類が「九一分け」であれば、後髪パター
ンは「長髪系・九一分け」に対応するものを出力する。
あるいは、後髪分類が「長髪系・九一分け」、前髪分類
が「三七分け」であれば、前髪分類と後髪分類とで分け
目位置が食い違っており、前髪での分け目位置判定の方
が信頼度が高いと考えられるので、後髪パターンも「長
髪系・三七分け」に対応するものを出力する。According to the seventeenth aspect, the back hair classification means 33
In addition to the classification result of No. 4, the classification result of the bangs classification means 333 is also referred to. For example, if the back hair classification is “long hair type / partition is not detected” and the bangs classification is “Nine classification”, the back hair pattern Outputs the one corresponding to "Long hair style / 91".
Alternatively, if the back hair classification is "long hair / 91 divisions" and the bangs classification is "37 divisions", the division positions are different between the bangs classification and the back hair classification. Is considered to have a high degree of reliability, so that the back hair pattern corresponding to the "long hair type / 37 division" is output.

【０２１１】以上では、髪色抽出、髪特徴抽出、髪分
類、髪部品決定を行っているが、これらをすべて行うこ
とは必ずしも必要ではない。例えば、請求項１３の発明
で、ステップＳ３４３の髪色抽出までを行い、ステップ
Ｓ３４４〜ステップＳ３４７は省略して、髪部品の形状
の種類はユーザ指定により決定し、髪部品の色のみを、
抽出した髪色に従って決定することも考えられる。In the above, hair color extraction, hair feature extraction, hair classification, and hair part determination are performed, but it is not always necessary to perform all of them. For example, in the invention of claim 13, the processing up to the hair color extraction in step S343 is performed, steps S344 to S347 are omitted, the type of the shape of the hair part is determined by user designation, and only the color of the hair part is determined.
It is also conceivable to determine according to the extracted hair color.

【０２１２】また、髪分類手段と髪部品決定手段とを別
々の手段としているが、分類結果がすなわち髪部品種類
であるとみなすこともできることを考えれば、これらの
間の区分は必ずしもも明確なものではない。Although the hair classification means and the hair part determination means are separate means, considering that the classification result can be regarded as a hair part type, the division between them is not always clear. Not something.

【０２１３】さらに、髪部品を予め作成しておき、髪部
品の決定を行って出力するとしているが、髪分類の結果
に基づいて、予め用意しておいた１つまたは複数のパタ
ーンを適当に変形して出力することも考えられる。Further, hair parts are created in advance, and hair parts are determined and output. However, one or a plurality of patterns prepared in advance are appropriately applied based on the result of hair classification. It is also conceivable that the output is deformed.

【０２１４】尚、上記で説明した各手段である、肌色抽
出手段３２５と、髪認識手段が髪色を抽出する髪色抽出
手段３２６と、前髪部分の特徴を抽出する前髪特徴抽出
手段３２７と後髪部分の特徴を抽出する後髪特徴抽出手
段３２８とからなる髪部分の特徴を抽出する髪特徴抽出
手段３２９と、前髪の輪郭を抽出する前髪輪郭抽出手段
３３０と後髪の輪郭を抽出する後髪輪郭抽出手段３３１
とからなる髪部分の特徴を用いて髪の輪郭を抽出する髪
輪郭抽出手段３３２と、前髪の分類を行う前髪分類手段
３３３と後髪の分類を行う後髪分類手段３３４とからな
る髪の輪郭を用いて髪を分類する髪分類手段３３５と、
顔輪郭の特徴を抽出する顔輪郭特徴抽出手段３３６と、
顔の輪郭から前髪の部品を決定する前髪部品決定手段３
３７と後髪の部品を決定する後髪部品決定手段３３８
と、から髪認識手段が構成されているものとする。[0214] The skin color extracting means 325, the hair color extracting means 326 for extracting the hair color by the hair recognizing means, the bangs characteristic extracting means 327 for extracting the characteristics of the bangs, and the rear means. A hair feature extracting means 329 for extracting the features of the hair part, comprising a back hair feature extracting means 328 for extracting the features of the hair part, a forehead contour extracting means 330 for extracting the contour of the bangs, and after extracting the contour of the back hair Hair contour extraction means 331
A hair contour extraction means 332 for extracting a hair contour using the features of the hair part, which is composed of: a hair contour extraction means 332 for classifying bangs, and a hair contour classification means 334 for classifying back hair Hair classification means 335 for classifying hair using
A face contour feature extraction unit 336 for extracting features of the face contour;
Bangs part determining means 3 for determining bangs parts from the contour of the face
37 and back hair part determination means 338 for determining back hair parts
It is assumed that a hair recognizing means is configured from the above.

【０２１５】したがって、本実施例の特徴をまとめると
次のようになる。（１２）請求項１２の画像処理装置は、特徴抽出手段が
位置指定手段で指定された一つ以上の位置情報に基づ
き、頭頂高さと髪生え際高さとを推定し、髪領域を認識
する髪認識手段を備えてなることを特徴とする。（１３）請求項１３の画像処理装置は、髪認識手段が髪
色を抽出する髪色抽出手段を備えてなることを特徴とす
る。Therefore, the features of the present embodiment are summarized as follows. (12) In the image processing apparatus according to the twelfth aspect, the feature recognition unit estimates the crown height and the hairline height based on one or more position information designated by the position designation unit, and recognizes the hair region. It is characterized by comprising means. (13) An image processing apparatus according to a thirteenth aspect is characterized in that the hair recognizing means includes a hair color extracting means for extracting a hair color.

【０２１６】上記（１２），（１３）の構成によれば、
画像全体から背景領域を抽出する必要がないため、背景
色が一様またはそれに近い必要はなく、通常のスナップ
写真などからでも髪色を抽出し、あるいは、似顔絵を作
成することができる。（１４）請求項１４の画像処理装置は、髪認識手段が位
置指定手段で指定された一つ以上の位置情報に基づき、
髪部分の特徴を抽出する髪特徴抽出手段と、該髪部分の
特徴を用いて髪輪郭を抽出する髪輪郭抽出手段と、該髪
輪郭を用いて髪を分類する髪分類手段と、をさらに備え
てなることを特徴とする。According to the above configurations (12) and (13),
Since it is not necessary to extract the background region from the entire image, the background color does not need to be uniform or close to it, and the hair color can be extracted from a normal snapshot or the like, or a portrait can be created. (14) The image processing apparatus according to claim 14, wherein the hair recognizing unit is configured to perform the operation based on at least one position information specified by the position specifying unit.
Hair feature extracting means for extracting features of the hair part, hair contour extracting means for extracting a hair contour using the features of the hair part, and hair classification means for classifying the hair using the hair contour It is characterized by becoming.

【０２１７】上記（１４）の構成によれば、テンプレー
トマッチングによるのではなく、髪の輪郭線を抽出する
ため、いわゆる「七三分け」，「真中分け」などの呼び
方でいう「分け目」を精度よく検出して分類すること
や、髪生え際線の形状の丸みを判定して「四角型」，
「丸型」に分類することなど、きめ細かい形状分類を行
うことができる。（１５）請求項１５の画像処理装置は、髪認識手段が顔
輪郭の特徴を抽出する顔輪郭特徴抽出手段と、髪特徴及
び顔輪郭特徴を用いて髪を分類する髪分類手段と、を備
えてなることを特徴とする。According to the structure of (14) above, in order to extract the outline of the hair instead of using the template matching, the "separation" referred to as a so-called "seven-three division" or "center division" is used. Detecting and classifying with high accuracy, judging the roundness of the hairline,
It is possible to perform fine shape classification such as classification into “round type”. (15) An image processing apparatus according to claim 15, wherein the hair recognizing means includes a face contour feature extracting means for extracting face contour features, and a hair classifying means for classifying hair using the hair features and the face contour features. It is characterized by becoming.

【０２１８】上記（１５）の構成によれば、例えば、顔
輪郭線の最上部の高さが頭頂高さと比較してある閾値以
上低い場合は髪が相当量あると判断することにより、白
髪であるなど髪領域と肌領域の区別が難しい場合にも、
「髪が薄い」などの誤判断をなくす、あるいは減らすこ
とができる。（１６）請求項１６の画像処理装置は、髪認識手段が前
髪部分の特徴を抽出する前髪特徴抽出手段と、後髪部分
の特徴を抽出する後髪特徴抽出手段を備え、髪部分を含
む画像を入力した際、前記前髪特徴抽出手段にて抽出さ
れた前髪特徴と前記後髪特徴抽出手段にて抽出された後
髪特徴とを用いて前髪部品を決定することを特徴とす
る。According to the above configuration (15), for example, when the height of the top of the face contour line is lower than the head height by a certain threshold or more, it is determined that there is a considerable amount of hair, so When it is difficult to distinguish between the hair area and the skin area,
It is possible to eliminate or reduce erroneous judgments such as "thin hair". (16) An image processing apparatus according to (16), wherein the hair recognizing means includes a bangs feature extracting means for extracting the features of the forehead part, and a back hair feature extracting means for extracting the features of the back hair part, and the image including the hair part. Is input, and a bangs part is determined by using the bangs feature extracted by the bangs feature extraction unit and the back hair feature extracted by the back hair feature extraction unit.

【０２１９】上記（１６）の構成によれば、前髪、後髪
の両方の特徴を用いて前髪部品を決定するため、例え
ば、髪の上部で左側の方に分け目があれば、前髪部品で
も左の方から流れているような、「左分け」にマッチし
たものを選択することにより、よりリアルな、違和感の
少ない似顔絵を作成することができる。また、予め用意
された髪部品を用いているので、例えば、髪領域を２値
化したものと、前髪部分の髪を表現する小部品とを組み
合わせる手法のように、髪領域が２値化画像の一部ある
いは全部が、髪画像の一部あるいは全部としてそのまま
出力される手法と比較して、より美しい似顔絵を作成で
きる場合が多く、また、処理が不完全な部分が存在して
も、それが出力としてそのままユーザーに見えるわけで
はないので、違和感を与えにくい。（１７）請求項１７の画像処理装置は、髪認識手段が前
髪部分の特徴を抽出する前髪特徴抽出手段と、後髪部分
の特徴を抽出する後髪特徴抽出手段を備え、髪部分を含
む画像を入力した際、前記前髪特徴抽出手段にて抽出さ
れた前髪特徴と前記後髪特徴抽出手段にて抽出された後
髪特徴とを用いて後髪部品を決定することを特徴とす
る。According to the configuration (16), the bangs part is determined by using the characteristics of both the bangs and the back hair. For example, if there is a division on the left side in the upper part of the hair, even the bangs part is left. By selecting an item that matches the "left division" that flows from the person, a more realistic portrait with less discomfort can be created. Further, since a hair part prepared in advance is used, the hair area is converted to a binarized image, for example, by a method of combining a binarized hair area with a small part representing the hair of the bangs. In many cases, a more beautiful portrait can be created as compared with a method in which part or all of the hair image is output as it is as part or all of the hair image. Is not directly visible to the user as output, so it is difficult to give a sense of incongruity. (17) An image processing apparatus according to claim 17, wherein the hair recognizing means includes a bangs feature extracting means for extracting the features of the bangs portion, and a back hair feature extracting means for extracting the features of the back hair portion. Is input, the back hair part is determined using the bangs feature extracted by the bangs feature extraction unit and the back hair features extracted by the back hair feature extraction unit.

【０２２０】上記（１７）の構成によれば、前髪、後髪
の両方の特徴を用いて後髪部品を決定するため、例え
ば、額の前髪部分で左の方に分け目があれば、後髪部品
でも、左側の方に分け目があるような、「左分け」にマ
ッチしたものを選択することにより、よりリアルな、ま
たは違和感の少ない似顔絵を作成することができる。ま
た、予め用意された髪部品を用いているので、例えば、
髪領域を２値化したものと、前髪部分の髪を表現する小
部品とを組み合わせる手法のように、髪領域が２値化画
像の一部あるいは全部が、髪画像の一部あるいは全部と
してそのまま出力される手法と比較して、より美しい似
顔絵を作成できる場合が多く、また、処理が不完全な部
分が存在しても、それが出力としてそのままユーザーに
見えるわけではないので、違和感を与えにくい。〈実施形態２〉本実施形態に関して図３６乃至図４１を
用いて説明する。図３６は似顔絵画像の合成に関する画
像処理装置の概略ブロック図であり、図３７は各顔部品
の中心位置を配置すべき位置座標の図であり、図３８は
顔輪郭の形状の例であり、図３９は従来手法による影の
描画例であり、図４０は本発明による影の描画例であ
り、図４１は図３６の部品配置手段４０８に含まれる影
の描画方法のフローチャートである。According to the configuration (17), since the back hair part is determined by using the characteristics of both the front hair and the back hair, for example, if the left side of the forehead part of the forehead has a split, the back hair By selecting a part that matches the "left division" that has a division on the left side, a more realistic or less unnatural portrait can be created. Also, since hair parts prepared in advance are used, for example,
As in the technique of combining a binarized hair region with a small part representing the hair of the bangs, a part or all of the binarized image of the hair region is directly used as a part or all of the hair image. Compared to the output method, more beautiful portraits can be created in many cases, and even if there is an incompletely processed part, it will not be visible to the user as output as it is, so it is difficult to give a sense of incongruity . <Embodiment 2> This embodiment will be described with reference to FIGS. FIG. 36 is a schematic block diagram of an image processing apparatus for synthesizing a portrait image, FIG. 37 is a diagram of position coordinates at which the center position of each face part is to be arranged, and FIG. 38 is an example of the shape of a face outline. FIG. 39 shows an example of drawing a shadow by a conventional method, FIG. 40 shows an example of drawing a shadow according to the present invention, and FIG. 41 is a flowchart of a method of drawing a shadow included in the component arrangement means 408 of FIG.

【０２２１】まず、図３６を用いて、本発明の似顔絵画
像の合成に関する部分の構成を説明する。First, referring to FIG. 36, the configuration of a portion relating to the composition of a portrait image of the present invention will be described.

【０２２２】画像入力手段４００はスキャナなど、似顔
絵をつくる元画像を入力するための手段で、特徴量抽出
手段４０１は前記実施形態１（第１の実施例から第９の
請求項１−１７）にて説明した特徴量抽出手段で、顔輪
郭決定手段４０２はその特徴量より対応する顔部品を選
択する手段で、顔部品データ記憶手段４０３は顔部品の
画像データを蓄積する手段で、顔輪郭決定手段４０４は
抽出した特徴量より、その人物の顔輪郭を決定し、その
輪郭に対応する部品の変形や配置に関する情報を、部品
変形／配置情報記憶手段４０５から引き出す手段で、配
置位置補正手段４０６は４０５から引き出した配置情報
を４０１の特徴量に基いてより似た似顔絵を生成するた
めに補正する手段で、部品変形手段４０７は４０６から
受け取った変形情報から、その顔の輪郭に適した形に顔
部品の大きさなどに顔部品データを変形する手段で、部
品配置手段４０８は４０４及び４０６が生成する顔部を
顔輪郭の上、または下の適正な位置に配置するための手
段で、画像出力手段４０９はＣＲＴやプリンタなど合成
した似顔絵画像を出力する手段で、編集指定入力手段４
１０は合成された似顔絵の結果画像に対して、部品の配
置位置や変形率を変更する手段である。［第１０の実施例］（請求項１８の発明）本実施例では、特徴量抽出手段４０１で抽出した顔特徴
から、あらかじめ用意された顔輪郭部品の中から最も近
いものを、顔輪郭決定手段４０４によって決定する。例
えば図３８に示すような、標準顔／細顔／幅広顔／丸顔
などの顔輪郭部品である。この顔輪郭部品には、各々図
３７に示すような形で各顔部品の中心位置を配置すべき
座標と部品の変形率が、テーブルまたは関数の形で部品
変形／配置情報記憶手段４０５に記憶されている。具体
的例として以下に細顔のテーブルを示す。The image input means 400 is a means for inputting an original image for making a portrait, such as a scanner, and the feature amount extracting means 401 is the first embodiment (first to ninth aspects of the present invention). The facial contour determining means 402 selects the corresponding facial part based on the characteristic quantity, and the facial part data storing means 403 stores the facial part image data. The determining unit 404 determines a face outline of the person from the extracted feature amount, and extracts information on the deformation and arrangement of the part corresponding to the outline from the part deformation / arrangement information storage unit 405. 406 is a means for correcting the arrangement information drawn from 405 to generate a more similar portrait based on the feature amount of 401, and the component deformation means 407 is a means for correcting the deformation information received from 406. Then, the face part data is transformed into a shape suitable for the face contour to the size of the face part or the like. The image output means 409 is a means for outputting a synthesized portrait image such as a CRT or a printer.
Numeral 10 denotes a unit for changing the arrangement position and the deformation ratio of the parts with respect to the synthesized portrait image. [Tenth Embodiment] (Embodiment 18) In this embodiment, the face features extracted by the feature amount extracting means 401 are used to determine the closest face contour part from the prepared face contour parts. Determined by 404. For example, face contour parts such as a standard face / small face / wide face / round face as shown in FIG. In this face outline part, the coordinates at which the center position of each face part is to be arranged and the deformation ratio of the part are stored in the part deformation / arrangement information storage means 405 in the form of a table or a function as shown in FIG. Have been. A detailed example table is shown below as a specific example.

【０２２３】ｘ座標ｙ座標ｘ方向拡大率ｙ方向拡大率後髪｛５００，５００，０．９２，１．００｝左まゆ｛４１０，６７０，０．８５，０．８５｝右まゆ｛５９０，６７０，０．８５，０．８５｝左目｛４１０，６１５，０．８０，０．８０｝右目｛５９０，６１５，０．８０，０．８０｝鼻｛５００，５１０，０．８５，０．８５｝口｛５００，４５０，０．９５，０．９５｝左耳｛３１０，６００，１．００，１．００｝右耳｛６９０，６００，１．００，１．００｝前髪｛５００，５００，０．９２，１．００｝ｘ座標、ｙ座標は個々の部品を配置すべき場所であり、
似顔絵全体の大きさが１０００×１０００で、顔の中心
が（５００，５００）の例である。拡大率は標準顔の各
部品の大きさを１．００とした比率で示されている。標
準顔よりも部品位置が中心に寄り、部品もやや小さ目の
指定となっており、細顔の形状に最適な部品構成とな
る。部品変形手段４０７と部品配置手段４０８は、この
テーブルの値から顔部品を顔輪郭上、または下面（後髪
や耳）に配置する。X coordinate y coordinate x direction enlargement ratio y direction enlargement ratio Back hair ｛500, 500, 0.92, 1.00｝ left eyebrows ｛410, 670, 0.85, 0.85｝ right eyebrows 590, 670, 0.85, 0.85｝ Left eye ｛410, 615, 0.80, 0.80｝ Right eye ｛590, 615, 0.80, 0.80｝ Nose ｛500, 510, 0.85, 0. 85 口 mouth ｛500, 450, 0.95, 0.95｝ left ear ｛310, 600, 1.00, 1.00｝ right ear 690, 600, 1.00, 1.00｝ bangs ｛500, 500, 0.92, 1.00 ｘ x-coordinate and y-coordinate are places where individual parts should be placed.
This is an example in which the size of the entire portrait is 1000 × 1000 and the center of the face is (500, 500). The enlargement ratio is indicated by a ratio where the size of each part of the standard face is 1.00. The part position is closer to the center than the standard face, and the parts are specified slightly smaller, so that the part configuration is optimal for the shape of the fine face. The part deforming means 407 and the part arranging means 408 arrange the face parts on the face outline or on the lower surface (back hair or ears) based on the values in this table.

【０２２４】したがって、本実施例の特徴をまとめると
次のようになる。（１８）請求項１８の画像処理装置は、画像中の顔部品
の大きさや形状の顔部品特徴情報を得る手段（特徴量抽
出手段）と、その特徴情報に対応する複数の顔部品種類
を持ち、各顔部品種ごとに複数の部品データを記憶して
いる顔部品データ記憶手段と、前記顔部品特徴情報をも
とに前記顔部品データ記憶部から適当な部品データを抽
出する顔部品データ抽出手段と、前記抽出された各顔部
品データを顔部品データ記憶部に記憶してある顔輪郭部
品種類ごとに部品の配置位置を定めることにより、輪郭
に適した位置に他の顔部品を配置する手段（部品配置手
段）を備えてなることを特徴とする。Therefore, the features of this embodiment are summarized as follows. (18) An image processing apparatus according to claim 18 has means for obtaining face part feature information of the size and shape of a face part in an image (feature amount extraction means) and a plurality of face part types corresponding to the feature information. A face part data storage unit for storing a plurality of part data for each face part type; and a face part data extraction unit for extracting appropriate part data from the face part data storage unit based on the face part characteristic information. Means for arranging another face part at a position suitable for the contour by determining the arrangement position of the part for each type of face outline part stored in the face part data storage unit with the extracted face part data. Means (component arranging means).

【０２２５】上記（１８）の構成によれば、画像中の顔
部品の大きさ、形状などの、どの顔部品を使用するか決
定した後、顔部品の中の顔輪郭部品ごとに部品配置位置
と部品サイズなどの部品配置方法を決定し、各顔部品デ
ータを配置する。部品を顔輪郭に基いて配置することに
よって、単に部品が顔からはみ出したりしないだけでな
く、その顔輪郭の形状に最も適した位置と大きさに顔部
品を配置することができる。顔輪郭ごとに顔部品配置情
報を決定することにより、劇画調とコミック調など似顔
絵のコンセプトによって顔のバランスがまったく異なる
似顔絵についても、顔部品を配置した場合に破綻するこ
となく似顔絵を生成することが可能である。［第１１の実施例］（請求項１９の発明）本実施例では、顔部品配置補正に使用する特徴量をあら
わす図２２と該補正量算出の処理フローを示した図２１
を用いて説明する。According to the above configuration (18), after deciding which face part to use, such as the size and shape of the face part in the image, the component arrangement position is determined for each face contour part in the face part. And a part placement method such as a part size are determined, and each face part data is placed. By arranging the parts based on the face outline, not only the parts do not protrude from the face but also the face parts can be arranged at the position and size most suitable for the shape of the face outline. Determining facial part placement information for each facial contour to generate a facial caricature without disintegration when facial parts are placed, even if the facial balance is completely different due to the concept of portraits such as dramatic and comic styles Is possible. [Eleventh Embodiment] (Embodiment 19) In this embodiment, FIG. 22 showing a feature amount used for face part arrangement correction and FIG. 21 showing a processing flow of the correction amount calculation
This will be described with reference to FIG.

【０２２６】顔輪郭の横幅１５１及び顔画像中の目の位
置（高さ）１５６を基準とし、顔輪郭の横幅に対する目
の中心間の距離（幅）１５３、眉の中心間の距離（幅）
１５２、目と目の中心と眉と眉の中心間の距離（高さ）
１５４、目と目の中心と口の中心間の距離（高さ）１５
５をそれぞれ求め、数百人の値をあらかじめ測定、平均
を記憶しておき、顔画像中の顔部品位置の認識結果と比
較、除算により比率を算出する（１４９）。具体的には
顔輪郭の横幅に対する目の中心間の距離（幅）１５３の
平均Ｍｅｗと検出結果Ｄｅｗの比率Ｒｅｗ、眉の中心間
の距離（幅）１５２の平均Ｍｅｂｗと検出結果Ｄｅｂｗ
の比率Ｒｅｂｗ、目と目の中心と眉と眉の中心間の距離
（高さ）１５４の平均Ｍｅｂｈと検出結果Ｄｅｂｈの比
率Ｒｅｂｈ、目と目の中心と口の中心間の距離（高さ）
１５５の平均Ｍｍｈと検出結果Ｄｍｈの比率Ｒｍｈを算
出する。With reference to the width 151 of the face outline and the position (height) 156 of the eyes in the face image, the distance 153 between the centers of the eyes with respect to the width of the face outline (width) and the distance (width) between the centers of the eyebrows
152, distance (height) between the center of the eyebrow and the center of the eyebrow
154, distance (height) 15 between eyes, center of eyes and center of mouth
5 are obtained, the values of several hundred persons are measured in advance, the average is stored, and the ratio is calculated by comparing and dividing the result with the recognition result of the face part position in the face image (149). Specifically, the ratio Rew of the average Mew of the distance (width) 153 between the centers of the eyes to the width of the face contour and the detection result Dew, the average Mebw of the distance (width) 152 between the centers of the eyebrows, and the detection result Debw
Ratio Rebw, the average Mebh of the distance (height) 154 between the center of the eye and the center of the eyebrow and the center of the eyebrow and the ratio Rebh of the detection result Debh, the distance (height) between the center of the eye, the center of the eye and the center of the mouth
The ratio Rmh between the average Mmh of 155 and the detection result Dmh is calculated.

【０２２７】Ｒｅｗ＝Ｄｅｗ／ＭｅｗＲｅｂｗ＝Ｄｅｂｗ／ＭｅｂｗＲｅｂｈ＝Ｄｅｂｈ／ＭｅｂｈＲｍｈ＝Ｄｍｈ／Ｍｍｈこれにより、顔画像中の顔の顔部品位置の特徴が抽出さ
れる。ついで、抽出された顔部品位置の特徴に基づき顔
部品の配置位置の補正量を算出する。本実施例において
は、上記により得られた各々の比率に対し、あらかじめ
強調の度合等を考慮して定められた各々の定数を乗じ、
得られた値を合成画像中での顔部品配置位置の補正量と
する（１５０）。Rew = Dew / Mew Rebw = Debw / Mebw Rebw = Debh / Mebh Rmh = Dmh / Mmh Thus, the feature of the face part position of the face in the face image is extracted. Next, a correction amount of the arrangement position of the face part is calculated based on the extracted feature of the face part position. In the present embodiment, the respective ratios obtained above are multiplied by respective constants determined in advance in consideration of the degree of emphasis and the like,
The obtained value is used as the correction amount of the face part arrangement position in the composite image (150).

【０２２８】すなわち、眉の縦方向の位置補正量を決定する定数Ｋｅｂｈ眉の横方向の位置補正量を決定する定数Ｋｅｂｗ目の横方向の位置補正量を決定する定数Ｋｅｗ口の縦方向の位置補正量を決定する定数Ｋｍｈをそれぞれ上記で求めた比率に乗じ、眉の縦方向の位置補正量Ｈｅｂｈ眉の横方向の位置補正量Ｈｅｂｗ目の横方向の位置補正量Ｈｅｗ口の縦方向の位置補正量Ｈｍｈを求める。That is, a constant for determining the vertical position correction amount of the eyebrows Kebh A constant for determining the horizontal position correction amount of the eyebrows Kebw A constant for determining the horizontal position correction amount of the eyebrows The vertical position of the Kew mouth The constant Kmh for determining the correction amount is multiplied by each of the ratios obtained above, and the vertical position correction amount of the eyebrows Hebh The horizontal position correction amount of the eyebrows Hebw The horizontal position correction amount of the eye Hew The vertical position of the mouth The correction amount Hmh is obtained.

【０２２９】Ｈｅｂｈ＝Ｒｅｂｈ×ＲｅｂｈＨｅｂｗ＝Ｒｅｂｗ×ＲｅｂｗＨｅｗ＝Ｒｅｗ ×ＲｅｗＨｍｈ＝Ｒｍｈ ×Ｒｍｈ上記により求めた顔部品の配置位置の補正量に基づき部
品の配置位置の補正を行なって似顔絵画像の合成を行な
うことにより、使用される顔輪郭部品に適した配置であ
ると同時に顔画像中の顔の顔部品位置の特徴を再現ある
いは強調した似顔絵画像を合成することが可能となる。Hebh = Rebh × Rebh Hew = Rebw × Rebw Hew = Rew × Rew Hmh = Rmh × Rmh The layout of the facial parts is corrected based on the correction amount of the layout of the facial parts obtained as described above, and a portrait image is synthesized. By performing the above, it is possible to synthesize a portrait image in which the arrangement is suitable for the face contour part to be used and at the same time, the feature of the face part position of the face in the face image is reproduced or emphasized.

【０２３０】したがって、本実施例の特徴をまとめると
次のようになる。（１９）請求項１９の画像処理装置は、画像中の顔部品
の位置の情報を得る手段（）と、得られた顔部品の位置
情報に基づき、顔輪郭に対応して決定した他の顔部品の
配置位置を補正（配置位置補正手段）し、顔部品の配置
位置を定める手段（部品配置手段）を備えてなることを
特徴とする。Therefore, the characteristics of this embodiment are summarized as follows. (19) An image processing apparatus according to claim 19, further comprising: means () for obtaining information on the position of the face part in the image, and another face determined corresponding to the face outline based on the obtained position information of the face part. It is characterized by comprising means (part arrangement means) for correcting the arrangement position of the parts (arrangement position correction means) and determining the arrangement position of the face parts.

【０２３１】上記（１９）の構成によれば、同じ顔輪郭
を持つが、微妙に顔部品の位置が異なる顔のバリエーシ
ョン全て対応するため、画像中の顔部品の位置情報に基
づき、上記請求項１８によって決定した他の顔部品の配
置位置を、その顔輪郭の許容範囲の中で修正し、より適
切な部品の配置位置を決定する。また、入力画像から抽
出した顔部品の位置と、統計的なその顔部品の顔輪郭に
おける標準位置とを比較し、その差異と、顔輪郭ごとに
決定される修正許容範囲によって、顔部品の配置を補正
する。これにより、顔輪郭から顔がはみ出すような破綻
を避けながら、入力画像の微妙な顔部品の位置の特徴
を、似顔絵に反映させることが可能となる。［第１２の実施例］（請求項２０の発明）図３６の顔部品データ記憶手段４０３において、前髪の
部品と、後髪の部品は、まったく別の部品として記憶さ
れており、それぞれ人間の髪の色を簡易に表現できるよ
うに黒色／茶色／白色／金色などの髪色の部品が用意さ
れている。通常の人間の髪色は、前髪も後髪も同じ色で
あり、特徴量抽出手段４０１は顔部品データ抽出手段４
０２に両者が同じ色であるとする特徴量を渡すため、生
成結果に問題を生じることはない。ところが編集指定入
力手段４１０によって前髪の色の変更を指示すると後髪
と矛盾を生じてしまう。そこで、前髪に対して色の異な
る部品への変更を指示すると顔部品データ抽出手段４０
２は、同時に後髪の色も同じ色の部品へ変更を自動的に
行う。これにより前髪と後髪の色が異なるという矛盾を
防ぐことができる。逆に後髪に対して色の変更指示を与
えた場合にも、前髪の色の変更を自動的に行う。このよ
うな複数部品の同時色変更は、前髪と後髪だけでなく、
眉毛やヒゲなど同じ髪色に統一する必要がある部品すべ
てに対して行ってもよい。また、顔部品のなかで同様に
拘束条件を持つものとして、顔輪郭の肌色を色白から小
麦色などへの部品変更を指示すると、耳や鼻の部品も対
応する肌色の部品に変更するような実施例も考えられ
る。また色だけでなく、髪がパーマであるかどうかな
ど、前髪と後髪で強い相関のある項目について、同様に
連動して変更を行うことが考えられる。According to the configuration of (19), all face variations having the same face contour but subtly different face part positions are supported. The arrangement position of the other face parts determined in step 18 is corrected within the allowable range of the face contour, and a more appropriate arrangement position of the parts is determined. Further, the position of the face part extracted from the input image is compared with a statistical standard position in the face contour of the face part, and the difference and the correction allowable range determined for each face contour determine the arrangement of the face part. Is corrected. This makes it possible to reflect the delicate feature of the position of the face part of the input image on the portrait while avoiding a failure such that the face protrudes from the face contour. [Twelfth embodiment] (Invention of claim 20) In the face part data storage means 403 of FIG. 36, the front hair part and the back hair part are stored as completely different parts, and each of them is human hair. There are provided hair color parts such as black / brown / white / gold so that the color can be easily expressed. The normal human hair color is the same for both the forehead and the back hair, and the feature amount extraction unit 401 uses the face part data extraction unit 4.
Since a feature amount indicating that the two colors are the same is passed to 02, no problem occurs in the generation result. However, if an instruction to change the color of the bangs is given by the edit designation input means 410, it will be inconsistent with the back hair. Therefore, when the bangs are instructed to change to a part having a different color, the face part data extracting means 40
2 automatically changes the color of the back hair to a part of the same color at the same time. This can prevent the inconsistency that the color of the forelock and the back hair is different. Conversely, when a color change instruction is given to the rear hair, the color of the front hair is automatically changed. Such simultaneous color change of multiple parts is not only for bangs and back hair,
This may be performed for all parts that need to be unified to the same hair color, such as eyebrows and mustaches. Also, assuming that the face parts also have the same constraint conditions, if a change of the skin color of the face outline from fair skin to tan is instructed, the ear and nose parts are also changed to the corresponding skin color parts. Embodiments are also conceivable. In addition, it is conceivable that not only the color but also an item having a strong correlation between the bangs and the back hair, such as whether the hair is perm, may be similarly changed in conjunction with each other.

【０２３２】したがって、本実施例の特徴をまとめると
次のようになる。（２０）請求項２０の画像処理装置は、似顔絵合成時に
配置する顔部品データにおいて、関連のある一組の部品
の特定の色に関する部品の変更指定が行われたとき、関
連する他の部品についても自動的に部品を変更すること
により、常に矛盾を生じないデータを生成する手段（部
品変形手段）を備えてなることを特徴とする。Therefore, the features of this embodiment are summarized as follows. (20) In the image processing device according to the twentieth aspect, when a change of a part related to a specific color of a set of related parts is designated in the face part data arranged at the time of portrait composition, the other related parts are Also, a means (part deforming means) for automatically generating data which does not cause inconsistency by automatically changing parts is provided.

【０２３３】上記（２０）の構成によれば、前髪と後
髪、あるいは前髪とヒゲなど、顔部品で強い相関を持つ
部品間で、一方の部品の色を、例えば黒／茶／白のどれ
かに決定した場合、他方の部品の色も自動的に同じ色に
変更することにより、他方の部品の色の変更を陽に指定
することなく、違和感のない似顔絵を自動生成する。同
様に、髪がパーマであるかなど、前髪と後髪などで強い
相関のある形状についても、片方に対しての指示を他方
に反映させることで、違和感のない編集操作が可能とな
る。［第１３の実施例］（請求項２１の発明）図３６の顔部品データ記憶手段４０３に記憶される各顔
部品は、内部的には実体と影いう層構造を伴った画像と
して記憶される。この層構造を持った部品を単純に組み
合わせて描画をおこなうと、図３９のように、ある部品
の上に、他の部品の影が重なって描画され不自然な絵と
なってしまう。そこで各部品を配置する時に図４１のフ
ローチャートに示すように、各部品を構成する画素また
は層構造の色を調べ、もしその色が肌色の影の色（例え
ばＲＧＢ値がｒ＝２３８、ｇ＝１６０、ｂ＝８７）の色
である場合、まず部品のこの影の色に相当する画素また
は層構造だけを予め描画し、引き続き影でない色の画素
または層を描画する。これにより図４０に示すように自
然な画像を生成することが可能となる。このようなこと
が可能となるのは、顔部品において使用する色数が極め
て限られていることに起因する。According to the configuration of the above (20), the color of one of the parts having a strong correlation between the face parts such as the bangs and the back hair or the bangs and the mustache is changed to, for example, black / brown / white. If it is determined that the color of the other part is automatically changed to the same color, a portrait without any discomfort is automatically generated without explicitly specifying the change of the color of the other part. Similarly, for a shape that has a strong correlation between the front hair and the back hair, such as whether the hair is perm, an editing operation without a sense of incongruity can be performed by reflecting an instruction for one on the other. [Thirteenth Embodiment] (Invention of Claim 21) Each face part stored in the face part data storage means 403 of FIG. 36 is internally stored as an image having a layered structure called a shadow. . If a drawing is performed by simply combining components having this layer structure, as shown in FIG. 39, a shadow of another component overlaps a certain component and is drawn, resulting in an unnatural picture. Therefore, when arranging each part, as shown in the flowchart of FIG. 41, the color of the pixel or the layer structure constituting each part is checked, and if the color is a shade color of a flesh color (for example, the RGB value is r = 238, g = 160, b = 87), first, only the pixel or layer structure corresponding to the shadow color of the component is drawn in advance, and then the pixel or layer of a non-shadow color is drawn. This makes it possible to generate a natural image as shown in FIG. This is possible because the number of colors used in the face part is extremely limited.

【０２３４】さらに、この肌色の影の層に限り、描画す
べき画素領域の下層に顔輪郭がない場合は、描画そのも
のを抑制することにより、後髪や背景などに肌色の影だ
けが描画されて不自然となることを防ぐことも可能であ
る。Furthermore, if only the skin color shadow layer has no face outline under the pixel area to be drawn, the drawing itself is suppressed, so that only the skin color shadow is drawn on the back hair or the background. It is also possible to prevent unnaturalness.

【０２３５】したがって、本実施例の特徴をまとめると
次のようになる。（２１）請求項２１の画像処理装置は、顔部品データ記
憶部に記憶してある顔部品データの中で、一つの顔部品
データが二つ以上の層構造を持っている場合に、他の顔
部品データと組み合わされたとき、部品中の色情報に基
づいて層の並び順を変更し、適切な順番で部品と部品を
構成する層の配置順序を定める手段（部品配置手段）を
備えてなることを特徴とする。Therefore, the features of this embodiment are summarized as follows. (21) In the image processing apparatus according to claim 21, when one face part data has two or more layer structures among the face part data stored in the face part data storage unit, the other When combined with the face part data, there is provided means (part arranging means) for changing the order of layers based on the color information in the part and determining the arrangement order of the part and the layers constituting the part in an appropriate order. It is characterized by becoming.

【０２３６】上記構成によれば、顔部品あるいは、帽子
などの似顔絵を構成する部品の各画素または領域におい
て、その色情報から、顔輪郭上に投影された影を構成し
ていると判断される画素または領域を抽出し、その画素
または領域を予め描画してから、個々の顔部品を描画す
ることにより、ある部品の上に、他の部品の影が載ると
いうことを防ぐ。さらに影が投影される部品が存在しな
い場合はその画素または領域の描画そのものを停止する
ことにより、顔輪郭が存在しない場所に対する描画は禁
止できる。According to the above arrangement, in each pixel or region of a face part or a part constituting a portrait such as a hat, it is determined from the color information that a shadow projected on the face outline is formed. Pixels or regions are extracted, and the pixels or regions are drawn in advance, and then individual face parts are drawn, thereby preventing a part from shadowing another part. Further, when there is no component to which a shadow is projected, by stopping the drawing of the pixel or the region itself, drawing at a place where no face outline exists can be prohibited.

【０２３７】以上、説明してきた記載内容に基づくと、
画像中の特定物体の特徴量を抽出する画像処理を行う場
合、処理に時間がかかったり、誤った特徴量を抽出して
しまうことなく、また処理の範囲を位置指定手段等を用
いて直接指定する場合に多くの位置を指定することな
く、画像中の任意の特徴量を頑健、高精度かつ高速に抽
出できるとともに、簡易に高品質な似顔絵を合成できる
画像処理装置を提供することができる。Based on the contents described above,
When performing image processing to extract the characteristic amount of a specific object in an image, it does not take a long time to process or extract an incorrect characteristic amount, and directly specifies the processing range using a position specifying unit or the like. In this case, it is possible to provide an image processing apparatus that can robustly, accurately and quickly extract an arbitrary feature amount in an image without specifying many positions, and can easily synthesize a high-quality portrait.

【０２３８】また、上記説明では画像処理の対象を顔と
して記載してきたが、顔のみに限定されず、例えば、写
真の映像やパソコンやディジタルカメラやビデオで撮影
した映像等を含む様々な任意の物体に対しても、構成手
段や部品の役割や機能を変更することで適用できる。In the above description, the target of image processing is described as a face. However, the present invention is not limited to the face, and various arbitrary images including, for example, a picture of a picture, a picture taken by a personal computer, a digital camera, and a video, and the like. The present invention can also be applied to an object by changing the role and function of the constituent means and parts.

【０２３９】尚、ここまで挙げた各実施形態における内
容は、本発明の主旨を変えない限り、上記記載内容に限
定されるものではない。The contents in each of the embodiments described so far are not limited to the contents described above unless the gist of the present invention is changed.

【０２４０】[0240]

【発明の効果】本発明における画像処理装置は、各請求
項において以下の効果が得られる。The image processing apparatus according to the present invention has the following effects in each claim.

【０２４１】本発明の請求項１においては、外部から入
力した画像において、画像中に配置された物体の位置及
び大きさの関係が一定の拘束条件を満たす場合に、当該
画像中の任意の位置を指定することのできる位置指定手
段を用いて、画像中の一つ以上の当該物体の位置を入力
することで、当該物体の特徴量を抽出するための画像処
理を行う適切な探索範囲を設定することができる。これ
により、頑健かつ高精度かつ高速に特徴量を抽出するこ
とが可能となる。また、これは、位置指定手段を用い
て、探索範囲を指定することに比べ、少ない指定点で同
様の効果を得ることができるため、作業量を削減する効
果がある。According to the first aspect of the present invention, when the relationship between the position and the size of the object arranged in the image satisfies a certain constraint condition in the image input from the outside, an arbitrary position in the image is By inputting the position of one or more of the objects in the image using the position specifying means that can specify the object, an appropriate search range for performing image processing for extracting the feature amount of the object is set. can do. This makes it possible to extract feature values robustly, with high accuracy, and at high speed. In addition, since the same effect can be obtained with a smaller number of designated points as compared with the case where the search range is designated by using the position designation means, there is an effect of reducing the amount of work.

【０２４２】本発明の請求項２においては、外部から入
力される画像に顔を含み、抽出される特徴量が、目、
鼻、口、眉、耳、輪郭、髪等の顔部品における該当する
少なくとも１つの顔部品の位置、大きさ、形状等である
ことを特徴としている。そのため、各顔部品の位置及び
大きさの関係は、予め複数の顔を調べておくことですぐ
に適用することが可能であり、特徴量を利用したアプリ
ケーションを構築する際、処理を頑健かつ高精度かつ高
速に行うことが可能となる。また、位置指定手段を用い
て、探索範囲を指定することに比べ、少ない指定点で同
様の効果を得ることができる。そのため、作業量を削減
する効果があり、当該アプリケーションの利便性が高ま
るという効果がある。According to a second aspect of the present invention, an image input from the outside includes a face, and the extracted feature quantity is an eye,
It is characterized by the position, size, shape, and the like of at least one corresponding facial part in the facial parts such as the nose, mouth, eyebrows, ears, contours, and hair. Therefore, the relationship between the position and the size of each face part can be immediately applied by examining a plurality of faces in advance, and when constructing an application using the feature amount, the processing is robust and highly efficient. Accurate and high-speed operation is possible. Further, the same effect can be obtained with fewer designated points as compared with the case where the search range is designated by using the position designation means. This has the effect of reducing the amount of work, and has the effect of increasing the convenience of the application.

【０２４３】本発明の請求項３においては、認識対象の
領域を得るために２値化を行なうにあたり、複数の方
式、複数の閾値で２値化し、それらの画像中の領域の位
置、大きさ、形状等を判定し、最も信頼できる画像を選
択することにより、認識対象を高精度に検出することが
可能になる。In claim 3 of the present invention, when performing binarization to obtain a region to be recognized, binarization is performed using a plurality of methods and a plurality of thresholds, and the position and size of the region in the image are determined. , Shape, and the like, and selecting the most reliable image, the recognition target can be detected with high accuracy.

【０２４４】本発明の請求項４においては、２つ以上の
顔部品の位置を位置指定手段により指定し、それら指定
位置間の距離関係から顔部品の大きさを予測することに
より、高精度に顔部品の位置、大きさ検出を行なうこと
が可能となる。According to a fourth aspect of the present invention, the position of two or more face parts is specified by the position specifying means, and the size of the face part is predicted from the distance relationship between the specified positions, thereby achieving high precision. The position and size of the face part can be detected.

【０２４５】本発明の請求項５においては、目の大きさ
が一定の範囲にあることを利用して、２つ以上の顔部品
の位置を位置指定手段により指定し、それら指定位置間
の距離関係から顔部品の大きさを予測すること、あるい
は認識対象の領域を得るために２値化を行なうにあた
り、複数の方式、複数の閾値で２値化し、それらの画像
中の領域の位置、大きさ、形状等を判定し、最も信頼で
きる画像を選択すること、あるいは両目の位置を検出
し、左右の目を結ぶ線が水平になるように、顔画像を回
転させること、もしくは上記のいずれかの方法を任意に
組合せること、により、認識対象を高精度に検出するこ
とが可能になる。According to a fifth aspect of the present invention, utilizing the fact that the size of the eyes is within a certain range, the positions of two or more face parts are specified by the position specifying means, and the distance between the specified positions is specified. In order to predict the size of the face part from the relationship or to perform binarization in order to obtain a region to be recognized, binarization is performed using a plurality of methods and a plurality of thresholds, and the position and size of the region in the image are determined. Judging the shape, etc., and selecting the most reliable image, or detecting the position of both eyes, rotating the face image so that the line connecting the left and right eyes is horizontal, or any of the above By arbitrarily combining the above methods, the recognition target can be detected with high accuracy.

【０２４６】本発明の請求項６においては、両目の位置
を検出し、左右の目を結ぶ線が水平になるように、顔画
像を回転させることにより、顔部品検出精度を向上させ
ることが可能になる。According to the sixth aspect of the present invention, it is possible to improve the face part detection accuracy by detecting the positions of both eyes and rotating the face image so that the line connecting the left and right eyes is horizontal. become.

【０２４７】本発明の請求項７においては、設定された
探索範囲内で目の傾き及び目の厚みをあらわす画像特徴
を検出し、目の形状を判定するので、テンプレート画
像、あるいは辞書画像と、それに対応する入力画像中の
部分画像とがずれていることにより、誤った特徴量が抽
出されるという危険を回避することができる。さらに、
対象とする目の形状が、予め準備しておいたカテゴリー
に含まれない形状の場合に、正しい特徴量が算出できな
いといった危険を回避することができる。さらに、テン
プレート画像や辞書画像を準備する必要がなり、作業量
を大幅に少なくすることができる。According to the seventh aspect of the present invention, an image feature representing the inclination and thickness of the eyes is detected within the set search range, and the shape of the eyes is determined. Due to the shift from the corresponding partial image in the input image, it is possible to avoid the danger that an erroneous feature amount is extracted. further,
When the shape of the target eye is a shape that is not included in the category prepared in advance, it is possible to avoid a risk that a correct feature amount cannot be calculated. Furthermore, it is necessary to prepare a template image and a dictionary image, and the amount of work can be significantly reduced.

【０２４８】本発明の請求項８においては、口の大きさ
が一定の範囲にあることを利用して、２つ以上の顔部品
の位置を位置指定手段により指定し、それら指定位置間
の距離関係から顔部品の大きさを予測すること、あるい
は認識対象の領域を得るために２値化を行なうにあた
り、複数の方式、複数の閾値で２値化し、それらの画像
中の領域の位置、大きさ、形状等を判定し、最も信頼で
きる画像を選択すること、あるいは両目の位置を検出
し、左右の目を結ぶ線が水平になるように、顔画像を回
転させること、もしくは上記のいずれかの方法を任意に
組合せること、により、認識対象を高精度に検出するこ
とが可能になる。According to an eighth aspect of the present invention, utilizing the fact that the size of the mouth is within a certain range, the positions of two or more face parts are specified by the position specifying means, and the distance between the specified positions is specified. In order to predict the size of the face part from the relationship or to perform binarization in order to obtain a region to be recognized, binarization is performed using a plurality of methods and a plurality of thresholds, and the position and size of the region in the image are determined. Judging the shape, etc., and selecting the most reliable image, or detecting the position of both eyes, rotating the face image so that the line connecting the left and right eyes is horizontal, or any of the above By arbitrarily combining the above methods, the recognition target can be detected with high accuracy.

【０２４９】本発明の請求項９においては、２つ以上の
顔部品の位置を位置指定手段により指定し、それら指定
位置間の距離関係から眉毛の大きさを予測し、眉毛に関
する特徴量を抽出する際の処理を行うべき範囲を適当な
大きさに制限することに加え、眉毛の大きさを推定する
ことができる。すなわち、２値化を行った際に分離され
る領域の大きさと、推定される眉毛の大きさを比較し、
それらの大きさがあまり離れていないような２値化の閾
値を求めることで、眉毛をあらわす領域を高精度に検出
することができる。According to a ninth aspect of the present invention, the positions of two or more face parts are specified by the position specifying means, the size of the eyebrows is predicted from the distance relationship between the specified positions, and the characteristic amount relating to the eyebrows is extracted. In addition to restricting the range in which the processing is performed to an appropriate size, the size of the eyebrows can be estimated. That is, the size of the region separated when binarization is performed is compared with the estimated size of the eyebrows,
By obtaining a binarization threshold value such that their sizes are not so far apart, an area representing an eyebrow can be detected with high accuracy.

【０２５０】本発明の請求項１０においては、設定され
た探索範囲内で眉毛の折れ曲がりかた及び眉毛の太さを
あらわす画像特徴を検出し、眉毛の形状を判定するの
で、テンプレート画像、あるいは辞書画像と、それに対
応する入力画像中の部分画像とがずれていることによ
り、誤った特徴量が抽出されるという危険を回避するこ
とができる。さらに、対象とする眉毛の形状が、予め準
備しておいたカテゴリーに含まれない形状の場合に、正
しい特徴量が算出できないといった危険を回避すること
ができる。さらに、テンプレート画像や辞書画像を準備
する必要がなり、作業量を大幅に少なくすることができ
る。According to the tenth aspect of the present invention, an image feature representing the manner in which the eyebrows are bent and the thickness of the eyebrows is detected within the set search range, and the shape of the eyebrows is determined. It is possible to avoid the danger that an erroneous feature amount is extracted due to the difference between the image and the corresponding partial image in the input image. Further, when the shape of the target eyebrow is not included in the category prepared in advance, it is possible to avoid a risk that a correct feature amount cannot be calculated. Furthermore, it is necessary to prepare a template image and a dictionary image, and the amount of work can be significantly reduced.

【０２５１】本発明の請求項１１においては、撮影条件
が悪く、あまり明確な輪郭線が現れていない画像やノイ
ズの多い画像に対しても、より安定な顎検出を行ない、
ノイズや個人差などの影響を出来るだけ排除した顎形状
判定を行なうことが可能である。According to the eleventh aspect of the present invention, more stable jaw detection is performed even for an image in which the photographing conditions are poor and a clear contour line does not appear or for an image with much noise.
It is possible to perform a jaw shape determination in which the effects of noise, individual differences, and the like are eliminated as much as possible.

【０２５２】本発明の請求項１２、１３においては、画
像全体から背景領域を抽出する必要がないため、背景色
が一様またはそれに近い必要はなく、通常のスナップ写
真などからでも髪色を抽出し、それによって、似顔絵を
作成することができる。According to the twelfth and thirteenth aspects of the present invention, it is not necessary to extract the background area from the entire image, so the background color does not need to be uniform or close to it, and the hair color can be extracted even from a normal snapshot. Thus, a portrait can be created.

【０２５３】本発明の請求項１４においては、テンプレ
ートマッチングによるのではなく、髪の輪郭線を抽出す
るため、いわゆる「七三分け」，「真中分け」などの呼
び方でいう「分け目」を精度よく検出して分類すること
や、髪生え際線の形状の丸みを判定して「四角型」，
「丸型」に分類することなど、きめ細かい形状分類を行
うことができる。According to the fourteenth aspect of the present invention, in order to extract the outline of the hair instead of using template matching, the "separation" referred to as a so-called "seven-three division" or "central division" is used. Detecting and classifying well, and determining the roundness of the hairline shape,
It is possible to perform fine shape classification such as classification into “round type”.

【０２５４】本発明の請求項１５においては、例えば、
顔輪郭線の最上部の高さが頭頂高さと比較して、ある閾
値以上低い場合は髪が相当量あると判断することによ
り、白髪であるなど髪領域と肌領域の区別が難しい場合
にも、「髪が薄い」などの誤判断を無くしたり、あるい
は減らすことができる。In claim 15 of the present invention, for example,
If the height of the top of the face contour line is lower than a certain threshold value by comparison with the head height, it is determined that there is a considerable amount of hair. Erroneous judgments such as "thin hair" can be eliminated or reduced.

【０２５５】本発明の請求項１６においては、前髪、後
髪の両方の特徴を用いて前髪部品を決定するため、例え
ば、髪の上部で左側の方に分け目があれば、前髪部品で
も左の方から流れているような、「左分け」にマッチし
たものを選択することにより、よりリアルな、違和感の
少ない似顔絵を作成することができる。また、予め用意
された髪部品を用いているので、例えば、髪領域を２値
化したものと、前髪部分の髪を表現する小部品とを組み
合わせる手法のように、髪領域が２値化画像の一部ある
いは全部が、髪画像の一部あるいは全部としてそのまま
出力される手法と比較して、より美しい似顔絵を作成で
きる場合が多く、また、処理が不完全な部分が存在して
も、それが出力としてそのままユーザーに見えるわけで
はないので、違和感を与えにくい。In the sixteenth aspect of the present invention, since the bangs part is determined using both the bangs and the back hair characteristics, for example, if there is a division on the left side in the upper part of the hair, the bangs part is also left. By selecting one that matches the "left division" that flows from one side, a more realistic portrait with less discomfort can be created. Further, since a hair part prepared in advance is used, the hair area is converted to a binarized image, for example, by a method of combining a binarized hair area with a small part representing the hair of the bangs. In many cases, a more beautiful portrait can be created as compared with a method in which part or all of the hair image is output as it is as part or all of the hair image. Is not directly visible to the user as output, so it is difficult to give a sense of incongruity.

【０２５６】本発明の請求項１７においては、前髪、後
髪の両方の特徴を用いて後髪部品を決定するため、例え
ば、額の前髪部分で左の方に分け目があれば、後髪部品
でも、左側の方に分け目があるような、「左分け」にマ
ッチしたものを選択することにより、よりリアルな、違
和感の少ない似顔絵を作成することができる。また、予
め用意された髪部品を用いているので、例えば、髪領域
を２値化したものと、前髪部分の髪を表現する小部品と
を組み合わせる手法のように、髪領域が２値化画像の一
部あるいは全部が、髪画像の一部あるいは全部としてそ
のまま出力される手法と比較して、より美しい似顔絵を
作成できる場合が多く、また、処理が不完全な部分が存
在しても、それが出力としてそのままユーザーに見える
わけではないので、違和感を与えにくい。According to the seventeenth aspect of the present invention, since the back hair part is determined by using the characteristics of both the bangs and the back hair, for example, if there is a split on the left side in the forehead part of the forehead, the back hair part However, by selecting an item that matches the "left division" in which there is a division on the left side, a more realistic caricature with less discomfort can be created. Further, since a hair part prepared in advance is used, the hair area is converted to a binarized image, for example, by a method of combining a binarized hair area with a small part representing the hair of the bangs. In many cases, a more beautiful portrait can be created as compared with a method in which part or all of the hair image is output as it is as part or all of the hair image. Is not directly visible to the user as output, so it is difficult to give a sense of incongruity.

【０２５７】本発明の請求項１８においては、特徴量抽
出手段によって得られた顔の各部品の特徴を最大限に反
映しながら、顔輪郭の形状に基づく拘束条件を加えるこ
とで行き過ぎた特徴の反映を防ぎ、部品配置が最適で破
綻をきたさない似顔絵を自動合成することが可能とな
る。According to the eighteenth aspect of the present invention, while the features of each part of the face obtained by the feature amount extracting means are reflected to the maximum extent, a constraint condition based on the shape of the face contour is added to remove the excessive features. It is possible to prevent the reflection and automatically synthesize a portrait that is optimal and has no breakdown.

【０２５８】本発明の請求項１９においては、画像中の
顔部品の位置などの情報を得る手段と、得られた顔部品
の位置情報に基づき、上記請求項１８における顔輪郭に
対応して決定した他の顔部品の配置位置を補正し、より
適切な部品の配置位置を定める手段を備えることによ
り、上記請求項１８の効果に加え、画像中の顔の顔部品
の配置による特徴を再現もしくは強調した似顔絵画像を
合成することが可能となる。According to a nineteenth aspect of the present invention, a means for obtaining information such as the position of a face part in an image, and a decision corresponding to the face contour according to the eighteenth aspect, based on the obtained position information of the face part. Means for correcting the arrangement position of the other face parts, and determining the more appropriate arrangement position of the parts, in addition to the effect of claim 18, in addition to the effect of the arrangement of the face parts of the face in the image, It is possible to combine the emphasized portrait image.

【０２５９】本発明の請求項２０においては、似顔絵の
中の色に関する拘束条件を利用するこにより、似顔絵と
して不適切な画像の画像の生成を抑制し、かつユーザー
に余分な部品の色変更指定を減らすことができる。[0259] According to a twentieth aspect of the present invention, by using a constraint condition relating to colors in a portrait, generation of an image that is inappropriate as a portrait is suppressed, and a color change designation of an extra part is given to the user. Can be reduced.

【０２６０】本発明の請求項２１においては、記憶する
顔部品のデータを必要以上に複雑することなく、部品を
構成する層の描画順位を自動的に決定し、あるいは抑制
することにより自然な画像を自動生成できる。According to a twenty-first aspect of the present invention, a natural image is obtained by automatically determining or suppressing the drawing order of layers constituting a part without unnecessarily complicating the data of the stored face part. Can be automatically generated.

[Brief description of the drawings]

【図１】本発明の一実施の形態に係る画像処理装置の概
略ブロック図である。FIG. 1 is a schematic block diagram of an image processing apparatus according to an embodiment of the present invention.

【図２】位置指定の例及び指定位置により決定される探
索範囲の例をあらわす図である。FIG. 2 is a diagram illustrating an example of position designation and an example of a search range determined by a designated position.

【図３】目の形状を認識するための目の探索範囲、ヒス
トグラム、検出された目頭及び目尻位置の例をあらわす
図である。FIG. 3 is a diagram showing an example of an eye search range for recognizing an eye shape, a histogram, and detected positions of the inner and outer corners of the eye.

【図４】目頭を検出するためのテンプレートの例をあら
わす図であるFIG. 4 is a diagram illustrating an example of a template for detecting an inner corner of the eye;

【図５】目の厚みを検出するための目の探索範囲、肌色
をサンプリングするための領域、肌・非肌領域の例をあ
らわす図である。FIG. 5 is a diagram illustrating an example of an eye search range for detecting eye thickness, a skin color sampling region, and a skin / non-skin region.

【図６】目頭探索範囲及び目尻探索範囲を求めるための
動作をあらわすフローチャートである。FIG. 6 is a flowchart showing an operation for obtaining an inner and outer corner search range.

【図７】目頭位置を検出するための動作をあらわすフロ
ーチャートである。FIG. 7 is a flowchart showing an operation for detecting an inner corner position.

【図８】目尻位置を検出するための動作をあらわすフロ
ーチャートである。FIG. 8 is a flowchart showing an operation for detecting the position of the outer corner of the eye.

【図９】眉毛の位置及び大きさを検出するための眉毛の
探索範囲、２値化画像の例をあらわす図である。FIG. 9 is a diagram illustrating an example of an eyebrow search range and a binarized image for detecting the position and size of eyebrows.

【図１０】眉毛の位置及び大きさを検出するための動作
をあらわすフローチャートである。FIG. 10 is a flowchart showing an operation for detecting a position and a size of an eyebrow.

【図１１】眉毛に外接する矩形の例をあらわす図であ
る。FIG. 11 is a diagram illustrating an example of a rectangle circumscribing eyebrows.

【図１２】眉毛の形状を認識するための量子化の例をあ
らわす図である。FIG. 12 is a diagram illustrating an example of quantization for recognizing the shape of eyebrows.

【図１３】眉毛の折れ曲がりかたを検出するための動作
をあらわすフローチャートである。FIG. 13 is a flowchart showing an operation for detecting how an eyebrow is bent.

【図１４】眉毛の厚みを検出するための動作をあらわす
フローチャートである。FIG. 14 is a flowchart showing an operation for detecting the thickness of eyebrows.

【図１５】口の検出の動作をあらわすフローチャートで
ある。FIG. 15 is a flowchart showing an operation of mouth detection.

【図１６】口の検出の画像及び投影結果の例をあらわす
図である。FIG. 16 is a diagram illustrating an example of a mouth detection image and a projection result.

【図１７】目の検出の動作をあらわすフローチャートで
ある。FIG. 17 is a flowchart illustrating an eye detection operation.

【図１８】目の検出の画像及び変換、抽出結果の例をあ
らわす図である。FIG. 18 is a diagram illustrating an example of an eye detection image and conversion and extraction results.

【図１９】目の検出結果を利用した傾き補正の動作をあ
らわすフローチャートである。FIG. 19 is a flowchart showing an operation of tilt correction using an eye detection result.

【図２０】目の検出結果を利用した傾き補正の例をあら
わす図である。FIG. 20 is a diagram illustrating an example of tilt correction using an eye detection result.

【図２１】顔部品配置補正量の算出の動作をあらわすフ
ローチャートである。FIG. 21 is a flowchart illustrating an operation of calculating a face part arrangement correction amount.

【図２２】顔部品配置補正に使用する特徴量をあらわす
図である。FIG. 22 is a diagram illustrating feature amounts used for face part arrangement correction.

【図２３】本発明の請求項１１の実施例における画像処
理装置の動作を概略的に示すフローチャートである。FIG. 23 is a flowchart schematically showing an operation of the image processing apparatus according to the eleventh embodiment of the present invention.

【図２４】本発明の請求項１１の実施例における入力画
像中の中心座標及び初期輪郭の配置を説明するための図
である。FIG. 24 is a diagram for explaining the arrangement of center coordinates and initial contours in an input image in the embodiment of claim 11 of the present invention.

【図２５】本発明の請求項１１の実施例における初期輪
郭上の一点と中心座標を結ぶ直線上の色差算出を行なう
方法を説明するための図である。FIG. 25 is a diagram for explaining a method of calculating a color difference on a straight line connecting a point on the initial contour and the center coordinates in the embodiment of claim 11 of the present invention.

【図２６】本発明の請求項１１の実施例における色差の
算出例を模式的に示した図である。FIG. 26 is a diagram schematically showing an example of calculating a color difference in the embodiment according to claim 11 of the present invention.

【図２７】本発明の請求項１１の実施例における顔輪郭
形状に特化した色差算出をおこなう手法として顔が楕円
形状であることを利用する場合について説明するための
図である。FIG. 27 is a diagram for explaining a case where the ellipse shape of a face is used as a method for performing a color difference calculation specialized for a face contour shape according to the eleventh embodiment of the present invention.

【図２８】本発明の請求項１１の実施例における顔輪郭
形状に特化した色差算出をおこなう手法として顔が中心
軸に対して左右対称であることを利用する場合について
説明するための図である。FIG. 28 is a diagram for explaining a case of utilizing the fact that a face is bilaterally symmetric with respect to a center axis as a method for performing color difference calculation specialized for a face contour shape according to an embodiment of the present invention. is there.

【図２９】本発明の請求項１１の実施例における抽出し
た顔輪郭線から距離関数を算出する手法を説明するため
の図である。FIG. 29 is a diagram for explaining a method of calculating a distance function from an extracted face contour in the embodiment of claim 11 of the present invention.

【図３０】本発明の請求項１１の実施例における入力画
像から得られた距離関数と基準距離関数を比較する手法
を説明するための図である。FIG. 30 is a diagram for explaining a method for comparing a distance function obtained from an input image with a reference distance function according to the eleventh embodiment of the present invention.

【図３１】本発明の請求項１２〜１６の画像処理装置の
構成を示すブロック図である。FIG. 31 is a block diagram showing a configuration of an image processing apparatus according to claims 12 to 16 of the present invention.

【図３２】本発明の請求項１２〜１６の画像合成装置の
処理を示すフローチャートである。FIG. 32 is a flowchart showing processing of the image synthesizing apparatus according to claims 12 to 16 of the present invention.

【図３３】髪色抽出に関する説明図である。FIG. 33 is an explanatory diagram related to hair color extraction.

【図３４】前髪分類に関する説明図である。FIG. 34 is an explanatory diagram related to bangs classification.

【図３５】後髪分類に関する説明図である。FIG. 35 is an explanatory diagram relating to back hair classification.

【図３６】本発明の別の実施形態である似顔絵画像の合
成に関する画像処理装置の概略ブロック図である。FIG. 36 is a schematic block diagram of an image processing apparatus relating to the synthesis of a portrait image according to another embodiment of the present invention.

【図３７】各顔部品の中心位置を配置すべき位置座標の
図である。FIG. 37 is a diagram of position coordinates where the center position of each face part is to be arranged.

【図３８】顔輪郭の形状の例である。FIG. 38 is an example of the shape of a face contour.

【図３９】従来手法による影の描画例である。FIG. 39 is an example of shadow drawing by a conventional method.

【図４０】本発明による影の描画例である。FIG. 40 is a drawing example of a shadow according to the present invention.

【図４１】図３６の部品配置手段４０８に含まれる影の
描画方法のフローチャートである。FIG. 41 is a flowchart of a shadow drawing method included in the component placement unit 408 of FIG. 36.

[Explanation of symbols]

１０特徴量抽出部１１入力装置１２記憶装置１３演算装置１４位置指定装置１５出力装置４００画像入力手段４０１特徴量抽出手段４０２顔部品データ抽出手段４０３顔部品データ記憶手段４０４顔輪郭決定手段４０５部品変形／配置情報記憶手段４０６配置位置補正手段４０７部品変形手段４０８部品配置手段４０９画像出力手段４１０編集指定入力手段 Reference Signs List 10 feature amount extraction unit 11 input device 12 storage device 13 arithmetic device 14 position designation device 15 output device 400 image input means 401 feature quantity extraction means 402 face part data extraction means 403 face part data storage means 404 face contour determination means 405 part deformation / Placement information storage means 406 placement position correction means 407 part deformation means 408 part placement means 409 image output means 410 edit designation input means

フロントページの続き (72)発明者竹澤創大阪府大阪市阿倍野区長池町22番22号シャープ株式会社内 (72)発明者長井義典大阪府大阪市阿倍野区長池町22番22号シャープ株式会社内 (72)発明者伊藤愛大阪府大阪市阿倍野区長池町22番22号シャープ株式会社内 (72)発明者紺矢峰弘大阪府大阪市阿倍野区長池町22番22号シャープ株式会社内Ｆターム(参考） 5B050 BA06 BA12 CA07 EA09 EA12 EA13 EA19 FA09 FA19 5L096 BA18 FA06 FA15 FA69 GA38 GA51 JA22 LA05 Continuation of the front page (72) Inventor Sou Takezawa 22-22, Nagaike-cho, Abeno-ku, Osaka-shi, Osaka Inside (72) Inventor Yoshinori Nagai 22-22, Nagaike-cho, Abeno-ku, Osaka-shi, Osaka (72) Inventor Ai Ito 22-22 Nagaike-cho, Abeno-ku, Osaka-shi, Osaka Inside (72) Inventor Minehiro Konya 22-22 Nagaike-cho, Abeno-ku, Osaka-shi, Osaka F-term (Reference) 5B050 BA06 BA12 CA07 EA09 EA12 EA13 EA19 FA09 FA19 5L096 BA18 FA06 FA15 FA69 GA38 GA51 JA22 LA05

Claims

[Claims]

1. An input unit for inputting an image, a storage unit for storing the input image, an arithmetic unit for performing an arbitrary operation, and a position specifying unit for specifying an arbitrary position in the image. Recognizing the position and size of an object placed in the image, and inputting the position of one or more of the objects in the image when the relationship between the position and size of the object satisfies a certain constraint. And a feature extraction unit for extracting an arbitrary feature in the image.

2. The image processing apparatus according to claim 1, wherein the feature amount extracting means includes a face, an eye, a nose, a mouth, an eyebrow, an ear,
The contour and hair are face parts, and at least one of the face parts
2. The image processing apparatus according to claim 1, wherein the position, the size, and the shape of the corresponding face part are extracted as feature amounts.

3. The feature amount extracting unit binarizes an input image by a plurality of methods and a plurality of thresholds, determines the position, size, and shape of a region in the image, and determines the highest reliability. 3. The image processing apparatus according to claim 2, further comprising an area detecting unit that detects an area to be recognized by selecting an image.

4. The feature quantity extracting means predicts the size of the face part from the distance relationship between the positions of two or more face parts specified by the position specifying means, thereby obtaining the position and size of the face part. 3. The image processing apparatus according to claim 2, further comprising a face part recognizing means for detecting the height.

5. The image processing apparatus according to claim 4, wherein the target of the position and size to be detected by the face part recognition means is an eye of the face part.

6. The face part recognizing means, wherein a line connecting left and right eyes is horizontal with respect to a detected position of both eyes.
6. The image processing apparatus according to claim 4, further comprising: means for rotating a face image.

7. The face part recognizing means sets a search range based on the detected position and size of the eyes, detects image features representing the inclination and thickness of the eyes in the range, and determines the shape of the eyes. 5. The apparatus according to claim 4, further comprising:
7. The image processing apparatus according to any one of claims 1 to 6.

8. The image processing apparatus according to claim 4, wherein said face part recognition means detects a position and a size of a mouth of the face part.

9. An image processing apparatus according to claim 4, wherein said face part recognizing means detects a position or a size of an eyebrow of the face part.

10. The face part recognizing means sets a search range based on the detected position and size of the eyebrow, detects image features representing the thickness and how to bend the eyebrow within the range, and detects the shape of the eyebrow. The image processing apparatus according to claim 4, further comprising: means for determining

11. The method according to claim 1, wherein the feature extracting means includes a contour recognizing means for detecting a contour feature of a jaw and determining a shape thereof based on one or more position information designated by the position designating means. 3. The image processing apparatus according to claim 2, wherein:

12. A hair recognizing means for estimating a head height and a hairline height based on one or more position information specified by the position specifying means, and recognizing a hair region. 3. The image processing apparatus according to claim 2, wherein:

13. The image processing apparatus according to claim 12, wherein said hair recognizing means includes a hair color extracting means for extracting a hair color.

14. A hair feature extracting means for extracting a feature of a hair portion based on one or more pieces of position information specified by the position specifying means, and a hair using the feature of the hair portion. 13. The image processing apparatus according to claim 12, further comprising a hair contour extracting means for extracting a contour, and a hair classifying means for classifying the hair using the hair contour.

15. The hair recognizing means includes a face contour feature extracting means for extracting facial contour features, and a hair classifying means for classifying hair using the hair features and the facial contour features. The image processing apparatus according to claim 12, wherein:

16. The hair recognizing means includes bangs feature extracting means for extracting features of a bangs part, and back hair feature extracting means for extracting features of a back hair part, and when an image including a hair part is inputted, 13. The image processing apparatus according to claim 12, wherein a bangs part is determined using the bangs feature extracted by the bangs feature extraction unit and the back hair feature extracted by the back hair feature extraction unit.

17. The hair recognizing means includes bangs feature extracting means for extracting features of a bangs portion, and back hair feature extracting means for extracting features of a back hair portion, and when an image including a hair portion is input, 13. The image processing apparatus according to claim 12, wherein a back hair part is determined by using a bangs feature extracted by the bangs feature extraction unit and a back hair feature extracted by the back hair feature extraction unit. .

18. A means for obtaining face part feature information of the size and shape of a face part in an image, a plurality of face part types corresponding to the feature information, and storing a plurality of part data for each face part type. Face part data storage means, a face part data extraction means for extracting appropriate part data from the face part data storage unit based on the face part characteristic information, and a face part data An image processing apparatus comprising means for arranging another face part at a position suitable for the contour by determining the arrangement position of the part for each face outline part type stored in the part data storage unit. .

19. A means for obtaining information on the position of a face part in an image, and based on the obtained position information on the face part, correct the arrangement position of another face part determined corresponding to the face outline, and 19. The image processing apparatus according to claim 18, further comprising: means for determining an arrangement position of the component.

20. In the facial part data to be arranged at the time of portrait composition, when a part change designation is made for a specific color of a set of related parts, the parts are automatically changed also for the other related parts. 19. The image processing apparatus according to claim 18, further comprising means for generating data that does not cause inconsistency.

21. When one face part data has two or more layers in the face part data stored in the face part data storage unit, the face part data is combined with another face part data. 19. The apparatus according to claim 18, further comprising means for changing the arrangement order of the layers based on the color information in the parts and determining the arrangement order of the parts and the layers constituting the parts in an appropriate order.
The image processing apparatus according to any one of the preceding claims.