JP2000259833A

JP2000259833A - Face image processor and processing method therefor

Info

Publication number: JP2000259833A
Application number: JP11060079A
Authority: JP
Inventors: Hiroshi Sukegawa; 寛助川
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1999-03-08
Filing date: 1999-03-08
Publication date: 2000-09-22
Anticipated expiration: 2019-03-08
Also published as: JP4377472B2

Abstract

PROBLEM TO BE SOLVED: To provide a face image processor capable of obtaining a desired face image by automatically judging the state of the face of an object. SOLUTION: This face image processor is provided with an image input part 11 for continuously inputting the image of a human body including the face image and image selection parts 14-19 and 21 for judging the state of the face of the human body, e.g., the state of the pupil of the human body, from the image and automatically selecting the image including an optimum face image. Thus, for instance, by turning a camera to the object for a fixed time, an image including the optimum face image in which eyes are not closed is obtained.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of the Invention] The present invention relates to an image processing apparatus, and more particularly to a face image processing apparatus that handles human face images and to a method therefor.

[0002]

[Description of the Related Art] In recent years, digital imaging devices such as electronic still cameras have spread remarkably and are widely used in various fields.

[0003] For example, when a person is photographed with an electronic still camera, a videophone, or a surveillance camera, and the face of one or more persons is to be captured while the face orientation, eyes, mouth, and so on are in a desired state, one of two methods is used. Either the subject is asked to put his or her face into the desired state, or, as with surveillance cameras, the scene is recorded continuously on videotape or the like and the best image is selected afterwards by eye.

[0004]

[Problems to be Solved by the Invention] However, when one or several persons are photographed, obtaining the image the photographer wants requires telling the subjects the desired facial state in advance and having them assume it. When several people are photographed, the picture must be retaken if even one of them turns out to be in an unsuitable state. Photography is therefore very difficult when the subject should not know that he or she is being photographed, as in surveillance, or when there are several people whose facial states always differ from one another.

[0005] In view of the above problems, an object of the present invention is to provide a face image processing apparatus that can automatically determine the state of a face and acquire a desired image.

[0006]

[Means for Solving the Problems] To solve the above problems, the present invention is a face image processing apparatus characterized by comprising: image input means for continuously inputting images of a person including a face image; and image selection means for determining the state of the person's face from the plurality of continuous images input by the image input means and selecting an image including an optimum face image.

[0007] That is, with the above structure, the present invention detects a person's face image from continuous images supplied by a camera or the like, identifies the face image that is in the state the photographer intends, for example a state in which the subject's eyes are open, and selects it. The photographer therefore no longer has to worry, as before, about whether the subject's eyes are closed. Simply pointing the camera at the subject for a fixed time is enough for the apparatus to select automatically a face image in which, for example, the eyes are not closed.

[0008] The present invention is also a face image processing apparatus characterized by comprising: image input means for continuously inputting images of a plurality of persons including face images; and image selection means for determining the states of the faces of the plurality of persons from the plurality of continuous images input by the image input means, and selecting and outputting the image in which the facial state of each of the plurality of persons is optimum.

[0009] With the above structure, the present invention evaluates each of the face images of the plurality of persons and automatically selects, from the stored images, the image in which these evaluations are best overall. It thereby provides a face image processing apparatus that automatically selects the best image even for, for example, a group photograph.

[0010] The present invention is also a face image processing apparatus characterized by comprising: image input means for continuously inputting images of a plurality of persons including face images; image selection means for determining the state of the face of each of the plurality of persons from the plurality of continuous images input by the image input means, and selecting, for each person's face image, the image in which that face is in its optimum state, for all of the plurality of persons; and face image synthesis means for combining the optimum face images of the respective persons selected by the image selection means and outputting them as a single image.

[0011] With the above structure, the present invention evaluates the face image of each of the plurality of persons and detects the optimum face image for each person. By subsequently combining these optimum face images, it provides a face image processing apparatus that automatically produces a group photograph in which every person has the best expression, something that was not easily achieved with conventional cameras.

[0012] The present invention is also a face image processing apparatus characterized by comprising: image input means for continuously inputting images of a plurality of persons including face images; and attribute head-count determination means for determining the number of persons per attribute among the plurality of persons in the photographed area by comparing the states of their faces, taken from the plurality of continuous images input by the image input means, with average-face values for each attribute such as gender, nationality, and adult/child.

[0013] With the above structure, the present invention automatically determines and outputs subject attributes such as gender and adult/child for a photograph showing a plurality of persons, and can thereby provide an image carrying more information.

[0014] The present invention is also a face image processing apparatus characterized by comprising: image input means for continuously inputting images of a person including a face image; pupil region detection means for detecting the pupil region of the person's face image from the plurality of continuous images input by the image input means; pupil region variation detection means for detecting the images of the pupil region detected by the pupil region detection means over a continuous predetermined time and outputting their variation state; and image selection means for comparing the variation state output by the pupil region variation detection means with a pupil dictionary stored in advance, and thereby selecting and outputting an image including a face image whose pupil region is closest to the state desired by the photographer.

[0015] With the above structure, the present invention automatically detects the pupil region and examines images of this region over a predetermined time, thereby recognizing images of the eye state, including blinks, and automatically selecting the image containing the eye state best suited to photography. Simply by pointing the camera at the subject for a fixed time, a face image containing pupils in the state best suited to photography, with the eyes neither too wide open nor closed, is selected automatically, so an appropriate face image can be provided.

[0016] The present invention is also a face image processing apparatus characterized by comprising: image input means for continuously inputting images of a person including a face image; mouth region candidate determination means for detecting the positions of the pupils and nostrils of the person's face from the plurality of continuous images input by the image input means and determining a mouth region candidate on that basis; and mouth region detection means for detecting the mouth region. The mouth region detection means takes as a reference image the image binarized with a threshold at which only the dark areas of the mouth region candidate are extracted, repeats the binarization while changing the threshold in the direction that gradually increases the number of dark pixels, obtains the difference image from the reference image, labels the resulting difference image, and detects a label that reaches a predetermined size.

[0017] With the above structure, the present invention detects the mouth region automatically by the above procedure, so the mouth region can be recognized accurately even when a moustache or the like is present. This makes it possible to detect the mouth region in an arbitrary state, for example in a smiling expression.

[0018] The present invention is also a face image processing apparatus characterized by comprising: image input means for continuously inputting images of a person including a face image; face region extraction means for obtaining, from the plurality of continuous images input by the image input means, an appropriate threshold at which the pupils and nostrils can be detected in the central face region containing the pupils, mouth, and nostrils, and for binarizing and labeling the whole face region with that threshold so as to extract a single connected face region lying inside the hair; face region measurement means for determining whether the ears are hidden by the hair from the left and right ends of the face region extracted by the face region extraction means, the arrangement of the pupil positions, and the ratio of the distances to the center coordinates of both pupils, and for obtaining, based on this determination, the size of the face region excluding the ears; and face region correction means for enlarging or reducing the image of the face region to an arbitrary size based on the size of the face region obtained by the face region measurement means.

[0019] With the above structure, the present invention detects the true width of the face regardless of whether the ears protrude from the hair and, even when the face appears small in the image, automatically enlarges it to the optimum size according to the settings before output. Because the size of the face image is recognized accurately and scaling is performed automatically according to that size, even a photograph of a person who appeared only small on the screen is automatically enlarged to an appropriate size, so the apparatus can automatically provide a good image.

[0020] The present invention is also a face image processing apparatus characterized by comprising: image input means for continuously inputting images of a person including a face image; image storage means for storing a plurality of the images input by the image input means; face region extraction means for extracting a face region from the images stored by the image storage means; pupil detection means for detecting a pupil region within the face region extracted by the face region extraction means; mouth detection means for detecting a mouth region within the face region extracted by the face region extraction means; pupil state determination means for determining the state of the pupil region detected by the pupil detection means; mouth state determination means for determining the state of the mouth region detected by the mouth detection means; face state determination means for determining the state of the extracted face region based on the determinations made by the pupil state determination means and the mouth state determination means; image selection means for automatically selecting, from the plurality of images stored by the image storage means, an image including a face image in the optimum facial state based on the result determined by the face state determination means; and face region correction means for detecting the size of the face image in the image selected by the image selection means and enlarging or reducing it to an arbitrary size.

[0021] With the above structure, the present invention is a face image processing apparatus that automatically achieves the optimum pupil state, mouth state, and face size.

[0022] Needless to say, the method inventions provide the same operation and effects in the same manner.

[0023]

[Description of the Preferred Embodiments] An embodiment of the present invention is described below in detail with reference to the drawings.

[0024] First, an embodiment is described of an apparatus that uses this method to recognize the state of the face of one or more persons contained in continuous images input from a television camera or an electronic still camera, and to capture the faces in the state the photographer desires.

[0025] (1) Overview of the Overall Processing of the Embodiment

FIG. 1 is a configuration diagram showing an example of a system according to an embodiment of the present invention. In FIG. 1, the embodiment consists either of a television camera and monitor 1 together with devices 2 and 3, each a PC (or workstation), or of a device 4 that, like an electronic still camera, houses computing and storage units equivalent to those of a PC in a portable casing and is equipped with a small liquid-crystal, plasma, or similar display.

[0026] FIG. 2 is a block diagram that follows the processing of the system according to the embodiment of the present invention. In FIG. 2, the system according to the present invention has an image input unit 11, an image storage unit 12, a face region extraction unit 13, a pupil detection unit 14, a nostril detection unit 15, a mouth detection unit 16, a pupil state determination unit 17, a mouth state determination unit 18, a face state determination unit 19, an attribute-based counting unit 20, an optimum image capture unit 21, an optimum image synthesis unit 22, a face size correction unit 23, and an output unit 24.

[0027] In such a system, the image processing of the present invention proceeds as follows. A digitized image is input from the image input unit 11 and its content is stored continuously in the image storage unit 12. The face region extraction unit 13 is applied to the input image to extract the faces of the one or more persons present in it. In each extracted face region, the pupil detection unit 14, the nostril detection unit 15, and the mouth detection unit 16 are used to locate the eyes, nose, and mouth. Once the facial parts have been detected, the pupil state determination unit 17 and the mouth state determination unit 18 obtain the open/closed state of the pupils, the gaze state, the open/closed state of the mouth, and so on, and the face state determination unit 19 uses these results to determine what state the face of each subject is in.

[0028] The attribute-based counting unit 20 obtains attributes such as gender and adult/child for each person in the photographed area, and counts the number of persons per attribute and in the photographed area as a whole. The optimum image capture unit 21 determines, image by image, whether each acquired image is in the state the photographer desires, and outputs the one closest to the optimum state among the images obtained. When several persons are being photographed, the optimum image synthesis unit 22 stores the optimum image for each subject and combines them into the final output image.
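The division of labour between the optimum image capture unit 21 and the optimum image synthesis unit 22 can be illustrated with a minimal sketch. It assumes that each frame already carries one evaluation value per person (higher is better); the function names, and the use of a plain sum as the "overall" score of a frame, are illustrative assumptions and are not taken from the patent text.

```python
from typing import Dict, List


def best_single_frame(scores: List[Dict[str, float]]) -> int:
    """Index of the frame whose summed evaluation value over all persons is highest.

    Summation is an assumed way of combining per-person values.
    """
    return max(range(len(scores)), key=lambda t: sum(scores[t].values()))


def best_frame_per_person(scores: List[Dict[str, float]]) -> Dict[str, int]:
    """For each person, the index of the frame in which that person scores highest.

    These per-person frames are what a synthesis step would later composite.
    """
    best: Dict[str, int] = {}
    for t, frame in enumerate(scores):
        for person, value in frame.items():
            if person not in best or value > scores[best[person]][person]:
                best[person] = t
    return best
```

With one shared frame, the result is a compromise over all subjects; with per-person selection, each face can come from a different frame, which is why a separate synthesis step is needed.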

[0029] The results and candidate images are displayed by the output unit 24, either at the input image size or with the size corrected by the face size correction unit 23, to inform the photographer of the result.

[0030] Next, the operation of each of the processing units 11 to 23 is described in detail with reference to the drawings.

[0031] (2) Processing of the Image Input Unit 11

Images are digitized in color or monochrome and input using, for example, a television camera for moving-image input or an electronic still camera for still-image input, installed so that one or more persons are in view. The gradation and size of the input image are not specifically limited and follow the input gradation and input resolution of the camera.

[0032] (3) Processing of the Image Storage Unit 12

The image captured by the image input unit 11 is stored in memory as it is, and the several immediately preceding images (up to N frames back) are stored in a separate area.

[0033] (4) Processing of the Face Region Extraction Unit 13

Within the human face, the region extending vertically from the eyebrows to around the lips and horizontally to just outside the outer ends of both eyes is defined as the face search region. A face dictionary for face search is created in advance from images of several people, for example as an average image or by applying the KL expansion and using the eigenvectors of the leading components. Various images are also evaluated beforehand with the face search dictionary, and any region that is highly similar to the face dictionary but is not a face is collected into a non-face dictionary.

To remove the influence of face size, enlarged and reduced versions of the input image are created at several scales, and each is searched for face regions using the multiple similarity method or template matching. The scanning procedure is shown in the explanatory diagram of FIG. 3. Ideally a face region has high similarity to the face dictionary and low similarity to the non-face dictionary, so the evaluation value is defined as

evaluation value = (similarity to the face dictionary) - (similarity to the non-face dictionary)

and the location with the highest evaluation value is taken as the first face detection region. Regions that do not overlap the region with the highest value, lie at least a predetermined distance away from it, and give an evaluation value at or above a predetermined threshold are also treated as face detection regions. In this way everyone is detected even when several people appear in the input image, and the number of persons in the photographed area can also be counted.
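The selection of one or more face regions from scored candidate locations can be sketched as follows. This is a minimal illustration, not the patented implementation: the similarity values are assumed to have been computed already, the names `select_faces`, `score_thresh`, and `min_dist` are hypothetical, and "non-overlapping and sufficiently far apart" is approximated by a minimum distance between candidate centers.

```python
import math
from typing import List, Tuple

# (x, y, similarity to face dictionary, similarity to non-face dictionary)
Candidate = Tuple[float, float, float, float]


def select_faces(cands: List[Candidate], score_thresh: float,
                 min_dist: float) -> List[Tuple[float, float, float]]:
    """Return (x, y, evaluation value) for each accepted face region."""
    # evaluation value = face-dictionary similarity - non-face-dictionary similarity
    scored = sorted(((sf - sn, x, y) for x, y, sf, sn in cands), reverse=True)
    faces: List[Tuple[float, float, float]] = []
    for score, x, y in scored:
        # The top-scoring location is always the first face; the others
        # must reach the evaluation threshold.
        if faces and score < score_thresh:
            break
        # Assumed overlap test: centers must be at least min_dist apart.
        if all(math.hypot(x - fx, y - fy) >= min_dist for fx, fy, _ in faces):
            faces.append((x, y, score))
    return faces
```

The length of the returned list then gives the head count for the photographed area.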

[0034] (5) Processing of the Pupil Detection Unit 14

For each face region extracted by the face region extraction unit 13, a circular separability filter (see "Face Recognition System Using Moving Images," Osamu Yamaguchi et al., IEICE Technical Report PRMU97-50, pp. 17-23) is applied at several radii, and locations that are circular and darker than their surroundings are enumerated as pupil candidate points. Because the pupil region is assumed to lie in the upper part of the face, the search need not cover the whole face. Processing can also be sped up by computing the circular separability, which is obtained from the ratio of the luminance variances of the outer and inner regions shown in FIG. 4, only at locations that binarization has judged to be dark.

The candidate points are then narrowed down to combinations (one left-right pair) using geometric arrangement conditions suited to the application. For example, upper and lower thresholds on the distance between the pupils are set according to the distance from the camera, or, if only stationary frontal faces are expected, an angle threshold is set so that the line connecting the pupils is close to horizontal. The following evaluation value is computed for each of the two eyes, and the sum of the left and right values is taken as the evaluation value of the combination.
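A circular separability measure of this kind can be sketched as below. The sketch assumes the commonly used formulation in which separability is the between-class variance of the inner disc and the surrounding ring divided by the total variance of all pixels in both regions; the patent text only says that a ratio of luminance variances of the two regions in FIG. 4 is used, so the exact formula, the ring width (from r to 2r), and the function name are assumptions.

```python
from typing import List, Sequence


def separability(img: Sequence[Sequence[float]], cx: int, cy: int, r: int) -> float:
    """Separability (0..1) of a dark disc of radius r centered at (cx, cy).

    Returns 0.0 when the inner disc is not darker than the ring, because
    pupil candidates are expected to be darker than their surroundings.
    """
    inner: List[float] = []
    outer: List[float] = []
    for y in range(cy - 2 * r, cy + 2 * r + 1):
        for x in range(cx - 2 * r, cx + 2 * r + 1):
            if not (0 <= y < len(img) and 0 <= x < len(img[0])):
                continue
            d2 = (x - cx) ** 2 + (y - cy) ** 2
            if d2 <= r * r:
                inner.append(img[y][x])
            elif d2 <= 4 * r * r:
                outer.append(img[y][x])
    if not inner or not outer:
        return 0.0
    pixels = inner + outer
    mean = sum(pixels) / len(pixels)
    total = sum((p - mean) ** 2 for p in pixels)
    if total == 0:
        return 0.0  # flat patch: nothing to separate
    m_in = sum(inner) / len(inner)
    m_out = sum(outer) / len(outer)
    if m_in >= m_out:
        return 0.0
    between = len(inner) * (m_in - mean) ** 2 + len(outer) * (m_out - mean) ** 2
    return between / total
```

Evaluating this only at pixels that a prior binarization marked as dark, and only in the upper part of the face region, gives the speed-up described in the paragraph above.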

[0035] evaluation value = (similarity to the pupil dictionary) - (similarity to the non-pupil dictionary)

Each dictionary is created in advance from data on several test subjects, in the same way as for the face region extraction unit. The pupil dictionary in this case holds the various pupil states, such as wearing glasses, eyes closed, sideways glance, and half-open eyes, as separate dictionaries, so the pupil region can be detected stably even in states such as closed eyes or a sideways glance. The non-pupil dictionary is likewise divided into classes that are easily mistaken for pupils, such as nostrils, the inner and outer corners of the eyes, and eyebrows, each with its own dictionary. When the non-pupil similarity is computed, the dictionary giving the highest similarity is selected, which guards against various kinds of extraction failure. This is illustrated in FIG. 6.
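The use of several dictionaries in this evaluation value can be sketched as follows. The non-pupil term takes the highest similarity among the non-pupil classes, as the text states; taking the highest similarity among the pupil-state dictionaries for the pupil term is an assumption, as are the function names and the dictionary keys used in the usage line.

```python
from typing import Dict, Tuple

Sims = Dict[str, float]  # dictionary name -> similarity


def pupil_evaluation(pupil_sims: Sims, non_pupil_sims: Sims) -> float:
    """Evaluation value of one candidate point."""
    # Assumption: the best-matching pupil-state dictionary represents the pupil term.
    # Stated in the text: the best-matching non-pupil class is the one subtracted.
    return max(pupil_sims.values()) - max(non_pupil_sims.values())


def pair_evaluation(left: Tuple[Sims, Sims], right: Tuple[Sims, Sims]) -> float:
    """Evaluation value of a left-right pair: the sum of both eyes' values."""
    return pupil_evaluation(*left) + pupil_evaluation(*right)
```

For example, `pupil_evaluation({"open": 0.9, "closed": 0.4}, {"nostril": 0.3, "eyebrow": 0.5})` subtracts the eyebrow similarity, because that is the non-pupil class the candidate most resembles.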

[0036] The accuracy of pupil detection can also be improved by combining it with the nostril detection unit 15 and defining geometric constraints as shown in FIG. 5.

[0037] (6) Processing of the Nostril Detection Unit 15

The nose region is restricted using the positional relationship obtained from the face region extraction unit 13 and the pupil detection unit 14. In the central part of the face region, below both pupils, binarization and circular separability filtering are applied in the same way as in the pupil detection unit 14, and dark, round areas are enumerated as nostril candidate points. For each candidate, similarities to a nostril dictionary and a non-nostril dictionary are computed, as in the face region extraction unit, and the following evaluation value is obtained at each point.

[0038] evaluation value = (similarity to the nostril dictionary) - (similarity to the non-nostril dictionary)

Among all two-point combinations of the candidate points, the pair (left and right points) that satisfies the geometric arrangement conditions given in advance relative to the pupils and has the highest evaluation value is found and detected as the positions of the two nostrils. As noted for the pupil detection unit 14, accuracy can also be improved by applying the geometric arrangement conditions to the four points of the pupils and nostrils together.
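The search over two-point combinations can be sketched as follows. The geometric condition is passed in as a predicate because the patent leaves its details to FIG. 5; the example predicate `level_and_spaced`, its thresholds, and the use of the summed evaluation value to rank pairs are illustrative assumptions. A real condition would also refer to the detected pupil positions.

```python
from itertools import combinations
from typing import Callable, List, Optional, Tuple

Point = Tuple[float, float, float]  # (x, y, evaluation value)


def best_pair(points: List[Point],
              ok: Callable[[Point, Point], bool]) -> Optional[Tuple[Point, Point]]:
    """Best-scoring pair of candidates that satisfies the geometric condition."""
    pairs = [(a, b) for a, b in combinations(points, 2) if ok(a, b)]
    if not pairs:
        return None
    # Assumption: a pair is ranked by the sum of its two evaluation values.
    return max(pairs, key=lambda p: p[0][2] + p[1][2])


def level_and_spaced(a: Point, b: Point) -> bool:
    """Hypothetical condition: roughly level and a plausible distance apart."""
    return 10 <= abs(a[0] - b[0]) <= 40 and abs(a[1] - b[1]) <= 5
```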

[0039] (7) Processing of the Mouth Detection Unit 16

Because the arrangement of the face, eyes, and nose has been obtained by the face region extraction unit 13, the pupil detection unit 14, and the nostril detection unit 15, the center of the two pupils and the center of the two nostrils are found, and the region where the mouth is expected to be is computed from the average geometric arrangement. FIG. 5 is an explanatory diagram of the positional relationship among the pupils, nostrils, and mouth used in the pupil detection unit and the nostril detection unit of the present invention.

[0040] FIG. 7 is an explanatory diagram of the detection processing of the mouth detection unit 16 of the present invention. In FIG. 7, the mouth candidate region is binarized so that pixels whose luminance is at or below a predetermined threshold, chosen so that only the darkest pixels in the region appear, become black pixels and all others become white pixels; this image is the reference image. Because the areas extracted even at this threshold are dark or black, they are taken to be a moustache or beard or an open mouth. The threshold is then raised gradually and the binarization repeated, and labeling is applied to the difference image from the reference image. When a horizontally long region (label) appears and grows, it is taken as the mouth region at the point where both its height and its width reach predetermined sizes. Pitch-black areas such as facial hair, whose size hardly changes from the binarization result at the initial threshold, are removed by the difference processing and can thus be distinguished from the mouth region.
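The progressive-threshold procedure can be sketched on a small grayscale grid as follows. This is a minimal illustration under stated assumptions: 4-connected labeling, bounding boxes as the label size, and a simple "wider than tall" test for "horizontally long" are choices made here, and the parameter names (`t0`, `t_max`, `step`, `min_w`, `min_h`) are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), inclusive


def binarize(img: Sequence[Sequence[int]], t: int) -> List[List[bool]]:
    """True (black) where luminance is at or below threshold t."""
    return [[p <= t for p in row] for row in img]


def label_boxes(mask: List[List[bool]]) -> List[Box]:
    """Bounding boxes of the 4-connected components of the True pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes: List[Box] = []
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            seen[sy][sx] = True
            stack = [(sx, sy)]
            x0 = x1 = sx
            y0 = y1 = sy
            while stack:
                x, y = stack.pop()
                x0, x1 = min(x0, x), max(x1, x)
                y0, y1 = min(y0, y), max(y1, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


def find_mouth(img: Sequence[Sequence[int]], t0: int, t_max: int, step: int,
               min_w: int, min_h: int) -> Optional[Box]:
    """Raise the threshold from t0 and return the first wide label in the difference image."""
    base = binarize(img, t0)  # reference image: only the darkest pixels
    t = t0 + step
    while t <= t_max:
        cur = binarize(img, t)
        # Difference image: black now, but not black in the reference image.
        diff = [[c and not b for c, b in zip(cr, br)] for cr, br in zip(cur, base)]
        for x0, y0, x1, y1 in label_boxes(diff):
            bw, bh = x1 - x0 + 1, y1 - y0 + 1
            if bw >= min_w and bh >= min_h and bw > bh:
                return (x0, y0, x1, y1)
        t += step
    return None
```

Pixels that are already black at `t0`, such as a moustache, never appear in the difference image, so they cannot be returned as the mouth.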

[0041] (8) Processing of the Pupil State Determination Unit 17

For each of the left and right pupil regions obtained by the pupil detection unit 14, dictionaries are prepared for the various eye states, such as "eyes closed," "half-open," "sideways glance," and "upward glance," and the state whose dictionary gives the highest similarity to the obtained pupil image is judged to be the current pupil state.

[0042] As also described for the face state determination unit 19 below, if the photographer has selected in advance which state is desired, the optimal image is selected by the following method.

[0043] FIG. 9 is a flowchart showing the determination processing of the pupil state determination unit. With this processing, an optimal image can be selected even when the state of the pupil changes from moment to moment, for example through blinking or movement of the line of sight, and even for a subject whose eyes are narrow, which makes it hard to judge whether they are open or closed.

[0044] The evaluation value is the difference between the similarity to the dictionary representing the desired state and the highest similarity among all the other dictionaries. A high value means that the eye is close to the ideal state and clearly distinguishable from the other states. If this evaluation value were judged from a single image, a person with narrow eyes whose eyes are open could not be told apart from a person with large eyes whose eyes are half-closed. Therefore N consecutive images are accumulated, where N is large enough to cover a period longer than the time a blink takes from start to finish, and the variance and mean of the evaluation values are calculated.
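As a rough illustration of this evaluation value, the margin between the desired state and the best competing state can be computed per frame and then summarized over N frames. The state names, the dictionary-of-similarities representation, and the use of the population variance are assumptions made for this sketch.

```python
from statistics import mean, pvariance

def evaluation_value(similarities, desired):
    """Similarity to the desired dictionary minus the best similarity among the others.

    similarities: dict mapping a state name to its similarity score.
    desired: key of the state the photographer wants.
    """
    best_other = max(v for k, v in similarities.items() if k != desired)
    return similarities[desired] - best_other

def summarize(frames, desired):
    """Evaluation values over N consecutive frames, plus their mean and variance."""
    values = [evaluation_value(f, desired) for f in frames]
    return values, mean(values), pvariance(values)
```

A negative value simply means that some other state matched better than the desired one in that frame.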

[0045] In FIG. 9, when the variance of the evaluation values is small (S31), the state of the eyes is regarded as hardly changing. If the evaluation value stays above the mean for the longer time (S32), the image whose evaluation value is closest to the mean among those above the mean is taken as the optimal image (S35). If it stays below the mean for the longer time, the image whose evaluation value is closest to the mean among those below the mean is selected as the optimal image (S33). Conversely, when the variance is large, the state of the eyes is considered to be fluctuating greatly, and the image with the highest evaluation value is taken as the optimal image (S34).

[0046] FIG. 10 illustrates the determination processing of the pupil state determination unit in the present invention. Taking it as an example, in (a) and (b) there is little movement, the variance is small, and the evaluation value is above the mean for the longer time, so the image whose evaluation value is closest to the mean among those above the mean is selected. In (c) the fluctuation is large and the variance is large, so the image with the highest evaluation value is selected. In (d) the variance is small and the evaluation value is below the mean for the longer time, so the image whose evaluation value is closest to the mean among those below the mean is selected.
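The branching of FIG. 9 could be expressed along the following lines. This is a sketch under assumptions: `variance_threshold` is a hypothetical tuning parameter, "time above the mean" is approximated by counting frames, and ties between the two counts are resolved in favor of the above-mean group, none of which the text specifies.

```python
from statistics import mean, pvariance

def select_optimal(values, variance_threshold):
    """Return the index of the frame chosen as the optimal image."""
    avg = mean(values)
    if pvariance(values) > variance_threshold:
        # Large variance: the eye state fluctuates, so take the highest value (S34).
        return max(range(len(values)), key=lambda i: values[i])
    above = [i for i, v in enumerate(values) if v > avg]
    below = [i for i, v in enumerate(values) if v < avg]
    # Small variance: use whichever side of the mean lasted longer (S32).
    pool = above if len(above) >= len(below) else below
    if not pool:
        return 0  # all values identical; any frame will do
    # Within that side, take the frame closest to the mean (S35 or S33).
    return min(pool, key=lambda i: abs(values[i] - avg))
```

For example, `select_optimal([0.50, 0.52, 0.51, 0.40, 0.53], 0.01)` falls into the small-variance, mostly-above-mean case and picks the above-mean frame nearest the mean.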

[0047] (9) Processing of the mouth state determination unit 18
Next, FIG. 11 shows a flowchart of the processing of the mouth state determination unit 18.

[0048] In FIG. 11, whether the mouth is open or closed is determined by comparing the vertical width of the mouth, its horizontal width, and the two widths taken together with thresholds set for each. If the vertical width of the mouth is at or above a predetermined threshold (S41), the mouth is judged to be open (S44). If the vertical width is at or below that threshold and the horizontal width is at or above a predetermined threshold (S42), the mouth is judged to be closed (S45). If neither applies, the mouth image is normalized to a fixed size using its vertical and horizontal widths and compared with dictionaries for a plurality of states (S43), and the state of the mouth is determined (S46, S47). A dictionary is prepared for each state, for example a normal mouth, a pouting mouth, clenched teeth, and a tongue stuck out.
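The decision order of FIG. 11 might be sketched as below. The parameter names and the representation of the dictionary comparison as a precomputed mapping from state name to similarity are assumptions for illustration; the normalization step is assumed to have happened before those similarities were computed.

```python
def mouth_state(height, width, open_height_min, closed_width_min, dictionary_similarities):
    """Classify the mouth following the S41 -> S42 -> S43 order.

    height, width: vertical and horizontal widths of the detected mouth region.
    dictionary_similarities: dict of state name -> similarity of the
        size-normalized mouth image to that state's dictionary (used only
        when neither width test decides).
    """
    if height >= open_height_min:      # S41 -> S44
        return "open"
    if width >= closed_width_min:      # S42 -> S45
        return "closed"
    # S43: fall back to the best-matching state dictionary (S46, S47).
    return max(dictionary_similarities, key=dictionary_similarities.get)
```

A call such as `mouth_state(4, 15, 10, 25, {"normal": 0.3, "pouting": 0.7, "clenched": 0.2})` reaches the dictionary comparison and returns the state with the highest similarity.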

[0049] (10) Processing of the face state determination unit 19
Using the outputs of the pupil state determination unit 17 and the mouth state determination unit 18, this unit determines whether the face is in the state desired by the photographer. For an identification photograph, for example, the desired state is "the pupils are open and facing forward, and the mouth is closed". For a snapshot, it might be "the pupils are open and the mouth may be in any state" or "the pupils are open and the mouth is smiling".

[0050] For the actual state determination, a matrix such as the one shown in FIG. 12 is prepared, with the pupil states on one axis and the mouth states on the other, and each cell records whether that combination is a desired state.
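One simple way to represent such a matrix is a lookup table keyed by a (pupil state, mouth state) pair. The state names and the two example presets below are hypothetical; FIG. 12 itself is not reproduced in this text.

```python
PUPIL_STATES = ("open_front", "half_open", "closed")
MOUTH_STATES = ("closed", "open", "smiling")

def build_matrix(desired_pairs):
    """Fill every cell with True if the combination is desired, else False."""
    return {(p, m): (p, m) in desired_pairs
            for p in PUPIL_STATES for m in MOUTH_STATES}

# Identification photo: pupils open and facing forward, mouth closed.
ID_PHOTO = build_matrix({("open_front", "closed")})
# Snapshot: pupils open, any mouth state.
SNAPSHOT = build_matrix({("open_front", m) for m in MOUTH_STATES})

def is_desired(matrix, pupil_state, mouth_state):
    """Look up one cell; unknown combinations count as not desired."""
    return matrix.get((pupil_state, mouth_state), False)
```

Keeping the photographer's preference in a table like this means a new shooting mode only needs a new set of desired pairs.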

[0051] (11) Processing of the attribute-based counting unit 20
For each face region extracted by the face region extraction unit 13, average face image dictionaries are held for each attribute: a dictionary of the average male and female faces, a dictionary of the average adult and child faces, dictionaries for nationalities, and so on. Similarity is calculated, each face is assigned to whichever average face it is closer to, and the number of people is counted per attribute. The face regions are then labeled with attributes based on the result. The total head count can also be obtained by adding up all the persons present in the imaged area regardless of attribute.
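A minimal sketch of nearest-average-face counting is shown below. The text does not say how similarity is measured at this point, so cosine similarity between flattened feature vectors is used purely as a stand-in, and the function names are hypothetical.

```python
from collections import Counter
from math import sqrt

def cosine_similarity(a, b):
    """Stand-in similarity measure (an assumption, not taken from the patent)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def count_by_attribute(faces, average_faces):
    """Label each face with the closest average face and count per label.

    faces: list of feature vectors, one per extracted face region.
    average_faces: dict of attribute value -> average-face feature vector.
    Returns (labels, per-attribute counts, total head count).
    """
    labels = [
        max(average_faces, key=lambda name: cosine_similarity(face, average_faces[name]))
        for face in faces
    ]
    return labels, Counter(labels), len(faces)
```

One such call would be made per attribute (gender, adult or child, and so on), each with its own set of average faces.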

[0052] (12) Processing of the optimal image capturing unit 21
For the time-series consecutive images accumulated within a predetermined time, the matrix described for the face state determination unit 19 is used to assess whether each image is in the state desired by the photographer. For every image, a score is computed for each person and for each facial part, multiplied by a coefficient, and summed to give the evaluation value. The formula is as follows.

[0053]
$$\text{evaluation value} = \sum_{\text{face}} \sum_{\text{part}} \text{coefficient} \times \left(\text{similarity to the desired dictionary} - \text{highest similarity among the non-desired dictionaries}\right)$$
Here "face" ranges over all faces contained in the imaging area, and "part" ranges over the eyes and mouth in each face region. Among the images obtained, the one with the highest evaluation value is selected as the optimal image.
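The per-image evaluation value and the final selection can be sketched as follows. The data layout, one dict per face mapping a part name to a pair of similarities, and the per-part `weights` table standing in for the coefficient are assumptions for illustration.

```python
def frame_score(frame, weights):
    """Sum the weighted margins over every face and every part in one image.

    frame: list of faces; each face is a dict
        part name -> (similarity to desired dictionary,
                      highest similarity among non-desired dictionaries).
    weights: dict of part name -> coefficient (hypothetical values).
    """
    return sum(
        weights[part] * (desired - best_other)
        for face in frame
        for part, (desired, best_other) in face.items()
    )

def best_frame(frames, weights):
    """Index of the image with the highest evaluation value."""
    return max(range(len(frames)), key=lambda i: frame_score(frames[i], weights))
```

Because the sum runs over all faces, a single person blinking lowers the score of the whole image.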

[0054] (13) Processing of the optimal image synthesis unit 22
Suppose several people are being photographed and a particular state is wanted, for example a picture in which everyone in the imaging area has their eyes open and is smiling (mouth open). The processing up to the face state determination unit 19 is repeated for a predetermined time. From the accumulated images, the best image of each subject is saved as the face region plus a surrounding margin of predetermined extent, and these best images are fitted into the final output image and composited. An optimal image is thus produced without the subjects having to coordinate their timing with the shot or with one another. Compositing assumes that the subjects move as little as possible. If a subject does move, anti-aliasing is applied along the periphery of the saved region, which is taken slightly larger than the face region, so that the composite does not look unnatural.
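The per-person selection and paste step could look roughly like this. The sketch assumes that each subject's saved region is the same rectangle in every frame (that is, nobody moves) and it omits the edge smoothing described in the text; all names are hypothetical.

```python
def best_frame_per_person(scores):
    """scores[frame][person] is that person's evaluation value in that frame.

    Returns, for each person, the index of the frame where they score highest.
    """
    n_people = len(scores[0])
    return [
        max(range(len(scores)), key=lambda f: scores[f][p])
        for p in range(n_people)
    ]

def composite(frames, base_index, regions, choice):
    """Paste each person's best face region into a copy of the base frame.

    frames: list of images as 2D lists of pixel values.
    regions[p]: (top, left, height, width) of person p's saved region.
    choice[p]: frame index chosen for person p.
    """
    out = [row[:] for row in frames[base_index]]
    for p, (top, left, h, w) in enumerate(regions):
        src = frames[choice[p]]
        for y in range(top, top + h):
            out[y][left:left + w] = src[y][left:left + w]
    return out
```

A fuller version would blend the pixels along each region's border instead of copying them outright.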

[0055] (14) Processing of the face size correction unit 23
The input image can be sent to the output unit 24 as it is, but here the output image is enlarged or reduced according to the size of the extracted face region or regions. The face size could be obtained from the sizes of the multi-resolution face dictionaries used in the face region extraction unit 13, but that would require as many dictionary resolutions as the size resolution needed, so a different method is used here.

[0056] Using only the luminance distribution inside the region extracted as the face region, binarization is performed by a method such as the P-tile method, which keeps the ratio of white to black pixels constant, a fixed threshold, or discriminant analysis. The area containing the surroundings of the face is then binarized with the threshold obtained when the face region was binarized. Labeling the binarized image extracts the connected region containing the center of the face. The left and right ends of that region are taken as the left and right ends of the face, and its width is taken as the face size. However, because the ears may be exposed or may be hidden by hair, the cases are classified using the pupil positions obtained by the pupil detection unit 14 and the positions of the left and right ends of the face.
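A P-tile threshold followed by a width measurement on the connected region around the face center might be sketched as below. The assumptions are that the face is the bright side of the threshold, that connectivity is 4-connected, and that `seed` is a pixel at the face center; the names are hypothetical.

```python
import math
from collections import deque

def p_tile_threshold(pixels, dark_fraction):
    """Smallest value t such that at least dark_fraction of pixels are <= t."""
    ordered = sorted(pixels)
    k = max(1, math.ceil(dark_fraction * len(ordered)))
    return ordered[k - 1]

def face_width(gray, threshold, seed):
    """Width of the bright (> threshold) connected region that contains seed=(y, x)."""
    h, w = len(gray), len(gray[0])
    sy, sx = seed
    if gray[sy][sx] <= threshold:
        return 0
    seen = {(sy, sx)}
    queue = deque([(sy, sx)])
    left = right = sx
    while queue:
        y, x = queue.popleft()
        left, right = min(left, x), max(right, x)
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w and (ny, nx) not in seen
                    and gray[ny][nx] > threshold):
                seen.add((ny, nx))
                queue.append((ny, nx))
    return right - left + 1
```

The threshold is computed from the face region's pixels only and then applied to the wider area, matching the two-step order in the text.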

[0057] FIG. 13 illustrates this processing. The explanation takes the left side as seen by the viewer as an example, with the center D between the two pupils as the reference. When the ear is exposed, the left end of the face is at position A, and a threshold is set in advance so that (length of AD) / (length of CD) is at or above it. If the ear is hidden by hair, the left end is at position B, so (length of BD) / (length of CD) is smaller than when the ear is exposed. This ratio is used to determine whether the ear is exposed. The same determination is made for the ear on the opposite side.

[0058] If the ear is not exposed, the positions extracted as the left and right ends are used as the face region as they are. If the ear is exposed, the position of B is calculated from the mean value of (A − D) / (B − D), computed in advance from data for many people, so that the result is not affected by the ear position. If the photographer has entered a desired size, the image is enlarged or reduced on the basis of the face size obtained in this way and output at the desired size.
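The ratio test and the correction of the face edge can be written out in one dimension as follows. This sketch assumes that point C in FIG. 13 is the pupil on the side being examined, which the text does not state, and that all positions are x coordinates on the same horizontal line; the parameter names are hypothetical.

```python
def ear_visible(edge_x, pupil_x, center_x, ratio_threshold):
    """True if (edge to D) / (C to D) reaches the preset threshold.

    edge_x: detected left or right end of the face (A or B in FIG. 13).
    pupil_x: assumed position of C (the pupil on that side).
    center_x: D, the midpoint between the two pupils.
    """
    return abs(edge_x - center_x) / abs(pupil_x - center_x) >= ratio_threshold

def corrected_face_edge(edge_x, pupil_x, center_x, ratio_threshold, mean_ad_over_bd):
    """Return the face edge without the ear.

    mean_ad_over_bd: mean of (A - D) / (B - D) measured beforehand on many people.
    """
    if not ear_visible(edge_x, pupil_x, center_x, ratio_threshold):
        return edge_x  # ear hidden: the detected edge is already B
    # Ear exposed: the detected edge is A, so B = D + (A - D) / mean ratio.
    return center_x + (edge_x - center_x) / mean_ad_over_bd
```

With D at 100, C at 70, a detected edge at 20, a threshold of 2.4 and a mean ratio of 1.25, the edge is judged to include the ear and is moved inward to 36.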

[0059] (15) Processing of the output unit 24
Finally, the processing of the output unit 24 is described below.

[0060] The optimal image and the optimal candidate images are displayed side by side, on a monitor in the case of a stationary apparatus using a television camera, or on the built-in monitor in the case of a portable apparatus. As shown in FIG. 14, the image judged to be optimal is displayed large, and images with high evaluation values are lined up beside it in time order. If the desired image is among the candidates, it can be selected with the up, down, left, and right buttons to change the final output image. It is also possible to mark the face region in each image, as with the rectangular area H enclosed by the dotted line in FIG. 14, and manually composite the best faces from several images.

[0061]

[Effects of the Invention] As described in detail above, according to the present invention, when the faces of one or more persons are photographed with an electronic still camera, a videophone, a surveillance camera, or the like, it is possible to determine whether the face is facing forward, whether the pupils are open or closed, whether the mouth is open or closed, and so on, without telling the subject the desired state or even that a picture is being taken, and without being affected by the narrowness or movement of the eyes. The apparatus can thus check for the face state required for the shot and automatically select and capture the optimal image.

[0062] When several people are photographed, as in a group photograph, the images of each subject in the optimal state are composited automatically, making it easy to obtain an image that is optimal for every subject.

[Brief description of the drawings]

FIG. 1 is a configuration diagram showing an example of a system according to an embodiment of the present invention.

FIG. 2 is a block diagram following the processing flow of the system according to the embodiment of the present invention.

FIG. 3 is an explanatory diagram of the processing of the face region extraction unit of the present invention.

FIG. 4 is an explanatory diagram of the processing of the circular separability filter of the pupil detection unit of the present invention.

FIG. 5 is an explanatory diagram of the positional relationship among the pupils, nostrils, and mouth in the pupil detection unit and the nostril detection unit of the present invention.

FIG. 6 is an explanatory diagram of the detection processing of the pupil detection unit in the present invention.

FIG. 7 is an explanatory diagram of the detection processing of the mouth detection unit in the present invention.

FIG. 8 is an explanatory diagram of the determination processing of the pupil state determination unit in the present invention.

FIG. 9 is a flowchart showing the determination processing of the pupil state determination unit in the present invention.

FIG. 10 is an explanatory diagram of the determination processing of the pupil state determination unit in the present invention.

FIG. 11 is a flowchart of the determination processing of the mouth state determination unit in the present invention.

FIG. 12 is an explanatory diagram of the determination processing of the face state determination unit in the present invention.

FIG. 13 is an explanatory diagram of the size correction processing of the face size correction unit in the present invention.

FIG. 14 is a diagram showing the captured image selection screen and interface in the present invention.

[Explanation of symbols]

1 … Camera
2 … Display
3 … Personal computer or workstation
4 … Digital camera including a PC-equivalent computing and storage device and an internal display device
11 … Image input unit
12 … Image storage unit
13 … Face region extraction unit
14 … Pupil detection unit
15 … Nostril detection unit
16 … Mouth detection unit
17 … Pupil state determination unit
18 … Mouth state determination unit
19 … Face state determination unit
20 … Attribute-based counting unit
21 … Optimal image capturing unit
22 … Optimal image synthesis unit
23 … Face size correction unit
24 … Output unit

Claims

[Claims]

1. A face image processing apparatus comprising: image input means for continuously inputting images of a person including a face image; and image selection means for determining the state of the person's face from the plurality of consecutive images input by the image input means and selecting an image including an optimal face image.

2. A face image processing apparatus comprising: image input means for continuously inputting images of a plurality of persons including face images; and image selection means for determining the states of the faces of the plurality of persons from the plurality of consecutive images input by the image input means, and selecting and outputting an image in which the states of the faces of the plurality of persons are optimal.

3. A face image processing apparatus comprising: image input means for continuously inputting images of a plurality of persons including face images; image selection means for determining the states of the faces of the plurality of persons from the plurality of consecutive images input by the image input means, and selecting, for each of the face images of all of the plurality of persons, an image in which the face is in the optimal state; and face image synthesis means for compositing the optimal face images of the plurality of persons selected by the image selection means and outputting them as a single image.

4. A face image processing apparatus comprising: image input means for continuously inputting images of a plurality of persons including face images; and attribute-based head count determination means for determining, from the plurality of consecutive images input by the image input means, the head count per attribute of the plurality of persons in the imaging area by comparing the states of the faces of the plurality of persons with the average face values for each attribute such as gender, nationality, and adult or child.

5. An image input means for continuously inputting an image of a person including a face image, and a pupil area for detecting a pupil area of the face image of the person from a plurality of continuous images input by the image input means. Detection means; pupil area fluctuation state detection means for detecting an image of the pupil area continuous for a predetermined time of the pupil area detected by the pupil area detection means and outputting the fluctuation state thereof; Based on the output of the fluctuation state, the image is compared with a previously stored pupil dictionary to select and output an image including a face image having a pupil region closest to the state desired by the photographer. A face image processing apparatus, comprising: an image selection unit.

6. A face image processing apparatus comprising: image input means for continuously inputting images of a person including a face image; mouth region candidate determination means for detecting the positions of the pupils and nostrils of the person's face from the plurality of consecutive images input by the image input means and determining a mouth region candidate based on those positions; and mouth region detection means for taking as a reference image an image binarized with a threshold at which only the dark regions of the mouth region candidate determined by the mouth region candidate determination means are extracted, binarizing while changing the threshold in the direction in which the number of dark pixels gradually increases, obtaining a difference image from the reference image, labeling the obtained difference image, and detecting a label of a predetermined size as the mouth region.

7. A face image processing apparatus comprising: image input means for continuously inputting images of a person including a face image; face region extraction means for obtaining, from the plurality of consecutive images input by the image input means, an appropriate threshold at which the pupils and nostrils can be detected within a central face region containing the pupils, mouth, and nostrils, binarizing the entire face region with that threshold, and performing labeling to extract a single connected face region enclosed by the hair; face region measurement means for determining whether the ears are hidden by the hair using the ratio of distances among the left and right ends of the face region extracted by the face region extraction means and the center coordinates of both pupils, and obtaining the size of the face region excluding the ears based on the result; and face region correction means for scaling the image of the face region to an arbitrary size based on the size of the face region obtained by the face region measurement means.

8. An image input means for continuously inputting an image of a person including a face image, an image storage means for storing a plurality of the images input by the image input means, and an image storage means for storing the plurality of images input by the image input means. Face area extracting means for extracting a face area in the extracted image, pupil detecting means for detecting a pupil area from the face areas extracted by the face area extracting means, and a face extracted by the face area extracting means. A mouth detecting means for detecting a mouth area from the area; a pupil state determining means for determining a state of the pupil area detected by the pupil detecting means; a mouth state for determining a state of the mouth area detected by the mouth detecting means Determining means, a face state determining means for determining the state of the extracted face area based on the determination made by the pupil state determining means and the mouth state determining means, and based on the determination result determined by the face state determining means The best face Image selecting means for automatically selecting an image including a face image in a state from a plurality of images stored by the image storing means, detecting a size of the face image in the image selected by the image selecting means, and selecting an arbitrary size A face image processing apparatus comprising: a face area correction unit that scales up and down.

9. A face image processing method comprising: an image input step of continuously inputting images of a person including a face image; and an image selection step of determining the state of the person's face from the plurality of consecutive images input in the image input step and selecting an image including an optimal face image.

10. A face image processing method comprising: an image input step of continuously inputting images of a plurality of persons including face images; and an image selection step of determining the states of the faces of the plurality of persons from the plurality of consecutive images input in the image input step, and selecting and outputting an image in which the states of the faces of the plurality of persons are optimal.

11. A face image processing method comprising: an image input step of continuously inputting images of a plurality of persons including face images; an image selection step of determining the states of the faces of the plurality of persons from the plurality of consecutive images input in the image input step, and selecting, for each of the face images of all of the plurality of persons, an image in which the face is in the optimal state; and a face image synthesis step of compositing the optimal face images of the plurality of persons selected in the image selection step and outputting them as a single image.

12. A face image processing method comprising: an image input step of continuously inputting images of a plurality of persons including face images; and an attribute-based head count determination step of determining, from the plurality of consecutive images input in the image input step, the head count per attribute of the plurality of persons in the imaging area by comparing the states of the faces of the plurality of persons with the average face values for each attribute such as gender, nationality, and adult or child.

13. An image inputting step of continuously inputting an image of a person including a face image, and a pupil area detecting a pupil area of the face image of the person from a plurality of continuous images input in the image inputting step. A detection step, a pupil area fluctuation state detection step of detecting an image of a continuous pupil area for a predetermined time of the pupil area detected in the pupil area detection step and outputting a fluctuation state thereof, and the pupil area fluctuation state detection step Based on the fluctuation state output in step (1), the image is compared with a previously stored pupil dictionary to select and output an image including a face image having a pupil region closest to the state desired by the photographer. A face image processing method, comprising: an image selecting step.

14. A face image processing method comprising: an image input step of continuously inputting images of a person including a face image; a mouth region candidate determination step of detecting the positions of the pupils and nostrils of the person's face from the plurality of consecutive images input in the image input step and determining a mouth region candidate based on those positions; and a mouth region detection step of taking as a reference image an image binarized with a threshold at which only the dark regions of the mouth region candidate determined in the mouth region candidate determination step are extracted, binarizing while changing the threshold in the direction in which the number of dark pixels gradually increases, obtaining a difference image from the reference image, labeling the obtained difference image, and detecting a label of a predetermined size as the mouth region.

15. A face image processing method comprising: an image input step of continuously inputting images of a person including a face image; a face region extraction step of obtaining, from the plurality of consecutive images input in the image input step, an appropriate threshold at which the pupils and nostrils can be detected within a central face region containing the pupils, mouth, and nostrils, binarizing the entire face region with that threshold, and performing labeling to extract a single connected face region enclosed by the hair; a face region measurement step of determining whether the ears are hidden by the hair using the ratio of distances among the left and right ends of the face region extracted in the face region extraction step and the center coordinates of both pupils, and obtaining the size of the face region excluding the ears based on the result; and a face region correction step of scaling the image of the face region to an arbitrary size based on the size of the face region obtained in the face region measurement step.

16. An image inputting step of continuously inputting an image of a person including a face image, an image storing step of storing a plurality of the images input in the image inputting step, and storing by the image storing step A face area extracting step of extracting a face area in the image obtained, a pupil detecting step of detecting a pupil area from the face areas extracted in the face area extracting step, and a face extracted in the face area extracting step. A mouth detection step of detecting a mouth area from the area, a pupil state determination step of determining a state of the pupil area detected in the pupil detection step, and a mouth state of determining a state of the mouth area detected in the mouth detection step A determination step, a face state determination step of determining a state of the extracted face region based on the determination made in the pupil state determination step and the mouth state determination step, and a determination result determined in the face state determination step. The best An image selection step of automatically selecting an image including the face image in the state from the plurality of images stored in the image storage step; detecting a size of the face image in the image selected in the image selection step; A face image processing method, comprising: