JP4126721B2 - Face area extraction method and apparatus

Info

Publication number: JP4126721B2
Application number: JP2002355017A
Authority: JP
Inventors: 学兵藤
Original assignee: Fujifilm Corp
Current assignee: Fujifilm Corp
Priority date: 2002-12-06
Filing date: 2002-12-06
Publication date: 2008-07-30
Anticipated expiration: 2022-12-06
Also published as: JP2004185555A

Description

【０００１】
【発明の属する技術分野】
本発明は顔領域抽出方法及び装置に係り、特にデジタルカメラ等により取得したカラー画像中に存在する人物の顔に相当する領域を抽出するための方法及び装置に関する。
【０００２】
【従来の技術】
人物写真を鑑賞するときに最も注目される部位は人物の顔である。人物の顔が適正な明るさ及び色で再現されるように、画像内の人物の顔に相当する領域を自動的に検出する技術が提案されている（特許文献１）。特許文献１に開示された方法によれば、原画像から肌色領域を抽出するとともに、画像中のエッジを検出し、エッジで囲まれる肌色領域を顔領域として抽出している。
【０００３】
【特許文献１】
特開平９−１０１５７９号公報
【０００４】
【発明が解決しようとする課題】
しかしながら、特許文献１に開示された方法では、撮影時の光の加減によって、顔のエッジが閉曲線にならないケースも存在し、顔領域を誤検出することがある。また、顔抽出の処理において問題になるのは、肌色と類似した色相の物体（例えば、砂、地面、木、レンガなど）である。更に、タングステン光源下で撮影を行った場合には、オートホワイトバランス（ＡＷＢ）のロバスト性から完全にはホワイトバランスがとれず、光源色が残る（完全に補正せずに、電球特有の光源の雰囲気を残すような設定となっている）。したがって、タングステン光源下で白い物体を撮影すると、肌色に類似の赤黄色となり、顔領域検出の障害となる。
【０００５】
本発明はこのような事情に鑑みてなされたもので、上述のような撮影状況下においても顔と類似色相の物体を排除して、正しく顔領域を抽出することができる顔領域抽出方法及び装置を提供することを目的とする。
【０００６】
【課題を解決するための手段】
前記目的を達成するために本発明に係る顔領域抽出方法は、画像内から人物の顔に相当する領域を抽出する方法であって、撮影時の焦点距離情報を取得する情報取得工程と、前記情報取得工程で取得した焦点距離情報に基づき画像内における顔領域として想定される最大値を求める最大値予測工程と、画像データを解析して当該画像内から肌色の色相を有する領域を検出する肌色領域検出工程であって、肌色の色相を検出する肌色検出工程と、前記肌色検出工程で検出される肌色部分を更に彩度によって分割する彩度分割工程と、を含む肌色領域検出工程と、前記肌色領域検出工程で検出された肌色領域のうち前記彩度分割工程で分割された彩度別の肌色領域について、前記最大値予測工程で見積られた最大値から設定される判定基準値と比較し、該判定基準値よりも大きい領域について顔領域である可能性が低いものとして取り扱う処理を行う処理工程と、を含むことを特徴としている。
【０００７】
本発明によれば、撮影光学系の焦点距離を示す情報を取得し、その焦点距離で実際に撮影して得られる人物の顔の最大サイズを見積る。その一方、画像データを解析して画像内の肌色領域を検出する。検出された肌色領域は顔領域の候補となり得るが、本発明では焦点距離情報を基に見積られた最大値から定められた判定基準値よりも大きな肌色領域については顔領域である可能性が低いものとして取り扱う。例えば、判定基準値よりも大きな領域については顔領域でないとして顔領域の候補から排除する態様、或いは、顔領域である可能性の低い領域について演算上の重み付け係数を変更するなどの態様がある。
【０００８】
このように、撮影時の焦点距離情報から顔領域として妥当な大きさを見積り、極端に大きな領域部分については顔領域である確度が低いと判断するようにしたので、より高い確度で顔領域を判定することが可能になる。また、画像内で肌色の色相を有する領域（肌色領域）を更に彩度によって細かく領域分けして、その形状や大きさを認識することにより、顔領域の判定が容易になる。
【０００９】
なお、焦点距離情報は、撮影に使用するカメラから取得してもよいし、画像データに付加された付属情報（タグ情報）などから読み込んでもよい。
【００１０】
本発明の一態様によれば、前記処理工程は、前記肌色領域検出工程で検出された肌色領域のうち、前記最大値予測工程で見積られた最大値から設定される判定基準値よりも大きい領域を顔領域の候補から排除し、前記判定基準値よりも小さい領域の中から顔領域を決定する顔領域決定工程と、を含むことを特徴としている。
【００１１】
かかる態様においては、判定基準値よりも大きな肌色領域については顔領域ではないものとして顔領域の候補から排除し、判定基準値よりも小さな肌色領域の中から顔領域を決定している。このようなアルゴリズム構成にしたことにより、タングステン光源下の背景や地面など肌色の色相に類似する対象の影響を排除でき、より確度ある顔領域の抽出が可能となる。
【００１４】
また、判定基準値よりも小さい領域とされた顔領域候補の中から真の顔領域を特定するには、その領域の形状に基づいて顔領域を絞り込むことが好ましい。例えば、人物の顔はおよそ円形或いは楕円形に近いものと考えられ、顔領域か否かを判別するための縦横比を定めておくことができる。顔領域の各候補について規定の縦横比から極端に外れるもの（極端に細長いものなど）は、顔でないものとして排除され、規定の縦横比に近いものについて顔であると判断される。
【００１５】
本発明の顔領域抽出方法において、更に、被写体の距離情報を取得する距離情報取得工程と、前記距離情報取得工程で取得した距離情報に基づき画像内における顔領域として想定される大きさを求める顔サイズ予測工程と、付加する態様もある。
【００１６】
焦点距離の情報に加えて被写体距離の情報が得られると、実際に撮影される人物の顔の大きさをより正確に見積ることができるため、その予測値に基づいて顔領域の候補を絞り込むことができ、顔抽出の精度を上げることができる。
【００１７】
上記方法発明を具現化する装置を提供するため、本発明にかかる顔領域抽出装置は、画像内から人物の顔に相当する領域を抽出する装置であって、撮影時の焦点距離情報を取得する情報取得手段と、前記情報取得手段を介して取得した焦点距離情報に基づき画像内における顔領域として想定される最大値を求める最大値予測手段と、画像データを解析して当該画像内から肌色の色相を有する領域を検出する肌色領域検出手段であって、肌色の色相を検出する肌色検出手段と、前記肌色検出工程で検出される肌色部分を更に彩度によって分割する彩度分割手段と、を含む肌色領域検出手段と、前記肌色領域検出手段で検出された肌色領域のうち前記彩度分割手段で分割された彩度別の肌色領域について、前記最大値予測手段で見積られた最大値から設定される判定基準値と比較し、該判定基準値よりも大きい領域について顔領域の候補から削除し、前記判定基準値よりも小さい領域の中から顔領域を決定する顔領域決定手段と、を備えたことを特徴とする。
【００１８】
本発明の顔領域抽出装置は、デジタルカメラやビデオカメラなどの電子撮影装置（電子カメラ）の信号処理部に組み込むことが可能であるとともに、電子カメラで記録した画像データを再生表示又はプリント出力する画像処理装置などに組み込むことが可能である。
【００１９】
また、本発明の顔領域抽出装置は、コンピュータによって実現することが可能であり、上述した顔領域抽出方法の各工程をコンピュータによって実現させるためのプログラムをＣＤ−ＲＯＭや磁気ディスクその他の記録媒体に記録し、記録媒体を通じて当該プログラムを第三者に提供したり、インターネットなどの通信回線を通じて当該プログラムのダウンロードサービスを提供することも可能である。
【００２０】
【発明の実施の形態】
以下添付図面に従って本発明に係る顔領域抽出方法及び装置の好ましい実施の形態について詳説する。
【００２１】
図１は本発明の実施形態に係る電子カメラの構成を示すブロック図である。このカメラ１０は、被写体の光学像をデジタル画像データに変換して記録メディア１２に記録するデジタルカメラであり、撮影により得られた画像信号を処理する信号処理手段の一部に本発明の顔領域抽出装置が用いられている。
【００２２】
カメラ１０全体の動作は、カメラ内蔵の中央処理装置（ＣＰＵ）１４によって統括制御される。ＣＰＵ１４は、所定のプログラムに従って本カメラシステムを制御する制御手段として機能するとともに、自動露出（ＡＥ）演算、自動焦点調節（ＡＦ）演算、及びオートホワイトバランス（ＡＷＢ）制御、顔領域抽出演算など各種演算を実施する演算手段として機能する。
【００２３】
ＣＰＵ１４はバス１６を介してＲＯＭ２０及びメモリ（ＲＡＭ）２２と接続されている。ＲＯＭ２０にはＣＰＵ１４が実行するプログラム及び制御に必要な各種データ等が格納されている。メモリ２２はプログラムの展開領域及びＣＰＵ１４の演算作業用領域として利用されるとともに、画像データの一時記憶領域として利用される。
【００２４】
また、ＣＰＵ１４にはＥＥＰＲＯＭ２４が接続されている。ＥＥＰＲＯＭ２４は、顔領域抽出処理に必要なテーブルデータ（肌色データや顔サイズの最大値データなどテーブル）、ＡＥ、ＡＦ及びＡＷＢ等の制御に必要なデータ或いはユーザが設定したカスタマイズ情報などを記憶している不揮発性の記憶手段であり、電源ＯＦＦ時においても記憶内容が保持される。ＣＰＵ１４は必要に応じてＥＥＰＲＯＭ２４のデータを参照して演算等を行う。なお、ＲＯＭ２０は書換不能なものであってもよいし、ＥＥＰＲＯＭのように書換可能なものでもよい。
【００２５】
カメラ１０にはユーザが各種の指令を入力するための操作部３０が設けられている。操作部３０は、マクロボタン３１、シャッターボタン３２、ズームスイッチ３３など各種操作部を含む。
【００２６】
マクロボタン３１は、近距離撮影に適したマクロモードの設定（ＯＮ）／解除（ＯＦＦ）を行う操作手段である。マクロモードは、被写界深度を比較的浅くして背景を美しくぼかしたクローズアップ写真を撮影できる。マクロボタン３１の押下によってカメラ１０がマクロモードに設定されると、近距離撮影に適したフォーカス制御が行われ、被写体距離が約２０cm〜８０cmの範囲で撮影が可能となる。
【００２７】
シャッターボタン３２は、撮影開始の指示を入力する操作手段であり、半押し時にＯＮするＳ1 スイッチと、全押し時にＯＮするＳ2 スイッチとを有する二段ストローク式のスイッチで構成されている。Ｓ1 オンにより、ＡＥ及びＡＦ処理が行われ、Ｓ2 オンによって記録用の露光が行われる。ズームスイッチ３３は、撮影倍率や再生倍率を変更するための操作手段である。
【００２８】
また、図示しないが、操作部３０には、撮影モードと再生モードとを切り換えるためのモード選択手段、液晶モニタ４０にメニュー画面を表示させるメニューボタン、メニュー画面から所望の項目を選択する十字ボタン（カーソル移動操作手段）、選択項目の確定や処理の実行を指令するＯＫボタン、選択項目など所望の対象の消去や指示内容の取消し、或いは１つ前の操作状態に戻らせる指令を入力するキャンセルボタンなどの操作手段も含まれる。なお、操作部３０の中には、プッシュ式のスイッチ部材、ダイヤル部材、レバースイッチなどの構成によるものに限らず、メニュー画面から所望の項目を選択するようなユーザインターフェースによって実現されるものも含まれている。
【００２９】
操作部３０からの信号はＣＰＵ１４に入力される。ＣＰＵ１４は操作部３０からの入力信号に基づいてカメラ１０の各回路を制御し、例えば、レンズ駆動制御、撮影動作制御、画像処理制御、画像データの記録／再生制御、液晶モニタ４０の表示制御などを行う。
【００３０】
液晶モニタ４０は、撮影時に画角確認用の電子ファインダーとして使用できるとともに、記録済み画像を再生表示する手段として利用される。また、液晶モニタ４０は、ユーザインターフェース用表示画面としても利用され、必要に応じてメニュー情報や選択項目、設定内容などの情報が表示される。なお、液晶ディスプレイに代えて、有機ＥＬなど他の方式の表示装置（表示手段）を用いることも可能である。
【００３１】
次に、カメラ１０の撮影機能について説明する。
【００３２】
カメラ１０は撮影光学系としての撮影レンズ４２とＣＣＤ固体撮像素子（以下、ＣＣＤという。）４４とを備えている。なお、ＣＣＤ４４に代えて、ＭＯＳ型固体撮像素子など他の方式の撮像素子を用いることも可能である。撮影レンズ４２は、電動式のズームレンズで構成されており、詳細な光学構成については図示しないが、主として倍率変更（焦点距離可変）作用をもたらす変倍レンズ群及び補正レンズ群と、フォーカス調整に寄与するフォーカスレンズとを含む。
【００３３】
撮影者によってズームスイッチ３３が操作されると、そのスイッチ操作に応じてＣＰＵ１４からズーム駆動部４６に対して制御信号が出力される。ズーム駆動部４６は、動力源となるモータ（ズームモータ）とその駆動回路とを含む電動駆動手段である。ズーム駆動部４６のモータ駆動回路は、ＣＰＵ１４からの制御信号に基づいてレンズ駆動用の信号を生成し、ズームモータに与える。こうして、モータ駆動回路から出力されるモータ駆動電圧によってズームモータが作動し、撮影レンズ４２内の変倍レンズ群及び補正レンズ群が光軸に沿って前後移動することにより、撮影レンズ４２の焦点距離（光学ズーム倍率）が変更される。
【００３４】
本例では、ワイド（広角）端からテレ（望遠）端までのズーム動作範囲内において撮影レンズ４２の焦点距離を１０段階で可変できるものとする。撮影者は撮影目的に応じて所望の焦点距離を選択して撮影を行うことができる。
【００３５】
撮影レンズ４２のズーム位置（焦点距離に相当）は、ズーム位置検出センサ４８によって検出され、その検出信号はＣＰＵ１４に通知される。ＣＰＵ１４はズーム位置検出センサ４８からの信号によって現在のズーム位置（すなわち、焦点距離）を把握できる。ズーム位置検出センサ４８は、ズームモータ等の回転によりパルスを発生する回路であってもよいし、レンズ鏡胴の外周に位置検出エンコード板を配置した構成などであってもよく、本発明の実施に際しては、特に限定されるものではない。
【００３６】
撮影レンズ４２を通過した光は、図示せぬ絞り機構を介して光量が調節された後、ＣＣＤ４４に入射する。ＣＣＤ４４の受光面には多数のフォトセンサ（受光素子）が平面的に配列され、各フォトセンサに対応して赤（Ｒ）、緑（Ｇ）、青（Ｂ）の原色カラーフィルタが所定の配列構造（ベイヤー、Ｇストライプなど）で配置されている。
【００３７】
ＣＣＤ４４の受光面に結像された被写体像は、各フォトセンサによって入射光量に応じた量の信号電荷に変換される。ＣＣＤ４４は、シャッターゲートパルスのタイミングによって各フォトセンサの電荷蓄積時間（シャッタースピード）を制御する電子シャッター機能を有している。
【００３８】
ＣＣＤ４４の各フォトセンサに蓄積された信号電荷は、ＣＣＤドライバ５０から与えられるパルスに基づいて信号電荷に応じた電圧信号（画像信号）として順次読み出される。ＣＣＤ４４から出力された画像信号は、アナログ処理部５２に送られる。アナログ処理部５２は、ＣＤＳ（相関二重サンプリング）回路及びゲイン調整回路を含む先行処理部であり、このアナログ処理部５２において、サンプリング処理並びにＲ，Ｇ，Ｂの各色信号に色分離処理され、各色信号の信号レベルの調整（プリホワイトバランス処理）が行われる。
【００３９】
アナログ処理部５２から出力された画像信号はＡ／Ｄ変換器５４によってデジタル信号に変換された後、信号処理部５６を介してメモリ２２に格納される。このときメモリ２２に記憶される画像データは、ＣＣＤ４４から出力された画像信号のＡ／Ｄ変換出力をそのまま（未加工のまま）記録したものであり、ガンマ変換や同時化などの信号処理が行われていない画像データである。（以下、CCDRAWデータという。）ただし、「未加工のデータ」といっても、一切の信号処理を排除するものではなく、例えば、撮像素子の欠陥画素（キズ）のデータを補間する欠陥画素補正処理を行って得られた画像データなどについては汎用フォーマットに展開されていないという点でCCDRAWデータの概念に含まれるものとする。
【００４０】
タイミングジェネレータ（ＴＧ）５８は、ＣＰＵ１４の指令に従ってＣＣＤドライバ５０、アナログ処理部５２及びＡ／Ｄ変換器５４に対してタイミング信号を与えており、このタイミング信号によって各回路の同期がとられている。
【００４１】
信号処理部５６は、メモリ２２の読み書きを制御するメモリコントローラを兼ねたデジタル信号処理ブロックである。信号処理部５６は、ＡＥ／ＡＦ／ＡＷＢ処理を行うオート演算部と、同時化回路（単板ＣＣＤのカラーフィルタ配列に伴う色信号の空間的なズレを補間して各点の色を計算する処理回路）、ホワイトバランス回路、ガンマ変換回路、輝度・色差信号生成回路、輪郭補正回路、コントラスト補正回路等を含む画像処理手段であり、ＣＰＵ１４からのコマンドに従ってメモリ２２を活用しながら画像信号を処理する。
【００４２】
メモリ２２に格納されたCCDRAWデータは、バス１６を介して信号処理部５６に送られる。信号処理部５６に入力された画像データは、ホワイトバランス調整処理、ガンマ変換処理、輝度信号（Ｙ信号）及び色差信号（Ｃr,Ｃb 信号）への変換処理（ＹＣ処理）など、所定の信号処理が施された後、メモリ２２に格納される。
【００４３】
撮影画像をモニタ出力する場合、メモリ２２から画像データが読み出され、表示回路６０に転送される。表示回路６０に送られた画像データは表示用の所定方式の信号（例えば、ＮＴＳＣ方式のカラー複合映像信号）に変換された後、液晶モニタ４０に出力される。ＣＣＤ４４から出力される画像信号によってメモリ２２内の画像データが定期的に書き換えられ、その画像データから生成される映像信号が液晶モニタ４０に供給されることにより、撮像中の映像（スルー画）がリアルタイムに液晶モニタ４０に表示される。撮影者は液晶モニタ４０に表示される映像（いわゆるスルームービー）によって画角（構図）を確認できる。
【００４４】
撮影者が画角を決めてシャッターボタン３２を押下すると、ＣＰＵ１４はこれを検知し、シャッターボタン３２の半押し（Ｓ1 ＯＮ）に応動してＡＥ処理及びＡＦ処理を行い、シャッターボタン３２の全押し（Ｓ２＝ＯＮ）に応動して記録用の画像を取り込むためのＣＣＤ露光及び読み出し制御を開始する。
【００４５】
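For AF control in the camera 10, contrast AF is used, in which the focus lens (the movable lens in the optical system of the photographing lens 42 that contributes to focus adjustment) is moved so that, for example, the high-frequency component of the G signal of the video signal is maximized. Specifically, the AF calculation unit consists of a high-pass filter that passes only the high-frequency component of the G signal, an absolute value processing unit, an AF area extraction unit that cuts out the signal within a focus target area preset in the frame (for example, the center of the frame), and an integration unit that integrates the absolute value data within the AF area.
【００４６】
The integrated value obtained by the AF calculation unit is reported to the CPU 14. While controlling the focus drive unit 62, which includes an AF motor, to move the focus lens, the CPU 14 calculates a focus evaluation value (AF evaluation value) at multiple AF detection points and determines the lens position at which the evaluation value peaks as the in-focus position. It then controls the focus drive unit 62 to move the focus lens to the in-focus position thus determined.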
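The contrast AF procedure (high-pass filtering of the G signal, absolute value, integration over the AF area, then choosing the lens position with the largest evaluation value) can be sketched as follows. This is only a minimal illustration: the first difference between horizontally adjacent pixels stands in for the high-pass filter, and the names `af_evaluation` and `best_focus_position` are hypothetical, not taken from the patent.

```python
def af_evaluation(g_plane, area):
    """Integrate the absolute high-frequency content of the G plane inside the AF area.

    g_plane is a list of rows of G values; area is (top, left, bottom, right)
    with bottom and right exclusive. The difference between horizontal
    neighbours is used here as a crude high-pass filter (an assumption).
    """
    top, left, bottom, right = area
    total = 0
    for row in g_plane[top:bottom]:
        for x in range(left + 1, right):
            total += abs(row[x] - row[x - 1])
    return total


def best_focus_position(frames_by_position, area):
    """Return the lens position whose frame gives the largest AF evaluation value."""
    return max(
        frames_by_position,
        key=lambda position: af_evaluation(frames_by_position[position], area),
    )
```

A real implementation would evaluate the frames while the lens is being driven, rather than from a stored dictionary of frames.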
【００４７】
また、ＡＥ制御に関連して、ＡＥ演算部は１画面を複数のエリア（例えば、８×８）に分割し、分割エリアごとにＲＧＢ信号を積算する回路を含み、その積算値をＣＰＵ１４に提供する。ＣＰＵ１４は、ＡＥ演算部から得た積算値に基づいて被写体の明るさ（被写体輝度）を検出し、撮影に適した露出値（撮影ＥＶ値）を算出する。求めた露出値と所定のプログラム線図に従い、絞り値とシャッタースピードが決定される。そして、アイリスモータを含む不図示の駆動部及びＣＣＤ４４の電子シャッターを制御して最適な露光量を得る。
【００４８】
シャッターボタン３２の全押し（Ｓ2 ＝ＯＮ）に応動して取り込まれた画像データは、信号処理部５６においてＹＣ処理その他の所定の信号処理を経た後、圧縮伸張回路６４において所定の圧縮フォーマット（例えば、JPEG方式) に従って圧縮される。圧縮された画像データは、メディアインターフェース部６６を介して記録メディア１２に記録される。圧縮形式はJPEGに限定されず、MPEGその他の方式を採用してもよい。
【００４９】
画像データを保存する手段は、スマートメディア（商標）、コンパクトフラッシュ（商標）などで代表される半導体メモリカード、磁気ディスク、光ディスク、光磁気ディスクなど、種々の媒体を用いることができる。また、リムーバブルメディアに限らず、カメラ１０に内蔵された記録媒体（内部メモリ）であってもよい。
【００５０】
モード選択手段によって再生モードが選択されると、記録メディア１２に記録されている最終の画像ファイル（最後に記録したファイル）が読み出される。記録メディア１２から読み出された画像ファイルのデータは、圧縮伸張回路６４によって伸張処理され、表示回路６０を介して液晶モニタ３８に出力される。
【００５１】
再生モードの一コマ再生時に十字ボタンを操作することにより、順方向又は逆方向にコマ送りすることができ、コマ送りされた次のファイルが記録メディア１２から読み出され、上記と同様にして画像が再生される。
【００５２】
図２は、本例のカメラ１０における顔抽出処理に関係する要部ブロック図である。図２中図１で説明した構成と共通する部分には同一の符号を付し、その説明は省略する。
【００５３】
図２において、上段の積算回路７０とホワイトバランス回路７２は、通常のオートホワイトバランス処理に使用している処理系である。また、その下段に示した、積算回路７４、ホワイトバランス回路７６及び同時化回路７８は、顔抽出アルゴリズムの前処理を実施するための処理部（以下、前処理系という。）である。
【００５４】
Ａ／Ｄ変換器５４にてデジタル信号に変換されたCCDRAWデータはメモリ２２に格納され、メモリ２２から積算回路７０に送られる。積算回路７０は、１画面内を複数のエリア（例えば、８×８の６４ブロック）に分割し、エリアごとにＲＧＢ信号の色別の平均積算値を算出する回路を含み、その算出結果は基準光源（デイライト）を想定したホワイトバランス回路７２に送られる。ホワイトバランス回路７２でゲイン調整された信号はＣＰＵ１４に提供される。
【００５５】
ＣＰＵ１４は、Ｒの積算値、Ｂの積算値、Ｇの積算値を得て、Ｒ／Ｇ、Ｂ／Ｇの比を求め、これらＲ／Ｇ、Ｂ／Ｇの値と、ＡＥ演算による撮影ＥＶ値の情報に基づいてシーン判別（光源種の判別）を行い、シーンに適した所定のホワイトバランス調整値（光源の雰囲気を残すような設定）に従って、信号処理部５６内のホワイトバランス回路８０及び前処理系のホワイトバランス回路７６のアンプゲインを制御し、各色チャンネルの信号に補正をかける。光源種の判別方法及び光源の雰囲気を残すホワイトバランス制御については、特開２０００−２２４６０８号公報等に開示された手法を用いることができる。なお、シーン判別においては、Ｒ／Ｇ、Ｂ／Ｇの値を利用するのに代えて、Ｒ−Ｙ、Ｂ−Ｙなど色温度情報を用いてもよい。
【００５６】
その後、シャッターボタン３２の全押し（Ｓ2 ＯＮ）に応動して記録用の画像を取り込む。Ｓ2 ＯＮに応じて取得されたCCDRAWデータは一旦メモリ２２に格納され、その後、信号処理部５６及び顔抽出用の前処理系の積算回路７４に送られる。
【００５７】
前処理系の積算回路７４は、１画面を例えば、１３０×１９０のエリアに分割し、エリアごとに積算値を算出する。積算回路７４の算出結果は、ホワイトバランス回路７６に送られ、ここでＡＷＢを反映させたホワイトバランス処理が行われる。ＡＷＢを効かせたデータは、同時化回路７８に送られ、ここでＲ，Ｇ，Ｂについてそれぞれ同画素数のデータ（３面データ）が生成される。
【００５８】
ＣＰＵ１４は、同時化回路７８で生成されたＲＧＢの３面データに基づいて、肌色検出、彩度分割、顔候補抽出、並びに顔候補の領域の形状に基づく顔領域の特定の処理を実行する。顔抽出のアルゴリズムについて詳細は後述する。
【００５９】
その一方、メモリ２２から信号処理部５６に送られたCCDRAWデータは、ホワイトバランス回路８０によりＡＷＢを反映させた処理が行われた後、ガンマ変換回路８２に送られる。ガンマ変換回路８２は、ホワイトバランス調整されたＲＧＢ信号が所望のガンマ特性となるように入出力特性を変更し、輝度／色差信号生成回路８４に出力する。
【００６０】
輝度／色差信号生成回路８４は、ガンマ補正されたＲ、Ｇ、Ｂ信号から輝度信号Ｙとクロマ信号Ｃｒ、Ｃｂとを作成する。これらの輝度信号Ｙとクロマ信号Ｃｒ、Ｃｂ（ＹＣ信号）は、圧縮伸張回路６４によってJPEGその他の所定のフォーマットで圧縮され、記録メディア１２に記録される。
【００６１】
次に、本実施形態に係るカメラ１０に搭載されている顔抽出機能について説明する。
【００６２】
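FIG. 3 shows the relationship between a subject and its image in the photographing optical system. As shown in the figure, the ratio of the shooting distance D to the field of view L (the full extent that can be captured at this focal length) equals the ratio of the focal length DF of the photographing lens 42 to the CCD size H.
【００６３】
[Equation 1]
shooting distance D : field of view L = focal length DF : CCD size H … (1)
That is, the following relationship holds between the actual size A of a person's face contained within the field of view L and the image size a on the CCD light receiving surface.
【００６４】
[Equation 2]
shooting distance D : actual size A = focal length DF : image size a … (2)
The shooting distance D is taken to be the shortest distance at which the camera 10 can shoot (the closest focusing distance). With macro mode OFF, the shortest distance to the camera 10 is about 60 cm. With macro mode ON the shortest distance is about 20 cm, but because a person is unlikely to be the subject in macro mode, face extraction is not performed in that case.
【００６５】
Although the actual size A of a human face varies somewhat, faces generally fall within a certain size range, so a prescribed value can be set for it.
【００６６】
Therefore, the image size on the CCD 44 surface (the number of pixels the face occupies when imaged on the CCD 44) can be derived from equation (2). In this way, the face size at each focal length can be obtained.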
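As a rough illustration of equation (2), the on-sensor face size follows from a = A × DF / D and can be converted to a pixel count by dividing by the pixel pitch. The sketch below assumes a face size of 240 mm and a pixel pitch of 0.005 mm, and the function name `face_image_size_px` is hypothetical. Only the shortest distance of about 60 cm comes from the text.

```python
def face_image_size_px(focal_length_mm, distance_mm, face_size_mm, pixel_pitch_mm):
    """Estimate how many pixels a face spans on the sensor.

    Equation (2) states D : A = DF : a, so a = A * DF / D is the size of the
    face image on the sensor. Dividing by the pixel pitch gives pixels.
    """
    if distance_mm <= 0 or pixel_pitch_mm <= 0:
        raise ValueError("distance and pixel pitch must be positive")
    image_size_mm = face_size_mm * focal_length_mm / distance_mm
    return image_size_mm / pixel_pitch_mm


# Illustrative values only; the face size and pixel pitch are assumptions.
FACE_SIZE_MM = 240.0          # hypothetical upper bound on the actual face size A
PIXEL_PITCH_MM = 0.005        # hypothetical sensor pixel pitch
SHORTEST_DISTANCE_MM = 600.0  # about 60 cm with macro mode OFF, as stated in the text

max_face_px = face_image_size_px(10.0, SHORTEST_DISTANCE_MM, FACE_SIZE_MM, PIXEL_PITCH_MM)
```

Evaluating the function once per selectable focal length at the shortest shooting distance would yield the kind of maximum face size table that the description stores in the EEPROM 24.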
【００６７】
本例のカメラ１０は、ワイド端からテレ端までの間で１０段階の焦点距離を選択できる構成になっているため、各焦点距離についてＣＣＤ４４面上での顔サイズの最大値がテーブルデータとしてカメラ１０内のＥＥＰＲＯＭ２４（データ格納手段）に記憶されている。
【００６８】
図４は、顔抽出におけるシーケンスを示すフローチャートである。
【００６９】
まず、シャッターボタン３２のＳ1 ＝ＯＮ及びＳ2 ＝ＯＮを経て、CCDRAWデータをメモリ２２に記録する（ステップＳ１１０）。その後、ＣＰＵ１４は、撮影時の焦点距離情報及びマクロのＯＮ／ＯＦＦ情報を入手する（ステップＳ１１２）。マクロモードがＯＮのときには、顔抽出処理を実施しないものとする。
【００７０】
その一方、マクロモードがＯＦＦのときには、図２で説明した前処理系（７４、７６、７８）による処理を行い、記録したCCDRAWデータから積算処理、ホワイトバランス処理及び同時化処理を行う（図４のステップＳ１１４）。
【００７１】
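Next, the CPU 14 obtains the skin color table from the EEPROM 24 (step S116). The skin color table is data defining the range of hues recognized as "skin color" in a given color space. Based on the RGB three-plane data (linear data) obtained from the synchronization circuit 78 of the preprocessing system, the CPU 14 detects the hues that fall within the skin color table (step S118).
【００７２】
FIG. 5 illustrates the range of hues to be extracted as skin color (the skin color extraction area). The color space shown is a linear system before gamma is applied, with R/G on the horizontal axis and B/G on the vertical axis.
【００７３】
In FIG. 5, the range enclosed by the rectangular frame 88 is set as the skin color extraction area. That is, the range in which R is slightly higher than G and B is slightly lower than G is defined as "skin color", and anything falling within this range is judged to be skin color.
【００７４】
The skin color extraction area 88 is further divided into multiple areas according to saturation. In FIG. 5, "saturation" is expressed as the distance from the origin O (1, 1), and saturation increases with distance from the origin O. In the example shown, the skin color extraction area 88 is divided into six regions by concentric circular boundaries (drawn as broken lines) centered on the origin.
【００７５】
After detecting the skin color areas from the RGB three-plane data obtained from the synchronization circuit 78 of the preprocessing system, the CPU 14 further divides the detected skin color portions into regions on the basis of saturation (step S120 in FIG. 4).
【００７６】
FIG. 6 shows an example in which the detected skin color areas are further partitioned by saturation. Different objects in an image have different saturation, and the skin color of a human face generally tends to have higher saturation than white objects under a tungsten light source or wood such as a desk. Dividing the detected skin color areas finely by saturation and determining the shape of each region therefore makes it easier to tell face areas from non-face areas.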
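The skin color test in the (R/G, B/G) plane and the division into saturation bands by distance from the origin O (1, 1) can be sketched as below. The rectangle bounds and the five band radii are hypothetical placeholders, since the text does not give the numeric contents of the skin color table. Only the structure (a rectangle with R/G above 1 and B/G below 1, split into six concentric bands) follows the description.

```python
import bisect
import math

# Hypothetical bounds of the rectangular skin color extraction area.
RG_MIN, RG_MAX = 1.05, 1.60   # R slightly higher than G
BG_MIN, BG_MAX = 0.50, 0.95   # B slightly lower than G
# Hypothetical radii around the origin O = (1, 1); five edges give six bands.
BAND_EDGES = [0.15, 0.25, 0.35, 0.45, 0.55]


def saturation_band(r, g, b):
    """Return None for non-skin colors, else the saturation band index 0..5.

    r, g, b are linear (pre-gamma) values for one block of the image.
    """
    if g <= 0:
        return None
    rg, bg = r / g, b / g
    if not (RG_MIN <= rg <= RG_MAX and BG_MIN <= bg <= BG_MAX):
        return None
    saturation = math.hypot(rg - 1.0, bg - 1.0)  # distance from the origin O
    return bisect.bisect_right(BAND_EDGES, saturation)
```

Blocks that share a band index and are adjacent in the image would then be grouped into one region, so that a face and a paler background of similar hue end up in different regions.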
【００７７】
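After the saturation-based region division in step S120 of FIG. 4, the process proceeds to step S122. In step S122, the face area maximum value table is obtained from the EEPROM 24 in accordance with the focal length information obtained in step S112, the predicted maximum face area size at that focal length is determined, and this maximum value is set as the determination reference value for face area determination. Naturally, to allow for the possibility that the picture is taken from closer than the shortest shooting distance of the camera 10, it is preferable either to add a predetermined margin to the predicted maximum when setting the determination reference value, or to build a margin into the determination itself.
【００７８】
The skin color areas divided by saturation are then compared with the predicted maximum value. Areas far larger than the maximum are judged not to be faces and are excluded from the face area candidates, and the remaining areas are extracted as face area candidates (step S124). This removes areas that are too large to actually be a face.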
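Steps S122 and S124 amount to a table lookup followed by a size filter. The sketch below assumes a table with one predicted maximum face size per zoom step and a multiplicative margin. The table values, the 1.2 margin, and the name `face_candidates` are all hypothetical.

```python
# Hypothetical table: predicted maximum face size in pixels for each of the 10 zoom steps.
MAX_FACE_PX_BY_ZOOM_STEP = [120, 140, 160, 185, 210, 240, 270, 300, 330, 360]
# Hypothetical margin for subjects closer than the shortest shooting distance.
MARGIN = 1.2


def face_candidates(region_sizes_px, zoom_step):
    """Keep only the regions no larger than the determination reference value.

    region_sizes_px holds the size (in pixels) of each saturation-divided skin
    color region; zoom_step selects the entry of the maximum value table.
    """
    reference_value = MAX_FACE_PX_BY_ZOOM_STEP[zoom_step] * MARGIN
    return [size for size in region_sizes_px if size <= reference_value]
```

For example, at the widest zoom step a 600-pixel region such as a wall lit by a tungsten lamp is dropped, while a 90-pixel region survives as a candidate.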
【００７９】
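From the face area candidates narrowed down in step S124, shape detection is performed on each area and the face area is identified from its shape (step S126). That is, a prescribed aspect ratio derived from a model shape plausible for a face (an ellipse or a circle) is compared with the shape of each face candidate, and candidates whose detected aspect ratio deviates greatly from the prescribed value are judged to be non-face areas. This shape determination narrows the candidates further, and the areas whose shape has the prescribed aspect ratio are determined to be "faces".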
【００８０】
こうして、顔領域が抽出され、その抽出結果は明るさ補正、ホワイトバランス補正、肌色をベストの色（目標値）に近づける色補正、赤目補正などに利用される。
【００８１】
上述した実施形態によれば、カメラ情報である焦点距離情報を利用して顔領域の大きさを見積る一方、画像内を肌色検出して得られた肌色エリアを彩度によって領域分割して各エリアの大きさを認識し、見積もった最大値と比較して極端に大きなエリアは顔領域の候補から排除する構成にしたので、顔領域の候補を絞り込むことができる。そして、残ったエリアについて形状認識を行い、最終的な顔を判定するようにしたので、高い確度で正しい顔領域を抽出できる。
【００８２】
例えば、図７に示したシーンは、電球９０の照明下で白い布９２を背景に人物９４を撮影したものであるが、背景の布９２や木の机９６など、肌色と類似の色相を有する物体を排除して、人物９４の顔を正確に抽出することができる。
【００８３】
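[Modification 1]
The embodiment above described an example in which the face area is fully identified within the captured image, but the invention may also be practiced without ultimately identifying the face area.
【００８４】
For example, when performing brightness correction that emphasizes a person's face, instead of narrowing down the face area candidates among the skin color areas extracted by skin color detection, or in combination with doing so, the weighting coefficients used in the brightness calculation may be varied according to the focal length information.
【００８５】
As shown in FIG. 8, when skin color detection finds multiple skin color areas in the frame, a weight wi (i = 1, 2, 3, …) reflecting how likely each area is to actually be a face is set on the basis of the focal length information, and the brightness Y is calculated according to equation (3).
【００８６】
[Equation 3]
Y = Σ(wi × Yi) / Σwi … (3)
Here, Yi denotes the brightness of each face area candidate.
【００８７】
By making the weight wi small (0 or a value close to 0) for areas that are unlikely to be a face, the result strongly reflects the information from the areas that are likely to be a face. Correction is then performed so that the brightness Y obtained in this way approaches a predetermined target value.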
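Equation (3) is a weighted mean of the per-area brightness values. A minimal sketch follows. How the weights are derived from the focal length information is not specified numerically in the text, so the weights are simply passed in, and the function name `weighted_brightness` is hypothetical.

```python
def weighted_brightness(areas):
    """Compute Y = sum(w_i * Y_i) / sum(w_i) as in equation (3).

    areas is a list of (weight, brightness) pairs, one per skin color area.
    A weight of 0 removes an area from the result entirely.
    """
    total_weight = sum(weight for weight, _ in areas)
    if total_weight == 0:
        raise ValueError("at least one area must have a non-zero weight")
    return sum(weight * brightness for weight, brightness in areas) / total_weight
```

With weights (1.0, 0.0, 1.0) and brightness values (120, 200, 100), the bright second area is ignored and the result is 110.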
【００８８】
〔変形例２〕
図１で説明した焦点距離情報を利用する態様に代えて、又はこれと併用して被写体の距離情報を利用する態様がある。
【００８９】
図９に本発明の他の実施形態に係る電子カメラのブロック図を示す。図９中図１と同一又は類似の部分には同一の符号を付し、その説明は省略する。
【００９０】
図９に示したカメラ１０は、被写体距離（撮影距離）を測定する手段としての測距センサ１０２を備えている。測距センサ１０２から得られる信号はＣＰＵ１４に入力され、ＣＰＵ１４は被写体の距離情報を取得する。
【００９１】
なお、被写体距離を検出する手段は、三角測量の原理を利用したアクティブ方式やパッシブ方式で代表される測距方式のＡＦ機構や位相法によるＡＦ機構など、周知のＡＦ機構を利用することが可能である。また、測距センサ１０２を省略したカメラ（図１の構成）においても、コントラストＡＦなどによりフォーカスレンズを合焦位置に移動させたときに、フォーカス位置検出手段からフォーカスレンズの位置情報（フォーカス位置情報）を取得し、この情報に基づいてカメラ１０と被写体の距離（被写体距離）を計算することも可能である。
【００９２】
こうして、被写体の距離情報を取得することにより、その距離で実際に撮影される顔の大きさを見積ることができる。顔サイズの算出に際しては、被写体距離に対応した顔の大きさを示すテーブルデータをＥＥＰＲＯＭ２４内に格納しておいてもよいし、演算式を利用して計算してよい。
【００９３】
被写体の距離情報に基づいて算出された顔サイズよりも極端に大きい肌色エリア又は極端に小さい肌色エリアについては顔候補から除外することにより、より正確に顔領域を抽出できる。
【００９４】
図１０に距離情報を利用する顔抽出のシーケンスを示す。同図中、図４のフローチャートと同一又は類似の工程には同一の符号を付してある。
【００９５】
図１０に示したフローチャートではステップＳ１１２において、測距センサ１０２の情報を読み込み、被写体の距離情報を入手する処理が追加されている。
【００９６】
焦点距離情報に加えて被写体距離情報を得ることにより、顔サイズを一層正確に予測できるため、その予測に基づいて顔領域として許容される妥当な数値範囲を設定することが可能となり、顔領域の抽出精度が向上する。
【００９７】
具体的には、図１０のステップＳ１２２において、焦点距離に基づいて最大値テーブルを入手するとともに、被写体距離に基づいて顔領域の最小値テーブルを入手し、これらのテーブルで規定される数値範囲を基に顔領域の候補を絞り込む（ステップＳ１２４）。或いは、ステップＳ１２２において、焦点距離と被写体距離とに基づいて顔領域の最大値と最小値とを考慮した判定基準値（又は判定基準範囲）を定め、この判定基準値にしたがって、顔領域の候補を絞り込む（ステップＳ１２４）などの態様がある。
【００９８】
図９及び図１０では、測距センサ１０２から被写体距離情報を取得したが、被写体の距離情報は、撮影に使用するカメラから取得する態様に限らず、画像データに付加された付属情報（タグ情報）などから読み込むことも可能である。
【００９９】
上述の実施形態では、光学ズーム機能を有するデジタルカメラを説明したが、単焦点レンズを用いるカメラについては、そのレンズの焦点距離の値を利用して顔サイズを判定すればよい。また、画像信号を電子的に処理して拡大画像を得る電子ズーム（デジタルズーム）機能を備えたカメラについても、画角全体の画像を撮影して、電子ズーム処理前の全体の画像データから顔領域を検出することにより、上述の実施形態と同様の手法を適用できる。
【０１００】
上述の実施形態では、デジタルカメラを例示したが、本発明の適用範囲はこれに限定されず、カメラ付き携帯電話機、カメラ付きＰＤＡ、カメラ付きモバイルパソコンなど、電子撮像機能を備えた他の情報機器についても本発明を適用できる。この場合、撮像部は携帯電話機等の本体から分離可能な着脱式（外付けタイプ）のものであってもよい。
【０１０１】
また、上述のような電子撮像機能付き機器で記録した画像データを再生表示する画像再生装置、或いはプリント出力するプリント装置などについても本発明を適用することが可能である。
【０１０２】
【発明の効果】
本発明によれば、撮影光学系の焦点距離情報から人物の顔の最大値を予測し、その最大値から設定される判定基準値よりも大きな肌色領域については顔領域である可能性が低いものとして取り扱う構成にしたので、より高い確度で顔領域を判定することが可能である。
【０１０３】
また、焦点距離の情報に加えて被写体距離の情報を取得することにより、実際に撮影される人物の顔の大きさをより正確に見積ることができ、顔抽出の精度を上げることができる。
【図面の簡単な説明】
【図１】本発明の実施形態に係る電子カメラの構成を示すブロック図
【図２】本例のカメラにおける顔抽出処理に関係する要部ブロック図
【図３】撮影光学系における被写体とその像の関係を示した図
【図４】顔抽出処理のシーケンスを示すフローチャート
【図５】肌色として抽出する色相の範囲（肌色抽出エリア）と彩度分割の例を示す図
【図６】肌色検出された領域を更に彩度によって区分けした例を示す図
【図７】撮影シーンの一例を示す図
【図８】画面内に複数の肌色エリアが存在する場合の明るさ演算の例を示す図
【図９】本発明の他の実施形態に係る電子カメラのブロック図
【図１０】被写体距離情報を利用した顔抽出処理のシーケンスを示すフローチャート
【符号の説明】
１０…カメラ、１４…ＣＰＵ、２４…ＥＥＰＲＯＭ、４２…撮影レンズ、４４…ＣＣＤ、４８…ズーム位置検出センサ、７４…積算回路、７６…ホワイトバランス回路、７８…同時化回路、８８…肌色抽出エリア、１０２…測距センサ
[0001]
[Technical Field of the Invention]
The present invention relates to a face area extraction method and apparatus, and more particularly to a method and apparatus for extracting an area corresponding to a human face existing in a color image acquired by a digital camera or the like.
[0002]
[Prior art]
The most noticeable part when appreciating a portrait is the face of the person. There has been proposed a technique for automatically detecting a region corresponding to a human face in an image so that the human face is reproduced with appropriate brightness and color (Patent Document 1). According to the method disclosed in Patent Document 1, a skin color area is extracted from an original image, an edge in the image is detected, and a skin color area surrounded by the edge is extracted as a face area.
[0003]
[Patent Document 1]
JP-A-9-101579
[0004]
[Problems to be solved by the invention]
However, with the method disclosed in Patent Document 1, the edge of the face sometimes fails to form a closed curve, depending on the lighting at the time of shooting, and the face area may then be detected incorrectly. A further problem in face extraction is posed by objects whose hue resembles skin color (for example, sand, ground, wood, and brick). In addition, when shooting under a tungsten light source, auto white balance (AWB) is tuned for robustness and does not fully correct the white balance, so some of the light source color remains (the setting deliberately stops short of complete correction in order to preserve the atmosphere of incandescent lighting). Consequently, a white object photographed under a tungsten light source takes on a reddish-yellow color similar to skin color, which interferes with face area detection.
[0005]
The present invention has been made in view of these circumstances, and its object is to provide a face area extraction method and apparatus that can correctly extract the face area even under the shooting conditions described above by excluding objects whose hue is similar to that of a face.
[0006]
[Means for Solving the Problems]
In order to achieve the above object, a face area extraction method according to the present invention is a method for extracting an area corresponding to a human face from an image, comprising: an information acquisition step of acquiring focal length information at the time of shooting; a maximum value prediction step of obtaining, based on the focal length information acquired in the information acquisition step, the maximum value assumed for a face area in the image; a skin color area detection step of analyzing the image data and detecting areas having a skin color hue in the image, the skin color area detection step including a skin color detection step of detecting the skin color hue and a saturation division step of further dividing the skin color portion detected in the skin color detection step according to saturation; and a processing step of comparing each saturation-specific skin color area, obtained in the saturation division step from the skin color areas detected in the skin color area detection step, with a determination reference value set from the maximum value estimated in the maximum value prediction step, and treating any area larger than the determination reference value as unlikely to be a face area.
[0007]
According to the present invention, information indicating the focal length of the photographing optical system is acquired, and the maximum size at which a human face would actually be captured at that focal length is estimated. Meanwhile, the image data is analyzed to detect the skin color areas in the image. A detected skin color area can be a face area candidate, but in the present invention a skin color area larger than the determination reference value set from the maximum value estimated from the focal length information is treated as unlikely to be a face area. For example, an area larger than the determination reference value may be excluded from the face area candidates as not being a face area, or the weighting coefficient used in calculations may be changed for an area that is unlikely to be a face area.
[0008]
In this way, a plausible size for the face area is estimated from the focal length information at the time of shooting, and extremely large areas are judged to have a low probability of being a face area, so the face area can be determined with higher accuracy. In addition, areas of the image having a skin color hue (skin color areas) are further divided into finer areas according to saturation and their shapes and sizes are recognized, which makes the face area easier to determine.
[0009]
The focal length information may be acquired from a camera used for shooting, or may be read from attached information (tag information) added to the image data.
[0010]
According to one aspect of the present invention, the processing step includes a face area determination step of excluding from the face area candidates, among the skin color areas detected in the skin color area detection step, any area larger than the determination reference value set from the maximum value estimated in the maximum value prediction step, and determining the face area from among the areas smaller than the determination reference value.
[0011]
In this aspect, skin color areas larger than the determination reference value are excluded from the face area candidates as not being face areas, and the face area is determined from among the skin color areas smaller than the determination reference value. With this algorithm, the influence of objects whose hue resembles skin color, such as the background or the ground under a tungsten light source, can be eliminated, and the face area can be extracted more accurately.
[0014]
To identify the true face area among the face area candidates, that is, the areas found to be smaller than the determination reference value, it is preferable to narrow down the candidates based on the shape of each area. For example, a human face can be regarded as roughly circular or elliptical, so an aspect ratio can be defined for deciding whether an area is a face area. Candidates whose aspect ratio deviates greatly from the prescribed value (for example, extremely elongated areas) are excluded as non-faces, and candidates close to the prescribed aspect ratio are judged to be faces.
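The aspect ratio test can be sketched by taking the bounding box of each candidate region and comparing its height-to-width ratio with a prescribed value. The target ratio of 1.25, the tolerance of 0.5, and the function names are hypothetical. The text only states that an aspect ratio is defined and that candidates far from it are rejected.

```python
def aspect_ratio(cells):
    """Height-to-width ratio of the bounding box of a region.

    cells is a non-empty collection of (row, column) positions belonging
    to one candidate region.
    """
    rows = [row for row, _ in cells]
    cols = [col for _, col in cells]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return height / width


def looks_like_face(cells, target_ratio=1.25, tolerance=0.5):
    """Accept a region whose bounding box is close to the prescribed aspect ratio."""
    return abs(aspect_ratio(cells) - target_ratio) <= tolerance
```

A bounding box is the simplest shape descriptor. Fitting an ellipse to the region would follow the circular or elliptical face model more closely, at a higher computational cost.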
[0015]
In another aspect, the face area extraction method of the present invention further includes a distance information acquisition step of acquiring subject distance information, and a face size prediction step of obtaining, based on the distance information acquired in the distance information acquisition step, the size assumed for a face area in the image.
[0016]
When the subject distance information is obtained in addition to the focal length information, the size of the face of the person actually photographed can be estimated more accurately, so the face area candidates are narrowed down based on the predicted value. And the accuracy of face extraction can be improved.
[0017]
In order to provide an apparatus that embodies the above method invention, a face area extraction apparatus according to the present invention is an apparatus for extracting an area corresponding to a human face from an image, comprising: information acquisition means for acquiring focal length information at the time of shooting; maximum value prediction means for obtaining, based on the focal length information acquired through the information acquisition means, the maximum value assumed for a face area in the image; skin color area detection means for analyzing the image data and detecting areas having a skin color hue in the image, the skin color area detection means including skin color detection means for detecting the skin color hue and saturation division means for further dividing the skin color portion detected by the skin color detection means according to saturation; and face area determination means for comparing each saturation-specific skin color area, obtained by the saturation division means from the skin color areas detected by the skin color area detection means, with a determination reference value set from the maximum value estimated by the maximum value prediction means, deleting any area larger than the determination reference value from the face area candidates, and determining the face area from among the areas smaller than the determination reference value.
[0018]
The face area extraction apparatus of the present invention can be incorporated into the signal processing unit of an electronic photographing apparatus (electronic camera) such as a digital camera or a video camera, and also into an image processing apparatus that plays back and displays, or prints, image data recorded by an electronic camera.
[0019]
The face area extraction apparatus of the present invention can also be implemented on a computer. A program that causes a computer to carry out the steps of the face area extraction method described above can be recorded on a CD-ROM, a magnetic disk, or another recording medium and provided to third parties on that medium, or offered as a download service over a communication line such as the Internet.
[0020]
[Embodiments of the Invention]
Hereinafter, preferred embodiments of a face area extracting method and apparatus according to the present invention will be described in detail with reference to the accompanying drawings.
[0021]
FIG. 1 is a block diagram showing the configuration of an electronic camera according to an embodiment of the present invention. The camera 10 is a digital camera that converts an optical image of a subject into digital image data and records it on a recording medium 12, and the face area extraction apparatus of the present invention is used in part of the signal processing means that processes the image signal obtained by shooting.
[0022]
The overall operation of the camera 10 is centrally controlled by a central processing unit (CPU) 14 built into the camera. The CPU 14 functions as control means that controls the camera system according to a predetermined program, and also as calculation means that performs various calculations such as automatic exposure (AE) calculation, automatic focus adjustment (AF) calculation, auto white balance (AWB) control, and face area extraction calculation.
[0023]
The CPU 14 is connected to a ROM 20 and a memory (RAM) 22 via a bus 16. The ROM 20 stores a program executed by the CPU 14 and various data necessary for control. The memory 22 is used as a program development area and a calculation work area for the CPU 14, and is also used as a temporary storage area for image data.
[0024]
An EEPROM 24 is also connected to the CPU 14. The EEPROM 24 is a non-volatile storage means that stores the table data needed for face area extraction (tables of skin color data, face size maximum value data, and the like), the data needed for AE, AF, and AWB control, and customization information set by the user; its contents are retained even when the power is off. The CPU 14 refers to the data in the EEPROM 24 as necessary when performing calculations. The ROM 20 may be non-rewritable or may be rewritable like an EEPROM.
[0025]
The camera 10 is provided with an operation unit 30 for a user to input various commands. The operation unit 30 includes various operation units such as a macro button 31, a shutter button 32, and a zoom switch 33.
[0026]
The macro button 31 is an operation means for setting (ON) / releasing (OFF) a macro mode suitable for short-distance shooting. Macro mode allows you to take close-up photos with a relatively shallow depth of field and a beautifully blurred background. When the camera 10 is set to the macro mode by pressing the macro button 31, focus control suitable for short-distance shooting is performed, and shooting is possible within a subject distance range of about 20 cm to 80 cm.
[0027]
The shutter button 32 is an operation means for inputting an instruction to start shooting, and is a two-stage stroke switch having an S1 switch that turns on at half-press and an S2 switch that turns on at full press. Turning S1 on triggers AE and AF processing, and turning S2 on triggers the exposure for recording. The zoom switch 33 is an operation means for changing the shooting magnification and the playback magnification.
[0028]
Although not shown, the operation unit 30 also includes operation means such as mode selection means for switching between shooting mode and playback mode, a menu button for displaying a menu screen on the liquid crystal monitor 40, a cross button (cursor movement operation means) for selecting a desired item from the menu screen, an OK button for confirming a selected item or commanding execution of a process, and a cancel button for erasing a desired target such as a selected item, cancelling an instruction, or returning to the previous operation state. The operation unit 30 is not limited to push switches, dials, lever switches, and similar members; it also includes controls realized through a user interface in which a desired item is selected from a menu screen.
[0029]
Signals from the operation unit 30 are input to the CPU 14. Based on these input signals, the CPU 14 controls each circuit of the camera 10 and performs, for example, lens drive control, shooting operation control, image processing control, image data recording/playback control, and display control of the liquid crystal monitor 40.
[0030]
The liquid crystal monitor 40 can be used as an electronic viewfinder for checking the angle of view at the time of shooting, and is used as a means for reproducing and displaying a recorded image. The liquid crystal monitor 40 is also used as a user interface display screen, and displays information such as menu information, selection items, and setting contents as necessary. Instead of a liquid crystal display, other types of display devices (display means) such as an organic EL can be used.
[0031]
Next, the shooting function of the camera 10 will be described.
[0032]
The camera 10 includes a photographing lens 42 serving as the photographing optical system and a CCD solid-state image sensor (hereinafter, CCD) 44. Another type of image sensor, such as a MOS solid-state image sensor, may be used in place of the CCD 44. The photographing lens 42 is a motorized zoom lens. Its detailed optical configuration is not shown, but it includes a variable-magnification lens group and a correction lens group, which mainly serve to change the magnification (vary the focal length), and a focus lens, which contributes to focus adjustment.
[0033]
When the photographer operates the zoom switch 33, the CPU 14 outputs a control signal to the zoom drive unit 46 in response. The zoom drive unit 46 is a motorized drive means that includes a motor (zoom motor) serving as the power source and its drive circuit. The motor drive circuit of the zoom drive unit 46 generates a lens drive signal based on the control signal from the CPU 14 and supplies it to the zoom motor. The motor drive voltage output from the motor drive circuit thus operates the zoom motor, and the variable-magnification lens group and the correction lens group in the photographing lens 42 move back and forth along the optical axis, changing the focal length (optical zoom magnification) of the photographing lens 42.
[0034]
In this example, it is assumed that the focal length of the photographing lens 42 can be varied in 10 steps within the zoom operation range from the wide (wide angle) end to the tele (telephoto) end. The photographer can select a desired focal length according to the shooting purpose and perform shooting.
[0035]
The zoom position (corresponding to the focal length) of the photographing lens 42 is detected by the zoom position detection sensor 48, and the detection signal is reported to the CPU 14. The CPU 14 can determine the current zoom position (that is, the focal length) from the signal from the zoom position detection sensor 48. The zoom position detection sensor 48 may be a circuit that generates pulses as the zoom motor or the like rotates, or may be configured with a position detection encoder plate disposed on the outer periphery of the lens barrel; its form is not particularly limited.
[0036]
The light that has passed through the photographing lens 42 is incident on the CCD 44 after its amount is adjusted by a diaphragm mechanism (not shown). A large number of photosensors (light receiving elements) are arranged in a plane on the light receiving surface of the CCD 44, and primary color filters of red (R), green (G), and blue (B) are arranged in a predetermined array structure (Bayer, G stripe, etc.) corresponding to the photosensors.
[0037]
The subject image formed on the light receiving surface of the CCD 44 is converted into a signal charge of an amount corresponding to the amount of incident light by each photosensor. The CCD 44 has an electronic shutter function that controls the charge accumulation time (shutter speed) of each photosensor according to the timing of the shutter gate pulse.
[0038]
The signal charges accumulated in the photosensors of the CCD 44 are sequentially read out as voltage signals (image signals) corresponding to the signal charges, based on pulses supplied from the CCD driver 50. The image signal output from the CCD 44 is sent to the analog processing unit 52. The analog processing unit 52 is a front-end processing unit including a CDS (correlated double sampling) circuit and a gain adjustment circuit. In the analog processing unit 52, sampling processing and color separation into the R, G, and B color signals are performed, and the signal level of each color signal is adjusted (pre-white balance processing).
[0039]
The image signal output from the analog processing unit 52 is converted into a digital signal by the A / D converter 54 and then stored in the memory 22 via the signal processing unit 56. The image data stored in the memory 22 at this point is the A / D conversion output of the image signal from the CCD 44, recorded as it is (unprocessed); that is, it is image data that has not undergone signal processing such as gamma conversion or synchronization (hereinafter called CCDRAW data). However, "unprocessed" does not exclude all signal processing. For example, image data that has undergone defective pixel correction, which interpolates the data of defective pixels (flaws) of the image sensor, is included in the concept of CCDRAW data in that it has not been developed into a general-purpose format.
[0040]
A timing generator (TG) 58 supplies timing signals to the CCD driver 50, the analog processing unit 52, and the A / D converter 54 in accordance with instructions from the CPU 14, and these circuits are synchronized by the timing signals.
[0041]
The signal processing unit 56 is a digital signal processing block that also serves as a memory controller controlling reading from and writing to the memory 22. The signal processing unit 56 includes an auto calculation unit that performs AE / AF / AWB processing, a synchronization circuit (a processing circuit that calculates the color of each point by interpolating the spatial shift of the color signals caused by the color filter array of the single-chip CCD), a white balance circuit, a gamma conversion circuit, a luminance / color difference signal generation circuit, a contour correction circuit, a contrast correction circuit, and the like, and processes image signals using the memory 22 in accordance with commands from the CPU 14.
[0042]
The CCDRAW data stored in the memory 22 is sent to the signal processing unit 56 via the bus 16. The image data input to the signal processing unit 56 undergoes predetermined signal processing such as white balance adjustment, gamma conversion, and conversion into a luminance signal (Y signal) and color difference signals (Cr, Cb signals) (YC processing), and is then stored in the memory 22.
[0043]
When the captured image is output to the monitor, the image data is read from the memory 22 and transferred to the display circuit 60. The image data sent to the display circuit 60 is converted into a predetermined display signal (for example, an NTSC color composite video signal) and then output to the liquid crystal monitor 40. The image data in the memory 22 is periodically rewritten by the image signal output from the CCD 44, and the video signal generated from that image data is supplied to the liquid crystal monitor 40, so that the video being captured (through image) is displayed on the liquid crystal monitor 40 in real time. The photographer can check the angle of view (composition) from the video (so-called through movie) displayed on the liquid crystal monitor 40.
[0044]
When the photographer decides on the angle of view and presses the shutter button 32, the CPU 14 detects this. In response to a half press of the shutter button 32 (S1 = ON), it performs AE processing and AF processing; in response to a full press of the shutter button 32 (S2 = ON), it starts CCD exposure and readout control for capturing an image for recording.
[0045]
For AF control, the camera 10 uses, for example, contrast AF, which moves the focus lens (the movable lens, among the lens optical systems constituting the photographing lens 42, that contributes to focus adjustment) so that the high-frequency component of the G signal of the video signal is maximized. Specifically, the AF calculation unit comprises a high-pass filter that passes only the high-frequency component of the G signal, an absolute value processing unit, an AF area extraction unit that cuts out the signal within a focus target area set in advance in the screen (for example, the center of the screen), and an integration unit that integrates the absolute value data within the AF area.
[0046]
The integrated value data obtained by the AF calculation unit is reported to the CPU 14. While moving the focus lens by controlling the focus drive unit 62, which includes the AF motor, the CPU 14 calculates a focus evaluation value (AF evaluation value) at each of a plurality of AF detection points and determines the lens position at which the evaluation value is maximized as the in-focus position. It then controls the focus drive unit 62 to move the focus lens to that in-focus position.
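The contrast AF search described above can be sketched as follows. This is only an illustration under stated assumptions, not the camera's actual circuit: the function names `af_evaluation_value` and `find_focus_position`, the use of a horizontal first difference as the high-pass filter, and the sample frames are all hypothetical.

```python
def af_evaluation_value(g_rows, area):
    """Integrate the absolute high-frequency component of G inside the AF area.

    g_rows: 2-D list of G-signal samples.
    area:   (top, bottom, left, right), end-exclusive.
    A first difference between horizontal neighbours stands in for the
    high-pass filter (an assumption made for this sketch).
    """
    top, bottom, left, right = area
    total = 0
    for row in g_rows[top:bottom]:
        window = row[left:right]
        total += sum(abs(b - a) for a, b in zip(window, window[1:]))
    return total


def find_focus_position(frames_by_position, area):
    """Return the lens position whose frame gives the largest AF evaluation value."""
    return max(
        frames_by_position,
        key=lambda pos: af_evaluation_value(frames_by_position[pos], area),
    )


# Hypothetical G-signal frames captured at three AF detection points.
frames = {
    0: [[10, 11, 10, 11], [10, 11, 10, 11]],  # blurred: low contrast
    1: [[0, 20, 0, 20], [0, 20, 0, 20]],      # sharp: high contrast
    2: [[5, 12, 5, 12], [5, 12, 5, 12]],      # in between
}
best_position = find_focus_position(frames, area=(0, 2, 0, 4))
```

A real implementation would typically stop the search once the evaluation value starts to fall, rather than scanning every detection point.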
[0047]
For AE control, the AE calculation unit includes a circuit that divides one screen into a plurality of areas (for example, 8 × 8) and integrates the RGB signals for each divided area, and it supplies the integrated values to the CPU 14. The CPU 14 detects the brightness of the subject (subject brightness) based on the integrated values obtained from the AE calculation unit and calculates an exposure value (shooting EV value) suitable for shooting. An aperture value and a shutter speed are determined according to the obtained exposure value and a predetermined program diagram. A drive unit (not shown) including an iris motor and the electronic shutter of the CCD 44 are then controlled to obtain the optimum exposure amount.
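The per-area integration used for AE can be illustrated with a short sketch. It assumes a single color plane held as a 2-D list and a hypothetical helper name `integrate_blocks`; the patent does not describe how the circuit is implemented.

```python
def integrate_blocks(plane, rows=8, cols=8):
    """Split a 2-D plane into rows x cols areas and return the sum of each area."""
    height, width = len(plane), len(plane[0])
    sums = [[0] * cols for _ in range(rows)]
    for y in range(height):
        block_y = y * rows // height      # area row this pixel belongs to
        for x in range(width):
            block_x = x * cols // width   # area column this pixel belongs to
            sums[block_y][block_x] += plane[y][x]
    return sums


# Hypothetical 16 x 16 plane of uniform value 1: each of the 64 areas covers 2 x 2 pixels.
plane = [[1] * 16 for _ in range(16)]
block_sums = integrate_blocks(plane)
```

The same routine could be applied to the R, G, and B planes in turn to obtain the per-area integrated values passed to the CPU.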
[0048]
The image data captured in response to the full press of the shutter button 32 (S2 = ON) undergoes YC processing and other predetermined signal processing in the signal processing unit 56 and is then compressed by the compression / decompression circuit 64 according to a predetermined compression format (for example, JPEG). The compressed image data is recorded on the recording medium 12 via the media interface unit 66. The compression format is not limited to JPEG; MPEG or other methods may be adopted.
[0049]
As a means for storing image data, various media can be used, such as semiconductor memory cards typified by SmartMedia (trademark) and CompactFlash (trademark), magnetic disks, optical disks, and magneto-optical disks. Further, the recording medium is not limited to a removable medium and may be a recording medium built into the camera 10 (internal memory).
[0050]
When the playback mode is selected by the mode selection means, the last image file recorded on the recording medium 12 (the most recently recorded file) is read. The image file data read from the recording medium 12 is decompressed by the compression / decompression circuit 64 and output to the liquid crystal monitor 40 via the display circuit 60.
[0051]
By operating the cross button during single-frame playback in the playback mode, frames can be advanced in the forward or reverse direction. The next file after the frame advance is read from the recording medium 12, and its image is played back in the same manner as described above.
[0052]
FIG. 2 is a principal block diagram related to face extraction processing in the camera 10 of this example. Components in FIG. 2 that are the same as those described in FIG. 1 are denoted by the same reference numerals, and their description is omitted.
[0053]
In FIG. 2, an integration circuit 70 and a white balance circuit 72 in the upper stage are processing systems used for normal auto white balance processing. Further, the integration circuit 74, the white balance circuit 76, and the synchronization circuit 78 shown in the lower stage are processing units (hereinafter referred to as a preprocessing system) for performing preprocessing of the face extraction algorithm.
[0054]
The CCDRAW data converted into a digital signal by the A / D converter 54 is stored in the memory 22 and sent from the memory 22 to the integration circuit 70. The integration circuit 70 includes a circuit that divides one screen into a plurality of areas (for example, 8 × 8 = 64 blocks) and calculates, for each area, the average integrated value of each color of the RGB signal. The calculation result is sent to the white balance circuit 72, which assumes a reference light source (daylight). The signal whose gain has been adjusted by the white balance circuit 72 is supplied to the CPU 14.
[0055]
The CPU 14 obtains the integrated values of R, B, and G, calculates the ratios R / G and B / G, and performs scene discrimination (light source type discrimination) based on these R / G and B / G values and the shooting EV value obtained by the AE calculation. In accordance with a predetermined white balance adjustment value suited to the scene (a setting that leaves the atmosphere of the light source), it controls the amplifier gains of the white balance circuit 80 in the signal processing unit 56 and of the preprocessing white balance circuit 76 to correct the signal of each color channel. The method disclosed in Japanese Patent Application Laid-Open No. 2000-224608 and the like can be used for the light source type discrimination and for the white balance control that leaves the atmosphere of the light source. In scene discrimination, color temperature information such as R−Y and B−Y may be used instead of the values of R / G and B / G.
[0056]
Thereafter, an image for recording is captured in response to the full press of the shutter button 32 (S2 = ON). The CCDRAW data acquired in response to S2 = ON is temporarily stored in the memory 22 and then sent both to the signal processing unit 56 and to the integration circuit 74 of the preprocessing system for face extraction.
[0057]
The preprocessing integration circuit 74 divides one screen into, for example, 130 × 190 areas, and calculates an integrated value for each area. The calculation result of the integration circuit 74 is sent to the white balance circuit 76, where white balance processing reflecting AWB is performed. Data with AWB applied is sent to the synchronization circuit 78, where data of the same number of pixels (3-plane data) is generated for each of R, G, and B.
[0058]
Based on the RGB three-plane data generated by the synchronization circuit 78, the CPU 14 executes skin color detection, saturation division, face candidate extraction, and face area specification processing based on the shape of the face candidate area. Details of the face extraction algorithm will be described later.
[0059]
On the other hand, the CCDRAW data sent from the memory 22 to the signal processing unit 56 is subjected to processing reflecting AWB by the white balance circuit 80 and then sent to the gamma conversion circuit 82. The gamma conversion circuit 82 changes the input / output characteristics so that the RGB signal subjected to white balance adjustment has a desired gamma characteristic, and outputs it to the luminance / color difference signal generation circuit 84.
[0060]
The luminance / color difference signal generation circuit 84 generates a luminance signal Y and color difference signals Cr and Cb from the gamma-corrected R, G, and B signals. The luminance signal Y and the color difference signals Cr and Cb (YC signal) are compressed by the compression / decompression circuit 64 in a predetermined format such as JPEG and recorded on the recording medium 12.
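As an illustration of the YC conversion, the following sketch derives Y, Cb, and Cr from gamma-corrected R, G, and B values. The patent does not state which coefficients the circuit 84 uses; the ITU-R BT.601 weights below, commonly used with JPEG, are an assumption, as is the function name `rgb_to_ycbcr`.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert gamma-corrected R, G, B into Y, Cb, Cr.

    The BT.601 luma weights and color difference scale factors are assumed;
    the source does not specify the coefficients of the actual circuit.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance signal Y
    cb = 0.564 * (b - y)                    # scaled B - Y color difference
    cr = 0.713 * (r - y)                    # scaled R - Y color difference
    return y, cb, cr


# A neutral gray has no color difference; pure red has a positive Cr and negative Cb.
gray = rgb_to_ycbcr(128, 128, 128)
red = rgb_to_ycbcr(255, 0, 0)
```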
[0061]
Next, the face extraction function installed in the camera 10 according to the present embodiment will be described.
[0062]
FIG. 3 is a diagram showing the relationship between the subject and its image in the photographing optical system. As shown in the figure, the ratio of the photographing distance D and the viewing angle L (the entire range that can be photographed at this focal length) is equal to the ratio of the focal length DF of the photographing lens 42 and the CCD size H.
[0063]
[Expression 1]
Shooting distance D: viewing angle L = focal length DF: CCD size H (1)
That is, the following relationship is established between the actual size A of the human face included in the viewing angle L and the image size a on the CCD light receiving surface.
[0064]
[Expression 2]
Shooting distance D: actual size A = focal length DF: image size a (2)
The shooting distance D is the shortest distance (closest distance) that can be shot by the camera 10. If the macro mode is OFF, the shortest distance to the camera 10 is about 60 cm. When the macro mode is ON, the shortest distance is about 20 cm. However, when the macro mode is ON, the possibility of capturing a person is low, and thus face extraction processing is not performed.
[0065]
Although the actual size A of the person's face has some variation, a prescribed value can be set assuming that the size of the face is within a certain size range.
[0066]
Therefore, the image size on the CCD 44 surface (the number of pixels when a human face forms an image on the CCD 44 surface) can be determined from the relationship of the above equation (2). Thus, the face size at each focal length can be obtained.
[0067]
Since the camera 10 of this example allows 10 focal length steps to be selected from the wide end to the tele end, the maximum face size on the CCD 44 surface for each of the 10 focal lengths is stored as table data in the EEPROM 24 (data storage means).
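A table of this kind can be derived from equation (2), which gives the image size as a = A × DF / D. The sketch below is illustrative only: the face size of 250 mm, the pixel pitch of 0.005 mm, and the list of ten focal lengths are hypothetical values, while the 600 mm shortest distance follows the roughly 60 cm stated above for macro mode OFF.

```python
def max_face_size_pixels(focal_length_mm, face_size_mm=250.0,
                         shortest_distance_mm=600.0, pixel_pitch_mm=0.005):
    """Largest expected face size on the sensor, in pixels.

    From D : A = DF : a, the image size is a = A * DF / D. Evaluating it at
    the shortest shooting distance D gives the maximum face size.
    """
    image_size_mm = face_size_mm * focal_length_mm / shortest_distance_mm
    return image_size_mm / pixel_pitch_mm


# Hypothetical focal lengths for the 10 zoom steps from the wide end to the tele end.
focal_lengths_mm = [8.0 + 1.6 * step for step in range(10)]

# Table indexed by zoom step, analogous to the data held in the EEPROM.
max_face_table = {
    step: max_face_size_pixels(focal_length)
    for step, focal_length in enumerate(focal_lengths_mm)
}
```

Because the image size grows in proportion to the focal length, the table entries increase monotonically from the wide end to the tele end.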
[0068]
FIG. 4 is a flowchart showing a sequence in face extraction.
[0069]
First, CCDRAW data is recorded in the memory 22 through S1 = ON and S2 = ON of the shutter button 32 (step S110). Thereafter, the CPU 14 obtains focal length information and macro ON / OFF information at the time of photographing (step S112). When the macro mode is ON, face extraction processing is not performed.
[0070]
On the other hand, when the macro mode is OFF, processing by the preprocessing system (74, 76, 78) described in FIG. 2 is performed, and integration processing, white balance processing, and synchronization processing are applied to the recorded CCDRAW data (step S114 in FIG. 4).
[0071]
Next, the CPU 14 acquires a skin color table from the EEPROM 24 (step S116). The skin color table is data that defines the range of hues recognized as “skin color” in a predetermined color space. Based on the RGB three-plane data (linear data) acquired from the synchronization circuit 78 of the preprocessing system, the CPU 14 detects the portions whose hue falls within the range defined by the skin color table (step S118).
[0072]
FIG. 5 is a diagram illustrating a hue range (skin color extraction area) to be extracted as a skin color. The illustrated color space is a linear system before gamma is applied, and is a coordinate system in which the horizontal axis is R / G and the vertical axis is B / G.
[0073]
In FIG. 5, a range surrounded by a rectangular frame denoted by reference numeral 88 is set as a skin color extraction area. That is, a range in which R is slightly larger than G and B is slightly lower than G is defined as “skin color”, and a color within this range is determined as a skin color.
[0074]
The skin color extraction area 88 is further divided into a plurality of areas according to saturation. In FIG. 5, “saturation” is represented by the distance from the origin O (1, 1), and the saturation increases as the distance from the origin O increases. In the example of the figure, the skin color extraction area 88 is divided into six regions by concentric boundary lines (shown as broken lines) centered on the origin.
[0075]
After detecting the skin color area from the RGB three-plane data acquired from the synchronization circuit 78 of the preprocessing system, the CPU 14 further divides the detected skin color area according to saturation (step S120 in FIG. 4).
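Skin color detection followed by saturation division can be sketched as below. The rectangular bounds of the extraction area and the spacing of the concentric boundaries are hypothetical, since the patent gives no numeric values; only the coordinate system (R / G, B / G), the origin (1, 1), and the six saturation regions come from the description.

```python
import math

# Hypothetical bounds of the skin color extraction area in the (R/G, B/G) plane.
SKIN_RG_RANGE = (1.05, 1.60)   # R somewhat larger than G
SKIN_BG_RANGE = (0.50, 0.95)   # B somewhat smaller than G
SATURATION_BIN_WIDTH = 0.12    # hypothetical spacing of the concentric boundaries
NUM_SATURATION_BINS = 6        # six regions, as in the example of FIG. 5


def classify_pixel(r, g, b):
    """Return None for a non-skin color, else a saturation region index 0..5."""
    if g <= 0:
        return None
    rg, bg = r / g, b / g
    in_rg = SKIN_RG_RANGE[0] <= rg <= SKIN_RG_RANGE[1]
    in_bg = SKIN_BG_RANGE[0] <= bg <= SKIN_BG_RANGE[1]
    if not (in_rg and in_bg):
        return None
    # Saturation is the distance from the origin O (1, 1).
    saturation = math.hypot(rg - 1.0, bg - 1.0)
    return min(int(saturation / SATURATION_BIN_WIDTH), NUM_SATURATION_BINS - 1)


low_saturation = classify_pixel(120, 100, 90)    # close to the origin
high_saturation = classify_pixel(150, 100, 60)   # far from the origin
neutral = classify_pixel(100, 100, 100)          # gray: outside the skin color area
```

Pixels sharing a region index can then be grouped into connected areas whose size and shape are examined in the later steps.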
[0076]
FIG. 6 shows an example in which the area detected as skin color is further divided by saturation. Different objects in an image have different saturations, and in general the skin color of a person's face tends to be higher in saturation than white objects under a tungsten light source or wooden objects such as desks. Therefore, finely dividing the detected skin color area according to saturation and grasping the shape of each resulting area makes it easier to discriminate between face regions and regions other than faces.
[0077]
After the area division by saturation in step S120 of FIG. 4, the process proceeds to step S122. In step S122, the face area maximum value table is read from the EEPROM 24 in accordance with the focal length information obtained in step S112, the predicted maximum value of the face area at that focal length is obtained, and this maximum value is set as the determination reference value for face area determination. Since shooting may take place near the shortest shooting distance of the camera 10, it is of course preferable either to set the determination reference value by adding a predetermined margin to the predicted maximum value or to allow a margin in the determination itself.
[0078]
Then, each skin color area divided by saturation is compared with the predicted maximum value. An area far larger than the maximum value is judged not to be a face and is excluded from the face area candidates, and the remaining areas are extracted as face area candidates (step S124). This eliminates large regions that cannot actually be “faces”.
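The narrowing in step S124 amounts to a size filter, sketched below. The region representation (a dict with `width` and `height` in pixels), the function name `filter_face_candidates`, and the margin factor of 1.2 are assumptions for illustration; the patent only states that a margin is preferable.

```python
def filter_face_candidates(regions, predicted_max, margin=1.2):
    """Keep regions whose longer side does not exceed the determination reference value.

    The reference value is the predicted maximum face size times a margin
    (the factor 1.2 is a hypothetical choice).
    """
    reference = predicted_max * margin
    return [
        region for region in regions
        if max(region["width"], region["height"]) <= reference
    ]


# Hypothetical saturation-divided skin color regions, sizes in pixels.
regions = [
    {"name": "face", "width": 30, "height": 36},
    {"name": "cloth", "width": 120, "height": 80},
    {"name": "hand", "width": 12, "height": 15},
]
candidates = filter_face_candidates(regions, predicted_max=40)
```

With these sample values the background cloth is dropped, while the face and the hand remain as candidates for the shape check that follows.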
[0079]
For the face area candidates narrowed down in step S124, shape detection is further performed for each area, and the face area is specified from the shape (step S126). That is, the shape of each face candidate is compared with an aspect ratio determined from a model shape (ellipse or circle) appropriate for a face, and a region whose detected aspect ratio deviates greatly from the specified value is judged to be something other than a face. The face candidates are further narrowed down by this shape determination, and an area having a shape with the predetermined aspect ratio is determined to be a “face”.
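The aspect ratio test in step S126 could look like the following sketch. The target ratio of 1.3 (height to width) and the tolerance of 0.4 are hypothetical; the patent specifies only that the model shape is an ellipse or circle and that large deviations are rejected.

```python
def is_face_shaped(width, height, target_ratio=1.3, tolerance=0.4):
    """Return True when height / width is close to the ratio expected for a face.

    target_ratio and tolerance are hypothetical values chosen for illustration.
    """
    if width <= 0 or height <= 0:
        return False
    return abs(height / width - target_ratio) <= tolerance


upright_oval = is_face_shaped(30, 39)    # ratio 1.3
wide_strip = is_face_shaped(100, 20)     # ratio 0.2, e.g. a desk top
```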
[0080]
Thus, the face area is extracted, and the extraction result is used for brightness correction, white balance correction, color correction for bringing the skin color close to the best color (target value), red-eye correction, and the like.
[0081]
According to the embodiment described above, the size of the face area is estimated from focal length information, which is camera information, while the skin color area obtained by detecting skin color in the image is divided into areas by saturation. Each area that is extremely large compared with the estimated maximum value is excluded from the face area candidates, so the candidates can be narrowed down. The shape of each remaining area is then recognized to make the final face determination, so the correct face area can be extracted with high accuracy.
[0082]
For example, the scene shown in FIG. 7 is a photograph of a person 94 in front of a white cloth 92 under the illumination of a light bulb 90. By excluding objects with a hue similar to skin color, such as the background cloth 92 and the wooden desk 96, the face of the person 94 can be extracted accurately.
[0083]
[Modification 1]
In the above-described embodiment, the example in which the face area is completely specified from the captured image has been described. However, when the present invention is implemented, an aspect in which the face area is not finally specified is possible.
[0084]
For example, when performing brightness correction with emphasis on a person's face, there is a mode in which, instead of or in combination with narrowing down the face area candidates, the weighting coefficients used in the brightness calculation for the skin color areas extracted by skin color detection are changed according to the focal length information.
[0085]
As shown in FIG. 8, when a plurality of skin color areas are detected in the screen by skin color detection, a weight wi (i = 1, 2, 3, ...) corresponding to how likely each area is to actually be a face is set based on the focal length information, and the brightness Y is calculated according to the following equation (3).
[0086]
[Equation 3]
Y = Σ (wi × Yi) / Σwi (3)
Here, Yi denotes the brightness of each skin color area.
[0087]
As for the weights wi, the weight of an area that is unlikely to be a face is reduced (to “0” or a value close to “0”), so that the information of areas that are likely to be faces is strongly reflected. Correction processing is then performed so that the brightness Y thus obtained approaches a predetermined target value.
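Equation (3) is a weighted mean, which can be written out as a short sketch. The function name `weighted_brightness` and the sample weights and brightness values are hypothetical.

```python
def weighted_brightness(areas):
    """Compute Y = sum(wi * Yi) / sum(wi) over (weight, brightness) pairs."""
    total_weight = sum(weight for weight, _ in areas)
    if total_weight == 0:
        raise ValueError("at least one area must have a non-zero weight")
    return sum(weight * brightness for weight, brightness in areas) / total_weight


# Hypothetical skin color areas as (wi, Yi): two likely faces and one oversized
# area whose weight has been set to 0 because it is unlikely to be a face.
areas = [(1.0, 100.0), (0.0, 200.0), (1.0, 120.0)]
brightness = weighted_brightness(areas)
```

Setting a weight to 0 removes that area from the result entirely, so the bright oversized area in this sample does not pull the value away from the two likely faces.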
[0088]
[Modification 2]
There is an aspect in which the distance information of the subject is used instead of or in combination with the aspect using the focal length information described in FIG.
[0089]
FIG. 9 shows a block diagram of an electronic camera according to another embodiment of the present invention. In FIG. 9, parts that are the same as or similar to those in FIG. 1 are given the same reference numerals, and their description is omitted.
[0090]
The camera 10 shown in FIG. 9 includes a distance measuring sensor 102 as means for measuring the subject distance (shooting distance). A signal obtained from the distance measuring sensor 102 is input to the CPU 14, and the CPU 14 acquires distance information of the subject.
[0091]
As the means for detecting the subject distance, a well-known AF mechanism can be used, such as one using a ranging method based on the principle of triangulation, typified by the active and passive methods, or one using a phase-difference method. Even in a camera in which the distance measuring sensor 102 is omitted (the configuration shown in FIG. 1), when the focus lens is moved to the in-focus position by contrast AF or the like, position information of the focus lens (focus position information) can be acquired from a focus position detection unit, and the distance between the camera 10 and the subject (subject distance) can be calculated from this information.
[0092]
Thus, by acquiring the distance information of the subject, it is possible to estimate the size of the face actually photographed at that distance. When calculating the face size, table data indicating the size of the face corresponding to the subject distance may be stored in the EEPROM 24, or may be calculated using an arithmetic expression.
[0093]
By excluding the skin color area that is extremely larger or smaller than the face size calculated based on the distance information of the subject from the face candidates, the face area can be extracted more accurately.
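When an arithmetic expression is used instead of a table, the expected face size follows from equation (2) with the measured subject distance in place of the shortest distance. The sketch below assumes a face size of 250 mm and acceptance factors of 0.5 and 2.0; these values and the function names are hypothetical.

```python
def expected_face_size_mm(focal_length_mm, subject_distance_mm, face_size_mm=250.0):
    """Image size a = A * DF / D on the sensor for a face at the measured distance."""
    return face_size_mm * focal_length_mm / subject_distance_mm


def plausible_by_distance(region_size_mm, focal_length_mm, subject_distance_mm,
                          lower=0.5, upper=2.0):
    """Accept a region only if its size is neither far smaller nor far larger
    than the expected face size (lower and upper are hypothetical factors)."""
    expected = expected_face_size_mm(focal_length_mm, subject_distance_mm)
    return expected * lower <= region_size_mm <= expected * upper


# With a 10 mm focal length and a subject 2 m away, the expected size is 1.25 mm.
expected = expected_face_size_mm(10.0, 2000.0)
accepted = plausible_by_distance(1.2, 10.0, 2000.0)
too_large = plausible_by_distance(5.0, 10.0, 2000.0)
too_small = plausible_by_distance(0.3, 10.0, 2000.0)
```

Unlike the focal-length-only check, this test also rejects regions that are too small, which is possible only because the actual subject distance is known.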
[0094]
FIG. 10 shows a face extraction sequence using distance information. In the figure, the same or similar steps as those in the flowchart of FIG.
[0095]
In the flowchart shown in FIG. 10, in step S112, a process of reading information from the distance measuring sensor 102 and obtaining subject distance information is added.
[0096]
By obtaining subject distance information in addition to focal length information, the face size can be predicted more accurately. A reasonable numerical range allowed for a face area can therefore be set based on the prediction, which improves extraction accuracy.
[0097]
Specifically, in step S122 of FIG. 10, the maximum value table is obtained based on the focal length, the minimum value table of the face area is obtained based on the subject distance, and the numerical range defined in these tables is obtained. Based on this, the face area candidates are narrowed down (step S124). Alternatively, in step S122, a determination reference value (or determination reference range) that considers the maximum value and the minimum value of the face area is determined based on the focal length and the subject distance, and a face area candidate is determined according to the determination reference value. There are modes such as narrowing down (step S124).
[0098]
In FIGS. 9 and 10, the subject distance information is acquired from the distance measuring sensor 102. However, the subject distance information need not be acquired from the camera used for shooting; it may also be obtained from, for example, attached information (tag information) added to the image data.
[0099]
In the above-described embodiment, the digital camera having the optical zoom function has been described. However, for a camera using a single focus lens, the face size may be determined using the value of the focal length of the lens. Also, for a camera equipped with an electronic zoom (digital zoom) function that electronically processes an image signal to obtain an enlarged image, an image of the entire angle of view is captured and the face area is determined from the entire image data before the electronic zoom processing. By detecting this, it is possible to apply the same technique as in the above-described embodiment.
[0100]
In the above-described embodiment, a digital camera has been used as an example, but the scope of application of the present invention is not limited to this. The present invention can also be applied to other information devices having an electronic imaging function, such as camera-equipped mobile phones, camera-equipped PDAs, and camera-equipped mobile personal computers. In this case, the imaging unit may be of a detachable (external) type that can be separated from the main body of the mobile phone or other device.
[0101]
Further, the present invention can also be applied to an image reproducing apparatus that reproduces and displays image data recorded by the device with the electronic imaging function as described above, or a printing apparatus that performs print output.
[0102]
[Effects of the Invention]
According to the present invention, the maximum size of a person's face is predicted from the focal length information of the photographing optical system, and a skin color region larger than the determination reference value set from that maximum value is treated as unlikely to be a face region. The face region can therefore be determined with higher accuracy.
[0103]
Further, by acquiring subject distance information in addition to focal length information, it is possible to more accurately estimate the size of a person's face that is actually photographed, and to improve face extraction accuracy.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of an electronic camera according to an embodiment of the present invention.
FIG. 2 is a principal block diagram related to face extraction processing in the camera of this example.
FIG. 3 is a diagram showing the relationship between a subject and its image in a photographic optical system.
FIG. 4 is a flowchart showing a sequence of face extraction processing.
FIG. 5 is a diagram illustrating an example of a hue range (skin color extraction area) to be extracted as a skin color and its division by saturation.
FIG. 6 is a diagram showing an example in which a region where skin color is detected is further divided by saturation.
FIG. 7 is a diagram showing an example of a shooting scene.
FIG. 8 is a diagram showing an example of brightness calculation when there are a plurality of skin color areas in the screen.
FIG. 9 is a block diagram of an electronic camera according to another embodiment of the present invention.
FIG. 10 is a flowchart showing a sequence of face extraction processing using subject distance information.
[Explanation of symbols]
10 ... Camera, 14 ... CPU, 24 ... EEPROM, 42 ... Photographing lens, 44 ... CCD, 48 ... Zoom position detection sensor, 74 ... Integration circuit, 76 ... White balance circuit, 78 ... Synchronization circuit, 88 ... Skin color extraction area, 102 ... Distance measuring sensor

Claims

A face region extraction method for extracting a region corresponding to a human face from an image, comprising:
an information acquisition step of acquiring focal length information at the time of shooting;
a maximum value prediction step of obtaining, based on the focal length information acquired in the information acquisition step, a maximum value assumed for a face region in the image;
a skin color region detection step of analyzing image data to detect a region having a skin color hue within the image, the skin color region detection step including a skin color detection step of detecting a skin color hue and a saturation division step of further dividing the skin color portion detected in the skin color detection step by saturation; and
a processing step of comparing the saturation-specific skin color regions divided in the saturation division step, among the skin color regions detected in the skin color region detection step, with a determination reference value set from the maximum value estimated in the maximum value prediction step, and treating a region larger than the determination reference value as unlikely to be a face region.

The face region extraction method according to claim 1, wherein the processing step includes a face region determining step of excluding from the face region candidates, among the skin color regions detected in the skin color region detection step, any region larger than the determination reference value set from the maximum value estimated in the maximum value prediction step, and determining a face region from among the regions smaller than the determination reference value.

The face region extraction method according to claim 2, wherein the face region determining step determines a face region based on the shape of a skin color region that is a face region candidate.

The face region extraction method according to any one of claims 1 to 3, further comprising:
a distance information acquisition step of acquiring subject distance information; and
a face size prediction step of obtaining, based on the distance information acquired in the distance information acquisition step, a size assumed for a face region in the image.

A face region extraction apparatus for extracting a region corresponding to a human face from an image, comprising:
information acquisition means for acquiring focal length information at the time of shooting;
maximum value predicting means for obtaining, based on the focal length information acquired via the information acquisition means, a maximum value assumed for a face region in the image;
skin color region detecting means for analyzing image data to detect a region having a skin color hue within the image, the skin color region detecting means including skin color detecting means for detecting a skin color hue and saturation dividing means for further dividing the skin color portion detected by the skin color detecting means by saturation; and
face region determining means for comparing the saturation-specific skin color regions divided by the saturation dividing means, among the skin color regions detected by the skin color region detecting means, with a determination reference value set from the maximum value estimated by the maximum value predicting means, excluding any region larger than the determination reference value from the face region candidates, and determining a face region from among the regions smaller than the determination reference value.