JP2011034311A - Image processor and method

Info

Publication number: JP2011034311A
Application number: JP2009179548A
Authority: JP
Inventors: Kazunori Kita; 一記喜多
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2009-07-31
Filing date: 2009-07-31
Publication date: 2011-02-17
Anticipated expiration: 2029-07-31
Also published as: JP5471130B2

Abstract

PROBLEM TO BE SOLVED: To provide an image processor that improves the speed and precision of face detection/face recognition processing. SOLUTION: An image processor estimates an attention point region 54 by using a saliency map 53S obtained by integrating a plurality of feature amount maps 52Fc, 52Fh, and 52Fs for a through image 51 (steps Sa to Sc). The image processor uses the attention point region 54 to set a parameter for the face detection/face recognition processing, for example, a parameter indicating whether a region is a processing target (step Sd). For example, only a region 56 among regions 55 to 57 is set as the processing target. Then, the image processor performs the face detection/face recognition processing on the region 56 (step Se). Thus, a face detection region 58 is obtained. COPYRIGHT: (C)2011,JPO&INPIT

Description

The present invention relates to an image processing apparatus and method, and more particularly to a technique for improving the speed and accuracy of face detection/face recognition processing.

Conventionally, there are processes that detect a person's face, or recognize the face of a specific person, in an image captured by a camera (hereinafter referred to as a captured image). Such processing is hereinafter referred to as face detection/face recognition processing.

For example, Patent Documents 1 to 6 disclose methods that use the results of face detection/face recognition processing to realize various kinds of image processing, such as AF (Automatic Focus) processing and AE (Automatic Exposure) processing.

Patent Document 1: JP 05-37940 A
Patent Document 2: JP 05-41830 A
Patent Document 3: JP 05-53041 A
Patent Document 4: JP 06-303491 A
Patent Document 5: JP 06-217187 A
Patent Document 6: JP 2006-254229 A

However, a small camera is limited in the processing capability it can carry and in the processing time available. For this reason, when the conventional methods disclosed in Patent Documents 1 to 6 and the like are applied to such a small camera, face detection/face recognition processing becomes slow. As a result, the user may be hindered in the shooting operation or may miss the shutter timing. In particular, the probability of such problems rises as the number of faces to be detected or recognized increases.
In addition, when the conventional methods disclosed in Patent Documents 1 to 6 and the like are applied to a small camera with limited processing capability and processing time, sufficient accuracy of face detection/face recognition processing cannot be obtained. As a result, erroneous recognition or erroneous detection may occur, and in the worst case, recognition or detection of a face may become impossible altogether.

Improved speed and accuracy of face detection/face recognition processing are required even of a small camera with limited processing capability and processing time. However, the conventional methods disclosed in Patent Documents 1 to 6 and the like cannot sufficiently meet this requirement.

The present invention has been made in view of these conventional problems, and an object thereof is to improve the speed and accuracy of face detection/face recognition processing.

According to a first aspect of the present invention, there is provided an image processing apparatus including: an estimation unit that, for an input image including a main subject, estimates an attention point region using a saliency map based on a plurality of feature amounts extracted from the input image; a setting unit that uses the attention point region estimated by the estimation unit to set a parameter relating to subject detection processing for detecting the main subject from the input image; and a detection unit that executes the subject detection processing using the parameter set by the setting unit.

According to a second aspect of the present invention, there is provided the image processing apparatus, wherein the detection unit executes, as the subject detection processing, face processing that detects the face of a person as the main subject or recognizes the face of a specific person.

According to a third aspect of the present invention, there is provided the image processing apparatus, wherein the parameter is a parameter indicating whether or not a region is a target of the face processing; the setting unit performs a first setting that includes the attention point region in the target of the face processing when the attention point region satisfies a predetermined condition, and performs a second setting that excludes the attention point region from the target of the face processing when the attention point region does not satisfy the predetermined condition; and the detection unit executes the face detection processing on the attention point region when the first setting has been performed by the setting unit, and is prohibited from executing the face detection processing on the attention point region when the second setting has been performed by the setting unit.

According to a fourth aspect of the present invention, there is provided the image processing apparatus, wherein the parameter is the size or target range of the face region detected by the face processing, the size or resolution of the region from which feature points are extracted from the face detected by the face detection processing, or a combination of two or more of these.

According to a fifth aspect of the present invention, there is provided the image processing apparatus, wherein the parameter is a parameter relating to the accuracy of the face processing.

According to a sixth aspect of the present invention, there is provided the image processing apparatus, wherein the parameter is an order of the face processing; the setting unit assigns the order to each of one or more attention point regions according to a predetermined condition; and the detection unit sequentially executes the face processing on each of the one or more attention point regions according to the order set by the setting unit.

According to a seventh aspect of the present invention, there is provided an image processing method including: an estimation step of estimating, for an input image including a main subject, an attention point region using a saliency map based on a plurality of feature amounts extracted from the input image; a setting step of using the attention point region estimated in the estimation step to set a parameter relating to subject detection processing for detecting the main subject from the input image; and a detection step of executing the subject detection processing using the parameter set in the setting step.

According to the present invention, the speed and accuracy of face processing (face detection/face recognition processing) can be improved.

FIG. 1 is a hardware configuration diagram of an image processing apparatus according to an embodiment of the present invention.
FIG. 2 is a diagram explaining the outline of the face detection/face recognition processing and its preprocessing in an embodiment of the present invention, showing an example of a specific processing result.
FIG. 3 is a diagram explaining the outline of the face detection/face recognition processing and its preprocessing in an embodiment of the present invention, showing an example of a specific processing result.
FIG. 4 is a diagram explaining the outline of the face detection/face recognition processing and its preprocessing in an embodiment of the present invention, showing an example of a specific processing result.
FIG. 5 is a diagram explaining the outline of the face detection/face recognition processing and its preprocessing in an embodiment of the present invention, showing an example of a specific processing result.
FIG. 6 is a flowchart showing an example of the flow of shooting mode processing in an embodiment of the present invention.
FIG. 7 is a flowchart showing an example of the flow of shooting mode processing in an embodiment of the present invention.
FIG. 8 is a flowchart showing a detailed example of the flow of the attention point region estimation processing within the shooting mode processing in an embodiment of the present invention.
FIG. 9 is a flowchart showing an example of the flow of the feature amount map creation processing within the shooting mode processing in an embodiment of the present invention.
FIG. 10 is a flowchart showing another example of the flow of the feature amount map creation processing within the shooting mode processing in an embodiment of the present invention.
FIG. 11 is a flowchart showing a detailed example of the flow of the face detection/face recognition processing within the shooting mode processing in an embodiment of the present invention.

Hereinafter, an embodiment of the present invention will be described with reference to the drawings.
FIG. 1 is a diagram illustrating the hardware configuration of an image processing apparatus 100 according to an embodiment of the present invention. The image processing apparatus 100 can be configured as, for example, a digital camera.

The image processing apparatus 100 includes an optical lens device 1, a shutter device 2, an actuator 3, a CMOS sensor 4, an AFE 5, a TG 6, a DRAM 7, a DSP 8, a CPU 9, a RAM 10, a ROM 11, a liquid crystal display controller 12, a liquid crystal display 13, an operation unit 14, a memory card 15, a distance measuring sensor 16, and a photometric sensor 17.

The optical lens device 1 is composed of, for example, a focus lens and a zoom lens. The focus lens is a lens for forming a subject image on the light receiving surface of the CMOS sensor 4.

The shutter device 2 is composed of, for example, shutter blades. The shutter device 2 functions as a mechanical shutter that blocks the light beam incident on the CMOS sensor 4. The shutter device 2 also functions as a diaphragm that adjusts the amount of light incident on the CMOS sensor 4. The actuator 3 opens and closes the shutter blades of the shutter device 2 under the control of the CPU 9.

The CMOS (Complementary Metal Oxide Semiconductor) sensor 4 is composed of, for example, a CMOS image sensor. A subject image enters the CMOS sensor 4 from the optical lens device 1 through the shutter device 2. In accordance with the clock pulses supplied from the TG 6, the CMOS sensor 4 photoelectrically converts (captures) the subject image at regular intervals, accumulates the resulting image signal, and sequentially outputs the accumulated image signal as an analog signal.

The AFE (Analog Front End) 5 is supplied with the analog image signal from the CMOS sensor 4. In accordance with the clock pulses supplied from the TG 6, the AFE 5 applies various kinds of signal processing, such as A/D (Analog/Digital) conversion, to the analog image signal. As a result, a digital signal is generated and output from the AFE 5.

The TG (Timing Generator) 6 supplies clock pulses to the CMOS sensor 4 and the AFE 5 at regular intervals under the control of the CPU 9.

The DRAM (Dynamic Random Access Memory) 7 temporarily stores the digital signal generated by the AFE 5 and the image data generated by the DSP 8.

The DSP (Digital Signal Processor) 8 applies various kinds of image processing, such as white balance correction, γ correction, and YC conversion, to the digital signal stored in the DRAM 7 under the control of the CPU 9. As a result, image data consisting of a luminance signal and color difference signals is generated. Hereinafter, such image data is referred to as frame image data, and the image represented by the frame image data is referred to as a frame image.

The CPU (Central Processing Unit) 9 controls the overall operation of the image processing apparatus 100. The RAM (Random Access Memory) 10 functions as a working area when the CPU 9 executes each process. The ROM (Read Only Memory) 11 stores the programs and data that the image processing apparatus 100 needs to execute each process. The CPU 9 executes various processes in cooperation with the programs stored in the ROM 11, using the RAM 10 as a working area.

The liquid crystal display controller 12 converts frame image data stored in the DRAM 7 or the memory card 15 into an analog signal under the control of the CPU 9 and supplies the analog signal to the liquid crystal display 13. The liquid crystal display 13 displays the frame image as the image corresponding to the analog signal supplied from the liquid crystal display controller 12.

The liquid crystal display controller 12 also converts various image data stored in advance in the ROM 11 or the like into analog signals under the control of the CPU 9 and supplies them to the liquid crystal display 13. The liquid crystal display 13 displays the image corresponding to the analog signal supplied from the liquid crystal display controller 12. For example, in the present embodiment, image data of information that can identify various scenes (hereinafter referred to as scene information) is stored in the ROM 11. Accordingly, as will be described later with reference to FIG. 4, various scene information is displayed on the liquid crystal display 13 as appropriate.

The operation unit 14 accepts operations of various buttons from the user. The operation unit 14 includes a power button, a cross button, an enter button, a menu button, a release button, and the like. The operation unit 14 supplies the CPU 9 with signals corresponding to the button operations accepted from the user. The CPU 9 analyzes the content of the user's operation based on the signals from the operation unit 14 and executes processing according to that content.

The memory card 15 records the frame image data generated by the DSP 8. The distance measuring sensor 16 detects the distance to the subject under the control of the CPU 9. The photometric sensor 17 detects the luminance (brightness) of the subject under the control of the CPU 9.

The image processing apparatus 100 configured in this way has various operation modes, including a shooting mode and a playback mode. For simplicity, however, only the processing in the shooting mode (hereinafter referred to as shooting mode processing) is described below. In the following, the shooting mode processing is assumed to be performed mainly by the CPU 9.

Next, within the shooting mode processing of the image processing apparatus 100 of FIG. 1, an outline is given of the preprocessing up to the point where a parameter is set using an attention point region based on a saliency map, and of the face detection/face recognition processing that uses that parameter.
FIG. 2 is a diagram explaining the outline of the face detection/face recognition processing and its preprocessing, and shows an example of a specific processing result.

When the shooting mode starts, the CPU 9 of the image processing apparatus 100 of FIG. 1 keeps the CMOS sensor 4 capturing images and temporarily stores in the DRAM 7 the frame image data that the DSP 8 generates successively in the meantime. Hereinafter, this series of processes by the CPU 9 is referred to as through imaging.
The CPU 9 also controls the liquid crystal display controller 12 and the like to sequentially read out the frame image data recorded in the DRAM 7 during through imaging and display the corresponding frame images on the liquid crystal display 13. Hereinafter, this series of processes by the CPU 9 is referred to as through display, and a frame image shown in the through display is referred to as a through image.
In the following description, it is assumed that, by through imaging and through display, the through image 51 shown in FIG. 2, for example, is displayed on the liquid crystal display 13. Note that the through image 51 shown in FIG. 2 is a line drawing of an actual frame image, for simplicity of presentation.

In this case, in step Sa, the CPU 9 executes, for example, the following processing as feature amount map creation processing.
That is, for the frame image data corresponding to the through image 51, the CPU 9 can create multiple types of feature amount maps from the contrasts of multiple types of feature amounts such as color, orientation, and luminance. The series of processes up to the creation of one given type of feature amount map among these types is referred to here as feature amount map creation processing. Detailed examples of each feature amount map creation process will be described later with reference to FIG. 9 and FIG. 10.
For example, in the example of FIG. 2, a feature amount map 52Fc is created as a result of the multi-scale contrast feature amount map creation processing of FIG. 10A described later. A feature amount map 52Fh is created as a result of the Center-Surround color histogram feature amount map creation processing of FIG. 10B described later. A feature amount map 52Fs is created as a result of the color spatial distribution feature amount map creation processing of FIG. 10C.
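As a rough illustration of what a contrast-based feature amount map can look like, the following Python sketch computes a single-scale local luminance contrast: each pixel's value is the sum of squared differences between that pixel and its neighbors. This is only a simplified stand-in under stated assumptions, not the multi-scale processing of FIG. 10A; the function name `local_contrast`, the `radius` parameter, and the sample luminance values are all hypothetical.

```python
from typing import List

Map = List[List[float]]


def local_contrast(lum: Map, radius: int = 1) -> Map:
    """Return a feature amount map of local luminance contrast.

    Hypothetical helper: for every pixel, sum the squared differences
    between the pixel and the neighbors inside a (2*radius+1) window,
    clipped at the image border.
    """
    rows, cols = len(lum), len(lum[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for nr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for nc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    total += (lum[r][c] - lum[nr][nc]) ** 2
            out[r][c] = total
    return out


# Illustrative input: a single bright pixel on a dark background.
luminance = [
    [0.0, 0.0, 0.0],
    [0.0, 10.0, 0.0],
    [0.0, 0.0, 0.0],
]
contrast_map = local_contrast(luminance)
```

The bright pixel receives the largest contrast value, which is the kind of response a later integration step can turn into saliency.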

In step Sb, the CPU 9 obtains a saliency map by integrating the multiple types of feature amount maps. For example, in the example of FIG. 2, the feature amount maps 52Fc, 52Fh, and 52Fs are integrated to obtain a saliency map 53S.
The processing of step Sb corresponds to the processing of step S45 of FIG. 8 described later.
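The integration in step Sb can be sketched as follows. The text does not state here how the maps are combined, so this example assumes a simple scheme: each feature amount map is rescaled to [0, 1] and the maps are averaged with weights. The function names, the equal weights, and the 2x2 sample maps are assumptions made purely for illustration.

```python
from typing import List

Map = List[List[float]]


def normalize(feature_map: Map) -> Map:
    """Rescale a feature amount map linearly to the range [0, 1]."""
    flat = [v for row in feature_map for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A flat map carries no contrast information.
        return [[0.0 for _ in row] for row in feature_map]
    return [[(v - lo) / (hi - lo) for v in row] for row in feature_map]


def integrate(feature_maps: List[Map], weights: List[float]) -> Map:
    """Combine normalized maps into one saliency map (assumed weighted mean)."""
    normed = [normalize(m) for m in feature_maps]
    total = sum(weights)
    rows, cols = len(normed[0]), len(normed[0][0])
    return [
        [sum(w * m[r][c] for w, m in zip(weights, normed)) / total
         for c in range(cols)]
        for r in range(rows)
    ]


# Illustrative stand-ins for the three feature amount maps.
fc = [[0.0, 2.0], [4.0, 8.0]]
fh = [[1.0, 1.0], [1.0, 3.0]]
fs = [[0.0, 0.0], [5.0, 10.0]]
saliency = integrate([fc, fh, fs], [1.0, 1.0, 1.0])
```

Normalizing before summing keeps one map with a large numeric range from dominating the others; other weightings are equally possible.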

In step Sc, the CPU 9 uses the saliency map to estimate, within the through image, an image region that is likely to attract human visual attention (hereinafter referred to as an attention point region). For example, in the example of FIG. 2, an attention point region 54 is estimated within the through image 51 using the saliency map 53S.
The processing of step Sc corresponds to the processing of step S46 of FIG. 8 described later.
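One plausible way to turn a saliency map into attention point regions is to threshold the map and take the bounding box of each connected group of salient pixels. The sketch below assumes exactly that; the threshold value, the 4-connectivity, and the name `estimate_regions` are illustrative assumptions rather than details given in the text.

```python
from collections import deque
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, width, height)


def estimate_regions(saliency: List[List[float]], threshold: float) -> List[Box]:
    """Return bounding boxes of 4-connected areas with saliency >= threshold."""
    rows, cols = len(saliency), len(saliency[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes: List[Box] = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0] or saliency[r0][c0] < threshold:
                continue
            # Breadth-first flood fill over neighboring salient pixels.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            top, bottom, left, right = r0, r0, c0, c0
            while queue:
                r, c = queue.popleft()
                top, bottom = min(top, r), max(bottom, r)
                left, right = min(left, c), max(right, c)
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc]
                            and saliency[nr][nc] >= threshold):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            boxes.append((left, top, right - left + 1, bottom - top + 1))
    return boxes


# Illustrative saliency map with one larger and one single-pixel salient area.
saliency = [
    [0.9, 0.8, 0.0, 0.0],
    [0.7, 0.9, 0.0, 0.6],
    [0.0, 0.0, 0.0, 0.0],
]
regions = estimate_regions(saliency, threshold=0.5)
```

Each returned box is a candidate attention point region that later steps can keep, discard, or order.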

The series of processes from step Sa to step Sc above is hereinafter referred to as attention point region estimation processing. The attention point region estimation processing corresponds to the processing of step S2 of FIG. 6 described later. Details of the attention point region estimation processing will be described later with reference to FIGS. 8 to 10.

Next, in step Sd, the CPU 9 executes, for example, the following processing as face detection/face recognition parameter setting processing based on the attention point region.
That is, the CPU 9 uses the attention point region to set a parameter relating to the face detection/face recognition processing. The parameter to be set is not particularly limited as long as it relates to the face detection/face recognition processing.

For example, in the example of FIG. 2, the parameter to be set is a parameter indicating whether or not a region is a target of the face detection/face recognition processing.
That is, the CPU 9 sets, for example, the region 56 shown in FIG. 2 as a target of the face detection/face recognition processing. This is because the region 56 is a region that contains the part of the attention point region 54 satisfying a predetermined condition, that is, a predetermined surrounding region of such an attention point region. The predetermined condition is not particularly limited; for example, a condition on the size, position, shape, or aspect ratio of the region, or any combination of two or more of these, can be adopted.
The CPU 9 also excludes, for example, the region 55 and the region 57 shown in FIG. 2 from the targets of the face detection/face recognition processing. This is because the region 55 was not estimated as the attention point region 54, and the region 57 is a part of the attention point region 54 that does not satisfy the predetermined condition.
In this way, the CPU 9 can exclude the region 55 and the region 57 from the face detection/face recognition processing. More generally, the CPU 9 can exclude, for example, an attention point region lying within a predetermined range at the outer peripheral edge of the field of view. The CPU 9 can exclude, for example, an attention point region smaller than a predetermined size. Even when its area is large, the CPU 9 can also exclude, for example, an attention point region whose aspect ratio is far from that of a face, or whose shape is elongated or line-like.
As a result, the CPU 9 can concentrate the face detection/face recognition processing on those regions, among the attention point regions, that are especially likely to attract human visual attention. The CPU 9 can also execute the face detection/face recognition processing in a short time by narrowing down the size, processing amount, processing time, and the like for the target region as well.
The processing of step Sd in the example of FIG. 2 corresponds to the processing of steps S4 and S5 of FIG. 6 described later.
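The target/non-target decision described for step Sd can be sketched as a predicate over candidate regions. The concrete thresholds (`min_area`, `margin`, the aspect ratio limits), the rule that a region counts as peripheral when its center falls inside the border band, and the sample coordinates are all assumptions chosen for illustration; the text leaves the predetermined condition open.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Region:
    left: int
    top: int
    width: int
    height: int


def is_target(region: Region, frame_w: int, frame_h: int,
              min_area: int = 400, margin: int = 16,
              min_aspect: float = 0.5, max_aspect: float = 2.0) -> bool:
    """Decide whether a region is kept as a face processing target."""
    # Exclude regions smaller than the assumed minimum size.
    if region.width * region.height < min_area:
        return False
    # Exclude regions whose center lies in the outer border band (assumed rule).
    cx = region.left + region.width / 2
    cy = region.top + region.height / 2
    if not (margin <= cx <= frame_w - margin and margin <= cy <= frame_h - margin):
        return False
    # Exclude elongated or line-like regions far from a face-like aspect ratio.
    aspect = region.width / region.height
    return min_aspect <= aspect <= max_aspect


candidates: List[Region] = [
    Region(100, 60, 80, 100),   # face-sized, central
    Region(0, 100, 30, 30),     # at the frame edge
    Region(150, 200, 10, 10),   # too small
    Region(40, 120, 200, 12),   # line-like
]
targets = [r for r in candidates if is_target(r, frame_w=320, frame_h=240)]
```

Only the first candidate survives, so any detector run afterwards touches a fraction of the frame.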

When the parameter has been set in step Sd in this way, the processing proceeds to step Se. In step Se, the CPU 9 executes the face detection/face recognition processing using the set parameter. As a result, a region containing the detected or recognized face of a person is obtained. Hereinafter, such a region is referred to as a face detection region.
For example, in the example of FIG. 2, the face detection/face recognition processing is applied only to the region 56, which was set as the target region among the regions 55 to 57 of the through image 51. In other words, execution of the face detection/face recognition processing is prohibited for the region 55 and the region 57, which were excluded from the target region. As a result, a face detection region 58 is obtained.
The processing of step Se corresponds to the processing of step S6 of FIG. 6 described later.
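Restricting detection to the target region amounts to cropping the frame, running the detector on the crop, and mapping the results back to frame coordinates. The sketch below assumes a pluggable detector callback; `stand_in_detector` is a deliberately trivial placeholder (it boxes bright pixels) and is not a face detector, and all names and sample values are hypothetical.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, width, height)
Frame = List[List[int]]
Detector = Callable[[Frame], List[Box]]


def crop(frame: Frame, box: Box) -> Frame:
    """Cut the given box out of the frame."""
    left, top, width, height = box
    return [row[left:left + width] for row in frame[top:top + height]]


def detect_in_targets(frame: Frame, targets: List[Box], detector: Detector) -> List[Box]:
    """Run the detector only inside target regions; pixels outside are never visited."""
    found: List[Box] = []
    for target in targets:
        left, top, _, _ = target
        for fx, fy, fw, fh in detector(crop(frame, target)):
            # Translate patch coordinates back into frame coordinates.
            found.append((left + fx, top + fy, fw, fh))
    return found


def stand_in_detector(patch: Frame) -> List[Box]:
    """Placeholder detector: bounding box of pixels brighter than 200."""
    hits = [(x, y) for y, row in enumerate(patch) for x, v in enumerate(row) if v > 200]
    if not hits:
        return []
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return [(min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)]


frame = [[0] * 6 for _ in range(6)]
frame[2][3] = 255
frame[3][4] = 255
frame[0][0] = 255  # lies outside the target region, so it is never examined
detections = detect_in_targets(frame, [(2, 1, 4, 4)], stand_in_detector)
```

The bright pixel at the frame corner is ignored because it is outside the target region, mirroring how excluded regions are skipped.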

FIG. 3 is a diagram explaining the outline of the face detection/face recognition processing and its preprocessing, and shows a specific processing result different from the example of FIG. 2.

Since the processing flow of FIG. 3 itself is basically the same as that of FIG. 2, only the processing results shown in FIG. 3 are described. In the following description, it is assumed that, by through imaging and through display, the through image 61 shown in FIG. 3, for example, is displayed on the liquid crystal display 13. Note that the through image 61 shown in FIG. 3 is a line drawing of an actual frame image, for simplicity of presentation.

In this case, as a result of the processing of step Sa, the following feature amount maps are created in the example of FIG. 3. That is, a feature amount map 62Fc is created as a result of the multi-scale contrast feature amount map creation processing of FIG. 10A described later. A feature amount map 62Fh is created as a result of the Center-Surround color histogram feature amount map creation processing of FIG. 10B described later. A feature amount map 62Fs is created as a result of the color spatial distribution feature amount map creation processing of FIG. 10C. Next, as a result of the processing of step Sb, these feature amount maps 62Fc, 62Fh, and 62Fs are integrated to obtain a saliency map 63S. Then, as a result of the processing of step Sc, attention point regions 64, 65, and 66 are estimated using the saliency map 63S.

In step Sd, face detection / face recognition parameter setting processing based on the attention point regions is executed, and parameters for the face detection / face recognition processing are set. In the example of FIG. 3, the parameters to be set relate to the order of the face detection / face recognition processing, specifically, for example, the execution order or the scanning order.
That is, the CPU 9 sets, for example, the execution order of the face detection / face recognition processing according to the size and position of each attention point region. Specifically, as shown in FIG. 3, the largest attention point region 64 is set as the region processed first. The next largest attention point region 65 is set as the region processed second. The smallest attention point region 66 is set as the region processed third.
Note that the process of step Sd in the example of FIG. 3 corresponds to the process of step S3 of FIG. 6 described later.
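The ordering rule of step Sd in the FIG. 3 example can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Region` class, its fields, and the coordinates are hypothetical, and only region size (area) is used for ordering, whereas the description also allows position to be taken into account.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical rectangular attention point region."""
    label: str
    x: int
    y: int
    width: int
    height: int

    @property
    def area(self) -> int:
        return self.width * self.height


def execution_order(regions):
    """Return the regions sorted largest-first (ties keep their input order)."""
    return sorted(regions, key=lambda r: r.area, reverse=True)


# Illustrative sizes only; the labels follow the reference numerals of FIG. 3.
regions = [
    Region("65", 200, 40, 60, 80),
    Region("66", 300, 90, 30, 40),
    Region("64", 20, 30, 120, 160),
]
ordered = execution_order(regions)
```

With these assumed sizes, region 64 is processed first, then 65, then 66, which matches the order described for FIG. 3.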

ステップＳｅにおいて、設定されたパラメータを用いて、顔検出／顔認識処理が実行される。例えば図３の例では、１番目に、注目点領域６４に対して顔検出／顔認識処理が施され、その結果、顔検出領域７１が得られる。２番目に、注目点領域６５に対して顔検出／顔認識処理が施され、その結果、顔検出領域７２が得られる。３番目に、注目点領域６６に対して顔検出／顔認識処理が施され、その結果、顔検出領域７３が得られる。 In step Se, face detection / face recognition processing is executed using the set parameters. For example, in the example of FIG. 3, first, face detection / face recognition processing is performed on the attention point area 64, and as a result, a face detection area 71 is obtained. Second, face detection / face recognition processing is performed on the attention point area 65, and as a result, a face detection area 72 is obtained. Third, face detection / face recognition processing is performed on the attention point region 66, and as a result, a face detection region 73 is obtained.

FIGS. 4 and 5 are diagrams explaining the outline of the face detection / face recognition processing and its preprocessing, showing examples of specific processing results different from those of FIGS. 2 and 3.

Note that the processing flow of FIGS. 4 and 5 is basically the same as that of FIG. 2, so only the processing results shown in FIGS. 4 and 5 are described here. It is assumed that, by through imaging and through display, the through image 81 shown in FIG. 4 is displayed on the liquid crystal display 13 in the description of the example of FIG. 4, and the through image 111 shown in FIG. 5 is displayed in the description of the example of FIG. 5. For simplicity of presentation, the through image 81 shown in FIG. 4 and the through image 111 shown in FIG. 5 are line drawings of actual frame images.

この場合、ステップＳａの処理の結果、例えば図４及び図５の例では、次のような各特徴量マップが作成される。 In this case, as a result of the process of step Sa, for example, in the example of FIGS. 4 and 5, the following feature amount maps are created.

すなわち、図４の例では、後述する図１０Ａのマルチスケールのコントラストの特徴量マップ作成処理の結果、特徴量マップ８２Ｆｃが作成される。また、後述する図１０ＢのＣｅｎｔｅｒ−Ｓｕｒｒｏｕｎｄの色ヒストグラムの特徴量マップ作成処理の結果、特徴量マップ８２Ｆｈが作成される。また、図１０Ｃの色空間分布の特徴量マップ作成処理の結果、特徴量マップ８２Ｆｓが作成される。 That is, in the example of FIG. 4, a feature map 82Fc is created as a result of the multi-scale contrast feature map creation process of FIG. 10A described later. Also, as a result of the feature amount map creation process of the center-surround color histogram of FIG. 10B described later, a feature amount map 82Fh is created. Further, as a result of the color space distribution feature quantity map creation processing of FIG. 10C, a feature quantity map 82Fs is created.

一方、図５の例では、後述する図１０Ａのマルチスケールのコントラストの特徴量マップ作成処理の結果、特徴量マップ１１２Ｆｃが作成される。また、後述する図１０ＢのＣｅｎｔｅｒ−Ｓｕｒｒｏｕｎｄの色ヒストグラムの特徴量マップ作成処理の結果、特徴量マップ１１２Ｆｈが作成される。また、図１０Ｃの色空間分布の特徴量マップ作成処理の結果、特徴量マップ１１２Ｆｓが作成される。 On the other hand, in the example of FIG. 5, a feature map 112Fc is created as a result of the multi-scale contrast feature map creation process of FIG. 10A described later. Further, as a result of processing for creating a feature map of the center-surround color histogram of FIG. 10B described later, a feature map 112Fh is created. In addition, as a result of the feature map creation processing of the color space distribution in FIG. 10C, the feature map 112Fs is created.

次に、図４の例では、ステップＳｂの処理の結果、特徴量マップ８２Ｆｃ，８２Ｆｈ，８２Ｆｓが統合されて、顕著性マップ８３Ｓが求められる。そして、ステップＳｃの処理の結果、顕著性マップ８３Ｓを用いて注目点領域８４が推定される。 Next, in the example of FIG. 4, as a result of the process of step Sb, the feature amount maps 82Fc, 82Fh, and 82Fs are integrated to obtain the saliency map 83S. Then, as a result of the process of step Sc, the attention point region 84 is estimated using the saliency map 83S.

一方、図５の例では、ステップＳｂの処理の結果、特徴量マップ１１２Ｆｃ，１１２Ｆｈ，１１２Ｆｓが統合されて、顕著性マップ１１３Ｓが求められる。そして、ステップＳｃの処理の結果、顕著性マップ１１３Ｓを用いて複数の注目点領域１１４が推定される。 On the other hand, in the example of FIG. 5, as a result of the processing in step Sb, the feature amount maps 112Fc, 112Fh, and 112Fs are integrated to obtain the saliency map 113S. Then, as a result of the processing in step Sc, a plurality of attention point regions 114 are estimated using the saliency map 113S.

In step Sd, face detection / face recognition parameter setting processing based on the attention point regions is executed, and parameters for the face detection / face recognition processing are set. In the examples of FIGS. 4 and 5, the parameters to be set relate to the face feature data (feature points) used in the face detection / face recognition processing. Specifically, for example, the extraction method, the number of extracted items, and the matching method for the face feature data are the parameters to be set. The CPU 9 therefore sets the various parameters related to the face feature data according to, for example, the size and position of the attention point regions.

Specifically, in the example of FIG. 4, only one attention point region 84 is extracted, which is at or below a certain number (a small number), and the attention point region 84 occupies more than a predetermined size within the field of view. In such a case, the CPU 9 sets the feature data extraction region to a somewhat wide region 85, for example "from above the forehead to below the chin". As the feature data 86 of the main facial parts, the CPU 9 sets, for example, a total of 48 feature points: 9 on the face contour, 2 × 5 on the left and right eyebrows, 2 × 9 on the left and right eyes, 3 on the nose, and 8 on the lips. The CPU 9 also sets, for example, the level of the matching evaluation processing to "somewhat detailed".

On the other hand, in the example of FIG. 5, the number of attention point regions 114 exceeds the certain number (a large number), and the attention point regions 114 are dispersed over a wide area. In addition, each attention point region 114 is smaller than the predetermined size. In such a case, the CPU 9 sets the feature data extraction region to a somewhat narrow region 115, for example "from above the eyebrows to below the lips". As the feature data 116 of the main facial parts, the CPU 9 sets, for example, a total of 24 feature points, fewer than in the example of FIG. 4. Which feature points are omitted is not limited to the example of FIG. 5. The CPU 9 also sets, for example, the level of the matching evaluation processing to "somewhat coarse".
Note that the process of step Sd in the examples of FIGS. 4 and 5 corresponds to the process of step S205 of FIG. 11 described later.
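The parameter choice described for FIGS. 4 and 5 can be sketched as a simple rule. This is an illustration under stated assumptions: `FeatureParams`, `choose_params`, and the thresholds `max_regions` and `min_area_ratio` are hypothetical, since the description only speaks of "a certain number" and "a predetermined size". The description covers only the two cases shown (few and large regions, many and small regions); mapping every other case to the coarse setting is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureParams:
    extraction_area: str
    feature_points: int
    matching_level: str


# FIG. 4 example: 9 contour + 2*5 eyebrows + 2*9 eyes + 3 nose + 8 lips = 48 points.
DETAILED = FeatureParams(
    "above forehead to below chin", 9 + 2 * 5 + 2 * 9 + 3 + 8, "somewhat detailed"
)
# FIG. 5 example: 24 points in total (the per-part breakdown is not fixed).
COARSE = FeatureParams("above eyebrows to below lips", 24, "somewhat coarse")


def choose_params(region_areas, frame_area, max_regions=3, min_area_ratio=0.1):
    """Pick feature-data parameters from the count and size of the regions.

    max_regions and min_area_ratio are assumed thresholds, not values
    taken from the description.
    """
    few = len(region_areas) <= max_regions
    large = all(area / frame_area > min_area_ratio for area in region_areas)
    return DETAILED if few and large else COARSE


single_large = choose_params([50_000], 640 * 480)
many_small = choose_params([2_000] * 6, 640 * 480)
```

Here `single_large` selects the 48-point, somewhat detailed setting, and `many_small` selects the 24-point, somewhat coarse setting.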

In step Se, the face detection / face recognition processing is executed using the set parameters.
For example, in the example of FIG. 4, the feature data 86 is extracted from the region 85 shown in the figure, a somewhat detailed matching process is executed, and as a result the face detection region 87 is obtained.
On the other hand, in the example of FIG. 5, the feature data 116 is extracted from the region 115 shown in the figure, a somewhat coarse matching process is executed, and as a result a plurality of face detection regions 117 are obtained.

The outline of the face detection / face recognition processing and its preprocessing executed by the image processing apparatus 100 has been described above with reference to FIGS. 2 to 5. Next, the entire shooting mode process, including the face detection / face recognition processing and its preprocessing, is described with reference to FIGS. 6 to 11.

FIGS. 6 and 7 are flowcharts showing an example of the flow of the shooting mode process.

撮影モード処理は、ユーザが撮影モードを選択する所定の操作を操作部１４に対して行った場合、その操作を契機として開始される。すなわち、次のような処理が実行される。 When the user performs a predetermined operation for selecting a shooting mode on the operation unit 14, the shooting mode process is started when the operation is performed. That is, the following processing is executed.

ステップＳ１において、ＣＰＵ９は、スルー撮像とスルー表示を行う。 In step S1, the CPU 9 performs through imaging and through display.

ステップＳ２において、ＣＰＵ９は、注目点領域推定処理を実行することで、注目点領域を推定する。注目点領域推定処理については、その概略は図２乃至図５のステップＳａ乃至Ｓｃの説明として上述した通りであり、その詳細は図８を参照して後述する。 In step S 2, the CPU 9 estimates a target point region by executing a target point region estimation process. The outline of the point-of-interest area estimation process has been described above as the description of steps Sa to Sc in FIGS. 2 to 5, and details thereof will be described later with reference to FIG.

In step S3, the CPU 9 sets a predetermined one of the attention point regions estimated in the process of step S2 as the processing target region. Here, the parameters for the face detection / face recognition processing of step S6 described later may include a parameter relating to the order of that processing. In such a case, as described for the example of FIG. 3, the CPU 9 assigns an order to each attention point region according to the size and position of each region estimated in the process of step S2. The processing target region is then set sequentially according to this order.

In step S4, the CPU 9 evaluates the processing target region. In step S5, the CPU 9 determines, based on the evaluation result, whether or not the processing target region satisfies a predetermined condition.
That is, the example of FIG. 6 assumes that the parameters for the face detection / face recognition processing of step S6 described later include a parameter indicating whether or not a region is a target of that processing. The predetermined condition here is not particularly limited, as described above with reference to FIG. 2. For example, a condition on any one of the size, position, shape, and aspect ratio of the region, or on any combination of two or more of them, can be adopted.

When the processing target region does not satisfy the predetermined condition, it is set to be excluded from the targets of the face detection / face recognition processing. In this case, NO is determined in step S5, the face detection / face recognition processing of step S6 is not executed for the processing target region, and the process returns to step S3, from which the subsequent processing is executed. That is, another attention point region is newly set as the processing target in step S3, and the processing from step S4 onward is executed.

On the other hand, when the processing target region satisfies the predetermined condition, it is set as a target of the face detection / face recognition processing. In this case, YES is determined in step S5, and the process proceeds to step S6. In step S6, the CPU 9 performs the face detection / face recognition processing on the processing target region. Details of the face detection / face recognition processing are described later with reference to FIG. 11.

In step S7, the CPU 9 determines whether or not a face detection region exists. When one or more face detection regions have been obtained by the face detection / face recognition processing of step S6, YES is determined in step S7, and the process proceeds to step S8. In step S8, the CPU 9 sets an AF (Automatic Focus) frame on each of the one or more face detection regions, and the process proceeds to step S9. When no face detection region has been obtained by the face detection / face recognition processing of step S6, NO is determined in step S7, step S8 is skipped, and the process proceeds to step S9.

ステップＳ９において、ＣＰＵ９は、顔検出領域の数が閾値以上か否かを判定する。顔検出領域の数が閾値以上の場合、ステップＳ９においてＹＥＳであると判定されて、処理はステップＳ１１に進む。ただし、ステップＳ１１以降の処理は後述する。 In step S9, the CPU 9 determines whether or not the number of face detection areas is equal to or greater than a threshold value. If the number of face detection areas is greater than or equal to the threshold, it is determined as YES in step S9, and the process proceeds to step S11. However, the process after step S11 is mentioned later.

これに対して、顔検出領域の数が閾値未満の場合、ステップＳ９においてＮＯであると判定されて、処理はステップＳ１０に進む。ステップＳ１０において、ＣＰＵ９は、処理対象領域が最後の注目点領域か否かを判定する。処理対象領域が最後の注目点領域でない場合、すなわち、ステップＳ２の処理で推定された各注目点領域の中に処理対象領域に設定されていないものが存在する場合、ステップＳ１０においてＮＯであると判定されて、処理はステップＳ３に戻され、それ以降の処理が繰り返される。すなわち、ステップＳ３の処理で別の注目点領域が処理対象に新たに設定されて、ステップＳ４以降の処理が実行される。 On the other hand, if the number of face detection areas is less than the threshold value, it is determined as NO in step S9, and the process proceeds to step S10. In step S 10, the CPU 9 determines whether the processing target area is the last attention point area. If the processing target area is not the last target point area, that is, if there is an area that has not been set as the processing target area among the target point areas estimated in the process of step S2, NO is determined in step S10. After the determination, the process is returned to step S3, and the subsequent processes are repeated. That is, another attention point area is newly set as a process target in the process of step S3, and the processes after step S4 are executed.

In this way, the attention point regions estimated in step S2 are sequentially set as the processing target region, and the loop of steps S3 to S10 is repeated. When the last attention point region has been set as the processing target and steps S3 to S9 have been executed while the number of face detection regions is still below the threshold, YES is determined in the subsequent step S10, and the process proceeds to step S11.
If the number of face detection regions reaches or exceeds the threshold partway through the repetition of the loop of steps S3 to S10, YES is determined in step S9 as described above, and the process proceeds to step S11.
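The control flow of steps S3 to S10 can be summarized in a short sketch. The function and parameter names are hypothetical: `satisfies_condition` stands in for the evaluation of steps S4 and S5, and `detect_faces` for the face detection / face recognition processing of step S6. Two points are assumptions of this sketch rather than statements of the description: the count compared with the threshold in step S9 is taken to be cumulative over all regions processed so far, and the loop simply ends when no regions remain, including when the last region fails the condition of step S5.

```python
def scan_regions(regions, satisfies_condition, detect_faces, face_threshold):
    """Sketch of the S3-S10 loop; returns the face regions used as AF frames."""
    af_frames = []
    for region in regions:                    # S3: next target (S10: stop after the last)
        if not satisfies_condition(region):   # S4, S5: evaluate the region
            continue                          # S5 NO: skip S6, back to S3
        faces = detect_faces(region)          # S6: face detection / recognition
        if faces:                             # S7: any face detection region?
            af_frames.extend(faces)           # S8: set an AF frame on each face
        if len(af_frames) >= face_threshold:  # S9: enough faces found
            break                             # proceed to S11 (AF processing)
    return af_frames


# Toy usage: regions are (width, height) pairs, already in execution order.
regions = [(120, 160), (60, 80), (30, 40), (10, 10)]
frames = scan_regions(
    regions,
    satisfies_condition=lambda r: r[0] * r[1] >= 1000,
    detect_faces=lambda r: [r],  # pretend every accepted region contains one face
    face_threshold=2,
)
```

In this toy run the loop stops after the second region, because the threshold of two face detection regions is reached before the remaining regions are examined.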

ステップＳ１１において、ＣＰＵ９は、ステップＳ８の処理で設定されたＡＦ枠が被写界深度内に入るように、ＡＦ処理（オートフォーカス処理）を実行する。 In step S11, the CPU 9 executes an AF process (autofocus process) so that the AF frame set in the process of step S8 falls within the depth of field.

In step S12, the CPU 9 determines whether or not the release button is half-pressed.
If the user has not half-pressed the release button, NO is determined in step S12, the process returns to step S1, and the subsequent processing is repeated. That is, the loop of steps S1 to S12 is repeated until the user half-presses the release button.

その後、ユーザがレリーズ釦を半押しすると、ステップＳ１２においてＹＥＳであると判定されて、処理は図７のステップＳ１３に進む。 Thereafter, when the user presses the release button halfway, it is determined as YES in Step S12, and the process proceeds to Step S13 in FIG.

ステップＳ１３において、ＣＰＵ９は、測光処理を実行する。 In step S13, the CPU 9 executes a photometric process.

ステップＳ１４において、ＣＰＵ９は、レリーズ釦が全押しの状態であるか否かを判定する。 In step S14, the CPU 9 determines whether or not the release button is fully pressed.

ユーザがレリーズ釦を全押ししていない場合、ステップＳ１４においてＮＯであると判定され、処理はステップＳ２２に進む。ステップＳ２２において、ＣＰＵ９は、レリーズ釦が解除されたか否かを判定する。ユーザの指などがレリーズ釦から離された場合、ステップＳ２２においてＹＥＳであると判定されて、撮影モード処理は終了となる。これに対して、ユーザの指などがレリーズ釦から離されていない場合、ステップＳ２２においてＮＯであると判定されて、処理はステップＳ１４に戻され、それ以降の処理が繰り返される。すなわち、レリーズ釦の半押し状態が継続している限り、ステップＳ１４ＮＯ，Ｓ２２ＮＯのループ処理が繰り返し実行される。 If the user has not fully pressed the release button, it is determined as NO in Step S14, and the process proceeds to Step S22. In step S22, the CPU 9 determines whether or not the release button has been released. If the user's finger or the like is released from the release button, it is determined as YES in step S22, and the shooting mode process ends. On the other hand, if the user's finger or the like is not released from the release button, it is determined as NO in step S22, the process returns to step S14, and the subsequent processes are repeated. That is, as long as the release button is half-pressed, the loop process of steps S14NO and S22NO is repeated.

その後、ユーザがレリーズ釦を全押しすると、ステップＳ１４においてＹＥＳであると判定されて、処理はステップＳ１５に進む。ステップＳ１５において、ＣＰＵ９は、ＡＷＢ（ＡｕｔｏｍａｔｉｃＷｈｉｔｅＢａｌａｎｃｅ）処理（オートホワイトバランス処理）を実行する。ステップＳ１６において、ＣＰＵ９は、ＡＥ（ＡｕｔｏｍａｔｉｃＥｘｐｏｓｕｒｅ）処理（自動露出処理）を実行する。すなわち、測光センサ１７による測光情報や撮影条件などに基づいて、絞り、露出時間、ストロボ条件などが設定される。 Thereafter, when the user fully presses the release button, it is determined as YES in Step S14, and the process proceeds to Step S15. In step S15, the CPU 9 executes an AWB (Automatic White Balance) process (auto white balance process). In step S16, the CPU 9 executes an AE (Automatic Exposure) process (automatic exposure process). That is, the aperture, exposure time, strobe conditions, and the like are set based on photometric information obtained by the photometric sensor 17 and photographing conditions.

ステップＳ１７において、ＣＰＵ９は、ＴＧ６やＤＳＰ８などを制御して、測光センサ１７による測光情報や撮影条件などに基づいて露出及び撮影処理を実行する。この露出及び撮影処理により、撮影条件などにしたがってＣＭＯＳセンサ４により撮影された被写体像は、フレーム画像データとしてＤＲＡＭ７に記憶される。なお、以下、かかるフレーム画像データを撮影画像データと称し、また、撮影画像データにより表現される画像を撮影画像と称する。 In step S 17, the CPU 9 controls the TG 6, the DSP 8, and the like, and executes exposure and photographing processing based on photometric information and photographing conditions by the photometric sensor 17. By this exposure and photographing processing, the subject image photographed by the CMOS sensor 4 according to the photographing conditions and the like is stored in the DRAM 7 as frame image data. Hereinafter, such frame image data is referred to as captured image data, and an image expressed by the captured image data is referred to as a captured image.

ステップＳ１８において、ＣＰＵ９は、ＤＳＰ８などを制御して、撮影画像データの補正及び変更処理を実行する。すなわち、ＣＰＵ９は、顔検出領域、注目点領域、撮影条件などに基づいて、撮影画像データに対して、補正又は変更に必要な各種画像処理を適宜施す。 In step S18, the CPU 9 controls the DSP 8 and the like, and executes correction and change processing of captured image data. That is, the CPU 9 appropriately performs various image processes necessary for correction or change on the captured image data based on the face detection area, the attention point area, the imaging conditions, and the like.

ステップＳ１９において、ＣＰＵ９は、液晶表示コントローラ１２などを制御して、撮影画像のレビュー表示処理を実行する。また、ステップＳ２０において、ＣＰＵ９は、ＤＳＰ８などを制御して撮影画像データの圧縮符号化処理を実行する。その結果、符号化画像データが得られることになる。そこで、ステップＳ２１において、ＣＰＵ９は、符号化画像データの保存記録処理を実行する。これにより、符号化画像データがメモリカード１５などに記録され、撮影モード処理が終了となる。 In step S19, the CPU 9 controls the liquid crystal display controller 12 and the like to execute a review display process for the photographed image. In step S20, the CPU 9 controls the DSP 8 and the like to execute the compression encoding process of the photographed image data. As a result, encoded image data is obtained. Therefore, in step S21, the CPU 9 executes a process for storing and recording encoded image data. As a result, the encoded image data is recorded in the memory card 15 or the like, and the photographing mode process is completed.

次に、撮影モード処理のうち、ステップＳ２（図２乃至図５のステップＳａ乃至Ｓｃ）の注目点領域処理の詳細例について説明する。 Next, a detailed example of the attention point area process in step S2 (steps Sa to Sc in FIGS. 2 to 5) in the shooting mode process will be described.

As described above, in the attention point region estimation process, a saliency map is created in order to estimate the attention point regions. Accordingly, for example, Treisman's feature integration theory or the saliency map of Itti, Koch, et al. can be applied to the attention point region estimation process.
For Treisman's feature integration theory, see A. M. Treisman and G. Gelade, "A feature-integration theory of attention", Cognitive Psychology, Vol. 12, No. 1, pp. 97-136, 1980.
For the saliency map of Itti, Koch, et al., see L. Itti, C. Koch, and E. Niebur, "A Model of Saliency-Based Visual Attention for Rapid Scene Analysis", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 20, No. 11, November 1998.

FIG. 8 is a flowchart showing a detailed example of the flow of the attention point region estimation process when Treisman's feature integration theory and the saliency map of Itti, Koch, et al. are applied.

ステップＳ４１において、ＣＰＵ９は、スルー撮像により得られたフレーム画像データを、処理対象画像データとして入力する。 In step S41, the CPU 9 inputs frame image data obtained by through imaging as processing target image data.

In step S42, the CPU 9 creates a Gaussian resolution pyramid. Specifically, for example, the CPU 9 takes the processing target image data (the pixel data at position (x, y)) as I(0) = I(x, y), and repeatedly applies Gaussian filtering and downsampling in sequence. As a result, a set of hierarchical scale image data I(L) (for example, L ∈ {0, ..., 8}) is generated. This set of hierarchical scale image data I(L) is called a Gaussian resolution pyramid. Here, for scale L = k (where k is an integer from 1 to 8), the scale image data I(k) represents an image reduced to 1/2^k of the original (k = 0 corresponds to the original image).
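The pyramid construction of step S42 can be sketched in plain Python as follows. This is an illustration only: the 3-tap binomial kernel [1, 2, 1] / 4 is an assumed stand-in for the Gaussian filter (the description does not specify the kernel), edges are handled by replication, and images are represented as lists of rows of numbers. All function names are hypothetical.

```python
def blur_1d(values):
    """Apply the 3-tap binomial kernel [1, 2, 1] / 4 with edge replication."""
    n = len(values)
    return [
        (values[max(i - 1, 0)] + 2 * values[i] + values[min(i + 1, n - 1)]) / 4
        for i in range(n)
    ]


def blur(image):
    """Separable blur: horizontal pass, then vertical pass."""
    rows = [blur_1d(row) for row in image]
    cols = [blur_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to rows


def downsample(image):
    """Keep every second pixel in both directions (halves width and height)."""
    return [row[::2] for row in image[::2]]


def gaussian_pyramid(image, levels):
    """Return [I(0), I(1), ...]; I(k) is the input reduced to about 1/2**k."""
    pyramid = [image]
    for _ in range(levels):
        current = pyramid[-1]
        if len(current) < 2 or len(current[0]) < 2:
            break  # too small to reduce further
        pyramid.append(downsample(blur(current)))
    return pyramid


flat_image = [[1.0] * 16 for _ in range(16)]
pyramid = gaussian_pyramid(flat_image, levels=3)
sizes = [(len(level), len(level[0])) for level in pyramid]
```

For a 16 × 16 input and three reduction steps, `sizes` is 16 × 16, 8 × 8, 4 × 4, and 2 × 2. A full implementation with L ∈ {0, ..., 8} would need an input of at least 256 pixels per side.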

ステップＳ４３において、ＣＰＵ９は、各特徴量マップ作成処理を開始する。すなわち、ＣＰＵ９は、処理対象画像データについて、例えば色、方位、輝度などの複数種類の特徴量のコントラストから、複数種類の特徴量マップを作成することができる。このような複数種類のうち所定の１種類の特徴量マップを作成するまでの一連の処理が、ここでは、特徴量マップ作成処理と称されている。各特徴量マップ作成処理の詳細例については、図９や図１０を参照して後述する。 In step S43, the CPU 9 starts each feature amount map creation process. That is, the CPU 9 can create a plurality of types of feature amount maps from the contrast of a plurality of types of feature amounts such as color, azimuth, and luminance for the processing target image data. A series of processing until a predetermined one type of feature amount map among the plurality of types is created is referred to herein as feature amount map creation processing. A detailed example of each feature amount map creation process will be described later with reference to FIG. 9 and FIG.

ステップＳ４４において、ＣＰＵ９は、全ての特徴量マップ作成処理が終了したか否かを判定する。各特徴量マップ作成処理のうち１つでも処理が終了していない場合、ステップＳ４４において、ＮＯであると判定されて、処理はステップＳ４４に再び戻される。すなわち、各特徴量マップ作成処理の全処理が終了するまでの間、ステップＳ４４の判定処理が繰り返し実行される。そして、各特徴量マップ作成処理の全処理が終了して、全ての特徴量マップが作成されると、ステップＳ４４においてＹＥＳであると判定されて、処理はステップＳ４５に進む。 In step S44, the CPU 9 determines whether or not all the feature map creation processing has been completed. If even one of the feature map creation processes has not been completed, it is determined as NO in step S44, and the process returns to step S44 again. That is, the determination process in step S44 is repeatedly executed until all the process of each feature amount map creation process is completed. When all the feature value map creation processes are completed and all feature value maps are created, it is determined as YES in Step S44, and the process proceeds to Step S45.

ステップＳ４５において、ＣＰＵ９は、各特徴量マップを線形和で結合して、顕著性マップＳ（ＳａｌｉｅｎｃｙＭａｐ）を求める。 In step S45, the CPU 9 obtains a saliency map S (Saliency Map) by combining the feature amount maps with a linear sum.
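The linear combination of step S45 can be sketched as a weighted per-pixel sum. The function name and the default of equal weights are assumptions; the description states only that the feature maps are combined by a linear sum. The maps are assumed to have identical dimensions.

```python
def combine_maps(maps, weights=None):
    """Return the per-pixel weighted sum of equally sized feature maps."""
    if weights is None:
        weights = [1.0 / len(maps)] * len(maps)  # assumed default: equal weights
    height, width = len(maps[0]), len(maps[0][0])
    return [
        [sum(w * m[y][x] for w, m in zip(weights, maps)) for x in range(width)]
        for y in range(height)
    ]


# Three toy 1 x 2 feature maps combined with illustrative weights.
saliency = combine_maps(
    [[[1.0, 0.0]], [[0.0, 1.0]], [[1.0, 1.0]]],
    weights=[0.5, 0.25, 0.25],
)
```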

ステップＳ４６において、ＣＰＵ９は、顕著性マップＳを用いて、処理対象画像データから注目領域を推定する。すなわち、一般に、主要被写体となる人物や撮影対象（ｏｂｊｅｃｔｓ）となる物体の多くは、背景（ｂａｃｋｇｒｏｕｎｄ）領域に比べ、顕著性（ｓａｌｉｅｎｃｙ）が高いと考えられる。そこで、ＣＰＵ９は、顕著性マップＳを用いて、処理対象画像データから顕著性（ｓａｌｉｅｎｃｙ）が高い領域を認識する。そして、ＣＰＵ９は、その認識結果に基づいて、人間の視覚的注意を引く可能性の高い領域、すなわち、注目点領域を推定する。このようにして、注目点領域が推定されると、注目点領域推定処理は終了する。すなわち、図６のステップＳ２の処理は終了し、処理はステップＳ３に進む。図２乃至図５の例でいえば、ステップＳａ乃至Ｓｃの一連の処理は終了し、処理はステップＳｄに進む。 In step S 46, the CPU 9 estimates a region of interest from the processing target image data using the saliency map S. That is, generally, a person who is a main subject and an object which is a subject to be photographed (objects) are considered to have higher saliency than a background area. Therefore, the CPU 9 recognizes a region having high saliency from the processing target image data using the saliency map S. Then, based on the recognition result, the CPU 9 estimates an area that is likely to attract human visual attention, that is, an attention point area. When the attention point region is estimated in this way, the attention point region estimation process ends. That is, the process of step S2 in FIG. 6 ends, and the process proceeds to step S3. In the example of FIGS. 2 to 5, the series of steps Sa to Sc ends, and the process proceeds to step Sd.
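One way to turn a saliency map into candidate attention point regions is to threshold it and take the bounding box of each connected group of salient pixels. The description does not state how step S46 extracts regions, so the thresholding, the 4-connectivity, and the function name below are all assumptions of this sketch.

```python
from collections import deque


def salient_regions(saliency, threshold):
    """Return bounding boxes (x0, y0, x1, y1) of 4-connected pixel groups
    whose saliency is at or above the threshold."""
    height, width = len(saliency), len(saliency[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if seen[y][x] or saliency[y][x] < threshold:
                continue
            # Breadth-first flood fill from this unvisited salient pixel.
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and not seen[ny][nx] and saliency[ny][nx] >= threshold:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


toy_map = [
    [0.9, 0.8, 0.0, 0.0],
    [0.7, 0.0, 0.0, 0.6],
    [0.0, 0.0, 0.0, 0.9],
]
boxes = salient_regions(toy_map, threshold=0.5)
```

The resulting boxes could then be ordered or filtered by size and position, as in step Sd.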

次に、各特徴量マップ作成処理の具体例について説明する。 Next, a specific example of each feature amount map creation process will be described.

図９は、輝度、色、及び、方向性の特徴量マップ作成処理の流れの一例を示すフローチャートである。 FIG. 9 is a flowchart illustrating an example of a flow of a feature amount map creation process for luminance, color, and directionality.

図９Ａは、輝度の特徴量マップ作成処理の一例を示している。 FIG. 9A shows an example of a luminance feature amount map creation process.

In step S61, the CPU 9 sets each pixel of interest from each scale image corresponding to the processing target image data. The following description assumes, for example, that pixels of interest c ∈ {2, 3, 4} are set. A pixel of interest c ∈ {2, 3, 4} is a pixel set as a calculation target on the scale image data I(c) of scale c ∈ {2, 3, 4}.

In step S62, the CPU 9 obtains the luminance component of each scale image for each pixel of interest c ∈ {2, 3, 4}.

In step S63, the CPU 9 obtains the luminance component of each scale image for the surrounding pixels s = c + δ of each pixel of interest. With, for example, δ ∈ {3, 4}, the surrounding pixels s = c + δ of a pixel of interest are the pixels located around the pixel of interest (the corresponding point) on the scale image I(s) of scale s = c + δ.

In step S64, the CPU 9 obtains, for each scale image, the luminance contrast at each pixel of interest c ∈ {2, 3, 4}. For example, the CPU 9 obtains the inter-scale difference between each pixel of interest c ∈ {2, 3, 4} and its surrounding pixels s = c + δ (for example, δ ∈ {3, 4}). If the pixel of interest c is called the Center and the surrounding pixels s are called the Surround, the resulting difference can be called the luminance Center-Surround inter-scale difference. This difference takes a large value when the pixel of interest c is white and the surrounding pixels s are black, or vice versa. The luminance Center-Surround inter-scale difference therefore represents luminance contrast. Hereinafter, this luminance contrast is written as I(c, s).

ステップＳ６５において、ＣＰＵ９は、処理対象画像データに対応する各スケール画像において、注目画素に設定されていない画素が存在するか否かを判定する。そのような画素が存在する場合、ステップＳ６５においてＹＥＳであると判定されて、処理はステップＳ６１に戻され、それ以降の処理が繰り返される。 In step S65, the CPU 9 determines whether or not there is a pixel that is not set as the target pixel in each scale image corresponding to the processing target image data. If such a pixel exists, it is determined as YES in step S65, the process returns to step S61, and the subsequent processes are repeated.

That is, steps S61 to S65 are applied to each pixel of each scale image corresponding to the processing target image data, and the luminance contrast I(c, s) of each pixel is obtained. When pixels of interest c ∈ {2, 3, 4} and surrounding pixels s = c + δ (for example, δ ∈ {3, 4}) are set, one pass of steps S61 to S65 yields (3 choices of the pixel of interest c) × (2 choices of the surrounding pixels s) = 6 luminance contrasts I(c, s). The collection over the entire image of the luminance contrast I(c, s) obtained for a given c and a given s is hereinafter called a feature map of luminance contrast I. Repeating the loop of steps S61 to S65 yields six such feature maps. When the six feature maps of luminance contrast I have been obtained, NO is determined in step S65, and the process proceeds to step S66.
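The six (c, s) combinations and the Center-Surround inter-scale difference can be sketched as follows. This is a minimal illustration with hypothetical function names. Two choices are assumptions: the coarser surround image is brought to the size of the center image by nearest-neighbour resizing, and the contrast is the absolute per-pixel difference, which is consistent with the difference being large for white-on-black and black-on-white alike.

```python
def center_surround_pairs(centers=(2, 3, 4), deltas=(3, 4)):
    """Enumerate the (c, s) scale pairs with s = c + delta."""
    return [(c, c + d) for c in centers for d in deltas]


def upsample_to(image, height, width):
    """Nearest-neighbour resize of a coarse image to (height, width)."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def contrast_map(pyramid, c, s):
    """Absolute per-pixel difference between scale c and resized scale s."""
    center = pyramid[c]
    surround = upsample_to(pyramid[s], len(center), len(center[0]))
    return [
        [abs(a - b) for a, b in zip(row_c, row_s)]
        for row_c, row_s in zip(center, surround)
    ]


pairs = center_surround_pairs()

# Toy two-level pyramid, used only to show the difference computation.
toy_pyramid = [[[1.0, 0.0], [0.0, 1.0]], [[0.5]]]
toy_contrast = contrast_map(toy_pyramid, 0, 1)
```

`pairs` contains the six combinations (2, 5), (2, 6), (3, 6), (3, 7), (4, 7), and (4, 8), one per feature map of luminance contrast I.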

In step S66, the CPU 9 creates the luminance feature amount map by normalizing the feature amount maps of the luminance contrast I and then combining them. This ends the luminance feature amount map creation process. Hereinafter, the luminance feature amount map is written as FI to distinguish it from the other feature amount maps.
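As a rough sketch of the normalize-and-combine step, the following assumes min-max normalization to [0, 1], nearest-neighbour resizing to a common resolution, and pixel-wise summation. The description does not specify the normalization or the combination rule, so all three are assumptions made for illustration, and `normalize`, `resize`, and `combine_maps` are hypothetical names.

```python
def normalize(fmap):
    """Min-max normalize a 2-D map to [0, 1]; a flat map becomes all zeros."""
    lo = min(min(row) for row in fmap)
    hi = max(max(row) for row in fmap)
    if hi == lo:
        return [[0.0] * len(fmap[0]) for _ in fmap]
    return [[(v - lo) / (hi - lo) for v in row] for row in fmap]


def resize(fmap, height, width):
    """Nearest-neighbour resize so that maps from different scales can be added."""
    mh, mw = len(fmap), len(fmap[0])
    return [[fmap[y * mh // height][x * mw // width] for x in range(width)]
            for y in range(height)]


def combine_maps(maps, height, width):
    """Normalize each contrast map, bring it to a common size, and sum pixel-wise."""
    total = [[0.0] * width for _ in range(height)]
    for fmap in maps:
        resized = resize(normalize(fmap), height, width)
        for y in range(height):
            for x in range(width):
                total[y][x] += resized[y][x]
    return total
```

Under these assumptions, a map with no contrast contributes nothing to the combined map, so the result is driven by the maps that contain distinct peaks.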

FIG. 9B shows an example of the color feature amount map creation process.

The color feature amount map creation process of FIG. 9B follows basically the same flow as the luminance feature amount map creation process of FIG. 9A; only the processing target differs. That is, steps S81 to S86 of FIG. 9B correspond to steps S61 to S66 of FIG. 9A, respectively, and differ from them only in the processing target of each step. The flow of the process of FIG. 9B is therefore not described again, and only the processing target is briefly described below.

That is, whereas steps S62 and S63 of FIG. 9A process the luminance component, steps S82 and S83 of FIG. 9B process the color components.
In step S64 of FIG. 9A, the Center-Surround inter-scale difference of luminance is obtained as the luminance contrast I(c, s). In step S84 of FIG. 9B, by contrast, the Center-Surround inter-scale difference of each hue (R/G, B/Y) is obtained as the hue contrast. Among the color components, R denotes red, G denotes green, B denotes blue, and Y denotes yellow. Hereinafter, the hue contrast for the hue R/G is written as RG(c, s), and the hue contrast for the hue B/Y is written as BY(c, s).
As in the example above, assume three choices of the target pixel c and two choices of the peripheral pixel s. The loop of steps S61 to S65 of FIG. 9A then yields six feature amount maps of the luminance contrast I, whereas the loop of steps S81 to S85 of FIG. 9B yields six feature amount maps of the hue contrast RG and six feature amount maps of the hue contrast BY.
Finally, whereas step S66 of FIG. 9A yields the luminance feature amount map FI, step S86 of FIG. 9B yields the color feature amount map. Hereinafter, the color feature amount map is written as FC to distinguish it from the other feature amount maps.

FIG. 9C shows an example of the directional feature amount map creation process.

The directional feature amount map creation process of FIG. 9C follows basically the same flow as the luminance feature amount map creation process of FIG. 9A; only the processing target differs. That is, steps S101 to S106 of FIG. 9C correspond to steps S61 to S66 of FIG. 9A, respectively, and differ from them only in the processing target of each step. The flow of the process of FIG. 9C is therefore not described again, and only the processing target is briefly described below.

That is, the processing target of steps S102 and S103 is the direction component. Here, the direction component is the amplitude component in each direction obtained by convolving the luminance component with a Gaussian filter φ. The direction is the one indicated by the rotation angle θ, which is a parameter of the Gaussian filter φ. For example, the four directions 0°, 45°, 90°, and 135° can be adopted as the rotation angle θ.
In step S104, the Center-Surround inter-scale difference of directionality is obtained as the directional contrast. Hereinafter, the directional contrast is written as O(c, s, θ).
As in the example above, assume three choices of the target pixel c and two choices of the peripheral pixel s. The loop of steps S101 to S105 then yields six feature amount maps of the directional contrast O for each rotation angle θ. For example, when the four directions 0°, 45°, 90°, and 135° are adopted as the rotation angle θ, 24 (= 6 × 4) feature amount maps of the directional contrast O are obtained.
Finally, step S106 yields the directional feature amount map. Hereinafter, the directional feature amount map is written as FO to distinguish it from the other feature amount maps.
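The map counts quoted for FIG. 9 (six maps for the luminance contrast I, six each for the hue contrasts RG and BY, and 24 for the directional contrast O) follow directly from enumerating the combinations of c, δ, and θ. A minimal sketch of that bookkeeping, in which the helper name `contrast_map_keys` is hypothetical:

```python
from itertools import product

CENTERS = (2, 3, 4)             # choices of the target pixel c
DELTAS = (3, 4)                 # choices of delta, with s = c + delta
ANGLES_DEG = (0, 45, 90, 135)   # rotation angles theta used for the direction component


def contrast_map_keys(centers, deltas, angles=(None,)):
    """List one (c, s, theta) key per contrast map; theta is None for I, RG, and BY."""
    return [(c, c + d, theta) for c, d, theta in product(centers, deltas, angles)]


luminance_keys = contrast_map_keys(CENTERS, DELTAS)                 # 3 x 2 = 6 maps
orientation_keys = contrast_map_keys(CENTERS, DELTAS, ANGLES_DEG)   # 6 x 4 = 24 maps
```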

For further details of the feature amount map creation process of FIG. 9 described above, see, for example, L. Itti, C. Koch, and E. Niebur, "A Model of Saliency-Based Visual Attention for Rapid Scene Analysis", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 20, No. 11, November 1998.

Note that the feature amount map creation process is not limited to the example of FIG. 9. For example, a process that creates a feature amount map for each of the feature amounts of lightness, saturation, hue, and motion can also be adopted.

As another example, a process that creates a feature amount map for each of the feature amounts of multi-scale contrast, the Center-Surround color histogram, and the color space distribution can also be adopted.

FIG. 10 is a flowchart showing an example of the feature amount map creation processes for multi-scale contrast, the Center-Surround color histogram, and the color space distribution.

FIG. 10A shows an example of the multi-scale contrast feature amount map creation process.
In step S121, the CPU 9 obtains the feature amount map of multi-scale contrast. This ends the multi-scale contrast feature amount map creation process.
Hereinafter, the feature amount map of multi-scale contrast is written as Fc to distinguish it from the other feature amount maps.

FIG. 10B shows an example of the feature amount map creation process for the Center-Surround color histogram.

In step S141, the CPU 9 obtains, for each of several different aspect ratios, the color histogram of a rectangular region and the color histogram of its surrounding contour. The aspect ratios themselves are not particularly limited; for example, {0.5, 0.75, 1.0, 1.5, 2.0} can be adopted.

In step S142, the CPU 9 obtains, for each aspect ratio, the chi-square distance between the color histogram of the rectangular region and the color histogram of the surrounding contour. In step S143, the CPU 9 obtains the color histogram of the rectangular region for which the chi-square distance is largest.
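Steps S142 and S143 can be illustrated with the following sketch. It assumes the common definition of the chi-square distance between two normalized histograms, χ²(a, b) = ½ Σ (aᵢ − bᵢ)² / (aᵢ + bᵢ), which the description itself does not spell out. The names `chi_square_distance` and `most_distinct_rectangle` and the `candidates` layout are hypothetical.

```python
def chi_square_distance(hist_a, hist_b):
    """Chi-square distance between two histograms with the same number of bins."""
    total = 0.0
    for a, b in zip(hist_a, hist_b):
        if a + b > 0:  # empty bins contribute nothing and would divide by zero
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total


def most_distinct_rectangle(candidates):
    """Return the aspect ratio whose rectangle differs most from its surround.

    candidates maps an aspect ratio to a pair
    (rectangle histogram, surrounding-contour histogram).
    """
    return max(candidates, key=lambda ratio: chi_square_distance(*candidates[ratio]))
```

For example, if the rectangle and its surround have identical histograms at one aspect ratio and completely disjoint histograms at another, the latter is selected.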

In step S144, the CPU 9 creates the feature amount map of the Center-Surround color histogram using the color histogram of the rectangular region for which the chi-square distance is largest. This ends the feature amount map creation process for the Center-Surround color histogram.
Hereinafter, the feature amount map of the Center-Surround color histogram is written as Fh to distinguish it from the other feature amount maps.

FIG. 10C shows an example of the feature amount map creation process for the color space distribution.

In step S161, the CPU 9 calculates the horizontal variance of the color space distribution. In step S162, the CPU 9 calculates the vertical variance of the color space distribution. In step S163, the CPU 9 obtains the spatial variance of the color from the horizontal variance and the vertical variance.
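A minimal sketch of steps S161 to S163 follows. It assumes that a color is represented by a 2-D weight map giving how strongly each pixel belongs to that color, and that the spatial variance is the sum of the horizontal and vertical variances. Neither detail is stated in the description, so both are illustrative assumptions, as is the name `spatial_variance`.

```python
def spatial_variance(weights):
    """Return (horizontal variance, vertical variance, their sum) of a weight map.

    weights[y][x] is the (assumed) degree to which pixel (x, y) belongs to one color.
    """
    total = sum(sum(row) for row in weights)
    if total == 0:
        return 0.0, 0.0, 0.0
    # Weighted mean position of the color.
    mean_x = sum(w * x for row in weights for x, w in enumerate(row)) / total
    mean_y = sum(w * y for y, row in enumerate(weights) for w in row) / total
    # Step S161: horizontal variance. Step S162: vertical variance.
    var_x = sum(w * (x - mean_x) ** 2
                for row in weights for x, w in enumerate(row)) / total
    var_y = sum(w * (y - mean_y) ** 2
                for y, row in enumerate(weights) for w in row) / total
    # Step S163: combine the two into a single spatial variance (assumed: sum).
    return var_x, var_y, var_x + var_y
```

A color concentrated in one place has a small spatial variance, while a color scattered across the image has a large one.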

In step S164, the CPU 9 creates the feature amount map of the color space distribution using the spatial variance of the color. This ends the feature amount map creation process for the color space distribution.
Hereinafter, the feature amount map of the color space distribution is written as Fs to distinguish it from the other feature amount maps.

For further details of the feature amount map creation process of FIG. 10 described above, see, for example, T. Liu, J. Sun, N. Zheng, X. Tang, and H.-Y. Shum, "Learning to Detect A Salient Object", CVPR 2007, pp. 1-8, 2007.

Next, a detailed example of the face detection / face recognition process of step S6 in the shooting mode process of FIG. 6 is described.

FIG. 11 is a flowchart showing a detailed example of the flow of the face detection / face recognition process.

In step S201, the CPU 9 sets the face detection dictionary based on face detection dictionary information. The face detection dictionary information is assumed to be stored in advance in, for example, the ROM 11 of FIG. 1.

In step S202, the CPU 9 determines whether individual face recognition is to be performed. If only face detection is to be performed, the determination in step S202 is NO and the process proceeds to step S205. The processing from step S205 onward is described later.

If, on the other hand, individual face recognition is also to be performed, the determination in step S202 is YES and the process proceeds to step S203. In step S203, the CPU 9 determines whether a face to be individually recognized has been registered. If no such face has been registered, the determination in step S203 is NO and the process proceeds to step S205. The processing from step S205 onward is described later.

If, on the other hand, a face to be individually recognized has been registered, the determination in step S203 is YES and the process proceeds to step S204. In step S204, the CPU 9 sets the face recognition dictionary based on dictionary information for recognizing individual faces. This dictionary information is assumed to be stored in advance in, for example, the ROM 11 of FIG. 1.

When the processing of step S204 has been executed, the process proceeds to step S205. As described above, the process also proceeds to step S205 when the determination in step S202 or in step S203 is NO. In step S205, the CPU 9 sets the parameters for face detection / face recognition based on the attention point regions. The attention point regions referred to here include not only the attention point region set as the processing target region in step S3 of FIG. 6 but also, as necessary, each attention point region estimated in step S2. For example, the parameters relating to the facial feature data described above with reference to FIGS. 4 and 5, specifically the extraction method, the number of items extracted, the collation method, and the like, are set in step S205.

In step S206, the CPU 9 inputs frame image data obtained by through imaging as the processing target image data.

In step S207, the CPU 9 extracts a face detection region from the frame image corresponding to the processing target image data. In step S208, the CPU 9 extracts facial feature data from at least part of the face detection region. The parameters governing this extraction are the parameters described with reference to FIGS. 4 and 5, which are set in step S205 as described above.

In step S209, the CPU 9 determines whether individual face recognition is to be performed. If individual face recognition is also to be performed, the determination in step S209 is YES and the process proceeds to step S210. In step S210, the CPU 9 collates the facial feature data with the face recognition dictionary and thereby obtains a collation evaluation value.

If, on the other hand, only face detection is to be performed, the determination in step S209 is NO and the process proceeds to step S211. In step S211, the CPU 9 collates the facial feature data with the face detection dictionary and thereby obtains a collation evaluation value.

When the collation evaluation value has been obtained in step S210 or S211, the process proceeds to step S212. In step S212, the CPU 9 determines whether the collation evaluation value is greater than or equal to a collation threshold.

If the collation evaluation value is less than the collation threshold, the determination in step S212 is NO and the process proceeds to step S213. In step S213, the CPU 9 determines whether a predetermined error condition is satisfied. If the error condition is not satisfied, the determination in step S213 is NO, the process returns to step S206, and the subsequent processing is repeated. If the error condition is satisfied, the determination in step S213 is YES and the face detection / face recognition process ends. In that case, the face detection / face recognition process of step S6 of FIG. 6 is regarded as having obtained no face detection region, the determination in step S7 is NO, and the process proceeds to step S9.

If, on the other hand, the collation evaluation value is greater than or equal to the collation threshold, the determination in step S212 is YES and the process proceeds to step S214. In step S214, the CPU 9 finalizes the collation result. That is, information on the face detection region, such as the face identification ID, the face position, and the collation evaluation value, is finalized and stored in the DRAM 7, the RAM 10, or the like. This ends the face detection / face recognition process. In this case, the face detection / face recognition process of step S6 of FIG. 6 is regarded as having obtained a face detection region, the determination in step S7 is YES, and the process proceeds to step S8.
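The loop of steps S206 to S214 can be summarized by the sketch below. The description does not define the predetermined error condition of step S213, so this sketch assumes a simple limit on the number of failed attempts. `MatchResult`, `detect_or_recognize`, and the `match` callback (standing in for steps S207 to S211) are hypothetical names introduced only for illustration.

```python
from collections import namedtuple

# Information finalized in step S214 (identification ID, position, evaluation value).
MatchResult = namedtuple("MatchResult", "face_id position score")


def detect_or_recognize(frames, match, threshold, max_attempts):
    """Return a finalized MatchResult, or None if the error condition is met.

    frames: iterable of through-image frames (step S206).
    match: callable that extracts features from a frame and collates them with the
           active dictionary, returning a MatchResult or None (steps S207 to S211).
    max_attempts: assumed error condition of step S213.
    """
    attempts = 0
    for frame in frames:
        result = match(frame)
        if result is not None and result.score >= threshold:  # step S212: YES
            return result                                     # step S214
        attempts += 1
        if attempts >= max_attempts:                          # step S213: YES
            return None
    return None  # frames exhausted without a match
```

A return value of None corresponds to the case in which step S7 of FIG. 6 is determined as NO; any other return value corresponds to YES.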

As described above, the CPU 9 of the image processing apparatus 100 according to the present embodiment has a function of estimating attention point regions in an input image (through image) that includes a main subject, using a saliency map based on a plurality of feature amounts extracted from the input image. The CPU 9 has a function of setting parameters relating to the face detection / face recognition process using the estimated attention point regions. The CPU 9 also has a function of executing the face detection / face recognition process using the parameters thus set.

In this way, the CPU 9 can execute the attention point region estimation process and the face detection / face recognition process in series or in parallel. The CPU 9 can thus set the parameters appropriately using the attention point regions, which improves the speed and accuracy of the face detection / face recognition process.

For example, the CPU 9 can execute the face detection / face recognition process only on those attention point regions that satisfy a predetermined condition. The CPU 9 can thereby narrow down the face search, the feature data to be extracted, the collation range, and the like according to the attention point regions, and can exclude from the processing targets any attention point region judged not to require processing. As a result, the speed and accuracy of the face detection / face recognition process can be improved.
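A hedged sketch of this kind of selection follows. The description leaves the predetermined condition open, so the minimum area, the aspect ratio range, and the ordering by saliency used here are purely illustrative assumptions, and `Region` and `select_regions` are hypothetical names.

```python
from collections import namedtuple

# An attention point region: bounding box plus an (assumed) saliency score.
Region = namedtuple("Region", "x y width height saliency")


def select_regions(regions, min_area=400, aspect_range=(0.5, 2.0)):
    """Keep regions that satisfy the assumed condition, most salient first."""
    selected = []
    for region in regions:
        if region.width <= 0 or region.height <= 0:
            continue  # degenerate region: nothing to process
        aspect = region.width / region.height
        large_enough = region.width * region.height >= min_area
        face_like_shape = aspect_range[0] <= aspect <= aspect_range[1]
        if large_enough and face_like_shape:
            selected.append(region)
    # Processing order: regions with higher saliency are handled first.
    return sorted(selected, key=lambda region: region.saliency, reverse=True)
```

Regions rejected here are simply never passed to the face detection / face recognition process, which is where the saving in processing time would come from.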

As another example, the CPU 9 can appropriately adjust the parameters of each process within the face detection / face recognition process (the face detection process, the feature data extraction process, the collation evaluation process, and the like) according to the size, position, shape, and the like of the attention point region. For example, the CPU 9 can make settings such that the face detection / face recognition process is executed in more detail for an attention point region that is large or that is likely to contain a face. Conversely, the CPU 9 can make settings such that the face detection / face recognition process is executed more coarsely for an attention point region that is small or that is unlikely to contain a face. As a result, the face detection / face recognition process is performed efficiently and the overall processing time is shortened. That is, the overall speed and accuracy of the face detection / face recognition process can be improved.
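The fine / coarse adjustment can be sketched as below. The thresholds, the parameter names (`scan_step`, `num_feature_points`), and the rule that a region is processed finely when it is either large or likely to contain a face are all assumptions chosen for illustration; the description does not fix any of them.

```python
def detection_params(region_area, face_likelihood,
                     area_threshold=10000, likelihood_threshold=0.5):
    """Choose fine or coarse processing parameters for one attention point region."""
    if region_area >= area_threshold or face_likelihood >= likelihood_threshold:
        # Large or promising region: search densely and extract more feature points.
        return {"mode": "fine", "scan_step": 1, "num_feature_points": 64}
    # Small and unpromising region: search sparsely with fewer feature points.
    return {"mode": "coarse", "scan_step": 4, "num_feature_points": 16}
```

For instance, `detection_params(20000, 0.1)` selects the fine settings because of the region size, whereas `detection_params(100, 0.1)` selects the coarse settings.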

As described above, any parameter relating to the face detection / face recognition process can serve as a parameter to be set, and a wide variety of parameters can be adopted. Some adoptable parameters are listed below, although they may overlap with the examples already given. For example, a parameter indicating whether a region is a target of the face detection / face recognition process can be adopted. For example, a parameter relating to the size or target range of the face region detected or recognized by the face detection / face recognition process, or to the size or resolution of the region from which feature points (feature data) are extracted from the face, or a combination of two or more of these, can be adopted. For example, a parameter relating to the accuracy of the face detection / face recognition process can be adopted. For example, a parameter relating to the order of the face detection / face recognition process can be adopted.

The present invention is not limited to the embodiment described above; modifications, improvements, and the like within a scope in which the object of the present invention can be achieved are included in the present invention.
For example, in the embodiment described above, the image processing apparatus to which the present invention is applied is configured as a digital camera. However, the present invention is not limited to digital cameras and can be applied to electronic devices in general. Specifically, the present invention is applicable to, for example, video cameras, portable navigation devices, and portable game machines.

As another example, the present invention can be applied widely not only to the face detection / face recognition process but also to any detection process that detects a main subject from an input image. In this case, the CPU 9 has a function of setting parameters relating to the detection process using the estimated attention point regions. Thus, for example, when an attention point region has a predetermined size, a predetermined contour shape, or an aspect ratio within a predetermined range, the CPU 9 can prohibit the detection of a person's face and instead perform other processing, such as detecting (identifying) an object other than a person's face.

The series of processes described above can be executed by hardware or by software.

When the series of processes is executed by software, the program constituting the software is installed on a computer or the like from a network or a recording medium. The computer may be one incorporated in dedicated hardware. Alternatively, the computer may be one that can execute various functions when various programs are installed on it, such as a general-purpose personal computer.
Although not shown, the recording medium containing such a program may be a removable medium distributed separately from the apparatus main body in order to provide the program to the user, or a recording medium provided to the user in a state of being incorporated in the apparatus main body in advance. The removable medium is, for example, a magnetic disk (including a floppy disk), an optical disk, or a magneto-optical disk. The optical disk is, for example, a CD-ROM (Compact Disk-Read Only Memory) or a DVD (Digital Versatile Disk). The magneto-optical disk is, for example, an MD (Mini-Disk). The recording medium provided to the user in a state of being incorporated in the apparatus main body in advance is, for example, the ROM 11 of FIG. 1 in which the program is recorded, or a hard disk (not shown).

In this specification, the steps describing the program recorded on the recording medium include not only processing performed in time series in the stated order but also processing that is executed in parallel or individually and not necessarily in time series.

DESCRIPTION OF SYMBOLS: 100 ... image processing apparatus, 1 ... optical lens device, 2 ... shutter device, 3 ... actuator, 4 ... CMOS sensor, 5 ... AFE, 6 ... TG, 7 ... DRAM, 8 ... DSP, 9 ... CPU, 10 ... RAM, 11 ... ROM, 12 ... liquid crystal display controller, 13 ... liquid crystal display, 14 ... operation unit, 15 ... memory card, 16 ... distance measuring sensor, 17 ... photometric sensor

Claims

An image processing apparatus comprising:
an estimation unit that estimates, for an input image including a main subject, an attention point region using a saliency map based on a plurality of feature amounts extracted from the input image;
a setting unit that sets, using the attention point region estimated by the estimation unit, a parameter relating to a subject detection process for detecting the main subject from the input image; and
a detection unit that executes the subject detection process using the parameter set by the setting unit.

The image processing apparatus according to claim 1, wherein the detection unit executes, as the subject detection process, face processing that detects a face of a person as the main subject or recognizes a face of a specific person.

The image processing apparatus according to claim 2, wherein
the parameter is a parameter indicating whether the attention point region is to be a target of the face processing,
the setting unit performs a first setting that includes the attention point region in the targets of the face processing when the attention point region satisfies a predetermined condition, and performs a second setting that excludes the attention point region from the targets of the face processing when the attention point region does not satisfy the predetermined condition, and
the detection unit executes the face processing on the attention point region when the setting unit has performed the first setting, and is prohibited from executing the face processing on the attention point region when the setting unit has performed the second setting.

The image processing apparatus according to claim 2, wherein the parameter is a parameter relating to the size or target range of a face region detected or recognized by the face processing, to the size or resolution of a region from which feature points are extracted from the face, or to a combination of two or more of these.

The image processing apparatus according to claim 2, wherein the parameter is a parameter relating to the accuracy of the face processing.

The image processing apparatus according to claim 2, wherein
the parameter is a parameter relating to the order of the face processing,
the setting unit assigns the order to each of one or more attention point regions according to a predetermined condition, and
the detection unit executes the face processing on each of the one or more attention point regions sequentially in the order assigned by the setting unit.

An image processing method comprising:
an estimation step of estimating, for an input image including a main subject, an attention point region using a saliency map based on a plurality of feature amounts extracted from the input image;
a setting step of setting, using the attention point region estimated in the estimation step, a parameter relating to a subject detection process for detecting the main subject from the input image; and
a detection step of executing the subject detection process using the parameter set in the setting step.