JP2015219648A - Age estimation device, imaging device, age estimation method and program

Info

Publication number: JP2015219648A
Application number: JP2014101602A
Authority: JP
Inventors: 浩一中込; Koichi Nakagome
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2014-05-15
Filing date: 2014-05-15
Publication date: 2015-12-07
Anticipated expiration: 2034-05-15
Also published as: JP6476589B2

Abstract

PROBLEM TO BE SOLVED: To accurately estimate the age of a subject. SOLUTION: In an age estimation device 100, a face detection unit 110 detects a face region of a subject in a subject image including the subject. A feature amount generation unit 120 generates a feature amount indicating features of the face of the subject from the face region detected by the face detection unit 110. A facial expression detection unit 130 detects a degree of facial expression of the subject from the face region detected by the face detection unit 110. An age estimation unit 140 estimates the age of the subject on the basis of the feature amount generated by the feature amount generation unit 120 and the degree of facial expression detected by the facial expression detection unit 130.

Description

The present invention relates to an age estimation device, an imaging device, an age estimation method, and a program.

Age estimation techniques estimate the age of a subject by analyzing an image of the subject's face. Many such techniques generate, from the face image, a feature amount that serves as an index for estimating age, and learn the relationship between this feature amount and age in advance from a large number of previously collected learning images, using a method such as SVR (Support Vector Regression). The age of the subject in question is then estimated on the basis of the learning result.

For example, Non-Patent Document 1 discloses a technique for estimating a person's age by obtaining an index for age estimation from biologically arising features, such as wrinkles, in a face image.

As individual techniques used when generating a feature amount from a face image, Patent Documents 1 to 3 disclose methods for detecting the face region of a subject in an image captured by an imaging means. Non-Patent Document 2 discloses a method for extracting, from a face image, the positions of specific parts such as the centers of the left and right eyes and the center of the mouth.

Patent Document 1: JP 2002-333652 A
Patent Document 2: JP 2005-234686 A
Patent Document 3: JP 2008-300986 A

Non-Patent Document 1: G. Guo, G. Mu, Y. Fu, and T. S. Huang, "Human Age Estimation Using Bio-inspired Features", IEEE Conference on Computer Vision and Pattern Recognition, 112-119, June 2009.
Non-Patent Document 2: T. F. Cootes and C. J. Taylor, "Statistical Models of Appearance for Computer Vision", Imaging Science and Biomedical Engineering, University of Manchester, Manchester M13 9PT, March 2004.

SVR-based learning for age estimation requires face images (training data) covering a variety of age groups, races, genders, and so on, together with the corresponding ages (class labels). If face images with various facial expressions are mixed into the training data, however, the accuracy of age estimation falls.

For example, when the expression of the person whose age is being estimated includes a smile or anger, the accuracy of age estimation tends to fall; typically, the person's age is overestimated. A likely cause is that feature amounts expressing age-related wrinkles and skin smoothness contribute especially strongly to age estimation, so that wrinkles caused by expressions such as smiling or anger cannot be distinguished from wrinkles caused by aging.

To avoid this loss of accuracy, it has been common to train the SVR on expressionless face images and to estimate age only for expressionless face images. There has nevertheless been a demand for accurate age estimation even for subjects with various expressions, such as smiles and anger.

The present invention has been made to solve the above problem, and its object is to provide an age estimation device, an age estimation method, and a program suitable for accurately estimating the age of a subject.

To achieve the above object, an age estimation device according to the present invention comprises:
a face detection unit that detects a face region of a subject in a subject image including the subject;
a feature amount generation unit that generates, from the face region detected by the face detection unit, a feature amount indicating features of the face of the subject;
a facial expression detection unit that detects, from the face region detected by the face detection unit, a degree of facial expression of the subject; and
an age estimation unit that estimates the age of the subject based on the feature amount generated by the feature amount generation unit and the degree of facial expression detected by the facial expression detection unit.

According to the present invention, the age of a subject can be estimated accurately.

FIG. 1 is a block diagram showing the configuration of the imaging device according to Embodiment 1 of the present invention.
FIG. 2 is a block diagram showing the functional configuration of the imaging device and the age estimation device according to Embodiment 1 of the present invention.
FIG. 3: (a), (b), and (c) each show an example of a learning image including a learning subject.
FIG. 4: (a) shows an example of face region detection, (b) shows an example of extracting the positions of specific parts, and (c) shows an example of a normalized image.
FIG. 5 is a flowchart showing the flow of the learning process executed by the age estimation device according to Embodiment 1 of the present invention.
FIG. 6 is a flowchart showing the flow of the age estimation process executed by the age estimation device according to Embodiment 1 of the present invention.
FIG. 7 is a block diagram showing the functional configuration of the imaging device and the age estimation device according to Embodiment 2 of the present invention.
FIG. 8 is a flowchart showing the flow of the age estimation process executed by the age estimation device according to Embodiment 2 of the present invention.

Embodiments of the present invention are described below with reference to the drawings. Identical or corresponding parts in the drawings are given the same reference numerals.

(Embodiment 1)
The imaging device according to Embodiment 1 of the present invention is configured as shown in FIG. 1. The imaging device 1 includes an imaging unit 10, a data processing unit 20, and a user interface unit 30.

The imaging unit 10 includes an optical lens 11 and an image sensor 12. The imaging unit 10 generates image data of a subject by imaging the subject.

The optical lens 11 consists of a lens that collects light coming from the subject and peripheral circuitry for adjusting imaging parameters such as focus, exposure, and white balance.

The image sensor 12 is, for example, a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) sensor. The image sensor 12 acquires the optical image of the subject formed by the light collected by the optical lens 11, and an analog/digital converter (not shown) converts the voltage information of the acquired optical image into digital information. The resulting digital image data is stored in the memory 21.

The data processing unit 20 includes a memory 21, a video output unit 22, a storage unit 23, a CPU (Central Processing Unit) 24, and an age estimation device 100.

The memory 21 is, for example, a RAM (Random Access Memory), and stores the image data generated by the imaging unit 10.

The video output unit 22 reads the image data stored in the memory 21, generates RGB (R (red), G (green), B (blue)) signals corresponding to the read image data, and outputs them to the display unit 31.

The storage unit 23 is a nonvolatile memory such as a ROM (Read-Only Memory) or flash memory. The storage unit 23 stores the various control programs executed by the CPU 24 and the data obtained through the various processes of the imaging device 1.

The CPU 24 controls the overall operation of the imaging device 1 by executing various arithmetic processes according to the programs stored in the storage unit 23, using the memory 21 as a temporary storage area.

The user interface unit 30 includes a display unit 31, an operation unit 32, an external interface 33, and an external storage medium 34.

The display unit 31 is, for example, a liquid crystal display or an organic EL (Electro Luminescence) display. Based on the RGB signals supplied from the video output unit 22, the display unit 31 displays various kinds of image data, such as images captured by the imaging device 1 and images showing the operation menu of the imaging device 1.

The operation unit 32 receives operation instructions from the user. It consists of various operation buttons, such as the power switch of the imaging device 1, a shutter button, and buttons for selecting the various functions of the imaging device 1. On receiving an operation instruction from the user, the operation unit 32 supplies the received instruction information to the CPU 24 of the data processing unit 20.

The display unit 31 and the operation unit 32 may be implemented together as a touch panel in which the two are overlaid on each other.

The external interface 33 is an interface for exchanging data with devices external to the imaging device 1. For example, the external interface 33 converts image data captured by the imaging device 1 into data conforming to the USB (Universal Serial Bus) standard and exchanges data with an external device over a USB cable.

The external storage medium 34 is a storage medium for exchanging data with devices external to the imaging device 1.

The age estimation device 100 estimates the age of a subject whose age is to be estimated, the subject having been captured by, for example, the imaging unit 10. Under the control of the CPU 24, the age estimation device 100 functions as the units shown in FIG. 2. Specifically, the age estimation device 100 functions as a face detection unit 110, a feature amount generation unit 120, a facial expression detection unit 130, an age estimation unit 140, a classification unit 150, and a learning unit 160.

With these functions, the age estimation device 100 executes the following processes (1) and (2), each of which is described in detail below with reference to the drawings.
(1) A process of learning age identification criteria using learning subjects
(2) A process of estimating the age of a subject to be estimated, based on the learning result

First, (1) the process of learning age identification criteria using learning subjects is described.

This learning process is executed in a dedicated learning mode that is separate from the normal imaging process of the imaging device 1. It may be executed at a product adjustment stage, such as at manufacture or factory shipment of the imaging device 1, or it may be executed after the device is in the hands of an ordinary user.

In the learning process, the face detection unit 110 detects the face region of the learning subject in each of a plurality of learning images, each of which includes a learning subject.

As learning images, the imaging device 1 collects in advance learning images 200a, 200b, 200c, and so on, which include learning subjects 210a, 210b, 210c with various facial expressions, as shown for example in FIGS. 3(a) to 3(c). The images are either captured by the imaging unit 10 or acquired from an external device via the external interface 33 or the external storage medium 34, and are stored in the storage unit 23 or the like.

When the learning process starts, the face detection unit 110 acquires the learning images from the storage unit 23, selects them one at a time, and detects the face region of the learning subject in each.

FIG. 4(a) shows an example in which the face detection unit 110 has selected the learning image 200b and detected the face region 220 of the learning subject 210b. The face detection unit 110 detects the face region 220, which corresponds to the face of the learning subject 210b, from the luminance distribution of the learning image 200b.

For example, the face detection unit 110 identifies skin-colored pixels based on the brightness, saturation, and hue of each pixel, and detects the region of the identified skin-colored pixels as the face region 220. Alternatively, the face detection unit 110 extracts from the learning image 200b shapes corresponding to facial components of the learning subject 210b, such as the mouth and eyes, and detects the face region 220 using the positions of those components as a reference. As a concrete method for detecting such a face region, the techniques disclosed in Patent Documents 1 to 3 mentioned above can be used, for example.
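The skin-color approach described above can be illustrated with a minimal Python sketch. The hue, saturation, and brightness limits below are hypothetical values chosen only for illustration, and taking the bounding box of all skin-colored pixels is an assumed simplification; neither is taken from this publication or from Patent Documents 1 to 3.

```python
import colorsys


def is_skin(rgb, hue_max=50 / 360, sat_range=(0.15, 0.75), val_min=0.35):
    """Return True when an (R, G, B) pixel (0-255) lies in an assumed skin-tone range.

    All thresholds are illustrative assumptions, not values from the source.
    """
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)  # hue, saturation, value in [0, 1]
    return h <= hue_max and sat_range[0] <= s <= sat_range[1] and v >= val_min


def face_bounding_box(image):
    """Bounding box (top, left, bottom, right) of all skin-colored pixels, or None.

    `image` is a list of rows, each row a list of (R, G, B) tuples.
    """
    hits = [(y, x) for y, row in enumerate(image)
            for x, px in enumerate(row) if is_skin(px)]
    if not hits:
        return None
    ys = [y for y, _ in hits]
    xs = [x for _, x in hits]
    return min(ys), min(xs), max(ys), max(xs)
```

A practical detector would also need to reject skin-colored background and handle several faces; this sketch shows only the per-pixel test and the region extraction.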

The feature amount generation unit 120 generates, from the face region 220 detected by the face detection unit 110, a feature amount indicating the facial features of the learning subject 210b. Specifically, the feature amount generation unit 120 functions as a part extraction unit 121 and a normalization unit 122.

The part extraction unit 121 extracts the positions of specific parts in the face region 220 detected by the face detection unit 110. For example, the part extraction unit 121 detects the shapes of facial parts such as the eyes and the mouth by a method such as edge detection, and extracts the center position of each detected part. Specifically, as shown in FIG. 4(b), the part extraction unit 121 extracts three points: the center of the right eye 221, the center of the left eye 222, and the center of the mouth 223. As a concrete method for extracting the positions of such specific parts, the technique disclosed in Non-Patent Document 2 mentioned above can be used, for example.

Based on the positions of the right eye 221, the left eye 222, and the mouth 223 extracted by the part extraction unit 121, the normalization unit 122 normalizes the learning image 200b into an image suited to feature amount generation. Specifically, as shown in FIG. 4(c), the normalization unit 122 scales, rotates, and translates the learning image 200b to generate a normalized image 230 of height H and width W in which the face region 220 occupies most of the central area.

In more detail, the normalization unit 122 calculates an affine matrix from the coordinates of the three points (the right eye 221, the left eye 222, and the mouth 223) extracted by the part extraction unit 121 in the learning image 200b and the coordinates of the same three points after normalization, so that the three points come to fixed positions in the normalized image 230. Specifically, as shown in FIG. 4(c), the normalization unit 122 calculates an affine matrix such that both eyes (the right eye 221 and the left eye 222) lie at a predetermined height ev from the lower edge of the normalized image 230, the mouth 223 lies at a predetermined height mv from the lower edge of the normalized image 230, and the distance between the eyes equals a predetermined distance eh. The normalization unit 122 then deforms the learning image 200b by an affine transformation according to the calculated affine matrix to generate the normalized image 230.
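Because three non-collinear point correspondences determine a 2x3 affine matrix uniquely, the calculation described above reduces to two 3x3 linear systems. The following Python sketch solves them with Cramer's rule. The function names are hypothetical. The target layout (eyes placed symmetrically about the vertical center line, mouth on the center line, image origin at the top left with y increasing downward) is an assumption added for illustration; the source specifies only ev, mv, and eh.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, rhs):
    """Solve the 3x3 system m @ x = rhs by Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("the three source points are collinear")
    solution = []
    for col in range(3):
        replaced = [list(row) for row in m]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution


def affine_from_points(src, dst):
    """2x3 affine matrix mapping three source points onto three target points."""
    m = [[x, y, 1.0] for x, y in src]
    row_x = solve3(m, [u for u, _ in dst])  # coefficients producing x'
    row_y = solve3(m, [v for _, v in dst])  # coefficients producing y'
    return [row_x, row_y]


def apply_affine(a, point):
    """Apply a 2x3 affine matrix to a single (x, y) point."""
    x, y = point
    return (a[0][0] * x + a[0][1] * y + a[0][2],
            a[1][0] * x + a[1][1] * y + a[1][2])


def target_points(width, height, ev, mv, eh):
    """Assumed target positions: eye on image left, eye on image right, mouth.

    Heights ev and mv are measured from the lower edge, so y = height - ev.
    """
    cx = width / 2.0
    return [(cx - eh / 2.0, height - ev),
            (cx + eh / 2.0, height - ev),
            (cx, height - mv)]
```

To produce the normalized image itself, each output pixel would be sampled from the source image through the inverse of this matrix; that resampling step is omitted here.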

The feature amount generation unit 120 generates a feature amount indicating the facial features of the learning subject 210b from the face region 220 in the learning image 200b after normalization by the normalization unit 122 (that is, in the normalized image 230). This feature amount is a quantity that serves as an index for estimating age from a face. Specifically, as in Non-Patent Document 1 mentioned above, the degree of features that generally change with age, such as the various wrinkles and eye bags that appear on a person's face, can be used as the feature amount.

For example, the feature amount generation unit 120 generates feature amounts for wrinkles and the like by detecting local edges in the face region 220 using a technique such as a Gabor filter or an LBP (Local Binary Pattern) histogram. A Gabor filter extracts which orientations of lines and edges are present in an image. LBP is a coding scheme that generates a 0 or 1 bit from the magnitude relationship between a center pixel value and each of its surrounding pixel values. Using an LBP histogram makes it possible to detect, for example, the shape information of edges in an image.

Using such a technique, the feature amount generation unit 120 divides the face region 220 into a plurality of partial regions, for example one per facial part, and generates a feature amount for each partial region. Because the face region is divided into partial regions, the feature amount is calculated as a value that depends on position within the image (it is space-variant). To generate feature amounts with high accuracy, the normalization of the learning images 200a to 200c by the normalization unit 122 described above is therefore necessary.
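The LBP coding and the per-region feature generation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the basic 8-neighbor LBP with a "greater than or equal to" comparison is used, and the face is split into a uniform grid rather than into facial parts, which is a simplification not taken from the source. All function names are hypothetical.

```python
# Neighbor offsets (dy, dx), clockwise from the top-left neighbor.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, y, x):
    """8-bit LBP code of interior pixel (y, x) in a grayscale image (list of rows)."""
    center = img[y][x]
    code = 0
    for bit, (dy, dx) in enumerate(OFFSETS):
        if img[y + dy][x + dx] >= center:  # bit is 1 when neighbor >= center
            code |= 1 << bit
    return code


def lbp_histogram(img, top, left, bottom, right):
    """256-bin histogram of LBP codes over rows [top, bottom) and columns [left, right).

    Pixels on the image border are skipped because they lack a full neighborhood.
    """
    height, width = len(img), len(img[0])
    hist = [0] * 256
    for y in range(max(top, 1), min(bottom, height - 1)):
        for x in range(max(left, 1), min(right, width - 1)):
            hist[lbp_code(img, y, x)] += 1
    return hist


def region_features(img, rows=2, cols=2):
    """Concatenate the LBP histograms of a rows x cols grid of partial regions."""
    height, width = len(img), len(img[0])
    features = []
    for r in range(rows):
        for c in range(cols):
            features.extend(lbp_histogram(
                img,
                r * height // rows, c * width // cols,
                (r + 1) * height // rows, (c + 1) * width // cols))
    return features
```

Because each histogram is tied to a fixed cell of the image, the concatenated vector is meaningful only if the same facial part falls in the same cell for every image, which is why normalization has to come first.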

The facial expression detection unit 130 detects the degree of facial expression of the learning subject from the face region 220 detected by the face detection unit 110. Specifically, the facial expression detection unit 130 detects the degree of the learning subject's smile as the degree of facial expression.

As a concrete method for detecting the degree of smile, the techniques disclosed in Patent Document 2 or Patent Document 3 mentioned above can be used, for example. Patent Document 2 discloses a method that uses a parameter indicating how far the mouth is open and a parameter indicating how visible the teeth are to generate an index representing a facial expression in an image, such as a smile. Patent Document 3 discloses a method that acquires, for each component of the subject's face, the difference between reference face information and other face information, and calculates a smile degree by summing the acquired differences.

Using such a method, the facial expression detection unit 130 detects the degree of the learning subject's smile as a smile degree k. For example, the facial expression detection unit 130 detects a low smile degree of k = 0 for the learning subject 210a shown in FIG. 3(a), a medium smile degree of k = 0.5 for the learning subject 210b shown in FIG. 3(b), and a high smile degree of k = 1 for the learning subject 210c shown in FIG. 3(c).

The classification unit 150 classifies the previously collected learning images into a smile group and a non-smile group according to the smile degree k of the learning subject detected by the facial expression detection unit 130. Specifically, the classification unit 150 assigns each learning image to the smile group or the non-smile group depending on whether the smile degree k exceeds a predetermined threshold th. The threshold th is a value for determining whether the expression of the learning subject in a learning image is a smile or not, and is set to a value between 0 and 1.

For each of the smile group and the non-smile group, the learning unit 160 learns an identification criterion for identifying age from the feature amount, based on the relationship between the age of the learning subject and the feature amount in the learning images classified into that group.

Specifically, the learning unit 160 learns a first identification criterion by a method such as SVR (Support Vector Regression), using only the learning images classified into the smile group as samples. This generates a smile age classifier 141. The learning unit 160 likewise learns a second identification criterion by a method such as SVR, using only the learning images classified into the non-smile group as samples. This generates a non-smile age classifier 142.

Using the smile age classifier 141 and the non-smile age classifier 142 generated in this way, the age estimation device 100 executes the age estimation process described later.

The flow of the learning process described above is explained with reference to the flowchart shown in FIG. 5. The learning process starts when, for example, an instruction to start learning is received via the operation unit 32 or the like.

When the learning process starts, the face detection unit 110 selects one of the previously collected learning images (step S1) and detects the face region of the learning subject in the selected learning image (step S2). That is, the face detection unit 110 detects the face region 220 of the learning subject 210b in the learning image 200b, as shown for example in FIG. 4(a).

When the face detection unit 110 has detected the face region 220, the feature amount generation unit 120 normalizes the learning image (step S3) and generates, from the detected face region 220, a feature amount indicating the facial features of the learning subject 210b (step S4). That is, the part extraction unit 121 extracts the positions of the specific parts in the face region 220, as shown for example in FIG. 4(b), and the normalization unit 122 generates the normalized image 230 based on the extracted positions, as shown for example in FIG. 4(c). The feature amount generation unit 120 then generates the feature amount from the face region 220 in the normalized image 230 by the method described above.

When the feature amount generation unit 120 has generated the feature amount, the facial expression detection unit 130 detects, from the face region 220 detected by the face detection unit 110, the degree of smile (smile degree k) as the degree of facial expression of the learning subject (step S5).

When the facial expression detection unit 130 has detected the smile degree k, the classification unit 150 performs a smile determination by checking whether the detected smile degree k exceeds the threshold th, which is set to a value between 0 and 1 (step S6). If the smile degree k exceeds the threshold th (step S6; YES), the classification unit 150 classifies the selected learning image into the smile group (step S7). If the smile degree k is less than or equal to the threshold th (step S6; NO), the classification unit 150 classifies the selected learning image into the non-smile group (step S8).

Having classified the selected learning image into either the smile group or the non-smile group, the classification unit 150 determines whether all learning images have been classified (step S9).

If classification is not finished (step S9; NO), the face detection unit 110 selects another learning image from those stored in the storage unit 23 (step S10). The learning process then returns to step S2 and repeats steps S2 to S9 to classify the newly selected learning image into either the smile group or the non-smile group. This classification is repeated for all of the learning images prepared in advance.

When the classification unit 150 has finally finished classifying all the learning images (step S9; YES), the learning unit 160 generates the smile age classifier 141 using the learning images classified into the smile group as samples (step S11), and generates the non-smile age classifier 142 using the learning images classified into the non-smile group as samples (step S12). This completes the learning process shown in the flowchart of FIG. 5.
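The grouping and per-group learning of steps S6 to S12 can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the specification: the `Sample` type, the function names, the scalar feature, and the least-squares line used as the learner are all hypothetical choices made for brevity (the background section names SVR as one typical learner).

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Sample:
    feature: float  # stand-in for the feature amount (a real one would be a vector)
    smile: float    # smile degree k, between 0 and 1
    age: float      # known age of the learning subject


def split_by_smile(samples: Sequence[Sample], th: float) -> Tuple[List[Sample], List[Sample]]:
    """Steps S6 to S8: k > th goes to the smile group, k <= th to the non-smile group."""
    smile_group = [s for s in samples if s.smile > th]
    non_smile_group = [s for s in samples if s.smile <= th]
    return smile_group, non_smile_group


def fit_line(group: Sequence[Sample]) -> Callable[[float], float]:
    """Hypothetical learner: least-squares line age = a * feature + b for one group."""
    if not group:
        raise ValueError("cannot learn from an empty group")
    n = len(group)
    mean_f = sum(s.feature for s in group) / n
    mean_a = sum(s.age for s in group) / n
    var = sum((s.feature - mean_f) ** 2 for s in group)
    cov = sum((s.feature - mean_f) * (s.age - mean_a) for s in group)
    a = cov / var if var else 0.0  # fall back to the mean age if the feature is constant
    b = mean_a - a * mean_f
    return lambda feature: a * feature + b
```

Calling `fit_line` once on each group returned by `split_by_smile` yields two independent models, which play the roles of the smile age classifier 141 and the non-smile age classifier 142.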

Second, (2) the process of estimating the age of the subject targeted for age estimation on the basis of the learning result will be described.

In the age estimation process, the imaging device 1 acquires a subject image including the subject targeted for age estimation, either by capturing it with the imaging unit 10 or by obtaining, via the external interface 33 or the external storage medium 34 for example, an image of the subject captured in advance by an external device.

The face detection unit 110 detects the face area of the subject in the acquired subject image. The feature amount generation unit 120 generates a feature amount indicating features of the subject's face from the face area detected by the face detection unit 110. The facial expression detection unit 130 detects, from the face area detected by the face detection unit 110, the degree of smile (smile degree k) as the degree of facial expression of the subject.

The processing that the face detection unit 110, the feature amount generation unit 120, and the facial expression detection unit 130 perform on the subject image including the subject is the same as the processing performed on the learning image 200b including the learning subject 210b, described with reference to FIGS. 4(a) to 4(c). This is because, to estimate the age of the subject with high accuracy, it is preferable to use the same algorithms at learning time and at age estimation time. A detailed description is therefore omitted here.

The age estimation unit 140 estimates the age of the subject on the basis of the feature amount generated by the feature amount generation unit 120 and the smile degree k detected by the facial expression detection unit 130. Specifically, the age estimation unit 140 functions as the smile age classifier 141, the non-smile age classifier 142, and a selection unit 143.

The smile age classifier 141 and the non-smile age classifier 142 each identify the age of the subject from the feature amount generated by the feature amount generation unit 120, on the basis of mutually different identification criteria learned by the learning unit 160. Specifically, the smile age classifier 141 functions as a first age identification unit and obtains the smile age SAge on the basis of a first identification criterion learned from the learning images classified into the smile group. The non-smile age classifier 142 functions as a second age identification unit and obtains the non-smile age NAge on the basis of a second identification criterion learned from the learning images classified into the non-smile group.

The smile age SAge is the estimated age of the subject on the assumption that the subject's facial expression is a smile. The non-smile age NAge is the estimated age of the subject on the assumption that the subject's facial expression is not a smile. When age is estimated using facial wrinkles or the like as the feature amount, wrinkles caused by aging are difficult to distinguish from wrinkles caused by smiling. In general, therefore, the smile age SAge, obtained using only smiling learning subjects as samples, is lower than the non-smile age NAge, obtained using only non-smiling learning subjects as samples.

The selection unit 143 selects either the smile age classifier 141 or the non-smile age classifier 142 according to the smile degree k detected by the facial expression detection unit 130. The age estimation unit 140 then estimates, as the age of the subject, the age identified by the age identification unit that the selection unit 143 selected, that is, either the smile age SAge or the non-smile age NAge.

Specifically, as the classifier for identifying the subject's age, the selection unit 143 selects the smile age classifier 141 when the smile degree k exceeds the predetermined threshold th, and selects the non-smile age classifier 142 when the smile degree k is equal to or less than the predetermined threshold th. The threshold th is a value for determining whether the subject's expression is a smile or a non-smile, and is set to the same value as at learning time.

Because the selection unit 143 selects the age classifier according to the subject's degree of smile in this way, the age estimation unit 140 estimates the smile age SAge as the age of the subject when the subject's expression is a smile, and estimates the non-smile age NAge as the age of the subject when the subject's expression is not a smile.

In other words, for the same facial feature amount, the age estimation unit 140 estimates a lower age when the subject's expression is a smile than when it is not; put differently, the greater the subject's degree of smile, the lower the estimated age. This estimation method makes it possible to estimate the subject's age appropriately according to the degree of smile. In particular, it prevents the loss of age estimation accuracy that occurs when a smiling subject's age is overestimated because wrinkles caused by aging are difficult to distinguish from wrinkles caused by smiling.

The output unit 40 outputs the result of the age estimation unit 140's estimation of the subject's age. For example, the display unit 31, the external interface 33, or the external storage medium 34 functions as the output unit 40. Under the control of the video output unit 22, the output unit 40 displays the age estimation result on the display unit 31, or provides the age estimation result to an external device via the external interface 33 or the external storage medium 34.

The flow of the age estimation process described above will now be explained with reference to the flowchart shown in FIG. 6 (age estimation process 1). This age estimation process starts when an instruction to estimate the age of a subject is received, via the operation unit 32 for example, from the user operating the imaging device 1.

When the age estimation process starts, the face detection unit 110 acquires a subject image including the subject targeted for age estimation from the memory 21, the storage unit 23, or the like (step S101), and detects the face area of the subject in the acquired subject image (step S102).

When the face detection unit 110 has detected the face area 220, the feature amount generation unit 120 normalizes the subject image (step S103) and generates, from the normalized subject image, a feature amount indicating features of the subject's face (step S104).

When the feature amount generation unit 120 has generated the feature amount, the facial expression detection unit 130 detects the subject's degree of smile (smile degree k) from the face area detected by the face detection unit 110 (step S105).

That is, in the age estimation process, the age estimation device 100 performs on the acquired subject image the same processing (face detection, feature amount generation, and smile degree detection) that it performed on the learning images in the learning process.

When the facial expression detection unit 130 has detected the smile degree k, the selection unit 143 performs a smile determination by determining whether the detected smile degree k exceeds the predetermined threshold th (step S106).

If the smile degree k exceeds the threshold th (step S106; YES), the selection unit 143 selects the smile age classifier 141. The smile age classifier 141 obtains the smile age SAge from the feature amount generated by the feature amount generation unit 120 (step S107). The age estimation unit 140 estimates the obtained smile age SAge as the age of the subject.

If, on the other hand, the smile degree k is equal to or less than the threshold th (step S106; NO), the selection unit 143 selects the non-smile age classifier 142. The non-smile age classifier 142 obtains the non-smile age NAge from the feature amount generated by the feature amount generation unit 120 (step S108). The age estimation unit 140 estimates the obtained non-smile age NAge as the age of the subject.

When the age estimation unit 140 has obtained the estimation result for the subject's age, the output unit 40 outputs that result externally, for example by displaying it on the display unit 31 (step S109). This completes the age estimation process shown in the flowchart of FIG. 6.
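The branch at steps S106 to S108 can be sketched as below. The function name and the representation of each classifier as a plain callable taking a scalar feature are assumptions made for illustration; the specification does not prescribe a programming interface.

```python
from typing import Callable

# Hypothetical interface: a classifier maps a feature amount to an estimated age.
AgeClassifier = Callable[[float], float]


def estimate_age_by_selection(
    feature: float,
    k: float,
    th: float,
    smile_classifier: AgeClassifier,
    non_smile_classifier: AgeClassifier,
) -> float:
    """Pick one classifier from the smile degree k and return its age estimate."""
    if k > th:
        # Step S106 YES, then step S107: the estimate is the smile age SAge.
        return smile_classifier(feature)
    # Step S106 NO, then step S108: the estimate is the non-smile age NAge.
    return non_smile_classifier(feature)
```

Note that a smile degree exactly equal to `th` is routed to the non-smile classifier, matching the "equal to or less than" wording of step S106, and that `th` should be the same value used when the learning images were grouped.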

As described above, the age estimation device 100 and the imaging device 1 according to Embodiment 1 generate the smile age classifier 141 and the non-smile age classifier 142 in advance, using learning images classified into a smile group and a non-smile group. They then select either the smile age classifier 141 or the non-smile age classifier 142 according to the smile degree of the subject targeted for age estimation, and estimate the subject's age with the selected classifier.

As a result, loss of age estimation accuracy that depends on whether the subject's expression is a smile or a non-smile, such as overestimating the age of a smiling subject, can be avoided, and the age can be estimated accurately for a subject with either expression.

The age estimation device 100 described above includes two age identification units, namely the smile age classifier 141 and the non-smile age classifier 142. However, the age estimation device 100 may include three or more age identification units generated according to the smile degree of the learning subjects.

For example, when the age estimation device 100 includes three age identification units, the classification unit 150, in the learning stage, classifies the plurality of learning images collected in advance into three groups of low, medium, and high smile degree, according to the smile degree of the learning subject detected by the facial expression detection unit 130. For each of these three groups, the learning unit 160 then learns an identification criterion for identifying age from the feature amount, on the basis of the relationship between the age of the learning subject and the feature amount in the learning images classified into that group. In the age estimation stage, the selection unit 143 selects, from among the three age identification units, the one that matches the smile degree of the subject targeted for age estimation, and uses it for age estimation.

Increasing the number of age identification units in this way makes it possible to select the age classifier best suited to the subject's expression, and thus improves the accuracy of age estimation.
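Selection among three or more age identification units can be sketched as a lookup over sorted smile degree boundaries. The boundary values in the usage note, the function names, and the callable classifier interface are assumptions for illustration; the specification does not state where the low, medium, and high groups are divided.

```python
from bisect import bisect_left
from typing import Callable, Sequence


def smile_group_index(k: float, boundaries: Sequence[float]) -> int:
    """Return the group index for smile degree k.

    `boundaries` must be sorted in ascending order. bisect_left counts the
    boundaries strictly below k, so a value equal to a boundary stays in the
    lower group, mirroring the "exceeds the threshold" rule of the two-group case.
    """
    return bisect_left(boundaries, k)


def estimate_age_multi(
    feature: float,
    k: float,
    boundaries: Sequence[float],
    classifiers: Sequence[Callable[[float], float]],
) -> float:
    """Use the classifier learned for the group that matches smile degree k."""
    if len(classifiers) != len(boundaries) + 1:
        raise ValueError("need exactly one classifier per smile degree group")
    return classifiers[smile_group_index(k, boundaries)](feature)
```

With hypothetical boundaries `(0.33, 0.66)`, the three classifiers would correspond to the low, medium, and high smile degree groups; the same boundaries would be used both when grouping the learning images and when estimating.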

(Embodiment 2)
Next, Embodiment 2 of the present invention will be described.

FIG. 7 shows the functional configuration of the imaging device and the age estimation device according to Embodiment 2. In the imaging device 1a and the age estimation device 100a according to Embodiment 2, the age estimation unit 140a includes an age calculation unit 144 in place of the selection unit 143 of the age estimation unit 140 according to Embodiment 1. The rest of the configuration is the same as that of the imaging device 1 and the age estimation device 100 according to Embodiment 1.

Prior to the age estimation process for a subject, the age estimation device 100a according to Embodiment 2 uses the functions of the classification unit 150, the learning unit 160, and so on to perform (1) the process of learning the age identification criteria using learning subjects. This learning process is the same as in Embodiment 1, so its description is omitted here.

In Embodiment 2, (2) the process of estimating the age of the subject targeted for age estimation on the basis of the learning result will be described with reference to the flowchart shown in FIG. 8 (age estimation process 2).

Steps S201 to S205 of this age estimation process are the same as steps S101 to S105 (FIG. 6) of the age estimation process of Embodiment 1.

That is, when the age estimation process starts, the face detection unit 110 acquires a subject image including the subject targeted for age estimation from the memory 21, the storage unit 23, or the like (step S201), and detects the face area of the subject in the acquired subject image (step S202).

When the face detection unit 110 has detected the face area 220, the feature amount generation unit 120 normalizes the subject image (step S203) and generates, from the normalized subject image, a feature amount indicating features of the subject's face (step S204).

When the feature amount generation unit 120 has generated the feature amount, the facial expression detection unit 130 detects the subject's degree of smile (smile degree k) from the face area detected by the face detection unit 110 (step S205).

When the facial expression detection unit 130 has detected the smile degree k, the smile age classifier 141 obtains the smile age SAge from the feature amount generated by the feature amount generation unit 120 (step S206). The non-smile age classifier 142 then obtains the non-smile age NAge from the feature amount generated by the feature amount generation unit 120 (step S207).

On the basis of the subject's smile degree k detected by the facial expression detection unit 130, the age calculation unit 144 takes a weighted linear sum of the smile age SAge and the non-smile age NAge, thereby calculating an age between the smile age SAge and the non-smile age NAge as the estimated age (step S208).

Specifically, on the basis of equation (1) below, the age calculation unit 144 calculates, as the estimated age Age, an age that is closer to the smile age SAge the larger the smile degree k is (and closer to the non-smile age NAge the smaller the smile degree k is).
Age = k × SAge + (1 − k) × NAge ... (1)

Because the smile age SAge is normally lower than the non-smile age NAge, the age calculation unit 144 calculates a lower estimated age for the subject the larger the smile degree k is. The age estimation unit 140a estimates, as the age of the subject, the age that the age calculation unit 144 calculated with this relational expression.
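Equation (1) can be checked with a few lines of code. The function name and the range check on k are assumptions added for illustration; the formula itself is the one given in the text.

```python
def blend_age(k: float, s_age: float, n_age: float) -> float:
    """Equation (1): Age = k * SAge + (1 - k) * NAge.

    k is the smile degree, assumed here to lie between 0 and 1 so that the
    result always falls between the two classifier outputs.
    """
    if not 0.0 <= k <= 1.0:
        raise ValueError("smile degree k is expected to be between 0 and 1")
    return k * s_age + (1.0 - k) * n_age
```

For example, with SAge = 30 and NAge = 40, a smile degree of 0.25 gives 0.25 × 30 + 0.75 × 40 = 37.5, while k = 1 returns SAge and k = 0 returns NAge. Unlike the selection in Embodiment 1, the estimate changes continuously with k rather than jumping at the threshold th.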

When the age estimation unit 140a has obtained the estimation result for the subject's age, the output unit 40 outputs that result externally, for example by displaying it on the display unit 31 (step S209). This completes the age estimation process shown in the flowchart of FIG. 8.

As described above, the imaging device 1a and the age estimation device 100a according to Embodiment 2 use both the smile age classifier 141 and the non-smile age classifier 142, which were generated using learning images classified into a smile group and a non-smile group, to obtain the smile age SAge and the non-smile age NAge of the subject targeted for age estimation. They then calculate the estimated age of the subject by taking a weighted linear sum of the smile age SAge and the non-smile age NAge according to the subject's smile degree.

As a result, the age of the subject can be estimated accurately while taking close account of how the subject's degree of smile affects age estimation.

Note that equation (1) above is only an example. The age calculation unit 144 is not limited to a linear sum; for example, if another relational expression that allows the age to be estimated more precisely from the smile degree k is known, the estimated age of the subject may be calculated according to that expression.

(Modifications)
Embodiments of the present invention have been described above, but they are examples, and the scope of application of the present invention is not limited to them. That is, the embodiments of the present invention can be applied in various ways, and all such embodiments fall within the scope of the present invention.

For example, in the above embodiments, the facial expression detection unit 130 detects the degree of smile as the degree of facial expression of the subject. However, the age estimation device and the imaging device according to the present invention may detect the degree of another expression, such as anger, instead of or in addition to a smile. In that case, a plurality of classifiers, one per degree of anger, may be generated in advance using as samples learning images classified according to the degree of anger of the learning subjects, and the age of the subject may be estimated using these classifiers.

By using classifiers prepared for each degree of various expressions such as smiles and anger in this way, the age can be estimated accurately even for subjects with a variety of expressions.

In the above embodiments, the age estimation devices 100 and 100a are mounted inside the imaging devices 1 and 1a. However, the age estimation device according to the present invention may be a device independent of the imaging devices 1 and 1a. For example, a general information processing device such as a personal computer can function as the age estimation device. In this case, the age estimation device acquires, via external data communication means, a subject image obtained by an imaging device independent of itself, and performs the age estimation process described above on the acquired subject image.

In the above embodiments, the age estimation devices 100 and 100a have a function for learning the age identification criteria using learning subjects (the classification unit 150 and the learning unit 160). However, the age estimation device according to the present invention need not have a learning function. For example, the age estimation device according to the present invention may be configured to acquire, via external data communication means, a learning result learned in advance by another device independent of itself, and to perform the age estimation process described above on the basis of the acquired learning result.

The age of the subject estimated by the age estimation devices 100 and 100a according to Embodiment 1 above may be the actual age or the apparent age. When the age to be estimated is the actual age, the learning unit 160 learns the identification criteria described above using the actual ages of the learning subjects. When the age to be estimated is the apparent age, the learning unit 160 learns the identification criteria described above using the apparent ages of the learning subjects. Whichever age is estimated, conventional techniques have tended to overestimate the age of a subject when the degree of an expression such as a smile or anger is large. The age estimation device according to the present invention, however, can accurately estimate the actual age or the apparent age of the subject.

The present invention can of course be provided as an age estimation device equipped in advance with the configuration for realizing the functions according to the present invention. In addition, by applying a program, an existing personal computer, information terminal device, or the like can be made to function as the age estimation device according to the present invention. That is, a program for realizing the functional configurations of the age estimation devices 100 and 100a exemplified in the above embodiments is applied so that it can be executed by a CPU or the like that controls the existing personal computer, information terminal device, or the like, which then functions as the age estimation device according to the present invention. The age estimation method according to the present invention can be carried out using the age estimation device.

The method of applying such a program is arbitrary. For example, the program can be applied by storing it on a computer-readable recording medium (a CD-ROM (Compact Disc Read-Only Memory), a DVD (Digital Versatile Disc), an MO (Magneto Optical disc), or the like), or by storing it in storage on a network such as the Internet and having it downloaded.

Preferred embodiments of the present invention have been described above, but the present invention is not limited to those specific embodiments; it includes the inventions set forth in the claims and their equivalents. The inventions set forth in the claims as originally filed in the present application are appended below.

(Supplementary Note 1)
An age estimation device comprising:
a face detection unit that detects, in a subject image including a subject, a face area of the subject;
a feature amount generation unit that generates, from the face area detected by the face detection unit, a feature amount indicating features of the face of the subject;
a facial expression detection unit that detects, from the face area detected by the face detection unit, a degree of facial expression of the subject; and
an age estimation unit that estimates the age of the subject on the basis of the feature amount generated by the feature amount generation unit and the degree of facial expression detected by the facial expression detection unit.

(Supplementary Note 2)
The age estimation device according to Supplementary Note 1, wherein
the feature amount generation unit includes:
a part extraction unit that extracts the position of a specific part in the face area detected by the face detection unit; and
a normalization unit that normalizes the subject image on the basis of the position extracted by the part extraction unit,
and generates the feature amount from the face area in the subject image after normalization by the normalization unit.

(Supplementary Note 3)
The age estimation device according to Supplementary Note 1 or 2, wherein
the age estimation unit includes:
a plurality of age identification units, each of which identifies the age of the subject from the feature amount on the basis of an identification criterion different from those of the others; and
a selection unit that selects one of the plurality of age identification units according to the degree of facial expression detected by the facial expression detection unit,
and estimates, as the age of the subject, the age identified by the age identification unit selected by the selection unit.

（付記４）
前記顔検出部は、それぞれが学習用被写体を含む複数の学習用画像における、当該学習用被写体の顔領域を検出し、
前記特徴量生成部は、前記複数の学習用画像のそれぞれについて、前記顔検出部が検出した顔領域から、前記学習用被写体の顔の特徴を示す特徴量を生成し、
前記表情検出部は、前記複数の学習用画像のそれぞれについて、前記顔検出部が検出した顔領域から、前記学習用被写体の表情の度合を検出し、
前記複数の学習用画像を、前記表情検出部が検出した前記学習用被写体の表情の度合に応じて、複数のグループに分類する分類部と、
前記複数のグループのそれぞれについて、当該それぞれのグループに分類された学習用画像における、学習用被写体の年齢と特徴量との関係に基づいて、前記識別基準を学習する学習部と、
をさらに備える、
ことを特徴とする付記３に記載の年齢推定装置。 (Appendix 4)
The age estimation device according to Appendix 3, wherein
the face detection unit detects a face area of a learning subject in each of a plurality of learning images, each of which includes the learning subject,
the feature amount generation unit generates, for each of the plurality of learning images, a feature amount indicating features of the face of the learning subject from the face area detected by the face detection unit,
the facial expression detection unit detects, for each of the plurality of learning images, the degree of facial expression of the learning subject from the face area detected by the face detection unit, and
the age estimation device further comprises:
a classification unit that classifies the plurality of learning images into a plurality of groups according to the degree of facial expression of the learning subject detected by the facial expression detection unit; and
a learning unit that learns, for each of the plurality of groups, the identification criterion based on the relationship between the age of the learning subject and the feature amount in the learning images classified into that group.
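
The classification and per-group learning of Appendix 4 can be sketched as below. The sketch makes several assumptions that are not in the appendix: each learning image is reduced to a single scalar feature, the images are split into two groups by a hypothetical threshold of 0.5, and an ordinary least-squares line stands in for the learned identification criterion (the background section mentions SVR as one possible technique). The names `Sample`, `classify`, `fit_line`, and `learn_criteria` are illustrative.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Sample(NamedTuple):
    feature: float            # scalar stand-in for the feature amount
    expression_degree: float  # degree of facial expression
    age: float                # known age of the learning subject


def classify(samples: List[Sample], threshold: float = 0.5) -> Dict[str, List[Sample]]:
    """Classification unit: group learning samples by expression degree."""
    groups: Dict[str, List[Sample]] = defaultdict(list)
    for s in samples:
        key = "smile" if s.expression_degree > threshold else "non_smile"
        groups[key].append(s)
    return dict(groups)


def fit_line(samples: List[Sample]) -> Tuple[float, float]:
    """Least-squares fit of age = slope * feature + intercept."""
    n = len(samples)
    mean_x = sum(s.feature for s in samples) / n
    mean_y = sum(s.age for s in samples) / n
    var_x = sum((s.feature - mean_x) ** 2 for s in samples)
    if var_x == 0:
        return 0.0, mean_y  # degenerate group: predict the mean age
    cov = sum((s.feature - mean_x) * (s.age - mean_y) for s in samples)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


def learn_criteria(samples: List[Sample], threshold: float = 0.5) -> Dict[str, Tuple[float, float]]:
    """Learning unit: learn one criterion per group."""
    return {name: fit_line(group) for name, group in classify(samples, threshold).items()}
```

Each group gets its own criterion, so a smiling face is later judged only against the relationship learned from smiling learning images.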

（付記５）
前記年齢推定部は、
第１の識別基準に基づいて、前記特徴量から前記被写体の年齢を識別する第１の年齢識別部と、
第２の識別基準に基づいて、前記特徴量から前記被写体の年齢を識別する第２の年齢識別部と、
前記表情検出部が検出した表情の度合に基づいて、前記第１の年齢識別部が識別した年齢と、前記第２の年齢識別部が識別した年齢と、の間の年齢を算出する年齢算出部と、
を含み、
前記年齢算出部が算出した年齢を、前記被写体の年齢として推定する、
ことを特徴とする付記１又は２に記載の年齢推定装置。 (Appendix 5)
The age estimation device according to Appendix 1 or 2, wherein
the age estimation unit includes:
a first age identification unit that identifies the age of the subject from the feature amount based on a first identification criterion;
a second age identification unit that identifies the age of the subject from the feature amount based on a second identification criterion; and
an age calculation unit that calculates, based on the degree of facial expression detected by the facial expression detection unit, an age between the age identified by the first age identification unit and the age identified by the second age identification unit, and
the age estimation unit estimates, as the age of the subject, the age calculated by the age calculation unit.

（付記６）
前記顔検出部は、それぞれが学習用被写体を含む複数の学習用画像における、当該学習用被写体の顔領域を検出し、
前記特徴量生成部は、前記複数の学習用画像のそれぞれについて、前記顔検出部が検出した顔領域から、前記学習用被写体の顔の特徴を示す特徴量を生成し、
前記表情検出部は、前記複数の学習用画像のそれぞれについて、前記顔検出部が検出した顔領域から、前記学習用被写体の表情の度合を検出し、
前記複数の学習用画像の中で、前記表情検出部が検出した前記学習用被写体の表情の度合が所定の閾値を超える学習用画像を第１のグループに分類し、前記表情検出部が検出した前記学習用被写体の表情の度合が当該所定の閾値以下の学習用画像を第２のグループに分類する分類部と、
前記第１のグループに分類された学習用画像における、学習用被写体の年齢と特徴量との関係に基づいて、前記第１の識別基準を学習し、前記第２のグループに分類された学習用画像における、学習用被写体の年齢と特徴量との関係に基づいて、前記第２の識別基準を学習する学習部と、
をさらに備え、
前記年齢算出部は、前記表情検出部が検出した前記被写体の表情の度合が大きいほど前記第１の年齢識別部が識別した年齢に近い年齢を、前記被写体の推定年齢として算出する、
ことを特徴とする付記５に記載の年齢推定装置。 (Appendix 6)
The age estimation device according to Appendix 5, wherein
the face detection unit detects a face area of a learning subject in each of a plurality of learning images, each of which includes the learning subject,
the feature amount generation unit generates, for each of the plurality of learning images, a feature amount indicating features of the face of the learning subject from the face area detected by the face detection unit,
the facial expression detection unit detects, for each of the plurality of learning images, the degree of facial expression of the learning subject from the face area detected by the face detection unit,
the age estimation device further comprises:
a classification unit that classifies, among the plurality of learning images, learning images in which the degree of facial expression of the learning subject detected by the facial expression detection unit exceeds a predetermined threshold into a first group, and learning images in which that degree is equal to or less than the predetermined threshold into a second group; and
a learning unit that learns the first identification criterion based on the relationship between the age of the learning subject and the feature amount in the learning images classified into the first group, and learns the second identification criterion based on the relationship between the age of the learning subject and the feature amount in the learning images classified into the second group, and
the age calculation unit calculates, as the estimated age of the subject, an age that is closer to the age identified by the first age identification unit the greater the degree of facial expression of the subject detected by the facial expression detection unit.
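
One way to satisfy the age calculation of Appendices 5 and 6 is a linear blend of the two identified ages, sketched below. The linear form, the clamping of the degree of facial expression to [0, 1], and the function name `blend_age` are assumptions; the appendices only require that the result lie between the two ages and move toward the first age as the degree increases.

```python
def blend_age(first_age: float, second_age: float, expression_degree: float) -> float:
    """Return an age between the two identified ages.

    first_age:  age from the identifier learned on the first group
                (expression degree above the threshold)
    second_age: age from the identifier learned on the second group
    """
    # Clamp the weight so the result never leaves the interval
    # spanned by the two identified ages.
    w = min(1.0, max(0.0, expression_degree))
    # A larger degree pulls the estimate toward first_age.
    return w * first_age + (1.0 - w) * second_age


# Usage: a weak smile stays close to the second (non-smile) age.
print(blend_age(30.0, 40.0, 0.25))
```

Compared with selecting a single identifier, blending avoids a jump in the estimated age when the degree of facial expression crosses the threshold.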

（付記７）
前記表情検出部は、前記被写体の笑顔の度合を、前記表情の度合として検出する、
ことを特徴とする付記１から６のいずれか１つに記載の年齢推定装置。 (Appendix 7)
The age estimation device according to any one of Appendices 1 to 6, wherein
the facial expression detection unit detects the degree of smile of the subject as the degree of facial expression.

（付記８）
前記年齢推定部は、前記表情検出部が検出した表情の度合が大きいほど低い年齢を、前記被写体の年齢として推定する、
ことを特徴とする付記１から７のいずれか１つに記載の年齢推定装置。 (Appendix 8)
The age estimation device according to any one of Appendices 1 to 7, wherein
the age estimation unit estimates a lower age as the age of the subject the greater the degree of facial expression detected by the facial expression detection unit.

（付記９）
付記１から８のいずれか１つに記載の年齢推定装置と、
前記被写体を撮像することにより前記被写体画像を取得する撮像部と、
前記年齢推定部による前記被写体の年齢の推定結果を出力する出力部と、
を備えることを特徴とする撮像装置。 (Appendix 9)
An imaging device comprising:
the age estimation device according to any one of Appendices 1 to 8;
an imaging unit that acquires the subject image by imaging the subject; and
an output unit that outputs the result of the age estimation unit estimating the age of the subject.

（付記１０）
被写体を含む被写体画像における、当該被写体の顔領域を検出するステップと、
前記顔領域から、前記被写体の顔の特徴を示す特徴量を生成するステップと、
前記顔領域から、前記被写体の表情の度合を検出するステップと、
前記特徴量と、前記表情の度合と、に基づいて、前記被写体の年齢を推定するステップと、
を含むことを特徴とする年齢推定方法。 (Appendix 10)
An age estimation method comprising:
a step of detecting a face area of a subject in a subject image including the subject;
a step of generating a feature amount indicating features of the face of the subject from the face area;
a step of detecting the degree of facial expression of the subject from the face area; and
a step of estimating the age of the subject based on the feature amount and the degree of facial expression.
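
The four steps of the method in Appendix 10 can be sketched as a small pipeline. The class name `AgeEstimationPipeline`, the bounding-box representation of the face area, and the choice to return `None` when no face is found are assumptions made for illustration; the individual steps are passed in as callables because the appendix does not fix how each one is implemented.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]  # assumed face-area format: x, y, width, height


@dataclass
class AgeEstimationPipeline:
    detect_face: Callable[[object], Optional[Box]]
    generate_features: Callable[[object, Box], Sequence[float]]
    detect_expression: Callable[[object, Box], float]
    estimate_age: Callable[[Sequence[float], float], float]

    def run(self, image: object) -> Optional[float]:
        # Step 1: detect the face area of the subject.
        face = self.detect_face(image)
        if face is None:
            return None  # assumed behavior when no face is detected
        # Step 2: generate the feature amount from the face area.
        features = self.generate_features(image, face)
        # Step 3: detect the degree of facial expression from the face area.
        degree = self.detect_expression(image, face)
        # Step 4: estimate the age from both the features and the degree.
        return self.estimate_age(features, degree)
```

Steps 2 and 3 both read the same detected face area, and only step 4 combines their outputs.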

（付記１１）
コンピュータに、
被写体を含む被写体画像における、当該被写体の顔領域を検出する機能、
前記顔領域から、前記被写体の顔の特徴を示す特徴量を生成する機能、
前記顔領域から、前記被写体の表情の度合を検出する機能、
前記特徴量と、前記表情の度合と、に基づいて、前記被写体の年齢を推定する機能、
を実現させるためのプログラム。 (Appendix 11)
A program for causing a computer to realize:
a function of detecting a face area of a subject in a subject image including the subject;
a function of generating a feature amount indicating features of the face of the subject from the face area;
a function of detecting the degree of facial expression of the subject from the face area; and
a function of estimating the age of the subject based on the feature amount and the degree of facial expression.

１，１ａ…撮像装置、１０…撮像部、１１…光学レンズ、１２…イメージセンサ、２０…データ処理部、２１…メモリ、２２…ビデオ出力部、２３…記憶部、２４…ＣＰＵ、３０…ユーザインタフェース部、３１…表示部、３２…操作部、３３…外部インタフェース、３４…外部記憶媒体、４０…出力部、１００，１００ａ…年齢推定装置、１１０…顔検出部、１２０…特徴量生成部、１２１…部位抽出部、１２２…正規化部、１３０…表情検出部、１４０，１４０ａ…年齢推定部、１４１…笑顔年齢識別機、１４２…非笑顔年齢識別機、１４３…選択部、１４４…年齢算出部、１５０…分類部、１６０…学習部、２００ａ，２００ｂ，２００ｃ…学習用画像、２１０ａ，２１０ｂ，２１０ｃ…学習用被写体、２２０…顔領域、２２１…右目、２２２…左目、２２３…口、２３０…正規化画像 DESCRIPTION OF SYMBOLS 1, 1a ... Imaging device, 10 ... Imaging unit, 11 ... Optical lens, 12 ... Image sensor, 20 ... Data processing unit, 21 ... Memory, 22 ... Video output unit, 23 ... Storage unit, 24 ... CPU, 30 ... User interface unit, 31 ... Display unit, 32 ... Operation unit, 33 ... External interface, 34 ... External storage medium, 40 ... Output unit, 100, 100a ... Age estimation device, 110 ... Face detection unit, 120 ... Feature amount generation unit, 121 ... Part extraction unit, 122 ... Normalization unit, 130 ... Facial expression detection unit, 140, 140a ... Age estimation unit, 141 ... Smile age identifier, 142 ... Non-smile age identifier, 143 ... Selection unit, 144 ... Age calculation unit, 150 ... Classification unit, 160 ... Learning unit, 200a, 200b, 200c ... Learning image, 210a, 210b, 210c ... Learning subject, 220 ... Face area, 221 ... Right eye, 222 ... Left eye, 223 ... Mouth, 230 ... Normalized image

Claims

An age estimation device comprising:
a face detection unit that detects a face area of a subject in a subject image including the subject;
a feature amount generation unit that generates a feature amount indicating features of the face of the subject from the face area detected by the face detection unit;
a facial expression detection unit that detects the degree of facial expression of the subject from the face area detected by the face detection unit; and
an age estimation unit that estimates the age of the subject based on the feature amount generated by the feature amount generation unit and the degree of facial expression detected by the facial expression detection unit.

The age estimation device according to claim 1, wherein
the feature amount generation unit includes:
a part extraction unit that extracts the position of a specific part in the face area detected by the face detection unit; and
a normalization unit that normalizes the subject image based on the position extracted by the part extraction unit, and
the feature amount generation unit generates the feature amount from the face area in the subject image after normalization by the normalization unit.
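
A normalization based on extracted part positions, as recited in claim 2, might for example rotate and scale the image so that the two eyes lie on a horizontal line at a fixed distance. The sketch below computes only the transform parameters and applies them to a point. The choice of the eyes as the specific parts, the target distance of 60 pixels, and the function names `eye_alignment` and `normalize_point` are assumptions for illustration, not details taken from the claim.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def eye_alignment(right_eye: Point, left_eye: Point,
                  target_distance: float = 60.0) -> Tuple[float, float]:
    """Return (angle, scale) that makes the eye line horizontal
    with the eyes target_distance apart."""
    dx = left_eye[0] - right_eye[0]
    dy = left_eye[1] - right_eye[1]
    distance = math.hypot(dx, dy)
    if distance == 0:
        raise ValueError("eye positions must be distinct")
    return math.atan2(dy, dx), target_distance / distance


def normalize_point(point: Point, origin: Point,
                    angle: float, scale: float) -> Point:
    """Map a point into normalized coordinates centered on origin."""
    x = point[0] - origin[0]
    y = point[1] - origin[1]
    # Rotate by -angle to undo the tilt of the eye line, then scale.
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)
    return (scale * (x * cos_a - y * sin_a),
            scale * (x * sin_a + y * cos_a))
```

Applying the same transform to every pixel coordinate would yield a normalized image in which faces of different size and tilt become comparable before feature generation.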

The age estimation device according to claim 1 or 2, wherein
the age estimation unit includes:
a plurality of age identification units, each of which identifies the age of the subject from the feature amount based on a different identification criterion; and
a selection unit that selects one of the plurality of age identification units according to the degree of facial expression detected by the facial expression detection unit, and
the age estimation unit estimates, as the age of the subject, the age identified by the age identification unit selected by the selection unit.

The age estimation device according to claim 3, wherein
the face detection unit detects a face area of a learning subject in each of a plurality of learning images, each of which includes the learning subject,
the feature amount generation unit generates, for each of the plurality of learning images, a feature amount indicating features of the face of the learning subject from the face area detected by the face detection unit,
the facial expression detection unit detects, for each of the plurality of learning images, the degree of facial expression of the learning subject from the face area detected by the face detection unit, and
the age estimation device further comprises:
a classification unit that classifies the plurality of learning images into a plurality of groups according to the degree of facial expression of the learning subject detected by the facial expression detection unit; and
a learning unit that learns, for each of the plurality of groups, the identification criterion based on the relationship between the age of the learning subject and the feature amount in the learning images classified into that group.

The age estimation device according to claim 1 or 2, wherein
the age estimation unit includes:
a first age identification unit that identifies the age of the subject from the feature amount based on a first identification criterion;
a second age identification unit that identifies the age of the subject from the feature amount based on a second identification criterion; and
an age calculation unit that calculates, based on the degree of facial expression detected by the facial expression detection unit, an age between the age identified by the first age identification unit and the age identified by the second age identification unit, and
the age estimation unit estimates, as the age of the subject, the age calculated by the age calculation unit.

The age estimation device according to claim 5, wherein
the face detection unit detects a face area of a learning subject in each of a plurality of learning images, each of which includes the learning subject,
the feature amount generation unit generates, for each of the plurality of learning images, a feature amount indicating features of the face of the learning subject from the face area detected by the face detection unit,
the facial expression detection unit detects, for each of the plurality of learning images, the degree of facial expression of the learning subject from the face area detected by the face detection unit,
the age estimation device further comprises:
a classification unit that classifies, among the plurality of learning images, learning images in which the degree of facial expression of the learning subject detected by the facial expression detection unit exceeds a predetermined threshold into a first group, and learning images in which that degree is equal to or less than the predetermined threshold into a second group; and
a learning unit that learns the first identification criterion based on the relationship between the age of the learning subject and the feature amount in the learning images classified into the first group, and learns the second identification criterion based on the relationship between the age of the learning subject and the feature amount in the learning images classified into the second group, and
the age calculation unit calculates, as the estimated age of the subject, an age that is closer to the age identified by the first age identification unit the greater the degree of facial expression of the subject detected by the facial expression detection unit.

The age estimation device according to any one of claims 1 to 6, wherein
the facial expression detection unit detects the degree of smile of the subject as the degree of facial expression.

The age estimation device according to any one of claims 1 to 7, wherein
the age estimation unit estimates a lower age as the age of the subject the greater the degree of facial expression detected by the facial expression detection unit.

An imaging device comprising:
the age estimation device according to any one of claims 1 to 8;
an imaging unit that acquires the subject image by imaging the subject; and
an output unit that outputs the result of the age estimation unit estimating the age of the subject.

An age estimation method comprising:
a step of detecting a face area of a subject in a subject image including the subject;
a step of generating a feature amount indicating features of the face of the subject from the face area;
a step of detecting the degree of facial expression of the subject from the face area; and
a step of estimating the age of the subject based on the feature amount and the degree of facial expression.

A program for causing a computer to realize:
a function of detecting a face area of a subject in a subject image including the subject;
a function of generating a feature amount indicating features of the face of the subject from the face area;
a function of detecting the degree of facial expression of the subject from the face area; and
a function of estimating the age of the subject based on the feature amount and the degree of facial expression.