JP4612806B2

JP4612806B2 - Image processing apparatus, image processing method, and imaging apparatus

Info

Publication number: JP4612806B2
Application number: JP2004167589A
Authority: JP
Inventors: 雄司金田; 優和真継; 克彦森; 裕輔御手洗
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2003-07-18
Filing date: 2004-06-04
Publication date: 2011-01-12
Anticipated expiration: 2024-06-04
Also published as: JP2005056388A

Description

The present invention relates to a technique for discriminating the facial expression of a face in an input image.

As one conventional facial expression recognition device, a technique for determining emotion from facial expressions has been disclosed (see, for example, Patent Document 1). Emotion here generally refers to feelings such as anger and sadness. According to this technique, predetermined expression elements are extracted from the individual facial features on the basis of association rules, and expression element information is extracted from these predetermined expression elements. The expression elements are the opening and closing of the eyes, movement of the eyebrows, movement of the forehead, raising and lowering of the lips, opening and closing of the lips, and raising and lowering of the lower lip. Each expression element is made up of expression element information; for eyebrow movement, for example, this is the inclination of the left eyebrow and the inclination of the right eyebrow.

Next, from the expression element information making up each obtained expression element, an expression element code that quantifies the expression element is calculated on the basis of predetermined expression element quantification rules. An emotion amount is then calculated for each emotion category by applying a predetermined emotion conversion formula to the predetermined expression element codes assigned to that category. Finally, the emotion category with the largest emotion amount is determined to be the emotion.
Japanese Patent Publication No. 2573126

The shape and length of each facial feature differ greatly between individuals. For example, in an expressionless image of a neutral face, a person whose outer eye corners naturally droop, or whose eyes are naturally narrow, may appear joyful when a single image is judged subjectively, even though that is the person's neutral face. Furthermore, face size and face orientation are not always constant across face images. When the face changes in size or rotates, the feature amounts needed to recognize the expression must be normalized according to the size variation or rotation of the face.

In addition, suppose the input is a time series of images of everyday scenes that contains not only expression scenes and expressionless (neutral-face) scenes but also non-expression scenes such as conversation. In that case a non-expression scene may be misjudged as an expression scene: for example, pronouncing “o” during conversation resembles a surprised expression, and pronouncing “i” or “e” resembles a joyful expression.

The present invention has been made in view of the above problems, and its object is to provide a technique that determines the facial expression in an image more accurately and is robust to individual differences and to variations in expression scenes. A further object is to provide a technique that determines the expression accurately even when the face changes in size or rotates.

To achieve the object of the present invention, an image processing apparatus of the present invention has, for example, the following configuration.

That is, the apparatus is characterized by comprising:
input means for inputting images of consecutive frames containing a face;
first feature amount calculation means for obtaining a feature amount for each of a preset group of parts of the face in the image of each of the consecutive frames;
second feature amount calculation means for obtaining a feature amount for each of the preset group of parts of a face in an image containing a face with a preset expression;
change amount calculation means for obtaining a change amount of the feature amount of each of the preset group of parts, based on the difference or ratio between the feature amount obtained by the first feature amount calculation means and the feature amount obtained by the second feature amount calculation means;
score calculation means for calculating a score for each of the preset group of parts, based on the change amount obtained for that part by the change amount calculation means;
first determination means for determining the facial expression in the image input by the input means, by comparing the distribution of the scores calculated by the score calculation means for the preset group of parts with the distribution of scores for the preset group of parts calculated for each expression; and
second determination means for determining, as the first expression, the facial expression in the image of every frame from the point at which the first determination means has determined that the facial expression in each of p consecutive frames is a first expression until the point at which the first determination means subsequently determines that the facial expression in each of q consecutive frames is a second expression different from the first expression.
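As a non-limiting illustration, the behavior of the second determination means can be sketched as a small hold-and-switch filter over the per-frame results of the first determination means. The function name `stabilize_expressions`, the use of `None` before any expression has been confirmed, and the rule that the held expression switches to the second expression once q consecutive frames agree are assumptions made for this sketch; the text above does not specify them.

```python
from typing import Iterable, List, Optional


def stabilize_expressions(raw: Iterable[str], p: int, q: int) -> List[Optional[str]]:
    """Hold an expression once it is seen in p consecutive frames, and keep
    reporting it until a different expression is seen in q consecutive frames.

    raw: per-frame labels from the first determination means (hypothetical strings).
    Returns one label per frame; None until the first expression is confirmed
    (an assumption of this sketch).
    """
    held: Optional[str] = None       # expression currently reported
    run_label: Optional[str] = None  # label of the current run of identical frames
    run_len = 0                      # length of that run
    out: List[Optional[str]] = []
    for label in raw:
        if label == run_label:
            run_len += 1
        else:
            run_label, run_len = label, 1
        if held is None:
            if run_len >= p:
                held = run_label     # first expression confirmed
        elif run_label != held and run_len >= q:
            held = run_label         # second expression confirmed (assumed switch)
        out.append(held)
    return out


frames = ["joy", "joy", "joy", "surprise", "joy",
          "surprise", "surprise", "surprise"]
smoothed = stabilize_expressions(frames, p=3, q=3)
```

In this example the isolated "surprise" in the fourth frame does not change the reported expression, because it is not sustained for q frames.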

With the configuration of the present invention, the facial expression in an image can be determined more accurately, in a manner robust to individual differences and to variations in expression scenes. Furthermore, the expression can be determined more accurately even when the face changes in size or rotates.

The present invention is described in detail below according to preferred embodiments, with reference to the accompanying drawings.

[First Embodiment]
FIG. 23 is a diagram showing the basic configuration of the image processing apparatus according to this embodiment.

Reference numeral 2301 denotes a CPU, which controls the entire apparatus using programs and data stored in the RAM 2302 and the ROM 2303, and also performs the facial expression discrimination processing described later.

Reference numeral 2302 denotes a RAM, which has an area for storing programs and data loaded from the external storage device 2307 or the storage medium drive device 2308, and also an area that the CPU 2301 needs for performing various kinds of processing.

Reference numeral 2303 denotes a ROM, which stores data and various programs such as a boot program.

Reference numerals 2304 and 2305 denote a keyboard and a mouse, respectively, which are used to input various instructions to the CPU 2301.

Reference numeral 2306 denotes a display device, which is formed by a CRT, a liquid crystal screen, or the like, and can display various kinds of information such as images and text.

Reference numeral 2307 denotes an external storage device, which functions as a large-capacity information storage device such as a hard disk drive. It stores the OS (operating system), the program for the facial expression discrimination processing described later, and various data. These programs and data are read into the RAM 2302 in response to instructions from the CPU 2301.

Reference numeral 2308 denotes a storage medium drive device, which reads programs and data recorded on a storage medium such as a CD-ROM or DVD-ROM and outputs them to the external storage device 2307 or the RAM 2302. The program and data for the facial expression discrimination processing described later may instead be recorded on such a storage medium and read into the RAM 2302 as needed under the control of the CPU 2301.

Reference numeral 2309 denotes an I/F for connecting an external device to this apparatus; data communication with the external device is performed through this I/F.

Reference numeral 2310 denotes a bus that connects the above units.

FIG. 1 is a block diagram showing the functional configuration of the image processing apparatus according to this embodiment.

The functional configuration of the image processing apparatus comprises the following units: an image input unit 100 that inputs a plurality of images consecutively in time series; a feature amount extraction unit 101 that extracts, from an image input by the image input unit 100 (the input image), the feature amounts needed to discriminate the expression; a reference feature holding unit 102 that extracts and holds the reference features needed to recognize the expression from a reference face with a neutral (expressionless) face prepared in advance; a feature amount change amount calculation unit 103 that calculates the change amount of each facial feature amount relative to the reference face by computing the difference between the feature amounts extracted by the feature amount extraction unit 101 and the feature amounts held in the reference feature holding unit 102; a score calculation unit 104 that calculates a score for each feature from the change amount of that feature obtained by the feature amount change amount calculation unit 103; and a facial expression determination unit 105 that determines the facial expression in the input image from the sum of the scores calculated by the score calculation unit 104.

The units shown in FIG. 1 may be implemented in hardware. In this embodiment, however, the image input unit 100, feature amount extraction unit 101, feature amount change amount calculation unit 103, score calculation unit 104, and facial expression determination unit 105 are implemented as a program. This program is stored in the RAM 2302, and the CPU 2301 executes it to realize the function of each unit. The reference feature holding unit 102 is a predetermined area in the RAM 2302, but it may instead be an area in the external storage device 2307.

Each unit shown in FIG. 1 is described in more detail below.

The image input unit 100 inputs, as input images, time-series face images obtained by cutting out a moving image from a video camera or the like frame by frame. That is, in the configuration of FIG. 23, the image data of each frame is output sequentially from a video camera or the like connected to the I/F 2309 to the RAM 2302 through the I/F 2309.

As shown in FIG. 2, the feature amount extraction unit 101 comprises an eye/mouth/nose position extraction unit 110, an edge image generation unit 111, a facial feature edge extraction unit 112, a facial feature point extraction unit 113, and an expression feature amount extraction unit 114. FIG. 2 is a block diagram showing the functional configuration of the feature amount extraction unit 101.

Each unit shown in FIG. 2 is described in more detail below.

The eye/mouth/nose position extraction unit 110 determines the positions (in the input image) of predetermined parts of the face, namely the eyes, mouth, and nose, from the image input by the image input unit 100 (the input image). One usable method is to prepare a template for each of the eyes, mouth, and nose, extract eye, mouth, and nose candidates by template matching, and then detect the eye, mouth, and nose positions using the spatial arrangement of the candidates obtained by template matching together with skin color information, which is color information. The detected eye and mouth position data is output to the facial feature edge extraction unit 112 in the subsequent stage.

Next, the edge image generation unit 111 extracts edges from the input image obtained by the image input unit 100, applies edge dilation to the extracted edges, and then applies thinning, thereby generating an edge image. For example, a Sobel filter can be used for edge extraction, 8-neighbor dilation for edge dilation, and Hilditch's thinning for thinning. The purpose of the dilation and thinning is to join fragmented edges by dilating them and then thin the result, so that the edge scanning and feature point extraction described later proceed smoothly. The generated edge image is output to the facial feature edge extraction unit 112 in the subsequent stage.
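The first two stages of this edge image generation can be sketched as follows, assuming a grayscale image stored as a list of rows of numbers. The function names `sobel_edges` and `dilate8`, the gradient magnitude approximation |gx| + |gy|, and the `threshold` parameter are assumptions made for illustration; the thinning stage (for example Hilditch's thinning) is omitted from this sketch.

```python
from typing import List

Image = List[List[int]]


def sobel_edges(img: Image, threshold: int) -> Image:
    """Binary edge map: 1 where the Sobel response |gx| + |gy| reaches threshold.

    Border pixels are left at 0 because the 3x3 kernel does not fit there.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]) \
               - (img[y - 1][x - 1] + 2 * img[y][x - 1] + img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]) \
               - (img[y - 1][x - 1] + 2 * img[y - 1][x] + img[y - 1][x + 1])
            if abs(gx) + abs(gy) >= threshold:
                out[y][x] = 1
    return out


def dilate8(binary: Image) -> Image:
    """8-neighbor dilation: every edge pixel also sets its 8 neighbors to 1."""
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if binary[y][x]:
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w:
                            out[ny][nx] = 1
    return out


# A vertical step from dark (0) to bright (100) yields a vertical edge.
step = [[0, 0, 100, 100, 100] for _ in range(5)]
edges = sobel_edges(step, threshold=200)

# Dilation joins two edge fragments separated by a two-pixel gap.
joined = dilate8([[1, 0, 0, 1]])
```

The second example shows why dilation precedes thinning: the fragments merge into one connected edge, which thinning would then reduce back to a one-pixel-wide line.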

Using the eye and mouth position data detected by the eye/mouth/nose position extraction unit 110 and the edge image from the edge image generation unit 111, the facial feature edge extraction unit 112 determines the eye region, cheek region, and mouth region in the edge image, as shown in FIG. 3.

The eye region is set so that it contains only the eyebrow and eye edges, the cheek region so that it contains only the cheek and nose edges, and the mouth region so that it contains only the upper lip, tooth, and lower lip edges.

An example of the processing for setting these regions is described here.

Vertically, the eye region extends from the midpoint between the left eye position detection result and the right eye position detection result, both obtained from template matching and the spatial arrangement, upward by 0.5 times the distance between the detected left and right eye positions and downward by 0.3 times that distance.

Horizontally, the eye region extends from the same midpoint to the left and to the right by the distance between the detected left and right eye positions in each direction.

In other words, the vertical side of the eye region is 0.8 times the distance between the detected left and right eye positions, and the horizontal side is 2 times that distance.

Vertically, the mouth region extends from the mouth position detection result obtained from template matching and the spatial arrangement, upward by 0.75 times the distance between the detected nose position and the detected mouth position, and downward by 0.25 times the distance between the midpoint of the left and right eye position detection results and the mouth position detection result. Horizontally, the mouth region extends from the mouth position detection result to the left and to the right by 0.8 times the distance between the detected left and right eye positions in each direction.

Vertically, the cheek region is centered on the midpoint between the mouth position detection result and the midpoint of the left and right eye position detection results, all obtained from template matching and the spatial arrangement (this is a point near the center of the face). From this point the region extends upward and downward by 0.25 times the distance between the midpoint of the left and right eye position detection results and the mouth position detection result in each direction.

Horizontally, the cheek region extends from the same point near the center of the face to the left and to the right by 0.6 times the distance between the detected left and right eye positions in each direction.

In other words, the vertical side of the cheek region is 0.5 times the distance between the midpoint of the left and right eye position detection results and the mouth position detection result, and the horizontal side is 1.2 times the distance between the detected left and right eye positions.
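The region setting described above can be sketched as a single function that turns the detected eye, nose, and mouth positions into three rectangles. The names `Box` and `face_regions`, the use of Euclidean distance between detected positions, and image coordinates with y increasing downward are assumptions made for this sketch.

```python
import math
from typing import Dict, NamedTuple, Tuple

Point = Tuple[float, float]  # (x, y), with y increasing downward (assumed)


class Box(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def _dist(a: Point, b: Point) -> float:
    return math.hypot(a[0] - b[0], a[1] - b[1])


def _mid(a: Point, b: Point) -> Point:
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)


def face_regions(left_eye: Point, right_eye: Point,
                 nose: Point, mouth: Point) -> Dict[str, Box]:
    """Eye, mouth, and cheek rectangles using the multipliers given in the text."""
    eye_d = _dist(left_eye, right_eye)      # distance between the eye positions
    eye_mid = _mid(left_eye, right_eye)     # midpoint between the eyes
    eye_mouth_d = _dist(eye_mid, mouth)     # eye midpoint to mouth
    nose_mouth_d = _dist(nose, mouth)       # nose to mouth
    center = _mid(eye_mid, mouth)           # point near the center of the face
    return {
        "eye": Box(eye_mid[0] - eye_d, eye_mid[1] - 0.5 * eye_d,
                   eye_mid[0] + eye_d, eye_mid[1] + 0.3 * eye_d),
        "mouth": Box(mouth[0] - 0.8 * eye_d, mouth[1] - 0.75 * nose_mouth_d,
                     mouth[0] + 0.8 * eye_d, mouth[1] + 0.25 * eye_mouth_d),
        "cheek": Box(center[0] - 0.6 * eye_d, center[1] - 0.25 * eye_mouth_d,
                     center[0] + 0.6 * eye_d, center[1] + 0.25 * eye_mouth_d),
    }


# Hypothetical detected positions.
regions = face_regions(left_eye=(40, 50), right_eye=(80, 50),
                       nose=(60, 80), mouth=(60, 100))
```

With an eye distance of 40, the eye rectangle in this example is 80 wide and 32 high, matching the factors of 2 and 0.8 stated above.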

As a result of the above region setting, in the eye region, as shown in FIG. 3, the first edges from the top (edges 120 and 121) are determined to be eyebrow edges and the second edges (edges 122 and 123) to be eye edges. In the mouth region, when the mouth is closed, as shown in FIG. 3, the first edge from the top (edge 126) is determined to be the upper lip edge and the second edge (edge 127) the lower lip edge. When the mouth is open, the first edge from the top is determined to be the upper lip edge, the second the tooth edge, and the third the lower lip edge.
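The top-to-bottom labeling of mouth edges can be sketched as follows. Representing each edge by a single vertical coordinate, and treating two edges as a closed mouth and three as an open mouth, are simplifying assumptions of this sketch; `label_mouth_edges` is a hypothetical name.

```python
from typing import Dict, List


def label_mouth_edges(edge_ys: List[float]) -> Dict[str, int]:
    """Map edge names to indices into edge_ys, ordered from the top of the image.

    edge_ys: one representative y coordinate per edge (y increases downward).
    """
    if len(edge_ys) == 2:        # mouth closed
        names = ["upper_lip", "lower_lip"]
    elif len(edge_ys) == 3:      # mouth open
        names = ["upper_lip", "teeth", "lower_lip"]
    else:
        raise ValueError("expected 2 or 3 edges in the mouth region")
    order = sorted(range(len(edge_ys)), key=lambda i: edge_ys[i])
    return {name: idx for name, idx in zip(names, order)}


closed = label_mouth_edges([30, 10])
opened = label_mouth_edges([12, 20, 5])
```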

The facial feature edge extraction unit 112 generates these determination results as data indicating which of the eye region, cheek region, and mouth region each of the three regions is, together with data on the position and size of each region, and outputs them along with the edge image to the facial feature point extraction unit 113.

Using the data input from the facial feature edge extraction unit 112, the facial feature point extraction unit 113 scans the edges within the eye region, cheek region, and mouth region of the edge image to detect the feature points described below.

FIG. 4 is a diagram showing the feature points detected by the facial feature point extraction unit 113. As shown in the figure, the feature points are the end points of each edge and the midpoint on the edge between the end points. The end points of an edge can be found by referring to the pixel values (here, a pixel that forms part of an edge has the value 1 and any other pixel has the value 0) and finding the maximum and minimum horizontal coordinates of the edge pixels. The midpoint can be found simply as the position on the edge whose horizontal coordinate is the midpoint of the horizontal coordinates of the two end points.
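A minimal sketch of this end point and midpoint search is given below, assuming that the pixels of one edge have already been collected as a list of (x, y) coordinates. The name `edge_feature_points` and the tie-breaking rule (the first pixel in list order wins) are assumptions.

```python
from typing import List, Tuple

Pixel = Tuple[int, int]  # (x, y)


def edge_feature_points(pixels: List[Pixel]) -> Tuple[Pixel, Pixel, Pixel]:
    """Return (min-x end point, midpoint on the edge, max-x end point)."""
    if not pixels:
        raise ValueError("edge has no pixels")
    end_min = min(pixels, key=lambda p: p[0])   # smallest horizontal coordinate
    end_max = max(pixels, key=lambda p: p[0])   # largest horizontal coordinate
    mid_x = (end_min[0] + end_max[0]) / 2
    # Edge pixel whose horizontal coordinate is closest to the midpoint.
    mid = min(pixels, key=lambda p: abs(p[0] - mid_x))
    return end_min, mid, end_max


# A small arch-shaped edge, such as an upper eyelid.
arch = [(10, 20), (11, 19), (12, 18), (13, 19), (14, 20)]
points = edge_feature_points(arch)
```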

The facial feature point extraction unit 113 obtains the position information of these end points as feature point information, and outputs the eye feature point information (the positions of the feature points of each edge in the eye region) and the mouth feature point information (the positions of the feature points of each edge in the mouth region) along with the edge image to the expression feature amount extraction unit 114 in the subsequent stage.

Feature point extraction is not limited to edge scanning. As with the detection of the eye, mouth, and nose positions, templates that locate the end points of the eyes, mouth, and nose may be used instead.

From the feature point information obtained by the facial feature point extraction unit 113, the expression feature amount extraction unit 114 calculates the feature amounts needed for expression discrimination, such as “edge density around the forehead”, “shape of the eyebrow edge”, “distance between the left and right eyebrow edges”, “distance between the eyebrow edge and the eye edge”, “distance between the eye end point and the mouth end point”, “length of the eye line edge”, “shape of the eye line edge”, “edge density around the cheek”, “length of the mouth line edge”, and “shape of the mouth line edge”.

Here, the “distance between the eye end point and the mouth end point” is the vertical distance from the coordinate position of feature point 136 (right end point of the right eye) in FIG. 4 to that of feature point 147 (right end point of the lips), and likewise the vertical distance from feature point 141 (left end point of the left eye) to feature point 149 (left end point of the lips).

The “length of the eye line edge” is the horizontal distance from feature point 136 (right end point of the right eye) in FIG. 4 to feature point 138 (left end point of the right eye), or the horizontal distance from feature point 139 (right end point of the left eye) to feature point 141 (left end point of the left eye).
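These two distance features reduce to coordinate differences along one axis, as the following sketch shows. The coordinates assigned to feature points 136, 138, and 147 are hypothetical values chosen for illustration, and the helper names are assumptions.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]  # (x, y)


def vertical_distance(a: Point, b: Point) -> float:
    """Distance between two feature points measured along the y axis only."""
    return abs(a[1] - b[1])


def horizontal_distance(a: Point, b: Point) -> float:
    """Distance between two feature points measured along the x axis only."""
    return abs(a[0] - b[0])


# Hypothetical coordinates keyed by the feature point numerals of FIG. 4.
points: Dict[int, Point] = {136: (30, 50), 138: (50, 52), 147: (35, 100)}

eye_to_mouth = vertical_distance(points[136], points[147])   # eye end to mouth end
eye_length = horizontal_distance(points[136], points[138])   # eye line edge length
```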

For the “shape of the eye line edge”, as shown in FIG. 5, the line segment (straight line) 150 defined by feature point 136 (right end point of the right eye) and feature point 137 (midpoint of the right eye) and the line segment (straight line) 151 defined by feature point 137 and feature point 138 (left end point of the right eye) are calculated, and the shape is determined from the slopes of these two lines 150 and 151.

The shape of the left eye's line edge is obtained in the same way; only the feature points differ. That is, the slope of the line segment defined by feature point 139 (right end point of the left eye) and feature point 140 (midpoint of the left eye) and the slope of the line segment defined by feature point 140 and feature point 141 (left end point of the left eye) are obtained, and the shape is determined from them in the same way.
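The two slopes used for the shape determination can be computed as in the sketch below. The function names are hypothetical, and how the pair of slopes is mapped to a shape category is not specified at this point in the text, so the sketch stops at returning the slopes.

```python
from typing import Tuple

Point = Tuple[float, float]  # (x, y), with y increasing downward (assumed)


def segment_slope(a: Point, b: Point) -> float:
    """Slope dy/dx of the line through a and b."""
    dx = b[0] - a[0]
    if dx == 0:
        raise ValueError("vertical segment has no finite slope")
    return (b[1] - a[1]) / dx


def eye_edge_slopes(end_a: Point, mid: Point, end_b: Point) -> Tuple[float, float]:
    """Slopes of the two segments end_a-mid and mid-end_b of an eye line edge."""
    return segment_slope(end_a, mid), segment_slope(mid, end_b)


# Hypothetical end point, midpoint, and end point of one eye edge.
slopes = eye_edge_slopes((10, 20), (12, 18), (14, 20))
```

In image coordinates, a negative first slope followed by a positive second slope, as in this example, corresponds to an edge that rises toward its midpoint.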

The “edge density around the cheek” represents the number of edge pixels in the cheek region. Raising the cheek muscles produces wrinkles, which give rise to various edges of different lengths and thicknesses. As a measure of the amount of these edges, the number of pixels forming them (pixels whose value is 1) is counted, and the density is obtained by dividing this count by the number of pixels in the cheek region.
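The density computation can be sketched as follows for a binary edge image stored as a list of rows. The name `edge_density` and the half-open integer rectangle `(left, top, right, bottom)` are assumptions of this sketch.

```python
from typing import List, Tuple


def edge_density(edge_image: List[List[int]],
                 box: Tuple[int, int, int, int]) -> float:
    """Fraction of pixels with value 1 inside box = (left, top, right, bottom).

    right and bottom are exclusive, so the box covers
    (right - left) * (bottom - top) pixels.
    """
    left, top, right, bottom = box
    area = (right - left) * (bottom - top)
    if area <= 0:
        raise ValueError("box must have positive area")
    count = sum(edge_image[y][x]
                for y in range(top, bottom)
                for x in range(left, right))
    return count / area


cheek_edges = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
density = edge_density(cheek_edges, (0, 0, 4, 2))
```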

The “length of the mouth line edge” is obtained by scanning all edges in the mouth region. If, among the edge pixels, the pixel with the smallest horizontal coordinate is taken as feature point 147 (right end point of the mouth) and the pixel with the largest horizontal coordinate as feature point 149 (left end point of the mouth), the length is the distance from the coordinate position of feature point 147 to that of feature point 149.

As described above, distances between end points, slopes of line segments defined by two end points, and edge densities are obtained as feature amounts. In other words, this processing obtains feature amounts such as the length and shape of the edges of each part. In the following, these edge lengths and shapes may therefore be referred to collectively as “edge feature amounts”.

In this way, the feature amount extraction unit 101 can obtain each feature amount from the input image.

Returning to FIG. 1, before the facial expression discrimination processing is performed, the reference feature holding unit 102 holds the feature amounts of an expressionless face, detected in advance from an image of a neutral, expressionless face by the feature amount detection processing performed by the feature amount extraction unit 101.

In the processing described below, therefore, the apparatus determines how much the feature amounts detected by the feature amount extraction unit 101 from the edge image of the input image have changed from the feature amounts held in the reference feature holding unit 102, and discriminates the facial expression in the input image according to this change amount. The feature amounts held in the reference feature holding unit 102 may hereinafter be referred to as “reference feature amounts”.

まず、特徴量変化量算出部１０３は、特徴量抽出部１０１が入力画像のエッジ画像から上記特徴量検出処理によって検出した特徴量と、参照特徴保持部１０２が保持する特徴量との差分を計算する。例えば、特徴量抽出部１０１が入力画像のエッジ画像から上記特徴量検出処理によって検出した「眼の端点と口の端点の距離」と、参照特徴保持部１０２が保持する「眼の端点と口の端点の距離」との差分を計算し、特徴量の変化量する。このような差分計算を各特徴量毎に求めることは、換言すれば、各部位の特徴量の変化を求めることになる。 First, the feature amount change amount calculation unit 103 calculates a difference between the feature amount detected by the feature amount extraction unit 101 from the edge image of the input image by the feature amount detection process and the feature amount held by the reference feature holding unit 102. To do. For example, the “distance between the eye end point and the mouth end point” detected by the feature amount extracting unit 101 from the edge image of the input image by the feature amount detecting process, and the “eye end point and mouth end point” held by the reference feature holding unit 102. The difference from the “endpoint distance” is calculated, and the amount of change in the feature amount is calculated. Obtaining such a difference calculation for each feature amount, in other words, obtaining a change in the feature amount of each part.

When these differences are calculated, each difference is naturally taken between values of the same feature (for example, between the detected "distance between the eye end point and the mouth end point" and the held "distance between the eye end point and the mouth end point"), so the detected feature amounts and the reference feature amounts must be associated with each other. The method of association is not particularly limited.

The reference feature amounts may differ greatly from user to user, so reference feature amounts that suit one user may not suit another. The reference feature holding unit 102 may therefore store reference feature amounts for a plurality of users. In that case, information indicating whose face image is to be input is entered before the image is input from the image input unit 100, and the feature amount change amount calculation unit 103 selects the reference feature amounts on the basis of this information when it performs its processing. The differences can then be calculated using the reference feature amounts of the individual user, which further improves the accuracy of the facial expression determination process described later.

Alternatively, instead of reference feature amounts for each user, the reference feature holding unit 102 may hold the feature amounts of an expressionless average face, detected from an image of that face by the feature amount detection process performed by the feature amount extraction unit 101.

The change amount data obtained in this way by the feature amount change amount calculation unit 103, which indicate the change in the feature amount of each facial part, are output to the score calculation unit 104 in the subsequent stage.
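The difference calculation described above can be sketched as follows. This is only an illustration under stated assumptions: the dictionary layout, the feature names, the user identifiers, and the numeric values are hypothetical, since the description leaves the method of associating detected and reference feature amounts open. Normalization by the reference value is omitted here.

```python
from typing import Dict, Optional

# Hypothetical reference feature amounts (neutral face), keyed by user.
# "average" stands in for the expressionless average face.
REFERENCE_FEATURES: Dict[str, Dict[str, float]] = {
    "user_a": {"eye_mouth_distance": 60.0, "eye_edge_length": 30.0},
    "average": {"eye_mouth_distance": 58.0, "eye_edge_length": 28.0},
}


def feature_changes(detected: Dict[str, float],
                    user_id: Optional[str] = None) -> Dict[str, float]:
    """Return detected minus reference for each feature, matched by name."""
    # Fall back to the average face when no per-user reference is stored.
    reference = REFERENCE_FEATURES.get(user_id, REFERENCE_FEATURES["average"])
    changes = {}
    for name, value in detected.items():
        if name not in reference:
            raise KeyError(f"no reference feature amount for {name!r}")
        changes[name] = value - reference[name]
    return changes


print(feature_changes({"eye_mouth_distance": 54.0, "eye_edge_length": 33.0},
                      "user_a"))
```

Matching by feature name is one simple way to guarantee that a difference is only ever taken between values of the same feature.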

The score calculation unit 104 calculates scores on the basis of the amount of change of each feature amount and "weights" that are obtained in advance and held in a memory (for example, the RAM 2302). The weights are set by analyzing in advance, for each facial part, factors such as individual differences in the amount of change, and assigning an appropriate weight to each feature amount according to the result of this analysis.

For example, features whose amount of change is relatively small, such as the length of the eye edge, and features whose amount of change varies between individuals, such as wrinkles, are given a small weight, whereas features whose amount of change varies little between individuals, such as the distance between the end points of the eye and the mouth, are given a large weight.

FIG. 6 is an example of a graph referred to in order to calculate a score from the amount of change in the length of the eye edge, a feature whose amount of change varies between individuals.

The horizontal axis represents the amount of change of the feature amount (hereinafter, a value normalized by the feature amount of the reference face), and the vertical axis represents the score. For example, if the amount of change in the length of the eye edge is 0.4, the graph gives a score of 50 points. Even if the amount of change in the length of the eye edge is 1.2, the score is still 50 points, the same as when the amount of change is 0.3. The weighting is thus set so that the difference in score stays small even when the amount of change differs greatly because of individual differences.

FIG. 7 is a graph referred to in order to calculate a score from the amount of change in the distance between the end points of the eye and the mouth, a feature whose amount of change shows no individual differences.

As in FIG. 6, the horizontal axis represents the amount of change of the feature amount and the vertical axis represents the score. For example, if the amount of change in the distance between the end points of the eye and the mouth is 1.1, the graph gives 50 points, and if it is 1.3, the graph gives 55 points. In other words, the weighting is set so that the difference in score becomes large when the amount of change differs greatly because of individual differences.

That is, the "weight" corresponds to the ratio between the width of a change amount interval and the width of the corresponding score interval used when the score calculation unit 104 calculates a score. Setting a weight for each feature amount in this way absorbs individual differences in the amount of change. Moreover, because the facial expression determination does not depend on a single feature alone, false detections and missed detections are reduced and the facial expression discrimination (recognition) rate is improved.

The RAM 2302 holds the data of the graphs shown in FIGS. 6 and 7, that is, data indicating the correspondence between the amount of change of a feature amount and the score, and the scores are calculated using these data.
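As an illustration of how such correspondence data might be looked up, the following sketch stores each graph as a step table. The bin boundaries and most score values are hypothetical; only the sample points quoted in the text (changes of 0.3, 0.4, and 1.2 giving 50 points for the eye edge length, and 1.1 and 1.3 giving 50 and 55 points for the eye-to-mouth distance) are taken from the description. The actual shape of the graphs in the figures is not reproduced here.

```python
from bisect import bisect_right

# Hypothetical step tables: (upper bounds of change-amount bins, score per bin).
# A wide bin mapped to one score corresponds to a small weight; narrow bins
# with distinct scores correspond to a large weight.
SCORE_TABLES = {
    # Individual differences are large: one wide bin, so a small weight.
    "eye_edge_length": ([0.3, 1.3], [0, 50, 100]),
    # Individual differences are small: narrow bins, so a large weight.
    "eye_mouth_distance": ([1.0, 1.2, 1.4], [45, 50, 55, 60]),
}


def score(feature: str, change: float) -> int:
    """Look up the score for a normalized change amount of one feature."""
    bounds, scores = SCORE_TABLES[feature]
    # bisect_right returns the index of the bin that contains `change`.
    return scores[bisect_right(bounds, change)]


for feature, change in [("eye_edge_length", 0.4), ("eye_edge_length", 1.2),
                        ("eye_mouth_distance", 1.1), ("eye_mouth_distance", 1.3)]:
    print(feature, change, score(feature, change))
```

With these tables, a change of 0.4 and a change of 1.2 in the eye edge length fall into the same bin, whereas a change of 0.2 in the eye-to-mouth distance already moves the score into the next bin.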

The score data obtained for each feature amount by the score calculation unit 104 are output to the facial expression determination unit 105 in the subsequent stage, together with data indicating which feature amount each score belongs to.

Before the facial expression determination process is performed, the RAM 2302 holds in advance, for each facial expression, the score data for each feature amount obtained by the above processing of the score calculation unit 104.

Accordingly, the facial expression determination unit 105 discriminates the facial expression by performing:
1. a comparison of the total of the scores for the feature amounts with a predetermined threshold, and
2. a comparison of the distribution of the scores for the feature amounts with the score distribution held for each facial expression.

For example, a facial expression showing joy exhibits features such as:
1. the outer corners of the eyes move down,
2. the cheek muscles are raised, and
3. the corners of the mouth are raised.
As shown in FIG. 9, the calculated scores are therefore very high for the "distance between the eye end point and the mouth end point", the "edge density around the cheeks", and the "length of the line edge of the mouth", followed by the "length of the line edge of the eye" and the "shape of the line edge of the eye", which are also higher than those of the other feature amounts. The result is a score distribution peculiar to the expression of joy. Other facial expressions likewise have their own characteristic score distributions. FIG. 9 is a diagram showing the score distribution for the expression of joy.

Accordingly, the facial expression determination unit 105 identifies which facial expression has the characteristic score distribution whose shape is closest to the shape of the distribution formed by the scores of the feature amounts obtained by the score calculation unit 104. The facial expression represented by the closest score distribution is the facial expression to be output as the determination result.

One way to find the score distribution with the closest shape is, for example, to model the shape of each distribution parametrically by a Gaussian mixture approximation and to evaluate the similarity between the obtained score distribution and the score distribution provided for each facial expression by comparing distances in the parameter space. The facial expression represented by the score distribution with the highest similarity to the obtained score distribution (that is, the smallest distance) is taken as the determination candidate.

Next, it is determined whether the total of the scores of the feature amounts obtained by the score calculation unit 104 is equal to or greater than a threshold. This comparison is particularly effective for correctly telling a genuine expression scene apart from a non-expression scene that resembles it. If the total is equal to or greater than the predetermined threshold, the candidate is adopted as the finally determined facial expression. If the total is smaller than the predetermined threshold, the candidate is discarded and the face in the input image is determined to be expressionless or to show a non-expression.

In the comparison of score distribution shapes, if the similarity is equal to or less than a certain value, the face in the input image may be determined at that point to be expressionless or a non-expression, and the processing may be ended without comparing the total of the scores obtained by the score calculation unit 104 with the threshold.
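The two-stage determination described above (closest score distribution, then total score against a threshold) can be sketched as follows. Note the substitution: instead of the Gaussian mixture parameterization mentioned in the text, this sketch simply uses the Euclidean distance between score vectors as the measure of closeness. The template scores, the threshold values, and the feature order are all hypothetical.

```python
import math

# Order of the per-feature scores in every vector (hypothetical choice).
FEATURES = ["eye_mouth_distance", "cheek_edge_density", "mouth_edge_length",
            "eye_edge_length", "eye_edge_shape"]

# Hypothetical characteristic score distributions held for each expression.
TEMPLATES = {
    "joy": [90, 85, 90, 60, 60],
    "surprise": [30, 20, 70, 90, 85],
}
MAX_DISTANCE = 80.0   # beyond this, no template is considered similar
SUM_THRESHOLD = 300   # minimum total score for a genuine expression


def classify(scores):
    """Return the expression name, or "non-expression" if either test fails."""
    best, best_distance = None, math.inf
    for expression, template in TEMPLATES.items():
        distance = math.sqrt(sum((s - t) ** 2 for s, t in zip(scores, template)))
        if distance < best_distance:
            best, best_distance = expression, distance
    # Optional early exit: the closest distribution is still too dissimilar.
    if best_distance > MAX_DISTANCE:
        return "non-expression"
    # The candidate is kept only when the total score reaches the threshold.
    if sum(scores) < SUM_THRESHOLD:
        return "non-expression"
    return best


print(classify([88, 80, 92, 55, 65]))
```

The early exit corresponds to ending the processing when the similarity is too low, and the final test corresponds to discarding the candidate when the total score is below the threshold.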

FIG. 8 is a flowchart of the determination process used to determine whether the facial expression in the input image is a "specific expression", using the scores obtained for the feature amounts by the score calculation unit 104.

First, the facial expression determination unit 105 determines whether the shape of the distribution formed by the scores of the feature amounts obtained by the score calculation unit 104 is close to the shape of the score distribution characteristic of the specific expression (step S801). For example, if the similarity between the obtained score distribution and the score distribution of the specific expression is equal to or greater than a predetermined value, the shapes are determined to be close.

If the shapes are determined to be close, the process proceeds to step S802, where it is determined whether the total of the scores of the feature amounts obtained by the score calculation unit 104 is equal to or greater than a predetermined threshold (step S802). If the total is determined to be equal to or greater than the threshold, the facial expression in the input image is determined to be the "specific expression", and this determination result is output.

If, on the other hand, the shapes are determined not to be close in step S801, or the total is determined to be smaller than the threshold in step S802, the process proceeds to step S804, and data indicating that the input image is a non-expression image or an expressionless image are output (step S804).

In the present embodiment, the facial expression determination process performs both the comparison of the total of the scores for the feature amounts with a predetermined threshold and the comparison of the distribution of the scores for the feature amounts with the score distribution of each facial expression. The process is not limited to this, however, and only one of the two comparisons may be performed.

According to the present embodiment, because both the comparison of score distributions and the comparison of the score total with the threshold are performed as described above, which facial expression the face in the input image shows can be determined more accurately. It can also be determined whether the facial expression in the input image is a specific facial expression.

[Second Embodiment]
FIG. 10 is a block diagram showing the functional configuration of the image processing apparatus according to the present embodiment. Parts identical to those in FIG. 1 are given the same numbers, and their description is omitted. The basic configuration of the image processing apparatus according to the present embodiment is the same as that of the first embodiment, that is, the same as that shown in FIG. 23.

The image processing apparatus according to the present embodiment is described below. Its functional configuration differs from that of the image processing apparatus according to the first embodiment in the facial expression determination unit 165. The facial expression determination unit 165 is therefore described in detail below.

FIG. 11 is a block diagram showing the functional configuration of the facial expression determination unit 165. As shown in the figure, the facial expression determination unit 165 consists of a facial expression possibility determination unit 170 and a facial expression determination unit 171, the latter being the unit that finalizes the expression.

The facial expression possibility determination unit 170 performs the same facial expression determination process as in the first embodiment, using the score distribution formed by the scores of the feature amounts obtained from the score calculation unit 104 and the total of those scores, and treats the result as a "facial expression possibility determination result". For example, when determining whether the expression is one of joy, it does not conclude from the score distribution and total obtained by the score calculation unit 104 that "the expression is joy", but only that "the expression may be joy".

This possibility determination is performed because, for example, the changes in the facial features when the Japanese vowel sounds "i" and "e" are pronounced in a conversation scene, which is a non-expression scene, are almost identical to the changes in the facial features in a joy scene, so that such conversation scenes must be distinguished from joy scenes.

Next, the facial expression determination unit 171 uses the facial expression possibility determination results obtained by the facial expression possibility determination unit 170 to determine that an image shows a particular facial expression. FIG. 12 shows the difference between the score total and the threshold line 182 when the scene changes from an expressionless (neutral) face to a joy expression, with the image numbers uniquely assigned to the time-series images on the horizontal axis and the difference between the score total and the threshold line 182 on the vertical axis.

FIG. 13 shows the difference between the score total and the threshold line 183 for a conversation scene, which is a non-expression scene, with the image numbers of the time-series images on the horizontal axis and the difference between the score total and the threshold line 183 on the vertical axis.

Referring to FIG. 12, in which the scene changes from an expressionless scene to a joy expression scene, the score fluctuates greatly from the initial stage to the intermediate stage, fluctuates more gently after the intermediate stage, and finally becomes almost constant. That is, parts of the face such as the eyes and mouth change rapidly from the initial stage to the intermediate stage of the transition, but between the intermediate stage and the full expression of joy the changes in the features of the eyes and mouth become gradual and eventually stop.

The same variation characteristics of the facial features hold for other facial expressions. In contrast, in the conversation scene of FIG. 13, which is a non-expression scene, there are images whose score exceeds the threshold line while the sound "i" is being pronounced, because the changes in the eye and mouth features are then almost the same as for joy. Unlike in a joy expression scene, however, the facial features are constantly changing rapidly while "i" is pronounced, so even when the score rises to or above the threshold line it tends to fall back to or below the threshold line immediately.

Therefore, by having the facial expression possibility determination unit 170 determine the possibility of an expression and having the facial expression determination unit 171 finalize the expression on the basis of the continuity of these possibility determination results, conversation scenes and expression scenes can be distinguished more accurately.

Research in visual psychology on human recognition of facial expressions has also shown that facial movement during the display of an expression, and in particular its speed, is a factor that influences the judgment of the emotion category from the expression: M. Kamachi, V. Bruce, S. Mukaida, J. Gyoba, S. Yoshikawa, and S. Akamatsu, "Dynamic properties influence the perception of facial expression," Perception, vol. 30, pp. 875-887, July 2001.

Next, the processing performed by the facial expression possibility determination unit 170 and the facial expression determination unit 171 is described in more detail.

First, suppose that the facial expression possibility determination unit 170 determines that a certain input image (the image of the m-th frame) shows "the first facial expression". This result is output to the facial expression determination unit 171 as a possibility determination result. The facial expression determination unit 171 does not output this result immediately; instead, it counts the number of times the facial expression possibility determination unit 170 has determined the first facial expression. The count is reset to 0 when the facial expression possibility determination unit 170 determines a second facial expression different from the first.

The facial expression determination unit 171 does not output this result (that the expression is the first facial expression) immediately because, as described above, the expression determined at this point may still be ambiguous owing to the factors mentioned above.

The facial expression possibility determination unit 170 goes on to perform the facial expression determination process on each subsequent input image: the input image of the (m+1)-th frame, the input image of the (m+2)-th frame, and so on. When the count kept by the facial expression determination unit 171 reaches n, that is, when the facial expression possibility determination unit 170 has determined "the first facial expression" for n consecutive frames starting from the m-th frame, the facial expression determination unit 171 records in the RAM 2302 data indicating that this point is the "start of the first facial expression", that is, that the (m+n)-th frame is the start frame. The period from this point until the facial expression possibility determination unit 170 determines a second facial expression different from the first is treated as the first facial expression (joy, in the example used here).

As explained above with reference to FIG. 12, in an expression scene the difference between the score total and the threshold stops changing for a certain period; that is, the same expression continues for a certain period. Conversely, if the same expression does not continue for a certain period, the scene may be a conversation scene, which is a non-expression scene, as explained above with reference to FIG. 13.

Therefore, on the basis of the above processing by the facial expression possibility determination unit 170, an expression is output as the final determination result only after the possibility of the same expression has been determined for a certain period (here, n frames). Elements that disturb the facial expression determination process, such as conversation scenes that are non-expression scenes, can thus be removed, and a more accurate facial expression determination process can be performed.

FIG. 14 is a flowchart of the process by which the facial expression determination unit 171 determines the start of an expression of joy in the images input successively from the image input unit 100.

First, if the possibility determination result from the facial expression possibility determination unit 170 indicates joy (step S190), the process proceeds to step S191. If the count kept by the facial expression determination unit 171 has reached p (p = 4 in FIG. 14) (step S191), that is, if the facial expression possibility determination unit 170 has determined joy for p consecutive frames, this point is judged to be the "start of joy", and data indicating this (for example, the current frame number and flag data indicating the start of joy) are recorded in the RAM 2302 (step S192).

The above processing makes it possible to identify the start (the start frame) of the expression of joy.
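The start determination can be sketched as a simple run-length counter. This is a minimal sketch under stated assumptions: the per-frame possibility results are represented as a list of strings, the function name and labels are hypothetical, and the counter is reset on any result other than the target expression.

```python
def find_start_frame(possibilities, target="joy", p=4):
    """Return the index of the frame at which `target` has been judged
    possible for p consecutive frames, or None if that never happens."""
    count = 0
    for frame_no, label in enumerate(possibilities):
        # Count consecutive frames judged as the target; reset otherwise.
        count = count + 1 if label == target else 0
        if count == p:
            return frame_no  # recorded as the current frame number
    return None


results = ["neutral", "joy", "joy", "neutral",
           "joy", "joy", "joy", "joy", "joy"]
print(find_start_frame(results))
```

In this example the first run of "joy" is only two frames long and is ignored; the start is reported at the frame where the second run reaches four consecutive frames.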

FIG. 15 is a flowchart of the process by which the facial expression determination unit 171 determines the end of an expression of joy in the images input successively from the image input unit 100.

First, the facial expression determination unit 171 refers to the flag data recorded in the RAM 2302 in step S192 and determines whether an expression of joy has started and has not yet ended (step S200). As described later, this data is rewritten when the expression of joy ends, so referring to it shows whether the expression of joy has currently ended.

If the expression of joy has not yet ended, the process proceeds to step S201. If the facial expression possibility determination unit 170 has determined for q consecutive frames (q = 3 in FIG. 15) that there is no possibility of joy (that is, the count kept by the facial expression determination unit 171 has been 0 for q consecutive frames), this point is judged to be the "end of joy", and the flag data is rewritten to "data indicating that joy has ended" and recorded in the RAM 2302 (step S202).

If, however, it has not been determined in step S201 for q consecutive frames that there is no possibility of joy (that is, the count kept by the facial expression determination unit 171 has not been 0 for q consecutive frames), the facial expression in the input image is determined to be "joyful" as the final facial expression determination result, and the above data is left unchanged.

After the expression of joy has ended, the facial expression determination unit 171 determines the expression in every frame from the start to the end to be "joy".

By determining an expression start image and an expression end image in this way and judging all images between them to be expression images, erroneous determinations for the images in between can be suppressed, and the accuracy of the facial expression determination process as a whole can be improved.
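The combination of start and end determination can be sketched as a small two-state machine that labels every frame from the start frame to the end frame. This is an illustrative sketch only: the list-of-strings input, the function name, and the choice to include the frame at which the end is detected are assumptions, with p = 4 and q = 3 taken from the values used in FIGS. 14 and 15.

```python
def label_frames(possibilities, target="joy", p=4, q=3):
    """Label each frame with `target` between the detected start and end,
    and with None elsewhere."""
    labels = []
    active = False        # True between "start" and "end" of the expression
    hits = misses = 0
    for label in possibilities:
        if not active:
            # Start: p consecutive frames judged as possibly the target.
            hits = hits + 1 if label == target else 0
            if hits == p:
                active, misses = True, 0
            labels.append(target if active else None)
        else:
            # End: q consecutive frames judged as not the target.
            misses = misses + 1 if label != target else 0
            labels.append(target)  # frames from start to end are all labeled
            if misses == q:
                active, hits = False, 0
    return labels


seq = ["neutral", "joy", "joy", "joy", "joy", "joy", "neutral", "joy",
       "neutral", "neutral", "neutral", "joy"]
print(label_frames(seq))
```

The isolated "neutral" frame inside the run does not end the expression, because only q consecutive frames without the possibility of joy count as the end.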

The present embodiment has been described using the process for determining the expression of "joy" as an example, but the processing is clearly the same in essence for expressions other than "joy".

[Third Embodiment]
FIG. 16 is a block diagram showing the functional configuration of the image processing apparatus according to the present embodiment. Parts that perform substantially the same operations as in FIG. 1 are given the same numbers, and their description is omitted. The basic configuration of the image processing apparatus according to the present embodiment is the same as that of the first embodiment, that is, the same as that shown in FIG. 23.

The image processing apparatus according to the present embodiment receives one or more candidate facial expressions for the face in the input image in advance, and determines which of these candidates the facial expression in the input image corresponds to.

The image processing apparatus according to the present embodiment is described in more detail below. Its functional configuration differs from that of the image processing apparatus according to the first embodiment in the facial expression selection unit 211, the feature amount extraction unit 212, and the facial expression determination unit 216. These three units are therefore described in detail below.

表情選択部２１１は、１つ以上の表情の候補を入力するためのものである。入力には例えば表示装置２３０６の表示画面上に表示される、複数の表情を選択するためのＧＵＩ上で、１つ以上の表情をキーボード２３０４やマウス２３０５を用いて選択するようにしても良い。なお選択した結果はコード（例えば番号）として特徴量抽出部２１２、特徴量変化量算出部１０３に出力される。 The facial expression selection unit 211 is for inputting one or more facial expression candidates. For input, for example, one or more facial expressions may be selected using a keyboard 2304 or a mouse 2305 on a GUI for selecting a plurality of facial expressions displayed on the display screen of the display device 2306. The selected result is output to the feature amount extraction unit 212 and the feature amount change amount calculation unit 103 as a code (for example, a number).

特徴量抽出部２１２は、画像入力部１００から入力された画像における顔から、表情選択部２１１で選択された表情を認識するための特徴量を求める処理を行う。 The feature amount extraction unit 212 performs processing for obtaining a feature amount for recognizing the facial expression selected by the facial expression selection unit 211 from the face in the image input from the image input unit 100.

表情判定部２１６は、画像入力部１００から入力された画像における顔が、表情選択部２１１で選択された表情の何れであるかを判別する処理を行う。 The facial expression determination unit 216 performs processing for determining which of the facial expressions selected by the facial expression selection unit 211 is shown by the face in the image input from the image input unit 100.

図１７は特徴量抽出部２１２の機能構成を示すブロック図である。なお、同図において図２と同じ部分については同じ番号を付けており、その説明は省略する。以下、図１７に示した各部について説明する。 FIG. 17 is a block diagram illustrating a functional configuration of the feature amount extraction unit 212. In the figure, the same parts as those in FIG. 2 are denoted by the same reference numerals, and the description thereof is omitted. Hereinafter, each unit illustrated in FIG. 17 will be described.

各表情毎の特徴量抽出部２２４は、顔面の特徴点抽出部１１３で得られた特徴点情報を用いて、表情選択部２１１が選択した表情に応じた特徴量を算出する。 The feature amount extraction unit 224 for each facial expression uses the feature point information obtained by the facial feature point extraction unit 113 to calculate a feature amount according to the facial expression selected by the facial expression selection unit 211.

図１８は表情選択部２１１が選択した各表情（表情１，表情２，表情３）に応じた特徴量を示す図である。例えば同図によると、表情１を認識するためには特徴１乃至４を算出する必要があるし、表情３を認識するためには特徴２乃至５を算出する必要がある。 FIG. 18 is a diagram showing the feature amounts corresponding to each facial expression (expression 1, expression 2, expression 3) selected by the facial expression selection unit 211. For example, according to the figure, features 1 to 4 must be calculated to recognize expression 1, and features 2 to 5 must be calculated to recognize expression 3.

例えば、表情選択部２１１で喜び表情を選択したと仮定すると、喜び表情に必要な特徴は眼と口の端点距離、眼のエッジの長さ、眼のエッジの傾き、口エッジの長さ、口エッジの傾き、頬周りのエッジ密度の６特徴であるというように表情別に個別の特徴量が必要となる。 For example, assuming that the joy expression is selected by the facial expression selection unit 211, the joy expression requires six features: the distance between the end points of the eyes and the mouth, the length of the eye edge, the slope of the eye edge, the length of the mouth edge, the slope of the mouth edge, and the edge density around the cheeks. In this way, each facial expression requires its own set of feature amounts.

このような、各表情を認識するために必要な特徴量を示すテーブル（図１８に例示するような対応関係を示すテーブル）、即ち、表情選択部２１１から入力される表情を示すコードと、この表情を認識するためにはどのような特徴量を求めるのかを示すデータとが対応付けられたテーブルはＲＡＭ２３０２に予め記録されているものとする。 It is assumed that a table indicating the feature amounts necessary for recognizing each facial expression (a table of correspondences such as that illustrated in FIG. 18), that is, a table that associates each code indicating a facial expression input from the facial expression selection unit 211 with data indicating which feature amounts are to be obtained in order to recognize that expression, is recorded in the RAM 2302 in advance.

上述の通り、表情選択部２１１からは選択した表情に応じたコードが入力されるので、特徴量抽出部２１２は、このテーブルを参照することで、このコードに応じた表情を認識するための特徴量を特定することができ、その結果、表情選択部２１１が選択した表情に応じた特徴量を算出することができる。 As described above, a code corresponding to the selected facial expression is input from the facial expression selection unit 211. By referring to this table, the feature amount extraction unit 212 can identify the feature amounts needed to recognize the expression corresponding to that code, and can therefore calculate the feature amounts corresponding to the facial expression selected by the facial expression selection unit 211.
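The table lookup described above can be sketched as follows. This is only a minimal illustration: the numeric codes and the feature names `feature1` to `feature5` are placeholders, and only the two rows that the text states for FIG. 18 (expression 1 and expression 3) are filled in.

```python
# Hypothetical table: expression code -> names of the features to compute.
# Only the rows stated in the text for FIG. 18 are included.
FEATURES_BY_EXPRESSION = {
    1: ["feature1", "feature2", "feature3", "feature4"],
    3: ["feature2", "feature3", "feature4", "feature5"],
}


def features_for(codes):
    """Return, for each selected expression code, the features to compute."""
    return {code: FEATURES_BY_EXPRESSION[code] for code in codes}


def features_to_extract(codes):
    """Union of all features needed, so each one is computed only once."""
    needed = set()
    for code in codes:
        needed.update(FEATURES_BY_EXPRESSION[code])
    return sorted(needed)
```

The same table can serve the feature amount change amount calculation unit 103 when it reads the matching reference features.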

図１６に戻って、次に後段の特徴量変化量算出部１０３は第１の実施形態と同様に特徴量抽出部２１２による特徴量と、参照特徴保持部１０２が保持する特徴量との差分を計算する。 Returning to FIG. 16, the subsequent feature amount change amount calculation unit 103 calculates the difference between the feature amount by the feature amount extraction unit 212 and the feature amount held by the reference feature holding unit 102 as in the first embodiment. calculate.

なお、特徴量抽出部２１２が算出する特徴量は、表情によってその数や種類が異なる。従って本実施形態に係る特徴量変化量算出部１０３は、表情選択部２１１が選択した表情を認識するために必要な特徴量を参照特徴保持部１０２から読み出して用いる。表情選択部２１１が選択した表情を認識するために必要な特徴量の特定は、特徴量抽出部２１２が用いた上記テーブルを参照すれば特定することができる。 Note that the number and type of feature amounts calculated by the feature amount extraction unit 212 differ depending on facial expressions. Therefore, the feature amount change amount calculation unit 103 according to the present embodiment reads out and uses the feature amount necessary for recognizing the facial expression selected by the facial expression selection unit 211 from the reference feature holding unit 102. The feature amount necessary for recognizing the facial expression selected by the facial expression selection unit 211 can be specified by referring to the table used by the feature amount extraction unit 212.

例えば、喜び表情に必要な特徴は眼と口の端点距離、眼のエッジの長さ、眼のエッジの傾き、口エッジの長さ、口エッジの傾き、頬周りのエッジ密度の６特徴であるので、この６特徴と同様の特徴を参照特徴保持部１０２から読み出し、用いる。 For example, the features necessary for a joyful expression are the following six features: eye-to-mouth end point distance, eye edge length, eye edge tilt, mouth edge length, mouth edge tilt, cheek edge density. Therefore, features similar to these six features are read from the reference feature holding unit 102 and used.

特徴量変化量算出部１０３からは各特徴量の変化量が出力されるので、得点算出部１０４は第１の実施形態と同様の処理を行う。本実施形態では表情選択部２１１によって複数の表情が選択されている場合があるので、選択された夫々の表情毎に、第１の実施形態と同様の得点算出処理を行い、表情毎に各特徴量毎の得点を算出する。 Since the feature amount change amount calculation unit 103 outputs the change amount of each feature amount, the score calculation unit 104 performs the same processing as in the first embodiment. In the present embodiment, since a plurality of facial expressions may be selected by the facial expression selection unit 211, a score calculation process similar to that of the first embodiment is performed for each selected facial expression, and each feature is represented for each facial expression. The score for each quantity is calculated.

図１９は各表情毎に、各変化量に基づいて得点を算出する様子を示す模式図である。 FIG. 19 is a schematic diagram showing how a score is calculated for each facial expression based on each change amount.

そして表情判定部１０５は、表情選択部２１１によって選択された表情毎に得点の総和値を求める。この夫々の表情毎の総和値において、最も高い値を示す表情が、入力画像における顔の表情とすることができる。 The facial expression determination unit 105 then obtains the total score for each of the facial expressions selected by the facial expression selection unit 211. Among these per-expression totals, the expression with the highest total can be taken as the facial expression in the input image.

例えば、喜び、悲しみ、怒り、驚き、嫌悪、恐怖の表情のうちで喜び表情が最も高い得点総和ならば、表情は喜び表情であると判定されるということである。 For example, if a joy expression is the highest score sum among joy, sadness, anger, surprise, disgust, and fear, the expression is determined to be a joy expression.
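As a rough sketch of this selection step, the per-feature scores of each candidate expression can be summed and the expression with the largest total chosen. The expression names and score values below are made up for illustration, and any thresholding carried over from the first embodiment is left out.

```python
def classify_expression(scores_by_expression):
    """Sum the per-feature scores of each candidate expression and
    return the expression with the highest total, plus all totals."""
    totals = {
        name: sum(feature_scores.values())
        for name, feature_scores in scores_by_expression.items()
    }
    best = max(totals, key=totals.get)
    return best, totals


# Illustrative scores only; real values come from the score calculation unit.
example = {
    "joy": {"mouth_edge_length": 30, "eye_edge_slope": 25},
    "surprise": {"mouth_edge_length": 10, "eye_edge_length": 15},
}
best, totals = classify_expression(example)
```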

［第４の実施形態］ [Fourth Embodiment]
本実施形態に係る画像処理装置は、入力画像中の顔の表情を判定した場合に、更に、表情場面での表情の度合いを判定する。本実施形態に係る画像処理装置の基本構成、機能構成については第１乃至３の何れの実施形態のものを適用しても良い。 When the image processing apparatus according to the present embodiment has determined the facial expression in the input image, it further determines the degree of that expression in the expression scene. The basic configuration and functional configuration of any of the first to third embodiments may be applied to the image processing apparatus according to this embodiment.

まず、表情の度合いを判定する方法では、表情判定部においてある特定の表情であると判定された入力画像に対して、得点算出部で算出された得点変化の推移もしくは得点総和を参照する。 First, in the method of determining the degree of facial expression, the transition of the score change or the total score calculated by the score calculation unit is referred to for an input image determined to be a specific facial expression by the facial expression determination unit.

もし、得点算出部で算出された得点総和が得点の総和の閾値と比較して閾値との差が小さいならば、喜びの度合いは小さいと判定される。逆に、得点算出部で算出された得点の総和が閾値と比較して閾値との差が大きいならば、喜びの度合いが大きいと判定される。この方法は、喜びの表情以外の他の表情に対しても同様に表情の度合いを判定できる。 If the difference between the total score calculated by the score calculation unit and the threshold for the total score is small, the degree of joy is determined to be small. Conversely, if the difference between the total score and the threshold is large, the degree of joy is determined to be large. The degree of expression can be determined in the same way for expressions other than joy.
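A minimal sketch of this degree judgment is given below. The text only says that the difference from the threshold is "small" or "large", so the `margin` parameter that separates the two cases is an assumption introduced here for illustration.

```python
def expression_degree(total_score, threshold, margin):
    """Grade an already-recognised expression by how far its total score
    lies above the recognition threshold.

    `margin` is a hypothetical cut-off between a "small" and a "large"
    difference; the source does not specify how the boundary is chosen.
    """
    difference = total_score - threshold
    return "small" if difference < margin else "large"
```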

［第５の実施形態］ [Fifth Embodiment]
また、上記実施形態において、得点算出部で算出された眼の形状の得点から眼をつぶっているか否かの判定を行うこともできる。 In the above embodiments, it is also possible to determine whether or not the eyes are closed from the eye shape score calculated by the score calculation unit.

図２１は参照顔の眼のエッジ、即ち、眼を開いている場合の眼のエッジを示した図であり、図２２は眼をつぶった場合の眼のエッジを示した図である。 FIG. 21 is a diagram showing the eye edges of the reference face, that is, the eye edges when the eyes are open, and FIG. 22 is a diagram showing the eye edges when the eyes are closed.

特徴量抽出部で抽出された眼をつぶった場合の眼のエッジ３１６の長さは参照画像の眼のエッジ３０４の長さと比べて全く変化しない。 The length of the eye edge 316 extracted by the feature amount extraction unit when the eye is closed does not change at all compared with the length of the eye edge 304 in the reference image.

しかし、図２１の眼を開いている場合の眼のエッジ３０４の特徴点３０５と３０６を結んで得られる直線３０８の傾きと、図２２の眼をつぶった場合の眼のエッジ３１６の特徴点３１０と３１１を結んで得られる直線３１３の傾きを比べると、眼を開いた状態から眼をつぶった状態に変化した場合には直線の傾きの変化量が負となっている。 However, comparing the slope of the straight line 308 connecting the feature points 305 and 306 of the eye edge 304 when the eye is open (FIG. 21) with the slope of the straight line 313 connecting the feature points 310 and 311 of the eye edge 316 when the eye is closed (FIG. 22), the change in slope is negative when the eye goes from the open state to the closed state.

また、図２１の眼を開いている場合の眼のエッジ３０４の特徴点３０６と３０７から得られる直線３０９の傾きと、図２２の眼をつぶった場合の眼のエッジ３１６の特徴点３１１と３１２から得られる直線３１４の傾きを比べると、眼を開いた状態から眼をつぶった状態に変化した場合には直線の傾きの変化量が正となっている。 Likewise, comparing the slope of the straight line 309 obtained from the feature points 306 and 307 of the eye edge 304 when the eye is open (FIG. 21) with the slope of the straight line 314 obtained from the feature points 311 and 312 of the eye edge 316 when the eye is closed (FIG. 22), the change in slope is positive when the eye goes from the open state to the closed state.

そこで、眼のエッジの長さが全く変化せず、眼のエッジから得られる上述した左右２本の直線の傾きの変化量の絶対値が参照画像の眼のエッジに対して、それぞれある所定値以上で、一方が負かつ他方が正の変化をした場合には眼をつぶっている可能性が高いと判定することができ、直線の傾きの変化量に応じて極端に得点算出部で得られる得点を小さくしている。 Therefore, when the length of the eye edge does not change at all, the absolute values of the changes in slope of the two left and right straight lines described above, relative to the eye edge of the reference image, are each equal to or greater than a predetermined value, and one change is negative while the other is positive, it can be determined that the eyes are very likely closed. In that case, the score obtained by the score calculation unit is reduced sharply according to the amount of change in the slopes.
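The closed-eye condition above can be sketched as follows. Several details are assumptions made for illustration: each eye edge is represented by three (x, y) feature points (corner, peak, corner), the y axis points upward, the "unchanged length" test uses a relative tolerance `len_tol`, and `min_change` stands in for the unspecified predetermined value. The neighbouring points are assumed to have different x coordinates.

```python
def slope(p, q):
    """Slope of the straight line through points p and q (x differs)."""
    return (q[1] - p[1]) / (q[0] - p[0])


def eyes_probably_closed(ref_pts, cur_pts, ref_len, cur_len,
                         min_change, len_tol=0.05):
    """ref_pts / cur_pts: three (x, y) feature points of the eye edge in
    the reference image and in the input image (hypothetical layout)."""
    # Change in slope of the left line (e.g. 308 -> 313) and of the
    # right line (e.g. 309 -> 314).
    d_left = slope(cur_pts[0], cur_pts[1]) - slope(ref_pts[0], ref_pts[1])
    d_right = slope(cur_pts[1], cur_pts[2]) - slope(ref_pts[1], ref_pts[2])

    length_unchanged = abs(cur_len - ref_len) <= len_tol * ref_len
    opposite_signs = d_left * d_right < 0
    large_enough = min(abs(d_left), abs(d_right)) >= min_change
    return length_unchanged and opposite_signs and large_enough


open_eye = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
closed_eye = [(0.0, 0.0), (1.0, -0.5), (2.0, 0.0)]
```

With these sample points the left slope changes by -1.5 and the right slope by +1.5, which matches the sign pattern described in the text.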

図２０は、得点算出部で算出された眼の形状の得点から眼をつぶっているか否かの判定処理のフローチャートである。この処理は表情判定部で行われるものである。 FIG. 20 is a flowchart of a process for determining whether or not the eyes are closed from the score of the eye shape calculated by the score calculation unit. This process is performed by the facial expression determination unit.

上述のように、目の形状に対する得点が閾値以下であるか否かを判断し、閾値以下であれば目をつぶっている、否であれば目をつぶっていないと判定する。 As described above, it is determined whether or not the score for the eye shape is equal to or less than a threshold value. If the score is equal to or less than the threshold value, it is determined that the eyes are closed; otherwise, it is determined that the eyes are not closed.

［第６の実施形態］ [Sixth Embodiment]
図２４は本実施形態に係る画像処理装置の機能構成を示すブロック図である。図１とほぼ同じ動作を行う部分については同じ番号を付けており、その説明は省略する。なお本実施形態に係る画像処理装置の基本構成については、第１の実施形態と同じ、即ち図２３に示したものと同じである。 FIG. 24 is a block diagram showing a functional configuration of the image processing apparatus according to the present embodiment. Parts that perform substantially the same operations as in FIG. 1 are given the same numbers, and descriptions thereof are omitted. The basic configuration of the image processing apparatus according to this embodiment is the same as that of the first embodiment, that is, the same as that shown in FIG. 23.

特徴量抽出部７０１は図２５に示す如く、鼻と眼と口位置算出部７１０、エッジ画像生成部７１１、顔面の各特徴エッジ抽出部７１２、顔面の特徴点抽出部７１３、表情特徴量抽出部７１４から構成される。図２５は、特徴量抽出部７０１の機能構成を示すブロック図である。 As shown in FIG. 25, the feature quantity extraction unit 701 includes a nose, eye, and mouth position calculation unit 710, an edge image generation unit 711, each facial feature edge extraction unit 712, a facial feature point extraction unit 713, and a facial expression feature quantity extraction unit. 714. FIG. 25 is a block diagram illustrating a functional configuration of the feature amount extraction unit 701.

正規化特徴変化量計算部７０３は、特徴量抽出部７０１から得られる夫々の特徴量と参照特徴保持部７０２から得られる夫々の特徴量の比を算出する。なお、正規化特徴変化量算出部７０３で算出される夫々の特徴変化量は笑顔を検出すると仮定した場合には「眼と口の端点距離」、「眼のエッジの長さ」、「眼のエッジの傾き」、「口エッジの長さ」、「口エッジの傾き」である。さらに、顔のサイズ変動や顔の回転変動に応じて各特徴量を正規化する。 The normalized feature change amount calculation unit 703 calculates the ratio between each feature amount obtained from the feature amount extraction unit 701 and the corresponding feature amount obtained from the reference feature holding unit 702. Assuming that a smile is to be detected, the feature change amounts calculated by the normalized feature change amount calculation unit 703 are the “distance between the end points of the eyes and the mouth”, the “length of the eye edge”, the “slope of the eye edge”, the “length of the mouth edge”, and the “slope of the mouth edge”. Further, each feature amount is normalized according to variation in face size and face rotation.

正規化特徴変化量算出部７０３で得られたそれぞれの特徴変化量の正規化方法について説明する。図２６は、画像における顔中の目、鼻の重心位置を示す図である。同図において７２０，７２１はそれぞれ右目、左目の重心位置、７２２は鼻の重心位置を示す。特徴量抽出部７０１の鼻と眼と口位置検出部７１０で鼻・眼・口のそれぞれのテンプレートを用いることによって検出された鼻の重心位置７２２、眼の重心位置７２０、７２１から、図２８に示す如く、右目位置と顔位置水平方向距離７３０、左目位置と顔位置水平方向距離７３１、左右眼の垂直方向の座標平均と顔位置との垂直方向距離７３２を算出する。 A method for normalizing each feature change amount obtained by the normalized feature change amount calculation unit 703 will be described. FIG. 26 is a diagram showing the centroid positions of the eyes and nose of a face in an image. In the figure, 720 and 721 indicate the centroid positions of the right eye and the left eye, respectively, and 722 indicates the centroid position of the nose. From the nose centroid position 722 and the eye centroid positions 720 and 721, which the nose, eye, and mouth position detection unit 710 of the feature amount extraction unit 701 detects using templates for the nose, eyes, and mouth, the following are calculated as shown in FIG. 28: the horizontal distance 730 between the right eye position and the face position, the horizontal distance 731 between the left eye position and the face position, and the vertical distance 732 between the average vertical coordinate of the left and right eyes and the face position.

右目位置と顔位置水平方向距離７３０、左目位置と顔位置水平方向距離７３１、左右眼の垂直方向の座標平均と顔位置との垂直方向距離７３２の比ａ：ｂ：ｃは、顔サイズが変動した場合には図２９に示す如く、右目位置と顔位置水平方向距離７３３、左目位置と顔位置水平方向距離７３４、左右眼の垂直方向の座標平均と顔位置との垂直方向距離７３５のそれぞれの比ａ１：ｂ１：ｃ１とほとんど変化はないが、サイズ変動しない場合の右目位置と顔位置水平方向距離７３０とサイズ変動した場合の右目位置と顔位置水平方向距離７３３の比ａ：ａ１は顔サイズ変動に応じて変化する。なお、右目位置と顔位置水平方向距離７３０、左目位置と顔位置水平方向距離７３１、左右眼の垂直方向の座標平均と顔位置との垂直方向距離７３２を算出する際には図２７に示す如く、鼻と眼の重心位置以外に眼の端点位置（７２３、７２４）や左右それぞれの鼻腔位置や左右鼻腔位置の重心（７２５）を用いても良い。眼の端点を算出する方法は、例えばエッジを走査する方法や眼の端点検出用のテンプレートを用いる方法や、鼻腔位置に関しても鼻腔検出用テンプレートを用いて左右の鼻腔の重心やそれぞれ左右鼻腔位置を用いる方法がある。変動を判定するための特徴間距離も左右の目頭間距離など他の特徴を用いても良い。 When the face size varies, the ratio a:b:c of the horizontal distance 730 between the right eye position and the face position, the horizontal distance 731 between the left eye position and the face position, and the vertical distance 732 between the average vertical coordinate of the left and right eyes and the face position hardly differs from the ratio a1:b1:c1 of the corresponding distances 733, 734, and 735 shown in FIG. 29. However, the ratio a:a1 of the horizontal distance 730 (no size variation) to the horizontal distance 733 (with size variation) changes according to the face size variation. When calculating the distances 730, 731, and 732, positions other than the centroids of the nose and eyes may be used, as shown in FIG. 27: the eye end point positions (723, 724), the left and right nostril positions, or the centroid (725) of the left and right nostril positions. The eye end points can be calculated, for example, by scanning edges or by using a template for eye end point detection; likewise, a nostril detection template can be used to obtain the centroid of the left and right nostrils or the individual left and right nostril positions. Other inter-feature distances, such as the distance between the inner corners of the left and right eyes, may also be used to determine the variation.
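The distances a, b, and c can be sketched as below. This assumes that the "face position" coincides with the nose centroid and that points are given as (x, y) pixel coordinates; both are assumptions made for illustration, since the figures are not reproduced here.

```python
def face_distances(right_eye, left_eye, nose):
    """Return (a, b, c): horizontal right-eye-to-nose distance,
    horizontal left-eye-to-nose distance, and vertical distance between
    the mean eye height and the nose.  Points are (x, y) tuples."""
    a = abs(right_eye[0] - nose[0])
    b = abs(left_eye[0] - nose[0])
    c = abs((right_eye[1] + left_eye[1]) / 2 - nose[1])
    return a, b, c


# A reference face and the same face at twice the size (made-up points).
a, b, c = face_distances((30, 40), (70, 40), (50, 70))
a1, b1, c1 = face_distances((60, 80), (140, 80), (100, 140))
```

In this example the ratio a:b:c equals a1:b1:c1, while a/a1 is 0.5, which is the behaviour the text describes for a pure size change.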

さらに、図３０に示す如く、左右眼の垂直方向の座標平均と顔位置との垂直方向距離７３８と図２８の顔が回転しない場合の左右眼の垂直方向の座標平均と顔位置との垂直方向距離７３２との比ｃ：ｃ２は顔の上下回転によって比が変化する。 Further, the ratio c:c2 between the vertical distance 732 of FIG. 28 (average vertical coordinate of the left and right eyes to the face position, with the face not rotated) and the corresponding vertical distance 738 shown in FIG. 30 changes when the face rotates up or down.

また、図３１に示す如く、右目位置と顔位置水平方向距離７３９と左目位置と顔位置水平方向距離７４０の比ａ３：ｂ３は図２８の顔が左右回転しない場合の右目位置と顔位置水平方向距離７３０と左目位置と顔位置水平方向距離７３１の比ａ：ｂと比べると比が変化する。 Also, as shown in FIG. 31, the ratio a3:b3 of the horizontal distance 739 between the right eye position and the face position to the horizontal distance 740 between the left eye position and the face position differs from the ratio a:b of the horizontal distances 730 and 731 of FIG. 28, in which the face is not rotated left or right.

また、顔の左右回転した場合には、図３２に示す参照画像（無表情時の画像）の右眼端点間距離ｄ１と左眼端点間距離ｅ１の比ｇ１（＝ｄ１／ｅ１）と、図３３に示す入力画像（笑顔時の画像）の右眼端点間距離ｄ２と左眼端点間距離ｅ２の比ｇ２＝（ｄ２／ｅ２）の比ｇ２／ｇ１を用いることもできる。 When the face is rotated left or right, the ratio g2/g1 can also be used, where g1 (= d1/e1) is the ratio of the distance d1 between the right eye end points to the distance e1 between the left eye end points in the reference image (expressionless image) shown in FIG. 32, and g2 (= d2/e2) is the ratio of the distance d2 between the right eye end points to the distance e2 between the left eye end points in the input image (smiling image) shown in FIG. 33.
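The ratio g2/g1 can be computed as in the short sketch below. The reading that a value close to 1 indicates little left-right rotation is an interpretation made here for illustration; the text only states that the ratio can be used.

```python
def rotation_ratio(d1, e1, d2, e2):
    """Return g2 / g1, where g1 = d1 / e1 (reference image) and
    g2 = d2 / e2 (input image); d is the right-eye end point distance
    and e is the left-eye end point distance."""
    g1 = d1 / e1
    g2 = d2 / e2
    return g2 / g1
```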

図３４はサイズ変動、左右回転変動、上下回転変動を判定する処理のフローチャートである。同図のフローチャートを用いてサイズ変動、左右回転変動、上下回転変動を判定する処理について説明するのであるが、その際に図２８を「変動していない状態で眼と鼻の位置間を直線で結んだ図」、図３５を「サイズ変動、左右回転変動ないしは上下変動した後の眼と鼻の位置間を直線で結んだ図」として用いるものとする。 FIG. 34 is a flowchart of processing for determining size variation, left-right rotation variation, and vertical rotation variation. This processing is described below with reference to the flowchart. In the description, FIG. 28 is used as “a diagram in which the eye and nose positions are connected by straight lines in the unvaried state”, and FIG. 35 as “a diagram in which the eye and nose positions are connected by straight lines after size variation, left-right rotation variation, or vertical variation”.

まず、ステップＳ７７０において、ａ：ｂ：ｃとａ４：ｂ４：ｃ４の比が同じであるかの判定を行う。この「同じである」という判定は、「全く同じ」であることに限定するものではなく、「両者の比の差がある許容範囲内」であれば「同じである」と判断しても良い。 First, in step S770, it is determined whether the ratio of a: b: c and a4: b4: c4 is the same. This determination of “same” is not limited to “exactly the same”, but may be determined to be “same” as long as “the difference between the two ratios is within an allowable range”. .

ステップＳ７７０の判定処理でａ：ｂ：ｃとａ４：ｂ４：ｃ４の比が同じであると判断した場合には処理をステップＳ７７１に進め、「変化なし、もしくはサイズ変動のみである」と判断し、更に処理をステップＳ７７２に進め、ａ／ａ４が１であるか否かを判断する。 If it is determined in the determination process in step S770 that the ratio of a: b: c and a4: b4: c4 is the same, the process proceeds to step S771, and “no change or only size variation” is determined. Further, the process proceeds to step S772, and it is determined whether or not a / a4 is 1.

ａ／ａ４が１である場合には処理をステップＳ７７３に進め、「サイズ変動かつ回転変動がない」と判断する。一方、ステップＳ７７２でａ／ａ４が１ではないと判断した場合には処理をステップＳ７７４に進め、「サイズ変動のみ」と判断する。 If a / a4 is 1, the process proceeds to step S773, and it is determined that “there is no size variation and no rotation variation”. On the other hand, if it is determined in step S772 that a / a4 is not 1, the process proceeds to step S774, and “size variation only” is determined.

一方、ステップＳ７７０における判断処理で、ａ：ｂ：ｃとａ４：ｂ４：ｃ４の比が同じではないと判定された場合には処理をステップＳ７７５に進め、「上下回転、左右回転、上下回転かつサイズ変動、左右回転かつサイズ変動、上下回転かつ左右回転、上下回転かつ左右回転かつサイズ変動の何れかである」と判断する。 On the other hand, if it is determined in step S770 that the ratios a:b:c and a4:b4:c4 are not the same, the process proceeds to step S775, where it is determined that the variation is one of the following: vertical rotation; left-right rotation; vertical rotation with size variation; left-right rotation with size variation; vertical rotation with left-right rotation; or vertical rotation with left-right rotation and size variation.

そして処理をステップＳ７７６に進め、ａ：ｂとａ４：ｂ４の比が同じであるか否かを判定し（ここでの「同じである」という判断についてもステップＳ７７０におけるものと同じである）、同じであると判断した場合には処理をステップＳ７７７に進め、「上下回転、上下回転かつサイズ変動の何れか」と判断する。そして処理をステップＳ７７８に進め、ａ／ａ４が１であるか否かを判断する。ａ／ａ４が１ではないと判断した場合には処理をステップＳ７７９に進め、「上下回転かつサイズ変動である」と判断する。一方、ａ／ａ４が１であると判断された場合には、処理をステップＳ７８０に進め、「上下回転のみである」と判定する。 The process then proceeds to step S776, where it is determined whether the ratios a:b and a4:b4 are the same (the criterion for “the same” is as in step S770). If they are the same, the process proceeds to step S777, where it is determined that the variation is either vertical rotation alone or vertical rotation with size variation. The process then proceeds to step S778, where it is determined whether a/a4 is 1. If a/a4 is not 1, the process proceeds to step S779, where it is determined that the variation is vertical rotation with size variation. If a/a4 is 1, the process proceeds to step S780, where it is determined that the variation is vertical rotation only.

一方、ステップＳ７７６において、ａ：ｂとａ４：ｂ４の比が同じではないと判断された場合には処理をステップＳ７８１に進め、ステップＳ７７８と同様にａ／ａ４が１であるか否かを判断する。 On the other hand, if it is determined in step S776 that the ratio of a: b and a4: b4 is not the same, the process proceeds to step S781, and it is determined whether a / a4 is 1 as in step S778. To do.

そしてａ／ａ４が１である場合には処理をステップＳ７８２に進め、「左右回転、上下回転かつ左右回転の何れかである」と判断する。そして処理をステップＳ７８３に進め、ｃ／ｃ３が１であるか否かを判断する。ｃ／ｃ３が１ではないと判断された場合には処理をステップＳ７８４に進め、「上下回転かつ左右回転である」と判断し、一方、ｃ／ｃ３が１であると判断した場合には処理をステップＳ７８５に進め、「左右回転である」と判定する。 If a/a4 is 1, the process proceeds to step S782, where it is determined that the variation is either left-right rotation alone or vertical rotation with left-right rotation. The process then proceeds to step S783, where it is determined whether c/c3 is 1. If c/c3 is not 1, the process proceeds to step S784, where it is determined that the variation is vertical rotation with left-right rotation; if c/c3 is 1, the process proceeds to step S785, where it is determined that the variation is left-right rotation.

一方、ステップＳ７８１において、ａ／ａ４が１ではないと判断した場合には処理をステップＳ７８６に進め、「左右回転かつサイズ変動、上下回転かつ左右回転かつサイズ変動の何れかである」と判断する。そして処理をステップＳ７８７に進め、（ａ４／ｂ４）／（ａ／ｂ）が１よりも大きいか否かを判断する。 On the other hand, if it is determined in step S781 that a / a4 is not 1, the process proceeds to step S786, and it is determined that "one of left and right rotation and size variation, vertical rotation and left and right rotation and size variation". . Then, the process proceeds to step S787, and it is determined whether (a4 / b4) / (a / b) is larger than 1.

そして（ａ４／ｂ４）／（ａ／ｂ）が１よりも大きい場合には処理をステップＳ７８８に進め、「左回転」と判定する。そして処理をステップＳ７８９に進め、ａ：ｃとａ４：ｃ４の比が同じ（「同じである」の基準はステップＳ７７０と同じ）であるか否かを判断し、同じである場合には処理をステップＳ７９０に進め、「左右回転かつサイズ変動である」と判断する。一方、ａ：ｃとａ４：ｃ４の比が同じでない場合には処理をステップＳ７９３に進め、「上下回転かつ左右回転かつサイズ変動である」と判断する。 If (a4/b4)/(a/b) is greater than 1, the process proceeds to step S788, where “left rotation” is determined. The process then proceeds to step S789, where it is determined whether the ratios a:c and a4:c4 are the same (the criterion for “the same” is as in step S770). If they are the same, the process proceeds to step S790, where it is determined that the variation is left-right rotation with size variation. If the ratios a:c and a4:c4 are not the same, the process proceeds to step S793, where it is determined that the variation is vertical rotation with left-right rotation and size variation.

一方、ステップＳ７８７において（ａ４／ｂ４）／（ａ／ｂ）が１以下であると判断した場合には処理をステップＳ７９１に進め、「右回転」と判定する。そして処理をステップＳ７９２に進め、ｂ：ｃとｂ４：ｃ４の比が同じ（「同じである」の基準はステップＳ７７０と同じ）であるか否かを判断する。そして同じである場合には処理をステップＳ７９０に進め、「左右回転かつサイズ変動である」と判断する。一方、ｂ：ｃとｂ４：ｃ４の比が同じではない場合には処理をステップＳ７９３に進め、「上下回転かつ左右回転かつサイズ変動である」と判断する。それぞれのステップで用いられている比などは、フローチャートに書かれているものに限定されるわけではない。例えば、ステップＳ７７２、ステップＳ７７８、ステップＳ７８１ではｂ／ｂ４や（ａ＋ｂ）／（ａ４＋ｂ４）などを用いても良い。 On the other hand, if it is determined in step S787 that (a4 / b4) / (a / b) is 1 or less, the process proceeds to step S791, and “right rotation” is determined. Then, the process proceeds to step S792, and it is determined whether or not the ratio of b: c and b4: c4 is the same (the criterion of “same” is the same as that of step S770). If they are the same, the process proceeds to step S790, and it is determined that “rotate left and right and change size”. On the other hand, if the ratio of b: c and b4: c4 is not the same, the process proceeds to step S793, and it is determined that “there is vertical rotation, left-right rotation, and size variation”. The ratio used in each step is not limited to that described in the flowchart. For example, b / b4, (a + b) / (a4 + b4), or the like may be used in steps S772, S778, and S781.
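The decision procedure of steps S770 to S793 can be sketched as a small function. This is an illustrative reading of the flowchart description, not a definitive implementation: the tolerance `tol` stands in for the unspecified "allowable range", the left/right direction found at S787 is used only to pick the ratio tested next, and the quantity written c/c3 at step S783 is assumed here to mean the ratio of c to the vertical distance measured in the current image (called `c4` below).

```python
import math


def same(p, q, tol):
    """'Same' within a relative tolerance, as permitted at step S770."""
    return math.isclose(p, q, rel_tol=tol)


def classify_variation(ref, cur, tol=0.05):
    """ref = (a, b, c) without variation; cur = (a4, b4, c4) after it.
    Returns a set drawn from {'size', 'vertical', 'horizontal'}."""
    a, b, c = ref
    a4, b4, c4 = cur
    ab_same = same(a / b, a4 / b4, tol)
    ac_same = same(a / c, a4 / c4, tol)
    bc_same = same(b / c, b4 / c4, tol)
    a_unchanged = same(a / a4, 1.0, tol)

    if ab_same and ac_same:                         # S770
        return set() if a_unchanged else {"size"}   # S773 / S774
    if ab_same:                                     # S776
        if a_unchanged:                             # S778
            return {"vertical"}                     # S780
        return {"vertical", "size"}                 # S779
    if a_unchanged:                                 # S781
        if same(c / c4, 1.0, tol):                  # S783 (assumed c4)
            return {"horizontal"}                   # S785
        return {"vertical", "horizontal"}           # S784
    turned_left = (a4 / b4) / (a / b) > 1           # S787
    side_same = ac_same if turned_left else bc_same  # S789 / S792
    if side_same:
        return {"horizontal", "size"}               # S790
    return {"vertical", "horizontal", "size"}       # S793
```

An empty set means no variation was detected. As the text notes, other ratios such as b/b4 or (a+b)/(a4+b4) could replace a/a4 in the size checks.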

以上の処理によって、顔のサイズ変動や顔の回転変動した場合の判別が可能となる。さらに、これらの変動が判別された場合には正規化特徴変化量算出部７０３で得られる夫々の特徴変化量を正規化することによって顔のサイズが変動した場合や顔が回転した場合においても表情の認識が可能となる。 By the above processing, it is possible to determine when the face size changes or the face rotation changes. Further, when these variations are determined, the facial expression changes even when the face size changes or the face rotates by normalizing each feature change amount obtained by the normalized feature change amount calculation unit 703. Can be recognized.

特徴量正規化方法は、例えば、サイズ変動のみである場合については、図２８と図２９を用いて説明すると、入力画像から得られるすべての特徴変化量を１／（ａ１／ａ）倍すれば良い。なお、１／（ａ１／ａ）ではなくて１／（ｂ１／ｂ）、１／（（ａ１＋ｂ１）／（ａ＋ｂ））、１／（ｃ１／ｃ）やそれ以外の特徴を用いても良い。また、図３６に示すように、上下回転かつサイズ変動した場合には、上下回転が影響を与える目の端点と口の端点距離を（ａ５／ｃ５）／（ａ／ｃ）倍した後ですべての特徴量を１／（ａ１／ａ）倍すれば良い。上下回転した場合においても同様に（ａ５／ｃ５）／（ａ／ｃ）を用いることに限定するわけではない。このようにして顔のサイズ変動、上下左右回転変動を判定し、特徴変化量を正規化することによって顔のサイズが変動した場合や顔が上下左右回転変動した場合でも表情の認識が可能である。 As for the feature amount normalization method, in the case of size variation only, for example, all feature change amounts obtained from the input image may be multiplied by 1/(a1/a), as explained with reference to FIGS. 28 and 29. Instead of 1/(a1/a), 1/(b1/b), 1/((a1+b1)/(a+b)), 1/(c1/c), or other features may be used. As shown in FIG. 36, in the case of vertical rotation with size variation, the distance between the eye end points and the mouth end points, which is affected by the vertical rotation, is first multiplied by (a5/c5)/(a/c), and then all feature amounts are multiplied by 1/(a1/a). For vertical rotation as well, the factor is not limited to (a5/c5)/(a/c). By determining face size variation and vertical and left-right rotation variation in this way and normalizing the feature change amounts, facial expressions can be recognized even when the face size varies or the face rotates vertically or horizontally.
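A minimal sketch of this normalization is shown below, under stated assumptions: the feature change amounts are held in a dictionary, the key `eye_mouth_distance` is a hypothetical name for the eye-to-mouth end point distance, and `a_cur` and `c_cur` stand for the distances measured in the input image (written a1, or a5 and c5, in the text depending on the figure).

```python
def normalize_changes(changes, a_ref, a_cur,
                      c_ref=None, c_cur=None, vertical=False):
    """Return normalized feature change amounts.

    Size variation: every value is multiplied by 1 / (a_cur / a_ref).
    Vertical rotation: the eye-mouth end point distance is first
    multiplied by (a_cur / c_cur) / (a_ref / c_ref).
    """
    out = dict(changes)
    if vertical:
        out["eye_mouth_distance"] *= (a_cur / c_cur) / (a_ref / c_ref)
    scale = 1.0 / (a_cur / a_ref)
    return {name: value * scale for name, value in out.items()}


# Made-up change amounts for a face that appears twice as large.
raw = {"eye_mouth_distance": 2.0, "mouth_edge_length": 4.0}
size_only = normalize_changes(raw, a_ref=2.0, a_cur=4.0)
```

Passing `vertical=True` together with `c_ref` and `c_cur` applies the additional correction for vertical rotation before the size scaling.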

図３７は、左右眼及び鼻の位置検出から上下・左右回転変動、サイズ変動に応じて各特徴量を正規化し、表情判定する処理のフローチャートである。 FIG. 37 is a flowchart of processing for determining facial expressions by normalizing each feature amount in accordance with vertical / horizontal rotation fluctuations and size fluctuations from detection of left and right eye and nose positions.

ステップＳ８７０で左右眼の重心座標と鼻の重心座標を検出した後、ステップＳ８７１で左右・上下回転変動またはサイズ変動の判定を行い、もし、左右・上下回転変動がない場合にはステップＳ８７２で特徴変化量正規化必要ないと判定され、参照特徴量との比を算出することにより特徴量の変化量を算出し、ステップＳ８７３で各特徴毎の得点算出を行い、ステップＳ８７４で各特徴量変化量から算出された得点総和を算出する。一方、ステップＳ８７１で左右・上下回転変動またはサイズ変動がありと判定された場合、ステップＳ８７５で各特徴量正規化必要ありと判定され、各特徴量を参照特徴量との比を算出することにより特徴量の変化量を算出し、上下・左右回転変動またはサイズ変動に応じて特徴量の変化量を正規化した後、ステップＳ８７３で各特徴量変化量毎の得点算出を行い、ステップＳ８７４で各特徴量変化量から算出された得点総和を算出する。 After detecting the center-of-gravity coordinates of the left and right eyes and the center of gravity of the nose in step S870, the left-right / vertical rotation variation or size variation is determined in step S871, and if there is no left-right / vertical rotation variation, the feature is determined in step S872. It is determined that change amount normalization is not necessary, and a change amount of the feature amount is calculated by calculating a ratio with the reference feature amount. In step S873, a score is calculated for each feature, and each feature amount change amount is determined in step S874. The total score calculated from is calculated. On the other hand, if it is determined in step S871 that there is left / right / vertical rotation variation or size variation, it is determined in step S875 that each feature amount needs to be normalized, and by calculating the ratio of each feature amount to the reference feature amount, After calculating the change amount of the feature amount and normalizing the change amount of the feature amount according to the vertical / horizontal rotation variation or the size variation, the score is calculated for each feature amount change amount in step S873, and in step S874, The total score calculated from the feature amount change amount is calculated.

そして算出された得点の総和から、ステップＳ８７６で入力画像における顔の表情の判定を第１の実施形態と同様にして行う。 Then, in step S876, the facial expression in the input image is determined from the calculated total score in the same manner as in the first embodiment.

［第７の実施形態］ [Seventh Embodiment]
図３８は本実施形態に係る撮像装置の機能構成を示すブロック図である。本実施形態に係る撮像装置は同図に示す如く、撮像部８２０、画像処理部８２１、画像２次記憶部８２２から構成されている。 FIG. 38 is a block diagram illustrating a functional configuration of the imaging apparatus according to the present embodiment. As shown in the figure, the imaging apparatus according to the present embodiment includes an imaging unit 820, an image processing unit 821, and an image secondary storage unit 822.

図３９は、撮像部８２０の機能構成を示す図で、撮像部８２０は大まかには同図に示す如く、結像光学系８３０、固体撮像素子８３１、映像信号処理８３２、画像１次記憶部８３３により構成されている。 FIG. 39 is a diagram showing the functional configuration of the imaging unit 820. As shown in the figure, the imaging unit 820 roughly comprises an imaging optical system 830, a solid-state imaging device 831, a video signal processing unit 832, and an image primary storage unit 833.

結像光学系８３０は例えばレンズであって、周知の通り後段の固体撮像素子８３１に対して外界の光を結像させる。固体撮像素子８３１は例えばＣＣＤであって、周知の通り結像光学系８３０により結像された像を電気信号に変換し、結果として撮像画像を電気信号として後段の映像信号処理回路８３２に出力され、映像信号処理部８３２はこの電気信号に対してＡ／Ｄ変換を施し、ディジタル信号として後段の画像１次記憶部８３３に出力する。すなわち、画像１次記憶部８３３には、撮像画像のデータが出力される。画像１次記憶部８３３は例えばフラッシュメモリなどの記憶媒体で構成されており、この撮像画像のデータを記憶する。 The imaging optical system 830 is, for example, a lens and, as is well known, forms an image of external light on the subsequent solid-state imaging device 831. The solid-state imaging device 831 is, for example, a CCD and, as is well known, converts the image formed by the imaging optical system 830 into an electrical signal, so that the captured image is output as an electrical signal to the subsequent video signal processing circuit 832. The video signal processing unit 832 performs A/D conversion on this electrical signal and outputs it as a digital signal to the subsequent image primary storage unit 833. That is, captured image data is output to the image primary storage unit 833. The image primary storage unit 833 is configured by a storage medium such as a flash memory, and stores the captured image data.

FIG. 40 is a block diagram showing the functional configuration of the image processing unit 821. The image processing unit 821 comprises: an image input unit 840 that reads the captured image data stored in the image primary storage unit 833 and outputs it to the feature amount extraction unit 842; a facial expression information input unit 841 that receives the facial expression information described later and outputs it to the feature amount extraction unit 842; the feature amount extraction unit 842; a reference feature holding unit 843; a change amount calculation unit 844 that calculates change amounts by computing ratios of the feature amounts obtained by the feature amount extraction unit 842; a change amount normalization unit 845 that normalizes the change amount of each feature calculated by the change amount calculation unit 844 in accordance with rotation, vertical variation, or size variation; a score calculation unit 846 that calculates a score for each change amount from the change amounts normalized by the change amount normalization unit 845; and a facial expression determination unit 847. Unless otherwise noted, each unit shown in the figure has the same function as the unit of the same name in the embodiments described above.

In the facial expression information input unit 841, the photographer selects the facial expression to be captured, and this selection is input as the facial expression information. For example, a photographer who wants to capture a smile selects the smile shooting mode, so that only smiles are captured. The facial expression information is thus information indicating the selected facial expression. The selection is not limited to a single expression; several may be selected.

FIG. 41 is a block diagram showing the functional configuration of the feature amount extraction unit 842. As shown in the figure, the feature amount extraction unit 842 comprises a nose, eye, and mouth position detection unit 850, an edge image generation unit 851, a facial feature edge extraction unit 852, a facial feature point extraction unit 853, and a facial expression feature amount extraction unit 854. Each unit has the same function as the corresponding unit shown in FIG. 25, so its description is omitted.

The image input unit 840 of the image processing unit 821 reads the captured image data stored in the image primary storage unit 833 and outputs it to the feature amount extraction unit 842. Based on the facial expression information from the facial expression information input unit 841, the feature amount extraction unit 842 extracts the feature amounts for the facial expression the photographer has chosen to capture. For example, when the photographer wants to capture a smile, it extracts the feature amounts needed for smile recognition.
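As a minimal sketch of expression-dependent feature selection, the following Python fragment maps each selectable expression to the features it needs and takes the union when several expressions are selected. The expression names, the feature names, and the `FEATURES_BY_EXPRESSION` table are illustrative assumptions; this embodiment does not state which features belong to which expression.

```python
# Hypothetical table of the features each expression needs.
# Both the expression names and the feature names are assumptions.
FEATURES_BY_EXPRESSION = {
    "smile": ["eye_edge_length", "eye_mouth_endpoint_distance", "mouth_endpoint_slope"],
    "surprise": ["eye_edge_length", "eyebrow_eye_distance"],
}


def features_to_extract(selected_expressions):
    """Return the features needed for all selected expressions.

    Duplicates are dropped and first-seen order is preserved, so a
    feature shared by two expressions is extracted only once.
    """
    needed = []
    for expression in selected_expressions:
        for feature in FEATURES_BY_EXPRESSION[expression]:
            if feature not in needed:
                needed.append(feature)
    return needed
```

Calling `features_to_extract(["smile", "surprise"])` would list the shared eye edge length only once, which matches the remark that more than one expression may be selected.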

Next, the change amount calculation unit 844 calculates the change amount of each feature amount as the ratio between the extracted feature amount and the corresponding feature amount held by the reference feature holding unit 843. The change amount normalization unit 845 normalizes these ratios according to variation in face size and face rotation. The score calculation unit 846 then calculates scores according to the weight of each feature and the change amount of each feature.
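The three steps just described (ratio to the reference, normalization, weighted scoring) can be sketched in Python as follows. This is only an illustration under stated assumptions: the feature names, the single scalar normalization value, and the scoring rule (weight times the deviation of the ratio from 1) are hypothetical and are not taken from the embodiment.

```python
def change_amounts(extracted, reference):
    """Change amount of each feature: extracted value divided by reference value."""
    return {name: extracted[name] / reference[name] for name in extracted}


def normalize(changes, scale):
    """Divide every change amount by one normalization value.

    Assumption: a single scalar (for example, the apparent face-size ratio)
    stands in for the size and rotation normalization.
    """
    return {name: value / scale for name, value in changes.items()}


def total_score(changes, weights):
    """Assumed scoring rule: weight times deviation of the ratio from 1 (no change)."""
    return sum(weights[name] * abs(changes[name] - 1.0) for name in changes)


# Hypothetical measurements: the face appears twice as large as the reference face.
reference = {"eye_edge_length": 8.0, "eye_mouth_distance": 40.0}
extracted = {"eye_edge_length": 24.0, "eye_mouth_distance": 60.0}
weights = {"eye_edge_length": 2.0, "eye_mouth_distance": 4.0}

raw = change_amounts(extracted, reference)   # ratios before normalization
normalized = normalize(raw, scale=2.0)       # remove the size factor of 2
score = total_score(normalized, weights)
```

Without the normalization step, the size change alone would inflate both ratios and be mistaken for a change of expression.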

FIG. 42 is a block diagram showing the functional configuration of the facial expression determination unit 847. As in the second embodiment, the facial expression possibility determination unit 860 judges whether the face may show the expression specified through the facial expression information input unit 841 by thresholding the sum of the per-feature scores calculated by the score calculation unit 846. The facial expression confirmation unit 861 then confirms, from the continuity of these possibility determination results, that the expression is the one specified through the facial expression information input unit 841. When it is, the image data obtained by the imaging unit 820 is stored in the image secondary storage unit 822.
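A minimal Python sketch of this two-stage decision is given below, assuming that "continuity" means a run of consecutive frames whose score sum reaches the threshold. The function name, the `min_run` parameter, and the use of a list in place of the image secondary storage unit 822 are assumptions made for illustration.

```python
def record_matching_frames(frames, totals, threshold, min_run):
    """Return the frames judged to show the selected expression.

    A frame is a candidate when its score sum reaches `threshold`
    (possibility determination). The expression is treated as confirmed
    once `min_run` consecutive candidates have been seen (assumed
    continuity rule), and confirmed frames are kept.
    """
    stored = []
    run = 0  # length of the current run of candidate frames
    for frame, total in zip(frames, totals):
        run = run + 1 if total >= threshold else 0
        if run >= min_run:
            stored.append(frame)  # stands in for writing to secondary storage
    return stored


# Hypothetical frame labels and score sums.
frames = ["f0", "f1", "f2", "f3", "f4", "f5"]
totals = [1, 5, 6, 2, 7, 8]
kept = record_matching_frames(frames, totals, threshold=5, min_run=2)
```

With these values, `f1` and `f4` reach the threshold but are not kept because the run is still too short, which is how a single spurious high score is filtered out.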

In this way, only images showing the facial expression the photographer intended are recorded.

The functional configuration of the image processing unit 821 is not limited to the above; any apparatus (or program) configured to perform the facial expression recognition processing of the embodiments described above may be used.

[Eighth Embodiment]
FIG. 43 is a block diagram showing the functional configuration of the imaging apparatus according to this embodiment. Parts identical to those in FIG. 38 carry the same reference numerals, and their description is omitted. The imaging apparatus according to this embodiment adds an image display unit 873 to the imaging apparatus of the seventh embodiment.

The image display unit 873 is, for example, a liquid crystal screen and displays the images recorded in the image secondary storage unit 822. The image display unit 873 may also display only the images selected by the photographer in the image processing unit 871. The photographer may further choose whether to store an image displayed on the image display unit 873 in the image secondary storage unit 872 or to delete it. To this end, the image display unit 873 may be, for example, a touch-panel liquid crystal screen that displays a menu for choosing between storing and deleting the displayed image, so that the photographer can make the choice on the screen.

[Other Embodiments]
Needless to say, the object of the present invention is also achieved by supplying a system or apparatus with a recording medium (or storage medium) on which software program code implementing the functions of the embodiments described above is recorded, and having the computer (or CPU or MPU) of that system or apparatus read and execute the program code stored on the recording medium. In this case, the program code read from the recording medium itself implements the functions of the embodiments, and the recording medium on which the program code is recorded constitutes the present invention.

Needless to say, the functions of the embodiments are implemented not only when the computer executes the program code it has read, but also when an operating system (OS) or the like running on the computer performs part or all of the actual processing according to the instructions of the program code and that processing implements the functions of the embodiments.

Needless to say, the invention also covers the case where the program code read from the recording medium is written to memory on a function expansion card inserted into the computer or on a function expansion unit connected to the computer, after which a CPU or the like on that card or unit performs part or all of the actual processing according to the instructions of the program code, and that processing implements the functions of the embodiments.

When the present invention is applied to such a recording medium, the recording medium stores program code corresponding to the flowcharts described above.

FIG. 1 is a block diagram showing the functional configuration of the image processing apparatus according to the first embodiment of the present invention.
FIG. 2 is a block diagram showing the functional configuration of the feature amount extraction unit 101.
FIG. 3 shows the eye region, cheek region, and mouth region in an edge image.
FIG. 4 shows the feature points detected by the facial feature point extraction unit 113.
FIG. 5 illustrates the "shape of the eye line edge".
FIG. 6 is a graph referred to when calculating a score from the change amount of the eye edge length, an example of a feature whose change amount differs between individuals.
FIG. 7 is a graph referred to when calculating a score from the change amount of the distance between the end points of the eye and the mouth, a feature whose change amount does not differ between individuals.
FIG. 8 is a flowchart of the process that uses the per-feature scores obtained by the score calculation unit 104 to determine whether the facial expression in the input image is a "specific expression".
FIG. 9 shows an example of the score distribution for an expression of joy.
FIG. 10 is a block diagram showing the functional configuration of the image processing apparatus according to the second embodiment of the present invention.
FIG. 11 is a block diagram showing the functional configuration of the facial expression determination unit 165.
FIG. 12 plots the difference between the score sum and the threshold line 182 when the scene changes from a neutral, expressionless face to an expression of joy, with the image number uniquely assigned to each time-series image on the horizontal axis and the difference on the vertical axis.
FIG. 13 plots the difference between the score sum and the threshold line 183 for a conversation scene, which is a non-expression scene, with the image number of the time-series images on the horizontal axis and the difference on the vertical axis.
FIG. 14 is a flowchart of the process by which the facial expression confirmation unit 171 determines the start of an expression of joy in the images input continuously from the image input unit 100.
FIG. 15 is a flowchart of the process by which the facial expression confirmation unit 171 determines the end of an expression of joy in the images input continuously from the image input unit 100.
FIG. 16 is a block diagram showing the functional configuration of the image processing apparatus according to the third embodiment of the present invention.
FIG. 17 is a block diagram showing the functional configuration of the feature amount extraction unit 212.
FIG. 18 shows the feature amounts corresponding to each expression (expression 1, expression 2, expression 3) selected by the expression selection unit 211.
FIG. 19 is a schematic diagram showing how a score is calculated for each expression from the change amounts.
FIG. 20 is a flowchart of the process that determines whether the eyes are closed from the eye shape score calculated by the score calculation unit.
FIG. 21 shows the eye edge of the reference face, that is, the eye edge when the eyes are open.
FIG. 22 shows the eye edge when the eyes are closed.
FIG. 23 shows the basic configuration of the image processing apparatus according to the first embodiment of the present invention.
FIG. 24 is a block diagram showing the functional configuration of the image processing apparatus according to the sixth embodiment of the present invention.
FIG. 25 is a block diagram showing the functional configuration of the feature amount extraction unit 701.
FIG. 26 shows the centroid positions of the eyes and the nose of a face in an image.
FIG. 27 shows the centroids of the left and right inner eye corners and of the nose.
FIG. 28 shows the distance between the left and right eyes, the distances between each eye and the nose, and the eye-nose distance when there is no variation.
FIG. 29 shows the same distances when there is size variation.
FIG. 30 shows the same distances when there is vertical rotation variation.
FIG. 31 shows the same distances when there is horizontal rotation variation.
FIG. 32 shows the distance between the end points of the left and right eyes for a neutral face.
FIG. 33 shows the distance between the end points of the left and right eyes for a smiling face.
FIG. 34 is a flowchart of the process that determines size variation, horizontal rotation variation, and vertical rotation variation.
FIG. 35 shows the distance between the left and right eyes, the distances between each eye and the nose, and the eye-nose distance when any one of size variation, horizontal rotation variation, or vertical rotation variation has occurred.
FIG. 36 shows the same distances when there is both vertical rotation variation and size variation.
FIG. 37 is a flowchart of the process that detects the positions of the left and right eyes and the nose, normalizes each feature amount according to vertical and horizontal rotation variation and size variation, and determines the facial expression.
FIG. 38 is a block diagram showing the functional configuration of the imaging apparatus according to the seventh embodiment of the present invention.
FIG. 39 shows the functional configuration of the imaging unit 820.
FIG. 40 is a block diagram showing the functional configuration of the image processing unit 821.
FIG. 41 is a block diagram showing the functional configuration of the feature amount extraction unit 842.
FIG. 42 is a block diagram showing the functional configuration of the facial expression determination unit 847.
FIG. 43 is a block diagram showing the functional configuration of the imaging apparatus according to the eighth embodiment of the present invention.

Claims

1. An image processing apparatus comprising:
input means for inputting images of successive frames including a face;
first feature amount calculation means for obtaining a feature amount for each of a preset group of parts of the face in the image of each of the successive frames;
second feature amount calculation means for obtaining a feature amount for each of the preset group of parts of the face in an image including a face with a preset facial expression;
change amount calculation means for obtaining a change amount of the feature amount of each of the preset group of parts, based on the difference or ratio between the feature amount obtained by the first feature amount calculation means and the feature amount obtained by the second feature amount calculation means;
score calculation means for calculating a score for each of the preset group of parts, based on the change amount obtained by the change amount calculation means for that part;
first determination means for determining the facial expression in the image input by the input means by comparing the distribution of the scores calculated by the score calculation means for the preset group of parts with the distribution of scores for the preset group of parts calculated for each facial expression; and
second determination means for, after the first determination means determines that the facial expression in each image of p consecutive frames is a first expression, determining the facial expression in the image of each frame as the first expression until the first determination means determines that the facial expression in each image of q consecutive frames is a second expression different from the first expression.

2. The image processing apparatus according to claim 1, wherein:
the change amount calculation means obtains the change amount of the feature amount of each of the preset group of parts based on the difference or ratio between the feature amount obtained by the first feature amount calculation means and the feature amount obtained by the second feature amount calculation means, and then normalizes that change amount by a normalization value based on variation in face size and rotation; and
the score calculation means calculates a score for each of the preset group of parts based on the normalized change amount for that part.

3. The image processing apparatus according to claim 2, wherein the change amount calculating means:
calculates a horizontal distance between both eyes and horizontal and vertical distances between the eyes and the nose using at least one of the centroid position of the eye region, the end point positions of the eyes, the centroid position of the nose region, the centroid position of the left and right nostrils, and the positions of the left and right nostrils obtained by the first feature amount calculation means, and calculates a horizontal distance between both eyes and horizontal and vertical distances between the eyes and the nose using at least one of the centroid position of the eye region, the end point positions of the eyes, the centroid position of the nose region, and the centroid position of the left and right nostrils obtained by the second feature amount calculation means; and
normalizes the change amount of the feature amount of each of the preset group of parts using at least one of the ratio of the horizontal and vertical distances between both eyes and the ratio of the horizontal and vertical distances between the eyes and the nose, as obtained from the first and second feature amount calculation means.

4. The image processing apparatus according to claim 2, wherein the change amount calculating means normalizes the change amount of the feature amount of each of the preset group of parts using the ratio between the left-eye/right-eye end point distance ratio obtained from the first feature amount calculating means and the left-eye/right-eye end point distance ratio obtained from the second feature amount calculating means.

5. The image processing apparatus according to any one of claims 1 to 4, wherein the first determination means further obtains the sum of the scores calculated by the score calculation means for the preset group of parts, determines whether the image input by the input means is a facial expression scene according to whether the sum is equal to or greater than a preset value,
and determines the facial expression of the face in the image input by the input means with further reference to that determination result.

6. The image processing apparatus according to any one of claims 1 to 5, wherein the first and second feature amount calculating means obtain an edge in the image for each of the preset group of parts and further obtain the end points of each obtained edge, and
the change amount calculation means obtains the change amount of the feature amount using at least one of, for each of the preset group of parts, the change amount of the edge length, the change amount of the distance between end points, and the change amount of the slope of the line segment joining two end points.

7. An image processing method performed by an image processing apparatus, comprising:
an input step in which input means of the image processing apparatus inputs images of successive frames including a face;
a first feature amount calculation step in which first feature amount calculation means of the image processing apparatus obtains a feature amount for each of a preset group of parts of the face in the image of each of the successive frames;
a second feature amount calculation step in which second feature amount calculation means of the image processing apparatus obtains a feature amount for each of the preset group of parts of the face in an image including a face with a preset facial expression;
a change amount calculation step in which change amount calculation means of the image processing apparatus obtains a change amount of the feature amount of each of the preset group of parts, based on the difference or ratio between the feature amount obtained in the first feature amount calculation step and the feature amount obtained in the second feature amount calculation step;
a score calculation step in which score calculation means of the image processing apparatus calculates a score for each of the preset group of parts, based on the change amount obtained in the change amount calculation step for that part;
a first determination step in which first determination means of the image processing apparatus determines the facial expression in the image input in the input step by comparing the distribution of the scores calculated in the score calculation step for the preset group of parts with the distribution of scores for the preset group of parts calculated for each facial expression; and
a second determination step in which second determination means of the image processing apparatus, after it is determined in the first determination step that the facial expression in each image of p consecutive frames is a first expression, determines the facial expression in the image of each frame as the first expression until it is determined in the first determination step that the facial expression in each image of q consecutive frames is a second expression different from the first expression.

8. An imaging apparatus comprising:
the image processing apparatus according to claim 1;
imaging means for capturing the images of the successive frames input to the input means; and
storage means for storing an image including a face determined by the first determination means to have a predetermined facial expression.

9. The imaging apparatus according to claim 8, further comprising image display means for displaying the image determined by the first determination means.

10. A computer program for causing a computer to function as each means of the image processing apparatus according to any one of claims 1 to 6.

11. A computer-readable storage medium storing the computer program according to claim 10.