JP2013041493A

JP2013041493A - Face image feature amount generation apparatus and program for generating face image feature amount

Info

Publication number: JP2013041493A
Application number: JP2011178829A
Authority: JP
Inventors: Makoto Okuda; 誠奥田
Original assignee: Nippon Hoso Kyokai NHK; Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2011-08-18
Filing date: 2011-08-18
Publication date: 2013-02-28

Abstract

PROBLEM TO BE SOLVED: To generate a face image feature amount with high accuracy and with robustness to the effects of light. SOLUTION: A face image feature amount generation apparatus includes: an image data acquiring section 10 for importing image data; a face area extraction section 20 for executing face detection processing on the image data that the image data acquiring section 10 has imported and extracting a face area from the image data; an analysis area determination section 30 for determining, from the face area that the face area extraction section 20 has extracted, a first area and a second area obtained by halving a circular or elliptical analysis area; and a face image feature amount calculation section 40 for calculating feature amounts of the first area and the second area that the analysis area determination section 30 has determined, concatenating the calculated feature amounts, and generating the face image feature amount.

Description

The present invention relates to a face image feature value generation device and a face image feature value generation program.

Techniques are known for robustly detecting a person's face area in each frame image of video data captured by a video camera (see, for example, Non-Patent Document 1).
Techniques are also known that detect the positions of the eyes, mouth, and other parts of a face in a person's face image, divide the face image on the basis of those positions, calculate an image feature vector for each divided region, and concatenate these image feature vectors to generate a face image feature value (see, for example, Non-Patent Document 2).

Non-Patent Document 1: Paul Viola, Michael J. Jones, "Robust Real-Time Face Detection", International Journal of Computer Vision, 2004, Vol. 57, No. 2, pp. 137-154.
Non-Patent Document 2: Zisheng Li, Jun-ichi Imai, Masahide Kaneko, "Facial Expression Recognition Using Facial-component-based Bag of Words and PHOG Descriptors", The Journal of the Institute of Image Information and Television Engineers, 2010, Vol. 64, No. 2, pp. 230-236.

The technique described in Non-Patent Document 1 can detect a face area robustly. However, it detects the face area as a rectangle in an image containing a person's face, so the recognition target also includes information that is not needed for face recognition or facial expression recognition, such as the background, hair, and accessories. This lowers recognition accuracy.
The technique described in Non-Patent Document 2 detects the positions of the eyes and mouth and generates a face image feature value only from the regions that matter for face recognition and facial expression recognition. However, it is difficult to detect the positions of the eyes, mouth, and other parts robustly in images of a person captured in an environment where outdoor light or indoor lighting changes. It is therefore difficult to generate a person's face image feature value stably without being affected by light.

The present invention has been made to solve the above problems. Its object is to provide a face image feature value generation device and a face image feature value generation program that are robust to the influence of light and generate a face image feature value with high accuracy.

[1] To solve the above problems, a face image feature value generation device according to one aspect of the present invention includes: an image data acquisition unit that captures image data; a face area extraction unit that performs face detection processing on the image data captured by the image data acquisition unit and extracts a face area from the image data; an analysis region determination unit that determines, from the face area extracted by the face area extraction unit, a first region and a second region that bisect a circular or elliptical analysis region; and a face image feature value calculation unit that calculates a feature value for each of the first region and the second region determined by the analysis region determination unit and concatenates the calculated feature values to generate a face image feature value.
[2] In the face image feature value generation device according to [1] above, the face area extraction unit extracts a rectangular face area including a face from the image data, and the analysis region determination unit determines the first region and the second region by dividing in two, in the vertical direction, the circular or elliptical analysis region, which is smaller than a circle or ellipse inscribed in the face area.
[3] In the face image feature value generation device according to [2] above, the face area extraction unit normalizes the rectangular face area to a face area of a predetermined size.
[4] To solve the above problems, a face image feature value generation program according to one aspect of the present invention causes a computer to function as: an image data acquisition unit that captures image data; a face area extraction unit that performs face detection processing on the image data captured by the image data acquisition unit and extracts a face area from the image data; an analysis region determination unit that determines, from the face area extracted by the face area extraction unit, a first region and a second region that bisect a circular or elliptical analysis region; and a face image feature value calculation unit that calculates a feature value for each of the first region and the second region determined by the analysis region determination unit and concatenates the calculated feature values to generate a face image feature value.

According to the present invention, a face image feature value can be generated with high accuracy and with robustness to the influence of light.

FIG. 1 is a block diagram showing the functional configuration of a face image feature value generation device according to an embodiment of the present invention.
FIG. 2 schematically shows an example of a frame image, a rectangular face area extracted from the frame image, and a normalized face area obtained by normalizing the face area.
FIG. 3 is a line drawing that shows, in a visually clear form, the analysis region determined by the analysis region determination unit on the basis of the normalized face area.
FIG. 4 schematically shows the histogram of feature values in the upper analysis region, the histogram of feature values in the lower analysis region, and the histogram of feature values for the whole analysis region formed by concatenating the two, all obtained by the face image feature value calculation unit.
FIG. 5 is a flowchart showing the processing procedure of the face image feature value generation device according to the embodiment.

Embodiments of the present invention are described in detail below with reference to the drawings.
FIG. 1 is a block diagram showing the functional configuration of a face image feature value generation device according to an embodiment of the present invention. As shown in the figure, the face image feature value generation device 1 includes an image data acquisition unit 10, a face area extraction unit 20, an analysis region determination unit 30, and a face image feature value calculation unit 40.

The image data acquisition unit 10 captures image data supplied from an external device (not shown). The image data is still image data or moving image data. When the image data is still image data, the image data acquisition unit 10 supplies the captured image data to the face area extraction unit 20 as frame image data. When the image data is moving image data, the image data acquisition unit 10 detects key frames in the captured moving image data and supplies the key frame data to the face area extraction unit 20 as frame image data, either sequentially or at intervals of a predetermined number of frames.
The external device is, for example, an imaging device or a recording device.
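The choice between supplying key frames sequentially and supplying them at a fixed interval can be sketched as follows. This is a minimal illustration in Python, not code from the specification: the function name `select_frames` and the `interval` parameter are hypothetical, key frame detection itself is assumed to have been done already, and reading "a predetermined number of frames" as "one key frame out of every `interval`" is an assumption.

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")


def select_frames(key_frames: Sequence[Frame], interval: int = 1) -> List[Frame]:
    """Return every `interval`-th key frame, starting with the first.

    interval=1 forwards every key frame in order; a larger value forwards
    one key frame out of each group of `interval` (hypothetical reading).
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return list(key_frames[::interval])
```

With `interval=1` the function simply passes every key frame on, which corresponds to the sequential case.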

The face area extraction unit 20 captures the frame image data supplied from the image data acquisition unit 10. It performs face detection processing on the captured frame image data and detects a person's face area in that data. The face area is a rectangular image region that contains the person's face. For the face detection processing, the face area extraction unit 20 uses a known face detection algorithm, for example AdaBoost.
Details of known face detection algorithms are disclosed in, for example, Non-Patent Document 1 above.

The face area extraction unit 20 extracts the data of the detected face area from the frame image data and normalizes the face area data to image data of a predetermined size (for example, 128 × 128 pixels). As the normalization processing, the face area extraction unit 20 performs image processing that enlarges or reduces the face area to a rectangular region of the predetermined size. Because the faces in frame images vary in size, the face area extraction unit 20 enlarges or reduces each face area so that all face areas have about the same resolution. This makes the number and level of the local features detected by the downstream face image feature value calculation unit 40 comparable across faces, which improves the recognition rate when these local features are used in face recognition or facial expression recognition processing. The face area extraction unit 20 supplies the normalized image data, that is, the normalized face area data, to the analysis region determination unit 30.
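The normalization step can be illustrated with a short Python sketch. The specification does not say which interpolation is used, so nearest-neighbour sampling here is an assumption, as are the function name `normalize_face_area` and the representation of an image as a list of rows of grey levels.

```python
from typing import List

Image = List[List[int]]  # rows of grey-level pixel values (assumed representation)


def normalize_face_area(face: Image, size: int = 128) -> Image:
    """Enlarge or reduce a rectangular face area to size x size pixels.

    Nearest-neighbour sampling is used purely for illustration; the
    specification does not prescribe an interpolation method.
    """
    src_h = len(face)
    src_w = len(face[0])
    return [
        # Map each output pixel back to the source pixel that covers it.
        [face[y * src_h // size][x * src_w // size] for x in range(size)]
        for y in range(size)
    ]
```

The same function handles both enlargement and reduction, so faces of any detected size end up at the same resolution before feature extraction.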

The analysis region determination unit 30 captures the normalized face area data supplied from the face area extraction unit 20 and, on the basis of that data, determines an analysis region for calculating the face image feature value. The analysis region is, for example, a circular region (an ellipse or a perfect circle) that is contained in the normalized face area and centered on the center of the normalized face area. The analysis region determination unit 30 bisects the analysis region with, for example, a straight line that runs horizontally across the normalized face area and passes through its center, and determines the upper part to be the upper analysis region (first region) and the lower part to be the lower analysis region (second region). In other words, the analysis region determination unit 30 takes a circular or elliptical analysis region that is smaller than the circle or ellipse inscribed in the normalized face area, divides it in two in the up-down (vertical) direction, and thereby determines the upper analysis region and the lower analysis region.
The upper analysis region and the lower analysis region may have the same area or different areas.

The face image feature value calculation unit 40 calculates a Bag-of-Keypoints, which is a feature value, for each of the upper analysis region and the lower analysis region of the normalized face area determined by the analysis region determination unit 30. Specifically, the face image feature value calculation unit 40 calculates local features (keypoints), such as SIFT (Scale-Invariant Feature Transform) features or SURF (Speeded Up Robust Features) features, from the upper analysis region. It then classifies these features by performing cluster analysis on them and generates a histogram of the frequency with which each cluster occurs. For the cluster analysis, the face image feature value calculation unit 40 uses, for example, the k-means method. The generated histogram is the Bag-of-Keypoints. The face image feature value calculation unit 40 obtains a Bag-of-Keypoints for the lower analysis region in the same way as for the upper analysis region.
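The histogram step can be sketched in Python as follows. This is an illustration under stated assumptions, not the specification's implementation: the local feature descriptors are taken as already computed, the cluster centres are taken as already obtained by a prior cluster analysis, and the names `squared_distance` and `bag_of_keypoints` are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(p: Vector, q: Vector) -> float:
    """Squared Euclidean distance between two descriptors."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def bag_of_keypoints(descriptors: Sequence[Vector],
                     centres: Sequence[Vector]) -> List[int]:
    """Count how many descriptors are nearest to each cluster centre.

    The result has one bin per cluster; bin k holds the number of
    descriptors whose nearest centre is centres[k].
    """
    histogram = [0] * len(centres)
    for d in descriptors:
        nearest = min(range(len(centres)),
                      key=lambda k: squared_distance(d, centres[k]))
        histogram[nearest] += 1
    return histogram
```

Calling the function once with the descriptors from the upper analysis region and once with those from the lower analysis region yields the two histograms that are later concatenated.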

The face image feature value calculation unit 40 concatenates the two calculated Bag-of-Keypoints to generate a Bag-of-Keypoints for the whole analysis region and outputs it as the face image feature value. For example, the face image feature value calculation unit 40 concatenates the 125-dimensional Bag-of-Keypoints for the lower analysis region onto the 175-dimensional Bag-of-Keypoints for the upper analysis region, generating a 300-dimensional Bag-of-Keypoints for the whole analysis region.

FIG. 2 schematically shows an example of a frame image, a rectangular face area extracted from the frame image, and a normalized face area obtained by normalizing the face area. Reference numeral 2 denotes the frame image, reference numeral 2a denotes the face area detected in the frame image 2 by the face area extraction unit 20, and reference numeral 2b denotes the normalized face area obtained when the face area extraction unit 20 normalizes (here, reduces) the face area. As shown in the figure, the normalized face area 2b is normalized to L_X × L_Y so that it contains the facial parts: both eyebrows, both eyes, the nose, and the mouth. The lengths L_X and L_Y are related by, for example, L_X = L_Y.

FIG. 3 is a line drawing that shows, in a visually clear form, the analysis region determined by the analysis region determination unit 30 on the basis of the normalized face area 2b. As shown in the figure, the analysis region determination unit 30 determines a circular analysis region 3 that is contained in the normalized face area 2b and centered on the center of the L_X × L_Y normalized face area 2b. The horizontal diameter of the analysis region 3 is, for example, 0.8 times L_X, and its vertical diameter is, for example, 0.8 times L_Y. Making the diameter of the analysis region 3 smaller than the diameter of the circle inscribed in the normalized face area 2b excludes information that matters little for face recognition and facial expression recognition, such as hair and earrings. The analysis region determination unit 30 divides the analysis region 3 into an upper analysis region 3U and a lower analysis region 3D with a horizontal straight line that passes through the center of the analysis region 3. With this division, the upper analysis region 3U contains both eyebrows and both eyes, and the lower analysis region 3D contains the tip of the nose and the mouth.
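The geometry of FIG. 3 can be expressed as a point-in-ellipse test followed by a comparison with the horizontal center line. The Python sketch below assumes image coordinates with the origin at the top-left corner and y increasing downward; the function name `classify_pixel`, its parameters, and the decision to assign points exactly on the dividing line to the lower region are all assumptions made for illustration.

```python
from typing import Optional


def classify_pixel(x: float, y: float,
                   l_x: float = 128.0, l_y: float = 128.0,
                   scale: float = 0.8) -> Optional[str]:
    """Return "upper", "lower", or None for a point of the normalized face area.

    The analysis region is an ellipse centred on the face area whose
    diameters are scale * l_x and scale * l_y.
    """
    cx, cy = l_x / 2.0, l_y / 2.0                 # centre of the face area
    a, b = scale * l_x / 2.0, scale * l_y / 2.0   # semi-axes of the ellipse
    if ((x - cx) / a) ** 2 + ((y - cy) / b) ** 2 > 1.0:
        return None                               # outside the analysis region
    # Points on the dividing line go to the lower region (assumption).
    return "upper" if y < cy else "lower"
```

With the default 128 × 128 face area and a scale of 0.8, the semi-axes are 51.2 pixels, so the corners of the face area and a margin along each edge fall outside the analysis region.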

FIG. 4 schematically shows the histogram of feature values in the upper analysis region, the histogram of feature values in the lower analysis region, and the histogram of feature values for the whole analysis region formed by concatenating the two, all obtained by the face image feature value calculation unit 40. In the figure, the histogram for the lower analysis region is concatenated after the histogram for the upper analysis region. Dividing the region and classifying each part separately in this way associates position information (upper or lower analysis region) with the Bag-of-Keypoints.
This makes possible face recognition processing and facial expression recognition processing that are robust to the influence of light.
The face image feature value calculation unit 40 may instead obtain the histogram of feature values for the whole analysis region by concatenating the histogram for the upper analysis region after the histogram for the lower analysis region.
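The concatenation illustrated in FIG. 4, in either order, can be sketched as follows. This is a minimal Python illustration, not code from the specification; the function name `concatenate_histograms` and the `upper_first` flag are hypothetical.

```python
from typing import List, Sequence


def concatenate_histograms(upper: Sequence[int], lower: Sequence[int],
                           upper_first: bool = True) -> List[int]:
    """Join the two regional histograms into one feature vector.

    Because each region keeps its own block of bins, the position of a
    bin in the result tells which region the count came from.
    """
    if upper_first:
        return list(upper) + list(lower)
    return list(lower) + list(upper)
```

Whichever order is chosen, it presumably has to be the same for every face whose feature values are compared, since the bin positions carry the region information.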

FIG. 5 is a flowchart showing the processing procedure of the face image feature value generation device 1 according to the present embodiment. The flowchart shows the procedure for one item of frame image data.
In step S1, the image data acquisition unit 10 captures image data supplied from an external device. When the image data is still image data, the image data acquisition unit 10 supplies the captured image data to the face area extraction unit 20 as frame image data. When the image data is moving image data, the image data acquisition unit 10 detects a key frame in the captured moving image data and supplies the key frame data to the face area extraction unit 20 as frame image data.

Next, in step S2, the face area extraction unit 20 captures the frame image data supplied from the image data acquisition unit 10, performs face detection processing on it, and detects a person's face area in the frame image data.

Next, in step S3, the face area extraction unit 20 extracts the data of the detected face area from the frame image data, normalizes the face area data to image data of a predetermined size (for example, 128 × 128 pixels), and supplies the normalized face area data to the analysis region determination unit 30.

Next, in step S4, the analysis region determination unit 30 captures the normalized face area data supplied from the face area extraction unit 20 and, on the basis of that data, determines an analysis region for calculating the face image feature value. Specifically, the analysis region determination unit 30 determines as the analysis region, for example, a circular region (an ellipse or a perfect circle) that is contained in the normalized face area and centered on the center of the normalized face area.
The analysis region determination unit 30 then bisects the analysis region with a horizontal straight line across the normalized face area and determines the upper part to be the upper analysis region and the lower part to be the lower analysis region. For example, the dividing line is the horizontal straight line that passes through the center of the normalized face area.

Next, in step S5, the face image feature value calculation unit 40 calculates a Bag-of-Keypoints, which is a feature value, for each of the upper analysis region and the lower analysis region of the normalized face area determined by the analysis region determination unit 30.
The face image feature value calculation unit 40 then concatenates the two calculated Bag-of-Keypoints to generate a Bag-of-Keypoints for the whole analysis region and outputs it as the face image feature value.
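The order of steps S2 to S5 for one frame can be summarized in a short Python sketch. The individual stages are passed in as callables because the specification leaves their implementations open; the function name `generate_face_feature`, the parameter names, and the choice to return `None` when no face is detected are all assumptions and are not stated in the specification.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple


def generate_face_feature(
    frame: Any,
    detect_face: Callable[[Any], Optional[Any]],
    normalize: Callable[[Any], Any],
    split_regions: Callable[[Any], Tuple[Any, Any]],
    describe: Callable[[Any], Sequence[int]],
) -> Optional[List[int]]:
    """Run steps S2 to S5 on one frame and return the concatenated feature."""
    face = detect_face(frame)                  # S2: detect the face area
    if face is None:
        return None                            # no face found (assumed handling)
    normalized = normalize(face)               # S3: normalize to a fixed size
    upper, lower = split_regions(normalized)   # S4: upper and lower regions
    # S5: one histogram per region, upper followed by lower
    return list(describe(upper)) + list(describe(lower))
```

Injecting the stages keeps the sketch independent of any particular detector, resizing method, or local feature.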

As described above, the face image feature value generation device 1 according to an embodiment of the present invention is configured to include: the image data acquisition unit 10, which captures image data; the face area extraction unit 20, which performs face detection processing on the image data and extracts a face area from the image data; the analysis region determination unit 30, which determines, from the face area, an upper analysis region and a lower analysis region that bisect a circular or elliptical analysis region in the up-down (vertical) direction; and the face image feature value calculation unit 40, which calculates a Bag-of-Keypoints, which is a feature value, for each of the upper analysis region and the lower analysis region and concatenates the calculated Bag-of-Keypoints to generate a face image feature value.

With this configuration, the face image feature value generation device 1 according to the present embodiment excludes from the face area information that matters little for face recognition and facial expression recognition, such as hair and earrings, and can therefore generate a highly accurate face image feature value. In addition, by dividing the analysis region into an upper analysis region and a lower analysis region, the face image feature value generation device 1 can, for example, place both eyebrows and both eyes in the upper analysis region and the tip of the nose and the mouth in the lower analysis region. Position information is thereby associated with the face image feature value that the face image feature value generation device 1 generates. The face image feature value generation device 1 thus makes possible face recognition processing and facial expression recognition processing that are robust to the influence of light.
Therefore, according to the present embodiment, the face image feature value generation device 1 can generate a face image feature value with high accuracy and with robustness to the influence of light.

Some of the functions of the face image feature value generation device 1 according to the above embodiment may be implemented by a computer. In that case, a face image feature value generation program for implementing those functions may be recorded on a computer-readable recording medium, and the functions may be implemented by having a computer system read and execute the program recorded on the medium. The computer system here includes an operating system (OS) and hardware such as peripheral devices. A computer-readable recording medium means a portable recording medium such as a flexible disk, a magneto-optical disk, an optical disk, or a memory card, or a storage device provided in the computer system, such as a magnetic hard disk or a solid state drive. The computer-readable recording medium may further include something that holds the program dynamically for a short time, such as a communication line used when the program is transmitted over a computer network such as the Internet, a telephone line, or a mobile phone network, and something that holds the program for a certain period, such as volatile memory inside the computer system that serves as the server device or the client in that case.
The face image feature value generation program may implement only some of the functions described above, or may implement those functions in combination with a program already recorded in the computer system.

An embodiment of the present invention has been described in detail above with reference to the drawings. The specific configuration is not limited to that embodiment, however, and also covers designs and the like that do not depart from the gist of the present invention.

DESCRIPTION OF SYMBOLS
1 Face image feature value generation device
10 Image data acquisition unit
20 Face area extraction unit
30 Analysis region determination unit
40 Face image feature value calculation unit

Claims

1. A face image feature value generation apparatus comprising:
an image data acquisition unit that captures image data;
a face area extraction unit that performs face detection processing on the image data captured by the image data acquisition unit and extracts a face area from the image data;
an analysis region determination unit that determines, from the face area extracted by the face area extraction unit, a first region and a second region that bisect a circular or elliptical analysis region; and
a face image feature value calculation unit that calculates a feature value for each of the first region and the second region determined by the analysis region determination unit and concatenates the calculated feature values to generate a face image feature value.

2. The face image feature value generation apparatus according to claim 1, wherein
the face area extraction unit extracts a rectangular face area including a face from the image data, and
the analysis region determination unit determines the first region and the second region by dividing in two, in the vertical direction, the circular or elliptical analysis region, which is smaller than a circle or ellipse inscribed in the face area.

3. The face image feature value generation apparatus according to claim 2, wherein the face area extraction unit normalizes the rectangular face area to a face area of a predetermined size.

4. A face image feature value generation program that causes a computer to function as:
an image data acquisition unit that captures image data;
a face area extraction unit that performs face detection processing on the image data captured by the image data acquisition unit and extracts a face area from the image data;
an analysis region determination unit that determines, from the face area extracted by the face area extraction unit, a first region and a second region that bisect a circular or elliptical analysis region; and
a face image feature value calculation unit that calculates a feature value for each of the first region and the second region determined by the analysis region determination unit and concatenates the calculated feature values to generate a face image feature value.