JP6147003B2

JP6147003B2 - Information processing apparatus, information processing method, and program

Info

Publication number: JP6147003B2
Application number: JP2012288721A
Authority: JP
Inventors: 加藤 政美
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2012-12-28
Filing date: 2012-12-28
Publication date: 2017-06-14
Anticipated expiration: 2032-12-28
Also published as: JP2014130526A

Description

The present invention relates in particular to an information processing apparatus, an information processing method, and a program suitable for improving object recognition performance.

When recognizing individuals from face image data, determining the positions on the image of facial organs or of comparable characteristic parts (hereinafter, feature points) is an important task that strongly affects recognition performance.

Methods for determining the positions of a plurality of feature points often consist of means for determining a position candidate for each feature point and means for correcting those position candidates based on the arrangement relationship of feature points specific to the target object, such as a face. For example, Non-Patent Documents 1 and 2 disclose methods for determining the positions of feature points of a plurality of facial organs according to a geometric constraint technique based on statistics. Here, "constraint" means restricting the positions of feature points based on the arrangement relationship specific to the target object.

The method disclosed in Non-Patent Document 1 geometrically corrects the arrangement of feature points using a relational model of feature point position coordinates called a PDM (Point Distribution Model). In a PDM, a vector formed by concatenating the position coordinates of a plurality of feature points is projected onto an eigenspace generated by principal component analysis, and the resulting projection vector is back-projected into the space of feature point position coordinates to obtain the corrected position coordinates. Limiting the amplitude of the projection vector at that point constrains the feature point positions according to their statistical arrangement relationship.

The method disclosed in Non-Patent Document 2 also corrects feature point position coordinates using principal component analysis. In that method, however, the constraint on feature point positions is obtained by limiting the number of dimensions of the projection vector. Both methods impose a geometric constraint on the arrangement of feature point positions by limiting the variation of the vector projected onto the eigenspace, either in its value range or in its number of dimensions.

Non-Patent Document 1: T. F. Cootes, C. J. Taylor, "Active Shape Models - 'Smart Snakes'", in Proc. British Machine Vision Conference, Springer-Verlag, 1992, pp. 266-275.
Non-Patent Document 2: G. M. Beumer, Q. Tao, A. M. Bazen, R. N. J. Veldhuis, "A landmark paper in face recognition", Automatic Face and Gesture Recognition, 2006 (FGR 2006), 7th International Conference, pp. 73-78.
Non-Patent Document 3: Y. LeCun, L. Bottou, G. Orr and K. Muller, "Efficient BackProp", in Orr, G. and Muller, K. (Eds), Neural Networks: Tricks of the Trade, Springer, 1998.

Both of the above methods are excellent schemes for efficiently correcting feature point positions (correcting outliers) through statistical processing (using principal component analysis), but neither can control the degree of correction (hereinafter, constraint strength) for each feature point. For example, when the accuracy with which position candidates are determined differs from one feature point to another, it is desirable to vary the degree of correction per feature point, yet the conventional methods cannot do so.

In view of the above problem, an object of the present invention is to make it possible to vary the degree of correction of feature point position candidates for each feature point.

An information processing apparatus according to the present invention comprises: position candidate determining means for obtaining position candidates of a plurality of feature points from an object region; generating means for generating a feature point vector by concatenating the coordinates of the plurality of position candidates obtained by the position candidate determining means; projecting means for projecting the feature point vector generated by the generating means onto a space of a predetermined dimension to obtain a projection vector; element number determining means for determining, for each of the position candidates of the plurality of feature points, the number of elements of the projection vector to be used for back projection; back projecting means for back-projecting each of the position candidates of the plurality of feature points using a projection vector composed of, among the elements of the projection vector obtained by the projecting means, the number of elements determined for that position candidate by the element number determining means; and position determining means for determining, using the result of the back projecting means, the positions of the plurality of feature points obtained by correcting the position candidates of the plurality of feature points.

According to the present invention, the degree of correction of feature point position candidates can be controlled for each feature point, so that feature point positions can be determined with higher accuracy.

FIG. 1 is a flowchart illustrating an example of a processing procedure for determining feature point positions in the first embodiment of the present invention.
FIG. 2 is a block diagram illustrating a configuration example of an information processing apparatus for determining feature point positions in the embodiments.
FIG. 3 is a diagram illustrating an example of the face image cutout process.
FIG. 4 is a diagram illustrating feature point position candidates related to facial organs.
FIG. 5 is a diagram outlining the processing of step S101 in FIG. 1.
FIG. 6 is a diagram illustrating an example of the geometric correction process.
FIG. 7 is a diagram schematically illustrating the projection and back projection of the first embodiment of the present invention.
FIG. 8 is a diagram showing an example of the table information of back projection dimension counts.
FIG. 9 is a flowchart illustrating an example of a processing procedure for determining feature point positions in the second embodiment of the present invention.
FIG. 10 is a diagram schematically illustrating the projection and back projection of the second embodiment of the present invention.
FIG. 11 is a diagram illustrating an example of a projection matrix for back projection.

(First Embodiment)
The operation of the first embodiment of the present invention is described below.
FIG. 2 is a block diagram illustrating a configuration example of an information processing apparatus 200 for determining feature point positions in this embodiment. The information processing apparatus 200 first extracts a region containing a face from image data. The extracted region is hereinafter referred to as face image data. The apparatus then determines a plurality of feature point positions (here, the positions of features related to facial organs) from the obtained face image data.

In FIG. 2, an image input unit 201 comprises an optical system device, a photoelectric conversion device, a driver circuit that controls the sensor, an AD converter, a signal processing circuit that performs various image corrections, a frame buffer, and the like. A preprocessing unit 202 executes various kinds of preprocessing so that the subsequent processes can be performed effectively. Specifically, it applies corrections such as color conversion and contrast correction, in hardware, to the image data acquired by the image input unit 201.

A face image data cutout processing unit 203 executes face detection on the image data corrected by the preprocessing unit 202. Any of the various face detection methods proposed to date can be applied. The face image data cutout processing unit 203 also normalizes the face image of each detected face to a predetermined size and cuts it out.

FIG. 3 illustrates an example of the face image cutout process. A face region 302 is detected in an image 301 corrected by the preprocessing unit 202, and an upright face image 303 normalized to a predetermined size is cut out. The size of the face image 303 is therefore constant regardless of the face. The data of the cut-out face image is stored in a RAM (Random Access Memory) 209 via a DMAC (Direct Memory Access Controller) 205. Hereinafter, the position of a feature point is defined as the coordinates of that feature point within the face image 303. The coordinates are expressed in a coordinate system (x coordinate, y coordinate) whose origin is the upper left corner of the face image 303.

A CPU (Central Processing Unit) 207 executes the main processing of this embodiment and controls the operation of the entire apparatus. A bridge 204 provides a bus bridge function between an image bus 210 and a CPU bus 206. A ROM (Read Only Memory) 208 stores a program that defines the operation of the CPU 207. The RAM 209 is the working memory required for the operation of the CPU 207. The CPU 207 processes the face image data stored in the RAM 209.

FIG. 1 is a flowchart illustrating an example of a processing procedure for determining feature point positions in this embodiment. The flowchart shows the operation of the CPU 207 and describes the process of determining feature point positions for the normalized face image 303.
First, in step S101, the CPU 207 functions as position candidate determining means and determines position candidates for predetermined feature points. In this process, for example, position candidates are determined for the 15 feature points 401 to 415 related to facial organs, as shown in FIG. 4.

FIG. 5 outlines the processing of step S101 in FIG. 1. For explanatory purposes, FIG. 5 shows the processing configuration for determining position candidates for two feature points. In this embodiment, the position candidates are determined by a CNN (Convolutional Neural Network). A CNN consists of hierarchical feature extraction processes, and FIG. 5 shows an example of a two-layer CNN in which the first layer 506 has three features and the second layer 510 has two.

In FIG. 5, a face image 501 corresponds to the face image 303 shown in FIG. 3. Feature planes 503a to 503c are the feature planes of the first layer 506. A feature plane is an image data plane that stores the results computed while scanning the data of the previous layer with a predetermined feature extraction filter (cumulative sum of convolution operations followed by nonlinear processing). Because a feature plane is the detection result for raster-scanned image data, the detection result is itself expressed as a plane.

The feature planes 503a to 503c are computed from the face image 501 by different feature extraction filters. Each of them is generated by a two-dimensional convolution filter operation, shown schematically as convolution filters 504a to 504c, followed by a nonlinear transformation (such as a sigmoid function) of the result. A reference image region 502 indicates the region required for the convolution operation. For example, a convolution filter operation with a filter size (horizontal length by vertical height) of 11 × 11 is processed by the product-sum operation shown in equation (1) below, in which the sums run over the rows and columns of the filter.
output(x, y) = Σ_row Σ_column input(x + column, y + row) × weight(column, row) ... (1)

Here, input(x, y) is the reference pixel value at coordinates (x, y), and output(x, y) is the operation result at coordinates (x, y). weight(column, row) is the weighting coefficient applied at coordinates (x + column, y + row), and columnSize = 11 and rowSize = 11 are the filter sizes (numbers of filter taps).
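The product-sum operation of equation (1) can be sketched as follows. This is only an illustration under stated assumptions: the function name `convolve_at`, the `input_image[y][x]` storage layout, and the zero-based offset range (column = 0 to columnSize - 1, row = 0 to rowSize - 1) are not taken from the source, and a centered offset range would work in the same way.

```python
def convolve_at(input_image, weight, x, y):
    """Product-sum of one filter placed at (x, y).

    input_image[y][x] is the reference pixel value; weight[row][column] is the
    coefficient applied to the pixel at (x + column, y + row).
    Assumption: offsets run from 0 to size - 1.
    """
    row_size = len(weight)
    column_size = len(weight[0])
    total = 0.0
    for row in range(row_size):
        for column in range(column_size):
            total += input_image[y + row][x + column] * weight[row][column]
    return total


# A 3x3 filter on a 4x4 image (the text uses 11x11; 3x3 keeps the example short).
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
box = [[1, 1, 1]] * 3  # all-ones filter: the output is the sum of the 3x3 window
top_left = convolve_at(image, box, 0, 0)
bottom_right = convolve_at(image, box, 1, 1)
```

With the all-ones filter the result is simply the sum of the pixels under the window, which makes the indexing easy to check by hand.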

The convolution filters 504a to 504c have mutually different coefficients, and the filter size also differs depending on the feature plane. In a CNN operation, product-sum operations are repeated while a plurality of filters are scanned pixel by pixel, and the final product-sum result is nonlinearly transformed to generate a feature plane. For example, the feature plane 503a has only one connection to the previous layer, so it is computed with the single convolution filter 504a. The feature planes 507a and 507b, on the other hand, each have three connections to the previous layer, so the results of three convolution filters are accumulated. That is, the feature plane 507a is obtained by accumulating the outputs of all of the convolution filters 508a to 508c and finally applying the nonlinear transformation.
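The accumulation described above, one convolution per connected previous-layer plane followed by a single nonlinear transformation, might look like the sketch below. The helper names `convolve_valid` and `feature_plane` are hypothetical, bias terms are omitted, and evaluating the filter only where it fits entirely inside the plane is an assumption; the sigmoid is the nonlinearity the text names as an example.

```python
import math


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def convolve_valid(plane, kernel):
    """2D product-sum at every position where the kernel fits inside the plane."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(plane) - k_rows + 1
    out_cols = len(plane[0]) - k_cols + 1
    return [[sum(plane[y + r][x + c] * kernel[r][c]
                 for r in range(k_rows) for c in range(k_cols))
             for x in range(out_cols)]
            for y in range(out_rows)]


def feature_plane(previous_planes, kernels):
    """Accumulate one convolution per connection, then apply the nonlinearity."""
    partial = [convolve_valid(p, k) for p, k in zip(previous_planes, kernels)]
    rows, cols = len(partial[0]), len(partial[0][0])
    return [[sigmoid(sum(part[y][x] for part in partial))
             for x in range(cols)]
            for y in range(rows)]


# Three 3x3 previous-layer planes and three 2x2 kernels give one 2x2 feature plane.
planes = [[[0, 0, 0], [0, 1, 0], [0, 0, 0]]] * 3
kernels = [[[1, 0], [0, 0]], [[0, 1], [0, 0]], [[0, 0], [0, -2]]]
result = feature_plane(planes, kernels)
```

Here the accumulated sums before the nonlinearity are [[-2, 0], [1, 1]], so `result[0][1]` is exactly 0.5 and the other entries are the sigmoid of -2 and 1.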

Reference image regions 505a to 505c indicate the regions required for the convolution operations of the second layer 510. The convolution filter coefficients are determined in advance by a conventionally known neural network learning method such as backpropagation. For CNN learning, the method disclosed in Non-Patent Document 3 can be used, for example.

The feature planes 507a and 507b obtained by the CNN operation correspond to image maps indicating the likelihood that the target feature point is present, that is, image maps with high values at positions where the feature point exists. In step S101, the centroid of each of the feature planes 507a and 507b is then calculated, and its coordinates are taken as the coordinates of the position candidate of the corresponding feature point.
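As a rough illustration of the centroid step, the sketch below computes a value-weighted center of gravity of a small map. The source only states that the centroid of the feature plane is used; the weighting by raw map values, the absence of any thresholding, and the function name `centroid` are assumptions.

```python
def centroid(feature_plane):
    """Return the value-weighted center of gravity (x, y) of a 2D map."""
    total = sum(sum(row) for row in feature_plane)
    if total == 0:
        raise ValueError("feature plane has no response")
    x_c = sum(x * v for row in feature_plane for x, v in enumerate(row)) / total
    y_c = sum(y * v for y, row in enumerate(feature_plane) for v in row) / total
    return x_c, y_c


# Hypothetical 3x4 map with two responses in column 2.
plane = [[0.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 3.0, 0.0]]
candidate = centroid(plane)
```

The stronger response in the bottom row pulls the y coordinate toward row 2, while x stays at column 2.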

In steps S102 to S106, geometric correction (correction of outliers attributable to the performance of step S101) is applied to the 15 feature point position candidates obtained in step S101.

FIG. 6 illustrates an example of the geometric correction. In the example of FIG. 6, a feature point 402a is meant to mark the outer corner of the eye but has been erroneously placed at the end of the eyebrow. In this embodiment, the position is corrected by statistical processing based on the arrangement relationship of human facial features. That is, based on knowledge obtained by learning (principal component analysis or the like), the position candidate indicated by the feature point 402a is corrected to the position indicated by a feature point 402b. Specific examples of the correction processing of this embodiment are described in order below.

First, in step S102, the CPU 207 functions as generating means and generates one feature point vector by simply concatenating the coordinates of the plurality of feature point position candidates obtained in step S101. In this embodiment, a 30-dimensional feature point vector V is generated from the position coordinates of 15 feature points. The data sequence obtained by simply concatenating the position coordinates (x_i, y_i) (i: feature point number, 1 to 15) of the feature points is the feature point vector V (elements v_j: j = 1 to 30). In this embodiment, feature point numbers 1 to 15 correspond to the feature points 401 to 415. Thus, for example, the elements v_1 and v_2 of the feature point vector correspond to the x and y coordinate values of the feature point 401, respectively. The feature point vector V is defined by equation (2) below.
V = (v_1, v_2, v_3, ..., v_{2×f})^T ... (2)

In the equation, f is the number of feature points and T denotes transposition.
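The concatenation that forms the feature point vector can be written in a few lines. The function name `build_feature_point_vector` and the sample coordinates are hypothetical; only the ordering (x then y for each feature point, in feature point order) comes from the text.

```python
def build_feature_point_vector(points):
    """Concatenate (x, y) pairs into one flat list: [x1, y1, x2, y2, ...]."""
    vector = []
    for x, y in points:
        vector.extend((x, y))
    return vector


# f = 3 hypothetical position candidates give a 2 * f = 6 dimensional vector.
candidates = [(30.0, 42.5), (55.0, 41.0), (43.0, 60.0)]
V = build_feature_point_vector(candidates)
```

With 15 feature points, the same function would return the 30-dimensional vector used in this embodiment.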

Next, in step S103, the CPU 207 functions as projecting means and calculates a projection vector P using an average vector A and a projection matrix E. The projection vector P is calculated by equation (3) below from the projection matrix E and the vector obtained by subtracting the average vector A from the feature point vector V. The projection matrix E and the average vector A are calculated in advance by principal component analysis using feature point vectors for a large number of face images (learning feature point vectors). Here, a learning feature point vector is a vector generated by concatenating, in the same manner, all the correct feature point position coordinates of a face image. The average vector A is defined by equation (4) below, and the projection matrix E by equation (5) below.

P = E^T (V - A) ... (3)
A = (a_1, a_2, a_3, ..., a_{2×f})^T ... (4)
E = (u_1, u_2, ..., u_p) ... (5)

In the equations, u_1, u_2, ..., u_p are orthonormal vectors (eigenvectors), each of 2 × f dimensions, obtained by principal component analysis; in this embodiment they are 30-dimensional vectors. p is the number of dimensions of the projection vector and is set to a value smaller than 2 × f, for example 8. In that case, the projection matrix E consists of the eight eigenvectors with the largest corresponding eigenvalues among the orthogonal vectors obtained by principal component analysis. The projection matrix E and the average vector A are assumed to be stored in advance in the ROM 208, the RAM 209, or the like.

As described above, in step S103 the 2 × f-dimensional feature point vector is reduced to a p-dimensional projection vector by the operation of equation (3); that is, it is projected onto a p-dimensional eigenspace. In the method described in Non-Patent Document 2, data with the dimensionality of the original feature point vector, that is, the coordinate positions, are then restored from the projection vector P. The restored vector V' is calculated by equation (6) below using the projection matrix E and the average vector A described above. Hereinafter, the processing of the first term of equation (6) (EP) is called back projection.
V' = EP + A ... (6)
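Equations (3) and (6) together form a project-then-restore round trip, sketched below in plain Python. The function names `project` and `back_project`, the storage of E as a list of eigenvectors, and all numbers are assumptions chosen so that the arithmetic can be followed by hand; they are not a trained model.

```python
def project(E, V, A):
    """Equation (3): P = E^T (V - A). E is a list of p eigenvectors."""
    centered = [v - a for v, a in zip(V, A)]
    return [sum(u_i * c_i for u_i, c_i in zip(u, centered)) for u in E]


def back_project(E, P, A):
    """Equation (6): V' = E P + A."""
    return [a + sum(p * u[j] for p, u in zip(P, E)) for j, a in enumerate(A)]


# Hypothetical 4-dimensional example with p = 2 orthonormal eigenvectors.
E = [[0.5, 0.5, 0.5, 0.5],
     [0.5, -0.5, 0.5, -0.5]]
A = [10.0, 20.0, 30.0, 40.0]
V = [12.0, 20.0, 30.0, 42.0]  # has a component the two eigenvectors cannot express
P = project(E, V, A)
V_restored = back_project(E, P, A)
```

The restored vector differs from the input because the part of V - A outside the two-dimensional eigenspace is discarded, which is the outlier-correcting effect the text describes.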

In this embodiment, by contrast, the CPU 207 functions as back projecting means in steps S104 to S106 and restores the feature point vector using a different number of projection vector elements for each feature point. FIG. 7 schematically illustrates the projection and back projection of this embodiment. For explanatory purposes, the case of processing a feature point vector corresponding to three feature points is described here.

In FIG. 7, a feature point vector 701 is the vector corresponding to the coordinate data of the feature points, that is, the feature point vector (v_1, v_2, v_3, v_4, v_5, v_6) generated from the coordinates (x_1, y_1), (x_2, y_2), (x_3, y_3) of the feature point position candidates determined in step S101. For explanatory purposes, the average vector A is assumed to have already been subtracted from this feature point vector; that is, V - A in equation (3) has already been computed. Reference numerals 702a to 702c denote the projection matrix E; here, three eigenvectors are used to calculate a three-dimensional projection vector. The projection matrices 702a, 702b, and 702c consist of the three eigenvectors (e_1i, e_2i, e_3i: i = 1 to 6) taken in descending order of eigenvalue from the principal component analysis. Projection vectors 703a to 703c are each calculated as the inner product of the feature point vector 701 and the eigenvector of the corresponding projection matrix 702a, 702b, or 702c. That is, the element p_n of the projection vector is calculated in step S103 by equation (7) below.
p_n = Σ_{i=1}^{6} (e_ni × v_i) ... (7)

Next, in step S104, the number of dimensions with which each feature point is back-projected (the number of projection vector elements used in back projection) is determined. This is done by referring to table information or the like stored in advance in the RAM 209 or the ROM 208. FIG. 8 shows an example of the table information, which specifies the back projection dimension count (the number of projection vector elements used in back projection) for each of feature points 1, 2, and 3. In the example of FIG. 7, the feature points corresponding to the position candidate coordinates (x_1, y_1) and (x_3, y_3) are back-projected with two dimensions, and the feature point at coordinates (x_2, y_2) with three dimensions.

In step S105, a corrected feature point vector 704 (v_1', v_2', v_3', v_4', v_5', v_6') is generated using the projection vectors 703a to 703c and the projection matrices 702a to 702c. For example, the corrected feature point vector elements v_1' and v_2' corresponding to the feature point (x_1, y_1) are calculated from the projection vector elements p_1 and p_2 by equation (8) below.
v_1' = (p_1 × e_11) + (p_2 × e_21)
v_2' = (p_1 × e_12) + (p_2 × e_22) ... (8)

The corrected feature point vector elements v_3' and v_4' corresponding to the coordinates (x_2, y_2), on the other hand, are calculated by equation (9) below.
v_3' = (p_1 × e_13) + (p_2 × e_23) + (p_3 × e_33)
v_4' = (p_1 × e_14) + (p_2 × e_24) + (p_3 × e_34) ... (9)

The corrected feature point vector elements (v_1', v_2') are obtained by selecting a dimension count of 2 in step S104 and back-projecting with two projection vector elements in step S105. The projection vector elements are selected in order, starting from the element projected by the eigenvector with the largest corresponding eigenvalue; in other words, elements projected by eigenvectors with small corresponding eigenvalues are not used. The corrected elements (v_3', v_4'), on the other hand, are obtained by selecting a dimension count of 3 in step S104 and back-projecting with all the projection vector elements in step S105.

In step S106, it is determined whether the above processing has been performed for all feature points. If any feature point remains unprocessed, the process returns to step S104; if all feature points have been processed, the process proceeds to step S107. In step S107, the CPU 207 functions as determining means, adds the average vector A to the back-projected restored vector V', extracts the corrected coordinate data of the position corresponding to each feature point, and stores it in the RAM 209 or the like.
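The loop of steps S104 to S107 can be sketched as below for the three-point example of FIG. 7. The function name `back_project_per_point`, the list-based table, and all numeric values are hypothetical; in particular, the average vector is added inside the loop here for brevity, whereas the flowchart adds it afterwards in step S107.

```python
def back_project_per_point(E, P, A, dims_per_point):
    """Restore each feature point from its first dims_per_point[k] elements of P.

    E: list of eigenvectors (largest eigenvalue first), each of length 2 * f.
    P: projection vector from equation (3).
    A: average vector of length 2 * f.
    dims_per_point: back projection dimension count for each of the f points.
    """
    corrected = []
    for k, n_dims in enumerate(dims_per_point):      # S104: look up the count
        for j in (2 * k, 2 * k + 1):                 # x and y elements of point k
            value = sum(P[n] * E[n][j] for n in range(n_dims))  # S105
            corrected.append(value + A[j])           # add the average (S107)
    return corrected


# Illustrative orthonormal eigenvectors for f = 3 points, not a trained model.
E = [[0.5, 0.5, 0.5, 0.5, 0.0, 0.0],
     [0.5, -0.5, 0.5, -0.5, 0.0, 0.0],
     [0.5, 0.5, -0.5, -0.5, 0.0, 0.0]]
P = [2.0, 4.0, 6.0]
A = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
table = [2, 3, 2]  # points 1 and 3 use two elements, point 2 uses three
corrected = back_project_per_point(E, P, A, table)
full = back_project_per_point(E, P, A, [3, 3, 3])
```

Comparing `corrected` with `full` shows that only the points given a smaller dimension count change: their third projection element is simply left out of the sum.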

As described above, through the processing of steps S102 to S106, the feature point vector formed by concatenating the coordinates of the position candidates of a plurality of feature points is projected onto a dimension-reduced eigenspace and then back-projected, which corrects statistical outliers. That is, outliers that arise from erroneous detection when the position candidate coordinates are determined, and that cannot be represented in the projected space, are statistically corrected.

When the number of projection dimensions is high, the eigenspace onto which the projection matrix projects has a high degree of freedom; when it is low, the degree of freedom is low. In other words, a high number of projection dimensions gives a weak constraint on the geometric arrangement relationship, and a low number gives a strong one. In this embodiment, the number of dimensions used for back projection can be changed for each feature point. For example, when the process that determines the position candidate of a feature point performs poorly for that feature point, the feature point is back-projected with fewer projection vector elements. Control such as strengthening the constraint can thus be achieved with simple processing.

The number of back projection dimensions required for each feature point is determined in advance using a plurality of evaluation data. For example, using data that carries the correct feature point positions, the number is determined from statistics such as the amount by which the corrected feature point vector deviates from the correct positions. For instance, feature points whose deviation has a large variance are assigned a smaller number of back projection dimensions.
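One way to turn such evaluation statistics into per-feature-point dimensions is sketched below. The threshold, the two candidate dimension counts, and the error values are assumptions made for illustration; the text only states that feature points with a large variance receive fewer dimensions.

```python
from statistics import pvariance

def choose_dims(errors_per_point, max_dims, reduced_dims, var_threshold):
    """Assign fewer back projection dimensions to feature points whose
    deviation from the correct position has a large variance.
    The threshold rule is a hypothetical choice, not from the patent."""
    return [
        reduced_dims if pvariance(errs) > var_threshold else max_dims
        for errs in errors_per_point
    ]

# Hypothetical deviations (in pixels) of each feature point over four evaluation images.
errors = [
    [0.5, 3.5, 0.2, 4.0],   # point 1: unstable detection
    [0.4, 0.6, 0.5, 0.5],   # point 2: stable detection
    [0.1, 2.9, 3.8, 0.3],   # point 3: unstable detection
]
dims = choose_dims(errors, max_dims=3, reduced_dims=2, var_threshold=1.0)
```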

As described above, in the present embodiment, the process of correcting feature point positions using an eigenspace controls, for each feature point, the number of projection vector elements used for back projection from the eigenspace. This eliminates the need for complicated processing including multiplication and division, allows a suitable correction to be made for each feature point by simple processing, and improves the performance of detecting feature point positions.

(Second Embodiment)
FIG. 9 is a flowchart illustrating an example of a processing procedure for determining the positions of feature points in the present embodiment. The configuration of the information processing apparatus according to the present embodiment is the same as that shown in FIG. 2, so its description is omitted. Each process shown in FIG. 9 is also performed by the operation of the CPU 207 in FIG. 2.
First, in step S901, feature point position candidates are determined. In this process, as in step S101 of FIG. 1 described in the first embodiment, the position candidates of a plurality of feature points are obtained by CNN computation followed by a centroid search on the result.

In the next step S902, as in step S102, a feature point vector is generated by concatenating the position candidate coordinates of the plurality of feature points. In step S903, a projection matrix for projecting the feature point vector onto the eigenspace is selected. The matrix selected here is the same as the projection matrix used in step S103; it is the projection matrix for projection stored in advance in the ROM 208 or the RAM 209. Next, in step S904, the feature point vector is projected onto the eigenspace using the selected projection matrix. This processing is also identical to that described for step S103.

Next, in step S905, a projection matrix for back projection from the eigenspace to the original space is selected. The projection matrix used here is described with reference to FIG. 10 for the case in which, as in the first embodiment, a feature point vector corresponding to three feature points is processed.

In FIG. 10, the feature point vector 1001, the projection matrices 1002a to 1002c, the projection vectors 1003a to 1003c, and the corrected feature point vector 1004 are each the same as in the example shown in FIG. 7. In the example shown in FIG. 10, however, part of the back projection matrices 1005a to 1005c differs from the example shown in FIG. 7.

In the first embodiment, the projection matrix for projection and the projection matrix for back projection share common data, and the constraint is controlled for each feature point by selecting the number of projection vector elements used during back projection. In the present embodiment, by contrast, back projection is performed for all feature points using three projection vectors, in accordance with equation (6) described above. The constraint is nevertheless controlled for each feature point by using a back projection matrix that differs from the projection matrix used at projection time.

In the example shown in FIG. 10, the back projection matrix 1005c differs in part from the projection matrix 1002c. Specifically, the elements of 1005c that correspond to v₁, v₂, v₅, and v₆ (e₃₁, e₃₂, e₃₅, and e₃₆) are all 0. As a result, the corrected feature point vector elements v₁', v₂', v₅', and v₆' are equivalent to a back projection using the two projection vectors p₁ and p₂, while v₃' and v₄' are equivalent to a back projection using the three projection vectors p₁, p₂, and p₃. For example, the corrected feature point vector element v₁' is calculated by the following equation (10), which is equivalent to equation (8) described above.
v₁' = (p₁ × e₁₁) + (p₂ × e₂₁) + (p₃ × 0) ... (10)

Thus, by setting part of the projection matrix for back projection to 0 in advance, a result equivalent to that of the first embodiment can be obtained without changing the processing content for each feature point.
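The equivalence between a zeroed back projection matrix and a per-feature-point element count can be checked numerically. The sketch below uses a hypothetical 3 x 6 matrix and projection vector with integer values chosen only to keep the arithmetic exact; the matrix is not a real PCA basis, and the mean vector is omitted for brevity.

```python
# Hypothetical values: E is not a real PCA basis, only an arithmetic illustration.
E = [
    [1, 2, 3, 4, 5, 6],      # multiplies p1
    [6, 5, 4, 3, 2, 1],      # multiplies p2
    [1, -1, 1, -1, 1, -1],   # multiplies p3
]
p = [2, 3, 4]

def back_project_truncated(E, p, dims_per_point):
    """First-embodiment style: use only the first n elements of p per feature point."""
    return [
        sum(p[i] * E[i][j] for i in range(dims_per_point[j // 2]))
        for j in range(len(E[0]))
    ]

def back_project_full(B, p):
    """Second-embodiment style: always use every element of p."""
    return [sum(p[i] * B[i][j] for i in range(len(p))) for j in range(len(B[0]))]

# Back projection matrix: a copy of E whose third row is zeroed for v1, v2, v5, v6.
B = [row[:] for row in E]
for j in (0, 1, 4, 5):
    B[2][j] = 0

truncated = back_project_truncated(E, p, [2, 3, 2])
masked = back_project_full(B, p)
```

Both paths produce the same vector, but the masked version runs the same loop for every feature point, with no per-point branching on the element count.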

In the next step S906, the corrected feature point vector in real space is generated from the projection vector using the back projection matrix selected in step S905. In step S907, the positions of the corresponding feature points are extracted from the corrected feature point vector, and that information is stored in the RAM 209 or the like.

As described above, in the present embodiment, generating the projection matrix for back projection in advance removes the need, at back projection time, for the process of step S104 that determines the number of projection vector elements to use. The geometric constraint for each feature point can therefore be controlled with simpler processing. That is, compared with the conventional example described in Non-Patent Document 2 and the like, a geometric correction whose strength differs for each feature point can be performed with the simple change of selecting the matrix for back projection at back projection time.

(Other Embodiments)
In the first and second embodiments, back projection was described as using a number of projection vector elements (the number of back projection dimensions) determined in advance for each feature point. However, the number of projection vector elements used for each feature point may be changed at run time. For example, when the process of determining feature point position candidates in step S101 or S901 outputs a value corresponding to the detection reliability, the number of back projection dimensions is set based on that value.

Further, in each embodiment, when feature point candidates are determined by the CNN, the reliability may be determined using the output value of the CNN. For example, when the CNN output value is high, the candidate is judged likely to be the target feature point, and the number of back projection dimensions is increased. This processing can be realized easily by storing table information that relates reliability to the number of back projection dimensions in the ROM 208 or the RAM 209 in advance and determining the number of back projection dimensions in step S104.
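A table lookup of this kind might look like the following sketch. The reliability thresholds, the dimension counts, and the assumption that reliability is normalized to the range 0 to 1 are all hypothetical; the text specifies only that a table relates reliability to the number of back projection dimensions.

```python
# Hypothetical table: (minimum reliability, number of back projection dimensions),
# ordered from the highest threshold to the lowest.
RELIABILITY_TABLE = [(0.8, 3), (0.5, 2), (0.0, 1)]

def dims_from_reliability(reliability):
    """Return the dimension count of the first table row whose threshold is met."""
    for lower_bound, dims in RELIABILITY_TABLE:
        if reliability >= lower_bound:
            return dims
    # Below every threshold: fall back to the strongest constraint.
    return RELIABILITY_TABLE[-1][1]

# Hypothetical CNN-derived reliabilities for three feature points.
dims_per_point = [dims_from_reliability(r) for r in (0.9, 0.6, 0.1)]
```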

In the configuration of the second embodiment, a plurality of projection matrices for back projection are prepared, and one of them is selected according to the reliability output in step S901. In that case, table information that relates reliability to the projection matrix is stored in the ROM 208 or the RAM 209 in advance. Here too, setting predetermined elements to 0 in each projection matrix allows back projection to be performed with a different number of dimensions (a different number of projection vector elements) for each feature point.
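Selecting among prepared back projection matrices could be sketched as below. The selection rule (pick the matrix that weakens the least reliable feature point, if it falls below a threshold), the threshold itself, and the matrix values are assumptions for illustration; the text states only that a matrix is chosen according to the reliability.

```python
# Hypothetical base matrix (3 projection vectors x 6 coordinates).
E = [
    [1, 2, 3, 4, 5, 6],
    [6, 5, 4, 3, 2, 1],
    [1, -1, 1, -1, 1, -1],
]

def make_masked(E, point, row):
    """Copy E and zero the given row for the two coordinates of one feature point."""
    B = [r[:] for r in E]
    B[row][2 * point] = 0
    B[row][2 * point + 1] = 0
    return B

def select_matrix(E, matrices, reliabilities, threshold):
    """Return the prepared matrix for the least reliable feature point,
    or the unmodified matrix when every reliability meets the threshold."""
    low = [k for k, r in enumerate(reliabilities) if r < threshold]
    if not low:
        return E
    worst = min(low, key=lambda k: reliabilities[k])
    return matrices[worst]

# One prepared matrix per feature point, each dropping the third projection vector.
matrices = [make_masked(E, k, row=2) for k in range(3)]
chosen = select_matrix(E, matrices, [0.9, 0.3, 0.8], threshold=0.5)
```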

FIG. 11 is a diagram illustrating examples of projection matrices for back projection. FIGS. 11(a) to 11(c) show the projection matrices used to reduce the number of back projection dimensions for the feature point vector elements (v₁, v₂), (v₃, v₄), and (v₅, v₆), respectively. For example, when the reliability of detecting the position candidate corresponding to (v₁, v₂) is low, the projection matrix shown in FIG. 11(a) is selected at back projection time. Although the first and second embodiments describe determining feature point position candidates with a CNN, the invention is not limited to this and can be applied with any detection method.

The first and second embodiments describe detecting the positions of feature points corresponding to specific organ positions from a face image within an image, but the present invention is not limited to this. It can be applied to detecting feature points in a variety of object regions.

The first and second embodiments describe determining the positions of feature points on two-dimensional data, but the invention is not limited to this. For example, for processing based on three-dimensional data, it can also be applied to determining the positions of feature points in three-dimensional space. In this case, the feature point vector V is created by concatenating the three-dimensional coordinate values.
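The concatenation step generalizes directly from two to three coordinates per feature point, as the following sketch suggests. The function names and coordinate values are hypothetical.

```python
def build_feature_vector(points):
    """Concatenate the coordinates of every feature point into one flat vector V."""
    return [coord for point in points for coord in point]

def split_feature_vector(vector, coords_per_point):
    """Recover the per-point coordinates from a flat (for example, corrected) vector."""
    return [
        tuple(vector[i:i + coords_per_point])
        for i in range(0, len(vector), coords_per_point)
    ]

# Hypothetical 3D position candidates (x, y, z) for three feature points.
points_3d = [(12, 25, 7), (33, 21, 9), (58, 19, 8)]
V = build_feature_vector(points_3d)
restored = split_feature_vector(V, coords_per_point=3)
```

Only the number of coordinates per point changes; the projection and back projection operate on the flat vector in the same way as in the two-dimensional case.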

The first and second embodiments describe only the method of calculating the corrected feature point positions. In practice, the corrected feature points are not used directly as the final feature point positions; the final positions are determined in combination with other processing not described here.

Furthermore, the second embodiment describes generating the projection vectors used for back projection by rewriting some of their elements to 0. The value is not limited to 0, however; any value smaller than 1 yields an effect similar to that of the second embodiment.

The first and second embodiments describe constraining the geometric arrangement using a projection matrix calculated by principal component analysis. Principal component analysis is a simple method that best brings out the effects of the present invention, but a projection matrix generated by another method may also be used.

The present invention can also be realized by executing the following processing: software (a program) that realizes the functions of the above-described embodiments is supplied to a system or apparatus via a network or various storage media, and a computer (or a CPU, MPU, or the like) of that system or apparatus reads out and executes the program.

207 CPU
208 ROM
209 RAM

Claims

Position candidate determination means for obtaining position candidates of a plurality of feature points from the object region;
Generating means for connecting the coordinates of a plurality of position candidates obtained by the position candidate determining means to generate a feature point vector;
Projecting means for projecting the feature point vector generated by the generating means onto a space of a predetermined dimension to obtain a projection vector;
Element number determination means for determining the number of elements of a projection vector used for back projection for each position candidate of the plurality of feature points;
Reverse projection means for performing, for each of the position candidates of the plurality of feature points, reverse projection using a vector composed of, among the elements of the projection vector obtained by the projecting means, the number of elements determined for that position candidate by the element number determination means;
Position determining means for determining positions of a plurality of feature points obtained by correcting position candidates of the plurality of feature points using a result of the reverse projection means;
An information processing apparatus comprising:

The information processing apparatus according to claim 1, wherein the projecting means projects the feature point vector using a projection matrix, and the reverse projection means reverse-projects the projection vector using a projection matrix different from the projection matrix used by the projecting means.

The information processing apparatus according to claim 2, wherein the projection matrix used by the reverse projection means is a matrix in which, among the elements of the projection matrix used by the projecting means, the elements corresponding to the elements of the projection vector not used in the reverse projection are set to zero.

The information processing apparatus according to claim 1, further comprising determination means for determining a reliability for each position candidate when the position candidate determination means obtains the position candidates of the plurality of feature points,
wherein the element number determination means determines, based on the reliability determined by the determination means, the number of elements of the projection vector used for the reverse projection for each position candidate of the feature points.

The information processing apparatus according to claim 2 or 3, wherein the projection matrix used by the projecting means is a matrix generated by principal component analysis of a plurality of feature point vectors obtained from the coordinates of feature point positions for learning.

A position candidate determination step for obtaining position candidates of a plurality of feature points from the object region;
Generating a feature point vector by connecting the coordinates of a plurality of position candidates obtained in the position candidate determination step;
A projection step of projecting the feature point vector generated in the generation step to a space of a predetermined dimension to obtain a projection vector;
An element number determination step for determining the number of elements of a projection vector used for back projection for each of the plurality of feature point position candidates;
A reverse projection step of performing, for each of the position candidates of the plurality of feature points, reverse projection using a vector composed of, among the elements of the projection vector obtained in the projection step, the number of elements determined for that position candidate in the element number determination step;
A determination step for determining positions of a plurality of feature points obtained by correcting the position candidates of the plurality of feature points using a result in the reverse projection step;
An information processing method characterized by comprising:

A position candidate determination step for obtaining position candidates of a plurality of feature points from the object region;
Generating a feature point vector by connecting the coordinates of a plurality of position candidates obtained in the position candidate determination step;
A projection step of projecting the feature point vector generated in the generation step to a space of a predetermined dimension to obtain a projection vector;
An element number determination step for determining the number of elements of a projection vector used for back projection for each of the plurality of feature point position candidates;
A reverse projection step of performing, for each of the position candidates of the plurality of feature points, reverse projection using a vector composed of, among the elements of the projection vector obtained in the projection step, the number of elements determined for that position candidate in the element number determination step;
A determination step for determining positions of a plurality of feature points obtained by correcting the position candidates of the plurality of feature points using a result in the reverse projection step;
A program that causes a computer to execute.