JP5791373B2 - Feature point position determination device, feature point position determination method and program

Info

Publication number: JP5791373B2
Application number: JP2011116222A
Authority: JP
Inventors: 加藤　政美
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2011-05-24
Filing date: 2011-05-24
Publication date: 2015-10-07
Anticipated expiration: 2031-05-24
Also published as: JP2012243285A

Description

The present invention relates to a feature point position determination device, a feature point position determination method, and a program for determining the positions of a plurality of feature points in an image.

In personal recognition using face image data (hereinafter, face recognition), determining the positions of facial organs or comparably characteristic parts (hereinafter, feature points) is an important task that often governs recognition performance. In addition, determining feature point positions with high accuracy requires a high processing load and can be the rate-limiting step of the entire recognition process.

For example, Patent Document 1 discloses a technique that speeds up processing when an individual is recognized from moving image data: the recognition result of the previous frame is used to reduce the number of feature points extracted in the current frame. A technique for determining the positions of a plurality of feature points is often composed of a feature point candidate detector, which extracts position candidates for each feature point, and a correction processor, which corrects the candidates based on the arrangement of feature points specific to the target object (for example, a face). For example, Non-Patent Document 1 discloses a method of determining a plurality of facial organ feature positions according to geometric constraints based on statistics.

Patent Document 1: JP 2009-75999 A

Non-Patent Document 1: Beumer, G.M.; Tao, Q.; Bazen, A.M.; Veldhuis, R.N.J., "A landmark paper in face recognition," Automatic Face and Gesture Recognition, 2006 (FGR 2006), 7th International Conference, pp. 73-78

However, with the technique disclosed in Patent Document 1, recognition performance degrades severely because the number of extracted feature points is reduced. The technique disclosed in Non-Patent Document 1 corrects feature point positions efficiently using a subspace, but it may fail to correct them effectively when the extraction accuracy of the feature point position candidates is low.

In view of the above problems, an object of the present invention is to achieve appropriate feature point position correction, and thereby determine the desired feature point positions, even when the accuracy of the detected feature point position candidates is low, for example because detection was simplified to speed up processing.

The feature point position determination device of the present invention determines the positions of a plurality of feature points from image data and comprises: candidate determination means for obtaining position candidates for the plurality of feature points; judgment means for judging the reliability of the position candidates obtained by the candidate determination means; condition determination means for determining, based on the reliability, a correction condition whose constraint is stronger the lower the reliability; and correction means for correcting the position candidates of the plurality of feature points based on the correction condition.

According to the present invention, feature point positions can be determined appropriately even when the detection accuracy of the feature point position candidates is low.

FIG. 1 is a flowchart explaining the feature point position determination process and the face recognition process.
FIG. 2 is a block diagram showing a configuration example of the image recognition apparatus according to the embodiment.
FIG. 3 is a diagram explaining an example of the face image data cutout process.
FIG. 4 is a diagram showing the positions of the 15 feature points related to facial organs.
FIG. 5 is a diagram explaining an example of the processing of step S103 in FIG. 1.
FIG. 6 is a diagram explaining an example of template matching.
FIG. 7 is a diagram explaining an example of the correction process.
FIG. 8 is a diagram explaining the processing of step S108 in FIG. 1.
FIG. 9 is a flowchart showing an example of the operation procedure in the second embodiment.
FIG. 10 is a diagram explaining the flow of step S902 in FIG. 9.

(First embodiment)
The operation of the first embodiment of the present invention is described below with reference to FIGS. 1 and 2.
FIG. 2 is a block diagram showing a configuration example of the image recognition apparatus 200 according to the present embodiment. The image recognition apparatus 200 first extracts a region containing a face from image data. Hereinafter, the extracted region is referred to as face image data. The apparatus then determines a plurality of feature point positions (here, the positions of features related to facial organs) from the obtained face image data, and has a recognition function that identifies an individual based on those feature point positions.

In FIG. 2, reference numeral 201 denotes an image input unit. The image input unit 201 comprises an optical device, a photoelectric conversion device, a driver circuit that controls the sensor, an AD converter, a signal processing circuit that performs various image corrections, a frame buffer, and the like. Reference numeral 202 denotes a preprocessing unit, which executes various kinds of preprocessing so that the subsequent processing can be performed effectively. Specifically, it performs image data conversion, such as color conversion and contrast correction, in hardware on the image data acquired by the image input unit 201.

The face image data cutout processing unit 203 executes face detection on the image data corrected by the preprocessing unit 202. Any of the various face detection techniques proposed in the past can be applied. For each detected face, the face image data cutout processing unit 203 normalizes the face image data to a predetermined size and cuts it out.

FIG. 3 is a diagram explaining an example of the face image data cutout process. A face region 32 is determined from the image 31 corrected by the preprocessing unit 202, and upright face image data 33 normalized to a predetermined size is cut out. The size of the upright face image data 33 is therefore constant regardless of the face. The cut-out face image data is stored in a RAM (Random Access Memory) 209 via a DMAC (Direct Memory Access Controller) 205. Hereinafter, the position of a feature point is defined as its coordinates within the upright face image data 33, expressed in a coordinate system (x coordinate, y coordinate) whose origin is the upper-left corner of the upright face image data 33.

Reference numeral 207 denotes a CPU (Central Processing Unit), which executes the main processing of the present embodiment and controls the overall operation of the image recognition apparatus 200. Reference numeral 204 denotes a bridge, which provides a bus bridge function between the image bus 210 and the CPU bus 206. Reference numeral 208 denotes a ROM (Read Only Memory), which stores the instructions that define the operation of the CPU 207. The RAM 209 is the working memory required for the operation of the CPU 207. The CPU 207 executes recognition processing on the face image data stored in the RAM 209.

FIG. 1 is a flowchart explaining the feature point position determination process and the face recognition process of the present embodiment. The flowchart shows the operation of the CPU 207. In this embodiment, a feature point position candidate determination method is selected according to the operation mode, and a feature point position correction process suited to the selected method is executed.

First, in step S101, the operation mode is determined. The operation mode is either a normal mode or a tracking mode. In the normal mode, face recognition gives priority to recognition accuracy over processing speed. In the tracking mode, high-speed recognition processing is executed at the cost of some loss of accuracy. Whether the normal mode or the tracking mode applies is determined by whether a recognition target person (a person registered as a recognition target) was recognized in the previous frame. For example, the operation mode is determined by referring to the recognition result of the previous frame stored in the RAM 209.

Next, in step S102, information for selecting the feature point position candidate determination method is acquired according to the determination result of step S101. Table 1 shows an example of the table information used to select the feature point position candidate determination method for each feature.

In step S102, the normal mode table is selected in the normal mode and the tracking mode table is selected in the tracking mode. The entries 1 and 2 in Table 1 correspond to the first feature point position candidate determination method and the second feature point position candidate determination method, respectively.

FIG. 4 is a diagram explaining an example of the feature point positions determined here. The positions of the 15 feature points 401 to 415, which are related to facial organs, are determined. In step S102, the selected table is referred to in order to select the feature point position candidate determination method for each feature point. For example, in the normal mode, the table shown in Table 1 is referred to and the first feature point position candidate determination method is selected for all feature points. In the tracking mode, on the other hand, the method selected differs from feature point to feature point. For example, processing proceeds to step S103 for the feature point 401 and to step S104 for the feature point 403.

In step S103, feature point position candidates are determined with high accuracy, using a technique that is accurate but time-consuming. The second feature point position candidate determination method, by contrast, determines position candidates with a technique that prioritizes processing time over accuracy. The contents of the table shown in Table 1 are decided in advance in consideration of the desired processing speed and performance.

FIG. 5 is a diagram explaining an example of the processing of step S103. In this embodiment, feature point position candidates are determined by a CNN (Convolutional Neural Network). For the sake of explanation, FIG. 5 shows a configuration that determines two feature point position candidates. The CNN consists of hierarchical feature extraction processing.

FIG. 5 shows an example of a two-layer CNN in which the first layer 506 has three features and the second layer 510 has two features. Reference numeral 501 denotes face image data, which corresponds to the upright face image data 33 described above. Reference numerals 503a to 503c denote the feature planes of the first layer 506. A feature plane is an image data plane that stores the results computed while scanning the data of the previous layer with a predetermined feature extraction filter (cumulative sum of convolution operations followed by nonlinear processing). Because a feature plane holds detection results for raster-scanned image data, the detection results are also represented as a plane. The feature planes 503a to 503c are each computed from the face image data 501 with a different feature extraction filter. Each is generated by a two-dimensional convolution filter operation, shown schematically as 504a to 504c, followed by a nonlinear transformation of the result. Reference numeral 502 denotes the reference image region required for the convolution operation. For example, a convolution filter operation with a kernel size (horizontal length by vertical height) of 11 × 11 is processed by the product-sum operation of Equation (1).

Here, input(x, y) is the reference pixel value at coordinates (x, y), and output(x, y) is the operation result at coordinates (x, y). weight(column, row) is the weight coefficient applied at coordinates (x + column, y + row), and columnSize = 11 and rowSize = 11 are the filter kernel sizes (numbers of filter taps).
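Equation (1) itself is not reproduced in this text, so the following is only a sketch of a product-sum that is consistent with the definitions above, namely output(x, y) as the sum over row and column of input(x + column, y + row) × weight(column, row). The zero-based index range, the function names `convolve_at` and `convolve`, and the choice to evaluate only positions where the kernel fits entirely inside the image are assumptions made for illustration.

```python
def convolve_at(image, weight, x, y):
    """Product-sum for one kernel placement whose first tap sits at (x, y).

    image and weight are lists of rows, so weight(column, row) in the text
    is weight[row][column] here.
    """
    row_size = len(weight)
    column_size = len(weight[0])
    total = 0.0
    for row in range(row_size):
        for column in range(column_size):
            total += image[y + row][x + column] * weight[row][column]
    return total


def convolve(image, weight):
    """Scan the kernel over every position where it fits inside the image."""
    out_height = len(image) - len(weight) + 1
    out_width = len(image[0]) - len(weight[0]) + 1
    return [
        [convolve_at(image, weight, x, y) for x in range(out_width)]
        for y in range(out_height)
    ]
```

With an 11 × 11 kernel, each output value costs 121 multiply-accumulate operations, which illustrates why this path is the expensive one.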

Reference numerals 504a to 504c denote convolution filter kernels with mutually different coefficients. The size of the convolution kernel also differs from feature plane to feature plane. In the CNN operation, the product-sum operation is repeated while the filter kernels are scanned pixel by pixel, and the final product-sum result is nonlinearly transformed to generate a feature plane. A sigmoid function or the like is applied as the nonlinear transformation. When 503a is computed, the number of connections to the previous layer is 1, so a single convolution filter kernel 504a is used. When the feature planes 507a and 507b are computed, the number of connections to the previous layer is 3, so the results of the three convolution filters corresponding to 508a to 508c and to 508d to 508e, respectively, are cumulatively added. That is, the feature plane 507a is obtained by cumulatively adding all the outputs of the convolution filters 509a to 509c and finally applying the nonlinear transformation. Reference numerals 505a to 505c denote the reference image regions required for the convolution operations of the second layer 510. The coefficients of the convolution filters are determined in advance by learning.
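The accumulation described above, one filter response per connected plane of the previous layer, summed and then passed through a nonlinearity, might be sketched as follows. The names `filter_response` and `feature_plane` are hypothetical, the sigmoid is just one of the nonlinear functions the text allows, and the sketch assumes that all kernels feeding one feature plane have the same size so their responses can be added element by element.

```python
import math


def sigmoid(value):
    """Logistic function, used here as the nonlinear transformation."""
    return 1.0 / (1.0 + math.exp(-value))


def filter_response(plane, kernel):
    """Product-sum of the kernel at every position where it fits in the plane."""
    k_height, k_width = len(kernel), len(kernel[0])
    out_height = len(plane) - k_height + 1
    out_width = len(plane[0]) - k_width + 1
    return [
        [
            sum(plane[y + r][x + c] * kernel[r][c]
                for r in range(k_height) for c in range(k_width))
            for x in range(out_width)
        ]
        for y in range(out_height)
    ]


def feature_plane(previous_planes, kernels):
    """Add one filter response per connected plane, then apply the sigmoid."""
    responses = [filter_response(p, k) for p, k in zip(previous_planes, kernels)]
    height, width = len(responses[0]), len(responses[0][0])
    return [
        [sigmoid(sum(resp[y][x] for resp in responses)) for x in range(width)]
        for y in range(height)
    ]
```

A first-layer plane such as 503a corresponds to calling `feature_plane` with a single input plane and a single kernel; a second-layer plane corresponds to three of each.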

In step S103, the centroid of the values of the feature planes 507a and 507b, which are the results of the CNN operation, is used as the feature point position candidate coordinates. In practice, this embodiment configures a network that can determine 15 feature point position candidates by using a CNN with 15 features in the second layer. The second-layer feature planes that are needed are computed according to the result of step S102, and the feature point position candidates are determined with high accuracy. CNN computation is known as a powerful feature extraction technique but, as described above, it requires a high processing load.
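Taking the centroid of a feature plane as the candidate coordinate can be illustrated with the short sketch below. It assumes the plane is a list of rows of non-negative responses and that coordinates are measured in plane pixels from the upper-left corner; the function name `centroid` is hypothetical, and any mapping from feature-plane coordinates back to face image coordinates is left out.

```python
def centroid(plane):
    """Return the response-weighted mean position (x, y) of a feature plane."""
    total = sum(sum(row) for row in plane)
    if total == 0:
        raise ValueError("feature plane has no response")
    x_center = sum(x * value for row in plane for x, value in enumerate(row)) / total
    y_center = sum(y * value for y, row in enumerate(plane) for value in row) / total
    return x_center, y_center
```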

In step S104, on the other hand, feature point position candidates are determined by a technique with a low processing load. In this embodiment, the feature point position is determined by simple template matching. FIG. 6 is a diagram explaining an example of template matching. Here, an example of simple detection of the nose 409 in FIG. 4 is described. First, the similarity between a template 64 prepared in advance for the nose and the pixel data within a predetermined search region 62 of the face image data 61 is calculated. The similarity is calculated by, for example, a normalized correlation operation between the template and the pixel values. The similarity is calculated while the template 64 is scanned (63) within the predetermined region, and the nose position candidate is determined from, for example, the coordinates at which the similarity peaks or the centroid of the similarity within the search region 62. Template matching can determine feature point position candidates with a smaller processing load than the CNN operation described for the first feature point position candidate determination method. However, it is less robust to variations such as individual differences and face orientation, and its overall accuracy is low.
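The scan described above might look like the following sketch. It uses zero-mean normalized correlation as the similarity and takes the peak position as the candidate; the text only says "normalized correlation or the like", so the zero-mean form is an assumption, as are the names `normalized_correlation` and `match_template` and the convention that `search_area` gives the inclusive range of top-left template positions.

```python
import math


def normalized_correlation(patch, template):
    """Zero-mean normalized correlation between two equally sized blocks."""
    p = [value for row in patch for value in row]
    t = [value for row in template for value in row]
    p_mean = sum(p) / len(p)
    t_mean = sum(t) / len(t)
    numerator = sum((a - p_mean) * (b - t_mean) for a, b in zip(p, t))
    denominator = math.sqrt(
        sum((a - p_mean) ** 2 for a in p) * sum((b - t_mean) ** 2 for b in t)
    )
    return numerator / denominator if denominator > 0 else 0.0


def match_template(image, template, search_area):
    """Scan the template over search_area = (x0, y0, x1, y1) and return the peak."""
    x0, y0, x1, y1 = search_area
    t_height, t_width = len(template), len(template[0])
    best_position, best_score = None, -2.0  # scores lie in [-1, 1]
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            patch = [row[x:x + t_width] for row in image[y:y + t_height]]
            score = normalized_correlation(patch, template)
            if score > best_score:
                best_position, best_score = (x, y), score
    return best_position, best_score
```

Restricting the scan to a small search region is what keeps the cost low compared with running the CNN over the whole face image.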

Next, in step S105, it is determined whether the determination of all feature point position candidates has finished. Through the above processing, feature point position candidates are determined by a method that differs from feature point to feature point according to the operation mode. The CPU 207 stores the determined feature point position candidates in the RAM 209.

Through steps S101 to S105 above, all feature points are determined with high accuracy by the CNN operation in the normal mode. In the tracking mode, the feature points 401, 402, 405, 406, 412, and 415 are determined with high accuracy by the CNN operation, and the remaining feature points are determined at high speed by template matching. In this way, only the subset of feature points that matter most to the subsequent processing is calculated with high accuracy.
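Table 1 is not reproduced in this text, but the behavior summarized above suggests a per-mode lookup along the following lines. This is a sketch rather than the actual table layout: the constant and function names are hypothetical, and mapping "recognized in the previous frame" to the tracking mode is an assumption about the direction of the test in step S101.

```python
HIGH_ACCURACY = 1  # first method: CNN operation (step S103)
FAST = 2           # second method: template matching (step S104)

FEATURE_POINTS = range(401, 416)  # the 15 feature points 401 to 415

# Feature points that still use the first method in the tracking mode.
TRACKING_HIGH_ACCURACY = {401, 402, 405, 406, 412, 415}

NORMAL_TABLE = {point: HIGH_ACCURACY for point in FEATURE_POINTS}
TRACKING_TABLE = {
    point: HIGH_ACCURACY if point in TRACKING_HIGH_ACCURACY else FAST
    for point in FEATURE_POINTS
}


def select_table(recognized_in_previous_frame):
    """Pick the method table for the current frame (steps S101 and S102)."""
    return TRACKING_TABLE if recognized_in_previous_frame else NORMAL_TABLE
```

In the tracking mode this leaves 6 feature points on the CNN path and 9 on the template matching path.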

Next, in step S106, the reliability of the feature point position candidates is judged. In this embodiment, the reliability is determined according to the operation mode: the normal mode is judged to have high reliability and the tracking mode low reliability.

In step S107, the projection dimension of the eigenvectors used in the geometric correction process described later is determined. The projection dimension is selected according to the reliability judged in step S106. When the reliability of the feature point candidate positions is judged to be high, a high number of dimensions is selected; when it is judged to be low, a low number of dimensions is selected. In this embodiment, changing the projection dimension changes the correction condition of step S108 described later.

In step S108, geometric correction processing is executed on the 15 feature point position candidates obtained in step S103 or step S104, and the final feature point positions are determined. FIG. 7 is a diagram explaining an example of the correction process. Reference numeral 402a denotes a feature point that is meant to mark the outer corner of the eye but has been erroneously placed at the end of the eyebrow. In step S108, the position is corrected by statistical processing based on the arrangement of human facial features. In the example shown in FIG. 7, 402a is corrected to the position 402b based on the arrangement of the plurality of feature point candidate positions.

FIG. 8 is a diagram explaining the processing of step S108. In step S801, the feature point position candidate coordinates are simply concatenated to generate a single piece of vector data. In this embodiment, 30-dimensional feature point vector data V is generated from the 15 feature point position coordinates. The data sequence obtained by simply concatenating the position coordinate data $(x_i, y_i)$ of each feature point (i: feature point number, 1 to 15) is the feature point vector data V (elements $v_j$, j = 1 to 30). The feature point numbers 1 to 15 correspond to the feature points 401 to 415 in this embodiment. Thus, for example, the elements $v_1$ and $v_2$ of the feature point vector data correspond to the x coordinate and the y coordinate of the feature point 401, respectively. The feature point vector data V is defined by Equation (2), where T denotes transposition here and below.
$V = (v_1, v_2, v_3, \ldots, v_{2 \times f})^T$ ... Equation (2)

Here, f denotes the number of feature points.

In steps S802 and S803, a projection vector is calculated using the average vector A 807 and the projection matrix E 808, respectively. The projection vector P is calculated by Equations (3) to (5) from the projection matrix E and the vector data obtained by subtracting the average vector A from the feature point vector data V. The projection matrix E and the average vector A are calculated in advance by principal component analysis from feature point vector data for a large number of face images (learning feature vector data). The learning feature vector data is vector data generated by concatenating, in the same manner, all the correct feature point position coordinates of each face image.
$P = E^T (V - A)$ ... Equation (3)
$A = (a_1, a_2, a_3, \ldots, a_{2 \times f})^T$ ... Equation (4)
$E = (u_1, u_2, \ldots, u_p)$ ... Equation (5)

$u_1, u_2, \ldots, u_p$ are orthonormal vectors (eigenvectors) obtained by principal component analysis, each of dimension 2 × f. In this embodiment they are 30-dimensional vectors. p is the dimension of the projection vector, and a different value is used according to the reliability of the feature point candidate positions. That is, the projection dimension used here is the value set in the projection dimension determination step S107. Specifically, p is set to 8 in the normal mode, that is, when the accuracy of the feature point candidates is high, and to 6 in the tracking mode, that is, when the accuracy of the feature point candidates is low. In this case, the projection matrix E is formed by selecting, from the orthogonal vectors obtained by principal component analysis, the eight vectors with the largest corresponding eigenvalues, and in the tracking mode only the six of those with the largest eigenvalues are used. The projection matrix E and the average vector A are assumed to be stored in advance in the ROM 208, the RAM 209, or the like.

In steps S802 and S803, the operations of Equations (3) to (5) reduce the 2 × f-dimensional feature point vector to a p-dimensional projection vector. That is, the vector is projected onto a p-dimensional subspace. In steps S804 and S805, data in the original feature point vector dimension (that is, coordinate positions) is restored from the projection vector P. The restored vector V' is calculated by Equation (6) using the projection matrix E and the average vector A.
$V' = E P + A$ ... Equation (6)
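The projection and restoration of Equations (2), (3), and (6) can be followed end to end in the sketch below. It is written in plain Python for readability; the function name `correct_positions` is hypothetical, and `mean` and `basis` stand in for the average vector A and the eigenvectors $u_1, u_2, \ldots$ that the text obtains beforehand by principal component analysis (the sketch does not perform that analysis and simply assumes an orthonormal basis sorted by decreasing eigenvalue).

```python
def correct_positions(points, mean, basis, p):
    """Project candidate positions onto the first p basis vectors and restore them.

    points: list of (x, y) candidates, one per feature point
    mean:   average vector A as a flat list of length 2 * f
    basis:  orthonormal vectors u_1, u_2, ..., each a flat list of length 2 * f
    p:      projection dimension chosen in step S107
    """
    # Equation (2): concatenate the coordinates into one vector V.
    v = [coordinate for point in points for coordinate in point]
    centered = [v_j - a_j for v_j, a_j in zip(v, mean)]
    # Equation (3): P = E^T (V - A), using only the first p vectors.
    projection = [
        sum(u_j * c_j for u_j, c_j in zip(u, centered)) for u in basis[:p]
    ]
    # Equation (6): V' = E P + A.
    restored = list(mean)
    for coefficient, u in zip(projection, basis[:p]):
        restored = [r_j + coefficient * u_j for r_j, u_j in zip(restored, u)]
    # Split V' back into corrected (x, y) positions.
    return [(restored[j], restored[j + 1]) for j in range(0, len(restored), 2)]
```

Any component of the candidates that lies outside the span of the first p vectors is discarded by the round trip, so a smaller p pulls the result closer to the average arrangement, and p = 0 returns the average positions themselves.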

In step S806, the corrected coordinate data is extracted from the back-projected restored vector data V'. Through the processing of steps S801 to S806 above, the vector data formed by concatenating all the feature point position data is projected onto a dimension-reduced subspace and then back-projected, which corrects statistical outliers. That is, outliers (erroneous detections) that cannot be represented in the projected subspace are corrected. In this way, the geometric arrangement is corrected based on the arrangement of the feature points, and an erroneous detection such as that indicated by 402a in FIG. 7 is corrected.

When the number of projection dimensions is high, the space spanned by the eigenvectors has many degrees of freedom; when it is low, the space has few. That is, a high number of projection dimensions imposes a weak constraint on the geometric arrangement, and a low number imposes a strong one. This embodiment exploits that property: when the accuracy of the feature point position candidate determination method is high, a high number of projection dimensions is selected and the feature point positions are corrected under a weak constraint; when the accuracy is low, a low number of projection dimensions is selected and the positions are corrected under a strong constraint. For example, if the projection dimension is set to 0, the result of the feature point position correction process is the average vector A 807, which is equivalent to using none of the results of the feature point position candidate determination method (trusting none of them). In the tracking mode, many feature points are determined at high speed in step S104, so the reliability is judged to be low and a strong constraint is applied.

Returning to the description of FIG. 1, in step S109 recognition processing is executed based on the feature point positions obtained in step S108. Any of the various previously proposed methods may be applied to the recognition processing. For example, a plurality of local regions are cut out with reference to the determined feature point positions and dimensionally compressed by an orthogonal transformation or the like. The dimensionally compressed data are used as a feature vector (feature amount). The similarity to a registered person is calculated by a correlation operation with the feature vector of that registered person, which is calculated in the same manner. The feature vectors of registered persons are stored in the RAM 209 or the like prior to recognition.

A plurality of feature vectors are calculated according to the positions of the feature points, for example from a plurality of local regions including the eyes, nose, and mouth. The final similarity is calculated by integrating the correlation values of the calculated feature vectors. Whether or not the person is a registered person is determined by thresholding the final similarity. The number and types of local regions used are the same regardless of the operation mode.
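The similarity calculation described above can be sketched as follows. The description specifies only a correlation operation, an integration of the correlation values, and a threshold; the use of normalized correlation, integration by simple averaging, the threshold value, and all names below are assumptions made for illustration.

```python
import math


def correlation(a, b):
    """Normalized correlation of two feature vectors (assumed form)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_registered_person(probe_vectors, registered_vectors, threshold=0.8):
    """Integrate per-region correlations and threshold the final similarity.

    probe_vectors / registered_vectors: one feature vector per local region
    (for example eyes, nose, mouth), in the same order.
    Returns (is_registered_person, final_similarity).
    """
    scores = [correlation(p, r) for p, r in zip(probe_vectors, registered_vectors)]
    final_similarity = sum(scores) / len(scores)  # averaging is an assumption
    return final_similarity >= threshold, final_similarity
```

Because the number and types of local regions do not depend on the operation mode, the same function would serve both the normal mode and the tracking mode under these assumptions.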

As described above, in this embodiment the feature point candidate positions are detected with an accuracy that depends on the operation mode, and the projection dimension of the geometric constraint processing is selected according to that detection accuracy (the reliability of the feature point candidate positions). Then, in step S110, the recognition result is recorded in the ROM 208.

According to this embodiment, even when high-speed processing is performed at the cost of some accuracy, as in the tracking mode, the feature point positions can be determined appropriately with little performance degradation. That is, a correction suited to the feature point position candidate determination method can be realized by the very simple technique of changing the number of projection dimensions.

(Second Embodiment)
In the first embodiment, the case where the reliability is determined according to the operation mode was described. In this embodiment, the reliability is determined from the target face image data. FIG. 9 is a flowchart illustrating an example of the operation procedure of this embodiment.

In step S901, feature point position candidates are determined. Here, as in step S103 of the first embodiment, a plurality of feature point position candidates are obtained by CNN computation. In step S902, the feature point reliability is determined. Here, the reliability of the feature point position candidate determination processing is estimated from the contrast and resolution of the target face image. For example, when the contrast is low, when the contrast gradient is steep, or when the image has low resolution and heavy blur, the reliability of step S901 is judged to be low.

FIG. 10A illustrates the flow of step S902 based on a contrast histogram. In process 1001, a histogram of the pixel values of the upright face image data 33 is created. Then, in process 1002, the obtained histogram is treated as a vector, and the reliability of the feature point position candidates is determined by a classifier such as an SVM (Support Vector Machine). The SVM learns in advance the relationship between the shape of the histogram and the reliability of the feature point position candidates.
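A minimal sketch of processes 1001 and 1002 is given below. The histogram is built as described; the classifier, however, is only a stand-in: a linear decision function with hand-picked placeholder weights replaces the trained SVM, and the bin count, weights, and function names are hypothetical.

```python
def pixel_histogram(image, bins=8, max_value=256):
    """Process 1001: normalized histogram of pixel values (8-bit image assumed)."""
    counts = [0] * bins
    total = 0
    for row in image:
        for pixel in row:
            counts[pixel * bins // max_value] += 1
            total += 1
    return [c / total for c in counts]


def linear_decision(features, weights, bias):
    """Process 1002 stand-in: sign of w.x + b, as a trained linear SVM would output.

    In practice the weights and bias would come from training on histograms
    labeled with the reliability of the feature point position candidates.
    """
    return sum(w * x for w, x in zip(weights, features)) + bias > 0


# Placeholder parameters, not learned: they penalize mass in the mid-gray bins.
PLACEHOLDER_WEIGHTS = [1, 1, 1, -1, -1, 1, 1, 1]
PLACEHOLDER_BIAS = 0.0
```

With these placeholder parameters, an image whose pixel values are all concentrated around mid-gray is classified as low reliability, while an image that uses both ends of the value range is classified as high reliability.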

FIG. 10B illustrates the flow of step S902 based on resolution. In process 1003, edge extraction filtering is applied to the upright face image data 33, and in process 1004 the sum of the edge amounts is calculated. Then, in process 1005, the resolution is estimated by comparing the sum with a predetermined threshold. When the resolution is high, the reliability of the feature point position candidates is judged to be high.
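Processes 1003 to 1005 can be sketched as follows. The description does not specify the edge extraction filter, so this sketch assumes simple horizontal and vertical first differences; the function names and the threshold passed by the caller are hypothetical.

```python
def edge_sum(image):
    """Processes 1003 and 1004: sum of absolute first differences (assumed filter)."""
    height = len(image)
    width = len(image[0])
    total = 0
    for y in range(height):
        for x in range(width):
            if x + 1 < width:
                total += abs(image[y][x + 1] - image[y][x])  # horizontal edge
            if y + 1 < height:
                total += abs(image[y + 1][x] - image[y][x])  # vertical edge
    return total


def candidates_are_reliable(image, threshold):
    """Process 1005: a large edge sum is taken to indicate high resolution."""
    return edge_sum(image) > threshold
```

A blurred or low-resolution face image has weak edges and therefore a small sum, so under these assumptions it falls below the threshold and the candidates are treated as unreliable.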

FIG. 10C illustrates the flow of step S902 using the image data itself. First, in process 1006, the upright face image data 33 are subsampled to a predetermined size, and in process 1007 the subsampled data are treated as a vector and evaluated by an SVM. The SVM learns in advance the relationship between subsampled image data and the reliability of the feature point position candidates.

Returning to the description of FIG. 9, in step S903 the projection dimension is set according to the reliability determined in step S902. A high projection dimension is set when the reliability of the feature point position candidates is high, and a low projection dimension is set when it is low. Steps S904 to S906 are the same as steps S108 to S110 of FIG. 1 described in the first embodiment.

As described above, in this embodiment the reliability of the feature point position candidate determination processing is predicted by analyzing the target image, and the projection dimension is set based on that prediction. That is, when the reliability is predicted to be low, the constraint of the geometric correction processing is strengthened, and when the reliability is predicted to be high, the constraint is weakened. This makes it possible to determine appropriate feature point positions for a wide variety of target face images.

(Other Embodiments)
In the first embodiment, the case where the operation mode (normal mode or tracking mode) is selected according to the recognition result of the previous frame was described, but the invention is not limited to this case. For example, a high-accuracy mode and a high-speed mode may be switched according to the number of faces in the target image. The invention is also applicable to various other cases, such as a user setting the high-accuracy mode or the high-speed mode through a predetermined user interface. When the mode is set through a user interface, for example, step S101 determines the operation mode based on the information specified by the user.

In the second embodiment, the case where the reliability of the feature point position candidates is predicted by image analysis of the target face image was described, but other methods may be used. For example, the reliability may be determined based on the information (intermediate results) obtained in step S901. That is, when the feature point candidates are determined by the CNN shown in the embodiments, the reliability can be determined using the output values of the CNN. A high CNN output value indicates a high likelihood that the target feature point is present, so the reliability can be predicted, for example, by using the sum of the CNN output values of all the feature point position candidates as a criterion. Even when the feature point position candidate determination processing is realized by simple template matching, the reliability can be judged in the same way using the sum of the correlation values or the like.
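The use of CNN output values as a reliability criterion can be sketched as follows. The sketch assumes that each feature point has its own response map and that the output value of a candidate is the peak of that map; the function names and the threshold are hypothetical. The same code would apply to template-matching correlation maps.

```python
def peak_value(response_map):
    """Output value at the candidate position, taken here as the map's maximum."""
    return max(max(row) for row in response_map)


def reliability_from_outputs(response_maps, threshold):
    """Sum the peak output values of all feature points and compare with a threshold.

    Returns (is_reliable, total). The threshold is an assumed tuning parameter.
    """
    total = sum(peak_value(m) for m in response_maps)
    return total >= threshold, total
```

The boolean result could then drive the choice of projection dimension: a high dimension when the candidates are judged reliable, and a low dimension otherwise.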

In the embodiments described above, the case where the projection onto the subspace (step S803) and the back-projection (step S804) are processed independently was described, but this was for the purpose of explaining the principle, and the two can also be processed together. Because both operations are linear matrix operations, a transformation matrix that combines the projection and the back-projection can be generated in advance and used for the processing.
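The combined transformation can be sketched as follows, assuming orthonormal eigenvectors stored as rows and an average vector that is subtracted before and added after the transformation. For the first k eigenvectors e_1, ..., e_k, the combined matrix is the sum of the outer products of each e_i with itself. The function names and toy data are hypothetical.

```python
def combined_matrix(eigvecs, k):
    """Precompute M = sum of outer products e e^T over the first k eigenvectors."""
    n = len(eigvecs[0])
    return [[sum(e[i] * e[j] for e in eigvecs[:k]) for j in range(n)]
            for i in range(n)]


def correct_with_matrix(v, mean, matrix):
    """Apply projection and back-projection in one matrix-vector product."""
    centered = [x - m for x, m in zip(v, mean)]
    return [m + sum(m_ij * c for m_ij, c in zip(row, centered))
            for m, row in zip(mean, matrix)]


# Toy data (hypothetical): two feature points, so V = (x1, y1, x2, y2).
MEAN_VECTOR = [10, 20, 30, 20]
EIGENVECTORS = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
    [0.5, 0.5, -0.5, -0.5],
    [0.5, -0.5, -0.5, 0.5],
]
```

One matrix can be precomputed per candidate number of projection dimensions, so selecting the constraint strength at run time reduces to selecting which precomputed matrix to apply.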

Furthermore, in the embodiments described above, the case where a specific person is recognized from a face image within an image was described, but the present invention is not limited to this. It can be used in various image recognition devices that recognize or detect a predetermined object based on the arrangement of feature points. It can also be applied to methods for recognizing the state of a target object, such as its posture or facial expression.

In the embodiments described above, the feature point positions were positions in two-dimensional coordinates, but the invention is not limited to this; in processing based on three-dimensional data, it may be applied to three-dimensional coordinates. In this case, the three coordinate values are concatenated to create the feature vector data V. Furthermore, the embodiments were described as applied to an image recognition device, but the invention is not limited to this and can also be used in, for example, a device that corrects or deforms an image using the coordinates of the determined feature points.

The present invention can also be realized by executing the following processing: software (a program) that realizes the functions of the embodiments described above is supplied to a system or device via a network or various storage media, and a computer (or a CPU, MPU, or the like) of that system or device reads and executes the program.

207 CPU
208 ROM
209 RAM

Claims

1. A feature point position determination device that determines the positions of a plurality of feature points from image data, comprising:
candidate determination means for obtaining position candidates of the plurality of feature points;
determination means for determining the reliability of the position candidates of the feature points obtained by the candidate determination means;
condition determination means for determining, based on the reliability, a correction condition having a stronger constraint force as the reliability is lower; and
correction means for correcting the position candidates of the plurality of feature points based on the correction condition.

2. The feature point position determination device according to claim 1, wherein the correction condition of the correction means is a geometric constraint force on the arrangement relationship of the plurality of feature points.

3. The feature point position determination device according to claim 1, wherein the correction means comprises: means for concatenating the coordinate data of the plurality of feature point position candidates and inputting them as vector data; projection means for projecting the vector data onto a subspace of a predetermined number of dimensions; back-projection means for back-projecting the vector obtained by the projection means; and means for extracting coordinate data from the result of the back-projection means, and
wherein the correction condition of the correction means is the number of dimensions.

4. The feature point position determination device according to claim 1, further comprising means for determining an operation mode,
wherein the determination means determines the reliability according to the result of the means for determining the operation mode.

5. The feature point position determination device according to claim 1, wherein the determination means includes image analysis means for analyzing the image data, and
determines the reliability based on the result of the image analysis means.

6. The feature point position determination device according to claim 1, wherein the determination means determines the reliability based on information obtained by the candidate determination means.

7. A feature point position determination method for determining the positions of a plurality of feature points from image data, comprising:
a candidate determination step of obtaining position candidates of the plurality of feature points;
a determination step of determining the reliability of the position candidates of the feature points obtained in the candidate determination step;
a condition determination step of determining, based on the reliability, a correction condition having a stronger constraint force as the reliability is lower; and
a correction step of correcting the position candidates of the plurality of feature points based on the correction condition.

8. A program for causing a computer to execute each step of the feature point position determination method according to claim 7.