JP5625196B2

JP5625196B2 - Feature point detection device, feature point detection method, feature point detection program, and recording medium

Info

Publication number: JP5625196B2
Application number: JP2012088571A
Authority: JP
Inventors: 哲平猪俣
Original assignee: Morpho Inc
Current assignee: Morpho Inc
Priority date: 2012-04-09
Filing date: 2012-04-09
Publication date: 2014-11-19
Anticipated expiration: 2032-04-09
Also published as: JP2013218530A

Description

The present invention relates to a feature point detection device, a feature point detection method, a feature point detection program, and a recording medium.

Devices that detect parts contained in an object are known (see, for example, Patent Document 1). Patent Document 1 discloses a device that detects facial parts.

The device described in Patent Document 1 first determines a plurality of candidate feature points provisionally and then determines a plurality of final feature points. The candidate feature points are feature points detected by a first detector group for which the positional relationship between two feature points fits a first positional relationship model. The final feature points are feature points detected by a second detector group for which the positional relationship between two feature points fits a second positional relationship model. The second detector group has higher detection accuracy and lower robustness than the first detector group, and the second positional relationship model has a lower tolerance than the first. In other words, the device described in Patent Document 1 adopts a two-stage configuration in which relatively coarse detection is followed by relatively fine detection. Haar-like feature amounts are used for the feature points, and AdaBoost is used to learn the feature points. The positional relationship model expresses the positional relationship of a set of feature points as an existence probability distribution prepared for each set of feature points (each set of face parts).

Patent Document 1: Japanese Unexamined Patent Application Publication No. 2008-3749 (JP 2008-3749 A)

Because the device described in Patent Document 1 uses Haar-like feature amounts, that is, feature amounts based on luminance differences, it may fail to extract appropriate feature points. For example, it is difficult to detect reliable feature points when the illumination changes, when noise is present, or in regions with little texture. In addition, with feature amounts based on luminance differences, the processing time increases depending on the pixel region. There is therefore a need in this technical field for a feature point detection device, a feature point detection method, a feature point detection program, and a recording medium storing the program that can detect feature points contained in an object appropriately and at high speed.

A feature point detection device according to one aspect of the present invention detects feature points of parts constituting an object depicted in a target image. The device includes a reference feature amount acquisition unit and a feature point detection unit. The reference feature amount acquisition unit acquires the feature amount of a part of a reference object. The feature point detection unit scans a scanning range in the target image using the feature amount of the part of the reference object, and outputs position information of the feature points of the part depicted in the target image. Here, the feature amount of the part of the reference object is a set of codes obtained, for a plurality of pixel pairs selected from the pixels of the part of the object depicted in a teacher image, by encoding the magnitude relationship between the pixel values of each pixel pair into three values; the position information of each pixel pair is associated with the code of that pixel pair. The feature point detection unit uses the position information of the pixel pairs of the reference object to encode, into three values, the magnitude relationship between the pixel values of the pixel pairs at a point of interest in the target image, and takes the result as the feature amount at the point of interest. Using the degree of coincidence between the feature amount at the point of interest and the feature amount of the part of the object, it selects candidate points from among the points of interest within the scanning range of the target image, and uses the position information of the candidate points to output the position information of the feature points of the part depicted in the target image.

In this feature point detection device, the magnitude relationship between the pixel values of each pixel pair is encoded, and the resulting set of codes is used as the feature amount of the target image. Even if the pixel values themselves change because of a change in the environment, the feature amount does not change as long as the magnitude relationship between the pixel values is unchanged. Parts can therefore be detected more appropriately than when they are detected using feature amounts based on pixel value differences. Furthermore, the device encodes the magnitude relationship between the pixel values of each pixel pair into three values. Because a pair can be classified not only as "large" or "small" but also as "substantially the same", the device avoids forcing a pair into either "large" or "small" when the image is noisy or has little texture. Parts can therefore be detected appropriately. In addition, the feature point detection unit can determine whether a point is a feature point of a part of the target image simply by comparing the encoded magnitude relationships. Parts can therefore be detected faster than with block matching based on luminance differences.

In one embodiment, the pixel values of a pixel pair consist of a first pixel value and a second pixel value, and the feature point detection unit may encode the magnitude relationship between the pixel values of each pixel pair using the difference obtained by subtracting the second pixel value from the first pixel value. The feature point detection unit may assign a first code when the difference is smaller than a first threshold, a second code when the difference is larger than a second threshold, and a third code when the difference is greater than or equal to the first threshold and less than or equal to the second threshold. In this way, a difference between the first threshold and the second threshold can be classified into, for example, the category "substantially the same". By setting the first and second thresholds appropriately, parts can be detected appropriately even in images that are noisy or have little texture.

In one embodiment, the reference feature amount acquisition unit may acquire a template that reflects the feature amount of the part of the reference object, and the feature point detection unit may compare the template with the feature amount of the target image and output the position information of the feature points of the part in the target image.

In one embodiment, the feature point detection unit may determine, for each pixel pair, whether the code of the pixel pair is the same as the corresponding code in the template. Using the number of pixel pairs whose codes were determined to be the same as the template codes and a weight defined for each pixel pair, it may calculate a degree of coincidence that becomes larger the more closely the feature amount of the target image matches the template, and output the position information of the feature points of the part in the target image based on the degree of coincidence. Using the degree of coincidence makes it possible, for example, to appropriately obtain the most similar feature point in the target image.
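
The weighted comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `match_degree`, the example codes and weights, and the normalization by the total weight are not taken from the patent text, which only requires that the value grow as more pixel pairs agree with the template.

```python
from typing import Sequence


def match_degree(codes: Sequence[int], template_codes: Sequence[int],
                 weights: Sequence[float]) -> float:
    """Weighted share of pixel pairs whose code equals the template code.

    Assumption: the score is normalized to [0, 1] by the total weight.
    """
    if not (len(codes) == len(template_codes) == len(weights)):
        raise ValueError("codes, template_codes and weights must have equal length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Add up the weight of every pixel pair whose code agrees with the template.
    matched = sum(w for c, t, w in zip(codes, template_codes, weights) if c == t)
    return matched / total


# Hypothetical template with four pixel pairs and per-pair weights.
template_codes = [0, 1, 2, 1]
weights = [0.5, 0.25, 0.125, 0.125]

full = match_degree([0, 1, 2, 1], template_codes, weights)     # every pair agrees
partial = match_degree([0, 1, 1, 0], template_codes, weights)  # first two pairs agree
```

Because only equality of small integer codes is tested, the comparison needs no arithmetic on pixel values, which is consistent with the speed argument made in the text.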

In one embodiment, the pixel pairs included in the template, the codes indicating the magnitude relationship of each pixel pair, and the weight defined for each pixel pair may be set by learning in advance using teacher images.

In one embodiment, the device may further include a position correction unit that corrects the position information of the feature points of the parts in the target image detected by the feature point detection unit, using a shape model that models the relationship among the position information of the feature points of the parts. Because the position information of the feature points is corrected using the shape model, the resulting feature points of the parts do not deviate from the shape model.

In one embodiment, the shape model may treat the set of position information of the feature points of the parts of the object as a shape vector, and may express the relationship among the position information of the feature points of the parts using a reference shape vector and the deviation from the reference shape vector. Adopting such a shape model makes it possible to correct the position information of the feature points so that it does not deviate from the reference shape vector.

In one embodiment, the reference shape vector may be an average shape vector learned in advance using teacher images. In one embodiment, the shape model may take as principal component vectors the eigenvectors obtained by principal component analysis of the variance-covariance matrix of the shape vectors learned using the teacher images, and may express the relationship among the position information of the feature points of the parts as the sum of the average shape vector and the principal component vectors multiplied by weighting coefficients. With this configuration, the position information of the feature points can be corrected so that the parts constituting the object approach their permissible range of movement. In one embodiment, when a feature point of a part of the target image whose position information has been corrected using the shape model lies at a position a predetermined value or more away from the reference shape vector, the position correction unit may exclude that feature point and perform the correction again. Because a corrected feature point that deviates from the position range permitted by the shape model can be excluded before the correction is repeated, more appropriate feature points can be obtained. In one embodiment, the object may be a face.
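
The shape model described here can be written compactly as follows. The symbols are assumptions introduced for illustration and do not appear in the patent text: $\mathbf{s}$ is a shape vector built from $N$ feature points, $\mathbf{s}^{(j)}$ are the $T$ shape vectors obtained from the teacher images, $\bar{\mathbf{s}}$ is their average, $\Sigma$ is their variance-covariance matrix, $\mathbf{p}_i$ are the $M$ principal component vectors that are retained, and $b_i$ are the weighting coefficients.

```latex
\begin{aligned}
\mathbf{s} &= (x_1, y_1, x_2, y_2, \dots, x_N, y_N)^{\mathsf{T}}, \\
\bar{\mathbf{s}} &= \frac{1}{T} \sum_{j=1}^{T} \mathbf{s}^{(j)}, \qquad
\Sigma = \frac{1}{T} \sum_{j=1}^{T}
  \left(\mathbf{s}^{(j)} - \bar{\mathbf{s}}\right)
  \left(\mathbf{s}^{(j)} - \bar{\mathbf{s}}\right)^{\mathsf{T}}, \\
\Sigma\,\mathbf{p}_i &= \lambda_i\,\mathbf{p}_i, \qquad
\mathbf{s} \approx \bar{\mathbf{s}} + \sum_{i=1}^{M} b_i\,\mathbf{p}_i .
\end{aligned}
```

Under this notation, the deviation from the reference shape vector is $\mathbf{s} - \bar{\mathbf{s}}$. The $1/T$ normalization of $\Sigma$ is one common convention; the text does not say which normalization is used.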

A feature point detection method according to another aspect of the present invention detects feature points of parts constituting an object depicted in a target image. The method includes a reference feature amount acquisition step and a feature point detection step. In the reference feature amount acquisition step, the feature amount of a part of a reference object is acquired. In the feature point detection step, a scanning range in the target image is scanned using the feature amount of the part of the reference object, and position information of the feature points of the part depicted in the target image is output. Here, the feature amount of the part of the reference object is a set of codes obtained, for a plurality of pixel pairs selected from the pixels of the part of the object depicted in a teacher image, by encoding the magnitude relationship between the pixel values of each pixel pair into three values; the position information of each pixel pair is associated with the code of that pixel pair. In the feature point detection step, the position information of the pixel pairs of the reference object is used to encode, into three values, the magnitude relationship between the pixel values of the pixel pairs at a point of interest in the target image, and the result is taken as the feature amount at the point of interest. Using the degree of coincidence between the feature amount at the point of interest and the feature amount of the part of the object, candidate points are selected from among the points of interest within the scanning range of the target image, and the position information of the candidate points is used to output the position information of the feature points of the part depicted in the target image.

A feature point detection program according to still another aspect of the present invention causes a computer to operate so as to detect feature points of parts constituting an object depicted in a target image. The program causes the computer to operate as a reference feature amount acquisition unit and a feature point detection unit. The reference feature amount acquisition unit acquires the feature amount of a part of a reference object. The feature point detection unit scans a scanning range in the target image using the feature amount of the part of the reference object, and outputs position information of the feature points of the part depicted in the target image. Here, the feature amount of the part of the reference object is a set of codes obtained, for a plurality of pixel pairs selected from the pixels of the part of the object depicted in a teacher image, by encoding the magnitude relationship between the pixel values of each pixel pair into three values; the position information of each pixel pair is associated with the code of that pixel pair. The feature point detection unit uses the position information of the pixel pairs of the reference object to encode, into three values, the magnitude relationship between the pixel values of the pixel pairs at a point of interest in the target image, and takes the result as the feature amount at the point of interest. Using the degree of coincidence between the feature amount at the point of interest and the feature amount of the part of the object, it selects candidate points from among the points of interest within the scanning range of the target image, and uses the position information of the candidate points to output the position information of the feature points of the part depicted in the target image.

A recording medium according to still another aspect of the present invention stores a feature point detection program that causes a computer to operate so as to detect feature points of parts constituting an object depicted in a target image. The program causes the computer to operate as a reference feature amount acquisition unit and a feature point detection unit. The reference feature amount acquisition unit acquires the feature amount of a part of a reference object. The feature point detection unit scans a scanning range in the target image using the feature amount of the part of the reference object, and outputs position information of the feature points of the part depicted in the target image. Here, the feature amount of the part of the reference object is a set of codes obtained, for a plurality of pixel pairs selected from the pixels of the part of the object depicted in a teacher image, by encoding the magnitude relationship between the pixel values of each pixel pair into three values; the position information of each pixel pair is associated with the code of that pixel pair. The feature point detection unit uses the position information of the pixel pairs of the reference object to encode, into three values, the magnitude relationship between the pixel values of the pixel pairs at a point of interest in the target image, and takes the result as the feature amount at the point of interest. Using the degree of coincidence between the feature amount at the point of interest and the feature amount of the part of the object, it selects candidate points from among the points of interest within the scanning range of the target image, and uses the position information of the candidate points to output the position information of the feature points of the part depicted in the target image.

The feature point detection method, feature point detection program, and recording medium described above provide the same effects as the feature point detection device described above.

As described above, according to the various aspects and embodiments of the present invention, feature points contained in an object can be detected appropriately and at high speed.

A functional block diagram of a mobile terminal equipped with the feature point detection device according to the embodiment.
A hardware configuration diagram of the mobile terminal in which the feature point detection device in FIG. 1 is mounted.
A diagram explaining the feature points that constitute face parts.
An example in which the magnitude relationship of pixel pairs is used as the feature amount.
A schematic diagram explaining the shape model.
A schematic diagram explaining the principal component vectors of the shape model.
A schematic diagram explaining the relationship between the shape model and the positions of the feature points in the target image.
A functional block diagram of the learning device.
An example of a teacher image.
A schematic diagram explaining multi-resolution processing. (A) shows multi-resolution images. (B) is a schematic diagram showing the relationship among the multi-resolution images.
A flowchart showing the operation of the feature point detection device in FIG. 1.
A diagram explaining the face region cut out from the target image.
A diagram explaining the scanning of the template in the face region.
A diagram explaining the degree of coincidence between the feature amount of the target image and the template.
A schematic diagram explaining the feature point exclusion operation using the shape model.
An example of a detection result.
A schematic diagram explaining the operation and effects.

Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings. In the description of the drawings, the same elements are denoted by the same reference numerals, and redundant description is omitted. The dimensional ratios in the drawings do not necessarily match those in the description.

The feature point detection device according to the present embodiment detects feature points in an image. The device is used, for example, to detect the feature points of parts constituting an object to be detected. The object may be anything that is depicted in an image. In the following, the detection of feature points of the face parts that make up a face is described as an example of the present embodiment. The device is mounted on, for example, a mobile phone, a digital camera, a PDA (Personal Digital Assistant), or an ordinary computer system. For ease of understanding, a feature point detection device mounted on a mobile terminal is described below as an example of the feature point detection device according to the present invention.

FIG. 1 is a functional block diagram of a mobile terminal 2 that includes the feature point detection device 1 according to the present embodiment. The mobile terminal 2 shown in FIG. 1 is, for example, a mobile terminal carried by a user, and has the hardware configuration shown in FIG. 2. FIG. 2 is a hardware configuration diagram of the mobile terminal 2. As shown in FIG. 2, the mobile terminal 2 is physically configured as an ordinary computer system that includes a CPU (Central Processing Unit) 100, main storage devices such as a ROM (Read Only Memory) 101 and a RAM (Random Access Memory) 102, an input device 103 such as a camera or a keyboard, an output device 104 such as a display, and an auxiliary storage device 105 such as a hard disk. The functions of the mobile terminal 2 and the feature point detection device 1 described later are realized by loading predetermined computer software onto hardware such as the CPU 100, the ROM 101, and the RAM 102, operating the input device 103 and the output device 104 under the control of the CPU 100, and reading and writing data in the main storage devices and the auxiliary storage device 105. Although the above describes the hardware configuration of the mobile terminal 2, the feature point detection device 1 itself may be configured as an ordinary computer system that includes the CPU 100, main storage devices such as the ROM 101 and the RAM 102, the input device 103, the output device 104, the auxiliary storage device 105, and the like. The mobile terminal 2 may also include a communication module or the like.

As shown in FIG. 1, the mobile terminal 2 includes the feature point detection device 1 and a display unit 32. The feature point detection device 1 includes an image input unit 10, a multi-resolution processing unit 11, a candidate point selection unit (reference feature amount acquisition unit, feature point detection unit) 12, and a face part detection unit (position correction unit) 13.

The image input unit 10 inputs the image data to be processed. Here, the image input unit 10 inputs a target image 30 in which a person is depicted. The image input unit 10 may, for example, input an image captured by a camera mounted on the mobile terminal 2, or may input an image received via communication. It may also input an image already recorded in the main storage device or the auxiliary storage device 105 of the mobile terminal 2. The image input unit 10 may also cut out a face image from the target image 30. For example, when the target image 30 shows the person's body as well as the face, the image input unit 10 detects the region in which the face is depicted using a known face detection program or the like, and extracts that region as the face image.

The multi-resolution processing unit 11 reduces the size of the target image 30 (hereinafter including an image obtained by processing the target image 30 into a face image) stepwise by a factor of 1/2 until it falls below a predetermined size. For example, when n reduced images are created, the n-th image is reduced to 1/2^n of the original size. Among the multi-resolution images, the target image 30 before processing is the finest image and the n-th image is the coarsest image. By changing the resolution of the target image 30 step by step in this way, a plurality of images with different resolutions are created. Using multi-resolution images makes it possible to speed up processing. The multi-resolution processing unit 11 may also smooth the multi-resolution images, for example with a Gaussian filter. This cuts the high-frequency components of the images and provides robustness against high-frequency noise and small positional deviations.
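
The stepwise reduction can be sketched as follows. This is a minimal sketch under stated assumptions: the names `halve` and `build_pyramid` are hypothetical, images are plain nested lists of grayscale values, and 2x2 block averaging stands in for whatever reduction filter the device uses, which the text does not specify. The optional Gaussian smoothing is not included.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel values


def halve(image: Image) -> Image:
    """Shrink an image to 1/2 size by averaging each 2x2 block (assumed filter)."""
    h, w = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * y][2 * x] + image[2 * y][2 * x + 1]
             + image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1]) // 4
            for x in range(w)
        ]
        for y in range(h)
    ]


def build_pyramid(image: Image, min_size: int) -> List[Image]:
    """Return [finest, ..., coarsest], halving until a level falls below min_size."""
    if min_size < 2:
        raise ValueError("min_size must be at least 2")
    levels = [image]
    # Keep halving while the shorter side has not yet dropped below min_size.
    while min(len(levels[-1]), len(levels[-1][0])) >= min_size:
        levels.append(halve(levels[-1]))
    return levels


# Synthetic 16x16 gradient image used only for illustration.
image = [[x + y for x in range(16)] for y in range(16)]
pyramid = build_pyramid(image, min_size=4)
sizes = [len(level) for level in pyramid]
```

With a 16x16 input and `min_size=4`, three reduced images are produced and the third one is 1/2^3 of the original side length, matching the relationship stated in the text.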

The candidate point selection unit 12 selects, from the image, candidates (candidate points) for the feature points that constitute the face parts. FIG. 3 is a schematic diagram explaining the feature points that constitute the face parts. As shown in FIG. 3, feature points Pn (n: integer) that constitute the face parts are defined in advance. In FIG. 3, 30 feature points P1 to P30 are defined as an example. Feature points may be defined on, for example, the eyebrows, eyes, nose, mouth, and face contour. The numerical values in the figure indicate the position vector numbers. That is, when 30 face part feature points are defined as shown in FIG. 3, the position vector of the face part feature points has 30 × 2 = 60 components because the points are two-dimensional.

候補点選択部１２は、特徴点における特徴量として、２つの画素からなる画素対の大小関係を用いる。図４は、特徴量を説明する概要図である。候補点選択部１２は、注目点Ｋを中心として±数ピクセルの領域から画素対を選択する。図４では２つの画素を直線で結ぶことでペアを表現している。ここではＦ１〜Ｆ８までの複数の画素対が選択されている。画素対は、第１画素及び第２画素からなる。第１画素と第２画素とは座標点を用いて一義的に定義すればよい。候補点選択部１２は、各画素対の大小関係を、注目点Ｋにおける特徴量とする。例えば、候補点選択部１２は、画素対の画素値の大小関係を、第１画素値から第２画素値を減算した差分を用いて符号化する。候補点選択部１２は、画素対の画素値の大小関係を、３値に符号化する。例えば、候補点選択部１２は、差分が第１閾値より小さい場合には第１符号、差分が第２閾値より大きい場合には第２符号、差分が第１閾値以上であって第２閾値以下の場合には第３符号とする。一例として、第１符号は０、第２符号は１、第３符号は２であってもよい。すなわち３値に符号化することは、画素対の画素値の大小関係を、カテゴリ「小」、カテゴリ「大」及びカテゴリ「略同一」の３つのカテゴリに分類することを意味する。なお、第１閾値と第２閾値との差は、例えば２５６階調の輝度値を用いた場合には、２〜６とされる。候補点選択部１２は、各画素対の符号の集合を、注目点Ｋにおける特徴量とする。すなわち、ｋ個の画素対が存在する場合には、ｋ個の符号の集合が特徴量となる。 The candidate point selection unit 12 uses, as the feature amount at a feature point, the magnitude relationship of pixel pairs each composed of two pixels. FIG. 4 is a schematic diagram illustrating the feature amount. The candidate point selection unit 12 selects pixel pairs from a region of ± several pixels centered on the attention point K. In FIG. 4, a pair is represented by connecting two pixels with a straight line. Here, a plurality of pixel pairs F1 to F8 are selected. Each pixel pair consists of a first pixel and a second pixel, which may be uniquely defined by their coordinates. The candidate point selection unit 12 uses the magnitude relationship of each pixel pair as the feature amount at the attention point K. For example, the candidate point selection unit 12 encodes the magnitude relationship between the pixel values of a pixel pair using the difference obtained by subtracting the second pixel value from the first pixel value. The candidate point selection unit 12 encodes this magnitude relationship into three values. For example, it assigns the first code when the difference is smaller than a first threshold, the second code when the difference is larger than a second threshold, and the third code when the difference is equal to or larger than the first threshold and equal to or smaller than the second threshold. As an example, the first code may be 0, the second code 1, and the third code 2. In other words, ternary encoding classifies the magnitude relationship between the pixel values of a pixel pair into three categories: “small”, “large”, and “substantially the same”. The difference between the first threshold and the second threshold is, for example, 2 to 6 when luminance values with 256 gradations are used. The candidate point selection unit 12 uses the set of codes of the pixel pairs as the feature amount at the attention point K. That is, when there are k pixel pairs, the set of k codes is the feature amount.
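The ternary encoding described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the image representation (a list of rows of luminance values), the pair offsets, and the concrete threshold values of -2 and 2 are all assumptions chosen so that the threshold gap falls inside the 2 to 6 range mentioned in the text.

```python
# Codes follow the example in the text:
# 0 = "small", 1 = "large", 2 = "substantially the same".
CODE_SMALL, CODE_LARGE, CODE_SAME = 0, 1, 2


def encode_pair(first_value, second_value, low_threshold=-2, high_threshold=2):
    """Ternary-encode one pixel pair from the difference first - second.

    The threshold values are illustrative assumptions; the text only says
    that their gap is about 2 to 6 for 256-level luminance.
    """
    diff = first_value - second_value
    if diff < low_threshold:
        return CODE_SMALL
    if diff > high_threshold:
        return CODE_LARGE
    return CODE_SAME


def feature_at(image, point, pair_offsets, low_threshold=-2, high_threshold=2):
    """Return the list of k codes for the k pixel pairs around `point`.

    `image` is a list of rows of luminance values, `point` is (x, y), and
    each entry of `pair_offsets` is ((dx1, dy1), (dx2, dy2)) relative to it.
    """
    x, y = point
    codes = []
    for (dx1, dy1), (dx2, dy2) in pair_offsets:
        first = image[y + dy1][x + dx1]
        second = image[y + dy2][x + dx2]
        codes.append(encode_pair(first, second, low_threshold, high_threshold))
    return codes
```

Because only the sign of the difference relative to the thresholds is kept, adding a constant brightness offset to every pixel leaves the codes unchanged.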

候補点選択部１２は、顔パーツの特徴量をテンプレート３１として取得する。顔パーツの特徴量が反映されたテンプレート３１は、携帯端末２の主記憶装置又は補助記憶装置１０５に記録されていてもよい。あるいは、テンプレート３１は、通信を介して入力してもよい。テンプレート３１は、対象物の顔パーツの画素から選択された複数の画素対において、各画素対の画素値の大小関係を３値に符号化した符号の集合である。３値に符号化する手法は上述した手法と同一である。テンプレート３１には、顔パーツを構成する特徴点の特徴量として、各画素対の位置情報（座標情報）と各画素対の符号とが関連付けされて記録されている。テンプレート３１は、顔パーツの特徴点を検出するために、予め教師画像を用いて学習される。学習にはＡｄａｂｏｏｓｔが用いられる。学習処理については後述する。 The candidate point selection unit 12 acquires the feature amount of the face part as the template 31. The template 31 in which the feature amount of the face part is reflected may be recorded in the main storage device or the auxiliary storage device 105 of the mobile terminal 2. Alternatively, the template 31 may be input via communication. The template 31 is a set of codes obtained by encoding, in a plurality of pixel pairs selected from the pixels of the face part of the object, a ternary relationship between the pixel values of each pixel pair. The method of encoding into ternary values is the same as that described above. In the template 31, the position information (coordinate information) of each pixel pair and the code of each pixel pair are recorded in association with each other as the feature amount of the feature point constituting the face part. The template 31 is learned in advance using a teacher image in order to detect feature points of face parts. Adaboost is used for learning. The learning process will be described later.

候補点選択部１２は、テンプレート３１を用いて対象画像３０内を走査する。走査範囲としては例えば学習に基づいて算出された顔パーツの特徴点の平均座標を中心として±数ピクセルの範囲を走査する。そして、候補点選択部１２は、走査範囲内の注目点Ｋにおける特徴量をテンプレート３１に記録された画素対の座標情報を用いて算出する。そして、候補点選択部１２は、対象画像の特徴量とテンプレート３１に記録された顔パーツの特徴量とを比較して、対象画像３０における顔パーツの特徴点の位置情報を出力する。ここで、候補点選択部１２は、走査範囲内の注目点Ｋの中から一致度Ｒを用いて候補点を選択する。一致度Ｒは、対象画像３０の特徴量とテンプレート３１とが一致するほど大きな値となるように設定される。候補点選択部１２は、一致度Ｒを以下の手順で算出する。最初に、候補点選択部１２は、対象画像３０における画素対の符号がテンプレート３１の符号と同一であるか否かを画素対ごとに判断する。そして、候補点選択部１２は、画素対の符号がテンプレート３１の符号と同一であると判断された画素対の数と、画素対ごとに定義された重みとを用いて一致度Ｒを算出する。テンプレート３１との一致度Ｒは、ｋ個の画素対が存在するものとし、画素対の符号が一致した時は１、それ以外では０を出力する関数をδ（ｋ）、各画素対の重みをｗ_ｋとすると、以下の式１で表現される。 The candidate point selection unit 12 scans the target image 30 using the template 31. The scanning range is, for example, ± several pixels around the average coordinates of the face-part feature point calculated by learning. The candidate point selection unit 12 then calculates the feature amount at each attention point K within the scanning range using the coordinate information of the pixel pairs recorded in the template 31. The candidate point selection unit 12 then compares the feature amount of the target image with the feature amount of the face part recorded in the template 31, and outputs the position information of the feature point of the face part in the target image 30. Here, the candidate point selection unit 12 selects a candidate point from the attention points K within the scanning range using the matching degree R. The matching degree R is set so that it becomes larger the more closely the feature amount of the target image 30 matches the template 31. The candidate point selection unit 12 calculates the matching degree R by the following procedure. First, the candidate point selection unit 12 determines, for each pixel pair, whether the code of the pixel pair in the target image 30 is the same as the code in the template 31. Then, the candidate point selection unit 12 calculates the matching degree R using the number of pixel pairs whose code was determined to be the same as the code in the template 31 and the weight defined for each pixel pair. Assuming that there are k pixel pairs, and letting δ(k) be a function that outputs 1 when the code of the pixel pair matches and 0 otherwise, and w_k be the weight of each pixel pair, the matching degree R with the template 31 is expressed by Equation 1 below.

$$ R = \sum_{k} w_k \, \delta(k) \quad (1) $$

各画素対の重みｗ_ｋは、後述する学習処理で得られる。候補点選択部１２は、例えば最も高い一致度Ｒとなった注目点Ｋを候補点とし、該注目点Ｋの位置情報を候補点の位置情報として出力する。一致度Ｒを用いることで、例えば対象画像の中から最も類似する特徴点を適切に取得することができる。候補点選択部１２は、例えば定義された顔パーツの特徴点ごとに候補点を出力する。 The weight w_k of each pixel pair is obtained by a learning process described later. For example, the candidate point selection unit 12 takes the attention point K with the highest matching degree R as the candidate point, and outputs the position information of that attention point K as the position information of the candidate point. By using the matching degree R, for example, the most similar feature point can be appropriately obtained from the target image. The candidate point selection unit 12 outputs a candidate point, for example, for each defined feature point of the face parts.
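The matching degree and the scan for the best attention point can be sketched as follows. This is only an illustration under stated assumptions: `matching_score` and `select_candidate` are hypothetical names, R is taken to be the sum of the weights of the pixel pairs whose code equals the template code (as the text describes), and the tie-breaking rule is an arbitrary choice.

```python
def matching_score(codes, template_codes, weights):
    """R = sum of w_k over the pixel pairs whose code equals the template code."""
    return sum(w for c, t, w in zip(codes, template_codes, weights) if c == t)


def select_candidate(score_at, center, radius):
    """Scan a (2*radius+1)^2 window around `center` and return the best point.

    `score_at` is any callable mapping (x, y) to a matching degree R.
    Ties are resolved in favour of the first point visited (an assumption).
    """
    cx, cy = center
    best_point, best_score = None, float("-inf")
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            score = score_at((x, y))
            if score > best_score:
                best_point, best_score = (x, y), score
    return best_point, best_score
```

In use, `score_at` would compute the codes at each attention point from the pixel-pair coordinates stored in the template and pass them to `matching_score`; `center` would be the learned average position of the feature point.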

顔パーツ決定部１３は、候補点の位置情報を所定の形状モデルを用いて補正する。例えば、顔パーツ決定部１３は、顔パーツの特徴点の位置情報の関係性をモデル化した形状モデルを満たすように、候補点の位置情報を補正する。 The face part determination unit 13 corrects the position information of the candidate points using a predetermined shape model. For example, the face part determination unit 13 corrects the position information of the candidate points so as to satisfy the shape model obtained by modeling the relationship between the position information of the feature points of the face parts.

図５は、顔パーツの形状モデルの一例である。図５は、図３に示す口のパーツの形状を例にして示している。図５に示すように、口のパーツＰＡ１を定義する特徴点Ｐ３,Ｐ４,Ｐ１８, Ｐ１９の位置情報とそれらの位置関係を用いて口の形状をモデル化する。すなわち、口パーツＰＡ１の形状は、特徴点の集合によって表現される。形状は、例えばベクトルで表記される。このように、顔パーツの形状は、特徴点の位置情報の集合を形状ベクトルで表現することでモデル化される。なお顔パーツが複数存在する場合には、顔全体を形状ベクトルで表現する。すなわち、図３のように眉、目、鼻、口及び顔輪郭という複数の顔パーツが存在する場合には、Ｐ１〜Ｐ３０の位置情報とそれらの位置関係を用いて顔の形状をモデル化する。 FIG. 5 is an example of a shape model of a face part. FIG. 5 shows the shape of the mouth part shown in FIG. 3 as an example. As shown in FIG. 5, the shape of the mouth is modeled using the positional information of the feature points P3, P4, P18, and P19 that define the mouth part PA1 and their positional relationship. That is, the shape of the mouth part PA1 is expressed by a set of feature points. The shape is expressed by a vector, for example. Thus, the shape of the face part is modeled by expressing a set of feature point position information with a shape vector. When there are a plurality of face parts, the entire face is represented by a shape vector. That is, when there are a plurality of face parts such as eyebrows, eyes, nose, mouth, and face contour as shown in FIG. 3, the shape of the face is modeled using the positional information of P1 to P30 and their positional relationship. .

この形状モデルは、顔パーツの許容される動きを含む。形状モデルでは、例えば教師画像を用いて予め学習された顔パーツの平均の形状ベクトルが基準のベクトル（基準形状ベクトル）とされる。そして、基準形状ベクトルからの偏差（許容される動き）を用いて、顔パーツの特徴点の位置情報の関係性を示す。基準形状ベクトルからの偏差は、主成分ベクトルと、重み係数との積で表現される。主成分ベクトルは、教師画像から作成された顔パーツの形状ベクトルの分散共分散行列（variance-covariance matrix）を主成分分析することによって得られた固有ベクトルである。なお、主成分分析によって、固有ベクトルに対応する固有値も得られる。固有値が高い主成分ベクトルほど顔の特徴に影響を与える。図６は、主成分ベクトルの一例である。図６では、固有値の高い順から第１主成分、第２主成分、…と順に表記している。第１主成分は、顔形状の拡大縮小成分である。第２主成分は、顔形状の回転成分である。第３主成分は、口のみが動くものであり口の動き成分である。第４主成分は、眉のみが動くものであり眉の動き成分である。 This shape model includes the allowable movements of the face parts. In the shape model, for example, the average shape vector of the face parts learned in advance from teacher images is used as the reference vector (reference shape vector). The relationship among the position information of the face-part feature points is then expressed using the deviation from the reference shape vector (the allowable movement). The deviation from the reference shape vector is expressed as the product of the principal component vectors and weight coefficients. The principal component vectors are the eigenvectors obtained by principal component analysis of the variance-covariance matrix of the face-part shape vectors created from the teacher images. The principal component analysis also yields the eigenvalue corresponding to each eigenvector. The higher the eigenvalue of a principal component vector, the more strongly it affects the facial features. FIG. 6 shows an example of the principal component vectors. In FIG. 6, the components are listed in descending order of eigenvalue as the first principal component, the second principal component, and so on. The first principal component is the scaling component of the face shape. The second principal component is the rotation component of the face shape. The third principal component moves only the mouth and is the mouth movement component. The fourth principal component moves only the eyebrows and is the eyebrow movement component.
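How a reference shape vector and a leading principal component vector might be obtained from training shape vectors can be sketched as follows. The text does not say how the eigen-decomposition is computed; power iteration is used here purely as an assumption to keep the example dependency-free, and it recovers only the first principal component. All function names are hypothetical.

```python
import math


def mean_shape(shapes):
    """Component-wise mean of equally long shape vectors (the reference shape)."""
    n = len(shapes)
    return [sum(s[i] for s in shapes) / n for i in range(len(shapes[0]))]


def covariance(shapes, mean):
    """Variance-covariance matrix of the shape vectors (dividing by n)."""
    n, dim = len(shapes), len(mean)
    centered = [[s[i] - mean[i] for i in range(dim)] for s in shapes]
    return [[sum(c[i] * c[j] for c in centered) / n for j in range(dim)]
            for i in range(dim)]


def first_principal_component(cov, iterations=100):
    """Dominant eigenvector and eigenvalue of `cov` by power iteration."""
    dim = len(cov)
    v = [1.0 / math.sqrt(dim)] * dim
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # all-zero covariance: no variation to describe
        v = [x / norm for x in w]
        eigenvalue = norm
    return v, eigenvalue
```

Power iteration fails if the starting vector happens to be orthogonal to the dominant eigenvector, so a practical implementation would more likely call a library eigen-solver and keep the eigenvectors with the largest eigenvalues.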

上述した主成分ベクトルをΦとし、重み係数（パラメータベクトル）をｂとすると、形状モデルは以下の数式２で表現される。 Letting Φ be the above-described principal component vectors and b be the weight coefficients (parameter vector), the shape model is expressed by Equation 2 below.

$$ x = \bar{x} + \Phi b \quad (2) $$

ここで、左辺の変数が形状モデルであり、右辺の第１項が基準形状ベクトルである。このように、形状モデルは、基準形状ベクトル（平均の形状ベクトル）と重み係数付きの主成分ベクトルとの和を用いてパーツの特徴点の位置情報の関係性を示す。 Here, the variable on the left side is the shape model, and the first term on the right side is the reference shape vector. In this way, the shape model expresses the relationship among the position information of the feature points of the parts as the sum of the reference shape vector (the average shape vector) and the principal component vectors multiplied by the weight coefficients.

顔パーツ決定部１３は、上述した形状モデルを用いて、顔を構成する顔パーツが許容される動きの範囲に近くなるように特徴点の位置情報を補正する。具体的には、以下の数式３で示される条件を満たす位置に補正する。 The face part determination unit 13 uses the shape model described above to correct the position information of the feature points so that the face parts constituting the face come close to the allowable range of movement. Specifically, the positions are corrected to positions that satisfy the condition expressed by Equation 3 below.

$$ \min_{b} \left\| X - (\bar{x} + \Phi b) \right\|^{2} \quad (3) $$

ここでＸは候補点から求まる形状ベクトルであり、ｂはパラメータベクトルである。パラメータベクトルｂを変数とした最適化処理をすることで、パラメータベクトルｂが算出される。最適化処理には、例えば最急降下法が用いられる。図７は、上記数式３の概念図である。顔パーツ決定部１３は、図７に示す点線の部分が最小になるように収束演算する。顔パーツ決定部１３は、算出されたパラメータベクトルｂを用いて候補点から求まる形状ベクトルを補正する。顔パーツ決定部１３は、補正後の形状ベクトルすなわち候補点の位置情報を出力する。 Here, X is the shape vector obtained from the candidate points, and b is the parameter vector. The parameter vector b is calculated by an optimization process with b as the variable. For example, the steepest descent method is used for the optimization. FIG. 7 is a conceptual diagram of Equation 3 above. The face part determination unit 13 performs a convergence calculation so that the dotted-line portion shown in FIG. 7 is minimized. The face part determination unit 13 corrects the shape vector obtained from the candidate points using the calculated parameter vector b, and outputs the corrected shape vector, that is, the position information of the candidate points.
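The correction step can be sketched as a small steepest-descent fit. This assumes that the condition being satisfied is the minimisation of the squared distance between the candidate shape vector X and the shape model (reference shape plus weighted principal components); the exact objective, the step size, and the iteration count are assumptions, and the function names are hypothetical.

```python
def synthesize(mean, components, b):
    """Shape model: x = mean + sum_j b[j] * components[j]."""
    return [m + sum(bj * comp[i] for bj, comp in zip(b, components))
            for i, m in enumerate(mean)]


def fit_parameters(observed, mean, components, step=0.1, iterations=200):
    """Minimise ||observed - synthesize(b)||^2 over b by steepest descent."""
    b = [0.0] * len(components)
    for _ in range(iterations):
        model = synthesize(mean, components, b)
        residual = [o - m for o, m in zip(observed, model)]
        # Gradient of the squared error w.r.t. b[j] is -2 * <component_j, residual>.
        gradient = [-2.0 * sum(c * r for c, r in zip(comp, residual))
                    for comp in components]
        b = [bj - step * g for bj, g in zip(b, gradient)]
    return b
```

Calling `synthesize(mean, components, fit_parameters(X, mean, components))` returns the corrected shape vector: any part of X that the principal components cannot express is discarded, which is how candidate points are pulled back toward a plausible face shape.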

顔パーツ決定部１３は、補正後の候補点が基準形状ベクトルから所定値以上離れた位置に存在する場合には、該候補点を除外して補正を再度行う。この場合、顔パーツ決定部１３は、残りの候補点及び上記数式３を用いて、パラメータベクトルｂを再度算出する。表示部３２は、補正後の候補点を特徴点して表示する。表示部３２として、例えばディスプレイが用いられる。 When the candidate point after correction exists at a position away from the reference shape vector by a predetermined value or more, the face part determination unit 13 excludes the candidate point and performs correction again. In this case, the face part determination unit 13 calculates the parameter vector b again using the remaining candidate points and the above Equation 3. The display unit 32 displays the corrected candidate points as feature points. For example, a display is used as the display unit 32.

次に、テンプレート３１の学習処理について説明する。図８は、学習処理を実行する装置の機能ブロック図である。図８では携帯端末２が学習処理を行う装置として示しているが、学習処理を行う装置は特徴点検出装置１を備える携帯端末２である必要はなく、他の装置であってもよい。図８に示すように、携帯端末２は、画像入力部１０、多重解像度処理部１１及び学習部１５を備えている。 Next, the learning process of the template 31 will be described. FIG. 8 is a functional block diagram of an apparatus that executes learning processing. In FIG. 8, the mobile terminal 2 is illustrated as a device that performs the learning process, but the device that performs the learning process does not have to be the mobile terminal 2 including the feature point detection device 1 and may be another device. As illustrated in FIG. 8, the mobile terminal 2 includes an image input unit 10, a multi-resolution processing unit 11, and a learning unit 15.

画像入力部１０及び多重解像度処理部１１は、特徴点検出装置１の画像入力部１０及び多重解像度処理部１１とほぼ同様であり、教師画像３３を対象とする点のみが相違する。図９は、教師画像３３の一例である。図９に示すように、教師画像３３は、複数の画像ＴＰｎ（ｎ：整数）からなる。図９では、ＴＰｎは、輝度画像、水平エッジ画像及び垂直エッジ画像の三枚一組で構成されている。なお、教師画像３３は、輝度画像だけでもよいし、輝度画像、水平エッジ画像及び垂直エッジ画像を適宜組み合わせたものであってもよい。図１０は、多重解像度画像の一例である。図１０の（Ａ）に示すように、教師画像３３である顔画像、水平エッジ画像及び垂直エッジ画像のそれぞれに対して多重解像度処理を行う。図１０の（Ｂ）に示すように、大きさを変更することで、画像Ｐａ〜Ｐｄまで段階的に解像度が異なる教師画像を生成する。 The image input unit 10 and the multi-resolution processing unit 11 are substantially the same as the image input unit 10 and the multi-resolution processing unit 11 of the feature point detection apparatus 1 and are different only in that the teacher image 33 is targeted. FIG. 9 is an example of the teacher image 33. As shown in FIG. 9, the teacher image 33 includes a plurality of images TPn (n: integer). In FIG. 9, TPn is composed of a set of three images of a luminance image, a horizontal edge image, and a vertical edge image. Note that the teacher image 33 may be a luminance image alone, or may be a combination of a luminance image, a horizontal edge image, and a vertical edge image as appropriate. FIG. 10 is an example of a multi-resolution image. As shown in FIG. 10A, multi-resolution processing is performed on each of the face image, the horizontal edge image, and the vertical edge image that are the teacher images 33. As shown in FIG. 10B, by changing the size, teacher images having different resolutions in steps from the images Pa to Pd are generated.

学習部１５は、多重解像度処理で得られた各画像（輝度画像、エッジ画像）を用いて、画像ごとに、画素対の大小関係を示す符号及び画素対ごとに定義された重みを学習する。学習部１５は、Ａｄａｂｏｏｓｔを用いて学習処理を行い、識別器を作成する。この識別器は、顔パーツの特徴点であるか否かを識別する。識別器を構成する弱識別器は、ある画素対が３値の何れかに等しいかどうかで識別を行う。弱識別器は、画素対ごとに用意される。各弱識別器には、識別の性能に応じた重みｗ_ｋが設定される。学習部１５は、教師画像３３を最も識別可能な一つ目の画素対を選択する。そして、例えば識別率に応じて重みｗ_ｋを変更する。次に、一つ目の画素対で識別できない教師画像３３の優先度ｗ_ｐを大きくする。そして、一つ目の画素対で識別できない教師画像３３を識別できる二つ目の画素対を選択する。この処理を繰り返し、識別器の識別率が所定値以上となった場合には、以降では新たな画素対の追加を行なわず、識別器の作成を終了する。得られた画素対の位置情報及び大小関係を示す符号を、テンプレート３１とする。 The learning unit 15 uses each image (luminance image, edge image) obtained by the multi-resolution processing to learn a code indicating the magnitude relationship between pixel pairs and a weight defined for each pixel pair for each image. The learning unit 15 performs learning processing using Adaboost to create a discriminator. This discriminator discriminates whether or not the feature point is a facial part. The weak classifier that constitutes the classifier performs identification based on whether a certain pixel pair is equal to any of the three values. A weak classifier is prepared for each pixel pair. Each weak classifier is set with a weight w _k corresponding to the classification performance. The learning unit 15 selects the first pixel pair that can most distinguish the teacher image 33. Then, for example, the weight w _k is changed according to the identification rate. Next, the priority w _p of the teacher image 33 that cannot be identified by the first pixel pair is increased. Then, a second pixel pair that can identify the teacher image 33 that cannot be identified by the first pixel pair is selected. When this process is repeated and the discrimination rate of the discriminator becomes a predetermined value or more, the creation of the discriminator is terminated without adding a new pixel pair thereafter. The obtained code indicating the positional information and the size relationship of the pixel pair is defined as a template 31.
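The learning procedure can be sketched with a discrete AdaBoost loop in which each weak classifier tests whether one pixel pair has one particular ternary code. This is a simplified illustration under assumptions: the training patches are taken to be already encoded as code vectors, the loop runs for a fixed number of rounds instead of stopping at a target identification rate, the same pixel pair may be chosen more than once, and the function name and the standard AdaBoost weight formula are not taken from the text.

```python
import math


def train_adaboost(samples, labels, rounds, n_codes=3):
    """Pick (pair index, code, weight) triples with discrete AdaBoost.

    `samples[i][k]` is the ternary code of pixel pair k in training patch i;
    `labels[i]` is +1 for a feature-point patch and -1 otherwise.
    """
    n = len(samples)
    priority = [1.0 / n] * n          # per-sample weights (w_p in the text)
    chosen = []
    for _ in range(rounds):
        best = None
        for k in range(len(samples[0])):
            for code in range(n_codes):
                # Weak classifier: predict +1 when pair k has this code.
                error = sum(p for s, y, p in zip(samples, labels, priority)
                            if (1 if s[k] == code else -1) != y)
                if best is None or error < best[0]:
                    best = (error, k, code)
        error, k, code = best
        error = min(max(error, 1e-10), 1.0 - 1e-10)    # avoid log(0)
        alpha = 0.5 * math.log((1.0 - error) / error)  # weight w_k
        chosen.append((k, code, alpha))
        # Raise the priority of samples this weak classifier got wrong.
        priority = [p * math.exp(-alpha * y * (1 if s[k] == code else -1))
                    for s, y, p in zip(samples, labels, priority)]
        total = sum(priority)
        priority = [p / total for p in priority]
    return chosen
```

The returned triples correspond to the contents of a template: which pixel pair to look at, which code it is expected to have, and how much a match contributes to the matching degree.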

さらに、学習部１５は、顔パーツの特徴点の位置（平均的な特徴点の位置）を学習してもよい。この平均的な特徴点の位置は、上述したテンプレートを探索する際の初期値や形状モデルの基準形状ベクトルとして用いられる。 Further, the learning unit 15 may learn the position of the feature point of the face part (average feature point position). The average feature point position is used as an initial value or a reference shape vector of the shape model when searching for the template described above.

次に、特徴点検出装置１の動作を説明する。図１１は、特徴点検出装置１の動作を示すフローチャートである。なお、図１１に示すフローチャートの実行前に、対象画像３０から顔画像が抽出されているものとする。例えば、図１２に示すように、対象画像３０から顔画像３０ａが抽出されているものとする。 Next, the operation of the feature point detection apparatus 1 will be described. FIG. 11 is a flowchart showing the operation of the feature point detection apparatus 1. It is assumed that a face image is extracted from the target image 30 before the flowchart shown in FIG. 11 is executed. For example, as shown in FIG. 12, it is assumed that a face image 30 a is extracted from the target image 30.

図１１に示すように、特徴点検出装置１は、フィルタ処理から開始する（Ｓ１０）。Ｓ１０の処理では、多重解像度処理部１１が顔画像３０ａに対してフィルタ処理を行い、平滑化する。多重解像度処理部１１は、例えばガウシアンフィルタを用いて平滑化する。 As shown in FIG. 11, the feature point detection apparatus 1 starts from the filtering process (S10). In the process of S10, the multi-resolution processing unit 11 performs a filtering process on the face image 30a and smoothes it. The multi-resolution processing unit 11 performs smoothing using, for example, a Gaussian filter.

次に、多重解像度処理部１１は、Ｓ１０の処理で平滑化された顔画像３０ａに対して多重解像度処理を行う（Ｓ１２）。これにより、解像度の異なる画像が段階的に生成される。次に、候補点選択部１２が、Ｓ１２の処理で多重解像度処理が施された顔画像３０ａから候補点を選択する（Ｓ１４）。候補点選択部１２は、多重解像度画像のうち最も粗い画像を用いて候補点を選択する。まず、候補点選択部１２は、例えば携帯端末２に備わる記憶部を参照してテンプレート３１を取得する（基準特徴量取得ステップ）。そして、候補点選択部１２は、図１３に示すように、テンプレート３１を用いて顔画像３０ａの所定領域を走査する。所定領域は、学習処理で得られた平均の特徴点の位置座標を中心として±数ピクセルの範囲である。そして、候補点選択部１２は、所定領域の画素全てを注目点として一致度を算出する。図１４に示すように、候補点選択部１２は、各画素対の画素値を大小比較して符号化し、テンプレートの符号と一致する場合には、当該画素値に対応する重みで重み付けし、それらを加算して一致度Ｒを算出する。すなわち候補点選択部１２は、上述した数式１を用いて一致度Ｒを算出する。そして、候補点選択部１２は、一致度Ｒが最も高い注目点を特徴点の候補点として選択する（特徴点検出ステップ）。候補点選択部１２は、特徴点ごとに用意されたテンプレート３１の全てに対して上述した処理を行い、候補点を選択する。 Next, the multi-resolution processing unit 11 performs multi-resolution processing on the face image 30a smoothed in S10 (S12). As a result, images with different resolutions are generated in stages. Next, the candidate point selection unit 12 selects candidate points from the face image 30a subjected to the multi-resolution processing in S12 (S14). The candidate point selection unit 12 selects the candidate points using the coarsest of the multi-resolution images. First, the candidate point selection unit 12 acquires the template 31 by referring to, for example, a storage unit provided in the mobile terminal 2 (reference feature amount acquisition step). Then, as shown in FIG. 13, the candidate point selection unit 12 scans a predetermined region of the face image 30a using the template 31. The predetermined region is a range of ± several pixels centered on the position coordinates of the average feature point obtained by the learning process. The candidate point selection unit 12 then calculates the matching degree with every pixel in the predetermined region as an attention point. As shown in FIG. 14, the candidate point selection unit 12 compares and encodes the pixel values of each pixel pair and, when the code matches the code in the template, applies the weight associated with that pair and sums the results to calculate the matching degree R. That is, the candidate point selection unit 12 calculates the matching degree R using Equation 1 described above. Then, the candidate point selection unit 12 selects the attention point with the highest matching degree R as the candidate point for the feature point (feature point detection step). The candidate point selection unit 12 performs the above processing for all of the templates 31 prepared for the respective feature points, and selects the candidate points.

次に、顔パーツ決定部１３が、Ｓ１４の処理で選択された複数の候補点の位置情報を補正し、顔パーツを構成する特徴点を決定する。顔パーツ決定部１３は、式３を用いて候補点の位置ベクトルを補正し、最終的な位置合わせを行う。これにより人間の顔らしい形状から逸脱しない顔パーツの特徴点を選択することができる。図１５は、補正後の候補点の例である。そして、顔パーツ決定部１３は、補正後の候補点の位置情報を算出し、基準形状ベクトルによって定まる位置情報と比較する。顔パーツ決定部１３は、補正後の候補点の位置情報と基準形状ベクトルによって定まる位置情報との差分を算出し、所定値以上離れた位置に存在する場合には、該特徴点を除外して補正を再度行う。例えば、外れた候補点Ｙは、座標ベクトルが０であるものとして取り扱う。これにより、大きく外れた候補点の影響を除去することができる。 Next, the face part determination unit 13 corrects the position information of the plurality of candidate points selected in the process of S14, and determines the feature points constituting the face part. The face part determination unit 13 corrects the position vector of the candidate point using Expression 3, and performs final alignment. This makes it possible to select feature points of the facial parts that do not deviate from the human face-like shape. FIG. 15 is an example of candidate points after correction. Then, the face part determination unit 13 calculates the position information of the corrected candidate point and compares it with position information determined by the reference shape vector. The face part determination unit 13 calculates the difference between the position information of the candidate point after correction and the position information determined by the reference shape vector, and excludes the feature point if it exists at a position separated by a predetermined value or more. Perform correction again. For example, a candidate point Y that has been removed is treated as having a coordinate vector of zero. As a result, it is possible to remove the influence of candidate points that are greatly deviated.

次に、特徴点検出装置１は、Ｓ１６で処理した多重解像度画像が最精細画像であるか否かを判断する（Ｓ１８）。Ｓ１８の処理において、特徴点検出装置１は、最精細画像でないと判定した場合には、候補点選択処理へ再度移行する（Ｓ１４）。新たにＳ１４を実行する場合には、次に粗い画像を選択して処理を行う。この際、Ｓ１６の処理で補正された特徴点の位置情報を初期値として処理を行う。これにより、探索コストを削減することができる。また、多重解像度画像を用いることで、特徴点の位置が変化してしまう画像回転等に対しても頑健となる。このように、最精細画像を処理するまで、Ｓ１４〜Ｓ１８の処理が繰り返し実行される。 Next, the feature point detection apparatus 1 determines whether or not the multi-resolution image processed in S16 is the finest image (S18). In the process of S18, when the feature point detection apparatus 1 determines that the image is not the finest image, the feature point detection apparatus 1 proceeds to the candidate point selection process again (S14). When S14 is newly executed, the next coarse image is selected and processed. At this time, the process is performed using the position information of the feature points corrected in the process of S16 as an initial value. Thereby, search cost can be reduced. In addition, by using a multi-resolution image, it is robust against image rotation or the like in which the position of a feature point changes. As described above, the processes of S14 to S18 are repeatedly executed until the finest image is processed.
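The coarse-to-fine loop of S14 to S18 can be sketched as follows. The factor-of-two pyramid built by 2x2 averaging is an assumption (the text only says that resolution changes in stages), the Gaussian smoothing of S10 is omitted, and `refine` is a hypothetical callback standing in for candidate selection (S14) followed by shape correction (S16).

```python
def downsample(image):
    """Halve the resolution by averaging each 2x2 block (odd edges are dropped)."""
    height, width = len(image) // 2, len(image[0]) // 2
    return [[(image[2 * y][2 * x] + image[2 * y][2 * x + 1]
              + image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(width)] for y in range(height)]


def build_pyramid(image, levels):
    """Return images ordered from coarsest to finest."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid[::-1]


def coarse_to_fine(pyramid, initial_points, refine):
    """Run `refine(image, points)` on each level, coarsest first.

    Points are doubled between levels because each level is assumed to have
    twice the resolution of the previous one, so the corrected positions
    from one level become the initial values for the next.
    """
    points = initial_points
    for level, image in enumerate(pyramid):
        if level > 0:
            points = [(2 * x, 2 * y) for x, y in points]
        points = refine(image, points)
    return points
```

Because each level starts from the previous level's result, the scan window at the finer levels can stay at a few pixels, which is where the reduction in search cost comes from.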

一方、Ｓ１８の処理において、特徴点検出装置１が、最精細画像であると判定した場合には、図１１に示す制御処理を終了する。 On the other hand, if the feature point detection apparatus 1 determines in step S18 that the image is the finest image, the control process shown in FIG. 11 is terminated.

以上で図１１に示す制御処理を終了する。図１１に示す制御処理を実行することで、例えば、図１６の（Ａ）に示すように顔の特徴点を検出することができる。あるいは、図１６の（Ｂ）に示すように、顔パーツを検出することもできる。 This completes the control process shown in FIG. 11. By executing the control process shown in FIG. 11, facial feature points can be detected, for example, as shown in FIG. 16(A). Alternatively, face parts can also be detected, as shown in FIG. 16(B).

次に、携帯端末２（コンピュータ）を特徴点検出装置１として機能させるための特徴点検出プログラムを説明する。 Next, a feature point detection program for causing the portable terminal 2 (computer) to function as the feature point detection apparatus 1 will be described.

特徴点検出プログラムは、メインモジュール、入力モジュール及び演算処理モジュールを備えている。メインモジュールは、画像処理を統括的に制御する部分である。入力モジュールは、対象画像３０を取得するように携帯端末２を動作させる。演算処理モジュールは、候補点選択モジュール及び顔パーツ決定モジュールを備えている。メインモジュール、入力モジュール及び演算処理モジュールを実行させることにより実現される機能は、上述した特徴点検出装置１の画像入力部１０、候補点選択部１２及び顔パーツ決定部１３の機能とそれぞれ同様である。 The feature point detection program includes a main module, an input module, and an arithmetic processing module. The main module is the part that comprehensively controls image processing. The input module operates the mobile terminal 2 so as to acquire the target image 30. The arithmetic processing module includes a candidate point selection module and a face part determination module. The functions realized by executing the main module, the input module, and the arithmetic processing module are the same as the functions of the image input unit 10, the candidate point selection unit 12, and the face part determination unit 13 of the feature point detection apparatus 1 described above, respectively.

特徴点検出プログラムは、例えば、コンピュータ読み取り可能なＲＯＭ等の記憶媒体または半導体メモリによって提供される。また、特徴点検出プログラムは、データ信号としてネットワークを介して提供されてもよい。 The feature point detection program is provided by a storage medium such as a computer-readable ROM or a semiconductor memory, for example. The feature point detection program may be provided as a data signal via a network.

以上、本実施形態に係る特徴点検出装置１によれば、各画素対の画素値の大小関係を符号化し、得られた符号の集合を対象画像３０の特徴量として用いる。照明変化によって画素値そのものが変化した場合であっても、画素値の大小関係に変化がなければ特徴量としては変化がない。したがって、画素値の差分に基づく特徴量を用いて顔パーツを検出する場合に比べてより適切に顔パーツを検出することができる。 As described above, according to the feature point detection apparatus 1 according to the present embodiment, the magnitude relationship between the pixel values of each pixel pair is encoded, and the obtained set of codes is used as the feature amount of the target image 30. Even if the pixel value itself changes due to a change in illumination, the feature value does not change if there is no change in the magnitude relationship between the pixel values. Therefore, the face part can be detected more appropriately than the case where the face part is detected using the feature amount based on the difference between the pixel values.

また、本実施形態に係る特徴点検出装置１では、各画素対の画素値の大小関係を３値に符号化する。例えば、「大」・「小」というカテゴリのみならず、「略同一」というカテゴリに分類する。このため、ノイズがある場合やテクスチャの乏しい画像に対して強制的に「大」・「小」の何れかに符号化することを回避することができる。例えば、背景は顔パーツの特徴量として検討する価値がない。２値で符号化する場合には、背景によって「大」・「小」がランダムで出現しているのか顔領域のテクスチャが乏しいために「大」・「小」がランダムで出現しているのか区別がつかない。一方、「略同一」というカテゴリを導入することで、背景と顔との境界近傍において、背景領域では、「大」・「小」・「略同一」がランダムとなり、顔領域のテクスチャが乏しい場合には「略同一」となるため、背景については特徴点として選択されない（特徴点として学習されることを回避できる）。このため、顔輪郭を適切に検出することができる。また、髪は人によって大きく異なるため、顔の特徴量として検討する価値がない。このような大小を比較する意味のない領域を「略同一」とし、「特徴がない」という特徴を付与することができる。よって、適切に顔パーツを検出することが可能となる。 Further, in the feature point detection apparatus 1 according to the present embodiment, the magnitude relationship between the pixel values of each pixel pair is encoded into three values. For example, the relationship is classified not only into the categories “large” and “small” but also into the category “substantially the same”. This avoids forcing an image with noise or poor texture to be encoded as either “large” or “small”. For example, the background is not worth considering as a feature amount of a face part. With binary encoding, it is impossible to distinguish whether “large” and “small” appear randomly because of the background or because the texture of the face region is poor. In contrast, when the category “substantially the same” is introduced, near the boundary between the background and the face, “large”, “small”, and “substantially the same” appear randomly in the background region, whereas a face region with poor texture yields “substantially the same”. The background is therefore not selected as a feature point (its being learned as a feature point can be avoided), and the face contour can be detected appropriately. In addition, since hair varies greatly from person to person, it is not worth considering as a facial feature amount. Regions such as these, where comparing magnitudes is meaningless, can be encoded as “substantially the same” and thereby given the feature of “having no feature”. Therefore, face parts can be detected appropriately.

また、本実施形態に係る特徴点検出装置１では、符号化された大小関係をテンプレートと比較するだけで対象画像のパーツの特徴点であるか否かを判断することができる。このため、輝度差分を用いたブロックマッチングに比べてより高速にパーツを検出することが可能となる。さらに、画素対の数を変化させるだけで処理速度をチューニングすることができる。 Further, the feature point detection apparatus 1 according to the present embodiment can determine whether or not the feature point is a feature point of a part of the target image only by comparing the encoded magnitude relationship with the template. For this reason, parts can be detected at a higher speed than block matching using a luminance difference. Furthermore, the processing speed can be tuned simply by changing the number of pixel pairs.

また、本実施形態に係る特徴点検出装置１では、顔形状モデルを用いて、対象物を構成するパーツが許容される動きの範囲に近づくように候補点の位置情報を補正することができる。このため、形状モデルを逸脱しないパーツの特徴点とすることができる。そして、候補点は顔らしい形状に拘束されるため、候補点中に外れ値があった場合であっても顔として自然な特徴点を出力することができる。 Further, in the feature point detection apparatus 1 according to the present embodiment, the position information of the candidate points can be corrected using the face shape model so that the parts constituting the target object are close to the allowable range of motion. For this reason, it can be set as the feature point of the part which does not deviate from a shape model. And since a candidate point is restrained by the shape which seems to be a face, even when there is an outlier in a candidate point, a natural feature point as a face can be output.

さらに、本実施形態に係る特徴点検出装置１では、大小関係を符号化した特徴量と、顔形状モデルとを組み合わせることで、識別精度が低くなる部分を互いに補完しあい、良好に特徴点を検出することができる。図１７は、組み合わせたことによる作用効果を説明する概要図である。図１７では眉を例にして説明する。眉に候補点を設定する場合には、画素値の差が縦方向Ｌ１に大きく出現するため、大小関係を符号化した特徴量としては縦方向Ｌ１に第１画素と第２画素とが並んだ画素対が選択されやすい傾向となる。このような画素対を用いて検出した場合、眉は横方向Ｌ２に長いため、特徴量を取得する位置が横方向Ｌ２に多少ずれても同じ値となる。一方、特徴量を取得する位置が縦方向Ｌ１にずれた場合には大小関係がくずれてしまうため、縦方向には、ずれにくい。すなわち、大小関係を符号化した特徴量は、顔パーツにおいて、縦方向の位置ずれに対しては比較的精度が良く、横方向の位置ずれに対しては比較的精度が悪い。 Furthermore, in the feature point detection apparatus 1 according to the present embodiment, by combining the feature quantity encoded with the magnitude relationship and the face shape model, the parts with low identification accuracy are complemented with each other, and the feature points are detected well. can do. FIG. 17 is a schematic diagram for explaining the operational effect of the combination. In FIG. 17, the eyebrows will be described as an example. When candidate points are set on the eyebrows, the difference between the pixel values appears greatly in the vertical direction L1, and therefore, the first pixel and the second pixel are arranged in the vertical direction L1 as a feature quantity in which the magnitude relationship is encoded. The pixel pair tends to be selected easily. When detection is performed using such a pixel pair, the eyebrows are long in the horizontal direction L2, and therefore the same value is obtained even if the position where the feature amount is acquired is slightly shifted in the horizontal direction L2. On the other hand, when the position where the feature amount is acquired is shifted in the vertical direction L1, the magnitude relationship is lost, so that it is difficult to shift in the vertical direction. That is, the feature quantity encoded with the magnitude relation is relatively accurate with respect to a vertical position shift and relatively inaccurate with respect to a horizontal position shift in the face part.

ここで、顔形状モデルは、顔らしい動きに沿ったずれは許容し、顔らしい動きに沿っていないずれは許容しない。すなわち顔形状モデルは主成分ベクトルの方向に沿った動きであれば自然な顔の動きであると判断し、場合によっては補正する。このため、眉の候補点が横方向にずれた場合には、除外処理によって除外される。一方、眉の候補点が縦方向にずれた場合には、自然な動きとしてそのようになっているのか、あるいは単に誤検出してずれているのか判断することができない。すなわち顔形状モデルは、顔パーツにおいて、縦方向の位置ずれに対しては比較的精度が悪く、横方向の位置ずれに対しては比較的精度が良い。よって、大小関係を符号化した特徴量と、顔形状モデルとを組み合わせることで、互いの欠点を互いの長所で補うため、良好に特徴点を検出することができる。 Here, the face shape model tolerates deviations that follow face-like movement and does not tolerate deviations that do not. That is, the face shape model judges a movement along the direction of a principal component vector to be a natural facial movement, and corrects it in some cases. For this reason, when an eyebrow candidate point is shifted in the horizontal direction, it is excluded by the exclusion process. On the other hand, when an eyebrow candidate point is shifted in the vertical direction, the model cannot determine whether the shift is a natural movement or simply a false detection. In other words, for face parts, the face shape model is relatively inaccurate with respect to vertical positional deviation and relatively accurate with respect to horizontal positional deviation. Therefore, by combining the feature amount that encodes the magnitude relationship with the face shape model, the weaknesses of each are compensated by the strengths of the other, so that feature points can be detected well.

The embodiment described above is one example of the feature point detection device, feature point detection method, feature point detection program, and recording medium according to the present invention. The invention is not limited to the device, method, program, and recording medium according to the embodiment; these may be modified or applied to other uses.

DESCRIPTION OF SYMBOLS: 1 ... feature point detection device, 12 ... candidate point selection unit (reference feature amount acquisition unit, feature point detection unit), 13 ... face part determination unit (position correction unit).

Claims

A feature point detection device for detecting feature points of parts constituting an object depicted in a target image, comprising:
a reference feature amount acquisition unit that acquires a feature amount of a part of the object serving as a reference; and
a feature point detection unit that scans a scanning range in the target image using the feature amount of the part of the object serving as a reference, and outputs position information of a feature point of the part depicted in the target image,
wherein the feature amount of the part of the object serving as a reference is obtained by encoding, into a ternary code, the magnitude relationship between the pixel values of each of a plurality of pixel pairs selected from the pixels of the part of the object depicted in a teacher image, with the position information of each pixel pair associated with the code of that pixel pair, and
wherein the feature point detection unit
encodes into a ternary code, as the feature amount at a target point, the magnitude relationship between the pixel values of each pixel pair at the target point of the target image, using the position information of the pixel pairs of the object serving as a reference, and
selects candidate points from the target points within the scanning range of the target image using the degree of coincidence between the feature amount at the target point and the feature amount of the part of the object serving as a reference, and outputs the position information of the feature point of the part depicted in the target image using the position information of the candidate points.
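The scanning and candidate selection recited above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function names `encode` and `select_candidates`, the symmetric `margin`, the unweighted agreement count, and the `top_k` selection rule are all hypothetical choices made for the example.

```python
def encode(img, y, x, pairs, margin=10):
    """Ternary-encode each pixel pair, with pair offsets taken relative to (y, x)."""
    codes = []
    for (dy1, dx1), (dy2, dx2) in pairs:
        diff = img[y + dy1][x + dx1] - img[y + dy2][x + dx2]
        codes.append(-1 if diff < -margin else (1 if diff > margin else 0))
    return codes


def select_candidates(img, pairs, ref_codes, scan_range, top_k=1):
    """Return the top_k target points whose codes agree best with ref_codes."""
    (y0, y1), (x0, x1) = scan_range
    scored = []
    for y in range(y0, y1):
        for x in range(x0, x1):
            codes = encode(img, y, x, pairs)
            agreement = sum(c == r for c, r in zip(codes, ref_codes))
            scored.append((agreement, (y, x)))
    scored.sort(key=lambda item: -item[0])  # highest agreement first
    return [point for _, point in scored[:top_k]]


# Toy 5x5 image: one dark pixel at (2, 2) on a bright background.
image = [[200] * 5 for _ in range(5)]
image[2][2] = 50

# Each pair compares the centre pixel with one of its four neighbours.
pairs = [((0, 0), (-1, 0)), ((0, 0), (1, 0)), ((0, 0), (0, -1)), ((0, 0), (0, 1))]
reference_codes = [-1, -1, -1, -1]  # centre darker than every neighbour

candidates = select_candidates(image, pairs, reference_codes, ((1, 4), (1, 4)))
print(candidates)
```

The scanning range is kept one pixel inside the image border so that every pair offset stays in bounds.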

The feature point detection device according to claim 1, wherein the pixel values of each pixel pair consist of a first pixel value and a second pixel value, and
the feature point detection unit encodes the magnitude relationship between the pixel values of each pixel pair using a difference obtained by subtracting the second pixel value from the first pixel value, assigning a first code if the difference is less than a first threshold, a second code if the difference is greater than a second threshold, and a third code if the difference is greater than or equal to the first threshold and less than or equal to the second threshold.
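The two-threshold encoding can be written compactly. In this minimal sketch the threshold values and the numeric labels -1, +1, and 0 for the first, second, and third codes are assumptions chosen for illustration. The first code is taken to apply when the difference is below the first threshold, which is the only range left over by the second and third cases.

```python
def encode_pair(first_value, second_value, first_threshold=-10, second_threshold=10):
    """Ternary-encode one pixel pair from the difference first_value - second_value."""
    diff = first_value - second_value
    if diff < first_threshold:
        return -1  # first code: first pixel clearly darker
    if diff > second_threshold:
        return 1   # second code: first pixel clearly brighter
    return 0       # third code: difference lies within [first, second] threshold


print(encode_pair(50, 200), encode_pair(200, 50), encode_pair(100, 105))
```

The third code absorbs small differences, so noise in nearly flat regions does not flip the code between the other two values.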

The feature point detection device according to claim 1, wherein the reference feature amount acquisition unit acquires a template in which the feature amount of the part of the object serving as a reference is reflected, and
the feature point detection unit compares the template with the feature amount of the target image and outputs the position information of the feature point of the part of the target image.

The feature point detection device according to claim 3, wherein the feature point detection unit determines, for each pixel pair, whether the code of the pixel pair is the same as the code in the template, calculates, using the pixel pairs whose codes are determined to be the same as the codes in the template and a weight defined for each pixel pair, a degree of coincidence that increases as the feature amount of the target image matches the template more closely, and outputs the position information of the feature point of the part of the target image based on the degree of coincidence.
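One way to realize such a weighted degree of coincidence is to sum the weights of the pixel pairs whose codes agree with the template. The sketch below is an assumption-laden illustration: the function name `coincidence`, the normalization by the total weight, and the example weights are not specified in the source.

```python
def coincidence(target_codes, template_codes, weights):
    """Sum of weights of pairs whose code equals the template code, scaled to [0, 1]."""
    matched = sum(
        w for t, r, w in zip(target_codes, template_codes, weights) if t == r
    )
    return matched / sum(weights)


template_codes = [-1, 1, 0, -1]
weights = [4, 3, 2, 1]           # hypothetical learned weights, one per pixel pair
target_codes = [-1, 1, 1, -1]    # third pair disagrees with the template

score = coincidence(target_codes, template_codes, weights)
print(score)
```

Because only matching pairs contribute, the score grows monotonically as more pairs, or more heavily weighted pairs, agree with the template.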

The feature point detection device, wherein the pixel pairs included in the template, the code indicating the magnitude relationship of each pixel pair, and the weight defined for each pixel pair are set in advance by learning using teacher images.

The feature point detection device according to any one of claims 1 to 5, further comprising a position correction unit that corrects the position information of the feature point of the part of the target image detected by the feature point detection unit, using a shape model that models the relationship among the position information of the feature points of the parts.

The feature point detection device according to claim 6, wherein the shape model takes a set of position information of the feature points of the parts of the object as a shape vector, and expresses the relationship among the position information of the feature points of the parts using a reference shape vector and a deviation from the reference shape vector.

The feature point detection apparatus according to claim 7, wherein the reference shape vector is an average shape vector learned in advance using a teacher image.

The feature point detection device according to claim 8, wherein the shape model has, as principal component vectors, eigenvectors obtained by principal component analysis of the variance-covariance matrix of shape vectors learned using teacher images, and expresses the relationship among the position information of the feature points of the parts using the sum of the average shape vector and the principal component vectors multiplied by weight coefficients.
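A shape expressed as the average shape vector plus weighted principal component vectors can be sketched as below. This is a minimal illustration under assumptions: the principal component vectors are taken as already learned and orthonormal, and the names `project` and `reconstruct` and all numeric values are hypothetical. The principal component analysis itself is not shown.

```python
def project(shape, mean, components):
    """Weight coefficient of each principal component for the given shape vector."""
    deviation = [s - m for s, m in zip(shape, mean)]
    return [sum(p * d for p, d in zip(comp, deviation)) for comp in components]


def reconstruct(mean, components, coefficients):
    """Average shape vector plus the weighted sum of principal component vectors."""
    shape = list(mean)
    for coeff, comp in zip(coefficients, components):
        shape = [s + coeff * p for s, p in zip(shape, comp)]
    return shape


# Shape vector layout (x1, y1, x2, y2) for two feature points.
mean_shape = [0.0, 0.0, 10.0, 0.0]
components = [[0.0, 0.6, 0.0, 0.8]]   # unit vector: only vertical movement is modelled
detected = [2.0, 3.0, 10.0, 4.0]      # includes a horizontal error of 2 on point 1

coefficients = project(detected, mean_shape, components)
corrected = reconstruct(mean_shape, components, coefficients)
print(coefficients, corrected)
```

Projecting onto the components and reconstructing keeps the deviation that lies along a principal component (the vertical movement here) and discards the part that does not (the horizontal error on the first point).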

The feature point detection device according to claim 7, wherein, when a feature point of a part of the target image whose position information has been corrected using the shape model lies at a position separated from the reference shape vector by a predetermined value or more, the position correction unit excludes that feature point and performs the correction again.
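The exclude-and-correct-again behaviour can be illustrated with a small loop. The sketch below is only an analogy under strong assumptions: a translation-only fit (`fit_translation`) stands in for the shape model, distance is measured from the fitted reference shape, and the names, the threshold `max_dist`, and the sample coordinates are all hypothetical.

```python
import math


def fit_translation(points, reference, active):
    """Stand-in shape model: shift the reference shape by the mean offset of active points."""
    n = len(active)
    dx = sum(points[i][0] - reference[i][0] for i in active) / n
    dy = sum(points[i][1] - reference[i][1] for i in active) / n
    return [(rx + dx, ry + dy) for rx, ry in reference]


def correct_with_exclusion(points, reference, max_dist):
    """Fit, exclude points lying max_dist or more from the fitted shape, then fit again."""
    active = list(range(len(points)))
    while True:
        fitted = fit_translation(points, reference, active)
        kept = [i for i in active if math.dist(points[i], fitted[i]) < max_dist]
        if len(kept) == len(active) or not kept:
            return fitted, active
        active = kept


reference = [(0, 0), (10, 0), (5, 5), (5, 10)]
# The first three points are the reference shifted by (2, 1); the last is a false detection.
detected = [(2, 1), (12, 1), (7, 6), (30, 30)]

fitted, used = correct_with_exclusion(detected, reference, max_dist=10)
print(fitted, used)
```

After the outlier is excluded, the second fit is driven only by the consistent points, and the excluded point receives the position predicted by the fitted shape.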

The feature point detection apparatus according to claim 1, wherein the object is a face.

A feature point detection method for detecting feature points of parts constituting an object depicted in a target image, comprising:
a reference feature amount acquisition step of acquiring a feature amount of a part of the object serving as a reference; and
a feature point detection step of scanning a scanning range in the target image using the feature amount of the part of the object serving as a reference, and outputting position information of a feature point of the part depicted in the target image,
wherein the feature amount of the part of the object serving as a reference is obtained by encoding, into a ternary code, the magnitude relationship between the pixel values of each of a plurality of pixel pairs selected from the pixels of the part of the object depicted in a teacher image, with the position information of each pixel pair associated with the code of that pixel pair, and
wherein, in the feature point detection step,
the magnitude relationship between the pixel values of each pixel pair at a target point of the target image is encoded into a ternary code, as the feature amount at the target point, using the position information of the pixel pairs of the object serving as a reference, and
candidate points are selected from the target points within the scanning range of the target image using the degree of coincidence between the feature amount at the target point and the feature amount of the part of the object serving as a reference, and the position information of the feature point of the part depicted in the target image is output using the position information of the candidate points.

A feature point detection program for causing a computer to operate so as to detect feature points of parts constituting an object depicted in a target image, the program causing the computer to operate as:
a reference feature amount acquisition unit that acquires a feature amount of a part of the object serving as a reference; and
a feature point detection unit that scans a scanning range in the target image using the feature amount of the part of the object serving as a reference, and outputs position information of a feature point of the part depicted in the target image,
wherein the feature amount of the part of the object serving as a reference is obtained by encoding, into a ternary code, the magnitude relationship between the pixel values of each of a plurality of pixel pairs selected from the pixels of the part of the object depicted in a teacher image, with the position information of each pixel pair associated with the code of that pixel pair, and
wherein the feature point detection unit
encodes into a ternary code, as the feature amount at a target point, the magnitude relationship between the pixel values of each pixel pair at the target point of the target image, using the position information of the pixel pairs of the object serving as a reference, and
selects candidate points from the target points within the scanning range of the target image using the degree of coincidence between the feature amount at the target point and the feature amount of the part of the object serving as a reference, and outputs the position information of the feature point of the part depicted in the target image using the position information of the candidate points.

A computer-readable recording medium having recorded thereon a feature point detection program for causing a computer to operate so as to detect feature points of parts constituting an object depicted in a target image,
wherein the feature point detection program causes the computer to operate as:
a reference feature amount acquisition unit that acquires a feature amount of a part of the object serving as a reference; and
a feature point detection unit that scans a scanning range in the target image using the feature amount of the part of the object serving as a reference, and outputs position information of a feature point of the part depicted in the target image,
wherein the feature amount of the part of the object serving as a reference is obtained by encoding, into a ternary code, the magnitude relationship between the pixel values of each of a plurality of pixel pairs selected from the pixels of the part of the object depicted in a teacher image, with the position information of each pixel pair associated with the code of that pixel pair, and
wherein the feature point detection unit
encodes into a ternary code, as the feature amount at a target point, the magnitude relationship between the pixel values of each pixel pair at the target point of the target image, using the position information of the pixel pairs of the object serving as a reference, and
selects candidate points from the target points within the scanning range of the target image using the degree of coincidence between the feature amount at the target point and the feature amount of the part of the object serving as a reference, and outputs the position information of the feature point of the part depicted in the target image using the position information of the candidate points.