JP2006338686A

JP2006338686A - Face similarity calculation method and device

Info

Publication number: JP2006338686A
Application number: JP2006212711A
Authority: JP
Inventors: Toshio Kamei; 俊男亀井
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2006-08-04
Filing date: 2006-08-04
Publication date: 2006-12-14
Anticipated expiration: 2021-12-14
Also published as: JP4375571B2

Abstract

PROBLEM TO BE SOLVED: To provide a face metadata generation technique and a face similarity calculation technique capable of improving the accuracy of face recognition, and to provide a technique for constructing a practical face matching system.

SOLUTION: Face feature quantities are extracted by a face feature extraction part 121, a reliability index is extracted by a reliability index extraction part 122, and both are output as face metadata. In matching, a distribution estimation part 141 uses the reliability index of the face metadata to estimate information, such as parameters, on the posterior distribution given that reliability index, and a distance calculation part 142 calculates the similarity between the feature quantities.

COPYRIGHT: (C)2007,JPO&INPIT

Description

The present invention relates to techniques usable for face identification, face verification, facial expression recognition, gender classification from faces, age estimation from faces, and the like, and in particular to a method and an apparatus for calculating the similarity of face images appearing in still images or moving images.

Metadata is, in general, data that describes or represents the meaning of other data. In face recognition, it mainly means data about face data such as still face images and moving images.

MPEG-7 (the international standard for a multimedia content description interface, ISO/IEC 15938, standardized by MPEG, the Moving Picture Experts Group) is widely known as a standardization activity for metadata describing multimedia content such as video, images, and audio. Within MPEG-7, a face recognition descriptor has been proposed as a metadata descriptor for face recognition (Non-Patent Document 1).

For a cropped and normalized face image, this face recognition descriptor uses a kind of subspace method, generally called the eigenface method, to obtain a basis matrix for extracting feature quantities of the face image. Face feature quantities are extracted from the image with this basis matrix and used as metadata. The descriptor also proposes using a weighted absolute-value distance as the similarity measure for these face feature quantities.

Various face recognition methods are also known, for example eigenface methods based on principal component analysis (Non-Patent Document 2) or on discriminant analysis (Non-Patent Document 3).

In addition, there is a method that, when applying the subspace method to feature quantities obtained from fingerprint images, introduces a quality index and adaptively measures the distance between patterns (Non-Patent Document 4, Patent Document 1).

Non-Patent Document 1: A. Yamada et al. (eds.), "MPEG-7 Visual part of eXperimental Model Version 9.0," ISO/IEC JTC1/SC29/WG11 N3914, 2001.
Non-Patent Document 2: Moghaddam et al., "Probabilistic Visual Learning for Object Detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 17, No. 7, pp. 696-710, 1997.
Non-Patent Document 3: W. Zhao et al., "Discriminant Analysis of Principal Components for Face Recognition," Proceedings of the IEEE Third International Conference on Automatic Face and Gesture Recognition, pp. 336-341, 1998.
Non-Patent Document 4: T. Kamei and M. Mizoguchi, "Fingerprint Preselection Using Eigenfeatures," Proceedings of the 1998 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 918-923, 1998.
Non-Patent Document 5: C. M. Bishop, "Neural Networks for Pattern Recognition," Oxford University Press, 1995.
Patent Document 1: Japanese Patent Laid-Open No. 10-177650.

However, the conventional techniques described above could not achieve sufficient face recognition accuracy. An object of the present invention is therefore to provide a face metadata generation technique and a face similarity calculation technique capable of improving the accuracy of face recognition, and further to provide a technique for constructing a practical face matching system.

According to the present invention, the accuracy of face recognition can be improved by extracting a reliability from the face image and adaptively calculating the similarity between patterns according to that reliability.

According to the present invention, a face similarity calculation device calculates the similarity of face images based on face feature quantities extracted from the face images and on reliability indices representing the reliability of face recognition results obtained with those feature quantities. The device is characterized by comprising: distribution estimation means that stores in advance, for each value of the reliability index, statistics of the distribution of a comparison quantity between face feature quantities, and outputs the statistics corresponding to the value of the reliability index as parameters (parameter information) of the posterior distribution of the comparison quantity; and distance calculation means that uses the comparison quantity and the parameters to output, as the similarity, a distance between face feature quantities calculated based on the posterior distribution of the comparison quantity.

According to a first aspect of the present invention, the distribution estimation means stores in advance, for each value of the reliability index, statistics of the distribution of the difference vector between face feature quantities, and outputs the statistics corresponding to the value of the reliability index as parameters of the posterior distribution of the difference vector. The distance calculation means uses the difference vector and the parameters to output, as the similarity, the distance between face feature quantities calculated with a distance function derived from the log likelihood of the posterior distribution of the difference vector.

According to a second aspect of the present invention, the distribution estimation means stores in advance, for each value of the reliability index, first statistics of the distribution of difference vectors between face feature quantities within a class that should be regarded as matching (the intra-class distribution) and second statistics of the distribution of difference vectors between face feature quantities across classes that should be regarded as non-matching (the inter-class distribution). It outputs the first and second statistics corresponding to the value of the reliability index as the parameters of the intra-class distribution and of the inter-class distribution, respectively. The distance calculation means uses the difference vector and the parameters to output, as the similarity, the distance between face feature quantities calculated with a distance function derived from the log likelihood of the ratio of the intra-class distribution to the inter-class distribution.

According to the present invention, highly accurate face image matching can be achieved by outputting the statistics corresponding to the value of the reliability index as parameters of the posterior distribution of the comparison quantity, and by outputting as the similarity the distance between face feature quantities calculated from the comparison quantity and those parameters based on that posterior distribution.

(Principle of the Invention)
First, the principle of the present invention will be described. In general, when performing pattern recognition, if a large amount of training data can be prepared for each class to be recognized, the distribution function of the patterns can be estimated by statistical analysis of that training data, and a pattern recognition mechanism can be built on it. In face recognition applications, however, often only one registered image can be obtained per individual, or only a very small number of registered images is allowed.

Even in such a case, a reliability index can be assigned to each face feature vector and classes can be defined by that index. Statistical analysis of the classes defined by the reliability index then yields an estimate of the pattern distribution function, so that a pattern recognition mechanism based on distribution functions, mediated by the reliability index, can be built even for face recognition applications in which only a single registered image is available.

The principle of face recognition is described below for two distances: a Mahalanobis distance based on the error distribution, and a discriminant distance based on the intra-class and inter-class distributions.

Consider observing the face of one person. Let v₀ be the feature vector that would be obtained from the face image if there were no error, and suppose that the actually observed vector v is v₀ with an error vector ε superimposed on it (Equation 1).

If the feature vector v₀ is observed twice, two observation vectors v₁ and v₂ are obtained (Equation 2).
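The equation images for (Equation 1) and (Equation 2) are not reproduced in this text. Based on the definitions given above, they presumably take the following additive-error form; this is a reconstruction under that assumption, not the patent's original typesetting.

```latex
\begin{align}
v   &= v_0 + \varepsilon \tag{1} \\
v_1 &= v_0 + \varepsilon_1, \qquad v_2 = v_0 + \varepsilon_2 \tag{2}
\end{align}
```

Under this form the difference vector s = v₁ − v₂ equals ε₁ − ε₂, so it no longer depends on the unknown error-free vector v₀.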

Now suppose that reliability indices θ₁ and θ₂, correlated with the error vectors ε₁ and ε₂ respectively, are obtained, and that the corresponding posterior distributions are p(ε|θ₁) and p(ε|θ₂).

If the posterior distribution of the difference vector s between v₁ and v₂ under these distributions is written p(s|θ₁, θ₂), the following log likelihood can be used as the similarity d(v₁, v₂) between the patterns (Equation 3).

If the posterior distributions p(ε|θ₁) and p(ε|θ₂) are each normal, the distribution p(s|θ₁, θ₂) of the difference vector s is also normal. Taking the posterior distribution p(ε|θ_i) (i = 1, 2) of the error vector to be a normal distribution with mean 0 and covariance matrix Σ_ε(θ_i), the distribution of the difference vector s has mean 0 and a covariance matrix Σ_s(θ₁, θ₂) given by the following equation (Equation 4).

That is, the posterior distribution p(s|θ₁, θ₂) is expressed by the following equation (Equation 5).

Therefore, as shown in the following equation (Equation 6), (Equation 3) can be written as a Mahalanobis distance that adapts to the reliability indices θ₁ and θ₂ by using the covariance matrices Σ_ε(θ₁) and Σ_ε(θ₂).
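The equation images for (Equation 3) to (Equation 6) are missing from this text. The following is a hedged reconstruction from the surrounding definitions. It assumes that the two error vectors are independent, that the distance is the negative log likelihood, and that N (a symbol introduced here) is the dimension of s. The original may differ in sign convention, scaling, or dropped constants.

```latex
\begin{align}
d(v_1, v_2) &= -\ln p(s \mid \theta_1, \theta_2), \qquad s = v_1 - v_2 \tag{3} \\
\Sigma_s(\theta_1, \theta_2) &= \Sigma_\varepsilon(\theta_1) + \Sigma_\varepsilon(\theta_2) \tag{4} \\
p(s \mid \theta_1, \theta_2) &= \frac{1}{(2\pi)^{N/2} \, |\Sigma_s(\theta_1, \theta_2)|^{1/2}}
  \exp\!\left( -\tfrac{1}{2} \, s^{T} \Sigma_s(\theta_1, \theta_2)^{-1} s \right) \tag{5} \\
d(v_1, v_2) &= \tfrac{1}{2} \, s^{T} \bigl( \Sigma_\varepsilon(\theta_1) + \Sigma_\varepsilon(\theta_2) \bigr)^{-1} s
  + \tfrac{1}{2} \ln \bigl| \Sigma_\varepsilon(\theta_1) + \Sigma_\varepsilon(\theta_2) \bigr|
  + \tfrac{N}{2} \ln 2\pi \tag{6}
\end{align}
```

The last term of (6) is the same for every pair and can be ignored when distances are only compared with each other.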

If independence between the elements of the error vector is assumed, (Equation 5) becomes the following equation (Equation 7).

Here, σ_{s,k}(θ₁, θ₂)² is the k-th diagonal element of the covariance matrix Σ_s(θ₁, θ₂), that is, the variance of the observation error. σ_{ε,k}(θ₁)² and σ_{ε,k}(θ₂)² are the k-th diagonal elements of the covariance matrices Σ_ε(θ₁) and Σ_ε(θ₂), respectively, and s_k is the k-th element of the difference vector s.

Under this normality assumption, (Equation 3) allows the similarity to be defined, as in the following equation (Equation 8), by a Mahalanobis distance that adapts to the reliability indices θ₁ and θ₂ using the per-element variances σ_{ε,k}(θ₁)² and σ_{ε,k}(θ₂)² of each feature vector.

Here, v_{1,k} and v_{2,k} are the k-th elements of the feature vectors v₁ and v₂, respectively.
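The element-wise adaptive Mahalanobis distance described here can be sketched in a few lines. The sketch assumes the distance is the negative log likelihood of a zero-mean normal whose k-th variance is σ_{ε,k}(θ₁)² + σ_{ε,k}(θ₂)², with the constant term dropped. The function and parameter names are illustrative and do not come from the patent.

```python
import math


def adaptive_mahalanobis(v1, v2, var1, var2):
    """Negative log likelihood of s = v1 - v2 under N(0, diag(var1 + var2)).

    var1[k] and var2[k] stand for the per-element error variances looked up
    from the reliability index of each image (illustrative names).
    The constant (N/2) * ln(2*pi) is omitted; it does not affect ranking.
    """
    if not (len(v1) == len(v2) == len(var1) == len(var2)):
        raise ValueError("all inputs must have the same length")
    d = 0.0
    for a, b, s1, s2 in zip(v1, v2, var1, var2):
        var_s = s1 + s2  # variance of the k-th element of the difference
        s_k = a - b
        d += 0.5 * (s_k * s_k / var_s + math.log(var_s))
    return d
```

With this form, the same feature difference contributes less to the distance when either image has a reliability index associated with large error variances.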

The description above assumed a normal distribution for the posterior distribution p(s|θ₁, θ₂); a mixture of normal distributions is assumed below. Suppose that, as shown in the following equation (Equation 9), the posterior distribution p(s|θ₁, θ₂) can be expressed as a sum of normal distributions p(s|θ₁, θ₂, j) (j = 1, 2, ..., M).

Accordingly, an adaptive mixture Mahalanobis distance can be defined by the following equation (Equation 10).
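A minimal sketch of such a mixture-based distance follows. It assumes the distance is −ln Σ_j P(j) p(s|θ₁, θ₂, j) with zero-mean components and, for brevity, diagonal covariances. In practice `weights` and `variances` would be the values stored for the observed pair (θ₁, θ₂). All names are illustrative.

```python
import math


def log_normal_diag(s, var):
    """Log density at s of a zero-mean normal with diagonal covariance var."""
    return sum(-0.5 * (x * x / v + math.log(2.0 * math.pi * v))
               for x, v in zip(s, var))


def adaptive_mixture_distance(s, weights, variances):
    """Return -ln sum_j P(j) N(s; 0, diag(variances[j])).

    weights[j] is P(j) and must be positive; variances[j] holds the
    per-element variances of component j. Log-sum-exp keeps the sum
    numerically stable when the component densities are very small.
    """
    logs = [math.log(w) + log_normal_diag(s, var)
            for w, var in zip(weights, variances)]
    m = max(logs)
    return -(m + math.log(sum(math.exp(v - m) for v in logs)))
```

With a single component of weight 1, this reduces to the negative log likelihood of one normal distribution, which matches the single-normal case apart from the constant term.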

The covariance matrices Σ_s(θ₁, θ₂, j) of the posterior distributions p(s|θ₁, θ₂, j) and the probabilities P(j) can be estimated with general estimation methods such as maximum likelihood estimation or the EM algorithm (see Non-Patent Document 5).

Assuming a mixture of normal distributions allows the distribution to be approximated more accurately and improves matching performance, but it requires a large amount of training data and greatly increases the amount of computation.

The Mahalanobis distance based on the error distribution described above is well suited to problems such as face identification, in which one determines which of the registered face data is closest to the query face.

In the face verification problem, on the other hand, what matters is whether to accept or reject the input face when judging its identity with a registered image. The distance described below, named the "discriminant distance", is a better similarity measure for this face verification problem than the Mahalanobis distance described above.

Suppose that two face feature vectors v should be judged to match, that is, the two feature vectors belong to the same class (for example, they are face data of the same person); the combination of these feature vectors is then said to belong to class W. If two feature vectors v should be judged not to match, that is, they are feature vectors from different classes (for example, they are face data of different persons), their combination is said to belong to class B.

Suppose that reliability indices θ₁ and θ₂ are obtained for the two feature vectors v₁ and v₂. Consider the problem of discriminating, when the difference vector s and the two reliability indices θ₁ and θ₂ (the set of two reliability indices is written [θ_i] below) are observed, between the class W regarded as matching and the class B regarded as non-matching. The decision rule of the following equation (Equation 11) is obtained.

By Bayes' theorem, the left-hand side of (Equation 11) can be rewritten as the following equation (Equation 12).

Here, the probabilities of W and B are assumed to be independent of that of [θ_i], so that P(W, [θ_i]) = P(W)P([θ_i]) and P(B, [θ_i]) = P(B)P([θ_i]).

By computing the log likelihood of (Equation 12) as the distance d(v₁, v₂) between patterns, a similarity suited to the face verification problem is obtained, as in the following equation (Equation 13).

If the prior probabilities P(W) and P(B) differ from one matching to another and are known, it is desirable to compute the second term of (Equation 13). In many cases, however, the prior probabilities cannot be known for each individual matching, so they are assumed to be constant; the second term is then regarded as constant and excluded from the similarity calculation.
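The equation images for (Equation 11) to (Equation 13) are not available in this text. One form consistent with the description (a posterior ratio test, Bayes' theorem with W and B independent of [θ_i], and a negative log likelihood whose second term contains only the priors) is sketched below. It is a reconstruction under those assumptions, and the original may be arranged differently.

```latex
\begin{align}
\frac{P(W \mid s, [\theta_i])}{P(B \mid s, [\theta_i])} &\ge 1 \tag{11} \\
\frac{P(W \mid s, [\theta_i])}{P(B \mid s, [\theta_i])}
  &= \frac{P(W) \, p_W(s \mid [\theta_i])}{P(B) \, p_B(s \mid [\theta_i])} \tag{12} \\
d(v_1, v_2) &= -\ln \frac{p_W(s \mid [\theta_i])}{p_B(s \mid [\theta_i])}
  - \ln \frac{P(W)}{P(B)} \tag{13}
\end{align}
```

In (12) the factor P([θ_i]) and the evidence p(s, [θ_i]) cancel between numerator and denominator, which is why only the class-conditional densities and the priors remain.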

Next, assume that the intra-class distribution p_W(s|[θ_i]) and the inter-class distribution p_B(s|[θ_i]) are each normal, with mean 0 and covariance matrices Σ_W([θ_i]) and Σ_B([θ_i]), respectively. The posterior distributions can then each be written as in the following equation (Equation 15).

Substituting these expressions into (Equation 14), with the second term of (Equation 14) omitted, gives the distance shown in the following equation (Equation 16). This will be called the "adaptive discriminant distance".

If independence between the elements of the difference vector s is assumed, (Equation 15) becomes the following equation (Equation 17).

Here, σ_{W,k}(θ_i)² and σ_{B,k}(θ_i)² are the k-th diagonal elements of the covariance matrices Σ_W(θ_i) and Σ_B(θ_i), respectively; that is, they correspond to the intra-class variance and the inter-class variance. s_k is the k-th element of the difference vector s.

Under this normality assumption, (Equation 16) allows the similarity to be defined, as in the following equation (Equation 18), by a discriminant distance that adapts to the reliability indices [θ_i] using the per-element intra-class variance σ_{W,k}(θ_i)² and inter-class variance σ_{B,k}(θ_i)² of each feature vector.
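As an illustration, an element-wise adaptive discriminant distance could look like the sketch below. It assumes the distance is −ln(p_W(s|[θ_i]) / p_B(s|[θ_i])) for zero-mean normal distributions with diagonal covariances, keeps the factor 1/2 that the original equation may have dropped, and uses illustrative names throughout.

```python
import math


def adaptive_discriminant_distance(v1, v2, var_w, var_b):
    """Return -ln(p_W(s) / p_B(s)) for s = v1 - v2.

    var_w[k] and var_b[k] stand for the intra-class and inter-class
    variances of element k, selected by the pair of reliability indices.
    Negative values favor class W (same person) and positive values favor
    class B, assuming equal prior probabilities.
    """
    if not (len(v1) == len(v2) == len(var_w) == len(var_b)):
        raise ValueError("all inputs must have the same length")
    d = 0.0
    for a, b, sw, sb in zip(v1, v2, var_w, var_b):
        s_k = a - b
        d += 0.5 * (s_k * s_k * (1.0 / sw - 1.0 / sb)
                    + math.log(sw) - math.log(sb))
    return d
```

Because the inter-class variance is normally larger than the intra-class variance, small differences give a negative distance and large differences a positive one.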

The description so far assumed normal distributions, characterized by the intra-class variance σ_{W,k}(θ_i)² and the inter-class variance σ_{B,k}(θ_i)²; mixture distributions are assumed below.

As in the following equation (Equation 19), assume that the intra-class distribution p_W(s|[θ_i]) and the inter-class distribution p_B(s|[θ_i]) can each be expressed as a sum of normal distributions p_W(s|[θ_i], j_W) (j_W = 1, 2, ..., M_W) and p_B(s|[θ_i], j_B) (j_B = 1, 2, ..., M_B), respectively.

Using this log likelihood, the adaptive mixture Mahalanobis distance of the following equation (Equation 20) can therefore be derived.

The covariance matrices Σ_W(s|[θ_i], j_W) and Σ_B(s|[θ_i], j_B) of the intra-class distributions p_W(s|[θ_i], j_W) and inter-class distributions p_B(s|[θ_i], j_B), as well as P(j_W) and P(j_B), can be estimated with maximum likelihood estimation or the EM algorithm.

Assuming mixture distributions allows the distributions to be approximated more accurately and can improve matching performance, but at the same time it requires a large amount of training data and greatly increases the amount of computation.

By additionally extracting a reliability index for the face feature quantity in this way, a distance criterion that adapts to the reliability index can be derived and a highly accurate face recognition mechanism can be built. The discussion above does not specify whether the reliability index for a feature vector is a scalar (a single component) or a vector (multiple components); it holds in either case, and using multiple components can be expected to improve performance.

As for concrete reliability indices, effective ones need to be found experimentally. In face recognition, a contrast index representing the contrast of the image is highly effective, as is, for frontal face recognition, an asymmetry index representing the amount by which the face image departs from left-right symmetry because of illumination or pose variation. Combining such reliability indices into a vector can be expected to improve accuracy further.

(Embodiment)
FIG. 1 is a block diagram showing a face image matching system according to an embodiment of the present invention. The face image matching system is described in detail below.

As shown in FIG. 1, the face image matching system according to the present invention comprises: a face image input unit 11 that inputs face images; a face metadata generation unit 12 that generates, from an input face image, face metadata consisting of its face feature quantity and reliability index; a face metadata storage unit 13 that stores the extracted face metadata; a face similarity calculation unit 14 that calculates face similarity from face metadata; a face image database 15 that stores face images; a control unit 16 that, in response to image registration requests and search requests, controls image input, metadata generation, metadata storage, and face similarity calculation; and a display unit 17 that displays face images and other information.

The face metadata generation unit 12 consists of a face feature extraction unit 121, which extracts face features from the input face image, and a reliability index extraction unit 122, which extracts the reliability index. The face similarity calculation unit 14 consists of a distribution estimation unit 141, which estimates parameter information on the posterior distribution from the reliability index, and a distance calculation unit 142, which calculates the distance between face feature quantities from the face feature quantities and the posterior distribution information supplied by the distribution estimation unit 141.

At registration, the face image input unit 11 inputs a face photograph or the like through a scanner or a video camera, with the size and position of the face aligned. Alternatively, a person's face may be captured directly with a video camera or the like. In that case, it is preferable to detect the face position in the input image with a face detection technique such as the one described in the Moghaddam document cited above, and to normalize the size and other properties of the face image automatically.

The input face image is registered in the face image database 15 as necessary. At the same time as the face image is registered, the face metadata generation unit 12 generates face metadata, which is stored in the face metadata storage unit 13.

At search time, a face image is likewise input through the face image input unit 11, and the face metadata generation unit 12 generates face metadata. The generated face metadata is either registered temporarily in the face metadata storage unit 13 or sent directly to the face similarity calculation unit 14. When the search is to check whether the input face image matches one already in the database (face identification), the similarity to each item of data registered in the face metadata storage unit 13 is calculated. Based on the result with the highest similarity (the smallest distance value), the control unit 16 selects a face image from the face image database 15 and displays it on the display unit 17 or the like, and an operator confirms whether the faces in the query image and the registered image are the same.

When, on the other hand, the task is to check whether the query face image matches a face image specified in advance by an ID number or the like (face verification), the face similarity calculation unit 14 calculates the similarity to the face image with the specified ID number. If the similarity is lower than a predetermined similarity (the distance value is larger), the faces are judged not to match; if the similarity is higher, they are judged to match, and the result is displayed on the display unit 17. If the system is used for room entry management, the control unit 16 can, instead of displaying the result, send an open/close control signal to an automatic door, so that entry is managed by controlling the door.
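The two search modes differ only in how the computed distances are used, which the following sketch makes explicit. The function names, the dictionary of distances keyed by registered ID, and the choice to accept when the distance equals the threshold are assumptions made for illustration.

```python
def identify(distance_by_id):
    """Face identification: return the registered ID with the smallest distance.

    distance_by_id maps each registered ID to the distance between the
    query metadata and that registered entry (hypothetical structure).
    """
    if not distance_by_id:
        raise ValueError("no registered entries to compare against")
    return min(distance_by_id, key=distance_by_id.get)


def verify(distance, threshold):
    """Face verification: accept when the distance does not exceed the threshold."""
    return distance <= threshold
```

In an entry-management setting, the boolean returned by `verify` would be what drives the door control signal in place of the display.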

The face image matching system operates as described above, and this operation can also be implemented on a computer system. For example, a metadata generation program that performs the metadata generation detailed below and a similarity calculation program that performs the similarity calculation can each be stored in memory and executed by a program-controlled processor to realize face image matching.

Next, the operation of this face image matching system, in particular that of the face metadata generation unit 12 and the face similarity calculation unit 14, is described in detail.

(1) Face Metadata Generation
The face metadata generation unit 12 extracts the face feature quantity from an image I(x, y) whose position and size have been normalized. For this normalization, the image may for example be normalized so that the eye positions are (16, 24) and (31, 24) and the image size is 46×56 pixels. The description below assumes that images have been normalized to this size.

The face feature quantity is extracted with the so-called eigenface method (the Moghaddam paper cited above). That is, for a feature vector Λ whose elements are the pixel values of the image, the feature vector v = U^T(Λ − Ψ) is computed, where U is the basis matrix specified by partial basis vectors selected from the basis vectors obtained by principal component analysis of a face image sample set [Λ], and Ψ is the mean face, the mean vector of the sample set [Λ]. Using this projection onto the partial basis vectors rather than the input image itself reduces the amount of data for the input image. Reducing the amount of data in this way is important not only for reducing storage in the metadata database but also for achieving fast matching. The feature vector may have, for example, 48 dimensions.
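The projection v = U^T(Λ − Ψ) can be sketched as follows. The basis is passed as a list of basis vectors (the columns of U); the function name and this data layout are illustrative choices, and the basis vectors and mean face are assumed to have been obtained elsewhere by principal component analysis.

```python
def project(image_vector, mean_face, basis):
    """Compute v = U^T (Lambda - Psi).

    image_vector is Lambda (pixel values in a fixed order), mean_face is
    Psi, and basis is a list of basis vectors, each as long as Lambda.
    The result has one element per basis vector.
    """
    if len(image_vector) != len(mean_face):
        raise ValueError("image and mean face must have the same length")
    centered = [x - m for x, m in zip(image_vector, mean_face)]
    return [sum(u_i * c_i for u_i, c_i in zip(u, centered)) for u in basis]
```

For the sizes mentioned in the text, Λ would have 46 × 56 = 2576 elements and `basis` would contain 48 vectors, so each face is reduced from 2576 pixel values to 48 feature values.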

Besides the commonly used eigenface method, the partial basis vectors may also be defined using, for example, a method that combines principal component analysis with discriminant analysis (the W. Zhao paper cited above).

Alternatively, the face feature vector may be extracted using as the basis matrix a matrix U (= aUn + bUm) obtained as a linear sum of a basis matrix Un, specified by partial basis vectors selected from the basis vectors obtained by principal component analysis or discriminant analysis of the feature vector set [Λ] of the face image samples as described above, and a basis matrix Um, specified by flipped partial basis vectors whose elements have been rearranged to correspond to the pixel permutation that mirrors a face image left to right. For example, with a = b = 1, the resulting face feature vector extracts only the components that are symmetric under left-right mirroring in the input image space. Because a face is inherently left-right symmetric, image components made asymmetric by illumination, and asymmetric components arising because the face is not directed to the front, are essentially noise. Removing them and extracting only the symmetric components yields face features that are stable against variations in illumination and pose.
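The effect of the combined basis U = aUn + bUm with a = b = 1 can be checked with a small sketch: a feature extracted with such a basis is the same for an image and for its mirror image. The helper names and the 2 × 2 toy image are assumptions, and the mean-face subtraction is omitted for brevity.

```python
def mirror(vec, width):
    """Rearrange a row-major pixel vector as a left-right flipped image."""
    return [x for r in range(0, len(vec), width)
            for x in reversed(vec[r:r + width])]


def symmetric_basis(basis, width, a=1.0, b=1.0):
    """Build U = a*Un + b*Um, where Um holds the mirrored basis vectors."""
    return [[a * n + b * m for n, m in zip(row, mirror(row, width))]
            for row in basis]


def project(vec, basis):
    return [sum(u * x for u, x in zip(row, vec)) for row in basis]


width = 2
un = [[1.0, 0.0, 0.0, 2.0]]          # one basis vector for a 2x2 image
u = symmetric_basis(un, width)
image = [3.0, 5.0, 7.0, 1.0]

feature = project(image, u)
feature_of_mirror = project(mirror(image, width), u)
print(feature, feature_of_mirror)
```

Because the two features coincide, any left-right asymmetric component of the input contributes nothing to the extracted feature.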

Alternatively, the image may be Fourier transformed, a vector whose elements are the magnitudes of the resulting complex Fourier components may be computed as the feature vector Λ, and the face features may be extracted by reducing its dimensionality through principal component analysis or discriminant analysis as described above. Fourier transforming the image in this way yields face features that are robust against positional displacement. In this manner, the face feature extraction unit 121 extracts the face feature v.
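Why Fourier magnitudes tolerate displacement can be seen in a one-dimensional sketch: a circular shift changes only the phase of each Fourier component, not its magnitude. The function name and the sample row are assumptions; the embodiment applies the transform to the two-dimensional image, and for a non-circular shift of a real image the invariance holds only approximately.

```python
import cmath


def dft_magnitudes(signal):
    """Magnitudes of the discrete Fourier transform of a 1-D signal."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]


row = [1.0, 4.0, 2.0, 0.0, 3.0, 5.0]
shifted = row[2:] + row[:2]           # the same row, circularly shifted

m_row = dft_magnitudes(row)
m_shifted = dft_magnitudes(shifted)
print(max(abs(a - b) for a, b in zip(m_row, m_shifted)))
```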

The reliability index extraction unit 122 extracts a contrast index θ_contrast and an asymmetry index θ_asymmetric, which are effective as reliability indexes for the face feature v. As the contrast index θ_contrast, the standard deviation of the pixel values of the face image I(x, y) is computed according to (Equation 21).

In (Equation 21), round() denotes rounding of a numerical value. The extracted reliability index θ_contrast is limited so that it fits in the 4-bit range [0, 1, 2, ..., 15]; values outside this range are clipped. Although the standard deviation of the image is used as the contrast index here, the variance, or the difference between the maximum and minimum pixel values in the image, may be extracted instead. A contrast index based on the maximum and minimum pixel values requires less computation than one based on the standard deviation or the variance, but it is relatively less effective.
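The contrast index can be sketched as follows. The text specifies only the standard deviation, rounding, and clipping to [0, 15]; the scale factor `scale` (and its default of 16/128) is an assumption made here to map 8-bit pixel statistics onto 16 levels, not a value taken from (Equation 21).

```python
import math


def contrast_index(pixels, scale=16.0 / 128.0):
    """4-bit contrast index: scaled standard deviation, rounded and clipped.

    `pixels` is a flat list of pixel values. `scale` is a hypothetical
    normalization constant, not taken from the source.
    """
    mean = sum(pixels) / len(pixels)
    std = math.sqrt(sum((p - mean) ** 2 for p in pixels) / len(pixels))
    return max(0, min(15, round(std * scale)))


flat = [100] * 8                      # no contrast at all
harsh = [0, 255] * 4                  # maximal contrast, clipped to 15
print(contrast_index(flat), contrast_index(harsh))
```

Note that Python's built-in `round` rounds exact halves to the nearest even integer, which may differ from the rounding intended by the source in those tie cases.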

As the asymmetry index θ_asymmetric, the mean of the absolute values (first power) of the differences between the face image I(x, y) and its mirrored image is extracted according to (Equation 22).

The extracted reliability index θ_asymmetric is likewise limited so that it fits in the 4-bit range [0, 1, 2, ..., 15]; values outside this range are clipped. Although the absolute value (first power) of the difference is used as the asymmetry index here, the square of the difference or the like may be used instead. An equivalent effect can also be obtained by using a value such as the sum in place of the mean. Furthermore, detecting the maximum difference and using that value as the asymmetry index reduces the amount of computation.
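A corresponding sketch of the asymmetry index is shown below. The mean absolute difference against the mirrored image, the rounding, and the clipping follow the text; the scale factor `scale` and its default value are assumptions, since (Equation 22) is not reproduced in this text.

```python
def asymmetry_index(image, scale=1.0 / 8.0):
    """4-bit asymmetry index of an image given as a list of pixel rows.

    Mean absolute difference between the image and its left-right mirror,
    scaled, rounded and clipped to [0, 15]. `scale` is hypothetical.
    """
    total = sum(abs(p - q)
                for row in image
                for p, q in zip(row, reversed(row)))
    count = sum(len(row) for row in image)
    return max(0, min(15, round(total / count * scale)))


symmetric = [[1, 2, 1], [5, 0, 5]]
side_lit = [[0, 0, 255, 255]]         # one half dark, one half bright
mild = [[10, 20, 30, 26]]
print(asymmetry_index(symmetric), asymmetry_index(side_lit),
      asymmetry_index(mild))
```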

In this way, the feature vector v, the reliability index θ_contrast, and the reliability index θ_asymmetric are extracted from the face image and output as face metadata. As noted earlier, this face metadata generation procedure can also be executed by a computer under the control of a computer program.

(2) Face Similarity Calculation
Next, the operation of the face similarity calculation unit 14 will be described. In the face similarity calculation unit 14, the distribution estimation unit 141 estimates parameter information on the posterior distribution using the reliability indexes θ_contrast,1, θ_contrast,2 and θ_asymmetric,1, θ_asymmetric,2 of the two sets of face metadata, and the distance calculation unit 142 calculates the similarity d between the face features using the two feature vectors v₁ and v₂ of the two sets of face metadata together with the parameter information on the posterior distribution.

The case in which the face similarity is calculated by (Equation 8) or (Equation 18) is described here.

The reliability indexes θ₁ and θ₂ that appear in (Equation 8) and (Equation 18) are vectors in this embodiment, with elements θ₁ = (θ_contrast,1, θ_asymmetric,1)^T and θ₂ = (θ_contrast,2, θ_asymmetric,2)^T. Since the contrast index and the asymmetry index are each represented by 4 bits, θ_i can take 256 states. A given reliability index θ_i identifies one of these 256 states.
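One simple way to address the 256 states is to pack the two 4-bit indexes into a single table index, as sketched below. The function name and the choice of which index occupies the upper four bits are assumptions; the text only requires that each pair identify one state.

```python
def state_index(contrast, asymmetry):
    """Pack two 4-bit reliability indexes into one state number in [0, 255].

    The bit layout (contrast in the upper nibble) is an arbitrary choice.
    """
    if not (0 <= contrast <= 15 and 0 <= asymmetry <= 15):
        raise ValueError("both indexes must lie in [0, 15]")
    return (contrast << 4) | asymmetry


all_states = {state_index(c, a) for c in range(16) for a in range(16)}
print(len(all_states), state_index(3, 10))
```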

As described later, the distribution estimation unit 141 stores in tables the variance values σ_{ε,k}(θ), σ_{W,k}(θ), and σ_{B,k}(θ) of the difference vector, obtained in advance for each value of the reliability index θ (which has 256 states). Using the reliability indexes θ₁ and θ₂, it looks up the values in the respective variance tables and passes the obtained variance values to the distance calculation unit 142 as posterior distribution information. For face identification, it suffices to output the variance values σ_{ε,k}(θ₁) and σ_{ε,k}(θ₂) required by (Equation 8) to the distance calculation unit 142; for face verification, the variance values σ_{W,k}(θ) and σ_{B,k}(θ) required by (Equation 18) are output.

The distance calculation unit 142 calculates the adaptive Mahalanobis distance or the adaptive discriminant distance according to (Equation 8) or (Equation 18), respectively, and outputs it as the similarity d.
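The table lookup and distance calculation can be sketched as follows. (Equation 8) is not reproduced in this text, so the sketch assumes the usual negative log likelihood of a normal distribution with diagonal covariance, with the variance of each element of the difference vector taken as the sum of the two per-image variances, in line with the additive relation this section states for (Equation 18). The function name, the factor 1/2, and the table contents are assumptions.

```python
import math


def adaptive_mahalanobis(v1, v2, var1, var2, with_log_term=True):
    """Assumed form of an adaptive Mahalanobis distance.

    `var1` and `var2` are per-element variances looked up from a table by
    the reliability indexes of the two images. Setting `with_log_term` to
    False skips the logarithmic term to save computation.
    """
    d = 0.0
    for a, b, s1, s2 in zip(v1, v2, var1, var2):
        var = s1 + s2                 # variance of the difference element
        d += (a - b) ** 2 / var
        if with_log_term:
            d += math.log(2.0 * math.pi * var)
    return 0.5 * d


# Hypothetical variance table: state number -> per-element variances.
table = {0: [1.0, 1.0], 255: [4.0, 4.0]}
v1, v2 = [1.0, 0.0], [0.0, 0.0]

d_reliable = adaptive_mahalanobis(v1, v2, table[0], table[0], False)
d_unreliable = adaptive_mahalanobis(v1, v2, table[255], table[255], False)
print(d_reliable, d_unreliable)
```

The same feature difference is weighted less when the reliability indexes point to states with large variance, which is the adaptive behavior the text describes.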

The variance values in the variance tables described above are computed beforehand from face image data samples prepared in advance. From the set of feature vectors [v_i] of the face image samples and their reliability indexes [θ_i], each variance value can be calculated as follows.

Here, "(i, j) belongs to class W" means that data i and data j were obtained from the same class (the same person), that is, the pair is within a class, and "(i, j) belongs to class B" means that data i and data j were obtained from different classes (different persons), that is, the pair is between classes. N_ε(θ), N_W(θ), and N_B(θ) are the numbers of data combinations belonging to the respective classes. When the variance is calculated for each bin of θ in this way and a bin contains too few data, the data of neighboring bins are merged to secure a sufficient number of samples. This is similar to merging bins by the k-nearest neighbor method in distribution estimation (the Bishop reference cited above, p. 53).
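Building such a variance table can be sketched as follows. The estimation formulas themselves are not reproduced in this text, so the sketch makes two assumptions: only same-person pairs whose two images fall in the same θ bin are used, and the mean squared difference is halved, because the difference of two samples with equal error variance has twice that variance. Bins with too few pairs are simply omitted here; the merging of neighboring bins described above is not implemented.

```python
from collections import defaultdict


def estimate_error_variance(samples, min_pairs=1):
    """Estimate a per-bin, per-element error variance table.

    `samples` is a list of (person_id, theta_state, feature_vector).
    Returns {theta_state: [variance of element k, ...]}. All names and the
    estimation rule are assumptions made for illustration.
    """
    sq_sums = {}
    pair_counts = defaultdict(int)
    for i, (person_i, theta_i, v_i) in enumerate(samples):
        for person_j, theta_j, v_j in samples[i + 1:]:
            if person_i != person_j or theta_i != theta_j:
                continue              # keep within-class, same-bin pairs
            acc = sq_sums.setdefault(theta_i, [0.0] * len(v_i))
            for k, (a, b) in enumerate(zip(v_i, v_j)):
                acc[k] += (a - b) ** 2
            pair_counts[theta_i] += 1
    return {
        theta: [s / (2.0 * pair_counts[theta]) for s in acc]
        for theta, acc in sq_sums.items()
        if pair_counts[theta] >= min_pairs
    }


samples = [
    ("A", 3, [1.0, 0.0]), ("A", 3, [3.0, 0.0]),
    ("B", 3, [0.0, 2.0]), ("B", 3, [0.0, 0.0]),
    ("A", 5, [9.0, 9.0]),             # no partner in bin 5
]
variance_table = estimate_error_variance(samples)
print(variance_table)
```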

Note that in the case of (Equation 18), as with (Equation 4), σ_{W,k}([θ_i])² = σ_{W,k}(θ₁)² + σ_{W,k}(θ₂)² and σ_{B,k}([θ_i])² = σ_{B,k}(θ₁)² + σ_{B,k}(θ₂)².

Since the error variance σ_{ε,k}(θ)² and the intraclass variance σ_{W,k}(θ)² are identical, a face image matching system that computes the distances of both (Equation 8) and (Equation 18) may share a single variance table for the two.

In addition, the error distribution and the interclass distribution are often strongly correlated, so using the interclass variance σ_{B,k}(θ)² in place of the error variance σ_{ε,k}(θ)² still improves accuracy over not using the reliability index at all (although using the error variance gives better accuracy).

By calculating the similarity between face metadata using posterior distribution information obtained through the reliability indexes θ_contrast and θ_asymmetric in this way, accurate face recognition can be performed. As noted earlier, this face similarity calculation procedure can also be executed by a computer under the control of a computer program.

Although the similarity is calculated here using (Equation 8) and (Equation 18), it may also be calculated approximately, for example to increase speed, by various calculation methods such as the following.

Omitting the second term on the right-hand side of each of the above equations (the ln term) allows even faster computation.

When the similarity is calculated by (Equation 6) or (Equation 16), the procedure is basically the same: the covariance matrix Σ_ε(θ) of the error of the difference vector, the intraclass covariance matrix Σ_W(θ) of the difference vector, and the interclass covariance matrix Σ_B(θ) required for each calculation are computed from face image data samples prepared in advance and held as covariance tables, which are then referenced when the similarity is calculated. Because this method computes the distance using covariance matrices, the amount of computation increases, but when sufficient training samples are available the accuracy of the similarity calculation can be improved.

The adaptive mixture Mahalanobis distance of (Equation 10) or the adaptive mixture discriminant distance of (Equation 20) may also be calculated by assuming a mixture of normal distributions for the posterior distribution of (Equation 3), or for the intraclass and interclass distributions of (Equation 14), and estimating the distribution functions. In this case as well, just as the posterior distribution information is computed using variances or covariance matrices, the parameters that specify the mixture distribution, such as the covariance matrices Σ_s(θ₁, j) and P(j) representing the mixture of normal distributions, are obtained from the face image data samples and stored as tables. The estimation may be performed with common estimation methods such as maximum likelihood estimation or the EM algorithm.

The description so far has covered the case in which a single face image is registered and a single face image is used for retrieval. When a plurality of images are registered for one person's face and a single face image is used for retrieval, the procedure may be, for example, as follows.

Let the feature vector on the retrieval side be v_que and the feature vectors on the registration side be v_reg,k. The similarity for the case of multiple registered images, d_multi(v_que, [v_reg,1, v_reg,2, ..., v_reg,n]), may then be calculated based on the formulas shown in (Equation 27) and (Equation 28).

Alternatively,

Similarly, when a plurality of images are registered per face and a plurality of images are used for retrieval, the similarity to one set of face data can be calculated by taking the mean or the minimum of the similarities of all combinations. This means that, by regarding a moving image as a plurality of images, the matching system of the present invention can also be applied to face recognition in moving images.
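Combining the per-image similarities can be sketched as follows. The mean and the minimum are the two reductions named in the text; the function names are assumptions, and a squared Euclidean distance stands in for the adaptive distances of this embodiment.

```python
def squared_distance(a, b):
    """Stand-in distance; the embodiment would use an adaptive distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def multi_similarity(distance, queries, registered, reduce="mean"):
    """Combine the distances of every (query, registered) combination."""
    ds = [distance(q, r) for q in queries for r in registered]
    if reduce == "mean":
        return sum(ds) / len(ds)
    if reduce == "min":
        return min(ds)
    raise ValueError("reduce must be 'mean' or 'min'")


queries = [[0.0, 0.0]]                # a single retrieval image
registered = [[1.0, 0.0], [0.0, 2.0]]  # two images of the same person

d_mean = multi_similarity(squared_distance, queries, registered, "mean")
d_min = multi_similarity(squared_distance, queries, registered, "min")
print(d_mean, d_min)
```

Passing several feature vectors in `queries` covers the case of retrieval with multiple images, for example frames of a moving image.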

The above description has focused on face identification and face verification of the same person. However, the invention can also be applied, for example, to gender recognition, in which male faces are treated as one category and female faces as another and information on the respective distributions is obtained to distinguish male from female faces, and to facial expression recognition, in which expression categories such as smiling faces, angry faces, and sad faces are defined to recognize facial expressions. Age estimation is also possible by setting categories such as teens, twenties, thirties, and forties and obtaining the error distribution, the intraclass distribution, and the interclass distribution for each category. The present invention can thus be applied to various kinds of face recognition.

The figure is a block diagram showing the configuration of a face image matching system according to one embodiment of the present invention.

Explanation of symbols

11: Face image input unit
12: Face metadata generation unit
13: Face metadata storage unit
14: Face similarity calculation unit
15: Face image database
16: Control unit
17: Display unit
121: Face feature extraction unit
122: Reliability index extraction unit
141: Distribution estimation unit
142: Distance calculation unit

Claims

A face similarity calculation device for calculating the similarity of face images based on face feature quantities extracted from the face images and reliability indexes representing the reliability of face recognition results obtained using the face feature quantities, the device comprising:
distribution estimation means that stores in advance, for each reliability index value, a statistic of the distribution of a comparison quantity between face feature quantities, and outputs the statistic corresponding to the reliability index value as a parameter of the posterior distribution of the comparison quantity; and
distance calculation means that outputs, as the similarity, a distance between face feature quantities calculated based on the posterior distribution of the comparison quantity, using the comparison quantity and the parameter.

The face similarity calculation device according to claim 1, wherein the distribution estimation means stores in advance, for each reliability index value, a statistic of the distribution of the difference vector between face feature quantities, and outputs the statistic corresponding to the reliability index value as a parameter of the posterior distribution of the difference vector, and
the distance calculation means outputs, as the similarity, the distance between face feature quantities calculated, using the difference vector and the parameter, with a distance function derived from the log likelihood of the posterior distribution of the difference vector.

The face similarity calculation device according to claim 2, wherein the distance calculation means, assuming that the posterior distribution of the difference vector between face feature quantities is a normal distribution, calculates as the similarity an adaptive Mahalanobis distance derived from the log likelihood of the normal distribution for the reliability index.

The face similarity calculation device according to claim 3, wherein the distribution estimation means estimates the variance σ_{s,k}([θ_i])² of each element k of the difference vector s with respect to the reliability index [θ_i], and
the distance calculation means calculates the adaptive Mahalanobis distance using the variance σ_{s,k}([θ_i])² of each element k.

The face similarity calculation device according to claim 4, wherein the distribution estimation means includes a variance value table that stores in advance the variance σ_{s,k}([θ_i])² of each element k of the difference vector s with respect to the reliability index [θ_i], and outputs the variance value required for calculating the adaptive Mahalanobis distance by referring to the variance value table according to the reliability index [θ_i].

The face similarity calculation device according to claim 2, wherein the distance calculation means, assuming that the posterior distribution of the difference vector between face feature quantities is a mixture distribution, calculates as the similarity an adaptive mixture Mahalanobis distance derived from the log likelihood of the mixture distribution for the reliability index.

A face similarity calculation method for calculating the similarity of face images based on face feature quantities extracted from the face images and reliability indexes representing the reliability of face recognition results obtained using the face feature quantities, the method comprising:
a distribution estimation step of storing in advance, for each reliability index value, a statistic of the distribution of a comparison quantity between face feature quantities, and outputting the statistic corresponding to the reliability index value as a parameter of the posterior distribution of the comparison quantity; and
a distance calculation step of outputting, as the similarity, a distance between face feature quantities calculated based on the posterior distribution of the comparison quantity, using the comparison quantity and the parameter.

The face similarity calculation method according to claim 7, wherein the distribution estimation step stores in advance, for each reliability index value, a statistic of the distribution of the difference vector between face feature quantities, and outputs the statistic corresponding to the reliability index value as a parameter of the posterior distribution of the difference vector, and
the distance calculation step outputs, as the similarity, the distance between face feature quantities calculated, using the difference vector and the parameter, with a distance function derived from the log likelihood of the posterior distribution of the difference vector.

The face similarity calculation method described, wherein, assuming that the posterior distribution of the difference vector between face feature quantities is a normal distribution, an adaptive Mahalanobis distance derived from the log likelihood of the normal distribution for the reliability index is calculated as the similarity.

The face similarity calculation method according to claim 9, wherein the variance σ_{s,k}([θ_i])² of each element k of the difference vector s with respect to the reliability index [θ_i] is estimated, and
the adaptive Mahalanobis distance is calculated using the variance σ_{s,k}([θ_i])² of each element k.

The face similarity calculation method according to claim 10, wherein the variance σ_{s,k}([θ_i])² of each element k of the difference vector s with respect to the reliability index [θ_i] is stored in advance in a variance value table, and the variance value required for calculating the adaptive Mahalanobis distance is generated by referring to the variance value table according to the reliability index [θ_i].

The adaptive mixed Mahalanobis distance derived from the logarithmic likelihood of the mixed distribution in the reliability index is calculated as the similarity, assuming that the posterior distribution of the difference vector between the facial feature quantities is a mixed distribution. The face similarity calculation method described in 1.

The face similarity calculation device according to claim 1, wherein the distribution estimation means stores in advance, for each reliability index value, a first statistic of the distribution of difference vectors between face feature quantities within a class that should be regarded as matching (intraclass distribution) and a second statistic of the distribution of difference vectors between face feature quantities across classes that should be regarded as non-matching (interclass distribution), and outputs the first statistic and the second statistic corresponding to the reliability index value as parameters of the intraclass distribution and the interclass distribution, respectively, and
the distance calculation means outputs, as the similarity, the distance between face feature quantities calculated, using the difference vector and the parameters, with a distance function derived from the log likelihood of the ratio between the intraclass distribution and the interclass distribution.

The face similarity calculation device according to claim 13, wherein, assuming that the intraclass distribution and the interclass distribution are each a normal distribution, an adaptive discriminant distance derived from the log likelihood of the ratio of the two distributions for the reliability index is calculated as the similarity.

The face similarity calculation device according to claim 14, wherein the adaptive discriminant distance is calculated as the similarity by estimating the intraclass variance σ_{W,k}([θ_i])² and the interclass variance σ_{B,k}([θ_i])² of each element k of the difference vector s with respect to the reliability index [θ_i].

The face similarity calculation device according to claim 15, comprising, for estimating the intraclass variance σ_{W,k}([θ_i])² and the interclass variance σ_{B,k}([θ_i])² of each element k of the difference vector s with respect to the reliability index [θ_i], a first variance value table that stores the intraclass variance σ_{W,k}([θ_i])² in advance and a second variance value table that stores the interclass variance σ_{B,k}([θ_i])² in advance,
wherein the variance values required for the adaptive discriminant distance are estimated by referring to the first and second variance value tables according to the reliability index [θ_i].

The face similarity calculation device according to claim 13, wherein, assuming that the intraclass distribution and the interclass distribution are each a mixture distribution, an adaptive mixture discriminant distance derived from the log likelihood of the ratio of the two mixture distributions for the reliability index is calculated as the similarity.

The face similarity calculation method according to claim 7, wherein the distribution estimation step stores in advance, for each reliability index value, a first statistic of the distribution of difference vectors between face feature quantities within a class that should be regarded as matching (intraclass distribution) and a second statistic of the distribution of difference vectors between face feature quantities across classes that should be regarded as non-matching (interclass distribution), and outputs the first statistic and the second statistic corresponding to the reliability index value as parameters of the intraclass distribution and the interclass distribution, respectively, and
the distance calculation step outputs, as the similarity, the distance between face feature quantities calculated, using the difference vector and the parameters, with a distance function derived from the log likelihood of the ratio between the intraclass distribution and the interclass distribution.

The face similarity calculation method according to claim 18, wherein, assuming that the intraclass distribution and the interclass distribution are each a normal distribution, an adaptive discriminant distance derived from the log likelihood of the ratio of the two distributions for the reliability index is calculated as the similarity.

The face similarity calculation method according to claim 19, wherein the adaptive discriminant distance is calculated as the similarity by estimating the intraclass variance σ_{W,k}([θ_i])² and the interclass variance σ_{B,k}([θ_i])² of each element k of the difference vector s with respect to the reliability index [θ_i].

The face similarity calculation method according to claim 20, using, for estimating the intraclass variance σ_{W,k}([θ_i])² and the interclass variance σ_{B,k}([θ_i])² of each element k of the difference vector s with respect to the reliability index [θ_i], a first variance value table that stores the intraclass variance σ_{W,k}([θ_i])² in advance and a second variance value table that stores the interclass variance σ_{B,k}([θ_i])² in advance,
wherein the variance values required for the adaptive discriminant distance are estimated by referring to the first and second variance value tables according to the reliability index [θ_i].

The face similarity calculation method according to claim 18, wherein, assuming that the intraclass distribution and the interclass distribution are each a mixture distribution, an adaptive mixture discriminant distance derived from the log likelihood of the ratio of the two mixture distributions for the reliability index is calculated as the similarity.