JP2009140513A

JP2009140513A - Pattern characteristic extraction method and device for the same

Info

Publication number: JP2009140513A
Application number: JP2009013764A
Authority: JP
Inventors: Toshio Kamei (亀井俊男)
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2002-07-16
Filing date: 2009-01-26
Publication date: 2009-06-25
Anticipated expiration: 2023-03-13
Also published as: JP4770932B2

Abstract

PROBLEM TO BE SOLVED: To provide a feature vector transformation technique that, when extracting a feature vector effective for discrimination from an input pattern feature vector and compressing the feature dimensions, suppresses the loss of features effective for discrimination and performs more efficient feature extraction.

SOLUTION: An input pattern feature is decomposed into element vectors. For each element vector, a discriminant matrix obtained by discriminant analysis is prepared in advance. Each element vector is projected into the discriminant space defined by its discriminant matrix, which compresses its dimensions. The resulting feature vectors are then combined and projected again by a further discriminant matrix to calculate the final feature vector.

COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to an image feature extraction method, an image feature extraction apparatus, and a program therefor in the field of pattern recognition, and more particularly to a feature vector transformation technique for extracting a feature vector effective for recognition from an input feature vector and compressing the feature dimension.

Conventionally, in the field of pattern recognition, a feature vector is extracted from an input pattern, a feature vector effective for identification is extracted from that feature vector, and the feature vectors obtained from the respective patterns are compared to determine the similarity of patterns such as characters or human faces.

For example, in face recognition, the pixel values of a face image normalized by eye position or the like are raster-scanned to convert them into a one-dimensional feature vector. Using this as the input feature vector, the dimensionality is reduced by principal component analysis (Non-Patent Document 1: Moghaddam et al., "Probabilistic Visual Learning for Object Detection", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 17, No. 7, pp. 696-710, 1997) or by linear discriminant analysis applied to the principal components of the feature vector (Non-Patent Document 2: W. Zhao et al., "Discriminant Analysis of Principal Components for Face Recognition," Proceedings of the IEEE Third International Conference on Automatic Face and Gesture Recognition, pp. 336-341, 1998), and the resulting feature vector is used to identify individuals by face.

In these methods, the covariance matrix, or the intra-class and inter-class covariance matrices, are computed from learning samples prepared in advance; basis vectors are obtained as solutions of the eigenvalue problems for those covariance matrices; and the input feature vector is transformed using these basis vectors.

Linear discriminant analysis is now described in more detail.
Given an N-dimensional feature vector x, linear discriminant analysis finds the transformation matrix W that maximizes the ratio of the inter-class covariance matrix S_B to the intra-class covariance matrix S_W of the M-dimensional vector y (= W^T x) obtained by transforming x with W. As the evaluation function for this variance ratio, (Equation 1) is defined using determinants.
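The image for (Equation 1) is not reproduced in this text. A standard determinant-based Fisher criterion that matches the description above, given here as an assumed reconstruction rather than the published formula, is:

```latex
J(W) = \frac{\left| S_B \right|}{\left| S_W \right|}
     = \frac{\left| W^{T} \Sigma_B W \right|}{\left| W^{T} \Sigma_W W \right|}
```

Here S_B and S_W are the covariance matrices of the transformed vector y, and Σ_B and Σ_W are the corresponding matrices of the input vector x.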

Here, the intra-class covariance matrix Σ_W and the inter-class covariance matrix Σ_B are, respectively, the covariance matrix Σ_i inside each of the C classes ω_i (i = 1, 2, ..., C; each containing n_i data) in the set of feature vectors x of the learning samples, and the covariance matrix between the classes. They are given by (Equation 2) and (Equation 3), respectively.
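The images for (Equation 2) and (Equation 3) are not reproduced in this text. The usual prior-weighted definitions, which agree with the description above and are offered as an assumed reconstruction, are:

```latex
\begin{aligned}
\Sigma_W &= \sum_{i=1}^{C} P(\omega_i)\,\Sigma_i
          = \sum_{i=1}^{C} P(\omega_i)\,\frac{1}{n_i} \sum_{x \in \omega_i} (x - m_i)(x - m_i)^{T} \\
\Sigma_B &= \sum_{i=1}^{C} P(\omega_i)\,(m_i - m)(m_i - m)^{T}
\end{aligned}
```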

Here, m_i is the mean vector of class ω_i (Equation 4), and m is the mean vector of x over all patterns (Equation 5).
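The images for (Equation 4) and (Equation 5) are likewise missing. Assuming the usual definitions of a class mean and a prior-weighted overall mean, they would read:

```latex
\begin{aligned}
m_i &= \frac{1}{n_i} \sum_{x \in \omega_i} x \\
m   &= \sum_{i=1}^{C} P(\omega_i)\, m_i
\end{aligned}
```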

If the prior probability P(ω_i) of each class ω_i is to reflect the number of samples n_i, P(ω_i) = n_i / n may be assumed. If equal probabilities can be assumed instead, P(ω_i) = 1 / C may be used.
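The covariance matrices and the two choices of prior can be sketched in NumPy as follows. This is a minimal illustration, not the patent's implementation: the function names `class_priors` and `scatter_matrices`, the one-sample-per-row layout of `X`, and the prior-weighted forms used for Σ_W, Σ_B, m_i and m are assumptions made here.

```python
import numpy as np

def class_priors(labels, equal=False):
    """Return (classes, priors): n_i / n by default, or 1 / C if equal=True."""
    classes, counts = np.unique(labels, return_counts=True)
    if equal:
        priors = np.full(len(classes), 1.0 / len(classes))
    else:
        priors = counts / counts.sum()
    return classes, priors

def scatter_matrices(X, labels, equal=False):
    """Intra-class (Sigma_W) and inter-class (Sigma_B) covariance matrices.

    X holds one sample per row; labels holds the class of each row.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    classes, priors = class_priors(labels, equal)
    means = np.array([X[labels == c].mean(axis=0) for c in classes])
    m = priors @ means                      # overall mean, weighted by the priors
    dim = X.shape[1]
    sigma_w = np.zeros((dim, dim))
    sigma_b = np.zeros((dim, dim))
    for c, p, m_i in zip(classes, priors, means):
        d = X[labels == c] - m_i
        sigma_w += p * (d.T @ d) / len(d)   # P(w_i) * Sigma_i
        diff = (m_i - m)[:, None]
        sigma_b += p * (diff @ diff.T)      # P(w_i) * (m_i - m)(m_i - m)^T
    return sigma_w, sigma_b
```

With the priors n_i / n, the two matrices add up to the total covariance matrix of the samples, which is a convenient sanity check.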

The transformation matrix W that maximizes (Equation 1) is obtained as the set of generalized eigenvectors corresponding to the M largest eigenvalues of (Equation 6), an eigenvalue problem for the column vectors w_i. The transformation matrix W obtained in this way is called the discriminant matrix.
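The image for (Equation 6) is missing; the sketch below assumes it is the usual generalized eigenvalue problem Σ_B w_i = λ_i Σ_W w_i. The function name `discriminant_matrix` and the Cholesky-based reduction to a symmetric problem are choices made for this illustration, and the code assumes Σ_W is positive definite.

```python
import numpy as np

def discriminant_matrix(sigma_w, sigma_b, M):
    """Columns are generalized eigenvectors of Sigma_B w = lambda Sigma_W w
    for the M largest eigenvalues. Assumes Sigma_W is positive definite."""
    L = np.linalg.cholesky(sigma_w)          # Sigma_W = L L^T
    L_inv = np.linalg.inv(L)
    A = L_inv @ sigma_b @ L_inv.T            # equivalent symmetric problem A v = lambda v
    eigvals, V = np.linalg.eigh(A)           # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:M]    # indices of the M largest eigenvalues
    W = L_inv.T @ V[:, order]                # map back: w = L^{-T} v
    return W, eigvals[order]
```

The returned `W` has shape (N, M), so a feature vector is projected as `y = W.T @ x`.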

The conventional linear discriminant analysis method is described, for example, in Non-Patent Document 5: "Pattern Classification" (パターン識別; Richard O. Duda et al., Japanese translation supervised by Morio Onoe, New Technology Communications, 2001, pp. 113-122).

When the input feature vector x has a particularly large number of dimensions and only a small amount of learning data is used, Σ_W is no longer nonsingular, and the eigenvalue problem of (Equation 6) cannot be solved by ordinary methods.
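The singularity can be seen directly from the rank of Σ_W. The following sketch uses hypothetical sizes (50 dimensions, 3 classes of 5 random samples each) and priors n_i / n; the numbers are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
N, C, per_class = 50, 3, 5            # 50 dimensions, only 15 samples in total
X = rng.normal(size=(C * per_class, N))
labels = np.repeat(np.arange(C), per_class)

sigma_w = np.zeros((N, N))
for c in range(C):
    d = X[labels == c] - X[labels == c].mean(axis=0)
    sigma_w += (d.T @ d) / len(X)     # (n_i / n) * Sigma_i

rank = np.linalg.matrix_rank(sigma_w)
print(rank, "<", N)                   # rank is at most n - C, so Sigma_W is singular
```

Because each class contributes at most n_i - 1 independent directions, the rank cannot exceed n - C, which is far below N here, so Σ_W has no inverse.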

Furthermore, as described in Patent Document 1 (Japanese Patent Laid-Open No. 7-296169), high-order components of the covariance matrix with small eigenvalues are known to have large parameter estimation errors, which adversely affects recognition accuracy.

For this reason, the above paper by W. Zhao et al. performs principal component analysis on the input feature vector and applies discriminant analysis to the principal components with large eigenvalues. That is, as shown in FIG. 2, the principal components are first extracted by projecting the input feature vector with the basis matrix obtained by principal component analysis, and the principal components are then projected using the discriminant matrix obtained by discriminant analysis as the basis matrix, thereby extracting a feature vector effective for identification.
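A rough sketch of this two-step conventional pipeline (principal component analysis followed by discriminant analysis in the reduced space) is shown below. It is not taken from the cited paper: the function name `pca_then_lda`, the priors n_i / n, and the use of `np.linalg.eig` on Σ_W^{-1} Σ_B are assumptions for illustration.

```python
import numpy as np

def pca_then_lda(X, labels, K, M):
    """Return (mean, Psi, W): PCA basis Psi (N x K) and discriminant matrix W (K x M)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    mean = X.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(X, rowvar=False, bias=True))
    Psi = vecs[:, np.argsort(vals)[::-1][:K]]   # K principal axes with the largest eigenvalues
    P = (X - mean) @ Psi                        # principal components of every sample

    sw = np.zeros((K, K))
    sb = np.zeros((K, K))
    for c in np.unique(labels):
        Pc = P[labels == c]
        prior = len(Pc) / len(P)
        d = Pc - Pc.mean(axis=0)
        sw += prior * (d.T @ d) / len(Pc)
        mc = Pc.mean(axis=0)[:, None]           # overall mean of P is zero after centering
        sb += prior * (mc @ mc.T)
    evals, evecs = np.linalg.eig(np.linalg.solve(sw, sb))
    W = evecs[:, np.argsort(evals.real)[::-1][:M]].real
    return mean, Psi, W
```

A sample x is then mapped to `W.T @ (Psi.T @ (x - mean))`. K must be small enough that the intra-class covariance matrix in the reduced space stays invertible.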

Likewise, in the feature transformation matrix computation method described in Patent Document 1 (Japanese Patent Laid-Open No. 7-296169), the number of dimensions is reduced by, for example, deleting the high-order eigenvalues of the total covariance matrix Σ_T and the corresponding eigenvectors, and discriminant analysis is applied in the reduced feature space. Deleting the high-order eigenvalues and corresponding eigenvectors of the total covariance matrix amounts to performing discriminant analysis, via principal component analysis, in the space of only the principal components with large eigenvalues. In this sense it removes high-order features and stabilizes parameter estimation in the same way as the method of W. Zhao.

Patent Document 1: Japanese Patent Laid-Open No. 7-296169

Non-Patent Document 1: Moghaddam et al., "Probabilistic Visual Learning for Object Detection", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 17, No. 7, pp. 696-710, 1997

Non-Patent Document 2: W. Zhao et al., "Discriminant Analysis of Principal Components for Face Recognition," Proceedings of the IEEE Third International Conference on Automatic Face and Gesture Recognition, pp. 336-341, 1998

Non-Patent Document 3: "Kernel-based Optimized Feature Vectors Selection and Discriminant Analysis for Face Recognition," Proceeding of IAPR International Conference on Pattern Recognition (ICPR), Vol. II, pp. 362-365, 2002

Non-Patent Document 4: "Generalized Discriminant Analysis Using a Kernel Approach," Neural Computation, Vol. 12, pp. 2385-2404, 2000

Non-Patent Document 5: "Pattern Classification" (パターン識別; Richard O. Duda et al., Japanese translation supervised by Morio Onoe, New Technology Communications, 2001, pp. 113-122)

However, principal component analysis using the total covariance matrix Σ_T merely selects mutually orthogonal axes in order of decreasing variance in the feature space; the feature axes are chosen without regard to pattern identification performance. As a result, feature axes effective for pattern identification are lost.

For example, suppose the feature vector x consists of three elements (x = (x_1, x_2, x_3)^T), where x_1 and x_2 have large variance but are irrelevant to pattern identification, while x_3 is effective for pattern identification but has small variance (its ratio of inter-class variance to intra-class variance, that is, its Fisher ratio, is large, but the variances themselves are sufficiently small compared with those of x_1 and x_2). If principal component analysis is performed and only two dimensions are kept, the feature space associated with x_1 and x_2 is selected, and the contribution of x_3, which is effective for identification, is ignored.
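This situation is easy to reproduce with synthetic data. In the sketch below, the variances (10, 8 and 0.1 as standard deviations), the class separation of 1.0 along x_3, and the helper name `fisher_ratio` are all hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
labels = np.repeat([0, 1], n)
x1 = rng.normal(0.0, 10.0, size=2 * n)          # large variance, no class information
x2 = rng.normal(0.0, 8.0, size=2 * n)           # large variance, no class information
x3 = rng.normal(0.0, 0.1, size=2 * n) + labels  # small variance, classes 1.0 apart
X = np.column_stack([x1, x2, x3])

# PCA: eigenvectors of the total covariance matrix, sorted by decreasing variance.
vals, vecs = np.linalg.eigh(np.cov(X, rowvar=False))
top2 = vecs[:, np.argsort(vals)[::-1][:2]]

def fisher_ratio(f, labels):
    """Inter-class variance divided by intra-class variance of a 1-D feature."""
    m0, m1 = f[labels == 0].mean(), f[labels == 1].mean()
    within = 0.5 * (f[labels == 0].var() + f[labels == 1].var())
    between = 0.25 * (m0 - m1) ** 2
    return between / within

x3_weight = np.abs(top2[2]).max()               # how much x3 enters the two kept axes
ratios = [fisher_ratio(X[:, k], labels) for k in range(3)]
```

The two retained principal axes carry almost no weight on x_3, even though x_3 has by far the largest Fisher ratio of the three elements.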

This phenomenon can be illustrated with FIG. 3. FIG. 3(a) shows the data distribution viewed from a direction roughly perpendicular to the plane spanned by x_1 and x_2, with black and white circles representing data points of different classes. Viewed in the space spanned by x_1 and x_2 (a plane in this figure), the black and white circles cannot be distinguished, but viewed along the x_3 feature axis orthogonal to this plane, as in FIG. 3(b), they can be separated. If the axes with large variance are selected, however, the plane spanned by x_1 and x_2 is chosen as the feature space, which is equivalent to attempting discrimination from FIG. 3(a), and discrimination becomes difficult.

This phenomenon is unavoidable in the conventional techniques of principal component analysis and of deleting the subspace with small eigenvalues of the (total) covariance matrix.

In view of the above problems of the prior art, an object of the present invention is to provide a feature vector transformation technique that, when extracting a feature vector effective for discrimination from an input pattern feature vector and compressing the feature dimension, suppresses the loss of features effective for discrimination and performs more efficient feature extraction.

According to the present invention, in a pattern feature extraction method that compresses the feature dimension of a pattern feature using a linear transformation, the pattern feature is represented by a plurality of feature vectors x_i; for each feature vector x_i, a discriminant matrix W_i is obtained in advance by linear discriminant analysis; for the feature vector y formed by combining the vectors y_i obtained by linearly transforming the vectors x_i with those discriminant matrices, a discriminant matrix W_T is obtained in advance by linear discriminant analysis; and the feature dimension is compressed by transforming the pattern feature vector with the linear transformation specified by the discriminant matrices W_i and W_T.

In this pattern feature extraction method, the pattern feature is divided into a plurality of feature vectors x_i; for each feature vector x_i, the linear transformation y_i = W_i^T x_i is performed with the discriminant matrix W_i to compute the feature vector y_i; and for the vector y formed by combining the computed feature vectors y_i, the linear transformation z = W_T^T y is computed with the discriminant matrix W_T to obtain the feature vector z, thereby compressing the number of dimensions of the pattern feature.
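The two-stage projection can be sketched as below. The function name `two_stage_features` and the sizes are hypothetical, and random matrices stand in for discriminant matrices that would in practice be learned by linear discriminant analysis; each W_i is stored with shape (N_i, M_i) so that y_i = W_i^T x_i.

```python
import numpy as np

def two_stage_features(xs, Ws, W_T):
    """z = W_T^T y, where y stacks y_i = W_i^T x_i for each partial vector x_i."""
    ys = [W_i.T @ x_i for W_i, x_i in zip(Ws, xs)]
    y = np.concatenate(ys)
    return W_T.T @ y

# Hypothetical sizes: N1 = 6 and N2 = 4 reduced to M1 = 3 and M2 = 2, then to 2 dimensions.
rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(6, 3)), rng.normal(size=(4, 2))
W_T = rng.normal(size=(5, 2))
x1, x2 = rng.normal(size=6), rng.normal(size=4)
z = two_stage_features([x1, x2], [W1, W2], W_T)
```

Each first-stage discriminant analysis only has to handle N_i dimensions, which is the practical benefit when learning samples are scarce.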

Alternatively, in this pattern feature extraction method, the matrix W specified by the discriminant matrices W_i and W_T may be computed in advance, and the number of dimensions of the pattern feature may be compressed by computing the linear transformation z = W^T x of the feature vector x, formed by combining the input feature vectors x_i, with the matrix W to obtain the feature vector z.
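One way to build such a single matrix, assuming each W_i has shape (N_i, M_i) and W_T has shape (ΣM_i, L), is to place the W_i on a block diagonal and multiply by W_T. The function name `combined_matrix` is hypothetical; the text does not spell out the construction, so this is a sketch of one consistent reading.

```python
import numpy as np

def combined_matrix(Ws, W_T):
    """Fold the per-vector matrices W_i and the second-stage W_T into one W."""
    rows = sum(W.shape[0] for W in Ws)
    cols = sum(W.shape[1] for W in Ws)
    block = np.zeros((rows, cols))
    r = c = 0
    for W in Ws:
        block[r:r + W.shape[0], c:c + W.shape[1]] = W   # block-diagonal placement
        r += W.shape[0]
        c += W.shape[1]
    return block @ W_T                                  # shape (sum N_i, L)
```

With `W = combined_matrix([W1, W2], W_T)` and `x = np.concatenate([x1, x2])`, the product `W.T @ x` equals the result of applying the two stages one after the other, because the block-diagonal transpose maps x to the stacked y_i.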

When the invention is applied to an image, in an image feature extraction method that extracts features from the image and compresses the feature dimension of the obtained features using a linear transformation, for a plurality of predetermined sample point sets S_i in the image, the pixel values obtained from the sample points of each set are extracted as a feature vector x_i; for each feature vector x_i, a discriminant matrix W_i is obtained in advance by linear discriminant analysis; for the feature vector y formed by combining the vectors y_i obtained by linearly transforming the vectors x_i with those discriminant matrices, a discriminant matrix W_T is obtained in advance by linear discriminant analysis; and features are extracted from the image by transforming the feature vector of each image sample set with the linear transformation specified by the discriminant matrices W_i and W_T.

As one such method, for the feature vectors x_i each composed of a plurality of sample points, the linear transformation y_i = W_i^T x_i is performed with the discriminant matrix W_i to compute the feature vector y_i, and for the vector y formed by combining the computed feature vectors y_i, the linear transformation z = W_T^T y is computed with the discriminant matrix W_T to obtain the feature vector z, whereby features are extracted from the image.

Alternatively, in the above image feature extraction method, the matrix W specified by the discriminant matrices W_i and W_T may be computed in advance, and features may be extracted from the image by computing the linear transformation z = W^T x of the vector x, formed by combining the feature vectors x_i, with the matrix W to obtain the feature vector z.

Alternatively, the image may be divided into a plurality of predetermined local regions; features are extracted for each local region and expressed as feature vectors x_i; for each feature vector x_i, a discriminant matrix W_i is obtained in advance by linear discriminant analysis; for the feature vector y formed by combining the vectors y_i obtained by linearly transforming the vectors x_i with those discriminant matrices, a discriminant matrix W_T is obtained in advance by linear discriminant analysis; and features are extracted from the image by transforming the feature vectors of the local regions with the linear transformation specified by the discriminant matrices W_i and W_T.

In the above image feature extraction method, for the feature vector x_i of each local region of the image, the linear transformation y_i = W_i^T x_i is performed with the discriminant matrix W_i to compute the feature vector y_i, and for the vector y formed by combining the computed feature vectors y_i, the linear transformation z = W_T^T y is computed with the discriminant matrix W_T to obtain the feature vector z, whereby features are extracted from the image.

Alternatively, in the above image feature extraction method, the matrix W specified by the discriminant matrices W_i and W_T may be computed in advance, and features may be extracted from the image by computing the linear transformation z = W^T x of the vector x, formed by combining the feature vectors x_i, with the matrix W to obtain the feature vector z.

As an effective implementation of the image feature extraction method of the present invention, in an image feature extraction method that extracts features from an image and compresses the feature dimension of the obtained features using a linear transformation, the image is subjected to a two-dimensional Fourier transform; the real and imaginary components of the two-dimensional Fourier transform are extracted as a feature vector x_1; the power spectrum of the two-dimensional Fourier transform is calculated and extracted as a feature vector x_2; for each feature vector x_i (i = 1, 2), a discriminant matrix W_i is obtained by linear discriminant analysis; for the feature vector y formed by combining the vectors y_i obtained by linearly transforming the vectors x_i with those discriminant matrices, a discriminant matrix W_T is obtained in advance by linear discriminant analysis; and the feature vectors are transformed by the linear transformation specified by the discriminant matrices W_i and W_T.
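The two Fourier-based input vectors could be formed as in the sketch below. The function name `fourier_feature_vectors` is hypothetical. Two points are assumptions, since this passage does not specify them: the power spectrum is taken as the squared magnitude |F|^2 (the amplitude |F| is another common choice), and all frequency components are kept, although for a real-valued image roughly half of them are redundant by conjugate symmetry.

```python
import numpy as np

def fourier_feature_vectors(image):
    """Return (x1, x2): real/imaginary Fourier components and the power spectrum."""
    F = np.fft.fft2(np.asarray(image, dtype=float))
    x1 = np.concatenate([F.real.ravel(), F.imag.ravel()])  # real parts, then imaginary parts
    x2 = (np.abs(F) ** 2).ravel()                          # assumed definition: |F|^2
    return x1, x2
```

The vector x_1 retains phase information, while x_2 is insensitive to circular shifts of the image, which is one reason for treating them as separate feature vectors.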

Further, in an image feature extraction method that extracts features from an image and compresses the feature dimension of the obtained features using a linear transformation, the image is subjected to a two-dimensional Fourier transform; the real and imaginary components of the two-dimensional Fourier transform are extracted as a feature vector x_1; the power spectrum of the two-dimensional Fourier transform is calculated and extracted as a feature vector x_2; for the principal components of each feature vector x_i (i = 1, 2), a discriminant matrix W_i is obtained by linear discriminant analysis; for the feature vector y formed by combining the vectors y_i obtained by linearly transforming the vectors x_i with those discriminant matrices, a discriminant matrix W_T is obtained in advance by linear discriminant analysis; and features are extracted from the image by transforming the feature vector x_1 for the real and imaginary Fourier components and the feature vector x_2 for the Fourier power spectrum so as to reduce their dimensions, using the linear transformation specified by the discriminant matrices W_i for the principal components of the feature vectors x_i and the discriminant matrix W_T.

In the above image feature extraction method, using the basis matrix Φ_1 (= (W_1^T Ψ_1^T)^T), expressed by the transformation matrix Ψ_1 that converts the feature vector x_1 of the real and imaginary Fourier components into principal components and the discriminant matrix W_1 for those principal components, the discriminant features of the principal components of x_1 are computed by the linear transformation y_1 = Φ_1^T x_1, and the magnitude of the resulting feature vector y_1 is normalized to a predetermined magnitude. Likewise, using the basis matrix Φ_2 (= (W_2^T Ψ_2^T)^T), expressed by the transformation matrix Ψ_2 that converts the feature vector x_2 of the Fourier power spectrum into principal components and the discriminant matrix W_2 for those principal components, the discriminant features of the principal components of x_2 are computed by the linear transformation y_2 = Φ_2^T x_2, and the magnitude of the resulting feature vector y_2 is normalized to a predetermined magnitude. For the feature vector y formed by combining the two feature vectors y_1 and y_2, the linear transformation z = W_T^T y is computed with the discriminant matrix W_T to obtain the feature vector z, whereby features are extracted from the image.
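The composition Φ_i = (W_i^T Ψ_i^T)^T = Ψ_i W_i and the normalization step can be sketched as follows. The function names are hypothetical, and interpreting "normalized to a predetermined magnitude" as rescaling the Euclidean norm of y_i is an assumption; the passage does not say which norm or target value is used.

```python
import numpy as np

def basis_matrix(Psi, W):
    """Phi = (W^T Psi^T)^T = Psi W: PCA projection followed by the discriminant matrix."""
    return Psi @ W

def normalized_projection(Phi, x, size=1.0):
    """y = Phi^T x, rescaled so that its Euclidean norm equals `size`."""
    y = Phi.T @ x
    norm = np.linalg.norm(y)
    return y if norm == 0 else y * (size / norm)

def extract(x1, x2, Phi1, Phi2, W_T):
    """z = W_T^T y, where y stacks the normalized y1 and y2."""
    y = np.concatenate([normalized_projection(Phi1, x1),
                        normalized_projection(Phi2, x2)])
    return W_T.T @ y
```

Normalizing y_1 and y_2 before they are combined keeps one of them from dominating the second-stage discriminant analysis simply because its numerical scale is larger.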

With pattern feature extraction according to the present invention, a feature vector effective for discrimination is extracted by discriminant analysis from each element vector of the input pattern feature vector, and feature extraction using a discriminant matrix obtained by discriminant analysis is applied again to the resulting feature vectors. When the feature dimension is compressed, this transformation of the feature vectors suppresses the loss of features effective for discrimination and enables more efficient feature extraction.

The invention is particularly effective when the pattern has many features but the number of learning samples available for discriminant analysis is limited. The number of feature dimensions can be reduced while limiting the loss of features effective for identification, without necessarily using principal component analysis.

FIG. 1 is a block diagram showing the configuration of a pattern feature extraction apparatus according to an embodiment of the present invention.
FIG. 2 is a diagram for explaining the prior art.
FIG. 3 is a diagram for explaining the distribution of pattern features.
FIG. 4 is a block diagram showing the configuration of a pattern feature extraction apparatus according to a second embodiment of the present invention.
FIGS. 5 and 6 are diagrams for explaining embodiments of the present invention.
FIG. 7 is a block diagram showing the configuration of a face image matching system according to a third embodiment of the present invention.
FIGS. 8 to 14 are diagrams for explaining embodiments of the present invention.
FIG. 15 is a diagram showing an example of a face description in a fifth embodiment of the present invention.
FIG. 16 is a diagram showing an example of rules when a binary representation syntax (Binary Representation Syntax) is used in the fifth embodiment of the present invention.
FIG. 17 is an explanatory diagram for extracting the Fourier feature (FourierFeature) in the fifth embodiment of the present invention.
FIG. 18 is a diagram showing an example of a scanning method for the Fourier spectrum in the fifth embodiment of the present invention.
FIG. 19 is a table showing an example of scanning rules for the Fourier spectrum in the fifth embodiment of the present invention.
FIG. 20 is a table showing an example of scanning regions in Fourier space for the CentralFourierFeature element in the fifth embodiment of the present invention.
FIG. 21 is a diagram showing an example of a block diagram in the fifth embodiment of the present invention.

(First embodiment)
An embodiment of the present invention will be described in detail with reference to the drawings. FIG. 1 is a block diagram showing a pattern feature extraction apparatus according to the present invention.

The pattern feature extraction apparatus is described in detail below.
As shown in FIG. 1, the pattern feature extraction apparatus according to the present invention comprises first linear transformation means 11 for linearly transforming an input feature vector x_1, second linear transformation means 12 for linearly transforming an input feature vector x_2, and third linear transformation means 13 for performing a linear transformation on the dimension-reduced feature vectors produced by the linear transformation means 11 and 12. Each linear transformation means performs a basis transformation by discriminant analysis using a discriminant matrix that was obtained in advance by learning and is stored in the corresponding discriminant matrix storage means 14, 15, or 16.

The input feature vectors x_1 and x_2 are features extracted according to the purpose, such as character recognition or face recognition. Examples include directional features computed from the gradient characteristics of an image and grayscale features, which are the pixel values of the image themselves; each has a plurality of elements. For example, N_1 directional features are input as one feature vector x_1, and N_2 grayscale values are input as the other feature vector x_2.

The discriminant matrix storage means 14 and 15 store the discriminant matrices W_1 and W_2 obtained by performing linear discriminant analysis on the feature vectors x_1 and x_2, respectively.

判別行列は、前述したように予め用意した学習サンプルにおける特徴ベクトルについて
、そのクラスに応じて、クラス内共分散行列ΣＷ（（数２））、クラス間共分散行列ΣＢ
（（数３））を計算すればよい。また、各クラスωｉの事前確率Ｐ（ωｉ）は、サンプル
数ｎｉを反映させて、Ｐ（ωｉ）＝ｎｉ／ｎとすればよい。 As described above, to obtain a discriminant matrix, the within-class covariance matrix ΣW (Equation 2) and the between-class covariance matrix ΣB (Equation 3) are calculated from the feature vectors of learning samples prepared in advance, according to their classes. The prior probability P(ωi) of each class ωi may be set to P(ωi) = ni/n, reflecting the number of samples ni.
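
Equations 2 and 3 are not reproduced in this part of the text. The following is a minimal sketch that assumes the usual prior-weighted definitions: ΣW is the sum of the per-class covariance matrices weighted by P(ωi), and ΣB is the prior-weighted scatter of the class means about the overall mean. The function name `class_scatter` and the list-of-lists data layout are illustrative choices, not part of the source.

```python
def class_scatter(samples, labels):
    """Return (priors, sigma_w, sigma_b) for equal-length feature vectors.

    Assumed definitions (the source equations are not shown here):
      sigma_w = sum_i P(w_i) * Cov_i
      sigma_b = sum_i P(w_i) * (m_i - m)(m_i - m)^T
    """
    n, d = len(samples), len(samples[0])
    mean = [sum(x[k] for x in samples) / n for k in range(d)]
    priors = {}
    sigma_w = [[0.0] * d for _ in range(d)]
    sigma_b = [[0.0] * d for _ in range(d)]
    for c in sorted(set(labels)):
        xs = [x for x, lab in zip(samples, labels) if lab == c]
        p = len(xs) / n                       # P(w_i) = n_i / n
        priors[c] = p
        m = [sum(x[k] for x in xs) / len(xs) for k in range(d)]
        for a in range(d):
            for b in range(d):
                # per-class covariance, divided by n_i
                cov = sum((x[a] - m[a]) * (x[b] - m[b]) for x in xs) / len(xs)
                sigma_w[a][b] += p * cov
                sigma_b[a][b] += p * (m[a] - mean[a]) * (m[b] - mean[b])
    return priors, sigma_w, sigma_b
```

With these definitions and P(ωi) = ni/n, ΣW + ΣB equals the covariance of the whole sample set, which is a convenient sanity check on an implementation.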

これらの共分散行列に対して、（数６）で表される固有値問題の大きい固有値に対応す
る固有ベクトルｗｉを選択することで、判別行列を予め求めておくことができる。 For these covariance matrices, the discriminant matrix can be obtained in advance by selecting the eigenvectors wi that correspond to the large eigenvalues of the eigenvalue problem expressed by (Equation 6).

それぞれの特徴ベクトルｘ１、ｘ２について、入力特徴次元Ｎ１やＮ２よりも小さいＭ
１次元、Ｍ２次元の基底を選ぶとすると、判別基底への射影変換によってそれぞれＭ１、
Ｍ２次元の特徴ベクトルｙ１、ｙ２を得ることができる。 If bases of M1 and M2 dimensions, smaller than the input feature dimensions N1 and N2, are selected for the feature vectors x1 and x2, the projection onto the discriminant bases yields an M1-dimensional feature vector y1 and an M2-dimensional feature vector y2, respectively.

ここで、Ｗ１、Ｗ２の行列の大きさは、それぞれＭ１×Ｎ１、Ｍ２×Ｎ２となる。
射影する特徴空間の次元数Ｍ１、Ｍ２を大幅に小さくすることによって、効率良く特徴次
元数を削減でき、データ量の削減や高速化に効果があるが、特徴次元数を大幅に小さくし
すぎる場合には、判別性能の劣化をもたらす。これは、特徴次元数を削減することによっ
て、判別に有効な特徴量が失われるためである。 Here, the sizes of the matrices W1 and W2 are M1 × N1 and M2 × N2, respectively.
Making the numbers of dimensions M1 and M2 of the projected feature space much smaller reduces the number of feature dimensions efficiently, which helps to reduce the amount of data and to speed up processing. However, if the number of feature dimensions is reduced too far, discrimination performance degrades, because feature quantities that are effective for discrimination are lost as the dimensions are reduced.

このため、特徴ベクトルの次元数Ｍ１やＭ２等は、学習サンプル数との兼ね合いに影響
されやすい量であり、実験に基づいて定めることが望ましい。 For this reason, the numbers of dimensions M1 and M2 of the feature vectors are quantities that are easily affected by the trade-off with the number of learning samples, and are preferably determined experimentally.

第３の線形変換手段１３では、第１および第２の線形変換手段によって計算されたｙ１
、ｙ２を入力特徴ベクトルｙとして、判別空間への射影を行う。判別行列記憶手段１６に
登録しておく判別行列Ｗ３は、第１、第２の判別行列を計算した場合と同様に学習サンプ
ルから求める。但し、入力特徴ベクトルｙは次の（数８）で表されるように、要素を並べ
たベクトルである。 The third linear conversion means 13 takes y1 and y2, calculated by the first and second linear conversion means, as an input feature vector y and projects it onto a discriminant space. The discriminant matrix W3 registered in the discriminant matrix storage means 16 is obtained from the learning samples in the same way as the first and second discriminant matrices. Note that the input feature vector y is the vector formed by concatenating the elements of y1 and y2, as expressed by (Equation 8).

（数７）と同様に基底行列Ｗ３（行列の大きさは、Ｌ×（Ｍ１＋Ｍ２））によって、Ｌ
次元の特徴ベクトルｙを（数９）により射影し、出力となる特徴ベクトルｚを得る。 As in (Equation 7), the feature vector y is projected according to (Equation 9) using the basis matrix W3 (of size L × (M1 + M2)) to obtain the L-dimensional output feature vector z.

このように特徴ベクトルをそれぞれ分割して、少ない次元数の特徴ベクトルの学習サン
プルに対して、線形判別分析を行うことによって、高い次元の特徴成分で生じやすい推定
誤りを抑制し、且つ、判別に有効な特徴を捉えることができる。 By dividing the feature vector in this way and performing linear discriminant analysis on learning samples of feature vectors with fewer dimensions, estimation errors that tend to arise in high-dimensional feature components are suppressed, and features effective for discrimination can still be captured.

前述の例では、３つの線形変換手段を備えて、並列的・段階的に処理を行う場合につい
て示したが、線形判別手段は、基本的に積和演算器を備えていれば実現できるので、線形
変換を行う入力特徴ベクトルに合わせて、読み出す判別行列を切替え線形変換手段を使い
回すように実現することも可能である。このように一つの線形変換手段を使うことで、必
要な演算器の規模を小さくすることができる。 The example above provides three linear conversion means and processes in parallel and in stages. However, since a linear conversion means can basically be realized with a product-sum operation unit, it is also possible to reuse a single linear conversion means by switching the discriminant matrix that is read out according to the input feature vector to be transformed. Using a single linear conversion means in this way reduces the scale of the required arithmetic unit.

さらに、出力特徴ベクトルｚの演算は、（数７）、（数８）、（数９）から分かるよう
に、（数１０）と書き表すことができる。 Further, the calculation of the output feature vector z can be expressed as (Equation 10) as can be seen from (Equation 7), (Equation 8), and (Equation 9).
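
Equations 7 to 10 are not reproduced in this text. The sketch below assumes the convention implied by the stated matrix sizes (W1 is M1 × N1, W2 is M2 × N2, W3 is L × (M1 + M2)), so that y1 = W1 x1, y2 = W2 x2 and z = W3 y, and that Equation 10 corresponds to multiplying W3 by the block-diagonal matrix built from W1 and W2. All function names are illustrative.

```python
def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def matmul(a, b):
    """Multiply matrix a by matrix b (both lists of rows)."""
    cols = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]

def block_diag(w1, w2):
    """Build [[W1, 0], [0, W2]] from two matrices."""
    n1, n2 = len(w1[0]), len(w2[0])
    top = [list(row) + [0.0] * n2 for row in w1]
    bottom = [[0.0] * n1 + list(row) for row in w2]
    return top + bottom

def staged(w1, w2, w3, x1, x2):
    """Two-stage projection: first W1 and W2, then W3 on the concatenation."""
    y = matvec(w1, x1) + matvec(w2, x2)   # y = [y1; y2]
    return matvec(w3, y)

def merged_matrix(w1, w2, w3):
    """Single L x (N1 + N2) matrix equivalent to the two stages."""
    return matmul(w3, block_diag(w1, w2))
```

Applying `merged_matrix(w1, w2, w3)` to the concatenation of x1 and x2 gives the same z as `staged`, which is the sense in which the three transformations collapse into one.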

つまり、各判別行列を用いた線形変換は、一つの行列による線形変換にまとめることが
できる。段階的な演算を行う場合の積和演算回数は、Ｌ×（Ｍ１＋Ｍ２）＋Ｍ１Ｎ１＋Ｍ
２Ｎ２であり、一つの行列にまとめた場合には、Ｌ×（Ｎ１＋Ｎ２）となり、例えば、Ｎ
１＝Ｎ２＝５００、Ｍ１＝Ｍ２＝２００、Ｌ＝１００とした場合には、段階的な演算で
２４０，０００回の積和演算が必要となり、後者の演算では１００，０００回の積和演算
が必要となり、後者のような一括演算を行う場合の方が演算量が少なく、高速な演算が可
能となる。式からも分かるように最終的な次元数Ｌを小さくする場合には、一括的な演算
方法を用いた方が演算量を削減することができ、有効である。 That is, the linear transformations using the individual discriminant matrices can be combined into a linear transformation by a single matrix. The staged computation requires L × (M1 + M2) + M1N1 + M2N2 product-sum operations, whereas the single combined matrix requires L × (N1 + N2). For example, with N1 = N2 = 500, M1 = M2 = 200, and L = 100, the staged computation requires 240,000 product-sum operations and the combined computation requires 100,000, so the combined computation needs fewer operations and runs faster. As the expressions show, when the final number of dimensions L is made small, the combined computation reduces the amount of computation and is effective.
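
The operation counts quoted above can be checked with a few lines of code. This is only a restatement of the two formulas in the text; the function names are illustrative.

```python
def staged_macs(n_dims, m_dims, l):
    """Product-sum count for the staged computation:
    sum of M_i * N_i for the first stage plus L * (sum of M_i)."""
    first_stage = sum(m * n for m, n in zip(m_dims, n_dims))
    return first_stage + l * sum(m_dims)

def merged_macs(n_dims, l):
    """Product-sum count for the single combined matrix: L * (sum of N_i)."""
    return l * sum(n_dims)

staged = staged_macs([500, 500], [200, 200], 100)   # 240000
merged = merged_macs([500, 500], 100)               # 100000
```

Which variant is cheaper depends on the dimensions: a small L favors the combined matrix, while strong first-stage compression (small Mi relative to Ni and L) favors the staged form.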

（第二の実施の形態）
さて、前述の例では、方向特徴と濃淡特徴というように特徴の種類が異なる場合の特徴
を融合する際に、それぞれの特徴毎に判別分析を施した特徴ベクトルに対して、繰り返し
判別分析を行っているが、一つの特徴に対する複数要素を複数の特徴ベクトルに分割して
、それぞれの要素集合を入力特徴として判別分析し、その射影されたベクトルをさらに判
別分析しても構わない。 (Second embodiment)
In the example above, features of different types, such as direction features and grayscale features, are fused by applying discriminant analysis again to the feature vectors obtained by discriminant analysis of each feature. Alternatively, the elements of a single feature may be divided into a plurality of feature vectors, discriminant analysis may be performed with each element set as an input feature, and the projected vectors may then be subjected to discriminant analysis again.

第二の実施例では、顔画像の特徴抽出装置について説明する。
第二の発明による顔画像特徴抽出装置では、図４に示すように入力顔画像の濃淡特徴を
分解する画像特徴分解手段４１と、特徴ベクトルに対応する判別行列に従って特徴ベクト
ルを射影する線形変換手段４２と、前記のそれぞれの判別行列を記憶する判別行列記憶手
段４３を備えている。 The second embodiment describes a face image feature extraction apparatus.
As shown in FIG. 4, the face image feature extraction apparatus according to the second invention includes an image feature decomposing means 41 that decomposes the grayscale features of an input face image, a linear conversion means 42 that projects each feature vector according to the discriminant matrix corresponding to that feature vector, and a discriminant matrix storage means 43 that stores the respective discriminant matrices.

顔画像の特徴抽出する技術については、前述のＷ．Ｚｈａｏらの論文に示されているよ
うに、顔画像を目位置などで位置合わせした後に、その濃淡値をベクトル特徴とする方法
がある。 As for techniques for extracting features from a face image, the above-mentioned paper by W. Zhao et al. shows a method in which the face image is aligned using the eye positions or the like and its grayscale values are then used as a vector feature.

第二の発明でも原特徴としてはどうように画像の画素の濃淡値を入力特徴として取り扱
うが、画像サイズが例えば左右の目の中心位置を（１４，２３）、（２９，２３）の座標
に正規化した４２×５４画素＝２３５２次元と大きな画像特徴となる。このような大きな
特徴次元では、限られた学習サンプルを用いて直接的に線形判別分析を行っても精度良い
特徴抽出を行うことは困難であり、画像特徴の要素を分解し、その分解された特徴に対し
て判別分析を行い、判別行列を求めることで、主成分分析等を適用した場合に生じる特徴
の劣化を抑制する。 In the second invention as well, the grayscale values of the image pixels are handled as the input features, that is, as the original features. However, with an image size of, for example, 42 × 54 pixels, normalized so that the centers of the left and right eyes are at the coordinates (14, 23) and (29, 23), this is a large image feature of 2352 dimensions. With such a large feature dimension, it is difficult to extract features accurately by applying linear discriminant analysis directly with a limited number of learning samples. Therefore, the elements of the image feature are decomposed, discriminant analysis is performed on the decomposed features, and discriminant matrices are obtained, which suppresses the degradation of features that occurs when principal component analysis or the like is applied.

画像特徴を分解するための方法の一つが画像を分割することであり、例えば、図５に示
すように画像を一つの大きさが１４×１８画素（＝２５２次元）の大きさに９分割し、そ
れぞれの大きさの局所画像を特徴ベクトルｘｉ（ｉ＝１，２，３，．．．，９）とし、そ
れぞれの部分画像に対して学習サンプルを用いて判別分析を行い、それぞれの特徴ベクト
ルに対応する判別行列Ｗｉを求めておく。 One method of decomposing the image feature is to divide the image. For example, as shown in FIG. 5, the image is divided into nine parts of 14 × 18 pixels each (= 252 dimensions), the local image of each part is used as a feature vector xi (i = 1, 2, 3, ..., 9), discriminant analysis is performed on each partial image using the learning samples, and a discriminant matrix Wi corresponding to each feature vector is obtained in advance.
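
The division into nine local images can be sketched as follows. The sketch assumes an image stored as a list of rows, 42 pixels wide and 54 pixels high, split on a 3 × 3 grid without overlap, with each block raster-scanned into a 252-element vector; FIG. 5 is not available here, so the block ordering is an assumption. Note that nine blocks of 252 pixels give 2268 values, which is 42 × 54; the figure of 2352 stated in the text would correspond to 42 × 56.

```python
def split_into_blocks(image, block_w=14, block_h=18):
    """Split image (list of rows) into non-overlapping blocks.

    Returns one raster-scanned vector per block, blocks ordered
    left to right, then top to bottom. The image height and width are
    assumed to be exact multiples of the block size.
    """
    height, width = len(image), len(image[0])
    blocks = []
    for top in range(0, height, block_h):
        for left in range(0, width, block_w):
            blocks.append([image[y][x]
                           for y in range(top, top + block_h)
                           for x in range(left, left + block_w)])
    return blocks
```

Overlapping regions, as mentioned in the text, would only change the step of the two outer loops relative to the block size.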

なお、画像を分割する際に領域間にオーバーラップを持たせておくことで、その境界領
域の画素間の相関に基づく特徴量を特徴ベクトルに反映させることができるので、オーバ
ーラップをさせてサンプルするようにしておいてもよい。 If the regions are given some overlap when the image is divided, feature quantities based on the correlation between pixels in the boundary areas can be reflected in the feature vectors, so the regions may be sampled with overlap.

特徴次元数が２５２次元と原画像より大幅に少なくなることで、人数で数百人程度の各
人の画像を数枚、計数千枚程度の顔画像をサンプルとすることで、判別分析による基底行
列を精度を保って計算することができる。これが原特徴のまま（２３５２次元）と大きい
場合には、判別分析による特徴で性能を得るためには、数千名以上の顔画像サンプルを必
要となることが予想されるが、実際問題としてこのような大規模な画像データを収集する
ことは困難であるために、実現できない。 Because the number of feature dimensions becomes 252, far fewer than in the original image, the basis matrix of the discriminant analysis can be calculated accurately from a few thousand face images in total, that is, several images each of a few hundred people. If the feature dimension remained as large as that of the original feature (2352 dimensions), face image samples of several thousand or more people would be expected to be necessary to obtain good performance with discriminant features, and because collecting image data on that scale is difficult in practice, this is not feasible.

第一段階の判別特徴によって、例えば、各局所領域毎に２０次元の特徴に圧縮すると
すると、それらの出力特徴ベクトルは、９領域×２０次元＝１８０次元の特徴ベクトルと
なる。この特徴ベクトルに対してさらに判別分析を行うことで、次元数を例えば５０次元
程度に効率的に圧縮できる。この第二段階目の判別行列も判別行列記憶手段４３に記憶し
、線形変換手段４２により、第一段階の判別特徴の１８０次元ベクトルを入力として、再
度判別分析を行う。なお、予め第一段目の判別行列と第二段目の判別行列を（数１０）で
示したように予め計算しておいてもよいが、２５２次元×９領域を２０次元×９領域に圧
縮し、その１８０次元を５０次元に変換する場合では、二段階に分けて計算した方が使用
メモリも、演算量も半分以下となるので、効率的である。 If the first-stage discriminant features compress each local region to, for example, 20 dimensions, the output feature vectors together form a feature vector of 9 regions × 20 dimensions = 180 dimensions. Performing discriminant analysis again on this feature vector compresses the number of dimensions efficiently, for example to about 50. The second-stage discriminant matrix is also stored in the discriminant matrix storage means 43, and the linear conversion means 42 performs discriminant analysis again with the 180-dimensional vector of first-stage discriminant features as input. The first-stage and second-stage discriminant matrices may be multiplied together in advance as shown in (Equation 10). However, when 252 dimensions × 9 regions are compressed to 20 dimensions × 9 regions and the resulting 180 dimensions are converted to 50 dimensions, computing in two stages is more efficient, because both the memory used and the amount of computation are less than half.
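
The "less than half" claim can be verified by counting matrix coefficients, since each stored coefficient also costs one product-sum operation per projected image. The function below is an illustrative sketch using the example figures from the text as defaults.

```python
def coefficient_counts(regions=9, region_dims=252, stage1_dims=20, out_dims=50):
    """Return (two_stage, merged) coefficient counts.

    two_stage: nine 20 x 252 matrices plus one 50 x 180 matrix.
    merged:    one 50 x (9 * 252) matrix.
    """
    stage1 = regions * stage1_dims * region_dims      # 9 * 20 * 252 = 45360
    stage2 = out_dims * regions * stage1_dims         # 50 * 180     = 9000
    merged = out_dims * regions * region_dims         # 50 * 2268    = 113400
    return stage1 + stage2, merged

two_stage, merged = coefficient_counts()   # (54360, 113400)
```

Here the two-stage form needs 54,360 coefficients against 113,400 for the merged matrix, a ratio of about 0.48. This is the opposite outcome to the first embodiment's example, because the first stage compresses far more strongly.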

このように局所的・段階的に判別分析を適用することで、識別能力の高い顔特徴を抽出
することができるようになる。これは、文字認識でいえば、例えば「大」と「犬」の識別
を行おうとしたときに、文字画像全体を主成分分析して固有値が大きい成分抜き出すと、
「大」と「犬」を識別する「｀」の特徴が失われてしまいやすい（このため、類似文字識
別では主成分分析による固有値が大きい部分の特徴よりも、ある特定の高次特徴を用いる
ことが行われる場合もある）。局所領域に分割して判別特徴を抜き出すことの有効性は、
文字認識における類似文字識別における現象と類似しており、識別しやすい特徴を空間
的に限定することで、全体的に主成分の判別分析を行う場合よりも、単位次元あたり
の精度を確保できるようになると考えられる。 Applying discriminant analysis locally and in stages in this way makes it possible to extract face features with high discrimination ability. To take an analogy from character recognition, when distinguishing 「大」 from 「犬」, if principal component analysis is applied to the whole character image and the components with large eigenvalues are extracted, the 「｀」 stroke that distinguishes 「大」 from 「犬」 is easily lost (for this reason, similar-character discrimination sometimes uses specific higher-order features rather than the features with large eigenvalues obtained by principal component analysis). The effectiveness of dividing the image into local regions and extracting discriminant features resembles this phenomenon in similar-character discrimination: by spatially restricting the features that are easy to discriminate, higher accuracy per unit dimension can presumably be secured than when discriminant analysis of principal components is performed on the whole image.

また、画像特徴分割手段４１では、局所領域毎に画像を分割して、特徴ベクトルを構成
するのではなく、画像全体からサンプリングして分割してもよい。例えば、一次特徴を９
分の１の２５２次元の９つのベクトルに分割する場合には、図６に示すように３ｘ３の領
域からサンプリングする。つまり、サンプリングされた画像は、僅かな位置の違いのある
縮小画像となる。この縮小画像をラスター走査することで、９つの特徴ベクトルに変換す
る。このような特徴ベクトルを一次ベクトルとして判別成分を計算し、その判別成分を統
合して再度判別分析を行ってもよい。 The image feature dividing means 41 may also divide the image by sampling from the whole image, instead of forming the feature vectors from local regions. For example, when the primary feature is divided into nine 252-dimensional vectors, each one-ninth of the original, sampling is performed from 3 × 3 regions as shown in FIG. 6. Each sampled image is then a reduced image with a slight difference in position. Raster-scanning these reduced images converts them into nine feature vectors. Discriminant components may be calculated with these feature vectors as primary vectors, and the discriminant components may then be integrated and subjected to discriminant analysis again.
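
A minimal sketch of this sampling-based division, assuming (FIG. 6 is not available here) that the image is tiled by 3 × 3 cells and that the pixel at the same offset within every cell goes to the same reduced image. The function name and the ordering of the nine outputs are illustrative.

```python
def interleaved_subimages(image, step=3):
    """Return step*step raster-scanned reduced images.

    Sub-image (dy, dx) collects the pixels at rows dy, dy+step, ...
    and columns dx, dx+step, ..., so the sub-images differ only by
    a small positional shift.
    """
    height, width = len(image), len(image[0])
    subs = []
    for dy in range(step):
        for dx in range(step):
            subs.append([image[y][x]
                         for y in range(dy, height, step)
                         for x in range(dx, width, step)])
    return subs
```

For a 42 × 54 image this yields nine 14 × 18 reduced images, that is, nine 252-dimensional vectors, matching the dimensions of the block-wise division.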

（第三の実施の形態）
本発明による別の実施の形態について図面を参照して詳細に説明する。図７は、本発明
の顔メタデータ生成装置を用いた顔画像マッチングシステムを示すブロック図である。 (Third embodiment)
Another embodiment of the present invention will be described in detail with reference to the drawings. FIG. 7 is a block diagram showing a face image matching system using the face metadata generation apparatus of the present invention.

以下、顔画像マッチングシステムについて詳細に説明する。
図１に示すように、本発明による顔画像マッチングシステムでは、顔画像を入力する顔
画像入力部７１と、顔メタデータを生成する顔メタデータ生成部７２と、抽出された顔メ
タデータを蓄積する顔メタデータ蓄積部７３と、顔メタデータから顔の類似度を算出する
顔類似度算出部７４と、顔画像を蓄積する顔画像データベース７５と、画像の登録要求・
検索要求に応じて、画像の入力・メタデータの生成・メタデータの蓄積・顔類似度の算出
の制御を行う制御部７６と顔画像や他の情報を表示するディスプレイの表示部７７と、が
設けられている。 The face image matching system is described in detail below.
As shown in FIG. 7, the face image matching system according to the present invention includes a face image input unit 71 that inputs a face image, a face metadata generation unit 72 that generates face metadata, a face metadata storage unit 73 that stores the extracted face metadata, a face similarity calculation unit 74 that calculates face similarity from the face metadata, a face image database 75 that stores face images, a control unit 76 that controls image input, metadata generation, metadata storage, and face similarity calculation in response to image registration requests and search requests, and a display unit 77 that displays face images and other information.

また、顔メタデータ生成部７２は、入力された顔画像から顔領域を切り出す領域切り出
し手段７２１と、切り出された領域の顔特徴を抽出する顔パターン特徴抽出手段７２２に
よって構成され、顔の特徴ベクトルを抽出することで、顔画像に関するメタデータを生成
する。 The face metadata generation unit 72 consists of a region cutout means 721 that cuts out a face region from the input face image and a face pattern feature extraction means 722 that extracts the face features of the cut-out region. It generates metadata about the face image by extracting a face feature vector.

顔画像の登録時には、スキャナあるいはビデオカメラなどの画像入力部７１で顔写真等
を顔の大きさや位置を合わせた上で入力する。あるいは、人物の顔を直接ビデオカメラな
どから入力しても構わない。この場合には、前述のＭｏｈａｄｄａｍの文献に示されてい
るような顔検出技術を用いて、入力された画像の顔位置を検出し、顔画像の大きさ等を自
動的に正規化する方がよいであろう。 When a face image is registered, a face photograph or the like is input through the image input unit 71, such as a scanner or a video camera, after the size and position of the face have been adjusted. Alternatively, a person's face may be input directly from a video camera or the like. In that case, it would be better to detect the face position in the input image with a face detection technique such as that described in the above-mentioned Mohaddam reference, and to normalize the size and other properties of the face image automatically.

また、入力された顔画像は必要に応じて顔画像データベース７５に登録する。顔画像登
録と同時に、顔メタデータ生成部７２によって顔メタデータを生成し、顔メタデータ蓄積
部７３に蓄積する。 Further, the input face image is registered in the face image database 75 as necessary. Simultaneously with the registration of the face image, face metadata is generated by the face metadata generation unit 72 and stored in the face metadata storage unit 73.

検索時には登録時と同様に顔画像入力部７１によって顔画像を入力し、顔メタデータ生
成部７２にて顔メタデータを生成する。生成された顔メタデータは、一旦顔メタデータ蓄
積部７３に登録するか、または、直接に顔類似度算出部７４へ送られる。 At search time, a face image is input through the face image input unit 71 as in registration, and the face metadata generation unit 72 generates face metadata. The generated face metadata is either registered temporarily in the face metadata storage unit 73 or sent directly to the face similarity calculation unit 74.

検索では、予め入力された顔画像がデータベース中にあるかどうかを確認する場合（顔
同定）には、顔メタデータ蓄積部７３に登録されたデータの一つ一つとの類似度を算出す
る。最も類似度が高い結果に基づいて制御部７６では、顔画像データベース７５から、顔
画像を選び、表示部７７等に顔画像の表示を行い、検索画像と登録画像における顔の同一
性を作業者が確認する。 In a search, to check whether the input face image is already in the database (face identification), the similarity to each item of data registered in the face metadata storage unit 73 is calculated. Based on the result with the highest similarity, the control unit 76 selects a face image from the face image database 75 and displays it on the display unit 77 or the like, and an operator confirms whether the faces in the search image and the registered image are the same.

一方、予めＩＤ番号等で特定された顔画像と検索の顔画像が一致するかどうかを確認す
る場合（顔識別）では、特定されたＩＤ番号の顔画像と一致するか、否かを顔類似度算出
部７４にて計算し、予め決められた類似度よりも類似度が低い場合には、一致しないと判
定し、類似度が高い場合には一致すると判定し、その結果を表示部７７に表示する。この
システムを入室管理用に用いるならば、顔画像を表示する変わりに、制御部７６から自動
ドアに対して、その開閉制御信号を送ることで、自動ドアの制御によって入室管理を行う
ことができる。 On the other hand, to check whether the search face image matches a face image specified in advance by an ID number or the like (face verification), the face similarity calculation unit 74 calculates whether the search image matches the face image with the specified ID number. If the similarity is lower than a predetermined similarity, the images are judged not to match; if it is higher, they are judged to match, and the result is displayed on the display unit 77. If this system is used for room entry management, then instead of displaying a face image, the control unit 76 can send an open/close control signal to an automatic door, so that entry is managed by controlling the door.
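
The two search modes differ only in how the similarity values are used. The sketch below takes similarity values as given, since the similarity measure itself is not defined in this part of the text; the function names are illustrative, and treating a similarity exactly equal to the threshold as a match is an assumption.

```python
def identify(similarities):
    """Face identification: return the registered ID with the highest
    similarity. similarities maps ID -> similarity to the query."""
    return max(similarities, key=similarities.get)

def verify(similarity, threshold):
    """Face verification: compare the similarity to the claimed ID's
    metadata against a predetermined threshold."""
    return similarity >= threshold
```

In the entry management use described above, the result of `verify` would drive the door control signal rather than a display.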

上記のように、顔画像マッチングシステムは動作するが、このような動作はコンピュー
タシステム上で実現することもできる。例えば、次に詳述するようなメタデータ生成を実
行するメタデータ生成プログラム及び類似度算出プログラムをそれぞれメモリに格納して
おき、これらをプログラム制御プロセッサによってそれぞれ実行することで、顔画像マッ
チングを実現することができる。 The face image matching system operates as described above, and this operation can also be realized on a computer system. For example, a metadata generation program that performs the metadata generation detailed below and a similarity calculation program can each be stored in memory and executed by a program-controlled processor to realize face image matching.

次に、この顔画像マッチングシステムの動作、特に顔メタデータ生成部７２と顔類似度
算出部７４について、詳細に説明する。 Next, the operation of the face image matching system, particularly the face metadata generation unit 72 and the face similarity calculation unit 74 will be described in detail.

（１）顔メタデータ生成
顔メタデータ生成部７２では、位置と大きさを正規化した画像Ｉ（ｘ，ｙ）を用いて、
顔特徴量を抽出する。位置と大きさの正規化は、例えば、目位置が（１６，２４）、（３
１，２４）、サイズが４６×５６画素となるように画像を正規化しておくとよい。以下で
は、このサイズに画像が正規化されている場合について説明する。 (1) Face metadata generation
The face metadata generation unit 72 extracts face feature quantities using an image I(x, y) whose position and size have been normalized. For the normalization of position and size, the image may be normalized so that, for example, the eye positions are (16, 24) and (31, 24) and the size is 46 × 56 pixels. The following description assumes that the image has been normalized to this size.

次に、領域切り出し手段７２１によって顔画像の予め設定した顔画像の複数の局所領域
を切り出す。例えば、上記の画像を例えば、一つは正規化した画像全体（これをｆ（ｘ，
ｙ）とする）ともう一つは、顔を中心とした中心領域の３２×３２画素の領域ｇ（ｘ，ｙ
）である。これは、両目の位置が（９，１２）と（２４，１２）の位置となるように切
り出せばよい。 Next, the region cutout means 721 cuts out a plurality of preset local regions of the face image. For example, one region is the whole normalized image (denoted f(x, y)), and the other is a 32 × 32 pixel region g(x, y) in the center of the face. The latter may be cut out so that the two eyes are at the positions (9, 12) and (24, 12).
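
The crop window follows from the two sets of eye coordinates: moving the eyes from (16, 24) and (31, 24) to (9, 12) and (24, 12) is a shift of 7 pixels in x and 12 pixels in y, so g(x, y) = f(x + 7, y + 12). A minimal sketch, assuming the image is a list of 56 rows of 46 pixels and coordinates are (x, y); the function name is illustrative.

```python
def crop_center_region(image, eye_src=(16, 24), eye_dst=(9, 12), size=32):
    """Cut a size x size region so that the eye at eye_src in the
    full image lands at eye_dst in the cropped image."""
    left = eye_src[0] - eye_dst[0]    # 16 - 9  = 7
    top = eye_src[1] - eye_dst[1]     # 24 - 12 = 12
    return [row[left:left + size] for row in image[top:top + size]]
```

The window spans x = 7 to 38 and y = 12 to 43, which lies inside the 46 × 56 normalized image.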

顔の中心領域を前述のように切り出すのは、これは髪型等に影響をされない範囲を切り
出すことで、髪型の変化するような場合（例えば、家庭内ロボットで顔照合を用いる際に
入浴前後で髪型が変化しても照合できるようにするため）でも安定な特徴を抽出するため
のものであるが、髪型等が変化しない場合（映像クリップ中におけるシーン内の人物同
定などの場合）には、髪型を含んだ形で照合を行うことで照合性能の向上が期待できるの
で、髪型を含んだような大きな顔画像と顔の中心部分の小さな顔画像に対して、顔画像
を切り出す。 The central region of the face is cut out as described above in order to extract features that remain stable even when the hairstyle changes (for example, so that a home robot using face matching can still match a face whose hairstyle differs before and after bathing), by taking a range that is not affected by the hairstyle and the like. When the hairstyle and the like do not change (for example, when identifying a person within a scene of a video clip), matching that includes the hairstyle can be expected to improve matching performance. For this reason, two face images are cut out: a large one that includes the hairstyle and a small one covering the central part of the face.

次に顔画像特徴抽出手段７２２では、切り出された二つの領域ｆ（ｘ，ｙ）を２次元の
離散フーリエ変換によって、フーリエ変換し、顔画像の特徴を抽出する。 Next, the face image feature extraction unit 722 performs Fourier transform on the two extracted regions f (x, y) by two-dimensional discrete Fourier transform to extract the feature of the face image.

図８に顔画像特徴抽出手段７２２のより詳しい構成について示す。この顔画像特徴抽出
手段では、正規化し切り出された画像を離散フーリエ変換するフーリエ変換手段８１と、
フーリエ変換したフーリエ周波数成分のパワースペクトラムを算出するフーリエパワー算
出手段８２と、フーリエ変換手段８１によって算出されたフーリエ周波数成分の実成分と
虚成分をラスター走査した特徴ベクトルによって、１次元特徴ベクトルとみなして、その
特徴ベクトルの主成分に対して判別特徴を抽出する線形変換手段８３とその変換のための
基底行列を記憶する基底行列記憶手段８４、および、パワースペクトラムを同様に主成分
の判別特徴を抽出する線形変換手段８５とその変換のための基底行列を記憶する基底行列
記憶手段８６を備える。さらに、フーリエ特徴の実数成分と虚数成分の判別特徴、および
、パワースペクトルの判別特徴をそれぞれ大きさ１のベクトルに正規化し、その二つの特
徴ベクトルを統合したベクトルに対して、そのベクトルの判別特徴を算出する線形変換手
段８８とその判別特徴のための判別行列を記憶する判別行列記憶手段８９を備える。 FIG. 8 shows the configuration of the face image feature extraction means 722 in more detail. This face image feature extraction means includes: a Fourier transform means 81 that applies a discrete Fourier transform to the normalized, cut-out image; a Fourier power calculation means 82 that calculates the power spectrum of the Fourier frequency components; a linear conversion means 83 that treats the raster-scanned real and imaginary components of the Fourier frequency components calculated by the Fourier transform means 81 as a one-dimensional feature vector and extracts discriminant features from the principal components of that feature vector, together with a basis matrix storage means 84 that stores the basis matrix for this transformation; and a linear conversion means 85 that similarly extracts discriminant features from the principal components of the power spectrum, together with a basis matrix storage means 86 that stores the basis matrix for this transformation. Furthermore, the discriminant features of the real and imaginary components of the Fourier features and the discriminant features of the power spectrum are each normalized to a vector of magnitude 1, and the apparatus includes a linear conversion means 88 that calculates the discriminant features of the vector obtained by integrating the two feature vectors, together with a discriminant matrix storage means 89 that stores the discriminant matrix for these discriminant features.

このような構成によって、フーリエ周波数特徴を抽出した後に、フーリエ周波数成分の
実数部と虚数部を要素とした特徴ベクトルと、パワースペクトラムを要素とした特徴ベク
トルに対して、それぞれ主成分の判別特徴を計算し、それぞれを統合した特徴ベクトルに
対して再度判別特徴を計算することで、顔の特徴量を計算する。 With this configuration, after the Fourier frequency features are extracted, discriminant features of the principal components are calculated separately for the feature vector whose elements are the real and imaginary parts of the Fourier frequency components and for the feature vector whose elements are the power spectrum. Discriminant features are then calculated again for the feature vector that integrates them, which yields the face feature quantity.

以下では、それぞれの動作についてより詳しく説明する。
フーリエ変換手段８１では、入力された画像ｆ（ｘ，ｙ）（ｘ＝０，１，２，．．．Ｍ
−１，ｙ＝０，１，２，．．．，Ｎ−１）に対して、（数１１）に従って、２次元の離散
フーリエ変換し、そのフーリエ特徴Ｆ（ｕ，ｖ）を計算する。この方法は広く知られてお
り、例えば、文献（Ｒｏｓｅｎｆｅｌｄら、”ディジタル画像処理”、ｐｐ．２０−２６
，近代科学社）に述べられているので、ここでは説明を省略する。 Each operation is described in more detail below.
The Fourier transform means 81 applies a two-dimensional discrete Fourier transform to the input image f(x, y) (x = 0, 1, 2, ..., M−1; y = 0, 1, 2, ..., N−1) according to (Equation 11) and calculates its Fourier feature F(u, v). This method is widely known and is described, for example, in the literature (Rosenfeld et al., "Digital Image Processing", pp. 20-26, Kindai Kagaku-sha), so its description is omitted here.

フーリエパワー算出手段では、（数１２）に従ってフーリエ特徴Ｆ（ｕ，ｖ）の大きさ
を求めフーリエパワースペクトラム｜Ｆ（ｕ，ｖ）｜を算出する。 The Fourier power calculation means calculates the Fourier power spectrum | F (u, v) | by obtaining the magnitude of the Fourier feature F (u, v) according to (Equation 12).

このようにして得られる二次元のフーリエスペクトルＦ（ｕ，ｖ）や｜Ｆ（ｕ，ｖ）｜
は２次元の実成分のみの画像を変換しているので、得られるフーリエ周波数成分は対称
なものとなる。このため、これらのスペクトル画像Ｆ（ｕ，ｖ）、｜Ｆ（ｕ，ｖ）｜は
ｕ＝０，１，．．．，Ｍ−１；ｖ＝０，１，．．．，Ｎ−１のＭ×Ｎ個の成分を持つが
、その半分の成分ｕ＝０，１，．．．，Ｍ−１；ｖ＝０，１，．．．，Ｎ−１のＭ×
Ｎ／２個の成分と、残りの半分の成分は、実質的に同等な成分となる。このため、特徴ベ
クトルとしては、半分の成分を用いて、以降の処理を行えばよい。当然のことながら、特
徴ベクトルの要素として用いられない成分をフーリエ変換手段８１やフーリエパワー算出
手段８２の演算で省略することで、演算の簡略化を図ることができる。 Because the two-dimensional Fourier spectra F(u, v) and |F(u, v)| obtained in this way come from transforming a two-dimensional image that has only real components, the resulting Fourier frequency components are symmetric. The spectral images F(u, v) and |F(u, v)| therefore have M × N components, u = 0, 1, ..., M−1; v = 0, 1, ..., N−1, but one half of them (M × N/2 components) and the remaining half are substantially equivalent. The subsequent processing may therefore use only half of the components as the feature vector. Naturally, the computation can be simplified by omitting, in the Fourier transform means 81 and the Fourier power calculation means 82, the components that are not used as elements of the feature vector.
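
Equations 11 and 12 are not reproduced in this text. The sketch below assumes the common unnormalized form F(u, v) = Σx Σy f(x, y) exp(−2πi(xu/M + yv/N)) and takes |F(u, v)| as the magnitude of the complex value; any scale factor in the original equations would change the values but not the symmetry. It is a direct, slow implementation meant only to show the symmetry that allows half of the components to be dropped.

```python
import cmath

def dft2(f):
    """Direct 2-D DFT of a real image f[y][x] with N rows and M columns.
    Returns F indexed as F[v][u]. Unnormalized form (an assumption)."""
    n_rows, m_cols = len(f), len(f[0])
    spectrum = []
    for v in range(n_rows):
        row = []
        for u in range(m_cols):
            acc = 0j
            for y in range(n_rows):
                for x in range(m_cols):
                    angle = -2.0 * cmath.pi * (u * x / m_cols + v * y / n_rows)
                    acc += f[y][x] * cmath.exp(1j * angle)
            row.append(acc)
        spectrum.append(row)
    return spectrum

def power_spectrum(spectrum):
    """Magnitude |F(u, v)| of every Fourier component."""
    return [[abs(c) for c in row] for row in spectrum]
```

For a real image, F(u, v) is the complex conjugate of F(M−u, N−v) (indices taken modulo M and N), so |F| is point-symmetric and roughly half of the components carry all the information.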

次に、線形変換手段８３では、周波数特徴として抽出された特徴量をベクトルとして取
り扱う。予め規定しておく部分空間は、学習用の顔画像セットを用意し、対応する切り出
し領域の周波数特徴ベクトルの主成分の判別分析によって得られる基底ベクトル（固有ベ
クトル）によって定める。この基底ベクトルの求め方については、Ｗ．Ｚｈａｏの文献
をはじめとして様々は文献で説明されている一般的に広く知られた方法であるので、ここ
では説明を省略する。ここで判別分析を直接行わないのは、フーリエ変換によって得られ
る特徴ベクトルの次元数が判別分析を直接取り扱うには大きすぎるためであり、既に指摘
したような主成分判別分析における問題点は残るものの第一段階目の特徴ベクトルの抽出
としては、一つの選択ではある。また、ここに判別分析を繰り返す方法による基底行列を
用いて構わない。 Next, the linear conversion means 83 handles the feature quantities extracted as frequency features as a vector. The subspace specified in advance is defined by basis vectors (eigenvectors) obtained by preparing a face image set for learning and performing discriminant analysis on the principal components of the frequency feature vectors of the corresponding cut-out region. The way these basis vectors are obtained is a widely known method described in various references, including the W. Zhao reference, so its description is omitted here. Discriminant analysis is not applied directly here because the feature vector obtained by the Fourier transform has too many dimensions for discriminant analysis to handle directly. Although the problems of principal component discriminant analysis already pointed out remain, it is one option for extracting the first-stage feature vectors. A basis matrix obtained by the method of repeating discriminant analysis may also be used here.

つまり、基底行列記憶手段８４に記憶する主成分の判別行列Φ１は、周波数特徴の実成
分と虚成分をラスター走査によって１次元化した特徴ベクトルｘ１の主成分の判別分析を
行うことによって予め学習サンプルから求めることができる。ここで、フーリエ特徴は複
素数として取り扱う必要は必ずしもなく、虚数成分も単なる別の特徴要素として、実数と
して取り扱って構わない。 That is, the principal component discriminant matrix Φ1 stored in the basis matrix storage means 84 can be obtained in advance from learning samples by performing discriminant analysis on the principal components of the feature vector x1, which is formed by raster-scanning the real and imaginary components of the frequency features into one dimension. The Fourier features need not be handled as complex numbers; the imaginary components may simply be handled as real numbers, as additional feature elements.

主成分への基底行列をΨ１、その主成分のベクトルを判別分析した判別行列をＷ１とす
れば、主成分の判別行列Φ１は、（数１３）によって書き表される。 If the basis matrix onto the principal components is Ψ1 and the discriminant matrix obtained by discriminant analysis of the principal component vectors is W1, the principal component discriminant matrix Φ1 is expressed by (Equation 13).

なお、主成分分析によって削減する次元数は、もとの特徴フーリエ特徴の１／１０程度
（２００次元前後）にすればよく、その後、この判別行列によって７０次元程度に削減す
る。この基底行列を予め学習サンプルから計算しておき、基底行列記憶手段８４に記憶
される情報として用いる。 The principal component analysis may reduce the number of dimensions to about 1/10 of the original Fourier features (around 200 dimensions), after which the discriminant matrix reduces it further to about 70 dimensions. This basis matrix is calculated in advance from learning samples and used as the information stored in the basis matrix storage means 84.

フーリエパワースペクトラム｜Ｆ（ｕ，ｖ）｜についても同様にそのスペクトルをラス
ター走査によって、１次元特徴ベクトルｘ２として表し、その特徴ベクトルの主成分の判
別分析を行うことによって得られる基底行列Φ２Ｔ＝Ψ２ＴＷ２Ｔを学習サンプルから
予め求めておく。 Similarly, for the Fourier power spectrum |F(u, v)|, the spectrum is raster-scanned and expressed as a one-dimensional feature vector x2, and the basis matrix Φ2^T = Ψ2^T W2^T, obtained by performing discriminant analysis on the principal components of that feature vector, is calculated in advance from learning samples.
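
Equation 13 is not reproduced in this text. Judging from the form given for the power spectrum basis, Φ2^T = Ψ2^T W2^T, it presumably has the same shape for both branches. The following is a sketch under the assumption that each row of Ψi is one principal axis and that the projection is written yi = Φi xi, consistent with the M × N matrix sizes stated for the first embodiment:

```latex
\Phi_i^{T} = \Psi_i^{T} W_i^{T}
\quad\Longleftrightarrow\quad
\Phi_i = W_i \Psi_i ,
\qquad
y_i = \Phi_i x_i = W_i \left( \Psi_i x_i \right), \quad i = 1, 2
```

Read this way, Ψi first maps the Fourier feature to its principal components (around 200 dimensions in the example) and Wi then maps those to the discriminant features (around 70 dimensions), so only the product Φi needs to be stored.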

このように、フーリエ特徴のそれぞれの成分について主成分判別特徴を計算することで
、フーリエ成分の実成分と虚成分の特徴ベクトルｘ１の主成分の判別特徴ｙ１と、パワー
スペクトルの特徴ベクトルｘ２の主成分の判別特徴ｙ２を得ることができる。 By calculating principal component discriminant features for each component of the Fourier features in this way, the principal component discriminant feature y1 of the feature vector x1 (the real and imaginary components of the Fourier components) and the principal component discriminant feature y2 of the feature vector x2 (the power spectrum) are obtained.

正規化手段８７では、得られた特徴ベクトルの大きさをそれぞれ例えば長さ１の単位ベ
クトルに正規化する。ここで、ベクトルを測る原点をどこにするかで、ベクトル長は変わ
るので、その基準位置も予め定めておく必要があるが、これは射影された特徴ベクトルｙ
ｉの学習サンプルから求めた平均ベクトルｍｉを用いて、基準点とすればよい。平均ベク
トルを基準点とすることで、基準点の周りに特徴ベクトルが分布するようになり、特にガ
ウシアン分布であるならば、等方的に分布するようになるので、特徴ベクトルを最終的に
量子化するような場合の分布域の領域を限定することが容易にできるようになる。 The normalizing means 87 normalizes the magnitude of each obtained feature vector, for example to a unit vector of length 1. Because the vector length depends on where the origin for measuring the vector is placed, the reference position must also be determined in advance; the mean vector mi of the projected feature vectors yi, obtained from learning samples, may be used as the reference point. With the mean vector as the reference point, the feature vectors are distributed around the reference point, and isotropically so if the distribution is Gaussian, which makes it easy to limit the range of the distribution when the feature vectors are eventually quantized.

つまり、特徴ベクトルｙｉをその平均ベクトルｍｉによって、単位ベクトルに正規化し
たベクトルｙｉｏは、（数１４）と表される。 That is, the vector yio obtained by normalizing the feature vector yi to the unit vector by the average vector mi is expressed as (Equation 14).
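
Equation 14 is not reproduced in this text. A natural reading, assumed in the sketch below, is yio = (yi − mi) / |yi − mi|: subtract the mean vector obtained from the learning samples, then divide by the Euclidean length. The function name and the handling of a zero-length vector are illustrative choices.

```python
import math

def normalize_about_mean(y, mean):
    """Return the unit vector pointing from the mean vector to y.
    A vector equal to the mean is returned as all zeros (assumption)."""
    centered = [a - b for a, b in zip(y, mean)]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0.0:
        return centered
    return [c / norm for c in centered]
```

Because only the direction from the mean survives, two features whose offsets from the mean differ by a positive scale factor map to the same normalized vector.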

このように正規化手段を設け、フーリエパワーの実数と虚数に関わる特徴ベクトルｙ１
と、パワーに関わる特徴ベクトルｙ２を単位ベクトルに正規化しておくことで、異種の特
徴量である二つの特徴量の間の大きさの正規化をしておき、特徴ベクトルの分布特性を安
定化させることができる。また、既に次元削減の過程で判別に必要な特徴空間の中での
大きさを正規化しているので、削除された雑音をより多く含む特徴空間で正規化する場合
よりも、雑音に影響されにくい正規化が実現できるためである。この正規化により、単
なる線形変換では除去が難しい全体的な照明強度に比例する変動成分のような変動要素の
影響をとり除くことができる。 By providing the normalizing means in this way and normalizing to unit vectors the feature vector y1, related to the real and imaginary parts of the Fourier components, and the feature vector y2, related to the power, the magnitudes of two feature quantities of different kinds are normalized relative to each other and the distribution characteristics of the feature vectors are stabilized. Moreover, because the magnitude is normalized within the feature space needed for discrimination, after dimension reduction, the normalization is less affected by noise than normalization in a feature space that still contains more of the noise that has been removed. This normalization also removes the influence of variation factors that are difficult to remove by a simple linear transformation, such as a variation component proportional to the overall illumination intensity.

The feature vectors y1o and y2o normalized in this way are integrated into a single feature vector y in the same way as in (Equation 8). The integrated vector y is projected onto the discriminant space using the basis matrix W3 obtained by linear discriminant analysis of the integrated feature vectors, which yields the output feature vector z. The discriminant matrix W3 for this purpose is stored in the discriminant matrix storage means 89, and the linear conversion means 88 performs the projection and calculates, for example, a 24-dimensional feature vector z.

When the output feature vector z is quantized to, for example, 5 bits per element, the magnitude of each element must be normalized beforehand, for example according to the variance of each element.

Specifically, the standard deviation σi of each element zi of the feature vector z over the learning samples is obtained, each element is normalized as zo = 16zi / 3σi, and, for 5 bits for example, the result is quantized to a value from -16 to 15.
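The scaling and 5-bit quantization can be illustrated with a short sketch. The scale factor 16 / (3σi) and the range -16 to 15 come from the text; rounding to the nearest integer and clamping values outside the range are assumptions, and the name `quantize_5bit` is hypothetical.

```python
def quantize_5bit(z, sigma):
    """Scale each z_i by 16 / (3 * sigma_i), then round and clamp to -16..15."""
    codes = []
    for zi, si in zip(z, sigma):
        scaled = 16.0 * zi / (3.0 * si)   # +/- 3 sigma maps to +/- 16
        q = int(round(scaled))            # rounding rule is an assumption
        codes.append(max(-16, min(15, q)))
    return codes

# 0 stays 0, 1.5 sigma maps to 8, -3 sigma maps to -16,
# and values beyond the representable range saturate at 15.
codes = quantize_5bit([0.0, 1.5, -3.0, 10.0], [1.0, 1.0, 1.0, 2.0])
```

With this scaling, values within about three standard deviations of zero fill the 32 available levels, and rarer outliers saturate at the ends of the range.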

This normalization multiplies each element by a constant proportional to the reciprocal of its standard deviation, so it can be written as zo = Σz, where Σ is a diagonal matrix determined by the σi. Because this is a plain linear transformation, Σ may be applied to the discriminant matrix W3 in advance, as in (Equation 15).

Normalizing in this way not only provides the range correction needed for quantization. Because it is a normalization by the standard deviation, simply computing the L2 norm of the distance between patterns at matching time is equivalent to computing a Mahalanobis distance, which reduces the amount of computation at matching time.

The description above covers how the face image feature extraction means 122 extracts the feature vector zf from the normalized image f(x, y). For the image g(x, y), obtained by cutting out only the central part of the face, the face image feature extraction means 122 extracts a feature vector zg in the same manner. The face metadata generation unit outputs the two feature vectors zf and zg as the facial feature quantity z.

As noted above, the face metadata generation procedure can also be executed by a computer under the control of a computer program.

(2) Face similarity calculation
Next, the operation of the face similarity calculation unit 74 is described.

The face similarity calculation unit 74 calculates the similarity d(z1, z2) between two faces using the K-dimensional feature vectors z1 and z2 obtained from the two sets of face metadata.

For example, the similarity is calculated as the squared distance of (Equation 16).

Here αi is a weighting coefficient. If, for example, the reciprocal of the standard deviation of each feature dimension zi is used, the calculation becomes a Mahalanobis distance. If the feature vectors have already been normalized by (Equation 15) or the like, the basis matrix has been normalized by the variance in advance, so the distance is already a Mahalanobis distance, as described above. Alternatively, the similarity may be calculated as the cosine of the angle between the feature vectors being compared, as in (Equation 3).

When a distance is used, a larger value means a lower similarity (the faces are less alike); when the cosine is used, a larger value means a higher similarity (the faces are more alike).
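The two similarity measures can be sketched as below. (Equation 16) and (Equation 3) are not reproduced in this text, so the sketch assumes the usual forms: a weighted squared distance, the sum over i of αi (z1i - z2i)^2, and the cosine of the angle between the vectors. The function names are hypothetical.

```python
import math

def weighted_squared_distance(z1, z2, alpha=None):
    """Sum of alpha_i * (z1_i - z2_i)^2; alpha defaults to all ones (plain squared L2)."""
    if alpha is None:
        alpha = [1.0] * len(z1)
    return sum(a * (p - q) ** 2 for a, p, q in zip(alpha, z1, z2))

def cosine_similarity(z1, z2):
    """Cosine of the angle between z1 and z2 (both must be non-zero vectors)."""
    dot = sum(p * q for p, q in zip(z1, z2))
    n1 = math.sqrt(sum(p * p for p in z1))
    n2 = math.sqrt(sum(q * q for q in z2))
    return dot / (n1 * n2)

d_plain = weighted_squared_distance([1.0, 2.0], [4.0, 6.0])                  # 9 + 16
d_weighted = weighted_squared_distance([1.0, 2.0], [4.0, 6.0], [2.0, 0.5])   # 18 + 8
c_parallel = cosine_similarity([1.0, 2.0], [2.0, 4.0])
c_orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Note the opposite senses of the two measures: the distance grows as faces become less alike, while the cosine approaches 1 as they become more alike, so any threshold or ranking logic has to be written for one or the other.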

The description so far assumes that one face image is registered and one face image is used for retrieval. When several images are registered for one person's face and a single face image is used for retrieval, the similarity can be calculated, for example, against each of the registered sets of face metadata.

Likewise, when several images are registered per face and several images are used for retrieval, the similarity for one face can be obtained by taking the average or the minimum of the similarities over all combinations. If a moving image is regarded as a set of images, this means that the matching system of the present invention can also be applied to face recognition in moving images.
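Combining the pairwise scores over several registered and query images might look like the following sketch. It assumes a plain squared distance as the pairwise score, so the minimum corresponds to the best-matching pair; `face_set_distance` and its `mode` parameter are hypothetical names.

```python
def squared_distance(z1, z2):
    """Plain squared L2 distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(z1, z2))

def face_set_distance(registered, queries, mode="min"):
    """Combine the distances of every (registered, query) pair into one score."""
    scores = [squared_distance(r, q) for r in registered for q in queries]
    if not scores:
        raise ValueError("both sets need at least one feature vector")
    if mode == "min":
        return min(scores)
    if mode == "mean":
        return sum(scores) / len(scores)
    raise ValueError("mode must be 'min' or 'mean'")

registered = [[0.0, 0.0], [3.0, 4.0]]   # two registered images of one face
queries = [[0.0, 1.0], [3.0, 3.0]]      # two query images, e.g. video frames
best = face_set_distance(registered, queries, "min")
average = face_set_distance(registered, queries, "mean")
```

The minimum rewards a single good match between any two images, whereas the mean is more conservative when some frames are poor; which is preferable depends on the application.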

Embodiments of the present invention have been described above with reference to the drawings as appropriate; needless to say, the present invention can also be realized by a computer-executable program.

(Fourth embodiment)
Another embodiment of the present invention is described in detail with reference to the drawings. This embodiment improves the face metadata generation unit 72 of the third invention. In the third invention, discriminant features of the principal components are calculated separately for a feature vector whose elements are the real and imaginary parts of the Fourier frequency components obtained by Fourier-transforming the input face image and for a feature vector whose elements are the power spectrum; discriminant features are then calculated again for the feature vector that integrates the two, which gives the facial feature quantity. Because the Fourier power spectrum reflects features of the whole input image, components from noisy input pixels (for example, pixels around the mouth, whose relative position changes easily) are reflected in the power spectrum as strongly as any other pixels, and sufficient performance is sometimes not obtained even when effective features are selected by discriminant analysis. In such cases, the input image is divided into regions, each local region is Fourier-transformed, and discriminant analysis is performed using the power spectrum of each local region as a feature quantity. The discriminant analysis can then reduce the influence of the features of regions whose local discrimination performance is poor (that is, whose within-class variance is large).

FIG. 9 illustrates an example and shows the flow of the feature extraction processing. In this example, a region of 32x32 pixels is divided into 4 regions of 16x16 pixels, 16 regions of 8x8 pixels, 64 regions of 4x4 pixels, 256 regions of 2x2 pixels, and 1024 regions of 1x1 pixel (the last is effectively identical to the input image, so the input image can be used as is), and a Fourier transform is performed in each of the divided regions. FIG. 10 summarizes this processing flow. All the power spectra of the regions obtained in this way are extracted as a feature quantity of 1024 x 5 = 5120 dimensions. This dimensionality is large when only an ordinary, limited amount of learning data is available, so principal component analysis is performed in advance to obtain a basis that reduces the number of dimensions; about 300 dimensions is appropriate, for example. Discriminant analysis is then performed on the feature vectors of this dimensionality to reduce the number of dimensions further and to obtain a basis corresponding to feature axes with good discrimination performance. The bases corresponding to the principal component analysis and the discriminant analysis are calculated in advance (together they are called the PCLDA projection basis Ψ).
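The multi-block power spectrum extraction described above can be sketched with the standard library alone. The sketch assumes raster-order blocks and a plain, unnormalized discrete Fourier transform; the function names `dft1`, `power_spectrum` and `multiblock_features` are hypothetical, and no symmetry reduction or PCLDA projection is applied.

```python
import cmath

def dft1(seq):
    """Naive 1-D discrete Fourier transform of a sequence."""
    n = len(seq)
    return [sum(seq[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def power_spectrum(block):
    """|F(u, v)| of a 2-D block (list of rows), using row DFTs then column DFTs."""
    rows = [dft1(r) for r in block]               # rows[y][u]
    cols = [dft1(list(c)) for c in zip(*rows)]    # cols[u][v]
    return [[abs(cols[u][v]) for u in range(len(cols))]
            for v in range(len(block))]

def multiblock_features(image, block_sizes):
    """Concatenate the power spectra of all square blocks at every block size."""
    size = len(image)
    feats = []
    for b in block_sizes:
        for top in range(0, size, b):
            for left in range(0, size, b):
                block = [row[left:left + b] for row in image[top:top + b]]
                for spec_row in power_spectrum(block):
                    feats.extend(spec_row)
    return feats

# A synthetic 32x32 image, split at the five block sizes named in the text.
image = [[(3 * x + 5 * y) % 7 for x in range(32)] for y in range(32)]
features = multiblock_features(image, [16, 8, 4, 2, 1])
```

Each of the five block sizes contributes 1024 values, so `features` has 5120 entries; the last 1024, from the 1x1 blocks, are simply the absolute pixel values, which matches the remark that this level is effectively the input image itself.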

Projecting the 5120-dimensional feature by a linear operation using this PCLDA projection basis Ψ yields the discriminant feature z. Applying quantization or the like then gives the facial feature quantity. The 5120-dimensional feature can be reduced in dimension beforehand by taking the symmetry of the Fourier power spectrum into account or by discarding the high-frequency components. Since this enables faster learning, a smaller amount of required data, and faster feature extraction, it is desirable to reduce the number of dimensions as appropriate.

By dividing the region into blocks and multiplexing the Fourier spectra in this way, the representation holds, in multiple layers, feature quantities ranging from ones equivalent to the image itself (the 1024-division case) to ones with translation invariance and ones that are local. Selecting the features effective for identification from this multiple, redundant representation by discriminant analysis gives compact feature quantities with good identification performance. The Fourier power spectrum is a nonlinear operation on the image, so it provides effective feature quantities that cannot be obtained merely by applying discriminant analysis, which processes the image by linear operations. Although the case of linear discriminant analysis of the principal components has been described here, the second-stage feature extraction may instead use kernel discriminant analysis (discriminant analysis using the kernel technique, known as Kernel Fisher Discriminant Analysis (KFDA), Kernel Discriminant Analysis (KDA), or Generalized Discriminant Analysis (GDA)). Kernel discriminant analysis is explained in detail in, for example, Q. Liu et al. (Non-Patent Document 3: "Kernel-based Optimized Feature Vectors Selection and Discriminant Analysis for Face Recognition," Proceeding of IAPR International Conference on Pattern Recognition (ICPR), Vol. II, pp. 362-365, 2002) and G. Baudat (Non-Patent Document 4: "Generalized Discriminant Analysis Using a Kernel Approach," Neural Computation, Vol. 12, pp. 2385-2404, 2000). Extracting features by kernel discriminant analysis further exploits the effect of nonlinear feature extraction and yields effective features.

In this case, however, a feature vector as large as 5120 dimensions is handled, so even the principal component analysis requires a large amount of memory and a large amount of learning data. FIG. 11 shows a configuration that avoids this problem: principal component analysis and discriminant analysis are performed separately for each block, and linear discriminant analysis (LDA) is then performed as a second stage, which reduces the amount of computation. In this case, principal component analysis and discriminant analysis are performed on the 1024-dimensional feature quantity of each region (512 dimensions when halved by symmetry) to obtain the basis matrices Ψi (i = 0, 1, 2, ..., 5). Each feature vector is then normalized using its mean value, and the second-stage LDA projection is performed. Processing block by block in this way reduces the number of data items and the computing resources required for learning and shortens the time needed to optimize the learning. When faster computation is desired, the vector normalization can be omitted and the PCLDA projection basis matrix and the LDA projection basis matrix computed in advance, so that the two linear projections can be applied as one.
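The speed-up obtained by omitting the normalization follows from the fact that two successive linear projections equal one projection by the product matrix. The sketch below illustrates this with tiny hypothetical matrices (stored with one projection axis per row); the names `stage1`, `stage2` and `combined` are illustrative and stand in for a first-stage PCLDA basis and a second-stage LDA basis.

```python
def matvec(mat, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vec)) for row in mat]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]

# Hypothetical projections: each row is one projection axis.
stage1 = [[1.0, 0.0, 1.0],
          [0.0, 2.0, 0.0]]     # first stage: 3-D -> 2-D
stage2 = [[1.0, -1.0]]         # second stage: 2-D -> 1-D

combined = matmul(stage2, stage1)   # single 3-D -> 1-D projection

x = [3.0, 1.0, 2.0]
two_step = matvec(stage2, matvec(stage1, x))
one_step = matvec(combined, x)
```

Both routes give the same result, so the product can be formed once offline. With the unit-length normalization between the stages this shortcut no longer applies, because that step is not linear.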

FIG. 12 illustrates yet another example and shows the flow of the feature extraction processing. In this example, the region division is performed in several stages (two in the figure). Power spectra are extracted at multiple resolutions so as to take into account both the translation invariance of the Fourier power spectrum of a local region and the reliability of that local region, and they are used as feature quantities for discriminant analysis. Feature extraction is then performed using the best feature space found by the discriminant analysis.

For example, when the input image f(x, y) is 32x32 pixels, as shown in FIG. 10, feature vectors are extracted from the power spectrum |F(u, v)| of the whole image, the power spectra |F11(u, v)|, |F12(u, v)|, |F13(u, v)|, |F14(u, v)| of the four 16x16-pixel regions obtained by dividing it into four, and the power spectra |F21(u, v)|, |F22(u, v)|, ..., |F216(u, v)| of the sixteen 8x8-pixel regions. Because the Fourier power spectrum of a real image is symmetric, only half of it needs to be extracted. To keep the feature vector used in discriminant analysis from becoming large, the feature vector may also be constructed without sampling the high-frequency components. For example, constructing the feature vector from only the quarter of the spectrum that corresponds to the low-frequency components reduces the number of learning samples needed and the processing time required for learning and recognition. When the amount of learning data is small, principal component analysis may be applied first to reduce the number of feature dimensions before discriminant analysis is performed.

Using the feature vectors x2f extracted in this way, discriminant analysis is performed on a learning set prepared in advance to obtain the basis matrix Ψ2f. FIG. 9 shows an example in which discriminant features are extracted from the principal components (Principal Component Linear Discriminant Analysis: PCLDA). The feature vector x2f is projected using the basis matrix Ψ2f, and the mean and magnitude of the projected feature vector are normalized to give the feature vector y2f.

Similarly, the feature vector x1f, which integrates the real and imaginary components of the Fourier frequencies, is projected by a linear operation using the basis matrix Ψ1f to obtain a feature vector with a reduced number of dimensions, and the mean and magnitude of that vector are normalized to give the feature vector y1f. The feature vector that integrates y1f and y2f is projected again using the discriminant basis Ψ3f to obtain the feature vector zf. Quantizing this to, for example, 5 bits gives the facial feature quantity.

When the input is a face image normalized to a size of 44x56 pixels, the above processing is applied to the central 32x32 pixels to extract a facial feature quantity. Facial feature quantities are also extracted for the whole 44x56-pixel face from the multiply divided regions: the whole 44x56-pixel region, 4 regions of 22x28 pixels, and 16 regions of 11x14 pixels. FIG. 13 shows another example, in which PCLDA is applied to the real components, imaginary components, and power spectrum of each local region together. FIG. 14 shows an example in which the feature combining the real and imaginary components and the power spectrum are projected separately by PCLDA, and an LDA projection is performed last.

(Fifth embodiment)
Another embodiment of the present invention is described in detail with reference to the drawings.

This embodiment presents an example of a facial feature description method and a facial feature descriptor using the present invention. FIG. 15 shows, as an example of a facial feature description, the description of the facial feature quantity in the DDL representation syntax (Description Definition Language Representation Syntax) of ISO/IEC FDIS 15938-3 "Information technology - Multimedia content description interface - Part 3: Visual".

Here, the facial feature description named AdvancedFaceRecognition has elements named "FourierFeature" and "CentralFourierFeature". FourierFeature and CentralFourierFeature are each arrays of unsigned 5-bit integers that can have from 24 to 63 elements. FIG. 16 shows the rules that apply when the binary representation syntax (Binary Representation Syntax) is used for this data representation: the array sizes of FourierFeature and CentralFourierFeature are stored as unsigned 6-bit integers in numOfFourierFeature and numOfCentralFourierFeature, and each element of FourierFeature and CentralFourierFeature is stored as a 5-bit unsigned integer.
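A bit-level packing consistent with this description can be sketched as follows. The field widths (6 bits for each count, 5 bits per element) and the value ranges come from the text; the field order (both counts first, then the two arrays) and most-significant-bit-first encoding are assumptions, since FIG. 16 is not reproduced here. The function names are hypothetical, and the bit stream is represented as a string of '0' and '1' characters for readability.

```python
def pack_descriptor(fourier, central):
    """Encode two feature arrays as a bit string: two 6-bit counts, then 5-bit elements."""
    for name, arr in (("FourierFeature", fourier), ("CentralFourierFeature", central)):
        if not 24 <= len(arr) <= 63:
            raise ValueError(name + " must have 24 to 63 elements")
        if any(not 0 <= v <= 31 for v in arr):
            raise ValueError(name + " elements must fit in 5 unsigned bits")
    bits = format(len(fourier), "06b") + format(len(central), "06b")
    bits += "".join(format(v, "05b") for v in fourier)
    bits += "".join(format(v, "05b") for v in central)
    return bits

def unpack_descriptor(bits):
    """Inverse of pack_descriptor: recover the two arrays from the bit string."""
    n_f = int(bits[0:6], 2)
    n_c = int(bits[6:12], 2)
    pos = 12
    fourier = [int(bits[pos + 5 * i:pos + 5 * i + 5], 2) for i in range(n_f)]
    pos += 5 * n_f
    central = [int(bits[pos + 5 * i:pos + 5 * i + 5], 2) for i in range(n_c)]
    return fourier, central

fourier = list(range(24))     # 24 hypothetical quantized elements
central = [31] * 24
bits = pack_descriptor(fourier, central)
decoded = unpack_descriptor(bits)
```

With the minimum of 24 elements in each array the descriptor occupies 12 + 5 x 48 = 252 bits, which shows how compact the representation is.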

Such a facial feature descriptor using the present invention is now described in more detail.
● numOfFourierFeature
This field specifies the size of the FourierFeature array. The allowed range of values is 24 to 63.
● numOfCentralFourierFeature
This field specifies the size of the CentralFourierFeature array. The allowed range of values is 24 to 63.
● FourierFeature
This element represents a facial feature based on the hierarchical LDA of the Fourier characteristics of the normalized face image. The normalized face image is obtained by scaling the original image to an image of 56 rows with 46 luminance values in each row. In the normalized image, the centers of the right eye and the left eye must be located at the 16th column and the 31st column of the 24th row, respectively.

The elements of FourierFeature are extracted from two feature vectors: the Fourier spectrum vector x1f and the multi-block Fourier intensity vector x2f. FIG. 17 illustrates the extraction process of the Fourier feature. Given a normalized image, the following five processing steps must be performed to extract the elements.

(1) Extraction of the Fourier spectrum vector x1f
(2) Extraction of the multi-block Fourier intensity vector x2f
(3) Projection of the feature vectors using the PCLDA basis matrices Ψ1f and Ψ2f, and normalization to the unit vectors y1f and y2f
(4) Projection of the joint Fourier vector of the unit vectors using the LDA basis matrix Ψ3f
(5) Quantization of the projection vector zf

STEP 1) Extraction of the Fourier spectrum vector
The Fourier spectrum F(u, v) of the given normalized image f(x, y) is calculated by (Equation 18).

Here, M = 46 and N = 56. The Fourier spectrum vector x1f is defined as the set of components obtained by scanning the Fourier spectrum. FIG. 18 shows the scanning method. Scanning is performed over two regions in the Fourier space, region A and region B, and the scanning rules are summarized in FIG. 19. Here, SR(u, v) denotes the top-left coordinate of region R and ER(u, v) denotes its bottom-right point. The Fourier spectrum vector x1f is therefore expressed by (Equation 19).

The number of dimensions of x1f is 644.

STEP 2) Extraction of the multi-block Fourier intensity vector
The multi-block Fourier intensity vector is extracted from the Fourier intensities of partial images of the normalized face image. Three types of partial image are used: (a) the whole image, (b) quarter images, and (c) one-sixteenth images.

(a) Whole image
The whole image f10(x, y) is obtained by removing the columns on both sides of the image boundary of the normalized image f(x, y), cropping it to an image size of 44x56. It is given by (Equation 20).

(b) Quarter images
The quarter images are obtained by dividing the whole image f10(x, y) equally into four blocks fk1(x, y) (k = 1, 2, 3, 4).

Here, sk1 = (k-1) % 2 and tk1 = (k-1) / 2, where % is the remainder and / is integer division.

(c) One-sixteenth images
The one-sixteenth images are obtained by dividing f10(x, y) equally into 16 blocks fk2(x, y) (k = 1, 2, 3, ..., 16), and are given by the following equation.

Here, sk2 = (k-1) % 4 and tk2 = (k-1) / 4, where % is the remainder and / is integer division.
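The block indexing can be illustrated with a short sketch. The block equations are not reproduced in this text, so the sketch assumes that s is the horizontal block index and t the vertical one, that blocks are numbered in raster order, and that the 44x56 whole image is obtained by dropping one column on each side of the 46x56 normalized image. The function name `split_blocks` is hypothetical.

```python
def split_blocks(image, per_side):
    """Split an image (list of rows) into per_side x per_side equal blocks, k = 1, 2, ..."""
    height, width = len(image), len(image[0])
    bh, bw = height // per_side, width // per_side
    blocks = []
    for k in range(1, per_side * per_side + 1):
        s = (k - 1) % per_side     # horizontal block index (assumed meaning of s)
        t = (k - 1) // per_side    # vertical block index, by integer division
        blocks.append([row[s * bw:(s + 1) * bw]
                       for row in image[t * bh:(t + 1) * bh]])
    return blocks

# Synthetic 46x56 normalized image whose pixel value encodes its own position.
normalized = [[x + 46 * y for x in range(46)] for y in range(56)]
whole = [row[1:45] for row in normalized]   # assumed crop to 44 columns
quarters = split_blocks(whole, 2)           # four blocks of 22x28 pixels
sixteenths = split_blocks(whole, 4)         # sixteen blocks of 11x14 pixels
```

Under these assumptions the second quarter block (k = 2) starts 22 columns to the right of the first, and the third (k = 3) starts 28 rows below it, in line with the stated block sizes of 22x28 and 11x14 pixels.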

From these images, the Fourier intensity |Fkj(u, v)| is calculated as in (Equation 23).

Mj denotes the width of each partial image: M0 = 44, M1 = 22, and M2 = 11.
Nj denotes the height of each partial image: N0 = 56, N1 = 28, and N2 = 14.

The multi-block Fourier intensity vector is obtained by scanning the low-frequency region of each intensity |Fkj(u, v)| in the order 1) whole image (k = 1), 2) quarter images (k = 1, 2, 3, 4), and 3) one-sixteenth images (k = 1, 2, ..., 16). The scanning regions are defined in FIG. 19.

The multi-block Fourier intensity vector x2f is therefore expressed by (Equation 24).

The number of dimensions of x2f is 856.

STEP 3) PCLDA projection and vector normalization
The Fourier spectrum vector x1f and the multi-block Fourier intensity vector x2f are projected using the PCLDA basis matrices Ψ1f and Ψ2f, respectively, and normalized to the unit vectors y1f and y2f. The normalized vector ykf (k = 1, 2) is given by the following equation.

Here, the PCLDA basis matrix Ψkf is the basis matrix obtained by discriminant analysis of the principal components of xkf, and the mean vector mkf is the mean of the projected vectors; both are looked up in tables calculated in advance. y1f and y2f have 70 and 80 dimensions, respectively.

STEP 4) LDA projection of the joint Fourier vector
The normalized vectors y1f and y2f are concatenated into a 150-dimensional joint Fourier vector y3f, which is projected using the LDA basis matrix Ψ3f. The projection vector zf is given by the following equation.

STEP 5) Quantization
The elements of zf are clipped to the range of a 5-bit unsigned integer using the following equation.
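The quantization equation of STEP 5 is not reproduced in this text. One mapping consistent with the 5-bit unsigned range stated here and with the -16 to 15 range used earlier is to add an offset of 16, take the floor, and saturate at 0 and 31. The sketch below assumes exactly that; both the formula and the name `to_unsigned_5bit` are illustrative assumptions, not the equation of the patent.

```python
import math

def to_unsigned_5bit(z):
    """Map each element to 0..31 as floor(z_i + 16), saturating at both ends.

    Assumed mapping; the patent's own equation is not available in this text.
    """
    return [max(0, min(31, math.floor(v + 16))) for v in z]

codes = to_unsigned_5bit([-20.0, -16.0, 0.0, 3.7, 15.9, 40.0])
```

Under this assumption an element equal to zero maps to the mid-range code 16, and anything below -16 or above the top of the range saturates at 0 or 31.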

The quantized elements are stored as the FourierFeature array. FourierFeature[0] represents the first quantized element w0f, and FourierFeature[numOfFourierFeature-1] corresponds to the numOfFourierFeature-th element, that is, the element wf with index numOfFourierFeature-1.

● CentralFourierFeature
This element represents a facial feature based on the hierarchical LDA of the Fourier characteristics of the central part of the normalized face image. CentralFourierFeature is extracted in the same way as FourierFeature.

The central part g(x, y) is obtained by cutting out a region of 32x32 pixels from the image f(x, y), starting at the point (7, 12), as given by the following equation.

STEP 1) Extraction of the Fourier spectrum vector
The Fourier spectrum G(u, v) of g(x, y) is calculated by (Equation 29).

Here, M = 32 and N = 32. The 256-dimensional Fourier spectrum vector x1g is obtained by scanning the Fourier spectrum G(u, v) as defined in FIG. 20.

STEP 2) Extraction of the multi-block Fourier intensity vector
The multi-block Fourier intensity vector x2g is extracted from the Fourier intensities of (a) the central part g10(x, y), (b) the quarter images gk1(x, y) (k = 1, 2, 3, 4), and (c) the one-sixteenth images gk2(x, y) (k = 1, 2, 3, ..., 16).

(a) Central part

(b) Quarter images

Here, sk1 = (k-1) % 2 and tk1 = (k-1) / 2.

(c) One-sixteenth images

Here, sk2 = (k-1) % 4 and tk2 = (k-1) / 4.

The Fourier intensity |Gkj(u, v)| of each image is calculated as in (Equation 33).

Here, M^0 = 32, M^1 = 16, M^2 = 8, N^0 = 32, N^1 = 16, and N^2 = 8. The multi-block Fourier intensity vector x_2^g is obtained by scanning each intensity |G_k^j(u, v)| as defined in FIG. 20.
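As an illustration of the intensity computation for one block, the following sketch evaluates a direct two-dimensional discrete Fourier transform and takes its magnitude. It assumes the standard DFT definition with a negative exponent and the usual magnitude sqrt(Re^2 + Im^2); the exact form of Equation (33) and the scan order of FIG. 20 are not reproduced here. The function name `fourier_intensity` is hypothetical.

```python
import cmath
import math


def fourier_intensity(block):
    """Return |G(u, v)| for a block given as a list of rows (direct DFT)."""
    n_rows, n_cols = len(block), len(block[0])
    intensity = [[0.0] * n_cols for _ in range(n_rows)]
    for v in range(n_rows):
        for u in range(n_cols):
            acc = 0j
            for y in range(n_rows):
                for x in range(n_cols):
                    angle = -2.0 * math.pi * (x * u / n_cols + y * v / n_rows)
                    acc += block[y][x] * cmath.exp(1j * angle)
            # Magnitude from the real and imaginary parts.
            intensity[v][u] = math.hypot(acc.real, acc.imag)
    return intensity


# A constant 8 x 8 block has all of its energy in the DC term.
block = [[1.0] * 8 for _ in range(8)]
intensity = fourier_intensity(block)
print(round(intensity[0][0], 6))  # prints: 64.0
```

The direct summation is chosen for readability. For the block sizes given here (32, 16, and 8 pixels per side) it is adequate, although a fast Fourier transform would normally be used in practice.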

The processing of STEP 3 to STEP 5 is the same as for FourierFeature. The basis matrices Ψ_1^g, Ψ_2^g, Ψ_3^g and the mean vectors m_1^g, m_2^g for CentralFourierFeature are likewise calculated in advance and looked up from prepared tables.
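The way the basis matrices and mean vectors are applied can be sketched as a two-stage projection. The sketch assumes that each vector has its mean subtracted before projection, that "normalizing" means scaling to unit length, and that the two normalized vectors are concatenated before the projection by the third basis matrix. The quantization step is omitted, and all function names and the toy matrices are hypothetical.

```python
import math


def project(basis, vector, mean=None):
    """Compute basis^T (vector - mean); basis has one row per input dimension."""
    if mean is not None:
        vector = [v - m for v, m in zip(vector, mean)]
    n_out = len(basis[0])
    return [sum(basis[i][j] * vector[i] for i in range(len(vector)))
            for j in range(n_out)]


def normalize(vector):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)


def hierarchical_projection(x1, x2, psi1, psi2, psi3, m1, m2):
    y1 = normalize(project(psi1, x1, m1))  # first-stage projection of x1
    y2 = normalize(project(psi2, x2, m2))  # first-stage projection of x2
    return project(psi3, y1 + y2)          # second stage on the concatenation


# Toy values chosen only to make the arithmetic easy to follow.
x1, m1 = [3.0, 4.0, 1.0], [0.0, 0.0, 1.0]
x2, m2 = [2.0, 0.0], [1.0, 0.0]
psi1 = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
psi2 = [[1.0], [0.0]]
psi3 = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
z = hierarchical_projection(x1, x2, psi1, psi2, psi3, m1, m2)
print(z)
```

In an actual implementation the matrices would be the precomputed tables referred to above rather than the toy values used here.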

The size of the CentralFourierFeature array is limited to numOfCentralFourierFeature.

The facial feature description data obtained in this way has a compact description length yet offers high recognition performance, making it an efficient representation for storing and transmitting data.

Note that the present invention may be implemented as a program executable on a computer. In that case, for the fifth embodiment, the invention can be realized by describing the functions shown in steps 1 to 5 of FIG. 17 in a computer-readable program and running that program on a computer. When the example shown in FIG. 17 is configured as an apparatus, all or part of the functions shown in the block diagram of FIG. 21 may be implemented.

11: First linear transformation means
12: Second linear transformation means
13: Third linear transformation means
14: First discriminant matrix storage means
15: Second discriminant matrix storage means
16: Third discriminant matrix storage means

Claims

A pattern feature extraction method for extracting facial features from an input normalized image, comprising:
obtaining a Fourier spectrum vector by calculating a Fourier spectrum of the input normalized face image using a predetermined calculation formula;
extracting a multi-block Fourier intensity vector from the Fourier intensities of partial images of the normalized image;
projecting the feature vectors using discriminant matrices obtained by linear discriminant analysis of the Fourier spectrum vector and the multi-block Fourier intensity vector, to obtain respective normalized vectors; and
concatenating the respective normalized vectors into a combined Fourier vector, projecting the combined Fourier vector using a second discriminant matrix obtained by linear discriminant analysis of the combined Fourier vector, and obtaining the resulting projection vector.

The pattern feature extraction method according to claim 1, wherein the step of obtaining the Fourier spectrum vector applies a Fourier transform to the input normalized image and extracts, as the Fourier spectrum vector, a vector whose components are obtained by sampling the real and imaginary components of the resulting Fourier spectrum.

The pattern feature extraction method according to claim 1, wherein the step of extracting the multi-block Fourier intensity vector divides the input normalized image into a plurality of partial images, calculates the Fourier intensity of each partial image, and extracts, as the multi-block Fourier intensity vector, a vector whose elements are the Fourier intensities of the partial images.

The pattern feature extraction method according to any one of claims 1 to 3, wherein the step of obtaining the normalized vectors comprises:
projecting the Fourier spectrum vector using a discriminant matrix obtained by linear discriminant analysis of the principal components of the Fourier spectrum vector, and normalizing the magnitude of the projected vector; and
projecting the multi-block Fourier intensity vector using a discriminant matrix obtained by linear discriminant analysis of the principal components of the multi-block Fourier intensity vector, and normalizing the magnitude of the projected vector.

A pattern feature extraction apparatus for extracting facial features from an input normalized image, comprising:
Fourier spectrum vector generation means for obtaining a Fourier spectrum vector by calculating a Fourier spectrum of the input normalized face image using a predetermined calculation formula;
multi-block Fourier intensity vector extraction means for extracting a multi-block Fourier intensity vector from the Fourier intensities of partial images of the normalized image;
normalized vector generation means for projecting the feature vectors using discriminant matrices obtained by linear discriminant analysis of the Fourier spectrum vector and the multi-block Fourier intensity vector, to obtain respective normalized vectors; and
projection vector generation means for concatenating the respective normalized vectors into a combined Fourier vector, projecting the combined Fourier vector using a second discriminant matrix obtained by linear discriminant analysis of the combined Fourier vector, and obtaining the resulting projection vector.

The pattern feature extraction apparatus according to claim 5, wherein the Fourier spectrum vector generation means applies a Fourier transform to the input normalized image and extracts, as the Fourier spectrum vector, a vector whose components are obtained by sampling the real and imaginary components of the resulting Fourier spectrum.

The pattern feature extraction apparatus according to claim 5, wherein the multi-block Fourier intensity vector extraction means divides the input normalized image into a plurality of partial images, calculates the Fourier intensity of each partial image, and extracts, as the multi-block Fourier intensity vector, a vector whose elements are the Fourier intensities of the partial images.

The pattern feature extraction apparatus according to claim 5, wherein the normalized vector generation means comprises:
means for projecting the Fourier spectrum vector using a discriminant matrix obtained by linear discriminant analysis of the principal components of the Fourier spectrum vector, and normalizing the magnitude of the projected vector; and
means for projecting the multi-block Fourier intensity vector using a discriminant matrix obtained by linear discriminant analysis of the principal components of the multi-block Fourier intensity vector, and normalizing the magnitude of the projected vector.

A pattern feature extraction program for causing a computer to execute:
a process of obtaining a Fourier spectrum vector by calculating a Fourier spectrum of an input normalized face image using a predetermined calculation formula;
a process of extracting a multi-block Fourier intensity vector from the Fourier intensities of partial images of the normalized image;
a process of projecting the feature vectors using discriminant matrices obtained by linear discriminant analysis of the Fourier spectrum vector and the multi-block Fourier intensity vector, to obtain respective normalized vectors; and
a process of concatenating the respective normalized vectors into a combined Fourier vector, projecting the combined Fourier vector using a second discriminant matrix obtained by linear discriminant analysis of the combined Fourier vector, and obtaining the resulting projection vector.

The pattern feature extraction program according to claim 9, wherein the process of obtaining the Fourier spectrum vector applies a Fourier transform to the input normalized image and extracts, as the Fourier spectrum vector, a vector whose components are obtained by sampling the real and imaginary components of the resulting Fourier spectrum.

The pattern feature extraction program according to claim 9, wherein the process of extracting the multi-block Fourier intensity vector divides the input normalized image into a plurality of partial images, calculates the Fourier intensity of each partial image, and extracts, as the multi-block Fourier intensity vector, a vector whose elements are the Fourier intensities of the partial images.

The pattern feature extraction program according to any one of claims 9 to 11, wherein the process of obtaining the normalized vectors comprises:
a process of projecting the Fourier spectrum vector using a discriminant matrix obtained by linear discriminant analysis of the principal components of the Fourier spectrum vector, and normalizing the magnitude of the projected vector; and
a process of projecting the multi-block Fourier intensity vector using a discriminant matrix obtained by linear discriminant analysis of the principal components of the multi-block Fourier intensity vector, and normalizing the magnitude of the projected vector.