JP5151522B2 - Arithmetic apparatus and program

Info

Publication number: JP5151522B2
Application number: JP2008029073A
Authority: JP
Inventors: 誠坂井
Original assignee: Denso Corp
Current assignee: Denso Corp
Priority date: 2007-02-09
Filing date: 2008-02-08
Publication date: 2013-02-27
Anticipated expiration: 2028-02-08
Also published as: JP2008269573A

Description

The present invention relates to an arithmetic device and a program for calculating a dimension reduction matrix used in a pattern recognition device.

Conventionally, pattern recognition devices are known that extract the features of an input pattern to generate a feature vector, reduce the dimension of this feature vector to obtain a feature vector (or scalar quantity) effective for pattern recognition, and compare the reduced value against the reference value (prototype) of each class registered in a database in advance to evaluate the degree of matching, thereby classifying the input pattern into one of a plurality of classes.

When a device recognizes an input pattern, various feature quantities are extracted from the pattern and a feature vector is generated with these as its elements, as described above. If pattern recognition is performed with a high-dimensional feature vector, however, the amount of information is so large that recognition takes a very long time. In addition, if pattern recognition is performed with a feature vector that contains information unnecessary for recognition, the recognition accuracy may deteriorate. On the other hand, if pattern recognition is performed with a feature vector that lacks information necessary for recognition, a good recognition result naturally cannot be obtained.

For this reason, conventionally, a high-dimensional feature vector is first generated, and a dimension reduction matrix is obtained such that, when the feature vectors are mapped to a low-dimensional space, they separate by class and the feature vectors of the same class concentrate in a specific region. Applying this dimension reduction matrix to a feature vector reduces its dimension so that the reduced value represents the features of the pattern well.

If feature vectors are reduced in dimension by such a dimension reduction method, the feature vectors of each class separate in the low-dimensional space and the feature vectors of the same class take similar values, so the class to which a feature vector belongs can be determined efficiently and accurately.

Specifically, dimension reduction methods using the linear discriminant analysis method or the unequal variance (heteroscedastic) discriminant analysis method are conventionally known.

In dimension reduction by the linear discriminant analysis method, a plurality of feature vectors of patterns belonging to each class are first prepared as samples for each class. From these sample groups, the mean vector μ_i of each class and the mean vector μ₀ of all classes are obtained.

Here, μ_i denotes the mean vector of the i-th class, and i takes integer values from 1 to L, where L is the total number of classes. N is the number of samples in all classes, N_i is the number of samples in the i-th class, and X_ij is the vector value of the j-th sample (j = 1, 2, ..., N_i) of the i-th class.

Note that x_1, ..., x_d denote the values of the elements of the d-dimensional feature vector, and the superscript T denotes transposition.
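In standard notation, the mean vectors and the feature vector described above can be written as follows. This is a sketch reconstructed from the symbol definitions in the text (plain sample means over N_i and N samples are assumed), not a reproduction of the original typeset equations.

```latex
\mu_i = \frac{1}{N_i} \sum_{j=1}^{N_i} X_{ij}, \qquad
\mu_0 = \frac{1}{N} \sum_{i=1}^{L} \sum_{j=1}^{N_i} X_{ij}, \qquad
N = \sum_{i=1}^{L} N_i, \qquad
X_{ij} = (x_1, \ldots, x_d)^T
```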

Then, using the mean vector μ₀ and the mean vector μ_i of each class, the inter-class covariance matrix B and the covariance matrix W_i of each class are obtained.

Here, P_i is the prior probability of the i-th class and can be set to P_i = N_i / N. W_i denotes the covariance matrix of the i-th class.
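A minimal sketch of how B and W_i could be computed from labelled samples, in plain Python. The function names (`class_statistics` and its helpers), the weighting of B by P_i = N_i / N, and the 1/N_i normalisation of W_i are assumptions made for illustration; the text here does not fix these details.

```python
from typing import Dict, List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def mean_vector(samples: Sequence[Vector]) -> Vector:
    """Component-wise average of a non-empty list of equal-length vectors."""
    n = len(samples)
    return [sum(x[k] for x in samples) / n for k in range(len(samples[0]))]


def outer(u: Vector, v: Vector) -> Matrix:
    """Outer product u v^T."""
    return [[ui * vj for vj in v] for ui in u]


def add_scaled(acc: Matrix, m: Matrix, scale: float) -> None:
    """In-place update acc += scale * m."""
    for r in range(len(acc)):
        for c in range(len(acc[r])):
            acc[r][c] += scale * m[r][c]


def class_statistics(classes: Dict[int, List[Vector]]):
    """Return (mu0, mu, B, W) for samples grouped by class label.

    Assumed definitions (hypothetical, standard LDA conventions):
      B   = sum_i P_i (mu_i - mu0)(mu_i - mu0)^T, with P_i = N_i / N
      W_i = (1 / N_i) sum_j (X_ij - mu_i)(X_ij - mu_i)^T
    """
    d = len(next(iter(classes.values()))[0])
    n_total = sum(len(s) for s in classes.values())
    mu0 = mean_vector([x for s in classes.values() for x in s])
    mu: Dict[int, Vector] = {}
    W: Dict[int, Matrix] = {}
    B: Matrix = [[0.0] * d for _ in range(d)]
    for i, samples in classes.items():
        mu[i] = mean_vector(samples)
        prior = len(samples) / n_total  # P_i = N_i / N
        diff = [a - b for a, b in zip(mu[i], mu0)]
        add_scaled(B, outer(diff, diff), prior)
        W[i] = [[0.0] * d for _ in range(d)]
        for x in samples:
            dx = [a - b for a, b in zip(x, mu[i])]
            add_scaled(W[i], outer(dx, dx), 1.0 / len(samples))
    return mu0, mu, B, W
```

For two classes whose means differ only along the second axis, the resulting B has a single non-zero entry on that axis, which is the direction a discriminant projection would favour.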

After the inter-class covariance matrix B and the covariance matrix W_i of each class have been obtained as described above, the intra-class covariance matrix W is taken to be the arithmetic mean of the covariance matrices W_i of the classes, and the dimension reduction matrix A that maximizes the ratio |B| / |W| of the inter-class variance to the intra-class variance in the evaluation space (low-dimensional space) is obtained. Specifically, the evaluation function J(A) is defined as follows, and the dimension reduction matrix A that maximizes J(A) is obtained.

For example, when a d-dimensional feature vector X_ij is mapped to a d'-dimensional space to reduce its dimension, a matrix with d rows and d' columns that maximizes the evaluation function J(A) is obtained as the dimension reduction matrix A (see, for example, Non-Patent Document 1).

The inter-class variance represents the degree of variation between classes, and the intra-class variance represents the degree of variation of the samples within a class. Therefore, obtaining the matrix A as described above yields a dimension reduction matrix A with which the feature vectors separate by class in the low-dimensional space and the feature vectors of the same class concentrate in a specific region of that space. Conventional dimension reduction by the linear discriminant analysis method obtains the dimension reduction matrix A optimal for pattern recognition in this way.

In dimension reduction by the unequal variance discriminant analysis method, on the other hand, the intra-class covariance matrix W is taken to be the geometric mean of the covariance matrices W_i of the classes, and the dimension reduction matrix A that maximizes the ratio |B| / |W| of the inter-class variance to the intra-class variance in the evaluation space (low-dimensional space) is obtained. Specifically, the evaluation function J(A) is defined as follows, and the dimension reduction matrix A that maximizes J(A) is obtained (see, for example, Non-Patent Document 2).
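The two conventional evaluation functions contrasted here can be sketched in standard notation as follows. These are the usual textbook forms for linear discriminant analysis and for its geometric-mean variant; weighting the class covariances by the priors P_i is an assumption, since the original typeset equations are not available in the text.

```latex
% Linear discriminant analysis: arithmetic mean of the class covariances
W = \sum_{i=1}^{L} P_i W_i, \qquad
J(A) = \frac{\lvert A^T B A \rvert}{\lvert A^T W A \rvert}

% Unequal variance discriminant analysis: geometric mean of the projected class covariances
J(A) = \frac{\lvert A^T B A \rvert}{\prod_{i=1}^{L} \lvert A^T W_i A \rvert^{P_i}}
```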

In this way, conventional dimension reduction using the unequal variance discriminant analysis method obtains the dimension reduction matrix A optimal for pattern recognition.

Non-Patent Document 1: Keinosuke Fukunaga, "Introduction to Statistical Pattern Recognition", 2nd edition, (USA), Academic Press, 1990, pp. 441-459
Non-Patent Document 2: Saon and three others, "Maximum likelihood discriminant feature space", International Conference on Acoustics, Speech, and Signal Processing, 2000, pp. 129-132

However, conventional dimension reduction methods sometimes fail to yield a dimension reduction matrix A that is appropriate enough for accurate pattern recognition. That is, conventional methods sometimes cannot obtain a dimension reduction matrix A with which the feature vectors separate by class in the low-dimensional space, and the feature vectors of the same class concentrate in a specific region of that space, to a degree sufficient for accurate pattern recognition.

For example, when a two-dimensional feature vector is reduced to one dimension and there are three pattern classes, it is preferable that the feature vectors of the three classes concentrate, class by class, in non-overlapping regions of the one-dimensional space, as shown in the lower part of FIG. 5(a). With dimension reduction by the linear discriminant analysis method, however, the distribution of the feature vectors in the one-dimensional space sometimes overlaps heavily between different classes, as shown in FIG. 5(b).

Moreover, for a set of feature vectors with such properties, even if the dimension reduction matrix A is obtained by the unequal variance discriminant analysis method, the distribution of the feature vectors in the one-dimensional space sometimes overlaps heavily between different classes, as with the linear discriminant analysis method (see FIG. 5(c)). Thus, conventional methods sometimes cannot reduce the dimension of the feature vectors appropriately, so that sufficient pattern recognition accuracy is not obtained.

The present invention has been made in view of these problems, and its object is to provide an arithmetic device and a program capable of calculating a dimension reduction matrix A suitable for pattern recognition.

The present invention, made to achieve this object, concerns the calculation of the dimension reduction matrix used in a pattern recognition device that applies a predetermined dimension reduction matrix to the feature vector of an externally input recognition target pattern to map the feature vector to a low-dimensional space, and classifies the input pattern into one of a plurality of predetermined classes based on the value of the feature vector after dimension reduction. In calculating this matrix, the intra-class variance is defined by a generalized mean, and the dimension reduction matrix A is obtained using the following evaluation function J₁(A, m) or evaluation function J₂(A, m).

Specifically, the arithmetic device of the present invention includes sample acquisition means for acquiring, for each of the plurality of classes classified by the pattern recognition device, a plurality of feature vectors of patterns belonging to the class as samples. Based on the sample group of each class acquired by the sample acquisition means, the arithmetic device obtains the inter-class covariance matrix B with inter-class variance calculation means. The inter-class variance calculation means can be configured to obtain the inter-class covariance matrix B according to the following equation.

Here, μ_i is the mean vector of the i-th class and μ₀ is the mean vector of all classes. The index i takes integer values from 1 to L, where L is the total number of classes. In addition, N is the number of samples in all classes, N_i is the number of samples in the i-th class, and X_ij is the vector value of the j-th sample (j = 1, 2, ..., N_i) of the i-th class. Further, x_1, ..., x_d denote the values of the elements of the d-dimensional feature vector, and the superscript T denotes transposition.

The arithmetic device of the present invention also calculates the covariance matrix W_i of each class with class variance calculation means, based on the sample group of each class acquired by the sample acquisition means. Here, W_i denotes the covariance matrix of the i-th class.

In addition, the arithmetic device of the present invention includes input receiving means for receiving input of a real value as the value m that defines the intra-class variance, and through this input receiving means acquires the definition information (value m) of the intra-class variance desired by the user of the arithmetic device.

Then, based on the value m input through the input receiving means, the inter-class covariance matrix B calculated by the inter-class variance calculation means, and the covariance matrix W_i of each class calculated by the class variance calculation means, the arithmetic device uses matrix calculation means to calculate the dimension reduction matrix A that maximizes the evaluation function J₁(A, m) defined by equation (1) or the evaluation function J₂(A, m) defined by equation (2). Note that P_i is the prior probability of the i-th class and can be set to P_i = N_i / N.

In this way, the arithmetic device of the present invention obtains the dimension reduction matrix A and outputs it to the outside through output means.

In the arithmetic device configured in this way, when m = 1 is input, the intra-class variance is defined by the arithmetic mean, so the dimension reduction matrix A is obtained by the same method as the linear discriminant analysis method. When m = 0 is input, the intra-class variance is defined by the geometric mean, so the dimension reduction matrix A is obtained by the same method as the well-known unequal variance discriminant analysis method. When a value of m other than 0 or 1 is input, the dimension reduction matrix A is obtained by a novel method.
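The role of m can be illustrated on scalars with the weighted generalized (power) mean, which reduces to the arithmetic mean at m = 1 and to the geometric mean in the limit m → 0. The sketch below is only that scalar illustration; the function name `generalized_mean` is hypothetical, and how the patent applies the mean to covariance matrices is not shown here.

```python
import math
from typing import Sequence


def generalized_mean(values: Sequence[float], weights: Sequence[float], m: float) -> float:
    """Weighted power mean of positive values.

    m = 1 gives the arithmetic mean; m = 0 is defined as the limit,
    which is the weighted geometric mean.
    """
    if any(v <= 0 for v in values):
        raise ValueError("values must be positive")
    total = sum(weights)
    w = [wi / total for wi in weights]  # normalise weights to sum to 1
    if m == 0:
        return math.exp(sum(wi * math.log(v) for wi, v in zip(w, values)))
    return sum(wi * v ** m for wi, v in zip(w, values)) ** (1.0 / m)
```

With equal weights on the values 1 and 4, m = 1 gives 2.5 and m = 0 gives 2.0; larger m pulls the result toward the largest value, and negative m toward the smallest.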

The present inventor examined the problem that the linear discriminant analysis method and the unequal variance discriminant analysis method do not always yield an appropriate dimension reduction matrix A, and found that this problem can be resolved by defining the intra-class variance by a generalized mean. For example, even in cases such as those of FIGS. 5(b) and 5(c), where an appropriate dimension reduction matrix A cannot be obtained by the linear discriminant analysis method or the unequal variance discriminant analysis method, setting the parameter m to a value other than 0 or 1, for example m = 10, and obtaining the dimension reduction matrix A that maximizes the evaluation function J₁(A, m) of equation (1) or the evaluation function J₂(A, m) of equation (2) yields a distribution suitable for pattern recognition, such as that in the lower part of FIG. 5(a).

Based on this finding, the present inventor configured the arithmetic device to acquire the value of the parameter m from outside, calculate the dimension reduction matrix A according to the evaluation function J₁(A, m) of equation (1) or the evaluation function J₂(A, m) of equation (2), and output this dimension reduction matrix A. With such an arithmetic device, the designer of a pattern recognition device can input a plurality of values for the parameter m, obtain a plurality of dimension reduction matrices A from the arithmetic device, and evaluate the distribution of the samples in the low-dimensional space when each of these matrices is applied to the samples, thereby finding the dimension reduction matrix A best suited to pattern recognition.
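The sweep over m described above can be sketched for the simplest case, reducing two dimensions to one. With d' = 1 the projected class covariances a^T W_i a are scalars, so a weighted power mean of scalars is used as the intra-class term. This form of the evaluation function, the grid search over directions, and all names (`best_direction`, `sweep`) are assumptions for illustration, not the patent's equations (1) or (2).

```python
import math
from typing import Dict, List, Sequence, Tuple

Matrix = List[List[float]]
Direction = Tuple[float, float]


def quad(a: Direction, M: Matrix) -> float:
    """Quadratic form a^T M a for a 2-vector a and a 2x2 matrix M."""
    return sum(a[r] * M[r][c] * a[c] for r in range(2) for c in range(2))


def power_mean(values: Sequence[float], weights: Sequence[float], m: float) -> float:
    """Weighted power mean (weights sum to 1); m = 0 is the geometric-mean limit."""
    if m == 0:
        return math.exp(sum(w * math.log(v) for w, v in zip(weights, values)))
    return sum(w * v ** m for w, v in zip(weights, values)) ** (1.0 / m)


def best_direction(B: Matrix, W: Sequence[Matrix], priors: Sequence[float],
                   m: float, steps: int = 1800) -> Tuple[Direction, float]:
    """Grid-search the unit vector a maximising (a^T B a) / power_mean(a^T W_i a).

    Assumes every W_i is positive definite so the denominator stays positive.
    """
    best_a, best_score = (1.0, 0.0), -math.inf
    for k in range(steps):
        theta = math.pi * k / steps  # directions a and -a are equivalent
        a = (math.cos(theta), math.sin(theta))
        within = power_mean([quad(a, Wi) for Wi in W], priors, m)
        score = quad(a, B) / within
        if score > best_score:
            best_a, best_score = a, score
    return best_a, best_score


def sweep(B: Matrix, W: Sequence[Matrix], priors: Sequence[float],
          ms: Sequence[float]) -> Dict[float, Tuple[Direction, float]]:
    """Return the best direction and its score for each candidate value of m."""
    return {m: best_direction(B, W, priors, m) for m in ms}
```

Different values of m can select different projection directions from the same B and W_i, which is why the designer compares the resulting low-dimensional distributions rather than the raw scores.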

Therefore, using the arithmetic device of the present invention makes it possible to construct a pattern recognition device that is more suitable than conventional ones.

The arithmetic device may further include dimension lowering means that applies the dimension reduction matrix A calculated by the matrix calculation means to each sample acquired by the sample acquisition means and calculates the value of each sample after dimension reduction, with the output means configured to output the calculated dimension reduction matrix A together with information representing the calculation result of the dimension lowering means (claim 2). Examples of the "information representing the calculation result of the dimension lowering means" include the values of the samples after dimension reduction themselves and statistical information on them (such as information representing the distribution of the samples in the low-dimensional space).

With the arithmetic device configured in this way, the designer of the pattern recognition device can obtain from the arithmetic device the information needed to set the dimension reduction matrix best suited to pattern recognition, without separately applying the dimension reduction matrix A to the samples to reduce the dimension of each sample. Therefore, according to the present invention, the dimension reduction matrix A can be determined efficiently when designing a pattern recognition device.

More specifically, the output means may be configured to output a data file describing the value of each sample after dimension reduction, as obtained by the dimension lowering means, together with the dimension reduction matrix A, or to output to a screen, through a display device, an image graphing the distribution of the samples in the low-dimensional space.

In addition, the arithmetic device is preferably provided with evaluation means for evaluating the quality of the dimension reduction matrix A calculated by the matrix calculation means, based on the values of the samples after dimension reduction (claim 3). For example, if the arithmetic device is configured to output the evaluation result of the evaluation means to the user through the output means, the user can be led to construct a pattern recognition device using an appropriate dimension reduction matrix A.

The evaluation means can be configured to calculate, for each combination of classes, the Chernoff upper bound CB_ik, which represents the degree of overlap between the sample groups of the i-th class and the k-th class after dimension reduction by the dimension reduction matrix A calculated by the matrix calculation means, and to evaluate the quality of that dimension reduction matrix A based on the calculated Chernoff upper bounds CB_ik (claim 4). The bound is calculated from the covariance matrix W_i' of each class (where W_i' denotes the covariance matrix of the i-th class after dimension reduction) and the mean vector μ_i' (where μ_i' denotes the mean vector of the samples belonging to the i-th class after dimension reduction), both obtained from the values of the samples after dimension reduction calculated by the dimension lowering means, according to an expression in which s is a weighting coefficient and P_i is the prior probability of the i-th class. The Chernoff upper bound is also commonly called the "Chernoff bound".

The smaller the Chernoff upper bound CB_ik, the smaller the overlap between the sample groups of the i-th class and the k-th class. Therefore, the quality of a dimension reduction matrix A can be evaluated appropriately by rating it more highly the smaller its Chernoff upper bounds CB_ik are.

In addition, the evaluation means can be configured to calculate the sum CB of the Chernoff upper bounds CB_ik as an evaluation value CB representing the quality of the dimension reduction matrix A calculated by the matrix calculation means (claim 5). With the evaluation means configured in this way, a smaller evaluation value CB indicates that the dimension reduction matrix A is more appropriate for the pattern recognition device. Therefore, by, for example, notifying the user of the evaluation value CB together with the dimension reduction matrix A, the user can be led to construct a pattern recognition device using an appropriate dimension reduction matrix A.
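As a rough illustration of such an evaluation value, the sketch below computes a Chernoff-type bound for each pair of classes after reduction to one dimension and sums the results. It assumes each class is Gaussian in the reduced space, uses the convention P_i^s P_k^(1-s) ∫ p_i(y)^s p_k(y)^(1-s) dy with 0 < s < 1, and sums over unordered pairs; these choices and the names `chernoff_bound_1d` and `total_bound` are assumptions, and the patent's own expression may differ in convention.

```python
import math
from itertools import combinations
from typing import Dict, Tuple

# (mean, variance, prior) of one class after reduction to one dimension
ClassStats = Tuple[float, float, float]


def chernoff_bound_1d(ci: ClassStats, ck: ClassStats, s: float = 0.5) -> float:
    """P_i^s P_k^(1-s) times the integral of p_i^s p_k^(1-s) for 1-D Gaussians.

    Requires 0 < s < 1 and positive variances. s = 0.5 is the symmetric
    (Bhattacharyya) case.
    """
    mu_i, var_i, p_i = ci
    mu_k, var_k, p_k = ck
    mixed = (1.0 - s) * var_i + s * var_k
    exponent = (
        0.5 * s * (1.0 - s) * (mu_i - mu_k) ** 2 / mixed
        + 0.5 * math.log(mixed / (var_i ** (1.0 - s) * var_k ** s))
    )
    return p_i ** s * p_k ** (1.0 - s) * math.exp(-exponent)


def total_bound(stats: Dict[int, ClassStats], s: float = 0.5) -> float:
    """Sum of the pairwise bounds over all unordered class pairs (smaller is better)."""
    return sum(
        chernoff_bound_1d(stats[i], stats[k], s)
        for i, k in combinations(sorted(stats), 2)
    )
```

Moving the class means further apart, or shrinking the class variances, lowers the total, which matches the reading that a smaller evaluation value indicates less overlap.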

More specifically, the input receiving means is preferably configured to be able to acquire a plurality of values as the value m that defines the intra-class variance; the matrix calculation means is configured to calculate, for each value m acquired by the input receiving means, the dimension reduction matrix A that maximizes the evaluation function J₁(A, m) or the evaluation function J₂(A, m); and the evaluation means is configured to evaluate, for each value m acquired by the input receiving means, the quality of the dimension reduction matrix A calculated by the matrix calculation means. The output means is then preferably configured to selectively output, according to the evaluation results of the evaluation means, the dimension reduction matrix A with the best evaluation among the plurality of dimension reduction matrices A calculated by the matrix calculation means (claim 6).

With an arithmetic device configured in this way, the dimension reduction matrix A corresponding to the best of the plurality of values m input by the user can be selectively reported to the user, enabling the user to construct a pattern recognition device with high recognition accuracy.
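The selection step can be sketched as picking, among candidate matrices keyed by m, the one with the smallest evaluation value. The function `select_best` and the callable `evaluate` are hypothetical names; how each candidate A is computed and scored is left to the caller.

```python
from typing import Callable, Dict, Tuple, TypeVar

A = TypeVar("A")  # whatever type represents a dimension reduction matrix


def select_best(candidates: Dict[float, A],
                evaluate: Callable[[A], float]) -> Tuple[float, A, float]:
    """Return (m, matrix, score) for the candidate with the smallest score.

    `candidates` maps each input value m to the matrix computed for it;
    `evaluate` returns an evaluation value where smaller means better.
    """
    if not candidates:
        raise ValueError("no candidates to choose from")
    scores = {m: evaluate(matrix) for m, matrix in candidates.items()}
    best_m = min(scores, key=scores.get)
    return best_m, candidates[best_m], scores[best_m]
```

Keeping the scoring function as a parameter lets the same loop serve whichever evaluation value the designer prefers.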

In addition, the functions of the respective means of the arithmetic device of the present invention can be realized on a computer by a program. A program for causing a computer to realize the functions of the arithmetic device of the present invention can be provided recorded on a computer-readable recording medium such as a CD-ROM.

Embodiments of the present invention are described below with reference to the drawings. The present invention is not, however, limited to the embodiments described below and can take various forms.
[First embodiment]
FIG. 1 is a block diagram showing the configuration of the information processing apparatus 1 of the first embodiment. As shown in FIG. 1, the information processing apparatus 1 of this embodiment includes a CPU 11 that performs various arithmetic processes, a ROM 13 that stores a boot program and the like, a RAM 15 used as a work area during program execution, a hard disk device (HDD) 17, a user interface 19 composed of a keyboard, a pointing device, and the like, a display device 21 composed of a liquid crystal display or the like, and a drive device 23 for reading data recorded on a recording medium (such as a magnetic disk or an optical disk).

The hard disk device 17 stores an operating system and various application software. When started, the information processing apparatus 1 of this embodiment operates according to the operating system installed on the hard disk device 17 and executes the various application software in accordance with command signals input from the user interface 19.

Specifically, in the information processing apparatus 1 of this embodiment, a tool for designing the dimension reduction matrix A is installed on the hard disk device 17 as application software. When an execution command for the tool is input through the user interface 19, the CPU 11 executes the dimension reduction matrix calculation process in accordance with the tool installed on the hard disk device 17 (see FIG. 3).

The tool calculates the dimension reduction matrix A to be set in a pattern recognition device that applies a preset dimension reduction matrix A to the feature vector X of an externally input recognition target pattern to map the feature vector X to a low-dimensional space S_y, and classifies the pattern into one of a plurality of predetermined classes based on the value Y of the feature vector X after dimension reduction.

Known pattern recognition devices of this type include one that takes the pattern of a user's utterance input from a microphone as the recognition target pattern, extracts speech features from that utterance pattern, and generates a feature vector X, and one that takes an image pattern of a subject input from a camera as the recognition target pattern, extracts image features from that image pattern, and generates a feature vector X. FIG. 2 is a block diagram showing the basic configuration of a pattern recognition device 50 of this type.

As shown in FIG. 2, the pattern recognition device 50 includes a vector generation unit 51, a dimension reduction unit 53, and a pattern recognition unit 55. The vector generation unit 51 generates, from the recognition target pattern, a feature vector X = (x_1, x_2, ..., x_d)^T of a predetermined dimension (here, d dimensions). The dimension reduction unit 53 then reduces the dimension of this feature vector X according to the preset dimension reduction matrix A.

Then, based on the value Y = A^T · X after dimension reduction, the pattern recognition unit 55 classifies the recognition target pattern into one of a plurality of predetermined classes. For example, when the value Y after dimension reduction lies in the region of the i-th class, the pattern recognition unit 55 recognizes the input recognition target pattern as a pattern of the i-th class.
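The projection Y = A^T · X and the subsequent class decision can be sketched as follows. Deciding the class by the nearest prototype in Euclidean distance is an assumption for illustration (the text only says that Y lies in the region of a class), and the names `project` and `classify` are hypothetical.

```python
from typing import Dict, List, Sequence


def project(A: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    """Compute Y = A^T x, where A has d rows and d' columns and x has d elements."""
    if len(A) != len(x):
        raise ValueError("A must have one row per element of x")
    d_out = len(A[0])
    return [sum(A[r][c] * x[r] for r in range(len(x))) for c in range(d_out)]


def classify(y: Sequence[float], prototypes: Dict[int, Sequence[float]]) -> int:
    """Return the class whose prototype is closest to y (squared Euclidean distance)."""
    return min(
        prototypes,
        key=lambda i: sum((a - b) ** 2 for a, b in zip(y, prototypes[i])),
    )
```

For example, with a 2-row, 1-column matrix A, each two-dimensional feature vector is mapped to a single number, which is then compared against one stored value per class.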

The present embodiment calculates the dimension reduction matrix A to be set in the pattern recognition apparatus 50 configured as above; specifically, it executes the dimension reduction matrix calculation process shown in FIG. 3. FIG. 3 is a flowchart showing the dimension reduction matrix calculation process executed by the CPU 11. FIG. 4(a) is an explanatory diagram showing the configuration of a data file (hereinafter, the "sample data file") that describes samples of the feature vector X and is read from the hard disk device 17 when the dimension reduction matrix calculation process is executed.

When the dimension reduction matrix calculation process is executed, the CPU 11 first displays a file selection screen on the display device 21 and has the user specify the sample data file to be read (S110). When a sample data file is specified through the user interface 19, the CPU 11 reads the specified data file from the hard disk device 17 (S120).

As shown in FIG. 4(a), the sample data file describes the total number L of classes into which the pattern recognition apparatus 50 classifies patterns, the dimension d of the feature vector X, and the dimension d' of the mapping destination. In addition, for each class, it describes a plurality of samples of the feature vector X = (x₁, x₂, …, x_d)^T of patterns belonging to that class, each associated with the class identification number c (c = 1, 2, …, L). Hereinafter, the feature vector X of the j-th sample of the i-th class is written X_ij.

That is, in S120 the CPU 11 reads all of the values L, d, d', and X_ij recorded in the sample data file. Such a sample data file is created in advance by the user, for example with document creation software, and recorded on the hard disk device 17.
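The contents read in S120 can be pictured as a small in-memory record. The sketch below is only an assumption about how such data might be held once loaded; the on-disk layout of FIG. 4(a) is not specified in this text, and the class name `SampleData` and its fields are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SampleData:
    """In-memory image of the sample data file (field names are hypothetical)."""
    L: int        # total number of classes
    d: int        # dimension of each feature vector X
    d_prime: int  # dimension d' of the mapping destination
    samples: Dict[int, List[List[float]]] = field(default_factory=dict)

    def add(self, c: int, x: List[float]) -> None:
        """Register one feature vector x under class identification number c."""
        if not 1 <= c <= self.L:
            raise ValueError("class number must be in 1..L")
        if len(x) != self.d:
            raise ValueError("feature vector must have d elements")
        self.samples.setdefault(c, []).append(list(x))

    def count(self, c: int) -> int:
        """Return N_c, the number of samples recorded for class c."""
        return len(self.samples.get(c, []))


data = SampleData(L=2, d=2, d_prime=1)
data.add(1, [0.0, 1.0])
data.add(1, [2.0, 3.0])
data.add(2, [5.0, 5.0])
```

Keying the samples by class number keeps X_ij addressable as `data.samples[i][j - 1]` and makes the per-class counts N_i available without a separate pass.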

When the process in S120 is completed, the CPU 11 displays on the display device 21 an input screen that accepts a real value as the value m defining the intra-class variance, and acquires the value m through the user interface 19 (S130).

When this process is completed, the CPU 11 calculates the mean vector μ_i of each class from the values L, d, and X_ij read from the sample data file (S140). Here μ_i is the mean vector of the i-th class, and i takes integer values from 1 to L. N_i is the total number of samples of the i-th class described in the sample data file.
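The formula used in S140 is not reproduced in this text. A minimal sketch, assuming the usual sample mean μ_i = (1/N_i) Σ_j X_ij, is given below; the function name `class_means` is hypothetical.

```python
from typing import Dict, List


def class_means(samples: Dict[int, List[List[float]]]) -> Dict[int, List[float]]:
    """Return mu_i = (1 / N_i) * sum_j X_ij for every class i (assumed definition)."""
    means = {}
    for i, vectors in samples.items():
        n_i = len(vectors)   # N_i: number of samples in class i
        d = len(vectors[0])  # dimension of the feature vectors
        means[i] = [sum(x[k] for x in vectors) / n_i for k in range(d)]
    return means


# Illustrative samples: class 1 has two vectors, class 2 has one.
samples = {1: [[0.0, 1.0], [2.0, 3.0]], 2: [[5.0, 5.0]]}
means = class_means(samples)
```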

When this process is completed, the CPU 11 calculates the mean vector μ₀ of all classes from the mean vectors μ₁, μ₂, …, μ_L of the individual classes calculated in S140 (S150). Here N is the number of samples over all classes, that is, the sum of N_i (i = 1, 2, …, L).
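Because S150 works from the class means and the counts N_i, one consistent reading is the count-weighted average μ₀ = (1/N) Σ_i N_i μ_i. The sketch below assumes that form; the function name `overall_mean` is hypothetical.

```python
from typing import Dict, List


def overall_mean(means: Dict[int, List[float]], counts: Dict[int, int]) -> List[float]:
    """Return mu_0 = (1 / N) * sum_i N_i * mu_i, with N = sum_i N_i (assumed form)."""
    n_total = sum(counts.values())
    d = len(next(iter(means.values())))
    return [sum(counts[i] * means[i][k] for i in means) / n_total for k in range(d)]


# Illustrative values: class 1 holds three samples, class 2 holds one.
means = {1: [1.0, 2.0], 2: [5.0, 6.0]}
counts = {1: 3, 2: 1}
mu0 = overall_mean(means, counts)
```

Weighting by N_i makes μ₀ equal to the mean of all samples pooled together, which an unweighted average of the class means would not be when the classes differ in size.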

Thereafter, the CPU 11 calculates the inter-class covariance matrix B according to the following equation (S160), where P_i is the prior probability of the i-th class, P_i = N_i / N.
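The equation for B does not survive in this text. A sketch under the standard definition B = Σ_i P_i (μ_i − μ₀)(μ_i − μ₀)^T, which is an assumption consistent with the stated P_i = N_i / N, is shown below; the function name is hypothetical.

```python
from typing import Dict, List


def between_class_covariance(
    means: Dict[int, List[float]], counts: Dict[int, int]
) -> List[List[float]]:
    """Return B = sum_i P_i (mu_i - mu_0)(mu_i - mu_0)^T with P_i = N_i / N."""
    n_total = sum(counts.values())
    d = len(next(iter(means.values())))
    # Overall mean mu_0 as the count-weighted average of the class means.
    mu0 = [sum(counts[i] * means[i][k] for i in means) / n_total for k in range(d)]
    B = [[0.0] * d for _ in range(d)]
    for i, mu_i in means.items():
        p_i = counts[i] / n_total  # prior probability of class i
        diff = [mu_i[k] - mu0[k] for k in range(d)]
        for r in range(d):
            for c in range(d):
                B[r][c] += p_i * diff[r] * diff[c]
    return B


# Two equally sized classes whose means differ by (2, 4).
B = between_class_covariance({1: [0.0, 0.0], 2: [2.0, 4.0]}, {1: 1, 2: 1})
```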

When the process in S160 is completed in this way, the CPU 11 proceeds to S170 and calculates the covariance matrix W_i of each class, where W_i denotes the covariance matrix of the i-th class.
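The per-class covariance of S170 can be sketched as below. The normalization by N_i (rather than N_i − 1) is an assumption, since the equation itself is not reproduced here, and the function name `class_covariance` is hypothetical.

```python
from typing import List


def class_covariance(vectors: List[List[float]]) -> List[List[float]]:
    """Return W_i = (1 / N_i) * sum_j (X_ij - mu_i)(X_ij - mu_i)^T (assumed form)."""
    n_i = len(vectors)
    d = len(vectors[0])
    mu = [sum(x[k] for x in vectors) / n_i for k in range(d)]
    W = [[0.0] * d for _ in range(d)]
    for x in vectors:
        diff = [x[k] - mu[k] for k in range(d)]
        for r in range(d):
            for c in range(d):
                W[r][c] += diff[r] * diff[c] / n_i
    return W


# Two samples lying on the diagonal: both features vary together.
W1 = class_covariance([[0.0, 0.0], [2.0, 2.0]])
```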

When the process in S170 is completed, the CPU 11 proceeds to S180 and obtains the d-row, d'-column dimension reduction matrix A that maximizes the evaluation function J(A, m) given by the following equation.

For example, in S180 the values of the elements of the matrix A are scanned to find the matrix A at which the evaluation function J(A, m) takes a local maximum. When m = 0, specifically, the d-row, d'-column dimension reduction matrix A that maximizes the following evaluation function J(A, 0) is obtained.
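The equations for J(A, m) and J(A, 0) are not reproduced in this text. One form that is consistent with the description (numerator defined by B, intra-class variance defined as a generalized mean of the projected class covariances weighted by the priors P_i, reducing to linear discriminant analysis at m = 1 and to a geometric mean at m = 0) is sketched below. It is an assumed reconstruction, not a quotation of the original equations; |·| denotes the determinant and the power of a matrix is the matrix power.

```latex
J(A, m) = \frac{\left| A^{T} B A \right|}
               {\left| \left( \sum_{i=1}^{L} P_i \left( A^{T} W_i A \right)^{m} \right)^{1/m} \right|},
\qquad
J(A, 0) = \frac{\left| A^{T} B A \right|}
               {\prod_{i=1}^{L} \left| A^{T} W_i A \right|^{P_i}}
```

Under this reading, J(A, 0) is the limit of J(A, m) as m approaches 0, because the generalized mean tends to the weighted geometric mean in that limit.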

When the process in S180 is completed, the CPU 11 applies the obtained dimension reduction matrix A to each sample X_ij and obtains, for each sample, the value Y_ij that results from mapping the sample onto the low-dimensional space S_y (S190).

Thereafter, the CPU 11 generates a data file describing the calculated dimension reduction matrix A, the value of the parameter m used in calculating it, and the values Y_ij after dimension reduction, and writes this file to the directory on the hard disk device 17 from which the sample data file was read (S200). FIG. 4(b) is an explanatory diagram showing the configuration of the data file output to the hard disk device 17 in S200. In this data file the CPU 11 also records, in association with each value Y_ij, the original feature vector X_ij before dimension reduction and the value c of the class to which that feature vector belongs.

When the process in S200 is completed, the CPU 11 displays the calculated dimension reduction matrix A on the screen of the display device 21, together with a graph showing the distribution of the values Y_ij obtained by mapping each sample X_ij onto the low-dimensional space S_y (S205).

Specifically, when d' = 1, the value Y_ij is a scalar, so the horizontal axis represents the coordinate in the one-dimensional space S_y and the vertical axis represents the number of samples mapped to that coordinate; the number of samples mapped to each coordinate of S_y is graphed for each class. For example, a graph of the form shown in the lower part of FIG. 5(a) is displayed. When d' = 2, the value Y_ij is a two-dimensional vector, so the xy plane represents the coordinates of the two-dimensional space S_y and the z axis represents the number of samples mapped to the coordinate (x, y); the number of samples mapped to each coordinate of S_y is graphed.
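For the d' = 1 case, the per-class counts behind such a graph can be sketched as a histogram. Binning the coordinate into intervals of fixed width is an assumption made here for illustration (the text does not say how coordinates are discretized), and the function name `histogram_by_class` is hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def histogram_by_class(
    values_by_class: Dict[int, List[float]], bin_width: float
) -> Dict[int, Counter]:
    """Count, per class, how many projected scalars Y_ij fall into each bin.

    Bin b covers the interval [b * bin_width, (b + 1) * bin_width).
    """
    hist = {}
    for c, values in values_by_class.items():
        hist[c] = Counter(math.floor(y / bin_width) for y in values)
    return hist


# Illustrative projected values for two classes.
projected = {1: [0.1, 0.4, 1.2], 2: [2.6, 2.9]}
hist = histogram_by_class(projected, bin_width=1.0)
```

In this example the two classes occupy disjoint bins, which corresponds to the well-separated distribution that a suitable matrix A is expected to produce.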

Thereafter, the CPU 11 determines whether a recalculation command has been input through the user interface 19 (S210). If a recalculation command has been input (Yes in S210), the CPU 11 returns to S130 and accepts input of the value m again. After the value m has been input through the user interface 19, the CPU 11 proceeds to S140 and executes the processes from S140 onward using the new value m.

If, on the other hand, the CPU 11 determines that no recalculation command has been input (No in S210), it waits until either a termination command for the tool or a recalculation command is input; when a termination command for the tool is input (Yes in S220), it ends the dimension reduction matrix calculation process.

In the information processing apparatus 1 of the present embodiment described above, when m = 1 is input, the intra-class variance is defined by the arithmetic mean, so the dimension reduction matrix A is obtained in the same way as in linear discriminant analysis. When m = 0 is input, the intra-class variance is defined by the geometric mean, so the dimension reduction matrix A is obtained in the same way as in the well-known unequal-variance discriminant analysis. When m is any value other than 0 or 1, the dimension reduction matrix A is obtained by a novel method.
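The role of m can be illustrated in the scalar case (d' = 1), where each projected class variance is a single number. The sketch below computes a weighted generalized (power) mean and treats m = 0 as its limiting geometric mean; the function name `generalized_mean` and the sample values are illustrative assumptions.

```python
import math
from typing import Sequence


def generalized_mean(values: Sequence[float], weights: Sequence[float], m: float) -> float:
    """Weighted power mean (sum_i w_i * v_i**m) ** (1/m); geometric mean when m == 0."""
    if m == 0:
        # Limit of the power mean as m -> 0: the weighted geometric mean.
        return math.exp(sum(w * math.log(v) for v, w in zip(values, weights)))
    return sum(w * v ** m for v, w in zip(values, weights)) ** (1.0 / m)


variances = [1.0, 4.0]  # projected class variances (illustrative)
priors = [0.5, 0.5]     # prior probabilities P_i

arithmetic = generalized_mean(variances, priors, 1)    # m = 1
geometric = generalized_mean(variances, priors, 0)     # m = 0
near_zero = generalized_mean(variances, priors, 1e-6)  # approaches the geometric mean
```

Larger m lets the class with the largest variance dominate the mean, while smaller m gives more influence to the tightest class, which is why scanning m changes which matrix A is preferred.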

As described above, defining the intra-class variance by a generalized mean makes it possible to obtain an appropriate dimension reduction matrix A even in cases where neither linear discriminant analysis nor unequal-variance discriminant analysis can (see FIG. 5). That is, an appropriate dimension reduction matrix A such as that shown in FIG. 5(a) can be obtained, one with which the feature vectors (samples) X_ij, when mapped onto the low-dimensional space S_y, separate by class and the feature vectors of each class concentrate in a specific region.

Therefore, using the tool of the present embodiment, a pattern recognition apparatus with better pattern recognition performance than before can be configured.
Specifically, to determine the dimension reduction matrix A to be set in the pattern recognition apparatus 50, the user scans the value of the parameter m with the tool and, for each value m, evaluates the distribution obtained when each sample X_ij is mapped onto the low-dimensional space S_y by the dimension reduction matrix A computed for that value, based on the contents of the output data file and the graph displayed on the display device 21. Through this work, the optimum dimension reduction matrix A can be obtained in the present embodiment.

In addition, the present embodiment not only outputs the calculation results as a data file but also displays on the screen of the display device 21 a graph showing the distribution of the values Y_ij obtained by mapping each sample X_ij onto the low-dimensional space S_y, so the user can work efficiently toward the optimum dimension reduction matrix A.

In the embodiment above, the dimension reduction matrix A was obtained according to the evaluation function J(A, m) whose numerator is defined by the inter-class covariance matrix B. In S180, however, the dimension reduction matrix A may instead be obtained according to the following evaluation function J(A, m), whose numerator is defined by the total covariance matrix.

Even when the evaluation function J(A, m) is defined by the above equation and the dimension reduction matrix A is obtained from it, the same effects as in the embodiment above are obtained.

In addition, the information processing apparatus 1 described above may be configured to execute the dimension reduction matrix calculation process shown in FIG. 6 (second embodiment).
[Second Embodiment]
Next, the second embodiment will be described. FIG. 6 is a flowchart showing the dimension reduction matrix calculation process that the information processing apparatus 1 of the second embodiment executes on the CPU 11. The information processing apparatus 1 of the second embodiment differs from the first only in the content of the dimension reduction matrix calculation process executed by the CPU 11; its other components are basically the same as in the first embodiment, so descriptions of the components shared with the information processing apparatus 1 of the first embodiment are omitted below where appropriate.

When the dimension reduction matrix calculation process shown in FIG. 6 is executed, the CPU 11 first displays a file selection screen on the display device 21 and has the user specify the sample data file to be read (S310). When a sample data file is specified through the user interface 19, the CPU 11 reads the specified data file from the hard disk device 17 (S320). The sample data file of this embodiment has the same configuration as in the first embodiment and is likewise created in advance by the user, for example with document creation software, and recorded on the hard disk device 17.

That is, in S320 the CPU 11 reads the values L, d, d', and X_ij recorded in the sample data file.
When the process in S320 is completed, the CPU 11 displays on the display device 21 an input screen that accepts real values as values defining the intra-class variance, and acquires those values through the user interface 19 (S330). Unlike in the first embodiment, the input screen displayed here allows a plurality of values m[1], m[2], …, m[M] to be entered as values defining the intra-class variance.

That is, in S330 the CPU 11 acquires, through the user interface 19, the plurality of values m[1], m[2], … specified by the user. In S330 the user may be required to enter a predetermined number M of values, or may be allowed to enter any number of values up to a maximum of M. Here, the number of values m acquired from the user is denoted M_E.

When this process is completed, the CPU 11 calculates the mean vector μ_i of each class from the values L, d, and X_ij read from the sample data file, as in S140 (S340), and calculates the mean vector μ₀ of all classes from the class mean vectors μ₁, μ₂, …, μ_L calculated in S340, as in S150 (S350). It then calculates the inter-class covariance matrix B as in S160 (S360) and the covariance matrix W_i of each class as in S170 (S370).

When the process in S370 is completed, the CPU 11 sets the parameter r to 1 (S373), proceeds to S377, and sets the parameter m (the value m defining the intra-class variance) to the user-specified value m[r] (m ← m[r]).

When the process in S377 is completed, the CPU 11 proceeds to S380 and, as in S180, obtains the d-row, d'-column dimension reduction matrix A that maximizes the evaluation function J(A, m) (S380). Here, the d-row, d'-column dimension reduction matrix that maximizes the evaluation function J(A, m[r]) is written A_r.

The CPU 11 then applies the obtained dimension reduction matrix A_r to each sample X_ij, as in S190, and obtains for each sample the value Y_ij[r] that results from mapping the sample onto the low-dimensional space S_y (S390). The symbol [r] used here is a suffix indicating that the value was calculated using the value m[r].

When this process is completed, the CPU 11 proceeds to S400 and executes the matrix evaluation process shown in FIG. 7. FIG. 7 is a flowchart showing the matrix evaluation process that the CPU 11 executes in S400.

When the matrix evaluation process starts, the CPU 11 first calculates, from the values Y_ij[r] of the samples after dimension reduction, the mean vector μ_i[r] (i = 1, 2, …, L) of each class after dimension reduction (S510).

When this process is completed, the CPU 11 proceeds to S520 and calculates the covariance matrix W_i[r] (i = 1, 2, …, L) of each class after dimension reduction.

Thereafter, the CPU 11 proceeds to S525, sets the parameter i to 1 (i = 1), and calculates the value W_ik[r] for k = i+1, i+2, …, L according to the following equation (S530).

Here the parameter s is a weighting coefficient that takes a value in the range 0 < s < 1 and is set, for example, to s = 0.5.

When this process is completed, the CPU 11 proceeds to S540 and calculates, for k = i+1, i+2, …, L and according to the following equation, the degree of divergence η_ik[r] between the sample group of the i-th class and the sample group of the k-th class after dimension reduction (S540).

Then, according to the following equation, the CPU 11 calculates for k = i+1, i+2, …, L the Chernoff bound CB_ik[r], which represents the degree of overlap between the sample group of the i-th class and the sample group of the k-th class after dimension reduction (S550). P_i is the prior probability of the i-th class.
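The equations of S530 to S550 are not reproduced in this text. The sketch below uses the textbook Chernoff bound for two Gaussian classes, restricted to d' = 1 so that covariances are scalars. Mapping `var_mix` to W_ik[r] and `eta` to η_ik[r], and assigning the weight s to class i, are assumptions; the function name is hypothetical.

```python
import math


def chernoff_bound_1d(
    p_i: float, mu_i: float, var_i: float,
    p_k: float, mu_k: float, var_k: float,
    s: float = 0.5,
) -> float:
    """Textbook Chernoff bound on the two-class error for 1-D Gaussian classes."""
    if not 0.0 < s < 1.0:
        raise ValueError("s must satisfy 0 < s < 1")
    # Weighted combination of the two class variances (assumed counterpart of W_ik).
    var_mix = s * var_i + (1.0 - s) * var_k
    # Divergence between the two classes (assumed counterpart of eta_ik).
    eta = (s * (1.0 - s) / 2.0) * (mu_k - mu_i) ** 2 / var_mix \
        + 0.5 * math.log(var_mix / (var_i ** s * var_k ** (1.0 - s)))
    return (p_i ** s) * (p_k ** (1.0 - s)) * math.exp(-eta)


# Identical classes overlap completely; moving the means apart shrinks the bound.
cb_same = chernoff_bound_1d(0.5, 0.0, 1.0, 0.5, 0.0, 1.0)
cb_far = chernoff_bound_1d(0.5, 0.0, 1.0, 0.5, 4.0, 1.0)
```

With equal priors and identical distributions the bound equals 0.5, the error of guessing, and it decays exponentially in the squared distance between the class means.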

That is, the present embodiment uses the Chernoff bound as an index of the overlap between classes. The Chernoff bound is known to be an upper bound on the Bayes error. To select, from the user-specified values m[1], …, m[M_E], the definition (value m) best suited to the dimension reduction matrix A, it suffices to select the value for which this Chernoff bound CB_ik is small. For this purpose the Chernoff bound CB_ik[r] is calculated here as described above. When s = 0.5, the Chernoff bound is called the Bhattacharyya bound.

When the process in S550 is completed, the CPU 11 determines whether the inequality i < L−1 holds. If the parameter i is smaller than (L−1) and the inequality holds (Yes in S560), the CPU 11 proceeds to S565, increments the parameter i by 1 (i ← i+1), and returns to S530. It then executes the processes of S530 to S560 using the updated value of the parameter i.

When the value of the parameter i reaches (L−1) and the inequality i < L−1 no longer holds (No in S560), the CPU 11 proceeds to S570 and calculates, according to the following equation, the sum CB[r] of the Chernoff bounds CB_ik[r] over all combinations of classes.

As is well known, the Chernoff bound can handle only two-class problems. The present embodiment therefore calculates the sum CB[r] as described above to quantify the overlap among all classes from the first class to the L-th class, and treats this sum CB[r] as an evaluation value for the quality of the dimension reduction matrix A_r. Specifically, the smaller the sum CB[r], the better the dimension reduction matrix A_r is judged to be.
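The summation of S570 follows the loop structure of S525 to S565: i runs from 1 to L−1 and, for each i, k runs from i+1 to L. A minimal sketch, assuming the pairwise bounds are already available in a dictionary keyed by (i, k), is given below; the function name and the numeric values are illustrative.

```python
from typing import Dict, Tuple


def total_overlap(pairwise: Dict[Tuple[int, int], float], L: int) -> float:
    """Return CB = sum over all class pairs i < k of the pairwise bound CB_ik."""
    return sum(
        pairwise[(i, k)]
        for i in range(1, L)         # i = 1, ..., L-1
        for k in range(i + 1, L + 1)  # k = i+1, ..., L
    )


# Illustrative pairwise bounds for L = 3 classes.
pairwise = {(1, 2): 0.10, (1, 3): 0.02, (2, 3): 0.05}
cb_total = total_overlap(pairwise, L=3)
```

Each unordered pair is visited exactly once, so L classes contribute L(L−1)/2 terms to the evaluation value.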

This is because the smaller the sum CB[r], the smaller the overlap between the sample groups of the classes, which means that the feature vectors X_ij are properly separated by class in the low-dimensional space S_y and the feature vectors of each class concentrate in a specific region. After calculating the sum CB[r] having this property (hereinafter also called the evaluation value CB[r]), the CPU 11 ends the matrix evaluation process.

When the matrix evaluation process in S400 is completed, the CPU 11 determines whether the dimension reduction matrix A_r has been calculated and the matrix evaluation process executed for all of the user-specified values m[1], …, m[M_E] (S410); that is, it determines whether r ≥ M_E. If it determines that this has not yet been done for all of the values (No in S410), it proceeds to S415 and increments the parameter r by 1 (r ← r+1). It then proceeds to S377, updates the value m to m[r], and executes the processes from S380 onward for m[r].

If the CPU 11 determines that the dimension reduction matrix A_r has been calculated and the matrix evaluation process executed for all of the user-specified values m[1], …, m[M_E] (Yes in S410), it proceeds to S420, finds the smallest of the evaluation values CB[r] corresponding to r = 1, 2, …, M_E, and identifies the value of the parameter r that gives it. This value of r is written r_min.
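The loop over r (S373 to S415) followed by the selection of r_min (S420) can be sketched as below. The maximization of S380 and the evaluation of S400 are passed in as callables because their details are not fixed here; `select_best_m`, `solve`, and `evaluate` are hypothetical names, and the stand-in functions in the example are not the actual computations.

```python
from typing import Callable, List, Sequence, Tuple

Matrix = List[List[float]]


def select_best_m(
    candidates: Sequence[float],
    solve: Callable[[float], Matrix],
    evaluate: Callable[[Matrix], float],
) -> Tuple[int, float, float, Matrix]:
    """Try every candidate m[r]; return (r_min, CB[r_min], m[r_min], A_r_min)."""
    if not candidates:
        raise ValueError("at least one candidate value of m is required")
    best = None
    for r, m in enumerate(candidates, start=1):
        A_r = solve(m)        # stands in for S380: matrix maximizing J(A, m[r])
        cb_r = evaluate(A_r)  # stands in for S400: evaluation value CB[r]
        if best is None or cb_r < best[1]:
            best = (r, cb_r, m, A_r)
    return best


# Stand-in callables for illustration only.
r_min, cb_min, m_best, A_best = select_best_m(
    [0.0, 0.5, 1.0],
    solve=lambda m: [[m]],
    evaluate=lambda A: (A[0][0] - 0.5) ** 2,
)
```

Using a strict comparison keeps the earliest candidate when two values of m give the same evaluation value; the text does not say how ties are resolved, so this is a design choice of the sketch.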

When the process in S420 is completed, the CPU 11 proceeds to S430 and generates a data file describing the dimension reduction matrix A_r that gives the smallest evaluation value CB[r_min], the value m[r_min] of the parameter m used in calculating that matrix, the values Y_ij[r_min] after dimension reduction, and the evaluation value CB[r_min], and writes this file to the directory on the hard disk device 17 from which the sample data file was read. The format of this data file is largely the same as that of the data file written in S200 (see FIG. 4(b)); the only difference is the added evaluation value CB[r_min].

When the process in S430 is completed, the CPU 11 proceeds to S435 and displays on the screen of the display device 21 the dimension reduction matrix A_r that gives the smallest evaluation value CB[r_min] described in the data file, together with a graph, as in S205, showing the distribution of the values Y_ij[r_min] obtained by mapping each sample X_ij onto the low-dimensional space S_y with this matrix.

Thereafter, the CPU 11 determines whether a recalculation command has been input through the user interface 19 (S440); if so (Yes in S440), it returns to S330. If it determines that no recalculation command has been input (No in S440), it waits until either a termination command for the tool or a recalculation command is input; when a termination command for the tool is input (Yes in S450), it ends the dimension reduction matrix calculation process.

The configuration of the information processing apparatus 1 of the second embodiment has been described above. According to this embodiment, the overlap between the sample groups of the classes after dimension reduction is quantified by the Chernoff bound; of the user-specified values m[1], …, m[M_E], the value m[r_min] giving the smallest overlap is adopted as the optimum value defining the intra-class variance; and the dimension reduction matrix A calculated with the evaluation function J(A, m[r_min]) using that value is selectively output to the data file and the display device 21.

Therefore, unlike the first embodiment, the present embodiment does not require the user to find the optimum dimension reduction matrix A by visually inspecting the graph displayed in S205, and the user can obtain the optimum dimension reduction matrix A easily.

Thus, according to the present embodiment, the user can easily configure a high-performance pattern recognition apparatus.
[Correspondence with Claims]
The sample acquisition means of the present invention is realized by the processes of S120 and S320 executed by the CPU 11, and the inter-class variance calculation means is realized by the processes of S160 and S360. The class variance calculation means is realized by the processes of S170 and S370, and the input receiving means is realized by the processes of S130 and S330. In addition, the matrix calculation means is realized by the processes of S180 and S380, the dimension reduction means by the processes of S190 and S390, the evaluation means by the process of S400, and the output means by the processes of S200, S205, S430, and S435.

FIG. 1 is a block diagram showing the configuration of the information processing apparatus 1.
FIG. 2 is a block diagram showing the basic configuration of the pattern recognition apparatus 50.
FIG. 3 is a flowchart showing the dimension reduction matrix calculation process executed by the CPU 11.
FIG. 4 is an explanatory diagram showing the configuration of the sample data file (a) and of the output data file (b).
FIG. 5 is an explanatory diagram illustrating the distribution of feature vectors after dimension reduction.
FIG. 6 is a flowchart showing the dimension reduction matrix calculation process of the second embodiment executed by the CPU 11.
FIG. 7 is a flowchart showing the matrix evaluation process executed by the CPU 11.

Explanation of symbols

1 ... information processing device, 11 ... CPU, 13 ... ROM, 15 ... RAM, 17 ... hard disk device, 19 ... user interface, 21 ... display device, 23 ... drive device, 50 ... pattern recognition device, 51 ... vector generation unit, 53 ... dimension reduction unit, 55 ... pattern recognition unit

Claims

An arithmetic device for calculating a dimension reduction matrix used in a pattern recognition device that maps a feature vector of an externally input recognition target pattern to a low-dimensional space by applying a predetermined dimension reduction matrix to the feature vector, and that classifies the pattern into one of a plurality of predetermined classes based on the dimension-reduced value of the feature vector, the arithmetic device comprising:
sample acquisition means for acquiring, for each of the plurality of classes, a plurality of feature vectors of patterns belonging to that class as samples;
inter-class variance calculation means for calculating an inter-class covariance matrix B based on the sample group of each class acquired by the sample acquisition means;
class variance calculation means for calculating, based on the sample group of each class acquired by the sample acquisition means, a covariance matrix W_i of each class (where W_i denotes the covariance matrix of the i-th class, and i takes integer values from 1 to L, L being the total number of classes);
input reception means for receiving input of a real value as a value m that defines the intra-class variance;
matrix calculation means for calculating, based on the value m input via the input reception means, the inter-class covariance matrix B calculated by the inter-class variance calculation means, and the covariance matrix W_i of each class calculated by the class variance calculation means, a dimension reduction matrix A that maximizes an evaluation function J_1(A, m) or an evaluation function J_2(A, m), where P_i is the prior probability of the i-th class:

[Formula: evaluation functions J_1(A, m) and J_2(A, m)]

and
output means for outputting the dimension reduction matrix A calculated by the matrix calculation means.
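As a minimal sketch of the quantities handled by the inter-class variance calculation means and the class variance calculation means, the following Python computes the class covariance matrices W_i and an inter-class covariance matrix B from samples grouped by class. The conventions used here are assumptions, not taken from the claim: the prior P_i is estimated as the fraction of samples in class i, W_i is normalized by the class sample count, and B is the prior-weighted scatter of the class means around the overall mean. The function name `class_statistics` is hypothetical.

```python
def class_statistics(samples_by_class):
    """Return (priors P, class covariances W, inter-class covariance B).

    samples_by_class: list of L lists, one per class, each holding
    feature vectors (lists of floats) of equal length.
    Assumed conventions: P_i = N_i / N, W_i normalized by N_i,
    B = sum_i P_i (mu_i - mu)(mu_i - mu)^T.
    """
    total = sum(len(s) for s in samples_by_class)
    dim = len(samples_by_class[0][0])
    priors = [len(s) / total for s in samples_by_class]

    # Mean vector of each class.
    means = [
        [sum(x[d] for x in s) / len(s) for d in range(dim)]
        for s in samples_by_class
    ]
    # Overall mean as the prior-weighted mean of the class means.
    mu = [sum(p * m[d] for p, m in zip(priors, means)) for d in range(dim)]

    # Covariance matrix W_i of each class.
    W = []
    for s, m in zip(samples_by_class, means):
        cov = [[0.0] * dim for _ in range(dim)]
        for x in s:
            diff = [x[d] - m[d] for d in range(dim)]
            for r in range(dim):
                for c in range(dim):
                    cov[r][c] += diff[r] * diff[c] / len(s)
        W.append(cov)

    # Inter-class covariance matrix B.
    B = [[0.0] * dim for _ in range(dim)]
    for p, m in zip(priors, means):
        diff = [m[d] - mu[d] for d in range(dim)]
        for r in range(dim):
            for c in range(dim):
                B[r][c] += p * diff[r] * diff[c]
    return priors, W, B
```

The maximization of J_1(A, m) or J_2(A, m) itself is not sketched, because the definitions of these evaluation functions are not reproduced in this text.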

The arithmetic device according to claim 1, further comprising dimension reduction means for calculating a dimension-reduced value of each sample by applying the dimension reduction matrix A calculated by the matrix calculation means to each sample acquired by the sample acquisition means,
wherein the output means is configured to output the dimension reduction matrix A calculated by the matrix calculation means together with information representing the calculation result of the dimension reduction means.
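The dimension reduction step can be illustrated with a short Python sketch. It assumes that A is stored as a list of rows (reduced dimension by original dimension) and is applied by left multiplication, y = A x; the storage layout and the function names `reduce_dimension` and `reduce_samples` are hypothetical.

```python
def reduce_dimension(A, x):
    """Map feature vector x to the low-dimensional space as y = A x."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("each row of A must have the same length as x")
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def reduce_samples(A, samples_by_class):
    """Apply A to every sample, keeping the grouping by class."""
    return [[reduce_dimension(A, x) for x in samples]
            for samples in samples_by_class]
```

For example, with A = [[1.0, 0.0, 1.0]] a three-dimensional feature vector is reduced to a single scalar value.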

The arithmetic device according to claim 1, further comprising:
dimension reduction means for calculating a dimension-reduced value of each sample by applying the dimension reduction matrix A calculated by the matrix calculation means to each sample acquired by the sample acquisition means; and
evaluation means for evaluating the quality of the dimension reduction matrix A calculated by the matrix calculation means, based on the dimension-reduced value of each sample calculated by the dimension reduction means.

The arithmetic device according to claim 3, wherein the evaluation means is configured to calculate, for each combination of classes, a Chernoff upper limit CB_ik representing the degree of overlap between the sample groups of the i-th class and the k-th class after dimension reduction by the dimension reduction matrix A calculated by the matrix calculation means, in accordance with the following equation, in which s is a weighting coefficient and P_i is the prior probability, based on the covariance matrix W_i′ of each class (where W_i′ denotes the covariance matrix of the i-th class after dimension reduction) and the average vector μ_i′ (where μ_i′ denotes the average vector of the samples belonging to the i-th class after dimension reduction), both obtained from the dimension-reduced values of the samples calculated by the dimension reduction means:

[Formula: Chernoff upper limit CB_ik]

and to evaluate the quality of the dimension reduction matrix A calculated by the matrix calculation means based on the calculated Chernoff upper limits CB_ik.
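The expression for CB_ik is not reproduced in this text, so the following Python is only a sketch under stated assumptions. It evaluates the Chernoff bound P_i^s · P_k^(1-s) · ∫ p_i(y)^s p_k(y)^(1-s) dy in closed form for the special case in which the dimension-reduced values are scalars and each class is modeled as a Gaussian with mean μ_i′ and variance W_i′. The Gaussian model, the restriction to one dimension, the convention for which class carries the exponent s, and the function name `chernoff_bound_1d` are all assumptions; the patent's own expression may differ.

```python
import math


def chernoff_bound_1d(mu_i, var_i, p_i, mu_k, var_k, p_k, s=0.5):
    """Chernoff bound on the overlap of two 1-D Gaussian classes.

    Returns P_i^s * P_k^(1-s) * exp(-k(s)), where exp(-k(s)) equals the
    integral of p_i(y)^s * p_k(y)^(1-s) over y, and
      v    = (1-s)*var_i + s*var_k
      k(s) = s(1-s)/2 * (mu_k - mu_i)^2 / v
             + 0.5 * ln(v / (var_i^(1-s) * var_k^s)).
    """
    if not 0.0 < s < 1.0:
        raise ValueError("weighting coefficient s must lie in (0, 1)")
    if var_i <= 0.0 or var_k <= 0.0:
        raise ValueError("variances must be positive")
    v = (1.0 - s) * var_i + s * var_k
    k = (s * (1.0 - s) / 2.0) * (mu_k - mu_i) ** 2 / v
    k += 0.5 * math.log(v / (var_i ** (1.0 - s) * var_k ** s))
    return (p_i ** s) * (p_k ** (1.0 - s)) * math.exp(-k)
```

A smaller value indicates less overlap between the two classes after dimension reduction. With s = 0.5 the expression is symmetric in the two classes and corresponds to the Bhattacharyya bound.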

The arithmetic device according to claim 4, wherein the evaluation means is configured to calculate the sum CB of the Chernoff upper limits CB_ik over the combinations of classes

[Formula: sum CB of the Chernoff upper limits CB_ik]

as an evaluation value CB indicating the quality of the dimension reduction matrix A calculated by the matrix calculation means.
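The summation can be sketched in Python as follows. The range of the sum is not recoverable from this text; the sketch assumes that each unordered pair of classes (i < k) is counted once. The function name `total_overlap` and the caller-supplied `pairwise_bound` callable are hypothetical.

```python
from itertools import combinations


def total_overlap(num_classes, pairwise_bound):
    """Sum pairwise_bound(i, k) over all unordered class pairs i < k.

    pairwise_bound is a caller-supplied function returning CB_ik for
    the class indices i and k (0-based here).
    """
    return sum(pairwise_bound(i, k)
               for i, k in combinations(range(num_classes), 2))
```

For L classes this evaluates L(L-1)/2 pairwise values, so a single scalar CB summarizes how strongly the classes overlap after dimension reduction.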

The arithmetic device according to any one of claims 3 to 5, wherein:
the input reception means is configured to be able to acquire a plurality of values as the value m defining the intra-class variance;
the matrix calculation means is configured to calculate, for each value m acquired by the input reception means, a dimension reduction matrix A that maximizes the evaluation function J_1(A, m) or the evaluation function J_2(A, m);
the evaluation means is configured to evaluate, for each value m acquired by the input reception means, the quality of the dimension reduction matrix A calculated by the matrix calculation means; and
the output means is configured to selectively output, in accordance with the evaluation results of the evaluation means, the best dimension reduction matrix A among the plurality of dimension reduction matrices A calculated by the matrix calculation means.
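The selection over several values of m can be sketched as a simple loop in Python. The callables `compute_matrix` and `evaluate` are hypothetical stand-ins for the matrix calculation means and the evaluation means, and the sketch assumes that a lower evaluation value is better, as would be the case for an overlap measure such as a summed Chernoff upper limit.

```python
def select_best_matrix(m_values, compute_matrix, evaluate):
    """Return (best_m, best_A, best_score) over the candidate values of m.

    compute_matrix(m) -> A and evaluate(A) -> score are caller-supplied
    (hypothetical) callables. A lower score is treated as better.
    """
    best = None
    for m in m_values:
        A = compute_matrix(m)      # dimension reduction matrix for this m
        score = evaluate(A)        # quality of A, e.g. class overlap
        if best is None or score < best[2]:
            best = (m, A, score)
    if best is None:
        raise ValueError("m_values must not be empty")
    return best
```

Only the matrix with the best evaluation value is returned, which mirrors the selective output described in the claim.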

A program for causing a computer to function as the arithmetic device according to any one of claims 1 to 6.