JPH0540852A

JPH0540852A - Pattern recognizing device

Info

Publication number: JPH0540852A
Application number: JP3195468A
Authority: JP
Inventors: Masatoshi Kurumi; 雅俊來海
Original assignee: Science & Tech Agency; Agency of Industrial Science and Technology
Current assignee: Science & Tech Agency; National Institute of Advanced Industrial Science and Technology AIST
Priority date: 1991-08-05
Filing date: 1991-08-05
Publication date: 1993-02-19
Anticipated expiration: 2011-02-07
Also published as: JPH0812684B2

Abstract

PURPOSE: To improve the efficiency of creating a pattern recognition dictionary and to shorten pattern recognition time. CONSTITUTION: When the dictionary is created, principal component analysis is performed on 64-dimensional learning samples input through an input part 3. The principal component vectors up to the k-th axis (k<64) and the mean vector are registered in a principal component dictionary 12, and the membership function of each of those axes is registered in a membership function dictionary 13. The membership functions of the principal component axes from the (k+1)-th onward are assumed to be normal distributions with variance sigma, and only sigma is registered in the dictionary 12. When an unknown pattern is input, a feature extracting part 5 extracts its features and a principal component expansion part 6 performs principal component expansion for each category. A membership value calculation part 7 calculates the membership values, and the similarity to each category is calculated from their product. Because the membership functions of the (k+1)-th and subsequent axes are expressed as normal distributions with variance sigma, the similarity to each category can be obtained easily.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a recognition device for character patterns and the like, and more particularly to a pattern recognition device that recognizes patterns using a feature space in multidimensional data analysis.

[0002]

[Prior Art] In recognition algorithms for character patterns and the like, the discriminant function is as important as feature extraction.

[0003] The most basic discriminant function is the Euclidean distance, which is expressed by the following equation.

[0004]

[Equation 1]

[0005] Here, x is the input pattern vector, x_i is its i-th component, and m_i is the i-th component of the standard pattern vector of the category to be recognized. The number of feature dimensions is n.
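As an illustration of the discriminant described in [0003] to [0005], the following minimal Python sketch assigns an input vector to the category whose standard pattern vector is nearest in Euclidean distance. The function names and the two toy categories are hypothetical. Whether Equation 1 includes the square root cannot be recovered from this text; the ranking of categories is the same either way.

```python
import math

def euclidean_distance(x, m):
    """Distance between input vector x and standard pattern vector m (both of length n)."""
    if len(x) != len(m):
        raise ValueError("x and m must have the same number of dimensions")
    return math.sqrt(sum((xi - mi) ** 2 for xi, mi in zip(x, m)))

def classify_nearest(x, standard_patterns):
    """Return the category whose standard pattern vector is closest to x."""
    return min(standard_patterns,
               key=lambda c: euclidean_distance(x, standard_patterns[c]))

# Hypothetical 4-dimensional standard patterns (the patent uses n = 64).
standard_patterns = {
    "A": [1.0, 0.0, 0.0, 1.0],
    "B": [0.0, 1.0, 1.0, 0.0],
}
print(classify_nearest([0.9, 0.1, 0.2, 0.8], standard_patterns))
```

Each category is represented by a single vector here, which is exactly the limitation that the following paragraph raises.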

[0006] This discriminant function is simple and fast to compute, so it is widely used in pattern recognition. However, representing each category to be discriminated by a single standard pattern vector (usually the mean vector) is inherently limiting, and performance suffers when the data distribution becomes complex or when there are many categories, as in character recognition.

[0007] The Mahalanobis distance, which takes the distribution of the data into account, is therefore used.

[0008] Taking recognition of the character pattern 20 shown in FIG. 7 as an example, the learning character pattern 20 is divided into 64 pixels D₁, D₂, D₃, ..., D₆₄ as shown in that figure, and is represented as a vector in the 64-dimensional feature space 30 as shown in FIG. 8.

[0009] A large number of sample data are then obtained for each character pattern. FIG. 9 shows an example of the distribution of the sample data for the character pattern 20 shown in FIG. 8.

[0010] Principal component analysis is then performed on these learning samples, and the appearance probability of samples on each principal component axis is assumed to follow a normal distribution 25 as shown in FIG. 12(a).

[0011] When an unknown pattern is input, for example in multidimensional data analysis with 64 dimensions, the appearance probability is obtained on each of the 64 axes, and the unknown pattern is assigned to the character pattern for which that probability is highest.

[0012] FIG. 10 shows principal component expansion in the 64-dimensional feature space, where φ_1 is the eigenvector of the first principal component and φ_2 is the eigenvector of the second principal component. In this case a coordinate transformation is performed as shown in FIG. 11.

[0013] In this case, the following equation is computed.

[0014]

[Equation 2]

[0015] Here, φ_i and λ_i are the eigenvectors and eigenvalues obtained from the covariance matrix of the learning data of the category being discriminated. They correspond, respectively, to the principal component vectors of the data distribution and to the variances along the principal component axes.
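Equation 2 is not reproduced in this text. Under the definitions in [0015], the standard squared Mahalanobis distance in principal component coordinates is the sum over axes of ((x - m)·φ_i)² / λ_i, and the following sketch assumes that form. The function names and the 2-dimensional toy values are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def mahalanobis_squared(x, mean, eigvecs, eigvals):
    """Sum over principal axes of (projection onto phi_i)^2 / lambda_i."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    return sum(dot(diff, phi) ** 2 / lam for phi, lam in zip(eigvecs, eigvals))

# Hypothetical category: variance 4 along the first axis, 1 along the second.
mean = [0.0, 0.0]
eigvecs = [[1.0, 0.0], [0.0, 1.0]]
eigvals = [4.0, 1.0]
print(mahalanobis_squared([2.0, 1.0], mean, eigvecs, eigvals))  # 2.0
```

A displacement of 2 along the high-variance axis costs the same as a displacement of 1 along the low-variance axis, which is how the distance reflects the data distribution.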

[0016] On the other hand, there is also a method that approximates the probability density function from the distribution of the learning data and performs discrimination using the Bayes decision rule. For example, if the data distribution is expanded into principal components and the data are assumed to be normally distributed along each principal component axis, the probability density function on a principal component axis i is expressed by

[0017]

[Equation 3]

[0018] and, if the principal component axes are statistically independent, the probability density function of x over the entire pattern space is

[0019]

[Equation 4]

[0020] Taking the logarithm of both sides gives

[0021]

[Equation 5]

[0022] Now consider the following discriminant function. It decreases monotonically with respect to the probability density function, so the smaller its value, the larger the probability density.

[0023]

[Equation 6]

[0024] This expression is called the Bayes discriminant function. Because it reflects the probability density function of the data, it is in theory the optimal discriminant function, provided that a good set of samples can be collected.
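Equations 3 to 6 are not reproduced in this text. The following LaTeX sketch gives the standard derivation that matches the wording of [0016] to [0024], assuming zero-mean normal densities along each principal axis. The symbol y_i for the coordinate on axis i and the name g(x) for the discriminant are introduced here for illustration and do not appear in the source.

```latex
% Coordinate of x on principal axis i (assumed notation)
y_i = \varphi_i^{\top}(x - m)

% Density on axis i, with variance \lambda_i (cf. Equation 3)
p_i(y_i) = \frac{1}{\sqrt{2\pi\lambda_i}} \exp\!\left(-\frac{y_i^{2}}{2\lambda_i}\right)

% Independence of the axes (cf. Equation 4)
p(x) = \prod_{i=1}^{n} p_i(y_i)

% Logarithm of both sides (cf. Equation 5)
\ln p(x) = -\frac{n}{2}\ln 2\pi
           - \frac{1}{2}\sum_{i=1}^{n}\left(\frac{y_i^{2}}{\lambda_i} + \ln\lambda_i\right)

% Discriminant function (cf. Equation 6); smaller g means larger p(x)
g(x) = \sum_{i=1}^{n}\left(\frac{y_i^{2}}{\lambda_i} + \ln\lambda_i\right)
```

Since ln p(x) equals a constant minus g(x)/2, g(x) is monotonically decreasing in the density, as [0022] states.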

[0025]

[Problems to Be Solved by the Invention] However, in the conventional methods described above, the appearance probability of samples on each principal component axis is assumed to follow a normal distribution centered on the mean, whereas in practice it often does not, as with the function 26 shown in FIG. 12(b). In that case the assumed probability density function differs from the actual appearance probability of samples, and the accuracy of pattern recognition is degraded.

[0026] Furthermore, in the recognition of characters and the like, the appearance probability given by the distribution of learning samples differs from the distribution of similarity as perceived by humans. Here too, the above approach of uniformly assuming a normal distribution degrades the accuracy of pattern recognition.

[0027] For this reason, a fuzzy pattern recognition method has been proposed in recent years in which the distribution function of similarity is expressed in the form of a membership function 27, as shown in FIG. 12(c).

[0028] In this method, once learning sample data are obtained, a membership function is created for each pixel on the basis of the sample data, and when an unknown pattern is input, the membership functions are applied to discriminate the pattern.

[0029] That is, with

[0030]

[Equation 7]

[0031] patterns are recognized using Equation 8 below: the pattern is assigned to the category for which the value of Equation 8 is largest.

[0032]

[Equation 8]

[0033] Pattern recognition using such a fuzzy pattern recognition method avoids the problems described above. However, when the number of feature dimensions is large, for example 64 as above, membership functions must be created for every principal component axis of every category in the discrimination data (hereinafter, the dictionary). The dictionary therefore becomes large, and because principal component expansion must be carried out on all axes during pattern recognition, processing takes a great deal of time.

[0034] The present invention has been made in view of the above conventional problems, and its object is to provide a pattern recognition device that can reduce the dictionary size and greatly shorten the discrimination time.

[0035]

[Means for Solving the Problems] To achieve the above object, the present invention is a pattern recognition device that discriminates patterns by similarities on a plurality of principal component axes obtained by multidimensional data analysis, characterized in that the similarity on the main principal component axes is calculated by applying preset membership functions relating to similarity.

[0036]

[Operation] In the present invention, membership functions are created only for the similarity on the main principal component axes, and patterns are discriminated using them. This reduces the effort of creating membership functions and also shortens the processing time for pattern recognition.

[0037]

[Embodiment] The present invention is described below with reference to the drawings.

[0038] FIG. 1 is a block diagram showing the electrical configuration of an embodiment to which the present invention is applied.

[0039] First, the configuration. Reference numeral 1 denotes a central control unit comprising a CPU or the like. An input unit 3, a preprocessing unit 4, a feature extraction unit 5, a principal component expansion unit 6, a membership value calculation unit 7, a similarity calculation unit 8, a determination unit 9, and an output unit 10 are connected to the central control unit 1 via a bus line 2. Also connected to the bus line 2 are a work area 11 comprising RAM, and a principal component dictionary 12 and a membership function dictionary 13 comprising RAM or ROM. The input unit 3 comprises an image sensor or the like, and the output unit 10 comprises a display or the like. The preprocessing unit 4, the feature extraction unit 5, the principal component expansion unit 6, the membership value calculation unit 7, the similarity calculation unit 8, and the determination unit 9 are implemented by a CPU or the like.

[0040] The above is the configuration of this embodiment. Its operation is described below.

[0041] The following description also takes the recognition of character patterns as the example.

[0042] First, in this embodiment as well, dictionary creation processing is performed as shown in FIG. 2.

[0043] This is described with reference to FIGS. 4 to 6. First, the character pattern 20 to be registered in the dictionary is read and input through the input unit 3, such as a scanner, as shown in FIG. 4(a) (step 210).

[0044] When the character pattern 20 has been input, the preprocessing of step 220 is performed: noise components are removed, and the pattern is enlarged or reduced to a standardized fixed size, yielding the character pattern 21 shown in FIG. 4(b).

[0045] Next, the feature extraction processing of step 230 is performed. This uses the character pattern 22 shown in FIG. 4(c), which is obtained by applying a Gaussian filter to the character pattern 21 of FIG. 4(b).

[0046] That is, in this embodiment as well, the character pattern 20 is divided into, for example, 64 pixels as shown in FIG. 7, and the feature value on the principal component axis for each pixel is extracted in the feature space of multidimensional data analysis. This feature extraction uses the original character pattern 20 after the Gaussian filter 40 shown in FIG. 5 has been applied to it.

[0047] The Gaussian filter 40 consists of a plurality of square filter regions, each the same size as one of the pixels D₁, D₂, D₃, etc. shown in FIG. 7. It has a center filter 41 at its center, surrounded by a plurality of filter regions extending radially from the center filter 41. The character pattern 21 shown in FIG. 4(b) is processed by applying a weight of, for example, 100 to the pixel on which the center filter 41 is superimposed, and the respective weights shown in FIG. 5 to the surrounding pixels.
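The weighting described in [0047] can be sketched as follows. This is a minimal illustration, not the filter of FIG. 5: only the center weight of 100 comes from the text, while the 3x3 extent and the surrounding weights (30 and 10) are assumptions, as are the function names and the toy 8x8 stroke.

```python
# Hypothetical weights: only the center value 100 is taken from the text;
# the surrounding values stand in for the weights shown in FIG. 5.
KERNEL = [
    [10, 30, 10],
    [30, 100, 30],
    [10, 30, 10],
]

def apply_filter(image, kernel=KERNEL):
    """Spread each set pixel onto itself and its neighbours using the kernel weights."""
    rows, cols = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        out[rr][cc] += image[r][c] * kernel[dr + half][dc + half]
    return out

def to_feature_vector(filtered):
    """Flatten the filtered 8x8 grid into a 64-dimensional feature vector."""
    return [value for row in filtered for value in row]

# Toy 8x8 pattern: a single vertical stroke in column 3.
image = [[0] * 8 for _ in range(8)]
for r in range(1, 7):
    image[r][3] = 1

features = to_feature_vector(apply_filter(image))
print(len(features))  # 64
```

Pixels on the stroke receive the largest values and their neighbours receive smaller ones, which corresponds to the bolder character with a faint fringe described for FIG. 4(c).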

[0048] As a result, as shown in FIG. 4(c), a character pattern 22 is obtained that is bolder than the original character pattern 21 and has a faint fringe around the character outline, produced by the filter regions 42, 42, 42, 42 that have small weights.

[0049] Thus, in the processing of step 230, the learning character pattern 20 input in the form of FIG. 4(a) is first converted into the character pattern 22 shown in FIG. 4(c) and used as sample data.

[0050] In the subsequent step 240, principal component analysis is performed on the sample data to obtain the mean vector (center value), the principal component vectors up to a certain k-th axis, and an approximate value σ of the variance on the principal component axes from the (k+1)-th axis to the 64th axis. The axes for which principal component vectors are obtained are selected in order, starting from the longest principal component axis. A longer principal component axis means a greater variance of the data, so it is more likely to correspond to pixels in a characteristic part of the character pattern. Principal component vectors are not obtained for the (k+1)-th and subsequent axes because, as detailed later, in a character pattern most of the 64 pixels are blank areas that do not contain the character pattern 20, as shown for example in FIG. 7, and carry no feature information.
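Step 240 can be sketched as below, using only the Python standard library. The source does not say how the eigenvectors or the approximate value σ are computed, so two choices here are assumptions: the leading k axes are found by power iteration with deflation, and σ is taken as the mean of the variance left over after those k axes. All names and the toy 3-dimensional samples are hypothetical.

```python
import math

def mean_vector(samples):
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]

def covariance(samples, mean):
    n, d = len(samples), len(mean)
    cov = [[0.0] * d for _ in range(d)]
    for s in samples:
        diff = [si - mi for si, mi in zip(s, mean)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += diff[i] * diff[j] / n
    return cov

def power_iteration(mat, iters=1000):
    """Return (eigenvalue, unit eigenvector) of the dominant axis of a symmetric matrix."""
    d = len(mat)
    v = [float(i + 1) for i in range(d)]          # arbitrary non-zero start vector
    norm = math.sqrt(sum(a * a for a in v))
    v = [a / norm for a in v]
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(a * a for a in w))
        if norm == 0.0:                           # no variance left
            break
        v = [a / norm for a in w]
    lam = sum(v[i] * sum(mat[i][j] * v[j] for j in range(d)) for i in range(d))
    return lam, v

def build_dictionary(samples, k):
    """Mean vector, first k principal axes, and sigma for the remaining axes."""
    mean = mean_vector(samples)
    d = len(mean)
    if not 0 < k < d:
        raise ValueError("k must satisfy 0 < k < number of dimensions")
    cov = covariance(samples, mean)
    total_variance = sum(cov[i][i] for i in range(d))
    work = [row[:] for row in cov]
    variances, axes = [], []
    for _ in range(k):
        lam, v = power_iteration(work)
        variances.append(lam)
        axes.append(v)
        for i in range(d):                        # deflate: remove the axis just found
            for j in range(d):
                work[i][j] -= lam * v[i] * v[j]
    # Assumption: sigma is the average variance of the axes beyond the k-th.
    sigma = max(total_variance - sum(variances), 0.0) / (d - k)
    return {"mean": mean, "axes": axes, "variances": variances, "sigma": sigma}

samples = [[2, 0, 0], [-2, 0, 0], [0, 1, 0], [0, -1, 0], [0, 0, 0.5], [0, 0, -0.5]]
entry = build_dictionary(samples, k=1)
print(entry["axes"][0], entry["sigma"])
```

Because the trace of the covariance matrix equals the sum of all eigenvalues, σ can be obtained without computing any eigenvector beyond the k-th.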

[0051] Next, in step 250, for the principal components up to the k-th axis, membership functions corresponding to character deformation along each principal component axis are created. FIG. 6 is an explanatory diagram showing how the membership functions are created in this case.

[0052] The figure shows learning data obtained by shifting the center vector position 0 of the principal component vector for which the membership function is to be obtained, by unit distances in the plus and minus directions. As the figure shows, for the main components, a shift of the center vector position by even one unit distance makes the pattern look substantially different. The membership function is therefore created on the basis of these data. Then, in step 260, the data obtained in step 240 are registered in the principal component dictionary 12, and the data obtained in step 250 are registered in the membership function dictionary 13. The above processing is performed, and the results registered, for each category (character).

[0053] The above are the details of the dictionary creation processing.

[0054] Once the dictionary has been created in this way, pattern recognition processing as shown in FIG. 3 is performed on the basis of the created dictionary.

[0055] First, when an unknown pattern is input through the input unit 3, such as an image sensor (step 310), preprocessing is performed in the same way as in dictionary creation (step 320), and feature extraction is performed on the input data (step 330).

[0056] Next, the extracted features are expanded into principal components (step 340). Membership values are obtained from the membership function of each principal component axis up to the k-th axis, and from a normal distribution function with variance σ for the (k+1)-th and subsequent axes.

[0057] The product of all of these values is taken as the similarity.

[0058] That is, with

[0059]

[Equation 9]

[0060] Equation 10 is calculated.

[0061]

[Equation 10]

[0062] Incidentally,

[0063]

[Equation 11]

[0064] can be expanded as

[0065]

[Equation 12]

[0066] so Equation 10 can be written as Equation 13.

[0067]

[Equation 13]

[0068] In this way, the unknown pattern is assigned to the category with the highest similarity.
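Equations 9 to 13 are not reproduced in this text, so the following Python sketch is only one reading of [0056] to [0068]. It assumes that the membership value on each axis beyond the k-th is exp(-y_i² / (2σ)), and that the expansion referred to in [0064] is the identity that, for orthonormal axes, the sum of y_i² over the axes beyond the k-th equals |x - m|² minus the sum of y_i² over the first k axes. Under that reading only k projections are needed per category. The triangular membership shape, the function names, and the toy dictionary are hypothetical.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def similarity(x, mean, axes, memberships, sigma):
    """Product of membership values over all axes, using only the first k axes.

    axes        -- the first k orthonormal principal component vectors
    memberships -- k callables, one membership function per stored axis
    sigma       -- variance assumed for every axis beyond the k-th
    """
    diff = [xi - mi for xi, mi in zip(x, mean)]
    coords = [dot(diff, phi) for phi in axes]
    value = 1.0
    for mu, y in zip(memberships, coords):
        value *= mu(y)
    # Squared length of the part of diff lying outside the first k axes.
    residual = max(dot(diff, diff) - sum(y * y for y in coords), 0.0)
    return value * math.exp(-residual / (2.0 * sigma))

def triangular(width):
    """Hypothetical membership function: 1 at the center, 0 beyond +/- width."""
    return lambda y: max(0.0, 1.0 - abs(y) / width)

def classify(x, dictionary):
    """Return the category with the highest similarity."""
    return max(dictionary, key=lambda c: similarity(x, **dictionary[c]))

# Toy 3-dimensional dictionary with k = 1 stored axis per category.
dictionary = {
    "A": dict(mean=[0.0, 0.0, 0.0], axes=[[1.0, 0.0, 0.0]],
              memberships=[triangular(4.0)], sigma=0.5),
    "B": dict(mean=[5.0, 0.0, 0.0], axes=[[0.0, 1.0, 0.0]],
              memberships=[triangular(4.0)], sigma=0.5),
}
x = [1.0, 0.2, 0.0]
print(classify(x, dictionary))
```

The residual term replaces the individual projections onto the axes beyond the k-th, which is consistent with the reduction in dictionary size and processing time that the description attributes to this embodiment.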

[0069] As described above, in this embodiment a 64-dimensional unknown pattern is discriminated as follows.

[0070] (1) When the dictionary is created, principal component analysis is performed on the learning samples, and for the axes up to the k-th, the principal component vectors, the membership functions on those axes, and the mean vector are registered in the dictionary. For the (k+1)-th and subsequent axes, the membership function on each principal component axis is assumed to be a normal distribution with variance σ, and only σ is registered.

[0071] (2) When an unknown pattern is input, its features are extracted, principal component expansion is performed for each category, a membership value is obtained on each principal component axis, and the similarity to the category is obtained from the product of these values. Because in this embodiment the membership functions for the (k+1)-th and subsequent axes are all expressed as normal distributions with variance σ, the similarity is given by Equation 10. Since Equation 10 can be rewritten as Equation 13, in practice the similarity to a category is obtained by computing Equation 13. The unknown pattern is then assigned to the category with the highest of the similarities obtained.

[0072] In this case, as shown in FIG. 7, most of the 64 pixels do not contain the character pattern 20, and in general the higher-order principal components have variances close to zero and are often not meaningful. Recognition performance therefore hardly changes even when they are represented by a normal distribution with a constant variance, as in this embodiment.

[0073] Accordingly, compared with the conventional fuzzy pattern recognition method, the method of this embodiment reduces the dictionary size to about k/64 and the processing time to about k/64, with almost no change in recognition performance.

[0074] Although this embodiment has been described for a character pattern represented by 64 pixels, the number of pixels is of course not limited to this.

[0075]

[Effects of the Invention] As described above, in the present invention only the similarity on the main principal component axes is calculated by applying preset membership functions relating to similarity, so the dictionary size can be reduced and the discrimination time can be greatly shortened.

[Brief description of drawings]

FIG. 1 is a block diagram showing the electrical configuration of an embodiment to which the present invention is applied.

FIG. 2 is a flowchart showing the dictionary creation procedure.

FIG. 3 is a flowchart showing the pattern recognition procedure.

FIG. 4 is an explanatory diagram of applying noise removal and other preprocessing to an input learning character pattern and then applying a Gaussian filter.

FIG. 5 is an explanatory diagram of the Gaussian filter.

FIG. 6 is an explanatory diagram showing how a membership function is created.

FIG. 7 is an explanatory diagram of extracting the features of a learning character pattern using 64 pixels.

FIG. 8 is an explanatory diagram of a character pattern in the feature space.

FIG. 9 is an explanatory diagram of a sample distribution in the feature space.

FIG. 10 is an explanatory diagram of principal component expansion based on the eigenvectors.

FIG. 11 is an explanatory diagram of coordinate transformation based on two eigenvectors.

FIG. 12 is an explanatory diagram of similarity distributions expressed as a normal distribution, a non-normal distribution, and a membership function.

[Explanation of symbols]

1: central control unit
2: bus line
3: input unit
4: preprocessing unit
5: feature extraction unit
6: principal component expansion unit
7: membership value calculation unit
8: similarity calculation unit
9: determination unit
10: output unit
11: work area
12: principal component dictionary
13: membership function dictionary
20, 21, 22: learning character patterns
40: Gaussian filter

─────────────────────────────────────────────────────

[Procedure amendment]

[Submission date] August 6, 1991

[Procedure Amendment 1]

[Document to be amended] Specification

[Item to be amended] 0005

[Method of amendment] Change

[Content of amendment]

[0005] Here, x is the input pattern vector, x_i is its i-th component, and m_i is the i-th component of the standard pattern vector of the category to be recognized. The number of feature dimensions is n.

[Procedure Amendment 2]

[Document to be amended] Specification

[Item to be amended] 0010

[Method of amendment] Change

[Content of amendment]

[0010] Principal component analysis is then performed on these learning samples, and the appearance probability of samples on each principal component axis is assumed to follow a normal distribution 25 as shown in FIG. 12(a).

[Procedure Amendment 3]

[Document to be amended] Specification

[Item to be amended] 0015

[Method of amendment] Change

[Content of amendment]

[0015] Here, φ_i and λ_i are the eigenvectors and eigenvalues obtained from the covariance matrix of the learning data of the category being discriminated. They correspond, respectively, to the principal component vectors of the data distribution and to the variances along the principal component axes.

[Procedure Amendment 4]

[Document to be amended] Drawings

[Item to be amended] FIG. 7

[Method of amendment] Change

[Content of amendment]

[FIG. 7]

Claims

[Claims]

1. A pattern recognition device that discriminates patterns by similarities on a plurality of principal component axes obtained by multidimensional data analysis, characterized in that the similarity on the main principal component axes is calculated by applying preset membership functions relating to similarity.