JP3537949B2 - Pattern recognition apparatus and dictionary correction method in the apparatus

Info

Publication number: JP3537949B2
Application number: JP04903596A
Authority: JP
Inventors: 聡典河村; 恒雄新田
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1996-03-06
Filing date: 1996-03-06
Publication date: 2004-06-14
Anticipated expiration: 2016-03-06
Also published as: JPH09245125A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of the Invention] The present invention relates to a pattern recognition apparatus suitable for character recognition, speech recognition, and the like, and to a dictionary correction method for that apparatus.

[0002]

[Description of the Related Art] Various pattern recognition techniques have been proposed. They fall broadly into two groups: techniques that discriminate using knowledge in the form of identification rules describing the differences between patterns, and statistical pattern recognition techniques, which perform recognition using a recognition dictionary created by statistically processing a large amount of sample data without human intervention.

[0003] In a statistical pattern recognition technique, various feature values are extracted from an input pattern and arranged into an n-dimensional feature vector. A recognition dictionary is created for each category by statistically examining the distribution of feature vectors in the n-dimensional feature space, and a recognition result is output based on the evaluation value obtained by matching the feature vector extracted from the input pattern against the dictionary of each category.

[0004] Typical statistical pattern recognition techniques include the subspace method and the pseudo-Bayes discrimination method (Transactions of the Institute of Electronics, Information and Communication Engineers, November 1995, Vol. J78-D-II, No. 11, pp. 1627-1638).

[0005] In the subspace method, for example, the distribution of the n-dimensional feature vectors of each category is described by an m-dimensional subspace (m < n), and the orthonormal basis vectors of that subspace serve as the dictionary (recognition dictionary). The projection of the input feature vector onto each category's subspace is used as the evaluation value, and recognition results are output in descending order of evaluation value. Because this statistical approach describes pattern variation well and achieves high recognition performance, it is widely applied in pattern recognition fields such as character recognition and speech recognition.
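The scoring step of the subspace method can be sketched as follows. This is a minimal illustration, not the implementation described in the cited paper: it assumes the evaluation value is the squared length of the projection onto each category subspace (a common choice; the text only says "projection value"), and the function names and the array layout are hypothetical.

```python
import numpy as np

def subspace_scores(x, bases):
    """Score an input vector against every category subspace.

    x     : (n,) input feature vector
    bases : (C, m, n) array; bases[c] holds the m orthonormal basis
            vectors of category c (assumed layout)
    Returns a (C,) array of squared projection lengths.
    """
    proj = bases @ x                  # (C, m): inner product with each basis vector
    return np.sum(proj ** 2, axis=1)  # squared length of the projection

def rank_categories(x, bases):
    """Return category indices in descending order of evaluation value."""
    return np.argsort(-subspace_scores(x, bases))
```

Each score costs m × n multiply-accumulate operations per category, which is the quantity the cost discussion that follows is built on.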

[0006]

[Problems to Be Solved by the Invention] Recently, with the spread of high-performance personal computers, demand has grown for techniques that achieve fast, high-performance pattern recognition in software with a small memory footprint.

[0007] With the conventional recognition techniques described above, however, both the memory capacity needed for the recognition dictionary and the amount of computation needed for recognition are large. Consider, for example, the amount of computation when the subspace method is applied to Japanese character recognition. Suppose there are about 3000 categories to discriminate (the JIS level-1 characters), the features used for discrimination are 256-dimensional, and each subspace is 16-dimensional. To obtain the evaluation value between the input feature vector and the dictionary of one category, that is, the projection of the input feature vector onto the subspace, the projections onto the 16 orthonormal basis vectors representing the 16-dimensional subspace must be computed, which takes 256 × 16 multiply-accumulate operations. Since discrimination requires the evaluation value against the dictionary of every category, a total of 256 × 16 × 3000 = 12,288,000 multiply-accumulate operations is required.

[0008] Next, consider the recognition dictionary capacity in the same example. Every orthonormal basis vector of every category must be stored, so if each vector element is represented by, say, 1 byte, the recognition dictionary capacity is 256 × 16 × 3000 × 1 byte = 12,288,000 bytes = 11.7 Mbytes (megabytes).
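The two figures in this example can be checked with a few lines of arithmetic. The variable names below are illustrative only; the 11.7 figure comes out when a megabyte is taken as 2^20 bytes.

```python
n_features = 256        # feature dimension n
subspace_dim = 16       # orthonormal basis vectors per category
n_categories = 3000     # approximate number of JIS level-1 characters
bytes_per_element = 1   # storage per vector element, as in the example

# One multiply-accumulate per stored dictionary element
macs = n_features * subspace_dim * n_categories
dict_bytes = macs * bytes_per_element

print(macs)               # 12288000
print(dict_bytes / 2**20) # 11.71875
```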

[0009] Thus, because the recognition computation is so large, the conventional techniques cannot run fast without dedicated hardware. The memory capacity required for the recognition dictionary is also very large, which raises cost.

[0010] To address these problems, a technique is also known in which the number of feature dimensions is first reduced by feature selection and the subspace method or the pseudo-Bayes discrimination method is then applied (Transactions of the Institute of Electronics, Information and Communication Engineers, November 1995, Vol. J78-D-II, No. 11, pp. 1627-1638).

[0011] In that case, however, recognition performance degrades as the computation and dictionary capacity are reduced. The present invention has been made in view of these circumstances. Its object is to provide a pattern recognition apparatus, and a dictionary correction method for that apparatus, that need only a small memory capacity for the recognition dictionary and the like and only a small amount of computation for recognition, while achieving recognition performance equivalent to that of the conventional techniques.

[0012]

[Means for Solving the Problems] The present invention concerns a pattern recognition apparatus that performs a recognition process in which an n-dimensional feature vector is extracted from an input pattern; an m-dimensional feature vector (m < n) effective for recognition is selected from the extracted n-dimensional feature vector using a feature selection dictionary (for example, one created by principal component analysis of a set of feature vectors of many learning patterns); an evaluation value is calculated by matching the selected m-dimensional feature vector against a recognition dictionary consisting of a set of m-dimensional reference vectors; and recognition candidates are output in an order based on the calculation result. The apparatus is characterized in that, in a dictionary learning mode, the following series of processes (competitive learning) is repeated a predetermined number of times. A plurality of learning patterns are input in sequence, and each time the recognition process is performed on the learning pattern. Based on the result, the degree of misrecognition is detected using a loss function. This loss function approaches a first boundary value the more the degree of match between the m-dimensional feature vector (selected from the n-dimensional feature vector extracted from that learning pattern) and the reference vector of the correct category exceeds its degree of match with the reference vector of an incorrect category, and conversely approaches a second boundary value, different from the first, the more it falls below. Both the recognition dictionary and the feature selection dictionary are then corrected by learning so as to reduce that degree of misrecognition.

[0013]

[0014]

[0015] In the present invention, feature selection reduces the number of features used for discrimination, which keeps the recognition dictionary capacity and the recognition computation low. In addition, correcting the recognition dictionary and the feature selection dictionary by competitive learning in the dictionary learning mode, based on the recognition results for the learning patterns, allows recognition to use the corrected recognition dictionary and allows features effective for discrimination to be selected with the corrected feature selection dictionary, so that highly accurate recognition can be achieved.

[0016]

[0017]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the drawings.

[First Embodiment] FIG. 1 is a block diagram showing the schematic configuration of a pattern recognition apparatus according to a first embodiment of the present invention. The apparatus shown in FIG. 1 is implemented using, for example, a personal computer that executes software for pattern recognition, and consists of the following functional elements: a data input unit 11, a feature extraction unit 12, a feature selection unit 13, an identification unit 14, a recognition result output unit 15, a feature selection dictionary 16, a recognition dictionary 17, and a recognition dictionary correction unit 18. A control unit and the like that control the apparatus as a whole are omitted from the figure.

[0018] The data input unit 11 inputs the data to be recognized (pattern data such as a character pattern or a speech pattern). The feature extraction unit 12 extracts an n-dimensional feature vector from the data (input pattern) input by the data input unit 11.

[0019] The feature selection unit 13 uses the feature selection dictionary 16 to select, from the n-dimensional feature vector extracted by the feature extraction unit 12, an m-dimensional feature vector (m < n) effective for recognition.

[0020] The identification unit 14 calculates evaluation values by matching the m-dimensional feature vector selected by the feature selection unit 13 against the recognition dictionary 17, and outputs recognition candidates in an order based on the calculation result (here, in descending order of evaluation value).

[0021] The recognition result output unit 15 displays the recognition candidates output from the identification unit 14 on, for example, a display device (not shown). The feature selection dictionary 16 is used to select an m-dimensional feature vector (m < n) from an n-dimensional feature vector.

[0022] The recognition dictionary 17 consists of a set of m-dimensional reference vectors used to recognize m-dimensional feature vectors. The recognition dictionary correction unit 18 detects the degree of misrecognition based on the recognition result of the identification unit 14 and corrects the recognition dictionary 17 so as to reduce it.

[0023] Next, the operation of the configuration of FIG. 1 is described with reference to FIGS. 2 to 6 as appropriate. In this embodiment, a mode designation unit (not shown), implemented with input means such as a keyboard, mouse, or switches, allows selection between a recognition mode, in which pattern recognition is executed, and a dictionary learning mode, in which a learning process (recognition dictionary correction process) for learning (correcting) the recognition dictionary 17 is executed.

[0024] The following are described in order: (a) the recognition process in the recognition mode, and (b) the learning process (recognition dictionary correction process) in the dictionary learning mode.

(a) Recognition process in the recognition mode

First, the recognition process performed when the apparatus of FIG. 1 is set to the recognition mode is described with reference to the flowchart of FIG. 2.

[0025] The data input unit 11 inputs the pattern to be recognized, such as a character pattern or a speech pattern. The feature extraction unit 12 extracts features from the pattern input by the data input unit 11 (step S1). Taking character recognition as an example, when a 15 × 15-pixel binary character pattern such as that shown in FIG. 3 is input, white pixels are treated as "0" and black pixels as "1", and the vector (0, 0, …, 1, 1, …, 0) obtained by scanning in order from the upper-left corner to the lower-right corner is extracted and used as the feature vector.
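The extraction in step S1 can be sketched as a simple flattening of the bitmap. This assumes the scan runs row by row, left to right (the text only says "from the upper-left corner to the lower-right corner"); the function name is hypothetical.

```python
import numpy as np

def extract_features(bitmap):
    """Flatten a binary character bitmap (black = 1, white = 0) into a vector.

    bitmap : 2-D array-like of 0/1 values, e.g. 15 x 15
    Returns a 1-D array in raster order (assumed: row by row).
    """
    img = np.asarray(bitmap, dtype=np.uint8)
    return img.reshape(-1)  # row-major order: left to right, top to bottom
```

For a 15 × 15 input the result has n = 225 elements, matching the dimension used in the example that follows.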

[0026] The feature selection unit 13 selects, from the n-dimensional feature extracted by the feature extraction unit 12 (n = 15 × 15 = 225 in the character pattern example above), the m-dimensional feature (m < n) needed for discrimination (step S2). Although step S2 is called "selection", it does not pick m of the n feature vector elements; instead, the following operation is performed using the feature selection dictionary 16.

[0027] Let the input feature (the n-dimensional feature extracted by the feature extraction unit 12) be represented by the n-dimensional vector X = (x1, x2, …, xn)^T (where T denotes transposition), and let the feature selection dictionary 16 be represented by an m × n matrix P. In step S2, the m-dimensional feature X′ is then selected from the n-dimensional feature X according to

X′ = PX …(1)

[0028] The feature selection dictionary 16 (= P) is designed using feature vectors extracted from many learning patterns, for example by the following procedure. First, let the set of n-dimensional feature vectors used to create the dictionary be {X1, X2, …, XN}. From this set, an n × n matrix K is calculated according to the following equation (2).

[0029]

[Math 1] (equation (2); not reproduced in this text)

[0030] Next, with the eigenvectors of the matrix K denoted φ1, φ2, … in descending order of their corresponding eigenvalues, the m × n matrix P defined by

P = (φ1 φ2 … φm)^T …(3)

is used as the feature selection dictionary 16. The eigenvectors of K, taken in descending order of eigenvalue, represent the first axis, second axis, … of the distribution of the feature vector set.
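The design procedure for P and the selection step of equation (1) can be sketched as below. Because equation (2) is not reproduced in this text, the sketch assumes K is the autocorrelation matrix (1/N) Σ Xk Xk^T; a covariance matrix with the mean subtracted would be the other common choice for principal component analysis. Function names are hypothetical.

```python
import numpy as np

def build_feature_selection_dictionary(X_train, m):
    """Return the m x n matrix P of equation (3).

    X_train : (N, n) array, one n-dimensional feature vector per row
    m       : number of selected dimensions (m < n)
    ASSUMPTION: K = (1/N) * sum_k X_k X_k^T (autocorrelation matrix).
    """
    N = X_train.shape[0]
    K = X_train.T @ X_train / N            # (n, n) symmetric matrix
    eigvals, eigvecs = np.linalg.eigh(K)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:m]  # indices of the m largest eigenvalues
    return eigvecs[:, order].T             # rows are phi_1 ... phi_m

def select_features(P, x):
    """Equation (1): X' = P X."""
    return P @ x
```

With n = 225 and, say, m = 16, `select_features` reduces each bitmap vector to 16 values using n × m multiply-accumulate operations.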

[0031] From equation (1), writing X′ = (x′1, x′2, …, x′m)^T, we have

x′i = (X, φi) …(4)

that is, x′i is the inner product of X and φi. Feature selection with the feature selection dictionary 16 (= P) obtained as above therefore amounts to projecting onto the principal component space of the set of learning feature vectors. Alternatively, with λi denoting the eigenvalue corresponding to the eigenvector φi, the m × n matrix P (feature selection dictionary 16) may be defined by the following equation.

[0032]

[Math 2] (not reproduced in this text)
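The next paragraph describes this variant as normalizing the variance of the projections onto each principal axis. One form consistent with that description, offered as an assumption rather than as the patent's own equation, scales each eigenvector by the inverse square root of its eigenvalue. Taking K as the covariance matrix of the learning feature vectors (also an assumption), the variance of the projection onto φi is λi, so the scaled projection has unit variance:

```latex
P = \left( \frac{\varphi_1}{\sqrt{\lambda_1}} \;\; \frac{\varphi_2}{\sqrt{\lambda_2}} \;\; \cdots \;\; \frac{\varphi_m}{\sqrt{\lambda_m}} \right)^{T},
\qquad
\operatorname{Var}\!\left[ \frac{(X, \varphi_i)}{\sqrt{\lambda_i}} \right] = \frac{\lambda_i}{\lambda_i} = 1 .
```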

[0033] This m × n matrix P (feature selection dictionary 16) normalizes the variance of the projections of the learning patterns onto each principal axis.

The m-dimensional feature (feature vector) X′ selected by the feature selection unit 13 according to equation (1) is passed to the identification unit 14. The identification unit 14 performs an identification process in which it matches the selected feature vector X′ against the recognition dictionary 17, classifies it into categories, and outputs recognition candidates (step S3). The details of step S3 are as follows.

[0034] First, the recognition dictionary 17 holds, for each category, one or more vectors representative of that category, called reference vectors, and consists of a set of M m-dimensional reference vectors {R1, R2, …, RM} in total over all categories (M is at least the number of categories).

[0035] With the recognition dictionary 17 structured in this way, the identification unit 14 calculates the distance d(X′, Ri) between the m-dimensional feature vector X′ selected by the feature selection unit 13 and each reference vector Ri (i = 1 to M) in the recognition dictionary 17 according to the following equation.

[0036]

d(X′, Ri) = (X′ − Ri)² …(6)

The identification unit 14 then outputs to the recognition result output unit 15, as recognition candidates, the categories to which the reference vectors Ri belong, in ascending order of the distance calculated by equation (6) (that is, in descending order of the evaluation value indicating the degree of match between the vectors X′ and Ri).
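Step S3 can be sketched as a nearest-reference-vector ranking. The function name and argument layout are hypothetical, and listing each category only once (by its closest reference vector) is an assumption; the text only says that categories are output in ascending order of distance.

```python
import numpy as np

def classify(x_sel, refs, ref_labels):
    """Rank categories by squared distance to their nearest reference vector.

    x_sel      : (m,) selected feature vector X'
    refs       : (M, m) array of reference vectors R_1 ... R_M
    ref_labels : length-M sequence, category of each reference vector
    Returns a list of (category, distance) pairs, nearest first.
    """
    d = np.sum((refs - x_sel) ** 2, axis=1)  # equation (6) for every R_i
    ranked, seen = [], set()
    for i in np.argsort(d):                  # ascending distance
        label = ref_labels[i]
        if label not in seen:                # assumed: one entry per category
            seen.add(label)
            ranked.append((label, float(d[i])))
    return ranked
```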

[0037] The recognition result output unit 15 receives the recognition candidates output from the identification unit 14 and displays them on a display device (not shown). This completes the recognition procedure. As is clear from this procedure, the capacity of the feature selection dictionary 16 is proportional to the number of dimensions of the selected feature vector X′. Since the number of dimensions of each reference vector Ri equals that of the selected feature vector X′, the capacity of the recognition dictionary 17 is also proportional to the number of dimensions of X′.

[0038] The amount of computation required for recognition (the number of multiply-accumulate operations) is, for the feature selection unit 13 and the identification unit 14 respectively:

Feature selection unit 13: (number of feature vector dimensions) × (number of selected feature vector dimensions) operations
Identification unit 14: (total number of reference vectors) × (number of selected feature vector dimensions) operations

so it, too, is proportional to the number of dimensions of the selected feature vector.
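As a rough illustration of these two counts, the sketch below plugs in numbers borrowed from the earlier subspace-method example. The choice of m = 16 and of one reference vector per category (3000 in total) is an assumption made here for comparison; the text does not state these values for the apparatus.

```python
def recognition_macs(n, m, total_refs):
    """Multiply-accumulate counts for feature selection and identification."""
    selection = n * m                # X' = P X with an m x n matrix P
    identification = total_refs * m  # one distance per reference vector
    return selection, identification

sel, ident = recognition_macs(n=256, m=16, total_refs=3000)
print(sel, ident, sel + ident)       # 4096 48000 52096
```

Under these assumed values the total is 52,096 operations, against 12,288,000 for the subspace-method example with the same n and category count.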

[0039] Accordingly, the lower the number of dimensions of the selected feature vector, the smaller the capacity of the recognition dictionary 17 (dictionary capacity) and the less computation the recognition process needs. In practice, the number of dimensions of the selected feature vector is determined by the constraints on dictionary capacity and computation imposed on the apparatus.

[0040] Clearly, lowering the number of dimensions of the selected feature vector is advantageous in terms of dictionary capacity and computation, but it also reduces the amount of information in the recognition dictionary 17, so recognition performance can be expected to degrade.

[0041] The present apparatus therefore improves recognition performance by introducing competitive learning, described below, in which learning patterns are actually recognized and the recognition dictionary 17 is corrected so as to minimize misrecognition.

(b) Learning process in the dictionary learning mode (recognition dictionary correction process)

The learning process performed when the apparatus of FIG. 1 is set to the dictionary learning mode is described below with reference to the flowchart of FIG. 4.

[0042] First, an initial recognition dictionary 17 is created. As described above, it holds for each category one or more reference vectors representative of that category and consists of a set of M m-dimensional reference vectors {R1, R2, …, RM} in total over all categories. Each reference vector in the initial recognition dictionary 17 is designed, for example, as the mean of the selected feature vectors of the learning patterns belonging to its category. Each category may have any number of reference vectors as long as it has at least one; when a category has several, all of them may be initialized to that mean vector.

[0043] When the dictionary learning mode is set in the apparatus of FIG. 1, the control unit causes the data input unit 11 to input in sequence all the learning patterns registered in advance in an external storage device such as a magnetic disk device. Each time, the learning pattern is actually recognized (by the procedure shown in the flowchart of FIG. 2) and the recognition dictionary 17 is corrected based on the recognition result. This series of operations is repeated for the target number of learning iterations according to the flowchart of FIG. 4.

[0044] That is, a counter t that counts learning iterations is set to its initial value 0 (step S11). If t has not reached the target (specified) number of learning iterations (step S12), a counter i that counts the learning patterns to be recognized is set to its initial value 0 (step S13). Since i has not yet reached the predetermined (specified) number of learning patterns (step S14), the i-th learning pattern is given to the apparatus of FIG. 1 and recognized by the procedure shown in the flowchart of FIG. 2 (step S15), and the recognition dictionary correction unit 18 corrects the recognition dictionary 17 based on the recognition result (step S16). The correction process of step S16 is described in detail later.

[0045] When step S16 finishes, the counter i is incremented by 1 (step S17), and the processing from step S14 onward is repeated: the next learning pattern is recognized and the recognition dictionary 17 is corrected using the result.

[0046] Once recognition and the resulting correction of the recognition dictionary 17 have been performed for the predetermined number of learning patterns, processing proceeds from step S14 to step S18, where the counter t is incremented by 1. If the incremented counter t (the number of learning iterations actually performed) has not reached the target number of learning iterations (step S12), the processing from step S13 onward is performed again.

[0047] When the counter t eventually reaches the target number of learning iterations, the series of learning processes according to the flowchart of FIG. 4 ends. The recognition dictionary correction in step S16 by the recognition dictionary correction unit 18 is performed by defining a loss due to misrecognition when the learning pattern is recognized in step S15, and correcting the reference vectors in the direction that reduces that loss. The details are as follows.

[0048] In this embodiment, when the selected feature vector X′k of a learning pattern is recognized, let Ri be the highest-ranked reference vector among those of the same category as the learning pattern (the correct category), and let Rj be the highest-ranked reference vector among those of categories different from that of the learning pattern (incorrect categories). The loss function h(X′k) for recognizing X′k is then defined as follows.

[0049]

h(X′k) = f(g(X′k)) …(7)
g(X′k) = d(X′k, Ri) − d(X′k, Rj) …(8)
f(x) = 1/{1 + exp(−αx)} (α > 0) …(9)

Here, d(X′k, Ri) and d(X′k, Rj) are the distance function defined in equation (6). The former represents the degree of match with the highest-ranked reference vector of the same category as the learning pattern (the larger the distance, the lower the degree of match), and the latter represents the degree of match with the highest-ranked reference vector of a different category. The function f(x) is a sigmoid function as shown in FIG. 5.
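Equations (6) to (9) translate directly into a few lines of code. In this sketch the function names are hypothetical and the default α = 1.0 is an assumption; the text only requires α > 0.

```python
import numpy as np

def sq_dist(x, r):
    """Equation (6): squared Euclidean distance between x and r."""
    diff = np.asarray(x, dtype=float) - np.asarray(r, dtype=float)
    return float(diff @ diff)

def loss(x_sel, r_correct, r_wrong, alpha=1.0):
    """Equations (7)-(9): h = f(g), with g = d(X', Ri) - d(X', Rj)."""
    g = sq_dist(x_sel, r_correct) - sq_dist(x_sel, r_wrong)
    return 1.0 / (1.0 + np.exp(-alpha * g))  # sigmoid f
```

When X′ is much closer to the correct reference vector, g is strongly negative and the loss is near 0; when it is much closer to the wrong one, the loss is near 1; at equal distances it is exactly 0.5.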

[0050] The loss function h(X′k) defined above takes a smaller value (here, closer to 0) the smaller the distance between the selected feature vector X′k of the learning pattern and the reference vector of the correct category is relative to its distance from the incorrect category, and a larger value (here, closer to 1) the larger that distance is. It is therefore clearly an evaluation function representing the degree of misrecognition.

[0051] If the mean L of the loss function h(X′k) over the selected feature vectors {X′k | k = 1, …, N} of all learning patterns is defined as in the following equation (10), then the smaller the value of L, the better the classifier.

[0052]

[Math 3] L = (1/N) Σ h(X′k), summed over k = 1, …, N …(10)

[0053] However, it is difficult to find analytically the reference vectors that minimize L (the function L) in equation (10). In this embodiment, therefore, the recognition dictionary correction unit 18 uses the well-known steepest gradient method (steepest descent method) to correct (update) the reference vectors little by little and find a local minimum.

[0054] Specifically, the identification unit 14 is made to recognize the selected feature vector X'k of a learning pattern. Two reference vectors are then corrected: the reference vector Ri that ranked highest among the reference vectors of the same category as the learning pattern (the correct category), and the reference vector Rj that ranked highest among the reference vectors of a category different from that of the learning pattern (an incorrect category). Using the derivative of the loss function h(X'k) with respect to Ri and Rj, that is, the gradient of h(X'k) in the reference vector space, the recognition dictionary correction unit 18 corrects (updates) Ri and Rj little by little in the direction that decreases h(X'k), according to the following rule.

[0055]

[Equation 4: equations (11) to (14)]

[0056] In the above rule, ε(t) determines the learning rate. It is a decreasing function of the counter value t (the number of learning pattern presentations t), which takes positive values; for example, ε(t) = 1/(t + 10) is used.
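A single correction step can be sketched in Python as follows. The schedule ε(t) = 1/(t + 10) is taken from the text; everything else is an assumption made for illustration, since equations (11) to (14) are not reproduced here: the loss is taken as a logistic sigmoid of the difference of squared Euclidean distances, the counter t is started at 1, and the names are hypothetical.

```python
import math

def epsilon(t):
    """Learning-rate schedule from the text: eps(t) = 1 / (t + 10)."""
    return 1.0 / (t + 10)

def sq_dist(x, r):
    return sum((a - b) ** 2 for a, b in zip(x, r))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def update_refs(x, r_i, r_j, t):
    """One steepest-descent step on h = sigmoid(d(x, r_i) - d(x, r_j)).
    r_i: best correct-category reference; r_j: best incorrect-category one.
    Returns the updated (r_i, r_j)."""
    h = sigmoid(sq_dist(x, r_i) - sq_dist(x, r_j))
    slope = h * (1.0 - h)            # derivative of the logistic sigmoid
    step = 2.0 * epsilon(t) * slope
    # dh/dr_i = -2*slope*(x - r_i): descending pulls r_i toward x.
    new_r_i = [r + step * (a - r) for a, r in zip(x, r_i)]
    # dh/dr_j = +2*slope*(x - r_j): descending pushes r_j away from x.
    new_r_j = [r - step * (a - r) for a, r in zip(x, r_j)]
    return new_r_i, new_r_j

x = [1.0, 0.0]
r_i, r_j = [0.0, 0.0], [2.0, 0.0]   # x is equidistant, so h starts at 0.5
for t in range(1, 21):
    r_i, r_j = update_refs(x, r_i, r_j, t)
print(r_i, r_j)
```

Under these assumptions the correct-category vector drifts toward the learning pattern and the incorrect-category vector drifts away, with the step size shrinking as t grows.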

[0057] By repeating the correction of the recognition dictionary 17 under this rule a specified number of times, the recognition dictionary 17 can be corrected into one with a small loss due to misrecognition, that is, one with few misrecognitions and good recognition performance.

[0058] FIG. 6 shows a line graph of recognition performance (recognition rate) against the amount of computation (recognition dictionary size) when handwritten character recognition is performed in the apparatus of FIG. 1 using the recognition dictionary 17 corrected as described above, in comparison with the subspace method, which is a conventional technique. The recognition targets here are katakana characters.

[0059] In FIG. 6, the horizontal axis represents the ratio of the number of product-sum operations, taking as 1 the number required by the (conventionally known) subspace method using a 16-dimensional subspace of 256-dimensional features, and is on a logarithmic scale. This ratio can equally be regarded as the ratio of recognition dictionary sizes. The vertical axis represents the recognition rate.

[0060] The line graph denoted by reference numeral 61 in FIG. 6 represents the recognition performance of the apparatus of this embodiment. It connects the recognition results (recognition rates) obtained when 16-, 32-, 64-, and 128-dimensional features, respectively, are selected from the 256-dimensional features and recognition is performed with one reference vector per category.

[0061] The line graph denoted by reference numeral 62 shows the recognition performance of the method in which the 256-dimensional features are used as they are, without feature selection, and the subspace method is applied to them. It connects the recognition results obtained with subspace dimensions of 1, 2, 4, 8, and 16.

[0062] Similarly, the line graphs denoted by reference numerals 63, 64, 65, and 66 show the recognition performance of the method (feature selection + subspace method) in which 16-, 32-, 64-, and 128-dimensional features, respectively, are selected and the subspace method is applied to the selected features. Each connects the recognition results obtained with subspace dimensions of 1, 2, 4, 8, and 16. In the examples of graphs 63 to 66, feature selection reduces the number of product-sum operations and the recognition dictionary size compared with graph 62, but because the recognition dictionary is not corrected as it is in this embodiment, the recognition performance is inferior.

[0063] That is, comparing the conventionally known subspace method with the feature selection + subspace method shows that the latter achieves recognition performance comparable to the subspace method even when, for example, 64-dimensional features are selected and the number of product-sum operations is reduced to 1/4. Reducing the number of selected dimensions any further, however, degrades the recognition performance markedly.

[0064] Next, comparing the subspace method with the method of this embodiment shows that the method of this embodiment achieves recognition performance comparable to the subspace method even when the number of product-sum operations is reduced to about 1/30.
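The arithmetic behind the operation-count ratios can be checked with a small Python sketch. It assumes, as a simplification not stated in the source, that the cost per category is one product-sum pass of length equal to the feature dimension for each subspace basis vector or reference vector, and that the cost of the feature selection projection is ignored. The function name is hypothetical.

```python
BASELINE_OPS = 256 * 16  # 256-dimensional features, 16-dimensional subspace

def op_ratio(feature_dims, vectors_per_category):
    """Per-category product-sum (multiply-accumulate) count relative to the
    baseline, under the simplifying assumptions stated above."""
    return feature_dims * vectors_per_category / BASELINE_OPS

print(op_ratio(256, 16))  # 1.0      baseline subspace method
print(op_ratio(64, 16))   # 0.25     64 selected features, 16-dim subspace
print(op_ratio(128, 1))   # 0.03125  128 selected features, one reference vector
```

The second line reproduces the 1/4 figure given for 64-dimensional feature selection. The source does not say which configuration corresponds to the figure of about 1/30, so the last line only illustrates the arithmetic.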

[0065] FIG. 6 thus shows that the method of this embodiment keeps the amount of computation required for recognition low through feature selection while maintaining high recognition accuracy by correcting the recognition dictionary 17 through learning, which confirms the effect of the method.

[Second Embodiment]

FIG. 7 is a block diagram showing the schematic configuration of a pattern recognition apparatus according to a second embodiment of the present invention. The configuration of FIG. 7 is characterized in that a feature selection dictionary correction unit 19 is added to the configuration of FIG. 1. This unit detects the degree of misrecognition based on the recognition result of the identification unit 14 and corrects the feature selection dictionary 16 so as to reduce that degree. The configuration thus differs from that of FIG. 1 in that, in the dictionary learning mode, not only the recognition dictionary 17 but also the feature selection dictionary 16 is corrected. In other words, whereas the configuration of FIG. 1 uses the feature selection dictionary 16 created by principal component analysis as it is, the configuration of FIG. 7 corrects the feature selection dictionary 16 by learning, which enables feature selection that is more advantageous for discrimination and further improves recognition performance. Recognition processing in the recognition mode is performed according to the flowchart of FIG. 2, as in the first embodiment.

[0066] The operation in the dictionary learning mode (learning processing) is described below with reference to the flowchart of FIG. 8, focusing on the feature selection dictionary correction processing performed by the feature selection dictionary correction unit 19. When the dictionary learning mode is set in the apparatus of FIG. 7, the control unit causes the data input unit 11 to input, one after another, all the learning patterns registered in advance in the external storage device. Each time, the learning pattern is actually recognized (by the procedure shown in the flowchart of FIG. 2), and the recognition dictionary 17 and the feature selection dictionary 16 are corrected based on the recognition result. This operation is repeated for the target number of learning iterations according to the flowchart of FIG. 8 (steps S21 to S27). The difference from the first embodiment is that, in the correction processing step S26, which corresponds to step S16 of the first embodiment, not only the recognition dictionary 17 but also the feature selection dictionary 16 is corrected.

[0067] The correction of the recognition dictionary 17 and the feature selection dictionary 16 in step S26 is performed as follows: a loss function h(X'k) = h(PXk) similar to that of the first embodiment is defined, and the feature selection dictionary 16 is corrected by the steepest descent method in the direction that reduces this loss. The initial feature selection dictionary 16 (= P) used in this embodiment is assumed to have been created by principal component analysis, as in the first embodiment. The initial recognition dictionary 17 is likewise assumed to have been created by principal component analysis using this initial feature selection dictionary 16, as in the first embodiment.

[0068]

[Equation 5: equations (15) to (20)]

[0069] Here, according to equations (15) and (18), the feature selection dictionary P, that is, the feature selection dictionary 16, is corrected using the derivative of the loss function h(X'k) = h(PXk) with respect to the feature selection dictionary (feature selection dictionary matrix) P, in other words the gradient of h(X'k) = h(PXk) in the parameter space of the feature selection dictionary.
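How a gradient with respect to P arises can be sketched in Python with the chain rule. Equations (15) and (18) are not reproduced in this text, so the sketch is only an illustration under assumptions: the loss is a logistic sigmoid of the difference of squared Euclidean distances, P is a plain m-by-n matrix, and a fixed step size eps stands in for ε(t). All names are hypothetical.

```python
import math

def project(P, x):
    """Select m features from n: X' = P X (P is an m-by-n matrix)."""
    return [sum(p * a for p, a in zip(row, x)) for row in P]

def sq_dist(x, r):
    return sum((a - b) ** 2 for a, b in zip(x, r))

def loss(P, x, r_i, r_j):
    """h(P X) = sigmoid(d(P X, r_i) - d(P X, r_j))."""
    xs = project(P, x)
    return 1.0 / (1.0 + math.exp(sq_dist(xs, r_j) - sq_dist(xs, r_i)))

def update_P(P, x, r_i, r_j, eps):
    """One steepest-descent step on h(P X) with respect to P.
    Chain rule: dh/dP = (dh/dX') X^T, with dh/dX' = 2*slope*(r_j - r_i)."""
    h = loss(P, x, r_i, r_j)
    slope = h * (1.0 - h)
    grad_xs = [2.0 * slope * (b - a) for a, b in zip(r_i, r_j)]
    return [[p - eps * g * a for p, a in zip(row, x)]
            for row, g in zip(P, grad_xs)]

P = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # toy 2-by-3 selection matrix
x = [1.0, 1.0, 1.0]
r_i, r_j = [0.0, 1.0], [2.0, 1.0]        # equidistant from P x, so h = 0.5
before = loss(P, x, r_i, r_j)
P = update_P(P, x, r_i, r_j, eps=0.05)
after = loss(P, x, r_i, r_j)
print(before, after)
```

In this toy case the step changes only the first row of P, because the two reference vectors differ only in their first component, and the loss for the pattern drops below its starting value of 0.5.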

[0070] The correction of the reference vector Ri (equations (16) and (19)) and the correction of the reference vector Rj (equations (17) and (20)), on the other hand, are the same as the correction of the reference vector Ri (equations (11) and (13)) and the correction of the reference vector Rj (equations (12) and (14)) in the first embodiment.

[0071] By repeating the correction of the recognition dictionary 17 and the feature selection dictionary 16 under these rules a specified number of times, both can be corrected into dictionaries with a small loss due to misrecognition.

[0072] FIG. 9 shows the recognition performance obtained when similar characters are recognized (discriminated) by the apparatus of FIG. 7 (the recognition method of this embodiment) using the recognition dictionary 17 and the feature selection dictionary 16 corrected as described above, in comparison with the recognition performance of the apparatus of FIG. 1 (the recognition method of the first embodiment) and that of the conventional feature selection + subspace method. In this example, 64-dimensional features are selected by feature selection. The subspace method uses a 3-dimensional subspace, while the first embodiment and this (second) embodiment use three reference vectors per category, so that the methods are compared at the same amount of computation.

[0073] As is clear from FIG. 9, the recognition method of the second embodiment achieves higher recognition performance than the first embodiment, to say nothing of the conventional method.

In the dictionary learning mode of the above embodiments (the first and second embodiments), a plurality of learning patterns prepared in advance are input one at a time; each time, the learning pattern is actually recognized and the dictionary (the recognition dictionary 17, or the recognition dictionary 17 and the feature selection dictionary 16) is corrected based on the recognition result, and this series of operations is repeated for the target number of learning iterations. The invention is not limited to this procedure, however. For example, for a single learning pattern, the operation of inputting the pattern, actually recognizing it, and correcting the dictionary based on the recognition result may be repeated for the target number of learning iterations before switching to the next learning pattern. With that scheme, however, the next learning pattern is not used until the learning processing with the current pattern has been repeated the target number of times, so the dictionary obtained at the end of the whole learning process (the recognition dictionary 17, or the recognition dictionary 17 and the feature selection dictionary 16) may no longer reflect what was learned from the learning patterns (and their categories) used early in the process. Performing the learning processing by the procedure applied in the embodiments therefore gives a better learning effect.
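The two presentation orders discussed above can be contrasted with a small Python sketch. It only generates the order in which pattern indices would be presented; the dictionary update itself is omitted, and the function names are hypothetical.

```python
def interleaved_schedule(num_patterns, target_count):
    """Order used in the embodiments: sweep all patterns, repeat the sweep."""
    return [k for _ in range(target_count) for k in range(num_patterns)]

def blocked_schedule(num_patterns, target_count):
    """Alternative order: repeat one pattern target_count times, then switch."""
    return [k for k in range(num_patterns) for _ in range(target_count)]

def last_seen(schedule, pattern):
    """Index of the final presentation of a pattern within a schedule."""
    return len(schedule) - 1 - schedule[::-1].index(pattern)

a = interleaved_schedule(3, 4)
b = blocked_schedule(3, 4)
print(a)  # [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]
print(b)  # [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
print(last_seen(a, 0), last_seen(b, 0))  # 9 3
```

In the blocked order, pattern 0 is last presented at step 3 of 12, so every later update is driven by other patterns. That is the situation in which the text warns that early learning results may no longer be reflected in the final dictionary.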

[0074]

[Effects of the Invention] As described in detail above, according to the present invention, the number of features used for discrimination is reduced by feature selection, so the recognition dictionary size and the amount of recognition computation can be kept low. In addition, both the recognition dictionary and the feature selection dictionary are corrected by competitive learning based on the recognition results for the learning patterns. Recognition processing can therefore use the corrected recognition dictionary, and features effective for discrimination can be selected using the corrected feature selection dictionary, so that highly accurate recognition performance is achieved.

[0075]

[Brief description of the drawings]

FIG. 1 is a block diagram showing the schematic configuration of a pattern recognition apparatus according to a first embodiment of the present invention.

FIG. 2 is a flowchart for explaining the recognition processing performed when the apparatus of FIG. 1 is set to the recognition mode.

FIG. 3 is a diagram for explaining the extraction of a feature vector from an input character pattern.

FIG. 4 is a flowchart for explaining the recognition dictionary learning processing performed when the apparatus of FIG. 1 is set to the dictionary learning mode.

FIG. 5 is a diagram showing the sigmoid function used to define the loss function.

FIG. 6 is a line graph showing recognition performance (recognition rate) against the amount of computation (recognition dictionary size) when handwritten character recognition is performed using the recognition dictionary 17 corrected by the learning processing in the apparatus of FIG. 1, in comparison with the subspace method, which is a conventional technique.

FIG. 7 is a block diagram showing the schematic configuration of a pattern recognition apparatus according to a second embodiment of the present invention.

FIG. 8 is a flowchart for explaining the learning processing of the recognition dictionary and the feature selection dictionary performed when the apparatus of FIG. 7 is set to the dictionary learning mode.

FIG. 9 is a diagram showing the recognition performance obtained when similar characters are recognized (discriminated) using the recognition dictionary 17 and the feature selection dictionary 16 corrected by the learning processing in the apparatus of FIG. 7, in comparison with the recognition performance of the apparatus of FIG. 1 (the recognition method of the first embodiment) and that of the conventional feature selection + subspace method.

[Explanation of symbols]

11: data input unit, 12: feature extraction unit, 13: feature selection unit, 14: identification unit, 15: recognition result output unit, 16: feature selection dictionary, 17: recognition dictionary, 18: recognition dictionary correction unit, 19: feature selection dictionary correction unit.

Continuation of the front page

(56) References
JP-A-63-177285 (JP, A)
JP-A-7-225836 (JP, A)
JP-A-4-256087 (JP, A)
JP-A-6-251158 (JP, A)
Yonezawa, Y., et al., "Formulation of a context effect model by minimum classification error learning," IEICE Technical Report SP94-108 to 117, March 24, 1995, Vol. 94, No. 569, pp. 47-54

(58) Fields searched (Int. Cl.7, DB name)
G06K 9/00 - 9/82
G10L 15/06
G06T 7/00 - 7/60

Claims

(57) [Claims]

1. A pattern recognition apparatus comprising:
input means for inputting pattern data to be recognized;
feature vector extracting means for extracting an n-dimensional feature vector from the pattern data input by the input means;
a feature selection dictionary used for selecting an m-dimensional feature vector (m < n) from the n-dimensional feature vector;
a recognition dictionary consisting of a set of m-dimensional reference vectors used for recognizing m-dimensional feature vectors;
feature selection means for selecting, using the feature selection dictionary, an m-dimensional feature vector effective for recognition from the n-dimensional feature vector extracted by the feature vector extracting means;
identification means for calculating an evaluation value by matching the m-dimensional feature vector selected by the feature selection means against the recognition dictionary, and outputting recognition candidates in an order based on the calculation result;
means for causing the input means to input a learning pattern as the pattern data to be recognized in a dictionary learning mode; and
dictionary correction means for, in the dictionary learning mode, detecting a degree of misrecognition based on the output result of the identification means, using a loss function that approaches a first boundary value as the degree of match between the m-dimensional feature vector selected from the n-dimensional feature vector extracted from the learning pattern and a reference vector of the correct category becomes greater than its degree of match with a reference vector of an incorrect category, and conversely approaches a second boundary value different from the first boundary value as it becomes smaller, and for executing dictionary correction processing by learning on both the recognition dictionary and the feature selection dictionary so as to reduce the degree of misrecognition.

2. A dictionary correction method in a pattern recognition apparatus that executes recognition processing of extracting an n-dimensional feature vector from an input pattern, selecting an m-dimensional feature vector (m < n) from the extracted n-dimensional feature vector using a feature selection dictionary, calculating an evaluation value by matching the selected m-dimensional feature vector against a recognition dictionary consisting of a set of m-dimensional reference vectors, and outputting recognition candidates, the method comprising:
a step of, in a dictionary learning mode in which a plurality of learning patterns are sequentially input, executing the recognition processing on each learning pattern as it is input;
a step of, each time the recognition processing is executed, detecting a degree of misrecognition based on the result of the recognition processing, using a loss function that approaches a first boundary value as the degree of match between the m-dimensional feature vector, selected from the n-dimensional feature vector extracted from the learning pattern subjected to the recognition processing, and a reference vector of the correct category becomes greater than its degree of match with a reference vector of an incorrect category, and conversely approaches a second boundary value different from the first boundary value as it becomes smaller, and executing dictionary correction processing by learning on both the recognition dictionary and the feature selection dictionary so as to reduce the degree of misrecognition; and
a control step of performing control so that the execution of the recognition processing and the execution of the dictionary correction processing are repeated a predetermined number of times for the plurality of learning patterns.