JPH0740287B2

JPH0740287B2 - Pattern recognition method

Info

Publication number: JPH0740287B2
Application number: JP61144487A
Authority: JP
Inventors: 道義立川
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1986-06-20
Filing date: 1986-06-20
Publication date: 1995-05-01
Anticipated expiration: 2010-05-01
Also published as: JPS63779A

Description

[Detailed description of the invention]

[Technical field]

The present invention relates to a method for recognizing patterns such as characters, and more particularly to a pattern recognition method based on the multilayer directional histogram method.

[Prior art]

The present applicant has already proposed pattern recognition methods based on the multilayer directional histogram method in Japanese Patent Application Nos. 59-202822 and 58-202825, among others. The present invention relates to an improvement of such pattern recognition methods.

In this pattern recognition method, a direction code is assigned to each contour pixel of a pattern such as a character. The pattern is scanned from each side of its frame toward the opposite side, each direction code that appears immediately after a white (background) pixel is detected, and the detected direction codes are classified into layers according to the order in which they were detected on the scan line. Then, for each region into which the pattern frame is divided, a histogram of the direction codes belonging to the layers up to a given layer is computed, and the vector whose components (feature quantities) are the histogram values is used as the feature vector of the pattern.
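The layering step described above can be sketched for a single scan line as follows. This is only an illustration under stated assumptions: the pixel encoding (0 for white, 9 for a black pixel that is not on the contour, any other value for a direction code), the function name `layer_codes_on_scan_line`, and the treatment of the frame edge as background are all hypothetical and are not taken from the prior applications.

```python
WHITE = 0      # background pixel (assumed encoding)
INTERIOR = 9   # black pixel that is not on the contour (assumed encoding)


def layer_codes_on_scan_line(line, max_layer=2):
    """Return (layer, code) pairs for direction codes that directly follow a white pixel.

    The first such code on the scan line belongs to layer 1, the second to
    layer 2, and so on. Codes beyond max_layer are ignored.
    """
    found = []
    previous = WHITE  # the frame edge is treated as background (assumption)
    for pixel in line:
        is_code = pixel not in (WHITE, INTERIOR)
        if is_code and previous == WHITE:
            layer = len(found) + 1
            if layer > max_layer:
                break
            found.append((layer, pixel))
        previous = pixel
    return found


# One scan line crossing three strokes; only the first two layers are kept.
print(layer_codes_on_scan_line([0, 0, 3, 9, 9, 5, 0, 0, 2, 9, 7, 0, 4]))
# prints [(1, 3), (2, 2)]
```

Scanning from each of the four sides of the frame would repeat this over every row and column in both directions.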

For example, if eight direction codes are used, the pattern frame is divided into 4 × 4 mesh regions, and the direction codes of the first and second layers are used, the feature vector has 256 dimensions (= 4 × 4 × 2 × 8).
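The dimension count in this example can be checked with a few lines of code. The flat ordering of the histogram bins used in `component_index` (region first, then layer, then direction code) is an assumption made for illustration; the source gives only the total of 256.

```python
MESH_ROWS, MESH_COLS = 4, 4   # 4 x 4 mesh regions
NUM_LAYERS = 2                # first and second layers
NUM_CODES = 8                 # eight direction codes


def feature_dimension():
    """Number of histogram bins, i.e. the dimension of the feature vector."""
    return MESH_ROWS * MESH_COLS * NUM_LAYERS * NUM_CODES


def component_index(row, col, layer, code):
    """Flat index of one histogram bin; all arguments are zero-based.

    The region-major ordering is hypothetical.
    """
    region = row * MESH_COLS + col
    return (region * NUM_LAYERS + layer) * NUM_CODES + code


print(feature_dimension())            # prints 256
print(component_index(3, 3, 1, 7))    # prints 255
```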

For the dictionary, feature vectors are extracted in the same way from a plurality of patterns belonging to the same pattern class, and their average is registered as the feature vector of the dictionary pattern (standard pattern).

The present applicant has also already proposed a similar pattern recognition method in which the direction codes are classified into layers in finer detail by also taking into account the scanning direction used for the layering. The present invention is equally applicable to that method.

Furthermore, the method of dividing the pattern frame into regions is not limited to the methods disclosed in the specifications and drawings of the above prior applications. For example, as in the pattern recognition methods of the prior applications, the pattern frame may be divided into mesh regions so that the direction codes are distributed evenly among them, and the mesh regions may then be partially overlapped according to preset parameters and merged into a smaller number of regions. A pattern recognition method based on the multilayer directional histogram method that uses such region division has already been proposed by the present applicant, and the present invention is equally applicable to it.

In such pattern recognition methods, an unknown pattern is matched against each dictionary pattern by computing the distance or similarity between corresponding components of the feature vector extracted from the unknown pattern and the feature vector of the dictionary pattern. The dictionary pattern with the smallest total distance, or the largest total similarity, is taken as the recognition result.

However, when the feature vector has a large number of dimensions, as in the example above, the amount of distance or similarity computation is large, so matching takes a long time and the dictionary requires a large capacity.

[Purpose]

Accordingly, an object of the present invention is to make matching more efficient and to reduce the dictionary capacity in a pattern recognition method based on the multilayer directional histogram method.

[Constitution]

A feature vector obtained by the multilayer directional histogram method contains components in dimensions that contribute strongly to pattern discrimination and components in dimensions whose contribution is less pronounced. To simplify the explanation, consider a two-dimensional feature vector.

If a dictionary is created by the multilayer directional histogram method with two dimensions, the feature vectors of the dictionary patterns for the characters “文”, “字”, “認”, and “識” are g₁, g₂, g₃, and g₄ in FIG. 5, respectively. As is clear from the figure, in this example component (feature quantity) A has a larger variance (or standard deviation) across the feature vectors than component (feature quantity) B. In other words, component A has a higher ability to discriminate unknown patterns.

Matching an unknown pattern against the dictionary patterns basically consists of computing the distance or similarity between corresponding components of the feature vectors of the unknown pattern and each dictionary pattern, and taking as the recognition result the dictionary pattern with the smallest total distance or the largest total similarity. In view of the property of the feature vector described above, if the distance or similarity is computed preferentially from the components with high discrimination ability, dictionary patterns that cannot be candidates can be rejected, and the possible candidates narrowed down, at an early stage, as soon as the distance or similarity has been computed for only those highly discriminative components.

In addition, a feature vector obtained by the multilayer directional histogram method has the property that the features of the pattern are preserved even if the order of its components is changed.
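One way to see why reordering is harmless for matching: a distance that is a sum over corresponding components does not change when both vectors are permuted in the same way. The sketch below illustrates this with a city-block distance, which is an assumption; the source does not name the distance measure, and the vectors and the permutation are made-up values.

```python
def city_block(u, v):
    """Sum of absolute differences between corresponding components."""
    return sum(abs(a - b) for a, b in zip(u, v))


def permute(vector, order):
    """Reorder the components of a vector; order holds zero-based source indices."""
    return [vector[i] for i in order]


unknown = [3, 0, 5, 1]       # hypothetical feature vector of an unknown pattern
dictionary = [1, 2, 5, 4]    # hypothetical dictionary feature vector
order = [2, 0, 3, 1]         # any permutation of the dimensions

before = city_block(unknown, dictionary)
after = city_block(permute(unknown, order), permute(dictionary, order))
print(before, after)  # prints 7 7
```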

With these points in mind, in the present invention, when the dictionary is created, the standard deviation or variance of each dimension of the feature vectors of the dictionary patterns is computed, the components of the feature vector of each dictionary pattern are rearranged in descending order of standard deviation or variance, and the rearranged feature vectors are registered in the dictionary.

For example, suppose the feature vector of a certain dictionary pattern created by the multilayer directional histogram method is as shown in FIG. 6(a), and suppose that the dimensions in descending order of the standard deviation or variance computed over all standard patterns are X₄, X₁, X₃, X₇, X₅, X₂, X₈, X₆, .... The components of the dimensions X₁, X₂, X₃, ... of this feature vector are then rearranged as shown in FIG. 6(b) and registered in the dictionary. That is, the component of dimension X₄ of the original feature vector becomes the component of the highest-order dimension Y₁ of the rearranged feature vector, the component of dimension X₁ becomes the component of the next dimension Y₂, and so on.
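The rearrangement in this example can be written out directly. The order X₄, X₁, X₃, X₇, X₅, X₂, X₈, X₆ is taken from the text; the component values and the function name `rearrange` are made up for illustration.

```python
# Original dimension numbers (1-based) in descending order of deviation:
# X4, X1, X3, X7, X5, X2, X8, X6
ORDER = [4, 1, 3, 7, 5, 2, 8, 6]


def rearrange(x, order=ORDER):
    """Build Y so that Y_j is the component of the original dimension order[j-1]."""
    return [x[i - 1] for i in order]


x = [10, 20, 30, 40, 50, 60, 70, 80]   # hypothetical components X1..X8
y = rearrange(x)
print(y)  # prints [40, 10, 30, 70, 50, 20, 80, 60]
```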

The components of the feature vector extracted from an unknown pattern are then rearranged in the same order as the components of the dictionary feature vectors, after which the unknown pattern is matched against the dictionary patterns by computing the distance or similarity between corresponding components.

When recognizing patterns with many classes, such as handwritten kanji, detailed matching with a high-dimensional feature vector is necessary. However, for patterns with few classes, such as ANSK characters, it has been confirmed that if the components of the dictionary patterns and the unknown pattern are rearranged in descending order of standard deviation or variance, that is, in descending order of discrimination ability, as described above, a sufficient recognition rate can be achieved by matching only a relatively small number of the highest-order dimensions (for ANSK characters, for example, 20 or 24 dimensions).

In view of this, in the present invention, only the top N dimensions of the rearranged feature vector of each dictionary pattern are retained and the lower dimensions are discarded, and the resulting vector is registered in the dictionary as the final feature vector of the dictionary pattern. This reduces the dictionary capacity and further improves matching efficiency.

[Example]

Embodiments of the present invention are described below with reference to the drawings.

FIG. 1 is a schematic block diagram showing, in simplified form, the functional configuration of one embodiment of the present invention. This embodiment assumes character patterns with relatively few classes, such as ANSK characters, as the patterns to be recognized.

In the figure, reference numeral 10 denotes a pattern reading unit that reads character patterns from a document and inputs the character pattern information to a preprocessing unit 12. The preprocessing unit 12 performs preprocessing such as character segmentation and normalization on the input character patterns, and inputs the processed character patterns one character at a time to a feature extraction unit 14.

The feature extraction unit 14 extracts a feature vector from each input character pattern by the multilayer directional histogram method.

This embodiment has two operation modes: a dictionary creation mode and a pattern recognition mode. The dictionary creation mode is described first. FIG. 2 is a schematic flowchart of the dictionary creation process in this mode, and the corresponding step numbers are given in parentheses in the following description.

To create the dictionary, M character patterns of each character type are input sequentially from the pattern reading unit 10 (step 100). Each input character pattern is preprocessed by the preprocessing unit 12 (step 102) and input to the feature extraction unit 14, which extracts a feature vector (for example, a 256-dimensional vector) by the multilayer directional histogram method (step 104). The extracted feature vector is input to a rearrangement unit 16.

Reference numeral 18 denotes a rearrangement table unit that is referred to by the rearrangement unit 16 and holds a rearrangement table created in advance.

The rearrangement table is created as follows. For each character type, feature vectors (for example, 256-dimensional vectors) are extracted from a plurality of character patterns by the multilayer directional histogram method, and their average is taken as the feature vector of the standard pattern of that character type. For the standard-pattern feature vectors of all character types (K types in total) obtained in this way, the standard deviation δ_n of each dimension n is calculated by the following equation:

$$\delta_n = \sqrt{\frac{1}{K}\sum_{k=1}^{K}\left(g_{kn}-\bar{g}_{n}\right)^{2}}$$

Here, k is the character type number, g_kn is the component (feature quantity) of dimension n for character type k, and ḡ_n is the average of the component of dimension n over all character types.

The variance, which is the square of the standard deviation, may be used instead of the standard deviation.

The dimension numbers are then arranged in descending order of the standard deviation or variance calculated in this way, and a table that maps each new dimension number to the corresponding original dimension number is created for the top N dimensions. This correspondence table is the rearrangement table.
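The construction of the rearrangement table can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `build_rearrangement_table`, the zero-based dimension numbers, the division by K inside the standard deviation, and the tie-breaking rule (ties keep their original order) are all assumptions.

```python
import math


def build_rearrangement_table(standard_vectors, top_n):
    """Return the original dimension numbers (zero-based) of the top_n dimensions.

    standard_vectors holds one standard-pattern feature vector per character type.
    Entry j of the result is the original dimension that becomes new dimension j.
    """
    num_types = len(standard_vectors)
    num_dims = len(standard_vectors[0])
    deviations = []
    for n in range(num_dims):
        column = [vec[n] for vec in standard_vectors]
        mean = sum(column) / num_types
        variance = sum((g - mean) ** 2 for g in column) / num_types
        deviations.append(math.sqrt(variance))
    # Descending order of standard deviation.
    ranked = sorted(range(num_dims), key=lambda n: deviations[n], reverse=True)
    return ranked[:top_n]


# Three hypothetical character types with four-dimensional standard patterns.
standards = [[1, 10, 5, 0],
             [1, 20, 6, 4],
             [1, 30, 7, 8]]
print(build_rearrangement_table(standards, top_n=2))  # prints [1, 3]
```

Because the square root is monotonic, sorting by variance instead of standard deviation gives the same table.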

Returning to the dictionary creation mode: in the rearrangement unit 16, the components of the feature vector g_knm extracted from the input character pattern are rearranged according to the rearrangement table in the rearrangement table unit 18, and the feature vector g_kn consisting of the top N components after rearrangement is input to a dictionary creation unit 20 (step 106).

When the N-dimensional feature vectors, after rearrangement, of the M character patterns of the same character type have been accumulated in the dictionary creation unit 20, the dictionary creation unit 20 computes the average of these M feature vectors and registers it in a dictionary 22, with the character code attached, as the feature vector (N-dimensional vector ḡ_kn) of the dictionary pattern of character type k (step 108).
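Steps 106 and 108 for one character type can be sketched as below. The function name `make_dictionary_entry`, the dictionary-entry layout (a dict with `code` and `feature` keys), and the sample values are hypothetical; the table is assumed to list zero-based original dimension numbers for the top N dimensions.

```python
def make_dictionary_entry(samples, table, char_code):
    """Rearrange each sample vector with the table, then average the M results.

    samples: M full-length feature vectors of one character type.
    table:   original dimension numbers (zero-based) of the top N dimensions.
    """
    # Step 106: keep only the top N components, in table order.
    reordered = [[vec[i] for i in table] for vec in samples]
    # Step 108: average the M rearranged vectors component by component.
    m = len(reordered)
    mean = [sum(vec[j] for vec in reordered) / m for j in range(len(table))]
    return {"code": char_code, "feature": mean}


samples = [[1, 10, 5, 0],
           [3, 20, 7, 4]]          # M = 2 hypothetical samples
entry = make_dictionary_entry(samples, table=[1, 3], char_code="A")
print(entry)  # prints {'code': 'A', 'feature': [15.0, 2.0]}
```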

Next, the character type number k is updated (step 110), the process returns to step 100, and the same dictionary creation process is performed for the next character type.

When the process has been performed up to the last character type (k = K), step 112 determines that processing is finished, and the process ends.

The pattern recognition mode is described next. FIG. 3 is a schematic flowchart of the processing in this mode, and the corresponding step numbers are given in parentheses in the following description.

The character pattern to be recognized (unknown character pattern) is input from the pattern reading unit 10 (step 200), preprocessed by the preprocessing unit 12 (step 202), and input to the feature extraction unit 14, which extracts a feature vector X_n (for example, a 256-dimensional vector) by the multilayer directional histogram method (step 204).

This feature vector X_n is input to the rearrangement unit 16, where its components are rearranged in descending order of standard deviation or variance according to the rearrangement table. It is thereby converted into a feature vector Y_n consisting of the top N components and is then sent to a matching unit 24 (step 206).

In the matching unit 24, the unknown input pattern is matched against the dictionary patterns as follows (step 208).

The distance D_k between the feature vector ḡ_kn of the dictionary pattern of each character type k and the feature vector Y_n of the unknown character pattern is calculated by the following equation:

$$D_k = \sum_{n=1}^{N} d\left(Y_n, \bar{g}_{kn}\right)$$

where d(·,·) is the distance between corresponding components. The distances are then sorted, and the dictionary pattern with the smallest distance is chosen as the candidate character.
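A minimal sketch of this matching step follows. The per-component distance is taken to be the absolute difference (a city-block distance), which is an assumption; the dictionary layout as a list of `(char_code, feature_vector)` pairs and the function names are likewise hypothetical.

```python
def total_distance(y, g):
    """Sum of per-component distances; absolute difference is an assumed choice."""
    return sum(abs(a - b) for a, b in zip(y, g))


def recognize(y, dictionary):
    """Return the character code of the dictionary pattern nearest to y.

    dictionary: list of (char_code, feature_vector) pairs, all of dimension N.
    """
    ranked = sorted(dictionary, key=lambda entry: total_distance(y, entry[1]))
    return ranked[0][0]


dictionary = [("A", [15.0, 2.0]), ("B", [5.0, 9.0])]   # hypothetical entries, N = 2
print(recognize([14, 3], dictionary))  # prints A
print(recognize([6, 8], dictionary))   # prints B
```

Sorting mirrors the description; taking the minimum directly would give the same candidate with less work.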

The character code of this candidate character is output as the recognition result for the unknown character pattern (step 210).

Thus, in this embodiment, the feature vector of each dictionary pattern is registered in the dictionary with its components rearranged in descending order of standard deviation or variance, that is, in descending order of discrimination ability, and with the less discriminative components outside the top N dimensions discarded. The components of the feature vector of the unknown input pattern are rearranged in the same order as those of the dictionary patterns, and the dictionary patterns and the unknown character pattern are matched by computing the distance or similarity between N-dimensional vectors.

Consequently, the amount of distance or similarity computation is small and high matching efficiency is achieved, while a sufficient recognition rate is still obtained because the highly discriminative dimensions are used. In addition, the reduced number of dimensions allows a substantial reduction in dictionary capacity.

Another embodiment of the present invention is described next. The overall functional configuration of this embodiment is the same as that of the preceding embodiment; only the matching in the matching unit 24 differs in part. Therefore, only the processing of the matching unit 24 is described, with reference to the flowchart of FIG. 4.

When the N-dimensional feature vector of the unknown character pattern, after rearrangement of its components, is input to the matching unit 24, the character type number, that is, the dictionary pattern number k, is set to 1 (step 300), and that dictionary pattern is matched against the unknown character pattern.

First, the total distance d₁ between the feature vector f_n of the dictionary pattern and the feature vector Y_n of the unknown character pattern is computed over the top N₁ dimensions (N₁ < N) (step 302). This total distance d₁ is then compared with a threshold Th₁ (step 304).

If d₁ > Th₁, the dictionary pattern is too far away to be a candidate pattern (candidate character), so matching against it is cut off at this stage, the dictionary pattern number k is incremented (step 306), and the process returns to step 302.

In other words, the distance computation over the top N₁ dimensions performs a coarse classification of the unknown character pattern (narrowing down of the candidate patterns), and matching against a dictionary pattern rejected here ends at this stage.

If d₁ ≤ Th₁ in step 304, the total distance d₂ over all N dimensions is computed (step 310). This distance d₂ is then compared with the distance between the unknown character pattern and the best candidate pattern found so far, the candidate pattern with the smaller distance is retained together with its distance (step 312), and the process returns to step 302 via step 306.

When matching has been completed up to the last dictionary pattern, step 308 determines that processing is finished, the character code of the candidate pattern that finally remains is output (step 314), and the recognition of the unknown character pattern is complete.
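The flow of FIG. 4 as described above can be sketched roughly as follows, with the step numbers from the text noted in comments. The absolute-difference distance, the function name `recognize_two_stage`, the reuse of d₁ when forming d₂, and the extra counter of full computations are assumptions added for illustration.

```python
def recognize_two_stage(y, dictionary, n1, th1):
    """Coarse rejection on the top n1 dimensions, then full matching.

    dictionary: list of (char_code, feature_vector) pairs, components already
    rearranged in descending order of deviation.
    Returns (best_code, best_distance, number_of_full_distance_computations).
    """
    best_code, best_distance = None, float("inf")
    full_computations = 0
    for code, f in dictionary:                        # steps 300, 306, 308
        d1 = sum(abs(a - b) for a, b in zip(y[:n1], f[:n1]))    # step 302
        if d1 > th1:                                  # step 304: reject early
            continue
        # Step 310: distance over all N dimensions (d1 is reused here).
        d2 = d1 + sum(abs(a - b) for a, b in zip(y[n1:], f[n1:]))
        full_computations += 1
        if d2 < best_distance:                        # step 312
            best_code, best_distance = code, d2
    return best_code, best_distance, full_computations          # step 314


dictionary = [("A", [10, 10, 1, 1]),
              ("B", [0, 0, 5, 5]),
              ("C", [1, 1, 9, 9])]                    # hypothetical entries, N = 4
print(recognize_two_stage([0, 1, 5, 4], dictionary, n1=2, th1=3))
# prints ('B', 2, 2)
```

In this sketch, if every dictionary pattern is rejected at step 304 the function returns `None` as the code; the source does not say how that case is handled.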

Thus, in this embodiment, the distance is computed preferentially from the highest-order dimensions of the feature vector, dictionary patterns that cannot be candidates are rejected early on the basis of the result up to a certain dimension, and detailed matching with the distance over all dimensions is performed only for dictionary patterns that can be candidates. Wasted distance computation is therefore reduced compared with the preceding embodiment, and matching efficiency is further improved.

This stepwise narrowing down of the candidate patterns may also be performed in two or more stages. For example, for each dictionary pattern, a coarse classification of the unknown character pattern is performed by computing the distance over the top N₁ dimensions (N₁ < N). If the dictionary pattern is not rejected there, a medium classification is performed by computing the distance over the top N₂ dimensions (N₁ < N₂ < N). If it is not rejected by the medium classification either, a fine classification is performed by computing the distance over the top N₃ dimensions (N₂ < N₃ < N). If it is still not rejected, detailed matching of the unknown character pattern is performed by computing the distance over all dimensions.

In this way, matching against dictionary patterns that cannot be candidates can be abandoned even earlier, and the matching time can be shortened further.
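The multi-stage variant generalizes naturally to a list of (dimension count, threshold) pairs. In the sketch below, the per-stage thresholds beyond Th₁ are assumptions (the source names only Th₁), as are the absolute-difference distance, the incremental accumulation of the partial sums, and the function name `cascade_distance`.

```python
def cascade_distance(y, f, stages):
    """Distance over all dimensions, or None if rejected at some stage.

    stages: list of (n_dims, threshold) pairs with increasing n_dims,
    e.g. [(N1, Th1), (N2, Th2), (N3, Th3)].
    """
    total, start = 0, 0
    for n_dims, threshold in stages:
        # Extend the partial sum from the previous stage up to n_dims.
        total += sum(abs(a - b) for a, b in zip(y[start:n_dims], f[start:n_dims]))
        if total > threshold:
            return None            # cannot be a candidate: stop matching here
        start = n_dims
    # Detailed matching: add the remaining dimensions.
    return total + sum(abs(a - b) for a, b in zip(y[start:], f[start:]))


y = [0, 0, 0, 0, 0, 0]
f = [1, 1, 1, 1, 1, 1]
print(cascade_distance(y, f, [(2, 2), (4, 4)]))  # prints 6
print(cascade_distance(y, f, [(2, 2), (4, 3)]))  # prints None
```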

Although the above embodiments use distance for matching the unknown pattern against the dictionary patterns, the same matching process may of course be performed using similarity.

Moreover, the present invention is not limited to character patterns and is equally applicable to the recognition of patterns in general, such as speech.

[Effect]

As is clear from the above description, the present invention provides effects such as greatly improving the matching efficiency of the multilayer directional histogram method, thereby shortening the pattern recognition time, and greatly reducing the dictionary capacity.

[Brief description of drawings]

FIG. 1 is a schematic block diagram showing, in simplified form, the functional configuration of one embodiment of the present invention. FIG. 2 is a schematic flowchart of the dictionary creation process in that embodiment. FIG. 3 is a schematic flowchart of the pattern recognition process in that embodiment. FIG. 4 is a schematic flowchart of the matching process in another embodiment of the present invention. FIG. 5 is a vector diagram for explaining the properties of feature vectors in the multilayer directional histogram method. FIG. 6 is an explanatory diagram of the rearrangement of feature vector components.

10: pattern reading unit; 12: preprocessing unit; 14: feature extraction unit; 16: rearrangement unit; 18: rearrangement table unit; 20: dictionary creation unit; 22: dictionary.

Claims

[Claims]

1. A pattern recognition method based on the multilayer directional histogram method, characterized in that: the components of the feature vector of each standard pattern obtained by the multilayer directional histogram method are rearranged in advance in descending order of standard deviation or variance, and a vector consisting only of the top N components thereof is registered in a dictionary as the feature vector of a dictionary pattern; the components of a feature vector extracted from an unknown pattern by the multilayer directional histogram method are rearranged in the same order as the components of the feature vector of the dictionary pattern; and the unknown pattern is matched against the dictionary pattern by computing the distance or similarity between corresponding components of the rearranged feature vector and the feature vector of the dictionary pattern.

2. The pattern recognition method according to claim 1, wherein the distance or similarity between feature vectors is computed preferentially from the components of the higher-order dimensions, and whether to compute the distance or similarity for the components of the lower-order dimensions is determined on the basis of the result of that computation.