JPS5949630B2

JPS5949630B2 - pattern recognition device

Info

Publication number: JPS5949630B2
Application number: JP52033389A
Authority: JP
Inventors: 浩道藤沢; 康明中野; 道夫安田
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1977-03-28
Filing date: 1977-03-28
Publication date: 1984-12-04
Also published as: JPS53118943A

Description

DETAILED DESCRIPTION OF THE INVENTION

(1) Field of Application of the Invention

The present invention relates to a pattern recognition device suited to cases in which many similar patterns exist, as with kanji.

(2) Prior Art

The technical problems in recognizing printed kanji are a drop in recognition speed, because there are as many as 2,000 to 4,000 character classes, and a drop in recognition accuracy, because there are many sets of similar characters such as 「問」 and 「間」.

The former problem, speeding up recognition, has been solved effectively by the hierarchical pattern matching method (see Transactions of the Institute of Electronics and Communication Engineers of Japan, Vol. 56-D, p. 365).

For the latter problem, the recognition accuracy for similar characters, a pairwise decision method has been proposed (see Transactions of the Institute of Electronics and Communication Engineers of Japan, Vol. 56-D, p. 545), but it has not yielded a satisfactory solution. In recognizing characters with dakuten and handakuten marks, for example 「ぱ」 and 「ば」, the differing part often amounts to less than 10% of the total black-pixel area. Conventional uniform methods do not bring that difference out sufficiently, and the result is often a misreading or a failure to recognize.

(3) Object of the Invention

Accordingly, an object of the present invention is to provide means for correctly separating (recognizing) character pairs (sets) that are difficult to separate by conventional pattern matching.

(4) Summary of the Invention

In general, an observed character pattern carries so-called noise, such as differences in stroke thickness caused by printing pressure or ribbon condition and quantization error at observation time, so the character pattern never matches a standard pattern exactly.

Recognition therefore computes a similarity expressing how close the observed unknown character pattern is to each standard pattern, and the character whose standard pattern maximizes this similarity is taken as the category (character class) of the unknown pattern. For similar characters, however, as with a and b or d and e in Fig. 1, the area of the differing part is small compared with the area of the remaining part (「門」 in the case of a, 「は」 in the case of d).

Consequently, even in an ideal case with no noise, the difference in similarity between similar characters is not large. For example, when the unknown pattern 「ば」 is input, the similarity to the standard pattern 「ぱ」 might be 0.900 and the similarity to the standard pattern 「ば」 0.897. Moreover, because noise is present in practice, similar characters may be confused. If the standard patterns of similar characters are created so that the parts that ought to be identical (「門」 and 「は」 in Fig. 1) coincide exactly, the ideal situation described above can be approached.

This approach has several drawbacks. Because there are many combinations of similar characters like those in the example above, it is very difficult to create standard patterns that satisfy the condition automatically or semi-automatically. The 「門」 or 「は」 parts of the example may already differ slightly at the type stage, so forcing them to be exactly the same pattern does not always give good results. And even if standard patterns satisfying the condition could be created, a sufficiently large difference in similarity still cannot be obtained.

In the present invention, therefore, when a character that has a similar character set comes up as the first candidate, a mask pattern indicating which part of that character is the point of difference (c, f, or i in Fig. 1) is selected, the similarity between the unknown pattern and the standard patterns is computed again only in the region masked by that mask pattern, and the final decision is made.
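The effect of restricting the comparison to the masked region can be illustrated with a small sketch. It assumes normalized correlation as the similarity measure, since the patent's own similarity equations are not reproduced in this text; the short one-dimensional patterns and the names `similarity` and `masked` are hypothetical and chosen only for illustration.

```python
import math

def similarity(a, b):
    """Normalized correlation of two equal-length patterns (an assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def masked(pattern, mask):
    """Pointwise product of a pattern with a 0/1 mask."""
    return [x * m for x, m in zip(pattern, mask)]

# Hypothetical data: 20 shared black pixels, then a 4-pixel region
# in which the two standard patterns differ.
common = [1] * 20
std_a = common + [1, 1, 0, 0]
std_b = common + [0, 0, 1, 1]
unknown = common + [1, 1, 0, 0]      # a noise-free sample of class a
mask = [0] * 20 + [1] * 4            # 1 only where the classes differ

# Gap between the two similarities over the whole pattern ...
full_gap = similarity(unknown, std_a) - similarity(unknown, std_b)

# ... and over the masked region only.
unknown_m = masked(unknown, mask)
masked_gap = (similarity(unknown_m, masked(std_a, mask))
              - similarity(unknown_m, masked(std_b, mask)))

print(round(full_gap, 3), round(masked_gap, 3))
```

Over the whole pattern the shared pixels dominate both similarities, so the gap stays small; inside the mask only the differing pixels contribute, so the gap becomes large.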

The technique of performing pattern matching again, when the first-stage decision yields several candidates, only in regions where those characters have low mutual similarity, and then making the final decision, is known (known example: Japanese Patent Application No. S41-42363 [Japanese Examined Patent Publication No. S46-9727]). The present invention, however, is characterized by the following.

In the known example, pattern re-matching is performed only in the low-similarity regions whenever several character categories have a first-stage matching similarity above a predetermined threshold. In the present invention, re-matching is performed, using a mask pattern that indicates a predetermined low-similarity part (region), only when the first- or second-ranked character produced by the hierarchical pattern matching decision belongs to a character class that has similar characters (for example 太, 玉, 問, は, ば) and the difference between their similarities is within a certain threshold.

When a character class that has no similar characters is the first- or second-ranked candidate and the difference in similarity is at or below a certain threshold, the input is ruled undecidable and no re-matching is performed. This feature of the present invention is important for the following two reasons.

(1) If re-matching is performed whenever several candidates come up, no distinction is made between candidates that arose because print quality was poor and candidates that arose because the characters are extremely similar.

Consequently, when characters that are not inherently similar come up as top candidates because print quality was poor or noise was present, a case that should be ruled undecidable is instead decided by re-matching, and the risk of misrecognition is high.

(2) When the number of categories is 2,000 to 4,000, as in kanji recognition, the number of possible combinations of first- and second-ranked candidate characters is enormous, and the low-similarity region would have to be specified in advance for every such combination. Re-matching whenever the similarities are close is therefore practically impossible.

To realize the above features, the device of the present invention therefore has means for storing whether a pre-specified similar category set exists, and means for storing which of a plurality of mask patterns is to be used for re-matching.

These can be stored in a storage device in table form, and a memory capacity of one entry per category suffices. For each character class, the table holds the number of the mask pattern to use if an extremely similar character set exists and the value zero if none exists, so a single table stores both pieces of information.

The advantages of this scheme are as follows. No special modification of the individual standard patterns is needed, and only a relatively small number of mask patterns has to be made. Some mask patterns can be shared by several character classes; for example, mask pattern c in Fig. 1 can be shared not only by 「問、間、閾、闘、関、閉、開、聞、閣、…」 but also by 「木、本、丸、九」. A mask pattern drawn on paper can be photoelectrically converted in the same way as a character pattern and stored as an electrical signal in the mask pattern storage device, so creating, adding, and correcting mask patterns is easy. As for recognition speed, this processing need not always run; it is executed only for unknown patterns that have a similar character set, so the slowdown is small. Finally, only the similarity between the product of the unknown pattern with one mask pattern and the product of a standard pattern with the same mask pattern has to be computed, so little hardware is needed beyond a conventional recognition device.
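The single table described above can be pictured as a mapping from each category to a mask pattern number, with zero meaning that no similar set exists. The following is a minimal sketch; the mask numbers, the choice of 「山」 as a class without similar characters, and the names `MASK_NUMBER` and `mask_number` are hypothetical.

```python
# Hypothetical table: one entry per category.
# A nonzero value is the number of the mask pattern to use;
# zero means the category has no similar character set.
MASK_NUMBER = {
    "問": 1, "間": 1, "聞": 1,   # share mask 1 (interior of the 門 frame)
    "木": 1, "本": 1,            # the text notes the same mask can serve these too
    "は": 2, "ば": 2, "ぱ": 2,   # share mask 2 (dakuten / handakuten corner)
    "山": 0,                     # assumed here to have no similar class
}

def mask_number(category):
    """Return the mask pattern number for a category, or 0 if none applies."""
    return MASK_NUMBER.get(category, 0)

# Memory grows with the number of categories, not with the number of pairs,
# and the number of distinct masks stays small because masks are shared.
distinct_masks = {n for n in MASK_NUMBER.values() if n != 0}
print(len(MASK_NUMBER), len(distinct_masks))
```

Storing one number per category, instead of one region per pair of categories, is what keeps the table size linear in the number of character classes.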

(5) Embodiment

The present invention is described in detail below with reference to an embodiment.

Fig. 2 is a block diagram of a device according to one embodiment of the present invention.

In the figure, 10 denotes a form on which characters have been printed. The form is input to character observation unit 1. Character observation unit 1 optically scans the characters on the form, photoelectrically converts them, segments them character by character, and outputs each character pattern f(i, j) (i, j = 1, 2, ..., 40) as electrical signal 11 to classification device 2. The character pattern is a binary pattern of 40 × 40 points that takes one value at black points on the paper and the other value at white points [the defining expressions are not reproduced in the source text].

Classification device 2 receives the binary character pattern f(i, j), generates a blurred pattern g(k, l) (k, l = 1, 2, 3, ..., 8) by blurring, reads the standard patterns for classification g^ω(k, l) 31 one after another from storage device 21 for classification standard patterns, and computes the similarity ρ1(g, g^ω) in similarity calculation circuit 5 according to equation (3) [equation not reproduced in the source text]. Here ω is the category number and M is the number of character classes.

The blurring referred to here means counting the 1 bits within 7 × 7 subregions of the binary pattern f(i, j) on the 40 × 40 mesh to obtain a "blurred" multi-valued pattern g(k, l) on an 8 × 8 mesh [the equation expressing this is not reproduced in the source text]. Classification device 2 further selects the top N of the similarities ρ1 (ω = 1, 2, ..., M) and transfers the number of candidates N and the N character numbers 12 (ω_1, ω_2, ω_3, ..., ω_N) to recognition unit 3.
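The blurring and candidate-selection steps might look like the sketch below. The text gives the window size (7 × 7) and the mesh sizes (40 × 40 and 8 × 8) but not the window placement, so the stride of 5, the one-pixel overhang on each side, and the treatment of points outside the mesh as 0 are all assumptions; `blur` and `top_n` are hypothetical names, and how the device itself chooses N is not modeled.

```python
def blur(f, size=40, cells=8, window=7):
    """Count 1 bits in overlapping window x window regions of a size x size
    binary pattern, giving a cells x cells multi-valued pattern.
    Window placement is an assumption (stride 5, centered, zero outside)."""
    stride = size // cells                # 5 for the sizes in the text
    margin = (window - stride) // 2       # 1-pixel overhang on each side
    g = [[0] * cells for _ in range(cells)]
    for k in range(cells):
        for l in range(cells):
            top = k * stride - margin
            left = l * stride - margin
            total = 0
            for i in range(top, top + window):
                for j in range(left, left + window):
                    if 0 <= i < size and 0 <= j < size:
                        total += f[i][j]
            g[k][l] = total
    return g

def top_n(similarities, n):
    """Category numbers of the n largest similarities, best first."""
    return sorted(similarities, key=similarities.get, reverse=True)[:n]

all_black = [[1] * 40 for _ in range(40)]
g = blur(all_black)
print(g[0][0], g[0][1], g[3][3])
print(top_n({1: 0.5, 2: 0.9, 3: 0.7}, 2))
```

Overlapping windows make the coarse pattern less sensitive to small shifts and stroke-width changes, which is presumably why the first stage classifies on the blurred pattern before the binary pattern is examined.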

When N = 0, no candidate was found, so recognition unit 3 outputs a result indicating that the character is unrecognizable from terminal 14.

When N = 1, that character number is output from terminal 14 as the recognition result. When N ≥ 2, recognition among the N candidate characters is carried out using the unknown binary pattern f(i, j). 22 is the standard pattern storage device for recognition. Recognition unit 3 reads the N standard patterns 32, f^ω(i, j) (ω = ω_1, ω_2, ..., ω_N), from 22, and similarity calculation circuit 6 computes the similarity ρ according to equation (6) [equation not reproduced in the source text]. Here ω_μ (μ = 1, 2, ..., N) is a candidate character number.

Recognition unit 3 then finds the maximum similarity ρ* and the next-largest similarity ρ** among the computed similarities ρ and makes the following decision using an absolute threshold δ1 and a relative threshold ε1.

(1) If ρ* ≥ δ1 and ρ* − ρ** ≥ ε1, the similarity is high enough and the gap between first and second place is wide enough, so the character number ω_μ that gives ρ* is output from terminal 14 as the recognition result.

(2) If ρ* < δ1, the similarity is not high enough, so a result indicating that the character is unrecognizable is output from terminal 14.

(3) If ρ* ≥ δ1 and ρ* − ρ** < ε1, the N′ character numbers ω_ν (ν = 1, 2, ..., N′) among the candidates whose similarity differs from ρ* by less than ε1 are transferred as output 13 to final determination unit 4.

The transferred character numbers are arranged in descending order of similarity.
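The three-way decision of recognition unit 3 can be sketched as a small function. The function name `recognize`, the status labels, and the threshold values in the usage lines are hypothetical; the similarities are taken as already computed.

```python
def recognize(similarities, delta1, eps1):
    """Decide among candidates given {character number: similarity}.

    Returns ("accept", number), ("reject", None), or
    ("defer", [numbers]) with the deferred numbers in descending
    order of similarity, for hand-off to the final determination stage.
    """
    ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    best_number, best = ranked[0]
    second = ranked[1][1] if len(ranked) > 1 else float("-inf")

    if best < delta1:                 # case (2): not similar enough
        return ("reject", None)
    if best - second >= eps1:         # case (1): clear winner
        return ("accept", best_number)
    # case (3): keep every candidate within eps1 of the best
    close = [number for number, rho in ranked if best - rho < eps1]
    return ("defer", close)

print(recognize({1: 0.95, 2: 0.80}, delta1=0.85, eps1=0.05))
print(recognize({1: 0.80, 2: 0.50}, delta1=0.85, eps1=0.05))
print(recognize({1: 0.900, 2: 0.897, 3: 0.70}, delta1=0.85, eps1=0.05))
```

The last call mirrors the 0.900 versus 0.897 example given earlier: the best candidate passes the absolute threshold, but the gap is too small, so both close candidates are deferred.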

Final determination unit 4 uses the mask pattern number table to check whether a similar character exists for the character ω_1 that gives ρ* or the character ω_2 that gives ρ**.

The mask pattern number table is stored in storage device 23. For each character number it holds the value 0 if there is no similar character and, if a similar character exists, the number of the mask pattern that indicates where the characters differ. Unit 4 therefore looks up the mask pattern number J(ω_1) of ω_1; if the value is 0, it next looks up the mask pattern number J(ω_2) of ω_2; and if that value is also 0, no similar character exists, so it outputs "undecidable" from terminal 14 as the recognition result.
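The lookup order described above can be sketched as follows. The text states that the mask numbered J(ω_1) is used when J(ω_1) is nonzero and that the result is "undecidable" when both entries are zero; using J(ω_2) when only J(ω_1) is zero is an assumption of this sketch. The name `select_mask` and the table contents are hypothetical.

```python
def select_mask(j_table, omega1, omega2):
    """Return the mask pattern number to use for re-matching,
    or None when neither top candidate has a similar character."""
    for omega in (omega1, omega2):        # check omega1 first, then omega2
        number = j_table.get(omega, 0)
        if number != 0:
            return number
    return None                           # both zero: undecidable

# Hypothetical table: character number -> mask pattern number (0 = none)
J = {101: 3, 102: 3, 205: 0, 306: 0}

print(select_mask(J, 101, 205))   # first candidate has a mask
print(select_mask(J, 205, 102))   # falls through to the second candidate
print(select_mask(J, 205, 306))   # neither has a similar set
```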

When J(ω_1) ≠ 0, the J(ω_1)-th mask pattern h_μ(i, j) 34 (μ = J(ω_1)) is read from mask pattern storage device 24, and the final decision is made by the following processing.

(5) Mask the unknown pattern.

f′(i, j) = f(i, j) · h_μ(i, j)    (8)

(6) Mask the N′ standard patterns.

f′^{ω_ν}(i, j) = f^{ω_ν}(i, j) · h_μ(i, j),  ν = 1, 2, ..., N′    (9)

(7) Using similarity calculation circuit 6, compute the similarity ρ′ between f′ and each f′^{ω_ν}. The formula is the same as equation (6).

(8) Find the maximum similarity ρ′* and the next-largest similarity ρ′** among the N′ computed similarities.

(9) Make the following decision on ρ′* and ρ′** using an absolute threshold δ2 and a relative threshold ε2.

(10) If ρ′* ≥ δ2 and ρ′* − ρ′** ≥ ε2, output the character number that gives ρ′* from terminal 14 as the recognition result.

(11) If the above condition is not satisfied, output "unrecognizable" from terminal 14 as the result.
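Steps (5) through (11) can be sketched end to end as below. Because equation (6) is not reproduced in this text, normalized correlation is assumed as the similarity; the 3 × 3 patterns, the threshold values, and the names `apply_mask`, `similarity`, and `final_decision` are hypothetical.

```python
import math

def apply_mask(pattern, h):
    """Pointwise product of a 2-D pattern with a mask, as in equations (8), (9)."""
    return [[x * m for x, m in zip(p_row, h_row)]
            for p_row, h_row in zip(pattern, h)]

def similarity(a, b):
    """Normalized correlation of two 2-D patterns (assumed form of equation (6))."""
    xs = [x for row in a for x in row]
    ys = [y for row in b for y in row]
    dot = sum(x * y for x, y in zip(xs, ys))
    nx = math.sqrt(sum(x * x for x in xs))
    ny = math.sqrt(sum(y * y for y in ys))
    return dot / (nx * ny) if nx and ny else 0.0

def final_decision(unknown, standards, h, delta2, eps2):
    """Return the winning character, or None when the result is unrecognizable."""
    f_masked = apply_mask(unknown, h)                               # step (5)
    scores = [(similarity(f_masked, apply_mask(std, h)), name)     # steps (6), (7)
              for name, std in standards.items()]
    scores.sort(key=lambda t: t[0], reverse=True)                   # step (8)
    best, best_name = scores[0]
    second = scores[1][0] if len(scores) > 1 else float("-inf")
    if best >= delta2 and best - second >= eps2:                    # steps (9), (10)
        return best_name
    return None                                                     # step (11)

# Hypothetical 3 x 3 example: A and B differ only inside the mask.
h = [[0, 0, 0], [0, 1, 1], [0, 0, 0]]
standards = {
    "A": [[1, 1, 1], [1, 1, 0], [1, 1, 1]],
    "B": [[1, 1, 1], [1, 0, 1], [1, 1, 1]],
}
noisy_a = [[1, 0, 1], [1, 1, 0], [1, 1, 1]]   # noise outside the mask only
blank = [[1, 1, 1], [1, 0, 0], [1, 1, 1]]     # nothing inside the mask

print(final_decision(noisy_a, standards, h, delta2=0.8, eps2=0.1))
print(final_decision(blank, standards, h, delta2=0.8, eps2=0.1))
```

Noise outside the mask has no influence on the second-stage score, while an input with no evidence inside the mask fails the absolute threshold and is rejected.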

The above is the overall flow of operation of the device of the present invention. The processing of recognition unit 3 and final determination unit 4, which is the core of the invention, is shown as flowcharts in Figs. 3a and 3b.

Units 3 and 4 are in practice implemented by a microprocessor, which performs the above processing under program control.

This part 40 is described in more detail with reference to Fig. 4. In Fig. 4, 41 is a ROM (read-only memory) that stores the microprogram, together with a sequence control unit, and 42 is an ALU (arithmetic logic unit). 42 exchanges data over data bus 44 and performs recognition and final determination.

Output 12 of classification device 2 places the classification result on the data bus through interface 121.

Standard pattern storage device 22, used during recognition, is connected through interface 221. Mask pattern number table 23 and mask patterns 24 are stored at different addresses in storage device 200. Unknown pattern 25 is also stored in 200.

For the final decision, pattern multiplier 43 forms the product of a standard pattern obtained from 22 and one mask pattern in 24 to give f′^ω, the same multiplier 43 forms the product f′ of unknown pattern 25 and the same mask pattern, and similarity calculation circuit 6 computes the similarity between f′^ω and f′. Decision result 14 is output through interface 141.

(6) Summary

As explained above, according to the present invention, similar characters that conventional character recognition devices could not recognize accurately can be recognized with high accuracy by using mask patterns to emphasize the points at which they differ.

In addition, the invention is efficient: mask patterns can be created, added, and corrected relatively easily, and little hardware is needed beyond that of a conventional device. In the above embodiment a similarity measure was used to gauge how close patterns are, but a dissimilarity measure may be used instead.

In that case the overall configuration changes little and is self-evident, so its description is omitted. It is also clear that accuracy improves further if the relative position of the patterns is corrected when recognition unit 3 and final determination unit 4 compute the similarity ρ or ρ′.

(On position correction, see Transactions of the Information Processing Society of Japan, Vol. 16, No. 12, p. 1064.) Alternatively, instead of the processing of unit 4 in the above embodiment, the pattern f(i, j) · f^ω(i, j) obtained as a result of the processing of unit 3 may be multiplied by the mask pattern h_μ(i, j), as in the above embodiment, and the final decision may be made using the amount of difference at the points of interest,

f(i, j) · f^ω(i, j) · h_μ(i, j)    (10)

As simplifications of the above embodiment, classification device 2 may be omitted (in which case the character numbers given to recognition unit 3 are all character numbers), or classification device 2 may use macroscopic features such as the number of black points or moments.

It is also possible to omit recognition unit 3 and use the character number with the maximum similarity in classification device 2 in place of the output of recognition unit 3. In the above embodiment, recognition unit 3 and final determination unit 4 match binary patterns, but multi-valued pattern matching may clearly be used as well, and the mask pattern likewise need not be a binary pattern of 1s and 0s but may be a multi-valued pattern whose values change smoothly at the boundary. The present invention is applicable not only to kanji but to patterns in general; in that case "character" in the description above should be read as "pattern category".

[Brief Description of the Drawings]

Fig. 1 shows examples of similar character patterns, and examples of mask patterns for them, to explain the principle of the present invention.

Claims

[Scope of Claims]

1. A pattern recognition device for determining a standard pattern corresponding to an input unknown pattern, comprising: means for outputting a category name of a standard pattern for the input unknown pattern; means for determining whether or not a similar category set exists; means for storing a pattern difference region for each similar category set; and means for extracting the corresponding pattern difference region and making a final decision in that region.

2. The pattern recognition device according to claim 1, wherein the storage means stores a mask pattern for masking a similar area for each similar category set.