JPH0789364B2

JPH0789364B2 - Pattern recognition device

Info

Publication number: JPH0789364B2
Application number: JP61257046A
Authority: JP
Inventors: 康雄黒須; 修国崎; 佳弘横山; 宏一岡澤; 彰三門田
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1986-10-30
Filing date: 1986-10-30
Publication date: 1995-09-27
Anticipated expiration: 2010-09-27
Also published as: JPS63113690A

Description

[Detailed description of the invention]

[Industrial field of application]

The present invention relates to a pattern recognition device used as a Japanese input device, and in particular to a pattern recognition device suited to the input of patterns, such as handwritten kanji, in which many groups of extremely similar characters exist.

[Conventional technology]

In general, a character reader identifies an unknown input pattern either by matching it against standard patterns or by converting it into a feature code string and passing that string through a sequential logic circuit. With either method, an unknown input pattern may be accepted by two or more standard patterns, which makes it hard to decide which category it belongs to. Such patterns are normally rejected as ambiguous.

Three problems must be solved before a handwritten kanji recognition device can be put into practical use.

(1) Handwritten kanji are complex and deform in many ways, so the recognition method must tolerate these properties.

(2) The number of character types to be recognized is more than an order of magnitude larger than for alphanumerics, so the recognition method must keep the processing load and the dictionary size down.

(3) Handwritten kanji include many groups of extremely similar characters, so the recognition method must be able to tell them apart.

Many methods have been proposed for problems (1) and (2), and they are considered to have reached a reasonable level, but (3) remains as the last open problem.

Character recognition methods, as noted above, fall broadly into two classes: structural analysis methods, which focus on feature points of the character pattern such as end points, branch points, and bend points, and pattern matching methods, which treat the character pattern globally as a two-dimensional image.

A known conventional device for recognizing alphanumerics, described in JP-A-51-118333, performs recognition by structural analysis and, when several candidate categories remain, extracts features of the same kind as in the recognition stage and analyzes the character pattern down to its detailed structure in order to identify similar characters. Because this device works on the fine structure of the character pattern (end points, branch points, bend points, and so on), it achieves high identification accuracy even for similar characters. When applied to handwritten kanji, however, it gives no consideration to the growth in dictionary size (memory capacity) caused by the variety of character shapes.

Another conventional technique, for recognizing handwritten kanji, is described in JP-A-58-207183. It performs recognition by a pattern matching method that focuses on the background of the character pattern and, when several candidate categories remain, uses the same features as the recognition stage and looks at the parts that differ between similar categories in order to identify similar characters. Because this technique works on the global features of the character pattern rather than its fine structure, it avoids the growth in dictionary size caused by the variety of handwritten kanji shapes. However, it gives no consideration to extracting the fine structure of the character pattern, which is what reflects the differences between similar characters.

[Problems to be solved by the invention]

Handwritten kanji have more strokes than alphanumerics, so image distortions that destroy topology, such as strokes merging or fading, appear prominently. For example, experiments on the handwritten character 田 (ta) are known to show that the number of loops, one of the structural features of the character pattern, is distributed almost evenly from 1 to 4.

The structural-analysis features that proved effective for identifying similar handwritten alphanumerics rely on the topology at feature points, and were therefore considered inapplicable to handwritten kanji as they stand.

The prior art described in the former publication gives no consideration to the variety of feature points in handwritten kanji, so a dedicated standard pattern has to be prepared for each similar character pair of every deformation.

The similar character identification method that focuses on the background of the character pattern, on the other hand, scans up, down, left, and right from each background point, counts the number of times the scan crosses a character stroke at each prescribed angle, and assigns that count to the point to extract features. It then identifies similar characters by looking at the parts of this feature space that differ between them. By its nature, this feature extracts the global characteristics of a character pattern fairly stably, but it is known to be unable to capture the detailed structure of character strokes.

Background-based features, which were effective for handwritten kanji recognition using the whole character pattern, therefore cannot express the fine structure of the pattern, and their applicability to similar character identification in their present form is doubtful.

The prior art described in the latter publication gives no consideration to extracting the fine structure of the character pattern that reflects the differences between similar characters, so high identification accuracy cannot be expected when identifying similar characters.

An object of the present invention is to provide a pattern recognition device that, when identifying similar handwritten kanji, suppresses the growth in dictionary size caused by the variety of kanji shapes while still extracting the fine structure of the character pattern that reflects the differences between similar characters. To this end, the device identifies similar characters using difference position information for the similar characters together with direction-specific pattern information obtained from the directional components of the character line segments.

Here, difference position information is defined as position information on the parts whose shapes differ between a pair of similar candidates. Direction-specific pattern information is described in detail in the section [Means for solving problems] below.

[Means for solving problems]

The above object is achieved by exploiting the fact that the direction-specific pattern information obtained in the recognition unit expresses the fine structure of character line segments. Specifically, the contour is extracted from the input character pattern and, to remove noise components, the direction and strength at each contour point are determined from that point and the sequence of points before and after it, and the result is assigned to the corresponding direction plane. Next, to absorb positional shifts of the character line segments, each direction plane is blurred, and the result is taken as the feature of the whole character pattern. Then, using the difference pattern information defined for each candidate character pair, the differing parts are cut out of the whole-pattern feature to give a feature that expresses the difference for that pair. Finally, pairwise judgments are made in sequence using these features to identify the similar characters.

The principle of the present invention is described below with reference to the drawings.

FIG. 3 is a conceptual diagram of the directional feature extraction underlying the invention. First, contour points and directional components are extracted (b) from the preprocessed, normalized character pattern (a). Next, to remove noise such as line-edge noise, the direction and strength of each contour point are determined from the contour point of interest and the sequence of contour points around it. To build the direction-specific feature planes, the strength so determined is assigned to the corresponding direction plane (c), and the completed direction planes become the direction-specific feature planes. Finally, to guard against positional shifts of the character line segments and to compress the feature data, each direction-specific feature plane is blurred (d). Here a 6×6 window function is convolved over each plane and the amount of data is compressed to one sixteenth, but other settings may be used as long as a similar effect is obtained.
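The blurring step (d) can be sketched as follows. This is a minimal sketch, assuming a uniform 6×6 box window moved in steps of 4 mesh cells, which is one setting that gives the stated 1/16 compression; the patent does not specify the window weights or the step, and `blur_and_resample`, `window`, and `stride` are illustrative names.

```python
def blur_and_resample(plane, window=6, stride=4):
    """Average a direction plane over window x window cells, one output per stride.

    Assumption: uniform (box) weights. Cells outside the plane count as zero.
    """
    h, w = len(plane), len(plane[0])
    pad = (window - stride) // 2  # overlap on each side of a stride-sized cell
    out = []
    for top in range(0, h, stride):
        row = []
        for left in range(0, w, stride):
            total = 0.0
            for r in range(top - pad, top - pad + window):
                for c in range(left - pad, left - pad + window):
                    if 0 <= r < h and 0 <= c < w:
                        total += plane[r][c]
            row.append(total / (window * window))
        out.append(row)
    return out


# A 16 x 16 plane shrinks to 4 x 4, i.e. 256 values become 16.
PLANE = [[1.0] * 16 for _ in range(16)]
BLURRED = blur_and_resample(PLANE)
```

Because neighbouring windows overlap by one cell on each side, a stroke that shifts by a cell or so still contributes to the same output value, which is the tolerance to positional shift that the text describes.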

FIG. 2 is a conceptual diagram of an example configuration of the similar character identification dictionary underlying the invention. The example shows the binary pattern 1 of 玉 (jewel), the binary pattern 2 of 王 (king), the directional feature 3 of 玉 extracted by the method above, the directional feature 4 of 王 extracted likewise, and the difference pattern 5 for the similar character pair 玉 and 王. In the invention, recognition is performed using the directional features and, when several candidate categories remain, pairwise judgments are made in sequence using the features of the parts that express the difference for each candidate character pair, and the category remaining at the end is output as the recognition result. In this example, directional feature 3 of 玉 has feature values at the positions of difference pattern 5, whereas directional feature 4 of 王 has no feature values at those positions.

As this example shows, even a similar character pair that cannot be distinguished by the feature of the whole character pattern can be identified easily by using the difference pattern of the directional features, which express the details of the character pattern.
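A difference pattern such as 5 in FIG. 2 can be illustrated on toy data. The sketch below assumes the difference pattern is a binary mask marking the positions where the two standard feature planes differ by more than a threshold; the patent does not say how the dictionary entries are derived, and the 3×3 planes, `difference_pattern`, and `threshold` are hypothetical.

```python
def difference_pattern(std_a, std_b, threshold=0.5):
    """Mark positions where two standard feature planes differ noticeably."""
    return [
        [1 if abs(a - b) > threshold else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(std_a, std_b)
    ]


# Toy single-plane features. They differ only in the lower-right cell,
# standing in for the extra dot of 玉.
FEATURE_KING = [[1.0, 1.0, 1.0],
                [0.0, 1.0, 0.0],
                [1.0, 1.0, 0.0]]
FEATURE_JEWEL = [[1.0, 1.0, 1.0],
                 [0.0, 1.0, 0.0],
                 [1.0, 1.0, 1.0]]

MASK = difference_pattern(FEATURE_JEWEL, FEATURE_KING)
```

Only the masked cell separates the two toy characters, so a judgment restricted to that cell is not diluted by the eight cells in which they agree.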

[Action]

After candidate categories have been selected using the directional features of the whole unknown input pattern, the candidates are narrowed down in sequence using those directional features together with difference position information indicating the parts whose shapes differ between similar candidate pairs, and the category remaining at the end is output as the recognition result.

This allows judgments to be made on the differing parts of features that can express the fine structure of characters, so similar characters can be identified accurately.

In addition, because the standard patterns for recognition can also serve as the standard patterns for similar character identification, similar character identification logic can be added without increasing dictionary memory.

An embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a block diagram showing the configuration of one embodiment of the pattern recognition device according to the invention, in which 11 is a character observation unit, 12 is a recognition unit, 13 is a similar character identification unit, 21 is a photoelectric conversion unit, 22 is an A/D conversion unit, 23 is a preprocessing unit, 24 is a directional feature extraction unit, 25 is a matching unit, 26 is a feature dictionary, 27 is a candidate character type storage unit, 28 is a difference information extraction unit, 29 is a difference pattern dictionary, 30 is a character type pair identification unit, 31 is a determination unit, 32 is a control unit, and 33 is an output terminal.

In FIG. 1, a character pattern written on paper or the like is converted to a video signal by the photoelectric conversion unit 21, then sampled and quantized by the analog-to-digital conversion unit 22 to give a binary mesh pattern.

The photoelectric conversion unit 21 and the analog-to-digital (A/D) conversion unit 22 are together called the character observation unit 11.

The observed character pattern undergoes a series of preprocessing steps in the preprocessing unit 23 (segmentation, noise removal, normalization, and so on) and becomes a normalized pattern.

Segmentation is the process of extracting one character from the group of character patterns on the page so that recognition and later processing operate one character at a time. Normally a region of about 100×100 mesh cells containing one character is cut out.

Noise removal is the process of removing patterns other than character patterns, such as dust adhering to the page. Such noise is hard to tell apart from genuine parts of a character such as dakuten (voicing marks), and various techniques have long been devised for it.

Finally, normalization is the process of bringing characters to a fixed size to make recognition easier. Depending on the recognition method, a technique such as aligning to the circumscribed frame or aligning to the center of gravity is chosen.
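Normalization to the circumscribed frame can be sketched as below. This is a sketch under stated assumptions: nearest-neighbour sampling onto a square mesh, with `normalize_to_frame` and `size` as illustrative names; the patent names the technique but gives no procedure.

```python
def normalize_to_frame(img, size):
    """Stretch the bounding box of the character pixels onto a size x size mesh.

    img is a list of rows of 0/1 values. Assumption: nearest-neighbour sampling.
    """
    rows = [r for r, row in enumerate(img) if any(row)]
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    if not rows:  # blank input: nothing to normalize
        return [[0] * size for _ in range(size)]
    top, left = rows[0], cols[0]
    box_h = rows[-1] - top + 1
    box_w = cols[-1] - left + 1
    return [
        [img[top + (r * box_h) // size][left + (c * box_w) // size]
         for c in range(size)]
        for r in range(size)
    ]


SMALL = [[0, 0, 0, 0],
         [0, 1, 1, 0],
         [0, 1, 0, 0],
         [0, 0, 0, 0]]
NORMALIZED = normalize_to_frame(SMALL, 4)
```

The 2×2 bounding box in `SMALL` is stretched to fill the whole 4×4 mesh, so characters written at different sizes end up on the same grid before feature extraction.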

The preprocessed, normalized pattern is then passed to the directional feature extraction unit 24, which creates four feature patterns, one per direction.

The directional feature extraction unit 24 first extracts the contour. It raster-scans the character pattern from the upper left downward and, from the point where the scan first hits the character pattern, traces the contour along the pattern. At the same time it records the direction and coordinates of each contour point in a contour table prepared in advance. Following this procedure, all the contours of one character are recorded in the contour table.
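The raster scan and contour trace can be sketched with Moore-neighbour tracing. The patent does not name a tracing algorithm, so this choice, the 8-direction coding, and the names `NEIGHBOURS` and `trace_contour` are assumptions made for illustration.

```python
# 8-neighbour offsets (row, col) in clockwise order, starting from west.
NEIGHBOURS = [(0, -1), (-1, -1), (-1, 0), (-1, 1),
              (0, 1), (1, 1), (1, 0), (1, -1)]


def trace_contour(img):
    """Return a contour table: a list of (row, col, direction to next point)."""
    h, w = len(img), len(img[0])

    def is_fg(r, c):
        return 0 <= r < h and 0 <= c < w and img[r][c] == 1

    # Raster scan from the upper left until the first character pixel is hit.
    start = next(((r, c) for r in range(h) for c in range(w) if img[r][c] == 1), None)
    if start is None:
        return []

    table = []
    cur, back = start, 0  # the pixel west of the start is known background
    while True:
        # Search clockwise around cur, starting just after the background pixel.
        for k in range(1, 8):
            d = (back + k) % 8
            nxt = (cur[0] + NEIGHBOURS[d][0], cur[1] + NEIGHBOURS[d][1])
            if is_fg(*nxt):
                break
        else:
            return [(cur[0], cur[1], None)]  # isolated pixel, no direction
        table.append((cur[0], cur[1], d))
        # The last background pixel examined becomes the next search origin.
        prev = (back + k - 1) % 8
        bg = (cur[0] + NEIGHBOURS[prev][0], cur[1] + NEIGHBOURS[prev][1])
        back = NEIGHBOURS.index((bg[0] - nxt[0], bg[1] - nxt[1]))
        cur = nxt
        if cur == start:
            return table


SQUARE = [[0, 0, 0, 0, 0],
          [0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 0, 0, 0, 0]]
TABLE = trace_contour(SQUARE)
```

This sketch follows only the outer contour of the first component it meets and stops as soon as it returns to the start point. Recording all contours of a character, as the text describes, would need repeated scans for unvisited boundaries and inner contours.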

Once the contour table for one character is complete, the four direction-specific feature patterns are created. The direction and strength at each point are obtained from the contour table using the contour point of interest and the sequence of contour points around it. When the direction and strength of the contour point of interest have been obtained, the strength is written at the corresponding coordinates of the matching direction-specific feature plane. Following this procedure, the direction-specific feature patterns for one character are created.
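One way to turn contour points into four direction planes is sketched below. The patent states only that direction and strength come from the point of interest and its surrounding contour points. The rest is assumed here: orientations quantized to 0°, 45°, 90°, and 135°, the direction taken from the chord between the k-th neighbours on either side, and the strength defined as chord length divided by path length, so that jagged, noisy stretches score lower than straight ones. `direction_planes` and `k` are illustrative names.

```python
import math


def direction_planes(contour, size, k=2):
    """Accumulate contour strength into four size x size orientation planes.

    contour is a closed list of (row, col) points in tracing order.
    """
    planes = [[[0.0] * size for _ in range(size)] for _ in range(4)]
    n = len(contour)
    for i, (r, c) in enumerate(contour):
        pts = [contour[(i + j) % n] for j in range(-k, k + 1)]
        dr = pts[-1][0] - pts[0][0]
        dc = pts[-1][1] - pts[0][1]
        chord = math.hypot(dr, dc)
        path = sum(math.hypot(b[0] - a[0], b[1] - a[1])
                   for a, b in zip(pts, pts[1:]))
        if chord == 0 or path == 0:
            continue  # no usable direction at this point
        # Orientation in [0, 180) degrees; rows grow downward, hence -dr.
        angle = math.degrees(math.atan2(-dr, dc)) % 180.0
        d = int((angle + 22.5) // 45) % 4
        planes[d][r][c] += chord / path  # straightness, at most 1
    return planes


RING = [(1, 1), (1, 2), (1, 3), (2, 3), (3, 3), (3, 2), (3, 1), (2, 1)]
PLANES = direction_planes(RING, size=5, k=1)
```

In this example the points on the straight edges of the square land in the horizontal or vertical plane with strength 1.0, while the corner points land in a diagonal plane with a lower strength.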

Further, to guard against positional shifts of the character line segments and to compress the feature data, each direction-specific feature plane is blurred and then resampled to remove mesh cells that are no longer needed, which compresses the amount of data. Following this procedure, the four directional feature patterns are created.

The extracted directional features are passed to the matching unit 25, which calculates their similarity to the standard patterns stored in the feature dictionary 26.

The matching unit 25 first stores the four directional feature patterns in an input feature buffer. It then reads out the standard patterns stored in the feature dictionary 26 one after another, obtains the similarity between each and the features of the input character pattern, and passes those that satisfy a preset evaluation value, together with their category information, to the candidate character type storage unit 27.
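Candidate selection in the matching unit can be sketched as follows. The patent does not define the similarity measure or the evaluation value, so the measure used here, 1 / (1 + squared Euclidean distance), the threshold, and the flattened four-element toy features are all assumptions, as are the names `similarity` and `select_candidates`.

```python
def similarity(a, b):
    """Similarity in (0, 1]; 1.0 means identical feature vectors (assumed measure)."""
    return 1.0 / (1.0 + sum((x - y) ** 2 for x, y in zip(a, b)))


def select_candidates(input_feature, feature_dictionary, min_similarity):
    """Return (category, similarity) pairs that pass the threshold, best first."""
    candidates = []
    for category, standard in feature_dictionary.items():
        s = similarity(input_feature, standard)
        if s >= min_similarity:
            candidates.append((category, s))
    return sorted(candidates, key=lambda item: item[1], reverse=True)


# Toy dictionary of flattened features; real entries would hold four planes.
DICTIONARY = {
    "王": [1.0, 1.0, 1.0, 0.0],
    "玉": [1.0, 1.0, 1.0, 1.0],
    "田": [0.0, 1.0, 0.0, 1.0],
}
CANDIDATES = select_candidates([1.0, 1.0, 1.0, 0.8], DICTIONARY, min_similarity=0.5)
```

With these toy values both 玉 and 王 pass the threshold, which is the situation the similar character identification stage is meant to resolve.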

The directional feature extraction unit 24, the matching unit 25, and the feature dictionary 26 are together called the recognition unit 12.

The candidate character type storage unit 27 selects similar character pairs in sequence from the group of similar characters judged plausible by the recognition unit 12. Such a similar character pair corresponds to 玉 and 王 in the explanation of the principle with FIG. 2.

The selected similar character pair is passed to the difference pattern dictionary 29, where the difference pattern corresponding to that pair is selected.

Meanwhile, the directional features of the input character pattern extracted by the directional feature extraction unit 24 are also passed to the difference information extraction unit 28. The standard patterns of the similar character pair are likewise read out of the feature dictionary 26 and passed to the difference information extraction unit 28.

The difference information extraction unit 28 stores the directional features of the input pattern in an input feature buffer and the standard patterns of the similar character pair in a standard pattern buffer. It then reads out the difference pattern for the similar character pair stored in the difference pattern dictionary 29 and cuts the differing part out of both the directional features of the input character pattern and the standard patterns. This differing part corresponds to the feature values lying in the black area of difference pattern 5 in the explanation of the principle with FIG. 2.
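The cut-out operation can be sketched as a masked selection. The sketch assumes the difference pattern holds one binary mask per direction plane, which the patent does not state explicitly; `cut_out` and the 2×2 toy planes are hypothetical.

```python
def cut_out(planes, mask_planes):
    """Collect the feature values at positions where the difference pattern is set.

    planes and mask_planes are lists of equally sized 2-D lists, one per direction.
    """
    return [
        value
        for plane, mask in zip(planes, mask_planes)
        for feature_row, mask_row in zip(plane, mask)
        for value, flag in zip(feature_row, mask_row)
        if flag
    ]


# Two toy direction planes and their masks (four planes in the embodiment).
FEATURES = [[[1, 2], [3, 4]],
            [[5, 6], [7, 8]]]
MASKS = [[[0, 1], [0, 0]],
         [[0, 0], [1, 1]]]
PART = cut_out(FEATURES, MASKS)
```

Applying the same masks to the input features and to both standard patterns yields vectors of equal length, so any similarity measure that compares vectors element by element can be used on them.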

The cut-out directional features of the input character pattern and the cut-out standard patterns are passed to the character type pair identification unit 30, which calculates the similarity between the input and each standard pattern.

The character type pair identification unit 30 obtains the similarity between the partial directional features of the input character pattern and each standard pattern of the similar character pair, and passes these similarities, together with the category information of the pair, to the determination unit 31.

The determination unit 31 compares the similarities for the similar character pair and decides which character type to remove from the candidates. The character type to be removed is reported to the candidate character type storage unit 27 and deleted from it.

Pairwise judgments are repeated in this way, and when the candidates have been narrowed down to a single character type, that character type is output from the output terminal 33.
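The repeated pairwise judgment can be sketched as an elimination loop. This is a sketch under stated assumptions: features are flattened to short lists, similarity is 1 / (1 + squared distance), pairs are taken from the front of the candidate list, and the third character 主 is added only to show more than one round. None of these details, nor the names used, come from the patent.

```python
def similarity(a, b):
    """Assumed similarity measure: 1.0 for identical vectors, falling toward 0."""
    return 1.0 / (1.0 + sum((x - y) ** 2 for x, y in zip(a, b)))


def cut_out(feature, mask):
    """Keep only the feature values where the difference pattern is set."""
    return [v for v, m in zip(feature, mask) if m]


def narrow_down(input_feature, candidates, standards, difference_dictionary):
    """Drop the less similar member of a candidate pair until one remains."""
    remaining = list(candidates)
    while len(remaining) > 1:
        a, b = remaining[0], remaining[1]
        mask = difference_dictionary[frozenset((a, b))]
        part = cut_out(input_feature, mask)
        s_a = similarity(part, cut_out(standards[a], mask))
        s_b = similarity(part, cut_out(standards[b], mask))
        remaining.remove(b if s_a >= s_b else a)
    return remaining[0] if remaining else None


# Toy features: cells 0-2 are the shared body, cell 3 a lower dot, cell 4 a top dot.
STANDARDS = {
    "王": [1.0, 1.0, 1.0, 0.0, 0.0],
    "玉": [1.0, 1.0, 1.0, 1.0, 0.0],
    "主": [1.0, 1.0, 1.0, 0.0, 1.0],
}
DIFFERENCES = {
    frozenset(("王", "玉")): [0, 0, 0, 1, 0],
    frozenset(("王", "主")): [0, 0, 0, 0, 1],
    frozenset(("玉", "主")): [0, 0, 0, 1, 1],
}
RESULT = narrow_down([1.0, 0.9, 1.0, 0.9, 0.1], ["王", "玉", "主"],
                     STANDARDS, DIFFERENCES)
```

Each round removes exactly one candidate, so N candidates are resolved in N - 1 judgments, each of which touches only the few masked cells.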

The candidate character type storage unit 27, the difference information extraction unit 28, the difference pattern dictionary 29, and the character type pair identification unit 30 are together called the similar character identification unit 13.

In the description of FIG. 1 above, control signals are supplied from the control unit 32 to each part of the circuit, and signals reporting the state of each part are returned to the control unit 32. Since these are not essential to understanding the invention, their description is omitted for simplicity.

According to this embodiment, candidate categories are selected in the recognition stage and similar characters are then identified using the difference position information of similar character pairs together with the directional features. The fine structure of characters can thus be expressed, and the standard patterns for similar character identification can be shared with those for recognition, so similar characters can be identified with very high accuracy without adding dictionary memory.

[Effect of the invention]

As described above, adopting directional features makes it possible to express the fine structure of character patterns without using structural-analysis features. Even when identifying similar characters, where fine structural differences matter, groups of similar characters can therefore be identified with very high accuracy without increasing the dictionary size.

Moreover, because the features used for similar character identification are directional features, they can be shared with the standard patterns for recognition. Similar character identification logic can therefore be added without extra dictionary memory, which is a major advantage in terms of device configuration, and a pattern recognition device with excellent functionality that is free of the drawbacks of the conventional examples can be provided.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of one embodiment of the invention, FIG. 2 is a conceptual diagram showing an example of the similar character identification dictionary underlying the invention, and FIG. 3 is a conceptual diagram of the feature extraction likewise underlying the invention.

11: character observation unit, 12: recognition unit, 13: similar character identification unit, 21: photoelectric conversion unit, 22: A/D conversion unit, 23: preprocessing unit, 24: directional feature extraction unit, 25: matching unit, 26: feature dictionary, 27: candidate character type storage unit, 28: difference information extraction unit, 29: difference pattern dictionary, 30: character type pair identification unit, 31: determination unit, 32: control unit, 33: output terminal.

Continuation of the front page

(72) Inventor: Yoshihiro Yokoyama, c/o Microelectronics Equipment Development Laboratory, Hitachi, Ltd., 292 Yoshida-cho, Totsuka-ku, Yokohama, Kanagawa
(72) Inventor: Koichi Okazawa, c/o Microelectronics Equipment Development Laboratory, Hitachi, Ltd., 292 Yoshida-cho, Totsuka-ku, Yokohama, Kanagawa
(72) Inventor: Shozo Kadota, c/o Odawara Works, Hitachi, Ltd., 2880 Kozu, Odawara, Kanagawa
(56) References: JP-A-61-147385 (JP, A); JP-A-59-154579 (JP, A)

Claims

[Claims]

1. A pattern recognition device comprising:
a directional feature extraction unit that observes an unknown input pattern and forms direction-specific feature patterns from the directional components of character line segments;
a feature dictionary in which each standard pattern is likewise stored as direction-specific feature patterns;
a matching unit that calculates the similarity between the direction-specific feature patterns of the input pattern formed by the directional feature extraction unit and the direction-specific feature patterns of each standard pattern stored in the feature dictionary, and determines as similar characters those standard patterns whose similarity satisfies a preset evaluation value;
a difference pattern dictionary in which, for the standard patterns stored in the feature dictionary, the difference pattern between the direction-specific feature patterns of each pair of standard patterns is stored;
a candidate character type storage unit that takes in an arbitrary pair of similar characters from the group of similar characters determined by the matching unit and reads out the difference pattern corresponding to that similar character pair from the difference pattern dictionary;
a difference information extraction unit that takes in the direction-specific feature patterns of the two standard patterns forming the similar character pair in the candidate character type storage unit and the direction-specific feature patterns of the input pattern, and cuts out of each the portion at the position corresponding to the difference pattern read from the difference pattern dictionary;
a character type pair identification unit that obtains the similarity between the portion cut out of the direction-specific feature patterns of the input pattern and the portion cut out of the direction-specific feature patterns of one of the two standard patterns forming the similar character pair, and the similarity between the portion cut out of the direction-specific feature patterns of the input pattern and the portion cut out of the direction-specific feature patterns of the other of the two standard patterns; and
a determination unit that compares the two similarities obtained by the character type pair identification unit, determines which of the two standard patterns forming the similar character pair in the candidate character type storage unit has the lower similarity, and deletes the direction-specific feature patterns of that standard pattern from the matching unit;
wherein, as long as a plurality of similar characters are stored in the matching unit, the series of processes by the candidate character type storage unit, the difference information extraction unit, the character type pair identification unit, and the determination unit is repeated to narrow down the similar characters determined by the matching unit, and the standard pattern remaining last in the matching unit as a similar character is taken as the recognition result for the input pattern.