JPS60159987A

JPS60159987A - Character recognizing device

Info

Publication number: JPS60159987A
Application number: JP59016467A
Authority: JP
Inventors: Minoru Nagao; 永尾　実
Original assignee: Tateisi Electronics Co; Omron Tateisi Electronics Co
Current assignee: Omron Corp
Priority date: 1984-01-30
Filing date: 1984-01-30
Publication date: 1985-08-21

Abstract

PURPOSE: To enable character recognition by a simple method by extracting the quadrilateral circumscribing a character pattern, dividing it into three regions vertically or horizontally, and determining in which divided region a feature such as a character branch point exists. CONSTITUTION: A binarized character pattern is stored in an image memory 1. A circumscribed quadrilateral extraction circuit 2 extracts the quadrilateral circumscribing the character pattern in the image memory 1. A branch point position extraction circuit 3 extracts the positions of the branch points that characterize the character. The extracted data are stored in a RAM 4. A CPU 5 interprets the program in a program memory 6, reads and writes data in the RAM 4 and the image memory 1, and controls the operation of the extraction circuits 2 and 3. In this way, "P" can be clearly distinguished from "9", and "5" from "2", by adding only a simple processing step.

Description

[Detailed Description of the Invention] <Technical Field of the Invention> The present invention relates to a character recognition device that optically reads an unknown character, binarizes it into black and white to obtain a character pattern, extracts features of the unknown character from that pattern, and identifies the unknown character by matching those features against standard patterns stored in advance in a dictionary.

<Background of the Invention> In conventional character recognition devices, dictionary matching proceeds by using the features of the unknown character to narrow down the candidate characters step by step, and a detailed dictionary matching operation is executed only at the end.

For example, if attention is paid to the presence or absence of intersections and loops among the character features, the numerals "1" to "9" are classified into four groups according to those attributes, as shown in FIG. 1. Thus, if an unknown character has the features "intersection present" and "no loop", it is judged to be one of the candidate characters "4", "5", and "7" in the third group. If the same processing is carried out with other character features, such as the number of loops, the number of end points, and the presence or absence of branch points, the candidates can be narrowed down further. In this kind of narrowing operation, however, using only intersections, branch points, loops, concavities, and the like as character features makes it impossible to distinguish, for example, between the letter "P" and the numeral "9" shown in FIG. 2, or between the numeral "5" and the numeral "2" shown in FIG. 3. Recognizing these characters has therefore required adopting a more complicated processing method.

<Object of the Invention> An object of the present invention is to provide a character recognition device that eliminates the above disadvantage by focusing on a specific character feature and determining where in the character pattern that feature is located.

<Configuration and Effects of the Invention> To achieve the above object, the present invention extracts the quadrilateral circumscribing the character pattern from the binarized data of the unknown character, divides the region of this quadrilateral into three, for example in the vertical direction as shown in FIGS. 4 and 5 or in the horizontal direction as shown in FIGS. 6 and 7, and determines in which divided region a feature such as a character branch point is located.

According to the present invention, in the case of the letter "P" in FIG. 4, the branch points T1 and T2 of the character lie in the leftmost region XA, whereas in the case of the numeral "9" in FIG. 5, the branch points T1, T2, and T3 lie in the rightmost region XC and none lie in the leftmost region XA, so the two can be clearly distinguished. Likewise, in the case of the numeral "5" in FIG. 6, the branch points T1 and T2 lie in the top region YA, whereas in the case of the numeral "2" in FIG. 7, the branch points T1 and T2 lie in the bottom region YC and none lie in the top region YA, so these two can also be clearly distinguished. The present invention thus makes it possible to distinguish between specific characters that could not previously be told apart and contributes to improved character recognition accuracy, thereby achieving the object of the invention with notable effect.

<Description of Embodiments> FIG. 8 shows an example of the circuit configuration of a device according to the present invention. The image memory 1 in the figure stores a character pattern binarized into black and white. The circumscribed quadrilateral extraction circuit 2 extracts the quadrilateral circumscribing the character pattern in the image memory 1, and the branch point position extraction circuit 3 extracts the positions of the branch points that characterize the character. The extracted data are stored in a RAM (Random Access Memory) 4. A CPU (Central Processing Unit) 5 interprets the program in the program memory 6, reads and writes data in the RAM 4 and the image memory 1, and controls the operation of the extraction circuits 2 and 3.

FIG. 9 shows the control operation of the CPU 5. Assume now that character pattern 7 of the letter "P" is stored on the XY coordinates of the image memory 1 as shown in FIG. 10. In step 10 of FIG. 9, the CPU 5 first extracts the quadrilateral 8 circumscribing character pattern 7. The data defining quadrilateral 8 are given by the maximum and minimum X-coordinates XM and Xm and the maximum and minimum Y-coordinates YM and Ym of character pattern 7. These coordinate data (in the illustrated example, Xm = 2, XM = 9, Ym = 2, YM = 10) are stored sequentially in a predetermined area of the RAM 4 shown in FIG. 11.
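The extraction in step 10 can be pictured with a minimal Python sketch. It assumes the binarized pattern is held as a list of rows with `image[y][x] == 1` for a black pixel; the function name `circumscribed_quadrilateral` and the sample `pattern` are hypothetical and are not the pattern of FIG. 10. The patent performs this step with a dedicated extraction circuit, so the code only illustrates what Xm, XM, Ym, and YM mean.

```python
def circumscribed_quadrilateral(image):
    """Return (Xm, XM, Ym, YM) for the black pixels of a binary image.

    Assumption: image[y][x] is 1 for a black (character) pixel, 0 for white.
    """
    # X-coordinates of every black pixel, gathered row by row.
    xs = [x for row in image for x, v in enumerate(row) if v]
    # Y-coordinates of every row that contains at least one black pixel.
    ys = [y for y, row in enumerate(image) if any(row)]
    if not xs:
        raise ValueError("image contains no character pixels")
    return min(xs), max(xs), min(ys), max(ys)


# Hypothetical 6x5 pattern used only for illustration.
pattern = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
]
Xm, XM, Ym, YM = circumscribed_quadrilateral(pattern)
```

The four returned values correspond to the data that the text says are written to the RAM 4 area of FIG. 11.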

Next, in step 11, the CPU 5 extracts from character pattern 7 the coordinates (X, Y) at which the branch points T1 and T2 are located. These coordinate data (in the illustrated example, T1 is (4, 5) and T2 is (4, 6)) are stored in a predetermined area of the RAM 4 shown in FIG. 12. In FIG. 12, the table stopper is a code indicating the end of the branch point information.

Then, in step 12, the CPU 5 divides the quadrilateral 8 into three in each of the vertical and horizontal directions, and in the following step 13 it determines in which of the divided regions the branch points T1 and T2 are located.

FIG. 13 shows the contents of steps 12 and 13 in more detail. In the figure, steps 20 to 25 calculate the position data X1 and X2 of the lines that divide quadrilateral 8 into three in the vertical direction, and steps 26 to 31 calculate the position data Y1 and Y2 of the lines that divide it into three in the horizontal direction. In the illustrated method, the horizontal and vertical sides of quadrilateral 8 are each divided by 3 to obtain the division constants Dx and Dy and their remainders (steps 20 and 26). After determining whether each remainder is 0, 1, or 2 (steps 21 to 22 and steps 27 to 28), the position data X1, X2 and Y1, Y2 are calculated according to the remainder (steps 23 to 25 and steps 29 to 31). These position data (in the illustrated example, X1 = 5, X2 = 7, Y1 = 5, Y2 = 8) are stored in a predetermined area of the RAM 4 shown in FIG. 14. Quadrilateral 8 is thereby divided, in terms of data, into the three vertical and three horizontal regions XA to XC and YA to YC shown in FIGS. 15 and 16.
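The text does not spell out how each remainder adjusts the line positions; that detail is in FIG. 13. The Python sketch below therefore uses an assumed rule, chosen only because it reproduces the values stated for the illustrated example: the side length is taken as maximum minus minimum, the quotient is the division constant, and a remainder of 1 or 2 widens the first part, or the first two parts, by one. The function name `dividing_lines` is hypothetical.

```python
def dividing_lines(lo, hi):
    """Return the two line positions that split the span lo..hi in three.

    Assumed rule (not taken from FIG. 13): side = hi - lo; the quotient of
    side / 3 is the division constant; a remainder of 1 widens the first
    part by one, and a remainder of 2 widens the first two parts by one.
    """
    d, r = divmod(hi - lo, 3)           # division constant and remainder
    first = lo + d + (1 if r >= 1 else 0)
    second = first + d + (1 if r == 2 else 0)
    return first, second


X1, X2 = dividing_lines(2, 9)    # Xm = 2, XM = 9 from the illustrated example
Y1, Y2 = dividing_lines(2, 10)   # Ym = 2, YM = 10 from the illustrated example
print(X1, X2, Y1, Y2)            # 5 7 5 8
```

Other remainder rules could yield the same four numbers, so this should be read as one consistent possibility rather than as the flowchart itself.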

Next, in steps 32 to 37, the CPU 5 determines in which of the vertically divided regions XA to XC the branch points T1 and T2 are located, and in steps 38 to 43, in which of the horizontally divided regions YA to YC they are located. This determination is made by comparing the X-coordinates of branch points T1 and T2 with the position data X1 and X2 (steps 32 and 33) and their Y-coordinates with the position data Y1 and Y2 (steps 38 and 39). Based on the result of each step, data representing the locations of branch points T1 and T2 are set in a predetermined area of the RAM 4 shown in FIG. 17 (steps 34 to 36 and steps 40 to 41). The coordinates of branch points T1 and T2 are read from the RAM area shown in FIG. 12; when the data read is the table stopper, the determinations in steps 37 and 43 become "YES" and each determination process ends. Thus, in FIG. 17, data "1" is set in the RAM areas corresponding to regions XA, YA, and YB, which indicates that branch points are located in these divided regions.
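The comparison loop can be sketched in Python as follows. Two points are assumptions: the table stopper is modeled as `None`, and a coordinate equal to a line position is assigned to the lower-numbered region. The second assumption is chosen because, with T1 = (4, 5), T2 = (4, 6), X1 = 5, X2 = 7, Y1 = 5, and Y2 = 8, it yields the flags XA, YA, and YB that the text reports for FIG. 17. The names `classify` and `region_flags` are hypothetical.

```python
TABLE_STOPPER = None  # hypothetical stand-in for the end-of-table code


def classify(coord, line1, line2):
    """Return 0, 1, or 2 for the first, second, or third region.

    Assumption: a point lying on a dividing line belongs to the lower region.
    """
    if coord <= line1:
        return 0
    if coord <= line2:
        return 1
    return 2


def region_flags(table, x_lines, y_lines):
    """Set a flag to 1 for every region that contains a branch point."""
    flags = {name: 0 for name in ("XA", "XB", "XC", "YA", "YB", "YC")}
    for entry in table:
        if entry is TABLE_STOPPER:      # end of the branch point information
            break
        x, y = entry
        flags[("XA", "XB", "XC")[classify(x, *x_lines)]] = 1
        flags[("YA", "YB", "YC")[classify(y, *y_lines)]] = 1
    return flags


# Branch points of the "P" example, followed by the table stopper.
branch_table = [(4, 5), (4, 6), TABLE_STOPPER]
flags_p = region_flags(branch_table, (5, 7), (5, 8))
```

Here `flags_p` has XA, YA, and YB set to 1 and the other three regions left at 0.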

FIG. 18 shows the character pattern of the numeral "9"; by the same method as above, the branch point location data shown in FIG. 19 can be obtained.

Comparing the data arrangement in FIG. 19 with that in FIG. 17 shows that the two clearly do not match. Therefore, by referring, for example, to the data content corresponding to region XA, the letter "P" can be distinguished from the numeral "9".
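As a minimal Python sketch of this final check, the function below looks only at the XA flag. The contents of FIG. 19 are not reproduced in the text, so the flags for "9" are an assumption based on the earlier statement that its branch points lie in region XC and not in region XA. The function name `p_or_9` is hypothetical.

```python
def p_or_9(flags):
    """Return "P" if a branch point lies in region XA, otherwise "9".

    Only meaningful after the candidates have been narrowed to "P" and "9".
    """
    return "P" if flags["XA"] == 1 else "9"


flags_p = {"XA": 1, "XB": 0, "XC": 0}  # X flags reported for FIG. 17
flags_9 = {"XA": 0, "XB": 0, "XC": 1}  # assumed: branch points in XC only

print(p_or_9(flags_p), p_or_9(flags_9))
```

The same pattern would apply to "5" versus "2" by testing the YA flag instead, following the description of FIGS. 6 and 7.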

[Brief Description of the Drawings]

FIG. 1 is an explanatory diagram showing an example of the classification of candidate characters. FIGS. 2 and 3 are explanatory diagrams showing examples of character patterns that conventional devices cannot classify. FIGS. 4 to 7 are explanatory diagrams showing character patterns used to explain the method of the present invention. FIG. 8 is a circuit block diagram showing an example of the device of the present invention. FIG. 9 is a flowchart showing the control operation of the CPU. FIG. 10 is an explanatory diagram showing a character pattern in the image memory. FIGS. 11 and 12 are explanatory diagrams showing the state of data storage in the RAM. FIG. 13 is a flowchart showing the control operation of the CPU. FIG. 14 is an explanatory diagram showing the state of data storage in the RAM. FIGS. 15 and 16 are explanatory diagrams showing the divided regions of the quadrilateral. FIG. 17 is an explanatory diagram showing the RAM data contents that represent the determination result. FIG. 18 is an explanatory diagram showing a character pattern in the image memory. FIG. 19 is an explanatory diagram showing the RAM data contents that represent the determination result.

2: circumscribed quadrilateral extraction circuit; 3: branch point position extraction circuit; 5: CPU.

Patent applicant: Tateisi Electronics Co. (立石電機株式会社)

Claims

[Claims]
(1) A character recognition device that reads an unknown character, converts it into black-and-white binary form to obtain a character pattern, extracts features of the unknown character, and matches them against standard patterns, the device comprising: means for extracting a quadrilateral circumscribing the character pattern; means for dividing the region of the extracted quadrilateral into a plurality of regions; and means for determining the presence or absence of a specific character feature in each divided region.
(2) The character recognition device according to claim 1, wherein the quadrilateral is divided into three in each of the vertical and horizontal directions.
(3) The character recognition device according to claim 1, wherein the specific character feature is a branch point of a character.