JPH04338889A

JPH04338889A - Method and device for character recognition

Info

Publication number: JPH04338889A
Application number: JP3111538A
Authority: JP
Inventors: Atsuko Kogahara; 古河原　敦子
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1991-05-16
Filing date: 1991-05-16
Publication date: 1992-11-26

Abstract

PURPOSE: To recognize accurately and in a short time even an inclined character, such as a character manually pasted onto a drawing or the like, by building the dictionary from multiple templates and assigning a priority level to each template. CONSTITUTION: A dictionary 11 is built from multiple templates: a template of upright characters and templates of characters at various inclinations are registered in the dictionary 11, and priority levels are assigned to these templates in descending order of their likelihood of matching the characters on the paper. Characters written on a sheet such as a drawing are read in by a scanner 1, a frame memory 9, a character pattern memory 8, a character area segmenting part 6, etc., to obtain a character image. A character recognizing part 7 collates this character image with the templates one after another in order of priority level, and the character of the template that best matches the character image is recognized as the character on the paper.

Description

[Detailed description of the invention]

[0001]

[Industrial Field of Application] The present invention relates to a character recognition system, and more particularly to a system for recognizing characters and symbols in drawings when the drawings are automatically input into a computer system.

[0002]

[Prior Art] In recent years, CAD/CAM systems that use computers have become widespread in design and manufacturing departments. As a result, there is a growing need to automate drawing input as a means of converting drawings on paper, such as mechanical drawings, into CAD data that a computer can process. Recognition of characters and symbols is important in drawing input. Conventionally, drawings are input as follows.
(1) The drawing is read with a scanner or the like and binarized to create raster data.
(2) Character and symbol areas (hereinafter referred to as character areas) are extracted from the raster data on the basis of size. For each character area, a circumscribed rectangle whose sides run parallel to the horizontal scanning direction is assumed, and character area data consisting of the coordinates of the upper left and lower right corners of that rectangle is created. FIG. 8 shows an example of character areas. In FIG. 8, 21 and 22 are circumscribed rectangles indicating graphic areas, and 23, 24, 25, and 26 are circumscribed rectangles indicating character areas. Each rectangular area is represented by the coordinates (X, Y) of its upper left and lower right corners (indicated by dots).

[0003] (3) From all the character area data, character string data is created. Each character string merges character areas that lie along a straight line of arbitrary slope and are adjacent to one another within a predetermined character spacing. The character string data consists of the width W and height H of the rectangle circumscribing all character areas in the string, the coordinates (X, Y) of the upper left corner of that rectangle, and the angle θ between the scanning direction and the line connecting the center of the first character area to the center of the last character area. For a character string consisting of a single character, however, the angle θ is set to a predetermined value. FIG. 9 shows examples of character strings. In FIG. 9, 27 is the rectangle circumscribing character areas 24, 25, and 26. The coordinates (X2, Y2) of point 27a at the upper left corner of rectangle 27, the width W2, the height H2, and the angle θ2 constitute the character string data of this character string. 23 is a character string consisting of a single character, and its character string data consists of the coordinates (X1, Y1) of point 23a at the upper left corner of character area 23, the width W1, the height H1, and the angle θ1. In this case, θ1 = 0.

[0004] (4) Steps (5) and (6) below are performed for every character in the character string.
(5) For each character, the character area data is consulted, the character area is cut out of the raster data, and the cut-out image is rotated by the angle of the character string (θ2 in FIG. 9) to bring it upright. The image is then resized to match the template size (normalization) to give the character image of that character.
(6) The character image is recognized as follows. The character image is compared with the templates in the dictionary, and the code of the template with the smallest distance is taken as the recognition result. If the distance is large, however, misrecognition is likely, so a maximum allowable degree of mismatch is set as the threshold for deciding whether recognition succeeded, and any result whose distance exceeds this value is treated as not recognized (rejected).
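As a rough illustration of step (6), the sketch below picks the nearest template and applies the reject threshold. The source does not say how the distance is computed; the Hamming distance between equally sized binary images is an assumption made here, and the names `hamming`, `recognize`, and `max_mismatch` are illustrative only.

```python
def hamming(a, b):
    """Count the pixels that differ between two equally sized binary images."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognize(image, templates, max_mismatch):
    """Return the code of the nearest template, or None when rejected.

    templates: dict mapping a character code to a binary template image.
    max_mismatch: maximum allowable degree of mismatch (the reject threshold).
    """
    best_code, best_dist = None, None
    for code, template in templates.items():
        dist = hamming(image, template)
        if best_dist is None or dist < best_dist:
            best_code, best_dist = code, dist
    # A distance above the threshold is treated as "not recognized" (reject).
    if best_dist is None or best_dist > max_mismatch:
        return None
    return best_code


templates = {
    "bar": [[0, 1], [0, 1]],
    "dot": [[1, 0], [0, 0]],
}
print(recognize([[0, 1], [0, 1]], templates, max_mismatch=1))
print(recognize([[1, 1], [1, 0]], templates, max_mismatch=1))
```

The second call is rejected because even the nearest template differs from the image in more pixels than the threshold allows.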

[0005]

[Problem to Be Solved by the Invention] When maps, wiring diagrams, illustrations, and the like are newly created, printed characters give a higher recognition rate than handwritten ones, so a drawing is sometimes prepared by pasting onto it pieces of paper on which the required characters or symbols have been printed in advance. When printed characters are cut and pasted by hand in this way, it is quite difficult to paste them without tilting them. Meanwhile, as described above, conventional drawing recognition for automatic drawing input determines the inclination of the characters in a multi-character string from how the characters line up. For a character string consisting of a single character, or for a symbol, which often stands alone, the angle cannot be determined, so the angle of a single-character string is fixed in advance.

[0006] However, as noted above, when characters are cut and pasted by hand it is difficult to paste a given character at exactly the prescribed angle, and some tilt is to be expected. Consequently, even when the character area is cut out and rotated upright in step (5) of the automatic input procedure above, the resulting character image may still be somewhat tilted and may fail to be recognized. The present invention was made in view of these problems in the prior art, and its object is to provide a character recognition method and device that can recognize even slightly tilted characters and symbols accurately and in a short time.

[0007]

[Means for Solving the Problems] In the character recognition method according to the present invention, an image of a character or symbol is compared in sequence with a plurality of templates, each representing a different posture of that character or symbol, and the character or symbol is recognized from the template that best matches the image. For this comparison, the templates are assigned priorities that become lower as the inclination angle of the character or symbol from the upright state becomes larger, and the comparison is carried out in order of priority.
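The priority rule can be sketched as follows, assuming each template carries a signed tilt angle in degrees (negative for left, positive for right). The angle values are hypothetical, since the source gives no numbers, and giving equal left and right tilts the same priority follows the embodiment rather than the wording of this paragraph.

```python
def assign_priorities(tilt_angles):
    """Map each tilt angle to a priority; 1 is highest (upright).

    Priority drops as the magnitude of the tilt grows. Left and right tilts
    of the same magnitude share one priority (an assumption for this sketch).
    """
    magnitudes = sorted({abs(angle) for angle in tilt_angles})
    rank = {magnitude: i + 1 for i, magnitude in enumerate(magnitudes)}
    return {angle: rank[abs(angle)] for angle in tilt_angles}


# Hypothetical tilt angles in degrees.
print(assign_priorities([0, -5, 5, -10, 10]))
```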

[0008] The character recognition device according to the present invention comprises: means for reading a character or symbol drawn on paper and forming a character image; a dictionary having a plurality of templates that respectively represent the upright state of the character or symbol on the paper and states tilted little by little from the upright state; and a character recognition unit that recognizes the character or symbol by comparing the character image in sequence with the plurality of templates in the dictionary. The character recognition unit comprises: means for giving the highest priority to the template representing the upright state of the character or symbol, giving progressively lower priority as the inclination angle increases, and comparing the character image with the plurality of templates in sequence according to this priority; and means for extracting the template that is closest to the character image and within the tolerance, recognizing the character or symbol of that template as the character or symbol on the paper, and rejecting the character when no template within the tolerance exists.

[0009]

[Operation] With the above configuration, the dictionary is made multi-template: a template of upright characters and templates of characters at various inclinations are registered in the dictionary. These templates are assigned priorities in descending order of their likelihood of matching the characters on the paper. Characters drawn on a sheet such as a drawing are read in to obtain a character image, the character image is compared with the templates one after another in order of priority, and the character of the best-matching template is recognized as the character on the paper. As a result, somewhat tilted characters can be recognized and misrecognition can be prevented. The recognition time can also be shortened.

[0010]

[Embodiments] Embodiments of the present invention are described in detail below with reference to the drawings. FIG. 1 is a block diagram of an embodiment of a character/figure recognition device according to the present invention. In FIG. 1, scanner 1 scans a sheet such as a drawing and converts the characters, symbols, figures, and so on into binary raster data. Scanner control unit 2 controls scanner 1 during this operation and stores the raster data that has been read in frame memory 9. Character/figure separation unit 3 measures the size of each connected black-pixel component in the raster data and treats components that fit a predetermined character size as character candidates. Everything else is treated as a figure candidate. For the character candidates, character area data as described above is created and stored in character/figure memory 8. Vector generation unit 4 performs raster-to-vector conversion (polyline approximation) on the figure candidates and stores the resulting vector data in character/figure memory 8.
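The size-based separation performed by character/figure separation unit 3 might look like the following sketch. It assumes 8-connectivity and a single size limit `max_char_size` applied to both width and height; the source says only that components fitting a predetermined character size become character candidates, so both choices are assumptions, as are the function names.

```python
from collections import deque


def connected_boxes(raster):
    """Return the bounding box (x0, y0, x1, y1) of each 8-connected black component."""
    h, w = len(raster), len(raster[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if raster[y][x] != 1 or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < w and 0 <= ny < h
                                and raster[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


def separate(raster, max_char_size):
    """Split component boxes into character candidates and figure candidates."""
    chars, figures = [], []
    for box in connected_boxes(raster):
        x0, y0, x1, y1 = box
        width, height = x1 - x0 + 1, y1 - y0 + 1
        if width <= max_char_size and height <= max_char_size:
            chars.append(box)      # upper left and lower right corners
        else:
            figures.append(box)
    return chars, figures


raster = [
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0, 0, 1],
    [0, 0, 1, 1, 1, 1, 1, 1],
]
print(separate(raster, max_char_size=3))
```

Each box stores the upper left and lower right corners, matching the form of the character area data described for the conventional procedure.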

[0011] Character string extraction unit 5 refers to the character area data stored in character/figure memory 8, creates character string data, and stores it in character/figure memory 8. Character area cutting unit 6 refers to the character area data and character string data stored in character/figure memory 8, cuts out a character area image from the raster data stored in frame memory 9, rotates it, and then normalizes it to produce a character image. Character recognition unit 7 recognizes the character by referring to the character image and to dictionary 11, which is described in detail later, and stores the recognition result in code memory 10. Communication control unit 12 outputs the vector data stored in character/figure memory 8 and the character data stored in code memory 10 to the CAD system.
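The character string data produced by character string extraction unit 5 (upper left corner (X, Y), width W, height H, and angle θ, as defined for the conventional procedure) could be computed along these lines. This is a sketch under assumptions: the boxes are taken to be already grouped into one string and ordered from first to last character, image coordinates have y increasing downward, and θ is reported in degrees. The function names and the dictionary keys are illustrative.

```python
import math


def center(box):
    """Centre point of a box given as (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return (x0 + x1) / 2, (y0 + y1) / 2


def string_data(boxes, single_char_angle=0.0):
    """Build character string data from the character areas of one string.

    boxes: character areas (x0, y0, x1, y1), ordered first to last.
    single_char_angle: predetermined angle used for one-character strings.
    """
    x = min(b[0] for b in boxes)
    y = min(b[1] for b in boxes)
    w = max(b[2] for b in boxes) - x
    h = max(b[3] for b in boxes) - y
    if len(boxes) == 1:
        theta = single_char_angle
    else:
        (fx, fy), (lx, ly) = center(boxes[0]), center(boxes[-1])
        # Angle between the scanning (x) direction and the first-to-last line.
        theta = math.degrees(math.atan2(ly - fy, lx - fx))
    return {"X": x, "Y": y, "W": w, "H": h, "theta": theta}


print(string_data([(0, 0, 10, 10), (10, 10, 20, 20)]))
print(string_data([(3, 4, 8, 9)]))
```

Because y grows downward here, a positive θ corresponds to a string that descends to the right on the page; a different sign convention would work equally well as long as the later rotation step uses the same one.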

[0012] Dictionary 11 is constructed as follows.
(1) Dictionary 11 is made multi-template. That is, for all characters or symbols (hereinafter, characters and symbols are collectively referred to as characters), images in the upright state and images at given inclinations are collected, each set forming a template group, and dictionary 11 is constructed as the set of these template groups. For characters that may tilt either to the left or to the right, separate templates are provided for the left-tilted and right-tilted characters.
(2) Each template group is assigned a priority. The template group created from upright images is given the highest priority, and the template group for the next most likely tilt angle is given the next highest priority. Priorities are assigned to all template groups in this way. In FIG. 1, dictionary 11 has n template groups TG1, TG2, ..., TGn, each consisting of templates for m characters. The m character templates of template group TG1, which has priority 1, are all templates of upright characters. Template group TG2, which has priority 2, holds templates of the m characters tilted slightly from upright. Thereafter, the tilt of the m characters increases as the priority of the template group decreases.

[0013] The character recognition operation of the character/figure recognition device of FIG. 1 is described below with reference to the flowchart of FIG. 2.
(1) The drawing is read by scanner 1 and the raster data is stored in frame memory 9 (S1).
(2) Character area data is created from the raster data and stored in character/figure memory 8 (S2).
(3) Character string extraction unit 5 creates character string data and stores it in character/figure memory 8 (S3).
(4) The following processing is performed for every character in the character string.
(5) Character area cutting unit 6 refers to the character area data stored in character/figure memory 8, cuts out the character area from the raster data stored in frame memory 9 (S4), rotates it by the angle θ of the character string to bring it upright (S5), and normalizes the image (S6) to obtain the character image (S7).
Steps (1) to (5) above are the same as steps (1) to (5) of the prior art described earlier.
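Steps S4 to S6 (cut out, rotate, normalize) can be sketched in plain Python as below. Nearest-neighbour sampling, rotation about the image centre within the original frame, and a square output of `size` by `size` pixels are all assumptions; the source does not describe the interpolation or the template dimensions.

```python
import math


def crop(raster, box):
    """Cut the area (x0, y0, x1, y1), corners inclusive, out of the raster (S4)."""
    x0, y0, x1, y1 = box
    return [row[x0:x1 + 1] for row in raster[y0:y1 + 1]]


def rotate(image, degrees):
    """Rotate about the image centre by inverse nearest-neighbour mapping (S5).

    With y increasing downward, a positive angle turns the content clockwise
    on the page. Pixels that fall outside the frame are dropped.
    """
    h, w = len(image), len(image[0])
    cx, cy = (w - 1) / 2, (h - 1) / 2
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = x - cx, y - cy
            # Look up where this output pixel came from in the source image.
            sx = round(cx + c * dx + s * dy)
            sy = round(cy - s * dx + c * dy)
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = image[sy][sx]
    return out


def normalize(image, size):
    """Resample to size x size pixels so the image matches the templates (S6)."""
    h, w = len(image), len(image[0])
    return [[image[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]


raster = [[0, 0, 0], [0, 1, 1], [0, 1, 0]]
area = crop(raster, (1, 1, 2, 2))
print(normalize(rotate(area, 0), 4))
```

To bring a string measured at angle θ upright, the image would be rotated by the opposite angle under the same sign convention.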

[0014] Next, character recognition unit 7 recognizes the character image through steps (6) to (9) below.
(6) The priority is set to the highest value (S8).
(7) The character image is compared with all templates having that priority (S9). If the distance to every template is greater than the maximum allowable degree of mismatch (S10, S11), the character is deemed not recognized at that priority. Otherwise it is deemed recognized, the code Tmin of the template with the smallest distance is taken as the recognition result (S12), and processing ends.
(8) Until the character image is recognized, the priority is lowered by one rank and step (7) is repeated (S14, S15).
(9) If no recognition result is obtained after comparison at every priority, the character is rejected (S14, S16).
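Steps (6) to (9) amount to a loop over template groups in priority order with an early exit. The sketch below follows that flow; the Hamming distance, the tiny 3 x 3 images, and all names are assumptions made for illustration, not details taken from the source.

```python
def hamming(a, b):
    """Count the pixels that differ between two equally sized binary images."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognize_by_priority(image, groups, max_mismatch):
    """Search template groups from highest to lowest priority.

    groups: list of template groups, index 0 being priority 1; each group is
            a list of (code, template_image) pairs.
    Returns (code, priority) for the first group that contains a template
    within max_mismatch, or (None, None) when every group fails (reject).
    """
    for priority, group in enumerate(groups, start=1):      # S8, S15
        best_code, best_dist = None, None
        for code, template in group:                         # S9
            dist = hamming(image, template)
            if best_dist is None or dist < best_dist:
                best_code, best_dist = code, dist
        if best_dist is not None and best_dist <= max_mismatch:
            return best_code, priority                       # S12
    return None, None                                        # S16


upright_bar = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
tilted_bar = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]
groups = [[("I", upright_bar)], [("I", tilted_bar)]]

print(recognize_by_priority(upright_bar, groups, max_mismatch=1))
print(recognize_by_priority(tilted_bar, groups, max_mismatch=1))
```

Note that the search stops at the first priority with an acceptable match, so a lower-priority group is never consulted once a higher-priority template is within the threshold.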

[0015] A specific example is described below. Suppose that drawing 14 shown in FIG. 3 is to be recognized. Drawing 14 contains symbols 15, 16, and 17 and several straight lines. Symbols 15, 16, and 17 are the recognition targets. Symbol 15 is a quadrangle drawn exactly level, symbol 16 is a circle tilted slightly to the right, and symbol 17 is a triangle tilted slightly to the left. Here, the angle of a single-character string is taken to be 0 degrees. Dictionary 11 consists of the three template groups TG1, TG2, and TG3 shown in FIGS. 4, 5, and 6. Template groups TG1, TG2, and TG3 are given priorities 1, 2, and 3, respectively, with priority 1 the highest, followed by 2 and then 3. Template group TG1, with priority 1, consists of templates A1, B1, C1, D1, ..., X1 of upright symbols.

[0016] Template group TG2, with priority 2, comprises template group TG21, consisting of templates A21, B21, C21, D21, ..., X21 that contain symbols tilted slightly to the left, and template group TG22, consisting of templates A22, B22, C22, D22, ..., X22 that contain symbols tilted to the right by about the same amount as TG21. Template group TG3, with priority 3, comprises template group TG31, consisting of templates A31, B31, C31, D31, ..., X31 that contain symbols tilted further to the left, and template group TG32, consisting of templates A32, B32, C32, D32, ..., X32 that contain symbols tilted to the right by about the same amount as TG31.

[0017] Symbols 15, 16, and 17 are recognized as follows.
(1) Recognition of symbol 15. The character area of symbol 15 is cut out and normalized to create a character image, which looks like 18 in FIG. 7. Character image 18 is compared with template group TG1 of priority 1. Within TG1, the distance to template A1 is the smallest and does not exceed the maximum allowable degree of mismatch, so the recognition result is a quadrangle and recognition ends.
(2) Recognition of symbol 16. The character area of symbol 16 is cut out and normalized to create a character image, which looks like 19 in FIG. 7. Character image 19 is compared with template group TG1 of priority 1. Within TG1, the distance to template D1 is the smallest, but this minimum distance exceeds the maximum allowable degree of mismatch, so the symbol is not recognized at priority 1. The conventional technique stops here, so it cannot recognize symbol 16. Next, character image 19 is compared with template group TG2 of priority 2. Within TG2, the distance to template D22 is the smallest and does not exceed the maximum allowable degree of mismatch, so the recognition result is a circle and recognition ends.

[0018] (3) Recognition of symbol 17. The character area of symbol 17 is cut out and normalized to create a character image, which looks like 20 in FIG. 7. Character image 20 is compared with template group TG1 of priority 1. Within TG1, the distance to template B1 is the smallest, but it exceeds the maximum allowable degree of mismatch, so the symbol is not recognized at priority 1. The conventional technique stops here, so it cannot recognize symbol 17. Next, character image 20 is compared with template group TG2 of priority 2. Within TG2, the distance to template B21 is the smallest, but it exceeds the maximum allowable degree of mismatch, so the symbol is not recognized at priority 2 either. Character image 20 is then compared with template group TG3 of priority 3. Within TG3, the distance to template B31 is the smallest and does not exceed the maximum allowable degree of mismatch, so the recognition result is a triangle and recognition ends.
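The three cases differ in how many templates are compared before a match is accepted, which is where the time saving of the priority order comes from. The short calculation below counts comparisons, assuming each of TG1, TG21, TG22, TG31, and TG32 holds m templates. The value m = 24 is purely hypothetical, since the source does not state how many symbols the dictionary covers, and the function name is illustrative.

```python
def comparisons_until_match(group_sizes, matched_priority):
    """Number of template comparisons made when the match is accepted at
    matched_priority (1-based); None means every group was searched."""
    if matched_priority is None:
        return sum(group_sizes)
    return sum(group_sizes[:matched_priority])


m = 24  # hypothetical number of symbols per template set; not given in the source
group_sizes = [m, 2 * m, 2 * m]  # TG1, TG2 (TG21 + TG22), TG3 (TG31 + TG32)

# Symbol 15 matched at priority 1, symbol 16 at priority 2, symbol 17 at priority 3.
for symbol, priority in ((15, 1), (16, 2), (17, 3)):
    print(symbol, comparisons_until_match(group_sizes, priority))
```

Under these assumptions an upright symbol needs m comparisons, while a search over every template regardless of priority would need 5m for each symbol.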

[0019]

[Effects of the Invention] According to the character recognition method of the present invention, the dictionary is made multi-template, each template is assigned a priority, and a character is recognized by comparing the character image with the templates in order of priority. As a result, even tilted characters, such as those pasted onto a drawing by hand, can be recognized accurately and in a short time. The invention therefore contributes greatly to improving the recognition capability of character/figure recognition devices.

[Brief Description of the Drawings]

FIG. 1 is a diagram showing the configuration of the present invention.

FIG. 2 is a flowchart showing the operation of the system of FIG. 1.

FIG. 3 is a drawing to be recognized in an embodiment of the present invention.

FIG. 4 is a diagram showing the template group with priority 1.

FIG. 5 is a diagram showing the template group with priority 2.

FIG. 6 is a diagram showing the template group with priority 3.

FIG. 7 is a diagram showing the character images cut out from the drawing of FIG. 3.

FIG. 8 is a diagram for explaining character areas.

FIG. 9 is a diagram for explaining character string data.

[Explanation of symbols]

1: Scanner
2: Scanner control unit
3: Character/figure separation unit
4: Vector generation unit
5: Character string extraction unit
6: Character area cutting unit
7: Character recognition unit
8: Character/figure memory
9: Frame memory
10: Code memory
11: Dictionary
12: Communication control unit
14: Drawing
15, 16, 17: Symbols
18, 19, 20: Character images
21, 22: Graphic areas
23, 24, 25, 26: Character areas
27: Character string
TG1 to TGn, TG21, TG22, TG31, TG32: Template groups
A1 to X1, A21 to X21, A22 to X22, A31 to X31, A32 to X32: Templates
θ1, θ2, W1, W2, H1, H2: Character string data
S1 to S15: Steps

Claims

[Claims]

[Claim 1] A character recognition method characterized in that a plurality of templates, each representing a character or symbol at a different inclination angle, are prepared; the templates are assigned priorities that become lower as the inclination angle becomes larger; an image of a character or symbol is compared in sequence with the templates in order of priority; and the character or symbol is recognized from the template that best matches the image.

[Claim 2] A character recognition device characterized by comprising: means for reading a character or symbol drawn on paper and forming a character image; a dictionary having a plurality of templates that respectively represent the upright state of the character or symbol on the paper and states tilted little by little from the upright state; and a character recognition unit that gives the highest priority to the template representing the upright state of the character or symbol and progressively lower priority as the inclination angle increases, compares the character image with the plurality of templates in sequence according to this priority, extracts the template that is closest to the character image and within the tolerance, recognizes the character or symbol of that template as the character or symbol on the paper, and rejects the character when no template within the tolerance exists.