JPS6160188A

JPS6160188A - Character recognizer

Info

Publication number: JPS6160188A
Application number: JP59181990A
Authority: JP
Inventors: Michiaki Nakanishi; 道明中西; Masahiro Okawa; 大川　正廣
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1984-08-31
Filing date: 1984-08-31
Publication date: 1986-03-27
Anticipated expiration: 2009-10-19
Also published as: JPH0682400B2

Abstract

PURPOSE:To shorten the recognizing time of each character and also to improve the recognizing accuracy by sorting the types of characters in response to plural different recognizing means and using a recognizing means accordant with the corresponding sorted type of character. CONSTITUTION:The characters read by a reading part 2 are sent to a primary feature extracting circuit 15 in case the printing types are sorted by a character type reference memory 12. When the types of characters are obtained from an input pattern, the character data is normalized by a normalizing circuit 13 and sent to a character type deciding circuit 14. Then a decision parameter is calculated and sent to the circuit 15. For a printing type pattern and the numerical items, the range is limited for a sorting dictionary memory of a primary collation circuit 16 and the candidates are selected and sent to a primary deciding circuit 17. For other types of characters, the features of the corresponding characters are read out of the dictionary memory of the circuit 16 according to the decision parameter. Then the types of characters are decided and the candidates are selected and sent to the circuit 17.

Description

[Detailed Description of the Invention]

[Field of Industrial Application]

The present invention relates to a character recognition device (hereinafter referred to as OCR) that reads and recognizes characters, including handwritten kanji, written on forms and the like, and in particular to a character recognition device that is capable of high-speed processing and reduces misrecognition.

In recent years OCR has advanced remarkably. OCR devices that read printed type and handwritten characters limited to alphanumeric and kana characters are in wide practical use for form processing and similar work. Recognition techniques for Japanese text that also includes kanji and hiragana are being actively developed, and various methods are being tried.

For the recognition of such handwritten characters, including kanji and hiragana, a method is desired that is fast and produces few recognition errors.

[Conventional technology]

FIG. 3 is a block diagram of a Japanese-character OCR that targets handwritten characters including kanji.

In the figure, form 1 is a slip on which a customer's address, name, product name, or the like is written in each field.

Reading unit 2 passes the light reflected from form 1 through lens system 2a, scans it with image sensor 2b to read one frame of characters, and sends the result as image data to binarization circuit 3.

Main control unit 4 controls each unit and executes the character reading and recognition processing program.

Image memory 5 stores the binarized image data, that is, the image data of the characters that have been read.

Single-character cutting circuit 6 cuts out one character from the frame of characters stored in image memory 5, on the basis of the format information sent from format information memory 9, and sends it to recognition circuit 10.

Feature extraction circuit 7 extracts the features of the character sent from recognition circuit 10, that is, the number of strokes, the curve coefficient, and so on, and returns them to recognition circuit 10.

Dictionary memory 8 stores the features of the characters that serve as the reference for recognition, that is, the features of kanji, hiragana, katakana, alphabetic characters, numerals, symbols, and so on, and sends them sequentially to recognition circuit 10 at its request.

Format information memory 9 stores information indicating where characters are written on form 1, and sends the entry positions of the read characters to image memory 5, single-character cutting circuit 6, and recognition circuit 10.

Recognition circuit 10 receives from feature extraction circuit 7 the features of the character sent from single-character cutting circuit 6, compares them with the character features sent sequentially from dictionary memory 8 to obtain a degree of matching, and outputs the character codes as a candidate string in descending order of matching degree.

Post-processing unit 11 sifts the input candidate string further according to a predetermined degree of matching and outputs the result.

With this configuration and these functions, character recognition proceeds as follows. First, the characters on form 1 are read, and the binarized image data is stored in image memory 5.

Next, the image data is sent to single-character cutting circuit 6, where one character is cut out on the basis of the character position information sent from format information memory 9 and sent to recognition circuit 10.

Recognition circuit 10 sends the input character data to feature extraction circuit 7 and receives the extracted features in return. It then reads character features sequentially from dictionary memory 8, compares them with the features of the character data, and takes the characters with a high degree of matching as candidate characters for the recognition result.

If there is more than one candidate character, their character codes are output in order as a candidate string.

The output candidate string of character codes is sifted by post-processing unit 11 and then output and stored in a storage means such as a floppy disk.

In this way, character recognition is performed sequentially on the image data stored in image memory 5.
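The conventional flow described above (extract features, compare with every dictionary entry, rank candidates, sift in post-processing) can be sketched as follows. This is a minimal illustration only: the two-element feature vectors, the `match_score` formula, and the threshold are assumptions, not values taken from the patent.

```python
from typing import Dict, List, Tuple

Features = Tuple[float, ...]

def match_score(a: Features, b: Features) -> float:
    """Hypothetical degree of matching: 1 / (1 + squared distance)."""
    dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1.0 / (1.0 + dist)

def recognize(features: Features, dictionary: Dict[str, Features],
              top_k: int = 3) -> List[Tuple[str, float]]:
    """Compare against every dictionary entry and rank by matching degree."""
    scored = [(code, match_score(features, ref)) for code, ref in dictionary.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

def post_process(candidates: List[Tuple[str, float]],
                 min_score: float) -> List[Tuple[str, float]]:
    """Sift the candidate string with a predetermined matching degree."""
    return [(code, score) for code, score in candidates if score >= min_score]

# Toy dictionary: (stroke count, curve coefficient) per character code (made-up values).
dictionary = {"A": (3.0, 0.1), "B": (3.0, 0.8), "C": (1.0, 0.9)}
candidates = recognize((3.0, 0.2), dictionary)
kept = post_process(candidates, min_score=0.5)
```

Because every input is compared with the whole dictionary, the cost grows with the number of character types, which is the inefficiency the invention addresses.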

Besides the feature-extraction method in the above example, there is a method devised for printed type that is close to pattern matching using correlation.
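As a rough illustration of correlation-based pattern matching, the sketch below computes a normalized correlation between two binary patterns of equal size. The patent does not give a formula, so this particular measure and the 3 x 3 toy patterns are assumptions.

```python
from math import sqrt
from typing import List

Pattern = List[List[int]]  # rows of 0 (white) / 1 (black)

def correlation(a: Pattern, b: Pattern) -> float:
    """Normalized correlation of two equally sized binary patterns."""
    flat_a = [p for row in a for p in row]
    flat_b = [p for row in b for p in row]
    dot = sum(x * y for x, y in zip(flat_a, flat_b))
    norm = sqrt(sum(x * x for x in flat_a) * sum(y * y for y in flat_b))
    return dot / norm if norm else 0.0

template = [[1, 0, 0],
            [1, 0, 0],
            [1, 1, 1]]
noisy = [[1, 0, 0],
         [1, 0, 0],
         [1, 1, 0]]   # one black pixel missing
other = [[0, 1, 1],
         [0, 1, 1],
         [0, 0, 0]]   # no overlap with the template

same_score = correlation(template, template)
noisy_score = correlation(template, noisy)
other_score = correlation(template, other)
```

A single missing pixel lowers the score only slightly, which matches the text's point that correlation tolerates small variations but separates near-identical categories poorly.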

There are also methods that combine the two and use the recognition results of both, and methods that choose between the two by some simple means.

[Problem that the invention seeks to solve]

The conventional methods described above have the following problems.

(1) The feature-extraction method can express even small differences as features and is suited to distinguishing categories that differ only slightly. For kanji with many strokes, however, too many features are extracted, and identification is strongly affected by variations of a stroke or two, by stains on the paper, and by variations caused by the writing instrument.

(2) The correlation method is robust against small variations in a character, but is unsuited to distinguishing categories that differ only slightly.

(3) Methods that combine (1) and (2), or that select between them, are not effective either, because the amount of processing increases and much of the work straddles both methods.

Consequently, all of the above methods are prone to misrecognizing similar characters.

The characters used in Japanese are diverse: kanji, hiragana, katakana, the alphabet, Arabic numerals, symbols, and so on. Reading all of them with a single recognition algorithm tends to give reading accuracy that varies with the character type and to make the processing itself wasteful.

[Means for solving the problem]

The present invention is a character recognition device comprising a plurality of recognition means that recognize character image data by different methods, and a classification means that classifies the character type of the image data, in correspondence with the recognition means, according to the complexity and curve coefficient of the character image data. On the basis of that classification, the device selects the recognition means best suited to the classified character type and recognizes the image data with it. The above problems can be solved in this way.

The character types in the format definition referred to here include kanji, kana, numerals, the alphabet, and symbols, as well as any other object that can be coded, such as circuit symbols. The classification character types used to select a recognition means fall broadly into four groups. They partly overlap with the character types of the format definition, but a format-definition type such as "kanji" or "Japanese" contains several classification character types. The four broad groups are:

- rounded characters, centered on hiragana;
- katakana, few-stroke kanji such as 大, 小, 五, and 文, and symbols;
- many-stroke kanji;
- printed characters.

The first three groups consist mainly of large handwritten characters. Format-definition character types such as "hiragana" and "katakana" map uniquely onto a classification character type, but when Japanese text in general is the target, determining the classification character type is essential.

The device can also handle documents that contain both text and parts such as drawings and tables, where all of them must be recognized.

[Effect]

According to the present invention, the classification means uses, for example, the complexity of the character image data (that is, the number of strokes), the curve coefficient, and the like as parameters for character type determination, and classifies character types so that they correspond to a plurality of different recognition means. Each character is then recognized by the recognition means suited to its classified type. This shortens the recognition time for each character, allows high-speed processing, and raises recognition accuracy so that misrecognition is reduced.

[Example]

An embodiment of the present invention is described below with reference to FIGS. 1 to 3. FIG. 1 is a block diagram showing the embodiment, and FIG. 2 is the flowchart for FIG. 1. The same reference numerals denote the same objects throughout the figures.

In FIG. 1, main control unit 4a controls each unit and executes the processing program for reading characters, classifying character types in correspondence with the recognition means, and recognizing characters with the corresponding recognition means.

Single-character cutting circuit 6a cuts out one character from image memory 5 on the basis of field and line information when format information is stored in format information memory 9. When no format information is stored, or only part of it is stored, the circuit refers to the character areas, lines, and fields that reading unit 2 read together with the characters, processes them with character area extraction circuit 9a, line extraction circuit 9b, and field extraction circuit 9c, and then cuts out one character.

Character type reference memory 12 holds the character type parameters for cases in which the character type is specified in advance in format information memory 9.

Normalization circuit 13 normalizes the character pattern sent from single-character cutting circuit 6a to a size of n × m (for example, 48 × 48 dots).
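A minimal sketch of size normalization is shown below. The patent only states that the pattern is normalized to n × m dots, so cropping to the bounding box of the black pixels and nearest-neighbour resampling are assumptions chosen for illustration.

```python
from typing import List

Pattern = List[List[int]]  # rows of 0 (white) / 1 (black)

def normalize(pattern: Pattern, n: int = 48, m: int = 48) -> Pattern:
    """Crop to the black-pixel bounding box and resample to n rows x m columns."""
    rows = [r for r, row in enumerate(pattern) if any(row)]
    cols = [c for c in range(len(pattern[0])) if any(row[c] for row in pattern)]
    if not rows:                       # blank input: return a blank n x m pattern
        return [[0] * m for _ in range(n)]
    top, left = rows[0], cols[0]
    height = rows[-1] - top + 1
    width = cols[-1] - left + 1
    # Nearest-neighbour sampling: output cell (i, j) reads one source pixel.
    return [[pattern[top + (i * height) // n][left + (j * width) // m]
             for j in range(m)]
            for i in range(n)]

small = [[0, 0, 0, 0],
         [0, 1, 1, 0],
         [0, 1, 0, 0],
         [0, 0, 0, 0]]
norm_4x4 = normalize(small, 4, 4)
norm_default = normalize(small)
```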

Character type determination circuit 14 counts the number of black dots in the n × m character pattern normalized by normalization circuit 13, or its curve coefficient, calculates the determination parameter, and sends it to primary feature extraction circuit 15.
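The determination step can be pictured with the sketch below, which counts black dots and maps the result to one of the handwritten classification groups. The patent does not define how the curve coefficient is computed or which thresholds are used, so `curve_coefficient` is passed in by the caller and both thresholds are hypothetical.

```python
from typing import List

Pattern = List[List[int]]  # rows of 0 (white) / 1 (black)

def black_count(pattern: Pattern) -> int:
    """Number of black dots in the normalized pattern."""
    return sum(sum(row) for row in pattern)

def classify(pattern: Pattern, curve_coefficient: float,
             dense_ratio: float = 0.30, round_threshold: float = 0.5) -> str:
    """Map the determination parameters to a classification character type.

    dense_ratio and round_threshold are illustrative values only.
    """
    ratio = black_count(pattern) / (len(pattern) * len(pattern[0]))
    if ratio >= dense_ratio:
        return "many-stroke"       # dense pattern, e.g. many-stroke kanji
    if curve_coefficient >= round_threshold:
        return "rounded"           # e.g. hiragana
    return "few-stroke"            # e.g. katakana, few-stroke kanji, symbols

dense = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0]]    # 8 of 16 black
sparse = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1]]   # 2 of 16 black

dense_type = classify(dense, curve_coefficient=0.1)
round_type = classify(sparse, curve_coefficient=0.8)
plain_type = classify(sparse, curve_coefficient=0.1)
```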

Primary feature extraction circuit 15 selects the character type on the basis of the determination parameter sent from character type determination circuit 14, has primary matching circuit 16 read out and match the features of the corresponding characters, selects candidate characters, and sends the candidate string to primary determination circuit 17. That is, candidates are selected by extracting features suited to the classification character type: rounded characters, angular characters, many-stroke characters, or printed characters. For printed type, the input pattern (or the normalized pattern) is corrected for matching, and candidates are selected by comparison with the standard patterns for classification.

Primary matching circuit 16 contains a classification dictionary memory (not shown), a storage means that holds the classification features of characters.

Primary determination circuit 17 examines the content of the candidate string sent from primary feature extraction circuit 15, the significance of the differences in similarity between candidates, and so on. It sorts the results into those that can be clearly determined, those of identical shape, and those that require secondary determination, and outputs each accordingly.
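The three-way sorting performed by the primary determination can be sketched as below. The similarity margin and the table of identically shaped pairs are assumptions for illustration; the patent says only that the candidate content and the significance of similarity differences are examined.

```python
from typing import List, Tuple

Candidate = Tuple[str, float]  # (character code, similarity)

# Hypothetical table of kanji/katakana pairs that cannot be told apart by shape.
SAME_SHAPE = [{"エ", "工"}, {"カ", "力"}, {"ロ", "口"}]

def primary_decision(candidates: List[Candidate],
                     margin: float = 0.15) -> Tuple[str, List[str]]:
    """Sort a candidate string (best first) into answer / same_shape / secondary."""
    best = candidates[0]
    if len(candidates) == 1:
        return ("answer", [best[0]])
    second = candidates[1]
    if {best[0], second[0]} in SAME_SHAPE:
        # Identical shapes: flag the pair and leave it to post-processing.
        return ("same_shape", [best[0], second[0]])
    if best[1] - second[1] >= margin:
        return ("answer", [best[0]])
    return ("secondary", [code for code, _ in candidates])

clear = primary_decision([("あ", 0.95), ("お", 0.60)])
flagged = primary_decision([("エ", 0.90), ("工", 0.88)])
unclear = primary_decision([("田", 0.80), ("由", 0.75)])
```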

In other words, the area enclosed by the one-dot chain line in the figure provides the primary recognition function.

Secondary feature extraction circuit 18 reads out the features of the characters corresponding to the candidate string sent from primary determination circuit 17 and sends them to secondary matching circuit 19.

Secondary matching circuit 19 has a dictionary memory. It reads the features of the corresponding characters from that memory and matches them against the features of the candidate string sent from secondary feature extraction circuit 18.

Secondary determination circuit 20 examines the content of the candidate string sent from secondary matching circuit 19, the significance of the differences in similarity between candidates, and so on. It outputs those that can be clearly determined as the answer, outputs similar ones as a candidate string, and classifies those that are hard to determine and outputs them as reject candidates.
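A sketch of the secondary determination follows, with three outcomes: a single answer, a short candidate string, or a reject. The minimum score and the margin are hypothetical thresholds; the patent does not specify them.

```python
from typing import List, Tuple

Candidate = Tuple[str, float]  # (character code, similarity), best first

def secondary_decision(candidates: List[Candidate],
                       min_score: float = 0.5,
                       margin: float = 0.15) -> Tuple[str, List[str]]:
    """Return ("answer" | "candidates" | "reject", character codes)."""
    codes = [code for code, _ in candidates]
    if not candidates or candidates[0][1] < min_score:
        # Every match is weak: output the whole string as reject candidates.
        return ("reject", codes)
    if len(candidates) == 1 or candidates[0][1] - candidates[1][1] >= margin:
        return ("answer", codes[:1])
    best = candidates[0][1]
    close = [code for code, score in candidates if best - score < margin]
    return ("candidates", close)

answer = secondary_decision([("田", 0.90), ("由", 0.60)])
similar = secondary_decision([("田", 0.80), ("由", 0.75), ("甲", 0.40)])
rejected = secondary_decision([("田", 0.30), ("由", 0.25)])
```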

In other words, the area enclosed by the two-dot chain line in the figure provides the secondary recognition function.

With this configuration and these functions, the recognition process is described below with reference to the flowchart in FIG. 2.

(1) First, once the characters read by reading unit 2 have been stored in image memory 5, single-character cutting circuit 6a cuts out one character.

(2) Next, consider the case in which character type reference memory 12 holds a printed-type classification. That is, when the character type can be specified in advance, for example when a printed document is read, the character type information is stored beforehand in format information memory 9, and the cut-out character data is sent to primary feature extraction circuit 15 without normalization or character type determination. Numeric items such as telephone numbers and dates of birth are likewise sent directly to primary feature extraction circuit 15.

(3) When format information memory 9 holds no character type information and the character type must be obtained from the input pattern, as with ordinary documents that mix handwritten and printed characters, the cut-out character data is normalized by normalization circuit 13 and sent to character type determination circuit 14.

(4) Character type determination circuit 14 calculates the determination parameter for the character data and sends it to primary feature extraction circuit 15.

(5) For printed-type patterns and numeric items, the range of the classification dictionary memory of primary matching circuit 16 is limited, the input is matched against the printed-type patterns or handwritten numeral patterns read from it, candidates are selected, and the candidate string is sent to primary determination circuit 17. For other character types, primary matching circuit 16 reads the features of the corresponding characters from the classification dictionary memory on the basis of the determination parameter, the character type is determined, candidates are selected, and the candidate string is sent to primary determination circuit 17.

(6) Primary determination circuit 17 judges the candidates from their content and from the similarity between them. Those that can be clearly determined are output as the answer. Those of identical shape across kanji and katakana, for example イ, 工, 力, and 口, are given a post-processing flag and sent as candidates to the post-processing unit (not shown). Those that require secondary determination are classified and passed on.

(7) Secondary feature extraction circuit 18 then extracts their features, which are matched against the features in the dictionary memory of secondary matching circuit 19, and the candidate string is sent to secondary determination circuit 20.

(8) Secondary determination circuit 20 outputs those that can be clearly determined as the answer, and sends those for which a few candidates remain to the post-processing unit (not shown) as a candidate string. Those for which every match is weak and the candidates cannot be narrowed down are output as a reject candidate string.
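The routing in the flowchart steps, where characters with a preset character type skip normalization and character type determination and search only a limited range of the classification dictionary, can be sketched as follows. The stage names, the type labels, and the toy dictionary are all hypothetical.

```python
from typing import List, Optional

# Hypothetical classification dictionary: character code -> classification type.
DICTIONARY_TYPES = {
    "1": "numeral", "7": "numeral",
    "あ": "rounded", "エ": "few-stroke", "田": "many-stroke",
}

def plan_stages(preset_type: Optional[str]) -> List[str]:
    """List the processing stages one cut-out character passes through."""
    stages = ["cut_out_character"]
    if preset_type is None:
        # No character type in the format information: derive it from the pattern.
        stages += ["normalize", "determine_character_type"]
    stages += ["primary_feature_extraction", "primary_matching", "primary_decision"]
    return stages

def dictionary_range(char_type: str) -> List[str]:
    """Limit the dictionary to entries of one classification type."""
    return sorted(code for code, t in DICTIONARY_TYPES.items() if t == char_type)

numeric_plan = plan_stages("numeral")
mixed_plan = plan_stages(None)
numeric_entries = dictionary_range("numeral")
```

With prior information the plan is two stages shorter and only two of the five dictionary entries are compared, which is where the text's claimed speed-up comes from.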

In this way, candidate selection is simplified both when the character type can be specified in advance and when it is obtained from the input pattern. When prior information is available the simplification is substantial, and high-speed processing can be achieved. Performing secondary determination according to the input character gives highly accurate recognition with few errors. In addition, characters that can be told apart by shape are identified reliably, and similar character pairs that could not be told apart are output with a flag, so that post-processing can be carried out effectively.

The recognition method described above can be applied to any object that can be coded. For example, when a document contains a circuit diagram, the circuit symbols can be recognized by assigning codes to them.

[Effect of the invention]

As described above, the present invention has the following effects:

(1) Recognition processing can be made markedly faster even when many character types are mixed.

(2) Recognition accuracy is raised and misrecognition is reduced.

(3) The method can be applied to any object that can be coded.

[Brief explanation of the drawing]

FIG. 1 is a block diagram showing an embodiment of the present invention, FIG. 2 is the flowchart for FIG. 1, and FIG. 3 is a block diagram showing the conventional method.

In the figures, 4 and 4a are main control units, 5 is an image memory, 6 and 6a are single-character cutting circuits, 7 and 15 are feature extraction circuits, 8 is a dictionary memory, 9 is a format information memory, 10 is a recognition circuit, 11 is a post-processing unit, 12 is a character type reference memory, 13 is a normalization circuit, 14 is a character type determination circuit, 16 is a primary matching circuit, 17 is a primary determination circuit, 18 is a secondary feature extraction circuit, 19 is a secondary matching circuit, and 20 is a secondary determination circuit.

Claims

[Claims]

(1) A character recognition device that irradiates light onto characters written on a medium and recognizes image data of the characters obtained from the reflected light, characterized by comprising: a plurality of recognition means that recognize the image data of the characters by different methods; and a classification means that classifies the character types of the image data in correspondence with the recognition means, wherein, on the basis of the classification by the classification means, the image data is recognized by the recognition means corresponding to its character type.

(2) The character recognition device according to claim 1, wherein the classification means classifies the character types of the image data based on preset character type information.

(3) The character recognition device according to claim 1, wherein the classification means extracts characteristics of characters in the image data to classify character types.