JPS5847066B2 - character recognition device - Google Patents

character recognition device

Info

Publication number
JPS5847066B2
Authority
JP
Japan
Prior art keywords
character
characters
read
dictionary
answer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
JP54054907A
Other languages
Japanese (ja)
Other versions
JPS55157077A (en)
Inventor
正広 大川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd
Priority to JP54054907A
Publication of JPS55157077A
Publication of JPS5847066B2
Expired

Landscapes

  • Character Discrimination (AREA)

Description

[Detailed Description of the Invention] The present invention relates to a character recognition device that compares the feature bit pattern of a handwritten character with the contents of a dictionary and outputs a plurality of answer candidates, and that separately creates a read-target character table and compares the candidates against it to obtain the final answer.

Conventionally, in this type of character recognition device, as shown for example in FIG. 1, handwritten characters written on a form 1 are read by the scanner of the reading mechanism 2 of an optical character reader (OCR) device 10 and stored in a memory 3 as video data.

The stored video data is sent to the feature extraction unit 4 and analyzed immediately. The features of the character are extracted, and a bit pattern of the read character is created and stored in a register 5.

Meanwhile, a memory in the OCR device 10 holds the features of every glyph shape of a large number of handwritten numerals, kana, alphabetic characters, and various symbols, extracted in advance and stored as bit patterns.

This stored group of bit patterns is called the dictionary 6.

As soon as the bit pattern of a handwritten character read from the form 1 has been created, it is compared one by one with the bit patterns of the characters in the dictionary 6.

If the bit pattern of the read character matches the bit pattern of some character in the dictionary, that character in the dictionary 6 is output as the reading result.

If it matches the bit pattern of no character in the dictionary 6, the character is output as a reject character.
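The conventional flow described above can be sketched roughly as follows. This is a minimal illustration, not the circuit of FIG. 1: the function name `recognize_conventional`, the `REJECT` marker, the toy 12-bit patterns, and the use of exact equality as the match test are all assumptions made for the example.

```python
# Hypothetical sketch of the conventional scheme: one dictionary, one answer.
REJECT = "?"  # assumed marker for a reject character

def recognize_conventional(pattern, dictionary):
    """Compare `pattern` with each dictionary entry in turn.

    `dictionary` maps a character to the set of bit patterns stored for it.
    The first character with an identical stored pattern is returned;
    if nothing matches, the reject marker is returned.
    """
    for char, stored_patterns in dictionary.items():
        if pattern in stored_patterns:
            return char
    return REJECT

# Toy dictionary with made-up 12-bit feature patterns (illustrative only).
toy_dictionary = {
    "7": {0b111000100100},
    "8": {0b111111111111},
}

print(recognize_conventional(0b111000100100, toy_dictionary))  # 7
print(recognize_conventional(0b000000000001, toy_dictionary))  # ?
```

Because only the shape is consulted and only one answer is produced, nothing in this flow can use knowledge of which character types are expected on the form.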

Many current character recognition devices handle both the case where numerals, kana, alphabetic characters, special symbols, and so on each appear alone and the case where several of them are mixed. In the latter case, the variety and the number of combinations of similar characters increase.

For example, the numeral "7" and the kana "す" (su) and "ワ" (wa), or the numeral "8" and the letter "B", often cannot be told apart when written by hand.

In the conventional feature extraction unit, however, the decision is made solely by comparison with the features held in the dictionary, and a single answer is output.

In this case, even when the character to be read is known to be a numeral, a single dictionary decides from the glyph shape alone, so a similar kana character may be output instead.

To prevent such errors, several kinds of dictionaries can be prepared according to what is to be read, and the dictionary for the read target selected in advance, so that the correct character is recognized.

However, a dictionary is a memory of enormous capacity made up of character bit patterns, so holding and controlling several of them clearly complicates the configuration and raises the cost.

An object of the present invention is to provide a character recognition device that can recognize handwritten characters in which numerals, kana, alphabetic characters, and so on are mixed, without error, using a single dictionary.

To achieve this object, the character recognition device of the present invention is characterized as follows. A dictionary is provided that stores character bit patterns obtained by extracting the features of handwritten characters. The video data of a read handwritten character is fed to a feature extraction and decision operation unit, which extracts its features, compares them with the dictionary, assigns a character-type priority to each character obtained by the decision operation, and outputs a plurality of answer candidates. Separately, a read-target character table is created that stores the character types of the handwritten characters to be input. The answer candidates are fed to a matching circuit and compared with the read-target character table, and the answer candidate that agrees with a character type stored in the table is output as the final answer.

The present invention is described in detail below with reference to an embodiment.

FIG. 2 is an explanatory diagram showing the configuration of an embodiment of the present invention.

In the figure, for example, the form 1 is inserted into the reading mechanism 2 of the OCR device 10 of FIG. 1, and the handwritten characters are read and stored in the memory 3. The stored video data is fed to the feature extraction and decision operation unit 11 shown in FIG. 2, where, as in FIG. 1, the features are extracted and compared with the contents of the dictionary (memory) 6. As a result of the decision operation, a first through an n-th candidate are output as answer candidates.

That is, the dictionary 6 stores character bit patterns for the features of numerals, kana, alphabetic characters, special symbols, and so on, and these are compared with the features of the video data of the input character.

In this case, a priority is assigned to each feature item, and a plurality of answers No. 1 through No. n are output.

For example, suppose that among the similar characters that were read, priority is assigned in the order numerals, kana, alphabetic characters according to the content of the features. The candidate closest to the features of the numeral "7" mentioned above is then given first priority, and the following priorities go to the kana "す" (su) and the kana "ワ" (wa), the one showing the closer match ranking higher.
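One way to picture this ordering is the sketch below. The source does not say how closeness is measured or how many candidates are kept, so the Hamming distance, the `max_distance` cutoff, the type names, and the toy patterns are all assumptions introduced for illustration.

```python
from typing import NamedTuple

class Candidate(NamedTuple):
    char: str
    char_type: str  # "digit", "kana", or "alpha" (assumed type labels)
    distance: int   # Hamming distance to the input pattern (assumed metric)

# Assumed character-type priority: numerals first, then kana, then letters.
TYPE_PRIORITY = {"digit": 0, "kana": 1, "alpha": 2}

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two bit patterns."""
    return bin(a ^ b).count("1")

def rank_candidates(pattern, dictionary, max_distance=3):
    """Return near matches ordered by character type, then by closeness.

    `dictionary` maps a character to a (char_type, stored_pattern) pair.
    """
    candidates = [
        Candidate(char, char_type, hamming(pattern, stored))
        for char, (char_type, stored) in dictionary.items()
        if hamming(pattern, stored) <= max_distance
    ]
    return sorted(candidates, key=lambda c: (TYPE_PRIORITY[c.char_type], c.distance))

# Toy 12-bit patterns, made up so that "7", "su", and "wa" look alike.
toy_dictionary = {
    "7": ("digit", 0b111000100100),
    "su": ("kana", 0b111000100110),
    "wa": ("kana", 0b111100100100),
    "B": ("alpha", 0b111111111111),
}

ranked = rank_candidates(0b111000100110, toy_dictionary)
print([c.char for c in ranked])  # ['7', 'su', 'wa']
```

In this toy run the numeral ranks first because of its character type even though "su" is the closer shape, and "B" is dropped as too distant.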

The prioritized answer candidates No. 1 through No. n are each given a unique address and sent to the matching circuit 13.

Meanwhile, the numerals, kana, alphabetic characters, and so on that are to be read are each given a unique address in advance. A read-target character table 12 is created according to these addresses, and its contents are sent to the matching circuit 13, where they are compared with the answer candidates described above.
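A minimal sketch of such a table, assuming one address per character type and one flag per address, is shown below. The concrete addresses in `TYPE_ADDRESS` and both function names are hypothetical; the source only states that unique addresses are assigned and that the table is built from them.

```python
# Assumed unique addresses, one per character type.
TYPE_ADDRESS = {"digit": 0, "kana": 1, "alpha": 2, "symbol": 3}

def build_read_target_table(char_types):
    """Build a small flag memory: table[address] is True if that type is to be read."""
    table = [False] * len(TYPE_ADDRESS)
    for char_type in char_types:
        table[TYPE_ADDRESS[char_type]] = True
    return table

def is_read_target(table, char_type):
    """Look up a candidate's character type by its address."""
    return table[TYPE_ADDRESS[char_type]]

table = build_read_target_table(["digit", "kana"])
print(table)                           # [True, True, False, False]
print(is_read_target(table, "alpha"))  # False
```

A table of this kind needs only a few bits, in contrast to a dictionary of bit patterns.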

That is, the read-target character table 12 is searched for the unique address of the first-priority character among the answers. If no match is found, the search moves on to the second-priority candidate.

In the example above, if the characters to be read are, say, the numerals "0" to "9" and the kana "ア, イ, ウ, エ, オ" (a, i, u, e, o), the numeral "7" is decisive and there is no possibility of the kana "す" (su) being output.

If, on the other hand, only kana are to be read, the first-priority numeral "7" is discarded. If, of the second-priority kana "す" (su) and the third-priority "ワ" (wa), "す" is the closer match to the dictionary, the second-priority kana "す" is output.
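The selection step in these two cases can be sketched as a walk down the prioritized candidates. This is an illustrative reading of the description, not the matching circuit itself: `select_answer`, the `REJECT` marker, and the representation of the table as a set of character-type names are assumptions.

```python
REJECT = "?"  # assumed marker for a reject character

def select_answer(candidates, read_targets):
    """Return the first candidate whose character type is a read target.

    `candidates` is a list of (char, char_type) pairs in priority order.
    `read_targets` is the set of character types expected on the form.
    """
    for char, char_type in candidates:
        if char_type in read_targets:
            return char
    return REJECT

# Candidates in priority order, as in the "7" / "su" / "wa" example.
candidates = [("7", "digit"), ("su", "kana"), ("wa", "kana")]

print(select_answer(candidates, {"digit", "kana"}))  # 7
print(select_answer(candidates, {"kana"}))           # su
print(select_answer(candidates, {"alpha"}))          # ?
```

The same candidate list gives a different final answer depending only on the table contents, which is how one dictionary can stand in for several.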

For the numeral "8" and the letter "B" mentioned earlier, the result is decisive when one of the two types is absent from the characters to be read. When both are present, either the one that matches the dictionary more closely is given priority, as in the previous example, or both are treated as mismatches and the character is rejected.
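The "8" versus "B" case can be sketched as follows. The source leaves open when the closer match wins and when the character is rejected, so the numeric distances and the `min_margin` rule used here to decide between the two outcomes are assumptions for illustration only.

```python
REJECT = "?"  # assumed marker for a reject character

def resolve(candidates, read_targets, min_margin=1):
    """Pick among candidates whose type is a read target.

    `candidates` is a list of (char, char_type, distance) triples, where a
    smaller distance means a closer match to the dictionary.
    If two allowed candidates are closer together than `min_margin`,
    the character is rejected instead of guessed.
    """
    allowed = sorted(
        (c for c in candidates if c[1] in read_targets),
        key=lambda c: c[2],
    )
    if not allowed:
        return REJECT
    if len(allowed) == 1:
        return allowed[0][0]
    best, runner_up = allowed[0], allowed[1]
    if runner_up[2] - best[2] < min_margin:
        return REJECT
    return best[0]

tied = [("8", "digit", 2), ("B", "alpha", 2)]
print(resolve(tied, {"digit"}))           # 8
print(resolve(tied, {"alpha"}))           # B
print(resolve(tied, {"digit", "alpha"}))  # ?

clear = [("8", "digit", 1), ("B", "alpha", 3)]
print(resolve(clear, {"digit", "alpha"}))  # 8
```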

As described above, according to the present invention, the feature bit pattern of a handwritten character is compared with the contents of a dictionary and a plurality of answer candidates are output, while a read-target character table is created separately and the candidates are compared against it to obtain the final answer.

This yields fewer errors than the conventional approach, which judges only the features of the handwritten glyph shape and outputs a single answer, and the same effect as providing the multiple dictionaries mentioned above is obtained with a single dictionary.

A read-target character table is required in this case, but it is merely a simple memory in which unique addresses are assigned to numerals, kana, alphabetic characters, and so on, so it poses little problem for the configuration.

[Brief Description of the Drawings]

FIG. 1 is an explanatory diagram of a conventional example, and FIG. 2 is an explanatory diagram showing the configuration of an embodiment of the present invention. In the figures, 1 denotes a form, 2 a reading mechanism, 3 a memory, 6 a dictionary (memory), 11 a feature extraction and decision operation unit, 12 a read-target character table, and 13 a matching circuit.

Claims (1)

[Claims]
1. A character recognition device characterized in that a dictionary storing character bit patterns obtained by extracting the features of handwritten characters is provided; the video data of a read handwritten character is fed to a feature extraction and decision operation unit, which extracts its features, compares them with the dictionary, assigns a character-type priority to each character obtained by the decision operation, and outputs a plurality of answer candidates; a read-target character table storing the character types of the handwritten characters to be input is created separately; and the plurality of answer candidates are fed to a matching circuit and compared with the read-target character table, the answer candidate that agrees with a character type stored in the read-target character table being output so as to obtain the final answer.
JP54054907A 1979-05-04 1979-05-04 character recognition device Expired JPS5847066B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP54054907A JPS5847066B2 (en) 1979-05-04 1979-05-04 character recognition device

Publications (2)

Publication Number Publication Date
JPS55157077A JPS55157077A (en) 1980-12-06
JPS5847066B2 JPS5847066B2 (en) 1983-10-20

Family

ID=12983666

Family Applications (1)

Application Number Title Priority Date Filing Date
JP54054907A Expired JPS5847066B2 (en) 1979-05-04 1979-05-04 character recognition device

Country Status (1)

Country Link
JP (1) JPS5847066B2 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS4912724A (en) * 1972-05-12 1974-02-04
JPS53112617A (en) * 1977-03-14 1978-10-02 Toshiba Corp Pattern recognizing method

Also Published As

Publication number Publication date
JPS55157077A (en) 1980-12-06
