JPH0193876A - Character reader - Google Patents
Character reader
Info
- Publication number
- JPH0193876A
- Authority
- JP
- Japan
- Prior art keywords
- character
- characters
- word
- candidate
- similar
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Character Discrimination (AREA)
Abstract
Description
[Detailed Description of the Invention]
[Industrial Field of Application]
The present invention relates to a character reader that recognizes and reads characters written on a form, and in particular to a character reader equipped with character-recognition post-processing that uses a word dictionary so that input characters can be recognized accurately.
In conventional character recognition methods, to improve the accuracy of the result read by a character reader, a plurality of candidates are output and matched against a word dictionary, and the best-matching word is output, as described in Japanese Examined Patent Publication No. 61-20058.
With this method, however, the character of the correct category must be among the recognition candidates; if it is not, there is a considerable possibility that the character will be misread.
On the other hand, the number of character types that a character reader can read is limited: it can read only the Jōyō kanji or the JIS level 1 kanji. The characters written on forms, however, sometimes include others, for example JIS level 2 kanji.
Forbidding writers from using these characters would place an unnecessary burden on them, and because the character set runs to several thousand characters, it is nearly impossible for them to remember which characters the character reader can read.
The object of the present invention is to provide means for performing word matching without misrecognition even when the correct character is not included among the recognition candidate characters output by the character reader.
This object is achieved by providing the character reader with candidate selection means that adds, to the recognition candidates obtained from the character reading means, a plurality of similar characters resembling them, and by selecting, on the basis of the candidate group thus created, the word with the greatest similarity from among the words held in the word dictionary.
In general, when a character of one category is misread as another category, the number of such other categories is not very large, and they are often limited to specific categories. Therefore, if these misreadings are measured in advance and a similar-character table is built from them, the probability that the correct character appears in that table becomes very high.
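The measurement step described above can be sketched as follows. This is a minimal illustration rather than the procedure of the patent: the function name `build_similar_table`, the thresholds `min_count` and `max_similar`, and the sample observations are all assumptions introduced here. The table is keyed by the character the reader output, and lists the true characters that were most often behind that output.

```python
from collections import Counter, defaultdict


def build_similar_table(observations, min_count=2, max_similar=4):
    """Build a similar-character table from test readings.

    observations: iterable of (recognized, truth) pairs, where `recognized`
    is what the reader output and `truth` is what was actually written.
    Returns {recognized: [true characters frequently misread as it]}.
    """
    confusions = defaultdict(Counter)
    for recognized, truth in observations:
        if recognized != truth:  # only misreadings are of interest
            confusions[recognized][truth] += 1

    table = {}
    for recognized, counts in confusions.items():
        # Keep the most frequent confusions that were seen often enough.
        frequent = [ch for ch, n in counts.most_common(max_similar) if n >= min_count]
        if frequent:
            table[recognized] = frequent
    return table


# Hypothetical measurement data, for illustration only.
observations = [
    ("布", "市"), ("布", "市"), ("布", "市"),
    ("布", "希"), ("布", "希"),
    ("布", "帝"),          # seen once: below min_count, dropped
    ("布", "布"),          # correct reading, ignored
    ("帝", "市"), ("帝", "市"),
]
table = build_similar_table(observations)
```

Because the true character only needs to appear in the test data, characters outside the reader's own repertoire can end up in the table by the same route.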
Similarly, by having the reader read characters that are not registered in the standard patterns (external characters) and examining which characters they tend to be misread as, external characters can be entered in the similar-character table in advance.
Accordingly, when the character reader outputs a plurality of recognition candidate characters, the similar-character table is looked up using the recognition candidate characters as keys, and the characters registered in the table are also used as candidate characters. The candidate characters then contain the correct character with high probability, and errors in word matching are eliminated.
An embodiment of the present invention is described below with reference to FIG. 1.
In FIG. 1, reference numeral 1 denotes a form on which characters are written; it is captured and read by character recognition means 2. The character recognition means 2 performs recognition using a recognition dictionary 3 and outputs recognition candidates. Each recognition candidate is assumed to consist of a category name, a mask number, a similarity, and so on. The mask number indicates which mask accepted the character when the standard pattern of a category is composed of a plurality of masks; it is unnecessary when only one mask is used per category.

Reference numeral 6 denotes word matching means, which matches the candidate characters against a word dictionary 7 and outputs the best-matching word as matching result word 8.
Candidate selection means 4 looks up the similar-character table 5 using the recognition candidates from the character recognition means 2, and creates the candidate character group to be input to the word matching means 6.
FIG. 2 shows an example of the recognition candidates from the character recognition means 2. The recognition candidates consist of the number of candidates NC and NC sets of category name, mask number, and similarity.
FIG. 3 shows an example of the similar-character table 5.
Each entry of the similar-character table consists of a category name, a mask number, the number of similar characters NS, and NS similar-character category names.
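The two record layouts and the lookup performed by the candidate selection means can be sketched as below. This is an illustrative sketch under stated assumptions, not the patent's implementation: the names `Candidate` and `select_candidates`, the similarity values, and the concrete characters in the table are hypothetical. The table is keyed by (category name, mask number), and the length of each list plays the role of NS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    category: str      # category name, i.e. the recognized character
    mask: int          # which mask of the category's pattern accepted it
    similarity: float  # recognition similarity


def select_candidates(candidates, similar_table):
    """Return the candidate group: recognition candidates first, followed by
    similar characters from the table that are not already present."""
    result = [c.category for c in candidates]
    seen = set(result)
    for c in candidates:
        for similar in similar_table.get((c.category, c.mask), []):
            if similar not in seen:
                seen.add(similar)
                result.append(similar)
    return result


# Hypothetical data: NC = 2 recognition candidates, one table entry with NS = 4.
candidates = [Candidate("布", 1, 0.92), Candidate("帝", 1, 0.85)]
similar_table = {("布", 1): ["市", "帝", "希", "巾"]}

expanded = select_candidates(candidates, similar_table)
```

This sketch looks up every recognition candidate; restricting the loop to the first-ranked candidate would be an equally plausible reading.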
The operation is described concretely below using an example. Suppose that the character '市' written on a form has been read and that the candidates shown in FIG. 2 have been obtained. Because the correct character '市' is not among these candidates, word matching may mistakenly match a different word.
Next, the first-ranked recognition candidate, that is, category name '布' with mask number '1', is searched for in the similar-character table 5. The similar-character table shown in FIG. 3 contains this entry, and four similar characters are found. Of these similar characters, '帝' already exists among the recognition candidates, so the remaining ones are added to the candidate table.
The final candidate table is shown in FIG. 4. It contains the correct category '市', so word matching should no longer go wrong.
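The effect on word matching can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the function `match_word`, the averaging score, the word dictionary, and in particular the score of 0.7 given to the added similar characters, since the text does not state how added characters are weighted.

```python
def match_word(position_candidates, dictionary):
    """Pick the dictionary word best supported by the candidates.

    position_candidates: one {character: score} dict per character position.
    A word qualifies only if every one of its characters is a candidate at
    the corresponding position. Returns (word, score), or (None, 0.0).
    """
    best_word, best_score = None, 0.0
    for word in dictionary:
        if len(word) != len(position_candidates):
            continue
        scores = [cands.get(ch) for ch, cands in zip(word, position_candidates)]
        if any(s is None for s in scores):
            continue  # some character of the word is not a candidate
        average = sum(scores) / len(scores)
        if average > best_score:
            best_word, best_score = word, average
    return best_word, best_score


dictionary = ["市川", "石川", "千葉"]  # hypothetical word dictionary

# Recognition candidates only: the written '市' was not among them.
raw = [{"布": 0.9, "帝": 0.8}, {"川": 0.95}]

# After adding similar characters (assumed score 0.7 for added characters).
expanded = [{"布": 0.9, "帝": 0.8, "市": 0.7, "希": 0.7, "巾": 0.7}, {"川": 0.95}]

before = match_word(raw, dictionary)
after = match_word(expanded, dictionary)
```

With the raw candidates no dictionary word fits, whereas with the expanded group the word containing '市' becomes reachable. How added characters should be scored against genuine recognition candidates is a design decision this sketch leaves open.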
The above example dealt with the case where the correct character is not among the candidates, but it is clear that the same approach works even when the written character is an external character, provided external characters have been entered in the similar-character table.
[Effects of the Invention]
According to the present invention, similar characters inferred from the recognition candidates can be added as candidates. Therefore, even when the correct character is not among the recognition candidates, it can be included among the candidate characters with high probability, which improves the accuracy of the word matching result. Furthermore, by including external characters in the similar-character table, word matching can succeed even when external characters have been written.
FIG. 1 is a block diagram showing an embodiment of the present invention. FIG. 2 is an explanatory diagram showing an example of recognition candidates, FIG. 3 is an explanatory diagram showing an example of the similar-character table, and FIG. 4 is an explanatory diagram showing an example of the candidate group created from the recognition candidates and the similar-character table.
1: form.
2: character recognition means.
4: candidate selection means.
6: word matching means.
Agent: Patent Attorney Katsuo Koyo (小用勝男)
Claims (1)
1. A character reader comprising character reading means for recognizing characters written on a form, and matching means for detecting that the characters read by the reading means match a word registered in a word dictionary, characterized in that the character reader has candidate selection means for adding, to the recognition candidates obtained from the character reading means, a plurality of similar characters resembling them, and in that, on the basis of the candidate group thus created, the word with the greatest similarity is selected from among the words held in the word dictionary.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP62249782A JPH0193876A (en) | 1987-10-05 | 1987-10-05 | Character reader |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP62249782A JPH0193876A (en) | 1987-10-05 | 1987-10-05 | Character reader |
Publications (1)
Publication Number | Publication Date |
---|---|
JPH0193876A | 1989-04-12 |
Family
ID=17198147
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP62249782A Pending JPH0193876A (en) | 1987-10-05 | 1987-10-05 | Character reader |
Country Status (1)
Country | Link |
---|---|
JP (1) | JPH0193876A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6477653B2 (en) | 1998-01-08 | 2002-11-05 | Fujitsu Limited | Information storage system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JPH0193876A (en) | Character reader | |
JP2732593B2 (en) | Character reading system | |
JPS61114388A (en) | Character input device | |
JP2560959B2 (en) | Post-processing method for character recognition | |
JPS63268083A (en) | Word recognizing device | |
JPS63138479A (en) | Character recognizing device | |
JP2746899B2 (en) | Character recognition device | |
JPS63782A (en) | Pattern recognizing device | |
JP2712260B2 (en) | Character recognition device | |
JP3151866B2 (en) | English character recognition method | |
JPS61107486A (en) | Character recognition post-processing system | |
JP2839515B2 (en) | Character reading system | |
JPS6160189A (en) | Optical character reader | |
JP3245415B2 (en) | Character recognition method | |
JPH0319589B2 (en) | ||
JPS5953986A (en) | Character recognizing device | |
JPS6252912B2 (en) | ||
JPS6118080A (en) | Character recognizer | |
JPS6336487A (en) | Character reading system | |
JPH0442382A (en) | Word reader | |
JPS6121581A (en) | Character recognizer | |
Tehrani | Bibliographical access to Persian language materials in the United States | |
JPH0546806A (en) | Character recognition method | |
JPS61161588A (en) | Postprocessing system of character recognition | |
JPS5847066B2 (en) | character recognition device |