JPH0362280A

JPH0362280A - Character reading device

Info

Publication number: JPH0362280A
Application number: JP1198351A
Authority: JP
Inventors: Kazuji Kiyono; 清野　和司
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1989-07-31
Filing date: 1989-07-31
Publication date: 1991-03-18

Abstract

PURPOSE: To correct misrecognized portions of a document efficiently by identifying them from the sum of the recognition results of the characters within a prescribed range of a document written on a form and the corresponding numerical information also written on that form, and by displaying those portions. CONSTITUTION: A checksum check part 12 uses a calculation part 13 to add up, for each row and each column of the document, the recognition results (character codes) received from a reading part 11. Once the row and column sums have been calculated, the part 13 transfers the calculated data, together with the recognition results of the checksums corresponding to each row and column, to a deciding part 14. The part 14 decides that a misrecognized character is present if the two low-order digits of the calculated data do not coincide with the recognition result of the checksum. A display part 15 then displays the reading result as a correction screen. A read document can thus be corrected easily.

Description

DETAILED DESCRIPTION OF THE INVENTION [Object of the Invention] (Field of Industrial Application) The present invention relates to a character reading device that reads documents written on forms.

(Prior Art) In general, character reading devices (OCR) are used to read ordinary documents (characters) written on forms. When an ordinary document is input with such a device, the locations of the figures and characters written on the form are determined. That is, the lines in which characters are written, and each character within them, are detected automatically, and recognition processing is performed on each detected character.

Combined with recognition technology for a large number of character types, including kanji, this recognition processing has been realized as a general-purpose Japanese document reader.

(Problem to Be Solved by the Invention) However, in such a document reader (character reading device), when the input results are corrected, the locations needing correction are almost always identified by a human.

Specifically, the locations to be corrected were identified by comparing the displayed results read by the document reader against the document written on the form that was actually read. Correcting the read results was therefore inefficient and took a great deal of time.

The present invention has been made in view of the above, and its object is to provide a character reading device that allows the results of reading a document to be corrected efficiently.

[Structure of the Invention] (Means for Solving the Problems) The present invention comprises: reading means for reading a document and numerical information from a form on which the document, and numerical information corresponding to the group of characters included in a predetermined range of the document, are written; calculating means for calculating the sum of the character codes of the characters included in the predetermined range of the document read by the reading means; determining means for determining, based on the sum of character codes calculated by the calculating means and the numerical information read by the reading means, whether the document read by the reading means has been read correctly; and display means for displaying an indication to that effect when the determining means determines that the document has not been read correctly.

The present invention also comprises: reading means for reading a document and numerical information from a form on which the document, and numerical information corresponding to the group of characters included in each row and each column of the document, are written; calculating means for calculating the sum of the character codes of the characters included in each row and each column of the document read by the reading means; determining means for determining, for each row and each column, based on the sums of character codes calculated by the calculating means and the numerical information read by the reading means, whether the document read by the reading means has been read correctly; and display means for displaying an indication of the characters that were not read correctly, based on the positions of the rows and columns determined by the determining means not to have been read correctly.

(Operation) In a character reading device configured in this way, the locations to be corrected are displayed automatically, so the read document can be corrected easily.

(Embodiment) An embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of a character reading device according to the embodiment. In the figure, a reading section 11 detects character information, such as a document, written on a form. Connected to the reading section 11 is a checksum check section 12, which detects errors in the document read by the reading section 11. The checksum check section 12 consists of a calculation section 13, which sums the character codes of the read characters, and a determination section 14, which determines error locations based on the sum obtained by the calculation section 13 and the checksum. Also connected to the checksum check section 12 is a display section 15, which displays the reading result in accordance with the checksum check.

FIG. 2 shows an example of how a form to be processed in the embodiment is filled in. In the example of FIG. 2, an ordinary document is written together with checksums (numerical information) corresponding to each row and each column of that document. In the figure, the checksum for each row is shown as "XX" and the checksum for each column as "YY". Each checksum (XX, YY) is the last two digits of the sum of the character codes of the characters in the corresponding row or column (the predetermined range).
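As an illustration of the checksum definition above, the following minimal sketch computes the row checksums (XX) and column checksums (YY) for a small grid of characters. The patent does not say which character code set is used, so Python's `ord()` (Unicode code points) stands in for it here; the function names and the sample grid are hypothetical.

```python
def checksum(chars: str) -> int:
    """Last two decimal digits of the sum of the character codes.

    Assumption: ord() stands in for the device's character code.
    """
    return sum(ord(c) for c in chars) % 100


def row_checksums(grid: list[str]) -> list[int]:
    """One checksum (XX) per row of the document."""
    return [checksum(row) for row in grid]


def column_checksums(grid: list[str]) -> list[int]:
    """One checksum (YY) per column; short rows simply contribute nothing."""
    width = max(len(row) for row in grid)
    return [
        checksum("".join(row[c] for row in grid if c < len(row)))
        for c in range(width)
    ]


if __name__ == "__main__":
    document = ["ABC", "DEF"]          # hypothetical two-row document
    print(row_checksums(document))     # checksums written beside each row
    print(column_checksums(document))  # checksums written under each column
```

With ASCII letters the arithmetic is easy to follow by hand: "ABC" sums to 65 + 66 + 67 = 198, so its row checksum is 98.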

Next, the operation of the embodiment is described.

First, the reading section 11 performs character reading processing on a form such as that shown in FIG. 2. The reading section 11 is a general-purpose Japanese document reader, and the form size is not restricted. By optically scanning the form, the reading section 11 detects a form image that includes the images of the characters and other marks written on the form. From the detected form image, the reading section 11 detects the position of each line of the document and then each character contained in each line, and cuts out a character pattern for each character. Recognition processing is performed on each character pattern cut out in this way. Here, the ordinary document (characters) and the checksums (digits) are recognized in the same manner.

The recognition results (character codes) are, however, classified into those belonging to the document and those belonging to the checksums. The reading section 11 transfers the recognition results to the checksum check section 12.

In the checksum check section 12, the calculation section 13 adds up the recognition results (character codes) from the reading section 11 for each row and each column of the document.

Once the sums of the recognition results (character codes) for each row and each column have been calculated, the calculation section 13 transfers this calculated data, together with the recognition results for the checksums corresponding to each row and each column, to the determination section 14.

Based on the information from the calculation section 13, the determination section 14 performs determination processing to decide whether the document has been read correctly. That is, for each row and each column, it compares the last two digits of the calculated data with the recognition result for the corresponding checksum. If the last two digits of the calculated data and the recognition result for the checksum agree (match), the document is taken to have been read correctly; if they do not agree (mismatch), the row or column is determined to contain a misrecognized character. When determination processing has been completed for every row and column of the read document, the determination section 14 notifies the display section 15 of the determination results.
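The comparison performed by the determination section can be sketched as follows. This is only an illustration under stated assumptions: the function name `find_mismatches` and the sample values are hypothetical, and the sums are assumed to arrive as plain integers alongside the recognized two-digit checksums.

```python
def find_mismatches(computed_sums: list[int], recognized_checksums: list[int]) -> list[int]:
    """Return the indices of rows (or columns) judged to contain a misread character.

    A row or column matches when the last two digits of its computed
    character-code sum equal the checksum recognized from the form.
    """
    return [
        index
        for index, (total, check) in enumerate(zip(computed_sums, recognized_checksums))
        if total % 100 != check
    ]


if __name__ == "__main__":
    row_sums = [198, 208]        # hypothetical sums from the calculation step
    row_checks = [98, 7]         # hypothetical checksums recognized on the form
    print(find_mismatches(row_sums, row_checks))  # indices of mismatched rows
```

The same function would be applied once to the row sums and once to the column sums, giving the two lists of mismatched positions that are passed on for display.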

The display section 15 displays the reading result for the document written on the form in accordance with the determination results, and presents this as a correction screen. Rows and columns determined to contain a misrecognized character (mismatch) are shown with, for example, double-brightness highlighting. In addition, the character at the intersection of a mismatched row and a mismatched column is treated as a reject and, as shown in FIG. 3 for example, is replaced with the reject symbol "?" for display. On this correction screen, the cursor is controlled so that it moves only to reject symbols.
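A minimal sketch of how such a correction screen might be assembled is shown below. It replaces the characters at the intersections of mismatched rows and columns with "?" and records those positions as the only places the cursor may stop. The highlighting of whole rows and columns is not modeled, and all names (`build_correction_screen`, `next_stop`) are hypothetical.

```python
REJECT = "?"


def build_correction_screen(grid, bad_rows, bad_cols):
    """Return (screen_lines, cursor_stops) for the correction screen.

    Characters at the intersection of a mismatched row and a mismatched
    column are treated as rejects and shown as "?".
    """
    bad_rows, bad_cols = set(bad_rows), set(bad_cols)
    screen, stops = [], []
    for r, row in enumerate(grid):
        line = []
        for c, ch in enumerate(row):
            if r in bad_rows and c in bad_cols:
                line.append(REJECT)
                stops.append((r, c))  # cursor may stop only here
            else:
                line.append(ch)
        screen.append("".join(line))
    return screen, stops


def next_stop(stops, current):
    """Move the cursor to the next reject position, wrapping to the first."""
    if not stops:
        return None
    later = [s for s in stops if s > current]
    return later[0] if later else stops[0]


if __name__ == "__main__":
    screen, stops = build_correction_screen(["ABC", "DEG"], bad_rows=[1], bad_cols=[2])
    print("\n".join(screen))
    print(stops)
```

Tracking the cursor stops separately from the displayed text avoids confusing a reject symbol with a question mark that genuinely appears in the document.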

In this way, the locations to be corrected can easily be identified from the display screen. Moreover, when correct character data is entered for a rejected character through an input section (not shown; consisting of a keyboard or the like), the correction (editing) can be carried out efficiently. When only one character has been misrecognized, that character alone is pointed out; even when several have been misrecognized, multiple candidates for the misrecognized characters can be pointed out.
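The difference between the single-error and multiple-error cases can be seen in a small worked example. The sketch below assumes a rectangular grid, uses `ord()` as a stand-in for the device's character codes, and uses hypothetical names throughout. With one misread character the row and column mismatches intersect at exactly that character; with two misread characters in different rows and columns, the intersections yield four candidates, only two of which are actually wrong.

```python
from itertools import product


def checksum(chars: str) -> int:
    """Last two digits of the character-code sum (ord() is an assumption)."""
    return sum(ord(c) for c in chars) % 100


def candidate_positions(read_grid, row_checks, col_checks):
    """Intersections of mismatched rows and columns, as (row, column) pairs."""
    bad_rows = [r for r, row in enumerate(read_grid) if checksum(row) != row_checks[r]]
    bad_cols = [
        c
        for c in range(len(read_grid[0]))
        if checksum("".join(row[c] for row in read_grid)) != col_checks[c]
    ]
    return list(product(bad_rows, bad_cols))


if __name__ == "__main__":
    # Checksums as written on the form for the hypothetical document ["ABC", "DEF"].
    row_checks, col_checks = [98, 7], [33, 35, 37]

    # One misread character: "F" read as "G".
    print(candidate_positions(["ABC", "DEG"], row_checks, col_checks))

    # Two misread characters: "A" read as "X" and "F" read as "G".
    print(candidate_positions(["XBC", "DEG"], row_checks, col_checks))
```

In the second case the operator still has to inspect each candidate, but the search is narrowed to the flagged intersections rather than the whole document.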

As a result, correction efficiency is markedly better than when corrections are made by comparing the displayed reading results against the document.

This method can be applied to any form read by the reading section 11 on which checksums can be written.

As one concrete use, applying the method to the test sheets used when calculating the recognition rate of a character reading device reduces the effort of preparing master data.

Although the above embodiment describes the case in which a Japanese document is written on the form, the invention is naturally also applicable to digits, alphabetic characters, kana, and the like.

In addition, although checksums (numerical information) corresponding to each row and each column are described as being written on the form in advance, the invention is not limited to this.

Furthermore, various methods are conceivable for calculating the last two digits of the sum of the character codes in each row and each column.
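The patent does not name any particular alternative. Purely as an illustration, the sketch below contrasts two ways of arriving at the same two digits: summing all character codes and then keeping the last two digits, versus reducing modulo 100 after every addition so that the accumulator never exceeds two digits. Both function names are hypothetical, and `ord()` again stands in for the device's character code.

```python
def checksum_sum_then_truncate(chars: str) -> int:
    """Add every character code, then keep the last two decimal digits."""
    return sum(ord(c) for c in chars) % 100


def checksum_running(chars: str) -> int:
    """Keep only two digits after each addition (small fixed-size accumulator)."""
    acc = 0
    for c in chars:
        acc = (acc + ord(c)) % 100
    return acc


if __name__ == "__main__":
    for text in ["ABC", "文字読取装置"]:
        # The two methods always agree because addition is compatible with modulo.
        print(text, checksum_sum_then_truncate(text), checksum_running(text))
```

The running form may suit hardware with a narrow accumulator, while the first form is simpler to compute by hand when filling in a form.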

[Effects of the Invention] As described above, according to the present invention, misrecognized locations are identified based on the sum of the recognition results (character codes) of the characters included in a predetermined range (a row, a column, or the like) of a document written on a form and the corresponding numerical information (checksum) also written on the form, and the correction screen is displayed so that the locations to be corrected are readily apparent. The results of reading a document can therefore be corrected efficiently.

[Brief explanation of drawings]

FIG. 1 is a block diagram showing the configuration of a character reading device according to an embodiment of the present invention, FIG. 2 is a diagram showing an example of how a form to be processed in the embodiment is filled in, and FIG. 3 is a diagram illustrating an example of the display screen in the embodiment. 11: reading section (reading means); 12: checksum check section; 13: calculation section (calculating means); 14: determination section (determining means); 15: display section (display means).

Claims

[Claims]

(1) A character reading device characterized by comprising: reading means for reading a document and numerical information from a form on which the document, and numerical information corresponding to a group of characters included in a predetermined range of the document, are written; calculating means for calculating the sum of the character codes of the characters included in the predetermined range of the document read by the reading means; determining means for determining, based on the sum of character codes calculated by the calculating means and the numerical information read by the reading means, whether the document read by the reading means has been read correctly; and display means for displaying an indication to that effect when the determining means determines that the document has not been read correctly.

(2) A character reading device characterized by comprising: reading means for reading a document and numerical information from a form on which the document, and numerical information corresponding to the group of characters included in each row and each column of the document, are written; calculating means for calculating the sum of the character codes of the characters included in each row and each column of the document read by the reading means; determining means for determining, for each row and each column, based on the sums of character codes calculated by the calculating means and the numerical information read by the reading means, whether the document read by the reading means has been read correctly; and display means for displaying an indication of the characters that were not read correctly, based on the positions of the rows and columns determined by the determining means not to have been read correctly.