JPH0219984A

JPH0219984A - Document reader

Info

Publication number: JPH0219984A
Application number: JP63168861A
Authority: JP
Inventors: Yoshimasa Yanagihara; 義正柳原; Ritsu Takeda; 立武田; Shuichi Takanami; 修一高波; Akio Mitamura; 三田村　章雄; Kiyoshi Itao; 清板生
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1988-07-08
Filing date: 1988-07-08
Publication date: 1990-01-23

Abstract

PURPOSE: To recover on-line from errors that occur when a large quantity of documents is input continuously, by deciding whether the read result of image data recognized in a recognizing part is acceptable and, when it is decided to be unacceptable, driving the reading part again so that it recovers automatically. CONSTITUTION: The device is equipped with a reading part 2 that reads a document 1, a recognizing part 5 that recognizes the image data read by the reading part 2, a deciding means 8 that decides whether the read result of the recognizing part 5 is acceptable, and a control means 9 that drives the reading part 2 again and automatically recovers it when the deciding means 8 decides that the read result is unacceptable. The deciding means 8 decides the read result of the document 1; when the result is unacceptable, the control means 9 drives the reading part 2 again and the document 1 is read again. When the inclination angle of a character or graphic of the document 1 is outside a specified value, a correcting mechanism 11 corrects either the insertion angle or the reading angle of the document 1. When the insertion angle is corrected, the rollers at both edges are controlled separately. Thus, errors that occur when documents are input can be recovered on-line.

Description

[Detailed description of the invention]

[Field of industrial application]

The present invention relates to a document reading device that reads document images with high reliability. In particular, it relates to a document reading device that automatically detects input errors and can feed back the error information.

[Conventional technology]

A conventional document reading device is configured as shown in FIG. 8. In FIG. 8, 1 is a document and 2 is a reading section, which consists of a reading mechanism 3 and a paper feed mechanism 4'. Reference numeral 5 denotes a recognition section, which consists of an image memory 6 and recognition means 7. Reference numeral 9 is a paper feed control section, and 10 is a CPU. In operation, document 1 is converted into a document image by reading section 2 on a command from CPU 10, recognized by recognition section 5, and transferred to the outside. If document 1 is fed in at an angle, or if a page of document 1 is skipped during input, the device has no means of correcting the input error. The operator therefore had to inspect document 1 on the display screen at the host side, confirm the error, and only then carry out correction.

[Problem to be solved by the invention]

As described above, the conventional document reading device lacks automatic recovery means and therefore has the drawback of low reliability when a large number of documents is input. It has the further drawback that, when image input is processed offline, nothing can be done if a reading failure surfaces later when the image is passed to OCR. The object of this invention is to provide a highly reliable document reading device in which errors that occur while a large number of documents is input continuously are detected automatically by a recognition function and fed back to the paper feed control section, so that the device recovers automatically.

[Means to solve the problem]

The document reading device according to the present invention includes a reading section that reads a document, a recognition section that recognizes the image data read by the reading section, determination means that determines whether the reading result of the recognition section is acceptable, and control means that re-drives the reading section so that it recovers automatically when the determination means returns a negative result.

The invention may also be provided with a correction mechanism for the insertion angle or reading angle of the document.

Furthermore, to correct the insertion angle of the document, it may include means for independently rotating the rollers at both ends of a group of coaxial paper feed rollers.

It may also be provided with means for detecting missing page numbers and means for displaying the missing page numbers or reporting them to the outside.

[Operation]

In the present invention, the determination means judges the reading result of the document; when the result is negative, the control means drives the reading section again so that it recovers automatically and reads the document again.

When the tilt angle of the characters or figures in the document is outside the specified value, the correction mechanism corrects the insertion angle or reading angle of the document.

When the insertion angle of the document is corrected, the rollers at both ends are controlled individually.

In addition, page numbers are detected, any missing page number is identified, and the missing number is displayed or reported to the outside.

[Example]

FIG. 1 is a block diagram of the overall configuration of an embodiment of the present invention. The device consists of the following: a reading section 2, which continuously inputs a large number of documents 1 using a paper feed/return mechanism 4 and converts them into image data by photoelectric conversion with an image sensor such as a CCD in a reading mechanism 3; a recognition section 5, which has an image memory 6 that stores the read image data, recognition means 7 that detects the tilt angle of the page, the page number, and other defects from image data read out sequentially from image memory 6, and determination means 8 that determines from the recognition result sent by recognition means 7 whether a document reading error has occurred; a paper feed control section 9 that operates the paper feed/return mechanism 4 according to information fed back from recognition section 5; a built-in CPU 10 that serves as control means for overall control; and a correction mechanism 11 for the insertion angle or reading angle.

FIG. 2 shows a schematic flow of document processing in the embodiment of FIG. 1. In FIG. 2, (1) to (10) denote the individual steps.

First, the characters and other content of document 1 are photoelectrically converted by an image sensor such as a CCD (1) and stored in image memory 6 (2). Next, recognition section 5 detects the tilt angle of document 1 (3). If the tilt angle is not within the allowable value, the process returns to step (1); in the meantime, correction mechanism 11 corrects the insertion angle or the reading angle. If the tilt angle is within the allowable value at step (4), a candidate region for page number detection is located (5) and the page number of document 1 is detected (6). The page number is then compared with that of the previous page (7), and it is determined whether the page numbers are consecutive (8). If they are not consecutive, the missing page is stored (9). If they are consecutive, it is checked whether the page is the last document. Steps (1) to (10) are repeated until the last document is reached, at which point processing ends.
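The flow of FIG. 2 can be sketched roughly as the loop below. This is only an illustration under stated assumptions: the function name `process_batch`, the `(page_number, tilt_deg)` input tuples, the 1.0 degree tolerance, and the idea that a single correction pass removes the skew are all hypothetical and do not come from the patent.

```python
def process_batch(pages, tolerance_deg=1.0):
    """pages: list of (printed_page_number, tilt_deg_at_first_scan) tuples.

    Returns (rescanned_pages, missing_pages).
    """
    rescanned, missing = [], []
    previous = None
    for number, tilt in pages:
        # Steps (1)-(4): scan, store, and measure the tilt angle.
        if abs(tilt) > tolerance_deg:
            # Out of tolerance: correct the angle and go back to step (1).
            rescanned.append(number)
            tilt = 0.0  # assumption: one correction pass removes the skew
        # Steps (5)-(8): detect the page number and compare with the previous page.
        if previous is not None and number - previous != 1:
            # Step (9): remember the page numbers that were skipped.
            missing.extend(range(previous + 1, number))
        previous = number
    # The loop ends once the last sheet has been processed.
    return rescanned, missing
```

For example, a batch in which page 2 arrives skewed and page 3 is skipped would report one rescan and one missing page.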

FIG. 3 is a functional block diagram of recognition means 7 of recognition section 5. Recognition means 7 consists of a preprocessing section 12, a detection section 13, and a calculation section 17. Detection section 13 consists of a tilt angle detection section 14, a page number detection section 15, and a multi-font dictionary 16.

The operation is described next. The automatic recovery operations are correction of the tilt angle and display of missing page numbers. Detection of the tilt angle is described first, followed by detection and display of page numbers.

First, the image data read from image memory 6 undergoes preprocessing such as binarization and noise removal in preprocessing section 12. For binarization, for example, a fixed threshold is set in advance; a pixel whose value is at or above the threshold is set to 1 and a pixel below it is set to 0, which converts the data into binary image data. The binarized image data is then transferred to detection section 13, where the tilt angle is detected first.
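Fixed-threshold binarization as described here can be sketched in a few lines. The function name `binarize`, the default threshold of 128, and the choice to treat a value exactly equal to the threshold as 1 are assumptions made for illustration; the patent only says that a fixed threshold is set in advance.

```python
def binarize(gray, threshold=128):
    """Map each pixel to 1 if it is at or above the threshold, otherwise 0.

    gray is a list of rows of brightness values, so dark ink becomes 0
    (black) and bright paper becomes 1 (white).
    """
    return [[1 if px >= threshold else 0 for px in row] for row in gray]
```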

FIG. 4 shows a known example of a method for detecting the tilt angle of a horizontally written document (1982 National Convention of the Institute of Electronics and Communication Engineers: Akiyama and Masuda, "Character region extraction method for newspaper articles"). If the parts of the binarized image data that correspond to characters are represented by "0" (black pixels) and the remaining background by "1" (white pixels), a character string can be depicted schematically as a black bar, as at 18. As shown in the figure, the binarized image data is divided as a whole into a number of vertical strips. For each strip, the data is read out by raster scanning and the horizontal marginal distribution feature, that is, the sum of black pixels along the horizontal direction, is calculated. This yields, for each strip, a histogram such as that shown at 20, corresponding to the strip regions of the character strings.

In this case several histograms such as that shown at 20 are calculated. Taking 20 to be, for example, the histogram for one strip of the first character string 18, the histogram peaks of the successive strips shift in the direction of the tilt when the page is tilted, as shown in the figure. The tilt angle 19 of the page can therefore be obtained, for example, by drawing a straight line 21 through the peak points of the histograms and calculating the ratio of the distance 22 between histograms to the height 23. When the input document is written vertically, its tilt angle can be detected in the same way by dividing the page into horizontal strips and calculating the vertical marginal distribution feature for each strip.
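A minimal sketch of the strip-histogram idea follows. It assumes a binary image given as a list of rows with 0 for black, and it simplifies line 21 to the line through the peaks of the first and last strips; it also assumes a single dominant text line, so that each strip's largest peak belongs to the same character string. The names `strip_profiles` and `skew_angle_deg` are hypothetical.

```python
import math

def strip_profiles(binary, n_strips):
    """Black-pixel count per row (horizontal projection) for each vertical strip."""
    width = len(binary[0])
    bounds = [(i * width // n_strips, (i + 1) * width // n_strips)
              for i in range(n_strips)]
    profiles = [[sum(1 for px in row[a:b] if px == 0) for row in binary]
                for a, b in bounds]
    centers = [(a + b - 1) / 2 for a, b in bounds]  # horizontal center of each strip
    return profiles, centers

def skew_angle_deg(binary, n_strips=4):
    """Estimate the tilt from the shift of the histogram peak between strips."""
    profiles, centers = strip_profiles(binary, n_strips)
    # Row index of the peak in each strip (first maximum on ties).
    peaks = [max(range(len(p)), key=p.__getitem__) for p in profiles]
    distance = centers[-1] - centers[0]  # corresponds to distance 22
    height = peaks[-1] - peaks[0]        # corresponds to height 23
    return math.degrees(math.atan2(height, distance))
```

With rows numbered from the top, a positive result means the text line drops toward the right. A least-squares fit through all strip peaks would be a natural refinement but is not shown here.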

After the tilt angle has been detected by the known method above, if it exceeds a set tolerance, the tilt angle information is fed back to paper feed control section 9, input of document 1 is stopped, the paper feed angle is reset on the basis of the angle information, and document 1 is input again. This processing corrects the tilt of the document.

Next, page number detection section 15 detects the page number of the document. FIG. 5 shows the schematic flow of page number detection section 15. Here (11) to (19) denote the individual steps.

For the binary image data read from image memory 6, the character string region in which the page number is expected to appear is first detected and then read as follows. The page number of a document is generally found on the first or last line of the page. Therefore, for example, the horizontal marginal distribution feature is calculated as described above, the character strings are separated at the blank space between them, the first and last marginal distributions are extracted (11), and each such character string 18 is represented by its circumscribing rectangle (12). The characters in the candidate character strings detected in this way can be recognized with the recognition techniques used in known OCR.
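Steps (11) and (12) might look roughly like the sketch below: text lines are separated at blank rows of the horizontal projection, and the first and last lines are returned as bounding rectangles. The function names and the `(left, top, right, bottom)` rectangle layout are assumptions for illustration, and the image is again a list of rows with 0 for black.

```python
def text_line_bands(binary):
    """Return (top, bottom) row ranges of text lines, split at blank rows."""
    bands, start = [], None
    for y, row in enumerate(binary):
        has_ink = any(px == 0 for px in row)
        if has_ink and start is None:
            start = y                      # a text line begins
        elif not has_ink and start is not None:
            bands.append((start, y - 1))   # a blank row ends the line
            start = None
    if start is not None:
        bands.append((start, len(binary) - 1))
    return bands

def bounding_box(binary, band):
    """Circumscribing rectangle (left, top, right, bottom) of one text line."""
    top, bottom = band
    cols = [x for y in range(top, bottom + 1)
            for x, px in enumerate(binary[y]) if px == 0]
    return (min(cols), top, max(cols), bottom)

def page_number_candidates(binary):
    """Rectangles of the first and last text lines, where a page number is expected."""
    bands = text_line_bands(binary)
    if not bands:
        return []
    picked = [bands[0]] if len(bands) == 1 else [bands[0], bands[-1]]
    return [bounding_box(binary, b) for b in picked]
```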

First, each character is cut out of the candidate string, its feature values are extracted, and the character is recognized by matching against the multi-font dictionary 16 prepared in advance (13). To cut out the characters, for example, the vertical marginal distribution feature within the candidate string is taken, the characters are separated at the blank space between them and cut out as rectangular frames, and the final segmentation is then made from conditions such as the containment relations between frames and the character height. Character recognition can be carried out using, for example, the stroke structure accumulation method for feature extraction. Only the page number is then extracted from the recognized characters: a region consisting solely of digits is selected from the recognition result and converted into code information as the page number of the page (14).
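The segmentation and digit selection described above can be illustrated as follows. This sketch covers only the blank-column split and the choice of a digits-only region; the refinement by frame containment and character height, the feature extraction, and the dictionary matching are not modeled. The names `split_characters` and `extract_page_number` are hypothetical.

```python
import re

def split_characters(line):
    """Split a binary text line (0 = black) into column ranges at blank columns."""
    width = len(line[0])
    spans, start = [], None
    for x in range(width):
        has_ink = any(row[x] == 0 for row in line)
        if has_ink and start is None:
            start = x                      # a character begins
        elif not has_ink and start is not None:
            spans.append((start, x - 1))   # a blank column ends the character
            start = None
    if start is not None:
        spans.append((start, width - 1))
    return spans

def extract_page_number(recognized_text):
    """Return the first run made only of digits as an int, or None if there is none."""
    match = re.search(r"\d+", recognized_text)
    return int(match.group()) if match else None
```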

When the number is read by the known method above, the page numbers of the most recent pages can be used as prior knowledge. That is, reading can be sped up by taking the page number of the previous page and adding +1 or -1 for a single-sided document, or +2 or -2 for a double-sided document (15), (16). Calculation section 17 then temporarily stores the page number detected by page number detection section 15 (17), and this is repeated until the last document is reached (18). The missing page numbers are then displayed or reported (19).
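One way to read steps (15) and (16) is that the expected values are tried first and full recognition is used only when none of them fits. The sketch below assumes this interpretation; `matches` stands for a hypothetical cheap check of the page image against a given digit string, and `full_ocr` for the ordinary recognition path. Neither is defined by the patent.

```python
def expected_page_numbers(previous, double_sided=False):
    """Candidates derived from the previous page: +/-1 single-sided, +/-2 double-sided."""
    step = 2 if double_sided else 1
    return [previous + step, previous - step]

def read_page_number(matches, full_ocr, previous, double_sided=False):
    """Try the predicted values first; fall back to full recognition otherwise."""
    for candidate in expected_page_numbers(previous, double_sided):
        if matches(str(candidate)):
            return candidate
    return full_ocr()
```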

Determination means 8 judges that document 1 is being input continuously and normally when the difference is 1, and that a page of document 1 has been skipped for any other value. The page numbers that were not input normally are transferred to the host as input history information, and after input of all documents has finished the page numbers that were not input are, for example, shown on the screen, so that recovery processing can be carried out efficiently.
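The judgment made by determination means 8 amounts to checking the difference between successive page numbers. A minimal sketch, assuming the detected numbers are available as a list and that the expected difference (`step`) is 1 for single-sided input, could be:

```python
def judge_sequence(page_numbers, step=1):
    """Return the page numbers judged to have been skipped.

    A difference equal to step between neighbors counts as normal input;
    any other difference is treated as a skip.
    """
    missing = []
    for prev, cur in zip(page_numbers, page_numbers[1:]):
        if cur - prev != step:
            missing.extend(range(prev + step, cur, step))
    return missing
```

The returned list plays the role of the input history that would be sent to the host for display once the batch is finished.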

The input errors detected as described above are recovered automatically as follows. First, document 1 is converted into image data and stored in image memory 6; it is then read out, the contour coordinates of character strings and the like are detected by raster scanning or similar, and the tilt angle can be corrected using the affine transformation, which is a known technique.
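For the affine correction mentioned here, the relevant special case is a pure rotation of the detected contour coordinates by the negative of the measured tilt. The sketch below assumes standard mathematical axes (y pointing up, counterclockwise positive); with image coordinates whose y axis points down, the sign of the angle flips. The function name `rotate_points` and the optional rotation center are assumptions.

```python
import math

def rotate_points(points, angle_deg, center=(0.0, 0.0)):
    """Rotate (x, y) points by -angle_deg about center to undo a measured tilt."""
    t = math.radians(-angle_deg)
    c, s = math.cos(t), math.sin(t)
    cx, cy = center
    return [(cx + c * (x - cx) - s * (y - cy),
             cy + s * (x - cx) + c * (y - cy))
            for x, y in points]
```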

The method of automatically recovering from a tilt error at input according to the present invention is explained with reference to the schematic diagram of paper feed/return mechanism 4 in FIG. 6.

First, from the tilt angle detected by recognition section 5, CPU 10 calculates the rotation direction and rotation angle of each roller according to the tilt angle of the page. Of the two rollers 25, one feeds the paper and the other returns it, which corrects the tilt of the page. With the tilt angle denoted e, the rotation angles are set so that one roller 25 rotates by e/2 and the other roller 25 by -e/2. On the basis of this angle information, paper feed control section 9 sends both step motors 24 of paper feed/return mechanism 4 a sign signal that gives the rotation direction and a pulse train signal corresponding to the rotation angle, and drives both step motors 24. Then, on receiving the stop signals from both step motors 24, CPU 10 issues a signal that rotates both step motors 24 in reverse, returns the page to the input start position, and has it input again. In this way the tilt of the page at input is corrected automatically.
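The conversion from tilt angle to motor commands can be sketched as below. It follows the text literally in assigning e/2 to one roller and -e/2 to the other; the 1.8 degree step angle, the `(sign, pulses)` tuple format, and the function name `roller_commands` are hypothetical and are not specified in the patent.

```python
def roller_commands(tilt_deg, step_deg=1.8):
    """Return ((sign, pulses), (sign, pulses)) for the two end rollers.

    sign is +1 for feed and -1 for return; pulses is the number of
    stepper pulses that approximates the requested rotation.
    """
    def command(rotation_deg):
        sign = 1 if rotation_deg >= 0 else -1
        pulses = round(abs(rotation_deg) / step_deg)
        return sign, pulses
    return command(tilt_deg / 2), command(-tilt_deg / 2)
```

Rounding to whole pulses means the correction is only as fine as the motor's step angle, so a residual tilt smaller than that would need the rescan-and-check loop to decide whether it is within tolerance.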

Furthermore, according to the present invention, the page can be read normally without correcting its tilt, whether it is the page itself or the character strings on the page that are tilted. First, from the tilt angle of the character strings detected on the page, the CCD sensor of reading mechanism 3 is rotated until it is parallel to the character strings. The page is then rewound to the input start position and input again to be read.

This allows normal input without correcting the tilt of the page, and is an effective reading method particularly when the character strings on the page are themselves tilted.

As another embodiment of the present invention, as shown in FIG. 7, recovery at input is possible by the following operation even when a double-sided document 1' has been inserted with its front and back reversed. The two reading mechanisms 3 are driven simultaneously, both sides are converted into image data by photoelectric conversion, and the front and back sides are stored separately in image memories 6 and 6'. By then selecting the desired side, the page can be read without being turned over even when double-sided document 1' has been misread or single-sided document 1 has been inserted the wrong side up. As this effect makes clear, the device is superior to the conventional technique in that errors at document input can be recovered online.

[Effects of the invention]

As described above, in the present invention the determination means judges whether the reading result of the image data recognized by the recognition section is acceptable, and when the result is negative the reading section is re-driven so that it recovers automatically. Errors that occur while a large number of documents is input continuously can therefore be recovered online.

In addition, because the tilt angle of a figure is recovered automatically by the correction mechanism for the insertion angle or reading angle, insertion angle errors at input are eliminated. In particular, when the rollers at both ends are controlled individually for automatic recovery of the insertion angle, the mechanism is extremely simple.

Furthermore, because missing page numbers are detected and displayed as one of the automatic recovery means, the missing page numbers can be recognized easily.

[Brief explanation of the drawing]

FIG. 1 is a block diagram of the overall configuration of an embodiment of the present invention, FIG. 2 is a schematic flowchart of document processing, FIG. 3 is a functional block diagram of the recognition section, FIG. 4 is an explanatory diagram of the principle of document tilt angle detection, FIG. 5 is a flowchart of page number recognition processing, FIG. 6 is a schematic diagram of the paper feed/rewind mechanism, FIG. 7 is a schematic diagram of the reading mechanism, and FIG. 8 is a block diagram showing an example of a conventional device.

In the figures, 1 is a document, 2 is a reading section, 3 is a reading mechanism, 4 is a paper feed/return mechanism, 5 is a recognition section, 6 and 6' are image memories, 7 is recognition means, 8 is determination means, 9 is a paper feed control section, 10 is a CPU, 11 is a preprocessing section, 12 is a detection section, 13 is a calculation section, 14 is a tilt angle detection section, 15 is a page number detection section, 16 is a multi-font dictionary, 17 is a calculation section, 24 is a step motor, and 25 is a roller.

Claims

[Claims]

(1) A document reading device comprising: a reading unit that reads a document; a recognition unit that recognizes the image data read by the reading unit; determination means that determines whether the reading result of the recognition unit is acceptable; and control means that re-drives the reading unit so that it recovers automatically when the determination means returns a negative determination.

(2) A document reading device comprising: a reading unit equipped with a paper feed/return mechanism that can continuously input and return documents, a reading mechanism that converts the input document into image data, and a correction mechanism for the insertion angle or reading angle of the document; a recognition unit having an image memory that stores the image data read by the reading unit, recognition means that detects the tilt angle of characters or figures of the input document from the image data stored in the image memory, and determination means that determines whether the result of the recognition means is acceptable; and control means that, when the determination means outputs a negative determination, feeds the tilt angle back to the reading unit and performs correction with the correction mechanism for the insertion angle or reading angle of the document.

(3) The document reading device according to claim (2), characterized in that the correction mechanism comprises means for feeding and returning paper with a group of coaxial paper feed rollers, means for independently rotating the rollers at both ends of the coaxial paper feed roller group, means for converting the tilt angle of the page into a rewind rotation angle, and means for individually controlling the rollers at both ends according to the rewind rotation angle so as to correct the tilt and return the page to the input start position.

(4) A document reading device comprising: a reading unit equipped with a paper feed/return mechanism that can continuously input and rewind documents and a reading mechanism that converts the input document into image data; a recognition unit having an image memory that stores the image data read by the reading unit, means for detecting the first character string and the last character string on a page in the image data stored in the image memory, means for recognizing the characters in the character strings, means for extracting the page number value from the recognized characters and temporarily storing it, means for comparing the page number value of the previous page with the extracted page number value to detect and store a missing page number, and determination means that determines whether the result is acceptable; and control means that, when the determination means outputs a negative determination, displays the missing page number on a display unit or reports it to the outside.