JP4635845B2

JP4635845B2 - OCR device, form-out method, and form-out program

Info

Publication number: JP4635845B2
Application number: JP2005343159A
Authority: JP
Inventors: 淳青木
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2005-11-29
Filing date: 2005-11-29
Publication date: 2011-02-23
Anticipated expiration: 2025-11-29
Also published as: JP2007148846A

Description

The present invention relates to an OCR apparatus that extracts and outputs entered character images from form image data input by optical reading means. More specifically, it relates to an OCR apparatus with a form-out function that extracts and outputs only the entered character images from the input form image data.

Conventionally, dropout processing, for example, has been developed and put into practical use as a technique (form-out) that applies OCR technology to separate and extract only the entered character images from an optically read form or the like.
In dropout processing, form elements such as ruled lines are printed in a color such as red or green when the form is created. The colored portions are then removed from the image of the filled-in form so that only the character images are extracted. The technique is widely used for reading and input processing of various kinds of forms.

However, in the conventional dropout method, the dropout colors that can be read and removed differ from one reading device to another, so the processable colors are limited to specific ones. The method therefore lacks versatility, convenience, and economy.
For this reason, a method that does not rely on dropout processing came into use: the fixed layout of the form is stored in advance, and only that fixed layout is deleted from the filled-in form to extract the character images.

However, with this form-out method of fixing and storing the form layout, form-out processing can fail when the image is misaligned or otherwise displaced.
Various techniques have therefore been proposed to improve form-out methods that use such stored form layouts.

For example, Patent Document 1 proposes a fine alignment method for removing a reference template from a template on which characters have been written.
Patent Document 2 proposes an image alignment method that performs alignment without providing any special mark on the form.
Furthermore, Patent Document 3 proposes a paper alignment device that can align a sheet without a black background function, without dedicated marks, and regardless of whether the sheet contains ruled lines.

JP-T-08-504076 (pages 1-5, FIG. 1)
JP-A-10-091783 (pages 1-3, FIG. 1)
JP-A-11-003431 (pages 1-5, FIG. 4)

However, the alignment method proposed in Patent Document 1 aligns the reference image and the input image by projecting each image onto the vertical and horizontal axes and comparing the black pixel counts. It therefore cannot handle copiers, fax machines, and similar devices that introduce scale errors or rotational distortion.
The image alignment method proposed in Patent Document 2 performs alignment by detecting the intersections (cross points) of ruled lines, so form-out is not possible for forms that contain no ruled lines.
Furthermore, the paper alignment device proposed in Patent Document 3 assumes that the entire input image is deformed by the same expansion ratio and rotation angle, and does not take local deformation errors into account. It therefore cannot handle the non-linear distortion that arises when a form is printed by a printer or transported through a scanner.

The present invention has been proposed to solve the above problems of the prior art. Its object is to provide an OCR apparatus that can perform form-out on any kind of sheet or form, regardless of whether the input form has reading-position marks or ruled lines, and regardless of size errors caused by printing equipment, printing misalignment, expansion or contraction of the input image, or local deformation.

To achieve the above object, the OCR apparatus of the present invention, as set forth in claim 1, is an OCR apparatus that extracts and outputs entered character images from form image data input by optical reading means. It comprises: an image input unit that inputs the image data of a form by the optical reading means; a reference image storage unit that stores the input image data of a blank form as a reference image; a dictionary image storage unit that converts a part of the input image data of the blank form at predetermined rotation angles and enlargement ratios and stores the results as dictionary images; a difference image generation unit that generates difference images between the input image of a completed form and the dictionary images; a rotation angle/enlargement ratio detection unit that detects the rotation angle and enlargement ratio of the completed form image based on the difference images generated by the difference image generation unit; a reference image rotation/enlargement unit that converts the reference image by the rotation angle and enlargement ratio detected by the rotation angle/enlargement ratio detection unit; and an output image generation unit that generates and outputs a difference image between the reference image converted by the reference image rotation/enlargement unit and the completed form image.

According to the OCR apparatus of the present invention configured in this way, the rotation angle and enlargement ratio of the completed form image input by the image input unit are detected, and the reference image is converted by the detected rotation angle and enlargement ratio, so that the form layouts of the completed form image to be processed and the reference image coincide. Form-out is then achieved by generating and outputting a difference image between the converted reference image and the completed form image.
Distortion caused by rotation or by expansion and contraction can therefore be detected and corrected.
Consequently, even if size errors (enlargement or reduction), printing misalignment, expansion or contraction of the input image, or the like arise during image scanning, they can be corrected and form-out processing can be carried out reliably.

The OCR apparatus of the present invention further comprises: an image dividing unit that generates a plurality of divided images by dividing the input image data of the blank form into one or more rows and columns; a feature extraction unit that binarizes the divided images generated by the image dividing unit and calculates the sum of the contour lengths of the image formed by one of the two pixel classes obtained by binarization; a maximum feature region detection unit that detects, from the divided images generated by the image dividing unit, the divided image with the largest sum of contour lengths; and an enlargement/rotation processing unit that converts the divided image detected by the maximum feature region detection unit into a plurality of image data at predetermined rotation angles and enlargement ratios. The dictionary image storage unit stores the plurality of image data converted by the enlargement/rotation processing unit as dictionary images.

In the OCR apparatus of the present invention, as set forth in claim 3, the difference image generation unit comprises: raster scanning means that raster-scans the completed form image and acquires predetermined image data; region extraction means that extracts, from the image data acquired by the raster scanning means, a region corresponding to the region of the dictionary image; and difference image generation means that generates, in the extracted region, a difference image between the completed form image and the dictionary image.

According to the OCR apparatus of the present invention configured in this way, the image data of the blank form is divided into a plurality of parts, the divided image with the largest feature amount is extracted, and this divided image is converted at predetermined rotation angles and enlargement ratios to create a plurality of dictionary images.
All of these dictionary images are then raster-scanned over the image data of the completed form, and a difference image is generated at each position.
This leads to accurate detection of the rotation angle and enlargement ratio of the completed form image and, as a result, enables highly accurate form-out.

In the OCR apparatus of the present invention, the rotation angle/enlargement ratio detection unit comprises: pixel area calculation means that binarizes the difference images generated by the difference image generation unit and calculates the total area of one of the two pixel classes obtained by binarization; difference image extraction means that extracts, from the difference images generated by the difference image generation unit, the difference image with the smallest total area; and rotation angle/enlargement ratio determination means that takes the rotation angle and enlargement ratio of the dictionary image used to generate the extracted difference image as the rotation angle and enlargement ratio of the completed form image.

According to the OCR apparatus of the present invention configured in this way, the difference image with the smallest area of pixels representing a difference is extracted from the plurality of difference images generated by the difference image generation unit. In addition, that area is required to be no greater than a predetermined threshold.
A certain level of accuracy can therefore be maintained when determining the rotation angle and enlargement ratio of the completed form image.
Consequently, high quality can be maintained in the final form-out output.

The OCR apparatus of the present invention further comprises: a reference image dividing unit that generates a plurality of small-region reference images by dividing the reference image converted by the reference image rotation/enlargement unit into one or more rows and columns; and an alignment unit that aligns the positions of the small-region reference images and the completed form image. After the alignment unit has performed alignment for each small region, the output image generation unit generates and outputs a difference image between each small-region image and the completed form image.

In particular, the alignment unit comprises: reference alignment means that superimposes the completed form image on each small-region reference image generated by the reference image dividing unit, using a predetermined designated position as the reference; image shifting means that moves each small-region reference image and/or the completed form image within a fixed range around the designated position; small-region difference image generation means that generates, within that range, difference images between each small-region reference image and the completed form image; small-region difference image extraction means that extracts, from the generated difference images, the difference image with the smallest absolute difference for each small region; and position determination means that determines the image positions from the positions of the small-region reference images and/or the completed form image used to generate the difference images extracted by the small-region difference image extraction means.

According to the OCR apparatus of the present invention configured in this way, the completed form image and the reference image are aligned region by region when the difference image is generated.
Local distortion can therefore also be corrected, so that even more accurate form-out can be achieved.
In addition, since alignment marks, ruled lines, and the like are unnecessary, there are no restrictions on form design, and convenience is improved.

The form-out method of the present invention is a method for extracting and outputting entered character images from form image data input by optical reading means. It comprises the steps of: inputting the image data of a blank form; storing the input image data of the blank form as a reference image; converting a part of the input image data of the blank form at predetermined rotation angles and enlargement ratios and storing the results as dictionary images; inputting the image data of a completed form; generating difference images between the completed form image and the dictionary images; detecting the rotation angle and enlargement ratio of the completed form image based on the difference images; converting the reference image by the detected rotation angle and enlargement ratio; and generating and outputting a difference image between the converted reference image and the completed form image.

Thus, the present invention can be realized not only as the apparatus invention described above but also as a method invention.
As a result, the present invention can be realized without being limited to a specific apparatus configuration, as long as each of the above steps is included, and a highly versatile form-out method can be provided.

The form-out program of the present invention is a program for extracting and outputting entered character images from form image data input by optical reading means. It causes a computer to function as: means for inputting the image data of a form by the optical reading means; means for storing the input image data of a blank form as a reference image; means for converting a part of the input image data of the blank form at predetermined rotation angles and enlargement ratios and storing the results as dictionary images; means for generating difference images between the input image data of a completed form and the dictionary images; means for detecting the rotation angle and enlargement ratio of the completed form image based on the difference images; means for converting the reference image by the detected rotation angle and enlargement ratio; and means for generating and outputting a difference image between the converted reference image and the completed form image.

Thus, the present invention can also be realized as a program.
The present invention can therefore be realized by installing the program not only in an OCR apparatus but also in a personal computer or scanner and having the devices cooperate, and it can be provided as a form-out program with excellent versatility and extensibility.

According to the OCR apparatus of the present invention, distortion of the input image (size errors, printing misalignment, expansion and contraction, skew, and the like) can be handled, and reliable form-out becomes possible.
Since local distortion can also be absorbed, highly accurate form-out can be achieved.
Furthermore, the alignment marks and ruled lines that conventional forms have required become unnecessary, so convenience is improved.
As a result, it is possible to realize a highly reliable OCR apparatus with excellent versatility and extensibility that can perform form-out on any kind of sheet or form, regardless of whether the input form has reading-position marks or ruled lines, and regardless of size errors caused by printing equipment, printing misalignment, expansion or contraction of the input image, or local deformation.

Preferred embodiments of the OCR apparatus of the present invention are described below with reference to the drawings.
The OCR apparatus of the present invention shown in the following embodiments is realized by processing, means, and functions executed on a computer under the instructions of a program (software). The program sends commands to each component of the computer and causes it to perform the predetermined processing and functions described below. In other words, each process and means in the OCR apparatus of the present invention is realized by concrete means in which the program and the computer cooperate.
All or part of the program is provided on, for example, a magnetic disk, an optical disk, a semiconductor memory, or any other computer-readable recording medium, and the program read from the recording medium is installed on the computer and executed. The program can also be loaded onto the computer directly through a communication line, without using a recording medium, and executed.

FIG. 1 is a block diagram showing the main configuration of an OCR apparatus according to an embodiment of the present invention.
The OCR apparatus 1 of this embodiment comprises an input device 10, a data processing device 20, and a storage device 40.
Each main component of this embodiment is described in detail below.

[Input device 10]
The input device 10 has an image input unit 11. The image input unit 11 optically reads a form and inputs it as image data. Specifically, a scanner device or the like corresponds to this unit.
The image data input by the input device 10 is output to the data processing device 20.

[Data processing device 20]
The data processing device 20 comprises an image dividing unit 21, a feature extraction unit 22, a maximum feature region detection unit 23, an enlargement/rotation processing unit 24, a difference image generation unit 31, a rotation angle/enlargement ratio detection unit 32, a reference image rotation/enlargement unit 33, a reference image dividing unit 34, an alignment unit 35, and an output image generation unit 36.
The image dividing unit 21 divides the image data of the blank form A supplied from the input device 10 into a plurality of small regions. Specifically, it generates a plurality of grid-like divided images by dividing the image data one or more times in the vertical direction (rows) and the horizontal direction (columns). For example, dividing into 2 rows and 2 columns produces divided images covering four small regions.
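The grid division described above can be sketched as follows. This is a minimal illustration in Python, assuming the image is held as a list of pixel rows; the function name `split_into_grid` and the rule for placing tile boundaries when the size is not evenly divisible are assumptions for illustration, not details given in the text.

```python
def split_into_grid(image, rows, cols):
    """Split a 2-D image (list of pixel rows) into rows x cols tiles.

    Tiles are returned in row-major order. Boundaries are computed with
    integer arithmetic so that the tiles cover the image exactly
    (an assumption; the text does not say how remainders are handled).
    """
    height, width = len(image), len(image[0])
    ys = [height * r // rows for r in range(rows + 1)]
    xs = [width * c // cols for c in range(cols + 1)]
    tiles = []
    for r in range(rows):
        for c in range(cols):
            tile = [row[xs[c]:xs[c + 1]] for row in image[ys[r]:ys[r + 1]]]
            tiles.append(tile)
    return tiles


# A 4 x 4 test image whose pixel value encodes its position (10*y + x).
image = [[10 * y + x for x in range(4)] for y in range(4)]
tiles = split_into_grid(image, 2, 2)  # 2 rows x 2 columns -> 4 tiles
```

With 2 rows and 2 columns the call yields four tiles, matching the example in the text.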

The feature extraction unit 22 binarizes each divided image generated by the image dividing unit 21, performs contour tracing on one of the two resulting pixel classes (white pixels or black pixels), and calculates the sum of the contour lengths.
For example, when contour tracing is performed on the black pixels, the divided image is scanned and the first black pixel found is taken as the tracing start point. Contour pixels (the boundary between black and white pixels) are then followed in a predetermined direction, and one contour line is formed when the trace returns to the start point. The total length of all such contour lines is calculated as the feature amount.
The feature amount serves as an index of the complexity of the ruled lines, characters, patterns, and the like contained in the region (divided image).
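As a rough illustration of this feature amount, the sketch below binarizes a tile and measures the total boundary length of the black pixels. Note that it substitutes a simpler technique for the contour tracing described above: it counts the pixel edges where a black pixel meets a white pixel or the tile border, which approximates total contour length without following each contour. The threshold of 128 and the function names are assumptions.

```python
def binarize(image, threshold=128):
    """Map grayscale pixels to 1 (black, below threshold) or 0 (white)."""
    return [[1 if px < threshold else 0 for px in row] for row in image]


def contour_length(binary):
    """Count edges between black pixels and white pixels or the border.

    This is a perimeter count used as a stand-in for the summed contour
    lengths; it is not the contour-tracing procedure itself.
    """
    h, w = len(binary), len(binary[0])
    total = 0
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                outside = not (0 <= ny < h and 0 <= nx < w)
                if outside or binary[ny][nx] == 0:
                    total += 1
    return total


# A 2 x 2 black square on a white 4 x 4 background.
tile = [[255] * 4 for _ in range(4)]
for y in (1, 2):
    for x in (1, 2):
        tile[y][x] = 0
feature = contour_length(binarize(tile))
```

A tile containing many ruled lines or printed characters has more black/white boundaries and so scores higher, which is the property the text relies on when choosing the most distinctive region.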

The maximum feature region detection unit 23 selects, from the divided images generated by the image dividing unit 21, the one with the largest feature amount.
The enlargement/rotation processing unit 24 then rotates and enlarges (or reduces) the image data of the small region with the largest feature amount detected by the maximum feature region detection unit 23.
Specifically, the rotation angle takes 13 values from -6° to +6° in 1° steps, and the enlargement ratio takes 11 values from -10% to +10% in 2% steps. The region is processed for every combination, and the resulting 143 processed small-region images are stored in the dictionary image storage unit 42.
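The 13 x 11 = 143 combinations can be enumerated as in the sketch below. The parameter ranges come from the text; everything else is an assumption made for illustration, including the nearest-neighbour resampling, rotation and scaling about the tile centre, keeping the output the same size as the input, and the names `warp` and `build_dictionary`.

```python
import math
from itertools import product

ANGLES_DEG = range(-6, 7)            # -6 to +6 degrees in 1-degree steps (13 values)
SCALE_PERCENTS = range(-10, 11, 2)   # -10% to +10% in 2% steps (11 values)


def warp(tile, angle_deg, scale, background=255):
    """Rotate and scale a tile about its centre (nearest neighbour)."""
    h, w = len(tile), len(tile[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))
    out = [[background] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: find the source pixel for each output pixel.
            dx, dy = (x - cx) / scale, (y - cy) / scale
            sx = round(cos_a * dx + sin_a * dy + cx)
            sy = round(-sin_a * dx + cos_a * dy + cy)
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = tile[sy][sx]
    return out


def build_dictionary(tile):
    """Return {(angle_deg, scale_percent): warped tile} for all combinations."""
    return {
        (angle, percent): warp(tile, angle, 1 + percent / 100)
        for angle, percent in product(ANGLES_DEG, SCALE_PERCENTS)
    }


tile = [[0, 255, 255, 255],
        [255, 0, 255, 255],
        [255, 255, 0, 255],
        [255, 255, 255, 0]]
dictionary = build_dictionary(tile)
```

Keying each dictionary image by its parameters means the rotation angle and enlargement ratio can be read back directly once the best-matching dictionary image has been found.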

The difference image generation unit 31 generates difference images between the dictionary images and the input image of the completed form B. As shown in FIG. 2, it has raster scanning means 311, region extraction means 312, and difference image generation means 313, and functions as the difference image generation unit 31 through the sequence of operations of these means.
The raster scanning means 311 raster-scans the 143 dictionary image patterns stored in the dictionary image storage unit 42 over the image data of the completed form B supplied from the input device 10.

The region extraction means 312 uses the image data obtained by the raster scanning means 311 to extract, on the image of the completed form B, a region identical to the dictionary image region.
The difference image generation means 313 generates a difference image for each dictionary image pattern in the region obtained by the region extraction means 312.
The difference images generated by the difference image generation unit 31 are output to the rotation angle/enlargement ratio detection unit 32.
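The raster scan and difference generation for a single dictionary image can be sketched as below. The sketch assumes both images are already binarized (1 = black, 0 = white) and uses a pixel-wise exclusive OR as the difference; the text does not specify the difference operation, so that choice and the function names are assumptions.

```python
def difference_image(a, b):
    """Pixel-wise XOR of two equally sized binary images (1 = pixels differ)."""
    return [[pa ^ pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


def raster_scan_differences(form, pattern):
    """Slide `pattern` over `form` in raster order.

    Yields ((top, left), difference image) for every position at which
    the pattern fits entirely inside the form.
    """
    fh, fw = len(form), len(form[0])
    ph, pw = len(pattern), len(pattern[0])
    for top in range(fh - ph + 1):
        for left in range(fw - pw + 1):
            window = [row[left:left + pw] for row in form[top:top + ph]]
            yield (top, left), difference_image(window, pattern)


form = [[0, 0, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 1, 1],
        [0, 0, 0, 0]]
pattern = [[1, 0],
           [1, 1]]
results = dict(raster_scan_differences(form, pattern))
```

In the full procedure this scan would be repeated for each of the 143 dictionary images, giving one difference image per pattern and position.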

The rotation angle/enlargement ratio detection unit 32 detects the rotation angle and enlargement ratio of the input image of the completed form B.
Specifically, as shown in FIG. 3, the rotation angle/enlargement ratio detection unit 32 has pixel area calculation means 321, difference image extraction means 322, and rotation angle/enlargement ratio determination means 323, and functions as the rotation angle/enlargement ratio detection unit 32 through the sequence of operations of these means.
The pixel area calculation means 321 binarizes each difference image generated by the difference image generation unit 31 into black and white pixels and calculates the total area of one pixel class (for example, the white pixels). This measures the size of the discrepancy between the dictionary image used to generate the difference image and the completed form image.

The difference image extraction means 322 extracts the difference image for which the total area of that pixel class, as calculated by the pixel area calculation means 321, is smallest.
The rotation angle/enlargement ratio determination means 323 identifies the dictionary image used to generate the difference image extracted by the difference image extraction means 322, and takes the rotation angle and enlargement ratio of that dictionary image as the rotation angle and enlargement ratio of the input image of the completed form B.
In determining the rotation angle and enlargement ratio, a condition may be imposed that the minimum area is no greater than a predetermined threshold.
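The selection step can be sketched as follows, assuming each candidate pairs the dictionary parameters with a binary difference image in which 1 marks a differing pixel. The candidate layout, the function names, and returning `None` when the optional threshold is exceeded are all assumptions for illustration.

```python
def difference_area(diff):
    """Total number of differing pixels in a binary difference image."""
    return sum(sum(row) for row in diff)


def detect_rotation_and_scale(candidates, max_area=None):
    """Pick the parameters whose difference image has the smallest area.

    `candidates` is a list of ((angle_deg, scale_percent), diff_image).
    If `max_area` is given and even the best candidate exceeds it,
    None is returned to signal that no reliable match was found.
    """
    params, diff = min(candidates, key=lambda c: difference_area(c[1]))
    if max_area is not None and difference_area(diff) > max_area:
        return None
    return params


candidates = [
    ((-1, 0), [[1, 1], [0, 1]]),  # area 3
    ((2, 4), [[0, 0], [0, 1]]),   # area 1 (best match)
    ((3, 4), [[1, 0], [0, 1]]),   # area 2
]
detected = detect_rotation_and_scale(candidates, max_area=2)
```

The threshold guards against accepting a poor match, for example when the scanned sheet is not the expected form at all.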

The reference image rotation/enlargement unit 33 converts the image data of the reference image stored in the reference image storage unit 41 by the rotation angle and enlargement ratio obtained by the rotation angle/enlargement ratio detection unit 32.
This conversion compensates, in relative terms, for the rotation and enlargement or reduction distortion present in the input image of the completed form B.
The reference image dividing unit 34 divides the reference image converted by the reference image rotation/enlargement unit 33 into a plurality of small regions.
The processing is the same as that of the image dividing unit 21: a plurality of divided images is generated by dividing the reference image one or more times in the vertical direction (rows) and the horizontal direction (columns).

The alignment unit 35 aligns the input image of the completed form B with the divided images of the reference image generated by the reference image dividing unit 34.
Specifically, as shown in FIG. 4, the alignment unit 35 has reference alignment means 351, image shifting means 352, small-region difference image generation means 353, small-region difference image extraction means 354, and position determination means 355, and functions as the alignment unit 35 through the sequence of operations of these means.

The reference positioning means 351 superimposes the input image of the completed form B on each small area reference image generated by the reference image dividing unit 34. Specifically, the two images are aligned at a predetermined reference position (for example, the center of gravity of the image).
The image shifting means 352 shifts the position of each small area reference image, or of the input image of the completed form, within a certain range from the reference position.
Fine adjustment of the image position in this way improves the accuracy of form-out.

The small area difference image generating means 353 generates difference images between each small area reference image and the completed form image at the positions within the certain range covered by the image shifting means 352.
The small area difference image extracting means 354 extracts, for each small area, the difference image with the smallest absolute difference from among the difference images generated by the small area difference image generating means 353.
The position determining means 355 adopts, as the position of each image, the position of the small area reference image or completed form image that was used to generate the difference image extracted by the small area difference image extracting means 354.
The output image generation unit 36 then generates, for each small area reference image, a difference image with respect to the completed form image at the position determined by the alignment unit 35, and outputs the composite of these difference images.

[Storage device 40]
The storage device 40 includes a reference image storage unit 41 and a dictionary image storage unit 42.
The reference image storage unit 41 stores the image of the blank form A received from the input device 10 (image input unit 11).
The dictionary image storage unit 42, on the other hand, stores the images of the blank form A that have been divided and converted by the data processing device 20.

[Form-out method]
Next, the operation flow of the OCR apparatus of this embodiment, configured as described above, will be described with reference to FIGS. 5 and 6.
FIG. 5 is a flowchart showing the dictionary image generation flow in the OCR apparatus according to an embodiment of the present invention.
FIG. 6 is a flowchart showing the form-out flow for a completed form in the OCR apparatus according to an embodiment of the present invention.

[Dictionary image generation]
First, the dictionary image generation flow in the OCR apparatus according to an embodiment of the present invention will be described with reference to FIG. 5.
The image data of the blank form A input by the input device 10 is first supplied to the reference image storage unit 41 and to the data processing device 20 (image dividing unit 21) (step A1).
As shown in FIG. 7, the image dividing unit 21 divides all or part of the image data of the blank form A into a plurality of small areas (step A2).
For example, dividing the image into M rows vertically and N columns horizontally (where M and N are natural numbers) forms M × N small areas in a grid pattern.
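
The grid division of step A2 can be sketched as below. This is an illustrative sketch only: the image is assumed to be a list of pixel rows, `split_grid` is a hypothetical name, and the rule for placing tile boundaries (integer division, so tiles differ by at most one pixel) is an assumption that the description does not specify.

```python
def split_grid(image, m, n):
    """Split a 2-D image (list of rows) into m rows x n columns of tiles.

    Returns a dict mapping (i, j) to (top, left, tile), where top/left is the
    tile's position in the original image and tile is a list of pixel rows.
    """
    height, width = len(image), len(image[0])
    if not (1 <= m <= height and 1 <= n <= width):
        raise ValueError("m and n must be between 1 and the image size")
    # Boundary positions; integer division spreads any remainder evenly.
    row_edges = [i * height // m for i in range(m + 1)]
    col_edges = [j * width // n for j in range(n + 1)]
    tiles = {}
    for i in range(m):
        for j in range(n):
            top, left = row_edges[i], col_edges[j]
            tile = [row[left:col_edges[j + 1]]
                    for row in image[top:row_edges[i + 1]]]
            tiles[(i, j)] = (top, left, tile)
    return tiles


# A 4 x 6 test image whose pixel value encodes its position.
image = [[r * 6 + c for c in range(6)] for r in range(4)]
tiles = split_grid(image, 2, 3)
print(len(tiles))     # number of small areas (M x N)
print(tiles[(1, 2)])  # (top, left, pixels) of the bottom-right tile
```

Keeping each tile's `top` and `left` alongside its pixels makes it straightforward to place the tile back on the form during the later alignment step.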

Next, the feature extraction unit 22 extracts characteristic data (a feature amount) from each divided image generated in step A2 (step A3). The feature amount represents the complexity of the ruled lines, characters, patterns, and the like contained in a small area. Specifically, it can be obtained by binarizing the divided image into white and black pixels and calculating, for example, the sum of the contour lengths of the black pixel portions.
The maximum feature region extraction unit 23 then extracts, from the M × N divided images, the region whose feature amount extracted in step A3 is the largest (the maximum feature region) (step A4). That is, it extracts the divided image with the largest contour length.
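
Steps A3 and A4 can be sketched as follows. The description does not say how contour length is measured, so this sketch assumes one simple definition: the number of pixel edges that separate a black pixel (1) from a white pixel (0) or from the tile border. The names `contour_length` and `max_feature_region` are hypothetical.

```python
def contour_length(tile):
    """Approximate total contour length of the black (1) pixels in a tile.

    Assumed measure: every pixel edge between a black pixel and either a
    white pixel or the tile border counts as one unit of contour.
    """
    height, width = len(tile), len(tile[0])
    total = 0
    for y in range(height):
        for x in range(width):
            if tile[y][x] != 1:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                outside = not (0 <= ny < height and 0 <= nx < width)
                if outside or tile[ny][nx] == 0:
                    total += 1
    return total


def max_feature_region(tiles):
    """Return the key of the binarized tile with the largest contour length."""
    return max(tiles, key=lambda key: contour_length(tiles[key]))


tiles = {
    (0, 0): [[0, 0, 0], [0, 0, 0], [0, 0, 0]],  # blank area
    (0, 1): [[0, 0, 0], [0, 1, 0], [0, 0, 0]],  # single dot
    (1, 0): [[0, 0, 0], [1, 1, 1], [0, 0, 0]],  # ruled line
}
print(max_feature_region(tiles))
```

Under this measure a blank area scores zero and areas with more line work score higher, which matches the stated intent that the feature amount reflects complexity.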

Next, the enlargement / rotation processing unit 24 converts the image of the region extracted in step A4 using various rotation angles and magnification rates (step A5).
Specifically, as shown in FIG. 8, the rotation angle θ is varied in 1° steps over the range −6° ≤ θ ≤ 6°, and the magnification rate R is varied in steps of 0.02 over the range 0.90 ≤ R ≤ 1.10. Converting the image with each combination (13 angles × 11 rates) generates 143 processed images.
The rotation angle and magnification rate of the divided image are restricted to a fixed range because, if the range is too wide, the correction may not keep up and large errors may occur, whereas if it is too narrow, the range over which correction applies becomes small and the function of the present invention cannot be fully exhibited.
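
The count of 143 patterns follows directly from enumerating the combinations, as the short sketch below shows. The function name `parameter_grid` and its parameters are hypothetical; the default values are the ranges and steps given for this embodiment.

```python
def parameter_grid(angle_min=-6, angle_max=6, angle_step=1,
                   rate_min=0.90, rate_max=1.10, rate_step=0.02):
    """List every (rotation angle in degrees, magnification rate) pair."""
    n_angles = round((angle_max - angle_min) / angle_step) + 1
    n_rates = round((rate_max - rate_min) / rate_step) + 1
    angles = [angle_min + k * angle_step for k in range(n_angles)]
    # Rounding avoids floating-point drift in the accumulated rate values.
    rates = [round(rate_min + k * rate_step, 4) for k in range(n_rates)]
    return [(theta, r) for theta in angles for r in rates]


grid = parameter_grid()
print(len(grid))          # 13 angles x 11 rates
print(grid[0], grid[-1])  # first and last combination
```

Because the ranges and steps are parameters, the same function covers the case where the range or the number of patterns is changed for a particular input device.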

However, because the degree of distortion in the input image differs from user to user, for example because of differences between input devices, the ranges of the rotation angle and magnification rate and the number of patterns can be changed.
In the actual conversion of image data, the vertical direction of the region is taken as the y-axis and the horizontal direction as the x-axis, and with the original image coordinates written as (x, y) and the converted image coordinates as (x′, y′), the following linear transformation is used.
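
Assuming the conventional form of a rotation by θ about the origin combined with a uniform scaling by R, a linear transformation of this kind can be written as shown below. The sign convention for θ and the choice of origin are assumptions made for illustration.

```latex
\[
\begin{pmatrix} x' \\ y' \end{pmatrix}
= R
\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix},
\qquad\text{that is,}\qquad
\begin{aligned}
x' &= R\,(x\cos\theta - y\sin\theta),\\
y' &= R\,(x\sin\theta + y\cos\theta).
\end{aligned}
\]
```

With θ = 0 and R = 1 this reduces to the identity, so the unconverted region is one of the patterns in the ranges given above.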

All of the processed image data is then sent to the storage device 40 and stored in memory (dictionary image storage unit 42) as dictionary images (step A6).
The dictionary images generated through the above steps are required for the subsequent form-out processing. In summary, they are used to determine the distortion (rotation, enlargement/reduction, and so on) of the input image of a completed form, which in turn enables reliable form-out. The details of form-out using the dictionary images are described in the form-out flow below.

[Form-out]
Next, the form-out flow for a completed form in the OCR apparatus according to an embodiment of the present invention will be described with reference to FIG. 6.
First, the input device 10 captures the image data of the completed form B (step B1) and outputs it to the data processing device 20.
In the data processing device 20, as shown in FIG. 9, the difference image generation unit 31 raster-scans every dictionary image over the input image of the completed form B received from the input device 10 and generates a difference image for each (step B2).
The difference images generated in step B2 are then used to detect the rotation angle and magnification rate of the input image of the completed form B (step B3).

Specifically, each difference image is binarized into black and white pixels, and the area of the differing portion (for example, the white pixel portion) is measured. The difference image for which this area is smallest is then extracted. Finally, the dictionary image used to generate that difference image is identified, and its rotation angle and magnification rate are taken as the rotation angle and magnification rate of the completed form image.
For example, if the dictionary image behind the extracted difference image was obtained by converting the blank form image with a rotation angle of 5° and a magnification rate of 0.90, the input image of the completed form is judged to have been input with a deformation of 5° in rotation angle and 0.90 in magnification rate.
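
Steps B2 and B3 can be sketched as below. This is a simplified illustration under stated assumptions: images are binary lists of rows, the differing area is counted as the number of mismatching pixels, every dictionary image fits inside the form image, and the names `mismatch_area`, `min_area_by_raster_scan`, and `detect_parameters` are hypothetical.

```python
def mismatch_area(window, template):
    """Count the pixels at which two equally sized binary images differ."""
    return sum(a != b
               for row_w, row_t in zip(window, template)
               for a, b in zip(row_w, row_t))


def min_area_by_raster_scan(form, template):
    """Slide the template over the form; return the smallest differing area."""
    fh, fw = len(form), len(form[0])
    th, tw = len(template), len(template[0])
    best = None
    for top in range(fh - th + 1):          # raster scan: rows, then columns
        for left in range(fw - tw + 1):
            window = [row[left:left + tw] for row in form[top:top + th]]
            area = mismatch_area(window, template)
            if best is None or area < best:
                best = area
    return best


def detect_parameters(form, dictionary):
    """Return the (angle, rate) key of the best-fitting dictionary image."""
    areas = {key: min_area_by_raster_scan(form, image)
             for key, image in dictionary.items()}
    return min(areas, key=areas.get)


form = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0]]
dictionary = {(0, 1.00): [[1, 1], [1, 0]],   # toy stand-ins for converted
              (5, 0.90): [[0, 1], [1, 1]]}   # maximum feature regions
print(detect_parameters(form, dictionary))
```

On a real completed form the smallest area is rarely zero, since entered characters also contribute to the difference, which is why only the relative ordering of the areas is used.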

As described above, the dictionary images are images converted within a predetermined range (in this embodiment, rotation angles from −6° to +6° and magnification rates from 0.90 to 1.10), so the rotation angles and magnification rates detectable in step B3 are limited to the same range.
That is, if the input image of the completed form is deformed beyond this range, its rotation angle and magnification rate cannot be detected, an error code indicating that form-out is impossible is output, and the process ends (step B4: NO).
If the rotation angle and magnification rate are detected, on the other hand, the process proceeds to the next step (step B5) (step B4: YES).

The reference image is then converted using the rotation angle and magnification rate obtained through step B4 (step B5) and divided into small areas of M rows and N columns (where M and N are natural numbers) (step B6).
Next, for each of the divided small areas, alignment between the input image of the completed form B and the divided reference image is attempted (step B7).

Specifically, as shown in FIG. 10, the converted and divided reference image is first placed at a designated position on the input image of the completed form (for example, with the centers of gravity of the two images coinciding). Taking this state as the reference, the reference image is shifted within a range of ±K pixels in the x direction and ±L pixels in the y direction (where K and L are natural numbers), and a difference image between the completed form image and the reference image is generated at each shifted position.
In doing so, no difference is taken in the blank (white) portions of the reference image; the difference there is fixed at 0 (zero). This is because characters may have been entered in those portions of the completed form.

Further, as shown in FIG. 11, at each shifted position the sum of the absolute values of the differences in each small area, Sum_{i,j}(p, q) (0 ≤ i ≤ M−1, 0 ≤ j ≤ N−1, −K ≤ p ≤ K, −L ≤ q ≤ L), is obtained from the difference image.
The sums Sum_{i,j}(p, q) at the various shifted positions are then compared and, for each small area (i, j), the shifted position (p(i, j), q(i, j)) at which Sum_{i,j}(p, q) is smallest is retained.
The specific formula is as shown in FIG. 11.
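
The shift search of step B7 for a single small area can be sketched as follows. This is not the formula of FIG. 11 itself but a minimal sketch of the procedure described in the text. It assumes binary images stored as lists of rows with 0 for blank pixels; the names `sum_abs_diff` and `best_shift` and the parameters `max_dx` (K) and `max_dy` (L) are hypothetical.

```python
def sum_abs_diff(ref_tile, form, top, left):
    """Sum of |reference - form| with the tile placed at (top, left).

    Blank reference pixels (0) contribute nothing, since characters may be
    written there. Returns None if the tile would fall outside the form.
    """
    th, tw = len(ref_tile), len(ref_tile[0])
    if top < 0 or left < 0 or top + th > len(form) or left + tw > len(form[0]):
        return None
    total = 0
    for y in range(th):
        for x in range(tw):
            if ref_tile[y][x] != 0:  # difference fixed at 0 on blank parts
                total += abs(ref_tile[y][x] - form[top + y][left + x])
    return total


def best_shift(ref_tile, form, base_top, base_left, max_dx, max_dy):
    """Try every shift p in [-max_dx, max_dx], q in [-max_dy, max_dy].

    Returns (minimum sum, p, q), or None if no shifted position fits.
    """
    best = None
    for q in range(-max_dy, max_dy + 1):
        for p in range(-max_dx, max_dx + 1):
            s = sum_abs_diff(ref_tile, form, base_top + q, base_left + p)
            if s is not None and (best is None or s < best[0]):
                best = (s, p, q)
    return best


ref_tile = [[1, 1],
            [1, 0]]                 # corner of a ruled box; 0 is blank
form = [[0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [0, 0, 0, 1, 1],
        [0, 0, 0, 1, 1],            # bottom-right 1 is an entered mark
        [0, 0, 0, 0, 0]]
print(best_shift(ref_tile, form, base_top=1, base_left=2, max_dx=1, max_dy=1))
```

In the usage example the entered mark lies on a blank pixel of the reference, so it does not disturb the search and the ruled corner is still matched exactly.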

Once the positions p(i, j) and q(i, j) have been determined in this way (step B8: YES), a difference image is generated for each small area as shown in FIG. 12, and these are joined together to produce the overall difference image (step B9).
If, on the other hand, the input image has been deformed beyond the allowable range or has a format different from that of the reference image, the alignment is judged to have failed, an error code indicating that form-out is impossible is output, and the process ends (step B8: NO).
By performing the form-out processing according to the above steps, only the character images to be read are extracted into the difference image, as shown in FIG. 13.
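
Step B9 can be sketched as below for binary images. The sketch assumes one particular reading of "difference image": pixels that are black in the aligned reference tile are cleared from a copy of the completed form, so that only entered marks remain. The function name `form_out` and the `placements` structure are hypothetical.

```python
def form_out(form, placements):
    """Erase aligned reference tiles from the form, leaving entered marks.

    placements is a list of (ref_tile, top, left): each small area reference
    image with the position on the form chosen by the alignment step.
    """
    output = [row[:] for row in form]  # work on a copy of the completed form
    for ref_tile, top, left in placements:
        for y, ref_row in enumerate(ref_tile):
            for x, ref_pixel in enumerate(ref_row):
                if ref_pixel == 1:     # pre-printed form pixel: remove it
                    output[top + y][left + x] = 0
    return output


form = [[1, 1, 0, 0],
        [1, 0, 1, 0]]                  # ruled corner plus one entered mark
placements = [([[1, 1], [1, 0]], 0, 0),
              ([[0, 0], [0, 0]], 0, 2)]
for row in form_out(form, placements):
    print(row)
```

Because each tile carries its own position, tiles that were shifted by different amounts during alignment are each erased where they actually lie on the form, which is how local distortion is absorbed.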

As described above, in the OCR apparatus 1 of this embodiment, the image dividing unit 21 divides the reference image into grid-like partial images, the maximum feature region detecting unit 23 extracts only the most distinctive small area among them, the enlargement / rotation processing unit 24 converts that small area with multiple combinations of rotation angle and magnification rate, and the rotation angle / magnification rate detection unit 32 superimposes each converted pattern on the input image. This allows the enlargement/reduction rate and skew angle of the form to be determined and corrected.
As a result, the output image generation unit 36 can reliably produce form-out output.

Consequently, even if size errors, printing misalignment, expansion or contraction, or the like occur in the input image, these distortions are absorbed and form-out processing can be performed reliably.
In addition, dividing the image and processing it in units of small areas makes it possible to handle local distortions, so highly accurate form-out can be achieved.
Furthermore, the alignment marks and ruled lines that conventional forms required are no longer necessary, which improves convenience.
The frequency of ruled-line removal errors that used to occur is also reduced, which makes it possible to cut the man-hours needed for data entry.

A preferred embodiment of the OCR apparatus of the present invention has been described above, but the OCR apparatus according to the present invention is not limited to that embodiment, and it goes without saying that various modifications can be made within the scope of the present invention.
For example, the combinations of deformations (rotation and enlargement) applied to the maximum feature region when generating dictionary images may be changed to any range or step size.
When generating dictionary images, the magnification rate applied to the maximum feature region may also differ between the vertical and horizontal directions.
Furthermore, the apparatus may be provided with a form identification function that stores a plurality of reference images and handles a plurality of forms.

The present invention can be suitably used for an OCR apparatus including an image input means, a data processing means, and a storage means.

FIG. 1 is a block diagram showing the main configuration of an OCR apparatus according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the detailed configuration of the difference image generation unit of the data processing device in the OCR apparatus according to an embodiment of the present invention.
FIG. 3 is a block diagram showing the detailed configuration of the rotation angle / magnification rate detection unit of the data processing device in the OCR apparatus according to an embodiment of the present invention.
FIG. 4 is a block diagram showing the detailed configuration of the alignment unit of the data processing device in the OCR apparatus according to an embodiment of the present invention.
FIG. 5 is a flowchart showing the dictionary image generation flow in the OCR apparatus according to an embodiment of the present invention.
FIG. 6 is a flowchart showing the form-out flow for a completed form in the OCR apparatus according to an embodiment of the present invention.
FIG. 7 is a diagram showing the divided reference image and the maximum feature region in the OCR apparatus according to an embodiment of the present invention.
FIG. 8 is a diagram showing the rotation and enlargement of the maximum feature region (examples of dictionary images) performed by the OCR apparatus according to an embodiment of the present invention.
FIG. 9 is a diagram showing the raster scan of a dictionary image over the completed form image performed by the OCR apparatus according to an embodiment of the present invention.
FIG. 10 is a diagram showing the image shifting means used in the alignment performed by the OCR apparatus according to an embodiment of the present invention.
FIG. 11 is a diagram showing the sum of absolute differences at each position in the alignment performed by the OCR apparatus according to an embodiment of the present invention.
FIG. 12 is a diagram showing the difference image produced by generating a difference image for each small area and combining them in the OCR apparatus according to an embodiment of the present invention.
FIG. 13 is a diagram showing the output image after form-out in the OCR apparatus according to an embodiment of the present invention.

Explanation of symbols

1 OCR device
10 Input device
20 Data processing device
40 Storage device
A Blank form
B Completed form

Claims

An OCR device that extracts and outputs a character image entered from image data of a form input by an optical reading means,
An image input unit for inputting image data of a form by optical reading means;
A reference image storage unit for storing the image data of the input blank form as a reference image;
A dictionary image storage unit that converts a part of the image data of the input blank form with a predetermined rotation angle and enlargement ratio and stores it as a dictionary image;
A difference image generation unit for generating a difference image between the input image data of the completed form and the dictionary image;
A rotation angle / magnification rate detection unit that detects a rotation angle and a magnification rate of the completed form image based on the difference image generated by the difference image generation unit;
A reference image rotation / enlargement unit that converts the reference image according to the rotation angle and the enlargement rate detected by the rotation angle / enlargement rate detection unit;
An OCR apparatus comprising: an output image generation unit that generates and outputs a difference image between the reference image converted by the reference image rotation / enlargement unit and the blank image.

An image dividing unit that generates a plurality of divided images by dividing the image data of the input blank form into one or more rows and columns;
A feature extraction unit that binarizes the divided image generated by the image dividing unit and calculates a sum of contour lengths of images formed from one of the pixels obtained by binarization;
A maximum feature region detecting unit for detecting a divided image having the largest sum of the contour lengths from among the divided images generated by the image dividing unit;
An enlargement / rotation processing unit that converts the divided image detected by the maximum feature region detection unit into a plurality of image data with a predetermined rotation angle and enlargement ratio;
The dictionary image storage unit
The OCR apparatus according to claim 1, wherein a plurality of image data converted by the enlargement / rotation processing unit is stored as a dictionary image.

The difference image generation unit
Raster scanning means for raster-scanning the completed form image and obtaining predetermined image data;
An area extracting means for extracting an area corresponding to the area of the dictionary image from the image data acquired by the raster scanning means;
3. The OCR apparatus according to claim 1, further comprising: a difference image generation unit configured to generate a difference image between the completed form image and the dictionary image in the extracted area.

The rotation angle / magnification detection unit is
Pixel area calculation means for binarizing the difference image generated by the difference image generation unit and calculating the total area of one of the pixels obtained by binarization;
A difference image extraction means for extracting a difference image having the smallest total area from the difference images generated by the difference image generation unit;
A rotation angle / magnification rate determination unit that sets the rotation angle and magnification of the dictionary image used to generate the difference image extracted by the difference image extraction unit as the rotation angle and magnification of the completed form image. The OCR device according to any one of claims 1 to 3, wherein

  A reference image dividing unit that generates a plurality of small region reference images by dividing the reference image converted by the reference image rotation / enlargement unit into one or more rows and columns;
  An alignment unit for aligning the position of the small area reference image and the input form image,
  The output image generation unit generates and outputs a difference image between each small region image and a completed form image after the alignment unit performs alignment for each small region. The OCR apparatus according to any one of claims 1 to 4.

  The alignment unit is
  A standard positioning means for superimposing the completed form image on the basis of a predetermined designated position for each small area reference image generated by the reference image dividing unit;
  Image shifting means for moving each small area reference image and / or the completed form image within a certain range based on the specified position;
  A small area difference image generating means for generating a difference image between each small area reference image and the completed form image within the range;
  A small area difference image extracting means for extracting, for each small area, a difference image having the smallest difference absolute value from the generated difference images;
  Position determining means for determining the position of the image based on the position of each small area reference image and / or the input form image related to the generation of each differential image extracted by the small area difference image extracting means. The OCR apparatus according to claim 5.

  A form-out method for extracting and outputting a character image entered from image data of a form input by an optical reading means,
  A step of inputting image data of a blank form,
  Storing the input blank image data as a reference image;
  Converting a part of the image data of the input blank form with a predetermined rotation rate and enlargement rate and storing it as a dictionary image;
  The step of inputting the image data of the completed form,
  Generating a difference image between the image data of the completed form and a dictionary image;
  Detecting a rotation rate and an enlargement rate of the completed form image based on the difference image;
  Converting the reference image according to the detected rotation rate and magnification rate;
  And a step of generating and outputting a difference image between the converted reference image and the blank form image.

  A computer for extracting and outputting a character image entered from the image data of the form input by the optical reading means,
  Means for inputting image data of a form by the optical reading means;
  Means for storing the image data of the entered blank form as a reference image;
  Means for converting a part of the image data of the entered blank form with a predetermined rotation angle and enlargement ratio and storing it as a dictionary image;
  Means for generating a difference image between the image data of the entered completed form and the dictionary image;
  Means for detecting a rotation angle and an enlargement ratio of the completed form image based on the difference image;
  Means for converting the reference image according to the detected rotation angle and magnification;
  Means for generating and outputting a difference image between the converted reference image and the blank form image;
  Form-out program to function as.