JP6138038B2 - Form identification device and form identification method

Info

Publication number: JP6138038B2
Application number: JP2013258863A
Authority: JP
Inventors: 裕介伊谷
Original assignee: Mitsubishi Electric Corp; Mitsubishi Electric Information Systems Corp
Current assignee: Mitsubishi Electric Corp; Mitsubishi Electric Information Systems Corp
Priority date: 2013-12-16
Filing date: 2013-12-16
Publication date: 2017-05-31
Anticipated expiration: 2033-12-16
Also published as: JP2015115025A

Description

The present invention relates to a technique for analyzing image data of a form and identifying the form.

Techniques for identifying a form by using feature points in the form have been disclosed. FIG. 14 is a diagram showing a processing example of a conventional form recognition device. FIG. 14(a) shows an input image 90 that is input to the form recognition device, and FIG. 14(b) shows a sample image 91 prepared in advance.
First, the form identification device detects margins in the binarized input image 90 and extracts marker regions 90a from the detected margins. Using the extracted marker regions 90a as feature points of the input image 90, the device searches for the corresponding points among the marker regions 91a of the sample image 91 shown in FIG. 14(b). Based on the search result, the device calculates the displacement between the input image 90 and the sample image 91, derives the ID region 90b of the input image 90 from the ID region 91b of the sample image, and determines the form ID indicated by the input image 90.

As a method for searching the sample image 91 for the marker region 91a that corresponds to a marker region 90a extracted from the input image 90, Patent Document 1, for example, discloses a method that compares the coordinates of all marker regions on the sample image with those of the marker regions extracted from the input image, and takes the marker regions that match most closely between the sample image and the input image as the search result.

Patent Document 1: JP 2001-220341 A

However, the technique disclosed in Patent Document 1 requires the positions of the marker regions extracted from the input image to match the positions of the marker regions in the sample image. When the marker regions are displaced from form to form, or when they are difficult to extract because of noise or the like, the form identifier cannot be read and the analysis accuracy decreases. In addition, a plurality of marker regions must be arranged on the form, which limits the freedom of the form layout.

The present invention has been made to solve the above problems. Its object is to recognize the form identifier and identify the form, without restricting the layout of the form, even when the marker regions are displaced from form to form or when the marker regions are difficult to extract from the image data.

The form identification device according to the present invention includes: a character region extraction unit that sets, in the image data of a form, a region in which a histogram indicating the frequency of occurrence of black pixels is to be generated, based on a region set in a sample image stored in advance and on the image size of the image data of the form, generates the histogram from the image data of the form within the set region, and extracts a character region based on the generated histogram; a character recognition unit that recognizes a character string contained in the character region extracted by the character region extraction unit; and an identifier recognition unit that refers to form recognition information, in which character strings and form identifiers are stored in association with each other in advance, acquires the form identifier corresponding to the character string recognized by the character recognition unit, and identifies the form.

According to the present invention, the form identifier can be recognized and the form identified even when the marker regions are displaced from form to form or when the marker regions are difficult to extract from the image data. In addition, the restrictions that marker regions impose on the layout of the form can be reduced.

FIG. 1 is a block diagram showing the configuration of the form identification device according to Embodiment 1.
FIG. 2 is a diagram showing the affine transformation performed by the image correction unit of the form identification device according to Embodiment 1.
FIG. 3 is a diagram showing histogram generation by the character region extraction unit of the form identification device according to Embodiment 1.
FIG. 4 is a diagram showing an example of the ID recognition information stored in the ID recognition information storage unit of the form identification device according to Embodiment 1.
FIG. 5 is a flowchart showing the operation of the form identification device according to Embodiment 1.
FIG. 6 is a diagram showing the calculation of the inclination angle of a ruled line by the image correction unit of the form identification device according to Embodiment 1.
FIG. 7 is a diagram showing the setting of the histogram generation position by the character region extraction unit of the form identification device according to Embodiment 1.
FIG. 8 is a diagram showing the extraction of a character region by the character region extraction unit of the form identification device according to Embodiment 1.
FIG. 9 is a block diagram showing the configuration of the form identification device according to Embodiment 2.
FIG. 10 is a diagram showing the comparison of histograms by the ID region detection unit of the form identification device according to Embodiment 2.
FIG. 11 is a diagram showing an example of the ID read by the ID recognition unit of the form identification device according to Embodiment 2.
FIG. 12 is a flowchart showing the operation of the form identification device according to Embodiment 2.
FIG. 13 is a diagram showing an example in which the ID region detection unit of the form identification device according to Embodiment 2 detects a positional displacement using a character region.
FIG. 14 is a diagram showing a processing example of a conventional form identification device.

Embodiment 1.
Embodiment 1 describes processing that extracts a character region contained in input image data as a region for identifying a form (hereinafter referred to as a form identifier region) and identifies the form using the extracted form identifier region.
FIG. 1 is a block diagram showing the configuration of the form identification device according to Embodiment 1.
The form identification device 10 includes a binarization processing unit 1, an image correction unit 2, a character region extraction unit 3, a character recognition unit 4, an ID recognition information storage unit 5, and an ID recognition unit (identifier recognition unit) 6.
The binarization processing unit 1 binarizes the input image data. The image correction unit 2 extracts ruled line information from the binarized image and calculates the inclination of the extracted ruled lines. Based on the calculated inclination, it corrects the inclination of the image by affine transformation.

FIG. 2 is a diagram showing the affine transformation performed by the image correction unit of the form identification device according to Embodiment 1. FIG. 2(a) shows the binarized image before inclination correction, and FIG. 2(b) shows the binarized image after inclination correction.
The image correction unit 2 extracts ruled lines 11a, 11b, 11c, 11d, 11e, 11f, 11g, and 11h from the binarized image shown in FIG. 2(a) and detects the inclination of each extracted ruled line. Based on the detected inclinations, it applies an affine transformation that corrects the binarized image in the direction of arrow A, yielding the inclination-corrected binarized image shown in FIG. 2(b). The details of inclination detection and of the correction based on the inclination are described later.

The character region extraction unit 3 generates a histogram indicating the frequency of occurrence of black pixels from the corrected image produced by the image correction unit 2, and extracts a character region based on the distribution of the generated histogram. If character region extraction is applied to the entire image data of the form, it is easily affected by ruled lines and noise in the form. Therefore, a target region for character region extraction is set in advance and the histogram is generated only within that target region, which improves the analysis accuracy and reduces the amount of calculation. One applicable way of setting the target region is, for example, to set the target region in advance on the sample image used for comparison and to enlarge or reduce it according to the image size of the input form.

FIG. 3 is a diagram showing histogram generation by the character region extraction unit of the form identification device according to Embodiment 1.
A target region 12a is set in the corrected image 12 shown in FIG. 3, and a histogram 12b is generated from the numbers of black pixels in the X direction and the Y direction of the target region 12a. The generated histogram 12b is normalized by the image size of the corrected image 12.

The character recognition unit 4 recognizes the character string contained in the character region extracted by the character region extraction unit 3. The ID recognition information storage unit 5 is a storage area that stores ID recognition information in which character strings are associated with form IDs. FIG. 4 is a diagram showing an example of the ID recognition information stored in the ID recognition information storage unit of the form identification device according to Embodiment 1. The ID recognition unit 6 refers to the ID recognition information (identifier recognition information) stored in the ID recognition information storage unit 5, acquires the ID (identifier) of the form from the character string recognized by the character recognition unit 4, and identifies the input form.

Next, the operation of the form identification device 10 is described.
FIG. 5 is a flowchart showing the operation of the form identification device according to Embodiment 1.
The binarization processing unit 1 binarizes the image data (step ST1). The image correction unit 2 extracts the ruled lines needed for image correction from the binary image data produced in step ST1 (step ST2). As the ruled line extraction method, for example, the method of Reference 1 below can be applied.
・Reference 1
Takashi Hirano, Yasuhiro Okada, Fumio Yoda, "Ruled Line Extraction Method from Document Images", IEICE General Conference, Mar. 1998

Next, the image correction unit 2 calculates the inclination of the ruled lines extracted in step ST2 (step ST3). The processing of step ST3 is described in more detail with reference to FIG. 6.
FIG. 6 is a diagram showing the calculation of the inclination angle of a ruled line by the image correction unit of the form identification device according to Embodiment 1. FIG. 6(a) shows an example of the extracted ruled lines, FIG. 6(b) is an explanatory diagram showing the calculation of the inclination angle of the ruled lines in the X direction, and FIG. 6(c) is an explanatory diagram showing the calculation of the inclination angle of the ruled lines in the Y direction.
In the ruled line extraction of FIG. 6(a), the line segments 13a and 13b in the image 13 are ruled lines in the X direction, and the line segments 13c and 13d are ruled lines in the Y direction. As shown in FIG. 6(b), the angle θ_t formed between each of the line segments 13a and 13b and a line segment 13x parallel to the X direction of the image 13 is calculated as the inclination angle θ_t of the ruled lines 13a and 13b. Likewise, as shown in FIG. 6(c), the angle θ_t formed between each of the line segments 13c and 13d and a line segment 13y parallel to the Y direction of the image 13 is calculated as the inclination angle θ_t of those ruled lines. This inclination calculation is performed for all the ruled lines extracted in step ST2.
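The angle calculation of step ST3 can be sketched as follows. This is a minimal illustration, not the method of the patent: the function name `ruled_line_angle`, the endpoint representation, and the sign convention (a positive angle for a ruled line that has been rotated from the X axis toward the Y axis) are all assumptions introduced here.

```python
import math


def ruled_line_angle(x0, y0, x1, y1, direction):
    """Return the inclination angle (radians) of one ruled line.

    direction is "x" for a nominally horizontal ruled line (measured against
    a line parallel to the X axis) and "y" for a nominally vertical one
    (measured against a line parallel to the Y axis). Hypothetical helper.
    """
    dx, dy = x1 - x0, y1 - y0
    if direction == "x":
        # Order the endpoints so the segment points in the +X direction.
        if dx < 0:
            dx, dy = -dx, -dy
        return math.atan2(dy, dx)
    if direction == "y":
        # Order the endpoints so the segment points in the +Y direction.
        if dy < 0:
            dx, dy = -dx, -dy
        # A Y-direction line rotated by theta points along (-sin, cos).
        return math.atan2(-dx, dy)
    raise ValueError("direction must be 'x' or 'y'")
```

With this convention, a horizontal and a vertical ruled line on the same skewed page yield the same angle, so the values can be pooled across both directions.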

Next, the image correction unit 2 calculates cos θ and sin θ, expressed by Equation (1), using the inclination angles θ_t of all the ruled lines calculated in step ST3 (step ST4).

That is, the average of the inclinations of all the ruled lines is taken as the inclination of the input image.
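Equation (1) itself is not reproduced in this text. One plausible reading of "the average of the inclinations of all the ruled lines" is to average cos θ_t and sin θ_t over the ruled lines. The sketch below assumes that reading; the name `mean_skew` and the final renormalization to a unit vector are additions made here, not taken from the source.

```python
import math


def mean_skew(angles):
    """Average per-line angles (radians) into a single (cos, sin) pair.

    Assumption: the image inclination is the mean of cos(theta_t) and
    sin(theta_t) over all ruled lines, renormalized to unit length so the
    pair describes a pure rotation.
    """
    if not angles:
        raise ValueError("at least one ruled line angle is required")
    n = len(angles)
    c = sum(math.cos(t) for t in angles) / n
    s = sum(math.sin(t) for t in angles) / n
    # For the small skews of scanned forms the norm is close to 1; it would
    # only vanish if the angles pointed in opposite directions.
    norm = math.hypot(c, s)
    return c / norm, s / norm
```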

Further, the image correction unit 2 performs image correction by applying the affine transformation of Equation (2), using the cos θ and sin θ calculated in step ST4 (step ST5).

In Equation (2), (x', y') denotes a position in the image after correction, and (x, y) denotes the corresponding position in the image before correction.
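Equation (2) is likewise not reproduced here. A skew correction by affine transformation is commonly a rotation by the negative of the detected angle, and the sketch below assumes that form. The rotation about the image origin, the sign convention, and the name `deskew_point` are assumptions; the patent's exact matrix may differ, for example by a translation term.

```python
import math


def deskew_point(x, y, cos_t, sin_t):
    """Map a pre-correction position (x, y) to its corrected position.

    Assumed form: rotation by -theta about the origin, which undoes a skew
    of +theta.
    """
    x_new = cos_t * x + sin_t * y
    y_new = -sin_t * x + cos_t * y
    return x_new, y_new


# A point lying on a ruled line skewed by 0.1 rad lands back on the X axis.
theta = 0.1
px, py = 10 * math.cos(theta), 10 * math.sin(theta)
corrected = deskew_point(px, py, math.cos(theta), math.sin(theta))
```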

Next, the character region extraction unit 3 sets the image position at which a histogram is to be generated in the corrected image produced in step ST5 (step ST6). The processing of step ST6 is described in more detail with reference to FIG. 7.
FIG. 7 is a diagram showing the setting of the histogram generation position by the character region extraction unit of the form identification device according to Embodiment 1. FIG. 7(a) shows the corrected image 14 input from the image correction unit 2, and FIG. 7(b) shows a sample image 15 in which the image position for histogram generation has been set in advance.
In FIG. 7(b), let the base point O of the histogram generation region 16 set in the sample image 15 be (x, y), its horizontal length be w, and its vertical length be h. The image position of the histogram is then denoted by (hx, hy), the horizontal length of the histogram by hw, and the vertical length of the histogram by hh. The horizontal length of the sample image 15 is denoted by fw and its vertical length by fh. As shown in FIG. 7(a), the horizontal length of the corrected image 14 is denoted by iw and its vertical length by ih.

Using the base point O of the histogram generation region 16, the horizontal and vertical lengths of the sample image 15, and the horizontal and vertical lengths of the corrected image 14 described above, the image position (hx, hy) of the histogram, the horizontal length hw of the histogram, and the vertical length hh of the histogram are given by Equation (3).
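Equation (3) is not reproduced in this text. Since the target region is described as being enlarged or reduced according to the size of the input image, a natural reading is a proportional scaling by iw/fw horizontally and ih/fh vertically. The sketch below assumes that reading; the function name `scale_region` is hypothetical.

```python
def scale_region(x, y, w, h, fw, fh, iw, ih):
    """Scale the sample-image region (x, y, w, h) to the corrected image.

    fw, fh: width and height of the sample image.
    iw, ih: width and height of the corrected input image.
    Returns (hx, hy, hw, hh). Assumed proportional scaling.
    """
    sx = iw / fw  # horizontal scale factor
    sy = ih / fh  # vertical scale factor
    return x * sx, y * sy, w * sx, h * sy


# A region defined on a 100 x 200 sample, applied to a 200 x 400 scan.
region = scale_region(10, 20, 30, 40, fw=100, fh=200, iw=200, ih=400)
```

In practice the four values would be rounded to integer pixel coordinates before the histogram is generated.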

Next, the character region extraction unit 3 generates a histogram indicating the frequency of occurrence of black pixels within the histogram generation region set in step ST6 (step ST7). The histogram is generated according to Equation (4).

In Equation (4), h(x) denotes the number of black pixels of the histogram in the horizontal direction, and h(y) denotes the number of black pixels of the histogram in the vertical direction.
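Equation (4) is not reproduced here. A black-pixel histogram over a rectangular region is usually a pair of projection profiles, one count per column and one count per row, and the sketch below assumes that interpretation. The image representation (a list of rows with 1 for black) and the names are assumptions; the text does not say unambiguously which profile corresponds to h(x) and which to h(y), so the code names them by what they count.

```python
def projection_histograms(image, hx, hy, hw, hh):
    """Count black pixels per column and per row inside a region.

    image: list of rows, each a list of 0 (white) or 1 (black).
    (hx, hy): top-left corner of the region; hw, hh: its width and height.
    Returns (col_counts, row_counts).
    """
    col_counts = [0] * hw
    row_counts = [0] * hh
    for j in range(hh):
        for i in range(hw):
            if image[hy + j][hx + i]:
                col_counts[i] += 1
                row_counts[j] += 1
    return col_counts, row_counts


tiny = [
    [0, 1, 1],
    [0, 1, 0],
]
cols, rows = projection_histograms(tiny, 0, 0, 3, 2)
```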

Further, the character region extraction unit 3 extracts the character region from the histogram generated in step ST7 (step ST8). In the extraction processing of step ST8, a character position is distinguished from noise or a line segment according to whether the distribution of the histogram has at least a certain width, as shown in FIG. 8. Accordingly, a part of the histogram that satisfies the condition of Equation (5) is detected as a character region.

Equation (5) requires that the number of black pixels h(y) of the histogram in the vertical direction be larger than a threshold TH_hist and that the horizontal width W of the histogram be larger than a threshold THW.

In the example of FIG. 8, the histogram of the noise or line segment 17a is shown as region 17b, and the histogram of the character region 17c is shown as region 17d. Region 17b is judged to be noise or a line segment because its distribution width Wa is equal to or less than the threshold THW. Region 17d, on the other hand, is detected as a character region because its distribution width Wb is larger than the threshold THW, that is, it has a certain width.
The condition is not limited to that of Equation (5). The condition may instead be that the number of black pixels h(x) of the histogram in the horizontal direction is larger than a threshold and that the vertical width of the histogram is larger than a threshold.
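The two-threshold test described for Equation (5) can be sketched as a scan for runs of histogram bins. This is an illustrative reading, not the patent's implementation: the name `find_character_runs`, the half-open `(start, end)` run representation, and measuring the width as the number of consecutive bins above TH_hist are assumptions.

```python
def find_character_runs(counts, th_hist, th_w):
    """Return (start, end) index ranges judged to be character regions.

    A run is a maximal stretch of consecutive bins whose count exceeds
    th_hist. Runs whose width (end - start) exceeds th_w are kept; narrower
    runs are treated as noise or line segments. end is exclusive.
    """
    runs = []
    start = None
    for i, c in enumerate(counts):
        if c > th_hist:
            if start is None:
                start = i  # a new run begins here
        elif start is not None:
            if i - start > th_w:
                runs.append((start, i))
            start = None
    # Close a run that extends to the last bin.
    if start is not None and len(counts) - start > th_w:
        runs.append((start, len(counts)))
    return runs


# Bin 1 is a thin spike (for example a ruled line); bins 4..7 form a wide run.
profile = [0, 5, 0, 0, 4, 6, 5, 7, 0]
detected = find_character_runs(profile, th_hist=2, th_w=2)
```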

The character recognition unit 4 recognizes the character string contained in the character region extracted in step ST8 and outputs the recognition result to the ID recognition unit 6 (step ST9). As the character recognition method, for example, the method disclosed in Reference 2 below can be applied.
・Reference 2
Minoru Mori, Minako Sawaki, Norihiro Hagita, Hiroshi Murase, Naoki Mukawa, "Feature Extraction Robust to Image Quality Degradation Using Run-Length Correction", IEICE Transactions D, Vol. J86-D2, No. 7, pp. 1049-1057, July 2003.

The ID recognition unit 6 refers to the ID recognition information stored in the ID recognition information storage unit 5, acquires the ID corresponding to the character string recognized in step ST9, identifies the form input to the form identification device 10 (step ST10), and ends the processing.
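The lookup of step ST10 amounts to a table from recognized character strings to form IDs. The sketch below is only an illustration: the contents of FIG. 4 are not available in this text, so the table entries, the exact-match rule, and the names `ID_RECOGNITION_INFO` and `identify_form` are all hypothetical.

```python
# Hypothetical ID recognition information: recognized string -> form ID.
ID_RECOGNITION_INFO = {
    "INVOICE": "001",
    "ORDER FORM": "002",
}


def identify_form(recognized_text, table=ID_RECOGNITION_INFO):
    """Return the form ID for a recognized string, or None if unknown.

    Assumption: the match is exact after trimming surrounding whitespace.
    """
    return table.get(recognized_text.strip())
```

A real system would likely need to tolerate recognition errors, for example by approximate string matching, which the text does not discuss.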

As described above, the device of Embodiment 1 includes the character region extraction unit 3, which extracts a character region from the input image data, the character recognition unit 4, which recognizes the character string contained in the extracted character region, and the ID recognition unit 6, which refers to the ID recognition information, acquires the ID corresponding to the recognized character string, and identifies the form input to the device. A character region contained in the input image data can therefore be used as the form identifier region for identifying the form. As a result, the form can be identified even when marker regions are difficult to extract. Moreover, there is no need to arrange a plurality of marker regions in the form, so the freedom of the form layout is increased.

Further, in Embodiment 1 a target region for character region extraction is set in advance, and the character region extraction unit 3 generates the histogram within that target region. This suppresses the influence of ruled lines and noise in the form on character region extraction, improving the analysis accuracy of the form identification device and reducing the amount of calculation.

Embodiment 2.
Embodiment 2 describes a configuration that, in addition to extracting the character region as in Embodiment 1, extracts markers provided in the form in advance for recognizing the form, detects the form identifier region using either the extracted character region or the extracted markers, and identifies the form.
FIG. 9 is a block diagram showing the configuration of the form identification device according to Embodiment 2.
The form identification device 20 of Embodiment 2 adds a marker extraction unit 7 to the form identification device 10 of Embodiment 1 shown in FIG. 1 and provides an ID region detection unit (identifier region detection unit) 8 in place of the character recognition unit 4. In the following, parts that are the same as or equivalent to the components of the form identification device 10 according to Embodiment 1 are given the same reference numerals as in FIG. 1, and their description is omitted or simplified.

The marker extraction unit 7 extracts the markers provided in the form in advance from the corrected image produced by the image correction unit 2, together with the positions of those markers. The ID region detection unit 8 compares the histogram generated by the character region extraction unit 3 with the histogram of a sample image stored in advance and calculates a similarity α. Likewise, it compares the marker positions extracted by the marker extraction unit 7 with the marker positions in the sample image and calculates a similarity β. The similarity may be based on, for example, the sum of the distances between corresponding points.
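The "sum of the distances between corresponding points" mentioned above can be sketched as follows. The function name `position_distance`, the use of Euclidean distance, and the assumption that the two point lists are already paired in order are illustrative choices. Note that this quantity is a distance, so a smaller value means a closer match; how it is converted into a similarity where higher is better is not specified in the text.

```python
import math


def position_distance(points, sample_points):
    """Sum of Euclidean distances between paired points.

    points, sample_points: equal-length lists of (x, y) tuples, where the
    i-th entries are assumed to correspond to each other.
    """
    if len(points) != len(sample_points):
        raise ValueError("point lists must have the same length")
    return sum(
        math.hypot(px - qx, py - qy)
        for (px, py), (qx, qy) in zip(points, sample_points)
    )


total = position_distance([(0, 0), (3, 4)], [(0, 0), (0, 0)])
```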

Next, the histogram comparison method used by the ID region detection unit 8 is described with reference to FIG. 10. FIG. 10 is a diagram showing the comparison of histograms by the ID region detection unit of the form identification device according to Embodiment 2. FIG. 10(a) shows the histograms generated by the character region extraction unit 3, and FIG. 10(b) shows the line segments that the ID region detection unit 8 generates from those histograms.
For each of the histograms 21a, 21b, and 21c shown in FIG. 10(a), the ID region detection unit 8 calculates the peak position and the size of the histogram. The calculated peak positions and sizes are represented by the line segments 21d, 21e, and 21f, as shown in FIG. 10(b). The histogram similarity α is calculated by comparing the line segments 21d, 21e, and 21f with the histogram line segments of the sample image.

In general, in an image obtained by a scanner or received by fax, the extracted character regions and markers rarely appear at completely different positions. The ID region detection unit 8 therefore calculates the histogram similarity α and the marker position similarity β based on the differences in the positions of character regions and markers between the corrected image of the form and the sample image stored in advance. It then compares α with β, judges that whichever of the character region or the marker has the higher similarity (that is, the more closely matching position information) was detected more accurately in the input image, and uses it to detect the ID region (identifier region).

The ID recognition unit (identifier recognition unit) 6' acquires the ID (identifier) written in the ID region detected by the ID region detection unit 8 and identifies the input form.
FIG. 11 is a diagram showing an example of the ID read by the ID recognition unit of the form identification device according to Embodiment 2. FIG. 11(a) shows a case where the ID is expressed as a number, and FIG. 11(b) shows a case where the ID is expressed by a number of bars.

Next, the operation of the form identification device 20 is described.
FIG. 12 is a flowchart showing the operation of the form identification device according to Embodiment 2. In the following, steps that are the same as those of the form identification device 10 according to Embodiment 1 are given the same reference signs as in FIG. 5, and their description is omitted or simplified.
After the image correction unit 2 performs image correction in step ST5, the character region extraction unit 3 performs the processing of steps ST6 to ST8 on the corrected image to generate the histogram and extract the character region. In parallel with steps ST6 to ST8, the marker extraction unit 7 extracts the markers and the marker positions from the corrected image (step ST21).

The ID region detection unit 8 calculates the similarity α between the histogram generated by the character region extraction unit 3 in steps ST6 to ST8 and the histogram of the sample image stored in advance (step ST22), and calculates the similarity β between the markers extracted by the marker extraction unit 7 in step ST21 and the markers of the sample image (step ST23). The ID region detection unit 8 then compares the similarity α and the similarity β calculated in steps ST22 and ST23, and determines the character position (ctx, cty) of the input image using whichever of the histogram or the markers has the higher similarity (step ST24).

The comparison process of step ST24 is described concretely as follows. When the similarity α is higher (similarity α > similarity β), alignment is regarded as having been performed accurately with the histogram, and the character position (ctx, cty) of the input image is determined using the character area of the histogram.
On the other hand, when the similarity β is higher (similarity α < similarity β), alignment is regarded as having been performed accurately with the marker, and the character position (ctx, cty) of the input image is determined using the marker position.
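The selection between the histogram and the marker can be illustrated with a minimal Python sketch. The names `Anchor` and `choose_anchor` are hypothetical, and the handling of a tie (α equal to β) is an assumption, since the description covers only the two strict inequalities.

```python
from typing import NamedTuple, Tuple


class Anchor(NamedTuple):
    """Which feature was trusted and the position (ctx, cty) it yields."""
    source: str
    position: Tuple[float, float]


def choose_anchor(alpha: float,
                  beta: float,
                  histogram_position: Tuple[float, float],
                  marker_position: Tuple[float, float]) -> Anchor:
    """Pick the position derived from the feature with the higher similarity.

    alpha: similarity between the input histogram and the sample histogram.
    beta:  similarity between the input marker and the sample marker.
    """
    # Assumption: on a tie the histogram is preferred; the description
    # does not state what happens when alpha == beta.
    if alpha >= beta:
        return Anchor("histogram", histogram_position)
    return Anchor("marker", marker_position)
```

For example, `choose_anchor(0.9, 0.4, (10, 20), (12, 25))` selects the histogram-derived position, while swapping the two similarity values selects the marker position.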

Using the character position (ctx, cty) of the input image determined in step ST24, the positional deviation (cdx, cdy) between the input image and the sample image is detected by the following Expression (6) (step ST25).

In Expression (6), (fx, fy) indicates the position of the feature point used for alignment on the sample image, and (ctx, cty) indicates the character position of the input image.

FIG. 13 is a diagram illustrating an example in which the ID area detection unit of the form identification device according to the second embodiment detects a positional deviation using a character area. FIG. 13A shows the input image 22, and FIG. 13B shows the sample image 23. FIG. 13A shows the character area 22a and the ID area 22b of the input image 22, and indicates that the character position of the character area 22a is (ctx, cty). FIG. 13B shows the character area 23a and the ID area 23b of the sample image 23, and indicates that the position of the character area 23a is (fx, fy).

When the input image contains a plurality of character areas, the average value of the character positions of those character areas is used as the positional deviation between the input image and the sample image. In that case, the average value of the character positions is calculated based on the following Expression (7).
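As a hedged illustration of the deviation and its averaging, the Python sketch below makes two assumptions that the text does not confirm: the deviation is taken as the input position minus the sample position, and each input character area has already been paired with its feature point in the sample image. The function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def offset(input_pos: Point, sample_pos: Point) -> Point:
    """Deviation (cdx, cdy) for one character area.

    Assumed sign convention: input position (ctx, cty) minus
    sample position (fx, fy).
    """
    ctx, cty = input_pos
    fx, fy = sample_pos
    return (ctx - fx, cty - fy)


def mean_offset(pairs: List[Tuple[Point, Point]]) -> Point:
    """Average the per-area deviations over all matched character areas.

    pairs: list of (input position, sample position) tuples.
    """
    if not pairs:
        raise ValueError("at least one matched character area is required")
    offsets = [offset(inp, smp) for inp, smp in pairs]
    n = len(offsets)
    return (sum(dx for dx, _ in offsets) / n,
            sum(dy for _, dy in offsets) / n)
```

With two matched areas whose deviations are (5, 2) and (3, 6), `mean_offset` returns (4.0, 4.0).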

Using the deviation between the input image and the sample image detected in step ST25, the ID area detection unit 8 determines the upper-left coordinates (sx, sy) of the ID area in the input image based on the following Expression (8) (step ST26).

Further, the size of the ID area in the input image is determined based on the following Expression (9) (step ST27).

In Expression (9), sw and sh indicate the horizontal and vertical lengths of the ID area in the sample image, and IDw and IDh indicate the horizontal and vertical lengths of the ID area in the input image.
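The placement of the ID area can be sketched in Python as follows. This is an illustration under stated assumptions, not the expressions of the description: the upper-left corner is assumed to be the sample ID area's corner shifted by the deviation (cdx, cdy), and the size is assumed to be the sample size multiplied by optional scale factors that default to 1. `Rect`, `locate_id_area`, `scale_x` and `scale_y` are hypothetical names.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle: upper-left corner (x, y), width w, height h."""
    x: float
    y: float
    w: float
    h: float


def locate_id_area(sample_id_area: Rect,
                   cdx: float,
                   cdy: float,
                   scale_x: float = 1.0,
                   scale_y: float = 1.0) -> Rect:
    """Estimate the ID area in the input image from the sample ID area.

    Assumption: (sx, sy) is the sample corner shifted by (cdx, cdy).
    Assumption: (IDw, IDh) is (sw, sh) scaled by (scale_x, scale_y);
    with the default factors the sample size is carried over unchanged.
    """
    sx = sample_id_area.x + cdx
    sy = sample_id_area.y + cdy
    id_w = sample_id_area.w * scale_x
    id_h = sample_id_area.h * scale_y
    return Rect(sx, sy, id_w, id_h)
```

For a sample ID area `Rect(400, 30, 120, 40)` and a deviation of (4, 4), the sketch yields an input ID area whose corner is (404, 34) and whose size is unchanged.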

The ID recognition unit 6 acquires the ID from the ID area in the input image whose size was determined in step ST27, identifies the form input to the form identification device 20 (step ST28), and ends the process.
When the ID in the ID area consists of characters such as digits, the ID can be recognized by applying, for example, the method of Reference 2 mentioned above.

Steps ST26 to ST28 described above show, with reference to FIG. 13, a configuration in which the ID area is determined using the character area. When alignment has instead been performed accurately with the marker, the ID area is determined using that marker.

As described above, according to the second embodiment, the device is configured to include the character area extraction unit 3 that generates a histogram from the corrected image and extracts a character area, the marker extraction unit 7 that extracts a marker from the corrected image, the ID area detection unit 8 that compares the similarity between the extracted histogram and the histogram of the sample image with the similarity between the extracted marker and the marker of the sample image and detects the ID area using the feature point having the higher similarity, and the ID recognition unit 6' that acquires the ID from the detected ID area and identifies the form. The detection accuracy of the ID area can therefore be improved.

In the second embodiment described above, the ID area detection unit 8 uses the character position and the marker position to detect the ID area. However, the feature points are not limited to the character position and the marker position; elements whose positions do not change from form to form, such as the marker position and a ruled line position, may also be used as feature points.

Within the scope of the invention, the embodiments may be freely combined, any component of each embodiment may be modified, and any component of each embodiment may be omitted.

DESCRIPTION OF SYMBOLS: 1 binarization processing unit, 2 image correction unit, 3 character area extraction unit, 4 character recognition unit, 5 ID recognition information storage unit, 6, 6' ID recognition unit, 7 marker extraction unit, 8 ID area detection unit, 10, 20 form identification device.

Claims

A form identification device for identifying a form from image data of the form, comprising:
a character area extraction unit that sets, in the image data of the form, an area for generating a histogram indicating the occurrence frequency of black pixels, based on an area set in a sample image stored in advance and on the image size of the image data of the form, generates the histogram from the image data of the form within the set area, and extracts a character area based on the generated histogram;
a character recognition unit that recognizes a character string included in the character area extracted by the character area extraction unit; and
an identifier recognition unit that refers to form recognition information in which character strings and form identifiers are stored in association with each other in advance, acquires the form identifier corresponding to the character string recognized by the character recognition unit, and identifies the form.

The form identification device according to claim 1, wherein the character area extraction unit extracts the character area based on the number of black pixels in the generated histogram and the width of the generated histogram.

A form identification device for identifying a form from image data of the form, comprising:
a feature point extraction unit that extracts a feature point and position information of the feature point from the image data of the form;
a marker extraction unit that extracts a marker and position information of the marker from the image data of the form;
an identifier area detection unit that compares the position information of the feature point extracted by the feature point extraction unit with the position information of the feature point in a sample image stored in advance, compares the position information of the marker extracted by the marker extraction unit with the position information of the marker in the sample image, and detects an identifier area including the identifier of the form from the image data of the form using whichever of the feature point or the marker has the more closely matching position information; and
an identifier recognition unit that acquires the identifier of the form from the identifier area detected by the identifier area detection unit and identifies the form.

The form identification device according to claim 3, wherein the feature point extraction unit extracts a character area or a ruled line included in the image data of the form as the feature point.

A form identification method for identifying a form from image data of the form, comprising:
a step in which character area extracting means sets, in the image data of the form, an area for generating a histogram indicating the occurrence frequency of black pixels, based on an area set in a sample image stored in advance and on the image size of the image data of the form, generates the histogram from the image data of the form within the set area, and extracts a character area based on the generated histogram;
a step of recognizing a character string included in the extracted character area; and
a step in which an identifier recognition unit refers to form recognition information in which character strings and form identifiers are stored in association with each other in advance, acquires the form identifier corresponding to the recognized character string, and identifies the form.

A form identification method for identifying a form from image data of the form, comprising:
a step in which feature point extracting means extracts a feature point and position information of the feature point from the image data of the form;
a step in which marker extracting means extracts a marker and position information of the marker from the image data of the form;
a step in which identifier area detecting means compares the position information of the feature point with the position information of the feature point in a sample image stored in advance, compares the position information of the marker with the position information of the marker in the sample image, and detects an identifier area including the identifier of the form from the image data of the form using whichever of the feature point or the marker has the more closely matching position information; and
a step in which an identifier recognition unit acquires the identifier of the form from the identifier area and identifies the form.